Re: [Pixman] [cairo] help building cairo on windows
On Thu, 13 Sep 2012 09:08:59 +0200 Andrea Canciani ranm...@gmail.com wrote: On Thu, Sep 13, 2012 at 2:08 AM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote: On Tue, 11 Sep 2012 23:45:47 +0200 Andrea Canciani ranm...@gmail.com wrote: On Tue, Sep 11, 2012 at 10:44 PM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote: On Tue, 11 Sep 2012 22:25:15 +0200 Andrea Canciani ranm...@gmail.com wrote: On Tue, Sep 11, 2012 at 7:34 PM, Søren Sandmann sandm...@cs.au.dk wrote: Andrea Canciani ranm...@gmail.com writes: This reminds me that I have some patches to improve building pixman on win32: http://cgit.freedesktop.org/~ranma42/pixman/commit/?h=wip/simpleops-to-master Soren, is it ok if I push the attached patches? No objections from me if the test suite still passes with PIXMAN_DISABLE=sse2 set. On MacOS X the testsuite shows no regression with that variable in the env. On win32, pixman did not compile without the first of those patches. After it, the testsuite fails with: $ PIXMAN_DISABLE=sse2 ./scaling-test.exe Assertion failed: frcd_canary_variable1 == frcd_volatile_constant1, file scaling-test.c, line 356 pixman: Disabled sse2 implementation The failing testcase is: $ PIXMAN_DISABLE=sse2 ./scaling-test.exe 3348 Assertion failed: frcd_canary_variable1 == frcd_volatile_constant1, file scaling-test.c, line 356 src_fmt=2002, dst_fmt=2002 op=3, scale_x=27881, scale_y=-65000, repeat=0 translate_x=55344, translate_y=56484 src_width=11, src_height=8, dst_width=24, dst_height=7 src_x=0, src_y=4, dst_x=4, dst_y=3 w=20, h=2 destination clip box: [16,6-20,6] destination clip box: [3,2-18,6] pixman: Disabled sse2 implementation ... This seems to be consistent with the compiler warnings: pixman-mmx.c c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(286) : warning C4799: function 'to_uint64' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(469) : warning C4799: function 'store' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(480) : warning C4799: function 'is_equal' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(491) : warning C4799: function 'is_opaque' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(497) : warning C4799: function 'is_zero' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(565) : warning C4799: function 'expand_4xpacked565' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(591) : warning C4799: function 'expand_4x565' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(3760) : warning C4799: function 'fast_composite_scaled_bilinear_mmx__8__pad_OVER' has no EMMS instruction c:\cygwin\home\ranma42\code\fdo\pixman\pixman\pixman-mmx.c(3764) : warning C4799: function 'fast_composite_scaled_bilinear_mmx__8__none_OVER' has no EMMS instruction I was unable to find out why MSVC thinks that fast_composite_scaled_bilinear_mmx__8__{pad,none}_OVER are missing an EMMS instruction, but cover/normal are ok. Does MSVC respect 'force_inline' and actually inline all of these functions ('to_uint64', 'store', ...)? The compiler does not report inline warnings in pixman-mmx.c, so I would expect that it respects force_inline. I'm attaching the build log, in case it is of any help. I'm not sure if I can provide much help with MSVC. The _mm_empty() intrinsic is also not totally problem free even when using gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47759 But some kind of workaround is still needed. Can you try to confirm which pixman fast paths are triggering this problem? It looks like the compiler warning match the issue: fast_composite_scaled_bilinear_mmx__8__pad_OVER fast_composite_scaled_bilinear_mmx__8__none_OVER If the problem is related to incorrect EMMS instruction reordering, then maybe moving EMMS into a non-inlinable function and calling it instead of _mm_empty() can help? Just as an experiment for getting a bit better understanding about what is going on. I confirm that it is likely to be an optimization bug (disabling the optimizations fixes it). The attached patch is another possible workaround to the issue. It causes a lot of warnings about missing EMMS, because the compiler does not find any EMMS instruction after the MMX instructions), but the whole testsuite succeeds. Marking the function as noinline (without disabling the optimizations on it) does not fix the issue. What would be the best way to work around this issue both in gcc and in msvc? Would it be ok to create a pixman_mm_empty() function (to be used
Re: [Pixman] [cairo] help building cairo on windows
On Thu, 13 Sep 2012 09:08:59 +0200 Andrea Canciani ranm...@gmail.com wrote: On Thu, Sep 13, 2012 at 2:08 AM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote: I'm not sure if I can provide much help with MSVC. The _mm_empty() intrinsic is also not totally problem free even when using gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47759 But some kind of workaround is still needed. Can you try to confirm which pixman fast paths are triggering this problem? It looks like the compiler warning match the issue: fast_composite_scaled_bilinear_mmx__8__pad_OVER fast_composite_scaled_bilinear_mmx__8__none_OVER If the problem is related to incorrect EMMS instruction reordering, then maybe moving EMMS into a non-inlinable function and calling it instead of _mm_empty() can help? Just as an experiment for getting a bit better understanding about what is going on. I confirm that it is likely to be an optimization bug (disabling the optimizations fixes it). The attached patch is another possible workaround to the issue. It causes a lot of warnings about missing EMMS, because the compiler does not find any EMMS instruction after the MMX instructions), but the whole testsuite succeeds. Marking the function as noinline (without disabling the optimizations on it) does not fix the issue. What would be the best way to work around this issue both in gcc and in msvc? Would it be ok to create a pixman_mm_empty() function (to be used everywhere instead of mm_empty) and disable the optimizations on it? Maybe we can add a new function pointer to pixman_implementation_t specifically for EMMS instruction, which can be initialized in _pixman_implementation_create_mmx() for x86 systems? It would be best if we could ensure that this function is really called and never inlined, preferably in a portable way. I guess the compiler can't easily prove that the function pointer is never overwritten elsewhere even with link time optimizations, so it will emit a function call. As for the MSVC warnings, they look rather bogus and useless. MSVC just spams the log with false alarms for the small inline functions and possibly catches itself at wrong code generation (which should not normally happen in the first place). Maybe the warnings can be suppressed as suggested at https://bugzilla.mozilla.org/show_bug.cgi?id=682139 ? -- Best regards, Siarhei Siamashka ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
Re: [Pixman] [cairo] help building cairo on windows
On Thu, 13 Sep 2012 14:25:32 -0700 Matt Turner matts...@gmail.com wrote: On Thu, Sep 13, 2012 at 12:15 PM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote: On Thu, 13 Sep 2012 09:08:59 +0200 Andrea Canciani ranm...@gmail.com wrote: On Thu, Sep 13, 2012 at 2:08 AM, Siarhei Siamashka siarhei.siamas...@gmail.com wrote: I'm not sure if I can provide much help with MSVC. The _mm_empty() intrinsic is also not totally problem free even when using gcc: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47759 But some kind of workaround is still needed. Can you try to confirm which pixman fast paths are triggering this problem? It looks like the compiler warning match the issue: fast_composite_scaled_bilinear_mmx__8__pad_OVER fast_composite_scaled_bilinear_mmx__8__none_OVER If the problem is related to incorrect EMMS instruction reordering, then maybe moving EMMS into a non-inlinable function and calling it instead of _mm_empty() can help? Just as an experiment for getting a bit better understanding about what is going on. I confirm that it is likely to be an optimization bug (disabling the optimizations fixes it). The attached patch is another possible workaround to the issue. It causes a lot of warnings about missing EMMS, because the compiler does not find any EMMS instruction after the MMX instructions), but the whole testsuite succeeds. Marking the function as noinline (without disabling the optimizations on it) does not fix the issue. What would be the best way to work around this issue both in gcc and in msvc? Would it be ok to create a pixman_mm_empty() function (to be used everywhere instead of mm_empty) and disable the optimizations on it? Maybe we can add a new function pointer to pixman_implementation_t specifically for EMMS instruction, which can be initialized in _pixman_implementation_create_mmx() for x86 systems? I will NAK this with all the power I have. It would be best if we could ensure that this function is really called and never inlined, preferably in a portable way. I guess the compiler can't easily prove that the function pointer is never overwritten elsewhere even with link time optimizations, so it will emit a function call. It turns out that Windows/MSVC build problems are reported, but no one actually uses the MMX code on Windows. See commit 4fc586c3d. That bug was introduced in a075a870fd7 in 2009 and apparently no one noticed until I happened to come across it by inspection 3 years later. This is not surprising for rare and almost extinct hardware such as x86 processors without SSE2. And it is not MSVC specific at all. For example, pixman-0.26.0 had MMX bug slipped in because of the same exotic hardware needed to reproduce the problem reason not so long ago: http://cgit.freedesktop.org/pixman/commit/?h=0.26id=bfbfa2cc8fbc43 The only difference is that linux users are more familiar with autotools and do run 'make check' sometimes. Windows users are more likely to start worrying only after spotting graphics artefacts, crashes or other obvious symptoms. There is one more thing to consider: some people just fix trivial bugs and apply patches without bothering to report back. I'm okay with non-invasive fixes like f71e3dba, but I won't let the MMX code get worse because of MSVC crap. It's actually important to some platforms, and Windows/MSVC isn't one of them. First of all, both gcc and MSVC are suffering from EMMS related problems. The gcc bug 47759 is still unresolved, and I doubt that anyone is going to invest efforts into fixing it, considering that MMX is going away in the long run. And it has already proved to break pixman in the wild (admittedly only reproducible with a strange optimization flags combination -march=athlon-xp -msse2): https://bugs.gentoo.org/show_bug.cgi?id=347947: I'm pretty sure there is a chance to encounter this problem again in the future. One only might need to be a bit creative with CFLAGS selection when compiling pixman. Non-x86 systems don't need any EMMS at all, that's a noop for them. Yes, the code gets worse because of the extra #ifdef clutter added, but I don't see many options here. It's either 1) workaround it 2) stop using MMX on x86 hardware 3) do nothing, blame GCC and MSVC :) We are effectively at 3) now. But the very existence of this discussion thread proves that this unresolved problem is going to be recurring and eat our time, regardless of who is at fault. -- Best regards, Siarhei Siamashka ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
Re: [Pixman] [cairo] help building cairo on windows
Andrea Canciani ranm...@gmail.com writes: This reminds me that I have some patches to improve building pixman on win32: http://cgit.freedesktop.org/~ranma42/pixman/commit/?h=wip/simpleops-to-master Soren, is it ok if I push the attached patches? No objections from me if the test suite still passes with PIXMAN_DISABLE=sse2 set. Soren ___ Pixman mailing list Pixman@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/pixman
Re: [Pixman] [cairo] help building cairo on windows
This reminds me that I have some patches to improve building pixman on win32: http://cgit.freedesktop.org/~ranma42/pixman/commit/?h=wip/simpleops-to-master Soren, is it ok if I push the attached patches? I will work a little more on simpleops before proposing it again (in particular, I'd like to make it behave like the gstreamer git submodules and I need to investigate how it is integrated with packaging and releasing), but the first two commits in the branch should be ok for merge regardless of simpleops. Andrea On Mon, Sep 10, 2012 at 8:17 PM, Dimiter 'malkia' Stanev mal...@gmail.com wrote: On 9/10/2012 7:04 AM, Cosmin Apreutesei wrote: Hi, I have to build cairo on windows because I want to use the recording surface and I can't find binaries above 1.10 anywhere on the net (till now I have used the binaries from GTK but these are old and the recording surface doesn't work with those). Trying to build pixman with: make CFG=debug -f Makefile.win32 gives me: pixman-mmx.c(545) : error C2143: syntax error : missing ';' before 'type' and then 100 syntax errors like that (other c files compile ok until the mmx one). The sources were pulled today from git. The environment is VSE2008. make is 3.81 from mingw (I also had to throw in pkg-config binaries + dependencies from GTK). I also tried to build the latest pixman (0.26.2) with the same command but it failed with: Entering directory `/x/work/pixman-0.26.2/test' Makefile.win32: No such file or directory I also tried building with mozilla-build from the mingw bash prompt, with the same results. So what's the best (preferably from cmdline) way to build pixman and cairo under windows? And which is the best version? The release or the git one? Btw, if anyone has an up-to-date cairo.dll that would work for me too. Hi Cosmin, With this patch you should be able to compile. Thanks, Dimiter 'malkia' Stanev pixman-mmx.diff diff --git a/pixman/pixman-mmx.c b/pixman/pixman-mmx.c index 74a5e87..9e597db 100644 --- a/pixman/pixman-mmx.c +++ b/pixman/pixman-mmx.c @@ -541,7 +541,7 @@ expand565 (__m64 pixel, int pos) static force_inline void expand_4xpacked565 (__m64 vin, __m64 *vout0, __m64 *vout1, int full_alpha) { -__m64 t0, t1, alpha = _mm_setzero_si64 ();; +__m64 t0, t1, alpha = _mm_setzero_si64 (); __m64 r = _mm_and_si64 (vin, MC (expand_565_r)); __m64 g = _mm_and_si64 (vin, MC (expand_565_g)); __m64 b = _mm_and_si64 (vin, MC (expand_565_b)); @@ -1902,22 +1902,22 @@ mmx_composite_over__0565 (pixman_implementation_t *imp, while (w = 4) { __m64 vdest = *(__m64 *)dst; - __m64 v0, v1, v2, v3; + __m64 v0, v1, v2, v3, vsrc0, vsrc1, vsrc2, vsrc3; expand_4x565 (vdest, v0, v1, v2, v3, 0); - __m64 vsrc0 = load ((src + 0)); - __m64 vsrc1 = load ((src + 1)); - __m64 vsrc2 = load ((src + 2)); - __m64 vsrc3 = load ((src + 3)); - + vsrc0 = load ((src + 0)); + vsrc1 = load ((src + 1)); + vsrc2 = load ((src + 2)); + vsrc3 = load ((src + 3)); + v0 = over (vsrc0, expand_alpha (vsrc0), v0); v1 = over (vsrc1, expand_alpha (vsrc1), v1); v2 = over (vsrc2, expand_alpha (vsrc2), v2); v3 = over (vsrc3, expand_alpha (vsrc3), v3); - + *(__m64 *)dst = pack_4x565 (v0, v1, v2, v3); - + w -= 4; dst += 4; src += 4; @@ -2454,22 +2454,22 @@ mmx_composite_over_n_8_0565 (pixman_implementation_t *imp, else if (m0 | m1 | m2 | m3) { __m64 vdest = *(__m64 *)dst; - __m64 v0, v1, v2, v3; + __m64 v0, v1, v2, v3, vm0, vm1, vm2, vm3; expand_4x565 (vdest, v0, v1, v2, v3, 0); - __m64 vm0 = to_m64 (m0); + vm0 = to_m64 (m0); v0 = in_over (vsrc, vsrca, expand_alpha_rev (vm0), v0); - - __m64 vm1 = to_m64 (m1); + + vm1 = to_m64 (m1); v1 = in_over (vsrc, vsrca, expand_alpha_rev (vm1), v1); - - __m64 vm2 = to_m64 (m2); + + vm2 = to_m64 (m2); v2 = in_over (vsrc, vsrca, expand_alpha_rev (vm2), v2); - - __m64 vm3 = to_m64 (m3); + + vm3 = to_m64 (m3); v3 = in_over (vsrc, vsrca, expand_alpha_rev (vm3), v3); - + *(__m64 *)dst = pack_4x565 (v0, v1, v2, v3);; } @@ -3545,32 +3545,35 @@ mmx_composite_over_reverse_n_ (pixman_implementation_t *imp, #define BILINEAR_INTERPOLATE_ONE_PIXEL(pix) \ do { \ +__m64 t_hi, t_lo, b_hi, b_lo, hi, lo; \ /* fetch 2x2 pixel block into 2 mmx registers */ \ __m64 t = ldq_u ((__m64 *)src_top [pixman_fixed_to_int (vx)]); \ __m64 b = ldq_u ((__m64