On Sunday 14 March 2010, Loïc Minier wrote:
> On Wed, Mar 10, 2010, Siarhei Siamashka wrote:
> > I would prefer a bit more descriptive comment (with the details copied
> > from that launchpad page).
>
> I see you pushed this now; thanks! Yeah, it's not obvious why one
> needs to try with the toolchain defaults first. I'm attaching a patch
> to update the comments.

Well, it's actually obvious enough, and the current comments in configure.ac
seem to be sufficient. I was only nitpicking about the commit summary and
comment. Thanks anyway.

> > In my opinion the best solution overall would be to move all the assembly
> > optimizations into separate .S files also for legacy ARM processors and
> > get rid of these compiler option hacks. I think that bringing support for
> > legacy ARM processors into a better shape is quite realistic even for
> > 0.18.0 stable release, which is due to be released this month. But it can
> > only happen if enough people are interested in this, and more
> > importantly, are ready to actively participate in testing.
>
> I think this should be kept as an open bug against pixman that the
> inline asm()s would better be written as separate .S files (thanks for
> the idea).

For the ARM NEON optimizations in pixman 0.16.x it was even more messy and
ugly because of the '-mfloat-abi' option. Switching to .S files solves this
problem and automatically gives more control over register allocation.
Inline assembly is more fragile in this respect: if it tries to use as many
registers as possible (having more registers available is better for
optimization), gcc may fail to compile the code depending on the
optimization level and other options, giving a rather annoying error:

"can't find a register in class 'GENERAL_REGS' while reloading 'asm'"

The downside of using assembly directly is the need to care about the ABI,
stack alignment, dealing with the r9 register, etc. It is not a big problem
to target ARM EABI on Linux. But the other platforms running on ARM
(Windows, Apple, ...) may potentially have trouble or have assembly
optimizations disabled. That's why I'm hesitating to touch support for older
ARM processors.

Another issue is whether or not to use unaligned memory accesses on armv6+
systems.
Currently, even in the NEON code from 'pixman-arm-neon-asm.S', the
configuration variable RESPECT_STRICT_ALIGNMENT is set to 1, so pixman should
never use unaligned memory accesses, although setting this option to 0 would
give slightly better performance when dealing with leading and trailing
pixels in each scanline.

I created a branch to collect some older ARMv6 optimizations for pixman
(coincidentally they already use "naked" functions, which is practically
equivalent to implementing code in external .S files):

http://cgit.freedesktop.org/~siamashka/pixman/log/?h=arm-optimizations-from-xomap

Cleaning up this branch and splitting out the armv4 assembly (which should
also provide a good performance improvement) may be a good idea for really
old ARM systems. As part of this activity, an old bug/feature request can be
solved too:

https://bugs.freedesktop.org/show_bug.cgi?id=13445

-- 
Best regards,
Siarhei Siamashka
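
For illustration only, here is roughly what "dealing with leading and
trailing pixels" means when strict alignment is respected, sketched in
portable C. This is not pixman code: the function name fill_r5g6b5_strict
and the choice of a 16-bit solid fill are assumptions picked to keep the
example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of pixman: fill `count` 16-bit pixels
 * starting at `dst` without ever issuing a misaligned 32-bit store. */
static void fill_r5g6b5_strict(uint16_t *dst, size_t count, uint16_t pixel)
{
    /* Two copies of the pixel packed into one 32-bit word. */
    uint32_t pair = ((uint32_t)pixel << 16) | pixel;

    assert(((uintptr_t)dst & 1) == 0);  /* 16-bit alignment is assumed */

    /* Leading pixel: one narrow store brings dst to a 4-byte boundary. */
    if (count > 0 && ((uintptr_t)dst & 3) != 0) {
        *dst++ = pixel;
        count--;
    }

    /* Body: dst is 4-byte aligned here, so each copy can become a single
     * aligned 32-bit store (memcpy avoids strict-aliasing problems). */
    while (count >= 2) {
        memcpy(dst, &pair, sizeof pair);
        dst += 2;
        count -= 2;
    }

    /* Trailing pixel: at most one is left over. */
    if (count > 0)
        *dst = pixel;
}
```

If unaligned accesses were allowed, the leading-pixel branch could go away
and the wide stores could start at the first pixel, which is the kind of
saving referred to above.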
