On Sep 23, 2005, at 5:30 PM, Ilya Lipovsky wrote:

Hello all,

I am asking this because we're having some problems with those builtins inlining instructions properly once the logic (in loops) reaches a certain level of complexity. Even worse, gcc 4.0 (both 4.0.0 and 4.0.1) generates bad code (whereas gcc 3.4 is OK). 3.4 simply resorts to a series of calls to compiler-generated routines (with mangled names such as "_Z7vec_addU8_vectorfs", probably corresponding to the vec_add builtin) instead of inlining actual instructions. Again, that happens when a certain level of code mass is reached. gcc 4.0 tries to do the same but, apparently, something goes wrong. I didn't inspect the produced assembly code in depth. Currently, I can only give examples in terms of the macstl expressions and compiler-generated assembly output, but if you request, I can try to write a more direct loop that uses the vec_* code.

As always, read http://gcc.gnu.org/bugs.html and file a bug. The inlining problem has been fixed for 4.1.0 by the altivec.h rewrite mentioned below. Also, the smaller the testcase, the better.
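
For illustration only, a direct vec_* loop of the kind offered above might look roughly like the sketch below. It is not the code from the original report: the function name add_floats, the multiple-of-4 length requirement, and the scalar fallback for non-AltiVec targets are all assumptions made so the example stays small and compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __ALTIVEC__
#include <altivec.h>
#endif

/* Hypothetical reduced testcase: dst[i] = a[i] + b[i].
   Assumes n is a multiple of 4 and, on AltiVec targets, that all
   three pointers are 16-byte aligned (vec_ld/vec_st require it). */
void add_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i;

    assert(n % 4 == 0);

#ifdef __ALTIVEC__
    /* One vector add per four floats. */
    for (i = 0; i < n; i += 4) {
        vector float va = vec_ld(0, a + i);
        vector float vb = vec_ld(0, b + i);
        vec_st(vec_add(va, vb), 0, dst + i);
    }
#else
    /* Scalar fallback so the file still builds without AltiVec. */
    for (i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
#endif
}
```

On a PowerPC target, compiling this with something like "gcc -O2 -maltivec -S" and checking whether the loop contains a vaddfp instruction or a call to an out-of-line vec_add routine is one way to see whether the inlining problem is present. A real bug report would need the loop grown until the problem actually reproduces.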


So does anyone of you, compiler writers, know why the builtins are needed?

The builtins are required so the compiler can schedule the code correctly. In 4.1.0, the altivec.h header has been rewritten so that it no longer goes through inline functions but uses the builtins directly.


Also, does anyone care that this stuff doesn't really work?

Well, the altivec.h header was rewritten for 4.1.0, which should improve this. Also, both 4.0.0 and 4.1.0 have autovectorization, which should expose more bugs. And sometime during 3.4.0, 4.0.0, or 4.1.0, an AltiVec testsuite was added.

Thanks,
Andrew Pinski
