On Fri, Aug 17, 2012 at 2:04 PM, Steven Bosscher <stevenb....@gmail.com> wrote:
> On Fri, Aug 17, 2012 at 1:54 PM, Richard Guenther
> <richard.guent...@gmail.com> wrote:
>> Well, another effect of reducing the size of BITMAP_WORD is that
>> operations are not performed in a mode optimally using CPU regs
>> (did you check code generation differences on a 64bit host?).
>
> I did, on x86_64 and on powerpc64. The effect is not dramatic, most of
> these machines can perform 32 bits operations just fine (I think the
> only exception would be alpha, maybe?).

I wonder how bad code gets when we unconditionally use GCC's generic
vector support to do

Index: gcc/bitmap.c
===================================================================
--- gcc/bitmap.c	(revision 190469)
+++ gcc/bitmap.c	(working copy)
@@ -1446,6 +1446,17 @@ bitmap_compl_and_into (bitmap a, const_b
 	  unsigned ix;
 	  BITMAP_WORD ior = 0;
+#if BITMAP_ELEMENT_WORDS == 5
+	  typedef v4si unsigned int __attribute__((vector_size((16))));
+	  v4si cleared4 = *(v4si *)&a_elt->bits[0] & *(v4si *)&b_elt->bits[0];
+	  BITMAP_WORD cleared5 = a_elt->bits[4] & b_elt->bits[4];
+	  v4si r4 = *(v4si *)&b_elt->bits[0] ^ cleared4;
+	  BITMAP_WORD r5 = b_elt->bits[4] ^ cleared5;
+	  *(v4si *)&a_elt->bits[0] = r4;
+	  a_elt->bits[4] = r5;
+	  ior4 |= r4;
+	  ior5 |= r5;
+#else
 	  for (ix = 0; ix < BITMAP_ELEMENT_WORDS; ix++)

of course with a proper #if GCC_VERSION. The theory is we should be able
to lower it to the same scalar code or even v2si operations where only
those are available ...

Just an idea and eventually an opportunity to improve generic vector
lowering if the above really does not work out.

Richard.

> Ciao!
> Steven
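
For anyone who wants to try the idea outside the GCC tree, here is a
minimal standalone sketch of the same "A = ~A & B" step on a five-word
element: four words go through a generic vector type, and the fifth
stays scalar. It is not the patch above. The names word_t, ELT_WORDS,
compl_and_into_scalar and compl_and_into_vector are hypothetical
stand-ins, word_t is assumed to be a 32-bit unsigned int, and a compiler
with the GNU vector_size extension (GCC or Clang) is assumed. The
typedef is written with the new name after the element type, the form
the extension expects. Loads and stores go through memcpy instead of
pointer casts so the sketch does not rely on 16-byte alignment of the
arrays.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for BITMAP_ELEMENT_WORDS and a 32-bit BITMAP_WORD. */
#define ELT_WORDS 5
typedef unsigned int word_t;

/* GNU generic vector type: four 32-bit lanes in 16 bytes. */
typedef word_t v4si __attribute__ ((vector_size (16)));

_Static_assert (sizeof (word_t) * 4 == sizeof (v4si),
		"sketch assumes a 32-bit word_t");

/* Scalar reference: a[ix] = ~a[ix] & b[ix]; returns the OR of all results. */
static word_t
compl_and_into_scalar (word_t *a, const word_t *b)
{
  word_t ior = 0;
  for (unsigned ix = 0; ix < ELT_WORDS; ix++)
    {
      word_t cleared = a[ix] & b[ix];
      word_t r = b[ix] ^ cleared;
      a[ix] = r;
      ior |= r;
    }
  return ior;
}

/* Same computation with words 0..3 done as one vector operation.  */
static word_t
compl_and_into_vector (word_t *a, const word_t *b)
{
  v4si va, vb;
  memcpy (&va, a, sizeof va);	/* unaligned-safe load of words 0..3 */
  memcpy (&vb, b, sizeof vb);

  v4si cleared4 = va & vb;
  v4si r4 = vb ^ cleared4;
  memcpy (a, &r4, sizeof r4);	/* store words 0..3 back */

  word_t cleared5 = a[4] & b[4];
  word_t r5 = b[4] ^ cleared5;
  a[4] = r5;

  /* Reduce through the stored words to avoid relying on lane subscripts.  */
  return a[0] | a[1] | a[2] | a[3] | a[4];
}
```

Comparing the assembly of the two functions at -O2 on a given target is
one way to see whether the generic vector lowering produces code as good
as the scalar loop, which is the question raised above.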