int func1( unsigned long long val )
{
    return __builtin_ctzll( val );
}

int func2( unsigned long long val )
{
    unsigned lo = (unsigned)val;
    return lo ? __builtin_ctz(lo) : __builtin_ctz((unsigned)(val>>32)) + 32;
}

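One way to check the comparison is a small harness like the sketch below. It is not the reporter's original benchmark: the helper names check_agreement, time_func1, time_func2 and run_comparison, the input generator next_input, and the use of clock() for timing are all assumptions made for illustration. The cast in func2 is written C-style so that the file builds as C as well as C++.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* The two functions from the report, repeated so this file builds on its own. */
int func1(unsigned long long val)
{
    return __builtin_ctzll(val);
}

int func2(unsigned long long val)
{
    unsigned lo = (unsigned)val;
    return lo ? __builtin_ctz(lo) : __builtin_ctz((unsigned)(val >> 32)) + 32;
}

/* Hypothetical helper: confirm both versions agree for every bit position.
   Zero is never tested because the builtins are undefined for it. */
void check_agreement(void)
{
    for (int bit = 0; bit < 64; ++bit) {
        unsigned long long single = 1ULL << bit;
        unsigned long long with_top = single | (1ULL << 63);
        assert(func1(single) == bit && func2(single) == bit);
        assert(func1(with_top) == bit && func2(with_top) == bit);
    }
}

/* Hypothetical input generator: a 64-bit LCG step, shifted so that the
   lowest set bit lands in both halves of the word. The top bit is forced
   on, so the result is never zero. */
static unsigned long long next_input(unsigned long long *seed, unsigned long i)
{
    *seed = *seed * 6364136223846793005ULL + 1442695040888963407ULL;
    return (*seed << (i & 63)) | (1ULL << 63);
}

/* Volatile sink so the compiler cannot discard the timed loops. */
volatile unsigned long sink;

/* CPU seconds spent on `iters` calls of func1 (input generation included). */
double time_func1(unsigned long iters)
{
    unsigned long long seed = 1;
    unsigned long acc = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < iters; ++i)
        acc += (unsigned long)func1(next_input(&seed, i));
    sink = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Same loop and same inputs, but calling func2. */
double time_func2(unsigned long iters)
{
    unsigned long long seed = 1;
    unsigned long acc = 0;
    clock_t start = clock();
    for (unsigned long i = 0; i < iters; ++i)
        acc += (unsigned long)func2(next_input(&seed, i));
    sink = acc;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Hypothetical driver: call this from main(). */
void run_comparison(unsigned long iters)
{
    check_agreement();
    printf("func1: %.3f s\n", time_func1(iters));
    printf("func2: %.3f s\n", time_func2(iters));
}
```

Calling run_comparison(100000000UL) from main() and building with -O2 prints one timing per function. The absolute numbers depend on the machine and compiler version, so the assembly from gcc -S is the more direct way to see whether __builtin_ctzll was expanded inline or compiled to an out-of-line call.
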
func1 is more than twice as slow as func2, but it should be at least as fast. Unlike __builtin_ctz, __builtin_ctzll is not expanded inline.

-- 
Summary: __builtin_ctzll slower than 2*__builtin_ctz
Product: gcc
Version: 4.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: joerg dot richter at pdv-fs dot de
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31695