Hi,
On Thu, Nov 07, 2013 at 09:59:38PM +0000, Ramsay Jones wrote:
> +static inline uint64_t default_bswap64(uint64_t val)
> +{
> + return (((val & (uint64_t)0x00000000000000ffULL) << 56) |
> + ((val & (uint64_t)0x000000000000ff00ULL) << 40) |
> + ((val & (uint64_t)0x0000000000ff0000ULL) << 24) |
> + ((val & (uint64_t)0x00000000ff000000ULL) << 8) |
> + ((val & (uint64_t)0x000000ff00000000ULL) >> 8) |
> + ((val & (uint64_t)0x0000ff0000000000ULL) >> 24) |
> + ((val & (uint64_t)0x00ff000000000000ULL) >> 40) |
> + ((val & (uint64_t)0xff00000000000000ULL) >> 56));
> +}
This got me thinking.
To swap 8 bytes, this function performs 8 bitwise shifts, 8 bitwise
ANDs and 7 bitwise ORs, and uses 8 64-bit constants. We could do
better than that:
static inline uint64_t hacked_bswap64(uint64_t val)
{
	uint64_t tmp = val << 32 | val >> 32;
	return (((tmp & (uint64_t)0xff000000ff000000ULL) >> 24) |
		((tmp & (uint64_t)0x00ff000000ff0000ULL) >>  8) |
		((tmp & (uint64_t)0x0000ff000000ff00ULL) <<  8) |
		((tmp & (uint64_t)0x000000ff000000ffULL) << 24));
}
This performs only 6 shifts, 4 ANDs and 4 ORs, and uses 4 64-bit constants.
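To convince oneself that the two versions agree, a small self-check
along these lines might help. It is only a sketch: the helper name
check_hacked_bswap64() and the sample values are made up for
illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference version from the quoted patch: one mask and shift per byte. */
static inline uint64_t default_bswap64(uint64_t val)
{
	return (((val & 0x00000000000000ffULL) << 56) |
		((val & 0x000000000000ff00ULL) << 40) |
		((val & 0x0000000000ff0000ULL) << 24) |
		((val & 0x00000000ff000000ULL) <<  8) |
		((val & 0x000000ff00000000ULL) >>  8) |
		((val & 0x0000ff0000000000ULL) >> 24) |
		((val & 0x00ff000000000000ULL) >> 40) |
		((val & 0xff00000000000000ULL) >> 56));
}

/* Swap the 32-bit halves, then reverse the bytes of both halves at once. */
static inline uint64_t hacked_bswap64(uint64_t val)
{
	uint64_t tmp = val << 32 | val >> 32;
	return (((tmp & 0xff000000ff000000ULL) >> 24) |
		((tmp & 0x00ff000000ff0000ULL) >>  8) |
		((tmp & 0x0000ff000000ff00ULL) <<  8) |
		((tmp & 0x000000ff000000ffULL) << 24));
}

/* Hypothetical self-check: compares both versions on a few samples. */
static void check_hacked_bswap64(void)
{
	static const uint64_t samples[] = {
		0x0000000000000000ULL, 0xffffffffffffffffULL,
		0x0102030405060708ULL, 0x8000000000000001ULL,
		0x00ff00ff00ff00ffULL, 0xdeadbeefcafebabeULL,
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		/* Both implementations must produce the same result. */
		assert(hacked_bswap64(samples[i]) == default_bswap64(samples[i]));
		/* Swapping twice must give back the original value. */
		assert(hacked_bswap64(hacked_bswap64(samples[i])) == samples[i]);
	}
}
```

Calling check_hacked_bswap64() from a throwaway main() is enough to
run it; a handful of samples is of course no proof, just a sanity check.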
bswap64ing 1000000000 64-bit ints with default_bswap64() compiled
with -O2 takes:
real 0m1.808s
user 0m1.796s
sys 0m0.000s
The same with hacked_bswap64():
real 0m0.823s
user 0m0.816s
sys 0m0.000s
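The harness behind these numbers is not shown here. One way such a
measurement could be set up is sketched below; bench_bswap64() is a
hypothetical name, and the use of clock() and of an XOR checksum to
keep the compiler from dropping the loop are assumptions, not
necessarily how the timings above were obtained.

```c
#include <stdint.h>
#include <time.h>

static inline uint64_t hacked_bswap64(uint64_t val)
{
	uint64_t tmp = val << 32 | val >> 32;
	return (((tmp & 0xff000000ff000000ULL) >> 24) |
		((tmp & 0x00ff000000ff0000ULL) >>  8) |
		((tmp & 0x0000ff000000ff00ULL) <<  8) |
		((tmp & 0x000000ff000000ffULL) << 24));
}

/*
 * Hypothetical harness: byte-swaps the values 0..n-1 and XORs the
 * results together, so the work has an observable result. The CPU
 * time used, in seconds, is stored in *secs.
 */
static uint64_t bench_bswap64(uint64_t n, double *secs)
{
	uint64_t sum = 0, i;
	clock_t start = clock();

	for (i = 0; i < n; i++)
		sum ^= hacked_bswap64(i);

	*secs = (double)(clock() - start) / CLOCKS_PER_SEC;
	return sum;
}
```

Printing the returned checksum together with *secs, for n = 1000000000,
gives a rough figure to compare. Depending on the compiler and flags,
either function may be recognized as a byte swap and turned into a
single instruction, so results can differ a lot between setups.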
I doubt that in normal usage git would spend enough time bswap64ing to
make this noticeable, but it was a fun micro-optimization on a wet
Thursday evening nevertheless :)
Best,
Gábor