On Thu, Mar 12, 2020 at 7:42 AM David Rowley <dgrowle...@gmail.com> wrote: > > I don't think Jesse's proposed solution is that great due to the > additional function call overhead for pg_count_leading_zeros_32(). The > (num & (num - 1)) == 0 I imagine will perform better, but I didn't > test it.
Right, I believe we've all landed on the same page about that. I see two ways of doing next_power_of_2_32 without an indirect function call, and leaving pg_leftmost_one_pos32 the same as it is now. I haven't measured either yet (or tested for that matter): static inline uint32 next_power_of_2_32(uint32 num) { Assert(num > 0 && num <= UINT_MAX / 2); /* use some bitmasking tricks to see if only 1 bit is on */ if (num & (num - 1)) == 0) return num; return ((uint32) 1) << (pg_leftmost_one_pos32(num) + 1) } OR { Assert(num > 0 && num <= UINT_MAX / 2); return ((uint32) 1) << ceil_log2_32(num); } static inline uint32 ceil_log2_32(uint32 num) { Assert(num > 0); if (num == 1) return 0; return pg_leftmost_one_pos32(num-1) + 1; } One naming thing I noticed: the name "next power of two" implies to me num *= 2 for a power of two, not the same as the input. The latter behavior is better called "ceil power of 2". > Also, wondering if you've looked at any of the other places where we > do "*= 2;" or "<<= 1;" inside a loop? There's quite a number that look > like candidates for using the new function. A brief look shows a few functions where this is done in a tight loop: nodes/list.c:new_list LWLockRegisterTranche ensure_record_cache_typmod_slot_exists pqCheckOutBufferSpace ExecChooseHashTableSize ExecHashBuildSkewHash choose_nelem_alloc init_htab hash_estimate_size hash_select_dirsize AllocSetAlloc -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services