On 05.11.24 16:03, Bertrand Drouvot wrote:
On Tue, Nov 05, 2024 at 05:08:41PM +1300, David Rowley wrote:
On Tue, 5 Nov 2024 at 06:39, Ranier Vilela <ranier...@gmail.com> wrote:
I think we can add a small optimization to this last patch [1].
I think if you want to make it faster, you could partially unroll the
inner-most loop, like:
// size_t * 4
for (; p < aligned_end - (sizeof(size_t) * 3); p += sizeof(size_t) * 4)
{
    if (((size_t *) p)[0] != 0 | ((size_t *) p)[1] != 0 |
        ((size_t *) p)[2] != 0 | ((size_t *) p)[3] != 0)
        return false;
}
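For reference, a minimal standalone sketch of the unrolling idea above. It is
not the code from the patch: the name mem_is_all_zeros is hypothetical, the
words are read with memcpy() and OR-ed together instead of being compared
one by one, and the loop bound is written as a remaining-byte count so that
short buffers never require subtracting from the end pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not the function from the patch. */
static bool
mem_is_all_zeros(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    const unsigned char *end;

    assert(ptr != NULL || len == 0);
    if (len == 0)
        return true;
    end = p + len;

    /* Check byte by byte until p is aligned to sizeof(size_t). */
    while (p < end && ((uintptr_t) p % sizeof(size_t)) != 0)
    {
        if (*p++ != 0)
            return false;
    }

    /* Unrolled part: four words per iteration, one branch per iteration. */
    while ((size_t) (end - p) >= sizeof(size_t) * 4)
    {
        size_t      w[4];

        memcpy(w, p, sizeof(w));
        if ((w[0] | w[1] | w[2] | w[3]) != 0)
            return false;
        p += sizeof(w);
    }

    /* Remaining whole words, one at a time. */
    while ((size_t) (end - p) >= sizeof(size_t))
    {
        size_t      w;

        memcpy(&w, p, sizeof(w));
        if (w != 0)
            return false;
        p += sizeof(w);
    }

    /* Trailing bytes. */
    while (p < end)
    {
        if (*p++ != 0)
            return false;
    }
    return true;
}
```

Whether the unrolled form is actually faster than the plain word loop would
need to be measured on the target platforms.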
Another option could be to use SIMD instructions to check that multiple bytes
are zero in a single operation. Maybe just an idea to keep in mind and experiment
with if we feel the need later on.
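To make the SIMD idea concrete, here is a rough sketch using SSE2 intrinsics.
It assumes an x86 target with SSE2 and falls back to a plain byte loop
elsewhere. The function name simd_is_all_zeros is made up for this example,
and nothing here is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper for illustration only. */
static bool
simd_is_all_zeros(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    size_t      i = 0;

    assert(ptr != NULL || len == 0);

#if defined(__SSE2__)
    {
        const __m128i zero = _mm_setzero_si128();

        /* Compare 16 bytes at a time against zero. */
        for (; len - i >= 16; i += 16)
        {
            /* Unaligned load, so no alignment prologue is needed. */
            __m128i     v = _mm_loadu_si128((const __m128i *) (p + i));

            /*
             * cmpeq sets each byte lane to 0xFF where the input byte is
             * zero; movemask gathers the top bit of every lane, so 0xFFFF
             * means all 16 bytes were zero.
             */
            if (_mm_movemask_epi8(_mm_cmpeq_epi8(v, zero)) != 0xFFFF)
                return false;
        }
    }
#endif

    /* Scalar tail, or the whole buffer when SSE2 is not available. */
    for (; i < len; i++)
    {
        if (p[i] != 0)
            return false;
    }
    return true;
}
```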
Speaking of which, couldn't you just use
pg_popcount(ptr, len) == 0
?
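A small standalone sketch of what that test amounts to. popcount_bytes below
is a hypothetical, deliberately naive stand-in for pg_popcount() so that the
example compiles outside the PostgreSQL tree; the real function is
implemented differently. The point is only that a buffer is all zeros
exactly when its population count is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_popcount(): counts set bits in a buffer. */
static uint64_t
popcount_bytes(const char *buf, size_t len)
{
    uint64_t    count = 0;

    for (size_t i = 0; i < len; i++)
    {
        unsigned char b = (unsigned char) buf[i];

        /* Clear the lowest set bit until none remain. */
        while (b != 0)
        {
            b &= (unsigned char) (b - 1);
            count++;
        }
    }
    return count;
}

/* All bytes are zero if and only if no bit is set anywhere. */
static bool
is_all_zeros_by_popcount(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    return popcount_bytes(buf, len) == 0;
}
```

One difference from the loops upthread: a popcount has to read the whole
buffer, so it cannot return early at the first nonzero byte.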