On 2016-12-20 08:10:29 -0500, Robert Haas wrote:
> We could use the GUC assign hook to compute a mask and a shift, so
> that this could be written as (CurrPos & mask_variable) == 0.  That
> would avoid the division instruction, though not the memory access.

I suspect that'd be fine.
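
For illustration, a minimal sketch of the mask-and-shift idea quoted
above. It assumes the segment size is always a power of two. All names
here (seg_size_mask, seg_size_shift, assign_seg_size, and the two
helpers) are made up for the example and are not the actual xlog.c
variables or the real GUC hook signature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cached values; not the real xlog.c globals. */
static uint64_t seg_size_mask;   /* seg_size - 1 */
static int      seg_size_shift;  /* log2(seg_size) */

/*
 * Stand-in for what a GUC assign hook could do: compute the mask and
 * shift once, when the setting changes.  Assumes seg_size is a nonzero
 * power of two.
 */
static void
assign_seg_size(uint64_t seg_size)
{
    assert(seg_size != 0 && (seg_size & (seg_size - 1)) == 0);

    seg_size_mask = seg_size - 1;
    seg_size_shift = 0;
    while ((UINT64_C(1) << seg_size_shift) < seg_size)
        seg_size_shift++;
}

/* Replaces (curr_pos % seg_size) == 0 without a division. */
static int
at_segment_boundary(uint64_t curr_pos)
{
    return (curr_pos & seg_size_mask) == 0;
}

/* Replaces curr_pos / seg_size without a division. */
static uint64_t
segment_number(uint64_t curr_pos)
{
    return curr_pos >> seg_size_shift;
}
```

The hot path still has to load seg_size_mask from memory, so this only
removes the division, not the memory access.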

> I hope this is all in the noise, though.

Could very well be.

> I know this code is hot but I think it'll be hard to construct a
> test case where the bottleneck is anything other than the speed at
> which the disk can absorb bytes.

I don't think that's really true. Heikki's WAL changes made a *BIG*
difference, and pretty small changes in xlog.c can make noticeable
throughput differences in both single- and multi-threaded
workloads. That's witnessed by the fact that the CRC computation used
to be a major bottleneck (and the crc32c instruction still shows up
noticeably in profiles).  SSDs have become fast enough that it's
increasingly hard to saturate them.


