Re: WAL prefetch

Amit Kapila Fri, 15 Jun 2018 08:03:58 -0700

On Fri, Jun 15, 2018 at 1:08 PM, Konstantin Knizhnik
<[email protected]> wrote:
>
>
> On 15.06.2018 07:36, Amit Kapila wrote:
>>
>> On Fri, Jun 15, 2018 at 12:16 AM, Stephen Frost <[email protected]>
>> wrote:
>>>>
>>>> I have tested wal_prefetch on two powerful servers with 24 cores, a 3TB
>>>> NVMe RAID 10 storage device and 256GB of RAM, connected using InfiniBand.
>>>> The speed of synchronous replication between the two nodes increased from
>>>> 56k TPS to 60k TPS (on pgbench with scale 1000).
>>>
>>> I'm also surprised that it wasn't a larger improvement.
>>>
>>> Seems like it would make sense to implement in core using
>>> posix_fadvise(), perhaps in the wal receiver and in RestoreArchivedFile
>>> or nearby..  At least, that's the thinking I had when I was chatting w/
>>> Sean.
>>>
>> Doing it in core certainly has some advantages, such as being able to
>> reuse the existing xlog code rather than trying to make a copy as is
>> currently done in the patch, but I think it also depends on whether this
>> is really a win in a number of common cases or just a win in some
>> limited cases.
>>
> I completely agree. It was my main concern: in which use cases this
> prefetch will be efficient.
> If "full_page_writes" is on (which is the safe and default value), then the
> first update of a page since the last checkpoint will be written to WAL as
> a full page image, and applying it will not require reading any data from disk.
>


What exactly do you mean by the above?  AFAIU, it needs to read WAL to
apply the full page image.  See the code below:

XLogReadBufferForRedoExtended()
{
    ..
    /* If it has a full-page image and it should be restored, do it. */
    if (XLogRecBlockImageApply(record, block_id))
    {
        Assert(XLogRecHasBlockImage(record, block_id));
        *buf = XLogReadBufferExtended(rnode, forknum, blkno,
            get_cleanup_lock ? RBM_ZERO_AND_CLEANUP_LOCK : RBM_ZERO_AND_LOCK);
        page = BufferGetPage(*buf);
        if (!RestoreBlockImage(record, block_id, page))
            ..
}
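
To make the distinction concrete, here is a toy model of that branch. It is
only a sketch, not PostgreSQL code: ToyBlockRef, ToyIoStats, toy_redo_block
and toy_read_from_datafile are hypothetical names made up for illustration.
It assumes that the RBM_ZERO_* modes hand back a zeroed buffer without
reading the relation file, so the page contents come entirely from the image
stored in the WAL record, which itself still has to be read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 8192

/* Hypothetical stand-in for one block reference inside a WAL record. */
typedef struct ToyBlockRef
{
    bool        apply_image;            /* full-page image should be restored */
    char        image[TOY_PAGE_SIZE];   /* image carried in the WAL record */
} ToyBlockRef;

/* Hypothetical counters recording where the page bytes came from. */
typedef struct ToyIoStats
{
    int         datafile_reads;     /* pages read from the relation file */
    int         wal_image_copies;   /* pages restored from WAL record data */
} ToyIoStats;

/* Stand-in for reading the block from the relation's data file. */
static void
toy_read_from_datafile(char *page, ToyIoStats *stats)
{
    memset(page, 'D', TOY_PAGE_SIZE);
    stats->datafile_reads++;
}

/*
 * Toy redo step for a single block: either restore the full-page image
 * from the WAL record, or fall back to reading the data file.
 */
static void
toy_redo_block(const ToyBlockRef *ref, char *page, ToyIoStats *stats)
{
    assert(ref != NULL && page != NULL && stats != NULL);

    if (ref->apply_image)
    {
        /* Analogous to RBM_ZERO_AND_LOCK: zeroed buffer, no data-file read. */
        memset(page, 0, TOY_PAGE_SIZE);
        /* Analogous to RestoreBlockImage(): copy the image out of the record. */
        memcpy(page, ref->image, TOY_PAGE_SIZE);
        stats->wal_image_copies++;
    }
    else
    {
        /* No usable image: the old page contents must come from disk. */
        toy_read_from_datafile(page, stats);
    }
}
```

In this model the full-page-image path never touches the data file, but it
does depend on the WAL record being available, which is the read I am
referring to above.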


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
