Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-13 Thread Heikki Linnakangas
On 11/13/2014 01:06 AM, Jim Nasby wrote: On 11/12/14, 9:47 AM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 11/12/2014 05:20 PM, Tom Lane wrote: On reconsideration I think the "RBM_ZERO returns page already locked" alternative may be the less ugly. That has the

[HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Heikki Linnakangas
There's a race condition between a backend running queries in hot standby mode, and restoring a full-page image from a WAL record. It's present in all supported versions. RestoreBackupBlockContents does this: buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork, bkpb.block,

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: There's a race condition between a backend running queries in hot standby mode, and restoring a full-page image from a WAL record. It's present in all supported versions. I can think of two ways to fix this: 1. Have ReadBufferExtended

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Robert Haas
On Wed, Nov 12, 2014 at 7:39 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: 2. When ReadBufferExtended doesn't find the page in cache, it returns the buffer in !BM_VALID state (i.e. still in I/O in-progress state). Require the caller to call a second function, after locking the page, to

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Heikki Linnakangas
On 11/12/2014 04:56 PM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: There's a race condition between a backend running queries in hot standby mode, and restoring a full-page image from a WAL record. It's present in all supported versions. I can think of two ways to fix

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: On 11/12/2014 04:56 PM, Tom Lane wrote: Not great either. What about an RBM_NOERROR mode that is like RBM_ZERO in terms of handling error conditions, but does not forcibly zero the page if it's already valid? Anyway, you don't want to read

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Heikki Linnakangas
On 11/12/2014 05:08 PM, Robert Haas wrote: On Wed, Nov 12, 2014 at 7:39 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: 2. When ReadBufferExtended doesn't find the page in cache, it returns the buffer in !BM_VALID state (i.e. still in I/O in-progress state). Require the caller to call a

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Heikki Linnakangas
On 11/12/2014 05:20 PM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 11/12/2014 04:56 PM, Tom Lane wrote: Not great either. What about an RBM_NOERROR mode that is like RBM_ZERO in terms of handling error conditions, but does not forcibly zero the page if it's already

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes: On 11/12/2014 05:20 PM, Tom Lane wrote: On reconsideration I think the "RBM_ZERO returns page already locked" alternative may be the less ugly. That has the advantage that any code that doesn't get updated will fail clearly and reliably.
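
[Illustration, not part of the archived message] The "returns page already locked" idea can be sketched with a toy, single-threaded model. It assumes the window under discussion is the gap between zeroing a page and the caller taking the buffer lock, which the excerpts in this thread suggest but do not spell out. ToyBuffer and every toy_* function are hypothetical names invented for this sketch; none of them are PostgreSQL APIs, and the "reader" is simulated by a plain function call at the point where a concurrent backend could be scheduled.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 8   /* tiny page, for illustration only */

/* Hypothetical stand-in for a shared buffer; not a PostgreSQL structure. */
typedef struct ToyBuffer
{
    bool          exclusively_locked;
    unsigned char page[TOY_PAGE_SIZE];
} ToyBuffer;

/* Start with an unlocked buffer holding some old, non-zero contents. */
static void toy_init(ToyBuffer *buf)
{
    buf->exclusively_locked = false;
    memset(buf->page, 0xAB, TOY_PAGE_SIZE);
}

/* Would a reader arriving right now get in and find an all-zero page? */
static bool toy_reader_sees_zeroed_page(const ToyBuffer *buf)
{
    if (buf->exclusively_locked)
        return false;           /* a real reader would wait on the lock */
    for (int i = 0; i < TOY_PAGE_SIZE; i++)
        if (buf->page[i] != 0)
            return false;
    return true;
}

/* Variant A: zero the page and hand it back unlocked. */
static void toy_read_zero(ToyBuffer *buf)
{
    memset(buf->page, 0, TOY_PAGE_SIZE);
}

/* Variant B: take the lock first, then zero, and hand it back locked. */
static void toy_read_zero_and_lock(ToyBuffer *buf)
{
    buf->exclusively_locked = true;
    memset(buf->page, 0, TOY_PAGE_SIZE);
}

/* Copy the saved page image in and release the lock. */
static void toy_restore_and_unlock(ToyBuffer *buf, const unsigned char *image)
{
    memcpy(buf->page, image, TOY_PAGE_SIZE);
    buf->exclusively_locked = false;
}

/* Variant A end to end; returns true if the simulated reader saw zeros. */
bool toy_window_with_zero_then_lock(void)
{
    const unsigned char image[TOY_PAGE_SIZE] = {1, 2, 3, 4, 5, 6, 7, 8};
    ToyBuffer buf;

    toy_init(&buf);
    toy_read_zero(&buf);
    bool exposed = toy_reader_sees_zeroed_page(&buf); /* reader runs here */
    buf.exclusively_locked = true;                    /* caller locks late */
    toy_restore_and_unlock(&buf, image);
    return exposed;
}

/* Variant B end to end; the same reader finds the buffer already locked. */
bool toy_window_with_zero_and_lock(void)
{
    const unsigned char image[TOY_PAGE_SIZE] = {1, 2, 3, 4, 5, 6, 7, 8};
    ToyBuffer buf;

    toy_init(&buf);
    toy_read_zero_and_lock(&buf);
    bool exposed = toy_reader_sees_zeroed_page(&buf); /* reader runs here */
    toy_restore_and_unlock(&buf, image);
    return exposed;
}
```

In this model, toy_window_with_zero_then_lock() reports that the reader observed the zeroed page, while toy_window_with_zero_and_lock() reports that it did not. A real implementation would also have to deal with buffer pins, I/O-in-progress state and actual concurrency, none of which are modelled here.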

Re: [HACKERS] Race condition between hot standby and restoring a FPW

2014-11-12 Thread Jim Nasby
On 11/12/14, 9:47 AM, Tom Lane wrote: Heikki Linnakangas hlinnakan...@vmware.com writes: On 11/12/2014 05:20 PM, Tom Lane wrote: On reconsideration I think the "RBM_ZERO returns page already locked" alternative may be the less ugly. That has the advantage that any code that doesn't get updated