Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/10/2014 09:44 PM, Robert Haas wrote:

On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch n...@leadboat.com wrote:

When GIN changes a metapage, we WAL-log its ex-header content and never use a
backup block.  This reduces WAL volume since the vast majority of the metapage
is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
content if the metapage LSN predates the WAL record LSN.  If a metapage write
tore and updated the LSN but not the other content, we would fail to complete
the update.  Instead, unconditionally reinitialize the metapage similar to how
_bt_restore_meta() handles the situation.

I found this problem by code reading and did not attempt to build a test case
illustrating its practical consequences.  It's possible that there's no
problem in practice on account of some reason I haven't contemplated.


The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Robert Haas
On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 The attached patch doesn't apply any more, but it looks like this
 issue still exists.

 Fixed.

Did you forget to push?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-12 Thread Heikki Linnakangas

On 03/12/2014 02:05 PM, Robert Haas wrote:

On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

The attached patch doesn't apply any more, but it looks like this
issue still exists.


Fixed.


Did you forget to push?


Yep. Pushed now.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2014-03-10 Thread Robert Haas
On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch n...@leadboat.com wrote:
 When GIN changes a metapage, we WAL-log its ex-header content and never use a
 backup block.  This reduces WAL volume since the vast majority of the metapage
 is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
 content if the metapage LSN predates the WAL record LSN.  If a metapage write
 tore and updated the LSN but not the other content, we would fail to complete
 the update.  Instead, unconditionally reinitialize the metapage similar to how
 _bt_restore_meta() handles the situation.

 I found this problem by code reading and did not attempt to build a test case
 illustrating its practical consequences.  It's possible that there's no
 problem in practice on account of some reason I haven't contemplated.

The attached patch doesn't apply any more, but it looks like this
issue still exists.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-03 Thread Robert Haas
On Thu, May 3, 2012 at 12:16 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Having said all that, I wasn't really arguing that this was a guaranteed
 safe thing for us to rely on; just pointing out that it's quite likely
 that the issue hasn't been seen in the field because of this type of
 consideration.

Well, we do rely, in numerous places, on writes  512 bytes not
getting torn.  pd_prune_xid, index tuple kills, heap tuple hint bits,
relmapper files, etc.  We generally assume, for example, that a 4-byte
write which is 4-byte aligned does not need to be WAL-logged, which
would be necessary if we thought that the write might be torn.

Are you planning to commit Noah's patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Are you planning to commit Noah's patch?

I wasn't intending to do so personally in the near future; I've got
other things on my to-do list.  I won't object if somebody else
commits it though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Noah Misch
On Mon, Apr 30, 2012 at 02:35:20PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  When GIN changes a metapage, we WAL-log its ex-header content and never use 
  a
  backup block.  This reduces WAL volume since the vast majority of the 
  metapage
  is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
  content if the metapage LSN predates the WAL record LSN.  If a metapage 
  write
  tore and updated the LSN but not the other content, we would fail to 
  complete
  the update.  Instead, unconditionally reinitialize the metapage similar to 
  how
  _bt_restore_meta() handles the situation.
 
  I found this problem by code reading and did not attempt to build a test 
  case
  illustrating its practical consequences.  It's possible that there's no
  problem in practice on account of some reason I haven't contemplated.
 
 I think there's no problem in practice; the reason is that the
 GinMetaPageData struct isn't large enough to extend past the first
 physical sector of the page.  So it's in the same disk sector as the
 LSN and tearing is impossible.  Still, this might be a good
 future-proofing move, in case GinMetaPageData gets larger.

Can we indeed assume that all support-worthy filesystems align the start of
every file to a physical sector?  I know little about modern filesystem
design, but these references leave me wary of that assumption:

http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
http://en.wikipedia.org/wiki/Block_suballocation

If it is a safe assumption, we could exploit it elsewhere.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Daniel Farina
On Wed, May 2, 2012 at 6:06 PM, Noah Misch n...@leadboat.com wrote:
 Can we indeed assume that all support-worthy filesystems align the start of
 every file to a physical sector?  I know little about modern filesystem
 design, but these references leave me wary of that assumption:

 http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
 http://en.wikipedia.org/wiki/Block_suballocation

 If it is a safe assumption, we could exploit it elsewhere.

Not to say whether this is safe or not, but it *is* exploited
elsewhere, as I understand it: the pg_control information, whose
justification for its safety is its small size.  That may point to a
very rare problem with pg_control rather the safety of the assumption
it makes.

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-05-02 Thread Tom Lane
Daniel Farina dan...@heroku.com writes:
 On Wed, May 2, 2012 at 6:06 PM, Noah Misch n...@leadboat.com wrote:
 Can we indeed assume that all support-worthy filesystems align the start of
 every file to a physical sector?  I know little about modern filesystem
 design, but these references leave me wary of that assumption:
 
 http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg14690.html
 http://en.wikipedia.org/wiki/Block_suballocation
 
 If it is a safe assumption, we could exploit it elsewhere.

 Not to say whether this is safe or not, but it *is* exploited
 elsewhere, as I understand it: the pg_control information, whose
 justification for its safety is its small size.  That may point to a
 very rare problem with pg_control rather the safety of the assumption
 it makes.

I think it's somewhat common now for filesystems to attempt to optimize
very small files (on the order of a few dozen bytes) in that way.  It's
hard to see where's the upside for changing the conventional storage
allocation when the file is sector-sized or larger; the file system does
have to be prepared to rewrite the file on demand, and moving it from
one place to another isn't cheap.

That wikipedia reference argues for doing this type of optimization on
the last partial block of a file, which is entirely irrelevant for our
purposes since we always ask for page-multiples of space.  (The fact
that much of that might be useless padding is, I think, unknown to the
filesystem.)

Having said all that, I wasn't really arguing that this was a guaranteed
safe thing for us to rely on; just pointing out that it's quite likely
that the issue hasn't been seen in the field because of this type of
consideration.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()

2012-04-30 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 When GIN changes a metapage, we WAL-log its ex-header content and never use a
 backup block.  This reduces WAL volume since the vast majority of the metapage
 is unused.  However, ginRedoUpdateMetapage() only restores the WAL-logged
 content if the metapage LSN predates the WAL record LSN.  If a metapage write
 tore and updated the LSN but not the other content, we would fail to complete
 the update.  Instead, unconditionally reinitialize the metapage similar to how
 _bt_restore_meta() handles the situation.

 I found this problem by code reading and did not attempt to build a test case
 illustrating its practical consequences.  It's possible that there's no
 problem in practice on account of some reason I haven't contemplated.

I think there's no problem in practice; the reason is that the
GinMetaPageData struct isn't large enough to extend past the first
physical sector of the page.  So it's in the same disk sector as the
LSN and tearing is impossible.  Still, this might be a good
future-proofing move, in case GinMetaPageData gets larger.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers