Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund and...@2ndquadrant.comwrote:


  
  That seems to be safe to me. Anything thats been read above can't really
  change. The tuple is already locked, so a concurrent update/delete has to
  wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
  happen either. I can't see any other operation that can really change
 those
  fields.

 We only get the pin right there, I don't see any preexisting pin. Which
 means we might have just opened a page thats in the process of being
 pruned/vacuumed by another backend.


Hmm. Yeah, you're right. That is a possible risky scenario. Even though
cleanup lock waits for all pins to go away, it will work only if every
reader takes at least a SHARE lock unless it was continuously holding a pin
on a buffer (in which case its OK to drop lock and read a tuple body
without reacquiring it again). Otherwise, as you rightly pointed out, we
could actually be reading a page which being actively cleaned up and tuples
are being moved around.


 I think a concurrent heap_(insert|update)/PageAddItem might actually be
 already dangerous because it might move line pointers around


I don't we move the line pointers around ever because the indexes will be
pointing to them. But the vacuum/prune is dangerous enough to require a
SHARE lock here in any case.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 6:21 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
  On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund and...@2ndquadrant.com
 wrote:
 
  

That seems to be safe to me. Anything thats been read above can't
 really
change. The tuple is already locked, so a concurrent update/delete
 has to
wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
happen either. I can't see any other operation that can really change
   those
fields.
  
   We only get the pin right there, I don't see any preexisting pin. Which
   means we might have just opened a page thats in the process of being
   pruned/vacuumed by another backend.
  
 
  Hmm. Yeah, you're right. That is a possible risky scenario. Even though
  cleanup lock waits for all pins to go away, it will work only if every
  reader takes at least a SHARE lock unless it was continuously holding a
 pin
  on a buffer (in which case its OK to drop lock and read a tuple body
  without reacquiring it again). Otherwise, as you rightly pointed out, we
  could actually be reading a page which being actively cleaned up and
 tuples
  are being moved around.

 Well, live (or recently dead) tuples can't be just moved arround unless
 I miss something. That would cause problems with with HeapTuples
 directly pointing into the page as youve noticed.


I think we confusing with the terminology. The tuple headers and bodies
(live or dead) can be moved around freely as long as you have a cleanup
lock on the page. Thats how HOT-prune frees up fragmented space.


   I think a concurrent heap_(insert|update)/PageAddItem might actually be
   already dangerous because it might move line pointers around
  
  
  I don't we move the line pointers around ever because the indexes will be
  pointing to them.

 The indexes point to a tid, not to a line pointer. So reshuffling the
 linepointer array - while keeping the tids valid - is entirely
 fine. Right?


I think its again terminology confusion :-) I thought TID and Line Pointers
are the same and used interchangeably. Or at least I've done so for long.
But I think you are reading line pointer as the thing stored in the ItemId
structure and not the ItemId itself.

Also, PageAddItem() doesn't move things around normally. It does that only
during recovery, at least for heao pages. Elsewhere it will either reuse an
UNUSED pointer or add at the end. Isn't that how it is ?

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund and...@2ndquadrant.comwrote:



  heap_fetch() though looks very similar has an important difference. It
  might be reading the tuple without lock on it and we need the buffer lock
  to protect against concurrent update/delete to the tuple. AFAIK once the
  tuple is passed through qualification checks etc, even callers of
  heap_fetch() can read the tuple data without any lock on the buffer
 itself.

 Sure, but the page header isn't accessed there, just the tuple data
 itself which always stays at the same place on the page.
 (A normal heap_fetch shouldn't be worried about updates/deletions to its
 tuple, MVCC to the rescue.)


heap_fetch() reads the tuple header though to apply snapshot visibility
rules. So a SHARE lock is must for that purpose because a concurrent
update/delete may set XMAX or other visibility related fields. On the other
hand, you can read the tuple body without a page lock if you were holding a
pin on the buffer continuously since you last dropped the lock.

In any case, a lock is needed in GetTupleForTrigger unless someone can
argue that a pin is continuously held on the buffer since the lock was last
dropped, something I don't see happening in this case.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 6:55 PM, Andres Freund and...@2ndquadrant.comwrote:



 Hm? It doesn't move the page contents around but it moves the ItemId
 array during completely normal operation (c.f. needshuffle in
 PageAddItem). Or am I missing something?


I think that probably only used for non-heap pages. For heap pages, it just
doesn't make sense to shuffle the ItemId array. That would defeat the
entire purpose of having them in the first place.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2012-11-30 14:08:05 -0500, Tom Lane wrote:
 BTW, I went looking for other places that might have a similar mistake.
 I found that contrib/pageinspect/btreefuncs.c pokes around in btree
 pages without any buffer lock, which seems pretty broken --- don't we
 want it to take share lock?

 I seem to remember comments somewhere indicating that pageinspect (?)
 doesn't take locks by intention to make debugging of locking problems
 easier. Not sure whether thats really realistic, but ...

Dunno, that seems like a pretty lame argument when compared to the very
real possibility of crashing due to processing corrupt data.  Besides,
the heap-page examination functions in the same module take buffer lock,
so why shouldn't these?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-29 Thread Andres Freund
Hi,

while looking at trigger.c from the POV of foreign key locks I noticed
that GetTupleForTrigger commentless assumes it can just look at a pages
content without a
LockBuffer(buffer, BUFFER_LOCK_SHARE);

That code path is only reached for AFTER ... FOR EACH ROW ... triggers,
so its fine it's not locking the tuple itself. That already needs to
have happened before.

The code in question:

if (newslot_which_is_passed_by_before_triggers)
...
else
{
Pagepage;
ItemId  lp;

buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));

page = BufferGetPage(buffer);
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));

Assert(ItemIdIsNormal(lp));

tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
tuple.t_len = ItemIdGetLength(lp);
tuple.t_self = *tid;
tuple.t_tableOid = RelationGetRelid(relation);
}

result = heap_copytuple(tuple);
ReleaseBuffer(buffer);

As can be seen in the excerpt above this is basically a very stripped
down heap_fetch(). But without any locking on the buffer we just read.

I can't believe this is safe? Just about anything but eviction could
happen to that buffer at that moment?

Am I missing something?

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-29 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 3:19 AM, Andres Freund and...@2ndquadrant.comwrote:

 Hi,

 while looking at trigger.c from the POV of foreign key locks I noticed
 that GetTupleForTrigger commentless assumes it can just look at a pages
 content without a
 LockBuffer(buffer, BUFFER_LOCK_SHARE);

 That code path is only reached for AFTER ... FOR EACH ROW ... triggers,
 so its fine it's not locking the tuple itself. That already needs to
 have happened before.

 The code in question:

 if (newslot_which_is_passed_by_before_triggers)
 ...
 else
 {
 Pagepage;
 ItemId  lp;

 buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));

 page = BufferGetPage(buffer);
 lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));

 Assert(ItemIdIsNormal(lp));

 tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
 tuple.t_len = ItemIdGetLength(lp);
 tuple.t_self = *tid;
 tuple.t_tableOid = RelationGetRelid(relation);
 }

 result = heap_copytuple(tuple);
 ReleaseBuffer(buffer);

 As can be seen in the excerpt above this is basically a very stripped
 down heap_fetch(). But without any locking on the buffer we just read.

 I can't believe this is safe? Just about anything but eviction could
 happen to that buffer at that moment?

 Am I missing something?


That seems to be safe to me. Anything thats been read above can't really
change. The tuple is already locked, so a concurrent update/delete has to
wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
happen either. I can't see any other operation that can really change those
fields.

heap_fetch() though looks very similar has an important difference. It
might be reading the tuple without lock on it and we need the buffer lock
to protect against concurrent update/delete to the tuple. AFAIK once the
tuple is passed through qualification checks etc, even callers of
heap_fetch() can read the tuple data without any lock on the buffer itself.

Thanks,
Pavan
-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee