Re: [HACKERS] B-tree descend for insertion locking

2014-09-03 Thread Peter Geoghegan
On Tue, Mar 18, 2014 at 4:12 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 When inserting into a B-tree index, all the pages are read-locked when
 descending the tree. When we reach the leaf page, the read-lock is exchanged
 for a write-lock.

 There's nothing wrong with that, but why don't we just directly grab a
 write-lock on the leaf page?


Whatever happened to this idea?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree descend for insertion locking

2014-03-19 Thread Simon Riggs
On 18 March 2014 11:12, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 When inserting into a B-tree index, all the pages are read-locked when
 descending the tree. When we reach the leaf page, the read-lock is exchanged
 for a write-lock.

 There's nothing wrong with that, but why don't we just directly grab a
 write-lock on the leaf page? When descending, we know the level we're on,
 and what level the child page is. The only downside I can see is that we
 would unnecessarily hold a write-lock when a read-lock would suffice, if the
 page was just split and we have to move right. But that seems like a really
 bad bet - hitting the page when it was just split is highly unlikely.

Sounds good.

Grabbing write lock directly will reduce contention on the buffer, not
just reduce the code path.

If we have a considerable number of duplicates we would normally step
thru until we found a place to insert. Presumably that will happen
with write lock enabled, rather than read lock. Would this slow down
the insertion of highly duplicate keys under concurrent load? i.e. is
this a benefit for nearly-unique but not for other cases?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree descend for insertion locking

2014-03-18 Thread Peter Geoghegan
On Tue, Mar 18, 2014 at 4:12 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 See attached patch. The new contract of _bt_getroot is a bit weird: it locks
 the returned page in the requested lock-mode, shared or exclusive, if the
 root page was also the leaf page. Otherwise it's locked in shared mode
 regardless off the requested lock mode. But OTOH, the new contract for
 _bt_search() is more clear now: it actually locks the returned page in the
 requested mode, where it used to only use the access parameter to decide
 whether to create a root page if the index was empty.

I had considered this myself, and am under the impression that the
only reason things work this way is because it makes the interface of
_bt_getroot() clearer. I think what you propose is logical, and you
should pursue it, but fwiw I'm not convinced that you've really
simplified _bt_search(). However, you have kind of made _bt_search()
into something that succinctly represents what is useful about Lehman
and Yao as compared to earlier concurrent locking techniques, and
that's a good thing. I would probably have remarked on that in the
comments if I were you. I certainly would not have left out some
reference to LY. You've removed an existing one where the lock
escalation occurs, emphasizing that it's safe, and I'd like to see you
at least compensate for that.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree descend for insertion locking

2014-03-18 Thread Amit Kapila
On Tue, Mar 18, 2014 at 4:42 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 When inserting into a B-tree index, all the pages are read-locked when
 descending the tree. When we reach the leaf page, the read-lock is exchanged
 for a write-lock.

 There's nothing wrong with that, but why don't we just directly grab a
 write-lock on the leaf page? When descending, we know the level we're on,
 and what level the child page is. The only downside I can see is that we
 would unnecessarily hold a write-lock when a read-lock would suffice, if the
 page was just split and we have to move right. But that seems like a really
 bad bet - hitting the page when it was just split is highly unlikely.

Another case could be when the page is half dead or deleted, but again
chances of same are relatively less.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers