On Wed, Jun 22, 2016 at 8:48 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Wed, Jun 22, 2016 at 5:14 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>> We can do it the way you are suggesting, but there is another thing
>> we need to consider here.  As of now, the patch tries to finish the
>> split if it finds the split-in-progress flag in either the old or the
>> new bucket.  We need to lock both the old and new buckets to finish
>> the split, so it is quite possible that two different backends try to
>> lock them in opposite orders, leading to a deadlock.  I think the
>> correct way to handle this is to always lock the old bucket first and
>> then the new bucket.  To achieve that, if an insertion into the new
>> bucket finds that the split-in-progress flag is set, it needs to
>> release its lock, acquire the lock on the old bucket first, ensure
>> its pin count is 1, then lock the new bucket again and ensure its pin
>> count is 1.  I have already maintained this lock order in scans (old
>> bucket first, then new bucket; see the changes in _hash_first()).
>> Alternatively, we could try to finish the splits only when someone
>> inserts into the old bucket.
> Yes, I think locking buckets in increasing order is a good solution.
> I also think it's fine to only try to finish the split when the insert
> targets the old bucket.  Finishing the split enables us to remove
> tuples from the old bucket, which lets us reuse existing space instead
> of allocating more.  So there is at least some potential benefit to
> the backend inserting into the old bucket.  On the other hand, a
> process inserting into the new bucket derives no direct benefit from
> finishing the split.

Okay, following this suggestion, I have updated the patch so that only
an insertion into the old bucket can try to finish the split.  Apart
from that, I have fixed the issue reported by Mithun upthread and
updated the README to explain the locking used in the patch.  I have
also changed the locking around vacuum so that it can work with
concurrent scans whenever possible.  In the previous patch version,
vacuum took a cleanup lock on a bucket to remove the dead tuples and
the moved-due-to-split tuples and to perform the squeeze operation,
and it held that lock on the bucket until the end of cleanup.  Now it
still takes a cleanup lock on the bucket to out-wait scans, but it
releases the lock as it proceeds to clean the overflow pages.  The
idea is to first lock the next bucket page and only then release the
lock on the current bucket page.  This ensures that any concurrent
scan started after we begin cleaning the bucket always stays behind
the cleanup; if scans were allowed to overtake the vacuum, it could
remove tuples that those scans still need.  For the squeeze phase, we
proceed only if the pin count of the buffer is one (we already hold an
exclusive lock on the bucket's buffer by that time); otherwise we
retry the squeeze the next time cleanup is required for that bucket.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: concurrent_hash_index_v3.patch
Description: Binary data

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)