On Wed, Nov 2, 2016 at 6:39 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Oct 24, 2016 at 10:30 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: >> [ new patches ] > > I looked over parts of this today, mostly the hashinsert.c changes. > > + /* > + * Copy bucket mapping info now; The comment in _hash_expandtable where > + * we copy this information and calls _hash_splitbucket explains why this > + * is OK. > + */ > > So, I went and tried to find the comments to which this comment is > referring and didn't have much luck. >
I guess you have just tried to find it in the patch. However, the comment I am referring above is an existing comment in _hash_expandtable(). Refer below comment: /* * Copy bucket mapping info now; this saves re-accessing the meta page * inside _hash_splitbucket's inner loop. ... > At the point this code is > running, we have a pin but no lock on the metapage, so this is only > safe if changing any of these fields requires a cleanup lock on the > metapage. If that's true, > No that's not true, we need just Exclusive content lock to update those fields and these fields should be copied when we have Share content lock on metapage. In version-8 of patch, it was correct, but in last version, it seems during code re-arrangement, I have moved it. I will change it such that these values are copied under matapage share content lock. I think moving it just before the preceding for loop should be okay, let me know if you think otherwise. > + nblkno = _hash_get_newblk(rel, pageopaque); > > I think this is not a great name for this function. It's not clear > what "new blocks" refers to, exactly. I suggest > FIND_SPLIT_BUCKET(metap, bucket) or OLD_BUCKET_TO_NEW_BUCKET(metap, > bucket) returning a new bucket number. I think that macro can be > defined as something like this: bucket + (1 << > (fls(metap->hashm_maxbucket) - 1)). > I think such a macro would not work for the usage of incomplete splits. The reason is that by the time we try to complete the split of the current old bucket, the table half (lowmask, highmask, maxbucket) would have changed and it could give you the bucket in new table half. > Then do nblkno = > BUCKET_TO_BLKNO(metap, newbucket) to get the block number. That'd all > be considerably simpler than what you have for hash_get_newblk(). > I think to use BUCKET_TO_BLKNO we need either a share or exclusive lock on metapage and as we need a lock on metapage to find new block from old block, I thought it is better to do inside _hash_get_newblk(). > > Moving on ... > > /* > * ovfl page exists; go get it. if it doesn't have room, we'll > - * find out next pass through the loop test above. > + * find out next pass through the loop test above. Retain the > + * pin, if it is a primary bucket page. > */ > - _hash_relbuf(rel, buf); > + if (pageopaque->hasho_flag & LH_BUCKET_PAGE) > + _hash_chgbufaccess(rel, buf, HASH_READ, HASH_NOLOCK); > + else > + _hash_relbuf(rel, buf); > > It seems like it would be cheaper, safer, and clearer to test whether > buf != bucket_buf here, rather than examining the page opaque data. > That's what you do down at the bottom of the function when you ensure > that the pin on the primary bucket page gets released, and it seems > like it should work up here, too. > > + bool retain_pin = false; > + > + /* page flags must be accessed before releasing lock on a page. > */ > + retain_pin = pageopaque->hasho_flag & LH_BUCKET_PAGE; > > Similarly. > Agreed, will change the usage as per your suggestion. > I have also attached a patch with some suggested comment changes. > I will include it in next version of patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers