On Thu, Feb 16, 2017 at 7:15 AM, Robert Haas <robertmh...@gmail.com> wrote: > On Mon, Feb 13, 2017 at 10:22 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: >> As discussed, attached are refactoring patches and a patch to enable >> WAL for the hash index on top of them. > > Thanks. I think that the refactoring patches shouldn't add > START_CRIT_SECTION() and END_CRIT_SECTION() calls; let's leave that > for the final patch. Nor should their comments reference the idea of > writing WAL; that should also be for the final patch. >
Okay, changed as per suggestion. > PageGetFreeSpaceForMulTups doesn't seem like a great name. > PageGetFreeSpaceForMultipleTuples? Okay, changed as per suggestion. > Or maybe just use > PageGetExactFreeSpace and then do the account in the caller. I'm not > sure it's really worth having a function just to subtract a multiple > of sizeof(ItemIdData), and it would actually be more efficient to have > the caller take care of this, since you wouldn't need to keep > recalculating the value for every iteration of the loop. > I have tried this one, but I think even if track outside we need something like PageGetFreeSpaceForMultipleTuples for the cases when we have to try next write page/'s where data (set of index tuples) can be accommodated. > I think we ought to just rip (as a preliminary patch) out the support > for freeing an overflow page in the middle of a bucket chain. That > code is impossible to hit, I believe, and instead of complicating the > WAL machinery to cater to a nonexistent case, maybe we ought to just > get rid of it. > I also could not think of a case where we need to care about the page in the middle of bucket chain as per it's current usage. In particular, are you worried about the code related to nextblock number in _hash_freeovflpage()? Surely, we can remove it, but what complexity are you referring to here? There is some additional book-keeping for primary bucket page, but there is nothing much about maintaining a backward link. One more thing to note is that this is an exposed API, so to avoid getting it used in some wrong way, we might want to make it static. > + /* > + * We need to release and if required > reacquire the lock on > + * rbuf to ensure that standby > shouldn't see an intermediate > + * state of it. > + */ > > How does releasing and reacquiring the lock on the master affect > whether the standby can see an intermediate state? I am talking about the intermediate state of standby with respect to the master. We will release the lock on standby after replaying the WAL for moving tuples from read page to write page whereas master will hold that for the much longer period (till we move all the tuples from read page and free that page). I understand that there is no correctness issue here, but was trying to make operations on master and standby similar and another thing is that it will help us in obeying the coding rule of releasing the lock on buffers after writing WAL record. I see no harm in maintaining the existing coding pattern, however, if you think that is not important in this case, then I can change the code to avoid releasing the lock on read page. > That sounds > totally wrong to me (and also doesn't belong in a refactoring patch > anyway since we're not emitting any WAL here). > Agreed that even if we want to do such a change, then also it should be part of WAL patch. So for now, I have reverted that change. > - Assert(PageIsNew(page)); > > This worries me a bit. What's going on here? > This is related to below change. + /* + * Initialise the freed overflow page, here we can't complete zeroed the + * page as WAL replay routines expect pages to be initialized. See + * explanation of RBM_NORMAL mode atop XLogReadBufferExtended. + */ + _hash_pageinit(ovflpage, BufferGetPageSize (ovflbuf)); Basically, we need this routine to initialize freed overflow pages. Also, if you see the similar function in btree (_bt_pageinit(Page page, Size size)), it doesn't have any such Assertion. Attached are refactoring patches. WAL patch needs some changes based on above comments, so will post it later. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
0001-Expose-a-new-API-_hash_pgaddmultitup.patch
Description: Binary data
0002-Expose-an-API-hashinitbitmapbuffer.patch
Description: Binary data
0003-Restructure-_hash_addovflpage.patch
Description: Binary data
0004-Restructure-split-bucket-code.patch
Description: Binary data
0005-Restructure-hash-index-creation.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers