On Thu, Mar 23, 2017 at 9:17 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Mar 23, 2017 at 8:43 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > On Wed, Mar 22, 2017 at 3:39 PM, Ashutosh Sharma <ashu.coe...@gmail.com> 
> > wrote:
> >> Hi,
> >>
> >> On Wed, Mar 22, 2017 at 8:41 AM, Amit Kapila <amit.kapil...@gmail.com> 
> >> wrote:
> >>
> >> To fix this, I think we should pass 'REGBUF_KEEP_DATA' while
> >> registering the buffer. Something like this,
> >>
> >> -                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
> >> +                       XLogRegisterBuffer(0, buf, REGBUF_STANDARD |
> >> REGBUF_KEEP_DATA);
> >>
> >> Attached is the patch that fixes this issue.
> >>
> >
> > I think this will work, but not sure if there is a merit to deviate
> > from what btree does to handle this case.   One thing I find slightly
> > awkward in hash_xlog_vacuum_get_latestRemovedXid() is that you are
> > using a number of tuples registered as part of fixed data
> > (xl_hash_vacuum_one_page) to traverse the data registered as buf data.
> > I think it will be better if we register offsets also in fixed part of
> > data as we are doing btree case.

Agreed. I have made the changes accordingly. Please check attached v2 patch.

>
> >
> >
>
> Also another small point in this regard, do we need two separate
> variables to track number of deleted items in below code?  I think one
> variable is sufficient.
>
> _hash_vacuum_one_page()
> {
> ..
> deletable[ndeletable++] = offnum;
> tuples_removed += 1;--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
> ..
> }
>

Yes, I think 'ndeletable' alone should be fine.

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com
diff --git a/src/backend/access/hash/hash_xlog.c b/src/backend/access/hash/hash_xlog.c
index de7522e..d9ac42c 100644
--- a/src/backend/access/hash/hash_xlog.c
+++ b/src/backend/access/hash/hash_xlog.c
@@ -957,8 +957,6 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 	OffsetNumber	hoffnum;
 	TransactionId	latestRemovedXid = InvalidTransactionId;
 	int		i;
-	char *ptr;
-	Size len;
 
 	xlrec = (xl_hash_vacuum_one_page *) XLogRecGetData(record);
 
@@ -977,12 +975,20 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 		return latestRemovedXid;
 
 	/*
+	 * Check if WAL replay has reached a consistent database state. If not,
+	 * we must PANIC. See the definition of btree_xlog_delete_get_latestRemovedXid
+	 * for more details.
+	 */
+	if (!reachedConsistency)
+		elog(PANIC, "hash_xlog_vacuum_get_latestRemovedXid: cannot operate with inconsistent data");
+
+	/*
 	 * Get index page.  If the DB is consistent, this should not fail, nor
 	 * should any of the heap page fetches below.  If one does, we return
 	 * InvalidTransactionId to cancel all HS transactions.  That's probably
 	 * overkill, but it's safe, and certainly better than panicking here.
 	 */
-	XLogRecGetBlockTag(record, 1, &rnode, NULL, &blkno);
+	XLogRecGetBlockTag(record, 0, &rnode, NULL, &blkno);
 	ibuffer = XLogReadBufferExtended(rnode, MAIN_FORKNUM, blkno, RBM_NORMAL);
 
 	if (!BufferIsValid(ibuffer))
@@ -994,9 +1000,7 @@ hash_xlog_vacuum_get_latestRemovedXid(XLogReaderState *record)
 	 * Loop through the deleted index items to obtain the TransactionId from
 	 * the heap items they point to.
 	 */
-	ptr = XLogRecGetBlockData(record, 1, &len);
-
-	unused = (OffsetNumber *) ptr;
+	unused = (OffsetNumber *) ((char *) xlrec + SizeOfHashVacuumOnePage);
 
 	for (i = 0; i < xlrec->ntuples; i++)
 	{
@@ -1121,23 +1125,15 @@ hash_xlog_vacuum_one_page(XLogReaderState *record)
 
 	if (action == BLK_NEEDS_REDO)
 	{
-		char *ptr;
-		Size len;
-
-		ptr = XLogRecGetBlockData(record, 0, &len);
-
 		page = (Page) BufferGetPage(buffer);
 
-		if (len > 0)
+		if (XLogRecGetDataLen(record) > SizeOfHashVacuumOnePage)
 		{
 			OffsetNumber *unused;
-			OffsetNumber *unend;
 
-			unused = (OffsetNumber *) ptr;
-			unend = (OffsetNumber *) ((char *) ptr + len);
+			unused = (OffsetNumber *) ((char *) xldata + SizeOfHashVacuumOnePage);
 
-			if ((unend - unused) > 0)
-				PageIndexMultiDelete(page, unused, unend - unused);
+			PageIndexMultiDelete(page, unused, xldata->ntuples);
 		}
 
 		/*
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 8640e85..8699b5b 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -344,7 +344,6 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 	Page	page = BufferGetPage(buf);
 	HashPageOpaque	pageopaque;
 	HashMetaPage	metap;
-	double tuples_removed = 0;
 
 	/* Scan each tuple in page to see if it is marked as LP_DEAD */
 	maxoff = PageGetMaxOffsetNumber(page);
@@ -355,10 +354,7 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 		ItemId	itemId = PageGetItemId(page, offnum);
 
 		if (ItemIdIsDead(itemId))
-		{
 			deletable[ndeletable++] = offnum;
-			tuples_removed += 1;
-		}
 	}
 
 	if (ndeletable > 0)
@@ -386,7 +382,7 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 		pageopaque->hasho_flag &= ~LH_PAGE_HAS_DEAD_TUPLES;
 
 		metap = HashPageGetMeta(BufferGetPage(metabuf));
-		metap->hashm_ntuples -= tuples_removed;
+		metap->hashm_ntuples -= ndeletable;
 
 		MarkBufferDirty(buf);
 		MarkBufferDirty(metabuf);
@@ -398,13 +394,18 @@ _hash_vacuum_one_page(Relation rel, Buffer metabuf, Buffer buf,
 			XLogRecPtr	recptr;
 
 			xlrec.hnode = hnode;
-			xlrec.ntuples = tuples_removed;
+			xlrec.ntuples = ndeletable;
 
 			XLogBeginInsert();
+			XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
 			XLogRegisterData((char *) &xlrec, SizeOfHashVacuumOnePage);
 
-			XLogRegisterBuffer(0, buf, REGBUF_STANDARD);
-			XLogRegisterBufData(0, (char *) deletable,
+			/*
+			 * We need the target-offsets array whether or not we store the whole
+			 * buffer, to allow us to find the latestRemovedXid on a standby
+			 * server.
+			 */
+			XLogRegisterData((char *) deletable,
 						ndeletable * sizeof(OffsetNumber));
 
 			XLogRegisterBuffer(1, metabuf, REGBUF_STANDARD);
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to