Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-03-16 Thread Tom Lane
Yugo Nagata  writes:
> On Mon, 12 Mar 2018 13:56:24 -0400
> Tom Lane  wrote:
>> I took a quick look at this, but I'm concerned about a couple of points:

> In addition, this patch fixes only nbtree indexes, but the simillar problem
> will occur also on GIST indexes which support index-only scan. If we resolve
> this bug by fixing index access methods, I think we need to fix all existing
> indexes that support index-only scan and also describe the specification in
> the documents, comments, or README, etc. for future indexes.

Yeah, that's a pretty good point.
 
>> At this point Yugo-san's original patch is starting to look more
>> attractive.  It's still ugly, but at least it's not imposing a performance
>> cost on unrelated queries.

> I would like to elaborate my patch if needed and possible. Any suggestion
> would be appriciated.

I don't think there's much else to be done so far as the IndexOnlyScan
case is concerned.  However, I notice that somebody's made
search_plan_tree accept ForeignScanState and CustomScanState as possible
matches for WHERE CURRENT OF, and I find that rather troubling.  It seems
likely to me that both of those would have the same problem as
IndexOnlyScans, ie whatever they're returning is probably a virtual tuple
without any ctid field.  So we'd get the same unfriendly failure as you
complained of originally.

I wonder whether it wouldn't be a good idea to provide a way for an FDW
or CustomScan provider to return a TID, or at least give a more polite
FEATURE_NOT_SUPPORTED error than what happens now.  However, that seems
more like a new feature than a bug fix; certainly any extension of those
APIs is something we'd not back-patch.

In the meantime, we could fix execCurrent.c so that it throws
FEATURE_NOT_SUPPORTED rather than the current failure if the slot it's
looking at doesn't contain a physical tuple.

regards, tom lane



Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-03-15 Thread Yugo Nagata
On Mon, 12 Mar 2018 13:56:24 -0400
Tom Lane  wrote:

> Anastasia Lubennikova  writes:
> > [ return_heaptuple_in_btree_indexonlyscan_v2.patch ]
> 
> I took a quick look at this, but I'm concerned about a couple of points:
> 
> 1. What's the performance penalty of forming (and then deforming) the
> added heap tuple?  We'd be paying it for every index-only scan, whether
> or not any CURRENT OF fetch ever happened.
> 
> 2. The constructed tuple provides tableoid and ctid all right, but it'd
> still have garbage values for other system columns.  As the code stands,
> we will properly error out if some attempt is made to fetch any of those
> other columns, but with this change we'd just return the garbage.  This
> is a minor point, but it doesn't seem negligible to me; it might've been
> hard to identify the bug at hand if we'd not had the cross-check of not
> building a heap tuple.

In addition, this patch fixes only nbtree indexes, but the simillar problem
will occur also on GIST indexes which support index-only scan. If we resolve
this bug by fixing index access methods, I think we need to fix all existing
indexes that support index-only scan and also describe the specification in
the documents, comments, or README, etc. for future indexes.
 
> At this point Yugo-san's original patch is starting to look more
> attractive.  It's still ugly, but at least it's not imposing a performance
> cost on unrelated queries.

I would like to elaborate my patch if needed and possible. Any suggestion
would be appriciated.

Thanks,

> 
>   regards, tom lane



-- 
Yugo Nagata 



Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-03-12 Thread Tom Lane
Anastasia Lubennikova  writes:
> [ return_heaptuple_in_btree_indexonlyscan_v2.patch ]

I took a quick look at this, but I'm concerned about a couple of points:

1. What's the performance penalty of forming (and then deforming) the
added heap tuple?  We'd be paying it for every index-only scan, whether
or not any CURRENT OF fetch ever happened.

2. The constructed tuple provides tableoid and ctid all right, but it'd
still have garbage values for other system columns.  As the code stands,
we will properly error out if some attempt is made to fetch any of those
other columns, but with this change we'd just return the garbage.  This
is a minor point, but it doesn't seem negligible to me; it might've been
hard to identify the bug at hand if we'd not had the cross-check of not
building a heap tuple.

At this point Yugo-san's original patch is starting to look more
attractive.  It's still ugly, but at least it's not imposing a performance
cost on unrelated queries.

regards, tom lane



Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-02-21 Thread Anastasia Lubennikova

20.02.2018 12:52, Aleksander Alekseev:

Hi Anastasia,


I'd like to propose the patch that fixes the issue.
We already have a way to return heaptuple from IndexOnlyScan,
but it was not applied to b-tree for some reason.

Attached patch solves the reported bug.
Moreover, it will come in handy for "index with included attributes" feature
[1],
where we can store long (and even TOASTed) attributes in indextuple.

[1] https://commitfest.postgresql.org/17/1350/

I believe the patch should include a test that tries to reproduce an
issue it tries to fix.

Also maybe this code that repeats 3 times can be moved to a separate
procedure?


Good point. Updated version with test is attached.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 391107028d4597e9721a3b274904e49ee69de0cf
Author: Anastasia 
Date:   Wed Feb 21 17:25:11 2018 +0300

return_heaptuple_in_btree_indexonlyscan_v2.patch

diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8158508..db8a55c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -374,6 +374,8 @@ btbeginscan(Relation rel, int nkeys, int norderbys)
 	so->currTuples = so->markTuples = NULL;
 
 	scan->xs_itupdesc = RelationGetDescr(rel);
+	scan->xs_hitupdesc = NULL;
+	scan->xs_hitup = NULL;
 
 	scan->opaque = so;
 
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 51dca64..278f43d 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -25,6 +25,9 @@
 #include "utils/tqual.h"
 
 
+static HeapTuple _bt_fetch_tuple(IndexScanDesc scandesc,
+ ItemPointerData heapTid);
+static void	_bt_fill_hitupdesc(IndexScanDesc scan);
 static bool _bt_readpage(IndexScanDesc scan, ScanDirection dir,
 			 OffsetNumber offnum);
 static void _bt_saveitem(BTScanOpaque so, int itemIndex,
@@ -38,7 +41,52 @@ static bool _bt_endpoint(IndexScanDesc scan, ScanDirection dir);
 static void _bt_drop_lock_and_maybe_pin(IndexScanDesc scan, BTScanPos sp);
 static inline void _bt_initialize_more_data(BTScanOpaque so, ScanDirection dir);
 
+/*
+ * Fetch all keys in tuple.
+ * Returns a new HeapTuple containing the originally-indexed data.
+ */
+static HeapTuple
+_bt_fetch_tuple(IndexScanDesc scandesc, ItemPointerData heapTid)
+{
+	Relation index = scandesc->indexRelation;
+	Datum		fetchatt[INDEX_MAX_KEYS];
+	bool		isnull[INDEX_MAX_KEYS];
+	int			i;
+	HeapTuple htuple;
+
+	for (i = 0; i < index->rd_att->natts; i++)
+	{
+		fetchatt[i] = index_getattr(scandesc->xs_itup, i + 1,
+	scandesc->xs_itupdesc, [i]);
+	}
+
+	htuple = heap_form_tuple(scandesc->xs_hitupdesc, fetchatt, isnull);
+	htuple->t_tableOid = scandesc->heapRelation->rd_id;
+	htuple->t_self = heapTid;
 
+
+	return htuple;
+}
+
+static void
+_bt_fill_hitupdesc(IndexScanDesc scan)
+{
+	int natts = RelationGetNumberOfAttributes(scan->indexRelation);
+	int attno;
+	/*
+	* The storage type of the index can be different from the original
+	* datatype being indexed, so we cannot just grab the index's tuple
+	* descriptor. Instead, construct a descriptor with the original data
+	* types.
+	*/
+	scan->xs_hitupdesc = CreateTemplateTupleDesc(natts, false);
+	for (attno = 1; attno <= natts; attno++)
+	{
+		TupleDescInitEntry(scan->xs_hitupdesc, attno, NULL,
+		scan->indexRelation->rd_opcintype[attno - 1],
+		-1, 0);
+	}
+}
 /*
  *	_bt_drop_lock_and_maybe_pin()
  *
@@ -1105,9 +1153,19 @@ readcomplete:
 	/* OK, itemIndex says what to return */
 	currItem = >currPos.items[so->currPos.itemIndex];
 	scan->xs_ctup.t_self = currItem->heapTid;
+
 	if (scan->xs_want_itup)
+	{
 		scan->xs_itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
 
+		if (!scan->xs_hitupdesc)
+			_bt_fill_hitupdesc(scan);
+
+		if (scan->xs_hitup)
+			pfree(scan->xs_hitup);
+		scan->xs_hitup = _bt_fetch_tuple(scan, currItem->heapTid);
+	}
+
 	return true;
 }
 
@@ -1155,9 +1213,19 @@ _bt_next(IndexScanDesc scan, ScanDirection dir)
 	/* OK, itemIndex says what to return */
 	currItem = >currPos.items[so->currPos.itemIndex];
 	scan->xs_ctup.t_self = currItem->heapTid;
+
 	if (scan->xs_want_itup)
+	{
 		scan->xs_itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
 
+		if (!scan->xs_hitupdesc)
+			_bt_fill_hitupdesc(scan);
+
+		if (scan->xs_hitup)
+			pfree(scan->xs_hitup);
+		scan->xs_hitup = _bt_fetch_tuple(scan, currItem->heapTid);
+	}
+
 	return true;
 }
 
@@ -1932,9 +2000,19 @@ _bt_endpoint(IndexScanDesc scan, ScanDirection dir)
 	/* OK, itemIndex says what to return */
 	currItem = >currPos.items[so->currPos.itemIndex];
 	scan->xs_ctup.t_self = currItem->heapTid;
+
 	if (scan->xs_want_itup)
+	{
 		scan->xs_itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
 
+		if (!scan->xs_hitupdesc)
+			_bt_fill_hitupdesc(scan);
+
+		if (scan->xs_hitup)
+			pfree(scan->xs_hitup);
+		scan->xs_hitup = 

Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-02-19 Thread Anastasia Lubennikova

01.02.2018 05:12, Tom Lane:

Yugo Nagata  writes:

I'm sorry the patch attached in the previous mail is broken and
not raises a compile error. I attached the fixed patch.

This patch is almost certainly wrong: you can't assume that the scan-level
state matches the tuple we are currently processing at top level.  Any
sort of delaying action, for instance a sort or materialize node in
between, would break it.

We need to either fix this aspect:


IndexOnlyScan returns a virtual tuple that doesn't have system
column, so we can not get ctid in the same way of other plans.


I'd like to propose the patch that fixes the issue.
We already have a way to return heaptuple from IndexOnlyScan,
but it was not applied to b-tree for some reason.

Attached patch solves the reported bug.
Moreover, it will come in handy for "index with included attributes" 
feature [1],

where we can store long (and even TOASTed) attributes in indextuple.

[1] https://commitfest.postgresql.org/17/1350/

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 8158508..db8a55c 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -374,6 +374,8 @@ btbeginscan(Relation rel, int nkeys, int norderbys)
 	so->currTuples = so->markTuples = NULL;
 
 	scan->xs_itupdesc = RelationGetDescr(rel);
+	scan->xs_hitupdesc = NULL;
+	scan->xs_hitup = NULL;
 
 	scan->opaque = so;
 
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 51dca64..dd3e8b2 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -25,6 +25,7 @@
 #include "utils/tqual.h"
 
 
+static HeapTuple _bt_fetch_tuple(IndexScanDesc scandesc);
 static bool _bt_readpage(IndexScanDesc scan, ScanDirection dir,
 			 OffsetNumber offnum);
 static void _bt_saveitem(BTScanOpaque so, int itemIndex,
@@ -38,6 +39,31 @@ static bool _bt_endpoint(IndexScanDesc scan, ScanDirection dir);
 static void _bt_drop_lock_and_maybe_pin(IndexScanDesc scan, BTScanPos sp);
 static inline void _bt_initialize_more_data(BTScanOpaque so, ScanDirection dir);
 
+/*
+ * Fetch all keys in tuple.
+ * Returns a new HeapTuple containing the originally-indexed data.
+ */
+static HeapTuple
+_bt_fetch_tuple(IndexScanDesc scandesc)
+{
+	Relation index = scandesc->indexRelation;
+	Datum		fetchatt[INDEX_MAX_KEYS];
+	bool		isnull[INDEX_MAX_KEYS];
+	int			i;
+	HeapTuple htuple;
+
+	for (i = 0; i < index->rd_att->natts; i++)
+	{
+		fetchatt[i] = index_getattr(scandesc->xs_itup, i + 1,
+	scandesc->xs_itupdesc, [i]);
+	}
+
+	htuple = heap_form_tuple(scandesc->xs_hitupdesc, fetchatt, isnull);
+	htuple->t_tableOid = scandesc->heapRelation->rd_id;
+
+
+	return htuple;
+}
 
 /*
  *	_bt_drop_lock_and_maybe_pin()
@@ -1105,8 +1131,32 @@ readcomplete:
 	/* OK, itemIndex says what to return */
 	currItem = >currPos.items[so->currPos.itemIndex];
 	scan->xs_ctup.t_self = currItem->heapTid;
+
 	if (scan->xs_want_itup)
+	{
+		if (!scan->xs_hitupdesc)
+		{
+			int natts = RelationGetNumberOfAttributes(scan->indexRelation);
+			int attno;
+			/*
+			* The storage type of the index can be different from the original
+			* datatype being indexed, so we cannot just grab the index's tuple
+			* descriptor. Instead, construct a descriptor with the original data
+			* types.
+			*/
+			scan->xs_hitupdesc = CreateTemplateTupleDesc(natts, false);
+			for (attno = 1; attno <= natts; attno++)
+			{
+TupleDescInitEntry(scan->xs_hitupdesc, attno, NULL,
+scan->indexRelation->rd_opcintype[attno - 1],
+-1, 0);
+			}
+		}
+
 		scan->xs_itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
+		scan->xs_hitup = _bt_fetch_tuple(scan);
+		scan->xs_hitup->t_self = currItem->heapTid;
+	}
 
 	return true;
 }
@@ -1155,8 +1205,34 @@ _bt_next(IndexScanDesc scan, ScanDirection dir)
 	/* OK, itemIndex says what to return */
 	currItem = >currPos.items[so->currPos.itemIndex];
 	scan->xs_ctup.t_self = currItem->heapTid;
+
 	if (scan->xs_want_itup)
+	{
+		if (!scan->xs_hitupdesc)
+		{
+			int natts = RelationGetNumberOfAttributes(scan->indexRelation);
+			int attno;
+			/*
+			* The storage type of the index can be different from the original
+			* datatype being indexed, so we cannot just grab the index's tuple
+			* descriptor. Instead, construct a descriptor with the original data
+			* types.
+			*/
+			scan->xs_hitupdesc = CreateTemplateTupleDesc(natts, false);
+			for (attno = 1; attno <= natts; attno++)
+			{
+TupleDescInitEntry(scan->xs_hitupdesc, attno, NULL,
+scan->indexRelation->rd_opcintype[attno - 1],
+-1, 0);
+			}
+		}
+
 		scan->xs_itup = (IndexTuple) (so->currTuples + currItem->tupleOffset);
+		if (scan->xs_hitup)
+			pfree(scan->xs_hitup);
+		scan->xs_hitup = _bt_fetch_tuple(scan);
+		scan->xs_hitup->t_self = currItem->heapTid;
+	}
 
 	

Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-01-31 Thread Yugo Nagata
On Wed, 31 Jan 2018 21:12:51 -0500
Tom Lane  wrote:

> Yugo Nagata  writes:
> > I'm sorry the patch attached in the previous mail is broken and
> > not raises a compile error. I attached the fixed patch.
> 
> This patch is almost certainly wrong: you can't assume that the scan-level
> state matches the tuple we are currently processing at top level.  Any
> sort of delaying action, for instance a sort or materialize node in
> between, would break it.

In execCurrentOf(), when FOR UPDATE is not used, search_plan_tree() searches
through the PlanState tree for a scan node and if a sort or materialize node
(for example) is found it fails with the following error.

 ERROR cursor xxx is not a simply updatable scan of table yyy

So, I think what you concern would not occur by the patch as well as the orginal
code. However, I may be missing something. Could you explain more about this if 
so?

> 
> We need to either fix this aspect:
> 
> >> IndexOnlyScan returns a virtual tuple that doesn't have system
> >> column, so we can not get ctid in the same way of other plans.
> 
> or else disallow using IndexOnlyScan when the ctid is needed.

CURRENT OF is used after the scan is executed and a tuple is fetched,
so we can't know whether the ctid is needed or not in advance in this
case. We can raise an error message when CURRENT OF is used
for IndexOnlyScan plan, though.

Regards,

> 
>   regards, tom lane


-- 
Yugo Nagata 



Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-01-31 Thread Tom Lane
Yugo Nagata  writes:
> I'm sorry the patch attached in the previous mail is broken and
> not raises a compile error. I attached the fixed patch.

This patch is almost certainly wrong: you can't assume that the scan-level
state matches the tuple we are currently processing at top level.  Any
sort of delaying action, for instance a sort or materialize node in
between, would break it.

We need to either fix this aspect:

>> IndexOnlyScan returns a virtual tuple that doesn't have system
>> column, so we can not get ctid in the same way of other plans.

or else disallow using IndexOnlyScan when the ctid is needed.

regards, tom lane



Re: CURRENT OF causes an error when IndexOnlyScan is used

2018-01-31 Thread Yugo Nagata
On Thu, 1 Feb 2018 01:33:49 +0900
Yugo Nagata  wrote:

I'm sorry the patch attached in the previous mail is broken and
not raises a compile error. I attached the fixed patch.

Regards,

> Hi,
> 
> I found that updating a cursor by using CURRENT OF causes the
> following error when the query is executed by IndexOnlyScan. 
> 
>  ERROR:  cannot extract system attribute from virtual tuple
> 
> IndexOnlyScan returns a virtual tuple that doesn't have system
> column, so we can not get ctid in the same way of other plans.
> However, the error message is not convinient and users would
> not understand why the error occurs.
> 
> Attached is a patch to fix this. By this fix, execCurrentOf
> get ctid from IndexScanDesc->xs_ctup.t_self when the plan is
> IndexOnlyScan, and it works sucessfully without errors.
> 
> 
> Here is the example of the error:
> 
> ===
> postgres=# create table test (i int primary key);
> CREATE TABLE
> postgres=# insert into test values(1);
> INSERT 0 1
> postgres=# set enable_seqscan to off;
> SET
> 
> postgres=# explain select * from test where i = 1;
> QUERY PLAN 
> ---
>  Index Only Scan using test_pkey on test  (cost=0.15..8.17 rows=1 width=4)
>Index Cond: (i = 1)
> (2 rows)
> 
> postgres=# begin;
> BEGIN
> postgres=# declare c cursor for select * from test where i = 1;
> DECLARE CURSOR
> postgres=# fetch from c;
>  i 
> ---
>  1
> (1 row)
> 
> postgres=# update test set i=i+1 where current of c;
> ERROR:  cannot extract system attribute from virtual tuple
> ===
> 
> The patch fixes the error and allows this update successfully.
> 
> Regards,
> 
> -- 
> Yugo Nagata 


-- 
Yugo Nagata 
diff --git a/src/backend/executor/execCurrent.c b/src/backend/executor/execCurrent.c
index ce7d4ac..aa2da16 100644
--- a/src/backend/executor/execCurrent.c
+++ b/src/backend/executor/execCurrent.c
@@ -19,6 +19,7 @@
 #include "utils/lsyscache.h"
 #include "utils/portal.h"
 #include "utils/rel.h"
+#include "access/relscan.h"
 
 
 static char *fetch_cursor_param_value(ExprContext *econtext, int paramId);
@@ -183,21 +184,35 @@ execCurrentOf(CurrentOfExpr *cexpr,
 		if (TupIsNull(scanstate->ss_ScanTupleSlot))
 			return false;
 
-		/* Use slot_getattr to catch any possible mistakes */
-		tuple_tableoid =
-			DatumGetObjectId(slot_getattr(scanstate->ss_ScanTupleSlot,
-		  TableOidAttributeNumber,
-		  ));
-		Assert(!lisnull);
-		tuple_tid = (ItemPointer)
-			DatumGetPointer(slot_getattr(scanstate->ss_ScanTupleSlot,
-		 SelfItemPointerAttributeNumber,
-		 ));
-		Assert(!lisnull);
-
-		Assert(tuple_tableoid == table_oid);
-
-		*current_tid = *tuple_tid;
+		/* In IndexOnlyScan case, the tuple stored in ss_ScanTupleSlot is a
+		 * virtual tuple that does not have ctid column, so we have to get TID
+		 * from xs_ctup.t_self. */
+		if (IsA(scanstate, IndexOnlyScanState))
+		{
+			IndexScanDesc scan = ((IndexOnlyScanState *)scanstate)->ioss_ScanDesc;
+
+			Assert(RelationGetRelid(scan.heapRelation) == table_oid);
+
+			*current_tid = scan->xs_ctup.t_self;
+		}
+		else
+		{
+			/* Use slot_getattr to catch any possible mistakes */
+			tuple_tableoid =
+DatumGetObjectId(slot_getattr(scanstate->ss_ScanTupleSlot,
+			  TableOidAttributeNumber,
+			  ));
+			Assert(!lisnull);
+			tuple_tid = (ItemPointer)
+DatumGetPointer(slot_getattr(scanstate->ss_ScanTupleSlot,
+			 SelfItemPointerAttributeNumber,
+			 ));
+			Assert(!lisnull);
+
+			Assert(tuple_tableoid == table_oid);
+
+			*current_tid = *tuple_tid;
+		}
 
 		return true;
 	}