Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Tom Lane
Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:
>  Tom> Looks about right. Not sure if we need to bother with a regression
>  Tom> test case; once that's in, it'd be hard to break it.

> We could check the EXPLAIN output (since the Materialize node would show
> up), but it's not easy to get stable plans since the choice of which
> path to put on the outside is not fixed. Based on what I found when
> actually testing the code, it probably wouldn't be worth the effort.

If it's not easy to test, I agree it's not worth it.

(Given how long it took anyone to notice this, the difficulty of
making a stable test case is unsurprising, perhaps.)

regards, tom lane




Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 >> I guess that's close enough; this should suffice then.

 Tom> Looks about right. Not sure if we need to bother with a regression
 Tom> test case; once that's in, it'd be hard to break it.

We could check the EXPLAIN output (since the Materialize node would show
up), but it's not easy to get stable plans since the choice of which
path to put on the outside is not fixed. Based on what I found when
actually testing the code, it probably wouldn't be worth the effort.

-- 
Andrew (irc:RhodiumToad)




Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Tom Lane
Andrew Gierth  writes:
> I guess that's close enough; this should suffice then.

Looks about right.  Not sure if we need to bother with a regression test
case; once that's in, it'd be hard to break it.

regards, tom lane




Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 Tom> Oh, sorry, I misread your comment to be that you wanted to add a
 Tom> field to IndexAmRoutine. You're right, the real issue here is that
 Tom> ExecSupportsMarkRestore lacks any convenient access to the needed
 Tom> info, and we need to add a bool to IndexOptInfo to fix that.

 Tom> I don't see any compelling reason why you couldn't add the field
 Tom> at the end in the back branches; that's what we usually do to
 Tom> avoid ABI breaks. Although actually (counts fields...) it looks
 Tom> like there's at least one pad byte after amcanparallel, so you
 Tom> could add a bool there without any ABI consequence, resulting in a
 Tom> reasonably natural field order in all branches.

I guess that's close enough; this should suffice then.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/executor/execAmi.c b/src/backend/executor/execAmi.c
index e2154ba86a..0c10f1d35c 100644
--- a/src/backend/executor/execAmi.c
+++ b/src/backend/executor/execAmi.c
@@ -417,6 +417,11 @@ ExecSupportsMarkRestore(Path *pathnode)
 	{
 		case T_IndexScan:
 		case T_IndexOnlyScan:
+			/*
+			 * Not all index types support mark/restore.
+			 */
+			return castNode(IndexPath, pathnode)->indexinfo->amcanmarkpos;
+
 		case T_Material:
 		case T_Sort:
 			return true;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 52c01eb86b..3e94256d34 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -284,6 +284,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			info->amhasgettuple = (amroutine->amgettuple != NULL);
 			info->amhasgetbitmap = amroutine->amgetbitmap != NULL &&
 relation->rd_tableam->scan_bitmap_next_block != NULL;
+			info->amcanmarkpos = (amroutine->ammarkpos != NULL &&
+  amroutine->amrestrpos != NULL);
 			info->amcostestimate = amroutine->amcostestimate;
 			Assert(info->amcostestimate != NULL);
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index abe6f570e3..5a10c1855d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -864,6 +864,7 @@ struct IndexOptInfo
 	bool		amhasgettuple;	/* does AM have amgettuple interface? */
 	bool		amhasgetbitmap; /* does AM have amgetbitmap interface? */
 	bool		amcanparallel;	/* does AM support parallel scan? */
+	bool		amcanmarkpos;	/* does AM support mark/restore? */
 	/* Rather than include amapi.h here, we declare amcostestimate like this */
 	void		(*amcostestimate) ();	/* AM's cost estimator */
 };


Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Tom Lane
Andrew Gierth  writes:
> "Tom" == Tom Lane  writes:
>  Tom> Uh, why would you not just look to see if the ammarkpos/amrestrpos
>  Tom> fields are non-null?

> We don't (in the back branches) seem to have a pointer to the
> IndexAmRoutine handy, only the oid?

Oh, sorry, I misread your comment to be that you wanted to add a field
to IndexAmRoutine.  You're right, the real issue here is that
ExecSupportsMarkRestore lacks any convenient access to the needed info,
and we need to add a bool to IndexOptInfo to fix that.

I don't see any compelling reason why you couldn't add the field at
the end in the back branches; that's what we usually do to avoid
ABI breaks.  Although actually (counts fields...) it looks like
there's at least one pad byte after amcanparallel, so you could
add a bool there without any ABI consequence, resulting in a
reasonably natural field order in all branches.

regards, tom lane




Re: mark/restore failures on unsorted merge joins

2020-11-24 Thread Andrew Gierth
> "Tom" == Tom Lane  writes:

 >> The problem is that the planner calls ExecSupportsMarkRestore to
 >> find out whether a Materialize node is needed, and that function
 >> looks no further than the Path type of T_Index[Only]Path in order to
 >> return true, even though in this case it's a GiST index which does
 >> not support mark/restore.

 >> (Usually this can't be a problem because the merge join would need
 >> sorted input, thus the index scan would be a btree; but a merge join
 >> that doesn't actually have any sort keys could take unsorted input
 >> from any index type.)

 Tom> Sounds like the right analysis.

 >> Going forward, this looks like IndexOptInfo needs another am*
 >> boolean field, but that's probably not appropriate for the back
 >> branches; maybe as a workaround, ExecSupportsMarkRestore should just
 >> check for btree?

 Tom> Uh, why would you not just look to see if the ammarkpos/amrestrpos
 Tom> fields are non-null?

We don't (in the back branches) seem to have a pointer to the
IndexAmRoutine handy, only the oid? Obviously we can look it up from the
oid, but is that more overhead than we want in a join cost function,
given that this will be called for all potential mergejoins considered,
not just JOIN_FULL? Or is the overhead not worth bothering about?

-- 
Andrew (irc:RhodiumToad)




Re: mark/restore failures on unsorted merge joins

2020-11-23 Thread Tom Lane
Andrew Gierth  writes:
> The problem is that the planner calls ExecSupportsMarkRestore to find
> out whether a Materialize node is needed, and that function looks no
> further than the Path type of T_Index[Only]Path in order to return true,
> even though in this case it's a GiST index which does not support
> mark/restore.

> (Usually this can't be a problem because the merge join would need
> sorted input, thus the index scan would be a btree; but a merge join
> that doesn't actually have any sort keys could take unsorted input from
> any index type.)

Sounds like the right analysis.

> Going forward, this looks like IndexOptInfo needs another am* boolean
> field, but that's probably not appropriate for the back branches; maybe
> as a workaround, ExecSupportsMarkRestore should just check for btree?

Uh, why would you not just look to see if the ammarkpos/amrestrpos fields
are non-null?

regards, tom lane




mark/restore failures on unsorted merge joins

2020-11-23 Thread Andrew Gierth
>From a report by user "kes" on irc, I constructed this test case:

create table marktst (id integer, val numrange, exclude using gist (val with 
&&));
insert into marktst select i, numrange(i, i+1, '[)') from 
generate_series(1,1000) i;
vacuum marktst;
analyze marktst;

select * from (select val from marktst where val && numrange(10,11)) s1 full 
join (values (1)) v(a) on true;
ERROR:  function ammarkpos is not defined for index marktst_val_excl

for which the query plan is:
 QUERY PLAN 


 Merge Full Join  (cost=0.14..4.21 rows=2 width=20)
   ->  Result  (cost=0.00..0.01 rows=1 width=4)
   ->  Index Only Scan using marktst_val_excl on marktst  (cost=0.14..4.18 
rows=2 width=16)
 Index Cond: (val && '[10,11)'::numrange)
(4 rows)

The problem is that the planner calls ExecSupportsMarkRestore to find
out whether a Materialize node is needed, and that function looks no
further than the Path type of T_Index[Only]Path in order to return true,
even though in this case it's a GiST index which does not support
mark/restore.

(Usually this can't be a problem because the merge join would need
sorted input, thus the index scan would be a btree; but a merge join
that doesn't actually have any sort keys could take unsorted input from
any index type.)

Going forward, this looks like IndexOptInfo needs another am* boolean
field, but that's probably not appropriate for the back branches; maybe
as a workaround, ExecSupportsMarkRestore should just check for btree?

-- 
Andrew (irc:RhodiumToad)