Re: Covering GiST indexes

2019-03-10 Thread Andrey Borodin
Hi!

> 10 марта 2019 г., в 13:39, Alexander Korotkov  
> написал(а):
> 
> Pushed with some more small changes in docs.

That's great, thank you!

Best regards, Andrey Borodin.


Re: Covering GiST indexes

2019-03-10 Thread Alexander Korotkov
On Fri, Mar 8, 2019 at 11:58 AM Alexander Korotkov
 wrote:
> I made some adjustments for this patch:
>  * Rename tupledesc and tructTupledesc to leafTupdesc and
> nonLeafTupdesc correspondingly.  I think this makes things more clear.
>  * Some improvements to docs and comments.
>  * Revise commit message.
>
> I'm going to push this on no objections.

Pushed with some more small changes in docs.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Covering GiST indexes

2019-03-08 Thread Alexander Korotkov
On Wed, Jan 30, 2019 at 4:16 AM Andreas Karlsson  wrote:
>
> On 29/01/2019 19.58, Andrey Borodin wrote:
> > PFA patch without redundant checks.
>
> I hope it is ok that I fixed a typo and some wording in the comment,
> plus re-added the btree query which I accidentally removed in my last mail.
>
> I think the current state of the patch is fine, assuming the solution
> with truncTupdesc is deemed to be good enough by the committer, so I
> will mark this as ready for committer.
>
> Thanks for the patch and the speedy fixes!

I made some adjustments for this patch:
 * Rename tupledesc and tructTupledesc to leafTupdesc and
nonLeafTupdesc correspondingly.  I think this makes things more clear.
 * Some improvements to docs and comments.
 * Revise commit message.

I'm going to push this on no objections.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0001-Covering-GiST-v11.patch
Description: Binary data


Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 19.58, Andrey Borodin wrote:

PFA patch without redundant checks.


I hope it is ok that I fixed a typo and some wording in the comment, 
plus re-added the btree query which I accidentally removed in my last mail.


I think the current state of the patch is fine, assuming the solution 
with truncTupdesc is deemed to be good enough by the committer, so I 
will mark this as ready for committer.


Thanks for the patch and the speedy fixes!

Andreas
>From f1a85940d01e208a6502e689616a8806146c9f3d Mon Sep 17 00:00:00 2001
From: Andreas Karlsson 
Date: Wed, 30 Jan 2019 01:59:14 +0100
Subject: [PATCH] Covering GiST v10

This patch allow adding INCLUDE colums to GiST index.
These colums do not participate in GiST tree build,
are not used for chosing subtree for inserts or splits.
But they can be fetched during IndexOnlyScan.
---
 doc/src/sgml/ref/create_index.sgml |   4 +-
 doc/src/sgml/textsearch.sgml   |   6 +
 src/backend/access/gist/gist.c |  33 +++-
 src/backend/access/gist/gistget.c  |   4 +-
 src/backend/access/gist/gistscan.c |  12 +-
 src/backend/access/gist/gistsplit.c|  12 +-
 src/backend/access/gist/gistutil.c |  42 --
 src/include/access/gist_private.h  |   2 +
 src/test/regress/expected/amutils.out  |   4 +-
 src/test/regress/expected/index_including.out  |   8 +-
 src/test/regress/expected/index_including_gist.out | 166 +
 src/test/regress/parallel_schedule |   2 +-
 src/test/regress/serial_schedule   |   1 +
 src/test/regress/sql/index_including.sql   |   6 +-
 src/test/regress/sql/index_including_gist.sql  |  90 +++
 15 files changed, 358 insertions(+), 34 deletions(-)
 create mode 100644 src/test/regress/expected/index_including_gist.out
 create mode 100644 src/test/regress/sql/index_including_gist.sql

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index ad619cdcfe..f8657c6ae2 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -181,8 +181,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 

-Currently, only the B-tree index access method supports this feature.
-In B-tree indexes, the values of columns listed in the
+Currently, the B-tree and the GiST index access methods supports this
+feature. In B-tree indexes, the values of columns listed in the
 INCLUDE clause are included in leaf tuples which
 correspond to heap tuples, but are not included in upper-level
 index entries used for tree navigation.
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index ecebade767..e4c75ebf6f 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -3675,6 +3675,12 @@ SELECT plainto_tsquery('supernovae stars');
   
 
   
+   A GiST index can be covering, i.e. use the INCLUDE
+   clause. Included columns can have data types without any GiST operator
+   class. Included attributes will be stored uncompressed.
+  
+
+  
Lossiness causes performance degradation due to unnecessary fetches of table
records that turn out to be false matches.  Since random access to table
records is slow, this limits the usefulness of GiST indexes.  The
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index b75b3a8dac..2bbf7a7f49 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -75,7 +75,7 @@ gisthandler(PG_FUNCTION_ARGS)
 	amroutine->amclusterable = true;
 	amroutine->ampredlocks = true;
 	amroutine->amcanparallel = false;
-	amroutine->amcaninclude = false;
+	amroutine->amcaninclude = true;
 	amroutine->amkeytype = InvalidOid;
 
 	amroutine->ambuild = gistbuild;
@@ -1382,8 +1382,8 @@ gistSplit(Relation r,
 		IndexTupleSize(itup[0]), GiSTPageSize,
 		RelationGetRelationName(r;
 
-	memset(v.spl_lisnull, true, sizeof(bool) * giststate->tupdesc->natts);
-	memset(v.spl_risnull, true, sizeof(bool) * giststate->tupdesc->natts);
+	memset(v.spl_lisnull, true, sizeof(bool) * giststate->truncTupdesc->natts);
+	memset(v.spl_risnull, true, sizeof(bool) * giststate->truncTupdesc->natts);
 	gistSplitByKey(r, page, itup, len, giststate, , 0);
 
 	/* form left and right vector */
@@ -1462,8 +1462,19 @@ initGISTstate(Relation index)
 	giststate->scanCxt = scanCxt;
 	giststate->tempCxt = scanCxt;	/* caller must change this if needed */
 	giststate->tupdesc = index->rd_att;
+	/*
+	 * The truncated tupdesc does not contain the INCLUDE attributes.
+	 *
+	 * It is used to form tuples during tuple adjustement and page split.
+	 * B-tree creates shortened tuple descriptor for every truncated tuple,
+	 * because it is doing this less often: it does not have to form
+	 * truncated tuples during page split. Also, B-tree is not 

Re: Covering GiST indexes

2019-01-29 Thread Andrey Borodin


> 29 янв. 2019 г., в 23:43, Andreas Karlsson  написал(а):
> 
> On 29/01/2019 18.45, Andrey Borodin wrote:
>> Also, I've unified gist and r-tree syntax tests for INCLUDE.
> 
> Why not just keep it simple? The purpose is not to test the GiST indexes, for 
> that we have index_including_gist.sql.
> 
> I propose the following.

Indeed, it's quite strange to check that GiST cannot be built without GiST 
opclass.

PFA patch without redundant checks.

Best regards, Andrey Borodin.



0001-Covering-GiST-v9.patch
Description: Binary data




Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 18.45, Andrey Borodin wrote:

Also, I've unified gist and r-tree syntax tests for INCLUDE.


Why not just keep it simple? The purpose is not to test the GiST 
indexes, for that we have index_including_gist.sql.


I propose the following.

/*
 * 7. Check various AMs. All but btree and gist must fail.
 */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
DROP TABLE tbl;

and

/*
 * 7. Check various AMs. All but btree and gist must fail.
 */
CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "brin" does not support included columns
CREATE INDEX on tbl USING gist(c3) INCLUDE (c1, c4);
CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
ERROR:  access method "spgist" does not support included columns
CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "gin" does not support included columns
CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
ERROR:  access method "hash" does not support included columns
CREATE INDEX on tbl USING rtree(c3) INCLUDE (c1, c4);
NOTICE:  substituting access method "gist" for obsolete method "rtree"
DROP TABLE tbl;



Re: Covering GiST indexes

2019-01-29 Thread Andrey Borodin
Thanks! Here's new version 8

> 29 янв. 2019 г., в 22:00, Andreas Karlsson  написал(а):
> 
> * I think it is worth writing a short comment when you create truncTupdesc 
> about why this is done.
It resulted in not so short comment, but I think it is OK.
> 
> * Very minor thing: the diff below is pointless churn on a line not touched 
> by the patch.
> 
> -values, isnull, true /* size is currently bogus */ );
> +values, isnull, true /* size is currently bogus */);
Fixed.
> 
> * Another very minor thing: The diff below from gistFormTuple() should 
> probably be consistent about brackets.
> 
> +   if (isnull[i])
> +   compatt[i] = (Datum) 0;
> +   else
> +   {
> +   compatt[i] = attdata[i];
> +   }
Fixed.


Also, I've unified gist and r-tree syntax tests for INCLUDE.

Best regards, Andrey Borodin.



0001-Covering-GiST-v8.patch
Description: Binary data


Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson

On 29/01/2019 18.00, Andreas Karlsson wrote:
Thanks for the new version of the patch. Based on my knowledge of PG 
this is starting to look good, and I have only three small comments below.


Sorry, I missed retree in the tests. Can you fix the below to match the 
gist example?


 CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
 NOTICE:  substituting access method "gist" for obsolete method "rtree"
-ERROR:  access method "gist" does not support included columns
+ERROR:  data type integer has no default operator class for access 
method "gist"
+HINT:  You must specify an operator class for the index or define a 
default operator class for the data type.

+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c1, c4);

Andreas



Re: Covering GiST indexes

2019-01-29 Thread Andreas Karlsson
Thanks for the new version of the patch. Based on my knowledge of PG 
this is starting to look good, and I have only three small comments below.


I am not 100% a fan of truncTupdesc, but as long as it is well commented 
I think that it is fine.


= Review

* I think it is worth writing a short comment when you create 
truncTupdesc about why this is done.


* Very minor thing: the diff below is pointless churn on a line not 
touched by the patch.


-values, isnull, true /* size is currently bogus 
*/ );
+values, isnull, true /* size is currently bogus 
*/);


* Another very minor thing: The diff below from gistFormTuple() should 
probably be consistent about brackets.


+   if (isnull[i])
+   compatt[i] = (Datum) 0;
+   else
+   {
+   compatt[i] = attdata[i];
+   }

Andreas



Re: Covering GiST indexes

2019-01-28 Thread Andrey Borodin


> 29 янв. 2019 г., в 7:32, Andreas Karlsson  написал(а):
> 
> On 1/28/19 7:26 PM, Andrey Borodin wrote:
>>> * I am no fan of the tupdesc vs truncTupdesc separation and think that it 
>>> is a potential hazard, but I do not have any better suggestion right now.
>> B-tree is copying tupdesc every time they truncate tuple. We need tuple 
>> truncation a little more often: when we are doing page split, we have to 
>> form all page tuples, truncated.
>> Initially, I've optimized only this case, but this led to prepared tupledesc 
>> for truncated tuples.
>>> 
>>> * There is no test case for exclusion constraints, and I feel since that is 
>>> one of the more important uses we should probably have at least one such 
>>> test case.
>> Actually, I did not understand this test case. Can you elaborate more on 
>> this? How included attributes should participate in exclude index? What for?
> 
> I mean include a table like below among the tests. I feel like this is a main 
> use case for INCLUDE.
> 
> CREATE TABLE t2 (
>  x int4range,
>  y int,
> 
>  EXCLUDE USING gist (x WITH &&) INCLUDE (y)
> );

Thanks for the explanation. Added this as case 6 to index_including_gist.

>>> * Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" 
>>> for both gist and rtree results in a cleaner patch.
>> I've used columns with and without opclass in INCLUDE. This led to these 
>> seemingly random changes.
> 
> I mean the diff would be smaller as the below. It also may make sense to make 
> both lines "(c3) INCLUDE (c1, c4)".
> 
> CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
> CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
> CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
> CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
> -CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
> +CREATE INDEX on tbl USING rtree(c3) INCLUDE (c4);
> CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);
> 

I've took your version of this test and added all variations of included 
attributes.


PFA v7.

Thanks!

Best regards, Andrey Borodin.



0001-Covering-GiST-v7.patch
Description: Binary data


Re: Covering GiST indexes

2019-01-28 Thread Andreas Karlsson

On 1/28/19 7:26 PM, Andrey Borodin wrote:

* I am no fan of the tupdesc vs truncTupdesc separation and think that it is a 
potential hazard, but I do not have any better suggestion right now.

B-tree is copying tupdesc every time they truncate tuple. We need tuple 
truncation a little more often: when we are doing page split, we have to form 
all page tuples, truncated.
Initially, I've optimized only this case, but this led to prepared tupledesc 
for truncated tuples.


* There is no test case for exclusion constraints, and I feel since that is one 
of the more important uses we should probably have at least one such test case.


Actually, I did not understand this test case. Can you elaborate more on this? 
How included attributes should participate in exclude index? What for?


I mean include a table like below among the tests. I feel like this is a 
main use case for INCLUDE.


CREATE TABLE t2 (
  x int4range,
  y int,

  EXCLUDE USING gist (x WITH &&) INCLUDE (y)
);

* Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" for 
both gist and rtree results in a cleaner patch.

I've used columns with and without opclass in INCLUDE. This led to these 
seemingly random changes.


I mean the diff would be smaller as the below. It also may make sense to 
make both lines "(c3) INCLUDE (c1, c4)".


 CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
 CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);

Andreas



Re: Covering GiST indexes

2019-01-28 Thread Andrey Borodin
Thank you very much for reviewing the patch!

> 28 янв. 2019 г., в 12:15, Andreas Karlsson  написал(а):
> 
> = Code
> 
> * Have some minor style issues like that there should be spaces around || (in 
> gistcanreturn()) and ? and : (in gistFormTuple()).
Fixed.
> 
> * I do not see any need for adding the new parameter to gistFormTuple. As far 
> as I can tell isTruncated is always equal to !isleaf.
You are right. I've removed isTruncated parameter.
> 
> * The comment below from gistFormTuple() does not look correct. No allocation 
> is taking place.
> 
> /*
> * Allocate each included attribute.
> */
Fixed.
> 
> * Why do you set a supportCollation for the included columns? As far as I can 
> tell the collation is only used by the various GiST support functions. Also 
> in theory we should be able to skip initializing these array entires, but it 
> is probably for the best that we do.
Removed supportCollation.
> 
> * I think you should check for the attno first in gistcanreturn() to take 
> advantage of the short circuiting.
Done.
> 
> * I am no fan of the tupdesc vs truncTupdesc separation and think that it is 
> a potential hazard, but I do not have any better suggestion right now.
B-tree is copying tupdesc every time they truncate tuple. We need tuple 
truncation a little more often: when we are doing page split, we have to form 
all page tuples, truncated.
Initially, I've optimized only this case, but this led to prepared tupledesc 
for truncated tuples.
> 
> * There is no test case for exclusion constraints, and I feel since that is 
> one of the more important uses we should probably have at least one such test 
> case.

Actually, I did not understand this test case. Can you elaborate more on this? 
How included attributes should participate in exclude index? What for?

> = Minor comments
> 
> * I think that "the" should have been kept in the text below.
> 
> -Currently, only the B-tree index access method supports this feature.
> +Currently, B-tree and GiST index access methods supports this 
> feature.
> 
Fixed.
> * I am not a native speaker but I think it should be "use the INCLUDE clause" 
> in the diff below, and I think it also should be "without any GiST operator 
> class".
> 
> +   A GiST index can be covering, i.e. use INCLUDE clause.
> +   Included columns can have data types without GiST operator class. Included
> +   attributes will be stored uncompressed.
> 
Fixed.
> * I think you should write something like "Included attributes always support 
> index-only scans." rather than "Included attributes can be returned." in the 
> comment for gistcanreturn().
> 
Fixed, but slightly reworded.

> * Why the random noise in the diff below? I think using "(c3) INCLUDE (c4)" 
> for both gist and rtree results in a cleaner patch.
I've used columns with and without opclass in INCLUDE. This led to these 
seemingly random changes.

PFA v6. Thanks for reviewing!


Best regards, Andrey Borodin.



0001-Covering-GiST-v6.patch
Description: Binary data


Re: Covering GiST indexes

2019-01-27 Thread Andreas Karlsson

On 11/26/18 5:56 PM, Andrey Borodin wrote:

Here's rebased version. Thanks!


Here is my review.

= General

The features seems useful and an obvious extension of covering B-trees, 
and a potentially useful feature together with exclusion constraints.


The patch still applies (with git apply -3), compiles and passes the 
test suite.


From some testing it seems like it works as expected.

= Code

* Have some minor style issues like that there should be spaces around 
|| (in gistcanreturn()) and ? and : (in gistFormTuple()).


* I do not see any need for adding the new parameter to gistFormTuple. 
As far as I can tell isTruncated is always equal to !isleaf.


* The comment below from gistFormTuple() does not look correct. No 
allocation is taking place.


/*
 * Allocate each included attribute.
 */

* Why do you set a supportCollation for the included columns? As far as 
I can tell the collation is only used by the various GiST support 
functions. Also in theory we should be able to skip initializing these 
array entires, but it is probably for the best that we do.


* I think you should check for the attno first in gistcanreturn() to 
take advantage of the short circuiting.


* I am no fan of the tupdesc vs truncTupdesc separation and think that 
it is a potential hazard, but I do not have any better suggestion right now.


* There is no test case for exclusion constraints, and I feel since that 
is one of the more important uses we should probably have at least one 
such test case.


= Minor comments

* I think that "the" should have been kept in the text below.

-Currently, only the B-tree index access method supports this 
feature.
+Currently, B-tree and GiST index access methods supports this 
feature.


* I am not a native speaker but I think it should be "use the INCLUDE 
clause" in the diff below, and I think it also should be "without any 
GiST operator class".


+   A GiST index can be covering, i.e. use INCLUDE 
clause.
+   Included columns can have data types without GiST operator class. 
Included

+   attributes will be stored uncompressed.

* I think you should write something like "Included attributes always 
support index-only scans." rather than "Included attributes can be 
returned." in the comment for gistcanreturn().


* Why the random noise in the diff below? I think using "(c3) INCLUDE 
(c4)" for both gist and rtree results in a cleaner patch.


 CREATE TABLE tbl (c1 int,c2 int, c3 box, c4 box);
 CREATE INDEX on tbl USING brin(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING gist(c3) INCLUDE (c4);
+CREATE INDEX on tbl USING gist(c4) INCLUDE (c3);
 CREATE INDEX on tbl USING spgist(c3) INCLUDE (c4);
 CREATE INDEX on tbl USING gin(c1, c2) INCLUDE (c3, c4);
 CREATE INDEX on tbl USING hash(c1, c2) INCLUDE (c3, c4);
-CREATE INDEX on tbl USING rtree(c1, c2) INCLUDE (c3, c4);
+CREATE INDEX on tbl USING rtree(c4) INCLUDE (c3, c1);
 CREATE INDEX on tbl USING btree(c1, c2) INCLUDE (c3, c4);



Re: Covering GiST indexes

2018-11-26 Thread Andrey Borodin
Hi, Dmitry!

> 26 нояб. 2018 г., в 21:20, Dmitry Dolgov <9erthali...@gmail.com> написал(а):
> 
> Looks like this time the patch was picked up correctly, but there are some
> conflicts with the current master branch:
> http://cfbot.cputube.org/patch_20_1615.log
> Could you please rebase it one more time?

Here's rebased version. Thanks!

Best regards, Andrey Borodin.


0001-Covering-GiST-v5.patch
Description: Binary data


Re: Covering GiST indexes

2018-11-26 Thread Dmitry Dolgov
> On Mon, Jul 30, 2018 at 7:50 AM Andrey Borodin  wrote:
>
> Thanks, Thomas!
>
> > 30 июля 2018 г., в 3:58, Thomas Munro  
> > написал(а):
> >
> > On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin  
> > wrote:
> >> Thanks! The problem appeared with commit 701fd0b [0] which dropped
> >> validation rules checked in failed test.  Here's the patch with fixed 
> >> tests.
> >
> > Thanks.  I received the attachment.
> >
> > Just as an FYI, something about the way your mail client encoded it
> > prevented the mail archives from picking it up
>
> I'll try to investigate this.
>
> Here's patch one more: another attempt to put it into archives.

Hi,

Looks like this time the patch was picked up correctly, but there are some
conflicts with the current master branch:
http://cfbot.cputube.org/patch_20_1615.log
Could you please rebase it one more time?



Re: Covering GiST indexes

2018-07-29 Thread Andrey Borodin
Thanks, Thomas!

> 30 июля 2018 г., в 3:58, Thomas Munro  
> написал(а):
> 
> On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin  wrote:
>> Thanks! The problem appeared with commit 701fd0b [0] which dropped
>> validation rules checked in failed test.  Here's the patch with fixed tests.
> 
> Thanks.  I received the attachment.
> 
> Just as an FYI, something about the way your mail client encoded it
> prevented the mail archives from picking it up 

I'll try to investigate this.

Here's patch one more: another attempt to put it into archives.

Best regards, Andrey Borodin.


0001-Covering-GiST-v4.patch
Description: Binary data



Re: Covering GiST indexes

2018-07-29 Thread Thomas Munro
On Sun, Jul 29, 2018 at 10:50 PM, Andrey Borodin  wrote:
> Thanks! The problem appeared with commit 701fd0b [0] which dropped
> validation rules checked in failed test.  Here's the patch with fixed tests.

Thanks.  I received the attachment.

Just as an FYI, something about the way your mail client encoded it
prevented the mail archives from picking it up (according to someone
who knows much more about these things than me, it's buried in the
"multipart/mixed" part, so not visible to the mail archive if it
extracts only the "text/plain" part, or something like that).  That
didn't happen on any of your other patches.  I have no opinion of
whether that's a bug in the mail archive or an unhelpful mail client
or a non-ideal way to post patches, but you can see here that we don't
have the patch in the usual place:

https://www.postgresql.org/message-id/850AE105-F89A-42E4-AD40-FBC6EA5A5A00%40yandex-team.ru

That explains why cfbot didn't automatically test your new patch, so
it still show as broken here:
http://cfbot.cputube.org/andrey-borodin.html

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Covering GiST indexes

2018-07-29 Thread Andrey Borodin
Hi, Thomas!29 июля 2018 г., в 14:28, Thomas Munro  написал(а):On Fri, Jul 6, 2018 at 5:27 AM, Andrey Borodin  wrote:Here is v3 version of the patch. I've fixed some comments and added some words to docs.Hi again Andrey,Cfbot reported the following difference (twice) in theindex_including_gist test:- ERROR: included columns must not intersect with key columns+ ERROR: relation "tbl_gist_idx" already existsWell, both of those errors are valid complaints in this case... maybethe order of checks changed in master since you wrote the patch?Probably better to use a different name for the index anyway.Thanks! The problem appeared with commit 701fd0b [0] which dropped validation rules checked in failed test.  Here's the patch with fixed tests.Best regards, Andrey Borodin.[0] https://github.com/postgres/postgres/commit/701fd0bbc98fe8211d36e96f90753985104cd295

0001-Covering-GiST-v4.patch
Description: Binary data


Re: Covering GiST indexes

2018-07-29 Thread Thomas Munro
On Fri, Jul 6, 2018 at 5:27 AM, Andrey Borodin  wrote:
> Here is v3 version of the patch. I've fixed some comments and added some 
> words to docs.

Hi again Andrey,

Cfbot reported the following difference (twice) in the
index_including_gist test:

- ERROR: included columns must not intersect with key columns
+ ERROR: relation "tbl_gist_idx" already exists

Well, both of those errors are valid complaints in this case... maybe
the order of checks changed in master since you wrote the patch?
Probably better to use a different name for the index anyway.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Covering GiST indexes

2018-07-05 Thread Andrey Borodin
Hi!

> 13 апр. 2018 г., в 17:36, Andrey Borodin  написал(а):
> 
> Here's V2, with basic set of tests.
> Currently, I'm investigating what to document and more places to tests.

Here is v3 version of the patch. I've fixed some comments and added some words 
to docs.


Best regards, Andrey Borodin.


0001-Covering-GiST-v3.patch
Description: Binary data


Re: Covering GiST indexes

2018-04-13 Thread Andrey Borodin


> 12 апр. 2018 г., в 17:03, Teodor Sigaev  написал(а):
> 
> Interesting work. I don't have a time now to learn deep your patch, so, add 
> it to next commitfest, pls.
Thanks! Sure, I'll add it.

> First of all I'd like to see more tests in patch, not only CREATE INDEX.

Here's V2, with basic set of tests.
Currently, I'm investigating what to document and more places to tests.

Also, I do not use generic function index_truncate_tuple(), because it deforms 
and then forms tuple, but GiST usually have already deformed tuple.

Best regards, Andrey Borodin.


0001-Covering-GiST-v2.patch
Description: Binary data


Re: Covering GiST indexes

2018-04-12 Thread Komяpa
> Another thing that could be done for PostGIS geometries is just another
> opclass which
> stores geometries "as is" in leafs.  As I know, geometries contain MBRs
> inside their
> own, so there is no need to store extra MBR.  I think the reason why
> PostGIS
> doesn't have such opclass yet is that geometries are frequently large and
> easily
> can exceed maximum size of index tuple.
>

Geometry datatype layout was designed with TOASTing in mind: most of data
is stored in the header, including type, SRID, box and some other flags, so
getting just several first bytes tells you a lot.

PostGIS datasets are often of a mixed binary length: in buildings, for
example, it is quite common to have a lot of four corner houses, and just
one mapped as a circle, that digitizing software decided to make via
720-point polygon. Since reading it from TOAST for an index would require a
seek of some kind, it may be as efficient to just look it up in the table.

This way a lossy decompress function can help with index only scans: up to
some binary length, try to store the original geometry in the index. After
that, store a shape that has less points in it but covers slightly larger
area, plus a flag that it's not precise. There are a lot of ways to
generate a covering shape with less points: obvious is a box, less obvious
is non axis aligned box, a collection of boxes for a multipart shape, an
outer ring for an area with lots of holes, a convex hull or a smallest
enclosing k-gon.

In GIS there is a problem of border of Russia: the country overlaps over
180 meridian and has a complex border shape. if you take a box of it, it
spans from -180 to 180. If you query any spot in US or in Europe, you'll
have it intersecting with your area, require a full recheck, complete
detoast and decompression, and then "no, it's not a thing we need, skip".
Allowing anything better than a box would help. If we're allowing a complex
shape - we've got the container for it, geometry.

If we're storing geometry in index and original's small, why not allow
complete Index Only Scan on it, and let it skip recheck? :)

Darafei Praliaskouski,
GIS Engineer / Juno Minsk


Re: Covering GiST indexes

2018-04-12 Thread Peter Geoghegan
On Thu, Apr 12, 2018 at 4:00 AM, Andrey Borodin  wrote:
> I have two concerns.
> First one is about INDEX_AM_RESERVED_BIT.
> B-tree uses it as a base for prefix truncation (I'm not quite sure why it is 
> usually called suffix truncation, but this is a matter for other thread).

Since you brought it up, and since I pushed that particular
terminology, I should acknowledge that the original 1977 Bayer paper
on suffix truncation calls a B-Tree with suffix truncation a prefix
B-Tree. However, more recent work seems to consistently refer to the
technique as suffix truncation, while also referring to more advanced
techniques for compressing (not truncating) leaf tuples as prefix
compression.

I suggested suffix truncation because it seemed to be the dominant way
of referring to the technique. And, because it seemed more logical:
the suffix is what gets truncated away.

-- 
Peter Geoghegan



Re: Covering GiST indexes

2018-04-12 Thread Alexander Korotkov
Hi, Andrey!

On Thu, Apr 12, 2018 at 2:00 PM, Andrey Borodin <x4...@yandex-team.ru>
wrote:

> Looks like we finally have covering indexes! And that's cool!
>

Thank you!

So I decided to create a thread to discuss covering GiST indexes.
> Here's a prototype patch implementing this functionality.
> It is quite small (+80 -30) and lacks tests and docs. But it creates a
> context.
>
> I have two concerns.
> First one is about INDEX_AM_RESERVED_BIT.
> B-tree uses it as a base for prefix truncation (I'm not quite sure why it
> is usually called suffix truncation, but this is a matter for other thread).
> GiST , probably, will not use [pre\su]fix truncation. But I'd like to use
> that 13th bit to implement intra-page indexing - a way to improve search
> within gist page. See [0,1]
>

I also think that GiST isn't going to do suffix truncation of keys, because
GiST
is composing keys for every attribute and trying to use them in queries if
even GiST didn't use those attributes in any split.  Depending on data
distribution,
key representation, query and so on that keys may appear useful or not.
And GiST doesn't have any legal interface to determine that in advance.

I think that GiST needs another feature: not suffix truncation, but
different
attribute representation in leaf pages and internal pages.  For example,
now GiST on points stores boxes in leafs.  That's a great waste of space.
So, we might just have different tuple descriptors for internal and leaf
pages of GiST, which could have both different attribute types and
different counts of attributes.  Thankfully GiST pages don't have high keys
and we don't have to mix tuples of different types on the same page
(unlike B-tree).

Second, currently including indexes do not allow same attributes in both
> keys and include parts.
> # create index on x using gist(c) include (c);
> ERROR:  included columns must not intersect with key columns
>
> But it makes sense for example for geometries like PostGIS. Index keys are
> truncated to small MBRs while having whole complex geometry right in an
> index could be handy.
>

The issue discovered in [1] seems to be related.  Thus, after fix provided
there when
column is indexed without index-only scan support, then it can't be used in
index-only
scan anyway (if even it's indexed another time with index-only scan
support).
So, I think we need a better fix for [1] first.

Another thing that could be done for PostGIS geometries is just another
opclass which
stores geometries "as is" in leafs.  As I know, geometries contain MBRs
inside their
own, so there is no need to store extra MBR.  I think the reason why PostGIS
doesn't have such opclass yet is that geometries are frequently large and
easily
can exceed maximum size of index tuple.  The same problem applies to
coverting
indexes too.  Possible solution here might be to support external toasted
attributes
in covering indexes.  But I think that still requires quite amount of work.

1.
https://www.postgresql.org/message-id/1516210494.1798.16.camel%40nlpgo.com

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: Covering GiST indexes

2018-04-12 Thread Aleksander Alekseev
Hello Andrey,

> So I decided to create a thread to discuss covering GiST indexes.
> Here's a prototype patch implementing this functionality.  It is quite
> small (+80 -30) and lacks tests and docs. But it creates a context.

I'm glad you got interested in this area. It would be great to have
covering indexes support for GiST as well. Please don't forget to add
the thread to the nearest commitfest entry. I'm looking forward for the
next versions of your patch!

-- 
Best regards,
Aleksander Alekseev


signature.asc
Description: PGP signature


Re: Covering GiST indexes

2018-04-12 Thread Teodor Sigaev
Interesting work. I don't have a time now to learn deep your patch, so, add it 
to next commitfest, pls. First of all I'd like to see more tests in patch, not 
only CREATE INDEX.



Andrey Borodin wrote:

Hi, hackers!

Looks like we finally have covering indexes! And that's cool!

So I decided to create a thread to discuss covering GiST indexes.
Here's a prototype patch implementing this functionality.
It is quite small (+80 -30) and lacks tests and docs. But it creates a context.

I have two concerns.
First one is about INDEX_AM_RESERVED_BIT.
B-tree uses it as a base for prefix truncation (I'm not quite sure why it is 
usually called suffix truncation, but this is a matter for other thread).
GiST , probably, will not use [pre\su]fix truncation. But I'd like to use that 
13th bit to implement intra-page indexing - a way to improve search within gist 
page. See [0,1]

Second, currently including indexes do not allow same attributes in both keys 
and include parts.
# create index on x using gist(c) include (c);
ERROR:  included columns must not intersect with key columns

But it makes sense for example for geometries like PostGIS. Index keys are 
truncated to small MBRs while having whole complex geometry right in an index 
could be handy.

Any feedback will be appreciated.

Best regards, Andrey Borodin.


[0] 
https://www.postgresql.org/message-id/7780A07B-4D04-41E2-B228-166B41D07EEE%40yandex-team.ru
[1] 
https://www.postgresql.org/message-id/CAJEAwVE0rrr+OBT-P0gDCtXbVDkBBG_WcXwCBK=gho4fewu...@mail.gmail.com
  



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Covering GiST indexes

2018-04-12 Thread Andrey Borodin
Hi, hackers!

Looks like we finally have covering indexes! And that's cool!

So I decided to create a thread to discuss covering GiST indexes.
Here's a prototype patch implementing this functionality.
It is quite small (+80 -30) and lacks tests and docs. But it creates a context.

I have two concerns.
First one is about INDEX_AM_RESERVED_BIT.
B-tree uses it as a base for prefix truncation (I'm not quite sure why it is 
usually called suffix truncation, but this is a matter for other thread).
GiST , probably, will not use [pre\su]fix truncation. But I'd like to use that 
13th bit to implement intra-page indexing - a way to improve search within gist 
page. See [0,1]

Second, currently including indexes do not allow same attributes in both keys 
and include parts.
# create index on x using gist(c) include (c);
ERROR:  included columns must not intersect with key columns

But it makes sense for example for geometries like PostGIS. Index keys are 
truncated to small MBRs while having whole complex geometry right in an index 
could be handy.

Any feedback will be appreciated.

Best regards, Andrey Borodin.


[0] 
https://www.postgresql.org/message-id/7780A07B-4D04-41E2-B228-166B41D07EEE%40yandex-team.ru
[1] 
https://www.postgresql.org/message-id/CAJEAwVE0rrr+OBT-P0gDCtXbVDkBBG_WcXwCBK=gho4fewu...@mail.gmail.com
 

0001-Covering-Gist.patch
Description: Binary data