Re: [HACKERS] Secondary index access optimizations

2018-10-12 Thread Konstantin Knizhnik



On 08.10.2018 00:16, David Rowley wrote:

On 5 October 2018 at 04:45, Konstantin Knizhnik
 wrote:

Will the following test be enough:

-- check that columns for parent table are correctly mapped to child
partition of their order doesn't match
create table paren (a int, b text) partition by range(a);
create table child_1 partition of paren for values from (0) to (10);
create table child_2 (b text, a int);
alter table paren attach partition child_2 for values from (10) to (20);
insert into paren values (generate_series(0,19), generate_series(100,119));

explain (costs off) select * from paren where a between 0 and 9;
explain (costs off) select * from paren where a between 10 and 20;
explain (costs off) select * from paren where a >= 5;
explain (costs off) select * from paren where a <= 15;

select count(*) from paren where a >= 5;
select count(*) from paren where a < 15;

drop table paren cascade;

I started looking at this to see if this test would cause a crash with
the original code, but it does not. The closest I can get is:

drop table parent;
create table parent (a bytea, b int) partition by range(a);
create table child_1 (b int, a bytea);
alter table parent attach partition child_1 for values from ('a') to ('z');
explain (costs off) select * from parent where b = 1;

But despite the varattnos of the bytea and int accidentally matching,
there's no crash due to the way operator_predicate_proof() requires
more than just the varno and varattno to match. It requires the Vars
to be equal(), which includes vartype, and those are not the same. So
the proof just fails.

In short, probably this test is doing nothing in addition to what
various other tests are doing. So given the test is unable to crash
the unfixed code, then I think it's likely not a worthwhile test to
add.

I wrote:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create index listp_a_b_idx on listp (a,b);

and a query:

select * from listp where a = 1 order by b;

if we remove the "a = 1" qual, then listp_a_b_idx can't be used.

I had a look at what allows this query still to use the index and it's
down to pathkey_is_redundant() returning true because there's still an
equivalence class containing {a,1}. I don't quite see any reason why
it would not be okay to rely on that working, but that only works for
pathkeys. If you have a case like:

set max_parallel_workers_per_gather=0;
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
insert into listp select 1,x from generate_Series(1,100) x;
create index listp_a_b_idx on listp (a,b);
vacuum analyze listp;
explain analyze select * from listp where a = 1 and b = 1;

the "a = 1" will be dropped and the index on (a,b) does not get used.

Patched results in:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
  QUERY PLAN

  Append  (cost=0.00..16925.01 rows=1 width=8) (actual
time=0.019..169.231 rows=1 loops=1)
->  Seq Scan on listp1  (cost=0.00..16925.00 rows=1 width=8)
(actual time=0.017..169.228 rows=1 loops=1)
  Filter: (b = 1)
  Rows Removed by Filter: 99
  Planning Time: 0.351 ms
  Execution Time: 169.257 ms
(6 rows)

Whereas unpatched gets:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
 QUERY PLAN
--
  Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660
rows=1 loops=1)
->  Index Only Scan using listp1_a_b_idx on listp1
(cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1
loops=1)
  Index Cond: ((a = 1) AND (b = 1))
  Heap Fetches: 0
  Planning Time: 32.303 ms
  Execution Time: 0.826 ms
(6 rows)

so I was probably wrong about suggesting set_append_rel_size() as a
good place to remove these quals. It should perhaps be done later, or
maybe we can add some sort of marker to the qual to say it does not
need to be enforced during execution. Probably the former would be
best as we don't want to show these in EXPLAIN.

Well, I made a different conclusion from this problem (inability use of 
compound index because of redundant qual elimination).
Is it really good idea to define compound index with first key equal to 
partitioning key?
Restriction on this key in any case will lead to partition pruning. We 
do no need index for it...

In your case if we create index listp_b_idx:

 create index listp_b_idx on listp (b);

then right plan we be generated:

 Append  (cost=0.42..8.45 rows=1 width=8) (actual time=0.046..0.047 
rows=1 loops=1)
   ->  Index Scan using listp1_b_idx on listp1  (cost=0.42..8.44 rows=1 
width=8) 

Re: [HACKERS] Secondary index access optimizations

2018-10-07 Thread David Rowley
On 5 October 2018 at 04:45, Konstantin Knizhnik
 wrote:
> Will the following test be enough:
>
> -- check that columns for parent table are correctly mapped to child
> partition of their order doesn't match
> create table paren (a int, b text) partition by range(a);
> create table child_1 partition of paren for values from (0) to (10);
> create table child_2 (b text, a int);
> alter table paren attach partition child_2 for values from (10) to (20);
> insert into paren values (generate_series(0,19), generate_series(100,119));
>
> explain (costs off) select * from paren where a between 0 and 9;
> explain (costs off) select * from paren where a between 10 and 20;
> explain (costs off) select * from paren where a >= 5;
> explain (costs off) select * from paren where a <= 15;
>
> select count(*) from paren where a >= 5;
> select count(*) from paren where a < 15;
>
> drop table paren cascade;

I started looking at this to see if this test would cause a crash with
the original code, but it does not. The closest I can get is:

drop table parent;
create table parent (a bytea, b int) partition by range(a);
create table child_1 (b int, a bytea);
alter table parent attach partition child_1 for values from ('a') to ('z');
explain (costs off) select * from parent where b = 1;

But despite the varattnos of the bytea and int accidentally matching,
there's no crash due to the way operator_predicate_proof() requires
more than just the varno and varattno to match. It requires the Vars
to be equal(), which includes vartype, and those are not the same. So
the proof just fails.

In short, probably this test is doing nothing in addition to what
various other tests are doing. So given the test is unable to crash
the unfixed code, then I think it's likely not a worthwhile test to
add.

I wrote:
> create table listp (a int, b int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create index listp_a_b_idx on listp (a,b);
>
> and a query:
>
> select * from listp where a = 1 order by b;
>
> if we remove the "a = 1" qual, then listp_a_b_idx can't be used.

I had a look at what allows this query still to use the index and it's
down to pathkey_is_redundant() returning true because there's still an
equivalence class containing {a,1}. I don't quite see any reason why
it would not be okay to rely on that working, but that only works for
pathkeys. If you have a case like:

set max_parallel_workers_per_gather=0;
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
insert into listp select 1,x from generate_Series(1,100) x;
create index listp_a_b_idx on listp (a,b);
vacuum analyze listp;
explain analyze select * from listp where a = 1 and b = 1;

the "a = 1" will be dropped and the index on (a,b) does not get used.

Patched results in:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
 QUERY PLAN

 Append  (cost=0.00..16925.01 rows=1 width=8) (actual
time=0.019..169.231 rows=1 loops=1)
   ->  Seq Scan on listp1  (cost=0.00..16925.00 rows=1 width=8)
(actual time=0.017..169.228 rows=1 loops=1)
 Filter: (b = 1)
 Rows Removed by Filter: 99
 Planning Time: 0.351 ms
 Execution Time: 169.257 ms
(6 rows)

Whereas unpatched gets:

postgres=# explain analyze select * from listp where a = 1 and b = 1;
QUERY PLAN
--
 Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660
rows=1 loops=1)
   ->  Index Only Scan using listp1_a_b_idx on listp1
(cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1
loops=1)
 Index Cond: ((a = 1) AND (b = 1))
 Heap Fetches: 0
 Planning Time: 32.303 ms
 Execution Time: 0.826 ms
(6 rows)

so I was probably wrong about suggesting set_append_rel_size() as a
good place to remove these quals. It should perhaps be done later, or
maybe we can add some sort of marker to the qual to say it does not
need to be enforced during execution. Probably the former would be
best as we don't want to show these in EXPLAIN.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Secondary index access optimizations

2018-10-04 Thread Konstantin Knizhnik



On 04.10.2018 12:19, David Rowley wrote:

On 4 October 2018 at 22:11, Konstantin Knizhnik
 wrote:

On 04.10.2018 06:19, David Rowley wrote:

Please, can you also add a test which tests this code which has a
partition with columns in a different order than it's parent. Having
an INT and a TEXT column is best as if the translations are done
incorrectly it's likely to result in a crash which will alert us to
the issue. It would be good to also verify the test causes a crash if
you temporarily put the code back to using the untranslated qual.

Thanks for working on this.


Thank you very much for detecting and fixing this problem.
I have checked that all changes in plan caused by this fix are correct.
Updated version of the patch is attached.

Can you add the test that I mentioned above?


Will the following test be enough:

-- check that columns for parent table are correctly mapped to child 
partition of their order doesn't match

create table paren (a int, b text) partition by range(a);
create table child_1 partition of paren for values from (0) to (10);
create table child_2 (b text, a int);
alter table paren attach partition child_2 for values from (10) to (20);
insert into paren values (generate_series(0,19), generate_series(100,119));

explain (costs off) select * from paren where a between 0 and 9;
explain (costs off) select * from paren where a between 10 and 20;
explain (costs off) select * from paren where a >= 5;
explain (costs off) select * from paren where a <= 15;

select count(*) from paren where a >= 5;
select count(*) from paren where a < 15;

drop table paren cascade;


--
If so, then updated patch is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 5f74d3b..b628ac7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -37,6 +37,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/predtest.h"
 #include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
@@ -1052,6 +1053,27 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 /* Restriction reduces to constant TRUE, so drop it */
 continue;
 			}
+
+			/*
+			 * For partitions, we may be able to eliminate some quals if
+			 * they're implied by the partition bound.
+			 */
+			if (childrel->partition_qual != NIL)
+			{
+Node	   *checkqual = copyObject(childqual);
+
+/*
+ * Since the partition_qual has all Vars stored as varno=1, we
+ * must convert all Vars of the childqual to have their varnos
+ * set to 1 so that predicate_implied_by can properly match
+ * implied quals.
+ */
+ChangeVarNodes(checkqual, childrel->relid, 1, 0);
+
+if (predicate_implied_by(list_make1(checkqual), childrel->partition_qual, false))
+	continue;
+			}
+
 			/* might have gotten an AND clause, if so flatten it */
 			foreach(lc2, make_ands_implicit((Expr *) childqual))
 			{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3a..8cd9b06 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	 */
 	if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 		set_relation_partition_info(root, rel, relation);
-
+	else if (relation->rd_rel->relispartition)
+		rel->partition_qual = RelationGetPartitionQual(relation);
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/common/zpq_stream.c b/src/common/zpq_stream.c
new file mode 100644
index 000..afd42e9
--- /dev/null
+++ b/src/common/zpq_stream.c
@@ -0,0 +1,386 @@
+#include "postgres_fe.h"
+#include "common/zpq_stream.h"
+#include "c.h"
+#include "pg_config.h"
+
+#if HAVE_LIBZSTD
+
+#include 
+#include 
+
+#define ZPQ_BUFFER_SIZE (8*1024)
+#define ZSTD_COMPRESSION_LEVEL 1
+
+struct ZpqStream
+{
+	ZSTD_CStream*  tx_stream;
+	ZSTD_DStream*  rx_stream;
+	ZSTD_outBuffer tx;
+	ZSTD_inBuffer  rx;
+	size_t tx_not_flushed; /* Amount of datas in internal zstd buffer */
+	size_t tx_buffered;/* Data which is consumed by zpq_read but not yet sent */
+	zpq_tx_functx_func;
+	zpq_rx_funcrx_func;
+	void*  arg;
+	char const*rx_error;/* Decompress error message */
+	size_t tx_total;
+	size_t tx_total_raw;
+	size_t rx_total;
+	size_t rx_total_raw;
+	char   tx_buf[ZPQ_BUFFER_SIZE];
+	char   rx_buf[ZPQ_BUFFER_SIZE];
+};
+
+ZpqStream*
+zpq_create(zpq_tx_func tx_func, zpq_rx_func rx_func, void *arg)
+{
+	ZpqStream* zs = (ZpqStream*)malloc(sizeof(ZpqStream));
+	zs->tx_stream = ZSTD_createCStream();
+	ZSTD_initCStream(zs->tx_stream, ZSTD_COMPRESSION_LEVEL);
+	

Re: [HACKERS] Secondary index access optimizations

2018-10-04 Thread David Rowley
On 4 October 2018 at 22:11, Konstantin Knizhnik
 wrote:
> On 04.10.2018 06:19, David Rowley wrote:
>> Please, can you also add a test which tests this code which has a
>> partition with columns in a different order than it's parent. Having
>> an INT and a TEXT column is best as if the translations are done
>> incorrectly it's likely to result in a crash which will alert us to
>> the issue. It would be good to also verify the test causes a crash if
>> you temporarily put the code back to using the untranslated qual.
>>
>> Thanks for working on this.
>>
> Thank you very much for detecting and fixing this problem.
> I have checked that all changes in plan caused by this fix are correct.
> Updated version of the patch is attached.

Can you add the test that I mentioned above?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Secondary index access optimizations

2018-10-04 Thread Konstantin Knizhnik



On 04.10.2018 06:19, David Rowley wrote:

On 12 September 2018 at 08:32, Konstantin Knizhnik
 wrote:

Also the patch proposed by you is much simple and does mostly the same. Yes,
it is not covering CHECK constraints,

I started to look at this and found a problem in regards to varno
during the predicate_implied_by() test.  The problem is that the
partition bound is always stored as varno=1 (For example, see how
get_qual_for_list() calls makeVar()). This causes the patch to fail in
cases where the partitioned table is not varno=1. You're also
incorrectly using rinfo->clause to pass to predicate_implied_by().
This is a problem because stored here have not been translated to have
the child varattnos.  childqual is the correct thing to use as that's
just been translated. You may have not used it as the varnos will have
been converted to the child's varno, which will never be varno=1, so
you might have found that not to work due to the missing code to
change the varnos to 1.

I've attached the diff for allpaths.c (only) that I ended up with to
make it work. This causes the output of many other regression test to
change, so you'll need to go over these and verify everything is
correct again.

Please, can you also add a test which tests this code which has a
partition with columns in a different order than it's parent. Having
an INT and a TEXT column is best as if the translations are done
incorrectly it's likely to result in a crash which will alert us to
the issue. It would be good to also verify the test causes a crash if
you temporarily put the code back to using the untranslated qual.

Thanks for working on this.


Thank you very much for detecting and fixing this problem.
I have checked that all changes in plan caused by this fix are correct.
Updated version of the patch is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 5f74d3b..b628ac7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -37,6 +37,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/predtest.h"
 #include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
@@ -1052,6 +1053,27 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 /* Restriction reduces to constant TRUE, so drop it */
 continue;
 			}
+
+			/*
+			 * For partitions, we may be able to eliminate some quals if
+			 * they're implied by the partition bound.
+			 */
+			if (childrel->partition_qual != NIL)
+			{
+Node	   *checkqual = copyObject(childqual);
+
+/*
+ * Since the partition_qual has all Vars stored as varno=1, we
+ * must convert all Vars of the childqual to have their varnos
+ * set to 1 so that predicate_implied_by can properly match
+ * implied quals.
+ */
+ChangeVarNodes(checkqual, childrel->relid, 1, 0);
+
+if (predicate_implied_by(list_make1(checkqual), childrel->partition_qual, false))
+	continue;
+			}
+
 			/* might have gotten an AND clause, if so flatten it */
 			foreach(lc2, make_ands_implicit((Expr *) childqual))
 			{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3a..8cd9b06 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	 */
 	if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 		set_relation_partition_info(root, rel, relation);
-
+	else if (relation->rd_rel->relispartition)
+		rel->partition_qual = RelationGetPartitionQual(relation);
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 4f29d9f..67d7a41 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1772,30 +1772,26 @@ explain (costs off) select * from list_parted where a is not null;
 -
  Append
->  Seq Scan on part_ab_cd
- Filter: (a IS NOT NULL)
->  Seq Scan on part_ef_gh
- Filter: (a IS NOT NULL)
->  Seq Scan on part_null_xy
  Filter: (a IS NOT NULL)
-(7 rows)
+(5 rows)
 
 explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef');
 QUERY PLAN
 --
  Append
->  Seq Scan on part_ab_cd
- Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
->  Seq Scan on part_ef_gh
  Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
-(5 rows)
+(4 rows)
 
 explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');
-  QUERY PLAN   

Re: [HACKERS] Secondary index access optimizations

2018-10-03 Thread David Rowley
On 12 September 2018 at 08:32, Konstantin Knizhnik
 wrote:
> Also the patch proposed by you is much simple and does mostly the same. Yes,
> it is not covering CHECK constraints,

I started to look at this and found a problem in regards to varno
during the predicate_implied_by() test.  The problem is that the
partition bound is always stored as varno=1 (For example, see how
get_qual_for_list() calls makeVar()). This causes the patch to fail in
cases where the partitioned table is not varno=1. You're also
incorrectly using rinfo->clause to pass to predicate_implied_by().
This is a problem because stored here have not been translated to have
the child varattnos.  childqual is the correct thing to use as that's
just been translated. You may have not used it as the varnos will have
been converted to the child's varno, which will never be varno=1, so
you might have found that not to work due to the missing code to
change the varnos to 1.

I've attached the diff for allpaths.c (only) that I ended up with to
make it work. This causes the output of many other regression test to
change, so you'll need to go over these and verify everything is
correct again.

Please, can you also add a test which tests this code which has a
partition with columns in a different order than it's parent. Having
an INT and a TEXT column is best as if the translations are done
incorrectly it's likely to result in a crash which will alert us to
the issue. It would be good to also verify the test causes a crash if
you temporarily put the code back to using the untranslated qual.

Thanks for working on this.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


skip_implied_child_quals_allpaths.diff
Description: Binary data


Re: [HACKERS] Secondary index access optimizations

2018-10-01 Thread Michael Paquier
On Wed, Sep 12, 2018 at 11:43:09AM +0300, Konstantin Knizhnik wrote:
> Looks like this qual is considered for choosing optimal path before it is
> removed from list of quals in set_append_rel_size.

Hm...  The latest reviews have not been addressed yet, so I have marked
this as returned with feedback.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] Secondary index access optimizations

2018-09-12 Thread Konstantin Knizhnik




On 12.09.2018 08:14, David Rowley wrote:

On 12 September 2018 at 08:32, Konstantin Knizhnik
 wrote:

Also the patch proposed by you is much simple and does mostly the same. Yes,
it is not covering CHECK constraints,
but as far as partitioning becomes now standard in Postgres, I do not think
that much people will use old inheritance mechanism and CHECK constraints.
In any case, there are now many optimizations which works only for
partitions, but not for inherited tables.

I've not had time to look at your updated patch yet, but one thing I
thought about after my initial review, imagine you have a setup like:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create index listp_a_b_idx on listp (a,b);

and a query:

select * from listp where a = 1 order by b;

if we remove the "a = 1" qual, then listp_a_b_idx can't be used.


Looks like this qual is considered for choosing optimal path before it 
is removed from list of quals in set_append_rel_size.

At least the presence of this patch is not breaking the plan in this case:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create table listp2 partition of listp for values in(2);
create index listp_a_b_idx on listp (a,b);
insert into listp values (1,generate_series(1,10));
insert into listp values (2,generate_series(11,20));
explain select * from listp where a = 1 order by b;
   QUERY PLAN

 Merge Append  (cost=0.30..4630.43 rows=10 width=8)
   Sort Key: listp1.b
   ->  Index Only Scan using listp1_a_b_idx on listp1 
(cost=0.29..3630.42 rows=10 width=8)

(3 rows)



I didn't test this in your patch, but I guess since the additional
quals are not applied to the children in set_append_rel_size() that by
the time set_append_rel_pathlist() is called, then when we go
generating the paths, the (a,b) index won't be any good.

Perhaps there's some workaround like inventing some sort of "no-op"
qual that exists in planning but never makes it way down to scans.
Although I admit to not having fully thought that idea through.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Secondary index access optimizations

2018-09-11 Thread David Rowley
On 12 September 2018 at 08:32, Konstantin Knizhnik
 wrote:
> Also the patch proposed by you is much simple and does mostly the same. Yes,
> it is not covering CHECK constraints,
> but as far as partitioning becomes now standard in Postgres, I do not think
> that much people will use old inheritance mechanism and CHECK constraints.
> In any case, there are now many optimizations which works only for
> partitions, but not for inherited tables.

I've not had time to look at your updated patch yet, but one thing I
thought about after my initial review, imagine you have a setup like:

create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create index listp_a_b_idx on listp (a,b);

and a query:

select * from listp where a = 1 order by b;

if we remove the "a = 1" qual, then listp_a_b_idx can't be used.

I didn't test this in your patch, but I guess since the additional
quals are not applied to the children in set_append_rel_size() that by
the time set_append_rel_pathlist() is called, then when we go
generating the paths, the (a,b) index won't be any good.

Perhaps there's some workaround like inventing some sort of "no-op"
qual that exists in planning but never makes it way down to scans.
Although I admit to not having fully thought that idea through.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Secondary index access optimizations

2018-09-11 Thread Konstantin Knizhnik

Hi David,


On 11.09.2018 06:49, David Rowley wrote:

On 9 July 2018 at 13:26, David Rowley  wrote:

I started looking over this patch and have a few comments:

Hi Konstantin,

Wondering, are you going to be submitting an updated patch for this
commitfest?  If not then I think we can mark this as returned with
feedback as it's been waiting on author for quite a while now.



First of all thank your for review.
I am very sorry for delay with answer: I was in vacation in July and 
just forgot about this mail.
I have to agree with you that it is better to split this patch into two 
and that using range type for open and close intervals match is so good 
idea.
Also the patch proposed by you is much simple and does mostly the same. 
Yes, it is not covering CHECK constraints,
but as far as partitioning becomes now standard in Postgres, I do not 
think that much people will use old inheritance mechanism and CHECK 
constraints. In any case, there are now many optimizations which works 
only for partitions, but not for inherited tables.


I attach to this mail your patch combined with corrected tests outputs.
Unfortunately the performance gain is not so much as I expected (even 
without additional

predicate_implied_by() smarts). On the following database:

create table bt (k integer, v integer) partition by range (k);
create table dt1 partition of bt for values from (1) to (10001);
create table dt2 partition of bt for values from (10001) to (20001);
create index dti1 on dt1(v);
create index dti2 on dt2(v);
insert into bt values (generate_series(1,2), generate_series(1,2));
analyze bt;

and pgbench script:

\set x random(1, 1)
select * from bt where k between 1 and 20001 and v=:x;

I got ~170k TPS with this patch and about ~160k TPS without it.
My hypothesis was that we have to perform redundant predicate only once 
(for one selected record)

and it adds no so much overhead comparing with index search cost.
So I tried another version of  the query which selects large number of 
records:


select sum(*) from bt where k between 1 and 20001 and v between :x and 
:x + 1000;


Now patch version shows 23k TPS vs. 19k for Vanilla.
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 5db1688..48359f4 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -37,6 +37,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/predtest.h"
 #include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
@@ -1039,6 +1040,12 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 /* Restriction reduces to constant TRUE, so drop it */
 continue;
 			}
+
+			/* We can skip any quals that are implied by any partition bound. */
+			if (childrel->partition_qual != NIL &&
+predicate_implied_by(list_make1(rinfo->clause), childrel->partition_qual, false))
+continue;
+
 			/* might have gotten an AND clause, if so flatten it */
 			foreach(lc2, make_ands_implicit((Expr *) childqual))
 			{
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 8369e3a..8cd9b06 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -450,7 +450,8 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 	 */
 	if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 		set_relation_partition_info(root, rel, relation);
-
+	else if (relation->rd_rel->relispartition)
+		rel->partition_qual = RelationGetPartitionQual(relation);
 	heap_close(relation, NoLock);
 
 	/*
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index 4f29d9f..91219fa 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1772,30 +1772,26 @@ explain (costs off) select * from list_parted where a is not null;
 -
  Append
->  Seq Scan on part_ab_cd
- Filter: (a IS NOT NULL)
->  Seq Scan on part_ef_gh
- Filter: (a IS NOT NULL)
->  Seq Scan on part_null_xy
  Filter: (a IS NOT NULL)
-(7 rows)
+(5 rows)
 
 explain (costs off) select * from list_parted where a in ('ab', 'cd', 'ef');
 QUERY PLAN
 --
  Append
->  Seq Scan on part_ab_cd
- Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
->  Seq Scan on part_ef_gh
  Filter: ((a)::text = ANY ('{ab,cd,ef}'::text[]))
-(5 rows)
+(4 rows)
 
 explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');
-  QUERY PLAN   

+  QUERY PLAN  

Re: [HACKERS] Secondary index access optimizations

2018-09-10 Thread David Rowley
On 9 July 2018 at 13:26, David Rowley  wrote:
> I started looking over this patch and have a few comments:

Hi Konstantin,

Wondering, are you going to be submitting an updated patch for this
commitfest?  If not then I think we can mark this as returned with
feedback as it's been waiting on author for quite a while now.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Secondary index access optimizations

2018-07-08 Thread David Rowley
On 22 March 2018 at 22:38, Konstantin Knizhnik
 wrote:
> Attached please find rebased version of the patch.

Hi,

I started looking over this patch and have a few comments:

I don't think this range type idea is a great one. I don't think it's
going to ever perform very well.  I also see you're not checking the
collation of the type anywhere.  As of now, no range types have
collation support, but you can't really go and code this with that
assumption. I also don't really like the sequence scan over pg_range.

Probably a better way to do this would be to add a new bt proc, like
what was done in 0a459cec for 2 new functions. Something like
BTISNEXTVAL_PROC and BTISPREVVAL_PROC. You'd then need to define
functions for all the types based on integers, making functions which
take 2 parameters of the type, and an additional collation param. The
functions should return bool.  int4_isnextval(2, 3, InvalidOid) would
return true.  You'd need to return false on wraparound.

I also think that the patch is worth doing without the additional
predicate_implied_by() smarts. In fact, I think strongly that the two
should be considered as two independent patches. Partial indexes
suffer from the same issue you're trying to address here and that
would be resolved by any patch which makes changes to
predicate_implied_by().

Probably the best place to put the code to skip the redundant quals is
inside set_append_rel_size(). There's already code there that skips
quals that are seen as const TRUE. This applies for UNION ALL targets
with expressions that can be folded to constants once the qual is
passed through adjust_appendrel_attrs()

For example:
explain select * from (select 1 as a from pg_class union all select 2
from pg_class) t where a = 1;

I've attached a patch to show what I had in mind. I had to change how
partition_qual is populated, which I was surprised to see only gets
populated for sub-partitioned tables (the top-level parent won't have
a qual since it's not partitioned)  I didn't update the partition
pruning code that assumes this is only populated for sub-partitioned
table. That will need to be done. The patch comes complete with all
the failing regression tests where the redundant quals have been
removed by the new code.

If you want to make this work for CHECK constraints too, then I think
the code for that can be added to the same location as the code I
added in the attached patch. You'd just need to fetch the check
constraint quals and just add some extra code to check if the qual is
redundant.

Some new performance benchmarks would then be useful in order to find
out how much overhead there is. We might learn that we don't want to
enable it by default if it's too expensive.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


skip_redundant_partition_quals_poc.patch
Description: Binary data


Re: [HACKERS] Secondary index access optimizations

2018-03-22 Thread Konstantin Knizhnik



On 21.03.2018 20:30, Konstantin Knizhnik wrote:



On 01.03.2018 23:15, Andres Freund wrote:

Hi,


This patch seems like quite a good improvement. One thing I've not
really looked at but am slightly concerned in passing: Are there cases
where we now would do a lot of predicate pruning work even though the
overhead of just evaluating the qual is trivial, e.g. because there's
only one row due to a pkey? The predtest code is many things but
lightning fast is not one of them.  Obviously that won't matter for
analytics queries, but I could see it hurt in OLTPish workloads...

Greetings,

Andres Freund


Hi,
I am sorry for delay with answer.

I understand your concern and did the following experiment.
I have initialized the database in the following way:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 1)) inherits (base);
create table part2 (check (k between 10001 and 2)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate series(1,1), random()*10);
insert into part2 values (generate_series(10001,2), random()*10);
vacuum analyze part1;
vacuum analyze part2;

Vanilla Postgres uses the following plan:

explain select * from base where k between 1 and 2 and v = 100;
  QUERY PLAN
---
 Append  (cost=0.00..16.63 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 2) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 2))
   ->  Index Scan using pi2 on part2  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 2))
(9 rows)

Execution of this query 10 times gives is done in 12 seconds.
With applied patch query plan is changed to:

  QUERY PLAN
---
 Append  (cost=0.00..16.62 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 2) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using pi2 on part2  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
(7 rows)

Elapsed time of 10 query executions is 13 seconds.
So you was right that increased query optimization time exceeds 
advantage of extra checks elimination.

But it is true only if we are not using prepare statements.
With prepared statements results are the following:

Vanilla:   0m3.915s
This patch: 0m3.563s

So improvement is not so large, but it exists.
If somebody wants to repeat my experiments, I attached to this mail 
small shell script which I used to run query several times.

Non-prepared query is launched using the following command:

time ./repeat-query.sh "select * from base where k between 1 and 2 
and v = 100" 10


and prepared query:

time ./repeat-query.sh "execute select100" 10 "prepare select100 
as select * from base where k between 1 and 2 and v = 100"


And once again I want to notice that using prepared statements can 
increase performance almost 3 times!
As far as using prepared statements is not always possible I want to 
recall autoprepare patch waiting at the commitfest:


https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b...@postgrespro.ru

--
Konstantin Knizhnik
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Attached please find rebased version of the patch.
Also I do more testing, now using pgbench.
Scripts for initialization of the database and for custom script for 
pgbench are attached.

Results at my computer are the following:

pgbench options
Vanilla
Patch
-c 1
9208
8289
-c 1 -M prepared
38503   41206
-c 10
39224
34040
-c 10 -M prepared
165465
172874






--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index a2b1384..84b8543 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -631,12 +631,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  

Re: [HACKERS] Secondary index access optimizations

2018-03-21 Thread Konstantin Knizhnik



On 01.03.2018 23:15, Andres Freund wrote:

Hi,


This patch seems like quite a good improvement. One thing I've not
really looked at but am slightly concerned in passing: Are there cases
where we now would do a lot of predicate pruning work even though the
overhead of just evaluating the qual is trivial, e.g. because there's
only one row due to a pkey? The predtest code is many things but
lightning fast is not one of them.  Obviously that won't matter for
analytics queries, but I could see it hurt in OLTPish workloads...

Greetings,

Andres Freund


Hi,
I am sorry for delay with answer.

I understand your concern and did the following experiment.
I have initialized the database in the following way:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 1)) inherits (base);
create table part2 (check (k between 10001 and 2)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate series(1,1), random()*10);
insert into part2 values (generate_series(10001,2), random()*10);
vacuum analyze part1;
vacuum analyze part2;

Vanilla Postgres uses the following plan:

explain select * from base where k between 1 and 2 and v = 100;
  QUERY PLAN
---
 Append  (cost=0.00..16.63 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 2) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 2))
   ->  Index Scan using pi2 on part2  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 2))
(9 rows)

Execution of this query 10 times gives is done in 12 seconds.
With applied patch query plan is changed to:

  QUERY PLAN
---
 Append  (cost=0.00..16.62 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 2) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using pi2 on part2  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
(7 rows)

Elapsed time of 10 query executions is 13 seconds.
So you was right that increased query optimization time exceeds 
advantage of extra checks elimination.

But it is true only if we are not using prepare statements.
With prepared statements results are the following:

Vanilla:   0m3.915s
This patch: 0m3.563s

So improvement is not so large, but it exists.
If somebody wants to repeat my experiments, I attached to this mail 
small shell script which I used to run query several times.

Non-prepared query is launched using the following command:

time ./repeat-query.sh "select * from base where k between 1 and 2 
and v = 100" 10


and prepared query:

time ./repeat-query.sh "execute select100" 10 "prepare select100 as 
select * from base where k between 1 and 2 and v = 100"


And once again I want to notice that using prepared statements can 
increase performance almost 3 times!
As far as using prepared statements is not always possible I want to 
recall autoprepare patch waiting at the commitfest:


https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b...@postgrespro.ru

--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



repeat-query.sh
Description: application/shellscript


Re: [HACKERS] Secondary index access optimizations

2018-03-01 Thread Andres Freund
Hi,


This patch seems like quite a good improvement. One thing I've not
really looked at but am slightly concerned in passing: Are there cases
where we now would do a lot of predicate pruning work even though the
overhead of just evaluating the qual is trivial, e.g. because there's
only one row due to a pkey? The predtest code is many things but
lightning fast is not one of them.  Obviously that won't matter for
analytics queries, but I could see it hurt in OLTPish workloads...

Greetings,

Andres Freund



Re: [HACKERS] Secondary index access optimizations

2018-01-29 Thread Konstantin Knizhnik



On 29.01.2018 16:24, Konstantin Knizhnik wrote:

On 29.01.2018 07:34, Thomas Munro wrote:

On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik
 wrote:

On 19.01.2018 16:14, Antonin Houska wrote:

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not 
tell

enough
about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.


Thank you very much.
Shame on me that I didn't notice such solution myself - such 
checking of

B-tree strategy is done in the same predtest.c file!
Now checking of predicate clauses compatibility is done in much more 
elegant

and general way.
Attached please find new version of the patch.

I wonder if you should create a new index and SysCache entry for
looking up range types by subtype.  I'll be interested to see what
others have to say about this range type-based technique -- it seems
clever to me, but I'm not familiar enough with this stuff to say if
there is some other approach you should be using instead.

I think that it is good idea to add caching for range type lookup.
If community think that it will be useful I can try to add such 
mechanism.
But it seems to be not so trivial, especially properly handle 
invalidations.




I have added caching of element_type->range_type mapping to typcache.c. 
But it is not invalidated now if pg_range relation is changed.
Also if there are more than one range types for the specified element 
type then first of them is used.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 44f6b03..c7bf118 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/* CE failed, so finish copying/modifying join quals. */
 		childrel->joininfo = 

Re: [HACKERS] Secondary index access optimizations

2018-01-29 Thread Konstantin Knizhnik

On 29.01.2018 07:34, Thomas Munro wrote:

On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik
 wrote:

On 19.01.2018 16:14, Antonin Houska wrote:

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not tell
enough
about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.


Thank you very much.
Shame on me that I didn't notice such solution myself - such checking of
B-tree strategy is done in the same predtest.c file!
Now checking of predicate clauses compatibility is done in much more elegant
and general way.
Attached please find new version of the patch.

I wonder if you should create a new index and SysCache entry for
looking up range types by subtype.  I'll be interested to see what
others have to say about this range type-based technique -- it seems
clever to me, but I'm not familiar enough with this stuff to say if
there is some other approach you should be using instead.

I think that it is good idea to add caching for range type lookup.
If community think that it will be useful I can try to add such mechanism.
But it seems to be not so trivial, especially properly handle invalidations.


Some superficial project style comments (maybe these could be fixed
automatically with pgindent?):

+static TypeCacheEntry* lookup_range_type(Oid type)

+static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid
oper, Const* literal)

... these should be like this:

static RangeType *
operator_to_range(...

I think the idea is that you can always search for a function
definition with by looking for "name(" at the start of a line, so we
put a newline there.  Then there is the whitespace before "*", not
after it.

+ if (pred_range && clause_range && range_eq_internal(typcache,
pred_range, clause_range))
+ {
+ return true;
+ }

Unnecessary braces.

+/*
+ * Get range type for the corresprent scalar type.
+ * Returns NULl if such range type is not found.
+ * This function performs sequential scan in pg_range table,
+ * but since number of range rtype is not expected to be large (6
builtin range types),
+ * it should not be a problem.
+ */

Typos.


Thank you.
I fixed this issues.
New patch is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c 

Re: [HACKERS] Secondary index access optimizations

2018-01-28 Thread Thomas Munro
On Sat, Jan 20, 2018 at 5:41 AM, Konstantin Knizhnik
 wrote:
> On 19.01.2018 16:14, Antonin Houska wrote:
>> you should test the operator B-tree strategy: BTLessStrategyNumber,
>> BTLessEqualStrategyNumber, etc. The operator string alone does not tell
>> enough
>> about the operator semantics.
>>
>> The strategy can be found in the pg_amop catalog.
>> get_op_btree_interpretation() function may be useful, but there may be
>> something better in utils/cache/lsyscache.c.
>>
> Thank you very much.
> Shame on me that I didn't notice such solution myself - such checking of
> B-tree strategy is done in the same predtest.c file!
> Now checking of predicate clauses compatibility is done in much more elegant
> and general way.
> Attached please find new version of the patch.

I wonder if you should create a new index and SysCache entry for
looking up range types by subtype.  I'll be interested to see what
others have to say about this range type-based technique -- it seems
clever to me, but I'm not familiar enough with this stuff to say if
there is some other approach you should be using instead.

Some superficial project style comments (maybe these could be fixed
automatically with pgindent?):

+static TypeCacheEntry* lookup_range_type(Oid type)

+static RangeType* operator_to_range(TypeCacheEntry *typcache, Oid
oper, Const* literal)

... these should be like this:

static RangeType *
operator_to_range(...

I think the idea is that you can always search for a function
definition with by looking for "name(" at the start of a line, so we
put a newline there.  Then there is the whitespace before "*", not
after it.

+ if (pred_range && clause_range && range_eq_internal(typcache,
pred_range, clause_range))
+ {
+ return true;
+ }

Unnecessary braces.

+/*
+ * Get range type for the corresprent scalar type.
+ * Returns NULl if such range type is not found.
+ * This function performs sequential scan in pg_range table,
+ * but since number of range rtype is not expected to be large (6
builtin range types),
+ * it should not be a problem.
+ */

Typos.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] Secondary index access optimizations

2018-01-19 Thread Konstantin Knizhnik



On 19.01.2018 16:14, Antonin Houska wrote:

Konstantin Knizhnik  wrote:


On 11.01.2018 12:34, Antonin Houska wrote:

Konstantin Knizhnik  wrote:
I haven't thought that much about details, so just one comment: you shouldn't
need the conversion to text and back to binary form. utils/adt/rangetypes.c
contains constructors that accept the binary values.


Attached please find new version of the patch which uses range type to
interval matching in operator_predicate_proof function.

I think that instead of checking the operator name, e.g.

strcmp(oprname, "<")

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough
about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.


Thank you very much.
Shame on me that I didn't notice such solution myself - such checking of 
B-tree strategy is done in the same predtest.c file!
Now checking of predicate clauses compatibility is done in much more 
elegant and general way.

Attached please find new version of the patch.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 44f6b03..c7bf118 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/* CE failed, so finish copying/modifying join quals. */
 		childrel->joininfo = (List *)
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index f743871..f763a97 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1466,6 +1466,51 @@ relation_excluded_by_constraints(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Remove from restrictions list items implied by table constraints
+ */
+void 

Re: [HACKERS] Secondary index access optimizations

2018-01-19 Thread Antonin Houska
Konstantin Knizhnik  wrote:

> On 11.01.2018 12:34, Antonin Houska wrote:

> > Konstantin Knizhnik  wrote:

> > I haven't thought that much about details, so just one comment: you 
> > shouldn't
> > need the conversion to text and back to binary form. utils/adt/rangetypes.c
> > contains constructors that accept the binary values.
> >
> Attached please find new version of the patch which uses range type to
> interval matching in operator_predicate_proof function.

I think that instead of checking the operator name, e.g.

strcmp(oprname, "<")

you should test the operator B-tree strategy: BTLessStrategyNumber,
BTLessEqualStrategyNumber, etc. The operator string alone does not tell enough
about the operator semantics.

The strategy can be found in the pg_amop catalog.
get_op_btree_interpretation() function may be useful, but there may be
something better in utils/cache/lsyscache.c.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: [HACKERS] Secondary index access optimizations

2018-01-11 Thread Konstantin Knizhnik



On 11.01.2018 12:34, Antonin Houska wrote:

Konstantin Knizhnik  wrote:


On 09.01.2018 19:48, Antonin Houska wrote:


Have you considered using the range types (internally in
operator_predicate_proof()) instead of hard-coding operator OIDs? The range
types do have the knowledge that k < 20001 and k <= 2 are equivalent for
integer type:

postgres=# SELECT int4range '(, 20001)' = int4range '(, 2]';
  ?column?
--
  t
(1 row)

It is bright idea, but it is not quit clear to me how to implement it.
There are several builtin ranges types in Postgres: int4range, int8range, 
numrange, tsrange, tstzrange, daterange.

Among them int4range, int8range and daterange are discrete types having 
canonical function, for which this transformation rules are applicable.
Now I perform checks for all this types. So the challenge is to support user 
defined range types with canonical function.
As input operator_predicate_proof function has Oid of comparison operator and 
Const * expression representing literal value.
So I see the following generic way of checking equivalence of ranges:

1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if it is 
'<' or '>' then it is open interval.
2. Convert Const to text (using type's out function) and construct interval: '(,"$value"]' for '<=', 
'["$value",)' for '>=', '(,"$value")' for '<' and '("$value",)' for '>'.
3. Find range type from type of the constant:
select * from pg_range where rngsubtype=?;
4. Try to cast constructed above string to this range type (using type's in 
function).
5. Compare two produced ranges and if them are equal, then 
operator_predicate_proof should return true.

I haven't thought that much about details, so just one comment: you shouldn't
need the conversion to text and back to binary form. utils/adt/rangetypes.c
contains constructors that accept the binary values.

Attached please find new version of the patch which uses range type to 
interval matching in operator_predicate_proof function.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 44f6b03..c7bf118 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, 

Re: [HACKERS] Secondary index access optimizations

2018-01-11 Thread Antonin Houska
Konstantin Knizhnik  wrote:

> On 09.01.2018 19:48, Antonin Houska wrote:
> 
>> Have you considered using the range types (internally in
>> operator_predicate_proof()) instead of hard-coding operator OIDs? The range
>> types do have the knowledge that k < 20001 and k <= 2 are equivalent for
>> integer type:
>> 
>> postgres=# SELECT int4range '(, 20001)' = int4range '(, 2]';
>>  ?column? 
>> --
>>  t
>> (1 row)

> It is bright idea, but it is not quit clear to me how to implement it.
> There are several builtin ranges types in Postgres: int4range, int8range, 
> numrange, tsrange, tstzrange, daterange.
> 
> Among them int4range, int8range and daterange are discrete types having 
> canonical function, for which this transformation rules are applicable.
> Now I perform checks for all this types. So the challenge is to support user 
> defined range types with canonical function.
> As input operator_predicate_proof function has Oid of comparison operator and 
> Const * expression representing literal value.
> So I see the following generic way of checking equivalence of ranges:
> 
> 1. Get name of operator. If it is '<=' or '>=' then it is closed interval, if 
> it is '<' or '>' then it is open interval.
> 2. Convert Const to text (using type's out function) and construct interval: 
> '(,"$value"]' for '<=', '["$value",)' for '>=', '(,"$value")' for '<' and 
> '("$value",)' for '>'.
> 3. Find range type from type of the constant:
> select * from pg_range where rngsubtype=?;
> 4. Try to cast constructed above string to this range type (using type's in 
> function).
> 5. Compare two produced ranges and if them are equal, then 
> operator_predicate_proof should return true.

I haven't thought that much about details, so just one comment: you shouldn't
need the conversion to text and back to binary form. utils/adt/rangetypes.c
contains constructors that accept the binary values.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: [HACKERS] Secondary index access optimizations

2018-01-10 Thread Konstantin Knizhnik



On 09.01.2018 19:48, Antonin Houska wrote:

Konstantin Knizhnik  wrote:


On 14.08.2017 19:33, Konstantin Knizhnik wrote:

  On 14.08.2017 12:37, Konstantin Knizhnik wrote:

  Hi hackers,

  I am trying to compare different ways of optimizing work with huge 
append-only tables in PostgreSQL where primary key is something like timestamp 
and queries are usually accessing most recent data using some secondary keys. 
Size of secondary index is one of the most critical
  factors limiting insert/search performance. As far as data is inserted in 
timestamp ascending order, access to primary key is well localized and accessed 
tables are present in memory. But if we create secondary key for the whole 
table, then access to it will require random reads from
  the disk and significantly decrease performance.

  There are two well known solutions of the problem:
  1. Table partitioning
  2. Partial indexes

  This approaches I want to compare. First of all I want to check if optimizer 
is able to generate efficient query execution plan covering different time 
intervals.
  Unfortunately in both cases generated plan is not optimal.

  1. Table partitioning:

  create table base (k integer primary key, v integer);
  create table part1 (check (k between 1 and 1)) inherits (base);
  create table part2 (check (k between 10001 and 2)) inherits (base);
  create index pi1 on part1(v);
  create index pi2 on part2(v);
  insert int part1 values (generate series(1,1), random());
  insert into part2 values (generate_series(10001,2), random());
  explain select * from base where k between 1 and 2 and v = 100;
  QUERY PLAN
  ---
  Append (cost=0.00..15.65 rows=3 width=8)
  -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8)
  Filter: ((k >= 1) AND (k <= 2) AND (v = 100))
  -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8)
  Index Cond: (v = 100)
  Filter: ((k >= 1) AND (k <= 2))
  -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8)
  Index Cond: (v = 100)
  Filter: ((k >= 1) AND (k <= 2))

  Questions:
  - Is there some way to avoid sequential scan of parent table? Yes, it is 
empty and so sequential scan will not take much time, but ... it still requires 
some additional actions and so increasing query execution time.
  - Why index scan of partition indexes includes filter condition if it is 
guaranteed by check constraint that all records of this partition match search 
predicate?

  2. Partial indexes:

  create table t (k integer primary key, v integer);
  insert into t values (generate_series(1,2),random());
  create index i1 on t(v) where k between 1 and 1;
  create index i2 on t(v) where k between 10001 and 2;
  postgres=# explain select * from t where k between 1 and 1 and v = 100;
  QUERY PLAN
  
  Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
  Index Cond: (v = 100)
  (2 rows)

  Here we get perfect plan. Let's try to extend search interval:

  postgres=# explain select * from t where k between 1 and 2 and v = 100;
  QUERY PLAN
  --
  Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8)
  Index Cond: ((k >= 1) AND (k <= 2))
  Filter: (v = 100)
  (3 rows)

  Unfortunately in this case Postgres is not able to apply partial indexes.
  And this is what I expected to get:

  postgres=# explain select * from t where k between 1 and 1 and v = 100 
union all select * from t where k between 10001 and 2 and v = 100;
  QUERY PLAN
  --
  Append (cost=0.29..14.58 rows=2 width=8)
  -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8)
  Index Cond: (v = 100)
  -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8)
  Index Cond: (v = 100)

  I wonder if there are some principle problems in supporting this two things 
in optimizer:
  1. Remove search condition for primary key if it is fully satisfied by 
derived table check constraint.
  2. Append index scans of several partial indexes if specified interval is 
covered by their conditions.

  I wonder if someone is familiar with this part of optimizer ad can easily fix 
it.
  Otherwise I am going to spend some time on solving this problems (if 
community think that such optimizations will be useful).

  Replying to myself: the following small patch removes redundant checks from 
index scans for derived tables:

  diff --git a/src/backend/optimizer/util/plancat.c 
b/src/backend/optimizer/util/plancat.c
  index 939045d..1f7c9cf 100644
  --- a/src/backend/optimizer/util/plancat.c
  +++ b/src/backend/optimizer/util/plancat.c
  @@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root,
  if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, 

Re: [HACKERS] Secondary index access optimizations

2018-01-09 Thread Antonin Houska
Konstantin Knizhnik  wrote:

> On 14.08.2017 19:33, Konstantin Knizhnik wrote:
> 
>  On 14.08.2017 12:37, Konstantin Knizhnik wrote: 
> 
>  Hi hackers, 
> 
>  I am trying to compare different ways of optimizing work with huge 
> append-only tables in PostgreSQL where primary key is something like 
> timestamp and queries are usually accessing most recent data using some 
> secondary keys. Size of secondary index is one of the most critical
>  factors limiting insert/search performance. As far as data is inserted in 
> timestamp ascending order, access to primary key is well localized and 
> accessed tables are present in memory. But if we create secondary key for the 
> whole table, then access to it will require random reads from
>  the disk and significantly decrease performance. 
> 
>  There are two well known solutions of the problem: 
>  1. Table partitioning 
>  2. Partial indexes 
> 
>  This approaches I want to compare. First of all I want to check if optimizer 
> is able to generate efficient query execution plan covering different time 
> intervals. 
>  Unfortunately in both cases generated plan is not optimal. 
> 
>  1. Table partitioning: 
> 
>  create table base (k integer primary key, v integer); 
>  create table part1 (check (k between 1 and 1)) inherits (base); 
>  create table part2 (check (k between 10001 and 2)) inherits (base); 
>  create index pi1 on part1(v); 
>  create index pi2 on part2(v); 
>  insert int part1 values (generate series(1,1), random()); 
>  insert into part2 values (generate_series(10001,2), random()); 
>  explain select * from base where k between 1 and 2 and v = 100; 
>  QUERY PLAN 
>  --- 
>  Append (cost=0.00..15.65 rows=3 width=8) 
>  -> Seq Scan on base (cost=0.00..0.00 rows=1 width=8) 
>  Filter: ((k >= 1) AND (k <= 2) AND (v = 100)) 
>  -> Index Scan using pi1 on part1 (cost=0.29..8.31 rows=1 width=8) 
>  Index Cond: (v = 100) 
>  Filter: ((k >= 1) AND (k <= 2)) 
>  -> Index Scan using pi2 on part2 (cost=0.29..7.34 rows=1 width=8) 
>  Index Cond: (v = 100) 
>  Filter: ((k >= 1) AND (k <= 2)) 
> 
>  Questions: 
>  - Is there some way to avoid sequential scan of parent table? Yes, it is 
> empty and so sequential scan will not take much time, but ... it still 
> requires some additional actions and so increasing query execution time. 
>  - Why index scan of partition indexes includes filter condition if it is 
> guaranteed by check constraint that all records of this partition match 
> search predicate? 
> 
>  2. Partial indexes: 
> 
>  create table t (k integer primary key, v integer); 
>  insert into t values (generate_series(1,2),random()); 
>  create index i1 on t(v) where k between 1 and 1; 
>  create index i2 on t(v) where k between 10001 and 2; 
>  postgres=# explain select * from t where k between 1 and 1 and v = 100; 
>  QUERY PLAN 
>   
>  Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) 
>  Index Cond: (v = 100) 
>  (2 rows) 
> 
>  Here we get perfect plan. Let's try to extend search interval: 
> 
>  postgres=# explain select * from t where k between 1 and 2 and v = 100; 
>  QUERY PLAN 
>  -- 
>  Index Scan using t_pkey on t (cost=0.29..760.43 rows=1 width=8) 
>  Index Cond: ((k >= 1) AND (k <= 2)) 
>  Filter: (v = 100) 
>  (3 rows) 
> 
>  Unfortunately in this case Postgres is not able to apply partial indexes. 
>  And this is what I expected to get: 
> 
>  postgres=# explain select * from t where k between 1 and 1 and v = 100 
> union all select * from t where k between 10001 and 2 and v = 100; 
>  QUERY PLAN 
>  -- 
>  Append (cost=0.29..14.58 rows=2 width=8) 
>  -> Index Scan using i1 on t (cost=0.29..7.28 rows=1 width=8) 
>  Index Cond: (v = 100) 
>  -> Index Scan using i2 on t t_1 (cost=0.29..7.28 rows=1 width=8) 
>  Index Cond: (v = 100) 
> 
>  I wonder if there are some principle problems in supporting this two things 
> in optimizer: 
>  1. Remove search condition for primary key if it is fully satisfied by 
> derived table check constraint. 
>  2. Append index scans of several partial indexes if specified interval is 
> covered by their conditions. 
> 
>  I wonder if someone is familiar with this part of optimizer ad can easily 
> fix it. 
>  Otherwise I am going to spend some time on solving this problems (if 
> community think that such optimizations will be useful). 
> 
>  Replying to myself: the following small patch removes redundant checks from 
> index scans for derived tables: 
> 
>  diff --git a/src/backend/optimizer/util/plancat.c 
> b/src/backend/optimizer/util/plancat.c 
>  index 939045d..1f7c9cf 100644 
>  --- a/src/backend/optimizer/util/plancat.c 
>  +++ 

Re: [HACKERS] Secondary index access optimizations

2018-01-06 Thread Stephen Frost
Greetings,

* Konstantin Knizhnik (k.knizh...@postgrespro.ru) wrote:
> On 04.12.2017 19:44, Alvaro Herrera wrote:
> >Konstantin Knizhnik wrote:
> >>
> >>On 30.11.2017 05:16, Michael Paquier wrote:
> >>>On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik
> >>> wrote:
> Concerning broken partition_join test: it is "expected" failure: my patch
> removes from the plans redundant checks.
> So the only required action is to update expected file with results.
> Attached please find updated patch.
> >>>The last patch version has conflicts in regression tests and did not
> >>>get any reviews. Please provide a rebased version. I am moving the
> >>>patch to next CF with waiting on author as status. Thanks!
> >>Rebased patch is attached.
> >I don't think this is a rebase on the previously posted patch ... it's
> >about 10x as big and appears to be a thorough rewrite of the entire
> >optimizer.
> 
> Or, sorry. I really occasionally committed in this branch patch for
> aggregate push down.
> Correct reabased patch is attached.

This patch applies, builds and passes 'make check-world', with no real
review posted of it, so I don't believe it should be 'Waiting on Author'
but really in 'Needs Review' status, so I've gone ahead and updated the
CF with that status.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: [HACKERS] Secondary index access optimizations

2017-12-04 Thread Pantelis Theodosiou
On Tue, Sep 5, 2017 at 9:10 AM, Konstantin Knizhnik <
k.knizh...@postgrespro.ru> wrote:

>
>
> On 05.09.2017 04:02, Amit Langote wrote:
>
> Like Thomas, I'm not so sure about the whole predtest.c patch.  The core
> logic in operator_predicate_proof() should be able to conclude that, say,
> k < 21 is implied by k <= 20, which you are trying to address with some
> special case code.  If there is still problem you think need to be fixed
> here, a better place to look at would be somewhere around get_btree_test_op().
>
>
> Frankly speaking I also do not like this part of my patch.
> I will be pleased if you or somebody else can propose better solution.
> I do not understand how get_btree_test_op() can help here.
>
> Yes, k < 21 is implied by k <= 20. It is based on generic properties of <
> and  <= operators.
> But I need to proof something different: having table partition constraint
> (k < 21) I want to remove predicate (k <= 20) from query.
> In other words,  operator_predicate_proof() should be able to conclude
> that (k <= 20) is implied by (k < 21).
> But it is true only for integer types, not for floating point types. And
> Postgres operator definition
> doesn't provide some way to determine that user defined type is integer
> type: has integer values for which such conclusion is true.
>
> Why I think that it is important? Certainly, it is possible to rewrite
> query as (k < 21) and no changes in operator_predicate_proof() are needed.
> Assume the most natural use case: I have some positive integer key and I
> wan to range partition table by such key, for example with interval 1.
> Currently standard PostgreSQL partitioning mechanism requires to specify
> intervals with open high boundary.
> So if I want first partition to contain interval [1,1], second -
> [10001,20001],... I have to create partitions in such way:
>
> create table bt (k integer, v integer) partition by range (k);
> create table dt1 partition of bt for values from (1) to (10001);
> create table dt2 partition of bt for values from (10001) to (20001);
> ...
>
> If I want to write query inspecting data of the particular partition, then
> most likely I will use BETWEEN operator:
>
> SELECT * FROM t WHERE k BETWEEN 1 and 1;
>
> But right now operator_predicate_proof()  is not able to conclude that
> predicate (k BETWEEN 1 and 1) transformed to (k >= 1 AND k <= 1) is
> equivalent to (k >= 1 AND k < 10001)
> which is used as partition constraint.
>
> Another very popular use case (for example mentioned in PostgreSQL
> documentation of partitioning: https://www.postgresql.org/
> docs/10/static/ddl-partitioning.html)
> is using date as partition key:
>
> CREATE TABLE measurement (
> city_id int not null,
> logdate date not null,
> peaktempint,
> unitsales   int
> ) PARTITION BY RANGE (logdate);
>
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
> FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
>
>
> Assume that now I want to get measurements for March:
>
> There are three ways to write this query:
>
> select * from measurement where extract(month from logdate) = 3;
> select * from measurement where logdate between '2006-03-01' AND
> '2006-03-31';
> select * from measurement where logdate >= '2006-03-01' AND logdate  <
> '2006-04-01';
>
> Right now only for the last query optimal query plan will be constructed.
>

Perhaps the relative pages (about partitioning and optimization) should
mention to avoid BETWEEN and using closed-open checks, as the last query.

Dates are a perfect example to demonstrate that BETWEEN shouldn't be used,
in my opinion. Dates (and timestamps) are not like integers as they are
often used with different levels of precisions, day, month, year, hour,
minute, second, etc. (month in your example). Constructing the correct
expressions for the different precisions can be a nightmare with BETWEEN
but very simple with >= and < (in the example: get start_date,
'2006-03-01', and add one month).

So, just my 2c, is it worth the trouble to implement this feature
(conversion of (k<21) to (k<=20) and vice versa) and how much work would it
need for all data types that are commonly used for partitioning?



> Unfortunately my patch is not covering date type.
>
> --
> Konstantin Knizhnik
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: [HACKERS] Secondary index access optimizations

2017-12-04 Thread Konstantin Knizhnik



On 04.12.2017 19:44, Alvaro Herrera wrote:

Konstantin Knizhnik wrote:


On 30.11.2017 05:16, Michael Paquier wrote:

On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik
 wrote:

Concerning broken partition_join test: it is "expected" failure: my patch
removes from the plans redundant checks.
So the only required action is to update expected file with results.
Attached please find updated patch.

The last patch version has conflicts in regression tests and did not
get any reviews. Please provide a rebased version. I am moving the
patch to next CF with waiting on author as status. Thanks!

Rebased patch is attached.

I don't think this is a rebase on the previously posted patch ... it's
about 10x as big and appears to be a thorough rewrite of the entire
optimizer.



Or, sorry. I really occasionally committed in this branch patch for 
aggregate push down.

Correct reabased patch is attached.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index bce3348..6a7e7fb 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 1df1e3a..c421530 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 44f6b03..c7bf118 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -345,6 +345,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1130,6 +1131,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/* CE failed, so finish copying/modifying join quals. */
 		childrel->joininfo = (List *)
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index f743871..f763a97 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1466,6 +1466,51 @@ relation_excluded_by_constraints(PlannerInfo *root,
 	return false;
 }
 
+/*
+ * Remove from restrictions list items implied by table constraints
+ */
+void remove_restrictions_implied_by_constraints(PlannerInfo *root,
+RelOptInfo *rel, RangeTblEntry *rte)
+{
+	List	   *constraint_pred;
+	List	   *safe_constraints = NIL;
+	List	   *safe_restrictions = NIL;
+	ListCell   *lc;
+
+	if (rte->rtekind != RTE_RELATION || rte->inh)
+		return;
+
+	/*
+	 * OK to fetch the constraint 

Re: [HACKERS] Secondary index access optimizations

2017-12-04 Thread Alvaro Herrera
Konstantin Knizhnik wrote:
> 
> 
> On 30.11.2017 05:16, Michael Paquier wrote:
> > On Mon, Nov 6, 2017 at 10:13 PM, Konstantin Knizhnik
> >  wrote:
> > > Concerning broken partition_join test: it is "expected" failure: my patch
> > > removes from the plans redundant checks.
> > > So the only required action is to update expected file with results.
> > > Attached please find updated patch.
> > The last patch version has conflicts in regression tests and did not
> > get any reviews. Please provide a rebased version. I am moving the
> > patch to next CF with waiting on author as status. Thanks!
> 
> Rebased patch is attached.

I don't think this is a rebase on the previously posted patch ... it's
about 10x as big and appears to be a thorough rewrite of the entire
optimizer.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services