Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-13 Thread Konstantin Knizhnik



On 11.11.2017 23:29, Konstantin Knizhnik wrote:

On 10/27/2017 02:01 PM, Jeevan Chalke wrote:

Hi,

Attached new patch-set here. Changes include:

1. Added separate patch for costing Append node as discussed up-front in the
patch-set.
2. Since we now cost Append node, we don't need partition_wise_agg_cost_factor
GUC. So removed that. The remaining patch hence merged into main implementation
patch.
3. Updated rows in test-cases so that we will get partition-wise plans.

Thanks


I applied the partition-wise-agg-v6.tar.gz patch to the master and used the
shard.sh example from
https://www.postgresql.org/message-id/14577.1509723225%40localhost

The plan for count(*) is the following:

shard=# explain select count(*) from orders;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=100415.29..100415.30 rows=1 width=8)
   ->  Append  (cost=50207.63..100415.29 rows=2 width=8)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_0  (cost=101.00..50195.13 rows=5000 width=0)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_1  (cost=101.00..50195.13 rows=5000 width=0)



We really calculate a partial aggregate for each partition, but to do so we
still have to fetch all data from the remote host.
So for foreign partitions such a plan is absolutely inefficient.
Maybe it should be combined with some other patch?
For example, with the agg_pushdown_v4.tgz patch
https://www.postgresql.org/message-id/14577.1509723225%40localhost ?
But it does not apply on top of the partition-wise-agg-v6.tar.gz patch.
Also, postgres_fdw in 11dev is able to push down aggregates without the
agg_pushdown_v4.tgz patch.


In 0009-Teach-postgres_fdw-to-push-aggregates-for-child-rela.patch
there is the following check:

 /* Partial aggregates are not supported. */
+   if (extra->isPartial)
+   return;

If we just comment out this check, the produced plan will be the following:

shard=# explain select sum(product_id) from orders;
                           QUERY PLAN
----------------------------------------------------------------
 Finalize Aggregate  (cost=308.41..308.42 rows=1 width=8)
   ->  Append  (cost=144.18..308.41 rows=2 width=8)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_0 orders)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_1 orders)
(6 rows)

And this is actually the desired plan!
Obviously such an approach will not always work: the FDW really doesn't
support partial aggregates now.
But for the most frequently used aggregates (sum, min, max, count)
aggtype == aggtranstype, so there is no difference between partial and normal
aggregate calculation.
So instead of the (extra->isPartial) condition we can add a more complex check
which will traverse the pathtarget expressions and check whether they can be
evaluated this way. Or... extend the FDW API to support partial aggregation.


But even this last plan is not ideal: the remote nodes will be queried
sequentially, one after another.

There is the parallel append patch:
https://www.postgresql.org/message-id/CAJ3gD9ctEcrVUmpY6fq_JUB6WDKGXAGd70EY68jVFA4kxMbKeQ%40mail.gmail.com

but... the FDW doesn't support parallel scan, so parallel append can not be
applied in this case.
And we actually do not need parallel append with all its dynamic workers here.
We just need to start the commands at all remote servers and only after that
fetch the results (which can be done sequentially).


I am investigating the problem of efficient execution of OLAP queries on
sharded tables (tables with remote partitions).

After reading all these threads and the corresponding patches, it seems to me
that we already have most pieces of the puzzle; what we need is to put them in
the right places and maybe add the missing ones.

I wonder whether somebody is already busy with this, and whether I can somehow
help here?

Also, I am not quite sure about the best approach to parallel execution of a
distributed query at all nodes.
Should we make postgres_fdw parallel-safe and use parallel append? How
difficult would that be?
Or, in addition to parallel append, should we also have an "asynchronous
append" which would be able to initiate execution at all nodes?
It seems to be close to merge append, because it has to traverse all cursors
simultaneously.

The second approach looks easier to implement. But in the case of a sharded
table a distributed query may need to traverse both remote and local shards,
and this approach doesn't allow several local shards to be processed in
parallel.




I attach a small patch for postgres_fdw.c which allows concurrent execution
of aggregates by all remote servers (when they are accessed through
postgres_fdw).
I have added a "postgres_fdw.use_prefetch" GUC t

Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-11 Thread Konstantin Knizhnik

On 10/27/2017 02:01 PM, Jeevan Chalke wrote:

Hi,

Attached new patch-set here. Changes include:

1. Added separate patch for costing Append node as discussed up-front in the
patch-set.
2. Since we now cost Append node, we don't need partition_wise_agg_cost_factor
GUC. So removed that. The remaining patch hence merged into main implementation
patch.
3. Updated rows in test-cases so that we will get partition-wise plans.

Thanks


I applied the partition-wise-agg-v6.tar.gz patch to the master and used the shard.sh
example from https://www.postgresql.org/message-id/14577.1509723225%40localhost
The plan for count(*) is the following:

shard=# explain select count(*) from orders;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=100415.29..100415.30 rows=1 width=8)
   ->  Append  (cost=50207.63..100415.29 rows=2 width=8)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_0  (cost=101.00..50195.13 rows=5000 width=0)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_1  (cost=101.00..50195.13 rows=5000 width=0)


We really calculate a partial aggregate for each partition, but to do so we still
have to fetch all data from the remote host.
So for foreign partitions such a plan is absolutely inefficient.
Maybe it should be combined with some other patch?
For example, with the agg_pushdown_v4.tgz patch
https://www.postgresql.org/message-id/14577.1509723225%40localhost ?
But it does not apply on top of the partition-wise-agg-v6.tar.gz patch.
Also, postgres_fdw in 11dev is able to push down aggregates without the
agg_pushdown_v4.tgz patch.

In 0009-Teach-postgres_fdw-to-push-aggregates-for-child-rela.patch
there is the following check:

 /* Partial aggregates are not supported. */
+   if (extra->isPartial)
+   return;

If we just comment out this check, the produced plan will be the following:

shard=# explain select sum(product_id) from orders;
                           QUERY PLAN
----------------------------------------------------------------
 Finalize Aggregate  (cost=308.41..308.42 rows=1 width=8)
   ->  Append  (cost=144.18..308.41 rows=2 width=8)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_0 orders)
         ->  Foreign Scan  (cost=144.18..154.20 rows=1 width=8)
               Relations: Aggregate on (public.orders_1 orders)
(6 rows)

And this is actually the desired plan!
Obviously such an approach will not always work: the FDW really doesn't support
partial aggregates now.
But for the most frequently used aggregates (sum, min, max, count)
aggtype == aggtranstype, so there is no difference between partial and normal
aggregate calculation.
So instead of the (extra->isPartial) condition we can add a more complex check
which will traverse the pathtarget expressions and check whether they can be
evaluated this way. Or... extend the FDW API to support partial aggregation.
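
For illustration only (just a catalog query, not part of the patch), one can
see which of the common aggregates satisfy this aggtype == aggtranstype
condition:

-- Show, per aggregate, whether the result type equals the transition state
-- type, i.e. whether a remote "partial" aggregate returns exactly the same
-- value as a normal one.
select p.oid::regprocedure            as aggregate,
       p.prorettype::regtype          as aggtype,
       a.aggtranstype::regtype        as aggtranstype,
       p.prorettype = a.aggtranstype  as same_representation
from pg_aggregate a
join pg_proc p on p.oid = a.aggfnoid
where p.proname in ('count', 'sum', 'min', 'max')
order by p.proname;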

But even this last plan is not ideal: the remote nodes will be queried
sequentially, one after another.
There is the parallel append patch:
https://www.postgresql.org/message-id/CAJ3gD9ctEcrVUmpY6fq_JUB6WDKGXAGd70EY68jVFA4kxMbKeQ%40mail.gmail.com
but... the FDW doesn't support parallel scan, so parallel append can not be
applied in this case.
And we actually do not need parallel append with all its dynamic workers here.
We just need to start the commands at all remote servers and only after that
fetch the results (which can be done sequentially).

I am investigating the problem of efficient execution of OLAP queries on sharded
tables (tables with remote partitions).
After reading all these threads and the corresponding patches, it seems to me
that we already have most pieces of the puzzle; what we need is to put them in
the right places and maybe add the missing ones.
I wonder whether somebody is already busy with this, and whether I can somehow help here?

Also, I am not quite sure about the best approach to parallel execution of a
distributed query at all nodes.
Should we make postgres_fdw parallel-safe and use parallel append? How
difficult would that be?
Or, in addition to parallel append, should we also have an "asynchronous append"
which would be able to initiate execution at all nodes?
It seems to be close to merge append, because it has to traverse all cursors
simultaneously.

The second approach looks easier to implement. But in the case of a sharded
table a distributed query may need to traverse both remote and local shards,
and this approach doesn't allow several local shards to be processed in
parallel.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Aggregates push-down to partitions

2017-11-10 Thread Konstantin Knizhnik



On 10.11.2017 12:15, Ashutosh Bapat wrote:

Maybe the problem you described is solved in this thread[1] through
introducing a Parallel Append node?

1.
https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com

You may want to review [2] and [3] as well.

[2] https://www.postgresql.org/message-id/9666.1491295317@localhost
[3] 
https://www.postgresql.org/message-id/CAM2+6=V64_xhstVHie0Rz=kpeqnljmzt_e314p0jat_oj9m...@mail.gmail.com

Thank you very much for these references.
I applied the partition-wise-agg-v6 patches, and for partitioned tables it
works perfectly:


shard=# explain select count(*) from orders;
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=100415.29..100415.30 rows=1 width=8)
   ->  Append  (cost=50207.63..100415.29 rows=2 width=8)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_0  (cost=101.00..50195.13 rows=5000 width=0)
         ->  Partial Aggregate  (cost=50207.63..50207.64 rows=1 width=8)
               ->  Foreign Scan on orders_1  (cost=101.00..50195.13 rows=5000 width=0)
(6 rows)

But I wonder why the same optimization is not applied to a normal
inherited table:


shard=# explain select count(*) from base;
                                 QUERY PLAN
----------------------------------------------------------------------------------
 Aggregate  (cost=44087.99..44088.00 rows=1 width=8)
   ->  Append  (cost=0.00..39079.46 rows=2003414 width=0)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=0)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=0)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=0)
         ->  Foreign Scan on derived_fdw  (cost=100.00..212.39 rows=3413 width=0)
(6 rows)

Are there any fundamental problems with that?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Aggregates push-down to partitions

2017-11-09 Thread Konstantin Knizhnik

There is a huge thread concerning pushing down aggregates to the FDW:

https://www.postgresql.org/message-id/flat/CAFjFpRcnueviDpngJ3QSVvj7oyukr9NkSiCspqd4N%2BdCEdvYvg%40mail.gmail.com#cafjfprcnuevidpngj3qsvvj7oyukr9nksicspqd4n+dcedv...@mail.gmail.com

but as far as I understand, nothing is done yet for efficient calculation of
aggregates over a partitioned table.
In case of local partitions it is somehow compensated by the parallel query
plan:


postgres=# create table base(x integer);
CREATE TABLE
postgres=# create table derived1() inherits (base);
CREATE TABLE
postgres=# create table derived2() inherits (base);
CREATE TABLE
postgres=# insert into derived1  values (generate_series(1,100));
INSERT 0 100
postgres=# insert into derived2  values (generate_series(1,100));
INSERT 0 100
postgres=# explain select sum(x) from base;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=12176.63..12176.64 rows=1 width=8)
   ->  Gather  (cost=12176.59..12176.61 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=12175.59..12175.60 rows=1 width=8)
               ->  Append  (cost=0.00..11510.47 rows=266048 width=4)
                     ->  Parallel Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
                     ->  Parallel Seq Scan on derived1  (cost=0.00..5675.00 rows=125000 width=4)
                     ->  Parallel Seq Scan on derived2  (cost=0.00..5835.47 rows=141047 width=4)
(8 rows)

It is still far from the ideal plan, because each worker is working with all
partitions, instead of splitting the partitions between workers and calculating
partial aggregates for each partition.

But if we add a foreign table as a child of the parent table, then parallel
scan can not be used and we get the worst possible plan:


postgres=# create foreign table derived_fdw() inherits(base) server pg_fdw options (table_name 'derived1');
CREATE FOREIGN TABLE

postgres=# explain select sum(x) from base;
                                 QUERY PLAN
----------------------------------------------------------------------------------
 Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
   ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
         ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
(6 rows)

So we sequentially pull all the data to this node and compute the aggregates
locally.
The ideal plan would calculate partial aggregates at all nodes in parallel and
then combine the partial results.

It requires two changes:
1. Replace Aggregate->Append with
Finalize_Aggregate->Append->Partial_Aggregate.
2. Concurrent execution of Append. It can be done in two different ways: we can
try to use the existing parallel worker infrastructure and replace Append with
Gather. That seems to be the best approach for local partitioning. In case of
remote (FDW) partitions, it is enough to split starting the execution
(PQsendQuery in postgres_fdw) from getting the results. So it requires some
changes in the FDW protocol.



I wonder if somebody has already investigated this problem or is working in
this direction.
Maybe there are already some patches proposed?
I have searched the hackers archive, but didn't find anything relevant...
Are there any suggestions about the best approach to implement this feature?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] SQL procedures

2017-11-08 Thread Konstantin Knizhnik



On 08.11.2017 17:23, Merlin Moncure wrote:

On Tue, Oct 31, 2017 at 12:23 PM, Peter Eisentraut
<peter.eisentr...@2ndquadrant.com> wrote:

- Transaction control in procedure bodies

This feature is really key, since it enables via SQL lots of things
that are not possible without external coding, including:
*) very long running processes in a single routine
*) transaction isolation control inside the procedure (currently
client app has to declare this)
*) certain error handling cases that require client side support
*) simple in-database threading
*) simple construction of daemon scripts (yeah, you can use bgworker
for this, but pure sql daemon with a cron heartbeat hook is hard to
beat for simplicity)

I do wonder how transaction control could be added later.

The last time I (lightly) looked at this, I was starting to think that
working transaction control into the SPI interface was the wrong
approach; pl/pgsql would have to adopt a very different set of
behaviors if it was called in a function or a proc.  If you restricted
language choice to purely SQL, you could work around this problem; SPI
languages would be totally abstracted from those sets of
considerations and you could always call an arbitrary language
function if you needed to.  SQL has no flow control but I'm not too
concerned about that.

merlin


I am also very interested in the answer to this question: how are you going
to implement transaction control inside a procedure?
Right now PostgresPro EE supports autonomous transactions. They are supported
both for SQL and for the plpgsql/plpython APIs.
They are implemented by saving/restoring the transaction context, so unlike
most other ATX implementations, in pgpro an autonomous transaction is executed
by the same backend. But it is not so easy to do: in Postgres almost every
module has its own static variables which keep transaction-specific data.
So we had to provide a dozen suspend/resume functions:
SuspendSnapshot(), SuspendPredicate(), SuspendStorage(),
SuspendInvalidationInfo(), SuspendPgXact(), PgStatSuspend(),
TriggerSuspend(), SuspendSPI()... and properly handle local cache
invalidation. The patch consists of more than 5 thousand lines.


So my question is whether you are going to implement something similar or use
a completely different approach?
In the first case it would be good to somehow unite our efforts... For example,
we could publish our ATX patch for Postgres 10.
We have not done it yet, because there seem to be no chances to push this
patch into the community.









--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Secondary index access optimizations

2017-11-06 Thread Konstantin Knizhnik

On 11/06/2017 04:27 AM, Thomas Munro wrote:

On Fri, Sep 8, 2017 at 3:58 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Updated version of the patch is attached to this mail.
Also I added support of date type to operator_predicate_proof to be able to
imply (logdate <= '2017-03-31') from (logdate < '2017-04-01') .

Hi Konstantin,

Is there any reason why you don't want to split this into two separate
proposals?  One for remove_restrictions_implied_by_constraints() and
one for the operator_predicate_proof() changes.

Your v3 patch breaks the new partition_join test (the recently
committed partition-wise join stuff), as far as I can tell in a good
way.  Can you please double check those changes and post an updated
patch?


Hi Thomas.

The primary idea of this patch was to provide more efficient plans for queries
on partitioned tables.
So remove_restrictions_implied_by_constraints() removes redundant predicate
checks.
But it doesn't work for standard Postgres 10 partitioning, because there the
constraints are set using intervals with an open high boundary, and the
original version of operator_predicate_proof() is not able to handle this case.

I have explained this problem in my previous e-mails in this thread.
This is why I have changed operator_predicate_proof() to handle this case
correctly.
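
To make it concrete, here is the kind of case I mean (a made-up illustration,
not taken from the regression tests):

create table log(logdate date, payload text) partition by range (logdate);
create table log_2017_03 partition of log
    for values from ('2017-03-01') to ('2017-04-01');
-- implicit partition constraint:
--   logdate >= '2017-03-01' and logdate < '2017-04-01'

explain select * from log where logdate <= '2017-03-31';
-- The qual is implied by the constraint (there is no date between 2017-03-31
-- and 2017-04-01), so the per-row check on log_2017_03 is redundant and
-- remove_restrictions_implied_by_constraints() can drop it from the plan,
-- provided operator_predicate_proof() can prove the implication for dates.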

If you think this patch should be split into two, OK: I can do it.
I just want to note that without patching operator_predicate_proof() it may
give no positive effect for standard partitioning, which I expect to be the
most popular use case where this optimization can have an effect.


Concerning broken partition_join test: it is "expected" failure: my patch 
removes from the plans redundant checks.
So the only required action is to update expected file with results.
Attached please find updated patch.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 4339bbf..0931af1 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index ddfec79..878bfc7 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index a5c1b68..082d1cc 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -346,6 +346,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+				remove_restrictions_implied_by_constraints(root, rel, rte);
 				if (rte->relkind == RELKIND_FOREIGN_TABLE)
 				{
 

[HACKERS] Deadlock in ALTER SUBSCRIPTION REFRESH PUBLICATION

2017-10-24 Thread Konstantin Knizhnik
CRIPTION 
sub_3_1 REFRESH PUBLICATION",
context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0, dest=0x24ddcb8, 
completionTag=0x73a43c90 "") at src/hooks.c:913
#14 0x008a80a2 in ProcessUtility (pstmt=0x24dd998, queryString=0x24dc920 
"SET SESSION synchronous_commit TO local; ALTER SUBSCRIPTION sub_3_1 REFRESH 
PUBLICATION",
context=PROCESS_UTILITY_QUERY, params=0x0, queryEnv=0x0, dest=0x24ddcb8, 
completionTag=0x73a43c90 "") at utility.c:353
#15 0x008a7075 in PortalRunUtility (portal=0x246be60, pstmt=0x24dd998, 
isTopLevel=0 '\000', setHoldSnapshot=0 '\000', dest=0x24ddcb8, 
completionTag=0x73a43c90 "") at pquery.c:1178
#16 0x008a728d in PortalRunMulti (portal=0x246be60, isTopLevel=0 '\000', 
setHoldSnapshot=0 '\000', dest=0x24ddcb8, altdest=0x24ddcb8, completionTag=0x73a43c90 
"") at pquery.c:1324
#17 0x008a6757 in PortalRun (portal=0x246be60, count=9223372036854775807, 
isTopLevel=0 '\000', run_once=1 '\001', dest=0x24ddcb8, altdest=0x24ddcb8, 
completionTag=0x73a43c90 "")
at pquery.c:799
#18 0x008a0288 in exec_simple_query (query_string=0x24dc920 "SET SESSION 
synchronous_commit TO local; ALTER SUBSCRIPTION sub_3_1 REFRESH PUBLICATION") at 
postgres.c:1099
#19 0x008a4823 in PostgresMain (argc=1, argv=0x247c2d0, dbname=0x247c2a8 
"postgres", username=0x244f870 "knizhnik") at postgres.c:4090
#20 0x00801753 in BackendRun (port=0x240) at postmaster.c:4357
#21 0x00800e5f in BackendStartup (port=0x240) at postmaster.c:4029
#22 0x007fd398 in ServerLoop () at postmaster.c:1753
#23 0x007fc92f in PostmasterMain (argc=3, argv=0x244d6e0) at 
postmaster.c:1361
#24 0x00734f08 in main (argc=3, argv=0x244d6e0) at main.c:228



The reason for this deadlock seems to be clear: ALTER SUBSCRIPTION starts a
transaction at one node and tries to create a slot at the other node, which
waits for completion of all active transactions while building a snapshot.
Is there any way to avoid this deadlock?

The same deadlock can happen in a concurrent CREATE SUBSCRIPTION with implicit
slot creation. But there it can be solved by explicit slot creation prior to
CREATE SUBSCRIPTION.
But what can I do if I want to update publications and refresh subscriptions?

Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Slow synchronous logical replication

2017-10-12 Thread Konstantin Knizhnik



On 12.10.2017 04:23, Craig Ringer wrote:

On 12 October 2017 at 00:57, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:


The reason for such behavior is obvious: the wal sender has to decode the huge
transaction generated by the insert although it has no relation to this
publication.

It does. Though I wouldn't expect anywhere near the kind of drop you
report, and haven't observed it here.

Is the CREATE TABLE and INSERT done in the same transaction?


No, the table was created in a separate transaction.
Moreover, the same effect takes place if the table is created before the start
of replication.
The problem in this case seems to be caused by spilling the decoded
transaction to a file by ReorderBufferSerializeTXN.

Please look at two profiles:
http://garret.ru/lr1.svg corresponds to normal work of pgbench with
synchronous replication to one replica;
http://garret.ru/lr2.svg is the same with concurrent execution of the huge
insert statement.


And here is the output of pgbench (the insert is started at the fifth second):

progress: 1.0 s, 10020.9 tps, lat 0.791 ms stddev 0.232
progress: 2.0 s, 10184.1 tps, lat 0.786 ms stddev 0.192
progress: 3.0 s, 10058.8 tps, lat 0.795 ms stddev 0.301
progress: 4.0 s, 10230.3 tps, lat 0.782 ms stddev 0.194
progress: 5.0 s, 10335.0 tps, lat 0.774 ms stddev 0.192
progress: 6.0 s, 4535.7 tps, lat 1.591 ms stddev 9.370
progress: 7.0 s, 419.6 tps, lat 20.897 ms stddev 55.338
progress: 8.0 s, 105.1 tps, lat 56.140 ms stddev 76.309
progress: 9.0 s, 9.0 tps, lat 504.104 ms stddev 52.964
progress: 10.0 s, 14.0 tps, lat 797.535 ms stddev 156.082
progress: 11.0 s, 14.0 tps, lat 601.865 ms stddev 93.598
progress: 12.0 s, 11.0 tps, lat 658.276 ms stddev 138.503
progress: 13.0 s, 9.0 tps, lat 784.120 ms stddev 127.206
progress: 14.0 s, 7.0 tps, lat 870.944 ms stddev 156.377
progress: 15.0 s, 8.0 tps, lat .578 ms stddev 140.987
progress: 16.0 s, 7.0 tps, lat 1258.750 ms stddev 75.677
progress: 17.0 s, 6.0 tps, lat 991.023 ms stddev 229.058
progress: 18.0 s, 5.0 tps, lat 1063.986 ms stddev 269.361

It seems to be an effect of large transactions.
The presence of several channels of synchronous logical replication reduces
performance, but not by that much.

Below are results on another machine with pgbench scale 10.

Configuration                TPS
standalone                   15k
1 async logical replica      13k
1 sync logical replica       10k
3 async logical replicas     13k
3 sync logical replicas       8k





Only partly true. The output plugin can register a transaction origin
filter and use that to say it's entirely uninterested in a
transaction. But this only works based on filtering by origins. Not
tables.
Yes, I know about the origin filtering mechanism (and we are using it in
multimaster).
But I am speaking about the standard pgoutput.c output plugin: its
pgoutput_origin_filter always returns false.




I imagine we could call another hook in output plugins, "do you care
about this table", and use it to skip some more work for tuples that
particular decoding session isn't interested in. Skip adding them to
the reorder buffer, etc. No such hook currently exists, but it'd be an
interesting patch for Pg11 if you feel like working on it.


Unfortunately it is not quite clear how to make wal-sender smarter and let
him skip transaction not affecting its publication.

As noted, it already can do so by origin. Mostly. We cannot totally
skip over WAL, since we need to process various invalidations etc. See
ReorderBufferSkip.
The problem is that before the end of a transaction we do not know whether it
touches this publication or not.
So filtering by origin will not work in this case.

I am really not sure that it is possible to skip over WAL. But the particular
problem with invalidation records etc. can be solved by always processing such
records in the WAL sender.
I.e. if a backend is inserting an invalidation record, or some other record
which should always be processed by the WAL sender, it can always promote the
LSN of this record to the WAL sender.
So the WAL sender will skip only those WAL records which are safe to skip
(insert/update/delete records not affecting this publication).

I wonder if there can be some other problems with skipping part of a
transaction in the WAL sender.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Slow synchronous logical replication

2017-10-11 Thread Konstantin Knizhnik



On 11.10.2017 10:07, Craig Ringer wrote:

On 9 October 2017 at 15:37, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Thank you for explanations.

On 08.10.2017 16:00, Craig Ringer wrote:

I think it'd be helpful if you provided reproduction instructions,
test programs, etc, making it very clear when things are / aren't
related to your changes.


It will not be so easy to provide a reproducing scenario, because it
actually involves many components (postgres_fdw, pg_pathman,
pg_shardman, LR, ...)

So simplify it to a test case that doesn't.

The simplest reproducing scenario is the following:
1. Start two Postgres instances: synchronous_commit=on, fsync=off.
2. Initialize the pgbench database at both instances: pgbench -i
3. Create a publication for the pgbench_accounts table at one node.
4. Create the corresponding subscription at the other node with the
copy_data=false parameter.
5. Add the subscription to synchronous_standby_names at the first node
(steps 3-5 are spelled out as SQL after this list).
6. Start pgbench -c 8 -N -T 100 -P 1 at the first node. On my system the
results are the following:

standalone postgres: 8600 TPS
asynchronous replication: 6600 TPS
synchronous replication:   5600 TPS
Quite good results.
7. Create some dummy table and perform a bulk insert into it:
create table dummy(x integer primary key);
insert into dummy values (generate_series(1,1000));

pgbench almost gets stuck: until the end of the insert, performance drops
almost to zero.
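
For reference, steps 3-5 expressed as SQL (host name and subscription name are
illustrative):

-- at the first (publisher) node:
create publication pub for table pgbench_accounts;

-- at the second (subscriber) node:
create subscription sub
       connection 'host=node1 port=5432 dbname=postgres'
       publication pub with (copy_data = false);

-- back at the first node (the subscription name is used as the standby name):
alter system set synchronous_standby_names = 'sub';
select pg_reload_conf();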


The reason for such behavior is obvious: the wal sender has to decode the huge
transaction generated by the insert although it has no relation to this
publication.
Filtering of the insert records of this transaction is done only inside the
output plug-in.
Unfortunately it is not quite clear how to make the wal sender smarter and let
it skip transactions not affecting its publication.
One of the possible solutions is to let the backend inform the wal sender
about the smallest LSN it should wait for (the backend knows which table is
affected by the current operation, hence which publications are interested in
this operation, and so it can point the wal sender to the proper LSN without
decoding a huge part of WAL).

But it seems to be not so easy to implement.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Columnar storage support

2017-10-10 Thread Konstantin Knizhnik
Unfortunately, C-Store doesn't allow you to take full advantage of a columnar
store: you are still not able to perform vector operations.

C-Store allows you to reduce the amount of data read from the disk because of:
1. fetching only the columns which are used in the query,
2. data compression.

This leads to some benefit in query execution speed for cold data (when it is
not loaded in the cache).
For warm data there is almost no difference (except for very huge tables which
can not fit in memory).


But the main advantage of the vertical data format - vector data processing -
is possible only with a specialized executor.

There is a prototype of a vector executor for C-Store:
https://github.com/citusdata/postgres_vectorization_test
It provides a 3-4x speedup on some queries, but it is really a prototype and a
research project, far from practical usage.


I have also developed two columnar store extensions for Postgres:
IMCS (In-Memory Columnar Store): https://github.com/knizhnik/imcs.git
VOPS (Vectorized Operations): https://github.com/postgrespro/vops.git

The first one is more oriented towards in-memory databases (although it
supports spilling data to disk) and requires the use of special functions to
manipulate columnar data.
In this case the columnar store is a copy of the main (horizontal) store
(normal Postgres tables).

VOPS is more recent work, allowing the use of more or less normal SQL (using a
foreign data wrapper and user-defined types/operators).
In VOPS, data is stored inside normal Postgres tables, but using vectors
(tiles) instead of scalars.


Both IMCS and VOPS provide a 10-100 times speed improvement on queries like Q1
in TPC-H (sequential scan with filtering and aggregation).
On queries involving joins there is almost no benefit compared with normal
Postgres.

There is also a columnar storage extension developed by Fujitsu:
https://www.postgresql.org/message-id/cajrrpgfac7wc9nk6ptty6yn-nn+hcy8xolah2doyhvg5d6h...@mail.gmail.com
But the published patch is only a first step in this direction, and it is
possible neither to use it in practice nor to perform experiments measuring
the possible performance improvement.



On 09.10.2017 23:06, Joshua D. Drake wrote:

On 10/09/2017 01:03 PM, legrand legrand wrote:
Is there a chance that pluggable storage permits the creation of a columnar
RDBMS, like MonetDB, in PostgreSQL?
Thanks in advance for the answer


The extension C-Store from Citus is probably what you are looking for.

jD













--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Slow synchronous logical replication

2017-10-09 Thread Konstantin Knizhnik

Thank you for the explanations.

On 08.10.2017 16:00, Craig Ringer wrote:

I think it'd be helpful if you provided reproduction instructions,
test programs, etc, making it very clear when things are / aren't
related to your changes.


It will not be so easy to provide a reproducing scenario, because it
actually involves many components (postgres_fdw, pg_pathman,
pg_shardman, LR, ...)
and requires a multi-node installation.
But let me try to explain what is going on.
We have implemented sharding - splitting data between several remote tables
using pg_pathman and postgres_fdw.
It means that an insert or update of the parent table causes inserts or
updates of some derived partitions, which are forwarded by postgres_fdw to the
corresponding nodes.
The number of shards is significantly larger than the number of nodes, i.e.
for 5 nodes we have 50 shards, which means that at each node we have 10 shards.
To provide fault tolerance, each shard is replicated using logical replication
to one or more nodes. Right now we consider only redundancy level 1 - each
shard has only one replica.

So from each node we establish 10 logical replication channels.

We want commits to wait until the data is actually stored at all replicas, so
we are using synchronous replication: we set the synchronous_commit option to
"on" and include all ten subscriptions in the synchronous_standby_names list.
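
Roughly, the relevant settings at each node look like this (subscription names
are illustrative; the FIRST N / ANY N syntax can additionally control how many
of them have to acknowledge a commit):

alter system set synchronous_commit = on;
alter system set synchronous_standby_names = 'sub_shard_1, sub_shard_2, sub_shard_3, sub_shard_4, sub_shard_5, sub_shard_6, sub_shard_7, sub_shard_8, sub_shard_9, sub_shard_10';
select pg_reload_conf();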


In this setup the commit latency is very large (about 100 msec, and most of
the time is actually spent in commit) and performance is very bad - pgbench
shows about 300 TPS for the optimal number of clients (about 10; for larger
numbers performance is almost the same). Without logical replication, with the
same setup, we get about 6000 TPS.


I have checked the syncrepl.c file, particularly the SyncRepGetSyncRecPtr
function. Each WAL sender independently calculates the minimal LSN among all
synchronous replicas and wakes up the backends waiting for this LSN. It means
that a transaction performing an update of data in one shard will actually
wait for confirmation from the replication channels of all shards.
If some shard is updated more rarely than others, or is not updated at all
(for example because the communication channel to that node is broken), then
all backends will be stuck.
Also, all backends are competing for the single SyncRepLock, which can also be
a contention point.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Slow synchronous logical replication

2017-10-07 Thread Konstantin Knizhnik

On 10/07/2017 10:42 PM, Andres Freund wrote:

Hi,

On 2017-10-07 22:39:09 +0300, konstantin knizhnik wrote:

In our sharded cluster project we are trying to use logical replication for
providing HA (maintaining redundant shard copies).
Using asynchronous logical replication does not make much sense in the context
of HA. This is why we try to use synchronous logical replication.
Unfortunately it shows very bad performance. With 50 shards and redundancy
level 1 (just one copy) the cluster is 20 times slower than without logical
replication.
With asynchronous replication it is "only" two times slower.

As far as I understand, the reason for such bad performance is that the
synchronous replication mechanism was originally developed for streaming
replication, where all replicas have the same content and LSNs. When it is
used for logical replication, it behaves very inefficiently. A commit has to
wait for confirmations from all receivers mentioned in the
"synchronous_standby_names" list. So we are waiting not only for our own
single logical replication standby, but for all other standbys as well. The
number of synchronous standbys is equal to the number of shards divided by the
number of nodes. To provide uniform distribution the number of shards should
be >> the number of nodes; for example, for 10 nodes we usually create 100
shards. As a result we get awful performance, and blocking of any replication
channel blocks all backends.

So my question is whether my understanding is correct and synchronous logical
replication can not be used efficiently in such a manner.
If so, the next question is how difficult it would be to make the synchronous
replication mechanism more efficient for logical replication, and whether
there are some plans to work in this direction.

This seems to be a question that is a) about a commercial project we
don't know much about b) hasn't received a lot of investigation.


Sorry if I was not clear.
The question was about the logical replication mechanism in the mainstream
version of Postgres.
I think that most people are using asynchronous logical replication, and
synchronous LR is something exotic, not well tested and investigated.
It will be great if I am wrong :)

Concerning our sharded cluster (pg_shardman): it is not a commercial product
yet, it is in the development phase.
We are going to open its sources when it is more or less stable.
But unlike multimaster, this sharded cluster is mostly built from existing
components: pg_pathman + postgres_fdw + logical replication.
So we are just trying to combine them all into some integrated system.
But currently the most obscure point is logical replication.

And the main goal of my e-mail was to learn the opinion of the authors and
users of LR on whether it is a good idea to use LR to provide fault tolerance
in a sharded cluster,
or whether some other approaches, for example sharding with redundancy or
using streaming replication, are preferable.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Slow synchronous logical replication

2017-10-07 Thread konstantin knizhnik
In our sharded cluster project we are trying to use logical replication for
providing HA (maintaining redundant shard copies).
Using asynchronous logical replication does not make much sense in the context
of HA. This is why we try to use synchronous logical replication.
Unfortunately it shows very bad performance. With 50 shards and redundancy
level 1 (just one copy) the cluster is 20 times slower than without logical
replication.
With asynchronous replication it is "only" two times slower.

As far as I understand, the reason for such bad performance is that the
synchronous replication mechanism was originally developed for streaming
replication, where all replicas have the same content and LSNs. When it is
used for logical replication, it behaves very inefficiently. A commit has to
wait for confirmations from all receivers mentioned in the
"synchronous_standby_names" list. So we are waiting not only for our own
single logical replication standby, but for all other standbys as well. The
number of synchronous standbys is equal to the number of shards divided by the
number of nodes. To provide uniform distribution the number of shards should
be >> the number of nodes; for example, for 10 nodes we usually create 100
shards (i.e. 100 / 10 = 10 synchronous standbys listed at each node). As a
result we get awful performance, and blocking of any replication channel
blocks all backends.

So my question is whether my understanding is correct and synchronous logical
replication can not be used efficiently in such a manner.
If so, the next question is how difficult it would be to make the synchronous
replication mechanism more efficient for logical replication, and whether
there are some plans to work in this direction.



Re: [HACKERS] Issue with logical replication: MyPgXact->xmin already is valid

2017-10-07 Thread Konstantin Knizhnik

On 10/07/2017 04:26 PM, Petr Jelinek wrote:


Hmm so you start transaction (you have to when running
CREATE_REPLICATION_SLOT with USE_SNAPSHOT parameter). And while the slot
is being created the config is reloaded. And since now you are in
transaction the tsearch hook for GUC processing tries to access catalogs
which sets the xmin for the transaction.


Actually this slot is implicitly created by LogicalRepSyncTableStart to perform 
initial data sync.



That's not good, but I can't really say I have idea about what to do
with it other than to set some kind of flag saying that logical decoding
snapshot is being built and using that to skip catalog access which does
not seem very pretty.


It is not quite clear from the comment why it is not possible to overwrite
MyPgXact->xmin or just use the existing value.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Issue with logical replication: MyPgXact->xmin already is valid

2017-10-06 Thread Konstantin Knizhnik



On 06.10.2017 15:29, Petr Jelinek wrote:

On 06/10/17 12:16, Konstantin Knizhnik wrote:

When creating logical replication slots we quite often get the following
error:

ERROR:  cannot build an initial slot snapshot when MyPgXact->xmin
already is valid

which cause restart of WAL sender.
The comment to this line doesn't clarify much:

 /* so we don't overwrite the existing value */
 if (TransactionIdIsValid(MyPgXact->xmin))
 elog(ERROR, "cannot build an initial slot snapshot when
MyPgXact->xmin already is valid");
I wonder if it is normal situation or something goes wrong?


Hi,

no it's not normal situation, it seems you are doing something that
assigns xid before you run the CREATE_REPLICATION_SLOT command on that
connection.


I am not doing anything special in this connection: it is the WAL sender
executing the CREATE_REPLICATION_SLOT replication command.
Please look at the stack trace in my original e-mail. It shows who is setting
MyPgXact->xmin and when.

It is GetSnapshotData, called because of reloading of configuration settings.
And the configuration settings are reloaded because our application is
updating "synchronous_standby_names".



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Issue with logical replication: MyPgXact->xmin already is valid

2017-10-06 Thread Konstantin Knizhnik
When creating logical replication slots we quite often get the following 
error:


ERROR:  cannot build an initial slot snapshot when MyPgXact->xmin 
already is valid


which causes a restart of the WAL sender.
The comment on this line doesn't clarify much:

/* so we don't overwrite the existing value */
if (TransactionIdIsValid(MyPgXact->xmin))
elog(ERROR, "cannot build an initial slot snapshot when 
MyPgXact->xmin already is valid");



I have checked that MyPgXact->xmin is set by GetSnapshotData:

#3  0x0086025e in GetSnapshotData (snapshot=0xf28040 
) at procarray.c:1714
#4  0x00a48523 in GetNonHistoricCatalogSnapshot (relid=2615) at 
snapmgr.c:479

#5  0x00a484d3 in GetCatalogSnapshot (relid=2615) at snapmgr.c:452
#6  0x004f15bf in systable_beginscan (heapRelation=0x256bdb0, 
indexId=2684, indexOK=1 '\001', snapshot=0x0, nkeys=1, 
key=0x7ffdf633c770) at genam.c:353
#7  0x009d676e in SearchCatCache (cache=0x25230a8, v1=39586984, 
v2=0, v3=0, v4=0) at catcache.c:1169
#8  0x009ec13d in SearchSysCache (cacheId=35, key1=39586984, 
key2=0, key3=0, key4=0) at syscache.c:1109
#9  0x009ec259 in GetSysCacheOid (cacheId=35, key1=39586984, 
key2=0, key3=0, key4=0) at syscache.c:1187
#10 0x00574b85 in get_namespace_oid (nspname=0x25c0ca8 
"pg_catalog", missing_ok=1 '\001') at namespace.c:3009
#11 0x00574886 in LookupExplicitNamespace (nspname=0x25c0ca8 
"pg_catalog", missing_ok=1 '\001') at namespace.c:2871
#12 0x0057437d in get_ts_config_oid (names=0x25c2438, 
missing_ok=1 '\001') at namespace.c:2653
#13 0x009f56ca in check_TSCurrentConfig (newval=0x7ffdf633cb90, 
extra=0x7ffdf633cba0, source=PGC_S_FILE) at ts_cache.c:600
#14 0x00a1bbdc in call_string_check_hook (conf=0xf26870 
<ConfigureNamesString+6992>, newval=0x7ffdf633cb90, 
extra=0x7ffdf633cba0, source=PGC_S_FILE, elevel=12) at guc.c:9912
#15 0x00a14420 in parse_and_validate_value (record=0xf26870 
<ConfigureNamesString+6992>, name=0x25c0620 
"default_text_search_config", value=0x25c0658 "pg_catalog.english", 
source=PGC_S_FILE, elevel=12,

newval=0x7ffdf633cb90, newextra=0x7ffdf633cba0) at guc.c:5840
#16 0x00a15543 in set_config_option (name=0x25c0620 
"default_text_search_config", value=0x25c0658 "pg_catalog.english", 
context=PGC_SIGHUP, source=PGC_S_FILE, action=GUC_ACTION_SET, 
changeVal=1 '\001', elevel=12,

is_reload=0 '\000') at guc.c:6442
#17 0x00a1eb5c in ProcessConfigFileInternal (context=PGC_SIGHUP, 
applySettings=1 '\001', elevel=13) at guc-file.l:439
#18 0x00a1e4ee in ProcessConfigFile (context=PGC_SIGHUP) at 
guc-file.l:155
#19 0x0082433c in WalSndWaitForWal (loc=25991904) at 
walsender.c:1309
#20 0x00823423 in logical_read_xlog_page (state=0x25a4f10, 
targetPagePtr=25985024, reqLen=6880, targetRecPtr=25991880, 
cur_page=0x25a6c60 "\227\320\005", pageTLI=0x25a57bc) at walsender.c:761
#21 0x00558c3d in ReadPageInternal (state=0x25a4f10, 
pageptr=25985024, reqLen=6880) at xlogreader.c:556
#22 0x00558405 in XLogReadRecord (state=0x25a4f10, 
RecPtr=25991880, errormsg=0x7ffdf633cea8) at xlogreader.c:255
#23 0x0080dda6 in DecodingContextFindStartpoint (ctx=0x25a4c50) 
at logical.c:450
#24 0x00823a0c in CreateReplicationSlot (cmd=0x24dc398) at 
walsender.c:934
#25 0x008247e4 in exec_replication_command (cmd_string=0x254e8f0 
"CREATE_REPLICATION_SLOT \"shardman_copy_t_10_3_4_17307_sync_17302\" 
TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at walsender.c:1515
#26 0x0088eccc in PostgresMain (argc=1, argv=0x24f0b28, 
dbname=0x24f0948 "postgres", username=0x24bf7f0 "knizhnik") at 
postgres.c:4086

#27 0x007ef9e2 in BackendRun (port=0x24dee00) at postmaster.c:4357
#28 0x007ef10c in BackendStartup (port=0x24dee00) at 
postmaster.c:4029

#29 0x007eb6cc in ServerLoop () at postmaster.c:1753
#30 0x007eacb8 in PostmasterMain (argc=3, argv=0x24bd660) at 
postmaster.c:1361

#31 0x00728593 in main (argc=3, argv=0x24bd660) at main.c:228


I wonder whether this is a normal situation or whether something is going wrong?





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Issues with logical replication

2017-10-02 Thread Konstantin Knizhnik

I have faced two issues with logical replication.
Repro scenario:

1. Start two Postgres instances (I start both on the local machine).
2. Initialize the pgbench tables at both instances:
   pgbench -i -s 10 postgres
3. Create a publication for the pgbench_accounts table at one node:
   create publication pub for table pgbench_accounts;
4. Create a subscription at the other node:
   delete from pgbench_accounts;
   CREATE SUBSCRIPTION sub connection 'dbname=postgres host=localhost port=5432 sslmode=disable' publication pub;
   CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
   BEGIN
   --  RETURN NEW;
   END $$ LANGUAGE plpgsql;
   CREATE TRIGGER replication_trigger BEFORE INSERT OR UPDATE OR DELETE
   ON pgbench_accounts FOR EACH ROW EXECUTE PROCEDURE replication_trigger_proc();
   ALTER TABLE pgbench_accounts ENABLE ALWAYS TRIGGER replication_trigger;
5. Start pgbench at the primary node:
   pgbench -T 1000 -P 2 -c 10 postgres


Please notice the commented-out "RETURN NEW" statement. Invocation of this
function causes an error, and in the log we see repeated messages:


2017-10-02 16:47:46.764 MSK [32129] LOG:  logical replication table 
synchronization worker for subscription "sub", table "pgbench_accounts" 
has started
2017-10-02 16:47:46.771 MSK [32129] ERROR:  control reached end of 
trigger procedure without RETURN
2017-10-02 16:47:46.771 MSK [32129] CONTEXT:  PL/pgSQL function 
replication_trigger_proc()

COPY pgbench_accounts, line 1: "110 "
2017-10-02 16:47:46.772 MSK [28020] LOG:  worker process: logical 
replication worker for subscription 17264 sync 17251 (PID 32129) exited 
with exit code 1

...


After a few minutes of work the primary node (the one with the publication)
crashes with the following stack trace:


Program terminated with signal SIGABRT, Aborted.
#0  0x7f3608f8ec37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

56../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x7f3608f8ec37 in __GI_raise (sig=sig@entry=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:56

#1  0x7f3608f92028 in __GI_abort () at abort.c:89
#2  0x009f5740 in ExceptionalCondition (conditionName=0xbf6b30 
"!(((xid) != ((TransactionId) 0)))",
errorType=0xbf69af "FailedAssertion", fileName=0xbf69a8 "lmgr.c", 
lineNumber=582) at assert.c:54
#3  0x0086ac1d in XactLockTableWait (xid=0, rel=0x0, ctid=0x0, 
oper=XLTW_None) at lmgr.c:582
#4  0x0081c9f7 in SnapBuildWaitSnapshot (running=0x2749198, 
cutoff=898498) at snapbuild.c:1400
#5  0x0081c7c7 in SnapBuildFindSnapshot (builder=0x2807910, 
lsn=798477760, running=0x2749198) at snapbuild.c:1311
#6  0x0081c339 in SnapBuildProcessRunningXacts 
(builder=0x2807910, lsn=798477760, running=0x2749198) at snapbuild.c:1106
#7  0x0080a1c7 in DecodeStandbyOp (ctx=0x27ef870, 
buf=0x7ffd301858d0) at decode.c:314
#8  0x00809ce1 in LogicalDecodingProcessRecord (ctx=0x27ef870, 
record=0x27efb30) at decode.c:117
#9  0x0080ddf9 in DecodingContextFindStartpoint (ctx=0x27ef870) 
at logical.c:458
#10 0x00823968 in CreateReplicationSlot (cmd=0x27483a8) at 
walsender.c:934

#11 0x008246ee in exec_replication_command (
cmd_string=0x27b9520 "CREATE_REPLICATION_SLOT 
\"sub_17264_sync_17251\" TEMPORARY LOGICAL pgoutput USE_SNAPSHOT") at 
walsender.c:1511
#12 0x0088eb44 in PostgresMain (argc=1, argv=0x275b738, 
dbname=0x275b578 "postgres", username=0x272b800 "knizhnik")

at postgres.c:4086
#13 0x007ef9a1 in BackendRun (port=0x27532a0) at postmaster.c:4357
#14 0x007ef0cb in BackendStartup (port=0x27532a0) at 
postmaster.c:4029

#15 0x007eb68b in ServerLoop () at postmaster.c:1753
#16 0x007eac77 in PostmasterMain (argc=3, argv=0x2729670) at 
postmaster.c:1361

#17 0x00728552 in main (argc=3, argv=0x2729670) at main.c:228


Now fix the trigger function:
CREATE OR REPLACE FUNCTION replication_trigger_proc() RETURNS TRIGGER AS $$
BEGIN
  RETURN NEW;
END $$ LANGUAGE plpgsql;

And manually perform two updates at the master inside one transaction:

postgres=# begin;
BEGIN
postgres=# update pgbench_accounts set abalance=abalance+1 where aid=26;
UPDATE 1
postgres=# update pgbench_accounts set abalance=abalance-1 where aid=26;
UPDATE 1
postgres=# commit;


and in the replica's log we see:
2017-10-02 18:40:26.094 MSK [2954] LOG:  logical replication apply 
worker for subscription "sub" has started

2017-10-02 18:40:26.101 MSK [2954] ERROR:  attempted to lock invisible tuple
2017-10-02 18:40:26.102 MSK [2882] LOG:  worker process: logical 
replication worker for subscription 16403 (PID 2954) exited with exit code 1


The error happens in trigger.c:
#3  0x0069bddb in GetTupleForTrigger (estate=0x2e36b50, 
epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tid=0x2dd08ac,
lockmode=LockTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at 
trigger.c:3103
#4  0x0069b259 in ExecBRUpdateTriggers 

[HACKERS] Issues with logical replication

2017-10-02 Thread Konstantin Knizhnik
ckTupleNoKeyExclusive, newSlot=0x7ffc4420ec40) at 
trigger.c:3103
#4  0x0069b259 in ExecBRUpdateTriggers (estate=0x2e36b50, 
epqstate=0x7ffc4420eda0, relinfo=0x2dcfe90, tupleid=0x2dd08ac,

fdw_trigtuple=0x0, slot=0x2dd0240) at trigger.c:2748
#5  0x006d2395 in ExecSimpleRelationUpdate (estate=0x2e36b50, 
epqstate=0x7ffc4420eda0, searchslot=0x2dd0358, slot=0x2dd0240)

at execReplication.c:461
#6  0x00820894 in apply_handle_update (s=0x7ffc442163b0) at 
worker.c:736

#7  0x00820d83 in apply_dispatch (s=0x7ffc442163b0) at worker.c:892
#8  0x0082121b in LogicalRepApplyLoop (last_received=2225693736) 
at worker.c:1095

#9  0x0082219d in ApplyWorkerMain (main_arg=0) at worker.c:1647
#10 0x007dd496 in StartBackgroundWorker () at bgworker.c:835
#11 0x007f0889 in do_start_bgworker (rw=0x2d75f50) at 
postmaster.c:5680

#12 0x007f0bc3 in maybe_start_bgworkers () at postmaster.c:5884
#13 0x007efbeb in sigusr1_handler (postgres_signal_arg=10) at 
postmaster.c:5073

#14 
#15 0x7fbe83a054a3 in __select_nocancel () at 
../sysdeps/unix/syscall-template.S:81

#16 0x007eb517 in ServerLoop () at postmaster.c:1717
#17 0x007eac48 in PostmasterMain (argc=3, argv=0x2d4e660) at 
postmaster.c:1361

#18 0x00728523 in main (argc=3, argv=0x2d4e660) at main.c:228

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] alter server for foreign table

2017-09-30 Thread konstantin knizhnik

On Sep 30, 2017, at 10:58 PM, Andrew Dunstan wrote:

> 
> 
> On 09/30/2017 05:14 AM, Derry Hamilton wrote:
>> Just to say, yes, this would be handy. I've been using a variant of
>> that hack on reporting servers, while migrating systems from
>> proprietary databases.  It behaves quite gracefully when there are
>> incompatible options, and it fixes up properly with DROPs as the first
>> options.
>> 
>> 
> 
> 
> I assume the proposal is to allow changing to a different server using
> the same FDW. I can see all sorts of odd things happening if we allow
> changing to a server of a different FDW.

Actually what I need is to handle moving a shard (partition) to another node.
I cannot use "alter server" to change the connection string, because this server 
is used for many shards located at this node.
And I do not want to drop and recreate the foreign table, because that seems to be 
very complicated. 

So I need to switch the foreign table to another server of the same FDW; a sketch 
of what I have in mind is below.
But in theory I can imagine a situation where a partition is moved to another 
database (from MySQL to Postgres, for example). In this case we need to change the 
FDW as well.
What can be wrong with changing the FDW? All checks that the FDW is consistent with 
the foreign table can be redone...
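
Just to illustrate what I mean (purely hypothetical syntax, not supported today; 
the table and server names are made up), I would like to be able to write 
something like:

-- hypothetical: move foreign table orders_0 to server shard2 (same FDW)
ALTER FOREIGN TABLE orders_0 SERVER shard2;

with the effect being essentially a change of pg_foreign_table.ftserver plus a 
revalidation of the table's options against the new server.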






Re: [HACKERS] pg_prepared_xact_status

2017-09-30 Thread konstantin knizhnik

On Sep 29, 2017, at 11:33 PM, Robert Haas wrote:

> On Fri, Sep 29, 2017 at 4:22 AM, Craig Ringer  wrote:
>> This sounds kind-of like 1/4 of a distributed transaction resolver, without
>> a way to make it reliable enough to build the other 3/4.
>> 
>> To make this practical I think you'd need a way to retain historic GIDs +
>> their outcomes, and a way to prune that information only once an application
>> knows all interested participants consider the transaction finalized.
>> 
>> I'd be all for a global xid status function if there were a good way to
>> manage resource retention. But it's fuzzy enough for txid_status, which
>> isn't really making any firm promises, just improving on the prior state of
>> "no f'ing idea what happened to that tx, sorry". 2PC consumers will want
>> absolute guarantees, not "dunno, sorry".
> 
> Very well said, and I agree.

txid_status() is also not always able to return the status of a transaction (if 
wraparound happens).
But it is still considered one of the key features of 10 (transaction 
traceability...).

> 
> I think the approach this patch takes is a non-starter for exactly the
> reasons you have stated.

Actually I do not propose pg_prepared_xact_status as a mechanism for constructing a 
GTM or some other complete 2PC infrastructure.
It is just a simple function, using the existing PostgreSQL WAL iteration utilities, 
that simplifies extraction of information about 2PC transactions.
The same thing can be done manually using pg_waldump, but that is very inconvenient.
So I do not see any trouble caused by adding this function, and it can really 
be helpful for a DBA in some cases.





[HACKERS] alter server for foreign table

2017-09-29 Thread Konstantin Knizhnik
According to the PostgreSQL documentation it is not possible to alter the server of 
a foreign table:


https://www.postgresql.org/docs/10/static/sql-alterforeigntable.html

But stackoverflow suggests the following hack of directly updating 
pg_foreign_table:

https://stackoverflow.com/questions/37388756/may-i-alter-server-for-foreign-table

I wonder how safe it is, and if it is so simple, why is it not supported in the 
ALTER FOREIGN TABLE statement? (A sketch of the hack is below.)
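
For reference, the hack discussed there boils down to something like this (a sketch 
only; the table and server names are made up, and I am not claiming it is safe):

-- switch foreign table orders_0 from its current server to shard2
UPDATE pg_foreign_table
   SET ftserver = (SELECT oid FROM pg_foreign_server WHERE srvname = 'shard2')
 WHERE ftrelid = 'orders_0'::regclass;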


Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] pg_prepared_xact_status

2017-09-29 Thread Konstantin Knizhnik



On 29.09.2017 11:27, Craig Ringer wrote:
On 29 September 2017 at 15:57, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:


So you are saying that Postgresql 2PC mechanism is not complete
and user needs to maintain some extra information to make it work?


No, it provides what's needed for an implementation of what in XA 
terms is a local resource manager (LRM). What it does not provide is 
infrastructure to make postgres its self into a global transaction 
manager (GTM) for co-ordinating multiple LRMs.


It sounds like you're trying to build a GTM using PostgreSQL's 
existing LRM book-keeping as your authorative data store, right?


Not exactly. I am trying to add 2PC to our pg_shardman: a combination of 
pg_pathman + postgres_fdw + logical replication, which should provide HA 
and write scalability.
This architecture definitely does not assume the presence of a GTM. Most 
transactions are expected to be local (involving only one node), and the 
number of participants of a distributed transaction is expected to be much 
smaller than the total number of nodes (usually 2). So we need to perform 
2PC without a GTM.



The problems with 2PC arrive when coordinator node is not
available but is expected to be recovered in future.
In this case we may have not enough information to make a decision
whether to abort or commit prepared transaction.
But it is a different story. We need to use 3PC or some other
protocol to prevent such situation.


In that case the node sits and waits patiently for the GTM (or in more 
complex architectures, *a* valid voting quorum of GTMs) to be 
reachable again. Likely using a protocol like Raft, Paxos, 3PC etc to 
co-ordinate.


It can't do anything else, since if it unilaterally commits or rolls 
back it might later find out that the nodes on the other side of the 
network partition or whatever made the opposite decision and, boom!


Ok, I am not sure whether pg_prepared_xact_status can really be useful or not.
I agree with you that if we are implementing distributed transactions on 
top of Postgres, then we need some better mechanism to determine 
transaction state.
But a lot of people are using 2PC without a GTM or anything else. For 
example, many Java ORMs use 2PC for their transactions.
I think it is better to give the DBA or programmer some way to determine 
the status of such a transaction by GID (which is usually unique and 
known), as far as this information is available in the Postgres WAL.

In any case, I attached a slightly improved version of this function, which 
traverses the log not only since the last checkpoint, but also tries to iterate 
backward, inspecting previous WAL segments.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ae83291..fbf91f5 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -84,6 +84,7 @@
 #include "access/twophase_rmgr.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/xlog_internal.h"
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "access/xlogreader.h"
@@ -2408,3 +2409,106 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
 
 	return;
 }
+
+Datum
+pg_prepared_xact_status(PG_FUNCTION_ARGS)
+{
+char const* gid = PG_GETARG_CSTRING(0);
+	XLogRecord *record;
+	XLogReaderState *xlogreader;
+	char	   *errormsg;
+	XLogRecPtr start_lsn;
+	XLogRecPtr lsn;
+	char const* xact_status = "unknown";
+	bool done = false;
+	TimeLineID timeline;
+	TransactionId xid = InvalidTransactionId;
+	XLogRecPtr wal_end = GetFlushRecPtr();
+
+	GetOldestRestartPoint(&start_lsn, &timeline);
+
+	xlogreader = XLogReaderAllocate(&read_local_xlog_page, NULL);
+	if (!xlogreader)
+		ereport(ERROR,
+(errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory"),
+ errdetail("Failed while allocating a WAL reading processor.")));
+	while (true)
+	{
+		lsn = start_lsn;
+		do
+		{
+			record = XLogReadRecord(xlogreader, lsn, &errormsg);
+			if (record == NULL)
+break;
+			lsn = InvalidXLogRecPtr; /* continue after the record */
+			if (XLogRecGetRmid(xlogreader) == RM_XACT_ID)
+			{
+uint32 info = XLogRecGetInfo(xlogreader);
+switch (info & XLOG_XACT_OPMASK)
+{
+case XLOG_XACT_PREPARE:
+	{
+		TwoPhaseFileHeader *hdr = (TwoPhaseFileHeader *)XLogRecGetData(xlogreader);
+		char* xact_gid = (char*)hdr + MAXALIGN(sizeof(TwoPhaseFileHeader));
+ 		if (strcmp(xact_gid, gid) == 0)
+		{
+			xid = hdr->xid;
+			xact_status = "prepared";
+		}
+		break;
+	}
+case XLOG_XACT_COMMIT_PREPARED:
+	{
+		xl_xact_commit *xlrec;
+		xl_xact_parsed_commit parsed;
+
+		xlrec = (xl_xact_commit *) XLogRecGetData(xlogreader);

Re: [HACKERS] Index expression syntax

2017-09-29 Thread Konstantin Knizhnik



On 29.09.2017 11:03, Marko Tiikkaja wrote:
On Fri, Sep 29, 2017 at 9:31 AM, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:


I wonder why syntax error is produced in this case:

postgres=# create index metaindex on foo using
gin(to_tsvector('english', x)||to_tsvector('english',y));
ERROR:  syntax error at or near "||"
LINE 1: ...taindex on foo using gin(to_tsvector('english',
x)||to_tsvec...
^
[ .. ]

expression:
An expression based on one or more columns of the table. The
expression usually must be written with surrounding
parentheses, as shown in the syntax. However, the parentheses
can be omitted if the expression has the form of a function call.

So the documentation states that it is sometimes possible to avoid
parentheses, but it is unclear why I have to use double
parentheses...
I think that either the grammar should be fixed or the documentation
should be updated.


Your expression is clearly not a function call, it's a concatenation 
of two of them.  The documentation seems perfectly accurate to me?



Oh, sorry!
You are right. I just did not notice the extra parentheses required around 
expressions in the CREATE INDEX syntax.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] pg_prepared_xact_status

2017-09-29 Thread Konstantin Knizhnik



On 29.09.2017 06:02, Michael Paquier wrote:

On Fri, Sep 29, 2017 at 1:53 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

In Postgres 10 we have txid_status function which returns status of
transaction by XID.
I wonder if it will be also useful to have similar function for 2PC
transactions which can operate with GID?
pg_prepared_xacts view allows to get information about prepared transaction
which are not yet committed or aborted.
But if transaction is committed, then there is no way now to find status of
this transaction.

But you need to keep track of the transaction XID of each transaction
happening on the remote nodes which are part of a global 2PC
transaction, no?


Why? We have the GID, which allows us to identify the 2PC transaction at all 
participant nodes.



  If you have this data at hand using txid_status is
enough to guess if a prepared transaction has been marked as committed
or prepared. And it seems to me that tracking those XIDs is mandatory
anyway for other consistency checks.


It is certainly possible to maintain information about the XIDs involved in a 
2PC transaction, and it can really simplify recovery. But I wonder why it is 
mandatory? Keeping track of XIDs requires some persistent storage.
So you are saying that the PostgreSQL 2PC mechanism is not complete and the user 
needs to maintain some extra information to make it work?


Also, I think that it is not necessary to know the XIDs of all local 
transactions involved in 2PC. It is enough to know the XID of the coordinator's 
transaction.
It can be included in the GID (as I proposed at the end of my mail). In this 
case txid_status can be used at the coordinator to check the global status of 
the 2PC transaction.


The idea of the pg_prepared_xact_status function is that it allows getting the 
status of a 2PC transaction without any additional requirements on GIDs 
and without any other information about the participants of the 2PC transaction.






If crash happen during 2PC commit, then transaction can be in prepared state
at some nodes and committed/aborted at  other nodes.

Handling inconsistencies here is a tricky problem, particularly if a
given transaction is marked as both committed and aborted on many
nodes.

How can that be?
An abort of the transaction can happen only at the prepare stage.
In this case the coordinator should roll back the transaction everywhere.
There should be no committed transactions in this case.

The following situations are possible (a rough resolution sketch follows the list):
1. The transaction is prepared at some nodes and no information about it is 
available at other nodes. It means that the crash happened at the prepare stage 
and the transaction was not able to complete prepare at all nodes. It is safe to 
abort the transaction in this case.
2. The transaction is prepared at some nodes and aborted at other nodes. 
The same as 1 - we can safely abort the transaction everywhere.
3. The transaction is prepared at all nodes. It means that the coordinator 
crashed before sending the commit message. It is safe to commit the transaction 
everywhere.
4. The transaction is prepared at some nodes and committed at other nodes. 
The commit message was not delivered to or processed by the other nodes before 
the crash. It is safe to commit the transaction at all nodes.
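
For illustration, resolution at each surviving participant could then look roughly 
like this (a sketch only; it assumes the SQL-level wrapper of the proposed 
pg_prepared_xact_status() function takes the GID as text, and 'gtx-42' is just an 
example GID):

-- list in-doubt transactions on this node together with their status in the WAL
SELECT gid, pg_prepared_xact_status(gid) FROM pg_prepared_xacts;

-- cases 3 and 4: prepare reached all nodes, or some node already committed
COMMIT PREPARED 'gtx-42';

-- cases 1 and 2: prepare did not reach all nodes, or some node aborted it
ROLLBACK PREPARED 'gtx-42';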


The problems with 2PC arise when the coordinator node is not available but 
is expected to be recovered in the future.
In this case we may not have enough information to decide 
whether to abort or commit a prepared transaction.
But that is a different story. We need to use 3PC or some other protocol 
to prevent such a situation.



The only way that I could think of would be to perform PITR to
recover from the inconsistent states. So that's not an easy problem,
becoming even more tricky if more than one transaction is involved and
many transactions are inter-dependent across nodes.


3. Same GID can be reused multiple times. In this case
pg_prepared_xact_status function will return incorrect result, because it
will return information about first global transaction with such GID after
checkpoint and not the recent one.

Yeah, this argument alone is why I think that this is a dead-end approach.


Maybe. But I think that in most real systems unique GIDs are generated, 
because otherwise it is difficult to address concurrency and recovery 
issues.





There is actually alternative approach to recovery of 2PC transactions. We
can include coordinator identifier in GID (we can use GetSystemIdentifier()
to identify coordinator's node)
and XID of coordinator's transaction. In this case we can use txid_status()
to check status of transaction at coordinator. It eliminates need to scan
WAL to determine status of prepared transaction.

+GetOldestRestartPoint(&lsn, &timeline);
+
+xlogreader = XLogReaderAllocate(&read_local_xlog_page, NULL);
+if (!xlogreader)
So you scan a bunch of records for each GID? This is really costly. I
think that you would have an easier life by tracking the XID of each
transaction involved remotely. In Postgres-XL, this is not a problem
as XIDs are assigned globally and consistently. But you woul

[HACKERS] Index expression syntax

2017-09-29 Thread Konstantin Knizhnik

I wonder why syntax error is produced in this case:

postgres=# create index metaindex on foo using 
gin(to_tsvector('english', x)||to_tsvector('english',y));

ERROR:  syntax error at or near "||"
LINE 1: ...taindex on foo using gin(to_tsvector('english', x)||to_tsvec...
 ^
The error can be eliminated if extra surrounding parentheses are added:

postgres=# create index metaindex on foo using 
gin((to_tsvector('english', x)||to_tsvector('english',y)));

CREATE INDEX

The PostgreSQL documentation says:

CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ]
ON table_name [ USING method ]
( { column_name | ( expression ) } [ COLLATE collation ]
[ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
[ WITH ( storage_parameter = value [, ... ] ) ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]

expression:
   An expression based on one or more columns of the table. The
   expression usually must be written with surrounding parentheses, as
   shown in the syntax. However, the parentheses can be omitted if the
   expression has the form of a function call.


So the documentation states that it is sometimes possible to avoid 
parentheses, but it is unclear why I have to use double parentheses...
I think that either the grammar should be fixed or the documentation should 
be updated.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Surjective functional indexes

2017-09-28 Thread Konstantin Knizhnik

On 09/28/2017 10:10 PM, Robert Haas wrote:

On Wed, Sep 13, 2017 at 7:00 AM, Simon Riggs <si...@2ndquadrant.com> wrote:

If we do have an option it won't be using fancy mathematical
terminology at all, it would be described in terms of its function,
e.g. recheck_on_update

+1.


I have nothing against renaming the "projection" option to "recheck_on_update" or 
whatever else is suggested.
Just let me know the best version. From my point of view, "recheck_on_update" is 
too verbose and still not self-explanatory (to understand the meaning of this 
option it is necessary to understand how heap_update works). 
"projection"/"non-injective"/... are more declarative notions, describing a 
characteristic of the index, while "recheck_on_update" is a more procedural 
notion, describing the behavior of heap_update.





Yes, I'd rather not have an option at all, just some simple code with
useful effect, like we have in many other places.

I think the question we need to be able to answer is: What is the
probability that an update that would otherwise be non-HOT can be made
into a HOT update by performing a recheck to see whether the value has
changed?  It doesn't seem easy to figure that out from any of the
statistics we have available today or could easily get, because it
depends not only on the behavior of the expression which appears in
the index definition but also on the application behavior.  For
example, consider a JSON blob representing a bank account.
b->'balance' is likely to change most of the time, but
b->'account_holder_name' only rarely.  That's going to be hard for an
automated system to determine.

We should clearly check as many of the other criteria for a HOT update
as possible before performing a recheck of this type, so that it only
gets performed when it might help.  For example, if column a is
indexed and b->'foo' is indexed, there's no point in checking whether
b->'foo' has changed if we know that a has changed.  I don't know
whether it would be feasible to postpone deciding whether to do a
recheck until after we've figured out whether the page seems to
contain enough free space to allow a HOT update.

Turning non-HOT updates into HOT updates is really good, so it seems
likely that the rechecks will often be worthwhile.  If we avoid a HOT
update in 25% of cases, that's probably easily worth the CPU overhead
of a recheck assuming the function isn't something ridiculously
expensive to compute; the extra CPU cost will be repaid by reduced
bloat.  However, if we avoid a HOT update only one time in a million,
it's probably not worth the cost of recomputing the expression the
other 999,999 times.  I wonder where the crossover point is -- it
seems like something that could be figured out by benchmarking.

While I agree that it would be nice to have this be a completely
automatic determination, I am not sure that will be practical.  I
oppose overloading some other marker (like function_cost>1) for
this; that's too magical.


I almost agree with you.
Just a few remarks: indexes are rarely created on frequently changed attributes 
like b->'balance'.
So with a properly designed database schema we can expect that most updates are 
HOT updates, i.e. do not actually affect any index.
But certainly different attributes may have different probabilities of being 
updated.
Unfortunately, before performing the check we do not know which attribute of the 
JSON field (or of any other field used in the indexed expression) has changed.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] pg_prepared_xact_status

2017-09-28 Thread Konstantin Knizhnik

Hi,

In Postgres 10 we have txid_status function which returns status of 
transaction by XID.
I wonder if it will be also useful to have similar function for 2PC 
transactions which can operate with GID?
The pg_prepared_xacts view allows getting information about prepared 
transactions which are not yet committed or aborted.
But once a transaction is committed, there is currently no way to find the 
status of this transaction.


If a crash happens during the 2PC commit, then the transaction can be in the 
prepared state at some nodes and committed/aborted at other nodes.
Using the pg_prepared_xacts view a DBA can find global transactions which 
were not completed.
But there is no way (except pg_waldump) to determine whether such a 
transaction needs to be committed or aborted at the rest of the nodes. 
The intended use of the proposed function is sketched below.
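
Something like this (a sketch; the SQL-level signature is an assumption, since only 
the C function is shown in the patch below, and the status strings come from its 
implementation):

SELECT pg_prepared_xact_status('my_gid');
-- returns 'prepared', 'committed', 'aborted' or 'unknown'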


Attached please find a small patch with the pg_prepared_xact_status function.
This function has the following obvious drawbacks:

1. It is not able to extract information about prepared transactions preceding 
the last checkpoint. That seems to be enough to perform recovery in case of 
failure, unless a checkpoint happened just before the failure or there is a 
large gap between prepare and commit.
The only workaround I see at this moment is to pass to this function an 
optional parameter with the start position in the WAL.

Any better solution?

2. On systems with a huge workload the interval between checkpoints may be 
very large. In this case we have to scan a large amount of WAL data to 
locate our transaction, which may take a significant amount of time.
We could traverse the WAL in a smarter way, starting from the last segment, 
assuming that the in-doubt transaction was prepared just before the crash.
But that significantly complicates the traversal logic.

3. The same GID can be reused multiple times. In this case the 
pg_prepared_xact_status function will return an incorrect result, because 
it will return information about the first global transaction with such a GID 
after the checkpoint and not the most recent one.



There is actually an alternative approach to recovery of 2PC transactions. 
We can include the coordinator identifier (we can use GetSystemIdentifier() 
to identify the coordinator's node) and the XID of the coordinator's 
transaction in the GID. In this case we can use txid_status() to check the 
status of the transaction at the coordinator, which eliminates the need to 
scan the WAL to determine the status of a prepared transaction. A sketch of 
this scheme is shown below.
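
A minimal sketch of this scheme (the GID format and names are assumptions, not 
part of the patch):

-- on the coordinator, when starting the distributed transaction:
SELECT format('gtx-%s-%s',
              (SELECT system_identifier FROM pg_control_system()),
              txid_current()) AS gid;
-- e.g. 'gtx-6434587342253288446-12345', then used in PREPARE TRANSACTION on all nodes

-- later, any node can ask the coordinator about the outcome:
SELECT txid_status(split_part('gtx-6434587342253288446-12345', '-', 3)::bigint);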


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index ae83291..be65ae7 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -84,6 +84,7 @@
 #include "access/twophase_rmgr.h"
 #include "access/xact.h"
 #include "access/xlog.h"
+#include "access/xlog_internal.h"
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "access/xlogreader.h"
@@ -2408,3 +2409,87 @@ PrepareRedoRemove(TransactionId xid, bool giveWarning)
 
 	return;
 }
+
+Datum
+pg_prepared_xact_status(PG_FUNCTION_ARGS)
+{
+char const* gid = PG_GETARG_CSTRING(0);
+	XLogRecord *record;
+	XLogReaderState *xlogreader;
+	char	   *errormsg;
+	XLogRecPtr lsn;
+	char const* xact_status = "unknown";
+	bool done = false;
+	TimeLineID timeline;
+	TransactionId xid = InvalidTransactionId;
+	XLogRecPtr wal_end = GetFlushRecPtr();
+
+	GetOldestRestartPoint(&lsn, &timeline);
+
+	xlogreader = XLogReaderAllocate(&read_local_xlog_page, NULL);
+	if (!xlogreader)
+		ereport(ERROR,
+(errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory"),
+ errdetail("Failed while allocating a WAL reading processor.")));
+	do
+	{
+		record = XLogReadRecord(xlogreader, lsn, &errormsg);
+		if (record == NULL)
+			break;
+		lsn = InvalidXLogRecPtr; /* continue after the record */
+		if (XLogRecGetRmid(xlogreader) == RM_XACT_ID)
+		{
+			uint32 info = XLogRecGetInfo(xlogreader);
+			switch (info & XLOG_XACT_OPMASK)
+			{
+			case XLOG_XACT_PREPARE:
+{
+	TwoPhaseFileHeader *hdr = (TwoPhaseFileHeader *)XLogRecGetData(xlogreader);
+	char* xact_gid = (char*)hdr + MAXALIGN(sizeof(TwoPhaseFileHeader));
+	if (strcmp(xact_gid, gid) == 0)
+	{
+		xid = hdr->xid;
+		xact_status = "prepared";
+	}
+	break;
+}
+			case XLOG_XACT_COMMIT_PREPARED:
+{
+	xl_xact_commit *xlrec;
+	xl_xact_parsed_commit parsed;
+
+	xlrec = (xl_xact_commit *) XLogRecGetData(xlogreader);
+	ParseCommitRecord(info, xlrec, &parsed);
+	if (xid == parsed.twophase_xid)
+	{
+		Assert(TransactionIdIsValid(xid));
+		xact_status = "committed";
+		done = true;
+	}
+	break;
+}
+			case XLOG_XACT_ABORT_PREPARED:
+{
+	xl_xact_abort *xlrec;
+	xl_xact_parsed_abort parsed;
+
+	xlrec = (xl_xact_abort *) XLogRecGetData(xlogreader);
+	ParseAbortRecord(info, xlrec, &parsed);
+	if (xid == parsed.twophase

Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0

2017-09-19 Thread Konstantin Knizhnik



On 04.09.2017 23:52, Andres Freund wrote:


Hi. That piece of code isn't particularly clear (and has a bug in the
submitted version), I'm revising it.


...

Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.



Should I wait for a new version of your patch or continue reviewing this code?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-09-15 Thread Konstantin Knizhnik



On 14.09.2017 18:53, Simon Riggs wrote:

It's not going to work, as already mentioned above. Those stats are at
table level and very little to do with this particular index.

But you've not commented on the design I mention that can work: index relcache.


Concerning your idea to check cost of index function: it certainly makes
sense.
The only problems: I do not understand now how to calculate this cost.
It can be easily calculated by optimizer when it is building query execution
plan.
But inside BuildIndexInfo I have just reference to Relation and have no idea
how
I can propagate here information about index expression cost from optimizer.

We could copy at create index, if we took that route. Or we can look
up the cost for the index expression and cache it.


Anyway, this is just jumping around because we still have a parameter
and the idea was to remove the parameter entirely by autotuning, which
I think is both useful and possible, just as HOT itself is autotuned.



Attached please find yet another version of the patch.
I had to rewrite it significantly, because my first attempts to add 
auto-tune were not correct.

The new patch does it in the correct way (I hope) and more efficiently.
I moved the auto-tune code from BuildIndexInfo, which is called many times, 
including from heap_update (so at least once per updated tuple), 
to RelationGetIndexAttrBitmap, which is called only when the cached 
RelationData is filled by the backend.
The problem with my original implementation of auto-tune was that when 
switching off the "projection" property of an index, it did not update the 
attribute masks calculated by RelationGetIndexAttrBitmap.

I have also added a check for the maximal cost of the indexed expression.
So now the decision whether to apply the projection index optimization 
(comparing old and new values of the indexed expression) is based on three 
sources (a standalone sketch of this decision logic follows the list):
 1. The collected HOT update statistic: we compare the number of HOT updates 
that were possible only because the projection index check showed that the 
index expression had not changed with the total number of updates affecting 
attributes used in projection indexes. If this ratio is smaller than some 
threshold (10%), then the index is considered non-projective.
 2. The calculated cost of the index expression: if it is higher than some 
threshold (1000), then the extra comparison of index expression values is 
expected to be too expensive.
 3. The "projection" index option explicitly set by the user. This setting 
overrides 1) and 2).
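
For clarity, here is a small standalone sketch of that decision logic (the names, 
the enum and the exact thresholds are illustrative only, not the code from the 
attached patch):

#include <stdbool.h>

#define PROJECTION_HIT_RATE_THRESHOLD 0.1    /* 10% from point 1 */
#define PROJECTION_MAX_EXPR_COST      1000.0 /* cost limit from point 2 */

typedef enum { PROJ_OFF, PROJ_ON, PROJ_AUTO } ProjectionOption;

/* Decide whether comparing old/new index expression values is worthwhile. */
static bool
use_projection_check(ProjectionOption option, /* explicit option, point 3 */
                     double expr_cost,        /* index expression cost, point 2 */
                     long hot_hits,           /* updates saved by the check */
                     long checked_updates)    /* updates where the check ran */
{
    if (option != PROJ_AUTO)
        return option == PROJ_ON;  /* explicit setting overrides autotuning */
    if (expr_cost > PROJECTION_MAX_EXPR_COST)
        return false;              /* the recheck itself is too expensive */
    if (checked_updates > 0 &&
        (double) hot_hits / checked_updates < PROJECTION_HIT_RATE_THRESHOLD)
        return false;              /* the recheck almost never pays off */
    return true;
}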




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   Functional index is based on on projection function: function which extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for
+   the expression expression(bookinfo-'isbn') defined
+   for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates
+   index only when function results are different. It allows to eliminate index update and use HOT update.
+   But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect
+   the function value is small, then marking index as projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762..b73165f 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expressi

Re: [HACKERS] Surjective functional indexes

2017-09-15 Thread Konstantin Knizhnik



On 14.09.2017 18:53, Simon Riggs wrote:



This works by looking at overall stats, and only looks at the overall
HOT %, so its too heavyweight and coarse.

I suggested storing stat info on the relcache and was expecting you
would look at how often the expression evaluates to new == old. If we
evaluate new against old many times, then if the success rate is low
we should stop attempting the comparison. (<10%?)

Another idea:
If we don't make a check when we should have done then we will get a
non-HOT update, so we waste time extra time difference between a HOT
and non-HOT update. If we check and fail we waste time take to perform
check. So the question is how expensive the check is against how
expensive a non-HOT update is. Could we simply say we don't bother to
check functions that have a cost higher than 1? So if the user
doesn't want to perform the check they can just increase the cost of
the function above the check threshold?


Attached pleased find one more patch which calculates hot update check hit
rate more precisely: I have to extended PgStat_StatTabEntry with two new
fields:
hot_update_hits and hot_update_misses.

It's not going to work, as already mentioned above. Those stats are at
table level and very little to do with this particular index.

But you've not commented on the design I mention that can work: index relcache.

Sorry, I do not completely agree with you.
Yes, certainly, whether a functional index is projective or not is a property 
of the index, not of the table.
But the decision whether a HOT update is applicable or not is made for the 
whole table - for all indexes.
If the value of just one indexed expression is changed, then we cannot 
use a HOT update and have to update all indexes.


Assume that we have a table with a "bookinfo" field of type JSONB,
and we create several functional indexes on this column: 
(bookinfo->'isbn'), (bookinfo->'title'), (bookinfo->'author'), 
(bookinfo->'rating'). (A minimal SQL illustration is given below.)
The probability that the indexed expression changes when the "bookinfo" field 
is updated may be different for each of these indexes.
But there is no sense at all in checking whether 'isbn' has changed or not 
if we have already detected that most updates change the 'rating' attribute, 
so comparing old and new values of (bookinfo->'rating') is just a waste of 
time. In that case we should not compare (bookinfo->'isbn') and the other 
indexed expressions either, because the possibility of a HOT update has 
already been rejected.
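
The SQL for this example could look like this (table and column names are just an 
illustration):

CREATE TABLE books (id serial PRIMARY KEY, bookinfo jsonb);
CREATE INDEX ON books ((bookinfo->'isbn'));
CREATE INDEX ON books ((bookinfo->'title'));
CREATE INDEX ON books ((bookinfo->'author'));
CREATE INDEX ON books ((bookinfo->'rating'));

-- touches only 'rating', but without the recheck any update of the jsonb
-- column makes all four indexes look potentially affected and disables HOT:
UPDATE books SET bookinfo = jsonb_set(bookinfo, '{rating}', '5') WHERE id = 1;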


So despite the fact that this information depends on a particular 
index, it affects the behavior of the whole table, and it is reasonable (and 
simpler) to collect it in the table's statistics.



Concerning your idea to check cost of index function: it certainly makes
sense.
The only problems: I do not understand now how to calculate this cost.
It can be easily calculated by optimizer when it is building query execution
plan.
But inside BuildIndexInfo I have just reference to Relation and have no idea
how
I can propagate here information about index expression cost from optimizer.

We could copy at create index, if we took that route. Or we can look
up the cost for the index expression and cache it.


Anyway, this is just jumping around because we still have a parameter
and the idea was to remove the parameter entirely by autotuning, which
I think is both useful and possible, just as HOT itself is autotuned.



A HOT update is in almost all cases preferable to a normal update, which has to 
update the indexes.
There can be some scenarios where HOT updates reduce the speed of some queries, 
but it is very difficult to predict such cases at the user level.

But usually the nature of an index is well known to the DBA or programmer. In 
almost all cases it is clear to the person creating a functional index whether 
it performs a projection or not, and whether comparing old/new expression values 
makes sense or is just a waste of time. We can guess it with autotuning, but such 
a decision may be wrong (simply because of the application's business logic). 
Postgres indexes already have a lot of options, and I think that the "projection" 
option (or whatever we name it) is also needed; a usage sketch is below.
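
With the attached patch the switch would look roughly like this (a sketch; the 
option name is still under discussion and the table/expressions are made up):

-- rely on the old/new expression comparison (the default with the patch):
CREATE INDEX ON books ((bookinfo->'isbn')) WITH (projection = on);

-- an essentially one-to-one expression: skip the useless recheck
CREATE INDEX ON books (lower(bookinfo->>'isbn')) WITH (projection = off);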



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-09-14 Thread Konstantin Knizhnik



On 14.09.2017 13:19, Simon Riggs wrote:

On 14 September 2017 at 10:42, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:


On 13.09.2017 14:00, Simon Riggs wrote:

On 13 September 2017 at 11:30, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:


The only reason of all this discussion about terms is that I need to
choose
name for correspondent index option.
Simon think that we do not need this option at all. In this case we
should
not worry about right term.
  From my point of view, "projection" is quite clear notion and not only
for
mathematics. It is also widely used in IT and especially in DBMSes.

If we do have an option it won't be using fancy mathematical
terminology at all, it would be described in terms of its function,
e.g. recheck_on_update

Yes, I'd rather not have an option at all, just some simple code with
useful effect, like we have in many other places.


Attached please find new version of projection functional index optimization
patch.
I have implemented very simple autotune strategy: now I use table statistic
to compare total number of updates with number of hot updates.
If fraction of hot updates is relatively small, then there is no sense to
spend time performing extra evaluation of index expression and comparing its
old and new values.
Right now the formula is the following:

#define MIN_UPDATES_THRESHOLD 10
#define HOT_RATIO_THRESHOLD   2

 if (stat->tuples_updated > MIN_UPDATES_THRESHOLD
 && stat->tuples_updated >
stat->tuples_hot_updated*HOT_RATIO_THRESHOLD)
 {
 /* If percent of hot updates is small, then disable projection
index function
  * optimization to eliminate overhead of extra index expression
evaluations.
  */
 ii->ii_Projection = false;
 }

This threshold values are pulled out of a hat: I am not sure if this
heuristic is right.
I will be please to get feedback if such approach to autotune is promising.

Hmm, not really, but thanks for trying.

This works by looking at overall stats, and only looks at the overall
HOT %, so its too heavyweight and coarse.

I suggested storing stat info on the relcache and was expecting you
would look at how often the expression evaluates to new == old. If we
evaluate new against old many times, then if the success rate is low
we should stop attempting the comparison. (<10%?)

Another idea:
If we don't make a check when we should have done then we will get a
non-HOT update, so we waste time extra time difference between a HOT
and non-HOT update. If we check and fail we waste time take to perform
check. So the question is how expensive the check is against how
expensive a non-HOT update is. Could we simply say we don't bother to
check functions that have a cost higher than 1? So if the user
doesn't want to perform the check they can just increase the cost of
the function above the check threshold?

Attached please find one more patch, which calculates the hot update check 
hit rate more precisely: I had to extend PgStat_StatTabEntry with two 
new fields: hot_update_hits and hot_update_misses.

Concerning your idea to check the cost of the index function: it certainly makes 
sense.
The only problem: I do not understand right now how to calculate this cost.
It can easily be calculated by the optimizer when it is building the query 
execution plan.
But inside BuildIndexInfo I just have a reference to the Relation and have no 
idea how to propagate information about the index expression cost from the 
optimizer to this place.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   Functional index is based on on projection function: function which extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for
+   the expression expression(bookinfo-'isbn') defined
+

Re: [HACKERS] Surjective functional indexes

2017-09-14 Thread Konstantin Knizhnik



On 13.09.2017 14:00, Simon Riggs wrote:

On 13 September 2017 at 11:30, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:


The only reason of all this discussion about terms is that I need to choose
name for correspondent index option.
Simon think that we do not need this option at all. In this case we should
not worry about right term.
 From my point of view, "projection" is quite clear notion and not only for
mathematics. It is also widely used in IT and especially in DBMSes.

If we do have an option it won't be using fancy mathematical
terminology at all, it would be described in terms of its function,
e.g. recheck_on_update

Yes, I'd rather not have an option at all, just some simple code with
useful effect, like we have in many other places.

Attached please find a new version of the projection functional index 
optimization patch.
I have implemented a very simple autotune strategy: now I use the table 
statistics to compare the total number of updates with the number of HOT updates.
If the fraction of HOT updates is relatively small, then there is no sense 
in spending time performing the extra evaluation of the index expression and 
comparing its old and new values.

Right now the formula is the following:

#define MIN_UPDATES_THRESHOLD 10
#define HOT_RATIO_THRESHOLD   2

    if (stat->tuples_updated > MIN_UPDATES_THRESHOLD
        && stat->tuples_updated > stat->tuples_hot_updated * HOT_RATIO_THRESHOLD)
    {
        /*
         * If the percentage of HOT updates is small, then disable the projection
         * index function optimization to eliminate the overhead of extra index
         * expression evaluations.
         */
        ii->ii_Projection = false;
    }

These threshold values are pulled out of a hat: I am not sure if this 
heuristic is right.
I will be pleased to get feedback on whether such an approach to autotuning is 
promising.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   Functional index is based on on projection function: function which extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for
+   the expression expression(bookinfo-'isbn') defined
+   for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates
+   index only when function results are different. It allows to eliminate index update and use HOT update.
+   But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect
+   the function value is small, then marking index as projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762..b73165f 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expression on update to check if its values is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1301,7 +1310,7 @@ fillRelOptions(void *rdopts, Size basesize,
 break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
  options[i].gen->name);
 	}
diff --git a/src/

Re: [HACKERS] Surjective functional indexes

2017-09-13 Thread Konstantin Knizhnik



On 13.09.2017 14:00, Simon Riggs wrote:

On 13 September 2017 at 11:30, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:


The only reason of all this discussion about terms is that I need to choose
name for correspondent index option.
Simon think that we do not need this option at all. In this case we should
not worry about right term.
 From my point of view, "projection" is quite clear notion and not only for
mathematics. It is also widely used in IT and especially in DBMSes.

If we do have an option it won't be using fancy mathematical
terminology at all, it would be described in terms of its function,
e.g. recheck_on_update

Yes, I'd rather not have an option at all, just some simple code with
useful effect, like we have in many other places.


Yehhh,
After more thinking I found out that my idea to use table/index 
statistics (in particular the number of distinct values) to determine 
projection functions was wrong.
Consider a column bookinfo of jsonb type and the index expression 
(bookinfo->'ISBN').
Both can be considered unique. But it is an obvious example of a 
projection function, whose value is not changed if we update other 
information related to the book.

So this approach doesn't work. It looks like the only thing we can do to 
autotune is to collect our own statistic: how frequently a change of the 
attribute(s) does not affect the result of the function.
By default we can consider the function a projection and perform the 
comparison of old/new function results.
If after some number of comparisons the fraction of hits (where the value of 
the function is not changed) is smaller than some threshold (0.5?, 0.9?, ...), 
then we can mark the index as non-projective
and eliminate these checks in the future. But it will require extending the 
index statistics. Do we really need/want it?


Despite the possibility of implementing autotuning, I still think that we 
should have a manual switch, no matter how it is named.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-09-13 Thread Konstantin Knizhnik



On 13.09.2017 13:14, Christoph Berg wrote:

Re: Konstantin Knizhnik 2017-09-13 
<2393c4b3-2ec4-dc68-4ea9-670597b56...@postgrespro.ru>


On 13.09.2017 10:51, Christoph Berg wrote:

Re: Konstantin Knizhnik 2017-09-01 
<f530ede0-1bf6-879c-c362-34325514f...@postgrespro.ru>

+   Functional index is based on on projection function: function which 
extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective 
function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed.

This is Just Wrong. I still think what you are doing here doesn't have
anything to do with the function being injective or not.

Sorry, can you please explain what is wrong?

I don't get why you are reasoning about "projection" ->
"non-injective" -> "injective". Can't you try to explain what this
functionality is about without abusing math terms that just mean
something else in the rest of the world?


I tried to explain it in my previous e-mail. In most cases (it is just 
my feeling, maybe it is wrong), functional indexes are built for complex 
types like JSON, arrays, structs, ...,
and the index expression extracts some component of this compound value. It 
means that even if the underlying column changes, there is a good chance 
that the value of the index function is not changed. So there is no need to 
update the index and we can use HOT, which can increase performance several 
times.


The only reason for all this discussion about terms is that I need to 
choose a name for the corresponding index option.
Simon thinks that we do not need this option at all. In that case we 
should not worry about the right term.
From my point of view, "projection" is a quite clear notion, and not only 
in mathematics. It is also widely used in IT and especially in DBMSes.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-09-13 Thread Konstantin Knizhnik



On 13.09.2017 10:51, Christoph Berg wrote:

Re: Konstantin Knizhnik 2017-09-01 
<f530ede0-1bf6-879c-c362-34325514f...@postgrespro.ru>

+   Functional index is based on on projection function: function which 
extract subset of its argument.
+   In mathematic such functions are called non-injective. For injective 
function if any attribute used in the indexed
+   expression is changed, then value of index expression is also changed.

This is Just Wrong. I still think what you are doing here doesn't have
anything to do with the function being injective or not.


Sorry, can you please explain what is wrong?
The problem I am trying to solve comes from a particular use case: a 
functional index on part of a JSON column.
Usually such an index is built on persistent attributes which are rarely 
changed, like ISBN...
Right now any update of the JSON column disables HOT updates, even if the 
update doesn't really affect the index.
So instead of disabling HOT just based on the mask of modified attributes, I 
suggest comparing the old and new values of the index expression.

Such behavior can significantly (several times) increase performance, 
but only for "projection" functions.
There was a long discussion in this thread about the right notion for such 
functions (surjective, non-injective, projection),
but I think the criterion is quite obvious.

Simon proposes to eliminate the "projection" property and use autotuning to 
determine the optimal behavior.
I still think that such an option will be useful, but we can really use 
statistics to compare the number of unique values for the index function and 
for its argument(s).
If they are similar, then most likely the function is injective, so it 
produces different results for different argument values.
Then there is no sense in spending extra CPU time calculating the old and new 
values of the function.

This is what I am going to implement now.

So I would be pleased if you could explain your concerns and 
suggestions (if you have any) more precisely.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-09-12 Thread Konstantin Knizhnik



On 12.09.2017 19:28, Simon Riggs wrote:

On 1 September 2017 at 09:47, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

On 01.09.2017 09:25, Simon Riggs wrote:

On 1 September 2017 at 05:40, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:

On Fri, Jun 9, 2017 at 8:08 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Attached please find rebased version of the patch.
Now "projection" attribute is used instead of surjective/injective.

Hi Konstantin,

This still applies but it doesn't compile after commits 2cd70845 and
c6293249.  You need to change this:

Form_pg_attribute att = RelationGetDescr(indexDesc)->attrs[i];

... to this:

Form_pg_attribute att = TupleDescAttr(RelationGetDescr(indexDesc),
i);

Thanks!

Does the patch work fully with that change? If so, I will review.


Attached please find rebased version of the patch.
Yes, I checked that it works after this fix.
Thank you in advance for review.

Thanks for the patch. Overall looks sound and I consider that we are
working towards commit for this.

The idea is that we default "projection = on", and can turn it off in
case the test is expensive. Why bother to have the option? (No docs at
all then!) Why not just evaluate the test and autotune whether to make
the test again in the future? That way we can avoid having an option
completely. I am imagining collecting values on the relcache entry for
the index.


Autotuning is definitely a good thing. But I do not think that excludes 
having an explicit parameter for manual tuning.
For some functional indexes the DBA or programmer knows for sure that the index 
does not perform a projection,
for example if it translates or changes the encoding of the original key. It 
seems to me that we should make it possible to
declare such an index as non-projective and not rely on autotuning.

Also I have some doubts concerning using autotuning in this case. First of 
all, it is very hard to estimate the complexity of the test.
How can we measure it? Calculate the average execution time? It can vary between 
different systems and greatly depends on system load...
Somehow calculate the cost of the indexed expression? That may also not always 
produce the expected result.

Moreover, in some cases the test may be inexpensive but still useless, if the 
index expression specifies a one-to-one mapping (for example a function 
reversing the key).
Autotuning will never be able to reliably determine whether an indexed 
expression is a projection or not.

It seems more precise to compare the statistics for the source column and the 
index expression.
If they are similar, then most likely the index expression is not a 
projection...

I will think more about it; a rough sketch of such a comparison is below.
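
Something along these lines is what I have in mind (a sketch only; it assumes the 
expression index statistics are visible in pg_stats under the index name with 
attname 'expr', and the table/index names are made up):

SELECT c.n_distinct AS column_ndistinct,
       e.n_distinct AS expr_ndistinct
FROM pg_stats c, pg_stats e
WHERE c.tablename = 'books'          AND c.attname = 'bookinfo'
  AND e.tablename = 'books_isbn_idx' AND e.attname = 'expr';
-- similar n_distinct values would suggest an injective (non-projection) expression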


To implement autotuning we would need to instrument the execution. We
could then display the collected value via EXPLAIN, so we could just
then use EXPLAIN in your tests rather than implementing a special
debug mode just for testing. We could also pass that information thru
to stats as well.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Cached plans and statement generalization

2017-09-12 Thread Konstantin Knizhnik



On 11.09.2017 12:24, Konstantin Knizhnik wrote:

Attached please find rebased version of the patch.
There are the updated performance results (pgbench -s 100 -c 1):

protocol (-M)         read-write   read-only (-S)
simple                      3327            19325
extended                    2256            16908
prepared                    6145            39326
simple+autoprepare          4950            34841




One more patch, passing all regression tests with autoprepare_threshold=1.
I still do not think that it should be switched on by default...
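
For reference, usage would look roughly like this (a sketch; the GUC name 
autoprepare_threshold comes from the patch, but my reading of its semantics - the 
number of executions of the same literal statement before it is generalized and 
cached - is an assumption):

SET autoprepare_threshold = 5;
-- after five executions this literal query is generalized to
-- 'SELECT abalance FROM pgbench_accounts WHERE aid = $1' and its plan is reused
SELECT abalance FROM pgbench_accounts WHERE aid = 26;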

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index e3eb0c5..17f3dfd 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3688,6 +3688,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform raw parse tree.
+ *
+ * This function implements a slightly different approach to tree updates than expression_tree_mutator().
+ * The callback is given a pointer to the pointer to the current node and can update that field instead of returning a reference to a new node.
+ * This makes it possible to remember changes and easily revert them without an extra traversal of the tree.
+ *
+ * This function does not need the QTW_DONT_COPY_QUERY flag: it never implicitly copies tree nodes, doing in-place updates.
+ *
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs. If some other node is visited, iteration is immediately stopped and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case T_ParamRef:
+		case T_A_Const:
+		case T_A_Star:
+			/* primitive node types with no subnodes */
+			break;
+		case T_Alias:
+			/* we assume the colnames list isn't interesting */
+			break;
+		case T_RangeVar:
+			return mutator(&((RangeVar *) node)->alias, context);
+		case T_GroupingFunc:
+			return mutator(&((GroupingFunc *) node)->args, context);
+		case T_SubLink:
+			{
+SubLink	   *sublink = (SubLink *) node;
+
+if (mutator(&sublink->testexpr, context))
+	return true;
+/* we assume the operName is not interesting */
+if (mutator(&sublink->subselect, context))
+	return true;
+			}
+			break;
+		case T_CaseExpr:
+			{
+CaseExpr   *caseexpr = (CaseExpr *) node;
+
+if (mutator(&caseexpr->arg, context))
+	return true;
+/* we assume the mutator doesn't care about CaseWhens, either */
+foreach(temp, caseexpr->args)
+{
+	CaseWhen   *when = (CaseWhen *) lfirst(temp);
+
+	Assert(IsA(when, CaseWhen));
+	if (mutator(&when->expr, context))
+		return true;
+	if (mutator(&when->result, context))
+		return true;
+}
+if (mutator(&caseexpr->defresult, context))
+	return true;
+			}
+			break;
+		case T_RowExpr:
+			/* Assume colnames isn't interesting */
+			return mutator(&((RowExpr *) node)->args, context);
+		case T_CoalesceExpr:
+			return mutator(&((CoalesceExpr *) node)->args, context);
+		case T_MinMaxExpr:
+			return mutator(&((MinMaxExpr *) node)->args, context);
+		case T_XmlExpr:
+			{
+XmlExpr	   *xexpr = (XmlExpr *) node;
+
+if (mutator(&xexpr->named_args, context))
+	return true;
+/* we assume the mutator doesn't care about arg_names */
+if (mutator(&xexpr->args, context))
+	return true;
+			}
+			break;
+		case T_NullTest:
+			return mutator(&((NullTest *) node)->arg, context);
+		case T_BooleanTest:
+			return mutator(&((BooleanTest *) node)->arg, context);
+		case T_JoinExpr:
+			{
+JoinExpr   *join = (JoinExpr *) node;
+
+if (mutator(&join->larg, context))
+	return true;
+if (mutator(&join->rarg, context))
+	return true;
+if (mutator(&join->quals, context))
+	return true;
+if (mutator(&join->alias, context))
+	return true;
+/* using list is deemed uninteresting */
+			}
+			break;
+		case T_IntoClause:
+			{
+IntoClause *into = (IntoClause *) node;
+
+if (mutator(&into->rel, context))
+	return true;
+/* colNames, options are deemed uninteresting */
+/* viewQuery should be null in raw parsetree, but check it */
+if (mutato

Re: [HACKERS] Cached plans and statement generalization

2017-09-11 Thread Konstantin Knizhnik



On 09.09.2017 06:35, Thomas Munro wrote:

On Fri, May 26, 2017 at 3:54 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Attached please find rebased version of the autoprepare patch based on Tom's
proposal (perform analyze for tree with constant literals and then replace
them with parameters).
Also I submitted this patch for the Autumn commitfest.

The patch didn't survive the Summer bitrotfest.  Could you please rebase it?


Attached please find rebased version of the patch.
Here are the updated performance results (pgbench -s 100 -c 1):

protocol (-M)      | read-write | read-only (-S)
-------------------+------------+----------------
simple             |       3327 |          19325
extended           |       2256 |          16908
prepared           |       6145 |          39326
simple+autoprepare |       4950 |          34841



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index e3eb0c5..17f3dfd 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3688,6 +3688,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform raw parse tree.
+ *
+ * This function implements a slightly different approach to tree updates than expression_tree_mutator().
+ * The callback is given a pointer to the pointer to the current node and can update this field instead of returning a reference to a new node.
+ * This makes it possible to remember changes and easily revert them without an extra traversal of the tree.
+ *
+ * This function does not need the QTW_DONT_COPY_QUERY flag: it never implicitly copies tree nodes; it updates them in place.
+ *
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs. If some other node is visited, iteration is immediately stopped and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case T_ParamRef:
+		case T_A_Const:
+		case T_A_Star:
+			/* primitive node types with no subnodes */
+			break;
+		case T_Alias:
+			/* we assume the colnames list isn't interesting */
+			break;
+		case T_RangeVar:
+			return mutator(&((RangeVar *) node)->alias, context);
+		case T_GroupingFunc:
+			return mutator(&((GroupingFunc *) node)->args, context);
+		case T_SubLink:
+			{
+SubLink	   *sublink = (SubLink *) node;
+
+if (mutator(&sublink->testexpr, context))
+	return true;
+/* we assume the operName is not interesting */
+if (mutator(&sublink->subselect, context))
+	return true;
+			}
+			break;
+		case T_CaseExpr:
+			{
+CaseExpr   *caseexpr = (CaseExpr *) node;
+
+if (mutator(&caseexpr->arg, context))
+	return true;
+/* we assume the mutator doesn't care about CaseWhens, either */
+foreach(temp, caseexpr->args)
+{
+	CaseWhen   *when = (CaseWhen *) lfirst(temp);
+
+	Assert(IsA(when, CaseWhen));
+	if (mutator(&when->expr, context))
+		return true;
+	if (mutator(&when->result, context))
+		return true;
+}
+if (mutator(&caseexpr->defresult, context))
+	return true;
+			}
+			break;
+		case T_RowExpr:
+			/* Assume colnames isn't interesting */
+			return mutator(&((RowExpr *) node)->args, context);
+		case T_CoalesceExpr:
+			return mutator(&((CoalesceExpr *) node)->args, context);
+		case T_MinMaxExpr:
+			return mutator(&((MinMaxExpr *) node)->args, context);
+		case T_XmlExpr:
+			{
+XmlExpr	   *xexpr = (XmlExpr *) node;
+
+if (mutator(&xexpr->named_args, context))
+	return true;
+/* we assume the mutator doesn't care about arg_names */
+if (mutator(&xexpr->args, context))
+	return true;
+			}
+			break;
+		case T_NullTest:
+			return mutator(&((NullTest *) node)->arg, context);
+		case T_BooleanTest:
+			return mutator(&((BooleanTest *) node)->arg, context);
+		case T_JoinExpr:
+			{
+JoinExpr   *join = (JoinExpr *) node;
+
+if (mutator(&join->larg, context))
+	return true;
+if (mutator(&join->rarg, context))
+	return true;
+if (mutator(&join->quals, context))
+	return true;
+if (mutator(&join->alias, context))
+	return true;
+/* using list is deemed uninteresting */
+			}
+			break;
+		

Re: [HACKERS] Secondary index access optimizations

2017-09-07 Thread Konstantin Knizhnik



On 07.09.2017 13:00, Thomas Munro wrote:

On Sun, Sep 3, 2017 at 4:34 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Thank you for review.
I attached new version of the patch with
remove_restrictions_implied_by_constraints() function.
Concerning failed tests - this is actually result of this optimization:
extra filter conditions are removed from query plans.
Sorry, I have not included updated version of expected test output files to
the patch.
Now I did it.

A regression test under contrib/postgres_fdw now fails:

- Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T
1" WHERE (("C 1" IS NOT NULL))
+ Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1"

(("C 1" IS NOT NULL)) is indeed redundant in that case, because column
"C 1" was declared to be NOT NULL.  But:

1.  Do we want to go this far?  Note that this is not involving
inheritance and constraint exclusion.  I don't immediately see any
reason why not, but I'm not sure.

2.  If yes, then this postgres_fdw test should be changed, because I
think it was trying to demonstrate that IS NOT NULL expressions are
sent to remote databases -- it would need to be changed so that it
tries that with a column that is actually nullable.

I do not see any reason why we should disable this optimization in the case of FDW.

And disabling it would require some extra effort...

So I have updated test for postgres_fdw, replacing query
 SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;
with
 SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;

Now it checks two things:
1. That the NOT NULL check is passed to the foreign server for a nullable column (c3).
2. That the NOT NULL check is excluded from the query execution plan when it can be omitted because the column is not nullable.


An updated version of the patch is attached to this mail.
I also added support for the date type to operator_predicate_proof(), to be able to imply (logdate <= '2017-03-31') from (logdate < '2017-04-01').
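
A quick sanity check of that implication for the date type (dates are discrete, so the two predicates agree on every possible value):

SELECT d,
       d <  DATE '2017-04-01' AS lt_apr_1,
       d <= DATE '2017-03-31' AS le_mar_31
  FROM (VALUES (DATE '2017-03-30'),
               (DATE '2017-03-31'),
               (DATE '2017-04-01')) AS t(d);
-- both flags are true for 2017-03-30 and 2017-03-31 and false for 2017-04-01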


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index c19b331..a9cce14 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -626,12 +626,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  
--
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
+QUERY PLAN
+--
  Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
-   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
+   Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
 (3 rows)
 
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 5f65d9d..2d816db 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -292,7 +292,7 @@ RESET enable_nestloop;
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
+EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL and c3 is not null;-- NullTest
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1;  -- OpExpr(l)
 EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE 1 = c1!;   -- OpExpr(r)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2d7e1d8..5de67ce 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -344,6 +344,7 @@ s

Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0

2017-09-05 Thread Konstantin Knizhnik



On 04.09.2017 23:52, Andres Freund wrote:


Yea, I've changed that already, although it's currently added earlier,
because the alignment is needed before, to access the column correctly.
I've also made number of efficiency improvements, primarily to access
columns with an absolute offset if all preceding ones are fixed width
not null columns - that is quite noticeable performancewise.


Unfortunately, in most real tables columns are nullable.
I wonder if we can perform some optimization in this case (assuming that in typical cases a column contains either mostly non-null values or mostly null values).


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Secondary index access optimizations

2017-09-05 Thread Konstantin Knizhnik



On 05.09.2017 04:02, Amit Langote wrote:

Like Thomas, I'm not so sure about the whole predtest.c patch.  The core
logic in operator_predicate_proof() should be able to conclude that, say,
k < 21 is implied by k <= 20, which you are trying to address with some
special case code.  If there is still problem you think need to be fixed
here, a better place to look at would be somewhere around get_btree_test_op().


Frankly speaking, I also do not like this part of my patch.
I will be pleased if you or somebody else can propose a better solution.
I do not understand how get_btree_test_op() can help here.

Yes, k < 21 is implied by k <= 20. That is based on generic properties of the < and <= operators.
But I need to prove something different: having the table partition constraint (k < 21), I want to remove the predicate (k <= 20) from the query.
In other words, operator_predicate_proof() should be able to conclude that (k <= 20) is implied by (k < 21).
But this is true only for integer types, not for floating point types. And the Postgres operator definition doesn't provide a way to determine that a user defined type is an integer-like type, i.e. one whose values make such a conclusion valid.
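
A trivial counterexample shows why the proof step is safe only for discrete types: a value like 20.5 satisfies (k < 21) but not (k <= 20), while no integer lies strictly between 20 and 21.

SELECT 20.5 < 21 AS satisfies_lt_21, 20.5 <= 20 AS satisfies_le_20;  -- true, false
SELECT 20   < 21 AS satisfies_lt_21, 20   <= 20 AS satisfies_le_20;  -- true, true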


Why do I think it is important? Certainly, it is possible to rewrite the query as (k < 21), and then no changes in operator_predicate_proof() are needed.
Assume the most natural use case: I have some positive integer key and I want to range partition the table by this key, for example with interval 10000.
Currently the standard PostgreSQL partitioning mechanism requires specifying intervals with an open high boundary.
So if I want the first partition to contain the interval [1,10000], the second - [10001,20001],... I have to create the partitions this way:


create table bt (k integer, v integer) partition by range (k);
create table dt1 partition of bt for values from (1) to (10001);
create table dt2 partition of bt for values from (10001) to (20001);
...

If I want to write a query inspecting data of a particular partition, then most likely I will use the BETWEEN operator:

SELECT * FROM bt WHERE k BETWEEN 1 and 10000;

But right now operator_predicate_proof() is not able to conclude that the predicate (k BETWEEN 1 and 10000), transformed to (k >= 1 AND k <= 10000), is equivalent to (k >= 1 AND k < 10001), which is used as the partition constraint.

Another very popular use case (mentioned, for example, in the PostgreSQL partitioning documentation: https://www.postgresql.org/docs/10/static/ddl-partitioning.html) is using a date as the partition key:

CREATE TABLE measurement (
    city_id     int not null,
    logdate     date not null,
    peaktemp    int,
    unitsales   int
) PARTITION BY RANGE (logdate);


CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')


Assume that I now want to get the measurements for March.

There are three ways to write this query:

select * from measurement where extract(month from logdate) = 3;
select * from measurement where logdate between '2006-03-01' AND '2006-03-31';
select * from measurement where logdate >= '2006-03-01' AND logdate < '2006-04-01';

Right now an optimal query plan will be constructed only for the last query.
Unfortunately, my patch does not yet cover the date type.
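
The difference can be seen by comparing the plans of the second and the third form against the measurement table above (a sketch; plan output omitted because it depends on the patch): the half-open form matches the partition constraint literally, while the BETWEEN form needs exactly the kind of <= / < implication discussed here.

-- needs the (logdate <= '2006-03-31') implied-by (logdate < '2006-04-01') proof
EXPLAIN SELECT * FROM measurement
 WHERE logdate BETWEEN '2006-03-01' AND '2006-03-31';

-- matches the partition constraint (logdate >= '2006-03-01' AND logdate < '2006-04-01') exactly
EXPLAIN SELECT * FROM measurement
 WHERE logdate >= '2006-03-01' AND logdate < '2006-04-01';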



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] JIT compiling expressions/deform + inlining prototype v2.0

2017-09-04 Thread Konstantin Knizhnik



On 01.09.2017 09:41, Andres Freund wrote:

Hi,

I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming.  I've since then worked a lot on this.

Here's an initial, not really pretty but functional, submission. This
supports all types of expressions, and tuples, and allows, albeit with
some drawbacks, inlining of builtin functions.  Between the version at
[1] and this I'd done some work in c++, because that allowed to
experiment more with llvm, but I've now translated everything back.
Some features I'd to re-implement due to limitations of C API.


I've whacked this around quite heavily today, this likely has some new
bugs, sorry for that :(


Can you please clarify the following fragment calculating attributes 
alignment:



/* compute what following columns are aligned to */
+if (att->attlen < 0)
+{
+/* can't guarantee any alignment after varlen field */
+attcuralign = -1;
+}
+else if (att->attnotnull && attcuralign >= 0)
+{
+Assert(att->attlen > 0);
+attcuralign += att->attlen;
+}
+else if (att->attnotnull)
+{
+/*
+ * After a NOT NULL fixed-width column, alignment is
+ * guaranteed to be the minimum of the forced alignment and
+ * length.  XXX
+ */
+attcuralign = alignto + att->attlen;
+Assert(attcuralign > 0);
+}
+else
+{
+//elog(LOG, "attnotnullreset: %d", attnum);
+attcuralign = -1;
+}


I wonder why in this branch (att->attnotnull && attcuralign >= 0) we are not adding "alignto", and why the comment in the following branch, else if (att->attnotnull), seems unrelated to that branch, because in that case attcuralign is expected to be less than zero, which means that the previous attribute is a varlen field.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Secondary index access optimizations

2017-09-04 Thread Konstantin Knizhnik



On 04.09.2017 12:59, Amit Langote wrote:

Hi Konstantin,

On 2017/09/04 18:19, Konstantin Knizhnik wrote:

On 04.09.2017 05:38, Amit Langote wrote:

On 2017/09/02 12:44, Thomas Munro wrote:

On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
QUERY PLAN
--
   Append  (cost=0.29..15.63 rows=2 width=8)
 ->  Index Scan using dti1 on dt1  (cost=0.29..8.30 rows=1 width=8)
   Index Cond: (v = 100)
 ->  Index Scan using dti2 on dt2  (cost=0.29..7.33 rows=1 width=8)
   Index Cond: (v = 100)
   Filter: (k <= 20000)
(6 rows)

+1

This seems like a good feature to me: filtering stuff that is
obviously true is a waste of CPU cycles and may even require people to
add redundant stuff to indexes.  I was pondering something related to
this over in the partition-wise join thread (join quals that are
implied by partition constraints and should be discarded).

It'd be interesting to get Amit Langote's feedback, so I CC'd him.
I'd be surprised if he and others haven't got a plan or a patch for
this down the back of the sofa.

I agree that that's a good optimization in the cases it's correct.  Given
that check_index_predicates() already applies the same optimization when
considering using a partial index, it might make sense to try to do the
same even earlier for the table itself using its CHECK / NOT NULL
constraints as predicates (I said *earlier* because
relation_excluded_by_constrains happens for a relation before we look at
its indexes).  Also, at the end of relation_excluded_by_constraints() may
not be such a bad place to do this.

By the way, I read in check_index_predicates() that we should not apply
this optimization if the relation in question is a target of UPDATE /
DELETE / SELECT FOR UPDATE.

Please correct me if I am wrong, but it seems to me that in the case of table constraints it is not necessary to specially handle the update case.
As far as I understand, we need to leave the predicate in the plan in the case of partial indexes because, due to the "read committed" isolation policy, we may need to recheck that the tuple still satisfies the update condition (the tuple can be changed by some other committed transaction while we are waiting for it, and may not satisfy this condition any more).
But no transaction can change a tuple in such a way that it violates table constraints, right? So we do not need to recheck it.

Actually, I don't really know why check_index_predicates() skips this
optimization in the target relation case, just wanted to point out that
that's so.

Thinking a bit from what you wrote, maybe we need not worry about
EvalPlanQual in the context of your proposed optimization based on the
table's constraints.


Concerning your suggestion to merge the check_index_predicates() and remove_restrictions_implied_by_constraints() functions: maybe it can be done, but frankly speaking I do not see much sense in it - there are too many differences between these functions and too little code to reuse.

Maybe, you meant to address Thomas here. :)  Reading his comment again, I
too am a bit concerned about destructively modifying the input rel's
baserestrictinfo.  There should at least be a comment that that's being done.
But I have taken Thomas's comment into account and extracted the code updating the relation's baserestrictinfo from relation_excluded_by_constraints() into a remove_restrictions_implied_by_constraints() function. It is included in the new version of the patch.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Secondary index access optimizations

2017-09-04 Thread Konstantin Knizhnik



On 04.09.2017 05:38, Amit Langote wrote:

On 2017/09/02 12:44, Thomas Munro wrote:

On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
   QUERY PLAN
--
  Append  (cost=0.29..15.63 rows=2 width=8)
->  Index Scan using dti1 on dt1  (cost=0.29..8.30 rows=1 width=8)
  Index Cond: (v = 100)
->  Index Scan using dti2 on dt2  (cost=0.29..7.33 rows=1 width=8)
  Index Cond: (v = 100)
  Filter: (k <= 20000)
(6 rows)

+1

This seems like a good feature to me: filtering stuff that is
obviously true is a waste of CPU cycles and may even require people to
add redundant stuff to indexes.  I was pondering something related to
this over in the partition-wise join thread (join quals that are
implied by partition constraints and should be discarded).

It'd be interesting to get Amit Langote's feedback, so I CC'd him.
I'd be surprised if he and others haven't got a plan or a patch for
this down the back of the sofa.

I agree that that's a good optimization in the cases it's correct.  Given
that check_index_predicates() already applies the same optimization when
considering using a partial index, it might make sense to try to do the
same even earlier for the table itself using its CHECK / NOT NULL
constraints as predicates (I said *earlier* because
relation_excluded_by_constrains happens for a relation before we look at
its indexes).  Also, at the end of relation_excluded_by_constraints() may
not be such a bad place to do this.

By the way, I read in check_index_predicates() that we should not apply
this optimization if the relation in question is a target of UPDATE /
DELETE / SELECT FOR UPDATE.
Please correct me if I am wrong, but it seems to me that in the case of table constraints it is not necessary to specially handle the update case.
As far as I understand, we need to leave the predicate in the plan in the case of partial indexes because, due to the "read committed" isolation policy, we may need to recheck that the tuple still satisfies the update condition (the tuple can be changed by some other committed transaction while we are waiting for it, and may not satisfy this condition any more).
But no transaction can change a tuple in such a way that it violates table constraints, right? So we do not need to recheck it.


Concerning your suggestion to merge the check_index_predicates() and remove_restrictions_implied_by_constraints() functions: maybe it can be done, but frankly speaking I do not see much sense in it - there are too many differences between these functions and too little code to reuse.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] JIT & function naming

2017-09-03 Thread Konstantin Knizhnik

On 09/03/2017 02:59 AM, Andres Freund wrote:

Hi,

On 2017-08-31 23:41:31 -0700, Andres Freund wrote:

I previously had an early prototype of JITing [1] expression evaluation
and tuple deforming.  I've since then worked a lot on this.

Here's an initial, not really pretty but functional, submission.

One of the things I'm not really happy about yet is the naming of the
generated functions. Those primarily matter when doing profiling, where
the function name will show up when the profiler supports JIT stuff
(e.g. with a patch I proposed to LLVM that emits perf compatible output,
there's also existing LLVM support for a profiler by intel and
oprofile).

Currently there's essentially a per EState counter and the generated
functions get named deform$n and evalexpr$n. That allows for profiling
of a single query, because different compiled expressions are
disambiguated. It even allows to run the same query over and over, still
giving meaningful results.  But it breaks down when running multiple
queries while profiling - evalexpr0 can mean something entirely
different for different queries.

The best idea I have so far would be to name queries like
evalexpr_$fingerprint_$n, but for that we'd need fingerprinting support
outside of pg_stat_statement, which seems painful-ish.

Perhaps somebody has a better idea?


As far as I understand, we do not need a precise fingerprint.
So maybe just calculate some lightweight fingerprint?
For example, take the query text (es_sourceText from EState), replace all non-alphanumeric characters and spaces with '_', and take the first N (16?) characters of the result?
It seems to me that in most cases it will help to identify the query...
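
Just to show the idea (the real implementation would of course be done in C on es_sourceText; this SQL is only an illustration with a made-up query text):

SELECT left(regexp_replace('select count(*) from orders where price > 10',
                           '[^a-zA-Z0-9]', '_', 'g'),
            16) AS fingerprint;
-- yields something like: select_count____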


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Secondary index access optimizations

2017-09-02 Thread Konstantin Knizhnik

On 09/02/2017 06:44 AM, Thomas Munro wrote:

On Wed, Aug 16, 2017 at 9:23 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

postgres=# explain select * from bt where k between 1 and 20000 and v = 100;
   QUERY PLAN
--
  Append  (cost=0.29..15.63 rows=2 width=8)
->  Index Scan using dti1 on dt1  (cost=0.29..8.30 rows=1 width=8)
  Index Cond: (v = 100)
->  Index Scan using dti2 on dt2  (cost=0.29..7.33 rows=1 width=8)
  Index Cond: (v = 100)
  Filter: (k <= 20000)
(6 rows)

+1

This seems like a good feature to me: filtering stuff that is
obviously true is a waste of CPU cycles and may even require people to
add redundant stuff to indexes.  I was pondering something related to
this over in the partition-wise join thread (join quals that are
implied by partition constraints and should be discarded).

It'd be interesting to get Amit Langote's feedback, so I CC'd him.
I'd be surprised if he and others haven't got a plan or a patch for
this down the back of the sofa.

I might be missing some higher level architectural problems with the
patch, but for what it's worth here's some feedback after a first read
through:

--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root,
  if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false))
  return true;

+/*
+ * Remove from restrictions list items implied by table constraints
+ */
+safe_restrictions = NULL;
+foreach(lc, rel->baserestrictinfo)
+{
+RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);

I think the new way to write this is "RestrictInfo *rinfo =
lfirst_node(RestrictInfo, lc)".

+if (!predicate_implied_by(list_make1(rinfo->clause),
safe_constraints, false)) {
+safe_restrictions = lappend(safe_restrictions, rinfo);
+}
+}
+rel->baserestrictinfo = safe_restrictions;

It feels wrong to modify rel->baserestrictinfo in
relation_excluded_by_constraints().  I think there should probably be
a function with a name that more clearly indicates that it mutates its
input, and you should call that separately after
relation_excluded_by_constraints().  Something like
remove_restrictions_implied_by_constraints()?


It is because operator_predicate_proof is not able to understand that k < 20001 and k <= 20000 are equivalent for integer type.

[...]

  /*
   * operator_predicate_proof
  if (clause_const->constisnull)
  return false;

+if (!refute_it
+&& ((pred_op == Int4LessOrEqualOperator && clause_op ==
Int4LessOperator)
+|| (pred_op == Int8LessOrEqualOperator && clause_op ==
Int8LessOperator)
+|| (pred_op == Int2LessOrEqualOperator && clause_op ==
Int2LessOperator))
+&& pred_const->constbyval && clause_const->constbyval
+&& pred_const->constvalue + 1 == clause_const->constvalue)
+{
+return true;
+}
+

I'm less sure about this part.  It seems like a slippery slope.

A couple of regression test failures:

  inherit  ... FAILED
  rowsecurity  ... FAILED

  2 of 179 tests failed.


I didn't try to understand the rowsecurity one, but at first glance
all the differences reported in the inherit test are in fact cases
where your patch is doing the right thing and removing redundant
filters from scans.  Nice!


Thank you for the review.
I attached a new version of the patch with a remove_restrictions_implied_by_constraints() function.
Concerning the failed tests - this is actually the result of this optimization: extra filter conditions are removed from query plans.
Sorry, I had not included the updated versions of the expected test output files in the patch. Now I have done it.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2d7e1d8..5de67ce 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -344,6 +344,7 @@ set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 		switch (rel->rtekind)
 		{
 			case RTE_RELATION:
+remove_restrictions_implied_by_constraints(root, rel, rte);
 if (rte->relkind == RELKIND_FOREIGN_TABLE)
 {
 	/* Foreign table */
@@ -1047,6 +1048,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			set_dummy_rel_pathlist(childrel);
 			continue;
 		}
+		remove_restrictions_implied_by_constraints(root, childrel, childRTE);
 
 		/*
 		 * CE failed, so finish copying/modifying targetlist and join quals.
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/op

Re: [HACKERS] Surjective functional indexes

2017-09-01 Thread Konstantin Knizhnik


On 01.09.2017 09:25, Simon Riggs wrote:

On 1 September 2017 at 05:40, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:

On Fri, Jun 9, 2017 at 8:08 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Attached please find rebased version of the patch.
Now "projection" attribute is used instead of surjective/injective.

Hi Konstantin,

This still applies but it doesn't compile after commits 2cd70845 and
c6293249.  You need to change this:

   Form_pg_attribute att = RelationGetDescr(indexDesc)->attrs[i];

... to this:

   Form_pg_attribute att = TupleDescAttr(RelationGetDescr(indexDesc), i);

Thanks!

Does the patch work fully with that change? If so, I will review.


Attached please find rebased version of the patch.
Yes, I checked that it works after this fix.
Thank you in advance for review.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..52189ac 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   A functional index is based on a projection function: a function which extracts a subset of its argument.
+   In mathematics such functions are called non-injective. For an injective function, if any attribute used in the indexed
+   expression is changed, then the value of the index expression is also changed. So to check whether the index is affected by the
+   update, it is enough to check the set of changed fields. By default this parameter is assigned the value true and the function is considered
+   non-injective.
+   In this case a change of any indexed key does not mean that the value of the function is changed. For example,
+   the expression (bookinfo->'isbn') defined
+   for a column of JSON type changes only when the ISBN changes, which rarely happens. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares the values of the indexed expression for the old and updated tuple and updates
+   the index only when the function results are different. This allows eliminating the index update and using a HOT update.
+   But there are extra evaluations of the function. So if the function is expensive, or the probability that a change of the indexed column will not affect
+   the function value is small, then marking the index as a projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index ec10762..b73165f 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expression on update to check if its values is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1301,7 +1310,7 @@ fillRelOptions(void *rdopts, Size basesize,
 break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
  options[i].gen->name);
 	}
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e29c5ad..05e372f 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -56,6 +56,7 @@
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/index.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -74,7 +75,9 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
-
+#include "utils/memutils.h"
+#include "nodes/execnodes.h"
+#include "executor/executor.h"
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
@@ -126,6 +129,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
 static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
 	

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-23 Thread Konstantin Knizhnik



On 22.08.2017 17:27, Konstantin Knizhnik wrote:



On 18.08.2017 04:33, Robert Haas wrote:


It seems like a somewhat ad-hoc approach; it supposes that we can 
take any query produced by deparseSelectStmtForRel() and stick a 
LIMIT clause onto the very end and all will be well.  Maybe that's 
not a problematic assumption, not sure.  The grammar happens to allow 
both FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the 
latter syntax is documented.


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



I am not absolutely sure that it is possible to append a "LIMIT n" clause to any query which can be constructed by postgres_fdw for a foreign scan.
But I also do not know of an example where it is not possible. As you have mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.




I have inspected the deparseSelectStmtForRel function and now I am sure that appending LIMIT to the SQL statement generated by this function will not cause any problems.

It can produce only the following subset of SELECT:

SELECT <target list> FROM <table references> [ GROUP BY ... [ HAVING ... ] ] [ ORDER BY ... ] [ FOR UPDATE ... ];

The only suspicious clause is FOR UPDATE, but I have checked that "FOR UPDATE ... LIMIT n" is really accepted by the Postgres parser.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-22 Thread Konstantin Knizhnik



On 18.08.2017 04:33, Robert Haas wrote:


It seems like a somewhat ad-hoc approach; it supposes that we can take 
any query produced by deparseSelectStmtForRel() and stick a LIMIT 
clause onto the very end and all will be well.  Maybe that's not a 
problematic assumption, not sure.  The grammar happens to allow both 
FOR UPDATE LIMIT n and LIMIT n FOR UPDATE even though only the latter 
syntax is documented.


--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



I am not absolutely sure that it is possible to append a "LIMIT n" clause to any query which can be constructed by postgres_fdw for a foreign scan.
But I also do not know of an example where it is not possible. As you have mentioned, "FOR UPDATE LIMIT n" is currently recognized by Postgres.


Can you suggest how to implement limit push-down for FDWs in a better way?
Move deparseSelectStmtForRel() from postgresGetForeignPlan to postgresIterateForeignScan?
That seems problematic because much of the information required by deparseSelectStmtForRel is not available in postgresIterateForeignScan.
In principle, it is possible to somehow propagate it there, but from my point of view that is not the right approach...

IMHO there is some contradiction in the Postgres optimizer in that static information about the limit is not taken into account at the planning stage and is actually used only during query execution, when the pass_down_bound() function is called to propagate knowledge about the limit down through the plan nodes. Certainly I understand that this gives more flexibility: we can use information from previous steps of query execution which was not available at the planning stage.

But pushing down the limit at the planning stage requires too many changes. And the proposed patch is very small and non-invasive. And in principle it can be used not only by postgres_fdw, but also in other FDW implementations to push down information about LIMIT.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-17 Thread Konstantin Knizhnik



On 29.04.2017 00:13, Douglas Doole wrote:


If you add this to the commitfest app, more people might look at
it when the next commitfest opens.


I have added it. https://commitfest.postgresql.org/14/1119/

Also, it might help if you can provide a query/ies with numbers
where this optimization shows improvement.


I can't provide the real queries where we encountered the problem 
because they are internal. However I showed a simplified version of 
the queries in my first post.


On our queries, the change made quite a difference - execution time 
dropped from 31.4 seconds to 7.2 seconds. Explain analyze also shows 
that memory use dropped significantly and we didn't have to spill the 
sort to disk


From:

-> Sort (cost=989.95..1013.27 rows=9326 width=30) 
(node_startup_time/loop=31328.891, node_total_time/loop: 31329.756 
rows=2001 loops=1) Buffers: temp read=772 written=11201 lsm_bufmgr 
hits=3392 Sort Key: *** Sort Method: external merge Sort Space Used: 
89592 Sort Space Type: Disk


To:

-> Sort (cost=989.95..1013.27 rows=9326 width=30) 
(node_startup_time/loop=7123.275, node_total_time/loop: 7123.504 
rows=2001 loops=1) Buffers: lsm_bufmgr hits=3387 Sort Key: *** Sort 
Method: top-N heapsort Sort Space Used: 3256 Sort Space Type: Memory


Attached please find yet another small patch which pushes down LIMIT to ForeignScan.
I should note that currently the Postgres optimizer uses "Merge Append" and fetches only the required number of tuples from the remote nodes.
So even without LIMIT push-down, postgres_fdw will not pull the whole table from the remote host.
postgres_fdw uses a cursor for fetching data from the remote server. The default fetch size is 100, so even without a limit the remote query will fetch no more than 100 rows at a time from the remote site.
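
For completeness, that batch size is configurable through postgres_fdw's fetch_size option at the server or table level (the values below are only an example):

ALTER SERVER shard1 OPTIONS (ADD fetch_size '100');
ALTER FOREIGN TABLE t_fdw1 OPTIONS (ADD fetch_size '1000');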


Assume the following example:

postgres=# create extension postgres_fdw;
postgres=# create server shard1  FOREIGN DATA WRAPPER postgres_fdw 
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# create server shard2  FOREIGN DATA WRAPPER postgres_fdw 
options(dbname 'postgres', host 'localhost', port '5432');
postgres=# CREATE USER MAPPING for $user SERVER shard1 options (user 
'$user');
postgres=# CREATE USER MAPPING for $user SERVER shard2 options (user '$user');

postgres=# CREATE TABLE t(u integer primary key, v integer);
postgres=# CREATE TABLE t1(u integer primary key, v integer);
postgres=# CREATE TABLE t2(u integer primary key, v integer);
postgres=# insert into t1 values (generate_series(1,100000), random()*10);
postgres=# insert into t2 values (generate_series(1,100000), random()*10);
postgres=# CREATE FOREIGN TABLE t_fdw1() inherits (t) server shard1 
options(table_name 't1');
postgres=# CREATE FOREIGN TABLE t_fdw2() inherits (t) server shard2 
options(table_name 't2');



postgres=# explain analyze select * from t order by u limit 1;
  QUERY PLAN
---
 Limit  (cost=200.15..200.20 rows=1 width=8) (actual time=2.010..2.010 
rows=1 loops=1)
   ->  Merge Append  (cost=200.15..449.39 rows=5121 width=8) (actual 
time=2.009..2.009 rows=1 loops=1)

 Sort Key: t.u
 ->  Index Scan using t_pkey on t  (cost=0.12..8.14 rows=1 
width=8) (actual time=0.005..0.005 rows=0 loops=1)
         ->  Foreign Scan on t_fdw2  (cost=100.00..193.92 rows=2560 width=8) (actual time=1.074..1.074 rows=1 loops=1)
         ->  Foreign Scan on t_fdw1  (cost=100.00..193.92 rows=2560 width=8) (actual time=0.928..0.928 rows=1 loops=1)

 Planning time: 0.769 ms
 Execution time: 6.837 ms
(8 rows)

As you can see foreign scan fetches only one row from each remote node.

But still pushing down limit can have positive effect on performance, 
especially if SORT can be replaced with TOP-N.

I got the following results (time in seconds):

Query                              | original | limit push down
-----------------------------------+----------+----------------
select * from t order by u limit 1 |    2.276 |           1.777
select * from t order by v limit 1 |      100 |              42


There is an index on "u", so fetching the records with the smallest "u" values can be done without sorting, so the times are similar.
But in the case of sorting by "v", pushing down the limit allows using TOP-1 instead of a global sort, and it reduces query execution time by more than 2 times.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 080cb0a..e3847ce 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2949,7 +2949,8 @@ create_cursor(ForeignScanState *node)
 	initStringInfo(&buf);
 	appendStringInfo(&buf, "DECLARE c%u CURSOR FOR\n%s",
 					 fsstate->cursor_number, fsstate->query);
-
+	if (node->limit > 0)
+		appendStringInfo(&buf, " LIMIT %l

Re: [HACKERS] Secondary index access optimizations

2017-08-16 Thread Konstantin Knizhnik

On 14.08.2017 19:33, Konstantin Knizhnik wrote:



On 14.08.2017 12:37, Konstantin Knizhnik wrote:

Hi hackers,

I am trying to compare different ways of optimizing work with huge 
append-only tables in PostgreSQL where primary key is something like 
timestamp and queries are usually accessing most recent data using 
some secondary keys. Size of secondary index is one of the most 
critical factors limiting  insert/search performance. As far as data 
is inserted in timestamp ascending order, access to primary key is 
well localized and accessed tables are present in memory. But if we 
create secondary key for the whole table, then access to it will 
require random reads from the disk and significantly decrease 
performance.


There are two well known solutions to the problem:
1. Table partitioning
2. Partial indexes

These are the approaches I want to compare. First of all I want to check whether the optimizer is able to generate an efficient query execution plan covering different time intervals.

Unfortunately, in both cases the generated plan is not optimal.

1. Table partitioning:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random());
insert into part2 values (generate_series(10001,20000), random());
explain select * from base where k between 1 and 20000 and v = 100;
                              QUERY PLAN
-----------------------------------------------------------------------
 Append  (cost=0.00..15.65 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
         Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))
   ->  Index Scan using pi2 on part2  (cost=0.29..7.34 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))

Questions:
- Is there some way to avoid the sequential scan of the parent table? Yes, it is empty and so the sequential scan will not take much time, but ... it still requires some additional actions and so increases query execution time.
- Why does the index scan of the partition indexes include the filter condition if the check constraint guarantees that all records of this partition match the search predicate?



2. Partial indexes:

create table t (k integer primary key, v integer);
insert into t values (generate_series(1,20000),random());
create index i1 on t(v) where k between 1 and 10000;
create index i2 on t(v) where k between 10001 and 20000;
postgres=# explain select * from t where k between 1 and 10000 and v = 100;

 QUERY PLAN

 Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
   Index Cond: (v = 100)
(2 rows)


Here we get perfect plan. Let's try to extend search interval:


postgres=# explain select * from t where k between 1 and 20000 and v = 100;

QUERY PLAN
--
 Index Scan using t_pkey on t  (cost=0.29..760.43 rows=1 width=8)
   Index Cond: ((k >= 1) AND (k <= 20000))
   Filter: (v = 100)
(3 rows)

Unfortunately in this case Postgres is not able to apply partial 
indexes.

And this is what I expected to get:

postgres=# explain select * from t where k between 1 and 10000 and v = 100 union all select * from t where k between 10001 and 20000 and v = 100;

  QUERY PLAN
--
 Append  (cost=0.29..14.58 rows=2 width=8)
   ->  Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using i2 on t t_1  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)


I wonder if there are some fundamental problems in supporting these two things in the optimizer:
1. Remove the search condition on the primary key if it is fully satisfied by the derived table's check constraint.
2. Append index scans of several partial indexes if the specified interval is covered by their conditions.

I wonder if someone is familiar with this part of the optimizer and can easily fix it.
Otherwise I am going to spend some time on solving these problems (if the community thinks that such optimizations will be useful).




Replying to myself: the following small patch removes redundant checks 
from index scans for derived tables:



diff --git a/src/backend/optimizer/util/plancat.c 
b/src/backend/optimizer/util/plancat.c

index 939045d..1f7c9cf 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo 
*root,
if (predicate_refuted_by(s

Re: [HACKERS] Secondary index access optimizations

2017-08-14 Thread Konstantin Knizhnik



On 14.08.2017 12:37, Konstantin Knizhnik wrote:

Hi hackers,

I am trying to compare different ways of optimizing work with huge 
append-only tables in PostgreSQL where primary key is something like 
timestamp and queries are usually accessing most recent data using 
some secondary keys. Size of secondary index is one of the most 
critical factors limiting  insert/search performance. As far as data 
is inserted in timestamp ascending order, access to primary key is 
well localized and accessed tables are present in memory. But if we 
create secondary key for the whole table, then access to it will 
require random reads from the disk and significantly decrease 
performance.


There are two well known solutions to the problem:
1. Table partitioning
2. Partial indexes

These are the approaches I want to compare. First of all I want to check whether the optimizer is able to generate an efficient query execution plan covering different time intervals.

Unfortunately, in both cases the generated plan is not optimal.

1. Table partitioning:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random());
insert into part2 values (generate_series(10001,20000), random());
explain select * from base where k between 1 and 20000 and v = 100;
                              QUERY PLAN
-----------------------------------------------------------------------
 Append  (cost=0.00..15.65 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
         Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))
   ->  Index Scan using pi2 on part2  (cost=0.29..7.34 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))

Questions:
- Is there some way to avoid the sequential scan of the parent table? Yes, it is empty and so the sequential scan will not take much time, but ... it still requires some additional actions and so increases query execution time.
- Why does the index scan of the partition indexes include the filter condition if the check constraint guarantees that all records of this partition match the search predicate?



2. Partial indexes:

create table t (k integer primary key, v integer);
insert into t values (generate_series(1,20000),random());
create index i1 on t(v) where k between 1 and 10000;
create index i2 on t(v) where k between 10001 and 20000;
postgres=# explain select * from t where k between 1 and 10000 and v = 100;

 QUERY PLAN

 Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
   Index Cond: (v = 100)
(2 rows)


Here we get perfect plan. Let's try to extend search interval:


postgres=# explain select * from t where k between 1 and 20000 and v = 100;

QUERY PLAN
--
 Index Scan using t_pkey on t  (cost=0.29..760.43 rows=1 width=8)
   Index Cond: ((k >= 1) AND (k <= 20000))
   Filter: (v = 100)
(3 rows)

Unfortunately in this case Postgres is not able to apply partial indexes.
And this is what I expected to get:

postgres=# explain select * from t where k between 1 and 10000 and v = 100 union all select * from t where k between 10001 and 20000 and v = 100;

  QUERY PLAN
--
 Append  (cost=0.29..14.58 rows=2 width=8)
   ->  Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using i2 on t t_1  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)


I wonder if there are some fundamental problems in supporting these two things in the optimizer:
1. Remove the search condition on the primary key if it is fully satisfied by the derived table's check constraint.
2. Append index scans of several partial indexes if the specified interval is covered by their conditions.

I wonder if someone is familiar with this part of the optimizer and can easily fix it.
Otherwise I am going to spend some time on solving these problems (if the community thinks that such optimizations will be useful).




Replying to myself: the following small patch removes redundant checks 
from index scans for derived tables:



diff --git a/src/backend/optimizer/util/plancat.c 
b/src/backend/optimizer/util/plancat.c

index 939045d..1f7c9cf 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1441,6 +1441,20 @@ relation_excluded_by_constraints(PlannerInfo *root,
if (predicate_refuted_by(safe_constraints, 
rel->baserestrict

[HACKERS] Secondary index access optimizations

2017-08-14 Thread Konstantin Knizhnik

Hi hackers,

I am trying to compare different ways of optimizing work with huge 
append-only tables in PostgreSQL where primary key is something like 
timestamp and queries are usually accessing most recent data using some 
secondary keys. Size of secondary index is one of the most critical 
factors limiting  insert/search performance. As far as data is inserted 
in timestamp ascending order, access to primary key is well localized 
and accessed tables are present in memory. But if we create secondary 
key for the whole table, then access to it will require random reads 
from the disk and significantly decrease performance.


There are two well known solutions to the problem:
1. Table partitioning
2. Partial indexes

These are the approaches I want to compare. First of all I want to check whether the optimizer is able to generate an efficient query execution plan covering different time intervals.

Unfortunately, in both cases the generated plan is not optimal.

1. Table partitioning:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random());
insert into part2 values (generate_series(10001,20000), random());
explain select * from base where k between 1 and 20000 and v = 100;
                              QUERY PLAN
-----------------------------------------------------------------------
 Append  (cost=0.00..15.65 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
         Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))
   ->  Index Scan using pi2 on part2  (cost=0.29..7.34 rows=1 width=8)
         Index Cond: (v = 100)
         Filter: ((k >= 1) AND (k <= 20000))

Questions:
- Is there some way to avoid the sequential scan of the parent table? Yes, it is empty and so the sequential scan will not take much time, but ... it still requires some additional actions and so increases query execution time.
- Why does the index scan of the partition indexes include the filter condition if the check constraint guarantees that all records of this partition match the search predicate?



2. Partial indexes:

create table t (k integer primary key, v integer);
insert into t values (generate_series(1,20000),random());
create index i1 on t(v) where k between 1 and 10000;
create index i2 on t(v) where k between 10001 and 20000;
postgres=# explain select * from t where k between 1 and 10000 and v = 100;
 QUERY PLAN

 Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
   Index Cond: (v = 100)
(2 rows)


Here we get perfect plan. Let's try to extend search interval:


postgres=# explain select * from t where k between 1 and 20000 and v = 100;
QUERY PLAN
--
 Index Scan using t_pkey on t  (cost=0.29..760.43 rows=1 width=8)
   Index Cond: ((k >= 1) AND (k <= 20000))
   Filter: (v = 100)
(3 rows)

Unfortunately in this case Postgres is not able to apply partial indexes.
And this is what I expected to get:

postgres=# explain select * from t where k between 1 and 10000 and v = 100 union all select * from t where k between 10001 and 20000 and v = 100;

  QUERY PLAN
--
 Append  (cost=0.29..14.58 rows=2 width=8)
   ->  Index Scan using i1 on t  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using i2 on t t_1  (cost=0.29..7.28 rows=1 width=8)
 Index Cond: (v = 100)


I wonder if there are some fundamental problems in supporting these two things in the optimizer:
1. Remove the search condition on the primary key if it is fully satisfied by the derived table's check constraint.
2. Append index scans of several partial indexes if the specified interval is covered by their conditions.

I wonder if someone is familiar with this part of the optimizer and can easily fix it.
Otherwise I am going to spend some time on solving these problems (if the community thinks that such optimizations will be useful).


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Error : undefined symbol : LWLockAssign in 9.6.3

2017-08-08 Thread Konstantin Knizhnik

On 08/09/2017 07:07 AM, 송기훈 wrote:

Hi.
I'm trying to use imcs module with 9.6 and got this error message. LWLockAssign 
function has been deleted from 9.6. I can't use this module anymore from 9.6.

What I want to ask you something is that your team decides not to support imcs 
module anymore or doesn't concern about imcs module or are there any ways to 
run postgresql in memory only?


Hi,
I am the author of the IMCS module and provide support for it.
Please contact me directly.
I have committed a patch to the https://github.com/knizhnik/imcs.git repository
which allows IMCS to be used with 9.6.3 and later Postgres versions.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] ASOF join

2017-06-21 Thread Konstantin Knizhnik



On 21.06.2017 11:00, Thomas Munro wrote:

Hmm.  Yeah, I see the notational problem.  It's hard to come up with a
new syntax that has SQL nature.  What if... we didn't use a new syntax
at all, but recognised existing queries that are executable with this
strategy?  Queries like this:

WITH ticks(time, price) AS
(VALUES ('2017-07-20 12:00:00'::timestamptz, 100.00),
('2017-07-21 11:00:00'::timestamptz, 150.00)),
  times(time) AS
(VALUES ('2017-07-19 12:00:00'::timestamptz),
('2017-07-20 12:00:00'::timestamptz),
('2017-07-21 12:00:00'::timestamptz),
('2017-07-22 12:00:00'::timestamptz))

SELECT times.time, previous_tick.price
   FROM times
   LEFT JOIN LATERAL (SELECT * FROM ticks
   WHERE ticks.time <= times.time
   ORDER BY ticks.time DESC LIMIT 1) previous_tick ON true
  ORDER BY times.time;

          time          | price
------------------------+--------
  2017-07-19 12:00:00+12 |
  2017-07-20 12:00:00+12 | 100.00
  2017-07-21 12:00:00+12 | 150.00
  2017-07-22 12:00:00+12 | 150.00
(4 rows)

I haven't used LATERAL much myself but I've noticed that it's often
used to express this type of thing.  "Get me the latest ... as of time
...".

It'd a bit like the way we recognise EXISTS (...) as a semi-join and
execute it with a join operator instead of having a SEMI JOIN syntax.
On the other hand it's a bit more long winded, extreme and probably
quite niche.
Thank you for this idea. I agree that it is the best way of implementing 
ASOF join - just as an optimization of a standard SQL query.
But do you think it would still be a good idea to extend the SQL syntax 
with an ASOF JOIN ... USING ... clause? It would significantly simplify 
writing queries like the one above
and IMHO doesn't introduce any confusion with standard SQL syntax. My 
primary motivation for suggesting ASOF join for Postgres was not just building 
a more efficient plan (using a merge join instead of a nested loop) but also 
simplifying the writing of such queries. Or do you think that nobody will be 
interested in non-standard SQL extensions?
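
For example, under the proposed syntax your query above might be written 
roughly like this (a hedged illustration of the intended notation only, not 
output of the current patch):

    SELECT times.time, ticks.price
      FROM times ASOF JOIN ticks USING (time)
     ORDER BY times.time;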


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] ASOF join

2017-06-19 Thread Konstantin Knizhnik



On 16.06.2017 19:07, David Fetter wrote:

On Fri, Jun 16, 2017 at 11:51:34AM +1200, Thomas Munro wrote:

On Fri, Jun 16, 2017 at 4:20 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

I wonder if there were some discussion/attempts to add ASOF join to Postgres
(sorry, may be there is better term for it, I am refereeing KDB definition:
http://code.kx.com/wiki/Reference/aj ).

Interesting idea.  Also in Pandas:

http://pandas.pydata.org/pandas-docs/version/0.19.0/generated/pandas.merge_asof.html#pandas.merge_asof


I attached a simple patch adding ASOF join to Postgres. Right now it 
supports only outer join and requires a USING clause (consequently it is 
not possible to join two tables whose join keys have different names). 
Maybe it is also possible to support an ON clause with a condition written 
like o.k1 = i.k1 AND o.k2 = i.k2 AND ... AND o.kN >= i.kN.
But such notation can be confusing, because the join result includes only 
one matching inner record with kN smaller than or equal to the kN of the 
outer record, and not all such records.

As alternative we can add specia

If people find such a construct really useful, I will continue working on it.




I suppose you could write a function that pulls tuples out of a bunch
of cursors and zips them together like this, as a kind of hand-coded
special merge join "except that we match on nearest key rather than
equal keys" (as they put it).

I've written code like this before in a trading context, where we
called that 'previous tick interpolation', and in a scientific context
where other kinds of interpolation were called for (so not really
matching a tuple but synthesising one if no exact match).  If you view
the former case as a kind of degenerate case of interpolation then it
doesn't feel like a "join" as we know it, but clearly it is.  I had
never considered before that such things might belong inside the
database as a kind of join operator.

If you turn your head sideways, it's very similar to the range merge
join Jeff Davis proposed.  https://commitfest.postgresql.org/14/1106/


Maybe, but I do not understand how to limit the result to contain exactly 
one (the last) inner tuple for each outer tuple.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 482a3dd..f7a8f38 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -1324,6 +1324,9 @@ get_jointype_name(JoinType jointype)
 		case JOIN_FULL:
 			return "FULL";
 
+		case JOIN_ASOF:
+			return "ASOF";
+
 		default:
 			/* Shouldn't come here, but protect from buggy code. */
 			elog(ERROR, "unsupported join type %d", jointype);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 080cb0a..54cf6c1 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -4073,7 +4073,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 * Constructing queries representing SEMI and ANTI joins is hard, hence
 	 * not considered right now.
 	 */
-	if (jointype != JOIN_INNER && jointype != JOIN_LEFT &&
+	if (jointype != JOIN_INNER && jointype != JOIN_LEFT && jointype != JOIN_ASOF && 
 		jointype != JOIN_RIGHT && jointype != JOIN_FULL)
 		return false;
 
@@ -4211,6 +4211,7 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 			break;
 
 		case JOIN_LEFT:
+		case JOIN_ASOF:
 			fpinfo->joinclauses = list_concat(fpinfo->joinclauses,
 		  list_copy(fpinfo_i->remote_conds));
 			fpinfo->remote_conds = list_concat(fpinfo->remote_conds,
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 211e4c3..fd3be8c 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -514,6 +514,9 @@ TABLE [ ONLY ] table_name [ * ]
  
   CROSS JOIN
  
+ 
+  ASOF [ OUTER ] JOIN
+ 
 
 
 For the INNER and OUTER join types, a
@@ -523,7 +526,9 @@ TABLE [ ONLY ] table_name [ * ]
 USING (join_column [, ...]).
 See below for the meaning.  For CROSS JOIN,
-none of these clauses can appear.
+none of these clauses can appear. For ASOF join type, a
+join condition must be USING (join_column [, ...]).

 

@@ -571,6 +576,32 @@ TABLE [ ONLY ] table_name [ * ]
 on the right), plus one row for each unmatched right-hand row
 (extended with nulls on the left).

+
+   ASOF OUTER JOIN is similar to LEFT OUTER JOIN but it accepts only
+USING (join_column_1 [, ...], join_column_N) clause
+where last joined column join_column_N is expected to be timestamp 
+(but actually can have any comparable type) and outer tuple is matched with only one inner tuple wi

Re: [HACKERS] WIP: Data at rest encryption

2017-06-16 Thread Konstantin Knizhnik



On 16.06.2017 03:08, Bruce Momjian wrote:


Yeah, I guess we will just have to wait to see it since other people are
excited about it.  My concern is code complexity and usability
challenges, vs punting the problem to the operating system, though
admittedly there are some cases where that is not possible.



Let me also share my opinion about encryption and compression support at 
the database level.
PostgresPro Enterprise supports both. I gave a presentation about 
it at PgConf 2016 in Tallinn.
I was a little bit surprised that there were more questions about 
encryption than about compression.
But right now we have several customers who are using compression and 
none of them uses encryption (just because they do not need
to protect their databases). But I am absolutely sure that there are many 
Postgres users who first of all need to protect their data.


Encryption is much easier to implement than compression, because it does 
not change the page size. So I do not see any "complexity and flexibility 
challenges" here.
Just for reference, I have attached our own encryption patch to this mail. I 
do not want to propose it as an alternative to Aasma's patch: it is less 
flexible and doesn't support encryption of WAL, just encryption of 
relation data. Also, it doesn't allow custom encryption libraries: the AES 
implementation is embedded. The encryption key is taken from an environment 
variable. At the Tallinn conference I was informed about a possible security 
issue with passing the key through an environment variable: it is possible to 
inspect the server's environment variables using a plpython/plperl stored 
procedure.
This is why we unset this environment variable after reading it. I am not an 
expert in security, but I do not know of other issues with such a solution.


Concerning the question whether to implement compression/encryption at the 
database level or rely on the OS, my opinion is that there are many 
scenarios where it is not possible or not desirable to use OS-level 
encryption/protection. This first of all includes cloud installations and 
embedded applications. I do not want to repeat arguments already 
mentioned in this thread.
But the fact is that there are many people who really need 
compression/encryption support and they cannot or do not want to 
delegate these aspects to the OS. Almost all DBMSes support 
compression and encryption, so the lack of these features in Postgres 
definitely cannot be considered an advantage of Postgres.


The Postgres buffer manager interface significantly simplifies integration 
of encryption and compression. There is actually a single path through 
which data is fetched from and stored to the disk.
The most obvious and natural solution is to decompress/decrypt data when 
it is read from the disk into the page pool and compress/encrypt it when it 
is written back. Taking into account that memory is cheap now and many 
databases can completely fit in memory, storing pages in the buffer 
cache in plain (decompressed/decrypted) format allows us to minimize the 
overhead of compression/encryption and its influence on performance. For 
read-only queries working with cached data, performance will be exactly 
the same as without encryption/compression.
Write speed for encrypted pages will certainly be slightly worse, but 
encryption speed is still much higher than disk I/O speed.


So I do not think that it is really necessary to support encryption of 
only particular tables, storing "non-secret" data in plain format 
without encryption. It would not cause a noticeable improvement in 
performance, but it may complicate the implementation and increase the 
possibility of leaking secret data.


I do not think that the pluggable storage API is the right approach to 
integrate compression and especially encryption. It is better to plug 
encryption in between the buffer manager and the storage device,
allowing it to be used with any storage implementation. Also, it is not 
clear to me whether encryption of WAL can be provided using the pluggable 
storage API.


The last discussed question is whether it is necessary to encrypt 
temporary data (BufFile). In our solution we encrypt only the main fork of 
non-system relations and do not encrypt temporary relations. It may cause 
some secret data to be stored on disk in non-encrypted 
format. But accessing this data is not trivial. You cannot just 
copy/steal the disk, open the database and do "select * from SecretTable": 
you will have to extract the data from the raw files yourself. So it looks 
like it is better to allow the user to choose whether to encrypt temporary 
data or not.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/storage/file/Makefile b/src/backend/storage/file/Makefile
index d2198f2..9492662 100644
--- a/src/backend/storage/file/Makefile
+++ b/src/backend/storage/file/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/storage/file
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
 
-OBJS = fd.o buff

[HACKERS] ASOF join

2017-06-15 Thread Konstantin Knizhnik
I wonder if there have been any discussions/attempts to add ASOF join to 
Postgres (sorry, maybe there is a better term for it; I am referring to the 
KDB definition: http://code.kx.com/wiki/Reference/aj ).
Such a kind of join can be useful when we need to associate two 
timeseries. It is quite popular in trading:


//join the eq price and size for the op trade time
a::aj[`underlyingSym`time;select time, underlyingSym, sym, putorcall, 
ex, cond, price, seqnum, contracts, contractsTraded from t;eqtrades];


...and not only there. Below is one example of how people manually code an 
ASOF join today:


select
count(*),
count(*)
filter (where timedelta_prev < -30),
count(*)
filter (where ride_prev = ride_next),
... -- many different aggregates
from
(
select
p.provider_id,
p.ts,
(
select extract(epoch from t.ts - p.ts)
from providers_positions t
where p.provider_id = t.provider_id and t.ts < p.ts and 
t.source = 'gps'

order by t.ts desc
limit 1
) as timedelta_prev,
(
select extract(epoch from t.ts - p.ts)
from providers_positions t
where p.provider_id = t.provider_id and t.ts > p.ts and 
t.source = 'gps'

order by t.ts
limit 1
) as timedelta,
(
select ride_id
from providers_positions t
where p.provider_id = t.provider_id and t.ts < p.ts and 
t.source = 'gps'

order by t.ts desc
limit 1
) as ride_prev,
(
select ride_id
from providers_positions t
where p.provider_id = t.provider_id and t.ts > p.ts and 
t.source = 'gps'

order by t.ts
limit 1
) as ride_next
from (
 select
 provider_id,
 ts,
 event_name
 from
 lvl2_681_parsed p
 ) p
where
p.event_name = 'GPS signal restored'
   -- offset 0
) z;

Without OFFSET 0 this query generates an awful execution plan with 
hundreds (!) of subplans corresponding to the subqueries.
The number of subplans (most of them are the same) is equal to the number of 
occurrences of the timedelta, timedelta_prev, ... columns in the target 
aggregates.
OFFSET 0 reduces the number of subplans to 4. And I expect that using a 
LATERAL join can reduce it to two, even without the "OFFSET 0" trick.
But in any case it is a very complex and unnatural way of expressing 
this not-so-complex query.

With an ASOF join it can be written much more simply.

Also, Postgres implements this query using a nested loop with index 
scan, which is definitely not the most efficient strategy.
The best way to implement an ASOF join is to use something like a merge join. 
Usually there are indexes on both timeseries, so what we need is to 
merge two ordered sets using ASOF join rules.
It would require minimal changes in SQL syntax; just adding an ASOF keyword 
seems to be enough:


   select  * from Trades ASOF JOIN EqTrades USING (underlyingSym,time);

It seems to me that adding ASOF joins should not require a huge amount of 
work and can be done with a minimal number of changes in the executor and 
optimizer.

But maybe there are some problems/challenges which I do not realize now?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Surjective functional indexes

2017-06-09 Thread Konstantin Knizhnik

Attached please find a rebased version of the patch.
Now the "projection" attribute is used instead of surjective/injective; a 
short usage sketch is below.
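
For example, a possible usage sketch (the table and indexed expression are 
hypothetical; it assumes the reloption is passed through the usual WITH 
clause of CREATE INDEX, like other index storage parameters):

    -- default (projection = true): re-evaluate the indexed expression on update
    create index doc_name_idx on doc ((info->>'name'));
    -- disable the re-evaluation and fall back to the affected-columns check
    create index doc_name_idx2 on doc ((info->>'name')) with (projection = false);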

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 83ee7d3..b221c18 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -294,8 +294,33 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 The optional WITH clause specifies storage
 parameters for the index.  Each index method has its own set of allowed
-storage parameters.  The B-tree, hash, GiST and SP-GiST index methods all
-accept this parameter:
+storage parameters. All indexes accept the following parameter:
+   
+
+   
+   
+projection
+
+ 
+   Functional index is based on on projection function: function which extract subset of its argument. 
+   In mathematic such functions are called non-injective. For injective function if any attribute used in the indexed 
+   expression is changed, then value of index expression is also changed. So to check that index is affected by the 
+   update, it is enough to check the set of changed fields. By default this parameters is assigned true value and function is considered 
+   as non-injective.
+   In this case change of any of indexed key doesn't mean that value of the function is changed. For example, for 
+   the expression expression(bookinfo-'isbn') defined
+   for column of JSON type is changed only when ISBN is changed, which rarely happen. The same is true for most
+   functional indexes. For non-injective functions, Postgres compares values of indexed expression for old and updated tuple and updates
+   index only when function results are different. It allows to eliminate index update and use HOT update.
+   But there are extra evaluations of the functions. So if function is expensive or probability that change of indexed column will not effect 
+   the function value is small, then marking index as projection may increase update speed.
+
+
+   
+   
+
+   
+ The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:

 

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 6d1f22f..509c647 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"projection",
+			"Evaluate functional index expression on update to check if its values is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1301,7 +1310,7 @@ fillRelOptions(void *rdopts, Size basesize,
 break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
  options[i].gen->name);
 	}
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e890e08..2be99ab 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -56,6 +56,7 @@
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/index.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/bufmgr.h"
@@ -73,7 +74,9 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
-
+#include "utils/memutils.h"
+#include "nodes/execnodes.h"
+#include "executor/executor.h"
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
@@ -124,6 +127,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
 static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_modified,
 	   bool *copy);
+static bool ProjectionIsNotChanged(Relation relation, HeapTuple oldtup, HeapTuple newtup);
 
 
 /*
@@ -3533,8 +3537,6 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	key_attrs = RelationGetIndexAttrBitmap(relation, INDEX_ATTR_BITMAP_KEY);
 	id_attrs = RelationGetIndexAttrBitmap(relation,
 		  INDEX_ATTR_BITMAP_IDENTITY_KEY);
-
-
 	block = ItemPointerGetBlockNumber(otid);
 	buffer = ReadBuffer(relation, block);
 	page = BufferGetPage(buffer);
@@ -4161,8 +4163,12 @@ l2:
 		 * changed. If the page was already full, we may have skipped checking
 		 * for index columns. If so, HOT update is possible.
 		 */
-		if (hot_attrs_checked && !bms_overlap(modifie

Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-05-30 Thread Konstantin Knizhnik

On 27.10.2016 14:39, Mithun Cy wrote:

# pg_autoprewarm.

This a PostgreSQL contrib module which automatically dump all of the 
blocknums
present in buffer pool at the time of server shutdown(smart and fast 
mode only,
to be enhanced to dump at regular interval.) and load these blocks 
when server restarts.


Design:
--
We have created a BG Worker Auto Pre-warmer which during shutdown 
dumps all the

blocknum in buffer pool in sorted order.
Format of each entry is 
<DatabaseId,TableSpaceId,RelationId,Forknum,BlockNum>.
Auto Pre-warmer is started as soon as the postmaster is started we do 
not wait
for recovery to finish and database to reach a consistent state. If 
there is a

"dump_file" to load we start loading each block entry to buffer pool until
there is a free buffer. This way we do not replace any new blocks 
which was
loaded either by recovery process or querying clients. Then it waits 
until it receives

SIGTERM to dump the block information in buffer pool.

HOW TO USE:
---
Build and add the pg_autoprewarm to shared_preload_libraries. Auto 
Pre-warmer
process automatically do dumping of buffer pool's block info and load 
them when

restarted.

TO DO:
--
Add functionality to dump based on timer at regular interval.
And some cleanups.


I wonder if you have considered parallel prewarming of a table?
Right now, with either pg_prewarm or pg_autoprewarm, preloading of a 
table's data is performed by one backend.
It certainly makes sense if there is just one HDD and we want to 
minimize the impact of pg_prewarm on normal DBMS activity.
But sometimes we need to load data into memory as soon as possible. And 
modern systems have a larger number of CPU cores, and

RAID devices make it possible to efficiently load data in parallel.

I have asked this question in the context of my CFS (compressed file system) 
for Postgres. The customer's complaint was that there are 64 cores in 
his system, but when
he is building an index, decompression of heap data is performed by only 
one core. This is why I thought about prewarm... (parallel index 
construction is a separate story...)


pg_prewarm makes it possible to specify a range of blocks, so, in 
principle, it is possible to manually preload a table in parallel by 
spawning pg_prewarm
with different subranges in several backends (see the sketch below). But it 
is definitely not a user-friendly approach.
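
Just to illustrate (a hedged sketch; the table name is hypothetical and the 
block boundaries would have to be derived from pg_class.relpages):

    -- session 1
    select pg_prewarm('big_table', 'buffer', 'main', 0, 499999);
    -- session 2, started concurrently
    select pg_prewarm('big_table', 'buffer', 'main', 500000, 999999);
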
And as far as I understand, pg_autoprewarm has all the necessary 
infrastructure to do a parallel load. We just need to spawn more than one 
background worker and specify a separate block range for each worker.

Do you think that such functionality (parallel autoprewarm) could be 
useful and easily added?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-05-30 Thread Konstantin Knizhnik



On 29.05.2017 20:21, Christoph Berg wrote:


I think the term you were looking for is "projection".

https://en.wikipedia.org/wiki/Projection_(set_theory)


I have already renamed the parameter from "surjective" to "injective".
But I am OK with doing one more renaming to "projection" if it is 
considered a better alternative.
From my point of view, "projection" seems to be clearer for people 
without a mathematical background,
but IMHO this term is overloaded in the DBMS context. The irony is that in 
Wikipedia "projection" is explained using the term "surjection" :)




Christoph


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-05-29 Thread Konstantin Knizhnik

On 05/27/2017 09:50 PM, Peter Eisentraut wrote:

On 5/25/17 12:30, Konstantin Knizhnik wrote:

Functions like (info->>'name') are named "surjective" ni mathematics.

A surjective function is one where each value in the output type can be
obtained by some input value.  That's not what you are after here.  The
behavior you are describing is a not-injective function.

I think you are right that in practice most functions are not injective.
  But I think there is still quite some difference between a function
like the one you showed that selects a component from a composite data
structure and, for example, round(), where in practice any update is
likely to change the result of the function.


Thank you, I will rename the "surjective" parameter to "injective" with 
"false" as the default value.
Concerning "round" and other similar functions - obviously there are use 
cases when such functions are used for
functional indexes. This is why I want to allow the user to make a choice, 
and this is the reason for introducing this parameter.
The question is the default value of this parameter: should we by default 
preserve the original Postgres behavior
(check only the affected set of keys), or should we pay the extra cost of 
calculating the value of the function (even if we manage to store the
calculated value of the index expression for the new tuple, we still have to 
calculate it for the old tuple, so the function will be calculated
at least twice as many times)?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Logical replication & corrupted pages recovery

2017-05-26 Thread Konstantin Knizhnik
Several PgPro customers, who are evaluating our multimaster, are 
interested in the possibility of recovering corrupted pages from other 
cluster nodes.
This task seems to be more general and is not multimaster-specific. This 
is why I want to discuss it here.


With physical (streaming) replication the content of the master and replica 
databases is identical, so it is quite easy to restore a corrupted page 
from the replica by just copying the corresponding file or part of the file. 
With logical replication the content of database pages on disk may be 
different even though the data is logically identical.
If some heap page is corrupted, then there is no simple and 
efficient way to determine which records were located on this page.

Clustered indexes could help, but that is a long story...

So my question is whether there is currently some efficient way to 
synchronize two tables?
If not, are there any plans to provide such functionality in logical 
replication in the future?


Right now, the only approach which comes to my mind is to extract all 
primary keys at the two nodes, exchange them between the nodes, find the 
missing tuples by comparing the two ordered sets of keys and request them 
from the other node. It is based on the assumption that Postgres just skips 
records from the corrupted pages.
The drawback of this approach is that it will be very slow and cause 
large network traffic for huge tables.
Maybe it is possible to somehow optimize it by checking ranges of 
primary key values
(if the number of records in a range is the same at both nodes, then the 
ranges can be considered identical and not compared), for example as 
sketched below.
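
Just to illustrate the idea (a hedged sketch; the table and column names are 
hypothetical and the range width is arbitrary):

    -- run on both nodes and compare the per-range counts;
    -- only ranges with differing counts need a key-by-key comparison
    select id / 10000 as key_range, count(*)
    from some_replicated_table
    group by 1
    order by 1;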


Also, this approach requires suspending the cluster during table 
synchronization (or at least locking this table).
Synchronizing a table in the presence of active updates of this 
table seems to be a much more challenging task.


If somebody has already thought about this problem and has some plan or 
maybe even a ready solution for it, please share your thoughts.

Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Surjective functional indexes

2017-05-25 Thread Konstantin Knizhnik



On 25.05.2017 19:37, Tom Lane wrote:

Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes:

My proposal is to check value of function for functional indexes instead
of just comparing set of effected attributes.
Obviously, for some complex functions it may  have negative effect on
update speed.
This is why I have added "surjective" option to index.

This seems overcomplicated.  We would have to compute the function
value at some point anyway.  Can't we refactor to do that earlier?

regards, tom lane



The check for affected indexes / applicability of HOT update and the update 
of the indexes themselves are done in two completely different parts of the 
code.
If we find out that the values of the indexed expressions have not changed, 
then we can use a HOT update and the indexes do not need to be updated 
(so the calculated value of the function is not needed). And this is 
expected to be the most frequent case.


Certainly, if the value of an indexed expression has changed, then we could 
avoid the redundant calculation of the function by storing the result of the 
calculation somewhere.
But it would greatly complicate the whole logic of updating indexes. Please 
notice that if we have several functional indexes and only one of them has 
actually changed,
then in any case we cannot use HOT and have to update all indexes. So we do 
not need to evaluate the values of all indexed expressions; we just need to 
find the first
changed one. So we would somehow have to keep track of which expression 
values have been calculated and which have not.


One more argument: originally, Postgres evaluates an index expression only 
once (when inserting the new version of the tuple into the index).
Now (with this patch) Postgres has to evaluate the expression three times in 
the worst case: calculate the value of the expression for the old and new 
tuples to make a decision about the HOT update,
and then evaluate it once again when performing the index update itself. 
Even if I managed to store the calculated value of the expression somewhere, 
we would still have to perform
twice as many evaluations as before. This is why, for expensive functions 
or for functions defined over frequently updated attributes (in the case of 
JSON), such a policy should be disabled.
For non-expensive functions the extra overhead is negligible. Also, there 
is no overhead at all if the indexed expression has not actually changed, 
and this is expected to be the most frequent case.


At least in the particular example of the YCSB benchmark, our first try 
was just to disable the index update by commenting out the corresponding 
check of the updated fields mask.
Obviously there are no extra function calculations in this case. Then I 
implemented this patch, and performance is almost the same.
This is why I think that simplicity and modularity of the code are more 
important here than the elimination of redundant function calculations.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





[HACKERS] Surjective functional indexes

2017-05-25 Thread Konstantin Knizhnik
Right now Postgres determines whether an update operation touches an index 
or not based only on the set of affected columns.
But in the case of functional indexes such a policy quite frequently leads 
to unnecessary index updates.
For example, functional indexes are widely used for indexing JSON data: 
info->>'name'.


JSON data may contain multiple attributes and only a few of them may be 
affected by an update.
Moreover, such an index is usually built for immutable attributes (like 
"id", "isbn", "name", ...).


Functions like (info->>'name') are called "surjective" in mathematics.
I have a strong feeling that most functional indexes are based on 
surjective functions.
For such indexes the current PostgreSQL index update policy is very 
inefficient. It causes HOT updates to be disabled

and so leads to a significant degradation of performance.

Without this patch Postgres is slower than Mongo on the YCSB benchmark with 
a (50% update, 50% select) workload.

And after applying this patch Postgres beats Mongo on all workloads.

My proposal is to check the value of the function for functional indexes 
instead of just comparing the set of affected attributes.
Obviously, for some complex functions this may have a negative effect on 
update speed.
This is why I have added a "surjective" option to indexes. By default it is 
switched on for all functional indexes (based on my assumption
that most functions used in functional indexes are surjective). But it is 
possible to explicitly disable it and make the decision whether the index
needs to be updated or not based only on the set of affected attributes, as 
sketched below.
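
Just to illustrate the intended usage (a hedged sketch; the table and indexed 
expression are hypothetical, and it assumes the reloption added by the patch 
is passed through the usual WITH clause of CREATE INDEX):

    create table doc (id serial primary key, info jsonb);
    -- default (surjective = true): the indexed expression is re-evaluated on
    -- update to check whether its value has really changed
    create index doc_name_idx on doc ((info->>'name'));
    -- disable it and decide based only on the set of affected columns
    create index doc_isbn_idx on doc ((info->>'isbn')) with (surjective = false);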


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 6d1f22f..37fc407 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -130,6 +130,15 @@ static relopt_bool boolRelOpts[] =
 	},
 	{
 		{
+			"surjective",
+			"Reevaluate functional index expression on update to check if its values is changed",
+			RELOPT_KIND_INDEX,
+			AccessExclusiveLock
+		},
+		true
+	},
+	{
+		{
 			"security_barrier",
 			"View acts as a row security barrier",
 			RELOPT_KIND_VIEW,
@@ -1301,7 +1310,7 @@ fillRelOptions(void *rdopts, Size basesize,
 break;
 			}
 		}
-		if (validate && !found)
+		if (validate && !found && options[i].gen->kinds != RELOPT_KIND_INDEX)
 			elog(ERROR, "reloption \"%s\" not found in parse table",
  options[i].gen->name);
 	}
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e890e08..3525e3c 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -56,6 +56,7 @@
 #include "access/xlogutils.h"
 #include "catalog/catalog.h"
 #include "catalog/namespace.h"
+#include "catalog/index.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "storage/bufmgr.h"
@@ -73,7 +74,9 @@
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
 #include "utils/tqual.h"
-
+#include "utils/memutils.h"
+#include "nodes/execnodes.h"
+#include "executor/executor.h"
 
 /* GUC variable */
 bool		synchronize_seqscans = true;
@@ -4199,6 +4202,7 @@ l2:
 
 	if (use_hot_update)
 	{
+		elog(DEBUG1, "Use hot update");
 		/* Mark the old tuple as HOT-updated */
 		HeapTupleSetHotUpdated(&oldtup);
 		/* And mark the new tuple as heap-only */
@@ -4436,6 +4440,73 @@ HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
 attnum - FirstLowInvalidHeapAttributeNumber);
 	}
 
+	if (hot_result && relation->rd_surjective)
+	{
+		ListCell   *l;
+		List	   *indexoidlist = RelationGetIndexList(relation);
+		EState *estate = CreateExecutorState();
+		ExprContext*econtext = GetPerTupleExprContext(estate);
+		TupleTableSlot *slot = MakeSingleTupleTableSlot(RelationGetDescr(relation));
+		Datum	   	old_values[INDEX_MAX_KEYS];
+		bool		old_isnull[INDEX_MAX_KEYS];
+		Datum	   	new_values[INDEX_MAX_KEYS];
+		bool		new_isnull[INDEX_MAX_KEYS];
+
+		econtext->ecxt_scantuple = slot;
+
+		foreach(l, indexoidlist)
+		{
+			Oid		indexOid = lfirst_oid(l);
+			Relation	indexDesc = index_open(indexOid, AccessShareLock);
+			IndexInfo  *indexInfo = BuildIndexInfo(indexDesc);
+			int i;
+
+			if (indexInfo->ii_Expressions && indexInfo->ii_Surjective)
+			{
+ResetExprContext(econtext);
+ExecStoreTuple(oldtup, slot, InvalidBuffer, false);
+FormIndexDatum(indexInfo,
+			   slot,
+			   estate,
+			   old_values,
+			   old_isnull);
+
+ExecStoreTuple(newtup, slot, InvalidBuffer, false);
+FormIndexDatum(indexInfo,
+			   slot,
+			   esta

Re: [HACKERS] Cached plans and statement generalization

2017-05-25 Thread Konstantin Knizhnik

On 10.05.2017 19:11, Konstantin Knizhnik wrote:


Based on the Robert's feedback and Tom's proposal I have implemented 
two new versions of autoprepare patch.


First version is just refactoring of my original implementation: I 
have extracted common code into prepare_cached_plan and 
exec_prepared_plan
function to avoid code duplication. Also I rewrote assignment of 
values to parameters. Now types of parameters are inferred from types 
of literals, so there may be several
prepared plans which are different only by types of parameters. Due to 
the problem with type coercion for parameters, I have to catch errors 
in parse_analyze_varparams.




Attached please find a rebased version of the autoprepare patch based on 
Tom's proposal (perform analysis of the tree with constant literals and then 
replace them with parameters).

Also, I have submitted this patch to the autumn commitfest.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 95c1d3e..0b0642b 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3710,6 +3710,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform raw parse tree.
+ *
+ * This function is implementing slightly different approach for tree update than expression_tree_mutator().
+ * Callback is given pointer to pointer to the current node and can update this field instead of returning reference to new node.
+ * It makes it possible to remember changes and easily revert them without extra traversal of the tree.
+ *
+ * This function do not need QTW_DONT_COPY_QUERY flag: it never implicitly copy tree nodes, doing in-place update.
+ *
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs. If some other node is visited, iteration is immediately stopped and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case T_ParamRef:
+		case T_A_Const:
+		case T_A_Star:
+			/* primitive node types with no subnodes */
+			break;
+		case T_Alias:
+			/* we assume the colnames list isn't interesting */
+			break;
+		case T_RangeVar:
+			return mutator(&((RangeVar *) node)->alias, context);
+		case T_GroupingFunc:
+			return mutator(&((GroupingFunc *) node)->args, context);
+		case T_SubLink:
+			{
+SubLink	   *sublink = (SubLink *) node;
+
+if (mutator(&sublink->testexpr, context))
+	return true;
+/* we assume the operName is not interesting */
+if (mutator(&sublink->subselect, context))
+	return true;
+			}
+			break;
+		case T_CaseExpr:
+			{
+CaseExpr   *caseexpr = (CaseExpr *) node;
+
+if (mutator(&caseexpr->arg, context))
+	return true;
+/* we assume mutator doesn't care about CaseWhens, either */
+foreach(temp, caseexpr->args)
+{
+	CaseWhen   *when = (CaseWhen *) lfirst(temp);
+
+	Assert(IsA(when, CaseWhen));
+	if (mutator(&when->expr, context))
+		return true;
+	if (mutator(&when->result, context))
+		return true;
+}
+if (mutator(&caseexpr->defresult, context))
+	return true;
+			}
+			break;
+		case T_RowExpr:
+			/* Assume colnames isn't interesting */
+			return mutator(&((RowExpr *) node)->args, context);
+		case T_CoalesceExpr:
+			return mutator(&((CoalesceExpr *) node)->args, context);
+		case T_MinMaxExpr:
+			return mutator(&((MinMaxExpr *) node)->args, context);
+		case T_XmlExpr:
+			{
+XmlExpr	   *xexpr = (XmlExpr *) node;
+
+if (mutator(&xexpr->named_args, context))
+	return true;
+/* we assume mutator doesn't care about arg_names */
+if (mutator(&xexpr->args, context))
+	return true;
+			}
+			break;
+		case T_NullTest:
+			return mutator(&((NullTest *) node)->arg, context);
+		case T_BooleanTest:
+			return mutator(&((BooleanTest *) node)->arg, context);
+		case T_JoinExpr:
+			{
+JoinExpr   *join = (JoinExpr *) node;
+
+if (mutator(&join->larg, context))
+	return true;
+if (mutator(&join->rarg, context))
+	return true;
+if (mutator(&join->quals, context))
+	return true;
+		

Re: [HACKERS] Cached plans and statement generalization

2017-05-18 Thread Konstantin Knizhnik



On 15.05.2017 18:31, Robert Haas wrote:

On Wed, May 10, 2017 at 12:11 PM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

Robert, can you please explain why using TRY/CATCH is not safe here:

This is definitely not a safe way of using TRY/CATCH.

This has been discussed many, many times on this mailing list before,
and I don't really want to go over it again here.  We really need a
README or some documentation about this so that we don't have to keep
answering this same question again and again.

First of all, I want to note that the new version of my patch does not use 
PG_TRY/PG_CATCH.

But I still want to clarify for myself what is wrong with this construction.
I searched both the hackers mailing list archive and the web using Google 
but failed to find any references except about

sharing non-volatile variables between try and catch blocks.
Can you please point me to the thread where this problem was discussed 
or just explain in a few words the source of the problem?


From my own experience I found out that the PG_TRY/PG_CATCH mechanism does 
not provide proper cleanup (unlike C++ exceptions).
If there are opened relations, catalog cache entries, ... then throwing an 
error will not release them.
It causes no problems if the error is handled in PostgresMain, which 
aborts the current transaction and releases all resources in any case.
But if I want to ignore this error and continue query execution, then 
warnings about resource leaks can be reported.

Is this what you mean by the unsafety of the PG_TRY/PG_CATCH construction?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Cached plans and statement generalization

2017-05-12 Thread Konstantin Knizhnik



On 12.05.2017 18:23, Bruce Momjian wrote:

On Fri, May 12, 2017 at 10:50:41AM +0300, Konstantin Knizhnik wrote:

Definitely changing session context (search_path, date/time format, ...) may
cause incorrect behavior of cached statements.

I wonder if we should clear the cache whenever any SET command is
issued.


Well, with the autoprepare cache reset on each SET of a variable, ALTER 
SYSTEM and any slow utility statement,
only one regression test fails. And only because of a different 
error message context:


*** /home/knizhnik/postgresql.master/src/test/regress/expected/date.out 
2017-04-11 18:07:56.497461208 +0300
--- /home/knizhnik/postgresql.master/src/test/regress/results/date.out 
2017-05-12 20:21:19.767566302 +0300

***
*** 1443,1452 
  --
  SELECT EXTRACT(MICROSEC  FROM DATE 'infinity'); -- ERROR: 
timestamp units "microsec" not recognized

  ERROR:  timestamp units "microsec" not recognized
- CONTEXT:  SQL function "date_part" statement 1
  SELECT EXTRACT(UNDEFINED FROM DATE 'infinity'); -- ERROR: 
timestamp units "undefined" not supported

  ERROR:  timestamp units "undefined" not supported
- CONTEXT:  SQL function "date_part" statement 1
  -- test constructors
  select make_date(2013, 7, 15);
   make_date
--- 1443,1450 

==




Actually you may get the same problem with explicitly prepared statements
(certainly, in the last case, you better understand what going on and it is
your choice whether to use or not to use prepared statement).

The fact of failure of 7 regression tests means that autoprepare can really
change behavior of existed program. This is why my suggestion is  to switch
off this feature by default.

I would like to see us target something that can be enabled by default.
Even if it only improves performance by 5%, it would be better overall
than a feature that improves performance by 90% but is only used by 1%
of our users.


I have to agree with you here.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Cached plans and statement generalization

2017-05-12 Thread Konstantin Knizhnik



On 12.05.2017 03:58, Bruce Momjian wrote:

On Thu, May 11, 2017 at 10:41:45PM +0300, Konstantin Knizhnik wrote:

This is why I have provided second implementation which replace
literals with parameters after raw parsing.  Certainly it is slower
than first approach. But still provide significant advantage in
performance: more than two times at pgbench.  Then I tried to run
regression tests and find several situations where type analysis is
not correctly performed in case of replacing literals with parameters.

So the issue is that per-command output from the parser, SelectStmt,
only has strings for identifers, e.g. table and column names, so you
can't be sure it is the same as the cached entry you matched.  I suppose
if you cleared the cache every time someone created an object or changed
search_path, it might work.

Definitely, changing the session context (search_path, date/time format, ...) 
may cause incorrect behavior of cached statements; a minimal illustration of 
the search_path hazard is sketched below.
Actually, you may get the same problem with explicitly prepared 
statements (certainly, in the latter case you better understand what is 
going on, and it is your choice whether or not to use a prepared 
statement).
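
Just to illustrate the search_path hazard (a hedged sketch; schema and table 
names are hypothetical):

    set search_path = s1;
    select * from t where id = 1;   -- generalized and cached, resolved to s1.t
    set search_path = s2;
    select * from t where id = 2;   -- without invalidation, the cached plan
                                    -- would still reference s1.t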


The fact that 7 regression tests fail means that autoprepare can 
really change the behavior of existing programs. This is why my suggestion 
is to switch this feature off by default.
But in 99.9% of real cases (my estimation, plucked out of thin air :) there 
will be no such problems with autoprepare. And it can significantly 
improve the performance of OLTP applications
which are not able to use prepared statements (because they work 
through pgbouncer, or for any other reason).


Can autoprepare slow down the system?
Yes, it can. It can happen if the application performs a large number of 
unique queries and the autoprepare cache size is not limited.
In this case a large (and infinitely growing) number of stored plans can 
consume a lot of memory and, what is even worse, slow down cache lookup.
This is why, by default, I limit the number of cached statements 
(the autoprepare_limit parameter) to 100.


I am almost sure that there will be some other issues with autoprepare 
which I have not encountered yet (because I mostly tested it on pgbench 
and the Postgres regression tests).
But I am also sure that the benefit of doubling system performance is good 
motivation to continue working in this direction.


My main concern is whether to continue to improve the current approach with 
a local (per-backend) cache of prepared statements,
or to create a shared cache (as in Oracle). It is much more difficult to 
implement a shared cache (the same problems with session context, different 
catalog snapshots, cache invalidation, ...),
but it also provides more opportunities for query optimization and 
tuning.






--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Cached plans and statement generalization

2017-05-11 Thread Konstantin Knizhnik

On 05/11/2017 10:52 PM, Andres Freund wrote:

On 2017-05-11 22:48:26 +0300, Konstantin Knizhnik wrote:

On 05/11/2017 09:31 PM, Tom Lane wrote:

Bruce Momjian <br...@momjian.us> writes:

Good point.  I think we need to do some measurements to see if the
parser-only stage is actually significant.  I have a hunch that
commercial databases have much heavier parsers than we do.

FWIW, gram.y does show up as significant in many of the profiles I take.
I speculate that this is not so much that it eats many CPU cycles, as that
the constant tables are so large as to incur lots of cache misses.  scan.l
is not quite as big a deal for some reason, even though it's also large.

regards, tom lane

Yes, my results shows that pg_parse_query adds not so much overhead:
206k TPS for my first variant with string literal substitution and modified 
query text used as hash key vs.
181k. TPS for version with patching raw parse tree constructed by 
pg_parse_query.

Those numbers and your statement seem to contradict each other?


Oops, my parse error :( I misread Tom's statement.
Actually, I also was afraid that the price of parsing was large enough, and 
this is why my first attempt was to avoid parsing.
But then I found out that most of the time is spent in analysis and planning 
(see the attached profile):

pg_parse_query: 4.23%
pg_analyze_and_rewrite: 8.45%
pg_plan_queries: 15.49%



- Andres



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Cached plans and statement generalization

2017-05-11 Thread Konstantin Knizhnik

On 05/11/2017 09:31 PM, Tom Lane wrote:

Bruce Momjian <br...@momjian.us> writes:

Good point.  I think we need to do some measurements to see if the
parser-only stage is actually significant.  I have a hunch that
commercial databases have much heavier parsers than we do.

FWIW, gram.y does show up as significant in many of the profiles I take.
I speculate that this is not so much that it eats many CPU cycles, as that
the constant tables are so large as to incur lots of cache misses.  scan.l
is not quite as big a deal for some reason, even though it's also large.

regards, tom lane

Yes, my results show that pg_parse_query does not add that much overhead:
206k TPS for my first variant with string literal substitution and the 
modified query text used as the hash key, vs.
181k TPS for the version that patches the raw parse tree constructed by 
pg_parse_query.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Cached plans and statement generalization

2017-05-11 Thread Konstantin Knizhnik

On 05/11/2017 06:12 PM, Bruce Momjian wrote:

On Wed, May 10, 2017 at 07:11:07PM +0300, Konstantin Knizhnik wrote:

I am going to continue work on this patch I will be glad to receive any
feedback and suggestions for its improvement.
In most cases, applications are not accessing Postgres directly, but using
some connection pooling layer and so them are not able to use prepared
statements.
But at simple OLTP Postgres spent more time on building query plan than on
execution itself. And it is possible to speedup Postgres about two times at
such workload!
Another alternative is true shared plan cache.  May be it is even more
perspective approach, but definitely much more invasive and harder to
implement.

Can we back up and get an overview of what you are doing and how you are
doing it?  Our TODO list suggests this order for successful patches:

Desirability -> Design -> Implement -> Test -> Review -> Commit

You kind of started at the Implementation/patch level, which makes it
hard to evaluate.

I think everyone agrees on the Desirability of the feature, but the
Design is the tricky part.  I think the design questions are:

*  What information is stored about cached plans?
*  How are the cached plans invalidated?
*  How is a query matched against a cached plan?

Looking at the options, ideally the plan would be cached at the same
query stage as the stage where the incoming query is checked against the
cache.  However, caching and checking at the same level offers no
benefit, so they are going to be different.  For example, caching a
parse tree at the time it is created, then checking at the same point if
the incoming query is the same doesn't help you because you already had
to create the parse tree get to that point.

A more concrete example is prepared statements.  They are stored at the
end of planning and matched in the parser.  However, you can easily do
that since the incoming query specifies the name of the prepared query,
so there is no trick to matching.

The desire is to cache as late as possible so you cache more work and
you have more detail about the referenced objects, which helps with
cache invalidation.  However, you also want to do cache matching as
early as possible to improve performance.

So, let's look at some options.  One interesting idea from Doug Doole
was to do it between the tokenizer and parser.  I think they are glued
together so you would need a way to run the tokenizer separately and
compare that to the tokens you stored for the cached plan.  The larger
issue is that prepared plans already are checked after parsing, and we
know they are a win, so matching any earlier than that just seems like
overkill and likely to lead to lots of problems.

So, you could do it after parsing but before parse-analysis, which is
kind of what prepared queries do.  One tricky problem is that we don't
bind the query string tokens to database objects until after parse
analysis.

Doing matching before parse-analysis is going to be tricky, which is why
there are so many comments about the approach.  Changing search_path can
certainly affect it, but creating objects in earlier-mentioned schemas
can also change how an object reference in a query is resolved.  Even
obscure things like the creation of a new operator that has higher
precedence in the query could change the plan, though am not sure if
our prepared query system even handles that properly.

Anyway, that is my feedback.  I would like to get an overview of what
you are trying to do and the costs/benefits of each option so we can
best guide you.


Sorry, for luck of overview.
I have started with small prototype just to investigate if such optimization 
makes sense or not.
When I get more than two time advantage in performance on standard pgbench, I 
come to conclusion that this
optimization can be really very useful and now try to find the best way of its 
implementation.

I have started with simplest approach when string literals are replaced with 
parameters. It is done before parsing.
And can be done very fast - just need to locate data in quotes.
But this approach is not safe and universal: you will not be able to speedup 
most of the existed queries without rewriting them.

This is why I provided a second implementation, which replaces literals with
parameters after raw parsing.
Certainly it is slower than the first approach, but it still provides a significant
performance advantage: more than two times at pgbench.
Then I tried to run the regression tests and found several situations where type
analysis is not performed correctly when literals are replaced with parameters.

So my third attempt is to replace constant nodes with parameters in the already
analyzed tree.

Now answering your questions:

*  What information is stored about cached plans?

The key used to locate a cached plan is the raw parse tree. The value is the saved CachedPlanSource.

*  How are the cached plans invalidated?

In the same way as plans for explicitly prepared statements.

Re: [HACKERS] Cached plans and statement generalization

2017-05-10 Thread Konstantin Knizhnik

On 02.05.2017 21:26, Robert Haas wrote:


I am sympathetic to the fact that this is a hard problem to solve.
I'm just telling you that the way you've got it is not acceptable and
nobody's going to commit it like that (or if they do, they will end up
having to revert it).  If you want to have a technical discussion
about what might be a way to change the patch to be more acceptable,
cool, but I don't want to get into a long debate about whether what
you have is acceptable or not; I've already said what I think about
that and I believe that opinion will be widely shared.  I am not
trying to beat you up here, just trying to be clear.




Based on Robert's feedback and Tom's proposal, I have implemented two
new versions of the autoprepare patch.


The first version is just a refactoring of my original implementation: I have
extracted the common code into prepare_cached_plan and exec_prepared_plan
functions to avoid code duplication. I also rewrote the assignment of values
to parameters. Types of parameters are now inferred from the types of the
literals, so there may be several prepared plans which differ only in the
types of their parameters. Because of a problem with type coercion for
parameters, I have to catch errors in parse_analyze_varparams.


Robert, can you please explain why using TRY/CATCH is not safe here:

This is definitely not a safe way of using TRY/CATCH.
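
(For context, the usual way to trap errors safely in backend C code - the pattern PL/pgSQL itself uses - is to wrap the risky call in an internal subtransaction so that the catch block can roll everything back cleanly. A rough sketch only, reusing the variable names from the quoted patch fragment and assuming they are declared in the surrounding code; this is not what the patch currently does:

MemoryContext oldcontext = CurrentMemoryContext;
ResourceOwner oldowner = CurrentResourceOwner;

BeginInternalSubTransaction(NULL);
MemoryContextSwitchTo(oldcontext);

PG_TRY();
{
    query = parse_analyze_varparams(parse_tree, query_string,
                                    &param_types, &num_params);

    /* success: release the inner subtransaction */
    ReleaseCurrentSubTransaction();
    MemoryContextSwitchTo(oldcontext);
    CurrentResourceOwner = oldowner;
}
PG_CATCH();
{
    /* discard the error state, then abort the inner subtransaction */
    MemoryContextSwitchTo(oldcontext);
    FlushErrorState();

    RollbackAndReleaseCurrentSubTransaction();
    MemoryContextSwitchTo(oldcontext);
    CurrentResourceOwner = oldowner;

    /* ... fall back to normal, non-autoprepared query processing ... */
}
PG_END_TRY();
)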


Second version is based on Tom's suggestion:

Personally I'd think about
replacing the entire literal-with-cast construct with a Param having
already-known type.
So here I first patch the raw parse tree, replacing A_Const nodes with ParamRef
nodes. This generalized tree is needed to perform the cache lookup.
Then I restore the original raw parse tree and call pg_analyze_and_rewrite.
Then I mutate the analyzed tree, replacing Const nodes with Param nodes.
At this point type coercion has already been performed, so I know the precise
types which should be used for the parameters.
It is a more sophisticated approach, and I can not extract the common code
into prepare_cached_plan, because preparing the plan is currently a mix of
steps done in exec_simple_query and exec_parse_message.

But there is no need to catch analyze errors.
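
A minimal sketch of this Const-to-Param mutation step over the analyzed tree, using the standard expression_tree_mutator; the function name and the param_types list are illustrative, not taken from the patch:

#include "postgres.h"
#include "nodes/nodeFuncs.h"

/* Replace every Const in an analyzed expression with a PARAM_EXTERN Param,
 * collecting the parameter type OIDs in *param_types along the way.  Since
 * the tree has already been through parse analysis, consttype/consttypmod
 * give the exact types the parameters must have. */
static Node *
const_to_param_mutator(Node *node, List **param_types)
{
	if (node == NULL)
		return NULL;

	if (IsA(node, Const))
	{
		Const	   *c = (Const *) node;
		Param	   *p = makeNode(Param);

		p->paramkind = PARAM_EXTERN;
		p->paramid = list_length(*param_types) + 1;
		p->paramtype = c->consttype;
		p->paramtypmod = c->consttypmod;
		p->paramcollid = c->constcollid;
		p->location = c->location;

		*param_types = lappend_oid(*param_types, c->consttype);
		return (Node *) p;
	}

	return expression_tree_mutator(node, const_to_param_mutator,
								   (void *) param_types);
}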

Finally, the performance of both approaches is the same: at pgbench it is
180k TPS on read-only queries, compared with 80k TPS for non-prepared queries.
In both cases 7 out of 177 regression tests do not pass (mostly because the
session environment is changed between subsequent query executions).


I am going to continue work on this patch and will be glad to receive any
feedback and suggestions for its improvement.
In most cases, applications are not accessing Postgres directly, but through
some connection pooling layer, and so they are not able to use prepared
statements.
But for simple OLTP workloads Postgres spends more time on building the query
plan than on the execution itself, and it is possible to speed up Postgres
about two times on such a workload!
Another alternative is a true shared plan cache. It may be an even more
promising approach, but it is definitely much more invasive and harder to
implement.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 6e52eb7..f2eb0f5 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3696,6 +3696,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform a raw parse tree.
+ *
+ * This function implements a slightly different approach to tree updates than expression_tree_mutator():
+ * the callback is given a pointer to the pointer to the current node and can update that field instead of returning a reference to a new node.
+ * This makes it possible to remember changes and easily revert them without an extra traversal of the tree.
+ *
+ * This function does not need the QTW_DONT_COPY_QUERY flag: it never implicitly copies tree nodes; it updates them in place.
+ * 
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs. If some other node is visited, iteration is immediately stopped and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case
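
To illustrate the pointer-to-pointer convention described in the header comment above, here is a hypothetical callback; the exact callback signature is hidden behind the old-style "bool (*mutator) ()" declaration, so the Node ** argument and all names below are assumptions rather than code from the patch. It replaces A_Const nodes with ParamRef nodes in place and records every change, so the original tree can be restored after the cache lookup:

#include "postgres.h"
#include "nodes/parsenodes.h"

typedef struct ConstPatch
{
	Node	  **ref;			/* location in the tree that was overwritten */
	Node	   *old_node;		/* the original A_Const */
} ConstPatch;

static bool
replace_consts_with_params(Node **ref, List **patches)
{
	Node	   *node = *ref;

	if (node == NULL)
		return false;

	if (IsA(node, A_Const))
	{
		ConstPatch *patch = (ConstPatch *) palloc(sizeof(ConstPatch));
		ParamRef   *param = makeNode(ParamRef);

		param->number = list_length(*patches) + 1;
		param->location = ((A_Const *) node)->location;

		patch->ref = ref;
		patch->old_node = node;
		*patches = lappend(*patches, patch);

		*ref = (Node *) param;	/* in-place update, no node copy */
		return false;
	}

	/* recurse into sub-nodes of everything else */
	return raw_expression_tree_mutator(node, replace_consts_with_params,
									   (void *) patches);
}

Reverting is then just a matter of walking the patches list and doing *patch->ref = patch->old_node, which is exactly the property the comment advertises.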

Re: [HACKERS] Why type coercion is not performed for parameters?

2017-05-05 Thread Konstantin Knizhnik



On 05.05.2017 13:29, Marko Tiikkaja wrote:


But you know that the type of the literal "10" is int. If you're 
throwing that information away, surely that's a bug in your code.



Yes, in case of an integer literal I can easily determine the parameter type.
But in case of a string literal I have to set the UNKNOWNOID type, otherwise a
lot of queries will not work.





.m


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



[HACKERS] Why type coercion is not performed for parameters?

2017-05-05 Thread Konstantin Knizhnik

Hi hackers,

If I evaluate the expression typename('literal'), then type coercion is
performed and the function is successfully resolved, e.g.


SELECT regnamespace('"pg_catalog"');

But if I want to prepare this query, I get the error:

postgres=#  prepare foo as SELECT regnamespace($1);
ERROR:  function regnamespace(unknown) does not exist
LINE 1: prepare foo as SELECT regnamespace($1);

Certainly, I can explicitly specify parameter type:

prepare foo (text) as SELECT regnamespace($1);

and it will work. But it is not always possible.

Actually, the coerce_type function can handle parameters normally,
but func_get_detail allows this coercion only for constants:


if (sourceType == UNKNOWNOID && IsA(arg1, Const))
{
/* always treat typename('literal') as coercion */
iscoercion = true;
}

If this condition is changed to:

if (sourceType == UNKNOWNOID && (IsA(arg1, Const) || 
IsA(arg1, Param)))


then the example above will normally work.

Why do I need it? I want to implement autoprepare.
My original intention was to let parse_analyze_varparams infer the types
of parameters from the context.
But that is not always possible, and it sometimes leads to different query
behavior.

For example if the query:

 select count(*) from test_range_gist where ir @> 10;

is replaced with

 select count(*) from test_range_gist where ir @> $1;

then the type of the parameter will be int4range rather than int, which
corresponds to a different operator.


This is why I now infer the parameter type from the literal value. But in this
case I get errors in parse_analyze_varparams, which is not able to resolve
some functions.
The fix in the func_get_detail function solves the problem and doesn't cause
any new issues: all regression tests pass.


So my question is whether it is possible to use the same rule for type
coercion of parameters as for constants?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in prepared statement cache invalidation?

2017-05-02 Thread Konstantin Knizhnik

On 05/02/2017 09:30 PM, Robert Haas wrote:

I am not sure how critical this problem is. It definitely happens rarely,
but the lack of normal workarounds (restart the backend, recreate the function?)
seems disappointing.

The problem goes away if you reconnect.  The problematic cache is only
backend-lifetime.


Most clients are not connected to Postgres directly; they are using some kind
of connection pooling.
It means that backends are never restarted, and it would be necessary to restart
the whole service just because we do not have a dependency tracking mechanism
for PL code. Even invalidation of all functions in case of DDL seems to be a
more acceptable solution.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in prepared statement cache invalidation?

2017-05-02 Thread Konstantin Knizhnik



On 01.05.2017 16:09, Robert Haas wrote:


This problem has been discussed before but nobody's done anything
about it.  The problem is a bit tricky because the core system doesn't
know anything about the function caches maintained by individual PLs.
I suppose ideally there'd be a way for a PL to say "if the definition
of X changes, please tell me to recompile function Y".  That probably
wouldn't be perfect because the PL might not be able to figure out
everything on which they actually depend; that might be
Turing-complete in some cases.  But even a partial solution would
probably be welcomed by users.



Thank you for the explanation.
Maybe I am missing something, but what is the problem with keeping
dependencies for PL functions?
As you wrote, a PL can inform the core that a function depends on some set of
relations/types/functions and so has to be recompiled if any of them are
changed.
It is not necessary to build the closure of the dependency graph - cascade
invalidation can be used instead.
So it is not clear to me where you see the source of complexity here and why
this task may be "Turing-complete in some cases".


The problem can arise with overloaded functions and PL languages without
static type checking.
In this case resolution has to be performed at runtime during function
evaluation. But there should be no such problem with PL/pgSQL.


But introducing such a dependency tracking mechanism for PLs will definitely
require a lot of changes in all PL implementations. It looks like no easy fix
is possible here.
I am not sure how critical this problem is. It definitely happens rarely, but
the lack of normal workarounds (restart the backend, recreate the function?)
seems disappointing.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-05-02 Thread Konstantin Knizhnik



On 01.05.2017 18:52, Robert Haas wrote:
On Fri, Apr 28, 2017 at 6:01 AM, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:



Any comments and suggestions for future improvement of this patch
are welcome.


+PG_TRY();
+{
+query = parse_analyze_varparams(parse_tree,
+query_string,
+&param_types,
+&num_params);
+}
+PG_CATCH();
+{
+/*
+ * In case of analyze errors revert back to original 
query processing
+ * and disable autoprepare for this query to avoid such 
problems in future.

+ */
+FlushErrorState();
+if (snapshot_set) {
+PopActiveSnapshot();
+}
+entry->disable_autoprepare = true;
+undo_query_plan_changes(parse_tree, const_param_list);
+MemoryContextSwitchTo(old_context);
+return false;
+}
+PG_END_TRY();

This is definitely not a safe way of using TRY/CATCH.

+
+/* Convert literal value to parameter value */
+switch (const_param->literal->val.type)
+{
+  /*
+   * Convert from integer literal
+   */
+  case T_Integer:
+switch (ptype) {
+  case INT8OID:
+params->params[paramno].value = 
Int64GetDatum((int64)const_param->literal->val.val.ival);

+break;
+  case INT4OID:
+params->params[paramno].value = 
Int32GetDatum((int32)const_param->literal->val.val.ival);

+break;
+  case INT2OID:
+if (const_param->literal->val.val.ival < SHRT_MIN
+|| const_param->literal->val.val.ival > SHRT_MAX)
+{
+ereport(ERROR,
+ (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("smallint out of range")));
+}
+params->params[paramno].value = 
Int16GetDatum((int16)const_param->literal->val.val.ival);

+break;
+  case FLOAT4OID:
+params->params[paramno].value = 
Float4GetDatum((float)const_param->literal->val.val.ival);

+break;
+  case FLOAT8OID:
+params->params[paramno].value = 
Float8GetDatum((double)const_param->literal->val.val.ival);

+break;
+  case INT4RANGEOID:
+sprintf(buf, "[%ld,%ld]", 
const_param->literal->val.val.ival, const_param->literal->val.val.ival);

+getTypeInputInfo(ptype, &typinput, &typioparam);
+params->params[paramno].value = 
OidInputFunctionCall(typinput, buf, typioparam, -1);

+break;
+  default:
+ pg_lltoa(const_param->literal->val.val.ival, buf);
+getTypeInputInfo(ptype, &typinput, &typioparam);
+params->params[paramno].value = 
OidInputFunctionCall(typinput, buf, typioparam, -1);

+}
+break;
+  case T_Null:
+params->params[paramno].isnull = true;
+break;
+  default:
+/*
+ * Convert from string literal
+ */
+getTypeInputInfo(ptype, &typinput, &typioparam);
+params->params[paramno].value = 
OidInputFunctionCall(typinput, const_param->literal->val.val.str, 
typioparam, -1);

+}

I don't see something with a bunch of hard-coded rules for particular 
type OIDs having any chance of being acceptable.




Well, what I need is to convert a literal value represented in a Value
struct to a parameter datum value.

The Value struct contains a union with an integer literal and text.
So this piece of code just provides efficient handling of the most common
cases (integer parameters) and uses the type's input function in other cases.



This patch seems to duplicate a large amount of existing code.  That 
would be a good thing to avoid.


Yes, I have to copy a lot of code from the exec_parse_message,
exec_bind_message and exec_execute_message functions.
Copying code is definitely a bad flaw. It would be much better and easier just
to call the three original functions instead of gathering their code into the
new function.

But I failed to do it because
1. Autoprepare should be integrated into exec_simple_query. Before executing
the query in the normal way, I need to perform a cache lookup for a previously
prepared plan for this generalized query. And generalization of the query
requires building the query tree (query parsing). In other words, parsing
should be done before I can call exec_parse_message.
2. exec_b

Re: [HACKERS] Cached plans and statement generalization

2017-04-28 Thread Konstantin Knizhnik



On 26.04.2017 13:46, Pavel Stehule wrote:


I attach a new patch which allows limiting the number of
autoprepared statements (the autoprepare_limit GUC variable).
Also I did more measurements, now with several concurrent
connections and read-only statements.
Results of pgbench with 10 connections, scale 10 and read-only
statements are below:

Protocol             TPS
extended             87k
prepared             209k
simple+autoprepare   206k


As you can see, autoprepare provides more than a 2 times speed
improvement.

Also I tried to measure the overhead of parsing (to be able to
substitute all literals, not only string literals).
I just added an extra call of pg_parse_query. Speed is reduced to 181k.
So the overhead is noticeable, but it still makes such an optimization useful.
This is why I want to ask the question: is it better to implement the
slower but safer and more universal solution?


An unsafe solution does not make any sense, and it is dangerous (80% of
database users do not have the necessary knowledge). If somebody needs the
maximum possible performance, then he can use explicit prepared statements.




I attached a new patch to this mail. I have completely reimplemented my
original approach and now use a parse tree transformation.

New pgbench (-S -c 10) results are the following:

Protocol             TPS
extended             87k
prepared             209k
simple+autoprepare   185k


So there is some slowdown compared with my original implementation and with
explicitly prepared statements, but it still provides a more than two times
speed-up compared with unprepared queries. And it doesn't require changing
existing applications.
As most real production applications work with the DBMS through some
connection pool (pgbouncer, ...), I think that such an optimization will be
useful.
Isn't it interesting if we can increase system throughput almost two times by
just setting one parameter in the configuration file?


I also tried to enable autoprepare by default and run the regression tests.
7 tests do not pass, for the following reasons:
1. Slightly different error reporting (for example, the error location is not
always specified identically).
2. Differences in query behavior caused by changed local settings (Andres
gives an example with search_path, and the date test fails because of a
changed datestyle).
3. Problems with indirect dependencies (when a table is altered, only cached
plans directly depending on this relation are invalidated, but not plans with
indirect dependencies).

4. Domain checks are not performed for null values.

I do not think that these issues can cause problems for real applications.

Also it is possible to limit the number of autoprepared statements using the
autoprepare_limit parameter, to avoid possible backend memory overflow in case
of a large number of unique queries sent by an application. An LRU discipline
is used to drop the least recently used plans.
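
A rough sketch of how such an LRU discipline can be wired up with the backend's dlist machinery, assuming the plan_cache_entry struct (with its dlist_node lru field) and the autoprepare_limit GUC from the attached patch; the helper names and the global list are illustrative:

#include "postgres.h"
#include "lib/ilist.h"
#include "utils/hsearch.h"
#include "utils/plancache.h"

extern int	autoprepare_limit;			/* GUC introduced by the patch */

/* Most recently used entries live at the head; the tail is the victim. */
static dlist_head plan_cache_lru = DLIST_STATIC_INIT(plan_cache_lru);
static int	plan_cache_used = 0;

/* Move an entry to the head of the LRU list on every cache hit. */
static void
plan_cache_touch(plan_cache_entry *entry)
{
	dlist_delete(&entry->lru);
	dlist_push_head(&plan_cache_lru, &entry->lru);
}

/* Drop the least recently used entry once autoprepare_limit is reached. */
static void
plan_cache_evict_if_needed(HTAB *plan_cache)
{
	if (autoprepare_limit > 0 && plan_cache_used >= autoprepare_limit)
	{
		plan_cache_entry *victim =
			dlist_container(plan_cache_entry, lru,
							dlist_tail_node(&plan_cache_lru));

		dlist_delete(&victim->lru);
		DropCachedPlan(victim->plan);	/* frees the CachedPlanSource */
		hash_search(plan_cache, victim, HASH_REMOVE, NULL);
		plan_cache_used--;
	}
}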


Any comments and suggestions for future improvement of this patch are 
welcome.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index cd39167..4fbc8b7 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -3610,6 +3610,454 @@ raw_expression_tree_walker(Node *node,
 }
 
 /*
+ * raw_expression_tree_mutator --- transform a raw parse tree.
+ *
+ * This function implements a slightly different approach to tree updates than expression_tree_mutator():
+ * the callback is given a pointer to the pointer to the current node and can update that field instead of returning a reference to a new node.
+ * This makes it possible to remember changes and easily revert them without an extra traversal of the tree.
+ *
+ * This function does not need the QTW_DONT_COPY_QUERY flag: it never implicitly copies tree nodes; it updates them in place.
+ * 
+ * Like raw_expression_tree_walker, there is no special rule about query
+ * boundaries: we descend to everything that's possibly interesting.
+ *
+ * Currently, the node type coverage here extends only to DML statements
+ * (SELECT/INSERT/UPDATE/DELETE) and nodes that can appear in them, because
+ * this is used mainly during analysis of CTEs, and only DML statements can
+ * appear in CTEs. If some other node is visited, iteration is immediately stopped and true is returned.
+ */
+bool
+raw_expression_tree_mutator(Node *node,
+			bool (*mutator) (),
+			void *context)
+{
+	ListCell   *temp;
+
+	/*
+	 * The walker has already visited the current node, and so we need only
+	 * recurse into any sub-nodes it has.
+	 */
+	if (node == NULL)
+		return false;
+
+	/* Guard against stack overflow due to overly complex expressions */
+	check_stack_depth();
+
+	switch (nodeTag(node))
+	{
+		case T_SetToDefault:
+		case T_CurrentOfExpr:
+		case T_Integer:
+		case T_Float:
+		case T_String:
+		case T_BitString:
+		case T_Null:
+		case T_ParamRef:
+		case T_A_Const:
+		case

[HACKERS] Bug in prepared statement cache invalidation?

2017-04-28 Thread Konstantin Knizhnik

Hi hackers,

I found out that Postgres now correctly invalidates prepared plans which
directly depend on an altered relation, but doesn't invalidate plans having
transitive (indirect) dependencies.

Is it a bug or a feature?

postgres=# create table foo(x integer);
CREATE TABLE
postgres=# select * from foo;
 x
---
(0 rows)

postgres=# create function returnqueryf()returns setof foo  as $$ begin 
return query select * from foo; end; $$ language plpgsql;

CREATE FUNCTION
postgres=# select * from returnqueryf();
 x
---
(0 rows)

postgres=# create function returnqueryff()returns setof foo  as $$ begin 
return query select * from returnqueryf(); end; $$ language plpgsql;

CREATE FUNCTION
postgres=# select * from returnqueryff();
 x
---
(0 rows)

postgres=# alter table foo add column y integer;
ALTER TABLE
postgres=# select * from foo;
 x | y
---+---
(0 rows)

postgres=# select * from returnqueryf();
 x | y
---+---
(0 rows)

postgres=# select * from returnqueryff();
ERROR:  structure of query does not match function result type
DETAIL:  Number of returned columns (1) does not match expected column 
count (2).

CONTEXT:  PL/pgSQL function returnqueryff() line 1 at RETURN QUERY
p

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-04-26 Thread Konstantin Knizhnik

On 04/26/2017 08:08 PM, Doug Doole wrote:


A naive option would be to invalidate anything that depends on table or 
view *.FOOBAR. You could probably make it a bit smarter by also requiring that 
schema A appear in the path.


This has been rumbling around in my head. I wonder if you could solve this 
problem by registering dependencies on objects which don't yet exist. Consider:

CREATE TABLE C.T1(...);
CREATE TABLE C.T2(...);
SET search_path='A,B,C,D';
SELECT * FROM C.T1, T2;

For T1, you'd register a hard dependency on C.T1 and no virtual dependencies 
since the table is explicitly qualified.

For T2, you'd register a hard dependency on C.T2 since that is the table that was selected for the query. You'd also register virtual dependencies on A.T2 and B.T2 since if either of those tables (or views) are created you need to recompile the 
statement. (Note that no virtual dependency is created on D.T2() since that table would never be selected by the compiler.)


The catch is that virtual dependencies would have to be recorded and searched 
as strings, not OIDs since the objects don't exist. Virtual dependencies only 
have to be checked during CREATE processing though, so that might not be too 
bad.

But this is getting off topic - I just wanted to capture the idea while it was 
rumbling around.


I think it will be enough to handle modification of the search path and
invalidate the prepared statement cache in this case.
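
A minimal sketch of that idea, reusing the existing search-path helpers from catalog/namespace.h; the struct and field names here are illustrative, loosely modeled on the plan_cache_entry from the attached patch:

#include "postgres.h"
#include "catalog/namespace.h"
#include "utils/memutils.h"

/* Hypothetical cache entry extension: remember the search path that was
 * active when the plan was cached. */
typedef struct cached_entry_sketch
{
	const char		   *query;
	OverrideSearchPath *search_path;	/* captured at caching time */
} cached_entry_sketch;

/* Capture the active search path when an entry is created. */
static void
remember_search_path(cached_entry_sketch *entry)
{
	entry->search_path = GetOverrideSearchPath(CacheMemoryContext);
}

/* Refuse to reuse an entry whose search path no longer matches the
 * current one; the caller can then drop or rebuild the cached plan. */
static bool
search_path_still_valid(cached_entry_sketch *entry)
{
	return OverrideSearchPathMatchesCurrent(entry->search_path);
}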

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Cached plans and statement generalization

2017-04-26 Thread Konstantin Knizhnik



On 26.04.2017 10:49, Konstantin Knizhnik wrote:



On 26.04.2017 04:00, Tsunakawa, Takayuki wrote: Are you considering
some upper limit on the number of prepared statements?
In this case we need some kind of LRU for maintaining the cache of
autoprepared statements.
I think that it is a good idea to have such a limited cache - it can
avoid the memory overflow problem.

I will try to implement it.


I attach a new patch which allows limiting the number of autoprepared
statements (the autoprepare_limit GUC variable).
Also I did more measurements, now with several concurrent connections
and read-only statements.
Results of pgbench with 10 connections, scale 10 and read-only
statements are below:


Protocol             TPS
extended             87k
prepared             209k
simple+autoprepare   206k


As you can see, autoprepare provides more than a 2 times speed improvement.

Also I tried to measure the overhead of parsing (to be able to substitute
all literals, not only string literals).

I just added an extra call of pg_parse_query. Speed is reduced to 181k.
So the overhead is noticeable, but it still makes such an optimization useful.
This is why I want to ask the question: is it better to implement the slower
but safer and more universal solution?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f6be98b..0c9abfc 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -188,6 +188,7 @@ static bool IsTransactionStmtList(List *parseTrees);
 static void drop_unnamed_stmt(void);
 static void SigHupHandler(SIGNAL_ARGS);
 static void log_disconnections(int code, Datum arg);
+static bool exec_cached_query(const char *query_string);
 
 
 /* 
@@ -916,6 +917,14 @@ exec_simple_query(const char *query_string)
 	drop_unnamed_stmt();
 
 	/*
+	 * Try to find cached plan
+	 */
+	if (autoprepare_threshold != 0 && exec_cached_query(query_string))
+	{
+		return;
+	}
+
+	/*
 	 * Switch to appropriate context for constructing parsetrees.
 	 */
 	oldcontext = MemoryContextSwitchTo(MessageContext);
@@ -4500,3 +4509,606 @@ log_disconnections(int code, Datum arg)
 	port->user_name, port->database_name, port->remote_host,
   port->remote_port[0] ? " port=" : "", port->remote_port)));
 }
+
+typedef struct { 
+	char const*   query;
+	dlist_node    lru;
+	int64 exec_count;
+	CachedPlanSource* plan;	
+	int   n_params;
+	int16 format;
+	bool  disable_autoprepare;
+} plan_cache_entry;
+
+/*
+ * Replace string literals with parameters. We do not consider integer or real literals, to avoid problems with
+ * negative numbers, user defined operators, ... For example, it is not easy to distinguish the cases (-1), (1-1), (1-1)-1
+ */
+static void generalize_statement(const char *query_string, char** gen_query, char** query_params, int* n_params)
+{
+	size_t query_len = strlen(query_string);
+	char const* src = query_string;
+	char* dst;
+	char* params;
+	unsigned char ch;
+
+	*n_params = 0;
+
+	*gen_query = (char*)palloc(query_len*2); /* assume that we have less than 1000 parameters, the worst case is replacing '' with $999 */
+	*query_params = (char*)palloc(query_len + 1);
+	dst = *gen_query;
+	params = *query_params;
+
+	while ((ch = *src++) != '\0') { 
+		if (isspace(ch)) { 
+			/* Replace sequence of whitespaces with just one space */
+			while (*src && isspace(*(unsigned char*)src)) { 
+src += 1;
+			}
+			*dst++ = ' ';
+		} else if (ch == '\'') { 
+			while (true) { 
+ch = *src++;
+if (ch == '\'') { 
+	if (*src != '\'') { 
+		break;
+	} else {
+		/* escaped quote */
+		*params++ = '\'';
+		src += 1;
+	}
+} else { 
+	*params++ = ch;
+}
+			}
+			*params++ = '\0';
+			dst += sprintf(dst, "$%d", ++*n_params);
+		} else { 
+			*dst++ = ch;
+		}
+	}			
+	Assert(dst <= *gen_query + query_len*2);
+	Assert(params <= *query_params + query_len);
+	*dst = '\0';
+}
+
+static uint32 plan_cache_hash_fn(const void *key, Size keysize)
+{
+	return string_hash(((plan_cache_entry*)key)->query, 0);
+}
+
+static int plan_cache_match_fn(const void *key1, const void *key2, Size keysize)
+{
+	return strcmp(((plan_cache_entry*)key1)->query, ((plan_cache_entry*)key2)->query);
+}
+
+static void* plan_cache_keycopy_fn(void *dest, const void *src, Size keysize)
+{ 
+	((plan_cache_entry*)dest)->query = pstrdup(((plan_cache_entry*)src)->query);
+return dest;
+}
+
+#define PLAN_CACHE_SIZE 113
+
+size_t nPlanCacheHits;
+size_t nPlanCacheMisses;
+
+/*
+ * Try to generalize query, find cached plan for it and execute
+ */
+static bool exec_cached_query(const char *query_string)
+{
+	CommandDest   dest = whereToSendOutput;
+	DestReceiver *receiver;
+	char *gen_query;
+	char 

Re: [HACKERS] Cached plans and statement generalization

2017-04-26 Thread Konstantin Knizhnik



On 26.04.2017 04:00, Tsunakawa, Takayuki wrote:

From: pgsql-hackers-ow...@postgresql.org

[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Konstantin
Knizhnik
Well, first of all I want to share results I already get: pgbench with default
parameters,  scale 10 and one connection:

So autoprepare is as efficient as explicit prepare and can increase
performance almost two times.

This sounds great.

BTW, when are the autoprepared statements destroyed?
Right now they are destroyed only upon receiving an invalidation
message (when the catalog is changed).
Prepared statements are local to a backend and are located in the backend's
memory.
It is unlikely that there will be so many different queries that they
cause memory overflow.

But in theory such a situation is certainly possible.



  Are you considering some upper limit on the number of prepared statements?
In this case we need some kind of LRU for maintaining the cache of
autoprepared statements.
I think that it is a good idea to have such a limited cache - it can avoid
the memory overflow problem.

I will try to implement it.



Regards
Takayuki Tsunakawa




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-04-26 Thread Konstantin Knizhnik



On 26.04.2017 01:34, Andres Freund wrote:

Hi,

(FWIW, on this list we don't do top-quotes)

On 2017-04-25 22:21:22 +, Doug Doole wrote:

Plan invalidation was no different than for any SQL statement. DB2 keeps a
list of the objects the statement depends on. If any of the objects changes
in an incompatible way the plan is invalidated and kicked out of the cache.

I suspect what is more interesting is plan lookup. DB2 has something called
the "compilation environment". This is a collection of everything that
impacts how a statement is compiled (SQL path, optimization level, etc.).
Plan lookup is done using both the statement text and the compilation
environment. So, for example, if my path is DOUG, MYTEAM, SYSIBM and your
path is ANDRES, MYTEAM, SYSIBM we will have different compilation
environments. If we both issue "SELECT * FROM T" we'll end up with
different cache entries even if T in both of our statements resolves to
MYTEAM.T. If I execute "SELECT * FROM T", change my SQL path and then
execute "SELECT * FROM T" again, I have a new compilation environment so
the second invocation of the statement will create a new entry in the
cache. The first entry is not kicked out - it will still be there for
re-use if I change my SQL path back to my original value (modulo LRU for
cache memory management of course).

It's not always that simple, at least in postgres, unless you disregard
search_path.  Consider e.g. cases like

CREATE SCHEMA a;
CREATE SCHEMA b;
CREATE TABLE a.foobar(somecol int);
SET search_path = 'b,a';
SELECT * FROM foobar;
CREATE TABLE b.foobar(anothercol int);
SELECT * FROM foobar; -- may not be cached plan from before!

it sounds - my memory of DB2 is very faint, and I never used it much -
like similar issues could arise in DB2 too?


There is the same problem with explicitly prepared statements, isn't there?
Certainly, when using prepared statements it is the responsibility of the
programmer to avoid such collisions.

In case of autoprepare it is hidden from the programmer.
But there is a GUC variable controlling the autoprepare feature, and by
default it is switched off.
So if a programmer or DBA enables it, then they should take into account the
effects of such a decision.


By the way, isn't it a bug in PostgreSQL that altering the search path does
not invalidate cached plans?
As I already mentioned, the same problem can be reproduced with 
explicitly prepared statements.






Greetings,

Andres Freund


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-04-26 Thread Konstantin Knizhnik



On 26.04.2017 00:47, Andres Freund wrote:

On 2017-04-25 21:11:08 +, Doug Doole wrote:

When I did this in DB2, I didn't use the parser - it was too expensive. I
just tokenized the statement and used some simple rules to bypass the
invalid cases. For example, if I saw the tokens "ORDER" and "BY" then I'd
disallow replacement until I hit the end of the current
subquery or statement.

How did you manage plan invalidation and such?


The same mechanism as for prepared statements:
cached plans are linked into the list by the SaveCachedPlan function and are
invalidated by PlanCacheRelCallback.





- Andres




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-04-25 Thread Konstantin Knizhnik

On 04/25/2017 11:40 PM, Serge Rielau wrote:



On Apr 25, 2017, at 1:37 PM, Konstantin Knizhnik <k.knizh...@postgrespro.ru 
<mailto:k.knizh...@postgrespro.ru>> wrote:



SELECT ‘hello’::CHAR(10) || ‘World’, 5 + 6;

You can substitute ‘hello’, ‘World’, 5, and 6. But not 10.


I am substituting only string literals. So the query above will be transformed 
to

SELECT $1::CHAR(10) || $2, 5 + 6;

What's wrong with it?


Oh, well that leaves a lot of opportunities on the table, doesn’t it?


Well, actually my primary intention was not to make badly designed programs
(not using prepared statements) work faster.
I wanted to address cases where it is not possible to use prepared statements.
If we want to substitute as many literals as possible with parameters, then
parsing and deparsing the tree seems to be the only reasonable approach.
I will also try to implement it, just to estimate the parsing overhead.





Cheers
Serge




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Cached plans and statement generalization

2017-04-25 Thread Konstantin Knizhnik

On 04/25/2017 08:09 PM, Serge Rielau wrote:


On Tue, Apr 25, 2017 at 9:45 AM, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru> wrote:

On 25.04.2017 19:12, Serge Rielau wrote:


On Apr 25, 2017, at 8:11 AM, Konstantin Knizhnik <k.knizh...@postgrespro.ru 
<mailto:k.knizh...@postgrespro.ru>> wrote:
Another problem is caused by using integer literals in context where 
parameters can not be used, for example "order by 1”.

You will also need to deal with modifiers in types such as VARCHAR(10). 
Not sure if there are specific functions which can only deal with literals (?) 
as well.

Sorry, I do not completely understand how the presence of type modifiers can
affect string literals used in a query.
Can you provide me with an example?

SELECT ‘hello’::CHAR(10) || ‘World’, 5 + 6;

You can substitute ‘hello’, ‘World’, 5, and 6. But not 10.


I am substituting only string literals. So the query above will be transformed 
to

SELECT $1::CHAR(10) || $2, 5 + 6;

What's wrong with it?



Also some OLAP syntax like “rows preceding”

It pretty much boils down to whether you can do some shallow parsing rather 
than expending the effort to build the parse tree.

Cheers
Serge



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Cached plans and statement generalization

2017-04-25 Thread Konstantin Knizhnik

On 04/25/2017 07:54 PM, David Fetter wrote:

On Tue, Apr 25, 2017 at 06:11:09PM +0300, Konstantin Knizhnik wrote:

On 24.04.2017 21:43, Andres Freund wrote:

Hi,

On 2017-04-24 11:46:02 +0300, Konstantin Knizhnik wrote:

So what I am thinking now is implicit query caching. If the same query with
different literal values is repeated many times, then we can try to
generalize this query and replace it with prepared query with
parameters.

That's not actually all that easy:
- You pretty much do parse analysis to be able to do an accurate match.
How much overhead is parse analysis vs. planning in your cases?
- The invalidation infrastructure for this, if not tied to to fully
parse-analyzed statements, is going to be hell.
- Migrating to parameters can actually cause significant slowdowns, not
nice if that happens implicitly.

Well, first of all I want to share results I already get: pgbench with
default parameters,  scale 10 and one connection:

protocol               TPS
simple                 3492
extended               2927
prepared               6865
simple + autoprepare   6844

If this is string mashing on the unparsed query, as it appears to be,
it's going to be a perennial source of security issues.


Sorry, maybe I missed something, but I can not understand how security can be violated by extracting string literals from a query. I am just copying bytes from one buffer to another. I do not try to interpret these parameters in any way. What I am doing is
very similar to standard prepared statements.

And moreover, the query is parsed! Only a query which was already parsed and executed
(but with different values of parameters) can be autoprepared.




Best,
David.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cached plans and statement generalization

2017-04-25 Thread Konstantin Knizhnik



On 25.04.2017 19:12, Serge Rielau wrote:


On Apr 25, 2017, at 8:11 AM, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:
Another problem is caused by using integer literals in context where 
parameters can not be used, for example "order by 1”.
You will also need to deal with modifiers in types such as 
VARCHAR(10). Not sure if there are specific functions which can only 
deal with literals (?) as well.


Sorry, I do not completely understand how the presence of type modifiers can
affect string literals used in a query.

Can you provide me with an example?



Doug Doole did this work in DB2 LUW and he may be able to point to 
more places to watch out for semantically.


Generally, in my experience, this feature is very valuable when 
dealing with (poorly designed) web apps that just glue together strings.


I do not think that this optimization will be useful only for poorly
designed applications.
I already pointed out two use cases where prepared statements can not be
used:

1. pgbouncer without session-level pooling.
2. partitioning

Protecting it under a GUC would allow to only do the work if it’s 
deemed likely to help.
Another rule I find useful is to abort any efforts to substitute 
literals if any bind variable is found in the query.
That can be used as a cue that the author of the SQL left the 
remaining literals in on purpose.


A follow up feature would be to formalize different flavors of peeking.
I.e. can you produce a generic plan, but still recruit the initial set 
of bind values/substituted literals to do costing?

Here the situation is the same as for explicitly prepared statements, isn't it?
Sometimes it is preferable to use a specialized plan rather than a generic
plan.

I am not sure if Postgres is currently able to do that.




Cheers
Serge Rielau
Salesforce.com <http://salesforce.com>

PS: FWIW, I like this feature.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Cached plans and statement generalization

2017-04-25 Thread Konstantin Knizhnik

On 24.04.2017 21:43, Andres Freund wrote:

Hi,

On 2017-04-24 11:46:02 +0300, Konstantin Knizhnik wrote:

So what I am thinking now is implicit query caching. If the same query with
different literal values is repeated many times, then we can try to
generalize this query and replace it with prepared query with
parameters.

That's not actually all that easy:
- You pretty much do parse analysis to be able to do an accurate match.
   How much overhead is parse analysis vs. planning in your cases?
- The invalidation infrastructure for this, if not tied to to fully
   parse-analyzed statements, is going to be hell.
- Migrating to parameters can actually cause significant slowdowns, not
   nice if that happens implicitly.


Well, first of all I want to share the results I have already gotten: pgbench
with default parameters, scale 10 and one connection:


protocol               TPS
simple                 3492
extended               2927
prepared               6865
simple + autoprepare   6844


So autoprepare is as efficient as explicit prepare and can increase 
performance almost two times.


My current implementation replaces only string literals in the query with
parameters, i.e. select * from T where x='123'; -> select * from T where x=$1;
It greatly simplifies matching of parameters - it is just necessary to
locate the '\'' character and then correctly handle pairs of quotes.

Handling of integer and real literals is a really challenging task.
One source of problems is negation: it is not so easy to correctly
understand whether a minus should be treated as part of the literal or as an
operator:

(-1), (1-1), (1-1)-1
Another problem is caused by the use of integer literals in contexts where
parameters can not be used, for example "order by 1".


Fully correct substitution can be done by first parsing the query, then
transforming the parse tree, replacing literal nodes with parameter nodes,
and finally deparsing the tree into a generalized query. postgres_fdw
already contains such deparse code. It could be moved to the Postgres core
and reused for autoprepare (and maybe elsewhere).

But in this case the overhead will be much higher.
I still think that query parsing time is significantly smaller than the time
needed for building and optimizing the query execution plan.

But it should be measured, if the community is interested in such an approach.

There is an obvious question: how did I manage to get these pgbench results if
currently only substitution of string literals is supported and the queries
constructed by pgbench don't contain string literals? I just made a small
patch in pgbench's replaceVariable method, wrapping the value's representation
in quotes. It has almost no impact on performance (3482 TPS vs. 3492 TPS),

but it allows autoprepare to deal with pgbench queries.

I attached my patch to this mail. It is just the first version of the patch
(based on the REL9_6_STABLE branch), intended to illustrate the proposed approach.
I will be glad to receive any comments, and if such an optimization is
considered useful, I will continue work on this patch.


--

Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f6be98b..6291d66 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -96,8 +96,6 @@ int			max_stack_depth = 100;
 /* wait N seconds to allow attach from a debugger */
 int			PostAuthDelay = 0;
 
-
-
 /* 
  *		private variables
  * 
@@ -188,6 +186,7 @@ static bool IsTransactionStmtList(List *parseTrees);
 static void drop_unnamed_stmt(void);
 static void SigHupHandler(SIGNAL_ARGS);
 static void log_disconnections(int code, Datum arg);
+static bool exec_cached_query(const char *query_string);
 
 
 /* 
@@ -916,6 +915,14 @@ exec_simple_query(const char *query_string)
 	drop_unnamed_stmt();
 
 	/*
+	 * Try to find cached plan
+	 */
+	if (autoprepare_threshold != 0 && exec_cached_query(query_string))
+	{
+		return;
+	}
+
+	/*
 	 * Switch to appropriate context for constructing parsetrees.
 	 */
 	oldcontext = MemoryContextSwitchTo(MessageContext);
@@ -4500,3 +4509,566 @@ log_disconnections(int code, Datum arg)
 	port->user_name, port->database_name, port->remote_host,
   port->remote_port[0] ? " port=" : "", port->remote_port)));
 }
+
+
+
+typedef struct { 
+	char const* query;
+	int64 exec_count;
+	CachedPlanSource* plan;	
+	int n_params;
+	int16 format;
+} plan_cache_entry;
+
+
+/*
+ * Replace string literals with parameters. We do not consider integer or real literals, to avoid problems with
+ * negative numbers, user defined operators, ... For example, it is not easy to distinguish the cases (-1), (1-1), (1-1)-1
+ */
+static void generalize_statement(const char *query_string, char** gen_query, char** query_params, int* n_params)
+{
+	size_t query_len = strlen(query_string);
+	cha

Re: [HACKERS] Cached plans and statement generalization

2017-04-24 Thread Konstantin Knizhnik



On 24.04.2017 13:24, Alexander Korotkov wrote:

Hi, Konstantin!

On Mon, Apr 24, 2017 at 11:46 AM, Konstantin Knizhnik 
<k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> wrote:


There were a lot of discussions about query plan caching in
hackers mailing list, but I failed to find some clear answer for
my question and the current consensus on this question in Postgres
community. As far as I understand current state is the following:
1. We have per-connection prepared statements.
2. Queries executed inside plpgsql code are implicitly prepared.

It is not always possible or convenient to use prepared statements.
For example, if pgbouncer is used to perform connection pooling.
Another use case (which is actually the problem I am trying to
solve now) is partitioning.
Efficient execution of query to partitioned table requires
hardcoded value for partitioning key.
Only in this case optimizer will be able to construct efficient
query plan which access only affected tables (partitions).

My small benchmark for distributed partitioned table based on
pg_pathman + postgres_fdw shows 3 times degrade of performance in
case of using prepared statements.
But without prepared statements substantial amount of time is
spent in query compilation and planning. I was be able to speed up
benchmark more than two time by
sending prepared queries directly to the remote nodes.


I don't think it's correct to ask PostgreSQL hackers about a problem
which arises with pg_pathman, since pg_pathman is an extension
supported by Postgres Pro.
Since we have declarative partitioning committed to 10, I think that the
community should address this issue in the context of declarative
partitioning.
However, it's unlikely we can spot this issue with declarative
partitioning because it still uses the very inefficient constraint
exclusion mechanism. Thus, the issues you are writing about would become
visible with declarative partitioning only when constraint exclusion
is replaced with something more efficient.


Long story short, could you reproduce this issue without pg_pathman?



Sorry, I mentioned pg_pathman just as an example.
The same problems occur with partitioning based on the standard Postgres
inheritance mechanism (when I manually create derived tables and specify
constraints for them).
I have not yet tested the declarative partitioning committed to 10, but I
expect that it will also suffer from this problem (because it is based on
inheritance).
But as I wrote, I think that the problem with plan caching is wider and is
not limited just to partitioning.





--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com 
<http://www.postgrespro.com/>

The Russian Postgres Company




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



[HACKERS] Cached plans and statement generalization

2017-04-24 Thread Konstantin Knizhnik

Hi hackers,

There have been a lot of discussions about query plan caching on the hackers
mailing list, but I failed to find a clear answer to my question or the
current consensus on this question in the Postgres community. As far as I
understand, the current state is the following:

1. We have per-connection prepared statements.
2. Queries executed inside plpgsql code are implicitly prepared.

It is not always possible or convenient to use prepared statements,
for example, if pgbouncer is used to perform connection pooling.
Another use case (which is actually the problem I am trying to solve
now) is partitioning.
Efficient execution of a query against a partitioned table requires a
hardcoded value for the partitioning key.
Only in this case will the optimizer be able to construct an efficient query
plan which accesses only the affected tables (partitions).


My small benchmark for a distributed partitioned table based on pg_pathman
+ postgres_fdw shows a 3-times degradation of performance when using
prepared statements.
But without prepared statements a substantial amount of time is spent on
query compilation and planning. I was able to speed up the benchmark more
than two times by

sending prepared queries directly to the remote nodes.

So what I am thinking about now is implicit query caching. If the same query
with different literal values is repeated many times, then we can try to
generalize this query and replace it with a prepared query with
parameters. I am not considering a shared query cache now: it seems to be
much harder to implement. But local caching of generalized queries seems to
be not so difficult to implement, and it requires not so many changes in the
Postgres code. And it can be useful not only for sharding, but for many
other cases where prepared statements can not be used.


I wonder if such an option was already considered and, if it was rejected for
some reason, can you point me at those reasons?


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FDW and parallel execution

2017-04-11 Thread Konstantin Knizhnik



On 04.04.2017 13:29, Kyotaro HORIGUCHI wrote:

Hi,

At Sun, 02 Apr 2017 16:30:24 +0300, Konstantin Knizhnik <k.knizh...@postgrespro.ru> 
wrote in <58e0fcf0.2070...@postgrespro.ru>

Hi hackers, and personally Robert (you are the best expert in both
areas).
I want to ask one more question concerning parallel execution and FDW.
Below are two plans for the same query (TPC-H Q5): one for normal
tables, another for FDW to vertical representation of the same data.
FDW supports analyze function and is expected to produce the similar
statistic as for original tables.



The plans look very similar, but first one is parallel and second -
not.
My FDW provides implementation for IsForeignScanParallelSafe which
returns true.
I wonder what can prevent optimizer from using parallel plan in this
case?

Parallel execution requires partial paths. It's the work for
GetForeignPaths of your FDW.
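
(For illustration, a rough sketch of what adding such a partial path in GetForeignPaths might look like; the costs, the worker count and all names here are placeholders rather than code from any real FDW:

#include "postgres.h"
#include "foreign/fdwapi.h"
#include "optimizer/pathnode.h"

static void
mySketchGetForeignPaths(PlannerInfo *root, RelOptInfo *baserel, Oid foreigntableid)
{
	Cost		startup_cost = 10;	/* placeholder costs */
	Cost		total_cost = startup_cost + baserel->rows;
	ForeignPath *path;

	/* the ordinary, non-partial path is added with add_path() as usual */
	add_path(baserel, (Path *)
			 create_foreignscan_path(root, baserel, NULL, baserel->rows,
									 startup_cost, total_cost,
									 NIL, NULL, NULL, NIL));

	/* a partial path is what allows the planner to put a Gather on top */
	if (baserel->consider_parallel)
	{
		path = create_foreignscan_path(root, baserel, NULL, baserel->rows,
									   startup_cost, total_cost,
									   NIL, NULL, NULL, NIL);
		path->path.parallel_aware = true;	/* will use the DSM callbacks */
		path->path.parallel_safe = true;
		path->path.parallel_workers = 2;	/* illustrative */
		add_partial_path(baserel, (Path *) path);
	}
}
)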


Thank you very much for the explanation.
But unfortunately I still do not completely understand what kind of
queries allow parallel execution with an FDW.


Section "FDW Routines for Parallel Execution" of FDW specification says:
A ForeignScan node can, optionally, support parallel execution. A 
parallel ForeignScan will be executed in multiple processes and should 
return each row only once across all cooperating processes. To do 
this, processes can coordinate through fixed size chunks of dynamic 
shared memory. This shared memory is not guaranteed to be mapped at 
the same address in every process, so pointers may not be used. The 
following callbacks are all optional in general, but required if 
parallel execution is to be supported.


I provided IsForeignScanParallelSafe, EstimateDSMForeignScan,
InitializeDSMForeignScan and InitializeWorkerForeignScan in my FDW.

IsForeignScanParallelSafe returns true.
Also, in the GetForeignPaths function I created a path with
baserel->consider_parallel == true.

Is that enough, or should I do something else?

But unfortunately I failed to find any query - sequential scan, grand
aggregation, aggregation with group by, joins... - for which a parallel
execution plan is used with this FDW.
Also, there are no examples of using these functions in the Postgres
distribution, and I failed to find any such examples on the Internet.

Can somebody please clarify the situation with parallel execution and FDW,
and maybe point at some examples?

Thanks in advance.
Thank in advance.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



[HACKERS] FDW and parallel execution

2017-04-02 Thread Konstantin Knizhnik
t=20.05..40.79 rows=15 width=36)
 Hash Cond: 
(nation.n_regionkey = region.r_regionkey)
 ->  Seq Scan on nation 
 (cost=0.00..17.70 rows=770 width=40)
 -> Hash  
(cost=20.00..20.00 rows=4 width=4)
->  Seq Scan on region  (cost=0.00..20.00 rows=4 width=4)
Filter: ((r_name)::text = 'ASIA'::text)
   ->  Hash  (cost=294718.76..294718.76 rows=2284376 
width=8)
 ->  Foreign Scan on orders_fdw 
(cost=0.00..294718.76 rows=2284376 width=8)
 ->  Hash  (cost=32605.64..32605.64 rows=1500032 width=8)
   ->  Foreign Scan on customer_fdw 
(cost=0.00..32605.64 rows=1500032 width=8)

The plans look very similar, but the first one is parallel and the second is not.
My FDW provides an implementation of IsForeignScanParallelSafe which returns true.
I wonder what can prevent the optimizer from using a parallel plan in this case?

Thank in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: [HACKERS] Parallel query execution with SPI

2017-03-31 Thread Konstantin Knizhnik



On 31.03.2017 13:48, Robert Haas wrote:

On Fri, Mar 31, 2017 at 3:33 AM, Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:

It is possible to execute query concurrently using SPI?
If so, how it can be enforced?
I tried to open cursor with CURSOR_OPT_PARALLEL_OK flag but it doesn't help:
query is executed by single backend while the same query been launched at
top level uses parallel plan:

 fsstate->portal = SPI_cursor_open_with_args(NULL, fsstate->query,
fsstate->numParams, argtypes, values, nulls, true, CURSOR_OPT_PARALLEL_OK);
 ...
 SPI_cursor_fetch(fsstate->portal, true, 1);

Parallel execution isn't possible if you are using a cursor-type
interface, because a parallel query can't be suspended and resumed
like a non-parallel query.  If you use a function that executes the
query to completion in one go, like SPI_execute_plan, then it's cool.
See also commit 61c2e1a95f94bb904953a6281ce17a18ac38ee6d.


Thank you very much for the explanation.
When using SPI_execute the query is really executed in parallel.
But it means that when I am executing some query using SPI, I need to
somehow predict the number of returned tuples.
If it is not too many, then it is better to use SPI_execute to allow
parallel execution of the query.
But if it is large enough, then SPI_execute without a limit can cause
memory overflow.
Certainly, I can specify some reasonable limit and, if it is reached, then
use a cursor instead.

But that is neither convenient nor efficient.

I wonder if somebody can suggest a better solution?
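
(For reference, a minimal sketch of the two alternatives discussed here, assuming an SPI connection is already established; result processing is elided, and only the run-to-completion variant is eligible for a parallel plan:

#include "postgres.h"
#include "executor/spi.h"

/* 1. Run the query to completion: a parallel plan may be used, but the
 *    whole result set is materialized in SPI_tuptable. */
static void
run_to_completion(const char *query)
{
	if (SPI_execute(query, true /* read_only */, 0 /* no row limit */) < 0)
		elog(ERROR, "SPI_execute failed");
	/* ... walk SPI_tuptable->vals[0 .. SPI_processed - 1] ... */
}

/* 2. Fetch through a cursor in batches: memory stays bounded, but, as
 *    explained above, a suspendable cursor cannot run a parallel plan. */
static void
fetch_in_batches(const char *query)
{
	Portal		portal = SPI_cursor_open_with_args(NULL, query,
												   0, NULL, NULL, NULL,
												   true /* read_only */, 0);

	for (;;)
	{
		SPI_cursor_fetch(portal, true, 1000);
		if (SPI_processed == 0)
			break;
		/* ... walk SPI_tuptable ... */
	}
	SPI_cursor_close(portal);
}
)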

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parallel query execution with SPI

2017-03-31 Thread Konstantin Knizhnik

Hi hackers,

Is it possible to execute a query in parallel using SPI?
If so, how can it be enforced?
I tried to open a cursor with the CURSOR_OPT_PARALLEL_OK flag but it doesn't
help: the query is executed by a single backend, while the same query
launched at top level uses a parallel plan:


fsstate->portal = SPI_cursor_open_with_args(NULL, fsstate->query, 
fsstate->numParams, argtypes, values, nulls, true, CURSOR_OPT_PARALLEL_OK);

...
SPI_cursor_fetch(fsstate->portal, true, 1);

Thanks in advance,

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Faster Expression Processing v4

2017-03-13 Thread Konstantin Knizhnik



On 13.03.2017 11:03, Andres Freund wrote:

Hi,

On 2017-03-12 05:40:51 +0100, Tomas Vondra wrote:

I wanted to do a bit of testing and benchmarking on this, but 0004 seems to
be a bit broken.

Well, "broken" in the sense that it's already outdated, because other
stuff that got merged.



The patch does not apply anymore - there are some conflicts
in execQual.c, but I think I fixed those. But then I ran into a bunch of
compile-time errors, because some of the executor nodes still reference bits
that were moved elsewhere.

Updated patch attached.  Note that this patch has two changes I've not
yet evaluated performance-wise.



I got the following results on my system (Intel(R) Core(TM) i7-4770 
CPU @ 3.40GHz, 16GB RAM) for TPC-H Q1/Q6 at scale 10, with shared_buffers=8GB 
and pg_prewarm run on the lineitem table projection:



              Q1        Q6
Master        7503 ms   1171 ms
Your patch    6420 ms   1034 ms
VOPS           396 ms    249 ms
VOPS + patch   367 ms    233 ms



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



[HACKERS] Two questions about Postgres parser

2017-02-27 Thread Konstantin Knizhnik

Hi hackers,

While working on a vectorized extension for Postgres (VOPS), I ran into two 
things in the Postgres parser which break my expectations and force me to 
abandon my original implementation plan. I wonder if it is really 
intentional and correct that:


1. A moving-aggregate implementation must return the same result type as the 
plain implementation. Yes, in most cases it is hard to find arguments why they 
should return different types. But that is not true for vectorized 
operations...


2. Implicit user-defined type casts are not applied for the COALESCE operator:

create type complex as (x float8, y float8);
create function float2complex(x float8) returns complex as $$ 
declare c complex; begin c.x := x; c.y := 0; return c; end; $$ language 
plpgsql strict immutable;
create cast (float8 as complex) with function float2complex(float8) 
as implicit;

create table foo(c complex);
select coalesce(c, 0.0) from foo;
ERROR:  COALESCE types complex and numeric cannot be matched
LINE 1: select coalesce(c, 0.0) from foo;

select coalesce(c, 0.0::float8) from foo;
ERROR:  COALESCE types complex and double precision cannot be matched
LINE 1: select coalesce(c, 0.0::float8) from foo;

select coalesce(c, 0.0::float8::complex) from foo;
 coalesce
--
(0 rows)

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] VOPS: vectorized executor for Postgres: how to speedup OLAP queries more than 10 times without changing anything in Postgres executor

2017-02-16 Thread Konstantin Knizhnik
More progress on the vectorized Postgres extension (VOPS): it is no longer 
required to use special functions in queries.
You can use vector operators in queries written in standard SQL and still get 
a ten-fold improvement on some queries.

The VOPS extension now uses the post-parse-analyze hook to transform queries.
I am really impressed by the flexibility and extensibility of the Postgres 
type system: user-defined types do most of the work.
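For reference, installing such a hook looks roughly like this - a minimal sketch in C, not the actual VOPS code; vops_post_parse_analyze() and its body are placeholders, and the hook signature is the one used by the PostgreSQL versions discussed here (later releases added an extra parameter):

#include "postgres.h"
#include "fmgr.h"
#include "parser/analyze.h"

PG_MODULE_MAGIC;

static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;

static void
vops_post_parse_analyze(ParseState *pstate, Query *query)
{
    /* chain to any previously installed hook */
    if (prev_post_parse_analyze_hook)
        prev_post_parse_analyze_hook(pstate, query);

    /* here the parsed query tree can be rewritten to reference the tiled
     * projection and the vector operators (details omitted) */
}

void
_PG_init(void)
{
    prev_post_parse_analyze_hook = post_parse_analyze_hook;
    post_parse_analyze_hook = vops_post_parse_analyze;
}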


It is still the responsibility of the programmer or database administrator to 
create proper projections
of the original table. These projections need to use tile types for some 
attributes (vops_float4, ...).
Then you can query this table using standard SQL, and the query will be 
executed using vector operations!


Examples of such TPC-H queries:

Q1:
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
vops_lineitem_projection
where
l_shipdate <= '1998-12-01'::date
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;



Q6:
select
sum(l_extendedprice*l_discount) as revenue
from
lineitem_projection
where
l_shipdate between '1996-01-01'::date and '1997-01-01'::date
and l_discount between 0.08 and 0.1
and l_quantity < 24;



On 13.02.2017 17:12, Konstantin Knizhnik wrote:

Hello hackers,

There were many discussions concerning possible ways of speeding up 
Postgres. Different approaches were suggested:


- JIT (now we have three different prototype implementations based on 
LLVM)

- Chunked (vectorized) executor
- Replacing pull with push
- Columnar store (cstore_fdw, IMCS)
- Optimizing and improving current executor (reducing tuple deform 
overhead, function call overhead,...)


Obviously the best result can be achieved by combining all of these 
approaches. But actually they are more or less interchangeable: 
vectorized execution does not eliminate interpretation overhead, but 
the overhead is divided by the vector size and becomes less critical.


I decided to write a small prototype to estimate the possible speed 
improvement of a vectorized executor. I created special types 
representing a "tile" and implemented standard SQL operators for them. So 
neither the Postgres planner, nor the Postgres executor, nor the Postgres 
heap manager is changed. But I was able to reach more than 10 times 
speed improvement on the TPC-H Q1/Q6 queries!


Please find more information here: 
https://cdn.rawgit.com/postgrespro/vops/ddcbfbe6/vops.html
The sources of the project can be found here: 
https://github.com/postgrespro/vops.git




--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company





Re: [HACKERS] Sum aggregate calculation for single precsion real

2017-02-14 Thread Konstantin Knizhnik



On 14.02.2017 16:59, Jim Nasby wrote:

On 2/13/17 10:45 AM, Konstantin Knizhnik wrote:

That is not true - please notice the query execution times of these two queries:


I bet you'd get even less difference if you simply cast to float8 
instead of adding 0.0. Same result, no floating point addition.



The expectation for SUM(float4) is that you want speed and are
prepared to cope with the consequences.  It's easy enough to cast your
input to float8 if you want a wider accumulator, or to numeric if
you'd like more stable (not necessarily more accurate :-() results.
I do not think it's the database's job to make those choices for you.


From my point of view this is a strange and wrong expectation.
I am choosing the "float4" type for a column just because it is enough to
represent the range of data I have and I need to minimize the record size.


In other words, you've decided to trade accuracy for performance...


I cannot agree with that...
1. If I choose the float4 type to store a bid price (which usually has 5-6 
significant digits), I do not lose precision and accuracy does not suffer.
Accuracy matters when I am calculating the sum of prices. But the 
assumption that the accuracy of the sum calculation should depend on the type 
of the summed field
is not obvious. Maybe it is more or less clear to C programmers, but 
not to SQL users.
In every database I have tested, SUM of single precision floats is 
calculated at least in double precision (or using the numeric type).


2. There is no huge gap in performance between accumulating in float4 
and float8 - certainly nothing like "orders of magnitude":

postgres=# select sum(l_quantity) from lineitem_projection;
 sum
-
 1.07374e+09
(1 row)

Time: 4659.509 ms (00:04.660)

postgres=# select sum(l_quantity::float8) from lineitem_projection;
sum

 1529738036
(1 row)

Time: 5465.320 ms (00:05.465)


So I do not think there is actually a compromise here between 
performance and accuracy.
But the current implementation leads to many confusions and 
contradictions with user expectations:


1. sum(l_quantity) and sum(l_quantity::float8) are 
completely different (by a factor of almost 1.5!!! - we lose half a billion dollars:) - see the small illustration below.
2. avg(l_quantity)*count(l_quantity) is not equal to sum(l_quantity), but 
when casting to float8 the results match.
3. The sum of aggregates per group is not equal to the total sum (once again, no 
problem for the float8 type).
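To illustrate point 1, here is a tiny standalone C program (not Postgres code; the constant 25 stands in for a typical l_quantity value) showing why a float4 accumulator saturates while a float8 accumulator does not:

#include <stdio.h>

int main(void)
{
    float  sum4 = 0.0f;   /* what accumulating in float4 effectively does */
    double sum8 = 0.0;    /* what a float8 transition state would do */
    long   i;

    for (i = 0; i < 60000000L; i++)   /* ~60M rows, as in TPC-H scale 10 lineitem */
    {
        float q = 25.0f;
        sum4 += q;        /* stalls once the increment falls below half an ULP of the sum */
        sum8 += q;
    }

    printf("float4 accumulator: %g\n", sum4);    /* about 5.4e+08 (2^29) */
    printf("float8 accumulator: %.0f\n", sum8);  /* exactly 1500000000 */
    return 0;
}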



But when I am calculating a sum, I expect to receive a more or less precise
result. Certainly I realize that even in case of using double it is


... but now you want to trade performance for accuracy? Why would you 
expect the database to magically come to that conclusion?


See above - there is no trade-off here. Please notice that the current Postgres 
implementation of the AVG aggregate calculates both the sum and the sum of 
squares, even though the latter is not needed for AVG.

The comment in the code says:

 * It might seem attractive to optimize this by having multiple accumulator
 * functions that only calculate the sums actually needed.  But on most
 * modern machines, a couple of extra floating-point multiplies will be
 * insignificant compared to the other per-tuple overhead, so I've chosen
 * to minimize code space instead.

And it is true!
In addition to the results above, here is the timing of the AVG 
calculation:


postgres=# select avg(l_quantity) from lineitem_projection;
   avg
--
 25.5015621964919
(1 row)

postgres=# select avg(l_quantity::float8) from lineitem_projection;
   avg
--
 25.5015621964919
(1 row)

Please notice that avg for float4 is calculated using float4_accum, which uses 
a float8 accumulator and also calculates sumX2!



Time: 6103.807 ms (00:06.104)



So I do not see reasonable arguments here for using float4pl for 
sum(float4)!

And I do not know any other database which has such strange behavior.
I know that "be like others", or especially "be like Oracle", is never a good 
argument in the Postgres community, but doing something differently (and 
IMHO wrongly) without any significant reason seems very strange.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


