Re: [HACKERS] Statement-level rollback

2017-11-02 Thread Vladimir Sitnikov
Tsunakawa>So the statement-level rollback is newer to users, isn't it?

Technically speaking, the feature was listed in the changelog.

Tsunakawa>Doesn't PgJDBC execute RELEASE after each SQL statement?

It does not.

Tsunakawa>That said, even with RELEASE, the server memory bloat is not
solved.

That is what I mean.

Vladimir


Re: [HACKERS] Statement-level rollback

2017-11-02 Thread Vladimir Sitnikov
Tsunakawa> PgJDBC has supported the feature with autosave parameter only
recently

PgJDBC has had the implementation for more than a year (REL9.4.1210,
2016-09-07; see https://github.com/pgjdbc/pgjdbc/pull/477 )

Tsunakawa> The point raised in this thread was that that creates
Tsunakawa> too much network overhead, so a backend-based solution would be
preferable.
Tsunakawa> We haven't seen any numbers or other evidence to quantify that
claim, so
Tsunakawa> maybe it's worth looking into that some more

The performance overhead for a bare "SELECT" statement (no columns, just select)
over localhost is 36±4 µs vs 38±3 µs (the savepoint is pipelined
along with the user-provided query). That is, the network overhead is close to
negligible.

As far as I understand, the main problem with savepoints is that they
consume memory even when the same savepoint is reassigned again and
again.
In other words, "savepoint; insert; savepoint; insert; savepoint;
insert; savepoint; insert; savepoint; insert;" would allocate xids and might
blow up the backend's memory.
I see no way the driver can work around that, so it would be great if the backend
could release that memory or provide a way to do so.
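
For illustration, here is a minimal JDBC sketch of that pattern (the table,
URL, and credentials are made up). Every iteration reassigns the same
savepoint name, yet the backend may keep accumulating subtransaction state
until the transaction ends:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class SavepointPerStatement {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/test", "test", "test")) {
            conn.setAutoCommit(false);
            try (PreparedStatement insert =
                     conn.prepareStatement("insert into t(v) values (?)")) {
                for (int i = 0; i < 1_000_000; i++) {
                    conn.setSavepoint("s1"); // same name, reassigned every iteration
                    insert.setInt(1, i);
                    insert.executeUpdate();  // each subtransaction may allocate an xid
                }
            }
            conn.commit();
        }
    }
}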

Adding protocol messages would break pgbouncer and similar tools, so it makes
sense to refrain from new messages unless absolutely required.

Vladimir


Re: [HACKERS] 64-bit queryId?

2017-10-03 Thread Vladimir Sitnikov
>OK, so here's a patch.  Review appreciated.

Please correct typo "Write an unsigned integer field (anythign written with
UINT64_FORMAT)".  anythign ->  anything.

Vladimir


Re: [HACKERS] pgjdbc logical replication client throwing exception

2017-09-15 Thread Vladimir Sitnikov
++pgjdbc dev list.

>I am facing unusual connection breakdown problem. Here is the simple code
that I am using to read WAL file:

Does it always fail?
Can you create a test case? For instance, if you file a pull request with
the test, it will get automatically tested across various PG versions, so
it would be easier to reason about.

Have you tried "withStatusInterval(20, TimeUnit.SECONDS)" instead of 20
millis? I don't think it matter much, however 20ms seems to be an overkill.
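
A hedged sketch of the suggested change against the code quoted below
(everything else stays as in the original; replConnection is the same
PGConnection as in the quoted snippet):

PGReplicationStream stream = replConnection.getReplicationAPI()
        .replicationStream().logical()
        .withSlotName("replication_slot3")
        .withSlotOption("include-xids", true)
        .withSlotOption("include-timestamp", "on")
        .withSlotOption("skip-empty-xacts", true)
        .withStatusInterval(20, TimeUnit.SECONDS) // instead of TimeUnit.MILLISECONDS
        .start();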

Vladimir

On Fri, Sep 15, 2017 at 19:57, Dipesh Dangol wrote:

> hi,
>
> I am trying to implement logical replication stream API of postgresql.
> I am facing unusual connection breakdown problem. Here is the simple code
> that I am
> using to read WAL file:
>
> String url = "jdbc:postgresql://pcnode2:5432/benchmarksql";
> Properties props = new Properties();
> PGProperty.USER.set(props, "benchmarksql");
> PGProperty.PASSWORD.set(props, "benchmarksql");
> PGProperty.ASSUME_MIN_SERVER_VERSION.set(props, "9.4");
> PGProperty.REPLICATION.set(props, "database");
> PGProperty.PREFER_QUERY_MODE.set(props, "simple");
>
> Connection conn = DriverManager.getConnection(url, props);
> PGConnection replConnection = conn.unwrap(PGConnection.class);
>
> PGReplicationStream stream = replConnection.getReplicationAPI()
> .replicationStream().logical()
> .withSlotName("replication_slot3")
> .withSlotOption("include-xids", true)
> .withSlotOption("include-timestamp", "on")
> .withSlotOption("skip-empty-xacts", true)
> .withStatusInterval(20, TimeUnit.MILLISECONDS).start();
> while (true) {
>
> ByteBuffer msg = stream.read();
>
> if (msg == null) {
> TimeUnit.MILLISECONDS.sleep(10L);
> continue;
> }
>
> int offset = msg.arrayOffset();
> byte[] source = msg.array();
> int length = source.length - offset;
> String data = new String(source, offset, length);
> System.out.println(data);
>
> stream.setAppliedLSN(stream.getLastReceiveLSN());
> stream.setFlushedLSN(stream.getLastReceiveLSN());
>
> }
>
> Even the slightest modification in the code like commenting
> System.out.println(data);
> which is just printing the data in the console, causes connection
> breakdown problem with
> following error msg
>
> org.postgresql.util.PSQLException: Database connection failed when reading
> from copy
> at
> org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1028)
> at
> org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
> at
> org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
> at Server.main(Server.java:52)
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at
> org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:140)
> at
> org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:109)
> at
> org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:191)
> at org.postgresql.core.PGStream.receive(PGStream.java:495)
> at org.postgresql.core.PGStream.receive(PGStream.java:479)
> at
> org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1161)
> at
> org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1026)
> ... 5 more
>
> I am trying to implement some logic like filtering out the unrelated table
> after reading log.
> But due to this unusual behavior I couldn't implement properly.
> Can somebody give me some hint how to solve this problem.
>
> Thank you.
>


[HACKERS] Invitation: Re: [HACKERS] intermittent failures in Cygwin from select... - Fri, 16 June 2017 09:00 - 10:00 (MSK) (pgsql-hackers@postgresql.org)

2017-06-15 Thread Vladimir Sitnikov
invite.ics (Google Calendar invitation attachment, application/ics; content omitted)



Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Vladimir Sitnikov
Please disregard my previous message.
pgjdbc already does uppercase conversion, so I would like to see a test
case that reproduces the error.

Alternatively, could you please capture and share TRACE log? (
https://jdbc.postgresql.org/documentation/head/logging.html#configuration )

Vladimir

On Wed, Mar 8, 2017 at 1:26, Vladimir Sitnikov <sitnikov.vladi...@gmail.com> wrote:

> legrand>when using version 42.0.0 with
> legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always
>
> The pitfall there is that the value should be written in upper case, like
> autosave=ALWAYS.
>
> I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at
> some point.
>
>
> Vladimir
>


Re: [HACKERS] Statement-level rollback

2017-03-07 Thread Vladimir Sitnikov
legrand>when using version 42.0.0 with
legrand> jdbc:postgresql://localhost:5432/postgres?autosave=always

The pitfall there is that the value should be written in upper case, like
autosave=ALWAYS.

I've filed https://github.com/pgjdbc/pgjdbc/issues/769 to improve that at
some point.


Vladimir


[HACKERS] Query cancel seems to be broken in master since Oct 17

2016-10-18 Thread Vladimir Sitnikov
Hi,

In pgjdbc we have regular regression testing against "build from
master" PostgreSQL, and recent master builds fail the "statement cancel"
test.

PostgreSQL as of Mon Oct 17 00:09:39 UTC 2016 was fine; then the "statement
cancel" test started to fail.
The test executes "select pg_sleep(10)" and tries to cancel it. In recent
master builds, the cancel seems to be ignored, and the statement lasts for 10
seconds.
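
A minimal sketch of the failing scenario (the URL and credentials are
illustrative): cancel a long-running statement from another thread and expect
it to stop well before the 10 seconds are up.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class CancelTest {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/test", "test", "test");
             Statement stmt = conn.createStatement()) {
            Thread canceller = new Thread(() -> {
                try {
                    Thread.sleep(1000);
                    stmt.cancel(); // should interrupt pg_sleep(10) on the backend
                } catch (Exception ignore) {
                }
            });
            canceller.start();
            long start = System.nanoTime();
            try {
                stmt.execute("select pg_sleep(10)");
            } catch (SQLException expected) {
                // pgjdbc surfaces the cancellation as an SQLException
            }
            System.out.println("Took " + (System.nanoTime() - start) / 1_000_000 + " ms");
        }
    }
}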

Exactly the same driver and test version works fine for 8.4, 9.1, 9.2, 9.3,
9.4, 9.5, and 9.6:
Here's a sample build report:
https://travis-ci.org/pgjdbc/pgjdbc/builds/168444341

Any hints what could be the issue?
Was the breakage intentional?

Vladimir


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-08 Thread Vladimir Sitnikov
Ildar> Could you please try the patch and tell if it works for you?

I've tested patch6 against recent head. The patch applies with no problems.

The previous case (filter on top of i-o-s) is fixed. Great work.

Here are the test cases and results:
https://gist.github.com/vlsi/008e18e18b609fcaaec53d9cc210b7e2

However, it looks like there are issues when accessing non-indexed columns.
The error is "ERROR: variable not found in subplan target list".
The case is 02_case2_fails.sql (see the gist link above).

The essence of the case is "create index on substr(vc, 1, 128)",
assuming that the majority of rows have length(vc) < 128.
Under those conditions, it would be nice to do an index-only scan
and filter (like in my previous case), but detect "long" rows
and do an additional recheck for them.

Vladimir


Re: [HACKERS] Index Onlys Scan for expressions

2016-09-03 Thread Vladimir Sitnikov
Ildar>The reason why this doesn't work is that '~~' operator (which is a
Ildar>synonym for 'like') isn't supported by operator class for btree. Since
Ildar>the only operators supported by btree are <, <=, =, >=, >, you can use
Ildar>it with queries like:

Ildar>And in 3rd query 'OFFSET' statement prevents rewriter from
Ildar>transforming the query, so it is possible to use index only scan on
Ildar>subquery and then filter the result of subquery with '~~' operator.

I'm afraid I do not follow you.
Note: query 3 is 100% equivalent to query 2; however, query 3 takes 55 times
fewer reads.
It looks like either an optimizer bug or some missing feature in the
"index only scan" logic.

Here's a quote from "query 2" (note the % at both ends):  ... where type=42)
as x where upper_vc like '%ABC%';

Note: I do NOT use an "index scan" for the like operator. I'm very well aware
that LIKE patterns with a leading % cannot be optimized into a btree range scan.
What I want is "use the first indexed column for the index scan, then use the
second column
for filtering".

As shown in "query 2" vs "query 3", PostgreSQL cannot come up with such a
plan on its own
for some reason.

This is not a theoretical issue, but it is something that I use a lot with
Oracle DB (it just creates a good plan for "query 2").

Vladimir


Re: [HACKERS] Index Onlys Scan for expressions

2016-08-23 Thread Vladimir Sitnikov
Hi,

I've tried your indexonlypatch5.patch against REL9_6_BETA3.
Here are some results.

TL;DR:
1) <<select type, upper(vc) from vlsi where type=42 and upper(vc) like
'%ABC%'>> does not support index-only scan for the index (type, upper(vc)
varchar_pattern_ops).
3) <<select * from (... where type=42 offset 0) as x where upper_vc like '%ABC%'>> does
trigger index-only scan. IOS reduces the number of buffers from 977 to 17, and
that is impressive.

Can IOS be used for a simple query like #1 as well?

Here are the details.

drop table vlsi;
create table vlsi(type numeric, vc varchar(500));
insert into vlsi(type,vc) select round(x/1000),
md5(''||x)||md5(''||x+1)||md5(''||x+2) from generate_series(1,1000000) as
s(x);
create index type_vc__vlsi on vlsi(type, upper(vc) varchar_pattern_ops);
vacuum analyze vlsi;

0) Smoke test (index only scan works when selecting indexed expression):

explain (analyze, buffers) select type, upper(vc) from vlsi where type=42;

 Index Only Scan using type_vc__vlsi on vlsi  (cost=0.55..67.97 rows=971
width=36) (actual time=0.012..0.212 rows=1000 loops=1)
   Index Cond: (type = '42'::numeric)
   Heap Fetches: 0
   Buffers: shared hit=17
 Planning time: 0.112 ms
 Execution time: 0.272 ms

1) When trying to apply "like condition", index only scan does not work.
Note: "buffers hit" becomes 977 instead of 17.

explain (analyze, buffers) select type, upper(vc) from vlsi where type=42
and upper(vc) like '%ABC%';

 Index Scan using type_vc__vlsi on vlsi  (cost=0.55..1715.13 rows=20
width=36) (actual time=0.069..1.343 rows=23 loops=1)
   Index Cond: (type = '42'::numeric)
   Filter: (upper((vc)::text) ~~ '%ABC%'::text)
   Rows Removed by Filter: 977
   Buffers: shared hit=939
 Planning time: 0.104 ms
 Execution time: 1.358 ms

Mere "subquery" does not help: still no index-only scan

2) explain (analyze, buffers) select * from (select type, upper(vc)
upper_vc from vlsi where type=42) as x where upper_vc like '%ABC%';

 Index Scan using type_vc__vlsi on vlsi  (cost=0.55..1715.13 rows=20
width=36) (actual time=0.068..1.344 rows=23 loops=1)
   Index Cond: (type = '42'::numeric)
   Filter: (upper((vc)::text) ~~ '%ABC%'::text)
   Rows Removed by Filter: 977
   Buffers: shared hit=939
 Planning time: 0.114 ms
 Execution time: 1.357 ms

3) "offset 0" trick does help:
explain (analyze, buffers) select * from (select type, upper(vc) upper_vc
from vlsi where type=42 offset 0) as x where upper_vc like '%ABC%';

 Subquery Scan on x  (cost=0.55..80.11 rows=39 width=36) (actual
time=0.033..0.488 rows=23 loops=1)
   Filter: (x.upper_vc ~~ '%ABC%'::text)
   Rows Removed by Filter: 977
   Buffers: shared hit=17
   ->  Index Only Scan using type_vc__vlsi on vlsi  (cost=0.55..67.97
rows=971 width=36) (actual time=0.015..0.210 rows=1000 loops=1)
 Index Cond: (type = '42'::numeric)
 Heap Fetches: 0
 Buffers: shared hit=17
 Planning time: 0.086 ms
 Execution time: 0.503 ms

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-19 Thread Vladimir Sitnikov
Tatsuo>understanding it always uses unnamed portal even if the SQL is like
"BEGIN" or "COMMIT" (no parameters). They are too often used. Why not
doing like this?

Does it actually work?

The documentation says named portals last till the end of the transaction:

https://www.postgresql.org/docs/9.5/static/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY


doc>If successfully created, a named portal object lasts till the end of
the current transaction, unless explicitly destroyed

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Vladimir Sitnikov
Shay> your analogy breaks down. Of course L2 was invented to improve
performance,
Shay> but that doesn't mean that all caches are the same. More precisely, what I
Shay> find objectionable about your approach is not any caching - it's the
Shay> implicit or automatic preparation of statements. This practice isn't
Shay> invisible in that a) it may cause errors that wouldn't have been there
Shay> otherwise (e.g. because of DDL),

Long-lived named server-prepared statements cause problems even if
server-prepared statements are created manually by developers.

Could you please stop saying "automatic preparation causes ~DDL issues"?

Those errors are not inherent to "automatic preparation of statements".
They are just database defects that need to be cured.

Automatic savepointing is just a workaround for a current DB limitation, and
it provides users with a simplified migration path.


Please don't try to tell me that the "IDbCommand.Prepare()" documentation says
that "a prepared statement might fail for no reason just because it is prepared".


Shay> As I said above, I think this is a critical point of misunderstand between
Shay> us. The developers tells the driver which statements should be
Shay> server-prepared by calling .prepareStatement(). I'm guessing you have a
Shay> totally different understanding here.

Please quote the document where you got "developers tell the driver which
statements should be server-prepared by calling ..." from. It never
works like that, neither in Java nor in C#. I admit I have no C# experience,
but I did find the documentation on IDbCommand.Prepare() and examined it.

The proper way to put it is: "by calling .Prepare() the developer signals the
intention that
he might be using the same query multiple times".
That is it. It never means "the driver must absolutely use server-prepare
in response
to a .Prepare() call".

The same goes for Java's PreparedStatement.
It never means "the driver must use server-prepared features".

As Microsoft lists in the .Prepare() documentation, modern versions of
MSSQL just ignore .Prepare() and cache statements automatically.

It is not a developer's business which statements should be in the
database cache.
Nor should the developer care which statements reside in the driver cache.


Shay> What exactly does "server-prepare on each execution" means? Sending Parse
Shay> on each execution? How can that be considered prepared at all?

Remember, the java.sql.PreparedStatement interface is NOT bound to PostgreSQL in any
manner. It is a generic database API.
Thus the word "prepared" does not mean anything specific there.
It makes no promise about whether the statement will use PostgreSQL's
"parse once, execute many"
feature or not.

A compliant implementation (that is, a driver) could just assemble the full SQL
by concatenating the parameters on each execution and send it via a simple
'Q' (Query) message.


Shay> Does pgjdbc consider
Shay> something "prepared" without it being the 2nd option above? Note that I'm
Shay> genuinely interested in case I'm missing something.

Currently, the first 5 executions of a PreparedStatement use unnamed
statements (Parse/Bind/Execute).
Then pgjdbc assigns a name and uses just Bind/Execute.

So if a particular SQL statement is rare, it will not get its own
server-prepared name,
even though it is a "java.sql.PreparedStatement".

What pgjdbc does is pick the most used queries and enable them to be cached
at the database level.
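
A rough sketch of that behavior (assuming conn is an open pgjdbc connection
and the table/column names are made up): the application writes plain JDBC,
and the switch to a named server-prepared statement happens inside the driver
once the statement has been executed enough times.

try (PreparedStatement ps = conn.prepareStatement("select name from users where id = ?")) {
    for (int i = 1; i <= 10; i++) {
        ps.setInt(1, i);
        try (ResultSet rs = ps.executeQuery()) {
            // first executions: Parse/Bind/Execute with the unnamed statement
            // later executions: Bind/Execute against a named server-prepared statement
            while (rs.next()) {
                rs.getString(1);
            }
        }
    }
}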


Vladimir>> app/component and assign variables to CPU registers.
Vladimir>> This is exactly "programmer knowledge" which the compiler
doesn't have.
Vladimir>> Does it sound good to you?

Shay> Of course not. But I don't think it's a very valid analogy.

The analogy was not supposed to play in a way
you twisted it with ORM=Java, driver=C, etc.

Here's more detailed explanation:

1) You claim that programmers should manually examine all the SQL statements,
and put ".prepare()" call if and only if the specific SQL should be
server-prepared.

2) My analogy: programmers should manually examine all the variables
(think of C#
variables, or Java variables, or C variables, it does not matter),
and assign which variables should use CPU registers, and which ones should
go into the memory.

Named server-prepared statements == CPU registers
SQL statements in the code == variables in the code (e.g. C# variables)

That is a very valid analogy. What you say is "the programmer has full visibility over
the meaning of the code, thus knows better which statements should be
server-prepared and which should not".

Well, register allocation is a bit harder problem than "statement name
allocation",
but the essence is the same: there's a limited number of registers/named
statements,
so someone (or something) should decide which statements deserve a name.

Just in case: you definitely know what CPU registers are and what the
"register allocation" problem is, don't you?

You seem to pick up that "application developer != compiler engineer", however
then you slipped into "nevertheless 

Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Vladimir Sitnikov
Vladimir>> Yes, that is what happens.
Vladimir>> The idea is not to mess with gucs.

Shay:> Wow... That is... insane...

Someone might say that "programming languages that enable side effects
are insane".
Lots of connection pools work by sharing connections, and it is up
to the developer
to behave well (see "It is not" below).

Shay> it's entirely reasonable for
Shay> more than one app to use use the same pool

Sharing connections between different applications might not be that good an idea.
However, I would not agree that "having an out-of-process connection
pool" is the only sane
way to go.
I do admit there might be valid cases for out-of-process pooling,
however it is not the only way:
the "in-process" approach is widely used
in at least one of the top enterprise languages.

If you agree with that, you might agree that an "in-process connection
pool that serves
exactly one application" might work in a reasonable fashion even
without DISCARD ALL.

Shay> Even with in-process pools it's standard practice (and a good idea) to
Shay> reset state.

It is not. Can you say where you got the idea that "standard practice is
to reset state" from?

Oracle WebLogic application server does not reset connections.
JBoss WildFly application server does not reset connections.
HikariCP connection pool does not reset connections.

I can easily continue the list.
The above are heavily used Java servers (Hikari is a pool, to be exact).

Shay> If some part of a big complex
Shay> application modified state, and then some other part happens to get that
Shay> physical connection, it's extremely hard to make sense of things.

Let's not go into the "functional programming" vs "imperative programming"
discussion.
Of course you might argue that "programming in Haskell or OCaml or F#" makes
it "extremely easy to make sense of things", but that's another
discussion entirely.

Shay> One note - in Npgsql's implementation of persistent prepared statements,
Shay> instead of sending DISCARD ALL Npgsql sends the commands listed in
Shay> https://www.postgresql.org/docs/current/static/sql-discard.html,
except for
Shay> DEALLOCATE ALL. This cleans up state changes but leaves prepared
statements
Shay> in place.

Ok. At least you consider that "always discarding all the state" might be bad.


Shay> This is somewhat similar to the CPU reordering you
Shay> keep coming back to - it's totally invisible

I would disagree. CPU reordering is easily visible if you are dealing
with the multithreaded case.
It can easily result in application bugs if the application misses some
synchronization.

CPU reordering is very visible to regular programmers, and it is a compromise:
1) Developers let the compiler and CPU do certain "reorderings"
2) Developers agree to follow rules like "missing synchronization
might screw things up"
3) As a result, the code gets executed faster.


Vladimir> Just in case: PostgreSQL does not execute "discard all" on its own.

Shay> Of course it doesn't - it doesn't know anything about connection pooling,
Shay> it only knows about physical connections. When would it execute "discard
Shay> all" on its own?

My point there was about "pgpool aiming to look like a regular postgresql
connection".
The point was: "postgresql does not discard on its own, so pgpool
should not discard".


Shay> To enforce isolation, which is maybe the most important way for
programs to
Shay> be reliable - but this is a general theme which you don't seem to agree
Shay> with.

If you want to isolate something, you had better have a
per-application connection pool.
That way, if a particular application consumes all the connections, it
would not impact
other applications. If all the applications use the same
out-of-process pool, there might
be trouble with resource hogging.

Shay> Regardless, resetting state doesn't have to have a necessary effect
Shay> on response times/throughput.

Even if you do not reset prepared statements, the "reset query" takes time.
For instance, there's the common problem of "validating connections before use":
the pooler should ensure the connection is working before handing it
to the application.
Both the WebLogic server and HikariCP have such connection validation built in,
and the validation is enabled by default.

However, it turns out that "connection validation" takes too much time and
is visible in the application response times, so they both implement a
grace period. That is, "if the connection was recently used, it is
assumed to be fine".
WebLogic trusts 15 seconds by default, so if you borrow connections
every 10 seconds,
they will not be tested.
Well, there's additional background validation, but my main point is
that "even select 1"
is visible in the application response times.



Shay> This is something new - maybe it's part of the misunderstanding here. To
Shay> me, the term "prepared statements" always means "server-prepared
Shay> statements"; this seems to be supported by the documentation 

Re: [HACKERS] Slowness of extended protocol

2016-08-15 Thread Vladimir Sitnikov
Shay> What? I really didn't understand your point here. All the doc is saying is
Shay> that if the driver doesn't support prepared statements, then using them

Please read again. PreparedStatement is the only way to execute statements
in the JDBC API. There's no API that allows the user to specify "use
server-prepared here".
Well, there is a non-prepared API in JDBC, however it lacks "bind
variable" support,
so if bind variables are required, the developer would use PreparedStatement.

Java's PreparedStatement does not have an option to distinguish which statements
should be server-prepared and which should not.

Vladimir>> My experience shows, that people are very bad at predicting where the
Vladimir>> performance problem would be.
Vladimir>> For 80% (or even 99%) of the cases, they just do not care
thinking if a
Vladimir>> particular statement should be server-prepared or not.
Vladimir>> They have absolutely no idea how much resources it would
take and so on.
Shay> Maybe you're even right saying these things, I don't know. But
that doesn't
Shay> mean I as a driver should solve their problems for them. And I also get
Shay> that you have an additional argument here besides programmer
Shay> laziness/stupidity - the ORM argument - which makes more sense.

Suppose the backend can handle at most 20 server-prepared statements (using more
would make it run out of memory).
Suppose an application has 100 statements with a ".prepare()" call.
I think it is reasonable for the DB driver to figure out which
statements are most important
and server-prepare just the "20 most important ones", leaving the remaining
80 as regular
non-prepared statements.
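
A hedged sketch (not pgjdbc's actual implementation) of how a driver could
keep only the N most recently used server-prepared statements: an
access-ordered LRU map that evicts, and deallocates, the least recently used
statement once the illustrative limit of 20 is exceeded.

import java.util.LinkedHashMap;
import java.util.Map;

class StatementNameCache extends LinkedHashMap<String, String> {
    private static final int MAX_SERVER_PREPARED = 20; // illustrative limit from the example above

    StatementNameCache() {
        super(16, 0.75f, true); // access order, so get() refreshes recency
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        boolean evict = size() > MAX_SERVER_PREPARED;
        // if evict: issue DEALLOCATE (or a Close message) for eldest.getValue()
        return evict;
    }
}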

Do you think the DB driver should just follow the developer's advice and
server-prepare
all 100 statements, causing a backend crash?
Do you think the application developer should keep a single list of all
the statements ever
used in the application and make sure there are no more than 20 queries in it?

My main point is not "developers are stupid", but "people often guess
wrong when
it comes to performance". There are too many moving parts, so it is
hard to predict
the performance implications.

Often it is much easier to execute a series of benchmarks that
validate a certain hypothesis.
For instance, as Tatsuo says, savepoint overhead for DML is higher
than savepoint
overhead for SELECT, so I plan to have that benchmark as well.


Shay> First, there's nothing stopping an ORM from optimizing multiple inserts
Shay> into a single multivalue insert. I do admit I'm not aware of any who do
Shay> this, but it's a good idea for an optimization - I happen to maintain the
Shay> Entity Framework Core provider for Npgsql, I might take a look at this
Shay> optimization (so again thanks for the idea).

Nothing stops them, but M frameworks times N database drivers results in M*N effort
for each feature.
As you say: the application should just use the batch API, and it's the driver's
job to convert
that into a sequence of bytes suitable for the database.

Same for Npgsql: if you implement the rewrite at the Npgsql level, that would
automatically
improve all the frameworks/applications running on top of Npgsql.
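
A hedged sketch of what the application side looks like (assuming conn is an
open connection; the table and values are made up): standard JDBC batching,
which a driver-level rewrite can turn into a single multi-value INSERT without
any application change.

try (PreparedStatement ps = conn.prepareStatement(
        "insert into accounts(id, name) values (?, ?)")) {
    for (int i = 0; i < 100; i++) {
        ps.setInt(1, i);
        ps.setString(2, "user-" + i);
        ps.addBatch();
    }
    // With rewriting enabled, the driver may send
    // "insert into accounts(id, name) values (?,?),(?,?),..." instead of 100 statements.
    int[] updateCounts = ps.executeBatch();
}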


Shay> I'm going to repeat what I said
Shay> before and it would be good to get some reaction to this. Every software
Shay> component in the stack has a role, and maintaining those separations is
Shay> what keeps things simple and sane

You might be missing my comments on CPU, x86, etc.
My reaction is: almost every existing component is extremely hard to
reason about.

For instance: a CPU has a certain number of registers and a certain
amount of L1/L2/...
cache, and so on.
Do you mean each and every developer should explicitly specify which
program variable should use a register and which one should go into the L2 cache?

This is a counter-example to your "sane" "separation". The CPU is free to reorder
the instruction stream as long as the net result complies with the specification.
In the same way, the CPU is free to use the L1/L2 caches in whatever way it
thinks is best.
Note: a typical DB driver developer does not try to maintain a set of
"optimal assembly
instructions".
The driver developer relies on the compiler and the CPU to optimize the
driver's code into the best machine code.

Of course a driver might contain inline assembly, but that is not how
mainstream drivers are written.

Another example: the TCP stack. When a DB driver sends some data, the kernel is
free to reorder
packets, interleave or delay them, or even use
multiple network cards
to send a single TCP stream.
Windows 10 includes several performance improvements to the TCP stack,
and it is nowhere near
"the kernel is doing exactly what the application/driver coded".
Once again: the application/driver developer does not optimize for
specific hardware (e.g. a network card).
Developers just use a common API, and it is the kernel's job to use the best
optimizations for the particular hardware.

The same goes for the ORM-DB combo. The ORM uses the DB driver's API, and it's
the driver's job to use
the optimal command sequence for the specific 

Re: [HACKERS] Slowness of extended protocol

2016-08-13 Thread Vladimir Sitnikov
Tatsuo>Interesting. What would happen if a user changes some of GUC
parameters? Subsequent session accidentally inherits the changed GUC
parameter?

Yes, that is what happens.
The idea is not to mess with gucs.

Tatsuo>There's nothing wrong with DISCARD ALL
Tatsuo>"DISCARD ALL" is perfect for this goal.

It looks like you mean: "let's reset the connection state just in case".
I see where it might help: e.g. when providing database access to random
people who might screw a connection in every possible way.

Just in case: do you think one should reset CPU caches, OS filesystem
caches, DNS caches, bounce the application, and bounce the OS in addition
to "discard all"?
Why reset only "connection properties" when we can reset everything to its
pristine state?

Just in case: PostgreSQL does not execute "discard all" on its own.
If you mean "pgpool is exactly like reconnecting to postgresql, but faster
since the connection is already established", then OK, that might work in some
cases (e.g. when response times/throughput are not important), however why
force a "you must always start from scratch" execution model?

Developers are not that dumb, and they can understand that "changing gucs
at random is bad".

When a connection pool is dedicated to a particular application (or a set
of applications), then it makes sense to reuse the statement cache for
performance reasons.
Of course, it requires some discipline, like "don't mess with gucs", however
that is acceptable and easily understood by developers.

My application cannot afford re-parsing hot SQL statements, as that hurts
performance. It is very visible in the end-to-end performance tests, so
"discard all" is not used, and developers know not to mess with gucs.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-13 Thread Vladimir Sitnikov
Shay>To be honest, the mere idea of having an SQL parser inside my driver
makes me shiver.

Same for me.
However I cannot wait for PostgreSQL 18 that does not need client-side
parsing.

Shay>We did, you just dismissed or ignored them

Please prove me wrong, but I did provide a justified answer both to yours:
https://www.postgresql.org/message-id/CAB%3DJe-FHSwrbJiTcTDeT4J3y_%2BWvN1d%2BS%2B26aesr85swocb7EA%40mail.gmail.com
(starting with "Why the connection is reset")
and to Robert's examples:
https://www.postgresql.org/message-id/CAB%3DJe-GSAs_340dqdrJoTtP6KO6xxN067CtB6Y0ea5c8LRHC9Q%40mail.gmail.com

Shay>There's nothing your driver is doing that the application developer
can't do themselves -
Shay>so your driver isn't faster than other drivers. It's faster only when
used by lazy programmers.

I'm afraid you do not get the point.
ORMs like Hibernate, EclipseLink, etc. send regular "insert ... values" via
the batch API.
For the developer, the only way to make use of "multivalues" is to implement
either an "ORM fix", "the driver fix", or a "postgresql fix".

So the feature has very little to do with the laziness of programmers.
The application developer just does not have full control of each SQL statement
when working through an ORM.
Do you suggest "stop using ORMs"? Do you suggest fixing all the ORMs so they
use the optimal insert statement for each DB?
Do you suggest fixing postgresql?

Once again "multivalues rewrite at pgjdbc level" enables the feature
transparently for all the users. If PostgreSQL 10/11 would improve
bind/exec performance, we could even drop that rewrite at pgjdbc level and
revert to the regular flow. That would again be transparent to the
application.

Shay>are you sure there aren't "hidden" costs on the PostgreSQL side for
generating so many implicit savepoints?

Technically speaking, I reuse the same savepoint name through the bind/exec
messages.

Shay>What you're doing is optimizing developer code, with the assumption
that developers can't be trusted to code efficiently - they're going to
write bad SQL and forget to prepare their statements

Please be careful; "you are completely wrong here", heh. Well, you list
the wrong assumption. Why do you think my main assumption is "developers
can't be trusted"?

The proper assumption is: I follow the Java database API specification, and I
optimize pgjdbc for the common use case (e.g. ORM or ORM-like).

For instance, if a Java application wants to use bind values (e.g. to prevent
security issues), then the only way is to go through
java.sql.PreparedStatement.

Here's the documentation:
https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#prepareStatement-java.lang.String-

Here's a quote:
Javadoc> *Note:* This method is optimized for handling parametric SQL
statements that benefit from precompilation. If the driver supports
precompilation, the method prepareStatement will send the statement to the
database for precompilation. Some drivers may not support precompilation.
In this case, the statement may not be sent to the database until the
PreparedStatement object is executed. This has no direct effect on users;
however, it does affect which methods throw certain SQLException objects.

The most important part is "if the driver supports precompilation..."
There's no API to enable/disable precompilation at all.
So, when using Java, there is no such thing as
"statement.enableServerPrepare=true".

It is expected that the "driver" will "optimize" the handling somehow in the
best possible way.

It is the Java API specification that enables me (as a driver developer) to be
flexible and leverage database features so the end user gets the best experience.

Vladimir



Re: [HACKERS] handling unconvertible error messages

2016-08-13 Thread Vladimir Sitnikov
Tom> while giving something at least passable in the cases
that are broken today.

Would you mind adding an explicit "encoding" field to the error message?
At least it would give a clear explanation of how to parse that message without
resorting to a guessing dance.

The biggest problem is that the client has no idea how to parse backend error
messages. If implementing client_encoding properly is too hard at this
point in time, then I would rather have the "encoding" field right in the
startup error message.

That "encoding" field would enable sending properly localized messages in
the future if a "pre-connect client_encoding" were implemented somehow.

Vladimir


Re: [HACKERS] handling unconvertible error messages

2016-08-13 Thread Vladimir Sitnikov
Victor>It is not a client job to convert encodings.

Of course.

However, there is a vast number of old PG versions deployed in the wild
that send incorrectly encoded data to clients.

This indeed makes for a bad user experience, so it is worth doing 2 things:
1) Implement a proper solution in future PostgreSQL versions (e.g. include the
encoding name right in the error message).
2) Implement a workaround in current drivers, so clients get proper
error messages even when connecting to an unpatched server.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-13 Thread Vladimir Sitnikov
Shay>I don't know much about the Java world, but both pgbouncer and pgpool
(the major pools?)

In the Java world, https://github.com/brettwooldridge/HikariCP is a very good
connection pool.
Neither pgbouncer nor pgpool is required.
The architecture is:  application <=> HikariCP <=> pgjdbc <=> PostgreSQL

The idea is that HikariCP pools pgjdbc connections, and server-prepared
statements are per pgjdbc connection, so there is no reason to "discard
all" or "deallocate all" or whatever.

Shay>send DISCARD ALL by default. That is a fact, and it has nothing to do
with any bugs or issues pgbouncer may have.

That is configurable. If pgbouncer/pgpool defaults to "wrong
configuration", why should we (driver developers, backend developers) try
to accommodate that?

Shay> What? Do you mean you do implicit savepoints and autorollback too?

I mean exactly that.
On top of that, it enables opt-in psql-like ON_ERROR_ROLLBACK functionality,
so it can automatically roll back the latest statement if it failed.
For instance, that might simplify migration from other DBs that have the
same "rollback just one statement on error" semantics by default.

Shay>How does the driver decide when to do a savepoint?

By default, it sets a savepoint when there's an open
transaction and the query to be executed has been previously described.

In other words, the default mode protects the user from the "cached plan
cannot change result type" error.

Shay>Is it on every single command?

Exactly (modulo the above). If the user manually sets "autosave=always", every
command gets prepended with a savepoint and is rolled back to it on failure.

Shay>f you do a savepoint on every single command, that surely would impact
performance even without extra roundtrips...?

My voltmeter tells me that the overhead is just 3 µs for a simple "SELECT"
statement (see
https://github.com/pgjdbc/pgjdbc/pull/477#issuecomment-239579547 ).
I think it is acceptable to have it enabled by default; however, it would be
nice if the database did not require a savepoint dance to heal its "not
implemented" "cached plan cannot change result type" error.


Shay>I'm not aware of other drivers that implicitly prepare statements,
Shay >and definitely of no drivers that internally create savepoints and
Shay> roll the back without explicit user APIs

Glad to hear that.
Are you aware of other drivers that translate "insert into table(a,b,c)
values(?,?,?)" =into=> "insert into table(a,b,c)
values(?,?,?),(?,?,?),...,(?,?,?)"
statement on the fly?

That gives 2-3x performance boost (that includes client result processing
as well!) for batched inserts since "bind/exec" message processing is not
that fast. That is why I say that investing into "bind/exec performance"
makes more sense than investing into "improved execution of non-prepared
statements".

Shay>you're doing something very different - not necessarily wrong - and not
Shay>attempt to impose your ideas on everyone as if it's the only true way
Shay>to write a db driver.

1) Feel free to pick ideas you like

2) I don't say "it is the only true way". I would change my mind if someone
suggested a better solution. Everybody makes mistakes, and I have no
problem admitting "I made a mistake" and moving forward.
They say: "Don't cling to a mistake just because you spent a lot of time
making it."

However, no one has suggested a case where an application issues lots of unique
SQL statements.
The suggested alternative, "a new message for a non-prepared extended query",
might shave 5-10 µs per query for those who are too lazy to implement a query
cache. However, that is just 5-10 ms per 1000 queries. Would that be
noticeable to end users? I don't think so.

Having a query cache can easily shave 5-10+ ms off each query; that is why I
suggest driver developers implement a query cache first, and only then
ask for new performance-related messages.

3) For "performance related" issues, a business case and a voltmeter is
required to prove there's an issue.


Shay>But the second query, which should be invalidated, has already been
Shay>sent to the server (because of batching), and boom

-- Doctor, it hurts me when I do that
-- Don't do that

When doing batched SQL, some of the queries might fail with "duplicate
key" or "constraint violation". There's already an API that covers those kinds
of cases and reports which statements succeeded (if any).
In the case you described (one query in a batch somehow invalidates the
subsequent one), it would just resort to generic error handling.


Shay>When would you send this ValidatePreparedStatement?
Shay>Before each execution as a separate roundtrip?
Shay>That would kill performance.

Why do you think the performance would degrade? Once again: the current
problem is the client thinks it knows "which columns will be returned by a
particular server-prepared statement", but the table might get changed behind
the scenes (e.g. an online schema upgrade), so the error occurs. That "return
type" is already validated by the database (the time is 

Re: [HACKERS] handling unconvertible error messages

2016-08-13 Thread Vladimir Sitnikov
Victor>We don't have 190 message  catalog translations in the PostgreSQL.
Victor>So problem with encoding for messages is quite limited.

Even though the number of translations is limited, there's a problem when
trying to tell one "one-byte-encoding" from another "one-byte" one.
It would be so much better if ServerErrorMessages included encoding right
in the message itself.

For pgjdbc, I've implemented a workaround that relies on the following:
1) It knows what "FATAL" looks like in several translations, and it knows the
encodings often used for those translations. For instance, for Russian it
tries the CP1251, KOI8, and ALT encodings. It converts "ВАЖНО" (Russian for
FATAL) using those three encodings and searches for that byte sequence in the
error message. If there's a match, then the encoding is identified (see the
sketch below).
2) Unfortunately, that does not help for Japanese, as "FATAL" there is
translated as "FATAL". So I hard-coded several typical words like
"database", "user", "role" (see [1]), so if those byte sequences are
present, the message is assumed to be in Japanese. It would be great if
someone could review those, as I do not speak Japanese.
3) Then it tries different LATIN encodings.
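
A rough sketch of heuristic (1), not the actual pgjdbc code: encode a known
translated keyword in several candidate charsets and check whether that byte
sequence occurs in the raw, undeclared-encoding error message.

import java.nio.charset.Charset;

class EncodingGuesser {
    static Charset guessRussianEncoding(byte[] rawServerMessage) {
        String fatalRu = "ВАЖНО"; // Russian translation of FATAL
        for (String name : new String[] {"windows-1251", "KOI8-R", "IBM866"}) {
            Charset cs = Charset.forName(name);
            if (indexOf(rawServerMessage, fatalRu.getBytes(cs)) >= 0) {
                return cs;
            }
        }
        return null; // fall back to the Japanese keywords and LATIN encodings
    }

    private static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i <= haystack.length - needle.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}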

Here's the commit
https://github.com/pgjdbc/pgjdbc/commit/ec5fb4f5a66b6598aea1c7ab8df3126ee77d15e2

Kyotaro> Is there any source to know the compatibility for any combination
Kyotaro> of language vs encoding? Maybe we need a ground for the list.

I use "locale -a" for that.

For instance, for Japanese it prints the following on my machine (OS X
10.11.6):
locale -a | grep ja
ja_JP
ja_JP.eucJP
ja_JP.SJIS
ja_JP.UTF-8

[1]:
https://github.com/pgjdbc/pgjdbc/commit/ec5fb4f5a66b6598aea1c7ab8df3126ee77d15e2#diff-57ed15f90f50144391f1c134bf08a45cR47

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Vladimir Sitnikov
Shay>As I said, an error is going to kill the ongoing transaction, how can
this be solved without application logic?

1) At least, some well-defined error code should be created for that kind
of failure.

2) The driver can use savepoints and automatic rollback to the good "right before
failure" state in case of a known failure. Here's the implementation:
https://github.com/pgjdbc/pgjdbc/pull/477
As far as I can remember, the performance overhead is close to zero (no extra
round trips to create a savepoint).

3) The backend could somehow invalidate prepared statements and notify clients
accordingly. Note: the problem is hard in the generic case; however, it might
not be that hard if we fix well-known, often-used cases like "a column is
added". That, however, would add memory overhead to store additional maps
like "table_oid -> statement_names[]".

4) Something else. For instance, a new message flow so the frontend and backend could
re-negotiate "binary vs text formats for the new result type". Or a
"ValidatePreparedStatement" message that would just validate the statement
and avoid killing the transaction if the statement is invalid. Or whatever
else can be invented.


Shay>So the general point is that the existence of pools is problematic for
the argument "always prepare for recurring statements".

So what?
Don't use pools that issue "discard all", or configure them accordingly.
That's it.
In the Java world, no widely used pool defaults to a "discard everything"
strategy.

Please stop saying "pgbouncer" as its issue is confirmed, and pgbouncer
developers did acknowledge they would prefer to solve "prepared statement
issue" right inside pgbouncer without any cooperation from driver
developers.

Do you mean in C# world major connection pools default to "discard all"
setup? That sounds strange to me.

Vladimir



Re: [HACKERS] Slowness of extended protocol

2016-08-11 Thread Vladimir Sitnikov
 Shay:

> Prepared statements can have very visible effects apart from the speedup
> they provide (e.g. failure because of schema changes) It's not that these
> effects can't be worked around - they can be - but programmers can be
> surprised by these effects, which can cause difficult-to-diagnose issues.
>

The specific effect of "cached plan cannot change return type" can be
solved by cooperation between backend and frontend (i.e. driver) developers.
I find that solving that kind of issue is more important than investing
in a "ParseBindExecDeallocateInOneGo" message.


I hope you will forgive me if I just stop the discussion here.
I'd rather spend that time on fixing the pgbouncer issue than on
discussing whether it is pgbouncer's or postgresql's issue.

I'm sorry if I was ever impolite.

Vladimir



Re: [HACKERS] Slowness of extended protocol

2016-08-10 Thread Vladimir Sitnikov
Shay> it's important to note that query parsing and rewriting isn't an
"inevitable evil".

OK, I stand corrected. ADO.NET has a raw mode in the API. That's interesting.

Let's say "lots of heavily used languages do have their own notion of bind
placeholders".
And for the rest, it is still not that hard to prepare on the go.

Shay> As Tom said, if an application can benefit from preparing, the
developer has the responsibility (and also the best knowledge) to manage
preparation, not the driver. Magical behavior under the hood causes
surprises, hard-to-diagnose bugs etc.

Why do you use C# then?
Aren't you supposed to love machine code as the least magical thing?
Even x86 does not describe "the exact way the code should be executed".
All CPUs shuffle instructions to make things faster.

Shay>As Tom said, if an application can benefit from preparing, the
developer has the responsibility

Does the developer have the responsibility to choose between an "index scan" and
a "table seq scan"? Your "developer has the responsibility" is like
building on sand.

Even "SQL execution on PostgreSQL" is magical behavior under the hood.
Does that mean we should drop the optimizer and implement a "fully static
hint-based execution mode"? I do not buy that.

My experience shows that people are very bad at predicting where the
performance problem will be.
For 80% (or even 99%) of the cases, they just do not think about whether a
particular statement should be server-prepared or not.
They have absolutely no idea how many resources it would take, and so on.

ORMs have no notion of "this query must be server-prepared, while that
one must not be".
And so on.

It is somewhat insane to assume people would use naked SQL. Of course they
use ORMs and the like, so they just would not be able to pass that
additional parameter telling whether a particular query out of thousands should
be server-prepared or not.

Vladimir> "cached plan cannot change result type" -- PostgreSQL just fails
to execute the server-prepared statement if a table was altered.

Shay>How exactly do users cope with this in pgjdbc? Do they have some
special API to flush (i.e. deallocate) prepared statements which they're
supposed to use after a DDL?

First of all, pgjdbc does report those problems to -hackers.
Unfortunately, it is still "not implemented".
Then a workaround is made on the pgjdbc side.
Here's the relevant pgjdbc fix: https://github.com/pgjdbc/pgjdbc/pull/451

It analyzes the error, and if it finds "not_implemented
coming from RevalidateCachedQuery", then it realizes it should re-prepare.
Unfortunately, there is no dedicated error code, but at least there's a
routine name (see the sketch below).
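
A rough sketch of the kind of check described above (not the actual code in
the pull request): since there is no dedicated SQLSTATE, the workaround also
looks at the routine name reported by the backend.

import java.sql.SQLException;
import org.postgresql.util.PSQLException;
import org.postgresql.util.ServerErrorMessage;

class ReprepareCheck {
    static boolean shouldReprepare(SQLException e) {
        if (!(e instanceof PSQLException)) {
            return false;
        }
        ServerErrorMessage sem = ((PSQLException) e).getServerErrorMessage();
        return sem != null
                && "0A000".equals(sem.getSQLState())                  // feature_not_supported
                && "RevalidateCachedQuery".equals(sem.getRoutine());  // backend routine name
    }
}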

On top of that, each pgjdbc commit is tested against the HEAD PostgreSQL
revision, so if the routine gets renamed for some reason, we'll know
right away.

There will be some more parsing to cover the "deallocate all" case.

Shay>it have been listed many times - pgbouncer

Let's stop discussing pgbouncer issue here?
It has absolutely nothing to do with pgsql-hackers.
Thanks.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-10 Thread Vladimir Sitnikov
Stephen> While it may have good results in many cases, it's not accurate to
say that using prepared statements will always be faster than not.

There's no silver bullet. <-- that is accurate, but it is useless for
end-user applications.
I've never claimed that "server-prepared statements" are a silver bullet.

I just claim that "PrepareBindExecuteDeallocate" message does have
justification from performance point of view.

Stephen Frost> Dropping and recreating the prepared statement is how that
particular issue is addressed.

Stephen,

The original problem is: "extended query execution is slow".
I recommend: "let's just add a query cache".
You mention: "that does not work, since one query in a year might get slower
on the 6th execution".
I ask: "what should be done _instead_ of a query cache?"
You say: "the same query cache, but execute 5 times at most".

Does that mean you agree that "a query cache is the way to go"?


I do not follow that. Of course, I could add "yet another debug switch"
to pgjdbc so it executes server-prepared statements 5 times at most;
however, I consider this to be a PostgreSQL issue.
I do not like hard-coding "5" into the driver logic.

I do not like making "5 times maximum" the default mode.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-10 Thread Vladimir Sitnikov
Stephen>I encourage you to look through the archives

The thing is, pl/pgsql suffers from exactly the same problem.
pl/pgsql is not a typical language of choice (e.g. see the Tiobe index and the
like), so the probability of running into "prepared statement issues" was
low.

As more languages use server-prepared statements, the rate of such
issues will naturally increase.

JFYI: I did participate in those conversations, so I do not get which
particular point you are asking me to "look through" there.

Stephen Frost:

> And is the source of frequent complaints on various mailing lists along
> the lines of "why did my query suddently get slow the 6th time it was
> run?!".
>

I claim the following:
1) People run into such problems with pl/pgsql as well. pl/pgsql does
exactly the same server-prepare logic. So what? pl/pgsql is allowed a query
cache, but other languages are forbidden from having one?
2) Those problematic queries are not that frequent.
3) "suddenly gets slow the 6th time" is a PostgreSQL bug: it both fails to
estimate cardinality properly and does not provide the administrator a way
to disable the feature (generic vs specific plan).

4) Do you have a better solution? Of course the planner is not perfect. Of
course it will have issues with wrong cardinality estimates. So what?
Should we completely abandon the optimizer?
I do not think so.
A query cache has very good results for overall web page times, and
problems like the "6th execution" are not that frequent.

By the way, other common problems are:
"cached plan cannot change result type" -- PostgreSQL just fails to execute
the server-prepared statement if a table was altered.
"prepared statement does not exist" -- applications might use
"deallocate all" for some unknown reason, so the driver has to keep an eye on
that.
"set search_path" vs "prepared statement" -- the prepared statement binds
by oids, so "search_path changes" should be accompanied by "deallocate all"
or the like.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-10 Thread Vladimir Sitnikov
Robert>But that makes it the job of every driver to implement some
sort of cache, which IMHO isn't a very reasonable position

Let's wait and see what Shay decides on implementing a query cache in npgsql.
Here's the issue:  https://github.com/npgsql/npgsql/issues/1237

He might change his mind when it comes to the need for a "new
ParseBindExecuteDeallocate message".


Robert>I think you should consider the possibility that those people
know what they are talking about.

I do appreciate all the input; however, all the performance-related
complaints I've seen in this thread can trivially be solved by adding a
query cache at the driver level plus fixing pgbouncer's issue.
Unfortunately, client-side SQL parsing is an inevitable evil (see below on
'?'), so any sane driver would cache queries anyway. The ones that do not
have a query cache will perform badly anyway.

As far as I can understand, the alternative solution is "to add a
ParseBindExecDeallocateInOneGo message to the protocol". This new message
would require changes from all the drivers and all the pgbouncers. This new
message would be slower than a proper query cache, so why should we all spend
time on a half-baked solution?

Of course, I might be missing some use cases; that is why I keep asking: please
provide a close-to-production scenario that does require the new protocol
message we are talking about.
Note: examples (by Robert and Shay) like "a typical web application that
fetches a single row from the DB and prints it to the browser" were already
presented, and they are easily covered by the query cache.



To be fair, implementing a cache is a trivial thing when compared with
hard-coding binary/text formats for all the datatypes in each and every
language.
Remember, each driver has to implement its own set of procedures to
input/output values in text/binary format, and that is way harder than
implementing the cache we are talking about.

If only there were some language-independent binary transfer
format (protobuf, avro, kryo, whatever)...

Regarding query cache: each language has its own notion how bind variables
are represented in SQL text.
For instance, in Go language (as well as in Java), bind placeholders are
represented as '?' character.
Of course, PostgreSQL does not support that (it wants $1, $2, etc), so Go
driver has to parse SQL text at the driver side in order to convert it into
PostgreSQL-compatible flavor. This parser has to support comments, string
literals, etc. It is just a natural thing to have a cache there so the
driver does not repeat the same parsing logic again and again (typical
applications are known to use the same query text multiple times).
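
To make the '?' rewriting concrete, here is a minimal, hand-rolled sketch in
Java (the helper name is made up, and this is emphatically not the actual
pgjdbc or Go parser; real drivers also handle dollar quoting, block comments,
escaped quotes, casts like "?::int", and so on):

// Hypothetical helper, for illustration only: convert '?' placeholders to
// PostgreSQL-style $1, $2, ... while copying string literals and line
// comments verbatim so a '?' inside them is left alone.
final class PlaceholderConverter {
    static String toPostgresPlaceholders(String sql) {
        StringBuilder out = new StringBuilder(sql.length() + 8);
        int param = 0;
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {                         // string literal: copy as-is
                int end = sql.indexOf('\'', i + 1);
                end = end < 0 ? sql.length() - 1 : end;
                out.append(sql, i, end + 1);
                i = end;
            } else if (c == '-' && i + 1 < sql.length() && sql.charAt(i + 1) == '-') {
                int end = sql.indexOf('\n', i);      // line comment: copy as-is
                end = end < 0 ? sql.length() - 1 : end;
                out.append(sql, i, end + 1);
                i = end;
            } else if (c == '?') {
                out.append('$').append(++param);     // ? -> $1, $2, ...
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}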

Robert>When
there's a common need that affects users of many different programming
languages

You are right. No questions here.
OK, what is the need? What problem does the new message solve?

From what I see there is no performance need to introduce a
"ParseBindExecDeallocateInOneGo" message. The thread is
performance-related, so I naturally object to spending everybody's time on
implementing a useless feature.

Vladimir>Do you agree that the major part would be some hot queries, the
rest will be
Vladimir> much less frequently used ones (e.g. one time queries)?
Robert>Sure, but I don't want the application to have to know about that

The application does not need to know that. It is like branch prediction in
the CPU: the application does not need to know there is a branch predictor in
the CPU. The same goes for the query cache. The application should just keep
executing SQL in a sane manner, and the query cache will pick up the
pattern (server-prepare the hot queries).
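
As an illustration, here is a hand-wavy sketch of such a driver-level cache
(not pgjdbc's actual code; the class name, the threshold, and the cache size
are made up): the driver server-prepares a statement once the same SQL text
has been seen often enough, and evicts the least recently used entries.

import java.util.LinkedHashMap;
import java.util.Map;

final class QueryCache {
    private static final int PREPARE_THRESHOLD = 5;  // made-up threshold
    private static final int MAX_ENTRIES = 256;      // made-up cache size

    private static final class Entry {
        int uses;
        String statementName;                        // null until server-prepared
    }

    private int nameCounter;

    // Access-ordered LinkedHashMap == LRU; the eldest entry is evicted
    // (and would be Closed on the wire) once the cache is full.
    private final LinkedHashMap<String, Entry> entries =
        new LinkedHashMap<String, Entry>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Entry> eldest) {
                // a real driver would send Close(statement=...) here for a prepared entry
                return size() > MAX_ENTRIES;
            }
        };

    // Returns the server statement name to Bind against, or null to keep
    // using the unnamed statement for this SQL text.
    String statementNameFor(String sql) {
        Entry e = entries.computeIfAbsent(sql, k -> new Entry());
        if (e.statementName == null && ++e.uses >= PREPARE_THRESHOLD) {
            e.statementName = "S_" + (++nameCounter);
            // a real driver would send Parse(name=e.statementName, query=sql) here
        }
        return e.statementName;
    }
}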

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Vladimir Sitnikov
Robert Haas:

> but for some reason you can't use prepared statements, for example because
> the queries are dynamically generated and .  That case is analogous to -M
> extended, not -M prepared.  And -M extended is well-known to be SLOWER
>

I do not buy that "dynamically generated queries defeat server-prepared
usage" argument. It is just not true (see below).

Do you mean "in language X, where X != Java, it is impossible to implement a
query cache"?
That is just ridiculous.

At the end of the day, there will be a finite number of hot queries that
are important.
Here's relevant pgjdbc commit: https://github.com/pgjdbc/pgjdbc/pull/319
It works completely transparent to the application, and it does use
server-prepared statements even though application builds "brand new" sql
text every time.

It is not something theoretical, but it is something that is already
implemented and battle-tested. The application does build SQL texts based
on the names of tables and columns that are shown in the browser, and
pgjdbc uses query cache (to map user SQL to backend statement name), thus
it benefits from server-prepared statements automatically.

Not a single line change was required at the application side.
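
For example, completely ordinary JDBC code like the sketch below benefits
automatically (table and column names are made up): once the same SQL text
has been executed prepareThreshold times (5 by default in pgjdbc), the driver
switches to a named server-prepared statement behind the scenes.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

class HotQuery {
    // Plain JDBC; nothing pgjdbc-specific in the application code.
    static void printName(Connection con, int id) throws Exception {
        try (PreparedStatement ps =
                 con.prepareStatement("select name from objects where id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}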

Am I missing something?
I cannot imagine a real-life case when an application throws 10'000+ UNIQUE
SQL texts per second at the database.
Cases like "where id=1", "where id=2", "where id=3" do not count, as they
should be written with bind variables, so they represent a single SQL text
like "where id=$1".

Robert>you have to keep sending a different query text every time

Do you agree that the major part would be some hot queries, the rest will
be much less frequently used ones (e.g. one time queries)?

In OLTP applications the number of queries is high, and almost all the
queries are reused.
Server-prepared statements come to the rescue here; the "protocol
optimization" would not be noticeable.

In DWH applications the queries might be unique, however the number of
queries is much lower, thus the "protocol optimization" would be invisible
since the query planning/processing time would be much higher than the gain
from the "protocol optimization".

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Vladimir Sitnikov
Shay>But here's the more important general point. We're driver developers,
not application developers. I don't really know what performance is "just
fine" for each of my users, and what is not worth optimizing further. Users
may follow best practices, or they may not for various reasons.

Of course we cannot benchmark all the existing applications, however we
should at least try to use "close to production" benchmarks.

Let me recap: even "select 1" shows clear advantage of reusing
server-prepared statements.
My machine shows the following results for "select 1 pgbench":
simple: 38K ops/sec (~26us/op)
extended: 32K ops/sec (~31us/op)
prepared: 47K ops/sec (~21us/op)

Note: reusing server-prepared statements shaves 10us (out of 31us), while a
brand new "ParseBindExecDeallocate" message would not be able to perform
better than 26us/op (that is 5us worse than the prepared one). So it makes
much more sense investing time in "server-prepared statement reuse" at the
client side and "improving Bind/Exec performance" at the backend side.

For more complex queries the gap (prepared vs simple) would be much bigger,
since parse/validate/plan for a complex query is a much harder operation than
the one for "select 1".

Note: I do not mean "look, prepared always wins". I mean: "if your
environment does not allow reuse of prepared statements for some reason,
you lose a huge amount of time on re-parsing the queries, and it is worth
fixing that obvious issue first".

Shay>I don't see how reusing SQL text affects security in any way.

Reusing SQL text makes an application more secure, as building SQL on the fly
is prone to SQL injection issues, and it enables server-prepared statements
that improve performance considerably. It is a win-win.

Shay>a new feature in the Npgsql dev branch which allows prepared
statements to be persisted across open/close on pooled connections

Do you have some benchmark results for "reusing server-prepared statements
across open/close on pooled"? I would expect that feature to be a great win.

Once again, I'd like to focus on real-life cases, not artificial ones.

For example, the major part of my performance fixes to pgjdbc were made as
part of improving my Java application that was suffering from performance
issues when talking to PostgreSQL.
For instance, there were queries that took 20ms to plan and 0.2ms to
execute (the query was like "where id=?", but the SQL text was more involved).
Once transparent server-side statement preparation was implemented on the
pgjdbc side, it shaved off those 20ms by eliminating Parse messages on the hot
path.

In other words, it was not just "let's optimize pgjdbc". It was driven by
the need to optimize the client application, and the profiling results were
pointing at pgjdbc issues.

Shay>Again, in a world where prepared statements aren't persisted across
connections (e.g. pgbouncer)

pgbouncer does not properly support named statements, and that is
pgbouncer's issue.

Here's the issue for pgbouncer project:
https://github.com/pgbouncer/pgbouncer/issues/126#issuecomment-200900171
The response from pgbouncer team is "all the protocol bits are there, it is
just implementation from pgbouncer that is missing".

By the way: I do not use pgbouncer, thus there's not much interest for me to
invest time in fixing pgbouncer's issues.


Shay>Any scenario where you open a relatively short-lived connection and
execute something once is problematic - imagine a simple web service which
needs to insert a single record into a table.

I would assume the application does not use a random string for the table name
(and columns/aliases), thus it would result in typical SQL text reuse, and thus
it should trigger the "server-side statement prepare" logic. In other words,
that kind of application does not need the new ParseBindExecDeallocate message
we are talking about.

In other words, if an application is using "select name from objects where
id=$1" kinds of queries, the driver should be using the extended protocol
(Bind/Exec) behind the scenes if it aims for high performance.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-09 Thread Vladimir Sitnikov
Shay>There are many scenarios where connections are very short-lived (think
about webapps where a pooled connection is allocated per-request and reset
in between)

Why is the connection reset in between in the first place?
In pgjdbc we do not reset the per-connection statement cache, thus we easily
reuse named statements for pooled connections.

Shay>and the extra roundtrip that preparing entails is too much.

When a server-prepared statement gets reused, neither Parse nor Describe
is used.

Shay>There are also many scenarios where you're not necessarily going to
send the same query multiple times in a single connection lifespan, so
preparing is again out of the question.

Can you list at least one scenario of that kind, so we can code it into
pgbench (or alike) and validate "simple vs prepared" performance?


Shay>And more generally, there's no reason for a basic, non-prepared
execution to be slower than it can be.

That's too generic. If the performance for end-to-end cases is just fine,
then it is not worth optimizing further. Typical application best practice
is to reuse SQL text (from both security and performance points of view), so
in the typical applications I've seen, query text was reused, and thus it was
naturally handled by the server-prepared logic.

Let me highlight another direction: the current execution of a server-prepared
statement requires some copying of the "parse tree" (or whatever). I bet it
would be much better to invest in the removal of that copying rather than
investing in a "make one-time queries faster" thing. If we could make
"Exec" processing faster, it would immediately improve tons of applications.


Shay>Of course we can choose a different query to benchmark instead of
SELECT 1 - feel free to propose one (or several).

I've tried pgbench -M prepared, and it is way faster than pgbench -M simple.

Once again: all cases I have in mind would benefit from reusing
server-prepared statements. In other words, after some warmup the
application would use just Bind-Execute-Sync kinds of messages, and it would
completely avoid Parse/Describe ones.

If a statement is indeed a "one-time" statement, then I do not care much how
long it takes to execute.

Shay>FYI in Npgsql specifically describe isn't used to get any knowledge
about parameters - users must populate the correct parameters or query
execution fails.

I think the main reason for pgjdbc to send Describe is to get result OIDs.
pgjdbc is not "full binary", thus it has to be careful about which fields it
requests in binary format.
That indeed slows down "unknown queries", but as the query gets reused,
pgjdbc switches to server-prepared execution and eliminates Parse/Describe
overheads completely.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Vladimir Sitnikov
I'm sorry, we are discussing technical details with no real-life use case
to cover them.
I do not want to waste time for no reason. Please accept my sincere
apologies for not asking for the real-life case earlier.

Shay, can you come up with a real-life use case where that "I claim the
statement will be used only once" message would indeed improve performance?
Or, to put it another way: do you have a real-life case where the simple
protocol is faster than the extended protocol with statement reuse?

I do have a couple of Java applications, and it turns out there's a huge win
from reusing server-prepared statements.
There's a problem that "the generic plan after the 5th execution might be much
worse than a specific one", however such statements are not frequent, and I
just add hints to the SQL (limit 0, +0, CTEs, that kind of thing).

Tom Lane :

> I do not accept this idea that retroactively defining special semantics
> for certain statement names is not a protocol break.


Sir, any new SQL keyword is what you would call "retroactively defining special
semantics".
It's obvious that very few current clients use named server-prepared
statements.
Statement names are not something that is provided by the end user in a web
page, so it is not rocket science to come up with a statement name that
is both short and "never ever used in the wild".

Tom Lane :

> If it causes any
> change in what the server's response would be, then it is a protocol
> break.
>

I see no changes except "backend would report a protocol violation for the
case when special statement is used and message sequence is wrong".


> > Note: it is quite easy to invent a name that is not yet used in the wild,
> > so it is safe.
>
> Sir, that is utter nonsense.



Tom Lane :

> And even if it were true, why is it that
> this way would safely pass through existing releases of pgbouncer when
> other ways would not?  Either pgbouncer needs to understand what it's
> passing through, or it doesn't.
>

Once again: existing pgbouncer versions know how to parse
Parse/Bind/Exec/Deallocate messages, so if we bless some well-defined
statement name with the semantics that "it is forbidden to reuse that name
for multiple executions in a row", then that is completely transparent for
pgbouncer.  Pgbouncer would just think that "the application is dumb since
it reparses the same statement again and again", but it would work just
fine.

On the contrary, if a new message type is added, then pgbouncer would fail to
understand the new message.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Vladimir Sitnikov
Shay Rojansky :

> Ah, I understand the proposal better now - you're not proposing encoding a
> new message type in an old one, but rather a magic statement name in Parse
> which triggers an optimized processing path in PostgreSQL, that wouldn't go
> through the query cache etc.
>

Exactly.


> If so, isn't that what the empty statement is already supposed to do? I
> know there's already some optimizations in place around the scenario of
> empty statement name (and empty portal).
>

The problem with the empty statement name is that statements with an empty
name can be reused (for instance, for batch insert executions), so the server
side has to make a defensive copy (it cannot predict how many times the unnamed
statement will be used).

Shay Rojansky :

> Also, part of the point here is to reduce the number of protocol messages
> needed in order to send a parameterized query - not to have to do
> Parse/Describe/Bind/Execute/Sync - since part of the slowdown comes from
> that (although I'm not sure how much). Your proposal keeps the 5 messages.
>

As my benchmarks show, the notable overhead is due to the defensive copying of
the execution plan. So I would measure first, and only then claim where
the overhead is.

Some more profiling is required to tell which part is a main time consumer.
Technically speaking, I would prefer to have a more real-life looking
example instead of SELECT 1.
Do you have something in mind?
For instance, for more complex queries, "Parse/Plan" could cost much more
than we shave by adding "a special non-cached statement name" or by
reducing "5 messages into 1".

There's a side problem: the Describe message requires a full round trip, since
there are cases when the client needs to know how to encode parameters.
An additional round trip hurts much more than an additional message that
is pipelined (e.g. sent in the same TCP packet).

Shay Rojansky :

> But people seem to be suggesting that a significant part of the overhead
> comes from the fact that there are 5 messages, meaning there's no way to
> optimize this without a new message type.
>

Of course 5 messages are slower than 1 message.
However, that does not mean "there's no way to optimize without a new
message type".
Profiling can easily reveal the time-consuming parts, and then we can decide
whether there's a solution.
Note: if we improve "SELECT 1" by 10%, it does not mean we improved
statement execution by 10%. Real-life statements matter for proper
profiling/discussion.

 Shay Rojansky :

> Note: it is quite easy to invent a name that is not yet used in the wild,
>> so it is safe.
>>
>
> That's problematic, how do you know what's being used in the wild and what
> isn't? The protocol has a specification, it's very problematic to get up
> one day and to change it retroactively. But again, the empty statement
> seems to already be there for that.
>

The empty statement name has different semantics, and it is widely used.
For instance, pgjdbc uses unnamed statements a lot.
On the other hand, a statement name like "!pq@#!@#42" is rather safe to use as
a special case.
Note: statement names are not typically created by humans (a statement name
is not SQL), and very few PG clients support named statements.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Vladimir Sitnikov
Shay Rojansky :

> That's definitely a valid point. But do you think it's a strong enough
> argument to avoid ever adding new messages?
>

The point is "adding a message to the current v3 protocol is not a backward
compatible change".
The problem with adding new message types is not only client support, but
deployment issues as well: a new message would require a simultaneous upgrade
of backend, client, and pgbouncer.

It could make sense to add some kind of transparent extensibility to the
protocol, so clients can just ignore unknown message types.

For instance, consider a new message "ExtensibleMessage" is added: '#',
int16 (message id), int32 (message length), contents.
For the pgbouncer case above, pgbouncer could just proxy "unknown requests"
as is and it would cover many of the cases out of the box.
That message has a well-defined message length, so it is easy to skip.

If new "messages" are required, a new message_id can be allocated and assigned
to the new message type. Old clients would be able to skip the message as
they know its length.
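
A rough sketch of how an old client could skip such a message; the framing
below simply mirrors the hypothetical '#' message described above and does not
exist in the v3 protocol today:

import java.io.DataInputStream;
import java.io.IOException;

class ExtensibleMessageSkipper {
    static void handle(DataInputStream in) throws IOException {
        int type = in.readUnsignedByte();
        if (type == '#') {
            int messageId = in.readShort();   // which extension this is
            int length = in.readInt();        // payload length (hypothetical framing)
            byte[] payload = new byte[length];
            in.readFully(payload);
            // an old client does not know messageId, so it simply ignores the payload;
            // a newer client (or pgbouncer) could dispatch on messageId here
            return;
        }
        // ... regular v3 messages ('Z', 'T', 'D', ...) are handled as before ...
    }
}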

Of course, adding that '#' message does require support from pgbouncer,
etc, etc, however I'm sure it would simplify protocol evolution in the
future.

Of course a need might appear for a message that cannot be ignored by
the client, but I think "even a bit of flexibility" is better than no
flexibility at all.

Technically speaking, there's a "NotificationResponse" message, however it
is not that good since it cannot have 0 bytes in the contents.

Shay Rojansky :

>
>> On the other hand, usage of some well-defined statement name to trigger
>> the special case would be fine: all pgbouncer versions would pass those
>> parse/bind/exec message as if it were regular messages.
>>
>
> Can you elaborate on what that means exactly? Are you proposing to somehow
> piggyback on an existing message (e.g. Parse) but use some special
> statement name that would make PostgreSQL interpret it as a different
> message type?
>

Exactly.
For instance: if the client sends
Parse(statementName=I_swear_the_statement_will_be_used_only_once), then the
subsequent message must be a Bind message, and the one after that must be an
Execute message for exactly the same statement id.

Then the backend could recognize the pattern and perform the optimization.
Note: it is quite easy to invent a name that is not yet used in the wild,
so it is safe.
From a security point of view (e.g. if a client wanted to exploit
use-after-free kinds of issues), the backend could detect deviations from this
parse-bind-exec sequence and just drop the connection.

Shay Rojansky :

> Apart from being a pretty horrible hack,
>

I would not call it a horrible hack. It is just a clever use of existing
bits, and it breaks neither backward nor forward compatibility.
Backward compatibility: new clients would be compatible with old PG
versions (even 8.4).
Forward compatibility: even if support for that special statement name
were dropped for some reason, there would be no application issues (it
would just result in a slight performance degradation).

Shay Rojansky :

> it would still break pgbouncer, which has to actually inspect and
> understand SQL being sent to the database (e.g. to know when transactions
> start and stop).
>

Note: I do not suggest to change message formats. The message itself is
just fine and existing pgbouncer versions can inspect the SQL. The
difference is a special statementName, and I see no reasons for that kind
of change to break pgbouncer.

Vladimir


Re: [HACKERS] No longer possible to query catalogs for index capabilities?

2016-08-08 Thread Vladimir Sitnikov
Tom Lane :

> FWIW, this thread started on 25-July, less than two weeks ago.


Technically speaking, there was a pgsql-jdbc thread started on May 14:
https://www.postgresql.org/message-id/nh72v6%24582%241%40ger.gmane.org

9.6beta1 was released on May 12

The fact that it wasn't raised
> till more than 6 months after we committed the pg_am changes


This means that nobody was testing compatibility of postgresql's master
branch with existing third-party clients.
Testing against well-known clients makes sense to catch bugs early.

I've added a "build postgresql from master branch" test to pgjdbc's
regression suite a week ago, so I hope it will highlight issues early
(even before the official postgresql beta is released).

However, pgjdbc tests are executed only for pgjdbc commits, so if there are
no pgjdbc changes, then there is nothing to trigger "try newer postgres
with current pgjdbc".

Ideally, postgresql's regression suite should validate well-known clients
as well.
I've no idea how long it would take to add something to postgresql's
buildfarm, so I just went ahead and created a Travis test configuration at
https://github.com/vlsi/postgres
Do you think those Travis changes can be merged to the upstream?

I mean the following:
1) Activate TravisCI integration for https://github.com/postgres/postgres
 mirror.
2) Add a relevant Travis CI file so it checks postgresql's regression suite
and the other test suites.
3) Add a build badge to the readme (link to the Travis CI build) for simplified
navigation to test results.

Here's the commit:
https://github.com/vlsi/postgres/commit/4841f8bc00b7c6717d91f51c98979ce84b4f7df3
Here's what the test results look like: https://travis-ci.org/vlsi/postgres

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-08 Thread Vladimir Sitnikov
Shay Rojansky :

>
> That sounds right to me. As you say, the server version is sent early in
> the startup phase, before any queries are sent to the backend, so frontends
> know which server they're communicating with.
>
> We could call this "protocol 3.1" since it doesn't break backwards
> compatibility (no incompatible server-initiated message changes, but it
> does include a feature that won't be supported by servers which only
> support 3.0. This could be a sort of "semantic versioning" for the protocol
> - optional new client-initiated features are a minor version bump, others
> are a major version bump...
>

Adding a new message is not backward compatible since it will fail in
pgbouncer-style deployments.
Suppose there's a "new backend, old pgbouncer, new client" deployment.
If the client tries to send the new message, it will fail since pgbouncer
would have no idea what to do with that new message.

On the other hand, usage of some well-defined statement name to trigger the
special case would be fine: all pgbouncer versions would pass those
parse/bind/exec message as if it were regular messages.

Vladimir



Re: [HACKERS] New version numbering practices

2016-08-04 Thread Vladimir Sitnikov
Tom Lane :

> [ shrug... ]  What do you claim is not machine-readable about
> server_version?
>

0) server_version needs a dance to parse.
For instance, recent "Stamp version 10devel" patch did touch
"server_version" parsing in fe-exec.c:
https://github.com/vlsi/postgres/pull/2/files#diff-2df5cad06efe4485ad362b0eb765cec0L986
Of course it might happen that there was just a bug in fe-exec.c, however that
is a smell. Lots of clients might need to update server_version parsing
logic for no reason other than to support "10devel"-style versions.
There are cases when the dance is locale-specific:
https://github.com/pgjdbc/pgjdbc/pull/190

1) By saying "just parse server_version" you are basically forcing every
postgresql client (except libpq) to implement, test, and support its own
parsing logic. Do you care about having robust clients that will not break
when tested against 10.whatever?

2) Several examples were provided by Craig:
https://www.postgresql.org/message-id/CAMsr%2BYFt1NcjseExt_Ov%2Bfrk0Jzb0-DKqYKt8ALzVEXHBM0jKg%40mail.gmail.com
Craig> This means that at least when connecting to newer servers clients no
longer have to do any stupid dances around parsing "9.5beta1",
"9.4.0mycustompatchedPg"

3) The official documentation suggests "see also server_version_num for a
machine-readable version."
https://www.postgresql.org/docs/9.5/static/functions-info.html

Even though "one can simply try to parse the server_version value", I claim
that server_version_num is much more machine-readable than server_version.
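
As a minimal sketch of the difference (plain JDBC assumed): server_version_num
comes back as a single integer, while server_version needs ad-hoc parsing that
has to survive values like "9.4.0", "9.5beta1", "9.4.0mycustompatchedPg" or
"10devel".

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

class ServerVersion {
    static int serverVersionNum(Connection con) throws Exception {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("show server_version_num")) {
            rs.next();
            // e.g. 90500 for 9.5.0, 100000 for 10.0 -- no string dance required
            return Integer.parseInt(rs.getString(1));
        }
    }
}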

Vladimir


Re: [HACKERS] New version numbering practices

2016-08-04 Thread Vladimir Sitnikov
>
> Sorry, but I don't buy that.  I think sending both server_version and
> server_version_num would be silly, and we're certainly not going to stop
> sending server_version.
>

What is wrong with sending a machine-readable value?

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-03 Thread Vladimir Sitnikov
Tatsuo Ishii :

> Doesn't this patch break an existing behavior of unnamed statements?
> That is, an unnamed statement shall exist until next parse message
> using unnamed statement received. It is possible to use the same
> unnamed statement multiple times in a transaction.
>

>Doesn't this patch break an existing behavior of unnamed statements?

As expected, the behavior for unnamed statements is broken (some
tests from make check-world fail with a segmentation fault).
So a more sophisticated patch is required.

For those who are interested, I've created a Github-Travis mirror that
automatically runs several regression suites for the given postgresql
patch: https://github.com/vlsi/postgres
I think it will simplify running regression tests for postgresql patches
against multiple suites.

Current tests include: make check, make check-world, and pgjdbc test suite
(except XA and SSL).

For instance, here's the link to my patch
https://github.com/vlsi/postgres/pull/1
Feel free to file PRs for travis branch of https://github.com/vlsi/postgres so
the patch gets tested.

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-03 Thread Vladimir Sitnikov
Tom Lane :

> Bruce Momjian  writes:
> > On Sun, Jul 31, 2016 at 05:57:12PM -0400, Tom Lane wrote:
> >> In hindsight it seems clear that what a lot of apps want out of extended
> >> protocol is only the ability to send parameter values out-of-line
> instead
> >> of having to quote/escape them into SQL literals.  Maybe an idea for the
> >> fabled V4 protocol update is some compromise query type that corresponds
> >> precisely to PQexecParams's feature set: you can send parameter values
> >> out-of-line, and you can specify text or binary results, but there's no
> >> notion of any persistent state being created and no feedback about
> >> parameter data types.
>
> > Do you want this on the TODO list?
>
> I didn't hear anyone say it was a silly idea, so sure.
>

Frankly speaking, it is not clear what this change buys.

Are you sure v3 cannot be tuned to reach comparable performance?

I do not much like having a variety of "query modes".
For instance, when working with logical replication, extended queries are
not supported over the wire, which complicates the client.
This particular issue delays the merge of logical replication support into the
JDBC driver:
https://github.com/pgjdbc/pgjdbc/pull/550#issuecomment-236418614



Adding one more "execute flavor" would only make things worse, not
better.

Reusing parse state does indeed improve performance in real-life
applications, so I wonder whether we can make the current "extended" query
protocol faster rather than implementing yet another protocol.

So while the request itself would definitely make sense if we had no
v2/v3 protocols at all, given that we do have v2 and v3, adding the
"PQexecParams feature set" does not look that important.

Just in case, here are "protocol wanted features" as seen by client
applications (e.g. JDBC client):
https://github.com/pgjdbc/pgjdbc/blob/master/backend_protocol_v4_wanted_features.md



Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-08-02 Thread Vladimir Sitnikov
>
> I really don't get what's problematic with posting a message on a mailing
> list about a potential performance issue, to try to get people's reactions,
> without diving into profiling right away
>

"Benchmark data is a perfect substitute for benchmarking results. Data is
easy to misinterpret, so try not to do that." (see [1], and slide 73 of [2])

The key points are:
0) It is extremely easy to head in the wrong direction unless you analyze the
benchmark results.
1) If you (or someone else) thinks that "ok, the original email did meet
its goal, as Vladimir did provide a patch with measurements", then I failed.
The only reason for me doing the benchmark and patch was to teach you how
to do that.
2) Have you seen recent discussion "TODO item: Implement Boyer-Moore
searching in LIKE queries" on the list?
It does include relevant details right from the start.
(see
https://www.postgresql.org/message-id/CALkFZpcbipVJO%3DxVvNQMZ7uLUgHzBn65GdjtBHdeb47QV4XzLw%40mail.gmail.com
 )

I'm not a PostgreSQL developer
>
Neither am I.


> I have other urgent things to do
>
So do I.


> and don't even spend most of my programming time in C.
>

Java and SQL covers 99% of my time.

[1]: https://twitter.com/shipilev/status/760387758246486017
[2]: https://shipilev.net/talks/jvmls-July2014-benchmarking.pdf

Vladimir


Re: [HACKERS] Slowness of extended protocol

2016-07-31 Thread Vladimir Sitnikov
Shay Rojansky :

> I'm well aware of how the extended protocol works, but it seems odd for a
> 30% increase in processing time to be the result exclusively of processing
> 5 messages instead of just 1 - it doesn't seem like that big a deal
> (although I may be mistaken). I was imagining that there's something more
> fundamental in how the protocol or PostgreSQL state is managed internally,
> that would be responsible for the slowdown.
>

Hi, have you tried to use a profiler to identify the _cause_ of the
difference in performance?

Here's relevant read:
https://shipilev.net/blog/2015/voltmeter/#_english_version

Vladimir


Re: [HACKERS] Why we lost Uber as a user

2016-07-28 Thread Vladimir Sitnikov
>
>
> >> That's a recipe for runaway table bloat; VACUUM can't do much because
> >> there's always some minutes-old transaction hanging around (and SNAPSHOT
> >> TOO OLD doesn't really help, we're talking about minutes here), and
> >> because of all of the indexes HOT isn't effective.
>

Just curious: what if PostgreSQL supported an index that stores the primary key
(or a unique key) instead of TIDs?
Am I right that such an index would not suffer from that bloat? I'm
assuming the primary key is not updated, thus secondary indices built
that way should be much less prone to bloat when updates land on other
columns (even if the TID moves, the row's PK does not change, thus the
secondary index entry could be reused).

If that works, it could reduce index bloat and reduce the amount of WAL (fewer
indices will need to be updated). Of course it will make index scans a bit
slower, however it looks like at least Uber is fine with that extra cost of
index scans.

Does it make sense to implement that kind of index as an access method?

Vladimir


Re: [HACKERS] One process per session lack of sharing

2016-07-14 Thread Vladimir Sitnikov
Craig>That moves work further away from the DB, which has its own costs,
and isn't something you're likely to be happy with if you're looking at
things like optimising PL/PgSQL with a bytecode compiler. But it's the best
we have right now.

What if the JVM was started within a background worker?
Then the JVM could spawn several threads that serve PL requests on a
"thread per backend" basis.

Craig>You may be able to greatly reduce that cost if you can store your
cached compiled data in a shared memory segment created by your extension.
Craig>This will get a bit easier with the new dynamic shared memory
infrastructure, but it's going to be no fun at all to make that play with
the JVM. You'll probably need a lot of JNI.

There's https://github.com/jnr/jnr-ffi, which enables calling C functions
without resorting to writing JNI wrappers.

Vladimir


Re: [HACKERS] Statistics Injection

2016-07-02 Thread Vladimir Sitnikov
> The problem is that, even if I set the reltuples and relpages of my choice,
> when I run the EXPLAIN clause for a query in which the 'newTable' is involved
> (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row estimation.
> Could anyone help me with that?

There's a pg_dbms_stats extension that enables you to override/freeze the
statistics: https://github.com/ossc-db/pg_dbms_stats

Vladimir



Re: [HACKERS] NOT EXIST for PREPARE

2016-03-24 Thread Vladimir Sitnikov
Tom> Not to mention that the whole idea of that being a semantically
Tom> significant property of a name is a monstrous kluge.

You are right here.

Just in case, Marko Kreen says (see [1]) that pgbouncer has all the information
required to remap statement names, so he says pgbouncer needs no
cooperation from the backend or the app side in order to implement
prepared statements properly.

[1]: https://github.com/pgbouncer/pgbouncer/issues/126#issuecomment-200900171

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-24 Thread Vladimir Sitnikov
Tom>If you think that's not a protocol change, you are mistaken.  It
Tom>changes a behavior that's specified in the protocol documentation.

Even if it requires documentation, this particular change will work seamlessly
across existing implementations of the v3 protocol.

For instance, it would not require updating pgbouncer to support the
__ convention.
In other words, the __ convention is transparent to pgbouncer.

Consider a Prepare2 kind of message is added. Then it would require updating
virtually every piece of software that talks the v3 protocol.

That is why I say that "some kind of __ convention" does not require a protocol
version bump, while "adding a new message" does require the bump.

Just to be clear: I'm not fond of encoding the answer to the universe
into statement name.
However, I find that "name convention" a smart invention.

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-24 Thread Vladimir Sitnikov
Craig>I really, really doubt you can change this before we do a
protocol version bump.

Technically speaking, the idea of using the first bytes of the statement name
to convey extra information does not require a protocol version bump. It
can be backward and forward compatible.

For instance: if the statement name starts with __, then skip throwing an
error if a statement with exactly the same text and parameter types has
already been prepared.

By "prepare ..." below I mean the v3 prepare message, not the PREPARE SQL command.

prepare __s1(int) select $1; -> prepared
prepare __s1(int) select $1; -> prepared, no error since statement
name starts with __
prepare s1(int) select $1; -> prepared
prepare s1(int) select $1; -> error "statement s1 has already been used"

>doesn't have any kind of capabilities negotiation

Do you think capability negotiation should indeed be at the protocol level?
What's wrong with, say, "select * from backend_capabilities" at
connection setup?

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin> All I'm saying is that the use of
Merlin> server side prepared statements is extremely problematic in
Merlin> conjunction with pgbouncer

I've filed https://github.com/pgbouncer/pgbouncer/issues/126 to get
pgbouncer improved in regard to prepared statements.

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin>proposed would allow use of server side prepared statements with JDBC.

It would not. If we discuss end-to-end scenarios in detail, we would end up
with "send the full query on each execution" -> lex/gram-on-every-execution
kinds of overhead.

That is hardly a proper way of using prepared statements.
I'm not eager to merge half-a-solution to pgjdbc.
Just in case: I'm one of those who maintain pgjdbc.

Merlin>https://pgbouncer.github.io/faq.html#how-to-use-prepared-statements-with-session-pooling
pgbouncer's docs> the reset query must clean old prepared statements

I'm afraid that is just "you must clean prepared statements".
Where are the issues/suggestions from the pgbouncer team?
I'm just a year into pgjdbc, and I've attended a couple of the largest
PostgreSQL conferences
(700+ attendees), yet I know of exactly 0 suggestions on
improving the matter.
Everybody just keeps saying "discard all".

Merlin>With proposed, the application can simply prepare after
Merlin>opening the 'connection' and not have to worry about handling the
Merlin>error or scope.

Can you name at least a couple of applications that indeed
"prepare after opening the connection"?
Note: it is not something the JDBC driver can do. That "prepare after opening"
requires cooperation from the application.

I'm afraid I know of exactly 0 such applications. At least in the Java world.
Applications typically use framework-generated queries, so it would be
hard to impossible to "simply prepare after opening".
The set of used SQL queries is likely to be infinite even for a
single application.
That is sad, but true.

That is exactly the reason why I've implemented a _transparent_ statement cache
in pgjdbc. As the application uses generated queries, pgjdbc detects
query reuse and enables server-prepared queries behind the scenes.


If no one objects, I would just go ahead and file
"report ParameterStatus(pgbouncer.backend_id=...)" issue for pgbouncer.
I guess we can agree on the name and semantics of the new status message,
so it would not accidentally collide with the one of newer PG version.

Merlin>Although there was a very long standing issue where jdbc
Merlin>would try to prepare 'BEGIN' in such a a way that it could not be
Merlin>disabled -- that was fixed.

What bothers me is that the current pgjdbc CI has exactly 0 pgbouncer tests.
That means the "BEGIN is fixed" behavior can easily break and no one would
notice.
It is tracked under https://github.com/pgjdbc/pgjdbc/issues/509, so
if there's interest in pgbouncer vs pgjdbc, then #509 might be a good start.

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
Merlin>No one is arguing that that you should send it any every time (at
least -- I hope not).

Well, what is your suggestion exactly?

pgjdbc is NOT using the "prepare ..." SQL command.
I'm inclined to suppose it will not use "prepare ..." even after your fix.

Merlin>Again, not in pooling scenarios
Merlin>The problems integrating server side
Merlin>prepared statements with pgbouncer are well known.

I'm afraid they are not.

Your words are "This feature should be immediately be incorporated
by the JDBC driver", yet you have not raised that subject on the pgsql-jdbc
mailing list or in a GitHub issue. That is not very fair.

Let me just name an alternative, so you can see what "a step back" could be:
What if pgbouncer generated _fake_ ParameterStatus(backend_id=...)
messages for pgjdbc?
Then pgjdbc could learn the true backend session id and use the proper
set of prepared statements. Basically, pgjdbc could keep a prepared statement
cache per backend_id.
Well, it is not a 100% solution, however it is yet another approach to
the "pgbouncer" problem, and it would support all PostgreSQL versions.

It fits into the current frontend-backend protocol, as all clients are supposed
to handle ParameterStatus messages, etc.

Vladimir




Re: [HACKERS] NOT EXIST for PREPARE

2016-03-23 Thread Vladimir Sitnikov
2016-03-23 16:21 GMT+03:00 Merlin Moncure :
> On Wed, Mar 23, 2016 at 7:27 AM, Craig Ringer  wrote:
Craig>> With PREPARE IF NOT EXISTS the client is also paying parse and network
Craig>> overhead for every time you send that statement. Much better
not to send it
Craig>> repeatedly in the first place.
>
Merlin> How did you determine that?  The client prepares the statement exactly
Merlin> once.  The problem is there's no easy way to determine if someone else
Merlin> prepared it first and this patch directly addresses that.

With the suggested "prepare if not exists", the client would still have to send
the full query text along with the "prepare if not exists" command.
That is the "network overhead".
The DB would still have to check whether the same query with the same "query
name" has already been registered. That check would probably be cheaper than
generating an execution plan, yet it is still a non-zero overhead.

Craig>> I think we need to take a step back here and better define the problem
Craig>> before stepping in with a proposed solution. Something that
avoids the need
Craig>> to spam the server with endless copies of the same PREPARE
statements would
Craig>> be good.

+1

Merlin> A typical pattern is for the application to
Merlin> prepare them all upon startup, but currently each PREPARE needs to be
Merlin> wrapped with an exception handler in case someone else prepared it
Merlin> first.

If you plan to have "prepare if not exists" at startup only, why don't
you just wrap it with an exception handler then?

If you plan to always issue "prepare if not exists", then you will
have to send the full query text
for each prepare => overhead. Those repeated query texts are the
"endless copies of the same PREPARE statements" Craig is talking about.

Merlin>The client prepares the statement exactly
Merlin>once.  The problem is there's no easy way to determine if someone else
Merlin>prepared it first

Merlin, if by "client" you somehow mean JDBC (e.g. pgjdbc), then it
does track which connections
have which queries prepared.
So the thing we/you need might not be backend support for "prepare if
not exists", but some kind of
bouncer-vs-JDBC integration, so JDBC knows which connection it is
using at each point in time.

Vladimir




Re: [HACKERS] Proposal: RETURNING primary_key()

2016-03-11 Thread Vladimir Sitnikov
Igal, thanks for the analysis.

Craig>Part of the question for Pg is what exactly we should and should not
be returning.

I think the following might be a good starting point: return a set of columns
that would identify the inserted row(s).
E.g. the columns of any primary key would do. The columns of any unique key
would do as well.
"returning *" would do as well, however it would return excessive columns,
thus it would be less efficient.

I do not think it makes sense to tie "generated keys" to sequences or
things like that.

For example:
1) Consider Pg returns column_name==ABC,  value==42.  That would mean
client could locate exactly that row via "where ABC=42"
2) Same for multi-column keys: Pg just returns (col1, col2) == (42, 146).
Then the client would be able to locate the row via "where col1=42 and col2=146".
3) If multiple unique keys are present, it is fine if Pg returns one or the
other depending on the phase of the moon. Yet a more compact key would be
preferable to save on bandwidth.

Does that make sense?

I think naming the resulting column(s) something like "generated_key" /
"generated_keys" does not make much sense, especially for multi-column
keys.

If ctid were update-proof, it could do. Unfortunately, ctid might
easily get changed.


Theoretical end-to-end (it is the only use of "generated keys" I can
imagine at the moment):
1) Client issues an insert statement, asking "generated keys"
2) Pg inserts the row, and returns a result set with the primary key (or unique
key) columns.
3) Client stores it somewhere. For instance, passes that to UI.
4) As UI wants to update the row, client just uses those keys to identify
the row to update.
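
From the JDBC side, that end-to-end flow already looks like the sketch below
(table and column names are made up; pgjdbc implements getGeneratedKeys by
appending a RETURNING clause to the insert):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

class InsertReturningKeys {
    static long insertRow(Connection con, String name) throws Exception {
        try (PreparedStatement ps = con.prepareStatement(
                "insert into items(name) values (?)",
                Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, name);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong("id");   // assumed primary key column
            }
        }
    }
}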

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray"
kind of API. ORMs should know the column names of the primary keys => ORMs
should use "return specific column names" API, not just "return something
generated".
Vladimir


Re: [HACKERS] Set search_path + server-prepared statements = cached plan must not change result type

2016-01-28 Thread Vladimir Sitnikov
Robert>Hmm, so in your example, you actually want replanning to be able to
Robert>change the cached plan's result type?

I want the backend to cache _several_ plans behind a single "statement name".
I want to treat the "prepare...exec...deallocate" dance as an optimization
step for a simple "exec...exec...exec" sequence.
I do not want to care whether "the previously prepared query is still valid or
not". For instance, I do not want to check whether search_path is still the
same.

The current backend implementation does not report changes to
search_path, thus clients have no solid way to detect search_path
changes.

David>Maybe call the new command "PARSE name AS query".

From a JDBC perspective, there is no need for a "prepare vs parse" distinction:
1) Explicit "prepare...execute" is not used in typical application code
2) That means, in 99.9% of cases, "prepare" would be used by the JDBC driver itself
3) Thus just a single "protocol command" is sufficient.

What I am saying is there are lots of consumers that want to avoid
parsing overhead: plpgsql, pgjdbc, pgjdbc-ng, postgresql-async,
8kdata/phoebe, etc, etc.

All of them will have to deal with the search_path vs prepare issue.
If you suggest deprecating "prepare" in favor of "parse", then all of
the above clients would have to switch to that "parse".
That does not look like a good solution, since lots of existing clients
assume "prepare just works".

If the "prepare" command gets deprecated, why would "parse" be better?
What would be the use of "prepare" if all the clients had to
use "parse" in order to be search_path-compatible?

Vladimir




Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Vladimir Sitnikov
Why do you want that at the database level?
Do you have an end-to-end scenario that benefits from using Lucee?

>I was wondering how difficult it would be to implement a Postgres extension 
>that will act as a wrapper around it and will allow to write functions in that 
>language?

Have you checked PL/Java?

Vladimir




Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Vladimir Sitnikov
> If the pl/v8 was easily ported to Windows then I probably wouldn't even try 
> to add Lucee,

That is a good question. ChakraCore has been open sourced recently. It
might be easier to build under Windows.

>That seems like a good place to start, thanks
I am not sure you would be able to bind a high-performance Java runtime
to the backend. There are not that many JREs, and not many of them
are good at "in-backend" operation.

Thus your question boils down to 2 possibilities:
1) You execute Lucee in some JRE that runs in the backend (frankly
speaking, I doubt it is a good way to go)
2) You implement a Lucee parser/executor/compiler in C and use it as a
typical PostgreSQL extension

Vladimir




Re: [HACKERS] Set search_path + server-prepared statements = cached plan must not change result type

2016-01-25 Thread Vladimir Sitnikov
I want to treat the 'prepare' operation as an optimization step, so it is
functionally equivalent to sending the query text.

In other words, I would like the backend to track search_path and other
parameters transparently if necessary, creating (caching) different execution
plans if different plans are required.

Does that make sense?

Vladimir





Re: [HACKERS] Set search_path + server-prepared statements = cached plan must not change result type

2016-01-21 Thread Vladimir Sitnikov
Robert>Are you really seeing the same behavior in all versions?

I do not have "pre 9.1" at hand, however all 9.1, 9.2, 9.3, 9.4, and
9.5 are affected.

9.1 just silently executes "old statement" as if search_path was not
modified at all.
9.2, 9.3, 9.4, and 9.5 all fail with "cached plan must not change
result type" error.

See java-based test in [1], and build logs for 9.1-9.4 in [2]

I do not have "brand new 9.5", however I think 9.5rc1 is good enough:
"PostgreSQL 9.5rc1 on x86_64-apple-darwin15.2.0, compiled by Apple
LLVM version 7.0.0 (clang-700.1.76), 64-bit"

Here's my test case:

select version();

create schema customer1;
create table customer1.test(i int4);

create schema customer2;
create table customer2.test(i varchar);

set search_path to customer1,public;
prepare stmt as select * from test;

set search_path to customer2,public;

execute stmt;

--ERROR: cached plan must not change result type
--SQL state: 0A000

[1]: 
https://github.com/pgjdbc/pgjdbc/commit/8fcd07a24666de308419d54e49e2f65f40661e2a#diff-526a72847ed4c9f31f699515d06e508bR188
[2]: https://travis-ci.org/pgjdbc/pgjdbc/builds/103940843

Vladimir




Re: [HACKERS] Set search_path + server-prepared statements = cached plan must not change result type

2016-01-20 Thread Vladimir Sitnikov
>  I believe, and the conclusion was that
>if you think you need this, you're doing it wrong

So what is the recommended approach for using server-prepared statements
at the client side (I mean at the JDBC driver side)?

Currently "prepare, switch search_path, execute" leads to a "cached plan
must not change result type" error.
Can one expect the issue to be fixed in subsequent 8.4, 8.5, ..., 9.5 versions?

Vladimir




[HACKERS] Set search_path + server-prepared statements = cached plan must not change result type

2016-01-20 Thread Vladimir Sitnikov
Hi,

There's a not-so-well-known issue with "set search_path" and
"server-prepared statement" usage.
In short, it does not always work. See more details in [1].

The issue has reproduced once again recently (see [2], which explains
that search_path is modified for a multi-tenant setup).

Even though I wish that kind of detail were handled at the backend
level (see Tom's example in [1] where "create table" should invalidate
the statement), we need to do something about it with existing PG
versions. Otherwise it is another "never modify search_path" or "never
use server-prepared" catch-22.

Here's my question: why does a change in search_path NOT generate a
ParameterStatus message from the backend?

I thought I could capture ParameterStatus events and use a
per-search_path cache at the JDBC level. However, that does not seem to
work.

Here's what I get with 9.5rc1:

simple execute, maxRows=0, fetchSize=0, flags=17
FE=> Parse(stmt=null,query="SET search_path TO "$user",public,schema2",oids={})
FE=> Bind(stmt=null,portal=null)
FE=> Describe(portal=null)
FE=> Execute(portal=null,limit=0)
FE=> Sync
<=BE ParseComplete [null]
<=BE BindComplete [unnamed]
<=BE NoData
<=BE CommandStatus(SET)
<=BE ReadyForQuery(I)

Am I missing something?

[1]: http://www.postgresql.org/message-id/22921.1358876...@sss.pgh.pa.us
[2]: https://github.com/pgjdbc/pgjdbc/issues/496

Vladimir Sitnikov




[HACKERS] Insert values() per-statement overhead

2016-01-15 Thread Vladimir Sitnikov
Hi,

There is a finding that insert(x) values(y);insert(x) values(z);  is
2-4 times slower than insert(..) values(y),(z);
see [1], [2].

In other words, there is a significant per-statement overhead even
though server-prepared statements are properly used.
The issue is reproducible in 9.5rc1.

Is it something that was discussed previously? (I was not able to find
that in the archives.)
Is it something that can be improved? (e.g. "insert node state"
caching across execute calls, improving the performance of "INSERT %u %u"
generation, etc.)

Even though I understand there will always be _some_ per-statement
overhead, such a high overhead plays against the common case of using
ORMs.
End users are just stuck with insert(...) values(...);

1) Java's standard way of batching statements is
"PreparedStatement#addBatch()" (a minimal example is sketched after this
list). Unfortunately, the backend protocol does not support statement batching.
One does not simply teach Hibernate/EclipseLink, etc., to use
PostgreSQL-specific COPY.
Note: I am not talking about network roundtrips here. I'm just
highlighting that there is no way to execute a "bind bind bind
executebatch" sequence at the protocol level.

2) One might consider a "transparent rewrite of insert() batches into a
single insert() values(),(),() statement" at the JDBC driver level, but it
is hard to get right as there is no easy way to parse a query. Is it
really expected that every PostgreSQL connector would implement an SQL
parser & insert rewriter?

3) Transparent rewrites (including "rewrite inserts to COPY") would
fail to provide the "number of modified rows" for each row. Error
semantics are different as well.

4) COPY does not support UPSERT, does it?
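
To make point 1 concrete, here is the standard JDBC batching API (using the
batch_perf_test table from the measurements later in this thread); today the
driver has to map each addBatch() onto its own Bind + Execute pair, followed
by a single Sync:

import java.sql.Connection;
import java.sql.PreparedStatement;

class BatchInsert {
    static void insertRows(Connection con, int rows) throws Exception {
        try (PreparedStatement ps = con.prepareStatement(
                "insert into batch_perf_test(a, b, c) values (?, ?, ?)")) {
            for (int i = 0; i < rows; i++) {
                ps.setInt(1, i);
                ps.setString(2, "s" + i);
                ps.setInt(3, i);
                ps.addBatch();               // queued client-side
            }
            ps.executeBatch();               // one round trip: many Bind/Execute pairs + one Sync
        }
    }
}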

My profiler (Instruments in Mac OS) shows that significant time is
spent in standard_ExecutorStart: see [3]
In fact, the time spent in standard_ExecutorStart even exceeds the
time spent in standard_ExecutorRun.

[1]: http://www.postgresql.org/message-id/55130dc8.2070...@redhat.com
[2]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171780054
[3]: https://github.com/pgjdbc/pgjdbc/pull/491#issuecomment-171908974

Vladimir Sitnikov




Re: [HACKERS] Insert values() per-statement overhead

2016-01-15 Thread Vladimir Sitnikov
>I guess you mean there's a transaction surrounding it?

Sure there is a transaction.
I measure the latency from the first Bind message to the ReadyForQuery response.
The database is at localhost.

The flow is as follows (I've used 4 queries in the batch for brevity;
the test above is executed with 1024 statements in a single
batch):

create table batch_perf_test(a int4, b varchar(100), c int4)

insert into batch_perf_test(a, b, c) values($1, $2, $3)

Typical JDBC batch look like the following:

13:53:17.815 (1) batch execute 4 queries,
handler=org.postgresql.jdbc.BatchResultHandler@38d611f4, maxRows=0,
fetchSize=0, flags=532
13:53:17.816 (1)  FE=> Bind(stmt=S_1,portal=null,$1=<0>,$2=<'s0'>,$3=<0>)
13:53:17.816 (1)  FE=> Execute(portal=null,limit=1)
13:53:17.816 (1)  FE=> Bind(stmt=S_1,portal=null,$1=<1>,$2=<'s1'>,$3=<1>)
13:53:17.816 (1)  FE=> Execute(portal=null,limit=1)
13:53:17.816 (1)  FE=> Bind(stmt=S_1,portal=null,$1=<2>,$2=<'s2'>,$3=<2>)
13:53:17.816 (1)  FE=> Execute(portal=null,limit=1)
13:53:17.816 (1)  FE=> Bind(stmt=S_1,portal=null,$1=<3>,$2=<'s3'>,$3=<3>)
13:53:17.816 (1)  FE=> Execute(portal=null,limit=1)
13:53:17.816 (1)  FE=> Sync
13:53:17.817 (1)  <=BE BindComplete [unnamed]
13:53:17.817 (1)  <=BE CommandStatus(INSERT 0 1)
13:53:17.817 (1)  <=BE BindComplete [unnamed]
13:53:17.817 (1)  <=BE CommandStatus(INSERT 0 1)
13:53:17.817 (1)  <=BE BindComplete [unnamed]
13:53:17.817 (1)  <=BE CommandStatus(INSERT 0 1)
13:53:17.817 (1)  <=BE BindComplete [unnamed]
13:53:17.817 (1)  <=BE CommandStatus(INSERT 0 1)
13:53:17.817 (1)  <=BE ReadyForQuery(I)


"Rewritten" batch looks like the following (inserting pairs gives 1.5
times improvement when testing 1024 row inserts):

insert into batch_perf_test(a, b, c) values($1, $2, $3), ($4, $5, $6)

13:53:41.048 (1) batch execute 2 queries,
handler=org.postgresql.jdbc.BatchResultHandler@38d611f4, maxRows=0,
fetchSize=0, flags=532
13:53:41.048 (1)  FE=>
Bind(stmt=S_1,portal=null,$1=<0>,$2=<'s0'>,$3=<0>,$4=<1>,$5=<'s1'>,$6=<1>)
13:53:41.049 (1)  FE=> Execute(portal=null,limit=1)
13:53:41.049 (1)  FE=>
Bind(stmt=S_1,portal=null,$1=<2>,$2=<'s2'>,$3=<2>,$4=<3>,$5=<'s3'>,$6=<3>)
13:53:41.049 (1)  FE=> Execute(portal=null,limit=1)
13:53:41.049 (1)  FE=> Sync
13:53:41.049 (1)  <=BE BindComplete [unnamed]
13:53:41.049 (1)  <=BE CommandStatus(INSERT 0 2)
13:53:41.049 (1)  <=BE BindComplete [unnamed]
13:53:41.049 (1)  <=BE CommandStatus(INSERT 0 2)
13:53:41.049 (1)  <=BE ReadyForQuery(I)


Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>the implementation is simply - but it hard to design some really general - it 
>is task for UI

Can you please rephrase?

The current design is "if the cost of a generic plan is less than that
of a custom plan plus replanning, prefer the generic one".
I think that is wrong.

The "generic plan" misestimates the cost in the sense that it assumes
pre-defined selectivities.
In other words, if "skewed" values are used, the "custom plan" is
likely to have a *worse cost* than the generic plan, yet the custom
plan is much more suitable for that particular parameter set.
Since the backend refers to boundParams, it does see that the particular
condition is tough, while the generic estimator just computes a default cost.

Looking into the plancache.c comments, I see 3 possible plans:
1) a custom plan with PARAM_FLAG_CONST=1. It should probably
constant-fold based on the input parameters.

2) a custom plan with PARAM_FLAG_CONST=0. I think it should just use
the given parameters for selectivity estimations. The generated plan
should still be valid for use with other input values.
3) a generic plan. The plan with all variables. <-- this is the current behavior

#1 has a replan cost.

#2 and #3 can be cached and reused.

Is that correct?
I think #2 is a better option than #3, since it gives better plan
stability and is thus much easier to test and reason about.

This all boils down to adjustment in a single line:
https://github.com/postgres/postgres/blob/ee943004466418595363d567f18c053bae407792/src/backend/utils/cache/plancache.c#L1151-L1152

Does that make sense?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Note: I do not suggest changing already cached plans yet.
I suggest looking at the "6th execution's bind values" when building the cached plan.
In other words, "if the first 5 executions do not reveal a dependence on the bind
values, then cache the generated plan".

>Say you already have a plan which looks like this:
>Now the plan gets invoked with  $1 = 5.  What exactly in your mind would 
>happen here?

A sequential scan with $1=5 condition. What else could be there?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> The custom plan is *more expensive*;

You compare costs of custom vs generic plans.
I suggest: do not compare costs *at all*.

>I don't know, it's your proposal :-)  But it looks like I misunderstood.

It is not.

My suggestion is: build a generic plan (that is, a plan that will
return the proper result for every possible bind value), yet refer to the
values of the 6th execution's binds when estimating cardinalities.
Is it clear now?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
Please, read my suggestion again.

TL;DR: I suggest creating the "generic plan" with regard to the current bind values.
What's wrong with that approach?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>so you don't get to (or want to) have any control over the underlying prepared 
>statement.

That is pl/pgsql's problem, isn't it?
In the meantime, the user can use different query texts (e.g. by adding
"offset 0", "offset 0*1", "offset 0*2", etc. -- the kind of thing typically used
to tune queries) to convince pl/pgsql to use different statement ids.
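
A minimal sketch of the trick (with a hypothetical table foo; the exact
query text does not matter, only that the texts differ):

-- the two statements below return the same rows, yet each text gets its
-- own statement id and therefore its own cached plan
select * from foo where skewed = $1;
select * from foo where skewed = $1 offset 0;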

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>A value of -1 could disable generic plans

I do not like the idea.

I've seen dramatic performance improvements from using cached plans.
The numbers are like "20 ms to plan vs 1 ms to execute" for an often-used
OLTP query. The query text is involved (~5-10 KiB).

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>(1) the generic plan is falsely optimistic

That is my case.
The application is sending the most common value on every execution, while
the backend is optimistic and thinks that the app will stop sending
MCVs.

The costs for the plans are OK. However, there is data skew, so it is
hard to tell what the "true" selectivity of the skewed column is in
general; hence the discussion.

VS>>In other words, application developer should understand
VS>> if a query is DWH-like (requires replans) or OLTP-like (does not
VS>> require replans). Agreed?
Tom>No, not agreed.  As was already pointed out upthread, such information
Tom>is not available in many use-cases for the plancache.

I think you answered the wrong question.
I was asking whether you agree that the _application_ developer (not the PG
backend developer) should know whether a query is OLTP-like or DWH-like.

Do you really think an app developer should not care which plan is
chosen for a particular query he is working on?
Why all that "explain" stuff in the documentation then?

In plancache.c you have the CURSOR_OPT_GENERIC_PLAN and
CURSOR_OPT_CUSTOM_PLAN flags.
It is obvious that those flags are not yet exposed/used by
applications, but my message is that "one should *not* expect the DB
to have the artificial intelligence to properly identify a plan for each
bind set and cache plans at the same time".

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>If plan is ok for one value parameters, then can be pretty bad for following 
>parameters.

Happy statements are all alike; every unhappy statement is unhappy in
its own way (see [1]).
If the user is sending different kinds of parameters, he is shooting himself in the foot.

>Albe's proposal can be good enough for 2/3 cases and it doesn't block any 
>other enhancing

Albe's proposal effectively disables the plan cache, thus it blocks further
enhancements. If a user goes the "replan every time" route, there is no way to
introduce plan caching there.

[1]: https://en.wikipedia.org/wiki/Anna_Karenina_principle

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> for one custom plans can be much better than the generic plan, independent of 
> cardinalities

So what? I do not suggest dropping custom plans entirely.
I perfectly understand that there are cases where it is better to replan every time.

> consider e.g a table with one somewhat common and otherwise just unique 
> values.

So what?
If I understand you properly, you mean: "if the client sends unique binds
for the first 5-6 executions and bad non-unique ones afterwards, then the
cached plan would be bad". Is that what you are saying?
I agree that is the corner case for my suggestion.
Does it really happen often?

I state the following:
1) It is way easier to debug & analyze.
For instance, the current documentation does *not* list a way to get a
*generic plan*.
Is it obvious that you need to "EXPLAIN ANALYZE EXECUTE *6 times in a
row*" just to get a generic plan? (See the sketch after this list.)

2) It is likely to be more performant. We just need to explain to users
that "if different plans are required, just use different statements".
Isn't that obvious?
Frankly speaking, I do not like "plug" kind of code that just
sends bind values and expects a magically optimized plan for each bind
combination.

3) What about "client sends top most common value 5 times in a row"?
Why assume "it will stop doing that"?
I think the better assumption is "it will continue doing that".
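
A minimal sketch of point 1) (hypothetical table t with a column c): the
generic plan only becomes visible on the 6th EXPLAIN ANALYZE EXECUTE, and
only if its estimated cost does not exceed the average cost of the custom plans:

prepare q(int4) as select count(*) from t where c = $1;
explain analyze execute q(42);  -- executions 1..5: a custom plan built for the value 42
explain analyze execute q(42);
explain analyze execute q(42);
explain analyze execute q(42);
explain analyze execute q(42);
explain analyze execute q(42);  -- 6th execution: the cached generic plan ($1 stays a parameter)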

In the end, if a client wants specific treatment of a query, then
he/she might be better off using separate server-prepared statements (one
for "unique values", and another one for "non-unique" ones).

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>This completely screws over PL/PgSQL, among other things.

Can you elaborate a bit?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
>Basically you're arguing to fix one specific edge case which bugs you
>personally, by creating a lot of others, which don't bug you. Not
>convincing.

It bugs me.
It bugs clients of pgjdbc (e.g. Thomas who started the thread).

Note: support of prepared statements for Java applications has just
landed. The release dates are 2015-08-27 for pgjdbc, and 2014-02-24
for pgjdbc-ng.
I think the current report is just the tip of the iceberg.

> by creating a lot of others, which don't bug you

I think it will not create "lots of others".
Do you have any statistics on why this particular flavour of generic plan was
committed in 9.2?

Suppose there are two type of binds: "non_unique" (N) and "unique" (U)
that require different plans for perfect response times.

I see the following sequences:
UUUU -- all clear; all the approaches would converge to the plan for
"unique values".

NNNN -- the query for a non-unique value is executed again and again.
  A perfect optimizer would either replan or reuse a plan with regard to the "MCV".
  The current behaviour switches to the "optimistic" plan at the 6th
iteration. This is the case from this thread.
  My suggestion is to learn that "an MCV is used -> use the plan optimized for the MCV".

^^^ note that the above are "recommended" uses of the database: each
statement is used for its own purpose, one for MCVs, another for "good
values".

Then there are the cases of mixed executions.
Note: I maintain that mixing "kinds" of bind values is bad application
design anyway. In other words, the application developer should understand
whether a query is DWH-like (requires replans) or OLTP-like (does not
require replans). Agreed?

NUUU
  Current behavior is optimized for exactly this pattern.
  Well, why was it chosen over "UNNN"?

In other words, a pattern like UNNN would "create a lot of
others", as you say.

NUNUNUNUNUN -- a perfect optimizer would replan every time (or keep two
sets of plans, but let's leave that out)
  Neither my suggestion nor the current behaviour properly covers this case.


I suggest sacrificing the "NUUU" pattern in order to improve "NNNN".

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-13 Thread Vladimir Sitnikov
> Generally using a very small sample

That is another issue. Inventing some other algorithm to replace the
current "cache after 5 executions" one is a separate effort.

However, I suggest "learning" from what the client is sending.
You suggest completely ignoring that and just preparing for the case where
he/she will send "a random value".
Why expect the client to stop sending MCVs if we have already seen them
during the previous 5 executions?

> That'd not change with the change you propose.

It will.
In my suggestion, the first "explain analyze execute" will match the
"finally cached plan" provided the plan is not treated in a special
way (e.g. replan every time, etc).

> That a prepared statement suddenly performs way differently
>depending on which the first bind values are is not, in any way, easier
>to debug.

It is way easier to debug, since *the first* execution plan you get out
of "explain" *matches* the one that will finally be used.
Lots of developers are just not aware of the "5 replans by the backend".
Lots of the remaining ones confuse it with the "5 non-server-prepared
executions by the pgjdbc driver".

Put another way: in order to identify the root cause of a slow query, you
find the bind values. Then you run explain analyze and you find a shiny
fast plan.
Does that immediately ring a bell that you need to execute it 6 times
to ensure the plan would still be good?
Well, try being someone not as smart as you are when answering
this question.

2) With my suggestion, "the first execution would likely match the plan".

VS>> 3) What about "client sends top most common value 5 times in a row"?
VS>> Why assume "it will stop doing that"?
AF>If 20% of your values are nonunique and the rest is unique you'll get
AF>*drastically* different plans, each performing badly for the other case;
AF>with the unique cardinality plan being extremly bad.

Can you elaborate a bit? I can hardly follow that.



Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
Here's the simplified testcase:
https://gist.github.com/vlsi/df08cbef370b2e86a5c1

It reproduces the problem in both 9.4.4 and 9.5rc1.
It is reproducible via both psql and pgjdbc.

I use a single table; however, my production case includes a join of
two tables, and the query is like
select ... from foo, bar where foo.skewed=? and bar.non_skewed=? and
foo.bar_id=bar.id

Note: my application _always_ sends *the same* *bad* value for the skewed
column (it is effectively used as a filtering column in this particular
query).
Unfortunately, on the 6th execution the backend switches to the plan that uses
the skewed index access.

Is this something that can be fixed/improved?

Good plan (the first 5 executions):
Index Scan using non_skewed__flipper on plan_flipper
(cost=0.43..42.77 rows=10 width=113) (actual time=0.030..0.072 rows=10
loops=1)
  Index Cond: (non_skewed = 42)
  Filter: (skewed = 0)
  Rows Removed by Filter: 10
  Buffers: shared hit=20 read=3
Execution time: 0.094 ms

Bad plan (all the subsequent executions):
Index Scan using skewed__flipper on plan_flipper  (cost=0.43..6.77
rows=1 width=113) (actual time=0.067..355.867 rows=10 loops=1)
  Index Cond: (skewed = $1)
  Filter: (non_skewed = $2)
  Rows Removed by Filter: 90
  Buffers: shared hit=18182 read=2735
Execution time: 355.901 ms
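

For reference, a rough sketch of the schema behind the gist (the exact data
distribution lives in the attached plan_flipper.sql; the numbers below are my
own guesses, only the table/index names match the plans above):

create table plan_flipper as
  select case when i % 2 = 0 then 0 else i end as skewed,      -- 0 is the most common value
         i / 10                                as non_skewed,  -- ~10 rows per value
         repeat('x', 100)                      as padding
    from generate_series(1, 1000000) as s(i);
create index skewed__flipper on plan_flipper(skewed);
create index non_skewed__flipper on plan_flipper(non_skewed);
vacuum analyze plan_flipper;

prepare flipper(int4, int4) as
  select * from plan_flipper where skewed = $1 and non_skewed = $2;

-- executions 1-5 use the good custom plan (non_skewed__flipper);
-- from the 6th execution on, the cached generic plan uses skewed__flipper
explain analyze execute flipper(0, 42);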


Vladimir


plan_flipper.sql
Description: Binary data



Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
> I don't know if there is a good solution except disabling server prepared 
> statements.

Why doesn't the backend reuse the already existing good plan?
The plan does account for the skew.

Can the backend take the selectivities from the original bind values?

Vladimir




Re: [HACKERS] Fwd: [JDBC] Re: 9.4-1207 behaves differently with server side prepared statements compared to 9.2-1102

2016-01-12 Thread Vladimir Sitnikov
VS>>Why doesn't backend reuse already existing good plan?
PS>this logic is driven by plan cost, not by plan execution time.

I do understand that currently PG replans with $1, $2 and uses
default selectivities for them.

What I am asking for is to make PG aware of the "previously used bind values",
so it would calculate proper selectivities for $1, $2.

PS. It is not the first time the problem bites me, so I hand-crafted a testcase.

Vladimir




Re: [HACKERS] [POC] FETCH limited by bytes.

2015-12-26 Thread Vladimir Sitnikov
>Have you got numbers showing any actual performance win for postgres_fdw?

For JDBC purposes, it would be nice to have the ability to ask the
backend "to stop fetching once it has produced more than X MiB of response
data".
For a small table (4 int4 fields), having a decent fetchSize (~1000) makes
result processing 7 times faster than with a fetchSize of 50 rows (14 ms
-> 2 ms for 2000 rows).
Here are the measurements: [1] and [2].

Note: it is not required to follow the given "max fetch bytes"
limit precisely. It would be enough just to stop after a certain amount of
data has been sent.
The whole point of using a limited fetch size is to avoid running out of
memory on the client side.
I do not think developers care how many rows are fetched at once. If
they do, they should rather use the "limit X" SQL syntax.

Do you have a suggestion for such a use case?

For fixed-size data types, the JDBC driver can estimate a "max sane fetch
size"; however:
1) In order to know the data types, a roundtrip is required. This means
the first fetch must be conservative, thus small queries would be
penalized.
2) For variable-length types there is no way to estimate a "sane number
of rows", except by using the "average row size of already received data".
This is not reliable, especially if the first rows have nulls and
subsequent ones contain non-empty strings.

[1]: https://github.com/pgjdbc/pgjdbc/issues/292#issue-82595473
[2]: https://github.com/pgjdbc/pgjdbc/issues/292#issuecomment-107019387

Vladimir




Re: [HACKERS] [POC] FETCH limited by bytes.

2015-12-26 Thread Vladimir Sitnikov
>and fetch a number of rows that, by its estimation, would fit in the memory 
>available

What's wrong with having a size limit in the first place? It seems to
make much more sense.

Vladimir




Re: [HACKERS] W-TinyLfu for cache eviction

2015-12-13 Thread Vladimir Sitnikov
> a global lock would be good enough for a proof of concept that only
evaluates cache hit ratios.

I think an emulator can be used to check hit ratios. That way we can see
how different algorithms affect the hit ratio.

Is there a set of traces of "buffer load events"? (I did some Google
searches like "postgresql buffer cache trace" with no luck.)
Is there an option that enables tracing of each requested buffer id?

Frankly speaking, I have no access to PG instances with lots of data
(i.e. >10 GiB).

> Maybe.  Want to code it up?

That would be interesting; however, I'm not fluent in C. I've never
written multithreaded C code either. I do understand what a cache line is,
though.
Anyway, before hacking on a prototype it makes sense to gather some traces.

Vladimir




[HACKERS] W-TinyLfu for cache eviction

2015-12-02 Thread Vladimir Sitnikov
I've recently noticed the W-TinyLfu cache admission policy (see [1]) being
used in Caffeine, a "high performance caching library for Java 8".
It demonstrates high cache hit ratios (see [2]) and enables building
high-throughput caches (see caffeine in [3]).
The authors explicitly allow implementations of the algorithm (see [4]).

Does it make sense to evaluate the algorithm for buffer replacement?

[1]: http://arxiv.org/pdf/1512.00727v1.pdf
[2]: https://github.com/ben-manes/caffeine/wiki/Efficiency
[3]: https://github.com/ben-manes/caffeine/wiki/Benchmarks
[4]: https://github.com/ben-manes/caffeine/issues/23#issuecomment-161536706

Vladimir Sitnikov




Re: [HACKERS] lag_until_you_get_something() OVER () window function

2014-10-28 Thread Vladimir Sitnikov
There is already a patch for that (ignore/respect nulls in lead/lag):
https://commitfest.postgresql.org/action/patch_view?id=1096

--
Vladimir




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-03-05 Thread Vladimir Sitnikov
Tom,

I did not follow the thread very closely, so I need to look at what the
ambiguity is there; however, I would love to see window rescans in explain
analyze.

I have great experience in tuning Oracle queries.
There are features in PostgreSQL's explain analyze that I miss badly in
Oracle: 'rows removed by filter' is my favourite one. Improving explain
analyze is great for performance analysis.

I would say the target audience for 'explain analyze' is not all users, but
people closer to 'performance engineers'. Those folks are used to
triple-checking results and building/validating hypotheses, since all the counters
tend to lie, so 'a slightly misleading counter' is not a showstopper.

I did not think of
'non-being-able-to-use-negative-transition-since-floats-do-not-commute'
before.
Thanks to this discussion I see what kind of dragons live here.

I would vote (if I had any vote at all) for the inclusion of performance
statistics in explain analyze (e.g. the number of rescans and the number of rows
fed to the aggregate), provided the performance impact is tolerable in both regular
and explain analyze modes.

I wonder how Oracle handles negative transition (does it?), however that is
a different discussion.

Regards,
Vladimir Sitnikov


Re: [HACKERS] Alias hstore's ? to ~ so that it works with JDBC

2013-02-11 Thread Vladimir Sitnikov
The thing is, the PostgreSQL JDBC driver should be able to parse SQL in order to
tell whether a specific question mark is a bind variable, or inside a string
literal, or inside some comment.

I do not believe the JDBC spec forces treating all ? characters as bind
placeholders. That would be unusable.
Oracle JDBC allows having ? symbols in comments and string literals without
any special escaping.
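
To illustrate the ambiguity the driver has to resolve, a hypothetical query
(hstore's ? is its "contains key" operator; the table and columns are made up):

select *
  from docs
 where attrs ? 'status'      -- hstore operator, not a bind
   and title <> 'really?'    -- '?' inside a string literal, not a bind
   and author = ?            -- the only real JDBC bind placeholder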
On 08.02.2013 at 22:01, Dave Cramer p...@fastcrypt.com wrote:

 That would seem to be the implication. JDBC wouldn't really know anything
 about hstore.

 Dave Cramer

 dave.cramer(at)credativ(dot)ca
 http://www.credativ.ca


 On Fri, Feb 8, 2013 at 12:51 PM, Andrew Dunstan and...@dunslane.net wrote:


 On 02/08/2013 12:41 PM, Kris Jurka wrote:


 On Fri, 8 Feb 2013, Dave Cramer wrote:

  Would this be an postgresql specific escape sequence ? I don't think the
 spec allows for this does it ?

  Yes, this would be a postgresql jdbc driver specific escape.  The spec
 doesn't have a concept of private escape sequences, but that doesn't seem
 like the end of the world.  Clearly the user here is writing postgresql
 specific code to use hstore operators, so there's not a portability loss
 here.



 I assume, though, that you're not talking about something that's
 hstore-specific, but rather something that will allow the user to put a
 non-parameter question mark in the query string. As has been noted
 upthread, the hstore use is far from the only one that causes users to trip
 on this.

 cheers

 andrew





Re: [HACKERS] benchmarking the query planner

2008-12-11 Thread Vladimir Sitnikov


 BTW, does anyone have an opinion about changing the upper limit for
 default_stats_target to, say, 1?  These tests suggest that you
 wouldn't want such a value for a column used as a join key, but
 I can see a possible argument for high values in text search and
 similar applications.

Have you considered using hash tables?
I am not sure a hash is a perfect match here; however, I guess some kind of
data structure might improve the N^2 behaviour. It looks like that would improve
both array_eq (which would narrow the list of possible arrays to a single
hash bucket) and large _target values (I guess that would improve N^2 to N).

Regards,
Vladimir Sitnikov


Re: [HACKERS] benchmarking the query planner

2008-12-11 Thread Vladimir Sitnikov


  Do you consider using hash tables?

 Doubt it's really worth it, unless there's some way to amortize the
 setup cost across multiple selectivity estimations; which would surely
 complicate life.

MCV lists are updated only during the analyze phase, aren't they? If the setup
cost is the cost of maintaining those hash tables, it is not going to
hurt much.




 One thing that just now occurred to me is that as long as we maintain
 the convention that MCV lists are in decreasing frequency order, one can
 take any prefix of the list and it's a perfectly good MCV list of less
 resolution.  So one way to reduce the time taken in eqjoinsel is to set
 an upper limit on the number of entries considered *by that routine*,
 whereas other estimator functions could use larger lists.

That makes sense; however, a linear search for a single item in a list of
10,000 elements could take a while. A hash lookup might be a better choice.

Regards,
Vladimir Sitnikov


Re: [HACKERS] benchmarking the query planner

2008-12-11 Thread Vladimir Sitnikov



 There's something in what you say, but consider that we have pretty
 much unanimous agreement that 10 is too small.  I think we should
 try to fix the problem, not just gradually ratchet up the value until
 people start complaining in the other direction.  (Also, we should have
 plenty of opportunity during beta to find out if we went too far.)

I am not sure whether the entity-attribute-value model is used with Postgres
databases; however, that is one of the cases that requires a large MCV list
(generally, for the attribute column).

You know, Oracle is not able to store more than 254 distinct values for
histogram statistics. That really limits the use of histograms for the software
product the company I work for creates.

One more direction could be implementing MCVs for ranges of values (group
values and interpolate in between). Consider statistics on a timestamp column
saying that 2008-December has X rows, 2008-November has Y rows, etc.
That could be used for rather accurate cardinality estimation of "between"
cases, while keeping the number of entries
in the MCV list small.


Regards,
Vladimir Sitnikov


Re: [HACKERS] benchmarking the query planner

2008-12-11 Thread Vladimir Sitnikov


 I think that would likely be redundant with the histogram.

I am afraid I do not get you. I mean that histograms should be considered when it
comes to increasing the number of MCV entries (at least for numeric/timestamp
values). With a histogram, a lower number of entries could be sufficient to get
reasonable accuracy of estimations. I have no idea how to decide on an
automatic switch between histograms and MCVs. It might sound crazy, but one could
compute both histograms and MCVs and use them both (and pick the average of the
two estimations :) )
Regards,
Vladimir Sitnikov


Re: [HACKERS] benchmarking the query planner

2008-12-11 Thread Vladimir Sitnikov

 What is the specific difference between what you are talking about and
 what scalarineqsel already implements?

Hmm... Nothing new. Sorry for bothering you. I did not realize
histograms were already implemented.


Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Vladimir Sitnikov

 I'm not sure what the best way is though.  I don't think continuing to
 add keywords between EXPLAIN and the start of the query is very
 scalable.  Putting parentheses around the option list seems like it
 might eliminate a lot of grammar headaches:

Do you think it is necessary to invent special grammar just for presentation
purposes?

I guess the database should not deal with presentation. Provided EXPLAIN
returns a table, it is up to the client to do the formatting. I do not believe
it makes sense to create several different explain outputs and then redo all the
work in 8.5.

It could still make sense to have several options for explain if they would
result in *different instrumentation* (e.g. explain vs explain analyze).



Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-09 Thread Vladimir Sitnikov
On Tue, Dec 9, 2008 at 8:53 PM, Robert Haas [EMAIL PROTECTED] wrote:

 On Tue, Dec 9, 2008 at 12:44 PM, Greg Stark [EMAIL PROTECTED]
 wrote:
  Yes this is one reasonable option, as is the idea of using XML or a table
  and making it the client's problem. Neither are going to happen for this
  release I think.

 Agreed.

I 100% agree with that point. Thus I suggest outputting the additional information
in explain analyze, since:
 1) it will require minimal code changes
 2) it will be consistent with the previous behaviour
 3) it looks like a natural improvement of EXPLAIN's features
 4) it will be changed anyway when the table output for explain arrives


  And in any case it will always be useful to have an option to print all
 the
  available information anyways so we make as well do that with verbose.

 Sounds very nice.

Can I ask my question once again?
Why don't you want to make "print all the info" the default output format?
When it comes to pgsql-performance, people used to recommend: "please,
provide EXPLAIN ANALYZE, and not just EXPLAIN".
If the default output format is not changed in 8.4, this will turn into
"please, provide EXPLAIN ANALYZE VERBOSE, not just EXPLAIN ANALYZE or
EXPLAIN". Do you really want that?


Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-08 Thread Vladimir Sitnikov


   2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.
  
  I do not get the point of VERBOSE.
  As far as I understand, explain analyze (without verbose) will anyway add
  overhead for calculation of gets/hits/cpu. Why discard that information
 in
  non verbose mode? Just to make the investigation more complex?

 I thought that output of new counters are too wide and it brakes
 compatibility of EXPLAIN ANALYZE. On the other hand, we don't have to
 think of compatibility in EXPLAIN ANALYZE VERBOSE because it is newly
 added in 8.4. However, overheads should be avoided. We could have
 two kinds of instrumentations, time-only or all-stats.

I am not sure this impact (in terms of compatibility) is really that big.
As far as I understand, pgAdmin parses the modified explain analyze output
well. However, pgAdmin does not support VERBOSE.

The impact in terms of computational overhead could be measured. I guess
it depends only on the number of calls to the instrumentation (that is,
either rows or loops). Thus, after explaining some basic select from
generate_series, one could adjust the timings. We could even incorporate
this into EXPLAIN ANALYZE, so it would display "cpu is 0.5s, of which
instrumentation cpu is 0.3s".
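
A trivial calibration query of the kind meant here (my own example): virtually
all of its runtime is per-row work, so comparing the plain run with the
EXPLAIN ANALYZE run gives an estimate of the per-call instrumentation overhead:

select count(*) from generate_series(1, 1000000);
explain analyze select count(*) from generate_series(1, 1000000);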


Another idea is to have buffer and cpu counters not in Instrumentation
 but in QueryDesc (i.e, only one per query). A new field for the counter
 will be initialized in pg_stat_statements module. EXPLAIN ANALYZE could
 also use the field but the result is printed only the end of query plan,
 as the follows. We can avoid too-wide-line problem with the approach.

A single number per query is sufficient only for pg_stat_statements purposes.
It will give an insight into what the top consuming queries are (by cpu
time, by gets, etc).
However, a single "gets=... reads=..." line is not sufficient to pinpoint the
problem, especially in the case of a complex query (it is comparable to "the
query returned N rows" vs "this plan node returned N rows").



 =# EXPLAIN ANALYZE SELECT ...;
QUERY PLAN
 ---
  ...
  Total runtime: 24.273 ms
  CPU usage: user=20.2 sys=2.4   #
  Buffer usage: gets=100 reads=10 writes=20  # added lines
  Tempfile i/o: reads=50 writes=50   #
 (10 rows)

I wish pgAdmin (or whatever client) had an option to fetch those counters for
each and every SQL query and display the consumed resources in a separate
tab. I mean, even before/after a plain select (without any explain). That
would show you how the query behaves without any instrumentation.

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-05 Thread Vladimir Sitnikov

 The main benefit is that you can track how EXPLAIN plans change over time.

It is not required to output the plan *into* some table to be able to track it over
time. If EXPLAIN returns a table, it is up to you to perform "insert into
history select * from explain(...)".

The workflow that does not make sense to me is "look at plans generated _into
some plan_table_ by other sessions", as in Oracle.
I am 100% sure it really makes sense to have some view like pg_execute_plan
that would reveal execution plans for currently running queries (see
v$sql_plan in Oracle). However, I would stress once again that I see no point
in one session explaining into a plan_table while another session reads that plan.

Does that make sense?

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-04 Thread Vladimir Sitnikov
 2. EXPLAIN ANALYZE VERBOSE shows buffer statistics in 'actual' section.

I do not get the point of VERBOSE.
As far as I understand, explain analyze (without verbose) will add the
overhead of calculating gets/hits/cpu anyway. Why discard that information in
non-verbose mode? Just to make the investigation more complex?

Write-counters are not included because I think they are not so useful.

Never say never. I guess they (or just one counter for accumulated writes)
could be useful for monitoring operations that spill to disk. For
instance, PostgreSQL does not show the amount of temp space used for a join.



buffer_gets;/* # of buffer hits */
buffer_hits;/* # of buffer gets */
buffile_reads;  /* # of buffile reads */

I guess it makes sense to expand "buffile reads" into "buffer file reads" or
just "file reads".

Here is an sample output. We'd better to add a linebreak before
 the 'actual' section because the line is too wide to display.

I wish there were a way to get the results of explain into some table. I wish
it were the default output format. That would make pgAdmin's life easier
and improve readability even in psql. Don't you think there is something
wrong with having cost=... rows=... loops=... in each and every row?



 ResetBufferUsage() is save the current counters in global variables as
 baseline and buffer statistics are measured in difference from them
 because the counters are used by struct Instrumentation.

That would definitely work well for Instrumentation (it will not notice
ResetBufferUsage any more); however, that will not isolate the callers that do
the reset. I am afraid the idea of having an API for reset is broken, and it
makes sense to remove that function. However, that looks to be out of scope
of this patch.

Regards,
Vladimir Sitnikov


Re: [HACKERS] contrib/pg_stat_statements 1202

2008-12-04 Thread Vladimir Sitnikov

 Vladimir Sitnikov [EMAIL PROTECTED] writes:

  I wish there was a way to get the results of explain into some table. I
 wish
  it was the default output format. That would make life of pgAdmin
 easier,
  and improve readability even in psql. Do not you think there is something
  wrong with having cost=... rows=... loops=... in each and every row?

 A number of people have suggesting we switch to XML.

I do not see much benefit in XML:
 * XML is not human-readable
 * A plain old result set is even easier to process, since it is the main
PostgreSQL interface at this point

The only benefit of XML I can imagine is that it could provide nicer markup
for sort/hash/etc nodes. It is not that nice to have a "sort method" column
that would be empty for nearly all rows. At the same time, it looks fine
to have a column with XML inside for any additional information an execution
node wants to provide (like topN vs all-rows sort, the number of batches in a
hash join, or whatever).



 An alternative would be to build up a tuplestore of data and then send that
 to
 the client in a separate result set. That's kind of nice because it would
 give
 us a way to send both the real results and the explain results. And at
 least
 we already have an api for accessing result sets.

Sounds good. As for me, the current output of explain is not very easy to read:
it suits only the "find the timings for a particular node" workflow (I
mean, the input is a particular node, the result is its
timings/rows/buffers/etc). However, from my point of view, when it comes to
query tuning the main workflow is "find the node with suspicious timings".
If all the relevant data were displayed in the same column, it would be easier to
read. Consider having all the row counts in the very first column.



 Oracle's approach is to have the explain command stuff the results into a
 table. That has advantages for tools, especially if you want to be able to
 look at plans generated by other sessions.

I do not believe that workflow makes sense. I have never ever thought of it.

An external table makes sense if you have several output formats (say, create a
formatting function for psql and let pgAdmin format the plan on its own).

Regards,
Vladimir Sitnikov


Re: [HACKERS] Windowing Function Patch Review - ROW_NUMBER without ORDER BY

2008-11-09 Thread Vladimir Sitnikov


 I see this as a greate feature.

I would treat ranking functions without an explicit order by clause as a
feature rather than a bug. However, I believe that in most cases the optimizer
will avoid the additional sort when possible, so an order by in a windowing
clause would not cause any performance degradation.



 It will hopefully be possible to write:

 SELECT *, max(row_number()) over() as total_rows from employees;

I believe this query does not make sense. At the very least, row_number()
without an "over" clause sounds odd.

To count all the rows (if you really want to) you can use count(*) over
().




 To get the maximum number of rows in a separate column. Very usefull when
 writing queries to retrieve paged results. Like Give me the 20 top
 articles sorted on date and also the total number of articles in *one*
 query, eliminating the need for a separate count(*) query.

Sometimes it is better to perform several separate queries, since the optimizer
can use an index scan to get the top 20 and a seq scan to get the count(*); a
sketch of both approaches is below.
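
(Hypothetical articles table and columns, just to show the two shapes of the query.)

-- one query: the total is attached to every row of the page; note that the
-- window count forces scanning all qualifying rows despite the LIMIT
select a.*, count(*) over () as total_rows
  from articles a
 order by published_at desc
 limit 20;

-- two queries: the planner can pick an index scan for the page
-- and an independent plan for the count
select * from articles order by published_at desc limit 20;
select count(*) from articles;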

Regards,
Vladimir Sitnikov


Re: [HACKERS] Bitmap index - first look

2008-11-07 Thread Vladimir Sitnikov
Could you please put more comments on the index build procedure?

I guess it strongly relies on the fact that the block number increases as tuples
are returned from the heap scan, doesn't it? However, thanks to synchronized
scans, the order of tuples could be a little bit different.

I found no evidence of the index build being able to add tid #10 after it
has created the bitmap for tids 1000...2000. As far as I understand, it never
tries to update words that were written earlier during index creation.

That particular "if (_blockno != buf->hot_buffer_block) {" in
buf_add_tid_with_fill looks wrong to me -- I believe it fires way too
often (it will try to sync the buffer after every 300/8 = ~40 bytes,
since there are no more than 300 tuples on a single page).
I would not flush the bitmap on every data block; instead I would keep the hot
buffer as a temporary (uncompressed) view of the bitmap being built. Then, as
tuples come in, we either set the bit directly in the hot buffer (if it covers
the relevant tid range) or flush that view to the bitmap (merging and splitting
the bitmap where required) and repoint the window so that it starts with the
block of the tid that triggered the flush. Does that make sense?


Regards,
Vladimir Sitnikov


Re: [HACKERS] Bitmap index - first look

2008-11-06 Thread Vladimir Sitnikov
One more point on pg_am: amsearchnull is equal to "f"; however, the index
stores nulls and can find them perfectly well.

Regards,
Vladimir Sitnikov


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Vladimir Sitnikov
 Quoted from SQL:2008
 If CUME_DIST is specified, then the relative *rank *of a row R is defined
 as
 NP/NR, where NP is defined
 to be the number of rows preceding or peer with R in the window ordering of
 the window partition of R
 and NR is defined to be the number of rows in the window partition of R.

I guess there is a difference between row_number and "the number of rows
preceding or peer with R".

number of rows preceding or peer with R == count(*) over (order by salary)

As far as I understand, the following query should calculate cume_dist
properly (and it does so in Oracle):

SELECT name,CAST(r AS FLOAT) / c, cd
FROM (SELECT name,
COUNT(*) OVER(ORDER BY salary) as r,
COUNT(*) OVER() AS c,
CUME_DIST() OVER(ORDER BY salary) AS cd
 FROM employees
) t;

Sincerely yours,
Vladimir Sitnikov


Re: [HACKERS] Windowing Function Patch Review - Standard Conformance

2008-11-04 Thread Vladimir Sitnikov


 Even though I understand the definition, your suggestion of COUNT(*)
 OVER (ORDER BY salary) doesn't make sense.

Why does that not make sense?
I have not read the spec; however, Oracle has a default window specification
for the case when there is only an order by clause. The default window is "range
between unbounded preceding and current row".

count(*) over (order by salary range between unbounded preceding and
current row) is, by that definition, exactly "the number of rows preceding or
peer with R", isn't it? I see here a word-by-word translation from SQL to
English and vice versa.

If the patch returns row_number, it is wrong, since there is no way for
row_number to be "the number of rows preceding or peer with R", is there?

Regards,
Vladimir Sitnikov


Re: [HACKERS] Bitmap Indexes patch (was Re: Bitmap Indexes: request for feedback)

2008-11-03 Thread Vladimir Sitnikov
 I looked at that aspect of the patch specifically a few weeks back while
 checking for possible issues with Hot Standby. IIRC the patch is fairly
 careful with locking and uses Exclusive locks extensively throughout. I
 looked at both the theory and the implementation. Unless Gianni changed
 something in that regard recently, I don't understand that comment at
 all. Probably need to provide specific examples of your concerns.

The major thing there is to get the modifications right. There is not much
sense in reviewing wrong code for locking issues.

I wished to focus on the performance aspect of the patch; however, it turned
out there are major issues with functionality: the index stores wrong tids
inside :(
I really would love to fix that issue and get a chance to validate the
performance. Unfortunately, I have spent more than a day on it with almost no
success.

I have two testcases for which the index fails to get the correct result:

Testcase 1 (I guess there is a conflict between _bitmap_formitem and
mergewords):

Basically, I create a table where all rows are equal to 1 except the 19th, which
is 0.

create table t1 as select case when i=19 then 0 else 1 end as i from
generate_series(1,20) as s(i)
create index t1ix on t1 using bitmap (i) where i = 0;
set enable_seqscan=off;
select ctid,i From t1 where i=0; -- no rows selected. Debug shows the index
suggests ctid==(0,35) instead of (0,19); 35==16+16+3.

Testcase 2

create table t2 as select i, 0 j from generate_series(1,1000) as s(i);
update t2 set j=1 where i in (5, 230)
create index t2ix on t2 using bitmap(j) where j=1;

set enable_seqscan=off;
select ctid, i, j from t2 where j=1; -- no rows selected. Debug shows index
suggests ctids==(0,97) and (0,98) instead of (4,97) and (4,98) -- it loses the
page number somewhere along the way.

Both testcases reveal defects in index creation.

Regards,
Vladimir Sitnikov

