On 04.05.2018 18:22, Merlin Moncure wrote:
On Thu, May 3, 2018 at 12:01 PM, Robert Haas <robertmh...@gmail.com> wrote:
On Fri, Apr 27, 2018 at 4:43 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
What _I_ (maybe not others) want is a
faster pgbouncer that is integrated into the database; IMO it does
everything exactly right.
I have to admit that I find that an amazing statement.  Not that
pgbouncer is bad technology, but saying that it does everything
exactly right seems like a vast overstatement.  That's like saying
that you don't want running water in your house, just a faster motor
for the bucket you use to draw water from the well.
Well you certainly have a point there; I do have a strong tendency for
overstatement :-).

Let's put it like this: being able to have connections funnel down to
a smaller number of sessions is a nice feature.  Applications that are
large, complex, or super high volume tend towards stateless
(with respect to the database session) architecture anyway, so I tend
not to mind the lack of session features when pooling (prepared statements
perhaps being the big outlier here).  It really opens up a lot of
scaling avenues.  So a better-phrased statement might be, "I
like the way pgbouncer works, in particular transaction-mode pooling
from the perspective of the applications using it".  The current main pain
points are the previously mentioned administrative headaches; better
performance from a different architecture (pthreads vs. libev)
would also be nice.

I'm a little skeptical that we're on the right path if we are pushing
a lot of memory consumption into the session level where a session is
pinned all the way back to a client connection.  plpgsql function plan
caches can be particularly hungry on memory, and since sessions have
their own GUCs, ISTM each session has to have its own set of plans,
since plans depend on the search_path GUC, which is session-specific.
Previous discussions on centrally managing cache memory consumption (I do
dimly recall you making a proposal on that very thing) haven't
gone past the planning stages AFAIK.

If we are breaking the 1:1 backend:session relationship, what controls
would we have to manage resource consumption?

Most resource consumption is tied to backends, not to sessions.
First of all, there are the catalog and relation caches. If there are thousands of tables in a database, then these caches (whose size is currently not limited) can grow to several megabytes per backend. Taking into account that on modern SMP systems with hundreds of CPU cores it may be reasonable to spawn hundreds of backends, the total memory footprint of these caches can be very significant. This is why I think we should move towards shared caches... But this trip is not expected to be easy.
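A back-of-envelope sketch of the footprint argument above; the per-backend cache size and backend count are illustrative assumptions, not measurements:

```python
# Assumed: a database with thousands of tables, where each backend's
# catalog + relation caches grow to a few megabytes (they are unbounded).
catalog_cache_per_backend_mb = 5

# Assumed: one backend per core on a large SMP machine.
backends = 200

# Every backend duplicates the same cached catalog data, so the total
# footprint scales linearly with the number of backends.
total_mb = catalog_cache_per_backend_mb * backends
print(total_mb, "MB of duplicated cache across backends")  # prints 1000 MB ...
```

With shared caches, that cost would be paid roughly once instead of once per backend.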

Right now a connection pooler allows handling many more user sessions than there are active backends,
so it partly solves this resource-consumption problem.
The session context itself is not expected to be very large: changed GUCs plus prepared statements.
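A minimal sketch of the transaction-mode pooling being discussed: many client sessions share a small pool of backends, and a backend is held only for the duration of one transaction. The class and names are hypothetical, not pgbouncer's actual implementation:

```python
from itertools import cycle

class Pool:
    """Toy transaction-mode pooler: M sessions multiplexed over N backends."""

    def __init__(self, n_backends):
        # Round-robin over backend ids; a real pooler tracks which
        # backends are idle instead of blindly rotating.
        self.backends = cycle(range(n_backends))

    def run_transaction(self, session_id, sql):
        backend = next(self.backends)       # borrow a backend ...
        return (session_id, backend, sql)   # ... run the txn, then release it

pool = Pool(n_backends=4)
results = [pool.run_transaction(s, "SELECT 1") for s in range(100)]
# 100 sessions were served by only 4 backends
```

The point is the ratio: session count (and its small context) can far exceed backend count, so per-backend costs are amortized.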

I accept your argument about stateless application architecture.
Moreover, this is more or less the current state of things: most customers have to use pgbouncer and so have to prohibit all session-specific features in their applications. What do they lose in this case? Prepared statements? But there are real alternative solutions: autoprepare, a shared plan cache, ... which allow using prepared statements without session context. Temporary tables, advisory locks, ... ?
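A hedged sketch of the autoprepare idea: plans are cached in a shared structure keyed by query text plus the planning-relevant GUCs (here just search_path), so any session can reuse a plan without per-session prepared-statement state. `plan_for` is a hypothetical stand-in for the real planner:

```python
# Shared across all sessions: (query text, search_path) -> plan.
shared_plans = {}

def plan_for(query, search_path):
    # Placeholder for real planning work.
    return f"PLAN[{search_path}:{query}]"

def execute(query, search_path="public"):
    key = (query, search_path)
    if key not in shared_plans:                 # plan once ...
        shared_plans[key] = plan_for(query, search_path)
    return shared_plans[key]                    # ... reuse from any session

execute("SELECT * FROM t")                      # planned and cached
execute("SELECT * FROM t")                      # cache hit, no session state
execute("SELECT * FROM t", search_path="app")   # different GUC -> separate plan
```

Keying on search_path addresses the earlier point that plans depend on session-specific GUCs: sessions with the same settings share an entry, differing ones get their own.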

Temporary tables are actually a very "ugly" thing, causing a lot of problems:
- they cannot be created on a hot standby
- they cause catalog bloat
- deallocating a large number of temporary tables may acquire too many locks
...
Maybe they should somehow be redesigned? For example, have a shared catalog entry for a temporary table, but backend-private content... Or make it possible to change the lifetime of temporary tables from session to transaction...


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

