
Re: [HACKERS] Separate connection handling from backends

2016-12-07 Thread Craig Ringer

On 7 December 2016 at 22:27, Kevin Grittner  wrote:

> I don't know how that execution model would compare to what we use
> now in terms of performance, but its popularity makes it hard to
> ignore as something to consider.

Those engines also tend to be threaded. They can stash state in memory
and hand it around between executors in ways we cannot really do.

I'd love to see a full separation of executor from session in
postgres, but I can't see how it could be at all practical. The use of
globals for state and the assumption that session == backend is baked
in way too deep.

At least, I think it'd be a slow and difficult thing to change, and
would need many steps. Something like what was proposed upthread would
possibly make sense as a first step.

But again, I don't see anyone who's likely to actually do it.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Separate connection handling from backends

2016-12-07 Thread Kevin Grittner

On Wed, Dec 7, 2016 at 12:36 AM, Jim Nasby  wrote:

> The way I'm picturing it backends would no longer be directly
> tied to connections. The code that directly handles connections
> would grab an available backend when a statement actually came in
> (and certainly it'd need to worry about transactions and session
> GUCs).

If we're going to consider that, I think we should consider going
all the way to the technique used by many (most?) database
products, which is to have a configurable number of "engines" that
pull work requests from queues.  We might have one queue for disk
writes, one for disk reads, one for network writes, etc.
Traditionally, each engine spins over attempts to read from the
queues until it finds a request to process; blocking only if
several passes over all queues come up empty.  It is often possible
to bind each engine to a particular core.  Current process-local
state would be passed around, attached to queued requests, in a
structure associated with the connection.
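A minimal sketch of the queue-polling "engine" model described above, written in Python rather than anything resembling PostgreSQL internals. The names `engine`, `run_demo`, `max_empty_passes` and `idle_wait` are all hypothetical, and the short timed wait stands in for whatever blocking primitive a real engine would use.

```python
import queue
import threading


def engine(queues, handle, stop, max_empty_passes=3, idle_wait=0.01):
    """Poll every queue in turn; wait only after several empty passes."""
    empty_passes = 0
    while not stop.is_set():
        found = False
        for q in queues:
            try:
                request = q.get_nowait()
            except queue.Empty:
                continue
            handle(request)
            q.task_done()
            found = True
        if found:
            empty_passes = 0
            continue
        empty_passes += 1
        if empty_passes >= max_empty_passes:
            # Stand-in for blocking: sleep briefly (or until told to stop).
            stop.wait(idle_wait)
            empty_passes = 0


def run_demo():
    disk_reads, net_writes = queue.Queue(), queue.Queue()
    done, stop = [], threading.Event()
    # Each request carries its connection's state along with it.
    for conn in range(3):
        disk_reads.put({"conn": conn, "op": "read"})
        net_writes.put({"conn": conn, "op": "send"})
    workers = [
        threading.Thread(target=engine,
                         args=([disk_reads, net_writes], done.append, stop))
        for _ in range(2)
    ]
    for w in workers:
        w.start()
    disk_reads.join()   # wait until every queued request was handled
    net_writes.join()
    stop.set()
    for w in workers:
        w.join()
    return done
```

The point of the sketch is only the control flow: a fixed number of engines, several queues, and per-connection state travelling with each request instead of living in a process.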

I don't know how that execution model would compare to what we use
now in terms of performance, but its popularity makes it hard to
ignore as something to consider.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby


On 12/6/16 10:34 PM, Craig Ringer wrote:

> In other words, we could start to separate session state from executor
> state in a limited manner. That'd definitely be valuable, IMO; it's a
> real shame that Pg's architecture so closely couples the two.
>
> So - is just doing "PgInCoreBouncer" a good idea? No, I don't think
> so. But there are potentially good things to be done in the area.


Right.


> What I don't see here is a patch, or a vague proposal for a patch, so
> I'm not sure how this can go past the hot-air stage.


Yeah, I brought it up because I think there's potential tie-in with 
other things that have been discussed (notably async transactions, but 
maybe BG workers and parallel query could benefit too). Maybe it would 
make sense as part of one of those efforts.


Though, this is something that's asked about often enough that it'd 
probably be possible to round up a few companies to fund it.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby


On 12/6/16 6:19 PM, Tom Lane wrote:

>> I'm kind of mystified how a simple code restructuring could solve the
>> fundamental problems with a large number of backends. It sounds like
>> what you're describing would just push the problem around, you would
>> end up with some other maximum instead, max_backends, or
>> max_active_backends, or something like that with the same problems.
>
> What it sounds like to me is building a connection pooler into the
> backend.  I'm not really convinced we ought to go there.


The way I'm picturing it backends would no longer be directly tied to 
connections. The code that directly handles connections would grab an 
available backend when a statement actually came in (and certainly it'd 
need to worry about transactions and session GUCs).


So in a way it's like a pooler, except it'd be able to do things that 
poolers simply can't (like safely switch the user the backend is using).
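A rough sketch of that idea, assuming a toy model in Python: connections are lightweight objects, and a backend is borrowed only while a statement or an open transaction needs it. `Backend`, `Connection` and `ConnectionLayer` are hypothetical names, and the toy backend's BEGIN/COMMIT tracking merely stands in for the transaction status a real backend would report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backend:
    backend_id: int
    user: Optional[str] = None
    in_transaction: bool = False

    def execute(self, statement):
        # Toy transaction tracking; a real backend reports this status itself.
        word = statement.split()[0].upper()
        if word == "BEGIN":
            self.in_transaction = True
        elif word in ("COMMIT", "ROLLBACK"):
            self.in_transaction = False
        return f"backend {self.backend_id} as {self.user}: {statement}"


@dataclass
class Connection:
    user: str
    backend: Optional[Backend] = None


class ConnectionLayer:
    def __init__(self, n_backends):
        self.idle = [Backend(i) for i in range(n_backends)]

    def run(self, conn, statement):
        if conn.backend is None:
            if not self.idle:
                # A real connection layer would queue the request instead.
                raise RuntimeError("no backend available")
            conn.backend = self.idle.pop()
            # Switching the user is safe here because the client never
            # talks to the backend directly.
            conn.backend.user = conn.user
        backend = conn.backend
        result = backend.execute(statement)
        if not backend.in_transaction:
            # Release the backend as soon as no transaction is open.
            self.idle.append(backend)
            conn.backend = None
        return result
```

With a single backend, two connections can take turns on it between transactions; while one holds an open transaction, the other finds no backend available.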


I think there might be other uses as well, since there's several other 
places where we need something that's kind-of like a backend, but if 
Heikki's work radically shifts the expense of running many thousands of 
backends then it's probably not worth doing.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Craig Ringer

On 7 December 2016 at 10:19, Tom Lane  wrote:

> What it sounds like to me is building a connection pooler into the
> backend.  I'm not really convinced we ought to go there.

If we do, it probably needs to be able to offer things that
out-of-tree ones can't.

The main things I see that you can't do sensibly with an out-of-tree pooler are:

* Re-use a backend for different session users. You can SET SESSION
AUTHORIZATION, but once you hand the connection off to the client they
can just do it again or RESET SESSION AUTHORIZATION and whammo,
they're a superuser. Same issue applies for SET ROLE and RESET ROLE.

* Cope with session-level state when transaction pooling. We probably
can't do anything much about WITH HOLD cursors, advisory locks, etc,
but we could save and restore GUC state and a few other things, and we
could detect whether or not we can save and restore state so we could
switch transparently between session and transaction pooling.

* Know, conclusively, whether a query is safe to reroute to a
read-only standby, without hard coded lists of allowed functions, iffy
SQL parsers, etc. Or conversely, transparently re-route queries from
standbys to a read/write master.
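To illustrate the second point, here is a small Python sketch of saving and restoring session state, and of detecting when that is not possible. Everything in it is an assumption made for illustration: `SessionState`, `pooling_mode`, `save_gucs` and `restore_gucs` are hypothetical names, and real session state covers far more than GUCs, held cursors and advisory locks.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    gucs: dict = field(default_factory=dict)
    held_cursors: set = field(default_factory=set)    # WITH HOLD cursors
    advisory_locks: set = field(default_factory=set)

    def is_portable(self):
        # State that cannot be saved and restored pins the session
        # to its current backend.
        return not self.held_cursors and not self.advisory_locks


def pooling_mode(state):
    """Choose transaction pooling only when the session can move backends."""
    return "transaction" if state.is_portable() else "session"


def save_gucs(state):
    if not state.is_portable():
        raise ValueError("session holds state that cannot be saved")
    return dict(state.gucs)


def restore_gucs(backend_gucs, saved):
    # Drop whatever the previous session left behind, then apply ours.
    backend_gucs.clear()
    backend_gucs.update(saved)
```

A session that only changed `search_path` could be moved between backends freely; one that also took an advisory lock would fall back to session pooling until the lock is released.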

In other words, we could start to separate session state from executor
state in a limited manner. That'd definitely be valuable, IMO; it's a
real shame that Pg's architecture so closely couples the two.

So - is just doing "PgInCoreBouncer" a good idea? No, I don't think
so. But there are potentially good things to be done in the area.

What I don't see here is a patch, or a vague proposal for a patch, so
I'm not sure how this can go past the hot-air stage.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Jim Nasby


On 12/6/16 1:46 PM, Adam Brusselback wrote:

>> BTW, it just occurred to me that having this separation would make
>> it relatively easy to support re-directing DML queries from a
>> replica to the master; if the backend throws the error indicating
>> you tried to write data, the connection layer could re-route that.
>
> This also sounds like it would potentially allow re-routing the other
> way where you know the replica contains up-to-date data, couldn't you
> potentially re-direct read only queries to your replicas?


That's a lot more complicated, so I don't see that happening anytime soon.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Tom Lane

Greg Stark  writes:
> On 5 December 2016 at 19:48, Jim Nasby  wrote:
>> One solution to this would be to segregate connection handling from actual
>> backends, somewhere along the lines of separating the main loop from the
>> switch() that handles libpq commands. Benefits:

> I'm kind of mystified how a simple code restructuring could solve the
> fundamental problems with a large number of backends. It sounds like
> what you're describing would just push the problem around, you would
> end up with some other maximum instead, max_backends, or
> max_active_backends, or something like that with the same problems.

What it sounds like to me is building a connection pooler into the
backend.  I'm not really convinced we ought to go there.

> Heikki's work with CSN would actually address the main fundamental
> problem. Instead of having to scan PGPROC when taking a snapshot,
> taking a snapshot would be O(1).

While that would certainly improve matters, I suspect there are still
going to be bottlenecks arising from too many backends.

regards, tom lane



Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Greg Stark

On 5 December 2016 at 19:48, Jim Nasby  wrote:
> One solution to this would be to segregate connection handling from actual
> backends, somewhere along the lines of separating the main loop from the
> switch() that handles libpq commands. Benefits:

I'm kind of mystified how a simple code restructuring could solve the
fundamental problems with a large number of backends. It sounds like
what you're describing would just push the problem around, you would
end up with some other maximum instead, max_backends, or
max_active_backends, or something like that with the same problems.
At best it would help people who have connection pooling or only a few
connections active at any given time.

Heikki's work with CSN would actually address the main fundamental
problem. Instead of having to scan PGPROC when taking a snapshot,
taking a snapshot would be O(1). There might need to be scans of the
list of active transactions but never of all connections whether
they're in a transaction or not.
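The difference in cost can be shown with a toy model in Python. This is only an illustration of the idea, not PostgreSQL's actual data structures or the CSN patch: `proc_array_snapshot` and `CsnModel` are hypothetical names, and real snapshots carry more than is modelled here.

```python
def proc_array_snapshot(procs):
    """Visit every backend slot, idle or not: O(number of connections).

    `procs` maps a backend id to its running transaction id, or None if idle.
    """
    return frozenset(xid for xid in procs.values() if xid is not None)


class CsnModel:
    """Commit-sequence-number model: a snapshot is just the current counter."""

    def __init__(self):
        self.next_csn = 1
        self.commit_csn = {}   # xid -> CSN assigned at commit

    def commit(self, xid):
        self.commit_csn[xid] = self.next_csn
        self.next_csn += 1

    def snapshot(self):
        # O(1): no per-connection scan at all.
        return self.next_csn

    def visible(self, xid, snapshot):
        # A transaction is visible if it committed before the snapshot.
        csn = self.commit_csn.get(xid)
        return csn is not None and csn < snapshot
```

In the first function a thousand idle connections cost a thousand slot visits per snapshot; in the second, the number of connections does not appear at all.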

-- 
greg


Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Adam Brusselback

>
> BTW, it just occurred to me that having this separation would make it
> relatively easy to support re-directing DML queries from a replica to the
> master; if the backend throws the error indicating you tried to write data,
> the connection layer could re-route that.


This also sounds like it would potentially allow re-routing the other way
where you know the replica contains up-to-date data, couldn't you
potentially re-direct read only queries to your replicas?

Re: [HACKERS] Separate connection handling from backends

2016-12-06 Thread Kevin Grittner

On Mon, Dec 5, 2016 at 6:54 PM, Jim Nasby  wrote:
> On 12/5/16 2:14 PM, David Fetter wrote:

>> What do you see as the relationship between this proposal and the
>> earlier one for admission control?
>>
>> https://www.postgresql.org/message-id/4b38c1c502250002d...@gw.wicourts.gov
>
> Without having read the paper referenced in that email or the rest of the
> thread...

> One big difference from what Kevin described though: I don't think it makes
> sense for the connection layer to be able to parse queries. I suspect it
> would take a very large amount of work to allow something that's not a
> full-blown backend to parse, because it needs access to the catalogs.
> *Maybe* it'd be possible if we used a method other than ProcArray to
> register the snapshot that parsing requires, but you'd still have to
> duplicate all the relcache stuff.

I don't recall ever, on the referenced thread or any other,
suggesting what you describe.  Basically, I was suggesting that we
create a number of hooks which an admission control policy (ACP) could
tie into, and we could create pluggable ACPs.  One ACP that I think
would be useful would be one that ties into a hook placed at the
point(s) where a transaction is attempting to acquire its first
"contentious resource" -- which would include at least snapshot and
locks.  If the user was a superuser it would allow the transaction
to proceed; otherwise it would check whether the number of
transactions which were holding contentious resources had reached
some (configurable) limit.  If allowing the transaction to proceed
would put it over the limit, the transaction would be blocked and
put on a queue behind any other transactions which had already been
blocked for this reason, and a transaction from the queue would be
unblocked whenever the count of transactions holding contentious
resources fell below the threshold.
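The policy described above can be sketched in a few lines of Python. The class and method names are hypothetical, and "blocking" is modelled simply by returning False and remembering the transaction in a FIFO queue.

```python
from collections import deque


class AdmissionControl:
    def __init__(self, limit):
        self.limit = limit
        self.holding = set()     # transactions holding contentious resources
        self.waiting = deque()   # blocked transactions, oldest first

    def acquire_first_resource(self, txn, superuser=False):
        """Return True if txn may proceed now, False if it was queued."""
        if superuser or len(self.holding) < self.limit:
            self.holding.add(txn)
            return True
        self.waiting.append(txn)
        return False

    def release(self, txn):
        """Called when txn ends; return the transactions that get unblocked."""
        self.holding.discard(txn)
        unblocked = []
        while self.waiting and len(self.holding) < self.limit:
            nxt = self.waiting.popleft()
            self.holding.add(nxt)
            unblocked.append(nxt)
        return unblocked
```

Because superusers bypass the check, the count can temporarily exceed the limit; queued transactions are then released only once the count has dropped back below it.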

I don't see where parsing even enters into this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] Separate connection handling from backends

2016-12-05 Thread Jim Nasby


On 12/5/16 2:14 PM, David Fetter wrote:

>> One solution to this would be to segregate connection handling from actual
>> backends, somewhere along the lines of separating the main loop from the
>> switch() that handles libpq commands. Benefits:
>
> [interesting stuff elided]
>
> What do you see as the relationship between this proposal and the
> earlier one for admission control?
>
> https://www.postgresql.org/message-id/4b38c1c502250002d...@gw.wicourts.gov


Without having read the paper referenced in that email or the rest of the 
thread...


I think my proposal would completely eliminate the need for what Kevin 
proposed as long as the "connection" layer released the backend that it 
was using as soon as possible (namely, as soon as the backend was no 
longer in a transaction). This does assume that the connection layer is 
keeping a copy of all user/session settable GUCs. I don't think we need 
that ability in the first pass, but it would be very high on the desired 
feature list (because it would allow "transaction-level" pooling).


Actually, we could potentially do one better... if a backend sat idle in 
transaction for long enough, we could "save" that transaction state and 
free up the backend to do something else. I'm thinking this would be 
similar to a prepared transaction, but presumably there'd be some 
differences to allow for picking the transaction back up.


One big difference from what Kevin described though: I don't think it 
makes sense for the connection layer to be able to parse queries. I 
suspect it would take a very large amount of work to allow something 
that's not a full-blown backend to parse, because it needs access to the 
catalogs. *Maybe* it'd be possible if we used a method other than 
ProcArray to register the snapshot that parsing requires, but you'd still 
have to duplicate all the relcache stuff.


BTW, it just occurred to me that having this separation would make it 
relatively easy to support re-directing DML queries from a replica to 
the master; if the backend throws the error indicating you tried to 
write data, the connection layer could re-route that.
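A sketch of that re-routing in Python, assuming toy backends. SQLSTATE 25006 (read_only_sql_transaction) is the PostgreSQL error code for a write attempted in a read-only transaction; `BackendError`, `ToyBackend` and `route` are hypothetical names, and the sketch ignores transaction state entirely.

```python
READ_ONLY_SQL_TRANSACTION = "25006"   # PostgreSQL SQLSTATE for this error


class BackendError(Exception):
    def __init__(self, sqlstate, message):
        super().__init__(message)
        self.sqlstate = sqlstate


class ToyBackend:
    def __init__(self, name, read_only):
        self.name = name
        self.read_only = read_only

    def execute(self, statement):
        # The backend, not the connection layer, decides what is a write.
        is_write = statement.split()[0].upper() in {"INSERT", "UPDATE", "DELETE"}
        if self.read_only and is_write:
            raise BackendError(READ_ONLY_SQL_TRANSACTION,
                               "cannot execute write in a read-only transaction")
        return f"{self.name}: {statement}"


def route(statement, replica, master):
    """Try the replica first; re-route to the master on a read-only error."""
    try:
        return replica.execute(statement)
    except BackendError as err:
        if err.sqlstate != READ_ONLY_SQL_TRANSACTION:
            raise
        return master.execute(statement)
```

Note that the connection layer never parses the statement here; it reacts only to the error code the replica returns.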

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Re: [HACKERS] Separate connection handling from backends

2016-12-05 Thread David Fetter

On Mon, Dec 05, 2016 at 01:48:03PM -0600, Jim Nasby wrote:
> max_connections is a frequent point of contention between users and
> developers. Users want to set it high so they don't have to deal with Yet
> More Software (pgpool or pgBouncer); PG developers freak out because
> backends are pretty heavyweight, there's some very hot code that's sensitive
> to the size of ProcArray, lock contention, etc.
> 
> One solution to this would be to segregate connection handling from actual
> backends, somewhere along the lines of separating the main loop from the
> switch() that handles libpq commands. Benefits:

[interesting stuff elided]

What do you see as the relationship between this proposal and the
earlier one for admission control?

https://www.postgresql.org/message-id/4b38c1c502250002d...@gw.wicourts.gov

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


