Re: [HACKERS] Process startup infrastructure is a mess

2017-09-15 Thread Robert Haas
On Thu, Sep 14, 2017 at 9:30 PM, Stephen Frost  wrote:
> I'm definitely in agreement with Andres on this one.  This isn't
> refactoring of little-to-never changed code, it's refactoring bits of
> the system which are changed with some regularity and looks likely to
> continue to need change as we add more features moving forward, and
> perhaps add greater controls over process startup.

I agree to some extent.

I think this is a mess and that cleaning it up would be better than
not cleaning it up.  I don't view it as a particularly urgent problem,
though, and I'm not sure I see a need to attack it in a monolithic
way.  I think that unifying the error recovery paths in various
processes would actually be more useful than unifying process startup,
but that's not to say unifying process startup wouldn't be nice.

Generally, I think the newer process startup methods are superior to
the older ones.  The new background worker mechanism provides a
flexible way of adding new processes whose exact number doesn't need
to be known in advance without having to tinker with so much code.  In
contrast, adding an auxiliary process is big nuisance.  So if I were
going to attack this problem, I'd try to make the background worker
machinery sufficiently capable that we can do everything that way, and
then gradually move more things over to that mechanism.

However, I honestly probably wouldn't bother with this unless it were
blocking something specific that I wanted to do.  I wouldn't go as far
as Simon did in his reply; I would not favor refusing a good patch
from a highly qualified contributor that makes this all less ugly and
fragile.  On the other hand, I think he's right that we wouldn't
necessarily get a lot of immediate benefit out of it apart from
feeling good about having done it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Stephen Frost
Andres, Simon,

* Andres Freund (and...@anarazel.de) wrote:
> On 2017-09-15 01:06:54 +0100, Simon Riggs wrote:
> > If we add something to an area then its a good time to refactor it
> > since we were going to get bugs anyway.
> 
> We've added something to the area on a regular basis. As in last in
> v10. The fundamental problem with your argument is that that makes it
> impossible for most contributors to get feature in that touch these
> parts of the code - many neither have the time nor the knowledge to do
> such a refactoring. By your argument we should have rejected logical
> replication for v10 because it complicated this further, without
> cleaning it up.
> 
> Besides that, it also makes it harder to develop new features, not just
> to get them in.

I'm definitely in agreement with Andres on this one.  This isn't
refactoring of little-to-never changed code, it's refactoring bits of
the system which are changed with some regularity and looks likely to
continue to need change as we add more features moving forward, and
perhaps add greater controls over process startup.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Andres Freund
> I think we should seriously consider doing a larger refactoring of this
> soon.  I've some ideas about what to do, but I'd welcome some thoughts on
> whether others consider this a serious problem or not, and what they think
> we should do about this, first.

Thank you for raising this.  I think so, too.  Some other things that quickly 
come to mind are:

* The signal handlers are similar in many processes and redundant.  It's not 
easy to see how the processes respond to a particular signal and what signal 
can be used for new things such as dumping the stack trace in the server log.  
Maybe we can have an array of process attributes that specify the signal 
handlers, startup/shutdown order, etc.  Then the common process management code 
handles processes based on the information in the array.

* postmaster.c is very large (more than 6,000 lines) and hard to read.

* It may be easy to forget adding initialization code in the Windows-specific 
path (SubPostmasterMaain).

* It's confusing to find out which process counts toward max_connections, 
MaxBackends, the count of auxiliary processes.  As a user, I'm sometimes 
confused about the relationship between max_connections, 
autovacuum_max_workers, max_wal_senders, max_worker_processes.

Regards
Takayuki Tsunakawa
 




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Andres Freund
On 2017-09-15 01:06:54 +0100, Simon Riggs wrote:
> On 14 September 2017 at 22:44, Andres Freund  wrote:
> 
> > The way we currently start and initialize individual postgres (sub-)
> > processes is pretty complicated and duplicative.  I've a couple
> > complaints:
> ...
> > I think we should seriously consider doing a larger refactoring of this
> > soon.  I've some ideas about what to do, but I'd welcome some thoughts
> > on whether others consider this a serious problem or not, and what they
> > think we should do about this, first.
> 
> Refactoring without a purpose is a negative for me. It takes time,
> introduces bugs and means the greater code churn over time introduces
> more bugs because fewer people have seen the code. That is arguable,
> but when we compare the priority of that against things people want
> and need there is little contest in my mind.

That's true for code that's not going to be touched anymore, largely bug
free, and can be understood with reasonable effort. Neither is the case
here.


> If we add something to an area then its a good time to refactor it
> since we were going to get bugs anyway.

We've added something to the area on a regular basis. As in last in
v10. The fundamental problem with your argument is that that makes it
impossible for most contributors to get feature in that touch these
parts of the code - many neither have the time nor the knowledge to do
such a refactoring. By your argument we should have rejected logical
replication for v10 because it complicated this further, without
cleaning it up.

Besides that, it also makes it harder to develop new features, not just
to get them in.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Simon Riggs
On 14 September 2017 at 22:44, Andres Freund  wrote:

> The way we currently start and initialize individual postgres (sub-)
> processes is pretty complicated and duplicative.  I've a couple
> complaints:
...
> I think we should seriously consider doing a larger refactoring of this
> soon.  I've some ideas about what to do, but I'd welcome some thoughts
> on whether others consider this a serious problem or not, and what they
> think we should do about this, first.

Refactoring without a purpose is a negative for me. It takes time,
introduces bugs and means the greater code churn over time introduces
more bugs because fewer people have seen the code. That is arguable,
but when we compare the priority of that against things people want
and need there is little contest in my mind.

If we add something to an area then its a good time to refactor it
since we were going to get bugs anyway.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Andres Freund
Hi,

The way we currently start and initialize individual postgres (sub-)
processes is pretty complicated and duplicative.  I've a couple
complaints:

1) It's completely non-obvious that bootstrap.c:AuxiliaryProcessMain()
   can get invoked both via postmaster for subprocesses (startup, wal
   writer, bgwriter, checkpointer, wal receiver, "checker"), as well as
   directly via main.c for the bootstrap processes.  Note the autovacuum
   launcher, archiver, logger are *not* started this way.

2) Most of the processes mentioned in 1) and some additional ones
   (autovac launcher and walsender / autovacuum workers to some degree)
   duplicate a lot of logic for startup, error handling and main loop.
   Especially the error handling code is order dependent, complex, and
   has been broken in various subprocesses in the past.

3) There exists yet *another* way of starting processes in the form of
   background workers.  Initially that was "just" for extensions, but is
   now employed for builtin tasks too, like the logical rep launcher /
   workers.  In the course of that special case handling had to be
   sprinkled around, because the bgworker logic isn't quite able to be
   good enough for something builtin.

4) The naming of initialization functions is, uh, not particularly
   clear. How many people can differentiate, without checking,
   BaseInit(), BackendInitialize(), InitPostgres(), InitProcess().

I think the complexity here just grew incrementally with the addition of
more and more subprocesses, without sufficient refactoring. I did some
in 31c453165b5a6, but that's not even remotely enough.

I think we should seriously consider doing a larger refactoring of this
soon.  I've some ideas about what to do, but I'd welcome some thoughts
on whether others consider this a serious problem or not, and what they
think we should do about this, first.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers