Re: [HACKERS] pg_dump dump catalog ACLs

Noah Misch Sat, 23 Apr 2016 15:52:41 -0700

On Fri, Apr 22, 2016 at 12:31:41PM -0400, Stephen Frost wrote:
> * Noah Misch (n...@leadboat.com) wrote:
> > On Wed, Apr 20, 2016 at 10:50:21PM -0400, Stephen Frost wrote:
> > > I'm certainly open to improving these issues now if we agree that they
> > > should be fixed for 9.6.  If we don't want to include such changes in 9.6
> > > then I will propose them for post-9.6.
> > 
> > Folks run clusters with ~1000 databases; we previously accepted at least one
> > complex performance improvement[1] based on that use case.  On the faster of
> > the two machines I tested, the present thread's commits slowed "pg_dumpall
> > --schema-only --binary-upgrade" by 1-2s per database.  That doubles pg_dump
> > runtime against the installcheck regression database.  A run against a cluster
> > of one hundred empty databases slowed fifteen-fold, from 8.6s to 131s.
> > "pg_upgrade -j50" probably will keep things tolerable for the 1000-database
> > case, but the performance regression remains jarring.  I think we should not
> > release 9.6 with pg_dump performance as it stands today.
> 
> After looking through the code a bit, I realized that there are a lot of
> object types which don't have ACLs at all but which exist in pg_catalog
> and were being analyzed because the bitmask for pg_catalog included ACLs
> and therefore was non-zero.
> 
> Clearing that bit for object types which don't have ACLs improved the
> performance for empty databases quite a bit (from about 3s to a bit
> under 1s on my laptop).  That's a 42-line patch, with comment lines
> being half of that, which I'll push once I've looked into the other
> concerns which were brought up on this thread.


That's good news.
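
For readers following along, here is a minimal sketch of the bitmask idea
described in the quoted text.  It is Python rather than pg_dump's C, and every
name in it (DUMP_ACL, KINDS_WITH_ACL, components_for, needs_analysis) is
hypothetical, chosen for illustration only; it is not the actual patch.

```python
# Hypothetical component bits; these names are assumptions for illustration,
# not pg_dump's real identifiers.
DUMP_NONE = 0
DUMP_DEFINITION = 1 << 0
DUMP_ACL = 1 << 1

# Assumed, illustrative set of object kinds that can carry an ACL.
KINDS_WITH_ACL = {"table", "function", "type", "schema"}


def components_for(kind, schema_mask):
    """Return the component mask for one object, inheriting from its schema.

    Object kinds that cannot have an ACL get the ACL bit cleared, so a
    schema whose mask is ACL-only leaves them with nothing to dump.
    """
    mask = schema_mask
    if kind not in KINDS_WITH_ACL:
        mask &= ~DUMP_ACL
    return mask


def needs_analysis(mask):
    """An object is worth examining only if some component bit is set."""
    return mask != DUMP_NONE


# Assume the system schema is dumped for ACLs only.
catalog_mask = DUMP_ACL
operator_mask = components_for("operator", catalog_mask)
function_mask = components_for("function", catalog_mask)
```

With the ACL bit cleared, the mask for an object kind such as an operator
drops to zero and the object can be skipped, while a function in the same
schema keeps its ACL bit and is still examined.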

> Much of the remaining inefficiency is how we query for column
> information one table at a time (that appears to be around 300ms of the
> 900ms or so total time).  I'm certainly interested in improving that but
> that would require adding more complex data structures to pg_dump than
> what we use currently (we'd want to grab all of the columns we care
> about in an entire schema and store it locally and then provide a way to
> look up that information, etc), so I'm not sure it'd be appropriate to
> do now.

I'm not sure, either; I'd need to see more to decide.  If I were you, I would
draft a patch to the point where the community can see the complexity and the
performance change.  That should be enough to inform the choice among moving
forward with the proposed complexity, soliciting other designs, reverting the
original changes, or accepting for 9.6 the slowdown as it stands at that time.
Other people may have more definite opinions already.
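
To make the trade-off concrete, the following is a rough sketch of the two
query patterns under discussion: one query per table versus one query for
everything, followed by local lookups.  It uses Python's sqlite3 module with a
made-up catalog_columns table as a stand-in for the system catalogs; the table,
its contents, and the function names are all assumptions for illustration and
do not reflect pg_dump's actual queries or data structures.

```python
import sqlite3
from collections import defaultdict


def make_catalog():
    """Build a tiny in-memory stand-in for a column catalog (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog_columns "
        "(table_name TEXT, column_name TEXT, position INTEGER)"
    )
    rows = [
        ("orders", "id", 1),
        ("orders", "total", 2),
        ("users", "id", 1),
        ("users", "name", 2),
        ("users", "email", 3),
    ]
    conn.executemany("INSERT INTO catalog_columns VALUES (?, ?, ?)", rows)
    return conn


def columns_per_table(conn, tables):
    """One round trip per table: the pattern described as the slow one."""
    result = {}
    for table in tables:
        cur = conn.execute(
            "SELECT column_name FROM catalog_columns "
            "WHERE table_name = ? ORDER BY position",
            (table,),
        )
        result[table] = [row[0] for row in cur]
    return result


def columns_batched(conn):
    """One query for all columns, stored locally in a dict keyed by table."""
    result = defaultdict(list)
    cur = conn.execute(
        "SELECT table_name, column_name FROM catalog_columns "
        "ORDER BY table_name, position"
    )
    for table_name, column_name in cur:
        result[table_name].append(column_name)
    return dict(result)
```

Both functions return the same mapping; the batched version trades a number of
queries that grows with the table count for a single query plus an in-memory
lookup structure, which is the added complexity the quoted text refers to.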


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
