Re: [HACKERS] pg_dump and premature optimizations for objects not to be dumped
I wrote:
> Attached is a draft patch that does things this way.

BTW, the bulk of the changes in this patch are just there to pass down
the DumpOptions struct to where it's needed, following some pre-existing
work that passed that around as a separate pointer.  I wonder why things
were done that way, rather than adding a DumpOptions pointer in the
Archive struct?  The vast majority of the functions I had to touch
already had an Archive* parameter, and would not have needed to be
touched if it were done the other way.

I'm tempted to push a preliminary patch that adds such a field and gets
rid of DumpOptions as a separate parameter to anything that also takes
an Archive* pointer.

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dump and premature optimizations for objects not to be dumped
I wrote:
> I looked into Karsten Hilbert's report here
> http://www.postgresql.org/message-id/20160108114529.gb22...@hermes.hilbert.loc
> of being unable to run pg_upgrade on a database containing the pg_trgm
> extension.  After some investigation, the problem is explained like this:
> ...
> The thing that seems possibly most robust to me is to pull in the
> extension membership data *first* and incorporate it into the initial
> selectDumpableObject tests, thus getting us back to the pre-9.1 state
> of affairs where the initial settings of the object dump flags could
> be trusted.  This might be expensive though; and it would certainly add
> a good chunk of work to the race-condition window where we've taken
> pg_dump's transaction snapshot but not yet acquired lock on any of
> the tables.

Attached is a draft patch that does things this way.  Some simple testing
suggests it's just about exactly the same speed as 9.5 pg_dump, which means
that the "expensive" objection is dismissible.  It's hard to tell though
whether the pre-table-lock race-condition window has gotten meaningfully
wider.

In addition to fixing Karsten's problem, this deals with the bugs I noted
earlier about event triggers and transforms being dumped unconditionally.
I was also able to get rid of some unsightly special cases for procedural
languages, FDWs, and foreign servers, in favor of setting their dump flags
according to standard rules up front.

If we were to put a test rejecting initdb-created objects (those with
OID < FirstNormalObjectId) into selectDumpableObject, it'd be possible to
get rid of selectDumpableDefaultACL, selectDumpableCast, and/or
selectDumpableProcLang, since those would then have various subsets of the
selectDumpableObject rules.  I'm not sure if this would be an improvement
or just confusing; any thoughts?

I'm not very sure what to do with this patch.
I think it should definitely go into 9.5: it applies cleanly there and it
will fix our two new-in-9.5 bugs with event triggers and transforms.

I'm less enthused about back-porting it further.  In principle, the
extension membership issues exist all the way back to 9.1, but we haven't
had complaints before, and there's a nonzero chance of changing corner-case
behaviors.  (I think any such changes would likely be for the better, but
nonetheless they would be changes.)  Back-porting it further than about 9.4
would also be quite a lot of work :-(

Comments?

			regards, tom lane

diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index 7869ee9..ded14f3 100644
*** a/src/bin/pg_dump/common.c
--- b/src/bin/pg_dump/common.c
*************** static int	numCatalogIds = 0;
*** 40,69 ****
  /*
   * These variables are static to avoid the notational cruft of having to pass
!  * them into findTableByOid() and friends.  For each of these arrays, we
!  * build a sorted-by-OID index array immediately after it's built, and then
!  * we use binary search in findTableByOid() and friends.  (qsort'ing the base
!  * arrays themselves would be simpler, but it doesn't work because pg_dump.c
!  * may have already established pointers between items.)
   */
- static TableInfo *tblinfo;
- static TypeInfo *typinfo;
- static FuncInfo *funinfo;
- static OprInfo *oprinfo;
- static NamespaceInfo *nspinfo;
- static int	numTables;
- static int	numTypes;
- static int	numFuncs;
- static int	numOperators;
- static int	numCollations;
- static int	numNamespaces;
  static DumpableObject **tblinfoindex;
  static DumpableObject **typinfoindex;
  static DumpableObject **funinfoindex;
  static DumpableObject **oprinfoindex;
  static DumpableObject **collinfoindex;
  static DumpableObject **nspinfoindex;

  static void flagInhTables(TableInfo *tbinfo, int numTables,
  				InhInfo *inhinfo, int numInherits);
--- 40,69 ----
  /*
   * These variables are static to avoid the notational cruft of having to pass
!  * them into findTableByOid() and friends.  For each of these arrays, we build
!  * a sorted-by-OID index array immediately after the objects are fetched,
!  * and then we use binary search in findTableByOid() and friends.  (qsort'ing
!  * the object arrays themselves would be simpler, but it doesn't work because
!  * pg_dump.c may have already established pointers between items.)
   */
  static DumpableObject **tblinfoindex;
  static DumpableObject **typinfoindex;
  static DumpableObject **funinfoindex;
  static DumpableObject **oprinfoindex;
  static DumpableObject **collinfoindex;
  static DumpableObject **nspinfoindex;
+ static DumpableObject **extinfoindex;
+ static int	numTables;
+ static int	numTypes;
+ static int	numFuncs;
+ static int	numOperators;
+ static int	numCollations;
+ static int	numNamespaces;
+ static int	numExtensions;
+ /* This is an array of object identities, not actual DumpableObjects */
+ static ExtensionMemberId *extmembers;
+ static int	numextmembers;

  static void flagInhTables(TableInfo *tbinfo, int numTables,
[HACKERS] pg_dump and premature optimizations for objects not to be dumped
I looked into Karsten Hilbert's report here
http://www.postgresql.org/message-id/20160108114529.gb22...@hermes.hilbert.loc
of being unable to run pg_upgrade on a database containing the pg_trgm
extension.  After some investigation, the problem is explained like this:

1. Karsten had installed pg_trgm into pg_catalog rather than some user
schema.

2. When getTypes() processes the gtrgm type created by pg_trgm, it decides
the type won't be dumped (cf selectDumpableType, which quite properly
rejects types in pg_catalog).  This causes it not to generate a
ShellTypeInfo subsidiary object.

3. Later, getExtensionMembership decides that we *should* dump gtrgm and
associated I/O routines, since we're in binary-upgrade mode.

4. Later still, repairTypeFuncLoop breaks the dependency loops between
gtrgm and its I/O routines, but it doesn't mark the shell type as dumpable
because there isn't one.

5. So we end up with no shell type being dumped, which does not work in
binary-upgrade mode because we can't create the shell type when we see the
first I/O function; we don't know what OID to give it.

The core of the problem, therefore, is the choice in getTypes to not
create a ShellTypeInfo if it thinks the type is not to be dumped.  That
was probably fine when it was written, but now that we can change our
minds about object dumpability later, it's clearly broken.  It's really
just unnecessary optimization anyway, since if we create the object and
don't use it, we've not wasted much but cycles.

I've verified that a patch like this:

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 56c0528..6acbf4a 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** getTypes(Archive *fout, int *numTypes)
*** 3664,3671 ****
  		 * should copy the base type's catId, but then it might capture the
  		 * pg_depend entries for the type, which we don't want.
  		 */
! 		if (tyinfo[i].dobj.dump && (tyinfo[i].typtype == TYPTYPE_BASE ||
! 			tyinfo[i].typtype == TYPTYPE_RANGE))
  		{
  			stinfo = (ShellTypeInfo *) pg_malloc(sizeof(ShellTypeInfo));
  			stinfo->dobj.objType = DO_SHELL_TYPE;
--- 3664,3671 ----
  		 * should copy the base type's catId, but then it might capture the
  		 * pg_depend entries for the type, which we don't want.
  		 */
! 		if (tyinfo[i].typtype == TYPTYPE_BASE ||
! 			tyinfo[i].typtype == TYPTYPE_RANGE)
  		{
  			stinfo = (ShellTypeInfo *) pg_malloc(sizeof(ShellTypeInfo));
  			stinfo->dobj.objType = DO_SHELL_TYPE;

fixes Karsten's symptom without obvious other problems.

However ... there are a whole bunch of *other* places where pg_dump skips
work on objects that are not to be dumped, and some of them would add
quite a bit more cost than this one.  An example is just above this code,
where we skip getDomainConstraints() for domains we think aren't to be
dumped.  A fix for that wouldn't be a one-liner either, because if we do
pull in the domain's constraints unconditionally, we'd have to do
something to cause the domain's own dumpability flag (and updates thereof)
to get propagated to them.

Even if we fix all these issues today, it's not hard to imagine new bugs
of similar ilk sneaking in.  Exacerbating this is that some of those
checks are fine because they occur after getExtensionMembership(), at
which point the dump-or-not decisions won't change.  But in most places
in pg_dump, it's not instantly obvious whether you're looking at a flag
that is still subject to change or not.

The thing that seems possibly most robust to me is to pull in the
extension membership data *first* and incorporate it into the initial
selectDumpableObject tests, thus getting us back to the pre-9.1 state
of affairs where the initial settings of the object dump flags could
be trusted.  This might be expensive though; and it would certainly add
a good chunk of work to the race-condition window where we've taken
pg_dump's transaction snapshot but not yet acquired lock on any of
the tables.

Thoughts?  Anybody have another idea about what to do about this?
			regards, tom lane