Re: evtfoid and evtowner missing in findoidjoins/README
On Sun, Jan 17, 2021, at 21:32, Tom Lane wrote: >Well, SGML is actually plenty easy to parse as long as you've got xml >tooling at hand. We'd never want to introduce such a dependency in the >normal build process, but making something like findoidjoins depend on >such tools seems within reason. I recall having whipped up some one-off >Perl scripts of that sort when I was doing that massive rewrite of the >func.sgml tables last year. I didn't save them though, and in any case >I'm the world's worst Perl programmer, so it'd be better for someone >with more Perl-fu to take point if we decide to go that route. I went a head and implemented the parser, it was indeed easy. Patch attached. Add catalog_oidjoins.pl -- parses catalog references out of catalogs.sgml Since doc/src/sgml/catalogs.sgml is the master where catalog references are to be documented, if someone needs machine-readable access to such information, it should be extracted from this document. The added script catalog_oidjoins.pl parses the SGML and extract the fields necessary to produce the same output as generated by findoidjoins, which has historically been copy/pasted to the README. This is to allow for easy comparison, to verify the correctness of catalog_oidjoins.pl, and if necessary, to update the README based on the information in catalogs.sgml. Helper-files: diff_oidjoins.sh Runs ./catalog_oidjoins.pl and compares its output with STDIN. Shows a diff of the result, witout any context. test_oidjoins.sh Runs ./diff_oidjoins.sh for both the README and the output from ./findoidjoins regression. bogus_oidjoins.txt List of bogus diff entires to ignore, based on documentation in README. After having run make installcheck in src/test/regress, the test_oidjoins.sh can be run: $ ./test_oidjoins.sh README: + Join pg_catalog.pg_attrdef.adnum => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_class.relrewrite => pg_catalog.pg_class.oid + Join pg_catalog.pg_constraint.confkey []=> pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_constraint.conkey []=> pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_db_role_setting.setrole => pg_catalog.pg_authid.oid + Join pg_catalog.pg_default_acl.defaclnamespace => pg_catalog.pg_namespace.oid + Join pg_catalog.pg_default_acl.defaclrole => pg_catalog.pg_authid.oid + Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid + Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid + Join pg_catalog.pg_extension.extconfig []=> pg_catalog.pg_class.oid + Join pg_catalog.pg_foreign_data_wrapper.fdwhandler => pg_catalog.pg_proc.oid + Join pg_catalog.pg_foreign_data_wrapper.fdwvalidator => pg_catalog.pg_proc.oid + Join pg_catalog.pg_foreign_table.ftrelid => pg_catalog.pg_class.oid + Join pg_catalog.pg_foreign_table.ftserver => pg_catalog.pg_foreign_server.oid + Join pg_catalog.pg_index.indkey => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_partitioned_table.partattrs => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_policy.polroles []=> pg_catalog.pg_authid.oid + Join pg_catalog.pg_publication.pubowner => pg_catalog.pg_authid.oid + Join pg_catalog.pg_publication_rel.prpubid => pg_catalog.pg_publication.oid + Join pg_catalog.pg_publication_rel.prrelid => pg_catalog.pg_class.oid + Join pg_catalog.pg_range.rngmultitypid => pg_catalog.pg_type.oid + Join pg_catalog.pg_seclabel.classoid => pg_catalog.pg_class.oid + Join pg_catalog.pg_shseclabel.classoid => pg_catalog.pg_class.oid + Join pg_catalog.pg_statistic.staattnum => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_statistic_ext.stxkeys => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_subscription.subdbid => pg_catalog.pg_database.oid + Join pg_catalog.pg_subscription.subowner => pg_catalog.pg_authid.oid + Join pg_catalog.pg_subscription_rel.srrelid => pg_catalog.pg_class.oid + Join pg_catalog.pg_subscription_rel.srsubid => pg_catalog.pg_subscription.oid + Join pg_catalog.pg_trigger.tgattr => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_type.typsubscript => pg_catalog.pg_proc.oid + Join pg_catalog.pg_user_mapping.umserver => pg_catalog.pg_foreign_server.oid + Join pg_catalog.pg_user_mapping.umuser => pg_catalog.pg_authid.oid findoidjoins: + Join pg_catalog.pg_attrdef.adnum => pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_class.relrewrite => pg_catalog.pg_class.oid + Join pg_catalog.pg_constraint.confkey []=> pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_constraint.conkey []=> pg_catalog.pg_attribute.attnum + Join pg_catalog.pg_db_role_setting.setrole => pg_catalog.pg_authid.oid + Join pg_catalog.pg_default_acl.defaclnamespace => pg_catalog.pg_namespace.oid + Join pg_catalog.pg_default_acl.defaclrole => pg_catalog.pg_authid.oid + Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid + Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid + Join pg_catalog.pg_extension.extconfig
Re: evtfoid and evtowner missing in findoidjoins/README
"Joel Jacobson" writes: > A bit of background: > I'm working on an extension where I need SQL access to this reference > information. My extension is successfully automatically helping me to > find problems in extension update-scripts, where an update from a > version to a version would give a different result than directly > installing the to-version from scratch. > Currently, I'm parsing findoidjoins/README and importing the "from > column" and "to column" to a lookup-table, which is used by my > extension. Hmm. That README was certainly never intended to be used that way ;-) > So I think adding this as a lookup table in pg_catalog is the best solution, > since extension writers could then use this information in various ways. I'm definitely -1 on adding a catalog for that. But it seems like the idea of not-really-enforced FK entries in pg_constraint would serve your purposes just as well (and it'd be better from the standpoint of getting the planner to be aware of these relationships). > The information is theoretically already available via catalogs.sgml, > but a) it's not easy to parse, and b) it's not available from SQL. Well, SGML is actually plenty easy to parse as long as you've got xml tooling at hand. We'd never want to introduce such a dependency in the normal build process, but making something like findoidjoins depend on such tools seems within reason. I recall having whipped up some one-off Perl scripts of that sort when I was doing that massive rewrite of the func.sgml tables last year. I didn't save them though, and in any case I'm the world's worst Perl programmer, so it'd be better for someone with more Perl-fu to take point if we decide to go that route. regards, tom lane
Re: evtfoid and evtowner missing in findoidjoins/README
On Sun, Jan 17, 2021, at 18:16, Tom Lane wrote: > I kind of wonder whether findoidjoins has outlived its purpose and > we should just remove it (along with the oidjoins test script). A bit of background: I'm working on an extension where I need SQL access to this reference information. My extension is successfully automatically helping me to find problems in extension update-scripts, where an update from a version to a version would give a different result than directly installing the to-version from scratch. Currently, I'm parsing findoidjoins/README and importing the "from column" and "to column" to a lookup-table, which is used by my extension. If findoidjoins is removed, I would be happy as long as this reference information continues to be provided in some other simple machine-readable way, like a CSV-file somewhere in the repo, or even better: making this information available from SQL via a new lookup-table in pg_catalog. I can see how parsing catalogs.sgml would be doable, but proper SGML parsing is quite complex since it's a recursive language, and can't be reliably parsed with e.g. simple regexes. So I think adding this as a lookup table in pg_catalog is the best solution, since extension writers could then use this information in various ways. The information is theoretically already available via catalogs.sgml, but a) it's not easy to parse, and b) it's not available from SQL. Are there any other hackers who ever wished they would have had SQL access to these catalog references? If desired by enough others, perhaps something along these lines could work? CREATE TABLE pg_catalog.pg_references ( colfrom text, colto text, UNIQUE (colfrom) ); Where "colfrom" would be e.g. "pg_catalog.pg_class.relfilenode" and "colto" would be "pg_catalog.pg_class.oid" for that example. Not sure about the column names "colfrom"/"colto" though, since the abbreviation for columns seems to be "att" in the pg_catalog context. /Joel
Re: evtfoid and evtowner missing in findoidjoins/README
"Joel Jacobson" writes: > The below references are already properly documented in >https://www.postgresql.org/docs/current/catalog-pg-event-trigger.html > but missing in src/tools/findoidjoins/README. > Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid > Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid Yup, no surprise given the way findoidjoins works: it could only discover those relationships if pg_event_trigger had some entries in the end state of the regression database. Of course it doesn't, and I'd be against leaving a live event trigger in place in that DB. (I suspect there are other similar gaps in the coverage.) I kind of wonder whether findoidjoins has outlived its purpose and we should just remove it (along with the oidjoins test script). IMO it was intended to find mistakes in the initial catalog data, but given the current policy that the .dat files shall not contain numeric OID references, that type of mistake is impossible anymore. Certainly, it's been so long since that test script has caught anything that it doesn't seem worth the annual-or-so maintenance effort to update it. A different line of thought would be to try to teach findoidjoins to scrape info about catalog references out of catalogs.sgml, and use that instead of or in addition to its current methods. That seems like a fair amount of work though, and again I can't get excited that it'd be worth the trouble. Also, I recall mutterings on -hackers about adding foreign-key entries to pg_constraint to document the catalog reference relationships. (In my possibly-faulty recollection, the idea was that these'd only be documentation and would lack enforcement triggers; but perhaps we'd allow the planner to trust them for purposes of optimizing multi-catalog queries.) If we had those, we could make findoidjoins use them instead of trawling the data, or maybe throw away findoidjoins per se and let the oidjoins.sql script read the FK entries to find out what to check dynamically. regards, tom lane
evtfoid and evtowner missing in findoidjoins/README
The below references are already properly documented in https://www.postgresql.org/docs/current/catalog-pg-event-trigger.html but missing in src/tools/findoidjoins/README. Join pg_catalog.pg_event_trigger.evtowner => pg_catalog.pg_authid.oid Join pg_catalog.pg_event_trigger.evtfoid => pg_catalog.pg_proc.oid I'm not sure what the process of updating the README is, the git log seems to indicate this is usually part of the release cycle, so perhaps it's OK this file is out-of-sync between releases? But if so, that won't explain these two, since they have been around for ages. /Joel