Re: Statistics Import and Export

2024-05-16 Thread Jeff Davis
On Thu, 2024-05-16 at 05:25 -0400, Corey Huinker wrote: > > Per previous comments, it was suggested by others that: > > - having them in SECTION_NONE was a grave mistake > - Everything that could belong in SECTION_DATA should, and the rest > should be in SECTION_POST_DATA I don't understand the

Re: Statistics Import and Export

2024-05-16 Thread Corey Huinker
> > Can you explain what you did with the > SECTION_NONE/SECTION_DATA/SECTION_POST_DATA over v19-v21 and why? > Initially, I got things to work by having statistics import behave like COMMENTs, which meant that they were run immediately after the table/matview/index/constraint that created the

Re: Statistics Import and Export

2024-05-15 Thread Jeff Davis
On Mon, 2024-05-06 at 23:43 -0400, Corey Huinker wrote: > > v21 attached. > > 0003 is the statistics changes to pg_dump, adding the options -X / -- > statistics-only, and the derivative boolean statisticsOnly. The -P > option is already used by pg_restore, so instead I chose -X because > of the

Re: Statistics Import and Export

2024-04-24 Thread Matthias van de Meent
On Wed, 24 Apr 2024 at 21:31, Bruce Momjian wrote: > > On Tue, Apr 23, 2024 at 06:33:48PM +0200, Matthias van de Meent wrote: > > I've heard of use cases where dumping stats without data would help > > with production database planner debugging on a non-prod system. > > > > Sure, some planner

Re: Statistics Import and Export

2024-04-24 Thread Bruce Momjian
On Tue, Apr 23, 2024 at 06:33:48PM +0200, Matthias van de Meent wrote: > I've heard of use cases where dumping stats without data would help > with production database planner debugging on a non-prod system. > > Sure, some planner inputs would have to be taken into account too, but > having an

Re: Statistics Import and Export

2024-04-23 Thread Matthias van de Meent
On Tue, 23 Apr 2024, 05:52 Tom Lane, wrote: > Jeff Davis writes: > > On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote: > >> Loading data without stats, and hoping > >> that auto-analyze will catch up sooner not later, is exactly the > >> current behavior that we're doing all this work to get

Re: Statistics Import and Export

2024-04-22 Thread Tom Lane
Jeff Davis writes: > On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote: >> Loading data without stats, and hoping >> that auto-analyze will catch up sooner not later, is exactly the >> current behavior that we're doing all this work to get out of. > That's the disconnect, I think. For me, the

Re: Statistics Import and Export

2024-04-22 Thread Jeff Davis
On Mon, 2024-04-22 at 16:19 -0400, Tom Lane wrote: > Loading data without stats, and hoping > that auto-analyze will catch up sooner not later, is exactly the > current behavior that we're doing all this work to get out of. That's the disconnect, I think. For me, the main reason I'm excited about

Re: Statistics Import and Export

2024-04-22 Thread Tom Lane
Jeff Davis writes: > Would it make sense to have a new SECTION_STATS? Perhaps, but the implications for pg_dump's API would be nontrivial, eg would we break any applications that know about the current options for --section. And you still have to face up to the question "does --data-only

Re: Statistics Import and Export

2024-04-22 Thread Jeff Davis
On Wed, 2024-04-17 at 11:50 -0500, Nathan Bossart wrote: > It looks like the problem is that the ACLs are getting dumped in the > data > section when we are also dumping stats.  I'm able to get the tests to > pass > by moving the call to dumpRelationStats() that's in dumpTableSchema() > to >

Re: Statistics Import and Export

2024-04-17 Thread Nathan Bossart
On Thu, Apr 11, 2024 at 03:54:07PM -0400, Corey Huinker wrote: > At the request of a few people, attached is an attempt to move stats to > DATA/POST-DATA, and the TAP test failure that results from that. > > The relevant errors are confusing, in that they all concern GRANT/REVOKE, > and the fact

Re: Statistics Import and Export

2024-04-06 Thread Corey Huinker
> > > > I think it'll be a serious, serious error for this not to be > SECTION_DATA. Maybe POST_DATA is OK, but even that seems like > an implementation compromise not "the way it ought to be". > We'd have to split them on account of when the underlying object is created. Index statistics would

Re: Statistics Import and Export

2024-04-05 Thread Tom Lane
Jeff Davis writes: > Thank you, this has improved a lot and the fundamentals are very close. > I think it could benefit from a bit more time to settle on a few > issues: Yeah ... it feels like we aren't quite going to manage to get this over the line for v17. We could commit with the hope that

Re: Statistics Import and Export

2024-04-05 Thread Jeff Davis
On Thu, 2024-04-04 at 00:30 -0400, Corey Huinker wrote: > > v17 > > 0001 > - array_in now repackages cast errors as warnings and skips the stat, > test added > - version parameter added, though it's mostly for future > compatibility, tests modified > - both functions delay object/attribute

Re: Statistics Import and Export

2024-04-05 Thread Ashutosh Bapat
On Fri, Apr 5, 2024 at 10:07 AM Tom Lane wrote: > Ashutosh Bapat writes: > > I read that discussion, and it may be ok for pg_upgrade/pg_dump usecase > and > > maybe also for IMPORT foreign schema where the SQL is generated by > > PostgreSQL itself. But not for simulating statistics. In that

Re: Statistics Import and Export

2024-04-04 Thread Tom Lane
Ashutosh Bapat writes: > I read that discussion, and it may be ok for pg_upgrade/pg_dump usecase and > maybe also for IMPORT foreign schema where the SQL is generated by > PostgreSQL itself. But not for simulating statistics. In that case, if the > function happily installs statistics cooked by

Re: Statistics Import and Export

2024-04-04 Thread Ashutosh Bapat
On Fri, Apr 5, 2024 at 7:00 AM Corey Huinker wrote: > For a partitioned table this value has to be true. For a normal table when >> setting this value to true, it should at least make sure that the table has >> at least one child. Otherwise it should throw an error. Blindly accepting >> the

Re: Statistics Import and Export

2024-04-04 Thread Corey Huinker
> > For a partitioned table this value has to be true. For a normal table when > setting this value to true, it should at least make sure that the table has > at least one child. Otherwise it should throw an error. Blindly accepting > the given value may render the statistics unusable. Prologue of

Re: Statistics Import and Export

2024-04-03 Thread Michael Paquier
On Mon, Apr 01, 2024 at 01:21:53PM -0400, Tom Lane wrote: > I'm not sure. I think if we put our heads down we could finish > the changes I'm suggesting and resolve the other issues this week. > However, it is starting to feel like the sort of large, barely-ready > patch that we often regret

Re: Statistics Import and Export

2024-04-03 Thread Tom Lane
Corey Huinker writes: >> As far as that goes, it shouldn't be that hard to deal with, at least >> not for "soft" errors which hopefully cover most input-function >> failures these days. You should be invoking array_in via >> InputFunctionCallSafe and passing a suitably-set-up ErrorSaveContext.

Re: Statistics Import and Export

2024-04-03 Thread Corey Huinker
> > > As far as that goes, it shouldn't be that hard to deal with, at least > not for "soft" errors which hopefully cover most input-function > failures these days. You should be invoking array_in via > InputFunctionCallSafe and passing a suitably-set-up ErrorSaveContext. > (Look at

Re: Statistics Import and Export

2024-04-03 Thread Tom Lane
Corey Huinker writes: > - functions strive to not ERROR unless absolutely necessary. The biggest > exposure is the call to array_in(). As far as that goes, it shouldn't be that hard to deal with, at least not for "soft" errors which hopefully cover most input-function failures these days. You

Re: Statistics Import and Export

2024-04-02 Thread Corey Huinker
> > Yeah, but that problem exists no matter what. I haven't read enough > of the patch to find where it's determining that, but I assume there's > code in there to intuit the statistics storage type depending on the > table column's data type and the statistics kind. > Correct. It borrows a lot

Re: Statistics Import and Export

2024-04-02 Thread Tom Lane
Jeff Davis writes: > We need to get the original element type on the import side somehow, > right? Otherwise it will be hard to tell whether '{1, 2, 3, 4}' has > element type "int4" or "text", which affects the binary representation > of the anyarray value in pg_statistic. Yeah, but that problem

Re: Statistics Import and Export

2024-04-02 Thread Jeff Davis
On Tue, 2024-04-02 at 17:31 -0400, Tom Lane wrote: > And that means that we don't need the sending > side to know the element type anyway. We need to get the original element type on the import side somehow, right? Otherwise it will be hard to tell whether '{1, 2, 3, 4}' has element type "int4"

Re: Statistics Import and Export

2024-04-02 Thread Corey Huinker
> > side to know the element type anyway. So, I apologize for sending > us down a useless side path. We may as well stick to the function > signature as shown in the v15 patch --- although maybe variadic > any is still worthwhile so that an unrecognized field name doesn't > need to be a hard

Re: Statistics Import and Export

2024-04-02 Thread Tom Lane
Jeff Davis writes: > On the export side, the problem is that the element type (and > dimensionality and maybe hasnull) is an important part of the anyarray > value, but it's not part of the output of anyarray_out(). For new > versions, we can add a scalar function that simply outputs the >

Re: Statistics Import and Export

2024-04-02 Thread Jeff Davis
On Tue, 2024-04-02 at 12:59 -0400, Corey Huinker wrote: > However, some of the ANYARRAYs have element types that are > themselves arrays, and near as I can tell, such a construct is not > expressible in SQL. So, rather than getting an anyarray of an array > type, you instead get an array of one

Re: Statistics Import and Export

2024-04-02 Thread Corey Huinker
I have refactored pg_set_relation_stats to be variadic, and I'm working on pg_set_attribute_sttats, but I'm encountering an issue with the anyarray values. Jeff suggested looking at anyarray_send as a way of extracting the type, and with some extra twiddling we can get and cast the type. However,

Re: Statistics Import and Export

2024-04-02 Thread Jeff Davis
On Tue, 2024-04-02 at 05:38 -0400, Corey Huinker wrote: > Here's a one-liner patch for disabling update of pg_class > relpages/reltuples/relallviible during a binary upgrade. This change makes sense to me regardless of the rest of the work. Updating the relpages/reltuples/relallvisible during

Re: Statistics Import and Export

2024-04-02 Thread Corey Huinker
Here's a one-liner patch for disabling update of pg_class relpages/reltuples/relallviible during a binary upgrade. This was causting pg_upgrade tests to fail in the existing stats import work. From 322db8c9e8796ce673f7d7b63bd64e4c9403a144 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Mon, 1

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Corey Huinker writes: > A boolean is what we had before, I'm quite comfortable with that, and it > addresses my silent-failure concerns. WFM. regards, tom lane

Re: Statistics Import and Export

2024-04-01 Thread Corey Huinker
> > If we are envisioning that the function might emit multiple warnings > per call, a useful definition could be to return the number of > warnings (so zero is good, not-zero is bad). But I'm not sure that's > really better than a boolean result. pg_dump/pg_restore won't notice > anyway, but

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Corey Huinker writes: > Any thoughts about going back to having a return value, a caller could then > see that the function returned NULL rather than whatever the expected value > was (example: TRUE)? If we are envisioning that the function might emit multiple warnings per call, a useful

Re: Statistics Import and Export

2024-04-01 Thread Corey Huinker
> > > I still think that we could just declare the function strict, if we > use the variadic-any approach. Passing a null in any position is > indisputable caller error. However, if you're allergic to silently > doing nothing in such a case, we could have pg_set_attribute_stats > check each

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Corey Huinker writes: > So what's the behavior when the user fails to supply a parameter that is > currently NOT NULL checked (example: avg_witdth)? Is that a WARN-and-exit? I still think that we could just declare the function strict, if we use the variadic-any approach. Passing a null in any

Re: Statistics Import and Export

2024-04-01 Thread Corey Huinker
> > Ah, yeah, you could change the function to have more parameters, > given the assumption that all calls will be named-parameter style. > I still suggest that my proposal is more robust for the case where > the dump lists parameters that the receiving system doesn't have. > So what's the

Re: Statistics Import and Export

2024-04-01 Thread Corey Huinker
> > I think pg_upgrade could check for the existence of extended statistics > in any database and adjust the analyze recommdnation wording > accordingly. > +1

Re: Statistics Import and Export

2024-04-01 Thread Corey Huinker
> > That's not what I suggested at all. The function parameters would > be declared anyarray, but the values passed to them would be coerced > to the correct concrete array types. So as far as the coercion rules > are concerned this'd be equivalent to the variadic-any approach. > +1 > > >

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Jeff Davis writes: > On Sun, 2024-03-31 at 14:48 -0400, Tom Lane wrote: >> What happens when >> somebody adds a new stakind (and hence new pg_stats column)? > Why would you need to overload in this case? Wouldn't we just define a > new function with more optional named parameters? Ah, yeah, you

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Jeff Davis writes: > On Mon, 2024-04-01 at 13:11 -0400, Bruce Momjian wrote: >> Reality check --- are we still targeting this feature for PG 17? > I see a few useful pieces here: > 1. Support import of statistics (i.e. > pg_set_{relation|attribute}_stats()). > 2. Support pg_dump of stats > 3.

Re: Statistics Import and Export

2024-04-01 Thread Jeff Davis
On Sun, 2024-03-31 at 14:48 -0400, Tom Lane wrote: > What happens when > somebody adds a new stakind (and hence new pg_stats column)? > You could try to add an overloaded pg_set_attribute_stats > version with more parameters, but I'm pretty sure that would > lead to "ambiguous function call"

Re: Statistics Import and Export

2024-04-01 Thread Bruce Momjian
On Sun, Mar 31, 2024 at 07:04:47PM -0400, Tom Lane wrote: > Corey Huinker writes: > >> I can't quibble with that view of what has priority. I'm just > >> suggesting that redesigning what pg_upgrade does in this area > >> should come later than doing something about extended stats. > > > I

Re: Statistics Import and Export

2024-04-01 Thread Jeff Davis
On Mon, 2024-04-01 at 13:11 -0400, Bruce Momjian wrote: > Reality check --- are we still targeting this feature for PG 17? I see a few useful pieces here: 1. Support import of statistics (i.e. pg_set_{relation|attribute}_stats()). 2. Support pg_dump of stats 3. Support pg_upgrade with stats

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Bruce Momjian writes: > Reality check --- are we still targeting this feature for PG 17? I'm not sure. I think if we put our heads down we could finish the changes I'm suggesting and resolve the other issues this week. However, it is starting to feel like the sort of large, barely-ready patch

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Jeff Davis writes: > On Sat, 2024-03-30 at 20:08 -0400, Tom Lane wrote: >> I haven't looked at the details, but I'm really a bit surprised >> by Jeff's assertion that CREATE INDEX destroys statistics on the >> base table.  That seems wrong from here, and maybe something we >> could have it not

Re: Statistics Import and Export

2024-04-01 Thread Bruce Momjian
Reality check --- are we still targeting this feature for PG 17? -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.

Re: Statistics Import and Export

2024-04-01 Thread Jeff Davis
On Sat, 2024-03-30 at 20:08 -0400, Tom Lane wrote: > I haven't looked at the details, but I'm really a bit surprised > by Jeff's assertion that CREATE INDEX destroys statistics on the > base table.  That seems wrong from here, and maybe something we > could have it not do.  (I do realize that it

Re: Statistics Import and Export

2024-04-01 Thread Tom Lane
Corey Huinker writes: >> IIRC, "variadic any" requires having at least one variadic parameter. >> But that seems fine --- what would be the point, or even the >> semantics, of calling pg_set_attribute_stats with no data fields? > If my pg_dump run emitted a bunch of stats that could never be

Re: Statistics Import and Export

2024-04-01 Thread Ashutosh Bapat
Hi Corey, Some more comments on v15. +/* + * A more encapsulated version of can_modify_relation for when the the + * HeapTuple and Form_pg_class are not needed later. + */ +static void +check_relation_permissions(Relation rel) This function is used exactly at one place, so usually won't make

Re: Statistics Import and Export

2024-04-01 Thread Ashutosh Bapat
Hi Corey, On Mon, Mar 25, 2024 at 3:38 PM Ashutosh Bapat wrote: > Hi Corey, > > > On Sat, Mar 23, 2024 at 7:21 AM Corey Huinker > wrote: > >> v12 attached. >> >> 0001 - >> >> > Some random comments > > +SELECT > +format('SELECT pg_catalog.pg_set_attribute_stats( ' > +||

Re: Statistics Import and Export

2024-03-31 Thread Corey Huinker
> > IIRC, "variadic any" requires having at least one variadic parameter. > But that seems fine --- what would be the point, or even the > semantics, of calling pg_set_attribute_stats with no data fields? > If my pg_dump run emitted a bunch of stats that could never be imported, I'd want to know.

Re: Statistics Import and Export

2024-03-31 Thread Tom Lane
Corey Huinker writes: >> and I really think that we need to provide >> the source server's major version number --- maybe we will never >> need that, but if we do and we don't have it we will be sad. > The JSON had it, and I never did use it. Not against having it again. Well, you don't need it

Re: Statistics Import and Export

2024-03-31 Thread Corey Huinker
> > > I concur with the plan of extracting data from pg_stats not > pg_statistics, and with emitting a single "set statistics" > call per attribute. (I think at one point I'd suggested a call > per stakind slot, but that would lead to a bunch of UPDATEs on > existing pg_attribute tuples and hence

Re: Statistics Import and Export

2024-03-31 Thread Tom Lane
Corey Huinker writes: >> I can't quibble with that view of what has priority. I'm just >> suggesting that redesigning what pg_upgrade does in this area >> should come later than doing something about extended stats. > I mostly agree, with the caveat that pg_upgrade's existing message saying >

Re: Statistics Import and Export

2024-03-31 Thread Corey Huinker
> > I wonder if the right answer to that is "let's enhance the I/O > functions for those types". But whether that helps or not, it's > v18-or-later material for sure. > That was Stephen's take as well, and I agreed given that I had to throw the kitchen-sink of source-side oid mappings (attname,

Re: Statistics Import and Export

2024-03-31 Thread Tom Lane
Corey Huinker writes: > On Sun, Mar 31, 2024 at 2:41 PM Tom Lane wrote: >> There's a bigger issue though: AFAICS this patch set does nothing >> about dumping extended statistics. I surely don't want to hold up >> the patch insisting that that has to happen before we can commit the >>

Re: Statistics Import and Export

2024-03-31 Thread Corey Huinker
On Sun, Mar 31, 2024 at 2:41 PM Tom Lane wrote: > Corey Huinker writes: > > Having given this some thought, I'd be inclined to create a view, > > pg_stats_missing, with the same security barrier as pg_stats, but looking > > for tables that lack stats on at least one column, or lack stats on an

Re: Statistics Import and Export

2024-03-31 Thread Tom Lane
My apologies for having paid so little attention to this thread for months. I got around to reading the v15 patches today, and while I think they're going in more or less the right direction, there's a long way to go IMO. I concur with the plan of extracting data from pg_stats not pg_statistics,

Re: Statistics Import and Export

2024-03-31 Thread Tom Lane
Corey Huinker writes: > Having given this some thought, I'd be inclined to create a view, > pg_stats_missing, with the same security barrier as pg_stats, but looking > for tables that lack stats on at least one column, or lack stats on an > extended statistics object. The week before feature

Re: Statistics Import and Export

2024-03-31 Thread Corey Huinker
> > That will make it *really* hard for any form of automation or drivers of > this. The information needs to go somewhere where such tools can easily > consume it, and an informational message during runtime (which is also > likely to be translated in many environments) is the exact opposite of

Re: Statistics Import and Export

2024-03-30 Thread Tom Lane
Corey Huinker writes: >> I didn't have any specific proposal in mind, was just trying to think >> outside the box. > What if we added a separate resection SECTION_STATISTICS which is run > following post-data? Maybe, but that would have a lot of side-effects on pg_dump's API and probably on

Re: Statistics Import and Export

2024-03-30 Thread Corey Huinker
> > I didn't have any specific proposal in mind, was just trying to think > outside the box. > What if we added a separate resection SECTION_STATISTICS which is run following post-data?

Re: Statistics Import and Export

2024-03-30 Thread Corey Huinker
> > I'm getting late into this discussion and I apologize if I've missed this > being discussed before. But. > > Please don't. > > That will make it *really* hard for any form of automation or drivers of > this. The information needs to go somewhere where such tools can easily > consume it, and an

Re: Statistics Import and Export

2024-03-30 Thread Jeff Davis
On Sat, 2024-03-30 at 13:39 -0400, Tom Lane wrote: > (You could also imagine an explicit positive --stats switch that > would > override --schema-only, but I don't see that it's worth the trouble.) That would have its own utility for reproducing planner problems outside of production systems.

Re: Statistics Import and Export

2024-03-30 Thread Tom Lane
Jeff Davis writes: > On Sat, 2024-03-30 at 13:18 -0400, Tom Lane wrote: >> Surely they are data, not schema. It would make zero sense to >> restore them if you aren't restoring the data they describe. > The complexity is that pg_upgrade does create the data, but relies on a > schema-only dump.

Re: Statistics Import and Export

2024-03-30 Thread Jeff Davis
On Sat, 2024-03-30 at 13:18 -0400, Tom Lane wrote: > Surely they are data, not schema.  It would make zero sense to > restore > them if you aren't restoring the data they describe. The complexity is that pg_upgrade does create the data, but relies on a schema-only dump. So we'd need to at least

Re: Statistics Import and Export

2024-03-30 Thread Tom Lane
Jeff Davis writes: > This re-raises the question of whether stats are part of a schema-only > dump or not. Did we settle conclusively that they are? Surely they are data, not schema. It would make zero sense to restore them if you aren't restoring the data they describe. Hence, it'll be a bit

Re: Statistics Import and Export

2024-03-30 Thread Jeff Davis
On Sat, 2024-03-30 at 01:34 -0400, Corey Huinker wrote: > > - 002pg_upgrade.pl now dumps before/after databases with --no- > statistics. I tried to find out why some tables were getting their > relstats either not set, or set and reset, never affecting the > attribute stats. I even tried turning

Re: Statistics Import and Export

2024-03-30 Thread Magnus Hagander
On Sat, Mar 30, 2024 at 1:26 AM Corey Huinker wrote: > > > On Fri, Mar 29, 2024 at 7:34 PM Jeff Davis wrote: > >> On Fri, 2024-03-29 at 18:02 -0400, Stephen Frost wrote: >> > I’d certainly think “with stats” would be the preferred default of >> > our users. >> >> I'm concerned there could still

Re: Statistics Import and Export

2024-03-29 Thread Jeff Davis
On Fri, 2024-03-29 at 20:54 -0400, Stephen Frost wrote: > What’s different, given the above arguments, in making the change > with 18 instead of now? Acknowledged. You, Tom, and Corey (and perhaps everyone else) seem to be aligned here, so that's consensus enough for me. Default is with stats,

Re: Statistics Import and Export

2024-03-29 Thread Stephen Frost
Greetings, On Fri, Mar 29, 2024 at 19:35 Jeff Davis wrote: > On Fri, 2024-03-29 at 18:02 -0400, Stephen Frost wrote: > > I’d certainly think “with stats” would be the preferred default of > > our users. > > I'm concerned there could still be paths that lead to an error. For > pg_restore, or

Re: Statistics Import and Export

2024-03-29 Thread Corey Huinker
> > (I've not read the patch yet, but I assume the switch works like > other pg_dump filters in that you can apply it on the restore > side?) > Correct. It follows the existing --no-something pattern.

Re: Statistics Import and Export

2024-03-29 Thread Corey Huinker
On Fri, Mar 29, 2024 at 7:34 PM Jeff Davis wrote: > On Fri, 2024-03-29 at 18:02 -0400, Stephen Frost wrote: > > I’d certainly think “with stats” would be the preferred default of > > our users. > > I'm concerned there could still be paths that lead to an error. For > pg_restore, or when loading

Re: Statistics Import and Export

2024-03-29 Thread Tom Lane
Jeff Davis writes: > On Fri, 2024-03-29 at 18:02 -0400, Stephen Frost wrote: >> I’d certainly think “with stats” would be the preferred default of >> our users. > What do you think about starting off with it as non-default, and then > switching it to default in 18? I'm with Stephen: I find it

Re: Statistics Import and Export

2024-03-29 Thread Jeff Davis
On Fri, 2024-03-29 at 18:02 -0400, Stephen Frost wrote: > I’d certainly think “with stats” would be the preferred default of > our users. I'm concerned there could still be paths that lead to an error. For pg_restore, or when loading a SQL file, a single error isn't fatal (unless -e is

Re: Statistics Import and Export

2024-03-29 Thread Corey Huinker
> > > There's still a failure in the pg_upgrade TAP test. One seems to be > ordering, so perhaps we need to ORDER BY the attribute number. Others > seem to be missing relstats and I'm not sure why yet. I suggest doing > some manual pg_upgrade tests and comparing the before/after dumps to > see if

Re: Statistics Import and Export

2024-03-29 Thread Stephen Frost
Greetings, On Fri, Mar 29, 2024 at 11:05 Jeff Davis wrote: > On Fri, 2024-03-29 at 05:32 -0400, Corey Huinker wrote: > > 0002: > > - All relstats and attrstats calls are now their own statement > > instead of a compound statement > > - moved the archive TOC entry from post-data back to

Re: Statistics Import and Export

2024-03-29 Thread Jeff Davis
On Fri, 2024-03-29 at 05:32 -0400, Corey Huinker wrote: > That is fairly close to what I came up with per our conversation > (attached below), but I really like the att_stats_arginfo construct > and I definitely want to adopt that and expand it to a third > dimension that flags the fields that

Re: Statistics Import and Export

2024-03-29 Thread Jeff Davis
On Tue, 2024-03-26 at 00:16 +0100, Tomas Vondra wrote: > I did take a closer look at v13 today. I have a bunch of comments and > some minor whitespace fixes in the attached review patches. I also attached a patch implementing a different approach to the pg_dump support. Instead of trying to

Re: Statistics Import and Export

2024-03-28 Thread Jeff Davis
Hi Tom, Comparing the current patch set to your advice below: On Tue, 2023-12-26 at 14:19 -0500, Tom Lane wrote: > I had things set up with simple functions, which > pg_dump would invoke by writing more or less > > SELECT pg_catalog.load_statistics(); > > This has a number of

Re: Statistics Import and Export

2024-03-27 Thread Corey Huinker
> > 1) The docs say this: > > >The purpose of this function is to apply statistics values in an >upgrade situation that are "good enough" for system operation until >they are replaced by the next ANALYZE, usually via >autovacuum This function is used by >pg_upgrade and

Re: Statistics Import and Export

2024-03-27 Thread Corey Huinker
> > > > +\gexec > > Why do we need to construct the command and execute? Can we instead > execute the function directly? That would also avoid ECHO magic. > We don't strictly need it, but I've found the set-difference operation to be incredibly useful in diagnosing problems. Additionally, the

Re: Statistics Import and Export

2024-03-25 Thread Ashutosh Bapat
Hi Corey, On Sat, Mar 23, 2024 at 7:21 AM Corey Huinker wrote: > v12 attached. > > 0001 - > > Some random comments +SELECT +format('SELECT pg_catalog.pg_set_attribute_stats( ' +|| 'relation => %L::regclass::oid, attname => %L::name, ' +|| 'inherited => %L::boolean,

Re: Statistics Import and Export

2024-03-21 Thread Corey Huinker
> > > > But ideally we'd just make it safe to dump and reload stats on your own > tables, and then not worry about it. > That is my strong preference, yes. > > > Not off hand, no. > > To me it seems like inconsistent data to have most_common_freqs in > anything but descending order, and we

Re: Statistics Import and Export

2024-03-21 Thread Jeff Davis
On Thu, 2024-03-21 at 15:10 -0400, Corey Huinker wrote: > > In which case wouldn't the checkCanModify on pg_statistic would be a > proxy for is_superuser/has_special_role_we_havent_created_yet. So if someone pg_dumps their table and gets the statistics in the SQL, then they will get errors

Re: Statistics Import and Export

2024-03-21 Thread Corey Huinker
> > How about just some defaults then? Many of them have a reasonable > default, like NULL or an empty array. Some are parallel arrays and > either both should be specified or neither (e.g. > most_common_vals+most_common_freqs), but you can check for that. > +1 Default NULL has been implemented

Re: Statistics Import and Export

2024-03-21 Thread Jeff Davis
On Thu, 2024-03-21 at 03:27 -0400, Corey Huinker wrote: > > They can, but part of what I wanted to show was that the values that > aren't directly passed in as parameters (staopN, stacollN) get set to > the correct values, and those values aren't guaranteed to match > across databases, hence

Re: Statistics Import and Export

2024-03-21 Thread Corey Huinker
On Thu, Mar 21, 2024 at 2:29 AM Jeff Davis wrote: > On Tue, 2024-03-19 at 05:16 -0400, Corey Huinker wrote: > > v11 attached. > > Thank you. > > Comments on 0001: > > This test: > >+SELECT >+format('SELECT pg_catalog.pg_set_attribute_stats( ' >... > > seems misplaced. It's

Re: Statistics Import and Export

2024-03-21 Thread Jeff Davis
On Tue, 2024-03-19 at 05:16 -0400, Corey Huinker wrote: > v11 attached. Thank you. Comments on 0001: This test: +SELECT +format('SELECT pg_catalog.pg_set_attribute_stats( ' ... seems misplaced. It's generating SQL that can be used to restore or copy the stats -- that seems like

Re: Statistics Import and Export

2024-03-19 Thread Corey Huinker
v11 attached. - TAP tests passing (the big glitch was that indexes that are used in constraints should have their stats dependent on the constraint, not the index, thanks Jeff) - The new range-specific statistics types are now supported. I'm not happy with the typid machinations I do to get them

Re: Statistics Import and Export

2024-03-18 Thread Corey Huinker
> > > > > From testrun/pg_dump/002_pg_dump/log/regress_log_002_pg_dump, search > for the "not ok" and then look at what it tried to do right before > that. I see: > > pg_dump: error: prepared statement failed: ERROR: syntax error at or > near "%" > LINE 1: ..._histogram => %L::real[])

Re: Statistics Import and Export

2024-03-18 Thread Jeff Davis
On Sun, 2024-03-17 at 23:33 -0400, Corey Huinker wrote: > > A few TAP tests are still failing and I haven't been able to diagnose > why, though the failures in parallel dump seem to be that it tries to > import stats on indexes that haven't been created yet, which is odd > because I sent the

Re: Statistics Import and Export

2024-03-17 Thread Corey Huinker
> > > * pg_set_relation_stats() needs to do an ACL check so you can't set the > stats on someone else's table. I suggest honoring the new MAINTAIN > privilege as well. > Added. > > * If possible, reading from pg_stats (instead of pg_statistic) would be > ideal because pg_stats already does the

Re: Statistics Import and Export

2024-03-15 Thread Jeff Davis
On Fri, 2024-03-15 at 15:30 -0700, Jeff Davis wrote: > Still looking, but one quick comment is that the third argument of > dumpRelationStats() should be const, which eliminates a warning. A few other comments: * pg_set_relation_stats() needs to do an ACL check so you can't set the stats on

Re: Statistics Import and Export

2024-03-15 Thread Jeff Davis
On Fri, 2024-03-15 at 03:55 -0400, Corey Huinker wrote: > > Statistics are preserved by default, but this can be disabled with > the option --no-statistics. This follows the prevailing option > pattern in pg_dump, etc. I'm not sure if saving statistics should be the default in 17. I'm inclined

Re: Statistics Import and Export

2024-03-15 Thread Corey Huinker
> > ANALYZE takes out one lock StatisticRelationId per relation, not per > attribute like we do now. If we didn't release the lock after every > attribute, and we only called the function outside of a larger transaction > (as we plan to do with pg_restore) then that is the closest we're going to >

Re: Statistics Import and Export

2024-03-13 Thread Corey Huinker
> > Note that there's two different things we're talking about here- the > lock on the relation that we're analyzing and then the lock on the > pg_statistic (or pg_class) catalog itself. Currently, at least, it > looks like in the three places in the backend that we open > StatisticRelationId, we

Re: Statistics Import and Export

2024-03-13 Thread Stephen Frost
Greetings, * Corey Huinker (corey.huin...@gmail.com) wrote: > > No, we should be keeping the lock until the end of the transaction > > (which in this case would be just the one statement, but it would be the > > whole statement and all of the calls in it). See analyze.c:268 or > > so, where we

Re: Statistics Import and Export

2024-03-12 Thread Corey Huinker
> > No, we should be keeping the lock until the end of the transaction > (which in this case would be just the one statement, but it would be the > whole statement and all of the calls in it). See analyze.c:268 or > so, where we call relation_close(onerel, NoLock); meaning we're closing > the

  1   2   >