Invalid remote sampling test in postgres_fdw.

2025-08-11 Thread Corey Huinker
From: Corey Huinker Date: Mon, 11 Aug 2025 10:05:08 -0400 Subject: [PATCH v1 1/2] Fix remote sampling tests in postgres_fdw. These tests were changing the sampling setting of a foreign server, but then were analyzing a local table, which doesn't actually test the sampling. Changed the ANALYZE c

Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions

2025-08-04 Thread Corey Huinker
> > > In the end, it seems we need to make all these functions in the below > query error safe. > select castsource::regtype, casttarget::regtype, castfunc, > castcontext,castmethod, pp.prosrc, pp.proname from pg_cast pc join pg_proc > pp on > pp.oid = pc.castfunc and pc.castfunc > 0 > order by cas

Re: implement CAST(expr AS type FORMAT 'template')

2025-08-03 Thread Corey Huinker
> another question is: > should we first implement CAST(expr AS type FORMAT 'template') for limited > types > (to_date, to_char, to_number, to_timestamptz) > or first try to make it more generic? > > That was my plan, essentially rewriting these into safe versions of the existing to_date/to_timesta

Re: implement CAST(expr AS type FORMAT 'template')

2025-08-03 Thread Corey Huinker
On Sun, Aug 3, 2025 at 11:36 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Sun, Aug 3, 2025 at 8:10 PM jian he > wrote: > >> hi. >> one more question: >> >> For binary coercible type casts, no formatted related function for it, >> should we error out? >> For example, should the f

Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions

2025-08-03 Thread Corey Huinker
> > so we need to handle numeric source types with fractional points with > special care. > currently, this applies only to numeric, float4, and float8. > (hope this is all the corner case we need to catch...) > I'm fairly certain that the committers won't like us special-casing the internal cast

Re: pg_dump --with-* options

2025-08-01 Thread Corey Huinker
On Fri, Aug 1, 2025 at 4:02 PM Jeff Davis wrote: > On Thu, 2025-07-31 at 16:28 -0400, Corey Huinker wrote: > > > > In general, I like the idea of --include, but it would need to be > > consistent in behavior across pg_dump/pg_restore/pg_upgrade(if > > applicable).

Re: pg_dump --with-* options

2025-07-31 Thread Corey Huinker
> > > > I assume that should be read as something like "include only", because > > --include=data would also be excluding the schema and the stats. > > Of course. > In general, I like the idea of --include, but it would need to be consistent in behavior across pg_dump/pg_restore/pg_upgrade(if appl

Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions

2025-07-30 Thread Corey Huinker
> > > I didn't implement the [ FORMAT ] part for now. > please check the attached regress test and tests expected result. > Question about this: +/* + * Push steps to evaluate a SafeTypeCastExpr and its various subsidiary expressions. + * We already handle CoerceViaIO, CoerceToDomain, and ArrayC

Re: support create index on virtual generated column.

2025-07-30 Thread Corey Huinker
On Tue, Jul 22, 2025 at 4:54 PM Tom Lane wrote: > Corey Huinker writes: > > I'm interested in this feature, specifically whether the optimizer uses > the > > index in situations where the expression is used rather than the virtual > > column name. > > Hmm,

Re: vacuumdb changes for stats import/export

2025-07-30 Thread Corey Huinker
> > > It seems like there is some support for adding "When used in conjunction > > with --analyze in stages..." to the beginning of the sentence. I'll give > > it another day or two for any further discussion before committing. > > Here is what I have staged for commit. > +1

Re: vacuumdb changes for stats import/export

2025-07-26 Thread Corey Huinker
> > > > > > Statistics are transactional. Without this option specified are we > really > > removing them and commiting prior to computing and saving new ones? And > we are > > opposed to just changing this behavior and instead prefer to add an > option that > > at first glance seems like everyon

Re: support create index on virtual generated column.

2025-07-22 Thread Corey Huinker
> > > hi. > > refactor and rebase. > > fix the regress tests failure in v4. > This may need another rebase, as it doesn't apply to master. I'm interested in this feature, specifically whether the optimizer uses the index in situations where the expression is used rather than the virtual column na

Re: [PATCH] Use strchr() to search for a single character

2025-07-22 Thread Corey Huinker
On Sun, Jul 20, 2025 at 6:21 PM Dmitry Mityugov wrote: > Code in src/port/pgmkdirp.c uses strstr() to find a single character in > a string, but strstr() seems to be too generic for this job. Another > function, strchr(), might be better suited for this purpose, because it > is optimized to searc

Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions

2025-07-22 Thread Corey Huinker
On Tue, Jul 22, 2025 at 2:45 AM Vik Fearing wrote: > > On 22/07/2025 03:59, jian he wrote: > > Based on my reading of [4], it seems CAST(EXPRESSION AS TYPE DEFAULT > > def_expr ON ERROR) > > is not included in SQL:2023. > > > > [4] > https://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-finish

Re: teach pg_upgrade to handle in-place tablespaces

2025-07-21 Thread Corey Huinker
On Tue, Jul 1, 2025 at 4:06 PM Nathan Bossart wrote: > rebased > > -- > nathan Everything here makes sense to me, but I do have one question: In src/bin/pg_upgrade/info.c @@ -616,11 +630,21 @@ process_rel_infos(DbInfo *dbinfo, PGresult *res, void *arg) + if (inplace) + tablespace = psprintf("%

Re: pg_dumpall dumps global objects with --statistics-only or --no-schema

2025-06-30 Thread Corey Huinker
> > > Since pg_dumpall treats global objects as schema-level content, it > currently > includes them with --schema-only but skips them with --data-only. By that > logic, > it should also skip them when either --statistics-only or --no-schema is > used. > Thought? > > +1, pending resolution of the d

Re: Allow pg_dump --statistics-only to dump foreign table statistics?

2025-06-16 Thread Corey Huinker
> > The proposed patch [0] adds RELKIND_FOREIGN_TABLE to this list. That > appears to be the only missing relation kind that ANALYZE handles. > > [0] > https://postgr.es/m/attachment/177608/v1-0001-pg_dump-Allow-pg_dump-to-dump-the-statistics-for-.patch > > Thanks for pointing it out, a little dis

Re: pg_dump --with-* options

2025-06-16 Thread Corey Huinker
> > I noticed that --statistics (i.e., the current --with-statistics) causes > statistics to be dumped even when used with --data-only or --schema-only. > So, as far as I understand, here are the possible combinations of dump > targets and options: > Those should also be mutually exclusive, and I'

Re: Allow pg_dump --statistics-only to dump foreign table statistics?

2025-06-16 Thread Corey Huinker
> > I skimmed through the main thread for the feature [0] (which seems to be so > long that it sometimes doesn't load completely), and I didn't see anything > directly related to the topic. There was some discussion about importing > foreign relation stats with the new functions instead of remote

Re: pg_dump --with-* options

2025-06-13 Thread Corey Huinker
> > > Good point. Now that we are getting rid of some of the other options, > we don't need to worry about consistency with them, and I think we > should just use "--statistics". The point of the --with flags was to future proof commands to preserve behavior in case the defaults ever changed. Th

Re: pg_dump --with-* options

2025-06-13 Thread Corey Huinker
> > One of the challenges in the current case is that it is not obvious how > --with-data, --no-data, --data-only etc. are connected. If that were > clearer, then the way these options should combine or conflict would > hopefully follow somewhat naturally. > They all should be mutually exclusive,

Re: pg_dump --with-* options

2025-06-12 Thread Corey Huinker
On Thu, Jun 12, 2025 at 4:22 PM Nathan Bossart wrote: > On Thu, Jun 12, 2025 at 04:12:35PM -0400, Corey Huinker wrote: > > The use case for --statistics-only is to extract the existing statistics > > for the tables and indexes that are involved in a given query that is > >

Re: pg_dump --with-* options

2025-06-12 Thread Corey Huinker
On Thu, Jun 12, 2025 at 1:36 PM Robert Haas wrote: > On Thu, Jun 12, 2025 at 11:58 AM Jeff Davis wrote: > > On Thu, 2025-06-12 at 09:52 -0500, Nathan Bossart wrote: > > > If the idea is to remove all options for default behavior, we'd be > > > removing > > > --no-statistics, --with-data, and --w

Re: Extended Statistics set/restore/clear functions.

2025-05-29 Thread Corey Huinker
On Mon, Mar 31, 2025 at 1:10 AM Corey Huinker wrote: > Just rebasing. > At pgconf.dev this year, the subject of changing the formats of pg_ndistinct and pg_depdentencies came up again. To recap: presently these datatypes have no working input function, but would need one for statistics

Re: Statistics Import and Export

2025-05-21 Thread Corey Huinker
> > I don't know precisely where that line might be, but in this case, the > dumped stats have no hope of restoring into anything older than v18 (since > the stats import functions won't exist), which is well past the point where > we started using -1 for reltuples. If we could dump the stats from

Re: Disallow redundant indexes

2025-04-28 Thread Corey Huinker
> > > I've had this idea before, and even wrote a quick POC at one point, but > I had it simply throw a warning rather than an > > error. That avoids the need for any GUC, which I agree is not a good > idea. And it still allows people to create a > > duplicate index if they really want to. > > > >

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-12 Thread Corey Huinker
> > You might be getting confused because the code does look at the > pg_class fields, but that's only to estimate the tuple density. When > pg_class has those estimates, they're used to calculate the estimated > density by doing reltuples / relpages, but that average rows per page > > Thanks for t

Re: someone else to do the list of acknowledgments

2025-04-12 Thread Corey Huinker
> > The whole thing might take about 20 to 30 hours wall-clock time. > After this dev cycle, things with a defined end to them hold a greater attraction than they did previously. > So, there is some time to think about this. Please discuss here if > you're interested or have questions. > I am

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-12 Thread Corey Huinker
> > * Question > > By using Statistics Import and Export feature, is it possible to achieve > the above request by following procedure? > > > > (1) Export the statistics from production environment by using pg_dump > --statistics-only. > > (2) On the staging environment, set the autovacuum related

Re: Can we use Statistics Import and Export feature to perforamance testing?

2025-04-12 Thread Corey Huinker
> > at the *actual size* of the relation and takes that into account when > scaling the statistics (see table_block_relation_estimate_size() in > tableam.c). If the table sizes don't match between the two servers > then there's no guarantees the planner will produce the same plan. > Sorry that I d

Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > * Changed to use LookupExplicitNamespace() > Seems good. > * Added test for temp tables > +1 > * Doc fixes So this patch swings the pendulum a bit back towards accepting some things as errors. That's understandable, as we're never going to have a situation where we can guarantee that th

Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > > Also, why do we need the clause "WHERE s.tablename = ANY($2)"? Isn't > > that already implied by "JOIN unnest($1, $2) ... s.tablename = > > u.tablename"? > > Good question. Corey, do you recall why this was needed? > In my patch, that SQL statement came with the comment: + /* + * The resul

Re: Statistics Import and Export

2025-04-05 Thread Corey Huinker
> > The first is that i_relallfrozen is undefined in versions earlier than > 18. That's trivial to fix, we just add "0 AS relallfrozen," in the > earlier versions, but still refrain from outputting it. > Ok, so long as we refrain from outputting it, I'm cool with whatever we store internally. >

Re: Statistics Import and Export

2025-04-04 Thread Corey Huinker
until v9.4. Looking into it... > > > This patch shrinks the array size to 1 for versions < 9.4, which keeps the modern code fairly elegant. From fe551ab55622f95d84ac4c4d79fba898c6b60057 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Fri, 4 Apr 2025 19:30:00 -0400 Subject: [PATC

Re: Add partial :-variable expansion to psql \copy

2025-04-01 Thread Corey Huinker
> > I'm hesitating about the right syntax, though, for an input backslash > command which in effect would really only apply to COPY? ISTM that \g* is > used for "go", i.e. a semi-colon replacement which executes the SQL, and we > should want the same thing, which suggests: > making it a \g-variant

Re: Add partial :-variable expansion to psql \copy

2025-03-31 Thread Corey Huinker
> > Anyway, my feeling about it is that \copy parsing is a huge hack > right now, and I'd rather see it become less of a hack, that is > more like other psql commands, instead of getting even hackier. > I wasn't as horrified as Tom, but it did have the feeling of it solving half the problem. We c

Re: Statistics Import and Export

2025-03-31 Thread Corey Huinker
ing step in >> adjust_new_dumpfile? >> > > That sounds trickier. > Narrator: It was not trickier. In light of v11-0001 being committed as 4694aedf63bf, I've rebased the remaining patches. From 607984bdcc91fa31fb7a12e9b24fb8704aa14975 Mon Sep 17 00:00:00 2001 From: Corey Huinke

Re: Statistics Import and Export

2025-03-28 Thread Corey Huinker
A rebase and a reordering of the commits to put the really-really-must-have relallfrozen ahead of the really-must-have stats batching and both of them head of the error->warning step-downs. From 96b10b1eb955c5619d23cadf7de8b12d2db638a9 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Sat,

Re: Statistics Import and Export

2025-03-25 Thread Corey Huinker
And here's the rebase (after bde2fb797aaebcbe06bf60f330ba5a068f17dda7). The order of the patches is different, but the purpose of each is the same as before. From 1f9b2578f55fa1233121bcf5949a6f69d6cf8cee Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Fri, 14 Mar 2025 03:54:26 -0400 Subject: [PATCH v1

Re: Statistics Import and Export

2025-03-25 Thread Corey Huinker
> > The original reason we wanted to issue warnings was to allow ourselves > a chance to change the meaning of parameters, add new parameters, or > even remove parameters without causing restore failures. If there are > any ERRORs that might limit our flexibility I think we should downgrade > those

Re: Statistics Import and Export

2025-03-19 Thread Corey Huinker
> > This replaces regclassin with custom lookups of the namespace and > relname, but misses some of the complexities that regclassin is > handling. For instance, it calls RangeVarGetRelid(), which calls > LookupExplicitNamespace(), which handles temp tables and > InvokeNamespaceSearchHook(). > > At

Re: vacuumdb changes for stats import/export

2025-03-17 Thread Corey Huinker
> > > While preparing this for commit, I noticed that the expression index part > of the query was disregarding attstattarget. To fix, I've modified that > part to look at the index's pg_attribute entries. > +1, should have been there all along.

Re: Statistics Import and Export

2025-03-17 Thread Corey Huinker
On Mon, Mar 17, 2025 at 10:24 AM Nathan Bossart wrote: > On Sun, Mar 16, 2025 at 05:32:15PM -0400, Corey Huinker wrote: > >> > >> * The custom format actually does two WriteToc() calls, and since these > >> patches move the queries to this part of pg_dump, it me

Re: Statistics Import and Export

2025-03-16 Thread Corey Huinker
> > * The custom format actually does two WriteToc() calls, and since these > patches move the queries to this part of pg_dump, it means we'll run all > the queries twice. The comments around this code suggest that the second > pass isn't strictly necessary and that it is really only useful

Re: Statistics Import and Export

2025-03-15 Thread Corey Huinker
On Thu, Mar 6, 2025 at 3:48 AM Jeff Davis wrote: > On Wed, 2025-03-05 at 23:04 -0500, Corey Huinker wrote: > > > > Anyway, here's a rebased set of the existing up-for-consideration > > patches, plus the optimization of avoiding querying on non-expression > &g

Re: Statistics Import and Export

2025-03-15 Thread Corey Huinker
> > > https://www.postgresql.org/docs/current/ddl-priv.html > > The above text indicates that we should do the check, but also that > it's not terribly important for actual security. > Ok, I'm convinced. > > > If we do, we'll want to change downgrade the following errors to > > warn+return fals

Re: Statistics Import and Export

2025-03-11 Thread Corey Huinker
> > I don't follow. We already have the tablenames, schemanames and oids of the > to-be-dumped tables/indexes collected in pg_dump, all that's needed is to > send > a list of those to the server to filter there? > Do we have something that currently does that? All of the collect functions (collect

Re: Statistics Import and Export

2025-03-08 Thread Corey Huinker
> > Until we add a fourth option, and then it becomes completely ambiguous as >> to whether you wanted data+statstics, or you not-wanted schema. >> >> > except it is perfectly clear that you *asked for* data and statistics, so > you get what you asked for. however the user conjures in their heads w

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
a couple orders of magnitude gain. From 9cd4b4e0e280d0fd8cb120ac105d6e65a491cd7e Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Tue, 4 Mar 2025 22:16:52 -0500 Subject: [PATCH v7 1/2] Split relation into schemaname and relname. In order to further reduce potential error-failures in restor

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
On Sat, Mar 8, 2025 at 12:52 AM Hari Krishna Sunder wrote: > To improve the performance of pg_dump can we add a new sql function that > can operate more efficiently than the pg_stats view? It could also take in > an optional list of oids to filter on. > This will help speed up the dump and restor

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > I tried to generalize that requirement to all of > {schema|data|statistics} for consistency, but that resulted in 9 > options. > 9 options that resolve to 3 boolean variables. It's not that hard. And if we add a fourth option set, then we have 12 options. So it's O(3N), not O(N^2). People ha

Re: Statistics Import and Export

2025-03-07 Thread Corey Huinker
> > > if you want everything --include=schema,data,statistics (presumably > redundant with the default behavior) > if you want schema only --include=schema > if you want "everything except schema" --include=data,statistics > Until we add a fourth option, and then it becomes completely ambiguous as

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > Patch attached. This patch does NOT change the default; stats are still > opt-out. But it makes it easier for users to start specifying what they > want or not explicitly, or to rely on the defaults if they prefer. > > Note that the patch would mean we go from 2 options in v17: > --{schema|

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > To be honest, I am a bit surprised that we decided to enable this by > default. It's not obvious to me that statistics should be regarded as > part of the database in the same way that table definitions or table > data are. That said, I'm not overwhelmingly opposed to that choice. > However, ev

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > The more I think about it, the less correct it seems to me to have the > statement to restore statistics tracked via ArchiveOpts->createStmt. We > use > that for DDL, but this really is data, not DDL. Because we store it in > ->createStmt it's stored in-memory for the runtime of pg_dump, wh

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > Would it be appropriate to create a temp table? I wouldn't normally > expect pg_dump to create temp tables, but I can't think of a major > reason not to. > I think we can't - the db might be a replica. > > If not, did you have in mind a CTE with a large VALUES expression, or > just a giant I

Re: Statistics Import and Export

2025-03-06 Thread Corey Huinker
> > > > > Pardon my inexperience, but aren't the ArchiveEntry records needed right > up > > until the program's run? > > s/the/the end of the/? > yes > > If there's value in freeing them, why isn't it being done already? What > > other thing would consume this freed memory? > > I'm not saying th

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
On Wed, Mar 5, 2025 at 9:18 PM Andres Freund wrote: > Hi, > > On 2025-03-05 20:54:35 -0500, Corey Huinker wrote: > > It's been considered and not ruled out, with a "let's see how the simple > > thing works, first" approach. Considerations are:

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
> > Apologies if this has already been considered upthread, but would it be > possible to use one query to gather all the required information into a > sorted table? At a glance, it looks to me like it might be feasible. I > had a lot of luck with reducing the number per-object queries with that

Re: Statistics Import and Export

2025-03-05 Thread Corey Huinker
> > One fairly easy win would be to stop issuing getAttributeStats() for > non-expression indexes. In most cases that'll already drastically cut down > on > the extra queries. That does seem like an easy win, especially since we're already using indexprs for some filters. I am concerned that, dow

Re: Statistics Import and Export

2025-03-02 Thread Corey Huinker
> > Also, we will need to think through the set of pg_dump options again. A >> lot of our tools seem to assume that "if it's the default, we don't >> need a way to ask for it explicitly", which makes it a lot harder to >> ever change the default and keep a coherent set of options. >> > > That's a g

Re: Statistics Import and Export

2025-03-01 Thread Corey Huinker
> > Independently of that, do we want to switch over to storing > reltuples as a string instead of converting it? I still feel > uncomfortable about the amount of baggage we added to pg_dump > to avoid that. I'm obviously a 'yes' vote for string, either fixed width buffer or pg_strdup'd, for the

Re: Statistics Import and Export

2025-02-28 Thread Corey Huinker
On Fri, Feb 28, 2025 at 4:25 PM Jeff Davis wrote: > On Fri, 2025-02-28 at 14:56 -0600, Nathan Bossart wrote: > > On Fri, Feb 28, 2025 at 12:54:03PM -0800, Jeff Davis wrote: > > > (Aside: I assume everyone here agrees that pg_upgrade should > > > transfer > > > the stats by default.) > > > > That

Re: Statistics Import and Export

2025-02-27 Thread Corey Huinker
On Thu, Feb 27, 2025 at 10:01 PM Corey Huinker wrote: > +--- error: relation is wrong type >> +SELECT pg_catalog.pg_restore_relation_stats( >> +'relation', 0::oid, >> +'relpages', 17::integer, >> +'reltuples&#x

Re: Statistics Import and Export

2025-02-27 Thread Corey Huinker
> > +--- error: relation is wrong type > +SELECT pg_catalog.pg_restore_relation_stats( > +'relation', 0::oid, > +'relpages', 17::integer, > +'reltuples', 400.0::real, > +'relallvisible', 4::integer); > > Why do you need to specify all the stats (relpages, reltuples,

Re: Statistics Import and Export commit related issue.

2025-02-27 Thread Corey Huinker
On Tue, Feb 25, 2025 at 1:31 AM jian he wrote: > hi. > the thread "Statistics Import and Export" is quite hot. > so a new thread for some minor issue i found. > > > static int > _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH) > { > > if (strcmp(te->desc, "STATISTICS D

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
what was already there, etc. * the set difference tests remain, as they proved extremely useful in detecting undesirable side-effects during development From f3087b04784d6853970bf41eb619281d72ce94bd Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Wed, 26 Feb 2025 21:02:44 -0500 Subject: [PAT

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 4:46 PM Tom Lane wrote: > Melanie Plageman writes: > > I have a patch that is getting thwacked around by the churn in > > stats_import.sql, and it occurred to me that I don't see why all the > > negative tests for pg_restore_relation_stats() need to have all the > > para

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
> > I have a patch that is getting thwacked around by the churn in > stats_import.sql, and it occurred to me that I don't see why all the > negative tests for pg_restore_relation_stats() need to have all the > parameters provided. For example, in both of these tests, you are > testing the relation

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 11:23 AM Tom Lane wrote: > Jeff Davis writes: > > I think you had mentioned upthread something about getting rid of the > > table-driven logic, which is fine with me. Did you mean for that to > > happen in this patch as well? > > Per Corey's description of the patch (I di

Re: Statistics Import and Export

2025-02-26 Thread Corey Huinker
On Wed, Feb 26, 2025 at 12:05 AM Tom Lane wrote: > Corey Huinker writes: > > Just to confirm, we ARE able to assume dense packing of attributes in an > > index, and thus we can infer the attnum from the position of the attname > in > > the aggregated array, and there&#x

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 11:36 PM Tom Lane wrote: > Corey Huinker writes: > > My solution so far is to take allo the v11+ (SELECT array_agg...) > functions > > and put them into a LATERAL, two of them filtered by attstattarget > 0 > and > > a new one aggregating att

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
> > To my mind the next task is to get the buildfarm green again by > fixing the expression-index-stats problem. I can have a go at > that once Jeff pushes these patches, unless one of you are already > on it? > Already on it, but I can step aside if you've got a clearer vision of how to solve it

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 9:00 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 09:54 -0500, Andres Freund wrote: > > Have you compared performance of with/without stats after these > > optimizations? > > On unoptimized build with asserts enabled, dumping the regression > database: > > --no-statistic

Re: Statistics Import and Export

2025-02-25 Thread Corey Huinker
On Tue, Feb 25, 2025 at 1:22 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 12:50 -0500, Tom Lane wrote: > > Also, while working on the attached, I couldn't help forming the > > opinion that we'd be better off to nuke pg_set_attribute_stats() > > from orbit and require people to use pg_restore_attr

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:30 PM Corey Huinker wrote: > >> >> After a bit of playing around, it seemed messy to make it into >> a join, but we could replace the two array_agg sub-selects with >> a single one: >> >> (SELECT pg_catalog.array_agg(ROW(attname, a

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:07 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 15:40 -0500, Tom Lane wrote: > > I'm a little suspicious whether that has any effect if you insert it > > before set_pglocale_pgservice(). > > Ah, right. Corey, can you please include that (in the right place, of > course)

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 4:33 PM Andres Freund wrote: > Hi, > > On February 24, 2025 10:30:08 PM GMT+01:00, Corey Huinker < > corey.huin...@gmail.com> wrote: > >From what I can see, it doesn't. Moreover, the attstattarget array agg is > >only done in version

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > > > After a bit of playing around, it seemed messy to make it into > a join, but we could replace the two array_agg sub-selects with > a single one: > > (SELECT pg_catalog.array_agg(ROW(attname, attstattarget) ORDER BY attnum) > FROM pg_catalog.pg_attribute WHERE attrelid = i.indexrelid) > > a

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > I don't think that's necessarily true, hot pruning might help some, as > afaict > the restore happens in multiple transactions. > If we're willing to take the potential bloat to avoid a nasty complexity, then I'm all for discarding it. Jeff just indicated off-list that he isn't seeing noticeab

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 2:36 PM Tom Lane wrote: > Corey Huinker writes: > > Sadly, that attnum isn't available in pg_stats, so we'd have to > reintroduce > > the joins to pg_namespace and pg_class to get at pg_attribute, at least > for > > indexes. > >

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
> > > > I suspect that this is a *really* bad idea. It's very very hard to get > inplace > updates right. We have several unfixed correctness bugs that are related to > the use of inplace updates. I really don't think it's wise to add > additional > interfaces that can reach inplace updates unless

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 1:54 PM Jeff Davis wrote: > On Mon, 2025-02-24 at 13:47 -0500, Corey Huinker wrote: > > There doesn't seem to be any way around it, but it will > > slightly complicate the dump-ing side of things, in that we need to > > either: > > &g

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 12:51 PM Tom Lane wrote: > Andres Freund writes: > > On 2025-02-24 05:11:48 -0500, Corey Huinker wrote: > >> * relpages/reltuples/relallvisible are now char[32] buffers in > RelStatsInfo > >> and nowhere else (existing relpages conversion r

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
On Mon, Feb 24, 2025 at 9:54 AM Andres Freund wrote: > Hi, > > On 2025-02-24 05:11:48 -0500, Corey Huinker wrote: > > Incorporating most of the feedback (I kept a few of > > the appendNamedArgument() calls) presented over the weekend. > > > > * removeVer

Re: Statistics Import and Export

2025-02-24 Thread Corey Huinker
oved to end of both queries for consistency. From 95127f6fd82bde843e5840de93678ff784750c8a Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Mon, 24 Feb 2025 02:38:22 -0500 Subject: [PATCH v2 1/3] Leverage existing functions for relation stats. Rather than quer pg_class once per relation in order to

Re: Statistics Import and Export

2025-02-23 Thread Corey Huinker
On Sun, Feb 23, 2025 at 7:22 PM Jeff Davis wrote: > On Sat, 2025-02-22 at 00:00 -0500, Corey Huinker wrote: > > > > Attached is the first optimization, which gets rid of the pg_class > > queries entirely, instead getting the same information from the > > existi

Re: Statistics Import and Export

2025-02-21 Thread Corey Huinker
moteVersionStr is "18devel" rather than "18". I didn't include any work on the attribute query as I wanted to keep that separate for clarity purposes. From 45467a69813cbf25c2850b254c5d2710c231a723 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Fri, 21 Feb 2025 23

Re: Statistics Import and Export

2025-02-21 Thread Corey Huinker
> > Oy. Those are outright horrid, even without any consideration of > pre-preparing them. We know the OID of the table we want to dump, > we should be doing "FROM pg_class WHERE oid = whatever" and lose > the join to pg_namespace altogether. The explicit casts to regclass > are quite expensive

Re: Statistics Import and Export

2025-02-11 Thread Corey Huinker
en yes, we could trim it down, but as it is I think it's a limitation of the testing structure. But aside from creating a whole extra XYZ_pg_dump.pl file, I don't think there's a way to do that. The previous 0001 is now committed (thanks!) so only one remains. From afb3d0fd81fb

Re: Statistics Import and Export

2025-02-09 Thread Corey Huinker
actually meant for any dump that has all schema excluded. From 2de404f8187466061469abfaf3a773290cce6af8 Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Tue, 21 Jan 2025 11:52:58 -0500 Subject: [PATCH v47 1/2] Lock table first when setting index relation statistics. Jian He reported [1] a missing lock relation bug

Re: Statistics Import and Export

2025-02-06 Thread Corey Huinker
rding has been done, some of which I'm still not totally satisfied with, so I'm going to give it another look tomorrow, but am putting this out for reviewers in the mean time. From fb095b7ea75ef366eaae5eb7b5322b9869a760ea Mon Sep 17 00:00:00 2001 From: Corey Huinker Date: Tue, 21 Jan 2025 11:

Re: Statistics Import and Export

2025-02-05 Thread Corey Huinker
On Mon, Jan 27, 2025 at 11:09 AM Corey Huinker wrote: > On Mon, Jan 27, 2025 at 9:05 AM jian he > wrote: > >> On Tue, Jan 21, 2025 at 7:31 AM Jeff Davis wrote: >> > >> > On Mon, 2025-01-20 at 16:45 -0500, Corey Huinker wrote: >> > > >> &g

Re: Statistics Import and Export

2025-02-05 Thread Corey Huinker
On Sat, Jan 25, 2025 at 10:02 AM Corey Huinker wrote: > Fixed. Holding off on posting updated patch pending decision on what's the >>> best thing to do with partitioned indexes. >> >> > Though I was able to get it to work multiple ways, the one that seems to >

Re: should we have a fast-path planning for OLTP starjoins?

2025-02-05 Thread Corey Huinker
> > > Hmmm, yeah. But that's only for the INNER JOIN case. But I've seen many > of these star join queries with LEFT JOIN too, and then the FKs are not > needed. All you need is a PK / unique index on the other side. Indeed, many installations specifically _remove_ foreign keys because of the dre

Re: Extended Statistics set/restore/clear functions.

2025-01-29 Thread Corey Huinker
On Wed, Jan 29, 2025 at 2:50 AM jian he wrote: > hi. > > select '{"1, 0B100101":"NaN"}'::pg_ndistinct; > pg_ndistinct > > {"1, 37": -2147483648} > (1 row) > I think my initial reaction is to just refuse those special values, but I'll look into the parsing code to

Re: Extended Statistics set/restore/clear functions.

2025-01-29 Thread Corey Huinker
On Tue, Jan 28, 2025 at 11:25 AM jian he wrote: > hi. > I reviewed 0001 only. > > in src/backend/statistics/mvdistinct.c > > no need #include "nodes/pg_list.h" since > src/include/statistics/statistics.h sub level include "nodes/pg_list.h" > > no need #include "utils/palloc.h" > sicne #include "p

Re: Extended Statistics set/restore/clear functions.

2025-01-27 Thread Corey Huinker
> > I'd like to merge these down to 3 patches again, but I'm keeping them > separate for this patchset to isolate the attnum-checking code for this > go-round. > These are mock-ups of the to/from JSON functions, but building from/to text rather than the not-yet-committed pg_ndistinct and pg_depend

Re: Statistics Import and Export

2025-01-27 Thread Corey Huinker
On Mon, Jan 27, 2025 at 9:05 AM jian he wrote: > On Tue, Jan 21, 2025 at 7:31 AM Jeff Davis wrote: > > > > On Mon, 2025-01-20 at 16:45 -0500, Corey Huinker wrote: > > > > > > What I struggle to understand is how that purpose isn't served better > >

Re: vacuumdb changes for stats import/export

2025-01-24 Thread Corey Huinker
> > Thoughts? > I don't have anything to add to what Nathan said, but thought I should say so since this thread was broken off from my earlier thread. Eagerly awaiting feedback.

  1   2   3   4   5   >