Re: Extended Statistics set/restore/clear functions.

2025-11-18 Thread jian he
On Tue, Nov 18, 2025 at 4:52 PM Corey Huinker wrote: > v15: > > - catches duplicate object keys cited above > - enforces attnum ordering (ascending positive numbers followed by descending > negative numbers, no zeros allowed), which means we get duplicate attnum > detection for free > - attnum v

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Michael Paquier
On Tue, Nov 18, 2025 at 01:07:23PM +0800, jian he wrote: > + errsave(parse->escontext, > + errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), > + errmsg("malformed pg_ndistinct: \"%s\"", parse->str), > + errdetail("Invalid \"%s\" value.", > + PG_NDISTINCT_KEY_NDISTINCT)); > > the errdetail is way too

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread jian he
On Mon, Nov 17, 2025 at 2:56 PM Michael Paquier wrote: > > On Fri, Nov 14, 2025 at 03:25:27PM +0900, Michael Paquier wrote: > > Thanks for the new versions, I'll also look at all these across the > > next couple of days. Probably not at 0005~ for now. > > 0001 and 0002 from series v13 have been a

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Corey Huinker
> > > But, if we don't care about the order of the combinations, I also don't > > think we need to expose the functions at all. We know exactly how many > > combinations there should be for any N attributes as each attribute must > be > > unique. So if we have the right number of unique combination

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Michael Paquier
On Mon, Nov 17, 2025 at 09:32:37PM -0500, Corey Huinker wrote: > So I looked at the generator functions, hoping they'd have enough in common > that they could be made generic. And they're just different enough that I > think it's not worth it to try. > > But, if we don't care about the order of th

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Corey Huinker
On Mon, Nov 17, 2025 at 6:35 PM Michael Paquier wrote: > On Mon, Nov 17, 2025 at 08:53:35PM +0800, jian he wrote: > > segfault: > > SELECT '[{"attributes" : [1,2,3,4,5,67,6,7,8], "ndistinct" : > 4}]'::pg_ndistinct; > > > > because src/backend/statistics/mvdistinct.c line: 310, Assert > > Assert((

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Corey Huinker
> > > This feels like a different pretty still compressed output for json. > I don't think we should change the output functions to do that, but if > you want to add a function that filters these contents a bit in the > tests for the input functions, sure, why not. > +1, I'll probably just use the

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Michael Paquier
On Mon, Nov 17, 2025 at 08:53:35PM +0800, jian he wrote: > segfault: > SELECT '[{"attributes" : [1,2,3,4,5,67,6,7,8], "ndistinct" : > 4}]'::pg_ndistinct; > > because src/backend/statistics/mvdistinct.c line: 310, Assert > Assert((item->nattributes >= 2) && (item->nattributes <= > STATS_MAX_DIMEN

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Michael Paquier
On Mon, Nov 17, 2025 at 12:18:55PM -0500, Corey Huinker wrote: > On Mon, Nov 17, 2025 at 1:56 AM Michael Paquier wrote: > Though, I was thinking some more about the output format. Using > jsonb_pretty() makes it readable in one way, and very clumsy in other ways. > Instead, I'm going to try doing

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread Corey Huinker
On Mon, Nov 17, 2025 at 1:56 AM Michael Paquier wrote: > On Fri, Nov 14, 2025 at 03:25:27PM +0900, Michael Paquier wrote: > > Thanks for the new versions, I'll also look at all these across the > > next couple of days. Probably not at 0005~ for now. > > 0001 and 0002 from series v13 have been ap

Re: Extended Statistics set/restore/clear functions.

2025-11-17 Thread jian he
On Mon, Nov 17, 2025 at 2:56 PM Michael Paquier wrote: > > On Fri, Nov 14, 2025 at 03:25:27PM +0900, Michael Paquier wrote: > > Thanks for the new versions, I'll also look at all these across the > > next couple of days. Probably not at 0005~ for now. > > 0001 and 0002 from series v13 have been a

Re: Extended Statistics set/restore/clear functions.

2025-11-13 Thread Michael Paquier
On Fri, Nov 14, 2025 at 12:49:23AM -0500, Corey Huinker wrote: > Negative numbers represent the Nth expression defined in the extended > statistics object. So if you have extended statistics on a, b, length(a), > length(b) then you can legally have -1 and -2 in the attributes, but > nothing lower t

Re: Extended Statistics set/restore/clear functions.

2025-11-13 Thread jian he
hi. now looking at v12-0003-Add-working-input-function-for-pg_ndistinct.patch again. + * example input: + * [{"attributes": [6, -1], "ndistinct": 14}, + * {"attributes": [6, -2], "ndistinct": 9143}, + * {"attributes": [-1,-2], "ndistinct": 13454}, + * {"attributes": [6, -1, -2],

Re: Extended Statistics set/restore/clear functions.

2025-11-12 Thread Corey Huinker
On Thu, Nov 13, 2025 at 2:02 AM jian he wrote: > hi. > > v12-0001 and v12-0002 overall look good to me. > > if (dependency->nattributes <= 1) > elog(ERROR, "invalid zero-length nattributes array in > MVDependencies"); > This is an unlikely-to-happen error message, but still,

Re: Extended Statistics set/restore/clear functions.

2025-11-12 Thread jian he
hi. v12-0001 and v12-0002 overall look good to me. if (dependency->nattributes <= 1) elog(ERROR, "invalid zero-length nattributes array in MVDependencies"); This is an unlikely-to-happen error message, but still, “nattributes” seems confusing? in doc/src/sgml/func/func-json

Re: Extended Statistics set/restore/clear functions.

2025-11-12 Thread Michael Paquier
On Wed, Nov 12, 2025 at 08:45:16AM +, Man Zeng wrote: > pg_dependencies_out - output routine for type pg_dependencies. > pg_ndistinct_out > > Oh, I didn’t operate it properly, which resulted in the creation of a > duplicate theme. > https://www.postgresql.org/message-id/tencent_6EC50DD07D5D

Re: Extended Statistics set/restore/clear functions.

2025-11-12 Thread Man Zeng
Hey, I saw that some comments in the recent commit have inconsistent styles—maybe we can tweak them to make them uniform! ```c // pg_dependencies- output routine for type pg_dependencies. pg_dependencies_out - output routine for type pg_dependencies. // pg_ndistinct //output routine fo

Re: Extended Statistics set/restore/clear functions.

2025-11-12 Thread zengman
Hey, I saw that some comments in the recent commit have inconsistent styles??maybe we can tweak them to make them uniform! ```c // pg_dependencies- output routine for type pg_dependencies. pg_dependencies_out - output routine for type pg_dependencies. // pg_ndistinct //output routine f

Re: Extended Statistics set/restore/clear functions.

2025-11-11 Thread Michael Paquier
On Wed, Nov 12, 2025 at 01:47:33AM -0500, Corey Huinker wrote: >> Thanks for the new patch. And FWIW I disagree with this approach: >> cleanup and refactoring pieces make more sense if done first, as these >> lead to less code churn in the final result. So... I've begun to put >> my hands on the

Re: Extended Statistics set/restore/clear functions.

2025-11-11 Thread jian he
On Tue, Nov 11, 2025 at 11:14 PM jian he wrote: > > On Tue, Nov 11, 2025 at 4:07 PM Michael Paquier wrote: > > I've rebased the full set using the new structure. 0001~0004 are > > clean. 0005~ need more work and analysis, but that's a start. hi. +Datum +pg_ndistinct_out(PG_FUNCTION_ARGS) + +

Re: Extended Statistics set/restore/clear functions.

2025-11-11 Thread jian he
On Tue, Nov 11, 2025 at 4:07 PM Michael Paquier wrote: > I've rebased the full set using the new structure. 0001~0004 are > clean. 0005~ need more work and analysis, but that's a start. hi. diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index 106583fb2965..e0e9f0468cb8 10064

Re: Extended Statistics set/restore/clear functions.

2025-11-07 Thread Michael Paquier
On Fri, Nov 07, 2025 at 05:28:50PM -0500, Corey Huinker wrote: > I'm open to other formats, but aside from renaming the json keys (maybe > "attnums" or "keys" instead of "attributes"?), I'm not sure what really > could be done and still be JSON. I suppose we could go with a tuple format > like this

Re: Extended Statistics set/restore/clear functions.

2025-11-07 Thread Corey Huinker
> > > Patch 0001 for ndistinct was missing a documentation update, we have > one query in perform.sgml that looks at stxdndistinct. Patch 0003 is > looking OK here as well. > Well spotted. > For dependencies, the format switches from a single json object > with key/vals like that: > "3 => 4": 1

Re: Extended Statistics set/restore/clear functions.

2025-11-06 Thread Michael Paquier
On Wed, Nov 05, 2025 at 01:38:56AM -0500, Corey Huinker wrote: > Paquier's response got sidetracked because of an errant subject line > change, so I will try to recap: That was a typo that found its way into the email subject. Sorry about that, that broke gmail's tracking at least. > in that off

Re: Extended Statistics set/restore/clear functions.[

2025-11-03 Thread Michael Paquier
On Fri, Oct 31, 2025 at 09:22:55PM +0100, Tomas Vondra wrote: > Sorry for not paying much attention to this thread ... It was PGEU last week, it's not surprising knowing that you are in.. Europe :) > My opinion is that we should both use the new format and keep the > pg_dump code to allow upgradi

Re: Extended Statistics set/restore/clear functions.

2025-10-31 Thread Tomas Vondra
On 10/23/25 01:46, Michael Paquier wrote: > On Wed, Oct 22, 2025 at 02:55:31PM +0300, Corey Huinker wrote: >>> Do you have some numbers regarding the increase in size this generates >>> for the catalogs? >> >> Sorry, I don't understand. There shouldn't be any increase inside the >> catalogs as t

Re: Extended Statistics set/restore/clear functions.

2025-10-22 Thread Michael Paquier
On Thu, Oct 23, 2025 at 08:46:27AM +0900, Michael Paquier wrote: > I'd prefer the new format. One killer pushing in favor of the new > format that you are making upthread in favor of is that it makes much > easier the viewing, editing and injecting of these stats. This is missing an "argument", a

Re: Extended Statistics set/restore/clear functions.

2025-10-22 Thread Michael Paquier
On Wed, Oct 22, 2025 at 02:55:31PM +0300, Corey Huinker wrote: >> Do you have some numbers regarding the increase in size this generates >> for the catalogs? > > Sorry, I don't understand. There shouldn't be any increase inside the > catalogs as the internal storage of the datatypes hasn't changed

Re: Extended Statistics set/restore/clear functions.

2025-10-22 Thread Corey Huinker
> > The functions exposed in 0003 should be renamed to match more with the > style of the rest, aka it is a bit hard to figure out what they do at > first sight. Presumably, these should be prefixed with some > "statext_", except text_to_stavalues() which could still be named the > same. > That p

Re: Extended Statistics set/restore/clear functions.

2025-10-21 Thread Michael Paquier
On Sat, Oct 18, 2025 at 08:27:58PM -0400, Corey Huinker wrote: > And rebased again to conform to 688dc6299 and 4bd919129. The format redesign for extended stats is pretty nice, as done in 0001. I think that this patch should be split in two, actually, as it tackles two issues: - One patch for th

Re: Extended Statistics set/restore/clear functions.

2025-10-20 Thread Michael Paquier
On Tue, Oct 21, 2025 at 02:48:37PM +0900, Michael Paquier wrote: > Tomas, what is your take about the format changes and my argument > about the backward requirements of pg_dump (about not dumping these > stats if connecting to a server older than v18, included)? By the way, the patch is still nee

Re: Extended Statistics set/restore/clear functions.

2025-05-29 Thread Corey Huinker
On Mon, Mar 31, 2025 at 1:10 AM Corey Huinker wrote: > Just rebasing. > At pgconf.dev this year, the subject of changing the formats of pg_ndistinct and pg_depdentencies came up again. To recap: presently these datatypes have no working input function, but would need one for statistics import t

Re: Extended Statistics set/restore/clear functions.

2025-01-29 Thread Corey Huinker
On Wed, Jan 29, 2025 at 2:50 AM jian he wrote: > hi. > > select '{"1, 0B100101":"NaN"}'::pg_ndistinct; > pg_ndistinct > > {"1, 37": -2147483648} > (1 row) > I think my initial reaction is to just refuse those special values, but I'll look into the parsing code to

Re: Extended Statistics set/restore/clear functions.

2025-01-29 Thread Corey Huinker
On Tue, Jan 28, 2025 at 11:25 AM jian he wrote: > hi. > I reviewed 0001 only. > > in src/backend/statistics/mvdistinct.c > > no need #include "nodes/pg_list.h" since > src/include/statistics/statistics.h sub level include "nodes/pg_list.h" > > no need #include "utils/palloc.h" > sicne #include "p

Re: Extended Statistics set/restore/clear functions.

2025-01-28 Thread jian he
hi. select '{"1, 0B100101":"NaN"}'::pg_ndistinct; pg_ndistinct {"1, 37": -2147483648} (1 row) this is not what we expected? For the VALUE part of pg_ndistinct, float8 has 3 special values: inf, -inf, NaN. For the key part of pg_ndistinct, see example. select '{"1,

Re: Extended Statistics set/restore/clear functions.

2025-01-28 Thread jian he
hi. I reviewed 0001 only. in src/backend/statistics/mvdistinct.c no need #include "nodes/pg_list.h" since src/include/statistics/statistics.h sub level include "nodes/pg_list.h" no need #include "utils/palloc.h" sicne #include "postgres.h" already included it. select '[{"6, -32768,,": -11}]':

Re: Extended Statistics set/restore/clear functions.

2025-01-27 Thread Corey Huinker
> > I'd like to merge these down to 3 patches again, but I'm keeping them > separate for this patchset to isolate the attnum-checking code for this > go-round. > These are mock-ups of the to/from JSON functions, but building from/to text rather than the not-yet-committed pg_ndistinct and pg_depend

Re: Extended Statistics set/restore/clear functions.

2025-01-23 Thread Corey Huinker
> > I see there's a couple MCV-specific functions in the extended_stats.c. >> Shouldn't those go into mvc.c instead? >> > > I wanted to put it there, but there was a reason I didn't and I've now > forgotten what it was. I'll make an effort to relocate it to mcv.c. > Looking at it now, I see that c

Re: Extended Statistics set/restore/clear functions.

2025-01-23 Thread Corey Huinker
> > > > > * no negative attnums in key list > Disregard this suggestion - negative attnums mean the Nth expression in the extended stats object, though it boggles the mind how we could have 222 expressions... > > * no duplicate attnums in key list > This one is still live, am considering. At

Re: Extended Statistics set/restore/clear functions.

2025-01-23 Thread Tomas Vondra
On 1/23/25 15:51, Corey Huinker wrote: > On Wed, Jan 22, 2025 at 5:50 PM Tomas Vondra > wrote: > > Hi, > > Thanks for continuing to work on this. > > On 1/22/25 19:17, Corey Huinker wrote: > > This is a separate thread for work started in [1] but focus

Re: Extended Statistics set/restore/clear functions.

2025-01-23 Thread Corey Huinker
On Wed, Jan 22, 2025 at 5:50 PM Tomas Vondra wrote: > Hi, > > Thanks for continuing to work on this. > > On 1/22/25 19:17, Corey Huinker wrote: > > This is a separate thread for work started in [1] but focused purely on > > getting the following functions working: > > > > * pg_set_extended_stats

Re: Extended Statistics set/restore/clear functions.

2025-01-22 Thread Tomas Vondra
Hi, Thanks for continuing to work on this. On 1/22/25 19:17, Corey Huinker wrote: > This is a separate thread for work started in [1] but focused purely on > getting the following functions working: > > * pg_set_extended_stats > * pg_clear_extended_stats > * pg_restore_extended_stats > > These