Re: Statistics Import and Export

Corey Huinker Tue, 26 Mar 2024 23:27:49 -0700

>
> 1) The docs say this:
>
>   <para>
>    The purpose of this function is to apply statistics values in an
>    upgrade situation that are "good enough" for system operation until
>    they are replaced by the next <command>ANALYZE</command>, usually via
>    <command>autovacuum</command> This function is used by
>    <command>pg_upgrade</command> and <command>pg_restore</command> to
>    convey the statistics from the old system version into the new one.
>   </para>
>
> I find this a bit confusing, considering the pg_dump/pg_restore changes
> are only in 0002, not in this patch.
>


True, I'll split the docs.


>
> 2) Also, I'm not sure about this:
>
>   <parameter>relation</parameter>, the parameters in this are all
>   derived from <structname>pg_stats</structname>, and the values
>   given are most often extracted from there.
>
> How do we know where do the values come from "most often"? I mean, where
> else would it come from?
>

The next most likely sources would be 1. stats from another similar table
and 2. the imagination of a user testing hypothetical query plans.


>
> 3) The function pg_set_attribute_stats() is veeeeery long - 1000 lines
> or so, that's way too many for me to think about. I agree the flow is
> pretty simple, but I still wonder if there's a way to maybe split it
> into some smaller "meaningful" steps.
>

I wrestle with that myself. I think there's some pieces that can be
factored out.


> 4) It took me *ages* to realize the enums at the beginning of some of
> the functions are actually indexes of arguments in PG_FUNCTION_ARGS.
> That'd surely deserve a comment explaining this.
>

My apologies, it definitely deserves a comment.


>
> 5) The comment for param_names in pg_set_attribute_stats says this:
>
>     /* names of columns that cannot be null */
>     const char *param_names[] = { ... }
>
> but isn't that actually incorrect? I think that applies only to a couple
> initial arguments, but then other fields (MCV, mcelem stats, ...) can be
> NULL, right?
>

Yes, that is vestigial, I'll remove it.


>
> 6) There's a couple minor whitespace fixes or comments etc.
>
>
> 0002
> ----
>
> 1) I don't understand why we have exportExtStatsSupported(). Seems
> pointless - nothing calls it, even if it did we don't know how to export
> the stats.
>

It's not strictly necessary.


>
> 2) I think this condition in dumpTableSchema() is actually incorrect:
>
> IMO that's wrong, the statistics should be delayed to the post-data
> section. Which probably means there needs to be a separate dumpable
> object for statistics on table/index, with a dependency on the object.
>

Good points.


>
> 3) I don't like the "STATS IMPORT" description. For extended statistics
> we dump the definition as "STATISTICS" so why to shorten it to "STATS"
> here? And "IMPORT" seems more like the process of loading data, not the
> data itself. So I suggest "STATISTICS DATA".
>

+1

Re: Statistics Import and Export

Reply via email to