On 12/12/16 22:50, Tomas Vondra wrote:
> On 12/12/2016 12:26 PM, Amit Langote wrote:
>>
>> Hi Tomas,
>>
>> On 2016/10/30 4:23, Tomas Vondra wrote:
>>> Hi,
>>>
>>> Attached is v20 of the multivariate statistics patch series, doing
>>> mostly
>>> the changes outlined in the preceding e-mail from October 11.
>>>
>>> The patch series currently has these parts:
>>>
>>> * 0001 : (FIX) teach pull_varno about RestrictInfo
>>> * 0002 : (PATCH) shared infrastructure and ndistinct coefficients
Hi,
I went over these two (IMHO those could easily be considered as minimal
committable set even if the user visible functionality they provide is
rather limited).
> dropping statistics
> -------------------
>
> The statistics may be dropped automatically using DROP STATISTICS.
>
> After ALTER TABLE ... DROP COLUMN, statistics referencing are:
>
> (a) dropped, if the statistics would reference only one column
>
> (b) retained, but modified on the next ANALYZE
This should be documented in user visible form if you plan to keep it
(it does make sense to me).
> + therefore perfectly correlated. Providing additional information about
> + correlation between columns is the purpose of multivariate statistics,
> + and the rest of this section thoroughly explains how the planner
> + leverages them to improve estimates.
> + </para>
> +
> + <para>
> + For additional details about multivariate statistics, see
> + <filename>src/backend/utils/mvstats/README.stats</>. There are additional
> + <literal>READMEs</> for each type of statistics, mentioned in the
> following
> + sections.
> + </para>
> +
> + </sect1>
I don't think this qualifies as "thoroughly explains" ;)
> +
> +Oid
> +get_statistics_oid(List *names, bool missing_ok)
No comment?
> + case OBJECT_STATISTICS:
> + msg = gettext_noop("statistics \"%s\" does not exist,
> skipping");
> + name = NameListToString(objname);
> + break;
This sounds somewhat weird (plural vs singular).
> + * XXX Maybe this should check for duplicate stats. Although it's not clear
> + * what "duplicate" would mean here (wheter to compare only keys or also
> + * options). Moreover, we don't do such checks for indexes, although those
> + * store tuples and recreating a new index may be a way to fix bloat (which
> + * is a problem statistics don't have).
> + */
> +ObjectAddress
> +CreateStatistics(CreateStatsStmt *stmt)
I don't think we should check duplicates TBH so I would remove the XXX
(also "wheter" is typo but if you remove that paragraph it does not matter).
> + if (true)
> + {
Huh?
> +
> +List *
> +RelationGetMVStatList(Relation relation)
> +{
...
> +
> +void
> +update_mv_stats(Oid mvoid, MVNDistinct ndistinct,
> + int2vector *attrs, VacAttrStats **stats)
...
> +static double
> +ndistinct_for_combination(double totalrows, int numrows, HeapTuple *rows,
> + int2vector *attrs, VacAttrStats **stats,
> + int k, int *combination)
> +{
Again, these deserve comment.
I'll try to look at other patches in the series as time permits.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers