Hello,
I got my first hint of why this problem occurs when I looked at the
statistics for the column in question, "instrument_ref". With
default_statistics_target = 500 and ANALYZE freshly run, the
statistics claimed:
select n_distinct from pg_stats where attname like 'instr%_ref';
-- Result: 40,000

select count(distinct instrument_ref) from bigtable;
-- Result: 33,385,922 (!!)
That is an astonishing difference: the estimate is off by a factor of
about 835 (33,385,922 / 40,000), i.e. almost 1000X.
I have tried increasing the statistics target to 5000, and it helps,
but it only reduces the error to about 100X. Still far too high.
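For reference, the target can also be raised for just this one column
instead of globally; a minimal sketch using the table and column names
from above (ANALYZE must be re-run for the new target to take effect):

alter table bigtable alter column instrument_ref set statistics 5000;
analyze bigtable;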
As far as I know, increasing default_statistics_target will not help. [1]
I have considered these fixes:
- hardcode n_distinct as a fixed ratio of the total number of rows
You can hardcode the number of distinct values as a ratio of the row
count (a negative n_distinct is interpreted as -n_distinct * reltuples):

ALTER TABLE bigtable ALTER COLUMN instrument_ref SET ( n_distinct = -0.067
); /* -1 * (33385922 / 500000000) ≈ -0.067 */
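Note that the override is only picked up at the next ANALYZE. A quick
sketch to apply and verify it (same table and column names as above;
pg_stats should then report the hardcoded value):

ANALYZE bigtable;

SELECT n_distinct
  FROM pg_stats
 WHERE tablename = 'bigtable' AND attname = 'instrument_ref';
-- expected: -0.067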
[1]
https://www.postgresql.org/message-id/4136ffa0812111823u645b6ec9wdca60b3da4b00499%40mail.gmail.com
-----
Pavel Luzanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company