Jan Urbański <j.urban...@students.mimuw.edu.pl> writes:
> Tom Lane wrote:
>> I came across this bit in ts_typanalyze.c:
>>
>>     /* We want statistic_target * 100 lexemes in the MCELEM array */
>>     num_mcelem = stats->attr->attstattarget * 100;
>>
>> I wonder whether the multiplier here should be changed?

> The origin of that bit is this post:
> http://archives.postgresql.org/pgsql-hackers/2008-07/msg00556.php
> and the following few downthread ones.
> If we bump the default statistics target 10 times, then changing the
> multiplier to 10 seems the right thing to do.

OK, will do.

> Only thing that needs
> caution is the frequency of pruning we do in the Lossy Counting
> algorithm, that IIRC is correlated with the desired target length of the
> MCELEM array.

Right below that we have

    /*
     * We set bucket width equal to the target number of result lexemes.
     * This is probably about right but perhaps might need to be scaled
     * up or down a bit?
     */
    bucket_width = num_mcelem;

so it should track automatically.  AFAICS the argument in the above
thread that this is an appropriate pruning distance holds good
regardless of just how we obtain the target mcelem count.

> BTW: I've been occupied with other things and might have missed some
> discussions, but at some point it has been considered to use Lossy
> Counting to gather statistics from regular columns, not only tsvectors.
> Wouldn't this help the performance hit ANALYZE takes from upping
> default_stats_target?

Perhaps, but it's not likely to get done for 8.4 ...

			regards, tom lane
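
For illustration only, here is a minimal standalone sketch of how the
pruning distance follows the target element count in a Lossy Counting
pass. It is not the code in ts_typanalyze.c: the names LcEntry, LcState,
target_mcelem, lc_init, lc_prune, lc_add and lc_freq are hypothetical,
integers stand in for lexemes, and a small fixed array stands in for the
hash table. The targets 10 and 100 used in the note below are assumed
old and new defaults.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration types; not the structures in ts_typanalyze.c. */
typedef struct
{
    int item;   /* element id, standing in for a lexeme */
    int freq;   /* occurrences counted since the entry was inserted */
    int delta;  /* maximum number of occurrences missed before insertion */
} LcEntry;

#define LC_CAPACITY 1024    /* arbitrary limit for this sketch */

typedef struct
{
    LcEntry entries[LC_CAPACITY];
    size_t  count;          /* number of tracked entries */
    int     bucket_width;   /* elements per bucket, i.e. pruning distance */
    int     b_current;      /* current bucket number, starting at 1 */
    long    seen;           /* elements processed so far */
} LcState;

/* Target array length: statistics target times a multiplier. */
int target_mcelem(int stats_target, int multiplier)
{
    return stats_target * multiplier;
}

void lc_init(LcState *s, int bucket_width)
{
    assert(bucket_width > 0);
    s->count = 0;
    s->bucket_width = bucket_width;
    s->b_current = 1;
    s->seen = 0;
}

/* Drop every entry whose freq + delta does not exceed the bucket number. */
void lc_prune(LcState *s)
{
    size_t kept = 0;

    for (size_t i = 0; i < s->count; i++)
        if (s->entries[i].freq + s->entries[i].delta > s->b_current)
            s->entries[kept++] = s->entries[i];
    s->count = kept;
}

/* Count one element; prune whenever a bucket has been filled. */
void lc_add(LcState *s, int item)
{
    size_t i;

    for (i = 0; i < s->count; i++)
        if (s->entries[i].item == item)
        {
            s->entries[i].freq++;
            break;
        }
    if (i == s->count)
    {
        /* New element: it may have been missed in earlier buckets. */
        assert(s->count < LC_CAPACITY);
        s->entries[s->count++] = (LcEntry){ item, 1, s->b_current - 1 };
    }

    s->seen++;
    if (s->seen % s->bucket_width == 0)
    {
        lc_prune(s);
        s->b_current++;
    }
}

/* Tracked frequency of an element, or 0 if it is not (or no longer) tracked. */
int lc_freq(const LcState *s, int item)
{
    for (size_t i = 0; i < s->count; i++)
        if (s->entries[i].item == item)
            return s->entries[i].freq;
    return 0;
}
```

In this sketch, target_mcelem(10, 100) and target_mcelem(100, 10) both
give 1000, so passing that value to lc_init as the bucket width leaves
the pruning distance unchanged when the target goes up tenfold and the
multiplier drops to 10.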