Reduce memory usage of tsvector type analyze function. compute_tsvector_stats() detoasted and kept in memory every tsvector value in the sample, but that can be a lot of memory. The original bug report described a case using over 10 gigabytes, with statistics target of 10000 (the maximum).
To fix, allocate a separate copy of just the lexemes that we keep around, and free the detoasted tsvector values as we go. This adds some palloc/pfree overhead, when you have a lot of distinct lexemes in the sample, but it's better than running out of memory. Fixes bug #14654 reported by James C. Reviewed by Tom Lane. Backport to all supported versions. Discussion: https://www.postgresql.org/message-id/20170514200602.1451.46...@wrigleys.postgresql.org Branch ------ REL9_6_STABLE Details ------- https://git.postgresql.org/pg/commitdiff/bbeec3c749bcbd9b75fc1f036979fed516f9a2c8 Modified Files -------------- src/backend/tsearch/ts_typanalyze.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-) -- Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-committers