Re: [HACKERS] WIP: store additional info in GIN index

Alexander Korotkov Sat, 22 Dec 2012 08:16:39 -0800

Hi!

On Thu, Dec 6, 2012 at 5:44 AM, Tomas Vondra <t...@fuzzy.cz> wrote:


> Then I've run a simple benchmarking script, and the results are not as
> good as I expected, actually I'm getting much worse performance than
> with the original GIN index.
>
> The following table contains the time of loading the data (not a big
> difference), and number of queries per minute for various number of
> words in the query.
>
> The queries look like this:
>
> SELECT id FROM messages
>  WHERE body_tsvector @@ plainto_tsquery('english', 'word1 word2 ...')
>
> so it's really the simplest form of FTS query possible.
>
>            without patch |      with patch
> --------------------------------------------
> loading       750 sec    |         770 sec
> 1 word           1500    |            1100
> 2 words         23000    |            9800
> 3 words         24000    |            9700
> 4 words         16000    |            7200
> --------------------------------------------
>
> I'm not saying this is a perfect benchmark, but the differences (of
> querying) are pretty huge. Not sure where this difference comes from,
> but it seems to be quite consistent (I usually get +-10% results, which
> is negligible considering the huge difference).
>
> Is this an expected behaviour that will be fixed by another patch?
>

Further patches that significantly accelerate index search will follow.
This patch changes only the storage of GIN posting lists/trees. It was not
expected to change index scan speed significantly in either direction.
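
To put the quoted numbers in perspective, here is a minimal sketch that
computes the relative change in queries per minute for each row of the
table above. The figures are copied from the quoted benchmark; the names
qpm, relative_change and changes are only illustrative and are not part of
the patch or of the benchmarking script.

```python
# Queries per minute from the quoted table: (without patch, with patch).
qpm = {
    "1 word": (1500, 1100),
    "2 words": (23000, 9800),
    "3 words": (24000, 9700),
    "4 words": (16000, 7200),
}


def relative_change(before, after):
    """Return the change from before to after as a percentage of before."""
    return (after - before) / before * 100.0


# Hypothetical helper dict: label -> percentage change in throughput.
changes = {label: relative_change(b, a) for label, (b, a) in qpm.items()}

for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}% queries per minute")
```

Every row shows a drop well beyond the +-10% run-to-run variation mentioned
in the report, so the slowdown is unlikely to be measurement noise.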

> The database contains ~680k messages from the mailing list archives,
> i.e. about 900 MB of data (in the table), and the GIN index on tsvector
> is about 900MB too. So the whole dataset nicely fits into memory (8GB
> RAM), and it seems to be completely CPU bound (no I/O activity at all).
>
> The configuration was exactly the same in both cases
>
>     shared_buffers = 1GB
>     work_mem = 64 MB
>     maintenance_work_mem = 256 MB
>
> I can either upload the database somewhere, or provide the benchmarking
> script if needed.


Unfortunately, I can't reproduce such a huge slowdown on my test cases. Could
you share both the database and the benchmarking script?

------
With best regards,
Alexander Korotkov.
