Hi!
On Thu, Dec 6, 2012 at 5:44 AM, Tomas Vondra <[email protected]> wrote:
> Then I've run a simple benchmarking script, and the results are not as
> good as I expected, actually I'm getting much worse performance than
> with the original GIN index.
>
> The following table contains the time of loading the data (not a big
> difference), and number of queries per minute for various number of
> words in the query.
>
> The queries look like this
>
> SELECT id FROM messages
> WHERE body_tsvector @@ plainto_tsquery('english', 'word1 word2 ...')
>
> so it's really the simplest form of FTS query possible.
>
>              without patch | with patch
> --------------------------------------------
> loading            750 sec | 770 sec
> 1 word                1500 | 1100
> 2 words              23000 | 9800
> 3 words              24000 | 9700
> 4 words              16000 | 7200
> --------------------------------------------
>
> I'm not saying this is a perfect benchmark, but the differences (of
> querying) are pretty huge. Not sure where this difference comes from,
> but it seems to be quite consistent (I usually get +-10% results, which
> is negligible considering the huge difference).
>
> Is this an expected behaviour that will be fixed by another patch?
>
Other patches that significantly accelerate index search will be
provided. This patch changes only the storage of GIN posting lists/trees.
However, this patch was not expected to change index scan speed
significantly in either direction.
> The database contains ~680k messages from the mailing list archives,
> i.e. about 900 MB of data (in the table), and the GIN index on tsvector
> is about 900MB too. So the whole dataset nicely fits into memory (8GB
> RAM), and it seems to be completely CPU bound (no I/O activity at all).
>
> The configuration was exactly the same in both cases
>
> shared_buffers = 1GB
> work_mem = 64MB
> maintenance_work_mem = 256MB
>
> I can either upload the database somewhere, or provide the benchmarking
> script if needed.
Unfortunately, I can't reproduce such a huge slowdown on my test cases.
Could you share both the database and the benchmarking script?
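
For reference, here is a minimal sketch that turns the queries-per-minute
figures quoted above into relative throughput, so the size of the slowdown
is easier to compare across query lengths. The numbers are copied from the
quoted table; the names `results` and `relative_throughput` are my own and
are not taken from the benchmarking script.

```python
# Throughput in queries per minute, copied from the table quoted above:
# (without patch, with patch). Names here are illustrative assumptions.
results = {
    "1 word": (1500, 1100),
    "2 words": (23000, 9800),
    "3 words": (24000, 9700),
    "4 words": (16000, 7200),
}


def relative_throughput(without_patch, with_patch):
    """Return patched throughput as a fraction of unpatched throughput."""
    return with_patch / without_patch


ratios = {label: relative_throughput(*pair) for label, pair in results.items()}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.0%} of unpatched throughput")
```

By this measure the patched index runs at roughly 40-45% of the unpatched
throughput for multi-word queries and at about 73% for single-word queries.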
------
With best regards,
Alexander Korotkov.