hello ...
i got that one ...
"idx_product_t_product_titleprice" gist (to_tsvector('german'::regconfig,
title), int_price)
so, i have a combined index on text + number.
to me the plan seems fine ... it looks like a prober KNN traversal.
the difference between my plan and your plan seems to be the fact that i have,
say, 1 mio rows which have "handy" or so in it (1 mio out of 11 mio or so). you
are moving out from one specific place.
my maths is like that:
11 mio in total
1 mio matching "iphone"
cheapest / most expensive 10 out of this mio needed.
operator classes are all nice and in place:
SELECT 10 <-> 4 as distance;
distance
----------
6
(1 row)
what does "buffers true" in your case say?
many thanks,
hans
On Apr 8, 2011, at 3:22 PM, Oleg Bartunov wrote:
> Probably, you miss two-columnt index. From my early post:
> http://www.sai.msu.su/~megera/wiki/knngist
>
> =# CREATE INDEX spots_idx ON spots USING knngist (coordinates,
> to_tsvector('french',address));
> =# SELECT id, address, (coordinates <->
> '(2.29470491409302,48.858263472125)'::point) AS dist FROM spots WHERE
> coordinates >< '(2.29470491409302,48.858263472125)'::point AND
> to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10;
> id | address |
> dist
> ---------+-------------------------------------------------------------+---------------------
> 366096 | 1st Floor Tour Eiffel | Champs de Mars, Paris 75007, France |
> 2.32488941293945e-05
> 4356328 | r Champ de Mars 75007 PARIS |
> 0.00421854756964406
> 5200167 | Champ De Mars 75007 Paris |
> 0.00453564562587288
> 9301676 | Champ de Mars, 75007 Paris, |
> 0.00453564562587288
> 2152213 | 16, ave Rapp, Champ de Mars, Tour Eiffel, Paris, France |
> 0.00624152097590896
> 1923818 | Champ de Mars Paris, France |
> 0.00838214733539654
> 5165953 | 39 Rue Champ De Mars Paris, France |
> 0.00874410234569529
> 7395870 | 39 Rue Champ De Mars Paris, France |
> 0.00874410234569529
> 4358671 | 32 Rue Champ De Mars Paris, France |
> 0.00876089659276339
> 1923742 | 12 rue du Champ de Mars Paris, France |
> 0.00876764731845995
> (10 rows)
>
> Time: 7.859 ms
>
> =# EXPLAIN (COSTS OFF) SELECT id, address FROM spots WHERE coordinates ><
> '(2.29470491409302,48.858263472125)'::point
> AND to_tsvector('french',address) @@ to_tsquery('french','mars') LIMIT 10;
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------
> Limit
> -> Index Scan using spots_idx on spots
> Index Cond: ((coordinates ><
> '(2.29470491409302,48.858263472125)'::point) AND
> (to_tsvector('french'::regconfig, address) @@ '''mar'''::tsquery))
> (3 rows)
>
>
> On Fri, 8 Apr 2011, PostgreSQL - Hans-J?rgen Sch?nig wrote:
>
>> hello all ...
>>
>> given oleg's posting before i also wanted to fire up some KNN related
>> question.
>> let us consider a simple example. i got some million lines and i want all
>> rows matching a tsquery sorted by price.
>> i did some tests:
>>
>> test=# explain (analyze true, buffers true, costs true) SELECT id FROM
>> product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german',
>> 'iphone') ORDER BY int_price <-> 0 LIMIT 10;
>>
>> QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>> --------------------------------------
>> Limit (cost=0.00..41.11 rows=10 width=16) (actual time=36391.717..45542.590
>> rows=10 loops=1)
>> Buffers: shared hit=9 read=5004
>> -> Index Scan using idx_product_t_product_titleprice on t_product
>> (cost=0.00..13251.91 rows=3224 width=16) (actual time=
>> 36391.715..45542.573 rows=10 loops=1)
>> Index Cond: (to_tsvector('german'::regconfig, title) @@
>> '''iphon'''::tsquery)
>> Order By: (int_price <-> 0::bigint)
>> Buffers: shared hit=9 read=5004
>> Total runtime: 45542.676 ms
>> (7 rows)
>>
>> test=# explain (analyze true, buffers true, costs true) SELECT id FROM
>> product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german',
>> 'handy') ORDER BY int_price <-> 0 LIMIT 10;
>>
>> QUERY PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>> -------------------------------------
>> Limit (cost=0.00..41.03 rows=10 width=16) (actual time=7243.526..10935.227
>> rows=10 loops=1)
>> Buffers: shared hit=3 read=2316
>> -> Index Scan using idx_product_t_product_titleprice on t_product
>> (cost=0.00..29762.61 rows=7255 width=16) (actual time=
>> 7243.524..10935.217 rows=10 loops=1)
>> Index Cond: (to_tsvector('german'::regconfig, title) @@
>> '''handy'''::tsquery)
>> Order By: (int_price <-> 0::bigint)
>> Buffers: shared hit=3 read=2316
>> Total runtime: 10935.265 ms
>> (7 rows)
>>
>> test=# explain (analyze true, buffers true, costs true) SELECT id FROM
>> product.t_product WHERE to_tsvector('german', title) @@ to_tsquery('german',
>> 'handy') ORDER BY int_price <-> 0 LIMIT 1;
>> QUERY
>> PLAN
>>
>> -----------------------------------------------------------------------------------------------------------------------------
>> -------------------------------
>> Limit (cost=0.00..4.10 rows=1 width=16) (actual time=28.527..28.528 rows=1
>> loops=1)
>> Buffers: shared hit=1 read=1577
>> -> Index Scan using idx_product_t_product_titleprice on t_product
>> (cost=0.00..29762.61 rows=7255 width=16) (actual time=
>> 28.525..28.525 rows=1 loops=1)
>> Index Cond: (to_tsvector('german'::regconfig, title) @@
>> '''handy'''::tsquery)
>> Order By: (int_price <-> 0::bigint)
>> Buffers: shared hit=1 read=1577
>> Total runtime: 28.558 ms
>> (7 rows)
>>
>>
>> under any circumstances - there is no way to reduce the number of buffers
>> needed for a query like that.
>> if everything is cached this is still ok but as soon as you have to take a
>> single block from disk you will die a painful random I/O death.
>> is there any alternative which does not simply die when i try to achieve
>> what i want?
>>
>> the use case is quite simple: all products with a certain word (10 cheapest
>> or so).
>>
>> is there any alternative approach to this?
>> i was putting some hope into KNN but it seems it needs too much random I/O
>> :(.
>>
>> many thanks,
>>
>> hans
>>
>> --
>> Cybertec Sch?nig & Sch?nig GmbH
>> Gr?hrm?hlgasse 26
>> A-2700 Wiener Neustadt, Austria
>> Web: http://www.postgresql-support.de
>>
>>
>>
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: [email protected], http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>
> --
> Sent via pgsql-hackers mailing list ([email protected])
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers