Re: [GENERAL] Why is this query not using GIN index?
Hey guys, I'm trying to understand the performance impact of "Index Recheck", I googled for Index Recheck, but didn't find much details about it, where can I know more about it? And how did you know the performance is being significantly hurt by inadequate work_mem? I'm running PG 9.6.1, built from source. On Mon, Nov 14, 2016 at 2:51 AM, Tom Lanewrote: > Oleg Bartunov writes: >> On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewis >>> It takes 500ms with 10m rows, could it be faster? > >> sure. Recheck with function call is pretty expensive, so I'd not recommend >> to create functional index, just create separate column of type tsvector >> (materialize to_tsvector) and create gin index on it. You should surprise. > > I doubt it'll help that much --- more than half the time is going into the > bitmap indexscan, and with over 1m candidate matches, there's no way > that's going to be super cheap. > > I wonder whether a gist index would be better here, since it would support > a plain indexscan which should require scanning much less of the index > given the small LIMIT. > > (Materializing the tsvector would probably help for gist, too, by reducing > the cost of lossy-index rechecks.) > > BTW, it still looks like the performance is being significantly hurt by > inadequate work_mem. > > regards, tom lane -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query not using GIN index?
Oleg Bartunovwrites: > On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewis >> It takes 500ms with 10m rows, could it be faster? > sure. Recheck with function call is pretty expensive, so I'd not recommend > to create functional index, just create separate column of type tsvector > (materialize to_tsvector) and create gin index on it. You should surprise. I doubt it'll help that much --- more than half the time is going into the bitmap indexscan, and with over 1m candidate matches, there's no way that's going to be super cheap. I wonder whether a gist index would be better here, since it would support a plain indexscan which should require scanning much less of the index given the small LIMIT. (Materializing the tsvector would probably help for gist, too, by reducing the cost of lossy-index rechecks.) BTW, it still looks like the performance is being significantly hurt by inadequate work_mem. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query not using GIN index?
On Sun, Nov 13, 2016 at 6:05 PM, Aaron Lewiswrote: > Sigh, didn't notice that. Thanks for the heads up. > > It takes 500ms with 10m rows, could it be faster? > sure. Recheck with function call is pretty expensive, so I'd not recommend to create functional index, just create separate column of type tsvector (materialize to_tsvector) and create gin index on it. You should surprise. > I've increased work_mem to 256MB > > test=# explain analyze select * from mytable where > to_tsvector('english', title) @@ 'x264'::tsquery limit 1000 offset 10; > QUERY PLAN > > > --- > Limit (cost=684.06..2949.42 rows=1000 width=83) (actual > time=348.506..536.483 rows=1000 loops=1) >-> Bitmap Heap Scan on mytable (cost=661.41..158917.22 rows=69859 > width=83) (actual time=345.354..536.199 rows=1010 loops=1) > Recheck Cond: (to_tsvector('english'::regconfig, title) @@ > '''x264'''::tsquery) > Rows Removed by Index Recheck: 12242 > Heap Blocks: exact=20 lossy=186 > -> Bitmap Index Scan on name_fts (cost=0.00..643.95 > rows=69859 width=0) (actual time=333.703..333.703 rows=1044673 > loops=1) >Index Cond: (to_tsvector('english'::regconfig, title) > @@ '''x264'''::tsquery) > Planning time: 0.144 ms > Execution time: 537.212 ms > (9 rows) > > On Sun, Nov 13, 2016 at 10:33 PM, Julien Rouhaud > wrote: > > On 13/11/2016 15:26, Aaron Lewis wrote: > >> Hi Oleg, > >> > >> Can you elaborate on the title column? I don't get it. > >> > > > create table mytable(hash char(40), title varchar(500)); > create index name_fts on mytable using gin(to_tsvector('english', > 'title')); > > > > You created an index on the text 'title', not on the title column, so > > the index is useless. > > > > Drop the existing index and create this one instead: > > > > create index name_fts on mytable using gin(to_tsvector('english', > title)); > > > >> On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov > wrote: > >>> > >>> > >>> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis < > the.warl0ck.1...@gmail.com> > >>> wrote: > > I have a simple table, and a gin index, > > create table mytable(hash char(40), title varchar(500)); > create index name_fts on mytable using gin(to_tsvector('english', > 'title')); > >>> > >>> > >>> > >>> ^ > >>> > > create unique index md5_uniq_idx on mytable(hash); > > When I execute a query with tsquery, the GIN index was not in use: > > test=# explain analyze select * from mytable where > to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; > QUERY PLAN > > > > Limit (cost=0.00..277.35 rows=10 width=83) (actual > time=0.111..75.549 rows=10 loops=1) > -> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) > (actual time=0.110..75.546 rows=10 loops=1) > Filter: (to_tsvector('english'::regconfig, (title)::text) @@ > '''abc'' | ''def'''::tsquery) > Rows Removed by Filter: 10221 > Planning time: 0.176 ms > Execution time: 75.564 ms > (6 rows) > > Any ideas? > > > > > -- > > Julien Rouhaud > > http://dalibo.com - http://dalibo.org > > > > -- > Best Regards, > Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ > Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 >
Re: [GENERAL] Why is this query not using GIN index?
Sigh, didn't notice that. Thanks for the heads up. It takes 500ms with 10m rows, could it be faster? I've increased work_mem to 256MB test=# explain analyze select * from mytable where to_tsvector('english', title) @@ 'x264'::tsquery limit 1000 offset 10; QUERY PLAN --- Limit (cost=684.06..2949.42 rows=1000 width=83) (actual time=348.506..536.483 rows=1000 loops=1) -> Bitmap Heap Scan on mytable (cost=661.41..158917.22 rows=69859 width=83) (actual time=345.354..536.199 rows=1010 loops=1) Recheck Cond: (to_tsvector('english'::regconfig, title) @@ '''x264'''::tsquery) Rows Removed by Index Recheck: 12242 Heap Blocks: exact=20 lossy=186 -> Bitmap Index Scan on name_fts (cost=0.00..643.95 rows=69859 width=0) (actual time=333.703..333.703 rows=1044673 loops=1) Index Cond: (to_tsvector('english'::regconfig, title) @@ '''x264'''::tsquery) Planning time: 0.144 ms Execution time: 537.212 ms (9 rows) On Sun, Nov 13, 2016 at 10:33 PM, Julien Rouhaudwrote: > On 13/11/2016 15:26, Aaron Lewis wrote: >> Hi Oleg, >> >> Can you elaborate on the title column? I don't get it. >> > create table mytable(hash char(40), title varchar(500)); create index name_fts on mytable using gin(to_tsvector('english', 'title')); > > You created an index on the text 'title', not on the title column, so > the index is useless. > > Drop the existing index and create this one instead: > > create index name_fts on mytable using gin(to_tsvector('english', title)); > >> On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunov wrote: >>> >>> >>> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis >>> wrote: I have a simple table, and a gin index, create table mytable(hash char(40), title varchar(500)); create index name_fts on mytable using gin(to_tsvector('english', 'title')); >>> >>> >>> >>> ^ >>> create unique index md5_uniq_idx on mytable(hash); When I execute a query with tsquery, the GIN index was not in use: test=# explain analyze select * from mytable where to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; QUERY PLAN Limit (cost=0.00..277.35 rows=10 width=83) (actual time=0.111..75.549 rows=10 loops=1) -> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) (actual time=0.110..75.546 rows=10 loops=1) Filter: (to_tsvector('english'::regconfig, (title)::text) @@ '''abc'' | ''def'''::tsquery) Rows Removed by Filter: 10221 Planning time: 0.176 ms Execution time: 75.564 ms (6 rows) Any ideas? > > -- > Julien Rouhaud > http://dalibo.com - http://dalibo.org -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query not using GIN index?
On 13/11/2016 15:26, Aaron Lewis wrote: > Hi Oleg, > > Can you elaborate on the title column? I don't get it. > >>> create table mytable(hash char(40), title varchar(500)); >>> create index name_fts on mytable using gin(to_tsvector('english', >>> 'title')); You created an index on the text 'title', not on the title column, so the index is useless. Drop the existing index and create this one instead: create index name_fts on mytable using gin(to_tsvector('english', title)); > On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunovwrote: >> >> >> On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis >> wrote: >>> >>> I have a simple table, and a gin index, >>> >>> create table mytable(hash char(40), title varchar(500)); >>> create index name_fts on mytable using gin(to_tsvector('english', >>> 'title')); >> >> >> >> ^ >> >>> >>> create unique index md5_uniq_idx on mytable(hash); >>> >>> When I execute a query with tsquery, the GIN index was not in use: >>> >>> test=# explain analyze select * from mytable where >>> to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; >>> QUERY PLAN >>> >>> >>> Limit (cost=0.00..277.35 rows=10 width=83) (actual >>> time=0.111..75.549 rows=10 loops=1) >>>-> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) >>> (actual time=0.110..75.546 rows=10 loops=1) >>> Filter: (to_tsvector('english'::regconfig, (title)::text) @@ >>> '''abc'' | ''def'''::tsquery) >>> Rows Removed by Filter: 10221 >>> Planning time: 0.176 ms >>> Execution time: 75.564 ms >>> (6 rows) >>> >>> Any ideas? >>> -- Julien Rouhaud http://dalibo.com - http://dalibo.org -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query not using GIN index?
Hi Oleg, Can you elaborate on the title column? I don't get it. On Sun, Nov 13, 2016 at 10:10 PM, Oleg Bartunovwrote: > > > On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewis > wrote: >> >> I have a simple table, and a gin index, >> >> create table mytable(hash char(40), title varchar(500)); >> create index name_fts on mytable using gin(to_tsvector('english', >> 'title')); > > > > ^ > >> >> create unique index md5_uniq_idx on mytable(hash); >> >> When I execute a query with tsquery, the GIN index was not in use: >> >> test=# explain analyze select * from mytable where >> to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; >> QUERY PLAN >> >> >> Limit (cost=0.00..277.35 rows=10 width=83) (actual >> time=0.111..75.549 rows=10 loops=1) >>-> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) >> (actual time=0.110..75.546 rows=10 loops=1) >> Filter: (to_tsvector('english'::regconfig, (title)::text) @@ >> '''abc'' | ''def'''::tsquery) >> Rows Removed by Filter: 10221 >> Planning time: 0.176 ms >> Execution time: 75.564 ms >> (6 rows) >> >> Any ideas? >> >> >> -- >> Best Regards, >> Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ >> Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Why is this query not using GIN index?
On Sun, Nov 13, 2016 at 2:50 PM, Aaron Lewiswrote: > I have a simple table, and a gin index, > > create table mytable(hash char(40), title varchar(500)); > create index name_fts on mytable using gin(to_tsvector('english', > 'title')); > ^ > create unique index md5_uniq_idx on mytable(hash); > > When I execute a query with tsquery, the GIN index was not in use: > > test=# explain analyze select * from mytable where > to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; > QUERY PLAN > > > Limit (cost=0.00..277.35 rows=10 width=83) (actual > time=0.111..75.549 rows=10 loops=1) >-> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) > (actual time=0.110..75.546 rows=10 loops=1) > Filter: (to_tsvector('english'::regconfig, (title)::text) @@ > '''abc'' | ''def'''::tsquery) > Rows Removed by Filter: 10221 > Planning time: 0.176 ms > Execution time: 75.564 ms > (6 rows) > > Any ideas? > > > -- > Best Regards, > Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ > Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
[GENERAL] Why is this query not using GIN index?
I have a simple table, and a gin index, create table mytable(hash char(40), title varchar(500)); create index name_fts on mytable using gin(to_tsvector('english', 'title')); create unique index md5_uniq_idx on mytable(hash); When I execute a query with tsquery, the GIN index was not in use: test=# explain analyze select * from mytable where to_tsvector('english', title) @@ 'abc | def'::tsquery limit 10; QUERY PLAN Limit (cost=0.00..277.35 rows=10 width=83) (actual time=0.111..75.549 rows=10 loops=1) -> Seq Scan on mytable (cost=0.00..381187.45 rows=13744 width=83) (actual time=0.110..75.546 rows=10 loops=1) Filter: (to_tsvector('english'::regconfig, (title)::text) @@ '''abc'' | ''def'''::tsquery) Rows Removed by Filter: 10221 Planning time: 0.176 ms Execution time: 75.564 ms (6 rows) Any ideas? -- Best Regards, Aaron Lewis - PGP: 0x13714D33 - http://pgp.mit.edu/ Finger Print: 9F67 391B B770 8FF6 99DC D92D 87F6 2602 1371 4D33 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general