On Sun, Mar 17, 2019 at 8:00 PM Jonathan S. Katz <jk...@postgresql.org> wrote: > On 3/17/19 12:55 PM, Alexander Korotkov wrote: > > > >> However, when I did something a little more complex, like the below: > >> > >> SELECT count(*) > >> FROM news_feed > >> WHERE data @? '$.length ? (@ < 150)'; > >> > >> SELECT count(*) > >> FROM news_feed > >> WHERE data @? '$.content ? (@ like_regex "^Start")'; > >> > >> SELECT id, jsonb_path_query(data, '$.content') > >> FROM news_feed > >> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")'; > >> > >> I would find that the index scan performed as well as the sequential > >> scan. Additionally, on my laptop, the parallel sequential scan would > >> beat the index scan by ~2.5x in some cases. > > > > Yeah, this cases are not supported. Did optimizer automatically > > select sequential scan in this case (if not touching enable_* > > variables)? It should, because optimizer understands that GIN scan > > will be bad if extract_query method failed to extract anything. > > It did not - it was doing a bitmap heap scan. I have default costs > setup. Example output from EXPLAIN ANALYZE with the index available: > > Aggregate (cost=1539.78..1539.79 rows=1 width=8) (actual > time=270.419..270.419 rows=1 loops=1) > -> Bitmap Heap Scan on news_feed (cost=23.24..1538.73 rows=418 > width=0) (actual time=84.040..270.407 rows=5 loops=1) > Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath) > Rows Removed by Index Recheck: 418360 > Heap Blocks: exact=28690 > -> Bitmap Index Scan on news_feed_data_gin_idx > (cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788 > rows=418365 loops=1) > Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath) > Planning Time: 0.168 ms > Execution Time: 271.105 ms > > And for arguments sake, after I dropped the index (and > max_parallel_workers = 8): > > Finalize Aggregate (cost=30998.07..30998.08 rows=1 width=8) (actual > time=91.062..91.062 rows=1 loops=1) > -> Gather (cost=30997.65..30998.06 rows=4 width=8) (actual > time=90.892..97.739 rows=5 loops=1) > Workers Planned: 4 > Workers Launched: 4 > -> Partial Aggregate (cost=29997.65..29997.66 rows=1 width=8) > (actual time=76.977..76.977 rows=1 loops=5) > -> Parallel Seq Scan on news_feed (cost=0.00..29997.39 > rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5) > Filter: (data @? '$."length"?(@ < 150)'::jsonpath) > Rows Removed by Filter: 83672 > Planning Time: 0.127 ms > Execution Time: 97.801 ms
Thank you for the explanation. Is it jsonb_ops or jsonb_path_ops? ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company