Re: jsonpath

Alexander Korotkov Sun, 17 Mar 2019 10:03:55 -0700

On Sun, Mar 17, 2019 at 8:00 PM Jonathan S. Katz <jk...@postgresql.org> wrote:
> On 3/17/19 12:55 PM, Alexander Korotkov wrote:
> >
> >> However, when I did something a little more complex, like the below:
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.length ? (@ < 150)';
> >>
> >> SELECT count(*)
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "^Start")';
> >>
> >> SELECT id, jsonb_path_query(data, '$.content')
> >> FROM news_feed
> >> WHERE data @? '$.content ? (@ like_regex "risk" flag "i")';
> >>
> >> I would find that the index scan performed as well as the sequential
> >> scan. Additionally, on my laptop, the parallel sequential scan would
> >> beat the index scan by ~2.5x in some cases.
> >
> > Yeah, this cases are not supported.  Did optimizer automatically
> > select sequential scan in this case (if not touching enable_*
> > variables)?  It should, because optimizer understands that GIN scan
> > will be bad if extract_query method failed to extract anything.
>
> It did not - it was doing a bitmap heap scan. I have default costs
> setup. Example output from EXPLAIN ANALYZE with the index available:
>
>  Aggregate  (cost=1539.78..1539.79 rows=1 width=8) (actual
> time=270.419..270.419 rows=1 loops=1)
>    ->  Bitmap Heap Scan on news_feed  (cost=23.24..1538.73 rows=418
> width=0) (actual time=84.040..270.407 rows=5 loops=1)
>          Recheck Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>          Rows Removed by Index Recheck: 418360
>          Heap Blocks: exact=28690
>          ->  Bitmap Index Scan on news_feed_data_gin_idx
> (cost=0.00..23.14 rows=418 width=0) (actual time=41.788..41.788
> rows=418365 loops=1)
>                Index Cond: (data @? '$."length"?(@ < 150)'::jsonpath)
>  Planning Time: 0.168 ms
>  Execution Time: 271.105 ms
>
> And for arguments sake, after I dropped the index (and
> max_parallel_workers = 8):
>
>  Finalize Aggregate  (cost=30998.07..30998.08 rows=1 width=8) (actual
> time=91.062..91.062 rows=1 loops=1)
>    ->  Gather  (cost=30997.65..30998.06 rows=4 width=8) (actual
> time=90.892..97.739 rows=5 loops=1)
>          Workers Planned: 4
>          Workers Launched: 4
>          ->  Partial Aggregate  (cost=29997.65..29997.66 rows=1 width=8)
> (actual time=76.977..76.977 rows=1 loops=5)
>                ->  Parallel Seq Scan on news_feed  (cost=0.00..29997.39
> rows=104 width=0) (actual time=39.736..76.964 rows=1 loops=5)
>                      Filter: (data @? '$."length"?(@ < 150)'::jsonpath)
>                      Rows Removed by Filter: 83672
>  Planning Time: 0.127 ms
>  Execution Time: 97.801 ms


Thank you for the explanation.  Is it jsonb_ops or jsonb_path_ops?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: jsonpath

Reply via email to