Re: index on jsonb col with 2D array inside the json

Mariel Cherkassky Tue, 19 Feb 2019 10:31:35 -0800

Thanks for the feedback!

On Tue, Feb 19, 2019, 6:42 PM Michael Lewis <mle...@entrata.com wrote:


> I would expect that R2 vs R3 would be negligible but perhaps gist works
> much better and would be an improvement. When you are down to 7ms already,
> I wouldn't hope for any big change. I assume you used btree for the
> multi-column index on R2 range_first, range_last but am not familiar with
> gist on range vs btree on two int columns.
>
> It seems a little odd to have a jsonb value to hold multiple range values.
> A range is already a complex type so separating out into the association
> table like R3 would make sense to me.
>
> *Michael Lewis*
>
> On Tue, Feb 19, 2019 at 9:34 AM Mariel Cherkassky <
> mariel.cherkas...@gmail.com> wrote:
>
>> I dont have any indexes on R (the table with the jsonb column). I was
>> asking if I can create any that can increase this query`s performance.
>> If I understood you correctly I have  3 options right now :
>> 1)R, without indexes
>> 2)R2 with an index on first and last
>> 3)R3 that should contain a single range column (type int4range) with gist
>> index on it.
>>
>> In aspect of performance, R<R2<? R3
>>
>> ‫בתאריך יום ג׳, 19 בפבר׳ 2019 ב-18:28 מאת ‪Michael Lewis‬‏ <‪
>> mle...@entrata.com‬‏>:‬
>>
>>> Is your JSON data getting toasted? I wouldn't assume so if it is
>>> remaining small but something to check. Regardless, if an index exists and
>>> isn't being used, then that would be the primary concern. You didn't share
>>> what the definition of the index on R.data is... what do you already have?
>>>
>>> You have an array of ranges stored as the value of key "ranges" in jsonb
>>> field data. If you created a table like R2, but with a single "range"
>>> column that is int4range type, then I would expect that you could add a
>>> GiST and then use overlaps &&, or another operator. I would not expect that
>>> you could index (unnest data->>'ranges' for instance) to get the separated
>>> out range values.
>>>
>>>
>>>
>>> *Michael Lewis *
>>>
>>>
>>> On Tue, Feb 19, 2019 at 8:59 AM Mariel Cherkassky <
>>> mariel.cherkas...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> I have a table with json col : R(object int, data jsonb).
>>>> Example for content :
>>>>  object         |                 data
>>>> ----------------+---------------------------------------
>>>>              50 | {"ranges": [[1, 1]]}
>>>>              51 | {"ranges": [[5, 700],[1,5],[9,10}
>>>>              52 | {"ranges": [[4, 200],[2,4],[3,4]]}
>>>>              53 | {"ranges": [[2, 2]]}
>>>>              54 | {"ranges": [[5, 10]]}
>>>>
>>>> Now I tried to query for all the objects that contains a specific
>>>> range, for example [2,2] :
>>>> explain analyze SELECT *
>>>> FROM   R d
>>>> WHERE  EXISTS (
>>>>    SELECT FROM jsonb_array_elements(R.data -> 'ranges') rng
>>>>    WHERE (rng->>0)::bigint <= 2 and (rng->>1)::bigint >= 2
>>>>    );
>>>>
>>>> I saw that the gin index isnt suitable for this type of comparison.
>>>> However, I saw that the gist index is suitable to handle ranges. Any idea
>>>> of I can implement a gist index here ?
>>>>
>>>> In addition, I saved the same data in relational table
>>>> R2(object,range_first,range_last).
>>>>  The previous data in this format :
>>>> object   range_first   range_last
>>>> 50              1                  1
>>>> 51              5                  700
>>>> 51              1                    5
>>>> 51              9                     10
>>>>
>>>> i compared the first query with :
>>>>   explain analyze  select * from R2 where  range_first   <=2 and
>>>> range_last >= 2; (I have an index on range_first,range_last that is used)
>>>>
>>>> The query on the jsonb column was 100x slower (700 m/s vs 7m/s). The
>>>> question is, Am I missing an index or the jsonb datatype isnt suitable for
>>>> this structure of data. The R2 table contains 500K records while the R
>>>> table contains about 200K records.
>>>>
>>>>
>>>>
>>>>

Re: index on jsonb col with 2D array inside the json

Reply via email to