Re: [HACKERS] jsonb contains behaviour weirdness
This was never committed... On Sat, Sep 13, 2014 at 4:31 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, Sep 12, 2014 at 6:40 AM, Alexander Korotkov aekorot...@gmail.com wrote: It's likely that JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) should be checked only for objects, not arrays. Also, should JsonbDeepContains does same fast check when it deals with nested objects? Attached patch implements something similar to what you describe here, fixing your example. I haven't added the optimization to JsonbDeepContains(). I think that if anything, we should remove the optimization entirely, which is what I've done -- an rhs is it contained within? value is hardly ever going to be an object that has more pairs than the object we're checking it is contained within. It's almost certainly going to have far fewer pairs. Apart from only applying to objects, that optimization just isn't an effective way of eliminating jsonb values from consideration quickly. I'd rather not bother at all, rather than having a complicated comment about why the optimization applies to objects and not arrays. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
Peter Geoghegan p...@heroku.com writes: This was never committed... Yeah ... the discussion trailed off and I think we forgot to actually commit a fix. Will go do so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On 09/15/2014 11:12 AM, Tom Lane wrote: Or are you proposing that JSONARRAY @ JSONARRAY should work differently from ARRAY @ ARRAY? And no. It's a bug that jsonb array containment works differently from regular array containment. We understand the source of the bug, ie a mistaken optimization. I don't see why there's much need for discussion about anything except whether removing the optimization altogether (as Peter proposed) is the best fix, or whether we want to retain some weaker form of it. Right, so I was just saying that after we fix this behavior, the behavior of JSONARRAY @ JSONARRAY should be commented somewhere because that comparison may not work the way users who are not long-time postgres users expect. Heck, I've personally done very little ARRAY @ ARRAY myself in 12 years of using PostgreSQL arrays; I had to test it to verify the current behavior. Not sure exactly where this note should go, mind you. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On 09/12/2014 01:33 PM, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: However, this better become a FAQ item, because it's not necessarily the behavior that folks used to JSON but not Postgres will expect. No, it's a bug, not a documentation deficiency. Hmmm? Are you proposing that we should change how ARRAY @ ARRAY works for non-JSON data? Or are you proposing that JSONARRAY @ JSONARRAY should work differently from ARRAY @ ARRAY? Or are you agreeing with Peter that it's the first case that's a bug, and the 2nd two tests are correct, and just being unclear? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
Josh Berkus j...@agliodbs.com writes: On 09/12/2014 01:33 PM, Tom Lane wrote: No, it's a bug, not a documentation deficiency. Hmmm? Are you proposing that we should change how ARRAY @ ARRAY works for non-JSON data? No. Or are you proposing that JSONARRAY @ JSONARRAY should work differently from ARRAY @ ARRAY? And no. It's a bug that jsonb array containment works differently from regular array containment. We understand the source of the bug, ie a mistaken optimization. I don't see why there's much need for discussion about anything except whether removing the optimization altogether (as Peter proposed) is the best fix, or whether we want to retain some weaker form of it. Personally I'd think that we should retain it for objects; Peter's main argument against that was that the comment would be too complicated, but that seems a bit silly from here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Mon, Sep 15, 2014 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Personally I'd think that we should retain it for objects; Peter's main argument against that was that the comment would be too complicated, but that seems a bit silly from here. I just don't see any point to it. My argument against the complexity of explaining why the optimization is only used with objects is based on the costs and the benefits. I think the benefits are very close to nil. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Mon, Sep 15, 2014 at 2:18 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Sep 15, 2014 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Personally I'd think that we should retain it for objects; Peter's main argument against that was that the comment would be too complicated, but that seems a bit silly from here. I just don't see any point to it. My argument against the complexity of explaining why the optimization is only used with objects is based on the costs and the benefits. I think the benefits are very close to nil. That seems pessimistic to me; I'm with Tom. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
Peter Geoghegan p...@heroku.com writes: On Mon, Sep 15, 2014 at 11:12 AM, Tom Lane t...@sss.pgh.pa.us wrote: Personally I'd think that we should retain it for objects; Peter's main argument against that was that the comment would be too complicated, but that seems a bit silly from here. I just don't see any point to it. My argument against the complexity of explaining why the optimization is only used with objects is based on the costs and the benefits. I think the benefits are very close to nil. It might be that the benefit is very close to nil; that would depend a lot on workload, so it's hard to be sure. I'd say though that the cost is also very close to nil, in the sense that we're considering two additional compare-and-branch instructions in a function that will surely expend hundreds or thousands of instructions if there's no such short-circuit. I've certainly been on the side of that optimization isn't worth its keep many times before, but I don't think the case is terribly clear cut here. Since somebody (possibly you) thought it was worth having to begin with, I'm inclined to follow that lead. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Mon, Sep 15, 2014 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote: It might be that the benefit is very close to nil; that would depend a lot on workload, so it's hard to be sure. I'd say though that the cost is also very close to nil, in the sense that we're considering two additional compare-and-branch instructions in a function that will surely expend hundreds or thousands of instructions if there's no such short-circuit. Fair enough. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Fri, Sep 12, 2014 at 6:40 AM, Alexander Korotkov aekorot...@gmail.com wrote: It's likely that JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) should be checked only for objects, not arrays. Also, should JsonbDeepContains does same fast check when it deals with nested objects? Attached patch implements something similar to what you describe here, fixing your example. I haven't added the optimization to JsonbDeepContains(). I think that if anything, we should remove the optimization entirely, which is what I've done -- an rhs is it contained within? value is hardly ever going to be an object that has more pairs than the object we're checking it is contained within. It's almost certainly going to have far fewer pairs. Apart from only applying to objects, that optimization just isn't an effective way of eliminating jsonb values from consideration quickly. I'd rather not bother at all, rather than having a complicated comment about why the optimization applies to objects and not arrays. -- Peter Geoghegan diff --git a/src/backend/utils/adt/jsonb_op.c b/src/backend/utils/adt/jsonb_op.c new file mode 100644 index 2d071b2..6fcdbad *** a/src/backend/utils/adt/jsonb_op.c --- b/src/backend/utils/adt/jsonb_op.c *** jsonb_contains(PG_FUNCTION_ARGS) *** 117,124 JsonbIterator *it1, *it2; ! if (JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) || ! JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); it1 = JsonbIteratorInit(val-root); --- 117,123 JsonbIterator *it1, *it2; ! if (JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); it1 = JsonbIteratorInit(val-root); *** jsonb_contained(PG_FUNCTION_ARGS) *** 137,144 JsonbIterator *it1, *it2; ! if (JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) || ! JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); it1 = JsonbIteratorInit(val-root); --- 136,142 JsonbIterator *it1, *it2; ! if (JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); it1 = JsonbIteratorInit(val-root); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] jsonb contains behaviour weirdness
Hi! Let's consider some examples. # select '[1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- f (1 row) One may think it's because second jsonb array contain two 2. So, contains takes care about count of equal elements. # select '[1,1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- t (1 row) But, it's not. Jsonb contains takes care only about length of array. # select '[[1,2]]'::jsonb @ '[[1,2,2]]'::jsonb; ?column? -- t (1 row) Even more weird :) The reason why jsonb contains behaves so is check in the beginning of jsonb_contains. It makes fast check of jsonb type and elements count before calling JsonbDeepContains. if (JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) || JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); It's likely that JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) should be checked only for objects, not arrays. Also, should JsonbDeepContains does same fast check when it deals with nested objects? -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb contains behaviour weirdness
On Fri, Sep 12, 2014 at 6:40 AM, Alexander Korotkov aekorot...@gmail.com wrote: Even more weird :) Agreed. The reason why jsonb contains behaves so is check in the beginning of jsonb_contains. It makes fast check of jsonb type and elements count before calling JsonbDeepContains. if (JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) || JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); It's likely that JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) should be checked only for objects, not arrays. Also, should JsonbDeepContains does same fast check when it deals with nested objects? I think this is due to commit 364ddc. That removed the extra step that had arrays sorted (and then de-duped) ahead of time, which made arrays behave like objects at the top level. I think that this sort + de-dup step was mischaracterized as purely a performance thing (possibly by me). Basically, JsonbDeepContains() is consistent with the previous behavior at the top level, but not the current (inadvertently altered) behavior. I think the fix is probably a return to the previous behavior. I'll take a closer look. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
Peter Geoghegan p...@heroku.com writes: I think this is due to commit 364ddc. That removed the extra step that had arrays sorted (and then de-duped) ahead of time, which made arrays behave like objects at the top level. I think that this sort + de-dup step was mischaracterized as purely a performance thing (possibly by me). Basically, JsonbDeepContains() is consistent with the previous behavior at the top level, but not the current (inadvertently altered) behavior. I think the fix is probably a return to the previous behavior. I'll take a closer look. I'm confused. Are you proposing to return to sort + de-dup of JSON arrays? Surely that is completely broken. Arrays are ordered. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Fri, Sep 12, 2014 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm confused. Are you proposing to return to sort + de-dup of JSON arrays? Surely that is completely broken. Arrays are ordered. Sorry, my earlier remarks were premature. In fact, that alteration only applied to existence, not containment. However, arrays are ordered for the purposes of equality, but not containment. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On 09/12/2014 06:40 AM, Alexander Korotkov wrote: Hi! Let's consider some examples. # select '[1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- f (1 row) One may think it's because second jsonb array contain two 2. So, contains takes care about count of equal elements. JSONB arrays are allowed to have repleating elements. It's keys which are not allowed to repeat. # select '[1,1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- t (1 row) But, it's not. Jsonb contains takes care only about length of array. OK, now, that's messed up. # select '[[1,2]]'::jsonb @ '[[1,2,2]]'::jsonb; ?column? -- t (1 row) Even more weird :) The reason why jsonb contains behaves so is check in the beginning of jsonb_contains. It makes fast check of jsonb type and elements count before calling JsonbDeepContains. if (JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) || JB_ROOT_IS_OBJECT(val) != JB_ROOT_IS_OBJECT(tmpl)) PG_RETURN_BOOL(false); It's likely that JB_ROOT_COUNT(val) JB_ROOT_COUNT(tmpl) should be checked only for objects, not arrays. Yeah, agreed. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
Peter Geoghegan p...@heroku.com writes: On Fri, Sep 12, 2014 at 11:09 AM, Tom Lane t...@sss.pgh.pa.us wrote: I'm confused. Are you proposing to return to sort + de-dup of JSON arrays? Surely that is completely broken. Arrays are ordered. Sorry, my earlier remarks were premature. In fact, that alteration only applied to existence, not containment. However, arrays are ordered for the purposes of equality, but not containment. My remarks were also premature, because looking back at the referenced commit, I see what it removed was a sort and de-dup as a preliminary step in containment comparisons, not as a generic alteration of array contents. So that was sane enough, though I concur with Heikki's opinion that it likely failed to be a performance win. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Fri, Sep 12, 2014 at 11:21 AM, Josh Berkus j...@agliodbs.com wrote: # select '[1,1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- t (1 row) But, it's not. Jsonb contains takes care only about length of array. OK, now, that's messed up. To be clear: I don't think that this example is messed up (in isolation). I think it's the correct behavior. What I find objectionable is the inconsistency. I believe that this is Alexander's concern too. Alexander's first example exhibits broken behavior. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On 09/12/2014 11:38 AM, Peter Geoghegan wrote: To be clear: I don't think that this example is messed up (in isolation). I think it's the correct behavior. What I find objectionable is the inconsistency. I believe that this is Alexander's concern too. Alexander's first example exhibits broken behavior. Hmmm, oh. Yeah, I see what you mean; PostgreSQL's SQL array behavior is that @ is true if array A contains all of the elements of array B regardless of ordering or repetition. jsonic=# select array[1,2,2] @ array[1,1,2] ; ?column? -- t That's consistent with our docs and past behavior. However, this better become a FAQ item, because it's not necessarily the behavior that folks used to JSON but not Postgres will expect. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb contains behaviour weirdness
On Fri, Sep 12, 2014 at 10:38 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, Sep 12, 2014 at 11:21 AM, Josh Berkus j...@agliodbs.com wrote: # select '[1,1,2]'::jsonb @ '[1,2,2]'::jsonb; ?column? -- t (1 row) But, it's not. Jsonb contains takes care only about length of array. OK, now, that's messed up. To be clear: I don't think that this example is messed up (in isolation). I think it's the correct behavior. What I find objectionable is the inconsistency. I believe that this is Alexander's concern too. Alexander's first example exhibits broken behavior. Agree. I just tried to explain how current behaviour could look for user who sees it for the first time. -- With best regards, Alexander Korotkov.
Re: [HACKERS] jsonb contains behaviour weirdness
Josh Berkus j...@agliodbs.com writes: However, this better become a FAQ item, because it's not necessarily the behavior that folks used to JSON but not Postgres will expect. No, it's a bug, not a documentation deficiency. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers