Re: [HACKERS] Bug in jsonb minus operator

2015-05-20 Thread Andrew Dunstan


On 05/19/2015 07:11 PM, Andrew Dunstan wrote:


On 05/18/2015 10:52 PM, Peter Geoghegan wrote:
On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan and...@dunslane.net 
wrote:
Here's an patch along those lines. It seems to do the trick, at 
least for
your test case, and it has the merit of being very small, so small 
I'd like
to backpatch it - accepting jbvBinary as is in pushJsonbValue seems 
like a

bug to me.

Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.



Umm, no, the raw scalar pseudo arrays are of type jbvArray, not 
jbvBinary. And they are pushed with WJB_BEGIN_ARRAY, not with WJB_ELEM 
or WJB_VALUE, as the comment notes. See this fragment of code from 
JsonbValueToJsonb:


scalarArray.type = jbvArray;
scalarArray.val.array.rawScalar = true;
scalarArray.val.array.nElems = 1;

pushJsonbValue(pstate, WJB_BEGIN_ARRAY, scalarArray);


I tested this by removing the assert test for jbvBinary in the 
WJB_ELEM and WJB_VALUE switch beranches and then running the 
regression suite. No assertion failure was triggered. While that's not 
guaranteed to be a perfect test, it doesn't seem like a bad one. Can 
you pose a counter example where this will break?





Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.



If you don't the logic gets more complex, as you need to keep track of 
what level of the object you are at. The virtue of this change is that 
it will simplify a lot of such processing by removing the unnecessary 
restriction on passing jbvBinary values to pushJsonbValue().


If you have a better fix for the bug you complained about I'll be 
happy to take a look.








OK, here's a slightly updated version of the patch. The current 
behaviour in which pushJsonbValue() happily pushes  a jbvBinary value, 
only to have it fail later when we try to get a Jsonb out of it, seems 
like a plain bug. The fact that none of our present 9.4 code exercises 
that bug is lucky, but since this is an exposed API, and has been the 
cause of past complaint, I propose, in the absence of further objections 
or comments, to apply it to both master and 9.4, once the new releases 
are out of the way.


cheers

andrew


diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 5cd9140..974e386 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -57,6 +57,9 @@ static void appendElement(JsonbParseState *pstate, JsonbValue *scalarVal);
 static int	lengthCompareJsonbStringValue(const void *a, const void *b);
 static int	lengthCompareJsonbPair(const void *a, const void *b, void *arg);
 static void uniqueifyJsonbObject(JsonbValue *object);
+static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
+		JsonbIteratorToken seq,
+		JsonbValue *scalarVal);
 
 /*
  * Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
@@ -503,10 +506,43 @@ fillJsonbValue(JsonbContainer *container, int index,
  *
  * Only sequential tokens pertaining to non-container types should pass a
  * JsonbValue.  There is one exception -- WJB_BEGIN_ARRAY callers may pass a
- * raw scalar pseudo array to append that.
+ * raw scalar pseudo array to append it - the actual scalar should be passed
+ * next and it will be added as the only member of the array.
+ *
+ * Values of type jvbBinary, which are rolled up arrays and objects,
+ * are unpacked before being added to the result.
  */
 JsonbValue *
 pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
+			   JsonbValue *jbval)
+{
+	JsonbIterator *it;
+	JsonbValue *res = NULL;
+	JsonbValue v;
+	JsonbIteratorToken tok;
+
+	if (!jbval || (seq != WJB_ELEM  seq != WJB_VALUE) ||
+		jbval-type != jbvBinary)
+	{
+		/* drop through */
+		return pushJsonbValueScalar(pstate, seq, jbval);
+	}
+
+	/* unpack the binary and add each piece to the pstate */
+	it = JsonbIteratorInit(jbval-val.binary.data);
+	while ((tok = JsonbIteratorNext(it, v, false)) != WJB_DONE)
+		res = pushJsonbValueScalar(pstate, tok,
+   tok  WJB_BEGIN_ARRAY ? v : NULL);
+
+	return res;
+}
+
+/*
+ * Do the actual pushing, with only scalar or pseudo-scalar-array values
+ * accepted.
+ */
+static JsonbValue *
+pushJsonbValueScalar(JsonbParseState **pstate, JsonbIteratorToken seq,
 			   JsonbValue *scalarVal)
 {
 	JsonbValue *result = NULL;
@@ -549,13 +585,11 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
 			appendKey(*pstate, scalarVal);
 			break;
 		case WJB_VALUE:
-			Assert(IsAJsonbScalar(scalarVal) ||
-   scalarVal-type == jbvBinary);
+			Assert(IsAJsonbScalar(scalarVal));
 			appendValue(*pstate, scalarVal);
 			break;
 		case WJB_ELEM:
-			Assert(IsAJsonbScalar(scalarVal) ||
-   scalarVal-type == jbvBinary);
+			Assert(IsAJsonbScalar(scalarVal));
 			appendElement(*pstate, scalarVal);
 			break;
 		case WJB_END_OBJECT:

Re: [HACKERS] Bug in jsonb minus operator

2015-05-19 Thread Andrew Dunstan


On 05/18/2015 10:52 PM, Peter Geoghegan wrote:

On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan and...@dunslane.net wrote:

Here's an patch along those lines. It seems to do the trick, at least for
your test case, and it has the merit of being very small, so small I'd like
to backpatch it - accepting jbvBinary as is in pushJsonbValue seems like a
bug to me.

Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.



Umm, no, the raw scalar pseudo arrays are of type jbvArray, not 
jbvBinary. And they are pushed with WJB_BEGIN_ARRAY, not with WJB_ELEM 
or WJB_VALUE, as the comment notes. See this fragment of code from 
JsonbValueToJsonb:


scalarArray.type = jbvArray;
scalarArray.val.array.rawScalar = true;
scalarArray.val.array.nElems = 1;

pushJsonbValue(pstate, WJB_BEGIN_ARRAY, scalarArray);


I tested this by removing the assert test for jbvBinary in the WJB_ELEM 
and WJB_VALUE switch beranches and then running the regression suite. No 
assertion failure was triggered. While that's not guaranteed to be a 
perfect test, it doesn't seem like a bad one. Can you pose a counter 
example where this will break?





Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.



If you don't the logic gets more complex, as you need to keep track of 
what level of the object you are at. The virtue of this change is that 
it will simplify a lot of such processing by removing the unnecessary 
restriction on passing jbvBinary values to pushJsonbValue().


If you have a better fix for the bug you complained about I'll be happy 
to take a look.


cheers

andrew






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in jsonb minus operator

2015-05-18 Thread Andrew Dunstan


On 05/18/2015 08:17 AM, Andrew Dunstan wrote:


On 05/17/2015 09:08 PM, Peter Geoghegan wrote:
On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan and...@dunslane.net 
wrote:
Sure. I thought we'd covered this but it's possible that we didn't, 
or that
it got rebroken. There have been complaints about the limitation on 
values
containing jbvBinary, so let's just remove it if that can be done 
simply, as

it seems to be a not uncommonly encountered problem.

Are you going to submit a patch for that?

I'll try and come up with something. It's not a trivial fix.




Here's a thought. in pushJsonbValue() the value pushed is called 
scalarVal, and in all our regression tests it is in fact scalar. 
However, the Asserts inside the function tell a different story:


case WJB_VALUE:
Assert(IsAJsonbScalar(scalarVal) ||
   scalarVal-type == jbvBinary);
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
Assert(IsAJsonbScalar(scalarVal) ||
   scalarVal-type == jbvBinary);
appendElement(*pstate, scalarVal);
break;

and indeed it turns out that your delete example does push a value 
with jbvBinary, thus triggering the problem. So, could we deal with 
that, by, say, setting up an iterator for the binary datum and just 
pushing all of its values?






Here's an patch along those lines. It seems to do the trick, at least 
for your test case, and it has the merit of being very small, so small 
I'd like to backpatch it - accepting jbvBinary as is in pushJsonbValue 
seems like a bug to me.


cheers

andrew


diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 5cd9140..03cc125 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -57,6 +57,9 @@ static void appendElement(JsonbParseState *pstate, JsonbValue *scalarVal);
 static int	lengthCompareJsonbStringValue(const void *a, const void *b);
 static int	lengthCompareJsonbPair(const void *a, const void *b, void *arg);
 static void uniqueifyJsonbObject(JsonbValue *object);
+static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
+		JsonbIteratorToken seq,
+		JsonbValue *scalarVal);
 
 /*
  * Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
@@ -505,8 +508,37 @@ fillJsonbValue(JsonbContainer *container, int index,
  * JsonbValue.  There is one exception -- WJB_BEGIN_ARRAY callers may pass a
  * raw scalar pseudo array to append that.
  */
+
 JsonbValue *
 pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
+			   JsonbValue *jbval)
+{
+	JsonbIterator *it;
+	JsonbValue *res = NULL;
+	JsonbValue v;
+	JsonbIteratorToken tok;
+
+	if (!jbval || (seq != WJB_ELEM  seq != WJB_VALUE) ||
+		jbval-type != jbvBinary)
+	{
+		/* drop through */
+		return pushJsonbValueScalar(pstate, seq, jbval);
+	}
+
+	/* unpack the binary and add each piece to the pstate */
+	it = JsonbIteratorInit(jbval-val.binary.data);
+	while ((tok = JsonbIteratorNext(it, v, false)) != WJB_DONE)
+		res = pushJsonbValueScalar(pstate, tok,
+   tok  WJB_BEGIN_ARRAY ? v : NULL);
+
+	return res;
+}
+
+
+
+
+static JsonbValue *
+pushJsonbValueScalar(JsonbParseState **pstate, JsonbIteratorToken seq,
 			   JsonbValue *scalarVal)
 {
 	JsonbValue *result = NULL;
@@ -549,13 +581,11 @@ pushJsonbValue(JsonbParseState **pstate, JsonbIteratorToken seq,
 			appendKey(*pstate, scalarVal);
 			break;
 		case WJB_VALUE:
-			Assert(IsAJsonbScalar(scalarVal) ||
-   scalarVal-type == jbvBinary);
+			Assert(IsAJsonbScalar(scalarVal));
 			appendValue(*pstate, scalarVal);
 			break;
 		case WJB_ELEM:
-			Assert(IsAJsonbScalar(scalarVal) ||
-   scalarVal-type == jbvBinary);
+			Assert(IsAJsonbScalar(scalarVal));
 			appendElement(*pstate, scalarVal);
 			break;
 		case WJB_END_OBJECT:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in jsonb minus operator

2015-05-18 Thread Andrew Dunstan


On 05/17/2015 09:08 PM, Peter Geoghegan wrote:

On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan and...@dunslane.net wrote:

Sure. I thought we'd covered this but it's possible that we didn't, or that
it got rebroken. There have been complaints about the limitation on values
containing jbvBinary, so let's just remove it if that can be done simply, as
it seems to be a not uncommonly encountered problem.

Are you going to submit a patch for that?

I'll try and come up with something. It's not a trivial fix.




Here's a thought. in pushJsonbValue() the value pushed is called 
scalarVal, and in all our regression tests it is in fact scalar. 
However, the Asserts inside the function tell a different story:


case WJB_VALUE:
Assert(IsAJsonbScalar(scalarVal) ||
   scalarVal-type == jbvBinary);
appendValue(*pstate, scalarVal);
break;
case WJB_ELEM:
Assert(IsAJsonbScalar(scalarVal) ||
   scalarVal-type == jbvBinary);
appendElement(*pstate, scalarVal);
break;

and indeed it turns out that your delete example does push a value with 
jbvBinary, thus triggering the problem. So, could we deal with that, by, 
say, setting up an iterator for the binary datum and just pushing all of 
its values?


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in jsonb minus operator

2015-05-18 Thread Peter Geoghegan
On Mon, May 18, 2015 at 7:11 AM, Andrew Dunstan and...@dunslane.net wrote:
 Here's an patch along those lines. It seems to do the trick, at least for
 your test case, and it has the merit of being very small, so small I'd like
 to backpatch it - accepting jbvBinary as is in pushJsonbValue seems like a
 bug to me.

Isn't that for the benefit of raw scalar pseudo arrays? The existing
comments above pushJsonbValue() acknowledge such callers.

Why are you passing the skipNested variable to JsonbIteratorNext()
within jsonb_delete()? I'm not seeing a need for that.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in jsonb minus operator

2015-05-17 Thread Andrew Dunstan


On 05/16/2015 08:04 PM, Peter Geoghegan wrote:

I'm seeing the following problem on the master branch:

postgres=# select '{foo:5}'::jsonb - 'bar'; -- okay
   ?column?

  {foo: 5}
(1 row)

postgres=# select '{foo:{bar:5}}'::jsonb - 'foo'; -- okay
  ?column?
--
  {}
(1 row)

postgres=# select '{foo:{bar:5}}'::jsonb - 'rion'; -- spurious error
ERROR:  XX000: unknown type of jsonb container to convert
LOCATION:  convertJsonbValue, jsonb_util.c:1430

This is an elog() - a can't happen error - due to this restriction
within convertJsonbValue():

/*
  * A JsonbValue passed as val should never have a type of jbvBinary, and
  * neither should any of its sub-components. Those values will be produced
  * by convertJsonbArray and convertJsonbObject, the results of which will
  * not be passed back to this function as an argument.
  */

This call to convertJsonbValue() actually originates from the final
line of the new jsonb_delete() function, here:

#5  0x00877e10 in jsonb_delete (fcinfo=0x160e060) at jsonfuncs.c:3389
3389 PG_RETURN_JSONB(JsonbValueToJsonb(res));

I explored writing a fix for this bug. I went back and forth on it,
but I think that the most promising approach might be to simply teach
convertJsonbValue() to care about jbvBinary-typed JsonbValue
variables. That way, the jsonb_delete() function could actually expect
this to work. I can't remember why I thought it was a good idea to
have convertJsonbValue() reject jbvBinary values, but I believe the
reason was that it simply wasn't necessary.



Sure. I thought we'd covered this but it's possible that we didn't, or 
that it got rebroken. There have been complaints about the limitation on 
values containing jbvBinary, so let's just remove it if that can be done 
simply, as it seems to be a not uncommonly encountered problem.


Are you going to submit a patch for that?



jsonb_delete() is unusual in that it converts from a JsonbValue back
to the on-disk Jsonb varlena format, but with a nested jbvBinary-typed
value (in the presence of a nested object or array) -- it seems like
it wouldn't be too hard to just roll with that. jsonb_delete() makes
the mistake of not expecting to see jbvBinary values (since it doesn't
always recurse into the json structure). We shouldn't really deal with
jbvBinary-typed values specially from outside specialized utility
routines like JsonbDeepContains() as noted in the above comment,
though (so offhand I don't think we should teach jsonb_delete()
anything new, as that would be a modularity violation). Thoughts?



Assuming the above fix we won't have to teach it anything new, right?



Note that the existence related operators (that, like the minus
operator should only work at the top level) don't go through
JsonbValue conversion of the lhs value as part of their search at all.
I don't think that their findJsonbValueFromContainer() routine (or a
similar routine) is the right way of approaching this, though - that's
a specialized routine, that doesn't care if an array value (which is,
you'll recall, a key for the purposes of existence checking) appears
once or multiple times.


Seems reasonable.


On that topic, I think it's sloppy that Table
9-41. Additional jsonb Operators isn't clear about the fact that the
operator - text op matches things on the same basis as the existence
operators -- notice how the existence operator notes with emphasis
that it cares about array *string* values only.


OK, please submit a patch that you think clears it up.



Finally, I don't think an operator implementation function that is
jsonb-only belongs anywhere other than jsonb_ops.c (for the same
reason, replacePath() should live in jsonb_util.c).



The heading on jsonb_op.c (note spelling) suggests that it's for indexed 
operations. I don't mind rearranging the code layout to something people 
think more logical, but I also don't want to engage in unnecessary code 
churn. I'm not that religious about the organization.




And I'm
disappointed that the jsonb tests can no longer be run atomically with
'make installcheck TESTS=jsonb' - I liked being able to do that.


It was always known that some cases would not work with TESTS - that was 
part of Tom's reservation about the whole thing. You can say


   make check-tests TESTS=create_table copy jsonb


and while create_table will fail on an irrelevant test, copy and jsonb 
will succeed.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in jsonb minus operator

2015-05-17 Thread Peter Geoghegan
On Sun, May 17, 2015 at 8:04 AM, Andrew Dunstan and...@dunslane.net wrote:
 Sure. I thought we'd covered this but it's possible that we didn't, or that
 it got rebroken. There have been complaints about the limitation on values
 containing jbvBinary, so let's just remove it if that can be done simply, as
 it seems to be a not uncommonly encountered problem.

 Are you going to submit a patch for that?

I'll try and come up with something. It's not a trivial fix.

 Assuming the above fix we won't have to teach it anything new, right?

I'm not 100% sure, but I think not.

 On that topic, I think it's sloppy that Table
 9-41. Additional jsonb Operators isn't clear about the fact that the
 operator - text op matches things on the same basis as the existence
 operators -- notice how the existence operator notes with emphasis
 that it cares about array *string* values only.


 OK, please submit a patch that you think clears it up.

I was going to better document the new operators anyway, so I'll get
this in passing.

 The heading on jsonb_op.c (note spelling) suggests that it's for indexed
 operations. I don't mind rearranging the code layout to something people
 think more logical, but I also don't want to engage in unnecessary code
 churn. I'm not that religious about the organization.

I see what you mean.

 It was always known that some cases would not work with TESTS - that was
 part of Tom's reservation about the whole thing. You can say

Sure, but you added it, and you mentioned jsonb in the commit message,
before jsonb was even committed to the master branch. :-)

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers