I looked into bug #13756,
http://www.postgresql.org/message-id/20151105171933.14035.25...@wrigleys.postgresql.org

The cause of the problem is that gin_extract_jsonb_path() computes
a different hash for the "2" in
        {"a":[ ["b",{"x":1}], ["b",{"x":2}]]}
than it does for the "2" in
        {"a":[[{"x":2}]]}
That in turn happens because the function supposes that after emitting a
hash for a VALUE, it can just leave stack->hash to be reinitialized by the
next KEY or ELEM.  But if the next thing is a sub-object, we propagate
the previous value's hash down into the sub-object, causing values
therein to receive different hashes than they'd have gotten without
the preceding outer-level VALUE.
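
To make the failure mode concrete, here is a stand-alone toy model of the
hash stack.  It is only a sketch, not the code in jsonb_gin.c: the Tok
stream, mix(), last_entry() and the apply_fix flag are all made up for
illustration, mix() is merely a stand-in for JsonbHashScalarValue(), and
the array hash[] plays the role of the PathHashStack chain (hash[depth - 1]
corresponding to stack->parent->hash).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token stream standing in for JsonbIteratorNext() results. */
typedef enum { T_BEGIN, T_END, T_KEY, T_ELEM, T_VALUE } TokType;
typedef struct { TokType type; const char *text; } Tok;

#define MAX_DEPTH 16
#define NTOKS(a) (sizeof(a) / sizeof((a)[0]))

/* Toy stand-in for JsonbHashScalarValue(): rotate left, XOR in FNV-1a. */
static void mix(uint32_t *hash, const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s; s++)
    {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    *hash = (*hash << 1) | (*hash >> 31);
    *hash ^= h;
}

/* Walk the tokens and return the last index entry that would be emitted. */
static uint32_t last_entry(const Tok *toks, size_t n, bool apply_fix)
{
    uint32_t hash[MAX_DEPTH] = {0};   /* hash[0] models the base stack entry */
    int      depth = 0;
    uint32_t last = 0;

    for (size_t i = 0; i < n; i++)
    {
        switch (toks[i].type)
        {
            case T_BEGIN:
                assert(depth + 1 < MAX_DEPTH);
                /* new level inherits whatever the current level holds */
                hash[depth + 1] = hash[depth];
                depth++;
                break;
            case T_KEY:
                assert(depth > 0);
                hash[depth] = hash[depth - 1];
                mix(&hash[depth], toks[i].text);
                break;
            case T_ELEM:
                assert(depth > 0);
                hash[depth] = hash[depth - 1];
                /* FALL THRU */
            case T_VALUE:
                assert(depth > 0);
                mix(&hash[depth], toks[i].text);
                last = hash[depth];
                /* the proposed fix: don't leave the value's hash behind */
                if (apply_fix)
                    hash[depth] = hash[depth - 1];
                break;
            case T_END:
                assert(depth > 0);
                depth--;
                break;
        }
    }
    return last;
}

/* {"a":[ ["b",{"x":1}], ["b",{"x":2}]]} */
static const Tok DOC_WITH_B[] = {
    {T_BEGIN, NULL}, {T_KEY, "a"}, {T_BEGIN, NULL},
    {T_BEGIN, NULL}, {T_ELEM, "b"},
    {T_BEGIN, NULL}, {T_KEY, "x"}, {T_VALUE, "1"}, {T_END, NULL},
    {T_END, NULL},
    {T_BEGIN, NULL}, {T_ELEM, "b"},
    {T_BEGIN, NULL}, {T_KEY, "x"}, {T_VALUE, "2"}, {T_END, NULL},
    {T_END, NULL},
    {T_END, NULL}, {T_END, NULL}
};

/* {"a":[[{"x":2}]]} */
static const Tok DOC_PLAIN[] = {
    {T_BEGIN, NULL}, {T_KEY, "a"}, {T_BEGIN, NULL}, {T_BEGIN, NULL},
    {T_BEGIN, NULL}, {T_KEY, "x"}, {T_VALUE, "2"}, {T_END, NULL},
    {T_END, NULL}, {T_END, NULL}, {T_END, NULL}
};
```

In this model, calling last_entry() with apply_fix = false on the two
token streams should give different hashes for the "2", because the hash
left behind by the "b" element leaks into the following sub-object; with
apply_fix = true the two calls should agree.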

The attached one-liner fixes it, but of course this means that on-disk
jsonb_path_ops indexes are possibly broken and will need to be reindexed.
I see no way around that ... does anybody else?

                        regards, tom lane

diff --git a/src/backend/utils/adt/jsonb_gin.c b/src/backend/utils/adt/jsonb_gin.c
index 204fb8b..b917ec2 100644
*** a/src/backend/utils/adt/jsonb_gin.c
--- b/src/backend/utils/adt/jsonb_gin.c
*************** gin_extract_jsonb_path(PG_FUNCTION_ARGS)
*** 419,425 ****
  				JsonbHashScalarValue(&v, &stack->hash);
  				/* and emit an index entry */
  				entries[i++] = UInt32GetDatum(stack->hash);
! 				/* Note: we assume we'll see KEY before another VALUE */
  				break;
  			case WJB_END_ARRAY:
  			case WJB_END_OBJECT:
--- 419,426 ----
  				JsonbHashScalarValue(&v, &stack->hash);
  				/* and emit an index entry */
  				entries[i++] = UInt32GetDatum(stack->hash);
! 				/* reset hash for next sub-object */
! 				stack->hash = stack->parent->hash;
  				break;
  			case WJB_END_ARRAY:
  			case WJB_END_OBJECT: