Hello Kirk, Thank you for your answer and detailed explanation. You are absolutely right, I made a mistake in my reasoning. On the other hand, some ambiguity exists. It is a perfect idea to clarify what WITHOUT UNIQUE KEYS means. Such clarification could help with understanding the logic of IS JSON faster and avoiding confusion and ambiguity. I found the same option for a few JSON creation functions in table 9.47 at the same page, like json_object() and json(). Their descriptions also don't clarify this point.
Best regards, Valery чт, 20 мар. 2025 г. в 15:22, Kirk Parker <k...@equatoria.us>: > On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <k...@equatoria.us> wrote: > >> >> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form < >> nore...@postgresql.org> wrote: >> >>> The following documentation comment has been logged on the website: >>> >>> Page: https://www.postgresql.org/docs/17/functions-json.html >>> Description: >>> >>> On the manual page >>> https://www.postgresql.org/docs/current/functions-json.html, in the >>> Table >>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It >>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any >>> object in the expression is also tested to see if it has duplicate keys." >>> And such text is ambiguous, because the term "object" has certain meaning >>> regarding json format. In reality the option WITH UNIQUE KEYS allows to >>> check for duplicated keys any array element not object. For objects, both >>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON >>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true >>> (it >>> is at the same time with and without unique values, how it is possible?), >>> i.e. it works the same as just IS JSON ARRAY. The example code that >>> confirms >>> my reasoning: >>> SELECT >>> js.vl AS "tested str", >>> >>> >>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ >>> keys", >>> >>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ >>> keys", >>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ >>> keys", >>> >>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys", >>> js.vl IS JSON ARRAY ".. array" >>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a": >>> "a2"}]'), >>> ('["a", "a"]')) AS js(vl); >>> >>> I'm not sure what should be the right logic for this option, for me it >>> looks >>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but >>> if >>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT >>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I >>> showed above should contain "array" instead of "object" because for >>> objects >>> it returns false independently of applied option. I tested it on >>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, >>> 64-bit". >>> >> >> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are >> duplicate keys", it's just a way of stating the default explicitly. In >> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJECT >> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical >> results on the same JSON expression. >> >> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is >> intuitive, but for JSON arrays -- which don't have any concept of keys; >> JSON arrays are just ordered lists -- it means "does this array contain any >> embedded objects with duplicate keys". >> >> See: >> >> SELECT js, >> js IS JSON "json?", >> js IS JSON OBJECT "object?", >> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?", >> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?", >> js IS JSON ARRAY "array?", >> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?", >> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?" >> FROM (VALUES >> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK >> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests >> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK >> ('{"c":2,"d":3}'), -- expect t for ALL object tests >> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for >> nested objects >> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep >> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays >> recursively for embedded objecs >> ) foo(js); >> >> >> A couple of side notes: >> >> 1. Your first data example is not JSON at all. It's helpful for this >> kind of test to include a plain IS JSON column, since any of the IS JSON X >> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but >> it's not an X. >> >> 2. Curiously, the JSON spec itself is completely silent on the meaning of >> objects with duplicate keys. PostgreSQL is more helpful in this >> regard--the docs explicitly state that the last value is the one that is >> retained by JSONB and used in processing functions. >> >> > To improve the documentation here, I would suggest simply adding the word > "recursively" after "tested": > > If WITH UNIQUE KEYS is specified, then any object in the *expression* is > also tested recursively to see if it has duplicate keys > > As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means > "without testing for uniqueness" rather than "confirming that non-unique > keys are present", I will defer to anyone who has more familiarity with the > PostgreSQL documentation style. I probed around looking at other > situations where there are phrases that make the default condition explicit > (e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my > attempts ended up too wordy for my liking. > -- С уважением, Валерий