On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <k...@equatoria.us> wrote:
> > On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form < > nore...@postgresql.org> wrote: > >> The following documentation comment has been logged on the website: >> >> Page: https://www.postgresql.org/docs/17/functions-json.html >> Description: >> >> On the manual page >> https://www.postgresql.org/docs/current/functions-json.html, in the Table >> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It >> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any >> object in the expression is also tested to see if it has duplicate keys." >> And such text is ambiguous, because the term "object" has certain meaning >> regarding json format. In reality the option WITH UNIQUE KEYS allows to >> check for duplicated keys any array element not object. For objects, both >> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON >> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it >> is at the same time with and without unique values, how it is possible?), >> i.e. it works the same as just IS JSON ARRAY. The example code that >> confirms >> my reasoning: >> SELECT >> js.vl AS "tested str", >> >> >> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ >> keys", >> >> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys", >> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys", >> >> >> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys", >> js.vl IS JSON ARRAY ".. array" >> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a": >> "a2"}]'), >> ('["a", "a"]')) AS js(vl); >> >> I'm not sure what should be the right logic for this option, for me it >> looks >> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if >> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT >> UNIQUE KEYS but not for both at the same time. But anyway the sentence I >> showed above should contain "array" instead of "object" because for >> objects >> it returns false independently of applied option. I tested it on >> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit". >> > > First, WITHOUT UNIQUE KEYS does not mean "confirm that there are duplicate > keys", it's just a way of stating the default explicitly. In other words it > means "w/o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON > OBJECT WITHOUT UNIQUE KEYS will both always return identical results on the > same JSON expression. > > Secondly, the UNIQUE test is recursive; for objects maybe the meaning is > intuitive, but for JSON arrays -- which don't have any concept of keys; > JSON arrays are just ordered lists -- it means "does this array contain any > embedded objects with duplicate keys". > > See: > > SELECT js, > js IS JSON "json?", > js IS JSON OBJECT "object?", > js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?", > js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?", > js IS JSON ARRAY "array?", > js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?", > js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?" > FROM (VALUES > ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK > ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests > ('{"b":2,"b":3}'), -- expect t for object, object w/o UK > ('{"c":2,"d":3}'), -- expect t for ALL object tests > ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for > nested objects > ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep > ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays > recursively for embedded objecs > ) foo(js); > > > A couple of side notes: > > 1. Your first data example is not JSON at all. It's helpful for this kind > of test to include a plain IS JSON column, since any of the IS JSON X tests > can fail for two reasons: (a) it's not JSON, or (b) it is JSON but it's not > an X. > > 2. Curiously, the JSON spec itself is completely silent on the meaning of > objects with duplicate keys. PostgreSQL is more helpful in this > regard--the docs explicitly state that the last value is the one that is > retained by JSONB and used in processing functions. > > To improve the documentation here, I would suggest simply adding the word "recursively" after "tested": If WITH UNIQUE KEYS is specified, then any object in the *expression* is also tested recursively to see if it has duplicate keys As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means "without testing for uniqueness" rather than "confirming that non-unique keys are present", I will defer to anyone who has more familiarity with the PostgreSQL documentation style. I probed around looking at other situations where there are phrases that make the default condition explicit (e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my attempts ended up too wordy for my liking.