Hello Hackers, I’ve attached a patch to start adding SQL:2023 JSON simplified accessor support. This allows accessing JSON or JSONB fields using dot notation (e.g., colname.field.field...), similar to composite types.
Currently, PostgreSQL uses nonstandard syntax like colname->x->y for JSON and JSONB, and colname['blah'] for JSONB. These existing syntaxes predate the standard. Oracle already supports the standard dot notation syntax [1]. The full specification for the JSON simplified accessor format is as follows: <JSON simplified accessor> ::= <value expression primary> <JSON simplified accessor op chain> <JSON simplified accessor op chain> ::= <JSON simplified accessor op> | <JSON simplified accessor op chain> <JSON simplified accessor op> <JSON simplified accessor op> ::= <JSON member accessor> | <JSON wildcard member accessor> | <JSON array accessor> | <JSON wildcard array accessor> | <JSON item method> I’ve implemented the member and array accessors and attached two alternative patches: 1. v1-0001-Add-JSON-JSONB-simplified-accessor.patch: This patch enables dot access to JSON object fields and subscript access to indexed JSON array elements by converting "." and "[]" indirection into a JSON_QUERY JsonFuncExpr node. 2. v2-0001-Transform-JSON-dot-access-to-arrow-operator.txt: This alternative patch implements dot access to JSON object fields by transforming the "." indirection into a "->" operator. The upside of the v1 patch is that it strictly aligns with the SQL standard, which specifies that the simplified access is equivalent to: JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) However, the performance of JSON_QUERY might be suboptimal due to function call overhead. Therefore, I implemented the v2 alternative using the "->" operator. There is some uncertainty about the semantics of conditional array wrappers. Currently, there is at least one subtle difference between the "->" operator and JSON_QUERY, as shown: postgres=# select '{"a": 42}'::json->'a'; ?column? ---------- 42 (1 row) postgres=# select json_query('{"a": 42}'::json, 'lax $.a' with conditional array wrapper null on empty null on error); json_query ------------ [42] (1 row) JSON_QUERY encloses the JSON value 42 in brackets, which may be a bug, as Peter noted [2]. If there are no other semantic differences, we could implement simple access without using JSON_QUERY to avoid function call overhead. I aim to first enable standard dot notation access to JSON object fields. Both patches implement this, and I’m also open to alternative approaches. For subscripting access to jsonb array elements, jsonb already supports this via the subscripting handler interface. In the v1 patch, I added json support using JSON_QUERY, but I can easily adapt this for the v2 patch using the -> operator. I did not leverage the subscripting handler interface for json because implementing the fetch/assign functions for json seems challenging for plain text. Let me know if you have a different approach in mind. Finally, I have not implemented wildcard or item method accessors yet and would appreciate input on their necessity. [1] https://docs.oracle.com/en/database/oracle/oracle-database/21/adjsn/simple-dot-notation-access-to-json-data.html#GUID-7249417B-A337-4854-8040-192D5CEFD576 [2] https://www.postgresql.org/message-id/8022e067-818b-45d3-8fab-6e0d94d03...@eisentraut.org
From 99ee1cbc45ebd76ce21881bda95933d8a5a104c6 Mon Sep 17 00:00:00 2001 From: Alexandra Wang <alexandra.wang....@gmail.com> Date: Thu, 15 Aug 2024 02:11:33 -0700 Subject: [PATCH v2] Transform JSON dot access to arrow operator Enabled dot-notation access to JSON/JSONB object by making a syntatic sugar for the "->" operator in ParseFuncOrColumn() for arg of JSON/JSONB type. JSON array access via subscripting is not yet supported in this patch, but can be implemented similarly by creating an OpExpr for the json_array_element "->" operator. Note that the output of the "->" operators are not wrapped by brackets, which differs from the SQL standard specification for the JSON simplified accessor equivalence shown below: JSON_QUERY (VEP, 'lax $.JC' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY NULL ON ERROR) --- src/backend/parser/parse_func.c | 57 +++++++++++++++++++++--- src/include/catalog/pg_operator.dat | 4 +- src/include/parser/parse_type.h | 1 + src/test/regress/expected/json.out | 67 +++++++++++++++++++++++++++++ src/test/regress/expected/jsonb.out | 55 +++++++++++++++++++++++ src/test/regress/sql/json.sql | 20 +++++++++ src/test/regress/sql/jsonb.sql | 17 ++++++++ 7 files changed, 214 insertions(+), 7 deletions(-) diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c index 9b23344a3b..431c9883f2 100644 --- a/src/backend/parser/parse_func.c +++ b/src/backend/parser/parse_func.c @@ -33,6 +33,8 @@ #include "utils/builtins.h" #include "utils/lsyscache.h" #include "utils/syscache.h" +#include "parser/parse_oper.h" +#include "catalog/pg_operator_d.h" /* Possible error codes from LookupFuncNameInternal */ @@ -48,6 +50,8 @@ static void unify_hypothetical_args(ParseState *pstate, static Oid FuncNameAsType(List *funcname); static Node *ParseComplexProjection(ParseState *pstate, const char *funcname, Node *first_arg, int location); +static Node *ParseJsonSimplifiedAccessorProjection(ParseState *pstate, const char *funcname, + Node *first_arg, int location); static Oid LookupFuncNameInternal(ObjectType objtype, List *funcname, int nargs, const Oid *argtypes, bool include_out_arguments, bool missing_ok, @@ -226,17 +230,24 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs, !func_variadic && argnames == NIL && list_length(funcname) == 1 && (actual_arg_types[0] == RECORDOID || - ISCOMPLEX(actual_arg_types[0]))); + ISCOMPLEX(actual_arg_types[0]) || + ISJSON(actual_arg_types[0]))); /* * If it's column syntax, check for column projection case first. */ if (could_be_projection && is_column) { - retval = ParseComplexProjection(pstate, - strVal(linitial(funcname)), - first_arg, - location); + if (ISJSON(actual_arg_types[0])) + retval = ParseJsonSimplifiedAccessorProjection(pstate, + strVal(linitial(funcname)), + first_arg, + location); + else + retval = ParseComplexProjection(pstate, + strVal(linitial(funcname)), + first_arg, + location); if (retval) return retval; @@ -1902,6 +1913,42 @@ FuncNameAsType(List *funcname) return result; } +/* + * ParseJsonSimplifiedAccessorProjection - + * handles function calls with a single argument that is of json type. + * If the function call is actually a column projection, return a suitably + * transformed expression tree. If not, return NULL. + */ +static Node * +ParseJsonSimplifiedAccessorProjection(ParseState *pstate, const char *funcname, + Node *first_arg, int location) +{ + OpExpr *result; + Node *rexpr; + rexpr = (Node *) makeConst( + TEXTOID, + -1, + InvalidOid, + -1, + CStringGetTextDatum(funcname), + false, + false); + + result = makeNode(OpExpr); + if (exprType(first_arg) == JSONOID) { + result->opno = OID_JSON_OBJECT_FIELD_OP; + result->opresulttype = JSONOID; + } else { + Assert(exprType(first_arg) == JSONBOID); + result->opno = OID_JSONB_OBJECT_FIELD_OP; + result->opresulttype = JSONBOID; + } + result->opfuncid = get_opcode(result->opno); + result->args = list_make2(first_arg, rexpr); + result->location = location; + return (Node *) result; +} + /* * ParseComplexProjection - * handles function calls with a single argument that is of complex type. diff --git a/src/include/catalog/pg_operator.dat b/src/include/catalog/pg_operator.dat index 0e7511dde1..0ef9d80357 100644 --- a/src/include/catalog/pg_operator.dat +++ b/src/include/catalog/pg_operator.dat @@ -3154,7 +3154,7 @@ oprname => '*', oprleft => 'anyrange', oprright => 'anyrange', oprresult => 'anyrange', oprcom => '*(anyrange,anyrange)', oprcode => 'range_intersect' }, -{ oid => '3962', descr => 'get json object field', +{ oid => '3962', oid_symbol => 'OID_JSON_OBJECT_FIELD_OP', descr => 'get json object field', oprname => '->', oprleft => 'json', oprright => 'text', oprresult => 'json', oprcode => 'json_object_field' }, { oid => '3963', descr => 'get json object field as text', @@ -3172,7 +3172,7 @@ { oid => '3967', descr => 'get value from json as text with path elements', oprname => '#>>', oprleft => 'json', oprright => '_text', oprresult => 'text', oprcode => 'json_extract_path_text' }, -{ oid => '3211', descr => 'get jsonb object field', +{ oid => '3211', oid_symbol => 'OID_JSONB_OBJECT_FIELD_OP', descr => 'get jsonb object field', oprname => '->', oprleft => 'jsonb', oprright => 'text', oprresult => 'jsonb', oprcode => 'jsonb_object_field' }, { oid => '3477', descr => 'get jsonb object field as text', diff --git a/src/include/parser/parse_type.h b/src/include/parser/parse_type.h index b62e7a6ce9..9c8b3bfb2f 100644 --- a/src/include/parser/parse_type.h +++ b/src/include/parser/parse_type.h @@ -57,5 +57,6 @@ extern bool parseTypeString(const char *str, Oid *typeid_p, int32 *typmod_p, /* true if typeid is composite, or domain over composite, but not RECORD */ #define ISCOMPLEX(typeid) (typeOrDomainTypeRelid(typeid) != InvalidOid) +#define ISJSON(typeid) (typeid == JSONOID || typeid == JSONBOID) #endif /* PARSE_TYPE_H */ diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out index 7df11c2f38..39bfa724c2 100644 --- a/src/test/regress/expected/json.out +++ b/src/test/regress/expected/json.out @@ -2713,3 +2713,70 @@ select ts_headline('[]'::json, tsquery('aaa & bbb')); [] (1 row) +-- simple dot notation +drop table if exists test_json_dot; +NOTICE: table "test_json_dot" does not exist, skipping +create table test_json_dot(id int, test_json json); +insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json; +insert into test_json_dot select 1, '{"a": 2, "b": {"c": 42}}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[{"x": [11, 12]}, {"y": [21, 22]}]}'::json; +-- member object access +select (test_json_dot.test_json).b, json_query(test_json, 'lax $.b' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; + b | expected +-------------+------------- + 42 | [42] + {"c": 42} | {"c": 42} + {"c": "42"} | {"c": "42"} + {"c": "42"} | {"c": "42"} +(4 rows) + +select (test_json_dot.test_json).b.c, json_query(test_json, 'lax $.b.c' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; + c | expected +------+---------- + | + 42 | [42] + "42" | ["42"] + "42" | ["42"] +(4 rows) + +select (test_json_dot.test_json).d, json_query(test_json, 'lax $.d' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; + d | expected +------------------------------------+------------------------------------ + | + | + [11, 12] | [11, 12] + [{"x": [11, 12]}, {"y": [21, 22]}] | [{"x": [11, 12]}, {"y": [21, 22]}] +(4 rows) + +select (test_json_dot.test_json)."d", json_query(test_json, 'lax $.d' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; + d | expected +------------------------------------+------------------------------------ + | + | + [11, 12] | [11, 12] + [{"x": [11, 12]}, {"y": [21, 22]}] | [{"x": [11, 12]}, {"y": [21, 22]}] +(4 rows) + +select (test_json_dot.test_json).'d' from test_json_dot; +ERROR: syntax error at or near "'d'" +LINE 1: select (test_json_dot.test_json).'d' from test_json_dot; + ^ +-- array element access +select (test_json_dot.test_json).d->0 from test_json_dot; + ?column? +----------------- + + + 11 + {"x": [11, 12]} +(4 rows) + +select (test_json_dot.test_json).d[0], json_query(test_json, 'lax $.d[0]' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +ERROR: cannot subscript type json because it does not support subscripting +LINE 1: select (test_json_dot.test_json).d[0], json_query(test_json,... + ^ +select (test_json_dot.test_json).d[1], json_query(test_json, 'lax $.d[1]' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +ERROR: cannot subscript type json because it does not support subscripting +LINE 1: select (test_json_dot.test_json).d[1], json_query(test_json,... + ^ diff --git a/src/test/regress/expected/jsonb.out b/src/test/regress/expected/jsonb.out index 7d163a156e..c30f3e174c 100644 --- a/src/test/regress/expected/jsonb.out +++ b/src/test/regress/expected/jsonb.out @@ -5715,3 +5715,58 @@ select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8; 12345 (1 row) +-- simple dot notation +drop table if exists test_jsonb_dot; +NOTICE: table "test_jsonb_dot" does not exist, skipping +create table test_jsonb_dot(id int, test_jsonb jsonb); +insert into test_jsonb_dot select 1, '{"a": 1, "b": 42}'::json; +insert into test_jsonb_dot select 1, '{"a": 2, "b": {"c": 42}}'::json; +insert into test_jsonb_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::json; +-- member object access +select (test_jsonb_dot.test_jsonb).b from test_jsonb_dot; + b +------------- + 42 + {"c": 42} + {"c": "42"} +(3 rows) + +select (test_jsonb_dot.test_jsonb).b.c from test_jsonb_dot; + c +------ + + 42 + "42" +(3 rows) + +select (test_json_dot.test_json).d from test_json_dot; + d +------------------------------------ + + + [11, 12] + [{"x": [11, 12]}, {"y": [21, 22]}] +(4 rows) + +select (test_json_dot.test_json)."d" from test_json_dot; + d +------------------------------------ + + + [11, 12] + [{"x": [11, 12]}, {"y": [21, 22]}] +(4 rows) + +select (test_json_dot.test_json).'d' from test_json_dot; +ERROR: syntax error at or near "'d'" +LINE 1: select (test_json_dot.test_json).'d' from test_json_dot; + ^ +-- array element access +select (test_jsonb_dot.test_jsonb).d[0] from test_jsonb_dot; + d +---- + + + 11 +(3 rows) + diff --git a/src/test/regress/sql/json.sql b/src/test/regress/sql/json.sql index 5c886cd6b3..f9b2f7a35d 100644 --- a/src/test/regress/sql/json.sql +++ b/src/test/regress/sql/json.sql @@ -867,3 +867,23 @@ select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc ddd fff", "c1": select ts_headline('null'::json, tsquery('aaa & bbb')); select ts_headline('{}'::json, tsquery('aaa & bbb')); select ts_headline('[]'::json, tsquery('aaa & bbb')); + +-- simple dot notation +drop table if exists test_json_dot; +create table test_json_dot(id int, test_json json); +insert into test_json_dot select 1, '{"a": 1, "b": 42}'::json; +insert into test_json_dot select 1, '{"a": 2, "b": {"c": 42}}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::json; +insert into test_json_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[{"x": [11, 12]}, {"y": [21, 22]}]}'::json; + +-- member object access +select (test_json_dot.test_json).b, json_query(test_json, 'lax $.b' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +select (test_json_dot.test_json).b.c, json_query(test_json, 'lax $.b.c' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +select (test_json_dot.test_json).d, json_query(test_json, 'lax $.d' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +select (test_json_dot.test_json)."d", json_query(test_json, 'lax $.d' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +select (test_json_dot.test_json).'d' from test_json_dot; + +-- array element access +select (test_json_dot.test_json).d->0 from test_json_dot; +select (test_json_dot.test_json).d[0], json_query(test_json, 'lax $.d[0]' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; +select (test_json_dot.test_json).d[1], json_query(test_json, 'lax $.d[1]' WITH CONDITIONAL WRAPPER NULL ON EMPTY NULL ON ERROR) as expected from test_json_dot; diff --git a/src/test/regress/sql/jsonb.sql b/src/test/regress/sql/jsonb.sql index 5f0190d5a2..f2b800910d 100644 --- a/src/test/regress/sql/jsonb.sql +++ b/src/test/regress/sql/jsonb.sql @@ -1559,3 +1559,20 @@ select '12345.0000000000000000000000000000000000000000000005'::jsonb::float8; select '12345.0000000000000000000000000000000000000000000005'::jsonb::int2; select '12345.0000000000000000000000000000000000000000000005'::jsonb::int4; select '12345.0000000000000000000000000000000000000000000005'::jsonb::int8; + +-- simple dot notation +drop table if exists test_jsonb_dot; +create table test_jsonb_dot(id int, test_jsonb jsonb); +insert into test_jsonb_dot select 1, '{"a": 1, "b": 42}'::json; +insert into test_jsonb_dot select 1, '{"a": 2, "b": {"c": 42}}'::json; +insert into test_jsonb_dot select 1, '{"a": 3, "b": {"c": "42"}, "d":[11, 12]}'::json; + +-- member object access +select (test_jsonb_dot.test_jsonb).b from test_jsonb_dot; +select (test_jsonb_dot.test_jsonb).b.c from test_jsonb_dot; +select (test_json_dot.test_json).d from test_json_dot; +select (test_json_dot.test_json)."d" from test_json_dot; +select (test_json_dot.test_json).'d' from test_json_dot; + +-- array element access +select (test_jsonb_dot.test_jsonb).d[0] from test_jsonb_dot; -- 2.39.3 (Apple Git-146)
v1-0001-Add-JSON-JSONB-simplified-accessor.patch
Description: Binary data