Re: remaining sql/json patches

2024-05-15 Thread Thom Brown
On Mon, 8 Apr 2024 at 10:09, Amit Langote  wrote:

> On Mon, Apr 8, 2024 at 2:02 PM jian he 
> wrote:
> > On Mon, Apr 8, 2024 at 11:21 AM jian he 
> wrote:
> > >
> > > On Mon, Apr 8, 2024 at 12:34 AM jian he 
> wrote:
> > > >
> > > > On Sun, Apr 7, 2024 at 9:36 PM Amit Langote 
> wrote:
> > > > > 0002 needs an expanded commit message but I've run out of energy
> today.
> > > > >
> > >
> > > other than that, it looks good to me.
> >
> > one more tiny issue.
> > +EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1;
> > +ERROR:  relation "jsonb_table_view1" does not exist
> > +LINE 1: EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1...
> > +   ^
> > maybe you want
> > EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view7;
> > at the end of the sqljson_jsontable.sql.
> > I guess it will be fine, but the format json explain's out is quite big.
> >
> > you also need to `drop table s cascade;` at the end of the test?
>
> Pushed after fixing this and other issues.  Thanks a lot for your
> careful reviews.
>
> I've marked the CF entry for this as committed now:
> https://commitfest.postgresql.org/47/4377/
>
> Let's work on the remaining PLAN clause with a new entry in the next
> CF, possibly in a new email thread.
>

I've just taken a look at the doc changes, and I think we need to either
remove the leading "select" keyword, or uppercase it in the examples.

For example (on
https://www.postgresql.org/docs/devel/functions-json.html#SQLJSON-QUERY-FUNCTIONS
):

json_exists ( context_item, path_expression [ PASSING { value AS varname }
[, ...]] [ { TRUE | FALSE | UNKNOWN | ERROR } ON ERROR ])

Returns true if the SQL/JSON path_expression applied to the context_item
using the PASSING values yields any items.
The ON ERROR clause specifies the behavior if an error occurs; the default
is to return the boolean FALSE value. Note that if the path_expression is
strict and ON ERROR behavior is ERROR, an error is generated if it yields
no items.
Examples:
select json_exists(jsonb '{"key1": [1,2,3]}', 'strict $.key1[*] ? (@ > 2)')
→ t
select json_exists(jsonb '{"a": [1,2,3]}', 'lax $.a[5]' ERROR ON ERROR) → f
select json_exists(jsonb '{"a": [1,2,3]}', 'strict $.a[5]' ERROR ON ERROR) →
ERROR:  jsonpath array subscript is out of bounds

Examples are more difficult to read when keywords appear to be at the same
level as predicates.  Plus other examples within tables on the same page
don't start with "select", and further down, block examples uppercase
keywords.  Either way, I don't like it as it is.

Separate from this, I think these tables don't scan well (see json_query
for an example of what I'm referring to).  There is no clear separation of
the syntax definition, the description, and the example. This is more a
matter for the website mailing list, but I'm expressing it here to check
whether others agree.

Thom


Re: remaining sql/json patches

2024-04-08 Thread Amit Langote
On Mon, Apr 8, 2024 at 2:02 PM jian he  wrote:
> On Mon, Apr 8, 2024 at 11:21 AM jian he  wrote:
> >
> > On Mon, Apr 8, 2024 at 12:34 AM jian he  wrote:
> > >
> > > On Sun, Apr 7, 2024 at 9:36 PM Amit Langote  
> > > wrote:
> > > > 0002 needs an expanded commit message but I've run out of energy today.
> > > >
> >
> > other than that, it looks good to me.
>
> one more tiny issue.
> +EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1;
> +ERROR:  relation "jsonb_table_view1" does not exist
> +LINE 1: EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1...
> +   ^
> maybe you want
> EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view7;
> at the end of the sqljson_jsontable.sql.
> I guess it will be fine, but the format json explain's out is quite big.
>
> you also need to `drop table s cascade;` at the end of the test?

Pushed after fixing this and other issues.  Thanks a lot for your
careful reviews.

I've marked the CF entry for this as committed now:
https://commitfest.postgresql.org/47/4377/

Let's work on the remaining PLAN clause with a new entry in the next
CF, possibly in a new email thread.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-07 Thread jian he
On Mon, Apr 8, 2024 at 11:21 AM jian he  wrote:
>
> On Mon, Apr 8, 2024 at 12:34 AM jian he  wrote:
> >
> > On Sun, Apr 7, 2024 at 9:36 PM Amit Langote  wrote:
> > > 0002 needs an expanded commit message but I've run out of energy today.
> > >
>
> other than that, it looks good to me.

one more tiny issue.
+EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1;
+ERROR:  relation "jsonb_table_view1" does not exist
+LINE 1: EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view1...
+   ^
maybe you want
EXPLAIN (COSTS OFF, VERBOSE) SELECT * FROM jsonb_table_view7;
at the end of the sqljson_jsontable.sql.
I guess it will be fine, but the format json explain's out is quite big.

you also need to `drop table s cascade;` at the end of the test?




Re: remaining sql/json patches

2024-04-07 Thread jian he
On Mon, Apr 8, 2024 at 12:34 AM jian he  wrote:
>
> On Sun, Apr 7, 2024 at 9:36 PM Amit Langote  wrote:
> > 0002 needs an expanded commit message but I've run out of energy today.
> >
>

+/*
+ * Fetch next row from a JsonTablePlan's path evaluation result and from
+ * any child nested path(s).
+ *
+ * Returns true if the any of the paths (this or the nested) has more rows to
+ * return.
+ *
+ * By fetching the nested path(s)'s rows based on the parent row at each
+ * level, this essentially joins the rows of different levels.  If any level
+ * has no matching rows, the columns at that level will compute to NULL,
+ * making it an OUTER join.
+ */
+static bool
+JsonTablePlanScanNextRow(JsonTablePlanState *planstate)

"if the any"
should be
"if any" ?

also I think,
 + If any level
+ * has no matching rows, the columns at that level will compute to NULL,
+ * making it an OUTER join.
means
+ If any level rows do not match, the rows at that level will compute to NULL,
+ making it an OUTER join.

other than that, it looks good to me.




Re: remaining sql/json patches

2024-04-07 Thread jian he
On Sun, Apr 7, 2024 at 9:36 PM Amit Langote  wrote:
>
>
> 0002 needs an expanded commit message but I've run out of energy today.
>

some cosmetic issues in v51, 0002.

in struct JsonTablePathScan,
/* ERROR/EMPTY ON ERROR behavior */
bool errorOnError;

the comments seem not right.
I think "errorOnError" means
while evaluating the top level JSON path expression, whether "error on
error" is specified or not?


+  | NESTED  PATH  ]
json_path_specification  AS
json_path_name  COLUMNS (
json_table_column ,
... )
 

"NESTED  PATH  ] "
no need the closing bracket.



+ /* Update the nested plan(s)'s row(s) using this new row. */
+ if (planstate->nested)
+ {
+ JsonTableResetNestedPlan(planstate->nested);
+ if (JsonTablePlanNextRow(planstate->nested))
+ return true;
+ }
+
  return true;
 }

this part can be simplified as:
+ if (planstate->nested)
+{
+ JsonTableResetNestedPlan(planstate->nested);
+ JsonTablePlanNextRow(planstate->nested));
+}
since the last part, if it returns false, eventually it returns true.
also the comments seem slightly confusing?


v51 recursion function(JsonTablePlanNextRow, JsonTablePlanScanNextRow)
is far clearer than v50!
thanks. I think I get it.




Re: remaining sql/json patches

2024-04-07 Thread Amit Langote
On Sun, Apr 7, 2024 at 10:21 PM jian he  wrote:
> On Sun, Apr 7, 2024 at 12:30 PM jian he  wrote:
> >
> > other than that, it looks good to me.
> while looking at it again.
>
> + | NESTED path_opt Sconst
> + COLUMNS '(' json_table_column_definition_list ')'
> + {
> + JsonTableColumn *n = makeNode(JsonTableColumn);
> +
> + n->coltype = JTC_NESTED;
> + n->pathspec = (JsonTablePathSpec *)
> + makeJsonTablePathSpec($3, NULL, @3, -1);
> + n->columns = $6;
> + n->location = @1;
> + $$ = (Node *) n;
> + }
> + | NESTED path_opt Sconst AS name
> + COLUMNS '(' json_table_column_definition_list ')'
> + {
> + JsonTableColumn *n = makeNode(JsonTableColumn);
> +
> + n->coltype = JTC_NESTED;
> + n->pathspec = (JsonTablePathSpec *)
> + makeJsonTablePathSpec($3, $5, @3, @5);
> + n->columns = $8;
> + n->location = @1;
> + $$ = (Node *) n;
> + }
> + ;
> +
> +path_opt:
> + PATH
> + | /* EMPTY */
>   ;
>
> for `NESTED PATH`, `PATH` is optional.
> So for the doc, many places we need to replace `NESTED PATH` to `NESTED 
> [PATH]`?

Thanks for checking.

I've addressed most of your comments in the recent days including
today's.  Thanks for the patches for adding new test cases.  That was
very helpful.

I've changed the recursive structure of JsonTablePlanNextRow().  While
it still may not be perfect, I think it's starting to look good now.

0001 is a patch to fix up get_json_expr_options() so that it now emits
WRAPPER and QUOTES such that they work correctly.

0002 needs an expanded commit message but I've run out of energy today.

-- 
Thanks, Amit Langote


v51-0001-Fix-JsonExpr-deparsing-to-emit-QUOTES-and-WRAPPE.patch
Description: Binary data


v51-0002-JSON_TABLE-Add-support-for-NESTED-columns.patch
Description: Binary data


Re: remaining sql/json patches

2024-04-07 Thread jian he
On Sun, Apr 7, 2024 at 12:30 PM jian he  wrote:
>
> other than that, it looks good to me.
while looking at it again.

+ | NESTED path_opt Sconst
+ COLUMNS '(' json_table_column_definition_list ')'
+ {
+ JsonTableColumn *n = makeNode(JsonTableColumn);
+
+ n->coltype = JTC_NESTED;
+ n->pathspec = (JsonTablePathSpec *)
+ makeJsonTablePathSpec($3, NULL, @3, -1);
+ n->columns = $6;
+ n->location = @1;
+ $$ = (Node *) n;
+ }
+ | NESTED path_opt Sconst AS name
+ COLUMNS '(' json_table_column_definition_list ')'
+ {
+ JsonTableColumn *n = makeNode(JsonTableColumn);
+
+ n->coltype = JTC_NESTED;
+ n->pathspec = (JsonTablePathSpec *)
+ makeJsonTablePathSpec($3, $5, @3, @5);
+ n->columns = $8;
+ n->location = @1;
+ $$ = (Node *) n;
+ }
+ ;
+
+path_opt:
+ PATH
+ | /* EMPTY */
  ;

for `NESTED PATH`, `PATH` is optional.
So for the doc, many places we need to replace `NESTED PATH` to `NESTED [PATH]`?




Re: remaining sql/json patches

2024-04-06 Thread jian he
hi.
about v50.
+/*
+ * JsonTableSiblingJoin -
+ * Plan to union-join rows of nested paths of the same level
+ */
+typedef struct JsonTableSiblingJoin
+{
+ JsonTablePlan plan;
+
+ JsonTablePlan *lplan;
+ JsonTablePlan *rplan;
+} JsonTableSiblingJoin;

"Plan to union-join rows of nested paths of the same level"
same level problem misleading?
I think it means
"Plan to union-join rows of top level columns clause is a nested path"

+ if (IsA(planstate->plan, JsonTableSiblingJoin))
+ {
+ /* Fetch new from left sibling. */
+ if (!JsonTablePlanNextRow(planstate->left))
+ {
+ /*
+ * Left sibling ran out of rows, fetch new from right sibling.
+ */
+ if (!JsonTablePlanNextRow(planstate->right))
+ {
+ /* Right sibling and thus the plan has now more rows. */
+ return false;
+ }
+ }
+ }
/* Right sibling and thus the plan has now more rows. */
I think you mean:
/* Right sibling ran out of rows and thus the plan has no more rows. */


in  section,
+  | NESTED PATH json_path_specification
 AS path_name 
+COLUMNS ( json_table_column
, ... )

maybe make it into one line.

  | NESTED PATH json_path_specification
 AS path_name  COLUMNS
( json_table_column ,
... )

since the surrounding pattern is the next line beginning with "[",
meaning that next line is optional.


+ at arbitrary nesting levels.
maybe
+ at arbitrary nested level.

in src/tools/pgindent/typedefs.list, "JsonPathSpec" is unnecessary.

other than that, it looks good to me.




Re: remaining sql/json patches

2024-04-06 Thread Amit Langote
Hi,

On Sat, Apr 6, 2024 at 3:55 PM jian he  wrote:
> On Sat, Apr 6, 2024 at 2:03 PM Amit Langote  wrote:
> >
> > >
> > > * problem with type "char". the view def  output is not the same as
> > > the select * from v1.
> > >
> > > create or replace view v1 as
> > > SELECT col FROM s,
> > > JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}', '$' as c1
> > > COLUMNS(col "char" path '$.d' without wrapper keep quotes))sub;
> > >
> > > \sv v1
> > > CREATE OR REPLACE VIEW public.v1 AS
> > >  SELECT sub.col
> > >FROM s,
> > > JSON_TABLE(
> > > '{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
> > > COLUMNS (
> > > col "char" PATH '$."d"'
> > > )
> > > ) sub
> > > one under the hood called JSON_QUERY_OP, another called JSON_VALUE_OP.
> >
> > Hmm, I don't see a problem as long as both are equivalent or produce
> > the same result.  Though, perhaps we could make
> > get_json_expr_options() also deparse JSW_NONE explicitly into "WITHOUT
> > WRAPPER" instead of a blank.  But that's existing code, so will take
> > care of it as part of the above open item.
> >
> > > I will do extensive checking for other types later, so far, other than
> > > these two issues,
> > > get_json_table_columns is pretty solid, I've tried nested columns with
> > > nested columns, it just works.
> >
> > Thanks for checking.
> >
> After applying v50, this type also has some issues.
> CREATE OR REPLACE VIEW t1 as
> SELECT sub.* FROM JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}',
> '$' AS c1 COLUMNS (
> "tsvector0" tsvector path '$.d' without wrapper omit quotes,
> "tsvector1" tsvector path '$.d' without wrapper keep quotes))sub;
> table t1;
>
> return
> tsvector0|tsvector1
> -+-
>  '"hello1"]' '["hello",' | '"hello1"]' '["hello",'
> (1 row)
>
> src5=# \sv t1
> CREATE OR REPLACE VIEW public.t1 AS
>  SELECT tsvector0,
> tsvector1
>FROM JSON_TABLE(
> '{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
> COLUMNS (
> tsvector0 tsvector PATH '$."d"' OMIT QUOTES,
> tsvector1 tsvector PATH '$."d"'
> )
> ) sub
>
> but
>
>  SELECT tsvector0,
> tsvector1
>FROM JSON_TABLE(
> '{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
> COLUMNS (
> tsvector0 tsvector PATH '$."d"' OMIT QUOTES,
> tsvector1 tsvector PATH '$."d"'
> )
> ) sub
>
> only return
> tsvector0| tsvector1
> -+---
>  '"hello1"]' '["hello",' |

Yep, we *should* fix get_json_expr_options() to emit KEEP QUOTES and
WITHOUT WRAPPER options so that transformJsonTableColumns() does the
correct thing when you execute the \sv output.  Like this:

diff --git a/src/backend/utils/adt/ruleutils.c
b/src/backend/utils/adt/ruleutils.c
index 283ca53cb5..5a6aabe100 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -8853,9 +8853,13 @@ get_json_expr_options(JsonExpr *jsexpr,
deparse_context *context,
 appendStringInfo(context->buf, " WITH CONDITIONAL WRAPPER");
 else if (jsexpr->wrapper == JSW_UNCONDITIONAL)
 appendStringInfo(context->buf, " WITH UNCONDITIONAL WRAPPER");
+else if (jsexpr->wrapper == JSW_NONE)
+appendStringInfo(context->buf, " WITHOUT WRAPPER");

 if (jsexpr->omit_quotes)
 appendStringInfo(context->buf, " OMIT QUOTES");
+else
+appendStringInfo(context->buf, " KEEP QUOTES");
 }

Will get that pushed tomorrow.  Thanks for the test case.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-06 Thread jian he
On Fri, Apr 5, 2024 at 8:35 PM Amit Langote  wrote:
>
> On Thu, Apr 4, 2024 at 9:02 PM Amit Langote  wrote:
> > I'll post the rebased 0002 tomorrow after addressing your comments.
>
> Here's one.  Main changes:
>
> * Fixed a bug in get_table_json_columns() which caused nested columns
> to be deparsed incorrectly, something Jian reported upthread.
> * Simplified the algorithm in JsonTablePlanNextRow()
>
> I'll post another revision or two maybe tomorrow, but posting what I
> have now in case Jian wants to do more testing.
>

+ else
+ {
+ /*
+ * Parent and thus the plan has no more rows.
+ */
+ return false;
+ }
in JsonTablePlanNextRow, the above comment seems strange to me.

+ /*
+ * Re-evaluate a nested plan's row pattern using the new parent row
+ * pattern, if present.
+ */
+ Assert(parent != NULL);
+ if (!parent->current.isnull)
+ JsonTableResetRowPattern(planstate, parent->current.value);
Is this assertion useful?
if parent is null, then parent->current.isnull will cause segmentation fault.

I tested with 3 NESTED PATH, it works! (I didn't fully understand
JsonTablePlanNextRow though).
the doc needs some polish work.




Re: remaining sql/json patches

2024-04-06 Thread jian he
On Sat, Apr 6, 2024 at 2:03 PM Amit Langote  wrote:
>
> >
> > * problem with type "char". the view def  output is not the same as
> > the select * from v1.
> >
> > create or replace view v1 as
> > SELECT col FROM s,
> > JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}', '$' as c1
> > COLUMNS(col "char" path '$.d' without wrapper keep quotes))sub;
> >
> > \sv v1
> > CREATE OR REPLACE VIEW public.v1 AS
> >  SELECT sub.col
> >FROM s,
> > JSON_TABLE(
> > '{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
> > COLUMNS (
> > col "char" PATH '$."d"'
> > )
> > ) sub
> > one under the hood called JSON_QUERY_OP, another called JSON_VALUE_OP.
>
> Hmm, I don't see a problem as long as both are equivalent or produce
> the same result.  Though, perhaps we could make
> get_json_expr_options() also deparse JSW_NONE explicitly into "WITHOUT
> WRAPPER" instead of a blank.  But that's existing code, so will take
> care of it as part of the above open item.
>
> > I will do extensive checking for other types later, so far, other than
> > these two issues,
> > get_json_table_columns is pretty solid, I've tried nested columns with
> > nested columns, it just works.
>
> Thanks for checking.
>
After applying v50, this type also has some issues.
CREATE OR REPLACE VIEW t1 as
SELECT sub.* FROM JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}',
'$' AS c1 COLUMNS (
"tsvector0" tsvector path '$.d' without wrapper omit quotes,
"tsvector1" tsvector path '$.d' without wrapper keep quotes))sub;
table t1;

return
tsvector0|tsvector1
-+-
 '"hello1"]' '["hello",' | '"hello1"]' '["hello",'
(1 row)

src5=# \sv t1
CREATE OR REPLACE VIEW public.t1 AS
 SELECT tsvector0,
tsvector1
   FROM JSON_TABLE(
'{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
COLUMNS (
tsvector0 tsvector PATH '$."d"' OMIT QUOTES,
tsvector1 tsvector PATH '$."d"'
)
) sub

but

 SELECT tsvector0,
tsvector1
   FROM JSON_TABLE(
'{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
COLUMNS (
tsvector0 tsvector PATH '$."d"' OMIT QUOTES,
tsvector1 tsvector PATH '$."d"'
)
) sub

only return
tsvector0| tsvector1
-+---
 '"hello1"]' '["hello",' |




Re: remaining sql/json patches

2024-04-06 Thread Amit Langote
On Sat, Apr 6, 2024 at 12:31 PM jian he  wrote:
> On Fri, Apr 5, 2024 at 8:35 PM Amit Langote  wrote:
> > Here's one.  Main changes:
> >
> > * Fixed a bug in get_table_json_columns() which caused nested columns
> > to be deparsed incorrectly, something Jian reported upthread.
> > * Simplified the algorithm in JsonTablePlanNextRow()
> >
> > I'll post another revision or two maybe tomorrow, but posting what I
> > have now in case Jian wants to do more testing.
>
> i am using the upthread view validation function.
> by comparing `execute the view definition` and `select * from the_view`,
> I did find 2 issues.
>
> * problem in transformJsonBehavior, JSON_BEHAVIOR_DEFAULT branch.
> I think we can fix this problem later, since sql/json query function
> already committed?
>
> CREATE DOMAIN jsonb_test_domain AS text CHECK (value <> 'foo');
> normally, we do:
> SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
> jsonb_test_domain DEFAULT 'foo' ON ERROR);
>
> but parsing back view def, we do:
> SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
> jsonb_test_domain DEFAULT 'foo'::text::jsonb_test_domain ON ERROR);
>
> then I found the following two queries should not be error out.
> SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
> jsonb_test_domain DEFAULT 'foo1'::text::jsonb_test_domain ON ERROR);
> SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
> jsonb_test_domain DEFAULT 'foo1'::jsonb_test_domain ON ERROR);

Yeah, added an open item for this:
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items#Open_Issues

> 
>
> * problem with type "char". the view def  output is not the same as
> the select * from v1.
>
> create or replace view v1 as
> SELECT col FROM s,
> JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}', '$' as c1
> COLUMNS(col "char" path '$.d' without wrapper keep quotes))sub;
>
> \sv v1
> CREATE OR REPLACE VIEW public.v1 AS
>  SELECT sub.col
>FROM s,
> JSON_TABLE(
> '{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
> COLUMNS (
> col "char" PATH '$."d"'
> )
> ) sub
> one under the hood called JSON_QUERY_OP, another called JSON_VALUE_OP.

Hmm, I don't see a problem as long as both are equivalent or produce
the same result.  Though, perhaps we could make
get_json_expr_options() also deparse JSW_NONE explicitly into "WITHOUT
WRAPPER" instead of a blank.  But that's existing code, so will take
care of it as part of the above open item.

> I will do extensive checking for other types later, so far, other than
> these two issues,
> get_json_table_columns is pretty solid, I've tried nested columns with
> nested columns, it just works.

Thanks for checking.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-05 Thread Amit Langote
Hi Michael,

On Fri, Apr 5, 2024 at 3:07 PM Michael Paquier  wrote:
> On Fri, Apr 05, 2024 at 09:00:00AM +0300, Alexander Lakhin wrote:
> > Please look at an assertion failure:
> > TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c", 
> > Line: 3048, PID: 1325146
> >
> > triggered by the following query:
> > SELECT * FROM JSON_TABLE('0', '$' COLUMNS (js int PATH '$')),
> >   COALESCE(row(1)) AS (a int, b int);
> >
> > Without JSON_TABLE() I get:
> > ERROR:  function return row and query-specified return row do not match
> > DETAIL:  Returned row contains 1 attribute, but query expects 2.
>
> I've added an open item on this one.  We need to keep track of all
> that.

We figured out that this is an existing bug unrelated to JSON_TABLE(),
which Alexander reported to -bugs:
https://postgr.es/m/18422-89ca86c8eac52...@postgresql.org

I have moved the item to Older Bugs:
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items#Live_issues

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-05 Thread jian he
On Fri, Apr 5, 2024 at 8:35 PM Amit Langote  wrote:
> Here's one.  Main changes:
>
> * Fixed a bug in get_table_json_columns() which caused nested columns
> to be deparsed incorrectly, something Jian reported upthread.
> * Simplified the algorithm in JsonTablePlanNextRow()
>
> I'll post another revision or two maybe tomorrow, but posting what I
> have now in case Jian wants to do more testing.

i am using the upthread view validation function.
by comparing `execute the view definition` and `select * from the_view`,
I did find 2 issues.

* problem in transformJsonBehavior, JSON_BEHAVIOR_DEFAULT branch.
I think we can fix this problem later, since sql/json query function
already committed?

CREATE DOMAIN jsonb_test_domain AS text CHECK (value <> 'foo');
normally, we do:
SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
jsonb_test_domain DEFAULT 'foo' ON ERROR);

but parsing back view def, we do:
SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
jsonb_test_domain DEFAULT 'foo'::text::jsonb_test_domain ON ERROR);

then I found the following two queries should not be error out.
SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
jsonb_test_domain DEFAULT 'foo1'::text::jsonb_test_domain ON ERROR);
SELECT JSON_VALUE(jsonb '{"d1": "H"}', '$.a2' returning
jsonb_test_domain DEFAULT 'foo1'::jsonb_test_domain ON ERROR);


* problem with type "char". the view def  output is not the same as
the select * from v1.

create or replace view v1 as
SELECT col FROM s,
JSON_TABLE(jsonb '{"d": ["hello", "hello1"]}', '$' as c1
COLUMNS(col "char" path '$.d' without wrapper keep quotes))sub;

\sv v1
CREATE OR REPLACE VIEW public.v1 AS
 SELECT sub.col
   FROM s,
JSON_TABLE(
'{"d": ["hello", "hello1"]}'::jsonb, '$' AS c1
COLUMNS (
col "char" PATH '$."d"'
)
) sub
one under the hood called JSON_QUERY_OP, another called JSON_VALUE_OP.

I will do extensive checking for other types later, so far, other than
these two issues,
get_json_table_columns is pretty solid, I've tried nested columns with
nested columns, it just works.




Re: remaining sql/json patches

2024-04-05 Thread Amit Langote
On Thu, Apr 4, 2024 at 9:02 PM Amit Langote  wrote:
> I'll post the rebased 0002 tomorrow after addressing your comments.

Here's one.  Main changes:

* Fixed a bug in get_table_json_columns() which caused nested columns
to be deparsed incorrectly, something Jian reported upthread.
* Simplified the algorithm in JsonTablePlanNextRow()

I'll post another revision or two maybe tomorrow, but posting what I
have now in case Jian wants to do more testing.

-- 
Thanks, Amit Langote


v50-0001-JSON_TABLE-Add-support-for-NESTED-columns.patch
Description: Binary data


Re: remaining sql/json patches

2024-04-05 Thread Amit Langote
On Fri, Apr 5, 2024 at 5:00 PM Alexander Lakhin  wrote:
> 05.04.2024 10:09, Amit Langote wrote:
> > Seems like it might be a pre-existing issue, because I can also
> > reproduce the crash with:
>
> That's strange, because I get the error (on master, 6f132ed69).
> With backtrace_functions = 'tupledesc_match', I see
> 2024-04-05 10:48:27.827 MSK client backend[2898632] regress ERROR: function 
> return row and query-specified return row do
> not match
> 2024-04-05 10:48:27.827 MSK client backend[2898632] regress DETAIL: Returned 
> row contains 1 attribute, but query expects 2.
> 2024-04-05 10:48:27.827 MSK client backend[2898632] regress BACKTRACE:
> tupledesc_match at execSRF.c:948:3
> ExecMakeTableFunctionResult at execSRF.c:427:13
> FunctionNext at nodeFunctionscan.c:94:5
> ExecScanFetch at execScan.c:131:10
> ExecScan at execScan.c:180:10
> ExecFunctionScan at nodeFunctionscan.c:272:1
> ExecProcNodeFirst at execProcnode.c:465:1
> ExecProcNode at executor.h:274:9
>   (inlined by) ExecutePlan at execMain.c:1646:10
> standard_ExecutorRun at execMain.c:363:3
> ExecutorRun at execMain.c:305:1
> PortalRunSelect at pquery.c:926:26
> PortalRun at pquery.c:775:8
> exec_simple_query at postgres.c:1282:3
> PostgresMain at postgres.c:4684:27
> BackendMain at backend_startup.c:57:2
> pgarch_die at pgarch.c:847:1
> BackendStartup at postmaster.c:3593:8
> ServerLoop at postmaster.c:1674:6
> main at main.c:184:3
>  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) 
> [0x7f37127f0e40]
> 2024-04-05 10:48:27.827 MSK client backend[2898632] regress STATEMENT:  
> SELECT * FROM COALESCE(row(1)) AS (a int, b int);
>
> That's why I had attributed the failure to JSON_TABLE().
>
> Though SELECT * FROM generate_series(1, 1), COALESCE(row(1)) AS (a int, b 
> int);
> really triggers the assert too.
> Sorry for the noise...

No worries.  Let's start another thread so that this gets more attention.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-05 Thread Alexander Lakhin

05.04.2024 10:09, Amit Langote wrote:

Seems like it might be a pre-existing issue, because I can also
reproduce the crash with:

SELECT * FROM COALESCE(row(1)) AS (a int, b int);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

Backtrace:

#0  __pthread_kill_implementation (threadid=281472845250592,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x806c4334 in __pthread_kill_internal (signo=6,
threadid=) at pthread_kill.c:78
#2  0x8067c73c in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3  0x80669034 in __GI_abort () at abort.c:79
#4  0x00ad9d4c in ExceptionalCondition (conditionName=0xcbb368
"!(tupdesc->natts >= colcount)", errorType=0xcbb278 "FailedAssertion",
fileName=0xcbb2c8 "nodeFunctionscan.c",
 lineNumber=379) at assert.c:54


That's strange, because I get the error (on master, 6f132ed69).
With backtrace_functions = 'tupledesc_match', I see
2024-04-05 10:48:27.827 MSK client backend[2898632] regress ERROR: function return row and query-specified return row do 
not match

2024-04-05 10:48:27.827 MSK client backend[2898632] regress DETAIL: Returned 
row contains 1 attribute, but query expects 2.
2024-04-05 10:48:27.827 MSK client backend[2898632] regress BACKTRACE:
tupledesc_match at execSRF.c:948:3
ExecMakeTableFunctionResult at execSRF.c:427:13
FunctionNext at nodeFunctionscan.c:94:5
ExecScanFetch at execScan.c:131:10
ExecScan at execScan.c:180:10
ExecFunctionScan at nodeFunctionscan.c:272:1
ExecProcNodeFirst at execProcnode.c:465:1
ExecProcNode at executor.h:274:9
 (inlined by) ExecutePlan at execMain.c:1646:10
standard_ExecutorRun at execMain.c:363:3
ExecutorRun at execMain.c:305:1
PortalRunSelect at pquery.c:926:26
PortalRun at pquery.c:775:8
exec_simple_query at postgres.c:1282:3
PostgresMain at postgres.c:4684:27
BackendMain at backend_startup.c:57:2
pgarch_die at pgarch.c:847:1
BackendStartup at postmaster.c:3593:8
ServerLoop at postmaster.c:1674:6
main at main.c:184:3
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f37127f0e40]
2024-04-05 10:48:27.827 MSK client backend[2898632] regress STATEMENT:  SELECT 
* FROM COALESCE(row(1)) AS (a int, b int);

That's why I had attributed the failure to JSON_TABLE().

Though SELECT * FROM generate_series(1, 1), COALESCE(row(1)) AS (a int, b int);
really triggers the assert too.
Sorry for the noise...

Best regards,
Alexander




Re: remaining sql/json patches

2024-04-05 Thread Amit Langote
Hi Alexander,

On Fri, Apr 5, 2024 at 3:00 PM Alexander Lakhin  wrote:
>
> Hello Amit,
>
> 04.04.2024 15:02, Amit Langote wrote:
> > Pushed after fixing these and a few other issues.  I didn't include
> > the testing function you proposed in your other email.  It sounds
> > useful for testing locally but will need some work before we can
> > include it in the tree.
> >
> > I'll post the rebased 0002 tomorrow after addressing your comments.
>
> Please look at an assertion failure:
> TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c", 
> Line: 3048, PID: 1325146
>
> triggered by the following query:
> SELECT * FROM JSON_TABLE('0', '$' COLUMNS (js int PATH '$')),
>COALESCE(row(1)) AS (a int, b int);
>
> Without JSON_TABLE() I get:
> ERROR:  function return row and query-specified return row do not match
> DETAIL:  Returned row contains 1 attribute, but query expects 2.

Thanks for the report.

Seems like it might be a pre-existing issue, because I can also
reproduce the crash with:

SELECT * FROM COALESCE(row(1)) AS (a int, b int);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

Backtrace:

#0  __pthread_kill_implementation (threadid=281472845250592,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x806c4334 in __pthread_kill_internal (signo=6,
threadid=) at pthread_kill.c:78
#2  0x8067c73c in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3  0x80669034 in __GI_abort () at abort.c:79
#4  0x00ad9d4c in ExceptionalCondition (conditionName=0xcbb368
"!(tupdesc->natts >= colcount)", errorType=0xcbb278 "FailedAssertion",
fileName=0xcbb2c8 "nodeFunctionscan.c",
lineNumber=379) at assert.c:54
#5  0x0073edec in ExecInitFunctionScan (node=0x293d4ed0,
estate=0x293d51b8, eflags=16) at nodeFunctionscan.c:379
#6  0x00724bc4 in ExecInitNode (node=0x293d4ed0,
estate=0x293d51b8, eflags=16) at execProcnode.c:248
#7  0x0071b1cc in InitPlan (queryDesc=0x292f5d78, eflags=16)
at execMain.c:1006
#8  0x00719f6c in standard_ExecutorStart
(queryDesc=0x292f5d78, eflags=16) at execMain.c:252
#9  0x00719cac in ExecutorStart (queryDesc=0x292f5d78,
eflags=0) at execMain.c:134
#10 0x00945520 in PortalStart (portal=0x29399458, params=0x0,
eflags=0, snapshot=0x0) at pquery.c:527
#11 0x0093ee50 in exec_simple_query (query_string=0x29332d38
"SELECT * FROM COALESCE(row(1)) AS (a int, b int);") at
postgres.c:1175
#12 0x00943cb8 in PostgresMain (argc=1, argv=0x2935d610,
dbname=0x2935d450 "postgres", username=0x2935d430 "amit") at
postgres.c:4297
#13 0x0087e978 in BackendRun (port=0x29356c00) at postmaster.c:4517
#14 0x0087e0bc in BackendStartup (port=0x29356c00) at postmaster.c:4200
#15 0x00879638 in ServerLoop () at postmaster.c:1725
#16 0x00878eb4 in PostmasterMain (argc=3, argv=0x292eeac0) at
postmaster.c:1398
#17 0x00791db8 in main (argc=3, argv=0x292eeac0) at main.c:228

Backtrace looks a bit different with a query similar to yours:

SELECT * FROM generate_series(1, 1), COALESCE(row(1)) AS (a int, b int);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

#0  __pthread_kill_implementation (threadid=281472845250592,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x806c4334 in __pthread_kill_internal (signo=6,
threadid=) at pthread_kill.c:78
#2  0x8067c73c in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3  0x80669034 in __GI_abort () at abort.c:79
#4  0x00ad9d4c in ExceptionalCondition (conditionName=0xc903b0
"!(count <= tupdesc->natts)", errorType=0xc8f8c8 "FailedAssertion",
fileName=0xc8f918 "parse_relation.c",
lineNumber=2649) at assert.c:54
#5  0x00649664 in expandTupleDesc (tupdesc=0x293da188,
eref=0x293d7318, count=2, offset=0, rtindex=2, sublevels_up=0,
location=-1, include_dropped=true, colnames=0x0,
colvars=0xc39253c8) at parse_relation.c:2649
#6  0x00648d08 in expandRTE (rte=0x293d7390, rtindex=2,
sublevels_up=0, location=-1, include_dropped=true, colnames=0x0,
colvars=0xc39253c8) at parse_relation.c:2361
#7  0x00849bd0 in build_physical_tlist (root=0x293d5318,
rel=0x293d88e8) at plancat.c:1681
#8  0x00806ad0 in create_scan_plan (root=0x293d5318,
best_path=0x293cd888, flags=0) at createplan.c:605
#9  0x0080666c in create_plan_recurse (root=0x293d5318,
best_path=0x293cd888, flags=0) at createplan.c:389
#10 0x0080c4e8 in create_nestloop_plan (root=0x293d5318,
best_path=0x293d96f0) at createplan.c:4056
#11 0x00807464 in create_join_plan (root=0x293d5318,
best_path=0x293d96f0) at createplan.c:1037

Re: remaining sql/json patches

2024-04-05 Thread Michael Paquier
On Fri, Apr 05, 2024 at 09:00:00AM +0300, Alexander Lakhin wrote:
> Please look at an assertion failure:
> TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c", 
> Line: 3048, PID: 1325146
> 
> triggered by the following query:
> SELECT * FROM JSON_TABLE('0', '$' COLUMNS (js int PATH '$')),
>   COALESCE(row(1)) AS (a int, b int);
> 
> Without JSON_TABLE() I get:
> ERROR:  function return row and query-specified return row do not match
> DETAIL:  Returned row contains 1 attribute, but query expects 2.

I've added an open item on this one.  We need to keep track of all
that.
--
Michael


signature.asc
Description: PGP signature


Re: remaining sql/json patches

2024-04-05 Thread Alexander Lakhin

Hello Amit,

04.04.2024 15:02, Amit Langote wrote:

Pushed after fixing these and a few other issues.  I didn't include
the testing function you proposed in your other email.  It sounds
useful for testing locally but will need some work before we can
include it in the tree.

I'll post the rebased 0002 tomorrow after addressing your comments.


Please look at an assertion failure:
TRAP: failed Assert("count <= tupdesc->natts"), File: "parse_relation.c", Line: 
3048, PID: 1325146

triggered by the following query:
SELECT * FROM JSON_TABLE('0', '$' COLUMNS (js int PATH '$')),
  COALESCE(row(1)) AS (a int, b int);

Without JSON_TABLE() I get:
ERROR:  function return row and query-specified return row do not match
DETAIL:  Returned row contains 1 attribute, but query expects 2.

Best regards,
Alexander




Re: remaining sql/json patches

2024-04-04 Thread Amit Langote
On Wed, Apr 3, 2024 at 11:48 PM jian he  wrote:
> hi.
> +  
> +   json_table is an SQL/JSON function which
> +   queries JSON data
> +   and presents the results as a relational view, which can be accessed as a
> +   regular SQL table. You can only use
> json_table inside the
> +   FROM clause of a SELECT,
> +   UPDATE, DELETE, or
> MERGE
> +   statement.
> +  
>
> the only issue is that MERGE Synopsis don't have
> FROM clause.
> other than that, it's quite correct.
> see following tests demo:
>
> drop table ss;
> create table ss(a int);
> insert into ss select 1;
> delete from ss using JSON_TABLE(jsonb '1', '$' COLUMNS (a int PATH '$'
> ) ERROR ON ERROR) jt where jt.a = 1;
> insert into ss select 2;
> update ss set a = 1 from JSON_TABLE(jsonb '2', '$' COLUMNS (a int PATH
> '$')) jt where jt.a = 2;
> DROP TABLE IF EXISTS target;
> CREATE TABLE target (tid integer, balance integer) WITH
> (autovacuum_enabled=off);
> INSERT INTO target VALUES (1, 10),(2, 20),(3, 30);
> MERGE INTO target USING JSON_TABLE(jsonb '2', '$' COLUMNS (a int PATH
> '$' ) ERROR ON ERROR) source(sid)
> ON target.tid = source.sid
> WHEN MATCHED THEN UPDATE SET balance = 0
> returning *;
> --
>
> +  
> +   To split the row pattern into columns, json_table
> +   provides the COLUMNS clause that defines the
> +   schema of the created view. For each column, a separate path expression
> +   can be specified to be evaluated against the row pattern to get a
> +   SQL/JSON value that will become the value for the specified column in
> +   a given output row.
> +  
> should be "an SQL/JSON".
>
> +
> + Inserts a SQL/JSON value obtained by applying
> + path_expression against the row pattern into
> + the view's output row after coercing it to specified
> + type.
> +
> should be "an SQL/JSON".
>
> "coercing it to specified type"
> should be
> "coercing it to the specified type"?
> ---
> +
> + The value corresponds to whether evaluating the PATH
> + expression yields any SQL/JSON values.
> +
> maybe we can change to
> +
> + The value corresponds to whether applying the
> path_expression
> + expression yields any SQL/JSON values.
> +
> so it looks more consistent with the preceding paragraph.
>
> +
> + Optionally, ON ERROR can be used to specify whether
> + to throw an error or return the specified value to handle structural
> + errors, respectively.  The default is to return a boolean value
> + FALSE.
> +
> we don't need "respectively" here?
>
> + if (jt->on_error &&
> + jt->on_error->btype != JSON_BEHAVIOR_ERROR &&
> + jt->on_error->btype != JSON_BEHAVIOR_EMPTY &&
> + jt->on_error->btype != JSON_BEHAVIOR_EMPTY_ARRAY)
> + ereport(ERROR,
> + errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("invalid ON ERROR behavior"),
> + errdetail("Only EMPTY or ERROR is allowed for ON ERROR in JSON_TABLE()."),
> + parser_errposition(pstate, jt->on_error->location));
>
> errdetail("Only EMPTY or ERROR is allowed for ON ERROR in JSON_TABLE()."),
> maybe change to something like:
> `
> errdetail("Only EMPTY or ERROR is allowed for ON ERROR in the
> top-level JSON_TABLE() ").
> `
> i guess mentioning "top-level" is fine.
> since "top-level", we have 19  appearances in functions-json.html.

Thanks for checking.

Pushed after fixing these and a few other issues.  I didn't include
the testing function you proposed in your other email.  It sounds
useful for testing locally but will need some work before we can
include it in the tree.

I'll post the rebased 0002 tomorrow after addressing your comments.


--
Thanks, Amit Langote




Re: remaining sql/json patches

2024-04-04 Thread jian he
On Thu, Apr 4, 2024 at 3:50 PM jian he  wrote:
>
> On Thu, Apr 4, 2024 at 2:41 PM jian he  wrote:
> >
> > On Wed, Apr 3, 2024 at 8:39 PM Amit Langote  wrote:
> > >
> > > Attached updated patches.  I have addressed your doc comments on 0001,
> > > but not 0002 yet.

hi
some doc issue about v49, 0002.
+  Each
+  NESTED PATH clause can be used to generate one or more
+  columns using the data from a nested level of the row pattern, which can be
+  specified using a COLUMNS clause.
 maybe change to

+  Each
+  NESTED PATH clause can be used to generate one or more
+  columns using the data from an upper nested level of the row
pattern, which can be
+  specified using a COLUMNS clause


+   Child
+   columns may themselves contain a NESTED PATH
+   specifification thus allowing to extract data located at arbitrary nesting
+   levels.
maybe change to
+  Child
+  columns themselves  may contain a NESTED PATH
+   specification thus allowing to extract data located at any arbitrary nesting
+   level.


+
+ 
+ 
+  The following is a modified version of the above query to show the usage
+  of NESTED PATH for populating title and director
+  columns, illustrating how they are joined to the parent columns id and
+  kind:
+
+SELECT jt.* FROM
+ my_films,
+ JSON_TABLE ( js, '$.favorites[*] ? (@.films[*].director == $filter)'
+   PASSING 'Alfred Hitchcock' AS filter
+   COLUMNS (
+id FOR ORDINALITY,
+kind text PATH '$.kind',
+NESTED PATH '$.films[*]' COLUMNS (
+  title text FORMAT JSON PATH '$.title' OMIT QUOTES,
+  director text PATH '$.director' KEEP QUOTES))) AS jt;
+ id |   kind   |  title  |  director
++--+-+
+  1 | horror   | Psycho  | "Alfred Hitchcock"
+  2 | thriller | Vertigo | "Alfred Hitchcock"
+(2 rows)
+
+ 
+ 
+  The following is the same query but without the filter in the root
+  path:
+
+SELECT jt.* FROM
+ my_films,
+ JSON_TABLE ( js, '$.favorites[*]'
+   COLUMNS (
+id FOR ORDINALITY,
+kind text PATH '$.kind',
+NESTED PATH '$.films[*]' COLUMNS (
+  title text FORMAT JSON PATH '$.title' OMIT QUOTES,
+  director text PATH '$.director' KEEP QUOTES))) AS jt;
+ id |   kind   |  title  |  director
++--+-+
+  1 | comedy   | Bananas | "Woody Allen"
+  1 | comedy   | The Dinner Game | "Francis Veber"
+  2 | horror   | Psycho  | "Alfred Hitchcock"
+  3 | thriller | Vertigo | "Alfred Hitchcock"
+  4 | drama| Yojimbo | "Akira Kurosawa"
+(5 rows)
 

just found out that the query and the query's output condensed together.
in https://www.postgresql.org/docs/current/tutorial-window.html
the query we use , the output we use .
maybe we can do it the same way,
or we could just have one or two empty new lines separate them.
we have the similar problem in v49, 0001.




Re: remaining sql/json patches

2024-04-04 Thread jian he
On Thu, Apr 4, 2024 at 2:41 PM jian he  wrote:
>
> On Wed, Apr 3, 2024 at 8:39 PM Amit Langote  wrote:
> >
> > Attached updated patches.  I have addressed your doc comments on 0001,
> > but not 0002 yet.
> >
>
about v49, 0002.

--tests setup.
drop table if exists s cascade;
create table s(js jsonb);
insert into s values
('{"a":{"za":[{"z1": [11,]},{"z21": [22, 234,2345]},{"z22": [32,
204,145]}]},"c": 3}'),
('{"a":{"za":[{"z1": [21,4222]},{"z21": [32, 134,1345]}]},"c": 10}');

after playing around, I found, the non-nested column will be sorted first,
and the nested column will be ordered as is.
the below query, column "xx1" will be the first column, "xx" will be
the second column.

SELECT sub.* FROM s,(values(23)) x(x),generate_series(13, 13) y,
JSON_TABLE(js, '$' as c1 PASSING x AS x, y AS y COLUMNS(
NESTED PATH '$.a.za[2]' as n3 columns (NESTED PATH '$.z22[*]' as z22
COLUMNS (c int path  '$')),
NESTED PATH '$.a.za[1]' as n4 columns (d int[] PATH '$.z21'),
NESTED PATH '$.a.za[0]' as n1 columns (NESTED PATH '$.z1[*]' as z1
COLUMNS (a int path  '$')),
xx1 int path '$.c',
NESTED PATH '$.a.za[1]' as n2 columns (NESTED PATH '$.z21[*]' as z21
COLUMNS (b int path '$')),
xx int path '$.c'
))sub;
maybe this behavior is fine. but there is no explanation?

--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -1327,6 +1327,7 @@ JsonPathMutableContext
 JsonPathParseItem
 JsonPathParseResult
 JsonPathPredicateCallback
+JsonPathSpec
this change is no need.


+ if (scan->child)
+ get_json_table_nested_columns(tf, scan->child, context, showimplicit,
+  scan->colMax >= scan->colMin);
except parse_jsontable.c, we only use colMin, colMax in get_json_table_columns.
aslo in parse_jsontable.c, we do it via:

+ /* Start of column range */
+ colMin = list_length(tf->colvalexprs);

+ /* End of column range */
+ colMax = list_length(tf->colvalexprs) - 1;

maybe we can use (bool *) to tag whether this JsonTableColumn is nested or not
in transformJsonTableColumns.

currently colMin, colMax seems to make parsing back json_table (nested
path only) not working.

I also added some slightly complicated tests to prove that the PASSING
clause works
with every level, aslo the multi level nesting clause works as intended.

As mentioned in the previous mail, parsing back nest columns
json_table expression
not working as we expected.

so the last view (jsonb_table_view7) I added,  the view definition is WRONG!!
the good news is the output is what we expected, the coverage is pretty high.


v1-0001-add-test-for-nested-path-clause-json_table.for_v49_0002
Description: Binary data


Re: remaining sql/json patches

2024-04-04 Thread jian he
On Wed, Apr 3, 2024 at 8:39 PM Amit Langote  wrote:
>
> Attached updated patches.  I have addressed your doc comments on 0001,
> but not 0002 yet.
>

in v49, 0002.
+\sv jsonb_table_view1
+CREATE OR REPLACE VIEW public.jsonb_table_view1 AS
+ SELECT id,
+a1,
+b1,
+a11,
+a21,
+a22
+   FROM JSON_TABLE(
+'null'::jsonb, '$[*]' AS json_table_path_0
+PASSING
+1 + 2 AS a,
+'"foo"'::json AS "b c"
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"',
+NESTED PATH '$[1]' AS p1
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"',
+NESTED PATH '$[*]' AS "p1 1"
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"'
+)
+),
+NESTED PATH '$[2]' AS p2
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"'
+NESTED PATH '$[*]' AS "p2:1"
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"'
+),
+NESTED PATH '$[*]' AS p22
+COLUMNS (
+id FOR ORDINALITY,
+a1 integer PATH '$."a1"',
+b1 text PATH '$."b1"',
+a11 text PATH '$."a11"',
+a21 text PATH '$."a21"',
+a22 text PATH '$."a22"'
+)
+)
+)
+)

execute this view definition (not the "create view") will have syntax error.
That means the changes in v49,0002 ruleutils.c are wrong.
also \sv the output is quite long, not easy to validate it.

we need a way to validate that the view definition is equivalent to
"select * from view".
so I added a view validate function to it.

we can put it in v49, 0001.
since json data type don't equality operator,
so I did some minor change to make the view validate  function works with
jsonb_table_view2
jsonb_table_view3
jsonb_table_view4
jsonb_table_view5
jsonb_table_view6


v49-0001-validate-parsing-back-json_table.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-04-03 Thread jian he
hi.
+  
+   json_table is an SQL/JSON function which
+   queries JSON data
+   and presents the results as a relational view, which can be accessed as a
+   regular SQL table. You can only use
json_table inside the
+   FROM clause of a SELECT,
+   UPDATE, DELETE, or
MERGE
+   statement.
+  

the only issue is that MERGE Synopsis don't have
FROM clause.
other than that, it's quite correct.
see following tests demo:

drop table ss;
create table ss(a int);
insert into ss select 1;
delete from ss using JSON_TABLE(jsonb '1', '$' COLUMNS (a int PATH '$'
) ERROR ON ERROR) jt where jt.a = 1;
insert into ss select 2;
update ss set a = 1 from JSON_TABLE(jsonb '2', '$' COLUMNS (a int PATH
'$')) jt where jt.a = 2;
DROP TABLE IF EXISTS target;
CREATE TABLE target (tid integer, balance integer) WITH
(autovacuum_enabled=off);
INSERT INTO target VALUES (1, 10),(2, 20),(3, 30);
MERGE INTO target USING JSON_TABLE(jsonb '2', '$' COLUMNS (a int PATH
'$' ) ERROR ON ERROR) source(sid)
ON target.tid = source.sid
WHEN MATCHED THEN UPDATE SET balance = 0
returning *;
--

+  
+   To split the row pattern into columns, json_table
+   provides the COLUMNS clause that defines the
+   schema of the created view. For each column, a separate path expression
+   can be specified to be evaluated against the row pattern to get a
+   SQL/JSON value that will become the value for the specified column in
+   a given output row.
+  
should be "an SQL/JSON".

+
+ Inserts a SQL/JSON value obtained by applying
+ path_expression against the row pattern into
+ the view's output row after coercing it to specified
+ type.
+
should be "an SQL/JSON".

"coercing it to specified type"
should be
"coercing it to the specified type"?
---
+
+ The value corresponds to whether evaluating the PATH
+ expression yields any SQL/JSON values.
+
maybe we can change to
+
+ The value corresponds to whether applying the
path_expression
+ expression yields any SQL/JSON values.
+
so it looks more consistent with the preceding paragraph.

+
+ Optionally, ON ERROR can be used to specify whether
+ to throw an error or return the specified value to handle structural
+ errors, respectively.  The default is to return a boolean value
+ FALSE.
+
we don't need "respectively" here?

+ if (jt->on_error &&
+ jt->on_error->btype != JSON_BEHAVIOR_ERROR &&
+ jt->on_error->btype != JSON_BEHAVIOR_EMPTY &&
+ jt->on_error->btype != JSON_BEHAVIOR_EMPTY_ARRAY)
+ ereport(ERROR,
+ errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("invalid ON ERROR behavior"),
+ errdetail("Only EMPTY or ERROR is allowed for ON ERROR in JSON_TABLE()."),
+ parser_errposition(pstate, jt->on_error->location));

errdetail("Only EMPTY or ERROR is allowed for ON ERROR in JSON_TABLE()."),
maybe change to something like:
`
errdetail("Only EMPTY or ERROR is allowed for ON ERROR in the
top-level JSON_TABLE() ").
`
i guess mentioning "top-level" is fine.
since "top-level", we have 19  appearances in functions-json.html.




Re: remaining sql/json patches

2024-04-03 Thread jian he
On Wed, Apr 3, 2024 at 3:15 PM jian he  wrote:
>
> On Wed, Apr 3, 2024 at 11:30 AM jian he  wrote:
> >
> > On Tue, Apr 2, 2024 at 9:57 PM Amit Langote  wrote:
> > >
> > > Please let me know if you have further comments on 0001.  I'd like to
> > > get that in before spending more energy on 0002.
> > >

more doc issue with v48. 0001, 0002.

 The optional json_path_name serves as an
 identifier of the provided path_expression.
 The path name must be unique and distinct from the column names.

"path name" should be
json_path_name


git diff --check
doc/src/sgml/func.sgml:19192: trailing whitespace.
+ id |   kind   |  title  | director


+  
+   JSON data stored at a nested level of the row pattern can be extracted using
+   the NESTED PATH clause.  Each
+   NESTED PATH clause can be used to generate one or more
+   columns using the data from a nested level of the row pattern, which can be
+   specified using a COLUMNS clause.  Rows constructed from
+   such columns are called child rows and are joined
+   agaist the row constructed from the columns specified in the parent
+   COLUMNS clause to get the row in the final view.  Child
+   columns may themselves contain a NESTED PATH
+   specifification thus allowing to extract data located at arbitrary nesting
+   levels.  Columns produced by NESTED PATHs at the same
+   level are considered to be siblings and are joined
+   with each other before joining to the parent row.
+  

"agaist" should be "against".
"specifification" should be "specification".
+Rows constructed from
+   such columns are called child rows and are joined
+   agaist the row constructed from the columns specified in the parent
+   COLUMNS clause to get the row in the final view.
this sentence is long, not easy to comprehend, maybe we can rephrase it
or split it into two.



+  | NESTED PATH json_path_specification
 AS path_name 
+COLUMNS ( json_table_column
, ... )
v48, 0002 patch.
in the json_table synopsis section, put these two lines into one line,
I think would make it more readable.
also the following sgml code will render the html into one line.

  NESTED PATH
json_path_specification 
AS json_path_name

  COLUMNS (
json_table_column ,
... )


also path_name should be
json_path_name.



+
+ The NESTED PATH syntax is recursive,
+ so you can go down multiple nested levels by specifying several
+ NESTED PATH subclauses within each other.
+ It allows to unnest the hierarchy of JSON objects and arrays
+ in a single function invocation rather than chaining several
+ JSON_TABLE expressions in an SQL statement.
+
"The NESTED PATH syntax is recursive"
should be
`
The NESTED PATH syntax can be recursive,
you can go down multiple nested levels by specifying several
NESTED PATH subclauses within each other.
`




Re: remaining sql/json patches

2024-04-03 Thread jian he
On Wed, Apr 3, 2024 at 11:30 AM jian he  wrote:
>
> On Tue, Apr 2, 2024 at 9:57 PM Amit Langote  wrote:
> >
> > Please let me know if you have further comments on 0001.  I'd like to
> > get that in before spending more energy on 0002.
> >

-- a/src/backend/parser/parse_target.c
+++ b/src/backend/parser/parse_target.c
@@ -2019,6 +2019,9 @@ FigureColnameInternal(Node *node, char **name)
  case JSON_VALUE_OP:
  *name = "json_value";
  return 2;
+ case JSON_TABLE_OP:
+ *name = "json_table";
+ return 2;
  default:
  elog(ERROR, "unrecognized JsonExpr op: %d",
  (int) ((JsonFuncExpr *) node)->op);

"case JSON_TABLE_OP part", no need?
json_table output must provide column name and type?

I did some minor refactor transformJsonTableColumns, make the comments
align with the function intention.
in v48-0001, in transformJsonTableColumns we can `Assert(rawc->name);`.
since non-nested JsonTableColumn must specify column name.
in v48-0002, we can change to `if (rawc->coltype != JTC_NESTED)
Assert(rawc->name);`



SELECT * FROM JSON_TABLE(jsonb '1', '$' COLUMNS (a int PATH '$.a' )
ERROR ON ERROR) jt;
ERROR:  no SQL/JSON item

I thought it should just return NULL.
In this case, I thought that
(not column-level) ERROR ON ERROR should not interfere with "COLUMNS
(a int PATH '$.a' )".

+-- Other miscellanous checks
"miscellanous" should be "miscellaneous".


overall the coverage is pretty high.
the current test output looks fine.


v48-0001-minor-refactor-transformJsonTableColumns.only_for_v48_0001
Description: Binary data


Re: remaining sql/json patches

2024-04-02 Thread jian he
On Tue, Apr 2, 2024 at 9:57 PM Amit Langote  wrote:
>
> Please let me know if you have further comments on 0001.  I'd like to
> get that in before spending more energy on 0002.
>

hi. some issues with the doc.
i think, some of the "path expression" can be replaced by
"path_expression".
maybe not all of them.

+  
+   
+
+ context_item,
path_expression 
AS json_path_name
  PASSING {
value AS
varname } ,
...
+
+
+
+ The input data to query, the JSON path expression defining the query,
+ and an optional PASSING clause, which can provide data
+ values to the path_expression.
+ The result of the input data
+ evaluation is called the row pattern. The row
+ pattern is used as the source for row values in the constructed view.
+
+
+   

maybe
change this part "The input data to query, the JSON path expression
defining the query,"
to
`
context_item is the input data to query,
path_expression is the JSON path expression
defining the query,
`

+
+ Specifying FORMAT JSON makes it explcit that you
+ expect that the value to be a valid json object.
+
"explcit" change to "explicit", or should it be "explicitly"?
also FORMAT JSON can be override by OMIT QUOTES.
SELECT sub.* FROM JSON_TABLE('{"a":{"z1": "a"}}', '$.a' COLUMNS(xx
TEXT format json path '$.z1' omit quotes))sub;
it return not double quoted literal 'a', which cannot be a valid json.

create or replace FUNCTION test_format_json() returns table (thetype
text, is_ok bool) AS $$
declare
part1_sql text := $sql$SELECT sub.* FROM JSON_TABLE('{"a":{"z1":
"a"}}', '$.a'  COLUMNS(xx $sql$;
part2_sql text := $sql$ format json path '$.z1' omit quotes))sub $sql$;
run_status bool := true;
r record;
fin record;
BEGIN
for r in
select format_type(oid, -1) as aa
from pg_type where  typtype = 'b' and typarray != 0 and
typnamespace = 11 and typnotnull is false
loop
begin
-- raise notice '%',CONCAT_WS(' ', part1_sql, r.aa, part2_sql);
-- raise notice 'r.aa %', r.aa;
run_status := true;
execute CONCAT_WS(' ', part1_sql, r.aa, part2_sql) into fin;
return query select r.aa, run_status;
exception when others then
begin
run_status := false;
return query select r.aa, run_status;
end;
end;
end loop;
END;
$$ language plpgsql;
create table sss_1 as select * from test_format_json();
select * from sss_1 where is_ok is true;

use the above query, I've figure out that FORMAT JSON can apply to the
following types:
bytea
name
text
json
bpchar
character varying
jsonb
and these type's customized domain type.

overall, the idea is that:
 Specifying FORMAT JSON makes it explicitly that you
 expect that the value to be a valid json object.
FORMAT JSON can be overridden by OMIT QUOTES
specification, which can make the return value not a valid
json.
FORMAT JSON can only work with certain kinds of
data types.
---
+
+ Optionally, you can add ON ERROR clause to define
+ error behavior.
+
I think "error behavior" may refer to "what kind of error message it will omit"
but here, it's about what to do when an error happens.
so I guess it's misleading.

maybe we can explain it similar to json_exist.
+
+ Optionally, you can add ON ERROR clause to define
+  the behavior if an error occurs.
+

+
+ The specified type should have a cast from the
+ boolean.
+
should be
+
+ The specified type should have a cast from the
+ boolean.
+


+
+ Inserts a SQL/JSON value into the output row.
+
maybe
+
+ Inserts a value that the data type is
type into the output row.
+

+
+ Inserts a boolean item into each output row.
+
maybe changed to:
+
+ Inserts a value that the data type is
type into the output row.
+

"name type EXISTS" branch mentioned: "The specified type should have a
cast from the boolean."
but "name type [FORMAT JSON [ENCODING UTF8]] [ PATH path_expression ]"
never mentioned the "type"parameter.
maybe add one para, something like:
"after apply path_expression, the yield value cannot be coerce to
type it will return null"




Re: remaining sql/json patches

2024-04-02 Thread jian he
On Fri, Mar 22, 2024 at 12:08 AM Amit Langote  wrote:
>
> On Wed, Mar 20, 2024 at 9:53 PM Amit Langote  wrote:
> > I'll push 0001 tomorrow.
>
> Pushed that one.  Here's the remaining JSON_TABLE() patch.
>
I know v45 is very different from v47.
but v45 contains all the remaining features to be implemented.

I've attached 2 files.
v45-0001-propagate-passing-clause-to-every-json_ta.based_on_v45
after_apply_jsonpathvar.sql.

the first file should be applied after v45-0001-JSON_TABLE.patch
the second file has all kinds of tests to prove that
applying JsonPathVariable to the NESTED PATH is ok.

I know that v45 is not the whole patch we are going to push for postgres17.
I just want to point out that applying the PASSING clause to the NESTED PATH
works fine with V45.

that means, I think, we can safely apply PASSING clause to NESTED PATH for
feature "PLAN DEFAULT clause", "specific PLAN clause" and "sibling
NESTED COLUMNS clauses".


v45-0001-propagate-passing-clause-to-every-json_ta.based_on_v45
Description: Binary data


after_apply_jsonpathvar.sql
Description: application/sql


Re: remaining sql/json patches

2024-04-02 Thread jian he
hi.

+/*
+ * Recursively transform child JSON_TABLE plan.
+ *
+ * Default plan is transformed into a cross/union join of its nested columns.
+ * Simple and outer/inner plans are transformed into a JsonTablePlan by
+ * finding and transforming corresponding nested column.
+ * Sibling plans are recursively transformed into a JsonTableSibling.
+ */
+static Node *
+transformJsonTableChildPlan(JsonTableParseContext *cxt,
+ List *columns)
this comment is not the same as the function intention for now.
maybe we need to refactor it.


/*
* Each call to fetch a new set of rows - of which there may be very many
* if XMLTABLE is being used in a lateral join - will allocate a possibly
* substantial amount of memory, so we cannot use the per-query context
* here. perTableCxt now serves the same function as "argcontext" does in
* FunctionScan - a place to store per-one-call (i.e. one result table)
* lifetime data (as opposed to per-query or per-result-tuple).
*/
MemoryContextSwitchTo(tstate->perTableCxt);

maybe we can replace "XMLTABLE" to "XMLTABLE or JSON_TABLE"?



/* Transform and coerce the PASSING arguments to to jsonb. */
there should be only one "to"?

---
json_table_column clause doesn't have a passing clause.
we can only have one passing clause in json_table.
but during JsonTableInitPathScan, for each output columns associated
JsonTablePlanState
we already initialized the PASSING arguments via  `planstate->args = args;`
also transformJsonTableColumn already has a passingArgs argument.
technically we can use the jsonpath variable for every output column
regardless of whether it's nested or not.

JsonTable already has the "passing" clause,
we just need to pass it to function transformJsonTableColumns and it's callees.
based on that, I implemented it. seems quite straightforward.
I also wrote several contrived, slightly complicated tests.
It seems to work just fine.

simple explanation:
previously the following sql will fail, error message is that "could
not find jsonpath variable  %s".
now it will work.

SELECT sub.* FROM
JSON_TABLE(jsonb '{"a":{"za":[{"z1": [11,]},{"z21": [22,
234,2345]}]},"c": 3}',
'$' PASSING 22 AS x, 234 AS y
COLUMNS(
xx int path '$.c',
NESTED PATH '$.a.za[1]' as n1 columns
(NESTED PATH '$.z21[*]' as n2
COLUMNS (z21 int path '$?(@ == $"x" || @ == $"y" )' default 0 on empty)),
NESTED PATH '$.a.za[0]' as n4 columns
(NESTED PATH '$.z1[*]' as n3
COLUMNS (z1 int path '$?(@ > $"y" + 1988)' default 0 on empty)))
)sub;


v47-0001-propagate-passing-clause-to-every-json_table_.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-04-01 Thread jian he
On Mon, Apr 1, 2024 at 8:00 AM jian he  wrote:
>
> +-- Should fail (JSON arguments are not passed to column paths)
> +SELECT *
> +FROM JSON_TABLE(
> + jsonb '[1,2,3]',
> + '$[*] ? (@ < $x)'
> + PASSING 10 AS x
> + COLUMNS (y text FORMAT JSON PATH '$ ? (@ < $x)')
> + ) jt;
> +ERROR:  could not find jsonpath variable "x"
>
> the error message does not correspond to the comments intention.
> also "y text FORMAT JSON" should be fine?

sorry for the noise, i've figured out why.

> only the second last example really using the PASSING clause.
> should the following query work just fine in this context?
>
> create table s(js jsonb);
> insert into s select '{"a":{"za":[{"z1": [11,]},{"z21": [22,
> 234,2345]}]},"c": 3}';
> SELECT sub.* FROM s,JSON_TABLE(js, '$' passing 11 AS "b c", 1 + 2 as y
> COLUMNS (xx int path '$.c ? (@ == $y)')) sub;
>
>
> I thought the json and text data type were quite similar.
> should these following two queries return the same result?
>
> SELECT sub.* FROM s, JSON_TABLE(js, '$' COLUMNS(
> xx int path '$.c',
> nested PATH '$.a.za[1]' columns (NESTED PATH '$.z21[*]' COLUMNS (a12
> jsonb path '$'))
> ))sub;
>
> SELECT sub.* FROM s,JSON_TABLE(js, '$' COLUMNS (
> c int path '$.c',
> NESTED PATH '$.a.za[1]' columns (z json path '$')
> )) sub;
sorry for the noise, i've figured out why.

there are 12 appearances of "NESTED PATH" in  sqljson_jsontable.sql.
but we don't have a real example of  NESTED PATH nested with NESTED PATH.
so I added some real tests on it.
i also added some tests about the PASSING clause.
please check the attachment.


/*
 * JsonTableInitPlan
 * Initialize information for evaluating a jsonpath given in
 * JsonTablePlan
 */
static void
JsonTableInitPathScan(JsonTableExecContext *cxt,
  JsonTablePlanState *planstate,
  List *args, MemoryContext mcxt)
{
JsonTablePlan *plan = (JsonTablePlan *) planstate->plan;
int i;

planstate->path = DatumGetJsonPathP(plan->path->value->constvalue);
planstate->args = args;
planstate->mcxt = AllocSetContextCreate(mcxt, "JsonTableExecContext",
ALLOCSET_DEFAULT_SIZES);

/* No row pattern evaluated yet. */
planstate->currentRow = PointerGetDatum(NULL);
planstate->currentRowIsNull = true;

for (i = plan->colMin; i <= plan->colMax; i++)
cxt->colexprplans[i] = planstate;
}

JsonTableInitPathScan's work is to init/assign struct
JsonTablePlanState's elements.
maybe we should just put JsonTableInitPathScan's work into JsonTableInitPlan
and also rename JsonTableInitPlan to "JsonTableInitPlanState" or
"InitJsonTablePlanState".



JsonTableSiblingJoin *join = (JsonTableSiblingJoin *) plan;
just rename the variable name, seems unnecessary?


v47-0001-add-more-json_table-tests.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-03-31 Thread jian he
typedef struct JsonTableExecContext
{
int magic;
JsonTablePlanState *rootplanstate;
JsonTablePlanState **colexprplans;
} JsonTableExecContext;

imho, this kind of naming is kind of inconsistent.
"state" and "plan" are mixed together.
maybe

typedef struct JsonTableExecContext
{
int magic;
JsonTablePlanState *rootplanstate;
JsonTablePlanState **colexprstates;
} JsonTableExecContext;


+ cxt->colexprplans = palloc(sizeof(JsonTablePlanState *) *
+   list_length(tf->colvalexprs));
+
  /* Initialize plan */
- cxt->rootplanstate = JsonTableInitPlan(cxt, rootplan, args,
+ cxt->rootplanstate = JsonTableInitPlan(cxt, (Node *) rootplan, NULL, args,
CurrentMemoryContext);
I think, the comments "Initialize plan" is not right, here we
initialize the rootplanstate (JsonTablePlanState)
and also for each (no ordinality) columns, we also initialized the
specific JsonTablePlanState.

 static void JsonTableRescan(JsonTablePlanState *planstate);
@@ -331,6 +354,9 @@ static Datum JsonTableGetValue(TableFuncScanState
*state, int colnum,
Oid typid, int32 typmod, bool *isnull);
 static void JsonTableDestroyOpaque(TableFuncScanState *state);
 static bool JsonTablePlanNextRow(JsonTablePlanState *planstate);
+static bool JsonTablePlanPathNextRow(JsonTablePlanState *planstate);
+static void JsonTableRescan(JsonTablePlanState *planstate);

JsonTableRescan included twice?




Re: remaining sql/json patches

2024-03-31 Thread jian he
FAILED: src/interfaces/ecpg/test/sql/sqljson_jsontable.c
/home/jian/postgres/buildtest6/src/interfaces/ecpg/preproc/ecpg
--regression -I../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql
-I../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/include/ -o
src/interfaces/ecpg/test/sql/sqljson_jsontable.c
../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc
../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc:21:
WARNING: unsupported feature will be passed to server
../../Desktop/pg_src/src6/postgres/src/interfaces/ecpg/test/sql/sqljson_jsontable.pgc:32:
ERROR: syntax error at or near ";"
need an extra closing parenthesis?

   
The rows produced by JSON_TABLE are laterally
joined to the row that generated them, so you do not have to explicitly join
the constructed view with the original table holding JSON
-   data.
need closing para.

SELECT * FROM JSON_TABLE('[]', 'strict $.a' COLUMNS (js2 text  PATH
'$' error on empty error on error) EMPTY ON ERROR);
should i expect it return one row?
is there any example to make it return one row from top level "EMPTY ON ERROR"?


+ {
+ JsonTablePlan *scan = (JsonTablePlan *) plan;
+
+ JsonTableInitPathScan(cxt, planstate, args, mcxt);
+
+ planstate->nested = scan->child ?
+ JsonTableInitPlan(cxt, scan->child, planstate, args, mcxt) : NULL;
+ }
first line seems strange, do we just simply change from "plan" to "scan"?


+ case JTC_REGULAR:
+ typenameTypeIdAndMod(pstate, rawc->typeName, , );
+
+ /*
+ * Use implicit FORMAT JSON for composite types (arrays and
+ * records) or if a non-default WRAPPER / QUOTES behavior is
+ * specified.
+ */
+ if (typeIsComposite(typid) ||
+ rawc->quotes != JS_QUOTES_UNSPEC ||
+ rawc->wrapper != JSW_UNSPEC)
+ rawc->coltype = JTC_FORMATTED;
per previous discussion, should we refactor the above comment?


+/* Recursively set 'reset' flag of planstate and its child nodes */
+static void
+JsonTablePlanReset(JsonTablePlanState *planstate)
+{
+ if (IsA(planstate->plan, JsonTableSiblingJoin))
+ {
+ JsonTablePlanReset(planstate->left);
+ JsonTablePlanReset(planstate->right);
+ planstate->advanceRight = false;
+ }
+ else
+ {
+ planstate->reset = true;
+ planstate->advanceNested = false;
+
+ if (planstate->nested)
+ JsonTablePlanReset(planstate->nested);
+ }
per coverage, the first part of the IF branch never executed.
i also found out that JsonTablePlanReset is quite similar to JsonTableRescan,
i don't fully understand these two functions though.


SELECT * FROM JSON_TABLE(jsonb'{"a": {"z":[]}, "b": 1,"c": 2, "d":
91}', '$' COLUMNS (
c int path '$.c',
d int path '$.d',
id1 for ordinality,
NESTED PATH '$.a.z[*]' columns (z int path '$', id for ordinality)
));
doc seems to say that duplicated ordinality columns in different nest
levels are not allowed?


"currentRow" naming seems misleading, generally, when we think of "row",
we think of several (not one) datums, or several columns.
but here, we only have one datum.
I don't have good optional naming though.


+ case JTC_FORMATTED:
+ case JTC_EXISTS:
+ {
+ Node   *je;
+ CaseTestExpr *param = makeNode(CaseTestExpr);
+
+ param->collation = InvalidOid;
+ param->typeId = contextItemTypid;
+ param->typeMod = -1;
+
+ je = transformJsonTableColumn(rawc, (Node *) param,
+  NIL, errorOnError);
+
+ colexpr = transformExpr(pstate, je, EXPR_KIND_FROM_FUNCTION);
+ assign_expr_collations(pstate, colexpr);
+
+ typid = exprType(colexpr);
+ typmod = exprTypmod(colexpr);
+ break;
+ }
+
+ default:
+ elog(ERROR, "unknown JSON_TABLE column type: %d", rawc->coltype);
+ break;
+ }
+
+ tf->coltypes = lappend_oid(tf->coltypes, typid);
+ tf->coltypmods = lappend_int(tf->coltypmods, typmod);
+ tf->colcollations = lappend_oid(tf->colcollations, get_typcollation(typid));
+ tf->colvalexprs = lappend(tf->colvalexprs, colexpr);

why not use exprCollation(colexpr) for tf->colcollations, similar to
exprType(colexpr)?




+-- Should fail (JSON arguments are not passed to column paths)
+SELECT *
+FROM JSON_TABLE(
+ jsonb '[1,2,3]',
+ '$[*] ? (@ < $x)'
+ PASSING 10 AS x
+ COLUMNS (y text FORMAT JSON PATH '$ ? (@ < $x)')
+ ) jt;
+ERROR:  could not find jsonpath variable "x"

the error message does not correspond to the comments intention.
also "y text FORMAT JSON" should be fine?

only the second last example really using the PASSING clause.
should the following query work just fine in this context?

create table s(js jsonb);
insert into s select '{"a":{"za":[{"z1": [11,]},{"z21": [22,
234,2345]}]},"c": 3}';
SELECT sub.* FROM s,JSON_TABLE(js, '$' passing 11 AS "b c", 1 + 2 as y
COLUMNS (xx int path '$.c ? (@ == $y)')) sub;


I thought the json and text data type were quite similar.
should these following two queries return the same result?

SELECT sub.* FROM s, JSON_TABLE(js, '$' COLUMNS(
xx int path '$.c',
nested PATH '$.a.za[1]' columns (NESTED PATH '$.z21[*]' COLUMNS (a12
jsonb path '$'))
))sub;

SELECT sub.* FROM s,JSON_TABLE(js, '$' 

Re: remaining sql/json patches

2024-03-29 Thread Amit Langote
Hi Alvaro,

On Fri, Mar 29, 2024 at 2:04 AM Alvaro Herrera  wrote:
> On 2024-Mar-28, Amit Langote wrote:
>
> > Here's patch 1 for the time being that implements barebones
> > JSON_TABLE(), that is, without NESTED paths/columns and PLAN clause.
> > I've tried to shape the interfaces so that those features can be added
> > in future commits without significant rewrite of the code that
> > implements barebones JSON_TABLE() functionality.  I'll know whether
> > that's really the case when I rebase the full patch over it.
>
> I think this barebones patch looks much closer to something that can be
> committed for pg17, given the current commitfest timeline.  Maybe we
> should just slip NESTED and PLAN to pg18 to focus current efforts into
> getting the basic functionality in 17.  When I looked at the JSON_TABLE
> patch last month, it appeared far too large to be reviewable in
> reasonable time.  The fact that this split now exists gives me hope that
> we can get at least the first part of it.

Thanks for chiming in.  I agree that 0001 looks more manageable.

> (A note that PLAN seems to correspond to separate features T824+T838, so
> leaving that one out would still let us claim T821 "Basic SQL/JSON query
> operators" ... however, the NESTED clause does not appear to be a
> separate SQL feature; in particular it does not appear to correspond to
> T827, though I may be reading the standard wrong.  So if we don't have
> NESTED, apparently we could not claim to support T821.)

I've posted 0002 just now, which shows that adding just NESTED but not
PLAN might be feasible.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-29 Thread jian he
On Fri, Mar 29, 2024 at 11:20 AM jian he  wrote:
>
>
> +
> +JSON_TABLE (
> +context_item,
> path_expression  AS
> json_path_name  
> PASSING { value AS
> varname } , ...
> 
> +COLUMNS (  class="parameter">json_table_column ,
> ... )
> + { ERROR | EMPTY
> } ON ERROR 
> +)
> top level (not in the COLUMN clause) also allows
> NULL ON ERROR.
>
we can also specify DEFAULT expression ON
ERROR.
like:
SELECT * FROM JSON_TABLE(jsonb'"1.23"', 'strict $.a' COLUMNS (js2 int
PATH '$') default '1' on error);

+   
+
+ name type
FORMAT JSON ENCODING
UTF8
+   PATH
json_path_specification 
+
+
+
+ Inserts a composite SQL/JSON item into the output row.
+
+
+ The provided PATH expression is evaluated and
+ the column is filled with the produced SQL/JSON item.  If the
+ PATH expression is omitted, path expression
+ $.name is used,
+ where name is the provided column name.
+ In this case, the column name must correspond to one of the
+ keys within the SQL/JSON item produced by the row pattern.
+
+
+ Optionally, you can specify WRAPPER,
+ QUOTES clauses to format the output and
+ ON EMPTY and ON ERROR to handle
+ those scenarios appropriately.
+

Similarly, I am not sure of the description of "composite SQL/JSON item".
by observing the following 3 examples:
SELECT * FROM JSON_TABLE(jsonb'{"a": "z"}', '$.a' COLUMNS (js2 text
format json PATH '$' omit quotes));
SELECT * FROM JSON_TABLE(jsonb'{"a": "z"}', '$.a' COLUMNS (js2 text
format json PATH '$'));
SELECT * FROM JSON_TABLE(jsonb'{"a": "z"}', '$.a' COLUMNS (js2 text PATH '$'));

i think, FORMAT JSON specification means that,
if your specified type is text or varchar related AND didn't specify
quotes behavior
then FORMAT JSON produced output can be casted to json data type.
so FORMAT JSON seems not related to array and records data type.

also the last para can be:
+
+ Optionally, you can specify WRAPPER,
+ QUOTES clauses to format the output and
+ ON EMPTY and ON ERROR to handle
+ those missing values and structural errors, respectively.
+


+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only string constants are supported in JSON_TABLE"
+   " path specification"),
should be:

+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only string constants are supported in JSON_TABLE path
specification"),


+   
+
+ AS json_path_name
+
+
+
+
+ The optional json_path_name serves as an
+ identifier of the provided
json_path_specification.
+ The path name must be unique and distinct from the column names.
+ When using the PLAN clause, you must specify the names
+ for all the paths, including the row pattern. Each path name can appear in
+ the PLAN clause only once.
+
+
+   
as of v46, we don't have PLAN clause.
also "must be unique and distinct from the column names." seems incorrect.
for example:
SELECT * FROM JSON_TABLE(jsonb'"1.23"', '$.a' as js2 COLUMNS (js2 int
PATH '$'));




Re: remaining sql/json patches

2024-03-28 Thread jian he
On Thu, Mar 28, 2024 at 1:23 PM Amit Langote  wrote:
>
> On Wed, Mar 27, 2024 at 1:34 PM Amit Langote  wrote:
> > On Wed, Mar 27, 2024 at 12:42 PM jian he  
> > wrote:
> > > hi.
> > > I don't fully understand all the code in json_table patch.
> > > maybe we can split it into several patches,
> >
> > I'm working on exactly that atm.
> >
> > > like:
> > > * no nested json_table_column.
> > > * nested json_table_column, with PLAN DEFAULT
> > > * nested json_table_column, with PLAN ( json_table_plan )
> >
> > Yes, I think it will end up something like this.  I'll try to post the
> > breakdown tomorrow.
>
> Here's patch 1 for the time being that implements barebones
> JSON_TABLE(), that is, without NESTED paths/columns and PLAN clause.
> I've tried to shape the interfaces so that those features can be added
> in future commits without significant rewrite of the code that
> implements barebones JSON_TABLE() functionality.  I'll know whether
> that's really the case when I rebase the full patch over it.
>
> I'm still reading and polishing it and would be happy to get feedback
> and testing.
>

+static void
+JsonValueListClear(JsonValueList *jvl)
+{
+ jvl->singleton = NULL;
+ jvl->list = NULL;
+}
 jvl->list is a List structure, do we need to set it like "jvl->list = NIL"?

+ if (jperIsError(res))
+ {
+ /* EMPTY ON ERROR case */
+ Assert(!planstate->plan->errorOnError);
+ JsonValueListClear(>found);
+ }
i am not sure the comment is right.
`SELECT * FROM JSON_TABLE(jsonb'"1.23"', 'strict $.a' COLUMNS (js2 int
PATH '$') );`
will execute jperIsError branch.
also
SELECT * FROM JSON_TABLE(jsonb'"1.23"', 'strict $.a' COLUMNS (js2 int
PATH '$') default '1' on error);

I think it means applying path_expression, if the top level on_error
behavior is not on error
then ` if (jperIsError(res))` part may be executed.



--- a/src/include/utils/jsonpath.h
+++ b/src/include/utils/jsonpath.h
@@ -15,6 +15,7 @@
 #define JSONPATH_H

 #include "fmgr.h"
+#include "executor/tablefunc.h"
 #include "nodes/pg_list.h"
 #include "nodes/primnodes.h"
 #include "utils/jsonb.h"

should be:
+#include "executor/tablefunc.h"
 #include "fmgr.h"


+
+JSON_TABLE (
+context_item,
path_expression  AS
json_path_name  
PASSING { value AS
varname } , ...

+COLUMNS ( json_table_column ,
... )
+ { ERROR | EMPTY
} ON ERROR 
+)
top level (not in the COLUMN clause) also allows
NULL ON ERROR.

SELECT JSON_VALUE(jsonb'"1.23"', 'strict $.a' null on error);
returns one value.
SELECT * FROM JSON_TABLE(jsonb'"1.23"', 'strict $.a' COLUMNS (js2 int
PATH '$') NULL on ERROR);
return zero rows.
Is this what we expected?


main changes are in jsonpath_exec.c, parse_expr.c, parse_jsontable.c
overall the coverage seems pretty good.
I added some tests to improve the coverage.


v46-0001-improve-regress-coverage-test-based-on-v46.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-03-28 Thread Alvaro Herrera
On 2024-Mar-28, Amit Langote wrote:

> Here's patch 1 for the time being that implements barebones
> JSON_TABLE(), that is, without NESTED paths/columns and PLAN clause.
> I've tried to shape the interfaces so that those features can be added
> in future commits without significant rewrite of the code that
> implements barebones JSON_TABLE() functionality.  I'll know whether
> that's really the case when I rebase the full patch over it.

I think this barebones patch looks much closer to something that can be
committed for pg17, given the current commitfest timeline.  Maybe we
should just slip NESTED and PLAN to pg18 to focus current efforts into
getting the basic functionality in 17.  When I looked at the JSON_TABLE
patch last month, it appeared far too large to be reviewable in
reasonable time.  The fact that this split now exists gives me hope that
we can get at least the first part of it.

(A note that PLAN seems to correspond to separate features T824+T838, so
leaving that one out would still let us claim T821 "Basic SQL/JSON query
operators" ... however, the NESTED clause does not appear to be a
separate SQL feature; in particular it does not appear to correspond to
T827, though I may be reading the standard wrong.  So if we don't have
NESTED, apparently we could not claim to support T821.)

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La fuerza no está en los medios físicos
sino que reside en una voluntad indomable" (Gandhi)




Re: remaining sql/json patches

2024-03-26 Thread Amit Langote
On Wed, Mar 27, 2024 at 12:42 PM jian he  wrote:
> hi.
> I don't fully understand all the code in json_table patch.
> maybe we can split it into several patches,

I'm working on exactly that atm.

> like:
> * no nested json_table_column.
> * nested json_table_column, with PLAN DEFAULT
> * nested json_table_column, with PLAN ( json_table_plan )

Yes, I think it will end up something like this.  I'll try to post the
breakdown tomorrow.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-26 Thread jian he
On Tue, Mar 26, 2024 at 6:16 PM jian he  wrote:
>
> On Fri, Mar 22, 2024 at 12:08 AM Amit Langote  wrote:
> >
> > On Wed, Mar 20, 2024 at 9:53 PM Amit Langote  
> > wrote:
> > > I'll push 0001 tomorrow.
> >
> > Pushed that one.  Here's the remaining JSON_TABLE() patch.
> >

hi.
I don't fully understand all the code in json_table patch.
maybe we can split it into several patches, like:
* no nested json_table_column.
* nested json_table_column, with PLAN DEFAULT
* nested json_table_column, with PLAN ( json_table_plan )

i can understand the "no nested json_table_column" part,
which seems to be how oracle[1] implemented it.
I think we can make the "no nested json_table_column" part into  v17.
i am not sure about other complex parts.
lack of comment, makes it kind of hard to fully understand.

[1] 
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/img_text/json_table.html



+/* Reset context item of a scan, execute JSON path and reset a scan */
+static void
+JsonTableResetContextItem(JsonTableScanState *scan, Datum item)
+{
+ MemoryContext oldcxt;
+ JsonPathExecResult res;
+ Jsonb   *js = (Jsonb *) DatumGetJsonbP(item);
+
+ JsonValueListClear(>found);
+
+ MemoryContextResetOnly(scan->mcxt);
+
+ oldcxt = MemoryContextSwitchTo(scan->mcxt);
+
+ res = executeJsonPath(scan->path, scan->args,
+  GetJsonPathVar, CountJsonPathVars,
+  js, scan->errorOnError, >found,
+  false /* FIXME */ );
+
+ MemoryContextSwitchTo(oldcxt);
+
+ if (jperIsError(res))
+ {
+ Assert(!scan->errorOnError);
+ JsonValueListClear(>found); /* EMPTY ON ERROR case */
+ }
+
+ JsonTableRescan(scan);
+}

"FIXME".
set the last argument in executeJsonPath to true also works as expected.
also there is no test related to the "FIXME"
i am not 100% sure about the "FIXME".

see demo (after set the executeJsonPath's "useTz" argument to true).

create table ss(js jsonb);
INSERT into ss select '{"a": "2018-02-21 12:34:56 +10"}';
INSERT into ss select '{"b": "2018-02-21 12:34:56 "}';
PREPARE q2 as SELECT jt.*  FROM ss, JSON_TABLE(js, '$.a.datetime()'
COLUMNS ("int7" timestamptz PATH '$')) jt;
PREPARE qb as SELECT jt.*  FROM ss, JSON_TABLE(js, '$.b.datetime()'
COLUMNS ("tstz" timestamptz PATH '$')) jt;
PREPARE q3 as SELECT jt.*  FROM ss, JSON_TABLE(js, '$.a.datetime()'
COLUMNS ("ts" timestamp PATH '$')) jt;

begin;
set time zone +10;
EXECUTE q2;
set time zone -10;
EXECUTE q2;
rollback;

begin;
set time zone +10;
SELECT JSON_VALUE(js, '$.a' returning timestamptz) from ss;
set time zone -10;
SELECT JSON_VALUE(js, '$.a' returning timestamptz) from ss;
rollback;
-
begin;
set time zone +10;
EXECUTE qb;
set time zone -10;
EXECUTE qb;
rollback;

begin;
set time zone +10;
SELECT JSON_VALUE(js, '$.b' returning timestamptz) from ss;
set time zone -10;
SELECT JSON_VALUE(js, '$.b' returning timestamptz) from ss;
rollback;
-
begin;
set time zone +10;
EXECUTE q3;
set time zone -10;
EXECUTE q3;
rollback;

begin;
set time zone +10;
SELECT JSON_VALUE(js, '$.b' returning timestamp) from ss;
set time zone -10;
SELECT JSON_VALUE(js, '$.b' returning timestamp) from ss;
rollback;




Re: remaining sql/json patches

2024-03-26 Thread jian he
On Fri, Mar 22, 2024 at 12:08 AM Amit Langote  wrote:
>
> On Wed, Mar 20, 2024 at 9:53 PM Amit Langote  wrote:
> > I'll push 0001 tomorrow.
>
> Pushed that one.  Here's the remaining JSON_TABLE() patch.
>

hi. minor issues i found json_table patch.

+ if (!IsA($5, A_Const) ||
+ castNode(A_Const, $5)->val.node.type != T_String)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only string constants are supported in JSON_TABLE"
+   " path specification"),
+ parser_errposition(@5));
as mentioned in upthread, this error message should be one line.


+const TableFuncRoutine JsonbTableRoutine =
+{
+ JsonTableInitOpaque,
+ JsonTableSetDocument,
+ NULL,
+ NULL,
+ NULL,
+ JsonTableFetchRow,
+ JsonTableGetValue,
+ JsonTableDestroyOpaque
+};
should be:

const TableFuncRoutine JsonbTableRoutine =
{
.InitOpaque = JsonTableInitOpaque,
.SetDocument = JsonTableSetDocument,
.SetNamespace = NULL,
.SetRowFilter = NULL,
.SetColumnFilter = NULL,
.FetchRow = JsonTableFetchRow,
.GetValue = JsonTableGetValue,
.DestroyOpaque = JsonTableDestroyOpaque
};

+/*
+ * JsonTablePathSpec
+ * untransformed specification of JSON path expression with an optional
+ * name
+ */
+typedef struct JsonTablePathSpec
+{
+ NodeTag type;
+
+ Node   *string;
+ char   *name;
+ int name_location;
+ int location; /* location of 'string' */
+} JsonTablePathSpec;
the comment still does not explain the distinction between "location"
and "name_location"?


JsonTablePathSpec needs to be added to typedefs.list.
JsonPathSpec should be removed from typedefs.list.


+/*
+ * JsonTablePlanType -
+ * flags for JSON_TABLE plan node types representation
+ */
+typedef enum JsonTablePlanType
+{
+ JSTP_DEFAULT,
+ JSTP_SIMPLE,
+ JSTP_JOINED,
+} JsonTablePlanType;
+
+/*
+ * JsonTablePlanJoinType -
+ * JSON_TABLE join types for JSTP_JOINED plans
+ */
+typedef enum JsonTablePlanJoinType
+{
+ JSTP_JOIN_INNER,
+ JSTP_JOIN_OUTER,
+ JSTP_JOIN_CROSS,
+ JSTP_JOIN_UNION,
+} JsonTablePlanJoinType;
I can guess the enum value meaning of JsonTablePlanJoinType,
but I can't guess the meaning of "JSTP_SIMPLE" or "JSTP_JOINED".
adding some comments in JsonTablePlanType would make it more clear.

I think I can understand JsonTableScanNextRow.
but i don't understand JsonTablePlanNextRow.
maybe we can add some comments on JsonTableJoinState.


+-- unspecified plan (outer, union)
+select
+ jt.*
+from
+ jsonb_table_test jtt,
+ json_table (
+ jtt.js,'strict $[*]' as p
+ columns (
+ n for ordinality,
+ a int path 'lax $.a' default -1 on empty,
+ nested path 'strict $.b[*]' as pb columns ( b int path '$' ),
+ nested path 'strict $.c[*]' as pc columns ( c int path '$' )
+ )
+ ) jt;
+ n | a  | b | c
+---++---+
+ 1 |  1 |   |
+ 2 |  2 | 1 |
+ 2 |  2 | 2 |
+ 2 |  2 | 3 |
+ 2 |  2 |   | 10
+ 2 |  2 |   |
+ 2 |  2 |   | 20
+ 3 |  3 | 1 |
+ 3 |  3 | 2 |
+ 4 | -1 | 1 |
+ 4 | -1 | 2 |
+(11 rows)
+
+-- default plan (outer, union)
+select
+ jt.*
+from
+ jsonb_table_test jtt,
+ json_table (
+ jtt.js,'strict $[*]' as p
+ columns (
+ n for ordinality,
+ a int path 'lax $.a' default -1 on empty,
+ nested path 'strict $.b[*]' as pb columns ( b int path '$' ),
+ nested path 'strict $.c[*]' as pc columns ( c int path '$' )
+ )
+ plan default (outer, union)
+ ) jt;
+ n | a  | b | c
+---++---+
+ 1 |  1 |   |
+ 2 |  2 | 1 | 10
+ 2 |  2 | 1 |
+ 2 |  2 | 1 | 20
+ 2 |  2 | 2 | 10
+ 2 |  2 | 2 |
+ 2 |  2 | 2 | 20
+ 2 |  2 | 3 | 10
+ 2 |  2 | 3 |
+ 2 |  2 | 3 | 20
+ 3 |  3 |   |
+ 4 | -1 |   |
+(12 rows)
these two query results should be the same, if i understand it correctly.




Re: remaining sql/json patches

2024-03-21 Thread Kyotaro Horiguchi
At Fri, 22 Mar 2024 11:44:08 +0900, Amit Langote  
wrote in 
> Thanks for the heads up.
> 
> My bad, will push a fix shortly.

No problem. Thank you for the prompt correction.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: remaining sql/json patches

2024-03-21 Thread Amit Langote
Hi Horiguchi-san,

On Fri, Mar 22, 2024 at 9:51 AM Kyotaro Horiguchi
 wrote:
> At Wed, 20 Mar 2024 21:53:52 +0900, Amit Langote  
> wrote in
> > I'll push 0001 tomorrow.
>
> This patch (v44-0001-Add-SQL-JSON-query-functions.patch) introduced the 
> following new erro message:
>
> +errmsg("can only specify 
> constant, non-aggregate"
> +   " function, 
> or operator expression for"
> +   " DEFAULT"),
>
> I believe that our convention here is to write an error message in a
> single string literal, not split into multiple parts, for better
> grep'ability.

Thanks for the heads up.

My bad, will push a fix shortly.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-21 Thread Kyotaro Horiguchi
At Wed, 20 Mar 2024 21:53:52 +0900, Amit Langote  
wrote in 
> I'll push 0001 tomorrow.

This patch (v44-0001-Add-SQL-JSON-query-functions.patch) introduced the 
following new erro message:

+errmsg("can only specify 
constant, non-aggregate"
+   " function, or 
operator expression for"
+   " DEFAULT"),

I believe that our convention here is to write an error message in a
single string literal, not split into multiple parts, for better
grep'ability.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: remaining sql/json patches

2024-03-20 Thread jian he
looking at documentation again.
one very minor question (issue)

+   
+The ON EMPTY clause specifies the behavior if the
+path_expression yields no value at all; the
+default when ON EMPTY is not specified is to return
+a null value.
+   

I think it should be:

applying path_expression
or
evaluating path_expression

not "the path_expression"
?




Re: remaining sql/json patches

2024-03-20 Thread jian he
minor issues I found while looking through it.
other than these issues, looks good!

/*
 * Convert the a given JsonbValue to its C string representation
 *
 * Returns the string as a Datum setting *resnull if the JsonbValue is a
 * a jbvNull.
 */
static char *
ExecGetJsonValueItemString(JsonbValue *item, bool *resnull)
{
}
I think the comments are not right?

/*
 * Checks if the coercion evaluation led to an error.  If an error did occur,
 * this sets post_eval->error to trigger the ON ERROR handling steps.
 */
void
ExecEvalJsonCoercionFinish(ExprState *state, ExprEvalStep *op)
{
}
these comments on ExecEvalJsonCoercionFinish also need to be updated?


+ /*
+ * Coerce the result value by calling the input function coercion.
+ * *op->resvalue must point to C string in this case.
+ */
+ if (!*op->resnull && jsexpr->use_io_coercion)
+ {
+ FunctionCallInfo fcinfo;
+
+ fcinfo = jsestate->input_fcinfo;
+ Assert(fcinfo != NULL);
+ Assert(val_string != NULL);
+ fcinfo->args[0].value = PointerGetDatum(val_string);
+ fcinfo->args[0].isnull = *op->resnull;
+ /* second and third arguments are already set up */
+
+ fcinfo->isnull = false;
+ *op->resvalue = FunctionCallInvoke(fcinfo);
+ if (SOFT_ERROR_OCCURRED(>escontext))
+ error = true;
+
+ jump_eval_coercion = -1;
+ }

+ /* second and third arguments are already set up */
change to
/* second and third arguments are already set up in ExecInitJsonExpr */
would be great.


commit message

All of these functions only operate on jsonb values. The workaround
for now is to cast the argument to jsonb.

should be removed?


+ case T_JsonFuncExpr:
+ {
+ JsonFuncExpr *jfe = (JsonFuncExpr *) node;
+
+ if (WALK(jfe->context_item))
+ return true;
+ if (WALK(jfe->pathspec))
+ return true;
+ if (WALK(jfe->passing))
+ return true;
+ if (jfe->output && WALK(jfe->output))
+ return true;
+ if (jfe->on_empty)
+ return true;
+ if (jfe->on_error)
+ return true;
+ }

+ if (jfe->output && WALK(jfe->output))
+ return true;
can be simplified:

+ if (WALK(jfe->output))
+ return true;




Re: remaining sql/json patches

2024-03-19 Thread jian he
On Tue, Mar 19, 2024 at 6:46 PM Amit Langote  wrote:
>
> I intend to commit 0001+0002 after a bit more polishing.
>

V43 is far more intuitive! thanks!

if (isnull ||
(exprType(expr) == JSONBOID &&
btype == default_behavior))
coerce = true;
else
coerced_expr =
coerce_to_target_type(pstate, expr, exprType(expr),
  returning->typid, returning->typmod,
  COERCION_EXPLICIT, COERCE_EXPLICIT_CAST,
  exprLocation((Node *) behavior));

obviously, there are cases where "coerce" is false, and "coerced_expr"
is not null.
so I think the bool "coerce" variable naming is not very intuitive.
maybe we can add some comments or change to a better name.


JsonPathVariableEvalContext
JsonPathVarCallback
JsonItemType
JsonExprPostEvalState
these should remove from src/tools/pgindent/typedefs.list


+/*
+ * Performs JsonPath{Exists|Query|Value}() for a given context_item and
+ * jsonpath.
+ *
+ * Result is set in *op->resvalue and *op->resnull.  Return value is the
+ * step address to be performed next.
+ *
+ * On return, JsonExprPostEvalState is populated with the following details:
+ * - error.value: true if an error occurred during JsonPath evaluation
+ * - empty.value: true if JsonPath{Query|Value}() found no matching item
+ *
+ * No return if the ON ERROR/EMPTY behavior is ERROR.
+ */
+int
+ExecEvalJsonExprPath(ExprState *state, ExprEvalStep *op,
+ ExprContext *econtext)

" No return if the ON ERROR/EMPTY behavior is ERROR."  is wrong?
counter example:
SELECT JSON_QUERY(jsonb '{"a":[12,2]}', '$.a' RETURNING int4RANGE omit
quotes error on error);
also "JsonExprPostEvalState" does not exist any more.
overall feel like ExecEvalJsonExprPath comments need to be rephrased.




Re: remaining sql/json patches

2024-03-18 Thread Himanshu Upadhyaya
On Mon, Mar 18, 2024 at 3:33 PM Amit Langote 
wrote:

> Himanshu,
>
> On Mon, Mar 18, 2024 at 4:57 PM Himanshu Upadhyaya
>  wrote:
> > I have tested a nested case  but  why is the negative number allowed in
> subscript(NESTED '$.phones[-1]'COLUMNS), it should error out if the number
> is negative.
> >
> > ‘postgres[170683]=#’SELECT * FROM JSON_TABLE(jsonb '{
> > ‘...>’ "id" : "0.234567897890",
> > ‘...>’ "name" : {
> "first":"John",
> "last":"Doe" },
> > ‘...>’ "phones" : [{"type":"home", "number":"555-3762"},
> > ‘...>’ {"type":"work", "number":"555-7252",
> "test":123}]}',
> > ‘...>’'$'
> > ‘...>’COLUMNS(
> > ‘...>’ id numeric(2,2) PATH 'lax $.id',
> > ‘...>’ last_name varCHAR(10) PATH 'lax $.name.last',
> first_name VARCHAR(10) PATH 'lax $.name.first',
> > ‘...>’  NESTED '$.phones[-1]'COLUMNS (
> > ‘...>’"type" VARCHAR(10),
> > ‘...>’"number" VARCHAR(10)
> > ‘...>’ )
> > ‘...>’  )
> > ‘...>’   ) as t;
> >   id  | last_name | first_name | type | number
> > --+---++--+
> >  0.23 | Doe   | Johnnn |  |
> > (1 row)
>
> You're not getting an error because the default mode of handling
> structural errors in SQL/JSON path expressions is "lax".  If you say
> "strict" in the path string, you will get an error:
>
>
ok, got it, thanks.


> SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : "0.234567897890",
>  "name" : {
> "first":"John",
> "last":"Doe" },
>  "phones" : [{"type":"home", "number":"555-3762"},
>  {"type":"work", "number":"555-7252", "test":123}]}',
> '$'
> COLUMNS(
>  id numeric(2,2) PATH 'lax $.id',
>  last_name varCHAR(10) PATH 'lax $.name.last',
> first_name VARCHAR(10) PATH 'lax $.name.first',
>   NESTED 'strict $.phones[-1]'COLUMNS (
> "type" VARCHAR(10),
> "number" VARCHAR(10)
>  )
>   ) error on error
>) as t;
> ERROR:  jsonpath array subscript is out of bounds
>
> --
> Thanks, Amit Langote
>


-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-18 Thread Amit Langote
Himanshu,

On Mon, Mar 18, 2024 at 4:57 PM Himanshu Upadhyaya
 wrote:
> I have tested a nested case  but  why is the negative number allowed in 
> subscript(NESTED '$.phones[-1]'COLUMNS), it should error out if the number is 
> negative.
>
> ‘postgres[170683]=#’SELECT * FROM JSON_TABLE(jsonb '{
> ‘...>’ "id" : "0.234567897890",
> ‘...>’ "name" : { 
> "first":"John", "last":"Doe" 
> },
> ‘...>’ "phones" : [{"type":"home", "number":"555-3762"},
> ‘...>’ {"type":"work", "number":"555-7252", 
> "test":123}]}',
> ‘...>’'$'
> ‘...>’COLUMNS(
> ‘...>’ id numeric(2,2) PATH 'lax $.id',
> ‘...>’ last_name varCHAR(10) PATH 'lax $.name.last', 
> first_name VARCHAR(10) PATH 'lax $.name.first',
> ‘...>’  NESTED '$.phones[-1]'COLUMNS (
> ‘...>’"type" VARCHAR(10),
> ‘...>’"number" VARCHAR(10)
> ‘...>’ )
> ‘...>’  )
> ‘...>’   ) as t;
>   id  | last_name | first_name | type | number
> --+---++--+
>  0.23 | Doe   | Johnnn |  |
> (1 row)

You're not getting an error because the default mode of handling
structural errors in SQL/JSON path expressions is "lax".  If you say
"strict" in the path string, you will get an error:

SELECT * FROM JSON_TABLE(jsonb '{
 "id" : "0.234567897890",
 "name" : {
"first":"John",
"last":"Doe" },
 "phones" : [{"type":"home", "number":"555-3762"},
 {"type":"work", "number":"555-7252", "test":123}]}',
'$'
COLUMNS(
 id numeric(2,2) PATH 'lax $.id',
 last_name varCHAR(10) PATH 'lax $.name.last',
first_name VARCHAR(10) PATH 'lax $.name.first',
  NESTED 'strict $.phones[-1]'COLUMNS (
"type" VARCHAR(10),
"number" VARCHAR(10)
 )
  ) error on error
   ) as t;
ERROR:  jsonpath array subscript is out of bounds

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-18 Thread Himanshu Upadhyaya
I have tested a nested case  but  why is the negative number allowed in
subscript(NESTED '$.phones[-1]'COLUMNS), it should error out if the number
is negative.

‘postgres[170683]=#’SELECT * FROM JSON_TABLE(jsonb '{
‘...>’ "id" : "0.234567897890",
‘...>’ "name" : {
"first":"John",
"last":"Doe" },
‘...>’ "phones" : [{"type":"home", "number":"555-3762"},
‘...>’ {"type":"work", "number":"555-7252",
"test":123}]}',
‘...>’'$'
‘...>’COLUMNS(
‘...>’ id numeric(2,2) PATH 'lax $.id',
‘...>’ last_name varCHAR(10) PATH 'lax $.name.last',
first_name VARCHAR(10) PATH 'lax $.name.first',
‘...>’  NESTED '$.phones[-1]'COLUMNS (
‘...>’"type" VARCHAR(10),
‘...>’"number" VARCHAR(10)
‘...>’ )
‘...>’  )
‘...>’   ) as t;
  id  | last_name | first_name | type | number
--+---++--+
 0.23 | Doe   | Johnnn |  |
(1 row)

Thanks,
Himanshu


Re: remaining sql/json patches

2024-03-17 Thread Amit Langote
On Wed, Mar 13, 2024 at 5:47 AM Alvaro Herrera  wrote:
> About 0002:
>
> I think we should just drop it.  Look at the changes it produces in the
> plans for aliases XMLTABLE:
>
> > @@ -1556,7 +1556,7 @@ SELECT f.* FROM xmldata, LATERAL 
> > xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
> > Output: f."COUNTRY_NAME", f."REGION_ID"
> > ->  Seq Scan on public.xmldata
> >   Output: xmldata.data
> > -   ->  Table Function Scan on "xmltable" f
> > +   ->  Table Function Scan on "XMLTABLE" f
> >   Output: f."COUNTRY_NAME", f."REGION_ID"
> >   Table Function Call: XMLTABLE(('/ROWS/ROW[COUNTRY_NAME="Japan" or 
> > COUNTRY_NAME="India"]'::text) PASSING (xmldata.data) COLUMNS "COUNTRY_NAME" 
> > text, "REGION_ID" integer)
> >   Filter: (f."COUNTRY_NAME" = 'Japan'::text)
>
> Here in text-format EXPLAIN, we already have the alias next to the
> "xmltable" moniker, when an alias is present.  This matches the
> query itself as well as the labels used in the "Output:" display.
> If an alias is not present, then this says just 'Table Function Scan on 
> "xmltable"'
> and the rest of the plans refers to this as "xmltable", so it's also
> fine.
>
> > @@ -1591,7 +1591,7 @@ SELECT f.* FROM xmldata, LATERAL 
> > xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
> > "Parent Relationship": "Inner", 
> > 
> >  +
> > "Parallel Aware": false,
> > 
> >  +
> > "Async Capable": false, 
> > 
> >  +
> > -   "Table Function Name": "xmltable",  
> > 
> >  +
> > +   "Table Function Name": "XMLTABLE",  
> > 
> >  +
> > "Alias": "f",   
> > 
> >  +
> > "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],
> > 
> >  +
> > "Table Function Call": 
> > "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or 
> > COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS 
> > \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+
>
> This is the JSON-format explain.  Notice that the "Alias" member already
> shows the alias "f", so the only thing this change is doing is
> uppercasing the "xmltable" to "XMLTABLE".  We're not really achieving
> anything here.
>
> I think the only salvageable piece from this, **if anything**, is making
> the "xmltable" literal string into uppercase.  That might bring a little
> clarity to the fact that this is a keyword and not a user-introduced
> name.
>
>
> In your 0003 I think this would only have relevance in this query,
>
> +-- JSON_TABLE() with alias
> +EXPLAIN (COSTS OFF, VERBOSE)
> +SELECT * FROM
> +   JSON_TABLE(
> +   jsonb 'null', 'lax $[*]' PASSING 1 + 2 AS a, json '"foo"' AS "b c"
> +   COLUMNS (
> +   id FOR ORDINALITY,
> +   "int" int PATH '$',
> +   "text" text PATH '$'
> +   )) json_table_func;
> + 
>   QUERY PLAN
>
> +--
> --
> + Table Function Scan on "JSON_TABLE" json_table_func
> +   Output: id, "int", text
> +   Table Function Call: JSON_TABLE('null'::jsonb, '$[*]' AS 
> json_table_path_0 PASSING 3 AS a, '"foo"'::jsonb AS "b c" COLUMNS (id FOR 
> ORDINALITY, "int" integer PATH '$', text text PATH '$') PLAN 
> (json_table_path_0))
> +(3 rows)
>
> and I'm curious to see what this would output if this was to be run
> without the 0002 patch.  If I understand things correctly, the alias
> would be displayed anyway, meaning 0002 doesn't get us anything.

Patch 0002 came about because old versions of json_table patch were
changing ExplainTargetRel() incorrectly to use rte->tablefunc to get
the function type to set objectname, but rte->tablefunc is NULL
because add_rte_to_flat_rtable() zaps it.  You pointed that out in
[1].

However, 

Re: remaining sql/json patches

2024-03-15 Thread jian he
On Mon, Mar 11, 2024 at 11:30 AM jian he  wrote:
>
> On Sun, Mar 10, 2024 at 10:57 PM jian he  wrote:
> >
> > one more issue.
>
> Hi
> one more documentation issue.
> after applied V42, 0001 to 0003,
> there are 11 appearance of `FORMAT JSON` in functions-json.html
> still not a single place explained what it is for.
>
> json_query ( context_item, path_expression [ PASSING { value AS
> varname } [, ...]] [ RETURNING data_type [ FORMAT JSON [ ENCODING UTF8
> ] ] ] [ { WITHOUT | WITH { CONDITIONAL | [UNCONDITIONAL] } } [ ARRAY ]
> WRAPPER ] [ { KEEP | OMIT } QUOTES [ ON SCALAR STRING ] ] [ { ERROR |
> NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY ]
> [ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
> ON ERROR ])
>
> FORMAT JSON seems just a syntax sugar or for compatibility in json_query.
> but it returns an error when the returning type category is not
> TYPCATEGORY_STRING.
>
> for example, even the following will return an error.
> `
> CREATE TYPE regtest_comptype AS (b text);
> SELECT JSON_QUERY(jsonb '{"a":{"b":"c"}}', '$.a' RETURNING
> regtest_comptype format json);
> `
>
> seems only types in[0] will not generate an error, when specifying
> FORMAT JSON in JSON_QUERY.
>
> so it actually does something, not a syntax sugar?
>

SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 text format json PATH '$' omit quotes));
SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 text format json PATH '$' keep quotes));
SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 text PATH '$' keep quotes)); -- JSON_QUERY_OP
SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 text PATH '$' omit quotes)); -- JSON_QUERY_OP
SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 text PATH '$')); -- JSON_VALUE_OP
SELECT * FROM JSON_TABLE(jsonb'[{"aaa": 123}]', 'lax $[*]' COLUMNS
(js2 json PATH '$')); -- JSON_QUERY_OP
comparing these queries, I think 'FORMAT JSON' main usage is in json_table.

CREATE TYPE regtest_comptype AS (b text);
SELECT JSON_QUERY(jsonb '{"a":{"b":"c"}}', '$.a' RETURNING
regtest_comptype format json);
ERROR:  cannot use JSON format with non-string output types
LINE 1: ..."a":{"b":"c"}}', '$.a' RETURNING regtest_comptype format jso...
 ^
the error message is not good, but that's a minor issue. we can pursue it later.
-
SELECT JSON_QUERY(jsonb 'true', '$' RETURNING int KEEP QUOTES );
SELECT JSON_QUERY(jsonb 'true', '$' RETURNING int omit QUOTES );
SELECT JSON_VALUE(jsonb 'true', '$' RETURNING int);
the third query returns integer 1, not sure this is the desired behavior.
it obviously has an implication for json_table.
-
in jsonb_get_element, we have something like:
if (jbvp->type == jbvBinary)
{
container = jbvp->val.binary.data;
have_object = JsonContainerIsObject(container);
have_array = JsonContainerIsArray(container);
Assert(!JsonContainerIsScalar(container));
}

+ res = JsonValueListHead();
+ if (res->type == jbvBinary && JsonContainerIsScalar(res->val.binary.data))
+ JsonbExtractScalar(res->val.binary.data, res);
So in JsonPathValue, the above (res->type == jbvBinary) is unreachable?
also see the comment in jbvBinary.

maybe we can just simply do:
if (res->type == jbvBinary)
Assert(!JsonContainerIsScalar(res->val.binary.data));
-
+
+JSON_TABLE (
+  context_item,
path_expression  AS
json_path_name  
PASSING { value AS
varname } , ...

+  COLUMNS ( json_table_column ,
... )
+   { ERROR | EMPTY }
ON ERROR 
+  
+PLAN ( json_table_plan ) |
+PLAN DEFAULT ( { INNER | OUTER }  , { CROSS | UNION } 
+ | { CROSS | UNION }  , { INNER | OUTER }
 )
+  
+)

based on the synopsis
the following query should not be allowed?
SELECT *FROM (VALUES ('"11"'), ('"err"')) vals(js)
LEFT OUTER JOIN  JSON_TABLE(vals.js::jsonb, '$' COLUMNS (a int PATH
'$') default '11' ON ERROR) jt ON true;

aslo the synopsis need to reflect case like:
SELECT *FROM (VALUES ('"11"'), ('"err"')) vals(js)
LEFT OUTER JOIN  JSON_TABLE(vals.js::jsonb, '$' COLUMNS (a int PATH
'$') NULL ON ERROR) jt ON true;




Re: remaining sql/json patches

2024-03-14 Thread jian he
one more question...
SELECT  JSON_value(NULL::int, '$' returning int);
ERROR:  cannot use non-string types with implicit FORMAT JSON clause
LINE 1: SELECT  JSON_value(NULL::int, '$' returning int);
   ^

SELECT  JSON_query(NULL::int, '$' returning int);
ERROR:  cannot use non-string types with implicit FORMAT JSON clause
LINE 1: SELECT  JSON_query(NULL::int, '$' returning int);
   ^

SELECT * FROM JSON_TABLE(NULL::int, '$' COLUMNS (foo text));
ERROR:  cannot use non-string types with implicit FORMAT JSON clause
LINE 1: SELECT * FROM JSON_TABLE(NULL::int, '$' COLUMNS (foo text));
 ^

SELECT  JSON_value(NULL::text, '$' returning int);
ERROR:  JSON_VALUE() is not yet implemented for the json type
LINE 1: SELECT  JSON_value(NULL::text, '$' returning int);
   ^
HINT:  Try casting the argument to jsonb


SELECT  JSON_query(NULL::text, '$' returning int);
ERROR:  JSON_QUERY() is not yet implemented for the json type
LINE 1: SELECT  JSON_query(NULL::text, '$' returning int);
   ^
HINT:  Try casting the argument to jsonb

in all these cases, the error message seems strange.

we already mentioned:
  
   
SQL/JSON query functions currently only accept values of the
jsonb type, because the SQL/JSON path language only
supports those, so it might be necessary to cast the
context_item argument of these functions to
jsonb.
   
  

we can simply say, only accept the first argument to be jsonb data type.




Re: remaining sql/json patches

2024-03-12 Thread Himanshu Upadhyaya
On Tue, Mar 12, 2024 at 5:37 PM Amit Langote 
wrote:

>
>
> SELECT JSON_EXISTS(jsonb '{"customer_name": "test", "salary":1000,
> "department_id":1}', '$ ? (@.department_id == $dept_id && @.salary ==
> $sal)' PASSING 1000 AS sal, 1 as dept_id);
>  json_exists
> -
>  t
> (1 row)
>
> Does that make sense?
>
> Yes, got it. Thanks for the clarification.

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-12 Thread Alvaro Herrera
About 0002:

I think we should just drop it.  Look at the changes it produces in the
plans for aliases XMLTABLE:

> @@ -1556,7 +1556,7 @@ SELECT f.* FROM xmldata, LATERAL 
> xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
> Output: f."COUNTRY_NAME", f."REGION_ID"
> ->  Seq Scan on public.xmldata
>   Output: xmldata.data
> -   ->  Table Function Scan on "xmltable" f
> +   ->  Table Function Scan on "XMLTABLE" f
>   Output: f."COUNTRY_NAME", f."REGION_ID"
>   Table Function Call: XMLTABLE(('/ROWS/ROW[COUNTRY_NAME="Japan" or 
> COUNTRY_NAME="India"]'::text) PASSING (xmldata.data) COLUMNS "COUNTRY_NAME" 
> text, "REGION_ID" integer)
>   Filter: (f."COUNTRY_NAME" = 'Japan'::text)

Here in text-format EXPLAIN, we already have the alias next to the
"xmltable" moniker, when an alias is present.  This matches the
query itself as well as the labels used in the "Output:" display.
If an alias is not present, then this says just 'Table Function Scan on 
"xmltable"'
and the rest of the plans refers to this as "xmltable", so it's also
fine.

> @@ -1591,7 +1591,7 @@ SELECT f.* FROM xmldata, LATERAL 
> xmltable('/ROWS/ROW[COUNTRY_NAME="Japan" or COU
> "Parent Relationship": "Inner",   
>   
>  +
> "Parallel Aware": false,  
>   
>  +
> "Async Capable": false,   
>   
>  +
> -   "Table Function Name": "xmltable",
>   
>  +
> +   "Table Function Name": "XMLTABLE",
>   
>  +
> "Alias": "f", 
>   
>  +
> "Output": ["f.\"COUNTRY_NAME\"", "f.\"REGION_ID\""],  
>   
>  +
> "Table Function Call": 
> "XMLTABLE(('/ROWS/ROW[COUNTRY_NAME=\"Japan\" or 
> COUNTRY_NAME=\"India\"]'::text) PASSING (xmldata.data) COLUMNS 
> \"COUNTRY_NAME\" text, \"REGION_ID\" integer)",+

This is the JSON-format explain.  Notice that the "Alias" member already
shows the alias "f", so the only thing this change is doing is
uppercasing the "xmltable" to "XMLTABLE".  We're not really achieving
anything here.

I think the only salvageable piece from this, **if anything**, is making
the "xmltable" literal string into uppercase.  That might bring a little
clarity to the fact that this is a keyword and not a user-introduced
name.


In your 0003 I think this would only have relevance in this query,

+-- JSON_TABLE() with alias
+EXPLAIN (COSTS OFF, VERBOSE)
+SELECT * FROM
+   JSON_TABLE(
+   jsonb 'null', 'lax $[*]' PASSING 1 + 2 AS a, json '"foo"' AS "b c"
+   COLUMNS (
+   id FOR ORDINALITY,
+   "int" int PATH '$',
+   "text" text PATH '$'
+   )) json_table_func;
+   
QUERY PLAN 
  
+--
--
+ Table Function Scan on "JSON_TABLE" json_table_func
+   Output: id, "int", text
+   Table Function Call: JSON_TABLE('null'::jsonb, '$[*]' AS json_table_path_0 
PASSING 3 AS a, '"foo"'::jsonb AS "b c" COLUMNS (id FOR ORDINALITY, "int" 
integer PATH '$', text text PATH '$') PLAN (json_table_path_0))
+(3 rows)

and I'm curious to see what this would output if this was to be run
without the 0002 patch.  If I understand things correctly, the alias
would be displayed anyway, meaning 0002 doesn't get us anything.

Please do add a test with EXPLAIN (FORMAT JSON) in 0003.

Thanks

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"




Re: remaining sql/json patches

2024-03-12 Thread Amit Langote
Hi Himanshu,

On Tue, Mar 12, 2024 at 6:42 PM Himanshu Upadhyaya
 wrote:
>
> Hi,
>
> wanted to share the below case:
>
> ‘postgres[146443]=#’SELECT JSON_EXISTS(jsonb '{"customer_name": "test", 
> "salary":1000, "department_id":1}', '$.* ? (@== $dept_id && @ == $sal)' 
> PASSING 1000 AS sal, 1 as dept_id);
>  json_exists
> -
>  f
> (1 row)
>
> isn't it supposed to return "true" as json in input is matching with both the 
> condition dept_id and salary?

I think you meant to use || in your condition, not &&, because 1000 != 1.

See:

SELECT JSON_EXISTS(jsonb '{"customer_name": "test", "salary":1000,
"department_id":1}', '$.* ? (@ == $dept_id || @ == $sal)' PASSING 1000
AS sal, 1 as dept_id);
 json_exists
-
 t
(1 row)

Or you could've written the query as:

SELECT JSON_EXISTS(jsonb '{"customer_name": "test", "salary":1000,
"department_id":1}', '$ ? (@.department_id == $dept_id && @.salary ==
$sal)' PASSING 1000 AS sal, 1 as dept_id);
 json_exists
-
 t
(1 row)

Does that make sense?

In any case, JSON_EXISTS() added by the patch here returns whatever
the jsonpath executor returns.  The latter is not touched by this
patch.  PASSING args, which this patch adds, seem to be working
correctly too.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-12 Thread Himanshu Upadhyaya
Hi,

wanted to share the below case:

‘postgres[146443]=#’SELECT JSON_EXISTS(jsonb '{"customer_name": "test",
"salary":1000, "department_id":1}', '$.* ? (@== $dept_id && @ == $sal)'
PASSING 1000 AS sal, 1 as dept_id);
 json_exists
-
 f
(1 row)

isn't it supposed to return "true" as json in input is matching with both
the condition dept_id and salary?

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-11 Thread Shruthi Gowda
Thanka Alvaro. It works fine when quotes are used around the column name.

On Mon, Mar 11, 2024 at 9:04 PM Alvaro Herrera 
wrote:

> On 2024-Mar-11, Shruthi Gowda wrote:
>
> > *CASE 2:*
> > --
> > SELECT * FROM JSON_TABLE(jsonb '{
> >  "id" : 901,
> >  "age" : 30,
> >  "*FULL_NAME*" : "KATE DANIEL"}',
> > '$'
> > COLUMNS(
> >  FULL_NAME varchar(20),
> >  ID int,
> >  AGE int
> >   )
> >) as t;
>
> I think this is expected: when you use FULL_NAME as a SQL identifier, it
> is down-cased, so it no longer matches the uppercase identifier in the
> JSON data.  You'd have to do it like this:
>
> SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : 901,
>  "age" : 30,
>  "*FULL_NAME*" : "KATE DANIEL"}',
> '$'
> COLUMNS(
>  "FULL_NAME" varchar(20),
>  ID int,
>  AGE int
>   )
>) as t;
>
> so that the SQL identifier is not downcased.
>
> --
> Álvaro Herrera PostgreSQL Developer  —
> https://www.EnterpriseDB.com/
>


Re: remaining sql/json patches

2024-03-11 Thread Alvaro Herrera
On 2024-Mar-11, Shruthi Gowda wrote:

> *CASE 2:*
> --
> SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : 901,
>  "age" : 30,
>  "*FULL_NAME*" : "KATE DANIEL"}',
> '$'
> COLUMNS(
>  FULL_NAME varchar(20),
>  ID int,
>  AGE int
>   )
>) as t;

I think this is expected: when you use FULL_NAME as a SQL identifier, it
is down-cased, so it no longer matches the uppercase identifier in the
JSON data.  You'd have to do it like this:

SELECT * FROM JSON_TABLE(jsonb '{
 "id" : 901,
 "age" : 30,
 "*FULL_NAME*" : "KATE DANIEL"}',
'$'
COLUMNS(
 "FULL_NAME" varchar(20),
 ID int,
 AGE int
  )
   ) as t;

so that the SQL identifier is not downcased.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: remaining sql/json patches

2024-03-11 Thread Shruthi Gowda
Hi,
I was experimenting with the v42 patches, and I tried testing without
providing the path explicitly. There is one difference between the two test
cases that I have highlighted in blue.
The full_name column is empty in the second test case result.  Let me know
if this is an issue or expected behaviour.

*CASE 1:*
---
SELECT * FROM JSON_TABLE(jsonb '{
 "id" : 901,
 "age" : 30,
 "*full_name*" : "KATE DANIEL"}',
'$'
COLUMNS(
 FULL_NAME varchar(20),
 ID int,
 AGE  int
  )

   ) as t;

*RESULT:*
  full_name  | id  | age
-+-+-
 KATE DANIEL | 901 | 30

(1 row)

*CASE 2:*
--
SELECT * FROM JSON_TABLE(jsonb '{
 "id" : 901,
 "age" : 30,
 "*FULL_NAME*" : "KATE DANIEL"}',
'$'
COLUMNS(
 FULL_NAME varchar(20),
 ID int,
 AGE int
  )

   ) as t;

*RESULT:*
 full_name | id  | age
---+-+-
   | 901 | 30
(1 row)


Thanks & Regards,
Shruthi K C
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-11 Thread jian he
Hi.
more minor issues.

by searching `elog(ERROR, "unrecognized node type: %d"`
I found that generally enum is cast to int, before printing it out.
I also found a related post at [1].

So I add the typecast to int, before printing it out.
most of the refactored code is unlikely to be reachable, but still.

I also refactored ExecPrepareJsonItemCoercion error message, to make
the error message more explicit.
@@ -4498,7 +4498,9 @@ ExecPrepareJsonItemCoercion(JsonbValue *item,
JsonExprState *jsestate,
if (throw_error)
ereport(ERROR,

errcode(ERRCODE_SQL_JSON_ITEM_CANNOT_BE_CAST_TO_TARGET_TYPE),
-   errmsg("SQL/JSON item cannot
be cast to target type"));
+
errcode(ERRCODE_SQL_JSON_ITEM_CANNOT_BE_CAST_TO_TARGET_TYPE),
+   errmsg("SQL/JSON item cannot
be cast to type %s",
+
format_type_be(jsestate->jsexpr->returning->typid)));

+ /*
+ * We abuse CaseTestExpr here as placeholder to pass the result of
+ * evaluating the JSON_VALUE/QUERY jsonpath expression as input to the
+ * coercion expression.
+ */
+ CaseTestExpr *placeholder = makeNode(CaseTestExpr);
typo in comment, should it be `JSON_VALUE/JSON_QUERY`?

[1] 
https://stackoverflow.com/questions/8012647/can-we-typecast-a-enum-variable-in-c


v42-0001-miscellaneous-fix-based-on-v42.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-03-11 Thread jian he
one more issue.

+-- Extension: non-constant JSON path
+SELECT JSON_EXISTS(jsonb '{"a": 123}', '$' || '.' || 'a');
+SELECT JSON_VALUE(jsonb '{"a": 123}', '$' || '.' || 'a');
+SELECT JSON_VALUE(jsonb '{"a": 123}', '$' || '.' || 'b' DEFAULT 'foo'
ON EMPTY);
+SELECT JSON_QUERY(jsonb '{"a": 123}', '$' || '.' || 'a');
+SELECT JSON_QUERY(jsonb '{"a": 123}', '$' || '.' || 'a' WITH WRAPPER);

json path may not be a plain Const.
does the following code in expression_tree_walker_impl need to consider
cases when the `jexpr->path_spec` part is not a Const?

+ case T_JsonExpr:
+ {
+ JsonExpr   *jexpr = (JsonExpr *) node;
+
+ if (WALK(jexpr->formatted_expr))
+ return true;
+ if (WALK(jexpr->result_coercion))
+ return true;
+ if (WALK(jexpr->item_coercions))
+ return true;
+ if (WALK(jexpr->passing_values))
+ return true;
+ /* we assume walker doesn't care about passing_names */
+ if (WALK(jexpr->on_empty))
+ return true;
+ if (WALK(jexpr->on_error))
+ return true;
+ }




Re: remaining sql/json patches

2024-03-10 Thread jian he
On Sun, Mar 10, 2024 at 10:57 PM jian he  wrote:
>
> one more issue.

Hi
one more documentation issue.
after applied V42, 0001 to 0003,
there are 11 appearance of `FORMAT JSON` in functions-json.html
still not a single place explained what it is for.

json_query ( context_item, path_expression [ PASSING { value AS
varname } [, ...]] [ RETURNING data_type [ FORMAT JSON [ ENCODING UTF8
] ] ] [ { WITHOUT | WITH { CONDITIONAL | [UNCONDITIONAL] } } [ ARRAY ]
WRAPPER ] [ { KEEP | OMIT } QUOTES [ ON SCALAR STRING ] ] [ { ERROR |
NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression } ON EMPTY ]
[ { ERROR | NULL | EMPTY { [ ARRAY ] | OBJECT } | DEFAULT expression }
ON ERROR ])

FORMAT JSON seems just a syntax sugar or for compatibility in json_query.
but it returns an error when the returning type category is not
TYPCATEGORY_STRING.

for example, even the following will return an error.
`
CREATE TYPE regtest_comptype AS (b text);
SELECT JSON_QUERY(jsonb '{"a":{"b":"c"}}', '$.a' RETURNING
regtest_comptype format json);
`

seems only types in[0] will not generate an error, when specifying
FORMAT JSON in JSON_QUERY.

so it actually does something, not a syntax sugar?

[0] https://www.postgresql.org/docs/current/datatype-character.html




Re: remaining sql/json patches

2024-03-10 Thread jian he
one more issue.
+ case JSON_VALUE_OP:
+ /* Always omit quotes from scalar strings. */
+ jsexpr->omit_quotes = (func->quotes == JS_QUOTES_OMIT);
+
+ /* JSON_VALUE returns text by default. */
+ if (!OidIsValid(jsexpr->returning->typid))
+ {
+ jsexpr->returning->typid = TEXTOID;
+ jsexpr->returning->typmod = -1;
+ }

by default, makeNode(JsonExpr), node initialization,
jsexpr->omit_quotes will initialize to false,
Even though there was no implication to the JSON_TABLE patch (probably
because coerceJsonFuncExprOutput), all tests still passed.
based on the above comment, and the regress test, you still need do (i think)
`
jsexpr->omit_quotes = true;
`




Re: remaining sql/json patches

2024-03-09 Thread Andy Fan


jian he  writes:

> On Tue, Mar 5, 2024 at 12:38 PM Andy Fan  wrote:
>>
>>
>> In the commit message of 0001, we have:
>>
>> """
>> Both JSON_VALUE() and JSON_QUERY() functions have options for
>> handling EMPTY and ERROR conditions, which can be used to specify
>> the behavior when no values are matched and when an error occurs
>> during evaluation, respectively.
>>
>> All of these functions only operate on jsonb values. The workaround
>> for now is to cast the argument to jsonb.
>> """
>>
>> which is not clear for me why we introduce JSON_VALUE() function, is it
>> for handling EMPTY or ERROR conditions? I think the existing cast
>> workaround have a similar capacity?
>>
>
> I guess because it's in the standard.
> but I don't see individual sql standard Identifier, JSON_VALUE in
> sql_features.txt
> I do see JSON_QUERY.
> mysql also have JSON_VALUE, [1]
>
> EMPTY, ERROR: there is a standard Identifier: T825: SQL/JSON: ON EMPTY
> and ON ERROR clauses
>
> [1] 
> https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html#function_json-value

Thank you for this informatoin!

-- 
Best Regards
Andy Fan





Re: remaining sql/json patches

2024-03-07 Thread jian he
I looked at the documentation again.
one more changes for JSON_QUERY:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3e58ebd2..0c49b321 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18715,8 +18715,8 @@ ERROR:  jsonpath array subscript is out of bounds
 be of type jsonb.


-The ON EMPTY clause specifies the behavior if the
-path_expression yields no value at all; the
+The ON EMPTY clause specifies the behavior
if applying the
+path_expression to the
context_item yields no value at all; the
 default when ON EMPTY is not specified is to return
 a null value.





Re: remaining sql/json patches

2024-03-07 Thread Amit Langote
On Thu, Mar 7, 2024 at 23:14 Alvaro Herrera  wrote:

> On 2024-Mar-07, Tomas Vondra wrote:
>
> > I was experimenting with the v42 patches, and I think the handling of ON
> > EMPTY / ON ERROR clauses may need some improvement.
>
> Well, the 2023 standard says things like
>
>  ::=
>   JSON_VALUE 
>   
>   [  ]
>   [  ON EMPTY ]
>   [  ON ERROR ]
>   
>
> which implies that if you specify it the other way around, it's a syntax
> error.
>
> > I'm not sure what the SQL standard says about this, but it seems other
> > databases don't agree on the order. Is there a particular reason to
> > not allow both orderings?
>
> I vaguely recall that trying to also support the other ordering leads to
> having more rules.


Yeah, I think that was it.  At one point, I removed rules supporting syntax
that wasn’t documented.

Now maybe we do want that because of compatibility
> with other DBMSs, but frankly at this stage I wouldn't bother.


+1.

>


Re: remaining sql/json patches

2024-03-07 Thread Alvaro Herrera
On 2024-Mar-07, Tomas Vondra wrote:

> I was experimenting with the v42 patches, and I think the handling of ON
> EMPTY / ON ERROR clauses may need some improvement.

Well, the 2023 standard says things like

 ::=
  JSON_VALUE 
  
  [  ]
  [  ON EMPTY ]
  [  ON ERROR ]
  

which implies that if you specify it the other way around, it's a syntax
error.

> I'm not sure what the SQL standard says about this, but it seems other
> databases don't agree on the order. Is there a particular reason to
> not allow both orderings?

I vaguely recall that trying to also support the other ordering leads to
having more rules.  Now maybe we do want that because of compatibility
with other DBMSs, but frankly at this stage I wouldn't bother.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this."   (Fotis)
  https://postgr.es/m/200606261359.k5qdxe2p004...@auth-smtp.hol.gr




Re: remaining sql/json patches

2024-03-07 Thread Amit Langote
On Thu, Mar 7, 2024 at 22:46 jian he  wrote:

> On Thu, Mar 7, 2024 at 8:06 PM Amit Langote 
> wrote:
> >
> >
> > Indeed.
> >
> > This boils down to the difference in the cast expression chosen to
> > convert the source value to int in the two cases.
> >
> > The case where the source value has no quotes, the chosen cast
> > expression is a FuncExpr for function numeric_int4(), which has no way
> > to suppress errors.  When the source value has quotes, the cast
> > expression is a CoerceViaIO expression, which can suppress the error.
> > The default behavior is to suppress the error and return NULL, so the
> > correct behavior is when the source value has quotes.
> >
> > I think we'll need either:
> >
> > * fix the code in 0001 to avoid getting numeric_int4() in this case,
> > and generally cast functions that don't have soft-error handling
> > support, in favor of using IO coercion.
> > * fix FuncExpr (like CoerceViaIO) to respect SQL/JSON's request to
> > suppress errors and fix downstream functions like numeric_int4() to
> > comply by handling errors softly.
> >
> > I'm inclined to go with the 1st option as we already have the
> > infrastructure in place -- input functions can all handle errors
> > softly.
>
> not sure this is the right way.
> but attached patches solved this problem.
>
> Also, can you share the previous memory-hogging bug issue
> when you are free, I want to know which part I am missing.


Take a look at the json_populate_type() call in ExecEvalJsonCoercion() or
specifically compare the new way of passing its void *cache parameter with
the earlier patches.

>


Re: remaining sql/json patches

2024-03-07 Thread Tomas Vondra
Hi,

I was experimenting with the v42 patches, and I think the handling of ON
EMPTY / ON ERROR clauses may need some improvement. The grammar is
currently defined like this:

| json_behavior ON EMPTY_P json_behavior ON ERROR_P

This means the clauses have to be defined exactly in this order, and if
someone does

NULL ON ERROR NULL ON EMPTY

it results in syntax error. I'm not sure what the SQL standard says
about this, but it seems other databases don't agree on the order. Is
there a particular reason to not allow both orderings?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: remaining sql/json patches

2024-03-07 Thread jian he
On Thu, Mar 7, 2024 at 8:06 PM Amit Langote  wrote:
>
>
> Indeed.
>
> This boils down to the difference in the cast expression chosen to
> convert the source value to int in the two cases.
>
> The case where the source value has no quotes, the chosen cast
> expression is a FuncExpr for function numeric_int4(), which has no way
> to suppress errors.  When the source value has quotes, the cast
> expression is a CoerceViaIO expression, which can suppress the error.
> The default behavior is to suppress the error and return NULL, so the
> correct behavior is when the source value has quotes.
>
> I think we'll need either:
>
> * fix the code in 0001 to avoid getting numeric_int4() in this case,
> and generally cast functions that don't have soft-error handling
> support, in favor of using IO coercion.
> * fix FuncExpr (like CoerceViaIO) to respect SQL/JSON's request to
> suppress errors and fix downstream functions like numeric_int4() to
> comply by handling errors softly.
>
> I'm inclined to go with the 1st option as we already have the
> infrastructure in place -- input functions can all handle errors
> softly.

not sure this is the right way.
but attached patches solved this problem.

Also, can you share the previous memory-hogging bug issue
when you are free, I want to know which part I am missing.


v42-0001-minor-refactor-SQL-JSON-query-functions-based.no-cfbot
Description: Binary data


v42-0002-minor-refactor-JOSN_TABLE-functions-based-on-.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-03-07 Thread Amit Langote
On Thu, Mar 7, 2024 at 8:13 PM Tomas Vondra
 wrote:
> On 3/7/24 06:18, Himanshu Upadhyaya wrote:

Thanks Himanshu for the testing.

> > On Wed, Mar 6, 2024 at 9:04 PM Tomas Vondra 
> > wrote:
> >>
> >> I'm pretty sure this is the correct & expected behavior. The second
> >> query treats the value as string (because that's what should happen for
> >> values in double quotes).
> >>
> >>  ok, Then why does the below query provide the correct conversion, even if
> > we enclose that in double quotes?
> > ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
> >  "id" : "1234567890",
> >  "FULL_NAME" : "JOHN DOE"}',
> > '$'
> > COLUMNS(
> >  name varchar(20) PATH 'lax $.FULL_NAME',
> >  id int PATH 'lax $.id'
> >   )
> >)
> > ;
> >name   | id
> > --+
> >  JOHN DOE | 1234567890
> > (1 row)
> >
> > and for bigger input(string) it will leave as empty as below.
> > ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
> >  "id" : "12345678901",
> >  "FULL_NAME" : "JOHN DOE"}',
> > '$'
> > COLUMNS(
> >  name varchar(20) PATH 'lax $.FULL_NAME',
> >  id int PATH 'lax $.id'
> >   )
> >)
> > ;
> >name   | id
> > --+
> >  JOHN DOE |
> > (1 row)
> >
> > seems it is not something to do with data enclosed in double quotes but
> > somehow related with internal casting it to integer and I think in case of
> > bigger input it is not able to cast it to integer(as defined under COLUMNS
> > as id int PATH 'lax $.id')
> >
> > ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
> >  "id" : "12345678901",
> >  "FULL_NAME" : "JOHN DOE"}',
> > '$'
> > COLUMNS(
> >  name varchar(20) PATH 'lax $.FULL_NAME',
> >  id int PATH 'lax $.id'
> >   )
> >)
> > ;
> >name   | id
> > --+
> >  JOHN DOE |
> > (1 row)
> > )
> >
> > if it is not able to represent it to integer because of bigger input, it
> > should error out with a similar error message instead of leaving it empty.
> >
> > Thoughts?
> >
>
> Ah, I see! Yes, that's a bit weird. Put slightly differently:
>
> test=# SELECT * FROM JSON_TABLE(jsonb '{"id" : "20"}',
> '$' COLUMNS(id int PATH '$.id'));
>  id
> 
>  20
> (1 row)
>
> Time: 0.248 ms
> test=# SELECT * FROM JSON_TABLE(jsonb '{"id" : "30"}',
> '$' COLUMNS(id int PATH '$.id'));
>  id
> 
>
> (1 row)
>
> Clearly, when converting the string literal into int value, there's some
> sort of error handling that realizes 3B overflows, and returns NULL
> instead. I'm not sure if this is intentional.

Indeed.

This boils down to the difference in the cast expression chosen to
convert the source value to int in the two cases.

The case where the source value has no quotes, the chosen cast
expression is a FuncExpr for function numeric_int4(), which has no way
to suppress errors.  When the source value has quotes, the cast
expression is a CoerceViaIO expression, which can suppress the error.
The default behavior is to suppress the error and return NULL, so the
correct behavior is when the source value has quotes.

I think we'll need either:

* fix the code in 0001 to avoid getting numeric_int4() in this case,
and generally cast functions that don't have soft-error handling
support, in favor of using IO coercion.
* fix FuncExpr (like CoerceViaIO) to respect SQL/JSON's request to
suppress errors and fix downstream functions like numeric_int4() to
comply by handling errors softly.

I'm inclined to go with the 1st option as we already have the
infrastructure in place -- input functions can all handle errors
softly.

For the latter, it uses numeric_int4() which doesn't support
soft-error handling, so throws the error.  With quotes, the


--
Thanks, Amit Langote

--
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-07 Thread Tomas Vondra



On 3/7/24 06:18, Himanshu Upadhyaya wrote:
> On Wed, Mar 6, 2024 at 9:04 PM Tomas Vondra 
> wrote:
> 
>>
>>
>> I'm pretty sure this is the correct & expected behavior. The second
>> query treats the value as string (because that's what should happen for
>> values in double quotes).
>>
>>  ok, Then why does the below query provide the correct conversion, even if
> we enclose that in double quotes?
> ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : "1234567890",
>  "FULL_NAME" : "JOHN DOE"}',
> '$'
> COLUMNS(
>  name varchar(20) PATH 'lax $.FULL_NAME',
>  id int PATH 'lax $.id'
>   )
>)
> ;
>name   | id
> --+
>  JOHN DOE | 1234567890
> (1 row)
> 
> and for bigger input(string) it will leave as empty as below.
> ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : "12345678901",
>  "FULL_NAME" : "JOHN DOE"}',
> '$'
> COLUMNS(
>  name varchar(20) PATH 'lax $.FULL_NAME',
>  id int PATH 'lax $.id'
>   )
>)
> ;
>name   | id
> --+
>  JOHN DOE |
> (1 row)
> 
> seems it is not something to do with data enclosed in double quotes but
> somehow related with internal casting it to integer and I think in case of
> bigger input it is not able to cast it to integer(as defined under COLUMNS
> as id int PATH 'lax $.id')
> 
> ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : "12345678901",
>  "FULL_NAME" : "JOHN DOE"}',
> '$'
> COLUMNS(
>  name varchar(20) PATH 'lax $.FULL_NAME',
>  id int PATH 'lax $.id'
>   )
>)
> ;
>name   | id
> --+
>  JOHN DOE |
> (1 row)
> )
> 
> if it is not able to represent it to integer because of bigger input, it
> should error out with a similar error message instead of leaving it empty.
> 
> Thoughts?
> 

Ah, I see! Yes, that's a bit weird. Put slightly differently:

test=# SELECT * FROM JSON_TABLE(jsonb '{"id" : "20"}',
'$' COLUMNS(id int PATH '$.id'));
 id

 20
(1 row)

Time: 0.248 ms
test=# SELECT * FROM JSON_TABLE(jsonb '{"id" : "30"}',
'$' COLUMNS(id int PATH '$.id'));
 id


(1 row)

Clearly, when converting the string literal into int value, there's some
sort of error handling that realizes 3B overflows, and returns NULL
instead. I'm not sure if this is intentional.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: remaining sql/json patches

2024-03-06 Thread Himanshu Upadhyaya
On Wed, Mar 6, 2024 at 9:04 PM Tomas Vondra 
wrote:

>
>
> I'm pretty sure this is the correct & expected behavior. The second
> query treats the value as string (because that's what should happen for
> values in double quotes).
>
>  ok, Then why does the below query provide the correct conversion, even if
we enclose that in double quotes?
‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
 "id" : "1234567890",
 "FULL_NAME" : "JOHN DOE"}',
'$'
COLUMNS(
 name varchar(20) PATH 'lax $.FULL_NAME',
 id int PATH 'lax $.id'
  )
   )
;
   name   | id
--+
 JOHN DOE | 1234567890
(1 row)

and for bigger input(string) it will leave as empty as below.
‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
 "id" : "12345678901",
 "FULL_NAME" : "JOHN DOE"}',
'$'
COLUMNS(
 name varchar(20) PATH 'lax $.FULL_NAME',
 id int PATH 'lax $.id'
  )
   )
;
   name   | id
--+
 JOHN DOE |
(1 row)

seems it is not something to do with data enclosed in double quotes but
somehow related with internal casting it to integer and I think in case of
bigger input it is not able to cast it to integer(as defined under COLUMNS
as id int PATH 'lax $.id')

‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
 "id" : "12345678901",
 "FULL_NAME" : "JOHN DOE"}',
'$'
COLUMNS(
 name varchar(20) PATH 'lax $.FULL_NAME',
 id int PATH 'lax $.id'
  )
   )
;
   name   | id
--+
 JOHN DOE |
(1 row)
)

if it is not able to represent it to integer because of bigger input, it
should error out with a similar error message instead of leaving it empty.

Thoughts?

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-06 Thread jian he
two cosmetic minor issues.

+/*
+ * JsonCoercion
+ * Information about coercing a SQL/JSON value to the specified
+ * type at runtime
+ *
+ * A node of this type is created if the parser cannot find a cast expression
+ * using coerce_type() or OMIT QUOTES is specified for JSON_QUERY.  If the
+ * latter, 'expr' may contain the cast expression; if not, the quote-stripped
+ * scalar string will be coerced by calling the target type's input function.
+ * See ExecEvalJsonCoercion.
+ */
+typedef struct JsonCoercion
+{
+ NodeTag type;
+
+ Oid targettype;
+ int32 targettypmod;
+ bool omit_quotes; /* OMIT QUOTES specified for JSON_QUERY? */
+ Node   *cast_expr; /* coercion cast expression or NULL */
+ Oid collation;
+} JsonCoercion;

comment:  'expr' may contain the cast expression;
here "exr" should be "cast_expr"?
"a SQL/JSON" should be " an SQL/JSON"?




Re: remaining sql/json patches

2024-03-06 Thread jian he
On Tue, Mar 5, 2024 at 12:38 PM Andy Fan  wrote:
>
>
> In the commit message of 0001, we have:
>
> """
> Both JSON_VALUE() and JSON_QUERY() functions have options for
> handling EMPTY and ERROR conditions, which can be used to specify
> the behavior when no values are matched and when an error occurs
> during evaluation, respectively.
>
> All of these functions only operate on jsonb values. The workaround
> for now is to cast the argument to jsonb.
> """
>
> which is not clear for me why we introduce JSON_VALUE() function, is it
> for handling EMPTY or ERROR conditions? I think the existing cast
> workaround have a similar capacity?
>

I guess because it's in the standard.
but I don't see individual sql standard Identifier, JSON_VALUE in
sql_features.txt
I do see JSON_QUERY.
mysql also have JSON_VALUE, [1]

EMPTY, ERROR: there is a standard Identifier: T825: SQL/JSON: ON EMPTY
and ON ERROR clauses

[1] 
https://dev.mysql.com/doc/refman/8.0/en/json-search-functions.html#function_json-value




Re: remaining sql/json patches

2024-03-06 Thread jian he
On Wed, Mar 6, 2024 at 9:22 PM jian he  wrote:
>
> Another case, I did test yet: more keys in a single json, but the
> value is small.

Another case attached. see the attached SQL file's comments.
a single simple jsonb, with 33 keys, each key's value with fixed length: 256.
total table size: SELECT pg_size_pretty(pg_table_size('json33keys'));  --5369 MB
number of rows: 61.

using the previously mentioned method: pg_log_backend_memory_contexts.
all these tests under:
-Dcassert=true \
-Db_coverage=true \
-Dbuildtype=debug \

I hope someone will tell me if the test method is correct or not.


jsonb_33keys.sql
Description: application/sql


Re: remaining sql/json patches

2024-03-06 Thread Tomas Vondra



On 3/6/24 12:58, Himanshu Upadhyaya wrote:
> On Tue, Mar 5, 2024 at 6:52 AM Amit Langote  wrote:
> 
> Hi,
> 
> I am doing some random testing with the latest patch and found one scenario
> that I wanted to share.
> consider a below case.
> 
> ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : 12345678901,
>  "FULL_NAME" : "JOHN DOE"}',
> '$'
> COLUMNS(
>  name varchar(20) PATH 'lax $.FULL_NAME',
>  id int PATH 'lax $.id'
>   )
>)
> ;
> ERROR:  22003: integer out of range
> LOCATION:  numeric_int4_opt_error, numeric.c:4385
> ‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
>  "id" : "12345678901",
>  "FULL_NAME" : "JOHN DOE"}',
> '$'
> COLUMNS(
>  name varchar(20) PATH 'lax $.FULL_NAME',
>  id int PATH 'lax $.id'
>   )
>)
> ;
>name   | id
> --+
>  JOHN DOE |
> (1 row)
> 
> The first query throws an error that the integer is "out of range" and is
> quite expected but in the second case(when the value is enclosed with ") it
> is able to process the JSON object but does not return any relevant
> error(in fact processes the JSON but returns it with empty data for "id"
> field). I think second query should fail with a similar error.
> 

I'm pretty sure this is the correct & expected behavior. The second
query treats the value as string (because that's what should happen for
values in double quotes).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: remaining sql/json patches

2024-03-06 Thread jian he
On Wed, Mar 6, 2024 at 12:07 PM Amit Langote  wrote:
>
> Hi Tomas,
>
> On Wed, Mar 6, 2024 at 6:30 AM Tomas Vondra
>  wrote:
> >
> > Hi,
> >
> > I know very little about sql/json and all the json internals, but I
> > decided to do some black box testing. I built a large JSONB table
> > (single column, ~7GB of data after loading). And then I did a query
> > transforming the data into tabular form using JSON_TABLE.
> >
> > The JSON_TABLE query looks like this:
> >
> > SELECT jt.* FROM
> >   title_jsonb t,
> >   json_table(t.info, '$'
> > COLUMNS (
> >   "id" text path '$."id"',
> >   "type" text path '$."type"',
> >   "title" text path '$."title"',
> >   "original_title" text path '$."original_title"',
> >   "is_adult" text path '$."is_adult"',
> >   "start_year" text path '$."start_year"',
> >   "end_year" text path '$."end_year"',
> >   "minutes" text path '$."minutes"',
> >   "genres" text path '$."genres"',
> >   "aliases" text path '$."aliases"',
> >   "directors" text path '$."directors"',
> >   "writers" text path '$."writers"',
> >   "ratings" text path '$."ratings"',
> >   NESTED PATH '$."aliases"[*]'
> > COLUMNS (
> >   "alias_title" text path '$."title"',
> >   "alias_region" text path '$."region"'
> > ),
> >   NESTED PATH '$."directors"[*]'
> > COLUMNS (
> >   "director_name" text path '$."name"',
> >   "director_birth_year" text path '$."birth_year"',
> >   "director_death_year" text path '$."death_year"'
> > ),
> >   NESTED PATH '$."writers"[*]'
> > COLUMNS (
> >   "writer_name" text path '$."name"',
> >   "writer_birth_year" text path '$."birth_year"',
> >   "writer_death_year" text path '$."death_year"'
> > ),
> >   NESTED PATH '$."ratings"[*]'
> > COLUMNS (
> >   "rating_average" text path '$."average"',
> >   "rating_votes" text path '$."votes"'
> > )
> > )
> >   ) as jt;
> >
> > again, not particularly complex. But if I run this, it consumes multiple
> > gigabytes of memory, before it gets killed by OOM killer. This happens
> > even when ran using
> >
> >   COPY (...) TO '/dev/null'
> >
> > so there's nothing sent to the client. I did catch memory context info,
> > where it looks like this (complete stats attached):
> >
> > --
> > TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
> >   84640 used
> >   ...
> >   TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
> > PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
> >   ExecutorState: 2541764672 total in 314 blocks; 6528176 free
> >  (1208 chunks); 2535236496 used
> > printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
> > ...
> > ...
> > Grand total: 2544132336 bytes in 528 blocks; 7484504 free
> >  (1340 chunks); 2536647832 used
> > --
> >
> > I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
> > some memory management issue? My guess is we're not releasing memory
> > allocated while parsing the JSON or building JSON output.
> >
> > I'm not attaching the data, but I can provide that if needed - it's
> > about 600MB compressed. The structure is not particularly complex, it's
> > movie info from [1] combined into a JSON document (one per movie).
>
> Thanks for the report.
>
> Yeah, I'd like to see the data to try to drill down into what's piling
> up in ExecutorState.  I want to be sure of if the 1st, query functions
> patch, is not implicated in this, because I'd like to get that one out
> of the way sooner than later.
>

I did some tests. it generally looks like:

create or replace function random_text() returns text
as $$select string_agg(md5(random()::text),'') from
generate_Series(1,8) s $$ LANGUAGE SQL;
DROP TABLE if exists s;
create table s(a jsonb);
INSERT INTO s SELECT (
'{"id": "' || random_text() || '",'
'"type": "' || random_text() || '",'
'"title": "' || random_text() || '",'
'"original_title": "' || random_text() || '",'
'"is_adult": "' || random_text() || '",'
'"start_year": "' || random_text() || '",'
'"end_year": "' || random_text() || '",'
'"minutes": "' || random_text() || '",'
'"genres": "' || random_text() || '",'
'"aliases": "' || random_text() || '",'
'"genres": "' || random_text() || '",'
'"directors": "' || random_text() || '",'
'"writers": "' || random_text() || '",'
'"ratings": "' || random_text() || '",'
'"director_name": "' || random_text() || '",'
'"alias_title": "' || random_text() || '",'
'"alias_region": "' || random_text() || '",'
'"director_birth_year": "' || random_text() || '",'
'"director_death_year": "' || random_text() || '",'
'"rating_average": "' || random_text() || '",'
'"rating_votes": "' || random_text() || '"'
||'}' )::jsonb
FROM generate_series(1, 1e6);
SELECT pg_size_pretty(pg_table_size('s')); -- 5975 MB

It's less 

Re: remaining sql/json patches

2024-03-06 Thread Himanshu Upadhyaya
On Tue, Mar 5, 2024 at 6:52 AM Amit Langote  wrote:

Hi,

I am doing some random testing with the latest patch and found one scenario
that I wanted to share.
consider a below case.

‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
 "id" : 12345678901,
 "FULL_NAME" : "JOHN DOE"}',
'$'
COLUMNS(
 name varchar(20) PATH 'lax $.FULL_NAME',
 id int PATH 'lax $.id'
  )
   )
;
ERROR:  22003: integer out of range
LOCATION:  numeric_int4_opt_error, numeric.c:4385
‘postgres[102531]=#’SELECT * FROM JSON_TABLE(jsonb '{
 "id" : "12345678901",
 "FULL_NAME" : "JOHN DOE"}',
'$'
COLUMNS(
 name varchar(20) PATH 'lax $.FULL_NAME',
 id int PATH 'lax $.id'
  )
   )
;
   name   | id
--+
 JOHN DOE |
(1 row)

The first query throws an error that the integer is "out of range" and is
quite expected but in the second case(when the value is enclosed with ") it
is able to process the JSON object but does not return any relevant
error(in fact processes the JSON but returns it with empty data for "id"
field). I think second query should fail with a similar error.


-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com


Re: remaining sql/json patches

2024-03-05 Thread Amit Langote
Hi Tomas,

On Wed, Mar 6, 2024 at 6:30 AM Tomas Vondra
 wrote:
>
> Hi,
>
> I know very little about sql/json and all the json internals, but I
> decided to do some black box testing. I built a large JSONB table
> (single column, ~7GB of data after loading). And then I did a query
> transforming the data into tabular form using JSON_TABLE.
>
> The JSON_TABLE query looks like this:
>
> SELECT jt.* FROM
>   title_jsonb t,
>   json_table(t.info, '$'
> COLUMNS (
>   "id" text path '$."id"',
>   "type" text path '$."type"',
>   "title" text path '$."title"',
>   "original_title" text path '$."original_title"',
>   "is_adult" text path '$."is_adult"',
>   "start_year" text path '$."start_year"',
>   "end_year" text path '$."end_year"',
>   "minutes" text path '$."minutes"',
>   "genres" text path '$."genres"',
>   "aliases" text path '$."aliases"',
>   "directors" text path '$."directors"',
>   "writers" text path '$."writers"',
>   "ratings" text path '$."ratings"',
>   NESTED PATH '$."aliases"[*]'
> COLUMNS (
>   "alias_title" text path '$."title"',
>   "alias_region" text path '$."region"'
> ),
>   NESTED PATH '$."directors"[*]'
> COLUMNS (
>   "director_name" text path '$."name"',
>   "director_birth_year" text path '$."birth_year"',
>   "director_death_year" text path '$."death_year"'
> ),
>   NESTED PATH '$."writers"[*]'
> COLUMNS (
>   "writer_name" text path '$."name"',
>   "writer_birth_year" text path '$."birth_year"',
>   "writer_death_year" text path '$."death_year"'
> ),
>   NESTED PATH '$."ratings"[*]'
> COLUMNS (
>   "rating_average" text path '$."average"',
>   "rating_votes" text path '$."votes"'
> )
> )
>   ) as jt;
>
> again, not particularly complex. But if I run this, it consumes multiple
> gigabytes of memory, before it gets killed by OOM killer. This happens
> even when ran using
>
>   COPY (...) TO '/dev/null'
>
> so there's nothing sent to the client. I did catch memory context info,
> where it looks like this (complete stats attached):
>
> --
> TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
>   84640 used
>   ...
>   TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
> PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
>   ExecutorState: 2541764672 total in 314 blocks; 6528176 free
>  (1208 chunks); 2535236496 used
> printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
> ...
> ...
> Grand total: 2544132336 bytes in 528 blocks; 7484504 free
>  (1340 chunks); 2536647832 used
> --
>
> I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
> some memory management issue? My guess is we're not releasing memory
> allocated while parsing the JSON or building JSON output.
>
> I'm not attaching the data, but I can provide that if needed - it's
> about 600MB compressed. The structure is not particularly complex, it's
> movie info from [1] combined into a JSON document (one per movie).

Thanks for the report.

Yeah, I'd like to see the data to try to drill down into what's piling
up in ExecutorState.  I want to be sure of if the 1st, query functions
patch, is not implicated in this, because I'd like to get that one out
of the way sooner than later.

-- 
Thanks, Amit Langote




Re: remaining sql/json patches

2024-03-05 Thread Tomas Vondra
Hi,

I know very little about sql/json and all the json internals, but I
decided to do some black box testing. I built a large JSONB table
(single column, ~7GB of data after loading). And then I did a query
transforming the data into tabular form using JSON_TABLE.

The JSON_TABLE query looks like this:

SELECT jt.* FROM
  title_jsonb t,
  json_table(t.info, '$'
COLUMNS (
  "id" text path '$."id"',
  "type" text path '$."type"',
  "title" text path '$."title"',
  "original_title" text path '$."original_title"',
  "is_adult" text path '$."is_adult"',
  "start_year" text path '$."start_year"',
  "end_year" text path '$."end_year"',
  "minutes" text path '$."minutes"',
  "genres" text path '$."genres"',
  "aliases" text path '$."aliases"',
  "directors" text path '$."directors"',
  "writers" text path '$."writers"',
  "ratings" text path '$."ratings"',
  NESTED PATH '$."aliases"[*]'
COLUMNS (
  "alias_title" text path '$."title"',
  "alias_region" text path '$."region"'
),
  NESTED PATH '$."directors"[*]'
COLUMNS (
  "director_name" text path '$."name"',
  "director_birth_year" text path '$."birth_year"',
  "director_death_year" text path '$."death_year"'
),
  NESTED PATH '$."writers"[*]'
COLUMNS (
  "writer_name" text path '$."name"',
  "writer_birth_year" text path '$."birth_year"',
  "writer_death_year" text path '$."death_year"'
),
  NESTED PATH '$."ratings"[*]'
COLUMNS (
  "rating_average" text path '$."average"',
  "rating_votes" text path '$."votes"'
)
)
  ) as jt;

again, not particularly complex. But if I run this, it consumes multiple
gigabytes of memory, before it gets killed by OOM killer. This happens
even when ran using

  COPY (...) TO '/dev/null'

so there's nothing sent to the client. I did catch memory context info,
where it looks like this (complete stats attached):

--
TopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks);
  84640 used
  ...
  TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); ...
PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); ...
  ExecutorState: 2541764672 total in 314 blocks; 6528176 free
 (1208 chunks); 2535236496 used
printtup: 8192 total in 1 blocks; 7952 free (0 chunks); ...
...
...
Grand total: 2544132336 bytes in 528 blocks; 7484504 free
 (1340 chunks); 2536647832 used
--

I'd say 2.5GB in ExecutorState seems a bit excessive ... Seems there's
some memory management issue? My guess is we're not releasing memory
allocated while parsing the JSON or building JSON output.


I'm not attaching the data, but I can provide that if needed - it's
about 600MB compressed. The structure is not particularly complex, it's
movie info from [1] combined into a JSON document (one per movie).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL CompanyTopMemoryContext: 97696 total in 5 blocks; 13056 free (11 chunks); 84640 used
  Type information cache: 24384 total in 2 blocks; 2640 free (0 chunks); 21744 
used
  TableSpace cache: 8192 total in 1 blocks; 2112 free (0 chunks); 6080 used
  TopTransactionContext: 8192 total in 1 blocks; 7760 free (2 chunks); 432 used
  RowDescriptionContext: 8192 total in 1 blocks; 6912 free (0 chunks); 1280 used
  MessageContext: 524288 total in 7 blocks; 240736 free (4 chunks); 283552 used
  search_path processing cache: 8192 total in 1 blocks; 5616 free (8 chunks); 
2576 used
  Operator class cache: 8192 total in 1 blocks; 576 free (0 chunks); 7616 used
  PgStat Shared Ref Hash: 7232 total in 2 blocks; 704 free (0 chunks); 6528 used
  PgStat Shared Ref: 4096 total in 3 blocks; 496 free (2 chunks); 3600 used
  PgStat Pending: 16384 total in 5 blocks; 6464 free (9 chunks); 9920 used
  smgr relation table: 32768 total in 3 blocks; 16848 free (8 chunks); 15920 
used
  TransactionAbortContext: 32768 total in 1 blocks; 32528 free (0 chunks); 240 
used
  Portal hash: 8192 total in 1 blocks; 576 free (0 chunks); 7616 used
  TopPortalContext: 8192 total in 1 blocks; 7680 free (0 chunks); 512 used
PortalContext: 1024 total in 1 blocks; 560 free (0 chunks); 464 used: 

  ExecutorState: 2541764672 total in 314 blocks; 6528176 free (1208 
chunks); 2535236496 used
printtup: 8192 total in 1 blocks; 7952 free (0 chunks); 240 used
TableFunc per value context: 8192 total in 1 blocks; 6672 free (0 
chunks); 1520 used
  JsonTableExecContext: 8192 total in 1 blocks; 6864 free (0 chunks); 
1328 used
JsonTableExecContext: 8192 total in 1 blocks; 7952 free (0 chunks); 
240 used
JsonTableExecContext: 8192 total in 1 blocks; 7952 free (0 chunks); 
240 used
JsonTableExecContext: 8192 total in 1 blocks; 7952 free (0 chunks); 
240 used

Re: remaining sql/json patches

2024-03-04 Thread jian he
On Tue, Mar 5, 2024 at 9:22 AM Amit Langote  wrote:
>
> Thanks for the heads up.  Attaching rebased patches.
>

Walking through the v41-0001-Add-SQL-JSON-query-functions.patch documentation.
I found some minor cosmetic issues.

+   
+select json_query(jsonb '{"a": "[1, 2]"}', 'lax $.a'
RETURNING int[] OMIT QUOTES);
+
+   
this example is not so good, it returns NULL, makes it harder to
render the result.

+   context_item (the document); seen
+for more details on what
+   path_expression can contain.
"seen" should be "see"?

+   
+This function must return a JSON string, so if the path expression
+returns multiple SQL/JSON items, you must wrap the result using the
+WITH WRAPPER clause.  If the wrapper is
"must" may be not correct?
since we have a RETURNING clause.
"generally" may be more accurate, I think.
maybe we can rephrase the sentence:
+This function generally return a JSON string, so if the path expression
+yield multiple SQL/JSON items, you must wrap the result using the
+WITH WRAPPER clause

+is spcified, the returned value will be of type text.
+If no RETURNING is spcified, the returned value will
two typos, and should be "specified".

+Note that if the path_expression
+is strict and ON ERROR behavior
+is ON ERROR, an error is generated if it yields no
+items.

may be the following:
+Note that if the path_expression
+is strict and ON ERROR behavior
+is ERROR, an error is generated if it yields no
+items.

most of the place, you use
 path_expression
but there are two place you use:
path_expression
I guess that's ok, but the appearance is different.
  more prominent. Anyway, it is a minor issue.

+json_query.  Note that scalar strings returned
+by json_value always have their quotes removed,
+equivalent to what one would get with OMIT QUOTES
+when using json_query.

I think we can simplify it like the following:

+json_query.  Note that scalar strings returned
+by json_value always have their quotes removed,
+equivalent to OMIT QUOTES
+when using json_query.




Re: remaining sql/json patches

2024-03-04 Thread Andy Fan


Hi,

> On Tue, Mar 5, 2024 at 12:03 AM Alvaro Herrera  
> wrote:
>> On 2024-Mar-04, Erik Rijkers wrote:
>>
>> > In my hands (applying with patch), the patches, esp. 0001, do not apply.
>> > But I see the cfbot builds without problem so maybe just ignore these 
>> > FAILED
>> > lines.  Better get them merged - so I can test there...
>>
>> It's because of dbbca2cf299b.  It should apply cleanly if you do "git
>> checkout dbbca2cf299b^" first ...  That commit is so recent that
>> evidently the cfbot hasn't had a chance to try this patch again since it
>> went in, which is why it's still green.
>
> Thanks for the heads up.  Attaching rebased patches.

In the commit message of 0001, we have:

"""
Both JSON_VALUE() and JSON_QUERY() functions have options for
handling EMPTY and ERROR conditions, which can be used to specify
the behavior when no values are matched and when an error occurs
during evaluation, respectively.

All of these functions only operate on jsonb values. The workaround
for now is to cast the argument to jsonb.
"""

which is not clear for me why we introduce JSON_VALUE() function, is it
for handling EMPTY or ERROR conditions? I think the existing cast
workaround have a similar capacity?

Then I think if it is introduced as a performance improvement like [1],
then the test at [1] might be interesting. If this is the case, the
method in [1] can avoid the user to modify these queries for the using
the new function. 

[1] https://www.postgresql.org/message-id/8734t6c5rh@163.com

-- 
Best Regards
Andy Fan





Re: remaining sql/json patches

2024-03-04 Thread Alvaro Herrera
On 2024-Mar-04, Erik Rijkers wrote:

> In my hands (applying with patch), the patches, esp. 0001, do not apply.
> But I see the cfbot builds without problem so maybe just ignore these FAILED
> lines.  Better get them merged - so I can test there...

It's because of dbbca2cf299b.  It should apply cleanly if you do "git
checkout dbbca2cf299b^" first ...  That commit is so recent that
evidently the cfbot hasn't had a chance to try this patch again since it
went in, which is why it's still green.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"If you have nothing to say, maybe you need just the right tool to help you
not say it."   (New York Times, about Microsoft PowerPoint)




Re: remaining sql/json patches

2024-03-04 Thread Erik Rijkers

Op 3/4/24 om 10:40 schreef Amit Langote:

Hi Jian,

Thanks for the reviews and sorry for the late reply. Replying to all
emails in one.


> [v40-0001-Add-SQL-JSON-query-functions.patch]
> [v40-0002-Show-function-name-in-TableFuncScan.patch]
> [v40-0003-JSON_TABLE.patch]

In my hands (applying with patch), the patches, esp. 0001, do not apply. 
 But I see the cfbot builds without problem so maybe just ignore these 
FAILED lines.  Better get them merged - so I can test there...


Erik


checking file doc/src/sgml/func.sgml
checking file src/backend/catalog/sql_features.txt
checking file src/backend/executor/execExpr.c
Hunk #1 succeeded at 48 with fuzz 2 (offset -1 lines).
Hunk #2 succeeded at 88 (offset -1 lines).
Hunk #3 succeeded at 2419 (offset -1 lines).
Hunk #4 succeeded at 4195 (offset -1 lines).
checking file src/backend/executor/execExprInterp.c
Hunk #1 succeeded at 72 (offset -1 lines).
Hunk #2 succeeded at 180 (offset -1 lines).
Hunk #3 succeeded at 485 (offset -1 lines).
Hunk #4 succeeded at 1560 (offset -1 lines).
Hunk #5 succeeded at 4242 (offset -1 lines).
checking file src/backend/jit/llvm/llvmjit_expr.c
checking file src/backend/jit/llvm/llvmjit_types.c
checking file src/backend/nodes/makefuncs.c
Hunk #1 succeeded at 856 (offset -1 lines).
checking file src/backend/nodes/nodeFuncs.c
Hunk #1 succeeded at 233 (offset -1 lines).
Hunk #2 succeeded at 517 (offset -1 lines).
Hunk #3 succeeded at 1019 (offset -1 lines).
Hunk #4 succeeded at 1276 (offset -1 lines).
Hunk #5 succeeded at 1617 (offset -1 lines).
Hunk #6 succeeded at 2381 (offset -1 lines).
Hunk #7 succeeded at 3429 (offset -1 lines).
Hunk #8 succeeded at 4164 (offset -1 lines).
checking file src/backend/optimizer/path/costsize.c
Hunk #1 succeeded at 4878 (offset -1 lines).
checking file src/backend/optimizer/util/clauses.c
Hunk #1 succeeded at 50 (offset -3 lines).
Hunk #2 succeeded at 415 (offset -3 lines).
checking file src/backend/parser/gram.y
checking file src/backend/parser/parse_expr.c
checking file src/backend/parser/parse_target.c
Hunk #1 succeeded at 1988 (offset -1 lines).
checking file src/backend/utils/adt/formatting.c
Hunk #1 succeeded at 4465 (offset -1 lines).
checking file src/backend/utils/adt/jsonb.c
Hunk #1 succeeded at 2159 (offset -4 lines).
checking file src/backend/utils/adt/jsonfuncs.c
checking file src/backend/utils/adt/jsonpath.c
Hunk #1 FAILED at 68.
Hunk #2 succeeded at 1239 (offset -1 lines).
1 out of 2 hunks FAILED
checking file src/backend/utils/adt/jsonpath_exec.c
Hunk #1 succeeded at 229 (offset -5 lines).
Hunk #2 succeeded at 2866 (offset -5 lines).
Hunk #3 succeeded at 3751 (offset -5 lines).
checking file src/backend/utils/adt/ruleutils.c
Hunk #1 succeeded at 474 (offset -1 lines).
Hunk #2 succeeded at 518 (offset -1 lines).
Hunk #3 succeeded at 8303 (offset -1 lines).
Hunk #4 succeeded at 8475 (offset -1 lines).
Hunk #5 succeeded at 8591 (offset -1 lines).
Hunk #6 succeeded at 9808 (offset -1 lines).
Hunk #7 succeeded at 9858 (offset -1 lines).
Hunk #8 succeeded at 10039 (offset -1 lines).
Hunk #9 succeeded at 10909 (offset -1 lines).
checking file src/include/executor/execExpr.h
checking file src/include/nodes/execnodes.h
checking file src/include/nodes/makefuncs.h
checking file src/include/nodes/parsenodes.h
checking file src/include/nodes/primnodes.h
checking file src/include/parser/kwlist.h
checking file src/include/utils/formatting.h
checking file src/include/utils/jsonb.h
checking file src/include/utils/jsonfuncs.h
checking file src/include/utils/jsonpath.h
checking file src/interfaces/ecpg/preproc/ecpg.trailer
checking file src/test/regress/expected/sqljson_queryfuncs.out
checking file src/test/regress/parallel_schedule
checking file src/test/regress/sql/sqljson_queryfuncs.sql
checking file src/tools/pgindent/typedefs.list




Re: remaining sql/json patches

2024-02-05 Thread jian he
On Thu, Jan 25, 2024 at 10:39 PM jian he  wrote:
>
> On Thu, Jan 25, 2024 at 7:54 PM Amit Langote  wrote:
> >
> > >
> > > The problem with returning comp_domain_with_typmod from json_value()
> > > seems to be that it's using a text-to-record CoerceViaIO expression
> > > picked from JsonExpr.item_coercions, which behaves differently than
> > > the expression tree that the following uses:
> > >
> > > select ('abcd', 42)::comp_domain_with_typmod;
> > >row
> > > --
> > >  (abc,42)
> > > (1 row)
> >
> > Oh, it hadn't occurred to me to check what trying to coerce a "string"
> > containing the record literal would do:
> >
> > select '(''abcd'', 42)'::comp_domain_with_typmod;
> > ERROR:  value too long for type character(3)
> > LINE 1: select '(''abcd'', 42)'::comp_domain_with_typmod;
> >
> > which is the same thing as what the JSON_QUERY() and JSON_VALUE() are
> > running into.  So, it might be fair to think that the error is not a
> > limitation of the SQL/JSON patch but an underlying behavior that it
> > has to accept as is.
> >
>
> Hi, I reconciled with these cases.
> What bugs me now is the first query of the following 4 cases (for comparison).
> SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) omit quotes);
> SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) keep quotes);
> SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text omit quotes);
> SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text keep quotes);
>

based on v39.
in ExecEvalJsonCoercion
coercion->targettypmod related function calls:
json_populate_type calls populate_record_field, then populate_scalar,
later will eventually call InputFunctionCallSafe.

so I make the following change:
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -4533,7 +4533,7 @@ ExecEvalJsonCoercion(ExprState *state, ExprEvalStep *op,
 * deed ourselves by calling the input function, that is, after removing
 * the quotes.
 */
-   if (jb && JB_ROOT_IS_SCALAR(jb) && coercion->omit_quotes)
+   if ((jb && JB_ROOT_IS_SCALAR(jb) && coercion->omit_quotes) ||
coercion->targettypmod != -1)

now the following two return the same result:  `[1,`
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) omit quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) keep quotes);




Re: remaining sql/json patches

2024-02-05 Thread jian he
based on this query:
begin;
SET LOCAL TIME ZONE 10.5;
with cte(s) as (select jsonb '"2023-08-15 12:34:56 +05:30"')
select JSON_QUERY(s, '$.timestamp_tz()')::text,'+10.5'::text,
'timestamp_tz'::text from cte
union all
select JSON_QUERY(s, '$.time()')::text,'+10.5'::text, 'time'::text from cte
union all
select JSON_QUERY(s, '$.timestamp()')::text,'+10.5'::text,
'timestamp'::text from cte
union all
select JSON_QUERY(s, '$.date()')::text,'+10.5'::text, 'date'::text from cte
union all
select JSON_QUERY(s, '$.time_tz()')::text,'+10.5'::text,
'time_tz'::text from cte;

SET LOCAL TIME ZONE -8;
with cte(s) as (select jsonb '"2023-08-15 12:34:56 +05:30"')
select JSON_QUERY(s, '$.timestamp_tz()')::text,'+10.5'::text,
'timestamp_tz'::text from cte
union all
select JSON_QUERY(s, '$.time()')::text,'+10.5'::text, 'time'::text from cte
union all
select JSON_QUERY(s, '$.timestamp()')::text,'+10.5'::text,
'timestamp'::text from cte
union all
select JSON_QUERY(s, '$.date()')::text,'+10.5'::text, 'date'::text from cte
union all
select JSON_QUERY(s, '$.time_tz()')::text,'+10.5'::text,
'time_tz'::text from cte;
commit;

I made some changes on jspIsMutableWalker.
various new jsonpath methods added:
https://git.postgresql.org/cgit/postgresql.git/commit/?id=66ea94e8e606529bb334515f388c62314956739e
so we need to change jspIsMutableWalker accordingly.

based on v39.


v1-0001-handle-various-jsonpath-methods-in-jspI.v39_001_changes
Description: Binary data


Re: remaining sql/json patches

2024-01-31 Thread jian he
Hi.
minor issues.
I am wondering do we need add `pg_node_attr(query_jumble_ignore)`
to some of our created structs in src/include/nodes/parsenodes.h in
v39-0001-Add-SQL-JSON-query-functions.patch

diff --git a/src/backend/parser/parse_jsontable.c
b/src/backend/parser/parse_jsontable.c
new file mode 100644
index 00..25b8204dc6
--- /dev/null
+++ b/src/backend/parser/parse_jsontable.c
@@ -0,0 +1,718 @@
+/*-
+ *
+ * parse_jsontable.c
+ *  parsing of JSON_TABLE
+ *
+ * Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *  src/backend/parser/parse_jsontable.c
+ *
+ *-
+ */
2022 should change to 2024.




Re: remaining sql/json patches

2024-01-25 Thread jian he
On Thu, Jan 25, 2024 at 7:54 PM Amit Langote  wrote:
>
> >
> > The problem with returning comp_domain_with_typmod from json_value()
> > seems to be that it's using a text-to-record CoerceViaIO expression
> > picked from JsonExpr.item_coercions, which behaves differently than
> > the expression tree that the following uses:
> >
> > select ('abcd', 42)::comp_domain_with_typmod;
> >row
> > --
> >  (abc,42)
> > (1 row)
>
> Oh, it hadn't occurred to me to check what trying to coerce a "string"
> containing the record literal would do:
>
> select '(''abcd'', 42)'::comp_domain_with_typmod;
> ERROR:  value too long for type character(3)
> LINE 1: select '(''abcd'', 42)'::comp_domain_with_typmod;
>
> which is the same thing as what the JSON_QUERY() and JSON_VALUE() are
> running into.  So, it might be fair to think that the error is not a
> limitation of the SQL/JSON patch but an underlying behavior that it
> has to accept as is.
>

Hi, I reconciled with these cases.
What bugs me now is the first query of the following 4 cases (for comparison).
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) omit quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3) keep quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text omit quotes);
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING text keep quotes);

I did some minor refactoring on the function coerceJsonFuncExprOutput.
it will make the following queries return null instead of error. NULL
is the return of json_value.

SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int2);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int4);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING int8);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING bool);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING numeric);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING real);
SELECT JSON_QUERY(jsonb '"123"', '$' RETURNING float8);


v1-0001-minor-refactor-coerceJsonFuncExprOutput.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-01-25 Thread Amit Langote
On Thu, Jan 25, 2024 at 6:09 PM Amit Langote  wrote:
> On Wed, Jan 24, 2024 at 10:11 PM Amit Langote  wrote:
> > I still need to take a look at your other report regarding typmod but
> > I'm out of energy today.
>
> The attached updated patch should address one of the concerns --
> JSON_QUERY() should now work appropriately with RETURNING type with
> typmod whether or  OMIT QUOTES is specified.
>
> But I wasn't able to address the problems with RETURNING
> record_type_with_typmod, that is, the following example you shared
> upthread:
>
> create domain char3_domain_not_null as char(3) NOT NULL;
> create domain hello as text not null check (value = 'hello');
> create domain int42 as int check (value = 42);
> create type comp_domain_with_typmod AS (a char3_domain_not_null, b int42);
> select json_value(jsonb'{"rec": "(abcd,42)"}', '$.rec' returning
> comp_domain_with_typmod);
>  json_value
> 
>
> (1 row)
>
> select json_value(jsonb'{"rec": "(abcd,42)"}', '$.rec' returning
> comp_domain_with_typmod error on error);
> ERROR:  value too long for type character(3)
>
> select json_value(jsonb'{"rec": "abcd"}', '$.rec' returning
> char3_domain_not_null error on error);
>  json_value
> 
>  abc
> (1 row)
>
> The problem with returning comp_domain_with_typmod from json_value()
> seems to be that it's using a text-to-record CoerceViaIO expression
> picked from JsonExpr.item_coercions, which behaves differently than
> the expression tree that the following uses:
>
> select ('abcd', 42)::comp_domain_with_typmod;
>row
> --
>  (abc,42)
> (1 row)

Oh, it hadn't occurred to me to check what trying to coerce a "string"
containing the record literal would do:

select '(''abcd'', 42)'::comp_domain_with_typmod;
ERROR:  value too long for type character(3)
LINE 1: select '(''abcd'', 42)'::comp_domain_with_typmod;

which is the same thing as what the JSON_QUERY() and JSON_VALUE() are
running into.  So, it might be fair to think that the error is not a
limitation of the SQL/JSON patch but an underlying behavior that it
has to accept as is.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




Re: remaining sql/json patches

2024-01-25 Thread jian he
On 9.16.4. JSON_TABLE
`
name type FORMAT JSON [ENCODING UTF8] [ PATH json_path_specification ]
Inserts a composite SQL/JSON item into the output row
`
i am not sure "Inserts a composite SQL/JSON item into the output row"
I think it means, for any type's typecategory is TYPCATEGORY_STRING,
if FORMAT JSON is specified explicitly, the output value (text type)
will be legal
json type representation.

I also did a minor refactor on JSON_VALUE_OP, jsexpr->omit_quotes.
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 74bc6f49..56ab12ac 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -4289,7 +4289,7 @@ ExecInitJsonExpr(JsonExpr *jexpr, ExprState *state,
 	 * nodes.
 	 */
 	jsestate->escontext.type = T_ErrorSaveContext;
-	if (jexpr->result_coercion || jexpr->omit_quotes)
+	if (jexpr->result_coercion)
 	{
 		jsestate->jump_eval_result_coercion =
 			ExecInitJsonExprCoercion(state, jexpr->result_coercion,
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 31c0847e..9802b4ae 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -4363,6 +4363,7 @@ transformJsonFuncExpr(ParseState *pstate, JsonFuncExpr *func)
 			jsexpr->returning->format->format_type = JS_FORMAT_DEFAULT;
 			jsexpr->returning->format->encoding = JS_ENC_DEFAULT;
 
+			jsexpr->omit_quotes = true;
 			jsexpr->result_coercion = coerceJsonFuncExprOutput(pstate, jsexpr);
 
 			/*


v1-0001-refactor-sqljson_jsontable-related-tests.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-01-23 Thread jian he
On Mon, Jan 22, 2024 at 11:46 PM jian he  wrote:
>
> On Mon, Jan 22, 2024 at 10:28 PM Amit Langote  wrote:
> >
> > > based on v35.
> > > Now I only applied from 0001 to 0007.
> > > For {DEFAULT expression  ON EMPTY}  | {DEFAULT expression ON ERROR}
> > > restrict DEFAULT expression be either Const node or FuncExpr node.
> > > so these 3 SQL/JSON functions can be used in the btree expression index.
> >
> > I'm not really excited about adding these restrictions into the
> > transformJsonFuncExpr() path.  Index or any other code that wants to
> > put restrictions already have those in place, no need to add them
> > here.  Moreover, by adding these restrictions, we might end up
> > preventing users from doing useful things with this like specify
> > column references.  If there are semantic issues with allowing that,
> > we should discuss them.
> >
>
> after applying v36.
> The following index creation and query operation works. I am not 100%
> sure about these cases.
> just want confirmation, sorry for bothering you
>
> drop table t;
> create table t(a jsonb, b  int);
> insert into t select '{"hello":11}',1;
> insert into t select '{"hello":12}',2;
> CREATE INDEX t_idx2 ON t (JSON_query(a, '$.hello1' RETURNING int
> default b + random() on error));
> CREATE INDEX t_idx3 ON t (JSON_query(a, '$.hello1' RETURNING int
> default random()::int on error));
> SELECT JSON_query(a, '$.hello1'  RETURNING int default ret_setint() on
> error) from t;

I forgot to attach ret_setint defition.

create or replace function ret_setint() returns setof integer as
$$
begin
-- perform pg_sleep(0.1);
return query execute 'select 1 union all select 1';
end;
$$
language plpgsql IMMUTABLE;

-
In the function transformJsonExprCommon, we have
`JsonExpr   *jsexpr = makeNode(JsonExpr);`
then the following 2 assignments are not necessary.

/* Both set in the caller. */
jsexpr->result_coercion = NULL;
jsexpr->omit_quotes = false;

So I removed it.

JSON_VALUE OMIT QUOTES by default, so I set it accordingly.
I also changed coerceJsonFuncExprOutput accordingly


v1-0001-minor-refactor-transformJsonFuncExpr.based_on_v36
Description: Binary data


Re: remaining sql/json patches

2024-01-22 Thread Alvaro Herrera
On 2024-Jan-18, Alvaro Herrera wrote:

> > commands/explain.c (Hmm, I think this is a preexisting bug actually)
> > 
> > 3893  18 : case T_TableFuncScan:
> > 3894  18 : Assert(rte->rtekind == RTE_TABLEFUNC);
> > 3895  18 : if (rte->tablefunc)
> > 3896   0 : if (rte->tablefunc->functype == 
> > TFT_XMLTABLE)
> > 3897   0 : objectname = "xmltable";
> > 3898 : else/* Must be 
> > TFT_JSON_TABLE */
> > 3899   0 : objectname = "json_table";
> > 3900 : else
> > 3901  18 : objectname = NULL;
> > 3902  18 : objecttag = "Table Function Name";
> > 3903  18 : break;
> 
> Indeed 

I was completely wrong about this, and in order to gain coverage the
only thing we needed was to add an EXPLAIN that uses the JSON format.
I did that just now.  I think your addition here works just fine.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: remaining sql/json patches

2024-01-22 Thread jian he
On Mon, Jan 22, 2024 at 10:28 PM Amit Langote  wrote:
>
> > based on v35.
> > Now I only applied from 0001 to 0007.
> > For {DEFAULT expression  ON EMPTY}  | {DEFAULT expression ON ERROR}
> > restrict DEFAULT expression be either Const node or FuncExpr node.
> > so these 3 SQL/JSON functions can be used in the btree expression index.
>
> I'm not really excited about adding these restrictions into the
> transformJsonFuncExpr() path.  Index or any other code that wants to
> put restrictions already have those in place, no need to add them
> here.  Moreover, by adding these restrictions, we might end up
> preventing users from doing useful things with this like specify
> column references.  If there are semantic issues with allowing that,
> we should discuss them.
>

after applying v36.
The following index creation and query operation works. I am not 100%
sure about these cases.
just want confirmation, sorry for bothering you

drop table t;
create table t(a jsonb, b  int);
insert into t select '{"hello":11}',1;
insert into t select '{"hello":12}',2;
CREATE INDEX t_idx2 ON t (JSON_query(a, '$.hello1' RETURNING int
default b + random() on error));
CREATE INDEX t_idx3 ON t (JSON_query(a, '$.hello1' RETURNING int
default random()::int on error));
SELECT JSON_query(a, '$.hello1'  RETURNING int default ret_setint() on
error) from t;
SELECT JSON_query(a, '$.hello1'  RETURNING int default sum(b) over()
on error) from t;
SELECT JSON_query(a, '$.hello1'  RETURNING int default sum(b) on
error) from t group by a;

but the following cases will fail related to index and default expression.
create table zz(a int, b int);
CREATE INDEX zz_idx1 ON zz ( (b + random()::int));
create table (a int, b int default ret_setint());
create table (a int, b int default sum(b) over());




Re: remaining sql/json patches

2024-01-21 Thread John Naylor
On Mon, Nov 27, 2023 at 9:06 PM Alvaro Herrera  wrote:
> At this point one thing that IMO we cannot afford to do, is stop feature
> progress work on the name of parser speed.  I mean, parser speed is
> important, and we need to be mindful that what we add is reasonable.
> But at some point we'll probably have to fix that by parsing
> differently (a top-down parser, perhaps?  Split the parser in smaller
> pieces that each deal with subsets of the whole thing?)

I was reorganizing some old backups and rediscovered an experiment I
did four years ago when I had some extra time on my hands, to use a
lexer generator that emits a state machine driven by code, rather than
a table. It made parsing 12% faster on the above info-schema test, but
only (maybe) 3% on parsing pgbench-like queries. My quick hack ended
up a bit uglier and more verbose than Flex, but that could be
improved, and in fact small components could be shared across the
whole code base. I might work on it again; I might not.




Re: remaining sql/json patches

2024-01-21 Thread jian he
I found two main issues regarding cocece SQL/JSON function output to
other data types.
* returning typmod influence the returning result of JSON_VALUE | JSON_QUERY.
* JSON_VALUE | JSON_QUERY handles returning type domains allowing null
and not allowing null inconsistencies.

in ExecInitJsonExprCoercion, there is IsA(coercion,JsonCoercion) or
not difference.
for the returning of (JSON_VALUE | JSON_QUERY),
"coercion" is a JsonCoercion or not is set in coerceJsonFuncExprOutput.

this influence returning type with typmod is not -1.
if set "coercion" as JsonCoercion Node then it may call the
InputFunctionCallSafe to do the coercion.
If not, it may call ExecInitFunc related code which is wrapped in
ExecEvalCoerceViaIOSafe.

for example:
SELECT JSON_QUERY(jsonb '[1,2]', '$' RETURNING char(3));
will ExecInitFunc, will init function bpchar(character, integer,
boolean). it will set the third argument to true.
so it will initiate related instructions like: `select
bpchar('[,2]',7,true); ` which in the end will make the result be
`[,2`
However, InputFunctionCallSafe cannot handle that.
simple demo:
create table t(a char(3));
--fail.
INSERT INTO t values ('test');
--ok
select 'test'::char(3);

however current ExecEvalCoerceViaIOSafe cannot handle omit quotes.

even if I made the changes, still not bullet-proof.
for example:
create domain char3_domain_not_null as char(3) NOT NULL;
create domain hello as text NOT NULL check (value = 'hello');
create domain int42 as int check (value = 42);
CREATE TYPE comp_domain_with_typmod AS (a char3_domain_not_null, b int42);

SELECT JSON_VALUE(jsonb'{"rec": "(abcd,42)"}', '$.rec' returning
comp_domain_with_typmod);
will return NULL

however
SELECT JSON_VALUE(jsonb'{"rec": "abcd"}', '$.rec' returning
char3_domain_not_null);
will return `abc`.

I made the modification, you can see the difference.
attached is test_coerce.sql is the test file.
test_coerce_only_v35.out  is the test output of only applying v35 0001
to 0007 plus my previous changes[0].
test_coerce_v35_plus_change.out is  the test output of applying to v35
0001 to 0007 plus changes (attachment) and previous changes[0].

[0] 
https://www.postgresql.org/message-id/CACJufxHo1VVk_0th3AsFxqdMgjaUDz6s0F7%2Bj9rYA3d%3DURw97A%40mail.gmail.com


test_coerce.sql
Description: application/sql


test_coerce_only_v35.out
Description: Binary data


test_coerce_v35_plus_change.out
Description: Binary data


v1-0001-make-JSON_QUERY-JSON_VALUE-returning-type-with.no-cfbot
Description: Binary data


Re: remaining sql/json patches

2024-01-21 Thread Peter Smith
2024-01 Commitfest.

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there were CFbot test failures last time it was run [2]. Please have a
look and post an updated version if necessary.

==
[1] https://commitfest.postgresql.org/46/4377/
[2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4377

Kind Regards,
Peter Smith.




Re: remaining sql/json patches

2024-01-21 Thread jian he
based on v35.
Now I only applied from 0001 to 0007.
For {DEFAULT expression  ON EMPTY}  | {DEFAULT expression ON ERROR}
restrict DEFAULT expression be either Const node or FuncExpr node.
so these 3 SQL/JSON functions can be used in the btree expression index.

I made some big changes on the doc. (see attachment)
list (json_query, json_exists, json_value) as a new  may be
a good idea.

follow these two links, we can see the difference.
only apply v35, 0001 to 0007: https://v35-functions-json-html.vercel.app
apply v35, 0001 to 0007 plus my changes:
https://html-starter-seven-pied.vercel.app


minor issues:
+Note that if the path_expression
+is strict, an error is generated if it yields no
+items, provided the specified ON ERROR behavior is
+ERROR.

how about something like this:
+Note that if the path_expression
+is strict and ON ERROR
behavior is specified
+ERROR, an error is generated if it yields no
+items

+  
+   
+SQL/JSON path expression can currently only accept values of the
+jsonb type, so it might be necessary to cast the
+context_item argument of these functions to
+jsonb.
+   
+  
here should it be "SQL/JSON query functions"?


v35-0001-only-allow-Const-node-or-scalar-returning-fun.no-cfbot
Description: Binary data


v35-0001-refactor-json_value-json_query-js.sql_json_new_section
Description: Binary data


Re: remaining sql/json patches

2024-01-19 Thread jian he
play with domain types.
in ExecEvalJsonCoercion, seems func json_populate_type cannot cope
with domain type.

tests:
drop domain test;
create domain test as int[] check ( array_length(value,1) =2 and
(value[1] = 1 or value[2] = 2));
SELECT * from JSON_QUERY(jsonb'{"rec": "{1,2,3}"}', '$.rec' returning
test omit quotes);
SELECT * from JSON_QUERY(jsonb'{"rec": "{1,11}"}', '$.rec' returning
test keep quotes);
SELECT * from JSON_QUERY(jsonb'{"rec": "{2,11}"}', '$.rec' returning
test omit quotes error on error);
SELECT * from JSON_QUERY(jsonb'{"rec": "{2,2}"}', '$.rec' returning
test keep quotes error on error);

SELECT * from JSON_QUERY(jsonb'{"rec": [1,2,3]}', '$.rec' returning
test omit quotes );
SELECT * from JSON_QUERY(jsonb'{"rec": [1,2,3]}', '$.rec' returning
test omit quotes null on error);
SELECT * from JSON_QUERY(jsonb'{"rec": [1,2,3]}', '$.rec' returning
test null on error);
SELECT * from JSON_QUERY(jsonb'{"rec": [1,11]}', '$.rec' returning
test omit quotes);
SELECT * from JSON_QUERY(jsonb'{"rec": [2,2]}', '$.rec' returning test
omit quotes);

Many domain related tests seem not right.
like the following, i think it should just return null.
+SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING sqljsonb_int_not_null);
+ERROR:  domain sqljsonb_int_not_null does not allow null values

--another example
SELECT JSON_QUERY(jsonb '{"a": 1}', '$.b' RETURNING
sqljsonb_int_not_null null on error);

Maybe in node JsonCoercion, we don't need both via_io and
via_populate, but we can have one bool to indicate either call
InputFunctionCallSafe or json_populate_type in ExecEvalJsonCoercion.




Re: remaining sql/json patches

2024-01-18 Thread Amit Langote
On Fri, Jan 19, 2024 at 2:11 AM Alvaro Herrera  wrote:
> On 2024-Jan-18, Alvaro Herrera wrote:
>
> > commands/explain.c (Hmm, I think this is a preexisting bug actually)
> >
> > 3893  18 : case T_TableFuncScan:
> > 3894  18 : Assert(rte->rtekind == RTE_TABLEFUNC);
> > 3895  18 : if (rte->tablefunc)
> > 3896   0 : if (rte->tablefunc->functype == 
> > TFT_XMLTABLE)
> > 3897   0 : objectname = "xmltable";
> > 3898 : else/* Must be 
> > TFT_JSON_TABLE */
> > 3899   0 : objectname = "json_table";
> > 3900 : else
> > 3901  18 : objectname = NULL;
> > 3902  18 : objecttag = "Table Function Name";
> > 3903  18 : break;
>
> Indeed -- the problem seems to be that add_rte_to_flat_rtable is
> creating a new RTE and zaps the ->tablefunc pointer for it.  So when
> EXPLAIN goes to examine the struct, there's a NULL pointer there and
> nothing is printed.

Ah yes.

> One simple fix is to change add_rte_to_flat_rtable so that it doesn't
> zero out the tablefunc pointer, but this is straight against what that
> function is trying to do, namely to remove substructure.

Yes.

>  Which means
> that we need to preserve the name somewhere else.  I added a new member
> to RangeTblEntry for this, which perhaps is a little ugly.  So here's
> the patch for that.
>
>  (I also added an alias to one XMLTABLE invocation
> under EXPLAIN, to show what it looks like when an alias is specified.
> Otherwise they're always shown as "XMLTABLE" "xmltable" which is a bit
> dumb).

Thanks for the patch.  Seems alright to me.

> Another possible way out is to decide that we don't want the
> "objectname" to be reported here.  I admit it's perhaps redundant.  In
> this case we'd just remove lines 3896-3899 shown above and let it be
> NULL.

Showing the function's name spelled out in the query (XMLTABLE /
JSON_TABLE) seems fine to me, even though maybe a bit redundant, yes.

-- 
Thanks, Amit Langote
EDB: http://www.enterprisedb.com




  1   2   3   >