Re: [HACKERS] [PATCH] Generic type subscripting

2021-02-01 Thread Dmitry Dolgov
> On Fri, Jan 29, 2021 at 7:01 PM Alexander Korotkov  
> wrote:
> Pushed with minor cleanup.

Thanks a lot!

> On Sun, Jan 31, 2021 at 05:23:25PM -0500, Tom Lane wrote:
>
> thorntail seems unhappy:
>
> [From 7c5d57c...]
> Fix portability issue in new jsonbsubs code.
>
> On machines where sizeof(Datum) > sizeof(Oid) (that is, any 64-bit
> platform), the previous coding would compute a misaligned
> workspace->index pointer if nupper is odd.  Architectures where
> misaligned access is a hard no-no would then fail.  This appears
> to explain why thorntail is unhappy but other buildfarm members
> are not.

Yeah, that was an unexpected issue, thanks! I assume few other failing
buildfarm members are the same, as they show similar symptoms (e.g.
mussurana or ibisbill).




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-31 Thread Tom Lane
Alexander Korotkov  writes:
> Pushed with minor cleanup.

thorntail seems unhappy:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=thorntail=2021-01-31%2020%3A58%3A12

==-=-== stack trace: pgsql.build/src/test/regress/tmp_check/data/core 
==-=-==
[New LWP 2266507]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: nm regression [local] SELECT   
 '.
Program terminated with signal SIGILL, Illegal instruction.
#0  0x0175c410 in jsonb_subscript_check_subscripts (state=, op=0x1d852b0, econtext=) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonbsubs.c:198
198 for (int i = 0; i < sbsrefstate->numupper; i++)
#0  0x0175c410 in jsonb_subscript_check_subscripts (state=, op=0x1d852b0, econtext=) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/utils/adt/jsonbsubs.c:198
#1  0x013e55c0 in ExecInterpExpr (state=0x1d85068, 
econtext=0x1d85660, isnull=0x7feffa2fbbc) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/executor/execExprInterp.c:1402
#2  0x013de4bc in ExecInterpExprStillValid (state=0x1d85068, 
econtext=0x1d85660, isNull=0x7feffa2fbbc) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/executor/execExprInterp.c:1765
#3  0x0154fbd4 in ExecEvalExprSwitchContext (isNull=0x7feffa2fbbc, 
econtext=, state=0x1d85068) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/include/executor/executor.h:315
#4  evaluate_expr (expr=, result_type=, 
result_typmod=, result_collation=) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/util/clauses.c:4533
#5  0x015513b8 in eval_const_expressions_mutator (node=0x1dce218, 
context=0x7feffa30108) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/util/clauses.c:2883
#6  0x014b4968 in expression_tree_mutator (node=0x1cc10e8, 
mutator=0x154fca4 , context=0x7feffa30108) 
at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/nodes/nodeFuncs.c:2762
#7  0x0154fd0c in eval_const_expressions_mutator (node=0x1cc10e8, 
context=0x7feffa30108) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/util/clauses.c:3312
#8  0x014b52d0 in expression_tree_mutator (node=0x1cc1140, 
mutator=0x154fca4 , context=0x7feffa30108) 
at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/nodes/nodeFuncs.c:3050
#9  0x0154fd0c in eval_const_expressions_mutator (node=0x1cc1140, 
context=0x7feffa30108) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/util/clauses.c:3312
#10 0x0155284c in eval_const_expressions (root=0x1dcdca0, 
node=0x1cc1140) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/util/clauses.c:2034
#11 0x01523134 in preprocess_expression (root=0x1dcdca0, 
expr=0x1cc1140, kind=) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/plan/planner.c:1088
#12 0x0152ed3c in subquery_planner (glob=, 
parse=0x1cc0350, parent_root=, hasRecursion=, 
tuple_fraction=0) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/plan/planner.c:765
#13 0x01531afc in standard_planner (parse=0x1cc0350, 
query_string=, cursorOptions=, boundParams=0x0) 
at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/optimizer/plan/planner.c:402
#14 0x01696d6c in pg_plan_query (querytree=0x1cc0350, 
query_string=0x1cbf340 "select ('123'::jsonb)['a'];", 
cursorOptions=, boundParams=0x0) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:876
#15 0x01696f14 in pg_plan_queries (querytrees=0x1dcdbb0, 
query_string=0x1cbf340 "select ('123'::jsonb)['a'];", 
cursorOptions=, boundParams=0x0) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:967
#16 0x016976e4 in exec_simple_query (query_string=0x1cbf340 "select 
('123'::jsonb)['a'];") at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:1159
#17 0x0169a0e0 in PostgresMain (argc=, argv=, dbname=, username=) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/tcop/postgres.c:4394
#18 0x015a94ec in BackendRun (port=0x1ce4000) at 
/home/nm/farm/sparc64_deb10_gcc_64_ubsan/HEAD/pgsql.build/../pgsql/src/backend/postmaster/postmaster.c:4484
#19 BackendStartup 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-31 Thread Alexander Korotkov
On Fri, Jan 29, 2021 at 7:01 PM Alexander Korotkov  wrote:
> On Thu, Jan 21, 2021 at 11:14 PM Pavel Stehule  
> wrote:
> >> Looks good, I've applied it, thanks.
> >
> > I tested last set of patches
> >
> > 1. There is no problem with patching and compilation
> > 2. make check-world passed
> > 3. build doc without problems
> > 4. I have not any objections against implemented functionality, 
> > implementation and tests
> >
> > I'll mark this patch as ready for committers
> >
> > Thank you for your work. It will be nice feature
>
> I've skimmed through the thread, it seems that consensus over
> functionality is reached.  Patchset themself looks good for me.  I'm
> going to push this if no objections.

Pushed with minor cleanup.

--
Regards,
Alexander Korotkov




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-29 Thread Alexander Korotkov
On Thu, Jan 21, 2021 at 11:14 PM Pavel Stehule  wrote:
>> Looks good, I've applied it, thanks.
>
> I tested last set of patches
>
> 1. There is no problem with patching and compilation
> 2. make check-world passed
> 3. build doc without problems
> 4. I have not any objections against implemented functionality, 
> implementation and tests
>
> I'll mark this patch as ready for committers
>
> Thank you for your work. It will be nice feature

I've skimmed through the thread, it seems that consensus over
functionality is reached.  Patchset themself looks good for me.  I'm
going to push this if no objections.

--
Regards,
Alexander Korotkov




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-21 Thread Pavel Stehule
Hi


> Looks good, I've applied it, thanks.
>

I tested last set of patches

1. There is no problem with patching and compilation
2. make check-world passed
3. build doc without problems
4. I have not any objections against implemented functionality,
implementation and tests

I'll mark this patch as ready for committers

Thank you for your work. It will be nice feature

Regards

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-21 Thread Dmitry Dolgov
> On Wed, Jan 20, 2021 at 11:37:32PM -0500, Dian M Fay wrote:
> On Wed Jan 20, 2021 at 2:08 PM EST, Dmitry Dolgov wrote:
> > > On Wed, Jan 20, 2021 at 11:34:16AM -0500, Dian M Fay wrote:
> > > > Thanks, I need to remember to not skipp doc building for testing process
> > > > even for such small changes. Hope now I didn't forget anything.
> > > >
> > > > > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
> > > >
> > > > > Here's a full editing pass on the documentation, with v45 and Pavel's
> > > > > doc-whitespaces-fix.patch applied. I also corrected a typo in one of 
> > > > > the
> > > > > added hints.
> > > >
> > > > Great! I've applied almost all of it, except:
> > > >
> > > > + A jsonb value will accept assignments to nonexistent
> > > > subscript
> > > > + paths as long as the nonexistent elements being traversed are all
> > > > arrays.
> > > >
> > > > Maybe I've misunderstood the intention, but there is no requirement
> > > > about arrays for creating such an empty path. I've formulated it as:
> > > >
> > > > + A jsonb value will accept assignments to nonexistent
> > > > subscript
> > > > + paths as long as the last existing path key is an object or an array.
> > >
> > > My intention there was to highlight the difference between:
> > >
> > > * SET obj['a']['b']['c'] = '"newvalue"'
> > > * SET arr[0][0][3] = '"newvalue"'
> > >
> > > obj has to conform to {"a": {"b": {...}}} in order to receive the
> > > assignment of the nested c. If it doesn't, that's the error case we
> > > discussed earlier. But arr can be null, [], and so on, and any missing
> > > structure [[[null, null, null, "newvalue"]]] will be created.
> >
> > If arr is 'null', or any other scalar value, such subscripting will work
> > only one level deep because they represented internally as an array of
> > one element. If arr is '[]' the path will comply by definition. So it's
> > essentially the same as for objects with no particular difference. If
> > such a quirk about scalars being treated like arrays is bothering, we
> > could also bend it in this case as well (see the attached version).
>
> I missed that distinction in the original UPDATE paragraph too. Here's
> another revision based on v48.

Looks good, I've applied it, thanks.
>From a4037c651a0cfd2448f38b6c8c932b5a09136b0a Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v49 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule, Dian M Fay
---
 doc/src/sgml/json.sgml  |  51 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 985 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..3ace5e444b 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,57 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   The jsonb data type supports array-style subscripting expressions
+   to extract and modify elements. Nested values can be indicated by chaining
+   subscripting expressions, following the same rules as the path
+   argument in the jsonb_set function. If a jsonb
+   value is an array, numeric subscripts start at zero, and negative integers count
+   backwards from the last element of the array. Slice expressions are not supported.
+   The result of a subscripting expression is always of the jsonb data type.
+  
+
+  
+   An example of subscripting syntax:
+
+
+-- Extract object value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested object value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract array element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update object value by key. Note the quotes around '1': the assigned
+-- value must be of the jsonb type as well
+UPDATE table_name SET 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Wed Jan 20, 2021 at 2:08 PM EST, Dmitry Dolgov wrote:
> > On Wed, Jan 20, 2021 at 11:34:16AM -0500, Dian M Fay wrote:
> > > Thanks, I need to remember to not skipp doc building for testing process
> > > even for such small changes. Hope now I didn't forget anything.
> > >
> > > > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
> > >
> > > > Here's a full editing pass on the documentation, with v45 and Pavel's
> > > > doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> > > > added hints.
> > >
> > > Great! I've applied almost all of it, except:
> > >
> > > + A jsonb value will accept assignments to nonexistent
> > > subscript
> > > + paths as long as the nonexistent elements being traversed are all
> > > arrays.
> > >
> > > Maybe I've misunderstood the intention, but there is no requirement
> > > about arrays for creating such an empty path. I've formulated it as:
> > >
> > > + A jsonb value will accept assignments to nonexistent
> > > subscript
> > > + paths as long as the last existing path key is an object or an array.
> >
> > My intention there was to highlight the difference between:
> >
> > * SET obj['a']['b']['c'] = '"newvalue"'
> > * SET arr[0][0][3] = '"newvalue"'
> >
> > obj has to conform to {"a": {"b": {...}}} in order to receive the
> > assignment of the nested c. If it doesn't, that's the error case we
> > discussed earlier. But arr can be null, [], and so on, and any missing
> > structure [[[null, null, null, "newvalue"]]] will be created.
>
> If arr is 'null', or any other scalar value, such subscripting will work
> only one level deep because they represented internally as an array of
> one element. If arr is '[]' the path will comply by definition. So it's
> essentially the same as for objects with no particular difference. If
> such a quirk about scalars being treated like arrays is bothering, we
> could also bend it in this case as well (see the attached version).

I missed that distinction in the original UPDATE paragraph too. Here's
another revision based on v48.
From a486ee221469037b08d3663f1ec142a905406f8b Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Wed, 20 Jan 2021 23:36:34 -0500
Subject: [PATCH] more jsonb subscripting documentation edits

---
 doc/src/sgml/json.sgml | 40 ++--
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index deeb9e66e0..e16dd6973d 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -616,16 +616,17 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE 
jdoc @ '{"tags": ["qu
 
   
UPDATE statements may use subscripting in the
-   SET clause to modify jsonb values. Object
-   values being traversed must exist as specified by the subscript path. For
-   instance, the path val['a']['b']['c'] assumes that
-   val, val['a'], and 
val['a']['b']
-   are all objects in every record being updated 
(val['a']['b']
-   may or may not contain a field named c, as long as it's 
an
-   object). If any individual val, 
val['a'],
-   or val['a']['b'] is a non-object such as a string, a 
number,
-   or NULL, an error is raised even if other values do 
conform.
-   Array values are not subject to this restriction, as detailed below.
+   SET clause to modify jsonb values. Subscript
+   paths must be traversible for all affected values insofar as they exist. For
+   instance, the path val['a']['b']['c'] can be traversed 
all
+   the way to c if every val,
+   val['a'], and val['a']['b'] is an
+   object. If any val['a'] or 
val['a']['b']
+   is not defined, it will be created as an empty object and filled as
+   necessary. However, if any val itself or one of the
+   intermediary values is defined as a non-object such as a string, number, or
+   jsonb null, traversal cannot proceed 
so
+   an error is raised and the transaction aborted.
   
 
   
@@ -658,8 +659,9 @@ SELECT * FROM table_name WHERE jsonb_field['key'] = 
'"value"';
 
jsonb assignment via subscripting handles a few edge cases
differently from jsonb_set. When a source 
jsonb
-   is NULL, assignment via subscripting will proceed as if
-   it was an empty JSON object:
+   value is NULL, assignment via subscripting will proceed
+   as if it was an empty JSON value of the type (object or array) implied by 
the
+   subscript key:
 
 
 -- Where jsonb_field was NULL, it is now {"a": 1}
@@ -680,17 +682,19 @@ UPDATE table_name SET jsonb_field[2] = '2';
 
 
A jsonb value will accept assignments to nonexistent subscript
-   paths as long as the last existing path key is an object or an array. Since
-   the final subscript is not traversed, it may be an object key. Nested arrays
-   will be created and NULL-padded according to the path 
until
-   the value can be placed appropriately.
+   paths as long as the last existing element to be traversed is an object or
+   array, as implied by the corresponding subscript (the element indicated by
+   the last subscript 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dmitry Dolgov
> On Wed, Jan 20, 2021 at 11:34:16AM -0500, Dian M Fay wrote:
> > Thanks, I need to remember to not skipp doc building for testing process
> > even for such small changes. Hope now I didn't forget anything.
> >
> > > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
> >
> > > Here's a full editing pass on the documentation, with v45 and Pavel's
> > > doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> > > added hints.
> >
> > Great! I've applied almost all of it, except:
> >
> > + A jsonb value will accept assignments to nonexistent
> > subscript
> > + paths as long as the nonexistent elements being traversed are all
> > arrays.
> >
> > Maybe I've misunderstood the intention, but there is no requirement
> > about arrays for creating such an empty path. I've formulated it as:
> >
> > + A jsonb value will accept assignments to nonexistent
> > subscript
> > + paths as long as the last existing path key is an object or an array.
>
> My intention there was to highlight the difference between:
>
> * SET obj['a']['b']['c'] = '"newvalue"'
> * SET arr[0][0][3] = '"newvalue"'
>
> obj has to conform to {"a": {"b": {...}}} in order to receive the
> assignment of the nested c. If it doesn't, that's the error case we
> discussed earlier. But arr can be null, [], and so on, and any missing
> structure [[[null, null, null, "newvalue"]]] will be created.

If arr is 'null', or any other scalar value, such subscripting will work
only one level deep because they represented internally as an array of
one element. If arr is '[]' the path will comply by definition. So it's
essentially the same as for objects with no particular difference. If
such a quirk about scalars being treated like arrays is bothering, we
could also bend it in this case as well (see the attached version).
>From a4037c651a0cfd2448f38b6c8c932b5a09136b0a Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v48 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule, Dian M Fay
---
 doc/src/sgml/json.sgml  |  51 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 985 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..3ace5e444b 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,57 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   The jsonb data type supports array-style subscripting expressions
+   to extract and modify elements. Nested values can be indicated by chaining
+   subscripting expressions, following the same rules as the path
+   argument in the jsonb_set function. If a jsonb
+   value is an array, numeric subscripts start at zero, and negative integers count
+   backwards from the last element of the array. Slice expressions are not supported.
+   The result of a subscripting expression is always of the jsonb data type.
+  
+
+  
+   An example of subscripting syntax:
+
+
+-- Extract object value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested object value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract array element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update object value by key. Note the quotes around '1': the assigned
+-- value must be of the jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Filter records using a WHERE clause with subscripting. Since the result of
+-- subscripting is jsonb, the value we compare it against must also be jsonb.
+-- The double quotes make "value" also a valid jsonb string.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+   jsonb assignment via subscripting handles a few edge cases
+   differently from jsonb_set. When a source jsonb
+   is NULL, 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Wed Jan 20, 2021 at 11:22 AM EST, Dmitry Dolgov wrote:
> > On Tue Jan 19, 2021 at 1:42 PM EST, Pavel Stehule wrote:
> >
> > I found minor issues.
> >
> > Doc - missing tag
> >
> > and three whitespaces issues
> >
> > see attached patch
>
> Thanks, I need to remember to not skipp doc building for testing process
> even for such small changes. Hope now I didn't forget anything.
>
> > On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:
>
> > Here's a full editing pass on the documentation, with v45 and Pavel's
> > doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> > added hints.
>
> Great! I've applied almost all of it, except:
>
> + A jsonb value will accept assignments to nonexistent
> subscript
> + paths as long as the nonexistent elements being traversed are all
> arrays.
>
> Maybe I've misunderstood the intention, but there is no requirement
> about arrays for creating such an empty path. I've formulated it as:
>
> + A jsonb value will accept assignments to nonexistent
> subscript
> + paths as long as the last existing path key is an object or an array.

My intention there was to highlight the difference between:

* SET obj['a']['b']['c'] = '"newvalue"'
* SET arr[0][0][3] = '"newvalue"'

obj has to conform to {"a": {"b": {...}}} in order to receive the
assignment of the nested c. If it doesn't, that's the error case we
discussed earlier. But arr can be null, [], and so on, and any missing
structure [[[null, null, null, "newvalue"]]] will be created. Take 2:

A jsonb value will accept assignments to nonexistent
subscript paths as long as object key subscripts can be traversed as
described above. The final subscript is not traversed and, if it
describes a missing object key, will be created. Nested arrays will
always be created and NULL-padded according to the
path until the value can be placed appropriately.




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dmitry Dolgov
> On Tue Jan 19, 2021 at 1:42 PM EST, Pavel Stehule wrote:
>
> I found minor issues.
>
> Doc - missing tag
>
> and three whitespaces issues
>
> see attached patch

Thanks, I need to remember to not skipp doc building for testing process
even for such small changes. Hope now I didn't forget anything.

> On Wed, Jan 20, 2021 at 09:58:43AM -0500, Dian M Fay wrote:

> Here's a full editing pass on the documentation, with v45 and Pavel's
> doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
> added hints.

Great! I've applied almost all of it, except:

+   A jsonb value will accept assignments to nonexistent subscript
+   paths as long as the nonexistent elements being traversed are all arrays.

Maybe I've misunderstood the intention, but there is no requirement
about arrays for creating such an empty path. I've formulated it as:

+   A jsonb value will accept assignments to nonexistent subscript
+   paths as long as the last existing path key is an object or an array.
>From a4037c651a0cfd2448f38b6c8c932b5a09136b0a Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v46 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule, Dian M Fay
---
 doc/src/sgml/json.sgml  |  51 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 985 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..3ace5e444b 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,57 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   The jsonb data type supports array-style subscripting expressions
+   to extract and modify elements. Nested values can be indicated by chaining
+   subscripting expressions, following the same rules as the path
+   argument in the jsonb_set function. If a jsonb
+   value is an array, numeric subscripts start at zero, and negative integers count
+   backwards from the last element of the array. Slice expressions are not supported.
+   The result of a subscripting expression is always of the jsonb data type.
+  
+
+  
+   An example of subscripting syntax:
+
+
+-- Extract object value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested object value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract array element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update object value by key. Note the quotes around '1': the assigned
+-- value must be of the jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Filter records using a WHERE clause with subscripting. Since the result of
+-- subscripting is jsonb, the value we compare it against must also be jsonb.
+-- The double quotes make "value" also a valid jsonb string.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+   jsonb assignment via subscripting handles a few edge cases
+   differently from jsonb_set. When a source jsonb
+   is NULL, assignment via subscripting will proceed as if
+   it was an empty JSON object:
+
+
+-- Where jsonb_field was NULL, it is now {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- Where jsonb_field was NULL, it is now [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..279ff15ade 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = \
 	jsonb_op.o \
 	jsonb_util.o \
 	jsonfuncs.o \
+	jsonbsubs.o \
 	jsonpath.o \
 	jsonpath_exec.o \
 	jsonpath_gram.o \
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 4eeffa1424..41a1c1f9bb 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-20 Thread Dian M Fay
On Tue Jan 19, 2021 at 1:42 PM EST, Pavel Stehule wrote:
> Hi
>
> I found minor issues.
>
> Doc - missing tag
>
> and three whitespaces issues
>
> see attached patch
>
> Following sentence is hard to read due long nested example
>
> If the
> + path contradicts structure of modified jsonb for any
> individual
> + value (e.g. path val['a']['b']['c'] assumes keys
> + 'a' and 'b' have object values
> + assigned to them, but if val['a'] or
> + val['b'] is null, a string, or a number, then the
> path
> + contradicts with the existing structure), an error is raised even if
> other
> + values do conform.
>
> It can be divided into two sentences - predicate, and example.
>
> Regards
>
> Pavel

Here's a full editing pass on the documentation, with v45 and Pavel's
doc-whitespaces-fix.patch applied. I also corrected a typo in one of the
added hints.
From 086a34ca860e8513484d829db1cb3f0c17c4ec1e Mon Sep 17 00:00:00 2001
From: Dian M Fay 
Date: Tue, 19 Jan 2021 23:44:23 -0500
Subject: [PATCH] Revise jsonb subscripting documentation

---
 doc/src/sgml/json.sgml  | 101 +---
 src/backend/utils/adt/jsonbsubs.c   |   2 +-
 src/test/regress/expected/jsonb.out |   2 +-
 3 files changed, 49 insertions(+), 56 deletions(-)

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 4e19fe4fb8..eb3952193a 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -605,97 +605,90 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE 
jdoc @ '{"tags": ["qu
  
   jsonb Subscripting
   
-   jsonb data type supports array-style subscripting expressions
-   to extract or update particular elements. It's possible to use multiple
-   subscripting expressions to extract nested values. In this case, a chain of
-   subscripting expressions follows the same rules as the
-   path argument in jsonb_set function,
-   e.g. in case of arrays it is a 0-based operation or that negative integers
-   that appear in path count from the end of JSON arrays.
-   The result of subscripting expressions is always jsonb data type.
+   The jsonb data type supports array-style subscripting 
expressions
+   to extract and modify elements. Nested values can be indicated by chaining
+   subscripting expressions, following the same rules as the 
path
+   argument in the jsonb_set function. If a 
jsonb
+   value is an array, numeric subscripts start at zero, and negative integers 
count
+   backwards from the last element of the array. Slice expressions are not 
supported.
+   The result of a subscripting expression is always of the jsonb data type.
   
 
   
UPDATE statements may use subscripting in the
-   SET clause to modify jsonb values. Every
-   affected value must conform to the path defined by the subscript(s). If the
-   path contradicts structure of modified jsonb for any individual
-   value (e.g. path val['a']['b']['c'] assumes keys
-   'a' and 'b' have object values
-   assigned to them, but if val['a'] or
-   val['b'] is null, a string, or a number, then the path
-   contradicts with the existing structure), an error is raised even if other
-   values do conform.
+   SET clause to modify jsonb values. Object
+   values being traversed must exist as specified by the subscript path. For
+   instance, the path val['a']['b']['c'] assumes that
+   val, val['a'], and 
val['a']['b']
+   are all objects in every record being updated 
(val['a']['b']
+   may or may not contain a field named c, as long as it's 
an
+   object). If any individual val, 
val['a'],
+   or val['a']['b'] is a non-object such as a string, a 
number,
+   or NULL, an error is raised even if other values do 
conform.
+   Array values are not subject to this restriction, as detailed below.
   
-  
 
+  
An example of subscripting syntax:
+
 
--- Extract value by key
+-- Extract object value by key
 SELECT ('{"a": 1}'::jsonb)['a'];
 
--- Extract nested value by key path
+-- Extract nested object value by key path
 SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
 
--- Extract element by index
+-- Extract array element by index
 SELECT ('[1, "2", null]'::jsonb)[1];
 
--- Update value by key, note the single quotes - the assigned value
--- needs to be of jsonb type as well
+-- Update object value by key. Note the quotes around '1': the assigned
+-- value must be of the jsonb type as well
 UPDATE table_name SET jsonb_field['key'] = '1';
 
--- This will raise an error if jsonb_field is {"a": 1}
+-- This will raise an error if any record's jsonb_field['a']['b'] is something
+-- other than an object. For example, the value {"a": 1} has no 'b' key.
 UPDATE table_name SET jsonb_field['a']['b']['c'] = '1';
 
--- Select records using where clause with subscripting. Since the result of
--- subscripting is jsonb and we basically want to compare two jsonb objects, we
--- need to put the value in double quotes to be able to convert it to jsonb.
+-- Filter records using a WHERE clause with subscripting. Since the result of
+-- subscripting 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Pavel Stehule
Hi

I found minor issues.

Doc - missing tag

and three whitespaces issues

see attached patch

Following sentence is hard to read due long nested example

If the
+   path contradicts structure of modified jsonb for any
individual
+   value (e.g. path val['a']['b']['c'] assumes keys
+   'a' and 'b' have object values
+   assigned to them, but if val['a'] or
+   val['b'] is null, a string, or a number, then the
path
+   contradicts with the existing structure), an error is raised even if
other
+   values do conform.

It can be divided into two sentences - predicate, and example.

Regards

Pavel
commit 7ce4fbe2620a5d8efdff963b2368c3d0fd904c59
Author: ok...@github.com 
Date:   Tue Jan 19 19:37:02 2021 +0100

fix whitespaces

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 924762e128..4e19fe4fb8 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -613,6 +613,7 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
e.g. in case of arrays it is a 0-based operation or that negative integers
that appear in path count from the end of JSON arrays.
The result of subscripting expressions is always jsonb data type.
+  
 
   
UPDATE statements may use subscripting in the
diff --git a/src/backend/utils/adt/jsonbsubs.c b/src/backend/utils/adt/jsonbsubs.c
index 306c37b5a6..64979f3a5b 100644
--- a/src/backend/utils/adt/jsonbsubs.c
+++ b/src/backend/utils/adt/jsonbsubs.c
@@ -342,8 +342,8 @@ jsonb_subscript_fetch_old(ExprState *state,
 	{
 		Jsonb	*jsonbSource = DatumGetJsonbP(*op->resvalue);
 		sbsrefstate->prevvalue = jsonb_get_element(jsonbSource,
-	  			   sbsrefstate->upperindex,
-	  			   sbsrefstate->numupper,
+   sbsrefstate->upperindex,
+   sbsrefstate->numupper,
    >prevnull,
    false);
 	}
@@ -366,7 +366,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 
 	/* Allocate type-specific workspace with space for per-subscript data */
 	workspace = palloc0(MAXALIGN(sizeof(JsonbSubWorkspace)) +
-	nupper * (sizeof(Datum) + sizeof(Oid)));
+		nupper * (sizeof(Datum) + sizeof(Oid)));
 	workspace->expectArray = false;
 	ptr = ((char *) workspace) + MAXALIGN(sizeof(JsonbSubWorkspace));
 	workspace->indexOid = (Oid *) ptr;
@@ -379,7 +379,7 @@ jsonb_exec_setup(const SubscriptingRef *sbsref,
 	foreach(lc, sbsref->refupperindexpr)
 	{
 		Node   *expr = lfirst(lc);
-		int 	i = foreach_current_index(lc);
+		int		i = foreach_current_index(lc);
 
 		workspace->indexOid[i] = exprType(expr);
 	}


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-19 Thread Dmitry Dolgov
> On Thu, Jan 14, 2021 at 12:02:42PM -0500, Dian M Fay wrote:
> > Other than that, since I've already posted the patch for returning an
> > error option, it seems that the only thing left is to decide with which
> > version to go.
>
> The trigger issue (which I did verify) makes the "no update" option
> unworkable imo, JavaScript's behavior notwithstanding. But it should be
> called out very clearly in the documentation, since it does depart from
> what people more familiar with that behavior may expect. Here's a quick
> draft, based on your v44 patch:
>
> 
>  jsonb data type supports array-style subscripting expressions
>  to extract or update particular elements. It's possible to use multiple
>  subscripting expressions to extract nested values. In this case, a chain of
>  subscripting expressions follows the same rules as the
>  path argument in jsonb_set function,
>  e.g. in case of arrays it is a 0-based operation or that negative integers
>  that appear in path count from the end of JSON arrays.
>  The result of subscripting expressions is always of the jsonb data type.
> 
> 
>  UPDATE statements may use subscripting in the
>  SET clause to modify jsonb values. Every
>  affected value must conform to the path defined by the subscript(s). If the
>  path cannot be followed to its end for any individual value (e.g.
>  val['a']['b']['c'] where val['a'] or
>  val['b'] is null, a string, or a number), an error is
>  raised even if other values do conform.
> 
> 
>  An example of subscripting syntax:

Yes, makes sense. I've incorporated your suggestion into the last patch,
thanks.
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v45 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-17 Thread Pavel Stehule
čt 14. 1. 2021 v 18:09 odesílatel Dian M Fay  napsal:

> On Thu Jan 14, 2021 at 10:04 AM EST, Dmitry Dolgov wrote:
> > > On Tue, Jan 12, 2021 at 08:02:59PM +0100, Pavel Stehule wrote:
> > > ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule <
> pavel.steh...@gmail.com>
> > > napsal:
> > >
> > > I tested behaviour and I didn't find anything other than the mentioned
> > > issue.
> > >
> > > Now I can check this feature from plpgsql, and it is working. Because
> there
> > > is no special support in plpgsql runtime, the update of jsonb is
> > > significantly slower than in update of arrays, and looks so update of
> jsonb
> > > has O(N2) cost. I don't think it is important at this moment - more
> > > important is fact, so I didn't find any memory problems.
> >
> > Thanks for testing. Regarding updates when the structure doesn't match
> > provided path as I've mentioned I don't have strong preferences, but on
> > the second though probably more inclined for returning an error in this
> > case. Since there are pros and cons for both suggestions, it could be
> > decided by vote majority between no update (Dian) or an error (Pavel,
> > me) options. Any +1 to one of the options from others?
> >
> > Other than that, since I've already posted the patch for returning an
> > error option, it seems that the only thing left is to decide with which
> > version to go.
>
> The trigger issue (which I did verify) makes the "no update" option
> unworkable imo, JavaScript's behavior notwithstanding. But it should be
> called out very clearly in the documentation, since it does depart from
> what people more familiar with that behavior may expect. Here's a quick
> draft, based on your v44 patch:
>
> 
>  jsonb data type supports array-style subscripting expressions
>  to extract or update particular elements. It's possible to use multiple
>  subscripting expressions to extract nested values. In this case, a chain
> of
>  subscripting expressions follows the same rules as the
>  path argument in jsonb_set function,
>  e.g. in case of arrays it is a 0-based operation or that negative integers
>  that appear in path count from the end of JSON arrays.
>  The result of subscripting expressions is always of the jsonb data type.
> 
> 
>  UPDATE statements may use subscripting in the
>  SET clause to modify jsonb values. Every
>  affected value must conform to the path defined by the subscript(s). If
> the
>  path cannot be followed to its end for any individual value (e.g.
>  val['a']['b']['c'] where val['a'] or
>  val['b'] is null, a string, or a number), an error is
>  raised even if other values do conform.
> 
> 
>  An example of subscripting syntax:
>

+1

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-14 Thread Dian M Fay
On Thu Jan 14, 2021 at 10:04 AM EST, Dmitry Dolgov wrote:
> > On Tue, Jan 12, 2021 at 08:02:59PM +0100, Pavel Stehule wrote:
> > ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule 
> > napsal:
> >
> > I tested behaviour and I didn't find anything other than the mentioned
> > issue.
> >
> > Now I can check this feature from plpgsql, and it is working. Because there
> > is no special support in plpgsql runtime, the update of jsonb is
> > significantly slower than in update of arrays, and looks so update of jsonb
> > has O(N2) cost. I don't think it is important at this moment - more
> > important is fact, so I didn't find any memory problems.
>
> Thanks for testing. Regarding updates when the structure doesn't match
> provided path as I've mentioned I don't have strong preferences, but on
> the second though probably more inclined for returning an error in this
> case. Since there are pros and cons for both suggestions, it could be
> decided by vote majority between no update (Dian) or an error (Pavel,
> me) options. Any +1 to one of the options from others?
>
> Other than that, since I've already posted the patch for returning an
> error option, it seems that the only thing left is to decide with which
> version to go.

The trigger issue (which I did verify) makes the "no update" option
unworkable imo, JavaScript's behavior notwithstanding. But it should be
called out very clearly in the documentation, since it does depart from
what people more familiar with that behavior may expect. Here's a quick
draft, based on your v44 patch:


 jsonb data type supports array-style subscripting expressions
 to extract or update particular elements. It's possible to use multiple
 subscripting expressions to extract nested values. In this case, a chain of
 subscripting expressions follows the same rules as the
 path argument in jsonb_set function,
 e.g. in case of arrays it is a 0-based operation or that negative integers
 that appear in path count from the end of JSON arrays.
 The result of subscripting expressions is always of the jsonb data type.


 UPDATE statements may use subscripting in the 
 SET clause to modify jsonb values. Every
 affected value must conform to the path defined by the subscript(s). If the
 path cannot be followed to its end for any individual value (e.g.
 val['a']['b']['c'] where val['a'] or
 val['b'] is null, a string, or a number), an error is
 raised even if other values do conform.


 An example of subscripting syntax:




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-14 Thread Dmitry Dolgov
> On Tue, Jan 12, 2021 at 08:02:59PM +0100, Pavel Stehule wrote:
> ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule 
> napsal:
>
> I tested behaviour and I didn't find anything other than the mentioned
> issue.
>
> Now I can check this feature from plpgsql, and it is working. Because there
> is no special support in plpgsql runtime, the update of jsonb is
> significantly slower than in update of arrays, and looks so update of jsonb
> has O(N2) cost. I don't think it is important at this moment - more
> important is fact, so I didn't find any memory problems.

Thanks for testing. Regarding updates when the structure doesn't match
provided path as I've mentioned I don't have strong preferences, but on
the second though probably more inclined for returning an error in this
case. Since there are pros and cons for both suggestions, it could be
decided by vote majority between no update (Dian) or an error (Pavel,
me) options. Any +1 to one of the options from others?

Other than that, since I've already posted the patch for returning an
error option, it seems that the only thing left is to decide with which
version to go.




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-12 Thread Pavel Stehule
ne 10. 1. 2021 v 19:52 odesílatel Pavel Stehule 
napsal:

> Hi
>
>
>> I'm thinking of the update path as a kind of implicit schema. JSON is
>> intentionally not bound to any schema on creation, so I don't see a
>> failure to enforce another schema at runtime (and outside the WHERE
>> clause, at that) as an error exactly.
>>
>
> This concept is not consistent with other implemented behaviour.
>
> 1. The schema is dynamically enhanced - so although the path doesn't
> exists, it is created and data are changed
>
> postgres=# create table foo(a jsonb);
> CREATE TABLE
> postgres=# insert into foo values('{}');
> INSERT 0 1
> postgres=# update foo set a['a']['a'][10] = '0';
> UPDATE 1
> postgres=# select * from foo;
>
> ┌───┐
> │   a
>   │
>
> ╞═══╡
> │ {"a": {"a": [null, null, null, null, null, null, null, null, null, null,
> 0]}} │
>
> └───┘
> (1 row)
>
> So although the path [a,a,10] was not exists, it was created.
>
> 2. this update fails (and it is correct)
>
> postgres=# update foo set a['a']['a']['b'] = '0';
> ERROR:  path element at position 3 is not an integer: "b"
>
> although the path [a,a,b] doesn't exists, and it is not ignored.
>
> This implementation doesn't do only UPDATE (and then analogy with WHERE
> clause isn't fully adequate). It does MERGE. This is necessary, because
> without it, the behaviour will be pretty unfriendly - because there is not
> any external schema. I think so this is important - and it can be little
> bit messy. I am not sure if I use correct technical terms - we try to use
> LAX update in first step, and if it is not successful, then we try to do
> LAX insert. This is maybe correct from JSON semantic - but for developer it
> is unfriendly, because he hasn't possibility to detect if insert was or was
> not successful. In special JSON functions I can control behave and can
> specify LAX or STRICT how it is necessity. But in this interface
> (subscripting) this possibility is missing.
>
> I think so there should be final check (semantically) if value was
> updated, and if the value was changed. If not, then error should be raised.
> It should be very similar like RLS update. I know and I understand so there
> should be more than one possible implementations, but safe is only one -
> after successful update I would to see new value inside, and when it is not
> possible, then I expect exception. I think so it is more practical too. I
> can control filtering with WHERE clause. But I cannot to control MERGE
> process. Manual recheck after every update can be terrible slow.
>

I tested behaviour and I didn't find anything other than the mentioned
issue.

Now I can check this feature from plpgsql, and it is working. Because there
is no special support in plpgsql runtime, the update of jsonb is
significantly slower than in update of arrays, and looks so update of jsonb
has O(N2) cost. I don't think it is important at this moment - more
important is fact, so I didn't find any memory problems.


postgres=# do $$
declare v int[] = array_fill(0, ARRAY[10,10,10,20]);
begin
  for i1 in 1..10 loop
for i2 in 1..10 loop
  for i3 in 1..10 loop
for i4 in 1..20 loop
  v[i1][i2][i3][i4] = 10;
  raise notice '% % % %', i1, i2, i3, i4;
end loop;
  end loop;
end loop;
  end loop;
end;
$$;

postgres=# do $$
declare v jsonb;
begin
  for i1 in 1..10 loop
for i2 in 1..10 loop
  for i3 in 1..10 loop
for i4 in 1..20 loop
  v[i1][i2][i3][i4] = '10'::jsonb;
  raise notice '% % % %', i1, i2, i3, i4;
end loop;
  end loop;
end loop;
  end loop;
end;
$$;

There are some unwanted white spaces

+   Jsonb   *jsonbSource = DatumGetJsonbP(*op->resvalue);
+   sbsrefstate->prevvalue = jsonb_get_element(jsonbSource,
+  sbsrefstate->upperindex,
+  sbsrefstate->numupper,
+  >prevnull,
+  false);


+   workspace = palloc0(MAXALIGN(sizeof(JsonbSubWorkspace)) +
+   nupper * (sizeof(Datum) + sizeof(Oid)));
+   workspace->expectArray = false;
+   ptr = ((char *) workspace) + MAXALIGN(sizeof(JsonbSubWorkspace));

Regards

Pavel



> Regards
>
> Pavel
>
>
>
>> But I looked into the bulk case a little further, and "outside the
>> WHERE clause" cuts both ways. The server reports an update whether or
>> not the JSON could have been modified, which suggests triggers will
>> fire for no-op updates. That's more clearly a problem.
>>
>> insert into j (val) values
>>  ('{"a": 100}'),
>>  ('{"a": "200"}'),
>>  ('{"b": "300"}'),
>>  ('{"c": {"d": 400}}'),
>>  

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-10 Thread Pavel Stehule
Hi


> I'm thinking of the update path as a kind of implicit schema. JSON is
> intentionally not bound to any schema on creation, so I don't see a
> failure to enforce another schema at runtime (and outside the WHERE
> clause, at that) as an error exactly.
>

This concept is not consistent with other implemented behaviour.

1. The schema is dynamically enhanced - so although the path doesn't
exists, it is created and data are changed

postgres=# create table foo(a jsonb);
CREATE TABLE
postgres=# insert into foo values('{}');
INSERT 0 1
postgres=# update foo set a['a']['a'][10] = '0';
UPDATE 1
postgres=# select * from foo;
┌───┐
│   a
│
╞═══╡
│ {"a": {"a": [null, null, null, null, null, null, null, null, null, null,
0]}} │
└───┘
(1 row)

So although the path [a,a,10] was not exists, it was created.

2. this update fails (and it is correct)

postgres=# update foo set a['a']['a']['b'] = '0';
ERROR:  path element at position 3 is not an integer: "b"

although the path [a,a,b] doesn't exists, and it is not ignored.

This implementation doesn't do only UPDATE (and then analogy with WHERE
clause isn't fully adequate). It does MERGE. This is necessary, because
without it, the behaviour will be pretty unfriendly - because there is not
any external schema. I think so this is important - and it can be little
bit messy. I am not sure if I use correct technical terms - we try to use
LAX update in first step, and if it is not successful, then we try to do
LAX insert. This is maybe correct from JSON semantic - but for developer it
is unfriendly, because he hasn't possibility to detect if insert was or was
not successful. In special JSON functions I can control behave and can
specify LAX or STRICT how it is necessity. But in this interface
(subscripting) this possibility is missing.

I think so there should be final check (semantically) if value was updated,
and if the value was changed. If not, then error should be raised. It
should be very similar like RLS update. I know and I understand so there
should be more than one possible implementations, but safe is only one -
after successful update I would to see new value inside, and when it is not
possible, then I expect exception. I think so it is more practical too. I
can control filtering with WHERE clause. But I cannot to control MERGE
process. Manual recheck after every update can be terrible slow.

Regards

Pavel



> But I looked into the bulk case a little further, and "outside the
> WHERE clause" cuts both ways. The server reports an update whether or
> not the JSON could have been modified, which suggests triggers will
> fire for no-op updates. That's more clearly a problem.
>
> insert into j (val) values
>  ('{"a": 100}'),
>  ('{"a": "200"}'),
>  ('{"b": "300"}'),
>  ('{"c": {"d": 400}}'),
>  ('{"a": {"z": 500}}');
>
> INSERT 0 5
> update j set val['a']['z'] = '600' returning *;
> val
> 
>  {"a": 100}
>  {"a": "200"}
>  {"a": {"z": 600}, "b": "300"}
>  {"a": {"z": 600}, "c": {"d": 400}}
>  {"a": {"z": 600}}
> (5 rows)
>
> *UPDATE 5*
>


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-09 Thread Dian M Fay
On Sat Jan 9, 2021 at 3:34 PM EST, Pavel Stehule wrote:
> so 9. 1. 2021 v 21:06 odesílatel Dian M Fay 
> napsal:
>
> > On Thu Jan 7, 2021 at 3:24 AM EST, Pavel Stehule wrote:
> > > čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> > > napsal:
> > >
> > > > > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> > > > >
> > > > > this case should to raise exception - the value should be changed or
> > > > error
> > > > > should be raised
> > > > >
> > > > > postgres=# insert into foo values('{}');
> > > > > postgres=# update foo set a['a'] = '100';
> > > > > postgres=# update foo set a['a'][1] = '-1';
> > > > > postgres=# select * from foo;
> > > > > ┌┐
> > > > > │ a  │
> > > > > ╞╡
> > > > > │ {"a": 100} │
> > > > > └┘
> > > >
> > > > I was expecting this question, as I've left this like that
> > intentionally
> > > > because of two reasons:
> > > >
> > > > * Opposite to other changes, to implement this one we need to introduce
> > > >   a condition more interfering with normal processing, which raises
> > > >   performance issues for already existing functionality in jsonb_set.
> > > >
> > > > * I vaguely recall there was a similar discussion about jsonb_set with
> > > >   the similar solution.
> > > >
> > >
> > > ok.
> > >
> > > In this case I have a strong opinion so current behavior is wrong. It
> > > can
> > > mask errors. There are two correct possibilities
> > >
> > > 1. raising error - because the update try to apply index on scalar value
> > >
> > > 2. replace by array - a = {NULL, -1}
> > >
> > > But isn't possible ignore assignment
> > >
> > > What do people think about it?
> >
> > I've been following this thread looking forward to the feature and was
> > all set to come in on the side of raising an exception here, but then I
> > tried it in a JS REPL:
> >
> > ; a = {}
> > {}
> > ; a['a'] = '100'
> > '100'
> > ; a['a'][1] = -1
> > -1
> > ; a
> > { a: '100' }
> >
> > ; b = {}
> > {}
> > ; b['b'] = 100
> > 100
> > ; b['b'][1] = -1
> > -1
> > ; b
> > { b: 100 }
> >
> > Even when the value shouldn't be subscriptable _at all_, the invalid
> > assignment is ignored silently. But since the patch follows some of
> > JavaScript's more idiosyncratic leads in other respects (e.g. padding
> > out arrays with nulls when something is inserted at a higher subscript),
> > the current behavior makes at least as much sense as JavaScript's
> > canonical behavior.
> >
> > There's also the bulk update case to think about. An error makes sense
> > when there's only one tuple being affected at a time, but with 1000
> > tuples, should a few no-ops where the JSON turns out to be a structural
> > mismatch stop the rest from changing correctly? What's the alternative?
> > The only answer I've got is double-checking the structure in the WHERE
> > clause, which seems like a lot of effort to go to for something that's
> > supposed to make working with JSON easier.
> >
> > Changing the surrounding structure (e.g. turning a['a'] into an array)
> > seems much more surprising than the no-op, and more likely to have
> > unforeseen consequences in client code working with the JSON. Ignoring
> > invalid assignments -- like JavaScript itself -- seems like the best
> > solution to me.
> >
>
> We don't need 100% compatibility in possible buggy behaviour.
>
> I very much disliked the situation when the system reports ok, but the
> operation was ignored. It is pretty hard to identify bugs in this
> system.
>
> What can be the benefit or use case for this behavior? JavaScript is
> designed for use in web browsers - and a lot of technologies there are
> fault tolerant. But this is a database. I would like to know about all
> the
> errors there.

I'm thinking of the update path as a kind of implicit schema. JSON is
intentionally not bound to any schema on creation, so I don't see a
failure to enforce another schema at runtime (and outside the WHERE
clause, at that) as an error exactly.

But I looked into the bulk case a little further, and "outside the
WHERE clause" cuts both ways. The server reports an update whether or
not the JSON could have been modified, which suggests triggers will
fire for no-op updates. That's more clearly a problem.

insert into j (val) values
 ('{"a": 100}'),
 ('{"a": "200"}'),
 ('{"b": "300"}'),
 ('{"c": {"d": 400}}'),
 ('{"a": {"z": 500}}');

INSERT 0 5
update j set val['a']['z'] = '600' returning *;
val 

 {"a": 100}
 {"a": "200"}
 {"a": {"z": 600}, "b": "300"}
 {"a": {"z": 600}, "c": {"d": 400}}
 {"a": {"z": 600}}
(5 rows)

*UPDATE 5*




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-09 Thread Pavel Stehule
so 9. 1. 2021 v 21:06 odesílatel Dian M Fay  napsal:

> On Thu Jan 7, 2021 at 3:24 AM EST, Pavel Stehule wrote:
> > čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> > napsal:
> >
> > > > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> > > >
> > > > this case should to raise exception - the value should be changed or
> > > error
> > > > should be raised
> > > >
> > > > postgres=# insert into foo values('{}');
> > > > postgres=# update foo set a['a'] = '100';
> > > > postgres=# update foo set a['a'][1] = '-1';
> > > > postgres=# select * from foo;
> > > > ┌┐
> > > > │ a  │
> > > > ╞╡
> > > > │ {"a": 100} │
> > > > └┘
> > >
> > > I was expecting this question, as I've left this like that
> intentionally
> > > because of two reasons:
> > >
> > > * Opposite to other changes, to implement this one we need to introduce
> > >   a condition more interfering with normal processing, which raises
> > >   performance issues for already existing functionality in jsonb_set.
> > >
> > > * I vaguely recall there was a similar discussion about jsonb_set with
> > >   the similar solution.
> > >
> >
> > ok.
> >
> > In this case I have a strong opinion so current behavior is wrong. It
> > can
> > mask errors. There are two correct possibilities
> >
> > 1. raising error - because the update try to apply index on scalar value
> >
> > 2. replace by array - a = {NULL, -1}
> >
> > But isn't possible ignore assignment
> >
> > What do people think about it?
>
> I've been following this thread looking forward to the feature and was
> all set to come in on the side of raising an exception here, but then I
> tried it in a JS REPL:
>
> ; a = {}
> {}
> ; a['a'] = '100'
> '100'
> ; a['a'][1] = -1
> -1
> ; a
> { a: '100' }
>
> ; b = {}
> {}
> ; b['b'] = 100
> 100
> ; b['b'][1] = -1
> -1
> ; b
> { b: 100 }
>
> Even when the value shouldn't be subscriptable _at all_, the invalid
> assignment is ignored silently. But since the patch follows some of
> JavaScript's more idiosyncratic leads in other respects (e.g. padding
> out arrays with nulls when something is inserted at a higher subscript),
> the current behavior makes at least as much sense as JavaScript's
> canonical behavior.
>
> There's also the bulk update case to think about. An error makes sense
> when there's only one tuple being affected at a time, but with 1000
> tuples, should a few no-ops where the JSON turns out to be a structural
> mismatch stop the rest from changing correctly? What's the alternative?
> The only answer I've got is double-checking the structure in the WHERE
> clause, which seems like a lot of effort to go to for something that's
> supposed to make working with JSON easier.
>
> Changing the surrounding structure (e.g. turning a['a'] into an array)
> seems much more surprising than the no-op, and more likely to have
> unforeseen consequences in client code working with the JSON. Ignoring
> invalid assignments -- like JavaScript itself -- seems like the best
> solution to me.
>

We don't need 100% compatibility in possible buggy behaviour.

I very much disliked the situation when the system reports ok, but the
operation was ignored. It is pretty hard to identify bugs in this system.

What can be the benefit or use case for this behavior? JavaScript is
designed for use in web browsers - and a lot of technologies there are
fault tolerant. But this is a database. I would like to know about all the
errors there.


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-09 Thread Dian M Fay
On Thu Jan 7, 2021 at 3:24 AM EST, Pavel Stehule wrote:
> čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
>
> > > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> > >
> > > this case should to raise exception - the value should be changed or
> > error
> > > should be raised
> > >
> > > postgres=# insert into foo values('{}');
> > > postgres=# update foo set a['a'] = '100';
> > > postgres=# update foo set a['a'][1] = '-1';
> > > postgres=# select * from foo;
> > > ┌┐
> > > │ a  │
> > > ╞╡
> > > │ {"a": 100} │
> > > └┘
> >
> > I was expecting this question, as I've left this like that intentionally
> > because of two reasons:
> >
> > * Opposite to other changes, to implement this one we need to introduce
> >   a condition more interfering with normal processing, which raises
> >   performance issues for already existing functionality in jsonb_set.
> >
> > * I vaguely recall there was a similar discussion about jsonb_set with
> >   the similar solution.
> >
>
> ok.
>
> In this case I have a strong opinion so current behavior is wrong. It
> can
> mask errors. There are two correct possibilities
>
> 1. raising error - because the update try to apply index on scalar value
>
> 2. replace by array - a = {NULL, -1}
>
> But isn't possible ignore assignment
>
> What do people think about it?

I've been following this thread looking forward to the feature and was
all set to come in on the side of raising an exception here, but then I
tried it in a JS REPL:

; a = {}
{}
; a['a'] = '100'
'100'
; a['a'][1] = -1
-1
; a
{ a: '100' }

; b = {}
{}
; b['b'] = 100
100
; b['b'][1] = -1
-1
; b
{ b: 100 }

Even when the value shouldn't be subscriptable _at all_, the invalid
assignment is ignored silently. But since the patch follows some of
JavaScript's more idiosyncratic leads in other respects (e.g. padding
out arrays with nulls when something is inserted at a higher subscript),
the current behavior makes at least as much sense as JavaScript's
canonical behavior.

There's also the bulk update case to think about. An error makes sense
when there's only one tuple being affected at a time, but with 1000
tuples, should a few no-ops where the JSON turns out to be a structural
mismatch stop the rest from changing correctly? What's the alternative?
The only answer I've got is double-checking the structure in the WHERE
clause, which seems like a lot of effort to go to for something that's
supposed to make working with JSON easier.

Changing the surrounding structure (e.g. turning a['a'] into an array)
seems much more surprising than the no-op, and more likely to have
unforeseen consequences in client code working with the JSON. Ignoring
invalid assignments -- like JavaScript itself -- seems like the best
solution to me.




Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-07 Thread Pavel Stehule
čt 7. 1. 2021 v 9:15 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
> >
> > this case should to raise exception - the value should be changed or
> error
> > should be raised
> >
> > postgres=# insert into foo values('{}');
> > postgres=# update foo set a['a'] = '100';
> > postgres=# update foo set a['a'][1] = '-1';
> > postgres=# select * from foo;
> > ┌┐
> > │ a  │
> > ╞╡
> > │ {"a": 100} │
> > └┘
>
> I was expecting this question, as I've left this like that intentionally
> because of two reasons:
>
> * Opposite to other changes, to implement this one we need to introduce
>   a condition more interfering with normal processing, which raises
>   performance issues for already existing functionality in jsonb_set.
>
> * I vaguely recall there was a similar discussion about jsonb_set with
>   the similar solution.
>

ok.

In this case I have a strong opinion so current behavior is wrong. It can
mask errors. There are two correct possibilities

1. raising error - because the update try to apply index on scalar value

2. replace by array - a = {NULL, -1}

But isn't possible ignore assignment

What do people think about it?




> For the references what I mean I've attached the third patch, which does
> this. My opinion would be to not consider it, but I'm fine leaving this
> decision to committer.
>


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-07 Thread Dmitry Dolgov
> On Wed, Jan 06, 2021 at 09:22:53PM +0100, Pavel Stehule wrote:
>
> this case should to raise exception - the value should be changed or error
> should be raised
>
> postgres=# insert into foo values('{}');
> postgres=# update foo set a['a'] = '100';
> postgres=# update foo set a['a'][1] = '-1';
> postgres=# select * from foo;
> ┌┐
> │ a  │
> ╞╡
> │ {"a": 100} │
> └┘

I was expecting this question, as I've left this like that intentionally
because of two reasons:

* Opposite to other changes, to implement this one we need to introduce
  a condition more interfering with normal processing, which raises
  performance issues for already existing functionality in jsonb_set.

* I vaguely recall there was a similar discussion about jsonb_set with
  the similar solution.

For the references what I mean I've attached the third patch, which does
this. My opinion would be to not consider it, but I'm fine leaving this
decision to committer.
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v44 1/3] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..279ff15ade 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = \
 	jsonb_op.o \
 	jsonb_util.o \
 	jsonfuncs.o \
+	jsonbsubs.o \
 	jsonpath.o \
 	jsonpath_exec.o \
 	jsonpath_gram.o \
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 4eeffa1424..41a1c1f9bb 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-06 Thread Pavel Stehule
Hi

út 5. 1. 2021 v 20:32 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Mon, Jan 04, 2021 at 06:56:17PM +0100, Pavel Stehule wrote:
> > po 4. 1. 2021 v 14:58 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> > napsal:
> > postgres=# update foo set a['c']['c'][10] = '10';
> > postgres=# update foo set a['c'][10][10] = '10';
>
> Yeah, there was one clumsy memory allocation. On the way I've found and
> fixed another issue with jsonb generation, right now I don't see any
> other problems. But as my imagination, despite all the sci-fi I've read
> this year, is apparently not so versatile, I'll rely on yours, could you
> please check this version again?
>

this case should to raise exception - the value should be changed or error
should be raised

postgres=# insert into foo values('{}');
INSERT 0 1
postgres=# update foo set a['a'] = '100';
UPDATE 1
postgres=# select * from foo;
┌┐
│ a  │
╞╡
│ {"a": 100} │
└┘
(1 row)

postgres=# update foo set a['a'][1] = '-1';
UPDATE 1
postgres=# select * from foo;
┌┐
│ a  │
╞╡
│ {"a": 100} │
└┘
(1 row)

Regards

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-05 Thread Dmitry Dolgov
> On Mon, Jan 04, 2021 at 06:56:17PM +0100, Pavel Stehule wrote:
> po 4. 1. 2021 v 14:58 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
> postgres=# update foo set a['c']['c'][10] = '10';
> postgres=# update foo set a['c'][10][10] = '10';

Yeah, there was one clumsy memory allocation. On the way I've found and
fixed another issue with jsonb generation, right now I don't see any
other problems. But as my imagination, despite all the sci-fi I've read
this year, is apparently not so versatile, I'll rely on yours, could you
please check this version again?
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v43 1/2] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..279ff15ade 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = \
 	jsonb_op.o \
 	jsonb_util.o \
 	jsonfuncs.o \
+	jsonbsubs.o \
 	jsonpath.o \
 	jsonpath_exec.o \
 	jsonpath_gram.o \
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 4eeffa1424..41a1c1f9bb 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -68,18 +68,29 @@ static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
 		JsonbIteratorToken seq,
 		JsonbValue *scalarVal);
 
+JsonbValue *
+JsonbToJsonbValue(Jsonb *jsonb)
+{
+	JsonbValue *val = (JsonbValue *) palloc(sizeof(JsonbValue));
+
+	val->type = jbvBinary;
+	val->val.binary.data = >root;
+	val->val.binary.len = VARSIZE(jsonb) - VARHDRSZ;
+
+	return val;
+}
+
 /*
  * Turn an 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-04 Thread Pavel Stehule
po 4. 1. 2021 v 14:58 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Sun, Jan 03, 2021 at 08:41:17PM +0100, Pavel Stehule wrote:
> >
> > probably some is wrong still
> >
> > create table foo(a jsonb);
> > update foo set a['a'] = '10';
> > update foo set a['b']['c'][1] = '10';
> > update foo set a['b']['c'][10] = '10'
>
> Thanks for noticing. Indeed, there was a subtle change of meaning for
> 'done' flag in setPath, which I haven't covered. Could you try this
> version?
>

sure

postgres=# insert into foo values('{}');
INSERT 0 1
postgres=# update foo set a['c']['c'][10] = '10';
UPDATE 1
postgres=# select * from foo;
┌┐
│   a
 │
╞╡
│ {"c": {"c": [null, null, null, null, null, null, null, null, null, null,
10]}} │
└┘
(1 row)

postgres=# update foo set a['c'][10][10] = '10';
WARNING:  problem in alloc set ExprContext: req size > alloc size for chunk
0x151b688 in block 0x151aa90
WARNING:  problem in alloc set ExprContext: bogus aset link in block
0x151aa90, chunk 0x151b688
WARNING:  problem in alloc set ExprContext: bad size 0 for chunk 0x151b8a0
in block 0x151aa90
WARNING:  problem in alloc set ExprContext: bad single-chunk 0x151b8b8 in
block 0x151aa90
WARNING:  problem in alloc set ExprContext: bogus aset link in block
0x151aa90, chunk 0x151b8b8
WARNING:  problem in alloc set ExprContext: found inconsistent memory block
0x151aa90
UPDATE 1


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-04 Thread Dmitry Dolgov
> On Sun, Jan 03, 2021 at 08:41:17PM +0100, Pavel Stehule wrote:
>
> probably some is wrong still
>
> create table foo(a jsonb);
> update foo set a['a'] = '10';
> update foo set a['b']['c'][1] = '10';
> update foo set a['b']['c'][10] = '10'

Thanks for noticing. Indeed, there was a subtle change of meaning for
'done' flag in setPath, which I haven't covered. Could you try this
version?
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v42 1/2] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..279ff15ade 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = \
 	jsonb_op.o \
 	jsonb_util.o \
 	jsonfuncs.o \
+	jsonbsubs.o \
 	jsonpath.o \
 	jsonpath_exec.o \
 	jsonpath_gram.o \
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 4eeffa1424..41a1c1f9bb 100644
--- a/src/backend/utils/adt/jsonb_util.c
+++ b/src/backend/utils/adt/jsonb_util.c
@@ -68,18 +68,29 @@ static JsonbValue *pushJsonbValueScalar(JsonbParseState **pstate,
 		JsonbIteratorToken seq,
 		JsonbValue *scalarVal);
 
+JsonbValue *
+JsonbToJsonbValue(Jsonb *jsonb)
+{
+	JsonbValue *val = (JsonbValue *) palloc(sizeof(JsonbValue));
+
+	val->type = jbvBinary;
+	val->val.binary.data = >root;
+	val->val.binary.len = VARSIZE(jsonb) - VARHDRSZ;
+
+	return val;
+}
+
 /*
  * Turn an in-memory JsonbValue into a Jsonb for on-disk storage.
  *
- * There isn't a JsonbToJsonbValue(), because generally we find it more
- * convenient to directly iterate through the Jsonb 

Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-03 Thread Pavel Stehule
Hi

probably some is wrong still

create table foo(a jsonb);
update foo set a['a'] = '10';
update foo set a['b']['c'][1] = '10';
update foo set a['b']['c'][10] = '10'

WARNING:  problem in alloc set ExprContext: req size > alloc size for chunk
0x256dd88 in block 0x256d160
WARNING:  problem in alloc set ExprContext: bogus aset link in block
0x256d160, chunk 0x256dd88
WARNING:  problem in alloc set ExprContext: req size > alloc size for chunk
0x256dfa0 in block 0x256d160
WARNING:  problem in alloc set ExprContext: bogus aset link in block
0x256d160, chunk 0x256dfa0
WARNING:  problem in alloc set ExprContext: req size > alloc size for chunk
0x256dfc4 in block 0x256d160
WARNING:  problem in alloc set ExprContext: bad single-chunk 0x256dfc4 in
block 0x256d160
WARNING:  problem in alloc set ExprContext: bogus aset link in block
0x256d160, chunk 0x256dfc4
WARNING:  problem in alloc set ExprContext: found inconsistent memory block
0x256d160
UPDATE 1

and result is wrong, the value of a['b']['c'][1] was overwritten by NULL

Regards

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2021-01-02 Thread Dmitry Dolgov
> On Thu, Dec 31, 2020 at 08:21:55PM +0100, Pavel Stehule wrote:
> čt 31. 12. 2020 v 15:27 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
>
> the tests passed and filling gaps works well
>
> but creating empty objects doesn't work
>
> create table foo(a jsonb);
>
> insert into foo values('{}');
>
> postgres=# update foo set a['k'][1] = '20';
> UPDATE 1
> postgres=# select * from foo;
> ┌───┐
> │ a │
> ╞═══╡
> │ {"k": [null, 20]} │
> └───┘
> (1 row)
>
> it is ok
>
> postgres=# update foo set a['k3'][10] = '20';
> UPDATE 1
> postgres=# select * from foo;
> ┌───┐
> │ a │
> ╞═══╡
> │ {"k": [null, 20]} │
> └───┘
> (1 row)
>
> the second update was not successful

Right, it was working only if the source level is empty, thanks for
checking. I've found a bit more time and prepared more decent version
which covers all the cases I could come up with following the same
implementation logic. The first patch is the same though.
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v41 1/2] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 82732146d3..279ff15ade 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = \
 	jsonb_op.o \
 	jsonb_util.o \
 	jsonfuncs.o \
+	jsonbsubs.o \
 	jsonpath.o \
 	jsonpath_exec.o \
 	jsonpath_gram.o \
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 4eeffa1424..41a1c1f9bb 100644
--- 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-31 Thread Pavel Stehule
čt 31. 12. 2020 v 15:27 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Wed, Dec 30, 2020 at 09:01:37PM +0100, Dmitry Dolgov wrote:
> > > make check fails
> >
> > Yeah, apparently I forgot to enable asserts back after the last
> > benchmarking discussion, and missed some of those. Will fix.
> >
> > > 2. The index position was ignored.
> > >
> > > postgres=# update foo set a['a'][10] = '20';
> > > UPDATE 1
> > > postgres=# select * from foo;
> > > ┌─┐
> > > │  a  │
> > > ╞═╡
> > > │ {"a": [20]} │
> > > └─┘
> > > (1 row)
> >
> > I just realized I haven't included "filling the gaps" part, that's why
> > it works as before. Can add this too.
> >
> > > 1. quietly ignored update
> > >
> > > postgres=# update foo set a['a'][10] = '20';
> > > UPDATE 1
> > > postgres=# select * from foo;
> > > ┌┐
> > > │ a  │
> > > ╞╡
> > > │ {} │
> > > └┘
> > > (1 row)
> >
> > This belongs to the original jsonb_set implementation. Although if we
> > started to change it anyway with "filling the gaps", maybe it's fine to
> > add one more flag to tune its behaviour in this case as well. I can
> > check how complicated that could be.
>
> Here is what I had in mind. Assert issue in main patch is fixed (nothing
> serious, it was just the rawscalar check for an empty jsonb created
> during assignment), and the second patch contains all the bits with
> "filling the gaps" including your suggestion about creating the whole
> path if it's not present. The latter (creating the chain of empty
> objects) I haven't tested that much, but if there are any issues or
> concerns I guess it will not prevent the main patch from going forward


the tests passed and filling gaps works well

but creating empty objects doesn't work

create table foo(a jsonb);

insert into foo values('{}');

postgres=# update foo set a['k'][1] = '20';
UPDATE 1
postgres=# select * from foo;
┌───┐
│ a │
╞═══╡
│ {"k": [null, 20]} │
└───┘
(1 row)

it is ok

postgres=# update foo set a['k3'][10] = '20';
UPDATE 1
postgres=# select * from foo;
┌───┐
│ a │
╞═══╡
│ {"k": [null, 20]} │
└───┘
(1 row)

the second update was not successful



.
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-31 Thread Dmitry Dolgov
> On Wed, Dec 30, 2020 at 09:01:37PM +0100, Dmitry Dolgov wrote:
> > make check fails
>
> Yeah, apparently I forgot to enable asserts back after the last
> benchmarking discussion, and missed some of those. Will fix.
>
> > 2. The index position was ignored.
> >
> > postgres=# update foo set a['a'][10] = '20';
> > UPDATE 1
> > postgres=# select * from foo;
> > ┌─┐
> > │  a  │
> > ╞═╡
> > │ {"a": [20]} │
> > └─┘
> > (1 row)
>
> I just realized I haven't included "filling the gaps" part, that's why
> it works as before. Can add this too.
>
> > 1. quietly ignored update
> >
> > postgres=# update foo set a['a'][10] = '20';
> > UPDATE 1
> > postgres=# select * from foo;
> > ┌┐
> > │ a  │
> > ╞╡
> > │ {} │
> > └┘
> > (1 row)
>
> This belongs to the original jsonb_set implementation. Although if we
> started to change it anyway with "filling the gaps", maybe it's fine to
> add one more flag to tune its behaviour in this case as well. I can
> check how complicated that could be.

Here is what I had in mind. Assert issue in main patch is fixed (nothing
serious, it was just the rawscalar check for an empty jsonb created
during assignment), and the second patch contains all the bits with
"filling the gaps" including your suggestion about creating the whole
path if it's not present. The latter (creating the chain of empty
objects) I haven't tested that much, but if there are any issues or
concerns I guess it will not prevent the main patch from going forward.
>From c9143a620497dac5615c4de1d9349684e9af95b5 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v40 1/2] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 413 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 982 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-30 Thread Dmitry Dolgov
> On Wed, Dec 30, 2020 at 07:48:57PM +0100, Pavel Stehule wrote:
> st 30. 12. 2020 v 14:46 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
> napsal:
>
> > > On Wed, Dec 30, 2020 at 02:45:12PM +0100, Dmitry Dolgov wrote:
> > > > On Sat, Dec 26, 2020 at 01:24:04PM -0500, Tom Lane wrote:
> > > >
> > > > In a case like jsonpath['...'], the initially UNKNOWN-type literal
> > could
> > > > in theory be coerced to any of these types, so you'd have to resolve
> > that
> > > > case manually.  The overloaded-function code has an internal preference
> > > > that makes it choose TEXT if it has a choice of TEXT or some other
> > target
> > > > type for an UNKNOWN input (cf parse_func.c starting about line 1150),
> > but
> > > > if you ask can_coerce_type() it's going to say TRUE for all three
> > cases.
> > > >
> > > > Roughly speaking, then, I think what you want to do is
> > > >
> > > > 1. If input type is UNKNOWNOID, choose result type TEXT.
> > > >
> > > > 2. Otherwise, apply can_coerce_type() to see if the input type can be
> > > > coerced to int4, text, or jsonpath.  If it succeeds for none or more
> > > > than one of these, throw error.  Otherwise choose the single successful
> > > > type.
> > > >
> > > > 3. Apply coerce_type() to coerce to the chosen result type.
> > > >
> > > > 4. At runtime, examine exprType() of the input to figure out what to
> > do.
> > >
> > > Thanks, that was super useful. Following this suggestion I've made
> > > necessary adjustments for the patch. There is no jsonpath support, but
> > > this could be easily added on top.
> >
> > And the forgotten patch itself.
> >
>
> make check fails

Yeah, apparently I forgot to enable asserts back after the last
benchmarking discussion, and missed some of those. Will fix.

> 2. The index position was ignored.
>
> postgres=# update foo set a['a'][10] = '20';
> UPDATE 1
> postgres=# select * from foo;
> ┌─┐
> │  a  │
> ╞═╡
> │ {"a": [20]} │
> └─┘
> (1 row)

I just realized I haven't included "filling the gaps" part, that's why
it works as before. Can add this too.

> 1. quietly ignored update
>
> postgres=# update foo set a['a'][10] = '20';
> UPDATE 1
> postgres=# select * from foo;
> ┌┐
> │ a  │
> ╞╡
> │ {} │
> └┘
> (1 row)

This belongs to the original jsonb_set implementation. Although if we
started to change it anyway with "filling the gaps", maybe it's fine to
add one more flag to tune its behaviour in this case as well. I can
check how complicated that could be.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-30 Thread Pavel Stehule
st 30. 12. 2020 v 14:46 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Wed, Dec 30, 2020 at 02:45:12PM +0100, Dmitry Dolgov wrote:
> > > On Sat, Dec 26, 2020 at 01:24:04PM -0500, Tom Lane wrote:
> > >
> > > In a case like jsonpath['...'], the initially UNKNOWN-type literal
> could
> > > in theory be coerced to any of these types, so you'd have to resolve
> that
> > > case manually.  The overloaded-function code has an internal preference
> > > that makes it choose TEXT if it has a choice of TEXT or some other
> target
> > > type for an UNKNOWN input (cf parse_func.c starting about line 1150),
> but
> > > if you ask can_coerce_type() it's going to say TRUE for all three
> cases.
> > >
> > > Roughly speaking, then, I think what you want to do is
> > >
> > > 1. If input type is UNKNOWNOID, choose result type TEXT.
> > >
> > > 2. Otherwise, apply can_coerce_type() to see if the input type can be
> > > coerced to int4, text, or jsonpath.  If it succeeds for none or more
> > > than one of these, throw error.  Otherwise choose the single successful
> > > type.
> > >
> > > 3. Apply coerce_type() to coerce to the chosen result type.
> > >
> > > 4. At runtime, examine exprType() of the input to figure out what to
> do.
> >
> > Thanks, that was super useful. Following this suggestion I've made
> > necessary adjustments for the patch. There is no jsonpath support, but
> > this could be easily added on top.
>
> And the forgotten patch itself.
>

make check fails

But I dislike two issues

1. quietly ignored update

postgres=# update foo set a['a'][10] = '20';
UPDATE 1
postgres=# select * from foo;
┌┐
│ a  │
╞╡
│ {} │
└┘
(1 row)

The value should be modified or there should be an error (but I prefer
implicit creating nested empty objects when it is necessary).

update foo set a['a'] = '[]';

2. The index position was ignored.

postgres=# update foo set a['a'][10] = '20';
UPDATE 1
postgres=# select * from foo;
┌─┐
│  a  │
╞═╡
│ {"a": [20]} │
└─┘
(1 row)

Notes:

1. It is very nice so casts are supported. I wrote int2jsonb cast and it
was working. Maybe we can create buildin casts for int, bigint, numeric,
boolean, date, timestamp to jsonb.

Regards

Pavel


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-30 Thread Dmitry Dolgov
> On Wed, Dec 30, 2020 at 02:45:12PM +0100, Dmitry Dolgov wrote:
> > On Sat, Dec 26, 2020 at 01:24:04PM -0500, Tom Lane wrote:
> >
> > In a case like jsonpath['...'], the initially UNKNOWN-type literal could
> > in theory be coerced to any of these types, so you'd have to resolve that
> > case manually.  The overloaded-function code has an internal preference
> > that makes it choose TEXT if it has a choice of TEXT or some other target
> > type for an UNKNOWN input (cf parse_func.c starting about line 1150), but
> > if you ask can_coerce_type() it's going to say TRUE for all three cases.
> >
> > Roughly speaking, then, I think what you want to do is
> >
> > 1. If input type is UNKNOWNOID, choose result type TEXT.
> >
> > 2. Otherwise, apply can_coerce_type() to see if the input type can be
> > coerced to int4, text, or jsonpath.  If it succeeds for none or more
> > than one of these, throw error.  Otherwise choose the single successful
> > type.
> >
> > 3. Apply coerce_type() to coerce to the chosen result type.
> >
> > 4. At runtime, examine exprType() of the input to figure out what to do.
>
> Thanks, that was super useful. Following this suggestion I've made
> necessary adjustments for the patch. There is no jsonpath support, but
> this could be easily added on top.

And the forgotten patch itself.
>From 7e2fa3dc4a8b1f907a385451c6782f2dfdc743ab Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v39] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 -
 src/backend/utils/adt/jsonbsubs.c   | 412 
 src/backend/utils/adt/jsonfuncs.c   | 180 ++--
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 272 +-
 src/test/regress/sql/jsonb.sql  |  84 +-
 10 files changed, 981 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..100d1a60f4 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 
+  jsonb Subscripting
+  
+   jsonb data type supports array-style subscripting expressions
+   to extract or update particular elements. It's possible to use multiple
+   subscripting expressions to extract nested values. In this case, a chain of
+   subscripting expressions follows the same rules as the
+   path argument in jsonb_set function,
+   e.g. in case of arrays it is a 0-based operation or that negative integers
+   that appear in path count from the end of JSON arrays.
+   The result of subscripting expressions is always jsonb data type. An
+   example of subscripting syntax:
+
+-- Extract value by key
+SELECT ('{"a": 1}'::jsonb)['a'];
+
+-- Extract nested value by key path
+SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];
+
+-- Extract element by index
+SELECT ('[1, "2", null]'::jsonb)[1];
+
+-- Update value by key, note the single quotes - the assigned value
+-- needs to be of jsonb type as well
+UPDATE table_name SET jsonb_field['key'] = '1';
+
+-- Select records using where clause with subscripting. Since the result of
+-- subscripting is jsonb and we basically want to compare two jsonb objects, we
+-- need to put the value in double quotes to be able to convert it to jsonb.
+SELECT * FROM table_name WHERE jsonb_field['key'] = '"value"';
+
+
+  Subscripting for jsonb does not support slice expressions,
+  even if it contains an array.
+
+  In case if source jsonb is NULL, assignment
+  via subscripting will proceed as if it was an empty JSON object:
+
+-- If jsonb_field here is NULL, the result is {"a": 1}
+UPDATE table_name SET jsonb_field['a'] = '1';
+
+-- If jsonb_field here is NULL, the result is [1]
+UPDATE table_name SET jsonb_field[0] = '1';
+
+
+  
+ 
+
  
   Transforms
 
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index ce09ad7375..0ef0f36e36 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -50,6 +50,7 @@ OBJS = 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-30 Thread Dmitry Dolgov
> On Sat, Dec 26, 2020 at 01:24:04PM -0500, Tom Lane wrote:
>
> In a case like jsonpath['...'], the initially UNKNOWN-type literal could
> in theory be coerced to any of these types, so you'd have to resolve that
> case manually.  The overloaded-function code has an internal preference
> that makes it choose TEXT if it has a choice of TEXT or some other target
> type for an UNKNOWN input (cf parse_func.c starting about line 1150), but
> if you ask can_coerce_type() it's going to say TRUE for all three cases.
>
> Roughly speaking, then, I think what you want to do is
>
> 1. If input type is UNKNOWNOID, choose result type TEXT.
>
> 2. Otherwise, apply can_coerce_type() to see if the input type can be
> coerced to int4, text, or jsonpath.  If it succeeds for none or more
> than one of these, throw error.  Otherwise choose the single successful
> type.
>
> 3. Apply coerce_type() to coerce to the chosen result type.
>
> 4. At runtime, examine exprType() of the input to figure out what to do.

Thanks, that was super useful. Following this suggestion I've made
necessary adjustments for the patch. There is no jsonpath support, but
this could be easily added on top.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-26 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Tue, Dec 22, 2020 at 02:21:22PM -0500, Tom Lane wrote:
>> We do have precedent for this, it's the rules about resolving argument
>> types for overloaded functions.  But the conclusion that that precedent
>> leads to is that we should check whether the subscript expression can
>> be *implicitly* coerced to either integer or text, and fail if neither
>> coercion or both coercions succeed.

> I'm not sure I completely follow and can't immediately find the relevant
> code for overloaded functions, so I need to do a perception check.
> Following this design in jsonb_subscripting_transform we try to coerce
> the subscription expression to both integer and text (and maybe even to
> jsonpath), and based on the result of which coercion has succeeded chose
> different logic to handle it, right?

Right, with the important proviso that the coercion strength is 
COERCION_IMPLICIT not COERCION_ASSIGNMENT.

> And just for me to understand. In the above example of the overloaded
> function, with the integer we can coerce it only to text (since original
> type of the expression is integer), and with the bigint it could be
> coerced both to integer and text, that's why failure, isn't?

No, there's no such IMPLICIT-level casts.  Coercing bigint down to int
is only allowed at ASSIGNMENT or higher coercion strength.

In a case like jsonpath['...'], the initially UNKNOWN-type literal could
in theory be coerced to any of these types, so you'd have to resolve that
case manually.  The overloaded-function code has an internal preference
that makes it choose TEXT if it has a choice of TEXT or some other target
type for an UNKNOWN input (cf parse_func.c starting about line 1150), but
if you ask can_coerce_type() it's going to say TRUE for all three cases.

Roughly speaking, then, I think what you want to do is

1. If input type is UNKNOWNOID, choose result type TEXT.

2. Otherwise, apply can_coerce_type() to see if the input type can be
coerced to int4, text, or jsonpath.  If it succeeds for none or more
than one of these, throw error.  Otherwise choose the single successful
type.

3. Apply coerce_type() to coerce to the chosen result type.

4. At runtime, examine exprType() of the input to figure out what to do.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-25 Thread Dmitry Dolgov
> On Tue, Dec 22, 2020 at 02:21:22PM -0500, Tom Lane wrote:
> Pavel Stehule  writes:
> > But maybe we try to design some that are designed already. Is there some
> > info about index specification in SQL/JSON?
>
> We do have precedent for this, it's the rules about resolving argument
> types for overloaded functions.  But the conclusion that that precedent
> leads to is that we should check whether the subscript expression can
> be *implicitly* coerced to either integer or text, and fail if neither
> coercion or both coercions succeed.  I'd be okay with that from a system
> design standpoint, but it's hard to say without trying it whether it
> will work out nicely from a usability standpoint.  In a quick trial
> it seems it might be okay:
>
> regression=# create function mysub(int) returns text language sql
> regression-# as $$select 'int'$$;
> CREATE FUNCTION
> regression=# create function mysub(text) returns text language sql
> as $$select 'text'$$;
> CREATE FUNCTION
> regression=# select mysub(42);
>  mysub
> ---
>  int
> (1 row)
>
> regression=# select mysub('foo');
>  mysub
> ---
>  text
> (1 row)
>
> regression=# select mysub(42::bigint);
> ERROR:  function mysub(bigint) does not exist

I'm not sure I completely follow and can't immediately find the relevant
code for overloaded functions, so I need to do a perception check.
Following this design in jsonb_subscripting_transform we try to coerce
the subscription expression to both integer and text (and maybe even to
jsonpath), and based on the result of which coercion has succeeded chose
different logic to handle it, right?

And just for me to understand. In the above example of the overloaded
function, with the integer we can coerce it only to text (since original
type of the expression is integer), and with the bigint it could be
coerced both to integer and text, that's why failure, isn't?




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Tom Lane
Pavel Stehule  writes:
> But maybe we try to design some that are designed already. Is there some
> info about index specification in SQL/JSON?

We do have precedent for this, it's the rules about resolving argument
types for overloaded functions.  But the conclusion that that precedent
leads to is that we should check whether the subscript expression can
be *implicitly* coerced to either integer or text, and fail if neither
coercion or both coercions succeed.  I'd be okay with that from a system
design standpoint, but it's hard to say without trying it whether it
will work out nicely from a usability standpoint.  In a quick trial
it seems it might be okay:

regression=# create function mysub(int) returns text language sql
regression-# as $$select 'int'$$;
CREATE FUNCTION
regression=# create function mysub(text) returns text language sql
as $$select 'text'$$;
CREATE FUNCTION
regression=# select mysub(42);
 mysub 
---
 int
(1 row)

regression=# select mysub('foo');
 mysub 
---
 text
(1 row)

But there are definitely cases that will fail when an assignment
coercion would have succeeded, eg

regression=# select mysub(42::bigint);
ERROR:  function mysub(bigint) does not exist

Maybe that's okay.  (As I said earlier, we can't use assignment
coercion when there's two possible coercion targets, because it'd
be too likely that they both succeed.)

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Pavel Stehule
út 22. 12. 2020 v 18:35 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Tue, Dec 22, 2020 at 11:57:13AM -0500, Tom Lane wrote:
> > Dmitry Dolgov <9erthali...@gmail.com> writes:
> > > On Tue, Dec 22, 2020 at 12:19:26PM +0100, Pavel Stehule wrote:
> > >> I expect behave like
> > >>
> > >> update x set test[1] = 10; --> "[10]";
> > >> update x set test['1'] = 10; --> "{"1": 10}"
> >
> > > Yes, I also was thinking about this because such behaviour is more
> > > natural.
> >
> > I continue to feel that this is a fundamentally bad idea that will
> > lead to much more pain than benefit.  People are going to want to
> > know why "test[1.0]" doesn't act like "test[1]".  They are going
> > to complain because "test[$1]" acts so much differently depending
> > on whether they assigned a type to the $1 parameter or not.  And
> > they are going to bitch because dumping and reloading a rule causes
> > it to do something different than it did before --- or at least we'd
> > be at horrid risk of that; only if we hide the injected cast-to-text
> > doesd the dumped rule look the way it needs to.  Even then, the whole
> > thing is critically dependent on the fact that integer-type constants
> > are written and displayed differently from other constants, so it
> > won't scale to any other type that someone might want to treat specially.
> > So you're just leading datatype designers down a garden path that will be
> > a dead end for many of them.
> >
> > IMO this isn't actually any saner than your previous iterations
> > on the idea.
>
> Ok. While I don't have any preferences here, we can disregard the last
> posted patch (extended-with-subscript-type) and consider only
> v38-0001-Subscripting-for-jsonb version.
>

There are two parts - fetching and setting.

Probably there can be an agreement on fetching part: if index text is
JSONPath expression, use jsonb_path_query, else use jsonb_extract_path.
The setting should be the same in the inverse direction.

I like the behavior of jsonb_extract_path - it has intuitive behaviour and
we should use it.


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Dmitry Dolgov
> On Tue, Dec 22, 2020 at 11:57:13AM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > On Tue, Dec 22, 2020 at 12:19:26PM +0100, Pavel Stehule wrote:
> >> I expect behave like
> >>
> >> update x set test[1] = 10; --> "[10]";
> >> update x set test['1'] = 10; --> "{"1": 10}"
>
> > Yes, I also was thinking about this because such behaviour is more
> > natural.
>
> I continue to feel that this is a fundamentally bad idea that will
> lead to much more pain than benefit.  People are going to want to
> know why "test[1.0]" doesn't act like "test[1]".  They are going
> to complain because "test[$1]" acts so much differently depending
> on whether they assigned a type to the $1 parameter or not.  And
> they are going to bitch because dumping and reloading a rule causes
> it to do something different than it did before --- or at least we'd
> be at horrid risk of that; only if we hide the injected cast-to-text
> doesd the dumped rule look the way it needs to.  Even then, the whole
> thing is critically dependent on the fact that integer-type constants
> are written and displayed differently from other constants, so it
> won't scale to any other type that someone might want to treat specially.
> So you're just leading datatype designers down a garden path that will be
> a dead end for many of them.
>
> IMO this isn't actually any saner than your previous iterations
> on the idea.

Ok. While I don't have any preferences here, we can disregard the last
posted patch (extended-with-subscript-type) and consider only
v38-0001-Subscripting-for-jsonb version.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Pavel Stehule
út 22. 12. 2020 v 17:57 odesílatel Tom Lane  napsal:

> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > On Tue, Dec 22, 2020 at 12:19:26PM +0100, Pavel Stehule wrote:
> >> I expect behave like
> >>
> >> update x set test[1] = 10; --> "[10]";
> >> update x set test['1'] = 10; --> "{"1": 10}"
>
> > Yes, I also was thinking about this because such behaviour is more
> > natural.
>
> I continue to feel that this is a fundamentally bad idea that will
> lead to much more pain than benefit.  People are going to want to
> know why "test[1.0]" doesn't act like "test[1]".  They are going
> to complain because "test[$1]" acts so much differently depending
> on whether they assigned a type to the $1 parameter or not.  And
> they are going to bitch because dumping and reloading a rule causes
> it to do something different than it did before --- or at least we'd
> be at horrid risk of that; only if we hide the injected cast-to-text
> doesd the dumped rule look the way it needs to.  Even then, the whole
> thing is critically dependent on the fact that integer-type constants
> are written and displayed differently from other constants, so it
> won't scale to any other type that someone might want to treat specially.
> So you're just leading datatype designers down a garden path that will be
> a dead end for many of them.
>
> IMO this isn't actually any saner than your previous iterations
> on the idea.
>

I think so there can be some logic. But json has two kinds of very
different objects - objects and arrays, and we should support both.

can be a good solution based on initial source value?

DECLARE v jsonb;
BEGIN
  v := '[]';
  v[1] := 10; v[2] := 20; -- v = "[10,20]"
  v['a'] := 30; --> should to raise an error
  v := '{}';
  v[1] := 10; v[2] := 20; -- v = "{"1": 10, "2":20}"
  v['a'] := 30; -- v = "{"1": 10, "2":20, "a": 30}"

When the source variable is null, then the default behavior can be the same
like json objects. But it doesn't solve well numeric indexes.

because

v := '[]'
v[1.5] := 100; -- it is nonsense

but
v := '{}'
v[1.5] := 100; -- v = "{"1.5":100}" -- and this can have good benefit, but
"1" and "1.0" are different keys.

But maybe we try to design some that are designed already. Is there some
info about index specification in SQL/JSON?

Regards

Pavel







regards, tom lane
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
> On Tue, Dec 22, 2020 at 12:19:26PM +0100, Pavel Stehule wrote:
>> I expect behave like
>> 
>> update x set test[1] = 10; --> "[10]";
>> update x set test['1'] = 10; --> "{"1": 10}"

> Yes, I also was thinking about this because such behaviour is more
> natural.

I continue to feel that this is a fundamentally bad idea that will
lead to much more pain than benefit.  People are going to want to
know why "test[1.0]" doesn't act like "test[1]".  They are going
to complain because "test[$1]" acts so much differently depending
on whether they assigned a type to the $1 parameter or not.  And
they are going to bitch because dumping and reloading a rule causes
it to do something different than it did before --- or at least we'd
be at horrid risk of that; only if we hide the injected cast-to-text
doesd the dumped rule look the way it needs to.  Even then, the whole
thing is critically dependent on the fact that integer-type constants
are written and displayed differently from other constants, so it
won't scale to any other type that someone might want to treat specially.
So you're just leading datatype designers down a garden path that will be
a dead end for many of them.

IMO this isn't actually any saner than your previous iterations
on the idea.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Dmitry Dolgov
> On Tue, Dec 22, 2020 at 12:19:26PM +0100, Pavel Stehule wrote:
>
> > Here is the new version of jsonb subscripting rebased on the committed
> > infrastructure patch. I hope it will not introduce any confusion with
> > the previously posted patched in this thread (about alter type subscript
> > and hstore) as they are independent.
> >
> > There are few differences from the previous version:
> >
> > * No limit on number of subscripts for jsonb (as there is no intrinsic
> >   limitation of this kind for jsonb).
> >
> > * In case of assignment via subscript now it expects the replace value
> >   to be of jsonb type.
> >
> > * Similar to the implementation for arrays, if the source jsonb is NULL,
> >   it will be replaced by an empty jsonb and the new value will be
> >   assigned to it. This means:
> >
> > =# select * from test_jsonb_subscript where id = 3;
> >  id | test_json
> > +---
> >   3 | NULL
> >
> > =# update test_jsonb_subscript set test_json['a'] = '1' where id =
> > 3;
> > UPDATE 1
> >
> > =# select * from test_jsonb_subscript where id = 3;
> >  id | test_json
> > +---
> >   3 | {"a": 1}
> >
> >   and similar:
> >
> > =# select * from test_jsonb_subscript where id = 3;
> >  id | test_json
> > +---
> >   3 | NULL
> >
> > =# update test_jsonb_subscript set test_json[1] = '1' where id = 3;
> > UPDATE 1
> >
> > =# select * from test_jsonb_subscript where id = 3;
> >  id | test_json
> > +---
> >   3 | {"1": 1}
> >
> >   The latter is probably a bit strange looking, but if there are any
> > concerns
> >   about this part (and in general about an assignment to jsonb which is
> > NULL)
> >   of the implementation it could be easily changed.
> >
>
> What is the possibility to make an array instead of a record?
>
> I expect behave like
>
> update x set test[1] = 10; --> "[10]";
> update x set test['1'] = 10; --> "{"1": 10}"

Yes, I also was thinking about this because such behaviour is more
natural. To implement this we need to provide possibility for type
specific code to remember original subscript expression type (something
like in the attached version), which could be also useful for the future
work on jsonpath. I'm just not sure if there are again some important
bits are missing in this idea, so if someone can take a look I would
appreciate. In case there are any issues, I would just suggest keep it
simple and return NULL.
>From dc7fc5fff7b1597861b950138e1084f4ac04e321 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v38] Subscripting for jsonb (extended with subscript type)

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb and proceed with assignment. The type of such empty jsonb would be
array if the first subscript was an integer, and object in all other
cases.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The patch also extends SubscriptingRef and SubscriptingRefState with
information about original types of subscript expressions. This could be
useful if type specific subscript implementation needs to distinguish
between different data types in subscripting.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 +
 src/backend/executor/execExpr.c |  21 +-
 src/backend/nodes/copyfuncs.c   |   2 +
 src/backend/nodes/outfuncs.c|   2 +
 src/backend/nodes/readfuncs.c   |   2 +
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 ++-
 src/backend/utils/adt/jsonbsubs.c   | 299 
 src/backend/utils/adt/jsonfuncs.c   | 180 +
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/executor/execExpr.h |   2 +
 src/include/nodes/primnodes.h   |   2 +
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 273 -
 src/test/regress/sql/jsonb.sql  |  84 +++-
 16 files changed, 899 insertions(+), 106 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index 5b9a5557a4..cad7b02559 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -602,6 +602,54 @@ SELECT jdoc-'guid', jdoc-'name' FROM api WHERE jdoc @ '{"tags": ["qu
   
  
 
+ 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Pavel Stehule
út 22. 12. 2020 v 11:24 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Fri, Dec 18, 2020 at 08:59:25PM +0100, Dmitry Dolgov wrote:
> > > On Thu, Dec 17, 2020 at 03:29:35PM -0500, Tom Lane wrote:
> > > Dmitry Dolgov <9erthali...@gmail.com> writes:
> > > > On Thu, Dec 17, 2020 at 01:49:17PM -0500, Tom Lane wrote:
> > > >> We can certainly reconsider the API for the parsing hook if there's
> > > >> really a good reason for these to be different types, but it seems
> > > >> like that would just be encouraging poor design.
> > >
> > > > To be more specific, this is the current behaviour (an example from
> the
> > > > tests) and it doesn't seem right:
> > >
> > > > =# update test_jsonb_subscript
> > > >set test_json['a'] = 3 where id = 1;
> > > > UPDATE 1
> > > > =# select jsonb_typeof(test_json->'a')
> > > >from test_jsonb_subscript where id = 1;
> > > >  jsonb_typeof
> > > >  --
> > > >   string
> > >
> > >
> > > I'm rather inclined to think that the result of subscripting a
> > > jsonb (and therefore also the required source type for assignment)
> > > should be jsonb, not just text.  In that case, something like
> > > update ... set jsoncol['a'] = 3
> > > would fail, because there's no cast from integer to jsonb.  You'd
> > > have to write one of
> > > update ... set jsoncol['a'] = '3'
> > > update ... set jsoncol['a'] = '"3"'
> > > to clarify how you wanted the input to be interpreted.
> > > But that seems like a good thing, just as it is for jsonb_in.
> >
> > Yep, that makes sense, will go with this idea.
>
> Here is the new version of jsonb subscripting rebased on the committed
> infrastructure patch. I hope it will not introduce any confusion with
> the previously posted patched in this thread (about alter type subscript
> and hstore) as they are independent.
>
> There are few differences from the previous version:
>
> * No limit on number of subscripts for jsonb (as there is no intrinsic
>   limitation of this kind for jsonb).
>
> * In case of assignment via subscript now it expects the replace value
>   to be of jsonb type.
>
> * Similar to the implementation for arrays, if the source jsonb is NULL,
>   it will be replaced by an empty jsonb and the new value will be
>   assigned to it. This means:
>
> =# select * from test_jsonb_subscript where id = 3;
>  id | test_json
> +---
>   3 | NULL
>
> =# update test_jsonb_subscript set test_json['a'] = '1' where id =
> 3;
> UPDATE 1
>
> =# select * from test_jsonb_subscript where id = 3;
>  id | test_json
> +---
>   3 | {"a": 1}
>
>   and similar:
>
> =# select * from test_jsonb_subscript where id = 3;
>  id | test_json
> +---
>   3 | NULL
>
> =# update test_jsonb_subscript set test_json[1] = '1' where id = 3;
> UPDATE 1
>
> =# select * from test_jsonb_subscript where id = 3;
>  id | test_json
> +---
>   3 | {"1": 1}
>
>   The latter is probably a bit strange looking, but if there are any
> concerns
>   about this part (and in general about an assignment to jsonb which is
> NULL)
>   of the implementation it could be easily changed.
>

What is the possibility to make an array instead of a record?

I expect behave like

update x set test[1] = 10; --> "[10]";
update x set test['1'] = 10; --> "{"1": 10}"

Regards

Pavel


> * There is nothing to address question about distinguishing a regular text
>   subscript and jsonpath in the patch yet. I guess the idea would be to
> save
>   the original subscript value type before coercing it into text and allow
> a
>   type specific code to convert it back. But I'll probably do it as a
> separate
>   patch when we finish with this one.
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-22 Thread Dmitry Dolgov
> On Fri, Dec 18, 2020 at 08:59:25PM +0100, Dmitry Dolgov wrote:
> > On Thu, Dec 17, 2020 at 03:29:35PM -0500, Tom Lane wrote:
> > Dmitry Dolgov <9erthali...@gmail.com> writes:
> > > On Thu, Dec 17, 2020 at 01:49:17PM -0500, Tom Lane wrote:
> > >> We can certainly reconsider the API for the parsing hook if there's
> > >> really a good reason for these to be different types, but it seems
> > >> like that would just be encouraging poor design.
> >
> > > To be more specific, this is the current behaviour (an example from the
> > > tests) and it doesn't seem right:
> >
> > > =# update test_jsonb_subscript
> > >set test_json['a'] = 3 where id = 1;
> > > UPDATE 1
> > > =# select jsonb_typeof(test_json->'a')
> > >from test_jsonb_subscript where id = 1;
> > >  jsonb_typeof
> > >  --
> > >   string
> >
> >
> > I'm rather inclined to think that the result of subscripting a
> > jsonb (and therefore also the required source type for assignment)
> > should be jsonb, not just text.  In that case, something like
> > update ... set jsoncol['a'] = 3
> > would fail, because there's no cast from integer to jsonb.  You'd
> > have to write one of
> > update ... set jsoncol['a'] = '3'
> > update ... set jsoncol['a'] = '"3"'
> > to clarify how you wanted the input to be interpreted.
> > But that seems like a good thing, just as it is for jsonb_in.
>
> Yep, that makes sense, will go with this idea.

Here is the new version of jsonb subscripting rebased on the committed
infrastructure patch. I hope it will not introduce any confusion with
the previously posted patched in this thread (about alter type subscript
and hstore) as they are independent.

There are few differences from the previous version:

* No limit on number of subscripts for jsonb (as there is no intrinsic
  limitation of this kind for jsonb).

* In case of assignment via subscript now it expects the replace value
  to be of jsonb type.

* Similar to the implementation for arrays, if the source jsonb is NULL,
  it will be replaced by an empty jsonb and the new value will be
  assigned to it. This means:

=# select * from test_jsonb_subscript where id = 3;
 id | test_json
+---
  3 | NULL

=# update test_jsonb_subscript set test_json['a'] = '1' where id = 3;
UPDATE 1

=# select * from test_jsonb_subscript where id = 3;
 id | test_json
+---
  3 | {"a": 1}

  and similar:

=# select * from test_jsonb_subscript where id = 3;
 id | test_json
+---
  3 | NULL

=# update test_jsonb_subscript set test_json[1] = '1' where id = 3;
UPDATE 1

=# select * from test_jsonb_subscript where id = 3;
 id | test_json
+---
  3 | {"1": 1}

  The latter is probably a bit strange looking, but if there are any concerns
  about this part (and in general about an assignment to jsonb which is NULL)
  of the implementation it could be easily changed.

* There is nothing to address question about distinguishing a regular text
  subscript and jsonpath in the patch yet. I guess the idea would be to save
  the original subscript value type before coercing it into text and allow a
  type specific code to convert it back. But I'll probably do it as a separate
  patch when we finish with this one.
>From d2aab172a4e70af0684a937b99c426652231f456 Mon Sep 17 00:00:00 2001
From: Dmitrii Dolgov <9erthali...@gmail.com>
Date: Fri, 18 Dec 2020 17:19:51 +0100
Subject: [PATCH v38] Subscripting for jsonb

Subscripting implementation for jsonb. It does not support slices, does
not have a limit for number of subscripts and for assignment expects a
replace value to be of jsonb type. There is also one functional
difference in assignment via subscripting from jsonb_set, when an
original jsonb container is NULL, subscripting replaces it with an empty
jsonb of type object and proceed with assignment.

For the sake of code reuse, some parts of jsonb functionality were
rearranged to allow use the same functions for jsonb_set and assign
subscripting operation.

The original idea belongs to Oleg Bartunov.

Reviewed-by: Tom Lane, Arthur Zakirov, Pavel Stehule
---
 doc/src/sgml/json.sgml  |  48 +
 src/backend/utils/adt/Makefile  |   1 +
 src/backend/utils/adt/jsonb_util.c  |  76 +++-
 src/backend/utils/adt/jsonbsubs.c   | 282 
 src/backend/utils/adt/jsonfuncs.c   | 180 +-
 src/include/catalog/pg_proc.dat |   4 +
 src/include/catalog/pg_type.dat |   3 +-
 src/include/utils/jsonb.h   |   6 +-
 src/test/regress/expected/jsonb.out | 273 ++-
 src/test/regress/sql/jsonb.sql  |  84 -
 10 files changed, 852 insertions(+), 105 deletions(-)
 create mode 100644 src/backend/utils/adt/jsonbsubs.c

diff --git a/doc/src/sgml/json.sgml 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-18 Thread Dmitry Dolgov
> On Thu, Dec 17, 2020 at 03:29:35PM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > On Thu, Dec 17, 2020 at 01:49:17PM -0500, Tom Lane wrote:
> >> We can certainly reconsider the API for the parsing hook if there's
> >> really a good reason for these to be different types, but it seems
> >> like that would just be encouraging poor design.
>
> > To be more specific, this is the current behaviour (an example from the
> > tests) and it doesn't seem right:
>
> > =# update test_jsonb_subscript
> >set test_json['a'] = 3 where id = 1;
> > UPDATE 1
> > =# select jsonb_typeof(test_json->'a')
> >from test_jsonb_subscript where id = 1;
> >  jsonb_typeof
> >  --
> >   string
>
>
> I'm rather inclined to think that the result of subscripting a
> jsonb (and therefore also the required source type for assignment)
> should be jsonb, not just text.  In that case, something like
>   update ... set jsoncol['a'] = 3
> would fail, because there's no cast from integer to jsonb.  You'd
> have to write one of
>   update ... set jsoncol['a'] = '3'
>   update ... set jsoncol['a'] = '"3"'
> to clarify how you wanted the input to be interpreted.
> But that seems like a good thing, just as it is for jsonb_in.

Yep, that makes sense, will go with this idea.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Pavel Stehule
čt 17. 12. 2020 v 22:47 odesílatel Tom Lane  napsal:

> Chapman Flack  writes:
> > That's likely to be what a programmer intends when writing
> > (variable explicitly typed integer) := js['n'] and
> > (variable explicitly types varchar) := js['v']
>
> I think that what we want, if we're to support that sort of thing,
> is that the js[] constructs produce jsonb by definition, and then an
> assignment-level cast is applied to get from jsonb to integer or text.
> I see we already have most of the necessary casts, but they're currently
> marked explicit-only.  Downgrading them to assignment level might be
> okay though.  If we don't want to do that, it means we have to write
> integervar := js['n']::integer
> which is a bit more wordy but also unmistakable as to intent.  (I think
> the "intent" angle might be the reason we insisted on these things
> being explicit to start with.)
>
> It's somewhat interesting to speculate about whether we could optimize
> the combination of the subscripting function and the cast function.
> But (a) that's an optimization, not something that should be part of
> the user-visible semantics, and (b) it should not be part of the initial
> feature.  I think a large part of the reason this patch is still not
> done after four years is that it's been biting off more than it could
> chew all along.  Let's try to get it to completion and then optimize
> later.
>

sure

Pavel


> As far as "treat as" is concerned, we already have a spelling for
> that, it's called a cast.
>
> regards, tom lane
>
>
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Chapman Flack
On 12/17/20 16:56, Chapman Flack wrote:
>> that, it's called a cast.
> 
> I find them different; XQuery was the first language I had encountered
> that provides both (a cast in XQuery is spelled 'cast as', just as you'd
> expect), and the idea of an explicit operation that means "I am only
> asserting statically what type this will have at run time; do not ever
> perform any conversion or coercion, just give me an error if I'm wrong"
> seems to be a distinct and useful one.

I should have added: it may be an idea that never seemed important in
languages that mean to statically type everything, but it seems to arise
quite naturally for a language like XQuery (and arguably jsonpath) that
tries to do a useful amount of static typing while applied to data
structures like XML or JSON that don't come with ironclad guarantees.

Regards,
-Chap




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Chapman Flack
On 12/17/20 16:47, Tom Lane wrote:
> As far as "treat as" is concerned, we already have a spelling for
> that, it's called a cast.

I find them different; XQuery was the first language I had encountered
that provides both (a cast in XQuery is spelled 'cast as', just as you'd
expect), and the idea of an explicit operation that means "I am only
asserting statically what type this will have at run time; do not ever
perform any conversion or coercion, just give me an error if I'm wrong"
seems to be a distinct and useful one.

Even if there is no available SQL spelling for that, it might still one day
be a useful expression node that could be generated in certain chosen cases.

Regards,
-Chap




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Tom Lane
Chapman Flack  writes:
> That's likely to be what a programmer intends when writing
> (variable explicitly typed integer) := js['n'] and
> (variable explicitly types varchar) := js['v']

I think that what we want, if we're to support that sort of thing,
is that the js[] constructs produce jsonb by definition, and then an
assignment-level cast is applied to get from jsonb to integer or text.
I see we already have most of the necessary casts, but they're currently
marked explicit-only.  Downgrading them to assignment level might be
okay though.  If we don't want to do that, it means we have to write
integervar := js['n']::integer
which is a bit more wordy but also unmistakable as to intent.  (I think
the "intent" angle might be the reason we insisted on these things
being explicit to start with.)

It's somewhat interesting to speculate about whether we could optimize
the combination of the subscripting function and the cast function.
But (a) that's an optimization, not something that should be part of
the user-visible semantics, and (b) it should not be part of the initial
feature.  I think a large part of the reason this patch is still not
done after four years is that it's been biting off more than it could
chew all along.  Let's try to get it to completion and then optimize
later.

As far as "treat as" is concerned, we already have a spelling for
that, it's called a cast.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Chapman Flack
On 12/17/20 15:50, Tom Lane wrote:
> Chapman Flack  writes:
>> On 12/17/20 14:28, Tom Lane wrote:
>>> If you're imagining that js['n'] and js['v'] would emit different
>>> datatypes, forget it.  That would require knowing at parse time
>>> what the structure of the json object will be at run time.
> 
>> Would it be feasible to analyze that as something like an implicit
>> 'treat as' with the type of the assignment target?
> 
> TBH, I think that expending any great amount of effort in that direction
> would be a big waste of effort.  We already have strongly-typed
> composite types.  The use-case for json is where you *don't* have
> ironclad guarantees about what the structure of the data is.
> 
> As for doing it implicitly, that is still going to fall foul of the
> fundamental problem, which is that we don't have the info at parse
> time.  Examples with constant values for the json input are not what
> to look at, because they'll just mislead you as to what's possible.

Respectfully, I think that fundamental problem is exactly what led to
XQuery having the 'treat as' construct [1]. XML is in the same boat as JSON
as far as not having ironclad guarantees about what the structure will be.
But there are situations where the programmer knows full well that the
only inputs of interest will have js['n'] an integer and js['v'] a string,
and any input not conforming to that expectation will be erroneous and
should produce an error at runtime.

That's likely to be what a programmer intends when writing
(variable explicitly typed integer) := js['n'] and
(variable explicitly types varchar) := js['v']

so it might be nice to be able to write it without a lot of extra
ceremony. What I had in mind was not to try too hard to analyze the
JSON subscript expression, but only to know that its result can only
ever be: more JSON, a string, a number, a boolean, or an array of one
of those, and if the assignment target has one of those types, assume
that a 'treat as' is intended.

Naturally there's a trade-off, and that provides economy of expression
at the cost of not giving an immediate parse-time error if the
programmer really made a thinko rather than intending a 'treat as'.

I haven't closely followed what's proposed as the subscript in
js[...] - can it be any arbitrary jsonpath? And does jsonpath have
an operator analogous to XQuery's 'treat as'?

If so, something like (but with jsonpath rather than XQuery spelling)
  n := js['n treat as number'];
  v := js['v treat as string'];

might be a happy medium: perhaps parsing the expression enough to see
that its outer node is a 'treat as' is not asking too much, and then the
programmer has to explicitly add that to avoid a parse-time error,
but it's a reasonably painless notation to add.

Regards,
-Chap

[1] https://www.w3.org/TR/2003/WD-xquery-20030822/#id-treat




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Tom Lane
Chapman Flack  writes:
> On 12/17/20 14:28, Tom Lane wrote:
>> If you're imagining that js['n'] and js['v'] would emit different
>> datatypes, forget it.  That would require knowing at parse time
>> what the structure of the json object will be at run time.

> Would it be feasible to analyze that as something like an implicit
> 'treat as' with the type of the assignment target?

TBH, I think that expending any great amount of effort in that direction
would be a big waste of effort.  We already have strongly-typed
composite types.  The use-case for json is where you *don't* have
ironclad guarantees about what the structure of the data is.

As for doing it implicitly, that is still going to fall foul of the
fundamental problem, which is that we don't have the info at parse
time.  Examples with constant values for the json input are not what
to look at, because they'll just mislead you as to what's possible.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
> On Thu, Dec 17, 2020 at 01:49:17PM -0500, Tom Lane wrote:
>> We can certainly reconsider the API for the parsing hook if there's
>> really a good reason for these to be different types, but it seems
>> like that would just be encouraging poor design.

> To be more specific, this is the current behaviour (an example from the
> tests) and it doesn't seem right:

> =# update test_jsonb_subscript
>set test_json['a'] = 3 where id = 1;
> UPDATE 1
> =# select jsonb_typeof(test_json->'a')
>from test_jsonb_subscript where id = 1;
>  jsonb_typeof
>  --
>   string

I'm kind of unmoved by that example, because making it better would
require more guessing about what the user wanted than I care for.

You could imagine, perhaps, that the subscript parsing hook gives
back a list of potential assignment source types, or that we make
it responsible for transforming the source expression as well as
the subscripts and then let it do something like that internally.
But that just opens the door to confusion and ambiguity.  We already
had this discussion a few months ago, as I recall, when you wanted
to try assignment transforms to both text and integer but I pointed
out that both ways would succeed in some cases.  The assignment
coercion rules are only intended to be used when there is *exactly
one* possible result type.  I'd only be willing to accept multiple
possible coercion target types if we backed off the coercion level
to "implicit" to make multiple matches less likely (which is exactly
what we do when resolving input types for functions).  But I'm
afraid that doing so would break more cases than it improves.
It would certainly break existing queries for array assignment.

I'm rather inclined to think that the result of subscripting a
jsonb (and therefore also the required source type for assignment)
should be jsonb, not just text.  In that case, something like
update ... set jsoncol['a'] = 3
would fail, because there's no cast from integer to jsonb.  You'd
have to write one of
update ... set jsoncol['a'] = '3'
update ... set jsoncol['a'] = '"3"'
to clarify how you wanted the input to be interpreted.
But that seems like a good thing, just as it is for jsonb_in.

The background for my being so down on this is that it reminds me
way too much of the implicit-casts-to-text mess that we cleaned up
(with great pain and squawking) back around 8.3.  It looks to me
like you're basically trying to introduce multiple implicit casts
to jsonb, and I'm afraid that's just as bad an idea.  At the very
least, if we do do it I don't see why it should only happen in the
context of subscripted assignment.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Pavel Stehule
čt 17. 12. 2020 v 20:28 odesílatel Tom Lane  napsal:

> Pavel Stehule  writes:
> > čt 17. 12. 2020 v 19:49 odesílatel Tom Lane  napsal:
> >> So ... what's the problem with that?  Seems like what you should put
> >> in and what you should get out should be the same type.
>
> > I don't think so.  For  XML or JSON the target can be different, and it
> can
> > safe one CAST
>
> > DECLARE
> >   n int;
> >   v varchar;
> >   js jsonb default '{"n": 100, "v" : "Hello"};
> > BEGIN
> >   n := js['n'];
> >   v := js['v'];
>
> If you're imagining that js['n'] and js['v'] would emit different
> datatypes, forget it.  That would require knowing at parse time
> what the structure of the json object will be at run time.
>

My idea was a little bit different. When we know the target type (in this
example int or varchar), then we can *theoretically* push this information
to the subscribing function. This optimization is used in the XMLTABLE
function. Now the subscribing function returns JSONB, although internally
inside the source value, there are stored integer and varchar values. So
the returned value should be converted to jsonb first. Immediately it is
casted to the target type outside. My idea was to join the subscription
function and outer cast to one functionality, that allows to skip casting
when it is not necessary. It will be known in run time. Sure. But because
the outer cast and subscription function are separate things, then it is
not possible to skip the outer cast.

Pavel


> But in any case, the discussion here is about the source datatype
> for an assignment, which this example doesn't even contain.
>
> regards, tom lane
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Chapman Flack
On 12/17/20 14:28, Tom Lane wrote:
> Pavel Stehule  writes:
>>   n int;
>>   v varchar;
>>   js jsonb default '{"n": 100, "v" : "Hello"};
>> BEGIN
>>   n := js['n'];
>>   v := js['v'];
> 
> If you're imagining that js['n'] and js['v'] would emit different
> datatypes, forget it.  That would require knowing at parse time
> what the structure of the json object will be at run time.

Would it be feasible to analyze that as something like an implicit
'treat as' with the type of the assignment target?

'treat as' is an operator in XML Query that's distinct from 'cast as';
'cast as foo' has ordinary cast semantics and can coerce non-foo to foo;
'treat as foo' is just a promise from the programmer: "go ahead and
statically rely on this being a foo, and give me a runtime exception
if it isn't".

It would offer a nice economy of expression.

Following that idea further, if there were such a thing as a 'treat as'
node, would the implicit generation of such a node, according to an
assignment target data type, be the kind of thing that could be accomplished
by a user function's planner-support function?

Regards,
-Chap




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Dmitry Dolgov
> On Thu, Dec 17, 2020 at 01:49:17PM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > While rebasing the jsonb patch I found out that the current subscripting
> > assignment implementation in transformAssignmentIndirection always
> > coerce the value to be assigned to the type which subscripting result
> > suppose to have (refrestype). For arrays it's fine, since those two
> > indeed must be the same, but for jsonb (and for hstore I guess too) the
> > result of subscripting is always jsonb (well, text type) and the
> > assigned value could be of some other type. This leads to assigning
> > everything converted to text.
>
> So ... what's the problem with that?  Seems like what you should put
> in and what you should get out should be the same type.
>
> We can certainly reconsider the API for the parsing hook if there's
> really a good reason for these to be different types, but it seems
> like that would just be encouraging poor design.

To be more specific, this is the current behaviour (an example from the
tests) and it doesn't seem right:

=# update test_jsonb_subscript
   set test_json['a'] = 3 where id = 1;

UPDATE 1

=# select jsonb_typeof(test_json->'a')
   from test_jsonb_subscript where id = 1;

 jsonb_typeof
 --
  string

=# update test_jsonb_subscript
   set test_json = jsonb_set(test_json, '{a}', '3') where id = 1;

UPDATE 1
=# select jsonb_typeof(test_json->'a')
   from test_jsonb_subscript where id = 1;

 jsonb_typeof
 --
  number




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Tom Lane
Pavel Stehule  writes:
> čt 17. 12. 2020 v 19:49 odesílatel Tom Lane  napsal:
>> So ... what's the problem with that?  Seems like what you should put
>> in and what you should get out should be the same type.

> I don't think so.  For  XML or JSON the target can be different, and it can
> safe one CAST

> DECLARE
>   n int;
>   v varchar;
>   js jsonb default '{"n": 100, "v" : "Hello"};
> BEGIN
>   n := js['n'];
>   v := js['v'];

If you're imagining that js['n'] and js['v'] would emit different
datatypes, forget it.  That would require knowing at parse time
what the structure of the json object will be at run time.

But in any case, the discussion here is about the source datatype
for an assignment, which this example doesn't even contain.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Pavel Stehule
čt 17. 12. 2020 v 19:49 odesílatel Tom Lane  napsal:

> Dmitry Dolgov <9erthali...@gmail.com> writes:
> > While rebasing the jsonb patch I found out that the current subscripting
> > assignment implementation in transformAssignmentIndirection always
> > coerce the value to be assigned to the type which subscripting result
> > suppose to have (refrestype). For arrays it's fine, since those two
> > indeed must be the same, but for jsonb (and for hstore I guess too) the
> > result of subscripting is always jsonb (well, text type) and the
> > assigned value could be of some other type. This leads to assigning
> > everything converted to text.
>
> So ... what's the problem with that?  Seems like what you should put
> in and what you should get out should be the same type.
>

I don't think so.  For  XML or JSON the target can be different, and it can
safe one CAST

DECLARE
  n int;
  v varchar;
  js jsonb default '{"n": 100, "v" : "Hello"};
BEGIN
  n := js['n'];
  v := js['v'];

Can be nice to do this with a minimum number of transformations.

Regards

Pavel



> We can certainly reconsider the API for the parsing hook if there's
> really a good reason for these to be different types, but it seems
> like that would just be encouraging poor design.
>
> regards, tom lane
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
> While rebasing the jsonb patch I found out that the current subscripting
> assignment implementation in transformAssignmentIndirection always
> coerce the value to be assigned to the type which subscripting result
> suppose to have (refrestype). For arrays it's fine, since those two
> indeed must be the same, but for jsonb (and for hstore I guess too) the
> result of subscripting is always jsonb (well, text type) and the
> assigned value could be of some other type. This leads to assigning
> everything converted to text.

So ... what's the problem with that?  Seems like what you should put
in and what you should get out should be the same type.

We can certainly reconsider the API for the parsing hook if there's
really a good reason for these to be different types, but it seems
like that would just be encouraging poor design.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-17 Thread Dmitry Dolgov
> On Fri, Dec 11, 2020 at 10:38:07AM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On Wed, Dec 09, 2020 at 04:59:34PM -0500, Tom Lane wrote:
> >> 0001 adds the ability to attach a subscript handler to an existing
> >> data type with ALTER TYPE.  This is clearly going to be necessary
> >> if we want extension types to be able to use this facility.  The
> >> only thing that I think might be controversial here is that I did
> >> not add the ability to set pg_type.typelem.
>
> > I'm curious what could be the use case for setting pg_type.typelem for
> > subscripting? I don't see this that much controversial, but maybe I'm
> > missing something.
>
> If you want the result of subscripting to be "text" or some other built-in
> type, then clearly there's no need to use typelem for that, you can just
> refer to the standard OID macros.  The potential use-case that I thought
> of for setting typelem is where an extension defines types A and B and
> would like subscripting of B to yield A.  Installing A's OID as B.typelem
> would save a catalog lookup during subscript parsing, and remove a bunch
> of edge failure cases such as what happens if A gets renamed.  However,
> given the dependency behavior, this would also have the effect of "you
> can't drop A without dropping B, and you can't modify A in any interesting
> way either".  That would be annoyingly restrictive if there weren't any
> actual physical containment relationship.  But on the other hand, maybe
> it's acceptable and we just need to document it.
>
> The other issue is what about existing stored SubscriptingRef structs.
> If our backs were to the wall I'd think about removing the refelemtype
> field so there's no stored image of typelem that needs to be updated.
> But that would incur an extra catalog lookup in array_exec_setup, so
> I don't much like it.  If we do add the ability to set typelem, I'd
> prefer to just warn people to not change it once they've installed a
> subscript handler.
>
> Anyway, between those two issues I'm about -0.1 on adding a way to alter
> typelem.  I won't fight hard if somebody wants it, but I'm inclined
> to leave it out.

Yes, makes sense. Thanks for the clarification.

> On Wed, Dec 09, 2020 at 07:37:04PM +0100, Dmitry Dolgov wrote:
> > On Wed, Dec 09, 2020 at 12:49:48PM -0500, Tom Lane wrote:
> >
> > The jsonb parts now have to be
> > rebased onto this design, which I'm assuming Dmitry will tackle
>
> Yes, I'm already on it, just couldn't keep up with the changes in this
> thread.

While rebasing the jsonb patch I found out that the current subscripting
assignment implementation in transformAssignmentIndirection always
coerce the value to be assigned to the type which subscripting result
suppose to have (refrestype). For arrays it's fine, since those two
indeed must be the same, but for jsonb (and for hstore I guess too) the
result of subscripting is always jsonb (well, text type) and the
assigned value could be of some other type. This leads to assigning
everything converted to text.

Originally this coercion was done in the type specific code, so I hoped
to put it into "transform" routine. Unfortunately "transform" is called
before that (and could not be called later, because type information
from sbsref is required) and all the other hooks are apparently too
late. Probably the most straightforward solution here would be to add a
new argument to transformAssignmentIndirection to signal if coercion
needs to happen or not, and allow the type specific code to specify it
via SubscriptingRef.  Are there any better ideas?




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-11 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Wed, Dec 09, 2020 at 04:59:34PM -0500, Tom Lane wrote:
>> 0001 adds the ability to attach a subscript handler to an existing
>> data type with ALTER TYPE.  This is clearly going to be necessary
>> if we want extension types to be able to use this facility.  The
>> only thing that I think might be controversial here is that I did
>> not add the ability to set pg_type.typelem.

> I'm curious what could be the use case for setting pg_type.typelem for
> subscripting? I don't see this that much controversial, but maybe I'm
> missing something.

If you want the result of subscripting to be "text" or some other built-in
type, then clearly there's no need to use typelem for that, you can just
refer to the standard OID macros.  The potential use-case that I thought
of for setting typelem is where an extension defines types A and B and
would like subscripting of B to yield A.  Installing A's OID as B.typelem
would save a catalog lookup during subscript parsing, and remove a bunch
of edge failure cases such as what happens if A gets renamed.  However,
given the dependency behavior, this would also have the effect of "you
can't drop A without dropping B, and you can't modify A in any interesting
way either".  That would be annoyingly restrictive if there weren't any
actual physical containment relationship.  But on the other hand, maybe
it's acceptable and we just need to document it.

The other issue is what about existing stored SubscriptingRef structs.
If our backs were to the wall I'd think about removing the refelemtype
field so there's no stored image of typelem that needs to be updated.
But that would incur an extra catalog lookup in array_exec_setup, so
I don't much like it.  If we do add the ability to set typelem, I'd
prefer to just warn people to not change it once they've installed a
subscript handler.

Anyway, between those two issues I'm about -0.1 on adding a way to alter
typelem.  I won't fight hard if somebody wants it, but I'm inclined
to leave it out.

>> +1 using subscripts for hstore is nice idea

> Yeah, I also find it's a good suggestion, the implementation seems fine
> as well. As a side note, I'm surprised hstore doesn't have any
> functionality to update values, except hstore_concat.

Yeah.  I cribbed the subscript-fetch implementation from hstore_fetchval,
but was surprised to find that there wasn't any direct equivalent function
for subscript-store.  I guess people have gotten by with concat, but
it's not exactly an obvious way to do things.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-11 Thread Dmitry Dolgov
> On Wed, Dec 09, 2020 at 04:59:34PM -0500, Tom Lane wrote:
>
> 0001 adds the ability to attach a subscript handler to an existing
> data type with ALTER TYPE.  This is clearly going to be necessary
> if we want extension types to be able to use this facility.  The
> only thing that I think might be controversial here is that I did
> not add the ability to set pg_type.typelem.  While that'd be easy
> enough so far as ALTER TYPE is concerned, I'm not sure that we want
> to encourage people to change it.  The dependency rules mean that
> the semantics of typelem aren't something you really want to change
> after-the-fact on an existing type.  Also, if we did allow it, any
> existing SubscriptingRef.refelemtype values in stored views would
> fail to be updated.

I'm curious what could be the use case for setting pg_type.typelem for
subscripting? I don't see this that much controversial, but maybe I'm
missing something.

> On Thu, Dec 10, 2020 at 05:37:20AM +0100, Pavel Stehule wrote:
> st 9. 12. 2020 v 22:59 odesílatel Tom Lane  napsal:
>
> > 0002 makes use of that to support subscripting of hstore.  I'm not
> > sure how much we care about that from a functionality standpoint,
> > but it seems like it might be good to have a contrib module testing
> > that extensions can use this.  Also, I thought possibly an example
> > showing what's basically the minimum possible amount of complexity
> > would be good to have.  If people like this, I'll finish it up (it
> > lacks docs) and add it.
> >
>
> +1 using subscripts for hstore is nice idea

Yeah, I also find it's a good suggestion, the implementation seems fine
as well. As a side note, I'm surprised hstore doesn't have any
functionality to update values, except hstore_concat.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-10 Thread David Fetter
On Wed, Dec 09, 2020 at 12:49:48PM -0500, Tom Lane wrote:
> I've pushed the core patch now.  The jsonb parts now have to be
> rebased onto this design, which I'm assuming Dmitry will tackle
> (I do not intend to).  It's not quite clear to me whether we have
> a meeting of the minds on what the jsonb functionality should be,
> anyway.  Alexander seemed to be thinking about offering an option
> to let the subscript be a jsonpath, but how would we distinguish
> that from a plain-text field name?
> 
> BTW, while reviewing the thread to write the commit message,
> I was reminded of my concerns around the "is it a container"
> business.  As things stand, if type A has a typelem link to
> type B, then the system supposes that A contains B physically;
> this has implications for what's allowed in DDL, for example
> (cf find_composite_type_dependencies() and other places).
> We now have a feature whereby subscripting can yield a type
> that is not contained in the source type in that sense.
> I'd be happier if the "container" terminology were reserved for
> that sort of physical containment, which means that I think a lot
> of the commentary around SubscriptingRef is misleading.  But I do
> not have a better word to suggest offhand.  Thoughts?

Would this be something more along the lines of a "dependent type," or
is that adding too much baggage?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-09 Thread Pavel Stehule
st 9. 12. 2020 v 22:59 odesílatel Tom Lane  napsal:

> Here's a couple of little finger exercises to move this along a bit.
>
> 0001 adds the ability to attach a subscript handler to an existing
> data type with ALTER TYPE.  This is clearly going to be necessary
> if we want extension types to be able to use this facility.  The
> only thing that I think might be controversial here is that I did
> not add the ability to set pg_type.typelem.  While that'd be easy
> enough so far as ALTER TYPE is concerned, I'm not sure that we want
> to encourage people to change it.  The dependency rules mean that
> the semantics of typelem aren't something you really want to change
> after-the-fact on an existing type.  Also, if we did allow it, any
> existing SubscriptingRef.refelemtype values in stored views would
> fail to be updated.
>
> 0002 makes use of that to support subscripting of hstore.  I'm not
> sure how much we care about that from a functionality standpoint,
> but it seems like it might be good to have a contrib module testing
> that extensions can use this.  Also, I thought possibly an example
> showing what's basically the minimum possible amount of complexity
> would be good to have.  If people like this, I'll finish it up (it
> lacks docs) and add it.
>

+1 using subscripts for hstore is nice idea

Pavel



> regards, tom lane
>
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-09 Thread Tom Lane
Here's a couple of little finger exercises to move this along a bit.

0001 adds the ability to attach a subscript handler to an existing
data type with ALTER TYPE.  This is clearly going to be necessary
if we want extension types to be able to use this facility.  The
only thing that I think might be controversial here is that I did
not add the ability to set pg_type.typelem.  While that'd be easy
enough so far as ALTER TYPE is concerned, I'm not sure that we want
to encourage people to change it.  The dependency rules mean that
the semantics of typelem aren't something you really want to change
after-the-fact on an existing type.  Also, if we did allow it, any
existing SubscriptingRef.refelemtype values in stored views would
fail to be updated.

0002 makes use of that to support subscripting of hstore.  I'm not
sure how much we care about that from a functionality standpoint,
but it seems like it might be good to have a contrib module testing
that extensions can use this.  Also, I thought possibly an example
showing what's basically the minimum possible amount of complexity
would be good to have.  If people like this, I'll finish it up (it
lacks docs) and add it.

regards, tom lane

diff --git a/doc/src/sgml/ref/alter_type.sgml b/doc/src/sgml/ref/alter_type.sgml
index 64bf266373..21887e88a0 100644
--- a/doc/src/sgml/ref/alter_type.sgml
+++ b/doc/src/sgml/ref/alter_type.sgml
@@ -194,6 +194,14 @@ ALTER TYPE name SET ( 

+   
+
+ SUBSCRIPT can be set to the name of a type-specific
+ subscripting handler function, or NONE to remove
+ the type's subscripting handler function.  Using this option
+ requires superuser privilege.
+
+   

 
  STORAGE
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
index 29fe52d2ce..7c0b2c3bf0 100644
--- a/src/backend/commands/typecmds.c
+++ b/src/backend/commands/typecmds.c
@@ -94,6 +94,7 @@ typedef struct
 	bool		updateTypmodin;
 	bool		updateTypmodout;
 	bool		updateAnalyze;
+	bool		updateSubscript;
 	/* New values for relevant attributes */
 	char		storage;
 	Oid			receiveOid;
@@ -101,6 +102,7 @@ typedef struct
 	Oid			typmodinOid;
 	Oid			typmodoutOid;
 	Oid			analyzeOid;
+	Oid			subscriptOid;
 } AlterTypeRecurseParams;
 
 /* Potentially set by pg_upgrade_support functions */
@@ -3885,6 +3887,18 @@ AlterType(AlterTypeStmt *stmt)
 			/* Replacing an analyze function requires superuser. */
 			requireSuper = true;
 		}
+		else if (strcmp(defel->defname, "subscript") == 0)
+		{
+			if (defel->arg != NULL)
+atparams.subscriptOid =
+	findTypeSubscriptingFunction(defGetQualifiedName(defel),
+ typeOid);
+			else
+atparams.subscriptOid = InvalidOid; /* NONE, remove function */
+			atparams.updateSubscript = true;
+			/* Replacing a subscript function requires superuser. */
+			requireSuper = true;
+		}
 
 		/*
 		 * The rest of the options that CREATE accepts cannot be changed.
@@ -4042,6 +4056,11 @@ AlterTypeRecurse(Oid typeOid, bool isImplicitArray,
 		replaces[Anum_pg_type_typanalyze - 1] = true;
 		values[Anum_pg_type_typanalyze - 1] = ObjectIdGetDatum(atparams->analyzeOid);
 	}
+	if (atparams->updateSubscript)
+	{
+		replaces[Anum_pg_type_typsubscript - 1] = true;
+		values[Anum_pg_type_typsubscript - 1] = ObjectIdGetDatum(atparams->subscriptOid);
+	}
 
 	newtup = heap_modify_tuple(tup, RelationGetDescr(catalog),
 			   values, nulls, replaces);
@@ -4098,6 +4117,7 @@ AlterTypeRecurse(Oid typeOid, bool isImplicitArray,
 	atparams->updateReceive = false;	/* domains use F_DOMAIN_RECV */
 	atparams->updateTypmodin = false;	/* domains don't have typmods */
 	atparams->updateTypmodout = false;
+	atparams->updateSubscript = false;	/* domains don't have subscriptors */
 
 	/* Skip the scan if nothing remains to be done */
 	if (!(atparams->updateStorage ||
diff --git a/src/test/regress/expected/create_type.out b/src/test/regress/expected/create_type.out
index f85afcb31e..14394cc95c 100644
--- a/src/test/regress/expected/create_type.out
+++ b/src/test/regress/expected/create_type.out
@@ -260,38 +260,40 @@ ALTER TYPE myvarchar SET (
 receive = myvarcharrecv,
 typmod_in = varchartypmodin,
 typmod_out = varchartypmodout,
-analyze = array_typanalyze  -- bogus, but it doesn't matter
+-- these are bogus, but it's safe as long as we don't use the type:
+analyze = ts_typanalyze,
+subscript = raw_array_subscript_handler
 );
 SELECT typinput, typoutput, typreceive, typsend, typmodin, typmodout,
-   typanalyze, typstorage
+   typanalyze, typsubscript, typstorage
 FROM pg_type WHERE typname = 'myvarchar';
-  typinput   |  typoutput   |  typreceive   |typsend|typmodin |typmodout |typanalyze| typstorage 
--+--+---+---+-+--+--+
- myvarcharin | myvarcharout | 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-09 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Wed, Dec 09, 2020 at 12:49:48PM -0500, Tom Lane wrote:
>> I'd be happier if the "container" terminology were reserved for
>> that sort of physical containment, which means that I think a lot
>> of the commentary around SubscriptingRef is misleading.  But I do
>> not have a better word to suggest offhand.  Thoughts?

> I have only 'a composite'/'a compound' alternative in mind, but it's
> probably the same confusing as a container.

Yeah, in fact likely worse, since we tend to use those words for
rowtypes.  Naming is the hardest problem in CS :-(

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-09 Thread Dmitry Dolgov
> On Wed, Dec 09, 2020 at 12:49:48PM -0500, Tom Lane wrote:
>
> I've pushed the core patch now.

Thanks a lot!

> The jsonb parts now have to be
> rebased onto this design, which I'm assuming Dmitry will tackle

Yes, I'm already on it, just couldn't keep up with the changes in this
thread.

> BTW, while reviewing the thread to write the commit message,
> I was reminded of my concerns around the "is it a container"
> business.  As things stand, if type A has a typelem link to
> type B, then the system supposes that A contains B physically;
> this has implications for what's allowed in DDL, for example
> (cf find_composite_type_dependencies() and other places).
> We now have a feature whereby subscripting can yield a type
> that is not contained in the source type in that sense.
> I'd be happier if the "container" terminology were reserved for
> that sort of physical containment, which means that I think a lot
> of the commentary around SubscriptingRef is misleading.  But I do
> not have a better word to suggest offhand.  Thoughts?

I have only 'a composite'/'a compound' alternative in mind, but it's
probably the same confusing as a container.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-09 Thread Tom Lane
I've pushed the core patch now.  The jsonb parts now have to be
rebased onto this design, which I'm assuming Dmitry will tackle
(I do not intend to).  It's not quite clear to me whether we have
a meeting of the minds on what the jsonb functionality should be,
anyway.  Alexander seemed to be thinking about offering an option
to let the subscript be a jsonpath, but how would we distinguish
that from a plain-text field name?

BTW, while reviewing the thread to write the commit message,
I was reminded of my concerns around the "is it a container"
business.  As things stand, if type A has a typelem link to
type B, then the system supposes that A contains B physically;
this has implications for what's allowed in DDL, for example
(cf find_composite_type_dependencies() and other places).
We now have a feature whereby subscripting can yield a type
that is not contained in the source type in that sense.
I'd be happier if the "container" terminology were reserved for
that sort of physical containment, which means that I think a lot
of the commentary around SubscriptingRef is misleading.  But I do
not have a better word to suggest offhand.  Thoughts?

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Andres Freund
On 2020-12-08 22:02:24 -0500, Tom Lane wrote:
> BTW, I observe that with MAXDIM gone from execExpr.h, there are
> no widely-visible uses of MAXDIM except for array.h.  I therefore
> suggest that we should pull "#define MAXDIM" out of c.h and put it
> into array.h, as attached.  I was slightly surprised to find that
> this seems to entail *no* new inclusions of array.h ... I expected
> there would be one or two.  But the main point here is we want to
> restrict use of that symbol to stuff that's tightly integrated with
> varlena-array handling, so it ought not be in c.h.

+1




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Tom Lane
BTW, I observe that with MAXDIM gone from execExpr.h, there are
no widely-visible uses of MAXDIM except for array.h.  I therefore
suggest that we should pull "#define MAXDIM" out of c.h and put it
into array.h, as attached.  I was slightly surprised to find that
this seems to entail *no* new inclusions of array.h ... I expected
there would be one or two.  But the main point here is we want to
restrict use of that symbol to stuff that's tightly integrated with
varlena-array handling, so it ought not be in c.h.

regards, tom lane

diff --git a/src/include/c.h b/src/include/c.h
index 12ea056a35..7bc4b8a001 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -591,10 +591,6 @@ typedef uint32 CommandId;
 #define FirstCommandId	((CommandId) 0)
 #define InvalidCommandId	(~(CommandId)0)
 
-/*
- * Maximum number of array subscripts, for regular varlena arrays
- */
-#define MAXDIM 6
 
 /* 
  *		Variable-length datatypes all share the 'struct varlena' header.
diff --git a/src/include/utils/array.h b/src/include/utils/array.h
index 2809dfee93..16925880a1 100644
--- a/src/include/utils/array.h
+++ b/src/include/utils/array.h
@@ -69,6 +69,11 @@ struct ExprState;
 struct ExprContext;
 
 
+/*
+ * Maximum number of array subscripts (arbitrary limit)
+ */
+#define MAXDIM 6
+
 /*
  * Arrays are varlena objects, so must meet the varlena convention that
  * the first int32 of the object contains the total object size in bytes.


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Tom Lane
Andres Freund  writes:
> On 2020-12-08 11:05:05 -0500, Tom Lane wrote:
>> Other than that nit, please finish this up and push it so I can finish
>> the generic-subscripting patch.

> Done.

Cool, thanks.  I'll fix that part of the generic-subscript patch
and push it tomorrow, unless somebody objects to it before then.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Andres Freund
Hi,

On 2020-12-08 11:05:05 -0500, Tom Lane wrote:
> I've now studied this patch and it seems sane to me, although
> I wondered why you wrote "extern"s here:

Brainfart...


> Other than that nit, please finish this up and push it so I can finish
> the generic-subscripting patch.

Done.


> > WRT the prototype, I think it may be worth removing most of the types
> > from llvmjit.h. Worth keeping the most common ones, but most aren't used
> > all the time so terseness doesn't matter that much, and
> > the llvm_pg_var_type() would suffice.
> 
> Hm, that would mean redoing llvm_pg_var_type() often wouldn't it?
> I don't have a very good feeling for how expensive that is, so I'm
> not sure if this seems like a good idea or not.

It should be fairly cheap - it's basically one hashtable lookup.

Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Tom Lane
Andres Freund  writes:
> On 2020-12-07 17:25:41 -0500, Tom Lane wrote:
>> I can see that that should work for the two existing implementations
>> of EEO_CASE, but I wasn't sure if you wanted to wire in an assumption
>> that it'll always work.

> I don't think it's likely to be a problem, and if it ends up being one,
> we can still deduplicate the ops at that point...

Seems reasonable.

Here's a v38 that addresses the semantic loose ends I was worried about.
I decided that it's worth allowing subscripting functions to dictate
whether they should be considered strict or not, at least for the fetch
side (store is still assumed nonstrict always) and whether they should be
considered leakproof or not.  That requires only a minimal amount of extra
code.  While the planner does have to do extra catalog lookups to check
strictness and leakproofness, those are not common things to need to
check, so I don't think we're paying anything in performance for the
flexibility.  I left out the option of "strict store" because that *would*
have required extra code (to generate a nullness test on the replacement
value) and the potential use-case seems too narrow to justify that.
I also left out any option to control volatility or parallel safety,
again on the grounds of lack of use-case; plus, planner checks for those
properties would have been in significantly hotter code paths.

I'm waiting on your earlier patch to rewrite the llvmjit_expr.c code,
but otherwise I think this is ready to go.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 2d44df19fe..ca2f9f3215 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -426,23 +426,28 @@ foreign_expr_walker(Node *node,
 	return false;
 
 /*
- * Recurse to remaining subexpressions.  Since the container
- * subscripts must yield (noncollatable) integers, they won't
- * affect the inner_cxt state.
+ * Recurse into the remaining subexpressions.  The container
+ * subscripts will not affect collation of the SubscriptingRef
+ * result, so do those first and reset inner_cxt afterwards.
  */
 if (!foreign_expr_walker((Node *) sr->refupperindexpr,
 		 glob_cxt, _cxt))
 	return false;
+inner_cxt.collation = InvalidOid;
+inner_cxt.state = FDW_COLLATE_NONE;
 if (!foreign_expr_walker((Node *) sr->reflowerindexpr,
 		 glob_cxt, _cxt))
 	return false;
+inner_cxt.collation = InvalidOid;
+inner_cxt.state = FDW_COLLATE_NONE;
 if (!foreign_expr_walker((Node *) sr->refexpr,
 		 glob_cxt, _cxt))
 	return false;
 
 /*
- * Container subscripting should yield same collation as
- * input, but for safety use same logic as for function nodes.
+ * Container subscripting typically yields same collation as
+ * refexpr's, but in case it doesn't, use same logic as for
+ * function nodes.
  */
 collation = sr->refcollid;
 if (collation == InvalidOid)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 79069ddfab..583a5ce3b9 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -8740,6 +8740,21 @@ SCRAM-SHA-256$iteration count:
   
  
 
+ 
+  
+   typsubscript regproc
+   (references pg_proc.oid)
+  
+  
+   Subscripting handler function's OID, or zero if this type doesn't
+   support subscripting.  Types that are true array
+   types have typsubscript
+   = array_subscript_handler, but other types may
+   have other handler functions to implement specialized subscripting
+   behavior.
+  
+ 
+
  
   
typelem oid
@@ -8747,19 +8762,12 @@ SCRAM-SHA-256$iteration count:
   
   
If typelem is not 0 then it
-   identifies another row in pg_type.
-   The current type can then be subscripted like an array yielding
-   values of type typelem.  A
-   true array type is variable length
-   (typlen = -1),
-   but some fixed-length (typlen  0) types
-   also have nonzero typelem, for example
-   name and point.
-   If a fixed-length type has a typelem then
-   its internal representation must be some number of values of the
-   typelem data type with no other data.
-   Variable-length array types have a header defined by the array
-   subroutines.
+   identifies another row in pg_type,
+   defining the type yielded by subscripting.  This should be 0
+   if typsubscript is 0.  However, it can
+   be 0 when typsubscript isn't 0, if the
+   handler doesn't need typelem to
+   determine the subscripting result type.
   
  
 
diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 970b517db9..fc09282db7 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -43,6 +43,7 @@ CREATE TYPE name (
 [ , 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-08 Thread Tom Lane
Andres Freund  writes:
> I've attached a prototype conversion for two other such places. Which
> immediately pointed to a bug. And one harmless issue (using a pointer to
> size_t instead of ExprEvalOp* to represent the 'op' parameter), which
> you promptly copied...
> If I pushed a slightly cleaned up version of that, it should be fairly
> easy to adapt your code to it, I think?

I've now studied this patch and it seems sane to me, although
I wondered why you wrote "extern"s here:

@@ -48,6 +48,10 @@
 PGFunction TypePGFunction;
 size_t TypeSizeT;
 bool   TypeStorageBool;
+extern ExprStateEvalFunc TypeExprStateEvalFunc;
+ExprStateEvalFunc TypeExprStateEvalFunc;
+extern ExecEvalSubroutine TypeExecEvalSubroutine;
+ExecEvalSubroutine TypeExecEvalSubroutine;
 
 NullableDatum StructNullableDatum;
 AggState   StructAggState;

The other variables in that file don't have that.  Other than that nit,
please finish this up and push it so I can finish the generic-subscripting
patch.

> WRT the prototype, I think it may be worth removing most of the types
> from llvmjit.h. Worth keeping the most common ones, but most aren't used
> all the time so terseness doesn't matter that much, and
> the llvm_pg_var_type() would suffice.

Hm, that would mean redoing llvm_pg_var_type() often wouldn't it?
I don't have a very good feeling for how expensive that is, so I'm
not sure if this seems like a good idea or not.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Andres Freund
Hi,

On 2020-12-07 17:25:41 -0500, Tom Lane wrote:
> Fair enough.  It wasn't entirely clear to me whether it'd be kosher to
> write
>   EEO_CASE(EEOP_SBSREF_OLD)
>   EEO_CASE(EEOP_SBSREF_ASSIGN)
>   EEO_CASE(EEOP_SBSREF_FETCH)
>   {
>   // do something
>   EEO_NEXT();
>   }
> 
> I can see that that should work for the two existing implementations
> of EEO_CASE, but I wasn't sure if you wanted to wire in an assumption
> that it'll always work.

I don't think it's likely to be a problem, and if it ends up being one,
we can still deduplicate the ops at that point...

Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Tom Lane
Andres Freund  writes:
> On 2020-12-07 16:32:32 -0500, Tom Lane wrote:
>> What did you think of the idea of merging EEOP_SBSREF_OLD / ASSIGN / FETCH
>> into a single step type distinguished only by the callback function?

> I don't have a strong opinion on this. I guess find it a bit easier to
> understand the generated "program" if the opcodes are distinct (I've a
> pending patch printing the opcode sequence). Especially as the payload
> is just function pointers.
> So I think I'd just merge the *implementation* of the steps, but leave
> the different opcodes around?

Fair enough.  It wasn't entirely clear to me whether it'd be kosher to
write
EEO_CASE(EEOP_SBSREF_OLD)
EEO_CASE(EEOP_SBSREF_ASSIGN)
EEO_CASE(EEOP_SBSREF_FETCH)
{
// do something
EEO_NEXT();
}

I can see that that should work for the two existing implementations
of EEO_CASE, but I wasn't sure if you wanted to wire in an assumption
that it'll always work.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Andres Freund
Hi,

On 2020-12-07 16:32:32 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I think it'd be a better to rely on the backend's definition of
> > ExecEvalBoolSubroutine etc. For the functions implementing expression
> > steps I've found that far easier to work with over time (because you can
> > get LLVM to issue type mismatch errors when the signature changes,
> > instead of seeing compile failures).
> 
> I'm a little unclear on what you mean here?  There wasn't such a
> thing as ExecEvalBoolSubroutine until I added it in this patch.

Basically that I suggest doing what I did in the prototype patch I
attached, mirroring what it did with TypeExecEvalSubroutine for the new
ExecEvalBoolSubroutine case.


> What did you think of the idea of merging EEOP_SBSREF_OLD / ASSIGN / FETCH
> into a single step type distinguished only by the callback function?

I don't have a strong opinion on this. I guess find it a bit easier to
understand the generated "program" if the opcodes are distinct (I've a
pending patch printing the opcode sequence). Especially as the payload
is just function pointers.

So I think I'd just merge the *implementation* of the steps, but leave
the different opcodes around?


Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Tom Lane
Andres Freund  writes:
> The TypeParamBool stuff here is ok. Basically LLVM uses a '1bit' integer
> to represent booleans in the IR. But when it comes to storing such a
> value in memory, it uses 1 byte, for obvious reasons. Hence the two
> types.

Cool, thanks for taking a look.

> I think it'd be a better to rely on the backend's definition of
> ExecEvalBoolSubroutine etc. For the functions implementing expression
> steps I've found that far easier to work with over time (because you can
> get LLVM to issue type mismatch errors when the signature changes,
> instead of seeing compile failures).

I'm a little unclear on what you mean here?  There wasn't such a
thing as ExecEvalBoolSubroutine until I added it in this patch.

> I've attached a prototype conversion for two other such places. Which
> immediately pointed to a bug. And one harmless issue (using a pointer to
> size_t instead of ExprEvalOp* to represent the 'op' parameter), which
> you promptly copied...
> If I pushed a slightly cleaned up version of that, it should be fairly
> easy to adapt your code to it, I think?

Sure.  I just copied the existing code for EEOP_PARAM_CALLBACK;
if that changes, I'll just copy the new code.

What did you think of the idea of merging EEOP_SBSREF_OLD / ASSIGN / FETCH
into a single step type distinguished only by the callback function?

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Andres Freund
Hi,

On 2020-12-07 14:08:35 -0500, Tom Lane wrote:
> 1. I'm still wondering if TypeParamBool is the right thing to pass to
> LLVMFunctionType() to describe a function-returning-bool.  It does
> seem to work on x64_64 and aarch64, for what that's worth.

> - v_ret = build_EvalXFunc(b, mod, 
> "ExecEvalSubscriptingRef",
> - 
> v_state, op);
> + param_types[0] = l_ptr(StructExprState);
> + param_types[1] = l_ptr(TypeSizeT);
> + param_types[2] = 
> l_ptr(StructExprContext);
> +
> + v_functype = 
> LLVMFunctionType(TypeParamBool,
> + 
>   param_types,
> + 
>   lengthof(param_types),
> + 
>   false);
> + v_func = 
> l_ptr_const(op->d.sbsref_subscript.subscriptfunc,
> + 
>  l_ptr(v_functype));
> +
> + v_params[0] = v_state;
> + v_params[1] = l_ptr_const(op, 
> l_ptr(TypeSizeT));
> + v_params[2] = v_econtext;
> + v_ret = LLVMBuildCall(b,
> + 
>   v_func,
> + 
>   v_params, lengthof(v_params), "");
>   v_ret = LLVMBuildZExt(b, v_ret, 
> TypeStorageBool, "");

The TypeParamBool stuff here is ok. Basically LLVM uses a '1bit' integer
to represent booleans in the IR. But when it comes to storing such a
value in memory, it uses 1 byte, for obvious reasons. Hence the two
types.

We infer it like this:

> /*
>  * Clang represents stdbool.h style booleans that are returned by functions
>  * differently (as i1) than stored ones (as i8). Therefore we do not just need
>  * TypeBool (above), but also a way to determine the width of a returned
>  * integer. This allows us to keep compatible with non-stdbool using
>  * architectures.
>  */
> extern bool FunctionReturningBool(void);
> bool
> FunctionReturningBool(void)
> {
>   return false;
> }

so you should be good.


I think it'd be a better to rely on the backend's definition of
ExecEvalBoolSubroutine etc. For the functions implementing expression
steps I've found that far easier to work with over time (because you can
get LLVM to issue type mismatch errors when the signature changes,
instead of seeing compile failures).

I've attached a prototype conversion for two other such places. Which
immediately pointed to a bug. And one harmless issue (using a pointer to
size_t instead of ExprEvalOp* to represent the 'op' parameter), which
you promptly copied...

If I pushed a slightly cleaned up version of that, it should be fairly
easy to adapt your code to it, I think?


WRT the prototype, I think it may be worth removing most of the types
from llvmjit.h. Worth keeping the most common ones, but most aren't used
all the time so terseness doesn't matter that much, and
the llvm_pg_var_type() would suffice.

Greetings,

Andres Freund
>From 679a47a7be220576845c50092a1cccd8303d4535 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 7 Dec 2020 13:16:55 -0800
Subject: [PATCH] jit: wip: reference function types instead of re-creating
 them.

Also includes a bugfix that needs to be split out and backpatched.
---
 src/include/jit/llvmjit.h|  2 +
 src/backend/jit/llvm/llvmjit.c   | 95 +---
 src/backend/jit/llvm/llvmjit_expr.c  | 35 ++
 src/backend/jit/llvm/llvmjit_types.c |  4 ++
 4 files changed, 71 insertions(+), 65 deletions(-)

diff --git a/src/include/jit/llvmjit.h b/src/include/jit/llvmjit.h
index 325409acd5c..1c89075eaff 100644
--- a/src/include/jit/llvmjit.h
+++ b/src/include/jit/llvmjit.h
@@ -92,6 +92,8 @@ extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
 extern char *llvm_expand_funcname(LLVMJitContext *context, const char *basename);
 extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
 extern void llvm_split_symbol_name(const char *name, char **modname, char **funcname);
+extern LLVMTypeRef llvm_pg_var_type(const char *varname);
+extern LLVMTypeRef llvm_pg_var_func_type(const char *varname);
 extern LLVMValueRef llvm_pg_func(LLVMModuleRef mod, const char *funcname);
 extern void llvm_copy_attributes(LLVMValueRef from, LLVMValueRef to);
 extern LLVMValueRef llvm_function_reference(LLVMJitContext *context,
diff --git 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-07 Thread Tom Lane
I decided that the way to get this moved forward is to ignore the jsonb
parts for the moment and focus on getting the core feature into
committable shape.  It's possible that the lack of a concrete use-case
other than arrays will cause us to miss a detail or two, but if so we
can fix it later, I think.  (We should make sure to get the jsonb parts
in for v14, though, before we ship this.)

Accordingly, I went through all of the core and array code and dealt
with a lot of details that hadn't gotten seen to, including pg_dump
support and dependency considerations.  I ended up rewriting the
parser code pretty heavily, because I didn't like the original either
from an invasiveness or usability standpoint.  I also did the thing
I suggested earlier of using separate handler functions for varlena
and fixed-length arrays, though I made no effort to separate the code
paths.  I think the attached v37 is committable or nearly so, though
there remain a few loose ends:

1. I'm still wondering if TypeParamBool is the right thing to pass to
LLVMFunctionType() to describe a function-returning-bool.  It does
seem to work on x64_64 and aarch64, for what that's worth.

2. I haven't pulled the trigger on merging the three now-identical
execution step types.  That could be done separately, of course.

3. There are some semantic details that probably need debating.
As I wrote in subscripting.h:

 * There are some general restrictions on what subscripting can do.  The
 * planner expects subscripting fetches to be strict (i.e., return NULL for
 * any null input), immutable (same inputs always give same results), and
 * leakproof (data-value-dependent errors must not be thrown; in other
 * words, you must silently return NULL for any bad subscript value).
 * Subscripting assignment need not be, and usually isn't, strict; it need
 * not be leakproof either; but it must be immutable.

I doubt that there's anything too wrong with assuming immutability
of SubscriptingRef.  And perhaps strictness is OK too, though I worry
about somebody deciding that it'd be cute to define a NULL subscript
value as doing something special.  But the leakproofness assumption
seems like a really dangerous one.  All it takes is somebody deciding
that they should throw an error for a bad subscript, and presto, we
have a security hole.

What I'm slightly inclined to do here, but did not do in the attached
patch, is to have check_functions_in_node look at the leakproofness
marking of the subscripting support function; that's a bit of a hack,
but since the support function can't be called from SQL there is no
other use for its proleakproof flag.  Alternatively we could extend
the SubscriptingRoutine return struct to include some flags saying
whether fetch and/or assignment is leakproof.

BTW, right now check_functions_in_node() effectively treats
SubscriptingRef as unconditionally leakproof.  That's okay for the
array "fetch" code, but the array "store" code does throw errors
for bad subscripts, meaning it's not leakproof.  The only reason
this isn't a live security bug is that "store" SubscriptingRefs can
only occur in the targetlist of an INSERT or UPDATE, which is not
a place that could access any variables we are trying to protect
with the leakproofness mechanism.  (If it could, you could just
store the variable's value directly into another table.)  Still,
that's a shaky chain of reasoning.  The patch's change to
check_functions_in_node() would work in the back branches, so
I'm inclined to back-patch it that way even if we end up doing
something smarter in HEAD.

Comments?

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 2d44df19fe..ca2f9f3215 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -426,23 +426,28 @@ foreign_expr_walker(Node *node,
 	return false;
 
 /*
- * Recurse to remaining subexpressions.  Since the container
- * subscripts must yield (noncollatable) integers, they won't
- * affect the inner_cxt state.
+ * Recurse into the remaining subexpressions.  The container
+ * subscripts will not affect collation of the SubscriptingRef
+ * result, so do those first and reset inner_cxt afterwards.
  */
 if (!foreign_expr_walker((Node *) sr->refupperindexpr,
 		 glob_cxt, _cxt))
 	return false;
+inner_cxt.collation = InvalidOid;
+inner_cxt.state = FDW_COLLATE_NONE;
 if (!foreign_expr_walker((Node *) sr->reflowerindexpr,
 		 glob_cxt, _cxt))
 	return false;
+inner_cxt.collation = InvalidOid;
+inner_cxt.state = FDW_COLLATE_NONE;
 if (!foreign_expr_walker((Node *) sr->refexpr,
 		 glob_cxt, _cxt))
 	return false;
 
 /*
- * Container subscripting should yield same collation as
- * input, but for safety use same logic as for function nodes.
+ * Container subscripting typically yields same collation as
+ * refexpr's, but in case 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-04 Thread Tom Lane
BTW, I had a thought about this patch, which I wanted to write down
before it disappears again (I'm not offering to code it right now).

I think we should split array_subscript_handler into two functions,
one for "regular" varlena arrays and one for the fixed-length
pseudo-array types like "name" and "point".  This needn't have a lot
of impact on the execution code.  In fact, for the first version both
handlers could just return the same set of method pointers, and then
if we feel like it we could tease apart the code paths later.  The
value of doing this is that then typsubshandler could be used as a
bulletproof designator of the type's semantics.  Instead of weird
implementation-dependent tests on typlen and so on, the rule could be
"it's a regular array if typsubshandler == F_ARRAY_SUBSCRIPT_HANDLER".

Later on, we could even allow the "fixed length" array semantics to
be applied to varlena types perhaps, so long as their contents are
just N copies of some fixed-size type.  The point here is that we
now have a tool for recognizing the subscripting semantics reliably,
instead of having to back into an understanding of what they are.
But for the tool to be useful, we don't want the same handler
implementing significantly different behaviors.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-04 Thread Tom Lane
Pavel Stehule  writes:
> I tested the last patch on my FC33 Lenovo T520 (I7) and I don't see 15%
> slowdown too .. On my comp there is a slowdown of about 1.5-3%. I used your
> function arraytest.

After repeating the experiment a few times, I think I was misled
by ASLR variance (ie, hot code falling within or across cache
lines depending on where the backend executable gets loaded).
I'd tried a couple of postmaster restarts, but seemingly not
enough to expose the full variance in runtime that's due to that.
I do still see a 2% or so penalty when comparing best-case runtimes,
which is consistent with other people's reports.

However, 2% is still more than I want to pay for this feature,
and after studying the patchset awhile I don't think it's tried
hard at all on execution efficiency.  We should eliminate the
ExecEvalSubscriptingRef* interface layer altogether, and have
ExecInterpExpr dispatch directly to container-type-specific code,
so that we end up with approximately the same call depth as before.
With the patches shown below, we are (as best I can tell) right
about on par with the existing code's runtime.  This patch also gets
rid of a few more undesirable assumptions in the core executor ---
for instance, AFAICS there is no need for *any* hard-wired limit
on the number of subscripts within the core executor.  (What a
particular container type chooses to support is its business,
of course.)  We also need not back off on the specificity of error
messages, since the complaints that were in ExecEvalSubscriptingRef*
are now in container-type-specific code.

There were other things not to like about the way v35 chose to
refactor the executor support.  In particular, I don't think it
understood the point of having the EEOP_SBSREF_SUBSCRIPT steps,
which is to only transform the subscript Datums to internal form
once, even when we have to use them twice in OLD and ASSIGN steps.
Admittedly DatumGetInt32 is pretty cheap, but this cannot be said
of reading text datums as the 0003 patch wishes to do.  (BTW, 0003 is
seriously buggy in that regard, as it's failing to cope with toasted
or even short-header inputs.  We really don't want to detoast twice,
so that has to be dealt with in the SUBSCRIPT step.)  I also felt that
processing the subscripts one-at-a-time wasn't necessarily a great
solution, as one can imagine container semantics where they need to be
handled more holistically.  So I replaced EEOP_SBSREF_SUBSCRIPT with
EEOP_SBSREF_SUBSCRIPTS, which is executed just once after all the
subscript Datums have been collected.  (This does mean that we lose
the optimization of short-circuiting as soon as we've found a NULL
subscript, but I'm not troubled by that.  I note in particular that
the core code shouldn't be forcing a particular view of what to do
with null subscripts onto all container types.)

The two patches attached cover the same territory as v35's 0001 and
0002, but I split it up differently because I didn't see much point
in a division that has a nonfunctional code state in the middle.
0001 below is just concerned with revising things enough so that the
core executor doesn't have any assumption about a maximum number of
subscripts.  Then 0002 incorporates what was in v35 0001+0002, revised
with what seems to me a better set of execution APIs.

There are a bunch of loose ends yet, the first three introduced
by me and the rest being pre-existing problems:

* I don't have a lot of confidence in the LLVM changes --- they seem
to work, but I don't really understand that code, and in particular
I don't understand the difference between TypeParamBool and
TypeStorageBool.  So there might be something subtly wrong with the
code generation for EEOP_SBSREF_SUBSCRIPTS.

* As things stand here, there's no difference among the expression
step types EEOP_SBSREF_OLD, EEOP_SBSREF_ASSIGN, and EEOP_SBSREF_FETCH;
they dispatch to different support routines but the core executor's
behavior is identical.  So we could fold them all into one step type,
and lose nothing except perhaps debugging visibility.  Should we do
that, or keep them separate?

* I've not rebased v35-0003 and later onto this design, and don't
intend to do so myself.

* The patchset adds a CREATE TYPE option, but fails to provide any
pg_dump support for that option.  (There's no test coverage either.
Maybe further on, we should extend hstore or another contrib type
to have subscripting support, if only to have testing of that?)

* CREATE TYPE fails to create a dependency from a type to its
subscripting function.  (Related to which, the changes to the
GenerateTypeDependencies call in TypeShellMake are surely wrong.)

* findTypeSubscriptingFunction contains dead code (not to mention sadly
incorrect comments).

* What is refnestedfunc?  That sure seems to be dead code.

* I'm not on board with including refindexprslice in the transformed
expression, either.  AFAICS that is the untransformed subscript list,
which has *no* business being included in the 

Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-03 Thread Tom Lane
Alexander Korotkov  writes:
> I didn't get we can't add opaque field to SubscriptingRefState without
> adding it to SubscriptingRef, which has to support
> copy/compare/outfuncs/readfuncs

Umm ... all depends on what you envision putting in there.  There
certainly can be an opaque field in SubscriptingRefState as long
as the subscript-mechanism-specific code is responsible for setting
it up.  You just can't pass such a thing through the earlier phases.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-03 Thread Alexander Korotkov
On Wed, Dec 2, 2020 at 10:18 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > On Wed, Dec 02, 2020 at 11:52:54AM -0500, Tom Lane wrote:
> > Dmitry Dolgov <9erthali...@gmail.com> writes:
> > >> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
> > >>> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
> > >>> The idea of an opaque field in SubscriptingRef structure is more
> > >>> attractive to me.  Could you please implement it?
> >
> > >> Sure, doesn't seem to be that much work.
> >
> > I just happened to notice this bit.  This idea is a complete nonstarter.
> > You cannot have an "opaque" field in a parsetree node, because then the
> > backend/nodes code has no idea what to do with it for
> > copy/compare/outfuncs/readfuncs.  The patch seems to be of the opinion
> > that "do nothing" is adequate, which it completely isn't.
> >
> > Perhaps this is a good juncture at which to remind people that parse
> > tree nodes are read-only so far as the executor is concerned, so
> > storing something there only at execution time won't work either.
>
> Oh, right, stupid of me. Then I'll just stick with the original
> Alexanders suggestion.

Stupid me too :)

I didn't get we can't add opaque field to SubscriptingRefState without
adding it to SubscriptingRef, which has to support
copy/compare/outfuncs/readfuncs

--
Regards,
Alexander Korotkov




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Pavel Stehule
st 2. 12. 2020 v 21:02 odesílatel Tom Lane  napsal:

> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> >> So ... one of the things that's been worrying me about this patch
> >> from day one is whether it would create a noticeable performance
> >> penalty for existing use-cases.  I did a small amount of experimentation
> >> about that with the v35 patchset, and it didn't take long at all to
> >> find that this:
> >> ...
> >> is about 15% slower with the patch than with HEAD.  I'm not sure
> >> what an acceptable penalty might be, but 15% is certainly not it.
>
> > I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> > backend, no turbo etc). Are there any special steps I've probably
> > missed?
>
> Hmm, no, I just built with --disable-cassert and otherwise my usual
> development options.
>
> I had experimented with some other variants of the test case,
> where the repeated statement is
>
> a[i] := i;-- about the same
> a[i] := a[i-1] + 1;   -- 7% slower
> a[i] := a[i-1] - a[i-2];  -- 15% slower
>
> so it seems clear that the penalty is on the array fetch not array
> assign side.  This isn't too surprising now that I think about it,
> because plpgsql's array assignment code is untouched by the patch
> (which is a large feature omission BTW: you still can't write
> jsonb['x'] := y;
>

The refactoring of the left part of the assignment statement in plpgsql
probably can be harder work than this patch. But it should be the next
step.


in plpgsql).
>
>
I tested the last patch on my FC33 Lenovo T520 (I7) and I don't see 15%
slowdown too .. On my comp there is a slowdown of about 1.5-3%. I used your
function arraytest.

Regards

Pavel




> regards, tom lane
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
>> So ... one of the things that's been worrying me about this patch
>> from day one is whether it would create a noticeable performance
>> penalty for existing use-cases.  I did a small amount of experimentation
>> about that with the v35 patchset, and it didn't take long at all to
>> find that this:
>> ...
>> is about 15% slower with the patch than with HEAD.  I'm not sure
>> what an acceptable penalty might be, but 15% is certainly not it.

> I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> backend, no turbo etc). Are there any special steps I've probably
> missed?

Hmm, no, I just built with --disable-cassert and otherwise my usual
development options.

I had experimented with some other variants of the test case,
where the repeated statement is

a[i] := i;-- about the same
a[i] := a[i-1] + 1;   -- 7% slower
a[i] := a[i-1] - a[i-2];  -- 15% slower

so it seems clear that the penalty is on the array fetch not array
assign side.  This isn't too surprising now that I think about it,
because plpgsql's array assignment code is untouched by the patch
(which is a large feature omission BTW: you still can't write
jsonb['x'] := y;
in plpgsql).

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 01:20:10PM -0600, Justin Pryzby wrote:
> On Wed, Dec 02, 2020 at 08:18:08PM +0100, Dmitry Dolgov wrote:
> > > On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> > > So ... one of the things that's been worrying me about this patch
> > > from day one is whether it would create a noticeable performance
> > > penalty for existing use-cases.  I did a small amount of experimentation
> > > about that with the v35 patchset, and it didn't take long at all to
> > > find that this:
> > > --- cut ---
> > >
> > > is about 15% slower with the patch than with HEAD.  I'm not sure
> > > what an acceptable penalty might be, but 15% is certainly not it.
> > >
> > > I'm also not quite sure where the cost is going.  It looks like
> > > 0001+0002 aren't doing much to the executor except introducing
> > > one level of subroutine call, which doesn't seem like it'd account
> > > for that.
> >
> > I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> > backend, no turbo etc). Are there any special steps I've probably
> > missed? At the same time, I remember had conducted this sort of tests
> > before when you and others raised the performance degradation question
> > and the main part of the patch was already more or less stable. From
> > what I remember the numbers back then were also rather small.
>
> Are you comparing with casserts (and therefor MEMORY_CONTEXT_CHECKING) 
> disabled?

Yep, they're disabled.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Justin Pryzby
On Wed, Dec 02, 2020 at 08:18:08PM +0100, Dmitry Dolgov wrote:
> > On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> > So ... one of the things that's been worrying me about this patch
> > from day one is whether it would create a noticeable performance
> > penalty for existing use-cases.  I did a small amount of experimentation
> > about that with the v35 patchset, and it didn't take long at all to
> > find that this:
> > --- cut ---
> >
> > is about 15% slower with the patch than with HEAD.  I'm not sure
> > what an acceptable penalty might be, but 15% is certainly not it.
> >
> > I'm also not quite sure where the cost is going.  It looks like
> > 0001+0002 aren't doing much to the executor except introducing
> > one level of subroutine call, which doesn't seem like it'd account
> > for that.
> 
> I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
> backend, no turbo etc). Are there any special steps I've probably
> missed? At the same time, I remember had conducted this sort of tests
> before when you and others raised the performance degradation question
> and the main part of the patch was already more or less stable. From
> what I remember the numbers back then were also rather small.

Are you comparing with casserts (and therefor MEMORY_CONTEXT_CHECKING) disabled?

-- 
Justin




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 11:52:54AM -0500, Tom Lane wrote:
> Dmitry Dolgov <9erthali...@gmail.com> writes:
> >> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
> >>> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
> >>> The idea of an opaque field in SubscriptingRef structure is more
> >>> attractive to me.  Could you please implement it?
>
> >> Sure, doesn't seem to be that much work.
>
> I just happened to notice this bit.  This idea is a complete nonstarter.
> You cannot have an "opaque" field in a parsetree node, because then the
> backend/nodes code has no idea what to do with it for
> copy/compare/outfuncs/readfuncs.  The patch seems to be of the opinion
> that "do nothing" is adequate, which it completely isn't.
>
> Perhaps this is a good juncture at which to remind people that parse
> tree nodes are read-only so far as the executor is concerned, so
> storing something there only at execution time won't work either.

Oh, right, stupid of me. Then I'll just stick with the original
Alexanders suggestion.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Wed, Dec 02, 2020 at 12:58:51PM -0500, Tom Lane wrote:
> So ... one of the things that's been worrying me about this patch
> from day one is whether it would create a noticeable performance
> penalty for existing use-cases.  I did a small amount of experimentation
> about that with the v35 patchset, and it didn't take long at all to
> find that this:
>
> --- cut ---
> create or replace function arraytest(n int) returns void as
> $$
> declare
>   a int[];
> begin
>   a := array[1, 1];
>   for i in 3..n loop
> a[i] := a[i-1] - a[i-2];
>   end loop;
> end;
> $$
> language plpgsql stable;
>
> \timing on
>
> select arraytest(1000);
> --- cut ---
>
> is about 15% slower with the patch than with HEAD.  I'm not sure
> what an acceptable penalty might be, but 15% is certainly not it.
>
> I'm also not quite sure where the cost is going.  It looks like
> 0001+0002 aren't doing much to the executor except introducing
> one level of subroutine call, which doesn't seem like it'd account
> for that.

I've tried to reproduce that, but get ~2-4% slowdown (with a pinned
backend, no turbo etc). Are there any special steps I've probably
missed? At the same time, I remember had conducted this sort of tests
before when you and others raised the performance degradation question
and the main part of the patch was already more or less stable. From
what I remember the numbers back then were also rather small.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
So ... one of the things that's been worrying me about this patch
from day one is whether it would create a noticeable performance
penalty for existing use-cases.  I did a small amount of experimentation
about that with the v35 patchset, and it didn't take long at all to
find that this:

--- cut ---
create or replace function arraytest(n int) returns void as
$$
declare
  a int[];
begin
  a := array[1, 1];
  for i in 3..n loop
a[i] := a[i-1] - a[i-2];
  end loop;
end;
$$
language plpgsql stable;

\timing on

select arraytest(1000);
--- cut ---

is about 15% slower with the patch than with HEAD.  I'm not sure
what an acceptable penalty might be, but 15% is certainly not it.

I'm also not quite sure where the cost is going.  It looks like
0001+0002 aren't doing much to the executor except introducing
one level of subroutine call, which doesn't seem like it'd account
for that.

I don't think this can be considered RFC until the performance
issue is addressed.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Tom Lane
Dmitry Dolgov <9erthali...@gmail.com> writes:
>> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
>>> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
>>> The idea of an opaque field in SubscriptingRef structure is more
>>> attractive to me.  Could you please implement it?

>> Sure, doesn't seem to be that much work.

I just happened to notice this bit.  This idea is a complete nonstarter.
You cannot have an "opaque" field in a parsetree node, because then the
backend/nodes code has no idea what to do with it for
copy/compare/outfuncs/readfuncs.  The patch seems to be of the opinion
that "do nothing" is adequate, which it completely isn't.

Perhaps this is a good juncture at which to remind people that parse
tree nodes are read-only so far as the executor is concerned, so
storing something there only at execution time won't work either.

regards, tom lane




Re: [HACKERS] [PATCH] Generic type subscripting

2020-12-02 Thread Dmitry Dolgov
> On Mon, Nov 30, 2020 at 02:26:19PM +0100, Dmitry Dolgov wrote:
> > On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
> >
> > > > My first question is whether we're
> > > > able to handle different subscript types differently.  For instance,
> > > > one day we could handle jsonpath subscripts for jsonb.  And for sure,
> > > > jsonpath subscripts are expected to be handled differently from text
> > > > subscripts.  I see we can distinguish types during in prepare and
> > > > validate functions.  But it seems there is no type information in
> > > > fetch and assign functions.  Should we add something like this to the
> > > > SubscriptingRefState for future usage?
> > > >
> > > > Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
> > > > Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];
> > >
> > > Yes, makes sense. My original idea was that it could be done within the
> > > jsonpath support patch itself, but at the same time providing these
> > > fields into SubscriptingRefState will help other potential extensions.
> > >
> > > Having said that, maybe it would be even better to introduce a field
> > > with an opaque structure for both SubscriptingRefState and
> > > SubscriptingRef, where every implementation of custom subscripting can
> > > store any necessary information? In case of jsonpath it could keep type
> > > information acquired in prepare function, which would be then passed via
> > > SubscriptingRefState down to the fetch/assign.
> >
> > The idea of an opaque field in SubscriptingRef structure is more
> > attractive to me.  Could you please implement it?
>
> Sure, doesn't seem to be that much work.

The attached implementation should be enough I guess, as fetch/assign
are executed in a child memory context of one where prepare does the
stuff.
>From 6f7d9589cc827dfb3b1d82a3e0e629b482e4c77a Mon Sep 17 00:00:00 2001
From: erthalion <9erthali...@gmail.com>
Date: Thu, 31 Jan 2019 22:37:19 +0100
Subject: [PATCH v35 1/5] Base implementation of subscripting mechanism

Introduce all the required machinery for generalizing subscripting
operation for a different data types:

* subscripting handler procedure, to set up a relation between data type
and corresponding subscripting logic.

* subscripting routines, that help generalize a subscripting logic,
since it involves few stages, namely preparation (e.g. to figure out
required types), validation (to check the input and return meaningful
error message), fetch (executed when we extract a value using
subscripting), assign (executed when we update a data type with a new
value using subscripting). Without this it would be neccessary to
introduce more new fields to pg_type, which would be too invasive.

All ArrayRef related logic was removed and landed as a separate
subscripting implementation in the following patch from this series. The
rest of the code was rearranged, to e.g. store a type of assigned value
for an assign operation.

Reviewed-by: Tom Lane, Arthur Zakirov
---
 .../pg_stat_statements/pg_stat_statements.c   |   1 +
 src/backend/catalog/heap.c|   6 +-
 src/backend/catalog/pg_type.c |   7 +-
 src/backend/commands/typecmds.c   |  77 +-
 src/backend/executor/execExpr.c   |  26 +---
 src/backend/executor/execExprInterp.c | 124 +++
 src/backend/nodes/copyfuncs.c |   2 +
 src/backend/nodes/equalfuncs.c|   2 +
 src/backend/nodes/outfuncs.c  |   2 +
 src/backend/nodes/readfuncs.c |   2 +
 src/backend/parser/parse_expr.c   |  54 ---
 src/backend/parser/parse_node.c   | 141 --
 src/backend/parser/parse_target.c |  88 +--
 src/backend/utils/adt/ruleutils.c |  21 +--
 src/backend/utils/cache/lsyscache.c   |  23 +++
 src/include/c.h   |   2 +
 src/include/catalog/pg_type.h |   9 +-
 src/include/executor/execExpr.h   |  15 +-
 src/include/nodes/primnodes.h |   8 +
 src/include/nodes/subscripting.h  |  42 ++
 src/include/parser/parse_node.h   |   6 +-
 src/include/utils/lsyscache.h |   1 +
 22 files changed, 333 insertions(+), 326 deletions(-)
 create mode 100644 src/include/nodes/subscripting.h

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 1eac9edaee..31ba120fb2 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2800,6 +2800,7 @@ JumbleExpr(pgssJumbleState *jstate, Node *node)
 JumbleExpr(jstate, (Node *) sbsref->reflowerindexpr);
 JumbleExpr(jstate, (Node *) sbsref->refexpr);
 JumbleExpr(jstate, (Node *) sbsref->refassgnexpr);
+APP_JUMB(sbsref->refnestedfunc);
 			}
 			break;
 		case T_FuncExpr:
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c

Re: [HACKERS] [PATCH] Generic type subscripting

2020-11-30 Thread Dmitry Dolgov
> On Mon, Nov 30, 2020 at 04:12:29PM +0300, Alexander Korotkov wrote:
>
> > > My first question is whether we're
> > > able to handle different subscript types differently.  For instance,
> > > one day we could handle jsonpath subscripts for jsonb.  And for sure,
> > > jsonpath subscripts are expected to be handled differently from text
> > > subscripts.  I see we can distinguish types during in prepare and
> > > validate functions.  But it seems there is no type information in
> > > fetch and assign functions.  Should we add something like this to the
> > > SubscriptingRefState for future usage?
> > >
> > > Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
> > > Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];
> >
> > Yes, makes sense. My original idea was that it could be done within the
> > jsonpath support patch itself, but at the same time providing these
> > fields into SubscriptingRefState will help other potential extensions.
> >
> > Having said that, maybe it would be even better to introduce a field
> > with an opaque structure for both SubscriptingRefState and
> > SubscriptingRef, where every implementation of custom subscripting can
> > store any necessary information? In case of jsonpath it could keep type
> > information acquired in prepare function, which would be then passed via
> > SubscriptingRefState down to the fetch/assign.
>
> The idea of an opaque field in SubscriptingRef structure is more
> attractive to me.  Could you please implement it?

Sure, doesn't seem to be that much work.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-11-30 Thread Alexander Korotkov
On Mon, Nov 30, 2020 at 2:33 PM Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > On Fri, Nov 27, 2020 at 12:13:48PM +0300, Alexander Korotkov wrote:
> > I've started to review this patch.
>
> Thanks!
>
> > My first question is whether we're
> > able to handle different subscript types differently.  For instance,
> > one day we could handle jsonpath subscripts for jsonb.  And for sure,
> > jsonpath subscripts are expected to be handled differently from text
> > subscripts.  I see we can distinguish types during in prepare and
> > validate functions.  But it seems there is no type information in
> > fetch and assign functions.  Should we add something like this to the
> > SubscriptingRefState for future usage?
> >
> > Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
> > Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];
>
> Yes, makes sense. My original idea was that it could be done within the
> jsonpath support patch itself, but at the same time providing these
> fields into SubscriptingRefState will help other potential extensions.
>
> Having said that, maybe it would be even better to introduce a field
> with an opaque structure for both SubscriptingRefState and
> SubscriptingRef, where every implementation of custom subscripting can
> store any necessary information? In case of jsonpath it could keep type
> information acquired in prepare function, which would be then passed via
> SubscriptingRefState down to the fetch/assign.

The idea of an opaque field in SubscriptingRef structure is more
attractive to me.  Could you please implement it?

--
Regards,
Alexander Korotkov




Re: [HACKERS] [PATCH] Generic type subscripting

2020-11-30 Thread Dmitry Dolgov
> On Fri, Nov 27, 2020 at 12:13:48PM +0300, Alexander Korotkov wrote:
>
> Hi!
>
> I've started to review this patch.

Thanks!

> My first question is whether we're
> able to handle different subscript types differently.  For instance,
> one day we could handle jsonpath subscripts for jsonb.  And for sure,
> jsonpath subscripts are expected to be handled differently from text
> subscripts.  I see we can distinguish types during in prepare and
> validate functions.  But it seems there is no type information in
> fetch and assign functions.  Should we add something like this to the
> SubscriptingRefState for future usage?
>
> Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
> Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];

Yes, makes sense. My original idea was that it could be done within the
jsonpath support patch itself, but at the same time providing these
fields into SubscriptingRefState will help other potential extensions.

Having said that, maybe it would be even better to introduce a field
with an opaque structure for both SubscriptingRefState and
SubscriptingRef, where every implementation of custom subscripting can
store any necessary information? In case of jsonpath it could keep type
information acquired in prepare function, which would be then passed via
SubscriptingRefState down to the fetch/assign.




Re: [HACKERS] [PATCH] Generic type subscripting

2020-11-27 Thread Alexander Korotkov
Hi!

I've started to review this patch.  My first question is whether we're
able to handle different subscript types differently.  For instance,
one day we could handle jsonpath subscripts for jsonb.  And for sure,
jsonpath subscripts are expected to be handled differently from text
subscripts.  I see we can distinguish types during in prepare and
validate functions.  But it seems there is no type information in
fetch and assign functions.  Should we add something like this to the
SubscriptingRefState for future usage?

Datum uppertypeoid[MAX_SUBSCRIPT_DEPTH];
Datum lowertypeoid[MAX_SUBSCRIPT_DEPTH];

--
Regards,
Alexander Korotkov




Re: [HACKERS] [PATCH] Generic type subscripting

2020-09-28 Thread Pavel Stehule
po 28. 9. 2020 v 11:36 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Fri, Sep 25, 2020 at 06:43:38PM +0200, Pavel Stehule wrote:
> >
> > I checked this set of patches and it looks well.
> >
> > I have only one minor comment. I understand the error message, but I am
> not
> > sure if without deeper knowledge I can understand.
> >
> > +update test_jsonb_subscript set test_json[-8] = 1;
> > +ERROR:  path element at position 1 is out of range: -8
> >
> > Maybe 'value of subscript "-8" is out of range'. Current error message is
> > fully correct - but people probably have to think "what is a path element
> > at position 1?" It doesn't look intuitive.
> >
> > Do you have some idea?
>
> Interesting question. I've borrowed this error message format from other
> parts of setPath function where it appears couple of times and
> unfortunately can't suggest anything better. In case it this patch will
> get lucky enough to attract someone else, maybe we can leave it to a
> committer, what do you think?
>

ok


> > My comment is minor, and I mark this patch with pleasure as ready for
> > committer.
> >
> > patching and compiling - without problems
> > implemented functionality - I like it
> > Building doc - without problems
> > make check-world - passed
>
> Thanks!
>


Re: [HACKERS] [PATCH] Generic type subscripting

2020-09-28 Thread Dmitry Dolgov
> On Fri, Sep 25, 2020 at 06:43:38PM +0200, Pavel Stehule wrote:
>
> I checked this set of patches and it looks well.
>
> I have only one minor comment. I understand the error message, but I am not
> sure if without deeper knowledge I can understand.
>
> +update test_jsonb_subscript set test_json[-8] = 1;
> +ERROR:  path element at position 1 is out of range: -8
>
> Maybe 'value of subscript "-8" is out of range'. Current error message is
> fully correct - but people probably have to think "what is a path element
> at position 1?" It doesn't look intuitive.
>
> Do you have some idea?

Interesting question. I've borrowed this error message format from other
parts of setPath function where it appears couple of times and
unfortunately can't suggest anything better. In case it this patch will
get lucky enough to attract someone else, maybe we can leave it to a
committer, what do you think?

> My comment is minor, and I mark this patch with pleasure as ready for
> committer.
>
> patching and compiling - without problems
> implemented functionality - I like it
> Building doc - without problems
> make check-world - passed

Thanks!




Re: [HACKERS] [PATCH] Generic type subscripting

2020-09-25 Thread Pavel Stehule
ne 20. 9. 2020 v 17:46 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> > On Fri, Sep 18, 2020 at 07:23:11PM +0200, Pavel Stehule wrote:
> >
> > > In this way (returning an error on a negative indices bigger than the
> > > number of elements) functionality for assigning via subscripting will
> be
> > > already significantly differ from the original one via jsonb_set. Which
> > > in turn could cause a new wave of something similar to "why assigning
> an
> > > SQL NULL as a value returns NULL instead of jsonb?". Taking into
> account
> > > that this is not absolutely new interface, but rather a convenient
> > > shortcut for the existing one it probably makes sense to try to find a
> > > balance between both consistency with regular array and similarity with
> > > already existing jsonb modification functions.
> > >
> > > Having said that, my impression is that this balance should be not
> fully
> > > shifted towards consistensy with the regular array type, as jsonb array
> > > and regular array are fundamentally different in terms of
> > > implementation. If any differences are of concern, they should be
> > > addressed at different level. At the same time I've already sort of
> gave
> > > up on this patch in the form I wanted to see it anyway, so anything
> goes
> > > if it helps bring it to the finish point. In case if there would be no
> > > more arguments from other involved sides, I can post the next version
> > > with your suggestion included.
> > >
> >
> > This is a relatively new interface and at this moment we can decide if it
> > will be consistent or not.  I have not a problem if I have different
> > functions with different behaviors, but I don't like one interface with
> > slightly different behaviors for different types. I understand your
> > argument about implementing a lighter interface to some existing API.
> But I
> > think so more important should be consistency in maximall possible rate
> > (where it has sense).
> >
> > For me "jsonb" can be a very fundamental type in PLpgSQL development - it
> > can bring a lot of dynamic to this environment (it can work perfectly
> like
> > PL/SQL collection or like Perl dictionary), but for this purpose the
> > behaviour should be well consistent without surprising elements.
>
> And here we are, the rebased version with the following changes:
>
> insert into test_jsonb_subscript values (1, '[]');
> update test_jsonb_subscript set test_json[5] = 1;
> select * from test_jsonb_subscript;
>  id | test_json
> +---
>   1 | [null, null, null, null, null, 1]
> (1 row)
>
> update test_jsonb_subscript set test_json[-8] = 1;
> ERROR:  path element at position 1 is out of range: -8
>
> Thanks for the suggestions!
>

Thank you for accepting my suggestions.

I checked this set of patches and it looks well.

I have only one minor comment. I understand the error message, but I am not
sure if without deeper knowledge I can understand.

+update test_jsonb_subscript set test_json[-8] = 1;
+ERROR:  path element at position 1 is out of range: -8

Maybe 'value of subscript "-8" is out of range'. Current error message is
fully correct - but people probably have to think "what is a path element
at position 1?" It doesn't look intuitive.

Do you have some idea?

My comment is minor, and I mark this patch with pleasure as ready for
committer.

patching and compiling - without problems
implemented functionality - I like it
Building doc - without problems
make check-world - passed

Regards

Pavel


  1   2   >