Re: jsonb_set() strictness considered harmful to data

2020-01-17 Thread Ariadne Conill
Hello,

January 17, 2020 5:21 PM, "Tomas Vondra"  wrote:

> On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote:
> 
>> On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule  wrote:
>>> Hi
>>> 
>>> On Mon, Jan 6, 2020 at 22:34 Andrew Dunstan
>>>  wrote:
>> 
>> Updated version including docco and better error message.
>> 
>> cheers
>> 
>> andrew
>>> I think my objections are solved now. I have one small objection:
>>> 
>>> + errdetail("exception raised due to \"null_value_treatment := 
>>> 'raise_exception'\""),
>>> + errhint("to avoid, either change the null_value_treatment argument or 
>>> ensure that an SQL NULL is
>>> not used")));
>>> 
>>> "null_value_treatment := 'raise_exception'\""
>>> 
>>> it uses the proprietary PostgreSQL syntax for named parameters. It would be
>>> better to use the ANSI/SQL syntax:
>>> 
>>> "null_value_treatment => 'raise_exception'\""
>>> 
>>> This is fixed in the attached patch.
>>> 
>>> The source compiles without warnings,
>>> the documentation builds without warnings,
>>> and check-world passes without any problems.
>>> 
>>> I'll mark this patch as ready for committer.
>>> 
>>> Thank you for your work
>> 
>> Thanks for the review. I propose to commit this shortly.
> 
> Now that this was committed, I've updated the patch status accordingly.

Thank you very much for coming together and finding a solution to this bug!

Ariadne




Re: jsonb_set() strictness considered harmful to data

2020-01-17 Thread Tomas Vondra

On Wed, Jan 08, 2020 at 05:24:05PM +1030, Andrew Dunstan wrote:

On Wed, Jan 8, 2020 at 7:08 AM Pavel Stehule  wrote:


Hi

On Mon, Jan 6, 2020 at 22:34 Andrew Dunstan
 wrote:



Updated version including docco and better error message.

cheers

andrew



I think my objections are solved now. I have one small objection:

+ errdetail("exception raised due to \"null_value_treatment := 
'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or ensure that 
an SQL NULL is not used")));

"null_value_treatment := 'raise_exception'\""

it uses the proprietary PostgreSQL syntax for named parameters. It would be better
to use the ANSI/SQL syntax:

"null_value_treatment => 'raise_exception'\""

This is fixed in the attached patch.

The source compiles without warnings,
the documentation builds without warnings,
and check-world passes without any problems.

I'll mark this patch as ready for committer.

Thank you for your work




Thanks for the review. I propose to commit this shortly.



Now that this was committed, I've updated the patch status accordingly.

Thanks!

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2020-01-07 Thread Pavel Stehule
Hi

On Mon, Jan 6, 2020 at 22:34 Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

> On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
>  wrote:
> >
> >
> > On 11/27/19 9:35 PM, Michael Paquier wrote:
> > > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> > >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED and "NULL is not allowed",
> > >> errdetail - an exception raised due to setting "null_value_treatment" =>
> > >> raise_exception,
> > >> and maybe some errhint - "Maybe you want to use a JSON null -
> > >> "null"::jsonb".
> > >>
> > >> I don't know, but in this case the exception should be verbose. This is
> > >> a "rich" function with a lot of functionality.
> > > @Andrew: This patch is waiting on input from you for a couple of days
> > > now.
> > >
> >
> >
>
>
> Updated version including docco and better error message.
>
> cheers
>
> andrew
>

I think my objections are solved now. I have one small objection:

+ errdetail("exception raised due to \"null_value_treatment :=
'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or
ensure that an SQL NULL is not used")));

"null_value_treatment := 'raise_exception'\""

it uses the proprietary PostgreSQL syntax for named parameters. It would be better
to use the ANSI/SQL syntax:

"null_value_treatment => 'raise_exception'\""

This is fixed in the attached patch.
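For context, PostgreSQL accepts both spellings when calling a function with named arguments, so either form works at the SQL level; a minimal illustration, assuming the jsonb_set_lax signature from the attached patch:

    -- PostgreSQL-specific named-argument spelling
    SELECT jsonb_set_lax('{"a":1}'::jsonb, '{a}', NULL,
                         null_value_treatment := 'delete_key');

    -- SQL-standard spelling, as used in the corrected error text
    SELECT jsonb_set_lax('{"a":1}'::jsonb, '{a}', NULL,
                         null_value_treatment => 'delete_key');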

The source compiles without warnings,
the documentation builds without warnings,
and check-world passes without any problems.

I'll mark this patch as ready for committer.

Thank you for your work

Pavel
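For readers skimming the attached patch, the behaviour it implements can be summarised with a few calls; a sketch based on the patch's documentation and error texts (output abridged, not copied from a real session):

    -- default (use_json_null): the SQL NULL becomes a JSON null
    SELECT jsonb_set_lax('{"a":1,"b":2}'::jsonb, '{a}', NULL);
      --> {"a": null, "b": 2}

    -- delete_key: the addressed key is removed instead
    SELECT jsonb_set_lax('{"a":1,"b":2}'::jsonb, '{a}', NULL, true, 'delete_key');
      --> {"b": 2}

    -- return_target: the target is handed back unchanged
    SELECT jsonb_set_lax('{"a":1,"b":2}'::jsonb, '{a}', NULL, true, 'return_target');
      --> {"a": 1, "b": 2}

    -- raise_exception: an error is raised using the texts discussed above
    SELECT jsonb_set_lax('{"a":1,"b":2}'::jsonb, '{a}', NULL, true, 'raise_exception');
    ERROR:  NULL is not allowed
    DETAIL:  exception raised due to "null_value_treatment => 'raise_exception'"
    HINT:  to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used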


>
> --
> Andrew Dunstan    https://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4b42f12862..72072e7545 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12231,6 +12231,9 @@ table2-mapping
   
jsonb_set
   
+  
+   jsonb_set_lax
+  
   
jsonb_insert
   
@@ -12545,6 +12548,26 @@ table2-mapping
  [{"f1": 1, "f2": null, "f3": [2, 3, 4]}, 2]
 

+  
+   jsonb_set_lax(target jsonb, path text[], new_value jsonb [, create_missing boolean [, null_value_treatment text]])
+ 
+   jsonb
+   
+If new_value is not null,
+behaves identically to jsonb_set. Otherwise behaves
+according to the value of null_value_treatment
+which must be one of 'raise_exception',
+'use_json_null', 'delete_key', or
+'return_target'. The default is
+'use_json_null'.
+   
+   jsonb_set_lax('[{"f1":1,"f2":null},2,null,3]', '{0,f1}',null)
+ jsonb_set_lax('[{"f1":99,"f2":null},2]', '{0,f3}',null, true, 'return_target')
+ 
+   [{"f1":null,"f2":null},2,null,3]
+ [{"f1": 99, "f2": null}, 2]
+
+   
   


diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fc3e3ff90..1cb2af1bcd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1237,6 +1237,15 @@ LANGUAGE INTERNAL
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'jsonb_set';
 
+CREATE OR REPLACE FUNCTION
+  jsonb_set_lax(jsonb_in jsonb, path text[] , replacement jsonb,
+create_if_missing boolean DEFAULT true,
+null_value_treatment text DEFAULT 'use_json_null')
+RETURNS jsonb
+LANGUAGE INTERNAL
+CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE
+AS 'jsonb_set_lax';
+
 CREATE OR REPLACE FUNCTION
   parse_ident(str text, strict boolean DEFAULT true)
 RETURNS text[]
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index ab5a24a858..4b5a0214dc 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4395,6 +4395,70 @@ jsonb_set(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * SQL function jsonb_set_lax(jsonb, text[], jsonb, boolean, text)
+ */
+Datum
+jsonb_set_lax(PG_FUNCTION_ARGS)
+{
+	/* Jsonb	   *in = PG_GETARG_JSONB_P(0); */
+	/* ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1); */
+	/* Jsonb	  *newval = PG_GETARG_JSONB_P(2); */
+	/* bool		create = PG_GETARG_BOOL(3); */
+	text   *handle_null;
+	char   *handle_val;
+
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(3))
+		PG_RETURN_NULL();
+
+	/* could happen if they pass in an explicit NULL */
+	if (PG_ARGISNULL(4))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+
+	/* if the new value isn't an SQL NULL just call jsonb_set */
+	if (! PG_ARGISNULL(2))
+		return jsonb_set(fcinfo);
+
+	handle_null = PG_GETARG_TEXT_P(4);
+	handle_val = text_to_cstring(handle_null);
+
+	if (strcmp(handle_val,"raise_exception") == 0)
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ errmsg("NULL is not allowed"),
+ errdetail("exception raised due to \"null_value_treatment => 'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or 

Re: jsonb_set() strictness considered harmful to data

2020-01-06 Thread Andrew Dunstan
On Thu, Nov 28, 2019 at 2:15 PM Andrew Dunstan
 wrote:
>
>
> On 11/27/19 9:35 PM, Michael Paquier wrote:
> > On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> >> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED and "NULL is not allowed",
> >> errdetail - an exception raised due to setting "null_value_treatment" =>
> >> raise_exception,
> >> and maybe some errhint - "Maybe you want to use a JSON null -
> >> "null"::jsonb".
> >>
> >> I don't know, but in this case the exception should be verbose. This is
> >> a "rich" function with a lot of functionality.
> > @Andrew: This patch is waiting on input from you for a couple of days
> > now.
> >
>
>


Updated version including docco and better error message.

cheers

andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 57a1539506..7adb6a2d04 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12203,6 +12203,9 @@ table2-mapping
   
jsonb_set
   
+  
+   jsonb_set_lax
+  
   
jsonb_insert
   
@@ -12517,6 +12520,26 @@ table2-mapping
  [{"f1": 1, "f2": null, "f3": [2, 3, 4]}, 2]
 

+  
+   jsonb_set_lax(target jsonb, path text[], new_value jsonb [, create_missing boolean [, null_value_treatment text]])
+ 
+   jsonb
+   
+If new_value is not null,
+behaves identically to jsonb_set. Otherwise behaves
+according to the value of null_value_treatment
+which must be one of 'raise_exception',
+'use_json_null', 'delete_key', or
+'return_target'. The default is
+'use_json_null'.
+   
+   jsonb_set_lax('[{"f1":1,"f2":null},2,null,3]', '{0,f1}',null)
+ jsonb_set_lax('[{"f1":99,"f2":null},2]', '{0,f3}',null, true, 'return_target')
+ 
+   [{"f1":null,"f2":null},2,null,3]
+ [{"f1": 99, "f2": null}, 2]
+
+   
   


diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 2fc3e3ff90..1cb2af1bcd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1237,6 +1237,15 @@ LANGUAGE INTERNAL
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'jsonb_set';
 
+CREATE OR REPLACE FUNCTION
+  jsonb_set_lax(jsonb_in jsonb, path text[] , replacement jsonb,
+create_if_missing boolean DEFAULT true,
+null_value_treatment text DEFAULT 'use_json_null')
+RETURNS jsonb
+LANGUAGE INTERNAL
+CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE
+AS 'jsonb_set_lax';
+
 CREATE OR REPLACE FUNCTION
   parse_ident(str text, strict boolean DEFAULT true)
 RETURNS text[]
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index ab5a24a858..b5fa4e7d18 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4395,6 +4395,70 @@ jsonb_set(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * SQL function jsonb_set_lax(jsonb, text[], jsonb, boolean, text)
+ */
+Datum
+jsonb_set_lax(PG_FUNCTION_ARGS)
+{
+	/* Jsonb	   *in = PG_GETARG_JSONB_P(0); */
+	/* ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1); */
+	/* Jsonb	  *newval = PG_GETARG_JSONB_P(2); */
+	/* bool		create = PG_GETARG_BOOL(3); */
+	text   *handle_null;
+	char   *handle_val;
+
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(3))
+		PG_RETURN_NULL();
+
+	/* could happen if they pass in an explicit NULL */
+	if (PG_ARGISNULL(4))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+
+	/* if the new value isn't an SQL NULL just call jsonb_set */
+	if (! PG_ARGISNULL(2))
+		return jsonb_set(fcinfo);
+
+	handle_null = PG_GETARG_TEXT_P(4);
+	handle_val = text_to_cstring(handle_null);
+
+	if (strcmp(handle_val,"raise_exception") == 0)
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+ errmsg("NULL is not allowed"),
+ errdetail("exception raised due to \"null_value_treatment := 'raise_exception'\""),
+ errhint("to avoid, either change the null_value_treatment argument or ensure that an SQL NULL is not used")));
+	}
+	else if (strcmp(handle_val, "use_json_null") == 0)
+	{
+		Datum	  newval;
+
+		newval = DirectFunctionCall1(jsonb_in, CStringGetDatum("null"));
+
+		fcinfo->args[2].value = newval;
+		fcinfo->args[2].isnull = false;
+		return jsonb_set(fcinfo);
+	}
+	else if (strcmp(handle_val, "delete_key") == 0)
+	{
+		return jsonb_delete_path(fcinfo);
+	}
+	else if (strcmp(handle_val, "return_target") == 0)
+	{
+		Jsonb	   *in = PG_GETARG_JSONB_P(0);
+		PG_RETURN_JSONB_P(in);
+	}
+	else
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+	}
+}
+
 /*
  * SQL function jsonb_delete_path(jsonb, text[])
  */
diff --git a/src/include/catalog/pg_proc.dat 

Re: jsonb_set() strictness considered harmful to data

2019-11-27 Thread Andrew Dunstan


On 11/27/19 9:35 PM, Michael Paquier wrote:
> On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
>> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED and "NULL is not allowed",
>> errdetail - an exception raised due to setting "null_value_treatment" =>
>> raise_exception,
>> and maybe some errhint - "Maybe you want to use a JSON null - "null"::jsonb".
>>
>> I don't know, but in this case the exception should be verbose. This is
>> a "rich" function with a lot of functionality.
> @Andrew: This patch is waiting on input from you for a couple of days
> now.
>


Will get to this on Friday - tomorrow is Thanksgiving so I'm unlikely to
get to it then.


cheers


andrew

-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-11-27 Thread Michael Paquier
On Fri, Nov 15, 2019 at 09:45:59PM +0100, Pavel Stehule wrote:
> Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED and "NULL is not allowed",
> errdetail - an exception raised due to setting "null_value_treatment" =>
> raise_exception,
> and maybe some errhint - "Maybe you want to use a JSON null - "null"::jsonb".
> 
> I don't know, but in this case the exception should be verbose. This is
> a "rich" function with a lot of functionality.

@Andrew: This patch is waiting on input from you for a couple of days
now.
--
Michael




Re: jsonb_set() strictness considered harmful to data

2019-11-15 Thread Pavel Stehule
On Fri, Nov 15, 2019 at 21:01 Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> On 11/15/19 2:14 PM, Pavel Stehule wrote:
> > Hi
> >
> >
> >
> > For release 13+, I have given some more thought to what should be done.
> > I think the bar for altering the behaviour of a function should be
> > rather higher than we have in the present case, and the longer the
> > function has been sanctioned by time the higher the bar should be.
> > However, I think there is a case to be made for providing a non-strict
> > jsonb_set type function. To advance the discussion, attached is a POC
> > patch that does that. This can also be done as an extension, meaning
> > that users of back branches could deploy it immediately. I've tested
> > this against release 12, but I think it could go probably all the way
> > back to 9.5. The new function is named jsonb_set_lax, but I'm open to
> > bikeshedding.
> >
> >
> > I am sending a review of this patch
> >
> > 1. this patch does what was proposed and it is based on discussion.
> >
> > 2. there are not any problem with patching or compilation, all regress
> > tests passed.
> >
> > 4. code looks well and it is well commented.
> >
> > 5. the patch has enough regress tests
> >
> > My notes:
> >
> > a) missing documentation
> >
> > b) error message is not finalized
> >
> > +   ereport(ERROR,
> > +   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> > +errmsg("null jsonb value")));
> >
> > Any other looks well, and this function can be very handy.
> >
> >
>
> Thanks for the review. I will add some docco.
>
>
> What would be a better error message? "null jsonb replacement not
> permitted"?
>

Maybe ERRCODE_NULL_VALUE_NOT_ALLOWED and "NULL is not allowed",
errdetail - an exception raised due to setting "null_value_treatment" =>
raise_exception,
and maybe some errhint - "Maybe you want to use a JSON null - "null"::jsonb".

I don't know, but in this case the exception should be verbose. This is
a "rich" function with a lot of functionality.





>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan    https://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


Re: jsonb_set() strictness considered harmful to data

2019-11-15 Thread Pavel Stehule
Hi



> For release 13+, I have given some more thought to what should be done.
> I think the bar for altering the behaviour of a function should be
> rather higher than we have in the present case, and the longer the
> function has been sanctioned by time the higher the bar should be.
> However, I think there is a case to be made for providing a non-strict
> jsonb_set type function. To advance the discussion, attached is a POC
> patch that does that. This can also be done as an extension, meaning
> that users of back branches could deploy it immediately. I've tested
> this against release 12, but I think it could go probably all the way
> back to 9.5. The new function is named jsonb_set_lax, but I'm open to
> bikeshedding.
>
>
I am sending a review of this patch.

1. The patch does what was proposed, and it is based on the discussion.

2. There are no problems with patching or compilation; all regression
tests pass.

4. The code looks good and is well commented.

5. The patch has enough regression tests.

My notes:

a) missing documentation

b) the error message is not finalized:

+   ereport(ERROR,
+   (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+errmsg("null jsonb value")));

Everything else looks good, and this function can be very handy.

Regards

Pavel


Re: jsonb_set() strictness considered harmful to data

2019-10-28 Thread Mark Felder



On Mon, Oct 28, 2019, at 08:52, Andrew Dunstan wrote:
> 
> For release 13+, I have given some more thought to what should be done.
> I think the bar for altering the behaviour of a function should be
> rather higher than we have in the present case, and the longer the
> function has been sanctioned by time the higher the bar should be.
> However, I think there is a case to be made for providing a non-strict
> jsonb_set type function. To advance the discussion, attached is a POC
> patch that does that. This can also be done as an extension, meaning
> that users of back branches could deploy it immediately. I've tested
> this against release 12, but I think it could go probably all the way
> back to 9.5. The new function is named jsonb_set_lax, but I'm open to
> bikeshedding.
> 
> 

Thank you Andrew, and I understand the difficulty in making changes to 
functions that already exist in production deployments. An additional function 
like this would be helpful to many.


-- 
  Mark Felder
  ports-secteam & portmgr alumni
  f...@freebsd.org




Re: jsonb_set() strictness considered harmful to data

2019-10-28 Thread Andrew Dunstan

On 10/21/19 9:28 AM, Andrew Dunstan wrote:
> On 10/21/19 2:07 AM, Tomas Vondra wrote:
>> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:
 I think the general premise of this thread is that the application
 developer does not realize that may be necessary, because it's a bit
 surprising behavior, particularly when having more experience with
 other
 databases that behave differently. It's also pretty easy to not notice
 this issue for a long time, resulting in significant data loss.

 Let's say you're used to the MSSQL or MySQL behavior, you migrate your
 application to PostgreSQL or whatever - how do you find out about this
 behavior? Users are likely to visit

    https://www.postgresql.org/docs/12/functions-json.html

 but that says nothing about how jsonb_set works with NULL values :-(
>>>
>>>
>>> We should certainly fix that. I accept some responsibility for the
>>> omission.
>>>
>> +1
>>
>>
>
> So let's add something to the JSON funcs page  like this:
>
>
> Note: All the above functions except for json_build_object,
> json_build_array, json_to_recordset, json_populate_record, and
> json_populate_recordset and their jsonb equivalents are strict
> functions. That is, if any argument is NULL the function result will be
> NULL and the function won't even be called. Particular care should
> therefore be taken to avoid passing NULL arguments to those functions
> unless a NULL result is expected. This is particularly true of the
> jsonb_set and jsonb_insert functions.
>
>
>
> (We do have a heck of a lot of Note: sections on that page)
>
>


For release 13+, I have given some more thought to what should be done.
I think the bar for altering the behaviour of a function should be
rather higher than we have in the present case, and the longer the
function has been sanctioned by time the higher the bar should be.
However, I think there is a case to be made for providing a non-strict
jsonb_set type function. To advance the discussion, attached is a POC
patch that does that. This can also be done as an extension, meaning
that users of back branches could deploy it immediately. I've tested
this against release 12, but I think it could go probably all the way
back to 9.5. The new function is named jsonb_set_lax, but I'm open to
bikeshedding.


cheers


andrew


-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 9fe4a4794a..bbe5bbf3ae 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1232,6 +1232,15 @@ LANGUAGE INTERNAL
 STRICT IMMUTABLE PARALLEL SAFE
 AS 'jsonb_set';
 
+CREATE OR REPLACE FUNCTION
+  jsonb_set_lax(jsonb_in jsonb, path text[] , replacement jsonb,
+create_if_missing boolean DEFAULT true,
+null_value_treatment text DEFAULT 'use_json_null')
+RETURNS jsonb
+LANGUAGE INTERNAL
+CALLED ON NULL INPUT IMMUTABLE PARALLEL SAFE
+AS 'jsonb_set_lax';
+
 CREATE OR REPLACE FUNCTION
   parse_ident(str text, strict boolean DEFAULT true)
 RETURNS text[]
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 3553a304b8..c9ce8e53e9 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -4395,6 +4395,68 @@ jsonb_set(PG_FUNCTION_ARGS)
 }
 
 
+/*
+ * SQL function jsonb_set_lax(jsonb, text[], jsonb, boolean, text)
+ */
+Datum
+jsonb_set_lax(PG_FUNCTION_ARGS)
+{
+	/* Jsonb	   *in = PG_GETARG_JSONB_P(0); */
+	/* ArrayType  *path = PG_GETARG_ARRAYTYPE_P(1); */
+	/* Jsonb	  *newval = PG_GETARG_JSONB_P(2); */
+	/* bool		create = PG_GETARG_BOOL(3); */
+	text   *handle_null;
+	char   *handle_val;
+
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1) || PG_ARGISNULL(3))
+		PG_RETURN_NULL();
+
+	/* could happen if they pass in an explicit NULL */
+	if (PG_ARGISNULL(4))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("need delete_key, return_target, use_json_null, or raise_exception")));
+
+	if (! PG_ARGISNULL(2))
+		return jsonb_set(fcinfo);
+
+	handle_null = PG_GETARG_TEXT_P(4);
+	handle_val = text_to_cstring(handle_null);
+
+	if (strcmp(handle_val,"raise_exception") == 0)
+	{
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("null jsonb value")));
+	}
+	else if (strcmp(handle_val, "use_json_null") == 0)
+	{
+		Datum	  newval;
+
+		newval = DirectFunctionCall1(jsonb_in, CStringGetDatum("null"));
+
+		/* XXX can we stomp on fcinfo->args like this? */
+		fcinfo->args[2].value = newval;
+		fcinfo->args[2].isnull = false;
+		return jsonb_set(fcinfo);
+	}
+	else if (strcmp(handle_val, "delete_key") == 0)
+	{
+		return jsonb_delete_path(fcinfo);
+	}
+	else if (strcmp(handle_val, "return_target") == 0)
+	{
+		Jsonb	   *in = PG_GETARG_JSONB_P(0);
+		PG_RETURN_JSONB_P(in);
+	}
+	else
+	{
+		

Re: jsonb_set() strictness considered harmful to data

2019-10-24 Thread Stuart McGraw

On 10/24/19 2:17 PM, Tom Lane wrote:

Laurenz Albe  writes:

On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote:

It is less sensible with compound values where the rule can apply to
individual scalar components.


I agree that JSON can sensibly be viewed as a composite value, but ...


  And indeed that is what Postgresql does
for another compound type:

# select array_replace(array[1,2,3],2,NULL);
 array_replace
---------------
 {1,NULL,3}

The returned value is not NULL.  Why the inconsistency between the array
type and json type?


... the flaw in this argument is that the array element is actually
a SQL NULL when we're done.  To do something similar in the JSON case,
we have to translate SQL NULL to JSON null, and that's cheating to
some extent.  They're not the same thing (and I'll generally resist
proposals to, say, make SELECT 'null'::json IS NULL return true).

Maybe it's okay to make this case work like that, but don't be too
high and mighty about it being logically clean; it isn't.

regards, tom lane


Sure, but my point was not that this was a perfect "logically clean"
answer, just that the argument, which was made multiple times, that
the entire result should be NULL because "that's the way SQL NULLs
work" is not really right.

It does seem to me that mapping NULL to "null" is likely a workable
approach but that's just my uninformed opinion.




Re: jsonb_set() strictness considered harmful to data

2019-10-24 Thread Tom Lane
Laurenz Albe  writes:
> On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote:
>> It is less sensible with compound values where the rule can apply to
>> individual scalar components.

I agree that JSON can sensibly be viewed as a composite value, but ...

>>  And indeed that is what Postgresql does
>> for another compound type:
>> 
>> # select array_replace(array[1,2,3],2,NULL);
>> array_replace
>> ---
>> {1,NULL,3}
>> 
>> The returned value is not NULL.  Why the inconsistency between the array
>> type and json type?

... the flaw in this argument is that the array element is actually
a SQL NULL when we're done.  To do something similar in the JSON case,
we have to translate SQL NULL to JSON null, and that's cheating to
some extent.  They're not the same thing (and I'll generally resist
proposals to, say, make SELECT 'null'::json IS NULL return true).

Maybe it's okay to make this case work like that, but don't be too
high and mighty about it being logically clean; it isn't.

regards, tom lane
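To make the distinction Tom is drawing concrete, a small illustration (ordinary jsonb behaviour, nothing specific to this thread):

    SELECT 'null'::jsonb IS NULL AS is_sql_null,
           jsonb_typeof('null'::jsonb) AS jsonb_type;
     is_sql_null | jsonb_type
    -------------+------------
     f           | null
    (1 row)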




Re: jsonb_set() strictness considered harmful to data

2019-10-24 Thread Laurenz Albe
On Wed, 2019-10-23 at 13:00 -0600, Stuart McGraw wrote:
> > You can only say that if you don't understand NULL (you wouldn't be alone).
> > If I modify a JSON with an unknown value, the result is unknown.
> > This seems very intuitive to me.
> 
> Would you expect modifying an array value with an unknown would result
> in the entire array being unknown?

Hm, yes, that is less intuitive.
I was viewing a JSON as an atomic value above.

> > One could argue that whoever uses SQL should understand SQL.
> > 
> > But I believe that it is reasonable to suppose that many people who
> > use JSON in the database are more savvy with JSON than with SQL
> > (they might not have chosen JSON otherwise), so I agree that it makes
> > sense to change this particular behavior.
> 
> That (generally) SQL NULL results in NULL for any operation has been
> brought up multiple times in this thread, including above, as a rationale
> for the current jsonb behavior.  I don't think it is a valid argument.
> 
> When examples are given, they typically are with scalar values where
> such behavior makes sense: the resulting scalar value has to be NULL
> or non-NULL, it can't be both.
> 
> It is less sensible with compound values where the rule can apply to
> individual scalar components.  And indeed that is what Postgresql does
> for another compound type:
> 
># select array_replace(array[1,2,3],2,NULL);
> array_replace
>---
> {1,NULL,3}
> 
> The returned value is not NULL.  Why the inconsistency between the array
> type and json type?  Are there any cases other than json where the entire
> compound value is set to NULL as a result of one of its components being
> NULL?

That is a good point.

I agree that the behavior should be changed.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread Maurice Aubrey
On Wed, Oct 23, 2019 at 12:01 PM Stuart McGraw  wrote:

> When examples are given, they typically are with scalar values where
> such behavior makes sense: the resulting scalar value has to be NULL
> or non-NULL, it can't be both.
>
> It is less sensible with compound values where the rule can apply to
> individual scalar components.  And indeed that is what Postgresql does
> for another compound type:
>

I agree completely. Scalar vs compound structure seems like the essential
difference.
You don't expect an operation on an element of a compound structure to be
able to affect the entire structure.
Maurice


Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread rob stone
Hello,

On Wed, 2019-10-23 at 20:33 +0200, Peter J. Holzer wrote:
> 
> I grant that SQL NULL takes a bit to get used to. However, it is a
> core
> part of the SQL language and everyone who uses SQL must understand it
> (I
> don't remember when I first stumbled across "select * from t where c
> =
> NULL" returning 0 rows, but it was probably within the first few days
> of
> using a database). And personally I find it much easier to deal with
> concept which are applied consistently across the whole language than
> those which sometimes apply and sometimes don't seemingly at random,
> just because a developer thought it would be convenient for the
> specific
> use-case they had in mind.
> 
> hp
> 

From the JSON spec:

3.  Values

   A JSON value MUST be an object, array, number, or string, or one of
   the following three literal names:

  false
  null
  true

   The literal names MUST be lowercase.  No other literal names are
   allowed.

So, you can't set a value associated to a key to SQL NULL. If a key
should not have a value then delete that key from the JSON.

If you decide your application is going to use one of those three
literal names, then you need to code accordingly. 

My 2 cents.
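For illustration, both options described above are expressible directly with the existing jsonb tools (a minimal sketch):

    -- store an explicit JSON null under the key
    SELECT jsonb_set('{"a":1,"b":2}'::jsonb, '{b}', 'null'::jsonb);
      --> {"a": 1, "b": null}

    -- or delete the key entirely
    SELECT '{"a":1,"b":2}'::jsonb #- '{b}';
      --> {"a": 1}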






Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread Maciek Sakrejda
On Wed, Oct 23, 2019 at 12:01 PM Stuart McGraw  wrote:
> Why the inconsistency between the array
> type and json type?  Are there any cases other than json where the entire
> compound value is set to NULL as a result of one of its components being
> NULL?

That's a great point. It does look like hstore's delete / minus
operator behaves like that, though:

=# select 'a=>1,b=>2'::hstore - null;
 ?column?
----------

(1 row)




Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread Stuart McGraw

On 10/23/19 5:42 AM, Laurenz Albe wrote:

David G. Johnston wrote:

Now if only the vast majority of users could have and keep this level of 
understanding
in mind while writing complex queries so that they remember to always add 
protections
to compensate for the unique design decision that SQL has taken here...


You can only say that if you don't understand NULL (you wouldn't be alone).
If I modify a JSON with an unknown value, the result is unknown.
This seems very intuitive to me.


Would you expect modifying an array value with an unknown would result
in the entire array being unknown?


One could argue that whoever uses SQL should understand SQL.

But I believe that it is reasonable to suppose that many people who
use JSON in the database are more savvy with JSON than with SQL
(they might not have chosen JSON otherwise), so I agree that it makes
sense to change this particular behavior.

Yours,
Laurenz Albe


That (generally) SQL NULL results in NULL for any operation has been
brought up multiple times in this thread, including above, as a rationale
for the current jsonb behavior.  I don't think it is a valid argument.

When examples are given, they typically are with scalar values where
such behavior makes sense: the resulting scalar value has to be NULL
or non-NULL, it can't be both.

It is less sensible with compound values where the rule can apply to
individual scalar components.  And indeed that is what Postgresql does
for another compound type:

  # select array_replace(array[1,2,3],2,NULL);
   array_replace
  ---------------
   {1,NULL,3}

The returned value is not NULL.  Why the inconsistency between the array
type and json type?  Are there any cases other than json where the entire
compound value is set to NULL as a result of one of its components being
NULL?
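For contrast, the analogous call with the strict jsonb_set collapses the entire value (behaviour as of PostgreSQL 12):

  # select jsonb_set('[1,2,3]'::jsonb, '{1}', NULL);
   jsonb_set
  -----------

  (1 row)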




Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread Peter J. Holzer
On 2019-10-22 18:06:39 -0700, David G. Johnston wrote:
> On Tue, Oct 22, 2019 at 3:55 PM Peter J. Holzer  wrote:
> On 2019-10-20 13:20:23 -0700, Steven Pousty wrote:
> > I would think though that raising an exception is better than a
> > default behavior which deletes data.
> > As an app dev I am quite used to all sorts of "APIs" throwing
> > exceptions and have learned to deal with them.
> >
> > This is my way of saying that raising an exception is an
> > improvement over the current situation. May not be the "best"
> > solution but definitely an improvement.
> 
> I somewhat disagree. SQL isn't in general a language which uses
> exceptions a lot. It does have the value NULL to mean "unknown", and
> generally unknown combined with something else results in an unknown
> value again:
> 
> [...] 
> 
> 
> Throwing an exception for a pure function seems "un-SQLy" to me. In
> particular, jsonb_set does something similar for json values as replace
> does for strings, so it should behave similarly.
> 
> 
> Now if only the vast majority of users could have and keep this level of
> understanding in mind while writing complex queries so that they remember to
> always add protections to compensate for the unique design decision that SQL
> has taken here...

I grant that SQL NULL takes a bit to get used to. However, it is a core
part of the SQL language and everyone who uses SQL must understand it (I
don't remember when I first stumbled across "select * from t where c =
NULL" returning 0 rows, but it was probably within the first few days of
using a database). And personally I find it much easier to deal with
concept which are applied consistently across the whole language than
those which sometimes apply and sometimes don't seemingly at random,
just because a developer thought it would be convenient for the specific
use-case they had in mind.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 
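For anyone who has not run into the comparison hp mentions, the classic illustration (using his example table t with a nullable column c):

    SELECT * FROM t WHERE c = NULL;    -- the predicate is unknown for every row, so no rows are returned
    SELECT * FROM t WHERE c IS NULL;   -- the correct test; returns the rows whose c is unknown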




Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread David G. Johnston
On Wed, Oct 23, 2019 at 4:42 AM Laurenz Albe 
wrote:

> David G. Johnston wrote:
> > Now if only the vast majority of users could have and keep this level of
> understanding
> > in mind while writing complex queries so that they remember to always
> add protections
> > to compensate for the unique design decision that SQL has taken here...
>
> You can only say that if you don't understand NULL (you wouldn't be alone).
> If I modify a JSON with an unknown value, the result is unknown.
> This seems very intuitive to me.
>
> One could argue that whoever uses SQL should understand SQL.
>
> But I believe that it is reasonable to suppose that many people who
> use JSON in the database are more savvy with JSON than with SQL
> (they might not have chosen JSON otherwise), so I agree that it makes
> sense to change this particular behavior.
>

I can and do understand SQL quite well and still likely would end up being
tripped up by this (though not surprised when it happened) because I can't
and don't want to think about what will happen if NULL appears in every
expression I write when a typical SQL query can contain tens of them.  I'd
much rather assume that NULL inputs aren't going to happen and have the
system tell me when that assumption is wrong.  Having to change my
expressions to: COALESCE(original_input, function(original_input,
something_that_could_be_null_in_future_but_cannot_right_now)) just adds
undesirable mental and typing overhead.

David J.
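One common guard for the case David describes, for readers who want the old data kept rather than silently erased, is to coalesce around the strict call (a sketch; the table and column names are illustrative, and $1 stands for the possibly-NULL value supplied by the application):

    -- if $1 is NULL, to_jsonb() and then jsonb_set() return NULL,
    -- and the COALESCE falls back to the unmodified column
    UPDATE accounts
       SET details = COALESCE(jsonb_set(details, '{email}', to_jsonb($1::text)), details)
     WHERE id = 42;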


Re: jsonb_set() strictness considered harmful to data

2019-10-23 Thread Laurenz Albe
David G. Johnston wrote:
> Now if only the vast majority of users could have and keep this level of 
> understanding
> in mind while writing complex queries so that they remember to always add 
> protections
> to compensate for the unique design decision that SQL has taken here...

You can only say that if you don't understand NULL (you wouldn't be alone).
If I modify a JSON with an unknown value, the result is unknown.
This seems very intuitive to me.

One could argue that whoever uses SQL should understand SQL.

But I believe that it is reasonable to suppose that many people who
use JSON in the database are more savvy with JSON than with SQL
(they might not have chosen JSON otherwise), so I agree that it makes
sense to change this particular behavior.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: jsonb_set() strictness considered harmful to data

2019-10-22 Thread David G. Johnston
On Tue, Oct 22, 2019 at 3:55 PM Peter J. Holzer  wrote:

> On 2019-10-20 13:20:23 -0700, Steven Pousty wrote:
> > I would think though that raising an exception is better than a default
> > behavior which deletes data.
> > As an app dev I am quite used to all sorts of "APIs" throwing exceptions
> and
> > have learned to deal with them.
> >
> > This is my way of saying that raising an exception is an improvement
> over the
> > current situation. May not be the "best" solution but definitely an
> > improvement.
>
> I somewhat disagree. SQL isn't in general a language which uses
> exceptions a lot. It does have the value NULL to mean "unknown", and
> generally unknown combined with something else results in an unknown
> value again:
>
[...]

>
> Throwing an exception for a pure function seems "un-SQLy" to me. In
> particular, jsonb_set does something similar for json values as replace
> does for strings, so it should behave similarly.
>

Now if only the vast majority of users could have and keep this level of
understanding in mind while writing complex queries so that they remember
to always add protections to compensate for the unique design decision that
SQL has taken here...

In this case I would favor a break from the historical to a more safe
design, regardless of its novelty in the codebase, since the usage patterns
and risks involved with typical JSON using code are considerably
different/larger than those for "replace".

Just because its always been done one way, and we won't change existing
code, doesn't mean we shouldn't apply lessons learned to newer code.  In
the case of JSON maybe its too late to worry about changing (though moving
to exception is safe) but a policy choice now could at least pave the way
to avoid this situation when the next new datatype is implemented.  In many
functions we do provoke exceptions when known invalid input is provided -
supplying a function with a primary/important argument being undefined
should fall into the same "malformed" category of problematic input.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-22 Thread Peter J. Holzer
On 2019-10-22 09:16:05 +1100, raf wrote:
> Steven Pousty wrote:
> > In a perfect world I would agree with you. But often users do not read ALL
> > the documentation before they use the function in their code OR they are
> > not sure that the condition applies to them (until it does).
> 
> I'm well aware of that, hence the statement that this
> information needs to appear at the place in the
> documentation where the user is first going to
> encounter the function (i.e. in the table where its
> examples are).

I think this is a real weakness of the tabular format used for
documenting functions: While it is quite compact which is nice if you
just want to look up a function's name or parameters, it really
discourages explanations longer than a single paragraph.

Section 9.9 gets around this problem by limiting the in-table
description to a few words and "see Section 9.9.x". So you basically
have to read the text and not just the table. Maybe that would make
sense for the json functions, too?

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: jsonb_set() strictness considered harmful to data

2019-10-22 Thread Peter J. Holzer
On 2019-10-21 09:39:13 -0700, Steven Pousty wrote:
> Turning a JSON null into a SQL null  and thereby "deleting" the data
> is not the path of least surprises.

But it doesn't do that: A JSON null is perfectly fine:

wds=> select jsonb_set('{"a": 1, "b": 2}'::jsonb, '{c}', 'null'::jsonb);
╔═╗
║  jsonb_set  ║
╟─╢
║ {"a": 1, "b": 2, "c": null} ║
╚═╝
(1 row)


It is trying to replace a part of the JSON object with an SQL NULL (i.e.
unknown) which returns SQL NULL:

wds=> select jsonb_set('{"a": 1, "b": 2}'::jsonb, '{c}', NULL);
╔═══╗
║ jsonb_set ║
╟───╢
║ (∅)   ║
╚═══╝
(1 row)

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: jsonb_set() strictness considered harmful to data

2019-10-22 Thread Peter J. Holzer
On 2019-10-20 13:20:23 -0700, Steven Pousty wrote:
> I would think though that raising an exception is better than a default
> behavior which deletes data.
> As an app dev I am quite used to all sorts of "APIs" throwing exceptions and
> have learned to deal with them.
> 
> This is my way of saying that raising an exception is an improvement over the
> current situation. May not be the "best" solution but definitely an
> improvement.

I somewhat disagree. SQL isn't in general a language which uses
exceptions a lot. It does have the value NULL to mean "unknown", and
generally unknown combined with something else results in an unknown
value again:

% psql wds
Null display is "(∅)".
Line style is unicode.
Border style is 2.
Unicode border line style is "double".
Timing is on.
Expanded display is used automatically.
psql (11.5 (Ubuntu 11.5-3.pgdg18.04+1))
Type "help" for help.

wds=> select 4 + NULL;
╔══╗
║ ?column? ║
╟──╢
║  (∅) ║
╚══╝
(1 row)

Time: 0.924 ms
wds=> select replace('steven', 'e', NULL);
╔═╗
║ replace ║
╟─╢
║ (∅) ║
╚═╝
(1 row)

Time: 0.918 ms

Throwing an exception for a pure function seems "un-SQLy" to me. In
particular, jsonb_set does something similar for json values as replace
does for strings, so it should behave similarly.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread raf
Steven Pousty wrote:

> On Sun, Oct 20, 2019 at 4:31 PM raf  wrote:
> 
> > Steven Pousty wrote:
> >
> > > I would think though that raising an exception is better than a
> > > default behavior which deletes data.
> >
> > I can't help but feel the need to make the point that
> > the function is not deleting anything. It is just
> > returning null. The deletion of data is being performed
> > by an update statement that uses the function's return
> > value to set a column value.
> >
> > I don't agree that raising an exception in the function
> > is a good idea (perhaps unless it's valid to assume
> > that this function will only ever be used in such a
> > context). Making the column not null (as already
> > suggested) and having the update statement itself raise
> > the exception seems more appropriate if an exception is
> > desirable. But that presumes an accurate understanding
> > of the behaviour of jsonb_set.
> >
> > Really, I think the best fix would be in the
> > documentation so that everyone who finds the function
> > in the documentation understands its behaviour
> > immediately.
> >
> Hey Raf
> 
> In a perfect world I would agree with you. But often users do not read ALL
> the documentation before they use the function in their code OR they are
> not sure that the condition applies to them (until it does).

I'm well aware of that, hence the statement that this
information needs to appear at the place in the
documentation where the user is first going to
encounter the function (i.e. in the table where its
examples are). Even putting it in a note box further
down the page might not be enough (but hopefully it
will be).

cheers,
raf





Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Adrian Klaver

On 10/21/19 12:50 PM, Tomas Vondra wrote:

On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote:

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception 
blocks in stored functions and now sub-transactions in stored procedures.




Sorry for the confusion - I've not been particularly careful when
writing that response.

Let me illustrate the issue with this example:

    CREATE TABLE t (a int);

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
   msg TEXT;
    BEGIN
  -- SAVEPOINT s1;
  INSERT INTO t VALUES (1);
  -- COMMIT;
    EXCEPTION
  WHEN others THEN
    msg := SUBSTR(SQLERRM, 1, 100);
    RAISE NOTICE 'error: %', msg;
    END; $$;

    CALL test();

If you uncomment the SAVEPOINT, you get

    NOTICE:  error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine.

If you uncomment the COMMIT, you get

    NOTICE:  error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and
we can't commit when it's active.

But we can commit outside the exception block:

    CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
    DECLARE
   msg TEXT;
    BEGIN
  BEGIN
    INSERT INTO t VALUES (1);
  EXCEPTION
    WHEN others THEN
  msg := SUBSTR(SQLERRM, 1, 100);
  RAISE NOTICE 'error: %', msg;
   END;
   COMMIT;
    END; $$;


You can do something like the below though:

CREATE TABLE t (a int PRIMARY KEY);

CREATE OR REPLACE PROCEDURE public.test()
 LANGUAGE plpgsql
AS $procedure$
   DECLARE
  msg TEXT;
   BEGIN
 BEGIN
   INSERT INTO t VALUES (1);
 EXCEPTION
   WHEN others THEN
 msg := SUBSTR(SQLERRM, 1, 100);
 RAISE NOTICE 'error: %', msg;
 UPDATE t set a = 2;
  END;
  COMMIT;
   END; $procedure$

test_(postgres)# CALL test();
CALL
test_(postgres)# select * from t;
 a
---
 1
(1 row)

test_(postgres)# CALL test();
NOTICE:  error: duplicate key value violates unique constraint "t_pkey"
CALL
test_(postgres)# select * from t;
 a
---
 2
(1 row)





regards




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Tomas Vondra

On Mon, Oct 21, 2019 at 08:06:46AM -0700, Adrian Klaver wrote:

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me as I thought there are exception 
blocks in stored functions and now sub-transactions in stored 
procedures.




Sorry for the confusion - I've not been particularly careful when
writing that response.

Let me illustrate the issue with this example:

   CREATE TABLE t (a int);

   CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
   DECLARE
  msg TEXT;
   BEGIN
 -- SAVEPOINT s1;
 INSERT INTO t VALUES (1);
 -- COMMIT;
   EXCEPTION
 WHEN others THEN
   msg := SUBSTR(SQLERRM, 1, 100);
   RAISE NOTICE 'error: %', msg;
   END; $$;

   CALL test();

If you uncomment the SAVEPOINT, you get

   NOTICE:  error: unsupported transaction command in PL/pgSQL

because savepoints are not allowed in stored procedures. Fine.

If you uncomment the COMMIT, you get

   NOTICE:  error: cannot commit while a subtransaction is active

which happens because the EXCEPTION block creates a subtransaction, and
we can't commit when it's active.

But we can commit outside the exception block:

   CREATE OR REPLACE PROCEDURE test() LANGUAGE plpgsql AS $$
   DECLARE
  msg TEXT;
   BEGIN
 BEGIN
   INSERT INTO t VALUES (1);
 EXCEPTION
   WHEN others THEN
 msg := SUBSTR(SQLERRM, 1, 100);
 RAISE NOTICE 'error: %', msg;
  END;
  COMMIT;
   END; $$;


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Steve Atkins


On 21/10/2019 17:39, Steven Pousty wrote:
> Turning a JSON null into a SQL null and thereby "deleting" the data
> is not the path of least surprises.


In what situation does that happen? (If it's already been mentioned I 
missed it, long thread, sorry).


Cheers,
  Steve




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Steven Pousty
On Sun, Oct 20, 2019 at 4:31 PM raf  wrote:

> Steven Pousty wrote:
>
> > I would think though that raising an exception is better than a
> > default behavior which deletes data.
>
> I can't help but feel the need to make the point that
> the function is not deleting anything. It is just
> returning null. The deletion of data is being performed
> by an update statement that uses the function's return
> value to set a column value.
>
> I don't agree that raising an exception in the function
> is a good idea (perhaps unless it's valid to assume
> that this function will only ever be used in such a
> context). Making the column not null (as already
> suggested) and having the update statement itself raise
> the exception seems more appropriate if an exception is
> desirable. But that presumes an accurate understanding
> of the behaviour of jsonb_set.
>
> Really, I think the best fix would be in the
> documentation so that everyone who finds the function
> in the documentation understands its behaviour
> immediately.
>
>
>
Hey Raf

In a perfect world I would agree with you. But often users do not read ALL
the documentation before they use the function in their code OR they are
not sure that the condition applies to them (until it does).  Turning a
JSON null into a SQL null  and thereby "deleting" the data is not the path
of least surprises.

So while we could say reading the documentation is the proper path it is
not the most helpful path. I am not arguing against doc'ing the behavior no
matter what we decide on. What I am saying is an exception is better than
the current situation if we can't agree to any other solution. An exception
is better than just doc but probably not the best solution. (and it seems
like most other people have said as well but the lag on a mailing list is
getting us overlapping).

I see people saying Null pointer exceptions are not helpful. I mostly
agree, they are not the most helpful kind of exception BUT they are better
than some alternatives. So I think it would be better to say NPEs are not
as helpful as they possibly could be.
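For completeness, the exception behaviour Steven is asking for can be had today with a thin wrapper, without any server-side changes; a sketch (the name jsonb_set_checked and the message wording are made up here):

    CREATE FUNCTION jsonb_set_checked(target jsonb, path text[],
                                      new_value jsonb,
                                      create_missing boolean DEFAULT true)
    RETURNS jsonb
    LANGUAGE plpgsql IMMUTABLE AS $$
    BEGIN
        -- refuse an SQL NULL replacement value instead of letting the
        -- strict jsonb_set() silently return NULL
        IF new_value IS NULL THEN
            RAISE EXCEPTION 'new_value must not be SQL NULL'
                  USING HINT = 'pass ''null''::jsonb for a JSON null, or delete the key instead';
        END IF;
        RETURN jsonb_set(target, path, new_value, create_missing);
    END;
    $$;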


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread David G. Johnston
On Sun, Oct 20, 2019 at 3:51 PM Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

> I'm not arguing against the idea of improving the situation. But I am
> arguing against a minimal fix that will not provide much of value to a
> careful app developer. i.e. I want to do more to support app devs.
> Ideally they would not need to use wrapper functions. There will be
> plenty of situations where it is mighty inconvenient to catch an
> exception thrown by jsonb_set(). And catching exceptions can be
> expensive. You want to avoid that if possible in your
> performance-critical plpgsql code.
>

As there is pretty much nothing that can be done at runtime if this
exception is raised actually "catching" it anywhere deeper than near the
top of the application code is largely pointless.  Its more like a
NullPointerException in Java - if the application raises it there should be
a last line of defense error handler that basically says "you developer
made a mistake somewhere and needs to fix it - tell them this happened".

Performance critical subsections (and pretty much the whole) of the
application can just raise the error to the caller using normal mechanisms
for "SQLException" propogation.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Adrian Klaver

On 10/20/19 11:07 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:




True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).



Can you explain the above to me, as I thought there are exception blocks
in stored functions and now also sub-transactions in stored procedures?




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Andrew Dunstan


On 10/21/19 2:07 AM, Tomas Vondra wrote:
> On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:
>>
>>> I think the general premise of this thread is that the application
>>> developer does not realize that may be necessary, because it's a bit
>>> surprising behavior, particularly when having more experience with
>>> other
>>> databases that behave differently. It's also pretty easy to not notice
>>> this issue for a long time, resulting in significant data loss.
>>>
>>> Let's say you're used to the MSSQL or MySQL behavior, you migrate your
>>> application to PostgreSQL or whatever - how do you find out about this
>>> behavior? Users are likely to visit
>>>
>>>    https://www.postgresql.org/docs/12/functions-json.html
>>>
>>> but that says nothing about how jsonb_set works with NULL values :-(
>>
>>
>>
>> We should certainly fix that. I accept some responsibility for the
>> omission.
>>
>
> +1
>
>


So let's add something to the JSON funcs page  like this:


Note: All the above functions except for json_build_object,
json_build_array, json_to_recordset, json_populate_record, and
json_populate_recordset and their jsonb equivalents are strict
functions. That is, if any argument is NULL the function result will be
NULL and the function won't even be called. Particular care should
therefore be taken to avoid passing NULL arguments to those functions
unless a NULL result is expected. This is particularly true of the
jsonb_set and jsonb_insert functions.



(We do have a heck of a lot of Note: sections on that page)


cheers


andrew



-- 
Andrew Dunstan    https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
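The hazard the proposed note warns about, spelled out as a sketch (the table and data are illustrative):

    CREATE TABLE users (id int PRIMARY KEY, prefs jsonb);
    INSERT INTO users VALUES (1, '{"theme": "dark", "lang": "en"}');

    -- the application passes an SQL NULL without realising it;
    -- jsonb_set is strict, so the whole prefs document becomes NULL
    UPDATE users SET prefs = jsonb_set(prefs, '{lang}', NULL) WHERE id = 1;

    SELECT prefs FROM users WHERE id = 1;
     prefs
    -------

    (1 row)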





Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Tomas Vondra

On Sun, Oct 20, 2019 at 06:51:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 4:18 PM, Tomas Vondra wrote:

On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 1:14 PM, David G. Johnston wrote:

On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
<andrew.duns...@2ndquadrant.com> wrote:

    And yet another is to
    raise an exception, which is easy to write but really punts the issue
    back to the application programmer who will have to decide how to ensure
    they never pass in a NULL parameter.


That's kinda the point - if they never pass NULL they won't encounter
any problems but as soon as the data and their application don't see
eye-to-eye the application developer has to decide what they want to
do about it.  We are in no position to decide for them and making it
obvious they have a decision to make and implement here doesn't seem
like a improper position to take.



The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like
this.



I think the general premise of this thread is that the application
developer does not realize that may be necessary, because it's a bit
surprising behavior, particularly when having more experience with other
databases that behave differently. It's also pretty easy to not notice
this issue for a long time, resulting in significant data loss.

Let's say you're used to the MSSQL or MySQL behavior, you migrate your
application to PostgreSQL or whatever - how do you find out about this
behavior? Users are likely to visit

   https://www.postgresql.org/docs/12/functions-json.html

but that says nothing about how jsonb_set works with NULL values :-(




We should certainly fix that. I accept some responsibility for the omission.



+1





You're right raising an exception may not be the "right behavior" for
whatever definition of "right". But I kinda agree with David that it's
somewhat reasonable when we don't know what the "universally correct"
thing is (or when there's no such thing). IMHO that's better than
silently discarding some of the data.



I'm not arguing against the idea of improving the situation. But I am
arguing against a minimal fix that will not provide much of value to a
careful app developer. i.e. I want to do more to support app devs.
Ideally they would not need to use wrapper functions. There will be
plenty of situations where it is mighty inconvenient to catch an
exception thrown by jsonb_set(). And catching exceptions can be
expensive. You want to avoid that if possible in your
performance-critical plpgsql code.



True. And AFAIK catching exceptions is not really possible in some code,
e.g. in stored procedures (because we can't do subtransactions, so no
exception blocks).





FWIW I think the JSON/JSONB part of our code base is amazing, and the
fact that various other databases adopted something very similar over
the last couple of years just confirms that. And if this is the only
speck of dust in the API, I think that's pretty amazing.



TY. When I first saw the SQL/JSON spec I thought I should send a request
to the SQL standards committee for a royalty payment, since it looked so
familiar ;-)



;-)





I'm not sure how significant this issue actually is - it's true we got a
couple of complaints over the years (judging by a quick search for
jsonb_set and NULL in the archives), but I'm not sure that's enough to
justify any changes in backbranches. I'd say no, but I have no idea how
many people are affected by this but don't know about it ...




No, no backpatching. As I said upthread, this isn't a bug, but it is
arguably a POLA violation, which is why we should do something for
release 13.



WFM


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Abelard Hoffman
>
>
>> I would argue that only if the target parameter (the actual json value)
> is NULL should the result be NULL. The function is documented as returning
> the target, with modifications to a small part of its structure as
> specified by the other parameters. It is strange for the result to suddenly
> collapse down to NULL just because another parameter is NULL. Perhaps if
> the path is NULL, that can mean "don't update". And if create_missing is
> NULL, that should mean the same as not specifying it. I think. At a
> minimum, if we don't change it, the documentation needs to get one of those
> warning boxes alerting people that the functions will destroy their input
> entirely rather than slightly modifying it if any of the other parameters
> are NULL.
>
> My only doubt about any of this is that by the same argument, functions
> like replace() should not return NULL if the 2nd or 3rd parameter is NULL.
> I'm guessing replace() is specified by SQL and also unchanged in many
> versions so therefore not eligible for re-thinking but it still gives me
> just a bit of pause.
>

That's the essential difference though, no? With jsonb, conceptually, we
have a nested row. That's where we get confused. We think that the
operation should affect the element within the nested structure, not the
structure itself.

It would be equivalent to replace() nulling out the entire row on null.

I understand the logic behind it, but I also definitely see why it's not
intuitive.

AH


Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread rob stone
Hello,

On Sun, 2019-10-20 at 18:51 -0400, Andrew Dunstan wrote:
> On 10/20/19 4:18 PM, Tomas Vondra wrote:
> > 
> >https://www.postgresql.org/docs/12/functions-json.html
> > 
> > but that says nothing about how jsonb_set works with NULL values :-(
> 
> 
> We should certainly fix that. I accept some responsibility for the
> omission.
> 
> 
> 

FWIW, if you are able to update the documentation, the current JSON RFC
is 8259.

https://tools.ietf.org/html/rfc8259


Cheers,
Rob








Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread raf
Steven Pousty wrote:

> I would think though that raising an exception is better than a
> default behavior which deletes data.

I can't help but feel the need to make the point that
the function is not deleting anything. It is just
returning null. The deletion of data is being performed
by an update statement that uses the function's return
value to set a column value.

I don't agree that raising an exception in the function
is a good idea (perhaps unless it's valid to assume
that this function will only ever be used in such a
context). Making the column not null (as already
suggested) and having the update statement itself raise
the exception seems more appropriate if an exception is
desirable. But that presumes an accurate understanding
of the behaviour of jsonb_set.

Really, I think the best fix would be in the
documentation so that everyone who finds the function
in the documentation understands its behaviour
immediately. I didn't even know there was such a thing
as a strict function or what it means and the
documentation for jsonb_set doesn't mention that it is
a strict function and the examples of its use don't
demonstrate this behaviour. I'm referring to
https://www.postgresql.org/docs/9.5/functions-json.html.

All of this contributes to the astonishment encountered
here. Least astonishment can probably be achieved with
additional documentation but it has to be where the
reader is looking when they first encounter the
function in the documentation so that their
expectations are set correctly and set early. And
documentation can be "fixed" sooner than postgresql 13.

Perhaps an audit of the documentation for all strict
functions would be a good idea to see if they need
work. Knowing that a function won't be executed at all
and will effectively return null when given a null
argument might be important to know for other functions
as well.

cheers,
raf





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Andrew Dunstan


On 10/20/19 4:18 PM, Tomas Vondra wrote:
> On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote:
>>
>> On 10/20/19 1:14 PM, David G. Johnston wrote:
>>> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
>>> <andrew.duns...@2ndquadrant.com> wrote:
>>>
>>>     And yet another is to
>>>     raise an exception, which is easy to write but really punts the
>>> issue
>>>     back to the application programmer who will have to decide how to
>>>     ensure
>>>     they never pass in a NULL parameter.
>>>
>>>
>>> That's kinda the point - if they never pass NULL they won't encounter
>>> any problems but as soon as the data and their application don't see
>>> eye-to-eye the application developer has to decide what they want to
>>> do about it.  We are in no position to decide for them and making it
>>> obvious they have a decision to make and implement here doesn't seem
>>> like an improper position to take.
>>
>>
>> The app dev can avoid this problem today by making sure they don't pass
>> a NULL as the value. Or they can use a wrapper function which does that
>> for them. So frankly this doesn't seem like much of an advance. And, as
>> has been noted, it's not consistent with what either MySQL or MSSQL do.
>> In general I'm not that keen on raising an exception for cases like
>> this.
>>
>
> I think the general premise of this thread is that the application
> developer does not realize that may be necessary, because it's a bit
> surprising behavior, particularly when having more experience with other
> databases that behave differently. It's also pretty easy to not notice
> this issue for a long time, resulting in significant data loss.
>
> Let's say you're used to the MSSQL or MySQL behavior, you migrate your
> application to PostgreSQL or whatever - how do you find out about this
> behavior? Users are likely to visit
>
>    https://www.postgresql.org/docs/12/functions-json.html
>
> but that says nothing about how jsonb_set works with NULL values :-(



We should certainly fix that. I accept some responsibility for the omission.



>
> You're right raising an exception may not be the "right behavior" for
> whatever definition of "right". But I kinda agree with David that it's
> somewhat reasonable when we don't know what the "universally correct"
> thing is (or when there's no such thing). IMHO that's better than
> silently discarding some of the data.


I'm not arguing against the idea of improving the situation. But I am
arguing against a minimal fix that will not provide much of value to a
careful app developer. i.e. I want to do more to support app devs.
Ideally they would not need to use wrapper functions. There will be
plenty of situations where it is mighty inconvenient to catch an
exception thrown by jsonb_set(). And catching exceptions can be
expensive. You want to avoid that if possible in your
performance-critical plpgsql code.



>
> FWIW I think the JSON/JSONB part of our code base is amazing, and the
> fact that various other databases adopted something very similar over
> the last couple of years just confirms that. And if this is the only
> speck of dust in the API, I think that's pretty amazing.


TY. When I first saw the SQL/JSON spec I thought I should send a request
to the SQL standards committee for a royalty payment, since it looked so
familiar ;-)


>
> I'm not sure how significant this issue actually is - it's true we got a
> couple of complaints over the years (judging by a quick search for
> jsonb_set and NULL in the archives), but I'm not sure that's enough to
> justify any changes in backbranches. I'd say no, but I have no idea how
> many people are affected by this but don't know about it ...
>
>

No, no backpatching. As I said upthread, this isn't a bug, but it is
arguably a POLA violation, which is why we should do something for
release 13.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Paul A Jungwirth
> That said, I think it is reasonable that a PostgreSQL JSON function
> behaves in the way that JSON users would expect, so here is my +1 for
> interpreting an SQL NULL as a JSON null in the above case

Just to chime in as another application developer: the current
functionality does seem pretty surprising and dangerous to me. Raising
an exception seems pretty annoying. Setting the key's value to a JSON
null would be fine, but I also like the idea of removing the key
entirely, since that gives you strictly more functionality: you can
always set the key to a JSON null by passing one in, if that's what
you want. But there are lots of other functions that convert SQL NULL
to JSON null:

postgres=# select row_to_json(row(null)), json_build_object('foo',
null), json_object(array['foo', null]), json_object(array['foo'],
array[null]);
 row_to_json | json_build_object |  json_object   |  json_object
-------------+-------------------+----------------+----------------
 {"f1":null} | {"foo" : null}    | {"foo" : null} | {"foo" : null}
(1 row)

(The jsonb variants give the same results.)

I think those functions are very similar to json_set here, and I'd
expect json_set to do what they do (i.e. convert SQL NULL to JSON
null).

Paul




Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Laurenz Albe
On Fri, 2019-10-18 at 21:18 -0500, Ariadne Conill wrote:
> postgres=# \pset null '(null)'
> Null display is "(null)".
> postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
>  jsonb_set
> -----------
>  (null)
> (1 row)
> 
> This behaviour is basically giving an application developer a loaded
> shotgun and pointing it at their feet.  It is not a good design.  It
> is a design which has likely led to many users experiencing
> unintentional data loss.

I understand your sentiments, even if you voiced them too drastically for
my taste.

The basic problem is that SQL NULL and JSON null have different semantics,
and while it is surprising for you that a database function returns NULL
if an argument is NULL, many database people would be surprised by the
opposite.  Please have some understanding.

That said, I think it is reasonable that a PostgreSQL JSON function
behaves in the way that JSON users would expect, so here is my +1 for
interpreting an SQL NULL as a JSON null in the above case, so that the
result of the above becomes
{"a": null, "b": 2, "c": 3}

-1 for backpatching such a change.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Steven Pousty
I would think though that raising an exception is better than a default
behavior which deletes data.
As an app dev I am quite used to all sorts of "APIs" throwing exceptions
and have learned to deal with them.

This is my way of saying that raising an exception is an improvement over
the current situation. May not be the "best" solution but definitely an
improvement.
Thanks
Steve

On Sun, Oct 20, 2019 at 12:48 PM Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

>
> On 10/20/19 1:14 PM, David G. Johnston wrote:
> > On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
> > <andrew.duns...@2ndquadrant.com> wrote:
> >
> > And yet another is to
> > raise an exception, which is easy to write but really punts the issue
> > back to the application programmer who will have to decide how to
> > ensure
> > they never pass in a NULL parameter.
> >
> >
> > That's kinda the point - if they never pass NULL they won't encounter
> > any problems but as soon as the data and their application don't see
> > eye-to-eye the application developer has to decide what they want to
> > do about it.  We are in no position to decide for them and making it
> > obvious they have a decision to make and implement here doesn't seem
> > like an improper position to take.
>
>
> The app dev can avoid this problem today by making sure they don't pass
> a NULL as the value. Or they can use a wrapper function which does that
> for them. So frankly this doesn't seem like much of an advance. And, as
> has been noted, it's not consistent with what either MySQL or MSSQL do.
> In general I'm not that keen on raising an exception for cases like this.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan                https://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>
>


Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Tomas Vondra

On Sun, Oct 20, 2019 at 03:48:05PM -0400, Andrew Dunstan wrote:


On 10/20/19 1:14 PM, David G. Johnston wrote:

On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
<andrew.duns...@2ndquadrant.com> wrote:

And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to
ensure
they never pass in a NULL parameter.


That's kinda the point - if they never pass NULL they won't encounter
any problems but as soon as the data and their application don't see
eye-to-eye the application developer has to decide what they want to
do about it.  We are in no position to decide for them and making it
obvious they have a decision to make and implement here doesn't seem
like an improper position to take.



The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like this.



I think the general premise of this thread is that the application
developer does not realize that may be necessary, because it's a bit
surprising behavior, particularly when having more experience with other
databases that behave differently. It's also pretty easy to not notice
this issue for a long time, resulting in significant data loss.

Let's say you're used to the MSSQL or MySQL behavior, you migrate your
application to PostgreSQL or whatever - how do you find out about this
behavior? Users are likely to visit

   https://www.postgresql.org/docs/12/functions-json.html

but that says nothing about how jsonb_set works with NULL values :-(

You're right raising an exception may not be the "right behavior" for
whatever definition of "right". But I kinda agree with David that it's
somewhat reasonable when we don't know what the "universally correct"
thing is (or when there's no such thing). IMHO that's better than
silently discarding some of the data.

FWIW I think the JSON/JSONB part of our code base is amazing, and the
fact that various other databases adopted something very similar over
the last couple of years just confirms that. And if this is the only
speck of dust in the API, I think that's pretty amazing.

I'm not sure how significant this issue actually is - it's true we got a
couple of complaints over the years (judging by a quick search for
jsonb_set and NULL in the archives), but I'm not sure that's enough to
justify any changes in backbranches. I'd say no, but I have no idea how
many people are affected by this but don't know about it ...

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Andrew Dunstan


On 10/20/19 1:14 PM, David G. Johnston wrote:
> On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan
> <andrew.duns...@2ndquadrant.com> wrote:
>
> And yet another is to
> raise an exception, which is easy to write but really punts the issue
> back to the application programmer who will have to decide how to
> ensure
> they never pass in a NULL parameter.
>
>
> That's kinda the point - if they never pass NULL they won't encounter
> any problems but as soon as the data and their application don't see
> eye-to-eye the application developer has to decide what they want to
> do about it.  We are in no position to decide for them and making it
> obvious they have a decision to make and implement here doesn't seem
> like an improper position to take.


The app dev can avoid this problem today by making sure they don't pass
a NULL as the value. Or they can use a wrapper function which does that
for them. So frankly this doesn't seem like much of an advance. And, as
has been noted, it's not consistent with what either MySQL or MSSQL do.
In general I'm not that keen on raising an exception for cases like this.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread David G. Johnston
On Sun, Oct 20, 2019 at 5:31 AM Andrew Dunstan <
andrew.duns...@2ndquadrant.com> wrote:

> And yet another is to
> raise an exception, which is easy to write but really punts the issue
> back to the application programmer who will have to decide how to ensure
> they never pass in a NULL parameter.


That's kinda the point - if they never pass NULL they won't encounter any
problems but as soon as the data and their application don't see eye-to-eye
the application developer has to decide what they want to do about it.  We
are in no position to decide for them and making it obvious they have a
decision to make and implement here doesn't seem like an improper position
to take.


> Possibly we could even add an extra
> parameter to specify what should be done.
>

Has appeal.



> Should we return NULL in those cases as we do now?
>

Probably the same thing - though I'd accept having the input json being
null result in the output json being null as well.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Isaac Morland
On Sun, 20 Oct 2019 at 08:32, Andrew Dunstan 
wrote:

>
> Understood. I think the real question here is what it should do instead
> when the value is NULL. Your behaviour above is one suggestion, which I
> personally find intuitive. Another has been to remove the associated
> key. Another is to return the original target. And yet another is to
> raise an exception, which is easy to write but really punts the issue
> back to the application programmer who will have to decide how to ensure
> they never pass in a NULL parameter. Possibly we could even add an extra
> parameter to specify what should be done.
>

I vote for remove the key. If we make NULL and 'null'::jsonb the same,
we're missing an opportunity to provide more functionality. Sometimes it's
convenient to be able to handle both the "update" and "remove" cases with
one function, just depending on the parameter value supplied.
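
For illustration, a minimal sketch of a wrapper with those semantics (the
function name is made up here, and SQL NULL is taken to mean "remove the
key at path"; this is only an example, not a proposal for core):

    create or replace function jsonb_set_or_delete
        (target jsonb, path text[], new_value jsonb, create_missing
    boolean default true)
    returns jsonb as
    $func$
        select case
            when new_value is null then target #- path
            else jsonb_set(target, path, new_value, create_missing)
        end
    $func$ language sql;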

Also, the question will arise what to do when any of the other
> parameters are NULL. Should we return NULL in those cases as we do now?
>

I would argue that only if the target parameter (the actual json value) is
NULL should the result be NULL. The function is documented as returning the
target, with modifications to a small part of its structure as specified by
the other parameters. It is strange for the result to suddenly collapse
down to NULL just because another parameter is NULL. Perhaps if the path is
NULL, that can mean "don't update". And if create_missing is NULL, that
should mean the same as not specifying it. I think. At a minimum, if we
don't change it, the documentation needs to get one of those warning boxes
alerting people that the functions will destroy their input entirely rather
than slightly modifying it if any of the other parameters are NULL.

My only doubt about any of this is that by the same argument, functions
like replace() should not return NULL if the 2nd or 3rd parameter is NULL.
I'm guessing replace() is specified by SQL and also unchanged in many
versions so therefore not eligible for re-thinking but it still gives me
just a bit of pause.
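
For what it's worth, the replace() comparison checks out - it is strict in
just the same way (psql session for illustration):

    postgres=# \pset null '(null)'
    Null display is "(null)".
    postgres=# select replace('abc', 'b', NULL);
     replace
    ---------
     (null)
    (1 row)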


Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Andrew Dunstan


On 10/20/19 4:39 AM, Floris Van Nee wrote:
>
> FWIW I've been bitten by this 'feature' more than once as well,
> accidentally erasing a column. Now I usually write js = jsonb_set(js,
> coalesce(new_column, 'null'::jsonb)) to prevent erasing the whole
> column, and instead setting the value to a jsonb null value, but I
> also found the STRICT behavior very surprising at first..
>
>
>



Understood. I think the real question here is what it should do instead
when the value is NULL. Your behaviour above is one suggestion, which I
personally find intuitive. Another has been to remove the associated
key. Another is to return the original target. And yet another is to
raise an exception, which is easy to write but really punts the issue
back to the application programmer who will have to decide how to ensure
they never pass in a NULL parameter. Possibly we could even add an extra
parameter to specify what should be done.


Also, the question will arise what to do when any of the other
parameters are NULL. Should we return NULL in those cases as we do now?


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Steve Atkins



On 19/10/2019 07:52, Ariadne Conill wrote:


I would say that any thing like

update whatever set column=jsonb_set(column, '{foo}', NULL)

should throw an exception.  It should do, literally, *anything* else
but blank that column.


steve=# create table foo (bar jsonb not null);
CREATE TABLE
steve=# insert into foo (bar) values ('{"a":"b"}');
INSERT 0 1
steve=# update foo set bar = jsonb_set(bar, '{foo}', NULL);
ERROR:  null value in column "bar" violates not-null constraint
DETAIL:  Failing row contains (null).
steve=# update foo set bar = jsonb_set(bar, '{foo}', 'null'::jsonb);
UPDATE 1

I don't see any behaviour that's particularly surprising there? Though I 
understand how an app developer who's light on SQL might get it wrong - 
and I've made similar mistakes in schema upgrade scripts without 
involving jsonb.


Cheers,
  Steve





jsonb_set() strictness considered harmful to data

2019-10-20 Thread Floris Van Nee
FWIW I've been bitten by this 'feature' more than once as well, accidentally 
erasing a column. Now I usually write js = jsonb_set(js, coalesce(new_column, 
'null'::jsonb)) to prevent erasing the whole column, and instead setting the 
value to a jsonb null value, but I also found the STRICT behavior very 
surprising at first..


-Floris



Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Ariadne Conill
Hello,

On Sat, Oct 19, 2019, 3:27 PM Tomas Vondra 
wrote:

> On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote:
> >
> >On 10/19/19 12:32 PM, David G. Johnston wrote:
> >> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
> >> <tomas.von...@2ndquadrant.com>
> >> wrote:
> >>
> >> >
> >> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
> >> >since 9.5. That's five releases ago.  So it's a bit late to be
> >> coming to
> >> >us telling us it's not safe (according to your preconceptions of
> >> what it
> >> >should be doing).
> >> >
> >>
> >>
> >> There have been numerous complaints and questions about this behavior
> >> in those five years; and none of the responses to those defenses has
> >> actually made the current behavior sound beneficial but rather have
> >> simply said "this is how it works, deal with it".
> >
> >
> >I haven't seen a patch, which for most possible solutions should be
> >fairly simple to code. This is open source. Code speaks louder than
> >complaints.
> >
>
> IMHO that might be a bit too harsh - I'm not surprised no one sent a
> patch when we're repeatedly telling people "you're holding it wrong".
> Without a clear consensus what the "correct" behavior is, I wouldn't
> send a patch either.
>
> >
> >>
> >> >
> >> >We could change it prospectively (i.e. from release 13 on) if we
> >> choose.
> >> >But absent an actual bug (i.e. acting contrary to documented
> >> behaviour)
> >> >we do not normally backpatch such changes, especially when there
> is a
> >> >simple workaround for the perceived problem. And it's that policy
> >> that
> >> >is in large measure responsible for Postgres' deserved reputation
> for
> >> >stability.
> >> >
> >>
> >> Yeah.
> >>
> >>
> >> Agreed, this is v13 material if enough people come on board to support
> >> making a change.
> >
> >
> >
> >We have changed such things in the past. But maybe a new function might
> >be a better way to go. I haven't given it enough thought yet.
> >
>
> I think the #1 thing we should certainly do is explaining the behavior
> in the docs.
>
> >
> >
> >>
> >> >And if we were to change it I'm not at all sure that we should do
> >> it the
> >> >way that's suggested here, which strikes me as no more intuitive
> than
> >> >the current behaviour. Rather I think we should possibly fill in
> >> a json
> >> >null in the indicated place.
> >> >
> >>
> >> Not sure, but that seems rather confusing to me, because it's
> >> mixing SQL
> >> NULL and JSON null, i.e. it's not clear to me why
> >>
> >> [...]
> >>
> >> But I admit it's quite subjective.
> >>
> >>
> >> Providing SQL NULL to this function and asking it to do something with
> >> that is indeed subjective - with no obvious reasonable default, and I
> >> agree that "return a NULL" while possible consistent is probably the
> >> least useful behavior that could have been chosen.  We should never
> >> have allowed an SQL NULL to be an acceptable argument in the first
> >> place, and can reasonably safely and effectively prevent it going
> >> forward.  Then people will have to explicitly code what they want to
> >> do if their data and queries present this invalid unknown data to the
> >> function.
> >>
> >>
> >
> >How exactly do we prevent a NULL being passed as an argument? The only
> >thing we could do would be to raise an exception, I think. That seems
> >like a fairly ugly thing to do, I'd need a heck of a lot of convincing.
> >
>
> I don't know, but if we don't know what the "right" behavior with NULL
> is, is raising an exception really that ugly?
>

Raising an exception at least would prevent people from blanking their
column out unintentionally.

And I am willing to write a patch to do that if we have consensus on how to
change it.

Ariadne


Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 12:47:39PM -0400, Andrew Dunstan wrote:


On 10/19/19 12:32 PM, David G. Johnston wrote:

On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
<tomas.von...@2ndquadrant.com>
wrote:

>
>We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
>since 9.5. That's five releases ago.  So it's a bit late to be
coming to
>us telling us it's not safe (according to your preconceptions of
what it
>should be doing).
>


There have been numerous complaints and questions about this behavior
in those five years; and none of the responses to those defenses has
actually made the current behavior sound beneficial but rather have
simply said "this is how it works, deal with it".



I haven't seen a patch, which for most possible solutions should be
fairly simple to code. This is open source. Code speaks louder than
complaints.



IMHO that might be a bit too harsh - I'm not surprised no one sent a
patch when we're repeatedly telling people "you're holding it wrong".
Without a clear consensus what the "correct" behavior is, I wouldn't
send a patch either.





>
>We could change it prospectively (i.e. from release 13 on) if we
choose.
>But absent an actual bug (i.e. acting contrary to documented
behaviour)
>we do not normally backpatch such changes, especially when there is a
>simple workaround for the perceived problem. And it's that policy
that
>is in large measure responsible for Postgres' deserved reputation for
>stability.
>

Yeah.


Agreed, this is v13 material if enough people come on board to support
making a change.




We have changed such things in the past. But maybe a new function might
be a better way to go. I haven't given it enough thought yet.



I think the #1 thing we should certainly do is explaining the behavior
in the docs.






>And if we were to change it I'm not at all sure that we should do
it the
>way that's suggested here, which strikes me as no more intuitive than
>the current behaviour. Rather I think we should possibly fill in
a json
>null in the indicated place.
>

Not sure, but that seems rather confusing to me, because it's
mixing SQL
NULL and JSON null, i.e. it's not clear to me why

[...]

But I admit it's quite subjective.


Providing SQL NULL to this function and asking it to do something with
that is indeed subjective - with no obvious reasonable default, and I
agree that "return a NULL" while possible consistent is probably the
least useful behavior that could have been chosen.  We should never
have allowed an SQL NULL to be an acceptable argument in the first
place, and can reasonably safely and effectively prevent it going
forward.  Then people will have to explicitly code what they want to
do if their data and queries present this invalid unknown data to the
function.




How exactly do we prevent a NULL being passed as an argument? The only
thing we could do would be to raise an exception, I think. That seems
like a fairly ugly thing to do, I'd need a heck of a lot of convincing.



I don't know, but if we don't know what the "right" behavior with NULL
is, is raising an exception really that ugly?


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Adrian Klaver

On 10/18/19 7:18 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver  wrote:


On 10/18/19 4:31 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  wrote:


On 10/18/19 3:11 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:


On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder  
wrote:


## Ariadne Conill (aria...@dereferenced.org):


  update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]


So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

 UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
   WHERE info->'foo' IS NOT NULL;



There are many ways to add code to queries to make working with this function safer - though using 
them presupposes one remembers at the time of writing the query that there is danger and caveats in 
using this function.  I agree that we should have (and now) provided sane defined behavior when one 
of the inputs to the function is null instead of blowing off the issue and defining the function as 
being strict.  Whether that is "ignore and return the original object" or "add the 
key with a json null scalar value" is debatable but either is considerably more useful than 
returning SQL NULL.


A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This


So you are more forgiving of your misstep, allowing users to run
outdated code, than of running afoul of Postgres' documented behavior:


I'm not forgiving of either.


https://www.postgresql.org/docs/11/functions-json.html
" The field/element/path extraction operators return NULL, rather than
failing, if the JSON input does not have the right structure to match
the request; for example if no such element exists"


It is known that the extraction operators return NULL.  The problem
here is jsonb_set() returning NULL when it encounters SQL NULL.


I'm not following. Your original case was:

jsonb_set(info, '{bar}', info->'foo');

where info->'foo' is equivalent to:

test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
   ?column?
----------
   NULL

So you know there is a possibility that a value extraction could return
NULL and from your wrapper that COALESCE is the way to deal with this.


You're not following because you don't want to follow.

It does not matter that info->'foo' is in my example.  That's not what
I am talking about.

What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL:
postgres=# \pset null '(null)'
Null display is "(null)".
postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
 jsonb_set
-----------
 (null)
(1 row)

This behaviour is basically giving an application developer a loaded
shotgun and pointing it at their feet.  It is not a good design.  It
is a design which has likely led to many users experiencing
unintentional data loss.


create table null_test(fld_1 integer, fld_2 integer);

insert into null_test values(1, 2), (3, NULL);

select * from null_test ;
 fld_1 | fld_2
-------+-------
     1 |     2
     3 |  NULL
(2 rows)

update null_test set fld_1 = fld_1 + fld_2;

select * from null_test ;
 fld_1 | fld_2
-------+-------
     3 |     2
  NULL |  NULL

Failure to account for NULL is a generic issue. Given that this is only the 
second post I can find that deals with this, in going on 4 years, I am 
guessing most users have dealt with it. If you really think this rises 
to the level of a bug then I would suggest filing a report here:


https://www.postgresql.org/account/login/?next=/account/submitbug/





Ariadne




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Andrew Dunstan


On 10/19/19 12:32 PM, David G. Johnston wrote:
> On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra
> <tomas.von...@2ndquadrant.com>
> wrote:
>
> >
> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
> >since 9.5. That's five releases ago.  So it's a bit late to be
> coming to
> >us telling us it's not safe (according to your preconceptions of
> what it
> >should be doing).
> >
>
>
> There have been numerous complaints and questions about this behavior
> in those five years; and none of the responses to those defenses has
> actually made the current behavior sound beneficial but rather have
> simply said "this is how it works, deal with it".


I haven't seen a patch, which for most possible solutions should be
fairly simple to code. This is open source. Code speaks louder than
complaints.


>
> >
> >We could change it prospectively (i.e. from release 13 on) if we
> choose.
> >But absent an actual bug (i.e. acting contrary to documented
> behaviour)
> >we do not normally backpatch such changes, especially when there is a
> >simple workaround for the perceived problem. And it's that policy
> that
> >is in large measure responsible for Postgres' deserved reputation for
> >stability.
> >
>
> Yeah.
>
>
> Agreed, this is v13 material if enough people come on board to support
> making a change.



We have changed such things in the past. But maybe a new function might
be a better way to go. I haven't given it enough thought yet.



>
> >And if we were to change it I'm not at all sure that we should do
> it the
> >way that's suggested here, which strikes me as no more intuitive than
> >the current behaviour. Rather I think we should possibly fill in
> a json
> >null in the indicated place.
> >
>
> Not sure, but that seems rather confusing to me, because it's
> mixing SQL
> NULL and JSON null, i.e. it's not clear to me why
>
> [...]
>
> But I admit it's quite subjective.
>
>
> Providing SQL NULL to this function and asking it to do something with
> that is indeed subjective - with no obvious reasonable default, and I
> agree that "return a NULL" while possible consistent is probably the
> least useful behavior that could have been chosen.  We should never
> have allowed an SQL NULL to be an acceptable argument in the first
> place, and can reasonably safely and effectively prevent it going
> forward.  Then people will have to explicitly code what they want to
> do if their data and queries present this invalid unknown data to the
> function.
>
>

How exactly do we prevent a NULL being passed as an argument? The only
thing we could do would be to raise an exception, I think. That seems
like a fairly ugly thing to do, I'd need a heck of a lot of convincing.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread David G. Johnston
On Sat, Oct 19, 2019 at 9:19 AM Tomas Vondra 
wrote:

> >
> >We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
> >since 9.5. That's five releases ago.  So it's a bit late to be coming to
> >us telling us it's not safe (according to your preconceptions of what it
> >should be doing).
> >
>

There have been numerous complaints and questions about this behavior in
those five years; and none of the responses to those defenses has actually
made the current behavior sound beneficial but rather have simply said
"this is how it works, deal with it".

>
> >We could change it prospectively (i.e. from release 13 on) if we choose.
> >But absent an actual bug (i.e. acting contrary to documented behaviour)
> >we do not normally backpatch such changes, especially when there is a
> >simple workaround for the perceived problem. And it's that policy that
> >is in large measure responsible for Postgres' deserved reputation for
> >stability.
> >
>
> Yeah.
>
>
Agreed, this is v13 material if enough people come on board to support
making a change.

>
> >And if we were to change it I'm not at all sure that we should do it the
> >way that's suggested here, which strikes me as no more intuitive than
> >the current behaviour. Rather I think we should possibly fill in a json
> >null in the indicated place.
> >
>
> Not sure, but that seems rather confusing to me, because it's mixing SQL
> NULL and JSON null, i.e. it's not clear to me why
>
[...]

> But I admit it's quite subjective.
>

Providing SQL NULL to this function and asking it to do something with that
is indeed subjective - with no obvious reasonable default, and I agree that
"return a NULL" while possible consistent is probably the least useful
behavior that could have been chosen.  We should never have allowed an SQL
NULL to be an acceptable argument in the first place, and can reasonably
safely and effectively prevent it going forward.  Then people will have to
explicitly code what they want to do if their data and queries present this
invalid unknown data to the function.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Andrew Dunstan


On 10/19/19 12:18 PM, Tomas Vondra wrote:
> On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote:
>
> Not sure, but that seems rather confusing to me, because it's mixing SQL
> NULL and JSON null, i.e. it's not clear to me why
>
>    jsonb_set(..., "...", NULL)
>
> should do the same thing as
>
>    jsonb_set(..., "...", 'null'::jsonb)
>
> I'm not entirely surprised it's what MySQL does ;-) but I'd say treating
> it as a deletion of the key (just like MSSQL) is somewhat more sensible.
> But I admit it's quite subjective.
>


That's yet another variant, which just reinforces my view that there is
no guaranteed-intuitive behaviour here.


OTOH, to me, turning jsonb_set into jsonb_delete for some case seems ...
odd.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 11:21:26AM -0400, Stephen Frost wrote:

Greetings,

* Dmitry Dolgov (9erthali...@gmail.com) wrote:

If we want to change it, the question is where to stop? Essentially we have:

update table set data = some_func(data, some_args_with_null);

where some_func happened to be jsonb_set, but could be any strict function.


I don't think it makes any sense to try and extrapolate this out to
other strict functions.  Functions should be strict when it makes sense
for them to be- in this case, it sounds like it doesn't really make
sense for jsonb_set to be strict, and that's where we stop it.



Yeah. I think the issue here is (partially) that other databases adopted
similar functions after us, but decided to use a different behavior. It
might be more natural for the users, but that does not mean we should
change the other strict functions.

Plus I'm not sure if SQL standard says anything about strict functions
(I found nothing, but I looked only very quickly), but I'm pretty sure
we can't change how basic operators behave, and we translate them to
function calls (e.g. 1+2 is int4pl(1,2)).


I wonder if in this case it makes sense to think about an alternative? For
example, there is generic type subscripting patch, that allows to update a
jsonb in the following way:

update table set jsonb_data[key] = 'value';

It doesn't look like a function, so it's not a big deal if it will handle NULL
values differently. And at the same time one can argue that people who are
not aware of this caveat with jsonb_set and NULL values will most likely
use it due to a bit simpler syntax (more similar to some popular programming
languages).


This seems like an entirely independent thing ...



Right. Useful, but entirely separate feature.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Tomas Vondra

On Sat, Oct 19, 2019 at 11:26:50AM -0400, Andrew Dunstan wrote:


...

The hyperbole here is misplaced. There is a difference between a bug and
a POLA violation. This might be the latter, but it isn't the former. So
please tone it down a bit. It's not the function that's unsafe, but the
ill-informed use of it.


We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
since 9.5. That's five releases ago.  So it's a bit late to be coming to
us telling us it's not safe (according to your preconceptions of what it
should be doing).


We could change it prospectively (i.e. from release 13 on) if we choose.
But absent an actual bug (i.e. acting contrary to documented behaviour)
we do not normally backpatch such changes, especially when there is a
simple workaround for the perceived problem. And it's that policy that
is in large measure responsible for Postgres' deserved reputation for
stability.



Yeah.



Incidentally, why is your function written in plpgsql? Wouldn't a simple
SQL wrapper be better?


   create or replace function safe_jsonb_set
       (target jsonb, path text[], new_value jsonb, create_missing
   boolean default true)
   returns jsonb as
   $func$
       select case when new_value is null then target else
   jsonb_set(target, path, new_value, create_missing) end
   $func$ language sql;


And if we were to change it I'm not at all sure that we should do it the
way that's suggested here, which strikes me as no more intuitive than
the current behaviour. Rather I think we should possibly fill in a json
null in the indicated place.



Not sure, but that seems rather confusing to me, because it's mixing SQL
NULL and JSON null, i.e. it's not clear to me why

   jsonb_set(..., "...", NULL)

should do the same thing as

   jsonb_set(..., "...", 'null'::jsonb)

I'm not entirely surprised it's what MySQL does ;-) but I'd say treating
it as a deletion of the key (just like MSSQL) is somewhat more sensible.
But I admit it's quite subjective.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Andrew Dunstan


On 10/18/19 3:10 PM, Mark Felder wrote:
>
> On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote:
>> Hello,
>>
>> I am one of the primary maintainers of Pleroma, a federated social
>> networking application written in Elixir, which uses PostgreSQL in
>> ways that may be considered outside the typical usage scenarios for
>> PostgreSQL.
>>
>> Namely, we leverage JSONB heavily as a backing store for JSON-LD
>> documents[1].  We also use JSONB in combination with Ecto's "embedded
>> structs" to store things like user preferences.
>>
>> The fact that we can use JSONB to achieve our design goals is a
>> testament to the flexibility PostgreSQL has.
>>
>> However, in the process of doing so, we have discovered a serious flaw
>> in the way jsonb_set() functions, but upon reading through this
>> mailing list, we have discovered that this flaw appears to be an
>> intentional design.[2]
>>
>> A few times now, we have written migrations that do things like copy
>> keys in a JSONB object to a new key, to rename them.  These migrations
>> look like so:
>>
>>update users set info=jsonb_set(info, '{bar}', info->'foo');
>>
>> Typically, this works nicely, except for cases where evaluating
>> info->'foo' results in an SQL null being returned.  When that happens,
>> jsonb_set() returns an SQL null, which then results in data loss.[3]
>>
>> This is not acceptable.  PostgreSQL is a database that is renowned for
>> data integrity, but here it is wiping out data when it encounters a
>> failure case.  The way jsonb_set() should fail in this case is to
>> simply return the original input: it should NEVER return SQL null.
>>
>> But hey, we've been burned by this so many times now that we'd like to
>> donate a useful function to the commons, consider it a mollyguard for
>> the real jsonb_set() function.
>>
>> create or replace function safe_jsonb_set(target jsonb, path
>> text[], new_value jsonb, create_missing boolean default true) returns
>> jsonb as $$
>> declare
>>   result jsonb;
>> begin
>>   result := jsonb_set(target, path, coalesce(new_value,
>> 'null'::jsonb), create_missing);
>>   if result is NULL then
>> return target;
>>   else
>> return result;
>>   end if;
>> end;
>> $$ language plpgsql;
>>
>> This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
>> own jsonb_set() should have this safety feature built in.  Without it,
>> using jsonb_set() is like playing russian roulette with your data,
>> which is not a reasonable expectation for a database renowned for its
>> commitment to data integrity.
>>
>> Please fix this bug so that we do not have to hack around this bug.
>> It has probably ruined countless people's days so far.  I don't want
>> to hear about how the function is strict, I'm aware it is strict, and
>> that strictness is harmful.  Please fix the function so that it is
>> actually safe to use.
>>
>> [1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
>> representation" that shares similar qualities to JSON-LD, so I use
>> JSON-LD here as a simplification.
>>
>> [2]: 
>> https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1...@blaine.gmane.org
>>
>> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
>> example of data loss induced by this issue.
>>
>> Ariadne
>>
> This should be directed towards the hackers list, too.
>
> What will it take to change the semantics of jsonb_set()? MySQL implements 
> safe behavior here. It's a real shame Postgres does not. I'll offer a $200 
> bounty to whoever fixes it. I'm sure it's destroyed more than $200 worth of 
> data and people's time by now, but it's something.
>
>


The hyperbole here is misplaced. There is a difference between a bug and
a POLA violation. This might be the latter, but it isn't the former. So
please tone it down a bit. It's not the function that's unsafe, but the
ill-informed use of it.


We invented jsonb_set() (credit to Dmitry Dolgov). And we've had it
since 9.5. That's five releases ago.  So it's a bit late to be coming to
us telling us it's not safe (according to your preconceptions of what it
should be doing).


We could change it prospectively (i.e. from release 13 on) if we choose.
But absent an actual bug (i.e. acting contrary to documented behaviour)
we do not normally backpatch such changes, especially when there is a
simple workaround for the perceived problem. And it's that policy that
is in large measure responsible for Postgres' deserved reputation for
stability.


Incidentally, why is your function written in plpgsql? Wouldn't a simple
SQL wrapper be better?


create or replace function safe_jsonb_set
    (target jsonb, path text[], new_value jsonb, create_missing
boolean default true)
returns jsonb as
$func$
    select case when new_value is null then target else
jsonb_set(target, path, new_value, create_missing) end
$func$ language sql;


And if we were to change it I'm not at all sure that we should do it the
way that's suggested here, which strikes me as no more intuitive than
the current behaviour. Rather I think we should possibly fill in a json
null in the indicated place.

Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Stephen Frost
Greetings,

* Dmitry Dolgov (9erthali...@gmail.com) wrote:
> If we want to change it, the question is where to stop? Essentially we have:
> 
> update table set data = some_func(data, some_args_with_null);
> 
> where some_func happened to be jsonb_set, but could be any strict function.

I don't think it makes any sense to try and extrapolate this out to
other strict functions.  Functions should be strict when it makes sense
for them to be- in this case, it sounds like it doesn't really make
sense for jsonb_set to be strict, and that's where we stop it.

> I wonder if in this case it makes sense to think about an alternative? For
> example, there is generic type subscripting patch, that allows to update a
> jsonb in the following way:
> 
> update table set jsonb_data[key] = 'value';
> 
> It doesn't look like a function, so it's not a big deal if it will handle NULL
> values differently. And at the same time one can argue that people who are
> not aware of this caveat with jsonb_set and NULL values will most likely
> use it due to a bit simpler syntax (more similar to some popular programming
> languages).

This seems like an entirely independent thing ...

Thanks,

Stephen




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Christoph Moench-Tegeder
## Ariadne Conill (aria...@dereferenced.org):

> NULL propagation makes sense in the context of traditional SQL.  What
> users expect from the JSONB support is for it to behave as JSON
> manipulation behaves everywhere else.

Well, some users expect that. Others are using this interface as it is
documented and implemented right now. And that's what makes this a
somewhat difficult case: I wouldn't argue for one behaviour or the
other if this was new functionality. But jsonb_set() was added in 9.5,
and changing that behaviour now will make other people about as unhappy
as you are right now.
Further, "now" is a rather flexible term: the function cannot be changed
"right now" with the next bugfix release (may break existing applications,
deterring people from installing bugfixes: very bad) and there's about
no way to get a new function into a bugfix release (catversion bump).
The next chance to do anything here is version 13, to be expected around
this time next year. This gives us ample time to think about a solution
which is consistent and works for (almost) everyone - no need to force
a behaviour change in that function right now (and in case it comes to
that: which other json/jsonb-functions would be affected?).
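
For anyone wanting to survey that, a catalog query along these lines lists
the built-in json/jsonb functions that are currently marked strict (shown
only as a starting point, not an exhaustive audit):

    select proname, pg_get_function_arguments(oid) as args
      from pg_proc
     where proisstrict
       and proname like 'json%'
     order by proname;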

That creates a kind of bind for your case: you cannot rely on the new
behaviour until the new version is in reasonably widespread use.
Database servers are long-lived beasts - in the field, version 8.4
has finally mostly disappeared this year, but we still get some
questions about that version here on the lists (8.4 went EOL over
five years ago). At some point, you'll need to make a cut and require
your users to upgrade the database.

>   At some point, you have to start pondering whether the behaviour
> does not make logical sense in the context that people frame the JSONB
> type and its associated manipulation functions.

But it does make sense from a SQL point of view - and this is a SQL
database. JSON is not SQL (the sheer amount of "Note" in between the
JSON functions and operators documentation is proof of that) and not
ASN.1; "people expect" depends a lot on what kind of people you ask.
None of these expectations is "right" or "wrong" in an absolute manner.
Code has to be "absolute" in order to be deterministic, and it should
do so in a way that is unsurprising to the least amount of users: I'm
willing to concede that jsonb_set() fails this test, but I'm still not
convinced that your approach is much better just because it fits your
specific use case.

> It is not *obvious*
> that jsonb_set() will trash your data, but that is what it is capable
> of doing.

It didn't. The data still fit the constraints you put on it: none,
unfortunately. Which leads me to the advice for the time being (until
we have this sorted out in one way or another, possibly the next
major release): at least put a NOT NULL on columns which must be not
NULL - that alone would have gone a long way to prevent the issues
you've unfortunately had. You could even put CHECK constraints on
your JSONB (like "CHECK (j->'info' IS NOT NULL)") to make sure it
stays well-formed. As a SQL person, I'd even argue that you shouldn't
use JSON columns for key data - there is a certain mismatch between
SQL and JSON, which will get you now and then, and once you've
implemented all the checks to be safe, you've built a type system
when the database would have given you one for free. (And running
UPDATEs inside your JSONB fields is not as efficient as on simple
columns).
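
A rough sketch of that advice in DDL, reusing the table and key names from
the example earlier in the thread (names are only illustrative, adjust to
taste):

    alter table users alter column info set not null;

    -- reject rows where the expected key is missing entirely
    alter table users add constraint info_has_foo
        check (info ? 'foo');

With the NOT NULL in place, an UPDATE that feeds the NULL coming out of
jsonb_set() into the column errors out instead of silently wiping the data.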
And finally, you might put some version information in your
database schema, so the application can check whether all the necessary
data migrations have been run.
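
For illustration only - a minimal sketch, assuming a "users" table with
a jsonb column named "info" (adjust names, keys and conditions to your
actual schema):

    ALTER TABLE users
        ALTER COLUMN info SET NOT NULL,
        ADD CONSTRAINT info_is_wellformed
            CHECK (info->'note_count' IS NOT NULL);

With something like that in place, an UPDATE which would blank out the
column (or strip a required key) fails with a constraint violation
instead of silently writing NULL.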

Regards,
Christoph

-- 
Spare Space




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Dmitry Dolgov
> On Sat, Oct 19, 2019 at 1:08 PM Tomas Vondra  
> wrote:
>
> >Here is how other implementations handle this case:
> >
> >MySQL/MariaDB:
> >
> >select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
> >   {"a":null,"b":2,"c":3}
> >
> >Microsoft SQL Server:
> >
> >select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
> >   {"b":2,"c":3}
> >
> >Both of these outcomes make sense, given the nature of JSON objects.
> >I am actually more in favor of what MSSQL does however, I think that
> >makes the most sense of all.
> >
>
> I do mostly agree with this. The json[b]_set behavior seems rather
> surprising, and I think I've seen a couple of cases running into exactly
> this issue. I've solved that with a simple CASE, but maybe changing the
> behavior would be better. That's unlikely to be back-patchable, though,
> so maybe a better option is to create non-strict wrappers. But that
> does not work when the user is unaware of the behavior :-(

Agree, that could be confusing. If I remember correctly, so far I've seen four
or five such complaints on the mailing lists, but of course the number of people
who didn't reach out to hackers is probably bigger.

If we want to change it, the question is where to stop? Essentially we have:

update table set data = some_func(data, some_args_with_null);

where some_func happened to be jsonb_set, but could be any strict function.

I wonder if in this case it makes sense to think about an alternative? For
example, there is generic type subscripting patch, that allows to update a
jsonb in the following way:

update table set jsonb_data[key] = 'value';

It doesn't look like a function, so it's not a big deal if it handles NULL
values differently. And at the same time one can argue that people who are
not aware of this caveat with jsonb_set and NULL values will most likely
use it due to a bit simpler syntax (more similar to some popular programming
languages).
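
For reference, the CASE workaround mentioned above looks roughly like
this (just a sketch, reusing the "users"/"info" example from upthread):

    UPDATE users
       SET info = CASE
                      WHEN info->'foo' IS NULL THEN info
                      ELSE jsonb_set(info, '{bar}', info->'foo')
                  END;

It keeps the row unchanged instead of overwriting the whole document
with NULL - but only if the author of the query remembers to write it.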




Re: jsonb_set() strictness considered harmful to data

2019-10-19 Thread Ariadne Conill
Hello,

On Sat, Oct 19, 2019 at 12:52 AM Pavel Stehule  wrote:
>
>
>
> so 19. 10. 2019 v 7:41 odesílatel David G. Johnston 
>  napsal:
>>
>> On Friday, October 18, 2019, Pavel Stehule  wrote:
>>
>>>
>>> Probably there will be some applications that needs NULL result in 
>>> situations when value was not changed or when input value has not expected 
>>> format. Design using in Postgres allows later customization - you can 
>>> implement with COALESCE very simply behave that you want (sure, you have to 
>>> know what you do). If Postgres implement design used by MySQL, then there 
>>> is not any possibility to react on situation when update is not processed.
>>
>>
>> A CASE expression seems like it would work well for such detection in the 
>> rare case it is needed.  Current behavior is unsafe with minimal or no 
>> redeeming qualities.  Change it so passing in null raises an exception and 
>> make the user decide their own behavior if we don’t want to choose one for 
>> them.
>
>
> How can you do it? Built-in functions cannot return more than one value. The
> NULL is one possible signal for emitting this information.
>
> The NULL value can be a problem everywhere - and it is not consistent to raise
> an exception in some places and not in others.
>
> I agree that the safe way is raising an exception on NULL. Unfortunately,
> exception handling is pretty expensive in Postgres (more so in write
> transactions), so it should be used only when it is really necessary.

I would say that anything like

update whatever set column=jsonb_set(column, '{foo}', NULL)

should throw an exception.  It should do, literally, *anything* else
but blank that column.
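
Until then, a wrapper that fails loudly instead of silently is easy to
write - a sketch only, the function name is made up:

    create or replace function jsonb_set_strict(target jsonb, path text[],
        new_value jsonb, create_missing boolean default true)
    returns jsonb as $$
    begin
      if new_value is null then
        raise exception 'jsonb_set_strict: refusing to set % to SQL NULL', path;
      end if;
      return jsonb_set(target, path, new_value, create_missing);
    end;
    $$ language plpgsql;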

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Pavel Stehule
Hi


> What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL.
>
> postgres=# \pset null '(null)'
> Null display is "(null)".
> postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
> jsonb_set
> ---
> (null)
> (1 row)
>
> This behaviour is basically giving an application developer a loaded
> shotgun and pointing it at their feet.  It is not a good design.  It
> is a design which has likely led to many users experiencing
> unintentional data loss.
>

On the other hand - the PostgreSQL design is one of the possible designs, and
it returns additional information about whether the value was changed or not.

Unfortunately, it is very unlikely that the design of this function will
be changed - it is simply not a bug (although I fully agree it behaves
differently than other databases and is not practical for some usages).
There will probably be some applications that need a NULL result in
situations where the value was not changed or the input value does not have
the expected format. The design used in Postgres allows later customization -
with COALESCE you can very simply implement the behaviour you want (sure,
you have to know what you are doing). If Postgres implemented the design used
by MySQL, there would be no way to react to the situation where the update
was not processed.

It is not hard to implement a second function with a different name that
behaves the way you need and expect - although it is just

CREATE OR REPLACE FUNCTION jsonb_modify(jsonb, text[], jsonb)
RETURNS jsonb AS $$
SELECT jsonb_set($1, $2, COALESCE($3, 'null'::jsonb), true);
$$ LANGUAGE sql;
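
For example, only to show the difference in behaviour:

    SELECT jsonb_set('{"a":1}'::jsonb, '{a}', NULL);     -- SQL NULL
    SELECT jsonb_modify('{"a":1}'::jsonb, '{a}', NULL);  -- {"a": null}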

It is important to understand that JSON null is not PostgreSQL NULL. In this
case the problem is not in the PostgreSQL design, because it is consistent with
everything else in PG, but in wrong expectations. Unfortunately, there are a lot
of wrong expectations, and these cannot all be covered by the Postgres design,
because then Postgres would be a very inconsistent piece of software. You can
see it yourself - my function jsonb_modify is what you expect, and it can work
for you perfectly, but from the system perspective it is not consistent, very
strongly not consistent. Users should not have to learn where NULL behaves
differently or where NULL means JSON null. Built-in functions should be
consistent in Postgres. It is Postgres, not some other database.

Pavel





> Ariadne
>
>
>


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver  wrote:
>
> On 10/18/19 4:31 PM, Ariadne Conill wrote:
> > Hello,
> >
> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> > wrote:
> >>
> >> On 10/18/19 3:11 PM, Ariadne Conill wrote:
> >>> Hello,
> >>>
> >>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
> >>>  wrote:
> 
>  On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
>   wrote:
> >
> > ## Ariadne Conill (aria...@dereferenced.org):
> >
> >>  update users set info=jsonb_set(info, '{bar}', info->'foo');
> >>
> >> Typically, this works nicely, except for cases where evaluating
> >> info->'foo' results in an SQL null being returned.  When that happens,
> >> jsonb_set() returns an SQL null, which then results in data loss.[3]
> >
> > So why don't you use the facilities of SQL to make sure to only
> > touch the rows which match the prerequisites?
> >
> > UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> >   WHERE info->'foo' IS NOT NULL;
> >
> 
>  There are many ways to add code to queries to make working with this 
>  function safer - though using them presupposes one remembers at the time 
>  of writing the query that there is danger and caveats in using this 
>  function.  I agree that we should have (and now) provided sane defined 
>  behavior when one of the inputs to the function is null instead blowing 
>  off the issue and defining the function as being strict.  Whether that 
>  is "ignore and return the original object" or "add the key with a json 
>  null scalar value" is debatable but either is considerably more useful 
>  than returning SQL NULL.
> >>>
> >>> A great example of how we got burned by this last year: Pleroma
> >>> maintains pre-computed counters in JSONB for various types of
> >>> activities (posts, followers, followings).  Last year, another counter
> >>> was added, with a migration.  But some people did not run the
> >>> migration, because they are users, and that's what users do.  This
> >>
> >> So you are more forgiving of your misstep, allowing users to run
> >> outdated code, than of running afoul of Postgres documented behavior:
> >
> > I'm not forgiving of either.
> >
> >> https://www.postgresql.org/docs/11/functions-json.html
> >> " The field/element/path extraction operators return NULL, rather than
> >> failing, if the JSON input does not have the right structure to match
> >> the request; for example if no such element exists"
> >
> > It is known that the extraction operators return NULL.  The problem
> > here is jsonb_set() returning NULL when it encounters SQL NULL.
>
> I'm not following. Your original case was:
>
> jsonb_set(info, '{bar}', info->'foo');
>
> where info->'foo' is equivalent to:
>
> test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
>   ?column?
> --
>   NULL
>
> So you know there is a possibility that a value extraction could return
> NULL and from your wrapper that COALESCE is the way to deal with this.

You're not following because you don't want to follow.

It does not matter that info->'foo' is in my example.  That's not what
I am talking about.

What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL.

postgres=# \pset null '(null)'
Null display is "(null)".
postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
 jsonb_set
-----------
 (null)
(1 row)

This behaviour is basically giving an application developer a loaded
shotgun and pointing it at their feet.  It is not a good design.  It
is a design which has likely led to many users experiencing
unintentional data loss.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost  wrote:
>
> Greetings,
>
> * Ariadne Conill (aria...@dereferenced.org) wrote:
> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> > wrote:
> > > https://www.postgresql.org/docs/11/functions-json.html
> > > " The field/element/path extraction operators return NULL, rather than
> > > failing, if the JSON input does not have the right structure to match
> > > the request; for example if no such element exists"
> >
> > It is known that the extraction operators return NULL.  The problem
> > here is jsonb_set() returning NULL when it encounters SQL NULL.
> >
> > > Just trying to figure why one is worse then the other.
> >
> > Any time a user loses data, it is worse.  The preference for not
> > having data loss is why Pleroma uses PostgreSQL as it's database of
> > choice, as PostgreSQL has traditionally valued durability.  If we
> > should not use PostgreSQL, just say so.
>
> Your contention that the documented, clear, and easily addressed
> behavior of a particular strict function equates to "the database system
> loses data and isn't durable" is really hurting your arguments here, not
> helping it.
>
> The argument about how it's unintuitive and can cause application
> developers to misuse the function (which is clearly an application bug,
> but perhaps an understandable one if the function interface isn't
> intuitive or is confusing) is a reasonable one and might be convincing
> enough to result in a change here.
>
> I'd suggest sticking to the latter argument when making this case.
>
> > > > I believe that anything that can be catastrophically broken by users
> > > > not following upgrade instructions precisely is a serious problem, and
> > > > can lead to serious problems.  I am sure that this is not the only
> > > > project using JSONB which have had users destroy their own data in
> > > > such a completely preventable fashion.
>
> Let's be clear here that the issue with the upgrade instructions was
> that the user didn't follow your *application's* upgrade instructions,
> and your later code wasn't written to use the function, as documented,
> properly- this isn't a case of PG destroying your data.  It's fine to
> contend that the interface sucks and that we should change it, but the
> argument that PG is eating data because the application sent a query to
> the database telling it, based on our documentation, to eat the data,
> isn't appropriate.  Again, let's have a reasonable discussion here about
> if it makes sense to make a change here because the interface isn't
> intuitive and doesn't match what other systems do (I'm guessing it isn't
> in the SQL standard either, so we unfortunately can't look to that for
> help; though I'd hardly be surprised if they supported what PG does
> today anyway).

Okay, I will admit that saying PG is eating data is perhaps
hyperbolic, but I will also say that the behaviour of jsonb_set()
under this type of edge case is unintuitive and frequently results in
unintended data loss.  So, while PostgreSQL is not actually eating the
data, it is putting the user in a position where they may suffer data
loss if they are not extremely careful.

Here is how other implementations handle this case:

MySQL/MariaDB:

select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
   {"a":null,"b":2,"c":3}

Microsoft SQL Server:

select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
   {"b":2,"c":3}

Both of these outcomes make sense, given the nature of JSON objects.
I am actually more in favor of what MSSQL does, however; I think that
makes the most sense of all.
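
(For what it's worth, the MSSQL outcome can already be expressed
explicitly in Postgres with the jsonb deletion operator - shown here
only as an illustration:

    select '{"a":1,"b":2,"c":3}'::jsonb - 'a';
    -- {"b": 2, "c": 3}

but that requires knowing up front that the value is SQL NULL.)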

I did not compare to other database systems, because using them I
found that there is a JSON_TABLE type function and then you do stuff
with that to rewrite the object and dump it back out as JSON, and it's
quite a mess.  But MySQL and MSSQL have an equivalent jsonb inline
modification function, as seen above.

> As a practical response to the issue you've raised- have you considered
> using a trigger to check the validity of the new jsonb?  Or, maybe, just
> made the jsonb column not nullable?  With a trigger you could disallow
> non-null->null transitions, for example, or if it just shouldn't ever
> be null then making the column 'not null' would suffice.

We have already mitigated the issue in a way we find appropriate.
The suggestion of having a non-null constraint does seem useful
as well and I will look into that.

> I'll echo Christoph's comments up thread too, though in my own language-
> these are risks you've explicitly accepted by using JSONB and writing
> your own validation and checks (or, not, apparently) rather than using
> what the database system provides.  That doesn't mean I'm against
> making the change you suggest, but it certainly should become a lesson
> to anyone who is considering using primarily jsonb for their storage
> that it's risky to do so, because you're removing the database system's
> knowledge and understanding 

Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Adrian Klaver

On 10/18/19 4:31 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  wrote:


On 10/18/19 3:11 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:


On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder  
wrote:


## Ariadne Conill (aria...@dereferenced.org):


 update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]


So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
  WHERE info->'foo' IS NOT NULL;



There are many ways to add code to queries to make working with this function safer - though using 
them presupposes one remembers at the time of writing the query that there is danger and caveats in 
using this function.  I agree that we should have (and now) provided sane defined behavior when one 
of the inputs to the function is null instead blowing off the issue and defining the function as 
being strict.  Whether that is "ignore and return the original object" or "add the 
key with a json null scalar value" is debatable but either is considerably more useful than 
returning SQL NULL.


A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This


So you are more forgiving of your misstep, allowing users to run
outdated code, than of running afoul of Postgres documented behavior:


I'm not forgiving of either.


https://www.postgresql.org/docs/11/functions-json.html
" The field/element/path extraction operators return NULL, rather than
failing, if the JSON input does not have the right structure to match
the request; for example if no such element exists"


It is known that the extraction operators return NULL.  The problem
here is jsonb_set() returning NULL when it encounters SQL NULL.


I'm not following. Your original case was:

jsonb_set(info, '{bar}', info->'foo');

where info->'foo' is equivalent to:

test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
 ?column?
--
 NULL

So you know there is a possibility that a value extraction could return
NULL, and from your wrapper you know that COALESCE is the way to deal with this.






Just trying to figure why one is worse than the other.


Any time a user loses data, it is worse.  The preference for not
having data loss is why Pleroma uses PostgreSQL as its database of
choice, as PostgreSQL has traditionally valued durability.  If we
should not use PostgreSQL, just say so.


There are any number of ways you can make Postgres lose data that are 
not related to durability, e.g. build the following in code:


DELETE FROM some_table;

and forget the WHERE.



Ariadne




resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious problems.  I am sure that this is not the only
project using JSONB which have had users destroy their own data in
such a completely preventable fashion.

Ariadne






--
Adrian Klaver
adrian.kla...@aklaver.com





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Stephen Frost
Greetings,

* Ariadne Conill (aria...@dereferenced.org) wrote:
> On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder
>  wrote:
> > ## Ariadne Conill (aria...@dereferenced.org):
> > > Why don't we fix the database engine to not eat data when the
> > > jsonb_set() operation fails?
> >
> > It didn't fail, it worked like SQL (you've been doing SQL for too
> > long when you get used to the NULL propagation, but that's still
> > what SQL does - check "+" for example).
> > And changing a function will cause fun for everyone who relies on
> > the current behaviour - so at least it shouldn't be done on a whim
> > (some might argue that a whim was what got us into this situation
> > in the first place).
> 
> NULL propagation makes sense in the context of traditional SQL.  What
> users expect from the JSONB support is for it to behave as JSON
> manipulation behaves everywhere else.  It makes sense that 2 + NULL
> returns NULL -- it's easily understood as a type mismatch.  It does
> not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL
> because a *value* was SQL NULL.  In this case, it should, at the
> least, automatically coalesce to 'null'::jsonb.

2 + NULL isn't a type mismatch, just to be clear, it's "2 + unknown =
unknown", which is pretty reasonable, if you accept the general notion
of what NULL is to begin with.

And as such, what follows with "set this blob of stuff to include this
unknown thing ... implies ... we don't know what the result of the set
is and therefore it's unknown" isn't entirely unreasonable, but I can
agree that in this specific case, because what we're dealing with
regarding JSONB is a complex data structure, not an individual value,
that it's surprising to a developer and there can be an argument made
there that we should consider changing it.

> > Continuing along that thought, I'd even argue that you are
> > writing code which relies on properties of the data which you never
> > guaranteed. There is a use case for data types and constraints.
> 
> There is a use case, but this frequently comes up as a question people
> ask.  At some point, you have to start pondering whether the behaviour
> does not make logical sense in the context that people frame the JSONB
> type and it's associated manipulation functions.  It is not *obvious*
> that jsonb_set() will trash your data, but that is what it is capable
> of doing.  In a database that is advertised as being durable and not
> trashing data, even.

Having the result of a call to a strict function be NULL isn't
"trashing" your data.

> > Not that I'm arguing for maximum surprise in programming, but
> > I'm a little puzzled when people eschew the built-in tools and
> > start implementing them again side-by-side with what's already
> > there.
> 
> If you read the safe_jsonb_set() function, all we do is coalesce any
> SQL NULL to 'null'::jsonb, which is what it should be doing anyway,

I'm not convinced that this is at all the right answer, particularly
since we don't do that generally.  We don't return the string 'null'
when you do: NULL || 'abc', we return NULL.  There might be something we
can do here that doesn't result in the whole jsonb document becoming
NULL though.
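
To make the distinction concrete (simple illustrations, not proposals):

    SELECT NULL::text || 'abc';                        -- SQL NULL: concatenation propagates NULL
    SELECT 'null'::jsonb IS NULL;                      -- false: a JSON null is a value, not SQL NULL
    SELECT jsonb_set('{"a":1}'::jsonb, '{a}', 'null'::jsonb);  -- {"a": null}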

> and then additionally handling any *unanticipated* failure case on top
> of that.  While you are arguing that we should use tools to work
> around unanticipated effects (that are not even documented -- in no
> place in the jsonb_set() documentation does it say "if you pass SQL
> NULL to this function as a value, it will return SQL NULL"), I am
> arguing that jsonb_set() shouldn't set people up for their data to be
> trashed in the first place.

The function is marked as strict, and the meaning of that is quite
clearly defined in the documentation.  I'm not against including a
comment regarding this in the documentation, to be clear, though I
seriously doubt it would actually have changed anything in this case.

Thanks,

Stephen




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Stephen Frost
Greetings,

* Ariadne Conill (aria...@dereferenced.org) wrote:
> On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> wrote:
> > https://www.postgresql.org/docs/11/functions-json.html
> > " The field/element/path extraction operators return NULL, rather than
> > failing, if the JSON input does not have the right structure to match
> > the request; for example if no such element exists"
> 
> It is known that the extraction operators return NULL.  The problem
> here is jsonb_set() returning NULL when it encounters SQL NULL.
> 
> > Just trying to figure why one is worse then the other.
> 
> Any time a user loses data, it is worse.  The preference for not
> having data loss is why Pleroma uses PostgreSQL as it's database of
> choice, as PostgreSQL has traditionally valued durability.  If we
> should not use PostgreSQL, just say so.

Your contention that the documented, clear, and easily addressed
behavior of a particular strict function equates to "the database system
loses data and isn't durable" is really hurting your argument here, not
helping it.

The argument about how it's unintuitive and can cause application
developers to misuse the function (which is clearly an application bug,
but perhaps an understandable one if the function interface isn't
intuitive or is confusing) is a reasonable one and might be convincing
enough to result in a change here.

I'd suggest sticking to the latter argument when making this case.

> > > I believe that anything that can be catastrophically broken by users
> > > not following upgrade instructions precisely is a serious problem, and
> > > can lead to serious problems.  I am sure that this is not the only
> > > project using JSONB which have had users destroy their own data in
> > > such a completely preventable fashion.

Let's be clear here that the issue with the upgrade instructions was
that the user didn't follow your *application's* upgrade instructions,
and your later code wasn't written to use the function, as documented,
properly- this isn't a case of PG destroying your data.  It's fine to
contend that the interface sucks and that we should change it, but the
argument that PG is eating data because the application sent a query to
the database telling it, based on our documentation, to eat the data,
isn't appropriate.  Again, let's have a reasonable discussion here about
if it makes sense to make a change here because the interface isn't
intuitive and doesn't match what other systems do (I'm guessing it isn't
in the SQL standard either, so we unfortunately can't look to that for
help; though I'd hardly be surprised if they supported what PG does
today anyway).

As a practical response to the issue you've raised- have you considered
using a trigger to check the validity of the new jsonb?  Or, maybe, just
made the jsonb column not nullable?  With a trigger you could disallow
non-null->null transitions, for example, or if it just shouldn't ever
be null then making the column 'not null' would suffice.
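
A trigger along those lines might look like this - only a sketch with
made-up names (the column-level NOT NULL is simpler if it fits):

    CREATE FUNCTION users_info_guard() RETURNS trigger AS $$
    BEGIN
        IF OLD.info IS NOT NULL AND NEW.info IS NULL THEN
            RAISE EXCEPTION 'refusing to overwrite users.info with NULL';
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER users_info_guard
        BEFORE UPDATE OF info ON users
        FOR EACH ROW EXECUTE PROCEDURE users_info_guard();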

I'll echo Christoph's comments up thread too, though in my own language-
these are risks you've explicitly accepted by using JSONB and writing
your own validation and checks (or, not, apparently) rather than using
what the database system provides.  That doesn't mean I'm against
making the change you suggest, but it certainly should become a lesson
to anyone who is considering using primarily jsonb for their storage
that it's risky to do so, because you're removing the database system's
knowledge and understanding of the data, and further you tend to end up
not having the necessary constraints in place to ensure that the data
doesn't end up being garbage- thus letting your application destroy all
the data easily due to an application bug.

Thanks,

Stephen




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder
 wrote:
>
> ## Ariadne Conill (aria...@dereferenced.org):
>
> > Why don't we fix the database engine to not eat data when the
> > jsonb_set() operation fails?
>
> It didn't fail, it worked like SQL (you've been doing SQL for too
> long when you get used to the NULL propagation, but that's still
> what SQL does - check "+" for example).
> And changing a function will cause fun for everyone who relies on
> the current behaviour - so at least it shouldn't be done on a whim
> (some might argue that a whim was what got us into this situation
> in the first place).

NULL propagation makes sense in the context of traditional SQL.  What
users expect from the JSONB support is for it to behave as JSON
manipulation behaves everywhere else.  It makes sense that 2 + NULL
returns NULL -- it's easily understood as a type mismatch.  It does
not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL
because a *value* was SQL NULL.  In this case, it should, at the
least, automatically coalesce to 'null'::jsonb.

> Continuing along that thought, I'd even argue that you are
> writing code which relies on properties of the data which you never
> guaranteed. There is a use case for data types and constraints.

There is a use case, but this frequently comes up as a question people
ask.  At some point, you have to start pondering whether the behaviour
does not make logical sense in the context that people frame the JSONB
type and its associated manipulation functions.  It is not *obvious*
that jsonb_set() will trash your data, but that is what it is capable
of doing.  In a database that is advertised as being durable and not
trashing data, even.

> Not that I'm arguing for maximum surprise in programming, but
> I'm a little puzzled when people eschew the built-in tools and
> start implementing them again side-by-side with what's already
> there.

If you read the safe_jsonb_set() function, all we do is coalesce any
SQL NULL to 'null'::jsonb, which is what it should be doing anyway,
and then additionally handling any *unanticipated* failure case on top
of that.  While you are arguing that we should use tools to work
around unanticipated effects (that are not even documented -- in no
place in the jsonb_set() documentation does it say "if you pass SQL
NULL to this function as a value, it will return SQL NULL"), I am
arguing that jsonb_set() shouldn't set people up for their data to be
trashed in the first place.

Even MySQL gets this right.  MySQL!  The database that everyone knows
takes your data out for a night it will never forget.  This argument
is miserable.  It does not matter to me how jsonb_set() works as long
as it does not return NULL when passed NULL as a value to set.  JSONB
columns should be treated as the complex types that they are: you
don't null out an entire hash table because someone set a key to SQL
NULL.  So, please, let us fix this.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  wrote:
>
> On 10/18/19 3:11 PM, Ariadne Conill wrote:
> > Hello,
> >
> > On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
> >  wrote:
> >>
> >> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
> >>  wrote:
> >>>
> >>> ## Ariadne Conill (aria...@dereferenced.org):
> >>>
>  update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
>  Typically, this works nicely, except for cases where evaluating
>  info->'foo' results in an SQL null being returned.  When that happens,
>  jsonb_set() returns an SQL null, which then results in data loss.[3]
> >>>
> >>> So why don't you use the facilities of SQL to make sure to only
> >>> touch the rows which match the prerequisites?
> >>>
> >>>UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> >>>  WHERE info->'foo' IS NOT NULL;
> >>>
> >>
> >> There are many ways to add code to queries to make working with this 
> >> function safer - though using them presupposes one remembers at the time 
> >> of writing the query that there is danger and caveats in using this 
> >> function.  I agree that we should have (and now) provided sane defined 
> >> behavior when one of the inputs to the function is null instead blowing 
> >> off the issue and defining the function as being strict.  Whether that is 
> >> "ignore and return the original object" or "add the key with a json null 
> >> scalar value" is debatable but either is considerably more useful than 
> >> returning SQL NULL.
> >
> > A great example of how we got burned by this last year: Pleroma
> > maintains pre-computed counters in JSONB for various types of
> > activities (posts, followers, followings).  Last year, another counter
> > was added, with a migration.  But some people did not run the
> > migration, because they are users, and that's what users do.  This
>
> So you are more forgiving of your misstep, allowing users to run
> outdated code, than of running afoul of Postgres documented behavior:

I'm not forgiving of either.

> https://www.postgresql.org/docs/11/functions-json.html
> " The field/element/path extraction operators return NULL, rather than
> failing, if the JSON input does not have the right structure to match
> the request; for example if no such element exists"

It is known that the extraction operators return NULL.  The problem
here is jsonb_set() returning NULL when it encounters SQL NULL.

> Just trying to figure why one is worse than the other.

Any time a user loses data, it is worse.  The preference for not
having data loss is why Pleroma uses PostgreSQL as its database of
choice, as PostgreSQL has traditionally valued durability.  If we
should not use PostgreSQL, just say so.

Ariadne

>
> > resulted in Pleroma blanking out the `info` structure for users as
> > they performed new activities that incremented that counter.  At that
> > time, Pleroma maintained various things like private keys used to sign
> > things in that JSONB column (we no longer do this because of being
> > burned by this several times now), which broke federation temporarily
> > for the affected accounts with other servers for up to a week as those
> > servers had to learn new public keys for those accounts (since the
> > original private keys were lost).
> >
> > I believe that anything that can be catastrophically broken by users
> > not following upgrade instructions precisely is a serious problem, and
> > can lead to serious problems.  I am sure that this is not the only
> > project using JSONB which have had users destroy their own data in
> > such a completely preventable fashion.
> >
> > Ariadne
> >
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Adrian Klaver

On 10/18/19 3:11 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:


On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder  
wrote:


## Ariadne Conill (aria...@dereferenced.org):


update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]


So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
 WHERE info->'foo' IS NOT NULL;



There are many ways to add code to queries to make working with this function safer - though using 
them presupposes one remembers at the time of writing the query that there is danger and caveats in 
using this function.  I agree that we should have (and now) provided sane defined behavior when one 
of the inputs to the function is null instead blowing off the issue and defining the function as 
being strict.  Whether that is "ignore and return the original object" or "add the 
key with a json null scalar value" is debatable but either is considerably more useful than 
returning SQL NULL.


A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This


So you are more forgiving of your misstep, allowing users to run 
outdated code, than of running afoul of Postgres documented behavior:


https://www.postgresql.org/docs/11/functions-json.html
" The field/element/path extraction operators return NULL, rather than 
failing, if the JSON input does not have the right structure to match 
the request; for example if no such element exists"


Just trying to figure why one is worse than the other.


resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious problems.  I am sure that this is not the only
project using JSONB which have had users destroy their own data in
such a completely preventable fashion.

Ariadne






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:
>
> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
>  wrote:
>>
>> ## Ariadne Conill (aria...@dereferenced.org):
>>
>> >update users set info=jsonb_set(info, '{bar}', info->'foo');
>> >
>> > Typically, this works nicely, except for cases where evaluating
>> > info->'foo' results in an SQL null being returned.  When that happens,
>> > jsonb_set() returns an SQL null, which then results in data loss.[3]
>>
>> So why don't you use the facilities of SQL to make sure to only
>> touch the rows which match the prerequisites?
>>
>>   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
>> WHERE info->'foo' IS NOT NULL;
>>
>
> There are many ways to add code to queries to make working with this function 
> safer - though using them presupposes one remembers at the time of writing 
> the query that there is danger and caveats in using this function.  I agree 
> that we should have (and now) provided sane defined behavior when one of the 
> inputs to the function is null instead blowing off the issue and defining the 
> function as being strict.  Whether that is "ignore and return the original 
> object" or "add the key with a json null scalar value" is debatable but 
> either is considerably more useful than returning SQL NULL.

A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This
resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious problems.  I am sure that this is not the only
project using JSONB which have had users destroy their own data in
such a completely preventable fashion.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread David G. Johnston
On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
wrote:

> ## Ariadne Conill (aria...@dereferenced.org):
>
> >update users set info=jsonb_set(info, '{bar}', info->'foo');
> >
> > Typically, this works nicely, except for cases where evaluating
> > info->'foo' results in an SQL null being returned.  When that happens,
> > jsonb_set() returns an SQL null, which then results in data loss.[3]
>
> So why don't you use the facilities of SQL to make sure to only
> touch the rows which match the prerequisites?
>
>   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> WHERE info->'foo' IS NOT NULL;
>
>
There are many ways to add code to queries to make working with this
function safer - though using them presupposes one remembers at the time of
writing the query that there is danger and caveats in using this function.
I agree that we should have (and now) provided sane defined behavior when
one of the inputs to the function is null instead blowing off the issue and
defining the function as being strict.  Whether that is "ignore and return
the original object" or "add the key with a json null scalar value" is
debatable but either is considerably more useful than returning SQL NULL.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Christoph Moench-Tegeder
## Ariadne Conill (aria...@dereferenced.org):

>update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned.  When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]

So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

  UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
WHERE info->'foo' IS NOT NULL;

No special wrappers required.

Regards,
Christoph

-- 
Spare Space




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Mark Felder



On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote:
> Hello,
> 
> I am one of the primary maintainers of Pleroma, a federated social
> networking application written in Elixir, which uses PostgreSQL in
> ways that may be considered outside the typical usage scenarios for
> PostgreSQL.
> 
> Namely, we leverage JSONB heavily as a backing store for JSON-LD
> documents[1].  We also use JSONB in combination with Ecto's "embedded
> structs" to store things like user preferences.
> 
> The fact that we can use JSONB to achieve our design goals is a
> testament to the flexibility PostgreSQL has.
> 
> However, in the process of doing so, we have discovered a serious flaw
> in the way jsonb_set() functions, but upon reading through this
> mailing list, we have discovered that this flaw appears to be an
> intentional design.[2]
> 
> A few times now, we have written migrations that do things like copy
> keys in a JSONB object to a new key, to rename them.  These migrations
> look like so:
> 
>update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned.  When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]
> 
> This is not acceptable.  PostgreSQL is a database that is renowned for
> data integrity, but here it is wiping out data when it encounters a
> failure case.  The way jsonb_set() should fail in this case is to
> simply return the original input: it should NEVER return SQL null.
> 
> But hey, we've been burned by this so many times now that we'd like to
> donate a useful function to the commons, consider it a mollyguard for
> the real jsonb_set() function.
> 
> create or replace function safe_jsonb_set(target jsonb, path
> text[], new_value jsonb, create_missing boolean default true) returns
> jsonb as $$
> declare
>   result jsonb;
> begin
>   result := jsonb_set(target, path, coalesce(new_value,
> 'null'::jsonb), create_missing);
>   if result is NULL then
> return target;
>   else
> return result;
>   end if;
> end;
> $$ language plpgsql;
> 
> This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
> own jsonb_set() should have this safety feature built in.  Without it,
> using jsonb_set() is like playing russian roulette with your data,
> which is not a reasonable expectation for a database renowned for its
> commitment to data integrity.
> 
> Please fix this bug so that we do not have to hack around this bug.
> It has probably ruined countless people's days so far.  I don't want
> to hear about how the function is strict, I'm aware it is strict, and
> that strictness is harmful.  Please fix the function so that it is
> actually safe to use.
> 
> [1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
> representation" that shares similar qualities to JSON-LD, so I use
> JSON-LD here as a simplification.
> 
> [2]: 
> https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1...@blaine.gmane.org
> 
> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
> example of data loss induced by this issue.
> 
> Ariadne
>

This should be directed towards the hackers list, too.

What will it take to change the semantics of jsonb_set()? MySQL implements safe 
behavior here. It's a real shame Postgres does not. I'll offer a $200 bounty to 
whoever fixes it. I'm sure it's destroyed more than $200 worth of data and 
people's time by now, but it's something.


Kind regards,



-- 
  Mark Felder
  ports-secteam & portmgr alumni
  f...@freebsd.org




jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

I am one of the primary maintainers of Pleroma, a federated social
networking application written in Elixir, which uses PostgreSQL in
ways that may be considered outside the typical usage scenarios for
PostgreSQL.

Namely, we leverage JSONB heavily as a backing store for JSON-LD
documents[1].  We also use JSONB in combination with Ecto's "embedded
structs" to store things like user preferences.

The fact that we can use JSONB to achieve our design goals is a
testament to the flexibility PostgreSQL has.

However, in the process of doing so, we have discovered a serious flaw
in the way jsonb_set() functions, but upon reading through this
mailing list, we have discovered that this flaw appears to be an
intentional design.[2]

A few times now, we have written migrations that do things like copy
keys in a JSONB object to a new key, to rename them.  These migrations
look like so:

   update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]

This is not acceptable.  PostgreSQL is a database that is renowned for
data integrity, but here it is wiping out data when it encounters a
failure case.  The way jsonb_set() should fail in this case is to
simply return the original input: it should NEVER return SQL null.

But hey, we've been burned by this so many times now that we'd like to
donate a useful function to the commons, consider it a mollyguard for
the real jsonb_set() function.

create or replace function safe_jsonb_set(target jsonb, path text[],
    new_value jsonb, create_missing boolean default true)
returns jsonb as $$
declare
  result jsonb;
begin
  result := jsonb_set(target, path, coalesce(new_value, 'null'::jsonb),
    create_missing);
  if result is NULL then
    return target;
  else
    return result;
  end if;
end;
$$ language plpgsql;
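
With it, the migration above becomes (same behaviour, minus the foot-gun):

    update users set info=safe_jsonb_set(info, '{bar}', info->'foo');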

This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
own jsonb_set() should have this safety feature built in.  Without it,
using jsonb_set() is like playing Russian roulette with your data,
which is not a reasonable expectation for a database renowned for its
commitment to data integrity.

Please fix this bug so that we do not have to hack around this bug.
It has probably ruined countless people's days so far.  I don't want
to hear about how the function is strict, I'm aware it is strict, and
that strictness is harmful.  Please fix the function so that it is
actually safe to use.

[1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
representation" that shares similar qualities to JSON-LD, so I use
JSON-LD here as a simplification.

[2]: 
https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1...@blaine.gmane.org

[3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
example of data loss induced by this issue.

Ariadne