Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Pavel Stehule
Hi


> What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL.
>
> postgres=# \pset null '(null)'
> Null display is "(null)".
> postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
> jsonb_set
> ---
> (null)
> (1 row)
>
> This behaviour is basically giving an application developer a loaded
> shotgun and pointing it at their feet.  It is not a good design.  It
> is a design which has likely led to many users experiencing
> unintentional data loss.
>

on the other hand - the PostgreSQL design is one possible design, and it returns
additional information about whether the value was changed or not.

Unfortunately, it is very unlikely that the design of this function will
be changed - it is simply not a bug (although I fully agree that it behaves
differently than in other databases, and that for some usages it is not practical).
There will probably be applications that need a NULL result in
situations where the value was not changed, or where the input value does not have
the expected format. The design used in Postgres allows later customization - with
COALESCE you can very simply implement the behaviour that you want (sure, you have to
know what you are doing). If Postgres implemented the design used by MySQL, then there
would be no possibility to react to the situation where the update was not processed.

It is not hard to implement a second function, with a different name, that has the
behaviour that you need and expect - it is just

CREATE OR REPLACE FUNCTION jsonb_modify(jsonb, text[], jsonb)
RETURNS jsonb AS $$
SELECT jsonb_set($1, $2, COALESCE($3, 'null'::jsonb), true);
$$ LANGUAGE sql;
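
For illustration, calling this wrapper with an SQL NULL value (result shown as a
comment, assuming the function above is created as-is):

SELECT jsonb_modify('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
-- returns {"a": null, "b": 2, "c": 3} instead of SQL NULL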

It is important to understand that JSON null is not PostgreSQL NULL. In this
case the problem is not in the PostgreSQL design, because it is consistent with
everything else in PG - the problem is in wrong expectations. Unfortunately, there
are a lot of wrong expectations, and these cannot all be covered by the Postgres
design, because then Postgres would be very inconsistent software. You can see - my
function jsonb_modify is what you expect, and it can work for you
perfectly, but from the system perspective it is not consistent, and very strongly
not consistent. Users should not have to learn where NULL has a different behaviour
or where NULL is JSON null. Built-in functions should be consistent in
Postgres. It is Postgres, not other databases.

Pavel





> Ariadne
>
>
>


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 7:04 PM Adrian Klaver  wrote:
>
> On 10/18/19 4:31 PM, Ariadne Conill wrote:
> > Hello,
> >
> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> > wrote:
> >>
> >> On 10/18/19 3:11 PM, Ariadne Conill wrote:
> >>> Hello,
> >>>
> >>> On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
> >>>  wrote:
> 
>  On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
>   wrote:
> >
> > ## Ariadne Conill (aria...@dereferenced.org):
> >
> >>  update users set info=jsonb_set(info, '{bar}', info->'foo');
> >>
> >> Typically, this works nicely, except for cases where evaluating
> >> info->'foo' results in an SQL null being returned.  When that happens,
> >> jsonb_set() returns an SQL null, which then results in data loss.[3]
> >
> > So why don't you use the facilities of SQL to make sure to only
> > touch the rows which match the prerequisites?
> >
> > UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> >   WHERE info->'foo' IS NOT NULL;
> >
> 
>  There are many ways to add code to queries to make working with this 
>  function safer - though using them presupposes one remembers at the time 
>  of writing the query that there is danger and caveats in using this 
>  function.  I agree that we should have (and now) provided sane defined 
>  behavior when one of the inputs to the function is null instead blowing 
>  off the issue and defining the function as being strict.  Whether that 
>  is "ignore and return the original object" or "add the key with a json 
>  null scalar value" is debatable but either is considerably more useful 
>  than returning SQL NULL.
> >>>
> >>> A great example of how we got burned by this last year: Pleroma
> >>> maintains pre-computed counters in JSONB for various types of
> >>> activities (posts, followers, followings).  Last year, another counter
> >>> was added, with a migration.  But some people did not run the
> >>> migration, because they are users, and that's what users do.  This
> >>
> >> So you are more forgiving of your misstep, allowing users to run
> >> outdated code, then of running afoul of Postgres documented behavior:
> >
> > I'm not forgiving of either.
> >
> >> https://www.postgresql.org/docs/11/functions-json.html
> >> " The field/element/path extraction operators return NULL, rather than
> >> failing, if the JSON input does not have the right structure to match
> >> the request; for example if no such element exists"
> >
> > It is known that the extraction operators return NULL.  The problem
> > here is jsonb_set() returning NULL when it encounters SQL NULL.
>
> I'm not following. Your original case was:
>
> jsonb_set(info, '{bar}', info->'foo');
>
> where info->'foo' is equivalent to:
>
> test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
>   ?column?
> --
>   NULL
>
> So you know there is a possibility that a value extraction could return
> NULL and from your wrapper that COALESCE is the way to deal with this.

You're not following because you don't want to follow.

It does not matter that info->'foo' is in my example.  That's not what
I am talking about.

What I am talking about is that jsonb_set(..., ..., NULL) returns SQL NULL.

postgres=# \pset null '(null)'
Null display is "(null)".
postgres=# select jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
 jsonb_set
-----------
 (null)
(1 row)

This behaviour is basically giving an application developer a loaded
shotgun and pointing it at their feet.  It is not a good design.  It
is a design which has likely led to many users experiencing
unintentional data loss.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 6:52 PM Stephen Frost  wrote:
>
> Greetings,
>
> * Ariadne Conill (aria...@dereferenced.org) wrote:
> > On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> > wrote:
> > > https://www.postgresql.org/docs/11/functions-json.html
> > > " The field/element/path extraction operators return NULL, rather than
> > > failing, if the JSON input does not have the right structure to match
> > > the request; for example if no such element exists"
> >
> > It is known that the extraction operators return NULL.  The problem
> > here is jsonb_set() returning NULL when it encounters SQL NULL.
> >
> > > Just trying to figure why one is worse then the other.
> >
> > Any time a user loses data, it is worse.  The preference for not
> > having data loss is why Pleroma uses PostgreSQL as it's database of
> > choice, as PostgreSQL has traditionally valued durability.  If we
> > should not use PostgreSQL, just say so.
>
> Your contention that the documented, clear, and easily addressed
> behavior of a particular strict function equates to "the database system
> loses data and isn't durable" is really hurting your arguments here, not
> helping it.
>
> The argument about how it's unintuitive and can cause application
> developers to misuse the function (which is clearly an application bug,
> but perhaps an understandable one if the function interface isn't
> intuitive or is confusing) is a reasonable one and might be convincing
> enough to result in a change here.
>
> I'd suggest sticking to the latter argument when making this case.
>
> > > > I believe that anything that can be catastrophically broken by users
> > > > not following upgrade instructions precisely is a serious problem, and
> > > > can lead to serious problems.  I am sure that this is not the only
> > > > project using JSONB which have had users destroy their own data in
> > > > such a completely preventable fashion.
>
> Let's be clear here that the issue with the upgrade instructions was
> that the user didn't follow your *application's* upgrade instructions,
> and your later code wasn't written to use the function, as documented,
> properly- this isn't a case of PG destroying your data.  It's fine to
> contend that the interface sucks and that we should change it, but the
> argument that PG is eating data because the application sent a query to
> the database telling it, based on our documentation, to eat the data,
> isn't appropriate.  Again, let's have a reasonable discussion here about
> if it makes sense to make a change here because the interface isn't
> intuitive and doesn't match what other systems do (I'm guessing it isn't
> in the SQL standard either, so we unfortunately can't look to that for
> help; though I'd hardly be surprised if they supported what PG does
> today anyway).

Okay, I will admit that saying PG is eating data is perhaps
hyperbolic, but I will also say that the behaviour of jsonb_set()
under this type of edge case is unintuitive and frequently results in
unintended data loss.  So, while PostgreSQL is not actually eating the
data, it is putting the user in a position where they may suffer data
loss if they are not extremely careful.

Here is how other implementations handle this case:

MySQL/MariaDB:

select json_set('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
   {"a":null,"b":2,"c":3}

Microsoft SQL Server:

select json_modify('{"a":1,"b":2,"c":3}', '$.a', NULL) results in:
   {"b":2,"c":3}

Both of these outcomes make sense, given the nature of JSON objects.
I am actually more in favor of what MSSQL does, however; I think that
makes the most sense of all.
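
For comparison, both behaviours can already be emulated in PostgreSQL today (just a
sketch, not part of any proposal):

-- MySQL-like: coalesce the SQL NULL into a JSON null before setting the key
SELECT jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', coalesce(NULL, 'null'::jsonb));
-- returns {"a": null, "b": 2, "c": 3}

-- MSSQL-like: remove the key instead
SELECT '{"a":1,"b":2,"c":3}'::jsonb #- '{a}';
-- returns {"b": 2, "c": 3}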

I did not compare to other database systems, because using them I
found that there is a JSON_TABLE type function and then you do stuff
with that to rewrite the object and dump it back out as JSON, and it's
quite a mess.  But MySQL and MSSQL have an equivalent jsonb inline
modification function, as seen above.

> As a practical response to the issue you've raised- have you considered
> using a trigger to check the validity of the new jsonb?  Or, maybe, just
> made the jsonb column not nullable?  With a trigger you could disallow
> non-null->null transitions, for example, or if it just shouldn't ever
> be null then making the column 'not null' would suffice.

We have already mitigated the issue in a way we find appropriate to
do.  The suggestion of having a non-null constraint does seem useful
as well and I will look into that.

> I'll echo Christoph's comments up thread too, though in my own language-
> these are risks you've explicitly accepted by using JSONB and writing
> your own validation and checks (or, not, apparently) rather than using
> what the database system provides.  That doesn't mean I'm against
> making the change you suggest, but it certainly should become a lesson
> to anyone who is considering using primarily jsonb for their storage
> that it's risky to do so, because you're removing the database system's
> knowledge and understanding 

Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Adrian Klaver

On 10/18/19 4:31 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  wrote:


On 10/18/19 3:11 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:


On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder  
wrote:


## Ariadne Conill (aria...@dereferenced.org):


 update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]


So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
  WHERE info->'foo' IS NOT NULL;



There are many ways to add code to queries to make working with this function safer - though using 
them presupposes one remembers at the time of writing the query that there is danger and caveats in 
using this function.  I agree that we should have (and now) provided sane defined behavior when one 
of the inputs to the function is null instead blowing off the issue and defining the function as 
being strict.  Whether that is "ignore and return the original object" or "add the 
key with a json null scalar value" is debatable but either is considerably more useful than 
returning SQL NULL.


A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This


So you are more forgiving of your misstep, allowing users to run
outdated code, then of running afoul of Postgres documented behavior:


I'm not forgiving of either.


https://www.postgresql.org/docs/11/functions-json.html
" The field/element/path extraction operators return NULL, rather than
failing, if the JSON input does not have the right structure to match
the request; for example if no such element exists"


It is known that the extraction operators return NULL.  The problem
here is jsonb_set() returning NULL when it encounters SQL NULL.


I'm not following. Your original case was:

jsonb_set(info, '{bar}', info->'foo');

where info->'foo' is equivalent to:

test=# select '{"f1":1,"f2":null}'::jsonb ->'f3';
 ?column?
--
 NULL

So you know there is a possibility that a value extraction could return 
NULL and from your wrapper that COALESCE is the way to deal with this.






Just trying to figure why one is worse then the other.


Any time a user loses data, it is worse.  The preference for not
having data loss is why Pleroma uses PostgreSQL as its database of
choice, as PostgreSQL has traditionally valued durability.  If we
should not use PostgreSQL, just say so.


There are any number of ways you can make Postgres lose data that are 
not related to durability, e.g. build the following in code:


DELETE FROM some_table;

and forget the WHERE.



Ariadne




resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious problems.  I am sure that this is not the only
project using JSONB which have had users destroy their own data in
such a completely preventable fashion.

Ariadne






--
Adrian Klaver
adrian.kla...@aklaver.com





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Stephen Frost
Greetings,

* Ariadne Conill (aria...@dereferenced.org) wrote:
> On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder
>  wrote:
> > ## Ariadne Conill (aria...@dereferenced.org):
> > > Why don't we fix the database engine to not eat data when the
> > > jsonb_set() operation fails?
> >
> > It didn't fail, it worked like SQL (you've been doing SQL for too
> > long when you get used to the NULL propagation, but that's still
> > what SQL does - check "+" for example).
> > And changing a function will cause fun for everyone who relies on
> > the current behaviour - so at least it shouldn't be done on a whim
> > (some might argue that a whim was what got us into this situation
> > in the first place).
> 
> NULL propagation makes sense in the context of traditional SQL.  What
> users expect from the JSONB support is for it to behave as JSON
> manipulation behaves everywhere else.  It makes sense that 2 + NULL
> returns NULL -- it's easily understood as a type mismatch.  It does
> not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL
> because a *value* was SQL NULL.  In this case, it should, at the
> least, automatically coalesce to 'null'::jsonb.

2 + NULL isn't a type mismatch, just to be clear, it's "2 + unknown =
unknown", which is pretty reasonable, if you accept the general notion
of what NULL is to begin with.

And as such, what follows - "setting this blob of stuff to include this
unknown thing ... implies ... we don't know what the result of the set
is and therefore it's unknown" - isn't entirely unreasonable.  But I can
agree that in this specific case, because what we're dealing with
regarding JSONB is a complex data structure, not an individual value,
it's surprising to a developer, and there is an argument to be made
that we should consider changing it.

> > Continuing along that thought, I'd even argue that your are
> > writing code which relies on properties of the data which you never
> > guaranteed. There is a use case for data types and constraints.
> 
> There is a use case, but this frequently comes up as a question people
> ask.  At some point, you have to start pondering whether the behaviour
> does not make logical sense in the context that people frame the JSONB
> type and it's associated manipulation functions.  It is not *obvious*
> that jsonb_set() will trash your data, but that is what it is capable
> of doing.  In a database that is advertised as being durable and not
> trashing data, even.

Having the result of a call to a strict function be NULL isn't
"trashing" your data.

> > Not that I'm arguing for maximum surprise in programming, but
> > I'm a little puzzled when people eschew thew built-in tools and
> > start implmenting them again side-to-side with what's already
> > there.
> 
> If you read the safe_jsonb_set() function, all we do is coalesce any
> SQL NULL to 'null'::jsonb, which is what it should be doing anyway,

I'm not convinced that this is at all the right answer, particularly
since we don't do that generally.  We don't return the string 'null'
when you do: NULL || 'abc', we return NULL.  There might be something we
can do here that doesn't result in the whole jsonb document becoming
NULL though.

> and then additionally handling any *unanticipated* failure case on top
> of that.  While you are arguing that we should use tools to work
> around unanticipated effects (that are not even documented -- in no
> place in the jsonb_set() documentation does it say "if you pass SQL
> NULL to this function as a value, it will return SQL NULL"), I am
> arguing that jsonb_set() shouldn't set people up for their data to be
> trashed in the first place.

The function is marked as strict, and the meaning of that is quite
clearly defined in the documentation.  I'm not against including a
comment regarding this in the documentation, to be clear, though I
seriously doubt it would actually have changed anything in this case.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Stephen Frost
Greetings,

* Ariadne Conill (aria...@dereferenced.org) wrote:
> On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  
> wrote:
> > https://www.postgresql.org/docs/11/functions-json.html
> > " The field/element/path extraction operators return NULL, rather than
> > failing, if the JSON input does not have the right structure to match
> > the request; for example if no such element exists"
> 
> It is known that the extraction operators return NULL.  The problem
> here is jsonb_set() returning NULL when it encounters SQL NULL.
> 
> > Just trying to figure why one is worse then the other.
> 
> Any time a user loses data, it is worse.  The preference for not
> having data loss is why Pleroma uses PostgreSQL as its database of
> choice, as PostgreSQL has traditionally valued durability.  If we
> should not use PostgreSQL, just say so.

Your contention that the documented, clear, and easily addressed
behavior of a particular strict function equates to "the database system
loses data and isn't durable" is really hurting your argument here, not
helping it.

The argument about how it's unintuitive and can cause application
developers to misuse the function (which is clearly an application bug,
but perhaps an understandable one if the function interface isn't
intuitive or is confusing) is a reasonable one and might be convincing
enough to result in a change here.

I'd suggest sticking to the latter argument when making this case.

> > > I believe that anything that can be catastrophically broken by users
> > > not following upgrade instructions precisely is a serious problem, and
> > > can lead to serious problems.  I am sure that this is not the only
> > > project using JSONB which have had users destroy their own data in
> > > such a completely preventable fashion.

Let's be clear here that the issue with the upgrade instructions was
that the user didn't follow your *application's* upgrade instructions,
and your later code wasn't written to use the function, as documented,
properly- this isn't a case of PG destroying your data.  It's fine to
contend that the interface sucks and that we should change it, but the
argument that PG is eating data because the application sent a query to
the database telling it, based on our documentation, to eat the data,
isn't appropriate.  Again, let's have a reasonable discussion here about
if it makes sense to make a change here because the interface isn't
intuitive and doesn't match what other systems do (I'm guessing it isn't
in the SQL standard either, so we unfortunately can't look to that for
help; though I'd hardly be surprised if they supported what PG does
today anyway).

As a practical response to the issue you've raised- have you considered
using a trigger to check the validity of the new jsonb?  Or, maybe, just
made the jsonb column not nullable?  With a trigger you could disallow
non-null->null transitions, for example, or if it just shouldn't ever
be null then making the column 'not null' would suffice.
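
As a rough sketch of the trigger idea (the table and column names here are just
placeholders for something like Pleroma's users.info, not taken from the actual schema):

CREATE FUNCTION info_no_null_overwrite() RETURNS trigger AS $$
BEGIN
  -- refuse to replace an existing non-NULL jsonb document with SQL NULL
  IF OLD.info IS NOT NULL AND NEW.info IS NULL THEN
    RAISE EXCEPTION 'refusing to overwrite non-NULL info with NULL';
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_info_guard
  BEFORE UPDATE OF info ON users
  FOR EACH ROW EXECUTE PROCEDURE info_no_null_overwrite();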

I'll echo Christoph's comments up thread too, though in my own language-
these are risks you've explicitly accepted by using JSONB and writing
your own validation and checks (or, not, apparently) rather than using
what the database system provides.  That doesn't mean I'm against
making the change you suggest, but it certainly should become a lesson
to anyone who is considering using primarily jsonb for their storage
that it's risky to do so, because you're removing the database system's
knowledge and understanding of the data, and further you tend to end up
not having the necessary constraints in place to ensure that the data
doesn't end up being garbage- thus letting your application destroy all
the data easily due to an application bug.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 5:57 PM Christoph Moench-Tegeder
 wrote:
>
> ## Ariadne Conill (aria...@dereferenced.org):
>
> > Why don't we fix the database engine to not eat data when the
> > jsonb_set() operation fails?
>
> It didn't fail, it worked like SQL (you've been doing SQL for too
> long when you get used to the NULL propagation, but that's still
> what SQL does - check "+" for example).
> And changing a function will cause fun for everyone who relies on
> the current behaviour - so at least it shouldn't be done on a whim
> (some might argue that a whim was what got us into this situation
> in the first place).

NULL propagation makes sense in the context of traditional SQL.  What
users expect from the JSONB support is for it to behave as JSON
manipulation behaves everywhere else.  It makes sense that 2 + NULL
returns NULL -- it's easily understood as a type mismatch.  It does
not make sense that jsonb_set('{}'::jsonb, '{foo}', NULL) returns NULL
because a *value* was SQL NULL.  In this case, it should, at the
least, automatically coalesce to 'null'::jsonb.

> Continuing along that thought, I'd even argue that you are
> writing code which relies on properties of the data which you never
> guaranteed. There is a use case for data types and constraints.

There is a use case, but this frequently comes up as a question people
ask.  At some point, you have to start pondering whether the behaviour
does not make logical sense in the context that people frame the JSONB
type and its associated manipulation functions.  It is not *obvious*
that jsonb_set() will trash your data, but that is what it is capable
of doing.  In a database that is advertised as being durable and not
trashing data, even.

> Not that I'm arguing for maximum surprise in programming, but
> I'm a little puzzled when people eschew the built-in tools and
> start implementing them again side by side with what's already
> there.

If you read the safe_jsonb_set() function, all we do is coalesce any
SQL NULL to 'null'::jsonb, which is what it should be doing anyway,
and then additionally handling any *unanticipated* failure case on top
of that.  While you are arguing that we should use tools to work
around unanticipated effects (that are not even documented -- in no
place in the jsonb_set() documentation does it say "if you pass SQL
NULL to this function as a value, it will return SQL NULL"), I am
arguing that jsonb_set() shouldn't set people up for their data to be
trashed in the first place.

Even MySQL gets this right.  MySQL!  The database that everyone knows
takes your data out for a night it will never forget.  This argument
is miserable.  It does not matter to me how jsonb_set() works as long
as it does not return NULL when passed NULL as a value to set.  JSONB
columns should be treated as the complex types that they are: you
don't null out an entire hash table because someone set a key to SQL
NULL.  So, please, let us fix this.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 6:01 PM Adrian Klaver  wrote:
>
> On 10/18/19 3:11 PM, Ariadne Conill wrote:
> > Hello,
> >
> > On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
> >  wrote:
> >>
> >> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
> >>  wrote:
> >>>
> >>> ## Ariadne Conill (aria...@dereferenced.org):
> >>>
>  update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
>  Typically, this works nicely, except for cases where evaluating
>  info->'foo' results in an SQL null being returned.  When that happens,
>  jsonb_set() returns an SQL null, which then results in data loss.[3]
> >>>
> >>> So why don't you use the facilities of SQL to make sure to only
> >>> touch the rows which match the prerequisites?
> >>>
> >>>UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> >>>  WHERE info->'foo' IS NOT NULL;
> >>>
> >>
> >> There are many ways to add code to queries to make working with this 
> >> function safer - though using them presupposes one remembers at the time 
> >> of writing the query that there is danger and caveats in using this 
> >> function.  I agree that we should have (and now) provided sane defined 
> >> behavior when one of the inputs to the function is null instead blowing 
> >> off the issue and defining the function as being strict.  Whether that is 
> >> "ignore and return the original object" or "add the key with a json null 
> >> scalar value" is debatable but either is considerably more useful than 
> >> returning SQL NULL.
> >
> > A great example of how we got burned by this last year: Pleroma
> > maintains pre-computed counters in JSONB for various types of
> > activities (posts, followers, followings).  Last year, another counter
> > was added, with a migration.  But some people did not run the
> > migration, because they are users, and that's what users do.  This
>
> So you are more forgiving of your misstep, allowing users to run
> outdated code, than of running afoul of Postgres documented behavior:

I'm not forgiving of either.

> https://www.postgresql.org/docs/11/functions-json.html
> " The field/element/path extraction operators return NULL, rather than
> failing, if the JSON input does not have the right structure to match
> the request; for example if no such element exists"

It is known that the extraction operators return NULL.  The problem
here is jsonb_set() returning NULL when it encounters SQL NULL.

> Just trying to figure why one is worse than the other.

Any time a user loses data, it is worse.  The preference for not
having data loss is why Pleroma uses PostgreSQL as its database of
choice, as PostgreSQL has traditionally valued durability.  If we
should not use PostgreSQL, just say so.

Ariadne

>
> > resulted in Pleroma blanking out the `info` structure for users as
> > they performed new activities that incremented that counter.  At that
> > time, Pleroma maintained various things like private keys used to sign
> > things in that JSONB column (we no longer do this because of being
> > burned by this several times now), which broke federation temporarily
> > for the affected accounts with other servers for up to a week as those
> > servers had to learn new public keys for those accounts (since the
> > original private keys were lost).
> >
> > I believe that anything that can be catastrophically broken by users
> > not following upgrade instructions precisely is a serious problem, and
> > can lead to serious problems.  I am sure that this is not the only
> > project using JSONB which have had users destroy their own data in
> > such a completely preventable fashion.
> >
> > Ariadne
> >
> >
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Adrian Klaver

On 10/18/19 3:11 PM, Ariadne Conill wrote:

Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:


On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder  
wrote:


## Ariadne Conill (aria...@dereferenced.org):


update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]


So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
 WHERE info->'foo' IS NOT NULL;



There are many ways to add code to queries to make working with this function safer - though using 
them presupposes one remembers at the time of writing the query that there is danger and caveats in 
using this function.  I agree that we should have (and now) provided sane defined behavior when one 
of the inputs to the function is null instead blowing off the issue and defining the function as 
being strict.  Whether that is "ignore and return the original object" or "add the 
key with a json null scalar value" is debatable but either is considerably more useful than 
returning SQL NULL.


A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This


So you are more forgiving of your misstep, allowing users to run 
outdated code, than of running afoul of Postgres documented behavior:


https://www.postgresql.org/docs/11/functions-json.html
" The field/element/path extraction operators return NULL, rather than 
failing, if the JSON input does not have the right structure to match 
the request; for example if no such element exists"


Just trying to figure why one is worse than the other.


resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious problems.  I am sure that this is not the only
project using JSONB which have had users destroy their own data in
such a completely preventable fashion.

Ariadne






--
Adrian Klaver
adrian.kla...@aklaver.com




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

On Fri, Oct 18, 2019 at 5:01 PM David G. Johnston
 wrote:
>
> On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
>  wrote:
>>
>> ## Ariadne Conill (aria...@dereferenced.org):
>>
>> >update users set info=jsonb_set(info, '{bar}', info->'foo');
>> >
>> > Typically, this works nicely, except for cases where evaluating
>> > info->'foo' results in an SQL null being returned.  When that happens,
>> > jsonb_set() returns an SQL null, which then results in data loss.[3]
>>
>> So why don't you use the facilities of SQL to make sure to only
>> touch the rows which match the prerequisites?
>>
>>   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
>> WHERE info->'foo' IS NOT NULL;
>>
>
> There are many ways to add code to queries to make working with this function 
> safer - though using them presupposes one remembers at the time of writing 
> the query that there is danger and caveats in using this function.  I agree 
> that we should have (and now) provided sane defined behavior when one of the 
> inputs to the function is null instead blowing off the issue and defining the 
> function as being strict.  Whether that is "ignore and return the original 
> object" or "add the key with a json null scalar value" is debatable but 
> either is considerably more useful than returning SQL NULL.

A great example of how we got burned by this last year: Pleroma
maintains pre-computed counters in JSONB for various types of
activities (posts, followers, followings).  Last year, another counter
was added, with a migration.  But some people did not run the
migration, because they are users, and that's what users do.  This
resulted in Pleroma blanking out the `info` structure for users as
they performed new activities that incremented that counter.  At that
time, Pleroma maintained various things like private keys used to sign
things in that JSONB column (we no longer do this because of being
burned by this several times now), which broke federation temporarily
for the affected accounts with other servers for up to a week as those
servers had to learn new public keys for those accounts (since the
original private keys were lost).

I believe that anything that can be catastrophically broken by users
not following upgrade instructions precisely is a serious problem, and
can lead to serious consequences.  I am sure that this is not the only
project using JSONB that has had users destroy their own data in
such a completely preventable fashion.

Ariadne




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread David G. Johnston
On Fri, Oct 18, 2019 at 2:50 PM Christoph Moench-Tegeder 
wrote:

> ## Ariadne Conill (aria...@dereferenced.org):
>
> >update users set info=jsonb_set(info, '{bar}', info->'foo');
> >
> > Typically, this works nicely, except for cases where evaluating
> > info->'foo' results in an SQL null being returned.  When that happens,
> > jsonb_set() returns an SQL null, which then results in data loss.[3]
>
> So why don't you use the facilities of SQL to make sure to only
> touch the rows which match the prerequisites?
>
>   UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
> WHERE info->'foo' IS NOT NULL;
>
>
There are many ways to add code to queries to make working with this
function safer - though using them presupposes one remembers at the time of
writing the query that there is danger and caveats in using this function.
I agree that we should have provided (and should now provide) sane, defined behavior
when one of the inputs to the function is null, instead of blowing off the issue and
defining the function as being strict.  Whether that is "ignore and return
the original object" or "add the key with a json null scalar value" is
debatable but either is considerably more useful than returning SQL NULL.

David J.


Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Christoph Moench-Tegeder
## Ariadne Conill (aria...@dereferenced.org):

>update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned.  When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]

So why don't you use the facilities of SQL to make sure to only
touch the rows which match the prerequisites?

  UPDATE users SET info = jsonb_set(info, '{bar}', info->'foo')
WHERE info->'foo' IS NOT NULL;

No special wrappers required.

Regards,
Christoph

-- 
Spare Space




Re: jsonb_set() strictness considered harmful to data

2019-10-18 Thread Mark Felder



On Fri, Oct 18, 2019, at 12:37, Ariadne Conill wrote:
> Hello,
> 
> I am one of the primary maintainers of Pleroma, a federated social
> networking application written in Elixir, which uses PostgreSQL in
> ways that may be considered outside the typical usage scenarios for
> PostgreSQL.
> 
> Namely, we leverage JSONB heavily as a backing store for JSON-LD
> documents[1].  We also use JSONB in combination with Ecto's "embedded
> structs" to store things like user preferences.
> 
> The fact that we can use JSONB to achieve our design goals is a
> testament to the flexibility PostgreSQL has.
> 
> However, in the process of doing so, we have discovered a serious flaw
> in the way jsonb_set() functions, but upon reading through this
> mailing list, we have discovered that this flaw appears to be an
> intentional design.[2]
> 
> A few times now, we have written migrations that do things like copy
> keys in a JSONB object to a new key, to rename them.  These migrations
> look like so:
> 
>update users set info=jsonb_set(info, '{bar}', info->'foo');
> 
> Typically, this works nicely, except for cases where evaluating
> info->'foo' results in an SQL null being returned.  When that happens,
> jsonb_set() returns an SQL null, which then results in data loss.[3]
> 
> This is not acceptable.  PostgreSQL is a database that is renowned for
> data integrity, but here it is wiping out data when it encounters a
> failure case.  The way jsonb_set() should fail in this case is to
> simply return the original input: it should NEVER return SQL null.
> 
> But hey, we've been burned by this so many times now that we'd like to
> donate a useful function to the commons, consider it a mollyguard for
> the real jsonb_set() function.
> 
> create or replace function safe_jsonb_set(target jsonb, path
> text[], new_value jsonb, create_missing boolean default true) returns
> jsonb as $$
> declare
>   result jsonb;
> begin
>   result := jsonb_set(target, path, coalesce(new_value,
> 'null'::jsonb), create_missing);
>   if result is NULL then
> return target;
>   else
> return result;
>   end if;
> end;
> $$ language plpgsql;
> 
> This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
> own jsonb_set() should have this safety feature built in.  Without it,
> using jsonb_set() is like playing russian roulette with your data,
> which is not a reasonable expectation for a database renowned for its
> commitment to data integrity.
> 
> Please fix this bug so that we do not have to hack around this bug.
> It has probably ruined countless people's days so far.  I don't want
> to hear about how the function is strict, I'm aware it is strict, and
> that strictness is harmful.  Please fix the function so that it is
> actually safe to use.
> 
> [1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
> representation" that shares similar qualities to JSON-LD, so I use
> JSON-LD here as a simplification.
> 
> [2]: 
> https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1...@blaine.gmane.org
> 
> [3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
> example of data loss induced by this issue.
> 
> Ariadne
>

This should be directed towards the hackers list, too.

What will it take to change the semantics of jsonb_set()? MySQL implements safe 
behavior here. It's a real shame Postgres does not. I'll offer a $200 bounty to 
whoever fixes it. I'm sure it's destroyed more than $200 worth of data and 
people's time by now, but it's something.


Kind regards,



-- 
  Mark Felder
  ports-secteam & portmgr alumni
  f...@freebsd.org




jsonb_set() strictness considered harmful to data

2019-10-18 Thread Ariadne Conill
Hello,

I am one of the primary maintainers of Pleroma, a federated social
networking application written in Elixir, which uses PostgreSQL in
ways that may be considered outside the typical usage scenarios for
PostgreSQL.

Namely, we leverage JSONB heavily as a backing store for JSON-LD
documents[1].  We also use JSONB in combination with Ecto's "embedded
structs" to store things like user preferences.

The fact that we can use JSONB to achieve our design goals is a
testament to the flexibility PostgreSQL has.

However, in the process of doing so, we have discovered a serious flaw
in the way jsonb_set() functions; upon reading through this
mailing list, we have found that this flaw appears to be an
intentional design.[2]

A few times now, we have written migrations that do things like copy
keys in a JSONB object to a new key, to rename them.  These migrations
look like so:

   update users set info=jsonb_set(info, '{bar}', info->'foo');

Typically, this works nicely, except for cases where evaluating
info->'foo' results in an SQL null being returned.  When that happens,
jsonb_set() returns an SQL null, which then results in data loss.[3]

This is not acceptable.  PostgreSQL is a database that is renowned for
data integrity, but here it is wiping out data when it encounters a
failure case.  The way jsonb_set() should fail in this case is to
simply return the original input: it should NEVER return SQL null.

But hey, we've been burned by this so many times now that we'd like to
donate a useful function to the commons, consider it a mollyguard for
the real jsonb_set() function.

create or replace function safe_jsonb_set(target jsonb, path text[],
    new_value jsonb, create_missing boolean default true) returns jsonb as $$
declare
  result jsonb;
begin
  result := jsonb_set(target, path, coalesce(new_value, 'null'::jsonb), create_missing);
  if result is NULL then
    return target;
  else
    return result;
  end if;
end;
$$ language plpgsql;
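
For the record, a quick sanity check of the wrapper (result shown as a comment,
assuming the function is created exactly as above):

select safe_jsonb_set('{"a":1,"b":2,"c":3}'::jsonb, '{a}', NULL);
-- returns {"a": null, "b": 2, "c": 3}; the document survives instead of becoming SQL NULL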

This safe_jsonb_set() wrapper should not be necessary.  PostgreSQL's
own jsonb_set() should have this safety feature built in.  Without it,
using jsonb_set() is like playing Russian roulette with your data,
which is not a reasonable expectation for a database renowned for its
commitment to data integrity.

Please fix this bug so that we do not have to hack around it.
It has probably ruined countless people's days so far.  I don't want
to hear about how the function is strict, I'm aware it is strict, and
that strictness is harmful.  Please fix the function so that it is
actually safe to use.

[1]: JSON-LD stands for JSON Linked Data.  Pleroma has an "internal
representation" that shares similar qualities to JSON-LD, so I use
JSON-LD here as a simplification.

[2]: 
https://www.postgresql.org/message-id/flat/qfkua9$2q0e$1...@blaine.gmane.org

[3]: https://git.pleroma.social/pleroma/pleroma/issues/1324 is an
example of data loss induced by this issue.

Ariadne




RE: Execute a function through fdw

2019-10-18 Thread Patrick FICHE
On Fri, Oct 18, 2019 at 17:53, Patrick FICHE 
 wrote:
Hi,

I got one more issue after I created my view.

I created it on my Server 1 but I am unable to view it on the Server 2.
I can see all tables through fdw after IMPORT FOREIGN SCHEMA.

I was able to get access to my view only after recreating the SERVER / USER 
MAPPING on Server 2.

Is it the expected behavior to have to recreate the FOREIGN SERVER / SCHEMA after a new 
table or view has been created?

| No, you don't need to recreate the foreign server. How did it not work?

After I created the view on Server 1, I tried to import it into the foreign 
schema on Server 2 using the LIMIT TO clause.
The command executed successfully but when I tried to query the view on Server 
2, it returned that the table did not exist.
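
For reference, the statement I ran looked roughly like this (server, schema and 
view names below are placeholders, not the real ones):

IMPORT FOREIGN SCHEMA remote_schema
  LIMIT TO (my_new_view)
  FROM SERVER server1
  INTO local_schema;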

So, I tried to reimport the full schema (after dropping / recreating the schema 
on Server 2); this time I ran IMPORT FOREIGN SCHEMA without any LIMIT TO clause. I 
could see all tables but still not the view.

The only solution that I found in order to get the view accessible on Server 2 
was to recreate the SERVER / USER MAPPING and IMPORT FOREIGN SCHEMA.

My Postgres version on both servers is 11.5.

Regards,



RE: Execute a function through fdw

2019-10-18 Thread Patrick FICHE
Hi,

I got one more issue after I created my view.

I created it on my Server 1 but I am unable to view it on the Server 2.
I can see all tables through fdw after IMPORT FOREIGN SCHEMA.

I was able to get access to my view only after recreating the SERVER / USER 
MAPPING on Server 2.

Is it the expected behavior to recreate the FOREIGN SERVER / SCHEMA after a new 
table or view has been created?

Regards,


Patrick Fiche
Database Engineer, Aqsacom Sas.
c. 33 6 82 80 69 96 



-Original Message-
From: Patrick FICHE  
Sent: Friday, October 18, 2019 2:35 PM
To: Tom Lane ; Guillaume Lelarge 
Cc: pgsql-generallists.postgresql.org 
Subject: RE: Execute a function through fdw

Thanks a lot for your answer.
Using a view is really a good solution for my case.
As I already use fdw for some other cases, I prefer not to mix with dblink.

Regards,

Patrick Fiche



-Original Message-
From: Tom Lane 
Sent: Friday, October 18, 2019 1:55 PM
To: Guillaume Lelarge 
Cc: Patrick FICHE ; 
pgsql-generallists.postgresql.org 
Subject: Re: Execute a function through fdw

Guillaume Lelarge  writes:
> On Fri, Oct 18, 2019 at 11:51, Patrick FICHE
>  wrote:
>> Is it possible to execute a function located on a server accessed 
>> through Postgres fdw.

> It's probably easier to create a view on the remote server, and access 
> it as a foreign table on the local server.

Yeah.  Or if you really want to call a remote function by name, see dblink.  
postgres_fdw actively avoids doing that sort of thing.
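
For completeness, a dblink call looks roughly like this (the connection string,
remote function name and result column type are placeholders, and the dblink
extension must be installed first with CREATE EXTENSION dblink):

SELECT *
  FROM dblink('host=server1 dbname=mydb user=app',
              'SELECT my_remote_function()') AS t(result integer);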

regards, tom lane


Re: DBD::Pg exorts char columns with trailing blanks

2019-10-18 Thread Rob Sargent



On 10/18/19 9:19 AM, Adrian Klaver wrote:

On 10/18/19 8:15 AM, Rob Sargent wrote:


On 10/18/19 8:51 AM, Adrian Klaver wrote:

On 10/18/19 7:42 AM, Matthias Apitz wrote:
On Friday, October 18, 2019 at 03:01:58 PM +0200, Tom 
Lane wrote:



Matthias Apitz  writes:
When we export char columns with our Perl tools, they come out 
with trailing blanks (in Sybase they don't). Can this be suppressed?


Switch to varchar, perhaps?

    regards, tom lane


Sometimes people does not know, what they propose. We have a 
historical
25 years grown ILS which runs on top of Sybase, Oracle, Informix 
... and

should now be ported to PostgreSQL. We can't simple switch internal
table structures and adopt some 10.000.000 lines of code (or debug 
while

it is now crashing).


That was not mentioned in the original post. Anyway:

https://metacpan.org/pod/DBD::Pg#ChopBlanks-(boolean,-inherited)

https://metacpan.org/pod/DBI

"ChopBlanks

Type: boolean, inherited

The ChopBlanks attribute can be used to control the trimming of 
trailing space characters from fixed width character (CHAR) fields. 
No other field types are affected, even where field values have 
trailing spaces.


The default is false (although it is possible that the default may 
change). Applications that need specific behaviour should set the 
attribute as needed.


Drivers are not required to support this attribute, but any driver 
which does not support it must arrange to return undef as the 
attribute value."




Thanks anyway.

matthias





It seems to me you've simply exposed a bug in you ILS.  If blanks 
intentionally went in, would they not get truncated on the way out in 
the other systems?  If all trailing blanks are expendable perhaps 
your saves should remove them.  (And a trimming of the existing 
records is in order.)


The OP is dealing with char(acter) fields:

https://www.postgresql.org/docs/11/datatype-character.html

"If the string to be stored is shorter than the declared length, 
values of type character will be space-padded; ..."






Understood










Re: DBD::Pg exorts char columns with trailing blanks

2019-10-18 Thread Adrian Klaver

On 10/18/19 8:15 AM, Rob Sargent wrote:


On 10/18/19 8:51 AM, Adrian Klaver wrote:

On 10/18/19 7:42 AM, Matthias Apitz wrote:
On Friday, October 18, 2019 at 03:01:58 PM +0200, Tom Lane 
wrote:



Matthias Apitz  writes:
When we export char columns with our Perl tools, they come out with 
trailing blanks (in Sybase they don't). Can this be suppressed?


Switch to varchar, perhaps?

    regards, tom lane


Sometimes people does not know, what they propose. We have a historical
25 years grown ILS which runs on top of Sybase, Oracle, Informix ... and
should now be ported to PostgreSQL. We can't simple switch internal
table structures and adopt some 10.000.000 lines of code (or debug while
it is now crashing).


That was not mentioned in the original post. Anyway:

https://metacpan.org/pod/DBD::Pg#ChopBlanks-(boolean,-inherited)

https://metacpan.org/pod/DBI

"ChopBlanks

Type: boolean, inherited

The ChopBlanks attribute can be used to control the trimming of 
trailing space characters from fixed width character (CHAR) fields. No 
other field types are affected, even where field values have trailing 
spaces.


The default is false (although it is possible that the default may 
change). Applications that need specific behaviour should set the 
attribute as needed.


Drivers are not required to support this attribute, but any driver 
which does not support it must arrange to return undef as the 
attribute value."




Thanks anyway.

matthias





It seems to me you've simply exposed a bug in your ILS.  If blanks 
intentionally went in, would they not get truncated on the way out in 
the other systems?  If all trailing blanks are expendable perhaps your 
saves should remove them.  (And a trimming of the existing records is in 
order.)


The OP is dealing with char(acter) fields:

https://www.postgresql.org/docs/11/datatype-character.html

"If the string to be stored is shorter than the declared length, values 
of type character will be space-padded; ..."
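
A minimal SQL illustration of that, plus one server-side way to strip the padding 
when the schema cannot change (just a sketch, not a recommendation for the OP's ILS):

SELECT 'abc'::char(8) AS padded,                 -- displayed space-padded to 8 characters
       octet_length('abc'::char(8)) AS stored,   -- 8: the padding really is stored
       'abc'::char(8)::text AS stripped;         -- casting to text/varchar drops the trailing spaces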









--
Adrian Klaver
adrian.kla...@aklaver.com




Re: DBD::Pg exorts char columns with trailing blanks

2019-10-18 Thread Rob Sargent



On 10/18/19 8:51 AM, Adrian Klaver wrote:

On 10/18/19 7:42 AM, Matthias Apitz wrote:
On Friday, October 18, 2019 at 03:01:58 PM +0200, Tom Lane 
wrote:



Matthias Apitz  writes:
When we export char columns with our Perl tools, they come out  
with trailing blanks (in Sybase they don't). Can this be suppressed?


Switch to varchar, perhaps?

    regards, tom lane


Sometimes people does not know, what they propose. We have a historical
25 years grown ILS which runs on top of Sybase, Oracle, Informix ... and
should now be ported to PostgreSQL. We can't simple switch internal
table structures and adopt some 10.000.000 lines of code (or debug while
it is now crashing).


That was not mentioned in the original post. Anyway:

https://metacpan.org/pod/DBD::Pg#ChopBlanks-(boolean,-inherited)

https://metacpan.org/pod/DBI

"ChopBlanks

Type: boolean, inherited

The ChopBlanks attribute can be used to control the trimming of 
trailing space characters from fixed width character (CHAR) fields. No 
other field types are affected, even where field values have trailing 
spaces.


The default is false (although it is possible that the default may 
change). Applications that need specific behaviour should set the 
attribute as needed.


Drivers are not required to support this attribute, but any driver 
which does not support it must arrange to return undef as the 
attribute value."




Thanks anyway.

matthias





It seems to me you've simply exposed a bug in your ILS.  If blanks 
intentionally went in, would they not get truncated on the way out in 
the other systems?  If all trailing blanks are expendable perhaps your 
saves should remove them.  (And a trimming of the existing records is in 
order.)





Re: DBD::Pg exorts char columns with trailing blanks

2019-10-18 Thread Adrian Klaver

On 10/18/19 7:42 AM, Matthias Apitz wrote:

On Friday, October 18, 2019 at 03:01:58 PM +0200, Tom Lane wrote:


Matthias Apitz  writes:

When we export char columns with our Perl tools, they come out  with trailing 
blanks (in Sybase they don't). Can this be suppressed?


Switch to varchar, perhaps?

regards, tom lane


Sometimes people do not know what they propose. We have a historical,
25-years-grown ILS which runs on top of Sybase, Oracle, Informix ... and
should now be ported to PostgreSQL. We can't simply switch internal
table structures and adapt some 10.000.000 lines of code (or debug it while
it is crashing).


That was not mentioned in the original post. Anyway:

https://metacpan.org/pod/DBD::Pg#ChopBlanks-(boolean,-inherited)

https://metacpan.org/pod/DBI

"ChopBlanks

Type: boolean, inherited

The ChopBlanks attribute can be used to control the trimming of trailing 
space characters from fixed width character (CHAR) fields. No other 
field types are affected, even where field values have trailing spaces.


The default is false (although it is possible that the default may 
change). Applications that need specific behaviour should set the 
attribute as needed.


Drivers are not required to support this attribute, but any driver which 
does not support it must arrange to return undef as the attribute value."




Thanks anyway.

matthias





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: DBD::Pg exorts char columns with trailing blanks

2019-10-18 Thread Matthias Apitz
On Friday, October 18, 2019 at 03:01:58 PM +0200, Tom Lane wrote:

> Matthias Apitz  writes:
> > When we export char columns with our Perl tools, they come out  with 
> > trailing blanks (in Sybase they don't). Can this be suppressed?
> 
> Switch to varchar, perhaps?
> 
>   regards, tom lane

Sometimes people do not know what they propose. We have a historical,
25-years-grown ILS which runs on top of Sybase, Oracle, Informix ... and
should now be ported to PostgreSQL. We can't simply switch internal
table structures and adapt some 10.000.000 lines of code (or debug it while
it is crashing).

Thanks anyway.

matthias


-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

October 3rd! Congratulations! The Berlin TV Tower turns 50 
from: https://www.jungewelt.de/2019/10-02/index.php




Re: connection timeout with psycopg2

2019-10-18 Thread Adrian Klaver

On 10/16/19 2:29 AM, Vicente Juan Tomas Monserrat wrote:

Hi there,

I have been testing out the following architecture for PostgreSQL HA.

                        VIP
                         |
          +--------------+--------------+
          |                             |
   +------v------+               +------v------+
   |  pgBouncer  |               |  pgBouncer  |
   |      +      |               |      +      |
   |  keepalived |               |  keepalived |
   +------+------+               +------+------+
          |                             |
   +------v------+               +------v------+
   |   HAProxy   |               |   HAProxy   |
   +------+------+               +------+------+
          |                             |
   +------v------+               +------v------+
   |    PG01     |               |    PG02     |
   |  (patroni)  |               |  (patroni)  |
   +-------------+               +-------------+


I'm using this python script for checking the failover events in 
pgBouncer, HAProxy and Patroni (PostgreSQL HA solution).


#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim:fenc=utf-8

import psycopg2
ISOLEVEL = psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT
import time
from datetime import datetime

user = 'postgres'
password = 'secretpassword'
host = 'localhost'
port = '6432'
database = 'test'
LIMIT_RETRIES = 10


class DB():
    def __init__(self, user, password, host, port, database, reconnect):
        self.user = user
        self.password = password
        self.host = host
        self.port = port
        self.database = database
        self._connection = None
        self._cursor = None
        self.reconnect = reconnect
        self.init()

    def connect(self, retry_counter=0):
        if not self._connection:
            try:
                self._connection = psycopg2.connect(user=self.user, password=self.password,
                                                    host=self.host, port=self.port,
                                                    database=self.database, connect_timeout=3)
                retry_counter = 0
                self._connection.autocommit = True
                return self._connection
            except psycopg2.OperationalError as error:
                if not self.reconnect or retry_counter >= LIMIT_RETRIES:
                    raise error
                else:
                    retry_counter += 1
                    print("got error {}. reconnecting {}".format(str(error).strip(), retry_counter))
                    time.sleep(5)
                    self.connect(retry_counter)
            except (Exception, psycopg2.Error) as error:
                raise error

    def cursor(self):
        if not self._cursor or self._cursor.closed:
            if not self._connection:
                self.connect()
            self._cursor = self._connection.cursor()
        return self._cursor

    def execute(self, query, retry_counter=0):
        try:
            self._cursor.execute(query)
            retry_counter = 0
        except (psycopg2.DatabaseError, psycopg2.OperationalError) as error:
            if retry_counter >= LIMIT_RETRIES:
                raise error
            else:
                retry_counter += 1
                print("got error {}. retrying {}".format(str(error).strip(), retry_counter))
                time.sleep(1)
                self.reset()
                self.execute(query, retry_counter)
        except (Exception, psycopg2.Error) as error:
            raise error

    def reset(self):
        self.close()
        self.connect()
        self.cursor()

    def close(self):
        if self._connection:
            if self._cursor:
                self._cursor.close()
            self._connection.close()
            print("PostgreSQL connection is closed")
        self._connection = None
        self._cursor = None

    def init(self):
        self.connect()
        self.cursor()


db = DB(user=user, password=password, host=host, port=port, database=database, reconnect=True)
db.execute("create table if not exists t1 (id integer);")
i = 0
while True:
    db.execute("insert into t1(id) values(1);")
    if i % 100 == 0:
        print("%s: %d" % (datetime.now(), i))
    i = i + 1


When running this Python script against the pgBouncer VIP, it keeps
inserting data into the database. Then I stop one of the HAProxy services
(where the VIP lives); the connection hangs and never resumes. The VIP moves
to the other node, but the client/app doesn't notice and keeps waiting for
5 minutes before finally continuing. I've been looking for a default value
of 5 minutes somewhere, with no luck.


Observations and comments:
1) I would point out there is a Psycopg list:

https://www.postgresql.org/list/psycopg/

2) I am not sure where the 5 minutes comes in. I see LIMIT_RETRIES = 10 
and a sleep of 5 sec between retries.


3) Where did this:

"got error server conn crashed?  "

come from? I don't see that in the code.






$ python insert.py
2019-10-15 10:01:51.817585: 0
2019-10-15 10:01:51.901091: 100
2019-10-15 10:01:52.031583: 200
2019-10-15 10:01:52.126565: 300
2019-10-15 10:01:52.216502: 400
2019-10-15 10:01:52.307157: 500
2019-10-15 10:01:52.400867: 600
2019-10-15 10:01:52.497239: 700
2019-10-15 10:01:52.655689: 800
2019-10-15 10:01:52.777883: 900
got error server conn crashed?   << HAProxy stopped manually to force the VIP to move to the other node
server closed the connection unexpectedly
	This probably means the server terminated abnormally before or while processing the request.. retrying 1
PostgreSQL connection is closed
^C^C^C^C^C   << The connection gets stuck (kill PID)


I've tried exactly the same code logic in Java (using PostgreSQL JDBC)
and .NET Core (using Npgsql); both work fine when specifying the parameters
socketTimeout (Java) and 'Command Timeout' (.NET), respectively.


$ dotnet run
connection initialized
2019-10-15T08:27:28.843 0
2019-10-15T08:27:30.205 100
2019-10-15T08:27:31.566 200
got error: Exception while reading from stream. Retrying 1
connection closed
connection

RE: CVE-2018-1058

2019-10-18 Thread Lizeth Solis Aramayo
Thanks a lot.  It worked!

I will have to upgrade the 9.6.5 later.  It will take me more time.

Thank you again.


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Sent: Thursday, October 17, 2019 10:23
To: Lizeth Solis Aramayo; pgsql-gene...@postgresql.org
Subject: Re: CVE-2018-1058

On 10/17/19 6:46 AM, Lizeth Solis Aramayo wrote:
> Forgot a fourth option:
>
> 4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
> database.
>
>
> I don't know how to do that.  May you help me  please
>
> Is it just copy the pg_restore from one server to another?

It would be easier to just run the 9.6.15 version on the 9.6.15 machine against 
the 9.6.5 database, so:

pg_restore -h <9.6.5 hostname or IP> -p 5432 ...

Of course, how fast this runs would depend where the machines are relative to 
each on the network.

You could try copying the program, just not sure how compatible RH 6.5 and RH 
7.6 are with each other.

Your best bet would be to upgrade the 9.6.5 --> 9.6.15. There have been a lot 
of bug fixes in between.

>
>
>
>
>
> -Original Message-
> From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
> Sent: Wednesday, October 16, 2019 18:32
> To: Lizeth Solis Aramayo; pgsql-gene...@postgresql.org
> Subject: Re: CVE-2018-1058
>
> On 10/16/19 1:05 PM, Lizeth Solis Aramayo wrote:
>> What commands did you use to dump the 9.6.15 version and restore to
>> the
>> 9.6.5 version?
>>
>> Pg_dump -p 5433 -U postgres -Fc -d dbkerp -n param > param.dump And
>> Pg_restore -p 5432 -U postgres -d dbkerp param.dump
>>
>>
>> Server with pg_dump is Linux red hat 7.6
>>
>> Server with pg_restore is linux red hat 6.5
>>
>>
>> In both servers I have postgresql 9.6, but in pg_dump is 9.6.15,   and in 
>> pg_restore is 9.6.5.
>>
>>
>> The pg_dump is correct, everything goes OK, but when I do the
>> pg_restore I get an error:  pg_restore: [archiver] unsupported
>> version (1.13) in file header
>
> Forgot a fourth option:
>
> 4) Use the 9.6.15 pg_restore to restore the 9.6.15 pg_dump to the 9.6.5 
> database.
>
>>
>>
>> I searched for solutions and found that I can apply a patch for CVE-2018-1058,
>> but I don't know how.
>> How to download and install it?  I can't find documents about it.
>>
>> The reason why you can't upgrade the 9.6.5 to 9.6.15?  I don't know how.
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


--
Adrian Klaver
adrian.kla...@aklaver.com
The information contained in this message is intended only for
the personal and confidential use of the recipient(s) named above. If the
reader of this message is not the intended recipient or an agent responsible
for delivering it to the intended recipient, you are hereby notified that you
have received this document in error and that any review, dissemination,
distribution, or copying of this message is strictly prohibited. If you have
received this communication in error, please notify us immediately, and delete
the original message.


Re: DBD::Pg exports char columns with trailing blanks

2019-10-18 Thread Tom Lane
Matthias Apitz  writes:
> When we export char columns with our Perl tools, they come out  with trailing 
> blanks (in Sybase they don't). Can this be suppressed?

Switch to varchar, perhaps?

regards, tom lane




Visibility Map Issues

2019-10-18 Thread Jeremy Smith
Hello all,
 We have a data warehouse (postgres 11.5, on centos 7) that contains
many instances of the following structure:

 - table_a
 - table_b
 - a view that selects everything from either table_a or table_b

All external queries select from the view, so we can rebuild the table that
isn't exposed by the view and then switch the view to point at the most
recent table.

We have a procedure that will drop indexes from the non-visible table,
truncate it, commit, repopulate it with data (using oracle_fdw), commit,
build the indexes, commit, run VACUUM ANALYZE (using pg_background), and
then re-write the view to point at the new table (and a final commit).

The issue I am dealing with is that on occasion the VACUUM step does not
update the visibility map (as evidenced by pg_class.relallvisible = 0),
which means the planner won't use index-only scans.  I can't figure out the
reason for this.  I have tried the following:
 - change VACUUM ANALYZE table_name to VACUUM (ANALYZE,
DISABLE_PAGE_SKIPPING) table_name
 - Get xmin of a row in the table (all rows should be the same) and compare
to txid_snapshot_xmin(txid_current_snapshot()) before vacuuming.  I used
raise notice to log these values and I never saw a conflict.  For example,
before running a vacuum I logged xmin = 207781 and the txid_snapshot_xmin
was 207785.  After running VACUUM ANALYZE, however, relallvisible was set
to 0.
 - Running oracle_close_connections() after loading data (before indexing
and vacuuming) just in case there was an issue with open Oracle connections
 - Setting old_snapshot_threshold to 60s and waiting (pg_sleep) for 60s,
65s, 120s, or 125s...
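
For completeness, here is a way to double-check what the visibility map itself
says, assuming the pg_visibility contrib extension can be installed (test_a is
the table from the reproduction below):

CREATE EXTENSION IF NOT EXISTS pg_visibility;
-- pages marked all-visible / all-frozen according to the visibility map
SELECT * FROM pg_visibility_map_summary('test_a');
-- compare with what the planner sees
SELECT relpages, relallvisible FROM pg_class WHERE relname = 'test_a';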

My questions:
 - Is there some way of knowing ahead of time that a VACUUM will actually
set the visibility of all pages?  I would expect the visibility map to not
be updated if there are older transactions open, but shouldn't I be able to
see that by examining the snapshot?
- Is there an issue running all of this in a procedure?  Since I'm
repeating the same thing with small variations for many tables, I was
hoping to keep this in a procedure, but if there are issues with how the
transactions might interact with pg_background/VACUUM, I may have to
abandon this approach.

There are no standby servers and vacuum_defer_cleanup_age is set to 0.


I tried to create a minimum reproducible example:
CREATE TABLE test_a (a int);
CREATE index ix_test_a on test_a(a);

CREATE OR REPLACE PROCEDURE test_switch() AS $$
  DECLARE
   visible_pages int;
  BEGIN
DROP INDEX IF EXISTS ix_test_a;
TRUNCATE TABLE test_a;
COMMIT;

INSERT INTO test_a SELECT generate_series(1,1);
COMMIT;

CREATE INDEX ix_test_a ON test_a (a);
COMMIT;

RAISE NOTICE 'xmin:%', (SELECT xmin from test_a limit 1);
RAISE NOTICE 'snapshot min:%',
(txid_snapshot_xmin(txid_current_snapshot()));
PERFORM * FROM pg_background_result(pg_background_launch('VACUUM
(ANALYZE,DISABLE_PAGE_SKIPPING,VERBOSE) test_a')) as result(a text);
SELECT relallvisible FROM pg_class WHERE relname = 'test_a' INTO
visible_pages;
IF visible_pages = 0 THEN
  RAISE EXCEPTION 'NO VISIBLE PAGES';
ELSE
  RAISE NOTICE 'relallvisible:%', visible_pages;
END IF;
  END;

$$
LANGUAGE PLPGSQL;

I ran this repeatedly, using \watch 1 in psql.  At the same time, I ran
three other sessions running BEGIN; SELECT 1; COMMIT; every 1s.  I got a
failure in test_switch after about one minute:

NOTICE:  xmin:1949
NOTICE:  snapshot min:1951
INFO:  aggressively vacuuming "public.test_a"
INFO:  index "ix_test_a" now contains 1 row versions in 30 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "test_a": found 0 removable, 1 nonremovable row versions in 45
out of 45 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1948
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  analyzing "public.test_a"
INFO:  "test_a": scanned 45 of 45 pages, containing 1 live rows and 0
dead rows; 1 rows in sample, 1 estimated total rows
ERROR:  NO VISIBLE PAGES

I don't understand why the snapshot min says 1951, but the VACUUM output
says oldest xmin: 1948.
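
For what it's worth, these standard catalog views show what can hold the xmin
horizon back at the moment the VACUUM runs (nothing here is specific to my setup):

SELECT pid, state, backend_xid, backend_xmin, query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;

-- prepared transactions and replication slots can also pin the horizon
SELECT * FROM pg_prepared_xacts;
SELECT slot_name, xmin, catalog_xmin FROM pg_replication_slots;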

Thanks,
Jeremy


RE: Execute a function through fdw

2019-10-18 Thread Patrick FICHE
Thanks a lot for your answer.
Using a view is really a good solution for my case.
As I already use fdw for some other cases, I prefer not to mix with dblink.

Regards,

Patrick Fiche



-Original Message-
From: Tom Lane  
Sent: Friday, October 18, 2019 1:55 PM
To: Guillaume Lelarge 
Cc: Patrick FICHE ; 
pgsql-generallists.postgresql.org 
Subject: Re: Execute a function through fdw

Guillaume Lelarge  writes:
> Le ven. 18 oct. 2019 à 11:51, Patrick FICHE 
>  a écrit :
>> Is it possible to execute a function located on a server accessed 
>> through Postgres fdw.

> It's probably easier to create a view on the remote server, and access 
> it as a foreign table on the local server.

Yeah.  Or if you really want to call a remote function by name, see dblink.  
postgres_fdw actively avoids doing that sort of thing.

regards, tom lane


DBD::Pg exports char columns with trailing blanks

2019-10-18 Thread Matthias Apitz


Hello,

When we export char columns with our Perl tools, they come out  with trailing 
blanks (in Sybase they don't). Can this be suppressed?

Thanks

matthias


--
Sent using Dekko from my Ubuntu device




Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread Avinash Kumar
Hi Daulat,

PITR entirely depends on what type of backups you choose.
Sometimes, to reduce the amount of downtime involved in restoring and
recovering a backup, you may also use an additional delayed standby.
You could use the built-in PostgreSQL feature to delay replication and
fast-forward it to the safest point to achieve PITR, but this requires
an additional standby.
https://www.percona.com/blog/2018/06/28/faster-point-in-time-recovery-pitr-postgresql-using-delayed-standby/

If you have several TB of data, pgBackRest is of course a good way to go
for backups (there are a few more open-source solutions), but also consider
the amount of time recovery takes. Keeping all of this in mind, your
approach to PITR changes.

So I would ask you this question: what backup tool do you use, and what
is your backup strategy? Are you taking a physical backup and performing
continuous archiving of WALs? The answer to your question entirely depends
on this. :)

Regards,
Avinash Vallarapu.



On Fri, Oct 18, 2019 at 5:17 PM David Steele  wrote:

> On 10/18/19 11:29 AM, Luca Ferrari wrote:
> > On Fri, Oct 18, 2019 at 10:30 AM Andreas Joseph Krogh
> >  wrote:
> >> We use barman (https://www.pgbarman.org/) for continuous streaming
> backup and I had to restore from it once, and it went like this:
> >
> > Just for the records, here's an example of restore with pgbackrest:
> >
> > % sudo -u postgres pgbackrest --stanza=miguel \
> > --log-level-console=info --delta restore
> > ...
> > INFO: restore backup set 20190916-125652F
> > INFO: remove invalid files/paths/links from /postgres/pgdata/11
> > INFO: cleanup removed 148 files, 3 paths
> > ...
> > INFO: write /postgres/pgdata/11/recovery.conf
> > INFO: restore global/pg_control (performed last
> >  to ensure aborted restores cannot be started)
> > INFO: restore command end: completed successfully (5113ms)
>
> pgBackRest also has a tutorial on PITR:
> https://pgbackrest.org/user-guide.html#pitr
>
> --
> -David
> da...@pgmasters.net
>
>
>

-- 
9000799060


Re: Execute a function through fdw

2019-10-18 Thread Tom Lane
Guillaume Lelarge  writes:
> Le ven. 18 oct. 2019 à 11:51, Patrick FICHE  a
> écrit :
>> Is it possible to execute a function located on a server accessed through
>> Postgres fdw.

> It's probably easier to create a view on the remote server, and access it
> as a foreign table on the local server.

Yeah.  Or if you really want to call a remote function by name, see
dblink.  postgres_fdw actively avoids doing that sort of thing.
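
A sketch of the dblink route, in case it is useful (the connection string and
the column definition list are illustrative and must match the remote
function's actual result type):

-- requires: CREATE EXTENSION dblink;
SELECT *
FROM dblink('host=remotehost dbname=remotedb user=app',
            'SELECT * FROM rmt_schema.Get_Tables()')
     AS t(table_name text, row_count bigint);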

regards, tom lane




Re: Create role like role

2019-10-18 Thread Ron

On 10/18/19 5:08 AM, Sonam Sharma wrote:
I have created one role, reader, and granted it usage on the schema and
select on all tables.


I have created one more user and have run

Grant reader to sonam.

But user sonam is still not able to read the tables.

If I am missing anything, please let me know.


You need to show us the complete commands that you ran.
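
For reference, the sequence that usually has to be in place looks roughly like
this (the schema name is illustrative; existing tables and future tables need
separate grants):

-- run as the schema/table owner or a superuser
GRANT USAGE ON SCHEMA myschema TO reader;
GRANT SELECT ON ALL TABLES IN SCHEMA myschema TO reader;  -- existing tables only
ALTER DEFAULT PRIVILEGES IN SCHEMA myschema
    GRANT SELECT ON TABLES TO reader;                     -- tables created later by this role

GRANT reader TO sonam;  -- sonam inherits reader's rights unless the role was created NOINHERIT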

--
Angular momentum makes the world go 'round.




Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread David Steele

On 10/18/19 11:29 AM, Luca Ferrari wrote:

On Fri, Oct 18, 2019 at 10:30 AM Andreas Joseph Krogh
 wrote:

We use barman (https://www.pgbarman.org/) for continuous streaming backup and I 
had to restore from it once, and it went like this:


Just for the records, here's an example of restore with pgbackrest:

% sudo -u postgres pgbackrest --stanza=miguel \
--log-level-console=info --delta restore
...
INFO: restore backup set 20190916-125652F
INFO: remove invalid files/paths/links from /postgres/pgdata/11
INFO: cleanup removed 148 files, 3 paths
...
INFO: write /postgres/pgdata/11/recovery.conf
INFO: restore global/pg_control (performed last
 to ensure aborted restores cannot be started)
INFO: restore command end: completed successfully (5113ms)


pgBackRest also has a tutorial on PITR:
https://pgbackrest.org/user-guide.html#pitr

--
-David
da...@pgmasters.net




Re: Execute a function through fdw

2019-10-18 Thread Guillaume Lelarge
Le ven. 18 oct. 2019 à 11:51, Patrick FICHE  a
écrit :

> Hello,
>
>
>
> Is it possible to execute a function located on a server accessed through
> Postgres fdw.
>
> This function returns a TABLE structure.
>
>
>
> I have mapped rmt_schema and there is a function called Get_Tables in this
> schema.
>
> I would like to execute something like :
>
> SELECT * FROM rmt_schema.Get_Tables();
>
>
>
> Or is it required to create a local function that will access remote
> tables in order to achieve the same result?
>

It's probably easier to create a view on the remote server, and access it
as a foreign table on the local server.
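
A minimal sketch of that approach (server, schema and object names are illustrative):

-- on the remote server: wrap the set-returning function in a view
CREATE VIEW rmt_schema.get_tables_v AS
    SELECT * FROM rmt_schema.Get_Tables();

-- on the local server: expose it through postgres_fdw
IMPORT FOREIGN SCHEMA rmt_schema LIMIT TO (get_tables_v)
    FROM SERVER remote_srv INTO local_schema;

SELECT * FROM local_schema.get_tables_v;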

>


Create role like role

2019-10-18 Thread Sonam Sharma
I have created one role, reader, and granted it usage on the schema and
select on all tables.

I have created one more user and have run

Grant reader to sonam.

But user sonam is still not able to read the tables.

If I am missing anything, please let me know.


Execute a function through fdw

2019-10-18 Thread Patrick FICHE
Hello,

Is it possible to execute a function located on a server accessed through 
Postgres fdw.
This function returns a TABLE structure.

I have mapped rmt_schema and there is a function called Get_Tables in this 
schema.
I would like to execute something like :
SELECT * FROM rmt_schema.Get_Tables();

Or is it required to create a local function that will access remote tables
in order to achieve the same result?

Regards,

Patrick Fiche
e. patrick.fi...@aqsacom.com




Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread Luca Ferrari
On Fri, Oct 18, 2019 at 10:30 AM Andreas Joseph Krogh
 wrote:
> We use barman (https://www.pgbarman.org/) for continuous streaming backup and 
> I had to restore from it once, and it went like this:

Just for the records, here's an example of restore with pgbackrest:

% sudo -u postgres pgbackrest --stanza=miguel \
   --log-level-console=info --delta restore
...
INFO: restore backup set 20190916-125652F
INFO: remove invalid files/paths/links from /postgres/pgdata/11
INFO: cleanup removed 148 files, 3 paths
...
INFO: write /postgres/pgdata/11/recovery.conf
INFO: restore global/pg_control (performed last
to ensure aborted restores cannot be started)
INFO: restore command end: completed successfully (5113ms)




Re: stable for each row before insert trigger

2019-10-18 Thread Олег Самойлов
Luca, I also read this section before asking the question.

> On 18 Oct 2019, at 10:15, Tom Lane wrote:
> 
> Олег Самойлов  writes:
>> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
>> somehow useful with trigger functions, for instance mentioned that the AFTER 
>> INSERT trigger should be VOLATILE. The question is how this words affect a 
>> for each row before insert trigger? Can be some optimisation here?
> 
> Where did you read that?  There's no optimization that considers the
> volatility of trigger functions --- they'll be called exactly when
> specified, no more or less.

Good to see this. :) But there is some optimisation for triggers, which is
mentioned in the documentation but not clearly defined.

https://www.postgresql.org/docs/current/sql-createfunction.html

> STABLE indicates  (It is inappropriate for AFTER triggers that wish to 
> query rows modified by the current command.)

So, STABLE is inappropriate for such a trigger, but is appropriate for a BEFORE
trigger?

Luca correctly pointed to:
https://www.postgresql.org/docs/current/trigger-datachanges.html

> If your trigger function is written in any of the standard procedural 
> languages, then the above statements apply only if the function is declared 
> VOLATILE. Functions that are declared STABLE or IMMUTABLE will not see 
> changes made by the calling command in any case.

So it would be good to put an exact definition of how VOLATILE, STABLE, and
IMMUTABLE affect a trigger function inside the right section:

https://www.postgresql.org/docs/current/xfunc-volatility.html

For instance, I expect that a FOR EACH ROW BEFORE trigger marked as STABLE will
be faster than a VOLATILE one, without important negative side effects. I
observed a 3% benefit. An IMMUTABLE trigger is slightly slower than STABLE, but
I am not sure; the difference is too small.
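
To make the comparison concrete, a minimal sketch of the kind of trigger I mean
(table and function names are illustrative):

CREATE TABLE t (id int, created_at timestamptz);

CREATE OR REPLACE FUNCTION set_created_at() RETURNS trigger AS $$
BEGIN
    -- depends only on NEW and performs no writes
    NEW.created_at := coalesce(NEW.created_at, now());
    RETURN NEW;
END;
$$ LANGUAGE plpgsql STABLE;   -- vs. the default VOLATILE

CREATE TRIGGER t_before_insert
    BEFORE INSERT ON t
    FOR EACH ROW EXECUTE PROCEDURE set_created_at();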






Re: Sv: Conflict between autovacuum and backup restoration

2019-10-18 Thread Ekaterina Amez


El 17/10/19 a las 16:12, Andreas Joseph Krogh escribió:



But I don't understand why I'm getting those messages about autovacuum
blocking db restore process. I guess that after one table is created
with COPY sentence, as many rows have been inserted, autoanalyze
process
runs to gather statistics for the Execution Planner. But why is
happening this block? Is autoanalyze running before the table gets
fully
loaded? Is this really a problem? If so, how can I handle it? This
task
is running at night, when nobody is using second database.


Thank you for reading,

Ekaterina

It is normal to get these "canceling autovacuum"-messages when 
restoring a database, just ignore them.
If it bothers you, just turn autovacuum off by setting this in 
postgresql.conf:

autovacuum = off
and reload the config (SIGHUP)


The server hosts other production databases and I don't want to
interfere with them; turning off autovacuum would affect all of them.


I've been looking for other parameters that could help me avoid
autoanalyze during backup restoration, but all of the ones I've been
exploring have the sighup context, like autovacuum. Is there any parameter
(in v8.4) that could somehow stop autoanalyze only in the connection used
to restore the database?




--
Andreas Joseph Krogh


Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread Andreas Kretschmer
On 18 October 2019 07:59:21 CEST, Daulat Ram  wrote:
>Hello All,
>Can you please share some ideas and scenarios how we can do the PITR in
>case of disaster.
>
>
>Thanks,


Consider Barman.


-- 
2ndQuadrant - The PostgreSQL Support Company




Sv: Postgres Point in time Recovery (PITR),

2019-10-18 Thread Andreas Joseph Krogh

On Friday, 18 October 2019 at 07:59:21, Daulat Ram <
daulat@exponential.com > wrote:

Hello All,

Can you please share some ideas and scenarios how we can do the PITR in case 
of disaster.

We use barman (https://www.pgbarman.org/ 
) for continuous streaming backup and I had to restore from it once, and it 
went like this: 

==8<=== 

$ barman recover --target-time "2018-12-06 12:20:00" --remote-ssh-command "ssh 
andreak@192.168.0.184 " db01_11 20181130T190002 
"/home/andreak/barman-restore"
 Processing xlog segments from streaming for db01_11
 00010174002E
 00010174002F
 000101740030
 Starting remote restore for server db01_11 using backup 20181130T190002
 Destination directory: /home/andreak/barman-restore
 Doing PITR. Recovery target time: '2018-12-06 12:20:00+01:00'
 17445, dbname1, /storage/fast_ssd/11/tablespaces/dbname1
 29218, dbname2, /storage/fast_ssd/11/tablespaces/dbname2
 ... 
 29235503, dbnameX, /storage/fast_ssd/11/tablespaces/dbnameX
Copying the base backup.
 Copying required WAL segments.
 Generating recovery.conf
 Identify dangerous settings in destination directory.

 WARNING
 The following configuration files have not been saved during backup, hence 
they have not been restored.
 You need to manually restore them in order to start the recovered PostgreSQL 
instance:

 postgresql.conf
 pg_hba.conf
 pg_ident.conf

 Recovery completed (start time: 2018-12-06 13:14:53.220043, elapsed time: 4 
hours, 52 minutes, 47 seconds)

 Your PostgreSQL server has been successfully prepared for recovery! 
==8<=== 


-- 
Andreas Joseph Krogh 
CTO / Partner - Visena AS 
Mobile: +47 909 56 963 
andr...@visena.com  
www.visena.com  
  


Re: Postgres Point in time Recovery (PITR),

2019-10-18 Thread Luca Ferrari
On Fri, Oct 18, 2019 at 7:59 AM Daulat Ram  wrote:
> Can you please share some ideas and scenarios how we can do the PITR in case 
> of disaster.

In order to be able to do PITR you need:
- a base backup of your database
- WALs from the backup going on

See .
Then, to keep it simple, if the disaster happens in the time window
between the base backup and the last archived WAL, you can restore an
instance to any point in time prior to the disaster itself.
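
For completeness, a minimal sketch of the archiving side of this (paths are
illustrative; the recovery-target part assumes PostgreSQL 12, where the
recovery settings live in postgresql.conf plus a recovery.signal file):

ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = 'on';      -- needs a restart
ALTER SYSTEM SET archive_command = 'cp %p /var/lib/pgsql/wal_archive/%f';

-- take the base backup after archiving is active, e.g.:
--   pg_basebackup -D /var/lib/pgsql/base_backup -Fp -Xs -P
-- at restore time, set restore_command to copy files back from the archive and
-- recovery_target_time = '...' to stop just before the disaster.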

Luca




Re: stable for each row before insert trigger

2019-10-18 Thread Luca Ferrari
On Fri, Oct 18, 2019 at 9:16 AM Tom Lane  wrote:
>
> Олег Самойлов  writes:
> > According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
> > somehow useful with trigger functions, for instance mentioned that the 
> > AFTER INSERT trigger should be VOLATILE. The question is how this words 
> > affect a for each row before insert trigger? Can be some optimisation here?
>
> Where did you read that?  There's no optimization that considers the
> volatility of trigger functions --- they'll be called exactly when
> specified, no more or less.

I suspect this sentence could have raised the OP doubt:

"For best optimization results, you should label your functions with
the strictest volatility category that is valid for them."

(from ).

However, here 
it is specified that:

"If your trigger function is written in any of the standard procedural
languages, then the above statements apply only if the function is
declared VOLATILE. Functions that are declared STABLE or IMMUTABLE
will not see changes made by the calling command in any case."

which I hope is the answer to the original question.

Luca




Re: stable for each row before insert trigger

2019-10-18 Thread Tom Lane
Олег Самойлов  writes:
> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is somehow 
> useful with trigger functions, for instance mentioned that the AFTER INSERT 
> trigger should be VOLATILE. The question is how this words affect a for each 
> row before insert trigger? Can be some optimisation here?

Where did you read that?  There's no optimization that considers the
volatility of trigger functions --- they'll be called exactly when
specified, no more or less.

There are some PLs that might behave differently depending on whether they
think the function is volatile or not, but that's independent of whether
the function is a trigger.

regards, tom lane




Re: A little confusion about JSON Path

2019-10-18 Thread Thomas Kellerer
Laurenz Albe wrote on 17.10.2019 at 13:25:
>> I don't understand why the following two JSON Path expressions aren't doing 
>> the same thing in Postgres 12:
>>
>>  jsonb_path_exists(data, '$.k1.list.type() ? (@ == "array")'), -- returns 
>> true as expected
>>  jsonb_path_exists(data, '$.k1.list ? (@.type() == "array")') -- returns 
>> false - not expected
>>
>> Apparently "@.type()" returns something different then "$.k1.list.type()"
>>
> This seems to be a consequence of "lax" mode:
> 
>  "Besides, comparison operators automatically unwrap their operands in the 
> lax mode,
>   so you can compare SQL/JSON arrays out-of-the-box. An array of size 1 is 
> considered
>   equal to its sole element. Automatic unwrapping is not performed only when:
> 
>   - The path expression contains type() or size() methods that return the 
> type and
> the number of elements in the array, respectively.
> 
> (from https://www.postgresql.org/docs/12/functions-json.html)
> 
> with sample (data) as (
>   values
> ('{"k1": {"list":[1,2,3]}}'::jsonb)
> )
> select data,
>jsonb_path_exists(data, '$.k1.list ? (@.type() == "number")'),  -- lax mode unwraps the array
>jsonb_path_exists(data, 'strict $.k1.list ? (@.type() == "array")')  -- strict mode doesn't
> from sample;


Ah, thanks. I did not see that part. 






Re: stable for each row before insert trigger

2019-10-18 Thread Олег Самойлов
Eh, a stupid answer. Of course I read the documentation. But what about you? Do
you know what a trigger function is? (for each row, before insert)

A trigger function doesn't have parameters; instead it gets special variables,
for instance OLD and NEW, or TG_ARGV[].

If a trigger function depends only on NEW, is it IMMUTABLE? (With the same NEW it
must return the same changed NEW.) If a trigger function makes SELECTs and changes
only NEW, can it be declared STABLE? And so on.

> On 18 Oct 2019, at 2:41, Adrian Klaver wrote:
> 
> On 10/17/19 4:31 PM, Олег Самойлов wrote:
>> Hi all.
>> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
>> somehow useful with trigger functions, for instance mentioned that the AFTER 
>> INSERT trigger should be VOLATILE. The question is how this words affect a 
>> for each row before insert trigger? Can be some optimisation here?
> 
> https://www.postgresql.org/docs/11/xfunc-volatility.html
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com