ROWS vs RANGE in release notes for pg 16

2023-08-18 Thread Erwin Brandstetter
The release notes for Postgres 16 says here:
https://www.postgresql.org/docs/16/release-16.html#RELEASE-16-PERFORMANCE

> Allow window functions to use ROWS mode internally when RANGE mode is
specified but unnecessary (David Rowley)

But the fix also applies to the much more common case where no mode has
been specified, RANGE unfortunately being the default - including the most
common use case "row_number() OVER (ORDER BY col)". This is what made me
investigate, test and eventually propose a fix in the first place. See:

https://www.postgresql.org/message-id/flat/CAGHENJ7LBBszxS%2BSkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA%40mail.gmail.com
https://www.postgresql.org/message-id/flat/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6%2BS88z%2BA%40mail.gmail.com

Also, I was hoping to get a mention in the release note for working this
out:

> Allow window functions to use the faster ROWS mode internally when RANGE
mode is specified or would be used by default, but unnecessary (David
Rowley, Erwin Brandstetter)


Regards
Erwin


Re: Confusing bit in the "EXAMPLES" section for INSERT

2022-08-16 Thread Erwin Brandstetter
Seeing that an earlier, related passage in the text reads:

> Note that the effects of all per-row BEFORE INSERT triggers are reflected
in excluded values, ...

I update my suggestion to use the same expression:

Insert a distributor, or do nothing if the proposed row conflicts with an
existing row. Per-row BEFORE INSERT triggers are fired, and possible
effects on the proposed row applied, before checking for conflicts. Example
assumes a unique index on the did column exists:

Regards
Erwin


On Tue, 16 Aug 2022 at 23:09, Erwin Brandstetter  wrote:

> https://www.postgresql.org/docs/devel/sql-insert.html#id-1.9.3.152.9
>
> Insert a distributor, or do nothing for rows proposed for insertion when
>> an existing, excluded row (a row with a matching constrained column or
>> columns after before row insert triggers fire) exists. Example assumes a
>> unique index has been defined that constrains values appearing in the did
>> column:
>>
>
> Suggesting this as replacement:
>
> Insert a distributor, or do nothing if the proposed row conflicts with an
> existing row. Per-row triggers BEFORE INSERT are fired, and possible
> effects on the proposed applied, before checking for conflicts. Example
> assumes a unique index on the did column exists:
>
> Regards
> Erwin
>


Re: Confusing bit in the "EXAMPLES" section for INSERT

2022-08-16 Thread Erwin Brandstetter
Added a missing word:

Insert a distributor, or do nothing if the proposed row conflicts with an
existing row. Per-row triggers BEFORE INSERT are fired, and possible
effects on the proposed row applied, before checking for conflicts. Example
assumes a unique index on the did column exists:

Regards
Erwin

On Tue, 16 Aug 2022 at 23:09, Erwin Brandstetter  wrote:

> https://www.postgresql.org/docs/devel/sql-insert.html#id-1.9.3.152.9
>
> Insert a distributor, or do nothing for rows proposed for insertion when
>> an existing, excluded row (a row with a matching constrained column or
>> columns after before row insert triggers fire) exists. Example assumes a
>> unique index has been defined that constrains values appearing in the did
>> column:
>>
>
> Suggesting this as replacement:
>
> Insert a distributor, or do nothing if the proposed row conflicts with an
> existing row. Per-row triggers BEFORE INSERT are fired, and possible
> effects on the proposed applied, before checking for conflicts. Example
> assumes a unique index on the did column exists:
>
> Regards
> Erwin
>


Confusing bit in the "EXAMPLES" section for INSERT

2022-08-16 Thread Erwin Brandstetter
https://www.postgresql.org/docs/devel/sql-insert.html#id-1.9.3.152.9

Insert a distributor, or do nothing for rows proposed for insertion when an
> existing, excluded row (a row with a matching constrained column or columns
> after before row insert triggers fire) exists. Example assumes a unique
> index has been defined that constrains values appearing in the did column:
>

Suggesting this as replacement:

Insert a distributor, or do nothing if the proposed row conflicts with an
existing row. Per-row triggers BEFORE INSERT are fired, and possible
effects on the proposed applied, before checking for conflicts. Example
assumes a unique index on the did column exists:

Regards
Erwin


Re: count() counts ROW values that are NULL

2022-03-18 Thread Erwin Brandstetter
On Fri, 18 Mar 2022 at 22:28, Tom Lane  wrote:

> Erwin Brandstetter  writes:
> >
> https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE
> > The manual says:
> >> count ( "any" ) → bigint
> >> Computes the number of input rows in which the input value is not null.
>
> > But ROW values or composite types that "are null" are counted, anyway.
>
> Well, there's nulls and nulls.  The SQL "IS NULL" construct is fairly
> badly designed IMO, because it considers both a plain NULL and a
> row-of-all-NULL-fields to be "null".  count(), like just about everything
> in Postgres other than "IS NULL", considers only a plain NULL to be null.
>
> This is discussed somewhere in the manual, but I think it's under IS NULL,
> not under all the other places that'd have to be annotated if we decide to
> annotate as you're suggesting.  (One example is that functions that are
> marked STRICT use the tighter interpretation.)
>
> You could use "COUNT(*) FILTER (WHERE NOT (whatever IS NULL))" if you want
> to count values meeting the IS NULL definition.  (Buttressing my point
> that IS NULL is not well thought out, the obvious "whatever IS NOT NULL"
> doesn't work here, because it's not the inverse of "whatever IS NULL".)
>

I am aware of the mess, and I feel your pain (and my own).
But count() is among the most frequently used functions, and
hardly any user reading the manual will be aware of the implications. Maybe
just:

... in which the input value is not null (does not evaluate to a scalar
NULL).

To give them a fighting chance.


Regards

Erwin


count() counts ROW values that are NULL

2022-03-18 Thread Erwin Brandstetter
https://www.postgresql.org/docs/current/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE

The manual says:

count ( "any" ) → bigint
>
> Computes the number of input rows in which the input value is not null.
>

But ROW values or composite types that "are null" are counted, anyway. See:

*db<>fiddle [here](
https://dbfiddle.uk/?rdbms=postgres_14=7364f3f582322ac687b39c677826a074)*

Feels like a bug, but I cannot imagine how this would have slipped
everybody's attention for so long. It should at least be documented. Maybe:

Computes the number of input rows in which the input value is not a plain
NULL value. (Composite or ROW values count in any case - even if value IS
NULL evaluates to true.)

Regards
Erwin


Re: PL/pgSQL PERFORM WITH query

2022-02-10 Thread Erwin Brandstetter
On Tue, 8 Feb 2022 at 23:54, David G. Johnston 
wrote:

> On Tue, Feb 8, 2022 at 3:14 PM Erwin Brandstetter 
> wrote:
>
>> The manual currently reads:
>>
>> https://www.postgresql.org/docs/current/plpgsql-statements.html#PLPGSQL-STATEMENTS-GENERAL-SQL
>>
>> > PERFORM query;
>>
>> > This executes query and discards the result. Write the query the same
>> way you would write an SQL SELECT command, but replace the initial keyword
>> SELECT with PERFORM. For WITH queries, use PERFORM and then place the query
>> in parentheses. (In this case, the query can only return one row.)
>>
>> But that only works for a single returned value (one column of one row).
>> Else we need to treat the WITH query like a subquery with alias. There was
>> a related question on Stackoverflow:
>>
>>
>> https://stackoverflow.com/questions/71040855/how-to-perform-a-cte-query-returning-multiple-rows-columns
>>
>> I suggest to clarify like:
>>
>> For WITH queries, use PERFORM and place the query in parentheses. If the
>> query returns more than a single value (one column of one row) you must
>> treat it as subquery, writing PERFORM * FROM (query) my_alias;
>>
>>
> We define the term "Scalar Subquery" in the documentation, we should not
> be avoiding it here and simply telling the user to "use parentheses".  You
> are using parentheses because you are writing a scalar subquery and placing
> it in the target list of the PERFORM command.
>
> So, I'd suggest the following wording:
>
> Since WITH queries do not start with the SELECT keyword you must instead
> write your query independently of the PERFORM top-level query.  This
> wrapping query will have a FROM clause just like any other query and you
> can place your WITH query there as a normal subquery.  An alternative is to
> use a scalar subquery (provide xref to syntax), in which case you can
> simply place it after the PERFORM keyword.
>
> Combining that with examples (or, as below, adapting the syntax example
> already provided) of both forms should suffice.  We don't need to
> interleave an example in the prose.
>
> PERFORM select_query; -- must begin with the SELECT keyword
> PERFORM * FROM (with_query) AS from_alias; -- normal subquery form
> PERFORM (with_query); -- scalar subquery form
>


I agree that's clearer. And references to existing chapters of the mnanual
are a good idea. But since the use case is not very common, I would keep it
short. (Just fix the misinformation!)
How about this:

A WITH query does not start with the SELECT keyword. Wrap the whole query
as subquery (xref to
https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-SUBQUERIES)
and replace the outer SELECT with PERFORM. Short syntax can be used for a
scalar subquery (xref to
https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-SCALAR-SUBQUERIES
).

Regards
Erwin


PL/pgSQL PERFORM WITH query

2022-02-08 Thread Erwin Brandstetter
The manual currently reads:
https://www.postgresql.org/docs/current/plpgsql-statements.html#PLPGSQL-STATEMENTS-GENERAL-SQL

> PERFORM query;

> This executes query and discards the result. Write the query the same way
you would write an SQL SELECT command, but replace the initial keyword
SELECT with PERFORM. For WITH queries, use PERFORM and then place the query
in parentheses. (In this case, the query can only return one row.)

But that only works for a single returned value (one column of one row).
Else we need to treat the WITH query like a subquery with alias. There was
a related question on Stackoverflow:

https://stackoverflow.com/questions/71040855/how-to-perform-a-cte-query-returning-multiple-rows-columns

I suggest to clarify like:

For WITH queries, use PERFORM and place the query in parentheses. If the
query returns more than a single value (one column of one row) you must
treat it as subquery, writing PERFORM * FROM (query) my_alias;

Regards
Erwin


Re: json_strip_nulls()

2022-01-22 Thread Erwin Brandstetter
On Sat, 22 Jan 2022 at 22:00, Tom Lane  wrote:

> Erwin Brandstetter  writes:
> > On Sat, 22 Jan 2022 at 20:31, David G. Johnston <
> david.g.johns...@gmail.com>
> > wrote:
> >> json_strip_nulls doesn't make any promise regarding its output json
> other
> >> than that it is valid.  Since we are munging the json we are arguably
> >> within our rights to output whatever transformed version we want.  The
> >> format should not be documented.
>
> > Within our rights, maybe. The manual makes related promises[1]:
> >> Because the json type stores an exact copy of the input text, it will
> >> preserve semantically-insignificant white space between tokens
> > And[2]:
> >> As previously stated, when a JSON value is input and then printed
> without
> >> any additional processing, json outputs the same text that was input,
>
> "Without any additional processing" is the key restriction there.
>
> > Not strictly contradicting, but the current behavior of
> json_strip_nulls()
> > is still surprising. Either the input should be preserved as far as
> > possible or, failing that, the actual behavior documented.
>
> It is documented --- you just quoted the text that does so.
>
> I don't have a lot of sympathy for "JSON-reading" code that fails to
> conform to the JSON RFC, so I'm disinclined to work harder than that.
>
>
I suggest to clarify the behavior of json_strip_nulls() in the manual: that
it also strips insignificant white space. If that may change in future
versions, also say so. People are starting to use json_strip_nulls() for
the purpose (and may regret it later):
https://stackoverflow.com/questions/27505181/fetching-compact-version-of-jsonb-in-postgresql/56842519#56842519

If (like I assume) json_strip_nulls() should not be relied upon to strip
whitespace, it would be great to have a separate, dedicated function for
that.  That's independent from the first suggestion.

Regards
Erwin


Re: json_strip_nulls()

2022-01-22 Thread Erwin Brandstetter
On Sat, 22 Jan 2022 at 20:31, David G. Johnston 
wrote:

> On Sat, Jan 22, 2022 at 12:11 PM Erwin Brandstetter 
> wrote:
>
>> But the function also strips all insignificant white space:
>> [...]
>> This is a useful feature to trim noise from json values, but unreliable
>> while undocumented. So let's document the behavior:
>>
>
> json_strip_nulls doesn't make any promise regarding its output json other
> than that it is valid.  Since we are munging the json we are arguably
> within our rights to output whatever transformed version we want.  The
> format should not be documented.
>

Within our rights, maybe. The manual makes related promises[1]:

> Because the json type stores an exact copy of the input text, it will
preserve semantically-insignificant white space between tokens

And[2]:

> As previously stated, when a JSON value is input and then printed without
any additional processing, json outputs the same text that was input,

Not strictly contradicting, but the current behavior of json_strip_nulls()
is still surprising. Either the input should be preserved as far as
possible or, failing that, the actual behavior documented.

[1] https://www.postgresql.org/docs/current/datatype-json.html
[2]
https://www.postgresql.org/docs/current/datatype-json.html#JSON-KEYS-ELEMENTS



>
>
>>
>> If that's undesirable, let's remove the functionality - and provide a
>> dedicated function for the task.
>>
>
> I agree we should at least provide a function that takes a json or jsonb
> and outputs its text representation in a minimalist form.
>
> I found similar (unresolved) considerations here:
>>
>> https://www.postgresql.org/message-id/20160610110633.GG18838%40zip.com.au
>>
>>
> This whole situation went through a fairly lengthy discussion back in 2016:
>
>
> https://www.postgresql.org/message-id/flat/CAH7T-ap6R_xzWz98c6AQzQuGsK_vpgr-et4VRaSjgxqom--ibw%40mail.gmail.com
>
> It's an interesting, and IMO, disappointing thread.  Maybe we can do
> better now and focus on one missing capability the community desires and
> actually get something committed.
>


That would address the main issue here: to have a function doing that
reliably.


Regards
Erwin


Re: Typo in release notes for pg14

2021-09-23 Thread Erwin Brandstetter
On Thu, 23 Sep 2021, 16:45 Tom Lane,  wrote:

>
> No, "memoize" is the intended spelling.
>
> regards, tom lane


Oh,  my bad. Cool new term. Thanks.


Typo in release notes for pg14

2021-09-23 Thread Erwin Brandstetter
https://www.postgresql.org/docs/14/release-14.html#id-1.11.6.5.5.3.7

> Add executor method to memoize results from the inner side of a
nested-loop join (David Rowley)

->
Add executor method to memorize results from the inner side of a
nested-loop join (David Rowley)

Regards
Erwin


Improve description for quote_ident()

2021-03-16 Thread Erwin Brandstetter
 In the current manual, Table 9.10

has this to say about quote_ident():

> Returns the given string suitably quoted to be used as an identifier in
an SQL statement string. Quotes are added only if necessary (i.e., if the
string contains non-identifier characters or would be case-folded).
Embedded quotes are properly doubled. See also Example 42.1

.

No mention of SQL key words which are also double-quoted. There was a
related question on stackoverflow today:

https://stackoverflow.com/questions/2580/quote-ident-does-not-add-quotes-to-column-name-first

I suggest to at least mention key words:

* Returns the given string suitably quoted to be used as an identifier in
an SQL statement string. Quotes are added only if necessary (i.e., if the
string contains non-identifier characters, would be case-folded, or is a
reserved key word). *Embedded quotes are properly doubled. See also
Example 42.1

.

To be precise: key words marked *"reserved"* or *"(cannot be function or
type)"* in the column "PostgreSQL" of the *SQL Key Words*

table are quoted. And if the GUC quote_all_identifiers

is set, then all strings are quoted. If that's not too much information, we
might make it:

*Returns the given string suitably quoted to be used as an identifier in an
SQL statement string. Quotes are added only if necessary: if the string
contains non-identifier characters, would be case-folded, or is a reserved
word (marked "reserved" or "(cannot be function or type)" in Table C.1
).
Embedded quotes are properly doubled. See also Example 42.1

and quote_all_identifiers
.*

Regards
Erwin


Re: Update limitations for declarative partitioning

2020-08-30 Thread Erwin Brandstetter
Looking at
https://www.postgresql.org/docs/devel/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
... the issue seems unchanged?

Regards
Erwin



On Sat, 8 Aug 2020 at 05:38, Alvaro Herrera 
wrote:

> On 2020-Aug-08, Erwin Brandstetter wrote:
>
> > - But the manual still warns at
> >
> https://www.postgresql.org/docs/13/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS
> >
> > BEFORE ROW triggers, if necessary, must be defined on individual
> > > partitions, not the partitioned table.
> > >
> >
> > Looks like that "limitation" can go now? Or reduced to something like:
> >
> > "BEFORE ROW triggers on the partitioned table cannot change which
> partition
> > is the destination."
>
> Oh, right, I'll patch that up tomorrow.  Thanks for pointing it out.
>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Update limitations for declarative partitioning

2020-08-07 Thread Erwin Brandstetter
The release notes for pg 13 advertise a new feature in
https://www.postgresql.org/docs/13/release-13.html#DDL-PARTITION-PRUNING

-
>
> Allow BEFORE row-level triggers
>  on partitioned tables
> (Álvaro Herrera)
>
> These triggers cannot change which partition is the destination.
>
- But the manual still warns at
https://www.postgresql.org/docs/13/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-LIMITATIONS

BEFORE ROW triggers, if necessary, must be defined on individual
> partitions, not the partitioned table.
>

Looks like that "limitation" can go now? Or reduced to something like:

"BEFORE ROW triggers on the partitioned table cannot change which partition
is the destination."

Regards
Erwin


Function name "text_out" should be "textout"

2020-07-04 Thread Erwin Brandstetter
Close to the bottom of:
https://www.postgresql.org/docs/devel/plpgsql-implementation.html
(and all previous version)
... the manual says:

> During the ensuing assignment to the local variable curtime, the PL/pgSQL
interpreter casts this string to the timestamp type by calling the text_out
and timestamp_in functions for the conversion.

Should most probably be "textout" instead of "text_out". See:

test=# SELECT proname FROM pg_proc
WHERE proname IN ('text_out', 'textout');
 proname
-
 textout
(1 row)

"timestamp_in" seems correct, though. Those function names seem rather
inconsistent: roughly half of them have the underscore, the rest doesn't.
Might explain the confusion.

SELECT * FROM pg_proc
WHERE  proname ~ 'out$'
ANDprorettype = 'cstring'::regtype
ORDER  BY proname ~ '_out$';

SELECT * FROM pg_proc
WHERE  proname ~ 'in$'
ANDproargtypes = ARRAY['cstring'::regtype]::oidvector
ORDER  BY proname ~ '_in$';

Regards
Erwin


Wrong link for FETCH FIRST in pg13 release notes

2020-05-21 Thread Erwin Brandstetter
Here:
https://www.postgresql.org/docs/13/release-13.html#id-1.11.6.5.5.5

> Allow FETCH FIRST  to
use WITH TIES to return any additional rows that match the last result row
(Surafel Temesgen)

The link, by mistake, goes to:
https://www.postgresql.org/docs/13/sql-fetch.html

Should link to:
https://www.postgresql.org/docs/13/sql-select.html#SQL-LIMIT

Regards
Erwin


\W excludes underscore

2019-05-07 Thread Erwin Brandstetter
https://www.postgresql.org/docs/current/functions-matching.html#POSIX-CLASS-SHORTHAND-ESCAPES-TABLE
Table 9.19. Regular Expression Class-shorthand Escape:

> \w [[:alnum:]_] (note underscore is included)
> ...
> \W [^[:alnum:]_] (note underscore is included)

This is misleading as \w *includes* underscore, but \W *excludes* it. I
suggest:

\w [[:alnum:]_] (underscore is included)
...
\W [^[:alnum:]_] (underscore is excluded)


Regards
Erwin


IDENTITY columns are NOT NULL automatically

2019-02-05 Thread Erwin Brandstetter
 An IDENTITY column is automatically NOT NULL - which is per SQL standard.
I think this should be documented in sql-createtable.html. The same is
currently documented for PRIMARY KEY constraints:

https://www.postgresql.org/docs/devel/sql-createtable.html

> PRIMARY KEY enforces the same data constraints as a combination of UNIQUE
and NOT NULL

I suggest to replace this paragraph:

<< This clause creates the column as an identity column. It will have an
implicit sequence attached to it and the column in new rows will
automatically have values from the sequence assigned to it.

With this:

>> This clause creates the column as an identity column - with an implicit
sequence attached to it that automatically supplies column values for new
rows. A NOT NULL constraint is applied automatically.

Regards
Erwin