Re: Functions in sort order - undocumented restriction

2018-02-10 Thread David G. Johnston
On Sat, Feb 10, 2018 at 4:40 AM, PG Doc comments form <
nore...@postgresql.org> wrote:

> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/9.5/static/queries-order.html
> Description:
>
> This works:
>
> select distinct o.bid, organisation, posttown, replace(case when phone =''
> then null else trim(phone) end, ' ', ''), phone, o.active, website, email,
> (select count(*) from notes as n where n.bid = o.bid) as nn from
> organisations as o right join notes as n on o.bid = n.bid where true order
> by replace(case when phone ='' then null else trim(phone) end, ' ', '')
> nulls last ;
>
> This does not work:
>
>  select distinct (o.bid), organisation, posttown, replace(case when
> postcode
> ='' then null else trim(postcode) end, ' ', '') as pc, phone, o.active,
> website, email, (select count(*) from notes as n where n.bid = o.bid) as nn
> from organisations as o right join notes as n on o.bid = n.bid where true
> order by replace(case when phone ='' then null else trim(phone) end, ' ',
> '') nulls last ;
> ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select
> list
> LINE 1: ...n notes as n on o.bid = n.bid where true order by replace(ca...
>
> The documentation does not explain the restriction, nor that, or why, you
> cannot use 'as' to rename the field and then cite the renamed version.
>

​Let me get some clarity now since my first reply saw an "ON" clause where
there isn't one...

What was the point of adding parens around o.bid in the second query? (I
assumed it was to do DISTINCT ON but you didn't add the word ON)

Why did you change replace(case when phone...) in the first query with
replace (case when postcode...) in the second query - but only in the
select-list?  (phone no longer exists in the select-list and thus you get
the error.)

The following works just fine with an alias in the ORDER BY:

WITH vals (v,t,o) AS ( VALUES
(1,'a',10),(2,'c',20),(3,'d',30),(3,'b',40),(4,'e',50) )
SELECT DISTINCT v, replace(t || '2', ' ', '') AS v2
FROM vals ORDER BY v2

Adding "o" to the ORDER BY provokes the error you have been seeing.

Tom has explained why this is the case.  In short, DISTINCT seems to
effectively remove the ability to specify "arbitrary expressions from input
column values" in the ORDER BY - though you can still specify non-arbitrary
expressions, as long as they are the same expressions that are used in the
select-list.

David J.
​


Re: Documentation of EXCEPT ALL may have a bug

2018-02-10 Thread Tom Lane
Alvaro Herrera  writes:
> PG Doc comments form wrote:
>> create table t(x int);
>> create table u(x int);
>> 
>> insert into t values (1), (2), (2), (3), (3);
>> insert into u values (1), (2);
>> 
>> select * from t except all select * from u;

>> x 
>> ---
>> 3
>> 3
>> 2
>> (3 rows)

> I find this pretty odd behavior.  Is this not an outright bug?

It's exactly what the standard says to do: if there are M occurrences
of a row value in the LHS, and N occurrences in the RHS, emit
max(M-N, 0) copies of the row.  To my mind that's a reasonable definition
of EXCEPT if you suppose that nominally-identical rows are matched and
discarded one by one, rather than with de-duplication occurring
beforehand.

> [*] I didn't try terribly hard, but couldn't actually find where the
> behavior is defined.

In SQL:2011, it's 7.13  general rule 3) b) iii) 3) B),
on page 420 in the draft version I have.

regards, tom lane



Re: Functions in sort order - undocumented restriction

2018-02-10 Thread Tom Lane
=?utf-8?q?PG_Doc_comments_form?=  writes:
> This does not work:

>  select distinct (o.bid), organisation, posttown, replace(case when postcode
> ='' then null else trim(postcode) end, ' ', '') as pc, phone, o.active,
> website, email, (select count(*) from notes as n where n.bid = o.bid) as nn
> from organisations as o right join notes as n on o.bid = n.bid where true
> order by replace(case when phone ='' then null else trim(phone) end, ' ',
> '') nulls last ;
> ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list

What do you find unclear about this error message?

The reason seems clear enough to me: if the ordering expression isn't one
of the values that are being de-duplicated on, then there isn't a unique
value to associate with each surviving row for sorting purposes.

regards, tom lane



Re: Functions in sort order - undocumented restriction

2018-02-10 Thread David G. Johnston
On Saturday, February 10, 2018, PG Doc comments form 
wrote:

> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/9.5/static/queries-order.html
> Description:
>
> This works:
>
> select distinct o.bid, organisation, posttown, replace(case when phone =''
> then null else trim(phone) end, ' ', ''), phone, o.active, website, email,
> (select count(*) from notes as n where n.bid = o.bid) as nn from
> organisations as o right join notes as n on o.bid = n.bid where true order
> by replace(case when phone ='' then null else trim(phone) end, ' ', '')
> nulls last ;
>
> This does not work:
>
>  select distinct (o.bid), organisation, posttown, replace(case when
> postcode
> ='' then null else trim(postcode) end, ' ', '') as pc, phone, o.active,
> website, email, (select count(*) from notes as n where n.bid = o.bid) as nn
> from organisations as o right join notes as n on o.bid = n.bid where true
> order by replace(case when phone ='' then null else trim(phone) end, ' ',
> '') nulls last ;
> ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select
> list


I suppose that error message should say "FOR SELECT DISTINCT ON, ORDER BY
..."


> LINE 1: ...n notes as n on o.bid = n.bid where true order by replace(ca...
>
> The documentation does not explain the restriction,


It is explained in the SELECT command documentation.

 """
The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).
The ORDER BY clause will normally contain additional expression(s) that
determine the desired precedence of rows within each DISTINCT ON group.
"""

https://www.postgresql.org/docs/9.5/static/sql-select.html


>
> nor that, or why, you
> cannot use 'as' to rename the field and then cite the renamed version.
>

You can and it does with limitations.

"Note that an output column name has to stand alone, that is, it cannot be
used in an expression"

You are correct that the "why" behind some of the decisions and limitations
is not always covered in great detail.

I suppose that the sorting docs could cover DSTINCT ON, it just needs
someone willing to write it up.  Since the SELECT docs cover the material
it is a usability issue as opposed to a correctness one.  Also, that
chapter is somewhat of an introductory piece and distinct on is somewhat of
an intermediate query concept.  Adding it there might be considered too
much information at that point in the docs.  Given the directness of the
error and the coverage in the command reference the status quo has merit.

David J.


Re: Documentation of EXCEPT ALL may have a bug

2018-02-10 Thread Pantelis Theodosiou
On Sat, Feb 10, 2018 at 1:02 PM, Pantelis Theodosiou 
wrote:

>
>
> On Sat, Feb 10, 2018 at 11:59 AM, Alvaro Herrera 
> wrote:
>
>> PG Doc comments form wrote:
>>
>> > create table t(x int);
>> > create table u(x int);
>> >
>> > insert into t values (1), (2), (2), (3), (3);
>> > insert into u values (1), (2);
>> >
>> > select * from t except all select * from u;
>>
>> >  x
>> > ---
>> >  3
>> >  3
>> >  2
>> > (3 rows)
>>
>> I find this pretty odd behavior.  Is this not an outright bug?  I
>> haven't read the SQL standard on this matter[*], but if they define
>> EXCEPT ALL to work like this, then it seems pretty useless.  (Maybe they
>> just didn't intend EXCEPT ALL to be useful?)  If this is indeed the
>> case, maybe we should amend the docs not only to explain this behavior
>> but also to warn against the construct.
>>
>
> I think that's the way it is defined (but I agree, I don't remember ever
> seeing a use for it)
>
>
>>
>> [*] I didn't try terribly hard, but couldn't actually find where the
>> behavior is defined.  What I have on hand is a draft of SQL:2011 where
>> this appears to be defined in 7.13  but I was unable
>> to find the rules for set operations.  It refers to 9.12 Grouping
>> operations but that defines conformance rules only.
>>
>> --
>>
>
> I may have a different version but I see:
>

That is in:
7.13 
General Rules 3, Case b:

>
> iii)
> T contains the following rows:
> 1) Let R be a row that is a duplicate of some row in ET1 or of some row in
> ET2 or both. Let
> m be the number of duplicates of R in ET1 and let n be the number of
> duplicates of R in
> ET2, where m ≥ 0 and n ≥ 0.
> 2) If DISTINCT is specified or implicit, then 
> ...
>
> 3) If ALL is specified, then
> Case:
> A) If UNION is specified, then the number of duplicates of R that T
> contains is (m + n).
> B) If EXCEPT is specified, then the number of duplicates of R that T
> contains is the
> maximum of (m – n) and 0 (zero).
> C) If INTERSECT is specified, then the number of duplicates of R that T
> contains is the
> minimum of m and n.
>
>


Re: Documentation of EXCEPT ALL may have a bug

2018-02-10 Thread Pantelis Theodosiou
On Sat, Feb 10, 2018 at 11:59 AM, Alvaro Herrera 
wrote:

> PG Doc comments form wrote:
>
> > create table t(x int);
> > create table u(x int);
> >
> > insert into t values (1), (2), (2), (3), (3);
> > insert into u values (1), (2);
> >
> > select * from t except all select * from u;
>
> >  x
> > ---
> >  3
> >  3
> >  2
> > (3 rows)
>
> I find this pretty odd behavior.  Is this not an outright bug?  I
> haven't read the SQL standard on this matter[*], but if they define
> EXCEPT ALL to work like this, then it seems pretty useless.  (Maybe they
> just didn't intend EXCEPT ALL to be useful?)  If this is indeed the
> case, maybe we should amend the docs not only to explain this behavior
> but also to warn against the construct.
>

I think that's the way it is defined (but I agree, I don't remember ever
seeing a use for it)


>
> [*] I didn't try terribly hard, but couldn't actually find where the
> behavior is defined.  What I have on hand is a draft of SQL:2011 where
> this appears to be defined in 7.13  but I was unable
> to find the rules for set operations.  It refers to 9.12 Grouping
> operations but that defines conformance rules only.
>
> --
>

I may have a different version but I see:

iii)
T contains the following rows:
1) Let R be a row that is a duplicate of some row in ET1 or of some row in
ET2 or both. Let
m be the number of duplicates of R in ET1 and let n be the number of
duplicates of R in
ET2, where m ≥ 0 and n ≥ 0.
2) If DISTINCT is specified or implicit, then 
...

3) If ALL is specified, then
Case:
A) If UNION is specified, then the number of duplicates of R that T
contains is (m + n).
B) If EXCEPT is specified, then the number of duplicates of R that T
contains is the
maximum of (m – n) and 0 (zero).
C) If INTERSECT is specified, then the number of duplicates of R that T
contains is the
minimum of m and n.


Re: Documentation of EXCEPT ALL may have a bug

2018-02-10 Thread Alvaro Herrera
PG Doc comments form wrote:

> create table t(x int);
> create table u(x int);
> 
> insert into t values (1), (2), (2), (3), (3);
> insert into u values (1), (2);
> 
> select * from t except all select * from u;

>  x 
> ---
>  3
>  3
>  2
> (3 rows)

I find this pretty odd behavior.  Is this not an outright bug?  I
haven't read the SQL standard on this matter[*], but if they define
EXCEPT ALL to work like this, then it seems pretty useless.  (Maybe they
just didn't intend EXCEPT ALL to be useful?)  If this is indeed the
case, maybe we should amend the docs not only to explain this behavior
but also to warn against the construct.

[*] I didn't try terribly hard, but couldn't actually find where the
behavior is defined.  What I have on hand is a draft of SQL:2011 where
this appears to be defined in 7.13  but I was unable
to find the rules for set operations.  It refers to 9.12 Grouping
operations but that defines conformance rules only.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Functions in sort order - undocumented restriction

2018-02-10 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/9.5/static/queries-order.html
Description:

This works:

select distinct o.bid, organisation, posttown, replace(case when phone =''
then null else trim(phone) end, ' ', ''), phone, o.active, website, email,
(select count(*) from notes as n where n.bid = o.bid) as nn from
organisations as o right join notes as n on o.bid = n.bid where true order
by replace(case when phone ='' then null else trim(phone) end, ' ', '')
nulls last ;

This does not work:

 select distinct (o.bid), organisation, posttown, replace(case when postcode
='' then null else trim(postcode) end, ' ', '') as pc, phone, o.active,
website, email, (select count(*) from notes as n where n.bid = o.bid) as nn
from organisations as o right join notes as n on o.bid = n.bid where true
order by replace(case when phone ='' then null else trim(phone) end, ' ',
'') nulls last ;
ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select
list
LINE 1: ...n notes as n on o.bid = n.bid where true order by replace(ca...

The documentation does not explain the restriction, nor that, or why, you
cannot use 'as' to rename the field and then cite the renamed version. 


Engineering

2018-02-10 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/9.3/static/runtime-config-connection.html
Description:

Engineering The Future