Re: [HACKERS] Insufficient description in collation mismatch error

2011-04-20 Thread Greg Stark
On Wed, Apr 20, 2011 at 1:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 postgres=# SELECT things, count(*) FROM stuff GROUP BY things COLLATE C;
 ERROR:  column stuff.things must appear in the GROUP BY clause or be
 used in an aggregate function
 LINE 1: SELECT things, count(*) FROM stuff GROUP BY things COLLATE ...

 Firstly, does it even make sense for a GROUP BY clause to accept COLLATE?

 Probably, or at least I'm hesitant to hard-wire a restriction against
 it.  The question is isomorphic to whether you believe that different
 collations can have different equality semantics.

Well the answer to his question is isomorphic to that. But the
question of whether the original query should be isn't. The query only
makes sense to be an error if different collations can have different
output representations -- which I believe is a definite no.

Now the problem gets more complicated because if the above query works
then you should expect to be able to do:

SELECT * FROM morestuff WHERE things IN (SELECT things from stuff
GROUP BY things COLLATE x)

If the GROUP BY could change the meaning of equality for things then
it's hard to figure what meaning should be used for the IN clause. If
it's the default meaning for things and that's different than x then
the IN clause is going to produce a non-deterministic set of results.

Possibly a user would expect the collation on the GROUP BY clause to
dictate the collation on the select list and vice versa. But that's a
pretty far-reaching action-at-a-distance.  Or possibly we should just
allow a mismatch but set the collation to indeterminate or something
so it can't be used in an outer query without an explicit collation
clause. Still that seems pretty arbitrary.

 regression=# select f1 from int4_tbl group by abs(f1);
 ERROR:  column int4_tbl.f1 must appear in the GROUP BY clause or be used in 
 an aggregate function
 LINE 1: select f1 from int4_tbl group by abs(f1);

I tihnk that's not a great example because from the user's point of
view it's clear that there could be multiple f1 values for a single
abs(f1) value. In the case of collation there could be multiple
different sort positions in one collation for a single thing thing in
a different collation but people probably think of them as the same
thing.

It might be more analogous to

select f1 from int4_tbl group by f1::numeric;






-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Insufficient description in collation mismatch error

2011-04-19 Thread Thom Brown
Hi,

I tried applying a collation to a GROUP BY clause without applying the
collation to the corresponding column in the SELECT clause.

postgres=# SELECT things, count(*) FROM stuff GROUP BY things COLLATE C;
ERROR:  column stuff.things must appear in the GROUP BY clause or be
used in an aggregate function
LINE 1: SELECT things, count(*) FROM stuff GROUP BY things COLLATE ...

Firstly, does it even make sense for a GROUP BY clause to accept COLLATE?

Even if it does, this error message doesn't explain the problem, being
that the column with the necessary collation doesn't appear in the
SELECT.

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Insufficient description in collation mismatch error

2011-04-19 Thread Tom Lane
Thom Brown t...@linux.com writes:
 I tried applying a collation to a GROUP BY clause without applying the
 collation to the corresponding column in the SELECT clause.

 postgres=# SELECT things, count(*) FROM stuff GROUP BY things COLLATE C;
 ERROR:  column stuff.things must appear in the GROUP BY clause or be
 used in an aggregate function
 LINE 1: SELECT things, count(*) FROM stuff GROUP BY things COLLATE ...

 Firstly, does it even make sense for a GROUP BY clause to accept COLLATE?

Probably, or at least I'm hesitant to hard-wire a restriction against
it.  The question is isomorphic to whether you believe that different
collations can have different equality semantics.  You'd want that for
instance if you wanted a collation to be able to implement
case-insensitive comparisons.  The SQL committee seem to believe that
that is possible, because they take the trouble to specify that
foreign-key comparisons are done using the referenced not referencing
column's collation; there'd be no need for that verbiage if it couldn't
matter.  But there are a number of places in our existing code that
would need to be improved before we could support such a thing; in
general I'd have to say the code is pretty schizophrenic on the point.

 Even if it does, this error message doesn't explain the problem, being
 that the column with the necessary collation doesn't appear in the
 SELECT.

This isn't a new problem particularly; it happens whenever a GROUP BY
item isn't just a simple variable.  For example

regression=# select f1 from int4_tbl group by abs(f1);
ERROR:  column int4_tbl.f1 must appear in the GROUP BY clause or be used in 
an aggregate function
LINE 1: select f1 from int4_tbl group by abs(f1);
   ^

I agree this isn't terribly user-friendly, but it's not real clear to me
how to do better.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers