Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-31 Thread Richard Hipp
On 1/31/20, Markus Winand  wrote:
>
> - Expressions also have a declared type. More elaborate: the result value of
> an expression has a type that is statically determined (again  the “declared
> type”).

SQLite is a generalization of "Standard SQL" that allows expressions
that can have an arbitrary type.  Some (most?) expressions in SQLite
will always return the same type, but it is not hard to construct an
SQLite expression that returns different types based on its inputs.
You can also create application-defined functions including
table-valued functions that return different types based on their
inputs.

Internally, SQLite makes no attempt to track the datatype of an
expression.  Such tracking could be added, but doing so would just
make the library bigger and slower without serving any useful purpose.

>
> - If a generated column doesn’t have an explicit data type mentioned, the
> the data type of the result of the expression is used.

Having no explicit data type is a kind of explicit data type in
SQLite, since any SQLite column (generated or otherwise) can omit the
data type specification and the data type affinity becomes the
equivalent of "ANY".  From that point of view, SQLite is like PG in
that it requires an explicit data type on generated columns, though
SQLite extends PG by allowing that explicit data type to be the
unnamed "ANY" data type.  And since the data type is always implicit
in the column definition, there is never an occasion to inherit the
data type from the AS expression.

Furthermore, SQLite expressions do not have data types, so even if a
generated column were said to omit the data type, there would be no
way of capturing it from the AS expression.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-31 Thread Keith Medcalf

If the standard requires static typing then anything it says about static types 
does not apply to SQLite3 because SQLite3 uses dynamic strong typing and does 
not use static fixed typing.

That means that the "type" assigned in a column is merely a "suggestion" on how 
to store the data for this column, and that data will be converted to this type 
if possible (this is called the column affinity).  A particular "value" (stored 
at the intersection of a column and a row) may be of any "type", 
notwithstanding what the column declaration of the table has to say about it.

The supported "types" are:

 NULL - a NULL value
 INTEGER a 64-bit signed integer
 REAL - a 64-bit IEEE-754 double precision float
 TEXT - a sequence of bytes in the encoding of the underlying database (set by 
pragma encoding)
 BLOB - a sequence of bytes with no meaning

A "value" has a concrete "type" associated with it and nothing else.
A "column" has a preferred storage "type" (called the affinity) and a bunch of 
constraints, plus a collation specifying the collation sequence to apply when 
the "value" contained in some particular row of that column contains TEXT.

For example:

sqlite> create table x(x integer default ('1'));
sqlite> insert into x default values;
sqlite> select typeof(x), x from x;
integer|1

The column "x" in the table "x" has integer affinity (would prefer data to be 
stored as integers if it can be).
The default value is a text string '1'.
When you execute the insert, x takes the default value '1'.
When this value is stored the column affinity is applied and the value that is 
actually stored is the integer 1.

Similarly,

sqlite> drop table x;
sqlite> create table x(x integer default ('test'));
sqlite> insert into x default values;
sqlite> select typeof(x), x from x;
text|test

The application of affinity integer for the column cannot convert the value 
'test' into an integer, so the value is stored as text.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-Original Message-
>From: sqlite-users  On
>Behalf Of Markus Winand
>Sent: Friday, 31 January, 2020 03:29
>To: SQLite mailing list 
>Subject: Re: [sqlite] Generated columns and COLLATE in the AS parens
>
>Let me first explain how collations work in standard SQL. I think my
>answers below make more sense then.
>
>In the SQL standard….
>
>- character string **types** have properties such as fix-length vs.
>variable length, the length (limit), character set and also the
>collation.
>
>- columns have types, which include the collation if it is a character
>string type.
>  However, it is not the column that has a collation. The column has a
>type, which might have a collation.
>
>- values have a static type (the so-called “declared type”). Static means
>it is determined at “compile time” from the syntax and the data
>dictionary.
>
>- Expressions also have a declared type. More elaborate: the result value
>of an expression has a type that is statically determined (again  the
>“declared type”).
>
>- expressions of which the declared type is a character string type, will
>also have all the properties of character string types (including the
>collation).
>
>- When character strings are compared, the effective collation is
>determined from the declared types of the operands.
>   The SQL standard defines rules how to do that (e.g. in 9075-2 9.15
>"Collation determination”)
>   The most important rule is the "collation derivation” order: explicit,
>implicit, none. That’s the rule that says
>   the COLLATE clause on expressions (“explicit” derivation) is stronger
>than the COLLATE clause following type names (“implicit” derivation).
>
>But note that the last statement is not the reason the collation of the
>generation expression takes precedence of the data type — IT DOESN’T!
>
>
>As far as I know, this is basically unchanged sind SQL-92. If you do not
>have access to the current release of the standard (of 2016, to which my
>references apply), you can also have a look at SQL-92 here (search for
>"4.2  Character strings”):
>
>https://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
>
>Generated columns where introduced with SQL:2003 and are thus not in the
>linked document.
>
>
>Putting these mechanics of working with collations in context of
>generated columns:
>
>- Generated columns need a type like any other column.
>  If that type happens to be a character string type, the type includes
>the collation as part of the types properties.
>
>- Even though generated columns need a type, they allow skipping the type
>in the definition of the

Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-31 Thread Markus Winand
Let me first explain how collations work in standard SQL. I think my answers 
below make more sense then.

In the SQL standard….

- character string **types** have properties such as fix-length vs. variable 
length, the length (limit), character set and also the collation.

- columns have types, which include the collation if it is a character string 
type.
  However, it is not the column that has a collation. The column has a type, 
which might have a collation.

- values have a static type (the so-called “declared type”). Static means it is 
determined at “compile time” from the syntax and the data dictionary.

- Expressions also have a declared type. More elaborate: the result value of an 
expression has a type that is statically determined (again  the “declared 
type”).

- expressions of which the declared type is a character string type, will also 
have all the properties of character string types (including the collation).

- When character strings are compared, the effective collation is determined 
from the declared types of the operands.
   The SQL standard defines rules how to do that (e.g. in 9075-2 9.15 
"Collation determination”)
   The most important rule is the "collation derivation” order: explicit, 
implicit, none. That’s the rule that says
   the COLLATE clause on expressions (“explicit” derivation) is stronger than 
the COLLATE clause following type names (“implicit” derivation).

But note that the last statement is not the reason the collation of the 
generation expression takes precedence of the data type — IT DOESN’T!


As far as I know, this is basically unchanged sind SQL-92. If you do not have 
access to the current release of the standard (of 2016, to which my references 
apply), you can also have a look at SQL-92 here (search for "4.2  Character 
strings”):

https://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

Generated columns where introduced with SQL:2003 and are thus not in the linked 
document.


Putting these mechanics of working with collations in context of generated 
columns:

- Generated columns need a type like any other column.
  If that type happens to be a character string type, the type includes the 
collation as part of the types properties.

- Even though generated columns need a type, they allow skipping the type in 
the definition of the generated column—that’s very unique to generated columns.
  E.g. specifying a DEFAULT clause does not lift the requirement to explicitly 
state the type of the column.

- If a generated column doesn’t have an explicit data type mentioned, the the 
data type of the result of the expression is used.
  Whatever data type that is, whatever properties that type has.
  If it happens to be a character string type, it will also have a “collation” 
property.

However, the declared type of the expression (including its properties) is only 
relevant if the column definition doesn’t explicitly state a data type.

These are two different cases in standard SQL:

  C1 TEXT GENERATED ALWAYS AS (…)
  C2  GENERATED ALWAYS AS (…)

The type of C1 is TEXT, including all its default properties.
The type of C2 is the type of the result of the expression, including all its 
properties.

The “what’s new” paper for SQL:2003 mentions that case explicitly.

http://sigmodrecord.org/publications/sigmodRecord/0403/E.JimAndrew-standard.pdf

The example on the last page:

> CREATE TABLE EMPLOYEES (
>  EMP_ID INTEGER,
>  SALARY DECIMAL(7,2),
>  BONUS DECIMAL(7,2),
>  TOTAL_COMP GENERATED ALWAYS AS (SALARY + BONUS)
>  )
> 
> TOTAL_COMP is a generated column of the EMPLOYEES table. The data type of the 
> TOTAL_COMP is the data type of the expression (SALARY_BONUS).

I think that’s a typo and should read (SALARY + BONUS).

> Users may optionally specify a data type for a generated column, in which 
> case the specified data type must match with the data type of the associated 
> expression.

The last part is not 100% in line with the current standard. The types don’t 
need to match, they need to be assignable. I don’t know if this is just a 
simplification for this paper or if SQL:2003 really hat that requirement.

Further comments to your statements below…

> On 30 Jan 2020, at 23:06, Keith Medcalf  wrote:
> 
> 
> On: Wednesday, 29 January, 2020 06:45, Markus Winand 
>  wrote:
> 
>> I think there might be a glitch in the way SQLite 3.31.x derives the
>> collation information from the expression of a generated column.
> 
>> In particular, COLLATE inside the AS parens seems to be ignored, but it
>> is honoured after the parens:
> 
> Carrying the COLLATE from an expression into the column definition is 
> incorrect.  The definition of a generated column is:
> 
>  [type affinity] [GENERATED ALWAYS AS ()] [COLLATE 
> ] [ ...]
> 
> so why would the so including a COLLATE as part of the expression applies to 
> the expression, and not to the column.  Except for the ordering of 
>  which must be first, followed by the type which must come 
> second, the 

Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-30 Thread Keith Medcalf

On: Wednesday, 29 January, 2020 06:45, Markus Winand  
wrote:

>I think there might be a glitch in the way SQLite 3.31.x derives the
>collation information from the expression of a generated column.

>In particular, COLLATE inside the AS parens seems to be ignored, but it
>is honoured after the parens:

Carrying the COLLATE from an expression into the column definition is 
incorrect.  The definition of a generated column is:

 [type affinity] [GENERATED ALWAYS AS ()] [COLLATE 
] [ ...]

so why would the so including a COLLATE as part of the expression applies to 
the expression, and not to the column.  Except for the ordering of  
which must be first, followed by the type which must come second, the other 
bits can appear in pretty much any ordering you heart desires (it is a Dim Sum 
of clauses).  

If the  of the column not specified then its "type affinity" is 
BLOB (None).
If no COLLATE is specified for the column, then the default collation BINARY 
applies.
If no NULL constraint is specified then NULL values are permitted.
If no DEFAULT expression is specified then the default value is NULL.

Why would you think that the COLLATE applied to the evaluation of an expression 
would "carry out" to the column itself?  Especially in light of the fact that 
the type/collation of the expression in a "DEFAULT ()" does not 
carry out to the column definition?

The collation sequence applies to "comparison" operations of TEXT values, and 
not to anything else (it is ignored for all other value types).  It is an 
attribute of a column (like column affinity) and not an attribute of a value.

In other words,

x AS (y == 'yahoo' collate nocase)

applies the affinity nocase to the evaluation of the == comparison operation.  
The result of evaluation of the expression is a data value with a concrete type 
(integer) and no associated collation.  This value then has whatever column 
affinity is specified applied to it and inherits the collation of the 
containing column.

x AS (y collate nocase)

is the same thing as

x AS (y)

since there is no "comparison" operation in the expression to which the 
collation can be applied (it is a useless attribute to the expression).  The 
result of the expression is a value of the concrete type of y with no 
associated collation.  This value then has the whatever column affinity is 
specified applied to it and inherits the collation of the containing column.

ie, "values" do not have a collation associated with them, column definitions 
and comparison operators on text have collation sequences associated with them.

>SQLite version 3.31.1 2020-01-27 19:55:54
>Enter ".help" for usage hints.
>Connected to a transient in-memory database.
>Use ".open FILENAME" to reopen on a persistent database.
>sqlite> CREATE TABLE tmp (str VARCHAR(255), str_nc1 GENERATED ALWAYS AS
>(str COLLATE NOCASE), str_nc2 GENERATED ALWAYS AS (str) COLLATE NOCASE);
>sqlite> INSERT INTO tmp VALUES ('a'), ('A'), ('b'), ('B');
>sqlite> SELECT * FROM tmp ORDER BY str;
>A|A|A
>B|B|B
>a|a|a
>b|b|b
>sqlite> SELECT * FROM tmp ORDER BY str_nc1;
>A|A|A
>B|B|B
>a|a|a
>b|b|b
>sqlite> SELECT * FROM tmp ORDER BY str_nc2;
>a|a|a
>A|A|A
>b|b|b
>B|B|B
>sqlite> SELECT * FROM tmp ORDER BY (str COLLATE NOCASE);
>a|a|a
>A|A|A
>b|b|b
>B|B|B
>sqlite>

>I believe the "ORDER BY str_nc1” should yield the same row order as the
>last two queries.

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-30 Thread Markus Winand


> On 30 Jan 2020, at 18:20, Richard Hipp  wrote:
> 
> On 1/30/20, Markus Winand  wrote:
>> 
>> Unfortunately, the “what would PostgreSQL do” approach doesn’t provide
>> guidance here.
> 
> Maybe it does.  PostgreSQL doesn't allow typeless columns, but it does
> allow columns with unspecified collating sequences, does it not?  

If you do not specify a COLLATE clause in a column definition, the default 
collation of the effective character set is used.

The name ‘default’ is still a collation. Quoting 
https://www.postgresql.org/docs/current/collation.html:

The collation of an expression can be the "default" collation, 
which means the locale settings defined for the database. 


> What
> if you have a normal column X with some collating sequence C and then
> a generated column Y that as just "AS(X)”.

It depends on whether the generated column specifies a  or not (see 
below).


>  If you do comparisons on
> column Y, which collating sequence does it use - the default or C?
> 
> Can you run that experiment for us?

David Raymond did.

Here is what happens, split into the three relevant cases.

> testing=> create table foo (
> testing(> a text,
> testing(> b text collate "C",
> testing(> c text collate "en-US-x-icu”,
> […]
> attnum | attname | attgenerated |  collname
> +-+--+-
>  1 | a   |  | default
>  2 | b   |  | C
>  3 | c   |  | en-US-x-icu

Pretty obvious.

The next columns:

> testing(> d text generated always as (a) stored,
> testing(> e text generated always as (b) stored,
> testing(> f text generated always as (c) stored,

>  4 | d   | s| default
>  5 | e   | s| default
>  6 | f   | s| default

The generated column definitions mention a , in that case the types 
character set’s default collation is used — coincidentally called “default”. 
Thus, it is the same case as column “a”.

Finally:

> testing(> g text collate "C" generated always as (c) stored,
> testing(> h text generated always as (c collate "C") stored,
> testing(> i text collate "C" generated always as (c collate "en-US-x-icu") 
> stored
> testing(> );

>  7 | g   | s| C
>  8 | h   | s| default
>  9 | i   | s| C

Again,  is specified, thus this collation is used. The collation of 
“h” is not “C” because there is an explicit type definition “text”, which 
includes a character set and a default collation.

The collation of the expression is really only relevant if there is no  (and no COLLATE outside the expression).

> 
> Or maybe you are thinking the collating sequence of the expression in
> the AS clause should only be carried through into the generated column
> if it is explicitly stated, and not implied?

There is only one case when the type of the expression, including the character 
set and collation should be carried through into the generated column: if the 
generated column doesn’t explicitly define a type.

> 
> What happens if there is a collating sequence specified in the AS
> clause and also another collating sequence on the column definitions?
> 
>  CREATE TABLE t1(x TEXT, y TEXT AS (x COLLATE nocase) COLLATE rtree);
> 
> Which collating sequence should be used for "SELECT * FROM t1 WHERE y='xyz';"?

There are three places where a COLLATE clause in the definition of a generated 
column is allowed:

(1) After the data type (see 6.1,  in the BNF)
(2) Inside the expression (see 6.31,  in the BNF)
(3) At the very end (see 11.4,  in the BNF).

There is a syntax rule prohibiting (1) and (3) being used at the same time 
(11.4 SR12b). The same rule says that the effect of (1) and (3) is the same:

>   • Otherwise,  shall not be both specified in  type> and immediately contained in . If  
> is immediately contained in , then it is equivalent to 
> specifying an equivalent  in . 

This is the case you were just are asking about.

Case (2) might affect the collation of the expression, which is only relevant 
in case there is neither (1) or (3) specified. In that case, also the collation 
is taken from the expression—via the data type (11.4 SR13c).

> 13)  The declared type of the column is
> Case:
>   • a)  If  is specified, then that data type. If  clause> is also specified, then the declared type of  
> shall be assignable to the declared type of the column.
>   • b)  If  is specified, […]
>   • c)  If  is specified, then the declared type of GE.

As you see, as soon as a  is specified, the collation of the 
expression is irrelevant. This is what can be demonstrated in PostgreSQL. 
Skipping  is currently not possible in PostgreSQL, thus PostgreSQL 
cannot be used to demonstrate c).

> Regarding WWPD: If you can convince Tom Lane and/or Bruce Momjian to
> send me an email that says "An explicit collating sequence at the
> top-level of a GENERATED ALWAYS AS clause should be carried 

Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-30 Thread David Raymond
From a Windows installation of 12.0


testing=> select version();
  version

 PostgreSQL 12.0, compiled by Visual C++ build 1914, 64-bit
(1 row)

Time: 0.283 ms
testing=> create table foo (
testing(> a text,
testing(> b text collate "C",
testing(> c text collate "en-US-x-icu",
testing(> d text generated always as (a) stored,
testing(> e text generated always as (b) stored,
testing(> f text generated always as (c) stored,
testing(> g text collate "C" generated always as (c) stored,
testing(> h text generated always as (c collate "C") stored,
testing(> i text collate "C" generated always as (c collate "en-US-x-icu") 
stored
testing(> );
CREATE TABLE
Time: 81.025 ms
testing=> \d foo
  Table "public.foo"
 Column | Type |  Collation  | Nullable |   Default
+--+-+--+--
 a  | text | |  |
 b  | text | C   |  |
 c  | text | en-US-x-icu |  |
 d  | text | |  | generated always as (a) stored
 e  | text | |  | generated always as (b) stored
 f  | text | |  | generated always as (c) stored
 g  | text | C   |  | generated always as (c) stored
 h  | text | |  | generated always as (c COLLATE "C") 
stored
 i  | text | C   |  | generated always as (c COLLATE 
"en-US-x-icu") stored

testing=> select attnum, attname, attgenerated, collname from pg_attribute left 
outer join pg_collation on pg_attribute.attcollation = pg_collation.oid where 
attrelid = 'foo'::regclass and attnum > 0 order by attnum;
 attnum | attname | attgenerated |  collname
+-+--+-
  1 | a   |  | default
  2 | b   |  | C
  3 | c   |  | en-US-x-icu
  4 | d   | s| default
  5 | e   | s| default
  6 | f   | s| default
  7 | g   | s| C
  8 | h   | s| default
  9 | i   | s| C
(9 rows)

Time: 0.590 ms
testing=>


-Original Message-----
From: sqlite-users  On Behalf Of 
Richard Hipp
Sent: Thursday, January 30, 2020 12:21 PM
To: SQLite mailing list 
Subject: Re: [sqlite] Generated columns and COLLATE in the AS parens

On 1/30/20, Markus Winand  wrote:
>
> Unfortunately, the “what would PostgreSQL do” approach doesn’t provide
> guidance here.

Maybe it does.  PostgreSQL doesn't allow typeless columns, but it does
allow columns with unspecified collating sequences, does it not?  What
if you have a normal column X with some collating sequence C and then
a generated column Y that as just "AS(X)".  If you do comparisons on
column Y, which collating sequence does it use - the default or C?
Can you run that experiment for us?

Or maybe you are thinking the collating sequence of the expression in
the AS clause should only be carried through into the generated column
if it is explicitly stated, and not implied?

What happens if there is a collating sequence specified in the AS
clause and also another collating sequence on the column definitions?

  CREATE TABLE t1(x TEXT, y TEXT AS (x COLLATE nocase) COLLATE rtree);

Which collating sequence should be used for "SELECT * FROM t1 WHERE y='xyz';"?

Regarding WWPD: If you can convince Tom Lane and/or Bruce Momjian to
send me an email that says "An explicit collating sequence at the
top-level of a GENERATED ALWAYS AS clause should be carried through as
the default collating sequence of the generated column itself", then
I'll change it.  :-)

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-30 Thread Richard Hipp
On 1/30/20, Markus Winand  wrote:
>
> Unfortunately, the “what would PostgreSQL do” approach doesn’t provide
> guidance here.

Maybe it does.  PostgreSQL doesn't allow typeless columns, but it does
allow columns with unspecified collating sequences, does it not?  What
if you have a normal column X with some collating sequence C and then
a generated column Y that as just "AS(X)".  If you do comparisons on
column Y, which collating sequence does it use - the default or C?
Can you run that experiment for us?

Or maybe you are thinking the collating sequence of the expression in
the AS clause should only be carried through into the generated column
if it is explicitly stated, and not implied?

What happens if there is a collating sequence specified in the AS
clause and also another collating sequence on the column definitions?

  CREATE TABLE t1(x TEXT, y TEXT AS (x COLLATE nocase) COLLATE rtree);

Which collating sequence should be used for "SELECT * FROM t1 WHERE y='xyz';"?

Regarding WWPD: If you can convince Tom Lane and/or Bruce Momjian to
send me an email that says "An explicit collating sequence at the
top-level of a GENERATED ALWAYS AS clause should be carried through as
the default collating sequence of the generated column itself", then
I'll change it.  :-)

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-30 Thread Markus Winand

> On 29 Jan 2020, at 14:59, Richard Hipp  wrote:
> 
> On 1/29/20, Markus Winand  wrote:
>> Hi!
>> 
>> I think there might be a glitch in the way SQLite 3.31.x derives the
>> collation information from the expression of a generated column.
> 
> I think the current behavior is correct.
> 
> If you want a column to have a non-standard collating sequence, you
> should add a COLLATE constraint to that column definition.  The fact
> that there is a COLLATE operator on the expression that determines the
> value of that column seems irrelevant.
> 
> Consider this:
> 
> CREATE TABLE t1(a TEXT DEFAULT('xyzzy' COLLATE nocase));
> 
> Would you expect the COLLATE operator in the DEFAULT clause to change
> the collating sequence associated with column a?  Why should a
> GENERATED ALWAYS AS constraint work differently from a DEFAULT
> constraint?

The critical point here is that SQLite accepts a GENERATED clause without 
explicit type:

CREATE TABLE tmp (str VARCHAR(255), str_nc1 GENERATED ALWAYS AS (str COLLATE 
NOCASE), str_nc2 GENERATED ALWAYS AS (str) COLLATE NOCASE);

In that case, the type of the generated column should taken from the result 
type of the expression[0]. As collations are part of the character string 
types[1], also the collation is taken from the expression. The COLLATE clause 
applied to expressions (as opposed to following the name of a data type) just 
changes the collation in of the character string type of that expression[2].

If the GENERATED clause explicitly sets a data type, then the collation of the 
expression is irrelevant as it doesn’t affect the type of the generated column, 
similar to your example with DEFAULT (it is only a value assignment in that 
case).

At the end everything boils down to this question:
What is the type, including the character set and collation, of generated 
columns that don’t specify a type explicitly? I think the only sensible answer 
is that it is the type of the expression, including its character set and 
collation.

Unfortunately, the “what would PostgreSQL do” approach doesn’t provide guidance 
here as PostgreSQL requires an explicit type for generated columns (and so do 
MySQL and MariaDB). In SQL Server, however, it works like I would expect it.

I guess the SQLite approach to types might affect all of my reasoning, but in 
other cases like the following it seems that SQLite also “transports” the 
applicable collation together with the result (type?) of an expression:

sqlite> CREATE TABLE tmp (str VARCHAR(255));
sqlite> INSERT INTO tmp VALUES ('a'), ('A'), ('b'), ('B');
sqlite> SELECT * FROM (SELECT str COLLATE NOCASE FROM tmp) ORDER BY str;
a
A
b
B

-markus

References to SQL:2016, Part 2. I know that this is not a strong argument to 
you, but nevertheless ;)

[0] 11.4 SR 13 c: 
• If  is specified, then the declared type of GE. 
 Whereas GE is defined in SR 10 a:
• Let GE be the  contained in GC.

[1] 4.2.1
   A character string type descriptor contains:
   [….]
• —  The catalog name, schema name, and collation name of the collation 
of the character string type.

[2] 6.31 SR 4a:
• If  is specified, then the declared type of the 
 is the declared type of the , except that 
the declared type collation is the collation identified by , 
and its collation derivation is explicit.


> 
> -- 
> D. Richard Hipp
> drh at sqlite.org
> 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Generated columns and COLLATE in the AS parens

2020-01-29 Thread Richard Hipp
On 1/29/20, Markus Winand  wrote:
> Hi!
>
> I think there might be a glitch in the way SQLite 3.31.x derives the
> collation information from the expression of a generated column.

I think the current behavior is correct.

If you want a column to have a non-standard collating sequence, you
should add a COLLATE constraint to that column definition.  The fact
that there is a COLLATE operator on the expression that determines the
value of that column seems irrelevant.

Consider this:

CREATE TABLE t1(a TEXT DEFAULT('xyzzy' COLLATE nocase));

Would you expect the COLLATE operator in the DEFAULT clause to change
the collating sequence associated with column a?  Why should a
GENERATED ALWAYS AS constraint work differently from a DEFAULT
constraint?

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Generated columns and COLLATE in the AS parens

2020-01-29 Thread Markus Winand
Hi!

I think there might be a glitch in the way SQLite 3.31.x derives the collation 
information from the expression of a generated column.

In particular, COLLATE inside the AS parens seems to be ignored, but it is 
honoured after the parens:

SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> CREATE TABLE tmp (str VARCHAR(255), str_nc1 GENERATED ALWAYS AS (str 
COLLATE NOCASE), str_nc2 GENERATED ALWAYS AS (str) COLLATE NOCASE);
sqlite> INSERT INTO tmp VALUES ('a'), ('A'), ('b'), ('B');
sqlite> SELECT * FROM tmp ORDER BY str;
A|A|A
B|B|B
a|a|a
b|b|b
sqlite> SELECT * FROM tmp ORDER BY str_nc1;
A|A|A
B|B|B
a|a|a
b|b|b
sqlite> SELECT * FROM tmp ORDER BY str_nc2;
a|a|a
A|A|A
b|b|b
B|B|B
sqlite> SELECT * FROM tmp ORDER BY (str COLLATE NOCASE);
a|a|a
A|A|A
b|b|b
B|B|B
sqlite>

I believe the "ORDER BY str_nc1” should yield the same row order as the last 
two queries.

-markus
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users