[ https://issues.apache.org/jira/browse/CALCITE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093270#comment-17093270 ]
Stamatis Zampetakis commented on CALCITE-3951:
----------------------------------------------
Thanks for pushing this forward [~rubenql].
I am not sure if SqlCollation is the place to keep the comparison logic.
Here is what the SQL standard says about comparisons of character strings.
*4.2.2 Comparison of character strings*

Two character strings are comparable if and only if either they have the same
character set or there exists at least one collation that is applicable to
both their respective character sets (which is possible only if the character
sets share the same repertoire).

A collation is defined by [ISO14651] as “a process by which two strings are
determined to be in exactly one of the relationships of less than, greater
than, or equal to one another”. Each collation known in an SQL-environment is
applicable to one or more character sets, and for each character set, one or
more collations are applicable to it, one of which is associated with it as
its character set collation.

Anything that has a declared type can, if that type is a character string
type, be associated with a collation applicable to its character set; this is
known as a declared type collation. Every declared type that is a character
string type has a collation derivation, this being either none, implicit, or
explicit. The collation derivation of a declared type with a declared type
collation that is explicitly or implicitly specified by a <data type> is
implicit. If the collation derivation of a declared type that has a declared
type collation is not implicit, then it is explicit. The collation derivation
of an expression of character string type that has no declared type collation
is none.

An operation that explicitly or implicitly involves character string
comparison is a character comparison operation. At least one of the operands
of a character comparison operation shall have a declared type collation.

There may be an SQL-session collation for some or all of the character sets
known to the SQL-implementation (see Subclause 4.38, “SQL-sessions”).

The collation used for a particular character comparison is specified by
Subclause 9.15, “Collation determination”.

The comparison of two character string expressions depends on the collation
used for the comparison (see Subclause 9.15, “Collation determination”). When
values of unequal length are compared, if the collation for the comparison has
the NO PAD characteristic and the shorter value is equal to some prefix of the
longer value, then the shorter value is considered less than the longer value.
If the collation for the comparison has the PAD SPACE characteristic, for the
purposes of the comparison, the shorter value is effectively extended to the
length of the longer by concatenation of <space>s on the right.

For every character set, there is at least one collation
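
To make the NO PAD / PAD SPACE rule quoted above concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: the class PadAwareComparison and the enum PadMode are hypothetical names, not part of Calcite, and String.compareTo (plain code-unit order) stands in for whatever collation is actually chosen for the comparison.

```java
/** Hypothetical helper illustrating the pad rules; not part of Calcite. */
final class PadAwareComparison {

  /** The two pad characteristics a collation can have per the quoted text. */
  enum PadMode { NO_PAD, PAD_SPACE }

  /**
   * Compares two strings. String.compareTo is an assumed stand-in for the
   * collation's ordering; a real collation could order characters differently.
   */
  static int compare(String a, String b, PadMode mode) {
    if (mode == PadMode.PAD_SPACE) {
      // Extend the shorter value with spaces on the right to the longer length.
      int len = Math.max(a.length(), b.length());
      a = padRight(a, len);
      b = padRight(b, len);
    }
    // With NO_PAD nothing is extended, so a value that is a proper prefix
    // of the other compares as less (compareTo returns a negative number).
    return a.compareTo(b);
  }

  /** Appends spaces until the string reaches the requested length. */
  static String padRight(String s, int len) {
    StringBuilder sb = new StringBuilder(s);
    while (sb.length() < len) {
      sb.append(' ');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Same two values, different pad characteristic.
    System.out.println(compare("abc", "abc  ", PadMode.NO_PAD) < 0);     // shorter is less
    System.out.println(compare("abc", "abc  ", PadMode.PAD_SPACE) == 0); // treated as equal
  }
}
```

Under PAD SPACE trailing spaces stop mattering, while under NO PAD the shorter value sorts first. Either way the pad handling is a property of the collation used for the comparison, not of the string values themselves.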
> Support different string comparison based on SqlCollation
> ---------------------------------------------------------
>
> Key: CALCITE-3951
> URL: https://issues.apache.org/jira/browse/CALCITE-3951
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Reporter: Ruben Q L
> Assignee: Ruben Q L
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently SqlCollation defines concepts like Coercibility, Charset, Locale,
> etc. However, we cannot specify on a certain collation that e.g. a string
> field should use case insensitive comparison. The goal of this ticket is to
> evolve SqlCollation to support that, and adapt the corresponding classes to
> use that (optional) "non-standard" comparison.
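
As a rough illustration of the kind of case-insensitive comparison the description asks for, the sketch below uses the JDK's java.text.Collator with a reduced strength. This is an assumption for illustration only: the class CaseInsensitiveCollationSketch and its methods are hypothetical, and nothing here says how SqlCollation should or will expose such a comparator.

```java
import java.text.Collator;
import java.util.Locale;

/** Hypothetical sketch; not the SqlCollation API. */
final class CaseInsensitiveCollationSketch {

  /** Collator that ignores case differences (for English, case is a tertiary difference). */
  static Collator caseInsensitive(Locale locale) {
    Collator collator = Collator.getInstance(locale);
    collator.setStrength(Collator.SECONDARY);
    return collator;
  }

  /** Collator that still distinguishes case. */
  static Collator caseSensitive(Locale locale) {
    Collator collator = Collator.getInstance(locale);
    collator.setStrength(Collator.TERTIARY);
    return collator;
  }

  public static void main(String[] args) {
    Collator ci = caseInsensitive(Locale.ENGLISH);
    Collator cs = caseSensitive(Locale.ENGLISH);
    System.out.println(ci.compare("calcite", "CALCITE") == 0); // case ignored
    System.out.println(cs.compare("calcite", "CALCITE") != 0); // case matters
  }
}
```

Collator implements Comparator<Object>, so an instance like this could in principle be handed to any code that sorts or compares string values. Whether it belongs on SqlCollation or elsewhere is the open question in this ticket.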