zabetak commented on a change in pull request #1937:
URL: https://github.com/apache/calcite/pull/1937#discussion_r415611799
##########
File path: core/src/main/java/org/apache/calcite/sql/SqlCollation.java
##########
@@ -279,4 +281,18 @@ public final String getCollationName() {
public final SqlCollation.Coercibility getCoercibility() {
return coercibility;
}
+
+ /**
+ * @return Comparator to be used, or null if no special comparator is
required.
+ */
+ public Comparator<String> getComparator() {
+ return null;
+ }
+
+ /**
+ * @return Comparator expression to be used, or null if no special
comparator is required.
+ */
+ public Expression getComparatorExpression() {
Review comment:
Having a `linq4j.tree.Expression` in this level seems a bit weird. I
don't remember any other class in the sql package having dependencies to
linq4j. Moreover, for those projects that do not rely on `Enumerable` this
method is a bit useless.
##########
File path: core/src/main/java/org/apache/calcite/sql/SqlCollation.java
##########
@@ -279,4 +281,18 @@ public final String getCollationName() {
public final SqlCollation.Coercibility getCoercibility() {
return coercibility;
}
+
+ /**
+ * @return Comparator to be used, or null if no special comparator is
required.
+ */
+ public Comparator<String> getComparator() {
Review comment:
Is it better to return a `Comparator` or `Collator`? If we need to
generate collation keys in other places then maybe the second option is
preferable.
##########
File path: core/src/main/java/org/apache/calcite/sql/SqlCollation.java
##########
@@ -279,4 +281,18 @@ public final String getCollationName() {
public final SqlCollation.Coercibility getCoercibility() {
return coercibility;
}
+
Review comment:
The newly added method indicate that projects should rely on inheritance
to define a new collation but the documentation doesn't say so. Moreover, I get
the feeling that this information is not an extension but really part of this
class.
Using inheritance brings up questions like what happens with `equals` and
`hashCode`? Should the comparator be part of it?
##########
File path:
core/src/test/java/org/apache/calcite/test/enumerable/EnumerableStringComparisonTest.java
##########
@@ -0,0 +1,190 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to you under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.calcite.test.enumerable;
+
+import org.apache.calcite.adapter.enumerable.EnumerableRules;
+import org.apache.calcite.adapter.java.ReflectiveSchema;
+import org.apache.calcite.config.CalciteConnectionProperty;
+import org.apache.calcite.config.CalciteSystemProperty;
+import org.apache.calcite.config.Lex;
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.plan.RelOptPlanner;
+import org.apache.calcite.rel.core.JoinRelType;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.runtime.Hook;
+import org.apache.calcite.sql.SqlCollation;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.calcite.sql.type.SqlTypeName;
+import org.apache.calcite.test.CalciteAssert;
+import org.apache.calcite.test.JdbcTest;
+import org.apache.calcite.tools.RelBuilder;
+
+import org.junit.jupiter.api.Test;
+
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.function.Consumer;
+
+/**
+ * Test cases for
+ * <a href="https://issues.apache.org/jira/browse/CALCITE-3951">[CALCITE-3951]
+ * Support different string comparison based on SqlCollation</a>.
+ */
+class EnumerableStringComparisonTest {
+
+ private static final SqlCollation NO_CASE =
+ new SqlCollation(CalciteSystemProperty.DEFAULT_COLLATION.value() +
"-NO_CASE",
+ SqlCollation.Coercibility.IMPLICIT) {
+ @Override public Comparator<String> getComparator() {
+ return String.CASE_INSENSITIVE_ORDER;
+ }
+ @Override public Expression getComparatorExpression() {
+ return Expressions.field(Expressions.constant("", String.class),
+ "CASE_INSENSITIVE_ORDER");
+ }
+ };
+
+ private RelDataType createRecordVarcharNoCase(RelBuilder builder) {
+ return builder.getTypeFactory().builder()
+ .add(
+ "name",
+ builder.getTypeFactory().createTypeWithCharsetAndCollation(
+ builder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR),
+ builder.getTypeFactory().getDefaultCharset(),
+ NO_CASE))
+ .build();
+ }
+
+ private RelDataType createVarcharNoCase(RelBuilder builder) {
+ return builder.getTypeFactory().createTypeWithCharsetAndCollation(
+ builder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR),
+ builder.getTypeFactory().getDefaultCharset(),
+ NO_CASE);
+ }
+
+ @Test void testSortNoCase() {
+ tester()
+ .query("?")
+ .withRel(builder -> builder
+ .values(
+ createRecordVarcharNoCase(builder),
+ "Legal", "presales", "hr", "Administration", "MARKETING")
+ .sort(
+ builder.field(1, 0, "name"))
+ .build())
+ .explainHookMatches(""
+ + "EnumerableSort(sort0=[$0], dir0=[ASC])\n"
+ + " EnumerableValues(tuples=[[{ 'Legal' }, { 'presales' }, { 'hr'
}, { 'Administration' }, { 'MARKETING' }]])\n")
+ .returnsOrdered("name=Administration\n"
+ + "name=hr\n"
+ + "name=Legal\n"
+ + "name=MARKETING\n"
+ + "name=presales");
+ }
+
+ @Test void testMergeJoinOnStringNoCase() {
+ tester()
+ .query("?")
+ .withHook(Hook.PLANNER, (Consumer<RelOptPlanner>) planner -> {
+ planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
+ planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
+ })
+ .withRel(builder -> builder
+ .values(createRecordVarcharNoCase(builder),
+ "Legal", "presales", "hr", "Administration",
"MARKETING").as("v1")
+ .values(createRecordVarcharNoCase(builder),
+ "Marketing", "bureaucracy", "Sales", "HR").as("v2")
+ .join(JoinRelType.INNER,
+ builder.equals(
+ builder.field(2, 0, "name"),
+ builder.field(2, 1, "name")))
+ .project(
+ builder.field("v1", "name"),
+ builder.field("v2", "name"))
+ .build())
+ .explainHookMatches("" // It is important that we have MergeJoin in
the plan
+ + "EnumerableMergeJoin(condition=[=($0, $1)], joinType=[inner])\n"
+ + " EnumerableSort(sort0=[$0], dir0=[ASC])\n"
+ + " EnumerableValues(tuples=[[{ 'Legal' }, { 'presales' }, {
'hr' }, { 'Administration' }, { 'MARKETING' }]])\n"
+ + " EnumerableSort(sort0=[$0], dir0=[ASC])\n"
+ + " EnumerableValues(tuples=[[{ 'Marketing' }, { 'bureaucracy'
}, { 'Sales' }, { 'HR' }]])\n")
+ .returnsOrdered("name=hr; name0=HR\n"
+ + "name=MARKETING; name0=Marketing");
+ }
+
+ @Test void testStringComparison() {
Review comment:
This test is similar to the logic of `SqlOperatorBaseTest`. Could we
move things there? Are there things that we could reuse?
##########
File path: core/src/main/java/org/apache/calcite/runtime/Utilities.java
##########
@@ -212,6 +213,27 @@ public static int compareNullsLast(Comparable v0,
Comparable v1) {
: v0.compareTo(v1);
}
+ public static int compare(Comparable v0, Comparable v1, Comparator
comparator) {
Review comment:
Do we need all this new variants? Note that there are also the standard
java methods `Comparator.nullsFirst` and `Comparator.nullsLast`.
If we have a `Comparator` at hand then we could directly generate the code:
```
Comparator nfc = Comparator.nullsFirst(comparator);
nfc.compare(v0, v1);
```
I don't see clearly why we need to use the `Utilties` redirection in this
case.
##########
File path: core/src/main/java/org/apache/calcite/rex/RexBuilder.java
##########
@@ -919,9 +919,12 @@ protected RexLiteral makeLiteral(
// from the type if necessary.
assert o instanceof NlsString;
NlsString nlsString = (NlsString) o;
- if ((nlsString.getCollation() == null)
- || (nlsString.getCharset() == null)) {
- assert type.getSqlTypeName() == SqlTypeName.CHAR;
+ if (nlsString.getCollation() == null
+ || nlsString.getCharset() == null
+ || !nlsString.getCharset().equals(type.getCharset())
+ || !nlsString.getCollation().equals(type.getCollation())) {
Review comment:
Collations are compared with equals but is the `Comparator` part of the
`equals`? Should it be?
##########
File path: core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java
##########
@@ -730,6 +743,11 @@ public static boolean le(String b0, String b1) {
return b0.compareTo(b1) <= 0;
}
+ /** SQL <code>≤</code> operator applied to String values. */
Review comment:
Possibly we could deprecate/remove the old methods (e.g., `le(String b0,
String b1)` and always use the new alternatives with the `Comparator`. When we
don't have one we could use `String::compareTo` or `Comparator.naturalOrder` to
have the same behavior as before.
##########
File path: core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java
##########
@@ -638,6 +639,13 @@ public static boolean eq(Object b0, Object b1) {
return b0.equals(b1);
}
+ /** SQL <code>=</code> operator applied to Object values (including String;
+ * neither side may be null). */
+ public static boolean eq(Object b0, Object b1, Comparator comparator) {
Review comment:
Normally we should call this method only with parameters of type
`String`. Why do we need `Object`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]