Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

via GitHub Mon, 09 Sep 2024 23:15:13 -0700


mihailom-db commented on code in PR #47364:
URL: https://github.com/apache/spark/pull/47364#discussion_r1751315215



##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -923,4 +1063,13 @@ public static String getClosestSuggestionsOnInvalidName(
 
     return String.join(", ", suggestions);
   }
+
+  public static List<CollationIdentifier> listCollations(String catalog, 
String schema) {
+    return Collation.CollationSpec.listCollations(catalog, schema);

Review Comment:
   I would maybe say lets keep catalog and schema outside of CollationFactory. 
How I see it is that we have sessionCatalog which has some catalog/schema 
defined and it is listing collations that it has under it. This way we keep 
Collations simple objects that only depend on the actual info they need to be 
constructed.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala:
##########
@@ -1112,4 +1112,29 @@ class SparkSqlAstBuilder extends AstBuilder {
       withIdentClause(ctx.identifierReference(), UnresolvedNamespace(_)),
       cleanedProperties)
   }
+
+  /**
+   * Create a [[ShowCollationsCommand]] command.
+   * Expected format:
+   * {{{
+   *   SHOW identifier? COLLATIONS ((FROM | IN) ns=identifierReference)? 
(LIKE? pattern=stringLit);
+   * }}}
+   */
+  override def visitShowCollations(ctx: ShowCollationsContext): LogicalPlan = 
withOrigin(ctx) {
+    val ns = if (ctx.ns != null) {
+      withIdentClause(ctx.ns, UnresolvedNamespace(_))
+    } else {
+      CurrentNamespace
+    }
+    val (userScope, systemScope) = Option(ctx.identifier)
+      .map(_.getText.toLowerCase(Locale.ROOT)) match {
+      case None | Some("all") => (true, true)
+      case Some("system") => (false, true)
+      case Some("user") =>
+        throw QueryParsingErrors.showCollationsUnsupportedError("user", 
ctx.identifier())

Review Comment:
   Could we maybe remove this scoping totally and just return the current 
namespace? I would argue for now it is not necessary to allow any additional 
feature that we do not actually need or have implemented. So simple SHOW 
COLLATIONS (LIKE pattern)? should be enough and we can just keep 
ShowCollationsCommand to use CurrentNamespace, userScope=false, 
systemScope=true. What do you think about that? This way I would say we are 
keeping the idea of what we will do when user defined collations are in, but 
will not pollute users with unnecessary errors.



##########
common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java:
##########
@@ -704,6 +809,39 @@ private String collationName() {
         }
         return builder.toString();
       }
+
+      private static List<String> allCollationNames() {
+        List<String> collationNames = new ArrayList<>();
+        for (String locale: ICULocaleToId.keySet()) {
+          // CaseSensitivity.CS + AccentSensitivity.AS
+          collationNames.add(locale);
+          // CaseSensitivity.CI + AccentSensitivity.AS
+          collationNames.add(locale + "_CI");
+          // CaseSensitivity.CS + AccentSensitivity.AI
+          collationNames.add(locale + "_AI");

Review Comment:
   nit: order these alphabetically first _AI then _CI



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-48906][SQL] Introduce `SHOW COLLATIONS LIKE ...` syntax to show all collations [spark]

Reply via email to