[GitHub] [cassandra] adelapena commented on a change in pull request #1141: CASSANDRA-15135 - validate sasi analyzer

GitBox Wed, 18 Aug 2021 09:38:19 -0700


adelapena commented on a change in pull request #1141:
URL: https://github.com/apache/cassandra/pull/1141#discussion_r690552925




##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/AbstractAnalyzer.java
##########
@@ -38,6 +39,10 @@ public void remove()
         throw new UnsupportedOperationException();
     }
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException

Review comment:
       Nit: can we use `AbstractType<?>`?

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzer.java
##########
@@ -56,14 +57,16 @@ public ByteBuffer next()
         return iter.next();
     }
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException

Review comment:
       Nit: can we mark the method with `@Override`?

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/NonTokenizingAnalyzer.java
##########
@@ -56,6 +57,14 @@
     private ByteBuffer input;
     private boolean hasNext = false;
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException
+    {
+        if (options.containsKey(NonTokenizingOptions.CASE_SENSITIVE) && 
(options.containsKey(NonTokenizingOptions.NORMALIZE_LOWERCASE)
+                                                       || 
options.containsKey(NonTokenizingOptions.NORMALIZE_UPPERCASE)))
+            throw new ConfigurationException("case_sensitive option cannot be 
specified together " +
+                                               "with either 
normalize_lowercase or normalize_uppercase");

Review comment:
       Nit: maybe we can try to change the original formatting to fit into the 
line length:
   ```suggestion
                   if (options.containsKey(NonTokenizingOptions.CASE_SENSITIVE) 
&&
               (options.containsKey(NonTokenizingOptions.NORMALIZE_LOWERCASE) ||
                options.containsKey(NonTokenizingOptions.NORMALIZE_UPPERCASE)))
               throw new ConfigurationException("case_sensitive option cannot 
be specified together " +
                                                "with either 
normalize_lowercase or normalize_uppercase");
   ```

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/NonTokenizingAnalyzer.java
##########
@@ -56,6 +57,14 @@
     private ByteBuffer input;
     private boolean hasNext = false;
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException

Review comment:
       Nit: can we mark the method with `@Override`?

##########
File path: test/unit/org/apache/cassandra/index/sasi/SASIIndexTest.java
##########
@@ -2484,6 +2484,40 @@ public void testAnalyzerValidation()
         });
     }
 
+    @Test
+    public void testIllegalArgumentsForAnalyzerShouldFail()
+    {
+        String baseTable = "illegal_argument_test";
+        String indexName = "illegal_index";
+        QueryProcessor.executeOnceInternal(String.format("CREATE KEYSPACE IF 
NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'}", KS_NAME));
+        QueryProcessor.executeOnceInternal(String.format("CREATE TABLE IF NOT 
EXISTS %s.%s (k int primary key, v text);", KS_NAME, baseTable));
+
+        try
+        {
+            QueryProcessor.executeOnceInternal(String.format("CREATE CUSTOM 
INDEX IF NOT EXISTS %s ON %s.%s(v) " +
+                            "USING 'org.apache.cassandra.index.sasi.SASIIndex' 
WITH OPTIONS = { 'mode' : 'CONTAINS', " +
+                            "'analyzer_class': 
'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer', " +
+                            "'case_sensitive': 'false'," +
+                            "'normalize_uppercase': 'true'};",
+                    indexName, KS_NAME, baseTable));
+
+            Assert.fail("creation of index analyzer with illegal options 
should fail");
+        }
+        catch (ConfigurationException e)
+        {
+            //correct behaviour
+            //confirm that it wasn't written to the schema
+            
Assert.assertTrue(QueryProcessor.executeOnceInternal(String.format("SELECT * 
FROM system_schema.indexes WHERE keyspace_name = '%s' " +
+                    "and table_name = '%s' and index_name = '%s';", KS_NAME, 
baseTable, indexName)).isEmpty());

Review comment:
       Nit: not a big deal, but we could prevent the NPE warning for example 
using `assertThat`:
   ```suggestion
               String query = String.format("SELECT * FROM 
system_schema.indexes WHERE keyspace_name = '%s' " +
                                            "AND table_name = '%s' AND 
index_name = '%s'", 
                                            KS_NAME, baseTable, indexName);
               
Assertions.assertThat(QueryProcessor.executeOnceInternal(query)).isEmpty();
   ```

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzer.java
##########
@@ -56,14 +57,16 @@ public ByteBuffer next()
         return iter.next();
     }
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException
+    {
+        if (!VALID_ANALYZABLE_TYPES.containsKey(validator))
+            throw new ConfigurationException(String.format("Only text types 
supported, got %s", validator));

Review comment:
       I think this error is unreachable. The new `AbstractAnalyzer#validate` 
method is called exclusively from `IndexMode#validateAnalyzer`, which already 
validates the type with a call to `AbstractAnalyzer#isCompatibleWith`. 
   
   This makes me thing that probably we don't need the cell type argument on 
`AbstractAnalyzer#validate`, or alternatively we could consider consider making 
`AbstractAnalyzer#validate` consistently responsible for checking the cell type 
so `IndexMode#validateAnalyzer` doesn't have to call 
`AbstractAnalyzer#isCompatibleWith`. wdyt?

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzer.java
##########
@@ -56,14 +57,16 @@ public ByteBuffer next()
         return iter.next();
     }
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException
+    {
+        if (!VALID_ANALYZABLE_TYPES.containsKey(validator))
+            throw new ConfigurationException(String.format("Only text types 
supported, got %s", validator));

Review comment:
       I missed that one. However, I still don't see a use case where that 
second configuration exception could be raised, since SASI doesn't support 
non-frozen collections or UDTs. If the exception where reachable I wonder if we 
would need a similar check in other text analyzer, and that would make the case 
to pass the cell value type, or the entire column definition, to 
`AbstractAnalyzer#isCompatibleWith`.
   I might be missing something obvious, but I think that this check is a 
remanent from what there was before creating the `isCompatibeWith` method 
during CASSANDRA-13669. wdyt?

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzer.java
##########
@@ -100,11 +108,11 @@ protected ByteBuffer computeNext() {
         };
     }
 
-
+    @Override
     public boolean isTokenizing()
     {
         return true;
-    }
+        }

Review comment:
       Nit: misaligned

##########
File path: 
test/unit/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzerTest.java
##########
@@ -81,10 +83,11 @@ public void testBlankEntries() throws Exception
         Assert.assertFalse(testString.toLowerCase().equals(output.toString()));
     }
 
-    @Test(expected = IllegalArgumentException.class)
+    @Test(expected = ConfigurationException.class)
     public void ensureIncompatibleInputSkipped() throws Exception

Review comment:
       Nit: doesn't need the `throws Exception`

##########
File path: 
test/unit/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzerTest.java
##########
@@ -81,10 +83,11 @@ public void testBlankEntries() throws Exception
         Assert.assertFalse(testString.toLowerCase().equals(output.toString()));
     }
 
-    @Test(expected = IllegalArgumentException.class)
+    @Test(expected = ConfigurationException.class)
     public void ensureIncompatibleInputSkipped() throws Exception
     {
-        new DelimiterAnalyzer().init(new HashMap(), Int32Type.instance);
+        new DelimiterAnalyzer().validate(new HashMap<>(),

Review comment:
       Nit: we can use `Collections.emptyMap()`

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/AbstractAnalyzer.java
##########
@@ -38,15 +40,23 @@ public void remove()
         throw new UnsupportedOperationException();
     }
 
-    public abstract void init(Map<String, String> options, AbstractType 
validator);
+    public void validate(Map<String, String> options, ColumnDefinition cd) 
throws ConfigurationException
+    {
+        if (!isCompatibleWith(cd.type))
+            throw new ConfigurationException(String.format("%s does not 
support type %s",
+                                                           
this.getClass().getSimpleName(),
+                                                           
cd.type.asCQL3Type()));
+    }
+
+    public abstract void init(Map<String, String> options, AbstractType<?> 
validator);
 
     public abstract void reset(ByteBuffer input);
 
     /**
      * Test whether the given validator is compatible with the underlying 
analyzer.
      *
-     * @param validator
-     * @return
+     * @param validator the validator to test the compatibility with
+     * @return true if the give validator is compatible, false otherwise

Review comment:
       I think that `isCompatibleWith` can be `protected`, since now it's only 
called from `validate`.

##########
File path: 
src/java/org/apache/cassandra/index/sasi/analyzer/DelimiterAnalyzer.java
##########
@@ -56,14 +57,16 @@ public ByteBuffer next()
         return iter.next();
     }
 
+    public void validate(Map<String, String> options, AbstractType validator) 
throws ConfigurationException
+    {
+        if (!VALID_ANALYZABLE_TYPES.containsKey(validator))
+            throw new ConfigurationException(String.format("Only text types 
supported, got %s", validator));

Review comment:
       I missed that one. However, I still don't see a use case where that 
second configuration exception could be raised, since SASI doesn't support 
non-frozen collections or UDTs. If the exception where reachable I wonder if we 
would need a similar check in other analyzers, and that would make the case to 
pass the cell value type, or the entire column definition, to 
`AbstractAnalyzer#isCompatibleWith`.
   I might be missing something obvious, but I think that this check is a 
remanent from what there was before creating the `isCompatibeWith` method 
during CASSANDRA-13669. wdyt?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [cassandra] adelapena commented on a change in pull request #1141: CASSANDRA-15135 - validate sasi analyzer

Reply via email to