michaeljmarshall commented on code in PR #4353:
URL: https://github.com/apache/cassandra/pull/4353#discussion_r2688233715


##########
src/java/org/apache/cassandra/index/sai/plan/QueryController.java:
##########
@@ -171,6 +175,39 @@ public UnfilteredRowIterator queryStorage(List<PrimaryKey> keys, ReadExecutionCo
         return partition.queryMemtableAndDisk(cfs, executionController);
     }
 
+    /**
+     * Get an iterator over the rows for this partition key. Restrict the search to the specified view.
+     * @param key
+     * @param executionController
+     * @return
+     */
+    public UnfilteredRowIterator queryStorage(PrimaryKey key, ColumnFamilyStore.ViewFragment view, ReadExecutionController executionController)
+    {
+        if (key == null)
+            throw new IllegalArgumentException("non-null key required");
+
+        // TODO how do we want to handle static rows?

Review Comment:
   Looks like this PR fails if the indexed column is static. Here is a minimal 
reproducer:
   
   ```java
       @Test
       public void testStaticVectorColumnIndex() throws Throwable
       {
           createTable("CREATE TABLE %s (pk int, ck int, val vector<float, 2> static, PRIMARY KEY(pk, ck))");
           createIndex("CREATE CUSTOM INDEX ON %s(val) USING 'StorageAttachedIndex'");
   
           execute("INSERT INTO %s (pk, ck, val) VALUES (0, 1, [1,0])");
           execute("INSERT INTO %s (pk, ck)      VALUES (0, 2)");
           execute("INSERT INTO %s (pk, ck, val) VALUES (1, 3, [0,1])");
   
           beforeAndAfterFlush(() -> {
               Object[][] rows = rows(row(1), row(2));
               assertRows(execute("SELECT ck FROM %s ORDER BY val ANN OF [1,0] LIMIT 2"), rows);
           });
       }
   ```
   
   The failure comes from this assertion in the calling code:
   
   ```
   assert !clusters.hasNext() : "Expected only one row per partition";
   ```
   
   If I remove that assertion, the next problem is that the 
`isIndexDataValid` check looks incorrect here:
   
   ```java
           // If the row is static and the column is not static, or vice versa, the indexed value won't be present so we
           // don't need to check if live data matches indexed data.
           if (row.isStatic() != columnMetadata.isStatic())
               return true;
   ```
   
   Next, we don't apply the row transformer to the `StaticRow`, so we cannot 
use `CellWithSource` for static columns.
   
   I am considering two options:
   1. Find a way to wrap rows and then cells (might be unnecessarily invasive)
   2. Add logic to compute the vector similarity score, then consider a row 
valid if the score is the same or better.
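
   To make option 2 concrete, here is a rough, standalone sketch of the 
comparison I have in mind. `StaticRowScoreCheck`, `cosine`, and 
`isScoreStillValid` are hypothetical names, not existing SAI code, and cosine 
similarity is only an assumption for illustration; the real check would use 
whatever similarity function the index is configured with.

   ```java
   // Hypothetical sketch, not existing Cassandra code.
   public final class StaticRowScoreCheck
   {
       // Cosine similarity between two vectors of equal length (assumed metric).
       // Returns NaN if either vector has zero magnitude.
       static double cosine(float[] a, float[] b)
       {
           double dot = 0, normA = 0, normB = 0;
           for (int i = 0; i < a.length; i++)
           {
               dot += a[i] * b[i];
               normA += a[i] * a[i];
               normB += b[i] * b[i];
           }
           return dot / (Math.sqrt(normA) * Math.sqrt(normB));
       }

       // Treat the live row as valid when its score against the query vector
       // is the same as or better than the score recorded for the index hit.
       static boolean isScoreStillValid(float[] query, float[] liveVector, double indexedScore)
       {
           return cosine(query, liveVector) >= indexedScore;
       }
   }
   ```

   With the vectors from the reproducer, a live value of `[0,1]` would be 
rejected for an index hit that scored as `[1,0]` against the query `[1,0]`, 
while an unchanged `[1,0]` would be accepted.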


