michaeljmarshall commented on code in PR #4353:
URL: https://github.com/apache/cassandra/pull/4353#discussion_r2688233715
##########
src/java/org/apache/cassandra/index/sai/plan/QueryController.java:
##########
@@ -171,6 +175,39 @@ public UnfilteredRowIterator queryStorage(List<PrimaryKey>
keys, ReadExecutionCo
return partition.queryMemtableAndDisk(cfs, executionController);
}
+ /**
+ * Get an iterator over the rows for this partition key. Restrict the search to the specified view.
+ * @param key
+ * @param executionController
+ * @return
+ */
+ public UnfilteredRowIterator queryStorage(PrimaryKey key, ColumnFamilyStore.ViewFragment view, ReadExecutionController executionController)
+ {
+ if (key == null)
+ throw new IllegalArgumentException("non-null key required");
+
+ // TODO how do we want to handle static rows?
Review Comment:
Looks like this PR fails if the indexed column is static. Here is a minimal
reproducer:
```java
@Test
public void testStaticVectorColumnIndex() throws Throwable
{
    createTable("CREATE TABLE %s (pk int, ck int, val vector<float, 2> static, PRIMARY KEY(pk, ck))");
    createIndex("CREATE CUSTOM INDEX ON %s(val) USING 'StorageAttachedIndex'");
    execute("INSERT INTO %s (pk, ck, val) VALUES (0, 1, [1,0])");
    execute("INSERT INTO %s (pk, ck) VALUES (0, 2)");
    execute("INSERT INTO %s (pk, ck, val) VALUES (1, 3, [0,1])");
    beforeAndAfterFlush(() -> {
        Object[][] rows = rows(row(1), row(2));
        assertRows(execute("SELECT ck FROM %s ORDER BY val ANN OF [1,0] LIMIT 2"), rows);
    });
}
```
The failure comes from this assertion in the calling code:
```
assert !clusters.hasNext() : "Expected only one row per partition";
```
If I remove that assertion, I find that the `isIndexDataValid` check seems
incorrect for this case:
```java
// If the row is static and the column is not static, or vice versa, the indexed value won't be present so we
// don't need to check if live data matches indexed data.
if (row.isStatic() != columnMetadata.isStatic())
    return true;
```
Next, there is a second issue: we don't apply the row transformer to the
`StaticRow`, so we cannot use `CellWithSource` for it.
I am considering two options:
1. Find a way to wrap rows and then cells (might be unnecessarily invasive)
2. Add logic to compute the vector similarity score, then consider a row
valid if its score is the same or better.
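
To make option 2 a bit more concrete, here is a rough sketch of the check I have in mind. It is only an illustration: `StaticRowScoreCheck`, `cosine`, and `isStillValid` are hypothetical names and not existing Cassandra APIs, and it assumes the index uses plain cosine similarity and that an exact `>=` comparison (no tolerance) is acceptable.

```java
/**
 * Hypothetical helper sketching option 2. Not part of Cassandra or of this PR.
 */
public final class StaticRowScoreCheck
{
    private StaticRowScoreCheck() {}

    /** Cosine similarity between two equal-length, non-zero vectors. */
    public static float cosine(float[] a, float[] b)
    {
        if (a.length != b.length)
            throw new IllegalArgumentException("dimension mismatch");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (float) (dot / Math.sqrt(normA * normB));
    }

    /**
     * Treat the index entry as still valid when the live vector scores the
     * same or better against the query than the score the index reported.
     * A missing live vector (e.g. the value was deleted) is never valid.
     */
    public static boolean isStillValid(float[] liveVector, float[] queryVector, float indexedScore)
    {
        if (liveVector == null)
            return false;
        return cosine(liveVector, queryVector) >= indexedScore;
    }
}
```

With the reproducer's data, a live static value of `[1,0]` queried with `[1,0]` would pass for any indexed score up to 1.0, while a partition whose static value was overwritten with `[0,1]` would be rejected for an indexed score of 1.0.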
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]