adelapena commented on code in PR #2673:
URL: https://github.com/apache/cassandra/pull/2673#discussion_r1362397268
##########
src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java:
##########
@@ -274,10 +292,54 @@ else if (relation.isLIKE())
throw invalidRequest("Non PRIMARY KEY columns found in where
clause: %s ",
Joiner.on(",
").join(nonPrimaryKeyColumns));
}
+
+ var annRestriction =
Streams.stream(nonPrimaryKeyRestrictions).filter(SingleRestriction::isANN).findFirst();
+ if (annRestriction.isPresent())
+ {
+ // If there is an ANN restriction then it must be for a
vector<float, n> column, and it must have an index
+ var annColumn = annRestriction.get().getFirstColumn();
+
+ if (!annColumn.type.isVector() ||
!(((VectorType<?>)annColumn.type).elementType instanceof FloatType))
+ throw
invalidRequest(StatementRestrictions.ANN_ONLY_SUPPORTED_ON_VECTOR_MESSAGE);
+ if (indexRegistry == null ||
indexRegistry.listIndexes().stream().noneMatch(i -> i.dependsOn(annColumn)))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEX_MESSAGE);
+ // We do not allow ANN queries using partition key
restrictions that need filtering
+ if (partitionKeyRestrictions.needFiltering(table))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEXED_FILTERING_MESSAGE);
+ // We do not allow ANN query filtering using non-indexed
columns
+ var nonAnnColumns =
Streams.stream(nonPrimaryKeyRestrictions).filter(r -> !r.isANN()).map(r ->
r.getFirstColumn()).collect(Collectors.toList());
Review Comment:
```suggestion
var nonAnnColumns =
Streams.stream(nonPrimaryKeyRestrictions).filter(r ->
!r.isANN()).map(Restriction::getFirstColumn).collect(Collectors.toList());
```
##########
src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java:
##########
@@ -274,10 +292,54 @@ else if (relation.isLIKE())
throw invalidRequest("Non PRIMARY KEY columns found in where
clause: %s ",
Joiner.on(",
").join(nonPrimaryKeyColumns));
}
+
+ var annRestriction =
Streams.stream(nonPrimaryKeyRestrictions).filter(SingleRestriction::isANN).findFirst();
+ if (annRestriction.isPresent())
+ {
+ // If there is an ANN restriction then it must be for a
vector<float, n> column, and it must have an index
+ var annColumn = annRestriction.get().getFirstColumn();
+
+ if (!annColumn.type.isVector() ||
!(((VectorType<?>)annColumn.type).elementType instanceof FloatType))
+ throw
invalidRequest(StatementRestrictions.ANN_ONLY_SUPPORTED_ON_VECTOR_MESSAGE);
+ if (indexRegistry == null ||
indexRegistry.listIndexes().stream().noneMatch(i -> i.dependsOn(annColumn)))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEX_MESSAGE);
+ // We do not allow ANN queries using partition key
restrictions that need filtering
+ if (partitionKeyRestrictions.needFiltering(table))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEXED_FILTERING_MESSAGE);
+ // We do not allow ANN query filtering using non-indexed
columns
+ var nonAnnColumns =
Streams.stream(nonPrimaryKeyRestrictions).filter(r -> !r.isANN()).map(r ->
r.getFirstColumn()).collect(Collectors.toList());
+ var clusteringColumns =
clusteringColumnsRestrictions.getColumnDefinitions();
+ if (!nonAnnColumns.isEmpty() || !clusteringColumns.isEmpty())
+ {
+ var nonIndexedColumns =
Stream.concat(nonAnnColumns.stream(), clusteringColumns.stream())
+ .filter(c -> indexRegistry
== null || indexRegistry.listIndexes()
+
.stream()
+
.noneMatch(i -> i.dependsOn(c)))
+
.collect(Collectors.toList());
+ if (!nonIndexedColumns.isEmpty())
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEXED_FILTERING_MESSAGE);
Review Comment:
If we have the following schema:
```
CREATE TABLE %s.t (k int, c int, v vector<float, 1>, PRIMARY KEY(k, c))
CREATE CUSTOM INDEX ON %s.t(v) USING 'StorageAttachedIndex'
```
we can add a non-indexed restriction for the partition key, for example:
```
SELECT * FROM %s.t WHERE k=0 ORDER BY v ANN OF [9] LIMIT 10
```
however, this prevents us from using a clustering prefix, even if that
doesn't require `ALLOW FILTERING`:
```
SELECT * FROM %s.t WHERE k=0 AND c>2 ORDER BY v ANN OF [9] LIMIT 10
```
Is this correct? Does the additional clustering require post-filtering?
##########
src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java:
##########
@@ -274,10 +292,54 @@ else if (relation.isLIKE())
throw invalidRequest("Non PRIMARY KEY columns found in where
clause: %s ",
Joiner.on(",
").join(nonPrimaryKeyColumns));
}
+
+ var annRestriction =
Streams.stream(nonPrimaryKeyRestrictions).filter(SingleRestriction::isANN).findFirst();
+ if (annRestriction.isPresent())
+ {
+ // If there is an ANN restriction then it must be for a
vector<float, n> column, and it must have an index
+ var annColumn = annRestriction.get().getFirstColumn();
+
+ if (!annColumn.type.isVector() ||
!(((VectorType<?>)annColumn.type).elementType instanceof FloatType))
+ throw
invalidRequest(StatementRestrictions.ANN_ONLY_SUPPORTED_ON_VECTOR_MESSAGE);
+ if (indexRegistry == null ||
indexRegistry.listIndexes().stream().noneMatch(i -> i.dependsOn(annColumn)))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEX_MESSAGE);
+ // We do not allow ANN queries using partition key
restrictions that need filtering
+ if (partitionKeyRestrictions.needFiltering(table))
+ throw
invalidRequest(StatementRestrictions.ANN_REQUIRES_INDEXED_FILTERING_MESSAGE);
+ // We do not allow ANN query filtering using non-indexed
columns
+ var nonAnnColumns =
Streams.stream(nonPrimaryKeyRestrictions).filter(r -> !r.isANN()).map(r ->
r.getFirstColumn()).collect(Collectors.toList());
+ var clusteringColumns =
clusteringColumnsRestrictions.getColumnDefinitions();
+ if (!nonAnnColumns.isEmpty() || !clusteringColumns.isEmpty())
+ {
+ var nonIndexedColumns =
Stream.concat(nonAnnColumns.stream(), clusteringColumns.stream())
+ .filter(c -> indexRegistry
== null || indexRegistry.listIndexes()
Review Comment:
Nit: at this point `indexRegistry` cannot be null (in that case we throw
`ANN_REQUIRES_INDEX_MESSAGE` a few lines above)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]