Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce merged PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2574488949 Thanks! Squashed. @absurdfarce -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2573631273 DS Jenkins looks good. There was one test failure which looks to be a test issue (ccm timeout on startup) so I'm calling this good. @SiyaoIsHiding can you squash this down to a single commit so we can get this in? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2572275863 No outstanding item here, this PR is pending for reviews. @absurdfarce I've updated the PR. May you please give an approval? @michaelsembwever It seems to me that you have some familiarity to this PR. Would you have time for a review as the second +1? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2572242950 @michaelsembwever Short version is that this is basically done; we've essentially been waiting on a second +1 to move this one forward. Only wrinkle there is that @SiyaoIsHiding did change a couple things recently in relation to work she's been doing on the PR for JAVA-3143... but those changes should be relatively small. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
michaelsembwever commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2571548807 is there outstanding items or more work to do here ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1896834284 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java: ## @@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde return orderBy(CqlIdentifier.fromCql(columnName), order); } + /** + * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an ORDER BY ... ANN OF ... + * clause + */ + @NonNull + Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann); + + /** Adds the ORDER BY ... ANN OF ... clause */ + @NonNull + Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann); Review Comment: @SiyaoIsHiding and I discussed in a meeting today. Both approaches have arguments for them but in the end we decided to go with CqlVector to get this moving forward and merged. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1876495013 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java: ## @@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde return orderBy(CqlIdentifier.fromCql(columnName), order); } + /** + * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an ORDER BY ... ANN OF ... + * clause + */ + @NonNull + Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann); + + /** Adds the ORDER BY ... ANN OF ... clause */ + @NonNull + Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann); Review Comment: @absurdfarce bumping this discussion -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2501931783 Thank you for testing the downstream too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
michaelsembwever commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500235548 > Are you sure you are using my PR version? That full class name looks like an outdated 4.x branch install before this commit: e843786 My bad, my checkout was still on 0a36ff069 Thanks for being patient with me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500136964 And it seems like our CI isn't configured with credentials. It met GitHub quota limit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500134628 Are you sure you are using my PR version? That full class name looks like an outdated 4.x branch install before this commit: https://github.com/apache/cassandra-java-driver/commit/e84378681aecdbb87696dc4b53cb6fd336c82b6b -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
michaelsembwever commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500052188 `createTable.asCql()` is ``` CREATE TABLE IF NOT EXISTS test_springframework.ai_vector_store (id text PRIMARY KEY,content text,embedding 'org.apache.cassandra.db.marshal.VectorType(384)',country text,year smallint,meta1 text,meta2 text) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1858081592 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java: ## @@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde return orderBy(CqlIdentifier.fromCql(columnName), order); } + /** + * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an ORDER BY ... ANN OF ... + * clause + */ + @NonNull + Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann); + + /** Adds the ORDER BY ... ANN OF ... clause */ + @NonNull + Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann); Review Comment: Getting back to this conversation because the current implementation will result in a compilation error when we extend vector support to arbitrary subtype. ```java public Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann) { return withAnn(new Ann(columnId, ann)); } ``` The argument we get is a `CqlVector` , but the `Ann` constructor wants a `CqlVector`. Turns out compilation error. We either: 1. change Ann constructor to `CqlVector`, or 2. change `orderByAnnOf` API to `CqlVector`. I think option 2 makes more sense. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
michaelsembwever commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500013620 > I just realized our CI broke. Can any one of you committers click the button to run tests for this PR please? done: https://ci-cassandra.apache.org/job/cassandra-java-driver/job/PR-1931/9/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2499790028 Solved merge conflict and updated doc. I just realized our CI broke. Can any one of you committers click the button to run tests for this PR please? Here: https://ci-cassandra.apache.org/blue/organizations/jenkins/cassandra-java-driver/detail/PR-1931/8/pipeline -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2499382006 @michaelsembwever Can you see the debug log from this line? ```java logger.debug("Executing {}", createTable.asCql()); ``` I wonder what `asCql()` outputs. I ran your code with substituted values ```java assertThat(SchemaBuilder.createTable("ks", "mytb") .ifNotExists() .withPartitionKey("k", DataTypes.INT) .withClusteringColumn("v", DataTypes.TEXT) .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, 3))) .hasCql("CREATE TABLE IF NOT EXISTS ks.mytb (k int,v text,embedding vector,PRIMARY KEY(k,v))"); ``` And this will pass. I ran the cql statement in cqlsh, it turns out no problem, either. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
michaelsembwever commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2497870781 Testing this with https://github.com/spring-projects/spring-ai/pull/1817 I'm getting from [this code](https://github.com/spring-projects/spring-ai/pull/1817/files#diff-e3136d7ca72a9c5b17e14f161279f4adbe68d694f35bc0eb8ba04afac0042999R237): ``` SchemaBuilder.createTable(keyspace, table) .ifNotExists() .withPartitionKey(partitionKey.name, partitionKey.type) .withClusteringColumn(clusteringKey.name, clusteringKey.type) .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, vectorDimension)) .build(); ``` this failure: ``` com.datastax.oss.driver.api.core.servererrors.SyntaxError: Error setting type org.apache.cassandra.db.marshal.VectorType(384): Invalid definition for comparator org.apache.cassandra.db.marshal.VectorType. at com.datastax.oss.driver.api.core.servererrors.SyntaxError.copy(SyntaxError.java:50) at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:151) at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:55) at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:32) at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:234) at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:56) at org.springframework.ai.vectorstore.CassandraVectorStoreConfig.ensureTableExists(CassandraVectorStoreConfig.java:244) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2494066507 @SiyaoIsHiding I'm okay with the idea behind these changes but we do need to fix a few things before this is merge-ready. Specifically: * The branch conflicts discussed above need to be resolved... I'm guessing those are largely from CASSJAVA-2 * Let's also add the doc updates referenced above Once these fixes are in (and we get a second +1) I think we're okay to move forward. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2493987713 Raised by another user in separate communication: we should add some updates to the documentation reflecting these new features. Specifically we should update relevant areas of the [query_builder section](https://github.com/apache/cassandra-java-driver/tree/4.x/manual/query_builder) of the manual. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2404243257 I'm not 💯 sure what the accessors on DefaultSelect are intended to do but I suppose it does make sense to keep it consistent and add an accessor for the Ann object. Either way I'm satisfied with where this stands now... 👍 from me! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
SiyaoIsHiding commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2403789196 I assume @lukasz-antoniak comment [here](https://github.com/apache/cassandra-java-driver/pull/1952/files#r1793979356) about adding `getAnn()` in `DefaultSelect` should be addressed here. Thank you for your suggestion! It's added. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1794308525 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java: ## @@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde return orderBy(CqlIdentifier.fromCql(columnName), order); } + /** + * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an ORDER BY ... ANN OF ... + * clause + */ + @NonNull + Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann); + + /** Adds the ORDER BY ... ANN OF ... clause */ + @NonNull + Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann); Review Comment: I think I'm changing my answer on this: we should remove the type bound here and just make this a `CqlVector`. My rationale goes as follows. The fact that ANN comparisons only support float vectors actually isn't a constraint on the underlying Java type used here. In theory _any_ Java type whose codec will generate a serialized value that _can be understood as a float_ when the server receives the message would work here. That means we could pass in a CqlVector of floats, decimals or doubles and still have the server handle that without issue. I'll also note that there's no common supertype for Float, BigDecimal or Double (the corresponding Java types based on the [docs](https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core#cql-to-java-type-mapping)) other than Number... and we can't use that here because it includes types other than these three. So there's no meaningful supertype we can use for a type bound at this point. A second point: the query builder is built around the notion of generating CQL based on whatever the user passes in; there's almost no checking whether types correlate to things the user specified. So, for example, OngoingValues doesn't have any kind of type bounds around it's various value() methods. I accept that this isn't _exactly_ the same as the float constraint we're discussing here... but it _is_ an indication that the query builder largely doesn't concern itself with type checking with the expectation that the server will handle that when it receives the query. Finally, I'll note that avoiding the type bound here at least exposes the idea (at an API level) of using external/custom types which serialize to CQL float values by way of their own codecs. Scala users, for example, might want to support using [Spire](https://typelevel.org/spire/) numerics for their CQL queries. This actually won't work for now since we [constrain the codec registry used](https://github.com/apache/cassandra-java-driver/pull/1931/files#diff-77b4c675f8250e6cac9db1ded659755f046b725a13f30d68dc7aa4a897c78bd1R447) to the (immutable) default which doesn't include any support for Spire types... but that's an implementation detail. By avoiding type bounds at an API level we've left that door open for such a change _without_ having to make a (potentially breaking) API change and without sacrificing our design principles elsewhere. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1794259046 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/QueryBuilder.java: ## @@ -538,4 +541,12 @@ public static Truncate truncate(@Nullable String keyspace, @NonNull String table return truncate( keyspace == null ? null : CqlIdentifier.fromCql(keyspace), CqlIdentifier.fromCql(table)); } + + public static Ann annOf(@NonNull CqlIdentifier cqlIdentifier, @NonNull CqlVector vector) { +return new DefaultAnn(cqlIdentifier, vector); + } + + public static Ann annOf(@NonNull String cqlIdentifier, @NonNull CqlVector vector) { +return new DefaultAnn(CqlIdentifier.fromCql(cqlIdentifier), vector); + } Review Comment: Nice catch! Yes, you're correct; Apache Cassandra 5.0.x supports _vectors_ of any subtype _as a type_ but the ANN index used there only supports floats. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2402801009 Good call on the similarity_* functions @SiyaoIsHiding! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2364736350 One other thing worth mentioning: the Cassandra impl also supports a way to get "the similarity calculation of the best scoring node closest to the query data as part of the results". Take a look at the similarity_dot_product() function (and the other choices as well) in the [relevant Cassandra docs](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql). The query builder should have support for those as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org For additional commands, e-mail: pr-h...@cassandra.apache.org
Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]
absurdfarce commented on code in PR #1931: URL: https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1769320442 ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java: ## @@ -146,6 +146,8 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde return orderBy(CqlIdentifier.fromCql(columnName), order); } + @NonNull + Select orderBy(@NonNull Ann ann); Review Comment: Maybe I'm missing something but it seems more natural to support something like the following: ``` Select orderByAnnOf(CqlIdentifier columnId, CqlVector ann); Select orderByAnnOf(String columnName, CqlVector ann); ``` Advantage is that with this approach you don't even need to introduce an Ann type... which kinda seems right as that type isn't really doing much for you here. You could also perhaps add a notion of type checking the specified column to make sure it's a vector type (and to make sure it matches the type of the input CqlVector). To the point made by @lukasz-antoniak above we could add directionality here (and throw warnings if the user tries to use a DESC order before there's server-side support for it) but I'm not sure it's worth it. There's no mention of ordering in the [relevant Cassandra docs](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql) so my intuition says to just leave it out for now and add it when it becomes more of a thing. ## query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/schema/AlterTableTest.java: ## @@ -108,4 +108,15 @@ public void should_generate_alter_table_with_no_compression() { assertThat(alterTable("bar").withNoCompression()) .hasCql("ALTER TABLE bar WITH compression={'sstable_compression':''}"); } + + @Test + public void should_generate_alter_table_with_vector() { +assertThat( +alterTable("bar") +.alterColumn( +"v", +DataTypes.custom( + "org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType,3)"))) Review Comment: ``` DataTypes.custom(DataTypes.vectorOf(DataTypes.Float,3)) ``` ## query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/QueryBuilder.java: ## @@ -538,4 +541,12 @@ public static Truncate truncate(@Nullable String keyspace, @NonNull String table return truncate( keyspace == null ? null : CqlIdentifier.fromCql(keyspace), CqlIdentifier.fromCql(table)); } + + public static Ann annOf(@NonNull CqlIdentifier cqlIdentifier, @NonNull CqlVector vector) { +return new DefaultAnn(cqlIdentifier, vector); + } + + public static Ann annOf(@NonNull String cqlIdentifier, @NonNull CqlVector vector) { +return new DefaultAnn(CqlIdentifier.fromCql(cqlIdentifier), vector); + } Review Comment: These will need to be updated when the PR for JAVA-3143 is merged; the CqlVector constraint won't apply once that's in. ## core/src/main/java/com/datastax/oss/driver/internal/core/type/DefaultVectorType.java: ## @@ -60,7 +60,8 @@ public String getClassName() { @NonNull @Override public String asCql(boolean includeFrozen, boolean pretty) { -return String.format("'%s(%d)'", getClassName(), getDimensions()); +return String.format( +"VECTOR<%s, %d>", this.subtype.asCql(includeFrozen, pretty).toUpperCase(), getDimensions()); Review Comment: This should be a lower case "VECTOR" to match what's done with the other collection types ([list](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/ListType.java#L27-L32), [set](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/SetType.java#L27-L32), [map](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/MapType.java#L33-L41)) ## query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/insert/RegularInsertTest.java: ## @@ -41,6 +42,12 @@ public void should_generate_column_assignments() { .hasCql("INSERT INTO foo (a,b) VALUES (?,?)"); } + @Test + public void should_generate_vector_literals() { +assertThat(insertInto("foo").value("a", literal(Arrays.asList(0.1, 0.2, 0.3 Review Comment: As above: this should use a CqlVector rather than an Array ## query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/schema/AlterTypeTest.java: ## @@ -53,4 +54,10 @@ public void should_generate_alter_table_with_rename_three_columns() { assertThat(alterType("bar").renameField("x", "y").renameField("u", "v").renameField("b", "a")) .hasCql("ALTER TYPE bar RENAME x TO y AND u