Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-07 Thread via GitHub


absurdfarce merged PR #1931:
URL: https://github.com/apache/cassandra-java-driver/pull/1931


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-06 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2574488949

   Thanks! Squashed. @absurdfarce 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-06 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2573631273

   DS Jenkins looks good.  There was one test failure which looks to be a test 
issue (ccm timeout on startup) so I'm calling this good.  @SiyaoIsHiding can 
you squash this down to a single commit so we can get this in?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-05 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2572275863

   No outstanding item here, this PR is pending for reviews. 
   @absurdfarce I've updated the PR. May you please give an approval?
   @michaelsembwever It seems to me that you have some familiarity to this PR. 
Would you have time for a review as the second +1?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-05 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2572242950

   @michaelsembwever Short version is that this is basically done; we've 
essentially been waiting on a second +1 to move this one forward.  Only wrinkle 
there is that @SiyaoIsHiding did change a couple things recently in relation to 
work she's been doing on the PR for JAVA-3143... but those changes should be 
relatively small.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2025-01-05 Thread via GitHub


michaelsembwever commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2571548807

   is there outstanding items or more work to do here ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-12-24 Thread via GitHub


absurdfarce commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1896834284


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java:
##
@@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, 
@NonNull ClusteringOrder orde
 return orderBy(CqlIdentifier.fromCql(columnName), order);
   }
 
+  /**
+   * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an 
ORDER BY ... ANN OF ...
+   * clause
+   */
+  @NonNull
+  Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann);
+
+  /** Adds the ORDER BY ... ANN OF ... clause */
+  @NonNull
+  Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann);

Review Comment:
   @SiyaoIsHiding and I discussed in a meeting today.  Both approaches have 
arguments for them but in the end we decided to go with CqlVector to get 
this moving forward and merged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-12-09 Thread via GitHub


SiyaoIsHiding commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1876495013


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java:
##
@@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, 
@NonNull ClusteringOrder orde
 return orderBy(CqlIdentifier.fromCql(columnName), order);
   }
 
+  /**
+   * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an 
ORDER BY ... ANN OF ...
+   * clause
+   */
+  @NonNull
+  Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann);
+
+  /** Adds the ORDER BY ... ANN OF ... clause */
+  @NonNull
+  Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann);

Review Comment:
   @absurdfarce bumping this discussion



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2501931783

   Thank you for testing the downstream too


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


michaelsembwever commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500235548

   > Are you sure you are using my PR version? That full class name looks like 
an outdated 4.x branch install before this commit: e843786
   
   My bad, my checkout was still on 0a36ff069
   Thanks for being patient with me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500136964

   And it seems like our CI isn't configured with credentials. It met GitHub 
quota limit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500134628

   Are you sure you are using my PR version? That full class name looks like an 
outdated 4.x branch install before this commit: 
https://github.com/apache/cassandra-java-driver/commit/e84378681aecdbb87696dc4b53cb6fd336c82b6b


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


michaelsembwever commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500052188

   `createTable.asCql()` is
   ```
   CREATE TABLE IF NOT EXISTS test_springframework.ai_vector_store (id text 
PRIMARY KEY,content text,embedding 
'org.apache.cassandra.db.marshal.VectorType(384)',country text,year 
smallint,meta1 text,meta2 text)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


SiyaoIsHiding commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1858081592


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java:
##
@@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, 
@NonNull ClusteringOrder orde
 return orderBy(CqlIdentifier.fromCql(columnName), order);
   }
 
+  /**
+   * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an 
ORDER BY ... ANN OF ...
+   * clause
+   */
+  @NonNull
+  Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann);
+
+  /** Adds the ORDER BY ... ANN OF ... clause */
+  @NonNull
+  Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann);

Review Comment:
   Getting back to this conversation because the current implementation will 
result in a compilation error when we extend vector support to arbitrary 
subtype. 
   ```java
 public Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull 
CqlVector ann) {
   return withAnn(new Ann(columnId, ann));
 }
   ```
   The argument we get is a `CqlVector` , but the `Ann` constructor wants a 
`CqlVector`. Turns out compilation error.
   We either:
   1. change Ann constructor to `CqlVector`, or
   2. change `orderByAnnOf` API to `CqlVector`. 
   
   I think option 2 makes more sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-26 Thread via GitHub


michaelsembwever commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2500013620

   >  I just realized our CI broke. Can any one of you committers click the 
button to run tests for this PR please?
   
   done: 
https://ci-cassandra.apache.org/job/cassandra-java-driver/job/PR-1931/9/


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-25 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2499790028

   Solved merge conflict and updated doc. I just realized our CI broke. Can any 
one of you committers click the button to run tests for this PR please? Here: 
https://ci-cassandra.apache.org/blue/organizations/jenkins/cassandra-java-driver/detail/PR-1931/8/pipeline


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-25 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2499382006

   @michaelsembwever Can you see the debug log from this line?
   ```java
   logger.debug("Executing {}", createTable.asCql());
   ```
   
   I wonder what `asCql()` outputs. I ran your code with substituted values
   ```java
   assertThat(SchemaBuilder.createTable("ks", "mytb")
   .ifNotExists()
   .withPartitionKey("k", DataTypes.INT)
   .withClusteringColumn("v", DataTypes.TEXT)
   .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, 3)))
   .hasCql("CREATE TABLE IF NOT EXISTS ks.mytb (k int,v 
text,embedding vector,PRIMARY KEY(k,v))");
   ```
   And this will pass. I ran the cql statement in cqlsh, it turns out no 
problem, either.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-25 Thread via GitHub


michaelsembwever commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2497870781

   Testing this with https://github.com/spring-projects/spring-ai/pull/1817 
   
   I'm getting from [this 
code](https://github.com/spring-projects/spring-ai/pull/1817/files#diff-e3136d7ca72a9c5b17e14f161279f4adbe68d694f35bc0eb8ba04afac0042999R237):
   ```
   SchemaBuilder.createTable(keyspace, table)
   .ifNotExists()
   .withPartitionKey(partitionKey.name, partitionKey.type)
   .withClusteringColumn(clusteringKey.name, clusteringKey.type)
   .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, 
vectorDimension))
   .build();
   ```
   this failure:
   ```
   com.datastax.oss.driver.api.core.servererrors.SyntaxError: Error setting 
type org.apache.cassandra.db.marshal.VectorType(384): Invalid definition for 
comparator org.apache.cassandra.db.marshal.VectorType.
at 
com.datastax.oss.driver.api.core.servererrors.SyntaxError.copy(SyntaxError.java:50)
at 
com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:151)
at 
com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:55)
at 
com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:32)
at 
com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:234)
at 
com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:56)
at 
org.springframework.ai.vectorstore.CassandraVectorStoreConfig.ensureTableExists(CassandraVectorStoreConfig.java:244)
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-22 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2494066507

   @SiyaoIsHiding I'm okay with the idea behind these changes but we do need to 
fix a few things before this is merge-ready.  Specifically:
   
   * The branch conflicts discussed above need to be resolved... I'm guessing 
those are largely from CASSJAVA-2
   * Let's also add the doc updates referenced above
   
   Once these fixes are in (and we get a second +1) I think we're okay to move 
forward.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-11-22 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2493987713

   Raised by another user in separate communication: we should add some updates 
to the documentation reflecting these new features.  Specifically we should 
update relevant areas of the [query_builder 
section](https://github.com/apache/cassandra-java-driver/tree/4.x/manual/query_builder)
 of the manual.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-10-10 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2404243257

   I'm not 💯 sure what the accessors on DefaultSelect are intended to do but I 
suppose it does make sense to keep it consistent and add an accessor for the 
Ann object.
   
   Either way I'm satisfied with where this stands now... 👍 from me!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-10-09 Thread via GitHub


SiyaoIsHiding commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2403789196

   I assume @lukasz-antoniak comment 
[here](https://github.com/apache/cassandra-java-driver/pull/1952/files#r1793979356)
 about adding `getAnn()` in `DefaultSelect` should be addressed here. Thank you 
for your suggestion! It's added. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-10-09 Thread via GitHub


absurdfarce commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1794308525


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java:
##
@@ -146,6 +147,16 @@ default Select orderBy(@NonNull String columnName, 
@NonNull ClusteringOrder orde
 return orderBy(CqlIdentifier.fromCql(columnName), order);
   }
 
+  /**
+   * Shortcut for {@link #orderByAnnOf(CqlIdentifier, CqlVector)}, adding an 
ORDER BY ... ANN OF ...
+   * clause
+   */
+  @NonNull
+  Select orderByAnnOf(@NonNull String columnName, @NonNull CqlVector ann);
+
+  /** Adds the ORDER BY ... ANN OF ... clause */
+  @NonNull
+  Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector ann);

Review Comment:
   I think I'm changing my answer on this: we should remove the type bound here 
and just make this a `CqlVector`.  My rationale goes as follows.
   
   The fact that ANN comparisons only support float vectors actually isn't a 
constraint on the underlying Java type used here.  In theory _any_ Java type 
whose codec will generate a serialized value that _can be understood as a 
float_ when the server receives the message would work here.  That means we 
could pass in a CqlVector of floats, decimals or doubles and still have the 
server handle that without issue.  I'll also note that there's no common 
supertype for Float, BigDecimal or Double (the corresponding Java types based 
on the 
[docs](https://github.com/apache/cassandra-java-driver/tree/4.x/manual/core#cql-to-java-type-mapping))
 other than Number... and we can't use that here because it includes types 
other than these three.  So there's no meaningful supertype we can use for a 
type bound at this point.
   
   A second point: the query builder is built around the notion of generating 
CQL based on whatever the user passes in; there's almost no checking whether 
types correlate to things the user specified.  So, for example, OngoingValues 
doesn't have any kind of type bounds around it's various value() methods.  I 
accept that this isn't _exactly_ the same as the float constraint we're 
discussing here... but it _is_ an indication that the query builder largely 
doesn't concern itself with type checking with the expectation that the server 
will handle that when it receives the query.
   
   Finally, I'll note that avoiding the type bound here at least exposes the 
idea (at an API level) of using external/custom types which serialize to CQL 
float values by way of their own codecs.  Scala users, for example, might want 
to support using [Spire](https://typelevel.org/spire/) numerics for their CQL 
queries.  This actually won't work for now since we [constrain the codec 
registry 
used](https://github.com/apache/cassandra-java-driver/pull/1931/files#diff-77b4c675f8250e6cac9db1ded659755f046b725a13f30d68dc7aa4a897c78bd1R447)
 to the (immutable) default which doesn't include any support for Spire 
types... but that's an implementation detail.  By avoiding type bounds at an 
API level we've left that door open for such a change _without_ having to make 
a (potentially breaking) API change and without sacrificing our design 
principles elsewhere.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-10-09 Thread via GitHub


absurdfarce commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1794259046


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/QueryBuilder.java:
##
@@ -538,4 +541,12 @@ public static Truncate truncate(@Nullable String keyspace, 
@NonNull String table
 return truncate(
 keyspace == null ? null : CqlIdentifier.fromCql(keyspace), 
CqlIdentifier.fromCql(table));
   }
+
+  public static Ann annOf(@NonNull CqlIdentifier cqlIdentifier, @NonNull 
CqlVector vector) {
+return new DefaultAnn(cqlIdentifier, vector);
+  }
+
+  public static Ann annOf(@NonNull String cqlIdentifier, @NonNull 
CqlVector vector) {
+return new DefaultAnn(CqlIdentifier.fromCql(cqlIdentifier), vector);
+  }

Review Comment:
   Nice catch!  Yes, you're correct; Apache Cassandra 5.0.x supports _vectors_ 
of any subtype _as a type_ but the ANN index used there only supports floats.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-10-09 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2402801009

   Good call on the similarity_* functions @SiyaoIsHiding!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-09-20 Thread via GitHub


absurdfarce commented on PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#issuecomment-2364736350

   One other thing worth mentioning: the Cassandra impl also supports a way to 
get "the similarity calculation of the best scoring node closest to the query 
data as part of the results".  Take a look at the similarity_dot_product() 
function (and the other choices as well) in the [relevant Cassandra 
docs](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql).
  The query builder should have support for those as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org
For additional commands, e-mail: pr-h...@cassandra.apache.org



Re: [PR] JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder [cassandra-java-driver]

2024-09-20 Thread via GitHub


absurdfarce commented on code in PR #1931:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1931#discussion_r1769320442


##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/select/Select.java:
##
@@ -146,6 +146,8 @@ default Select orderBy(@NonNull String columnName, @NonNull 
ClusteringOrder orde
 return orderBy(CqlIdentifier.fromCql(columnName), order);
   }
 
+  @NonNull
+  Select orderBy(@NonNull Ann ann);

Review Comment:
   Maybe I'm missing something but it seems more natural to support something 
like the following:
   
   ```
   Select orderByAnnOf(CqlIdentifier columnId, CqlVector ann);
   Select orderByAnnOf(String columnName, CqlVector ann);
   ```
   
   Advantage is that with this approach you don't even need to introduce an Ann 
type... which kinda seems right as that type isn't really doing much for you 
here.
   
   You could also perhaps add a notion of type checking the specified column to 
make sure it's a vector type (and to make sure it matches the type of the input 
CqlVector).
   
   To the point made by @lukasz-antoniak above we could add directionality here 
(and throw warnings if the user tries to use a DESC order before there's 
server-side support for it) but I'm not sure it's worth it.  There's no mention 
of ordering in the [relevant Cassandra 
docs](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql)
 so my intuition says to just leave it out for now and add it when it becomes 
more of a thing.



##
query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/schema/AlterTableTest.java:
##
@@ -108,4 +108,15 @@ public void 
should_generate_alter_table_with_no_compression() {
 assertThat(alterTable("bar").withNoCompression())
 .hasCql("ALTER TABLE bar WITH compression={'sstable_compression':''}");
   }
+
+  @Test
+  public void should_generate_alter_table_with_vector() {
+assertThat(
+alterTable("bar")
+.alterColumn(
+"v",
+DataTypes.custom(
+
"org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType,3)")))

Review Comment:
   ```
   DataTypes.custom(DataTypes.vectorOf(DataTypes.Float,3))
   ```



##
query-builder/src/main/java/com/datastax/oss/driver/api/querybuilder/QueryBuilder.java:
##
@@ -538,4 +541,12 @@ public static Truncate truncate(@Nullable String keyspace, 
@NonNull String table
 return truncate(
 keyspace == null ? null : CqlIdentifier.fromCql(keyspace), 
CqlIdentifier.fromCql(table));
   }
+
+  public static Ann annOf(@NonNull CqlIdentifier cqlIdentifier, @NonNull 
CqlVector vector) {
+return new DefaultAnn(cqlIdentifier, vector);
+  }
+
+  public static Ann annOf(@NonNull String cqlIdentifier, @NonNull 
CqlVector vector) {
+return new DefaultAnn(CqlIdentifier.fromCql(cqlIdentifier), vector);
+  }

Review Comment:
   These will need to be updated when the PR for JAVA-3143 is merged; the 
CqlVector constraint won't apply once that's in.



##
core/src/main/java/com/datastax/oss/driver/internal/core/type/DefaultVectorType.java:
##
@@ -60,7 +60,8 @@ public String getClassName() {
   @NonNull
   @Override
   public String asCql(boolean includeFrozen, boolean pretty) {
-return String.format("'%s(%d)'", getClassName(), getDimensions());
+return String.format(
+"VECTOR<%s, %d>", this.subtype.asCql(includeFrozen, 
pretty).toUpperCase(), getDimensions());

Review Comment:
   This should be a lower case "VECTOR" to match what's done with the other 
collection types 
([list](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/ListType.java#L27-L32),
 
[set](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/SetType.java#L27-L32),
 
[map](https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/type/MapType.java#L33-L41))



##
query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/insert/RegularInsertTest.java:
##
@@ -41,6 +42,12 @@ public void should_generate_column_assignments() {
 .hasCql("INSERT INTO foo (a,b) VALUES (?,?)");
   }
 
+  @Test
+  public void should_generate_vector_literals() {
+assertThat(insertInto("foo").value("a", literal(Arrays.asList(0.1, 0.2, 
0.3

Review Comment:
   As above: this should use a CqlVector rather than an Array



##
query-builder/src/test/java/com/datastax/oss/driver/api/querybuilder/schema/AlterTypeTest.java:
##
@@ -53,4 +54,10 @@ public void 
should_generate_alter_table_with_rename_three_columns() {
 assertThat(alterType("bar").renameField("x", "y").renameField("u", 
"v").renameField("b", "a"))
 .hasCql("ALTER TYPE bar RENAME x TO y AND u