Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration

2023-05-01 Thread Maxim Muzafarov
Hello everyone, I want to continue this topic and share another properties validation option/solution that emerged from my investigation of Cassandra and Accord configuration that could be used to make the virtual table SettingTable updatable, as each update must move Config from one consistent

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
> In particular it makes no sense at all from an ML perspective to have vector types of anything other than numerics
Back to what Benedict was saying, if the proposal was an ML plug-in, then this limitation makes sense, but that is not the proposal at hand. If you wish to change the scope to

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Mick Semb Wever
> But suggesting that Jonathan should work on implementing general purpose arrays seems to fall outside the scope of this discussion, since the result of such work wouldn't even fill the need Jonathan is targeting for here.
Every comment I have made so far I have argued that the v1

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Benedict
Has anybody yet claimed it would be hard? Several folk seem ready to jump to the conclusion that this would be onerous, but as somebody with a good understanding of the storage layer I can assert with reasonable confidence that it would not be. As previously stated, the implementation largely

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
> I think it is totally reasonable that the ANN patch (and Jonathan) is not asked to implement on top of, or towards, other array (or other) new data types.
This impacts serialization: if you do not think about this on day 1, you then can’t add it later on without having to worry about migration

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Mick Semb Wever
Yes! What you (David) and Benedict write beautifully supports `VECTOR FLOAT[n]` imho. You are definitely bringing up valid implementation details, and that can be dealt with during patch review. This thread is about the CQL API addition. No matter which way the technical review goes with the

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Benedict
I have explained repeatedly why I am opposed to ML-specific data types. If we want to make an ML-specific data type, it should be in an ML plug-in. We should not pollute the general purpose language with hastily-considered features that target specific bandwagons - at best partially - no matter

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
> I think a simple and easy case can be made for fixed length array types that do not seem to create random bits of cruft in the language that dangle by themselves should this play not pan out.
If I am understanding you correctly, then a "VECTOR FLOAT[n]" is fine as it's an array type but

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread David Capwell
> A data type plug-in is actually really easy today, I think?
Sadly not, the client reads the class from our schema tables and has to have duplicate logic to serialize/deserialize results… types are easy to add if you are ok with the client not understanding them (and will some clients fail due to

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread J. D. Jordan
Yes. Plugging in a new type server side is very easy. Adding that type to every client is not. Cassandra already supports plugging in custom types through a jar. What a given client does when encountering a custom type it doesn’t know about depends on the client. I was recently looking at this for

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Josh McKenzie
> If we want to make an ML-specific data type, it should be in an ML plug-in.
How can we encourage a healthier plug-in ecosystem? As far as I know it's been pretty anemic historically:
cassandra: https://cassandra.apache.org/doc/latest/cassandra/plugins/index.html
postgres:

Re: [DISCUSS] CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

2023-05-01 Thread Dinesh Joshi
Does anybody have any questions that we could answer about this proposal?
> On Apr 27, 2023, at 1:24 PM, Francisco Guerrero wrote:
>
> Hi folks,
>
> We have updated the Confluence page with the source code for CEP-28. There are two repositories with contributions. One is the patch [1]

Re: [DISCUSS] New data type for vector search

2023-05-01 Thread Benedict
A data type plug-in is actually really easy today, I think? But, developing further hooks should probably be thought through as they’re necessary. I think in this case it would be simpler to deliver a general purpose type, which is why I’m trying to propose types that would be acceptable. I also