At some point we have to discuss this, and here’s as good a place as any. There’s a great news article about how generative AI was used to assist in developing the new vector search feature, which is itself really cool. Unfortunately it *sounds* like this runs afoul of the ASF legal policy on the use of generative AI for contributions to the project. This proposal is to include a dependency, but I’m not sure whether that avoids the issue, and I’m equally uncertain how much the issue is isolated to the dependency (or whether it affects it at all).

Anyway, this is an annoying discussion we need to have at some point, so raising it here now so we can figure it out.


On 21 Sep 2023, at 09:04, Mick Semb Wever <m...@apache.org> wrote:




On Wed, 20 Sept 2023 at 18:31, Mike Adamson <madam...@datastax.com> wrote:
The original patch for CEP-30 brought several modified Lucene classes in-tree to implement the concurrent HNSW graph used by the vector index.
These classes are now being replaced with the io.github.jbellis.jvector library, which contains an improved DiskANN implementation for the on-disk graph format.
The repo for this library is here: https://github.com/jbellis/jvector.
The library does not replace any code used by SAI or other parts of the codebase and is used solely by the vector index.
I would welcome any feedback on this change.

 
+1

but to nit-pick on legalities… it would be nice to avoid including a library copyrighted to DataStax (for historical reasons).
The Jamm library is in a similar state in that it has a license that refers to the copyright owner but does not state the copyright owner anywhere.

Can we get a copyright on Jamm, and can both not be DataStax (pls)?
