Thanks to all who attended. Here's a brief summary of the topics discussed:

1. The Lucene 10 PR is unblocked with only a few tests left to fix. Lucene 10 
is a prerequisite for both the Solr 10 release and some new dense vector 
changes (seeded knn query, patience early termination knn query, binary bit 
quantization, etc.)

2. Highlighting some other in-flight dense vector changes:
    * Reciprocal rank fusion (SOLR-17319)
    * Block join multi vector document (SOLR-17736)
    * Scalar quantized vector field (SOLR-17780)
    * Lucene support for GPUs

3. Question re: atomic updates and the text-to-vector update processor. A 
ticket with the use case and details will be filed

4. Performance of Solr dense vector vs. other available vector DBs
    * Solr performance is surprisingly good and comparable to the other 
vector DBs
    * No standard community benchmark process for Solr (dense or otherwise). 
Ishan and Fullstory have created solrbench, but it needs hardware to run on 
continuously - perhaps there can be sponsorship to enable this?

5. Areas of Improvement
    * Dense vector indexing needs more love (performance can drop off 
quickly). Ishan and Noble have investigated this area before. They will see 
what code can be contributed and what JIRA tickets can be created for further 
investigation
    * Lucene HNSW graph search is fast, but there is likely room to improve 
search at the Solr level (no information is shared, and no optimization is 
done, across segment searches or shards). Perhaps the multi-threaded searching 
needs further refinement (SOLR-13350)
    * FAISS integration exists in Lucene - add support in Solr?

6. Need to get a better understanding of where Solr is behind.
    * Create a table listing relevant Lucene changes by version and whether 
complementary Solr changes are needed to unlock the value
    * Create a table comparing the feature sets of other popular vector DBs 
against Solr's to see what is missing

7. How long do we intend to keep these meetings going? Can we eventually merge 
into the regular Solr community meetup?
    * As long as people have interest in attending
    * Primary goal is to build up momentum of dense vector contributions 
through roadmap planning, information sharing, and community support

The next meeting will be on Sept 3rd. We will follow up on the feature matrix 
initiatives in item 6, features for Solr 10, and probably further discussion 
on performance benchmarking.

Cheers

-Kevin
