LantaoJin opened a new pull request, #16214: URL: https://github.com/apache/lucene/pull/16214
### Description

Adds two `IndexWriter` APIs to update a document's KNN vector **in place**, without reindexing the rest of the document:

```java
public long updateFloatVectorValue(Term term, String field, float[] value)
public long updateByteVectorValue(Term term, String field, byte[] value)
```

Today the only way to change a stored embedding (`KnnFloatVectorField` / `KnnByteVectorField`) is `updateDocument` -- delete-by-term + re-add the **whole** document, which re-analyzes/re-posts/re-stores every field. For workloads that periodically re-embed (e.g. bumping the embedding model version) that is wasteful: only the vector changed.

These APIs mirror `updateDocValues` -- they rewrite just the affected field at a new per-segment generation and leave everything else untouched.

### Benchmark summary

`dim=768, otherFields=8`, ms per commit (lower is better):

| numDocs | batchSize | updateVectorValue | updateDocument |
|--:|--:|--:|--:|
| 50000 | 1 | 238 | 9 |
| 50000 | 1000 | 251 | 542 |
| 50000 | 10000 | 384 | 8090 |
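To make the benchmark rows easier to compare, the ratio between the two columns can be computed directly. The following is a minimal sketch rather than part of the PR: the class name `VectorUpdateSpeedup` and the helper `speedup` are hypothetical, and the only inputs are the three rows from the benchmark summary.

```java
import java.util.Locale;

/** Hypothetical helper: ratios derived from the benchmark rows in this PR description. */
final class VectorUpdateSpeedup {

    /**
     * Returns how many times faster updateVectorValue is than updateDocument.
     * A result below 1.0 means updateVectorValue was the slower of the two.
     */
    static double speedup(double updateDocumentMs, double updateVectorValueMs) {
        if (updateVectorValueMs <= 0) {
            throw new IllegalArgumentException("updateVectorValueMs must be positive");
        }
        return updateDocumentMs / updateVectorValueMs;
    }

    public static void main(String[] args) {
        // Rows copied from the benchmark summary:
        // {batchSize, updateVectorValue ms per commit, updateDocument ms per commit}
        long[][] rows = {
            {1, 238, 9},
            {1000, 251, 542},
            {10000, 384, 8090},
        };
        for (long[] row : rows) {
            double ratio = speedup(row[2], row[1]);
            System.out.println(
                String.format(Locale.ROOT, "batchSize=%d speedup=%.2fx", row[0], ratio));
        }
    }
}
```

Read this way, the reported numbers suggest the in-place update pays off for batched re-embedding (roughly 2x at `batchSize=1000` and roughly 21x at `batchSize=10000`), while for a single-document update per commit `updateDocument` is the faster path in this benchmark (9 ms versus 238 ms).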
