The Lucene PMC is pleased to announce the release of Apache Lucene 10.4.0.

Apache Lucene is a high-performance, full-featured search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires structured search, full-text search, faceting,
nearest-neighbor search across high-dimensionality vectors, spell
correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements,
some of which are highlighted below. The release is available for immediate
download at:

https://lucene.apache.org/core/downloads.html

Lucene 10.4 brings significant performance improvements and a new vector
format.

Many Lucene queries should see a performance improvement of 10-15%, and some
might even see a 35% improvement! This is due to a larger block size for
terms postings and better use of SIMD-optimized code.

Additionally, there is a new scalar quantized format for dense vectors and
kNN search. Lucene104ScalarQuantizedVectorsFormat and
Lucene104HnswScalarQuantizedVectorsFormat support quantizing to 1, 2, 4, 7,
or 8 bits. Recall has improved significantly, and for many vector types,
quantizing to 2 bits achieves even better recall than older formats at the
4 bit level. This improves latency while increasing recall for various
vector workloads.
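
To make the bit widths concrete, here is a minimal Java sketch of plain
uniform (min/max) scalar quantization. It is an illustration only and is not
the technique used by the Lucene104 formats, whose internals this
announcement does not describe. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration: plain uniform min/max quantization.
// This is NOT Lucene's implementation; it only shows what "N bits" means.
final class UniformQuantizerSketch {

  /** Maps each component to an integer code in [0, 2^bits - 1]. */
  static int[] quantize(float[] vector, int bits) {
    if (bits < 1 || bits > 8) {
      throw new IllegalArgumentException("bits must be in [1, 8]");
    }
    int levels = (1 << bits) - 1; // highest code, e.g. 3 for 2 bits
    float min = Float.POSITIVE_INFINITY;
    float max = Float.NEGATIVE_INFINITY;
    for (float x : vector) {
      min = Math.min(min, x);
      max = Math.max(max, x);
    }
    float range = max - min;
    int[] codes = new int[vector.length];
    for (int i = 0; i < vector.length; i++) {
      // A constant vector has no range; every component maps to code 0.
      codes[i] = range == 0f ? 0 : Math.round((vector[i] - min) / range * levels);
    }
    return codes;
  }

  /** Recovers an approximate float value from a code. */
  static float dequantize(int code, int bits, float min, float max) {
    int levels = (1 << bits) - 1;
    return min + (code / (float) levels) * (max - min);
  }

  public static void main(String[] args) {
    float[] v = {0f, 1f, 2f, 3f};
    System.out.println(Arrays.toString(quantize(v, 2))); // 4 distinct codes
    System.out.println(Arrays.toString(quantize(v, 1))); // only 2 distinct codes
  }
}
```

Fewer bits mean fewer distinct codes per component, so less storage but a
coarser approximation. That trade-off is why better recall at 2 bits than
older formats achieved at 4 bits is notable.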

*New Features*


   - New and improved scalar quantized formats for dense vectors,
   Lucene104ScalarQuantizedVectorsFormat and
   Lucene104HnswScalarQuantizedVectorsFormat, which allow quantizing to
   1, 2, 4, 7, and 8 bits. For reference, the new 2 bit quantization
   technique provides better recall and speed than the old 4 bit.

*API Changes*


   - New bulk operation APIs for dense vectors and numeric doc values

*Improvements and Optimizations*


   - HNSW graph building can now be delayed for tiny segments, and Lucene
   avoids completely rebuilding the graphs when handling deletes
   - Handling of deletes in general got much faster and cheaper,
   significantly reducing storage costs when there are very few deleted docs
   - Block size increased for terms postings, significantly improving query
   latency for many types of queries
   - Use a coarser-grained competitive iterator with lower construction
   costs for numeric sorts against fields with DocValuesSkippers

*Runtime Behavior Changes and Bug Fixes*


   - Fix tessellator failure by preferring the shared vertex that is the
   leftmost vertex of the hole
   - The `reverse` field of SortField is now final. If you have subclassed
   SortField, you should set `reverse` in the super constructor.
   - Align float vectors on disk to 64 bytes, for optimal performance on
   Arm Neoverse machines
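
As an aside on the alignment item above, the arithmetic behind aligning a
file offset to a 64-byte boundary can be sketched in a few lines of Java.
This is a generic illustration with hypothetical names, not Lucene's actual
code, and it assumes the alignment is a power of two.

```java
// Hypothetical illustration of rounding a file offset up to an aligned boundary.
final class AlignmentSketch {

  /** Rounds offset up to the next multiple of alignment (a power of two). */
  static long alignUp(long offset, int alignment) {
    if (alignment <= 0 || Integer.bitCount(alignment) != 1) {
      throw new IllegalArgumentException("alignment must be a positive power of two");
    }
    // Adding (alignment - 1) and clearing the low bits rounds up.
    return (offset + alignment - 1) & -(long) alignment;
  }

  public static void main(String[] args) {
    System.out.println(alignUp(65, 64)); // next 64-byte boundary at or after offset 65
  }
}
```

Padding of at most 63 bytes before the vector data lets each read start on
a 64-byte boundary.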



Please read CHANGES.txt for a full list of new features and changes:
https://lucene.apache.org/core/10_4_0/changes/Changes.html
