[ANNOUNCE] Apache Lucene 10.0.0 released

Luca Cavanna Mon, 14 Oct 2024 06:05:44 -0700

The Lucene PMC is pleased to announce the release of Apache Lucene 10.0.0.

Apache Lucene is a high-performance, full-featured search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires structured search, full-text search, faceting,
nearest-neighbor search across high-dimensionality vectors, spell
correction or query suggestions.


This release contains numerous bug fixes, optimizations, and improvements,
some of which are highlighted below. The release is available for immediate
download at:

  <https://lucene.apache.org/core/downloads.html>

### Lucene 10.0.0 Release Highlights:

#### System requirements
 * Lucene 10.0 requires JDK 21 or newer.

#### API changes
 * KNN vector values now have a random-access API.
 * Deprecated APIs have been removed and a number of API changes have been
made. Please consult the migration guide for an extensive list of changes and
the actions to take to migrate to 10.0.

#### New Features
 * A new IndexInput#prefetch API has been added, allowing query evaluation
logic to let the Directory know about regions of data that are about to be
read. This helps perform I/O concurrently under the hood. MMapDirectory
implements this API using the madvise system call with the MADV_WILLNEED
flag on Linux and macOS.
 * Lucene now supports sparse indexing on doc values via
FieldType#setDocValuesSkipIndexType. The sparse index will record the
minimum and maximum values per block of doc IDs. Used in conjunction with
index sorting to cluster similar documents together, this allows for very
space-efficient and CPU-efficient filtering.
 * Search concurrency is now decoupled from the index geometry, so that an
index can be searched using any number of threads, regardless of its number
of segments.
 * K-means clustering on vectors is now supported.
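
To illustrate the sparse doc values index mentioned above, here is a minimal,
self-contained sketch of the idea: record the minimum and maximum value per
block of doc IDs, then use those bounds to skip or bulk-accept whole blocks
during a range filter. This is a conceptual model only, not Lucene's
implementation or API. The class SkipIndexSketch, the Block record, and the
block size are hypothetical names and values chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual sketch only; hypothetical names, not Lucene's implementation. */
final class SkipIndexSketch {

    /** Min and max of the values held by one block of consecutive doc IDs. */
    record Block(int firstDoc, int lastDoc, long min, long max) {}

    /** Builds one Block per blockSize doc IDs; values[doc] is that doc's value. */
    static List<Block> build(long[] values, int blockSize) {
        List<Block> blocks = new ArrayList<>();
        for (int start = 0; start < values.length; start += blockSize) {
            int end = Math.min(start + blockSize, values.length);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int doc = start; doc < end; doc++) {
                min = Math.min(min, values[doc]);
                max = Math.max(max, values[doc]);
            }
            blocks.add(new Block(start, end - 1, min, max));
        }
        return blocks;
    }

    /** Returns the doc IDs whose value lies in [lo, hi]. */
    static List<Integer> filter(long[] values, List<Block> blocks, long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        for (Block b : blocks) {
            if (b.max() < lo || b.min() > hi) {
                continue; // no doc in this block can match: skip it entirely
            }
            boolean allMatch = b.min() >= lo && b.max() <= hi;
            for (int doc = b.firstDoc(); doc <= b.lastDoc(); doc++) {
                // When the whole block is in range, per-doc values need not be checked.
                if (allMatch || (values[doc] >= lo && values[doc] <= hi)) {
                    hits.add(doc);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Values sorted by doc ID, as an index sort on this field would produce.
        long[] values = {1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23};
        List<Block> blocks = build(values, 4);
        System.out.println(filter(values, blocks, 10, 13)); // prints [4, 5, 6, 7]
    }
}
```

With sorted values, each block covers a narrow value range, so most blocks are
either skipped or accepted wholesale. That is why the announcement pairs this
feature with index sorting.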

#### Improvements
 * Lucene now opens files with the MADV_RANDOM advice by default on Linux
and macOS. This results in better efficiency for indexes that exceed the
size of the page cache, but can make it slower to load indexes into the page
cache. It is possible to revert to the MADV_NORMAL read advice by default
by passing -Dorg.apache.lucene.store.defaultReadAdvice=NORMAL as a JVM
startup flag.
 * Snowball dictionaries have been upgraded, resulting in improved
tokenization. This may require reindexing to ensure consistency of search
results with pre-10.0 indexes.
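
As a rough illustration of how a JVM startup flag such as the
defaultReadAdvice property above reaches application code, the sketch below
reads the system property and falls back to RANDOM when it is unset. The
property name comes from this announcement. The ReadAdviceSketch class and
its Advice enum are hypothetical stand-ins written for this example, and the
parsing shown is an assumption rather than Lucene's own logic.

```java
import java.util.Locale;

/** Hypothetical sketch; Lucene's own handling of this property may differ. */
final class ReadAdviceSketch {

    /** Stand-in enum listing only the two advices named in the announcement. */
    enum Advice { NORMAL, RANDOM }

    /** Property name as given in the release announcement. */
    static final String PROPERTY = "org.apache.lucene.store.defaultReadAdvice";

    /** Maps the raw property value to an Advice, defaulting to RANDOM when unset. */
    static Advice resolve(String propertyValue) {
        if (propertyValue == null || propertyValue.isBlank()) {
            return Advice.RANDOM; // the 10.0 default described above
        }
        // Throws IllegalArgumentException for values outside the stand-in enum.
        return Advice.valueOf(propertyValue.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(resolve(System.getProperty(PROPERTY)));
    }
}
```

Running this with -Dorg.apache.lucene.store.defaultReadAdvice=NORMAL on the
java command line should print NORMAL; without the flag it prints RANDOM.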

... plus a multitude of helpful bug fixes!


Please read CHANGES.txt for a full list of new features and changes:

  <https://lucene.apache.org/core/10_0_0/changes/Changes.html>
