The Lucene PMC is pleased to announce the release of Apache Lucene 10.0.0. Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.
This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

<https://lucene.apache.org/core/downloads.html>

### Lucene 10.0.0 Release Highlights:

#### System requirements

* Lucene 10.0 requires JDK 21 or newer

#### API changes

* KNN vector values now have a random-access API.
* Deprecated APIs have been removed and a number of API changes have been made. Please consult the migration guide for an extensive list of changes and the actions needed to migrate to 10.0.

#### New Features

* A new IndexInput#prefetch API has been added, allowing query evaluation logic to let the Directory know about regions of data that are about to be read. This helps perform I/O concurrently under the hood. MMapDirectory implements this API using the madvise system call and the MADV_WILLNEED flag on Linux and Mac OS.
* Lucene now supports sparse indexing on doc values via FieldType#setDocValuesSkipIndexType. The sparse index records the minimum and maximum values per block of doc IDs. Used in conjunction with index sorting to cluster similar documents together, this allows for very space-efficient and CPU-efficient filtering.
* Search concurrency is now decoupled from the index geometry, so that an index can be searched using any number of threads, regardless of its number of segments.
* Kmeans clustering on vectors

#### Improvements

* Lucene now opens files with the MADV_RANDOM advice by default on Linux and Mac OS. This results in better efficiency for indexes that exceed the size of the page cache, but can make it slower to load indexes into the page cache. It is possible to revert to the MADV_NORMAL read advice by default by passing -Dorg.apache.lucene.store.defaultReadAdvice=NORMAL as a JVM startup flag.
* Snowball dictionaries have been upgraded, resulting in improved tokenization. This may require reindexing to ensure consistency of search results with pre-10.0 indexes.

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

<https://lucene.apache.org/core/10_0_0/changes/Changes.html>
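
To illustrate the idea behind the sparse doc-values index highlighted above, here is a minimal, self-contained sketch of a per-block min/max index and how sorting lets a range filter skip most blocks. It is a toy model under stated assumptions, not Lucene's implementation: the class `BlockSkipIndexSketch`, its methods, and the block size of 4 are all hypothetical and do not correspond to Lucene APIs.

```java
import java.util.Arrays;

/**
 * Toy per-block min/max skip index over one value per doc ID.
 * Hypothetical illustration only; this is not how Lucene stores doc values.
 */
public class BlockSkipIndexSketch {
    private final long[] values;    // values[docId] holds that document's value
    private final int blockSize;    // number of consecutive doc IDs per block
    private final long[] blockMin;  // minimum value seen in each block
    private final long[] blockMax;  // maximum value seen in each block
    private int blocksScanned;      // blocks read value-by-value by the last query

    public BlockSkipIndexSketch(long[] values, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.values = values.clone();
        this.blockSize = blockSize;
        int numBlocks = (values.length + blockSize - 1) / blockSize;
        this.blockMin = new long[numBlocks];
        this.blockMax = new long[numBlocks];
        for (int b = 0; b < numBlocks; b++) {
            int start = b * blockSize;
            int end = Math.min(start + blockSize, values.length);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int doc = start; doc < end; doc++) {
                min = Math.min(min, this.values[doc]);
                max = Math.max(max, this.values[doc]);
            }
            blockMin[b] = min;
            blockMax[b] = max;
        }
    }

    /** Counts docs whose value lies in [lower, upper], consulting min/max first. */
    public int countInRange(long lower, long upper) {
        blocksScanned = 0;
        int count = 0;
        for (int b = 0; b < blockMin.length; b++) {
            if (blockMax[b] < lower || blockMin[b] > upper) {
                continue; // block cannot contain a match: skip it entirely
            }
            int start = b * blockSize;
            int end = Math.min(start + blockSize, values.length);
            if (blockMin[b] >= lower && blockMax[b] <= upper) {
                count += end - start; // every doc in the block matches
                continue;
            }
            blocksScanned++; // partial overlap: individual values must be checked
            for (int doc = start; doc < end; doc++) {
                if (values[doc] >= lower && values[doc] <= upper) {
                    count++;
                }
            }
        }
        return count;
    }

    public int blocksScanned() {
        return blocksScanned;
    }

    public static void main(String[] args) {
        long[] unsorted = {7, 1, 12, 4, 9, 2, 11, 5, 8, 3, 10, 6};
        long[] sorted = unsorted.clone();
        Arrays.sort(sorted); // stands in for index sorting on this field

        BlockSkipIndexSketch a = new BlockSkipIndexSketch(unsorted, 4);
        BlockSkipIndexSketch b = new BlockSkipIndexSketch(sorted, 4);
        System.out.println("unsorted: " + a.countInRange(5, 8)
            + " matches, " + a.blocksScanned() + " blocks scanned");
        System.out.println("sorted:   " + b.countInRange(5, 8)
            + " matches, " + b.blocksScanned() + " blocks scanned");
    }
}
```

In this sketch both layouts return the same match count, but the sorted layout resolves the filter from block-level minimums and maximums alone, which is the intuition behind pairing the sparse index with index sorting.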