Adrien Grand created LUCENE-8619:
------------------------------------
Summary: Decrease I/O pressure of OfflineSorter
Key: LUCENE-8619
URL: https://issues.apache.org/jira/browse/LUCENE-8619
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand
OfflineSorter is likely I/O bound, yet it does little to reduce the amount of
I/O it performs. For instance, it always writes the length of each entry on
2 bytes, which is wasteful when used by BKDWriter since all byte[] arrays have
exactly the same length. For LatLonPoint, this is a 25% space overhead that we
could remove.
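To make the overhead concrete, here is a minimal sketch that compares a
per-entry 2-byte length prefix with writing the shared length once. It is not
OfflineSorter's actual code: RecordLayoutSketch and both method names are
hypothetical, and the 8-byte entry size is an assumption chosen because a
2-byte prefix on 8 bytes matches the 25% figure (LatLonPoint encodes latitude
and longitude as 4 bytes each).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of Lucene: compares two ways of laying out
// byte[] entries in a temporary file (modelled here as an in-memory buffer).
class RecordLayoutSketch {

  // Layout A: every entry is preceded by its length, written as 2 bytes.
  static byte[] writeLengthPrefixed(byte[][] entries) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] entry : entries) {
      out.write(entry.length >>> 8); // high byte of the length
      out.write(entry.length);       // low byte of the length
      out.write(entry, 0, entry.length);
    }
    return out.toByteArray();
  }

  // Layout B: the shared length is written once, then entries back to back.
  static byte[] writeFixedLength(byte[][] entries, int entryLength) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(entryLength >>> 8);
    out.write(entryLength);
    for (byte[] entry : entries) {
      if (entry.length != entryLength) {
        throw new IllegalArgumentException("entries must all have the same length");
      }
      out.write(entry, 0, entry.length);
    }
    return out.toByteArray();
  }
}
```

With N entries of 8 bytes each, layout A takes 10 * N bytes and layout B takes
8 * N + 2 bytes, so the per-entry prefix costs 2 bytes on top of every 8 bytes
of payload.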
Doing lightweight compression on the fly might also help.
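As a rough illustration of compressing on the fly, the sketch below wraps the
output stream in a DEFLATE stream at its fastest setting. This is only a
stand-in: the issue does not name a codec, DEFLATE is used here simply because
it ships with the JDK, and StreamCompressionSketch is a hypothetical class,
not Lucene code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, not part of Lucene: compresses bytes as they are
// written, trading some CPU for fewer bytes hitting the disk.
class StreamCompressionSketch {

  static byte[] compress(byte[] raw) {
    // BEST_SPEED favours throughput over compression ratio.
    Deflater deflater = new Deflater(Deflater.BEST_SPEED);
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (OutputStream out = new DeflaterOutputStream(sink, deflater)) {
      out.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      deflater.end(); // release native resources of the custom Deflater
    }
    return sink.toByteArray();
  }

  static byte[] decompress(byte[] compressed) {
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int read;
      while ((read = in.read(buffer)) != -1) {
        result.write(buffer, 0, read);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result.toByteArray();
  }
}
```

Whether this pays off depends on how compressible the entries are and on
whether the sorter is really I/O bound rather than CPU bound, so it would need
to be measured.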
As a data point, Ignacio told me that after indexing 60M shapes with
LatLonShape (1.65B triangles), the index directory was about 265GB and dropped
to 57GB when merging was over.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)