Adrien Grand created LUCENE-8619:
------------------------------------

             Summary: Decrease I/O pressure of OfflineSorter
                 Key: LUCENE-8619
                 URL: https://issues.apache.org/jira/browse/LUCENE-8619
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Adrien Grand


OfflineSorter is likely I/O bound, yet it does little to reduce the amount of 
data it reads and writes. For instance, it always writes the length of each 
entry on 2 bytes, which is wasteful when used by BKDWriter since all byte[] 
arrays have exactly the same length. For LatLonPoint, this is a 25% space 
overhead that we could remove.
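
To make the overhead concrete, here is a minimal sketch in plain Java. It is 
not Lucene code: the class and method names are hypothetical, and it assumes 
the 25% figure corresponds to a 2-byte length written alongside an 8-byte 
value. It compares a layout that stores a 2-byte length per entry with one 
where the length is known up front and never written.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this does not use or mirror Lucene's classes.
class RecordLayoutSketch {

  // Variable-length layout: each entry is stored as a 2-byte length plus its bytes.
  static byte[] writeLengthPrefixed(List<byte[]> records) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] record : records) {
      if (record.length > 0xFFFF) {
        throw new IllegalArgumentException("entry does not fit a 2-byte length");
      }
      out.write(record.length >>> 8);   // high byte of the length
      out.write(record.length & 0xFF);  // low byte of the length
      out.write(record, 0, record.length);
    }
    return out.toByteArray();
  }

  // Fixed-length layout: the caller states the length once, so it is never stored.
  static byte[] writeFixedLength(List<byte[]> records, int recordLength) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] record : records) {
      if (record.length != recordLength) {
        throw new IllegalArgumentException("all entries must have the same length");
      }
      out.write(record, 0, record.length);
    }
    return out.toByteArray();
  }

  // Extra space taken by the 2-byte length, relative to the entry itself.
  static double overheadPercent(int recordLength) {
    return 100.0 * Short.BYTES / recordLength;
  }

  public static void main(String[] args) {
    // Assumption: 8-byte entries, e.g. two 4-byte encoded coordinates.
    List<byte[]> records = Arrays.asList(new byte[8], new byte[8], new byte[8]);
    System.out.println("length-prefixed: " + writeLengthPrefixed(records).length + " bytes");
    System.out.println("fixed-length:    " + writeFixedLength(records, 8).length + " bytes");
    System.out.println("overhead:        " + overheadPercent(8) + "%");
  }
}
```

The saving shrinks as entries get larger, so the fixed-length layout matters 
most for small point values.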

Doing lightweight compression on the fly might also help.
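
The issue does not name a codec, so the following is only a sketch of what 
"lightweight compression on the fly" could look like. It uses the JDK's 
DEFLATE at its fastest level as a stand-in; the class and method names are 
hypothetical and an actual change might well pick a different scheme.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only; the codec choice is an assumption.
class SpillCompressionSketch {

  // Compresses a block of entries with DEFLATE tuned for speed over ratio.
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater(Deflater.BEST_SPEED);
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
      out.write(raw);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      deflater.end(); // a caller-supplied Deflater is not released by close()
    }
    return sink.toByteArray();
  }

  // Restores the original bytes from a block produced by compress().
  static byte[] decompress(byte[] compressed) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (InflaterInputStream in =
        new InflaterInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[8192];
      int n;
      while ((n = in.read(buffer)) != -1) {
        sink.write(buffer, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }
}
```

Whether this helps depends on how compressible the entries are and on whether 
the extra CPU cost is smaller than the I/O it saves, which would need 
benchmarking.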

As a data point, Ignacio told me that after indexing 60M shapes with 
LatLonShape (1.65B triangles), the index directory was about 265GB and dropped 
to 57GB when merging was over.



