Shalin Shekhar Mangar created LUCENE-6779:
---------------------------------------------

             Summary: Reduce memory allocated by CompressingStoredFieldsWriter 
to write large strings
                 Key: LUCENE-6779
                 URL: https://issues.apache.org/jira/browse/LUCENE-6779
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/codecs
            Reporter: Shalin Shekhar Mangar


In SOLR-7927, I am trying to reduce the memory required to index very large 
documents (between 10 and 100 MB). One of the places that allocates a lot of 
heap is the UTF-8 encoding in CompressingStoredFieldsWriter. The same problem 
existed in JavaBinCodec, and in SOLR-7971 we reduced its memory allocation by 
falling back to a double-pass approach when the UTF-8 size of the string is 
greater than 64KB.

I propose to make the same changes to CompressingStoredFieldsWriter as we made 
to JavaBinCodec in SOLR-7971.
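
For illustration only, a double-pass approach could look roughly like the 
sketch below. This is not the code from SOLR-7971 or from 
CompressingStoredFieldsWriter. The class name, the method names, the 8KB 
scratch buffer, the 4-byte length prefix, and the use of a worst-case bound 
(3 bytes per char) to decide when to fall back are all assumptions made for 
this example. The first pass only counts the UTF-8 bytes; the second pass 
encodes through a small fixed buffer, so the heap used for a large string no 
longer grows with its length.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of double-pass UTF-8 writing; not Lucene or Solr code.
final class DoublePassUtf8Writer {
    static final int THRESHOLD_BYTES = 64 * 1024; // fall back above this bound
    static final int SCRATCH_BYTES = 8 * 1024;    // assumed scratch buffer size

    // Encodes the code point starting at s[i] into buf at upto and returns the
    // number of bytes written. A return value of 4 means a surrogate pair
    // (two chars) was consumed. Unpaired surrogates are written as U+FFFD.
    static int encodeAt(CharSequence s, int i, byte[] buf, int upto) {
        char c = s.charAt(i);
        if (c < 0x80) {
            buf[upto] = (byte) c;
            return 1;
        }
        if (c < 0x800) {
            buf[upto] = (byte) (0xC0 | (c >> 6));
            buf[upto + 1] = (byte) (0x80 | (c & 0x3F));
            return 2;
        }
        if (Character.isHighSurrogate(c) && i + 1 < s.length()
                && Character.isLowSurrogate(s.charAt(i + 1))) {
            int cp = Character.toCodePoint(c, s.charAt(i + 1));
            buf[upto] = (byte) (0xF0 | (cp >> 18));
            buf[upto + 1] = (byte) (0x80 | ((cp >> 12) & 0x3F));
            buf[upto + 2] = (byte) (0x80 | ((cp >> 6) & 0x3F));
            buf[upto + 3] = (byte) (0x80 | (cp & 0x3F));
            return 4;
        }
        int v = Character.isSurrogate(c) ? 0xFFFD : c;
        buf[upto] = (byte) (0xE0 | (v >> 12));
        buf[upto + 1] = (byte) (0x80 | ((v >> 6) & 0x3F));
        buf[upto + 2] = (byte) (0x80 | (v & 0x3F));
        return 3;
    }

    // Pass 1: count the UTF-8 bytes without allocating any buffer.
    static int utf8Length(CharSequence s) {
        int bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;
            } else if (c < 0x800) {
                bytes += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4;
                i++; // skip the low surrogate
            } else {
                bytes += 3; // BMP char, or unpaired surrogate written as U+FFFD
            }
        }
        return bytes;
    }

    static void writeInt(OutputStream out, int v) throws IOException {
        out.write(v >>> 24);
        out.write(v >>> 16);
        out.write(v >>> 8);
        out.write(v);
    }

    // Writes a 4-byte length prefix followed by the UTF-8 bytes of s.
    static void writeString(CharSequence s, OutputStream out) throws IOException {
        long worstCase = (long) s.length() * 3;
        if (worstCase <= THRESHOLD_BYTES) {
            // Small string: single pass into a worst-case sized buffer.
            byte[] buf = new byte[(int) worstCase + 1];
            int upto = 0;
            for (int i = 0; i < s.length(); i++) {
                int n = encodeAt(s, i, buf, upto);
                upto += n;
                if (n == 4) i++;
            }
            writeInt(out, upto);
            out.write(buf, 0, upto);
            return;
        }
        // Large string: pass 1 computes the length, pass 2 streams the bytes.
        writeInt(out, utf8Length(s));
        byte[] scratch = new byte[SCRATCH_BYTES];
        int upto = 0;
        for (int i = 0; i < s.length(); i++) {
            if (upto > scratch.length - 4) { // no room for a 4-byte sequence
                out.write(scratch, 0, upto);
                upto = 0;
            }
            int n = encodeAt(s, i, scratch, upto);
            upto += n;
            if (n == 4) i++;
        }
        out.write(scratch, 0, upto);
    }
}
```

The trade-off in this sketch is extra CPU for the second scan of the string 
in exchange for a bounded buffer, which is why the fallback only applies to 
strings above the threshold.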



