Shalin Shekhar Mangar created LUCENE-6779:
---------------------------------------------
Summary: Reduce memory allocated by CompressingStoredFieldsWriter
to write large strings
Key: LUCENE-6779
URL: https://issues.apache.org/jira/browse/LUCENE-6779
Project: Lucene - Core
Issue Type: Improvement
Components: core/codecs
Reporter: Shalin Shekhar Mangar
In SOLR-7927, I am trying to reduce the memory required to index very large
documents (between 10 and 100 MB). One of the places that allocates a lot of
heap is the UTF-8 encoding in CompressingStoredFieldsWriter. The same problem
existed in JavaBinCodec, and in SOLR-7971 we reduced its memory allocation by
falling back to a double-pass approach when the UTF-8 size of the string is
greater than 64 KB.
I propose to make the same changes to CompressingStoredFieldsWriter as we made
to JavaBinCodec in SOLR-7971.
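
For illustration only, a double-pass encoder could look roughly like the
sketch below. It is not the SOLR-7971 patch and does not use any Lucene or
Solr API: the class TwoPassUtf8 and its methods utf8Length and writeUtf8 are
hypothetical names, and replacing unpaired surrogates with U+FFFD is an
assumption made to keep both passes consistent. The first pass counts the
exact number of UTF-8 bytes without allocating; the second pass encodes
through a small, reusable scratch buffer instead of a worst-case-sized array.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper, not part of Lucene or Solr.
final class TwoPassUtf8 {

    // Pass 1: count the UTF-8 bytes needed for s without allocating anything.
    static long utf8Length(CharSequence s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;
            } else if (c < 0x800) {
                bytes += 2;
            } else if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                bytes += 4; // valid surrogate pair, one supplementary code point
                i++;
            } else {
                bytes += 3; // BMP char, or unpaired surrogate written as U+FFFD
            }
        }
        return bytes;
    }

    // Pass 2: encode s in chunks through a fixed scratch buffer (>= 4 bytes).
    static void writeUtf8(CharSequence s, OutputStream out, byte[] scratch)
            throws IOException {
        if (scratch.length < 4) {
            throw new IllegalArgumentException("scratch must hold at least 4 bytes");
        }
        int upto = 0;
        for (int i = 0; i < s.length(); i++) {
            // Flush when the longest possible sequence (4 bytes) might not fit.
            if (upto > scratch.length - 4) {
                out.write(scratch, 0, upto);
                upto = 0;
            }
            char c = s.charAt(i);
            int cp = c;
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                cp = Character.toCodePoint(c, s.charAt(++i));
            } else if (Character.isSurrogate(c)) {
                cp = 0xFFFD; // assumption: replace unpaired surrogates
            }
            if (cp < 0x80) {
                scratch[upto++] = (byte) cp;
            } else if (cp < 0x800) {
                scratch[upto++] = (byte) (0xC0 | (cp >> 6));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                scratch[upto++] = (byte) (0xE0 | (cp >> 12));
                scratch[upto++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            } else {
                scratch[upto++] = (byte) (0xF0 | (cp >> 18));
                scratch[upto++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                scratch[upto++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                scratch[upto++] = (byte) (0x80 | (cp & 0x3F));
            }
        }
        out.write(scratch, 0, upto); // flush the remainder
    }
}
```

A caller would typically keep the existing single-pass path for small
strings and switch to this path only above a size threshold, such as the
64 KB limit mentioned above. The trade-off is that the string is scanned
twice, in exchange for a scratch buffer whose size does not depend on the
string length.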