[ https://issues.apache.org/jira/browse/LUCENE-6779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14745612#comment-14745612 ]
ASF subversion and git services commented on LUCENE-6779: --------------------------------------------------------- Commit 1703231 from sha...@apache.org in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1703231 ] LUCENE-6779: Reduce memory allocated by CompressingStoredFieldsWriter to write strings larger than 64kb by an amount equal to string's utf8 size > Reduce memory allocated by CompressingStoredFieldsWriter to write large > strings > ------------------------------------------------------------------------------- > > Key: LUCENE-6779 > URL: https://issues.apache.org/jira/browse/LUCENE-6779 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs > Reporter: Shalin Shekhar Mangar > Attachments: LUCENE-6779.patch, LUCENE-6779.patch, LUCENE-6779.patch, > LUCENE-6779.patch, LUCENE-6779_alt.patch > > > In SOLR-7927, I am trying to reduce the memory required to index very large > documents (between 10 to 100MB) and one of the places which allocate a lot of > heap is the UTF8 encoding in CompressingStoredFieldsWriter. The same problem > existed in JavaBinCodec and we reduced its memory allocation by falling back > to a double pass approach in SOLR-7971 when the utf8 size of the string is > greater than 64KB. > I propose to make the same changes to CompressingStoredFieldsWriter as we > made to JavaBinCodec in SOLR-7971. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org