[ https://issues.apache.org/jira/browse/SOLR-7927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700245#comment-14700245 ]
Shalin Shekhar Mangar commented on SOLR-7927:
---------------------------------------------
I apologize, Yonik. The test JSON file actually contained a single 100MB
document with a huge content field, not 10 docs of 10MB each as I believed
earlier. Now the OOM makes more sense, but I think we can still shave off some
extra memory usage. For example, JavaBinCodec.writeStr creates a byte array of
size 4 * string.length. Could the same be done in 3 * string.length, as
Lucene's CompressingStoredFieldsWriter.writeField() does?
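
To illustrate why 3 * string.length should be a sufficient bound, here is a
minimal, standalone sketch. It is not the JavaBinCodec or Lucene code; the
class Utf8Bound and its method names are hypothetical and exist only for this
example. The reasoning it demonstrates: a Java char is one UTF-16 code unit, a
BMP character encodes to at most 3 UTF-8 bytes, and a supplementary character
takes 2 chars but only 4 bytes (2 bytes per char).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
final class Utf8Bound {
    // Worst-case UTF-8 bytes per UTF-16 code unit (Java char).
    public static final int MAX_UTF8_BYTES_PER_CHAR = 3;

    // Upper bound on the buffer size needed to encode s as UTF-8.
    public static int worstCaseUtf8Length(String s) {
        return Math.multiplyExact(s.length(), MAX_UTF8_BYTES_PER_CHAR);
    }

    // Actual encoded size, computed by the JDK encoder.
    public static int actualUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "plain ascii",   // 1 byte per char
            "caf\u00e9",     // U+00E9 needs 2 bytes
            "\u20ac\u20ac",  // U+20AC needs 3 bytes per char (hits the bound)
            "\uD83D\uDE00"   // U+1F600: 2 chars, 4 bytes
        };
        for (String s : samples) {
            System.out.printf("chars=%d utf8=%d bound=%d%n",
                s.length(), actualUtf8Length(s), worstCaseUtf8Length(s));
        }
    }
}
```

Assuming the 100MB document is roughly 100 million chars, the temporary
buffer would shrink from about 400MB (4 * length) to about 300MB (3 * length).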
> Transaction log consumes lot of memory when indexing large documents
> --------------------------------------------------------------------
>
> Key: SOLR-7927
> URL: https://issues.apache.org/jira/browse/SOLR-7927
> Project: Solr
> Issue Type: Bug
> Components: update
> Affects Versions: 5.2.1
> Reporter: Shalin Shekhar Mangar
> Fix For: Trunk, 5.4
>
>
> Solr is started with 1280M heap.
> ./bin/solr start -m 1280m
> Indexing a 100MB JSON file (using curl) containing large JSON documents from
> Project Gutenberg fails with OOM, but a 549M JSON file containing
> small documents is indexed just fine.
> The same 100MB JSON file with the same heap size can be indexed just fine if
> I disable the transaction log.