[ https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653789#action_12653789 ]
Ruben Jimenez commented on SOLR-857:
------------------------------------
After some testing, it seems that omitNorms makes a big difference. With
my dynamic fields set to omitNorms="true", I'm able to index a batch of 100
files (500,000 documents) without any issues, whereas before this failed about
halfway through the test batch. I'm still running into memory problems during
indexing, but they only seem to appear after the index has grown to
about 1.5 million documents. I'm currently looking into adding more
memory to the server so that I can increase the heap size, and I'm considering
a sharded approach to distribute the data across multiple instances.
I guess at this point we can consider this issue closed; I'll post
questions to the user mailing list as I continue to look for an optimal
configuration for my data.
Thanks for the help.
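
As a rough illustration of why omitNorms can matter this much, here is a back-of-the-envelope sketch. It assumes norms cost about one byte per document per field that keeps them, which is how Lucene releases of this era are generally understood to store them. The field count of 200 is purely hypothetical, since the number of distinct dynamic field names in this index is not stated in the issue.

```python
def estimate_norms_bytes(num_docs: int, num_fields_with_norms: int) -> int:
    """Estimate norms memory, ASSUMING ~1 byte per document per field.

    This is a simplification for illustration only; it ignores segment
    overhead, deleted documents, and any buffering done during indexing.
    """
    return num_docs * num_fields_with_norms


if __name__ == "__main__":
    # Document counts are the ones mentioned in the comment above.
    # The field count (200) is a hypothetical value, not taken from the issue.
    for docs in (500_000, 1_500_000):
        size = estimate_norms_bytes(docs, num_fields_with_norms=200)
        print(f"{docs:>9} docs x 200 fields -> {size / (1024 * 1024):.1f} MiB of norms")
```

Under these assumptions the cost grows with both the document count and the number of distinct field names, so dynamic fields that produce many field names would multiply it. Setting omitNorms="true" on those fields removes that term from the estimate.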
> Memory Leak during the indexing of large xml files
> --------------------------------------------------
>
> Key: SOLR-857
> URL: https://issues.apache.org/jira/browse/SOLR-857
> Project: Solr
> Issue Type: Bug
> Affects Versions: 1.3
> Environment: Verified on Ubuntu 8.0.4 (1.7GB RAM, 2.4GHz dual core)
> and Windows XP (2GB RAM, 2GHz pentium) both with a Java5 SDK
> Reporter: Ruben Jimenez
> Attachments: Dup_files1.zip, Dup_files10.zip, Dup_files2.zip,
> Dup_files3.zip, Dup_files4.zip, Dup_files5.zip, Dup_files6.zip,
> Dup_files7.zip, Dup_files8.zip, Dup_files9.zip, leaksuspects.gif,
> leaksuspects2.gif, OQ_SOLR_00001.xml.zip, schema.xml, schema.xml.dup,
> solr.zip, solr256MBHeap.jpg
>
>
> While indexing a set of Solr XML files that each contain 5000 document adds
> and are about 30MB each, Solr 1.3 continually uses more and more
> memory until the heap is exhausted, while the same files are indexed without
> issue with Solr 1.2.
> Steps to reproduce:
> 1 - Download Solr 1.3
> 2 - Modify the example schema.xml to match the required fields
> 3 - Start the example server with the following command:
> java -Xms512m -Xmx1024m -XX:MaxPermSize=128m -jar start.jar
> 4 - Index the files as follows:
> java -Xmx128m -jar .../examples/exampledocs/post.jar *.xml
> The directory contains about 100 XML files of about 30MB each. While
> indexing, after about the 25th file, Solr 1.3 runs out of memory,
> while Solr 1.2 is able to index the entire set of files without any problems.
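
For anyone who wants to try the steps above without the attached files, a synthetic input file could be generated along these lines. This is only a sketch: the `<add>`/`<doc>`/`<field>` layout is the standard Solr XML update format, but the helper name `build_add_xml`, the field names (`id`, `attrN_s`), the field values, and the number of fields per document are all assumptions made for illustration and are not taken from the reporter's schema.xml or data.

```python
import xml.etree.ElementTree as ET


def build_add_xml(num_docs: int, fields_per_doc: int) -> bytes:
    """Build one Solr <add> message containing num_docs synthetic documents.

    Field names such as "attr0_s" are hypothetical; they assume the schema
    has a dynamic field rule that matches the "_s" suffix.
    """
    add = ET.Element("add")
    for i in range(num_docs):
        doc = ET.SubElement(add, "doc")
        id_field = ET.SubElement(doc, "field", name="id")
        id_field.text = f"doc-{i}"
        for j in range(fields_per_doc):
            field = ET.SubElement(doc, "field", name=f"attr{j}_s")
            field.text = f"value {i}-{j}"
    return ET.tostring(add, encoding="utf-8")


if __name__ == "__main__":
    # 5000 documents per file matches the issue description; 20 fields per
    # document and the output file name are arbitrary placeholders.
    with open("synthetic_00001.xml", "wb") as out:
        out.write(build_add_xml(num_docs=5000, fields_per_doc=20))
```

The resulting file will be much smaller than the 30MB files described in the issue unless the field count or value sizes are increased, so it approximates the shape of the input rather than its volume.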