Are there a huge number of distinct fields in those XML docs? That can make
for a lot of norms, which eat a lot of RAM...
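To put a rough number on that, here is a back-of-the-envelope sketch. It assumes Lucene keeps one norm byte per document for every indexed field that has norms enabled, whether or not a given document contains that field, so treat the result as a worst case. The document counts come from the report; the field counts are made-up values for illustration only.

```java
// Worst-case norms estimate. Assumption: one norm byte per document for
// every indexed field with norms enabled, even for documents that do not
// contain that field.
public class NormsEstimate {

    /** Estimated heap bytes for norms: one byte per (field, document) pair. */
    static long estimateNormsBytes(long fieldsWithNorms, long maxDoc) {
        return fieldsWithNorms * maxDoc;
    }

    public static void main(String[] args) {
        long docsPerFile = 5000;   // from the report
        long filesIndexed = 25;    // roughly where the heap runs out
        long maxDoc = docsPerFile * filesIndexed;

        // Hypothetical counts of distinct fields; not taken from the report.
        long[] fieldCounts = {100, 1000, 5000, 10000};
        for (long fields : fieldCounts) {
            long bytes = estimateNormsBytes(fields, maxDoc);
            System.out.printf("%d fields x %d docs -> ~%d MB of norms%n",
                    fields, maxDoc, bytes / (1024 * 1024));
        }
    }
}
```

With 125,000 documents, a hypothetical 5,000 distinct fields would already come to roughly 600 MB, which is in the same range as the 1024 MB heap used in the report.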
Ruben Jimenez (JIRA) wrote:
[ https://issues.apache.org/jira/browse/SOLR-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649374#action_12649374 ]
Ruben Jimenez commented on SOLR-857:
------------------------------------
Sorry I've been sick all week and am just getting back on my feet today.
I haven't been able to reproduce the issue by duplicating a single file and modifying it to allowDups. I've also tried duplicating a random set of about 3 files.
By looking through the heap dump I noticed that the "$Norms" memory usage was
pretty high, so I modified the schema to set omitNorms on the text field. This helped, but
only delayed the problem.
As of right now we believe that the issue might be related to the fact that we are creating a large number of dynamic fields, hence the large number of FieldInfo instances in the heap dump. I'll try to find the minimum number of files required to reproduce the issue, but I have a feeling it will be on the order of 23 or so.
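One way to test the dynamic-field theory without the original data would be to generate synthetic add files in which every document introduces new field names. The sketch below is only an illustration: the class name, the `attr_..._s` field names, and the assumption that the schema has a `*_s` dynamicField rule are all hypothetical and not taken from the attached schema.xml.

```java
// Builds a Solr <add> block in which every document uses its own,
// previously unseen field names, to stress the number of distinct fields.
public class DynamicFieldDocs {

    static String buildAdd(int docs, int fieldsPerDoc) {
        StringBuilder sb = new StringBuilder();
        sb.append("<add>\n");
        for (int d = 0; d < docs; d++) {
            sb.append("  <doc>\n");
            sb.append("    <field name=\"id\">doc").append(d).append("</field>\n");
            for (int f = 0; f < fieldsPerDoc; f++) {
                // Hypothetical name; assumes a "*_s" dynamicField in the schema.
                String name = "attr_" + d + "_" + f + "_s";
                sb.append("    <field name=\"").append(name)
                  .append("\">value</field>\n");
            }
            sb.append("  </doc>\n");
        }
        sb.append("</add>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Small sample; scale docs and fieldsPerDoc up for a real test.
        System.out.print(buildAdd(2, 3));
    }
}
```

Redirecting the output to a file and posting it the same way as the original files should show whether heap usage tracks the number of distinct field names rather than the size of the documents.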
Memory Leak during the indexing of large xml files
--------------------------------------------------
Key: SOLR-857
URL: https://issues.apache.org/jira/browse/SOLR-857
Project: Solr
Issue Type: Bug
Affects Versions: 1.3
Environment: Verified on Ubuntu 8.04 (1.7GB RAM, 2.4GHz dual core) and
Windows XP (2GB RAM, 2GHz Pentium), both with a Java 5 SDK
Reporter: Ruben Jimenez
Attachments: OQ_SOLR_00001.xml.zip, schema.xml, solr256MBHeap.jpg
While indexing a set of SOLR xml files that contain 5000 document adds within
them and are about 30MB each, SOLR 1.3 seems to continually use more and more
memory until the heap is exhausted, while the same files are indexed without
issue with SOLR 1.2.
Steps used to reproduce.
1 - Download SOLR 1.3
2 - Modify example schema.xml to match fields required
3 - Start the example server with the following command:
    java -Xms512m -Xmx1024m -XX:MaxPermSize=128m -jar start.jar
4 - Index the files as follows:
    java -Xmx128m -jar .../examples/exampledocs/post.jar *.xml
The directory with the XML files contains about 100 files of about 30MB each.
While indexing after about the 25th file SOLR 1.3 runs out of memory, while
SOLR 1.2 is able to index the entire set of files without any problems.