[ https://issues.apache.org/jira/browse/LUCENE-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-2357:
---------------------------------

    Attachment: LUCENE-2357.patch

This patch should fix this issue. I replaced the {{int[]}} with an abstract {{MergeState.DocMap}} class which has two main implementations: a direct one, which maps each doc id straight to its new value, and another one, which counts the number of documents deleted so far to know by how much to decrement doc ids.

> Reduce transient RAM usage while merging by using packed ints array for docID re-mapping
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2357
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2357
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Michael McCandless
>            Priority: Minor
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-2357.patch
>
>
> We allocate this int[] to remap docIDs due to compaction of deleted ones.
> This uses a lot of RAM for large segment merges, and can fail to allocate due
> to fragmentation on 32-bit JREs.
> Now that we have packed ints, a simple fix would be to use a packed int
> array... and maybe instead of storing the absolute docID in the mapping, we
> could store the number of deleted docs seen so far (so the remap would do a
> lookup then a subtract). This may add some CPU cost to merging but should
> bring down transient RAM usage quite a bit.
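
For readers without the patch at hand, the two mapping strategies described above can be sketched roughly as follows. This is only an illustration, not the code from LUCENE-2357.patch: the class names {{DocMapSketch}}, {{DirectDocMap}} and {{DelCountDocMap}}, the {{boolean[] live}} input and the {{-1}} marker for deleted documents are all assumptions made for the example, and plain {{int[]}} arrays stand in for packed ints arrays.

```java
/** Hypothetical sketch of the two remapping strategies; not the actual Lucene classes. */
public class DocMapSketch {

    /** Maps an old doc id to its new doc id, or -1 if the document was deleted (assumed convention). */
    public abstract static class DocMap {
        public abstract int get(int docId);
    }

    /** Direct strategy: stores the new doc id for every old doc id. */
    public static final class DirectDocMap extends DocMap {
        private final int[] newIds; // stand-in for a packed ints array

        public DirectDocMap(boolean[] live) {
            newIds = new int[live.length];
            int next = 0;
            for (int i = 0; i < live.length; i++) {
                newIds[i] = live[i] ? next++ : -1;
            }
        }

        @Override
        public int get(int docId) {
            return newIds[docId];
        }
    }

    /** Counting strategy: stores how many deletions precede each doc, then subtracts on lookup. */
    public static final class DelCountDocMap extends DocMap {
        private final boolean[] live;
        private final int[] delsBefore; // values are bounded by the number of deleted docs

        public DelCountDocMap(boolean[] live) {
            this.live = live.clone();
            delsBefore = new int[live.length];
            int dels = 0;
            for (int i = 0; i < live.length; i++) {
                delsBefore[i] = dels;
                if (!live[i]) {
                    dels++;
                }
            }
        }

        @Override
        public int get(int docId) {
            // Lookup then subtract, as suggested in the issue description.
            return live[docId] ? docId - delsBefore[docId] : -1;
        }
    }

    public static void main(String[] args) {
        boolean[] live = {true, false, true, true, false, true};
        DocMap direct = new DirectDocMap(live);
        DocMap counting = new DelCountDocMap(live);
        for (int i = 0; i < live.length; i++) {
            System.out.println(i + " -> " + direct.get(i) + " / " + counting.get(i));
        }
    }
}
```

Both variants return the same mapping. The counting variant would only pay off with a packed representation, since the stored counts never exceed the number of deleted documents and so can need fewer bits per value than absolute doc ids when deletions are few.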