[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731663#action_12731663 ]
Simon Willnauer commented on LUCENE-1566:
-----------------------------------------

bq. From your post above, I thought the "read in 100MB chunks into the big array" failed to work around the issue? Ie that's why you allocated a tiny intermediate buffer? Did that turn out not to be the case?

I verified that again with 1.5.0_08 as well as the latest JVM on 32-bit Linux. I have not run the tests on Windows yet, but will do so tomorrow. I haven't had time to install the JVMs (or at least one of them) listed in the Sun bug report. Reading in chunks into a big array is fine.

bq. How about, instead of ignoring the chunk size on 64bit jre, we conditionally default it to Long.MAX_VALUE?

Hmm, that would remove the "code duplication" - good point. The readBytes method seems to be performance-critical, but I guess there won't be any measurable impact from assigning one or two extra variables.

bq. Can you link also to the Sun VM bug in the javadocs for these set/getChunkSize?

Sure, I will also add a CHANGES.txt entry to the patch; will do as soon as I get into the office tomorrow.

> Large Lucene index can hit false OOM due to Sun JRE issue
> ---------------------------------------------------------
>
>                 Key: LUCENE-1566
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1566
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Michael McCandless
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1566.patch, LUCENE-1566.patch, LUCENE_1566_IndexInput.patch
>
>
> This is not a Lucene issue, but I want to open this so future Google diggers can more easily find it.
>
> There's this nasty bug in Sun's JRE:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6478546
>
> The gist seems to be: if you try to read a large (e.g. 200 MB) number of bytes during a single RandomAccessFile.read call, you can incorrectly hit OOM. Lucene does this with norms, since we read one byte per doc per field with norms, as a contiguous array of length maxDoc().
>
> The workaround was a custom patch to do large file reads as several smaller reads.
>
> Background here:
> http://www.nabble.com/problems-with-large-Lucene-index-td22347854.html
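For readers landing here later, below is a minimal sketch of the chunked-read workaround discussed above: instead of asking RandomAccessFile.read() for the whole buffer in one call (which can trigger the false OOM from Sun bug 6478546 on very large requests), the read is split into smaller pieces. The class, method, and constant names here are illustrative assumptions, not the actual Lucene FSDirectory/IndexInput code in the attached patches.

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class ChunkedReader {

        // Chunk size used for a single read() call; the comment above suggests
        // a 64-bit JRE could effectively disable chunking by defaulting the
        // configured chunk size to Long.MAX_VALUE.
        private static final int DEFAULT_CHUNK_SIZE = 100 * 1024 * 1024; // 100 MB

        /** Fills b[offset..offset+len) by issuing several smaller reads. */
        public static void readBytes(RandomAccessFile file, byte[] b, int offset, int len)
                throws IOException {
            int remaining = len;
            int pos = offset;
            while (remaining > 0) {
                // Never request more than the chunk size in one read() call.
                int toRead = Math.min(DEFAULT_CHUNK_SIZE, remaining);
                int read = file.read(b, pos, toRead);
                if (read == -1) {
                    throw new IOException("read past EOF");
                }
                pos += read;
                remaining -= read;
            }
        }
    }

The extra bookkeeping is just a couple of int assignments per loop iteration, which is why the performance impact on readBytes is expected to be negligible.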