[ https://issues.apache.org/jira/browse/LUCENE-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727595#action_12727595 ]
Simon Willnauer commented on LUCENE-1566:
-----------------------------------------

bq. Could we move the fix down into SimpleFSDir.FSIndexInput? We are only hitting it with norms today, but in the future if we read other large things from the index (eg with column-stride fields) we'd like the workaround to apply there as well.

Sure!

bq. Can we turn it on by default, only on 32 bit JREs?

Good point! I will try to do so.

bq. I think you can read directly into the norms byte[], instead of allocating a temp buffer, as long as you read in smaller chunks. From the email thread it looked like smaller was up to ~100 MB, which I think is fine to simply do by default. Maybe offer a setter (instead of static property) to turn this workaround on/off in SimpleFSDir.

I did hit the error when I tried that, but I will verify again.

Thanks for the comments; that's why I hacked it together and threw it out ASAP. Will look at it tomorrow.

simon

> Large Lucene index can hit false OOM due to Sun JRE issue
> ---------------------------------------------------------
>
>                 Key: LUCENE-1566
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1566
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1566.patch, LUCENE-1566.patch
>
>
> This is not a Lucene issue, but I want to open this so future google
> diggers can more easily find it.
> There's this nasty bug in Sun's JRE:
>   http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6478546
> The gist seems to be, if you try to read a large (eg 200 MB) number of
> bytes during a single RandomAccessFile.read call, you can incorrectly
> hit OOM.  Lucene does this, with norms, since we read in one byte per
> doc per field with norms, as a contiguous array of length maxDoc().
> The workaround was a custom patch to do large file reads as several
> smaller reads.
> Background here:
>   http://www.nabble.com/problems-with-large-Lucene-index-td22347854.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
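
For readers who find this issue later, the workaround described above (splitting one large RandomAccessFile.read into several smaller reads, directly into the destination byte[]) might be sketched roughly as follows. This is only an illustration and not the attached LUCENE-1566.patch: the class name ChunkedReader, the method readFully, and the DEFAULT_CHUNK_SIZE constant are hypothetical, and the 100 MB ceiling is simply taken from the "~100 MB" figure mentioned in the comment.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper, not part of Lucene: reads a large region of a file
// as a series of bounded reads instead of one huge read call.
final class ChunkedReader {

    // Assumed ceiling per read call, based on the ~100 MB figure in the thread.
    static final int DEFAULT_CHUNK_SIZE = 100 * 1024 * 1024;

    private ChunkedReader() {}

    // Fills dest[offset .. offset+length) from the file's current position,
    // never asking the JRE for more than chunkSize bytes in a single call.
    static void readFully(RandomAccessFile file, byte[] dest, int offset,
                          int length, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        if (offset < 0 || length < 0 || length > dest.length - offset) {
            throw new IndexOutOfBoundsException("bad offset/length");
        }
        int done = 0;
        while (done < length) {
            int toRead = Math.min(chunkSize, length - done);
            // read() may return fewer bytes than requested, so advance by
            // the count it actually reports.
            int n = file.read(dest, offset + done, toRead);
            if (n == -1) {
                throw new EOFException("hit EOF after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }
}
```

A caller loading norms would pass the norms array itself as dest, for example ChunkedReader.readFully(file, norms, 0, norms.length, ChunkedReader.DEFAULT_CHUNK_SIZE), so no temporary buffer is allocated.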