[ 
https://issues.apache.org/jira/browse/LUCENE-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008764#comment-13008764
 ] 

Uwe Schindler commented on LUCENE-2975:
---------------------------------------

Before I commit this, I want to summarize everything Robert and I found out
yesterday (in fact, Robert did the debugging work; I just had the idea for
how to solve it).

The issue does not really have anything to do with MMap, and it is present in
the Hotspot VM even before 1.6.0_22 (I found the bug in our server logs for an
IndexReader.document() call and was not able to reproduce it; that was
1.6.0_21). So this bug also affects previous Lucene versions, *BUT*:

The bug only happens if the IndexInput.readByte() method is inlined by Hotspot
and we are *not* using BufferedIndexInput (which has its own VInt impl). Lucene
3.0 and earlier had a very complicated readByte() method in MMap, which Hotspot
never inlined. But since a performance update in
MMapIndexInput/MultiMMapIndexInput, the readByte() method has become a
three-liner that simply delegates to the ByteBuffer's get() and catches an
exception to fall back to another impl (for changing the buffer slice in
MultiMMap). So most calls are simply calls to NIO's get(), which may be an
intrinsic or whatever (we don't know what Hotspot does there). Sun optimizes
NIO a lot!
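
To make the shape of such a readByte() concrete, here is a minimal sketch. It
is not the actual Lucene code: the class name MultiBufferInputSketch, its
fields, and the IllegalStateException at end of input are assumptions made up
for illustration. It only shows the pattern described above: a fast path that
is a single relative ByteBuffer.get(), and a slow path in the catch block that
switches to the next buffer slice.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a multi-buffer mmap input; NOT Lucene's real class.
final class MultiBufferInputSketch {
    private final ByteBuffer[] buffers; // one slice per mapped chunk (assumed layout)
    private int curBufIndex = 0;
    private ByteBuffer curBuf;

    MultiBufferInputSketch(ByteBuffer... buffers) {
        this.buffers = buffers;
        this.curBuf = buffers[0];
    }

    // Fast path: one relative get(); tiny, so the JIT is likely to inline it.
    byte readByte() {
        try {
            return curBuf.get();
        } catch (BufferUnderflowException e) {
            // Slow path: the current slice is exhausted, move to the next
            // non-empty slice and read from its start.
            do {
                curBufIndex++;
                if (curBufIndex >= buffers.length) {
                    throw new IllegalStateException("read past EOF");
                }
                curBuf = buffers[curBufIndex];
                curBuf.position(0);
            } while (!curBuf.hasRemaining());
            return curBuf.get();
        }
    }
}
```

With two wrapped arrays {1, 2} and {3}, three readByte() calls walk across the
slice boundary; a fourth call throws because no further slice exists.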

This leads to a problem with the loop inside readVInt; all other methods in
IndexInput already have unrolled loops, so readLong is not a "for (i=0; i<64;
i+=8)" loop; it is coded with all shifts precalculated. To solve the bug, I did
the same for readVInt and readVLong.
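
As an illustration of what "unrolled" means here, the sketch below decodes a
VInt (7 payload bits per byte, low-order group first, high bit set when more
bytes follow) once with a loop and once with all shifts precalculated. This is
not the committed patch: the class VIntSketch and its byte[]-backed readByte()
are assumptions for the example, standing in for an IndexInput.

```java
// Hypothetical helper for illustration; reads from a byte[] rather than an
// IndexInput. NOT the code from the attached patch.
final class VIntSketch {
    private final byte[] data;
    private int pos;

    VIntSketch(byte[] data) {
        this.data = data;
    }

    byte readByte() {
        return data[pos++];
    }

    // Loop form: keep reading while the continuation bit (0x80) is set.
    int readVIntLoop() {
        byte b = readByte();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = readByte();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    // Unrolled form: same decoding for well-formed input, but no loop;
    // an int needs at most five bytes, so there are five explicit steps.
    int readVIntUnrolled() {
        byte b = readByte();
        int i = b & 0x7F;
        if ((b & 0x80) == 0) return i;
        b = readByte();
        i |= (b & 0x7F) << 7;
        if ((b & 0x80) == 0) return i;
        b = readByte();
        i |= (b & 0x7F) << 14;
        if ((b & 0x80) == 0) return i;
        b = readByte();
        i |= (b & 0x7F) << 21;
        if ((b & 0x80) == 0) return i;
        b = readByte();
        // Fifth byte: only its low 4 bits fit into a 32-bit int.
        i |= (b & 0x7F) << 28;
        return i;
    }
}
```

For example, the two bytes 0xAC 0x02 decode to 300 in both variants. The
unrolled variant simply stops after the fifth byte, whereas the loop would keep
reading if a malformed fifth byte still had its continuation bit set.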

We don't know which bug in Hotspot is the real cause of this, but Dawid Weiss's
bug looks very similar, especially since one of our tests also ran into an
endless loop (Robert saw it, but it is not confirmed). There are a lot of
Hotspot bugs related to loops, so one of them hit us here.

> hotspot bug in readvint gives wrong results
> -------------------------------------------
>
>                 Key: LUCENE-2975
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2975
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.1
>            Reporter: Uwe Schindler
>            Priority: Blocker
>             Fix For: 3.1
>
>         Attachments: LUCENE-2975.patch, LUCENE-2975.patch, LUCENE-2975.patch, 
> LUCENE-2975.patch, perf.png
>
>
> When testing the 3.1-RC1 made by Yonik on the PANGAEA (www.pangaea.de)
> production system, I figured out that on a large segment (about 5 GiB) some
> stored fields suddenly produce a strange deflate decompression problem
> (CompressionTools), although the stored fields are no longer pre-3.0
> compressed. It seems that the header of the stored field is read incorrectly
> at the buffer boundary in MultiMMapDir, and FieldsReader then incorrectly
> detects a deflate-compressed field (CompressionTools).
> The error occurs reproducibly on CheckIndex with MMapDirectory, but not with
> NIODir or SimpleDir. The FDT file of that segment is 2.6 GiB; on Solaris the
> chunk size is Integer.MAX_VALUE, so we have 2 MultiMMap IndexInputs.
> Robert and I have the index ready as a tar file; we will do tests on our
> local machines and hopefully solve the bug, which was maybe introduced by
> Robert's recent changes to MMap.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
