[ 
https://issues.apache.org/jira/browse/LUCENE-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5578:
---------------------------------

    Attachment: LUCENE-5578.patch

Here is a patch:
 - as suggested by Uwe, the stored fields index file now records the maximum 
file pointer used for stored fields data, so the end of the last chunk no 
longer depends on the file length,
 - I added tests to our main index formats to make sure that they don't 
accumulate stale data when doing bulk merges.

Our term vectors format (that is quite similar to stored fields) didn't have 
this bug because it disables bulk merging on the last chunk of a segment.
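
To illustrate the idea, here is a minimal, self-contained sketch. It is not 
the code from the patch and does not use Lucene's API: the class and method 
names are hypothetical, and the "file" is a toy byte array made of chunk data 
followed by an 8-byte CRC32 footer, which is an assumption for the example 
rather than the actual file layout. It compares a bulk copy that stops at the 
file length with one that stops at a recorded maximum data pointer.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Toy model only: chunk bytes followed by an 8-byte CRC32 footer. */
final class BulkCopySketch {
    static final int FOOTER_LENGTH = Long.BYTES;

    /** Returns the payload with a CRC32 of the payload appended as a footer. */
    static byte[] withFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + FOOTER_LENGTH)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /** Buggy variant: assumes the last chunk ends at the file length. */
    static byte[] mergeUsingFileLength(byte[] file, int lastChunkStart) {
        // Copies the old footer along with the chunk data.
        byte[] copied = Arrays.copyOfRange(file, lastChunkStart, file.length);
        return withFooter(copied);
    }

    /** Fixed variant: the last chunk ends at the recorded max data pointer. */
    static byte[] mergeUsingMaxPointer(byte[] file, int lastChunkStart, int maxPointer) {
        // Stops before the old footer, so only chunk data is copied.
        byte[] copied = Arrays.copyOfRange(file, lastChunkStart, maxPointer);
        return withFooter(copied);
    }

    public static void main(String[] args) {
        byte[] chunks = new byte[100];          // stand-in for compressed chunk data
        byte[] file = withFooter(chunks);       // 100 data bytes + 8 footer bytes
        int maxPointer = chunks.length;         // what the index would record

        byte[] buggy = mergeUsingFileLength(file, 0);
        byte[] fixed = mergeUsingMaxPointer(file, 0, maxPointer);

        System.out.println("original: " + file.length);
        System.out.println("merged via file length: " + buggy.length);
        System.out.println("merged via max pointer: " + fixed.length);
    }
}
```

In this toy model every merge that goes through the file-length variant grows 
the output by one stale footer, while the max-pointer variant reproduces the 
input byte for byte.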

> Stored fields might accumulate checksums on merges
> --------------------------------------------------
>
>                 Key: LUCENE-5578
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5578
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Blocker
>             Fix For: 4.8
>
>         Attachments: LUCENE-5578.patch, LUCENE-5578.patch
>
>
> The bulk merge operation of our stored fields format is optimized in order to 
> avoid decompressing data when not needed. In order to know the offset of the 
> end of the current block, it either consults the stored fields index, or uses 
> {{fieldsStream.length()}} for the last chunk.
> However, we just added checksums at the end of index files, so it might 
> currently copy the current checksum in addition to the last chunk, and then 
> write a new checksum.


