[ 
https://issues.apache.org/jira/browse/LUCENE-5578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960525#comment-13960525
 ] 

Adrien Grand commented on LUCENE-5578:
--------------------------------------

I briefly discussed with Robert a way to catch such issues: check that the 
stored fields files are stable across merges (e.g. merge into a single segment 
twice and verify that the output is identical both times). We could run this 
test on every index format for which this property is expected (stored fields, 
term vectors, postings, ...).
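
To make the idea concrete, here is a minimal sketch of such a stability check. It is a toy model and not Lucene code: a "file" is just payload bytes followed by an 8-byte CRC32 footer, and the class and method names (MergeStabilitySketch, buggyMerge, fixedMerge, isStable) are made up for illustration. The buggy merge copies everything up to the file length, so the old footer ends up inside the data; the check notices because a second merge produces a different file.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.function.UnaryOperator;
import java.util.zip.CRC32;

// Toy model (assumption, not Lucene's file format):
// a file is payload bytes followed by an 8-byte CRC32 footer.
final class MergeStabilitySketch {
  static final int FOOTER_LENGTH = Long.BYTES;

  // Appends a checksum footer computed over the payload.
  static byte[] withFooter(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + FOOTER_LENGTH)
        .put(payload)
        .putLong(crc.getValue())
        .array();
  }

  // Buggy bulk copy: treats the file length as the end of the last chunk,
  // so the old footer is copied as if it were data.
  static byte[] buggyMerge(byte[] file) {
    return withFooter(Arrays.copyOfRange(file, 0, file.length));
  }

  // Fixed bulk copy: stops before the footer.
  static byte[] fixedMerge(byte[] file) {
    return withFooter(Arrays.copyOfRange(file, 0, file.length - FOOTER_LENGTH));
  }

  // Merges twice in a row and reports whether both outputs are byte-identical.
  static boolean isStable(UnaryOperator<byte[]> merge, byte[] file) {
    byte[] once = merge.apply(file);
    byte[] twice = merge.apply(once);
    return Arrays.equals(once, twice);
  }
}
```

With a file built by withFooter, isStable returns true for fixedMerge and false for buggyMerge, since each buggy merge grows the file by one footer.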

> Stored fields might accumulate checksums on merges
> --------------------------------------------------
>
>                 Key: LUCENE-5578
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5578
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Blocker
>             Fix For: 4.8
>
>         Attachments: LUCENE-5578.patch
>
>
> The bulk merge operation of our stored fields format is optimized to avoid 
> decompressing data when not needed. To find the end offset of the current 
> chunk, it either consults the stored fields index or, for the last chunk, 
> uses {{fieldsStream.length()}}.
> However, we just added checksums at the end of index files, so the merge might 
> copy the existing checksum along with the last chunk and then write a new 
> checksum after it.
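
The end-offset logic described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the actual patch: chunkStartPointers stands in for the stored fields index, and footerLength is a hypothetical parameter for the size of the trailing checksum footer.

```java
// Sketch only: names and parameters are assumptions, not Lucene's API.
final class ChunkEndOffsetSketch {

  // Returns the offset just past the given chunk.
  // chunkStartPointers[i] is the start offset of chunk i (from the index).
  static long endOffset(long[] chunkStartPointers, int chunk,
                        long fileLength, int footerLength) {
    if (chunk + 1 < chunkStartPointers.length) {
      // Not the last chunk: it ends where the next chunk starts.
      return chunkStartPointers[chunk + 1];
    }
    // Last chunk: the file length includes the checksum footer,
    // so the footer must be excluded from the copied range.
    return fileLength - footerLength;
  }
}
```

Using fileLength alone for the last chunk is what would let the old checksum be copied as chunk data.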



--
This message was sent by Atlassian JIRA
(v6.2#6252)
