[ 
https://issues.apache.org/jira/browse/LUCENE-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158842#comment-13158842
 ] 

Martin Oberhuber commented on LUCENE-3607:
------------------------------------------

Since I was asked, here is my scenario / workflow:

- We're building an Eclipse-based application that contains online help.
- The build process generates a Lucene index and ships it with our application.
- 2 builds before the final RTO we detect that a critical defect needs to be 
fixed. Of course, that close to release we want to change as few binary bits 
as possible.
- But even though we only touched a single character in some unrelated config 
file, a large number of our modules change because of the modified Lucene 
index.

==> This behavior makes it hard to create and deploy incremental patches, since 
after a rebuild (with minimal change) it's unclear which changes in the output 
are legitimate.

Explained in different terms, if I'm building on the same host, same OS, same 
JVM, same source files ... I expect to get the same output.
This is much like rebuilding a C++ EXE from source and then stripping it, which 
is expected to produce identical binary bits.
That's a requirement in many safety-critical areas.
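
As a rough illustration of the check implied here, the output of two builds 
could be compared file by file using digests. This is only a sketch in Python; 
it is not part of Lucene or the Eclipse build, and the function names 
digest_tree and changed_files are made up for this example.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file path (relative to root) to the SHA-256 hex digest of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(build_a, build_b):
    """Return the sorted relative paths that differ or exist in only one build."""
    a = digest_tree(build_a)
    b = digest_tree(build_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

With a reproducible index, changed_files on two builds of identical sources 
would be expected to return an empty list; with timestamps embedded, the index 
files show up as changed on every rebuild.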

So, regarding "changing something that works for no gain" ... I'd like to 
assert that there _is_ tangible gain in an index that's binary reproducible.
If anything, I don't see the advantage of an index that has the timestamp of 
its creation embedded.
                
> Lucene Index files can not be reproduced faithfully (due to timestamps 
> embedded)
> --------------------------------------------------------------------------------
>
>                 Key: LUCENE-3607
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3607
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 2.9.1
>         Environment: Eclipse 3.7
>            Reporter: Martin Oberhuber
>            Assignee: Michael McCandless
>
> Eclipse 3.7 uses Lucene 2.9.1 for indexing online help content. A 
> pre-generated help index can be shipped together with online content. As per
>    https://bugs.eclipse.org/bugs/show_bug.cgi?id=364979
> it turns out that the help index cannot be faithfully reproduced during a 
> build, because there are timestamps embedded in the index files, and the 
> "NameCounter" field in segments_2 contains different contents on every build.
> Not being able to faithfully reproduce the index from identical source bits 
> undermines trust in the index (and software delivery) being correct.
> I'm wondering whether this is a known issue and/or has already been addressed 
> in a newer Lucene version?
