[ 
https://issues.apache.org/jira/browse/LUCENE-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595526#action_12595526
 ] 

Michael McCandless commented on LUCENE-510:
-------------------------------------------

Whoa, I think you are correct Mark!

On inspecting my changes here, I think the bulk merging of stored fields is to 
blame.  Specifically, when we bulk-merge the stored fields we fail to check 
whether the segments being merged are in the pre-UTF8 format.  As a result, 
that code bulk-copies stored fields in the older format into a file that 
claims to use the newer format.
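
To make the missing check concrete, here is a minimal sketch of the kind of 
guard described above.  It is not the actual Lucene code: the class name, the 
format constants and the chooseStrategy() helper are all hypothetical, and the 
only point is that a raw byte copy is safe only when the source segment's 
stored-fields format matches the format the merged file will claim.

```java
public class StoredFieldsMergeSketch {
    // Hypothetical format markers, for illustration only; the real
    // constants live in Lucene's stored-fields reader/writer code.
    static final int FORMAT_PRE_UTF8 = 0;
    static final int FORMAT_UTF8_LENGTH_IN_BYTES = 1;

    enum Strategy { BULK_COPY, PER_DOCUMENT }

    // Bulk copy moves raw bytes without decoding them, so it preserves
    // whatever encoding the source segment used.  That is only correct
    // when source and target formats agree; otherwise each document
    // must be decoded and re-encoded in the target format.
    static Strategy chooseStrategy(int sourceFormat, int targetFormat) {
        return sourceFormat == targetFormat
            ? Strategy.BULK_COPY
            : Strategy.PER_DOCUMENT;
    }

    public static void main(String[] args) {
        // A pre-UTF8 segment merged into a new-format file must not be bulk-copied.
        System.out.println(chooseStrategy(FORMAT_PRE_UTF8, FORMAT_UTF8_LENGTH_IN_BYTES));
        // Matching formats can still take the fast path.
        System.out.println(chooseStrategy(FORMAT_UTF8_LENGTH_IN_BYTES, FORMAT_UTF8_LENGTH_IN_BYTES));
    }
}
```

Under these assumptions, the bug amounts to always taking the BULK_COPY 
branch, regardless of the source segment's format.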

This only affects trunk, not 2.3.

Thanks for being such a brave early-adopter trunk tester, Mark.  And, sorry :(

> IndexOutput.writeString() should write length in bytes
> ------------------------------------------------------
>
>                 Key: LUCENE-510
>                 URL: https://issues.apache.org/jira/browse/LUCENE-510
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 2.1
>            Reporter: Doug Cutting
>            Assignee: Michael McCandless
>             Fix For: 2.4
>
>         Attachments: LUCENE-510.patch, LUCENE-510.take2.patch, 
> SortExternal.java, strings.diff, TestSortExternal.java
>
>
> We should change the format of strings written to indexes so that the length 
> of the string is in bytes, not Java characters.  This issue has been 
> discussed at:
> http://www.mail-archive.com/java-dev@lucene.apache.org/msg01970.html
> We must increment the file format number to indicate this change.  At least 
> the format number in the segments file should change.
> I'm targeting this for 2.1, i.e., we shouldn't commit it to trunk until 
> after 2.0 is released, to minimize incompatible changes between 1.9 and 2.0 
> (other than removal of deprecated features).
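
As a small illustration of the distinction in the issue description, the 
sketch below compares a string's length in Java characters with its length in 
encoded bytes.  It uses only the standard library, not Lucene's IndexOutput, 
and it assumes standard UTF-8 as the byte encoding; the class and method names 
are made up for the example.

```java
import java.nio.charset.StandardCharsets;

public class StringLengthSketch {
    // Length in Java chars, i.e. UTF-16 code units.
    static int lengthInChars(String s) {
        return s.length();
    }

    // Length in bytes once the string is encoded as UTF-8.
    static int lengthInUtf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s = "na\u00efve"; // "naive" with a diaeresis on the i
        // The non-ASCII character is one Java char but two UTF-8 bytes.
        System.out.println(lengthInChars(s) + " chars, "
            + lengthInUtf8Bytes(s) + " bytes"); // prints "5 chars, 6 bytes"
    }
}
```

The two counts agree for pure ASCII text and diverge as soon as a string 
contains any other character, which is why a length prefix has to state which 
unit it uses.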


