[ https://issues.apache.org/jira/browse/LUCENE-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-5992:
---------------------------------------
    Attachment: LUCENE-5992.patch

Here's a starting patch so the factions can start duking it out:

I put the readVersion/writeVersion methods inside CodecUtil.java.
Yes, it's not great that this means Version's ctor has to be public, but
in CodecUtil it's at least more apparent that these methods are
committing to a particular format, and that the format is really "scoped"
by the caller (meaning if we ever change which ints we write or how we
encode them, the caller's own format tracking must handle that
change).  I added spooky javadocs and comments to convey this
situation.

I'm also defensively writing 4 (not 3) ints, since it seems at least
possible we may once again do alpha/beta releases and it seems
safest not to preclude the possibility ... but some factions
want 3 ints instead.

Finally, I'm encoding using vInt, not unsigned byte.  I know this is
ridiculously defensive, and surely Lucene will never get to major or
minor or bugfix version 256 (it would be a strange world), but the
fact is that we are using Java's "int" for these instance vars in
Version.java... I can't remember whether any factions were
upset by this.
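
To make the trade-off concrete, here is a minimal, self-contained sketch of
writing four ints as vInts and reading them back.  It is NOT the code in the
attached patch: the class name VersionCodecSketch, the "prerelease" name for
the fourth int, and the use of plain java.io streams instead of Lucene's
DataOutput/DataInput are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch only; not the code in LUCENE-5992.patch. */
public class VersionCodecSketch {

  /** Writes an int 7 bits at a time, low-order bits first; the high bit
   *  of each byte signals that another byte follows. */
  static void writeVInt(OutputStream out, int i) throws IOException {
    while ((i & ~0x7F) != 0) {
      out.write((i & 0x7F) | 0x80);
      i >>>= 7;
    }
    out.write(i);
  }

  /** Reads an int written by writeVInt (at most 5 bytes). */
  static int readVInt(InputStream in) throws IOException {
    int value = 0;
    for (int shift = 0; shift < 35; shift += 7) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException("truncated vInt");
      }
      value |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return value;
      }
    }
    throw new IOException("vInt is too long");
  }

  /** Writes four ints; "prerelease" is an assumed name for the fourth. */
  static void writeVersion(OutputStream out, int major, int minor,
                           int bugfix, int prerelease) throws IOException {
    int[] parts = {major, minor, bugfix, prerelease};
    for (int part : parts) {
      if (part < 0) {
        throw new IllegalArgumentException("negative version part: " + part);
      }
      writeVInt(out, part);
    }
  }

  /** Reads the four ints back in the same order they were written. */
  static int[] readVersion(InputStream in) throws IOException {
    int[] parts = new int[4];
    for (int i = 0; i < parts.length; i++) {
      parts[i] = readVInt(in);
    }
    return parts;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    writeVersion(bytes, 5, 0, 0, 0);
    int[] v = readVersion(new ByteArrayInputStream(bytes.toByteArray()));
    // prints "4 bytes: 5.0.0.0"
    System.out.println(bytes.size() + " bytes: "
        + v[0] + "." + v[1] + "." + v[2] + "." + v[3]);
  }
}
```

With this encoding every value below 128 still costs exactly one byte, the
same as an unsigned byte would, so the defensiveness is free until a version
part reaches 128; a part of 256 would simply take two bytes.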

But as is, I (my faction) think the patch is ready!


> Version should not be encoded as a String in the index
> ------------------------------------------------------
>
>                 Key: LUCENE-5992
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5992
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, Trunk
>
>         Attachments: LUCENE-5992.patch
>
>
> The version is really "just" 3 (maybe 4) ints under-the-hood, but today we 
> write it as a String which then requires spooky string tokenization/parsing 
> when we open the index.  I think it should be encoded directly as ints.
> In LUCENE-5952 I had tried to make this change, but it was controversial, and 
> got booted.
> Then in LUCENE-5969, I tried again, but that issue has morphed (nicely!) into 
> fixing all sorts of things *except* these three ints.
> Maybe 3rd time's a charm ;)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

