[
https://issues.apache.org/jira/browse/LUCENE-5969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-5969:
--------------------------------
Attachment: LUCENE-5969_part3.patch
Here is a patch for part 3. I think it's ready; we should close the issue after
this.
Other improvements can be separate issues from here.
Also after resolving this issue and backporting, we can do further cleanups in
trunk, and remove all the 4.x support in backwards-codecs and further cleanups
in SegmentInfos.
Patch finishes adding all safety (docvalues, terms, postings, commit points).
CodecUtil "segmentHeader" is renamed to "indexHeader", as it's used for all
index files (including commit points).
BlockTree doesn't "backdoor" via checkindex to return stats, there is a dead
simple API for this.
Norms sparse encoding is further improved with PATCHED strategy.
There is an API change for SegmentInfos for safety. Previously, reading was done
by instance methods that read into a "mutable" SIS:
{code}
SegmentInfos.read(Dir);
SegmentInfos.read(Dir, file);
{code}
These are now static methods that return a clean instance (named
readLatestCommit and readCommit respectively, so as not to be fragile on upgrade).
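To illustrate the shape of that change, here is a minimal sketch, not the code
from the patch. FakeDir and CommitInfos are hypothetical stand-ins for Directory
and SegmentInfos, and "latest" is simplified to the lexically last file name.
The point is only that each read is a static factory returning a fresh,
read-only instance rather than filling in an existing mutable object:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a directory of commit files (commit file name -> segment names).
// This is not Lucene's Directory.
final class FakeDir {
    final TreeMap<String, List<String>> commits = new TreeMap<>();
}

// Hypothetical stand-in for SegmentInfos; only the shape of the read API is modeled.
final class CommitInfos {
    private final String commitFile;
    private final List<String> segments;

    // Private constructor: instances come only from the static read methods.
    private CommitInfos(String commitFile, List<String> segments) {
        this.commitFile = commitFile;
        this.segments = Collections.unmodifiableList(new ArrayList<>(segments));
    }

    // Reads one named commit and returns a clean instance.
    static CommitInfos readCommit(FakeDir dir, String file) {
        List<String> segs = dir.commits.get(file);
        if (segs == null) {
            throw new IllegalArgumentException("no such commit: " + file);
        }
        return new CommitInfos(file, segs);
    }

    // Picks the most recent commit (simplified: lexically last name) and delegates.
    static CommitInfos readLatestCommit(FakeDir dir) {
        if (dir.commits.isEmpty()) {
            throw new IllegalStateException("no commits in directory");
        }
        return readCommit(dir, dir.commits.lastKey());
    }

    String commitFile() {
        return commitFile;
    }

    List<String> segments() {
        return segments;
    }
}
```

Since no existing object is reused, a failed read cannot leave a caller holding
half-populated state, and callers of the old names get a compile error instead
of silently different behavior.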
There is more to fix here. IMO SIS "tries to take on too much": mutable state
for IndexWriter, tracking of counters etc. for IndexWriter, and reading/writing
commits. It also tries to be "low level user-friendly" and publicly exposes too
many dangers. This is all for a heavily versioned, important file with
conditional logic. But that's a bigger problem.
> Add Lucene50Codec
> -----------------
>
> Key: LUCENE-5969
> URL: https://issues.apache.org/jira/browse/LUCENE-5969
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Fix For: 5.0, Trunk
>
> Attachments: LUCENE-5969.patch, LUCENE-5969.patch,
> LUCENE-5969_part2.patch, LUCENE-5969_part3.patch
>
>
> Spinoff from LUCENE-5952:
> * Fix .si to write Version as 3 ints, not a String that requires parsing at
> read time.
> * Lucene42TermVectorsFormat should not use the same codecName as
> Lucene41StoredFieldsFormat
> It would also be nice if we had a "bumpCodecVersion" script so rolling a new
> codec is not so daunting.