[ https://issues.apache.org/jira/browse/LUCENE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273927#comment-13273927 ]
Robert Muir commented on LUCENE-4050:
-------------------------------------
{quote}
In fact Lucene used to use rename to commit the segments file but this
proved problematic on Windows (sometimes the rename would hit "access
denied" error).
{quote}
Well, problematic at least once, right? I don't think that justifies doing
things in a strange way.
Surely this is just some problem specific to Windows 3.1 and Java 1.2 or
something, and fixed by now, since this is how every other Linux/Cygwin program
(e.g. vi) works.
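
For reference, the write-then-rename commit pattern under discussion can be sketched roughly as below. This is only an illustration using the Java 7 java.nio.file API, not the code Lucene used at the time; the class name RenameCommit, the ".tmp" suffix, and the file name segments_1 are made up for the example. It assumes each commit gets a fresh target name, and it leaves out fsync, so durability is not covered.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Lucene.
class RenameCommit {

    // Write the payload to a temporary file, then rename it into place,
    // so the final name only ever refers to a completely written file.
    static void commit(Path dir, String finalName, byte[] payload) throws IOException {
        Path tmp = dir.resolve(finalName + ".tmp"); // ".tmp" suffix is an assumption
        Files.write(tmp, payload);
        // Throws AtomicMoveNotSupportedException if the file system cannot
        // perform the move atomically. Other IOExceptions are possible too,
        // e.g. the kind of "access denied" failure mentioned in the quote.
        Files.move(tmp, dir.resolve(finalName), StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("commit-demo");
        commit(dir, "segments_1", "example commit data".getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.exists(dir.resolve("segments_1")));
    }
}
```

Whether the rename can fail while another process holds the file open depends on the platform and file system, which is exactly the point being argued here.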
> Change SegmentInfos format to plain text
> ----------------------------------------
>
> Key: LUCENE-4050
> URL: https://issues.apache.org/jira/browse/LUCENE-4050
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/codecs
> Reporter: Andrzej Bialecki
> Fix For: 4.0
>
>
> I propose to change the format of the SegmentInfos file (segments_NN) to use
> plain text instead of the current binary format.
> The SegmentInfos file represents a commit point, and it also declares which
> codecs were used to write each of the segments that make up the commit point.
> However, this is a chicken and egg situation - in theory the format of this
> file is customizable via Codec.getSegmentInfosFormat, but in practice we have
> to first discover which codec implementation wrote this file - so
> the SegmentCoreReaders assumes a fixed binary layout for the preamble of
> this file, which contains the codec name... and then the file is read again,
> only this time using the right Codec.
> This is ugly. Instead I propose to use a simple plain text format, either
> line oriented properties or JSON, in such a way that newer versions could
> easily extend it, and which wouldn't require any special Codec to read and
> parse. Consequently we could remove SegmentInfosFormat altogether, and
> instead add SegmentInfoFormat (notice the singular) to Codec to read single
> per-segment SegmentInfo-s in a codec-specific way. E.g. for the Lucene40 codec
> we could either add another file or we could extend the .fnm file (FieldInfos)
> to also contain this information.
> Then the plain text SegmentInfos would contain just the following information:
> * list of global files for this commit point (if any)
> * list of segments for this commit point, and their corresponding codec class
> names
> * user data map
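
To make the proposal above concrete, here is a rough sketch of what a line-oriented properties variant might look like and how it could be read without any Codec. Everything in it is an assumption for illustration: the key names (files, segments, segment.<name>.codec, userData.<key>), the class PlainTextSegmentInfos, and the sample values are not taken from the issue or from Lucene. It uses only java.util.Properties from the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical reader for a plain text commit point; not a Lucene class.
class PlainTextSegmentInfos {
    final List<String> globalFiles = new ArrayList<>();
    // segment name -> codec class name, in the order listed under "segments"
    final Map<String, String> segmentCodecs = new LinkedHashMap<>();
    final Map<String, String> userData = new LinkedHashMap<>();

    static PlainTextSegmentInfos parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        PlainTextSegmentInfos infos = new PlainTextSegmentInfos();

        // Global files for this commit point (may be empty).
        for (String f : props.getProperty("files", "").split(",")) {
            if (!f.trim().isEmpty()) infos.globalFiles.add(f.trim());
        }
        // Segments are listed explicitly because Properties does not keep order.
        for (String s : props.getProperty("segments", "").split(",")) {
            String name = s.trim();
            if (name.isEmpty()) continue;
            String codec = props.getProperty("segment." + name + ".codec");
            if (codec == null) throw new IOException("no codec declared for segment " + name);
            infos.segmentCodecs.put(name, codec);
        }
        // Everything under "userData." goes into the user data map.
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("userData.")) {
                infos.userData.put(key.substring("userData.".length()), props.getProperty(key));
            }
        }
        return infos;
    }

    public static void main(String[] args) throws IOException {
        String text =
            "files=\n" +
            "segments=_0,_1\n" +
            "segment._0.codec=org.apache.lucene.codecs.lucene40.Lucene40Codec\n" +
            "segment._1.codec=org.apache.lucene.codecs.simpletext.SimpleTextCodec\n" +
            "userData.source=example\n";
        PlainTextSegmentInfos infos = parse(text);
        System.out.println(infos.segmentCodecs.keySet());
        System.out.println(infos.userData);
    }
}
```

With a layout along these lines, a reader would learn each segment's codec class name from the text file first and only then hand the per-segment details to that codec, which is the two-step flow the proposal describes.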