[ 
https://issues.apache.org/jira/browse/LUCENE-2491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876200#action_12876200
 ] 

Michael McCandless commented on LUCENE-2491:
--------------------------------------------

This sounds great!

I've wanted to let Codecs store stuff into each SegmentInfo (e.g., the hasProx 
boolean really ought not to be a core thing but a Codec-private flag instead).  
Maybe this is a way to do that...

The only odd thing is... Codec is per-segment now.  Every segment is free to 
have a different Codec (even within a single session of IW).  So having Codec 
write the segments file doesn't really "fit"; I guess CodecProvider could do so?

Multiple segments files can exist in the index at a time; the requirement would 
then be that the current CodecProvider must always be able to read all segments 
files written by past CodecProviders.
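
To make that read-compatibility requirement concrete, here is a minimal sketch in plain Java. It is not Lucene's API: the class SegmentsFormatRegistry, the one-byte format id at the start of the file, and the reader functions are all hypothetical, chosen only to show a provider dispatching to whichever reader matches the format a past writer used.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a provider that must read every segments-file
// format ever written. Nothing here is a real Lucene class.
final class SegmentsFormatRegistry {
    // Maps a format id (assumed to be the first byte of the file) to a reader.
    private final Map<Integer, Function<byte[], String>> readers = new HashMap<>();

    void register(int formatId, Function<byte[], String> reader) {
        readers.put(formatId, reader);
    }

    // Picks the reader from the leading format id; fails loudly if the
    // current provider does not know a format written in the past.
    String read(byte[] file) {
        if (file.length == 0) {
            throw new IllegalArgumentException("empty segments file");
        }
        int formatId = file[0];
        Function<byte[], String> reader = readers.get(formatId);
        if (reader == null) {
            throw new IllegalStateException("no reader for format " + formatId);
        }
        return reader.apply(file);
    }
}
```

The point of the sketch is only that old readers stay registered after a new writer is introduced, so segments files from earlier sessions remain readable.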

We could alternatively make it an option for IW to use a normal IndexOutput 
when writing segments files (skipping the checksum).

Once you remove the checksum for HDFS, how will you ensure the written 
segments file is consistent?  Or is a partially written segments file (e.g., 
after an OS crash or power loss on "ordinary" filesystems) never an issue with 
HDFS?
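
One possible way to keep a consistency check without seek() is to stream the payload in order and append the checksum as a trailing footer. The sketch below uses only java.util.zip.CRC32 and in-memory byte arrays. The class AppendOnlyChecksum and its 8-byte big-endian footer layout are assumptions made for illustration, not what SegmentInfos or ChecksumIndexOutput actually do.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical append-only layout: [payload bytes][8-byte CRC32 footer].
// Every write is sequential, so no seek() is needed.
final class AppendOnlyChecksum {
    static byte[] write(byte[] payload) {
        CRC32 crc = new CRC32();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stream the payload, updating the checksum over the same bytes.
        sink.write(payload, 0, payload.length);
        crc.update(payload, 0, payload.length);
        // Append the checksum last, as a big-endian long.
        byte[] footer = ByteBuffer.allocate(Long.BYTES).putLong(crc.getValue()).array();
        sink.write(footer, 0, footer.length);
        return sink.toByteArray();
    }

    // Returns true only if the trailing footer matches the payload before it.
    // A file cut short by a crash fails this check.
    static boolean isComplete(byte[] file) {
        if (file.length < Long.BYTES) {
            return false;
        }
        int payloadLen = file.length - Long.BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, payloadLen);
        long stored = ByteBuffer.wrap(file, payloadLen, Long.BYTES).getLong();
        return stored == crc.getValue();
    }
}
```

A reader would call isComplete before trusting a segments file and fall back to an older generation when the check fails.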

> Extend Codec with a SegmentInfos writer / reader
> ------------------------------------------------
>
>                 Key: LUCENE-2491
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2491
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 4.0
>            Reporter: Andrzej Bialecki 
>
> I'm trying to implement a Codec that works with append-only filesystems 
> (HDFS). It's _almost_ done, except for the SegmentInfos.write(dir), which 
> uses ChecksumIndexOutput, which in turn uses IndexOutput.seek() - and seek is 
> not supported on append-only output. I propose to extend the Codec interface 
> to encapsulate also the details of SegmentInfos writing / reading. Patch to 
> follow after some feedback ;)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


