[ 
https://issues.apache.org/jira/browse/LUCENE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087963#comment-13087963
 ] 

Andrzej Bialecki  commented on LUCENE-3218:
-------------------------------------------

bq. I think append-only filesystems (e.g. HDFS) can make their own impl that uses 
the file length instead (like AppendingCodec).

AppendingCodec solves only one issue, that of postings and SegmentInfos. I'm 
worried that adding seek+rewrite tricks in other places that are not under the 
control of a Codec or of any other configurable implementation (such as CFS) 
will ultimately prevent the efficient use of Lucene on Hadoop, unless we put 
those places under the control of a Codec (or some other configurable 
interface).
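
To illustrate the "use the file length instead" idea from the quote above, 
here is a minimal Java sketch of an append-only container. It is only an 
illustration under stated assumptions: the class AppendOnlyCompound, its 
methods and the byte layout are hypothetical, and this is not Lucene's actual 
CFS or AppendingCodec format. The writer never seeks back to patch a header; 
it appends the entry table after the data and ends with a fixed-size trailer, 
so a reader can locate the table from the file length alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical append-only container; NOT Lucene's CFS or AppendingCodec format. */
public class AppendOnlyCompound {

    /** Appends all sub-files, then an entry table, then an 8-byte trailer. Never seeks. */
    public static byte[] write(byte[][] subFiles) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(sink);
        long[] offsets = new long[subFiles.length];
        for (int i = 0; i < subFiles.length; i++) {
            offsets[i] = out.size();          // bytes written so far = start of sub-file i
            out.write(subFiles[i]);
        }
        long tableOffset = out.size();        // known only after all data is written
        out.writeInt(subFiles.length);
        for (int i = 0; i < subFiles.length; i++) {
            out.writeLong(offsets[i]);
            out.writeLong(subFiles[i].length);
        }
        out.writeLong(tableOffset);           // trailer: the last 8 bytes of the file
        out.flush();
        return sink.toByteArray();
    }

    /** Finds the entry table via the file length, then returns a copy of sub-file {@code index}. */
    public static byte[] read(byte[] file, int index) {
        ByteBuffer buf = ByteBuffer.wrap(file);
        int tableOffset = (int) buf.getLong(file.length - Long.BYTES);
        int count = buf.getInt(tableOffset);
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("no sub-file " + index);
        }
        int entry = tableOffset + Integer.BYTES + index * 2 * Long.BYTES;
        int offset = (int) buf.getLong(entry);
        int length = (int) buf.getLong(entry + Long.BYTES);
        byte[] result = new byte[length];
        System.arraycopy(file, offset, result, 0, length);
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[][] subFiles = {
            "stored fields".getBytes(StandardCharsets.UTF_8),
            "postings".getBytes(StandardCharsets.UTF_8),
        };
        byte[] file = write(subFiles);
        System.out.println(new String(read(file, 1), StandardCharsets.UTF_8));
    }
}
```

A seek+rewrite layout would instead reserve a slot at the start of the file 
and go back to fill in the table offset once it is known, which is exactly 
the step an append-only filesystem cannot perform.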

> Make CFS appendable  
> ---------------------
>
>                 Key: LUCENE-3218
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3218
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 3.4, 4.0
>
>         Attachments: LUCENE-3218.patch, LUCENE-3218.patch, LUCENE-3218.patch, 
> LUCENE-3218.patch, LUCENE-3218_3x.patch, LUCENE-3218_test_fix.patch, 
> LUCENE-3218_tests.patch
>
>
> Currently CFS is created once all files are written during a flush / merge. 
> Once on disk, the files are copied into the CFS format, which is basically 
> unnecessary for some of the files. We can at any time write at least one file 
> directly into the CFS, which can save a reasonable amount of IO. For instance, 
> stored fields could be written directly during indexing, and during a Codec 
> flush one of the written files can be appended directly. This optimization is 
> a nice side effect for Lucene indexing itself, but more importantly, for 
> DocValues and LUCENE-3216 we could transparently pack per-field files into a 
> single file only for docvalues without changing any code once LUCENE-3216 is 
> resolved.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
