[ https://issues.apache.org/jira/browse/LUCENE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052222#comment-13052222 ]
Michael McCandless commented on LUCENE-3218:
--------------------------------------------

Patch looks cool!

So the CFW will take the first output opened against it and let it write directly into the "actual" CFS file, and then if another file is opened while that first one is still open, the 2nd file will write to a separate file and then will copy in on close.

We may want to delegate the separate files too? So that on close they copy themselves into the CFS and remove the original? This way IW won't have to separately create the CFS in the end.

Somehow we need IW to add the biggest sub-file first...

s/compund/compound

CFW.close should assert currentOutput != null (and, if we delegate sep entries, that they are also all closed)?

You might need to sync the CompoundFileWriter.this.currentOutput test / setting to null? Though... Lucene is always single threaded in writing files for the same segment, today anyway.

Can we make a separate createCompoundOutput? (Ie, instead of passing OpenMode to openCompoundInput).

And: I'm assuming a given compound output can only be opened once, appended to / separate files copied into, closed and then never opened again for writing? (Ie, still "write once" at the file level).

> Make CFS appendable
> -------------------
>
>                 Key: LUCENE-3218
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3218
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/index
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3218.patch
>
>
> Currently CFS is created once all files are written during a flush / merge.
> Once on disk the files are copied into the CFS format, which is basically
> unnecessary for some of the files. We can at any time write at least one file
> directly into the CFS, which can save a reasonable amount of IO.
> For instance, stored fields could be written directly during indexing, and
> during a Codec Flush one of the written files can be appended directly. This
> optimization is a nice side effect for Lucene indexing itself, but more
> important for DocValues and LUCENE-3216: we could transparently pack per-field
> files into a single file only for docvalues without changing any code once
> LUCENE-3216 is resolved.
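
The write path discussed in the comment (the first open output appends straight into the compound file, any output opened while it is still open goes to a separate file and is copied in on close, and the whole thing stays write-once) could be modeled roughly as follows. This is only a sketch under stated assumptions, not the code in LUCENE-3218.patch: `CompoundSketch`, `Output`, and every method name are hypothetical, in-memory buffers stand in for real files, and deferring the copy of a closed separate file until the direct output has closed is an assumption made here so that entries never interleave.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model; NOT Lucene's CompoundFileWriter.
final class CompoundSketch {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream(); // stands in for the CFS file
    private final Map<String, long[]> entries = new LinkedHashMap<>();      // name -> {offset, length}
    private final Deque<Output> pending = new ArrayDeque<>();               // closed separate files awaiting copy
    private Output direct;    // the one output currently appending straight into `data`
    private int openOutputs;
    private boolean closed;

    final class Output {
        private final String name;
        private final ByteArrayOutputStream side; // null when this output writes directly
        private final long offset;                // start offset in `data` (meaningful for direct outputs)
        private boolean open = true;

        private Output(String name, boolean writeDirect) {
            this.name = name;
            this.side = writeDirect ? null : new ByteArrayOutputStream();
            this.offset = data.size();
        }

        boolean writesDirectly() { return side == null; }

        void write(byte[] bytes) {
            if (!open) throw new IllegalStateException(name + " is closed");
            (side == null ? data : side).write(bytes, 0, bytes.length);
        }

        void close() {
            if (!open) return;
            open = false;
            openOutputs--;
            if (side == null) {
                // Direct output: its bytes are already in place, just record the entry.
                entries.put(name, new long[] {offset, data.size() - offset});
                direct = null;
            } else {
                // Separate output: queue it; it is copied once nobody appends directly.
                pending.add(this);
            }
            drainPending();
        }
    }

    Output createOutput(String name) {
        if (closed) throw new IllegalStateException("compound file is write-once");
        boolean writeDirect = (direct == null);
        Output out = new Output(name, writeDirect);
        if (writeDirect) direct = out;
        openOutputs++;
        return out;
    }

    // Assumption of this sketch: copies wait until no direct output is open.
    private void drainPending() {
        if (direct != null) return;
        while (!pending.isEmpty()) {
            Output o = pending.poll();
            byte[] bytes = o.side.toByteArray();
            entries.put(o.name, new long[] {data.size(), bytes.length});
            data.write(bytes, 0, bytes.length);
        }
    }

    void close() {
        // One reading of the close-time check: every output must already be closed.
        if (openOutputs != 0) throw new IllegalStateException("outputs still open");
        closed = true;
    }

    long offsetOf(String name) { return entries.get(name)[0]; }
    long lengthOf(String name) { return entries.get(name)[1]; }
    int size() { return data.size(); }
}
```

In this model only the separate outputs pay for a second copy, which is why the comment's point about adding the biggest sub-file first matters: whichever output is opened first is the one that avoids the extra IO.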