We already use File.setLength to pre-set the length of the CFS file
during merging, in the hope that it will help the filesystem minimize
fragmentation of the file, but we don't use it when creating the
individual index files.  We could pursue doing so for individual
index files too... I wasn't able to show that this helped performance
in practice, though.
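
For reference, the preallocate-then-truncate idea from the quoted
message can be sketched roughly as below. This is only an illustration,
not Lucene's code: it assumes plain java.io.RandomAccessFile.setLength,
and the class name, helper name and sizes are made up. Whether
setLength actually reserves disk blocks (rather than creating a sparse
file) depends on the platform and filesystem.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class PreallocateSketch {

    // Hypothetical helper (not a Lucene API): extend the file to
    // `reserved` bytes, write `data` from the start, then set the
    // length to the number of bytes actually written.
    static long writeWithPreallocation(Path target, byte[] data, long reserved)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(target.toFile(), "rw")) {
            raf.setLength(reserved);            // pre-set the length up front
            raf.seek(0);
            raf.write(data);                    // stand-in for the merge output
            long actual = raf.getFilePointer();
            raf.setLength(actual);              // truncate to the real size
            return actual;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("merge-target", ".bin");
        try {
            byte[] payload = new byte[300_000]; // assumed payload size
            long written = writeWithPreallocation(tmp, payload, 1L << 20);
            System.out.println("written=" + written + " onDisk=" + Files.size(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

If the data turns out larger than the reserved size, the write simply
grows the file and the final setLength is a no-op.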

Mike

On Wed, Sep 30, 2009 at 8:47 PM, Jason Rutherglen
<jason.rutherg...@gmail.com> wrote:
> I wanted to post this before I forgot. Based on an informal
> discussion at the Katta meeting regarding the high write
> throughput of Zookeeper (see
> http://wiki.apache.org/hadoop/ZooKeeper/Performance ) which uses
> the database technique of preallocating large empty files before
> filling them up with real data, it came up that perhaps this
> technique could help with the speed of Lucene segment merging?
>
> Lucene would preallocate new target merge files with zeroes, at
> let's say one megabyte in size, then proceed to fill them in with
> the merge data, truncating each file to its actual size when
> completed. This would probably only need to be switched on when
> merging large segments.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>
