But that would mean we should be using at least 250k buffers for the
IndexInput, not the 16k or so that is the default?
Is the OS smart enough to figure out that the file is being read
sequentially, and to adjust its physical read size to 256k based on
the other concurrent IO operations? That seems hard for it to figure
out while still not performing poorly in the general case.
On Feb 8, 2008, at 11:25 AM, Doug Cutting wrote:
Michael McCandless wrote:
Merging is far more IO intensive. With mergeFactor=10, we read from
40 input streams and write to 4 output streams when merging the
tii/tis/frq/prx files.
If your disk can transfer at 50MB/s, and takes 5ms/seek, then 250kB
reads and writes are the break-even point, where half the time is
spent seeking and half transferring, and throughput is 25MB/s.
With 44 files open (40 inputs plus 4 outputs), that means the OS
needs just 11MB of buffering (44 x 250kB) to keep things above this
threshold. Since most systems have considerably larger buffer pools
than 11MB, merging with mergeFactor=10 shouldn't be seek-bound.
Doug
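To make the arithmetic above easy to check, here is a minimal Java sketch
that reproduces Doug's numbers: the break-even read size where seek time
equals transfer time, the effective throughput at that point, and the total
OS buffering a mergeFactor=10 merge of the tii/tis/frq/prx files would need.
The class and variable names are purely illustrative (not anything from
Lucene's API), and the 50MB/s and 5ms figures are just the assumptions from
the message above.

/**
 * Back-of-the-envelope model of the seek vs. transfer trade-off
 * discussed above. Illustrative only; not part of Lucene.
 */
public class MergeIoEstimate {

    public static void main(String[] args) {
        double transferBytesPerSec = 50_000_000; // assumed 50 MB/s sequential transfer
        double seekSeconds = 0.005;              // assumed 5 ms per seek

        // Break-even read size: transferring this many bytes takes as long
        // as one seek, so half the time is seeking and half transferring.
        double breakEvenBytes = transferBytesPerSec * seekSeconds;   // 250 kB

        // At the break-even point, effective throughput is half the raw rate.
        double effectiveBytesPerSec = transferBytesPerSec / 2;       // 25 MB/s

        // A mergeFactor=10 merge reads 4 files (tii/tis/frq/prx) from each of
        // 10 input segments and writes 4 output files: 10 * 4 + 4 = 44 streams.
        int openFiles = 10 * 4 + 4;

        // OS buffering needed so each stream can do break-even-sized IOs.
        double totalBufferBytes = openFiles * breakEvenBytes;        // ~11 MB

        System.out.printf("break-even read size: %.0f kB%n", breakEvenBytes / 1000);
        System.out.printf("throughput at break-even: %.1f MB/s%n",
            effectiveBytesPerSec / 1_000_000);
        System.out.printf("buffering for %d open files: %.1f MB%n",
            openFiles, totalBufferBytes / 1_000_000);
    }
}

Running this prints 250 kB, 25 MB/s, and 11 MB, matching the figures in
Doug's message.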