> One way to force larger read-aheads might be to pump up Lucene's input buffer size. As an experiment, try increasing InputStream.BUFFER_SIZE to 1024*1024 or larger. You'll want to do this just for the merge process and not for searching and indexing. That should help you spend more time doing transfers with less wasted on seeks. If that helps, then perhaps we ought to make this settable via system property or somesuch.

Good suggestion... seems about 10% -> 15% faster in a few strawman benchmarks I ran.
Note that right now this field is final and not public... so that will probably need to change. Does it make sense to also increase OutputStream.BUFFER_SIZE? That would seem reasonable, since an optimize involves a large number of reads and writes.
I'm obviously willing to throw memory at the problem.
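
For what it's worth, here is a rough sketch of how the "settable via system property" idea could look. This is not Lucene code: the class name MergeBufferSize, the property name example.merge.bufferSize, and the 1024-byte fallback are all made up for illustration, and BUFFER_SIZE would still have to stop being final before anything like this could be wired in.

```java
/** Hypothetical helper, not part of Lucene: resolves a buffer size from a system property. */
final class MergeBufferSize {
    // Property name invented for this sketch; Lucene defines no such property.
    static final String PROPERTY = "example.merge.bufferSize";
    // Assumed fallback, used when the property is unset or unusable.
    static final int DEFAULT_SIZE = 1024;

    private MergeBufferSize() {}

    /** Returns the configured size in bytes, or DEFAULT_SIZE if unset, unparsable, or not positive. */
    static int resolve() {
        // Integer.getInteger returns null when the property is missing or is not a valid integer.
        Integer configured = Integer.getInteger(PROPERTY);
        return (configured != null && configured > 0) ? configured : DEFAULT_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("merge buffer size: " + resolve() + " bytes");
    }
}
```

The merge/optimize process would then be started with something like -Dexample.merge.bufferSize=1048576, while the searching and indexing processes leave the property unset and keep the small default.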
--
Please reply using PGP.
http://peerfear.org/pubkey.asc NewsMonster - http://www.newsmonster.org/
Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
AIM/YIM - sfburtonator, Web - http://peerfear.org/
GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
