On Thu, May 14, 2009 at 06:47:01AM -0400, Michael McCandless wrote:
> While I agree, one should properly match & tune all apps they are
> testing (for a fair comparison), we in turn must set out-of-the-box
> defaults (in Lucene and Solr) that get you as close to the "best
> practices" as possible.

So, should Lucene use the non-compound file format by default because some
idiot's sloppy benchmarks might run a smidge faster, even though that will
cause many users to run out of file descriptors?

Anyone doing comparative benchmarking who doesn't submit their code to the
support list for the software under review is either a dolt or a
propagandist.

Good benchmarking is extremely difficult, like all experimental science. If
there isn't ample evidence that the benchmarker appreciates that, their tests
aren't worth a second thought. If you don't avail yourself of the help of
experts when assembling your experiment, you are unserious.

Richard Feynman:

    "...if you're doing an experiment, you should report everything that you
    think might make it invalid - not only what you think is right about it:
    other causes that could possibly explain your results; and things you
    thought of that you've eliminated by some other experiment, and how they
    worked - to make sure the other fellow can tell they have been
    eliminated."

Marvin Humphrey