Robert Engels wrote:
As stated in a previous email - good idea.
All of the code and testcases were attached to the original email.
The testcases were provided in response to a request for them (at least a month
ago, if not longer).
I am sorry if I gave you the wrong impression.
I was merely suggesting a formalization of the process and that there be
documentation on the Lucene website that outlines how performance tests
and data should be provided, and how people can participate and provide
their results.
I have seen this issue come up several times (perhaps the following is
an oversimplification):
Someone will suggest a performance enhancement and perhaps supply the
code. Then there will be a general discussion about the merits of the
change and the validity of the results, with questions about the factors
involved and statements about how widely architectures differ and how
significantly the outcomes can vary. If enough "voters" like the
change, then it is committed.
Should there be a representative set of architectures at which
performance tests should be targeted? (For example, I have written an
application that uses Lucene to index and search Bibles, and its minimum
hardware requirement is a Win98 laptop, which many of our users have.)
-----Original Message-----
From: DM Smith [mailto:[EMAIL PROTECTED]
Sent: Friday, December 09, 2005 7:07 AM
To: java-dev@lucene.apache.org
Subject: Re: NioFile cache performance
John Haxby wrote:
Robert Engels wrote:
Using a 4 MB file (so I could "guarantee" the disk data would be in
the OS cache as well), the test shows the following results.
Which OS? If it's Linux, what kernel version and distro? What
hardware (disk type, controller, etc.)?
It's important to know: I/O (and caching) is very different between
Linux 2.4 and 2.6. The choice of I/O scheduler can also make a
significant difference on 2.6, depending on the workload. The type
of disk and its controller are also important, and when you get
really picky, so is the mobo model number.
I don't dispute your finding for a second, but it would be good to run
the same test on other platforms to get comparative data, not least
because, on some workloads, you can get the kind of I/O time improvement
you're seeing simply by switching Linux kernel versions.
I think the results were informative on a comparative basis for a
single machine: they compared different techniques and showed their
relative performance on that machine.
I also agree that the architecture of the machine can play an important
part in how code performs. I wrote a piece of software that ran well on
a 4-way box with a massive RAID configuration and gobs of RAM, only to
have it re-targeted to a 1-way, small-RAM box, where it had to be
rewritten to run at all.
Perhaps it would be good to establish guidelines for reporting
performance, including the posting of test data and test code (a rough
sketch of what such shared test code might look like follows below).
This may encourage others to download the data and code, run the tests,
and report the results.
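For illustration only (this is not the code that was attached to the original
email), a minimal, self-contained Java harness along these lines could be a
starting point for such shared test code. It times repeated reads of a small
temporary file through RandomAccessFile and through an NIO FileChannel; the
class name, file size, block size, and pass count are arbitrary placeholders.

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ReadBenchmark {
    // ~4 MB: small enough that the whole file should stay in the OS cache
    static final int FILE_SIZE = 4 * 1024 * 1024;
    static final int BLOCK = 1024;     // bytes read per call
    static final int PASSES = 100;     // full sequential passes over the file

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("readbench", ".dat");
        f.deleteOnExit();
        // Create a FILE_SIZE-byte file; the first pass also warms the OS cache.
        RandomAccessFile out = new RandomAccessFile(f, "rw");
        out.setLength(FILE_SIZE);
        out.close();

        System.out.println("RandomAccessFile: " + timeRandomAccess(f) + " ms");
        System.out.println("FileChannel:      " + timeNioChannel(f) + " ms");
    }

    // Sequential reads through the classic java.io API.
    static long timeRandomAccess(File f) throws IOException {
        byte[] buf = new byte[BLOCK];
        long start = System.currentTimeMillis();
        RandomAccessFile raf = new RandomAccessFile(f, "r");
        for (int pass = 0; pass < PASSES; pass++) {
            raf.seek(0);
            for (int pos = 0; pos < FILE_SIZE; pos += BLOCK) {
                raf.readFully(buf);
            }
        }
        raf.close();
        return System.currentTimeMillis() - start;
    }

    // Positional reads through the NIO FileChannel API.
    static long timeNioChannel(File f) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(BLOCK);
        long start = System.currentTimeMillis();
        FileChannel ch = new RandomAccessFile(f, "r").getChannel();
        for (int pass = 0; pass < PASSES; pass++) {
            long pos = 0;
            while (pos < FILE_SIZE) {
                buf.clear();
                int n = ch.read(buf, pos);
                if (n < 0) break;
                pos += n;
            }
        }
        ch.close();
        return System.currentTimeMillis() - start;
    }
}

A report could then pair the two timings with exactly the details John asked
about: OS, kernel version and distro, disk type, and controller.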