Re: [jira] Created: (LUCENE-1172) Small speedups to DocumentsWriter

Mike Klaas Sun, 10 Feb 2008 15:48:50 -0800

While I agree in general that excessive optimization at the expense of code clarity is undesirable, you are overstating the point. 2X is a ridiculous threshold to apply to something as performance critical as a full text search engine. If search were twice as slow, Lucene would be utterly unusable for me. Indexing is less important than search, of course, but a 2X slowdown would be quite painful there.

I don't have an opinion in this case: I believe that there is a tradeoff but that it is the responsibility of the committer(s) to achieve the correct balance--they are the ones who will be maintaining the code, after all. I find your persistence surprising and your tone dangerously near condescending. Telling the guy who has spent hundreds of hours carefully optimizing this code that "Almost always there is a better bottleneck somewhere" shows an astonishing lack of perspective and respect.


-Mike

On 10-Feb-08, at 12:15 PM, robert engels wrote:

I am not sure these numbers matter. I think they are skewed because you are probably running too short a test, and the index is in memory (or OS cache).
Once you use a real index that needs to read/write from the disk, the percentage change will be negligible.
This is the problem with many of these "performance changes" - they just aren't real world enough. Even if they were, I would argue that code simplicity/maintainability is worth more than 6 seconds on an operation that takes 4 minutes to run...
There are many people that believe micro benchmarks are next to worthless. A good rule of thumb is that if the optimization doesn't result in 2x speedup, it probably shouldn't be done. In most cases any efficiency gains are later lost in maintainability issues.
See http://en.wikipedia.org/wiki/Optimization_(computer_science)

Almost always there is a better bottleneck somewhere.

On Feb 10, 2008, at 1:37 PM, Michael McCandless wrote:
Yonik Seeley wrote:
I wonder how well a single generic quickSort(Object[] arr, int low,
int high) would perform vs the type-specific ones?  I guess the main
overhead would be a cast from Object to the specific class to do the
compare?  Too bad Java doesn't have true generics/templates.

OK I tested this.

Starting from the patch on LUCENE-1172, which has 3 quickSort methods
(one per type), I created a single quickSort method on Object[] that
takes a Comparator, and made 3 Comparators instead.
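For readers without the patch at hand, the two shapes being compared look roughly like the sketch below. This is not the code from LUCENE-1172: the Entry class, its key field, and all method names are hypothetical stand-ins, and the sort is a plain Hoare-partition quicksort chosen only for illustration. The point is where the cast and the Comparator call sit in the generic version.

```java
import java.util.Arrays;
import java.util.Comparator;

public class QuickSortSketch {

    // Hypothetical element type; stands in for whatever DocumentsWriter sorts.
    public static final class Entry {
        final int key;
        Entry(int key) { this.key = key; }
    }

    // Type-specific variant: the compare is a direct int comparison on the field.
    public static void quickSortEntries(Entry[] arr, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = arr[(lo + hi) >>> 1].key;
        int i = lo, j = hi;
        while (i <= j) {
            while (arr[i].key < pivot) i++;
            while (arr[j].key > pivot) j--;
            if (i <= j) {
                Entry tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++; j--;
            }
        }
        quickSortEntries(arr, lo, j);
        quickSortEntries(arr, i, hi);
    }

    // Generic variant: same algorithm, but every compare goes through a
    // Comparator call plus a cast from Object back to the concrete class.
    public static void quickSortGeneric(Object[] arr, int lo, int hi, Comparator<Object> cmp) {
        if (lo >= hi) return;
        Object pivot = arr[(lo + hi) >>> 1];
        int i = lo, j = hi;
        while (i <= j) {
            while (cmp.compare(arr[i], pivot) < 0) i++;
            while (cmp.compare(arr[j], pivot) > 0) j--;
            if (i <= j) {
                Object tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
                i++; j--;
            }
        }
        quickSortGeneric(arr, lo, j, cmp);
        quickSortGeneric(arr, i, hi, cmp);
    }

    // One comparator per element type; the cast happens here.
    public static final Comparator<Object> BY_KEY = new Comparator<Object>() {
        public int compare(Object a, Object b) {
            int ka = ((Entry) a).key, kb = ((Entry) b).key;
            return ka < kb ? -1 : (ka == kb ? 0 : 1);
        }
    };

    // Small helpers used only to build and inspect test data.
    public static Entry[] entries(int... keys) {
        Entry[] out = new Entry[keys.length];
        for (int i = 0; i < keys.length; i++) out[i] = new Entry(keys[i]);
        return out;
    }

    public static int[] keys(Entry[] arr) {
        int[] out = new int[arr.length];
        for (int i = 0; i < arr.length; i++) out[i] = arr[i].key;
        return out;
    }

    public static void main(String[] args) {
        Entry[] a = entries(5, 3, 9, 1, 3);
        Entry[] b = entries(5, 3, 9, 1, 3);
        quickSortEntries(a, 0, a.length - 1);
        quickSortGeneric(b, 0, b.length - 1, BY_KEY);
        System.out.println(Arrays.toString(keys(a)));
        System.out.println(Arrays.toString(keys(b)));
    }
}
```

Both variants produce the same ordering; the only difference is the per-compare overhead, which is what the timings below measure.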

Mac OS X 10.4 (JVM 1.5):

    original patch --> 247.1
  simplified patch --> 254.9 (3.2% slower)

Windows Server 2003 R64 (JVM 1.6):

    original patch --> 440.6
  simplified patch --> 452.7 (2.7% slower)

The times are best in 10 runs.  I'm running all tests with these JVM
args:

  -Xms1024M -Xmx1024M -Xbatch -server
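As a rough illustration of how "best in 10 runs" and the percentages above can be derived, here is a minimal sketch. It is not the harness actually used for these tests; the class and method names are made up for the example, and the task being timed is left to the caller.

```java
public class BestOfRuns {

    // Runs the task the given number of times and returns the fastest
    // wall-clock time in nanoseconds. Taking the minimum discards runs
    // disturbed by GC pauses, JIT warm-up, or other processes.
    public static long bestOfNanos(int runs, Runnable task) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            if (elapsed < best) best = elapsed;
        }
        return best;
    }

    // Relative slowdown of candidate versus baseline, in percent.
    public static double percentSlower(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // The two pairs of times reported in this message.
        System.out.printf("%.1f%% slower%n", percentSlower(247.1, 254.9));
        System.out.printf("%.1f%% slower%n", percentSlower(440.6, 452.7));
    }
}
```

Plugging in the reported times reproduces the 3.2% and 2.7% figures.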

I think this is a big enough difference in performance that it's
worth keeping 3 separate quickSorts in DocumentsWriter.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]