Re: Performance of never optimizing

2008-11-04 Thread Justus Pendleton
On 05/11/2008, at 4:36 AM, Michael McCandless wrote: If possible, you should try to use a larger corpus (eg Wikipedia) rather than multiply Reuters by N, which creates an unnatural term frequency distribution. I'll replicate the tests with the Wikipedia corpus over the next few days and

Re: Performance of never optimizing

2008-11-03 Thread Justus Pendleton
On 03/11/2008, at 11:07 PM, Mark Miller wrote: Am I missing your benchmark algorithm somewhere? We need it. Something doesn't make sense. I thought I had included it at [1] before but apparently not; my apologies for that. I have updated that wiki page. I'll also reproduce it here: {

Performance of never optimizing

2008-11-02 Thread Justus Pendleton
Howdy, I have a couple of questions regarding some Lucene benchmarking and what the results mean[3]. (Skip to the numbered list at the end if you don't want to read the lengthy exegesis :) I'm a developer for JIRA[1]. We are currently trying to get a better understanding of Lucene, and

Re: Performance of never optimizing

2008-11-02 Thread Justus Pendleton
On 03/11/2008, at 4:27 PM, Otis Gospodnetic wrote: Why are you optimizing? Trying to make the search faster? I would try to avoid optimizing during high usage periods. I assume that the original, long-ago decision to optimize was made to improve searching performance. One thing that