subject:"Re: Performance issues with ConjunctionScorer"

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Doug Cutting

Andrzej Bialecki wrote: Further input into this: after replacing the ConjunctionScorer with the fixed version from JIRA, now the bottleneck seems to be ... in Summarizer, of all things. :-) While making the summarizer faster would of course be good, keep in mind that the cost of summarizing te

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Andrzej Bialecki

Andrzej Bialecki wrote: Hi, I've been profiling a Nutch installation, and to my surprise the largest amount of throwaway allocations and the most time spent was not in Nutch specific code, or IPC, but in Lucene ConjunctionScorer.doNext() method. This method operates on a LinkedList, which s
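The allocation concern raised above can be illustrated with a minimal sketch of a conjunction (AND) over sorted document-ID streams held in a plain array rather than a LinkedList. This is not the Lucene code or the LUCENE-443 patch: `ConjunctionSketch`, `DocCursor`, `advance`, and `intersect` are hypothetical names invented for the example, and the sketch only assumes that each sub-scorer exposes its matches in increasing doc-ID order.

```java
import java.util.ArrayList;
import java.util.List;

public class ConjunctionSketch {
    /** Hypothetical stand-in for a sub-scorer: a cursor over sorted doc IDs. */
    static final class DocCursor {
        static final int NO_MORE = Integer.MAX_VALUE;
        private final int[] docs;
        private int pos = 0;

        DocCursor(int... sortedDocs) { this.docs = sortedDocs; }

        /** Current doc ID, or NO_MORE when the cursor is exhausted. */
        int doc() { return pos < docs.length ? docs[pos] : NO_MORE; }

        /** Moves to the first doc ID >= target and returns it. */
        int advance(int target) {
            while (pos < docs.length && docs[pos] < target) pos++;
            return doc();
        }

        /** Moves to the following doc ID and returns it. */
        int next() { pos++; return doc(); }
    }

    /** Returns the doc IDs present in every cursor (assumption: all inputs sorted). */
    static List<Integer> intersect(DocCursor... cursors) {
        List<Integer> hits = new ArrayList<>();
        if (cursors.length == 0) return hits;
        int candidate = cursors[0].doc();
        while (candidate != DocCursor.NO_MORE) {
            boolean allMatch = true;
            // Iterating a fixed array creates no list nodes or iterator objects.
            for (DocCursor c : cursors) {
                int d = c.advance(candidate);
                if (d != candidate) {   // this cursor skipped past the candidate
                    candidate = d;      // retry with the larger doc ID
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                hits.add(candidate);
                candidate = cursors[0].next();
            }
        }
        return hits;
    }
}
```

For example, intersecting the cursors (1, 3, 5, 7), (3, 4, 7) and (2, 3, 7, 9) yields 3 and 7. The inner loop touches only the fixed array, so the per-document work allocates nothing; the result list is there only to make the sketch easy to check.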

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Piotr Kosiorowski

You are right - it is still not committed, but the patch is here: http://issues.apache.org/jira/browse/LUCENE-443. During tests of my patch (it was very, very similar to this one) I saw up to a 5% performance increase. But probably it will mainly result in nicer GC behaviour. Piotr On 11/22/05, Andrz

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Andrzej Bialecki

Piotr Kosiorowski wrote: On 11/22/05, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: Hi, I've been profiling a Nutch installation, and to my surprise the largest amount of throwaway allocations and the most time spent was not in Nutch specific code, or IPC, but in Lucene ConjunctionScorer.doNe

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Piotr Kosiorowski

On 11/22/05, Andrzej Bialecki <[EMAIL PROTECTED]> wrote: > > Hi, > > I've been profiling a Nutch installation, and to my surprise the largest > amount of throwaway allocations and the most time spent was not in Nutch > specific code, or IPC, but in Lucene ConjunctionScorer.doNext() method. > This m

Re: Performance issues with ConjunctionScorer

2005-11-22 Thread Stefan Groschupf

Andrzej, very interesting!!! Nutch Summarizer also needlessly re-tokenizes the text over and over again - perhaps it would be better to save already tokenized text in parse_text, instead of the raw plain text? After all, the only use for that text is to index it and then build the summaries

6 matches
