Re: Lucene spending alot of time in BooleanScorer2

Paul Taylor Tue, 03 May 2011 01:06:16 -0700

On 02/05/2011 23:36, Paul Taylor wrote:

Hi
I am nearing completion of a new version of a Lucene search component for the http://www.musicbrainz.org music database and am having a problem with performance. There are a number of indexes, each built from data in a database: one index for albums, another for artists, and another for tracks (individual songs on albums). Testing search performance on every index, they all perform well except for the tracks index, which is considerably slower than before.
The code is similar, though there are changes in all areas, and I have checked with a profiler that it is the Lucene search that is taking the time. It is spending most of its time in BooleanScorer2; you can see a breakdown at http://imagebin.org/151368. Is this normal? I profiled a well-performing index and this method didn't even show up.
One thought I had is that some of the test queries are a little silly IMO and contain a lot of OR queries, both on terms within a field and over multiple fields. We only ever return 25 results, but I guess Lucene has to sort all the results, and in some cases there are over a million matches; could this be the reason? However, these queries still perform better with the old code base, and I'm using TopScoreDocCollector so I can't see how to improve it, and all queries, not just the silly ones, appear slower.
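On the sorting question: as far as I understand it, a top-N collector does not need to sort every match. It can keep only the best N scores in a bounded priority queue, so each matching document still has to be scored and visited once, but the queue itself stays small. Below is a minimal sketch of that idea in plain Java. It is not Lucene's actual implementation, and the class name TopKSketch and method topK are hypothetical names made up for illustration.

```java
import java.util.PriorityQueue;

public class TopKSketch {

    /**
     * Returns the k highest scores in descending order.
     * Every score is visited once, but at most k are held at any time.
     */
    static float[] topK(float[] scores, int k) {
        if (k <= 0) {
            return new float[0];
        }
        // Min-heap: the head is the weakest score currently kept.
        PriorityQueue<Float> heap = new PriorityQueue<>(k);
        for (float s : scores) {
            if (heap.size() < k) {
                heap.add(s);
            } else if (s > heap.peek()) {
                // The new score beats the weakest kept score, so replace it.
                heap.poll();
                heap.add(s);
            }
        }
        // poll() yields ascending order, so fill the array from the back.
        float[] out = new float[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0.9f, 0.1f, 0.7f, 0.4f};
        float[] best = topK(scores, 2);
        System.out.println(best[0] + " " + best[1]);
    }
}
```

If this picture is right, the cost with a million matches would come mostly from visiting and scoring each match rather than from ordering the 25 that are returned.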
Code Extract:
    public Results searchLucene(String query, int offset, int limit) throws IOException, ParseException {
        IndexSearcher searcher = getIndexSearcher();
        QueryParser parser = getParser();
        TopScoreDocCollector collector = TopScoreDocCollector.create(offset + limit, true);
        searcher.search(parser.parse(query), collector);
        searchCount.incrementAndGet();
        return processResults(searcher, collector, offset);
    }
    private Results processResults(IndexSearcher searcher, TopScoreDocCollector collector, int offset) throws IOException {
        Results results = new Results();
        TopDocs topDocs = collector.topDocs();
        results.offset = offset;
        results.totalHits = topDocs.totalHits;
        ScoreDoc docs[] = topDocs.scoreDocs;
        float maxScore = topDocs.getMaxScore();
        for (int i = offset; i < docs.length; i++) {
            Result result = new Result();
            result.score = docs[i].score / maxScore;
            result.doc = new MbDocument(searcher.doc(docs[i].doc));
            results.results.add(result);
        }
        return results;
    }
I'm using Lucene 3.0.3; the old code base is using 2.9.2. Any help appreciated on what the problem could be, or on how I should proceed.
thanks Paul

Rereading the javadocs I realised I was calling the wrong search method, but having made this change it only gave me a 10% improvement.

    public Results searchLucene(String query, int offset, int limit) throws IOException, ParseException {

        IndexSearcher searcher = getIndexSearcher();
        QueryParser parser = getParser();

        TopDocs topdocs = searcher.search(parser.parse(query), offset + limit);

        searchCount.incrementAndGet();
        return processResults(searcher, topdocs, offset);
    }
