[ http://issues.apache.org/jira/browse/LUCENE-693?page=comments#action_12444317 ]
Peter Keegan commented on LUCENE-693:
-------------------------------------
Yonik,
I tried out your patch, but it causes an exception on some boolean queries.
This one occurred on a boolean query with 3 required terms:
java.lang.ArrayIndexOutOfBoundsException: 2147483647
    at org.apache.lucene.search.TermScorer.score(TermScorer.java:129)
    at org.apache.lucene.search.ConjunctionScorer.score(ConjunctionScorer.java:97)
    at org.apache.lucene.search.BooleanScorer2$2.score(BooleanScorer2.java:186)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:318)
    at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:282)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.Searcher.search(Searcher.java:116)
    at org.apache.lucene.search.Searcher.search(Searcher.java:95)
It looks like the doc id has the sentinel value (Integer.MAX_VALUE).
Note: one of the terms had no occurrences in the index.
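To illustrate the failure mode, here is a minimal sketch, assuming the doc id ends up being used as an index into a per-document array (such as norms). The class, the `lookup` helper, and the `NO_MORE_DOCS` name are hypothetical and are not taken from the Lucene source or the patch:

```java
public class SentinelIndexDemo {
    // Hypothetical name for the sentinel; the value matches the one in the trace.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Reads a per-document value and reports what happens for the given doc id.
    static String lookup(byte[] norms, int doc) {
        try {
            return "norm=" + norms[doc];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "out of bounds: " + doc;
        }
    }

    public static void main(String[] args) {
        byte[] norms = new byte[4];                      // pretend index with 4 documents
        System.out.println(lookup(norms, 2));            // a real doc id works
        System.out.println(lookup(norms, NO_MORE_DOCS)); // the sentinel does not
    }
}
```

If that assumption holds, a scorer for a term with no occurrences could be sitting on the sentinel from the start, and anything that scores it without first checking for exhaustion would fail like this.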
Peter
> ConjunctionScorer - more tuneup
> -------------------------------
>
> Key: LUCENE-693
> URL: http://issues.apache.org/jira/browse/LUCENE-693
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 2.1
> Environment: Windows Server 2003 x64, Java 1.6, pretty large index
> Reporter: Peter Keegan
> Attachments: conjunction.patch
>
>
> (See also: #LUCENE-443)
> I did some profile testing with the new ConjunctionScorer in 2.1 and
> discovered a new bottleneck in ConjunctionScorer.sortScorers. The
> java.util.Arrays.sort method is cloning the Scorers array on every sort,
> which is quite expensive on large indexes because of the size of the 'norms'
> array within, and isn't necessary.
> Here is one possible solution:
> private void sortScorers() {
>   // squeeze the array down for the sort
>   // if (length != scorers.length) {
>   //   Scorer[] temps = new Scorer[length];
>   //   System.arraycopy(scorers, 0, temps, 0, length);
>   //   scorers = temps;
>   // }
>   insertionSort(scorers, length);
>   // note that this comparator is not consistent with equals!
>   // Arrays.sort(scorers, new Comparator() { // sort the array
>   //   public int compare(Object o1, Object o2) {
>   //     return ((Scorer)o1).doc() - ((Scorer)o2).doc();
>   //   }
>   // });
>
>   first = 0;
>   last = length - 1;
> }
>
> private void insertionSort(Scorer[] scores, int len) {
>   for (int i = 0; i < len; i++) {
>     for (int j = i; j > 0 && scores[j-1].doc() > scores[j].doc(); j--) {
>       swap(scores, j, j-1);
>     }
>   }
> }
>
> private void swap(Object[] x, int a, int b) {
>   Object t = x[a];
>   x[a] = x[b];
>   x[b] = t;
> }
>
> The squeezing of the array is no longer needed.
> We also initialized the Scorers array with 8 slots (instead of 2) to avoid
> having to grow the array for common queries, although this probably has
> less performance impact.
> This change added about 3% to query throughput in my testing.
> Peter
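
The in-place sort described in the issue can be tried on its own. The following is a minimal sketch, not the attached patch: `FakeScorer` is a hypothetical stand-in that only exposes a doc id, and the loop shifts elements instead of swapping them, which is a small variation on the code quoted above. It sorts only the first `len` slots of an over-allocated array, so no squeezing or copying is needed:

```java
public class PrefixInsertionSort {
    /** Hypothetical stand-in for a scorer; it only reports its current doc id. */
    static final class FakeScorer {
        private final int doc;
        FakeScorer(int doc) { this.doc = doc; }
        int doc() { return doc; }
    }

    /** Sorts scorers[0..len) by doc id in place; slots at len and beyond are untouched. */
    static void insertionSort(FakeScorer[] scorers, int len) {
        for (int i = 1; i < len; i++) {
            FakeScorer current = scorers[i];
            int j = i;
            // Shift larger doc ids one slot to the right to make room.
            while (j > 0 && scorers[j - 1].doc() > current.doc()) {
                scorers[j] = scorers[j - 1];
                j--;
            }
            scorers[j] = current;
        }
    }

    /** Collects the doc ids of the first len slots, for inspection. */
    static int[] docs(FakeScorer[] scorers, int len) {
        int[] out = new int[len];
        for (int i = 0; i < len; i++) {
            out[i] = scorers[i].doc();
        }
        return out;
    }

    public static void main(String[] args) {
        FakeScorer[] scorers = new FakeScorer[8]; // capacity 8, only 3 slots in use
        scorers[0] = new FakeScorer(42);
        scorers[1] = new FakeScorer(7);
        scorers[2] = new FakeScorer(3);
        insertionSort(scorers, 3);
        System.out.println(java.util.Arrays.toString(docs(scorers, 3))); // [3, 7, 42]
    }
}
```

For the handful of clauses in a typical conjunction, the quadratic worst case of insertion sort is unlikely to matter, and no temporary array is allocated.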