[ 
https://issues.apache.org/jira/browse/LUCENE-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262522#comment-15262522
 ] 

Trejkaz commented on LUCENE-7260:
---------------------------------

Is there a faster way to do it? Keep in mind that it has to start from a query 
string, since that is what the user who reported the issue to us originally 
entered.

> StandardQueryParser is over 100 times slower in v5 compared to v3
> -----------------------------------------------------------------
>
>                 Key: LUCENE-7260
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7260
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/queryparser
>    Affects Versions: 5.4.1
>         Environment: Java 8u51
>            Reporter: Trejkaz
>              Labels: performance
>
> The following test code times parsing a large query.
> {code}
> // Imports for Lucene 3.x. For 5.x, use the commented-out import under each one instead.
> import org.apache.lucene.analysis.KeywordAnalyzer;
> //import org.apache.lucene.analysis.core.KeywordAnalyzer;
> import org.apache.lucene.queryParser.standard.StandardQueryParser;
> //import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
> import org.apache.lucene.search.BooleanQuery;
> public class LargeQueryTest {
>     public static void main(String[] args) throws Exception {
>         BooleanQuery.setMaxClauseCount(50_000);
>         StringBuilder builder = new StringBuilder(50_000*10);
>         builder.append("id:( ");
>         boolean first = true;
>         for (int i = 0; i < 50_000; i++) {
>             if (first) {
>                 first = false;
>             } else {
>                 builder.append(" OR ");
>             }
>             builder.append(String.valueOf(i));
>         }
>         builder.append(" )");
>         String queryString = builder.toString();
>         StandardQueryParser parser2 = new StandardQueryParser(new KeywordAnalyzer());
>         for (int i = 0; i < 10; i++) {
>             long t0 = System.currentTimeMillis();
>             parser2.parse(queryString, "nope");
>             long t1 = System.currentTimeMillis();
>             System.out.println(t1-t0);
>         }
>     }
> }
> {code}
> For Lucene 3.6.2, the timings settle down to 200~300 ms, with the fastest being 
> 207 ms.
> For Lucene 5.4.1, the timings settle down to 20000~30000 ms, with the fastest 
> being 22444 ms.
> So at some point, some change made the query parser 100 times slower. I would 
> suspect that it has something to do with how the list of children is now 
> handled. Every time someone gets the children, it copies the list. Every time 
> someone sets the children, it walks through to detach parent references and 
> then reattaches them all again.
> If it were me, I would probably make these collections immutable so that I 
> didn't have to defensively copy them.
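
To illustrate the suspicion in the description, here is a minimal sketch of the two approaches. It is not Lucene's actual query node code: `ChildListSketch`, `CopyingNode`, `ImmutableNode`, and the `elementsCopied` counter are hypothetical names made up for this example, and whether this is the real cause of the slowdown has not been confirmed by profiling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ChildListSketch {

    /** Hypothetical node that defensively copies its child list on every read. */
    static final class CopyingNode {
        private final List<Integer> children = new ArrayList<>();
        long elementsCopied = 0; // bookkeeping for this sketch only

        void add(int child) {
            children.add(child);
        }

        List<Integer> getChildren() {
            elementsCopied += children.size();
            return new ArrayList<>(children); // O(n) work on every call
        }
    }

    /** Hypothetical node whose child list is copied once and never again. */
    static final class ImmutableNode {
        private final List<Integer> children;

        ImmutableNode(List<Integer> children) {
            // Single copy at construction, wrapped so callers cannot modify it.
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
        }

        List<Integer> getChildren() {
            return children; // O(1): the same read-only list every time
        }
    }

    public static void main(String[] args) {
        int n = 1_000;
        CopyingNode copying = new CopyingNode();
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            copying.add(i);
            values.add(i);
        }
        ImmutableNode immutable = new ImmutableNode(values);

        // A caller that fetches the child list once per child triggers
        // n copies of n elements each with the copying node.
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += copying.getChildren().get(i);
            sum += immutable.getChildren().get(i);
        }
        System.out.println("elements copied by CopyingNode: " + copying.elementsCopied);
        System.out.println("checksum: " + sum);
    }
}
```

With the copying node, any code that asks for the children once per child does quadratic work, which would hurt badly with 50,000 clauses. The immutable variant pays for one copy up front and rejects modification attempts with an `UnsupportedOperationException`.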


