WildcardQuery won't be slower than TermQuery if there are no wildcard
characters. Beyond what QueryParser does, WildcardQuery itself
reverts to a TermQuery:
public Query rewrite(IndexReader reader) throws IOException {
if (this.termContainsWildcard) {
return super.rewrite(reader);
}
return new TermQuery(getTerm());
}
I personally would optimize which query gets created, but performance-
wise you won't pay a penalty for just using WildcardQuery.
Erik
On Sep 26, 2007, at 5:45 AM, John Byrne wrote:
I'm not using the QueryParser at all. I need to do a little more
with the terms, so i'm explicitly creating a Query from a single
term. What I was hoping was to avoid something like this:
...
if(term.contains("*") || terms.contains("?") {
return new WildcardQuery(...
}
else {
return new TermQuery(...
...
and instead just go like this:
...
return new WilcardQuery(...
...
on the basis that the WildacardQuery would only be slower if it
does contain a wildcard character. But as you pointed out, the
QueryParser makes this optimization, so I suppose I should too.
mark harwood wrote:
Are you using the out of the box Lucene QueryParser? It will
automatically detect wildcard queries by the presence of * or ?
chars.
If the user input does not contain these characters a plain
TermQuery is used.
BooleanQuery.setMaxClauseCount can be used to control the upper
limit on terms produced by Wildcard/Fuzzy Queries.
If this limit is exceeded (e.g when searching for something like
"a*" ) then an exception is thrown.
Cheers
Mark
----- Original Message ----
From: John Byrne <[EMAIL PROTECTED]>
To: [email protected]
Sent: Wednesday, 26 September, 2007 9:48:17 AM
Subject: Wildcard vs Term query
Hi,
I'm working my way through the Lucene In Action book, and there is
one thing I need explained that I didn't find there;
While wildcard queries are potentially slower than ordinary term
queries, are they slower even if theyt don't contain a wildcard?
Significantly slower?
The reason I ask is that if we assume we are going to allow
wildcards in a search engine, but we want to optimize, to take
advantage of when they are NOT used, do we have to check for the
presence of "*" or "?" in the term, and create the most
appropriate query, or can I assume that when a wildcard is not
present, the WildcardQuery will be as fast (or almost as fast) a a
plain term query?
Thanks in advance!
John B.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
___________________________________________________________
Yahoo! Answers - Got a question? Someone out there knows the
answer. Try it
now.
http://uk.answers.yahoo.com/
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]