Interestingly, the last two consulting jobs I've had dealt with this
very issue - having user-entered terms interpreted as partial
strings to match against any indexed term. Care must be taken to avoid
the classic TooManyClauses exception or a more insidious
OutOfMemoryError.
By using PrefixQuery for all unadorned terms in QueryParser, you
risk someone typing "a" and one of the above problems occurring,
because the prefix is expanded into one clause per matching indexed
term. Whether it fails depends on how many terms you have in your index.
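
To make the risk concrete, here is a rough sketch in plain Java (no Lucene dependency) that counts how many indexed terms a given prefix would expand to. The class and method names are made up for illustration, and the sorted set only stands in for the index's term dictionary. The 1024 figure is the default BooleanQuery clause limit.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: estimates the expansion of a prefix.
class PrefixExpansionSketch {
    // Default BooleanQuery clause limit; exceeding it is what triggers TooManyClauses.
    static final int MAX_CLAUSES = 1024;

    // Counts the terms in the sorted dictionary that start with the prefix,
    // i.e. the number of clauses a prefix expansion would produce.
    static int expansionSize(SortedSet<String> terms, String prefix) {
        int count = 0;
        // All terms sharing a prefix are contiguous in sorted order,
        // so we can start at the prefix and stop at the first mismatch.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            count++;
        }
        return count;
    }

    // True if expanding the prefix would go past the clause limit.
    static boolean wouldExceedLimit(SortedSet<String> terms, String prefix) {
        return expansionSize(terms, prefix) > MAX_CLAUSES;
    }
}
```

A check like wouldExceedLimit(terms, "a") before building the query is one possible guard. Whether it is cheap enough depends on how the term dictionary is accessed in practice.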
There are techniques to handle the "starts with" or even the
"contains" type of substring query more efficiently by being clever
with tokenization at index time, so that the search can be issued as a
much more efficient TermQuery.
If "starts with" queries are the only type you need to worry about,
and not "contains", then consider indexing with prefix tokens.
For example, 'cat' could be indexed as 'cat', 'ca', and 'c'. Someone
types in 'ca' and you issue a TermQuery for 'ca' for a match. The
index size will grow, perhaps dramatically, but your searches will be
much faster and more efficient.
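
As a minimal sketch of the prefix-token idea, assuming a toy in-memory map in place of a real Lucene index (the class and method names below are hypothetical, and in Lucene the token generation would live in a TokenFilter rather than in a helper like this):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for an index: maps each prefix token to the ids of matching documents.
class PrefixTokenSketch {
    // Produces every leading substring of the term, longest first:
    // "cat" becomes "cat", "ca", "c".
    static List<String> prefixTokens(String term) {
        List<String> tokens = new ArrayList<>();
        for (int end = term.length(); end >= 1; end--) {
            tokens.add(term.substring(0, end));
        }
        return tokens;
    }

    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index time: store the document under the term and all of its prefixes.
    void add(int docId, String term) {
        for (String token : prefixTokens(term)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Search time: a single exact lookup, analogous to one TermQuery,
    // with no expansion over the term dictionary.
    Set<Integer> search(String typed) {
        return postings.getOrDefault(typed, Collections.<Integer>emptySet());
    }
}
```

Each term of length n contributes n tokens here, which is where the index growth comes from. Capping the prefix length is one way to limit that, at the cost of not matching longer typed prefixes.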
I plan to provide more documentation, examples, and TokenFilter(s) to
deal with this common scenario in the future.
Erik
On Mar 17, 2006, at 7:51 AM, Eric Jain wrote:
Florian Hanke wrote:
I'd like to append an * (create a WildcardQuery) to each search
term in a query, such that a query that is entered as e.g. "term1
AND term2" is modified (effectively) to "term1* AND term2*".
Parsing the search string is not very elegant (of course). I'm
thinking that overriding QueryParser#get(Boolean etc.)Query is the
way to go, the way it's designed. But still, extracting terms and
injecting them back in while operating on specific Query classes
does not seem the way to go.
Can anyone perhaps suggest a nice alternative?
Perhaps you could subclass the QueryParser and override the
getFieldQuery method:
    protected Query getFieldQuery(String field, String term) {
        return new PrefixQuery(new Term(field, term));
    }
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]