Interestingly, the last two consulting jobs I've had dealt with this very issue: having user-entered terms interpreted as partial strings that can match any indexed term. Care must be taken to avoid the classic TooManyClauses exception or a more insidious OutOfMemoryError.

If you use PrefixQuery for every unadorned term in QueryParser, someone typing "a" can trigger either of the problems above, depending on how many terms in your index start with that letter.
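To make the risk concrete, here is a rough sketch in plain Java (no Lucene classes) of what happens when a prefix is expanded into one clause per matching term. The class name, the method names, and the toy term set are made up for illustration. The 1024 limit mirrors what I understand BooleanQuery's default maximum clause count to be, so treat it as an assumption and check your version.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; this is not how Lucene itself is implemented.
final class PrefixExpansionSketch {
    // Assumed clause limit, modeled on BooleanQuery's historical default.
    static final int MAX_CLAUSES = 1024;

    /** Counts the indexed terms starting with the prefix, i.e. the clauses an expansion would need. */
    static int countExpandedClauses(SortedSet<String> terms, String prefix) {
        int count = 0;
        // tailSet starts at the first term >= prefix; matching terms are contiguous in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // left the range of terms sharing the prefix
            }
            count++;
        }
        return count;
    }

    /** True when expanding the prefix would need more clauses than the assumed limit. */
    static boolean wouldExceedLimit(SortedSet<String> terms, String prefix) {
        return countExpandedClauses(terms, prefix) > MAX_CLAUSES;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("a" + i); // 2000 toy terms that all start with "a"
        }
        terms.add("cat");
        System.out.println("clauses for 'a': " + countExpandedClauses(terms, "a"));
        System.out.println("'a' exceeds limit: " + wouldExceedLimit(terms, "a"));
        System.out.println("'ca' exceeds limit: " + wouldExceedLimit(terms, "ca"));
    }
}
```

The point is only that the cost of a prefix search depends on the index contents, not on what the user typed, which is why a one-letter query can blow up.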

There are techniques to handle "starts with", or even "contains", substring queries more efficiently by being clever with tokenization at index time, so that searches can use much cheaper TermQuery queries.

If "starts with" queries are the only type you need to worry about, and not "contains", then consider indexing with prefix tokens. For example, 'cat' could be indexed as 'cat', 'ca', and 'c'. Someone types in 'ca' and you issue a single TermQuery for 'ca' to find the match. The index size will grow, perhaps dramatically, but your searches will be much faster and more efficient.
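As a minimal sketch of the prefix-token idea, the plain-Java toy below stores every leading prefix of a word in an in-memory map standing in for the index, so a "starts with" search becomes one exact lookup. The class and method names are hypothetical, and a real setup would produce these tokens in a TokenFilter during analysis rather than in a map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy index; it stands in for a real Lucene index purely for illustration.
final class PrefixTokenSketch {
    // token -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Returns the word followed by each shorter leading prefix, e.g. cat, ca, c. */
    static List<String> prefixes(String word) {
        List<String> out = new ArrayList<>();
        for (int end = word.length(); end >= 1; end--) {
            out.add(word.substring(0, end));
        }
        return out;
    }

    /** Indexes the word under all of its prefix tokens. */
    void add(int docId, String word) {
        for (String token : prefixes(word)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** One exact-match lookup, playing the role of a single TermQuery. */
    Set<Integer> startsWith(String typed) {
        return postings.getOrDefault(typed, Collections.emptySet());
    }

    public static void main(String[] args) {
        PrefixTokenSketch index = new PrefixTokenSketch();
        index.add(1, "cat");
        index.add(2, "car");
        index.add(3, "dog");
        System.out.println(prefixes("cat"));
        System.out.println(index.startsWith("ca"));
    }
}
```

The trade-off is the one described above: a word of length n contributes n tokens, so the index grows, but the query never has to enumerate terms.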

I plan to provide more documentation, examples, and TokenFilter(s) to deal with this common scenario in the future.

        Erik


On Mar 17, 2006, at 7:51 AM, Eric Jain wrote:
Florian Hanke wrote:
I'd like to append an * (create a WildcardQuery) to each search term in a query, such that a query that is entered as e.g. "term1 AND term2" is modified (effectively) to "term1* AND term2*". Parsing the search string is not very elegant (of course). I'm thinking that overriding QueryParser#get(Boolean etc.)Query is the way to go, the way it's designed. But still, extracting terms and injecting them back in while operating on specific Query classes does not seem the way to go.
Can anyone perhaps suggest a nice alternative?

Perhaps you could subclass the QueryParser and override the getFieldQuery method:

protected Query getFieldQuery(String field, String term) {
  return new PrefixQuery(new Term(field, term));
}

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

