Re: Appending * to each search term

2006-03-17 Thread Eric Jain

Florian Hanke wrote:
> I'd like to append an * (create a WildcardQuery) to each search term in
> a query, such that a query that is entered as e.g. term1 AND term2 is
> modified (effectively) to term1* AND term2*.
> Parsing the search string is not very elegant (of course). I'm thinking
> that overriding QueryParser#get(Boolean etc.)Query is the way to go, the
> way it's designed. But still, extracting terms and injecting them back
> in while operating on specific Query classes does not seem the way to go.
>
> Can anyone perhaps suggest a nice alternative?


Perhaps you could subclass the QueryParser and override the getFieldQuery 
method:


// Treat every plain term as a prefix, so that term1 matches like term1*.
protected Query getFieldQuery(String field, String term) {
  return new PrefixQuery(new Term(field, term));
}

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Appending * to each search term

2006-03-17 Thread Florian Hanke

Thank you very much - that did the trick! :)




Re: Appending * to each search term

2006-03-17 Thread Erik Hatcher
Interestingly, the last two consulting jobs I've had dealt with this
very issue: having user-entered terms interpreted as partial strings
to match against any indexed term.  Care must be taken to avoid the
classic TooManyClauses exception or a more insidious
OutOfMemoryError.


By using PrefixQuery for all unadorned terms in QueryParser, you
risk someone typing a single letter such as 'a' and triggering one of
the above problems, depending on how many terms you have in your index.
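
A toy illustration of why a one-letter prefix is risky, written in plain Java rather than against the Lucene API. It assumes that a prefix is expanded into one clause per matching index term and that the expansion fails once a clause limit is exceeded. The class PrefixExpansion, its expand method, and the maxClauses parameter are hypothetical names; maxClauses stands in for Lucene's own clause limit on BooleanQuery, assumed here to default to 1024.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Toy model of prefix expansion, not Lucene code. All names here are
// hypothetical and chosen only for illustration.
final class PrefixExpansion {
    private PrefixExpansion() {}

    /**
     * Collects every term that starts with the given prefix.
     * Throws IllegalStateException if more than maxClauses terms match,
     * mimicking a "too many clauses" failure.
     */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> matches = new ArrayList<>();
        // In sorted order, all terms sharing a prefix are contiguous and
        // begin at the first term >= prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term with this prefix
            }
            if (matches.size() == maxClauses) {
                throw new IllegalStateException(
                        "too many clauses for prefix '" + prefix + "'");
            }
            matches.add(term);
        }
        return matches;
    }
}
```

The shorter the prefix, the more terms it matches, so a guard such as a minimum prefix length before building the query may be worth considering.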


There are techniques to handle 'starts with' or even 'contains' type
substring queries more efficiently by being clever with tokenization
at index time, so that they can be answered with much more efficient
TermQuery queries.


If 'starts with' queries are the only type you need to worry about,
and not 'contains', then consider indexing with prefix tokens.
For example, 'cat' could be indexed as 'cat', 'ca', and 'c'.  Someone
types in 'ca' and you issue a TermQuery for 'ca' for a match.  The
index size will grow, perhaps dramatically, but your searches will be
much faster and more efficient.
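
The prefix-token idea can be sketched in plain Java, without any Lucene classes. PrefixIndex and its methods are hypothetical names, and the in-memory map only stands in for a real index: each term is stored under every one of its prefixes, so a typed prefix becomes a single exact lookup instead of an expansion over many terms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory stand-in for an index that stores prefix tokens.
// Not Lucene code; all names are hypothetical.
final class PrefixIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Returns the term followed by its shorter prefixes: cat -> cat, ca, c. */
    static List<String> prefixTokens(String term) {
        List<String> tokens = new ArrayList<>();
        for (int end = term.length(); end >= 1; end--) {
            tokens.add(term.substring(0, end));
        }
        return tokens;
    }

    /** Indexes the document under the term and every prefix of it. */
    void add(int docId, String term) {
        for (String token : prefixTokens(term)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Exact lookup of what the user typed, analogous to a single TermQuery. */
    Set<Integer> lookup(String typed) {
        return postings.getOrDefault(typed, Collections.<Integer>emptySet());
    }

    /** Number of distinct tokens stored, to show how the index grows. */
    int tokenCount() {
        return postings.size();
    }
}
```

Indexing 'cat', 'car', and 'dog' this way stores seven distinct tokens instead of three, which is the size-for-speed trade-off described above.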


I plan to provide more documentation, examples, and TokenFilter(s) to  
deal with this common scenario in the future.


Erik

