Well... you are 50% right. When you write

    Query q = qp.parse("\"united states\"");

it does search for two separate tokens, "united" and "states", but it also
checks that they occur sequentially. So the above query matches documents
where the token "states" comes directly after "united".

Note that since it checks tokens sequentially, it may also find documents
where some non-tokenizable characters or stop words exist between "united"
and "states", e.g. "united and states" (here "and" is a stop word).

TermQuery works the way you described in your reply, i.e. it searches for a
single token "united states", which is not what you want.

On Mon, Apr 19, 2010 at 3:33 PM, Ian.huang <yiwong2...@hotmail.com> wrote:

> Does a token of "united states" exist in the index if using the standard
> analyzer? My understanding is, united and states are separately stored in
> the index, but not as "united states". So, if I build a query like Query q =
> qp.parse("\"united states\""); it would not return any result. Am I right?
>
> Ian
>
> --------------------------------------------------
> From: "Samarendra Pratap" <samarz...@gmail.com>
> Sent: Friday, April 16, 2010 9:02 PM
> To: <java-user@lucene.apache.org>
> Subject: Re: about analyzer for searching location
>
>> Hi. I don't think you need a different analyzer. Read about PhraseQuery
>> <http://lucene.apache.org/java/2_9_0/api/core/org/apache/lucene/search/PhraseQuery.html>.
>>
>> If you are using the parse() method of QueryParser, enclose the searched
>> string in extra double quotes, which must obviously be escaped.
>>
>> Query q = qp.parse("\"united states\"");
>>
>>
>> 2010/4/15 Ian.huang <yiwong2...@hotmail.com>
>>
>>> Hi All,
>>>
>>> I am implementing a search function for address by hibernate search which
>>> is based on lucene.
>>> The class definition is as follows:
>>>
>>> @Indexed
>>> public class Address implements Cloneable
>>> {
>>>     @DocumentId
>>>     private int id;
>>>     @Field
>>>     private String addrCountry;
>>>     private String addrDesc;
>>>     @Field
>>>     private String addrLineOne;
>>>     private String addrLineTwo;
>>>     @Field
>>>     private String addrCity;
>>>     ......
>>>
>>> As you see, addrCountry, addrLineOne and addrCity are the fields for
>>> search. I am using the default analyzer for index & search. So I think a
>>> country name like United States would be indexed as two terms, united and
>>> states.
>>>
>>> In addition, during search, a search keyword like United states, or Salt
>>> lake city, would be tokenized as two or three single words.
>>>
>>> As a result, any address whose fields contain united or city would be
>>> returned, like United Kingdom, but actually I want to get a result of
>>> united states.
>>>
>>> My expected result is as follows:
>>>
>>> if someone searches for "united" it should return "united states" and
>>> "united kingdom".
>>>
>>> if someone searches for "united states" it should return "united states",
>>> and not "united kingdom".
>>>
>>> I hope the analyzer can generate a term with multiple words, say, united
>>> states to united states. I think StandardAnalyzer would analyze united
>>> states to united and states?
>>>
>>> A different example: if the search keyword is parking lot in Salt Lake
>>> City, the generated terms to search need to be: parking lot and Salt Lake
>>> City, not parking, lot, salt, lake and city.
>>>
>>> I wonder if any analyzer can help me to implement my requirement. It
>>> would be better to use a dictionary based solution, then I can manage some
>>> search terms that could have multiple words.
>>>
>>> thanks
>>>
>>> Ian
>>>
>>
>> --
>> Regards,
>> Samar
>>

--
Regards,
Samar
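
P.S. To make the difference between a phrase match and a single-term match
in the reply above more concrete, here is a rough sketch in plain Java. It is
only a simplified model and not Lucene's implementation: the class
PhraseMatchSketch, its methods, and the tiny stop-word list are all
hypothetical names made up for illustration. It also assumes that stop words
are dropped without leaving a position gap, which in real Lucene depends on
the version and on the position-increment settings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of phrase vs. term matching. NOT Lucene's implementation.
class PhraseMatchSketch {
    // Hypothetical stop-word list, for illustration only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "the", "of"));

    // Lowercase, split on non-alphanumeric characters, and drop stop words
    // without leaving a gap, so "united and states" becomes [united, states].
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Term-style match: the query string is NOT analyzed and must equal
    // one single indexed token.
    static boolean termMatch(String doc, String term) {
        return analyze(doc).contains(term);
    }

    // Phrase-style match: the analyzed query tokens must appear
    // consecutively, in order, among the document tokens.
    static boolean phraseMatch(String doc, String phrase) {
        List<String> docTokens = analyze(doc);
        List<String> queryTokens = analyze(phrase);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(docTokens, queryTokens) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(phraseMatch("United States", "united states"));     // true
        System.out.println(phraseMatch("United Kingdom", "united states"));    // false
        System.out.println(phraseMatch("united and states", "united states")); // true
        System.out.println(termMatch("United States", "united states"));       // false
        System.out.println(termMatch("United Kingdom", "united"));             // true
    }
}
```

In this model the term-style lookup for "united states" finds nothing because
no single token with that text was ever indexed, while the phrase-style
lookup succeeds because both tokens are present and adjacent.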