With PhraseQuery you can specify where each term must occur in the phrase. So X must occur in position 0, David in position 1, and then manager in position 4 (skipping 2 holes).
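To make the position arithmetic concrete, here is a minimal plain-Java sketch. It does not call Lucene: the class PhraseHoleSketch, its positions and matches methods, the letter-only tokenizer, and the hard-coded stop-word set are all hypothetical stand-ins for what an analyzer with stop-word removal does. In Lucene 4.x, as far as I know, each (position, term) pair it computes is what you would pass to PhraseQuery.add(Term, int).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Plain-Java illustration only; it does not use or reproduce Lucene's code. */
final class PhraseHoleSketch {

    // Assumed stop words for this example (the is/a/of/the set from the thread).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("is", "a", "of", "the"));

    /** Maps position -> term for every token that is not a stop word. */
    static Map<Integer, String> positions(String text) {
        Map<Integer, String> kept = new LinkedHashMap<>();
        int position = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue; // leading separator produces an empty first token
            }
            if (!STOP_WORDS.contains(token)) {
                kept.put(position, token);
            }
            position++; // a dropped stop word still consumes a position: the "hole"
        }
        return kept;
    }

    /** True if every kept query term lines up with the same term in the document. */
    static boolean matches(String document, String phrase) {
        Map<Integer, String> doc = positions(document);
        Map<Integer, String> query = positions(phrase);
        if (query.isEmpty()) {
            return false;
        }
        int first = query.keySet().iterator().next(); // position of first kept query term
        for (int start : doc.keySet()) {
            boolean allAligned = true;
            for (Map.Entry<Integer, String> e : query.entrySet()) {
                String docTerm = doc.get(start + e.getKey() - first);
                if (!e.getValue().equals(docTerm)) {
                    allAligned = false;
                    break;
                }
            }
            if (allAligned) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String phrase = "X David is a manager of the company.";
        System.out.println(positions(phrase)); // {0=x, 1=david, 4=manager, 7=company}
        System.out.println(matches("Mr X David is a manager of the company.", phrase)); // true
        System.out.println(matches("Mr X David manager of the company is also very senior.", phrase)); // false
    }
}
```

Note that the holes match any word: in this sketch "X David was the manager in a company" also matches the phrase, which is the kind of false match that position-only checking allows.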
QueryParser does this for you: when it analyzes the user's phrase, if the resulting tokens have holes, then it sets the positions accordingly.

And I agree: shingles are a good solution here too, but they make your index larger. CommonGramsFilter lets you shingle only specific words, e.g. you could pass your stop words to it.

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jul 24, 2013 at 7:34 AM, Ankit Murarka <ankit.mura...@rancoretech.com> wrote:
> I tried using PhraseQuery with slop. Now, since I am specifying the slop, I
> also need to specify the 2nd term.
>
> In my case the 2nd term is not present. The whole string to be searched is
> still one single term.
>
> How do I skip the holes created by stop words? I do not know beforehand how
> many stop words are skipped or what string the user is going to enter.
>
> Is there a definite way to skip the holes created by stop words?
>
> I was now looking at MultiPhraseQuery: splitting the user-provided string on
> spaces and providing each word as a term to MultiPhraseQuery.
>
> Will it help? Is there any alternative?
>
> On 7/24/2013 4:48 PM, Michael McCandless wrote:
>>
>> PhraseQuery?
>>
>> You can skip the holes created by stop words ... e.g. QueryParser does
>> this. I.e., the PhraseQuery becomes "X David _ _ manager _ _ company"
>> if is/a/of/the are stop words, which isn't perfect (could return false
>> matches) but should work well in practice ...
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Wed, Jul 24, 2013 at 4:31 AM, Ankit Murarka
>> <ankit.mura...@rancoretech.com> wrote:
>>>
>>> Dear All,
>>>
>>> Suppose I have 3 documents. The sample text is:
>>>
>>> File 1:
>>>
>>> Mr X David is a manager of the company. He is the senior most manager. I
>>> also want to become manager of the company.
>>>
>>> File 2:
>>>
>>> Mr X David manager of the company is also very senior. He happens to be
>>> the senior most manager. I wish even I could reach that place.
>>>
>>> File 3:
>>>
>>> Mr X David is working for a company. He happens to be the manager of the
>>> company.Infact he is the senior most manager. I dont want to become like
>>> him.
>>>
>>> String I wish to search: X David is a manager of the company.
>>>
>>> Ideally I should get only File 1 in the hit result.
>>>
>>> I have no clue how to achieve this. Basically I am trying to match part
>>> of a sentence or a complete sentence. What is the best methodology?
>>> I presume "is" and "a" are stop words and will be skipped during indexing
>>> by the StandardAnalyzer.
>>>
>>> What I wonder is how I then search for part of a sentence or a complete
>>> sentence if the sentence contains some/many stop words.
>>>
>>> I am using StandardAnalyzer. Please guide.
>>>
>>> --
>>> Regards
>>>
>>> Ankit
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
> --
> Regards
>
> Ankit Murarka
>
> "Peace is found not in what surrounds us, but in what we hold within."