Exact phrase search is not as exact as you are assuming. A phrase search for "foo bar" looks up the terms foo and bar, and then checks that they occur one position apart. If punctuation has been removed during analysis, it *cannot* play a part in a search of any kind.
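To make the position check concrete, here is a minimal Python sketch of a positional index. It is not Solr's or Lucene's actual implementation: the functions tokenize, build_index and phrase_match, the keep_at flag, and the address alice@example.com are all made up for illustration, and the toy tokenizer only loosely imitates an analysis chain that strips punctuation.

```python
import re


def tokenize(text, keep_at=False):
    """Toy analyzer (an assumption, not Solr's real chain).

    Lowercases the text and returns its terms in order. With
    keep_at=False the '@' character is discarded, loosely imitating
    a filter that strips punctuation. With keep_at=True, '@' survives
    as a token of its own.
    """
    pattern = r"[a-z0-9.]+|@" if keep_at else r"[a-z0-9.]+"
    return re.findall(pattern, text.lower())


def build_index(text, keep_at=False):
    """Map each term to the list of positions where it occurs."""
    index = {}
    for pos, term in enumerate(tokenize(text, keep_at)):
        index.setdefault(term, []).append(pos)
    return index


def phrase_match(index, query, keep_at=False):
    """Return True if the query terms occur at consecutive positions."""
    terms = tokenize(query, keep_at)
    if not terms:
        return False
    # Try every occurrence of the first term as a starting point, then
    # require term i of the query to sit at position start + i.
    for start in index.get(terms[0], []):
        if all(start + i in index.get(t, []) for i, t in enumerate(terms)):
            return True
    return False


if __name__ == "__main__":
    stripped = build_index("alice@example.com")
    print(phrase_match(stripped, "@example.com"))   # True
    print(phrase_match(stripped, "example.com@"))   # True: '@' never reached the index

    kept = build_index("alice@example.com", keep_at=True)
    print(phrase_match(kept, "@example.com", keep_at=True))   # True
    print(phrase_match(kept, "example.com@", keep_at=True))   # False
```

With '@' dropped, both "@example.com" and "example.com@" reduce to the single term example.com and match. Once '@' is kept as a token of its own, its position is what tells the two queries apart.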
You may be able to achieve what you want by using a PatternTokenizer instead of the whitespace tokenizer and removing the WordDelimiterFilterFactory.

Upayavira

On Wed, Mar 13, 2013, at 08:41 AM, adfel70 wrote:
> I want the following behaviour.
> If "john....@gmail.com" is indexed to the field:
> 1. searching 'john' or 'doe' or 'gmail.com' will retrieve the doc.
> 2. searching '"@gmail.com"' will retrieve the doc.
> 3. searching '"gmail.com@"' will not retrieve the doc.
>
> I can accomplish all but 3,
> because the word delimiter removes '@'. When I search "@gmail.com" or
> "gmail.com@" it's like searching "gmail.com", which causes unwanted
> results.
> This is an exact phrase search, so I would expect only docs with the
> exact phrase I search (including punctuation) to be retrieved.
>
> How can I achieve this?
>
> Thanks.
>
> Jack Krupansky-2 wrote
> > The Word Delimiter Filter will remove all punctuation characters. That is
> > its function.
> >
> > Maybe you should first describe in simple English what your token/term
> > rules are, and then it would be more clear what tokenizer and filters
> > would be most appropriate.
> >
> > -- Jack Krupansky
> >
> > -----Original Message-----
> > From: adfel70
> > Sent: Tuesday, March 12, 2013 3:14 AM
> > To: solr-user@.apache
> > Subject: Re: searching exact phrase with stop word returns bad results
> >
> > I see that there is no token with @.
> > The question is why.
> > This is my field type:
> >
> > <fieldtype name="email_type" class="solr.TextField"
> >     positionIncrementGap="100" autoGeneratePhraseQueries="false"
> >     omitNorms="true">
> >   <analyzer>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory"
> >         preserveOriginal="1" generateWordParts="1" generateNumberParts="1"
> >         catenateWords="0" catenateNumbers="0" catenateAll="0"
> >         splitOnCaseChange="0"/>
> >   </analyzer>
> > </fieldtype>
> >
> > Any idea?
> >
> > Erick Erickson wrote
> >> Take a look at admin/analysis for the field in question, feed it values
> >> and see how they are tokenized. My guess is that the token in the index
> >> is abc@ (single token), which of course won't match the fragment
> >> "@ gmail.com" (assuming gmail.com@ is a typo)...
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Mar 6, 2013 at 5:43 AM, adfel70 <adfel70@> wrote:
> >>
> >>> Hi
> >>>
> >>> I have emails indexed with the default text_general fieldType.
> >>>
> >>> I find that if the email "abc@" is indexed, and I search for
> >>> "gmail.com@" (exact phrase search) I get a result, while I should not
> >>> get one.
> >>>
> >>> Any idea how to solve this?
> >>>
> >>> thanks.
> >>>
> >>> --
> >>> View this message in context:
> >>> http://lucene.472066.n3.nabble.com/searching-exact-phrase-with-stop-word-returns-bad-results-tp4045180.html
> >>> Sent from the Solr - User mailing list archive at Nabble.com.