True, for the first two use cases, but as I indicated, the third use case is
problematic since the token needs to be split. The n-gram solution does seem
to cover it though, sort of.
The n-gram solution doesn't cover "good morning, john" or "good morning -
john", but that could be handled by a tokenizer that simply ignores
punctuation and whitespace, emits one big token for the original text, and
then n-grams it up to some maximal query phrase size. And... the
original requirement spec didn't list that as a use case anyway.
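
A minimal sketch of that idea in plain Java, assuming nothing about Lucene's API: the class and method names (SquashedNGramSketch, squash, nGrams, matches) are hypothetical, and the "index" is just an in-memory set. Text is squashed to lowercase letters and digits, then every character n-gram up to a maximum length is recorded.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not a Lucene tokenizer.
final class SquashedNGramSketch {

    // Drop everything except letters and digits, and lowercase what remains.
    static String squash(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                sb.append(Character.toLowerCase(c));
            }
        }
        return sb.toString();
    }

    // All character n-grams of the squashed text with 1 <= n <= maxGram.
    static Set<String> nGrams(String text, int maxGram) {
        String s = squash(text);
        Set<String> grams = new HashSet<>();
        for (int start = 0; start < s.length(); start++) {
            int limit = Math.min(s.length(), start + maxGram);
            for (int end = start + 1; end <= limit; end++) {
                grams.add(s.substring(start, end));
            }
        }
        return grams;
    }

    // A phrase matches when its squashed form is one of the recorded n-grams.
    static boolean matches(Set<String> indexedGrams, String phrase) {
        return indexedGrams.contains(squash(phrase));
    }
}
```

A query whose squashed form is longer than maxGram can never match, which is why the maximum has to be tied to the longest phrase you expect, and the number of n-grams grows roughly with text length times maxGram.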
-- Jack Krupansky
-----Original Message-----
From: Michael Sokolov
Sent: Monday, May 12, 2014 8:39 PM
To: java-user@lucene.apache.org
Subject: Re: How to locate a Phrase inside text (like a Browser text
searcher)
ShingleFilter can help with this; it concatenates neighboring tokens.
So a search for "good morning john" becomes a search for
"goodmorning john" OR
"good morningjohn" OR
"good morning john"
It makes your index much bigger because of all the extra terms, but you may
find it's worth the cost.
-Mike
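
To make the query expansion above concrete, here is a rough plain-Java sketch that enumerates every way of keeping or dropping the space between neighboring tokens. It is not ShingleFilter itself; ShingleQuerySketch and variants are hypothetical names. Note that it also produces the fully concatenated "goodmorningjohn", which the three-way list above leaves out. (As far as I know, ShingleFilter joins tokens with a space by default, so getting concatenated terms would mean calling setTokenSeparator("").)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Lucene's ShingleFilter.
final class ShingleQuerySketch {

    // Each gap between adjacent tokens is either kept as a space or removed,
    // so n tokens yield 2^(n-1) variants.
    static List<String> variants(String phrase) {
        String[] tokens = phrase.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        int gaps = tokens.length - 1;
        for (int mask = 0; mask < (1 << gaps); mask++) {
            StringBuilder sb = new StringBuilder(tokens[0]);
            for (int i = 0; i < gaps; i++) {
                boolean joined = (mask & (1 << i)) != 0;
                sb.append(joined ? "" : " ").append(tokens[i + 1]);
            }
            out.add(sb.toString());
        }
        return out;
    }
}
```

The variants would then be OR-ed together as phrase queries. The count doubles with every extra token, so this is only reasonable for short phrases.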
On 5/11/2014 9:46 PM, Jack Krupansky wrote:
The word delimiter filter can help for "MorningJohn" by setting its option
to split on case change.
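
As a rough illustration of what splitting on case change does to a token (plain Java with a regex, not the word delimiter filter's actual implementation; CaseChangeSplitSketch is a hypothetical name):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Lucene's WordDelimiterFilter.
final class CaseChangeSplitSketch {

    // Split at every position where a lowercase letter is immediately
    // followed by an uppercase letter.
    static List<String> split(String token) {
        return Arrays.asList(token.split("(?<=\\p{Ll})(?=\\p{Lu})"));
    }
}
```

Here "MorningJohn" comes out as "Morning" and "John", while "Mailhow" stays in one piece because it contains no lowercase-to-uppercase transition.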
You might be able to handle "Mailhow" using the
DictionaryCompoundWordTokenFilter, but that requires that you create a
complete dictionary of the terms that can be split off. That's not very
practical. In truth, Lucene/Solr doesn't have a good out-of-the-box
solution for this use case.
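
To show why the dictionary matters, here is a simplified plain-Java sketch of dictionary-driven splitting. It is a greedy longest-prefix split, not what DictionaryCompoundWordTokenFilter actually does internally, and DictionarySplitSketch and its split method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not Lucene's compound word filter.
final class DictionarySplitSketch {

    // Greedily peel off the longest dictionary word at each position.
    // If any position has no dictionary prefix, give up and return the
    // original token unsplit. Being greedy, it may miss splits that
    // backtracking would find.
    static List<String> split(String token, Set<String> dictionary) {
        String lower = token.toLowerCase(Locale.ROOT);
        List<String> parts = new ArrayList<>();
        int pos = 0;
        while (pos < lower.length()) {
            int end = lower.length();
            while (end > pos && !dictionary.contains(lower.substring(pos, end))) {
                end--;
            }
            if (end == pos) {
                return Collections.singletonList(token);
            }
            parts.add(lower.substring(pos, end));
            pos = end;
        }
        return parts;
    }
}
```

With a dictionary containing "mail" and "how", "Mailhow" splits into "mail" and "how"; remove "how" from the dictionary and the token is left whole, which is exactly the completeness problem described above.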
-- Jack Krupansky
-----Original Message-----
From: teko
Sent: Thursday, May 8, 2014 9:03 AM
To: java-user@lucene.apache.org
Subject: How to locate a Phrase inside text (like a Browser text searcher)
Hi, can someone help me with this?
I need to run a search that locates a phrase inside text, but I need to find
the phrase in texts like these:
'John Mail' <- phrase I want to locate
' Good Morning John Mail how are you? ' < I need to find the phrase here
' Good MorningJohn Mail how are you? ' < here too
' GoodMorning John Mailhow are you? ' < and here
I tried using 'WhitespaceAnalyzer' and 'QueryParser', but it doesn't work
(it finds the phrase only in the first sample above, not in the others).
Please, I really need help with this!
Thanks (note: sorry for my English!! xD)
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-locate-a-Phrase-inside-text-like-a-Browser-text-searcher-tp4135075.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org