Hi,
I built an indexer with NGramAnalizer, which uses ShingleFilter.
Next I built a searcher with NGramQuery, which uses BooleanQuery:
    String termToken = charTermAttribute.toString();
    Term t = new Term("content", termToken);
    add(new TermQuery(t), Occur.SHOULD);
Everything seems to run without errors, but my searcher does not
find any hits.
I suspect my indexer code, so I tried to inspect the index, but Luke does
not work with Lucene 4.3.0 :(
Could someone give me a hint about what is happening?
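
One thing that might explain zero hits, assuming the index really holds what ShingleFilter emits: a TermQuery is not analyzed, so each query term has to equal an indexed term exactly (same case, same separator). The plain-Java sketch below does not use Lucene. `shingles` is a hypothetical helper that imitates ShingleFilter's defaults as I understand them (unigrams plus two-word shingles joined by a single space), only to make the comparison concrete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ShingleSketch {

    // Hypothetical helper (not a Lucene API): emits word n-grams ("shingles")
    // of minSize..maxSize words joined by one space, plus the single words
    // themselves when outputUnigrams is true. No lowercasing is applied.
    static List<String> shingles(String text, int minSize, int maxSize, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            if (outputUnigrams) {
                out.add(words[start]);
            }
            for (int size = Math.max(2, minSize); size <= maxSize && start + size <= words.length; size++) {
                out.add(String.join(" ", Arrays.copyOfRange(words, start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the terms stored in the "content" field at index time.
        Set<String> indexed = new LinkedHashSet<>(shingles("please divide this sentence", 2, 2, true));
        System.out.println("indexed terms: " + indexed);

        // Stand-in for the terms wrapped in TermQuery at search time.
        for (String queryTerm : new String[] {"divide this", "Divide this", "divide sentence"}) {
            System.out.println("'" + queryTerm + "' matches: " + indexed.contains(queryTerm));
        }
    }
}
```

If the terms produced at query time differ from the indexed ones in any way (different analyzer, different field name, different case), the comparison fails in the same way as in this sketch.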
Thanks,
gosia
On Mon, Jul 15, 2013 at 1:45 PM, Malgorzata Urbanska
<[email protected]> wrote:
> thanks !!
>
>
>
> On Mon, Jul 15, 2013 at 1:31 PM, Ivan Krišto <[email protected]> wrote:
>> On 07/15/2013 07:50 PM, Malgorzata Urbanska wrote:
>>> Hi,
>>>
>>> I've been trying to figure out how to use n-grams in Lucene 4.3.0.
>>> I found some examples for earlier versions, but I'm still confused.
>>> As I understand it, I should:
>>> 1. create a new analyzer which uses n-grams
>>> 2. apply it to my indexer
>>> 3. search using the same analyzer
>>>
>>> I found NGramTokenFilter and NGramTokenizer in the documentation, but I
>>> do not understand the difference between them.
>> This should be helpful:
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Tokenizers
>>
>> Here is an example of an n-gram analyzer:
>>
>> public class NGramAnalyzer extends Analyzer {
>>     @Override
>>     protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
>>         Tokenizer src = new NGramTokenizer(reader, 3, 3);
>>
>>         TokenStream tok = new StandardFilter(Version.LUCENE_43, src);
>>         tok = new LowerCaseFilter(Version.LUCENE_43, tok);
>>
>>         return new TokenStreamComponents(src, tok) {
>>             @Override
>>             protected void setReader(final Reader reader) throws IOException {
>>                 super.setReader(reader);
>>             }
>>         };
>>     }
>> }
>>
>> If, for example, you want to remove stop words from the document before
>> breaking it into n-grams, then you would need:
>> reader(document) -> SomeTokenizer -> StopFilter -> NGramTokenFilter
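
To make the tokenizer/filter difference concrete, here is a plain-Java sketch with no Lucene in it. `charNGrams` and `perTokenNGrams` are hypothetical helpers, and whitespace splitting stands in for SomeTokenizer. The first treats the whole input as one character stream, which is how I understand NGramTokenizer to behave, so grams can span spaces. The second grams each token on its own, like a tokenizer followed by NGramTokenFilter. The order in which real Lucene emits grams may differ.

```java
import java.util.ArrayList;
import java.util.List;

class NGramSketch {

    // All character n-grams of length n taken from s, left to right.
    static List<String> charNGrams(String s, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    // Split on whitespace first, then n-gram each token separately,
    // so no gram ever crosses a word boundary.
    static List<String> perTokenNGrams(String text, int n) {
        List<String> grams = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            grams.addAll(charNGrams(token, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        String text = "new york";
        // Whole-input grams include ones that contain the space.
        System.out.println("whole input: " + charNGrams(text, 3));
        // Per-token grams stay inside "new" and "york".
        System.out.println("per token:   " + perTokenNGrams(text, 3));
    }
}
```

Whichever variant is used at index time, the query side has to produce the same kind of grams, otherwise the terms will not line up.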
>>
>>
>> Regards,
>> Ivan Krišto
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
>
>
> --
> Malgorzata Urbanska (Gosia)
> Graduate Assistant
> Colorado State University
--
Malgorzata Urbanska (Gosia)
Graduate Assistant
Colorado State University