Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
Hi Steve, you are correct. I am using StandardTokenizer. I will look into the WhitespaceTokenizer and hopefully figure it out. Thank you, Igal
On 11/1/2012 1:24 PM, Steve Rowe wrote:
> Hi Igal, You didn't say you were using StandardTokenizer, but assuming you are, right now StandardToke

Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Steve Rowe
Hi Igal, You didn't say you were using StandardTokenizer, but assuming you are, right now StandardTokenizer throws away punctuation, so no following filters will see them. If StandardTokenizer were modified to also output currently non-tokenized punctuation as tokens, then you could use a Filt
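
To illustrate the point about punctuation being dropped before any filter can see it, here is a minimal plain-Java sketch. It does not use Lucene at all: the class name `PunctuationTokens`, the `tokenize` method, and both regular expressions are hypothetical, and the word-only pattern is only a rough stand-in for the behavior described above, not StandardTokenizer's actual rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not Lucene code.
public class PunctuationTokens {
    // Runs of letters and digits only: punctuation never becomes a token.
    private static final Pattern WORDS =
            Pattern.compile("[\\p{L}\\p{N}]+");
    // Same word runs, plus each ASCII punctuation character as its own token.
    private static final Pattern WORDS_AND_PUNCT =
            Pattern.compile("[\\p{L}\\p{N}]+|\\p{Punct}");

    // Splits text into tokens, optionally keeping punctuation as tokens.
    public static List<String> tokenize(String text, boolean keepPunctuation) {
        Matcher m = (keepPunctuation ? WORDS_AND_PUNCT : WORDS).matcher(text);
        List<String> tokens = new ArrayList<>();
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Punctuation dropped: a later filter cannot tell where the comma was.
        System.out.println(tokenize("Hello, world. Bye", false));
        // Punctuation kept: a later filter can see and react to it.
        System.out.println(tokenize("Hello, world. Bye", true));
    }
}
```

With punctuation kept as tokens, a downstream step has something to test against; with it dropped, the information is gone before filtering starts.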

Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
Thank you. I found it at org.apache.lucene.analysis.util.FilteringTokenFilter Igal
On 11/1/2012 12:51 PM, Uwe Schindler wrote:
> The filter is still there. In Lucene 4.0 all TokenStream implementations are in a separate module, no longer in Lucene core. The package names of most analysis

Re: Removing Empty Shingles in Lucene 4

2012-11-01 Thread Uwe Schindler
The filter is still there. In Lucene 4.0 all TokenStream implementations are in a separate module, no longer in Lucene core. The package names of most analysis components changed, too. Use your IDE to find it or ask Google... Uwe
"Igal @ getRailo.org" wrote:
> hi,
>
> I'm trying to migrate

Removing Empty Shingles in Lucene 4

2012-11-01 Thread Igal @ getRailo.org
Hi, I'm trying to migrate to Lucene 4. In Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter and overrode accept() to remove undesired shingles. In Lucene 4, org.apache.lucene.analysis.FilteringTokenFilter does not seem to exist. I'm trying to achieve two things: 1) remove shingl
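
The accept()-style filtering described in this message can be sketched in plain Java without any Lucene dependency. Everything here is an assumption made for illustration: the class `ShingleFilterSketch`, its methods, and in particular the rule that a shingle is "undesired" when it contains a `_` filler marking a missing position. The original message is cut off before it states its actual criteria, so this is not the poster's filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
public class ShingleFilterSketch {
    // Assumed placeholder for a position where a token was removed.
    public static final String FILLER = "_";

    // Builds word n-grams ("shingles") of the given size, joined by spaces.
    public static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    // Plays the role of an overridden accept(): true means keep the shingle.
    public static boolean accept(String shingle) {
        for (String part : shingle.split(" ")) {
            if (part.equals(FILLER)) {
                return false; // assumed rule: reject shingles containing a filler
            }
        }
        return true;
    }

    // Keeps only the shingles that accept() approves, preserving order.
    public static List<String> filter(List<String> shingles) {
        List<String> kept = new ArrayList<>();
        for (String s : shingles) {
            if (accept(s)) {
                kept.add(s);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("quick", "_", "fox", "jumps");
        System.out.println(filter(shingles(tokens, 2)));
    }
}
```

In a real Lucene 4 analyzer the same decision would live in a subclass of org.apache.lucene.analysis.util.FilteringTokenFilter, the location reported elsewhere in this thread; check the constructor signature against the javadoc for the exact release in use.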