hi Steve,
you are correct. I am using StandardTokenizer. I will look into the
WhitespaceTokenizer and hopefully figure it out.
thank you,
Igal
On 11/1/2012 1:24 PM, Steve Rowe wrote:
Hi Igal,
You didn't say you were using StandardTokenizer, but assuming you are, right
now StandardTokenizer throws away punctuation, so no following filters will see
them.
If StandardTokenizer were modified to also output currently non-tokenized
punctuation as tokens, then you could use a Filt
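Steve's point above is that StandardTokenizer discards punctuation before any TokenFilter runs, while a whitespace split keeps punctuation attached to tokens. A plain-Java approximation (not the actual Lucene tokenizers; the class and method names here are invented for illustration) of the difference:

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration (not the Lucene API): a whitespace-only split
// keeps punctuation attached to tokens, while a StandardTokenizer-style
// pass strips it, so downstream filters never see it.
public class TokenizerSketch {
    // Whitespace-style: split on runs of whitespace only.
    static List<String> whitespaceTokens(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Rough approximation of StandardTokenizer's effect for this purpose:
    // keep only letter/digit runs, so punctuation is gone before any filter.
    static List<String> standardStyleTokens(String text) {
        return Arrays.asList(text.trim().split("[^\\p{L}\\p{N}]+"));
    }

    public static void main(String[] args) {
        String text = "Hello, world! foo.bar";
        System.out.println(whitespaceTokens(text));    // [Hello,, world!, foo.bar]
        System.out.println(standardStyleTokens(text)); // [Hello, world, foo, bar]
    }
}
```

With WhitespaceTokenizer the punctuation survives tokenization, so a later filter can inspect or strip it itself.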
thank you. I found it at
org.apache.lucene.analysis.util.FilteringTokenFilter
Igal
On 11/1/2012 12:51 PM, Uwe Schindler wrote:
The filter is still there. In Lucene 4.0 all tokenstream implementations are in
a separate module, no longer in Lucene core. The package names of most analysis
components changed, too.
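Concretely, the move Uwe describes means updating both the build dependency and the import. A sketch of the old versus new location (the Maven coordinates here are an assumption; verify them against your Lucene version):

```java
// Lucene 4.x: FilteringTokenFilter moved out of lucene-core into the
// analyzers-common module (Maven: org.apache.lucene:lucene-analyzers-common).
// Old (3.x) import, no longer valid:
//   import org.apache.lucene.analysis.FilteringTokenFilter;
// New (4.x) location:
import org.apache.lucene.analysis.util.FilteringTokenFilter;
```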
Use your IDE to find it or ask Google...
Uwe
"Igal @ getRailo.org" wrote:
hi,
I'm trying to migrate to Lucene 4.
In Lucene 3.5 I extended org.apache.lucene.analysis.FilteringTokenFilter
and overrode accept() to remove undesired shingles. In Lucene 4,
org.apache.lucene.analysis.FilteringTokenFilter does not seem to exist?
I'm trying to achieve two things:
1) remove shingles
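The accept()-override pattern described above can be sketched in plain Java (this is not the Lucene API; the class names are invented, and the rule of dropping tokens containing "_" is a hypothetical stand-in for whatever shingle separator your analyzer chain uses):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Plain-Java sketch of the FilteringTokenFilter pattern: the base class
// pulls tokens from a source and keeps only those for which the subclass's
// accept() returns true -- the same shape as overriding accept() on
// org.apache.lucene.analysis.util.FilteringTokenFilter in Lucene 4.
abstract class FilteringFilterSketch {
    private final Iterator<String> input;

    FilteringFilterSketch(Iterator<String> input) {
        this.input = input;
    }

    // Subclasses decide which tokens survive, as with Lucene's accept().
    protected abstract boolean accept(String token);

    // Consume the input and return only the accepted tokens.
    List<String> drain() {
        List<String> out = new ArrayList<>();
        while (input.hasNext()) {
            String t = input.next();
            if (accept(t)) {
                out.add(t);
            }
        }
        return out;
    }
}

// Hypothetical rule: drop shingles, here assumed to be multi-word tokens
// joined with '_'. Adjust the predicate to your shingle separator.
class ShingleDroppingSketch extends FilteringFilterSketch {
    ShingleDroppingSketch(Iterator<String> input) {
        super(input);
    }

    @Override
    protected boolean accept(String token) {
        return !token.contains("_");
    }
}

public class ShingleDemo {
    public static void main(String[] args) {
        List<String> tokens = List.of("quick", "quick_brown", "brown", "brown_fox");
        System.out.println(new ShingleDroppingSketch(tokens.iterator()).drain());
        // [quick, brown]
    }
}
```

In the real Lucene 4 subclass the shape is the same: extend FilteringTokenFilter, read the term via CharTermAttribute, and return false from accept() for tokens you want dropped. Note that the FilteringTokenFilter constructor signature changed across 4.x releases, so check the Javadoc for your exact version.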