WhitespaceAnalyzer will index them as-is, without any normalization,
which should work. As Luceners will tell you, start with an Analyzer
that works well enough, and go from there, improving it where necessary.
It is worth checking what Java Lucene uses for indexing Portuguese. If
that is not available in CLucene, it may be easier to just port it
than to write your own.
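
To make "as-is, without any normalization" concrete, here is a rough sketch in plain C++. It does not use the CLucene API; indexTerms and hasTerm are hypothetical helpers that only mimic splitting on whitespace and storing each term unchanged. It assumes the text is UTF-8.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical mini "index": the set of terms obtained by splitting the
// text on whitespace only. Nothing is lowercased and no accents are
// stripped, so each term is stored byte-for-byte as it appeared.
std::set<std::string> indexTerms(const std::string& text) {
    std::set<std::string> terms;
    std::istringstream in(text);
    std::string term;
    while (in >> term) {   // operator>> stops at whitespace
        terms.insert(term);
    }
    return terms;
}

// A lookup succeeds only when the query matches a stored term exactly.
bool hasTerm(const std::set<std::string>& terms, const std::string& query) {
    return terms.count(query) > 0;
}
```

With indexTerms("Coração de São Paulo 1234"), both "Coração" and "1234" are found, but "coração" and "Coracao" are not. If you need case-insensitive or accent-insensitive matching, that normalization has to come from a different analyzer or from a filter you add yourself.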
Itamar.
On 27/7/2010 1:08 PM, Rui Oliveira wrote:
I am using SimpleAnalyzer to support Portuguese characters.
Will WhitespaceAnalyzer support Portuguese characters too?
Thanks & Regards,
Rui
------------------------------------------------------------------------
Date: Mon, 26 Jul 2010 12:01:02 +0300
From: ita...@code972.com
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] Indexate numbers
Use WhitespaceAnalyzer instead of what you're using now for indexing.
The Analyzer is what is used internally to tokenize the stream and
filter tokens from it. Depending on your needs, you'll need to choose
the right analyzer, or write your own.
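
As a rough illustration of why the choice of analyzer matters here, the sketch below compares two tokenization rules in plain C++. It is not CLucene code: letterTokens and whitespaceTokens are hypothetical stand-ins. The first follows the rule I understand SimpleAnalyzer to use (split at every non-letter, then lowercase); the second splits on whitespace only. The letter test is ASCII-only, an assumption made to keep the example short.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical letter-based rule: a token is a maximal run of ASCII
// letters, lowercased. Digits and punctuation end a token and are dropped.
std::vector<std::string> letterTokens(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical whitespace rule: a token is a maximal run of
// non-whitespace characters, kept exactly as written.
std::vector<std::string> whitespaceTokens(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (!std::isspace(c)) {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

For the input "W1234 W2XX 1234", the letter-based rule yields the tokens w, w, xx, so the purely numeric value produces no token at all and the other values survive only as fragments. The whitespace rule yields W1234, W2XX, 1234.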
Itamar.
On 26/7/2010 11:51 AM, Rui Oliveira wrote:
Hi,
I am using CLucene, but apparently numbers are not indexed.
For example:
W1234 -> indexed
W2XX -> indexed
1234 -> NOT INDEXED
Is there some configuration to force numbers to be indexed?
Thanks & Regards,
Rui
_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers