WhitespaceAnalyzer will index them as-is, without any normalization - which should work. As Luceners will tell you, start with an Analyzer that works well enough, and go from there, improving it where necessary.

It is worth checking what Java Lucene uses for indexing Portuguese. If that is something not yet available in CLucene, it may be easier to port it than to write your own.

Itamar.

On 27/7/2010 1:08 PM, Rui Oliveira wrote:
I am using SimpleAnalyzer to support Portuguese characters.

Will WhitespaceAnalyzer support Portuguese characters too?

Thanks & Regards,
Rui



------------------------------------------------------------------------
Date: Mon, 26 Jul 2010 12:01:02 +0300
From: ita...@code972.com
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] Indexate numbers

Use WhitespaceAnalyzer instead of what you're using now for indexing. The Analyzer is what is used internally to tokenize the stream and filter tokens from it. Depending on your needs, you'll have to choose the analyzer that fits, or write your own.
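
To illustrate the difference, here is a rough, self-contained sketch of the two tokenization strategies. It does not use the CLucene API: the function names letter_tokens and whitespace_tokens are hypothetical, the code handles ASCII only, and it assumes SimpleAnalyzer behaves as it does in Java Lucene (tokens are runs of letters, lowercased), which would explain why a purely numeric term produces no token at all.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Simplified model of a letter-based tokenizer (assumed SimpleAnalyzer
// behaviour): a token is a maximal run of letters, lowercased.
// Digits and punctuation only act as separators. ASCII only.
std::vector<std::string> letter_tokens(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalpha(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Simplified model of a whitespace tokenizer (assumed WhitespaceAnalyzer
// behaviour): a token is a maximal run of non-whitespace characters,
// kept exactly as written, with no normalization.
std::vector<std::string> whitespace_tokens(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : text) {
        if (!std::isspace(c)) {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

Under this model, the input "W1234 W2XX 1234" gives the letter-based tokens w, w and xx (nothing survives from 1234), while the whitespace-based tokens are W1234, W2XX and 1234. Note that the whitespace variant does not lowercase, so searches become case-sensitive unless a lowercasing filter is added.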

Itamar.

On 26/7/2010 11:51 AM, Rui Oliveira wrote:

    Hi,

    I am using CLucene, but apparently numbers are not indexed.

    For example:
    W1234 -> indexed
    W2XX -> indexed
    1234 -> NOT INDEXED

    Is there some configuration to force numbers to be indexed?

    Thanks & Regards,
    Rui


    _______________________________________________
    CLucene-developers mailing list
    CLucene-developers@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/clucene-developers
