Re: [CLucene-dev] Problem indexin accented characters.

Itamar Syn-Hershko Sun, 20 Jun 2010 08:41:45 -0700

This looks like an encoding issue. Is the file being read correctly? Check in
your debugger what each line contains right after it is read.
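
One way to rule out the stream's locale is to bypass wifstream's conversion:
read raw bytes with std::ifstream and decode them explicitly. A wifstream
converts bytes through its imbued locale, and with the default "C" locale,
bytes above 0x7F may fail to convert or be widened one-to-one, depending on the
implementation. A failed conversion would be consistent with input stopping at
the first accented character. The sketch below assumes the file is UTF-8 (not
confirmed in this thread). utf8_to_wide and read_utf8_lines are hypothetical
helper names, not part of CLucene or the standard library.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Decode UTF-8 bytes into a wide string. Malformed sequences become U+FFFD.
// Simplification: overlong encodings are not rejected.
std::wstring utf8_to_wide(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80)                { cp = c;        extra = 0; }
        else if ((c & 0xE0) == 0xC0) { cp = c & 0x1F; extra = 1; }
        else if ((c & 0xF0) == 0xE0) { cp = c & 0x0F; extra = 2; }
        else if ((c & 0xF8) == 0xF0) { cp = c & 0x07; extra = 3; }
        else { out += static_cast<wchar_t>(0xFFFD); ++i; continue; }

        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        if (!ok) { out += static_cast<wchar_t>(0xFFFD); ++i; continue; }
        i += extra + 1;

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t (e.g. Windows): emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// Read a UTF-8 text file line by line and return the decoded lines.
std::vector<std::wstring> read_utf8_lines(const std::string& path) {
    std::ifstream file(path.c_str(), std::ios::binary);
    if (!file) throw std::runtime_error("cannot open " + path);

    std::vector<std::wstring> lines;
    std::string raw;
    bool first = true;
    while (std::getline(file, raw)) {
        // Drop a UTF-8 byte order mark at the start of the file, if present.
        if (first && raw.compare(0, 3, "\xEF\xBB\xBF") == 0) raw.erase(0, 3);
        first = false;
        // Drop the '\r' left over from Windows line endings.
        if (!raw.empty() && raw[raw.size() - 1] == '\r') raw.erase(raw.size() - 1);
        lines.push_back(utf8_to_wide(raw));
    }
    return lines;
}
```

Each returned line can then be passed to Field through c_str(), exactly as in
the quoted loop below. If the file is in another encoding (for example
ISO-8859-1), the decoding step has to change accordingly.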


Also, please post such questions to the CLucene user group.

Itamar. 

> -----Original Message-----
> From: Itziar Cortes [mailto:itz...@eleka.net] 
> Sent: Sunday, June 20, 2010 12:21 PM
> To: gene...@lucene.apache.org
> Subject: Problem indexin accented characters.
> 
> Hi all!
> 
> I have a little problem with CLucene when I try to index 
> accented characters. I need to index characters like ñ, è, ü, or 
> ó. I use Luke to inspect the indexed data.
> 
> I tried this, and I had no problem:
> 
>  pDoc->add(*new Field(_T("field"), _T("a b ñ c d"),
>                       Field::STORE_YES | Field::INDEX_TOKENIZED));
> 
> 
> The problem begins when I try to read from a file and index 
> each line. For example:
> 
>  wifstream file;
>  wstring lineread;
>  while (std::getline(file, lineread)) {
>       pDoc->add(*new Field(_T("testua"), lineread.c_str(),
>                            Field::STORE_YES | Field::INDEX_TOKENIZED));
>  }
> 
> It only indexes "a" and "b".
> 
> 
> How can I solve this problem?
> 
> Thanks in advance,
> 
> Best regards,
> 
> --
> Itziar
> 


_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers
