Re: Problem indexing accented characters.

Itziar Cortes Sun, 20 Jun 2010 23:06:15 -0700

Hi!

Thanks for the reply.


I also suspected an encoding problem, but I am sure that the file is being
read correctly.
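
To rule out the reading step completely, one check I can make is to read the
raw bytes and decode them myself, since as far as I understand wifstream
converts bytes according to its locale. Below is a minimal sketch, assuming
the file is UTF-8 (I have not confirmed its encoding). The helper names
utf8_to_wide and read_utf8_lines are hypothetical and not part of CLucene.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into a wide string.
// Sketch only: malformed input is not validated, and every code point is
// assumed to fit in one wchar_t (true for characters such as ñ, è, ü, ó).
std::wstring utf8_to_wide(const std::string& bytes) {
    std::wstring out;
    for (std::size_t i = 0; i < bytes.size();) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        unsigned long cp;
        std::size_t extra;  // number of continuation bytes that follow
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its low 6 bits.
        for (; extra > 0 && i < bytes.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}

// Read every line of a file as raw bytes and decode each one from UTF-8.
std::vector<std::wstring> read_utf8_lines(const char* path) {
    std::vector<std::wstring> lines;
    std::ifstream file(path, std::ios::binary);
    std::string raw;
    while (std::getline(file, raw))
        lines.push_back(utf8_to_wide(raw));
    return lines;
}
```

Each decoded line could then be passed to Field through c_str(), the same way
as in my original loop.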

In general, the problem appears whenever I try to index a variable.

Could you tell me where I can post this question to the CLucene user group? Is
that a mailing list?

Thanks in advance,

--
Itziar

2010/6/20 Itamar Syn-Hershko <[email protected]>

> Looks like an encoding issue. Is the file being read correctly (check with
> your debugger)?
>
> Also, please post such questions to the CLucene user group.
>
> Itamar.
>
> > -----Original Message-----
> > From: Itziar Cortes [mailto:[email protected]]
> > Sent: Sunday, June 20, 2010 12:21 PM
> > To: [email protected]
> > Subject: Problem indexing accented characters.
> >
> > Hi all!
> >
> > I have a small problem with CLucene when I try to index
> > accented characters. I need to index characters like ñ, è, ü, or
> > ó. I use Luke to inspect the indexed data.
> >
> > I tried this with a string literal, and I had no problem:
> >
> >  pDoc->add(*new Field(_T("field"), _T("a b ñ c d"),
> > Field::STORE_YES | Field::INDEX_TOKENIZED));
> >
> >
> > The problem begins when I try to read from a file and index
> > each line. For example:
> >
> >  wifstream file;
> >  wstring lineread;
> >  while(std::getline(file, lineread)){
> >       pDoc->add(*new Field(_T("testua"), lineread.c_str(),
> >           Field::STORE_YES | Field::INDEX_TOKENIZED));
> >  }
> >
> > It only indexes "a" and "b".
> >
> >
> > How can I solve this problem?
> >
> > Thanks in advance,
> >
> > Best regards,
> >
> > --
> > Itziar
> >
>
>
