This looks like an encoding problem with these documents. Are they HTML pages? Are the words "orçamento" and "administração" on the same page, for example?
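A rough way to test the encoding guess is to look at the raw bytes of a file that works and one that does not. The sketch below is only an illustration: `classify_bytes` and `hex_dump` are hypothetical helper names, not part of CLucene, and it says nothing about which encoding CLucene itself expects. It only shows whether two inputs are encoded differently.

```python
def classify_bytes(data: bytes) -> str:
    """Make a rough guess at how a document's bytes are encoded."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; a single-byte encoding is a likely candidate.
        return "not utf-8 (possibly latin-1 / windows-1252)"


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


if __name__ == "__main__":
    # "\u00e7" is the c-cedilla; the escape avoids depending on how this
    # script file itself is saved.
    word = "or\u00e7amento"
    for codec in ("utf-8", "latin-1"):
        raw = word.encode(codec)
        print(codec, "|", hex_dump(raw), "|", classify_bytes(raw))
```

To check real files, read them in binary mode (`open(path, "rb").read()`) and pass the result to `classify_bytes`. If the document that fails and a document that works come back with different answers, that would support the encoding explanation.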
Can you dump one of these files here? (One that has the problem and one that does not.)

On Fri, Apr 23, 2010 at 9:05 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
> I am indexing some separate documents.
>
> The document that has these words is a small text document. This document
> is indexed without any visible error. This same document is found when I
> search for other words in it.
>
> Rui
>
> ------------------------------
> From: oniltonmac...@gmail.com
> Date: Fri, 23 Apr 2010 08:58:05 -0400
> To: clucene-developers@lists.sourceforge.net
> Subject: Re: [CLucene-dev] Clucene search - Do not found some words
>
> What are you indexing?
>
> Just a big document?
> Or a lot of separate documents? (HTML documents?)
>
> On Fri, Apr 23, 2010 at 8:54 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
>
> Hi Onilton,
>
> I have tested with "orcamento" instead of "orçamento" and didn't get
> anything.
>
> I do not know if lucene indexes "orçamento" in a wrong way, because it
> indexes without any error, but when I search for it I do not get anything.
>
> Thanks & Regards,
> Rui
>
> ------------------------------
> From: oniltonmac...@gmail.com
> Date: Fri, 23 Apr 2010 08:09:20 -0400
> To: clucene-developers@lists.sourceforge.net
> Subject: Re: [CLucene-dev] Clucene search - Do not found some words
>
> If "importação" works, "orçamento" should work too.
>
> But I don't understand the problem. Clucene removes these kinds of signs,
> so you should get "orcamento" instead of "orçamento".
>
> Where is the problem happening exactly? Does it happen when you search for
> "orçamento", or does Clucene index "orçamento" in a wrong way?
>
> On Fri, Apr 23, 2010 at 7:51 AM, Rui Oliveira <ruifra...@hotmail.com> wrote:
>
> I am using clucene-core-0.9.21b, and lucene search does not find some
> Portuguese words like "orçamento", "orçamentos" or "orça".
>
> But for other Portuguese words with Portuguese characters like
> "administração", "relações" or "importação" it works well.
>
> What could it be?
>
> Thanks & Regards,
> Rui
------------------------------------------------------------------------------
_______________________________________________ CLucene-developers mailing list CLucene-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/clucene-developers