OK. If you're crawling with these settings, you don't need to reindex your
segments again. What about the plugins you are using? Are you using
the language-identifier plugin? If not, try it.

Regards,

P.S. I speak Portuguese :)

On 9/25/06, carmmello <[EMAIL PROTECTED]> wrote:

This issue happens even when I start a new crawl, so I'm not reindexing
the segments. The indexing is done by Nutch itself, using the intranet
method.
Do you mean that after this is done I have to reindex the segments once
again? But if so, why are the English common terms recognized the first
time?
Thanks again
----- Original Message -----
From: "Lourival Júnior" <[EMAIL PROTECTED]>
To: <nutch-user@lucene.apache.org>
Sent: Monday, September 25, 2006 3:58 PM
Subject: Re: Common terms


Have you reindexed your segments? It's important, because it makes Nutch
recognize your common terms. I've tried it, and the only thing I noticed
was that the index is larger than the original (before using the common
terms).

On 9/25/06, carmmello <[EMAIL PROTECTED]> wrote:
>
> I'm using Nutch 0.7.2 and have added to the common-terms.utf8 file in the
> conf folder (and also under the classes folder, inside the ROOT folder in
> Tomcat) some common terms in Portuguese, one per line, like:
> ....................
> content:da
> contente:de
> contente:eu
> ..................
> However, when I try a search, I get all the results for those
> Portuguese common terms and, at the same time, zero results for the
> original English terms. I have even tried listing all the terms in
> alphabetical order, including the original ones, with the same results.
> In other words, Nutch does not seem to recognize the added common terms
> as such, only the original ones included in the distribution.
> Can anyone clarify this?
> Thanks
>



--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]








