OK. If you're crawling with these settings, you don't need to reindex your segments again. What about the plugins you are using? Are you using the language-identifier plugin? If not, try it.
Regards,

P.S. I speak Portuguese :)

On 9/25/06, carmmello <[EMAIL PROTECTED]> wrote:
This issue happens even when I start a new crawl, so I'm not reindexing the segments. The indexing is done by Nutch itself, using the intranet method. Do you mean that after this is done I have to reindex the segments once again? But if so, why are the English common terms recognized the first time?
Thanks again

----- Original Message -----
From: "Lourival Júnior" <[EMAIL PROTECTED]>
To: <nutch-user@lucene.apache.org>
Sent: Monday, September 25, 2006 3:58 PM
Subject: Re: Common terms

Have you reindexed your segments? It's important, because it makes Nutch recognize your common terms. I've tried it, and the only thing I noticed was that the index is larger than the original (before using the common terms).

On 9/25/06, carmmello <[EMAIL PROTECTED]> wrote:
> I'm using Nutch 0.7.2 and have added some common terms in Portuguese to
> common-terms.utf8 in the conf folder (and also under the classes folder,
> inside the ROOT folder on Tomcat), one per line, like:
> ....................
> content:da
> contente:de
> contente:eu
> ..................
> However, when I try a search, I get all the results for those Portuguese
> common terms and, at the same time, zero results for the original English
> terms. I have even tried listing all the terms in alphabetical order,
> including the original ones, with the same results. In other words, Nutch
> does not seem to recognize the added common terms as such, only the
> original ones included in the distribution.
> Can anyone clarify this?
> Thanks

--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]
-- Lourival Junior Universidade Federal do Pará Curso de Bacharelado em Sistemas de Informação http://www.ufpa.br/cbsi Msn: [EMAIL PROTECTED]
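
A side note on the sample lines quoted in this thread: two of them are written with the prefix `contente:` rather than `content:`. If they appear that way in the actual file, it may be worth checking. Below is a minimal sketch of a sanity check for such a file, assuming each non-comment line has the form `field:term` as in the quoted sample. The function name `check_common_terms` and the set of accepted field names are hypothetical and not part of Nutch; pass whatever field names your own common-terms.utf8 uses.

```python
def check_common_terms(lines, known_fields):
    """Return (line_number, text, problem) for each suspicious line.

    Assumption: every non-blank, non-comment line looks like "field:term".
    `known_fields` is supplied by the caller; this sketch does not know
    which field names Nutch itself accepts.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        # Skip blank lines and lines we assume to be comments.
        if not text or text.startswith("#"):
            continue
        field, sep, term = text.partition(":")
        if not sep:
            problems.append((number, text, "missing 'field:' prefix"))
        elif field not in known_fields:
            problems.append((number, text, f"unknown field '{field}'"))
        elif not term.strip():
            problems.append((number, text, "empty term"))
    return problems


if __name__ == "__main__":
    # The three sample lines exactly as quoted in the thread.
    sample = ["content:da", "contente:de", "contente:eu"]
    for number, text, problem in check_common_terms(sample, {"content"}):
        print(f"line {number}: {text!r}: {problem}")
```

To check a real file, read it with `open(path, encoding="utf-8")` and pass the file object as `lines`. This only catches formatting slips; it says nothing about whether the index was built with the updated terms.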