in the following lines:
Configuration c = NutchConfiguration.create();
/* Some code removed here */
c.set("analysis.common.terms.file", "common-terms.utf8");
and including in the root of the nutch-1.0.jar the file common-terms.utf8
obtained from the "$NUTCH_HOME/conf/"
Any ideas?
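
A quick way to check whether a file such as common-terms.utf8 is actually visible on the classpath is to ask a class loader for it directly. The sketch below is plain Java and does not use Nutch or Hadoop; the class name ClasspathLookupSketch and its methods are made up for illustration. It rests on the assumption that Hadoop's Configuration resolves resource names through a class loader in a similar way, which should be confirmed against the Hadoop version in use.

```java
import java.net.URL;

// Hypothetical helper, not part of Nutch or Hadoop.
public class ClasspathLookupSketch {

    // Returns the URL the class loader resolves for the name, or null if it is not visible.
    static URL locate(String name) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader of this class when no context loader is set.
            cl = ClasspathLookupSketch.class.getClassLoader();
        }
        return cl.getResource(name);
    }

    // Builds a one-line report for the given resource name.
    static String describe(String name) {
        URL url = locate(name);
        return url == null
                ? name + " not found on the classpath"
                : name + " resolved to " + url;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "common-terms.utf8";
        System.out.println(describe(name));
    }
}
```

Run from inside the same web application or with the same classpath as Nutch, this reports where the file is being picked up from, if anywhere. In a servlet container, WEB-INF/classes and the JARs in WEB-INF/lib are on the web application's classpath; loose files placed directly in WEB-INF or WEB-INF/lib are not.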
kazam wrote:
>
> Hi there,
> Nutch is giving me an error saying that
>
> org.apache.hadoop.conf.Configuration common-terms.utf8 not found
>
> I have tried to specify paths in Java using the Configuration object.
>
>
> ServletContext applicati
Hi there,
Nutch is giving me an error saying that
org.apache.hadoop.conf.Configuration common-terms.utf8 not found
I have tried to specify paths in Java using the Configuration object.
ServletContext application = session.getServletContext();
Configuration nutchConf = NutchConfiguration.get
Hi there,
For some reason nutch can't seem to find my common-terms.utf8 file. I have
placed it under WEB-INF, WEB-INF/classes and even under WEB-INF/lib.
In my nutch-default.xml the path to the file is as follows
analysis.common.terms.file
common-terms.utf8
The name of a file conta
Hi all,
I am trying to run Nutch 0.8 on a Linux server and am coming up with some
errors that did not appear when I ran Nutch on a Windows machine. I get
this error message: common-terms.utf8 not found, which is throwing a
java.lang.NullPointerException. The line that is giving the error message
> To: nutch-user@lucene.apache.org
> Subject: Understanding common-terms.utf8
Oops. I finally did my homework and found my way through the mail archives
and the responses to my FAQ questions.
http://www.mail-archive.com/nutch-user@lucene.apache.org/msg05635.html this
entire thread anws
other
Lucene uses stopwords.
So, how can I inject a stopword list in Nutch? How is common-terms.utf8
used? If it isn't a stopword file, what is it?
Ignacio J. Ortega
Dpto. soporte y desarrollo
http://www.derecho.com
http://www.elabogado.com
ructor.java:513)
at
org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:83)
... 65 more
This is caused by the common-terms.utf8 file not being found in line
152 of org.apache.nutch.analysis.CommonGrams. However, this file is
located on the root level of the nutch.jar in the lib directory that
also contains th
[EMAIL PROTECTED] wrote:
This is because Nutch turns those common terms into ngrams (not sure of what
size), and that increases the size of the index.
For example, if you have a phrase like:
vacation time
Normally, Nutch will index this phrase as 2 terms, a total of 12 characters
(probably
To: nutch-user@lucene.apache.org
Sent: Friday, August 11, 2006 8:19:41 AM
Subject: Re: [Nutch-general] common-terms.utf8
Hi Timo!
I analyzed the index before and after correctly using the
common-terms.utf8 file. Before adding the common terms in my language,
my index was about 3 MB.
Afte
Hi Timo!
Thanks a lot! Now I have a clear understanding of this file. This article
helps a lot too: http://searchenginewatch.com/showPage.html?page=2156061
Thanks again!
On 8/11/06, Timo Scheuer < [EMAIL PROTECTED]> wrote:
Hi,
> Could anyone explain me what does exactly t
Hi,
> Could anyone explain what exactly the common-terms.utf8 file does? I
> don't understand the real functionality of this file...
During indexing (and also during searching) the common terms are used to form
n-grams to make search faster for common words like articles for exam
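
To make the n-gram idea concrete, here is a minimal sketch in plain Java of how common terms could be combined with their neighbours into two-word grams. It is not Nutch's CommonGrams implementation: the class name CommonGramsSketch, the "-" separator, and the rule "emit a gram when either word of an adjacent pair is common" are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative only; not the code used by org.apache.nutch.analysis.CommonGrams.
public class CommonGramsSketch {

    // Returns the original words plus a two-word gram for every adjacent
    // pair in which at least one word is in the common-terms set.
    static List<String> withGrams(List<String> words, Set<String> common) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String word = words.get(i);
            out.add(word);
            if (i + 1 < words.size()) {
                String next = words.get(i + 1);
                if (common.contains(word) || common.contains(next)) {
                    out.add(word + "-" + next); // assumed separator
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "the" plays the role of an entry in common-terms.utf8.
        System.out.println(withGrams(List.of("the", "vacation", "time"), Set.of("the")));
        // prints [the, the-vacation, vacation, time]
    }
}
```

A phrase query containing a common word can then match the much rarer gram instead of scanning the long posting list of the common word alone, at the cost of extra terms in the index.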
Hi,
Could anyone explain what exactly the common-terms.utf8 file does? I
don't understand the real functionality of this file...
Regards,
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]