Dear all,
After encountering several problems using Nutch, I decided to follow
the procedure at: http://wiki.apache.org/nutch/RunNutchInEclipse0.9
I downloaded the latest trunk/nutch snapshot using the Eclipse SVN plugin
from this Apache repo URL:
Hello,
In the index, I have quite a few fields that are extracted from HTML meta
tags. As an example, I have a field called authors which contains the name
of the author of the document. On any given HTML page that I crawl, we can
have something like the following:
meta name=authors content=;
Hi all,
I am trying to run Nutch 0.8 on a Linux server and am running into some
errors that did not appear when I ran Nutch on a Windows machine. I get
this error message: "common-terms.utf8 not found", which is throwing a
java.lang.NullPointerException. The line that is giving the error message
Hi,
OK, I crawled and indexed 1000 websites, and I am trying to return search
results for only 5 of them. E.g., there may be 100 websites in the search
results, but I am interested in only 5 specific websites (site1.com,
site2.com ... site5.com only), so I am more interested in the rank of these 5
Step one is to identify the exact jar where this class lives. Are you sure
it's in mail.jar? Maybe it's in activation.jar?
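
One way to check this, as a rough sketch rather than anything Nutch-specific: ask the JVM where it loaded the class from. The names WhichJar and locate below are made up for illustration, and the class to look up is passed as an argument because this thread does not name it; java.lang.String is only a placeholder default.

```java
import java.security.CodeSource;

// Hypothetical helper: reports the jar or directory a class was loaded from.
final class WhichJar {

    // Returns the code source location of the given class as a string.
    // Classes loaded by the bootstrap loader (core JDK classes) have no
    // code source, so a note is returned for them instead.
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/JDK class, no code source)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Placeholder default; pass the fully qualified name of the class
        // you are actually hunting for as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locate(Class.forName(name)));
    }
}
```

Run it with the same classpath the failing program uses, passing the fully qualified class name as the argument. If Class.forName throws ClassNotFoundException, the class is not in any jar on that classpath at all.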
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
----- Original Message -----
From: Antony Bowesman a...@teamware.com
To: nutch-user@lucene.apache.org