On 12-11-02 12:45 PM, Lewis John Mcgibbney wrote:
Hi,
On Fri, Nov 2, 2012 at 5:36 PM, cocofan <[email protected]> wrote:
2012-11-01 14:46:52,027 ERROR security.UserGroupInformation -
PriviledgedActionException as:cocofan
I've never seen this Exception before...honestly.
cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
path does not exist:
file:/home/cocofan/Dropbox/project/apache-nutch-2.1/runtime/local/bin/urls
2012-11-01 14:46:52,027 ERROR crawl.InjectorJob - InjectorJob:
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does
not exist:
The rest seems to be pretty straightforward. You appear to be running
nutch from $NUTCH_HOME/runtime/local/bin with the following command
./nutch XYZ
I am running nutch from /runtime/local and I do have the
urls directory in both /runtime/local/bin and /runtime/local (with the
seed.txt file in both).
The command I'm using is (from /runtime/local):
./bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
Actually, it seems to be a problem with Hadoop, so I was
wondering whether I need to set a directory in a config file there?
Unless your urls directory is located in the ./bin directory (which I
doubt), you should come up one directory and run the command
from $NUTCH_HOME/runtime/local, e.g. ./bin/nutch XYZ
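To illustrate the point, here is a small shell sketch of how a relative argument such as urls resolves against the directory the command is launched from. It assumes the local job resolves relative input paths against the current working directory, which is what the logged path ending in runtime/local/bin/urls suggests. The helper resolve_input and the scratch directory layout are made up for this example and are not part of Nutch or Hadoop.

```shell
# Hypothetical helper (not part of Nutch): print where a relative input
# path would resolve when the command is launched from a given directory.
resolve_input() {
  # $1 = launch directory, $2 = relative input path
  (cd "$1" && printf '%s/%s\n' "$(pwd -P)" "$2")
}

# Scratch layout that mimics runtime/local, for illustration only.
demo=$(mktemp -d)
mkdir -p "$demo/runtime/local/bin" "$demo/runtime/local/urls"
touch "$demo/runtime/local/urls/seed.txt"

# Launched from runtime/local/bin, "urls" points inside bin/.
resolve_input "$demo/runtime/local/bin" urls

# Launched from runtime/local, "urls" points at the seed directory.
resolve_input "$demo/runtime/local" urls
```

If the first form matches the path in the error message, the job was started from the bin directory rather than from runtime/local.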
Does this make sense? Please read the tutorial carefully and
thoroughly and it will work perfectly.
hth
Lewis