On 12-11-02 12:45 PM, Lewis John Mcgibbney wrote:
Hi,

On Fri, Nov 2, 2012 at 5:36 PM, cocofan <[email protected]> wrote:

2012-11-01 14:46:52,027 ERROR security.UserGroupInformation -
PriviledgedActionException as:cocofan
I've never seen this Exception before...honestly.

cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
path does not exist:
file:/home/cocofan/Dropbox/project/apache-nutch-2.1/runtime/local/bin/urls
2012-11-01 14:46:52,027 ERROR crawl.InjectorJob - InjectorJob:
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does
not exist:
The rest seems to be pretty straightforward. You appear to be running
nutch from $NUTCH_HOME/runtime/local/bin with the following command
./nutch XYZ
I am running nutch from /runtime/local and I do have the urls directory in both /runtime/local/bin and /runtime/local (with the seed.txt file in both).

The command I'm using is (from /runtime/local):
./bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5

Actually, it seems to be a problem with Hadoop, so I was wondering whether I need to set a directory in a config file there.

Unless your urls directory is located in the ./bin directory (which I
doubt it is) then you should come up one directory and run the command
from $NUTCH_HOME/runtime/local e.g. ./bin/nutch XYZ
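
The working-directory point can be checked without Nutch at all. The sketch below is only an illustration: it builds a throwaway directory tree (a hypothetical stand-in for runtime/local, not a real install) and reports whether the relative path urls resolves from bin and from the parent directory. The check_urls helper and the example.com seed are made up for the demo.

```shell
#!/bin/sh
# Hypothetical stand-in for $NUTCH_HOME/runtime/local, built in a temp dir.
demo=$(mktemp -d)
mkdir -p "$demo/bin" "$demo/urls"
echo "http://example.com/" > "$demo/urls/seed.txt"

# Report how the relative path "urls" resolves from a given working directory.
# Runs in a subshell so the caller's working directory is left unchanged.
check_urls() {
  (
    cd "$1" || exit 1
    if [ -d urls ]; then
      echo "found: $(pwd)/urls"
    else
      echo "missing: $(pwd)/urls"
    fi
  )
}

from_bin=$(check_urls "$demo/bin")    # relative path looked up under bin/
from_local=$(check_urls "$demo")      # relative path looked up one level up

echo "$from_bin"
echo "$from_local"
```

The path printed for the bin case has the same shape as the one in the error above, which ends in /runtime/local/bin/urls. That suggests the job was started with bin as the working directory, although the log alone may not prove it.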

Does this make sense? Please read the tutorial carefully and
thoroughly and it will work perfectly.

hth

Lewis

