I am working from this tutorial and get a similar error: http://nlp.solutions.asia/?p=180
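
A minimal sketch for checking where a relative seed directory resolves. It assumes Nutch is running in local mode, where a relative path such as urls is resolved against the directory the command is launched from (the error quoted below shows it resolved under runtime/local/bin). The function name check_seed_dir is hypothetical and is not part of Nutch.

```shell
# Hypothetical helper (not part of Nutch): report whether seed.txt exists
# under a relative seed directory, resolved against the current directory.
check_seed_dir() {
  seed_dir="${1:-urls}"            # relative directory name passed to the crawl command
  resolved="$(pwd)/$seed_dir"      # where a relative path lands from the launch directory
  if [ -f "$resolved/seed.txt" ]; then
    echo "found: $resolved/seed.txt"
    return 0
  fi
  echo "missing: $resolved/seed.txt"
  return 1
}

# Usage (run from the same directory as the crawl command):
#   check_seed_dir urls
```

If this reports "missing" from the directory where ./bin/nutch is launched, the path printed is the one to compare against the "Input path does not exist" message.
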

On Fri, Nov 2, 2012 at 1:13 PM, cocofan <[email protected]> wrote:

> On 12-11-02 12:45 PM, Lewis John Mcgibbney wrote:
>
>> Hi,
>>
>> On Fri, Nov 2, 2012 at 5:36 PM, cocofan <[email protected]> wrote:
>>
>>> 2012-11-01 14:46:52,027 ERROR security.UserGroupInformation -
>>> PriviledgedActionException as:cocofan
>>
>> I've never seen this Exception before...honestly.
>>
>>> cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist:
>>> file:/home/cocofan/Dropbox/project/apache-nutch-2.1/runtime/local/bin/urls
>>> 2012-11-01 14:46:52,027 ERROR crawl.InjectorJob - InjectorJob:
>>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist:
>>
>> The rest seems to be pretty straight forward. You appear to be running
>> nutch from $NUTCH_HOME/runtime/local/bin with the following command
>> ./nutch XYZ
>
> I am running nutch from /runtime/local and I do have the urls
> directory in both /runtime/local/bin and /runtime/local (with the
> seed.txt file in both).
>
> The command I'm using is (from /runtime/local):
> ./bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
>
> Actually it seems to be a problem with hadoop so I was
> wondering if I need to set a directory in a config file there?
>
>> Unless you urls directory is located in the ./bin directory (which I
>> doubt it is) then you should come up one directory and run the command
>> from $NUTCH_HOME/runtime/local e.g. ./bin/nutch XYZ
>>
>> Does this make sense? Please read the tutorial carefully and
>> thoroughly and it will work perfectly.
>>
>> hth
>>
>> Lewis

--
Nicholas Roberts
US 510-684-8264
http://Permaculture.TV

