Correction, my mistake: I am actually getting a different NullPointerException:

Exception in thread "main" java.lang.NullPointerException
        at java.util.Hashtable.put(Hashtable.java:411)
        at java.util.Properties.setProperty(Properties.java:160)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:438)
        at org.apache.nutch.indexer.IndexerJob.createIndexJob(IndexerJob.java:128)
        at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:44)
        at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:192)
        at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
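For context (my own reading of the trace, not anything from the Nutch source): java.util.Hashtable.put throws NullPointerException when the key or value is null, so a Configuration.set(name, value) call reached through IndexerJob.createIndexJob would fail exactly like this if the property value (for example the Solr URL) is null. A minimal sketch of that failure mode, with an illustrative property name:

```java
import java.util.Properties;

public class NullValueDemo {
    // Returns true if Properties.setProperty (backed by Hashtable.put)
    // rejects a null value with a NullPointerException.
    static boolean rejectsNullValue() {
        Properties props = new Properties();
        try {
            // Simulates something like Configuration.set("solr.server.url", null)
            // when the URL was never supplied; the property name is illustrative.
            props.setProperty("solr.server.url", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null value rejected: " + rejectsNullValue());
    }
}
```

So it may be worth checking that every property Nutch sets on that path (the Solr URL in particular) actually has a value before the job is created.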



On Wed, Nov 28, 2012 at 4:18 PM, Nicholas Roberts <
[email protected]> wrote:

> I am working from this tutorial and get a similar error
> http://nlp.solutions.asia/?p=180
>
>
> On Fri, Nov 2, 2012 at 1:13 PM, cocofan <[email protected]> wrote:
>
>> On 12-11-02 12:45 PM, Lewis John Mcgibbney wrote:
>>
>>> Hi,
>>>
>>> On Fri, Nov 2, 2012 at 5:36 PM, cocofan <[email protected]> wrote:
>>>
>>>  2012-11-01 14:46:52,027 ERROR security.UserGroupInformation -
>>>> PriviledgedActionException as:cocofan
>>>>
>>> I've never seen this Exception before...honestly.
>>>
>>>  cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>>> Input
>>>> path does not exist:
>>>> file:/home/cocofan/Dropbox/project/apache-nutch-2.1/runtime/local/bin/urls
>>>> 2012-11-01 14:46:52,027 ERROR crawl.InjectorJob - InjectorJob:
>>>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
>>>> path does
>>>> not exist:
>>>>
>>> The rest seems pretty straightforward. You appear to be running
>>> nutch from $NUTCH_HOME/runtime/local/bin with a command like
>>> ./nutch XYZ
>>>
>> I am running nutch from /runtime/local, and I do have the urls
>> directory in both /runtime/local/bin and /runtime/local (with the
>> seed.txt file in both).
>>
>> The command I'm using (from /runtime/local) is:
>>     ./bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5
>>
>> Actually it seems to be a problem with Hadoop, so I was wondering
>> if I need to set a directory in a config file there?
>>
>>
>>  Unless your urls directory is located in the ./bin directory (which I
>>> doubt it is), you should come up one directory and run the command
>>> from $NUTCH_HOME/runtime/local, e.g. ./bin/nutch XYZ
>>>
>>> Does this make sense? Please read the tutorial carefully and
>>> thoroughly and it will work perfectly.
>>>
>>> hth
>>>
>>> Lewis
>>>
>>>
>>
>
>
> --
> Nicholas Roberts
> US 510-684-8264
> http://Permaculture.TV
>
>
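
To illustrate the point in the quoted advice (my own sketch, not Nutch code): with the local job runner, a relative input path like urls is resolved against the process's working directory, which is why launching bin/nutch from runtime/local/bin makes Hadoop look for runtime/local/bin/urls. The install path below is made up for the example:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkingDirDemo {
    // Resolves a relative input path against a working directory, the same
    // way a relative "urls" argument ends up resolved when running locally.
    static Path resolveInput(String workingDir, String input) {
        return Paths.get(workingDir).resolve(input);
    }

    public static void main(String[] args) {
        // Run from runtime/local: the seed directory found is runtime/local/urls.
        System.out.println(resolveInput("/opt/nutch/runtime/local", "urls"));
        // Run from runtime/local/bin: Hadoop looks for runtime/local/bin/urls.
        System.out.println(resolveInput("/opt/nutch/runtime/local/bin", "urls"));
    }
}
```

So the "Input path does not exist" error and the working directory you start from are two sides of the same thing: change directory, and the resolved path changes with you.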


-- 
Nicholas Roberts
US 510-684-8264
http://Permaculture.TV
