[Nutch-dev] [jira] Resolved: (NUTCH-428) NullPointerException

Sami Siren (JIRA) Fri, 12 Jan 2007 14:17:10 -0800

     [ 
https://issues.apache.org/jira/browse/NUTCH-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sami Siren resolved NUTCH-428.
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.9.0

Most probably you don't have an agent name configured in nutch-site.xml. I 
changed this situation to emit a RuntimeException in trunk instead, so it's 
easier to diagnose.
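
For reference, the agent name is normally set through the http.agent.name 
property. The fragment below is only a minimal sketch of a nutch-site.xml 
override, assuming the 0.8.x configuration layout with a <configuration> root 
element; the value "MyTestCrawler" is a made-up placeholder, not something 
taken from this report.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml; assumes the 0.8.x <configuration> layout. -->
<configuration>
  <property>
    <name>http.agent.name</name>
    <!-- Placeholder value (an assumption); use a name identifying your crawler. -->
    <value>MyTestCrawler</value>
    <description>Name sent in the HTTP User-Agent header when fetching.</description>
  </property>
</configuration>
```

With a non-empty value in place, re-running the same crawl command should show 
whether the missing agent name was the cause of the NullPointerException.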

> NullPointerException
> --------------------
>
>                 Key: NUTCH-428
>                 URL: https://issues.apache.org/jira/browse/NUTCH-428
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 0.8.1
>         Environment: Windows XP
>            Reporter: Piyush
>             Fix For: 0.9.0
>
>
> I am using the NUTCH.Bat provided in one of the threads (I am not using 
> Cygwin). Whenever I try to fetch the item, the fetch fails with a 
> "NullPointerException". 
> I have a URL directory which contains a urls.txt file. There is only one 
> entry in the file, which is http://www.winzip.com/land_about.htm. 
> I have updated crawl-urlfilter.txt with +^http://www.winzip.com/. 
> Are there any other settings I am missing? Any help is greatly appreciated. 
> The command I used to start the crawl is 
> nutch  crawl urls -dir crawlResults -depth 1
> Here is my log 
> crawl started in: crawlResult
> rootUrlDir = urls
> threads = 10
> depth = 1
> Injector: starting
> Injector: crawlDb: crawlResult/crawldb
> Injector: urlDir: urls
> Injector: Converting injected urls to crawl db entries.
> Injector: Merging injected urls into crawl db.
> Injector: done
> Generator: starting
> Generator: segment: crawlResult/segments/20070110085314
> Generator: Selecting best-scoring urls due for fetch.
> Generator: Partitioning selected urls by host, for politeness.
> Generator: done.
> Fetcher: starting
> Fetcher: segment: crawlResult/segments/20070110085314
> Fetcher: threads: 10
> fetching http://www.winzip.com/land_about.htm
> fetch of http://www.winzip.com/land_about.htm failed with: 
> java.lang.NullPointerException
> Fetcher: done
> CrawlDb update: starting
> CrawlDb update: db: crawlResult/crawldb
> CrawlDb update: segment: crawlResult/segments/20070110085314
> CrawlDb update: Merging segment data into db.
> CrawlDb update: done
> LinkDb: starting
> LinkDb: linkdb: crawlResult/linkdb
> LinkDb: adding segment: crawlResult/segments/20070110085314
> LinkDb: done
> Indexer: starting
> Indexer: linkdb: crawlResult/linkdb
> Indexer: adding segment: crawlResult/segments/20070110085314
> Optimizing index.
> Indexer: done
> Dedup: starting
> Dedup: adding indexes in: crawlResult/indexes
> Dedup: done
> Adding crawlResult/indexes/part-00000
> crawl finished: crawlResult
>  

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
