Hello.
I'm trying to run Nutch 1.3 on my LAN, following the NutchTutorial from
the wiki. When I run it with the command-line options "nutch crawl
urls -dir crawl -depth 3", I get the following output:
solrUrl is not set, indexing will be skipped...
crawl started in: crawl
rootUrlDir = urls
threads = 10
depth = 3
solrUrl=null
Injector: starting at 2011-07-11 09:35:37
Injector: crawlDb: crawl/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: Merging injected urls into crawl db.
Injector: finished at 2011-07-11 09:35:40, elapsed: 00:00:03
Generator: starting at 2011-07-11 09:35:40
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: Partitioning selected urls for politeness.
Generator: segment: crawl/segments/20110711093542
Generator: finished at 2011-07-11 09:35:43, elapsed: 00:00:03
Fetcher: starting at 2011-07-11 09:35:43
Fetcher: segment: crawl/segments/20110711093542
Fetcher: threads: 10
QueueFeeder finished: total 2 records + hit by time limit :0
fetching http://FIRST SITE/
fetching http://SECOND SITE/
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=3
-finishing thread FetcherThread, activeThreads=2
-finishing thread FetcherThread, activeThreads=3
fetch of http://FIRST SITE/ failed with: java.net.ConnectException:
Network is unreachable
-finishing thread FetcherThread, activeThreads=1
fetch of http://SECOND SITE/ failed with: java.net.ConnectException:
Network is unreachable
-finishing thread FetcherThread, activeThreads=0
-activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
-activeThreads=0
Fetcher: finished at 2011-07-11 09:35:45, elapsed: 00:00:02
ParseSegment: starting at 2011-07-11 09:35:45
ParseSegment: segment: crawl/segments/20110711093542
ParseSegment: finished at 2011-07-11 09:35:47, elapsed: 00:00:01
CrawlDb update: starting at 2011-07-11 09:35:47
CrawlDb update: db: crawl/crawldb
CrawlDb update: segments: [crawl/segments/20110711093542]
CrawlDb update: additions allowed: true
CrawlDb update: URL normalizing: true
CrawlDb update: URL filtering: true
CrawlDb update: Merging segment data into db.
CrawlDb update: finished at 2011-07-11 09:35:48, elapsed: 00:00:01
Generator: starting at 2011-07-11 09:35:48
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: true
Generator: normalizing: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: 0 records selected for fetching, exiting ...
Stopping at depth=1 - no more URLs to fetch.
LinkDb: starting at 2011-07-11 09:35:49
LinkDb: linkdb: crawl/linkdb
LinkDb: URL normalize: true
LinkDb: URL filter: true
LinkDb: adding segment:
file:/home/yusniel/Programas/nutch-1.3/runtime/local/bin/crawl/segments/20110711093542
LinkDb: finished at 2011-07-11 09:35:50, elapsed: 00:00:01
crawl finished: crawl
According to this output, the problem is related to network access;
however, I can reach those web sites from Firefox on the same machine.
I'm running Debian testing.
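One idea I have not verified: since the browser works but the JVM reports "Network is unreachable", perhaps Java is preferring IPv6 on this box while the LAN only routes IPv4. If I understand the bin/nutch script correctly, it passes NUTCH_OPTS through to the JVM, so something like this might be worth trying (just a sketch, not a confirmed fix):

```shell
# Untested idea: force the JVM onto the IPv4 stack, in case Java is
# attempting IPv6 connections that the network does not route.
export NUTCH_OPTS="-Djava.net.preferIPv4Stack=true"

# Then re-run the same crawl as before:
bin/nutch crawl urls -dir crawl -depth 3
```

Does that sound plausible, or is there a better way to see which address the fetcher is actually trying to connect to?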
Greetings.