Thanks John. I found the solution for port 50010.
It was just a firewall issue on the slave machine. I tested with the firewall
turned off, and it worked. Thanks!

On Wed, Mar 5, 2008 at 1:31 PM, John Mendenhall <[EMAIL PROTECTED]> wrote:

> On Wed, 05 Mar 2008, Developer Developer wrote:
>
> > Hello John and Fellow coders,
> >
> > Is there any resolution for this 50010 port connection error? I am really
> > struggling to get the multiple node environment working. I believe I have
> > followed all the steps on the wiki. I am using nutch 0.9.
> > Thanks!
> >
> > On Fri, Jan 11, 2008 at 12:57 AM, John Mendenhall <[EMAIL PROTECTED]> wrote:
> >
> > > Hello,
> > >
> > > I am running nutch 0.9 currently.
> > > I am running on 4 nodes, one is the master, in
> > > addition to being a slave.
> > >
> > > I am running the nutch crawl command.
> > > Everything runs fine until it gets to the dedup
> > > command. The output from the command is as follows:
> > >
> > > -----
> > > Dedup: starting
> > > Dedup: adding indexes in: /var/nutch/crawl/indexes
> > > Exception in thread "main" java.io.IOException: Job failed!
> > >     at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
> > >     at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
> > >     at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> > > -----
> > >
> > > ...
> > >
> > > The hadoop.log file contains the following interesting entries:
> > > (I have filtered out the thousands of debug ipc calls and results.)
> > >
> > > -----
> > > 2008-01-10 18:28:18,233 INFO indexer.DeleteDuplicates - Dedup: starting
> > > 2008-01-10 18:28:18,234 DEBUG conf.Configuration - java.io.IOException: config(config)
> > >     at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:102)
> > >     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:77)
> > >     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:88)
> > >     at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
> > >     at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:418)
> > >     at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> > >
> > > 2008-01-10 18:28:18,367 INFO indexer.DeleteDuplicates - Dedup: adding indexes in: /var/nutch/crawl/indexes
> > > 2008-01-10 18:28:18,382 DEBUG mapred.JobClient - default FileSystem: hdfs://sunset2:50000
> > > 2008-01-10 18:28:21,672 INFO mapred.InputFormatBase - Total input paths to process : 16
> > > 2008-01-10 18:28:21,674 DEBUG mapred.JobClient - Creating splits at hdfs://sunset2:50000/var/mapred/system/submit_qb31lw/job.split
> > > 2008-01-10 18:28:24,145 INFO mapred.JobClient - Running job: job_0019
> > > 2008-01-10 18:28:25,156 INFO mapred.JobClient - map 0% reduce 0%
> > > 2008-01-10 18:28:33,267 DEBUG mapred.TaskTracker - Child starting
> > > 2008-01-10 18:28:33,304 DEBUG conf.Configuration - java.io.IOException: config()
> > >     at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:93)
> > >     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> > >     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1425)
>
> The solution to the delete duplicates problem was the
> following link:
>
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg06705.html
>
> JohnM
>
> --
> john mendenhall
> [EMAIL PROTECTED]
> surf utopia
> internet services
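
For anyone hitting the same 50010 error: before turning a firewall off
entirely, it may help to check whether the port is reachable from another
node. Below is a minimal sketch in Python (my own choice of language, not
part of Nutch or Hadoop). The function name port_is_reachable and the idea
of passing slave hostnames on the command line are assumptions made for
illustration. It only tells you whether a TCP connection succeeds; it
cannot tell a firewall block apart from a daemon that is not running.

```python
import socket
import sys


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Hostnames are taken from the command line (hypothetical usage);
    # 50010 is the port reported in this thread.
    for host in sys.argv[1:]:
        state = "reachable" if port_is_reachable(host, 50010) else "blocked or not listening"
        print(f"{host}:50010 {state}")
```

Run it from the master with the slave hostnames as arguments. If a slave
shows as blocked while its daemon is running, the firewall on that slave is
the likely place to look, as it was in my case.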
