Is there a command to check whether port 50010 is open for socket connections?
Thanks!

On Wed, Mar 5, 2008 at 1:09 PM, Developer Developer <[EMAIL PROTECTED]> wrote:
> Hello John and fellow coders,
>
> Is there any resolution for this port 50010 connection error? I am really
> struggling to get the multiple-node environment working. I believe I have
> followed all the steps on the wiki. I am using Nutch 0.9.
> Thanks!
>
> 08-03-05 13:01:08,876 WARN dfs.DataNode - Failed to transfer blk_-1407334809134504262 to /9.2.209.4:50010
> java.net.SocketTimeoutException: connect timed out
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>         at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>         at java.net.Socket.connect(Socket.java:519)
>         at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:995)
>         at java.lang.Thread.run(Thread.java:619)
>
> On Fri, Jan 11, 2008 at 12:57 AM, John Mendenhall <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > I am running nutch 0.9 currently.
> > I am running on 4 nodes; one is the master as well as
> > a slave.
> >
> > I am running the nutch crawl command.
> > Everything runs fine until it gets to the dedup
> > command. The output from the command is as follows:
> >
> > -----
> > Dedup: starting
> > Dedup: adding indexes in: /var/nutch/crawl/indexes
> > Exception in thread "main" java.io.IOException: Job failed!
> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
> >         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
> >         at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> > -----
> >
> > Can anyone please point me in the direction of getting this
> > to work? I have excerpts of the interesting logs below. I
> > have spent hours reading posts on these errors, where I could find any.
> > It appears from many of the posts that some of these are innocuous,
> > given the WARN message type.
> >
> > I did turn on debug logging in log4j for the dedup process, so I
> > could see whether anything else was amiss. However, I
> > was unable to determine the cause of the problem.
> >
> > Everything worked great when we had everything on a single
> > machine, everything set to local, no distributed file system.
> >
> > Thank you in advance for any assistance or pointers you can
> > provide.
> >
> > The namenode log on the master has the following errors,
> > which occurred at approximately the same time:
> >
> > -----
> > 2008-01-10 18:28:03,358 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/nutch/crawl/indexes/part-00012 because it does not exist
> > 2008-01-10 18:28:07,145 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/nutch/crawl/indexes/part-00011 because it does not exist
> > 2008-01-10 18:28:10,562 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/nutch/crawl/indexes/part-00015 because it does not exist
> > 2008-01-10 18:28:12,616 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/nutch/crawl/indexes/part-00013 because it does not exist
> > 2008-01-10 18:28:13,955 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/nutch/crawl/indexes/part-00014 because it does not exist
> > 2008-01-10 18:28:16,526 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/mapred/system/job_0018 because it does not exist
> > 2008-01-10 18:28:22,028 WARN fs.FSNamesystem - Not able to place enough replicas, still in need of 1
> > 2008-01-10 18:28:22,114 WARN fs.FSNamesystem - Not able to place enough replicas, still in need of 1
> > 2008-01-10 18:28:22,207 WARN fs.FSNamesystem - Not able to place enough replicas, still in need of 1
> > 2008-01-10 18:29:16,724 WARN dfs.StateChange - DIR* FSDirectory.unprotectedDelete: failed to remove /var/mapred/system/job_0019 because it does not exist
> > -----
> >
> > The datanode log on the master has the following errors,
> > which occurred at approximately the same time:
> >
> > -----
> > 2008-01-10 18:28:29,742 WARN dfs.DataNode - Failed to transfer blk_-2596562194274011404 to /76.250.98.171:50010
> > java.net.SocketException: Broken pipe
> >         at java.net.SocketOutputStream.socketWrite0(Native Method)
> >         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> >         at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >         at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:1020)
> >         at java.lang.Thread.run(Thread.java:619)
> > 2008-01-10 18:28:31,412 WARN dfs.DataNode - Failed to transfer blk_-2596562194274011404 to /76.250.98.171:50010
> > java.net.SocketException: Broken pipe
> >         at java.net.SocketOutputStream.socketWrite0(Native Method)
> >         at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >         at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> >         at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >         at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:1020)
> >         at java.lang.Thread.run(Thread.java:619)
> > -----
> >
> > The jobtracker, tasktracker, and secondarynamenode logs appear to be
> > normal.
> >
> > The hadoop.log file contains the following interesting entries:
> > (I have filtered out the thousands of debug ipc calls and results.)
> > > > ----- > > 2008-01-10 18:28:18,233 INFO indexer.DeleteDuplicates - Dedup: starting > > 2008-01-10 18:28:18,234 DEBUG conf.Configuration - java.io.IOException: > > config(config) > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :102) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:77) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:88) > > at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27) > > at org.apache.nutch.indexer.DeleteDuplicates.dedup( > > DeleteDuplicates.java:418) > > at org.apache.nutch.crawl.Crawl.main(Crawl.java:135) > > > > 2008-01-10 18:28:18,367 INFO indexer.DeleteDuplicates - Dedup: adding > > indexes in: /var/nutch/crawl/indexes > > 2008-01-10 18:28:18,382 DEBUG mapred.JobClient - default FileSystem: > > hdfs://sunset2:50000 > > 2008-01-10 18:28:21,672 INFO mapred.InputFormatBase - Total input paths > > to process : 16 > > 2008-01-10 18:28:21,674 DEBUG mapred.JobClient - Creating splits at > > hdfs://sunset2:50000/var/mapred/system/submit_qb31lw/job.split > > 2008-01-10 18:28:24,145 INFO mapred.JobClient - Running job: job_0019 > > 2008-01-10 18:28:25,156 INFO mapred.JobClient - map 0% reduce 0% > > 2008-01-10 18:28:33,267 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:33,304 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1425) > > > > 2008-01-10 18:28:33,516 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:33,553 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1425) > > > > 2008-01-10 
18:28:35,485 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:35,657 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:36,858 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000004_0 > > 2008-01-10 18:28:37,406 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000000_0 > > 2008-01-10 18:28:38,133 WARN mapred.TaskTracker - Error running child > > java.lang.ArrayIndexOutOfBoundsException: -1 > > at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java > > :113) > > at > > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next > > (DeleteDuplicates.java:176) > > at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157) > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1445) > > 2008-01-10 18:28:38,787 DEBUG mapred.MapTask - opened spill0.out > > 2008-01-10 18:28:39,335 INFO mapred.JobClient - map 6% reduce 0% > > 2008-01-10 18:28:41,142 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:41,179 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at 
org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1425) > > > > 2008-01-10 18:28:41,358 INFO mapred.JobClient - Task Id : > > task_0019_m_000001_0, Status : FAILED > > 2008-01-10 18:28:41,494 INFO mapred.JobClient - Task Id : > > task_0019_m_000004_0, Status : FAILED > > 2008-01-10 18:28:42,738 INFO mapred.JobClient - Task Id : > > task_0019_m_000005_0, Status : FAILED > > 2008-01-10 18:28:42,757 INFO mapred.JobClient - Task Id : > > task_0019_m_000002_0, Status : FAILED > > 2008-01-10 18:28:43,338 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:43,716 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:43,758 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1425) > > > > 2008-01-10 18:28:44,494 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000007_0 > > 2008-01-10 18:28:44,798 INFO mapred.JobClient - Task Id : > > task_0019_m_000006_0, Status : FAILED > > 2008-01-10 18:28:45,749 WARN mapred.TaskTracker - Error running child > > java.lang.ArrayIndexOutOfBoundsException: -1 > > at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java > > :113) > > at > > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next > > (DeleteDuplicates.java:176) > > at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157) > > at 
org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1445) > > 2008-01-10 18:28:45,912 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:47,047 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000001_1 > > 2008-01-10 18:28:48,253 WARN mapred.TaskTracker - Error running child > > java.lang.ArrayIndexOutOfBoundsException: -1 > > at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java > > :113) > > at > > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next > > (DeleteDuplicates.java:176) > > at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157) > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1445) > > 2008-01-10 18:28:49,879 INFO mapred.JobClient - Task Id : > > task_0019_m_000007_0, Status : FAILED > > 2008-01-10 18:28:50,908 INFO mapred.JobClient - Task Id : > > task_0019_m_000008_0, Status : FAILED > > 2008-01-10 18:28:50,920 INFO mapred.JobClient - Task Id : > > task_0019_m_000004_1, Status : FAILED > > 2008-01-10 18:28:50,949 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:50,986 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > 
TaskTracker.java:1425) > > > > 2008-01-10 18:28:51,938 INFO mapred.JobClient - Task Id : > > task_0019_m_000001_1, Status : FAILED > > 2008-01-10 18:28:52,969 INFO mapred.JobClient - Task Id : > > task_0019_m_000005_1, Status : FAILED > > 2008-01-10 18:28:53,123 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:53,713 DEBUG mapred.TaskTracker - Child starting > > 2008-01-10 18:28:53,753 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1425) > > > > 2008-01-10 18:28:54,009 INFO mapred.JobClient - Task Id : > > task_0019_m_000009_0, Status : FAILED > > 2008-01-10 18:28:54,317 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000006_1 > > 2008-01-10 18:28:55,614 WARN mapred.TaskTracker - Error running child > > java.lang.ArrayIndexOutOfBoundsException: -1 > > at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java > > :113) > > at > > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next > > (DeleteDuplicates.java:176) > > at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157) > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1445) > > 2008-01-10 18:28:55,960 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java > > :93) 
> > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107) > > at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1435) > > > > 2008-01-10 18:28:57,080 DEBUG mapred.MapTask - Started thread: Sort > > progress reporter for task task_0019_m_000008_1 > > 2008-01-10 18:28:58,067 INFO mapred.JobClient - Task Id : > > task_0019_m_000003_0, Status : FAILED > > 2008-01-10 18:28:58,303 WARN mapred.TaskTracker - Error running child > > java.lang.ArrayIndexOutOfBoundsException: -1 > > at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java > > :113) > > at > > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next > > (DeleteDuplicates.java:176) > > at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157) > > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46) > > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175) > > at org.apache.hadoop.mapred.TaskTracker$Child.main( > > TaskTracker.java:1445) > > 2008-01-10 18:28:59,087 INFO mapred.JobClient - Task Id : > > task_0019_m_000007_1, Status : FAILED > > 2008-01-10 18:28:59,099 INFO mapred.JobClient - Task Id : > > task_0019_m_000006_1, Status : FAILED > > 2008-01-10 18:28:59,112 INFO mapred.JobClient - Task Id : > > task_0019_m_000002_1, Status : FAILED > > 2008-01-10 18:29:02,157 INFO mapred.JobClient - Task Id : > > task_0019_m_000008_1, Status : FAILED > > 2008-01-10 18:29:02,168 INFO mapred.JobClient - Task Id : > > task_0019_m_000001_2, Status : FAILED > > 2008-01-10 18:29:08,247 INFO mapred.JobClient - Task Id : > > task_0019_m_000004_2, Status : FAILED > > 2008-01-10 18:29:17,365 INFO mapred.JobClient - map 100% reduce 100% > > 2008-01-10 18:29:17,367 INFO mapred.JobClient - Task Id : > > task_0019_m_000001_3, Status : FAILED > > 2008-01-10 18:29:20,870 DEBUG conf.Configuration - java.io.IOException: > > config() > > at org.apache.hadoop.conf.Configuration.<init>(Configuration.java 
> > :93)
> >         at org.apache.hadoop.fs.FsShell.main(FsShell.java:910)
> >
> > 2008-01-10 18:29:25,582 DEBUG conf.Configuration - java.io.IOException: config()
> >         at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:93)
> >         at org.apache.hadoop.fs.FsShell.main(FsShell.java:910)
> > -----
> >
> > If you need me to post log excerpts from the other slaves, please
> > let me know and I'll put them up.
> >
> > Thanks!
> >
> > JohnM
> >
> > --
> > john mendenhall
> > [EMAIL PROTECTED]
> > surf utopia
> > internet services
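
On the question at the top of this thread (whether port 50010 accepts socket connections): one way to check from another node is to attempt a plain TCP connect with a timeout. The following is only a minimal sketch in Python, not anything shipped with Nutch or Hadoop; the function name port_is_open and the script name check_port.py are made up for illustration, and the host and port are the ones from the log above. It reports only whether the connect succeeded, so it cannot tell a firewall from a datanode that is not running.

```python
import socket
import sys


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable or unknown hosts.
        return False


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example (hypothetical script name): python check_port.py 9.2.209.4 50010
    host, port = sys.argv[1], int(sys.argv[2])
    state = "open" if port_is_open(host, port) else "closed or unreachable"
    print(f"{host}:{port} is {state}")
```

Run it from the node that logs the "Failed to transfer" warning, pointing at the address in that warning. As a rough guide, a connect that times out (as in the SocketTimeoutException above) often points to a firewall silently dropping packets or an unreachable address, while an immediate refusal usually means the host is reachable but nothing is listening on that port.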
