Is there a command to check whether port 50010 is open for socket
connections?


Thanks!
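
A minimal sketch of one way to check this, assuming Python is available on
the node. The helper name `port_is_open` and the 3-second default timeout are
illustrative choices, not part of Nutch or Hadoop. It simply attempts a TCP
connection and reports whether it succeeded. A quick manual alternative is
`telnet 9.2.209.4 50010` from the node that logged the warning.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    Hypothetical helper for illustration only. Any connection failure
    (refused, timed out, unreachable host, name lookup error) yields False.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # socket.timeout and socket.gaierror are both subclasses of OSError.
        return False
```

Calling `port_is_open("9.2.209.4", 50010)` from the datanode that logged the
warning would show whether that node can reach the target. A False result
often points to a firewall, a wrong bind address, or a datanode that is not
running, though the check alone cannot tell these apart.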


On Wed, Mar 5, 2008 at 1:09 PM, Developer Developer <[EMAIL PROTECTED]>
wrote:

> Hello John and Fellow coders,
>
> Is there any resolution for this port 50010 connection error? I am really
> struggling to get the multi-node environment working. I believe I have
> followed all the steps on the wiki. I am using Nutch 0.9.
> Thanks!
>
>
>
>
>
> 08-03-05 13:01:08,876 WARN  dfs.DataNode - Failed to transfer blk_-1407334809134504262 to /9.2.209.4:50010
> java.net.SocketTimeoutException: connect timed out
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>         at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>         at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>         at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>         at java.net.Socket.connect(Socket.java:519)
>         at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:995)
>         at java.lang.Thread.run(Thread.java:619)
>
>
>
>
> On Fri, Jan 11, 2008 at 12:57 AM, John Mendenhall <[EMAIL PROTECTED]>
> wrote:
>
> > Hello,
> >
> > I am currently running Nutch 0.9 on 4 nodes.
> > One of them is the master and also acts as a slave.
> >
> > I am running the nutch crawl command.
> > Everything runs fine until it gets to the dedup
> > command.  The output from the command is as follows:
> >
> > -----
> > Dedup: starting
> > Dedup: adding indexes in: /var/nutch/crawl/indexes
> > Exception in thread "main" java.io.IOException: Job failed!
> >        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
> >        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
> >        at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> > -----
> >
> > Can anyone please point me in the right direction to get this
> > working?  I have included excerpts of the interesting logs below.
> > I have spent hours reading whatever posts I could find on these
> > errors.  Many of the posts suggest that some of these messages are
> > innocuous, since they are logged at the WARN level.
> >
> > I turned on log4j debug logging for the dedup process to see
> > whether anything else was amiss.  However, I was unable to
> > determine the cause of the problem.
> >
> > Everything worked fine when we ran on a single machine, with
> > everything set to local and no distributed file system.
> >
> > Thank you in advance for any assistance or pointers you can
> > provide.
> >
> > The namenode log on the master has the following errors,
> > which occurred at approximately the same time:
> >
> > -----
> > 2008-01-10 18:28:03,358 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/nutch/crawl/indexes/part-00012 because it does not exist
> > 2008-01-10 18:28:07,145 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/nutch/crawl/indexes/part-00011 because it does not exist
> > 2008-01-10 18:28:10,562 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/nutch/crawl/indexes/part-00015 because it does not exist
> > 2008-01-10 18:28:12,616 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/nutch/crawl/indexes/part-00013 because it does not exist
> > 2008-01-10 18:28:13,955 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/nutch/crawl/indexes/part-00014 because it does not exist
> > 2008-01-10 18:28:16,526 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/mapred/system/job_0018 because it does not exist
> > 2008-01-10 18:28:22,028 WARN  fs.FSNamesystem - Not able to place enough
> > replicas, still in need of 1
> > 2008-01-10 18:28:22,114 WARN  fs.FSNamesystem - Not able to place enough
> > replicas, still in need of 1
> > 2008-01-10 18:28:22,207 WARN  fs.FSNamesystem - Not able to place enough
> > replicas, still in need of 1
> > 2008-01-10 18:29:16,724 WARN  dfs.StateChange - DIR*
> > FSDirectory.unprotectedDelete: failed to remove
> > /var/mapred/system/job_0019 because it does not exist
> > -----
> >
> > The datanode log on the master has the following errors,
> > which occurred at approximately the same time:
> >
> > -----
> > 2008-01-10 18:28:29,742 WARN  dfs.DataNode - Failed to transfer blk_-2596562194274011404 to /76.250.98.171:50010
> > java.net.SocketException: Broken pipe
> >        at java.net.SocketOutputStream.socketWrite0(Native Method)
> >        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> >        at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >        at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:1020)
> >        at java.lang.Thread.run(Thread.java:619)
> > 2008-01-10 18:28:31,412 WARN  dfs.DataNode - Failed to transfer blk_-2596562194274011404 to /76.250.98.171:50010
> > java.net.SocketException: Broken pipe
> >        at java.net.SocketOutputStream.socketWrite0(Native Method)
> >        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
> >        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
> >        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> >        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
> >        at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >        at org.apache.hadoop.dfs.DataNode$DataTransfer.run(DataNode.java:1020)
> >        at java.lang.Thread.run(Thread.java:619)
> > -----
> >
> > The jobtracker, tasktracker, and secondarynamenode logs appear to be
> > normal.
> >
> > The hadoop.log file contains the following interesting entries:
> > (I have filtered out the thousands of debug ipc calls and results.)
> >
> > -----
> > 2008-01-10 18:28:18,233 INFO  indexer.DeleteDuplicates - Dedup: starting
> > 2008-01-10 18:28:18,234 DEBUG conf.Configuration - java.io.IOException:
> > config(config)
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :102)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:77)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:88)
> >        at org.apache.nutch.util.NutchJob.<init>(NutchJob.java:27)
> >        at org.apache.nutch.indexer.DeleteDuplicates.dedup(
> > DeleteDuplicates.java:418)
> >        at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> >
> > 2008-01-10 18:28:18,367 INFO  indexer.DeleteDuplicates - Dedup: adding
> > indexes in: /var/nutch/crawl/indexes
> > 2008-01-10 18:28:18,382 DEBUG mapred.JobClient - default FileSystem:
> > hdfs://sunset2:50000
> > 2008-01-10 18:28:21,672 INFO  mapred.InputFormatBase - Total input paths
> > to process : 16
> > 2008-01-10 18:28:21,674 DEBUG mapred.JobClient - Creating splits at
> > hdfs://sunset2:50000/var/mapred/system/submit_qb31lw/job.split
> > 2008-01-10 18:28:24,145 INFO  mapred.JobClient - Running job: job_0019
> > 2008-01-10 18:28:25,156 INFO  mapred.JobClient -  map 0% reduce 0%
> > 2008-01-10 18:28:33,267 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:33,304 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:33,516 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:33,553 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:35,485 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:35,657 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:36,858 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000004_0
> > 2008-01-10 18:28:37,406 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000000_0
> > 2008-01-10 18:28:38,133 WARN  mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >        at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java
> > :113)
> >        at
> > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next
> > (DeleteDuplicates.java:176)
> >        at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1445)
> > 2008-01-10 18:28:38,787 DEBUG mapred.MapTask - opened spill0.out
> > 2008-01-10 18:28:39,335 INFO  mapred.JobClient -  map 6% reduce 0%
> > 2008-01-10 18:28:41,142 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:41,179 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:41,358 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000001_0, Status : FAILED
> > 2008-01-10 18:28:41,494 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000004_0, Status : FAILED
> > 2008-01-10 18:28:42,738 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000005_0, Status : FAILED
> > 2008-01-10 18:28:42,757 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000002_0, Status : FAILED
> > 2008-01-10 18:28:43,338 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:43,716 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:43,758 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:44,494 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000007_0
> > 2008-01-10 18:28:44,798 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000006_0, Status : FAILED
> > 2008-01-10 18:28:45,749 WARN  mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >        at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java
> > :113)
> >        at
> > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next
> > (DeleteDuplicates.java:176)
> >        at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1445)
> > 2008-01-10 18:28:45,912 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:47,047 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000001_1
> > 2008-01-10 18:28:48,253 WARN  mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >        at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java
> > :113)
> >        at
> > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next
> > (DeleteDuplicates.java:176)
> >        at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1445)
> > 2008-01-10 18:28:49,879 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000007_0, Status : FAILED
> > 2008-01-10 18:28:50,908 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000008_0, Status : FAILED
> > 2008-01-10 18:28:50,920 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000004_1, Status : FAILED
> > 2008-01-10 18:28:50,949 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:50,986 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:51,938 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000001_1, Status : FAILED
> > 2008-01-10 18:28:52,969 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000005_1, Status : FAILED
> > 2008-01-10 18:28:53,123 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:53,713 DEBUG mapred.TaskTracker - Child starting
> > 2008-01-10 18:28:53,753 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:58)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1425)
> >
> > 2008-01-10 18:28:54,009 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000009_0, Status : FAILED
> > 2008-01-10 18:28:54,317 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000006_1
> > 2008-01-10 18:28:55,614 WARN  mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >        at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java
> > :113)
> >        at
> > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next
> > (DeleteDuplicates.java:176)
> >        at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1445)
> > 2008-01-10 18:28:55,960 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:107)
> >        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:99)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1435)
> >
> > 2008-01-10 18:28:57,080 DEBUG mapred.MapTask - Started thread: Sort
> > progress reporter for task task_0019_m_000008_1
> > 2008-01-10 18:28:58,067 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000003_0, Status : FAILED
> > 2008-01-10 18:28:58,303 WARN  mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >        at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java
> > :113)
> >        at
> > org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next
> > (DeleteDuplicates.java:176)
> >        at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >        at org.apache.hadoop.mapred.TaskTracker$Child.main(
> > TaskTracker.java:1445)
> > 2008-01-10 18:28:59,087 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000007_1, Status : FAILED
> > 2008-01-10 18:28:59,099 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000006_1, Status : FAILED
> > 2008-01-10 18:28:59,112 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000002_1, Status : FAILED
> > 2008-01-10 18:29:02,157 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000008_1, Status : FAILED
> > 2008-01-10 18:29:02,168 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000001_2, Status : FAILED
> > 2008-01-10 18:29:08,247 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000004_2, Status : FAILED
> > 2008-01-10 18:29:17,365 INFO  mapred.JobClient -  map 100% reduce 100%
> > 2008-01-10 18:29:17,367 INFO  mapred.JobClient - Task Id :
> > task_0019_m_000001_3, Status : FAILED
> > 2008-01-10 18:29:20,870 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.fs.FsShell.main(FsShell.java:910)
> >
> > 2008-01-10 18:29:25,582 DEBUG conf.Configuration - java.io.IOException:
> > config()
> >        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java
> > :93)
> >        at org.apache.hadoop.fs.FsShell.main(FsShell.java:910)
> > -----
> >
> > If you need me to post log excerpts from the other slaves, please
> > let me know and I'll put them up.
> >
> > Thanks!
> >
> > JohnM
> >
> > --
> > john mendenhall
> > [EMAIL PROTECTED]
> > surf utopia
> > internet services
> >
>
>
