Hi Andrew,

Thanks for your suggestion. I updated hdfs-site.xml on the server side and also on the client side to use hostnames instead of IPs, as described here => http://rainerpeter.wordpress.com/2014/02/12/connect-to-hdfs-running-in-ec2-using-public-ip-addresses/. Now I can see that the client is able to talk to the datanode.
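For reference, the relevant knobs are roughly the two below. This is a minimal sketch along the lines of that post - exact settings may differ per setup, so treat it as illustrative rather than a drop-in config:

    <!-- hdfs-site.xml on the client (the machine running spark-submit):
         connect to datanodes using the hostname the namenode hands back,
         instead of the (private) IP it advertises -->
    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>
    </property>

    <!-- hdfs-site.xml on the cluster side: have datanodes talk to each
         other by hostname as well -->
    <property>
      <name>dfs.datanode.use.datanode.hostname</name>
      <value>true</value>
    </property>

This works because the EC2 public DNS names (ec2-*.compute-1.amazonaws.com) resolve to the public IP from outside EC2 and to the private IP from inside, so the same hostname is reachable from both sides.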
Also, I will consider submitting the application from within EC2 itself so that the private IP is resolvable.

Thanks
Praveen

On Fri, Jun 20, 2014 at 2:35 AM, Andrew Or <and...@databricks.com> wrote:

> (Also, an easier workaround is to simply submit the application from
> within your cluster, thus saving you all the manual labor of
> reconfiguring everything to use public hostnames. This may or may not
> be applicable to your use case.)
>
>
> 2014-06-19 14:04 GMT-07:00 Andrew Or <and...@databricks.com>:
>
>> Hi Praveen,
>>
>> Yes, the fact that it is trying to use a private IP from outside of the
>> cluster is suspicious. My guess is that your HDFS is configured to use
>> internal IPs rather than external IPs. This means that even though the
>> hadoop confs on your local machine only use external IPs, the
>> org.apache.spark.deploy.yarn.Client running on your local machine is
>> trying to use whatever address your HDFS name node tells it to use,
>> which is private in this case.
>>
>> A potential fix is to update your hdfs-site.xml (and other related
>> configs) within your cluster to use public hostnames. Let me know if
>> that does the job.
>>
>> Andrew
>>
>>
>> 2014-06-19 6:04 GMT-07:00 Praveen Seluka <psel...@qubole.com>:
>>
>>> I am trying to run Spark on YARN. I have a Hadoop 2.2 cluster (YARN +
>>> HDFS) in EC2. Then I compiled Spark using Maven with the Hadoop 2.2
>>> profile. Now I am trying to run the example Spark job (in yarn-cluster
>>> mode) from my *local machine*. I have set up the HADOOP_CONF_DIR
>>> environment variable correctly.
>>>
>>> ➜ spark git:(master) ✗ /bin/bash -c "./bin/spark-submit --class
>>> org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 2
>>> --driver-memory 2g --executor-memory 2g --executor-cores 1
>>> examples/target/scala-2.10/spark-examples_*.jar 10"
>>> 14/06/19 14:59:39 WARN util.NativeCodeLoader: Unable to load
>>> native-hadoop library for your platform... using builtin-java classes
>>> where applicable
>>> 14/06/19 14:59:39 INFO client.RMProxy: Connecting to ResourceManager at
>>> ec2-54-242-244-250.compute-1.amazonaws.com/54.242.244.250:8050
>>> 14/06/19 14:59:41 INFO yarn.Client: Got Cluster metric info from
>>> ApplicationsManager (ASM), number of NodeManagers: 1
>>> 14/06/19 14:59:41 INFO yarn.Client: Queue info ... queueName: default,
>>> queueCurrentCapacity: 0.0, queueMaxCapacity: 1.0,
>>> queueApplicationCount = 0, queueChildQueueCount = 0
>>> 14/06/19 14:59:41 INFO yarn.Client: Max mem capabililty of a single
>>> resource in this cluster 12288
>>> 14/06/19 14:59:41 INFO yarn.Client: Preparing Local resources
>>> 14/06/19 14:59:42 WARN hdfs.BlockReaderLocal: The short-circuit local
>>> reads feature cannot be used because libhadoop cannot be loaded.
>>> 14/06/19 14:59:43 INFO yarn.Client: Uploading
>>> file:/home/rgupta/awesome/spark/examples/target/scala-2.10/spark-examples_2.10-1.0.0-SNAPSHOT.jar
>>> to hdfs://ec2-54-242-244-250.compute-1.amazonaws.com:8020/user/rgupta/.sparkStaging/application_1403176373037_0009/spark-examples_2.10-1.0.0-SNAPSHOT.jar
>>> 14/06/19 15:00:45 INFO hdfs.DFSClient: Exception in createBlockOutputStream
>>> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout
>>> while waiting for channel to be ready for connect.
>>> ch : java.nio.channels.SocketChannel[connection-pending remote=/10.180.150.66:50010]
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
>>> at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1305)
>>> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1128)
>>> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1088)
>>> at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
>>> 14/06/19 15:00:45 INFO hdfs.DFSClient: Abandoning
>>> BP-1714253233-10.180.215.105-1403176367942:blk_1073741833_1009
>>> 14/06/19 15:00:46 INFO hdfs.DFSClient: Excluding datanode 10.180.150.66:50010
>>> 14/06/19 15:00:46 WARN hdfs.DFSClient: DataStreamer Exception
>>>
>>> It's able to talk to the ResourceManager. It then uploads the examples
>>> jar to HDFS, and that is where it fails: it times out trying to write
>>> to the datanode. I verified that port 50010 is accessible from my local
>>> machine. Any idea what the issue is here?
>>> One thing that's suspicious is */10.180.150.66:50010* - it looks like
>>> it is trying to connect using the private IP. If so, how can I get it
>>> to use the public IP instead?
>>>
>>> Thanks
>>> Praveen
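PS, for anyone who finds this thread later: a quick way to check where that /10.180.150.66:50010 comes from is to ask the namenode directly from the submitting machine. A sketch, assuming HADOOP_CONF_DIR points at the same confs spark-submit uses (-report may need HDFS superuser rights):

    # What the client-side hostname flag resolves to on this machine
    hdfs getconf -confKey dfs.client.use.datanode.hostname

    # How the namenode sees the datanodes; the Name:/Hostname: lines are
    # the addresses it hands back to clients for block writes
    hdfs dfsadmin -report

If the report shows private IPs, that is what the client will be told to connect to, which matches the timeout above.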