[jira] [Updated] (SPARK-925) Allow ec2 scripts to load default options from a json file
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-925:
--
Description: The option list for the ec2 script can be a little irritating to type in, especially things like the path to the identity file, region, zone, AMI, etc. It would be nice if the ec2 script looked for an options.json file in the following order: (1) CWD, (2) ~/spark-ec2, (3) the same dir as spark_ec2.py. Something like:
{code}
def get_defaults_from_options():
    # Check to see if an options.json file exists; if so, load it.
    # However, values in the options.json file can only override values in opts
    # if the opt values are None or '' - i.e. command-line options take precedence.
    defaults = {'aws-access-key-id': '', 'aws-secret-access-key': '', 'key-pair': '',
                'identity-file': '', 'region': 'ap-southeast-1', 'zone': '',
                'ami': '', 'slaves': 1, 'instance-type': 'm1.large'}
    # Look for options.json in the directory the cluster was called from.
    # Had to modify the spark_ec2 wrapper script since it mangles the pwd.
    startwd = os.environ['STARTWD']
    if os.path.exists(os.path.join(startwd, 'options.json')):
        optionspath = os.path.join(startwd, 'options.json')
    else:
        optionspath = os.path.join(os.getcwd(), 'options.json')
    try:
        print 'Loading options file:', optionspath
        with open(optionspath) as json_data:
            jdata = json.load(json_data)
        for k in jdata:
            defaults[k] = jdata[k]
    except IOError:
        print 'Warning: options.json file not loaded'
    # Check permissions on identity-file if defined; otherwise the launch
    # will fail late, which is irritating.
    if defaults['identity-file'] != '':
        st = os.stat(defaults['identity-file'])
        user_can_read = bool(st.st_mode & stat.S_IRUSR)
        grp_perms = bool(st.st_mode & stat.S_IRWXG)
        others_perm = bool(st.st_mode & stat.S_IRWXO)
        if not user_can_read:
            print 'No read permission to read', defaults['identity-file']
            sys.exit(1)
        if grp_perms or others_perm:
            print 'Permissions are too open, please chmod 600 file', defaults['identity-file']
            sys.exit(1)
    # If defaults contain the AWS access id or secret key, set them in the
    # environment; required for use with boto to access the AWS console.
    if defaults['aws-access-key-id'] != '':
        os.environ['AWS_ACCESS_KEY_ID'] = defaults['aws-access-key-id']
    if defaults['aws-secret-access-key'] != '':
        os.environ['AWS_SECRET_ACCESS_KEY'] = defaults['aws-secret-access-key']
    return defaults
{code}
(The previous description, quoted under "was:", was identical except that the search order started with PWD instead of CWD.)

Allow ec2 scripts to load default options from a json file
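For illustration, the three-step search order proposed above (CWD, then ~/spark-ec2, then the directory containing spark_ec2.py) could be factored out like this. This is a Python 3 sketch; the function names are mine, not from the proposed patch:

```python
import json
import os


def find_options_file(script_dir, filename="options.json"):
    """Return the first options file found in the CWD, ~/spark-ec2,
    or the directory containing spark_ec2.py; None if absent."""
    candidates = [
        os.getcwd(),
        os.path.expanduser("~/spark-ec2"),
        script_dir,
    ]
    for d in candidates:
        path = os.path.join(d, filename)
        if os.path.exists(path):
            return path
    return None


def merge_defaults(defaults, options_path):
    """Overlay values from the JSON file onto the built-in defaults;
    command-line options would still take precedence later."""
    merged = dict(defaults)
    if options_path is not None:
        with open(options_path) as fh:
            merged.update(json.load(fh))
    return merged
```

Keeping the lookup and the merge separate makes the precedence rule (file overrides defaults, command line overrides file) easy to test in isolation.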
[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available
[ https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395562#comment-14395562 ] Vladimir Grigor commented on SPARK-5242: updated PR from upstream/master

ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available
---
Key: SPARK-5242 URL: https://issues.apache.org/jira/browse/SPARK-5242 Project: Spark Issue Type: Bug Components: EC2 Reporter: Vladimir Grigor Labels: easyfix

How to reproduce: a user starting a cluster in a VPC has to wait forever:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript
Setting up security groups...
Searching for existing cluster SparkByScript...
Spark AMI: ami-1ae0166d
Launching instances...
Launched 1 slaves in eu-west-1a, regid = r-e70c5502
Launched master in eu-west-1a, regid = r-bf0f565a
Waiting for cluster to enter 'ssh-ready' state..{forever}
{code}
The problem is that the current code wrongly assumes that a VPC instance has a public_dns_name or a public ip_address; in fact it is more common for a VPC instance to have only a private_ip_address. The bug is already fixed in my fork; I am going to submit a pull request.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
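The fix described above amounts to falling back through whatever addresses the instance actually has. A minimal sketch of that preference order (the parameter names mirror boto's instance fields public_dns_name, ip_address, and private_ip_address, but this helper is illustrative, not the code from the PR):

```python
def get_reachable_address(public_dns_name, ip_address, private_ip_address):
    """Prefer the public DNS name, then the public IP; fall back to the
    private IP, which is often the only address a VPC instance has."""
    for addr in (public_dns_name, ip_address, private_ip_address):
        if addr:  # boto leaves unset fields as None or ''
            return addr
    raise ValueError("instance has no usable address")
```

With this fallback, the "Waiting for cluster to enter 'ssh-ready' state" loop can probe the private IP instead of waiting forever on a public address that will never appear.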
[jira] [Commented] (SPARK-5479) PySpark on yarn mode needs to support non-local python files
[ https://issues.apache.org/jira/browse/SPARK-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298400#comment-14298400 ] Vladimir Grigor commented on SPARK-5479: https://github.com/apache/spark/pull/3976 potentially closes this issue

PySpark on yarn mode needs to support non-local python files
---
Key: SPARK-5479 URL: https://issues.apache.org/jira/browse/SPARK-5479 Project: Spark Issue Type: Bug Components: PySpark Reporter: Lianhui Wang

In SPARK-5162 [~vgrigor] reports this: the following code cannot work:
{code}
aws emr add-steps --cluster-id j-XYWIXMD234 \
 --steps Name=SparkPi,Jar=s3://eu-west-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn-cluster,--py-files,s3://mybucketat.amazonaws.com/tasks/main.py,main.py,param1],ActionOnFailure=CONTINUE
{code}
So we need to support non-local Python files in yarn client and cluster mode: before submitting the application to Yarn, we need to download non-local files to a local or HDFS path, or spark.yarn.dist.files needs to support other non-local files.
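Support like this usually starts with a locality check on each --py-files entry. A minimal illustrative sketch (Python 3; `is_local_file` and `localize` are hypothetical helper names, and the actual fetch for s3:// or hdfs:// resources is deliberately stubbed out):

```python
import os
from urllib.parse import urlparse


def is_local_file(path):
    """True for plain paths and file:// URIs, which YARN can ship as-is;
    s3://, hdfs://, and http:// resources would need to be fetched first."""
    scheme = urlparse(path).scheme
    return scheme in ("", "file")


def localize(path, download_dir="/tmp"):
    """Return a local path for `path`, downloading it first if remote.
    The actual download (boto for s3://, an HDFS client for hdfs://) is
    left out; this only shows where it would slot in."""
    if is_local_file(path):
        return path
    local = os.path.join(download_dir, os.path.basename(urlparse(path).path))
    # ...fetch `path` into `local` here...
    return local
```

Applying `localize` to every --py-files argument before building the YARN container request would make the s3:// example above work unchanged.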
[jira] [Commented] (SPARK-5162) Python yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296430#comment-14296430 ] Vladimir Grigor commented on SPARK-5162: [~lianhuiwang] Thank you for the workaround suggestion! Still, I believe it would be great to have the remote script files feature - that would improve the usability of the YARN component in Spark a lot. If you think that is the case, and you know the technical details of the system better, could you please create a ticket for that feature? Or please comment with any ideas on the technical implementation. Thank you!

Python yarn-cluster mode
Key: SPARK-5162 URL: https://issues.apache.org/jira/browse/SPARK-5162 Project: Spark Issue Type: New Feature Components: PySpark, YARN Reporter: Dana Klassen Labels: cluster, python, yarn

Running pyspark in yarn is currently limited to 'yarn-client' mode. It would be great to be able to submit python applications to the cluster and (just like java classes) have the resource manager set up an AM on any node in the cluster. Does anyone know the issues blocking this feature? I was snooping around with enabling python apps: removing the logic stopping python and yarn-cluster from SparkSubmit.scala:
{code}
...
// The following modes are not supported or applicable
(clusterManager, deployMode) match {
  ...
  case (_, CLUSTER) if args.isPython =>
    printErrorAndExit("Cluster deploy mode is currently not supported for python applications.")
  ...
}
{code}
... and submitting the application via:
{code}
HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster --num-executors 2 --py-files {{insert location of egg here}} --executor-cores 1 ../tools/canary.py
{code}
Everything looks to run alright: PythonRunner is picked up as the main class, resources get set up, the yarn client gets launched, but it falls flat on its face:
{code}
2015-01-08 18:48:03,444 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: DEBUG: FAILED { {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py, 1420742868009, FILE, null }, Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py changed on src filesystem (expected 1420742868009, was 1420742869284
{code}
and
{code}
2015-01-08 18:48:03,446 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py(-/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py) transitioned from DOWNLOADING to FAILED
{code}
Tracked this down to the Apache Hadoop code (FSDownload.java line 249) related to container localization of files upon downloading. At this point I thought it would be best to raise the issue here and get input.
[jira] [Commented] (SPARK-5162) Python yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293397#comment-14293397 ] Vladimir Grigor commented on SPARK-5162: [~lianhuiwang] Thank you for reply :) yarn client mode does not support remote scripts either. It'd be very good to have this feature. Please let me know if I could help you anyhow.
[jira] [Commented] (SPARK-595) Document local-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292052#comment-14292052 ] Vladimir Grigor commented on SPARK-595: --- +1 for reopen

Document local-cluster mode
-
Key: SPARK-595 URL: https://issues.apache.org/jira/browse/SPARK-595 Project: Spark Issue Type: New Feature Components: Documentation Affects Versions: 0.6.0 Reporter: Josh Rosen Priority: Minor

The 'Spark Standalone Mode' guide describes how to manually launch a standalone cluster, which can be done locally for testing, but it does not mention SparkContext's `local-cluster` option. What are the differences between these approaches? Which one should I prefer for local testing? Can I still use the standalone web interface if I use 'local-cluster' mode? It would be useful to document this.
[jira] [Commented] (SPARK-5162) Python yarn-cluster mode
[ https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292069#comment-14292069 ] Vladimir Grigor commented on SPARK-5162: I second [~jared.holmb...@orchestro.com]. [~lianhuiwang] thank you! I'm going to try your PR.

Related issue: even with this PR, there will be a problem using YARN in cluster mode on Amazon EMR. Normally one submits YARN jobs via the API or the aws command-line utility, so paths to files are evaluated later on some remote host, and hence the files are not found. Currently Spark does not support non-local files. One idea would be to add support for non-local (python) files, e.g. if a file is not local it will be downloaded and made available locally - something similar to the Distributed Cache described at http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-input-distributed-cache.html. So the following code would work:
{code}
aws emr add-steps --cluster-id j-XYWIXMD234 \
 --steps Name=SparkPi,Jar=s3://eu-west-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn-cluster,--py-files,s3://mybucketat.amazonaws.com/tasks/main.py,main.py,param1],ActionOnFailure=CONTINUE
{code}
What do you think? What is your way to run batch python spark scripts on YARN in Amazon?
[jira] [Commented] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14282701#comment-14282701 ] Vladimir Grigor commented on SPARK-5246: please see another pull request for this bug: https://github.com/mesos/spark-ec2/pull/92 - unfortunately, the original fix didn't work that well.

spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
--
Key: SPARK-5246 URL: https://issues.apache.org/jira/browse/SPARK-5246 Project: Spark Issue Type: Bug Components: EC2 Reporter: Vladimir Grigor

How to reproduce:
1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html should be sufficient to set up a VPC for this bug. After you have followed that guide, start a new instance in the VPC and ssh to it (through the NAT server).
2) The user starts a cluster in the VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript
Setting up security groups...
(omitted for brevity)
10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
	at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
	... 12 more
full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
10.1.1.62:	at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
10.1.1.62:	... 12 more
10.1.1.62: full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
[timing] spark-standalone setup: 00h 00m 28s
(omitted for brevity)
{code}
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out:
{code}
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 8080
15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, HUP, INT]
Exception in thread "main" java.net.UnknownHostException: ip-10-1-1-151: ip-10-1-1-151: Name or service not known
	at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
	at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
	at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
	at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
	at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
	at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
	at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
	at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
	at org.apache.spark.deploy.master.MasterArguments.<init>(MasterArguments.scala:27)
	at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
	at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not known
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
	at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
	... 12 more
{code}
The problem is that an instance launched in a VPC may not be able to resolve its own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092. I am going to submit a fix for this problem since I need this functionality asap.
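A common workaround for this class of failure is to make the local hostname resolvable before starting the daemons, e.g. by appending it to /etc/hosts together with the instance's private IP taken from the EC2 instance metadata service. A hedged sketch (the function names are mine; the metadata URL is the standard EC2 endpoint, but whether the actual PR takes this route is not stated here):

```python
import socket
import urllib.request  # scripts of that era would use urllib2


def hosts_entry(ip, hostname):
    """Format an /etc/hosts line mapping the hostname to the given IP."""
    return "%s %s\n" % (ip, hostname)


def fix_local_hostname(hosts_path="/etc/hosts"):
    """Append '<private-ip> <hostname>' to /etc/hosts so that
    InetAddress.getLocalHost() can resolve the local hostname in a VPC."""
    ip = urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/local-ipv4", timeout=2
    ).read().decode()
    with open(hosts_path, "a") as fh:
        fh.write(hosts_entry(ip, socket.gethostname()))
```

Run on the master before `start-master.sh`, this turns the fatal UnknownHostException above into a successful local lookup.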
[jira] [Commented] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278780#comment-14278780 ] Vladimir Grigor commented on SPARK-5246: https://github.com/mesos/spark-ec2/pull/91
[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-5246: --- Description updated: the reproduction steps now begin with setting up a VPC per http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html before launching the cluster.
[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-5246: --- Description: How to reproduce: 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html should be sufficient to set up a VPC for this bug. After following that guide, start a new instance in the VPC and ssh to it (through the NAT server) 2) the user starts a cluster in the VPC: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... (omitted for brevity) 10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop no org.apache.spark.deploy.master.Master to stop starting org.apache.spark.deploy.master.Master, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out failed to launch org.apache.spark.deploy.master.Master: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out 10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker: 10.1.1.62: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) 10.1.1.62: ... 
12 more 10.1.1.62: full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out [timing] spark-standalone setup: 00h 00m 28s (omitted for brevity) {code} /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out {code} Spark assembly has been built with Hive, including Datanucleus jars on classpath Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 8080 15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, HUP, INT] Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: ip-10-1-1-151: Name or service not known at java.net.InetAddress.getLocalHost(InetAddress.java:1473) at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620) at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613) at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.util.Utils$.localHostName(Utils.scala:665) at org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27) at org.apache.spark.deploy.master.Master$.main(Master.scala:819) at org.apache.spark.deploy.master.Master.main(Master.scala) Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not known at 
java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293) at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more {code} The problem is that an instance launched in a VPC may not be able to resolve its own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092. I am going to submit a fix for this problem since I need this functionality ASAP. was: ##How to reproduce: 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html should be sufficient to set up a VPC for this bug. After following that guide, start a new instance in the VPC and ssh to it (through the NAT server) 2) the user starts a cluster in the VPC: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... (omitted for brevity) 10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop no org.apache.spark.deploy.master.Master to stop starting
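For reference, a common workaround for this class of `UnknownHostException` in a VPC (this is a sketch of the usual mitigation, not the fix that was actually submitted for this issue) is to make the instance's own hostname resolvable locally by appending a mapping to /etc/hosts. The metadata-service URL below is the standard EC2 endpoint; the function names and the choice to edit /etc/hosts are assumptions for illustration:

```python
# Sketch of a common workaround (assumed, not the submitted fix): map the
# instance's hostname to its private IP in /etc/hosts so that
# InetAddress.getLocalHost() can resolve it. Must run as root on the instance.
import socket
from urllib.request import urlopen

# Standard EC2 instance-metadata endpoint for the primary private IPv4 address.
METADATA_IP_URL = "http://169.254.169.254/latest/meta-data/local-ipv4"

def hosts_line(ip, hostname):
    """Build the /etc/hosts line that maps the private IP to the hostname."""
    return "%s %s\n" % (ip, hostname)

def ensure_hostname_resolves():
    hostname = socket.gethostname()  # e.g. "ip-10-1-1-151"
    try:
        socket.gethostbyname(hostname)
        return  # already resolvable; nothing to do
    except socket.gaierror:
        pass  # unresolvable, as in the stack trace above
    ip = urlopen(METADATA_IP_URL).read().decode().strip()
    with open("/etc/hosts", "a") as hosts:
        hosts.write(hosts_line(ip, hostname))
```

Running this once at boot (before starting the Master/Worker daemons) would avoid the failure without requiring VPC DNS changes.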
[jira] [Created] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
Vladimir Grigor created SPARK-5246: -- Summary: spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve Key: SPARK-5246 URL: https://issues.apache.org/jira/browse/SPARK-5246 Project: Spark Issue Type: Bug Components: EC2 Reporter: Vladimir Grigor How to reproduce: 1) user starts a cluster in VPC: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... (omitted for brevity) 10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop no org.apache.spark.deploy.master.Master to stop starting org.apache.spark.deploy.master.Master, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out failed to launch org.apache.spark.deploy.master.Master: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out 10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker: 10.1.1.62: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) 10.1.1.62: ... 
12 more 10.1.1.62: full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out [timing] spark-standalone setup: 00h 00m 28s (omitted for brevity) {code} /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out {code} Spark assembly has been built with Hive, including Datanucleus jars on classpath Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 8080 15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, HUP, INT] Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: ip-10-1-1-151: Name or service not known at java.net.InetAddress.getLocalHost(InetAddress.java:1473) at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620) at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613) at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.util.Utils$.localHostName(Utils.scala:665) at org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27) at org.apache.spark.deploy.master.Master$.main(Master.scala:819) at org.apache.spark.deploy.master.Master.main(Master.scala) Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not known at 
java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293) at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more {code} The problem is that an instance launched in a VPC may not be able to resolve its own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092. I am going to submit a fix for this problem since I need this functionality ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve
[ https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Grigor updated SPARK-5246: --- Description: How to reproduce: 1) user starts a cluster in VPC: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... (omitted for brevity) 10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop no org.apache.spark.deploy.master.Master to stop starting org.apache.spark.deploy.master.Master, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out failed to launch org.apache.spark.deploy.master.Master: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out 10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker: 10.1.1.62: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) 10.1.1.62: ... 
12 more 10.1.1.62: full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out [timing] spark-standalone setup: 00h 00m 28s (omitted for brevity) {code} /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out {code} Spark assembly has been built with Hive, including Datanucleus jars on classpath Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 8080 15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, HUP, INT] Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: ip-10-1-1-151: Name or service not known at java.net.InetAddress.getLocalHost(InetAddress.java:1473) at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620) at org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612) at org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613) at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.util.Utils$.localHostName(Utils.scala:665) at org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27) at org.apache.spark.deploy.master.Master$.main(Master.scala:819) at org.apache.spark.deploy.master.Master.main(Master.scala) Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not known at 
java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901) at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293) at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more {code} The problem is that an instance launched in a VPC may not be able to resolve its own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092. I am going to submit a fix for this problem since I need this functionality ASAP. was: How to reproduce: 1) the user starts a cluster in the VPC: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... (omitted for brevity) 10.1.1.62 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop no org.apache.spark.deploy.master.Master to stop starting org.apache.spark.deploy.master.Master, logging to /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out failed to launch org.apache.spark.deploy.master.Master: at java.net.InetAddress.getLocalHost(InetAddress.java:1469) ... 12 more full log in /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to
[jira] [Created] (SPARK-5242) ec2/spark_ec2.py lauch does not work with VPC if no public DNS or IP is available
Vladimir Grigor created SPARK-5242: -- Summary: ec2/spark_ec2.py lauch does not work with VPC if no public DNS or IP is available Key: SPARK-5242 URL: https://issues.apache.org/jira/browse/SPARK-5242 Project: Spark Issue Type: Bug Components: EC2 Reporter: Vladimir Grigor How to reproduce: a user starting a cluster in a VPC waits forever: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... Searching for existing cluster SparkByScript... Spark AMI: ami-1ae0166d Launching instances... Launched 1 slaves in eu-west-1a, regid = r-e70c5502 Launched master in eu-west-1a, regid = r-bf0f565a Waiting for cluster to enter 'ssh-ready' state..{forever} {code} The problem is that the current code wrongly assumes that a VPC instance has a public_dns_name or a public ip_address. More commonly, a VPC instance has only a private_ip_address. The bug is already fixed in my fork; I am going to submit a pull request.
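The fix described above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual patch from the reporter's fork or pull request: the helper name is hypothetical, and the attribute names follow the boto EC2 instance API referenced in the description (public_dns_name, ip_address, private_ip_address):

```python
# Sketch (assumed, not the actual patch): when a VPC instance has no public
# DNS name or public IP, fall back to its private IP instead of waiting
# forever for a public address that will never appear.
def get_instance_address(instance, private_ips=False):
    """Return an address at which the instance can be reached.

    `instance` is expected to expose boto-style attributes:
    public_dns_name, ip_address, private_ip_address. Empty strings and
    None both count as "no address".
    """
    if private_ips:
        # Caller is inside the VPC (e.g. behind the NAT); use the private IP.
        return instance.private_ip_address
    return (instance.public_dns_name
            or instance.ip_address
            or instance.private_ip_address)
```

The ssh-ready polling loop would then use this helper instead of reading public_dns_name directly, so a VPC-only instance still yields a usable address.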
[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py lauch does not work with VPC if no public DNS or IP is available
[ https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14276591#comment-14276591 ] Vladimir Grigor commented on SPARK-5242: This bug is fixed in https://github.com/apache/spark/pull/4038 ec2/spark_ec2.py lauch does not work with VPC if no public DNS or IP is available --- Key: SPARK-5242 URL: https://issues.apache.org/jira/browse/SPARK-5242 Project: Spark Issue Type: Bug Components: EC2 Reporter: Vladimir Grigor Labels: easyfix How to reproduce: a user starting a cluster in a VPC waits forever: {code} ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 --subnet-id=subnet-2571dd4d --zone=eu-west-1a launch SparkByScript Setting up security groups... Searching for existing cluster SparkByScript... Spark AMI: ami-1ae0166d Launching instances... Launched 1 slaves in eu-west-1a, regid = r-e70c5502 Launched master in eu-west-1a, regid = r-bf0f565a Waiting for cluster to enter 'ssh-ready' state..{forever} {code} The problem is that the current code wrongly assumes that a VPC instance has a public_dns_name or a public ip_address. More commonly, a VPC instance has only a private_ip_address. The bug is already fixed in my fork; I am going to submit a pull request.