[jira] [Updated] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-04-03 Thread Vladimir Grigor (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Grigor updated SPARK-925:
--
Description: 
The option list for the ec2 script can be a little irritating to type in, 
especially things like the path to the identity file, region, zone, AMI, etc.
It would be nice if the ec2 script looked for an options.json file in the following 
order: (1) CWD, (2) ~/spark-ec2, (3) the same directory as spark_ec2.py

Something like:
import json
import os
import stat
import sys

def get_defaults_from_options():
  # Check to see if an options.json file exists and, if so, load it.
  # Values from options.json may only override values in opts that are still
  # None or '', i.e. command-line options take precedence.
  defaults = {'aws-access-key-id': '', 'aws-secret-access-key': '',
              'key-pair': '', 'identity-file': '', 'region': 'ap-southeast-1',
              'zone': '', 'ami': '', 'slaves': 1, 'instance-type': 'm1.large'}

  # Look for options.json in the directory the cluster command was called from.
  # The spark_ec2 wrapper script has to export STARTWD, since it changes the cwd.
  startwd = os.environ['STARTWD']
  if os.path.exists(os.path.join(startwd, 'options.json')):
    optionspath = os.path.join(startwd, 'options.json')
  else:
    optionspath = os.path.join(os.getcwd(), 'options.json')

  try:
    print 'Loading options file:', optionspath
    with open(optionspath) as json_data:
      jdata = json.load(json_data)
      for k in jdata:
        defaults[k] = jdata[k]
  except IOError:
    print 'Warning: options.json file not loaded'

  # Check permissions on identity-file if it is defined; otherwise the launch
  # fails late, which is irritating.
  if defaults['identity-file'] != '':
    st = os.stat(defaults['identity-file'])
    user_can_read = bool(st.st_mode & stat.S_IRUSR)
    grp_perms = bool(st.st_mode & stat.S_IRWXG)
    others_perm = bool(st.st_mode & stat.S_IRWXO)
    if not user_can_read:
      print 'No read permission for', defaults['identity-file']
      sys.exit(1)
    if grp_perms or others_perm:
      print 'Permissions are too open, please chmod 600', defaults['identity-file']
      sys.exit(1)

  # If defaults contain the AWS access key id or secret access key, export them
  # to the environment; they are required for boto to access the AWS API.
  if defaults['aws-access-key-id'] != '':
    os.environ['AWS_ACCESS_KEY_ID'] = defaults['aws-access-key-id']
  if defaults['aws-secret-access-key'] != '':
    os.environ['AWS_SECRET_ACCESS_KEY'] = defaults['aws-secret-access-key']

  return defaults
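
For illustration, a minimal options.json could look like the snippet below. The 
keys mirror the defaults dict above; the values are only example values borrowed 
from the launch commands elsewhere in this thread.

{code}
{
  "key-pair": "key20141114",
  "identity-file": "/home/user/aws/key.pem",
  "region": "eu-west-1",
  "zone": "eu-west-1a",
  "slaves": 1,
  "instance-type": "m1.large"
}
{code}

And a sketch (not actual spark_ec2.py code) of the merge step implied by the 
comments above: a value from options.json is applied only when the corresponding 
command-line option was left unset, so command-line flags always take precedence.

{code}
def apply_defaults(opts, defaults):
  # Hypothetical helper: fill optparse values from the defaults dict,
  # never overwriting anything the user passed explicitly.
  for key, value in defaults.items():
    attr = key.replace('-', '_')
    if getattr(opts, attr, None) in (None, ''):
      setattr(opts, attr, value)
  return opts
{code}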





 Allow ec2 scripts to load default options from a json file
 

[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-04-03 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395562#comment-14395562
 ] 

Vladimir Grigor commented on SPARK-5242:


updated PR from upstream/master

 ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is 
 available
 ---

 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor
  Labels: easyfix

 How to reproduce: user starting cluster in VPC needs to wait forever:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 Searching for existing cluster SparkByScript...
 Spark AMI: ami-1ae0166d
 Launching instances...
 Launched 1 slaves in eu-west-1a, regid = r-e70c5502
 Launched master in eu-west-1a, regid = r-bf0f565a
 Waiting for cluster to enter 'ssh-ready' state..{forever}
 {code}
 The problem is that the current code wrongly assumes that a VPC instance has a 
 public_dns_name or public ip_address. In practice it is more common for a VPC 
 instance to have only a private_ip_address.
 The bug is already fixed in my fork; I am going to submit a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5479) PySpark on yarn mode need to support non-local python files

2015-01-30 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14298400#comment-14298400
 ] 

Vladimir Grigor commented on SPARK-5479:


https://github.com/apache/spark/pull/3976 potentially closes this issue

 PySpark on yarn mode need to support non-local python files
 ---

 Key: SPARK-5479
 URL: https://issues.apache.org/jira/browse/SPARK-5479
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Reporter: Lianhui Wang

  In SPARK-5162 [~vgrigor] reports this:
 The following code currently does not work:
 aws emr add-steps --cluster-id j-XYWIXMD234 \
 --steps 
 Name=SparkPi,Jar=s3://eu-west-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn-cluster,--py-files,s3://mybucketat.amazonaws.com/tasks/main.py,main.py,param1],ActionOnFailure=CONTINUE
 So we need to support non-local python files in yarn client and cluster mode.
 Before submitting an application to YARN, we need to download non-local files to 
 a local or HDFS path, or spark.yarn.dist.files needs to support other non-local 
 files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5162) Python yarn-cluster mode

2015-01-28 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14296430#comment-14296430
 ] 

Vladimir Grigor commented on SPARK-5162:


[~lianhuiwang] Thank you for the workaround suggestion! Still, I believe it 
would be great to have support for remote script files - that would improve the 
usability of the YARN component in Spark a lot. If you think that is the case, and 
you know the technical details of the system better, could you please create a 
ticket for that feature? Or please comment with any ideas about the technical 
implementation. Thank you!

 Python yarn-cluster mode
 

 Key: SPARK-5162
 URL: https://issues.apache.org/jira/browse/SPARK-5162
 Project: Spark
  Issue Type: New Feature
  Components: PySpark, YARN
Reporter: Dana Klassen
  Labels: cluster, python, yarn

 Running pyspark in yarn is currently limited to ‘yarn-client’ mode. It would 
 be great to be able to submit python applications to the cluster and (just 
 like java classes) have the resource manager setup an AM on any node in the 
 cluster. Does anyone know the issues blocking this feature? I was snooping 
 around with enabling python apps:
 Removing the logic stopping python and yarn-cluster from sparkSubmit.scala
 ...
 // The following modes are not supported or applicable
 (clusterManager, deployMode) match {
   ...
   case (_, CLUSTER) if args.isPython =>
 printErrorAndExit("Cluster deploy mode is currently not supported for 
 python applications.")
   ...
 }
 …
 and submitting application via:
 HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster 
 --num-executors 2  --py-files {{insert location of egg here}} 
 --executor-cores 1  ../tools/canary.py
 Everything looks to run alright, pythonRunner is picked up as main class, 
 resources get setup, yarn client gets launched but falls flat on its face:
 2015-01-08 18:48:03,444 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  DEBUG: FAILED { 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py, 
 1420742868009, FILE, null }, Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py changed 
 on src filesystem (expected 1420742868009, was 1420742869284
 and
 2015-01-08 18:48:03,446 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
  Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py(-/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py)
  transitioned from DOWNLOADING to FAILED
 Tracked this down to the apache hadoop code(FSDownload.java line 249) related 
 to container localization of files upon downloading. At this point thought it 
 would be best to raise the issue here and get input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5162) Python yarn-cluster mode

2015-01-27 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14293397#comment-14293397
 ] 

Vladimir Grigor commented on SPARK-5162:


[~lianhuiwang] Thank you for the reply :) yarn-client mode does not support remote 
scripts either. It'd be very good to have this feature. Please let me know if I 
can help in any way.

 Python yarn-cluster mode
 

 Key: SPARK-5162
 URL: https://issues.apache.org/jira/browse/SPARK-5162
 Project: Spark
  Issue Type: New Feature
  Components: PySpark, YARN
Reporter: Dana Klassen
  Labels: cluster, python, yarn

 Running pyspark in yarn is currently limited to ‘yarn-client’ mode. It would 
 be great to be able to submit python applications to the cluster and (just 
 like java classes) have the resource manager setup an AM on any node in the 
 cluster. Does anyone know the issues blocking this feature? I was snooping 
 around with enabling python apps:
 Removing the logic stopping python and yarn-cluster from sparkSubmit.scala
 ...
 // The following modes are not supported or applicable
 (clusterManager, deployMode) match {
   ...
   case (_, CLUSTER) if args.isPython =>
 printErrorAndExit("Cluster deploy mode is currently not supported for 
 python applications.")
   ...
 }
 …
 and submitting application via:
 HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster 
 --num-executors 2  --py-files {{insert location of egg here}} 
 --executor-cores 1  ../tools/canary.py
 Everything looks to run alright, pythonRunner is picked up as main class, 
 resources get setup, yarn client gets launched but falls flat on its face:
 2015-01-08 18:48:03,444 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  DEBUG: FAILED { 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py, 
 1420742868009, FILE, null }, Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py changed 
 on src filesystem (expected 1420742868009, was 1420742869284
 and
 2015-01-08 18:48:03,446 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
  Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py(-/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py)
  transitioned from DOWNLOADING to FAILED
 Tracked this down to the apache hadoop code(FSDownload.java line 249) related 
 to container localization of files upon downloading. At this point thought it 
 would be best to raise the issue here and get input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-595) Document local-cluster mode

2015-01-26 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292052#comment-14292052
 ] 

Vladimir Grigor commented on SPARK-595:
---

+1 for reopen

 Document local-cluster mode
 -

 Key: SPARK-595
 URL: https://issues.apache.org/jira/browse/SPARK-595
 Project: Spark
  Issue Type: New Feature
  Components: Documentation
Affects Versions: 0.6.0
Reporter: Josh Rosen
Priority: Minor

 The 'Spark Standalone Mode' guide describes how to manually launch a 
 standalone cluster, which can be done locally for testing, but it does not 
 mention SparkContext's `local-cluster` option.
 What are the differences between these approaches?  Which one should I prefer 
 for local testing?  Can I still use the standalone web interface if I use 
 'local-cluster' mode?
 It would be useful to document this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5162) Python yarn-cluster mode

2015-01-26 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292069#comment-14292069
 ] 

Vladimir Grigor commented on SPARK-5162:


I second [~jared.holmb...@orchestro.com]
[~lianhuiwang] thank you! I'm going to try your PR.

Related issue:
Even with this PR, there will be a problem using YARN in cluster mode on Amazon 
EMR.

Normally one submits YARN jobs via the API or the aws command-line utility, so 
paths to files are evaluated later on some remote host, and hence the files are 
not found. Currently Spark does not support non-local files. One idea would be 
to add support for non-local (python) files, e.g. if a file is not local it 
would be downloaded and made available locally - something similar to the 
Distributed Cache described at 
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-input-distributed-cache.html

So following code would work:
{code}
aws emr add-steps --cluster-id j-XYWIXMD234 \
--steps 
Name=SparkPi,Jar=s3://eu-west-1.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[/home/hadoop/spark/bin/spark-submit,--deploy-mode,cluster,--master,yarn-cluster,--py-files,s3://mybucketat.amazonaws.com/tasks/main.py,main.py,param1],ActionOnFailure=CONTINUE
{code}

What do you think? What is your way to run batch python spark scripts on Yarn 
in Amazon?
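
For illustration only, the localization step described above might look roughly 
like the sketch below (it uses boto, which spark-ec2 already depends on; the 
helper name and paths are hypothetical, and this is not what any existing PR 
implements):

{code}
import os
import boto

def localize_s3_file(s3_path, local_dir='/tmp'):
    # Expects something like s3://mybucket/tasks/main.py and returns a local
    # path that can then be handed to spark-submit / --py-files.
    bucket_name, key_name = s3_path[len('s3://'):].split('/', 1)
    local_path = os.path.join(local_dir, os.path.basename(key_name))
    conn = boto.connect_s3()
    key = conn.get_bucket(bucket_name).get_key(key_name)
    key.get_contents_to_filename(local_path)
    return local_path
{code}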

 Python yarn-cluster mode
 

 Key: SPARK-5162
 URL: https://issues.apache.org/jira/browse/SPARK-5162
 Project: Spark
  Issue Type: New Feature
  Components: PySpark, YARN
Reporter: Dana Klassen
  Labels: cluster, python, yarn

 Running pyspark in yarn is currently limited to ‘yarn-client’ mode. It would 
 be great to be able to submit python applications to the cluster and (just 
 like java classes) have the resource manager setup an AM on any node in the 
 cluster. Does anyone know the issues blocking this feature? I was snooping 
 around with enabling python apps:
 Removing the logic stopping python and yarn-cluster from sparkSubmit.scala
 ...
 // The following modes are not supported or applicable
 (clusterManager, deployMode) match {
   ...
   case (_, CLUSTER) if args.isPython =>
 printErrorAndExit("Cluster deploy mode is currently not supported for 
 python applications.")
   ...
 }
 …
 and submitting application via:
 HADOOP_CONF_DIR={{insert conf dir}} ./bin/spark-submit --master yarn-cluster 
 --num-executors 2  --py-files {{insert location of egg here}} 
 --executor-cores 1  ../tools/canary.py
 Everything looks to run alright, pythonRunner is picked up as main class, 
 resources get setup, yarn client gets launched but falls flat on its face:
 2015-01-08 18:48:03,444 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService:
  DEBUG: FAILED { 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py, 
 1420742868009, FILE, null }, Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py changed 
 on src filesystem (expected 1420742868009, was 1420742869284
 and
 2015-01-08 18:48:03,446 INFO 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource:
  Resource 
 {{redacted}}/.sparkStaging/application_1420594669313_4687/canary.py(-/data/4/yarn/nm/usercache/klassen/filecache/11/canary.py)
  transitioned from DOWNLOADING to FAILED
 Tracked this down to the apache hadoop code(FSDownload.java line 249) related 
 to container localization of files upon downloading. At this point thought it 
 would be best to raise the issue here and get input.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-19 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14282701#comment-14282701
 ] 

Vladimir Grigor commented on SPARK-5246:


Please see another pull request for this bug: 
https://github.com/mesos/spark-ec2/pull/92

Unfortunately, the original fix didn't work that well.
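
For context, a commonly suggested workaround for the UnknownHostException shown 
below is to map the instance's hostname to its private IP before starting the 
daemons. The sketch below is illustrative only and is not what the pull request 
above does; it assumes the standard EC2 instance metadata endpoint:

{code}
import socket
import urllib2

def map_hostname_to_private_ip():
    # Fetch the instance's private IP from the EC2 metadata service and add a
    # hosts entry so InetAddress.getLocalHost() can resolve the hostname.
    ip = urllib2.urlopen(
        'http://169.254.169.254/latest/meta-data/local-ipv4').read().strip()
    with open('/etc/hosts', 'a') as hosts:
        hosts.write('%s %s\n' % (ip, socket.gethostname()))
{code}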

 spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does 
 not resolve
 --

 Key: SPARK-5246
 URL: https://issues.apache.org/jira/browse/SPARK-5246
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor

 How to reproduce: 
 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html 
 should be sufficient to set up a VPC for this bug. After you have followed that 
 guide, start a new instance in the VPC and ssh to it (through the NAT server).
 2) user starts a cluster in VPC:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 
 (omitted for brevity)
 10.1.1.62
 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
 no org.apache.spark.deploy.master.Master to stop
 starting org.apache.spark.deploy.master.Master, logging to 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 failed to launch org.apache.spark.deploy.master.Master:
   at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
   ... 12 more
 full log in 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
 10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
 10.1.1.62:at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
 10.1.1.62:... 12 more
 10.1.1.62: full log in 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
 [timing] spark-standalone setup:  00h 00m 28s
  
 (omitted for brevity)
 {code}
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 {code}
 Spark assembly has been built with Hive, including Datanucleus jars on 
 classpath
 Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
 :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
  -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
 org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 
 --webui-port 8080
 
 15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
 HUP, INT]
 Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
 ip-10-1-1-151: Name or service not known
 at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
 at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
 at 
 org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
 at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
 at 
 org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
 at 
 org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
 at 
 org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
 at 
 org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
 at scala.Option.getOrElse(Option.scala:120)
 at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
 at 
 org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
 at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
 at org.apache.spark.deploy.master.Master.main(Master.scala)
 Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
 known
 at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
 at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
 at 
 java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
 at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
 ... 12 more
 {code}
 The problem is that an instance launched in a VPC may not be able to resolve 
 its own local hostname. Please see 
 https://forums.aws.amazon.com/thread.jspa?threadID=92092.
 I am going to submit a fix for this problem since I need this functionality 
 asap.



--
This message was sent by 

[jira] [Commented] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278780#comment-14278780
 ] 

Vladimir Grigor commented on SPARK-5246:


https://github.com/mesos/spark-ec2/pull/91

 spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does 
 not resolve
 --

 Key: SPARK-5246
 URL: https://issues.apache.org/jira/browse/SPARK-5246
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor

 How to reproduce: 
 1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html 
 should be sufficient to set up a VPC for this bug. After you have followed that 
 guide, start a new instance in the VPC and ssh to it (through the NAT server).
 2) user starts a cluster in VPC:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 
 (omitted for brevity)
 10.1.1.62
 10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
 no org.apache.spark.deploy.master.Master to stop
 starting org.apache.spark.deploy.master.Master, logging to 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 failed to launch org.apache.spark.deploy.master.Master:
   at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
   ... 12 more
 full log in 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
 10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
 10.1.1.62:at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
 10.1.1.62:... 12 more
 10.1.1.62: full log in 
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
 [timing] spark-standalone setup:  00h 00m 28s
  
 (omitted for brevity)
 {code}
 /root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
 {code}
 Spark assembly has been built with Hive, including Datanucleus jars on 
 classpath
 Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
 :::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
  -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
 org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 
 --webui-port 8080
 
 15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
 HUP, INT]
 Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
 ip-10-1-1-151: Name or service not known
 at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
 at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
 at 
 org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
 at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
 at 
 org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
 at 
 org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
 at 
 org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
 at 
 org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
 at scala.Option.getOrElse(Option.scala:120)
 at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
 at 
 org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
 at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
 at org.apache.spark.deploy.master.Master.main(Master.scala)
 Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
 known
 at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
 at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
 at 
 java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
 at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
 ... 12 more
 {code}
 The problem is that an instance launched in a VPC may not be able to resolve 
 its own local hostname. Please see 
 https://forums.aws.amazon.com/thread.jspa?threadID=92092.
 I am going to submit a fix for this problem since I need this functionality 
 asap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To 

[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Grigor updated SPARK-5246:
---
Description: 
##How to reproduce: 
1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html 
should be sufficient to set up a VPC for this bug. After you have followed that 
guide, start a new instance in the VPC and ssh to it (through the NAT server).

2) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
10.1.1.62:  at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
10.1.1.62:  ... 12 more
10.1.1.62: full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
[timing] spark-standalone setup:  00h 00m 28s
 
(omitted for brevity)
{code}

/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
{code}
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
:::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
 -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 
8080


15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
HUP, INT]
Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
ip-10-1-1-151: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at 
org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at 
org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at 
org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
{code}

The problem is that an instance launched in a VPC may not be able to resolve its 
own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092.

I am going to submit a fix for this problem since I need this functionality 
asap.


## How to reproduce

  was:
How to reproduce: 
1) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at 

[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-15 Thread Vladimir Grigor (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Grigor updated SPARK-5246:
---
Description: 
How to reproduce: 

1) http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html 
should be sufficient to set up a VPC for this bug. After you have followed that 
guide, start a new instance in the VPC and ssh to it (through the NAT server).

2) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
10.1.1.62:  at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
10.1.1.62:  ... 12 more
10.1.1.62: full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
[timing] spark-standalone setup:  00h 00m 28s
 
(omitted for brevity)
{code}

/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
{code}
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
:::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
 -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 
8080


15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
HUP, INT]
Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
ip-10-1-1-151: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at 
org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at 
org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at 
org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
{code}

The problem is that an instance launched in a VPC may not be able to resolve its 
own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092.

I am going to submit a fix for this problem since I need this functionality 
asap.


  was:
##How to reproduce: 
1)  http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Scenario2.html 
should be sufficient to setup VPC for this bug. After you followed that guide, 
start new instance in VPC, ssh to it (though NAT server)

2) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting 

[jira] [Created] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-14 Thread Vladimir Grigor (JIRA)
Vladimir Grigor created SPARK-5246:
--

 Summary: spark/spark-ec2.py cannot start Spark master in VPC if 
local DNS name does not resolve
 Key: SPARK-5246
 URL: https://issues.apache.org/jira/browse/SPARK-5246
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor


How to reproduce: 
1) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
10.1.1.62:  at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
10.1.1.62:  ... 12 more
10.1.1.62: full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
[timing] spark-standalone setup:  00h 00m 28s
 
(omitted for brevity)
{code}

/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
{code}
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
:::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
 -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 
8080


15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
HUP, INT]
Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
ip-10-1-1-151: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at 
org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at 
org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at 
org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
{code}

The problem is that an instance launched in a VPC may not be able to resolve its 
own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092.

I am going to submit a fix for this problem since I need this functionality 
asap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-5246) spark/spark-ec2.py cannot start Spark master in VPC if local DNS name does not resolve

2015-01-14 Thread Vladimir Grigor (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Grigor updated SPARK-5246:
---
Description: 
How to reproduce: 
1) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
10.1.1.62: failed to launch org.apache.spark.deploy.worker.Worker:
10.1.1.62:  at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
10.1.1.62:  ... 12 more
10.1.1.62: full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.worker.Worker-1-ip-10-1-1-62.out
[timing] spark-standalone setup:  00h 00m 28s
 
(omitted for brevity)
{code}

/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
{code}
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: /usr/lib/jvm/java-1.7.0/bin/java -cp 
:::/root/ephemeral-hdfs/conf:/root/spark/sbin/../conf:/root/spark/lib/spark-assembly-1.2.0-hadoop1.0.4.jar:/root/spark/lib/datanucleus-api-jdo-3.2.6.jar:/root/spark/lib/datanucleus-rdbms-3.2.9.jar:/root/spark/lib/datanucleus-core-3.2.10.jar
 -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m 
org.apache.spark.deploy.master.Master --ip 10.1.1.151 --port 7077 --webui-port 
8080


15/01/14 07:34:47 INFO master.Master: Registered signal handlers for [TERM, 
HUP, INT]
Exception in thread main java.net.UnknownHostException: ip-10-1-1-151: 
ip-10-1-1-151: Name or service not known
at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
at org.apache.spark.util.Utils$.findLocalIpAddress(Utils.scala:620)
at 
org.apache.spark.util.Utils$.localIpAddress$lzycompute(Utils.scala:612)
at org.apache.spark.util.Utils$.localIpAddress(Utils.scala:612)
at 
org.apache.spark.util.Utils$.localIpAddressHostname$lzycompute(Utils.scala:613)
at org.apache.spark.util.Utils$.localIpAddressHostname(Utils.scala:613)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at 
org.apache.spark.util.Utils$$anonfun$localHostName$1.apply(Utils.scala:665)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.util.Utils$.localHostName(Utils.scala:665)
at 
org.apache.spark.deploy.master.MasterArguments.init(MasterArguments.scala:27)
at org.apache.spark.deploy.master.Master$.main(Master.scala:819)
at org.apache.spark.deploy.master.Master.main(Master.scala)
Caused by: java.net.UnknownHostException: ip-10-1-1-151: Name or service not 
known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at 
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
{code}

The problem is that an instance launched in a VPC may not be able to resolve its 
own local hostname. Please see https://forums.aws.amazon.com/thread.jspa?threadID=92092.

I am going to submit a fix for this problem since I need this functionality 
asap.

  was:
How to reproduce: 
1) user starts a cluster in VPC:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...

(omitted for brevity)
10.1.1.62
10.1.1.62: no org.apache.spark.deploy.worker.Worker to stop
no org.apache.spark.deploy.master.Master to stop
starting org.apache.spark.deploy.master.Master, logging to 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
failed to launch org.apache.spark.deploy.master.Master:
at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
... 12 more
full log in 
/root/spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-.out
10.1.1.62: starting org.apache.spark.deploy.worker.Worker, logging to 

[jira] [Created] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-01-13 Thread Vladimir Grigor (JIRA)
Vladimir Grigor created SPARK-5242:
--

 Summary: ec2/spark_ec2.py launch does not work with VPC if no 
public DNS or IP is available
 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor


How to reproduce: user starting cluster in VPC needs to wait forever:
{code}
./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
--spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
--subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
Setting up security groups...
Searching for existing cluster SparkByScript...
Spark AMI: ami-1ae0166d
Launching instances...
Launched 1 slaves in eu-west-1a, regid = r-e70c5502
Launched master in eu-west-1a, regid = r-bf0f565a
Waiting for cluster to enter 'ssh-ready' state..{forever}
{code}

The problem is that the current code wrongly assumes that a VPC instance has a 
public_dns_name or public ip_address. In practice it is more common for a VPC 
instance to have only a private_ip_address.


The bug is already fixed in my fork; I am going to submit a pull request.
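
For illustration, the kind of fallback described above could look roughly like 
the hypothetical helper below (boto EC2 instance objects expose public_dns_name, 
ip_address and private_ip_address; the function name and flag are made up and 
this is not the actual patch):

{code}
def get_reachable_address(instance, use_private_ips=False):
    # Prefer the public DNS name, then the public IP; inside a VPC these are
    # often empty, so fall back to the private IP when allowed.
    address = instance.public_dns_name or instance.ip_address
    if use_private_ips or not address:
        address = instance.private_ip_address
    return address
{code}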



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-5242) ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is available

2015-01-13 Thread Vladimir Grigor (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-5242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276591#comment-14276591
 ] 

Vladimir Grigor commented on SPARK-5242:


This bug is fixed in https://github.com/apache/spark/pull/4038

 ec2/spark_ec2.py launch does not work with VPC if no public DNS or IP is 
 available
 ---

 Key: SPARK-5242
 URL: https://issues.apache.org/jira/browse/SPARK-5242
 Project: Spark
  Issue Type: Bug
  Components: EC2
Reporter: Vladimir Grigor
  Labels: easyfix

 How to reproduce: user starting cluster in VPC needs to wait forever:
 {code}
 ./spark-ec2 -k key20141114 -i ~/aws/key.pem -s 1 --region=eu-west-1 
 --spark-version=1.2.0 --instance-type=m1.large --vpc-id=vpc-2e71dd46 
 --subnet-id=subnet-2571dd4d --zone=eu-west-1a  launch SparkByScript
 Setting up security groups...
 Searching for existing cluster SparkByScript...
 Spark AMI: ami-1ae0166d
 Launching instances...
 Launched 1 slaves in eu-west-1a, regid = r-e70c5502
 Launched master in eu-west-1a, regid = r-bf0f565a
 Waiting for cluster to enter 'ssh-ready' state..{forever}
 {code}
 The problem is that the current code wrongly assumes that a VPC instance has a 
 public_dns_name or public ip_address. In practice it is more common for a VPC 
 instance to have only a private_ip_address.
 The bug is already fixed in my fork; I am going to submit a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org