Repository: spark
Updated Branches:
  refs/heads/master db160676c -> 44d3a6a75
[SPARK-3342] Add SSDs to block device mapping

On `m3.2xlarge` instances the 2x80GB SSDs are inaccessible if not added
to the block device mapping when the instance is created. They work when
added with this patch.

I have not tested this with other instance types, and I do not know much
about this script and EC2 deployment in general. Maybe this code needs to
depend on the instance type.

The requirement for this mapping is described in the AWS docs at:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#InstanceStore_UsageScenarios

"For M3 instances, you must specify instance store volumes in the block
device mapping for the instance. When you launch an M3 instance, we ignore
any instance store volumes specified in the block device mapping for the
AMI."

Author: Daniel Darabos <darabos.dan...@gmail.com>

Closes #2081 from darabos/patch-1 and squashes the following commits:

1ceb2c8 [Daniel Darabos] Use %d string interpolation instead of {}.
a1854d7 [Daniel Darabos] Only specify ephemeral device mapping for M3.
e0d9e37 [Daniel Darabos] Create ephemeral device mapping based on get_num_disks().
6b116a6 [Daniel Darabos] Add SSDs to block device mapping

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/44d3a6a7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/44d3a6a7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/44d3a6a7

Branch: refs/heads/master
Commit: 44d3a6a75209370b1648e2dfedaa4c895923dae5
Parents: db16067
Author: Daniel Darabos <darabos.dan...@gmail.com>
Authored: Mon Sep 1 22:14:28 2014 -0700
Committer: Matei Zaharia <ma...@databricks.com>
Committed: Mon Sep 1 22:16:42 2014 -0700

----------------------------------------------------------------------
 ec2/spark_ec2.py | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/44d3a6a7/ec2/spark_ec2.py
----------------------------------------------------------------------
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index ae4c488..7e25df5 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -26,6 +26,7 @@ import os
 import pipes
 import random
 import shutil
+import string
 import subprocess
 import sys
 import tempfile
@@ -34,7 +35,7 @@ import urllib2
 from optparse import OptionParser
 from sys import stderr
 import boto
-from boto.ec2.blockdevicemapping import BlockDeviceMapping, EBSBlockDeviceType
+from boto.ec2.blockdevicemapping import BlockDeviceMapping, BlockDeviceType, EBSBlockDeviceType
 from boto import ec2

 # A URL prefix from which to fetch AMI information
@@ -355,6 +356,15 @@ def launch_cluster(conn, opts, cluster_name):
             device.delete_on_termination = True
             block_map["/dev/sdv"] = device

+    # AWS ignores the AMI-specified block device mapping for M3 (see SPARK-3342).
+    if opts.instance_type.startswith('m3.'):
+        for i in range(get_num_disks(opts.instance_type)):
+            dev = BlockDeviceType()
+            dev.ephemeral_name = 'ephemeral%d' % i
+            # The first ephemeral drive is /dev/sdb.
+            name = '/dev/sd' + string.letters[i + 1]
+            block_map[name] = dev
+
     # Launch slaves
     if opts.spot_price is not None:
         # Launch spot instances with the requested price
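For illustration, the device-naming scheme the patch uses can be sketched as a standalone function. This is not part of the patch: `ephemeral_device_names` is a hypothetical helper, and it uses `string.ascii_lowercase` (the Python 3 equivalent of the Python 2 `string.letters` lookup in the diff) so it runs on modern Python without boto.

```python
import string

def ephemeral_device_names(num_disks):
    """Return the /dev/sdX names assigned to instance store volumes.

    The first ephemeral drive is /dev/sdb, the second /dev/sdc, and so on,
    mirroring the 'string.letters[i + 1]' indexing in the patch.
    """
    return ['/dev/sd' + string.ascii_lowercase[i + 1] for i in range(num_disks)]

# An m3.2xlarge has two 80GB SSDs, so get_num_disks() would return 2 here.
print(ephemeral_device_names(2))  # -> ['/dev/sdb', '/dev/sdc']
```

In the actual patch each of these names is mapped to a `BlockDeviceType` whose `ephemeral_name` is `ephemeral0`, `ephemeral1`, etc., which is what makes AWS attach the instance store volumes on M3 instances.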