Troubles with the Spark-EC2 stuff

Guillaume Pitel Sat, 04 Jan 2014 12:56:03 -0800

Hi,

I'm taking my first steps on EC2 (using the 0.8.1 binary for CDH4) and have run into some problems. The first is that once the cluster is created, the spark-ec2 script cannot find it again for login, destroy, and so on. Not a big deal, since I can do that manually, but it's annoying.

The second problem is not really related to Spark but to HDFS/MapReduce. I want to run a hadoop distcp from S3 to the local ephemeral HDFS. The distcp fails because there is no MapReduce running.

Questions:

- Does anyone have advice about a better way to copy from S3 to HDFS, or a way to make distcp work?
- Any idea why the spark-ec2 script cannot find the clusters again?

Thanks in advance for any experience and advice!

Guillaume

Guillaume PITEL, Président
+33(0)6 25 48 86 80 / +33(0)9 70 44 67 53

eXenSa S.A.S.
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05
