DistCp requires an MR1 (or MR2) cluster to run. Do you have a MapReduce cluster running alongside your HDFS? Also, judging from the error message, it seems that you didn't specify your jobtracker address.
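The error message names two properties, mapreduce.framework.name and mapreduce.jobtracker.address. If a JobTracker is in fact running, one way to try this is to pass both explicitly as generic -D options on the distcp command line. Below is a minimal sketch that only assembles such a command. The helper name, the bucket, the paths and the host are hypothetical, and "classic" is my assumption for the framework value that selects an MR1 JobTracker (an MR2 cluster would need "yarn" and the ResourceManager settings instead).

```python
import shlex


def build_distcp_command(src, dst, jobtracker, framework="classic"):
    """Return the argv list for a distcp run against an explicit MapReduce cluster.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "hadoop", "distcp",
        # Generic options must come before the source and destination paths.
        "-D", f"mapreduce.framework.name={framework}",   # assumed value for MR1
        "-D", f"mapreduce.jobtracker.address={jobtracker}",
        src, dst,
    ]


if __name__ == "__main__":
    # Placeholder bucket, path and host; substitute your own.
    cmd = build_distcp_command(
        "s3n://my-bucket/logs/", "hdfs:///logs/", "jobtracker-host:9001"
    )
    print(shlex.join(cmd))
```

Printing the command instead of executing it lets you check the options before running it on the cluster. This still needs a live JobTracker at that address to succeed.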
--
Ye Xianjin

On Sunday, September 7, 2014 at 9:42 PM, Tomer Benyamini wrote:

> Hi,
>
> I would like to copy log files from s3 to the cluster's
> ephemeral-hdfs. I tried to use distcp, but I guess mapred is not
> running on the cluster - I'm getting the exception below.
>
> Is there a way to activate it, or is there a spark alternative to distcp?
>
> Thanks,
> Tomer
>
> mapreduce.Cluster (Cluster.java:initialize(114)) - Failed to use
> org.apache.hadoop.mapred.LocalClientProtocolProvider due to error:
> Invalid "mapreduce.jobtracker.address" configuration value for
> LocalJobRunner : "XXX:9001"
>
> ERROR tools.DistCp (DistCp.java:run(126)) - Exception encountered
>
> java.io.IOException: Cannot initialize Cluster. Please check your
> configuration for mapreduce.framework.name
> and the correspond server addresses.
>     at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
>     at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
>     at org.apache.hadoop.tools.DistCp.createMetaFolderPath(DistCp.java:352)
>     at org.apache.hadoop.tools.DistCp.execute(DistCp.java:146)
>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:374)