Hi,

I want to run a remote MapReduce job, so I have created a job programmatically [1], but when I submit it remotely I get the error in [2].

At first I thought it was a security issue, because the client username is |xeon| and the remote username is |ubuntu|, but I noticed that the |temp| dirs were in fact created under the client username [3].
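(I am not even sure |hadoop.job.ugi| is still honored in Hadoop 2.x. If the username did matter, I assume I would have to submit under the remote user explicitly; below is a rough sketch of what I would try with |UserGroupInformation|, where the hard-coded |"ubuntu"| user and the surrounding job setup are my assumptions, not tested code.)

|import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.security.UserGroupInformation;

// Sketch: run the whole submission as the remote user "ubuntu", so that all
// HDFS/YARN RPCs carry that identity (assumes simple auth, no Kerberos).
UserGroupInformation ugi = UserGroupInformation.createRemoteUser("ubuntu");
ugi.doAs(new PrivilegedExceptionAction<Void>() {
    public Void run() throws Exception {
        // create/configure the Job and call job.waitForCompletion(true) here
        return null;
    }
});
|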

Now I really don't know why I get this error. Can anyone help? Is there a good tutorial that explains how to submit a remote job, and how to configure YARN/MapReduce to accept remote jobs?

[1]

|Configuration conf = job.getConfiguration();

        // ResourceManager address, as defined in yarn-site.xml
        conf.set("yarn.resourcemanager.address", host + ":" + Util.yarn_resourcemanager_address_port);

        // the framework is "yarn", as defined in mapred-site.xml
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("hadoop.job.ugi", "ubuntu");
        // note: no trailing space in the property name, no duplicate entries
        conf.set("yarn.application.classpath",
                "/home/ubuntu/Programs/hadoop-2.6.0/etc/hadoop," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/common/lib/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/common/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/hdfs," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/hdfs/lib/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/hdfs/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/yarn/lib/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/yarn/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/mapreduce/lib/*," +
                "/home/ubuntu/Programs/hadoop-2.6.0/share/hadoop/mapreduce/*," +
                "/contrib/capacity-scheduler/*.jar," +
                "/home/ubuntu/Programs/hadoop-2.6.0/*");

        // NameNode address, as defined in core-site.xml
        conf.set("fs.defaultFS", "hdfs://" + host + ":" + Util.fs_defaultFS);

        for (Path inputPath : inputPaths) {
            try {
                FileInputFormat.addInputPath(job, inputPath);
            } catch (IllegalArgumentException | IOException e) {
                e.printStackTrace();
            }
        }

        FileOutputFormat.setOutputPath(job, outputpath);

        try {
            job.waitForCompletion(true);
        } catch (ClassNotFoundException | IOException | InterruptedException e) {
            e.printStackTrace();
        }
|
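One thing I already noticed myself: the log in [2] warns "No job jar file set. User classes may not be found." I assume that for a remote submission the job jar has to be set explicitly, something like the following (|MyJob| is a placeholder for my driver class):

|// Presumably needed so the jar with my classes is shipped to the cluster;
// MyJob stands in for the actual driver class.
job.setJarByClass(MyJob.class);
// or, pointing at the jar directly:
// job.setJar("/path/to/my-job.jar");
|

But even with that, I don't think it explains the replication error below.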

[2]

|Configuration: core-default.xml, core-site.xml
-hosts
===> ec2-52-25-10-73
2015-05-14 18:42:36,277 WARN  [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
XXXX: /input1
org.apache.hadoop.mapred.examples.MyHashPartitioner
---> Job 0: /input1, : temp-1431625359972
2015-05-14 18:42:49,391 INFO  [pool-1-thread-1] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at ec2-52-25-10-73/12.35.40.33:8040
2015-05-14 18:42:52,878 WARN  [pool-1-thread-1] mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(261)) - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2015-05-14 18:42:53,680 INFO  [pool-1-thread-1] input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2015-05-14 18:43:54,717 INFO  [Thread-5] hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1471)) - Exception in createBlockOutputStream
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/172.31.17.45:50010]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
    at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1610)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1408)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
2015-05-14 18:43:54,722 INFO  [Thread-5] hdfs.DFSClient (DFSOutputStream.java:nextBlockOutputStream(1364)) - Abandoning BP-2006008085-172.31.17.45-1431620173976:blk_1073741829_1005
2015-05-14 18:43:54,934 INFO  [Thread-5] hdfs.DFSClient (DFSOutputStream.java:nextBlockOutputStream(1368)) - Excluding datanode 172.31.17.45:50010
2015-05-14 18:43:55,153 WARN  [Thread-5] hdfs.DFSClient (DFSOutputStream.java:run(691)) - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/xeon/.staging/job_1431623302732_0004/job.split could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)

2015-05-14 18:43:55,165 INFO  [pool-1-thread-1] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(545)) - Cleaning up the staging area /tmp/hadoop-yarn/staging/xeon/.staging/job_1431623302732_0004
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/xeon/.staging/job_1431623302732_0004/job.split could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
|
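The only clue I can see in this trace: the client reaches the ResourceManager through the public name |ec2-52-25-10-73|, but the DataNode is advertised at its private EC2 address |172.31.17.45:50010|, which times out from outside; once that single DataNode is excluded, 0 nodes remain, hence the minReplication error. If that is the cause, maybe the client has to address DataNodes by hostname rather than IP. A sketch of what I would add to the client configuration from [1] (untested, and it assumes the DataNode hostnames resolve to reachable public addresses on my side):

|// Sketch: ask the HDFS client to connect to DataNodes by hostname instead of
// the (private) IP the NameNode reports; assumes those hostnames resolve to
// reachable public addresses on the client machine.
conf.setBoolean("dfs.client.use.datanode.hostname", true);
|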

[3]

|~/Programs/hadoop-wordcount-coc$ hdfs dfs -ls /tmp/hadoop-yarn/staging/xeon/.staging
Found 2 items
drwx------   - xeon supergroup          0 2015-05-14 17:09 /tmp/hadoop-yarn/staging/xeon/.staging/job_1431623302732_0001
drwx------   - xeon supergroup          0 2015-05-14 17:11 /tmp/hadoop-yarn/staging/xeon/.staging/job_1431623302732_0002
|

