Hi Rohit,

Your yarn.application.classpath is missing the following entries:

$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
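The gap is easy to see mechanically. The snippet below (a sketch; `value` is simply the yarn.application.classpath string from the yarn-site.xml quoted later in this thread, pasted verbatim rather than read from the live config) splits it on commas and counts entries that mention MapReduce:

```shell
# The yarn.application.classpath value from the message below, copied verbatim:
value='$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*'

# Split on commas and count entries mentioning MAPRED/mapreduce; "|| true"
# keeps the pipeline alive when grep finds nothing (grep -c still prints 0).
count="$(printf '%s\n' "$value" | tr ',' '\n' | grep -ci mapred || true)"
echo "MapReduce classpath entries: $count"   # prints: MapReduce classpath entries: 0
```

On a live gateway, the same check can be run against the output of `hadoop classpath` instead of a pasted string.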
This is a hunch, but the JobClient inside the YARN application may not be finding hadoop-mapreduce-client-jobclient-2.3.0.jar, which contains the YarnClientProtocolProvider class; it is therefore defaulting to LocalClientProtocolProvider and hence unable to initiate a connection to your YARN cluster. The jar is typically located under $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/. Can you add the above entries to your yarn-site.xml, restart YARN, and give it a go?

Thanks,
Sudarshan

From: Rohit Kalhans <rohit.kalh...@gmail.com>
Reply-To: "user@gobblin.incubator.apache.org" <user@gobblin.incubator.apache.org>
Date: Wednesday, February 7, 2018 at 1:02 PM
To: "user@gobblin.incubator.apache.org" <user@gobblin.incubator.apache.org>
Subject: Re: PriviledgedActionException while submitting a gobblin job to mapreduce.

Hello all,

First of all, thanks for the quick turnaround; I really appreciate the help. The environment variables have been set correctly (at least that's what I think). I am running this on a feeder box (gateway) of a CDH 5.7 cluster managed by Cloudera Manager. The yarn-site.xml contains the following:

<property>
  <name>yarn.application.classpath</name>
  <value>$HADOOP_CLIENT_CONF_DIR,$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*</value>
</property>

Before the execution of my application I call the following:

export HADOOP_PREFIX="/opt/cloudera/parcels/CDH/"
export HADOOP_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=HADOOP_PREFIX/etc/hadoop/
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_CLIENT_CONF_DIR="/etc/hadoop/conf"
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_BIN_DIR=$HADOOP_PREFIX/bin
source /etc/hadoop/conf/hadoop-env.sh

The hadoop-env.sh sets a few variables as well.
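One detail in the exports above is worth flagging: the HADOOP_CONF_DIR line is missing the `$` before HADOOP_PREFIX, so the shell assigns the literal relative string "HADOOP_PREFIX/etc/hadoop/" rather than the expanded path. A minimal reproduction, using the same paths as the exports above:

```shell
export HADOOP_PREFIX="/opt/cloudera/parcels/CDH/"

# As written in the exports above -- no "$", so no expansion happens:
export HADOOP_CONF_DIR=HADOOP_PREFIX/etc/hadoop/
echo "$HADOOP_CONF_DIR"   # prints: HADOOP_PREFIX/etc/hadoop/

# With the "$" in place, the variable expands as presumably intended:
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop/
echo "$HADOOP_CONF_DIR"   # prints: /opt/cloudera/parcels/CDH//etc/hadoop/
```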
$ cat /etc/hadoop/conf/hadoop-env.sh
# Prepend/Append plugin parcel classpaths
if [ "$HADOOP_USER_CLASSPATH_FIRST" = 'true' ]; then
  # HADOOP_CLASSPATH={{HADOOP_CLASSPATH_APPEND}}
  :
else
  # HADOOP_CLASSPATH={{HADOOP_CLASSPATH}}
  :
fi
# JAVA_LIBRARY_PATH={{JAVA_LIBRARY_PATH}}
export HADOOP_MAPRED_HOME=$( ([[ ! '/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce' =~ CDH_MR2_HOME ]] && echo /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce ) || echo ${CDH_MR2_HOME:-/usr/lib/hadoop-mapreduce/} )
export HADOOP_CLIENT_OPTS="-Xmx268435456 $HADOOP_CLIENT_OPTS"
export HADOOP_CLIENT_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_CLIENT_OPTS"
export YARN_OPTS="-Xmx825955249 -Djava.net.preferIPv4Stack=true $YARN_OPTS"

On Thu, Feb 8, 2018 at 12:48 AM, Sudarshan Vasudevan <suvasude...@linkedin.com> wrote:

Hi Rohit,

Can you share the properties in your yarn-site.xml file? The following is an example config that worked for me. I set yarn.application.classpath in yarn-site.xml to the following:

<property>
  <description>Classpath for typical applications.</description>
  <name>yarn.application.classpath</name>
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,
    $HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
  </value>
</property>

In my local Hadoop installation, I set the HADOOP_* environment variables as follows:

export HADOOP_PREFIX="/usr/local/hadoop-2.3.0"
export HADOOP_HOME=$HADOOP_PREFIX
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_BIN_DIR=$HADOOP_PREFIX/bin

Hope this helps,
Sudarshan

From: Rohit Kalhans
<rohit.kalh...@gmail.com>
Reply-To: "user@gobblin.incubator.apache.org" <user@gobblin.incubator.apache.org>
Date: Wednesday, February 7, 2018 at 10:57 AM
To: "user@gobblin.incubator.apache.org" <user@gobblin.incubator.apache.org>
Subject: PriviledgedActionException while submitting a gobblin job to mapreduce.

Hello,

I am integrating Gobblin in embedded mode with an existing application. While submitting the job, there seems to be an unresolved dependency/requirement in the MapReduce launcher. I have checked that mapreduce.framework.name is set to yarn, and the other YARN applications are running fine. Somehow I keep hitting this issue with the Gobblin MR job launcher. I was hoping you could help me set up Gobblin in embedded mode for my application. Here is the stack trace; do let me know if some other info is needed.

Launching Hadoop MR job Gobblin-test9
WARN [2018-02-07 11:43:22,990] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:<userName> (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
INFO [2018-02-07 11:43:22,991] org.apache.gobblin.runtime.TaskStateCollectorService: Stopping the TaskStateCollectorService
INFO [2018-02-07 11:43:23,033] org.apache.gobblin.runtime.mapreduce.MRJobLauncher: Deleted working directory /tmp/_test9_1518003781707/test9/job_test9_1518003782322
ERROR [2018-02-07 11:43:23,033] org.apache.gobblin.runtime.AbstractJobLauncher: Failed to launch and run job job_test9_1518003782322: java.io.IOException: Cannot initialize Cluster.
Please check your configuration for mapreduce.framework.name and the correspond server addresses.
! java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
! at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
! at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
! at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
! at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1277)
! at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1273)
! at java.security.AccessController.doPrivileged(Native Method)
! at javax.security.auth.Subject.doAs(Subject.java:422)
! at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
! at org.apache.hadoop.mapreduce.Job.connect(Job.java:1272)
! at org.apache.hadoop.mapreduce.Job.submit(Job.java:1301)
! at org.apache.gobblin.runtime.mapreduce.MRJobLauncher.runWorkUnits(MRJobLauncher.java:244)
! at org.apache.gobblin.runtime.AbstractJobLauncher.runWorkUnitStream(AbstractJobLauncher.java:596)
! at org.apache.gobblin.runtime.AbstractJobLauncher.launchJob(AbstractJobLauncher.java:443)
! at org.apache.gobblin.runtime.job_exec.JobLauncherExecutionDriver$DriverRunnable.call(JobLauncherExecutionDriver.java:159)
! at org.apache.gobblin.runtime.job_exec.JobLauncherExecutionDriver$DriverRunnable.call(JobLauncherExecutionDriver.java:147)
! at java.util.concurrent.FutureTask.run(FutureTask.java:266)
! at java.lang.Thread.run(Thread.java:745)

--
Cheerio!
Rohit
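For context on the stack trace above: Cluster.initialize (the top frame) discovers ClientProtocolProvider implementations via java.util.ServiceLoader, asks each one to handle the configured mapreduce.framework.name, and throws exactly this IOException when none accepts it. When the jobclient jar is absent, only the local provider is discoverable, so framework "yarn" has no taker. A rough shell sketch of that selection loop (the provider names and string-matching rule are simplified stand-ins; the real providers are Java classes discovered from jars on the classpath):

```shell
# init_cluster FRAMEWORK PROVIDER... -- pick the first provider matching the
# configured framework, or report the failure seen in the stack trace above.
init_cluster() {
  framework="$1"
  shift
  for p in "$@"; do
    if [ "$p" = "$framework" ]; then
      echo "connected via $p provider"
      return
    fi
  done
  echo "Cannot initialize Cluster (no provider accepts '$framework')"
}

init_cluster yarn local        # prints: Cannot initialize Cluster (no provider accepts 'yarn')
init_cluster yarn local yarn   # prints: connected via yarn provider
```

Adding the mapreduce directories to yarn.application.classpath makes the YARN provider discoverable, which is why the classpath fix suggested earlier in the thread addresses this error.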