Hi Piper,
Just setting HADOOP_CONF_DIR on the client should work. Did you try that?
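For example, assuming the cluster's client-side configs have already been
copied to ~/hadoop-conf on the client (the path is a placeholder; use
wherever your core-site.xml, yarn-site.xml and hdfs-site.xml live):

    # Point Spark at the Hadoop client configs
    export HADOOP_CONF_DIR=~/hadoop-conf

    # Spark reads the ResourceManager and NameNode addresses from those XML
    # files, so --master yarn is all the submit command needs:
    spark-submit --master yarn --deploy-mode cluster \
      --class org.apache.spark.examples.SparkPi \
      $SPARK_HOME/examples/jars/spark-examples_*.jar 100

Spark checks both HADOOP_CONF_DIR and YARN_CONF_DIR, so either one works,
and you do not need HADOOP_HOME set on the client at all.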

BR,
Zhankun
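P.S. On which node to copy from: the client-side configs should be the same
on every node of a healthy cluster, so any node works; the ResourceManager
node is a safe choice. A sketch, with a placeholder hostname (on vanilla
Hadoop the directory is $HADOOP_HOME/etc/hadoop; on CDH/HDP it is typically
/etc/hadoop/conf):

    # Pull the config directory to the client over ssh
    scp -r user@resourcemanager-host:/etc/hadoop/conf ~/hadoop-conf

Note that ssh is only enough for copying the files; the job submission
itself talks to the ResourceManager's RPC port (8032 by default), so the
client must be able to reach that too.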

On Fri, 6 Dec 2019 at 00:43, Piper Piper <piperfl...@gmail.com> wrote:

> Hello,
>
> I want to submit Spark or Flink jobs from a client machine (e.g., a remote
> desktop) to a YARN cluster. Another example would be running a YARN cluster
> on VMs and using the host OS as the client to submit Spark jobs to the VM
> YARN cluster.
>
> What are the easiest ways to set the YARN_CONF_DIR environment variable on
> the client machine so that it can submit Spark jobs to the YARN cluster?
>
> From reading the online documentation, I believe I am supposed to set the
> client's YARN_CONF_DIR environment variable to $HADOOP_HOME/etc/hadoop or
> $HADOOP_HOME/etc/hadoop/conf. However, I do not understand how to find the
> value of HADOOP_HOME, whether I need to set it on every machine in the
> cluster, and how my client machine will know how to locate the NameNode in
> the cluster.
>
> Also, does $HADOOP_HOME/etc/hadoop have to be identical on every node in
> the cluster, or does it live only on a special node, like the NameNode or
> ResourceManager?
>
> I have read that there is an easier way: copy the /etc/hadoop contents to
> the client machine and set the client's YARN_CONF_DIR to that location. Can
> someone please explain how to do this? Which node in the cluster should I
> copy the /etc/hadoop contents from? And would this still work if my client
> can only reach the cluster via ssh?
>
> Thanks!
>
> Piper
