Re: How To Set Environment Variables For Spark Action Script From XML Definition

2018-05-14 Thread Peter Cseh
Hi!

There is no easy, direct way to do this for the Spark action, but you
can take advantage of the fact that Oozie 4.1.0 uses a MapReduce
launcher job to start Spark.
Set the "mapred.map.child.env" property in the action's configuration,
using the format k1=v1,k2=v2. EL functions should also work there.
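
A minimal sketch of what that could look like in the workflow
definition (action name, paths, and the spark-action schema version
below are placeholders, not from your setup):

```xml
<action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- The MapReduce launcher passes this into the child task's
                 environment; list several variables as k1=v1,k2=v2. -->
            <property>
                <name>mapred.map.child.env</name>
                <value>OOZIE_WORKFLOW_ID=${wf:id()}</value>
            </property>
        </configuration>
        <master>yarn-cluster</master>
        <name>my-spark-job</name>
        <jar>${nameNode}/apps/myapp/script.py</jar>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

The script should then be able to read it exactly as in your example,
with os.getenv("OOZIE_WORKFLOW_ID"). One caveat: this sets the
environment of the launcher mapper, so whether it reaches the Spark
driver or executors can depend on the deploy mode; I'd verify it in
your environment first.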

Gp


On Thu, May 10, 2018 at 6:39 PM, Richard Primera <
richard.prim...@woombatcg.com> wrote:

> Greetings,
>
> How can I set an environment variable to be accessible from either a .jar
> or .py script launched via a spark action?
>
> The idea is to set the environment variable with the output of the EL
> function ${wf:id()} from within the XML workflow definition, something
> along these lines:
>
> script.py
>
> OOZIE_WORKFLOW_ID=${wf:id()}
>
> And then have the ability to do wf_id = os.getenv("OOZIE_WORKFLOW_ID")
> from the script, without having to pass it as a command-line argument.
> The problem with command-line arguments is that they don't scale as well,
> because they rely on a specific ordering or on a custom parsing
> implementation. It seems this can be done easily with a shell action, but
> I've been unable to find a similarly straightforward way of doing it for a
> Spark action.
>
> Oozie Version: 4.1.0-cdh5.12.1
>
>


-- 
Peter Cseh | Software Engineer
cloudera.com
