[ 
https://issues.apache.org/jira/browse/OOZIE-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664039#comment-15664039
 ] 

Andras Piros commented on OOZIE-2540:
-------------------------------------

Hi [~Dongying Jiao],

actually, Oozie doesn't have a special {{PySparkAction}}, only a {{SparkAction}} 
that can be fed with the {{application-jar}} and {{--py-files}} arguments, as 
per the [*official 
documentation*|http://spark.apache.org/docs/latest/submitting-applications.html]:

{quote}
For Python applications, simply pass a {{.py}} file in the place of 
{{<application-jar>}} instead of a JAR, and add Python {{.zip}}, {{.egg}} or 
{{.py}} files to the search path with {{--py-files}}.
{quote}

I think if you can submit your {{.py}} file via {{spark-submit}}, you can also 
put the same parameters inside your {{workflow.xml}}, and Oozie will happily 
make the {{spark-submit}} call for you.
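
To illustrate, here is a minimal sketch of what such a {{workflow.xml}} might 
look like. The names {{my_app.py}}, {{deps.zip}}, {{pyspark-example}} and the 
{{$\{jobTracker\}}}, {{$\{nameNode\}}} and {{$\{master\}}} properties are 
placeholders I made up for illustration; it also assumes both files are shipped 
with the workflow application (for example in its {{lib/}} directory). The 
script goes into {{<jar>}} where a JAR would normally be, and 
{{--py-files}} goes into {{<spark-opts>}}, mirroring something like 
{{spark-submit --py-files deps.zip my_app.py}}:

```xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="pyspark-example-wf">
    <start to="pyspark-node"/>
    <action name="pyspark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>pyspark-example</name>
            <!-- Hypothetical script name: the .py file takes the place of the application JAR. -->
            <jar>my_app.py</jar>
            <!-- Hypothetical dependency archive, handed to spark-submit as an extra option. -->
            <spark-opts>--py-files deps.zip</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>PySpark job failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Please treat this as a starting point rather than a tested example; the schema 
versions and the way the files are localized may differ in your setup.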

As for the Python and Spark runtimes, those are outside of Oozie's reach and 
responsibility.

Regards,

Andras

> Create a PySpark example
> ------------------------
>
>                 Key: OOZIE-2540
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2540
>             Project: Oozie
>          Issue Type: Task
>          Components: examples
>    Affects Versions: trunk
>            Reporter: Robert Kanter
>
> Now that we have PySpark working correctly in the Spark Action, we should 
> make an example that runs a PySpark job to give users an example of how to do 
> it.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
