[
https://issues.apache.org/jira/browse/OOZIE-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15664039#comment-15664039
]
Andras Piros commented on OOZIE-2540:
-------------------------------------
Hi [~Dongying Jiao],
actually, Oozie doesn't have a special {{PySparkAction}}, only a {{SparkAction}}
that can be given an {{application-jar}} and a {{--py-files}} argument, as per the
[*official documentation*|http://spark.apache.org/docs/latest/submitting-applications.html]:
{quote}
For Python applications, simply pass a {{.py}} file in the place of
{{<application-jar>}} instead of a JAR, and add Python {{.zip}}, {{.egg}} or
{{.py}} files to the search path with {{--py-files}}.
{quote}
I think if you can submit your {{.py}} file via {{spark-submit}}, you can also
put the same parameters inside your {{workflow.xml}}, and Oozie will happily
make the {{spark-submit}} call for you.
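As a minimal sketch of how those {{spark-submit}} parameters could map onto a {{workflow.xml}}: the {{.py}} file goes into {{<jar>}} and the {{--py-files}} option into {{<spark-opts>}}. The file names {{pi.py}} and {{deps.zip}}, the workflow and action names, the schema versions, and the jobTracker, nameNode and master properties are assumptions made for illustration, not taken from this issue. Both files are assumed to be available to the action, for example in the workflow application's {{lib/}} directory.
{code:xml}
<workflow-app xmlns="uri:oozie:workflow:0.5" name="pyspark-example">
    <start to="spark-node"/>
    <action name="spark-node">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>PySpark example</name>
            <!-- The .py file takes the place of the application JAR (hypothetical file name) -->
            <jar>pi.py</jar>
            <!-- Extra Python dependencies handed on to spark-submit (hypothetical file name) -->
            <spark-opts>--py-files deps.zip</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed, error message [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
{code}
Under these assumptions, this should correspond roughly to running {{spark-submit --py-files deps.zip pi.py}} with the matching {{--master}} and {{--name}} values.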
As for the Python and Spark runtimes, those are outside of Oozie's reach
and responsibility.
Regards,
Andras
> Create a PySpark example
> ------------------------
>
> Key: OOZIE-2540
> URL: https://issues.apache.org/jira/browse/OOZIE-2540
> Project: Oozie
> Issue Type: Task
> Components: examples
> Affects Versions: trunk
> Reporter: Robert Kanter
>
> Now that we have PySpark working correctly in the Spark Action, we should
> make an example that runs a PySpark job to give users an example of how to do
> it.