[ 
https://issues.apache.org/jira/browse/OOZIE-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter resolved OOZIE-2456.
----------------------------------
    Resolution: Duplicate

> spark action can not find pyspark module
> ----------------------------------------
>
>                 Key: OOZIE-2456
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2456
>             Project: Oozie
>          Issue Type: Bug
>          Components: action, client, core
>    Affects Versions: 4.1.0
>         Environment: Ubuntu 14.04.3
>            Reporter: Ming Hsuan Tu
>
> I have a Spark script written in PySpark and I want to submit it via the 
> Oozie spark action, something like this:
> {code:xml}
>   <action name="myapp">
>       <spark xmlns="uri:oozie:spark-action:0.1">
>           <job-tracker>${job_tracker}</job-tracker>
>           <name-node>${name_node}</name-node>
>           <master>local[*]</master>
>           <name>myapp</name>
>           <jar>${my_script}</jar>
>           <spark-opts>--executor-memory 4G --num-executors 4</spark-opts>
>           <arg>${arg1}</arg>
>       </spark>
>       <ok to="hive_import"/>
>       <error to="send_email"/>
>   </action>
> {code}
> The script imports pyspark module:
> {code:text}
> #!/usr/bin/spark-submit
> from pyspark import SparkContext
> from pyspark import SparkFiles
> sc = SparkContext()
> {code}
> However, Oozie throws a "Can not import pyspark module" exception.
> This started happening when I upgraded from CDH 5.4.6 to CDH 5.5.1.
> The workaround would be to use the shell action, but I think the spark 
> action describes the Spark task better.
> Any suggestions?
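> For reference, a minimal sketch of the shell-action workaround mentioned 
> above (the script name and shell-action schema version are assumptions, 
> and spark-submit is assumed to be on the worker node's PATH):
> {code:xml}
>   <action name="myapp-shell">
>       <shell xmlns="uri:oozie:shell-action:0.2">
>           <job-tracker>${job_tracker}</job-tracker>
>           <name-node>${name_node}</name-node>
>           <!-- hypothetical wrapper script that calls:
>            spark-submit --master local[*] --executor-memory 4G \
>              --num-executors 4 myapp.py "$1" -->
>           <exec>submit_myapp.sh</exec>
>           <argument>${arg1}</argument>
>           <!-- ship the wrapper and the PySpark script to the container -->
>           <file>submit_myapp.sh</file>
>           <file>${my_script}</file>
>       </shell>
>       <ok to="hive_import"/>
>       <error to="send_email"/>
>   </action>
> {code}
> Invoking spark-submit directly this way sidesteps Oozie's spark action 
> classpath handling, at the cost of a less declarative workflow.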



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)