[ 
https://issues.apache.org/jira/browse/NIFI-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mageswaran updated NIFI-4946:
-----------------------------
    Description: 
Adding support for submitting PySpark-based Spark jobs (which are normally 
structured as modules) over Livy through the existing "ExecuteSparkInteractive" 
processor.

This is done by reading file paths for the `pyFiles` and `file` options, plus a 
user-supplied option that controls whether the processor should trigger a batch 
job or not.

[https://livy.incubator.apache.org/docs/latest/rest-api.html]
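As a hedged illustration (not the processor's actual implementation), a request body for Livy's `POST /batches` endpoint carrying the `file` and `pyFiles` options described above might look like the following; all paths and arguments are hypothetical:

```python
import json

# Illustrative Livy /batches request body. The entry-point script goes in
# "file" and supporting module archives go in "pyFiles"; paths are examples.
batch_payload = {
    "file": "/apps/jobs/main.py",           # main PySpark script ("file" option)
    "pyFiles": ["/apps/jobs/modules.zip"],  # supporting modules ("pyfiles" option)
    "args": ["--date", "2018-03-01"],       # optional job arguments
}

# The processor would send this as the JSON body of a POST to /batches.
print(json.dumps(batch_payload))
```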

 *Current Workflow Logic ([https://github.com/apache/nifi/pull/2521]):*
 * Check whether the processor has to handle code or submit a Spark job
 * Read the incoming flow file
 ** If batch == true
 *** If the flow file matches a Livy `batches` JSON response coming through the `wait` loop
 **** Wait for the Status Check Interval
 **** Read the state
 **** If the state is `running`, route it to `wait`; if it is `success` or `dead`, 
route it accordingly
 *** Else
 **** Ignore the flow file
 **** Trigger the Spark job over the Livy `batches` endpoint
 **** Read the state of the submitted job
 **** If the state is `running`, route it to `wait`; if it is `success` or `dead`, 
route it accordingly
 ** Else
 *** Existing logic to handle `Code`
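The state-routing step above can be sketched as follows. This is a minimal illustration, not the processor's code: `route_for_state` is a hypothetical helper, and the relationship names mirror the `wait`/`success`/`dead` routing described in the list:

```python
# Hedged sketch of the batch-state routing described above; the state
# strings follow the Livy REST API, the helper itself is hypothetical.
RUNNING_STATES = {"starting", "running"}

def route_for_state(state):
    """Map a Livy batch state to a NiFi relationship name (illustrative)."""
    if state in RUNNING_STATES:
        return "wait"     # re-queue the flow file and check again after the interval
    if state == "success":
        return "success"
    return "failure"      # e.g. the `dead` state

print(route_for_state("running"))  # -> wait
```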

 

!nifi-spark-options.png!

!nifi-spark.png!

 

Thanks.

  was:
Adding support for submitting PySpark-based Spark jobs (which are normally 
structured as modules) over Livy through the existing "ExecuteSparkInteractive" 
processor.

This is done by reading file paths for the `pyFiles` and `file` options, plus a 
user-supplied option that controls whether the processor should trigger a batch 
job or not.

[https://livy.incubator.apache.org/docs/latest/rest-api.html]

 

More details will be posted in the Git link.

 

Thanks.


> nifi-spark-bundle : Adding support for pyfiles, file, jars options
> ------------------------------------------------------------------
>
>                 Key: NIFI-4946
>                 URL: https://issues.apache.org/jira/browse/NIFI-4946
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Extensions
>    Affects Versions: 1.6.0
>         Environment: Ubuntu 16.04, IntelliJ
>            Reporter: Mageswaran
>            Priority: Major
>             Fix For: 1.6.0
>
>         Attachments: nifi-spark-options.png, nifi-spark.png
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
