Lightning Analytics created AIRFLOW-6384:
--------------------------------------------
Summary: Custom Spark Livy Operator
Key: AIRFLOW-6384
URL: https://issues.apache.org/jira/browse/AIRFLOW-6384
Project: Apache Airflow
Issue Type: New Feature
Components: operators
Affects Versions: 1.10.6, 1.10.5
Reporter: Lightning Analytics
Here at Lightning Analytics, we always strive for innovation. Keeping in mind
the challenges posed by open source technologies in terms of integration, our
aim is to build custom solutions that bridge the gap and ease deployment. The
orchestration tool, Airflow, presents certain challenges for integration with
components in the cloud environment. Airflow has built-in support for
operators, which are integral to integration. We observed that no operator for
the Livy REST API is available.
An operator, Spark Livy Operator, has been developed and tested that submits
Spark jobs to the cluster using the Livy REST API. Livy provides two session
types, namely Batch and Interactive. The custom Spark Livy Operator provides
Interactive session submission and additionally sends heartbeats to check the
status of the batch. Extensive testing has been performed using the Celery
executor, and all available hooks and libraries in Airflow have been integrated.
We would like to publish it to the Apache open source community for
enhancement, use and distribution. Please let us know the steps for creating a
Git pull request for Airflow.
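
To illustrate the kind of flow described above, here is a minimal sketch of
submitting code to a Livy interactive session and polling its state. It is not
the operator's actual code. The REST paths (/sessions, /sessions/{id}/state,
/sessions/{id}/statements) are Livy's documented endpoints; the helper names
(http_json, wait_for_state, run_statement), the default poll interval and the
retry limit are assumptions made for this example, and the Airflow operator
wrapper is left out.

```python
import json
import time
import urllib.request

# States treated as failures while polling (a subset of Livy's session and
# statement states, chosen for this sketch).
FAILED_STATES = {"error", "dead", "killed", "shutting_down", "cancelled"}


def http_json(method, url, body=None):
    """Hypothetical helper: send a JSON request and decode the JSON reply."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_state(fetch_state, wanted, poll_seconds=5.0, max_polls=120,
                   sleep=time.sleep):
    """Poll fetch_state() until it returns `wanted`; raise on failure/timeout."""
    for _ in range(max_polls):
        state = fetch_state()
        if state == wanted:
            return state
        if state in FAILED_STATES:
            raise RuntimeError(f"Livy reported state {state!r}")
        sleep(poll_seconds)
    raise TimeoutError(f"state {wanted!r} not reached after {max_polls} polls")


def run_statement(livy_url, code, kind="pyspark", request=http_json,
                  **poll_kwargs):
    """Create an interactive session, run one statement, return its output."""
    session = request("POST", f"{livy_url}/sessions", {"kind": kind})
    session_url = f"{livy_url}/sessions/{session['id']}"
    try:
        # Heartbeat on the session until it is ready to accept statements.
        wait_for_state(
            lambda: request("GET", f"{session_url}/state")["state"],
            "idle", **poll_kwargs)
        stmt = request("POST", f"{session_url}/statements", {"code": code})
        stmt_url = f"{session_url}/statements/{stmt['id']}"
        # Heartbeat on the statement until its result is available.
        wait_for_state(
            lambda: request("GET", stmt_url)["state"],
            "available", **poll_kwargs)
        return request("GET", stmt_url)["output"]
    finally:
        # Always release the session, even if polling failed.
        request("DELETE", session_url)
```

A call such as run_statement("http://livy-host:8998", "spark.range(10).count()")
would exercise the whole flow; the host name here is a placeholder. Passing the
request function as a parameter is a design choice for this sketch so that the
HTTP layer can be swapped for an Airflow hook or for a fake in tests.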
--
This message was sent by Atlassian Jira
(v8.3.4#803005)