[ https://issues.apache.org/jira/browse/AIRFLOW-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16359595#comment-16359595 ]
Fokko Driesprong commented on AIRFLOW-1713:
-------------------------------------------
You could use a PythonOperator and run the PySpark code inside its callable.
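
A minimal sketch of that suggestion, not code from this issue: the function name count_rows, the table name, and the injectable spark argument are hypothetical and only there for illustration. It assumes PySpark is available on the worker and that the value returned by the callable is what the DAG wants to pass downstream (a PythonOperator pushes its callable's return value to XCom).

```python
def count_rows(table, spark=None):
    """Return the row count of `table` using Spark SQL.

    `spark` is an optional, hypothetical injection point so the function
    can be exercised without a cluster; by default a SparkSession is created.
    """
    if spark is None:
        # Imported lazily so the module can be parsed without PySpark installed.
        from pyspark.sql import SparkSession
        spark = SparkSession.builder.appName("count_rows").getOrCreate()

    rows = spark.sql("SELECT COUNT(1) AS n FROM {}".format(table)).collect()
    # collect() returns a list of Row objects; take the single aggregate value.
    return rows[0]["n"]


# Wiring into a DAG (assumed Airflow 1.x import path, shown as comments only):
#
#   from airflow.operators.python_operator import PythonOperator
#
#   count_task = PythonOperator(
#       task_id="count_rows",
#       python_callable=count_rows,
#       op_kwargs={"table": "my_table"},  # hypothetical table name
#       dag=dag,
#   )
```

A downstream task could then read the count with xcom_pull(task_ids="count_rows").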
> Return results optionally from spark_sql_hook
> ---------------------------------------------
>
> Key: AIRFLOW-1713
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1713
> Project: Apache Airflow
> Issue Type: Improvement
> Components: contrib
> Affects Versions: 1.8.1
> Reporter: Boris Tyukin
> Assignee: Boris Tyukin
> Priority: Minor
>
> spark_sql_hook is very useful for executing Spark SQL queries without much of the
> boilerplate needed with spark-submit, but right now it is not possible to capture
> the execution results. One example is running "select count(1) from
> table" and getting that count back to the Airflow DAG. To address that, an optional
> argument containing a regular expression is added to spark_sql_hook; it captures
> all Spark log output matching the regex and returns it to Airflow.
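
The regex-capture idea in the description can be sketched roughly as follows. This is not the actual spark_sql_hook change: the function name capture_matches is hypothetical, and the sample lines are made up for illustration rather than real Spark log output.

```python
import re


def capture_matches(log_lines, pattern):
    """Collect the parts of `log_lines` that match `pattern`.

    If the pattern has a capture group, the first group is returned for
    each matching line; otherwise the whole matched text is returned.
    """
    regex = re.compile(pattern)
    captured = []
    for line in log_lines:
        match = regex.search(line)
        if match:
            captured.append(match.group(1) if regex.groups else match.group(0))
    return captured


# Made-up output lines, standing in for whatever the Spark job prints.
sample_log = [
    "INFO starting query",
    "row_count=1234",
    "INFO done",
]

print(capture_matches(sample_log, r"^row_count=(\d+)$"))  # prints ['1234']
```

Returning a list keeps every match, which mirrors the "capture all matching output" wording above; a caller that expects a single value can take the first element.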