[
https://issues.apache.org/jira/browse/AIRFLOW-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ash Berlin-Taylor updated AIRFLOW-4349:
---------------------------------------
Description:
Steps to recreate:
Execute an Athena operator whose query ends in a failed state (in this case
because of some missing S3 permissions). Athena returns a failed state, but the
failure-handling code does not operate as expected.
{code:java}
[2019-04-18 00:06:35,222] {{logging_mixin.py:95}} INFO - [2019-04-18
00:06:35,222] {{aws_athena_hook.py:129}} INFO - Trial 1: Query is still in an
intermediate state - RUNNING
[2019-04-18 00:07:05,272] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:05,272] {{connectionpool.py:203}} INFO - Starting new HTTP connection
(1): XXX.XXX.XXX.XXX
[2019-04-18 00:07:05,282] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:05,282] {{connectionpool.py:203}} INFO - Starting new HTTP connection
(1): XXX.XXX.XXX.XXX
[2019-04-18 00:07:05,286] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:05,286] {{connectionpool.py:238}} INFO - Resetting dropped connection:
athena.us-east-1.amazonaws.com
[2019-04-18 00:07:05,377] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:05,377] {{aws_athena_hook.py:132}} INFO - Trial 2: Query execution
completed. Final state is FAILED
...
...
...
[2019-04-18 00:07:08,560] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:08,559] {{jobs.py:2562}} INFO - Task exited with return code 0
{code}
Looking at the code, if the query fails, the Operator should then report this
message:
{code:java}
'Final state of Athena job is {}, query_execution_id is {}.'
{code}
But that is not in the logs. The task is then marked as successfully completed
in Airflow and the DAG continues.
was:
Steps to recreate:
Execute an Athena operator whose query ends in a failed state (in this case
because of some missing S3 permissions). Athena returns a failed state, but the
failure-handling code does not operate as expected.
{code:java}
[2019-04-18 00:06:35,222] {{logging_mixin.py:95}} INFO - [2019-04-18
00:06:35,222] {{aws_athena_hook.py:129}} INFO - Trial 1: Query is still in an
intermediate state - RUNNING [2019-04-18 00:07:05,272] {{logging_mixin.py:95}}
INFO - [2019-04-18 00:07:05,272] {{connectionpool.py:203}} INFO - Starting new
HTTP connection (1): XXX.XXX.XXX.XXX [2019-04-18 00:07:05,282]
{{logging_mixin.py:95}} INFO - [2019-04-18 00:07:05,282]
{{connectionpool.py:203}} INFO - Starting new HTTP connection (1):
XXX.XXX.XXX.XXX [2019-04-18 00:07:05,286] {{logging_mixin.py:95}} INFO -
[2019-04-18 00:07:05,286] {{connectionpool.py:238}} INFO - Resetting dropped
connection: athena.us-east-1.amazonaws.com [2019-04-18 00:07:05,377]
{{logging_mixin.py:95}} INFO - [2019-04-18 00:07:05,377]
{{aws_athena_hook.py:132}} INFO - Trial 2: Query execution completed. Final
state is FAILED
...
...
...
[2019-04-18 00:07:08,560] {{logging_mixin.py:95}} INFO - [2019-04-18
00:07:08,559] {{jobs.py:2562}} INFO - Task exited with return code 0
```
{code}
Looking at the code, if the query fails, the Operator should then report this
message:
{code:java}
'Final state of Athena job is {}, query_execution_id is {}.'
{code}
But that is not in the logs. The task is then marked as successfully completed
in Airflow and the DAG continues.
> Athena Operator Marked Successful on Failure
> --------------------------------------------
>
> Key: AIRFLOW-4349
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4349
> Project: Apache Airflow
> Issue Type: Bug
> Components: operators
> Affects Versions: 1.10.3
> Environment: Airflow 1.10.3 running on K8s with K8s Executor in AWS.
> Reporter: Steve Buckingham
> Priority: Critical
>
> Steps to recreate:
> Execute an Athena operator whose query ends in a failed state (in this case
> because of some missing S3 permissions). Athena returns a failed state, but
> the failure-handling code does not operate as expected.
> {code:java}
> [2019-04-18 00:06:35,222] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:06:35,222] {{aws_athena_hook.py:129}} INFO - Trial 1: Query is still in an
> intermediate state - RUNNING
> [2019-04-18 00:07:05,272] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:07:05,272] {{connectionpool.py:203}} INFO - Starting new HTTP connection
> (1): XXX.XXX.XXX.XXX
> [2019-04-18 00:07:05,282] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:07:05,282] {{connectionpool.py:203}} INFO - Starting new HTTP connection
> (1): XXX.XXX.XXX.XXX
> [2019-04-18 00:07:05,286] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:07:05,286] {{connectionpool.py:238}} INFO - Resetting dropped connection:
> athena.us-east-1.amazonaws.com
> [2019-04-18 00:07:05,377] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:07:05,377] {{aws_athena_hook.py:132}} INFO - Trial 2: Query execution
> completed. Final state is FAILED
> ...
> ...
> ...
> [2019-04-18 00:07:08,560] {{logging_mixin.py:95}} INFO - [2019-04-18
> 00:07:08,559] {{jobs.py:2562}} INFO - Task exited with return code 0
> {code}
> Looking at the code, if the query fails, the Operator should then report this
> message:
> {code:java}
> 'Final state of Athena job is {}, query_execution_id is {}.'
> {code}
> But that is not in the logs. The task is then marked as successfully
> completed in Airflow and the DAG continues.
>
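For illustration only, here is a minimal sketch of the behaviour the report expects: once polling ends in a failed state, the task should raise instead of exiting with return code 0. This is not the Airflow implementation. The names `FAILURE_STATES`, `AthenaQueryFailed` and `raise_on_failure` are hypothetical, and the set of failure states is an assumption; only the message template comes from the issue text.

```python
# Hypothetical sketch, not the actual Airflow operator code.

# Assumed set of terminal states that should fail the task.
FAILURE_STATES = ("FAILED", "CANCELLED")


class AthenaQueryFailed(Exception):
    """Hypothetical exception type used to fail the task."""


def raise_on_failure(final_state, query_execution_id):
    """Raise if the final query state is a failure state.

    Returns the query execution id unchanged when the query did not fail,
    so the caller can keep using it.
    """
    if final_state in FAILURE_STATES:
        # Message template quoted in the issue description.
        raise AthenaQueryFailed(
            'Final state of Athena job is {}, query_execution_id is {}.'
            .format(final_state, query_execution_id)
        )
    return query_execution_id
```

With a check like this in the execute path, an uncaught exception would cause the task to be marked as failed, and the message above would appear in the task log.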