kacpermuda opened a new issue, #40598:
URL: https://github.com/apache/airflow/issues/40598
### Apache Airflow Provider(s)
amazon, openlineage
### Versions of Apache Airflow Providers
Tested on both: main branch and latest pypi versions:
apache-airflow-providers-openlineage==1.8.0
apache-airflow-providers-amazon==8.25.0
### Apache Airflow version
main branch and 2.9.2
### Operating System
MacOS 14.5
### Deployment
Astronomer
### Deployment details
astro-runtime:11.6.0 (tested with astro dev and actual deployment, they
should be the same but still verified it)
Also tested with breeze on main branch
### What happened
When running simple DAG with AthenaOperator, I am sometimes not receiving
OpenLineage DAG complete events. It looks flaky at first, and I've not yet
figured it out. There is nothing suspicious in scheduler logs for me.
I'm not sure how the choice of the task (AthenaOperator) can influence lack
of DAG complete event that is emitted from the scheduler, so it's possible that
I did something wrong here. Let me know if you are able to reproduce this
behaviour.
### What you think should happen instead
We should always receive OpenLineage DAG complete events.
### How to reproduce
I've run this DAG a couple times, and did not receive DAG complete events.
On astro, this is my dockerfile:
```
FROM quay.io/astronomer/astro-runtime:11.6.0
ENV AIRFLOW__OPENLINEAGE__TRANSPORT='{"type": "console"}'
ENV AIRFLOW__LOGGING__LOGGING_LEVEL=DEBUG
```
<details>
<summary>Example DAG</summary>
```
from airflow.providers.amazon.aws.operators.athena import AthenaOperator
from airflow import DAG
import datetime as dt
import random
import string
suffix = ''.join(random.choices(string.ascii_lowercase + string.digits, k=3))
table_name = f"t_{suffix}"
query = f"""
CREATE TABLE {table_name} AS
SELECT
UPPER(name) AS x,
age * 2 AS y
FROM
workers_csv;
"""
with DAG(
dag_id='athena',
start_date=dt.datetime(2024, 5, 21),
schedule=None
) as dag:
task = AthenaOperator(
task_id="task",
aws_conn_id="aws",
query=query,
database="default",
output_location="s3://<bucket-name>/results",
deferrable=False,
region="eu-central-1",
)
```
</details>
### Anything else
Scheduler logs from astro dev:
[astro_scheduler_logs.txt](https://github.com/user-attachments/files/16095130/astro_scheduler_logs.txt)
Scheduler and task logs from breeze:
[breeze_scheduler_logs.txt](https://github.com/user-attachments/files/16095233/breeze_scheduler_logs.txt)
[breeze_task_logs.log](https://github.com/user-attachments/files/16095234/breeze_task_logs.log)
Marquez events:

### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]