Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-11 Thread via GitHub


andreahlert commented on PR #62811:
URL: https://github.com/apache/airflow/pull/62811#issuecomment-4041112877

   Thanks everyone for the reviews and feedback! #61242 already addressed the 
root cause (DNS flakiness when waiting for the OTel collector) by adding 
wait_for_otel_collector, and #56150 later did a broader rewrite of the test 
file. There's nothing left here that isn't already covered, so I'm closing 
this. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-11 Thread via GitHub


andreahlert closed pull request #62811: Fix flaky OTEL connection in 
integration tests
URL: https://github.com/apache/airflow/pull/62811


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-07 Thread via GitHub


potiuk commented on PR #62811:
URL: https://github.com/apache/airflow/pull/62811#issuecomment-4016255744

   The OTEL integration tests have been completely rewritten now, so likely 
this PR is not needed any more. Converting it to Draft - as we observe 
stability - we might resume it if we find that the changes did not help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-06 Thread via GitHub


Copilot commented on code in PR #62811:
URL: https://github.com/apache/airflow/pull/62811#discussion_r2895568683


##
airflow-core/src/airflow/ui/tests/e2e/specs/backfill.spec.ts:
##
@@ -218,8 +218,8 @@ test.describe("Backfill pause, resume, and cancel 
controls", () => {
 await expect(async () => {
   await backfillPage.page.reload();
   await expect(backfillPage.triggerButton).toBeVisible({ timeout: 10_000 
});
-  await expect(backfillPage.pauseButton).toBeVisible({ timeout: 5000 });
-}).toPass({ timeout: 60_000 });
+  await expect(backfillPage.pauseButton).toBeVisible({ timeout: 15_000 });
+}).toPass({ timeout: 120_000 });

Review Comment:
   The change to `backfill.spec.ts` (increasing the `pauseButton` visibility 
timeout from 5 000 ms to 15 000 ms and the outer `toPass` timeout from 60 000 
ms to 120 000 ms) is completely unrelated to the stated PR goal of fixing flaky 
OTEL connection issues. The PR description mentions only OTEL integration test 
fixes and an OTEL Collector healthcheck, but makes no mention of these 
Playwright backfill E2E test timeout changes. This appears to be an unrelated 
change that was bundled into this PR.



##
airflow-core/tests/integration/otel/test_otel.py:
##
@@ -212,6 +212,10 @@ def setup_class(cls):
 os.environ["AIRFLOW__TRACES__OTEL_ON"] = "True"
 os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"
 os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = 
"http://breeze-otel-collector:4318/v1/traces";
+if cls.use_otel != "true":
+os.environ["OTEL_TRACES_EXPORTER"] = "console"

Review Comment:
   Setting `OTEL_TRACES_EXPORTER=console` when `use_otel != "true"` will cause 
`test_dag_execution_succeeds` (lines 441–524) to fail, because that test always 
queries the Jaeger API for span data (line 497: 
`requests.get(f"http://{host}:16686/api/traces?service={service_name}";)`). With 
`OTEL_TRACES_EXPORTER=console`, no spans are sent to Jaeger (or via the 
otel-collector), so the test will either receive an empty trace list (causing 
an `IndexError` on `data["data"][-1]`) or assert against stale spans from a 
prior test run.
   
   Either this test needs a `use_otel == "true"` guard around its Jaeger 
query-and-assert block, or this env var should not be set to `console` for the 
span exporter when the Jaeger test expects OTLP data.
   
   Note that the `AIRFLOW__METRICS__OTEL_ON` and `OTEL_METRICS_EXPORTER` 
changes on lines 217–218 are fine, as those affect only metrics, not spans.
   ```suggestion
   
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-06 Thread via GitHub


jason810496 commented on code in PR #62811:
URL: https://github.com/apache/airflow/pull/62811#discussion_r2895548910


##
airflow-core/tests/integration/otel/test_otel.py:
##
@@ -212,6 +212,10 @@ def setup_class(cls):
 os.environ["AIRFLOW__TRACES__OTEL_ON"] = "True"
 os.environ["OTEL_EXPORTER_OTLP_PROTOCOL"] = "http/protobuf"
 os.environ["OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"] = 
"http://breeze-otel-collector:4318/v1/traces";
+if cls.use_otel != "true":
+os.environ["OTEL_TRACES_EXPORTER"] = "console"
+os.environ["OTEL_METRICS_EXPORTER"] = "none"

Review Comment:
   Should we set `OTEL_METRICS_EXPORTER` as `"console"`?
   
   
https://github.com/apache/airflow/blob/63ea296047d7e31b5c929043b11929c1484163d6/airflow-core/tests/integration/otel/test_otel.py#L830-L831



##
airflow-core/src/airflow/ui/tests/e2e/specs/backfill.spec.ts:
##
@@ -218,8 +218,8 @@ test.describe("Backfill pause, resume, and cancel 
controls", () => {
 await expect(async () => {
   await backfillPage.page.reload();
   await expect(backfillPage.triggerButton).toBeVisible({ timeout: 10_000 
});
-  await expect(backfillPage.pauseButton).toBeVisible({ timeout: 5000 });
-}).toPass({ timeout: 60_000 });
+  await expect(backfillPage.pauseButton).toBeVisible({ timeout: 15_000 });
+}).toPass({ timeout: 120_000 });

Review Comment:
   This is UI only e2e test instead of the integration test mentioning in the 
issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-03 Thread via GitHub


andreahlert commented on PR #62811:
URL: https://github.com/apache/airflow/pull/62811#issuecomment-3993482749

   All CI failures are unrelated to this PR — they are caused by transient 
GitHub infrastructure errors (HTTP 500 during \ctions/checkout\) and a missing 
\helm gpg\ command in the CI runner. Could a maintainer please re-run the 
failed jobs? Thank you.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-03 Thread via GitHub


andreahlert commented on code in PR #62811:
URL: https://github.com/apache/airflow/pull/62811#discussion_r2880093560


##
airflow-core/newsfragments/62769.misc.rst:
##
@@ -0,0 +1 @@
+Fix flaky OTEL connection in integration tests. Disable OTLP metric export for 
span-only tests and add OTEL collector healthcheck so Airflow waits for the 
collector to be ready.

Review Comment:
   Sorry, my fault. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] Fix flaky OTEL connection in integration tests [airflow]

2026-03-03 Thread via GitHub


potiuk commented on code in PR #62811:
URL: https://github.com/apache/airflow/pull/62811#discussion_r2880017165


##
airflow-core/newsfragments/62769.bugfix.rst:
##
@@ -0,0 +1 @@
+Fix flaky OTEL connection in integration tests. Disable OTLP metric export for 
span-only tests and add OTEL collector healthcheck so Airflow waits for the 
collector to be ready.

Review Comment:
   This is absolutely not needed.  We only need newsfragments for serious 
**user** facing changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]