[PR] Fix OTLP traces e2e test stability [skywalking]

via GitHub Thu, 19 Mar 2026 18:50:14 -0700


wu-sheng opened a new pull request, #13752:
URL: https://github.com/apache/skywalking/pull/13752


   ### Fix OTLP traces e2e test instability
   
   The OTLP traces e2e test has been flaky because of infrastructure issues, 
not the sampling rate, which is 100% everywhere.
   
   **Root causes identified and fixed:**
   
   1. **No health checks on OTel demo containers** — trigger fired before 
services were ready, producing no traces during the retry window.
      - Added `healthcheck` with TCP checks (same pattern as base-compose OAP)
      - Added `depends_on: condition: service_healthy` for proper startup 
ordering
   
   2. **Non-existent service endpoints causing 20-30s timeouts** — 
`CURRENCY_SERVICE_ADDR: no.exist:80` and `FEATURE_FLAG_GRPC_SERVICE_ADDR: 
no.exist:80` caused DNS resolution failures and gRPC dial timeouts on every 
request, so `/api/products` was slow or failed entirely.
      - Changed both to `productcatalogservice:3550`, a reachable endpoint that 
returns a fast gRPC "unimplemented" error instead of hanging
   
   3. **Tight memory limits** — `productcatalogservice` at 20M and `frontend` 
at 200M could OOM under CI load.
      - Bumped to 40M and 300M respectively
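   
   A minimal compose sketch of how the three fixes above could fit together. 
This is an illustration, not the actual diff: the placement of the two 
`*_ADDR` variables on `frontend`, the bash `/dev/tcp` probe, the healthcheck 
timings, and the `deploy.resources.limits` form of the memory limits are all 
assumptions. Only the service names, the `productcatalogservice:3550` address, 
and the 40M/300M values come from the description.
   
   ```yaml
   # Hypothetical excerpt; see the assumptions listed above.
   services:
     productcatalogservice:
       healthcheck:
         # TCP probe on the gRPC port; assumes bash is available in the image.
         test: ["CMD", "bash", "-c", "cat < /dev/null > /dev/tcp/127.0.0.1/3550"]
         interval: 5s    # assumed timing
         timeout: 60s    # assumed timing
         retries: 120    # assumed timing
       deploy:
         resources:
           limits:
             memory: 40M   # was 20M
     frontend:
       environment:
         # Reachable address: fails fast with gRPC "unimplemented"
         # instead of waiting on DNS for no.exist:80.
         CURRENCY_SERVICE_ADDR: productcatalogservice:3550
         FEATURE_FLAG_GRPC_SERVICE_ADDR: productcatalogservice:3550
       depends_on:
         productcatalogservice:
           condition: service_healthy   # start only after the probe passes
       deploy:
         resources:
           limits:
             memory: 300M  # was 200M
   ```
   
   With `condition: service_healthy`, Compose delays starting the dependent 
container until the healthcheck of its dependency passes, so the trigger should 
no longer hit services that are still starting.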
   
   Also adds e2e expectation specification documents (CLAUDE.md and 
protocol-specific specs) for AI-assisted e2e test development.
   
   - [ ] Explain briefly why the bug exists and how to fix it.
     - The test containers had no health checks, so the e2e trigger started 
calling endpoints before services were ready. Combined with DNS timeouts on the 
non-existent service addresses, each request took 20-30s instead of completing 
quickly, which starved the test of valid traces within the verify window.
   
   - [ ] Update the [`CHANGES` 
log](https://github.com/apache/skywalking/blob/master/docs/en/changes/changes.md).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
