errose28 opened a new pull request, #10238:
URL: https://github.com/apache/ozone/pull/10238

   ## What changes were proposed in this pull request?
   
   Draft generated with AI as an example to see if this is helpful.
   
   We often do not check the master branch for flaky tests to proactively track 
them under HDDS-5626 and tag them with `@Flaky`. Usually they don't get tagged 
until they disrupt PRs, but even then tests are frequently just rerun without 
bothering to tag them going forward. This can cause the reliability of master 
to slowly degrade over time, until it gets bad enough that we inspect a lot of 
past runs and add lots of `@Flaky` tags all at once.
   
   To help proactively tag these flaky tests, this PR contains a GitHub Actions 
job that will create a new GitHub discussion for each test run on master that 
has JUnit failures. The current draft sends these to the `General` 
category, but we could create a dedicated category for them.
   
   Format of the discussion would look like this:
   
   ```
   [CI] JUnit failure on master (072f758)
   
   **Workflow run:** https://github.com/errose28/ozone/actions/runs/25561938156
   **Commit:** 072f758268491dd2b8945089f916f4421755741e
   Flaky test Jiras should be filed as **subtasks of 
[HDDS-5626](https://issues.apache.org/jira/browse/HDDS-5626)**.
   ---
   ## integration-hdds
   org.apache.hadoop.hdds.upgrade.TestDNDataDistributionFinalization
   org.apache.hadoop.hdds.upgrade.TestScmDataDistributionFinalization
   org.apache.hadoop.hdds.upgrade.TestScmHAFinalization
   Error: Process completed with exit code 1.
   ```
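
As a rough illustration of the format above, the title and body could be assembled as in the sketch below. This is not the script in this PR: `render_discussion`, its parameters, and the `failures` mapping (job name to failed test classes) are hypothetical names chosen for the example, and the trailing log line from the sample is left out.

```python
from typing import Dict, List, Tuple

# Umbrella Jira for flaky tests, taken from the PR description.
FLAKY_PARENT_JIRA = "HDDS-5626"


def render_discussion(
    commit_sha: str, run_url: str, failures: Dict[str, List[str]]
) -> Tuple[str, str]:
    """Build a (title, body) pair in the format shown above.

    `failures` is a hypothetical mapping of CI job name to the
    fully qualified names of the test classes that failed in it.
    """
    # The title uses the abbreviated (7 character) commit hash.
    title = f"[CI] JUnit failure on master ({commit_sha[:7]})"
    lines = [
        f"**Workflow run:** {run_url}",
        f"**Commit:** {commit_sha}",
        "Flaky test Jiras should be filed as **subtasks of "
        f"[{FLAKY_PARENT_JIRA}]"
        f"(https://issues.apache.org/jira/browse/{FLAKY_PARENT_JIRA})**.",
        "---",
    ]
    # One section per job, with duplicate class names collapsed.
    for job in sorted(failures):
        lines.append(f"## {job}")
        lines.extend(sorted(set(failures[job])))
    return title, "\n".join(lines)
```

Sorting the jobs and class names keeps the output stable between runs, which should make it easier to compare discussions by eye.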
   
   Other alternatives were considered but dropped due to complexity:
   - Automatically filing Jira issues:
     - Requires a new Jira token to be added by ASF Infra. I recall from past 
testing that our existing github token is enough to create discussions.
     - Requires deduplication of test failures. Each run needs to somehow 
figure out if a Jira was already filed for this test which has not been tagged 
yet.
     - Jira's notification system is not as good as github's. It would be hard 
to subscribe to notifications that there is a new flaky test without also 
pushing to a different channel.
   - Automatically creating a PR that adds the `@Flaky` annotation to the test. 
The corresponding Jira would then be filed by whoever reviews and merges the 
change.
     - This was my initial approach, but there is a lot of nuance in mapping 
the output of a maven surefire xml report to the line number to put the 
annotation on.
     - This also has the same deduplication problem as automatically filing 
Jiras.
   - Using failed run archives from 
https://github.com/adoroszlai/ozone-build-results
     - This creates a dependency from Apache on a personal repo, which 
is not ideal.
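
To make the surefire point above more concrete: a surefire XML report identifies a failing test by its `classname` and `name` attributes and carries no source line number, so it is easy to list failing classes but harder to locate where an annotation should go. The sketch below is only illustrative; `failed_test_classes` and `SAMPLE_REPORT` are made up for this example and are not part of the PR.

```python
import xml.etree.ElementTree as ET
from typing import List

# Made-up, minimal report in the usual surefire/JUnit XML shape.
SAMPLE_REPORT = """
<testsuite name="org.example.TestFoo" tests="3" failures="1" errors="1">
  <testcase name="testPasses" classname="org.example.TestFoo" time="0.1"/>
  <testcase name="testFails" classname="org.example.TestFoo" time="0.2">
    <failure message="expected 1 but was 2" type="AssertionError"/>
  </testcase>
  <testcase name="testErrors" classname="org.example.TestBar" time="0.3">
    <error message="boom" type="java.lang.IllegalStateException"/>
  </testcase>
</testsuite>
"""


def failed_test_classes(report_xml: str) -> List[str]:
    """Return the sorted, de-duplicated class names of failed test cases."""
    root = ET.fromstring(report_xml)
    failed = set()
    for case in root.iter("testcase"):
        # A test case counts as failed if it has a <failure> or <error> child.
        if case.find("failure") is not None or case.find("error") is not None:
            failed.add(case.get("classname", "<unknown>"))
    return sorted(failed)
```

Calling `failed_test_classes(SAMPLE_REPORT)` yields the two class names with failing cases. Turning those into annotation positions would still require inspecting the Java sources.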
   
   ## What is the link to the Apache JIRA
   
   HDDS-15211
   
   ## How was this patch tested?
   
   I haven't done a dry run of this on my fork yet. This PR is mostly to see if 
the community is interested in this. If so, I can proceed with testing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

