Re: RFR: 8287366: Improve test failure reporting in GHA

2022-06-06 Thread Magnus Ihse Bursie
On Mon, 6 Jun 2022 12:57:25 GMT, Jaikiran Pai wrote:

>> With this PR, an overview of the failures is presented on the "Summary" page 
>> for the action (the top-most line to the left, with the outline house icon).
> 
> @magicus, thank you. This is really useful. I didn't even know that this 
> "Summary" page existed. I now checked this page on one of my PRs (which 
> includes this commit) and it does indeed make it much simpler to analyze 
> these failures.

@jaikiran Thanks for the kind words. I think I should perhaps do some tweaking 
to the Skara bots that link to the GHA runs, so it is easier to get to the 
summary page.

-

PR: https://git.openjdk.java.net/jdk/pull/8901


Re: RFR: 8287366: Improve test failure reporting in GHA

2022-06-06 Thread Jaikiran Pai
On Thu, 26 May 2022 12:04:41 GMT, Magnus Ihse Bursie wrote:

> With this PR, an overview of the failures is presented on the "Summary" page 
> for the action (the top-most line to the left, with the outline house icon).

@magicus, thank you. This is really useful. I didn't even know that this 
"Summary" page existed. I now checked this page on one of my PRs (which 
includes this commit) and it does indeed make it much simpler to analyze these 
failures.

-

PR: https://git.openjdk.java.net/jdk/pull/8901


Re: RFR: 8287366: Improve test failure reporting in GHA

2022-05-29 Thread Christoph Langer

This is a great improvement to GHA. I'm also looking forward to your 
de-duplication efforts and basing the windows steps on cygwin to benefit from 
the error handling there as well. Thanks for doing this!

-

Marked as reviewed by clanger (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8901


RFR: 8287366: Improve test failure reporting in GHA

2022-05-26 Thread Magnus Ihse Bursie
It is currently both tricky and tedious to figure out what went wrong when a 
jtreg test fails in GHA.

We should utilize the full potential of GitHub Actions summaries and error 
annotations to make finding failures easier and more discoverable.

With this PR, an overview of the failures is presented on the "Summary" page for 
the action (the top-most line to the left, with the outline house icon). Below 
the `submit.yml` dependency graph, you'll find the annotations, which will look 
like this:


Linux x86 (jdk/tier1 part 1)
Test run reported 34 test failure(s) and 0 error(s). See summary for details.
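
An annotation like the one above can be produced with a GitHub Actions `::error::` workflow command printed to the job's standard output. What follows is a minimal Python sketch of how such a line might be formatted. It is an illustration only, not the code from this PR, and the function name `error_annotation` is made up for the example.

```python
def error_annotation(failures: int, errors: int) -> str:
    """Format an `::error::` workflow command for one test job.

    Hypothetical helper: the message wording is copied from the
    annotation shown above, everything else is an assumption.
    """
    message = (
        f"Test run reported {failures} test failure(s) and {errors} error(s). "
        "See summary for details."
    )
    # Workflow commands are single-line, so percent signs and line breaks
    # in the message have to be escaped.
    escaped = (
        message.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")
    )
    return f"::error::{escaped}"


if __name__ == "__main__":
    # Printing the command is what makes GitHub record the annotation.
    print(error_annotation(34, 0))
```

The job name shown above the message (here `Linux x86 (jdk/tier1 part 1)`) is supplied by GitHub for the job that printed the command; it is not part of the command itself.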


Below the annotations follow the summaries. Go have a look at the runs for this 
PR to see what it looks like! In short, there is a separate summary per test 
job. The first part lists the names of the failed tests. This will always be 
included. Below this (with links from the summary list) is detailed 
information for each failed test. This includes the jtreg output and the 
`hs_err` file(s), if present. GitHub limits the latter part to 1 MB. If this 
limit is exceeded, no detailed information at all is presented (sorry 'bout 
that; GitHub's rules).
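
The "always list the names, drop the details if they are too large" behaviour can be sketched roughly as below. This is a hedged Python illustration, not the implementation in `submit.yml`: the function names, the input shape (a mapping from test name to detailed output), the sample test name and the Markdown/HTML layout are all assumptions, and the links from the list to the details are left out. `GITHUB_STEP_SUMMARY` is the environment variable GitHub Actions provides for writing a step summary.

```python
import html
import os

# Assumed value for the 1 MB limit mentioned above.
SUMMARY_LIMIT_BYTES = 1024 * 1024


def build_summary(failed_tests, limit=SUMMARY_LIMIT_BYTES):
    """Return the summary text for one test job.

    failed_tests maps a test name to its detailed output, e.g. jtreg
    output plus any hs_err content (hypothetical input shape).
    """
    # First part: the names of the failed tests. Always included.
    overview = "### Failed tests\n\n" + "".join(
        f"- {name}\n" for name in failed_tests
    )
    # Second part: one collapsible section per failed test.
    details = "".join(
        f"\n<details><summary>{html.escape(name)}</summary>\n"
        f"<pre>{html.escape(output)}</pre>\n</details>\n"
        for name, output in failed_tests.items()
    )
    full = overview + details
    # Graceful degradation: if the details push the summary over the
    # limit, keep only the list of names.
    if len(full.encode("utf-8")) > limit:
        return overview
    return full


def write_summary(text):
    """Append text to the step summary file, if GitHub provided one."""
    path = os.environ.get("GITHUB_STEP_SUMMARY")
    if path:
        with open(path, "a", encoding="utf-8") as f:
            f.write(text)


if __name__ == "__main__":
    write_summary(build_summary({"some/Test.java": "jtreg output here"}))
```

Checking the size before writing keeps the list of failed tests visible even when the details cannot be shown.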

This PR is deliberately based on a commit prior to the fix for JDK-8287137 
(Problemlist failing x86_32 tests after Loom integration), so you can see for 
yourself how the GHA runs look in case of a "train wreck" testing situation, 
like on x86 after Loom. As you can see, most of the output parts of the 
summaries got larger than the 1 MB limit, which means they were not shown. Only 
the summary for `Linux x86 (hs/tier1 runtime)` displays as intended. OTOH, this 
shows that the system has a "graceful degradation" mode even for a large number 
of failures like this. And, since I don't see a Loom v2.0 coming anytime soon, 
I believe this number of failed tests is unlikely to be a realistic scenario.

Finally: the duplication in `submit.yml` is really, really annoying. :-( I have 
copied the same code block to three places. The fourth place, for Windows, does 
not get any support at this time. Concurrently with this change, I have started 
a separate branch where I split up `submit.yml` into reusable parts, using 
"callable workflows" and "custom actions". As part of this effort, I will also 
change the Windows jobs to use Cygwin bash instead of PowerShell. Until then, I 
could not be bothered to even think about implementing this functionality in 
PowerShell. When that change is integrated, Windows will get this functionality 
for free, too.

-

Commit messages:
 - Extra commit to re-trigger the GHA
 - 8287366: Improve test failure reporting in GHA

Changes: https://git.openjdk.java.net/jdk/pull/8901/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8901&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8287366
  Stats: 285 lines in 1 file changed: 264 ins; 0 del; 21 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8901.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8901/head:pull/8901

PR: https://git.openjdk.java.net/jdk/pull/8901