Thank you Mikhail for looking into test failures and compiling the list!

> I cannot access this link. Is it publicly accessible?

It works for me, but it takes a while to show results.

> One general question: maybe it's a good idea to assign change
> authors/code owners to the issues? Or just reach them in jira
> comments?

While the authors should have a sense of ownership over the code, I
think it is enough for them to answer the Assignee's questions. They
shouldn't have to own the JIRA issue. This also increases
knowledge sharing.

> I believe such update sent daily or bi-daily can increase visibility
> for known failures, simplify search for people who can fix tests,
> and add nice tracking status.

Flaky tests should be fixed ASAP because they hinder development. +1 for
daily/bidaily notifications.

Cheers,
Max

On 16.08.18 10:46, Łukasz Gajowy wrote:
> Thank you for working on improving the situation with test failures! 
> 
> One general question: maybe it's a good idea to assign change
> authors/code owners to the issues? Or just reach them in jira comments?
> They know the code and they may be more likely to know solutions to
> failing tests or provide useful information (when swamped in other
> things). WDYT?
> 
> wt., 14 sie 2018 o 20:05 Mikhail Gryzykhin <mig...@google.com
> <mailto:mig...@google.com>> napisał(a):
> 
>     Hi everyone,
> 
>     We have seen an increased number of test job failures recently.
> 
>     In terms of numbers (based on my memory and http://35.226.225.164/):
>     Java precommits went down from ~55% to ~30% of succeeded jobs.
>     Java postcommits went down from ~60% to ~40% of succeeded jobs.
> 
> 
> I cannot access this link. Is it publicly accessible?
>  
> 
>     I'm currently triaging post-commit failures and wonder if it will be
>     useful to send regular updates on found issues and implemented fixes?
> 
>     What can be present in update:
>     * Tests greenness based on http://35.226.225.164/ (work on better
>     dashboard is in progress)
>     * List of Jira tickets with triaged failures with no owners
>     * List of Jira tickets in progress and who's working on fixes
>     * List of Jira tickets with fixes shipped
>      
> 
>     Each point can also have a short description of the failure reason.
> 
> 
> I think such report should be very brief and informative. IMO the report
> should contain the failures (as short summaries and a link to a JIRA
> ticket). Whoever's working on an issue should assign him/herself to the
> ticket and mark it as "IN PROGRESS" so there are no collisions between
> contributors fixing the tests. I don't see the need for listing the in
> progress issues (jira already shows that). List of fixed issues may show
> the progress, but I'd rather see a blank report with an empty failing
> tests list. :)
> 
> In fact, I think the list you showed in the previous message
> <https://issues.apache.org/jira/browse/BEAM-5122?jql=project%20%3D%20BEAM%20AND%20status%20in%20%28Open%2C%20%22In%20Progress%22%2C%20Reopened%29%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>
> will suffice.
>  
> 
> 
>     I believe such update sent daily or bi-daily can increase visibility
>     for known failures, simplify search for people who can fix tests,
>     and add nice tracking status.
> 
> 
> Aren't weekly reports enough? It may be hard to change a lot in a day
> (two days). 
>  
> 
> 
>     What do you think?
> 
>     Regards,
>     --Mikhail
> 
>     Have feedback <http://go/migryz-feedback>? 
> 
> 
>     On Fri, Aug 10, 2018 at 1:24 PM Mikhail Gryzykhin <mig...@google.com
>     <mailto:mig...@google.com>> wrote:
> 
>         Hi everyone,
> 
>         I'm following up on tackling post-commit tests greenness. (See
>         beam post-commit policies
>         <https://beam.apache.org/contribute/postcommits-policies/>)
> 
>         During this week, I've assembled a list of the most problematic
>         flaky or failing tests
>         <https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20%28Open%2C%20%22In%20Progress%22%2C%20Reopened%29%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>.
>         Unfortunately, I'm relatively new to the project and lack
>         triaging guides, so most of the tickets contain only basic information.
> 
>         _I want to ask the community for help in the following areas:_
>         1. If you know how to triage tests or the location of a triage
>         guide, please share the knowledge. You can post links here, or
>         add pages to the Confluence wiki
>         <https://cwiki.apache.org/confluence/display/BEAM/> and share
>         the link here.
>         2. Please check the Jira test-failures list
>         <https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20%28Open%2C%20%22In%20Progress%22%2C%20Reopened%29%20AND%20resolution%20%3D%20Unresolved%20AND%20component%20%3D%20test-failures%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC>
>         and pick up tests that you might know how to fix. Tickets
>         that do not have an owner are not being worked on. I'm
>         trying out easy mitigations for some of the failures
>         (e.g., increasing timeouts), but those should not be
>         treated as fixes.
> 
>         _Current status:_
>         Items that are marked critical in the failures list tend to fail
>         jobs in ~5-10% of runs each.
> 
>         I contacted Anton Kedin directly and he is currently working on
>         fixes for a couple of the most problematic flakes. Anton, thank
>         you for picking those up.
> 
>         Please update the owner and status of a ticket if you start
>         working on a test failure; this will save time for others who
>         might also start looking into the failure.
> 
>         Thank you,
>         --Mikhail
> 
>         Have feedback <http://go/migryz-feedback>? 
> 
