On 05/09/2019 17:24, Thomas Mortagne wrote:
On Thu, Sep 5, 2019 at 3:43 PM Simon Urli <simon.u...@xwiki.com> wrote:

Hi everyone,

Reopening this thread since I started to close some flicker issues as
part of BFD and got comments on those.

So the last mails on this thread suggested closing the flicker issues
if we didn't manage to reproduce them locally after repeated tests,
and if we hadn't seen them for a while.

We didn't vote on those suggestions, and I assumed a bit too quickly that I
could close some flicker issues that I personally don't remember seeing
on the CI, after having tested them locally.
My reason for doing that is the same as in the first mail I posted on
this thread: those flickers are old, and the code has changed enough for
them to have been fixed one way or another.

Being old does not always mean the code leading to those failures
changed that much.


Now I might be completely wrong, and the flickers might happen again, but I
don't think it's a problem since we can really easily reopen the
issues if that's the case.

The other solution IMO is indeed to keep the issues open and in fact to
never really close them, because we just don't have time to investigate
each of them properly.

I really don't see any value in keeping things open without acting on
them; that's why I suggest closing them after doing the checks we
suggested before:
    1. try to repeat locally the failure;

This is totally useless IMO unless you make sure that your computer is
made super slow in some way, since slowness is the reason for most of the
flickering tests.

    2. check that we didn't encounter those flickers since last cycle.

This one is enough for me but the hard part is knowing that.

OK, so the proposal is now to check only how long ago we last saw each open flicker before closing it.
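
To make that check concrete, here is a minimal sketch of an age-based filter. Everything in it is an assumption for illustration: the `flickers_safe_to_close` name, the mapping of issue keys to last-seen dates, and the 90-day default (the thread does not say how long a cycle is).

```python
from datetime import date, timedelta


def flickers_safe_to_close(last_seen, today, max_age=timedelta(days=90)):
    """Return the issue keys whose flicker was last seen more than max_age ago.

    last_seen maps an issue key to the date the test was last seen failing.
    max_age stands in for "one cycle"; 90 days is a placeholder, not a value
    taken from the discussion.
    """
    return sorted(key for key, seen in last_seen.items() if today - seen > max_age)
```

The hard part mentioned above remains: the last-seen dates have to come from somewhere (a JIRA field, a comment, or a CI report), which this sketch does not address.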



So first question, do we all agree on that?

Then for the second check, Vincent suggested adding some tooling: that
would be best, but it takes time to do. So in the meantime, as Thomas
also suggested, we could add a step in the release plan to create or
update all JIRA issues that concern flickers. It would allow us to keep
some information about the liveness of our flickers.

So second question, do you agree on that?

Depends on what exactly it means. Have some dedicated JIRA field to
indicate when you saw it last? Comment that you just saw that test
failing again?

My suggestion was about a dedicated JIRA field if possible.


Other useful and slightly more automated tricks not requiring much tooling:
* increase the currently very low history (10). The reason it's that
low is the many performance issues we had in the past with old
style jobs, but those most probably don't apply anymore so we should
increase the number now IMO (30?)

+1
* create a pipeline job which executes platform master integration
tests once a day with http://cpulimit.sourceforge.net (looks fun) and
keeps a big history but without storing stuff like videos and images (100
?)


Not sure what you want there: to have a test execution where you control the slowness? To detect all problems we might have because of a slow server?


Final question: for the flickers that I closed today, I relied mainly on
my memory for the second check and on their age: I closed the older ones.

So what should we do with them?

My concern with them is that the reason you gave to close them (that
you cannot reproduce them locally) was not valid IMO. If you say some
test has not failed for a long time then fine, and if what some test is
about has been completely rewritten then fine too, but that's not what
you indicated :)

What I'm actually saying is that, to my knowledge, the tests I closed have not failed for a long time. I didn't check the code for the tests, except for one, and I commented about it.


If your memory is only related to tests being checked just before a
release I'm not sure this is good enough.


Not really the case since I check the CI regularly. Now I'm not sure it's good enough either :) As I said, we can also reopen later if needed.


Thanks,
Simon

On 26/03/2019 10:58, Vincent Massol wrote:


On 26 Mar 2019, at 10:31, Simon Urli <simon.u...@xwiki.com> wrote:

Hi everyone,

I was checking our list of flickering tests in JIRA 
(https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC)
 and I noticed that we have some fairly old flickering test issues concerning tests 
that I've never seen failing.

So I propose we close some of them as inactive: the ones that we don't remember 
having seen for a while. The ideal would be to have a mechanism to update the 
issue when the CI fails on a flicker, but it takes time to do properly and it's 
not a priority.

Instead, I propose to trust our memory: if we're wrong because we have 
closed a flicker that is still happening, it will remind us that we 
have this flicker to fix and we can easily reopen the issue.

As Thomas mentioned on the chat, we should also update the release plan to 
include the inactive flickers in the list of issues to check.

I should be able to easily create a report when any test fails inside our 
jenkins pipeline and make it available similar to our clover report. I could 
indicate if it’s a known flicker or not too in this report. That could 
compensate for the fact that we only keep 7 days of records in our jobs.

Would need to define the report format, whether it’s the same file updated at 
each run or a different one. If the same one, then either:
* I’d need to parse it first in memory, add the new tests and overwrite the file
* or add to the bottom of the file which will grow quite large quickly
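
The two options above can be sketched roughly as follows. This is only an illustration under assumptions: the JSON and JSON-lines formats, the function names, and the `known_flickers` set are all hypothetical, and none of this is wired into the Jenkins pipeline.

```python
import json
from datetime import date
from pathlib import Path


def record_failures_overwrite(report: Path, failed_tests, known_flickers, today: date) -> dict:
    """Option 1: parse the existing report in memory, merge, and overwrite it.

    The file stays small (one entry per test) but must be read on every run.
    """
    data = json.loads(report.read_text()) if report.exists() else {}
    for test in failed_tests:
        entry = data.setdefault(test, {"count": 0})
        entry["count"] += 1
        entry["last_seen"] = today.isoformat()
        entry["known_flicker"] = test in known_flickers
    report.write_text(json.dumps(data, indent=2, sort_keys=True))
    return data


def record_failures_append(report: Path, failed_tests, known_flickers, today: date) -> None:
    """Option 2: append one JSON line per failure; nothing is parsed,
    but the file grows with every failing run."""
    with report.open("a") as out:
        for test in failed_tests:
            line = {
                "test": test,
                "seen": today.isoformat(),
                "known_flicker": test in known_flickers,
            }
            out.write(json.dumps(line) + "\n")
```

With the first option the "last seen" date per test is directly available, which is the information needed to decide whether a flicker is inactive; with the second it has to be computed by scanning the whole file.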

WDYT?

Thanks
-Vincent


So for now I propose to close the following list of issues as inactive:

   * XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering 
(https://jira.xwiki.org/browse/XWIKI-14399)
   * XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering 
(https://jira.xwiki.org/browse/XWIKI-14396)
   * XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x 
(xwiki-enterprise-test-ui) is flaky (https://jira.xwiki.org/browse/XWIKI-14394)
   * XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is 
possibly flaky (https://jira.xwiki.org/browse/XWIKI-14386)
   * XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is 
flickering (https://jira.xwiki.org/browse/XWIKI-14835)
   * XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering 
(https://jira.xwiki.org/browse/XWIKI-14860)

And I propose in general to close the flickers we don't remember having seen 
after a cycle as inactive.

WDYT?

Simon
--
Simon Urli
Software Engineer at XWiki SAS
simon.u...@xwiki.com
More about us at http://www.xwiki.com