Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
> On 6 Sep 2019, at 10:58, Thomas Mortagne wrote:
>
> On Fri, Sep 6, 2019 at 10:46 AM Vincent Massol wrote:
>>
>>> On 6 Sep 2019, at 10:42, Thomas Mortagne wrote:
>>>
>>> And why do you guys think about raising the history to 30, at least
>>> for the platform pipeline jobs?
>>
>> We first need to check if we have enough disk space, it’s going to
>> consume a lot of it.
>
> ci.xwiki.org currently uses 393G (total, not just Jenkins) and has
> 1.3T available. I checked a few history entries of xwiki-platform master
> and the scale seems to be about 70M (90% of which is the log file)
> with something like 10 failing tests, so I think we are safe for a
> little while.

I hope so, because I remember that I had to remove the Jenkins logs twice over the summer. It’s a different problem but both use up disk space.

We need to fix the log issue BTW.

Thanks
-Vincent

>> Also, we would need to monitor closely for perf issues and roll it back if
>> it doesn’t go well.
>
> Of course but that's easy.
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On Fri, Sep 6, 2019 at 10:46 AM Vincent Massol wrote:
>
> > On 6 Sep 2019, at 10:42, Thomas Mortagne wrote:
> >
> > And why do you guys think about raising the history to 30, at least
> > for the platform pipeline jobs?
>
> We first need to check if we have enough disk space, it’s going to
> consume a lot of it.

ci.xwiki.org currently uses 393G (total, not just Jenkins) and has 1.3T available. I checked a few history entries of xwiki-platform master and the scale seems to be about 70M (90% of which is the log file) with something like 10 failing tests, so I think we are safe for a little while.

> Also, we would need to monitor closely for perf issues and roll it back if it
> doesn’t go well.

Of course but that's easy.

> Thanks
> -Vincent

--
Thomas Mortagne
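For a rough sense of scale, the figures in this message can be turned into a back-of-the-envelope estimate. The sketch below is only illustrative: it assumes the 70M per history entry holds for every job, reads "1.3T available" as free space, uses decimal units, and the job counts in the loop are hypothetical, since the thread does not say how many jobs would get the longer history.

```python
# Figures quoted in the thread, in decimal units.
AVAILABLE_GB = 1300   # "1.3T available" (assumed to mean free space)
ENTRY_MB = 70         # approximate size of one xwiki-platform master history entry
LOG_SHARE = 0.9       # "90% of which is the log file"
HISTORY = 30          # proposed number of history entries kept per job


def history_gb(jobs: int, history: int = HISTORY, entry_mb: float = ENTRY_MB) -> float:
    """Disk space in GB used when each of `jobs` jobs keeps `history` entries."""
    return jobs * history * entry_mb / 1000


def max_jobs_with_history(available_gb: float = AVAILABLE_GB,
                          history: int = HISTORY,
                          entry_mb: float = ENTRY_MB) -> int:
    """How many jobs could keep `history` entries before the free space runs out."""
    return int(available_gb * 1000 // (history * entry_mb))


if __name__ == "__main__":
    for jobs in (1, 10, 50):  # hypothetical job counts
        total = history_gb(jobs)
        print(f"{jobs} job(s): {total:.1f} GB, of which about {total * LOG_SHARE:.1f} GB is logs")
    print("jobs that would fit:", max_jobs_with_history())
```

Because most of each entry is the log file, the estimate is dominated by log size, which is why the log issue raised elsewhere in the thread matters for the same disk.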
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On 06/09/2019 10:44, Vincent Massol wrote:
>> On 6 Sep 2019, at 10:42, Thomas Mortagne wrote:
>>
>> And why do you guys think about raising the history to 30, at least
>> for the platform pipeline jobs?
>
> We first need to check if we have enough disk space, it’s going to
> consume a lot of it.
>
> Also, we would need to monitor closely for perf issues and roll it back if
> it doesn’t go well.

+1 in general, but yeah we have to be careful, especially now that we have a pretty stable CI :)

> Thanks
> -Vincent

--
Simon Urli
Software Engineer at XWiki SAS
simon.u...@xwiki.com
More about us at http://www.xwiki.com
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
> On 6 Sep 2019, at 10:42, Thomas Mortagne wrote:
>
> And why do you guys think about raising the history to 30, at least
> for the platform pipeline jobs?

We first need to check if we have enough disk space, it’s going to consume a lot of it.

Also, we would need to monitor closely for perf issues and roll it back if it doesn’t go well.

Thanks
-Vincent
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
s/why/what/

Not good with mails today...

On Fri, Sep 6, 2019 at 10:42 AM Thomas Mortagne wrote:
>
> And why do you guys think about raising the history to 30, at least
> for the platform pipeline jobs?

--
Thomas Mortagne
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
And why do you guys think about raising the history to 30, at least for the platform pipeline jobs?

On Fri, Sep 6, 2019 at 10:39 AM Vincent Massol wrote:
>
> > On 6 Sep 2019, at 10:35, Thomas Mortagne wrote:
> >
> > In my mind the idea is not so much to use this field as a criterion to
> > close an issue but as a criterion to not close it.
>
> ok, as long as we don’t use it for closing, I’m fine :)
>
> Thanks
> -Vincent

--
Thomas Mortagne
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
> On 6 Sep 2019, at 10:35, Thomas Mortagne wrote:
>
> On Fri, Sep 6, 2019 at 10:32 AM Vincent Massol wrote:
>>
>>> On 6 Sep 2019, at 10:27, Simon Urli wrote:
>>>
>>> So, ok if I create a new custom field in JIRA for flickers, called "Date of
>>> last failure for flicker"?
>>
>> [snip]
>>
>> I don’t see how it’ll help since it’ll never be up to date, and the old
>> value will remain making us think it’s not been flickering for a long time.
>
> In my mind the idea is not so much to use this field as a criterion to
> close an issue but as a criterion to not close it.

ok, as long as we don’t use it for closing, I’m fine :)

Thanks
-Vincent
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On 06/09/2019 10:35, Thomas Mortagne wrote:
> On Fri, Sep 6, 2019 at 10:32 AM Vincent Massol wrote:
>>
>> Hi Simon,
>>
>>> On 6 Sep 2019, at 10:27, Simon Urli wrote:
>>>
>>> Hi all,
>>>
>>> On 05/09/2019 17:40, Simon Urli wrote:
>>>> On 05/09/2019 17:24, Thomas Mortagne wrote:
>>>>> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
>>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> reopening this thread since I started to close some flicker issues as
>>>>>> part of BFD and got comments for those.
>>>>>>
>>>>>> So the last mails on this thread suggested to close the flicker issues
>>>>>> if we didn't manage to reproduce them locally after repeated tests,
>>>>>> and that we didn't see them after a while.
>>>>>>
>>>>>> We didn't vote for those suggestions and I assumed a bit quickly that I
>>>>>> could close some flicker issues that I personally don't remember seeing
>>>>>> on the CI after having tested them locally.
>>>>>> My point for doing that is the same as for the first mail I posted on
>>>>>> this thread: those flickers are old, and the code did change enough for
>>>>>> those to be fixed in one way or another.
>>>>>
>>>>> Being old does not always mean the code leading to those failures
>>>>> changed that much.
>>>>>
>>>>>> Now I might be completely wrong, and the flickers may happen again, but I
>>>>>> don't think it's a problem since we can really easily reopen the
>>>>>> issues if it's the case.
>>>>>>
>>>>>> The other solution IMO is to indeed keep the issues open and in fact to
>>>>>> never really close them, because we just don't have time to investigate
>>>>>> each of them properly.
>>>>>>
>>>>>> I really don't see any value in keeping things open and not acting on
>>>>>> them, that's why I suggest to close them after doing the checks we
>>>>>> suggested before:
>>>>>>    1. try to repeat locally the failure;
>>>>>
>>>>> This is totally useless IMO unless you make sure that your computer is
>>>>> made super slow some way since that's the reason for most of the
>>>>> flickering tests.
>>>>>
>>>>>>    2. check that we didn't encounter those flickers since last cycle.
>>>>>
>>>>> This one is enough for me but the hard part is knowing that.
>>>>
>>>> Ok, so the proposal is now to check only how long it has been since we
>>>> last saw the open flickers before closing them.
>>>>
>>>>>> So first question, do we all agree on that?
>>>>>>
>>>>>> Then for the second check, Vincent suggested to add some tooling: it
>>>>>> will be best, but it takes time to do. So in the meantime, as Thomas
>>>>>> also suggested, we could add a check in the release plan to create or
>>>>>> update all jira issues that concern flickers. It would allow us to keep
>>>>>> some information about the liveness of our flickers.
>>>>>>
>>>>>> So second question, do you agree on that?
>>>>>
>>>>> Depends what it exactly means. Have some dedicated jira field to
>>>>> indicate when you saw it last? Comment that you just saw that test
>>>>> failing again?
>>>>
>>>> My suggestion was about a dedicated JIRA field if possible.
>>>
>>> So, ok if I create a new custom field in JIRA for flickers, called "Date of
>>> last failure for flicker"?
>>
>> [snip]
>>
>> I don’t see how it’ll help since it’ll never be up to date, and the old
>> value will remain making us think it’s not been flickering for a long time.
>
> In my mind the idea is not so much to use this field as a criterion to
> close an issue but as a criterion to not close it.

That, and I also wanted to propose an update in the release plan for the first point:

  Verify that no tests are failing on the CI Server (or that failures are understood and update their date of last seen failure, see known flickering tests).

>> Thanks
>> -Vincent

--
Simon Urli
Software Engineer at XWiki SAS
simon.u...@xwiki.com
More about us at http://www.xwiki.com
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On Fri, Sep 6, 2019 at 10:32 AM Vincent Massol wrote:
>
> Hi Simon,
>
> > On 6 Sep 2019, at 10:27, Simon Urli wrote:
> >
> > Hi all,
> >
> > On 05/09/2019 17:40, Simon Urli wrote:
> >> On 05/09/2019 17:24, Thomas Mortagne wrote:
> >>> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
> >>>> Hi everyone,
> >>>>
> >>>> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
> >>>>
> >>>> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
> >>>>
> >>>> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
> >>>> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
> >>>
> >>> Being old does not always means the code leading to those failures changed that much.
> >>>
> >>>> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
> >>>>
> >>>> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
> >>>>
> >>>> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
> >>>> 1. try to repeat locally the failure;
> >>>
> >>> This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
> >>>
> >>>> 2. check that we didn't encounter those flickers since last cycle.
> >>>
> >>> This one is enough for me but the hard part is to knowing that.
> >>
> >> Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.
> >>
> >>>> So first question, do we all agree on that?
> >>>>
> >>>> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
> >>>>
> >>>> So second question, do you agree on that?
> >>>
> >>> Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?
> >>
> >> My suggestion was about a dedicated JIRA field if possible.
> >
> > So, ok if I create a new custom field in JIRA for flickers, called "Date of last failure for flicker"?
>
> [snip]
>
> I don’t see how it’ll help since it’ll never be up to date, and the old value will remain making us think it’s not been flickering for a long time.

In my mind the idea is not so much to use this field as a criteria to close an issue but as a criteria to not close it.

>
> Thanks
> -Vincent

--
Thomas Mortagne
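Editorial sketch of the rule discussed above, where a "last seen failing" date is a reason to keep an issue open and never, on its own, a reason to close it. This is only an illustration: the function name `triage`, the result labels, and the 30-day cycle length are hypothetical and were not agreed on in the thread.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical threshold, roughly one release cycle. The thread does not fix a value.
CYCLE = timedelta(days=30)


def triage(last_seen_failing: Optional[date], today: date,
           cycle: timedelta = CYCLE) -> str:
    """Classify an open flicker issue from its "last seen failing" date.

    A recent date is evidence the flicker is alive, so the issue stays open.
    An old or missing date proves nothing (the field may simply be stale),
    so the function never answers "close"; it only asks for a CI check.
    """
    if last_seen_failing is not None and today - last_seen_failing <= cycle:
        return "keep-open"
    return "needs-ci-check"


if __name__ == "__main__":
    today = date(2019, 9, 6)
    print(triage(date(2019, 9, 1), today))   # seen failing five days ago
    print(triage(date(2019, 3, 26), today))  # old value, possibly stale
    print(triage(None, today))               # field never filled in
```

The asymmetry is deliberate: it addresses the objection that a field which is never updated would make an active flicker look dead.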
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On Fri, Sep 6, 2019 at 10:27 AM Simon Urli wrote:
>
> Hi all,
>
> On 05/09/2019 17:40, Simon Urli wrote:
> > On 05/09/2019 17:24, Thomas Mortagne wrote:
> >> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
> >>>
> >>> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
> >>>
> >>> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
> >>> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
> >>
> >> Being old does not always means the code leading to those failures changed that much.
> >>
> >>> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
> >>>
> >>> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
> >>>
> >>> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
> >>> 1. try to repeat locally the failure;
> >>
> >> This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
> >>
> >>> 2. check that we didn't encounter those flickers since last cycle.
> >>
> >> This one is enough for me but the hard part is to knowing that.
> >
> > Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.
> >
> >>> So first question, do we all agree on that?
> >>>
> >>> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
> >>>
> >>> So second question, do you agree on that?
> >>
> >> Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?
> >
> > My suggestion was about a dedicated JIRA field if possible.
>
> So, ok if I create a new custom field in JIRA for flickers, called "Date of last failure for flicker"?

"Last seen failing" might be more accurate since we don't actually know when was the last failure.

+1 in general for the field

> >> Other useful and a little more automated tricks not requiring much tooling:
> >> * increase the currently very low history (10). The reason it's that low is because of many performances issues we had in the past with old style jobs but those most probably don't apply anymore so we should increase the number now IMO (30 ?)
> >
> > +1
> >
> >> * create a pipeline job which execute platform master integration tests once a day with http://cpulimit.sourceforge.net (looks fun) and keep a big history but not storing stuff like videos and images (100 ?)
> >
> > Not sure what you want there: to have a test execution where you master the slowness? to detect all problems we might have because of a slow server?
> >
> >>> Final question: for the flickers that I closed today, I relied mainly on my memory for the second check and on their age: I closed the older ones.
> >>>
> >>> So what should we do on them?
> >>
> >> My concern with them is that the reason you gave to close them (that you cannot reproduce them locally) was not valid IMO. If you say some test did not failed since a long time then fine, if what some test is about has completely been rewritten then fine too but that's not what you indicated :)
> >
> > I actually say that in my knowledge the test I closed did not failed since a long time. I didn't checked the code for the tests, except for one and I commented about it.
> >
> >> If your memory is only related tests being checked just before a release I'm not sure this is good enough.
> >
> > Not really the case since I check regularly the CI. Now I'm not sure it's good enoug either :) Now as I said, we can reopen also later if needed.
> >
> >>> Thanks,
> >>> Simon
> >>>
> >>> On 26/03/2019 10:58, Vincent Massol w
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
Hi Simon,

> On 6 Sep 2019, at 10:27, Simon Urli wrote:
>
> Hi all,
>
> On 05/09/2019 17:40, Simon Urli wrote:
>> On 05/09/2019 17:24, Thomas Mortagne wrote:
>>> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
>>>> Hi everyone,
>>>>
>>>> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
>>>>
>>>> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
>>>>
>>>> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
>>>> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
>>>
>>> Being old does not always means the code leading to those failures changed that much.
>>>
>>>> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
>>>>
>>>> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
>>>>
>>>> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
>>>> 1. try to repeat locally the failure;
>>>
>>> This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
>>>
>>>> 2. check that we didn't encounter those flickers since last cycle.
>>>
>>> This one is enough for me but the hard part is to knowing that.
>>
>> Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.
>>
>>>> So first question, do we all agree on that?
>>>>
>>>> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
>>>>
>>>> So second question, do you agree on that?
>>>
>>> Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?
>>
>> My suggestion was about a dedicated JIRA field if possible.
>
> So, ok if I create a new custom field in JIRA for flickers, called "Date of last failure for flicker"?

[snip]

I don’t see how it’ll help since it’ll never be up to date, and the old value will remain making us think it’s not been flickering for a long time.

Thanks
-Vincent
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
Hi all,

On 05/09/2019 17:40, Simon Urli wrote:
> On 05/09/2019 17:24, Thomas Mortagne wrote:
>> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
>>> Hi everyone,
>>>
>>> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
>>>
>>> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
>>>
>>> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
>>> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
>>
>> Being old does not always means the code leading to those failures changed that much.
>>
>>> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
>>>
>>> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
>>>
>>> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
>>> 1. try to repeat locally the failure;
>>
>> This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
>>
>>> 2. check that we didn't encounter those flickers since last cycle.
>>
>> This one is enough for me but the hard part is to knowing that.
>
> Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.
>
>>> So first question, do we all agree on that?
>>>
>>> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
>>>
>>> So second question, do you agree on that?
>>
>> Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?
>
> My suggestion was about a dedicated JIRA field if possible.

So, ok if I create a new custom field in JIRA for flickers, called "Date of last failure for flicker"?

>> Other useful and a little more automated tricks not requiring much tooling:
>> * increase the currently very low history (10). The reason it's that low is because of many performances issues we had in the past with old style jobs but those most probably don't apply anymore so we should increase the number now IMO (30 ?)
>
> +1
>
>> * create a pipeline job which execute platform master integration tests once a day with http://cpulimit.sourceforge.net (looks fun) and keep a big history but not storing stuff like videos and images (100 ?)
>
> Not sure what you want there: to have a test execution where you master the slowness? to detect all problems we might have because of a slow server?
>
>>> Final question: for the flickers that I closed today, I relied mainly on my memory for the second check and on their age: I closed the older ones.
>>>
>>> So what should we do on them?
>>
>> My concern with them is that the reason you gave to close them (that you cannot reproduce them locally) was not valid IMO. If you say some test did not failed since a long time then fine, if what some test is about has completely been rewritten then fine too but that's not what you indicated :)
>
> I actually say that in my knowledge the test I closed did not failed since a long time. I didn't checked the code for the tests, except for one and I commented about it.
>
>> If your memory is only related tests being checked just before a release I'm not sure this is good enough.
>
> Not really the case since I check regularly the CI. Now I'm not sure it's good enoug either :) Now as I said, we can reopen also later if needed.
>
>>> Thanks,
>>> Simon
>>>
>>> On 26/03/2019 10:58, Vincent Massol wrote:
>>>>> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I was checking our list of flickering tests in JIRA (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC) and I noticed that we had somehow old flickering test issue concerning test that I've never seen failing.
>>>>>
>>>>> So I propose we close some of them as inactive: the ones that we don't remember having seen for a while. The ideal would be to have a mechanism to update the issue when the CI fails on a flicker, but it takes time to do properly and it's not a priority.
>>>>>
>>>>> On the contrary I propose to trust our memory: if we're wrong because we have closed a flicker that is stil
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On Thu, Sep 5, 2019 at 5:40 PM Simon Urli wrote:
>
> On 05/09/2019 17:24, Thomas Mortagne wrote:
> > On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
> >>
> >> Hi everyone,
> >>
> >> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
> >>
> >> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
> >>
> >> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
> >> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
> >
> > Being old does not always means the code leading to those failures changed that much.
> >
> >> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
> >>
> >> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
> >>
> >> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
> >> 1. try to repeat locally the failure;
> >
> > This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
> >
> >> 2. check that we didn't encounter those flickers since last cycle.
> >
> > This one is enough for me but the hard part is to knowing that.
>
> Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.
>
> >> So first question, do we all agree on that?
> >>
> >> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
> >>
> >> So second question, do you agree on that?
> >
> > Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?
>
> My suggestion was about a dedicated JIRA field if possible.
>
> > Other useful and a little more automated tricks not requiring much tooling:
> > * increase the currently very low history (10). The reason it's that low is because of many performances issues we had in the past with old style jobs but those most probably don't apply anymore so we should increase the number now IMO (30 ?)
>
> +1
>
> > * create a pipeline job which execute platform master integration tests once a day with http://cpulimit.sourceforge.net (looks fun) and keep a big history but not storing stuff like videos and images (100 ?)
>
> Not sure what you want there: to have a test execution where you master the slowness? to detect all problems we might have because of a slow server?

Have a big history of slowly executed tests so that we can very quickly see if some failing test in standard builds or in jira issues is still a thing and is failing for speed reasons (this also help fixing those tests and making sure they are actually fixed when you try something).

> >> Final question: for the flickers that I closed today, I relied mainly on my memory for the second check and on their age: I closed the older ones.
> >>
> >> So what should we do on them?
> >
> > My concern with them is that the reason you gave to close them (that you cannot reproduce them locally) was not valid IMO. If you say some test did not failed since a long time then fine, if what some test is about has completely been rewritten then fine too but that's not what you indicated :)
>
> I actually say that in my knowledge the test I closed did not failed since a long time. I didn't checked the code for the tests, except for one and I commented about it.
>
> > If your memory is only related tests being checked just before a release I'm not sure this is good enough.
>
> Not really the case since I check regularly the CI. Now I'm not sure it's good enoug either :) Now as I said, we can reopen also later if needed.
>
> >> Thanks,
> >> Simon
> >>
> >> On 26/03/2019 10:58, Vincent Massol wrote:
> >>>> On 26 Mar 2019, at 10:31, Simon Urli wrote:
> >>>>
> >>>> Hi everyone,
> >>>>
> >>>> I was checking our list of flickering tests in JIRA
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On 05/09/2019 17:24, Thomas Mortagne wrote:
> On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
>> Hi everyone,
>>
>> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
>>
>> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
>>
>> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
>> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.
>
> Being old does not always means the code leading to those failures changed that much.
>
>> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
>>
>> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
>>
>> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
>> 1. try to repeat locally the failure;
>
> This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.
>
>> 2. check that we didn't encounter those flickers since last cycle.
>
> This one is enough for me but the hard part is to knowing that.

Ok, so the proposal is now to check only the age since last time we saw them of the open flickers before closing them.

>> So first question, do we all agree on that?
>>
>> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
>>
>> So second question, do you agree on that?
>
> Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?

My suggestion was about a dedicated JIRA field if possible.

> Other useful and a little more automated tricks not requiring much tooling:
> * increase the currently very low history (10). The reason it's that low is because of many performances issues we had in the past with old style jobs but those most probably don't apply anymore so we should increase the number now IMO (30 ?)

+1

> * create a pipeline job which execute platform master integration tests once a day with http://cpulimit.sourceforge.net (looks fun) and keep a big history but not storing stuff like videos and images (100 ?)

Not sure what you want there: to have a test execution where you master the slowness? to detect all problems we might have because of a slow server?

>> Final question: for the flickers that I closed today, I relied mainly on my memory for the second check and on their age: I closed the older ones.
>>
>> So what should we do on them?
>
> My concern with them is that the reason you gave to close them (that you cannot reproduce them locally) was not valid IMO. If you say some test did not failed since a long time then fine, if what some test is about has completely been rewritten then fine too but that's not what you indicated :)

I actually say that in my knowledge the test I closed did not failed since a long time. I didn't checked the code for the tests, except for one and I commented about it.

> If your memory is only related tests being checked just before a release I'm not sure this is good enough.

Not really the case since I check regularly the CI. Now I'm not sure it's good enoug either :) Now as I said, we can reopen also later if needed.

>> Thanks,
>> Simon
>>
>> On 26/03/2019 10:58, Vincent Massol wrote:
>>>> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I was checking our list of flickering tests in JIRA (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC) and I noticed that we had somehow old flickering test issue concerning test that I've never seen failing.
>>>>
>>>> So I propose we close some of them as inactive: the ones that we don't remember having seen for a while. The ideal would be to have a mechanism to update the issue when the CI fails on a flicker, but it takes time to do properly and it's not a priority.
>>>>
>>>> On the contrary I propose to trust our memory: if we're wrong because we have closed a flicker that is still happening, it will allow us to remind that we have this flicker to fix and we can easily reopen the issue.
>>>>
>>>> As Thomas mentioned on the chat, we should also updat
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
On Thu, Sep 5, 2019 at 3:43 PM Simon Urli wrote:
>
> Hi everyone,
>
> reopening this thread since I started to close some flicker issues as part of BFD and got comments for those.
>
> So the last mails on this threads suggested to close the flicker issues if we didn't manage to reproduce them locally after a repeated tests, and that we didn't see them after a while.
>
> We didn't vote for those suggestion and I assumed a bit quick that I could close some flicker issues that I personally don't remember about on the CI after having tested them locally.
> My point for doing that is the same as for the first mail I posted on this thread: those flickers are old, and the code did change enough for those to be fixed in a way or another.

Being old does not always means the code leading to those failures changed that much.

>
> Now I might be completely wrong, and the flicker to happen again, but I don't think it's a problem since we can really easily open back the issues if it's the case.
>
> The other solution IMO is to indeed keep the issue open and in fact to never really close them, because we just don't have time to investigate each of them properly.
>
> I really don't see any value of keeping things open and don't act on them, that's why I suggest to close them after doing the checks we suggested before:
> 1. try to repeat locally the failure;

This is totally useless IMO unless you make sure that your computer is made super slow some way since that's the reason for most of the flickering tests.

> 2. check that we didn't encounter those flickers since last cycle.

This one is enough for me but the hard part is to knowing that.

>
> So first question, do we all agree on that?
>
> Then for the second check, Vincent suggested to add some tooling: it will be best, but it takes time to do. So on the meantime, as Thomas also suggested, we could add a check in the release plan to create or update all jira issues that concerns flickers. It would allow us to keep some information about the liveness of our flickers.
>
> So second question, do you agree on that?

Depends what it exactly means. Have some dedicated jira field to indicate when you saw it last ? Comment that you just saw that test failing again ?

Other useful and a little more automated tricks not requiring much tooling:
* increase the currently very low history (10). The reason it's that low is because of many performances issues we had in the past with old style jobs but those most probably don't apply anymore so we should increase the number now IMO (30 ?)
* create a pipeline job which execute platform master integration tests once a day with http://cpulimit.sourceforge.net (looks fun) and keep a big history but not storing stuff like videos and images (100 ?)

>
> Final question: for the flickers that I closed today, I relied mainly on my memory for the second check and on their age: I closed the older ones.
>
> So what should we do on them?

My concern with them is that the reason you gave to close them (that you cannot reproduce them locally) was not valid IMO. If you say some test did not failed since a long time then fine, if what some test is about has completely been rewritten then fine too but that's not what you indicated :)

If your memory is only related tests being checked just before a release I'm not sure this is good enough.

>
> Thanks,
> Simon
>
> On 26/03/2019 10:58, Vincent Massol wrote:
> >
> >> On 26 Mar 2019, at 10:31, Simon Urli wrote:
> >>
> >> Hi everyone,
> >>
> >> I was checking our list of flickering tests in JIRA (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC) and I noticed that we had somehow old flickering test issue concerning test that I've never seen failing.
> >>
> >> So I propose we close some of them as inactive: the ones that we don't remember having seen for a while. The ideal would be to have a mechanism to update the issue when the CI fails on a flicker, but it takes time to do properly and it's not a priority.
> >>
> >> On the contrary I propose to trust our memory: if we're wrong because we have closed a flicker that is still happening, it will allow us to remind that we have this flicker to fix and we can easily reopen the issue.
> >>
> >> As Thomas mentioned on the chat, we should also update the release plan to include the inactive flickers in the list of issue to check.
> >
> > I should be able to easily create a report when any test fails inside our jenkins pipeline and make it available similar to our clover report. I could indicate if it’s a known flicker or not too in this report. That could compensate for the fact that we only keep 7 days of records in our jobs.
> >
> > Would need to define the report format, whether it’s the same file updated at each run or a different one. If the same one, then either
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
Hi Simon and all,

> On 5 Sep 2019, at 15:43, Simon Urli wrote:
>
> Hi everyone,
>
> reopening this thread since I started to close some flicker issues as part
> of BFD and got comments on those.
>
> The last mails on this thread suggested closing the flicker issues if we
> didn't manage to reproduce them locally after repeated test runs and
> hadn't seen them for a while.
>
> We didn't vote on those suggestions, and I assumed a bit too quickly that I
> could close some flicker issues that I personally don't remember seeing on
> the CI, after having tested them locally.
> My reason for doing that is the same as in the first mail I posted on this
> thread: those flickers are old, and the code has changed enough for them to
> have been fixed in one way or another.

Well, you didn’t say that in the comment when closing :) At least not for the
one where I commented. I was just asking for a rationale, asking what made
you think that the flicker was gone. You said that you couldn’t reproduce it
locally, but you didn’t say that the underlying code had also changed a lot
(and I didn’t know that) ;)

> Now I might be completely wrong and the flickers might happen again, but I
> don't think it's a problem since we can very easily reopen the issues if
> that's the case.
>
> The other solution IMO is indeed to keep the issues open, and in fact to
> never really close them, because we just don't have time to investigate
> each of them properly.

There’s another solution, which is to check the history on the CI and close
the issue as "cannot reproduce" if the flicker doesn’t reproduce *on the CI*
(not locally, since locally doesn’t prove anything: in 99.99% of the cases
the flicker couldn’t be reproduced locally, and that’s why we marked it as a
flicker :) Otherwise we would have fixed it!).

The only potential issue is the size of the job history, which is currently
set to 10 executions and is a bit too small to guarantee that the flicker
doesn’t reproduce.
EDIT: I see you mentioned a solution below.

> I really don't see any value in keeping things open without acting on them,

I definitely agree about that. We need to act on them. But closing is not
acting on them. It’s quite the opposite. It’s about saying that we don’t care
about them and we’re not going to do anything about them. Which means that
when we do test session days we won’t even look at them to try to fix the
flicker, since the issue would be closed.

> which is why I suggest closing them after doing the checks we suggested
> before:
> 1. try to reproduce the failure locally;
> 2. check that we haven't encountered those flickers since the last cycle.
>
> So first question: do we all agree on that?

The second point is the most important for me. But how do you plan to
implement it (since the job history queue is only 10)?

EDIT: I see you mentioned a solution below.

> Then for the second check, Vincent suggested adding some tooling: that
> would be best, but it takes time to do. So in the meantime, as Thomas also
> suggested, we could add a check in the release plan to create or update all
> jira issues that concern flickers. It would allow us to keep some
> information about the liveness of our flickers.
>
> So second question: do you agree on that?

Sounds like a good idea while waiting for a tool.

> Final question: for the flickers that I closed today, I relied mainly on my
> memory for the second check and on their age: I closed the older ones.
>
> So what should we do about them?

I’m fine to trust you and to reopen when needed. I was just asking why you
were closing them since you didn’t provide the info in the comment.
Thanks
-Vincent

>
> Thanks,
> Simon
>
> On 26/03/2019 10:58, Vincent Massol wrote:
>>> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>>>
>>> Hi everyone,
>>>
>>> I was checking our list of flickering tests in JIRA
>>> (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC)
>>> and I noticed that we have some old flickering test issues concerning
>>> tests that I've never seen failing.
>>>
>>> So I propose we close some of them as inactive: the ones that we don't
>>> remember having seen for a while. The ideal would be to have a mechanism
>>> to update the issue when the CI fails on a flicker, but it takes time to
>>> do properly and it's not a priority.
>>>
>>> Instead, I propose to trust our memory: if we're wrong because we have
>>> closed a flicker that is still happening, it will remind us that we have
>>> this flicker to fix, and we can easily reopen the issue.
>>>
>>> As Thomas mentioned on the chat, we should also update the release plan
>>> to include the inactive flickers in the list of issues to check.
>> I should be able to easily create a report when any test fails inside our
>> jenkins pipeline and make it available, similarly to our clover report. I
>> could also indicate in this report whether it’s a known flicker or not.
>> That coul
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
Hi everyone,

reopening this thread since I started to close some flicker issues as part of
BFD and got comments on those.

The last mails on this thread suggested closing the flicker issues if we
didn't manage to reproduce them locally after repeated test runs and hadn't
seen them for a while.

We didn't vote on those suggestions, and I assumed a bit too quickly that I
could close some flicker issues that I personally don't remember seeing on
the CI, after having tested them locally.
My reason for doing that is the same as in the first mail I posted on this
thread: those flickers are old, and the code has changed enough for them to
have been fixed in one way or another.

Now I might be completely wrong and the flickers might happen again, but I
don't think it's a problem since we can very easily reopen the issues if
that's the case.

The other solution IMO is indeed to keep the issues open, and in fact to
never really close them, because we just don't have time to investigate each
of them properly.

I really don't see any value in keeping things open without acting on them,
which is why I suggest closing them after doing the checks we suggested
before:
1. try to reproduce the failure locally;
2. check that we haven't encountered those flickers since the last cycle.

So first question: do we all agree on that?

Then for the second check, Vincent suggested adding some tooling: that would
be best, but it takes time to do. So in the meantime, as Thomas also
suggested, we could add a check in the release plan to create or update all
jira issues that concern flickers. It would allow us to keep some information
about the liveness of our flickers.

So second question: do you agree on that?

Final question: for the flickers that I closed today, I relied mainly on my
memory for the second check and on their age: I closed the older ones.

So what should we do about them?
Thanks,
Simon

On 26/03/2019 10:58, Vincent Massol wrote:
>> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>>
>> Hi everyone,
>>
>> I was checking our list of flickering tests in JIRA
>> (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC)
>> and I noticed that we have some old flickering test issues concerning
>> tests that I've never seen failing.
>>
>> So I propose we close some of them as inactive: the ones that we don't
>> remember having seen for a while. The ideal would be to have a mechanism
>> to update the issue when the CI fails on a flicker, but it takes time to
>> do properly and it's not a priority.
>>
>> Instead, I propose to trust our memory: if we're wrong because we have
>> closed a flicker that is still happening, it will remind us that we have
>> this flicker to fix, and we can easily reopen the issue.
>>
>> As Thomas mentioned on the chat, we should also update the release plan
>> to include the inactive flickers in the list of issues to check.
>
> I should be able to easily create a report when any test fails inside our
> jenkins pipeline and make it available, similarly to our clover report. I
> could also indicate in this report whether it’s a known flicker or not.
> That could compensate for the fact that we only keep 7 days of records in
> our jobs.
>
> We would need to define the report format, and whether it’s the same file
> updated at each run or a different one. If the same one, then either:
> * I’d need to parse it first in memory, add the new tests and overwrite
>   the file
> * or append to the bottom of the file, which will grow quite large quickly
>
> WDYT?
> Thanks
> -Vincent
>
>> So for now I propose to close the following list of issues as inactive:
>>
>> * XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering
>>   (https://jira.xwiki.org/browse/XWIKI-14399)
>> * XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering
>>   (https://jira.xwiki.org/browse/XWIKI-14396)
>> * XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x
>>   (xwiki-enterprise-test-ui) is flaky
>>   (https://jira.xwiki.org/browse/XWIKI-14394)
>> * XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is
>>   possibly flaky (https://jira.xwiki.org/browse/XWIKI-14386)
>> * XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is
>>   flickering (https://jira.xwiki.org/browse/XWIKI-14835)
>> * XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering
>>   (https://jira.xwiki.org/browse/XWIKI-14860)
>>
>> And I propose in general to close as inactive the flickers we don't
>> remember having seen after a cycle.
>>
>> WDYT?
>>
>> Simon
>> --
>> Simon Urli
>> Software Engineer at XWiki SAS
>> simon.u...@xwiki.com
>> More about us at http://www.xwiki.com

--
Simon Urli
Software Engineer at XWiki SAS
simon.u...@xwiki.com
More about us at http://www.xwiki.com
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>
> Hi everyone,
>
> I was checking our list of flickering tests in JIRA
> (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC)
> and I noticed that we have some old flickering test issues concerning tests
> that I've never seen failing.
>
> So I propose we close some of them as inactive: the ones that we don't
> remember having seen for a while. The ideal would be to have a mechanism to
> update the issue when the CI fails on a flicker, but it takes time to do
> properly and it's not a priority.
>
> Instead, I propose to trust our memory: if we're wrong because we have
> closed a flicker that is still happening, it will remind us that we have
> this flicker to fix, and we can easily reopen the issue.
>
> As Thomas mentioned on the chat, we should also update the release plan to
> include the inactive flickers in the list of issues to check.

I should be able to easily create a report when any test fails inside our
jenkins pipeline and make it available, similarly to our clover report. I
could also indicate in this report whether it’s a known flicker or not. That
could compensate for the fact that we only keep 7 days of records in our
jobs.

We would need to define the report format, and whether it’s the same file
updated at each run or a different one. If the same one, then either:
* I’d need to parse it first in memory, add the new tests and overwrite the
  file
* or append to the bottom of the file, which will grow quite large quickly

WDYT?
Thanks
-Vincent

>
> So for now I propose to close the following list of issues as inactive:
>
> * XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14399)
> * XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14396)
> * XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x
>   (xwiki-enterprise-test-ui) is flaky
>   (https://jira.xwiki.org/browse/XWIKI-14394)
> * XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is
>   possibly flaky (https://jira.xwiki.org/browse/XWIKI-14386)
> * XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is
>   flickering (https://jira.xwiki.org/browse/XWIKI-14835)
> * XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14860)
>
> And I propose in general to close as inactive the flickers we don't
> remember having seen after a cycle.
>
> WDYT?
>
> Simon
> --
> Simon Urli
> Software Engineer at XWiki SAS
> simon.u...@xwiki.com
> More about us at http://www.xwiki.com
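
The two report-update options weighed in this message (parse, merge and overwrite one file, versus append to a growing file) could look roughly like the sketch below. It is only an illustration: the JSON layout, the `count` and `last_build` fields, the tab-separated log format and both function names are assumptions, not anything defined in the thread or in the XWiki pipeline.

```python
import json
import os
import tempfile


def record_failures_overwrite(report_path, failed_tests, build_id):
    """Option 1: parse the report in memory, merge new failures, overwrite.

    The file keeps one entry per test, so its size is bounded by the
    number of distinct tests that ever failed.
    """
    report = {}
    if os.path.exists(report_path):
        with open(report_path, encoding="utf-8") as handle:
            report = json.load(handle)
    for test in failed_tests:
        # Hypothetical entry layout: failure count and last failing build.
        entry = report.setdefault(test, {"count": 0, "last_build": None})
        entry["count"] += 1
        entry["last_build"] = build_id
    with open(report_path, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2, sort_keys=True)
    return report


def record_failures_append(log_path, failed_tests, build_id):
    """Option 2: append one line per failure without parsing anything.

    Cheap to write, but the file grows with every failing build.
    """
    with open(log_path, "a", encoding="utf-8") as handle:
        for test in failed_tests:
            handle.write(f"{build_id}\t{test}\n")


if __name__ == "__main__":
    # Small demonstration in a throwaway directory.
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "flickers.json")
    record_failures_overwrite(path, ["LoginTest#testDataIsPreservedAfterLogin"], 101)
    print(record_failures_overwrite(path, ["LoginTest#testDataIsPreservedAfterLogin"], 102))
```

The overwrite variant also records the last build in which each test failed, which is the "have we seen it since the last cycle" information the thread is after; the append variant keeps the full history but needs post-processing to answer the same question.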
Re: [xwiki-devs] [Proposal] Cleaning of flickering tests
Hi Simon and all,

> On 26 Mar 2019, at 10:31, Simon Urli wrote:
>
> Hi everyone,
>
> I was checking our list of flickering tests in JIRA
> (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status%20%3D%20Open%20ORDER%20BY%20updated%20DESC)
> and I noticed that we have some old flickering test issues concerning tests
> that I've never seen failing.
>
> So I propose we close some of them as inactive: the ones that we don't
> remember having seen for a while. The ideal would be to have a mechanism to
> update the issue when the CI fails on a flicker, but it takes time to do
> properly and it's not a priority.
>
> Instead, I propose to trust our memory: if we're wrong because we have
> closed a flicker that is still happening, it will remind us that we have
> this flicker to fix, and we can easily reopen the issue.
>
> As Thomas mentioned on the chat, we should also update the release plan to
> include the inactive flickers in the list of issues to check.
>
> So for now I propose to close the following list of issues as inactive:
>
> * XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14399)
> * XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14396)
> * XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x
>   (xwiki-enterprise-test-ui) is flaky
>   (https://jira.xwiki.org/browse/XWIKI-14394)
> * XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is
>   possibly flaky (https://jira.xwiki.org/browse/XWIKI-14386)
> * XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is
>   flickering (https://jira.xwiki.org/browse/XWIKI-14835)
> * XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering
>   (https://jira.xwiki.org/browse/XWIKI-14860)
>
> And I propose in general to close as inactive the flickers we don't
> remember having seen after a cycle.
>
> WDYT?
Thanks for looking into this. I’d feel more comfortable if we did some local
testing before closing any test issue. For example, I’d propose to run the
test N times (this can also be done on a docker container agent; it’s very
easy to start a new container on ks* or locally) and, if it doesn’t fail at
all, close the issue if it’s old enough.

Note: To repeat a test N times:
* For junit5: @RepeatedTest(N)
* For junit4: @Intermittent(repetition=N) - see
  https://dev.xwiki.org/xwiki/bin/view/Community/Testing/
* At the shell level: for i in {1..50}; do mvn test -D…; done

Some rules I propose:
* N = 50 (for a test taking 3 minutes, that means 2.5 hours)
* Last update date of the test jira issue < Now() - 6 months

WDYT? Too hard?

Thanks
-Vincent

>
> Simon
> --
> Simon Urli
> Software Engineer at XWiki SAS
> simon.u...@xwiki.com
> More about us at http://www.xwiki.com
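
The closing rules proposed in this message (run the test N times, and only close if nothing fails and the jira issue has not been updated for 6 months) can be sketched as a small script. This is a rough illustration under stated assumptions: the function names are made up, "6 months" is approximated as 183 days, and the command to run is left to the caller because the thread elides the actual Maven arguments.

```python
import subprocess
import sys
from datetime import datetime, timedelta


def count_failures(command, repetitions):
    """Run `command` `repetitions` times and return how many runs failed.

    A run counts as failed when the process exits with a non-zero code.
    """
    failures = 0
    for _ in range(repetitions):
        result = subprocess.run(command, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode != 0:
            failures += 1
    return failures


def can_close(failures, last_update, now, min_age_days=183):
    """Apply both proposed rules.

    Rule 1: no failure at all over the repeated runs.
    Rule 2: the jira issue was last updated more than ~6 months ago
    (183 days is an assumption; the thread only says "6 months").
    """
    return failures == 0 and last_update < now - timedelta(days=min_age_days)


if __name__ == "__main__":
    # Stand-in command that always succeeds; replace it with the real
    # test invocation and use repetitions=50 as proposed in the thread.
    demo_command = [sys.executable, "-c", "pass"]
    failures = count_failures(demo_command, repetitions=3)
    print(failures, can_close(failures, datetime(2018, 9, 1), datetime(2019, 3, 26)))
```

Keeping the two rules in one predicate makes it easy to paste the outcome into the jira comment when closing, which is the rationale that was asked for later in the thread.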