Reminder that we've still got lots of open flaky issues: https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC
Master has been healthy aside from IMPALA-6910 <https://issues.apache.org/jira/browse/IMPALA-6910> so we can continue merging as normal, but let's not get complacent.

On Thu, Jun 14, 2018 at 3:35 PM, Tim Armstrong <[email protected]> wrote:

> I'm concerned that for the last couple of days we've been finding new
> issues faster than we're fixing them. I'll start pushing back on some
> higher-risk changes (e.g. in race-prone parts of the code) until we've
> cleared up some of the issues.
>
> On Tue, Jun 12, 2018 at 1:31 PM, Tim Armstrong <[email protected]>
> wrote:
>
>> We still have a lot of broken build and flaky test issues open, so let's
>> continue to be careful about what we merge:
>>
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC
>>
>> If it's a large change or there's some risk of it breaking things, please
>> continue to check with me so that we can get in all of the outstanding
>> changes in an orderly way:
>>
>> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>
>>
>> On Thu, Jun 7, 2018 at 10:28 AM, Tim Armstrong <[email protected]>
>> wrote:
>>
>>> All of the major known issues except an S3 infra issue are fixed. We got
>>> broken in a minor way by a Hive change:
>>> https://issues.apache.org/jira/browse/IMPALA-7143
>>> so I disabled the tests until we can sort that out.
>>>
>>> We should start to think about how to merge outstanding changes in an
>>> orderly way.
>>> I can act as a gatekeeper until we get the backlog shrunk
>>> down: https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>>
>>> It would be helpful if you could let me know if one of your changes is
>>> low risk and whether you've done any additional testing to make sure that
>>> it won't break any other configurations (S3, Local, etc).
>>>
>>> On Wed, Jun 6, 2018 at 2:44 PM, Tim Armstrong <[email protected]>
>>> wrote:
>>>
>>>> We ran into some test issues cherry-picking the latest set of changes
>>>> to 2.x. I pushed out a fix and I'm merging now. Once that is done the main
>>>> build fixes should be on both 2.x and master.
>>>>
>>>> On Tue, Jun 5, 2018 at 6:08 PM, Tim Armstrong <[email protected]>
>>>> wrote:
>>>>
>>>>> Ok, so 2/3 of those fixes are merged and the other is being merged.
>>>>>
>>>>> We still have a long list of flaky issues but I went through and we've
>>>>> either mitigated them or we're blocked on being able to repro them.
>>>>>
>>>>> I'll see how things look tomorrow, but if you have some low-risk
>>>>> changes in mind, let me know and I can consider whether to merge them.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Things are starting to look healthier now.
>>>>>>
>>>>>> I went through the broken-build JIRAs and downgraded some of the
>>>>>> infrequent infrastructure issues to critical so we have a clearer idea of
>>>>>> what's actually breaking the build now versus what's an occasional infra
>>>>>> issue: https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
>>>>>>
>>>>>> I'd like to see the fixes for these three issues go in:
>>>>>> https://issues.apache.org/jira/browse/IMPALA-7101
>>>>>> https://issues.apache.org/jira/browse/IMPALA-6956
>>>>>> https://issues.apache.org/jira/browse/IMPALA-7008
>>>>>>
>>>>>> We still need to fix any flaky infrastructure issues but that should
>>>>>> be able to proceed in parallel with other things.
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> So while it's definitely better, there are still a large number of failing
>>>>>>> builds. We've been hit by at least: IMPALA-6642
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>, IMPALA-6956
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6956>, IMPALA-7101
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-7101> and IMPALA-3040
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-3040>
>>>>>>> all within the last day, along with some mysterious crashes that I haven't
>>>>>>> filed anything for with Apache yet as there's very little info about what's
>>>>>>> actually going on. There are still multiple builds that haven't been green
>>>>>>> in over a month.
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>
>>>>>>>
>>>>>>> Of course, if we hold commits for too long, there's a danger that when we
>>>>>>> open things back up a bunch of changes will all land at the same time and
>>>>>>> destabilize the builds again, putting us back in the same situation. So, I
>>>>>>> would say at a minimum that any changes that are relatively minor and low
>>>>>>> risk can go in now.
>>>>>>>
>>>>>>> My preference would be to hold off on major changes until we have more
>>>>>>> stability.
>>>>>>>
>>>>>>> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker <[email protected]> wrote:
>>>>>>>
>>>>>>> > Hi Thomas,
>>>>>>> >
>>>>>>> > Can you give an update on where we are with the builds?
>>>>>>> >
>>>>>>> > We currently have ~15 changes with a +2:
>>>>>>> >
>>>>>>> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
>>>>>>> >
>>>>>>> > Thanks, Lars
>>>>>>> >
>>>>>>> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson <[email protected]> wrote:
>>>>>>> >
>>>>>>> > > +1 - thanks for worrying about build health.
>>>>>>> > >
>>>>>>> > > On 25 May 2018 at 17:18, Jim Apple <[email protected]> wrote:
>>>>>>> > >
>>>>>>> > > > Sounds good to me. Thanks for taking ownership!
>>>>>>> > > >
>>>>>>> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
>>>>>>> > > > [email protected]> wrote:
>>>>>>> > > >
>>>>>>> > > > > Hey Impala community,
>>>>>>> > > > >
>>>>>>> > > > > There seems to have been an unusually large number of flaky or broken tests
>>>>>>> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
>>>>>>> > > > > cropping up in the last few weeks.
>>>>>>> > > > > I'd like to suggest that we hold off on
>>>>>>> > > > > merging new changes that aren't related to fixing those testing issues for
>>>>>>> > > > > at least a few days until things become more stable.
>>>>>>> > > > >
>>>>>>> > > > > Does anyone have any objections? If not, I'll send out another email when
>>>>>>> > > > > more of the issues have been addressed.
>>>>>>> > > > >
>>>>>>> > > > > Thanks,
>>>>>>> > > > > Thomas Tauber-Marshall
