I'm concerned that for the last couple of days we've been finding new issues faster than we're fixing them. I'll start pushing back on some higher-risk changes (e.g. in race-prone parts of the code) until we've cleared up some of the issues.
On Tue, Jun 12, 2018 at 1:31 PM, Tim Armstrong <[email protected]> wrote:
> We still have a lot of broken build and flaky test issues open, so let's
> continue to be careful about what we merge:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC
>
> If it's a large change or there's some risk of it breaking things, please
> continue to check with me so that we can get in all of the outstanding
> changes in an orderly way:
>
> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>
>
> On Thu, Jun 7, 2018 at 10:28 AM, Tim Armstrong <[email protected]>
> wrote:
>
>> All of the major known issues except an S3 infra issue are fixed. We got
>> broken in a minor way by a Hive change:
>> https://issues.apache.org/jira/browse/IMPALA-7143
>> so I disabled the tests until we can sort that out.
>>
>> We should start to think about how to merge outstanding changes in an
>> orderly way. I can act as a gatekeeper until we get the backlog shrunk
>> down:
>> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>
>> It would be helpful if you could let me know if one of your changes is
>> low risk and whether you've done any additional testing to make sure that
>> it won't break any other configurations (S3, Local, etc).
>>
>> On Wed, Jun 6, 2018 at 2:44 PM, Tim Armstrong <[email protected]>
>> wrote:
>>
>>> We ran into some test issues cherry-picking the latest set of changes to
>>> 2.x. I pushed out a fix and I'm merging now. Once that is done the main
>>> build fixes should be on both 2.x and master.
>>>
>>> On Tue, Jun 5, 2018 at 6:08 PM, Tim Armstrong <[email protected]>
>>> wrote:
>>>
>>>> Ok, so 2/3 of those fixes are merged and the other is being merged.
>>>>
>>>> We still have a long list of flaky issues but I went through and we've
>>>> either mitigated them or we're blocked on being able to repro them.
>>>>
>>>> I'll see how things look tomorrow, but if you have some low-risk
>>>> changes in mind, let me know and I can consider whether to merge them.
>>>>
>>>>
>>>>
>>>> On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong <[email protected]>
>>>> wrote:
>>>>
>>>>> Things are starting to look healthier now.
>>>>>
>>>>> I went through the broken-build JIRAs and downgraded some of the
>>>>> infrequent infrastructure issues to critical so we have a clearer idea of
>>>>> what's actually breaking the build now versus what's an occasional infra
>>>>> issue:
>>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
>>>>>
>>>>> I'd like to see the fixes for these three issues go in:
>>>>> https://issues.apache.org/jira/browse/IMPALA-7101
>>>>> https://issues.apache.org/jira/browse/IMPALA-6956
>>>>> https://issues.apache.org/jira/browse/IMPALA-7008
>>>>>
>>>>> We still need to fix any flaky infrastructure issues but that should
>>>>> be able to proceed in parallel with other things.
>>>>>
>>>>>
>>>>> On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> So while it's definitely better, there are still a large number of
>>>>>> failing builds.
>>>>>> We've been hit by at least: IMPALA-6642
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>, IMPALA-6956
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6956>, IMPALA-7101
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-7101> and IMPALA-3040
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-3040>
>>>>>> all within the last day, along with some mysterious crashes that I
>>>>>> haven't filed anything for with Apache yet as there's very little info
>>>>>> about what's actually going on. There are still multiple builds that
>>>>>> haven't been green in over a month.
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>
>>>>>>
>>>>>> Of course, if we hold commits for too long, there's a danger that when
>>>>>> we open things back up a bunch of changes will all land at the same
>>>>>> time and destabilize the builds again, putting us back in the same
>>>>>> situation. So, I would say at a minimum that any changes that are
>>>>>> relatively minor and low risk can go in now.
>>>>>>
>>>>>> My preference would be to hold off on major changes until we have more
>>>>>> stability.
>>>>>>
>>>>>> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker <[email protected]> wrote:
>>>>>>
>>>>>> > Hi Thomas,
>>>>>> >
>>>>>> > Can you give an update on where we are with the builds?
>>>>>> >
>>>>>> > We currently have ~15 changes with a +2:
>>>>>> >
>>>>>> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
>>>>>> >
>>>>>> > Thanks, Lars
>>>>>> >
>>>>>> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson <[email protected]> wrote:
>>>>>> >
>>>>>> > > +1 - thanks for worrying about build health.
>>>>>> > >
>>>>>> > > On 25 May 2018 at 17:18, Jim Apple <[email protected]> wrote:
>>>>>> > >
>>>>>> > > > Sounds good to me. Thanks for taking ownership!
>>>>>> > > >
>>>>>> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall
>>>>>> > > > <[email protected]> wrote:
>>>>>> > > >
>>>>>> > > > > Hey Impala community,
>>>>>> > > > >
>>>>>> > > > > There seems to have been an unusually large number of flaky or
>>>>>> > > > > broken tests
>>>>>> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
>>>>>> > > > > cropping up in the last few weeks. I'd like to suggest that we
>>>>>> > > > > hold off on merging new changes that aren't related to fixing
>>>>>> > > > > those testing issues for at least a few days until things
>>>>>> > > > > become more stable.
>>>>>> > > > >
>>>>>> > > > > Does anyone have any objections? If not, I'll send out another
>>>>>> > > > > email when more of the issues have been addressed.
>>>>>> > > > >
>>>>>> > > > > Thanks,
>>>>>> > > > > Thomas Tauber-Marshall
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
