I'm concerned that for the last couple of days we've been finding new
issues faster than we're fixing them. I'll start pushing back on some
higher-risk changes (e.g. in race-prone parts of the code) until we've
cleared up some of the issues.

On Tue, Jun 12, 2018 at 1:31 PM, Tim Armstrong <[email protected]>
wrote:

> We still have a lot of broken build and flaky test issues open, so let's
> continue to be careful about what we merge:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC
>
> If it's a large change or there's some risk of it breaking things, please
> continue to check with me so that we can get in all of the outstanding
> changes in an orderly way:
>
> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>
>
> On Thu, Jun 7, 2018 at 10:28 AM, Tim Armstrong <[email protected]>
> wrote:
>
>> All of the major known issues except an S3 infra issue are fixed. We got
>> broken in a minor way by a Hive change:
>> https://issues.apache.org/jira/browse/IMPALA-7143
>> so I disabled the tests until we can sort that out.
>>
>> We should start to think about how to merge outstanding changes in an
>> orderly way. I can act as a gatekeeper until we get the backlog shrunk
>> down:
>> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>
>> It would be helpful if you could let me know if one of your changes is
>> low risk and whether you've done any additional testing to make sure that
>> it won't break any other configurations (S3, Local, etc).
>>
>> On Wed, Jun 6, 2018 at 2:44 PM, Tim Armstrong <[email protected]>
>> wrote:
>>
>>> We ran into some test issues cherry-picking the latest set of changes to
>>> 2.x. I pushed out a fix and I'm merging now. Once that is done the main
>>> build fixes should be on both 2.x and master.
>>>
>>> On Tue, Jun 5, 2018 at 6:08 PM, Tim Armstrong <[email protected]>
>>> wrote:
>>>
>>>> Ok, so 2/3 of those fixes are merged and the other is being merged.
>>>>
>>>> We still have a long list of flaky issues but I went through and we've
>>>> either mitigated them or we're blocked on being able to repro them.
>>>>
>>>> I'll see how things look tomorrow, but if you have some low-risk
>>>> changes in mind, let me know and I can consider whether to merge them.
>>>>
>>>>
>>>>
>>>> On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong <[email protected]
>>>> > wrote:
>>>>
>>>>> Things are starting to look healthier now.
>>>>>
>>>>> I went through the broken-build JIRAs and downgraded some of the
>>>>> infrequent infrastructure issues to critical so we have a clearer idea of
>>>>> what's actually breaking the build now versus what's an occasional infra
>>>>> issue:
>>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
>>>>>
>>>>> I'd like to see the fixes for these three issues go in:
>>>>> https://issues.apache.org/jira/browse/IMPALA-7101
>>>>> https://issues.apache.org/jira/browse/IMPALA-6956
>>>>> https://issues.apache.org/jira/browse/IMPALA-7008
>>>>>
>>>>> We still need to fix any flaky infrastructure issues but that should
>>>>> be able to proceed in parallel with other things.
>>>>>
>>>>>
>>>>> On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> So while it's definitely better, there are still a large number of failing
>>>>>> builds. We've been hit by at least: IMPALA-6642
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>, IMPALA-6956
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6956>, IMPALA-7101
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-7101> and IMPALA-3040
>>>>>> <https://issues.apache.org/jira/browse/IMPALA-3040>
>>>>>> all within the last day, along with some mysterious crashes that I haven't
>>>>>> filed anything for with Apache yet as there's very little info about what's
>>>>>> actually going on. There are still multiple builds that haven't been green
>>>>>> in over a month.
>>>>>>
>>>>>> Of course, if we hold commits for too long, there's a danger that when we
>>>>>> open things back up a bunch of changes will all land at the same time and
>>>>>> destabilize the builds again, putting us back in the same situation. So, I
>>>>>> would say at a minimum that any changes that are relatively minor and low
>>>>>> risk can go in now.
>>>>>>
>>>>>> My preference would be to hold off on major changes until we have more
>>>>>> stability.
>>>>>>
>>>>>> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker <[email protected]> wrote:
>>>>>>
>>>>>> > Hi Thomas,
>>>>>> >
>>>>>> > Can you give an update on where we are with the builds?
>>>>>> >
>>>>>> > We currently have ~15 changes with a +2:
>>>>>> >
>>>>>> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
>>>>>> >
>>>>>> > Thanks, Lars
>>>>>> >
>>>>>> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson <[email protected]> wrote:
>>>>>> >
>>>>>> > > +1 - thanks for worrying about build health.
>>>>>> > >
>>>>>> > > On 25 May 2018 at 17:18, Jim Apple <[email protected]> wrote:
>>>>>> > >
>>>>>> > > > Sounds good to me. Thanks for taking ownership!
>>>>>> > > >
>>>>>> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
>>>>>> > > > [email protected]> wrote:
>>>>>> > > >
>>>>>> > > > > Hey Impala community,
>>>>>> > > > >
>>>>>> > > > > There seems to have been an unusually large number of flaky or broken tests
>>>>>> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
>>>>>> > > > > cropping up in the last few weeks. I'd like to suggest that we hold off on
>>>>>> > > > > merging new changes that aren't related to fixing those testing issues for
>>>>>> > > > > at least a few days until things become more stable.
>>>>>> > > > >
>>>>>> > > > > Does anyone have any objections? If not, I'll send out another email when
>>>>>> > > > > more of the issues have been addressed.
>>>>>> > > > >
>>>>>> > > > > Thanks,
>>>>>> > > > > Thomas Tauber-Marshall
>>>>>> > > > >
>>>>>> > > >
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
