Reminder that we've still got lots of open flaky issues:

https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC

Master has been healthy aside from IMPALA-6910
<https://issues.apache.org/jira/browse/IMPALA-6910> so we can continue
merging as normal, but let's not get complacent.

On Thu, Jun 14, 2018 at 3:35 PM, Tim Armstrong <[email protected]>
wrote:

> I'm concerned that for the last couple of days we've been finding new
> issues faster than we're fixing them. I'll start pushing back on some
> higher-risk changes (e.g. in race-prone parts of the code) until we've
> cleared up some of the issues.
>
> On Tue, Jun 12, 2018 at 1:31 PM, Tim Armstrong <[email protected]>
> wrote:
>
>> We still have a lot of broken build and flaky test issues open, so let's
>> continue to be careful about what we merge:
>>
>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(Flaky%2C%20broken-build)%20ORDER%20BY%20priority%20DESC
>>
>> If it's a large change or there's some risk of it breaking things, please
>> continue to check with me so that we can get in all of the outstanding
>> changes in an orderly way:
>>
>> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>
>>
>> On Thu, Jun 7, 2018 at 10:28 AM, Tim Armstrong <[email protected]>
>> wrote:
>>
>>> All of the major known issues except an S3 infra issue are fixed. We got
>>> broken in a minor way by a Hive change:
>>> https://issues.apache.org/jira/browse/IMPALA-7143 so I disabled the tests
>>> until we can sort that out.
>>>
>>> We should start to think about how to merge outstanding changes in an
>>> orderly way. I can act as a gatekeeper until we get the backlog shrunk
>>> down:
>>> https://gerrit.cloudera.org/#/q/project:Impala-ASF+status:open+Code-Review%253E%253D%252B2
>>>
>>> It would be helpful if you could let me know if one of your changes is
>>> low risk and whether you've done any additional testing to make sure that
>>> it won't break any other configurations (S3, Local, etc).
>>>
>>> On Wed, Jun 6, 2018 at 2:44 PM, Tim Armstrong <[email protected]>
>>> wrote:
>>>
>>>> We ran into some test issues cherry-picking the latest set of changes
>>>> to 2.x. I pushed out a fix and I'm merging now. Once that is done the main
>>>> build fixes should be on both 2.x and master.
>>>>
>>>> On Tue, Jun 5, 2018 at 6:08 PM, Tim Armstrong <[email protected]>
>>>> wrote:
>>>>
>>>>> Ok, so 2/3 of those fixes are merged and the other is being merged.
>>>>>
>>>>> We still have a long list of flaky issues but I went through and we've
>>>>> either mitigated them or we're blocked on being able to repro them.
>>>>>
>>>>> I'll see how things look tomorrow, but if you have some low-risk
>>>>> changes in mind, let me know and I can consider whether to merge them.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jun 5, 2018 at 10:11 AM, Tim Armstrong <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Things are starting to look healthier now.
>>>>>>
>>>>>> I went through the broken-build JIRAs and downgraded some of the
>>>>>> infrequent infrastructure issues to critical so we have a clearer idea of
>>>>>> what's actually breaking the build now versus what's an occasional infra
>>>>>> issue:
>>>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20broken-build%20ORDER%20BY%20priority%20DESC
>>>>>>
>>>>>> I'd like to see the fixes for these three issues go in:
>>>>>> https://issues.apache.org/jira/browse/IMPALA-7101
>>>>>> https://issues.apache.org/jira/browse/IMPALA-6956
>>>>>> https://issues.apache.org/jira/browse/IMPALA-7008
>>>>>>
>>>>>> We still need to fix any flaky infrastructure issues but that should
>>>>>> be able to proceed in parallel with other things.
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 1, 2018 at 11:18 AM, Thomas Tauber-Marshall <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> So while it's definitely better, there are still a large number of
>>>>>>> failing builds. We've been hit by at least: IMPALA-6642
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6642>, IMPALA-6956
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-6956>, IMPALA-7101
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-7101> and IMPALA-3040
>>>>>>> <https://issues.apache.org/jira/browse/IMPALA-3040>
>>>>>>> all within the last day, along with some mysterious crashes that I
>>>>>>> haven't filed anything for with Apache yet as there's very little info
>>>>>>> about what's actually going on. There are still multiple builds that
>>>>>>> haven't been green in over a month.
>>>>>>>
>>>>>>> Of course, if we hold commits for too long, there's a danger that
>>>>>>> when we open things back up a bunch of changes will all land at the
>>>>>>> same time and destabilize the builds again, putting us back in the
>>>>>>> same situation. So, I would say at a minimum that any changes that
>>>>>>> are relatively minor and low risk can go in now.
>>>>>>>
>>>>>>> My preference would be to hold off on major changes until we have
>>>>>>> more stability.
>>>>>>>
>>>>>>> On Fri, Jun 1, 2018 at 10:30 AM Lars Volker <[email protected]> wrote:
>>>>>>>
>>>>>>> > Hi Thomas,
>>>>>>> >
>>>>>>> > Can you give an update on where we are with the builds?
>>>>>>> >
>>>>>>> > We currently have ~15 changes with a +2:
>>>>>>> >
>>>>>>> > https://gerrit.cloudera.org/#/q/status:open+project:Impala-ASF+branch:master+label:Code-Review%253D2
>>>>>>> >
>>>>>>> > Thanks, Lars
>>>>>>> >
>>>>>>> > On Fri, May 25, 2018 at 5:20 PM, Henry Robinson <[email protected]> wrote:
>>>>>>> >
>>>>>>> > > +1 - thanks for worrying about build health.
>>>>>>> > >
>>>>>>> > > On 25 May 2018 at 17:18, Jim Apple <[email protected]> wrote:
>>>>>>> > >
>>>>>>> > > > Sounds good to me. Thanks for taking ownership!
>>>>>>> > > >
>>>>>>> > > > On Fri, May 25, 2018 at 5:10 PM Thomas Tauber-Marshall <
>>>>>>> > > > [email protected]> wrote:
>>>>>>> > > >
>>>>>>> > > > > Hey Impala community,
>>>>>>> > > > >
>>>>>>> > > > > There seems to have been an unusually large number of flaky
>>>>>>> > > > > or broken tests
>>>>>>> > > > > <https://issues.apache.org/jira/browse/IMPALA-7073?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20in%20(flaky%2C%20broken-build)>
>>>>>>> > > > > cropping up in the last few weeks. I'd like to suggest that
>>>>>>> > > > > we hold off on merging new changes that aren't related to
>>>>>>> > > > > fixing those testing issues for at least a few days until
>>>>>>> > > > > things become more stable.
>>>>>>> > > > >
>>>>>>> > > > > Does anyone have any objections? If not, I'll send out
>>>>>>> > > > > another email when more of the issues have been addressed.
>>>>>>> > > > >
>>>>>>> > > > > Thanks,
>>>>>>> > > > > Thomas Tauber-Marshall
>>>>>>> > > > >
>>>>>>> > > >
>>>>>>> > >
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
