Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)

2020-05-21 Thread Nathan Chancellor
On Thu, May 21, 2020 at 03:23:11PM -0700, Nick Desaulniers wrote:
> On Thu, May 21, 2020 at 6:00 AM Michael Ellerman  wrote:
> >
> > Nathan Chancellor  writes:
> > > On Tue, May 19, 2020 at 05:56:32PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> > >> Looks like our CI is still red from this:
> > >>
> > >> https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/builds/166854584
> > >>
> > >> Filing a bug to follow up on:
> > >> https://github.com/ClangBuiltLinux/linux/issues/1031
> > >>
> > >> On Thu, May 7, 2020 at 8:29 PM Michael Ellerman wrote:
> > >> >
> > >> > Nick Desaulniers  writes:
> > >> > > Looks like ppc64le powernv_defconfig is suddenly failing the locking
> > >> > > torture tests, then locks up?
> > >> > > https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/329211572#L3111-L3167
> > >> > > Any recent changes related here in -next?  I believe this is the first
> > >> > > failure, so I'll report back if we see this again.
> > >> >
> > >> > Thanks for the report.
> > >> >
> > >> > There's nothing newly in next-20200507 that seems related.
> > ...
> > >
> > > This is probably still a manifestation of
> > > https://github.com/ClangBuiltLinux/continuous-integration/issues/262
> > > because rekicking the tests usually fixes it.
> 
> I thought we had upgraded our version of QEMU in response to this already?
> https://github.com/ClangBuiltLinux/dockerimage/pull/44
> https://github.com/ClangBuiltLinux/dockerimage/pull/46

That was more of a bandaid than an actual fix. The hang happens a lot less
often with QEMU 4.2.0, but I could still reproduce it very occasionally
with the POWER9 machines. My machines are way more powerful than the ones
on Travis, which I am sure factors into that.

The real solution is to upgrade to QEMU 5.0.0, which we could probably
do via a PPA (or through our Docker image), or wait for QEMU 4.2.1,
which should hopefully have that fix since it was CC'd for QEMU stable.
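
As a rough sketch of how a CI script could gate the torture tests on the
QEMU version, the Python below parses the output of
`qemu-system-ppc64 --version` and checks it against the two releases named
above. The function names are hypothetical, and treating 4.2.1 as fixed is
an assumption taken from this thread (the backport is expected, not
confirmed).

```python
import re

# Assumption from the thread: the fix is in QEMU 5.0.0 and is expected
# (not confirmed here) to be backported to the 4.2.1 stable release.
FIXED_IN_RELEASE = (5, 0, 0)
EXPECTED_STABLE_FIX = (4, 2, 1)


def parse_qemu_version(version_output):
    """Extract (major, minor, micro) from `qemu-system-ppc64 --version` output."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no QEMU version found in: {version_output!r}")
    return tuple(int(part) for part in match.groups())


def likely_has_fix(version):
    """Return True if this QEMU version is expected to contain the fix."""
    if version >= FIXED_IN_RELEASE:
        return True
    # Only the 4.2.x stable series is expected to receive the backport.
    return version[:2] == EXPECTED_STABLE_FIX[:2] and version >= EXPECTED_STABLE_FIX
```

A caller would feed it the first line printed by the installed binary and
skip the torture tests when `likely_has_fix` returns False.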

> >
> > Oh yep.
> >
> > I was looking at the RCU warning, which I still don't understand, but
> > the lockup is presumably the same problem you hit with interrupts being
> > lost.
> >
> > > We should probably just disable the torture tests like we do for x86_64
> > > for CI because we do not have access to QEMU 5.0.0 where this should be
> > > fixed. I believe it is slated for 4.2.1 as well but we still have to
> > > wait for that to be updated and packaged in Ubuntu.
> >
> > You just need to start building Qemu HEAD as part of your CI ;)
> 
> LOL
> https://github.com/ClangBuiltLinux/dockerimage/pull/46#pullrequestreview-395639442
> Yeah I think the hard part for all these dependencies is the risk
> of living on the edge of "top of tree" for all of them, and trying to
> control for some by using stable releases.  May not always be
> possible.

Unfortunately, we are at the mercy of a bunch of different parties. If
only we had a ClangBuiltLinux build server that we maintained...

Cheers,
Nathan


Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)

2020-05-21 Thread Michael Ellerman
Nathan Chancellor  writes:
> On Tue, May 19, 2020 at 05:56:32PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
>> Looks like our CI is still red from this:
>> 
>> https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/builds/166854584
>> 
>> Filing a bug to follow up on:
>> https://github.com/ClangBuiltLinux/linux/issues/1031
>> 
>> On Thu, May 7, 2020 at 8:29 PM Michael Ellerman  wrote:
>> >
>> > Nick Desaulniers  writes:
>> > > Looks like ppc64le powernv_defconfig is suddenly failing the locking
>> > > torture tests, then locks up?
>> > > https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/329211572#L3111-L3167
>> > > Any recent changes related here in -next?  I believe this is the first
>> > > failure, so I'll report back if we see this again.
>> >
>> > Thanks for the report.
>> >
>> > There's nothing newly in next-20200507 that seems related.
...
>
> This is probably still a manifestation of
> https://github.com/ClangBuiltLinux/continuous-integration/issues/262
> because rekicking the tests usually fixes it.

Oh yep.

I was looking at the RCU warning, which I still don't understand, but
the lockup is presumably the same problem you hit with interrupts being
lost.

> We should probably just disable the torture tests like we do for x86_64
> for CI because we do not have access to QEMU 5.0.0 where this should be
> fixed. I believe it is slated for 4.2.1 as well but we still have to
> wait for that to be updated and packaged in Ubuntu.

You just need to start building Qemu HEAD as part of your CI ;)

cheers


Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)

2020-05-19 Thread Nathan Chancellor
On Tue, May 19, 2020 at 05:56:32PM -0700, 'Nick Desaulniers' via Clang Built Linux wrote:
> Looks like our CI is still red from this:
> 
> https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/builds/166854584
> 
> Filing a bug to follow up on:
> https://github.com/ClangBuiltLinux/linux/issues/1031
> 
> On Thu, May 7, 2020 at 8:29 PM Michael Ellerman  wrote:
> >
> > Nick Desaulniers  writes:
> > > Looks like ppc64le powernv_defconfig is suddenly failing the locking
> > > torture tests, then locks up?
> > > https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/329211572#L3111-L3167
> > > Any recent changes related here in -next?  I believe this is the first
> > > failure, so I'll report back if we see this again.
> >
> > Thanks for the report.
> >
> > There's nothing newly in next-20200507 that seems related.
> >
> > Odd that it just showed up.
> >
> > cheers
> >
> >
> > > -- Forwarded message -
> > > From: Travis CI 
> > > Date: Thu, May 7, 2020 at 9:40 AM
> > > > Subject: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)
> > > To: , 
> > >
> > >
> > > > ClangBuiltLinux / continuous-integration
> > > >
> > > > Branch: master
> > > > Build #1432 was broken
> > > > Duration: 7 hrs, 0 mins, and 54 secs
> > > >
> > > > Nick Desaulniers
> > > > 0aceafc
> > > >
> > > > Merge pull request #182 from ClangBuiltLinux/i386
> > > >
> > > > i386
> > > >
> > >
> > > > Travis CI GmbH, Rigaer Str. 8, 10427 Berlin, Germany | GF/CEO: Randy Jacops
> > > > | Contact: cont...@travis-ci.com | Amtsgericht Charlottenburg, Berlin, HRB
> > > > 140133 B | Umsatzsteuer-ID gemäß §27 a Umsatzsteuergesetz: DE282002648
> > >
> > >
> > > --
> > > Thanks,
> > > ~Nick Desaulniers
> 
> 
> 
> -- 
> Thanks,
> ~Nick Desaulniers
> 

This is probably still a manifestation of
https://github.com/ClangBuiltLinux/continuous-integration/issues/262
because rekicking the tests usually fixes it.
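
For illustration only, "rekicking" could be automated along these lines:
run the boot test under a timeout, treat a timeout as the hang, and try
again a bounded number of times. The function name, the attempt count, and
the timeout are all hypothetical, and this only masks the underlying QEMU
problem rather than fixing it.

```python
import subprocess
import sys


def run_with_retries(cmd, attempts=3, timeout_s=600):
    """Run cmd until it exits 0; return the attempt number, or None on failure.

    A run that exceeds timeout_s is treated as a hang: subprocess.run kills
    the child and raises TimeoutExpired, and the command is tried again.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"attempt {attempt}: no exit after {timeout_s}s, retrying",
                  file=sys.stderr)
            continue
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt}: exited with {result.returncode}",
              file=sys.stderr)
    return None
```

Here `cmd` would be whatever command line the CI uses to boot the kernel
in QEMU; that command is not shown in this thread.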

We should probably just disable the torture tests like we do for x86_64
for CI because we do not have access to QEMU 5.0.0 where this should be
fixed. I believe it is slated for 4.2.1 as well but we still have to
wait for that to be updated and packaged in Ubuntu.
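
A minimal sketch of what disabling the torture tests in a generated
.config could look like follows. The two symbol names are an assumption
(the thread does not say which options the CI toggles), and
`disable_symbols` is a hypothetical helper, not something from the CI
repository.

```python
import re

# Assumption: these are the Kconfig symbols behind "the torture tests";
# the thread itself does not name them.
TORTURE_SYMBOLS = ("CONFIG_LOCK_TORTURE_TEST", "CONFIG_RCU_TORTURE_TEST")


def disable_symbols(config_text, symbols=TORTURE_SYMBOLS):
    """Return .config text with each given symbol marked as not set."""
    wanted = set(symbols)
    seen = set()
    out = []
    for line in config_text.splitlines():
        # Matches both "CONFIG_FOO=y" and "# CONFIG_FOO is not set".
        match = re.match(r"^(?:# )?(CONFIG_\w+)(?:=| is not set)", line)
        if match and match.group(1) in wanted:
            seen.add(match.group(1))
            out.append(f"# {match.group(1)} is not set")
        else:
            out.append(line)
    # Symbols missing from the input are appended so the result is explicit.
    for symbol in symbols:
        if symbol not in seen:
            out.append(f"# {symbol} is not set")
    return "\n".join(out) + "\n"
```

After editing .config this way, `make olddefconfig` would still need to
run so Kconfig can resolve dependencies. The kernel tree's scripts/config
helper with -d does the same job from the shell.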

Relevant threads:

https://lore.kernel.org/linuxppc-dev/20200410205932.GA880@ubuntu-s3-xlarge-x86/

https://lore.kernel.org/qemu-devel/2020041431.465560-1-npig...@gmail.com/

Cheers,
Nathan


Re: Fwd: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)

2020-05-07 Thread Michael Ellerman
Nick Desaulniers  writes:
> Looks like ppc64le powernv_defconfig is suddenly failing the locking
> torture tests, then locks up?
> https://travis-ci.com/github/ClangBuiltLinux/continuous-integration/jobs/329211572#L3111-L3167
> Any recent changes related here in -next?  I believe this is the first
> failure, so I'll report back if we see this again.

Thanks for the report.

There's nothing newly in next-20200507 that seems related.

Odd that it just showed up.

cheers


> -- Forwarded message -
> From: Travis CI 
> Date: Thu, May 7, 2020 at 9:40 AM
> Subject: [CRON] Broken: ClangBuiltLinux/continuous-integration#1432 (master - 0aceafc)
> To: , 
>
>
> ClangBuiltLinux / continuous-integration
>
> Branch: master
> Build #1432 was broken
> Duration: 7 hrs, 0 mins, and 54 secs
>
> Nick Desaulniers
> 0aceafc
>
> Merge pull request #182 from ClangBuiltLinux/i386
>
> i386
>
>
>
> -- 
> Thanks,
> ~Nick Desaulniers