Caches have been cleared again (see https://issues.apache.org/jira/browse/INFRA-11773). The first time did not help. This second request was more an act of desperation. :-( Let's see what happens now.
On Wed, Apr 27, 2016 at 3:24 PM, Maximilian Michels <m...@apache.org> wrote:
> +1 for making an effort to tackle test stability problems and
> potential involved bugs.
>
> On Wed, Apr 27, 2016 at 2:13 PM, Ufuk Celebi <u...@apache.org> wrote:
>> @Max: I think you wanted to look into whether we can use Apache's
>> Jenkins server for our builds instead of Travis. Did you ever get
>> around to looking into it? If yes: What's your opinion on replacing
>> Travis with Jenkins? Is it a viable option? Would it improve the
>> Travis-specific problems?
>
> I've experimented with the ASF Jenkins installation while setting up
> our nightly snapshot builds. I've observed that the build servers are
> pretty busy. I don't know how busy they are compared to the Travis
> servers and whether we could have more stable builds using Jenkins. I
> guess we would have to try over a period of time.
>
> I was hesitant to enable Jenkins for pull requests because I didn't
> want to spam the ASF servers with builds. Also, there are some
> remaining steps for a good integration like making the Yarn logs
> available (not hard to do though).
>
> What do you think about enabling Jenkins builds for the master and
> seeing how that goes?
>
> On Wed, Apr 27, 2016 at 2:54 PM, Ufuk Celebi <u...@apache.org> wrote:
>> Filed an issue with INFRA: https://issues.apache.org/jira/browse/INFRA-11773
>>
>> @Robert: I agree, but still we see failing builds over and over again.
>> At best it is annoying, at worst it "hides" new bugs being introduced.
>>
>> On Wed, Apr 27, 2016 at 2:41 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>> It is good to hear that we can so easily solve most of the failing
>>> builds. We should then iterate over the open test-stability issues to see
>>> whether they are still valid after we've merged PR 1915.
>>>
>>> On Wed, Apr 27, 2016 at 2:25 PM, Robert Metzger <rmetz...@apache.org> wrote:
>>>
>>>> I'm not sure if the issue is as big as it seems at first sight.
>>>> The reason why all the builds of master are red on Travis is that the cache
>>>> of the 5th build is invalid. We have to ask infra to delete the caches and
>>>> then they'll be green again.
>>>>
>>>> On Wed, Apr 27, 2016 at 2:13 PM, Ufuk Celebi <u...@apache.org> wrote:
>>>>
>>>> > Along the lines of what Greg already mentioned, I would like to
>>>> > re-iterate that Travis is often a problem too:
>>>> > - long build times and we are reaching the time limit
>>>> > - unreliable I/O
>>>> > - unreliable resolving of build dependencies
>>>> >
>>>> > @Max: I think you wanted to look into whether we can use Apache's
>>>> > Jenkins server for our builds instead of Travis. Did you ever get
>>>> > around to looking into it? If yes: What's your opinion on replacing
>>>> > Travis with Jenkins? Is it a viable option? Would it improve the
>>>> > Travis-specific problems?
>>>> >
>>>> > On the other hand, the very slow Travis machines also helped
>>>> > discover some hard-to-catch race conditions.
>>>> >
>>>> > – Ufuk
>>>> >
>>>> >
>>>> > On Wed, Apr 27, 2016 at 2:01 PM, Greg Hogan <c...@greghogan.com> wrote:
>>>> > > We have also started running over Travis' 2 hour limit for the longest
>>>> > > build.
>>>> > >
>>>> > > Greg
>>>> > >
>>>> > >
>>>> > >> On Apr 27, 2016, at 7:53 AM, Ufuk Celebi <u...@apache.org> wrote:
>>>> > >>
>>>> > >> Hi Till,
>>>> > >>
>>>> > >> thank you for bringing this up. We really need to fix this.
>>>> > >>
>>>> > >> Filing JIRAs with critical priority was how we tried to solve it in
>>>> > >> the past, but obviously it did not work. There seems to be a mismatch
>>>> > >> between assigned and actual priorities.
>>>> > >>
>>>> > >> As a first step, I would volunteer to gather a list of tests, which
>>>> > >> have failed in the last weeks and make sure that we have JIRAs for
>>>> > >> them.
>>>> > >>
>>>> > >> As a next step, we should coordinate how to resolve those issues
>>>> > >> (maybe prioritized by failure frequency) to get master stable again.
>>>> > >>
>>>> > >> – Ufuk
>>>> > >>
>>>> > >>
>>>> > >>> On Wed, Apr 27, 2016 at 12:12 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>> > >>> Hi Flink community,
>>>> > >>>
>>>> > >>> I just wanted to raise awareness that in the last 16 days there was just a
>>>> > >>> single Travis build of master which passed all tests. This indicates that
>>>> > >>> we have some serious problems with our test stability or, even worse, a
>>>> > >>> problem with the master itself. Having an unstable master makes it really
>>>> > >>> hard to assess whether new changes actually broke something or whether the
>>>> > >>> failing test was unrelated.
>>>> > >>>
>>>> > >>> We currently have 37 open issues labeled with test-stability and most of
>>>> > >>> them have a critical priority. Therefore, I would propose that we try to
>>>> > >>> tackle them as soon as possible in order to improve our testing stability.
>>>> > >>>
>>>> > >>> Cheers,
>>>> > >>> Till
>>>> > >>>>