I've also seen the BufferSpillerTest fail:
https://travis-ci.org/apache/flink/jobs/74057503


On Tue, 4 Aug 2015 at 14:10 Robert Metzger <rmetz...@apache.org> wrote:

> I've assigned https://issues.apache.org/jira/browse/FLINK-1680 to myself.
> Maybe Tachyon 0.7 will fix the issues.
>
> On Tue, Aug 4, 2015 at 1:57 PM, Stephan Ewen <se...@apache.org> wrote:
>
> > Yes.
> >
> > We should know, though, whether this is a Java 6 bug, or a bug in our
> > system that just happens to occur only with Java 6 (because of different
> > timings in this other engine)
> >
> > On Tue, Aug 4, 2015 at 12:27 PM, Chesnay Schepler <
> > chesnay.schep...@fu-berlin.de> wrote:
> >
> > > Aren't we dropping java 6 support?
> > >
> > >
> > > On 04.08.2015 12:21, Stephan Ewen wrote:
> > >
> > >> The "StateCheckpointedITCase" has not failed so far, which also tests these
> > >> guarantees thoroughly.
> > >>
> > >> But we need to first rule out the BarrierBuffer. The problem is that the
> > >> bug occurs only on Java 6 and cannot be reproduced locally...
> > >>
> > >> On Tue, Aug 4, 2015 at 12:14 PM, Gyula Fóra <gyula.f...@gmail.com>
> > >> wrote:
> > >>
> > >>> Honestly, I don't think the partitioned state changes have anything to do
> > >>> with the stability, only the reworked test case, which now tests proper
> > >>> exactly-once, which was missing before.
> > >>>
> > >>> Stephan Ewen <se...@apache.org> wrote (on Tue, Aug 4, 2015, 12:12):
> > >>>
> > >>>> Yes, the build stability is super serious right now.
> > >>>>
> > >>>> Here are the problems in question, and what we could do about this:
> > >>>>
> > >>>>
> > >>>>
> > >>>> BarrierBuffer:
> > >>>> --------------------
> > >>>> Barrier Buffer tests fail in Java 6 builds.
> > >>>>
> > >>>> I have not found a way to diagnose that problem yet, but if we cannot find
> > >>>> the issue today, I would be willing to revert my latest commits on the
> > >>>> barrier buffer to increase the stability.
> > >>>>
> > >>>>
> > >>>> StreamCheckpointingITCase
> > >>>> -------------------------------------------
> > >>>> This seems to have started with either the barrier buffer or the updated
> > >>>> partitioned state. If fixing/reverting the barrier buffer does not fix it,
> > >>>> and no fix has come up until then, let's revert the latest changes to the
> > >>>> partitioned state and re-add them when they are stable.
> > >>>>
> > >>>>
> > >>>> Tachyon:
> > >>>> -------------
> > >>>> The Tachyon mini cluster has a problem: apparently, the programs exit with
> > >>>> a sysexit or segfault.
> > >>>>
> > >>>> Since we have no Tachyon code ourselves, do we need this test as part of
> > >>>> the nightly tests?
> > >>>> Can we make this a "manual" test that we trigger on demand?
> > >>>>
> > >>>>
> > >>>>
> > >>>> Greetings,
> > >>>> Stephan
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Tue, Aug 4, 2015 at 11:41 AM, Aljoscha Krettek <aljos...@apache.org>
> > >>>> wrote:
> > >>>>
> > >>>>> I've also seen this fail:
> > >>>>> https://travis-ci.org/apache/flink/jobs/74025862
> > >>>>> in SuccessAfterNetworkBuffersFailureITCase
> > >>>>>
> > >>>>> Build seems quite flaky recently.
> > >>>>>
> > >>>>> On Tue, 4 Aug 2015 at 10:27 Matthias J. Sax <mj...@informatik.hu-berlin.de>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Rebased on:
> > >>>>>> https://github.com/mjsax/flink/commit/fab61a1954ff1554448e826e1d273689ed520fc3
> > >>>>>>
> > >>>>>> But if the gap between two rebases is large, it's hard to say what the
> > >>>>>> problem might be...
> > >>>>>>
> > >>>>>> The old parent commit (i.e., the rebase before the last rebase) was
> > >>>>>> https://github.com/mjsax/flink/commit/148395bcd81a93bcb1473e4e93f267edb3b71c7e
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>> On 08/04/2015 08:57 AM, Aljoscha Krettek wrote:
> > >>>>>>
> > >>>>>>> What are the commits that you rebased on? Could you maybe narrow down
> > >>>>>>> what caused the regression?
> > >>>>>>>
> > >>>>>>> On Mon, 3 Aug 2015 at 23:31 Matthias J. Sax <mj...@informatik.hu-berlin.de>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> I only report failing tests after a rebase. ;)
> > >>>>>>>>
> > >>>>>>>> -Matthias
> > >>>>>>>>
> > >>>>>>>> On 08/03/2015 11:23 PM, Henry Saputra wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for reporting it, Matthias. Will try to run Travis for the latest
> > >>>>>>>>> Flink.
> > >>>>>>>>>
> > >>>>>>>>> Tachyon test is a bit flaky. Maybe updating to the latest release could
> > >>>>>>>>> help.
> > >>>>>>>>>
> > >>>>>>>>> - Henry
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Aug 3, 2015 at 2:18 PM, Matthias J. Sax
> > >>>>>>>>> <mj...@informatik.hu-berlin.de> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Today, not a single build was completely successful. Please see here:
> > >>>>>>>>>>
> > >>>>>>>>>> Flink Streaming Core:
> > >>>>>>>>>> https://travis-ci.org/mjsax/flink/jobs/73938109
> > >>>>>>>>>> https://travis-ci.org/mjsax/flink/jobs/73951362
> > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938124
> > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73899795
> > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938122
> > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73952441
> > >>>>>>>>>>
> > >>>>>>>>>> Flink Tachyon:
> > >>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938123
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> -Matthias
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >
> >
>
