I've assigned https://issues.apache.org/jira/browse/FLINK-1680 to myself. Maybe Tachyon 0.7 will fix the issues.
On Tue, Aug 4, 2015 at 1:57 PM, Stephan Ewen <se...@apache.org> wrote:
> Yes.
>
> We should know, though, whether this is a Java 6 bug, or a bug in our
> system that just happens to occur only with Java 6 (because of different
> timings in this other engine)
>
> On Tue, Aug 4, 2015 at 12:27 PM, Chesnay Schepler <chesnay.schep...@fu-berlin.de> wrote:
>> Aren't we dropping Java 6 support?
>>
>> On 04.08.2015 12:21, Stephan Ewen wrote:
>>> The "StateCheckpointedITCase" has not failed so far, which also tests
>>> these guarantees thoroughly.
>>>
>>> But we need to first rule out the BarrierBuffer. The problem is that the
>>> bug occurs only on Java 6 and cannot be reproduced locally...
>>>
>>> On Tue, Aug 4, 2015 at 12:14 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
>>>> Honestly I don't think the partitioned state changes have anything to do
>>>> with the stability, only the reworked test case, which now tests proper
>>>> exactly-once, which was missing before.
>>>>
>>>> Stephan Ewen <se...@apache.org> wrote (on Tue, Aug 4, 2015, 12:12):
>>>>> Yes, the build stability is super serious right now.
>>>>>
>>>>> Here are the problems in question, and what we could do about this:
>>>>>
>>>>> BarrierBuffer:
>>>>> --------------------
>>>>> Barrier Buffer tests fail in Java 6 builds.
>>>>>
>>>>> I have not found a way to diagnose that problem, yet, but if we cannot
>>>>> find the issue today, I would be willing to revert my latest commits on
>>>>> the barrier buffer to increase the stability.
>>>>>
>>>>> StreamCheckpointingITCase
>>>>> -------------------------------------------
>>>>> This seems to have started with either the barrier buffer, or the
>>>>> updated partitioned state. If fixing/reverting the barrier buffer does
>>>>> not fix it, and no fix has come up until then, let's revert the latest
>>>>> changes to the partitioned state and re-add them when they are stable.
>>>>>
>>>>> Tachyon:
>>>>> -------------
>>>>> The Tachyon mini cluster has a problem: apparently, the programs exit
>>>>> with a sysexit or segfault.
>>>>>
>>>>> Since we have no Tachyon code ourselves, do we need this test as part of
>>>>> the nightly tests?
>>>>> Can we make this a "manual" test that we trigger on demand?
>>>>>
>>>>> Greetings,
>>>>> Stephan
>>>>>
>>>>> On Tue, Aug 4, 2015 at 11:41 AM, Aljoscha Krettek <aljos...@apache.org> wrote:
>>>>>> I've also seen this fail:
>>>>>> https://travis-ci.org/apache/flink/jobs/74025862
>>>>>> in SuccessAfterNetworkBuffersFailureITCase
>>>>>>
>>>>>> Build seems quite flaky recently.
>>>>>>
>>>>>> On Tue, 4 Aug 2015 at 10:27 Matthias J. Sax <mj...@informatik.hu-berlin.de> wrote:
>>>>>>> Rebased on:
>>>>>>> https://github.com/mjsax/flink/commit/fab61a1954ff1554448e826e1d273689ed520fc3
>>>>>>>
>>>>>>> But if the gap between two rebases is large, it's hard to say what
>>>>>>> the problem might be...
>>>>>>>
>>>>>>> The old parent commit (i.e., rebase before last rebase) was
>>>>>>> https://github.com/mjsax/flink/commit/148395bcd81a93bcb1473e4e93f267edb3b71c7e
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 08/04/2015 08:57 AM, Aljoscha Krettek wrote:
>>>>>>>> What are the commits that you rebased on? Could you maybe narrow
>>>>>>>> down what caused the regression?
>>>>>>>>
>>>>>>>> On Mon, 3 Aug 2015 at 23:31 Matthias J. Sax <mj...@informatik.hu-berlin.de> wrote:
>>>>>>>>> I only report failing tests after a rebase. ;)
>>>>>>>>>
>>>>>>>>> -Matthias
>>>>>>>>>
>>>>>>>>> On 08/03/2015 11:23 PM, Henry Saputra wrote:
>>>>>>>>>> Thanks for reporting it, Matthias. Will try to run Travis for
>>>>>>>>>> latest Flink.
>>>>>>>>>>
>>>>>>>>>> Tachyon test is a bit flaky. Maybe updating to latest release
>>>>>>>>>> could help.
>>>>>>>>>>
>>>>>>>>>> - Henry
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 3, 2015 at 2:18 PM, Matthias J. Sax
>>>>>>>>>> <mj...@informatik.hu-berlin.de> wrote:
>>>>>>>>>>> Today, not a single build completed successfully. Please see
>>>>>>>>>>> here:
>>>>>>>>>>>
>>>>>>>>>>> Flink Streaming Core:
>>>>>>>>>>> https://travis-ci.org/mjsax/flink/jobs/73938109
>>>>>>>>>>> https://travis-ci.org/mjsax/flink/jobs/73951362
>>>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938124
>>>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73899795
>>>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938122
>>>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73952441
>>>>>>>>>>>
>>>>>>>>>>> Flink Tachyon:
>>>>>>>>>>> https://travis-ci.org/apache/flink/jobs/73938123
>>>>>>>>>>>
>>>>>>>>>>> -Matthias