Re: NullPointerException in DeltaIteration when no ForwardedFileds annotation

Fabian Hueske Fri, 03 Apr 2015 13:34:38 -0700

Thanks for the nice setup!
I could easily reproduce the exception you are facing.
But that's the only good news so far :-(


I checked the plans and both are valid and should compute the correct
result for the program.
The split-of solution set delta is required because the it needs to be
repartitioned (without the annotation, the optimizer does not know that it
is in fact already correctly partitioned). One thing that made me a bit
suspicious is that the solution set delta partitioning is marked with a
Pipeline-Breaker. The pipeline breaker shouldn't make a semantic
difference, but I am not sure if it is really required and also that part
of the codebase was recently worked on.

So, a closer look and more debugging is necessary to figure out what not
working correctly here...


2015-04-03 14:14 GMT+02:00 Vasiliki Kalavri <[email protected]>:

> Hi Fabian,
>
> I am using the dblp co-authorship dataset from SNAP:
> http://snap.stanford.edu/data/com-DBLP.html
> I also pushed my slightly modified version of ConnectedComponents, here:
> https://github.com/vasia/flink/tree/cc-test. It basically generates the
> vertex dataset from the edges, so that you don't need to create it
> separately.
> The annotation that creates the error is in line #172.
>
> Thanks a lot :))
>
> -Vasia.
>
>
> On 3 April 2015 at 13:09, Fabian Hueske <[email protected]> wrote:
>
> > That looks pretty much like a bug.
> >
> > As you said, fwd fields annotations are optional and may improve the
> > performance of a program, but never change its semantics (if set
> > correctly).
> >
> > I'll have a look at it later.
> > Would be great if you could provide some data to reproduce the bug.
> > On Apr 3, 2015 12:48 PM, "Vasiliki Kalavri" <[email protected]>
> > wrote:
> >
> > > Hello to my squirrels,
> > >
> > > I've been getting a NullPointerException for a DeltaIteration program
> I'm
> > > trying to implement and I could really use your help :-)
> > > It seems that some of the input Tuples of the Join operator that I'm
> > using
> > > to create the next workset / solution set delta are null.
> > > It also seems that adding ForwardedFields annotations solves the issue.
> > >
> > > I managed to reproduce the behavior using the ConnectedComponents
> > example,
> > > by removing the "@ForwardedFieldsFirst("*")" annotation from
> > > the ComponentIdFilter join.
> > > The exception message is the following:
> > >
> > > Caused by: java.lang.NullPointerException
> > > at
> > >
> > >
> >
> org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:186)
> > > at
> > >
> > >
> >
> org.apache.flink.examples.java.graph.ConnectedComponents$ComponentIdFilter.join(ConnectedComponents.java:1)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.operators.JoinWithSolutionSetSecondDriver.run(JoinWithSolutionSetSecondDriver.java:198)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.iterative.task.AbstractIterativePactTask.run(AbstractIterativePactTask.java:139)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.iterative.task.IterationIntermediatePactTask.run(IterationIntermediatePactTask.java:92)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
> > > at
> > >
> > >
> >
> org.apache.flink.runtime.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:217)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > I get this error locally with any sufficiently big dataset (~10000
> > nodes).
> > > When the annotation is in place, it works without problem.
> > > I also generated the optimizer plans for the two cases:
> > > - with annotation (working):
> > > https://gist.github.com/vasia/4f4dc6b0cc6c72b5b64b
> > > - without annotation (failing):
> > > https://gist.github.com/vasia/086faa45b980bf7f4c09
> > >
> > > After visualizing the plans, the main difference I see is that in the
> > > working case, the next workset node and the solution set delta nodes
> are
> > > merged, while in the failing case they are separate.
> > >
> > > Shouldn't this work with and without annotation (but be more efficient
> > with
> > > the annotation in place)? Or am I missing something here?
> > >
> > > Thanks in advance for any help :))
> > >
> > > Cheers,
> > > - Vasia.
> > >
> >
>

Re: NullPointerException in DeltaIteration when no ForwardedFileds annotation

Reply via email to