Okay, that fixes my errors.

On Sat, Jul 8, 2017 at 1:41 AM Pramod Immaneni <pra...@datatorrent.com> wrote:
> Hi Manu,
>
> Can you refresh your netlet dependency? There was a respin of a release,
> which usually doesn't happen. You can do this by deleting the contents of
> your ~/.m2/repository/com/datatorrent/netlet/ folder and rebuilding, which
> will fetch the netlet dependency again.
>
> Thanks
>
> On Fri, Jul 7, 2017 at 8:09 AM, Manu Zhang <owenzhang1...@gmail.com>
> wrote:
>
> > Hey guys, I'd like to offer some input.
> >
> > The test also fails locally on my Mac with the following error (so
> > WriteOperation#finalize is not called).
> >
> > java.lang.NullPointerException
> >   at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
> >   at org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:449)
> >
> > The error is in the following line, where `Slice` is given a null array
> > when the `key` is null
> >
> >   keyBytes = (key != null) ? new Slice(CoderUtils.encodeToByteArray(keyCoder, key)) :
> >       new Slice(null);
> >
> > which doesn't look right given its constructor (the array cannot be null).
> >
> >   public Slice(byte[] array) {
> >     this.buffer = array;
> >     this.offset = 0;
> >     this.length = array.length;
> >   }
> >
> > On Fri, Jul 7, 2017 at 4:35 AM Reuven Lax <re...@google.com.invalid>
> > wrote:
> >
> > > Thomas, any suggestions on what we should do? Do you have an idea what's
> > > going on, or should we exclude this test for now until you have time to
> > > look at it?
> > >
> > > Reuven
> > >
> > > On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax <re...@google.com> wrote:
> > >
> > > > I wonder if the watermark is accidentally advancing too early, causing
> > > > Apex to shut down the pipeline before the final finalize DoFn executes?
> > > >
> > > > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise <t...@apache.org> wrote:
> > > >
> > > >> I don't think this is a problem with the test, and if anything this
> > > >> problem shows me that the test is useful in catching similar issues
> > > >> during unit test runs.
> > > >>
> > > >> Is there any form of asynchronous/trigger-based processing in this
> > > >> pipeline that could cause this?
> > > >>
> > > >> The Apex runner will shut down the pipeline after the final watermark.
> > > >> The shutdown signal traverses the pipeline just like a watermark, but
> > > >> it is not seen by user code.
> > > >>
> > > >> Thomas
> > > >>
> > > >> --
> > > >> sent from mobile
> > > >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" <k...@google.com.invalid> wrote:
> > > >>
> > > >> > Upon further investigation, this test always writes to
> > > >> > ./target/wordcountresult-00000-of-00002 and
> > > >> > ./target/wordcountresult-00001-of-00002. So after a successful test
> > > >> > run, any further run without a `clean` will spuriously succeed. I was
> > > >> > running via IntelliJ so did not do the ritual `mvn clean` workaround.
> > > >> > So reproduction appears to be easy and we could fix the test (if we
> > > >> > don't remove it) to use a fresh temp dir.
> > > >> >
> > > >> > This seems to point to a bug in waitUntilFinish() and/or Apex if the
> > > >> > topology is shut down before this ParDo is run. This is a ParDo with
> > > >> > trivial bounded input but with side inputs. So I would guess the bug
> > > >> > is either in watermark tracking / readiness of the side input or just
> > > >> > how PushbackSideInputDoFnRunner is used.
> > > >> >
> > > >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax <re...@google.com.invalid>
> > > >> > wrote:
> > > >> >
> > > >> > > I've done a bit more debugging with logging.
> > > >> > > It appears that the finalize ParDo is never being invoked in this
> > > >> > > Apex test (or at least the LOG.info in that ParDo never runs). This
> > > >> > > ParDo is run on a constant element (code snippet below), so it
> > > >> > > should always run.
> > > >> > >
> > > >> > > PCollection<Void> singletonCollection = p.apply(Create.of((Void) null));
> > > >> > > singletonCollection
> > > >> > >     .apply("Finalize", ParDo.of(new DoFn<Void, Integer>() {
> > > >> > >       @ProcessElement
> > > >> > >       public void processElement(ProcessContext c) throws Exception {
> > > >> > >         LOG.info("Finalizing write operation {}.", writeOperation);
> > > >> > >
> > > >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles <k...@google.com.invalid>
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Data-dependent file destinations is a pretty great feature. We also
> > > >> > > > have another change to make to this @Experimental feature, and it
> > > >> > > > would be nice to get them both into 2.1.0 if we can unblock this
> > > >> > > > quickly.
> > > >> > > >
> > > >> > > > I just tried this too, and failed to reproduce it. But Jenkins and
> > > >> > > > Reuven both have a reliable repro.
> > > >> > > >
> > > >> > > > Questions:
> > > >> > > >
> > > >> > > > - Any ideas about how these configurations differ?
> > > >> > > > - Does this actually affect users?
> > > >> > > > - Once we have another test that catches this issue, can we delete
> > > >> > > >   this test?
> > > >> > > >
> > > >> > > > Every other test passes, including the actual example WordCountIT.
> > > >> > > > Since the PR doesn't change primitives, it also seems like it is an
> > > >> > > > existing issue. And the test seems redundant with our other testing
> > > >> > > > but won't get as much maintenance attention.
> > > >> > > > I don't want to stop catching whatever this issue is, though.
> > > >> > > >
> > > >> > > > Kenn
> > > >> > > >
> > > >> > > > On Wed, Jul 5, 2017 at 10:31 AM, Reuven Lax <re...@google.com.invalid>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > > > Hi Thomas,
> > > >> > > > >
> > > >> > > > > This only happens with https://github.com/apache/beam/pull/3356 .
> > > >> > > > >
> > > >> > > > > Reuven
> > > >> > > > >
> > > >> > > > > On Mon, Jul 3, 2017 at 6:11 AM, Thomas Weise <t...@apache.org> wrote:
> > > >> > > > >
> > > >> > > > > > Hi Reuven,
> > > >> > > > > >
> > > >> > > > > > I'm not able to reproduce the issue locally. I was hoping to see
> > > >> > > > > > which thread is attempting to emit the results. In Apex, only the
> > > >> > > > > > operator thread can emit the results; any other thread that is
> > > >> > > > > > launched by the operator cannot. I'm not aware of ParDo managing
> > > >> > > > > > separate threads though and assume this must be a race. If you
> > > >> > > > > > still have the log, can you send it to me?
> > > >> > > > > >
> > > >> > > > > > Thanks,
> > > >> > > > > > Thomas
> > > >> > > > > >
> > > >> > > > > > On Sat, Jul 1, 2017 at 5:51 AM, Reuven Lax <re...@google.com.invalid>
> > > >> > > > > > wrote:
> > > >> > > > > >
> > > >> > > > > > > pr/3356 fails in the Apex WordCountTest.
> > > >> > > > > > > The failed test is here:
> > > >> > > > > > > <https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/12829/org.apache.beam$beam-runners-apex/testReport/org.apache.beam.runners.apex.examples/WordCountTest/testWordCountExample/>
> > > >> > > > > > >
> > > >> > > > > > > Upon debugging, it looks like this is likely a problem in the Apex
> > > >> > > > > > > runner itself. A ParDo calls output(), and that triggers an
> > > >> > > > > > > exception thrown from inside the Apex runner. The Apex runner calls
> > > >> > > > > > > emit on a DefaultOutputPort (ApexParDoOperator.java:275), and that
> > > >> > > > > > > throws an exception inside of verifyOperatorThread().
> > > >> > > > > > >
> > > >> > > > > > > I'm going to ignore this failure for now as it seems unrelated to
> > > >> > > > > > > my PR, but does someone want to take a look?
> > > >> > > > > > >
> > > >> > > > > > > Reuven
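
For anyone reading this thread later: the NullPointerException in Manu's message can be reproduced without Apex or netlet on the classpath. The sketch below is only an illustration. `SliceSketch` is a hypothetical stand-in that copies the constructor quoted in the thread, and `NullKeySketch.sliceForKey` is a hypothetical helper, not code from Beam or netlet. Mapping a null key to an empty byte array is an assumption made for the example; the fix that worked in this thread was refreshing the netlet dependency.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in mirroring the Slice constructor quoted in the thread. */
final class SliceSketch {
  final byte[] buffer;
  final int offset;
  final int length;

  SliceSketch(byte[] array) {
    this.buffer = array;
    this.offset = 0;
    this.length = array.length; // dereferences array, so a null array throws here
  }
}

final class NullKeySketch {
  private static final byte[] EMPTY = new byte[0];

  /**
   * Hypothetical helper: encodes the key as UTF-8 and maps a null key to an
   * empty array (an assumption for illustration, not the Beam implementation).
   */
  static SliceSketch sliceForKey(String key) {
    byte[] bytes = (key != null) ? key.getBytes(StandardCharsets.UTF_8) : EMPTY;
    return new SliceSketch(bytes);
  }

  /** Returns true if passing a null array to the stand-in constructor throws an NPE. */
  static boolean nullArrayThrows() {
    try {
      new SliceSketch(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("null array throws NPE: " + nullArrayThrows());
    System.out.println("null key slice length: " + sliceForKey(null).length);
  }
}
```

One caveat with this sketch: a null key and a key that encodes to zero bytes become indistinguishable, so whether that is acceptable depends on how the key bytes are used afterwards.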