Re: Failure in Apex runner

2017-07-09 Thread Kenneth Knowles
On that same subject: in the SDK-agnostic proto for a pipeline there is no
such thing as a main output [1]. The distinction between single and
multiple output ParDo is SDK-specific.

Kenn

[1]
https://github.com/apache/beam/blob/master/sdks/common/runner-api/src/main/proto/beam_runner_api.proto#L157

On Sun, Jul 9, 2017 at 8:38 AM, Reuven Lax  wrote:

> Yes. Semantically all outputs from a ParDo are equivalent, so the watermark
> should traverse them all. The only reason a "default" output exists is for
> convenience so we don't force users to always specify an output tag.
>
> On Sun, Jul 9, 2017 at 12:03 AM, Thomas Weise  wrote:
>
> > This error turns out to be deterministic and debug friendly :) I enabled
> > trace and found that the watermark "disappears" between the following two
> > operators:
> >
> > [8/WriteCounts/WriteFiles/WriteBundles:ApexParDoOperator]
> >
> > [10/WriteCounts/WriteFiles/GroupUnwritten:ApexGroupByKeyOperator]
> > GroupUnwritten takes input from additional outputs, but the watermark is
> > only emitted on the main output. When I modify ApexParDoOperator to emit
> > the watermark also on additionalOutput1, it traverses the pipeline and
> > the test passes.
> >
> > Are watermarks supposed to be written to additional outputs?
> >
> > Thanks,
> > Thomas
> >
> >
> > On Thu, Jul 6, 2017 at 1:35 PM, Reuven Lax 
> > wrote:
> >
> > > Thomas, any suggestions on what we should do? Do you have an idea
> what's
> > > going on, or should we exclude this test for now until you have time to
> > > look at it?
> > >
> > > Reuven
> > >
> > > On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
> > >
> > > > I wonder if the watermark is accidentally advancing too early,
> causing
> > > > Apex to shut down the pipeline before the final finalize DoFn
> executes?
> > > >
> > > > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> > > >
> > > >> I don't think this is a problem with the test and if anything this
> > > problem
> > > >> to me shows the test is useful in catching similar issues during
> unit
> > > test
> > > >> runs.
> > > >>
> > > >> Is there any form of asynchronous/trigger based processing in this
> > > >> pipeline
> > > >> that could cause this?
> > > >>
> > > >> The Apex runner will shutdown the pipeline after the final
> watermark,
> > > the
> > > >> shutdown signal traverses the pipeline just like a watermark, but it
> > is
> > > >> not
> > > >> seen by user code.
> > > >>
> > > >> Thomas
> > > >>
> > > >> --
> > > >> sent from mobile
> > > >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> > > wrote:
> > > >>
> > > >> > Upon further investigation, this tests always writes to
> > > >> > ./target/wordcountresult-0-of-2 and
> > > >> > ./target/wordcountresult-1-of-2. So after a successful
> test
> > > >> run,
> > > >> > any further run without a `clean` will spuriously succeed. I was
> > > running
> > > >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> > > >> > reproduction appears to be easy and we could fix the test (if we
> > don't
> > > >> > remove it) to use a fresh temp dir.
> > > >> >
> > > >> > This seems to point to a bug in waitUntilFinish() and/or Apex if
> the
> > > >> > topology is shut down before this ParDo is run. This is a ParDo
> with
> > > >> > trivial bounded input but with side inputs. So I would guess the
> bug
> > > is
> > > >> > either in watermark tracking / readiness of the side input or just
> > how
> > > >> > PushbackSideInputDoFnRunner is used.
> > > >> >
> > > >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax
> >  > > >
> > > >> > wrote:
> > > >> >
> > > >> > > I've done a bit more debugging with logging. It appears that the
> > > >> finalize
> > > >> > > ParDo is never being invoked in this Apex test (or at least the
> > > >> LOG.info
> > > >> > in
> > > >> > > that ParDo never runs). This ParDo is run on a constant element
> > > (code
> > > >> > > snippet below), so it should always run.
> > > >> > >
> > > >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> > > >> null));
> > > >> > > singletonCollection
> > > >> > > .apply("Finalize", ParDo.of(new DoFn() {
> > > >> > >   @ProcessElement
> > > >> > >   public void processElement(ProcessContext c) throws
> > Exception
> > > {
> > > >> > > LOG.info("Finalizing write operation {}.",
> > writeOperation);
> > > >> > >
> > > >> > >
> > > >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> > > >>  > > >> > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Data-dependent file destinations is a pretty great feature. We
> > > also
> > > >> > have
> > > >> > > > another change to make to this @Experimental feature, and it
> > would
> > > >> be
> > > >> > > nice
> > > >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> > > >> 

Re: Failure in Apex runner

2017-07-09 Thread Reuven Lax
Yes. Semantically all outputs from a ParDo are equivalent, so the watermark
should traverse them all. The only reason a "default" output exists is for
convenience so we don't force users to always specify an output tag.
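
To illustrate the point about the watermark traversing every output, here is a minimal sketch. It is not the ApexParDoOperator code; the class `WatermarkFanOutSketch`, its port map, and the port names passed in `main` are hypothetical stand-ins chosen for illustration. The only idea it shows is that the watermark is forwarded to the main output and to each additional output alike, so a consumer attached to an additional output is not left waiting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a multi-output operator; not the real ApexParDoOperator. */
class WatermarkFanOutSketch {
  /** Watermarks received so far by each output port, keyed by a hypothetical port name. */
  final Map<String, List<Long>> watermarksByPort = new LinkedHashMap<>();

  WatermarkFanOutSketch(String mainPort, List<String> additionalPorts) {
    watermarksByPort.put(mainPort, new ArrayList<>());
    for (String port : additionalPorts) {
      watermarksByPort.put(port, new ArrayList<>());
    }
  }

  /** Forwards the watermark to every output, main and additional alike. */
  void emitWatermark(long timestampMillis) {
    for (List<Long> port : watermarksByPort.values()) {
      port.add(timestampMillis);
    }
  }

  public static void main(String[] args) {
    WatermarkFanOutSketch op =
        new WatermarkFanOutSketch("output", Arrays.asList("additionalOutput1"));
    op.emitWatermark(1000L);
    // Both ports should now hold the same watermark.
    System.out.println(op.watermarksByPort);
  }
}
```

Treating all ports uniformly in one loop, rather than special-casing the main output, matches the view that the "default" output is only a convenience.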

On Sun, Jul 9, 2017 at 12:03 AM, Thomas Weise  wrote:

> This error turns out to be deterministic and debug friendly :) I enabled
> trace and found that the watermark "disappears" between the following two
> operators:
>
> [8/WriteCounts/WriteFiles/WriteBundles:ApexParDoOperator]
>
> [10/WriteCounts/WriteFiles/GroupUnwritten:ApexGroupByKeyOperator]
> GroupUnwritten takes input from additional outputs, but the watermark is
> only emitted on the main output. When I modify ApexParDoOperator to emit
> the watermark also on additionalOutput1, it traverses the pipeline and the
> test passes.
>
> Are watermarks supposed to be written to additional outputs?
>
> Thanks,
> Thomas
>
>
> On Thu, Jul 6, 2017 at 1:35 PM, Reuven Lax 
> wrote:
>
> > Thomas, any suggestions on what we should do? Do you have an idea what's
> > going on, or should we exclude this test for now until you have time to
> > look at it?
> >
> > Reuven
> >
> > On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
> >
> > > I wonder if the watermark is accidentally advancing too early, causing
> > > Apex to shut down the pipeline before the final finalize DoFn executes?
> > >
> > > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> > >
> > >> I don't think this is a problem with the test and if anything this
> > problem
> > >> to me shows the test is useful in catching similar issues during unit
> > test
> > >> runs.
> > >>
> > >> Is there any form of asynchronous/trigger based processing in this
> > >> pipeline
> > >> that could cause this?
> > >>
> > >> The Apex runner will shutdown the pipeline after the final watermark,
> > the
> > >> shutdown signal traverses the pipeline just like a watermark, but it
> is
> > >> not
> > >> seen by user code.
> > >>
> > >> Thomas
> > >>
> > >> --
> > >> sent from mobile
> > >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> > wrote:
> > >>
> > >> > Upon further investigation, this tests always writes to
> > >> > ./target/wordcountresult-0-of-2 and
> > >> > ./target/wordcountresult-1-of-2. So after a successful test
> > >> run,
> > >> > any further run without a `clean` will spuriously succeed. I was
> > running
> > >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> > >> > reproduction appears to be easy and we could fix the test (if we
> don't
> > >> > remove it) to use a fresh temp dir.
> > >> >
> > >> > This seems to point to a bug in waitUntilFinish() and/or Apex if the
> > >> > topology is shut down before this ParDo is run. This is a ParDo with
> > >> > trivial bounded input but with side inputs. So I would guess the bug
> > is
> > >> > either in watermark tracking / readiness of the side input or just
> how
> > >> > PushbackSideInputDoFnRunner is used.
> > >> >
> > >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax
>  > >
> > >> > wrote:
> > >> >
> > >> > > I've done a bit more debugging with logging. It appears that the
> > >> finalize
> > >> > > ParDo is never being invoked in this Apex test (or at least the
> > >> LOG.info
> > >> > in
> > >> > > that ParDo never runs). This ParDo is run on a constant element
> > (code
> > >> > > snippet below), so it should always run.
> > >> > >
> > >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> > >> null));
> > >> > > singletonCollection
> > >> > > .apply("Finalize", ParDo.of(new DoFn() {
> > >> > >   @ProcessElement
> > >> > >   public void processElement(ProcessContext c) throws
> Exception
> > {
> > >> > > LOG.info("Finalizing write operation {}.",
> writeOperation);
> > >> > >
> > >> > >
> > >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> > >>  > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Data-dependent file destinations is a pretty great feature. We
> > also
> > >> > have
> > >> > > > another change to make to this @Experimental feature, and it
> would
> > >> be
> > >> > > nice
> > >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> > >> > > >
> > >> > > > I just tried this too, and failed to reproduce it. But Jenkins
> and
> > >> > Reuven
> > >> > > > both have a reliable repro.
> > >> > > >
> > >> > > > Questionss:
> > >> > > >
> > >> > > >  - Any ideas about how these configurations differ?
> > >> > > >  - Does this actually affect users?
> > >> > > >  - Once we have another test that catches this issue, can we
> > delete
> > >> > this
> > >> > > > test?
> > >> > > >
> > >> > > > Every other test passes, including the actual example
> WordCountIT.
> > >> > Since
> > >> > > > the PR doesn't change primitives, it also seems like it is an
> > >> existing
> > >> > > > 

Re: Failure in Apex runner

2017-07-09 Thread Thomas Weise
This error turns out to be deterministic and debug-friendly :) I enabled
trace and found that the watermark "disappears" between the following two
operators:

[8/WriteCounts/WriteFiles/WriteBundles:ApexParDoOperator]

[10/WriteCounts/WriteFiles/GroupUnwritten:ApexGroupByKeyOperator]
GroupUnwritten takes input from additional outputs, but the watermark is
only emitted on the main output. When I modify ApexParDoOperator to emit
the watermark also on additionalOutput1, it traverses the pipeline and the
test passes.

Are watermarks supposed to be written to additional outputs?

Thanks,
Thomas


On Thu, Jul 6, 2017 at 1:35 PM, Reuven Lax  wrote:

> Thomas, any suggestions on what we should do? Do you have an idea what's
> going on, or should we exclude this test for now until you have time to
> look at it?
>
> Reuven
>
> On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
>
> > I wonder if the watermark is accidentally advancing too early, causing
> > Apex to shut down the pipeline before the final finalize DoFn executes?
> >
> > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> >
> >> I don't think this is a problem with the test and if anything this
> problem
> >> to me shows the test is useful in catching similar issues during unit
> test
> >> runs.
> >>
> >> Is there any form of asynchronous/trigger based processing in this
> >> pipeline
> >> that could cause this?
> >>
> >> The Apex runner will shutdown the pipeline after the final watermark,
> the
> >> shutdown signal traverses the pipeline just like a watermark, but it is
> >> not
> >> seen by user code.
> >>
> >> Thomas
> >>
> >> --
> >> sent from mobile
> >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> wrote:
> >>
> >> > Upon further investigation, this tests always writes to
> >> > ./target/wordcountresult-0-of-2 and
> >> > ./target/wordcountresult-1-of-2. So after a successful test
> >> run,
> >> > any further run without a `clean` will spuriously succeed. I was
> running
> >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> >> > reproduction appears to be easy and we could fix the test (if we don't
> >> > remove it) to use a fresh temp dir.
> >> >
> >> > This seems to point to a bug in waitUntilFinish() and/or Apex if the
> >> > topology is shut down before this ParDo is run. This is a ParDo with
> >> > trivial bounded input but with side inputs. So I would guess the bug
> is
> >> > either in watermark tracking / readiness of the side input or just how
> >> > PushbackSideInputDoFnRunner is used.
> >> >
> >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax  >
> >> > wrote:
> >> >
> >> > > I've done a bit more debugging with logging. It appears that the
> >> finalize
> >> > > ParDo is never being invoked in this Apex test (or at least the
> >> LOG.info
> >> > in
> >> > > that ParDo never runs). This ParDo is run on a constant element
> (code
> >> > > snippet below), so it should always run.
> >> > >
> >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> >> null));
> >> > > singletonCollection
> >> > > .apply("Finalize", ParDo.of(new DoFn() {
> >> > >   @ProcessElement
> >> > >   public void processElement(ProcessContext c) throws Exception
> {
> >> > > LOG.info("Finalizing write operation {}.", writeOperation);
> >> > >
> >> > >
> >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> >>  >> > >
> >> > > wrote:
> >> > >
> >> > > > Data-dependent file destinations is a pretty great feature. We
> also
> >> > have
> >> > > > another change to make to this @Experimental feature, and it would
> >> be
> >> > > nice
> >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> >> > > >
> >> > > > I just tried this too, and failed to reproduce it. But Jenkins and
> >> > Reuven
> >> > > > both have a reliable repro.
> >> > > >
> >> > > > Questionss:
> >> > > >
> >> > > >  - Any ideas about how these configurations differ?
> >> > > >  - Does this actually affect users?
> >> > > >  - Once we have another test that catches this issue, can we
> delete
> >> > this
> >> > > > test?
> >> > > >
> >> > > > Every other test passes, including the actual example WordCountIT.
> >> > Since
> >> > > > the PR doesn't change primitives, it also seems like it is an
> >> existing
> >> > > > issue. And the test seems redundant with our other testing but
> won't
> >> > get
> >> > > as
> >> > > > much maintenance attention. I don't want to stop catching whatever
> >> this
> >> > > > issue is, though.
> >> > > >
> >> > > > Kenn
> >> > > >
> >> > > > On Wed, Jul 5, 2017 at 10:31 AM, Reuven Lax
> >> 
> >> > > > wrote:
> >> > > >
> >> > > > > Hi Thomas,
> >> > > > >
> >> > > > > This only happens with https://github.com/apache/beam/pull/3356
> .
> >> > > > >
> >> > > > > Reuven
> >> > > > >
> >> > > > > On Mon, Jul 3, 2017 

Re: Failure in Apex runner

2017-07-07 Thread Manu Zhang
Okay, that fixes my errors.

On Sat, Jul 8, 2017 at 1:41 AM Pramod Immaneni 
wrote:

> Hi Manu,
>
> Can you refresh your netlet dependency. There was a respin of a release
> that usually doesn't happen. You could do this by deleting contents of
> your ~/.m2/repository/com/datatorrent/netlet/ folder and rebuilding, which
> will fetch the netlet dependency again.
>
> Thanks
>
> On Fri, Jul 7, 2017 at 8:09 AM, Manu Zhang 
> wrote:
>
> > Hey guys, I'd like to offer some input.
> >
> > The test also fails locally on my Mac with the following error. (so
> > WriteOperation#finalize is not called)
> >
> > java.lang.NullPointerException
> > at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
> > at
> > org.apache.beam.runners.apex.translation.utils.ApexStateInternals$
> > ApexStateInternalsFactory.stateInternalsForKey(
> > ApexStateInternals.java:449)
> >
> > The error is in the following line,  where `Slice` takes a null value
> when
> > the `key` is null
> >
> > keyBytes = (key != null) ? new
> > Slice(CoderUtils.encodeToByteArray(keyCoder, key)) :
> >   new Slice(null);
> >
> > while it doesn't look right from its constructor (array can not be null).
> >
> > public Slice(byte[] array) {
> >   this.buffer = array;
> >   this.offset = 0;
> >   this.length = array.length;
> > }
> >
> >
> > On Fri, Jul 7, 2017 at 4:35 AM Reuven Lax 
> > wrote:
> >
> > > Thomas, any suggestions on what we should do? Do you have an idea
> what's
> > > going on, or should we exclude this test for now until you have time to
> > > look at it?
> > >
> > > Reuven
> > >
> > > On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
> > >
> > > > I wonder if the watermark is accidentally advancing too early,
> causing
> > > > Apex to shut down the pipeline before the final finalize DoFn
> executes?
> > > >
> > > > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> > > >
> > > >> I don't think this is a problem with the test and if anything this
> > > problem
> > > >> to me shows the test is useful in catching similar issues during
> unit
> > > test
> > > >> runs.
> > > >>
> > > >> Is there any form of asynchronous/trigger based processing in this
> > > >> pipeline
> > > >> that could cause this?
> > > >>
> > > >> The Apex runner will shutdown the pipeline after the final
> watermark,
> > > the
> > > >> shutdown signal traverses the pipeline just like a watermark, but it
> > is
> > > >> not
> > > >> seen by user code.
> > > >>
> > > >> Thomas
> > > >>
> > > >> --
> > > >> sent from mobile
> > > >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> > > wrote:
> > > >>
> > > >> > Upon further investigation, this tests always writes to
> > > >> > ./target/wordcountresult-0-of-2 and
> > > >> > ./target/wordcountresult-1-of-2. So after a successful
> test
> > > >> run,
> > > >> > any further run without a `clean` will spuriously succeed. I was
> > > running
> > > >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> > > >> > reproduction appears to be easy and we could fix the test (if we
> > don't
> > > >> > remove it) to use a fresh temp dir.
> > > >> >
> > > >> > This seems to point to a bug in waitUntilFinish() and/or Apex if
> the
> > > >> > topology is shut down before this ParDo is run. This is a ParDo
> with
> > > >> > trivial bounded input but with side inputs. So I would guess the
> bug
> > > is
> > > >> > either in watermark tracking / readiness of the side input or just
> > how
> > > >> > PushbackSideInputDoFnRunner is used.
> > > >> >
> > > >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax
> >  > > >
> > > >> > wrote:
> > > >> >
> > > >> > > I've done a bit more debugging with logging. It appears that the
> > > >> finalize
> > > >> > > ParDo is never being invoked in this Apex test (or at least the
> > > >> LOG.info
> > > >> > in
> > > >> > > that ParDo never runs). This ParDo is run on a constant element
> > > (code
> > > >> > > snippet below), so it should always run.
> > > >> > >
> > > >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> > > >> null));
> > > >> > > singletonCollection
> > > >> > > .apply("Finalize", ParDo.of(new DoFn() {
> > > >> > >   @ProcessElement
> > > >> > >   public void processElement(ProcessContext c) throws
> > Exception
> > > {
> > > >> > > LOG.info("Finalizing write operation {}.",
> > writeOperation);
> > > >> > >
> > > >> > >
> > > >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> > > >>  > > >> > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Data-dependent file destinations is a pretty great feature. We
> > > also
> > > >> > have
> > > >> > > > another change to make to this @Experimental feature, and it
> > would
> > > >> be
> > > >> > > nice
> > > >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> > > >> > 

Re: Failure in Apex runner

2017-07-07 Thread Pramod Immaneni
Hi Manu,

Can you refresh your netlet dependency? There was a respin of a release,
which usually doesn't happen. You could do this by deleting the contents of
your ~/.m2/repository/com/datatorrent/netlet/ folder and rebuilding, which
will fetch the netlet dependency again.

Thanks

On Fri, Jul 7, 2017 at 8:09 AM, Manu Zhang  wrote:

> Hey guys, I'd like to offer some input.
>
> The test also fails locally on my Mac with the following error. (so
> WriteOperation#finalize is not called)
>
> java.lang.NullPointerException
> at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
> at
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$
> ApexStateInternalsFactory.stateInternalsForKey(
> ApexStateInternals.java:449)
>
> The error is in the following line,  where `Slice` takes a null value when
> the `key` is null
>
> keyBytes = (key != null) ? new
> Slice(CoderUtils.encodeToByteArray(keyCoder, key)) :
>   new Slice(null);
>
> while it doesn't look right from its constructor (array can not be null).
>
> public Slice(byte[] array) {
>   this.buffer = array;
>   this.offset = 0;
>   this.length = array.length;
> }
>
>
> On Fri, Jul 7, 2017 at 4:35 AM Reuven Lax 
> wrote:
>
> > Thomas, any suggestions on what we should do? Do you have an idea what's
> > going on, or should we exclude this test for now until you have time to
> > look at it?
> >
> > Reuven
> >
> > On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
> >
> > > I wonder if the watermark is accidentally advancing too early, causing
> > > Apex to shut down the pipeline before the final finalize DoFn executes?
> > >
> > > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> > >
> > >> I don't think this is a problem with the test and if anything this
> > problem
> > >> to me shows the test is useful in catching similar issues during unit
> > test
> > >> runs.
> > >>
> > >> Is there any form of asynchronous/trigger based processing in this
> > >> pipeline
> > >> that could cause this?
> > >>
> > >> The Apex runner will shutdown the pipeline after the final watermark,
> > the
> > >> shutdown signal traverses the pipeline just like a watermark, but it
> is
> > >> not
> > >> seen by user code.
> > >>
> > >> Thomas
> > >>
> > >> --
> > >> sent from mobile
> > >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> > wrote:
> > >>
> > >> > Upon further investigation, this tests always writes to
> > >> > ./target/wordcountresult-0-of-2 and
> > >> > ./target/wordcountresult-1-of-2. So after a successful test
> > >> run,
> > >> > any further run without a `clean` will spuriously succeed. I was
> > running
> > >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> > >> > reproduction appears to be easy and we could fix the test (if we
> don't
> > >> > remove it) to use a fresh temp dir.
> > >> >
> > >> > This seems to point to a bug in waitUntilFinish() and/or Apex if the
> > >> > topology is shut down before this ParDo is run. This is a ParDo with
> > >> > trivial bounded input but with side inputs. So I would guess the bug
> > is
> > >> > either in watermark tracking / readiness of the side input or just
> how
> > >> > PushbackSideInputDoFnRunner is used.
> > >> >
> > >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax
>  > >
> > >> > wrote:
> > >> >
> > >> > > I've done a bit more debugging with logging. It appears that the
> > >> finalize
> > >> > > ParDo is never being invoked in this Apex test (or at least the
> > >> LOG.info
> > >> > in
> > >> > > that ParDo never runs). This ParDo is run on a constant element
> > (code
> > >> > > snippet below), so it should always run.
> > >> > >
> > >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> > >> null));
> > >> > > singletonCollection
> > >> > > .apply("Finalize", ParDo.of(new DoFn() {
> > >> > >   @ProcessElement
> > >> > >   public void processElement(ProcessContext c) throws
> Exception
> > {
> > >> > > LOG.info("Finalizing write operation {}.",
> writeOperation);
> > >> > >
> > >> > >
> > >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> > >>  > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Data-dependent file destinations is a pretty great feature. We
> > also
> > >> > have
> > >> > > > another change to make to this @Experimental feature, and it
> would
> > >> be
> > >> > > nice
> > >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> > >> > > >
> > >> > > > I just tried this too, and failed to reproduce it. But Jenkins
> and
> > >> > Reuven
> > >> > > > both have a reliable repro.
> > >> > > >
> > >> > > > Questionss:
> > >> > > >
> > >> > > >  - Any ideas about how these configurations differ?
> > >> > > >  - Does this actually affect users?
> > >> > > >  - Once we have another test that catches this issue, can we
> > delete
> > >> > 

Re: Failure in Apex runner

2017-07-07 Thread Manu Zhang
Hey guys, I'd like to offer some input.

The test also fails locally on my Mac with the following error (so
WriteOperation#finalize is not called):

java.lang.NullPointerException
at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
at org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:449)

The error is in the following line, where `Slice` is constructed with a null
array when the `key` is null:

keyBytes = (key != null)
    ? new Slice(CoderUtils.encodeToByteArray(keyCoder, key))
    : new Slice(null);

That doesn't look right given its constructor, which dereferences the array
(so the array cannot be null):

public Slice(byte[] array) {
  this.buffer = array;
  this.offset = 0;
  this.length = array.length;
}
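
For illustration only, here is a small sketch of why that constructor fails on null and of one defensive alternative. `NullSafeKeySketch`, `ByteSlice`, and `keyBytes` are hypothetical names; `ByteSlice` just mirrors the constructor quoted above and is not the netlet class, and the String key with UTF-8 encoding stands in for the coder-based encoding. Mapping a null key to an empty array is an assumption about what would be acceptable, not a change anyone in this thread proposed or applied.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; ByteSlice mirrors the quoted constructor but is not the netlet Slice. */
class NullSafeKeySketch {
  static final class ByteSlice {
    final byte[] buffer;
    final int offset;
    final int length;

    ByteSlice(byte[] array) {
      this.buffer = array;
      this.offset = 0;
      this.length = array.length; // throws NullPointerException when array is null
    }
  }

  /** Encodes a key, mapping a null key to an empty slice instead of passing null through. */
  static ByteSlice keyBytes(String key) {
    byte[] encoded = (key != null) ? key.getBytes(StandardCharsets.UTF_8) : new byte[0];
    return new ByteSlice(encoded);
  }

  public static void main(String[] args) {
    System.out.println(keyBytes(null).length);
    System.out.println(keyBytes("word").length);
  }
}
```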


On Fri, Jul 7, 2017 at 4:35 AM Reuven Lax  wrote:

> Thomas, any suggestions on what we should do? Do you have an idea what's
> going on, or should we exclude this test for now until you have time to
> look at it?
>
> Reuven
>
> On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:
>
> > I wonder if the watermark is accidentally advancing too early, causing
> > Apex to shut down the pipeline before the final finalize DoFn executes?
> >
> > On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
> >
> >> I don't think this is a problem with the test and if anything this
> problem
> >> to me shows the test is useful in catching similar issues during unit
> test
> >> runs.
> >>
> >> Is there any form of asynchronous/trigger based processing in this
> >> pipeline
> >> that could cause this?
> >>
> >> The Apex runner will shutdown the pipeline after the final watermark,
> the
> >> shutdown signal traverses the pipeline just like a watermark, but it is
> >> not
> >> seen by user code.
> >>
> >> Thomas
> >>
> >> --
> >> sent from mobile
> >> On Jul 5, 2017 1:19 PM, "Kenneth Knowles" 
> wrote:
> >>
> >> > Upon further investigation, this tests always writes to
> >> > ./target/wordcountresult-0-of-2 and
> >> > ./target/wordcountresult-1-of-2. So after a successful test
> >> run,
> >> > any further run without a `clean` will spuriously succeed. I was
> running
> >> > via IntelliJ so did not do the ritual `mvn clean` workaround. So
> >> > reproduction appears to be easy and we could fix the test (if we don't
> >> > remove it) to use a fresh temp dir.
> >> >
> >> > This seems to point to a bug in waitUntilFinish() and/or Apex if the
> >> > topology is shut down before this ParDo is run. This is a ParDo with
> >> > trivial bounded input but with side inputs. So I would guess the bug
> is
> >> > either in watermark tracking / readiness of the side input or just how
> >> > PushbackSideInputDoFnRunner is used.
> >> >
> >> > On Wed, Jul 5, 2017 at 12:23 PM, Reuven Lax  >
> >> > wrote:
> >> >
> >> > > I've done a bit more debugging with logging. It appears that the
> >> finalize
> >> > > ParDo is never being invoked in this Apex test (or at least the
> >> LOG.info
> >> > in
> >> > > that ParDo never runs). This ParDo is run on a constant element
> (code
> >> > > snippet below), so it should always run.
> >> > >
> >> > > PCollection singletonCollection = p.apply(Create.of((Void)
> >> null));
> >> > > singletonCollection
> >> > > .apply("Finalize", ParDo.of(new DoFn() {
> >> > >   @ProcessElement
> >> > >   public void processElement(ProcessContext c) throws Exception
> {
> >> > > LOG.info("Finalizing write operation {}.", writeOperation);
> >> > >
> >> > >
> >> > > On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles
> >>  >> > >
> >> > > wrote:
> >> > >
> >> > > > Data-dependent file destinations is a pretty great feature. We
> also
> >> > have
> >> > > > another change to make to this @Experimental feature, and it would
> >> be
> >> > > nice
> >> > > > to get them both into 2.1.0 if we can unblock this quickly.
> >> > > >
> >> > > > I just tried this too, and failed to reproduce it. But Jenkins and
> >> > Reuven
> >> > > > both have a reliable repro.
> >> > > >
> >> > > > Questionss:
> >> > > >
> >> > > >  - Any ideas about how these configurations differ?
> >> > > >  - Does this actually affect users?
> >> > > >  - Once we have another test that catches this issue, can we
> delete
> >> > this
> >> > > > test?
> >> > > >
> >> > > > Every other test passes, including the actual example WordCountIT.
> >> > Since
> >> > > > the PR doesn't change primitives, it also seems like it is an
> >> existing
> >> > > > issue. And the test seems redundant with our other testing but
> won't
> >> > get
> >> > > as
> >> > > > much maintenance attention. I don't want to stop catching whatever
> >> this
> >> > > > issue is, though.
> >> > > >
> >> > > > Kenn
> >> > > >
> >> > > > On Wed, Jul 5, 2017 at 10:31 AM, Reuven Lax
> >> 
> >> > > > wrote:
> >> > > >
> >> > > > > Hi Thomas,
> >> > > > >
> 

Re: Failure in Apex runner

2017-07-06 Thread Reuven Lax
Thomas, any suggestions on what we should do? Do you have an idea what's
going on, or should we exclude this test for now until you have time to
look at it?

Reuven

On Wed, Jul 5, 2017 at 3:36 PM, Reuven Lax  wrote:

> I wonder if the watermark is accidentally advancing too early, causing
> Apex to shut down the pipeline before the final finalize DoFn executes?
>
> On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:
>
>> I don't think this is a problem with the test and if anything this problem
>> to me shows the test is useful in catching similar issues during unit test
>> runs.
>>
>> Is there any form of asynchronous/trigger based processing in this
>> pipeline
>> that could cause this?
>>
>> The Apex runner will shutdown the pipeline after the final watermark, the
>> shutdown signal traverses the pipeline just like a watermark, but it is
>> not
>> seen by user code.
>>
>> Thomas
>>
>> --
>> sent from mobile

Re: Failure in Apex runner

2017-07-05 Thread Reuven Lax
I wonder if the watermark is accidentally advancing too early, causing Apex
to shut down the pipeline before the final finalize DoFn executes?

On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:

> I don't think this is a problem with the test. If anything, the failure
> shows that the test is useful for catching similar issues during unit test
> runs.
>
> Is there any form of asynchronous or trigger-based processing in this
> pipeline that could cause this?
>
> The Apex runner shuts down the pipeline after the final watermark. The
> shutdown signal traverses the pipeline just like a watermark, but it is not
> seen by user code.
>
> Thomas

Re: Failure in Apex runner

2017-07-05 Thread Kenneth Knowles
There is no asynchronous behavior in this test. It is basically a "batch"
test, here:
https://github.com/apache/beam/blob/master/runners/apex/src/test/java/org/apache/beam/runners/apex/examples/WordCountTest.java#L117

The pipeline is:

p.apply("ReadLines", TextIO.read().from(options.getInputFile()))
  .apply(ParDo.of(new ExtractWordsFn()))
  .apply(Count.<String>perElement())
  .apply(ParDo.of(new FormatAsStringFn()))
  .apply("WriteCounts", TextIO.write().to(options.getOutput()));

It runs this on a hardcoded input file and verifies that the two expected
output files match hardcoded hashes. The files are never renamed from their
temporary destinations to their final destinations, since that transform
(the finalizing sub-transform of TextIO) is never run.
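
Since the expected files live at a fixed path under ./target, a pair left
over from an earlier successful run can make this check pass even when the
finalize step never ran. The fresh-temp-dir idea raised earlier in the thread
could be sketched roughly as follows. This uses only the JDK; FreshOutputDir
and newOutputPrefix are made-up names for illustration and are not part of
WordCountTest.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of WordCountTest.
final class FreshOutputDir {
  private FreshOutputDir() {}

  // Creates a new, empty directory on every call and returns an output
  // prefix inside it, so files from an earlier run cannot be picked up.
  static Path newOutputPrefix(String fileNamePrefix) {
    try {
      Path dir = Files.createTempDirectory("wordcount-test");
      return dir.resolve(fileNamePrefix);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A test could pass the returned prefix wherever it currently passes the fixed
./target location; how that is wired into the pipeline options is left out
here, and cleanup of the directory after the test is not shown.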

On Wed, Jul 5, 2017 at 1:45 PM, Thomas Weise  wrote:

> I don't think this is a problem with the test. If anything, the failure
> shows that the test is useful for catching similar issues during unit test
> runs.
>
> Is there any form of asynchronous or trigger-based processing in this
> pipeline that could cause this?
>
> The Apex runner shuts down the pipeline after the final watermark. The
> shutdown signal traverses the pipeline just like a watermark, but it is not
> seen by user code.
>
> Thomas

Re: Failure in Apex runner

2017-07-05 Thread Thomas Weise
I don't think this is a problem with the test. If anything, the failure
shows that the test is useful for catching similar issues during unit test
runs.

Is there any form of asynchronous or trigger-based processing in this
pipeline that could cause this?

The Apex runner shuts down the pipeline after the final watermark. The
shutdown signal traverses the pipeline just like a watermark, but it is not
seen by user code.
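
To make the idea concrete, here is a toy model of control tuples flowing
through a chain of operators. It is only a sketch: every type in it
(ControlTupleSketch, Tuple, Operator, boundedSource) is invented for
illustration and none of it is Apex or Beam code. Long.MAX_VALUE stands in for
the final watermark, which is also an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Toy model only: these types are invented and are not Apex or Beam classes.
final class ControlTupleSketch {
  // Stand-in for the "final" watermark; an assumption of this sketch.
  static final long FINAL_WATERMARK = Long.MAX_VALUE;

  enum Kind { DATA, WATERMARK, SHUTDOWN }

  static final class Tuple {
    final Kind kind;
    final String value;    // meaningful only for DATA
    final long timestamp;  // meaningful only for WATERMARK

    private Tuple(Kind kind, String value, long timestamp) {
      this.kind = kind;
      this.value = value;
      this.timestamp = timestamp;
    }

    static Tuple data(String value) { return new Tuple(Kind.DATA, value, 0L); }
    static Tuple watermark(long timestamp) { return new Tuple(Kind.WATERMARK, null, timestamp); }
    static Tuple shutdown() { return new Tuple(Kind.SHUTDOWN, null, 0L); }
  }

  static final class Operator {
    private final UnaryOperator<String> userFn;
    private final List<Tuple> downstream = new ArrayList<>();
    private boolean shutDown;

    Operator(UnaryOperator<String> userFn) { this.userFn = userFn; }

    void process(Tuple t) {
      if (t.kind == Kind.DATA) {
        // Only data tuples ever reach user code.
        downstream.add(Tuple.data(userFn.apply(t.value)));
      } else {
        // Watermarks and the shutdown signal are forwarded unchanged.
        downstream.add(t);
        if (t.kind == Kind.SHUTDOWN) {
          shutDown = true;
        }
      }
    }

    List<Tuple> downstream() { return downstream; }
    boolean isShutDown() { return shutDown; }
  }

  // A bounded source: its data, then the final watermark, then shutdown.
  static List<Tuple> boundedSource(List<String> values) {
    List<Tuple> out = new ArrayList<>();
    for (String v : values) {
      out.add(Tuple.data(v));
    }
    out.add(Tuple.watermark(FINAL_WATERMARK));
    out.add(Tuple.shutdown());
    return out;
  }
}
```

In this model an operator that never receives the control tuples on one of its
inputs would never see the final watermark or the shutdown signal on that
path, which is the kind of behavior worth checking for in the trace.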

Thomas

--
sent from mobile
On Jul 5, 2017 1:19 PM, "Kenneth Knowles"  wrote:

> Upon further investigation, this test always writes to
> ./target/wordcountresult-0-of-2 and
> ./target/wordcountresult-1-of-2. So after a successful test run,
> any further run without a `clean` will spuriously succeed. I was running
> via IntelliJ so did not do the ritual `mvn clean` workaround. So
> reproduction appears to be easy and we could fix the test (if we don't
> remove it) to use a fresh temp dir.
>
> This seems to point to a bug in waitUntilFinish() and/or Apex if the
> topology is shut down before this ParDo is run. This is a ParDo with
> trivial bounded input but with side inputs. So I would guess the bug is
> either in watermark tracking / readiness of the side input or just how
> PushbackSideInputDoFnRunner is used.

Re: Failure in Apex runner

2017-07-05 Thread Reuven Lax
I've done a bit more debugging with logging. It appears that the finalize
ParDo is never being invoked in this Apex test (or at least the LOG.info in
that ParDo never runs). This ParDo is run on a constant element (code
snippet below), so it should always run.

PCollection<Void> singletonCollection = p.apply(Create.of((Void) null));
singletonCollection
    .apply("Finalize", ParDo.of(new DoFn() {
      @ProcessElement
      public void processElement(ProcessContext c) throws Exception {
        LOG.info("Finalizing write operation {}.", writeOperation);


On Wed, Jul 5, 2017 at 11:22 AM, Kenneth Knowles 
wrote:

> Data-dependent file destinations is a pretty great feature. We also have
> another change to make to this @Experimental feature, and it would be nice
> to get them both into 2.1.0 if we can unblock this quickly.
>
> I just tried this too, and failed to reproduce it. But Jenkins and Reuven
> both have a reliable repro.
>
> Questions:
>
>  - Any ideas about how these configurations differ?
>  - Does this actually affect users?
>  - Once we have another test that catches this issue, can we delete this
> test?
>
> Every other test passes, including the actual example WordCountIT. Since
> the PR doesn't change primitives, it also seems like it is an existing
> issue. And the test seems redundant with our other testing but won't get as
> much maintenance attention. I don't want to stop catching whatever this
> issue is, though.
>
> Kenn


Re: Failure in Apex runner

2017-07-05 Thread Reuven Lax
Hi Thomas,

This only happens with https://github.com/apache/beam/pull/3356.

Reuven

On Mon, Jul 3, 2017 at 6:11 AM, Thomas Weise  wrote:

> Hi Reuven,
>
> I'm not able to reproduce the issue locally. I was hoping to see which
> thread is attempting to emit the results. In Apex, only the operator thread
> can emit results; any other thread launched by the operator cannot. I'm not
> aware of ParDo managing separate threads, though, so I assume this must be a
> race. If you still have the log, can you send it to me?
>
> Thanks,
> Thomas


Re: Failure in Apex runner

2017-07-03 Thread Thomas Weise
Hi Reuven,

I'm not able to reproduce the issue locally. I was hoping to see which
thread is attempting to emit the results. In Apex, only the operator thread
can emit results; any other thread launched by the operator cannot. I'm not
aware of ParDo managing separate threads, though, so I assume this must be a
race. If you still have the log, can you send it to me?
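
For anyone not familiar with the restriction, the check behaves roughly like
the sketch below. This is not the Apex implementation: OwnedOutputPort and its
methods are invented names, and the real DefaultOutputPort may track its owner
differently. The point is only that the port remembers one thread and rejects
emits from any other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output port; not Apex's DefaultOutputPort.
final class OwnedOutputPort<T> {
  private final Thread owner;
  private final List<T> emitted = new ArrayList<>();

  OwnedOutputPort(Thread owner) {
    this.owner = owner;
  }

  // Accepts a tuple only when called from the owning (operator) thread.
  void emit(T tuple) {
    verifyOwnerThread();
    emitted.add(tuple);
  }

  List<T> emitted() {
    return emitted;
  }

  private void verifyOwnerThread() {
    Thread current = Thread.currentThread();
    if (current != owner) {
      throw new IllegalStateException(
          "emit called from " + current.getName()
              + " but the port is owned by " + owner.getName());
    }
  }
}
```

With a check like this, any code path that calls output() from a helper thread
fails immediately, which is why knowing the emitting thread would narrow
things down.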

Thanks,
Thomas



On Sat, Jul 1, 2017 at 5:51 AM, Reuven Lax  wrote:

> pr/3356 fails in the Apex WordCountTest. The failed test is here
>  MavenInstall/12829/org.apache.beam$beam-runners-apex/
> testReport/org.apache.beam.runners.apex.examples/WordCountTest/
> testWordCountExample/>
> :
>
> Upon debugging, it looks like this is likely a problem in the Apex runner
> itself. A ParDo calls output(), and that triggers an exception thrown from
> inside the Apex runner. The Apex runner calls emit on a DefaultOutputPort
> (ApexParDoOperator.java:275), and that throws an exception inside of
> verifyOperatorThread().
>
> I'm going to ignore this failure for now as it seems unrelated to my PR,
> but does someone want to take a look?
>
> Reuven
>


Failure in Apex runner

2017-07-01 Thread Reuven Lax
pr/3356 fails in the Apex WordCountTest. The failed test is here

:

Upon debugging, it looks like this is likely a problem in the Apex runner
itself. A ParDo calls output(), and that triggers an exception thrown from
inside the Apex runner. The Apex runner calls emit on a DefaultOutputPort
(ApexParDoOperator.java:275), and that throws an exception inside of
verifyOperatorThread().

I'm going to ignore this failure for now as it seems unrelated to my PR,
but does someone want to take a look?

Reuven