[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265165#comment-15265165
 ] 

Aljoscha Krettek commented on BEAM-52:
--

Sure, I'll do that!

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-204) truncateStackTrace fails with empty stack trace

2016-04-29 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci closed BEAM-204.
-
Resolution: Fixed

> truncateStackTrace fails with empty stack trace
> ---
>
> Key: BEAM-204
> URL: https://issues.apache.org/jira/browse/BEAM-204
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Malo Denielou
>Assignee: Mark Shields
>Priority: Minor
>
> From a user job: 
> exception:
> "java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.truncateStackTrace(UserCodeException.java:72)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.(UserCodeException.java:52)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrap(UserCodeException.java:35)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:40)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:369)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:51)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:288)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:284)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext$1.outputWindowedValue(DoFnRunnerBase.java:508)
>   at 
> com.google.cloud.dataflow.sdk.util.AssignWindowsDoFn.processElement(AssignWindowsDoFn.java:65)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.sideOutputWindowedValue(DoFnRunnerBase.java:315)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.sideOutput(DoFnRunnerBase.java:471)
>   at 
> com.google.cloud.dataflow.sdk.transforms.Partition$PartitionDoFn.processElement(Partition.java:165)
> Looking at the code, it seems that if the user code throwable has an empty 
> stacktrace we would fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264971#comment-15264971
 ] 

Raghu Angadi commented on BEAM-52:
--

> Kafka producer uses the key for selecting a partition.

correction : the default partitioner in Kafka actually uses serialized bytes to 
pick a partition, but Partitioner interface includes both key and value 
objects, so some custom partitioners might use the key.

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-204) truncateStackTrace fails with empty stack trace

2016-04-29 Thread Mark Shields (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264935#comment-15264935
 ] 

Mark Shields commented on BEAM-204:
---

Yes, but I can't seem to close it.

> truncateStackTrace fails with empty stack trace
> ---
>
> Key: BEAM-204
> URL: https://issues.apache.org/jira/browse/BEAM-204
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Malo Denielou
>Assignee: Mark Shields
>Priority: Minor
>
> From a user job: 
> exception:
> "java.lang.ArrayIndexOutOfBoundsException: -1
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.truncateStackTrace(UserCodeException.java:72)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.(UserCodeException.java:52)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrap(UserCodeException.java:35)
>   at 
> com.google.cloud.dataflow.sdk.util.UserCodeException.wrapIf(UserCodeException.java:40)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.wrapUserCodeException(DoFnRunnerBase.java:369)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:51)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:288)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.outputWindowedValue(DoFnRunnerBase.java:284)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext$1.outputWindowedValue(DoFnRunnerBase.java:508)
>   at 
> com.google.cloud.dataflow.sdk.util.AssignWindowsDoFn.processElement(AssignWindowsDoFn.java:65)
>   at 
> com.google.cloud.dataflow.sdk.util.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:49)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase.processElement(DoFnRunnerBase.java:138)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:191)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.ForwardingParDoFn.processElement(ForwardingParDoFn.java:42)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.DataflowWorkerLoggingParDoFn.processElement(DataflowWorkerLoggingParDoFn.java:47)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.ParDoOperation.process(ParDoOperation.java:53)
>   at 
> com.google.cloud.dataflow.sdk.util.common.worker.OutputReceiver.process(OutputReceiver.java:52)
>   at 
> com.google.cloud.dataflow.sdk.runners.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:161)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnContext.sideOutputWindowedValue(DoFnRunnerBase.java:315)
>   at 
> com.google.cloud.dataflow.sdk.util.DoFnRunnerBase$DoFnProcessContext.sideOutput(DoFnRunnerBase.java:471)
>   at 
> com.google.cloud.dataflow.sdk.transforms.Partition$PartitionDoFn.processElement(Partition.java:165)
> Looking at the code, it seems that if the user code throwable has an empty 
> stacktrace we would fail.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264860#comment-15264860
 ] 

Raghu Angadi commented on BEAM-52:
--

Ah, thanks. just skimmed through it. It looks pretty much on the same lines as 
what I have. couple of differences : 
  - Kafka producer uses the key for selecting a partition. I wanted to retain 
that functionality for users. So I apply our coders inside custom Kafka 
serializers. Otherwise Kafka will hash on serialized byte array. 
 - I noticed you are catching the exceptions in a callback and reporting it 
back.. may be I should do that too. will get Dan's opinion as well in PR.

I will ping you to review my pull request.


> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-77) Reorganize Directory structure

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264846#comment-15264846
 ] 

ASF GitHub Bot commented on BEAM-77:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/256


> Reorganize Directory structure
> --
>
> Key: BEAM-77
> URL: https://issues.apache.org/jira/browse/BEAM-77
> Project: Beam
>  Issue Type: Task
>  Components: project-management
>Reporter: Frances Perry
>Assignee: Jean-Baptiste Onofré
>
> Now that we've done the initial Dataflow code drop, we will restructure 
> directories to provide space for additional SDKs and Runners.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[03/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/InMemoryWatermarkManagerTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/InMemoryWatermarkManagerTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/InMemoryWatermarkManagerTest.java
deleted file mode 100644
index 3c4503f..000
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/InMemoryWatermarkManagerTest.java
+++ /dev/null
@@ -1,1168 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners.inprocess;
-
-import static org.hamcrest.Matchers.contains;
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.emptyIterable;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.not;
-import static org.hamcrest.Matchers.nullValue;
-import static org.junit.Assert.assertThat;
-
-import 
org.apache.beam.sdk.runners.inprocess.InMemoryWatermarkManager.FiredTimers;
-import 
org.apache.beam.sdk.runners.inprocess.InMemoryWatermarkManager.TimerUpdate;
-import 
org.apache.beam.sdk.runners.inprocess.InMemoryWatermarkManager.TimerUpdate.TimerUpdateBuilder;
-import 
org.apache.beam.sdk.runners.inprocess.InMemoryWatermarkManager.TransformWatermarks;
-import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.CommittedBundle;
-import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.UncommittedBundle;
-import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.AppliedPTransform;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.Filter;
-import org.apache.beam.sdk.transforms.Flatten;
-import org.apache.beam.sdk.transforms.ParDo;
-import org.apache.beam.sdk.transforms.WithKeys;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
-import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
-import org.apache.beam.sdk.transforms.windowing.PaneInfo;
-import org.apache.beam.sdk.util.TimeDomain;
-import org.apache.beam.sdk.util.TimerInternals.TimerData;
-import org.apache.beam.sdk.util.WindowedValue;
-import org.apache.beam.sdk.util.state.StateNamespaces;
-import org.apache.beam.sdk.values.KV;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollectionList;
-import org.apache.beam.sdk.values.PValue;
-import org.apache.beam.sdk.values.TimestampedValue;
-
-import com.google.common.collect.ImmutableList;
-
-import org.hamcrest.BaseMatcher;
-import org.hamcrest.Description;
-import org.hamcrest.Matcher;
-import org.joda.time.Instant;
-import org.joda.time.ReadableInstant;
-import org.junit.Before;
-import org.junit.Test;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-
-import java.io.Serializable;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.HashMap;
-import java.util.Map;
-
-/**
- * Tests for {@link InMemoryWatermarkManager}.
- */
-@RunWith(JUnit4.class)
-public class InMemoryWatermarkManagerTest implements Serializable {
-  private transient MockClock clock;
-
-  private transient PCollection createdInts;
-
-  private transient PCollection filtered;
-  private transient PCollection filteredTimesTwo;
-  private transient PCollection> keyed;
-
-  private transient PCollection intsToFlatten;
-  private transient PCollection flattened;
-
-  private transient InMemoryWatermarkManager manager;
-  private transient BundleFactory bundleFactory;
-
-  @Before
-  public void setup() {
-TestPipeline p = TestPipeline.create();
-
-createdInts = p.apply("createdInts", Create.of(1, 2, 3));
-
-filtered = createdInts.apply("filtered", Filter.greaterThan(1));
-filteredTimesTwo = filtered.apply("timesTwo", ParDo.of(new DoFn() {
-  @Override
-  public void processElement(DoFn.ProcessContext c) 
throws Exception {
-  

[12/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java
new file mode 100644
index 000..491363a
--- /dev/null
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java
@@ -0,0 +1,290 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import static org.hamcrest.Matchers.containsInAnyOrder;
+import static org.hamcrest.Matchers.emptyIterable;
+import static org.hamcrest.Matchers.equalTo;
+import static org.hamcrest.Matchers.is;
+import static org.junit.Assert.assertThat;
+import static org.mockito.Mockito.when;
+
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.coders.BigEndianLongCoder;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.io.BoundedSource.BoundedReader;
+import org.apache.beam.sdk.io.CountingSource;
+import org.apache.beam.sdk.io.Read;
+import org.apache.beam.sdk.io.Read.Bounded;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.PCollection;
+
+import com.google.common.collect.ImmutableList;
+
+import org.joda.time.Instant;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.List;
+import java.util.NoSuchElementException;
+
+/**
+ * Tests for {@link BoundedReadEvaluatorFactory}.
+ */
+@RunWith(JUnit4.class)
+public class BoundedReadEvaluatorFactoryTest {
+  private BoundedSource source;
+  private PCollection longs;
+  private TransformEvaluatorFactory factory;
+  @Mock private InProcessEvaluationContext context;
+  private BundleFactory bundleFactory;
+
+  @Before
+  public void setup() {
+MockitoAnnotations.initMocks(this);
+source = CountingSource.upTo(10L);
+TestPipeline p = TestPipeline.create();
+longs = p.apply(Read.from(source));
+
+factory = new BoundedReadEvaluatorFactory();
+bundleFactory = InProcessBundleFactory.create();
+  }
+
+  @Test
+  public void boundedSourceInMemoryTransformEvaluatorProducesElements() throws 
Exception {
+UncommittedBundle output = bundleFactory.createRootBundle(longs);
+when(context.createRootBundle(longs)).thenReturn(output);
+
+TransformEvaluator evaluator =
+factory.forApplication(longs.getProducingTransformInternal(), null, 
context);
+InProcessTransformResult result = evaluator.finishBundle();
+assertThat(result.getWatermarkHold(), 
equalTo(BoundedWindow.TIMESTAMP_MAX_VALUE));
+assertThat(
+output.commit(BoundedWindow.TIMESTAMP_MAX_VALUE).getElements(),
+containsInAnyOrder(
+gw(1L), gw(2L), gw(4L), gw(8L), gw(9L), gw(7L), gw(6L), gw(5L), 
gw(3L), gw(0L)));
+  }
+
+  /**
+   * Demonstrate that acquiring multiple {@link TransformEvaluator 
TransformEvaluators} for the same
+   * {@link Bounded Read.Bounded} application with the same evaluation context 
only produces the
+   * elements once.
+   */
+  @Test
+  public void boundedSourceInMemoryTransformEvaluatorAfterFinishIsEmpty() 
throws Exception {
+UncommittedBundle output = bundleFactory.createRootBundle(longs);
+when(context.createRootBundle(longs)).thenReturn(output);
+
+TransformEvaluator evaluator =
+factory.forApplication(longs.getProducingTransformInternal(), null, 
context);

[15/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
new file mode 100644
index 000..2efaad3
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+import org.apache.beam.sdk.transforms.Flatten;
+import org.apache.beam.sdk.transforms.Flatten.FlattenPCollectionList;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionList;
+
+/**
+ * The {@link InProcessPipelineRunner} {@link TransformEvaluatorFactory} for 
the {@link Flatten}
+ * {@link PTransform}.
+ */
+class FlattenEvaluatorFactory implements TransformEvaluatorFactory {
+  @Override
+  public  TransformEvaluator forApplication(
+  AppliedPTransform application,
+  CommittedBundle inputBundle,
+  InProcessEvaluationContext evaluationContext) {
+@SuppressWarnings({"cast", "unchecked", "rawtypes"})
+TransformEvaluator evaluator = (TransformEvaluator) 
createInMemoryEvaluator(
+(AppliedPTransform) application, inputBundle, evaluationContext);
+return evaluator;
+  }
+
+  private  TransformEvaluator createInMemoryEvaluator(
+  final AppliedPTransform<
+  PCollectionList, PCollection, 
FlattenPCollectionList>
+  application,
+  final CommittedBundle inputBundle,
+  final InProcessEvaluationContext evaluationContext) {
+if (inputBundle == null) {
+  // it is impossible to call processElement on a flatten with no input 
bundle. A Flatten with
+  // no input bundle occurs as an output of 
Flatten.pcollections(PCollectionList.empty())
+  return new FlattenEvaluator<>(
+  null, StepTransformResult.withoutHold(application).build());
+}
+final UncommittedBundle outputBundle =
+evaluationContext.createBundle(inputBundle, application.getOutput());
+final InProcessTransformResult result =
+
StepTransformResult.withoutHold(application).addOutput(outputBundle).build();
+return new FlattenEvaluator<>(outputBundle, result);
+  }
+
+  private static class FlattenEvaluator implements 
TransformEvaluator {
+private final UncommittedBundle outputBundle;
+private final InProcessTransformResult result;
+
+public FlattenEvaluator(
+UncommittedBundle outputBundle, InProcessTransformResult 
result) {
+  this.outputBundle = outputBundle;
+  this.result = result;
+}
+
+@Override
+public void processElement(WindowedValue element) {
+  outputBundle.add(element);
+}
+
+@Override
+public InProcessTransformResult finishBundle() {
+  return result;
+}
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ForwardingPTransform.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ForwardingPTransform.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ForwardingPTransform.java
new file mode 100644
index 000..3160b58
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ForwardingPTransform.java
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * 

[01/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master bba4c64d3 -> b9116ac42


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/ParDoSingleEvaluatorFactoryTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/ParDoSingleEvaluatorFactoryTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/ParDoSingleEvaluatorFactoryTest.java
deleted file mode 100644
index dfd857e..000
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/runners/inprocess/ParDoSingleEvaluatorFactoryTest.java
+++ /dev/null
@@ -1,324 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners.inprocess;
-
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.not;
-import static org.hamcrest.Matchers.nullValue;
-import static org.junit.Assert.assertThat;
-import static org.mockito.Mockito.mock;
-import static org.mockito.Mockito.when;
-
-import org.apache.beam.sdk.coders.StringUtf8Coder;
-import 
org.apache.beam.sdk.runners.inprocess.InMemoryWatermarkManager.TimerUpdate;
-import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.CommittedBundle;
-import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.UncommittedBundle;
-import org.apache.beam.sdk.testing.TestPipeline;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.ParDo;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
-import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
-import org.apache.beam.sdk.transforms.windowing.OutputTimeFns;
-import org.apache.beam.sdk.transforms.windowing.PaneInfo;
-import org.apache.beam.sdk.util.TimeDomain;
-import org.apache.beam.sdk.util.TimerInternals.TimerData;
-import org.apache.beam.sdk.util.WindowedValue;
-import org.apache.beam.sdk.util.common.CounterSet;
-import org.apache.beam.sdk.util.state.BagState;
-import org.apache.beam.sdk.util.state.StateNamespace;
-import org.apache.beam.sdk.util.state.StateNamespaces;
-import org.apache.beam.sdk.util.state.StateTag;
-import org.apache.beam.sdk.util.state.StateTags;
-import org.apache.beam.sdk.util.state.WatermarkHoldState;
-import org.apache.beam.sdk.values.KV;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.TupleTag;
-
-import org.hamcrest.Matchers;
-import org.joda.time.Duration;
-import org.joda.time.Instant;
-import org.junit.Test;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-
-import java.io.Serializable;
-
-/**
- * Tests for {@link ParDoSingleEvaluatorFactory}.
- */
-@RunWith(JUnit4.class)
-public class ParDoSingleEvaluatorFactoryTest implements Serializable {
-  private transient BundleFactory bundleFactory = 
InProcessBundleFactory.create();
-
-  @Test
-  public void testParDoInMemoryTransformEvaluator() throws Exception {
-TestPipeline p = TestPipeline.create();
-
-PCollection input = p.apply(Create.of("foo", "bara", "bazam"));
-PCollection collection =
-input.apply(
-ParDo.of(
-new DoFn() {
-  @Override
-  public void processElement(ProcessContext c) {
-c.output(c.element().length());
-  }
-}));
-CommittedBundle inputBundle =
-bundleFactory.createRootBundle(input).commit(Instant.now());
-
-InProcessEvaluationContext evaluationContext = 
mock(InProcessEvaluationContext.class);
-UncommittedBundle outputBundle = 
bundleFactory.createRootBundle(collection);
-when(evaluationContext.createBundle(inputBundle, 
collection)).thenReturn(outputBundle);
-InProcessExecutionContext executionContext =
-new InProcessExecutionContext(null, null, null, null);
-
when(evaluationContext.getExecutionContext(collection.getProducingTransformInternal(),
 null))
-

[GitHub] incubator-beam pull request: [BEAM-77] Move InProcessRunner to its...

2016-04-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/256


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[14/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessBundleOutputManager.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessBundleOutputManager.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessBundleOutputManager.java
new file mode 100644
index 000..f374f99
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessBundleOutputManager.java
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.util.DoFnRunners.OutputManager;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.values.TupleTag;
+
+import java.util.Map;
+
+/**
+ * An {@link OutputManager} that outputs to {@link CommittedBundle Bundles} 
used by the
+ * {@link InProcessPipelineRunner}.
+ */
+public class InProcessBundleOutputManager implements OutputManager {
+  private final Map bundles;
+
+  public static InProcessBundleOutputManager create(
+  Map outputBundles) {
+return new InProcessBundleOutputManager(outputBundles);
+  }
+
+  public InProcessBundleOutputManager(Map 
bundles) {
+this.bundles = bundles;
+  }
+
+  @SuppressWarnings("unchecked")
+  @Override
+  public  void output(TupleTag tag, WindowedValue output) {
+@SuppressWarnings("rawtypes")
+UncommittedBundle bundle = bundles.get(tag);
+bundle.add(output);
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessEvaluationContext.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessEvaluationContext.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessEvaluationContext.java
new file mode 100644
index 000..d9a7ff0
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/InProcessEvaluationContext.java
@@ -0,0 +1,425 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import 
org.apache.beam.runners.direct.GroupByKeyEvaluatorFactory.InProcessGroupByKeyOnly;
+import org.apache.beam.runners.direct.InMemoryWatermarkManager.FiredTimers;
+import 
org.apache.beam.runners.direct.InMemoryWatermarkManager.TransformWatermarks;
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.PCollectionViewWriter;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.Trigger;
+import 

[16/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
Move InProcessRunner to its own module


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e13cacb8
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e13cacb8
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e13cacb8

Branch: refs/heads/master
Commit: e13cacb81c9c9718a45bfb7aefd839dcdfd442f6
Parents: bba4c64
Author: Kenneth Knowles 
Authored: Wed Apr 27 15:01:48 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 29 14:28:45 2016 -0700

--
 .travis.yml |1 +
 runners/direct-java/pom.xml |  400 ++
 .../direct/AbstractModelEnforcement.java|   38 +
 .../direct/AvroIOShardedWriteFactory.java   |   76 +
 .../direct/BoundedReadEvaluatorFactory.java |  155 ++
 .../beam/runners/direct/BundleFactory.java  |   49 +
 .../CachedThreadPoolExecutorServiceFactory.java |   44 +
 .../org/apache/beam/runners/direct/Clock.java   |   30 +
 .../beam/runners/direct/CommittedResult.java|   46 +
 .../beam/runners/direct/CompletionCallback.java |   36 +
 .../direct/ConsumerTrackingPipelineVisitor.java |  173 +++
 .../runners/direct/EmptyTransformEvaluator.java |   50 +
 .../direct/EncodabilityEnforcementFactory.java  |   70 +
 .../beam/runners/direct/EvaluatorKey.java   |   55 +
 .../runners/direct/ExecutorServiceFactory.java  |   33 +
 .../direct/ExecutorServiceParallelExecutor.java |  478 +++
 .../runners/direct/FlattenEvaluatorFactory.java |   85 ++
 .../runners/direct/ForwardingPTransform.java|   62 +
 .../direct/GroupByKeyEvaluatorFactory.java  |  274 
 .../ImmutabilityCheckingBundleFactory.java  |  131 ++
 .../direct/ImmutabilityEnforcementFactory.java  |  103 ++
 .../direct/InMemoryWatermarkManager.java| 1327 ++
 .../runners/direct/InProcessBundleFactory.java  |  162 +++
 .../direct/InProcessBundleOutputManager.java|   51 +
 .../direct/InProcessEvaluationContext.java  |  425 ++
 .../direct/InProcessExecutionContext.java   |  105 ++
 .../beam/runners/direct/InProcessExecutor.java  |   48 +
 .../direct/InProcessPipelineOptions.java|  101 ++
 .../runners/direct/InProcessPipelineRunner.java |  370 +
 .../beam/runners/direct/InProcessRegistrar.java |   55 +
 .../direct/InProcessSideInputContainer.java |  271 
 .../runners/direct/InProcessTimerInternals.java |   84 ++
 .../direct/InProcessTransformResult.java|   77 +
 .../direct/KeyedPValueTrackingVisitor.java  |   96 ++
 .../beam/runners/direct/ModelEnforcement.java   |   63 +
 .../runners/direct/ModelEnforcementFactory.java |   30 +
 .../beam/runners/direct/NanosOffsetClock.java   |   59 +
 .../direct/PTransformOverrideFactory.java   |   33 +
 .../runners/direct/ParDoInProcessEvaluator.java |  173 +++
 .../direct/ParDoMultiEvaluatorFactory.java  |   64 +
 .../direct/ParDoSingleEvaluatorFactory.java |   63 +
 .../direct/PassthroughTransformEvaluator.java   |   49 +
 .../runners/direct/ShardControlledWrite.java|   81 ++
 .../apache/beam/runners/direct/StepAndKey.java  |   71 +
 .../runners/direct/StepTransformResult.java |  165 +++
 .../direct/TextIOShardedWriteFactory.java   |   78 +
 .../beam/runners/direct/TransformEvaluator.java |   46 +
 .../direct/TransformEvaluatorFactory.java   |   44 +
 .../direct/TransformEvaluatorRegistry.java  |   77 +
 .../beam/runners/direct/TransformExecutor.java  |  176 +++
 .../direct/TransformExecutorService.java|   35 +
 .../direct/TransformExecutorServices.java   |  154 ++
 .../direct/UnboundedReadEvaluatorFactory.java   |  177 +++
 .../runners/direct/ViewEvaluatorFactory.java|  145 ++
 .../direct/WatermarkCallbackExecutor.java   |  146 ++
 .../runners/direct/WindowEvaluatorFactory.java  |  131 ++
 .../direct/AvroIOShardedWriteFactoryTest.java   |  112 ++
 .../direct/BoundedReadEvaluatorFactoryTest.java |  290 
 .../runners/direct/CommittedResultTest.java |   77 +
 .../ConsumerTrackingPipelineVisitorTest.java|  272 
 .../EncodabilityEnforcementFactoryTest.java |  257 
 .../direct/FlattenEvaluatorFactoryTest.java |  141 ++
 .../direct/ForwardingPTransformTest.java|  112 ++
 .../direct/GroupByKeyEvaluatorFactoryTest.java  |  183 +++
 .../ImmutabilityCheckingBundleFactoryTest.java  |  220 +++
 .../ImmutabilityEnforcementFactoryTest.java |  128 ++
 .../direct/InMemoryWatermarkManagerTest.java| 1168 +++
 .../direct/InProcessBundleFactoryTest.java  |  223 +++
 .../direct/InProcessEvaluationContextTest.java  |  526 +++
 .../direct/InProcessPipelineRegistrarTest.java  |   74 +
 .../direct/InProcessPipelineRunnerTest.java |   78 +
 .../direct/InProcessSideInputContainerTest.java |  496 +++
 

[05/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TextIOShardedWriteFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TextIOShardedWriteFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TextIOShardedWriteFactory.java
deleted file mode 100644
index fac5a40..000
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TextIOShardedWriteFactory.java
+++ /dev/null
@@ -1,78 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners.inprocess;
-
-import org.apache.beam.sdk.io.TextIO;
-import org.apache.beam.sdk.io.TextIO.Write.Bound;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.util.IOChannelUtils;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PDone;
-import org.apache.beam.sdk.values.PInput;
-import org.apache.beam.sdk.values.POutput;
-
-class TextIOShardedWriteFactory implements PTransformOverrideFactory {
-
-  @Override
-  public  PTransform override(
-  PTransform transform) {
-if (transform instanceof TextIO.Write.Bound) {
-  @SuppressWarnings("unchecked")
-  TextIO.Write.Bound originalWrite = (TextIO.Write.Bound) 
transform;
-  if (originalWrite.getNumShards() > 1
-  || (originalWrite.getNumShards() == 1
-  && !"".equals(originalWrite.getShardNameTemplate( {
-@SuppressWarnings("unchecked")
-PTransform override =
-(PTransform) new 
TextIOShardedWrite(originalWrite);
-return override;
-  }
-}
-return transform;
-  }
-
-  private static class TextIOShardedWrite extends 
ShardControlledWrite {
-private final TextIO.Write.Bound initial;
-
-private TextIOShardedWrite(Bound initial) {
-  this.initial = initial;
-}
-
-@Override
-int getNumShards() {
-  return initial.getNumShards();
-}
-
-@Override
-PTransform getSingleShardTransform(int 
shardNum) {
-  String shardName =
-  IOChannelUtils.constructName(
-  initial.getFilenamePrefix(),
-  initial.getShardTemplate(),
-  initial.getFilenameSuffix(),
-  shardNum,
-  getNumShards());
-  return 
TextIO.Write.withCoder(initial.getCoder()).to(shardName).withoutSharding();
-}
-
-@Override
-protected PTransform delegate() {
-  return initial;
-}
-  }
-}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TransformEvaluator.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TransformEvaluator.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TransformEvaluator.java
deleted file mode 100644
index e002329..000
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/TransformEvaluator.java
+++ /dev/null
@@ -1,46 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners.inprocess;
-
-import 

[10/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessEvaluationContextTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessEvaluationContextTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessEvaluationContextTest.java
new file mode 100644
index 000..59c4d8e
--- /dev/null
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessEvaluationContextTest.java
@@ -0,0 +1,526 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import static org.hamcrest.Matchers.contains;
+import static org.hamcrest.Matchers.containsInAnyOrder;
+import static org.hamcrest.Matchers.emptyIterable;
+import static org.hamcrest.Matchers.equalTo;
+import static org.hamcrest.Matchers.is;
+import static org.hamcrest.Matchers.not;
+import static org.junit.Assert.assertThat;
+
+import org.apache.beam.runners.direct.InMemoryWatermarkManager.FiredTimers;
+import org.apache.beam.runners.direct.InMemoryWatermarkManager.TimerUpdate;
+import 
org.apache.beam.runners.direct.InProcessExecutionContext.InProcessStepContext;
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.PCollectionViewWriter;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.coders.VarIntCoder;
+import org.apache.beam.sdk.io.CountingInput;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.View;
+import org.apache.beam.sdk.transforms.WithKeys;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
+import org.apache.beam.sdk.transforms.windowing.PaneInfo;
+import org.apache.beam.sdk.transforms.windowing.PaneInfo.Timing;
+import org.apache.beam.sdk.util.SideInputReader;
+import org.apache.beam.sdk.util.TimeDomain;
+import org.apache.beam.sdk.util.TimerInternals.TimerData;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.util.WindowingStrategy;
+import org.apache.beam.sdk.util.common.Counter;
+import org.apache.beam.sdk.util.common.Counter.AggregationKind;
+import org.apache.beam.sdk.util.common.CounterSet;
+import org.apache.beam.sdk.util.state.BagState;
+import org.apache.beam.sdk.util.state.CopyOnAccessInMemoryStateInternals;
+import org.apache.beam.sdk.util.state.StateNamespaces;
+import org.apache.beam.sdk.util.state.StateTag;
+import org.apache.beam.sdk.util.state.StateTags;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollection.IsBounded;
+import org.apache.beam.sdk.values.PCollectionView;
+import org.apache.beam.sdk.values.PValue;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Iterables;
+
+import org.hamcrest.Matchers;
+import org.joda.time.Instant;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+import java.util.Collection;
+import java.util.Collections;
+import java.util.Map;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * Tests for {@link InProcessEvaluationContext}.
+ */
+@RunWith(JUnit4.class)
+public class InProcessEvaluationContextTest {
+  private TestPipeline p;
+  private InProcessEvaluationContext context;
+
+  private PCollection created;
+  private PCollection> downstream;
+  private PCollectionView view;
+  private PCollection unbounded;
+  private Collection rootTransforms;
+  private Map> valueToConsumers;
+
+  private BundleFactory bundleFactory;
+
+  @Before
+  public void setup() {
+InProcessPipelineRunner runner =
+  

[13/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ParDoInProcessEvaluator.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ParDoInProcessEvaluator.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ParDoInProcessEvaluator.java
new file mode 100644
index 000..1c51738
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ParDoInProcessEvaluator.java
@@ -0,0 +1,173 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import 
org.apache.beam.runners.direct.InProcessExecutionContext.InProcessStepContext;
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.util.DoFnRunner;
+import org.apache.beam.sdk.util.DoFnRunners;
+import org.apache.beam.sdk.util.DoFnRunners.OutputManager;
+import org.apache.beam.sdk.util.SerializableUtils;
+import org.apache.beam.sdk.util.UserCodeException;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.util.common.CounterSet;
+import org.apache.beam.sdk.util.state.CopyOnAccessInMemoryStateInternals;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+import org.apache.beam.sdk.values.TupleTag;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+class ParDoInProcessEvaluator implements TransformEvaluator {
+  public static  ParDoInProcessEvaluator create(
+  InProcessEvaluationContext evaluationContext,
+  CommittedBundle inputBundle,
+  AppliedPTransform application,
+  DoFn fn,
+  List sideInputs,
+  TupleTag mainOutputTag,
+  List sideOutputTags,
+  Map outputs) {
+InProcessExecutionContext executionContext =
+evaluationContext.getExecutionContext(application, 
inputBundle.getKey());
+String stepName = evaluationContext.getStepName(application);
+InProcessStepContext stepContext =
+executionContext.getOrCreateStepContext(stepName, stepName);
+
+CounterSet counters = evaluationContext.createCounterSet();
+
+Map outputBundles = new HashMap<>();
+for (Map.Entry outputEntry : 
outputs.entrySet()) {
+  outputBundles.put(
+  outputEntry.getKey(),
+  evaluationContext.createBundle(inputBundle, outputEntry.getValue()));
+}
+
+DoFnRunner runner =
+DoFnRunners.createDefault(
+evaluationContext.getPipelineOptions(),
+SerializableUtils.clone(fn),
+evaluationContext.createSideInputReader(sideInputs),
+BundleOutputManager.create(outputBundles),
+mainOutputTag,
+sideOutputTags,
+stepContext,
+counters.getAddCounterMutator(),
+application.getInput().getWindowingStrategy());
+
+try {
+  runner.startBundle();
+} catch (Exception e) {
+  throw UserCodeException.wrap(e);
+}
+
+return new ParDoInProcessEvaluator<>(
+runner, application, counters, outputBundles.values(), stepContext);
+  }
+
+  

+
+  private final DoFnRunner fnRunner;
+  private final AppliedPTransform transform;
+  private final CounterSet counters;
+  private final Collection outputBundles;
+  private final InProcessStepContext stepContext;
+
+  private ParDoInProcessEvaluator(
+  DoFnRunner fnRunner,
+  AppliedPTransform transform,
+  CounterSet counters,
+  Collection outputBundles,
+   

[06/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/InProcessPipelineOptions.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/InProcessPipelineOptions.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/InProcessPipelineOptions.java
deleted file mode 100644
index bdc525a..000
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/InProcessPipelineOptions.java
+++ /dev/null
@@ -1,101 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.runners.inprocess;
-
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.options.ApplicationNameOptions;
-import org.apache.beam.sdk.options.Default;
-import org.apache.beam.sdk.options.Description;
-import org.apache.beam.sdk.options.Hidden;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.options.Validation.Required;
-import org.apache.beam.sdk.transforms.PTransform;
-
-import com.fasterxml.jackson.annotation.JsonIgnore;
-
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-
-/**
- * Options that can be used to configure the {@link InProcessPipelineRunner}.
- */
-public interface InProcessPipelineOptions extends PipelineOptions, 
ApplicationNameOptions {
-  /**
-   * Gets the {@link ExecutorServiceFactory} to use to create instances of 
{@link ExecutorService}
-   * to execute {@link PTransform PTransforms}.
-   *
-   * Note that {@link ExecutorService ExecutorServices} returned by the 
factory must ensure that
-   * it cannot enter a state in which it will not schedule additional pending 
work unless currently
-   * scheduled work completes, as this may cause the {@link Pipeline} to cease 
processing.
-   *
-   * Defaults to a {@link CachedThreadPoolExecutorServiceFactory}, which 
produces instances of
-   * {@link Executors#newCachedThreadPool()}.
-   */
-  @JsonIgnore
-  @Required
-  @Hidden
-  @Default.InstanceFactory(CachedThreadPoolExecutorServiceFactory.class)
-  ExecutorServiceFactory getExecutorServiceFactory();
-
-  void setExecutorServiceFactory(ExecutorServiceFactory executorService);
-
-  /**
-   * Gets the {@link Clock} used by this pipeline. The clock is used in place 
of accessing the
-   * system time when time values are required by the evaluator.
-   */
-  @Default.InstanceFactory(NanosOffsetClock.Factory.class)
-  @JsonIgnore
-  @Required
-  @Hidden
-  @Description(
-  "The processing time source used by the pipeline. When the current time 
is "
-  + "needed by the evaluator, the result of clock#now() is used.")
-  Clock getClock();
-
-  void setClock(Clock clock);
-
-  @Default.Boolean(false)
-  @Description(
-  "If the pipeline should shut down producers which have reached the 
maximum "
-  + "representable watermark. If this is set to true, a pipeline in 
which all PTransforms "
-  + "have reached the maximum watermark will be shut down, even if 
there are unbounded "
-  + "sources that could produce additional (late) data. By default, if 
the pipeline "
-  + "contains any unbounded PCollections, it will run until explicitly 
shut down.")
-  boolean isShutdownUnboundedProducersWithMaxWatermark();
-
-  void setShutdownUnboundedProducersWithMaxWatermark(boolean shutdown);
-
-  @Default.Boolean(true)
-  @Description(
-  "If the pipeline should block awaiting completion of the pipeline. If 
set to true, "
-  + "a call to Pipeline#run() will block until all PTransforms are 
complete. Otherwise, "
-  + "the Pipeline will execute asynchronously. If set to false, the 
completion of the "
-  + "pipeline can be awaited on by use of 
InProcessPipelineResult#awaitCompletion().")
-  boolean isBlockOnRun();
-
-  void setBlockOnRun(boolean b);
-
-  @Default.Boolean(true)
-  @Description(
-  "Controls whether the runner should ensure that all of the elements of 
every "
-  + "PCollection are not mutated. PTransforms are not permitted to 
mutate input elements "
-  + 

[09/17] incubator-beam git commit: Move InProcessRunner to its own module

2016-04-29 Thread kenn
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e13cacb8/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ParDoSingleEvaluatorFactoryTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ParDoSingleEvaluatorFactoryTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ParDoSingleEvaluatorFactoryTest.java
new file mode 100644
index 000..236ad17
--- /dev/null
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/ParDoSingleEvaluatorFactoryTest.java
@@ -0,0 +1,324 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.direct;
+
+import static org.hamcrest.Matchers.containsInAnyOrder;
+import static org.hamcrest.Matchers.equalTo;
+import static org.hamcrest.Matchers.not;
+import static org.hamcrest.Matchers.nullValue;
+import static org.junit.Assert.assertThat;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+import org.apache.beam.runners.direct.InMemoryWatermarkManager.TimerUpdate;
+import org.apache.beam.runners.direct.InProcessPipelineRunner.CommittedBundle;
+import 
org.apache.beam.runners.direct.InProcessPipelineRunner.UncommittedBundle;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
+import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
+import org.apache.beam.sdk.transforms.windowing.OutputTimeFns;
+import org.apache.beam.sdk.transforms.windowing.PaneInfo;
+import org.apache.beam.sdk.util.TimeDomain;
+import org.apache.beam.sdk.util.TimerInternals.TimerData;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.util.common.CounterSet;
+import org.apache.beam.sdk.util.state.BagState;
+import org.apache.beam.sdk.util.state.StateNamespace;
+import org.apache.beam.sdk.util.state.StateNamespaces;
+import org.apache.beam.sdk.util.state.StateTag;
+import org.apache.beam.sdk.util.state.StateTags;
+import org.apache.beam.sdk.util.state.WatermarkHoldState;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TupleTag;
+
+import org.hamcrest.Matchers;
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+import java.io.Serializable;
+
+/**
+ * Tests for {@link ParDoSingleEvaluatorFactory}.
+ */
+@RunWith(JUnit4.class)
+public class ParDoSingleEvaluatorFactoryTest implements Serializable {
+  private transient BundleFactory bundleFactory = 
InProcessBundleFactory.create();
+
+  @Test
+  public void testParDoInMemoryTransformEvaluator() throws Exception {
+TestPipeline p = TestPipeline.create();
+
+PCollection input = p.apply(Create.of("foo", "bara", "bazam"));
+PCollection collection =
+input.apply(
+ParDo.of(
+new DoFn() {
+  @Override
+  public void processElement(ProcessContext c) {
+c.output(c.element().length());
+  }
+}));
+CommittedBundle inputBundle =
+bundleFactory.createRootBundle(input).commit(Instant.now());
+
+InProcessEvaluationContext evaluationContext = 
mock(InProcessEvaluationContext.class);
+UncommittedBundle outputBundle = 
bundleFactory.createRootBundle(collection);
+when(evaluationContext.createBundle(inputBundle, 
collection)).thenReturn(outputBundle);
+InProcessExecutionContext executionContext =
+new InProcessExecutionContext(null, null, null, null);
+
when(evaluationContext.getExecutionContext(collection.getProducingTransformInternal(),
 null))
+.thenReturn(executionContext);
+CounterSet counters = new CounterSet();
+

[jira] [Commented] (BEAM-241) Not easy for runners to get late-data dropping

2016-04-29 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264823#comment-15264823
 ] 

Kenneth Knowles commented on BEAM-241:
--

This is getting into the realm of providing a backend/worker programming model 
as well. Dropping of too-late messages is essentially an operation interposed 
at the appropriate place in a runner. Which is not to vote against this - it 
falls right in with `ReduceFnRunner` and the proposed side-input-awaiting 
utilities as part of the runners/core package, which we are in the processing 
of spinning off.

> Not easy for runners to get late-data dropping
> --
>
> Key: BEAM-241
> URL: https://issues.apache.org/jira/browse/BEAM-241
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Frances Perry
>
> Quite by accident realized the Flink runner delegates to 
> GroupAlsoByWindowViaWindowSetDoFn for GBK, which in turn delegates to 
> ReduceFnRunner. The latter now assumes no messages will arrive beyond the 
> 'garbage collection' time of their target window(s).
> The Dataflow runner interposes a LateDataDroppingDoFnRunner into the path so 
> as to drop those too-late messages. That's done (I think) using 
> DoFnRunners.createDefault.
> I don't think the Flink runner does that.
> We need a nice runner-friendly way of dealing with the too-late data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-241) Not easy for runners to get late-data dropping

2016-04-29 Thread Mark Shields (JIRA)
Mark Shields created BEAM-241:
-

 Summary: Not easy for runners to get late-data dropping
 Key: BEAM-241
 URL: https://issues.apache.org/jira/browse/BEAM-241
 Project: Beam
  Issue Type: Bug
  Components: runner-core
Reporter: Mark Shields
Assignee: Frances Perry


Quite by accident realized the Flink runner delegates to 
GroupAlsoByWindowViaWindowSetDoFn for GBK, which in turn delegates to 
ReduceFnRunner. The latter now assumes no messages will arrive beyond the 
'garbage collection' time of their target window(s).

The Dataflow runner interposes a LateDataDroppingDoFnRunner into the path so as 
to drop those too-late messages. That's done (I think) using 
DoFnRunners.createDefault.

I don't think the Flink runner does that.

We need a nice runner-friendly way of dealing with the too-late data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264733#comment-15264733
 ] 

Aljoscha Krettek commented on BEAM-52:
--

Ah, I also implemented one here: 
https://github.com/aljoscha/incubator-beam/commits/kafka-sink. I was playing 
around with this today because I didn't want to wait. :-)

But no worries. Since you already started working on this and the issue is 
assigned to you we can discard my code.

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264663#comment-15264663
 ] 

Raghu Angadi commented on BEAM-52:
--

[~aljoscha], I am working on Sink, will send a PR very soon (hopefully today).

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Remove redundant close in B...

2016-04-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/257


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264449#comment-15264449
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/257


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #257

2016-04-29 Thread kenn
This closes #257


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/bba4c64d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/bba4c64d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/bba4c64d

Branch: refs/heads/master
Commit: bba4c64d341e9b1d7961a5e89426950b3a387155
Parents: 593bf0c 08c05e0
Author: Kenneth Knowles 
Authored: Fri Apr 29 11:01:10 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 29 11:01:10 2016 -0700

--
 .../beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java | 1 -
 1 file changed, 1 deletion(-)
--




[1/2] incubator-beam git commit: Remove redundant close in BoundedReadEvaluatorFactory

2016-04-29 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 593bf0c53 -> bba4c64d3


Remove redundant close in BoundedReadEvaluatorFactory

The reader is already closed by virtue of being the target of the
try-with-resources block that encompasses all of #finishBundle().


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/08c05e01
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/08c05e01
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/08c05e01

Branch: refs/heads/master
Commit: 08c05e01b6248de853e8bc3d8446ed98d3408a6e
Parents: a9387fc
Author: Thomas Groh 
Authored: Wed Apr 27 17:03:44 2016 -0700
Committer: Thomas Groh 
Committed: Wed Apr 27 17:08:41 2016 -0700

--
 .../beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/08c05e01/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java
index ef5581d..a394090 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/BoundedReadEvaluatorFactory.java
@@ -146,7 +146,6 @@ final class BoundedReadEvaluatorFactory implements 
TransformEvaluatorFactory {
   reader.getCurrent(), reader.getCurrentTimestamp()));
   contentsRemaining = reader.advance();
 }
-reader.close();
 return StepTransformResult.withHold(transform, 
BoundedWindow.TIMESTAMP_MAX_VALUE)
 .addOutput(output)
 .build();



[GitHub] incubator-beam pull request: Never trigger: fix broken link in Jav...

2016-04-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/251


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #251

2016-04-29 Thread kenn
This closes #251


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/593bf0c5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/593bf0c5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/593bf0c5

Branch: refs/heads/master
Commit: 593bf0c53fb13e21c99fee73cb1c7f4c9e8d264e
Parents: 661a4a8 a18d6b9
Author: Kenneth Knowles 
Authored: Fri Apr 29 10:52:22 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 29 10:52:22 2016 -0700

--
 .../main/java/org/apache/beam/sdk/transforms/windowing/Never.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] incubator-beam git commit: Never trigger: fix broken link in Javadoc

2016-04-29 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 661a4a893 -> 593bf0c53


Never trigger: fix broken link in Javadoc

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a18d6b93
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a18d6b93
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a18d6b93

Branch: refs/heads/master
Commit: a18d6b93cedf3ba016c32ad4cb08796215f9cd20
Parents: 6914f2a
Author: Daniel Halperin 
Authored: Wed Apr 27 10:32:06 2016 -0700
Committer: Daniel Halperin 
Committed: Wed Apr 27 10:32:06 2016 -0700

--
 .../main/java/org/apache/beam/sdk/transforms/windowing/Never.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a18d6b93/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Never.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Never.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Never.java
index 8e3e664..40f3496 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Never.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/Never.java
@@ -28,7 +28,7 @@ import java.util.List;
  *
  * 
  * Using this trigger will only produce output when the watermark passes the 
end of the
- * {@link BoundedWindow window} plus the {@link Window#withAllowedLateness() 
allowed
+ * {@link BoundedWindow window} plus the {@link Window#withAllowedLateness 
allowed
  * lateness}.
  */
 public final class Never {



[jira] [Commented] (BEAM-115) Beam Runner API

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264406#comment-15264406
 ] 

ASF GitHub Bot commented on BEAM-115:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/268

[BEAM-115] Make in-process GroupByKey consistent with Beam model

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is rebased onto #256.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam InProcessGroupByKey

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #268


commit 310f8c4477fbcc16815af69592158ea5df1a8d6d
Author: Kenneth Knowles 
Date:   2016-04-27T22:01:48Z

Move InProcessRunner to its own module

commit 89cfe6bf22dd5849d52237f0a9b3cd9d55e6ec11
Author: Kenneth Knowles 
Date:   2016-04-28T22:51:40Z

Add accessors for sub-coders of KeyedWorkItemCoder

commit 41568a8c3318ea4e15c68e2cf4d61e18c94867ce
Author: Kenneth Knowles 
Date:   2016-04-28T23:12:21Z

Make in-process GroupByKey respect future Beam model

This introduces or clarifies the following transforms:

 - InProcessGroupByKey, which expands like GroupByKeyViaGroupByKeyOnly
   but with different intermediate PCollection types.
 - InProcessGroupByKeyOnly, which outputs KeyedWorkItem. This existed
   already under a different name.
 - InProcessGroupAlsoByWindow, which is evaluated directly and
   accepts input elements of type KeyedWorkItem.




> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-115] Make in-process GroupByKey...

2016-04-29 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/268

[BEAM-115] Make in-process GroupByKey consistent with Beam model

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is rebased onto #256.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam InProcessGroupByKey

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/268.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #268


commit 310f8c4477fbcc16815af69592158ea5df1a8d6d
Author: Kenneth Knowles 
Date:   2016-04-27T22:01:48Z

Move InProcessRunner to its own module

commit 89cfe6bf22dd5849d52237f0a9b3cd9d55e6ec11
Author: Kenneth Knowles 
Date:   2016-04-28T22:51:40Z

Add accessors for sub-coders of KeyedWorkItemCoder

commit 41568a8c3318ea4e15c68e2cf4d61e18c94867ce
Author: Kenneth Knowles 
Date:   2016-04-28T23:12:21Z

Make in-process GroupByKey respect future Beam model

This introduces or clarifies the following transforms:

 - InProcessGroupByKey, which expands like GroupByKeyViaGroupByKeyOnly
   but with different intermediate PCollection types.
 - InProcessGroupByKeyOnly, which outputs KeyedWorkItem. This existed
   already under a different name.
 - InProcessGroupAlsoByWindow, which is evaluated directly and
   accepts input elements of type KeyedWorkItem.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Encapsulate cloning behavior of in-pr...

2016-04-29 Thread kennknowles
Github user kennknowles closed the pull request at:

https://github.com/apache/incubator-beam/pull/263


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-238) Add link URLs for sources / sinks

2016-04-29 Thread Scott Wegner (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264314#comment-15264314
 ] 

Scott Wegner commented on BEAM-238:
---

I meant to create this as a subtask to [BEAM-117] for adding links to 
registered display data on sources / sinks. But it appears I created a bug 
instead.

I'll create a new subtask for display data. [~jbonofre], If there's website 
work to be done on source/sink URLs, feel free to use this bug to track. 
Otherwise you can close it.

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-240) Add display data link URLs for sources / sinks

2016-04-29 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-240:
-

 Summary: Add display data link URLs for sources / sinks
 Key: BEAM-240
 URL: https://issues.apache.org/jira/browse/BEAM-240
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Scott Wegner
Assignee: Davor Bonaci






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-238) Add link URLs for sources / sinks

2016-04-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264319#comment-15264319
 ] 

Jean-Baptiste Onofré commented on BEAM-238:
---

Oh ok, sorry [~swegner] I misunderstood the purpose of this Jira.

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/4] incubator-beam git commit: fix Flink source coder handling

2016-04-29 Thread mxm
fix Flink source coder handling


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/aead96ff
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/aead96ff
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/aead96ff

Branch: refs/heads/master
Commit: aead96ff4c018b96a7b5ab1defb408c2a09b1be7
Parents: bc847a9
Author: Maximilian Michels 
Authored: Thu Apr 28 12:00:18 2016 +0200
Committer: Maximilian Michels 
Committed: Fri Apr 29 17:58:00 2016 +0200

--
 .../FlinkStreamingTransformTranslators.java | 13 +++-
 .../flink/translation/types/FlinkCoder.java | 64 
 .../streaming/io/UnboundedFlinkSource.java  | 12 +++-
 3 files changed, 84 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/aead96ff/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
index db24f9d..618727d 100644
--- 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
@@ -20,6 +20,7 @@ package org.apache.beam.runners.flink.translation;
 
 import org.apache.beam.runners.flink.translation.functions.UnionCoder;
 import org.apache.beam.runners.flink.translation.types.CoderTypeInformation;
+import org.apache.beam.runners.flink.translation.types.FlinkCoder;
 import org.apache.beam.runners.flink.translation.wrappers.SourceInputFormat;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkGroupAlsoByWindowWrapper;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkGroupByKeyWrapper;
@@ -262,9 +263,15 @@ public class FlinkStreamingTransformTranslators {
   DataStream source;
   if (transform.getSource().getClass().equals(UnboundedFlinkSource.class)) 
{
 @SuppressWarnings("unchecked")
-UnboundedFlinkSource flinkSource = (UnboundedFlinkSource) 
transform.getSource();
-source = context.getExecutionEnvironment()
-.addSource(flinkSource.getFlinkSource())
+UnboundedFlinkSource flinkSourceFunction = 
(UnboundedFlinkSource) transform.getSource();
+DataStream flinkSource = context.getExecutionEnvironment()
+.addSource(flinkSourceFunction.getFlinkSource());
+
+flinkSourceFunction.setCoder(
+new FlinkCoder(flinkSource.getType(),
+  context.getExecutionEnvironment().getConfig()));
+
+source = flinkSource
 .flatMap(new FlatMapFunction() {
   @Override
   public void flatMap(T s, Collector collector) 
throws Exception {

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/aead96ff/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/types/FlinkCoder.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/types/FlinkCoder.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/types/FlinkCoder.java
new file mode 100644
index 000..3b1e66e
--- /dev/null
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/types/FlinkCoder.java
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.types;
+
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.StandardCoder;
+import 

[3/4] incubator-beam git commit: Flink sink implementation

2016-04-29 Thread mxm
Flink sink implementation


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/bc847a95
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/bc847a95
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/bc847a95

Branch: refs/heads/master
Commit: bc847a9582447372461c5cf35450ba4a4c3d490d
Parents: 4fd9d74
Author: Maximilian Michels 
Authored: Fri Apr 22 12:33:26 2016 +0200
Committer: Maximilian Michels 
Committed: Fri Apr 29 17:58:00 2016 +0200

--
 .../FlinkStreamingTransformTranslators.java |  33 +++-
 .../streaming/io/UnboundedFlinkSink.java| 175 +++
 2 files changed, 204 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bc847a95/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
--
diff --git 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
index 927c3a2..db24f9d 100644
--- 
a/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
+++ 
b/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/translation/FlinkStreamingTransformTranslators.java
@@ -26,13 +26,16 @@ import 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkGroupBy
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkParDoBoundMultiWrapper;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkParDoBoundWrapper;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.FlinkStreamingCreateFunction;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedFlinkSink;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedFlinkSource;
 import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.KvCoder;
 import org.apache.beam.sdk.io.BoundedSource;
 import org.apache.beam.sdk.io.Read;
+import org.apache.beam.sdk.io.Sink;
 import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.io.Write;
 import org.apache.beam.sdk.transforms.Combine;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.DoFn;
@@ -64,12 +67,8 @@ import org.apache.flink.streaming.api.datastream.DataStream;
 import org.apache.flink.streaming.api.datastream.DataStreamSink;
 import org.apache.flink.streaming.api.datastream.KeyedStream;
 import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
-import 
org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks;
 import org.apache.flink.streaming.api.functions.IngestionTimeExtractor;
-import org.apache.flink.streaming.api.functions.TimestampAssigner;
-import org.apache.flink.streaming.api.watermark.Watermark;
 import org.apache.flink.util.Collector;
-import org.apache.kafka.common.utils.Time;
 import org.joda.time.Instant;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -104,6 +103,9 @@ public class FlinkStreamingTransformTranslators {
 TRANSLATORS.put(Read.Unbounded.class, new UnboundedReadSourceTranslator());
 TRANSLATORS.put(ParDo.Bound.class, new ParDoBoundStreamingTranslator());
 TRANSLATORS.put(TextIO.Write.Bound.class, new 
TextIOWriteBoundStreamingTranslator());
+
+TRANSLATORS.put(Write.Bound.class, new WriteSinkStreamingTranslator());
+
 TRANSLATORS.put(Window.Bound.class, new WindowBoundTranslator());
 TRANSLATORS.put(GroupByKey.class, new GroupByKeyTranslator());
 TRANSLATORS.put(Combine.PerKey.class, new CombinePerKeyTranslator());
@@ -193,6 +195,29 @@ public class FlinkStreamingTransformTranslators {
 }
   }
 
+  private static class WriteSinkStreamingTranslator implements 
FlinkStreamingPipelineTranslator.StreamTransformTranslator {
+
+@Override
+public void translateNode(Write.Bound transform, 
FlinkStreamingTranslationContext context) {
+  String name = transform.getName();
+  PValue input = context.getInput(transform);
+
+  Sink sink = transform.getSink();
+  if (!(sink instanceof UnboundedFlinkSink)) {
+throw new UnsupportedOperationException("At the time, only unbounded 
Flink sinks are supported.");
+  }
+
+  DataStream inputDataSet = 
context.getInputDataStream(input);
+
+  inputDataSet.flatMap(new FlatMapFunction() {
+

[GitHub] incubator-beam pull request: Add option to use Flink's Kafka Write...

2016-04-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/266


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[4/4] incubator-beam git commit: This closes #266

2016-04-29 Thread mxm
This closes #266


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/661a4a89
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/661a4a89
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/661a4a89

Branch: refs/heads/master
Commit: 661a4a893c5afa2f257969bd25d4c01c42693fac
Parents: 4fd9d74 63bce07
Author: Maximilian Michels 
Authored: Fri Apr 29 17:58:11 2016 +0200
Committer: Maximilian Michels 
Committed: Fri Apr 29 17:58:11 2016 +0200

--
 .../examples/streaming/KafkaIOExamples.java | 337 +++
 .../FlinkStreamingTransformTranslators.java |  46 ++-
 .../flink/translation/types/FlinkCoder.java |  64 
 .../streaming/io/UnboundedFlinkSink.java| 175 ++
 .../streaming/io/UnboundedFlinkSource.java  |  12 +-
 5 files changed, 625 insertions(+), 9 deletions(-)
--




[1/4] incubator-beam git commit: add Kafka IO examples

2016-04-29 Thread mxm
Repository: incubator-beam
Updated Branches:
  refs/heads/master 4fd9d74df -> 661a4a893


add Kafka IO examples


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/63bce07d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/63bce07d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/63bce07d

Branch: refs/heads/master
Commit: 63bce07d8c6cc5e610ad24e915e2585fef582567
Parents: aead96f
Author: Maximilian Michels 
Authored: Thu Apr 28 12:02:05 2016 +0200
Committer: Maximilian Michels 
Committed: Fri Apr 29 17:58:00 2016 +0200

--
 .../examples/streaming/KafkaIOExamples.java | 337 +++
 1 file changed, 337 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/63bce07d/runners/flink/examples/src/main/java/org/apache/beam/runners/flink/examples/streaming/KafkaIOExamples.java
--
diff --git 
a/runners/flink/examples/src/main/java/org/apache/beam/runners/flink/examples/streaming/KafkaIOExamples.java
 
b/runners/flink/examples/src/main/java/org/apache/beam/runners/flink/examples/streaming/KafkaIOExamples.java
new file mode 100644
index 000..af6bb35
--- /dev/null
+++ 
b/runners/flink/examples/src/main/java/org/apache/beam/runners/flink/examples/streaming/KafkaIOExamples.java
@@ -0,0 +1,337 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.examples.streaming;
+
+import org.apache.beam.runners.flink.FlinkPipelineOptions;
+import org.apache.beam.runners.flink.FlinkPipelineRunner;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedFlinkSink;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedFlinkSource;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.AvroCoder;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.Read;
+import org.apache.beam.sdk.io.Write;
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.TypeExtractor;
+import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
+import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer08;
+import org.apache.flink.streaming.util.serialization.DeserializationSchema;
+import org.apache.flink.streaming.util.serialization.SerializationSchema;
+import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.Properties;
+
+/**
+ * Recipes/Examples that demonstrate how to read/write data from/to Kafka.
+ */
+public class KafkaIOExamples {
+
+
+  private static final String KAFKA_TOPIC = "input";  // Default kafka topic 
to read from
+  private static final String KAFKA_AVRO_TOPIC = "output";  // Default kafka 
topic to read from
+  private static final String KAFKA_BROKER = "localhost:9092";  // Default 
kafka broker to contact
+  private static final String GROUP_ID = "myGroup";  // Default groupId
+  private static final String ZOOKEEPER = "localhost:2181";  // Default 
zookeeper to connect to for Kafka
+
+  /**
+   * Read/Write String data to Kafka
+   */
+  public static class KafkaString {
+
+/**
+ * Read String data from Kafka
+ */
+public static class ReadStringFromKafka {
+
+  public static void main(String[] args) {
+
+Pipeline p = initializePipeline(args);
+KafkaOptions options = getOptions(p);
+
+FlinkKafkaConsumer08 kafkaConsumer =
+new 

[jira] [Commented] (BEAM-154) Provide Maven BOM

2016-04-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264100#comment-15264100
 ] 

ASF GitHub Bot commented on BEAM-154:
-

GitHub user jbonofre opened a pull request:

https://github.com/apache/incubator-beam/pull/267

[BEAM-154] Use dependencyManagement and pluginManagement to keep all …

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Use of dependencyManagement and pluginManagement to align versions between 
modules.

There's checkstyle errors in several modules, I will fix in new commits.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/incubator-beam DEPMANAGEMENT

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #267


commit 2b174b1bf5e18e1e20b09a4e029750f8ad9ca95d
Author: Jean-Baptiste Onofré 
Date:   2016-04-29T13:26:01Z

[BEAM-154] Use dependencyManagement and pluginManagement to keep all 
modules sync in term of version




> Provide Maven BOM
> -
>
> Key: BEAM-154
> URL: https://issues.apache.org/jira/browse/BEAM-154
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> When using the Java SDK (for instance to develop IO), the developer has to 
> add dependencies in his pom.xml (like junit, hamcrest, slf4j, ...).
> To simplify the way to define the dependencies, each Beam SDK could provide a 
> Maven BoM (Bill of Material) describing these dependencies. Then the 
> developer could simply define this BoM as pom.xml dependency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-154] Use dependencyManagement a...

2016-04-29 Thread jbonofre
GitHub user jbonofre opened a pull request:

https://github.com/apache/incubator-beam/pull/267

[BEAM-154] Use dependencyManagement and pluginManagement to keep all …

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Use of dependencyManagement and pluginManagement to align versions between 
modules.

There's checkstyle errors in several modules, I will fix in new commits.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/incubator-beam DEPMANAGEMENT

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/267.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #267


commit 2b174b1bf5e18e1e20b09a4e029750f8ad9ca95d
Author: Jean-Baptiste Onofré 
Date:   2016-04-29T13:26:01Z

[BEAM-154] Use dependencyManagement and pluginManagement to keep all 
modules sync in term of version




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-103) Make UnboundedSourceWrapper parallel

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264040#comment-15264040
 ] 

Aljoscha Krettek commented on BEAM-103:
---

@mxm I don't now, thought so. :-)

> Make UnboundedSourceWrapper parallel
> 
>
> Key: BEAM-103
> URL: https://issues.apache.org/jira/browse/BEAM-103
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>
> As of now {{UnboundedSource}} s are executed with a parallelism of 1 
> regardless of the splits which the source returns. The corresponding 
> {{UnboundedSourceWrapper}} should implement {{RichParallelSourceFunction}} 
> and deal with splits correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264037#comment-15264037
 ] 

Aljoscha Krettek commented on BEAM-52:
--

Ok, thanks! I'll get crackin' :-)

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264022#comment-15264022
 ] 

Jean-Baptiste Onofré commented on BEAM-52:
--

[~aljoscha] I planned to work on it, but not yet started. So you can go ahead, 
I will work on the other new IOs on my side.

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : beam_PostCommit_MavenVerify #306

2016-04-29 Thread Apache Jenkins Server
See 



[jira] [Commented] (BEAM-103) Make UnboundedSourceWrapper parallel

2016-04-29 Thread Kostas Kloudas (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264019#comment-15264019
 ] 

Kostas Kloudas commented on BEAM-103:
-

I agree that this is pressing. 
I will be able to work on this shortly. 

> Make UnboundedSourceWrapper parallel
> 
>
> Key: BEAM-103
> URL: https://issues.apache.org/jira/browse/BEAM-103
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>
> As of now {{UnboundedSource}} s are executed with a parallelism of 1 
> regardless of the splits which the source returns. The corresponding 
> {{UnboundedSourceWrapper}} should implement {{RichParallelSourceFunction}} 
> and deal with splits correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is back to normal : beam_PostCommit_MavenVerify » Apache Beam :: SDKs :: Java :: Core #306

2016-04-29 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-52) KafkaIO - bounded/unbounded, source/sink

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-52?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264018#comment-15264018
 ] 

Aljoscha Krettek commented on BEAM-52:
--

[~rangadi], [~dhalp...@google.com] is anyone of you working on a Kafka sink? I 
would like to add a minimum viable version based on a DoFn so that users can 
write end-to-end pipelines with Beam and Kafka.

> KafkaIO - bounded/unbounded, source/sink
> 
>
> Key: BEAM-52
> URL: https://issues.apache.org/jira/browse/BEAM-52
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Daniel Halperin
>Assignee: Raghu Angadi
>
> We should support Apache Kafka. The priority list is probably:
> * UnboundedSource
> * unbounded Sink
> * BoundedSource
> * bounded Sink
> The connector should be well-tested, especially around UnboundedSource 
> checkpointing and resuming, and data duplication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-103) Make UnboundedSourceWrapper parallel

2016-04-29 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264013#comment-15264013
 ] 

Maximilian Michels commented on BEAM-103:
-

Curious to know where I have mentioned that :) I believe [~kkl0u] wanted to 
work on this and he is currently assigned to the issue. Perhaps too many things 
in the pipeline..

+1 I think this is pressing. That's why I initially opened the issue. Do you 
think you'll have time to work on this soon [~kkl0u]?

> Make UnboundedSourceWrapper parallel
> 
>
> Key: BEAM-103
> URL: https://issues.apache.org/jira/browse/BEAM-103
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Kostas Kloudas
>
> As of now {{UnboundedSource}} s are executed with a parallelism of 1 
> regardless of the splits which the source returns. The corresponding 
> {{UnboundedSourceWrapper}} should implement {{RichParallelSourceFunction}} 
> and deal with splits correctly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-139) Print mode (batch/streaming) during translation

2016-04-29 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-139:

Component/s: (was: runner-core)

> Print mode (batch/streaming) during translation 
> 
>
> Key: BEAM-139
> URL: https://issues.apache.org/jira/browse/BEAM-139
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-flink
>Reporter: Maximilian Michels
>Assignee: Davor Bonaci
>
> Runners have different feature sets for batch and streaming. It may be a good 
> idea to print/log the translation mode during parsing of the options in the 
> {{PipelineOptionsFactory}}. That would help users to understand they are 
> missing the streaming flag in the options and that the default mode is batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-139) Print mode (batch/streaming) during translation

2016-04-29 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264001#comment-15264001
 ] 

Maximilian Michels commented on BEAM-139:
-

I agree with the direction. Runners may still announce whether they're 
translating to batch or streaming but the explicit flag should go away.

> Print mode (batch/streaming) during translation 
> 
>
> Key: BEAM-139
> URL: https://issues.apache.org/jira/browse/BEAM-139
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-flink
>Reporter: Maximilian Michels
>Assignee: Davor Bonaci
>
> Runners have different feature sets for batch and streaming. It may be a good 
> idea to print/log the translation mode during parsing of the options in the 
> {{PipelineOptionsFactory}}. That would help users to understand they are 
> missing the streaming flag in the options and that the default mode is batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-139) Print mode (batch/streaming) during translation

2016-04-29 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263979#comment-15263979
 ] 

Aljoscha Krettek commented on BEAM-139:
---

+1, this is rendered obsolete by BEAM-235.

> Print mode (batch/streaming) during translation 
> 
>
> Key: BEAM-139
> URL: https://issues.apache.org/jira/browse/BEAM-139
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-flink
>Reporter: Maximilian Michels
>Assignee: Davor Bonaci
>
> Runners have different feature sets for batch and streaming. It may be a good 
> idea to print/log the translation mode during parsing of the options in the 
> {{PipelineOptionsFactory}}. That would help users to understand they are 
> missing the streaming flag in the options and that the default mode is batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: beam_PostCommit_MavenVerify #305

2016-04-29 Thread Apache Jenkins Server
See 

--
[...truncated 798 lines...]
[INFO] Including io.grpc:grpc-protobuf:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-stub:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-auth:jar:0.12.0 in the shaded jar.
[INFO] Including com.google.auth:google-auth-library-oauth2-http:jar:0.3.1 in 
the shaded jar.
[INFO] Including com.google.auth:google-auth-library-credentials:jar:0.3.1 in 
the shaded jar.
[INFO] Including io.netty:netty-handler:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-buffer:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-common:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-transport:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-resolver:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-codec:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-pubsub-v1:jar:0.0.2 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-core-proto:jar:0.0.3 in the shaded 
jar.
[INFO] Including com.google.api-client:google-api-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-bigquery:jar:v2-rev248-1.21.0 in the shaded 
jar.
[INFO] Including com.google.apis:google-api-services-pubsub:jar:v1-rev7-1.21.0 
in the shaded jar.
[INFO] Including 
com.google.apis:google-api-services-storage:jar:v1-rev53-1.21.0 in the shaded 
jar.
[INFO] Including com.google.http-client:google-http-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including org.apache.httpcomponents:httpclient:jar:4.0.1 in the shaded 
jar.
[INFO] Including org.apache.httpcomponents:httpcore:jar:4.0.1 in the shaded jar.
[INFO] Including commons-logging:commons-logging:jar:1.1.1 in the shaded jar.
[INFO] Including commons-codec:commons-codec:jar:1.3 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson2:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-protobuf:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client-java6:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-datastore-protobuf:jar:v1beta2-rev1-4.0.0 
in the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:gcsio:jar:1.4.3 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client-java6:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.api-client:google-api-client-jackson2:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:util:jar:1.4.3 in the shaded jar.
[INFO] Excluding com.google.guava:guava:jar:19.0 from the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java:jar:3.0.0-beta-1 in the 
shaded jar.
[INFO] Including com.google.code.findbugs:jsr305:jar:3.0.1 in the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-core:jar:2.7.0 in the 
shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-annotations:jar:2.7.0 in 
the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-databind:jar:2.7.0 in the 
shaded jar.
[INFO] Including org.slf4j:slf4j-api:jar:1.7.14 in the shaded jar.
[INFO] Including org.apache.avro:avro:jar:1.7.7 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded 
jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the 
shaded jar.
[INFO] Including com.thoughtworks.paranamer:paranamer:jar:2.3 in the shaded jar.
[INFO] Including org.xerial.snappy:snappy-java:jar:1.1.2.1 in the shaded jar.
[INFO] Including org.apache.commons:commons-compress:jar:1.9 in the shaded jar.
[INFO] Including joda-time:joda-time:jar:2.4 in the shaded jar.
[INFO] Including org.codehaus.woodstox:stax2-api:jar:3.1.4 in the shaded jar.
[INFO] Including org.codehaus.woodstox:woodstox-core-asl:jar:4.4.1 in the 
shaded jar.
[INFO] Including org.tukaani:xz:jar:1.5 in the shaded jar.
[INFO] Including com.google.auto.service:auto-service:jar:1.0-rc2 in the shaded 
jar.
[INFO] Including com.google.auto:auto-common:jar:0.3 in the shaded jar.
[WARNING] grpc-netty-0.12.0.jar, grpc-all-0.12.0.jar define 80 overlapping 
classes: 
[WARNING]   - io.grpc.netty.AbstractNettyHandler
[WARNING]   - io.grpc.netty.NettyClientTransport
[WARNING]   - io.grpc.netty.NettyClientStream$1
[WARNING]   - io.grpc.netty.SendResponseHeadersCommand
[WARNING]   - io.grpc.netty.NettyServer$2
[WARNING]   - io.grpc.netty.NettyClientHandler$4
[WARNING]   - io.grpc.netty.NettyClientTransport$3
[WARNING]   - io.grpc.netty.ProtocolNegotiators$1$1
[WARNING]   - io.grpc.netty.NettyClientHandler$FrameListener
[WARNING]   - io.grpc.netty.ProtocolNegotiators$1
[WARNING]   

Build failed in Jenkins: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #240

2016-04-29 Thread Apache Jenkins Server
See 


--
[...truncated 7688 lines...]
[INFO] Including com.google.guava:guava:jar:19.0 in the shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java:jar:3.0.0-beta-1 from the 
shaded jar.
[INFO] Excluding com.google.code.findbugs:jsr305:jar:3.0.1 from the shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-core:jar:2.7.0 from the 
shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-annotations:jar:2.7.0 from 
the shaded jar.
[INFO] Excluding com.fasterxml.jackson.core:jackson-databind:jar:2.7.0 from the 
shaded jar.
[INFO] Excluding org.slf4j:slf4j-api:jar:1.7.14 from the shaded jar.
[INFO] Excluding org.apache.avro:avro:jar:1.7.7 from the shaded jar.
[INFO] Excluding org.codehaus.jackson:jackson-core-asl:jar:1.9.13 from the 
shaded jar.
[INFO] Excluding org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 from the 
shaded jar.
[INFO] Excluding com.thoughtworks.paranamer:paranamer:jar:2.3 from the shaded 
jar.
[INFO] Excluding org.xerial.snappy:snappy-java:jar:1.1.2.1 from the shaded jar.
[INFO] Excluding org.apache.commons:commons-compress:jar:1.9 from the shaded 
jar.
[INFO] Excluding joda-time:joda-time:jar:2.4 from the shaded jar.
[INFO] Excluding org.codehaus.woodstox:stax2-api:jar:3.1.4 from the shaded jar.
[INFO] Excluding org.codehaus.woodstox:woodstox-core-asl:jar:4.4.1 from the 
shaded jar.
[INFO] Excluding org.tukaani:xz:jar:1.5 from the shaded jar.
[INFO] Excluding com.google.auto.service:auto-service:jar:1.0-rc2 from the 
shaded jar.
[INFO] Excluding com.google.auto:auto-common:jar:0.3 from the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-shade-plugin:2.4.1:shade (bundle-rest-without-repackaging) @ 
java-sdk-all ---
[INFO] Including io.grpc:grpc-all:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-core:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-okhttp:jar:0.12.0 in the shaded jar.
[INFO] Including com.squareup.okio:okio:jar:1.6.0 in the shaded jar.
[INFO] Including com.squareup.okhttp:okhttp:jar:2.5.0 in the shaded jar.
[INFO] Including io.grpc:grpc-protobuf-nano:jar:0.12.0 in the shaded jar.
[INFO] Including com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-4 
in the shaded jar.
[INFO] Including io.grpc:grpc-netty:jar:0.12.0 in the shaded jar.
[INFO] Including io.netty:netty-codec-http2:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-codec-http:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including com.twitter:hpack:jar:0.10.1 in the shaded jar.
[INFO] Including io.grpc:grpc-protobuf:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-stub:jar:0.12.0 in the shaded jar.
[INFO] Including io.grpc:grpc-auth:jar:0.12.0 in the shaded jar.
[INFO] Including com.google.auth:google-auth-library-oauth2-http:jar:0.3.1 in 
the shaded jar.
[INFO] Including com.google.auth:google-auth-library-credentials:jar:0.3.1 in 
the shaded jar.
[INFO] Including io.netty:netty-handler:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-buffer:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-common:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-transport:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-resolver:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including io.netty:netty-codec:jar:4.1.0.Beta8 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-pubsub-v1:jar:0.0.2 in the shaded jar.
[INFO] Including com.google.api.grpc:grpc-core-proto:jar:0.0.3 in the shaded 
jar.
[INFO] Including com.google.api-client:google-api-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-bigquery:jar:v2-rev248-1.21.0 in the shaded 
jar.
[INFO] Including 

[GitHub] incubator-beam pull request: Add option to use Flink's Kafka Write...

2016-04-29 Thread mxm
GitHub user mxm opened a pull request:

https://github.com/apache/incubator-beam/pull/266

Add option to use Flink's Kafka Write IO

This pull request adds the counterpart of the UnboundedFlinkSource, the 
`UnboundedFlinkSink` which uses the `Write` API. Users have requested this 
multiple times, e.g. to use the Flink Kafka Producer in a Beam program. In the 
long run we will opt only for Beam IO interfaces. I would like to replace the 
custom Flink sources and sinks as soon as we have the relevant connectors for 
users in place. In the meantime, users can explore the potential of Beam using 
also native backend connectors. 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mxm/incubator-beam kafkaSink-pr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/266.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #266


commit 3780b01f9ff0a2ffb645b961e127c50ae97affd8
Author: Maximilian Michels 
Date:   2016-04-22T10:33:26Z

Kafka sink implementation

commit 1db316971b6ecd0a27cefb0408266c914c1f7d89
Author: Maximilian Michels 
Date:   2016-04-28T10:00:18Z

fix Flink source coder handling

commit fff968b03177ba53f3bdad2055f67dc5633d5628
Author: Maximilian Michels 
Date:   2016-04-28T10:02:05Z

add Kafka IO examples




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-238) Add link URLs for sources / sinks

2016-04-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263645#comment-15263645
 ] 

Jean-Baptiste Onofré commented on BEAM-238:
---

I plan to propose a PR with a complete overview of IOs, DSLs, ...

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-238) Add link URLs for sources / sinks

2016-04-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré updated BEAM-238:
--
Component/s: website

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-238) Add link URLs for sources / sinks

2016-04-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré reassigned BEAM-238:
-

Assignee: Jean-Baptiste Onofré

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: beam_PostCommit_MavenVerify » Apache Beam :: SDKs :: Java :: Core #304

2016-04-29 Thread Apache Jenkins Server
See 


--
[...truncated 540 lines...]
Running org.apache.beam.sdk.util.UserCodeExceptionTest
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec - in 
org.apache.beam.sdk.util.UserCodeExceptionTest
Running org.apache.beam.sdk.util.SerializerTest
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 sec - in 
org.apache.beam.sdk.util.SerializerTest
Running org.apache.beam.sdk.util.AttemptBoundedExponentialBackOffTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.002 sec - in 
org.apache.beam.sdk.util.AttemptBoundedExponentialBackOffTest
Running org.apache.beam.sdk.util.BigQueryServicesImplTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.015 sec - in 
org.apache.beam.sdk.util.BigQueryServicesImplTest
Running org.apache.beam.sdk.util.IOChannelUtilsTest
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.01 sec - in 
org.apache.beam.sdk.util.IOChannelUtilsTest

Results :

Tests run: 2769, Failures: 0, Errors: 0, Skipped: 3

[JENKINS] Recording test results
[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:report (report) @ 
java-sdk-all ---
[INFO] Analyzed bundle 'Apache Beam :: SDKs :: Java :: Core' with 1085 classes
[INFO] 
[INFO] --- maven-jar-plugin:2.5:jar (default-jar) @ java-sdk-all ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
java-sdk-all ---
[INFO] 
[INFO] --- maven-jar-plugin:2.5:test-jar (default-test-jar) @ java-sdk-all ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-javadoc-plugin:2.10.3:jar (default) @ java-sdk-all ---
[INFO] 
1 warning
[WARNING] Javadoc Warnings
[WARNING] 
:34:
 warning - Tag @link: can't find withAllowedLateness() in 
org.apache.beam.sdk.transforms.windowing.Window
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-shade-plugin:2.4.1:shade (bundle-and-repackage) @ java-sdk-all 
---
[INFO] Excluding io.grpc:grpc-all:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-4 
from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding com.twitter:hpack:jar:0.10.1 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.3.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.3.1 from 
the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-pubsub-v1:jar:0.0.2 from the shaded 
jar.
[INFO] Excluding com.google.api.grpc:grpc-core-proto:jar:0.0.3 from the shaded 
jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.21.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev248-1.21.0 from the 
shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev7-1.21.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev53-1.21.0 from the shaded 
jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.21.0 from the 
shaded jar.
[INFO] Excluding