Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4956

2017-10-05 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #3228

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2720) Exact-once Kafka sink

2017-10-05 Thread Raghu Angadi (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194020#comment-16194020
 ] 

Raghu Angadi commented on BEAM-2720:


Filed BEAM-3025 about caveats with the current implementation. Some of these 
are simple to address and some are not.

> Exact-once Kafka sink
> -
>
> Key: BEAM-2720
> URL: https://issues.apache.org/jira/browse/BEAM-2720
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>
> Kafka 0.11 added support for transactions, which enables end-to-end 
> exactly-once semantics. Beam's KafkaIO users can benefit from this while 
> using runners that support exactly-once processing.
> I have an implementation of EOS support for the Kafka sink: 
> https://github.com/apache/beam/pull/3612
> It uses two shuffles and builds on the Beam state API and the checkpoint 
> barrier between stages (as in Dataflow). The pull request has a longer 
> description.
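
For illustration only, here is a minimal sketch of how a pipeline might enable 
the sink, assuming the withEOS(numShards, sinkGroupId) option added by the pull 
request. The broker address, topic, and record types are placeholders, and a 
runner with exactly-once processing is required:

{code}
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

class ExactlyOnceKafkaSinkSketch {
  /** Writes records with the exactly-once Kafka sink; illustrative usage only. */
  static void writeExactlyOnce(PCollection<KV<Long, String>> records) {
    records.apply(
        KafkaIO.<Long, String>write()
            .withBootstrapServers("broker-1:9092")   // placeholder broker
            .withTopic("results")                    // placeholder topic
            .withKeySerializer(LongSerializer.class)
            .withValueSerializer(StringSerializer.class)
            // numShards fixes the sink parallelism; the sink group id names the
            // metadata this sink stores in Kafka. Only meaningful on runners
            // that provide exactly-once processing (e.g. Dataflow).
            .withEOS(10, "my-sink-group"));
  }
}
{code}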



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3025) Caveats with KafkaIO exactly once support.

2017-10-05 Thread Raghu Angadi (JIRA)
Raghu Angadi created BEAM-3025:
--

 Summary: Caveats with KafkaIO exactly once support.
 Key: BEAM-3025
 URL: https://issues.apache.org/jira/browse/BEAM-3025
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-extensions
Affects Versions: Not applicable
Reporter: Raghu Angadi
Assignee: Raghu Angadi


BEAM-2720 adds support for exactly-once semantics in the KafkaIO sink. It is 
tested with Dataflow and seems to work well. It does come with a few caveats 
that we should address over time:

  * The implementation requires specific durability/checkpoint semantics 
    across a GroupByKey transform.
    ** It requires a checkpoint across the GBK. Not all runners support this; 
       specifically, the horizontally distributed checkpoint in Flink does 
       not work.
    ** The implementation includes a runtime check for compatibility.
  * It requires stateful DoFn support (see the sketch after this list). Not 
    all runners support this, and even those that do often have their own 
    caveats. Stateful DoFn is part of the core Beam model, and over time it 
    will be widely supported across runners.
  * The user records go through extra shuffles. The implementation results in 
    2 extra shuffles in Dataflow. Some enhancements to the Beam API might 
    reduce the number of shuffles.
  * It requires the user to specify a 'number of shards', which determines 
    sink parallelism. The shard ids are also used to store some state in 
    topic metadata on the Kafka servers. If the number of shards is larger 
    than the number of partitions for the output topic, the behavior is not 
    documented, though tests seem to work fine: I am able to store metadata 
    for 100 partitions even though the topic has just 10 partitions. We 
    should probably file a JIRA with Kafka. Alternatively, we could limit the 
    number of shards to the number of partitions (not required yet).
  * The metadata mentioned above is kept only for 24 hours by default, i.e., 
    if a pipeline does not write anything for a day or is down for a day, it 
    could lose crucial state stored with Kafka. An admin can configure this 
    to be larger on the Kafka servers, but there is no way for a client to 
    increase it for a specific topic. Note that Kafka Streams faces the same 
    issue.
    ** I asked about both of these Kafka issues on the user list: 
       https://www.mail-archive.com/users@kafka.apache.org/msg27624.html
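
To make the stateful DoFn dependency concrete, here is a small illustrative 
sketch (not the actual sink code) of the kind of per-shard keyed state the 
implementation relies on, in this case assigning a monotonically increasing 
sequence number per shard key. It assumes a runner with stateful DoFn support 
and the org.apache.beam.sdk.state API:

{code}
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.state.ValueState;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

/** Illustrative only: per-shard sequence numbering using the Beam state API. */
class SequencePerShardFn extends DoFn<KV<Integer, String>, KV<Integer, KV<Long, String>>> {

  @StateId("nextSequence")
  private final StateSpec<ValueState<Long>> nextSequenceSpec = StateSpecs.value();

  @ProcessElement
  public void process(
      ProcessContext c, @StateId("nextSequence") ValueState<Long> nextSequence) {
    Long stored = nextSequence.read();
    long seq = (stored == null) ? 0L : stored;
    // Tag the record with the next sequence number for its shard (the key).
    c.output(KV.of(c.element().getKey(), KV.of(seq, c.element().getValue())));
    nextSequence.write(seq + 1);
  }
}
{code}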





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4957

2017-10-05 Thread Apache Jenkins Server
See 


Changes:

[klk] Make PackageUtil a proper class encapsulating its ExecutorService

[klk] Use AutoValue for Dataflow PackageAttributes

[klk] Refactor PackageUtil for more and simpler asynchrony

--
[...truncated 958.09 KB...]
2017-10-06T01:21:33.673 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/google/inject/extensions/guice-servlet/3.0/guice-servlet-3.0.jar
 (64 KB at 24.5 KB/sec)
2017-10-06T01:21:33.674 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/htrace/htrace-core/3.1.0-incubating/htrace-core-3.1.0-incubating.jar
 (1442 KB at 556.1 KB/sec)
2017-10-06T01:21:33.714 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.7.3/hadoop-mapreduce-client-core-2.7.3.jar
 (1521 KB at 577.5 KB/sec)
2017-10-06T01:21:33.715 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/google/inject/guice/3.0/guice-3.0.jar 
(694 KB at 263.5 KB/sec)
2017-10-06T01:21:33.779 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/io/netty/netty/3.6.2.Final/netty-3.6.2.Final.jar
 (1172 KB at 434.4 KB/sec)
2017-10-06T01:21:33.791 [INFO] Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/2.7.3/hadoop-hdfs-2.7.3.jar
2017-10-06T01:21:33.991 [INFO] Downloading: 
http://repository.jboss.org/nexus/content/groups/public/org/apache/hadoop/hadoop-hdfs/2.7.3/hadoop-hdfs-2.7.3.jar
[JENKINS] Archiving disabled
2017-10-06T01:21:35.064 [INFO]  
   
2017-10-06T01:21:35.065 [INFO] 

2017-10-06T01:21:35.065 [INFO] Skipping Apache Beam :: Parent
2017-10-06T01:21:35.065 [INFO] This project has been banned from the build due 
to previous failures.
2017-10-06T01:21:35.065 [INFO] 

[JENKINS] Archiving disabled
2017-10-06T01:22:10.372 [INFO] 

2017-10-06T01:22:10.372 [INFO] Reactor Summary:
2017-10-06T01:22:10.372 [INFO] 
2017-10-06T01:22:10.372 [INFO] Apache Beam :: Parent 
.. SUCCESS [ 56.848 s]
2017-10-06T01:22:10.372 [INFO] Apache Beam :: SDKs :: Java :: Build Tools 
. SUCCESS [ 19.939 s]
2017-10-06T01:22:10.372 [INFO] Apache Beam :: SDKs 
 SUCCESS [  9.632 s]
2017-10-06T01:22:10.372 [INFO] Apache Beam :: SDKs :: Common 
.. SUCCESS [  4.216 s]
2017-10-06T01:22:10.372 [INFO] Apache Beam :: SDKs :: Common :: Runner API 
 SUCCESS [01:07 min]
2017-10-06T01:22:10.372 [INFO] Apache Beam :: SDKs :: Common :: Fn API 
 SUCCESS [ 32.685 s]
2017-10-06T01:22:10.373 [INFO] Apache Beam :: SDKs :: Java 
 SUCCESS [  3.351 s]
2017-10-06T01:22:10.373 [INFO] Apache Beam :: SDKs :: Java :: Core 
 SUCCESS [05:35 min]
2017-10-06T01:22:10.373 [INFO] Apache Beam :: 

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3227

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3024) Dataflow ValidatesRunner approaching 120 minute timeout

2017-10-05 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3024:
--
Summary: Dataflow ValidatesRunner approaching 120 minute timeout  (was: 
Dataflow ValidatesRunner approach 120 minute timeout)

> Dataflow ValidatesRunner approaching 120 minute timeout
> ---
>
> Key: BEAM-3024
> URL: https://issues.apache.org/jira/browse/BEAM-3024
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> It takes 1h30m to run the tests. Combined with other overheads, we will see 
> more timeout flakes like this: 
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/4107/consoleFull
> From that log:
> {code}
> 2017-10-05T22:56:04.740 [INFO] Results:
> 2017-10-05T22:56:04.740 [INFO] 
> 2017-10-05T22:56:04.740 [WARNING] Tests run: 258, Failures: 0, Errors: 0, 
> Skipped: 1
> ...
> 2017-10-05T22:56:14.600 [WARNING] Failed to notify spy 
> hudson.maven.Maven3Builder$JenkinsEventSpy: java.lang.InterruptedException
> 2017-10-05T22:56:14.600 [INFO] 
> 
> 2017-10-05T22:56:14.600 [INFO] Reactor Summary:
> 2017-10-05T22:56:14.600 [INFO] 
> 2017-10-05T22:56:14.600 [INFO] Apache Beam :: Parent 
> .. SUCCESS [ 48.256 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Build Tools 
> . SUCCESS [ 32.572 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs 
>  SUCCESS [  4.731 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common 
> .. SUCCESS [  3.581 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common :: Runner API 
>  SUCCESS [ 49.196 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common :: Fn API 
>  SUCCESS [ 19.548 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java 
>  SUCCESS [  2.505 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Core 
>  SUCCESS [03:38 min]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners 
> . SUCCESS [  2.401 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Core Construction 
> Java ... SUCCESS [ 58.770 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Core Java 
>  SUCCESS [01:01 min]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Direct Java 
> .. SUCCESS [14:27 min]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: IO 
> .. SUCCESS [  3.558 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions 
> .. SUCCESS [  3.310 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
> Google Cloud Platform Core SUCCESS [ 35.687 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
> Protobuf SUCCESS [ 37.732 s]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: IO :: Google 
> Cloud Platform SUCCESS [02:34 min]
> 2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Google Cloud 
> Dataflow  SUCCESS [  01:31 h]
> 2017-10-05T22:56:14.601 [INFO] 
> 
> 2017-10-05T22:56:14.601 [INFO] BUILD SUCCESS
> 2017-10-05T22:56:14.601 [INFO] 
> 
> 2017-10-05T22:56:14.601 [INFO] Total time: 01:59 h
> 2017-10-05T22:56:14.601 [INFO] Finished at: 2017-10-05T22:56:14+00:00
> Setting status of f630545926467dae7f477c575d8a8b78511aecef to FAILURE with 
> url 
> https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/4107/
>  and message: 'FAILURE
>  '
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/4] beam git commit: Use AutoValue for Dataflow PackageAttributes

2017-10-05 Thread kenn
Use AutoValue for Dataflow PackageAttributes


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a328127b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a328127b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a328127b

Branch: refs/heads/master
Commit: a328127b9b0a0f59816bcbe84646446b4f75aafc
Parents: a211bd9
Author: Kenneth Knowles 
Authored: Thu Sep 28 20:03:51 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Oct 5 17:35:04 2017 -0700

--
 .../beam/runners/dataflow/util/PackageUtil.java | 164 ---
 .../runners/dataflow/util/PackageUtilTest.java  |  29 ++--
 2 files changed, 84 insertions(+), 109 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a328127b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
index 9d1e084..7496d1c 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
@@ -23,6 +23,7 @@ import com.fasterxml.jackson.core.Base64Variants;
 import com.google.api.client.util.BackOff;
 import com.google.api.client.util.Sleeper;
 import com.google.api.services.dataflow.model.DataflowPackage;
+import com.google.auto.value.AutoValue;
 import com.google.cloud.hadoop.util.ApiErrorExtractor;
 import com.google.common.collect.Lists;
 import com.google.common.hash.Funnels;
@@ -46,7 +47,6 @@ import java.util.Collections;
 import java.util.Comparator;
 import java.util.LinkedList;
 import java.util.List;
-import java.util.Objects;
 import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executors;
@@ -105,49 +105,6 @@ class PackageUtil implements Closeable {
   }
 
 
-  /**
-   * Compute and cache the attributes of a classpath element that we will need 
to stage it.
-   *
-   * @param source the file or directory to be staged.
-   * @param stagingPath The base location for staged classpath elements.
-   * @param overridePackageName If non-null, use the given value as the 
package name
-   *instead of generating one automatically.
-   * @return a {@link PackageAttributes} that containing metadata about the 
object to be staged.
-   */
-  static PackageAttributes createPackageAttributes(File source,
-  String stagingPath, @Nullable String overridePackageName) {
-boolean directory = source.isDirectory();
-
-// Compute size and hash in one pass over file or directory.
-Hasher hasher = Hashing.md5().newHasher();
-OutputStream hashStream = Funnels.asOutputStream(hasher);
-try (CountingOutputStream countingOutputStream = new 
CountingOutputStream(hashStream)) {
-  if (!directory) {
-// Files are staged as-is.
-Files.asByteSource(source).copyTo(countingOutputStream);
-  } else {
-// Directories are recursively zipped.
-ZipFiles.zipDirectory(source, countingOutputStream);
-  }
-  countingOutputStream.flush();
-
-  long size = countingOutputStream.getCount();
-  String hash = 
Base64Variants.MODIFIED_FOR_URL.encode(hasher.hash().asBytes());
-
-  // Create the DataflowPackage with staging name and location.
-  String uniqueName = getUniqueContentName(source, hash);
-  String resourcePath = FileSystems.matchNewResource(stagingPath, true)
-  .resolve(uniqueName, StandardResolveOptions.RESOLVE_FILE).toString();
-  DataflowPackage target = new DataflowPackage();
-  target.setName(overridePackageName != null ? overridePackageName : 
uniqueName);
-  target.setLocation(resourcePath);
-
-  return new PackageAttributes(size, hash, directory, target, 
source.getPath());
-} catch (IOException e) {
-  throw new RuntimeException("Package setup failure for " + source, e);
-}
-  }
-
   /** Utility comparator used in uploading packages efficiently. */
   private static class PackageUploadOrder implements 
Comparator {
 @Override
@@ -193,7 +150,11 @@ class PackageUtil implements Closeable {
   executorService.submit(new Callable() {
 @Override
 public PackageAttributes call() throws Exception {
-  return createPackageAttributes(file, stagingPath, packageName);
+  PackageAttributes attributes = 
PackageAttributes.forFileToStage(file, stagingPath);
+  if 
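
The diff above is cut off by the mail archive. For orientation, a generic 
sketch of the @AutoValue pattern this change adopts (simplified, hypothetical 
accessors; not the actual PackageAttributes class):

import com.google.api.services.dataflow.model.DataflowPackage;
import com.google.auto.value.AutoValue;

// Illustrative only: AutoValue generates the value-class boilerplate
// (constructor, equals, hashCode, toString) from the abstract accessors.
@AutoValue
abstract class PackageAttributesSketch {
  abstract long getSize();
  abstract String getHash();
  abstract DataflowPackage getDestination();

  static PackageAttributesSketch of(long size, String hash, DataflowPackage destination) {
    // AutoValue_PackageAttributesSketch is generated at compile time.
    return new AutoValue_PackageAttributesSketch(size, hash, destination);
  }
}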

[GitHub] beam pull request #3920: [BEAM-2963] Preliminary refactors to PackageUtil

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3920


---


[jira] [Commented] (BEAM-2963) Propagate pipeline protos through Dataflow API from Java

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193993#comment-16193993
 ] 

ASF GitHub Bot commented on BEAM-2963:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3920


> Propagate pipeline protos through Dataflow API from Java
> 
>
> Key: BEAM-2963
> URL: https://issues.apache.org/jira/browse/BEAM-2963
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Java-specific blobs transmitted to Dataflow need more context, in the 
> form of portability framework protos.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[1/4] beam git commit: Refactor PackageUtil for more and simpler asynchrony

2017-10-05 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 31da49cc1 -> ed00299cc


Refactor PackageUtil for more and simpler asynchrony


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/58b6453f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/58b6453f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/58b6453f

Branch: refs/heads/master
Commit: 58b6453f3ff934e8c453ab4d17bf7fd15c7d479c
Parents: a328127
Author: Kenneth Knowles 
Authored: Thu Sep 28 20:07:16 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Oct 5 17:35:04 2017 -0700

--
 .../beam/runners/dataflow/util/PackageUtil.java | 336 +++
 1 file changed, 202 insertions(+), 134 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/58b6453f/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
index 7496d1c..449b36d 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
@@ -25,12 +25,13 @@ import com.google.api.client.util.Sleeper;
 import com.google.api.services.dataflow.model.DataflowPackage;
 import com.google.auto.value.AutoValue;
 import com.google.cloud.hadoop.util.ApiErrorExtractor;
-import com.google.common.collect.Lists;
+import com.google.common.base.Function;
 import com.google.common.hash.Funnels;
 import com.google.common.hash.Hasher;
 import com.google.common.hash.Hashing;
 import com.google.common.io.CountingOutputStream;
 import com.google.common.io.Files;
+import com.google.common.util.concurrent.AsyncFunction;
 import com.google.common.util.concurrent.Futures;
 import com.google.common.util.concurrent.ListenableFuture;
 import com.google.common.util.concurrent.ListeningExecutorService;
@@ -42,22 +43,22 @@ import java.io.IOException;
 import java.io.OutputStream;
 import java.nio.channels.Channels;
 import java.nio.channels.WritableByteChannel;
+import java.util.ArrayList;
 import java.util.Collection;
-import java.util.Collections;
 import java.util.Comparator;
-import java.util.LinkedList;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executors;
 import java.util.concurrent.atomic.AtomicInteger;
-import javax.annotation.Nullable;
 import org.apache.beam.sdk.annotations.Internal;
+import org.apache.beam.sdk.extensions.gcp.storage.GcsCreateOptions;
 import org.apache.beam.sdk.io.FileSystems;
 import org.apache.beam.sdk.io.fs.CreateOptions;
 import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
 import org.apache.beam.sdk.util.BackOffAdapter;
 import org.apache.beam.sdk.util.FluentBackoff;
+import org.apache.beam.sdk.util.MimeTypes;
 import org.apache.beam.sdk.util.ZipFiles;
 import org.joda.time.Duration;
 import org.slf4j.Logger;
@@ -76,6 +77,14 @@ class PackageUtil implements Closeable {
 
   private static final int DEFAULT_THREAD_POOL_SIZE = 32;
 
+  private static final Sleeper DEFAULT_SLEEPER = Sleeper.DEFAULT;
+
+  private static final CreateOptions DEFAULT_CREATE_OPTIONS =
+  GcsCreateOptions.builder()
+  .setGcsUploadBufferSizeBytes(1024 * 1024)
+  .setMimeType(MimeTypes.BINARY)
+  .build();
+
   private static final FluentBackoff BACKOFF_FACTORY =
   
FluentBackoff.DEFAULT.withMaxRetries(4).withInitialBackoff(Duration.standardSeconds(5));
 
@@ -121,136 +130,155 @@ class PackageUtil implements Closeable {
 }
   }
 
-  /**
-   * Utility function that computes sizes and hashes of packages so that we 
can validate whether
-   * they have already been correctly staged.
-   */
-  private List computePackageAttributes(
-  Collection classpathElements,
-  final String stagingPath) {
-
-List futures = new LinkedList<>();
-for (String classpathElement : classpathElements) {
-  @Nullable String userPackageName = null;
-  if (classpathElement.contains("=")) {
-String[] components = classpathElement.split("=", 2);
-userPackageName = components[0];
-classpathElement = components[1];
-  }
-  @Nullable final String packageName = userPackageName;
-
-  final File file = new File(classpathElement);
-  if (!file.exists()) {
-LOG.warn("Skipping non-existent classpath element {} that was 
specified.",
-
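
The rest of this diff is truncated by the archive. As a generic sketch of the 
Guava future-composition style the refactor moves toward (placeholder task 
bodies; not the actual Beam code):

import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;

class AsyncStagingSketch {
  /** Stages each file asynchronously and returns one future for the whole batch. */
  static ListenableFuture<List<String>> stageAll(List<String> files) {
    ListeningExecutorService executor =
        MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(4));
    List<ListenableFuture<String>> futures = new ArrayList<>();
    for (String file : files) {
      // Each staging task runs on the pool and yields a composable future.
      futures.add(executor.submit(() -> "staged:" + file));
    }
    executor.shutdown(); // submitted tasks still run; idle threads are then released
    // Collapse the per-file futures into a single future for the batch.
    return Futures.allAsList(futures);
  }
}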

[3/4] beam git commit: Make PackageUtil a proper class encapsulating its ExecutorService

2017-10-05 Thread kenn
Make PackageUtil a proper class encapsulating its ExecutorService


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a211bd9b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a211bd9b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a211bd9b

Branch: refs/heads/master
Commit: a211bd9bb6365f1fe76e9b16355f721fcaa80b47
Parents: 31da49c
Author: Kenneth Knowles 
Authored: Thu Sep 28 19:35:20 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Oct 5 17:35:04 2017 -0700

--
 .../beam/runners/dataflow/util/GcsStager.java   |  8 +-
 .../beam/runners/dataflow/util/PackageUtil.java | 59 
 .../runners/dataflow/util/PackageUtilTest.java  | 95 
 3 files changed, 103 insertions(+), 59 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a211bd9b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
index d18e306..929be99 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
@@ -62,9 +62,9 @@ public class GcsStager implements Stager {
 .setMimeType(MimeTypes.BINARY)
 .build();
 
-return PackageUtil.stageClasspathElements(
-options.getFilesToStage(),
-options.getStagingLocation(),
-createOptions);
+try (PackageUtil packageUtil = PackageUtil.withDefaultThreadPool()) {
+  return packageUtil.stageClasspathElements(
+  options.getFilesToStage(), options.getStagingLocation(), 
createOptions);
+}
   }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/a211bd9b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
index 931f7ea..9d1e084 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/PackageUtil.java
@@ -34,6 +34,7 @@ import com.google.common.util.concurrent.Futures;
 import com.google.common.util.concurrent.ListenableFuture;
 import com.google.common.util.concurrent.ListeningExecutorService;
 import com.google.common.util.concurrent.MoreExecutors;
+import java.io.Closeable;
 import java.io.File;
 import java.io.FileNotFoundException;
 import java.io.IOException;
@@ -51,6 +52,7 @@ import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executors;
 import java.util.concurrent.atomic.AtomicInteger;
 import javax.annotation.Nullable;
+import org.apache.beam.sdk.annotations.Internal;
 import org.apache.beam.sdk.io.FileSystems;
 import org.apache.beam.sdk.io.fs.CreateOptions;
 import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
@@ -62,13 +64,18 @@ import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 /** Helper routines for packages. */
-class PackageUtil {
+@Internal
+class PackageUtil implements Closeable {
+
   private static final Logger LOG = LoggerFactory.getLogger(PackageUtil.class);
+
   /**
* A reasonable upper bound on the number of jars required to launch a 
Dataflow job.
*/
   private static final int SANE_CLASSPATH_SIZE = 1000;
 
+  private static final int DEFAULT_THREAD_POOL_SIZE = 32;
+
   private static final FluentBackoff BACKOFF_FACTORY =
   
FluentBackoff.DEFAULT.withMaxRetries(4).withInitialBackoff(Duration.standardSeconds(5));
 
@@ -77,6 +84,27 @@ class PackageUtil {
*/
   private static final ApiErrorExtractor ERROR_EXTRACTOR = new 
ApiErrorExtractor();
 
+  private final ListeningExecutorService executorService;
+
+  private PackageUtil(ListeningExecutorService executorService) {
+this.executorService = executorService;
+  }
+
+  public static PackageUtil withDefaultThreadPool() {
+return PackageUtil.withExecutorService(
+
MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(DEFAULT_THREAD_POOL_SIZE)));
+  }
+
+  public static PackageUtil withExecutorService(ListeningExecutorService 
executorService) {
+return new PackageUtil(executorService);
+  }
+
+  

[4/4] beam git commit: This closes #3920: [BEAM-2963] Preliminary refactors to PackageUtil

2017-10-05 Thread kenn
This closes #3920: [BEAM-2963] Preliminary refactors to PackageUtil

  Refactor PackageUtil for more and simpler asynchrony
  Use AutoValue for Dataflow PackageAttributes
  Make PackageUtil a proper class encapsulating its ExecutorService


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ed00299c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ed00299c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ed00299c

Branch: refs/heads/master
Commit: ed00299ccf3a87b528338e575704f222ab469e2d
Parents: 31da49c 58b6453
Author: Kenneth Knowles 
Authored: Thu Oct 5 17:35:27 2017 -0700
Committer: Kenneth Knowles 
Committed: Thu Oct 5 17:35:27 2017 -0700

--
 .../beam/runners/dataflow/util/GcsStager.java   |   8 +-
 .../beam/runners/dataflow/util/PackageUtil.java | 519 +++
 .../runners/dataflow/util/PackageUtilTest.java  | 124 +++--
 3 files changed, 369 insertions(+), 282 deletions(-)
--




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Spark #3226

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-3024) Dataflow ValidatesRunner approach 120 minute timeout

2017-10-05 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-3024:
-

 Summary: Dataflow ValidatesRunner approach 120 minute timeout
 Key: BEAM-3024
 URL: https://issues.apache.org/jira/browse/BEAM-3024
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


It takes 1h30m to run the tests. Combined with other overheads, we will see 
more timeout flakes like this: 
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/4107/consoleFull

From that log:

{code}
2017-10-05T22:56:04.740 [INFO] Results:
2017-10-05T22:56:04.740 [INFO] 
2017-10-05T22:56:04.740 [WARNING] Tests run: 258, Failures: 0, Errors: 0, 
Skipped: 1
...
2017-10-05T22:56:14.600 [WARNING] Failed to notify spy 
hudson.maven.Maven3Builder$JenkinsEventSpy: java.lang.InterruptedException
2017-10-05T22:56:14.600 [INFO] 

2017-10-05T22:56:14.600 [INFO] Reactor Summary:
2017-10-05T22:56:14.600 [INFO] 
2017-10-05T22:56:14.600 [INFO] Apache Beam :: Parent 
.. SUCCESS [ 48.256 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Build Tools 
. SUCCESS [ 32.572 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs 
 SUCCESS [  4.731 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common 
.. SUCCESS [  3.581 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common :: Runner API 
 SUCCESS [ 49.196 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Common :: Fn API 
 SUCCESS [ 19.548 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java 
 SUCCESS [  2.505 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Core 
 SUCCESS [03:38 min]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners 
. SUCCESS [  2.401 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Core Construction Java 
... SUCCESS [ 58.770 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Core Java 
 SUCCESS [01:01 min]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Direct Java 
.. SUCCESS [14:27 min]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: IO 
.. SUCCESS [  3.558 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions 
.. SUCCESS [  3.310 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
Google Cloud Platform Core SUCCESS [ 35.687 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: Extensions :: 
Protobuf SUCCESS [ 37.732 s]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: SDKs :: Java :: IO :: Google 
Cloud Platform SUCCESS [02:34 min]
2017-10-05T22:56:14.601 [INFO] Apache Beam :: Runners :: Google Cloud Dataflow 
 SUCCESS [  01:31 h]
2017-10-05T22:56:14.601 [INFO] 

2017-10-05T22:56:14.601 [INFO] BUILD SUCCESS
2017-10-05T22:56:14.601 [INFO] 

2017-10-05T22:56:14.601 [INFO] Total time: 01:59 h
2017-10-05T22:56:14.601 [INFO] Finished at: 2017-10-05T22:56:14+00:00
Setting status of f630545926467dae7f477c575d8a8b78511aecef to FAILURE with url 
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/4107/
 and message: 'FAILURE
 '
{code}





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Flink #4004

2017-10-05 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_Python #412

2017-10-05 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3882: [BEAM-1630] Adds API for defining Splittable DoFns ...

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3882


---


[jira] [Commented] (BEAM-1630) Add Splittable DoFn to Python SDK

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193964#comment-16193964
 ] 

ASF GitHub Bot commented on BEAM-1630:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3882


> Add Splittable DoFn to Python SDK
> -
>
> Key: BEAM-1630
> URL: https://issues.apache.org/jira/browse/BEAM-1630
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> Splittable DoFn [1] is currently being implemented for Java SDK [2]. We 
> should add this to Python SDK as well.
> Following document proposes an API for this.
> https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit?usp=sharing
> [1] https://s.apache.org/splittable-do-fn
> [2] 
> https://lists.apache.org/thread.html/0ce61ac162460a149d5c93cdface37cc383f8030fe86ca09e5699b18@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4955

2017-10-05 Thread Apache Jenkins Server
See 


--
[...truncated 580.23 KB...]
2017-10-06T00:15:34.365 [INFO] 
2017-10-06T00:15:34.365 [INFO] --- maven-jar-plugin:3.0.2:test-jar 
(default-test-jar) @ beam-runners-direct-java ---
2017-10-06T00:15:34.399 [INFO] Building jar: 

[WARNING] Cannot calculate digest of mojo class, because mojo wasn't loaded 
from a jar, but from: 

2017-10-06T00:15:34.563 [INFO] 
2017-10-06T00:15:34.563 [INFO] --- maven-shade-plugin:3.0.0:shade 
(bundle-and-repackage) @ beam-runners-direct-java ---
2017-10-06T00:15:34.571 [INFO] Excluding 
org.apache.beam:beam-sdks-java-core:jar:2.2.0-SNAPSHOT from the shaded jar.
2017-10-06T00:15:34.571 [INFO] Including 
com.google.protobuf:protobuf-java:jar:3.2.0 in the shaded jar.
2017-10-06T00:15:34.571 [INFO] Excluding 
com.fasterxml.jackson.core:jackson-core:jar:2.8.9 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
com.fasterxml.jackson.core:jackson-annotations:jar:2.8.9 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
com.fasterxml.jackson.core:jackson-databind:jar:2.8.9 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding net.bytebuddy:byte-buddy:jar:1.6.8 
from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding org.apache.avro:avro:jar:1.8.2 from 
the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
org.codehaus.jackson:jackson-core-asl:jar:1.9.13 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
com.thoughtworks.paranamer:paranamer:jar:2.7 from the shaded jar.
2017-10-06T00:15:34.572 [INFO] Excluding org.tukaani:xz:jar:1.5 from the shaded 
jar.
2017-10-06T00:15:34.572 [INFO] Excluding 
org.xerial.snappy:snappy-java:jar:1.1.4 from the shaded jar.
2017-10-06T00:15:34.577 [INFO] Excluding 
org.apache.commons:commons-compress:jar:1.14 from the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding 
org.apache.commons:commons-lang3:jar:3.6 from the shaded jar.
2017-10-06T00:15:34.578 [INFO] Including 
org.apache.beam:beam-sdks-common-runner-api:jar:2.2.0-SNAPSHOT in the shaded 
jar.
2017-10-06T00:15:34.578 [INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the 
shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from 
the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding 
com.google.instrumentation:instrumentation-api:jar:0.3.0 from the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from 
the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 
from the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the 
shaded jar.
2017-10-06T00:15:34.578 [INFO] Including 
org.apache.beam:beam-runners-core-construction-java:jar:2.2.0-SNAPSHOT in the 
shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding 
com.google.protobuf:protobuf-java-util:jar:3.2.0 from the shaded jar.
2017-10-06T00:15:34.578 [INFO] Excluding com.google.code.gson:gson:jar:2.7 from 
the shaded jar.
2017-10-06T00:15:34.578 [INFO] Including 
org.apache.beam:beam-runners-core-java:jar:2.2.0-SNAPSHOT in the shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding 
org.apache.beam:beam-sdks-common-fn-api:jar:2.2.0-SNAPSHOT from the shaded jar.
2017-10-06T00:15:34.579 [INFO] Including com.google.guava:guava:jar:20.0 in the 
shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding 
com.google.errorprone:error_prone_annotations:jar:2.0.15 from the shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding joda-time:joda-time:jar:2.4 from the 
shaded jar.
2017-10-06T00:15:34.579 [INFO] Including 
com.google.code.findbugs:jsr305:jar:3.0.1 in the shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding org.slf4j:slf4j-api:jar:1.7.25 from 
the shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding 
com.google.auto.service:auto-service:jar:1.0-rc2 from the shaded jar.
2017-10-06T00:15:34.579 [INFO] Excluding com.google.auto:auto-common:jar:0.3 
from the shaded jar.
[WARNING] Cannot calculate digest of mojo class, because mojo wasn't loaded 
from a jar, but from: 

[JENKINS] Archiving disabled
2017-10-06T00:15:36.360 [INFO]  
   
2017-10-06T00:15:36.360 [INFO] 

2017-10-06T00:15:36.360 [INFO] Skipping Apache Beam :: Parent

[2/2] beam git commit: This closes #3882

2017-10-05 Thread chamikara
This closes #3882


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/31da49cc
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/31da49cc
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/31da49cc

Branch: refs/heads/master
Commit: 31da49cc15c9e863d20f3a2e1753ded9b783ff80
Parents: 2090ee3 d1a70a3
Author: Chamikara Jayalath 
Authored: Thu Oct 5 17:15:30 2017 -0700
Committer: Chamikara Jayalath 
Committed: Thu Oct 5 17:15:30 2017 -0700

--
 sdks/python/apache_beam/io/iobase.py   |  72 
 sdks/python/apache_beam/transforms/core.py | 143 +++-
 2 files changed, 212 insertions(+), 3 deletions(-)
--




Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4954

2017-10-05 Thread Apache Jenkins Server
See 




[1/2] beam git commit: Adds API for defining Splittable DoFns.

2017-10-05 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 2090ee324 -> 31da49cc1


Adds API for defining Splittable DoFns.

See https://s.apache.org/splittable-do-fn-python-sdk for the design.

This PR and the above doc were updated to reflect the following recent updates 
to Splittable DoFn:
* Support for ProcessContinuations
* Support for dynamically updating the output watermark irrespective of output 
element production.

This will be followed by a PR that adds support for reading Splittable DoFns 
using DirectRunner.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d1a70a36
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d1a70a36
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d1a70a36

Branch: refs/heads/master
Commit: d1a70a36cabd2f32ef57b99dc33877826d83cafd
Parents: 2090ee3
Author: chamik...@google.com 
Authored: Thu Sep 21 17:43:11 2017 -0700
Committer: Chamikara Jayalath 
Committed: Thu Oct 5 17:14:56 2017 -0700

--
 sdks/python/apache_beam/io/iobase.py   |  72 
 sdks/python/apache_beam/transforms/core.py | 143 +++-
 2 files changed, 212 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d1a70a36/sdks/python/apache_beam/io/iobase.py
--
diff --git a/sdks/python/apache_beam/io/iobase.py 
b/sdks/python/apache_beam/io/iobase.py
index 1f2a8bf..7cffa7f 100644
--- a/sdks/python/apache_beam/io/iobase.py
+++ b/sdks/python/apache_beam/io/iobase.py
@@ -1013,3 +1013,75 @@ class _RoundRobinKeyFn(core.DoFn):
 if self.counter >= self.count:
   self.counter -= self.count
 yield self.counter, element
+
+
+class RestrictionTracker(object):
+  """Manages concurrent access to a restriction.
+
+  Experimental; no backwards-compatibility guarantees.
+
+  Keeps track of the restrictions claimed part for a Splittable DoFn.
+
+  See following documents for more details.
+  * https://s.apache.org/splittable-do-fn
+  * https://s.apache.org/splittable-do-fn-python-sdk
+  """
+
+  def current_restriction(self):
+"""Returns the current restriction.
+
+Returns a restriction accurately describing the full range of work the
+current ``DoFn.process()`` call will do, including already completed work.
+
+The current restriction returned by this method may be updated dynamically
+due to concurrent invocation of other methods of the
+``RestrictionTracker``, for example ``checkpoint()``.
+
+** Thread safety **
+
+Methods of the class ``RestrictionTracker`` including this method may get
+invoked by different threads, hence must be made thread-safe, e.g. by using
+a single lock object.
+"""
+raise NotImplementedError
+
+  def checkpoint(self):
+"""Performs a checkpoint of the current restriction.
+
+Signals that the current ``DoFn.process()`` call should terminate as soon 
as
+possible. After this method returns, the tracker MUST refuse all future
+claim calls, and ``RestrictionTracker.check_done()`` MUST succeed.
+
+This invocation modifies the value returned by ``current_restriction()``
+invocation and returns a restriction representing the rest of the work. The
+old value of ``current_restriction()`` is equivalent to the new value of
+``current_restriction()`` and the return value of this method invocation
+combined.
+
+** Thread safety **
+
+Methods of the class ``RestrictionTracker`` including this method may get
+invoked by different threads, hence must be made thread-safe, e.g. by using
+a single lock object.
+"""
+
+raise NotImplementedError
+
+  def check_done(self):
+"""Checks whether the restriction has been fully processed.
+
+Called by the runner after iterator returned by ``DoFn.process()`` has been
+fully read.
+
+Returns: ``True`` if current restriction has been fully processed.
+Raises ValueError: if there is still any unclaimed work remaining in the
+  restriction invoking this method. Exception raised must have an
+  informative error message.
+
+** Thread safety **
+
+Methods of the class ``RestrictionTracker`` including this method may get
+invoked by different threads, hence must be made thread-safe, e.g. by using
+a single lock object.
+"""
+raise NotImplementedError

http://git-wip-us.apache.org/repos/asf/beam/blob/d1a70a36/sdks/python/apache_beam/transforms/core.py
--
diff --git a/sdks/python/apache_beam/transforms/core.py 
b/sdks/python/apache_beam/transforms/core.py
index 153dc32..41e20ba 100644
--- a/sdks/python/apache_beam/transforms/core.py
+++ 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Flink #4003

2017-10-05 Thread Apache Jenkins Server
See 


--
[...truncated 217.39 KB...]
2017-10-06T00:03:04.491 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/io/takari/tycho/tycho-support/1.1.0/tycho-support-1.1.0.pom
2017-10-06T00:03:04.521 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/io/takari/tycho/tycho-support/1.1.0/tycho-support-1.1.0.pom
 (10 KB at 314.0 KB/sec)
2017-10-06T00:03:04.524 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/io/takari/takari/14/takari-14.pom
2017-10-06T00:03:04.553 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/io/takari/takari/14/takari-14.pom (14 KB 
at 449.0 KB/sec)
2017-10-06T00:03:04.558 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/net/revelc/code/formatter/formatter-maven-plugin/2.0.1/formatter-maven-plugin-2.0.1.jar
2017-10-06T00:03:04.596 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/net/revelc/code/formatter/formatter-maven-plugin/2.0.1/formatter-maven-plugin-2.0.1.jar
 (32 KB at 833.6 KB/sec)
2017-10-06T00:03:04.601 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-checkstyle-plugin/2.17/maven-checkstyle-plugin-2.17.pom
2017-10-06T00:03:04.630 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-checkstyle-plugin/2.17/maven-checkstyle-plugin-2.17.pom
 (14 KB at 449.5 KB/sec)
2017-10-06T00:03:04.639 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-checkstyle-plugin/2.17/maven-checkstyle-plugin-2.17.jar
2017-10-06T00:03:04.704 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-checkstyle-plugin/2.17/maven-checkstyle-plugin-2.17.jar
 (107 KB at 1631.2 KB/sec)
2017-10-06T00:03:04.710 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-surefire-plugin/2.20/maven-surefire-plugin-2.20.pom
2017-10-06T00:03:04.749 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-surefire-plugin/2.20/maven-surefire-plugin-2.20.pom
 (7 KB at 162.2 KB/sec)
2017-10-06T00:03:04.751 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire/2.20/surefire-2.20.pom
2017-10-06T00:03:04.783 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire/2.20/surefire-2.20.pom
 (21 KB at 634.8 KB/sec)
2017-10-06T00:03:04.788 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-surefire-plugin/2.20/maven-surefire-plugin-2.20.jar
2017-10-06T00:03:04.828 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-surefire-plugin/2.20/maven-surefire-plugin-2.20.jar
 (52 KB at 1299.4 KB/sec)
2017-10-06T00:03:04.836 [INFO] 
2017-10-06T00:03:04.839 [INFO] --- maven-clean-plugin:3.0.0:clean 
(default-clean) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:04.842 [INFO] Deleting 

 (includes = [**/*.pyc, **/*.egg-info/, **/sdks/python/LICENSE, 
**/sdks/python/NOTICE, **/sdks/python/README.md], excludes = [])
2017-10-06T00:03:04.957 [INFO] 
2017-10-06T00:03:04.958 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:05.096 [INFO] 
2017-10-06T00:03:05.096 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-banned-dependencies) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:05.213 [INFO] 
2017-10-06T00:03:05.213 [INFO] --- maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:05.359 [INFO] 
2017-10-06T00:03:05.359 [INFO] --- maven-resources-plugin:3.0.2:resources 
(default-resources) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:05.375 [INFO] Using 'UTF-8' encoding to copy filtered 
resources.
2017-10-06T00:03:05.380 [INFO] Copying 11 resources
2017-10-06T00:03:05.513 [INFO] Copying 3 resources
2017-10-06T00:03:05.624 [INFO] 
2017-10-06T00:03:05.624 [INFO] --- maven-compiler-plugin:3.6.2:compile 
(default-compile) @ beam-sdks-java-build-tools ---
2017-10-06T00:03:05.630 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-incremental/1.1/maven-shared-incremental-1.1.pom
2017-10-06T00:03:05.667 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-incremental/1.1/maven-shared-incremental-1.1.pom
 (5 KB at 128.6 KB/sec)
2017-10-06T00:03:05.672 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-utils/0.1/maven-shared-utils-0.1.pom
2017-10-06T00:03:05.699 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-utils/0.1/maven-shared-utils-0.1.pom
 (4 KB at 146.4 KB/sec)
2017-10-06T00:03:05.703 [INFO] Downloading: 

[jira] [Commented] (BEAM-2926) Java SDK support for portable side input

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193941#comment-16193941
 ] 

ASF GitHub Bot commented on BEAM-2926:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3952

[BEAM-2926] Add initial support to be able to read singleton and iterable 
side inputs to Java SDK harness.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam fn_api2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3952


commit c49c564daa2a32d7930b819175e77d414e617419
Author: Luke Cwik 
Date:   2017-10-05T23:54:19Z

[BEAM-2926] Add initial support to be able to read singleton and iterable 
side inputs to Java SDK harness.




> Java SDK support for portable side input 
> -
>
> Key: BEAM-2926
> URL: https://issues.apache.org/jira/browse/BEAM-2926
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3952: [BEAM-2926] Add initial support to be able to read ...

2017-10-05 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3952

[BEAM-2926] Add initial support to be able to read singleton and iterable 
side inputs to Java SDK harness.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam fn_api2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3952.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3952


commit c49c564daa2a32d7930b819175e77d414e617419
Author: Luke Cwik 
Date:   2017-10-05T23:54:19Z

[BEAM-2926] Add initial support to be able to read singleton and iterable 
side inputs to Java SDK harness.




---


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2539

2017-10-05 Thread Apache Jenkins Server
See 




[1/2] beam git commit: This closes #3908

2017-10-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 0dd4a1cfa -> 2090ee324


This closes #3908


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2090ee32
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2090ee32
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2090ee32

Branch: refs/heads/master
Commit: 2090ee324753604b4adaaaefa835ee09ee362510
Parents: 0dd4a1c dbf1dc0
Author: Thomas Groh 
Authored: Thu Oct 5 15:47:51 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 15:47:51 2017 -0700

--
 runners/core-construction-java/pom.xml  |  10 +
 .../construction/ArtifactServiceStager.java | 244 +++
 .../construction/ArtifactServiceStagerTest.java | 138 +++
 .../InMemoryArtifactStagerService.java  | 152 
 .../src/main/proto/beam_artifact_api.proto  |   4 +-
 5 files changed, 546 insertions(+), 2 deletions(-)
--




[GitHub] beam pull request #3908: Add an ArtifactServiceStager

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3908


---


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark #3225

2017-10-05 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Add an ArtifactServiceStager

--
[...truncated 258.19 KB...]
2017-10-05T22:51:28.510 [INFO] 
2017-10-05T22:51:28.510 [INFO] --- maven-site-plugin:3.5.1:attach-descriptor 
(attach-descriptor) @ beam-sdks-parent ---
2017-10-05T22:51:28.725 [INFO] 
2017-10-05T22:51:28.725 [INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ 
beam-sdks-parent ---
2017-10-05T22:51:28.726 [INFO] Skipping packaging of the jar
2017-10-05T22:51:28.944 [INFO] 
2017-10-05T22:51:28.944 [INFO] --- maven-jar-plugin:3.0.2:test-jar 
(default-test-jar) @ beam-sdks-parent ---
2017-10-05T22:51:28.945 [INFO] Skipping packaging of the test-jar
2017-10-05T22:51:29.054 [INFO] 
2017-10-05T22:51:29.054 [INFO] --- maven-shade-plugin:3.0.0:shade 
(bundle-and-repackage) @ beam-sdks-parent ---
2017-10-05T22:51:29.061 [INFO] Replacing original artifact with shaded artifact.
2017-10-05T22:51:29.168 [INFO] 
2017-10-05T22:51:29.168 [INFO] --- maven-dependency-plugin:3.0.1:analyze-only 
(default) @ beam-sdks-parent ---
2017-10-05T22:51:29.170 [INFO] Skipping pom project
[JENKINS] Archiving disabled
2017-10-05T22:51:30.270 [INFO]  
   
2017-10-05T22:51:30.270 [INFO] 

2017-10-05T22:51:30.270 [INFO] Building Apache Beam :: SDKs :: Common 
2.2.0-SNAPSHOT
2017-10-05T22:51:30.270 [INFO] 

2017-10-05T22:51:30.274 [INFO] 
2017-10-05T22:51:30.275 [INFO] --- maven-clean-plugin:3.0.0:clean 
(default-clean) @ beam-sdks-common-parent ---
2017-10-05T22:51:30.302 [INFO] Deleting 

 (includes = [**/*.pyc, **/*.egg-info/, **/sdks/python/LICENSE, 
**/sdks/python/NOTICE, **/sdks/python/README.md], excludes = [])
2017-10-05T22:51:30.417 [INFO] 
2017-10-05T22:51:30.417 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce) @ beam-sdks-common-parent ---
2017-10-05T22:51:30.526 [INFO] 
2017-10-05T22:51:30.527 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-banned-dependencies) @ beam-sdks-common-parent ---
2017-10-05T22:51:30.640 [INFO] 
2017-10-05T22:51:30.640 [INFO] --- maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) @ beam-sdks-common-parent ---
2017-10-05T22:51:30.763 [INFO] 
2017-10-05T22:51:30.763 [INFO] --- maven-checkstyle-plugin:2.17:check (default) 
@ beam-sdks-common-parent ---
2017-10-05T22:51:31.396 [INFO] Starting audit...
Audit done.
2017-10-05T22:51:31.505 [INFO] 
2017-10-05T22:51:31.505 [INFO] --- 
build-helper-maven-plugin:3.0.0:regex-properties (render-artifact-id) @ 
beam-sdks-common-parent ---
2017-10-05T22:51:31.613 [INFO] 
2017-10-05T22:51:31.614 [INFO] --- maven-site-plugin:3.5.1:attach-descriptor 
(attach-descriptor) @ beam-sdks-common-parent ---
2017-10-05T22:51:31.829 [INFO] 
2017-10-05T22:51:31.829 [INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ 
beam-sdks-common-parent ---
2017-10-05T22:51:31.831 [INFO] Skipping packaging of the jar
2017-10-05T22:51:32.301 [INFO] 
2017-10-05T22:51:32.301 [INFO] --- maven-jar-plugin:3.0.2:test-jar 
(default-test-jar) @ beam-sdks-common-parent ---
2017-10-05T22:51:32.303 [INFO] Skipping packaging of the test-jar
2017-10-05T22:51:32.410 [INFO] 
2017-10-05T22:51:32.410 [INFO] --- maven-shade-plugin:3.0.0:shade 
(bundle-and-repackage) @ beam-sdks-common-parent ---
2017-10-05T22:51:32.415 [INFO] Replacing original artifact with shaded artifact.
2017-10-05T22:51:32.522 [INFO] 
2017-10-05T22:51:32.523 [INFO] --- maven-dependency-plugin:3.0.1:analyze-only 
(default) @ beam-sdks-common-parent ---
2017-10-05T22:51:32.524 [INFO] Skipping pom project
[JENKINS] Archiving disabled
2017-10-05T22:51:33.728 [INFO]  
   
2017-10-05T22:51:33.728 [INFO] 

2017-10-05T22:51:33.728 [INFO] Building Apache Beam :: SDKs :: Common :: Runner 
API 2.2.0-SNAPSHOT
2017-10-05T22:51:33.728 [INFO] 

2017-10-05T22:51:33.730 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/xolstice/maven/plugins/protobuf-maven-plugin/0.5.0/protobuf-maven-plugin-0.5.0.pom
2017-10-05T22:51:33.766 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/xolstice/maven/plugins/protobuf-maven-plugin/0.5.0/protobuf-maven-plugin-0.5.0.pom
 (30 KB at 813.1 KB/sec)
2017-10-05T22:51:33.770 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/xolstice/maven/plugins/protobuf-maven-plugin/0.5.0/protobuf-maven-plugin-0.5.0.jar
2017-10-05T22:51:33.822 [INFO] Downloaded: 

[2/2] beam git commit: Add an ArtifactServiceStager

2017-10-05 Thread tgroh
Add an ArtifactServiceStager

This stages artifacts over a GRPC channel.
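For readers following the thread, here is a rough idea of what staging a single
artifact over a gRPC channel looks like from the client side. This is a minimal
sketch, not the committed ArtifactServiceStager: it assumes a client-streaming
PutArtifact RPC and the builder/field names shown below, and the localhost:8098
endpoint and file path are placeholders. Only the import locations are taken from
the diff further down.

    import com.google.protobuf.ByteString;
    import io.grpc.ManagedChannelBuilder;
    import io.grpc.stub.StreamObserver;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import org.apache.beam.sdk.common.runner.v1.ArtifactApi.ArtifactChunk;
    import org.apache.beam.sdk.common.runner.v1.ArtifactApi.ArtifactMetadata;
    import org.apache.beam.sdk.common.runner.v1.ArtifactApi.PutArtifactRequest;
    import org.apache.beam.sdk.common.runner.v1.ArtifactApi.PutArtifactResponse;
    import org.apache.beam.sdk.common.runner.v1.ArtifactStagingServiceGrpc;

    public class StageOneArtifact {
      public static void main(String[] args) throws Exception {
        // Connect to a (hypothetical) artifact staging service on localhost:8098.
        ArtifactStagingServiceGrpc.ArtifactStagingServiceStub stub =
            ArtifactStagingServiceGrpc.newStub(
                ManagedChannelBuilder.forAddress("localhost", 8098).usePlaintext(true).build());

        // PutArtifact is assumed to be a client-streaming RPC: metadata first, then byte chunks.
        StreamObserver<PutArtifactRequest> upload = stub.putArtifact(
            new StreamObserver<PutArtifactResponse>() {
              @Override public void onNext(PutArtifactResponse response) {}
              @Override public void onError(Throwable t) { t.printStackTrace(); }
              @Override public void onCompleted() { System.out.println("artifact staged"); }
            });

        upload.onNext(PutArtifactRequest.newBuilder()
            .setMetadata(ArtifactMetadata.newBuilder().setName("my-pipeline.jar"))
            .build());
        byte[] contents = Files.readAllBytes(Paths.get("/tmp/my-pipeline.jar"));
        upload.onNext(PutArtifactRequest.newBuilder()
            .setData(ArtifactChunk.newBuilder().setData(ByteString.copyFrom(contents)))
            .build());
        upload.onCompleted();
      }
    }

After all artifacts are uploaded, the stager would commit a Manifest (via a
CommitManifestRequest on the blocking stub); that step is omitted from the sketch.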


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/dbf1dc0a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/dbf1dc0a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/dbf1dc0a

Branch: refs/heads/master
Commit: dbf1dc0a29e7c82cd13f2c1e4abe20dc0e7ea87e
Parents: 0dd4a1c
Author: Thomas Groh 
Authored: Fri Aug 18 15:06:06 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 15:47:51 2017 -0700

--
 runners/core-construction-java/pom.xml  |  10 +
 .../construction/ArtifactServiceStager.java | 244 +++
 .../construction/ArtifactServiceStagerTest.java | 138 +++
 .../InMemoryArtifactStagerService.java  | 152 
 .../src/main/proto/beam_artifact_api.proto  |   4 +-
 5 files changed, 546 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/dbf1dc0a/runners/core-construction-java/pom.xml
--
diff --git a/runners/core-construction-java/pom.xml 
b/runners/core-construction-java/pom.xml
index 1a52914..ac712b0 100644
--- a/runners/core-construction-java/pom.xml
+++ b/runners/core-construction-java/pom.xml
@@ -121,6 +121,16 @@
   provided
 
 
+
+  io.grpc
+  grpc-core
+
+
+
+  io.grpc
+  grpc-stub
+
+
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/dbf1dc0a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
new file mode 100644
index 000..c37f289
--- /dev/null
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java
@@ -0,0 +1,244 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.core.construction;
+
+import com.google.auto.value.AutoValue;
+import com.google.common.io.BaseEncoding;
+import com.google.common.util.concurrent.Futures;
+import com.google.common.util.concurrent.ListenableFuture;
+import com.google.common.util.concurrent.ListeningExecutorService;
+import com.google.common.util.concurrent.MoreExecutors;
+import com.google.protobuf.ByteString;
+import io.grpc.Channel;
+import io.grpc.stub.StreamObserver;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.FileChannel;
+import java.security.MessageDigest;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Map.Entry;
+import java.util.Set;
+import java.util.concurrent.Callable;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.ExecutionException;
+import java.util.concurrent.Executors;
+import java.util.concurrent.atomic.AtomicReference;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.ArtifactChunk;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.ArtifactMetadata;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.CommitManifestRequest;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.Manifest;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.PutArtifactRequest;
+import org.apache.beam.sdk.common.runner.v1.ArtifactApi.PutArtifactResponse;
+import org.apache.beam.sdk.common.runner.v1.ArtifactStagingServiceGrpc;
+import 
org.apache.beam.sdk.common.runner.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceBlockingStub;
+import 
org.apache.beam.sdk.common.runner.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceStub;
+
+/** A client to stage 

[jira] [Commented] (BEAM-3023) Python Post commits failing with StateSamplerTest and SlowCoders tests

2017-10-05 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193767#comment-16193767
 ] 

Pablo Estrada commented on BEAM-3023:
-

That run was for a PR that has not been merged. (I had closed it because I 
thought it would not go through - it has since been reopened). Closing.

> Python Post commits failing with StateSamplerTest and SlowCoders tests
> --
>
> Key: BEAM-3023
> URL: https://issues.apache.org/jira/browse/BEAM-3023
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Pablo Estrada
>Priority: Critical
>
> Started after https://github.com/apache/beam/pull/3936
> Errors from the log:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/3285/console
> ==
> FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/coders/slow_coders_test.py",
>  line 41, in test_using_slow_impl
> import apache_beam.coders.stream
> AssertionError: ImportError not raised
> ==
> FAIL: test_sampler_lull_reporting 
> (apache_beam.runners.worker.statesampler_test.StateSamplerTest)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/runners/worker/statesampler_test.py",
>  line 90, in test_sampler_lull_reporting
> mock_logging.warn.call_args_list[0][0])
> AssertionError: 'stateA' not found in ('LULL: Spent over %.2f ms in state 
> %s', 10092, 'basic-stateA-msecs')



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-3023) Python Post commits failing with StateSamplerTest and SlowCoders tests

2017-10-05 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-3023.
---
   Resolution: Not A Problem
Fix Version/s: Not applicable

> Python Post commits failing with StateSamplerTest and SlowCoders tests
> --
>
> Key: BEAM-3023
> URL: https://issues.apache.org/jira/browse/BEAM-3023
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Pablo Estrada
> Fix For: Not applicable
>
>
> Started after https://github.com/apache/beam/pull/3936
> Errors from the log:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/3285/console
> ==
> FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/coders/slow_coders_test.py",
>  line 41, in test_using_slow_impl
> import apache_beam.coders.stream
> AssertionError: ImportError not raised
> ==
> FAIL: test_sampler_lull_reporting 
> (apache_beam.runners.worker.statesampler_test.StateSamplerTest)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/runners/worker/statesampler_test.py",
>  line 90, in test_sampler_lull_reporting
> mock_logging.warn.call_args_list[0][0])
> AssertionError: 'stateA' not found in ('LULL: Spent over %.2f ms in state 
> %s', 10092, 'basic-stateA-msecs')



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-3023) Python Post commits failing with StateSamplerTest and SlowCoders tests

2017-10-05 Thread Pablo Estrada (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada updated BEAM-3023:

Priority: Major  (was: Critical)

> Python Post commits failing with StateSamplerTest and SlowCoders tests
> --
>
> Key: BEAM-3023
> URL: https://issues.apache.org/jira/browse/BEAM-3023
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Pablo Estrada
>
> Started after https://github.com/apache/beam/pull/3936
> Errors from the log:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/3285/console
> ==
> FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/coders/slow_coders_test.py",
>  line 41, in test_using_slow_impl
> import apache_beam.coders.stream
> AssertionError: ImportError not raised
> ==
> FAIL: test_sampler_lull_reporting 
> (apache_beam.runners.worker.statesampler_test.StateSamplerTest)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/runners/worker/statesampler_test.py",
>  line 90, in test_sampler_lull_reporting
> mock_logging.warn.call_args_list[0][0])
> AssertionError: 'stateA' not found in ('LULL: Spent over %.2f ms in state 
> %s', 10092, 'basic-stateA-msecs')



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #3286

2017-10-05 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex #2538

2017-10-05 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Fix ReferenceRunnerJobServer checkstyle

--
[...truncated 204.87 KB...]
2017-10-05T21:27:30.750 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm/5.0.2/asm-5.0.2.pom
2017-10-05T21:27:30.776 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm/5.0.2/asm-5.0.2.pom (2 KB 
at 72.7 KB/sec)
2017-10-05T21:27:30.778 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm-parent/5.0.2/asm-parent-5.0.2.pom
2017-10-05T21:27:30.805 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/ow2/asm/asm-parent/5.0.2/asm-parent-5.0.2.pom
 (6 KB at 198.7 KB/sec)
2017-10-05T21:27:30.807 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pom
2017-10-05T21:27:30.833 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/1.5.1/plexus-utils-1.5.1.pom
 (3 KB at 86.2 KB/sec)
2017-10-05T21:27:30.836 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.5/maven-model-2.0.5.pom
2017-10-05T21:27:30.864 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-model/2.0.5/maven-model-2.0.5.pom
 (3 KB at 94.9 KB/sec)
2017-10-05T21:27:30.866 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.5/maven-2.0.5.pom
2017-10-05T21:27:30.894 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven/2.0.5/maven-2.0.5.pom
 (6 KB at 198.9 KB/sec)
2017-10-05T21:27:30.904 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.5/maven-artifact-2.0.5.pom
2017-10-05T21:27:30.931 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact/2.0.5/maven-artifact-2.0.5.pom
 (727 B at 26.3 KB/sec)
2017-10-05T21:27:30.934 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-dependency-tree/3.0.1/maven-dependency-tree-3.0.1.pom
2017-10-05T21:27:30.961 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-dependency-tree/3.0.1/maven-dependency-tree-3.0.1.pom
 (8 KB at 271.2 KB/sec)
2017-10-05T21:27:30.966 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-artifact-transfer/0.9.1/maven-artifact-transfer-0.9.1.pom
2017-10-05T21:27:30.994 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-artifact-transfer/0.9.1/maven-artifact-transfer-0.9.1.pom
 (8 KB at 265.4 KB/sec)
2017-10-05T21:27:30.998 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6.pom
2017-10-05T21:27:31.027 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6.pom
 (18 KB at 589.1 KB/sec)
2017-10-05T21:27:31.029 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/commons/commons-parent/17/commons-parent-17.pom
2017-10-05T21:27:31.059 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/commons/commons-parent/17/commons-parent-17.pom
 (31 KB at 1015.1 KB/sec)
2017-10-05T21:27:31.063 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.pom
2017-10-05T21:27:31.091 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-collections/commons-collections/3.2.2/commons-collections-3.2.2.pom
 (13 KB at 432.6 KB/sec)
2017-10-05T21:27:31.114 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-impl/2.3/maven-reporting-impl-2.3.jar
2017-10-05T21:27:31.116 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/xerces/xercesImpl/2.9.1/xercesImpl-2.9.1.jar
2017-10-05T21:27:31.114 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/shared/maven-shared-utils/0.6/maven-shared-utils-0.6.jar
2017-10-05T21:27:31.116 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-core/1.2/doxia-core-1.2.jar
2017-10-05T21:27:31.117 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.3.04/xml-apis-1.3.04.jar
2017-10-05T21:27:31.148 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/reporting/maven-reporting-impl/2.3/maven-reporting-impl-2.3.jar
 (18 KB at 516.4 KB/sec)
2017-10-05T21:27:31.148 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-digester/commons-digester/1.6/commons-digester-1.6.jar
2017-10-05T21:27:31.189 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/doxia/doxia-core/1.2/doxia-core-1.2.jar
 (151 KB at 2006.8 KB/sec)
2017-10-05T21:27:31.189 [INFO] Downloading: 

[jira] [Created] (BEAM-3023) Python Post commits failing with StateSamplerTest and SlowCoders tests

2017-10-05 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-3023:
-

 Summary: Python Post commits failing with StateSamplerTest and 
SlowCoders tests
 Key: BEAM-3023
 URL: https://issues.apache.org/jira/browse/BEAM-3023
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Ahmet Altay
Assignee: Pablo Estrada
Priority: Critical


Started after https://github.com/apache/beam/pull/3936

Errors from the log:
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/3285/console

==
FAIL: test_using_slow_impl (apache_beam.coders.slow_coders_test.SlowCoders)
--
Traceback (most recent call last):
  File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/coders/slow_coders_test.py",
 line 41, in test_using_slow_impl
import apache_beam.coders.stream
AssertionError: ImportError not raised

==
FAIL: test_sampler_lull_reporting 
(apache_beam.runners.worker.statesampler_test.StateSamplerTest)
--
Traceback (most recent call last):
  File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/sdks/python/apache_beam/runners/worker/statesampler_test.py",
 line 90, in test_sampler_lull_reporting
mock_logging.warn.call_args_list[0][0])
AssertionError: 'stateA' not found in ('LULL: Spent over %.2f ms in state %s', 
10092, 'basic-stateA-msecs')



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3951: Fix ReferenceRunnerJobServer checkstyle

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3951


---


[2/2] beam git commit: This closes #3951

2017-10-05 Thread tgroh
This closes #3951


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0dd4a1cf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0dd4a1cf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0dd4a1cf

Branch: refs/heads/master
Commit: 0dd4a1cfaeb78103269020a6f968d34cd0cfd328
Parents: 5c21ba7 9d46353
Author: Thomas Groh 
Authored: Thu Oct 5 13:59:28 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 13:59:28 2017 -0700

--
 .../beam/runners/reference/job/ReferenceRunnerJobServer.java  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[1/2] beam git commit: Fix ReferenceRunnerJobServer checkstyle

2017-10-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 5c21ba7b4 -> 0dd4a1cfa


Fix ReferenceRunnerJobServer checkstyle


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9d46353a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9d46353a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9d46353a

Branch: refs/heads/master
Commit: 9d46353a89972d9f167701e440bbe9d25e5a365c
Parents: 5c21ba7
Author: Thomas Groh 
Authored: Thu Oct 5 13:57:42 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 13:57:42 2017 -0700

--
 .../beam/runners/reference/job/ReferenceRunnerJobServer.java  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9d46353a/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
--
diff --git 
a/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
 
b/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
index 1dcc2b3..3262036 100644
--- 
a/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
+++ 
b/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
@@ -48,7 +48,8 @@ public class ReferenceRunnerJobServer {
 System.err.println();
   }
 
-  private static void runServer(ServerConfiguration configuration) throws 
IOException, InterruptedException {
+  private static void runServer(ServerConfiguration configuration)
+  throws IOException, InterruptedException {
 ReferenceRunnerJobService service = ReferenceRunnerJobService.create();
 Server server = 
ServerBuilder.forPort(configuration.port).addService(service).build();
 server.start();



[GitHub] beam pull request #3951: Fix ReferenceRunnerJobServer checkstyle

2017-10-05 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3951

Fix ReferenceRunnerJobServer checkstyle

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam reference_runner_checkstyle

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3951.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3951


commit 9d46353a89972d9f167701e440bbe9d25e5a365c
Author: Thomas Groh 
Date:   2017-10-05T20:57:42Z

Fix ReferenceRunnerJobServer checkstyle




---


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow #4106

2017-10-05 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3285

2017-10-05 Thread Apache Jenkins Server
See 


--
[...truncated 855.80 KB...]
test_default_typed_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_default_untyped_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_deferred_side_input_iterable 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_deferred_side_inputs 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_keyword_side_input_hint 
(apache_beam.typehints.typed_pipeline_test.SideInputTest) ... ok
test_any_compatibility 
(apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... 
ok
test_type_check 
(apache_beam.typehints.typehints_test.AnyTypeConstraintTestCase) ... ok
test_composite_takes_and_returns_hints 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_enable_and_disable_type_checking_returns 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_enable_and_disable_type_checking_takes 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_simple_takes_and_returns_hints 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_valid_mix_pos_and_keyword_with_both_orders 
(apache_beam.typehints.typehints_test.CombinedReturnsAndTakesTestCase) ... ok
test_getcallargs_forhints 
(apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_hint_helper (apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_positional_arg_hints 
(apache_beam.typehints.typehints_test.DecoratorHelpers) ... ok
test_compatibility (apache_beam.typehints.typehints_test.DictHintTestCase) ... 
ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... 
:496:
 DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
  e.exception.message)
ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_key_type_must_be_valid_composite_param 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_match_type_variables 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_invalid_key_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_invalid_value_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_valid_composite_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_check_valid_simple_type 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_type_checks_not_dict 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_value_type_must_be_valid_composite_param 
(apache_beam.typehints.typehints_test.DictHintTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.GeneratorHintTestCase) 
... ok
test_generator_argument_hint_invalid_yield_type 
(apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_generator_return_hint_invalid_yield_type 
(apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.GeneratorHintTestCase) ... ok
test_compatibility (apache_beam.typehints.typehints_test.IterableHintTestCase) 
... ok
test_getitem_invalid_composite_type_param 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_repr (apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_tuple_compatibility 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_must_be_iterable 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_invalid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_composite_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_type_check_violation_valid_simple_type 
(apache_beam.typehints.typehints_test.IterableHintTestCase) ... ok
test_enforce_kv_type_constraint 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_be_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_param_must_have_length_2 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_getitem_proxy_to_tuple 
(apache_beam.typehints.typehints_test.KVHintTestCase) ... ok
test_enforce_list_type_constraint_invalid_composite_type 
(apache_beam.typehints.typehints_test.ListHintTestCase) ... ok

[jira] [Closed] (BEAM-2869) Add dates to releases on downloads page

2017-10-05 Thread Melissa Pashniak (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melissa Pashniak closed BEAM-2869.
--

> Add dates to releases on downloads page
> ---
>
> Key: BEAM-2869
> URL: https://issues.apache.org/jira/browse/BEAM-2869
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>Priority: Minor
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-2869) Add dates to releases on downloads page

2017-10-05 Thread Melissa Pashniak (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melissa Pashniak resolved BEAM-2869.

Resolution: Fixed

> Add dates to releases on downloads page
> ---
>
> Key: BEAM-2869
> URL: https://issues.apache.org/jira/browse/BEAM-2869
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>Priority: Minor
> Fix For: Not applicable
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2039) number the chapters in page 'Beam Programming Guide'

2017-10-05 Thread Melissa Pashniak (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193528#comment-16193528
 ] 

Melissa Pashniak commented on BEAM-2039:


This PR is merged now.


> number the chapters in page 'Beam Programming Guide'
> 
>
> Key: BEAM-2039
> URL: https://issues.apache.org/jira/browse/BEAM-2039
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Xu Mingmin
>Assignee: Melissa Pashniak
>Priority: Minor
>
> More and more content is being added to the 'Beam Programming 
> Guide' page (https://beam.apache.org/documentation/programming-guide/). It is 
> not easy to read as it stands; a well-numbered table of contents would help.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PerformanceTests_Python #411

2017-10-05 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Enable discovery of log_handler tests.

[tgroh] Add a Bare-bones ReferenceRunner Job Service

[tgroh] Remove any_param field from FunctionSpec

[tgroh] Add an Endpoints Proto file

--
[...truncated 1.17 MB...]
include/grpc/census.h:417:1: warning: function declaration isn't a prototype 
[-Wstrict-prototypes]
 CENSUSAPI void census_trace_scan_end();
 ^
In file included from src/core/ext/census/grpc_filter.c:26:0:
include/grpc/support/alloc.h:61:1: warning: function declaration isn't a 
prototype [-Wstrict-prototypes]
 GPRAPI gpr_allocation_functions gpr_get_allocation_functions();
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DOPENSSL_NO_ASM=1 -D_WIN32_WINNT=1536 
-DGPR_BACKWARDS_COMPATIBILITY_MODE=1 -DHAVE_CONFIG_H=1 
-DPyMODINIT_FUNC=__attribute__((visibility ("default"))) void 
-Isrc/python/grpcio -Iinclude -I. -Ithird_party/boringssl/include 
-Ithird_party/zlib -Ithird_party/cares -Ithird_party/cares/cares 
-Ithird_party/cares/config_linux -I/usr/include/python2.7 -c 
src/core/ext/census/grpc_plugin.c -o 
python_build/temp.linux-x86_64-2.7/src/core/ext/census/grpc_plugin.o -std=c++11 
-std=gnu99 -fvisibility=hidden -fno-wrapv -fno-exceptions -pthread
cc1: warning: command line option '-std=c++11' is valid for C++/ObjC++ but 
not for C [enabled by default]
In file included from src/core/ext/census/grpc_plugin.c:24:0:
include/grpc/census.h:417:1: warning: function declaration isn't a prototype 
[-Wstrict-prototypes]
 CENSUSAPI void census_trace_scan_end();
 ^
In file included from ./src/core/lib/channel/channel_stack.h:39:0,
 from ./src/core/ext/census/grpc_filter.h:22,
 from src/core/ext/census/grpc_plugin.c:26:
include/grpc/support/log.h:71:1: warning: function declaration isn't a 
prototype [-Wstrict-prototypes]
 GPRAPI void gpr_log_verbosity_init();
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DOPENSSL_NO_ASM=1 -D_WIN32_WINNT=1536 
-DGPR_BACKWARDS_COMPATIBILITY_MODE=1 -DHAVE_CONFIG_H=1 
-DPyMODINIT_FUNC=__attribute__((visibility ("default"))) void 
-Isrc/python/grpcio -Iinclude -I. -Ithird_party/boringssl/include 
-Ithird_party/zlib -Ithird_party/cares -Ithird_party/cares/cares 
-Ithird_party/cares/config_linux -I/usr/include/python2.7 -c 
src/core/ext/census/initialize.c -o 
python_build/temp.linux-x86_64-2.7/src/core/ext/census/initialize.o -std=c++11 
-std=gnu99 -fvisibility=hidden -fno-wrapv -fno-exceptions -pthread
cc1: warning: command line option '-std=c++11' is valid for C++/ObjC++ but 
not for C [enabled by default]
In file included from src/core/ext/census/initialize.c:19:0:
include/grpc/census.h:417:1: warning: function declaration isn't a prototype 
[-Wstrict-prototypes]
 CENSUSAPI void census_trace_scan_end();
 ^
In file included from src/core/ext/census/initialize.c:20:0:
./src/core/ext/census/base_resources.h:22:1: warning: function declaration 
isn't a prototype [-Wstrict-prototypes]
 void define_base_resources();
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DOPENSSL_NO_ASM=1 -D_WIN32_WINNT=1536 
-DGPR_BACKWARDS_COMPATIBILITY_MODE=1 -DHAVE_CONFIG_H=1 
-DPyMODINIT_FUNC=__attribute__((visibility ("default"))) void 
-Isrc/python/grpcio -Iinclude -I. -Ithird_party/boringssl/include 
-Ithird_party/zlib -Ithird_party/cares -Ithird_party/cares/cares 
-Ithird_party/cares/config_linux -I/usr/include/python2.7 -c 
src/core/ext/census/intrusive_hash_map.c -o 
python_build/temp.linux-x86_64-2.7/src/core/ext/census/intrusive_hash_map.o 
-std=c++11 -std=gnu99 -fvisibility=hidden -fno-wrapv -fno-exceptions -pthread
cc1: warning: command line option '-std=c++11' is valid for C++/ObjC++ but 
not for C [enabled by default]
In file included from 
./src/core/ext/census/intrusive_hash_map_internal.h:22:0,
 from ./src/core/ext/census/intrusive_hash_map.h:22,
 from src/core/ext/census/intrusive_hash_map.c:19:
include/grpc/support/alloc.h:61:1: warning: function declaration isn't a 
prototype [-Wstrict-prototypes]
 GPRAPI gpr_allocation_functions gpr_get_allocation_functions();
 ^
In file included from 
./src/core/ext/census/intrusive_hash_map_internal.h:23:0,
 from ./src/core/ext/census/intrusive_hash_map.h:22,
 from src/core/ext/census/intrusive_hash_map.c:19:
include/grpc/support/log.h:71:1: warning: function declaration isn't a 
prototype [-Wstrict-prototypes]
 GPRAPI void gpr_log_verbosity_init();
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall 

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Apex #2536

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3022) Enable the ability to grow partition count in the underlying Spark RDD

2017-10-05 Thread Tim Robertson (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Robertson updated BEAM-3022:

Description: 
When using a {{HadoopInputFormatIO}} the number of splits seems to be 
controlled by the underlying {{InputFormat}} which in turn determines the 
number of partitions and therefore parallelisation when running on Spark.  It 
is possible to {{Reshuffle}} the data to compensate for data skew, but it 
_appears_ there is no way to grow the number of partitions.  The 
{{GroupCombineFunctions.reshuffle}} seems to be the only place calling the 
Spark {{repartition}} and it uses the number of partitions from the original 
RDD.

Scenarios that would benefit from this:
# Increasing parallelisation for computationally heavy stages
# ETLs where the input partitions are dictated by the source while you wish to 
optimise the partitions for fast loading to the target sink
# Zip files (my case) where they are read in a single-threaded manner with a 
custom HadoopInputFormat and therefore get a single task for all stages

(It would be nice if a user could supply a partitioner too, to help dictate 
data locality)

  was:
When using a {{HadoopInputFormatIO}} the number of splits seems to be 
controlled by the underlying {{InputFormat}} which in turn determines the 
number of partitions and therefore parallelisation when running on Spark.  It 
is possible to {{Reshuffle}} the data to compensate for data skew, but it 
_appears_ there is no way to grow the number of partitions.  The 
{{GroupCombineFunctions.reshuffle}} seems to be the only place calling the 
Spark {{repartition}} and it uses the number of partitions from the original 
RDD.

Scenarios that would benefit from this:
# Increasing parallelisation for computationally heavy stages
# ETLs where the input partitions are dictated by the source while you wish to 
optimise the partitions for fast loading to the target sink
# Zip files (my case) where they are read in a single-threaded manner with a 
custom HadoopInputFormat and therefore get a single executor

(It would be nice if a user could supply a partitioner too, to help dictate 
data locality)


> Enable the ability to grow partition count in the underlying Spark RDD
> --
>
> Key: BEAM-3022
> URL: https://issues.apache.org/jira/browse/BEAM-3022
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Tim Robertson
>Assignee: Amit Sela
>
> When using a {{HadoopInputFormatIO}} the number of splits seems to be 
> controlled by the underlying {{InputFormat}} which in turn determines the 
> number of partitions and therefore parallelisation when running on Spark.  It 
> is possible to {{Reshuffle}} the data to compensate for data skew, but it 
> _appears_ there is no way to grow the number of partitions.  The 
> {{GroupCombineFunctions.reshuffle}} seems to be the only place calling the 
> Spark {{repartition}} and it uses the number of partitions from the original 
> RDD.
> Scenarios that would benefit from this:
> # Increasing parallelisation for computationally heavy stages
> # ETLs where the input partitions are dictated by the source while you wish 
> to optimise the partitions for fast loading to the target sink
> # Zip files (my case) where they are read in a single-threaded manner with a 
> custom HadoopInputFormat and therefore get a single task for all stages
> (It would be nice if a user could supply a partitioner too, to help dictate 
> data locality)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3022) Enable the ability to grow partition count in the underlying Spark RDD

2017-10-05 Thread Tim Robertson (JIRA)
Tim Robertson created BEAM-3022:
---

 Summary: Enable the ability to grow partition count in the 
underlying Spark RDD
 Key: BEAM-3022
 URL: https://issues.apache.org/jira/browse/BEAM-3022
 Project: Beam
  Issue Type: Improvement
  Components: runner-spark
Reporter: Tim Robertson
Assignee: Amit Sela


When using a {{HadoopInputFormatIO}} the number of splits seems to be 
controlled by the underlying {{InputFormat}} which in turn determines the 
number of partitions and therefore parallelisation when running on Spark.  It 
is possible to {{Reshuffle}} the data to compensate for data skew, but it 
_appears_ there is no way to grow the number of partitions.  The 
{{GroupCombineFunctions.reshuffle}} seems to be the only place calling the 
Spark {{repartition}} and it uses the number of partitions from the original 
RDD.

Scenarios that would benefit from this:
# Increasing parallelisation for computationally heavy stages
# ETLs where the input partitions are dictated by the source while you wish to 
optimise the partitions for fast loading to the target sink
# Zip files (my case) where they are read in a single-threaded manner with a 
custom HadoopInputFormat and therefore get a single executor

(It would be nice if a user could supply a partitioner too, to help dictate 
data locality)
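
To make the report concrete, here is a sketch of the shape of such a pipeline in
the Java SDK, assuming the 2.2.0-era HadoopInputFormatIO and Reshuffle APIs named
below; the configuration values and input path are placeholders, and the
single-split zip InputFormat from the report is stood in for by TextInputFormat so
the sketch stays self-contained. The point is that Reshuffle redistributes
elements but keeps the source RDD's partition count, so a read that produces one
split stays at parallelism 1 for all downstream stages.

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.Reshuffle;
    import org.apache.beam.sdk.values.KV;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class SinglePartitionRead {
      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        // The reporter's case is a custom zip InputFormat that yields a single split;
        // TextInputFormat is used here only to keep the sketch compilable.
        Configuration conf = new Configuration();
        conf.setClass("mapreduce.job.inputformat.class", TextInputFormat.class,
            org.apache.hadoop.mapreduce.InputFormat.class);
        conf.setClass("key.class", LongWritable.class, Object.class);
        conf.setClass("value.class", Text.class, Object.class);
        conf.set("mapreduce.input.fileinputformat.inputdir", "/tmp/input");

        PCollection<KV<LongWritable, Text>> records =
            p.apply(HadoopInputFormatIO.<LongWritable, Text>read().withConfiguration(conf));

        // Reshuffle spreads elements across the existing partitions, but the Spark
        // translation (GroupCombineFunctions.reshuffle) repartitions to the original
        // partition count, so a one-split read still runs downstream work at parallelism 1.
        records.apply(Reshuffle.<LongWritable, Text>of());

        p.run();
      }
    }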



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3917: Add an Endpoints Proto file

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3917


---


[1/2] beam git commit: Add an Endpoints Proto file

2017-10-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 34eb9b0f3 -> 5c21ba7b4


Add an Endpoints Proto file

This contains the ApiServiceDescriptor proto, which is used to
specify an endpoint to communicate with.
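
A minimal sketch of constructing the descriptor from Java, assuming the generated
class ends up reachable as org.apache.beam.portability.v1.Endpoints.ApiServiceDescriptor
(the outer class name is a guess; only the proto package and the id/url fields are
visible in the diff below, and the endpoint value is a placeholder).

    import org.apache.beam.portability.v1.Endpoints.ApiServiceDescriptor;

    public class DescribeEndpoint {
      public static void main(String[] args) {
        // id: a pipeline-level unique reference; url: where the client should connect.
        ApiServiceDescriptor dataPlane = ApiServiceDescriptor.newBuilder()
            .setId("data-plane")
            .setUrl("localhost:8099")
            .build();
        System.out.println(dataPlane);
      }
    }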


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9bbed6d4
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9bbed6d4
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9bbed6d4

Branch: refs/heads/master
Commit: 9bbed6d441351a91720d17f1dfc4f236a96afdc5
Parents: 34eb9b0
Author: Thomas Groh 
Authored: Thu Sep 28 11:30:38 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 11:30:28 2017 -0700

--
 .../fn-api/src/main/proto/beam_fn_api.proto | 27 ++--
 .../runner-api/src/main/proto/endpoints.proto   | 46 
 .../beam/fn/harness/BeamFnDataReadRunner.java   |  3 +-
 .../beam/fn/harness/BeamFnDataWriteRunner.java  |  3 +-
 .../org/apache/beam/fn/harness/FnHarness.java   | 19 
 .../harness/channel/ManagedChannelFactory.java  |  2 +-
 .../fn/harness/control/BeamFnControlClient.java |  5 ++-
 .../harness/control/ProcessBundleHandler.java   |  2 +-
 .../beam/fn/harness/data/BeamFnDataClient.java  |  5 ++-
 .../fn/harness/data/BeamFnDataGrpcClient.java   | 15 ---
 .../harness/data/BeamFnDataGrpcMultiplexer.java | 22 +-
 .../fn/harness/logging/BeamFnLoggingClient.java |  7 +--
 .../state/BeamFnStateGrpcClientCache.java   |  6 +--
 .../fn/harness/BeamFnDataReadRunnerTest.java|  3 +-
 .../fn/harness/BeamFnDataWriteRunnerTest.java   |  3 +-
 .../apache/beam/fn/harness/FnHarnessTest.java   |  7 ++-
 .../channel/ManagedChannelFactoryTest.java  |  2 +-
 .../control/BeamFnControlClientTest.java|  7 +--
 .../control/ProcessBundleHandlerTest.java   |  2 +-
 .../harness/data/BeamFnDataGrpcClientTest.java  | 19 
 .../data/BeamFnDataGrpcMultiplexerTest.java |  5 ++-
 .../logging/BeamFnLoggingClientTest.java|  7 +--
 .../state/BeamFnStateGrpcClientCacheTest.java   |  4 +-
 .../portability/universal_local_runner.py   |  4 +-
 .../runners/worker/log_handler_test.py  |  3 +-
 .../runners/worker/sdk_worker_main.py   |  6 +--
 26 files changed, 137 insertions(+), 97 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9bbed6d4/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
--
diff --git a/sdks/common/fn-api/src/main/proto/beam_fn_api.proto 
b/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
index 9d4c5f6..5a01077 100644
--- a/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
+++ b/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
@@ -38,6 +38,7 @@ option java_package = "org.apache.beam.fn.v1";
 option java_outer_classname = "BeamFnApi";
 
 import "beam_runner_api.proto";
+import "endpoints.proto";
 import "google/protobuf/timestamp.proto";
 
 /*
@@ -73,7 +74,7 @@ message Target {
 message RemoteGrpcPort {
   // (Required) An API descriptor which describes where to
   // connect to including any authentication that is required.
-  ApiServiceDescriptor api_service_descriptor = 1;
+  org.apache.beam.portability.v1.ApiServiceDescriptor api_service_descriptor = 
1;
 }
 
 /*
@@ -174,7 +175,7 @@ message ProcessBundleDescriptor {
   // A descriptor describing the end point to use for State API
   // calls. Required if the Runner intends to send remote references over the
   // data plane or if any of the transforms rely on user state or side inputs.
-  ApiServiceDescriptor state_api_service_descriptor = 7;
+  org.apache.beam.portability.v1.ApiServiceDescriptor 
state_api_service_descriptor = 7;
 }
 
 // A request to process a given bundle.
@@ -706,28 +707,6 @@ service BeamFnLogging {
 /*
  * Environment types
  */
-message ApiServiceDescriptor {
-  // (Required) A pipeline level unique id which can be used as a reference to
-  // refer to this.
-  string id = 1;
-
-  // (Required) The URL to connect to.
-  string url = 2;
-
-  // (Optional) The method for authentication. If unspecified, access to the
-  // url is already being performed in a trusted context (e.g. localhost,
-  // private network).
-  oneof authentication {
-OAuth2ClientCredentialsGrant oauth2_client_credentials_grant = 3;
-  }
-}
-
-message OAuth2ClientCredentialsGrant {
-  // (Required) The URL to submit a "client_credentials" grant type request for
-  // an OAuth access token which will be used as a bearer token for requests.
-  string url = 1;
-}
-
 // A Docker container configuration for launching the SDK harness to execute
 // user specified functions.
 message DockerContainer {

http://git-wip-us.apache.org/repos/asf/beam/blob/9bbed6d4/sdks/common/runner-api/src/main/proto/endpoints.proto

[2/2] beam git commit: This closes #3917

2017-10-05 Thread tgroh
This closes #3917


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5c21ba7b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5c21ba7b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5c21ba7b

Branch: refs/heads/master
Commit: 5c21ba7b4be1537ee1ffed4efa90c6947f55ee1e
Parents: 34eb9b0 9bbed6d
Author: Thomas Groh 
Authored: Thu Oct 5 11:30:29 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 11:30:29 2017 -0700

--
 .../fn-api/src/main/proto/beam_fn_api.proto | 27 ++--
 .../runner-api/src/main/proto/endpoints.proto   | 46 
 .../beam/fn/harness/BeamFnDataReadRunner.java   |  3 +-
 .../beam/fn/harness/BeamFnDataWriteRunner.java  |  3 +-
 .../org/apache/beam/fn/harness/FnHarness.java   | 19 
 .../harness/channel/ManagedChannelFactory.java  |  2 +-
 .../fn/harness/control/BeamFnControlClient.java |  5 ++-
 .../harness/control/ProcessBundleHandler.java   |  2 +-
 .../beam/fn/harness/data/BeamFnDataClient.java  |  5 ++-
 .../fn/harness/data/BeamFnDataGrpcClient.java   | 15 ---
 .../harness/data/BeamFnDataGrpcMultiplexer.java | 22 +-
 .../fn/harness/logging/BeamFnLoggingClient.java |  7 +--
 .../state/BeamFnStateGrpcClientCache.java   |  6 +--
 .../fn/harness/BeamFnDataReadRunnerTest.java|  3 +-
 .../fn/harness/BeamFnDataWriteRunnerTest.java   |  3 +-
 .../apache/beam/fn/harness/FnHarnessTest.java   |  7 ++-
 .../channel/ManagedChannelFactoryTest.java  |  2 +-
 .../control/BeamFnControlClientTest.java|  7 +--
 .../control/ProcessBundleHandlerTest.java   |  2 +-
 .../harness/data/BeamFnDataGrpcClientTest.java  | 19 
 .../data/BeamFnDataGrpcMultiplexerTest.java |  5 ++-
 .../logging/BeamFnLoggingClientTest.java|  7 +--
 .../state/BeamFnStateGrpcClientCacheTest.java   |  4 +-
 .../portability/universal_local_runner.py   |  4 +-
 .../runners/worker/log_handler_test.py  |  3 +-
 .../runners/worker/sdk_worker_main.py   |  6 +--
 26 files changed, 137 insertions(+), 97 deletions(-)
--




[jira] [Assigned] (BEAM-2926) Java SDK support for portable side input

2017-10-05 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-2926:
---

Assignee: Luke Cwik

> Java SDK support for portable side input 
> -
>
> Key: BEAM-2926
> URL: https://issues.apache.org/jira/browse/BEAM-2926
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4950

2017-10-05 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Add a Bare-bones ReferenceRunner Job Service

--
[...truncated 3.29 MB...]
2017-10-05T17:36:38.825 [INFO] Installing 

 to 

2017-10-05T17:36:38.828 [INFO] Installing 

 to 

2017-10-05T17:36:38.830 [INFO] Installing 

 to 

[JENKINS] Archiving disabled
2017-10-05T17:36:40.679 [INFO]  
   
2017-10-05T17:36:40.679 [INFO] 

2017-10-05T17:36:40.679 [INFO] Building Apache Beam :: Runners :: Reference 
2.2.0-SNAPSHOT
2017-10-05T17:36:40.679 [INFO] 

2017-10-05T17:36:40.684 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/args4j/args4j/2.33/args4j-2.33.pom
2017-10-05T17:36:40.800 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/args4j/args4j/2.33/args4j-2.33.pom (2 KB 
at 10.9 KB/sec)
2017-10-05T17:36:40.802 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/args4j/args4j-site/2.33/args4j-site-2.33.pom
2017-10-05T17:36:40.831 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/args4j/args4j-site/2.33/args4j-site-2.33.pom
 (5 KB at 149.5 KB/sec)
2017-10-05T17:36:40.833 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/kohsuke/pom/14/pom-14.pom
2017-10-05T17:36:40.862 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/kohsuke/pom/14/pom-14.pom (6 KB at 
184.5 KB/sec)
2017-10-05T17:36:40.866 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/args4j/args4j/2.33/args4j-2.33.jar
2017-10-05T17:36:40.940 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/args4j/args4j/2.33/args4j-2.33.jar (152 KB 
at 2078.6 KB/sec)
2017-10-05T17:36:40.941 [INFO] 
2017-10-05T17:36:40.941 [INFO] --- maven-clean-plugin:3.0.0:clean 
(default-clean) @ beam-runners-reference ---
2017-10-05T17:36:40.943 [INFO] Deleting 

 (includes = [**/*.pyc, **/*.egg-info/, **/sdks/python/LICENSE, 
**/sdks/python/NOTICE, **/sdks/python/README.md], excludes = [])
2017-10-05T17:36:41.055 [INFO] 
2017-10-05T17:36:41.056 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce) @ beam-runners-reference ---
2017-10-05T17:36:41.420 [INFO] 
2017-10-05T17:36:41.433 [INFO] --- maven-enforcer-plugin:3.0.0-M1:enforce 
(enforce-banned-dependencies) @ beam-runners-reference ---
2017-10-05T17:36:41.556 [INFO] 
2017-10-05T17:36:41.556 [INFO] --- maven-remote-resources-plugin:1.5:process 
(process-resource-bundles) @ beam-runners-reference ---
2017-10-05T17:36:41.894 [INFO] 
2017-10-05T17:36:41.894 [INFO] --- maven-resources-plugin:3.0.2:resources 
(default-resources) @ beam-runners-reference ---
2017-10-05T17:36:41.895 [INFO] Using 'UTF-8' encoding to copy filtered 
resources.
2017-10-05T17:36:41.895 [INFO] skip non existing resourceDirectory 

2017-10-05T17:36:41.934 [INFO] Copying 3 resources
2017-10-05T17:36:42.043 [INFO] 
2017-10-05T17:36:42.043 [INFO] --- maven-compiler-plugin:3.6.2:compile 
(default-compile) @ beam-runners-reference ---
2017-10-05T17:36:42.049 [INFO] Changes detected - recompiling the module!
2017-10-05T17:36:42.050 [INFO] Compiling 3 source files to 

2017-10-05T17:36:42.160 [WARNING] bootstrap class path not set in conjunction 
with -source 1.7
2017-10-05T17:36:42.281 [INFO] 
2017-10-05T17:36:42.281 [INFO] --- maven-resources-plugin:3.0.2:testResources 
(default-testResources) @ beam-runners-reference ---

[jira] [Created] (BEAM-3021) PubSubIO: make Write.withWriteFn and Read.withCoderAndParseFn public

2017-10-05 Thread Mikko Sivulainen (JIRA)
Mikko Sivulainen created BEAM-3021:
--

 Summary: PubSubIO: make Write.withWriteFn and 
Read.withCoderAndParseFn public
 Key: BEAM-3021
 URL: https://issues.apache.org/jira/browse/BEAM-3021
 Project: Beam
  Issue Type: Wish
  Components: sdk-java-gcp
Reporter: Mikko Sivulainen
Assignee: Chamikara Jayalath



PubsubIO has Write.withWriteFn and Read.withCoderAndParseFn functions that are 
used internally to read/write messages in Protobuf and Avro formats. These 
functions should also be exposed to end users to simplify the handling of custom 
formats.
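
To illustrate the request, a sketch of what a user-facing variant could look like.
PubsubMessage, SimpleFunction and SerializableCoder are existing SDK types; MyRecord
and the readCustom entry point are hypothetical stand-ins for whatever public surface
withCoderAndParseFn would eventually get.

    import java.io.Serializable;
    import org.apache.beam.sdk.coders.SerializableCoder;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
    import org.apache.beam.sdk.transforms.SimpleFunction;

    // Hypothetical record type decoded from a custom wire format.
    class MyRecord implements Serializable {
      final byte[] payload;
      MyRecord(byte[] payload) { this.payload = payload; }
      static MyRecord fromBytes(byte[] bytes) { return new MyRecord(bytes); }
    }

    // The parse function a user would supply to Read.withCoderAndParseFn if it were public.
    class ParseMyFormat extends SimpleFunction<PubsubMessage, MyRecord> {
      @Override
      public MyRecord apply(PubsubMessage message) {
        return MyRecord.fromBytes(message.getPayload());
      }
    }

    // Hypothetical usage (readCustom does not exist today; it stands in for the
    // public entry point this issue asks for):
    //   p.apply(PubsubIO.readCustom(SerializableCoder.of(MyRecord.class), new ParseMyFormat())
    //       .fromSubscription("projects/my-project/subscriptions/my-sub"));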



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-2887) Python SDK support for portable pipelines

2017-10-05 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-2887:
---

Assignee: Ahmet Altay

> Python SDK support for portable pipelines
> -
>
> Key: BEAM-2887
> URL: https://issues.apache.org/jira/browse/BEAM-2887
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Ahmet Altay
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3936: [BEAM-3013] Prototyping lull-tracking for Python

2017-10-05 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/3936

[BEAM-3013] Prototyping lull-tracking for Python

r: @charlesccychen 
cc: @bjchambers 

This change allows the state sampler in Python to report when too much time 
has been spent in the same state (default: 5 minutes). Once that threshold is 
exceeded, the worker reports it.

This change requires the sampling thread (normally a pure-C thread) to 
acquire the GIL in order to access the logging module. To avoid deadlocks, I 
laid out a convention for acquiring locks in a fixed order.
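
To make the lull check concrete, here is a language-agnostic illustration written
in Java. The actual change lives in the Python/Cython state sampler, so the class
and method names below are invented; only the 5-minute default and the log message
format come from this thread.

    import java.util.concurrent.TimeUnit;
    import java.util.logging.Logger;

    class LullCheck {
      private static final Logger LOG = Logger.getLogger("statesampler");
      private static final long LULL_NANOS = TimeUnit.MINUTES.toNanos(5); // default threshold

      private String currentStateName;
      private long stateEnteredNanos;

      // Called from the sampling thread on every sample tick.
      void onSample(long nowNanos) {
        long elapsed = nowNanos - stateEnteredNanos;
        if (elapsed > LULL_NANOS) {
          // In the Python worker this is the point where the sampling thread must take
          // the GIL (respecting a fixed lock order) before it may touch the logging module.
          LOG.warning(String.format("LULL: Spent over %.2f ms in state %s",
              elapsed / 1e6, currentStateName));
        }
      }

      // Called whenever execution transitions into a new state.
      void onStateTransition(String newStateName, long nowNanos) {
        currentStateName = newStateName;
        stateEnteredNanos = nowNanos;
      }
    }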

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam lull-tracking-py

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3936.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3936


commit d41d02e17763c5c7aab6c53d2d9cbf1ba8e6a83a
Author: Pablo 
Date:   2017-10-03T20:49:43Z

Prototyping lull-tracking for Python

commit ee53714c2d108c57ae6fadd92110db81a47acf8b
Author: Pablo 
Date:   2017-10-03T22:17:42Z

Fixing lint issues




---


[jira] [Commented] (BEAM-3013) The Python worker should report lulls

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193246#comment-16193246
 ] 

ASF GitHub Bot commented on BEAM-3013:
--

GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/3936

[BEAM-3013] Prototyping lull-tracking for Python

r: @charlesccychen 
cc: @bjchambers 

This change allows the state sampler in Python to report when too much time 
has been spent in the same state (default: 5 minutes). Once that threshold is 
exceeded, the worker reports it.

This change requires the sampling thread (normally a pure-C thread) to 
acquire the GIL in order to access the logging module. To avoid deadlocks, I 
laid out a convention for acquiring locks in a fixed order.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam lull-tracking-py

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3936.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3936


commit d41d02e17763c5c7aab6c53d2d9cbf1ba8e6a83a
Author: Pablo 
Date:   2017-10-03T20:49:43Z

Prototyping lull-tracking for Python

commit ee53714c2d108c57ae6fadd92110db81a47acf8b
Author: Pablo 
Date:   2017-10-03T22:17:42Z

Fixing lint issues




> The Python worker should report lulls
> -
>
> Key: BEAM-3013
> URL: https://issues.apache.org/jira/browse/BEAM-3013
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>
> Whenever too much time has been spent on the same state (e.g. > 5 minutes), 
> the worker should report it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Apex #2535

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2999) Split validatesrunner tests from Python postcommit

2017-10-05 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-2999.

   Resolution: Done
Fix Version/s: Not applicable

> Split validatesrunner tests from Python postcommit
> --
>
> Key: BEAM-2999
> URL: https://issues.apache.org/jira/browse/BEAM-2999
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> The only Python Postcommit Jenkins build includes too many tests, which pushes 
> the build (and test) time over 1 hour. It has also become hard to find errors 
> in the long console logs when the build fails.
> We can separate the validatesrunner tests, which currently take ~20 minutes, 
> out of the Postcommit build into a separate Jenkins branch. This will shorten 
> the total build time of the Postcommit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-2916) Python SDK support for portable user state

2017-10-05 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-2916:
---

Assignee: (was: Ahmet Altay)

> Python SDK support for portable user state
> --
>
> Key: BEAM-2916
> URL: https://issues.apache.org/jira/browse/BEAM-2916
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Priority: Minor
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-3020) Model MultimapUserState

2017-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-3020:
---

 Summary: Model MultimapUserState
 Key: BEAM-3020
 URL: https://issues.apache.org/jira/browse/BEAM-3020
 Project: Beam
  Issue Type: Sub-task
  Components: beam-model
Reporter: Luke Cwik


This is to support the development of map-like states, such as SetState, within 
individual SDK languages.
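
For illustration, a minimal sketch of how an SDK might layer a set-like state on 
top of a multimap-style primitive. The MultimapState interface below is 
hypothetical, not an existing Beam API; it only assumes put/keys/remove 
operations:

{code:java}
// Hypothetical sketch: a SetState-like view over a multimap-style state
// primitive, storing each element as a key with an unused value.
interface MultimapState<K, V> {
  void put(K key, V value);
  Iterable<K> keys();
  void remove(K key);
}

final class SetStateView<T> {
  private final MultimapState<T, Void> backing;

  SetStateView(MultimapState<T, Void> backing) {
    this.backing = backing;
  }

  void add(T element) {
    backing.put(element, null); // membership only; the value is ignored
  }

  boolean contains(T element) {
    for (T key : backing.keys()) {
      if (key.equals(element)) {
        return true;
      }
    }
    return false;
  }

  void remove(T element) {
    backing.remove(element);
  }
}
{code}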



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2999) Split validatesrunner tests from Python postcommit

2017-10-05 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-2999:
---
Issue Type: Task  (was: Bug)

> Split validatesrunner tests from Python postcommit
> --
>
> Key: BEAM-2999
> URL: https://issues.apache.org/jira/browse/BEAM-2999
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> The only Python Postcommit Jenkins build includes too many tests, which makes 
> the build (and test) time over 1 hour. It has also become hard to find errors 
> in the long console logs when a build fails.
> We can move the validatesrunner tests, which currently take ~20 minutes, out 
> of the Postcommit build into a separate Jenkins branch. This will shorten the 
> total build time of the Postcommit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (BEAM-2999) Split validatesrunner tests from Python postcommit

2017-10-05 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-2999.
--

> Split validatesrunner tests from Python postcommit
> --
>
> Key: BEAM-2999
> URL: https://issues.apache.org/jira/browse/BEAM-2999
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
> Fix For: Not applicable
>
>
> The only Python Postcommit Jenkins build includes too many tests, which makes 
> the build (and test) time over 1 hour. It has also become hard to find errors 
> in the long console logs when a build fails.
> We can move the validatesrunner tests, which currently take ~20 minutes, out 
> of the Postcommit build into a separate Jenkins branch. This will shorten the 
> total build time of the Postcommit.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-2915) Java SDK support for portable user state

2017-10-05 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193187#comment-16193187
 ] 

Luke Cwik edited comment on BEAM-2915 at 10/5/17 4:56 PM:
--

Support was added within the following PRs to support BagState, ValueState and 
CombiningState within the Java SDK harness:
* https://github.com/apache/beam/pull/3795
* https://github.com/apache/beam/pull/3788
* https://github.com/apache/beam/pull/3783
* https://github.com/apache/beam/pull/3760
* https://github.com/apache/beam/pull/3736
* https://github.com/apache/beam/pull/3724
* https://github.com/apache/beam/pull/3723
* https://github.com/apache/beam/pull/3638

The remaining features are:
* Add support for closing the BeamFnStateGrpcClientCache which should cancel 
outgoing calls.
* Handle outputting state which is > 2GiB by using a chunking output stream
* Move to an asynchronous persist model so that we can flush state since we 
currently cache all changes and only emit at the end of the process bundle call.
* Add map and set support

These items will improve performance:
* Support block level caching and prefetch
* Support caching of state beyond the lifetime of a single bundle
* Support readLater by issuing a prefetch
* Optimize squashing accumulators in CombiningState


was (Author: lcwik):
Support was added within the following PRs to support BagState, ValueState and 
CombiningState within the Java SDK harness:
* https://github.com/apache/beam/pull/3795
* https://github.com/apache/beam/pull/3788
* https://github.com/apache/beam/pull/3783
* https://github.com/apache/beam/pull/3760
* https://github.com/apache/beam/pull/3736
* https://github.com/apache/beam/pull/3724
* https://github.com/apache/beam/pull/3723
* https://github.com/apache/beam/pull/3638

The remaining features are:
* Add support for closing the BeamFnStateGrpcClientCache which should cancel 
outgoing calls.
* Handle outputting state which is > 2GiB by using a chunking output stream
* Move to an asynchronous persist model so that we can flush state since we 
currently cache all changes and only emit at the end of the process bundle call.
* Add map and set support

These items will improve performance:
* Support block level caching and prefetch
* Support readLater by issuing a prefetch
* Optimize squashing accumulators in CombiningState

> Java SDK support for portable user state
> 
>
> Key: BEAM-2915
> URL: https://issues.apache.org/jira/browse/BEAM-2915
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Priority: Minor
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-2915) Java SDK support for portable user state

2017-10-05 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193187#comment-16193187
 ] 

Luke Cwik edited comment on BEAM-2915 at 10/5/17 4:55 PM:
--

Support was added within the following PRs to support BagState, ValueState and 
CombiningState within the Java SDK harness:
* https://github.com/apache/beam/pull/3795
* https://github.com/apache/beam/pull/3788
* https://github.com/apache/beam/pull/3783
* https://github.com/apache/beam/pull/3760
* https://github.com/apache/beam/pull/3736
* https://github.com/apache/beam/pull/3724
* https://github.com/apache/beam/pull/3723
* https://github.com/apache/beam/pull/3638

The remaining features are:
* Add support for closing the BeamFnStateGrpcClientCache which should cancel 
outgoing calls.
* Handle outputting state which is > 2GiB by using a chunking output stream
* Move to an asynchronous persist model so that we can flush state since we 
currently cache all changes and only emit at the end of the process bundle call.
* Add map and set support

These items will improve performance:
* Support block level caching and prefetch
* Support readLater by issuing a prefetch
* Optimize squashing accumulators in CombiningState


was (Author: lcwik):
Support was added within the following PRs to support BagState, ValueState and 
CombiningState:
* https://github.com/apache/beam/pull/3795
* https://github.com/apache/beam/pull/3788
* https://github.com/apache/beam/pull/3783
* https://github.com/apache/beam/pull/3760
* https://github.com/apache/beam/pull/3736
* https://github.com/apache/beam/pull/3724
* https://github.com/apache/beam/pull/3723
* https://github.com/apache/beam/pull/3638

The remaining features are:
* Add support for closing the BeamFnStateGrpcClientCache which should cancel 
outgoing calls.
* Handle outputting state which is > 2GiB by using a chunking output stream
* Move to an asynchronous persist model so that we can flush state since we 
currently cache all changes and only emit at the end of the process bundle call.
* Add map and set support

These items will improve performance:
* Support block level caching and prefetch
* Support readLater by issuing a prefetch
* Optimize squashing accumulators in CombiningState

> Java SDK support for portable user state
> 
>
> Key: BEAM-2915
> URL: https://issues.apache.org/jira/browse/BEAM-2915
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Priority: Minor
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2915) Java SDK support for portable user state

2017-10-05 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193187#comment-16193187
 ] 

Luke Cwik commented on BEAM-2915:
-

Support was added within the following PRs to support BagState, ValueState and 
CombiningState:
* https://github.com/apache/beam/pull/3795
* https://github.com/apache/beam/pull/3788
* https://github.com/apache/beam/pull/3783
* https://github.com/apache/beam/pull/3760
* https://github.com/apache/beam/pull/3736
* https://github.com/apache/beam/pull/3724
* https://github.com/apache/beam/pull/3723
* https://github.com/apache/beam/pull/3638

The remaining features are:
* Add support for closing the BeamFnStateGrpcClientCache which should cancel 
outgoing calls.
* Handle outputting state which is > 2GiB by using a chunking output stream
* Move to an asynchronous persist model so that we can flush state since we 
currently cache all changes and only emit at the end of the process bundle call.
* Add map and set support

These items will improve performance:
* Support block level caching and prefetch
* Support readLater by issuing a prefetch
* Optimize squashing accumulators in CombiningState
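
For illustration, a minimal sketch of the user-facing state API that this work 
exposes through the SDK harness, using the existing Beam Java annotations (the 
DoFn below is illustrative, not taken from the PRs):

{code:java}
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.state.BagState;
import org.apache.beam.sdk.state.StateSpec;
import org.apache.beam.sdk.state.StateSpecs;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Buffers values per key in a BagState and re-emits the whole buffer on each
// element; stateful DoFns require a keyed (KV) input.
class BufferPerKeyFn extends DoFn<KV<String, String>, String> {

  @StateId("buffer")
  private final StateSpec<BagState<String>> bufferSpec =
      StateSpecs.bag(StringUtf8Coder.of());

  @ProcessElement
  public void process(
      ProcessContext context, @StateId("buffer") BagState<String> buffer) {
    buffer.add(context.element().getValue());
    // readLater() is the prefetch hint referenced in the performance items above.
    for (String value : buffer.readLater().read()) {
      context.output(value);
    }
  }
}
{code}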

> Java SDK support for portable user state
> 
>
> Key: BEAM-2915
> URL: https://issues.apache.org/jira/browse/BEAM-2915
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Priority: Minor
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3869: Remove any_param field from FunctionSpec

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3869


---


[2/2] beam git commit: Remove any_param field from FunctionSpec

2017-10-05 Thread tgroh
Remove any_param field from FunctionSpec


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/060bda23
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/060bda23
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/060bda23

Branch: refs/heads/master
Commit: 060bda23d1e5cd5146190aa34f2e212404cb6667
Parents: 294518e
Author: Thomas Groh 
Authored: Tue Sep 19 16:39:44 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 09:45:44 2017 -0700

--
 .../WindowingStrategyTranslation.java   |  7 --
 .../src/main/proto/beam_runner_api.proto|  3 ---
 sdks/python/apache_beam/coders/coders.py|  1 -
 .../runners/portability/fn_api_runner.py| 26 
 sdks/python/apache_beam/transforms/core.py  |  4 ---
 .../python/apache_beam/transforms/ptransform.py |  1 -
 sdks/python/apache_beam/utils/urns.py   |  1 -
 7 files changed, 43 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/060bda23/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/WindowingStrategyTranslation.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/WindowingStrategyTranslation.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/WindowingStrategyTranslation.java
index 1b4786c..be8601c 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/WindowingStrategyTranslation.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/WindowingStrategyTranslation.java
@@ -17,9 +17,7 @@
  */
 package org.apache.beam.runners.core.construction;
 
-import com.google.protobuf.Any;
 import com.google.protobuf.ByteString;
-import com.google.protobuf.BytesValue;
 import com.google.protobuf.InvalidProtocolBufferException;
 import com.google.protobuf.util.Durations;
 import com.google.protobuf.util.Timestamps;
@@ -223,7 +221,6 @@ public class WindowingStrategyTranslation implements 
Serializable {
   .setSpec(
   FunctionSpec.newBuilder()
   .setUrn(OLD_SERIALIZED_JAVA_WINDOWFN_URN)
-  
.setAnyParam(Any.pack(BytesValue.newBuilder().setValue(serializedFn).build()))
   .setPayload(serializedFn)
   .build())
   .build();
@@ -241,7 +238,6 @@ public class WindowingStrategyTranslation implements 
Serializable {
   .setSpec(
   FunctionSpec.newBuilder()
   .setUrn(FIXED_WINDOWS_FN)
-  .setAnyParam(Any.pack(fixedWindowsPayload))
   .setPayload(fixedWindowsPayload.toByteString()))
   .build();
 } else if (windowFn instanceof SlidingWindows) {
@@ -254,7 +250,6 @@ public class WindowingStrategyTranslation implements 
Serializable {
   .setSpec(
   FunctionSpec.newBuilder()
   .setUrn(SLIDING_WINDOWS_FN)
-  .setAnyParam(Any.pack(slidingWindowsPayload))
   .setPayload(slidingWindowsPayload.toByteString()))
   .build();
 } else if (windowFn instanceof Sessions) {
@@ -266,7 +261,6 @@ public class WindowingStrategyTranslation implements 
Serializable {
   .setSpec(
   FunctionSpec.newBuilder()
   .setUrn(SESSION_WINDOWS_FN)
-  .setAnyParam(Any.pack(sessionsPayload))
   .setPayload(sessionsPayload.toByteString()))
   .build();
 } else {
@@ -274,7 +268,6 @@ public class WindowingStrategyTranslation implements 
Serializable {
   .setSpec(
   FunctionSpec.newBuilder()
   .setUrn(SERIALIZED_JAVA_WINDOWFN_URN)
-  
.setAnyParam(Any.pack(BytesValue.newBuilder().setValue(serializedFn).build()))
   .setPayload(serializedFn))
   .build();
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/060bda23/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
--
diff --git a/sdks/common/runner-api/src/main/proto/beam_runner_api.proto 
b/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
index 9ba5577..74f3897 100644
--- a/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
+++ b/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
@@ -782,9 +782,6 @@ message FunctionSpec {
   // passed as-is.
   string urn = 1;
 
-  // (Deprecated)
-  google.protobuf.Any any_param = 2;
-
   // (Optional) The data specifying any parameters to the URN. If
   // the URN does not require any arguments, this may be omitted.

[1/2] beam git commit: This closes #3869

2017-10-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 294518e59 -> 34eb9b0f3


This closes #3869


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/34eb9b0f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/34eb9b0f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/34eb9b0f

Branch: refs/heads/master
Commit: 34eb9b0f3add0b9bd4a0801ceb0fbd4d5885a64d
Parents: 294518e 060bda2
Author: Thomas Groh 
Authored: Thu Oct 5 09:45:44 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 09:45:44 2017 -0700

--
 .../WindowingStrategyTranslation.java   |  7 --
 .../src/main/proto/beam_runner_api.proto|  3 ---
 sdks/python/apache_beam/coders/coders.py|  1 -
 .../runners/portability/fn_api_runner.py| 26 
 sdks/python/apache_beam/transforms/core.py  |  4 ---
 .../python/apache_beam/transforms/ptransform.py |  1 -
 sdks/python/apache_beam/utils/urns.py   |  1 -
 7 files changed, 43 deletions(-)
--




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Spark #3221

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2545) bigtable e2e tests failing - UNKNOWN: Stale requests/Error mutating row

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193134#comment-16193134
 ] 

ASF GitHub Bot commented on BEAM-2545:
--

Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3484


> bigtable e2e tests failing -  UNKNOWN: Stale requests/Error mutating row
> 
>
> Key: BEAM-2545
> URL: https://issues.apache.org/jira/browse/BEAM-2545
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Stephen Sisk
>Assignee: Chamikara Jayalath
> Fix For: 2.2.0
>
>
> The BigtableWriteIT is taking a long time (~10min) and throwing errors. 
> Example test run: 
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_MavenInstall/4264/org.apache.beam$beam-runners-google-cloud-dataflow-java/testReport/junit/org.apache.beam.sdk.io.gcp.bigtable/BigtableWriteIT/testE2EBigtableWrite/
> (96dc5c8efaf8fa26): java.io.IOException: At least 25 errors occurred writing 
> to Bigtable. First 10 errors: 
> Error mutating row key00175 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00175"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00176 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00176"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00177 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00177"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00178 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00178"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00179 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00179"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00180 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00180"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00181 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00181"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00182 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00182"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00183 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00183"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00184 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00184"
> }
> ]: UNKNOWN: Stale requests.
>  at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.checkForFailures(BigtableIO.java:655)
>  at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.finishBundle(BigtableIO.java:607)
> Stacktrace
> java.lang.RuntimeException: 
> (96dc5c8efaf8fa26): java.io.IOException: At least 25 errors occurred writing 
> to Bigtable. First 10 errors: 
> Error mutating row key00175 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00175"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00176 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00176"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00177 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00177"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00178 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00178"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00179 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00179"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00180 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00180"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00181 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00181"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00182 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00182"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00183 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00183"
> }
> ]: UNKNOWN: Stale requests.
> Error mutating row key00184 with mutations [set_cell {
>   family_name: "cf"
>   value: "value00184"
> }
> ]: UNKNOWN: Stale requests.
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.checkForFailures(BigtableIO.java:655)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Write$BigtableWriterFn.finishBundle(BigtableIO.java:607)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:133)
>   at 
> org.apache.beam.runners.dataflow.TestDataflowRunner.run(TestDataflowRunner.java:89)
>   at 
> 

[GitHub] beam pull request #3484: [BEAM-2545] Upgrade bigtable client to 1.0.0-pre1

2017-10-05 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3484


---


[jira] [Commented] (BEAM-2141) beam_PerformanceTests_JDBC have not passed in weeks

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193132#comment-16193132
 ] 

ASF GitHub Bot commented on BEAM-2141:
--

Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3604


> beam_PerformanceTests_JDBC have not passed in weeks
> ---
>
> Key: BEAM-2141
> URL: https://issues.apache.org/jira/browse/BEAM-2141
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Daniel Halperin
>Assignee: Stephen Sisk
> Fix For: Not applicable
>
>
> https://builds.apache.org/job/beam_PerformanceTests_JDBC/
> Disabling them, as no one seems to be maintaining them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3604: [BEAM-2141] Update jenkins job for JDBCIOIT

2017-10-05 Thread ssisk
Github user ssisk closed the pull request at:

https://github.com/apache/beam/pull/3604


---


[jira] [Commented] (BEAM-2899) Universal Local Runner

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193117#comment-16193117
 ] 

ASF GitHub Bot commented on BEAM-2899:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3898


> Universal Local Runner
> --
>
> Key: BEAM-2899
> URL: https://issues.apache.org/jira/browse/BEAM-2899
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Henning Rohde
>Assignee: Thomas Groh
>  Labels: portability
>
> To make the portability effort tractable, we should implement a Universal 
> Local Runner (ULR) in Java that runs in a single server process plus docker 
> containers for the SDK harnesses. It would serve multiple purposes:
>   (1) A reference implementation for other runners. Ideally, any new feature 
> should be implemented in the ULR first.
>   (2) A fully-featured test runner for SDKs that participate in the 
> portability framework. It thus complements the direct runners.
>   (3) A test runner for user code that depends on or customizes the runtime 
> environment. For example, a DoFn that shells out has a dependency that may be 
> satisfied on the user's desktop (and thus works fine on the direct runner), 
> but perhaps not by the container harness image. The ULR allows for an easy 
> way to find out.
> The Java direct runner presumably has lots of pieces that can be reused.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/2] beam git commit: Add a Bare-bones ReferenceRunner Job Service

2017-10-05 Thread tgroh
Add a Bare-bones ReferenceRunner Job Service

This will eventually accept Job API calls and execute a Pipeline using
the ReferenceRunner backend.

This change exists primarily to create the appropriate module and POM.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f0394a64
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f0394a64
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f0394a64

Branch: refs/heads/master
Commit: f0394a64c22199b6a15f7eb84ab9ebde8ce069ae
Parents: 1259ee9
Author: Thomas Groh 
Authored: Fri Aug 4 10:58:32 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 09:15:29 2017 -0700

--
 pom.xml |  7 ++
 runners/pom.xml |  1 +
 runners/reference/pom.xml   | 75 +++
 .../reference/job/ReferenceRunnerJobServer.java | 67 +
 .../job/ReferenceRunnerJobService.java  | 77 
 .../runners/reference/job/package-info.java | 23 ++
 .../job/ReferenceRunnerJobServiceTest.java  | 34 +
 7 files changed, 284 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f0394a64/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 3ccd8d8..d9c2e6d 100644
--- a/pom.xml
+++ b/pom.xml
@@ -106,6 +106,7 @@
 1.1
 2.24.0
 1.0.0-rc2
+2.33
 1.8.2
 v2-rev355-1.22.0
 1.0.0-pre3
@@ -1132,6 +1133,12 @@
   
 
   
+args4j
+args4j
+${args4j.version}
+  
+
+  
 org.slf4j
 slf4j-api
 ${slf4j.version}

http://git-wip-us.apache.org/repos/asf/beam/blob/f0394a64/runners/pom.xml
--
diff --git a/runners/pom.xml b/runners/pom.xml
index 4f06748..e0a47bd 100644
--- a/runners/pom.xml
+++ b/runners/pom.xml
@@ -36,6 +36,7 @@
 core-construction-java
 core-java
 local-artifact-service-java
+reference
 direct-java
 flink
 google-cloud-dataflow-java

http://git-wip-us.apache.org/repos/asf/beam/blob/f0394a64/runners/reference/pom.xml
--
diff --git a/runners/reference/pom.xml b/runners/reference/pom.xml
new file mode 100644
index 000..d421786
--- /dev/null
+++ b/runners/reference/pom.xml
@@ -0,0 +1,75 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+
+  4.0.0
+
+  
+org.apache.beam
+beam-runners-parent
+2.2.0-SNAPSHOT
+../pom.xml
+  
+
+  beam-runners-reference
+  Apache Beam :: Runners :: Reference
+  A Pipeline Runner which executes on the local machine using the
+  Beam portability framework to execute an arbitrary Pipeline.
+
+  jar
+
+  
+
+  org.apache.beam
+  beam-sdks-common-runner-api
+
+
+
+  io.grpc
+  grpc-core
+
+
+
+  io.grpc
+  grpc-stub
+
+
+
+  args4j
+  args4j
+
+
+
+  org.slf4j
+  slf4j-api
+
+
+
+
+  junit
+  junit
+  test
+
+
+
+  org.slf4j
+  slf4j-jdk14
+  test
+
+  
+

http://git-wip-us.apache.org/repos/asf/beam/blob/f0394a64/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
--
diff --git 
a/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
 
b/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
new file mode 100644
index 000..1dcc2b3
--- /dev/null
+++ 
b/runners/reference/src/main/java/org/apache/beam/runners/reference/job/ReferenceRunnerJobServer.java
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 

[GitHub] beam pull request #3898: [BEAM-2899] Add a Bare-bones ReferenceRunner Job Se...

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3898


---


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Flink #3997

2017-10-05 Thread Apache Jenkins Server
See 




[1/2] beam git commit: This closes #3898

2017-10-05 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 1259ee955 -> 294518e59


This closes #3898


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/294518e5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/294518e5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/294518e5

Branch: refs/heads/master
Commit: 294518e59beff4e3539aa1c3160673d9a988df35
Parents: 1259ee9 f0394a6
Author: Thomas Groh 
Authored: Thu Oct 5 09:15:29 2017 -0700
Committer: Thomas Groh 
Committed: Thu Oct 5 09:15:29 2017 -0700

--
 pom.xml |  7 ++
 runners/pom.xml |  1 +
 runners/reference/pom.xml   | 75 +++
 .../reference/job/ReferenceRunnerJobServer.java | 67 +
 .../job/ReferenceRunnerJobService.java  | 77 
 .../runners/reference/job/package-info.java | 23 ++
 .../job/ReferenceRunnerJobServiceTest.java  | 34 +
 7 files changed, 284 insertions(+)
--




[GitHub] beam pull request #3946: Enable discovery of log_handler tests.

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3946


---


[1/2] beam git commit: Enable discovery of log_handler tests.

2017-10-05 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 12e79d0a0 -> 1259ee955


Enable discovery of log_handler tests.

Also fix the one remaining bug.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5fb3aa03
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5fb3aa03
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5fb3aa03

Branch: refs/heads/master
Commit: 5fb3aa03a4f018fb54cc5b65fc74920e0b7983b3
Parents: 12e79d0
Author: Robert Bradshaw 
Authored: Wed Oct 4 17:38:43 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu Oct 5 08:53:30 2017 -0700

--
 sdks/python/apache_beam/runners/worker/log_handler.py  | 1 +
 sdks/python/apache_beam/runners/worker/log_handler_test.py | 7 ---
 2 files changed, 5 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/5fb3aa03/sdks/python/apache_beam/runners/worker/log_handler.py
--
diff --git a/sdks/python/apache_beam/runners/worker/log_handler.py 
b/sdks/python/apache_beam/runners/worker/log_handler.py
index 8691184..6d8a1d9 100644
--- a/sdks/python/apache_beam/runners/worker/log_handler.py
+++ b/sdks/python/apache_beam/runners/worker/log_handler.py
@@ -24,6 +24,7 @@ import threading
 import grpc
 
 from apache_beam.portability.api import beam_fn_api_pb2
+from apache_beam.portability.api import beam_fn_api_pb2_grpc
 
 # This module is experimental. No backwards-compatibility guarantees.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/5fb3aa03/sdks/python/apache_beam/runners/worker/log_handler_test.py
--
diff --git a/sdks/python/apache_beam/runners/worker/log_handler_test.py 
b/sdks/python/apache_beam/runners/worker/log_handler_test.py
index 9814324..d2647d0 100644
--- a/sdks/python/apache_beam/runners/worker/log_handler_test.py
+++ b/sdks/python/apache_beam/runners/worker/log_handler_test.py
@@ -100,8 +100,9 @@ def _create_test(name, num_logs):
   lambda self: self._verify_fn_log_handler(num_logs))
 
 
-if __name__ == '__main__':
-  for test_name, num_logs_entries in data.iteritems():
-_create_test(test_name, num_logs_entries)
+for test_name, num_logs_entries in data.iteritems():
+  _create_test(test_name, num_logs_entries)
+
 
+if __name__ == '__main__':
   unittest.main()



[2/2] beam git commit: Closes #3946

2017-10-05 Thread robertwb
Closes #3946


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1259ee95
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1259ee95
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1259ee95

Branch: refs/heads/master
Commit: 1259ee9554dc5529d6579d4255afc0dccb1ec5dc
Parents: 12e79d0 5fb3aa0
Author: Robert Bradshaw 
Authored: Thu Oct 5 08:53:31 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu Oct 5 08:53:31 2017 -0700

--
 sdks/python/apache_beam/runners/worker/log_handler.py  | 1 +
 sdks/python/apache_beam/runners/worker/log_handler_test.py | 7 ---
 2 files changed, 5 insertions(+), 3 deletions(-)
--




[jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192901#comment-16192901
 ] 

ASF GitHub Bot commented on BEAM-2993:
--

GitHub user echauchot opened a pull request:

https://github.com/apache/beam/pull/3950

 [BEAM-2993] AvroIO.write without specifying a schema

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
This PR adds the ability to use `AvroIO.write()` and related methods 
without specifying a schema. 
The schema is determined at the first call of `AvroSink.write()`: the 
`DataFileWriter` is lazily initialized (at the first write) once we have a 
value to get the schema from.
This PR also makes the schema optional in `ConstantAvroDestination` and 
deprecates the write methods that take a schema as a parameter. Tell me if I'm 
missing something that prevents deprecation of these methods.

To use `AvroIO.write()` with no schema, all the elements of the input 
PCollection must have the same schema, but this is also the case with the 
current AvroIO.write(schema) implementation, because that schema instance is 
passed to the `TypedWrite` and then to the `ConstantAvroDestination` that is 
used in `AvroSink`. Please tell me if I'm missing something here.

My only concern is with empty bundles: `AvroSink.write()` will not be 
called, so the `DataFileWriter` will never be initialized.
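
For illustration, a minimal sketch of the lazy-initialization idea using plain 
Avro APIs (not the actual `AvroSink` code; class and method names are 
illustrative):

{code:java}
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

// Defers schema resolution to the first element written.
class LazySchemaAvroWriter {
  private final WritableByteChannel channel;
  private DataFileWriter<GenericRecord> dataFileWriter; // created lazily

  LazySchemaAvroWriter(WritableByteChannel channel) {
    this.channel = channel;
  }

  void write(GenericRecord record) throws IOException {
    if (dataFileWriter == null) {
      // First element: take the schema from the record itself.
      dataFileWriter =
          new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(record.getSchema()));
      dataFileWriter.create(record.getSchema(), Channels.newOutputStream(channel));
    }
    dataFileWriter.append(record);
  }

  void close() throws IOException {
    // On an empty bundle nothing was written, so there is nothing to close --
    // this is the concern mentioned above.
    if (dataFileWriter != null) {
      dataFileWriter.close();
    }
  }
}
{code}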

Please merge the PR below before this one, because it is used as a base for 
the tests:
https://github.com/apache/beam/pull/3948

R: @jkff 
R: @reuvenlax 
CC: @lukecwik 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/echauchot/beam AvroIOWriteSchemaLess2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3950.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3950


commit 43ef4d42d7d224b1997278832ec645bccb945792
Author: Etienne Chauchot 
Date:   2017-10-05T09:45:12Z

[BEAM-3019] Make AvroIOWriteTransformTest more generic

make runTestWrite() more generic to be able to use GenericRecord[] as input 
for writeGenericRecords test in place of AvroGeneratedUser
make readAvroFile() generic to be able to read GenericRecords using 
GenericDatumReader for writeGenericRecords test

commit 84074e36085d76f569c89d4a29a647fc40b22531
Author: Etienne Chauchot 
Date:   2017-10-02T15:08:55Z

[BEAM-2993] AvroIO.write without specifying a schema

Lazy init (at first write) of the dataFileWriter once we have the value to 
get the schema from.
Make schema optional in ConstantAvroDestination and depreciate write 
methods that take schema as parameter
Cleaning

commit d19c2cb3538e5981e8138522d0c2138b455dec46
Author: Etienne Chauchot 
Date:   2017-10-05T12:08:04Z

Add tests of the schema less write methods
Cleaning

commit da95342353bd191c55d6d7768d4c052c531b8cf1
Author: Etienne Chauchot 
Date:   2017-10-05T12:43:29Z

Fixups




> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs 

[GitHub] beam pull request #3950: [BEAM-2993] AvroIO.write without specifying a schem...

2017-10-05 Thread echauchot
GitHub user echauchot opened a pull request:

https://github.com/apache/beam/pull/3950

 [BEAM-2993] AvroIO.write without specifying a schema

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
This PR adds the ability to use `AvroIO.write()` and related methods 
without specifying a schema. 
The schema is determined at the first call of `AvroSink.write()`: the 
`DataFileWriter` is lazily initialized (at the first write) once we have a 
value to get the schema from.
This PR also makes the schema optional in `ConstantAvroDestination` and 
deprecates the write methods that take a schema as a parameter. Tell me if I'm 
missing something that prevents deprecation of these methods.

To use `AvroIO.write()` with no schema, all the elements of the input 
PCollection must have the same schema, but this is also the case with the 
current AvroIO.write(schema) implementation, because that schema instance is 
passed to the `TypedWrite` and then to the `ConstantAvroDestination` that is 
used in `AvroSink`. Please tell me if I'm missing something here.

My only concern is with empty bundles: `AvroSink.write()` will not be 
called, so the `DataFileWriter` will never be initialized.

Please merge the PR below before this one, because it is used as a base for 
the tests:
https://github.com/apache/beam/pull/3948

R: @jkff 
R: @reuvenlax 
CC: @lukecwik 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/echauchot/beam AvroIOWriteSchemaLess2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3950.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3950


commit 43ef4d42d7d224b1997278832ec645bccb945792
Author: Etienne Chauchot 
Date:   2017-10-05T09:45:12Z

[BEAM-3019] Make AvroIOWriteTransformTest more generic

make runTestWrite() more generic to be able to use GenericRecord[] as input 
for writeGenericRecords test in place of AvroGeneratedUser
make readAvroFile() generic to be able to read GenericRecords using 
GenericDatumReader for writeGenericRecords test

commit 84074e36085d76f569c89d4a29a647fc40b22531
Author: Etienne Chauchot 
Date:   2017-10-02T15:08:55Z

[BEAM-2993] AvroIO.write without specifying a schema

Lazy init (at first write) of the dataFileWriter once we have the value to 
get the schema from.
Make schema optional in ConstantAvroDestination and depreciate write 
methods that take schema as parameter
Cleaning

commit d19c2cb3538e5981e8138522d0c2138b455dec46
Author: Etienne Chauchot 
Date:   2017-10-05T12:08:04Z

Add tests of the schema less write methods
Cleaning

commit da95342353bd191c55d6d7768d4c052c531b8cf1
Author: Etienne Chauchot 
Date:   2017-10-05T12:43:29Z

Fixups




---


[jira] [Commented] (BEAM-2779) PipelineOptionsFactory should prevent non PipelineOptions interfaces from being constructed.

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192807#comment-16192807
 ] 

ASF GitHub Bot commented on BEAM-2779:
--

GitHub user vectorijk opened a pull request:

https://github.com/apache/beam/pull/3949

[BEAM-2779] PipelineOptionsFactory should prevent non PipelineOptions 
interfaces from being constructed

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vectorijk/beam BEAM-2779-PipelineOption

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3949.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3949


commit 45de4ecf0901ce6ebfab0296ba80de8357a05d2d
Author: Kai Jiang 
Date:   2017-10-04T06:41:32Z

implement dfs check and experiments

commit 7feec656aa27ad844c70aa99c22b711904a466de
Author: Kai Jiang 
Date:   2017-10-05T09:14:48Z

rewrite and add unit test

commit 51dccc0f5de4a743785e6a37745b7f505cdf549c
Author: Kai Jiang 
Date:   2017-10-05T12:31:38Z

comments and Options revise




> PipelineOptionsFactory should prevent non PipelineOptions interfaces from 
> being constructed.
> 
>
> Key: BEAM-2779
> URL: https://issues.apache.org/jira/browse/BEAM-2779
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Kai Jiang
>Priority: Minor
>  Labels: starter
>
> PipelineOptions currently serializes information about all getter/setter 
> pairs on interfaces which don't extend PipelineOptions.
> For example:
> {code:java}
> interface Foo extends PipelineOptions, Bar {
>   String getFoo();
>   void setFoo(String value);
> }
> interface Bar {
>   String getBar();
>   void setBar(String value);
> }
> {code}
> The serialization of the above (when both *foo* and *bar* are set) will 
> produce JSON that only includes display data for *foo* but includes data for 
> both *foo* and *bar*. During validation of an interface in 
> *PipelineOptionsFactory*, we should throw an error if one of the user's 
> interfaces doesn't extend *PipelineOptions* (note that we should ignore the 
> HasDisplayData interface).
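
For illustration, a minimal sketch of the kind of depth-first hierarchy check 
described above (a hypothetical helper, not the actual *PipelineOptionsFactory* 
code):

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.transforms.display.HasDisplayData;

// Walks the interface hierarchy depth-first and rejects any interface that
// does not extend PipelineOptions, ignoring HasDisplayData.
final class OptionsHierarchyCheck {
  static void validate(Class<? extends PipelineOptions> iface) {
    Deque<Class<?>> stack = new ArrayDeque<>();
    Set<Class<?>> visited = new HashSet<>();
    stack.push(iface);
    while (!stack.isEmpty()) {
      Class<?> current = stack.pop();
      if (!visited.add(current) || current.equals(HasDisplayData.class)) {
        continue;
      }
      if (!PipelineOptions.class.isAssignableFrom(current)) {
        throw new IllegalArgumentException(
            current.getName() + " does not extend PipelineOptions");
      }
      for (Class<?> parent : current.getInterfaces()) {
        stack.push(parent);
      }
    }
  }
}
{code}

Given the *Foo*/*Bar* example above, validate(Foo.class) would throw because 
*Bar* does not extend *PipelineOptions*.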



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3949: [BEAM-2779] PipelineOptionsFactory should prevent n...

2017-10-05 Thread vectorijk
GitHub user vectorijk opened a pull request:

https://github.com/apache/beam/pull/3949

[BEAM-2779] PipelineOptionsFactory should prevent non PipelineOptions 
interfaces from being constructed

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vectorijk/beam BEAM-2779-PipelineOption

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3949.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3949


commit 45de4ecf0901ce6ebfab0296ba80de8357a05d2d
Author: Kai Jiang 
Date:   2017-10-04T06:41:32Z

implement dfs check and experiments

commit 7feec656aa27ad844c70aa99c22b711904a466de
Author: Kai Jiang 
Date:   2017-10-05T09:14:48Z

rewrite and add unit test

commit 51dccc0f5de4a743785e6a37745b7f505cdf549c
Author: Kai Jiang 
Date:   2017-10-05T12:31:38Z

comments and Options revise




---


[jira] [Comment Edited] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192689#comment-16192689
 ] 

Etienne Chauchot edited comment on BEAM-2993 at 10/5/17 12:10 PM:
--

Besides I have just submitted a preparation PR just to make 
{{AvroIOWriteTransformTest}} more generic. See 
https://github.com/apache/beam/pull/3948 . This PR needs to be merged before 
the upcoming one on {{AvroIO.write()}}.


was (Author: echauchot):
Besides I have just submitted a preparation PR just to make 
AvroIOWriteTransformTest more generic. See 
https://github.com/apache/beam/pull/3948 . This PR needs to be merged before 
the upcoming one on {{AvroIO.write()}}.

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread Ryan Skraba (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192777#comment-16192777
 ] 

Ryan Skraba edited comment on BEAM-2993 at 10/5/17 11:53 AM:
-

Very good points and foreseeing some of our plans!  Fortunately, I'm pretty 
sure that we can consider that *if* someone chooses to use {{AvroIO.write()}} 
without specifying a schema, they *must* provide a homogeneous collection (all 
with the same schema)! 

But looking ahead, we *are* moving towards heterogeneous collections (or at 
least heterogeneous-ish with a limited number of possible schemas) and there 
are intelligent things we can do in intermediate transforms, such as 
reconciling them into a good, "known" schema.  I don't think it would be 
reasonable or desirable to ask AvroIO.write to implement any of that 
intermediate-type logic.

That being said, the SchemaRefAndRecord is probably what we would need to solve 
the heterogeneous collection problem, but I don't consider it related to this 
PR.  It would be baked into the intermediate processing.

For info, before Beam 2.0, we used the hadoop input format Sink, with a lazy 
configuration when the first record is received which actually worked very well 
-- but we're pretty motivated to move entirely to the BFS as soon as possible!


was (Author: ryanskraba):
Very good points and foreseeing some of our plans!  Fortunately, I'm pretty 
sure that we can consider that *if* someone chooses to use {{AvroIO.write()}} 
without specifying a schema, they *must* provide a homogeneous collection (all 
with the same schema)! 

But looking ahead, we *are* moving towards heterogeneous collections (or at 
least heterogeneous-ish with a limited number of possible schemas) and there 
are intelligent things we can do in intermediate transforms, such as 
reconciling them into a good, "known" schema.  I don't think it would be 
reasonable or desirable to ask AvroIO.write to implement any of that 
intermediate-type logic.

That being said, the SchemaRefAndRecord is probably what we would need to solve 
the heterogeneous collection problem, but I don't consider it related.

For info, before Beam 2.0, we used the hadoop input format Sink, with a lazy 
configuration when the first record is received which actually worked very well 
-- but we're pretty motivated to move entirely to the BFS as soon as possible!

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread Ryan Skraba (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192777#comment-16192777
 ] 

Ryan Skraba commented on BEAM-2993:
---

Very good points and foreseeing some of our plans!  Fortunately, I'm pretty 
sure that we can consider that *if* someone chooses to use {{AvroIO.write()}} 
without specifying a schema, they *must* provide a homogeneous collection (all 
with the same schema)! 

But looking ahead, we *are* moving towards heterogeneous collections (or at 
least heterogeneous-ish with a limited number of possible schemas) and there 
are intelligent things we can do in intermediate transforms, such as 
reconciling them into a good, "known" schema.  I don't think it would be 
reasonable or desirable to ask AvroIO.write to implement any of this logic.

That being said, the SchemaRefAndRecord is probably what we would need to solve 
the heterogeneous collection problem, but I don't consider it related.

For info, before Beam 2.0, we used the hadoop input format Sink, with a lazy 
configuration when the first record is received which actually worked very well 
-- but we're pretty motivated to move entirely to the BFS as soon as possible!

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread Ryan Skraba (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192777#comment-16192777
 ] 

Ryan Skraba edited comment on BEAM-2993 at 10/5/17 11:52 AM:
-

Very good points and foreseeing some of our plans!  Fortunately, I'm pretty 
sure that we can consider that *if* someone chooses to use {{AvroIO.write()}} 
without specifying a schema, they *must* provide a homogeneous collection (all 
with the same schema)! 

But looking ahead, we *are* moving towards heterogeneous collections (or at 
least heterogeneous-ish with a limited number of possible schemas) and there 
are intelligent things we can do in intermediate transforms, such as 
reconciling them into a good, "known" schema.  I don't think it would be 
reasonable or desirable to ask AvroIO.write to implement any of that 
intermediate-type logic.

That being said, the SchemaRefAndRecord is probably what we would need to solve 
the heterogeneous collection problem, but I don't consider it related.

For info, before Beam 2.0, we used the hadoop input format Sink, with a lazy 
configuration when the first record is received which actually worked very well 
-- but we're pretty motivated to move entirely to the BFS as soon as possible!


was (Author: ryanskraba):
Very good points and foreseeing some of our plans!  Fortunately, I'm pretty 
sure that we can consider that *if* someone chooses to use {{AvroIO.write()}} 
without specifying a schema, they *must* provide a homogeneous collection (all 
with the same schema)! 

But looking ahead, we *are* moving towards heterogeneous collections (or at 
least heterogeneous-ish with a limited number of possible schemas) and there 
are intelligent things we can do in intermediate transforms, such as 
reconciling them into a good, "known" schema.  I don't think it would be 
reasonable or desirable to ask AvroIO.write to implement any of this logic.

That being said, the SchemaRefAndRecord is probably what we would need to solve 
the heterogeneous collection problem, but I don't consider it related.

For info, before Beam 2.0, we used the hadoop input format Sink, with a lazy 
configuration when the first record is received which actually worked very well 
-- but we're pretty motivated to move entirely to the BFS as soon as possible!

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4947

2017-10-05 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2533

2017-10-05 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-3018) Remove duplicated methods in StructuredCoder

2017-10-05 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-3018.

   Resolution: Fixed
Fix Version/s: 2.3.0

> Remove duplicated methods in StructuredCoder
> 
>
> Key: BEAM-3018
> URL: https://issues.apache.org/jira/browse/BEAM-3018
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Fix For: 2.3.0
>
>
> StructuredCoder has several methods that are identical to those of its 
> parent. We should remove them.
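
As a toy sketch of the pattern being removed (illustrative class names, not 
the real Beam sources): the subclass overrides a method only to repeat the 
parent's implementation verbatim, so deleting the override changes no behavior.

{code:java}
// Illustrative only -- BaseCoder/MyStructuredCoder are stand-in names, not
// the actual Beam classes touched by this issue.
abstract class BaseCoder<T> {
  public boolean consistentWithEquals() {
    return false;
  }
}

abstract class MyStructuredCoder<T> extends BaseCoder<T> {
  // Identical to BaseCoder#consistentWithEquals() -- safe to delete.
  @Override
  public boolean consistentWithEquals() {
    return false;
  }
}
{code}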



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3018) Remove duplicated methods in StructuredCoder

2017-10-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192697#comment-16192697
 ] 

ASF GitHub Bot commented on BEAM-3018:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3947


> Remove duplicated methods in StructuredCoder
> 
>
> Key: BEAM-3018
> URL: https://issues.apache.org/jira/browse/BEAM-3018
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>
> StructuredCoder has several methods that are identical to those of its 
> parent. We should remove them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3947: [BEAM-3018] Remove duplicated methods in Structured...

2017-10-05 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3947


---


[1/2] beam git commit: [BEAM-3018] Remove duplicated methods in StructuredCoder

2017-10-05 Thread iemejia
Repository: beam
Updated Branches:
  refs/heads/master afe8da8bc -> 12e79d0a0


[BEAM-3018] Remove duplicated methods in StructuredCoder


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8d8a213f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8d8a213f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8d8a213f

Branch: refs/heads/master
Commit: 8d8a213f0ba2f1cceec8e53bba01c8d2a55c8863
Parents: afe8da8
Author: Cao Manh Dat 
Authored: Thu Oct 5 11:28:35 2017 +0700
Committer: Cao Manh Dat 
Committed: Thu Oct 5 11:28:35 2017 +0700

--
 .../apache/beam/sdk/coders/StructuredCoder.java | 34 
 1 file changed, 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/8d8a213f/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuredCoder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuredCoder.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuredCoder.java
index 2eb662b..bd964f4 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuredCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StructuredCoder.java
@@ -17,10 +17,8 @@
  */
 package org.apache.beam.sdk.coders;
 
-import java.io.ByteArrayOutputStream;
 import java.util.Collections;
 import java.util.List;
-import org.apache.beam.sdk.values.TypeDescriptor;
 
 /**
  * An abstract base class to implement a {@link Coder} that defines equality, hashing, and printing
@@ -99,36 +97,4 @@ public abstract class StructuredCoder<T> extends Coder<T> {
     return builder.toString();
   }
 
-  /**
-   * {@inheritDoc}
-   *
-   * @return {@code false} for {@link StructuredCoder} unless overridden.
-   */
-  @Override
-  public boolean consistentWithEquals() {
-    return false;
-  }
-
-  @Override
-  public Object structuralValue(T value) {
-    if (value != null && consistentWithEquals()) {
-      return value;
-    } else {
-      try {
-        ByteArrayOutputStream os = new ByteArrayOutputStream();
-        encode(value, os, Context.OUTER);
-        return new StructuralByteArray(os.toByteArray());
-      } catch (Exception exn) {
-        throw new IllegalArgumentException(
-            "Unable to encode element '" + value + "' with coder '" + this + "'.", exn);
-      }
-    }
-  }
-
-  @SuppressWarnings("unchecked")
-  @Override
-  public TypeDescriptor<T> getEncodedTypeDescriptor() {
-    return (TypeDescriptor<T>)
-        TypeDescriptor.of(getClass()).resolveType(new TypeDescriptor<T>() {}.getType());
-  }
 }



[2/2] beam git commit: This closes #3947

2017-10-05 Thread iemejia
This closes #3947


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/12e79d0a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/12e79d0a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/12e79d0a

Branch: refs/heads/master
Commit: 12e79d0a0dd417b9b09c44d042ce765a90cdc57b
Parents: afe8da8 8d8a213
Author: Ismaël Mejía 
Authored: Thu Oct 5 12:20:13 2017 +0200
Committer: Ismaël Mejía 
Committed: Thu Oct 5 12:20:13 2017 +0200

--
 .../apache/beam/sdk/coders/StructuredCoder.java | 34 
 1 file changed, 34 deletions(-)
--




[jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

2017-10-05 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192689#comment-16192689
 ] 

Etienne Chauchot commented on BEAM-2993:


Besides, I have just submitted a preparation PR to make 
{{AvroIOWriteTransformTest}} more generic: see 
https://github.com/apache/beam/pull/3948. This PR needs to be merged before 
the upcoming one on {{AvroIO.write()}}.

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to Avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}}, but the schema is only known while running 
> the pipeline. {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in each {{GenericRecord}}. We should be able to 
> call {{AvroIO.writeGenericRecords()}} with no schema.
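
To make the gap concrete, here is a minimal sketch (my own illustration, not 
code from this ticket) contrasting today's API, where the schema must be 
passed at construction time, with the hypothetical no-argument variant being 
requested. Only {{AvroIO.writeGenericRecords(schema)}} exists in the current 
API; the no-argument call and the output path are placeholders.

{code:java}
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.io.AvroIO;
import org.apache.beam.sdk.values.PCollection;

public class AvroWriteExample {

  // Today's API: the Avro schema must already be known when the pipeline is built.
  static void writeWithKnownSchema(PCollection<GenericRecord> records, Schema schema) {
    records.apply(
        AvroIO.writeGenericRecords(schema)
            .to("/tmp/avro-out/records"));  // placeholder output prefix
  }

  // What this ticket asks for (hypothetical, not in the current API):
  //   records.apply(AvroIO.writeGenericRecords().to("/tmp/avro-out/records"));
  // i.e. let the sink pick the schema up from the GenericRecords themselves.
}
{code}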



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

