[jira] [Commented] (BEAM-3685) It should be an error to run a Pipeline without ever specifying options

2018-02-21 Thread Willy Lulciuc (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372500#comment-16372500
 ] 

Willy Lulciuc commented on BEAM-3685:
-

[~tgroh]: I'll take on this task

> It should be an error to run a Pipeline without ever specifying options
> ---
>
> Key: BEAM-3685
> URL: https://issues.apache.org/jira/browse/BEAM-3685
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Priority: Major
>  Labels: beginner, newbie, starter
>
> Doing so lets users run some pipelines without specifying any configuration, 
> which is dangerous.
>  
> At minimum, it should log a very obvious warning.
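The requested behavior can be sketched as follows. This is a hypothetical stand-in (class and method names are illustrative, not the actual SDK change): a pipeline factory records whether options were ever supplied and warns loudly at run time if they were not.

```java
import java.util.logging.Logger;

// Hypothetical sketch of the guard being requested: track whether options
// were ever specified, and warn loudly at run() time if they were not.
// Class and method names are illustrative, not the real Beam SDK.
public class PipelineSketch {
    private static final Logger LOG = Logger.getLogger(PipelineSketch.class.getName());

    private final boolean optionsSpecified;

    private PipelineSketch(boolean optionsSpecified) {
        this.optionsSpecified = optionsSpecified;
    }

    // Mirrors Pipeline.create(): no options ever supplied.
    public static PipelineSketch create() {
        return new PipelineSketch(false);
    }

    // Mirrors Pipeline.create(PipelineOptions): options supplied explicitly.
    public static PipelineSketch create(Object options) {
        return new PipelineSketch(options != null);
    }

    // Returns true if the run proceeded without a warning.
    public boolean run() {
        if (!optionsSpecified) {
            LOG.warning("Running a Pipeline without ever specifying PipelineOptions; "
                + "all defaults will be used. This is likely unintended.");
            return false;
        }
        return true;
    }
}
```

Making this a hard error instead of a warning would only require throwing from run() rather than logging.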



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3664) Port SolrIOTest off DoFnTester

2018-02-21 Thread Willy Lulciuc (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372481#comment-16372481
 ] 

Willy Lulciuc commented on BEAM-3664:
-

[~kenn]: Replacing DoFnTester with TestPipeline in testWriteWithMaxBatchSize() 
is not as straightforward here (or so it would seem). Let me explain why. The 
test has the following comment:

"write bundles size is the runner decision, we cannot force a bundle size, so 
we test the Writer as a DoFn outside of a runner."

Meaning, DoFnTester.of() is used to invoke processElement() as the Solr 
documents are iterated over, with periodic calls to commit documents, followed 
by checks on insertion counters, etc.

I've tried a couple of ways, but none provides the convenience of 
DoFnTester.of() for comparing the number of documents inserted vs. processed.

Am I missing something?
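The pattern described above — driving processElement() by hand so the test controls the "bundle" — can be sketched with a stand-in writer (a hypothetical class, not SolrIO's actual Writer):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a writer DoFn: it buffers elements and "commits"
// them in batches, which is the behavior testWriteWithMaxBatchSize observes.
public class BatchingWriter {
    private final int maxBatchSize;
    private final List<String> buffer = new ArrayList<>();
    private int committed = 0;

    public BatchingWriter(int maxBatchSize) {
        this.maxBatchSize = maxBatchSize;
    }

    // Analogous to processElement(): the test invokes this once per document,
    // which is exactly the control a runner-chosen bundle size takes away.
    public void processElement(String doc) {
        buffer.add(doc);
        if (buffer.size() >= maxBatchSize) {
            commit();
        }
    }

    public void commit() {
        committed += buffer.size();
        buffer.clear();
    }

    public int getCommittedCount() {
        return committed;
    }
}
```

Driving such an object element by element is the convenience DoFnTester.of() provides for a real DoFn; a TestPipeline run hands bundling to the runner, so the intermediate commit points can no longer be asserted on.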
> Port SolrIOTest off DoFnTester
> --
>
> Key: BEAM-3664
> URL: https://issues.apache.org/jira/browse/BEAM-3664
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-solr
>Reporter: Kenneth Knowles
>Priority: Major
>  Labels: beginner, newbie, starter
>






Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4247

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Spark #1386

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[luke_zhu] Revert invalid use of io.StringIO in utils/profiler.py

[kirpichov] A relative directory should be applied (if specified) even when 
using a

--
[...truncated 98.30 KB...]
'apache-beam-testing:bqjob_r6827e81f7652e574_0161bc2b1689_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-22 06:19:57,744 7abea485 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 06:20:21,141 7abea485 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-22 06:20:23,363 7abea485 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.21s,  CPU:0.25s,  MaxMemory:25168kb 
STDOUT: Upload complete.
Waiting on bqjob_r5a1bcde381f5abf_0161bc2b7a1b_1 ... (0s) Current status: 
RUNNING 
Waiting on bqjob_r5a1bcde381f5abf_0161bc2b7a1b_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5a1bcde381f5abf_0161bc2b7a1b_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-22 06:20:23,364 7abea485 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 06:20:45,574 7abea485 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-22 06:20:47,897 7abea485 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.31s,  CPU:0.25s,  MaxMemory:25168kb 
STDOUT: Upload complete.
Waiting on bqjob_r3b7e3b5ed86ff735_0161bc2bd988_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r3b7e3b5ed86ff735_0161bc2bd988_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r3b7e3b5ed86ff735_0161bc2bd988_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-22 06:20:47,898 7abea485 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 06:21:04,070 7abea485 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-22 06:21:06,375 7abea485 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.30s,  CPU:0.24s,  MaxMemory:25164kb 
STDOUT: Upload complete.
Waiting on bqjob_r7818a3f4679216bc_0161bc2c21c9_1 ... (0s) Current status: 
RUNNING 
 Waiting on 

Build failed in Jenkins: beam_PerformanceTests_JDBC #245

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[luke_zhu] Revert invalid use of io.StringIO in utils/profiler.py

[kirpichov] A relative directory should be applied (if specified) even when 
using a

--
[...truncated 49.94 KB...]
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-failsafe-plugin:2.20.1:integration-test (default) @ 
beam-sdks-java-io-jdbc ---
[INFO] Failsafe report directory: 

[INFO] parallel='all', perCoreThreadCount=true, threadCount=4, 
useUnlimitedThreads=false, threadCountSuites=0, threadCountClasses=0, 
threadCountMethods=0, parallelOptimized=true
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 960.831 
s <<< FAILURE! - in org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] testWriteThenRead(org.apache.beam.sdk.io.jdbc.JdbcIOIT)  Time elapsed: 
960.831 s  <<< ERROR!
java.lang.RuntimeException: 
(65d9ba0a7a36756f): java.lang.RuntimeException: 
org.apache.beam.sdk.util.UserCodeException: org.postgresql.util.PSQLException: 
The connection attempt failed.
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:404)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:374)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
at 
com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:158)
at 
com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:308)
at 
com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:264)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:133)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:113)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:100)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: 
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at 
org.apache.beam.sdk.io.jdbc.JdbcIO$ReadFn$DoFnInvoker.invokeSetup(Unknown 
Source)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:63)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:45)
at 
com.google.cloud.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:94)
at 
com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:74)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:481)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:392)
... 14 more
Caused by: org.postgresql.util.PSQLException: The 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #5058

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #942

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[luke_zhu] Revert invalid use of io.StringIO in utils/profiler.py

[kirpichov] A relative directory should be applied (if specified) even when 
using a

--
[...truncated 726 B...]
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 3e5ce5cfd67efd4fcb4c9cb5db1cc9e677caeff8 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 3e5ce5cfd67efd4fcb4c9cb5db1cc9e677caeff8
Commit message: "Merge pull request #4713 from luke-zhu/python3"
 > git rev-list --no-walk 14e4cb30105c4aeb43483155e1447ffaae2fbcb5 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2430456843286655857.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6259997144783648603.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6780985243548626273.sh
+ virtualenv .env --system-site-packages
New python executable in .env/bin/python
Installing setuptools, pip...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins704804726630153349.sh
+ .env/bin/pip install --upgrade setuptools pip
Downloading/unpacking setuptools from 
https://pypi.python.org/packages/43/41/033a273f9a25cb63050a390ee8397acbc7eae2159195d85f06f17e7be45a/setuptools-38.5.1-py2.py3-none-any.whl#md5=908b8b5e50bf429e520b2b5fa1b350e5
Downloading/unpacking pip from 
https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 2.2
Uninstalling setuptools:
  Successfully uninstalled setuptools
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed setuptools pip
Cleaning up...
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins455838049296166197.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3167293236549596172.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy==1.13.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4246

2018-02-21 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2304) State declared with one class cannot be accessed as a superclass (applies to BagState|CombiningState <: GroupingState)

2018-02-21 Thread Luke Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Zhu resolved BEAM-2304.

   Resolution: Fixed
Fix Version/s: 2.3.0

> State declared with one class cannot be accessed as a superclass (applies to 
> BagState|CombiningState <: GroupingState)
> --
>
> Key: BEAM-2304
> URL: https://issues.apache.org/jira/browse/BEAM-2304
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Luke Zhu
>Priority: Major
>  Labels: starter
> Fix For: 2.3.0
>
>
> The following code:
> {code}
> @StateId("foo")
> private final StateSpec<CombiningState<Integer, int[], Integer>> state =
> StateSpecs.combining(Sum.ofIntegers());
> @ProcessElement
> public void processElement(ProcessContext c,
>     @StateId("foo") GroupingState<Integer, Integer> state) {
> }
> {code}
> Fails with: 
> {code}
> parameter of type GroupingState<Integer, Integer> at index 1: reference to 
> StateId exists with different type CombiningState<Integer, int[], Integer>
> {code}
> However since GroupingState<Integer, Integer> is the base class, ideally this 
> should work - and would make the API easier to use if it did.
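For intuition, the subtype relationship the issue relies on can be sketched with stand-in interfaces (hypothetical types mirroring the shape of Beam's state hierarchy, not the real SDK):

```java
// Minimal stand-ins for the state interfaces (hypothetical, for illustration).
interface GroupingState<InputT, OutputT> {
    void add(InputT value);
    OutputT read();
}

// CombiningState refines GroupingState with an accumulator type parameter.
interface CombiningState<InputT, AccumT, OutputT> extends GroupingState<InputT, OutputT> {}

public class StateSubtyping {
    // A trivial summing implementation of the narrower interface.
    static CombiningState<Integer, int[], Integer> sumState() {
        return new CombiningState<Integer, int[], Integer>() {
            private int sum = 0;
            public void add(Integer value) { sum += value; }
            public Integer read() { return sum; }
        };
    }

    public static void main(String[] args) {
        // Widening to the base type is ordinary Java subtyping; the issue asks
        // the SDK's @StateId lookup to permit the same widening.
        GroupingState<Integer, Integer> state = sumState();
        state.add(3);
        state.add(4);
        System.out.println(state.read());
    }
}
```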





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #5057

2018-02-21 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4245

2018-02-21 Thread Apache Jenkins Server
See 




[beam] branch master updated (69e9ac8 -> 3e5ce5c)

2018-02-21 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 69e9ac8  This closes #4710: Uses output directory with custom 
FileNaming in FileIO.write
 add a4ff01d  Revert invalid use of io.StringIO in utils/profiler.py
 new 3e5ce5c  Merge pull request #4713 from luke-zhu/python3

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/utils/profiler.py | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
al...@apache.org.


[beam] 01/01: Merge pull request #4713 from luke-zhu/python3

2018-02-21 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 3e5ce5cfd67efd4fcb4c9cb5db1cc9e677caeff8
Merge: 69e9ac8 a4ff01d
Author: Ahmet Altay 
AuthorDate: Wed Feb 21 17:37:55 2018 -0800

Merge pull request #4713 from luke-zhu/python3

Revert invalid use of io.StringIO in utils/profiler.py

 sdks/python/apache_beam/utils/profiler.py | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)



Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #5056

2018-02-21 Thread Apache Jenkins Server
See 




[beam] branch master updated: A relative directory should be applied (if specified) even when using a custom filenaming scheme

2018-02-21 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new c514fdf  A relative directory should be applied (if specified) even 
when using a custom filenaming scheme
 new 69e9ac8  This closes #4710: Uses output directory with custom 
FileNaming in FileIO.write
c514fdf is described below

commit c514fdfdb71e91cdc5fe3a25139aa1a3e849be3e
Author: Gene Peters 
AuthorDate: Mon Feb 19 13:45:01 2018 -0800

A relative directory should be applied (if specified) even when using a 
custom filenaming scheme
---
 .../main/java/org/apache/beam/sdk/io/FileIO.java   | 94 +++---
 .../java/org/apache/beam/sdk/io/FileIOTest.java| 75 +
 2 files changed, 123 insertions(+), 46 deletions(-)

diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
index 76717ad..9295981 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
@@ -23,6 +23,7 @@ import static 
org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions.RE
 import static org.apache.beam.sdk.transforms.Contextful.fn;
 
 import com.google.auto.value.AutoValue;
+import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.MoreObjects;
 import com.google.common.collect.Lists;
 import java.io.IOException;
@@ -1147,6 +1148,52 @@ public class FileIO {
   return toBuilder().setIgnoreWindowing(true).build();
 }
 
+@VisibleForTesting
+Contextful<Fn<DestinationT, FileNaming>> resolveFileNamingFn() {
+  if (getDynamic()) {
+checkArgument(
+getConstantFileNaming() == null,
+"when using writeDynamic(), must use versions of .withNaming() 
"
++ "that take functions from DestinationT");
+checkArgument(getFilenamePrefix() == null, ".withPrefix() requires 
write()");
+checkArgument(getFilenameSuffix() == null, ".withSuffix() requires 
write()");
+checkArgument(
+getFileNamingFn() != null,
+"when using writeDynamic(), must specify "
++ ".withNaming() taking a function form DestinationT");
+return fn(
+(element, c) -> {
+  FileNaming naming = 
getFileNamingFn().getClosure().apply(element, c);
+  return getOutputDirectory() == null
+  ? naming
+  : relativeFileNaming(getOutputDirectory(), 
naming);
+},
+getFileNamingFn().getRequirements());
+  } else {
+checkArgument(getFileNamingFn() == null,
+".withNaming() taking a function from DestinationT requires 
writeDynamic()");
+FileNaming constantFileNaming;
+if (getConstantFileNaming() == null) {
+  constantFileNaming = defaultNaming(
+  MoreObjects.firstNonNull(
+  getFilenamePrefix(), 
StaticValueProvider.of("output")),
+  MoreObjects.firstNonNull(getFilenameSuffix(), 
StaticValueProvider.of("")));
+} else {
+  checkArgument(
+  getFilenamePrefix() == null,
+  ".to(FileNaming) is incompatible with .withSuffix()");
+  checkArgument(
+  getFilenameSuffix() == null,
+  ".to(FileNaming) is incompatible with .withPrefix()");
+  constantFileNaming = getConstantFileNaming();
+}
+if (getOutputDirectory() != null) {
+  constantFileNaming = relativeFileNaming(getOutputDirectory(), 
constantFileNaming);
+}
+return fn(SerializableFunctions.constant(constantFileNaming));
+  }
+}
+
 @Override
 public WriteFilesResult<DestinationT> expand(PCollection<UserT> input) {
   Write.Builder<DestinationT, UserT> resolvedSpec =
       new AutoValue_FileIO_Write.Builder<>();
@@ -1172,52 +1219,7 @@ public class FileIO {
 resolvedSpec.setDestinationCoder((Coder) VoidCoder.of());
   }
 
-  // Resolve fileNamingFn
-  Contextful<Fn<DestinationT, FileNaming>> fileNamingFn;
-  if (getDynamic()) {
-checkArgument(
-getConstantFileNaming() == null,
-"when using writeDynamic(), must use versions of .withNaming() "
-+ "that take functions from DestinationT");
-checkArgument(getFilenamePrefix() == null, ".withPrefix() requires 
write()");
-checkArgument(getFilenameSuffix() == null, ".withSuffix() requires 
write()");
-checkArgument(
-getFileNamingFn() != null,
-"when using writeDynamic(), must 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4244

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Spark #1385

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[willy] [BEAM-3662] Port MongoDbIOTest off DoFnTester

[lukasz.gajowy] [BEAM-3456] Re-enable JDBC performance test

[lukasz.gajowy] fixup! [BEAM-3456] Re-enable JDBC performance test

[lukasz.gajowy] fixup! fixup! [BEAM-3456] Re-enable JDBC performance test

[rmannibucau] ensure pipeline options setup is using contextual classloader and 
not

[Pablo] Fixing nanosecond translation issue in Gauge Fn API translation.

[lcwik] Break fusion for a ParDo which has State or Timers

--
[...truncated 88.84 KB...]
'apache-beam-testing:bqjob_rad36a899f4515cc_0161bae7d388_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_rad36a899f4515cc_0161bae7d388_1 ... (0s) 
Current status: RUNNING 
Waiting on 
bqjob_rad36a899f4515cc_0161bae7d388_1 ... (0s) Current status: DONE   
2018-02-22 00:26:52,907 b3cd49ee MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 00:27:15,541 b3cd49ee MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-22 00:27:17,962 b3cd49ee MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.41s,  CPU:0.27s,  MaxMemory:29116kb 
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r1122956a2c390cab_0161bae83600_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r1122956a2c390cab_0161bae83600_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r1122956a2c390cab_0161bae83600_1 ... (0s) Current status: DONE   
2018-02-22 00:27:17,963 b3cd49ee MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 00:27:46,610 b3cd49ee MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-22 00:27:48,835 b3cd49ee MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.21s,  CPU:0.26s,  MaxMemory:29044kb 
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r796702148d5f6e88_0161bae8af51_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r796702148d5f6e88_0161bae8af51_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r796702148d5f6e88_0161bae8af51_1 ... (0s) Current status: DONE   
2018-02-22 00:27:48,835 b3cd49ee MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-22 00:28:09,658 b3cd49ee MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #5055

2018-02-21 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_TFRecordIOIT #168

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #941

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[willy] [BEAM-3662] Port MongoDbIOTest off DoFnTester

[lukasz.gajowy] [BEAM-3456] Re-enable JDBC performance test

[lukasz.gajowy] fixup! [BEAM-3456] Re-enable JDBC performance test

[lukasz.gajowy] fixup! fixup! [BEAM-3456] Re-enable JDBC performance test

[rmannibucau] ensure pipeline options setup is using contextual classloader and 
not

[Pablo] Fixing nanosecond translation issue in Gauge Fn API translation.

[lcwik] Break fusion for a ParDo which has State or Timers

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 14e4cb30105c4aeb43483155e1447ffaae2fbcb5 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 14e4cb30105c4aeb43483155e1447ffaae2fbcb5
Commit message: "[BEAM-3701][BEAM-3700] ensure pipeline options setup is using 
contextual classloader and not defaulting to beam-sdk classloader"
 > git rev-list --no-walk 80fa9f8d7515c68c8f7b444eee57478051333f60 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5193934910665894720.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins332062232234448249.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8554349035140425215.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7005271535697889579.sh
+ .env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in ./.env/lib/python2.7/site-packages
Requirement already up-to-date: pip in ./.env/lib/python2.7/site-packages
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8439710875570571107.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8786211004157958120.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 

Jenkins build is back to normal : beam_PerformanceTests_TextIOIT #185

2018-02-21 Thread Apache Jenkins Server

Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT #169

2018-02-21 Thread Apache Jenkins Server

Jenkins build is back to normal : beam_PerformanceTests_AvroIOIT #171

2018-02-21 Thread Apache Jenkins Server

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #5054

2018-02-21 Thread Apache Jenkins Server

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #4243

2018-02-21 Thread Apache Jenkins Server

Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6017

2018-02-21 Thread Apache Jenkins Server

[jira] [Commented] (BEAM-3700) PipelineOptionsFactory leaks memory

2018-02-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372153#comment-16372153
 ] 

Luke Cwik commented on BEAM-3700:
-

As a follow-up to [https://github.com/apache/beam/pull/4674], please consider 
making PipelineOptionsFactory an instance variable instead of a Cache that is 
contained within it.
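The instance-held shape suggested above can be sketched as follows (a minimal sketch with hypothetical names, not the actual Beam code): keeping the cache on an instance makes a public reset hook possible, and the whole cache becomes collectible once the instance is dropped.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: an instance-held cache with a public reset hook,
// instead of static state that can never be released.
public class OptionsCacheSketch {
    private final Map<Class<?>, Object> cache = new ConcurrentHashMap<>();

    // Compute-and-memoize, keyed by the options interface class.
    @SuppressWarnings("unchecked")
    public <T> T getOrCompute(Class<T> key, Function<Class<T>, T> factory) {
        return (T) cache.computeIfAbsent(key, k -> factory.apply((Class<T>) k));
    }

    // Public reset so integrations (e.g. runners) can release the memory.
    public void resetCache() {
        cache.clear();
    }

    public int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        OptionsCacheSketch sketch = new OptionsCacheSketch();
        sketch.getOrCompute(String.class, k -> "cached-proxy");
        System.out.println("before reset: " + sketch.size());
        sketch.resetCache();
        System.out.println("after reset: " + sketch.size());
    }
}
```

Because each caller owns its instance, a reset only affects that caller, which sidesteps the thread-safety concerns of clearing a shared static cache.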

> PipelineOptionsFactory leaks memory
> ---
>
> Key: BEAM-3700
> URL: https://issues.apache.org/jira/browse/BEAM-3700
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>
> PipelineOptionsFactory holds a lot of cached state but provides no way to reset 
> it. This task is about adding a public method to control that cache in 
> integrations (likely runners).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-3700) PipelineOptionsFactory leaks memory

2018-02-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372153#comment-16372153
 ] 

Luke Cwik edited comment on BEAM-3700 at 2/21/18 10:56 PM:
---

As a follow-up to [https://github.com/apache/beam/pull/4674], please consider 
making PipelineOptionsFactory an instance variable instead of a Cache that is 
contained within it.

 

This will help protect users from thread-safety issues around cache resets.


was (Author: lcwik):
As the follow up to [https://github.com/apache/beam/pull/4674,] please consider 
making PipelineOptionsFactory an instance variable instead of a Cache that is 
contained within.

> PipelineOptionsFactory leaks memory
> ---
>
> Key: BEAM-3700
> URL: https://issues.apache.org/jira/browse/BEAM-3700
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>
> PipelineOptionsFactory holds a lot of cached state but provides no way to reset 
> it. This task is about adding a public method to control that cache in 
> integrations (likely runners).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-3701) PipelineOptionsFactory doesn't use the right classloader for its SPI

2018-02-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-3701.
-
Resolution: Fixed

> PipelineOptionsFactory doesn't use the right classloader for its SPI
> 
>
> Key: BEAM-3701
> URL: https://issues.apache.org/jira/browse/BEAM-3701
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> PipelineOptionsFactory uses its own classloader to load SPI implementations; it 
> should use the TCCL to grab what is available in the launching context, enabling 
> actual pluggability in most environments instead of limiting users to a flat 
> classloader environment.
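The fix described above follows the usual TCCL-first pattern for SPI lookup. A generic sketch (the Registrar interface here is hypothetical, not Beam's actual registrar types):

```java
import java.util.ServiceLoader;

// Hypothetical sketch of TCCL-first SPI loading: prefer the thread context
// classloader so implementations packaged by the launching environment are
// visible, falling back to this class's own loader when no TCCL is set.
public class TcclSpiLoading {
    public interface Registrar {
        String name();
    }

    public static ClassLoader findClassLoader() {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        return tccl != null ? tccl : TcclSpiLoading.class.getClassLoader();
    }

    public static ServiceLoader<Registrar> loadRegistrars() {
        return ServiceLoader.load(Registrar.class, findClassLoader());
    }

    public static void main(String[] args) {
        int found = 0;
        for (Registrar r : loadRegistrars()) {
            found++;
        }
        // No META-INF/services entries exist in this standalone sketch,
        // so the loader finds nothing here.
        System.out.println("registrars found: " + found);
    }
}
```

In an embedded container or app server, the TCCL typically sees the application's jars while the SDK's own classloader sees only a flat boot classpath, which is why the fallback order matters.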



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3700) PipelineOptionsFactory leaks memory

2018-02-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3700:
---

Assignee: Romain Manni-Bucau  (was: Kenneth Knowles)

> PipelineOptionsFactory leaks memory
> ---
>
> Key: BEAM-3700
> URL: https://issues.apache.org/jira/browse/BEAM-3700
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>
> PipelineOptionsFactory holds a lot of cached state but provides no way to reset 
> it. This task is about adding a public method to control that cache in 
> integrations (likely runners).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: [BEAM-3701][BEAM-3700] ensure pipeline options setup is using contextual classloader and not defaulting to beam-sdk classloader

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 14e4cb30105c4aeb43483155e1447ffaae2fbcb5
Merge: 386d0b0 3637809
Author: Lukasz Cwik 
AuthorDate: Wed Feb 21 14:54:00 2018 -0800

[BEAM-3701][BEAM-3700] ensure pipeline options setup is using contextual 
classloader and not defaulting to beam-sdk classloader

 .../apache/beam/sdk/options/PipelineOptions.java   |   5 +-
 .../beam/sdk/options/PipelineOptionsFactory.java   | 345 +++--
 .../beam/sdk/options/ProxyInvocationHandler.java   |  18 +-
 .../sdk/options/PipelineOptionsFactoryTest.java|  40 ++-
 .../sdk/options/ProxyInvocationHandlerTest.java|   4 +-
 .../sdk/testing/InterceptingUrlClassLoader.java|  13 +-
 6 files changed, 251 insertions(+), 174 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] branch master updated (386d0b0 -> 14e4cb3)

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 386d0b0  [BEAM-3662] Port MongoDbIOTest off DoFnTester
 add 3637809  ensure pipeline options setup is using contextual classloader 
and not defaulting to beam-sdk classloader
 new 14e4cb3  [BEAM-3701][BEAM-3700] ensure pipeline options setup is using 
contextual classloader and not defaulting to beam-sdk classloader

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/beam/sdk/options/PipelineOptions.java   |   5 +-
 .../beam/sdk/options/PipelineOptionsFactory.java   | 345 +++--
 .../beam/sdk/options/ProxyInvocationHandler.java   |  18 +-
 .../sdk/options/PipelineOptionsFactoryTest.java|  40 ++-
 .../sdk/options/ProxyInvocationHandlerTest.java|   4 +-
 .../sdk/testing/InterceptingUrlClassLoader.java|  13 +-
 6 files changed, 251 insertions(+), 174 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[jira] [Commented] (BEAM-3681) S3Filesystem fails when copying empty files

2018-02-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16372094#comment-16372094
 ] 

Ismaël Mejía commented on BEAM-3681:


In addition to this, the current copy method is slightly inefficient: it uses the 
more robust multipart API, which supports files bigger than 5 GB, but pays the 
price of requiring 3 requests to copy every smaller file instead of the single 
request needed by the basic s3client.copyObject API.
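A hedged sketch of the size-based strategy implied above (the 5 GB threshold is S3's single-request copy limit; the class and method names are illustrative, not the S3FileSystem code):

```java
// Hypothetical sketch: choose the single-request CopyObject call for
// objects under S3's 5 GB single-copy limit (1 request), and the multipart
// flow (initiate + copy-part(s) + complete, i.e. >= 3 requests) above it.
public class S3CopyStrategy {
    // S3 allows at most 5 GB per single CopyObject request.
    static final long SINGLE_COPY_LIMIT_BYTES = 5L * 1024 * 1024 * 1024;

    public static boolean useMultipartCopy(long objectSizeBytes) {
        return objectSizeBytes > SINGLE_COPY_LIMIT_BYTES;
    }

    // Rough request count per copy under this strategy,
    // assuming a single copy-part for the multipart case.
    public static int requestsForCopy(long objectSizeBytes) {
        return useMultipartCopy(objectSizeBytes) ? 3 : 1;
    }

    public static void main(String[] args) {
        System.out.println("1 MB file requests: "
            + requestsForCopy(1024L * 1024));
        System.out.println("6 GB file requests: "
            + requestsForCopy(6L * 1024 * 1024 * 1024));
    }
}
```

Such a strategy would also sidestep the empty-shard failure, since a zero-byte object always falls on the single-request side of the threshold.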

> S3Filesystem fails when copying empty files
> ---
>
> Key: BEAM-3681
> URL: https://issues.apache.org/jira/browse/BEAM-3681
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.3.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.4.0
>
>
> When executing a simple write to S3 with the direct runner, it sometimes breaks 
> when it ends up trying to write 'empty' shards to S3.
> {code:java}
> Pipeline pipeline = Pipeline.create(options);
> pipeline
>  .apply("CreateSomeData", Create.of("1", "2", "3"))
>  .apply("WriteToFS", TextIO.write().to(options.getOutput()));
> pipeline.run();{code}
> The related exception is:
> {code:java}
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>     at 
> org.apache.beam.samples.ingest.amazon.IngestToS3.main(IngestToS3.java:82)
> Caused by: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:563)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
>     at 
> org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
>     at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you 
> provided was not well-formed or did not validate against our published schema 
> (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>     at 
> 

[jira] [Updated] (BEAM-3681) S3Filesystem fails when copying empty files

2018-02-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3681:
---
Description: 
When executing a simple write to S3 with the direct runner, it sometimes breaks 
when it ends up trying to write 'empty' shards to S3.
{code:java}
Pipeline pipeline = Pipeline.create(options);
pipeline
 .apply("CreateSomeData", Create.of("1", "2", "3"))
 .apply("WriteToFS", TextIO.write().to(options.getOutput()));
pipeline.run();{code}
The related exception is:
{code:java}
Exception in thread "main" 
org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was not 
well-formed or did not validate against our published schema (Service: Amazon 
S3; Status Code: 400; Error Code: MalformedXML; Request ID: 402E99C2F602AD09; 
S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=), 
S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
    at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
    at 
org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
    at org.apache.beam.samples.ingest.amazon.IngestToS3.main(IngestToS3.java:82)
Caused by: java.io.IOException: 
com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was not 
well-formed or did not validate against our published schema (Service: Amazon 
S3; Status Code: 400; Error Code: MalformedXML; Request ID: 402E99C2F602AD09; 
S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=), 
S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
    at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:563)
    at 
org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
    at 
org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
    at 
org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
    at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you 
provided was not well-formed or did not validate against our published schema 
(Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
402E99C2F602AD09; S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=), 
S3 Extended Request ID: 
SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
    at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
    at 
com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3065)
    at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:561)
    at 
org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
    at 
org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
    at 
org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
    at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748){code}
 After further investigation I found that the output of FileBasedSink can 
produce empty 

[jira] [Updated] (BEAM-3681) S3Filesystem fails when copying empty files

2018-02-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3681:
---
Summary: S3Filesystem fails when copying empty files  (was: Amazon S3 write 
breaks randomly)

> S3Filesystem fails when copying empty files
> ---
>
> Key: BEAM-3681
> URL: https://issues.apache.org/jira/browse/BEAM-3681
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.3.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.4.0
>
>
> When executing a simple write to S3 with the direct runner, it sometimes breaks 
> when it ends up trying to write 'empty' shards to S3.
> {code:java}
> Pipeline pipeline = Pipeline.create(options);
> pipeline
>  .apply("CreateSomeData", Create.of("1", "2", "3"))
>  .apply("WriteToFS", TextIO.write().to(options.getOutput()));
> pipeline.run();{code}
> The related exception is:
> {code:java}
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>     at 
> org.apache.beam.samples.ingest.amazon.IngestToS3.main(IngestToS3.java:82)
> Caused by: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:563)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
>     at 
> org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
>     at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you 
> provided was not well-formed or did not validate against our published schema 
> (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3065)
>     at 

[jira] [Updated] (BEAM-3681) Amazon S3 write breaks randomly

2018-02-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3681:
---
Affects Version/s: (was: 2.4.0)

> Amazon S3 write breaks randomly
> ---
>
> Key: BEAM-3681
> URL: https://issues.apache.org/jira/browse/BEAM-3681
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.3.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.4.0
>
>
> When executing a simple write to S3 with the direct runner, it sometimes breaks 
> when it ends up trying to write 'empty' shards to S3.
> {code:java}
> Pipeline pipeline = Pipeline.create(options);
> pipeline
>  .apply("CreateSomeData", Create.of("1", "2", "3"))
>  .apply("WriteToFS", TextIO.write().to(options.getOutput()));
> pipeline.run();{code}
> The related exception is:
> {code:java}
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>     at 
> org.apache.beam.samples.ingest.amazon.IngestToS3.main(IngestToS3.java:82)
> Caused by: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:563)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
>     at 
> org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
>     at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you 
> provided was not well-formed or did not validate against our published schema 
> (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3065)
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:561)
>     at 
> 

[jira] [Updated] (BEAM-3681) Amazon S3 write breaks randomly

2018-02-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3681:
---
Fix Version/s: 2.4.0

> Amazon S3 write breaks randomly
> ---
>
> Key: BEAM-3681
> URL: https://issues.apache.org/jira/browse/BEAM-3681
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-aws
>Affects Versions: 2.3.0
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.4.0
>
>
> When executing a simple write to S3 with the direct runner, it sometimes breaks 
> when it ends up trying to write 'empty' shards to S3.
> {code:java}
> Pipeline pipeline = Pipeline.create(options);
> pipeline
>  .apply("CreateSomeData", Create.of("1", "2", "3"))
>  .apply("WriteToFS", TextIO.write().to(options.getOutput()));
> pipeline.run();{code}
> The related exception is:
> {code:java}
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:342)
>     at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:312)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:206)
>     at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:62)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
>     at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
>     at 
> org.apache.beam.samples.ingest.amazon.IngestToS3.main(IngestToS3.java:82)
> Caused by: java.io.IOException: 
> com.amazonaws.services.s3.model.AmazonS3Exception: The XML you provided was 
> not well-formed or did not validate against our published schema (Service: 
> Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:563)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$copy$4(S3FileSystem.java:495)
>     at 
> org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$callTasks$8(S3FileSystem.java:642)
>     at 
> org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:100)
>     at 
> java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
> Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The XML you 
> provided was not well-formed or did not validate against our published schema 
> (Service: Amazon S3; Status Code: 400; Error Code: MalformedXML; Request ID: 
> 402E99C2F602AD09; S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=),
>  S3 Extended Request ID: 
> SDdU8AqW2mfZuG1xcKUSNeHiR0IUKcRCpZ1Wjx7sAor1CdYf8f+0dDIcQpvr3GXgqwsyk5PGWVE=
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
>     at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272)
>     at 
> com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3065)
>     at org.apache.beam.sdk.io.aws.s3.S3FileSystem.copy(S3FileSystem.java:561)
>     at 
> 

Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #6016

2018-02-21 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-3662) Port MongoDbIOTest off DoFnTester

2018-02-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-3662.
-
   Resolution: Fixed
Fix Version/s: 2.4.0

> Port MongoDbIOTest off DoFnTester
> -
>
> Key: BEAM-3662
> URL: https://issues.apache.org/jira/browse/BEAM-3662
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-java-mongodb
>Reporter: Kenneth Knowles
>Assignee: Willy Lulciuc
>Priority: Major
>  Labels: beginner, newbie, starter
> Fix For: 2.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (8be36b9 -> 386d0b0)

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 8be36b9  [BEAM-3565] Break fusion for a ParDo which has State or Timers
 add 6ef6217  [BEAM-3662] Port MongoDbIOTest off DoFnTester
 new 386d0b0  [BEAM-3662] Port MongoDbIOTest off DoFnTester

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/beam/sdk/io/mongodb/MongoDbIOTest.java  | 25 --
 1 file changed, 19 insertions(+), 6 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] 01/01: [BEAM-3662] Port MongoDbIOTest off DoFnTester

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 386d0b06902338c7d46703c5add6ab6223d2cabd
Merge: 8be36b9 6ef6217
Author: Lukasz Cwik 
AuthorDate: Wed Feb 21 13:50:55 2018 -0800

[BEAM-3662] Port MongoDbIOTest off DoFnTester

 .../apache/beam/sdk/io/mongodb/MongoDbIOTest.java  | 25 --
 1 file changed, 19 insertions(+), 6 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] 01/01: Merge pull request #4721 from [BEAM-1866] Fixing nanosecond translation issue in Gauge Fn API translation.

2018-02-21 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit ce0c92e0a15b9b44e53161827afe213eb18e18b4
Merge: 80fa9f8 6cf63af
Author: Robert Bradshaw 
AuthorDate: Wed Feb 21 11:46:29 2018 -0800

Merge pull request #4721 from [BEAM-1866] Fixing nanosecond translation 
issue in Gauge Fn API translation.

[BEAM-1866] Fixing nanosecond translation issue in Gauge Fn API translation.

 sdks/python/apache_beam/metrics/cells.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
To stop receiving notification emails like this one, please contact
rober...@apache.org.


[beam] branch master updated: Break fusion for a ParDo which has State or Timers

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new a46cfbf  Break fusion for a ParDo which has State or Timers
 new 8be36b9  [BEAM-3565] Break fusion for a ParDo which has State or Timers
a46cfbf is described below

commit a46cfbf8adc13318e4e372f079c54b25b5cdbff2
Author: Thomas Groh 
AuthorDate: Fri Feb 16 11:29:22 2018 -0800

Break fusion for a ParDo which has State or Timers

Because these are provided in a key-partitioned manner, the upstream
stage has to preserve keys for this to be executable. This could be
checked, but this is a simpler method to break fusion when it is known
it will be appropriate.
---
 .../graph/GreedyPCollectionFusers.java |  41 +++---
 .../graph/GreedilyFusedExecutableStageTest.java| 122 ++
 .../graph/GreedyPipelineFuserTest.java | 139 +
 3 files changed, 288 insertions(+), 14 deletions(-)

diff --git a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPCollectionFusers.java b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPCollectionFusers.java
index da2c92b..992b463 100644
--- a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPCollectionFusers.java
+++ b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/graph/GreedyPCollectionFusers.java
@@ -21,6 +21,7 @@ package org.apache.beam.runners.core.construction.graph;
 import static com.google.common.base.Preconditions.checkArgument;
 
 import com.google.common.collect.ImmutableMap;
+import com.google.protobuf.InvalidProtocolBufferException;
 import java.util.Map;
 import java.util.Optional;
 import org.apache.beam.model.pipeline.v1.RunnerApi.Environment;
@@ -131,27 +132,39 @@ class GreedyPCollectionFusers {
       // The PCollection's producer and this ParDo execute in different environments, so fusion
       // is never possible.
       return false;
-    } else if (!pipeline.getSideInputs(parDo).isEmpty()) {
-      // At execution time, a Runner is required to only provide inputs to a PTransform that, at the
-      // time the PTransform processes them, the associated window is ready in all side inputs that
-      // the PTransform consumes. For an arbitrary stage, it is significantly complex for the runner
-      // to determine this for each input. As a result, we break fusion to simplify this inspection.
-      // In general, a ParDo which consumes side inputs cannot be fused into an executable subgraph
-      // alongside any transforms which are upstream of any of its side inputs.
+    }
+    if (!pipeline.getSideInputs(parDo).isEmpty()) {
+      // At execution time, a Runner is required to only provide inputs to a PTransform that, at
+      // the time the PTransform processes them, the associated window is ready in all side inputs
+      // that the PTransform consumes. For an arbitrary stage, it is significantly complex for the
+      // runner to determine this for each input. As a result, we break fusion to simplify this
+      // inspection. In general, a ParDo which consumes side inputs cannot be fused into an
+      // executable stage alongside any transforms which are upstream of any of its side inputs.
       return false;
+    } else {
+      try {
+        ParDoPayload payload =
+            ParDoPayload.parseFrom(parDo.getTransform().getSpec().getPayload());
+        if (payload.getStateSpecsCount() > 0 || payload.getTimerSpecsCount() > 0) {
+          // Inputs to a ParDo that uses State or Timers must be key-partitioned, and elements for
+          // a key must execute serially. To avoid checking if the rest of the stage is
+          // key-partitioned and preserves keys, these ParDos do not fuse into an existing stage.
+          return false;
+        }
+      } catch (InvalidProtocolBufferException e) {
+        throw new IllegalArgumentException(e);
+      }
     }
     return true;
   }
 
   private static boolean parDoCompatibility(
       PTransformNode parDo, PTransformNode other, QueryablePipeline pipeline) {
-    if (!pipeline.getSideInputs(parDo).isEmpty()) {
-      // This is a convenience rather than a strict requirement. In general, a ParDo that consumes
-      // side inputs can be fused with other transforms in the same environment which are not
-      // upstream of any of the side inputs.
-      return false;
-    }
-    return compatibleEnvironments(parDo, other, pipeline);
+    // This is a convenience rather than a strict requirement. In general, a ParDo that consumes
+    // side inputs can be fused with other transforms in the same environment which are not
+    // 

[beam] branch master updated (ce0c92e -> 89afba1)

2018-02-21 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from ce0c92e  Merge pull request #4721 from [BEAM-1866] Fixing nanosecond 
translation issue in Gauge Fn API translation.
 add ae189e9  [BEAM-3456] Re-enable JDBC performance test
 add 802c4cd  fixup! [BEAM-3456] Re-enable JDBC performance test
 add 66c3755  fixup! fixup! [BEAM-3456] Re-enable JDBC performance test
 new 89afba1  Merge pull request #4714: [BEAM-3456] Re-enable JDBC 
performance test

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../jenkins/job_beam_PerformanceTests_JDBC.groovy  | 76 +-
 1 file changed, 47 insertions(+), 29 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[beam] 01/01: Merge pull request #4714: [BEAM-3456] Re-enable JDBC performance test

2018-02-21 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 89afba1f7120f0b0d6ebb315ef68e4cfd54373a7
Merge: ce0c92e 66c3755
Author: Chamikara Jayalath 
AuthorDate: Wed Feb 21 13:06:47 2018 -0800

Merge pull request #4714: [BEAM-3456] Re-enable JDBC performance test

 .../jenkins/job_beam_PerformanceTests_JDBC.groovy  | 76 +-
 1 file changed, 47 insertions(+), 29 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


Jenkins build is back to normal : beam_PerformanceTests_JDBC #243

2018-02-21 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #6015

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #940

2018-02-21 Thread Apache Jenkins Server
See 


Changes:

[Pablo] Plumbing Gauge metrics through the Fn API.

[dkulp] [BEAM-3640] part 2 - add

[kenn] Spotless gradle: remove extraneous globbing of all java everywhere

[kenn] Explicitly exclude some troublesome optional deps from

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 80fa9f8d7515c68c8f7b444eee57478051333f60 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 80fa9f8d7515c68c8f7b444eee57478051333f60
Commit message: "Merge pull request #4702: [BEAM-3715] Explicitly exclude some 
troublesome optional deps"
 > git rev-list --no-walk a3a6ca043abd41fbcbbbfc91449aabcec0ff1df3 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins751381016725515607.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3553405013280975861.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins219803285062055140.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3554411304159112629.sh
+ .env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in ./.env/lib/python2.7/site-packages
Requirement already up-to-date: pip in ./.env/lib/python2.7/site-packages
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1355476054768954384.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8877792796575892157.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
:122:
 InsecurePlatformWarning: A true 

[beam] branch master updated (80fa9f8 -> ce0c92e)

2018-02-21 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 80fa9f8  Merge pull request #4702: [BEAM-3715] Explicitly exclude some 
troublesome optional deps
 add 6cf63af  Fixing nanosecond translation issue in Gauge Fn API 
translation.
 new ce0c92e  Merge pull request #4721 from [BEAM-1866] Fixing nanosecond 
translation issue in Gauge Fn API translation.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/metrics/cells.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
To stop receiving notification emails like this one, please contact
rober...@apache.org.


Jenkins build is back to normal : beam_PostCommit_Python_Verify #4274

2018-02-21 Thread Apache Jenkins Server
See 




[beam] branch master updated (0a638ae -> 80fa9f8)

2018-02-21 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 0a638ae  Merge pull request #4701: Spotless gradle: only check current 
module source
 add 7a68cdd  Explicitly exclude some troublesome optional deps from 
hadoop-input-format
 new 80fa9f8  Merge pull request #4702: [BEAM-3715] Explicitly exclude some 
troublesome optional deps

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/java/io/hadoop-input-format/build.gradle | 4 
 1 file changed, 4 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


[beam] 01/01: Merge pull request #4702: [BEAM-3715] Explicitly exclude some troublesome optional deps

2018-02-21 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 80fa9f8d7515c68c8f7b444eee57478051333f60
Merge: 0a638ae 7a68cdd
Author: Kenn Knowles 
AuthorDate: Wed Feb 21 09:51:37 2018 -0800

Merge pull request #4702: [BEAM-3715] Explicitly exclude some troublesome 
optional deps

 sdks/java/io/hadoop-input-format/build.gradle | 4 
 1 file changed, 4 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


[jira] [Updated] (BEAM-3726) Kinesis Reader: java.lang.IllegalArgumentException: Attempting to move backwards

2018-02-21 Thread Pawel Bartoszek (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pawel Bartoszek updated BEAM-3726:
--
Description: 
When the job is restored from a savepoint, the Kinesis Reader sometimes throws 
{{java.lang.IllegalArgumentException: Attempting to move backwards}}

After a few job restarts caused by the same exception, the job finally starts 
up and continues to run with no further problems.

 
{code:java}
java.lang.IllegalArgumentException: Attempting to move backwards
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.beam.sdk.util.MovingFunction.flush(MovingFunction.java:97)
at org.apache.beam.sdk.util.MovingFunction.add(MovingFunction.java:114)
at org.apache.beam.sdk.io.kinesis.KinesisReader.advance(KinesisReader.java:137)
at org.apache.beam.runners.flink.metrics.ReaderInvocationUtil.invokeAdvance(ReaderInvocationUtil.java:67)
at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:264)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:39)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702){code}
 

Kinesis Reader transformation configuration:
{code:java}
pipeline.apply("KINESIS READER", KinesisIO.read()
.withStreamName(streamName)
.withInitialPositionInStream(InitialPositionInStream.LATEST)
.withAWSClientsProvider(awsAccessKey, awsSecretKey, EU_WEST_1)){code}
 

  was:
When the job is restored from a savepoint, the Kinesis Reader sometimes throws 
{{java.lang.IllegalArgumentException: Attempting to move backwards}}

After a few job restarts caused by the same exception, the job finally starts 
up and continues to run with no further problems.

 
{code:java}
java.lang.IllegalArgumentException: Attempting to move backwards
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.beam.sdk.util.MovingFunction.flush(MovingFunction.java:97)
at org.apache.beam.sdk.util.MovingFunction.add(MovingFunction.java:114)
at org.apache.beam.sdk.io.kinesis.KinesisReader.advance(KinesisReader.java:137)
at org.apache.beam.runners.flink.metrics.ReaderInvocationUtil.invokeAdvance(ReaderInvocationUtil.java:67)
at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:264)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:39)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702){code}


> Kinesis Reader: java.lang.IllegalArgumentException: Attempting to move 
> backwards
> 
>
> Key: BEAM-3726
> URL: https://issues.apache.org/jira/browse/BEAM-3726
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kinesis
>Affects Versions: 2.2.0
>Reporter: Pawel Bartoszek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>
> When the job is restored from a savepoint, the Kinesis Reader sometimes throws 
> {{java.lang.IllegalArgumentException: Attempting to move backwards}}
> After a few job restarts caused by the same exception, the job finally 
> starts up and continues to run with no further problems.
>  
> {code:java}
> java.lang.IllegalArgumentException: Attempting to move backwards
> at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
> at org.apache.beam.sdk.util.MovingFunction.flush(MovingFunction.java:97)
> at org.apache.beam.sdk.util.MovingFunction.add(MovingFunction.java:114)
> at org.apache.beam.sdk.io.kinesis.KinesisReader.advance(KinesisReader.java:137)
> at org.apache.beam.runners.flink.metrics.ReaderInvocationUtil.invokeAdvance(ReaderInvocationUtil.java:67)
> at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:264)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
> at 
> 

[jira] [Assigned] (BEAM-3725) Add a test rule which remembers and resets the thread context class loader

2018-02-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-3725:
---

Assignee: Romain Manni-Bucau

> Add a test rule which remembers and resets the thread context class loader
> --
>
> Key: BEAM-3725
> URL: https://issues.apache.org/jira/browse/BEAM-3725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Romain Manni-Bucau
>Priority: Minor
>  Labels: newbie, starter
>
> We have several tests that ensure the proper usage of the TCCL but they can 
> be brittle if the test is written improperly.
>  
> This task is to create a JUnit test rule and to replace the existing 
> remember/reset TCCL usages in our tests.
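The remember/reset pattern the issue asks to wrap in a JUnit rule can be sketched as follows. This is an illustration only, not the actual Beam rule; the class and method names (`RestoreTcclRule`, `runRestoringTccl`) are invented:

```java
// Hypothetical sketch of the remember/reset pattern such a JUnit rule would
// encapsulate (names are invented for illustration).
public class RestoreTcclRule {
    // Run a test body, restoring the thread context class loader afterwards.
    public static void runRestoringTccl(Runnable testBody) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        try {
            testBody.run();
        } finally {
            // Reset the TCCL even if the body replaced it or threw.
            thread.setContextClassLoader(original);
        }
    }

    public static void main(String[] args) {
        ClassLoader before = Thread.currentThread().getContextClassLoader();
        // An "improperly written" test body that swaps the TCCL and never
        // puts it back; the wrapper restores it regardless.
        runRestoringTccl(() ->
            Thread.currentThread().setContextClassLoader(new ClassLoader() {}));
        System.out.println(before == Thread.currentThread().getContextClassLoader());
    }
}
```

In an actual JUnit `TestRule`, the same try/finally would live in the `Statement` returned by `apply`, so every test method gets the save/restore automatically.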



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3725) Add a test rule which remembers and resets the thread context class loader

2018-02-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371728#comment-16371728
 ] 

Luke Cwik commented on BEAM-3725:
-

Not urgent.

> Add a test rule which remembers and resets the thread context class loader
> --
>
> Key: BEAM-3725
> URL: https://issues.apache.org/jira/browse/BEAM-3725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: newbie, starter
>
> We have several tests that ensure the proper usage of the TCCL but they can 
> be brittle if the test is written improperly.
>  
> This task is to create a JUnit test rule and to replace the existing 
> remember/reset TCCL usages in our tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3726) Kinesis Reader: java.lang.IllegalArgumentException: Attempting to move backwards

2018-02-21 Thread Pawel Bartoszek (JIRA)
Pawel Bartoszek created BEAM-3726:
-

 Summary: Kinesis Reader: java.lang.IllegalArgumentException: 
Attempting to move backwards
 Key: BEAM-3726
 URL: https://issues.apache.org/jira/browse/BEAM-3726
 Project: Beam
  Issue Type: Improvement
  Components: io-java-kinesis
Affects Versions: 2.2.0
Reporter: Pawel Bartoszek
Assignee: Jean-Baptiste Onofré


When the job is restored from a savepoint, the Kinesis Reader sometimes throws 
{{java.lang.IllegalArgumentException: Attempting to move backwards}}

After a few job restarts caused by the same exception, the job finally starts 
up and continues to run with no further problems.

 
{code:java}
java.lang.IllegalArgumentException: Attempting to move backwards
at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
at org.apache.beam.sdk.util.MovingFunction.flush(MovingFunction.java:97)
at org.apache.beam.sdk.util.MovingFunction.add(MovingFunction.java:114)
at org.apache.beam.sdk.io.kinesis.KinesisReader.advance(KinesisReader.java:137)
at org.apache.beam.runners.flink.metrics.ReaderInvocationUtil.invokeAdvance(ReaderInvocationUtil.java:67)
at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:264)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:39)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702){code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] 01/01: Merge pull request #4701: Spotless gradle: only check current module source

2018-02-21 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 0a638aee8f7e296e8a9603d66b1b16a22d5afaa0
Merge: 07d97cb 910865e
Author: Kenn Knowles 
AuthorDate: Wed Feb 21 09:37:53 2018 -0800

Merge pull request #4701: Spotless gradle: only check current module source

 build_rules.gradle | 9 -
 1 file changed, 9 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


[jira] [Commented] (BEAM-3725) Add a test rule which remembers and resets the thread context class loader

2018-02-21 Thread Romain Manni-Bucau (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371725#comment-16371725
 ] 

Romain Manni-Bucau commented on BEAM-3725:
--

[~lcwik] if not urgent (like can wait 1 or 2 weeks) sure

> Add a test rule which remembers and resets the thread context class loader
> --
>
> Key: BEAM-3725
> URL: https://issues.apache.org/jira/browse/BEAM-3725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: newbie, starter
>
> We have several tests that ensure the proper usage of the TCCL but they can 
> be brittle if the test is written improperly.
>  
> This task is to create a JUnit test rule and to replace the existing 
> remember/reset TCCL usages in our tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3701) PipelineOptionsFactory doesn't use the right classloader for its SPI

2018-02-21 Thread Romain Manni-Bucau (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-3701:
-
Fix Version/s: 2.4.0

> PipelineOptionsFactory doesn't use the right classloader for its SPI
> 
>
> Key: BEAM-3701
> URL: https://issues.apache.org/jira/browse/BEAM-3701
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> PipelineOptionsFactory uses its own classloader to load its SPI; it should use 
> the TCCL to pick up what is available in the launching context, enabling actual 
> pluggability in most environments rather than limiting users to a flat 
> classloader environment.
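To illustrate the classloader choice the issue describes, here is a minimal sketch. It is not the Beam fix itself; `Registrar` is an invented stand-in for the SPI interfaces `PipelineOptionsFactory` loads:

```java
import java.util.ServiceLoader;

// Sketch of loading an SPI with a fixed classloader vs. the thread context
// classloader (TCCL) of the launching context. `Registrar` is hypothetical.
public class SpiLoading {
    public interface Registrar {}

    public static void main(String[] args) {
        // Flat-classpath style: only sees providers visible to this
        // class's own classloader.
        ServiceLoader<Registrar> fixed =
            ServiceLoader.load(Registrar.class, SpiLoading.class.getClassLoader());

        // TCCL style: sees providers visible in the launching context, which
        // a container or runner may have set to a richer classloader.
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        ServiceLoader<Registrar> contextual = ServiceLoader.load(Registrar.class, tccl);

        int count = 0;
        for (Registrar r : contextual) {
            count++; // no providers are registered in this sketch
        }
        System.out.println("providers=" + count);
    }
}
```

In a plain JVM the two loaders see the same classes; the difference only shows up in containers or embedded launchers where the TCCL points at a wider classpath than the SDK jar's own loader.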



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3479) Add a regression test for the DoFn classloader selection

2018-02-21 Thread Romain Manni-Bucau (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-3479:
-
Fix Version/s: 2.4.0

> Add a regression test for the DoFn classloader selection
> 
>
> Key: BEAM-3479
> URL: https://issues.apache.org/jira/browse/BEAM-3479
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Follow up task after https://github.com/apache/beam/pull/4235 merge. This 
> task is about ensuring we test that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3700) PipelineOptionsFactory leaks memory

2018-02-21 Thread Romain Manni-Bucau (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-3700:
-
Fix Version/s: 2.4.0

> PipelineOptionsFactory leaks memory
> ---
>
> Key: BEAM-3700
> URL: https://issues.apache.org/jira/browse/BEAM-3700
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Kenneth Knowles
>Priority: Major
> Fix For: 2.4.0
>
>
> PipelineOptionsFactory has a lot of cached state but no way to reset it. This 
> task is about adding a public method to control the cache from integrations 
> (most likely runners).
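A minimal sketch of the kind of reset hook being proposed. The names (`CachingFactory`, `resetCache`) are hypothetical; this is not the real `PipelineOptionsFactory` API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a static cache plus a public method that
// integrations such as runners could call to release cached references.
public class CachingFactory {
    private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

    public static Object getOrCompute(String key) {
        return CACHE.computeIfAbsent(key, k -> new Object());
    }

    public static void resetCache() {
        CACHE.clear(); // lets previously cached values become collectable
    }

    public static void main(String[] args) {
        Object first = getOrCompute("options");
        Object cached = getOrCompute("options");
        resetCache();
        Object fresh = getOrCompute("options");
        // Same instance before the reset, a new instance after it.
        System.out.println((first == cached) + " " + (first == fresh));
    }
}
```

Without such a hook, anything reachable from the static cache stays live for the lifetime of the classloader, which is the leak the issue describes.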



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3409) Unexpected behavior of DoFn teardown method running in unit tests

2018-02-21 Thread Romain Manni-Bucau (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau updated BEAM-3409:
-
Fix Version/s: 2.4.0

> Unexpected behavior of DoFn teardown method running in unit tests 
> --
>
> Key: BEAM-3409
> URL: https://issues.apache.org/jira/browse/BEAM-3409
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.3.0
>Reporter: Alexey Romanenko
>Assignee: Romain Manni-Bucau
>Priority: Minor
>  Labels: test
> Fix For: 2.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Writing a unit test, I found strange behaviour of the teardown method of a 
> DoFn implementation when running it in unit tests using TestPipeline.
> To be more precise, the test does not wait until the teardown() method has 
> finished; it exits after about 1 sec (on my machine) even if teardown should 
> take longer (a very simple example: running an infinite loop inside the 
> method, or putting the thread to sleep). At the same time, when I run the 
> same code from main() with an ordinary Pipeline and the direct runner, it 
> works as expected: the teardown() method runs to completion regardless of 
> how much time it takes.
> I created two test cases to reproduce this issue - the first one runs with 
> main() and the second one runs with junit. They use the same DoFn 
> implementation (class LongTearDownFn) and expect that the teardown method 
> will run for at least SLEEP_TIME ms. When run as a junit test, that is not 
> the case (see output log).
> - run with main()
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/main/java/TearDown.java
> - run with junit
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/test/java/TearDownTest.java





[jira] [Commented] (BEAM-3725) Add a test rule which remembers and resets the thread context class loader

2018-02-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371719#comment-16371719
 ] 

Luke Cwik commented on BEAM-3725:
-

[~rmannibucau], I thought you might be interested in doing this since you have 
lately been doing the most work on improving classloader usage.

> Add a test rule which remembers and resets the thread context class loader
> --
>
> Key: BEAM-3725
> URL: https://issues.apache.org/jira/browse/BEAM-3725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: newbie, starter
>
> We have several tests that ensure the proper usage of the TCCL but they can 
> be brittle if the test is written improperly.
>  
> This task is to create a JUnit test rule and to replace the existing 
> remember/reset TCCL usages in our tests.





[jira] [Created] (BEAM-3725) Add a test rule which remembers and resets the thread context class loader

2018-02-21 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-3725:
---

 Summary: Add a test rule which remembers and resets the thread 
context class loader
 Key: BEAM-3725
 URL: https://issues.apache.org/jira/browse/BEAM-3725
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik


We have several tests that ensure the proper usage of the TCCL but they can be 
brittle if the test is written improperly.

 

This task is to create a JUnit test rule and to replace the existing 
remember/reset TCCL usages in our tests.
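The remember/reset idiom such a rule would encapsulate looks roughly like this. Since the thread context class loader is a Java concept, this hedged Python sketch uses sys.path as a stand-in for the mutable global state being saved and restored:

```python
import sys
from contextlib import contextmanager

@contextmanager
def remembered_sys_path():
    # Remember the global state, let the test mutate it, then reset it even
    # if the test fails -- the same guarantee the proposed JUnit rule would
    # give for the thread context class loader.
    saved = list(sys.path)
    try:
        yield
    finally:
        sys.path[:] = saved

before = list(sys.path)
with remembered_sys_path():
    sys.path.insert(0, "/tmp/hypothetical-test-dir")
assert sys.path == before
```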





[jira] [Commented] (BEAM-1442) Performance improvement of the Python DirectRunner

2018-02-21 Thread Konstantinos Katsiapis (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371695#comment-16371695
 ] 

Konstantinos Katsiapis commented on BEAM-1442:
--

Is this now fixed (and expected to land in Beam 2.4)?

> Performance improvement of the Python DirectRunner
> --
>
> Key: BEAM-1442
> URL: https://issues.apache.org/jira/browse/BEAM-1442
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Charles Chen
>Priority: Major
>  Labels: gsoc2017, mentor, python
>
> The DirectRunners for Python and Java are intended to act as policy enforcers 
> and correctness checkers for Beam pipelines, but some users also run real data 
> processing tasks in them.
> Currently, the Python Direct Runner has less-than-great performance, although 
> some work has gone into improving it. There are more opportunities for 
> improvement.
> Skills for this project:
> * Python
> * Cython (nice to have)
> * Working through the Beam getting started materials (nice to have)
> To start figuring out this problem, it is advisable to run a simple pipeline, 
> and study the `Pipeline.run` and `DirectRunner.run` methods. Ask questions 
> directly on JIRA.





[beam] 01/01: [BEAM-3640] part 2 - add STATIC_INIT,INSTANCE_INIT,ENUM_DEF,INTERFACE…

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 07d97cb29e1fb9576d849b67cc2054de08239423
Merge: b95d744 bb719c8
Author: Lukasz Cwik 
AuthorDate: Wed Feb 21 09:14:58 2018 -0800

[BEAM-3640] part 2 - add STATIC_INIT,INSTANCE_INIT,ENUM_DEF,INTERFACE…

 .../main/java/org/apache/beam/examples/complete/AutoComplete.java  | 3 +++
 .../java/org/apache/beam/examples/cookbook/TriggerExample.java | 1 +
 .../src/main/java/org/apache/beam/examples/snippets/Snippets.java  | 2 ++
 .../runners/apex/translation/operators/ApexProcessFnOperator.java  | 2 +-
 .../runners/core/construction/WindowingStrategyTranslation.java| 1 +
 .../main/java/org/apache/beam/runners/direct/DirectRegistrar.java  | 1 +
 .../java/org/apache/beam/runners/spark/io/EmptyCheckpointMark.java | 2 +-
 sdks/java/build-tools/src/main/resources/beam/checkstyle.xml   | 7 ---
 .../core/src/main/java/org/apache/beam/sdk/transforms/Combine.java | 2 ++
 .../java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java | 4 
 .../sdk/transforms/windowing/MergeOverlappingIntervalWindows.java  | 3 +++
 .../test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java  | 1 +
 .../src/test/java/org/apache/beam/sdk/testing/PAssertTest.java | 1 +
 .../test/java/org/apache/beam/sdk/transforms/DoFnTesterTest.java   | 1 +
 .../src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java| 2 ++
 .../org/apache/beam/sdk/transforms/windowing/WindowingTest.java| 1 +
 .../java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java   | 2 ++
 .../interpreter/operator/date/BeamSqlCurrentDateExpression.java| 1 +
 .../interpreter/operator/date/BeamSqlCurrentTimeExpression.java| 1 +
 .../operator/date/BeamSqlCurrentTimestampExpression.java   | 1 +
 .../impl/interpreter/operator/date/BeamSqlDateCeilExpression.java  | 1 +
 .../impl/interpreter/operator/date/BeamSqlDateFloorExpression.java | 1 +
 .../impl/interpreter/operator/date/BeamSqlExtractExpression.java   | 1 +
 .../interpreter/operator/logical/BeamSqlLogicalExpression.java | 1 +
 .../org/apache/beam/sdk/extensions/sql/impl/rel/CheckSize.java | 1 +
 .../src/main/java/org/apache/beam/sdk/fn/stream/DataStreams.java   | 3 ++-
 .../org/apache/beam/fn/harness/state/StateFetchingIterators.java   | 4 +++-
 .../java/org/apache/beam/fn/harness/BeamFnDataReadRunnerTest.java  | 1 +
 .../java/org/apache/beam/fn/harness/BeamFnDataWriteRunnerTest.java | 1 +
 .../java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java | 1 +
 .../beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java   | 1 +
 .../org/apache/beam/sdk/io/gcp/bigquery/TableRowJsonCoderTest.java | 1 +
 .../src/main/java/org/apache/beam/sdk/nexmark/model/Event.java | 2 ++
 33 files changed, 51 insertions(+), 7 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] branch master updated (b95d744 -> 07d97cb)

2018-02-21 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from b95d744  Merge pull request #4719: [BEAM-1866] Plumbing Gauge metrics 
through the Fn API.
 add bb719c8  [BEAM-3640] part 2 - add 
STATIC_INIT,INSTANCE_INIT,ENUM_DEF,INTERFACE_DEF,CTOR_DEF  lf's
 new 07d97cb  [BEAM-3640] part 2 - add 
STATIC_INIT,INSTANCE_INIT,ENUM_DEF,INTERFACE…

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../main/java/org/apache/beam/examples/complete/AutoComplete.java  | 3 +++
 .../java/org/apache/beam/examples/cookbook/TriggerExample.java | 1 +
 .../src/main/java/org/apache/beam/examples/snippets/Snippets.java  | 2 ++
 .../runners/apex/translation/operators/ApexProcessFnOperator.java  | 2 +-
 .../runners/core/construction/WindowingStrategyTranslation.java| 1 +
 .../main/java/org/apache/beam/runners/direct/DirectRegistrar.java  | 1 +
 .../java/org/apache/beam/runners/spark/io/EmptyCheckpointMark.java | 2 +-
 sdks/java/build-tools/src/main/resources/beam/checkstyle.xml   | 7 ---
 .../core/src/main/java/org/apache/beam/sdk/transforms/Combine.java | 2 ++
 .../java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java | 4 
 .../sdk/transforms/windowing/MergeOverlappingIntervalWindows.java  | 3 +++
 .../test/java/org/apache/beam/sdk/coders/StructuredCoderTest.java  | 1 +
 .../src/test/java/org/apache/beam/sdk/testing/PAssertTest.java | 1 +
 .../test/java/org/apache/beam/sdk/transforms/DoFnTesterTest.java   | 1 +
 .../src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java| 2 ++
 .../org/apache/beam/sdk/transforms/windowing/WindowingTest.java| 1 +
 .../java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlEnv.java   | 2 ++
 .../interpreter/operator/date/BeamSqlCurrentDateExpression.java| 1 +
 .../interpreter/operator/date/BeamSqlCurrentTimeExpression.java| 1 +
 .../operator/date/BeamSqlCurrentTimestampExpression.java   | 1 +
 .../impl/interpreter/operator/date/BeamSqlDateCeilExpression.java  | 1 +
 .../impl/interpreter/operator/date/BeamSqlDateFloorExpression.java | 1 +
 .../impl/interpreter/operator/date/BeamSqlExtractExpression.java   | 1 +
 .../interpreter/operator/logical/BeamSqlLogicalExpression.java | 1 +
 .../org/apache/beam/sdk/extensions/sql/impl/rel/CheckSize.java | 1 +
 .../src/main/java/org/apache/beam/sdk/fn/stream/DataStreams.java   | 3 ++-
 .../org/apache/beam/fn/harness/state/StateFetchingIterators.java   | 4 +++-
 .../java/org/apache/beam/fn/harness/BeamFnDataReadRunnerTest.java  | 1 +
 .../java/org/apache/beam/fn/harness/BeamFnDataWriteRunnerTest.java | 1 +
 .../java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java | 1 +
 .../beam/sdk/io/gcp/bigquery/DynamicDestinationsHelpers.java   | 1 +
 .../org/apache/beam/sdk/io/gcp/bigquery/TableRowJsonCoderTest.java | 1 +
 .../src/main/java/org/apache/beam/sdk/nexmark/model/Event.java | 2 ++
 33 files changed, 51 insertions(+), 7 deletions(-)



[jira] [Resolved] (BEAM-3720) Python Precommit Broken

2018-02-21 Thread Pablo Estrada (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada resolved BEAM-3720.
-
   Resolution: Fixed
Fix Version/s: 2.4.0

> Python Precommit Broken
> ---
>
> Key: BEAM-3720
> URL: https://issues.apache.org/jira/browse/BEAM-3720
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.4.0
>
>
> The Python precommit tests are broken due to:
>  # My adding the Gauge metric without support for the portability framework
>  # The switch of the Batch direct runner to the FnApiRunner. 
> The solution for this is to plumb Gauge metrics through the portability 
> framework. I've sent out a PR for this: 
> [https://github.com/apache/beam/pull/4719]
>  
> My apologies. This should be resolved shortly.





[jira] [Commented] (BEAM-3720) Python Precommit Broken

2018-02-21 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371683#comment-16371683
 ] 

Pablo Estrada commented on BEAM-3720:
-

PR [https://github.com/apache/beam/pull/4719] has been merged. Resolving issue.

> Python Precommit Broken
> ---
>
> Key: BEAM-3720
> URL: https://issues.apache.org/jira/browse/BEAM-3720
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
> Fix For: 2.4.0
>
>
> The Python precommit tests are broken due to:
>  # My adding the Gauge metric without support for the portability framework
>  # The switch of the Batch direct runner to the FnApiRunner. 
> The solution for this is to plumb Gauge metrics through the portability 
> framework. I've sent out a PR for this: 
> [https://github.com/apache/beam/pull/4719]
>  
> My apologies. This should be resolved shortly.





[jira] [Assigned] (BEAM-3724) Make the coders package compatible with Python 3

2018-02-21 Thread Luke Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Zhu reassigned BEAM-3724:
--

Assignee: Luke Zhu  (was: Ahmet Altay)

> Make the coders package compatible with Python 3
> 
>
> Key: BEAM-3724
> URL: https://issues.apache.org/jira/browse/BEAM-3724
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Luke Zhu
>Assignee: Luke Zhu
>Priority: Major
>
> The coders package is affected a lot by the fact that strings are unicode in 
> Python 3.
>  
> The planned approach is to
>  * Prefix bytestrings with 'b' where appropriate
>  * Replace uses of 'str' with 'bytes' where appropriate
>  * Use python-modernize to solve syntax and import errors
> The goal of this subtask is not to make the coders package completely 
> compatible with Python 3.





[jira] [Created] (BEAM-3724) Make the coders package compatible with Python 3

2018-02-21 Thread Luke Zhu (JIRA)
Luke Zhu created BEAM-3724:
--

 Summary: Make the coders package compatible with Python 3
 Key: BEAM-3724
 URL: https://issues.apache.org/jira/browse/BEAM-3724
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Luke Zhu
Assignee: Ahmet Altay


The coders package is affected a lot by the fact that strings are unicode in 
Python 3.

 

The planned approach is to
 * Prefix bytestrings with 'b' where appropriate
 * Replace uses of 'str' with 'bytes' where appropriate
 * Use python-modernize to solve syntax and import errors

The goal of this subtask is not to make the coders package completely 
compatible with Python 3.
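A minimal sketch of the str-to-bytes distinction driving this work; the helper below is invented for illustration and is not the SDK's coder API:

```python
def encode(value):
    # In Python 3, text is unicode; anything hitting the wire must be
    # converted to bytes explicitly -- hence the b'' prefixes and
    # str -> bytes replacements the subtask describes.
    if isinstance(value, str):
        return value.encode("utf-8")
    return bytes(value)

assert encode("beam") == b"beam"   # unicode text encoded explicitly
assert encode(b"beam") == b"beam"  # bytes pass through unchanged
```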





[jira] [Created] (BEAM-3723) Duration, Timestamp, and Window in utils/ are incompatible with Python 3

2018-02-21 Thread Luke Zhu (JIRA)
Luke Zhu created BEAM-3723:
--

 Summary: Duration, Timestamp, and Window in utils/ are 
incompatible with Python 3
 Key: BEAM-3723
 URL: https://issues.apache.org/jira/browse/BEAM-3723
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Luke Zhu


They use __cmp__ instead of rich comparison operators __eq__, __lt__, etc.

This can be fixed by updating the Cython version (0.27.1 supports rich 
comparisons), or possibly by configuring cythonize.
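The shape of the fix in pure Python looks like the sketch below; Timestamp here is a hypothetical stand-in for the SDK class, not its actual implementation. __cmp__ is replaced with rich comparisons, and functools.total_ordering derives the remaining operators:

```python
from functools import total_ordering

@total_ordering
class Timestamp:
    # Hypothetical stand-in for the SDK's Timestamp. Python 3 ignores
    # __cmp__, so ordering must come from __eq__/__lt__; total_ordering
    # derives <=, >, and >= from these two.
    def __init__(self, micros):
        self.micros = micros

    def __eq__(self, other):
        return self.micros == other.micros

    def __lt__(self, other):
        return self.micros < other.micros

assert Timestamp(1) < Timestamp(2)
assert Timestamp(2) == Timestamp(2)
```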





Build failed in Jenkins: beam_PostCommit_Python_Verify #4273

2018-02-21 Thread Apache Jenkins Server
See 


--
[...truncated 1.40 MB...]
coders {
  key: "ref_Coder_VarIntCoder_2"
  value {
spec {
  spec {
urn: "beam:coder:varint:v1"
  }
}
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_1"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_VarIntCoder_2"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_11"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_12"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_13"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_14"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_13_bytes"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_14_bytes"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_13_len_prefix"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_14_len_prefix"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_16"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_FastPrimitivesCoder_6"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_4"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_5"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_4_bytes"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_5_bytes"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_4_len_prefix"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_5_len_prefix"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_7"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_8"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_9"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_10"
component_coder_ids: "ref_Coder_GlobalWindowCoder_3"
  }
}
  }
}

root: DEBUG: CONTROL RESPONSE instruction_id: "control_4"
register {
}

root: DEBUG: CONTROL REQUEST instruction_id: "bundle_1211"
process_bundle {
  process_bundle_descriptor_reference: "3"
}

root: INFO: start 
root: INFO: start 
root: INFO: start 
root: INFO: start 
root: INFO: finish 
root: INFO: finish 
root: INFO: finish 
root: INFO: finish 
root: DEBUG: CONTROL RESPONSE instruction_id: "bundle_1211"
process_bundle {
  metrics {
ptransforms {
  key: "assert_that/Group/GroupByKey/Read"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {
  key: "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {
  key: "assert_that/Match"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {

[beam-site] 01/01: Prepare repository for deployment.

2018-02-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 1f4014cf5ba5f857047524bf8ab528689e910966
Author: Mergebot 
AuthorDate: Wed Feb 21 05:13:30 2018 -0800

Prepare repository for deployment.
---
 content/blog/2018/02/19/beam-2.3.0.html | 4 ++--
 content/feed.xml| 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/blog/2018/02/19/beam-2.3.0.html 
b/content/blog/2018/02/19/beam-2.3.0.html
index af8c605..b66e1b1 100644
--- a/content/blog/2018/02/19/beam-2.3.0.html
+++ b/content/blog/2018/02/19/beam-2.3.0.html
@@ -172,9 +172,9 @@ improvement
 
 List of Contributors
 
-According to git shortlog, the following 80 people contributed to the 2.3.0 
release. Thank you to all contributors!
+According to git shortlog, the following 78 people contributed to the 2.3.0 
release. Thank you to all contributors!
 
-Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur Goenka, 
Anton Kedin, Arnaud Fournier, Alexey Romanenko, Asha Rostamianfar, Ben 
Chambers, Ben Sidhom, Bill Neubauer, Brian Foo, cclauss, Chamikara Jayalath, 
Charles Chen, Colm O hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David 
Cavazos, David Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, 
Etienne Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning 
Rohde, Holden Karau, Huygaa Batsaikhan [...]
+Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur Goenka, 
Anton Kedin, Arnaud Fournier, Asha Rostamianfar, Ben Chambers, Ben Sidhom, Bill 
Neubauer, Brian Foo, cclauss, Chamikara Jayalath, Charles Chen, Colm O 
hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David Cavazos, David 
Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, Etienne 
Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning Rohde, 
Holden Karau, Huygaa Batsaikhan, Ilya Figotin, In [...]
 
 
   
diff --git a/content/feed.xml b/content/feed.xml
index e0112c6..e5375f1 100644
--- a/content/feed.xml
+++ b/content/feed.xml
@@ -91,9 +91,9 @@ improvement/li
 
 h1 id=list-of-contributorsList of Contributors/h1
 
-pAccording to git shortlog, the following 80 people contributed to the 
2.3.0 release. Thank you to all contributors!/p
+pAccording to git shortlog, the following 78 people contributed to the 
2.3.0 release. Thank you to all contributors!/p
 
-pAhmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur 
Goenka, Anton Kedin, Arnaud Fournier, Alexey Romanenko, Asha Rostamianfar, Ben 
Chambers, Ben Sidhom, Bill Neubauer, Brian Foo, cclauss, Chamikara Jayalath, 
Charles Chen, Colm O hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David 
Cavazos, David Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, 
Etienne Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning 
Rohde, Holden Karau, Huygaa Bats [...]
+pAhmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur 
Goenka, Anton Kedin, Arnaud Fournier, Asha Rostamianfar, Ben Chambers, Ben 
Sidhom, Bill Neubauer, Brian Foo, cclauss, Chamikara Jayalath, Charles Chen, 
Colm O hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David Cavazos, David 
Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, Etienne 
Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning Rohde, 
Holden Karau, Huygaa Batsaikhan, Ilya Figot [...]
 
 
 Mon, 19 Feb 2018 00:00:01 -0800

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] branch asf-site updated (f75d2fd -> 1f4014c)

2018-02-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from f75d2fd  Prepare repository for deployment.
 add 029f709  Fixed number of contributors and duplication of name of 
Alexey Romanenko
 add 2d1f948  This closes #391
 new 1f4014c  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/blog/2018/02/19/beam-2.3.0.html | 4 ++--
 content/feed.xml| 4 ++--
 src/_posts/2018-02-19-beam-2.3.0.md | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)



[beam-site] 01/02: Fixed number of contributors and duplication of name of Alexey Romanenko

2018-02-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 029f7095a66cee197c39edf1e504815df0e7142d
Author: Alexey Romanenko 
AuthorDate: Wed Feb 21 11:59:15 2018 +0100

Fixed number of contributors and duplication of name of Alexey Romanenko
---
 src/_posts/2018-02-19-beam-2.3.0.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/_posts/2018-02-19-beam-2.3.0.md 
b/src/_posts/2018-02-19-beam-2.3.0.md
index f01a031..e4cdd50 100644
--- a/src/_posts/2018-02-19-beam-2.3.0.md
+++ b/src/_posts/2018-02-19-beam-2.3.0.md
@@ -83,7 +83,7 @@ Cloud Dataflow and the Go SDK on any runner.
 
 # List of Contributors
 
-According to git shortlog, the following 80 people contributed to the 2.3.0 
release. Thank you to all contributors!
+According to git shortlog, the following 78 people contributed to the 2.3.0 
release. Thank you to all contributors!
 
-Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur Goenka, Anton 
Kedin, Arnaud Fournier, Alexey Romanenko, Asha Rostamianfar, Ben Chambers, Ben 
Sidhom, Bill Neubauer, Brian Foo, cclauss, Chamikara Jayalath, Charles Chen, 
Colm O hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David Cavazos, David 
Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, Etienne 
Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning Rohde, 
Holden Karau, Huygaa Batsaikhan, I [...]
+Ahmet Altay, Alan Myrvold, Alex Amato, Alexey Romanenko, Ankur Goenka, Anton 
Kedin, Arnaud Fournier, Asha Rostamianfar, Ben Chambers, Ben Sidhom, Bill 
Neubauer, Brian Foo, cclauss, Chamikara Jayalath, Charles Chen, Colm O 
hEigeartaigh, Daniel Oliveira, Dariusz Aniszewski, David Cavazos, David 
Sabater, David Sabater Dinter, Dawid Wysakowicz, Dmytro Ivanov, Etienne 
Chauchot, Eugene Kirpichov, Exprosed, Grzegorz Kołakowski, Henning Rohde, 
Holden Karau, Huygaa Batsaikhan, Ilya Figotin, Innoc [...]
 



[beam-site] branch mergebot updated (f935b9d -> 2d1f948)

2018-02-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from f935b9d  This closes #390
 add f75d2fd  Prepare repository for deployment.
 new 029f709  Fixed number of contributors and duplication of name of 
Alexey Romanenko
 new 2d1f948  This closes #391

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../2018/02/19/beam-2.3.0.html}| 104 +-
 content/blog/index.html|  17 +
 content/feed.xml   | 410 +
 content/index.html |  10 +-
 src/_posts/2018-02-19-beam-2.3.0.md|   4 +-
 5 files changed, 205 insertions(+), 340 deletions(-)
 copy content/{beam/capability/2016/04/03/presentation-materials.html => 
blog/2018/02/19/beam-2.3.0.html} (55%)



[beam-site] 02/02: This closes #391

2018-02-21 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 2d1f94829bec0b0840528d0afe31b915691081c8
Merge: f75d2fd 029f709
Author: Mergebot 
AuthorDate: Wed Feb 21 05:08:42 2018 -0800

This closes #391

 src/_posts/2018-02-19-beam-2.3.0.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)



Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #6010

2018-02-21 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #4237

2018-02-21 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_TFRecordIOIT #166

2018-02-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #939

2018-02-21 Thread Apache Jenkins Server
See 


--
[...truncated 65 B...]
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision a3a6ca043abd41fbcbbbfc91449aabcec0ff1df3 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a3a6ca043abd41fbcbbbfc91449aabcec0ff1df3
Commit message: "Use maven-publish plugin to publish java artifacts"
 > git rev-list --no-walk a3a6ca043abd41fbcbbbfc91449aabcec0ff1df3 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1507305050394643198.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3799476821888638049.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1387048532069384629.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6440290709613963781.sh
+ .env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in ./.env/lib/python2.7/site-packages
Requirement already up-to-date: pip in ./.env/lib/python2.7/site-packages
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8096603980990424995.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins838400902889514482.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: functools32 

[jira] [Commented] (BEAM-3201) ElasticsearchIO should allow the user to optionally pass id, type and index per document

2018-02-21 Thread Jeroen Steggink (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371179#comment-16371179
 ] 

Jeroen Steggink commented on BEAM-3201:
---

Sorry to push this again. I really need this :) [~chet.aldrich], do you have 
an ETA? Otherwise I might build this myself.

> ElasticsearchIO should allow the user to optionally pass id, type and index 
> per document
> 
>
> Key: BEAM-3201
> URL: https://issues.apache.org/jira/browse/BEAM-3201
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-elasticsearch
>Reporter: Etienne Chauchot
>Assignee: Chet Aldrich
>Priority: Major
>
> *Dynamic document ids*: Today the ESIO only inserts the payload of the ES 
> documents, and Elasticsearch generates a document id for each record inserted, 
> so each new insertion is considered a new document. Users want to be able to 
> update documents using the IO. So, for the write part of the IO, users should 
> be able to provide a document id so that they can update already stored 
> documents. Providing an id for the documents would also help the user with 
> idempotency.
> *Dynamic ES type and ES index*: In some cases (streaming pipelines with high 
> throughput), partitioning the PCollection so that each partition can be plugged 
> into a different ESIO instance (pointing to a different index/type) is not very 
> practical; users would like to be able to set the ES index/type per document.
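A minimal sketch (not the actual ESIO API) of how such per-document metadata could be expressed: hypothetical user-supplied extraction functions build the metadata line of an Elasticsearch `_bulk` "index" action, where `_index`, `_type`, and `_id` are the standard bulk-action fields and fall back to Elasticsearch's defaults when omitted.

```python
import json

def bulk_action_line(doc, id_fn=None, index_fn=None, type_fn=None):
    """Build the metadata line of an Elasticsearch _bulk "index" action.

    id_fn/index_fn/type_fn are hypothetical per-document extraction
    functions; when a function is omitted, Elasticsearch applies its
    defaults (auto-generated id, connection-level index/type).
    """
    meta = {}
    if index_fn:
        meta["_index"] = index_fn(doc)
    if type_fn:
        meta["_type"] = type_fn(doc)
    if id_fn:
        meta["_id"] = id_fn(doc)
    return json.dumps({"index": meta})

# Route each document by fields it already carries, enabling updates
# (stable _id) and per-document index routing.
doc = {"id": "doc-42", "kind": "logs", "body": "..."}
line = bulk_action_line(doc,
                        id_fn=lambda d: d["id"],
                        index_fn=lambda d: d["kind"])
```

With a stable `_id` extracted from the document, re-inserting the same record becomes an update rather than a new document, which also gives the idempotency mentioned above.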



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_Verify #4272

2018-02-21 Thread Apache Jenkins Server
See 


--
[...truncated 1.40 MB...]
coders {
  key: "ref_Coder_VarIntCoder_8"
  value {
spec {
  spec {
urn: "beam:coder:varint:v1"
  }
}
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_1"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_2"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_11"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_12"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_13"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_14"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_15"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_16"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_1_bytes"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_2_bytes"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_1_len_prefix"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_2_len_prefix"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_6"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_FastPrimitivesCoder_3"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_7"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_VarIntCoder_8"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_9"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_10"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_9_bytes"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_10_bytes"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
coders {
  key: "ref_Coder_WindowedValueCoder_9_len_prefix"
  value {
spec {
  spec {
urn: "beam:coder:windowed_value:v1"
  }
}
component_coder_ids: "ref_Coder_TupleCoder_10_len_prefix"
component_coder_ids: "ref_Coder_GlobalWindowCoder_5"
  }
}
  }
}

root: DEBUG: CONTROL RESPONSE instruction_id: "control_4"
register {
}

root: DEBUG: CONTROL REQUEST instruction_id: "bundle_1212"
process_bundle {
  process_bundle_descriptor_reference: "3"
}

root: INFO: start 
root: INFO: start 
root: INFO: start 
root: INFO: start 
root: INFO: finish 
root: INFO: finish 
root: INFO: finish 
root: INFO: finish 
root: DEBUG: CONTROL RESPONSE instruction_id: "bundle_1212"
process_bundle {
  metrics {
ptransforms {
  key: "assert_that/Group/GroupByKey/Read"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {
  key: "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {
  key: "assert_that/Match"
  value {
processed_elements {
  measured {
output_element_counts {
  key: "None"
  value: 1
}
  }
}
  }
}
ptransforms {
 

Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow #5000

2018-02-21 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3245) Dataflow runner does not respect ParDo's lifecycle on case of exceptions

2018-02-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-3245:
--

Assignee: Thomas Groh

> Dataflow runner does not respect ParDo's lifecycle on case of exceptions
> 
>
> Key: BEAM-3245
> URL: https://issues.apache.org/jira/browse/BEAM-3245
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ismaël Mejía
>Assignee: Thomas Groh
>Priority: Major
>
> The lifecycle of the DoFn is not respected when an exception is thrown in any 
> of the lifecycle methods after setup.
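For reference, the contract a runner is expected to honour can be sketched with a toy harness (hypothetical code, not a Beam API; method names merely mirror the DoFn lifecycle): once `setup()` has succeeded, `teardown()` should run even when a later lifecycle method raises.

```python
class LifecycleHarness:
    """Toy runner harness: guarantees teardown() after a successful
    setup(), whether the bundle succeeds or throws."""

    def __init__(self, fn):
        self.fn = fn

    def run_bundle(self, elements):
        self.fn.setup()
        try:
            self.fn.start_bundle()
            for e in elements:
                self.fn.process(e)
            self.fn.finish_bundle()
        finally:
            # Runs on success *and* on exception after setup().
            self.fn.teardown()

class RecordingFn:
    """Records which lifecycle methods ran; process() always throws."""
    def __init__(self):
        self.calls = []
    def setup(self):
        self.calls.append("setup")
    def start_bundle(self):
        self.calls.append("start_bundle")
    def process(self, e):
        raise RuntimeError("boom")
    def finish_bundle(self):
        self.calls.append("finish_bundle")
    def teardown(self):
        self.calls.append("teardown")

fn = RecordingFn()
try:
    LifecycleHarness(fn).run_bundle([1])
except RuntimeError:
    pass
# teardown ran despite the exception raised in process()
```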





[jira] [Commented] (BEAM-1330) DatastoreIO Writes should flush early when duplicate keys arrive.

2018-02-21 Thread Julien Sobczak (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371080#comment-16371080
 ] 

Julien Sobczak commented on BEAM-1330:
--

Hi,

Have someone find a workaround for this problem? Removing duplicates inside the 
window does not work because it seems several windows are being sent in the 
same batch.

Julien

> DatastoreIO Writes should flush early when duplicate keys arrive.
> -
>
> Key: BEAM-1330
> URL: https://issues.apache.org/jira/browse/BEAM-1330
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>Priority: Minor
>
> DatastoreIO writes batches of up to 500 entities (the RPC limit for Cloud 
> Datastore) before flushing them out. The writes are non-transactional and thus 
> do not support duplicate keys within a batch. This can be a problem, especially 
> when using non-global windowing, where multiple windows for the same key end up 
> in the same batch and prevent the write from succeeding.
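The flush-early behaviour proposed here can be sketched as follows (a hypothetical helper, not DatastoreIO code): accumulate mutations into a batch, and flush whenever the batch is full or an incoming key is already present, so no batch ever contains duplicate keys.

```python
def batch_writes(mutations, max_batch=500):
    """Group (key, value) mutations into non-transactional batches,
    flushing early whenever a key already in the current batch arrives
    again, so no batch contains duplicate keys. A sketch of the
    proposed behaviour, not DatastoreIO itself."""
    batch, seen = [], set()
    for key, value in mutations:
        if key in seen or len(batch) >= max_batch:
            yield batch            # flush: full, or duplicate key seen
            batch, seen = [], set()
        batch.append((key, value))
        seen.add(key)
    if batch:
        yield batch                # flush whatever remains

# Two windows emit the same key; the duplicate forces an early flush.
batches = list(batch_writes([("k1", 1), ("k2", 2), ("k1", 3)]))
```

This trades slightly smaller batches for writes that always succeed, even when multiple windows of the same key land in one bundle.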


