[jira] [Created] (BEAM-3754) Can't have commitOffsetsInFinalizeEnabled set to false with KafkaIO.readBytes()

2018-02-27 Thread Benjamin BENOIST (JIRA)
Benjamin BENOIST created BEAM-3754:
--

 Summary: Can't have commitOffsetsInFinalizeEnabled set to false 
with KafkaIO.readBytes()
 Key: BEAM-3754
 URL: https://issues.apache.org/jira/browse/BEAM-3754
 Project: Beam
  Issue Type: Bug
  Components: io-java-kafka
Affects Versions: 2.3.0
 Environment: Dataflow pipeline using Kafka as a Sink
Reporter: Benjamin BENOIST
Assignee: Raghu Angadi


Beam v2.3 introduces finalized offsets, in order to reduce the gaps or 
duplicate processing of records while restarting a pipeline.

_read()_ sets this parameter to false [by 
default|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L307]
 but _readBytes()_ 
[doesn't|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L282],
 thus creating an exception:
{noformat}
Exception in thread "main" java.lang.IllegalStateException: Missing required 
properties: commitOffsetsInFinalizeEnabled
     at 
org.apache.beam.sdk.io.kafka.AutoValue_KafkaIO_Read$Builder.build(AutoValue_KafkaIO_Read.java:344)
     at 
org.apache.beam.sdk.io.kafka.KafkaIO.readBytes(KafkaIO.java:291){noformat}
The parameter can be set to true with _commitOffsetsInFinalize()_ but never to 
false.

Using _read()_ in the definition of _readBytes()_ could prevent this kind of 
error in the future:
{code:java}
public static Read readBytes() {
  return read()
.setKeyDeserializer(ByteArrayDeserializer.class)
.setValueDeserializer(ByteArrayDeserializer.class)
.build();
}{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #6

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 43.39 KB...]
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-client-core:jar:1.0.0 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-appengine:jar:0.7.0 from 
the shaded jar.
[INFO] Excluding io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 from the 
shaded jar.
[INFO] Excluding io.opencensus:opencensus-api:jar:0.7.0 from the shaded jar.
[INFO] Excluding io.netty:netty-tcnative-boringssl-static:jar:1.1.33.Fork26 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-dataflow:jar:v1b3-rev221-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-clouddebugger:jar:v2-rev8-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-failsafe-plugin:2.20.1:integration-test (default) @ 
beam-sdks-java-io-hadoop-input-format ---
[INFO] Failsafe report directory: 

[INFO] parallel='all', perCoreThreadCount=true, threadCount=4, 
useUnlimitedThreads=false, threadCountSuites=0, threadCountClasses=0, 
threadCountMethods=0, parallelOptimized=true
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIOIT
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0 s <<< 
FAILURE! - in org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIOIT
[ERROR] org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIOIT  Time 
elapsed: 0 s  <<< ERROR!
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:272)
at 
org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.(PgConnection.java:215)
at org.postgresql.Driver.makeConnection(Driver.java:404)
at org.postgresql.Driver.connect(Driver.java:272)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at 

Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #2

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 92.28 KB...]

2018-02-27 15:20:12,269 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 15:20:41,030 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

2018-02-27 15:20:41,128 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

  ReturnCode:1,  WallTime:0:00.09s,  CPU:0.08s,  MaxMemory:35212kb 
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2018-02-27 15:20:41,129 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 15:20:56,818 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

2018-02-27 15:20:56,917 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

  ReturnCode:1,  WallTime:0:00.09s,  CPU:0.09s,  MaxMemory:35224kb 
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2018-02-27 15:20:56,917 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 15:21:22,483 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

2018-02-27 15:21:22,581 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

  ReturnCode:1,  WallTime:0:00.08s,  CPU:0.10s,  MaxMemory:35356kb 
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2018-02-27 15:21:22,581 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 15:21:40,404 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

2018-02-27 15:21:40,503 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

  ReturnCode:1,  WallTime:0:00.09s,  CPU:0.10s,  MaxMemory:34920kb 
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2018-02-27 15:21:40,504 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 15:22:05,035 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

2018-02-27 15:22:05,137 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl --kubeconfig=/home/jenkins/.kube/config delete -f 

  ReturnCode:1,  WallTime:0:00.09s,  CPU:0.07s,  MaxMemory:35984kb 
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2018-02-27 15:22:05,137 872faa84 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4322

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 1.02 MB...]
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying 

Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #3

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 114.18 KB...]

2018-02-27 16:09:20,307 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 16:09:36,730 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl 
--kubeconfig=
 delete -f 

2018-02-27 16:09:36,910 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl 
--kubeconfig=
 delete -f 

  ReturnCode:1,  WallTime:0:00.16s,  CPU:0.16s,  MaxMemory:34960kb 
STDOUT: 
STDERR: error: stat 
:
 no such file or directory

2018-02-27 16:09:36,910 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 16:10:00,302 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl 
--kubeconfig=
 delete -f 

2018-02-27 16:10:00,417 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl 
--kubeconfig=
 delete -f 

  ReturnCode:1,  WallTime:0:00.10s,  CPU:0.08s,  MaxMemory:34808kb 
STDOUT: 
STDERR: error: stat 
:
 no such file or directory

2018-02-27 16:10:00,417 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 16:10:30,288 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl 
--kubeconfig=
 delete -f 

2018-02-27 16:10:30,387 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl 
--kubeconfig=
 delete -f 

  ReturnCode:1,  WallTime:0:00.09s,  CPU:0.09s,  MaxMemory:35116kb 
STDOUT: 
STDERR: error: stat 
:
 no such file or directory

2018-02-27 16:10:30,388 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2018-02-27 16:10:47,450 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Running: kubectl 
--kubeconfig=
 delete -f 

2018-02-27 16:10:47,632 f7ff90d5 MainThread beam_integration_benchmark(1/1) 
INFO Ran: {kubectl 
--kubeconfig=
 delete -f 

  ReturnCode:1,  WallTime:0:00.17s,  CPU:0.15s,  MaxMemory:34976kb 
STDOUT: 
STDERR: error: stat 
:
 no such file or directory


[jira] [Closed] (BEAM-3739) @Parameter annotation does not work for UDFs in Beam SQL

2018-02-27 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin closed BEAM-3739.

   Resolution: Fixed
Fix Version/s: 2.4.0

> @Parameter annotation does not work for UDFs in Beam SQL
> 
>
> Key: BEAM-3739
> URL: https://issues.apache.org/jira/browse/BEAM-3739
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.3.0
>Reporter: Samuel Waggoner
>Assignee: Xu Mingmin
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> BeamSqlUdf javadoc indicates you can have optional parameters, but this 
> functionality is not working. I implemented the following copy/paste from the 
> doc 
> https://beam.apache.org/documentation/sdks/javadoc/2.3.0/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.html:
> {code:java}
> public static class MyLeftFunction implements BeamSqlUdf {
>  public String eval(
>  @Parameter(name = "s") String s,
>  @Parameter(name = "n", optional = true) Integer n) {
>  return s.substring(0, n == null ? 1 : n);
>  }
> }{code}
> I modify a query in BeamSqlExample.java to use it. With all parameters 
> supplied, it completes successfully:
> {code:java}
> //Case 1. run a simple SQL query over input PCollection with 
> BeamSql.simpleQuery;
> PCollection outputStream = inputTable.apply(
> BeamSql.query("select c1, leftfn('string1', 1) as c2, c3 from PCOLLECTION 
> where c1 > 1")
> .registerUdf("leftfn", MyLeftFunction.class));{code}
> With the optional parameter left off, I get an exception:
> {code:java}
> //Case 1. run a simple SQL query over input PCollection with 
> BeamSql.simpleQuery;
> PCollection outputStream = inputTable.apply(
>  BeamSql.query("select c1, leftfn('string1') as c2, c3 from PCOLLECTION where 
> c1 > 1")
>  .registerUdf("leftfn", MyLeftFunction.class));{code}
> {code:java}
> Exception in thread "main" java.lang.IllegalStateException: 
> java.lang.UnsupportedOperationException: Operator: DEFAULT is not supported 
> yet!
>  at 
> org.apache.beam.sdk.extensions.sql.QueryTransform.expand(QueryTransform.java:75)
>  at 
> org.apache.beam.sdk.extensions.sql.QueryTransform.expand(QueryTransform.java:47)
>  at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:537)
>  at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:472)
>  at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:286)
>  at 
> org.apache.beam.sdk.extensions.sql.example.BeamSqlExample.main(BeamSqlExample.java:76)
> Caused by: java.lang.UnsupportedOperationException: Operator: DEFAULT is not 
> supported yet!
>  at 
> org.apache.beam.sdk.extensions.sql.impl.interpreter.BeamSqlFnExecutor.buildExpression(BeamSqlFnExecutor.java:424)
>  at 
> org.apache.beam.sdk.extensions.sql.impl.interpreter.BeamSqlFnExecutor.buildExpression(BeamSqlFnExecutor.java:201)
>  at 
> org.apache.beam.sdk.extensions.sql.impl.interpreter.BeamSqlFnExecutor.(BeamSqlFnExecutor.java:125)
>  at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamProjectRel.buildBeamPipeline(BeamProjectRel.java:70)
>  at 
> org.apache.beam.sdk.extensions.sql.QueryTransform.expand(QueryTransform.java:73)
>  ... 5 more{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PerformanceTests_HadoopInputFormat #5

2018-02-27 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1004

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 122.17 KB...]
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 147, in expand
| "Match" >> Map(matcher))
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 433, in apply
label or transform.label)
  File 
"
 line 443, in apply
return self.apply(transform, pvalueish)
  File 
"
 line 493, in apply
self._infer_result_type(transform, inputs, result)
  File 
"
 line 513, in _infer_result_type
type_options = self._options.view_as(TypeOptions)
  File 
"
 line 227, in view_as
view = cls(self._flags)
  File 
"
 line 150, in __init__
parser = _BeamArgumentParser()
  File "/usr/lib/python2.7/argparse.py", line 1586, in __init__
self._positionals = add_group(_('positional arguments'))
  File "/usr/lib/python2.7/gettext.py", line 581, in gettext
return dgettext(_current_domain, message)
  File "/usr/lib/python2.7/gettext.py", line 545, in dgettext
codeset=_localecodesets.get(domain))
  File "/usr/lib/python2.7/gettext.py", line 480, in translation
mofiles = find(domain, localedir, languages, all=1)
  File "/usr/lib/python2.7/gettext.py", line 449, in find
mofile_lp = os.path.join("/usr/share/locale-langpack", lang,
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 178, in test_iterable_side_input
pipeline.run()
  File 

[jira] [Created] (BEAM-3755) Unable to Test Session "with gap duration" Windowing

2018-02-27 Thread Joe (JIRA)
Joe created BEAM-3755:
-

 Summary: Unable to Test Session "with gap duration" Windowing
 Key: BEAM-3755
 URL: https://issues.apache.org/jira/browse/BEAM-3755
 Project: Beam
  Issue Type: Bug
  Components: testing
Affects Versions: 2.2.0
 Environment: Java
Reporter: Joe
Assignee: Jason Kuster


Trying to write a unit test to verify the windowing behavior for session with 
gap duration, but my assumption is that there is a merging of IntervalWindows 
that normally happens that is not happening for my test pipeline, because my 
actual pipeline seems to work as expected, but my test fails.  

I have been using these resources:

[http://www.waitingforcode.com/apache-beam/windows-apache-beam/read]

[https://beam.apache.org/blog/2016/10/20/test-stream.html]

 

Here is an example of the issue:
{code:java}
String guid = "user1";
UniqueUserKey uniqueUser = makeUniqueUserKey(guid);
// the first value to makePageView results in the timestamp: new Instant(value)
TimestampedValue> homepage = makePageView(1, 
"HomePage", "homepage", guid, uniqueUser);
TimestampedValue> productDetails1 = 
makePageView(2, "Product Details", "product_details", guid, uniqueUser);

TestStream> testStream = 
TestStream.create(KvCoder.of(ProtoCoder.of(UniqueUserKey.class), 
ProtoCoder.of(PageLoadEvent.class)))
.addElements(homepage)
.addElements(productDetails1)
.advanceWatermarkTo(new Instant(8))
.advanceWatermarkToInfinity();

IntervalWindow window1 = new IntervalWindow(new Instant(1), new Instant(3));

// This fails because productDetails1 is not in the window
PAssert.that(firstTransform).inFinalPane(window1).containsInAnyOrder(
homepage.getValue(),
productDetails1.getValue());

pipeline.run().waitUntilFinish();{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3300) Portable flattens in Go SDK Harness

2018-02-27 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde updated BEAM-3300:

Priority: Major  (was: Minor)

> Portable flattens in Go SDK Harness
> ---
>
> Key: BEAM-3300
> URL: https://issues.apache.org/jira/browse/BEAM-3300
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Major
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_JDBC #267

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 37.20 KB...]
[INFO] Excluding io.opencensus:opencensus-api:jar:0.7.0 from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-core:jar:3.1.2 from the shaded 
jar.
[INFO] Excluding com.google.protobuf:protobuf-java:jar:3.2.0 from the shaded 
jar.
[INFO] Excluding io.netty:netty-tcnative-boringssl-static:jar:1.1.33.Fork26 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding commons-codec:commons-codec:jar:1.3 from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-dataflow:jar:v1b3-rev221-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-clouddebugger:jar:v2-rev8-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-failsafe-plugin:2.20.1:integration-test (default) @ 
beam-sdks-java-io-jdbc ---
[INFO] Failsafe report directory: 

[INFO] parallel='all', perCoreThreadCount=true, threadCount=4, 
useUnlimitedThreads=false, threadCountSuites=0, threadCountClasses=0, 
threadCountMethods=0, parallelOptimized=true
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.001 s 
<<< FAILURE! - in org.apache.beam.sdk.io.jdbc.JdbcIOIT
[ERROR] org.apache.beam.sdk.io.jdbc.JdbcIOIT  Time elapsed: 0.001 s  <<< ERROR!
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:272)
at 
org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.(PgConnection.java:215)
at org.postgresql.Driver.makeConnection(Driver.java:404)
at org.postgresql.Driver.connect(Driver.java:272)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at 
org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:86)
at 
org.postgresql.ds.common.BaseDataSource.getConnection(BaseDataSource.java:71)
at 
org.apache.beam.sdk.io.common.DatabaseTestHelper.createTable(DatabaseTestHelper.java:46)
at org.apache.beam.sdk.io.jdbc.JdbcIOIT.setup(JdbcIOIT.java:84)

[beam-site] branch mergebot updated (422e81b -> d77143a)

2018-02-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 422e81b  This closes #304
 add e51719a  Prepare repository for deployment.
 new b4d09a2  Updates list of built-in IO transforms
 new d77143a  This closes #394

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/get-started/wordcount-example/index.html | 10 +-
 src/documentation/io/built-in.md | 25 +---
 2 files changed, 14 insertions(+), 21 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 02/02: This closes #394

2018-02-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit d77143ae759b81efb3a9376eaba7a93f681b3315
Merge: e51719a b4d09a2
Author: Mergebot 
AuthorDate: Tue Feb 27 09:50:12 2018 -0800

This closes #394

 src/documentation/io/built-in.md | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 01/02: Updates list of built-in IO transforms

2018-02-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit b4d09a28e9b0ebce48ea11fad5e73eb1e45df266
Author: Eugene Kirpichov 
AuthorDate: Mon Feb 26 12:23:56 2018 -0800

Updates list of built-in IO transforms
---
 src/documentation/io/built-in.md | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/src/documentation/io/built-in.md b/src/documentation/io/built-in.md
index bf5bb88..d8f42f3 100644
--- a/src/documentation/io/built-in.md
+++ b/src/documentation/io/built-in.md
@@ -24,11 +24,13 @@ Consult the [Programming Guide I/O section]({{site.baseurl 
}}/documentation/prog
 
   Java
   
-https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system;>Apache
 Hadoop File System
+Beam Java supports Apache HDFS, Amazon S3, Google Cloud Storage, and 
local filesystems.
+https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java;>FileIO
 (general-purpose reading, writing, and matching of files)
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java;>AvroIO
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java;>TextIO
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java;>TFRecordIO
-https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/;>XML
+https://github.com/apache/beam/blob/master/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlIO.java;>XmlIO
+https://github.com/apache/beam/blob/master/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java;>TikaIO
   
   
 https://github.com/apache/beam/tree/master/sdks/java/io/kinesis;>Amazon 
Kinesis
@@ -48,6 +50,7 @@ Consult the [Programming Guide I/O section]({{site.baseurl 
}}/documentation/prog
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery;>Google
 BigQuery
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable;>Google
 Cloud Bigtable
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore;>Google
 Cloud Datastore
+https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner;>Google
 Cloud Spanner
 https://github.com/apache/beam/tree/master/sdks/java/io/jdbc;>JDBC
 https://github.com/apache/beam/tree/master/sdks/java/io/mongodb;>MongoDB
 https://github.com/apache/beam/tree/master/sdks/java/io/redis;>Redis
@@ -56,9 +59,11 @@ Consult the [Programming Guide I/O section]({{site.baseurl 
}}/documentation/prog
 
   Python
   
+Beam Python supports Google Cloud Storage and local filesystems.
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/avroio.py;>avroio
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py;>textio
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/tfrecordio.py;>tfrecordio
+https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/vcfio.py;>vcfio
   
   
   
@@ -79,8 +84,8 @@ This table contains I/O transforms that are currently planned 
or in-progress. St
 NameLanguageJIRA
   
   
-Amazon S3 File SystemJava
-https://issues.apache.org/jira/browse/BEAM-2500;>BEAM-2500
+Apache HDFS supportPython
+https://issues.apache.org/jira/browse/BEAM-3099;>BEAM-3099
   
   
 Apache DistributedLogJava
@@ -99,18 +104,10 @@ This table contains I/O transforms that are currently 
planned or in-progress. St
 https://issues.apache.org/jira/browse/BEAM-1893;>BEAM-1893
   
   
-Google Cloud SpannerJava
-https://issues.apache.org/jira/browse/BEAM-1542;>BEAM-1542
-  
-  
 InfluxDBJava
 https://issues.apache.org/jira/browse/BEAM-2546;>BEAM-2546
   
   
-JSONJava
-https://issues.apache.org/jira/browse/BEAM-1581;>BEAM-1581
-  
-  
 MemcachedJava
 https://issues.apache.org/jira/browse/BEAM-1678;>BEAM-1678
   
@@ -126,8 +123,4 @@ This table contains I/O transforms that are currently 
planned or in-progress. St
 RestIOJava
 https://issues.apache.org/jira/browse/BEAM-1946;>BEAM-1946
   
-  
-TikaIOJava
-https://issues.apache.org/jira/browse/BEAM-2328;>BEAM-2328
-  
 

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[beam-site] 01/01: Prepare repository for deployment.

2018-02-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 1333c13e61849c00947efee7c4647959024b984a
Author: Mergebot 
AuthorDate: Tue Feb 27 09:57:00 2018 -0800

Prepare repository for deployment.
---
 content/documentation/io/built-in/index.html   | 25 --
 .../get-started/mobile-gaming-example/index.html   |  4 ++--
 2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/content/documentation/io/built-in/index.html 
b/content/documentation/io/built-in/index.html
index 4877274..272f241 100644
--- a/content/documentation/io/built-in/index.html
+++ b/content/documentation/io/built-in/index.html
@@ -211,11 +211,13 @@
 
   Java
   
-https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system;>Apache
 Hadoop File System
+Beam Java supports Apache HDFS, Amazon S3, Google Cloud Storage, and 
local filesystems.
+https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java;>FileIO
 (general-purpose reading, writing, and matching of files)
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java;>AvroIO
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java;>TextIO
 https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java;>TFRecordIO
-https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/;>XML
+https://github.com/apache/beam/blob/master/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlIO.java;>XmlIO
+https://github.com/apache/beam/blob/master/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java;>TikaIO
   
   
 https://github.com/apache/beam/tree/master/sdks/java/io/kinesis;>Amazon 
Kinesis
@@ -235,6 +237,7 @@
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery;>Google
 BigQuery
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable;>Google
 Cloud Bigtable
 https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/datastore;>Google
 Cloud Datastore
+https://github.com/apache/beam/tree/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner;>Google
 Cloud Spanner
 https://github.com/apache/beam/tree/master/sdks/java/io/jdbc;>JDBC
 https://github.com/apache/beam/tree/master/sdks/java/io/mongodb;>MongoDB
 https://github.com/apache/beam/tree/master/sdks/java/io/redis;>Redis
@@ -243,9 +246,11 @@
 
   Python
   
+Beam Python supports Google Cloud Storage and local filesystems.
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/avroio.py;>avroio
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/textio.py;>textio
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/tfrecordio.py;>tfrecordio
+https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/vcfio.py;>vcfio
   
   
   
@@ -266,8 +271,8 @@
 NameLanguageJIRA
   
   
-Amazon S3 File SystemJava
-https://issues.apache.org/jira/browse/BEAM-2500;>BEAM-2500
+Apache HDFS supportPython
+https://issues.apache.org/jira/browse/BEAM-3099;>BEAM-3099
   
   
 Apache DistributedLogJava
@@ -286,18 +291,10 @@
 https://issues.apache.org/jira/browse/BEAM-1893;>BEAM-1893
   
   
-Google Cloud SpannerJava
-https://issues.apache.org/jira/browse/BEAM-1542;>BEAM-1542
-  
-  
 InfluxDBJava
 https://issues.apache.org/jira/browse/BEAM-2546;>BEAM-2546
   
   
-JSONJava
-https://issues.apache.org/jira/browse/BEAM-1581;>BEAM-1581
-  
-  
 MemcachedJava
 https://issues.apache.org/jira/browse/BEAM-1678;>BEAM-1678
   
@@ -313,10 +310,6 @@
 RestIOJava
 https://issues.apache.org/jira/browse/BEAM-1946;>BEAM-1946
   
-  
-TikaIOJava
-https://issues.apache.org/jira/browse/BEAM-2328;>BEAM-2328
-  
 
 
   
diff --git a/content/get-started/mobile-gaming-example/index.html 
b/content/get-started/mobile-gaming-example/index.html
index 1d6a84a..8a63425 100644
--- a/content/get-started/mobile-gaming-example/index.html
+++ b/content/get-started/mobile-gaming-example/index.html
@@ -957,8 +957,8 @@ late results.
 sum_scores
 # Use the derived mean total score (global_mean_score) 
as a side input.
 | 'ProcessAndFilter' 
 beam.Filter(
-lambda (_, score), global_mean:\
-score  global_mean * self.SCORE_WEIGHT,
+lambda key_score, global_mean:\
+key_score[1]  global_mean * self.SCORE_WEIGHT,
   

[beam-site] branch asf-site updated (e51719a -> 1333c13)

2018-02-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from e51719a  Prepare repository for deployment.
 add b4d09a2  Updates list of built-in IO transforms
 add d77143a  This closes #394
 new 1333c13  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/io/built-in/index.html   | 25 --
 .../get-started/mobile-gaming-example/index.html   |  4 ++--
 src/documentation/io/built-in.md   | 25 --
 3 files changed, 20 insertions(+), 34 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
mergebot-r...@apache.org.


[jira] [Created] (BEAM-3753) Integration ITCase tests are not executed

2018-02-27 Thread JIRA
Grzegorz Kołakowski created BEAM-3753:
-

 Summary: Integration ITCase tests are not executed
 Key: BEAM-3753
 URL: https://issues.apache.org/jira/browse/BEAM-3753
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Grzegorz Kołakowski
Assignee: Aljoscha Krettek


The flink-runner {{*ITCase.java}} tests are not executed at all, either by 
surefire or by filesafe plugin.
 * org.apache.beam.runners.flink.ReadSourceStreamingITCase
 * org.apache.beam.runners.flink.ReadSourceITCase
 * org.apache.beam.runners.flink.streaming.TopWikipediaSessionsITCase

In addition, two of them fail if run manually for IDE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3754) KAFKA - Can't set commitOffsetsInFinalizeEnabled to false with KafkaIO.readBytes()

2018-02-27 Thread Benjamin BENOIST (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin BENOIST updated BEAM-3754:
---
Summary: KAFKA - Can't set commitOffsetsInFinalizeEnabled to false with 
KafkaIO.readBytes()  (was: KAFKA - Can't have commitOffsetsInFinalizeEnabled 
set to false with KafkaIO.readBytes())

> KAFKA - Can't set commitOffsetsInFinalizeEnabled to false with 
> KafkaIO.readBytes()
> --
>
> Key: BEAM-3754
> URL: https://issues.apache.org/jira/browse/BEAM-3754
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.3.0
> Environment: Dataflow pipeline using Kafka as a Sink
>Reporter: Benjamin BENOIST
>Assignee: Raghu Angadi
>Priority: Minor
>  Labels: patch
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Beam v2.3 introduces finalized offsets, in order to reduce the gaps or 
> duplicate processing of records while restarting a pipeline.
> _read()_ sets this parameter to false [by 
> default|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L307]
>  but _readBytes()_ 
> [doesn't|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L282],
>  thus creating an exception:
> {noformat}
> Exception in thread "main" java.lang.IllegalStateException: Missing required 
> properties: commitOffsetsInFinalizeEnabled
>      at 
> org.apache.beam.sdk.io.kafka.AutoValue_KafkaIO_Read$Builder.build(AutoValue_KafkaIO_Read.java:344)
>      at 
> org.apache.beam.sdk.io.kafka.KafkaIO.readBytes(KafkaIO.java:291){noformat}
> The parameter can be set to true with _commitOffsetsInFinalize()_ but never 
> to false.
> Using _read()_ in the definition of _readBytes()_ could prevent this kind of 
> error in the future:
> {code:java}
> public static Read readBytes() {
>   return read()
> .setKeyDeserializer(ByteArrayDeserializer.class)
> .setValueDeserializer(ByteArrayDeserializer.class)
> .build();
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3754) KAFKA - Can't have commitOffsetsInFinalizeEnabled set to false with KafkaIO.readBytes()

2018-02-27 Thread Benjamin BENOIST (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin BENOIST updated BEAM-3754:
---
Summary: KAFKA - Can't have commitOffsetsInFinalizeEnabled set to false 
with KafkaIO.readBytes()  (was: Can't have commitOffsetsInFinalizeEnabled set 
to false with KafkaIO.readBytes())

> KAFKA - Can't have commitOffsetsInFinalizeEnabled set to false with 
> KafkaIO.readBytes()
> ---
>
> Key: BEAM-3754
> URL: https://issues.apache.org/jira/browse/BEAM-3754
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-kafka
>Affects Versions: 2.3.0
> Environment: Dataflow pipeline using Kafka as a Sink
>Reporter: Benjamin BENOIST
>Assignee: Raghu Angadi
>Priority: Minor
>  Labels: patch
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Beam v2.3 introduces finalized offsets, in order to reduce the gaps or 
> duplicate processing of records while restarting a pipeline.
> _read()_ sets this parameter to false [by 
> default|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L307]
>  but _readBytes()_ 
> [doesn't|https://github.com/apache/beam/blob/release-2.3.0/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java#L282],
>  thus creating an exception:
> {noformat}
> Exception in thread "main" java.lang.IllegalStateException: Missing required 
> properties: commitOffsetsInFinalizeEnabled
>      at 
> org.apache.beam.sdk.io.kafka.AutoValue_KafkaIO_Read$Builder.build(AutoValue_KafkaIO_Read.java:344)
>      at 
> org.apache.beam.sdk.io.kafka.KafkaIO.readBytes(KafkaIO.java:291){noformat}
> The parameter can be set to true with _commitOffsetsInFinalize()_ but never 
> to false.
> Using _read()_ in the definition of _readBytes()_ could prevent this kind of 
> error in the future:
> {code:java}
> public static Read readBytes() {
>   return read()
> .setKeyDeserializer(ByteArrayDeserializer.class)
> .setValueDeserializer(ByteArrayDeserializer.class)
> .build();
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_HadoopInputFormat #4

2018-02-27 Thread Apache Jenkins Server
See 


--
GitHub pull request #4758 of commit 1477e0fd673291d74e11f11e3afdd1b9a9755410, 
no merge conflicts.
Setting status of 1477e0fd673291d74e11f11e3afdd1b9a9755410 to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_HadoopInputFormat/4/ and 
message: 'Build started sha1 is merged.'
Using context: Jenkins: Java HadoopInputFormatIO Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam5 (beam) in workspace 

Cloning the remote Git repository
Cloning repository https://github.com/apache/beam.git
 > git init 
 > 
 >  # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/4758/*:refs/remotes/origin/pr/4758/*
 > git rev-parse refs/remotes/origin/pr/4758/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/4758/merge^{commit} # timeout=10
Checking out Revision c47a3e69565743d261d81d790668f72282af982d 
(refs/remotes/origin/pr/4758/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c47a3e69565743d261d81d790668f72282af982d
Commit message: "Merge 1477e0fd673291d74e11f11e3afdd1b9a9755410 into 
a7501289a23e2d8a3a7e5a3b9f5e920702d4d634"
First time build. Skipping changelog.
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_HadoopInputFormat] $ /bin/bash -xe 
/tmp/jenkins5105158555849429356.sh
+ gcloud container clusters get-credentials io-datastores --zone=us-central1-a 
--verbosity=debug
DEBUG: Running [gcloud.container.clusters.get-credentials] with arguments: 
[--verbosity: "debug", --zone: "us-central1-a", NAME: "io-datastores"]
Fetching cluster endpoint and auth data.
DEBUG: Saved kubeconfig to /home/jenkins/.kube/config
kubeconfig entry generated for io-datastores.
INFO: Display format "default".
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_HadoopInputFormat] $ /bin/bash -xe 
/tmp/jenkins3411790566426570155.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_HadoopInputFormat] $ /bin/bash -xe 
/tmp/jenkins9191626784974588976.sh
+ kubectl 
--kubeconfig=
 create namespace hadoopinputformatioit-1519747836161
namespace "hadoopinputformatioit-1519747836161" created
[beam_PerformanceTests_HadoopInputFormat] $ /bin/bash -xe 
/tmp/jenkins6497745477934933814.sh
++ kubectl config current-context
+ kubectl 
--kubeconfig=
 set-context gke_apache-beam-testing_us-central1-a_io-datastores 
--namespace=hadoopinputformatioit-1519747836161
Error: unknown command "set-context" for "kubectl"
Run 'kubectl --help' for usage.
error: unknown command "set-context" for "kubectl"
Build step 'Execute shell' marked build as failure


Build failed in Jenkins: beam_PostCommit_Python_Verify #4323

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 1.02 MB...]
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_fast.pyx -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_slow.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/testing/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/util_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/data/standard_coders.yaml -> 
apache-beam-2.4.0.dev0/apache_beam/testing/data
copying apache_beam/testing/data/trigger_transcripts.yaml -> 
apache-beam-2.4.0.dev0/apache_beam/testing/data
copying apache_beam/transforms/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/combiners.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/combiners_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/core.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/create_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/cy_combiners.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/cy_combiners.py -> 

[jira] [Created] (BEAM-3757) Shuffle read failed using python 2.2.0

2018-02-27 Thread Jonathan Delfour (JIRA)
Jonathan Delfour created BEAM-3757:
--

 Summary: Shuffle read failed using python 2.2.0
 Key: BEAM-3757
 URL: https://issues.apache.org/jira/browse/BEAM-3757
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Affects Versions: 2.2.0
 Environment: gcp, macos
Reporter: Jonathan Delfour
Assignee: Thomas Groh


Hi,

First issue is that the beam 2.3.0 python SDK is apparently not working on GCP. 
It gets stuck: 
{noformat}
Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be stuck. 
You can get help with Cloud Dataflow at 
https://cloud.google.com/dataflow/support. 
{noformat}
I tried two times.

Reverting back to 2.2.0: it usually works but today, after > 1 hour of 
processing, and 30 workers used, I get a failure with these in the logs:

{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", 
line 582, in do_work
work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", 
line 167, in execute
op.start()
  File "dataflow_worker/shuffle_operations.py", line 49, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
def start(self):
  File "dataflow_worker/shuffle_operations.py", line 50, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.scoped_start_state:
  File "dataflow_worker/shuffle_operations.py", line 65, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.shuffle_source.reader() as reader:
  File "dataflow_worker/shuffle_operations.py", line 67, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
for key_values in reader:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 406, in __iter__
for entry in entries_iterator:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 248, in next
return next(self.iterator)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 206, in __iter__
chunk, next_position = self.reader.Read(start_position, end_position)
  File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in 
shuffle_client.PyShuffleReader.Read
IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2  
talking to my-dataflow-02271107-756f-harness-2p65:12346
{noformat}

i also get some information message:

{noformat}
Refusing to split  at '\x00\x00\x00\t\x1d\x14\x87\xa3\x00\x01': unstarted
{noformat}

For the flow, I am extracting data from BQ, cleaning using pandas, exporting as 
a csv file, gzipping and uploading the compressed file to a bucket using 
decompressive transcoding (csv export, gzip compression and upload are in the 
same 'worker' as they are done in the same beam.DoFn).




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-2421) Migrate Apache Beam to use impulse primitive as the only root primitive

2018-02-27 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379291#comment-16379291
 ] 

Kenneth Knowles commented on BEAM-2421:
---

This is also not done for Python, yes?

> Migrate Apache Beam to use impulse primitive as the only root primitive
> ---
>
> Key: BEAM-2421
> URL: https://issues.apache.org/jira/browse/BEAM-2421
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The impulse source emits a single byte array element within the global window.
> This is in preparation for using SDF as the replacement for different bounded 
> and unbounded source implementations.
> Before:
> Read(Source) -> ParDo -> ...
> Impulse -> SDF(Source) -> ParDo -> ...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3756) Update SpannerIO to use Batch API

2018-02-27 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-3756:


 Summary: Update SpannerIO to use Batch API
 Key: BEAM-3756
 URL: https://issues.apache.org/jira/browse/BEAM-3756
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Affects Versions: 2.4.0
Reporter: Chamikara Jayalath
Assignee: Mairbek Khadikov


Active pull requests:

[https://github.com/apache/beam/pull/4752]

[https://github.com/apache/beam/pull/4727]

[https://github.com/apache/beam/pull/4707]

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3757) Shuffle read failed using python 2.2.0

2018-02-27 Thread Jonathan Delfour (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Delfour updated BEAM-3757:
---
Description: 
Hi,

First issue is that the beam 2.3.0 python SDK is apparently not working on GCP. 
It gets stuck: 
{noformat}
Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be stuck. 
You can get help with Cloud Dataflow at 
https://cloud.google.com/dataflow/support. 
{noformat}
I tried two times.

Reverting back to 2.2.0: it usually works but today, after > 1 hour of 
processing, and 30 workers used, I get a failure with these in the logs:

{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", 
line 582, in do_work
work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", 
line 167, in execute
op.start()
  File "dataflow_worker/shuffle_operations.py", line 49, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
def start(self):
  File "dataflow_worker/shuffle_operations.py", line 50, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.scoped_start_state:
  File "dataflow_worker/shuffle_operations.py", line 65, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.shuffle_source.reader() as reader:
  File "dataflow_worker/shuffle_operations.py", line 67, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
for key_values in reader:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 406, in __iter__
for entry in entries_iterator:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 248, in next
return next(self.iterator)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 206, in __iter__
chunk, next_position = self.reader.Read(start_position, end_position)
  File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in 
shuffle_client.PyShuffleReader.Read
IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2  
talking to my-dataflow-02271107-756f-harness-2p65:12346
{noformat}

i also get some information message:

{noformat}
Refusing to split  at '\x00\x00\x00\t\x1d\x14\x87\xa3\x00\x01': unstarted
{noformat}

For the flow, I am extracting data from BQ, cleaning using pandas, exporting as 
a csv file, gzipping and uploading the compressed file to a bucket using 
decompressive transcoding (csv export, gzip compression and upload are in the 
same 'worker' as they are done in the same beam.DoFn).

PS: i can't find a reasonable way to export the logs from GCP but i can 
privately send the log file i have of the run on my machine (the log of the 
pipeline)


  was:
Hi,

First issue is that the beam 2.3.0 python SDK is apparently not working on GCP. 
It gets stuck: 
{noformat}
Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be stuck. 
You can get help with Cloud Dataflow at 
https://cloud.google.com/dataflow/support. 
{noformat}
I tried two times.

Reverting back to 2.2.0: it usually works but today, after > 1 hour of 
processing, and 30 workers used, I get a failure with these in the logs:

{noformat}
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", 
line 582, in do_work
work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", 
line 167, in execute
op.start()
  File "dataflow_worker/shuffle_operations.py", line 49, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
def start(self):
  File "dataflow_worker/shuffle_operations.py", line 50, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.scoped_start_state:
  File "dataflow_worker/shuffle_operations.py", line 65, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
with self.shuffle_source.reader() as reader:
  File "dataflow_worker/shuffle_operations.py", line 67, in 
dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
for key_values in reader:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 406, in __iter__
for entry in entries_iterator:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 248, in next
return next(self.iterator)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
line 206, in __iter__
chunk, next_position = self.reader.Read(start_position, end_position)
  File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in 
shuffle_client.PyShuffleReader.Read
IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2  
talking to my-dataflow-02271107-756f-harness-2p65:12346

[beam] 01/01: Merge pull request #4756: Makes it possible to use Wait with default windowing, eg. in batch pipelines

2018-02-27 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 71e2526ee092fd535960dd76afa6420ce01a9e90
Merge: a750128 5a75dce
Author: Eugene Kirpichov 
AuthorDate: Tue Feb 27 11:33:20 2018 -0800

Merge pull request #4756: Makes it possible to use Wait with default 
windowing, eg. in batch pipelines

Makes it possible to use Wait with default windowing, eg. in batch pipelines

 .../java/org/apache/beam/sdk/transforms/Wait.java  |  2 +-
 .../org/apache/beam/sdk/transforms/WaitTest.java   | 69 +++---
 2 files changed, 49 insertions(+), 22 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
j...@apache.org.


[beam] branch master updated (a750128 -> 71e2526)

2018-02-27 Thread jkff
This is an automated email from the ASF dual-hosted git repository.

jkff pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from a750128  fix type error
 add 5a75dce  Makes it possible to use Wait with default windowing, eg. in 
batch pipelines
 new 71e2526  Merge pull request #4756: Makes it possible to use Wait with 
default windowing, eg. in batch pipelines

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../java/org/apache/beam/sdk/transforms/Wait.java  |  2 +-
 .../org/apache/beam/sdk/transforms/WaitTest.java   | 69 +++---
 2 files changed, 49 insertions(+), 22 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
j...@apache.org.


Build failed in Jenkins: beam_PerformanceTests_Python #964

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 502 B...]
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision a7501289a23e2d8a3a7e5a3b9f5e920702d4d634 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a7501289a23e2d8a3a7e5a3b9f5e920702d4d634
Commit message: "fix type error"
 > git rev-list --no-walk a7501289a23e2d8a3a7e5a3b9f5e920702d4d634 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3639786825573596181.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2272559313551319328.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins9164780249243699752.sh
+ virtualenv .env --system-site-packages
New python executable in .env/bin/python
Installing setuptools, pip...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3964490315235906637.sh
+ .env/bin/pip install --upgrade setuptools pip
Downloading/unpacking setuptools from 
https://pypi.python.org/packages/43/41/033a273f9a25cb63050a390ee8397acbc7eae2159195d85f06f17e7be45a/setuptools-38.5.1-py2.py3-none-any.whl#md5=908b8b5e50bf429e520b2b5fa1b350e5
Downloading/unpacking pip from 
https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 2.2
Uninstalling setuptools:
  Successfully uninstalled setuptools
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed setuptools pip
Cleaning up...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4321407467292949064.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1663688866075405379.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy==1.13.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: 

[jira] [Updated] (BEAM-3756) Update SpannerIO to use Batch API

2018-02-27 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-3756:
-
Affects Version/s: (was: 2.4.0)
   Not applicable

> Update SpannerIO to use Batch API
> -
>
> Key: BEAM-3756
> URL: https://issues.apache.org/jira/browse/BEAM-3756
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Critical
> Fix For: 2.4.0
>
>
> Active pull requests:
> [https://github.com/apache/beam/pull/4752]
> [https://github.com/apache/beam/pull/4727]
> [https://github.com/apache/beam/pull/4707]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3756) Update SpannerIO to use Batch API

2018-02-27 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-3756:
-
Fix Version/s: 2.4.0

> Update SpannerIO to use Batch API
> -
>
> Key: BEAM-3756
> URL: https://issues.apache.org/jira/browse/BEAM-3756
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Chamikara Jayalath
>Assignee: Mairbek Khadikov
>Priority: Critical
> Fix For: 2.4.0
>
>
> Active pull requests:
> [https://github.com/apache/beam/pull/4752]
> [https://github.com/apache/beam/pull/4727]
> [https://github.com/apache/beam/pull/4707]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Spark #1408

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 85.38 KB...]
'apache-beam-testing:bqjob_r3e8bd19c4e104c58_0161d8817556_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 18:23:40,694 1da4f676 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 18:23:58,938 1da4f676 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 18:24:01,954 1da4f676 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.99s,  CPU:0.34s,  MaxMemory:25364kb 
STDOUT: Upload complete.
Waiting on bqjob_r7aab590498520260_0161d881c887_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r7aab590498520260_0161d881c887_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r7aab590498520260_0161d881c887_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 18:24:01,955 1da4f676 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 18:24:29,746 1da4f676 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 18:24:33,622 1da4f676 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:03.86s,  CPU:0.41s,  MaxMemory:25576kb 
STDOUT: Upload complete.
Waiting on bqjob_rf31d13f455731_0161d8824316_1 ... (0s) Current status: 
RUNNING 
  Waiting on bqjob_rf31d13f455731_0161d8824316_1 ... (0s) Current 
status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_rf31d13f455731_0161d8824316_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 18:24:33,622 1da4f676 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 18:25:02,218 1da4f676 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 18:25:05,801 1da4f676 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:03.56s,  CPU:0.40s,  MaxMemory:25452kb 
STDOUT: Upload complete.
Waiting on bqjob_r6c337fd27be2f7ad_0161d882c27d_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r6c337fd27be2f7ad_0161d882c27d_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1005

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 117.36 KB...]
raise TimedOutException()
TimedOutException: 'test_empty_singleton_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 305, in test_flattened_side_input
assert_that(results, equal_to(['a', 'b']))
  File 
"
 line 152, in assert_that
actual | AssertThat()  # pylint: disable=expression-not-assigned
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 147, in expand
| "Match" >> Map(matcher))
  File 
"
 line 820, in __ror__
return self.transform.__ror__(pvalueish, self.label)
  File 
"
 line 488, in __ror__
result = p.apply(self, pvalueish, label)
  File 
"
 line 443, in apply
return self.apply(transform, pvalueish)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 165, in expand
| Map(_merge_tagged_vals_under_key, result_ctor, result_ctor_arg))
  File 
"
 line 488, in __ror__
result = p.apply(self, pvalueish, label)
  File 
"
 line 479, in apply

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1002

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 120.12 KB...]
==
ERROR: test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 305, in test_flattened_side_input
assert_that(results, equal_to(['a', 'b']))
  File 
"
 line 152, in assert_that
actual | AssertThat()  # pylint: disable=expression-not-assigned
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 147, in expand
| "Match" >> Map(matcher))
  File 
"
 line 820, in __ror__
return self.transform.__ror__(pvalueish, self.label)
  File 
"
 line 488, in __ror__
result = p.apply(self, pvalueish, label)
  File 
"
 line 443, in apply
return self.apply(transform, pvalueish)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 165, in expand
| Map(_merge_tagged_vals_under_key, result_ctor, result_ctor_arg))
  File 
"
 line 988, in Map
pardo = FlatMap(wrapper, *args, **kwargs)
  File 
"
 line 939, in FlatMap
pardo = ParDo(CallableWrapperDoFn(fn), *args, **kwargs)
  File 
"
 line 781, in __init__
super(ParDo, 

[beam] 01/02: @Parameter annotation does not work for UDFs in Beam SQL

2018-02-27 Thread xumingming
This is an automated email from the ASF dual-hosted git repository.

xumingming pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 880caef3fc17295f817648ccdffe369c2dc63fa1
Author: mingmxu 
AuthorDate: Mon Feb 26 15:42:55 2018 -0800

@Parameter annotation does not work for UDFs in Beam SQL
---
 .../apache/beam/sdk/extensions/sql/BeamSqlUdf.java |  2 +-
 .../sql/impl/interpreter/BeamSqlFnExecutor.java|  4 +++
 .../operator/BeamSqlDefaultExpression.java}| 38 ++
 .../sdk/extensions/sql/BeamSqlDslUdfUdafTest.java  | 26 ++-
 4 files changed, 47 insertions(+), 23 deletions(-)

diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
index 91bad20..5df046a 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
@@ -35,7 +35,7 @@ import org.apache.beam.sdk.annotations.Experimental;
  * }
  *
  * The first parameter is named "s" and is mandatory,
- * and the second parameter is named "n" and is optional.
+ * and the second parameter is named "n" and is optional(always NULL if not 
specified).
  */
 @Experimental
 public interface BeamSqlUdf extends Serializable {
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
index ae65c2b..dbdd03d 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/BeamSqlFnExecutor.java
@@ -23,6 +23,7 @@ import java.util.Calendar;
 import java.util.List;
 import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlCaseExpression;
 import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlCastExpression;
+import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlDefaultExpression;
 import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlExpression;
 import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlInputRefExpression;
 import 
org.apache.beam.sdk.extensions.sql.impl.interpreter.operator.BeamSqlPrimitive;
@@ -383,6 +384,9 @@ public class BeamSqlFnExecutor implements 
BeamSqlExpressionExecutor {
 case "DATETIME_PLUS":
   return new BeamSqlDatetimePlusExpression(subExps);
 
+//DEFAULT keyword for UDF with optional parameter
+case "DEFAULT":
+  return new BeamSqlDefaultExpression();
 
 case "CASE":
   ret = new BeamSqlCaseExpression(subExps);
diff --git 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/BeamSqlDefaultExpression.java
similarity index 52%
copy from 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
copy to 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/BeamSqlDefaultExpression.java
index 91bad20..0557600 100644
--- 
a/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamSqlUdf.java
+++ 
b/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/BeamSqlDefaultExpression.java
@@ -15,29 +15,25 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-package org.apache.beam.sdk.extensions.sql;
+package org.apache.beam.sdk.extensions.sql.impl.interpreter.operator;
 
-import java.io.Serializable;
-import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.values.Row;
+import org.apache.calcite.sql.type.SqlTypeName;
 
 /**
- * Interface to create a UDF in Beam SQL.
- *
- * A static method {@code eval} is required. Here is an example:
- *
- * 
- * public static class MyLeftFunction {
- *   public String eval(
- *   Parameter(name = "s") String s,
- *   Parameter(name = "n", optional = true) Integer n) {
- * return s.substring(0, n == null ? 1 : n);
- *   }
- * }
- *
- * The first parameter is named "s" and is mandatory,
- * and the second parameter is named "n" and is optional.
+ * DEFAULT keyword for UDF with optional parameter.
  */
-@Experimental
-public interface BeamSqlUdf extends Serializable {
-  String UDF_METHOD = "eval";
+public class BeamSqlDefaultExpression extends BeamSqlExpression {
+
+  @Override
+  public boolean 

[beam] branch master updated (8a39c80 -> a750128)

2018-02-27 Thread xumingming
This is an automated email from the ASF dual-hosted git repository.

xumingming pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 8a39c80  Merge pull request #4695: Add To/From Proto Round Trip for 
ExecutableStage
 new 880caef  @Parameter annotation does not work for UDFs in Beam SQL
 new a750128  fix type error

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache/beam/sdk/extensions/sql/BeamSqlUdf.java |  2 +-
 .../sql/impl/interpreter/BeamSqlFnExecutor.java|  4 
 ...pression.java => BeamSqlDefaultExpression.java} | 15 +++--
 .../sdk/extensions/sql/BeamSqlDslUdfUdafTest.java  | 26 +-
 4 files changed, 33 insertions(+), 14 deletions(-)
 copy 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/interpreter/operator/{BeamSqlInputRefExpression.java
 => BeamSqlDefaultExpression.java} (74%)

-- 
To stop receiving notification emails like this one, please contact
xumingm...@apache.org.


[beam] 02/02: fix type error

2018-02-27 Thread xumingming
This is an automated email from the ASF dual-hosted git repository.

xumingming pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit a7501289a23e2d8a3a7e5a3b9f5e920702d4d634
Author: mingmxu 
AuthorDate: Mon Feb 26 23:39:08 2018 -0800

fix type error
---
 .../java/org/apache/beam/sdk/extensions/sql/BeamSqlDslUdfUdafTest.java  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslUdfUdafTest.java
 
b/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslUdfUdafTest.java
index 01e45c1..4589d39 100644
--- 
a/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslUdfUdafTest.java
+++ 
b/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslUdfUdafTest.java
@@ -98,7 +98,7 @@ public class BeamSqlDslUdfUdafTest extends BeamSqlDslBase {
 
 RowType subStrRowType = RowSqlType.builder()
 .withIntegerField("f_int")
-.withIntegerField("sub_string")
+.withVarcharField("sub_string")
 .build();
 Row subStrRow = Row.withRowType(subStrRowType).addValues(2, "s").build();
 PAssert.that(result3).containsInAnyOrder(subStrRow);

-- 
To stop receiving notification emails like this one, please contact
xumingm...@apache.org.


[jira] [Assigned] (BEAM-3043) Set user-specified Transform names on Flink operations

2018-02-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grzegorz Kołakowski reassigned BEAM-3043:
-

Assignee: Grzegorz Kołakowski

> Set user-specified Transform names on Flink operations
> --
>
> Key: BEAM-3043
> URL: https://issues.apache.org/jira/browse/BEAM-3043
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Grzegorz Kołakowski
>Priority: Major
>
> Currently, we don't always set a name on the generated operations or we set 
> the wrong name. For example, in the batch translation we set the result of 
> {{PTransform.getName()}} as the name, which is only the name of the 
> {{PTransform}} itself, not the name that the user specified when creating a 
> Pipeline.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-2873) Detect number of shards for file sink in Flink Streaming Runner

2018-02-27 Thread Dawid Wysakowicz (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz reassigned BEAM-2873:
--

Assignee: Dawid Wysakowicz

> Detect number of shards for file sink in Flink Streaming Runner
> ---
>
> Key: BEAM-2873
> URL: https://issues.apache.org/jira/browse/BEAM-2873
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Dawid Wysakowicz
>Priority: Major
>
> [~reuvenlax] mentioned that this is done for the Dataflow Runner and the 
> default behaviour on Flink can be somewhat surprising for users.
> ML entry: https://www.mail-archive.com/dev@beam.apache.org/msg02665.html:
> This is how the file sink has always worked in Beam. If no sharding is 
> specified, then this means runner-determined sharding, and by default that is 
> one file per bundle. If Flink has small bundles, then I suggest using the 
> withNumShards method to explicitly pick the number of output shards.
> The Flink runner can detect that runner-determined sharding has been chosen, 
> and override it with a specific number of shards. For example, the Dataflow 
> streaming runner (which as you mentioned also has small bundles) detects this 
> case and sets the number of out files shards based on the number of workers 
> in the worker pool 
> [Here|https://github.com/apache/beam/blob/9e6530adb00669b7cf0f01cb8b128be0a21fd721/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L354]
>  is the code that does this; it should be quite simple to do something 
> similar for Flink, and then there will be no need for users to explicitly 
> call withNumShards themselves.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_Verify #4320

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 1.02 MB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying 

Build failed in Jenkins: beam_PerformanceTests_Spark #1407

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[xumingmingv] @Parameter annotation does not work for UDFs in Beam SQL

[xumingmingv] fix type error

--
[...truncated 88.62 KB...]
'apache-beam-testing:bqjob_r677381b259c0359c_0161d733a67b_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 12:19:04,044 c47966a4 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 12:19:20,513 c47966a4 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 12:19:23,479 c47966a4 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.96s,  CPU:0.25s,  MaxMemory:25576kb 
STDOUT: Upload complete.
Waiting on bqjob_r5e2cf29b85de003_0161d733f05a_1 ... (0s) Current status: 
RUNNING 
Waiting on bqjob_r5e2cf29b85de003_0161d733f05a_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5e2cf29b85de003_0161d733f05a_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 12:19:23,479 c47966a4 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 12:19:50,164 c47966a4 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 12:19:53,470 c47966a4 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:03.30s,  CPU:0.24s,  MaxMemory:25260kb 
STDOUT: Upload complete.
Waiting on bqjob_r4b998bdd0c9c9bbb_0161d7346807_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r4b998bdd0c9c9bbb_0161d7346807_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r4b998bdd0c9c9bbb_0161d7346807_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-27 12:19:53,471 c47966a4 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-27 12:20:15,920 c47966a4 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-27 12:20:18,038 c47966a4 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.11s,  CPU:0.24s,  MaxMemory:25552kb 
STDOUT: Upload complete.
Waiting on bqjob_r56162598c0723c0e_0161d734c8b8_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r56162598c0723c0e_0161d734c8b8_1 ... (0s) 

[jira] [Assigned] (BEAM-2393) BoundedSource is not fault-tolerant in FlinkRunner Streaming mode

2018-02-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grzegorz Kołakowski reassigned BEAM-2393:
-

Assignee: Grzegorz Kołakowski  (was: Jingsong Lee)

> BoundedSource is not fault-tolerant in FlinkRunner Streaming mode
> -
>
> Key: BEAM-2393
> URL: https://issues.apache.org/jira/browse/BEAM-2393
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Jingsong Lee
>Assignee: Grzegorz Kołakowski
>Priority: Major
>
> {{BoundedSourceWrapper}} does not implement snapshot() and restore(), when 
> the failure to restart, it will send duplicate data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1003

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[xumingmingv] @Parameter annotation does not work for UDFs in Beam SQL

[xumingmingv] fix type error

--
[...truncated 115.48 KB...]
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 305, in test_flattened_side_input
assert_that(results, equal_to(['a', 'b']))
  File 
"
 line 152, in assert_that
actual | AssertThat()  # pylint: disable=expression-not-assigned
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 143, in expand
| "ToVoidKey" >> Map(lambda v: (None, v)))
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 475, in apply
type_options = self._options.view_as(TypeOptions)
  File 
"
 line 227, in view_as
view = cls(self._flags)
  File 
"
 line 150, in __init__
parser = _BeamArgumentParser()
  File "/usr/lib/python2.7/argparse.py", line 1602, in __init__
help=_('show this help message and exit'))
  File "/usr/lib/python2.7/gettext.py", line 581, in gettext
return dgettext(_current_domain, message)
  File "/usr/lib/python2.7/gettext.py", line 545, in dgettext
codeset=_localecodesets.get(domain))
  File "/usr/lib/python2.7/gettext.py", line 480, in translation
mofiles = find(domain, localedir, languages, all=1)
  File "/usr/lib/python2.7/gettext.py", line 437, in find
for nelang in _expand_lang(lang):
  File "/usr/lib/python2.7/gettext.py", line 132, in _expand_lang
locale = normalize(locale)
  File 
"
 line 430, in normalize
return _replace_encoding(code, encoding)
  File 
"
 line 353, in _replace_encoding
norm_encoding = encodings.normalize_encoding(encoding)
  File 
"
 line 69, in normalize_encoding
return '_'.join(encoding.translate(_norm_encoding_map).split())
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_flattened_side_input 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4321

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[xumingmingv] @Parameter annotation does not work for UDFs in Beam SQL

[xumingmingv] fix type error

--
[...truncated 1.02 MB...]
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying 

Build failed in Jenkins: beam_PerformanceTests_JDBC #266

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[xumingmingv] @Parameter annotation does not work for UDFs in Beam SQL

[xumingmingv] fix type error

--
[...truncated 52.12 KB...]
at 
com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:74)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:481)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:392)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:374)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
at 
com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:158)
at 
com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:308)
at 
com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:264)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:133)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:113)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:100)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
(7775af73db932b1b): java.lang.RuntimeException: 
org.apache.beam.sdk.util.UserCodeException: org.postgresql.util.PSQLException: 
The connection attempt failed.
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:404)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:374)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
at 
com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:158)
at 
com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:308)
at 
com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:264)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:133)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:113)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:100)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: 
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at 
org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn$DoFnInvoker.invokeSetup(Unknown
 Source)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:63)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:45)
at 
com.google.cloud.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:94)
at 
com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:74)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:481)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:392)
... 14 more
Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:272)
at 

[jira] [Updated] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner

2018-02-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grzegorz Kołakowski updated BEAM-3089:
--
Attachment: flink-ui-parallelism.png

> Issue with setting the parallelism at client level using Flink runner
> -
>
> Key: BEAM-3089
> URL: https://issues.apache.org/jira/browse/BEAM-3089
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0
> Environment: I am using Flink 1.2.1 running on Docker, with Task 
> Managers distributed across different VMs as part of a Docker Swarm.
>Reporter: Thalita Vergilio
>Assignee: Aljoscha Krettek
>Priority: Major
>  Labels: docker, flink, parallel-deployment
> Attachments: flink-ui-parallelism.png
>
>
> When uploading an Apache Beam application using the Flink Web UI, the 
> parallelism set at job submission doesn't get picked up. The same happens 
> when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it 
> works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the 
> org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks 
> for Flink's GlobalConfiguration, which may not pick up runtime values passed 
> to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to 
> change the parallelism dynamically, so the programmatic approach won't really 
> work for me, nor will setting the Flink configuration at system level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-3441) Allow ValueProvider for JdbcIO.DataSourceConfiguration

2018-02-27 Thread Sameer Abhyankar (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sameer Abhyankar closed BEAM-3441.
--
   Resolution: Fixed
Fix Version/s: 2.4.0

> Allow ValueProvider for JdbcIO.DataSourceConfiguration
> --
>
> Key: BEAM-3441
> URL: https://issues.apache.org/jira/browse/BEAM-3441
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Sameer Abhyankar
>Assignee: Sameer Abhyankar
>Priority: Major
> Fix For: 2.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently JdbcIO only supports ValueProviders for queries but not for the 
> DataSourceConfiguration itself (i.e. driverClassName, url, username, password 
> etc.) These should support ValueProviders to allow the use of JdbcIO in 
> templates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (71e2526 -> 37e7c53)

2018-02-27 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 71e2526  Merge pull request #4756: Makes it possible to use Wait with 
default windowing, eg. in batch pipelines
 add b444c6e  Add MetricsTranslation
 add 37e7c53  Merge pull request #4734: [BEAM-1866] Add MetricsTranslation

No new revisions were added by this update.

Summary of changes:
 runners/core-java/pom.xml  |   5 +
 .../runners/core/metrics/MetricsTranslation.java   | 138 ++
 .../core/metrics/MetricsTranslationTest.java   | 157 +
 3 files changed, 300 insertions(+)
 create mode 100644 
runners/core-java/src/main/java/org/apache/beam/runners/core/metrics/MetricsTranslation.java
 create mode 100644 
runners/core-java/src/test/java/org/apache/beam/runners/core/metrics/MetricsTranslationTest.java

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


Jenkins build is back to normal : beam_PerformanceTests_TextIOIT #210

2018-02-27 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-3089) Issue with setting the parallelism at client level using Flink runner

2018-02-27 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grzegorz Kołakowski reassigned BEAM-3089:
-

Assignee: Grzegorz Kołakowski  (was: Aljoscha Krettek)

> Issue with setting the parallelism at client level using Flink runner
> -
>
> Key: BEAM-3089
> URL: https://issues.apache.org/jira/browse/BEAM-3089
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.0.0
> Environment: I am using Flink 1.2.1 running on Docker, with Task 
> Managers distributed across different VMs as part of a Docker Swarm.
>Reporter: Thalita Vergilio
>Assignee: Grzegorz Kołakowski
>Priority: Major
>  Labels: docker, flink, parallel-deployment
> Attachments: flink-ui-parallelism.png
>
>
> When uploading an Apache Beam application using the Flink Web UI, the 
> parallelism set at job submission doesn't get picked up. The same happens 
> when submitting a job using the Flink CLI.
> In both cases, the parallelism ends up defaulting to 1.
> When I set the parallelism programmatically within the Apache Beam code, it 
> works: {{flinkPipelineOptions.setParallelism(4);}}
> I suspect the root of the problem may be in the 
> org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks 
> for Flink's GlobalConfiguration, which may not pick up runtime values passed 
> to Flink, then defaults to 1 if it doesn't find anything.
> Any ideas on how this could be fixed or worked around? I need to be able to 
> change the parallelism dynamically, so the programmatic approach won't really 
> work for me, nor will setting the Flink configuration at system level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3409) Unexpected behavior of DoFn teardown method running in unit tests

2018-02-27 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379317#comment-16379317
 ] 

Reuven Lax commented on BEAM-3409:
--

So teardown isn't being called, it's just that waitUntilFinish isn't properly 
waiting for tearDown in direct runner. Is that correct?

> Unexpected behavior of DoFn teardown method running in unit tests 
> --
>
> Key: BEAM-3409
> URL: https://issues.apache.org/jira/browse/BEAM-3409
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.3.0
>Reporter: Alexey Romanenko
>Assignee: Romain Manni-Bucau
>Priority: Blocker
>  Labels: test
> Fix For: 2.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Writing a unit test, I found out a strange behaviour of Teardown method of 
> DoFn implementation when I run this method in unit tests using TestPipeline.
> To be more precise, it doesn’t wait until teardown() method will be finished, 
> it just exits from this method after about 1 sec (on my machine) even if it 
> should take longer (very simple example - running infinite loop inside this 
> method or put thread in sleep). In the same time, when I run the same code 
> from main() with ordinary Pipeline and direct runner, then it’s ok and it 
> works as expected - teardown() method will be performed completely despite 
> how much time it will take.
> I created two test cases to reproduce this issue - the first one to run with 
> main() and the second one to run with junit. They use the same implementation 
> of DoFn (class LongTearDownFn) and expects that teardown method will be 
> running at least for SLEEP_TIME ms. In case of running as junit test it's not 
> a case (see output log).
> - run with main()
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/main/java/TearDown.java
> - run with junit
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/test/java/TearDownTest.java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Spark #1409

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 92.49 KB...]
'apache-beam-testing:bqjob_r3ee00e120b2fd54a_0161d9c68e32_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-28 00:18:47,258 2c970c09 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 00:19:05,098 2c970c09 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 00:19:07,426 2c970c09 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.31s,  CPU:0.34s,  MaxMemory:25556kb 
STDOUT: Upload complete.
Waiting on bqjob_r45c06d608fa0b5b8_0161d9c6e288_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r45c06d608fa0b5b8_0161d9c6e288_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r45c06d608fa0b5b8_0161d9c6e288_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-28 00:19:07,426 2c970c09 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 00:19:33,730 2c970c09 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 00:19:35,990 2c970c09 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.25s,  CPU:0.35s,  MaxMemory:25508kb 
STDOUT: Upload complete.
Waiting on bqjob_r5544eb30cce70f1c_0161d9c7523f_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r5544eb30cce70f1c_0161d9c7523f_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r5544eb30cce70f1c_0161d9c7523f_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-02-28 00:19:35,990 2c970c09 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 00:19:52,390 2c970c09 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 00:19:54,525 2c970c09 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.12s,  CPU:0.30s,  MaxMemory:25488kb 
STDOUT: Upload complete.
Waiting on bqjob_r4b65e0c396946fff_0161d9c79b1d_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r4b65e0c396946fff_0161d9c79b1d_1 ... (0s) 
Current status: 

Build failed in Jenkins: beam_PerformanceTests_Python #965

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 1.42 KB...]
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3677894520140205890.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins94366463883531681.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1998036715127430317.sh
+ virtualenv .env --system-site-packages
New python executable in .env/bin/python
Installing setuptools, pip...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2998084260037608644.sh
+ .env/bin/pip install --upgrade setuptools pip
Downloading/unpacking setuptools from 
https://pypi.python.org/packages/43/41/033a273f9a25cb63050a390ee8397acbc7eae2159195d85f06f17e7be45a/setuptools-38.5.1-py2.py3-none-any.whl#md5=908b8b5e50bf429e520b2b5fa1b350e5
Downloading/unpacking pip from 
https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 2.2
Uninstalling setuptools:
  Successfully uninstalled setuptools
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed setuptools pip
Cleaning up...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7662529008930981844.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6721449640690013311.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 

Build failed in Jenkins: beam_PerformanceTests_JDBC #269

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 683.54 KB...]
at 
com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:74)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:481)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:392)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:374)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
at 
com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:158)
at 
com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:308)
at 
com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:264)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:133)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:113)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:100)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
(3e4b1ae52e513069): java.lang.RuntimeException: 
org.apache.beam.sdk.util.UserCodeException: org.postgresql.util.PSQLException: 
The connection attempt failed.
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:404)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:374)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
at 
com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
at 
com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:158)
at 
com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:308)
at 
com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:264)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:133)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:113)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:100)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: 
org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
at 
org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn$DoFnInvoker.invokeSetup(Unknown
 Source)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:63)
at 
com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:45)
at 
com.google.cloud.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:94)
at 
com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:74)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:481)
at 
com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:392)
... 14 more
Caused by: org.postgresql.util.PSQLException: The connection attempt failed.
at 
org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:272)
at 

[jira] [Created] (BEAM-3758) Migrate Python SDK Read transform to be Impulse->SDF

2018-02-27 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-3758:
-

 Summary: Migrate Python SDK Read transform to be Impulse->SDF
 Key: BEAM-3758
 URL: https://issues.apache.org/jira/browse/BEAM-3758
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


Currently, Read is the "primitive" even though portability doesn't even have 
the concept. Anyhow at least the DataflowRunner should override it to be 
impulse, since the service requires this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3759) Add support for PaneInfo descriptor in Python SDK

2018-02-27 Thread Charles Chen (JIRA)
Charles Chen created BEAM-3759:
--

 Summary: Add support for PaneInfo descriptor in Python SDK
 Key: BEAM-3759
 URL: https://issues.apache.org/jira/browse/BEAM-3759
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Affects Versions: 2.3.0
Reporter: Charles Chen
Assignee: Charles Chen


The PaneInfo descriptor allows a user to determine which particular triggering 
emitted a value.  This allows the user to differentiate between speculative 
(early), on-time (at end of window) and late value emissions coming out of a 
GroupByKey.  We should add support for this feature in the Python SDK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_JDBC #270

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Scan Core Construction NeedsRunner Tests

[klk] Add MetricsTranslation

--
[...truncated 704.83 KB...]
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 
from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0-beta from the 
shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-client-core:jar:1.0.0 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-appengine:jar:0.7.0 from 
the shaded jar.
[INFO] Excluding io.opencensus:opencensus-contrib-grpc-util:jar:0.7.0 from the 
shaded jar.
[INFO] Excluding io.opencensus:opencensus-api:jar:0.7.0 from the shaded jar.
[INFO] Excluding io.dropwizard.metrics:metrics-core:jar:3.1.2 from the shaded 
jar.
[INFO] Excluding com.google.protobuf:protobuf-java:jar:3.2.0 from the shaded 
jar.
[INFO] Excluding io.netty:netty-tcnative-boringssl-static:jar:1.1.33.Fork26 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-database-v1:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.22.0 from the 
shaded jar.
[INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding commons-codec:commons-codec:jar:1.3 from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-dataflow:jar:v1b3-rev221-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-clouddebugger:jar:v2-rev8-1.22.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] 

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1008

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Scan Core Construction NeedsRunner Tests

[klk] Add MetricsTranslation

--
[...truncated 128.84 KB...]
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 178, in test_iterable_side_input
pipeline.run()
  File 
"
 line 102, in run
result = super(TestPipeline, self).run()
  File 
"
 line 369, in run
self.to_runner_api(), self.runner, self._options).run(False)
  File 
"
 line 597, in from_runner_api
context.transforms.get_by_id(root_transform_id)]
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 842, in from_runner_api
part = context.transforms.get_by_id(transform_id)
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 842, in from_runner_api
part = context.transforms.get_by_id(transform_id)
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 833, in from_runner_api
transform=ptransform.PTransform.from_runner_api(proto.spec, context),
  File 
"
 line 555, in from_runner_api
context)
  File 
"
 line 886, in from_runner_api_parameter
result = ParDo(fn, *args, **kwargs)
  File 
"
 line 781, in __init__
super(ParDo, self).__init__(fn, *args, **kwargs)
  File 
"
 line 627, in __init__
self.fn = pickler.loads(pickler.dumps(self.fn))
  File 
"
 line 193, in dumps
s = dill.dumps(o)
  File 
"
 line 259, in dumps
dump(obj, file, protocol, byref, fmode, recurse)#, strictio)
  File 
"
 line 252, in dump
pik.dump(obj)
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 165, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, 

Build failed in Jenkins: beam_PerformanceTests_Spark #1410

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Scan Core Construction NeedsRunner Tests

[klk] Add MetricsTranslation

--
[...truncated 97.47 KB...]
'apache-beam-testing:bqjob_r5ac965b17b16d3f4_0161db10cd99_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r5ac965b17b16d3f4_0161db10cd99_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r5ac965b17b16d3f4_0161db10cd99_1 ... (0s) Current status: DONE   
2018-02-28 06:19:29,133 d3b17b4e MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 06:19:49,593 d3b17b4e MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 06:19:51,804 d3b17b4e MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.20s,  CPU:0.26s,  MaxMemory:29044kb 
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r2af07adb35ab6e3f_0161db11270a_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r2af07adb35ab6e3f_0161db11270a_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r2af07adb35ab6e3f_0161db11270a_1 ... (0s) Current status: DONE   
2018-02-28 06:19:51,804 d3b17b4e MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 06:20:08,442 d3b17b4e MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 06:20:10,669 d3b17b4e MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.22s,  CPU:0.30s,  MaxMemory:28932kb 
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r6f20a9d43cb5df22_0161db1170ae_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r6f20a9d43cb5df22_0161db1170ae_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r6f20a9d43cb5df22_0161db1170ae_1 ... (0s) Current status: DONE   
2018-02-28 06:20:10,670 d3b17b4e MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-02-28 06:20:38,774 d3b17b4e MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-02-28 06:20:41,123 d3b17b4e MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1,  WallTime:0:02.34s,  CPU:0.27s,  MaxMemory:29056kb 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4326

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[klk] Add MetricsTranslation

--
[...truncated 1.02 MB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying 

[beam] 01/01: Merge pull request #4764: [BEAM-3760] Scan Core Construction NeedsRunner Tests

2018-02-27 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 948988c921747ab6298059a94daf1180e5b76cd4
Merge: 37e7c53 7722c56
Author: Kenn Knowles 
AuthorDate: Tue Feb 27 21:47:18 2018 -0800

Merge pull request #4764: [BEAM-3760] Scan Core Construction NeedsRunner 
Tests

 runners/direct-java/pom.xml | 1 +
 1 file changed, 1 insertion(+)

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


[beam] branch master updated (37e7c53 -> 948988c)

2018-02-27 Thread kenn
This is an automated email from the ASF dual-hosted git repository.

kenn pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 37e7c53  Merge pull request #4734: [BEAM-1866] Add MetricsTranslation
 add 7722c56  Scan Core Construction NeedsRunner Tests
 new 948988c  Merge pull request #4764: [BEAM-3760] Scan Core Construction 
NeedsRunner Tests

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 runners/direct-java/pom.xml | 1 +
 1 file changed, 1 insertion(+)

-- 
To stop receiving notification emails like this one, please contact
k...@apache.org.


Build failed in Jenkins: beam_PerformanceTests_Python #966

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Scan Core Construction NeedsRunner Tests

[klk] Add MetricsTranslation

--
[...truncated 726 B...]
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 948988c921747ab6298059a94daf1180e5b76cd4 (origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 948988c921747ab6298059a94daf1180e5b76cd4
Commit message: "Merge pull request #4764: [BEAM-3760] Scan Core Construction 
NeedsRunner Tests"
 > git rev-list --no-walk 71e2526ee092fd535960dd76afa6420ce01a9e90 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins210539742469608841.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins9108428254576207921.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5830228962994636866.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins724020972910911193.sh
+ .env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in ./.env/lib/python2.7/site-packages
Requirement already up-to-date: pip in ./.env/lib/python2.7/site-packages
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1490470869291343502.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3882990048909147835.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
:318:
 SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name 
Indication) extension to TLS is not available on this platform. This may cause 
the server to present an incorrect TLS certificate, which can cause validation 
failures. You can upgrade to a newer version of Python to solve this. For more 
information, see 
https://urllib3.readthedocs.io/en/latest/security.html#snimissingwarning.
  SNIMissingWarning
:122:
 InsecurePlatformWarning: A true SSLContext object is not available. This 
prevents urllib3 from configuring SSL appropriately and may cause certain SSL 
connections to fail. You can upgrade to a newer version of Python to solve 
this. For more information, see 
https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 

[jira] [Commented] (BEAM-3757) Shuffle read failed using python 2.2.0

2018-02-27 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379850#comment-16379850
 ] 

Luke Cwik commented on BEAM-3757:
-

It is best to reach out to 
[dataflow-python-feedb...@google.com|mailto:to%c2%a0dataflow-python-feedb...@google.com]
 for support with a job id and a link to this bug.

> Shuffle read failed using python 2.2.0
> --
>
> Key: BEAM-3757
> URL: https://issues.apache.org/jira/browse/BEAM-3757
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: gcp, macos
>Reporter: Jonathan Delfour
>Assignee: Thomas Groh
>Priority: Major
>
> Hi,
> First issue is that the beam 2.3.0 python SDK is apparently not working on 
> GCP. It gets stuck: 
> {noformat}
> Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be 
> stuck. You can get help with Cloud Dataflow at 
> https://cloud.google.com/dataflow/support. 
> {noformat}
> I tried two times.
> Reverting back to 2.2.0: it usually works but today, after > 1 hour of 
> processing, and 30 workers used, I get a failure with these in the logs:
> {noformat}
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 
> 582, in do_work
> work_executor.execute()
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", 
> line 167, in execute
> op.start()
>   File "dataflow_worker/shuffle_operations.py", line 49, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> def start(self):
>   File "dataflow_worker/shuffle_operations.py", line 50, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> with self.scoped_start_state:
>   File "dataflow_worker/shuffle_operations.py", line 65, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> with self.shuffle_source.reader() as reader:
>   File "dataflow_worker/shuffle_operations.py", line 67, in 
> dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
> for key_values in reader:
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 406, in __iter__
> for entry in entries_iterator:
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 248, in next
> return next(self.iterator)
>   File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/shuffle.py", 
> line 206, in __iter__
> chunk, next_position = self.reader.Read(start_position, end_position)
>   File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in 
> shuffle_client.PyShuffleReader.Read
> IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2 
>  talking to my-dataflow-02271107-756f-harness-2p65:12346
> {noformat}
> i also get some information message:
> {noformat}
> Refusing to split  at 0x7f03a00fe790> at '\x00\x00\x00\t\x1d\x14\x87\xa3\x00\x01': unstarted
> {noformat}
> For the flow, I am extracting data from BQ, cleaning using pandas, exporting 
> as a csv file, gzipping and uploading the compressed file to a bucket using 
> decompressive transcoding (csv export, gzip compression and upload are in the 
> same 'worker' as they are done in the same beam.DoFn).
> PS: i can't find a reasonable way to export the logs from GCP but i can 
> privately send the log file i have of the run on my machine (the log of the 
> pipeline)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_Verify #4327

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Scan Core Construction NeedsRunner Tests

--
[...truncated 1.02 MB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying 

Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1006

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 118.53 KB...]

==
ERROR: test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 305, in test_flattened_side_input
assert_that(results, equal_to(['a', 'b']))
  File 
"
 line 152, in assert_that
actual | AssertThat()  # pylint: disable=expression-not-assigned
  File 
"
 line 107, in __or__
return self.pipeline.apply(ptransform, self)
  File 
"
 line 479, in apply
pvalueish_result = self.runner.apply(transform, pvalueish)
  File 
"
 line 174, in apply
return m(transform, input)
  File 
"
 line 180, in apply_PTransform
return transform.expand(input)
  File 
"
 line 143, in expand
| "ToVoidKey" >> Map(lambda v: (None, v)))
  File 
"
 line 1613, in __init__
super(WindowInto, self).__init__(self.WindowIntoFn(self.windowing))
  File 
"
 line 781, in __init__
super(ParDo, self).__init__(fn, *args, **kwargs)
  File 
"
 line 628, in __init__
self.args = pickler.loads(pickler.dumps(self.args))
  File 
"
 line 217, in loads
s = zlib.decompress(c)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 

Build failed in Jenkins: beam_PerformanceTests_TextIOIT #209

2018-02-27 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Makes it possible to use Wait with default windowing, eg. in batch

--
[...truncated 700.17 KB...]
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.4.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.4.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.auto.value:auto-value:jar:1.5.3 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.7 from the shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 
from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0-beta from the 
shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4324

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 1.02 MB...]
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_fast.pyx -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_slow.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/worker
copying apache_beam/testing/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/util_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/testing
copying apache_beam/testing/data/standard_coders.yaml -> 
apache-beam-2.4.0.dev0/apache_beam/testing/data
copying apache_beam/testing/data/trigger_transcripts.yaml -> 
apache-beam-2.4.0.dev0/apache_beam/testing/data
copying apache_beam/transforms/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/combiners.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/combiners_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/core.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/create_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/cy_combiners.pxd -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/cy_combiners.py -> 
apache-beam-2.4.0.dev0/apache_beam/transforms
copying apache_beam/transforms/cy_combiners_test.py -> 

[jira] [Updated] (BEAM-3157) BeamSql transform should support other PCollection types

2018-02-27 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3157:
--
Fix Version/s: (was: 2.4.0)

> BeamSql transform should support other PCollection types
> 
>
> Key: BEAM-3157
> URL: https://issues.apache.org/jira/browse/BEAM-3157
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Ismaël Mejía
>Assignee: Anton Kedin
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Currently the Beam SQL transform only supports input and output data 
> represented as a BeamRecord. This seems to me like an usability limitation 
> (even if we can do a ParDo to prepare objects before and after the transform).
> I suppose this constraint comes from the fact that we need to map 
> name/type/value from an object field into Calcite so it is convenient to have 
> a specific data type (BeamRecord) for this. However we can accomplish the 
> same by using a PCollection of JavaBean (where we know the same information 
> via the field names/types/values) or by using Avro records where we also have 
> the Schema information. For the output PCollection we can map the object via 
> a Reference (e.g. a JavaBean to be filled with the names of an Avro object).
> Note: I am assuming for the moment simple mappings since the SQL does not 
> support composite types for the moment.
> A simple API idea would be something like this:
> A simple filter:
> PCollection col = BeamSql.query("SELECT * FROM  WHERE 
> ...").from(MyPojo.class);
> A projection:
> PCollection newCol = BeamSql.query("SELECT id, 
> name").from(MyPojo.class).as(MyNewPojo.class);
> A first approach could be to just add the extra ParDos + transform DoFns 
> however I suppose that for memory use reasons maybe mapping directly into 
> Calcite would make sense.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-2445) DSL SQL to use service locator pattern to automatically register UDFs

2018-02-27 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379483#comment-16379483
 ] 

Kenneth Knowles commented on BEAM-2445:
---

This seems like an open-ended feature request, so let's just add Fix Version 
when we have addressed it.

> DSL SQL to use service locator pattern to automatically register UDFs
> -
>
> Key: BEAM-2445
> URL: https://issues.apache.org/jira/browse/BEAM-2445
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Luke Cwik
>Assignee: Xu Mingmin
>Priority: Major
>
> Use a service locator pattern to find UDFs that can be registered. The 
> service loader can be used to register UDFs for standard functions via DSL 
> SQL, additional UDFs registered by third party libraries, and end user 
> created UDFs.
> Example ServiceLoader usage within Apache Beam to find coder providers:
> https://github.com/apache/beam/blob/7126fdc6ee5671e99a2dede3f25ba616aa0e8fa4/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L147



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-2445) DSL SQL to use service locator pattern to automatically register UDFs

2018-02-27 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2445:
--
Fix Version/s: (was: 2.4.0)

> DSL SQL to use service locator pattern to automatically register UDFs
> -
>
> Key: BEAM-2445
> URL: https://issues.apache.org/jira/browse/BEAM-2445
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Luke Cwik
>Assignee: Xu Mingmin
>Priority: Major
>
> Use a service locator pattern to find UDFs that can be registered. The 
> service loader can be used to register UDFs for standard functions via DSL 
> SQL, additional UDFs registered by third party libraries, and end user 
> created UDFs.
> Example ServiceLoader usage within Apache Beam to find coder providers:
> https://github.com/apache/beam/blob/7126fdc6ee5671e99a2dede3f25ba616aa0e8fa4/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java#L147



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3679) Upgrade calcite to release 1.16

2018-02-27 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-3679:
--
Fix Version/s: (was: 2.4.0)

> Upgrade calcite to release 1.16
> ---
>
> Key: BEAM-3679
> URL: https://issues.apache.org/jira/browse/BEAM-3679
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Ted Yu
>Assignee: Andrew Pilloud
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently Beam uses Calcite 1.13.0
> This issue is to upgrade to Calcite 1.16.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3679) Upgrade calcite to release 1.16

2018-02-27 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379481#comment-16379481
 ] 

Kenneth Knowles commented on BEAM-3679:
---

Since the Calcite version isn't release, removing the Fix Version here, which 
put it on a burndown list of 2.4.0 blockers.

> Upgrade calcite to release 1.16
> ---
>
> Key: BEAM-3679
> URL: https://issues.apache.org/jira/browse/BEAM-3679
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Ted Yu
>Assignee: Andrew Pilloud
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently Beam uses Calcite 1.13.0
> This issue is to upgrade to Calcite 1.16.0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3519) GCP IO exposes netty on its API surface, causing conflicts with runners

2018-02-27 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath updated BEAM-3519:
-
Fix Version/s: (was: 2.4.0)

> GCP IO exposes netty on its API surface, causing conflicts with runners
> ---
>
> Key: BEAM-3519
> URL: https://issues.apache.org/jira/browse/BEAM-3519
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Reporter: Ismaël Mejía
>Assignee: Chamikara Jayalath
>Priority: Critical
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Google Cloud Platform IOs module leaks netty this causes conflicts in 
> particular with execution systems that use conflicting versions of such 
> modules. 
>  For the case there is a dependency conflict with the Spark Runner version of 
> netty, see: BEAM-3492



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3760) core-construction-java NeedsRunner Tests are not executed

2018-02-27 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-3760:
-

 Summary: core-construction-java NeedsRunner Tests are not executed
 Key: BEAM-3760
 URL: https://issues.apache.org/jira/browse/BEAM-3760
 Project: Beam
  Issue Type: Bug
  Components: runner-core, runner-direct
Reporter: Thomas Groh
Assignee: Thomas Groh
 Fix For: Not applicable






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3760) core-construction-java NeedsRunner Tests are not executed

2018-02-27 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh updated BEAM-3760:
--
Description: 
The core-construction-java dependency isn't scanned.

 

> core-construction-java NeedsRunner Tests are not executed
> -
>
> Key: BEAM-3760
> URL: https://issues.apache.org/jira/browse/BEAM-3760
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-direct
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Blocker
> Fix For: Not applicable
>
>
> The core-construction-java dependency isn't scanned.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)
holdenk created BEAM-3761:
-

 Summary: Fix Python 3 missing functions
 Key: BEAM-3761
 URL: https://issues.apache.org/jira/browse/BEAM-3761
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: holdenk
Assignee: Ahmet Altay


cmp & file is no longer defined in Python 3. We can catch regressions of this 
using flake8 f821 (although this catches some additional things as well)

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Issue Type: Bug  (was: Sub-task)
Parent: (was: BEAM-1251)

> Fix Python 3 missing functions
> --
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file is no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3761) Fix Python 3 missing functions

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Issue Type: Improvement  (was: Bug)

> Fix Python 3 missing functions
> --
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file is no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Description: 
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )

  was:
cmp & file is no longer defined in Python 3. We can catch regressions of this 
using flake8 f821 (although this catches some additional things as well)

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )


> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> Various functions don't exist in Python 3 that did in python 2. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Summary: Fix Python 3 cmp function  (was: Fix Python 3 missing functions)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> cmp & file is no longer defined in Python 3. We can catch regressions of this 
> using flake8 f821 (although this catches some additional things as well)
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread holdenk (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated BEAM-3761:
--
Description: 
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )

 

Note once all of the missing names/functions are fixed we can enable F821 in 
falke8 python 3.

  was:
Various functions don't exist in Python 3 that did in python 2. This Jira is to 
fix the use of cmp (which often will involve rewriting __cmp__ as well).

 

Note: there are existing PRs for basestring and unicode ( 
[https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
 , [https://github.com/apache/beam/pull/4730] )


> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Assignee: Ahmet Altay
>Priority: Major
>
> Various functions don't exist in Python 3 that did in python 2. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )
>  
> Note once all of the missing names/functions are fixed we can enable F821 in 
> falke8 python 3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-2421) Migrate Apache Beam to use impulse primitive as the only root primitive

2018-02-27 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379653#comment-16379653
 ] 

Kenneth Knowles commented on BEAM-2421:
---

I misread the links above. It looks it is expected to be done for Python 
Dataflow anyhow.

> Migrate Apache Beam to use impulse primitive as the only root primitive
> ---
>
> Key: BEAM-2421
> URL: https://issues.apache.org/jira/browse/BEAM-2421
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The impulse source emits a single byte array element within the global window.
> This is in preparation for using SDF as the replacement for different bounded 
> and unbounded source implementations.
> Before:
> Read(Source) -> ParDo -> ...
> Impulse -> SDF(Source) -> ParDo -> ...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3758) Migrate Python SDK Read transform to be Impulse->SDF

2018-02-27 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379658#comment-16379658
 ] 

Kenneth Knowles commented on BEAM-3758:
---

The status appears to be that a Create in streaming is overridden for Dataflow 
but not otherwise (not batch, not other runners, not other sources).

> Migrate Python SDK Read transform to be Impulse->SDF
> 
>
> Key: BEAM-3758
> URL: https://issues.apache.org/jira/browse/BEAM-3758
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> Currently, Read is the "primitive" even though portability doesn't even have 
> the concept. Anyhow at least the DataflowRunner should override it to be 
> impulse, since the service requires this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3761) Fix Python 3 cmp function

2018-02-27 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-3761:
-

Assignee: (was: Ahmet Altay)

> Fix Python 3 cmp function
> -
>
> Key: BEAM-3761
> URL: https://issues.apache.org/jira/browse/BEAM-3761
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: holdenk
>Priority: Major
>
> Various functions don't exist in Python 3 that did in python 2. This Jira is 
> to fix the use of cmp (which often will involve rewriting __cmp__ as well).
>  
> Note: there are existing PRs for basestring and unicode ( 
> [https://github.com/apache/beam/pull/4697|https://github.com/apache/beam/pull/4697,]
>  , [https://github.com/apache/beam/pull/4730] )
>  
> Note once all of the missing names/functions are fixed we can enable F821 in 
> falke8 python 3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1007

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 122.24 KB...]
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 306, in test_flattened_side_input
pipeline.run()
  File 
"
 line 102, in run
result = super(TestPipeline, self).run()
  File 
"
 line 369, in run
self.to_runner_api(), self.runner, self._options).run(False)
  File 
"
 line 597, in from_runner_api
context.transforms.get_by_id(root_transform_id)]
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 842, in from_runner_api
part = context.transforms.get_by_id(transform_id)
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 842, in from_runner_api
part = context.transforms.get_by_id(transform_id)
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 842, in from_runner_api
part = context.transforms.get_by_id(transform_id)
  File 
"
 line 69, in get_by_id
self._id_to_proto[id], self._pipeline_context)
  File 
"
 line 833, in from_runner_api
transform=ptransform.PTransform.from_runner_api(proto.spec, context),
  File 
"
 line 555, in from_runner_api
context)
  File 
"
 line 881, in from_runner_api_parameter
pardo_payload.do_fn.spec.payload)
  File 
"
 line 221, in loads
return dill.loads(s)
  File 
"
 line 277, in loads
return load(file)
  File 
"
 line 266, in load
obj = pik.load()
  File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1090, in load_global
klass = self.find_class(module, name)
  File 
"
 line 423, in find_class
return StockUnpickler.find_class(self, module, name)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4325

2018-02-27 Thread Apache Jenkins Server
See 


--
[...truncated 1.01 MB...]
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/evaluation_context.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/executor.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/helper_transforms.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/sdf_direct_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/transform_evaluator.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/util.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/watermark_manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/experimental/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental
copying apache_beam/runners/experimental/python_rpc_direct/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying 
apache_beam/runners/experimental/python_rpc_direct/python_rpc_direct_runner.py 
-> apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/experimental/python_rpc_direct/server.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/experimental/python_rpc_direct
copying apache_beam/runners/job/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/manager.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/job/utils.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/job
copying apache_beam/runners/portability/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/fn_api_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/maptask_executor_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.4.0.dev0/apache_beam/runners/test
copying 

[jira] [Commented] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?

2018-02-27 Thread Peter Murphy (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379749#comment-16379749
 ] 

Peter Murphy commented on BEAM-1754:


Is the best approach to beginning to work on this to basically follow the 
example of the Python API? Or would you recommend some other approach? Are 
there other resources on writing a new SDK and how to program to that layer 
(documentation, white papers, etc)?

We know Beam / Dataflow from the user perspective, basic model etc, mostly via 
the scio interface. We run batch and streaming jobs for ESnet, part of the US 
Dept. of Energy.

If there are enough resources to start this project up we might be interested 
in doing some of the initial work and see how it goes.

> Will Dataflow ever support Node.js with an SDK similar to Java or Python?
> -
>
> Key: BEAM-1754
> URL: https://issues.apache.org/jira/browse/BEAM-1754
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Diego Zuluaga
>Priority: Critical
>  Labels: node.js
>
> I like the philosophy behind DataFlow and found the Java and Python samples 
> highly comprehensible. However, I have to admit that for most Node.js 
> developers who have little background on typed languages and are used to get 
> up to speed with frameworks incredibly fast, learning Dataflow might take 
> some learning curve that they/we're not used to. So, I wonder if at any point 
> in time Dataflow will provide a Node.js SDK. Maybe this is out of the 
> question, but I wanted to run it by the team as it would be awesome to have 
> something along these lines!
> Thanks,
> Diego
> Question originaly posted in SO:
> http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?

2018-02-27 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379755#comment-16379755
 ] 

Davor Bonaci commented on BEAM-1754:


You are more than welcome to do so!

I'd suggest using dev@ mailing list for any questions or help that you might 
need. Unfortunately, we don't have a guide for writing SDKs, but the folks on 
the mailing list will be very happy to help.

One of the first decisions you'll have to make is how to interface with runners 
and whether to use the new portability framework API. This is also something 
best discussed on the mailing list.

In terms of an example, Python SDK and Go SDK would be great to follow.

Good luck!

> Will Dataflow ever support Node.js with an SDK similar to Java or Python?
> -
>
> Key: BEAM-1754
> URL: https://issues.apache.org/jira/browse/BEAM-1754
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Diego Zuluaga
>Priority: Critical
>  Labels: node.js
>
> I like the philosophy behind DataFlow and found the Java and Python samples 
> highly comprehensible. However, I have to admit that for most Node.js 
> developers who have little background on typed languages and are used to get 
> up to speed with frameworks incredibly fast, learning Dataflow might take 
> some learning curve that they/we're not used to. So, I wonder if at any point 
> in time Dataflow will provide a Node.js SDK. Maybe this is out of the 
> question, but I wanted to run it by the team as it would be awesome to have 
> something along these lines!
> Thanks,
> Diego
> Question originaly posted in SO:
> http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)