Build failed in Jenkins: beam_PostCommit_Python_Verify #3388

2017-10-18 Thread Apache Jenkins Server
See 


--
[...truncated 887.02 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4192

2017-10-18 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2720) Exactly-once Kafka sink

2017-10-18 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi resolved BEAM-2720.

   Resolution: Fixed
Fix Version/s: 2.2.0

> Exactly-once Kafka sink
> ---
>
> Key: BEAM-2720
> URL: https://issues.apache.org/jira/browse/BEAM-2720
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
> Fix For: 2.2.0
>
>
> Kafka 0.11 added support for transactions which allows end-to-end 
> exactly-once semantics. Beam's KafkaIO users can benefit from these while 
> using runners that support exactly-once processing.
> I have an implementation of EOS support for Kafka sink : 
> https://github.com/apache/beam/pull/3612
> It has two shuffles and builds on Beam state-API and checkpoint barrier 
> between stages (as in Dataflow). Pull request has a longer description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2720) Exactly-once Kafka sink

2017-10-18 Thread Raghu Angadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated BEAM-2720:
---
Summary: Exactly-once Kafka sink  (was: Exact-once Kafka sink)

> Exactly-once Kafka sink
> ---
>
> Key: BEAM-2720
> URL: https://issues.apache.org/jira/browse/BEAM-2720
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>
> Kafka 0.11 added support for transactions which allows end-to-end 
> exactly-once semantics. Beam's KafkaIO users can benefit from these while 
> using runners that support exactly-once processing.
> I have an implementation of EOS support for Kafka sink : 
> https://github.com/apache/beam/pull/3612
> It has two shuffles and builds on Beam state-API and checkpoint barrier 
> between stages (as in Dataflow). Pull request has a longer description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3076) support TIMESTAMP in BeamRecordType

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210431#comment-16210431
 ] 

ASF GitHub Bot commented on BEAM-3076:
--

GitHub user zangshayang1 opened a pull request:

https://github.com/apache/beam/pull/4015

[BEAM-3076] support TIMESTAMP in BeamRecordType

This is to support Timestamp datatype in BeamRecordSqlType.java by 
introducing TimestampCoder. Previously Timestamp datatype is mapped to Date 
type.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zangshayang1/beam BEAM-3076

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4015.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4015


commit d479b3e43c5bbc6fe6dec69a9753e510e94e3fb0
Author: zangshayang1 
Date:   2017-10-19T01:08:45Z

This is to support Timestamp datatype in BeamRecordSqlType.java by 
introducing TimestampCoder. Previously Timestamp datatype is mapped to Date 
type.




> support TIMESTAMP in BeamRecordType
> ---
>
> Key: BEAM-3076
> URL: https://issues.apache.org/jira/browse/BEAM-3076
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Shayang Zang
>Assignee: Shayang Zang
>Priority: Minor
>
> Timestamp type of data was mapped to Data.class and also sharing DateCoder() 
> during BeamSql execution. We want it to be supported in BeamRecordType as a 
> stand-alone datatype.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4015: [BEAM-3076] support TIMESTAMP in BeamRecordType

2017-10-18 Thread zangshayang1
GitHub user zangshayang1 opened a pull request:

https://github.com/apache/beam/pull/4015

[BEAM-3076] support TIMESTAMP in BeamRecordType

This is to support Timestamp datatype in BeamRecordSqlType.java by 
introducing TimestampCoder. Previously Timestamp datatype is mapped to Date 
type.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zangshayang1/beam BEAM-3076

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4015.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4015


commit d479b3e43c5bbc6fe6dec69a9753e510e94e3fb0
Author: zangshayang1 
Date:   2017-10-19T01:08:45Z

This is to support Timestamp datatype in BeamRecordSqlType.java by 
introducing TimestampCoder. Previously Timestamp datatype is mapped to Date 
type.




---


[jira] [Created] (BEAM-3076) support TIMESTAMP in BeamRecordType

2017-10-18 Thread Shayang Zang (JIRA)
Shayang Zang created BEAM-3076:
--

 Summary: support TIMESTAMP in BeamRecordType
 Key: BEAM-3076
 URL: https://issues.apache.org/jira/browse/BEAM-3076
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Shayang Zang
Assignee: Shayang Zang
Priority: Minor


Timestamp type of data was mapped to Data.class and also sharing DateCoder() 
during BeamSql execution. We want it to be supported in BeamRecordType as a 
stand-alone datatype.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PerformanceTests_Python #464

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[rangadi] [BEAM-2720] Update kafka client version to 0.11.0.1

[kenn] Improve GcsFileSystem errors messages slightly

[kenn] Add ability to stage explicit file list

[kenn] Stage the portable pipeline in Dataflow

[kenn] Stage the pipeline without using a temp file

[kenn] Add assertion that valid jobs must have staged pipeline

[kenn] Remove duplicate mocking in DataflowRunnerTest

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam6 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 1974b920e4b3bbe8549e25fe789f9dada13c1769 (origin/master)
Commit message: "This closes #3977: [BEAM-2963] Stage pipeline in 
DataflowRunner"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 1974b920e4b3bbe8549e25fe789f9dada13c1769
 > git rev-list 77a4d3e9dc1ee8f2aac324e50af7b05168ee2af7 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins9017345297364317673.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6440272776955412512.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5308037775340101302.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1013257398508824437.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: crcmod<2.0,>=1.7 in 

Build failed in Jenkins: beam_PostCommit_Python_Verify #3387

2017-10-18 Thread Apache Jenkins Server
See 


--
[...truncated 885.17 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Commented] (BEAM-1542) Need Source/Sink for Spanner

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210201#comment-16210201
 ] 

ASF GitHub Bot commented on BEAM-1542:
--

GitHub user mairbek opened a pull request:

https://github.com/apache/beam/pull/4014

[BEAM-1542] SpannerIO: mutation encoding and size estimation improvements

#4013 must be reviewed merged first

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mairbek/beam mutationcoder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4014


commit 24b396f6449b73665b96ffe52b3eba46f19d4fc4
Author: Mairbek Khadikov 
Date:   2017-10-18T22:18:55Z

Reading spanner schema transform

commit 20f585ecfcbce8b8130fe7dc5d91ae46f821e77d
Author: Mairbek Khadikov 
Date:   2017-10-18T22:26:11Z

Updated the mutation size estimator to support delete mutations

commit 032fa0d8bcf966f7abca4616c4d16c04acc20b93
Author: Mairbek Khadikov 
Date:   2017-10-18T22:26:36Z

Introduced MutationGroupEncoder which efficiently encodes/decodes the 
mutation given the cloud spanner schema




> Need Source/Sink for Spanner
> 
>
> Key: BEAM-1542
> URL: https://issues.apache.org/jira/browse/BEAM-1542
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Guy Molinari
>Assignee: Mairbek Khadikov
>
> Is there a source/sink for Spanner in the works?   If not I would gladly give 
> this a shot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4014: [BEAM-1542] SpannerIO: mutation encoding and size e...

2017-10-18 Thread mairbek
GitHub user mairbek opened a pull request:

https://github.com/apache/beam/pull/4014

[BEAM-1542] SpannerIO: mutation encoding and size estimation improvements

#4013 must be reviewed merged first

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mairbek/beam mutationcoder

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4014.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4014


commit 24b396f6449b73665b96ffe52b3eba46f19d4fc4
Author: Mairbek Khadikov 
Date:   2017-10-18T22:18:55Z

Reading spanner schema transform

commit 20f585ecfcbce8b8130fe7dc5d91ae46f821e77d
Author: Mairbek Khadikov 
Date:   2017-10-18T22:26:11Z

Updated the mutation size estimator to support delete mutations

commit 032fa0d8bcf966f7abca4616c4d16c04acc20b93
Author: Mairbek Khadikov 
Date:   2017-10-18T22:26:36Z

Introduced MutationGroupEncoder which efficiently encodes/decodes the 
mutation given the cloud spanner schema




---


[jira] [Commented] (BEAM-1542) Need Source/Sink for Spanner

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210145#comment-16210145
 ] 

ASF GitHub Bot commented on BEAM-1542:
--

GitHub user mairbek opened a pull request:

https://github.com/apache/beam/pull/4013

[BEAM-1542] A transform for reading the Spanner schema 

A part of https://github.com/apache/beam/pull/3729.

Ideally, spanner schema should be an introspection  part of the Cloud 
Spanner library. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mairbek/beam readschema

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4013


commit 24b396f6449b73665b96ffe52b3eba46f19d4fc4
Author: Mairbek Khadikov 
Date:   2017-10-18T22:18:55Z

Reading spanner schema transform




> Need Source/Sink for Spanner
> 
>
> Key: BEAM-1542
> URL: https://issues.apache.org/jira/browse/BEAM-1542
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Guy Molinari
>Assignee: Mairbek Khadikov
>
> Is there a source/sink for Spanner in the works?   If not I would gladly give 
> this a shot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4013: [BEAM-1542] A transform for reading the Spanner sch...

2017-10-18 Thread mairbek
GitHub user mairbek opened a pull request:

https://github.com/apache/beam/pull/4013

[BEAM-1542] A transform for reading the Spanner schema 

A part of https://github.com/apache/beam/pull/3729.

Ideally, spanner schema should be an introspection  part of the Cloud 
Spanner library. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mairbek/beam readschema

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4013.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4013


commit 24b396f6449b73665b96ffe52b3eba46f19d4fc4
Author: Mairbek Khadikov 
Date:   2017-10-18T22:18:55Z

Reading spanner schema transform




---


[GitHub] beam pull request #4012: Pin runner harness also for official BEAM releases.

2017-10-18 Thread tvalentyn
GitHub user tvalentyn opened a pull request:

https://github.com/apache/beam/pull/4012

Pin runner harness also for official BEAM releases.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tvalentyn/beam pin_runner_harness_minimal

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4012.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4012


commit 7155931ff9eaf5fb85765e9d515469f5e6bd5bf9
Author: Valentyn Tymofieiev 
Date:   2017-10-18T21:25:33Z

Pin runner harness also for official BEAM releases.




---


[jira] [Commented] (BEAM-2926) Java SDK support for portable side input

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210096#comment-16210096
 ] 

ASF GitHub Bot commented on BEAM-2926:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/4011

[BEAM-2926] Migrate to using a trivial multimap materialization within the 
Java SDK.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam side_input

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4011.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4011


commit e83084491714335d1453a6a4959b6e8c2c9d3ac1
Author: Luke Cwik 
Date:   2017-10-18T20:55:10Z

[BEAM-2926] Migrate to using a trivial multimap materialization within the 
Java SDK.




> Java SDK support for portable side input 
> -
>
> Key: BEAM-2926
> URL: https://issues.apache.org/jira/browse/BEAM-2926
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4011: [BEAM-2926] Migrate to using a trivial multimap mat...

2017-10-18 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/4011

[BEAM-2926] Migrate to using a trivial multimap materialization within the 
Java SDK.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam side_input

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4011.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4011


commit e83084491714335d1453a6a4959b6e8c2c9d3ac1
Author: Luke Cwik 
Date:   2017-10-18T20:55:10Z

[BEAM-2926] Migrate to using a trivial multimap materialization within the 
Java SDK.




---


[jira] [Created] (BEAM-3075) Document getting started for python sdk developers

2017-10-18 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-3075:
-

 Summary: Document getting started for python sdk developers
 Key: BEAM-3075
 URL: https://issues.apache.org/jira/browse/BEAM-3075
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, website
Reporter: Ahmet Altay


Ideas:
- Instructions for running a modified wordcount, running tests and linter 
locally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[beam-site] branch mergebot updated (3edd5d2 -> cc8a7a7)

2017-10-18 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


 discard 3edd5d2  This closes #322
 discard 4d77ab9  Move Elasticsearch v5.x from in-progress to built-in
 new 68c9fef  Update IntelliJ Checkstyle instructions
 new a4662c7  Address 
https://github.com/apache/beam-site/pull/331#discussion_r144115767
 new a7159d7  Address 
https://github.com/apache/beam-site/pull/331#discussion_r144906095
 new cc8a7a7  This closes #331

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (3edd5d2)
\
 N -- N -- N   refs/heads/mergebot (cc8a7a7)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/contribute/contribution-guide.md | 67 ++--
 src/documentation/io/built-in.md |  6 +++-
 2 files changed, 39 insertions(+), 34 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 01/04: Update IntelliJ Checkstyle instructions

2017-10-18 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 68c9fefffd21553b8cd5219d9145cef88901768e
Author: wtanaka.com 
AuthorDate: Sat Oct 7 17:13:43 2017 -1000

Update IntelliJ Checkstyle instructions

to downgrade checkstyle version to match pom.xml
---
 src/contribute/contribution-guide.md | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/contribute/contribution-guide.md 
b/src/contribute/contribution-guide.md
index 4989596..f7c463e 100644
--- a/src/contribute/contribution-guide.md
+++ b/src/contribute/contribution-guide.md
@@ -124,16 +124,17 @@ IntelliJ supports checkstyle within the IDE using the 
Checkstyle-IDEA plugin.
 
 1. Install the "Checkstyle-IDEA" plugin from the IntelliJ plugin repository.
 1. Configure the plugin by going to Settings -> Other Settings -> Checkstyle.
+1. Set Checkstyle version to the same as in `/pom.xml` (e.g. 6.19)
 1. Set the "Scan Scope" to "Only Java sources (including tests)".
 1. In the "Configuration File" pane, add a new configuration using the plus 
icon:
 1. Set the "Description" to "Beam".
 1. Select "Use a local Checkstyle file", and point it to
-  "sdks/java/build-tools/src/main/resources/beam/checkstyle.xml" within
+  `sdks/java/build-tools/src/main/resources/beam/checkstyle.xml` within
   your repository.
 1. Check the box for "Store relative to project location", and click
   "Next".
-1. Configure the "checkstyle.suppressions.file" property value to
-  "suppressions.xml", and click "Next", then "Finish".
+1. Configure the `checkstyle.suppressions.file` property value to
+  `suppressions.xml`, and click "Next", then "Finish".
 1. Select "Beam" as the only active configuration file, and click "Apply" and
"OK".
 1. Checkstyle will now give warnings in the editor for any Checkstyle

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 04/04: This closes #331

2017-10-18 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit cc8a7a7e2141ae16a25cd2f75c97238741135f49
Merge: efdc471 a7159d7
Author: Mergebot 
AuthorDate: Wed Oct 18 21:16:46 2017 +

This closes #331

 src/contribute/contribution-guide.md | 67 ++--
 1 file changed, 34 insertions(+), 33 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 03/04: Address https://github.com/apache/beam-site/pull/331#discussion_r144906095

2017-10-18 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit a7159d76e181a349c731c71af43ce40fd5695ccc
Author: wtanaka.com 
AuthorDate: Tue Oct 17 20:30:10 2017 -1000

Address
https://github.com/apache/beam-site/pull/331#discussion_r144906095
---
 src/contribute/contribution-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/contribute/contribution-guide.md 
b/src/contribute/contribution-guide.md
index b5f9dc4..6e03ee6 100644
--- a/src/contribute/contribution-guide.md
+++ b/src/contribute/contribution-guide.md
@@ -32,7 +32,7 @@ We look forward to working with you!
 ## Engage
 
 ### Mailing list(s)
-We discuss design and implementation issues on the `d...@beam.apache.org 
mailing list, which is archived 
[here](https://lists.apache.org/list.html?d...@beam.apache.org). Join by 
emailing 
[`dev-subscr...@beam.apache.org`](mailto:dev-subscr...@beam.apache.org).
+We discuss design and implementation issues on the `d...@beam.apache.org` 
mailing list, which is archived 
[here](https://lists.apache.org/list.html?d...@beam.apache.org). Join by 
emailing 
[`dev-subscr...@beam.apache.org`](mailto:dev-subscr...@beam.apache.org).
 
 If interested, you can also join the other [mailing lists]({{ site.baseurl 
}}/get-started/support/).
 

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 02/04: Address https://github.com/apache/beam-site/pull/331#discussion_r144115767

2017-10-18 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit a4662c73f62aa97a5c20d3f7fffe8246f85a9a35
Author: wtanaka.com 
AuthorDate: Sat Oct 14 08:01:45 2017 -1000

Address
https://github.com/apache/beam-site/pull/331#discussion_r144115767
---
 src/contribute/contribution-guide.md | 62 ++--
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/src/contribute/contribution-guide.md 
b/src/contribute/contribution-guide.md
index f7c463e..b5f9dc4 100644
--- a/src/contribute/contribution-guide.md
+++ b/src/contribute/contribution-guide.md
@@ -32,7 +32,7 @@ We look forward to working with you!
 ## Engage
 
 ### Mailing list(s)
-We discuss design and implementation issues on the d...@beam.apache.org 
mailing list, which is archived 
[here](https://lists.apache.org/list.html?d...@beam.apache.org). Join by 
emailing 
[`dev-subscr...@beam.apache.org`](mailto:dev-subscr...@beam.apache.org).
+We discuss design and implementation issues on the `d...@beam.apache.org 
mailing list, which is archived 
[here](https://lists.apache.org/list.html?d...@beam.apache.org). Join by 
emailing 
[`dev-subscr...@beam.apache.org`](mailto:dev-subscr...@beam.apache.org).
 
 If interested, you can also join the other [mailing lists]({{ site.baseurl 
}}/get-started/support/).
 
@@ -109,36 +109,36 @@ Depending on your preferred development environment, you 
may need to prepare it
 ## Enable Annotation Processing
 To configure annotation processing in IntelliJ:
 
-1. Open Annotation Processors Settings dialog box by going to Settings -> 
Build, Execution, Deployment -> Compiler -> Annotation Processors.
+1. Open Annotation Processors Settings dialog box by going to Settings -> 
Build, Execution, Deployment -> Compiler -> Annotation Processors
 1. Select the following buttons:
* "Enable annotation processing"
* "Obtain processors from project classpath"
* "Store generated sources relative to: _Module content root_"
 1. Set the generated source directories to be equal to the Maven directories:
-   * Set "Production sources directory:" to 
"target/generated-sources/annotations".
-   * Set "Test sources directory:" to 
"target/generated-test-sources/test-annotations".
-1. Click "OK".
+   * Set "Production sources directory:" to 
`target/generated-sources/annotations`
+   * Set "Test sources directory:" to 
`target/generated-test-sources/test-annotations`
+1. Click "OK"
 
 ## Checkstyle
 IntelliJ supports checkstyle within the IDE using the Checkstyle-IDEA plugin.
 
-1. Install the "Checkstyle-IDEA" plugin from the IntelliJ plugin repository.
-1. Configure the plugin by going to Settings -> Other Settings -> Checkstyle.
+1. Install the "Checkstyle-IDEA" plugin from the IntelliJ plugin repository
+1. Configure the plugin by going to Settings -> Other Settings -> Checkstyle
 1. Set Checkstyle version to the same as in `/pom.xml` (e.g. 6.19)
-1. Set the "Scan Scope" to "Only Java sources (including tests)".
+1. Set the "Scan Scope" to "Only Java sources (including tests)"
 1. In the "Configuration File" pane, add a new configuration using the plus 
icon:
-1. Set the "Description" to "Beam".
+1. Set the "Description" to "Beam"
 1. Select "Use a local Checkstyle file", and point it to
   `sdks/java/build-tools/src/main/resources/beam/checkstyle.xml` within
-  your repository.
+  your repository
 1. Check the box for "Store relative to project location", and click
-  "Next".
+  "Next"
 1. Configure the `checkstyle.suppressions.file` property value to
-  `suppressions.xml`, and click "Next", then "Finish".
+  `suppressions.xml`, and click "Next", then "Finish"
 1. Select "Beam" as the only active configuration file, and click "Apply" and
-   "OK".
+   "OK"
 1. Checkstyle will now give warnings in the editor for any Checkstyle
-   violations.
+   violations
 
 You can also scan an entire module by opening the Checkstyle tools window and
 clicking the "Check Module" button. The scan should report no errors.
@@ -150,13 +150,13 @@ modules as they are not configured for Checkstyle 
validation.
 IntelliJ supports code styles within the IDE. Use one of the following to 
ensure your code style
 matches the project's checkstyle enforcements.
 
-1. (Option 1) Configure IntelliJ to use "beam-codestyle.xml".
-1. Go to Settings -> Code Style -> Java.
-1. Click the cogwheel icon next to 'Scheme' and select Import Scheme -> 
Eclipse XML Profile.
-1. Select 
"sdks/java/build-tools/src/main/resources/beam/beam-codestyle.xml".
-1. Click "OK".
-1. Click "Apply" and "OK".
-1. (Option 2) Install [Google Java Format 
plugin](https://plugins.jetbrains.com/plugin/8527-google-java-format).
+1. (Option 1) Configure IntelliJ to use `beam-codestyle.xml`
+1. Go to Settings -> 

[jira] [Resolved] (BEAM-2963) Propagate pipeline protos through Dataflow API from Java

2017-10-18 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-2963.
---
   Resolution: Fixed
Fix Version/s: 2.3.0

> Propagate pipeline protos through Dataflow API from Java
> 
>
> Key: BEAM-2963
> URL: https://issues.apache.org/jira/browse/BEAM-2963
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: portability
> Fix For: 2.3.0
>
>
> The Java-specific blobs transmitted to Dataflow need more context, in the 
> form of portability framework protos.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4010: [BEAM-3074] Stage the pipeline in Python DataflowRu...

2017-10-18 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/4010

[BEAM-3074] Stage the pipeline in Python DataflowRunner

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @aaltay  (or redirect to other appropriate Python reviewer?)

Just hacked this out naively; it probably isn't respecting abstractions 
quite right. I confirmed enough that the file is staged - much simpler than 
Java :+1:. Also no tests :1st_place_medal:.

In doing a manual smoke test, I just tried to follow some combination of 
the quickstart plus the contribution guide, but broke during staging because 
`pip install --download` doesn't like that I did `pip install -e .[gcp]`. Is 
there a doc that has the steps for a new contributor to run wordcount with 
local modifications? I'm a bit rusty on the approved way of setting up the 
virtualenv. The crash occurs after the pipeline is staged, so I was able to 
check the basics anyhow.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam py-stage-pipeline

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4010.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4010


commit 2f5293373280ebcea618a6aa4fa1057237441512
Author: Kenneth Knowles 
Date:   2017-10-18T20:56:28Z

Stage the pipeline in Python DataflowRunner




---


Build failed in Jenkins: beam_PostCommit_Python_Verify #3386

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[rangadi] [BEAM-2720] Update kafka client version to 0.11.0.1

[kenn] Improve GcsFileSystem errors messages slightly

[kenn] Add ability to stage explicit file list

[kenn] Stage the portable pipeline in Dataflow

[kenn] Stage the pipeline without using a temp file

[kenn] Add assertion that valid jobs must have staged pipeline

[kenn] Remove duplicate mocking in DataflowRunnerTest

--
[...truncated 886.25 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Commented] (BEAM-2963) Propagate pipeline protos through Dataflow API from Java

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209941#comment-16209941
 ] 

ASF GitHub Bot commented on BEAM-2963:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3977


> Propagate pipeline protos through Dataflow API from Java
> 
>
> Key: BEAM-2963
> URL: https://issues.apache.org/jira/browse/BEAM-2963
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: portability
>
> The Java-specific blobs transmitted to Dataflow need more context, in the 
> form of portability framework protos.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3977: [BEAM-2963] Stage pipeline in DataflowRunner

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3977


---


[6/7] beam git commit: Remove duplicate mocking in DataflowRunnerTest

2017-10-18 Thread kenn
Remove duplicate mocking in DataflowRunnerTest


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/128f3a63
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/128f3a63
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/128f3a63

Branch: refs/heads/master
Commit: 128f3a63f24f252d7e0b444187210e352a127329
Parents: cef997f
Author: Kenneth Knowles 
Authored: Wed Oct 18 10:38:16 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:25 2017 -0700

--
 .../java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java   | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/128f3a63/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
index 02abc34..1568eda 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
@@ -209,7 +209,6 @@ public class DataflowRunnerTest implements Serializable {
   }
 });
 
when(mockGcsUtil.bucketAccessible(GcsPath.fromUri(VALID_STAGING_BUCKET))).thenReturn(true);
-
when(mockGcsUtil.bucketAccessible(GcsPath.fromUri(VALID_STAGING_BUCKET))).thenReturn(true);
 
when(mockGcsUtil.bucketAccessible(GcsPath.fromUri(VALID_TEMP_BUCKET))).thenReturn(true);
 when(mockGcsUtil.bucketAccessible(GcsPath.fromUri(VALID_TEMP_BUCKET + 
"/staging/"))).
 thenReturn(true);



[2/7] beam git commit: Add ability to stage explicit file list

2017-10-18 Thread kenn
Add ability to stage explicit file list


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9b866fef
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9b866fef
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9b866fef

Branch: refs/heads/master
Commit: 9b866fef99293d9738f0dcd862fb409265e50abb
Parents: 7409ca0
Author: Kenneth Knowles 
Authored: Tue Oct 10 21:55:49 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:24 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   |  2 +-
 .../beam/runners/dataflow/util/GcsStager.java   | 42 +++-
 .../beam/runners/dataflow/util/Stager.java  | 27 ++---
 3 files changed, 54 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9b866fef/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index e637dd4..5e91850 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -514,7 +514,7 @@ public class DataflowRunner extends 
PipelineRunner {
 LOG.info("Executing pipeline on the Dataflow Service, which will have 
billing implications "
 + "related to Google Compute Engine usage and other Google Cloud 
Services.");
 
-List packages = options.getStager().stageFiles();
+List packages = options.getStager().stageDefaultFiles();
 
 
 // Set a unique client_request_id in the CreateJob request.

http://git-wip-us.apache.org/repos/asf/beam/blob/9b866fef/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
index 929be99..ff205f0 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
@@ -29,9 +29,7 @@ import 
org.apache.beam.sdk.extensions.gcp.storage.GcsCreateOptions;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.util.MimeTypes;
 
-/**
- * Utility class for staging files to GCS.
- */
+/** Utility class for staging files to GCS. */
 public class GcsStager implements Stager {
   private DataflowPipelineOptions options;
 
@@ -39,32 +37,54 @@ public class GcsStager implements Stager {
 this.options = options;
   }
 
-  @SuppressWarnings("unused")  // used via reflection
+  @SuppressWarnings("unused") // used via reflection
   public static GcsStager fromOptions(PipelineOptions options) {
 return new GcsStager(options.as(DataflowPipelineOptions.class));
   }
 
+  /**
+   * Stages {@link DataflowPipelineOptions#getFilesToStage()}, which defaults 
to every file on the
+   * classpath unless overridden, as well as {@link
+   * DataflowPipelineDebugOptions#getOverrideWindmillBinary()} if specified.
+   *
+   * @see #stageFiles(List)
+   */
   @Override
-  public List stageFiles() {
+  public List stageDefaultFiles() {
 checkNotNull(options.getStagingLocation());
 String windmillBinary =
 
options.as(DataflowPipelineDebugOptions.class).getOverrideWindmillBinary();
+
+List filesToStage = options.getFilesToStage();
+
 if (windmillBinary != null) {
-  options.getFilesToStage().add("windmill_main=" + windmillBinary);
+  filesToStage.add("windmill_main=" + windmillBinary);
 }
 
+return stageFiles(filesToStage);
+  }
+
+  /**
+   * Stages files to {@link DataflowPipelineOptions#getStagingLocation()}, 
suffixed with their md5
+   * hash to avoid collisions.
+   *
+   * Uses {@link DataflowPipelineOptions#getGcsUploadBufferSizeBytes()}.
+   */
+  @Override
+  public List stageFiles(List filesToStage) {
 int uploadSizeBytes = firstNonNull(options.getGcsUploadBufferSizeBytes(), 
1024 * 1024);
 checkArgument(uploadSizeBytes > 0, "gcsUploadBufferSizeBytes must be > 0");
 uploadSizeBytes = Math.min(uploadSizeBytes, 1024 * 1024);
 
-GcsCreateOptions createOptions = GcsCreateOptions.builder()
-.setGcsUploadBufferSizeBytes(uploadSizeBytes)
-

[4/7] beam git commit: Add assertion that valid jobs must have staged pipeline

2017-10-18 Thread kenn
Add assertion that valid jobs must have staged pipeline


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cef997ff
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cef997ff
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cef997ff

Branch: refs/heads/master
Commit: cef997ff06629a2c77b5aeb4f9ad40d8c4b3b22c
Parents: 090c512
Author: Kenneth Knowles 
Authored: Wed Oct 18 06:49:13 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:25 2017 -0700

--
 .../java/org/apache/beam/runners/dataflow/DataflowRunner.java | 3 ++-
 .../org/apache/beam/runners/dataflow/DataflowRunnerTest.java  | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cef997ff/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index ecef072..545321d 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -192,7 +192,8 @@ public class DataflowRunner extends 
PipelineRunner {
   @VisibleForTesting
   static final String PIPELINE_FILE_NAME = "pipeline.pb";
 
-  private static final String STAGED_PIPELINE_METADATA_PROPERTY = 
"pipeline_url";
+  @VisibleForTesting
+  static final String STAGED_PIPELINE_METADATA_PROPERTY = "pipeline_url";
 
   private final Set pcollectionsRequiringIndexedFormat;
 

http://git-wip-us.apache.org/repos/asf/beam/blob/cef997ff/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
index 5bc798a..02abc34 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
@@ -22,6 +22,7 @@ import static org.hamcrest.Matchers.both;
 import static org.hamcrest.Matchers.containsString;
 import static org.hamcrest.Matchers.equalTo;
 import static org.hamcrest.Matchers.hasItem;
+import static org.hamcrest.Matchers.hasKey;
 import static org.hamcrest.Matchers.is;
 import static org.hamcrest.Matchers.not;
 import static org.hamcrest.Matchers.startsWith;
@@ -45,6 +46,7 @@ import com.google.api.services.dataflow.Dataflow;
 import com.google.api.services.dataflow.model.DataflowPackage;
 import com.google.api.services.dataflow.model.Job;
 import com.google.api.services.dataflow.model.ListJobsResponse;
+import com.google.api.services.dataflow.model.WorkerPool;
 import com.google.api.services.storage.model.StorageObject;
 import com.google.common.base.Throwables;
 import com.google.common.collect.ImmutableList;
@@ -163,6 +165,11 @@ public class DataflowRunnerTest implements Serializable {
 assertNull(job.getId());
 assertNull(job.getCurrentState());
 assertTrue(Pattern.matches("[a-z]([-a-z0-9]*[a-z0-9])?", job.getName()));
+
+for (WorkerPool workerPool : job.getEnvironment().getWorkerPools()) {
+  assertThat(workerPool.getMetadata(),
+  hasKey(DataflowRunner.STAGED_PIPELINE_METADATA_PROPERTY));
+}
   }
 
   @Before



[7/7] beam git commit: This closes #3977: [BEAM-2963] Stage pipeline in DataflowRunner

2017-10-18 Thread kenn
This closes #3977: [BEAM-2963] Stage pipeline in DataflowRunner

  Remove duplicate mocking in DataflowRunnerTest
  Add assertion that valid jobs must have staged pipeline
  Stage the pipeline without using a temp file
  Stage the portable pipeline in Dataflow
  Add ability to stage explicit file list
  Improve GcsFileSystem errors messages slightly


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1974b920
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1974b920
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1974b920

Branch: refs/heads/master
Commit: 1974b920e4b3bbe8549e25fe789f9dada13c1769
Parents: 2acdc74 128f3a6
Author: Kenneth Knowles 
Authored: Wed Oct 18 13:08:50 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:08:50 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   |  15 ++-
 .../beam/runners/dataflow/util/GcsStager.java   |  53 +++--
 .../beam/runners/dataflow/util/PackageUtil.java | 116 ++-
 .../beam/runners/dataflow/util/Stager.java  |  32 -
 .../runners/dataflow/DataflowRunnerTest.java|  87 +++---
 .../extensions/gcp/storage/GcsFileSystem.java   |   5 +-
 6 files changed, 239 insertions(+), 69 deletions(-)
--




[1/7] beam git commit: Stage the portable pipeline in Dataflow

2017-10-18 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 2acdc747a -> 1974b920e


Stage the portable pipeline in Dataflow


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aea0c601
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aea0c601
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aea0c601

Branch: refs/heads/master
Commit: aea0c6017dc2cef2e62216d0882c7cc89cb57732
Parents: 9b866fe
Author: Kenneth Knowles 
Authored: Tue Oct 10 20:11:47 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:24 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   | 28 +++
 .../runners/dataflow/DataflowRunnerTest.java| 79 
 2 files changed, 91 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/aea0c601/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index 5e91850..6dbc4af 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -41,6 +41,7 @@ import com.google.common.collect.ImmutableList;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Iterables;
 import java.io.File;
+import java.io.FileOutputStream;
 import java.io.IOException;
 import java.io.PrintWriter;
 import java.net.URISyntaxException;
@@ -64,6 +65,7 @@ import 
org.apache.beam.runners.core.construction.DeduplicatedFlattenFactory;
 import org.apache.beam.runners.core.construction.EmptyFlattenAsCreateFactory;
 import org.apache.beam.runners.core.construction.PTransformMatchers;
 import org.apache.beam.runners.core.construction.PTransformReplacements;
+import org.apache.beam.runners.core.construction.PipelineTranslation;
 import org.apache.beam.runners.core.construction.RehydratedComponents;
 import org.apache.beam.runners.core.construction.ReplacementOutputs;
 import 
org.apache.beam.runners.core.construction.SingleInputOutputOverrideFactory;
@@ -188,6 +190,14 @@ public class DataflowRunner extends 
PipelineRunner {
   @VisibleForTesting
   static final int GCS_UPLOAD_BUFFER_SIZE_BYTES_DEFAULT = 1024 * 1024;
 
+  @VisibleForTesting
+  static final String PIPELINE_FILE_NAME = "pipeline";
+
+  @VisibleForTesting
+  static final String SERIALIZED_PROTOBUF_EXTENSION = ".pb";
+
+  private static final String STAGED_PIPELINE_METADATA_PROPERTY = 
"pipeline_url";
+
   private final Set pcollectionsRequiringIndexedFormat;
 
   /**
@@ -516,6 +526,22 @@ public class DataflowRunner extends 
PipelineRunner {
 
 List packages = options.getStager().stageDefaultFiles();
 
+RunnerApi.Pipeline protoPipeline = PipelineTranslation.toProto(pipeline);
+File serializedProtoPipeline;
+try {
+  serializedProtoPipeline =
+  File.createTempFile(PIPELINE_FILE_NAME, 
SERIALIZED_PROTOBUF_EXTENSION);
+  protoPipeline.writeDelimitedTo(new 
FileOutputStream(serializedProtoPipeline));
+} catch (IOException e) {
+  throw new RuntimeException(e);
+}
+
+LOG.info("Staging pipeline description to {}", 
options.getStagingLocation());
+DataflowPackage stagedPipeline =
+options
+.getStager()
+
.stageFiles(ImmutableList.of(serializedProtoPipeline.getAbsolutePath()))
+.get(0);
 
 // Set a unique client_request_id in the CreateJob request.
 // This is used to ensure idempotence of job creation across retried
@@ -560,6 +586,8 @@ public class DataflowRunner extends 
PipelineRunner {
 String workerHarnessContainerImage = getContainerImageForJob(options);
 for (WorkerPool workerPool : newJob.getEnvironment().getWorkerPools()) {
   workerPool.setWorkerHarnessContainerImage(workerHarnessContainerImage);
+  workerPool.setMetadata(
+  ImmutableMap.of(STAGED_PIPELINE_METADATA_PROPERTY, 
stagedPipeline.getLocation()));
 }
 
 newJob.getEnvironment().setVersion(getEnvironmentVersion(options));

http://git-wip-us.apache.org/repos/asf/beam/blob/aea0c601/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowRunnerTest.java
 

[3/7] beam git commit: Improve GcsFileSystem errors messages slightly

2017-10-18 Thread kenn
Improve GcsFileSystem errors messages slightly


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7409ca04
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7409ca04
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7409ca04

Branch: refs/heads/master
Commit: 7409ca042f4cc7f57c02d2ab2843a3bbc833a49a
Parents: 2acdc74
Author: Kenneth Knowles 
Authored: Tue Oct 10 21:55:33 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:24 2017 -0700

--
 .../apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java   | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7409ca04/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java
--
diff --git 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java
 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java
index 6db0a01..f35c62a 100644
--- 
a/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java
+++ 
b/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/storage/GcsFileSystem.java
@@ -88,10 +88,11 @@ class GcsFileSystem extends FileSystem {
 ImmutableList.Builder ret = ImmutableList.builder();
 for (Boolean isGlob : isGlobBooleans) {
   if (isGlob) {
-checkState(globsMatchResults.hasNext(), "Expect globsMatchResults has 
next.");
+checkState(globsMatchResults.hasNext(), "Expect globsMatchResults has 
next: %s", globs);
 ret.add(globsMatchResults.next());
   } else {
-checkState(nonGlobsMatchResults.hasNext(), "Expect 
nonGlobsMatchResults has next.");
+checkState(
+nonGlobsMatchResults.hasNext(), "Expect nonGlobsMatchResults has 
next: %s", nonGlobs);
 ret.add(nonGlobsMatchResults.next());
   }
 }



[5/7] beam git commit: Stage the pipeline without using a temp file

2017-10-18 Thread kenn
Stage the pipeline without using a temp file


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/090c5124
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/090c5124
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/090c5124

Branch: refs/heads/master
Commit: 090c512457e25c965efab2d6c849f1a50e03e052
Parents: aea0c60
Author: Kenneth Knowles 
Authored: Tue Oct 17 16:06:05 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 13:02:25 2017 -0700

--
 .../beam/runners/dataflow/DataflowRunner.java   |  22 +---
 .../beam/runners/dataflow/util/GcsStager.java   |  29 +++--
 .../beam/runners/dataflow/util/PackageUtil.java | 116 ++-
 .../beam/runners/dataflow/util/Stager.java  |   5 +
 4 files changed, 111 insertions(+), 61 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/090c5124/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index 6dbc4af..ecef072 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -41,7 +41,6 @@ import com.google.common.collect.ImmutableList;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Iterables;
 import java.io.File;
-import java.io.FileOutputStream;
 import java.io.IOException;
 import java.io.PrintWriter;
 import java.net.URISyntaxException;
@@ -191,10 +190,7 @@ public class DataflowRunner extends 
PipelineRunner {
   static final int GCS_UPLOAD_BUFFER_SIZE_BYTES_DEFAULT = 1024 * 1024;
 
   @VisibleForTesting
-  static final String PIPELINE_FILE_NAME = "pipeline";
-
-  @VisibleForTesting
-  static final String SERIALIZED_PROTOBUF_EXTENSION = ".pb";
+  static final String PIPELINE_FILE_NAME = "pipeline.pb";
 
   private static final String STAGED_PIPELINE_METADATA_PROPERTY = 
"pipeline_url";
 
@@ -526,22 +522,10 @@ public class DataflowRunner extends 
PipelineRunner {
 
 List packages = options.getStager().stageDefaultFiles();
 
-RunnerApi.Pipeline protoPipeline = PipelineTranslation.toProto(pipeline);
-File serializedProtoPipeline;
-try {
-  serializedProtoPipeline =
-  File.createTempFile(PIPELINE_FILE_NAME, 
SERIALIZED_PROTOBUF_EXTENSION);
-  protoPipeline.writeDelimitedTo(new 
FileOutputStream(serializedProtoPipeline));
-} catch (IOException e) {
-  throw new RuntimeException(e);
-}
-
+byte[] serializedProtoPipeline = 
PipelineTranslation.toProto(pipeline).toByteArray();
 LOG.info("Staging pipeline description to {}", 
options.getStagingLocation());
 DataflowPackage stagedPipeline =
-options
-.getStager()
-
.stageFiles(ImmutableList.of(serializedProtoPipeline.getAbsolutePath()))
-.get(0);
+options.getStager().stageToFile(serializedProtoPipeline, 
PIPELINE_FILE_NAME);
 
 // Set a unique client_request_id in the CreateJob request.
 // This is used to ensure idempotence of job creation across retried

http://git-wip-us.apache.org/repos/asf/beam/blob/090c5124/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
index ff205f0..7ed78e8 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/GcsStager.java
@@ -72,19 +72,28 @@ public class GcsStager implements Stager {
*/
   @Override
   public List stageFiles(List filesToStage) {
+try (PackageUtil packageUtil = PackageUtil.withDefaultThreadPool()) {
+  return packageUtil.stageClasspathElements(
+  filesToStage, options.getStagingLocation(), buildCreateOptions());
+}
+  }
+
+  @Override
+  public DataflowPackage stageToFile(byte[] bytes, String baseName) {
+try (PackageUtil packageUtil = PackageUtil.withDefaultThreadPool()) {
+  return packageUtil.stageToFile(
+  bytes, baseName, options.getStagingLocation(), buildCreateOptions());
+}
+  }
+
+  private 

[jira] [Commented] (BEAM-3029) BigTable integration tests failing on Dataflow: UserAgent must not be empty

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209900#comment-16209900
 ] 

ASF GitHub Bot commented on BEAM-3029:
--

Github user chamikaramj closed the pull request at:

https://github.com/apache/beam/pull/3998


> BigTable integration tests failing on Dataflow: UserAgent must not be empty
> ---
>
> Key: BEAM-3029
> URL: https://issues.apache.org/jira/browse/BEAM-3029
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.2.0
>
>
> https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/4963/org.apache.beam$beam-runners-google-cloud-dataflow-java/testReport/junit/org.apache.beam.sdk.io.gcp.bigtable/BigtableReadIT/testE2EBigtableRead/
> {code}
> java.lang.IllegalArgumentException: UserAgent must not be empty or null
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:233)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:77)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:351)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3998: [BEAM-3029] Sets user agent in BigTableIO.Read.getB...

2017-10-18 Thread chamikaramj
Github user chamikaramj closed the pull request at:

https://github.com/apache/beam/pull/3998


---


[GitHub] beam pull request #4009: [BEAM-2720] Update kafka client version to 0.11.0.1

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4009


---


[jira] [Commented] (BEAM-2720) Exact-once Kafka sink

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209891#comment-16209891
 ] 

ASF GitHub Bot commented on BEAM-2720:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4009


> Exact-once Kafka sink
> -
>
> Key: BEAM-2720
> URL: https://issues.apache.org/jira/browse/BEAM-2720
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>
> Kafka 0.11 added support for transactions which allows end-to-end 
> exactly-once semantics. Beam's KafkaIO users can benefit from these while 
> using runners that support exactly-once processing.
> I have an implementation of EOS support for Kafka sink : 
> https://github.com/apache/beam/pull/3612
> It has two shuffles and builds on Beam state-API and checkpoint barrier 
> between stages (as in Dataflow). Pull request has a longer description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/2] beam git commit: This closes #4009: [BEAM-2720] Update kafka client version to 0.11.0.1

2017-10-18 Thread kenn
This closes #4009: [BEAM-2720] Update kafka client version to 0.11.0.1


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2acdc747
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2acdc747
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2acdc747

Branch: refs/heads/master
Commit: 2acdc747a59140695269e4ac592695d902efbe0b
Parents: 77a4d3e 2768409
Author: Kenneth Knowles 
Authored: Wed Oct 18 12:41:50 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 12:41:50 2017 -0700

--
 pom.xml| 2 +-
 .../kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)
--




[1/2] beam git commit: [BEAM-2720] Update kafka client version to 0.11.0.1

2017-10-18 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 77a4d3e9d -> 2acdc747a


[BEAM-2720] Update kafka client version to 0.11.0.1

This was supposed to be in earler PR #3612, but it slipped through.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/27684090
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/27684090
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/27684090

Branch: refs/heads/master
Commit: 27684090f43ce24810e6a4a4efabee07628c7ae0
Parents: de7cc05
Author: Raghu Angadi 
Authored: Wed Oct 18 10:18:15 2017 -0700
Committer: Raghu Angadi 
Committed: Wed Oct 18 10:18:15 2017 -0700

--
 pom.xml| 2 +-
 .../kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java  | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/27684090/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 8e0b4f5..3509407 100644
--- a/pom.xml
+++ b/pom.xml
@@ -153,7 +153,7 @@
 4.4.1
 4.3.5.RELEASE
 1.1.4
-0.10.1.0
+0.11.0.1
 1.4
 
 1.5.0.Final

http://git-wip-us.apache.org/repos/asf/beam/blob/27684090/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
--
diff --git 
a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java 
b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
index 603e62f..af73a8d 100644
--- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
+++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
@@ -1590,7 +1590,6 @@ public class KafkaIO {
  * consumer. Similar to {@link 
Read#withConsumerFactoryFn(SerializableFunction)}, a factory
  * function can be supplied if required in a specific case.
  * The default is {@link KafkaConsumer}.
- * @param consumerFactoryFn
  */
 public Write withConsumerFactoryFn(
 SerializableFunction, ? extends Consumer> 
consumerFactoryFn) {



Build failed in Jenkins: beam_PerformanceTests_Python #463

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update PipelineTest.testReplacedNames

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam3 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 77a4d3e9dc1ee8f2aac324e50af7b05168ee2af7 (origin/master)
Commit message: "This closes #3929"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 77a4d3e9dc1ee8f2aac324e50af7b05168ee2af7
 > git rev-list de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8918830133848153658.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins8593057067320330842.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4981956731679526148.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins935345968709901246.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: dill==0.2.6 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: grpcio<2.0,>=1.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: httplib2<0.10,>=0.8 in 

Build failed in Jenkins: beam_PostCommit_Python_Verify #3385

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Update PipelineTest.testReplacedNames

--
[...truncated 886.08 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 

[jira] [Commented] (BEAM-2720) Exact-once Kafka sink

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209703#comment-16209703
 ] 

ASF GitHub Bot commented on BEAM-2720:
--

GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/4009

[BEAM-2720] Update kafka client version to 0.11.0.1

This was supposed to be in earlier PR #3612, but it slipped through.

+R: @kennknowles, @iemejia 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/beam update_kafka_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4009


commit 27684090f43ce24810e6a4a4efabee07628c7ae0
Author: Raghu Angadi 
Date:   2017-10-18T17:18:15Z

[BEAM-2720] Update kafka client version to 0.11.0.1

This was supposed to be in earler PR #3612, but it slipped through.




> Exact-once Kafka sink
> -
>
> Key: BEAM-2720
> URL: https://issues.apache.org/jira/browse/BEAM-2720
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>
> Kafka 0.11 added support for transactions which allows end-to-end 
> exactly-once semantics. Beam's KafkaIO users can benefit from these while 
> using runners that support exactly-once processing.
> I have an implementation of EOS support for Kafka sink : 
> https://github.com/apache/beam/pull/3612
> It has two shuffles and builds on Beam state-API and checkpoint barrier 
> between stages (as in Dataflow). Pull request has a longer description.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #4009: [BEAM-2720] Update kafka client version to 0.11.0.1

2017-10-18 Thread rangadi
GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/4009

[BEAM-2720] Update kafka client version to 0.11.0.1

This was supposed to be in earlier PR #3612, but it slipped through.

+R: @kennknowles, @iemejia 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/beam update_kafka_version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4009


commit 27684090f43ce24810e6a4a4efabee07628c7ae0
Author: Raghu Angadi 
Date:   2017-10-18T17:18:15Z

[BEAM-2720] Update kafka client version to 0.11.0.1

This was supposed to be in earler PR #3612, but it slipped through.




---


[jira] [Resolved] (BEAM-3006) Minor fix in PipelineTest.testReplacedNames()

2017-10-18 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh resolved BEAM-3006.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Minor fix in PipelineTest.testReplacedNames()
> -
>
> Key: BEAM-3006
> URL: https://issues.apache.org/jira/browse/BEAM-3006
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Uri Silberstein
>Assignee: Uri Silberstein
>Priority: Minor
> Fix For: Not applicable
>
>
> This method tests the pipeline replace functionality. 
> It does so by replacing a OriginalTransform with ReplacementTransform and 
> then assert that the pipeline contains the ReplacementTransform by assert the 
> transformation name.
> The problem is that both transformations (OriginalTransform and 
> ReplacementTransform) have the same transformation name. So the assert 
> doesn't really checks a thing 
> The offered fix:
> Changing the names to "original_name" and "replacement_name" respectively and 
> then asserting that transformation name is "replacement_name" and not 
> "original_name"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-3006) Minor fix in PipelineTest.testReplacedNames()

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209684#comment-16209684
 ] 

ASF GitHub Bot commented on BEAM-3006:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3929


> Minor fix in PipelineTest.testReplacedNames()
> -
>
> Key: BEAM-3006
> URL: https://issues.apache.org/jira/browse/BEAM-3006
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-core
>Reporter: Uri Silberstein
>Assignee: Uri Silberstein
>Priority: Minor
>
> This method tests the pipeline replace functionality. 
> It does so by replacing a OriginalTransform with ReplacementTransform and 
> then assert that the pipeline contains the ReplacementTransform by assert the 
> transformation name.
> The problem is that both transformations (OriginalTransform and 
> ReplacementTransform) have the same transformation name. So the assert 
> doesn't really checks a thing 
> The offered fix:
> Changing the names to "original_name" and "replacement_name" respectively and 
> then asserting that transformation name is "replacement_name" and not 
> "original_name"



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3929: [BEAM-3006] fixing minor issue in PipelineTest#test...

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3929


---


[1/2] beam git commit: This closes #3929

2017-10-18 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master de7cc05cc -> 77a4d3e9d


This closes #3929


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/77a4d3e9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/77a4d3e9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/77a4d3e9

Branch: refs/heads/master
Commit: 77a4d3e9dc1ee8f2aac324e50af7b05168ee2af7
Parents: de7cc05 1e3bee1
Author: Thomas Groh 
Authored: Wed Oct 18 10:00:00 2017 -0700
Committer: Thomas Groh 
Committed: Wed Oct 18 10:00:00 2017 -0700

--
 .../java/org/apache/beam/sdk/PipelineTest.java  | 51 +++-
 1 file changed, 27 insertions(+), 24 deletions(-)
--




[2/2] beam git commit: Update PipelineTest.testReplacedNames

2017-10-18 Thread tgroh
Update PipelineTest.testReplacedNames

Validate that the node has been replaced (via comparing the class of a
subnode) rather than just checking names.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1e3bee18
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1e3bee18
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1e3bee18

Branch: refs/heads/master
Commit: 1e3bee189b0e4368604816a2c7df600c86233a20
Parents: de7cc05
Author: Uri Silberstein 
Authored: Mon Oct 2 16:27:13 2017 +0300
Committer: Thomas Groh 
Committed: Wed Oct 18 10:00:00 2017 -0700

--
 .../java/org/apache/beam/sdk/PipelineTest.java  | 51 +++-
 1 file changed, 27 insertions(+), 24 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1e3bee18/sdks/java/core/src/test/java/org/apache/beam/sdk/PipelineTest.java
--
diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/PipelineTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/PipelineTest.java
index 2cc3f04..57fdd75 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/PipelineTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/PipelineTest.java
@@ -30,10 +30,9 @@ import static org.junit.Assert.fail;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Iterables;
 import java.util.Collections;
-import java.util.HashSet;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
-import java.util.Set;
 import org.apache.beam.sdk.Pipeline.PipelineExecutionException;
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.io.GenerateSequence;
@@ -51,12 +50,13 @@ import org.apache.beam.sdk.testing.NeedsRunner;
 import org.apache.beam.sdk.testing.PAssert;
 import org.apache.beam.sdk.testing.TestPipeline;
 import org.apache.beam.sdk.testing.ValidatesRunner;
-import org.apache.beam.sdk.transforms.Count;
 import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.Flatten;
 import org.apache.beam.sdk.transforms.MapElements;
+import org.apache.beam.sdk.transforms.Max;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.SimpleFunction;
+import org.apache.beam.sdk.transforms.Sum;
 import org.apache.beam.sdk.util.UserCodeException;
 import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;
@@ -394,39 +394,40 @@ public class PipelineTest {
   }
 
   @Test
-  public void testReplacedNames() {
+  public void testReplaceWithExistingName() {
 pipeline.enableAbandonedNodeEnforcement(false);
-final PCollection originalInput = pipeline.apply(Create.of("foo", 
"bar", "baz"));
-class OriginalTransform extends PTransform {
+final PCollection originalInput = pipeline.apply(Create.of(1, 2, 
3));
+class OriginalTransform extends PTransform {
   @Override
-  public PCollection expand(PCollection input) {
-return input.apply("custom_name", Count.globally());
+  public PCollection expand(PCollection input) {
+return input.apply("custom_name", Sum.integersGlobally());
   }
 }
-class ReplacementTransform extends PTransform {
+class ReplacementTransform extends PTransform {
   @Override
-  public PCollection expand(PCollection input) {
-return input.apply("custom_name", Count.globally());
+  public PCollection expand(PCollection input) {
+return input.apply("custom_name", Max.integersGlobally());
   }
 }
 class ReplacementOverrideFactory
 implements PTransformOverrideFactory<
-PCollection, PCollection, OriginalTransform> {
-  @Override
-  public PTransformReplacement 
getReplacementTransform(
-  AppliedPTransform transform) {
+PCollection, PCollection, OriginalTransform> {
+
+  @Override public PTransformReplacement
+  getReplacementTransform(
+  AppliedPTransform transform) {
 return PTransformReplacement.of(originalInput, new 
ReplacementTransform());
   }
 
   @Override
   public Map mapOutputs(
-  Map outputs, PCollection newOutput) {
+  Map outputs, PCollection newOutput) {
 return Collections.singletonMap(
 newOutput,
 ReplacementOutput.of(
-TaggedPValue.ofExpandedValue(

Build failed in Jenkins: beam_PostCommit_Python_Verify #3384

2017-10-18 Thread Apache Jenkins Server
See 


--
[...truncated 882.71 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[jira] [Commented] (BEAM-3056) dataflow-runner-integration-tests @ beam-examples-java fail in precommit suite on 2.2.0 branch

2017-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16209335#comment-16209335
 ] 

ASF GitHub Bot commented on BEAM-3056:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4006


> dataflow-runner-integration-tests @ beam-examples-java  fail in precommit 
> suite on 2.2.0 branch
> ---
>
> Key: BEAM-3056
> URL: https://issues.apache.org/jira/browse/BEAM-3056
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Affects Versions: 2.2.0
>Reporter: Valentyn Tymofieiev
>Assignee: Kenneth Knowles
>Priority: Blocker
>
> [UPDATE]
> org.apache.beam.examples.WordCountIT seems to be consistently failing on the 
> release branch:
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15044/consoleFull
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15037/consoleFull
> https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/15069/.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/2] beam git commit: This closes #4006: Use 2.2.0 container tag for Dataflow

2017-10-18 Thread kenn
This closes #4006: Use 2.2.0 container tag for Dataflow


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/582ddefb
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/582ddefb
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/582ddefb

Branch: refs/heads/release-2.2.0
Commit: 582ddefbddd85b37d859c5eb346c4402b94013c0
Parents: 09431da 1b10b91
Author: Kenneth Knowles 
Authored: Wed Oct 18 06:35:28 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Oct 18 06:35:28 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[GitHub] beam pull request #4006: [BEAM-3056] Set Dataflow container version to 2.2.0

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/4006


---


[1/2] beam git commit: Use 2.2.0 container tag for Dataflow

2017-10-18 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/release-2.2.0 09431daeb -> 582ddefbd


Use 2.2.0 container tag for Dataflow


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1b10b91d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1b10b91d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1b10b91d

Branch: refs/heads/release-2.2.0
Commit: 1b10b91d6e4ac46559d9a5c907a9c48482c1fbca
Parents: e7a8620
Author: Kenneth Knowles 
Authored: Tue Oct 17 11:37:56 2017 -0700
Committer: Kenneth Knowles 
Committed: Tue Oct 17 11:37:56 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1b10b91d/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 90ab9f6..c6def84 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-beam-2.2.0
+2.2.0
 
1
 
6
   



Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Apex #2633

2017-10-18 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #462

2017-10-18 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d (origin/master)
Commit message: "This closes #3938: [BEAM-2674] Add custom rehydration; 
reinstate proto roundtrip for Java DirectRunner"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d
 > git rev-list de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1499733128993323419.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2468989098323613665.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6197038871836816927.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins62753780325581744.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/usr/lib/python2.7/dist-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: dill==0.2.6 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: grpcio<2.0,>=1.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: httplib2<0.10,>=0.8 in 

Build failed in Jenkins: beam_PostCommit_Python_Verify #3383

2017-10-18 Thread Apache Jenkins Server
See 


--
[...truncated 886.11 KB...]
copying apache_beam/portability/api/standard_window_fns_pb2.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/portability/api/standard_window_fns_pb2_grpc.py -> 
apache-beam-2.3.0.dev0/apache_beam/portability/api
copying apache_beam/runners/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.pxd -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/common_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/pipeline_context_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners
copying apache_beam/runners/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_metrics_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/dataflow_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/ptransform_overrides.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/template_runner_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/test_dataflow_runner.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow
copying apache_beam/runners/dataflow/internal/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/apiclient_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/dependency_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/names.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal
copying apache_beam/runners/dataflow/internal/clients/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients
copying apache_beam/runners/dataflow/internal/clients/dataflow/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_messages.py
 -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying 
apache_beam/runners/dataflow/internal/clients/dataflow/message_matchers_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/internal/clients/dataflow
copying apache_beam/runners/dataflow/native_io/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/iobase_test.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/dataflow/native_io/streaming_create.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/dataflow/native_io
copying apache_beam/runners/direct/__init__.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/bundle_factory.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/clock.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/consumer_tracking_pipeline_visitor_test.py 
-> apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying apache_beam/runners/direct/direct_metrics.py -> 
apache-beam-2.3.0.dev0/apache_beam/runners/direct
copying 

[GitHub] beam pull request #4008: Fix error and warnings on KafkaIO javadoc generatio...

2017-10-18 Thread iemejia
GitHub user iemejia opened a pull request:

https://github.com/apache/beam/pull/4008

Fix error and warnings on KafkaIO javadoc generation

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iemejia/beam beam2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/4008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4008


commit 83197b3bb5a8d5ed5b22627819ba9ad1d3ba0f62
Author: Ismaël Mejía 
Date:   2017-10-18T08:44:33Z

Fix error and warnings on KafkaIO javadoc generation




---


[jira] [Commented] (BEAM-3039) DatastoreIO.Write fails multiple mutations of same entity

2017-10-18 Thread Alexander Hoem Rosbach (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208917#comment-16208917
 ] 

Alexander Hoem Rosbach commented on BEAM-3039:
--

In my experience the duplication removal is cheaper when incorporated into the 
Datastore writer, and when it only removes duplicates within a single commit 
batch. If the removal is to be done outside of this scope then I don't see any 
other way than to introduce windowing. 

> DatastoreIO.Write fails multiple mutations of same entity
> -
>
> Key: BEAM-3039
> URL: https://issues.apache.org/jira/browse/BEAM-3039
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: 2.3.0
>Reporter: Alexander Hoem Rosbach
>Assignee: Reuven Lax
>Priority: Minor
>
> When streaming messages from a source that doesn't guarantee 
> once-only-delivery, but has at-least-once-delivery, then the 
> DatastoreIO.Write will throw an exception which leads to Dataflow retrying 
> the same commit multiple times before giving up. This leads to a significant 
> bottleneck in the pipeline, with the end-result that the data is dropped. 
> This should be handled better.
> There are a number of ways to fix this. One of them could be to drop any 
> duplicate mutations within one batch. Non-duplicates should also be handled 
> in some way. Perhaps a use NON-TRANSACTIONAL commit, or make sure the 
> mutations are commited in different commits.
> {code}
> com.google.datastore.v1.client.DatastoreException: A non-transactional commit 
> may not contain multiple mutations affecting the same entity., 
> code=INVALID_ARGUMENT
> 
> com.google.datastore.v1.client.RemoteRpc.makeException(RemoteRpc.java:126)
> 
> com.google.datastore.v1.client.RemoteRpc.makeException(RemoteRpc.java:169)
> com.google.datastore.v1.client.RemoteRpc.call(RemoteRpc.java:89)
> com.google.datastore.v1.client.Datastore.commit(Datastore.java:84)
> 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$DatastoreWriterFn.flushBatch(DatastoreV1.java:1288)
> 
> org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$DatastoreWriterFn.processElement(DatastoreV1.java:1253)
>  
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (BEAM-3073) Connect to Apache ignite via JdbcIO sdk

2017-10-18 Thread Rick Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208867#comment-16208867
 ] 

Rick Lin edited comment on BEAM-3073 at 10/18/17 7:02 AM:
--

There are details about my run environment

*OS:*
Distributor ID: Ubuntu
Description:Ubuntu 14.04.5 LTS
Release:14.04
Codename:   trusty
*Port:*
11211 (LISTEN)
*pom.xml:*

org.apache.ignite
ignite-core
${ignite.version}


Here, I have not import any class about ignite package
In addition, the ignite is running.
Thanks
Rick




was (Author: ricklin):
There are details about my run environment

*OS:*
Distributor ID: Ubuntu
Description:Ubuntu 14.04.5 LTS
Release:14.04
Codename:   trusty
*Port:*
11211 (LISTEN)
*pom.xml:*

org.apache.ignite
ignite-core
${ignite.version}


Here, I have not import any class about ignite package

Thanks
Rick



> Connect to Apache ignite via JdbcIO sdk
> ---
>
> Key: BEAM-3073
> URL: https://issues.apache.org/jira/browse/BEAM-3073
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Rick Lin
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>
> Hi all,
> {color:#14892c}I tried to connect Apache Ignite(In-memory) via the beam's 
> sdk:org.apache.beam.sdk.io.jdbc.JdbcIO
> Here, i am not sure if the JdbcIO sdk only is provided for some specific 
> Database: MySQL(disk), postgreSQL(disk)?{color}
> my java test code is as follows:
> import java.sql.PreparedStatement;
> import java.sql.SQLException;
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.jdbc.JdbcIO;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.transforms.Create;
> import org.apache.beam.sdk.values.KV;
> import org.apache.beam.sdk.values.PCollection;
> public class BeamtoJDBC {
>   public static void main(String[] args) {
>   Integer[] value=new Integer[] {1,2,3,4,5};
>   List> dataList = new ArrayList<>();
>   int n=value.length;
>   int count=0;
>   for (int i=0; i   {
>   dataList.add(KV.of(count,value[i]));
>   count=count+1;  
>   }
>   
>   Pipeline p = 
> Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
>   
>   PCollection> data=p.apply("create data 
> with time",Create.of(dataList));
>   data.apply(JdbcIO.>write()
>   
> .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration
>   
> .create("org.apache.ignite.IgniteJdbcDriver", 
> "jdbc:ignite://localhost:11211/")
>   )   
>   .withPreparedStatementSetter(new 
> JdbcIO.PreparedStatementSetter>() {
>   public void setParameters(KV Integer> element, PreparedStatement query)
>   throws SQLException {
>   query.setInt(1, 
> element.getKey());
>   query.setInt(2, 
> element.getValue());
>   }
>   })
>   );
>   p.run();
>   }
> }
> {color:#d04437}my error message is: 
> " InvocationTargetException: org.apache.beam.sdk.util.UserCodeException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (Failed to establish connection.): Failed to get future result due to waiting 
> timed out. "{color}
> {color:#14892c}I would like to know whether the connection between beam and 
> ignite is feasible or not?{color}
> Thanks
> Rick



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Apex #2632

2017-10-18 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3073) Connect to Apache ignite via JdbcIO sdk

2017-10-18 Thread Rick Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208867#comment-16208867
 ] 

Rick Lin commented on BEAM-3073:


There are details about my run environment

*OS:*
Distributor ID: Ubuntu
Description:Ubuntu 14.04.5 LTS
Release:14.04
Codename:   trusty
*Port:*
11211 (LISTEN)
*pom.xml:*

org.apache.ignite
ignite-core
${ignite.version}


Here, I have not import any class about ignite package

Thanks
Rick



> Connect to Apache ignite via JdbcIO sdk
> ---
>
> Key: BEAM-3073
> URL: https://issues.apache.org/jira/browse/BEAM-3073
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Rick Lin
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>
> Hi all,
> {color:#14892c}I tried to connect Apache Ignite(In-memory) via the beam's 
> sdk:org.apache.beam.sdk.io.jdbc.JdbcIO
> Here, i am not sure if the JdbcIO sdk only is provided for some specific 
> Database: MySQL(disk), postgreSQL(disk)?{color}
> my java test code is as follows:
> import java.sql.PreparedStatement;
> import java.sql.SQLException;
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.jdbc.JdbcIO;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.transforms.Create;
> import org.apache.beam.sdk.values.KV;
> import org.apache.beam.sdk.values.PCollection;
> public class BeamtoJDBC {
>   public static void main(String[] args) {
>   Integer[] value=new Integer[] {1,2,3,4,5};
>   List> dataList = new ArrayList<>();
>   int n=value.length;
>   int count=0;
>   for (int i=0; i   {
>   dataList.add(KV.of(count,value[i]));
>   count=count+1;  
>   }
>   
>   Pipeline p = 
> Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
>   
>   PCollection> data=p.apply("create data 
> with time",Create.of(dataList));
>   data.apply(JdbcIO.>write()
>   
> .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration
>   
> .create("org.apache.ignite.IgniteJdbcDriver", 
> "jdbc:ignite://localhost:11211/")
>   )   
>   .withPreparedStatementSetter(new 
> JdbcIO.PreparedStatementSetter>() {
>   public void setParameters(KV Integer> element, PreparedStatement query)
>   throws SQLException {
>   query.setInt(1, 
> element.getKey());
>   query.setInt(2, 
> element.getValue());
>   }
>   })
>   );
>   p.run();
>   }
> }
> {color:#d04437}my error message is: 
> " InvocationTargetException: org.apache.beam.sdk.util.UserCodeException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (Failed to establish connection.): Failed to get future result due to waiting 
> timed out. "{color}
> {color:#14892c}I would like to know whether the connection between beam and 
> ignite is feasible or not?{color}
> Thanks
> Rick



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PerformanceTests_Python #461

2017-10-18 Thread Apache Jenkins Server
See 


--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam3 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d (origin/master)
Commit message: "This closes #3938: [BEAM-2674] Add custom rehydration; 
reinstate proto roundtrip for Java DirectRunner"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d
 > git rev-list de7cc05cc67d1aa6331cddc17c2e02ed0efbe37d # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5800789707936811445.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2019672464813627486.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1644852133277033587.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in /usr/lib/python2.7/dist-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7985866330802702704.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: dill==0.2.6 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: grpcio<2.0,>=1.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.3.0.dev)
Requirement already satisfied: httplib2<0.10,>=0.8 in