[GitHub] beam pull request #3612: Kafka exactly-once sink.

2017-07-20 Thread rangadi
GitHub user rangadi opened a pull request:

https://github.com/apache/beam/pull/3612

Kafka exactly-once sink.

Implementation of an exactly-once sink for Kafka, making use of 
transactions added in Kafka 0.11. This requires exact-once semantics for 
runners similar to Dataflow. 

This is not ready for merge. Will distribute more once it goes through 
initial feedback.

There are a few minor TODOs for implementation and of course need to add 
more sink tests.

This uses consumer group id for the topic to store metadata. It seems to 
work well, but it is not very convenient for a user to manage. E.g. if the user 
wants to restart a job from scratch using the same group_id, existing metadata 
prevents it from starting. User has to use new group (which is perfectly fine) 
or needs to clear the metadata programatically. This sink could help users 
manage this better (e.g. an option to discard metatada). will see.

How this works : from a comment in he code:

//
// Dataflow ensures at-least once processing for side effects like 
sinks. In order to provide
// exactly-once semantics, a sink needs to be idempotent or it should 
avoid writing records
// that have already been written. This snk does the latter. All the 
the records are ordered
// across a fixed number of shards and records in each shard are 
written in order. It drops
// any records that are already written and buffers those arriving out 
of order.
//
//  // Exactly once sink involves two shuffles of the records:
//A -- GBK --> B -- GBK --> C
//
// Processing guarantees also require deterministic processing within 
user transforms.
// in this case that implies the order of the records seen by C should 
not be affected by
// restarts in upstream stages link B & A.
//
// A : Assigns a random shard for message. Note that there are no 
ordering guarantees for
// writing user records to Kafka. User can still control 
partitioning among topic
// partitions as with regular sink (of course, there are no 
ordering guarantees in
// regular Kafka sink either).
// B : Assigns an id sequentially for each messages within a shard.
// C : Writes each shard to Kafka in sequential id order. In Dataflow, 
when C sees a record
// and id, it implies that record and the associated id are 
checkpointed to persistent
// storage and this record will always have same id, even in 
retries.
// Exactly-once semantics are achieved by writing records in the 
strict order of
// these checkpointed sequence ids.
//
// Parallelism for B and C is fixed to 'numShards', which defaults to 
number of partitions
// for the topic. A few reasons for that:
//  - B & C implement their functionality using per-key state. Shard id 
makes it independent
//of cardinality of user key.
//  - We create one producer per shard, and its 'transactional id' is 
based on shard id. This
//requires that number of shards to be finite. This also helps with 
batching. and avoids
//initializing producers and transactions.
//  - Most importantly, each of sharded writers stores 'next message 
id' in partition
//metadata, which is committed atomically with Kafka transactions. 
This is critical
//to handle retries of C correctly. Initial testing showed number 
of shards could be
//larger than number of partitions for the topic.
//
// Number of shards can change across multiple runs of a pipeline (job 
upgrade in Dataflow).
//




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rangadi/beam eo_sink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3612.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3612


commit 76bab2d8ec83ccc2826c929a50fb13d62bb4685b
Author: Raghu Angadi 
Date:   2017-07-21T06:26:49Z

Kafka exactly-once sink.
Tested manually with direct runner and on dataflow.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3611: [BEAM-79] merge gearpump-runner into master

2017-07-20 Thread manuzhang
GitHub user manuzhang opened a pull request:

https://github.com/apache/beam/pull/3611

[BEAM-79] merge gearpump-runner into master

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/beam gearpump-runner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3611


commit 9478f4117de3a2d0ea40614ed4cb801918610724
Author: manuzhang 
Date:   2016-03-15T08:15:16Z

[BEAM-79] add Gearpump runner

commit 02b2248a5b3c8a2c064547d7380bebc97f849bf1
Author: Kenneth Knowles 
Date:   2016-07-20T16:07:48Z

This closes #323

commit 2a0ba61e8507e1539115b583749a78f14d577bd8
Author: Kenneth Knowles 
Date:   2016-08-25T18:36:45Z

Merge branch master into gearpump-runner

commit 1672b5483e029292816397248dc6fe63bf51f4af
Author: manuzhang 
Date:   2016-07-23T06:10:15Z

move integration tests to profile

commit 276a2e106aa1a573fc2eb2426b640f63cf68
Author: manuzhang 
Date:   2016-07-28T08:30:13Z

add package-info.java

commit 40be715a696bb1218b209f7ad9a979b7e5d088d3
Author: Kenneth Knowles 
Date:   2016-08-10T17:26:57Z

Update Gearpump runner version to 0.3.0-incubating

commit bc1b354949416db3b52c4f37c66968bdb86f0813
Author: manuzhang 
Date:   2016-08-11T23:22:00Z

Rename DoFn to OldDoFn in Gearpump runner

commit 091a15a07c7625ae3009cefaecece3a29a34c109
Author: Kenneth Knowles 
Date:   2016-08-25T18:40:03Z

This closess #750

commit fb74c936ed92c7a8548c338cc03957794fc60902
Author: Dan Halperin 
Date:   2016-08-26T23:25:58Z

gearpump: switch to stable version

They have apparently deleted the SNAPSHOT jar and now builds are failing.

commit bf0a2edae11416a3cbddeaff2c0a70adc272c5fe
Author: Dan Halperin 
Date:   2016-08-27T00:46:42Z

Closes #895

commit 89921c41ca9d4c333af45efa32359a631214c1df
Author: bchambers 
Date:   2016-07-29T16:41:17Z

Remove Counter and associated code

Aggregator is the model level concept. Counter was specific to the
Dataflow Runner, and is now not needed as part of Beam.

commit 7fc2c6848f002ac8b2ccbe35e6b5a576777a7af9
Author: Mark Liu 
Date:   2016-08-03T00:25:14Z

[BEAM-495] Create General Verifier for File Checksum

commit b47549e4893a6d487c00ea0ba02619168a3f19f3
Author: Mark Liu 
Date:   2016-08-03T00:47:46Z

Add output checksum to  WordCountITOptions

commit 58cd781c82fa728f34f5ab0641f8f9b6edcf449c
Author: Ian Zhou 
Date:   2016-08-05T22:31:59Z

Added unit tests and error handling in removeTemporaryTables

commit 36a9aa232ea56de449930194788becce585212ef
Author: Thomas Groh 
Date:   2016-08-09T02:09:58Z

Improve Write Error Message

If provided with an Unbounded PCollection, Write will fail due to
restriction of calling finalize only once. This error message fails in a
deep stack trace based on it not being possible to apply a GroupByKey.
Instead, throw immediately on application with a specific error message.

commit d5641553cebb02f08ca7c1fe667948d39cb3962c
Author: Thomas Groh 
Date:   2016-08-09T17:47:09Z

Remove Streaming Write Overrides in DataflowRunner

These writes should be forbidden based on the boundedness of the input
PCollection. As Write explicitly forbids the application of the
transform to an Unbounded PCollection, this will be equivalent in most
cases; In cases where the input PCollection is Bounded, due to an
UnboundedReadFromBoundedSource, the write will function as expected and
does not need to be forbidden.

commit 011bea9a83a828e0d8c6518ab83aa5cc4f75e146
Author: David Rieber 
Date:   2016-08-09T21:05:25Z

Do not add DataDisks to windmill service jobs.

commit 0dfb8ff55d6f80264222fde4501ea3050d2e391

[jira] [Commented] (BEAM-79) Gearpump runner

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095850#comment-16095850
 ] 

ASF GitHub Bot commented on BEAM-79:


GitHub user manuzhang opened a pull request:

https://github.com/apache/beam/pull/3611

[BEAM-79] merge gearpump-runner into master

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/beam gearpump-runner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3611


commit 9478f4117de3a2d0ea40614ed4cb801918610724
Author: manuzhang 
Date:   2016-03-15T08:15:16Z

[BEAM-79] add Gearpump runner

commit 02b2248a5b3c8a2c064547d7380bebc97f849bf1
Author: Kenneth Knowles 
Date:   2016-07-20T16:07:48Z

This closes #323

commit 2a0ba61e8507e1539115b583749a78f14d577bd8
Author: Kenneth Knowles 
Date:   2016-08-25T18:36:45Z

Merge branch master into gearpump-runner

commit 1672b5483e029292816397248dc6fe63bf51f4af
Author: manuzhang 
Date:   2016-07-23T06:10:15Z

move integration tests to profile

commit 276a2e106aa1a573fc2eb2426b640f63cf68
Author: manuzhang 
Date:   2016-07-28T08:30:13Z

add package-info.java

commit 40be715a696bb1218b209f7ad9a979b7e5d088d3
Author: Kenneth Knowles 
Date:   2016-08-10T17:26:57Z

Update Gearpump runner version to 0.3.0-incubating

commit bc1b354949416db3b52c4f37c66968bdb86f0813
Author: manuzhang 
Date:   2016-08-11T23:22:00Z

Rename DoFn to OldDoFn in Gearpump runner

commit 091a15a07c7625ae3009cefaecece3a29a34c109
Author: Kenneth Knowles 
Date:   2016-08-25T18:40:03Z

This closess #750

commit fb74c936ed92c7a8548c338cc03957794fc60902
Author: Dan Halperin 
Date:   2016-08-26T23:25:58Z

gearpump: switch to stable version

They have apparently deleted the SNAPSHOT jar and now builds are failing.

commit bf0a2edae11416a3cbddeaff2c0a70adc272c5fe
Author: Dan Halperin 
Date:   2016-08-27T00:46:42Z

Closes #895

commit 89921c41ca9d4c333af45efa32359a631214c1df
Author: bchambers 
Date:   2016-07-29T16:41:17Z

Remove Counter and associated code

Aggregator is the model level concept. Counter was specific to the
Dataflow Runner, and is now not needed as part of Beam.

commit 7fc2c6848f002ac8b2ccbe35e6b5a576777a7af9
Author: Mark Liu 
Date:   2016-08-03T00:25:14Z

[BEAM-495] Create General Verifier for File Checksum

commit b47549e4893a6d487c00ea0ba02619168a3f19f3
Author: Mark Liu 
Date:   2016-08-03T00:47:46Z

Add output checksum to  WordCountITOptions

commit 58cd781c82fa728f34f5ab0641f8f9b6edcf449c
Author: Ian Zhou 
Date:   2016-08-05T22:31:59Z

Added unit tests and error handling in removeTemporaryTables

commit 36a9aa232ea56de449930194788becce585212ef
Author: Thomas Groh 
Date:   2016-08-09T02:09:58Z

Improve Write Error Message

If provided with an Unbounded PCollection, Write will fail due to
restriction of calling finalize only once. This error message fails in a
deep stack trace based on it not being possible to apply a GroupByKey.
Instead, throw immediately on application with a specific error message.

commit d5641553cebb02f08ca7c1fe667948d39cb3962c
Author: Thomas Groh 
Date:   2016-08-09T17:47:09Z

Remove Streaming Write Overrides in DataflowRunner

These writes should be forbidden based on the boundedness of the input
PCollection. As Write explicitly forbids the application of the
transform to an Unbounded PCollection, this will be equivalent in most
cases; In cases where the input PCollection is Bounded, due to an
UnboundedReadFromBoundedSource, the write will function as 

[jira] [Created] (BEAM-2652) cleanup pom.xml

2017-07-20 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2652:


 Summary: cleanup pom.xml
 Key: BEAM-2652
 URL: https://issues.apache.org/jira/browse/BEAM-2652
 Project: Beam
  Issue Type: Sub-task
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin


Update pom.xml as mentioned in 
https://github.com/apache/beam/pull/3606#pullrequestreview-51373781 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-2651) Change requests for DSL_SQL merge

2017-07-20 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2651:


 Summary: Change requests for DSL_SQL merge
 Key: BEAM-2651
 URL: https://issues.apache.org/jira/browse/BEAM-2651
 Project: Beam
  Issue Type: Task
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin


This task collects all the sub-tasks, which are requested in 
https://github.com/apache/beam/pull/3606. 




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-2651) Change requests for DSL_SQL merge

2017-07-20 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin reassigned BEAM-2651:


Assignee: (was: Xu Mingmin)

> Change requests for DSL_SQL merge
> -
>
> Key: BEAM-2651
> URL: https://issues.apache.org/jira/browse/BEAM-2651
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>  Labels: bundle, dsl_sql_merge, dsl_sql_review
>
> This task collects all the sub-tasks, which are requested in 
> https://github.com/apache/beam/pull/3606. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[beam-site] branch asf-site updated (e21c3d4 -> 0dabfd1)

2017-07-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from e21c3d4  Prepare repository for deployment.
 add fd67536  [BEAM-2126] Add JStorm runner to ongoing projects.
 add 2b560ce  This closes #277
 new 0dabfd1  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/contribute/work-in-progress/index.html | 6 ++
 src/contribute/work-in-progress.md | 1 +
 2 files changed, 7 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 01/01: Prepare repository for deployment.

2017-07-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 0dabfd10139222f44076b1471a5953d71bf4a49b
Author: Mergebot 
AuthorDate: Fri Jul 21 04:04:09 2017 +

Prepare repository for deployment.
---
 content/contribute/work-in-progress/index.html | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/content/contribute/work-in-progress/index.html 
b/content/contribute/work-in-progress/index.html
index aa2bed6..be35f52 100644
--- a/content/contribute/work-in-progress/index.html
+++ b/content/contribute/work-in-progress/index.html
@@ -179,6 +179,12 @@
   https://lists.apache.org/thread.html/e38ac4e4914a6cb1b865b1f32a6ca06c2be28ea4aa0f6b18393de66f@%3Cdev.beam.apache.org%3E";>thread
 
 
+  JStorm Runner
+  https://github.com/apache/beam/tree/jstorm-runner";>jstorm-runner
+  https://issues.apache.org/jira/browse/BEAM/component/12332477";>runner-jstorm
+  https://issues.apache.org/jira/browse/BEAM-1899";>BEAM-1899
+
+
   Beam SQL DSL
   https://github.com/apache/beam/tree/DSL_SQL";>DSL_SQL
   https://issues.apache.org/jira/browse/BEAM/component/12332480";>dsl-sql

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 01/02: [BEAM-2126] Add JStorm runner to ongoing projects.

2017-07-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit fd675366270be75d95302b7fcbb892e6b5912a58
Author: Pei He 
AuthorDate: Thu Jul 20 10:54:40 2017 +0800

[BEAM-2126] Add JStorm runner to ongoing projects.
---
 src/contribute/work-in-progress.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/contribute/work-in-progress.md 
b/src/contribute/work-in-progress.md
index 979e458..301fa07 100644
--- a/src/contribute/work-in-progress.md
+++ b/src/contribute/work-in-progress.md
@@ -26,6 +26,7 @@ Current branches include:
 |  |  |  |  |
 | Apache Gearpump Runner | 
[gearpump-runner](https://github.com/apache/beam/tree/gearpump-runner) | 
[runner-gearpump](https://issues.apache.org/jira/browse/BEAM/component/12330829)
 | [runner homepage]({{ site.baseurl }}/documentation/runners/gearpump/) |
 | Apache Spark 2.0 Runner | 
[runners-spark2](https://github.com/apache/beam/tree/runners-spark2) | - | 
[thread](https://lists.apache.org/thread.html/e38ac4e4914a6cb1b865b1f32a6ca06c2be28ea4aa0f6b18393de66f@%3Cdev.beam.apache.org%3E)
 |
+| JStorm Runner | 
[jstorm-runner](https://github.com/apache/beam/tree/jstorm-runner) | 
[runner-jstorm](https://issues.apache.org/jira/browse/BEAM/component/12332477) 
| [BEAM-1899](https://issues.apache.org/jira/browse/BEAM-1899) |
 | Beam SQL DSL | [DSL_SQL](https://github.com/apache/beam/tree/DSL_SQL) | 
[dsl-sql](https://issues.apache.org/jira/browse/BEAM/component/12332480) | 
[BEAM-301](https://issues.apache.org/jira/browse/BEAM-301) |
 {:.table}
 

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] branch mergebot updated (4665b04 -> 2b560ce)

2017-07-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 4665b04  This closes #276
 add e21c3d4  Prepare repository for deployment.
 new fd67536  [BEAM-2126] Add JStorm runner to ongoing projects.
 new 2b560ce  This closes #277

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/io/built-in/index.html | 32 +---
 src/contribute/work-in-progress.md   |  1 +
 2 files changed, 20 insertions(+), 13 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 02/02: This closes #277

2017-07-20 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 2b560ce4c668bd87eeb2fb661b7a5ef2a1d6886d
Merge: e21c3d4 fd67536
Author: Mergebot 
AuthorDate: Fri Jul 21 04:01:03 2017 +

This closes #277

 src/contribute/work-in-progress.md | 1 +
 1 file changed, 1 insertion(+)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[jira] [Commented] (BEAM-2371) Make Java DirectRunner demonstrate language-agnostic Runner API translation wrappers

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095720#comment-16095720
 ] 

ASF GitHub Bot commented on BEAM-2371:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3610

[BEAM-2371] Use dehydration-insensitive APIs in ParDoEvaluatorFactory

R: @tgroh

Peeled off #3334 so it can be a trivial review. Ignore the base commit, 
which is #3601.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam ParDoEvaluatorFactory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3610


commit eef50852ba5d5185207a79149393bd5a41a084f0
Author: Kenneth Knowles 
Date:   2017-07-20T03:58:36Z

Use RehydratedComponents for memoized rehydration

commit 4dd75f750f0e846b7c1a918e013ee259cdfb0614
Author: Kenneth Knowles 
Date:   2017-06-07T21:35:09Z

Use dehydration-insensitive APIs in ParDoEvaluatorFactory




> Make Java DirectRunner demonstrate language-agnostic Runner API translation 
> wrappers
> 
>
> Key: BEAM-2371
> URL: https://issues.apache.org/jira/browse/BEAM-2371
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: beam-python-everywhere
> Fix For: 2.2.0
>
>
> This will complete the PoC for runners-core-construction-java and the Runner 
> API and show other runners the easy path to executing non-Java pipelines, 
> modulo Fn API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3610: [BEAM-2371] Use dehydration-insensitive APIs in Par...

2017-07-20 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3610

[BEAM-2371] Use dehydration-insensitive APIs in ParDoEvaluatorFactory

R: @tgroh

Peeled off #3334 so it can be a trivial review. Ignore the base commit, 
which is #3601.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam ParDoEvaluatorFactory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3610


commit eef50852ba5d5185207a79149393bd5a41a084f0
Author: Kenneth Knowles 
Date:   2017-07-20T03:58:36Z

Use RehydratedComponents for memoized rehydration

commit 4dd75f750f0e846b7c1a918e013ee259cdfb0614
Author: Kenneth Knowles 
Date:   2017-06-07T21:35:09Z

Use dehydration-insensitive APIs in ParDoEvaluatorFactory




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2371) Make Java DirectRunner demonstrate language-agnostic Runner API translation wrappers

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095716#comment-16095716
 ] 

ASF GitHub Bot commented on BEAM-2371:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3609

[BEAM-2371] Use dehydration-insensitive APIs in WindowEvaluatorFactory

R: @tgroh 

Peeled off #3334 so it can be a trivial review. Ignore the base commit, 
which is #3601.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam WindowEvaluatorFactory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3609.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3609


commit 6bf16792e779047243a29bd2af507958779a5ffc
Author: Kenneth Knowles 
Date:   2017-07-20T03:58:36Z

Use RehydratedComponents for memoized rehydration

commit d4923c9ddcc8dc836b7f9774cd23b882d7a5a341
Author: Kenneth Knowles 
Date:   2017-06-07T20:58:11Z

Use dehydration-insensitive APIs in WindowEvaluatorFactory




> Make Java DirectRunner demonstrate language-agnostic Runner API translation 
> wrappers
> 
>
> Key: BEAM-2371
> URL: https://issues.apache.org/jira/browse/BEAM-2371
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: beam-python-everywhere
> Fix For: 2.2.0
>
>
> This will complete the PoC for runners-core-construction-java and the Runner 
> API and show other runners the easy path to executing non-Java pipelines, 
> modulo Fn API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3609: [BEAM-2371] Use dehydration-insensitive APIs in Win...

2017-07-20 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3609

[BEAM-2371] Use dehydration-insensitive APIs in WindowEvaluatorFactory

R: @tgroh 

Peeled off #3334 so it can be a trivial review. Ignore the base commit, 
which is #3601.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam WindowEvaluatorFactory

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3609.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3609


commit 6bf16792e779047243a29bd2af507958779a5ffc
Author: Kenneth Knowles 
Date:   2017-07-20T03:58:36Z

Use RehydratedComponents for memoized rehydration

commit d4923c9ddcc8dc836b7f9774cd23b882d7a5a341
Author: Kenneth Knowles 
Date:   2017-06-07T20:58:11Z

Use dehydration-insensitive APIs in WindowEvaluatorFactory




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: beam_PerformanceTests_JDBC #174

2017-07-20 Thread Apache Jenkins Server
See 


--
[...truncated 38.13 KB...]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:15 min
[INFO] Finished at: 2017-07-21T03:23:15Z
[INFO] Final Memory: 91M/1352M
[INFO] 
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins5048856620029283810.sh
+ /home/jenkins/tools/maven/latest/bin/mvn -B -e verify -pl sdks/java/io/jdbc 
-Dio-it-suite 
-DpkbLocation=
 '-DintegrationTestPipelineOptions=[ "--project=apache-beam-testing", 
"--tempRoot=gs://temp-storage-for-end-to-end-tests" ]'
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Detecting the operating system and CPU architecture
[INFO] 
[INFO] os.detected.name: linux
[INFO] os.detected.arch: x86_64
[INFO] os.detected.version: 3.19
[INFO] os.detected.version.major: 3
[INFO] os.detected.version.minor: 19
[INFO] os.detected.release: ubuntu
[INFO] os.detected.release.version: 14.04
[INFO] os.detected.release.like.ubuntu: true
[INFO] os.detected.release.like.debian: true
[INFO] os.detected.classifier: linux-x86_64
[INFO] 
[INFO] 
[INFO] Building Apache Beam :: SDKs :: Java :: IO :: JDBC 2.2.0-SNAPSHOT
[INFO] 
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.6.0/exec-maven-plugin-1.6.0.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.6.0/exec-maven-plugin-1.6.0.pom
 (13 kB at 16 kB/s)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.6.0/exec-maven-plugin-1.6.0.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/mojo/exec-maven-plugin/1.6.0/exec-maven-plugin-1.6.0.jar
 (58 kB at 459 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/maven-metadata.xml
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/maven-metadata.xml
 (1.7 kB at 10.0 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/beam-runners-direct-java-2.2.0-20170720.141359-13.pom
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/beam-runners-direct-java-2.2.0-20170720.141359-13.pom
 (10 kB at 60 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/maven-metadata.xml
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/maven-metadata.xml
 (1.4 kB at 8.5 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12.pom
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12.pom
 (1.4 kB at 8.2 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/beam-runners-direct-java-2.2.0-20170720.141359-13.jar
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12.jar
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12-tests.jar
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12-tests.jar
 (17 kB at 76 kB/s)
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-io-common/2.2.0-SNAPSHOT/beam-sdks-java-io-common-2.2.0-20170719.141532-12.jar
 (2.7 MB at 5.4 MB/s)
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-runners-direct-java/2.2.0-SNAPSHOT/beam-runners-direct-java-2.2.0-20170720.141359-13.jar
 (11 MB at 13 MB/s)
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce) @ 
beam-sdks-java-io-jdbc ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-banned-dependencies) @ 
beam-sdks-java-io-jdbc ---
[INFO] 
[INF

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3623

2017-07-20 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3608: Uniquify application nodes in TextIOReadTest (HACK)

2017-07-20 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3608

Uniquify application nodes in TextIOReadTest (HACK)

These nodes have never had stable unique names in the original pipeline.

 - We have always mutated the pipeline before adding additional nodes so we 
got lucky. But as of #3334 we will no longer be mutating the original pipeline, 
so we need actual unique application names.
 - And, actually, current practice is that a runner may really mess with 
your pipeline, so (for example) you may have PCollections that no longer exist 
in the pipeline. Here, too, we got lucky.

So I think correct approaches include:

1. Making each call to `assertReadingCompressedFileMatchesExpected` use a 
separate pipeline by making them each a separate test method.
2. Move the call to `run()` into the test method, thus running it only 
once. To get stable unique names we could make the assertion stuff a composite 
so we could label the node, or just use an arbitrary uniquifier.

I had already implemented the hack of just incrementing a counter to get 
#3334 to work. So, realizing the second incorrectness here, I just took the 
step of also moving the call to `run()` into the test method. This avoids 
exploding the number of pipelines that are run.

R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam TextIOReadTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3608.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3608


commit cfadecb56c64aa155c8b5fd0d8a6654ceb918eba
Author: Kenneth Knowles 
Date:   2017-07-18T05:03:09Z

Uniquify application nodes in TextIOReadTest and only run pipeline once

 - These nodes have never had stable unique names, but we have always 
mutated
   the pipeline before running it so they got lucky.
 - A pipeline can also be mutated when run() so it should be considered dead
   after run() is called.

This fixes both issues by uniquifying the names and running a given pipeline
only once.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2646) Problem to join in slack channel

2017-07-20 Thread Kwang-in (Dennis) JUNG (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095669#comment-16095669
 ] 

Kwang-in (Dennis) JUNG commented on BEAM-2646:
--

[~j...@nanthrax.net] No, so I subscribe on mailing list 'dev-subscribe' about 
an hour ago. But it still shows the same problem. Do I have to subscribe on 
'user' mailing list?

Thanks.

> Problem to join in slack channel
> 
>
> Key: BEAM-2646
> URL: https://issues.apache.org/jira/browse/BEAM-2646
> Project: Beam
>  Issue Type: Wish
>  Components: project-management
>Reporter: Kwang-in (Dennis) JUNG
>Assignee: Davor Bonaci
>Priority: Trivial
>
> Hello.
> While following up the guide in main page, I faced on few problem so tried to 
> join in slack to ask. But it keeps sending failure mail after I request slack 
> invitation through mail.
> --
> Hi. This is the qmail-send program at apache.org.
> I'm afraid I wasn't able to deliver your message to the following addresses.
> This is a permanent error; I've given up. Sorry it didn't work out.
> :
> Must be sent from an @apache.org address or a subscriber address or an 
> address in LDAP.
> --- Below this line is a copy of the message.
> Return-Path: 
> Received: (qmail 28881 invoked by uid 99); 20 Jul 2017 09:45:11 -
> Received: from pnap-us-west-generic-nat.apache.org (HELO 
> spamd1-us-west.apache.org) (209.188.14.142)
> by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 20 Jul 2017 09:45:11 +
> Received: from localhost (localhost [127.0.0.1])
> by spamd1-us-west.apache.org (ASF Mail Server at 
> spamd1-us-west.apache.org) with ESMTP id DA3E0C33A9
> for ; Thu, 20 Jul 2017 09:45:10 + (UTC)
> X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org
> X-Spam-Flag: NO
> X-Spam-Score: 2.629
> X-Spam-Level: **
> ...
> --
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3622

2017-07-20 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #4422

2017-07-20 Thread Apache Jenkins Server
See 


--
[...truncated 1.07 MB...]
2017-07-21T01:35:30.961 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/eclipse/jdt/core/compiler/ecj/4.4.2/ecj-4.4.2.jar
2017-07-21T01:35:31.062 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/fusesource/sigar/1.6.4/sigar-1.6.4.jar 
(419 KB at 70.5 KB/sec)
2017-07-21T01:35:31.062 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/caffinitas/ohc/ohc-core/0.4.3/ohc-core-0.4.3.jar
2017-07-21T01:35:31.117 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/caffinitas/ohc/ohc-core/0.4.3/ohc-core-0.4.3.jar
 (125 KB at 20.7 KB/sec)
2017-07-21T01:35:31.117 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/com/github/ben-manes/caffeine/caffeine/2.2.6/caffeine-2.2.6.jar
2017-07-21T01:35:31.264 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/io/netty/netty-all/4.0.39.Final/netty-all-4.0.39.Final.jar
 (2219 KB at 361.4 KB/sec)
2017-07-21T01:35:31.265 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/com/datastax/cassandra/cassandra-driver-core/3.1.1/cassandra-driver-core-3.1.1.jar
2017-07-21T01:35:31.304 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/eclipse/jdt/core/compiler/ecj/4.4.2/ecj-4.4.2.jar
 (2257 KB at 365.2 KB/sec)
2017-07-21T01:35:31.332 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/github/ben-manes/caffeine/caffeine/2.2.6/caffeine-2.2.6.jar
 (926 KB at 149.1 KB/sec)
2017-07-21T01:35:31.427 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/datastax/cassandra/cassandra-driver-core/3.1.1/cassandra-driver-core-3.1.1.jar
 (1029 KB at 163.2 KB/sec)
2017-07-21T01:35:32.036 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/storm/storm-core/1.0.1/storm-core-1.0.1.jar
 (19650 KB at 2843.7 KB/sec)
2017-07-21T01:35:32.093 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/it/unimi/dsi/fastutil/6.5.7/fastutil-6.5.7.jar
 (16508 KB at 2369.4 KB/sec)
2017-07-21T01:35:32.135 [INFO] Downloading: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
2017-07-21T01:35:32.135 [INFO] Downloading: 
http://conjars.org/repo/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:32.135 [INFO] Downloading: 
http://conjars.org/repo/cascading/cascading-hadoop/2.6.3/cascading-hadoop-2.6.3.jar
2017-07-21T01:35:32.135 [INFO] Downloading: 
http://conjars.org/repo/cascading/cascading-core/2.6.3/cascading-core-2.6.3.jar
2017-07-21T01:35:32.136 [INFO] Downloading: 
http://conjars.org/repo/riffle/riffle/0.1-dev/riffle-0.1-dev.jar
2017-07-21T01:35:32.192 [INFO] Downloading: 
http://conjars.org/repo/thirdparty/jgrapht-jdk1.6/0.8.1/jgrapht-jdk1.6-0.8.1.jar
2017-07-21T01:35:32.249 [INFO] Downloaded: 
http://conjars.org/repo/riffle/riffle/0.1-dev/riffle-0.1-dev.jar (12 KB at 97.2 
KB/sec)
2017-07-21T01:35:32.249 [INFO] Downloading: 
http://conjars.org/repo/cascading/cascading-local/2.6.3/cascading-local-2.6.3.jar
2017-07-21T01:35:32.306 [INFO] Downloaded: 
http://conjars.org/repo/org/pentaho/pentaho-aggdesigner-algorithm/5.1.5-jhyde/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
 (48 KB at 275.7 KB/sec)
2017-07-21T01:35:32.419 [INFO] Downloaded: 
http://conjars.org/repo/cascading/cascading-local/2.6.3/cascading-local-2.6.3.jar
 (43 KB at 149.6 KB/sec)
2017-07-21T01:35:32.550 [INFO] Downloaded: 
http://conjars.org/repo/cascading/cascading-hadoop/2.6.3/cascading-hadoop-2.6.3.jar
 (246 KB at 592.5 KB/sec)
2017-07-21T01:35:32.550 [INFO] Downloaded: 
http://conjars.org/repo/thirdparty/jgrapht-jdk1.6/0.8.1/jgrapht-jdk1.6-0.8.1.jar
 (230 KB at 553.8 KB/sec)
2017-07-21T01:35:32.745 [INFO] Downloaded: 
http://conjars.org/repo/cascading/cascading-core/2.6.3/cascading-core-2.6.3.jar 
(680 KB at 1113.9 KB/sec)
2017-07-21T01:35:32.747 [INFO] Downloading: 
http://clojars.org/repo/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:32.816 [INFO] Downloading: 
https://repository.apache.org/content/repositories/releases/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:33.007 [INFO] Downloading: 
https://repository.jboss.org/nexus/content/repositories/releases/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:33.476 [INFO] Downloading: 
https://repo.eclipse.org/content/repositories/paho-releases/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:33.908 [INFO] Downloading: 
https://repository.cloudera.com/artifactory/cloudera-repos/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:34.085 [INFO] Downloading: 
https://oss.sonatype.org/content/repositories/orgspark-project-1113/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-07-21T01:35:34.227 [INFO] Downloading: 
http://repository.mapr.com/maven/org/scala-lang/scala-library/2.10.

Build failed in Jenkins: beam_PerformanceTests_JDBC #173

2017-07-20 Thread Apache Jenkins Server
See 


--
[...truncated 47.39 KB...]
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-banned-dependencies) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.8:prepare-agent (default) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] argLine set to 
-javaagent:/home/jenkins/.m2/repository/org/jacoco/org.jacoco.agent/0.7.8/org.jacoco.agent-0.7.8-runtime.jar=destfile=
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) 
@ beam-runners-google-cloud-dataflow-java ---
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:resources (default-resources) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.6.1:compile (default-compile) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 59 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[WARNING] 
:[116,31]
 An @AutoValue property that is a primitive array returns the original array, 
which can therefore be modified by the caller. If this OK, you can suppress 
this warning with @SuppressWarnings("mutable"). Otherwise, you should replace 
the property with an immutable type, perhaps a simple wrapper around the 
original array.
[INFO] 
:
 Some input files use or override a deprecated API.
[INFO] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
:
 Some input files use unchecked or unsafe operations.
[INFO] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:testResources (default-testResources) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.6.1:testCompile (default-testCompile) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 23 source files to 

[INFO] 
:
 Some input files use or override a deprecated API.
[INFO] 
:
 Recompile with -Xlint:deprecation for details.
[INFO] 
:
 Some input files use unchecked or unsafe operations.
[INFO] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.17:check (default) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-build-tools/2.2.0-SNAPSHOT/maven-metadata.xml
[INFO] Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-build-tools/2.2.0-SNAPSHOT/maven-metadata.xml
 (1.4 kB at 8.8 kB/s)
[INFO] Downloading: 
http://repository.apache.org/snapshots/

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4421

2017-07-20 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2675

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2650) migrate beam_it_args -> beam_it_options

2017-07-20 Thread Stephen Sisk (JIRA)
Stephen Sisk created BEAM-2650:
--

 Summary: migrate beam_it_args -> beam_it_options
 Key: BEAM-2650
 URL: https://issues.apache.org/jira/browse/BEAM-2650
 Project: Beam
  Issue Type: Task
  Components: testing
Reporter: Stephen Sisk
Assignee: Davor Bonaci


When adding the mvn -> pkb -> mvn integration for the IO IT's usage of PKB, I 
noticed that beam_it_args had two problems:
1) args had a format for passing options that was sub-optimal
2) the thing we are working with is options, not args, so it's mis-named.

It was important to solve #1, but you can't just change the name since it'd 
break with the currently checked in jenkins job. So I needed to migrate away 
from it, and #2 presented an easy opportunity to do so, so I added 
beam_it_options as the new option.

We should remove usages of beam_it_args and migrate over to only 
beam_it_options, then remove beam_it_args from pkb



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3453

2017-07-20 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_JDBC #172

2017-07-20 Thread Apache Jenkins Server
See 


Changes:

[kirpichov] Minor changes to AvroSource in preparation for refactoring

[kirpichov] Gets rid of opening Avro files in createForSubrangeOfFile codepath

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam5 (beam) in workspace 

Cloning the remote Git repository
Cloning repository https://github.com/apache/beam.git
 > git init  # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 1d9160fa3337704b9fbdc423796925be78b0087e (origin/master)
Commit message: "This closes #3590: [BEAM-2628] Makes AvroSource not open files 
while splitting"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 1d9160fa3337704b9fbdc423796925be78b0087e
 > git rev-list 7e63d2cf6c0ea0da4ce68b72536522f68b4a5272 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins8529676476335600131.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins4154945542406315145.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins6870452514343125939.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins4969035044125789438.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://

Build failed in Jenkins: beam_PerformanceTests_Python #124

2017-07-20 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Add maven support for invoking perfkit benchmarker to run IO ITs

[tgroh] io-it-suite-local independent of io-it-suite, k8s properties -> root pom

[kirpichov] Minor changes to AvroSource in preparation for refactoring

[kirpichov] Gets rid of opening Avro files in createForSubrangeOfFile codepath

--
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on beam7 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/master^{commit} # timeout=10
Checking out Revision 1d9160fa3337704b9fbdc423796925be78b0087e (origin/master)
Commit message: "This closes #3590: [BEAM-2628] Makes AvroSource not open files 
while splitting"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 1d9160fa3337704b9fbdc423796925be78b0087e
 > git rev-list afeba3715c806b53115f8f7994eb7bc207c68932 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6185997070931709586.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6964434050081114127.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6689424679235806941.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins894325712106013578.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://

:66:
 UserWarning: You are using version 1.5.4 of pip. However, version 7.0.0 is 
recommen

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2674

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2512) TextIO should support watching for new files

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095601#comment-16095601
 ] 

ASF GitHub Bot commented on BEAM-2512:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3607

[BEAM-2512] Introduces TextIO.read/readAll().watchForNewFiles()

https://issues.apache.org/jira/browse/BEAM-2512

Part of http://s.apache.org/textio-sdf, based on 
http://s.apache.org/beam-watch-transform.

This PR includes https://github.com/apache/beam/pull/3565 - reviewer should 
look only at the other commit. Also, requires 
https://github.com/apache/beam/pull/3598 to properly support 
read().from(ValueProvider).watchForNewFiles() - this PR should be submitted 
only after both of the PRs above.

R: @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam 
textio-read-watch-new-files

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3607


commit 3103f9438a9fc392dfa1c37ceac990fc43c2ab98
Author: Eugene Kirpichov 
Date:   2017-07-20T02:50:03Z

[BEAM-2623] Introduces Watch transform

The transform watches for new elements in a family of growing sets.
See design at http://s.apache.org/beam-watch-transform

As part of the implementation, I found and fixed a bug in tracking the
watermark in OutputAndTimeBoundedSplittableProcessElementInvoker.
The watermark must be captured at the moment checkpoint is taken,
because it describes timestamps of elements output from the checkpoint.

I also made direct runner by default checkpoint SDF's every 100 elements
rather than every 1, to make it more aggressive - that's what
uncovered the bug above.

commit c977d606f14a59ed73acf22f32a6b250d89c0ccd
Author: Eugene Kirpichov 
Date:   2017-07-20T23:58:42Z

[BEAM-2512] Introduces TextIO.read/readAll().watchForNewFiles()

Part of http://s.apache.org/textio-sdf, based on
http://s.apache.org/beam-watch-transform.




> TextIO should support watching for new files
> 
>
> Key: BEAM-2512
> URL: https://issues.apache.org/jira/browse/BEAM-2512
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>
> Motivation and proposed implementation in https://s.apache.org/textio-sdf



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3607: [BEAM-2512] Introduces TextIO.read/readAll().watchF...

2017-07-20 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3607

[BEAM-2512] Introduces TextIO.read/readAll().watchForNewFiles()

https://issues.apache.org/jira/browse/BEAM-2512

Part of http://s.apache.org/textio-sdf, based on 
http://s.apache.org/beam-watch-transform.

This PR includes https://github.com/apache/beam/pull/3565 - reviewer should 
look only at the other commit. Also, requires 
https://github.com/apache/beam/pull/3598 to properly support 
read().from(ValueProvider).watchForNewFiles() - this PR should be submitted 
only after both of the PRs above.

R: @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam 
textio-read-watch-new-files

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3607.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3607


commit 3103f9438a9fc392dfa1c37ceac990fc43c2ab98
Author: Eugene Kirpichov 
Date:   2017-07-20T02:50:03Z

[BEAM-2623] Introduces Watch transform

The transform watches for new elements in a family of growing sets.
See design at http://s.apache.org/beam-watch-transform

As part of the implementation, I found and fixed a bug in tracking the
watermark in OutputAndTimeBoundedSplittableProcessElementInvoker.
The watermark must be captured at the moment checkpoint is taken,
because it describes timestamps of elements output from the checkpoint.

I also made direct runner by default checkpoint SDF's every 100 elements
rather than every 1, to make it more aggressive - that's what
uncovered the bug above.

commit c977d606f14a59ed73acf22f32a6b250d89c0ccd
Author: Eugene Kirpichov 
Date:   2017-07-20T23:58:42Z

[BEAM-2512] Introduces TextIO.read/readAll().watchForNewFiles()

Part of http://s.apache.org/textio-sdf, based on
http://s.apache.org/beam-watch-transform.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3452

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-2628) AvroSource.split() sequentially opens every matched file

2017-07-20 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-2628.
--
   Resolution: Fixed
Fix Version/s: 2.2.0

> AvroSource.split() sequentially opens every matched file
> 
>
> Key: BEAM-2628
> URL: https://issues.apache.org/jira/browse/BEAM-2628
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
> Fix For: 2.2.0
>
>
> When you do AvroIO.read().from(filepattern), during splitting of AvroSource 
> the filepattern gets expanded into N files, and then for each of the N files 
> we do this: 
> https://github.com/apache/beam/blob/v2.0.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java#L259
> This is very slow. E.g. one job was reading 15,000 files, and it took almost 
> 2 hours to split the source because opening each file and reading schema was 
> taking about 0.5s.
> I'm not quite sure why we need the file metadata while splitting...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2628) AvroSource.split() sequentially opens every matched file

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095589#comment-16095589
 ] 

ASF GitHub Bot commented on BEAM-2628:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3590


> AvroSource.split() sequentially opens every matched file
> 
>
> Key: BEAM-2628
> URL: https://issues.apache.org/jira/browse/BEAM-2628
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
> Fix For: 2.2.0
>
>
> When you do AvroIO.read().from(filepattern), during splitting of AvroSource 
> the filepattern gets expanded into N files, and then for each of the N files 
> we do this: 
> https://github.com/apache/beam/blob/v2.0.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java#L259
> This is very slow. E.g. one job was reading 15,000 files, and it took almost 
> 2 hours to split the source because opening each file and reading schema was 
> taking about 0.5s.
> I'm not quite sure why we need the file metadata while splitting...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/3] beam git commit: Minor changes to AvroSource in preparation for refactoring

2017-07-20 Thread jkff
Minor changes to AvroSource in preparation for refactoring


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c52a908c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c52a908c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c52a908c

Branch: refs/heads/master
Commit: c52a908cba7765e120a94909ab02c548d1a124ad
Parents: 7e63d2c
Author: Eugene Kirpichov 
Authored: Tue Jul 18 13:40:52 2017 -0700
Committer: Eugene Kirpichov 
Committed: Thu Jul 20 16:59:11 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroSource.java | 171 ---
 .../org/apache/beam/sdk/io/AvroSourceTest.java  |  26 +--
 2 files changed, 70 insertions(+), 127 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c52a908c/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
index 575218b..0634774 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
@@ -37,6 +37,7 @@ import java.util.Map;
 import java.util.WeakHashMap;
 import java.util.zip.Inflater;
 import java.util.zip.InflaterInputStream;
+import javax.annotation.Nullable;
 import javax.annotation.concurrent.GuardedBy;
 import org.apache.avro.Schema;
 import org.apache.avro.file.CodecFactory;
@@ -127,15 +128,16 @@ public class AvroSource extends BlockBasedSource {
   // The default sync interval is 64k.
   private static final long DEFAULT_MIN_BUNDLE_SIZE = 2 * 
DataFileConstants.DEFAULT_SYNC_INTERVAL;
 
-  // The JSON schema used to encode records.
-  private final String readSchemaString;
+  // The type of the records contained in the file.
+  private final Class type;
+
+  // The JSON schema used to decode records.
+  @Nullable
+  private final String readerSchemaString;
 
   // The JSON schema that was used to write the source Avro file (may differ 
from the schema we will
   // use to read from it).
-  private final String fileSchemaString;
-
-  // The type of the records contained in the file.
-  private final Class type;
+  private final String writerSchemaString;
 
   // The following metadata fields are not user-configurable. They are 
extracted from the object
   // container file header upon subsource creation.
@@ -147,87 +149,75 @@ public class AvroSource extends BlockBasedSource {
   // The object container file's 16-byte sync marker.
   private final byte[] syncMarker;
 
-  // Default output coder, lazily initialized.
-  private transient AvroCoder coder = null;
-
-  // Schema of the file, lazily initialized.
-  private transient Schema fileSchema;
-
-  // Schema used to encode records, lazily initialized.
-  private transient Schema readSchema;
-
   /**
-   * Creates an {@link AvroSource} that reads from the given file name or 
pattern ("glob"). The
-   * returned source can be further configured by calling {@link #withSchema} 
to return a type other
-   * than {@link GenericRecord}.
+   * Reads from the given file name or pattern ("glob"). The returned source 
can be further
+   * configured by calling {@link #withSchema} to return a type other than 
{@link GenericRecord}.
*/
   public static AvroSource from(String fileNameOrPattern) {
 return new AvroSource<>(
 fileNameOrPattern, DEFAULT_MIN_BUNDLE_SIZE, null, GenericRecord.class, 
null, null);
   }
 
-  /**
-   * Returns an {@link AvroSource} that's like this one but reads files 
containing records that
-   * conform to the given schema.
-   *
-   * Does not modify this object.
-   */
+  /** Reads files containing records that conform to the given schema. */
   public AvroSource withSchema(String schema) {
 return new AvroSource<>(
 getFileOrPatternSpec(), getMinBundleSize(), schema, 
GenericRecord.class, codec, syncMarker);
   }
 
-  /**
-   * Returns an {@link AvroSource} that's like this one but reads files 
containing records that
-   * conform to the given schema.
-   *
-   * Does not modify this object.
-   */
+  /** Like {@link #withSchema(String)}. */
   public AvroSource withSchema(Schema schema) {
 return new AvroSource<>(getFileOrPatternSpec(), getMinBundleSize(), 
schema.toString(),
 GenericRecord.class, codec, syncMarker);
   }
 
-  /**
-   * Returns an {@link AvroSource} that's like this one but reads files 
containing records of the
-   * type of the given class.
-   *
-   * Does not modify this object.
-   */
+  /** Reads files containing records of the given class. */
   public  AvroSource withSchema(Class clazz) {
-return new AvroSource(getFileOrPatternSpec(), getMinBundleSize(),
+retur

[1/3] beam git commit: Gets rid of opening Avro files in createForSubrangeOfFile codepath

2017-07-20 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 7e63d2cf6 -> 1d9160fa3


Gets rid of opening Avro files in createForSubrangeOfFile codepath


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d4026da1
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d4026da1
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d4026da1

Branch: refs/heads/master
Commit: d4026da1ad1fa0864052b85a66c4af5975327e9f
Parents: c52a908
Author: Eugene Kirpichov 
Authored: Tue Jul 18 14:09:03 2017 -0700
Committer: Eugene Kirpichov 
Committed: Thu Jul 20 16:59:11 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroSource.java | 176 +++
 .../org/apache/beam/sdk/io/AvroSourceTest.java  |  11 +-
 2 files changed, 63 insertions(+), 124 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d4026da1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
index 0634774..30af344 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSource.java
@@ -21,6 +21,7 @@ import static 
com.google.common.base.Preconditions.checkNotNull;
 import static com.google.common.base.Preconditions.checkState;
 
 import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.MoreObjects;
 import java.io.ByteArrayInputStream;
 import java.io.EOFException;
 import java.io.IOException;
@@ -135,45 +136,33 @@ public class AvroSource extends BlockBasedSource {
   @Nullable
   private final String readerSchemaString;
 
-  // The JSON schema that was used to write the source Avro file (may differ 
from the schema we will
-  // use to read from it).
-  private final String writerSchemaString;
-
-  // The following metadata fields are not user-configurable. They are 
extracted from the object
-  // container file header upon subsource creation.
-
-  // The codec used to encode the blocks in the Avro file. String value drawn 
from those in
-  // 
https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/file/CodecFactory.html
-  private final String codec;
-
-  // The object container file's 16-byte sync marker.
-  private final byte[] syncMarker;
-
   /**
* Reads from the given file name or pattern ("glob"). The returned source 
can be further
* configured by calling {@link #withSchema} to return a type other than 
{@link GenericRecord}.
*/
   public static AvroSource from(String fileNameOrPattern) {
-return new AvroSource<>(
-fileNameOrPattern, DEFAULT_MIN_BUNDLE_SIZE, null, GenericRecord.class, 
null, null);
+return new AvroSource<>(fileNameOrPattern, DEFAULT_MIN_BUNDLE_SIZE, null, 
GenericRecord.class);
   }
 
   /** Reads files containing records that conform to the given schema. */
   public AvroSource withSchema(String schema) {
 return new AvroSource<>(
-getFileOrPatternSpec(), getMinBundleSize(), schema, 
GenericRecord.class, codec, syncMarker);
+getFileOrPatternSpec(), getMinBundleSize(), schema, 
GenericRecord.class);
   }
 
   /** Like {@link #withSchema(String)}. */
   public AvroSource withSchema(Schema schema) {
-return new AvroSource<>(getFileOrPatternSpec(), getMinBundleSize(), 
schema.toString(),
-GenericRecord.class, codec, syncMarker);
+return new AvroSource<>(
+getFileOrPatternSpec(), getMinBundleSize(), schema.toString(), 
GenericRecord.class);
   }
 
   /** Reads files containing records of the given class. */
   public  AvroSource withSchema(Class clazz) {
-return new AvroSource<>(getFileOrPatternSpec(), getMinBundleSize(),
-ReflectData.get().getSchema(clazz).toString(), clazz, codec, 
syncMarker);
+return new AvroSource<>(
+getFileOrPatternSpec(),
+getMinBundleSize(),
+ReflectData.get().getSchema(clazz).toString(),
+clazz);
   }
 
   /**
@@ -181,24 +170,15 @@ public class AvroSource extends BlockBasedSource {
* minBundleSize} and its use.
*/
   public AvroSource withMinBundleSize(long minBundleSize) {
-return new AvroSource<>(
-getFileOrPatternSpec(), minBundleSize, readerSchemaString, type, 
codec, syncMarker);
+return new AvroSource<>(getFileOrPatternSpec(), minBundleSize, 
readerSchemaString, type);
   }
 
   /** Constructor for FILEPATTERN mode. */
   private AvroSource(
-  String fileNameOrPattern,
-  long minBundleSize,
-  String readerSchemaString,
-  Class type,
-  String codec,
-  byte[] syncMarker) {
+  String fileNameOrPattern, long minBundleSize, String readerSchemaString, 
Class typ

[GitHub] beam pull request #3590: [BEAM-2628] Makes AvroSource not open files while s...

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3590


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] beam git commit: This closes #3590: [BEAM-2628] Makes AvroSource not open files while splitting

2017-07-20 Thread jkff
This closes #3590: [BEAM-2628] Makes AvroSource not open files while splitting


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1d9160fa
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1d9160fa
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1d9160fa

Branch: refs/heads/master
Commit: 1d9160fa3337704b9fbdc423796925be78b0087e
Parents: 7e63d2c d4026da
Author: Eugene Kirpichov 
Authored: Thu Jul 20 16:59:27 2017 -0700
Committer: Eugene Kirpichov 
Committed: Thu Jul 20 16:59:27 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroSource.java | 265 ++-
 .../org/apache/beam/sdk/io/AvroSourceTest.java  |  37 +--
 2 files changed, 92 insertions(+), 210 deletions(-)
--




[jira] [Commented] (BEAM-2555) add README page in dsl/sql

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095535#comment-16095535
 ] 

ASF GitHub Bot commented on BEAM-2555:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3605


> add README page in dsl/sql
> --
>
> Key: BEAM-2555
> URL: https://issues.apache.org/jira/browse/BEAM-2555
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>  Labels: dsl_sql_merge
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3605: [BEAM-2555] add README page in dsl/sql

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3605


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: remove README.md and update usages in BeamSqlExample

2017-07-20 Thread takidau
Repository: beam
Updated Branches:
  refs/heads/DSL_SQL ada24c059 -> 0eb8e89ba


remove README.md and update usages in BeamSqlExample


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f1aa390d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f1aa390d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f1aa390d

Branch: refs/heads/DSL_SQL
Commit: f1aa390d1989543f2848dae6b26596ffd1a5d8db
Parents: ada24c0
Author: mingmxu 
Authored: Thu Jul 20 14:32:42 2017 -0700
Committer: mingmxu 
Committed: Thu Jul 20 14:32:42 2017 -0700

--
 dsls/sql/README.md  | 24 
 .../beam/dsls/sql/example/BeamSqlExample.java   | 23 +++
 2 files changed, 13 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f1aa390d/dsls/sql/README.md
--
diff --git a/dsls/sql/README.md b/dsls/sql/README.md
deleted file mode 100644
index ae9e0f3..000
--- a/dsls/sql/README.md
+++ /dev/null
@@ -1,24 +0,0 @@
-
-
-# Beam SQL
-
-Beam SQL provides a new interface, to execute a SQL query as a Beam pipeline.
-
-*It's working in progress...*

http://git-wip-us.apache.org/repos/asf/beam/blob/f1aa390d/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java
--
diff --git 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java
index 91df2be..4e364e1 100644
--- 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java
+++ 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/example/BeamSqlExample.java
@@ -34,16 +34,19 @@ import org.apache.beam.sdk.values.PBegin;
 import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.PCollectionTuple;
 import org.apache.beam.sdk.values.TupleTag;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 /**
  * This is a quick example, which uses Beam SQL DSL to create a data pipeline.
  *
+ * Run the example with
+ * 
+ * mvn -pl dsls/sql compile exec:java \
+ *  -Dexec.mainClass=org.apache.beam.dsls.sql.example.BeamSqlExample \
+ *   -Dexec.args="--runner=DirectRunner" -Pdirect-runner
+ * 
+ *
  */
 class BeamSqlExample {
-  private static final Logger LOG = 
LoggerFactory.getLogger(BeamSqlExample.class);
-
   public static void main(String[] args) throws Exception {
 PipelineOptions options = 
PipelineOptionsFactory.fromArgs(args).as(PipelineOptions.class);
 Pipeline p = Pipeline.create(options);
@@ -63,9 +66,9 @@ class BeamSqlExample {
 
 //Case 1. run a simple SQL query over input PCollection with 
BeamSql.simpleQuery;
 PCollection outputStream = inputTable.apply(
-BeamSql.simpleQuery("select c2, c3 from PCOLLECTION where c1=1"));
+BeamSql.simpleQuery("select c1, c2, c3 from PCOLLECTION where c1=1"));
 
-//log out the output record;
+//print the output record of case 1;
 outputStream.apply("log_result",
 MapElements.via(new SimpleFunction() {
   public Void apply(BeamSqlRow input) {
@@ -74,12 +77,12 @@ class BeamSqlExample {
   }
 }));
 
-//Case 2. run the query with BeamSql.query
+//Case 2. run the query with BeamSql.query over result PCollection of case 
1.
 PCollection outputStream2 =
-PCollectionTuple.of(new TupleTag("TABLE_B"), inputTable)
-.apply(BeamSql.query("select c2, c3 from TABLE_B where c1=1"));
+PCollectionTuple.of(new TupleTag("CASE1_RESULT"), 
outputStream)
+.apply(BeamSql.query("select c2, c3 from CASE1_RESULT where c1=1"));
 
-//log out the output record;
+//print the output record of case 2;
 outputStream2.apply("log_result",
 MapElements.via(new SimpleFunction() {
   @Override



[2/2] beam git commit: [BEAM-2555] This closes #3605

2017-07-20 Thread takidau
[BEAM-2555] This closes #3605


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0eb8e89b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0eb8e89b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0eb8e89b

Branch: refs/heads/DSL_SQL
Commit: 0eb8e89babbe758c15457bc10d44c815ebaf7709
Parents: ada24c0 f1aa390
Author: Tyler Akidau 
Authored: Thu Jul 20 16:14:51 2017 -0700
Committer: Tyler Akidau 
Committed: Thu Jul 20 16:14:51 2017 -0700

--
 dsls/sql/README.md  | 24 
 .../beam/dsls/sql/example/BeamSqlExample.java   | 23 +++
 2 files changed, 13 insertions(+), 34 deletions(-)
--




Build failed in Jenkins: beam_PerformanceTests_JDBC #171

2017-07-20 Thread Apache Jenkins Server
See 


--
[...truncated 13.30 KB...]
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-resources-plugin/3.0.2/maven-resources-plugin-3.0.2.pom
 (7 KB at 232.2 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-resources-plugin/3.0.2/maven-resources-plugin-3.0.2.jar
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-resources-plugin/3.0.2/maven-resources-plugin-3.0.2.jar
 (31 KB at 938.3 KB/sec)
Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-core/2.2.0-SNAPSHOT/maven-metadata.xml
Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-core/2.2.0-SNAPSHOT/maven-metadata.xml
 (2 KB at 9.5 KB/sec)
Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-core/2.2.0-SNAPSHOT/beam-sdks-java-core-2.2.0-20170720.140744-13.pom
Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-core/2.2.0-SNAPSHOT/beam-sdks-java-core-2.2.0-20170720.140744-13.pom
 (11 KB at 67.9 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-core/2.8.9/jackson-core-2.8.9.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-core/2.8.9/jackson-core-2.8.9.pom
 (6 KB at 182.3 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.8.9/jackson-annotations-2.8.9.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-annotations/2.8.9/jackson-annotations-2.8.9.pom
 (2 KB at 62.2 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.8.9/jackson-databind-2.8.9.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/fasterxml/jackson/core/jackson-databind/2.8.9/jackson-databind-2.8.9.pom
 (6 KB at 189.2 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro/1.8.2/avro-1.8.2.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro/1.8.2/avro-1.8.2.pom 
(7 KB at 247.2 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro-parent/1.8.2/avro-parent-1.8.2.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro-parent/1.8.2/avro-parent-1.8.2.pom
 (21 KB at 653.4 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro-toplevel/1.8.2/avro-toplevel-1.8.2.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/avro/avro-toplevel/1.8.2/avro-toplevel-1.8.2.pom
 (16 KB at 527.0 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/org/apache/commons/commons-compress/1.14/commons-compress-1.14.pom
Downloaded: 
http://repo.maven.apache.org/maven2/org/apache/commons/commons-compress/1.14/commons-compress-1.14.pom
 (13 KB at 442.6 KB/sec)
Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-extensions-google-cloud-platform-core/2.2.0-SNAPSHOT/maven-metadata.xml
Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-extensions-google-cloud-platform-core/2.2.0-SNAPSHOT/maven-metadata.xml
 (2 KB at 8.9 KB/sec)
Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-extensions-google-cloud-platform-core/2.2.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.2.0-20170719.141753-12.pom
Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-java-extensions-google-cloud-platform-core/2.2.0-SNAPSHOT/beam-sdks-java-extensions-google-cloud-platform-core-2.2.0-20170719.141753-12.pom
 (8 KB at 30.7 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-oauth2-http/0.7.1/google-auth-library-oauth2-http-0.7.1.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-oauth2-http/0.7.1/google-auth-library-oauth2-http-0.7.1.pom
 (3 KB at 73.6 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-parent/0.7.1/google-auth-library-parent-0.7.1.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-parent/0.7.1/google-auth-library-parent-0.7.1.pom
 (8 KB at 221.0 KB/sec)
Downloading: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-credentials/0.7.1/google-auth-library-credentials-0.7.1.pom
Downloaded: 
http://repo.maven.apache.org/maven2/com/google/auth/google-auth-library-credentials/0.7.1/google-auth-library-credentials-0.7.1.pom
 (2 KB at 55.3 KB/sec)
Downloading: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-common-runner-api/2.2.0-SNAPSHOT/maven-metadata.xml
Downloaded: 
http://repository.apache.org/snapshots/org/apache/beam/beam-sdks-common-runner-api/2.2.0-SNAPSHOT/maven-metadata.xml
 (2 KB at 9.0 KB/sec)
Downloading: 
http:

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3621

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2621) rename BeamSqlRecordType to BeamSqlRowType

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095442#comment-16095442
 ] 

ASF GitHub Bot commented on BEAM-2621:
--

GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3606

Merge DSL_SQL to Master

Merge DSL_SQL feature branch back to master branch, with Beam SQL DSL 
function is ready to use.

Resources:
1. thread in apache-beam-dev mail-list 
https://www.mail-archive.com/dev@beam.apache.org/msg02064.html 
2. JIRA tasks 
https://issues.apache.org/jira/browse/BEAM-2621?jql=labels%20%3D%20dsl_sql_merge
 
3. Task burndown doc 
https://docs.google.com/document/d/1EHZgSu4Jd75iplYpYT_K_JwSZxL2DWG8kv_EmQzNXFc/edit?usp=sharing



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/beam DSL_SQL

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3606


commit bd99528af89450b44d94abc42a8b884e00cbc26e
Author: Tyler Akidau 
Date:   2017-06-27T00:36:58Z

[BEAM-2503] This closes #3427

commit 32fbc9cec1d0d86e04e3f453b0d75f2ff0e61b56
Author: Ismaël Mejía 
Date:   2017-06-26T14:37:51Z

Small fixes to make the example run in a runner agnostic way:
- Add direct runner default profile
- Add findbugs validation and fix existing findbugs issues
- Validate division by zero on arithmetic expression + other minor fixes
- Update Calcite version to 1.13

commit ab4b118869070e94a4205744d6d60525d3fa2882
Author: Tyler Akidau 
Date:   2017-06-28T01:25:56Z

This closes #3439

commit 928cec597175c363d444331b35ac8793297a242b
Author: James Xu 
Date:   2017-05-29T03:11:34Z

[BEAM-2193] Implement FULL, INNER, and OUTER JOIN:
- FULL and INNER supported on all variations of unbounded/bounded joins.
- OUTER JOIN supported when outer side is unbounded.
- Unbounded/bounded joins implemented via side inputs.

commit 2096da25e85d97ab52850453c2130ff706d7bcdf
Author: Tyler Akidau 
Date:   2017-06-29T23:34:45Z

[BEAM-2193] This closes #3277

commit a13fce98f61867fcb5adb52c80f1cfd3eecfc436
Author: mingmxu 
Date:   2017-06-26T23:03:51Z

UDAF support:
- Adds an abstract class BeamSqlUdaf for defining Calcite SQL UDAFs.
- Updates built-in COUNT/SUM/AVG/MAX/MIN accumulators to use this new class.

commit 7ba77dd435eefd697fd0a452b07e0154dd1cd1b2
Author: Tyler Akidau 
Date:   2017-06-30T22:40:05Z

[BEAM-2287] This closes #3447

commit 21497194db3ddce37a4747b3de2714b02684c57e
Author: James Xu 
Date:   2017-06-27T02:42:40Z

BeamSql: refactor the MockedBeamSqlTable and related tests

commit bc66698e6880c7788bcea78006c67bfca66b17ce
Author: James Xu 
Date:   2017-06-30T06:54:26Z

MockedBeamSqlTable -> MockedBoundedTable

commit ca2bc723dc00f0a5bf3e6157f8cd79ef4297093b
Author: Luke Cwik 
Date:   2017-07-05T16:34:17Z

[BEAM-2515] BeamSql: refactor the MockedBeamSqlTable and related tests

This closes #3478

commit 794f1901dcf5fb520a849adce7ce436e8b2f8535
Author: mingmxu 
Date:   2017-07-10T05:26:29Z

Test unsupported/invalid cases in DSL tests.

commit b8fa0addc86d2249cd2fbbce3d8ad98d27e04604
Author: Tyler Akidau 
Date:   2017-07-12T04:05:15Z

[BEAM-2574] This closes #3530

commit e9dc5ea81cbbde39bf11ee183e5403b869d21f50
Author: James Xu 
Date:   2017-07-06T03:29:41Z

[BEAM-2550] add UnitTest for JOIN in DSL

commit aa265e62a6c807ee941b1fc9379411f736754bf5
Author: Tyler Akidau 
Date:   2017-07-12T06:47:49Z

[BEAM-2550] This closes #3506

commit eb589fb943a45388e4118bb221e52009eb92d1c1
Author: mingmxu 
Date:   2017-07-09T07:52:23Z

support TUMBLE/HOP/SESSION _START function

commit a96c3a01f18826b4e43ad7e8f26e94c02903acd9
Author: Tyler Akidau 
Date:   2017-07-12T16:49:59Z

[BEAM-2204] This closes #3527

commit 0580e8b639ef77c7a6534b7b91ecad493950a3aa
Author: mingmxu 
Date:   2017-07-12T07:08:35Z

Test queries on unbounded PCollections with BeamSql DSL API.
Also add getTYPE(fieldName) overrides to BeamSqlRow.

commit 8defe6f21524b4ca00dd984176260c1bd0739774
Author: Tyler Akidau 
Date:   2017-07-12T17:03:28Z

[BEAM-2527] This closes #3477

commit 3a170744861dec12a33a19358f9ce89dbd64491c
Author: James Xu 
Date:   2017-07-10T11:58:21Z

[BEAM-2564] add integration test for string functions

commit 53d27e6c45ab7bbd6877c6674868e8a2e3f9a971
Author: Tyler Akidau 
Date:   2017-07-12T17:07:39Z

[BEAM-2564] This closes #3532

commit 39eedd566c3a1825ec3e9be0aa4490cfd824cb30
Author: tarushapptech 
Date:   2017-06-17T07:50:02Z

CAST operator supporting numeric, date and timestamp types

commit 125cef14453d625f1758aef5b129182cb13d829c
Author: Tyler Akidau 
Date:   2017-07-12T17:3

[GitHub] beam pull request #3606: Merge DSL_SQL to Master

2017-07-20 Thread XuMingmin
GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3606

Merge DSL_SQL to Master

Merge DSL_SQL feature branch back to master branch, with Beam SQL DSL 
function is ready to use.

Resources:
1. thread in apache-beam-dev mail-list 
https://www.mail-archive.com/dev@beam.apache.org/msg02064.html 
2. JIRA tasks 
https://issues.apache.org/jira/browse/BEAM-2621?jql=labels%20%3D%20dsl_sql_merge
 
3. Task burndown doc 
https://docs.google.com/document/d/1EHZgSu4Jd75iplYpYT_K_JwSZxL2DWG8kv_EmQzNXFc/edit?usp=sharing



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/beam DSL_SQL

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3606.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3606


commit bd99528af89450b44d94abc42a8b884e00cbc26e
Author: Tyler Akidau 
Date:   2017-06-27T00:36:58Z

[BEAM-2503] This closes #3427

commit 32fbc9cec1d0d86e04e3f453b0d75f2ff0e61b56
Author: Ismaël Mejía 
Date:   2017-06-26T14:37:51Z

Small fixes to make the example run in a runner agnostic way:
- Add direct runner default profile
- Add findbugs validation and fix existing findbugs issues
- Validate division by zero on arithmetic expression + other minor fixes
- Update Calcite version to 1.13

commit ab4b118869070e94a4205744d6d60525d3fa2882
Author: Tyler Akidau 
Date:   2017-06-28T01:25:56Z

This closes #3439

commit 928cec597175c363d444331b35ac8793297a242b
Author: James Xu 
Date:   2017-05-29T03:11:34Z

[BEAM-2193] Implement FULL, INNER, and OUTER JOIN:
- FULL and INNER supported on all variations of unbounded/bounded joins.
- OUTER JOIN supported when outer side is unbounded.
- Unbounded/bounded joins implemented via side inputs.

commit 2096da25e85d97ab52850453c2130ff706d7bcdf
Author: Tyler Akidau 
Date:   2017-06-29T23:34:45Z

[BEAM-2193] This closes #3277

commit a13fce98f61867fcb5adb52c80f1cfd3eecfc436
Author: mingmxu 
Date:   2017-06-26T23:03:51Z

UDAF support:
- Adds an abstract class BeamSqlUdaf for defining Calcite SQL UDAFs.
- Updates built-in COUNT/SUM/AVG/MAX/MIN accumulators to use this new class.

commit 7ba77dd435eefd697fd0a452b07e0154dd1cd1b2
Author: Tyler Akidau 
Date:   2017-06-30T22:40:05Z

[BEAM-2287] This closes #3447

commit 21497194db3ddce37a4747b3de2714b02684c57e
Author: James Xu 
Date:   2017-06-27T02:42:40Z

BeamSql: refactor the MockedBeamSqlTable and related tests

commit bc66698e6880c7788bcea78006c67bfca66b17ce
Author: James Xu 
Date:   2017-06-30T06:54:26Z

MockedBeamSqlTable -> MockedBoundedTable

commit ca2bc723dc00f0a5bf3e6157f8cd79ef4297093b
Author: Luke Cwik 
Date:   2017-07-05T16:34:17Z

[BEAM-2515] BeamSql: refactor the MockedBeamSqlTable and related tests

This closes #3478

commit 794f1901dcf5fb520a849adce7ce436e8b2f8535
Author: mingmxu 
Date:   2017-07-10T05:26:29Z

Test unsupported/invalid cases in DSL tests.

commit b8fa0addc86d2249cd2fbbce3d8ad98d27e04604
Author: Tyler Akidau 
Date:   2017-07-12T04:05:15Z

[BEAM-2574] This closes #3530

commit e9dc5ea81cbbde39bf11ee183e5403b869d21f50
Author: James Xu 
Date:   2017-07-06T03:29:41Z

[BEAM-2550] add UnitTest for JOIN in DSL

commit aa265e62a6c807ee941b1fc9379411f736754bf5
Author: Tyler Akidau 
Date:   2017-07-12T06:47:49Z

[BEAM-2550] This closes #3506

commit eb589fb943a45388e4118bb221e52009eb92d1c1
Author: mingmxu 
Date:   2017-07-09T07:52:23Z

support TUMBLE/HOP/SESSION _START function

commit a96c3a01f18826b4e43ad7e8f26e94c02903acd9
Author: Tyler Akidau 
Date:   2017-07-12T16:49:59Z

[BEAM-2204] This closes #3527

commit 0580e8b639ef77c7a6534b7b91ecad493950a3aa
Author: mingmxu 
Date:   2017-07-12T07:08:35Z

Test queries on unbounded PCollections with BeamSql DSL API.
Also add getTYPE(fieldName) overrides to BeamSqlRow.

commit 8defe6f21524b4ca00dd984176260c1bd0739774
Author: Tyler Akidau 
Date:   2017-07-12T17:03:28Z

[BEAM-2527] This closes #3477

commit 3a170744861dec12a33a19358f9ce89dbd64491c
Author: James Xu 
Date:   2017-07-10T11:58:21Z

[BEAM-2564] add integration test for string functions

commit 53d27e6c45ab7bbd6877c6674868e8a2e3f9a971
Author: Tyler Akidau 
Date:   2017-07-12T17:07:39Z

[BEAM-2564] This closes #3532

commit 39eedd566c3a1825ec3e9be0aa4490cfd824cb30
Author: tarushapptech 
Date:   2017-06-17T07:50:02Z

CAST operator supporting numeric, date and timestamp types

commit 125cef14453d625f1758aef5b129182cb13d829c
Author: Tyler Akidau 
Date:   2017-07-12T17:34:36Z

[BEAM-2424] This closes #3386

commit 556c48c805cba2e2fe8b349fca55bead1a0a7ef2
Author: tarushapptech 
Date:   2017-06-09T16:02:39Z

POWER function

commit 25fea4e1e28c7cbcdb25edb0dfb4839885e0c4b8
Author: Tyler Akidau 
Date:   2017-07-12

Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4420

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2555) add README page in dsl/sql

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095419#comment-16095419
 ] 

ASF GitHub Bot commented on BEAM-2555:
--

GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3605

[BEAM-2555] add README page in dsl/sql

1. remove README.md to keep align with other BEAM modules;
2. update usages in BeamSqlExample;

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2555

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3605.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3605


commit f1aa390d1989543f2848dae6b26596ffd1a5d8db
Author: mingmxu 
Date:   2017-07-20T21:32:42Z

remove README.md and update usages in BeamSqlExample




> add README page in dsl/sql
> --
>
> Key: BEAM-2555
> URL: https://issues.apache.org/jira/browse/BEAM-2555
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>  Labels: dsl_sql_merge
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3605: [BEAM-2555] add README page in dsl/sql

2017-07-20 Thread XuMingmin
GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3605

[BEAM-2555] add README page in dsl/sql

1. remove README.md to keep align with other BEAM modules;
2. update usages in BeamSqlExample;

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2555

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3605.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3605


commit f1aa390d1989543f2848dae6b26596ffd1a5d8db
Author: mingmxu 
Date:   2017-07-20T21:32:42Z

remove README.md and update usages in BeamSqlExample




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2030

2017-07-20 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3604: [BEAM-2141] Update jenkins job for JDBCIOIT

2017-07-20 Thread ssisk
GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3604

[BEAM-2141] Update jenkins job for JDBCIOIT

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
cc @jasonkuster 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jenkins-jdbc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3604.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3604


commit 3d83475a851dcdfd471ba1a68828058ee6a8ffbc
Author: Stephen Sisk 
Date:   2017-07-20T17:45:47Z

Update jenkins job for JDBCIOIT




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2141) beam_PerformanceTests_JDBC have not passed in weeks

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095377#comment-16095377
 ] 

ASF GitHub Bot commented on BEAM-2141:
--

GitHub user ssisk opened a pull request:

https://github.com/apache/beam/pull/3604

[BEAM-2141] Update jenkins job for JDBCIOIT

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [X] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [X] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [X] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [X] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
cc @jasonkuster 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ssisk/beam jenkins-jdbc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3604.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3604


commit 3d83475a851dcdfd471ba1a68828058ee6a8ffbc
Author: Stephen Sisk 
Date:   2017-07-20T17:45:47Z

Update jenkins job for JDBCIOIT




> beam_PerformanceTests_JDBC have not passed in weeks
> ---
>
> Key: BEAM-2141
> URL: https://issues.apache.org/jira/browse/BEAM-2141
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Daniel Halperin
>Assignee: Stephen Sisk
> Fix For: Not applicable
>
>
> https://builds.apache.org/job/beam_PerformanceTests_JDBC/
> Disabling them, as no one seems to be maintaining them.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2673

2017-07-20 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3620

2017-07-20 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3451

2017-07-20 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1598) Build support for Kubernetes and Beam IO Testing into PKB

2017-07-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095322#comment-16095322
 ] 

ASF GitHub Bot commented on BEAM-1598:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3588


> Build support for Kubernetes and Beam IO Testing into PKB
> -
>
> Key: BEAM-1598
> URL: https://issues.apache.org/jira/browse/BEAM-1598
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jason Kuster
>Assignee: Stephen Sisk
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/3] beam git commit: io-it-suite-local independent of io-it-suite, k8s properties -> root pom

2017-07-20 Thread tgroh
io-it-suite-local independent of io-it-suite, k8s properties -> root pom


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2f9cbec8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2f9cbec8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2f9cbec8

Branch: refs/heads/master
Commit: 2f9cbec803791fd4e17fb28ec5590649aabb9622
Parents: 4192ac6
Author: Stephen Sisk 
Authored: Thu Jul 20 11:38:40 2017 -0700
Committer: Thomas Groh 
Committed: Thu Jul 20 13:34:49 2017 -0700

--
 pom.xml   |  5 +
 sdks/java/io/jdbc/pom.xml | 33 +
 sdks/java/io/pom.xml  |  4 
 3 files changed, 38 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2f9cbec8/pom.xml
--
diff --git a/pom.xml b/pom.xml
index f2d0dde..e0ec136 100644
--- a/pom.xml
+++ b/pom.xml
@@ -168,6 +168,11 @@
 
-Xpkginfo:always
 nothing
 0.20.0
+
+
+kubectl
+
+${user.home}/.kube/config
   
 
   pom

http://git-wip-us.apache.org/repos/asf/beam/blob/2f9cbec8/sdks/java/io/jdbc/pom.xml
--
diff --git a/sdks/java/io/jdbc/pom.xml b/sdks/java/io/jdbc/pom.xml
index 3e8ba57..357ddc0 100644
--- a/sdks/java/io/jdbc/pom.xml
+++ b/sdks/java/io/jdbc/pom.xml
@@ -105,6 +105,7 @@
   
 
   
+
   
 org.codehaus.mojo
 exec-maven-plugin
@@ -162,9 +163,32 @@
 
   io-it-suite-local
   
io-it-suite-local
+  
+
+
${project.parent.parent.parent.parent.basedir}
+  
   
 
   
+org.codehaus.gmaven
+groovy-maven-plugin
+${groovy-maven-plugin.version}
+
+  
+find-supported-python-for-compile
+initialize
+
+  execute
+
+
+  
${beamRootProjectDir}/sdks/python/findSupportedPython.groovy
+
+  
+
+  
+
+  
 org.codehaus.mojo
 exec-maven-plugin
 ${maven-exec-plugin.version}
@@ -200,6 +224,15 @@
   
 
   
+
+  
+org.apache.maven.plugins
+maven-surefire-plugin
+${surefire-plugin.version}
+
+  true
+
+  
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/2f9cbec8/sdks/java/io/pom.xml
--
diff --git a/sdks/java/io/pom.xml b/sdks/java/io/pom.xml
index e9aa65f..4e02aa8 100644
--- a/sdks/java/io/pom.xml
+++ b/sdks/java/io/pom.xml
@@ -33,10 +33,6 @@
   (sources and sinks) to consume and produce data from systems.
 
   
-
-kubectl
-
-${user.home}/.kube/config
 
 
 



[1/3] beam git commit: This closes #3588

2017-07-20 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master afeba3715 -> 7e63d2cf6


This closes #3588


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7e63d2cf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7e63d2cf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7e63d2cf

Branch: refs/heads/master
Commit: 7e63d2cf6c0ea0da4ce68b72536522f68b4a5272
Parents: afeba37 2f9cbec
Author: Thomas Groh 
Authored: Thu Jul 20 13:34:49 2017 -0700
Committer: Thomas Groh 
Committed: Thu Jul 20 13:34:49 2017 -0700

--
 .../kubernetes/postgres/pkb-config-local.yml|  34 
 .test-infra/kubernetes/postgres/pkb-config.yml  |  32 
 pom.xml |   5 +
 runners/google-cloud-dataflow-java/pom.xml  |  23 +++
 sdks/java/io/google-cloud-platform/pom.xml  |  91 ++
 sdks/java/io/jdbc/pom.xml   | 172 +++
 sdks/java/io/pom.xml|  32 
 7 files changed, 389 insertions(+)
--




[GitHub] beam pull request #3588: [BEAM-1598] Add Maven support for invoking perfkit ...

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3588


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] beam git commit: Add maven support for invoking perfkit benchmarker to run IO ITs

2017-07-20 Thread tgroh
Add maven support for invoking perfkit benchmarker to run IO ITs


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4192ac6c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4192ac6c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4192ac6c

Branch: refs/heads/master
Commit: 4192ac6c72765e8c55480eec66c499ef0798ecf3
Parents: afeba37
Author: Stephen Sisk 
Authored: Wed Jun 14 09:57:35 2017 -0700
Committer: Thomas Groh 
Committed: Thu Jul 20 13:34:49 2017 -0700

--
 .../kubernetes/postgres/pkb-config-local.yml|  34 +
 .test-infra/kubernetes/postgres/pkb-config.yml  |  32 +
 runners/google-cloud-dataflow-java/pom.xml  |  23 +++
 sdks/java/io/google-cloud-platform/pom.xml  |  91 
 sdks/java/io/jdbc/pom.xml   | 139 +++
 sdks/java/io/pom.xml|  36 +
 6 files changed, 355 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4192ac6c/.test-infra/kubernetes/postgres/pkb-config-local.yml
--
diff --git a/.test-infra/kubernetes/postgres/pkb-config-local.yml 
b/.test-infra/kubernetes/postgres/pkb-config-local.yml
new file mode 100644
index 000..1bac0c4
--- /dev/null
+++ b/.test-infra/kubernetes/postgres/pkb-config-local.yml
@@ -0,0 +1,34 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This file is a pkb benchmark configuration file, used when running the IO ITs
+# that use this data store. It allows users to run tests when they are on a
+# separate network from the kubernetes cluster by reading the postgres IP
+# address from the LoadBalancer service.
+#
+# This file defines pipeline options to pass to beam, as well as how to derive
+# the values for those pipeline options from kubernetes (where appropriate.)
+
+static_pipeline_options:
+  - postgresUsername: postgres
+  - postgresPassword: uuinkks
+  - postgresDatabaseName: postgres
+  - postgresSsl: false
+dynamic_pipeline_options:
+  - name: postgresServerName
+type: LoadBalancerIp
+serviceName: postgres-for-dev

http://git-wip-us.apache.org/repos/asf/beam/blob/4192ac6c/.test-infra/kubernetes/postgres/pkb-config.yml
--
diff --git a/.test-infra/kubernetes/postgres/pkb-config.yml 
b/.test-infra/kubernetes/postgres/pkb-config.yml
new file mode 100644
index 000..b943b17
--- /dev/null
+++ b/.test-infra/kubernetes/postgres/pkb-config.yml
@@ -0,0 +1,32 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This file is a pkb benchmark configuration file, used when running the IO ITs
+# that use this data store.
+#
+# This file defines pipeline options to pass to beam, as well as how to derive
+# the values for those pipeline options from kubernetes (where appropriate.)
+
+static_pipeline_options:
+  - postgresUsername: postgres
+  - postgresPassword: uuinkks
+  - postgresDatabaseName: postgres
+  - postgresSsl: false
+dynamic_pipeline_options:
+  - name: postgresServerName
+type: NodePortIp
+podLabel: name=postgres

http://git-wip-us.apache.org/repos/asf/beam/blob/4192ac6c/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/goog

[35/50] [abbrv] beam git commit: This closes #3582

2017-07-20 Thread jbonofre
This closes #3582


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d5101750
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d5101750
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d5101750

Branch: refs/heads/DSL_SQL
Commit: d5101750e76460b4ad057103069abbd3833bce96
Parents: 0d927ef bdf5bd6
Author: Pei He 
Authored: Wed Jul 19 11:31:32 2017 +0800
Committer: Pei He 
Committed: Wed Jul 19 11:31:32 2017 +0800

--
 .../apache/beam/sdk/testing/TestPipeline.java   | 63 
 .../beam/sdk/testing/TestPipelineTest.java  | 38 +---
 2 files changed, 13 insertions(+), 88 deletions(-)
--




[49/50] [abbrv] beam git commit: This closes #3591: [BEAM-1542] Introduced SpannerIO.readAll

2017-07-20 Thread jbonofre
This closes #3591: [BEAM-1542] Introduced SpannerIO.readAll


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/afeba371
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/afeba371
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/afeba371

Branch: refs/heads/DSL_SQL
Commit: afeba3715c806b53115f8f7994eb7bc207c68932
Parents: c8e3744 95e9c28
Author: Eugene Kirpichov 
Authored: Thu Jul 20 10:59:14 2017 -0700
Committer: Eugene Kirpichov 
Committed: Thu Jul 20 10:59:14 2017 -0700

--
 .../sdk/io/gcp/spanner/NaiveSpannerReadFn.java  |  35 ++--
 .../beam/sdk/io/gcp/spanner/ReadOperation.java  |  96 ++
 .../beam/sdk/io/gcp/spanner/SpannerIO.java  | 187 ++-
 .../sdk/io/gcp/spanner/SpannerIOReadTest.java   | 145 +-
 4 files changed, 353 insertions(+), 110 deletions(-)
--




[36/50] [abbrv] beam git commit: [BEAM-2532] Memoizes TableSchema in BigQuerySourceBase

2017-07-20 Thread jbonofre
[BEAM-2532] Memoizes TableSchema in BigQuerySourceBase

Instead of parsing the JSON schema for every record.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e86c004d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e86c004d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e86c004d

Branch: refs/heads/DSL_SQL
Commit: e86c004de5d4b5f8bd0c3c53207cf3c1760f5d8e
Parents: d510175
Author: Neville Li 
Authored: Tue Jul 18 09:07:21 2017 -0400
Committer: Eugene Kirpichov 
Committed: Tue Jul 18 22:28:57 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java  | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e86c004d/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
index 2de60a2..2b1eafe 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
@@ -29,11 +29,16 @@ import com.google.api.services.bigquery.model.JobReference;
 import com.google.api.services.bigquery.model.TableReference;
 import com.google.api.services.bigquery.model.TableRow;
 import com.google.api.services.bigquery.model.TableSchema;
+import com.google.common.base.Function;
+import com.google.common.base.Supplier;
+import com.google.common.base.Suppliers;
 import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Lists;
 import java.io.IOException;
+import java.io.Serializable;
 import java.util.List;
 import java.util.NoSuchElementException;
+import javax.annotation.Nullable;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.io.AvroSource;
@@ -168,10 +173,12 @@ abstract class BigQuerySourceBase extends 
BoundedSource {
 
 SerializableFunction function =
 new SerializableFunction() {
+  private Supplier schema = Suppliers.memoize(
+  Suppliers.compose(new TableSchemaFunction(), 
Suppliers.ofInstance(jsonSchema)));
+
   @Override
   public TableRow apply(GenericRecord input) {
-return BigQueryAvroUtils.convertGenericRecordToTableRow(
-input, BigQueryHelpers.fromJsonString(jsonSchema, 
TableSchema.class));
+return BigQueryAvroUtils.convertGenericRecordToTableRow(input, 
schema.get());
   }};
 
 List> avroSources = Lists.newArrayList();
@@ -182,6 +189,14 @@ abstract class BigQuerySourceBase extends 
BoundedSource {
 return ImmutableList.copyOf(avroSources);
   }
 
+  private static class TableSchemaFunction implements Serializable, 
Function {
+@Nullable
+@Override
+public TableSchema apply(@Nullable String input) {
+  return BigQueryHelpers.fromJsonString(input, TableSchema.class);
+}
+  }
+
   protected static class BigQueryReader extends BoundedReader {
 private final BigQuerySourceBase source;
 private final BigQueryServices.BigQueryJsonReader reader;



[23/50] [abbrv] beam git commit: Change PR template from 1234 to XXX

2017-07-20 Thread jbonofre
Change PR template from 1234 to XXX


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b827f656
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b827f656
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b827f656

Branch: refs/heads/DSL_SQL
Commit: b827f65622f9cd9203803b76935aac422c179803
Parents: d2201f9
Author: Sourabh Bajaj 
Authored: Tue Jul 18 10:30:44 2017 -0700
Committer: Sourabh Bajaj 
Committed: Tue Jul 18 10:30:44 2017 -0700

--
 .github/PULL_REQUEST_TEMPLATE.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b827f656/.github/PULL_REQUEST_TEMPLATE.md
--
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 750..bd361b7 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -2,7 +2,7 @@ Follow this checklist to help us incorporate your contribution 
quickly and easil
 
  - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
  - [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- - [ ] Format the pull request title like `[BEAM-1234] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-1234` with the appropriate JIRA 
issue.
+ - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
  - [ ] Write a pull request description that is detailed enough to understand 
what the pull request does, how, and why.
  - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough 
check will be performed on your pull request automatically.
  - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).



[08/50] [abbrv] beam git commit: Closes #3520

2017-07-20 Thread jbonofre
Closes #3520


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/532256e8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/532256e8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/532256e8

Branch: refs/heads/DSL_SQL
Commit: 532256e8811b790fdf25fb4e11b7c2b89383761a
Parents: 7e4719c 7257507
Author: Robert Bradshaw 
Authored: Mon Jul 17 14:33:01 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon Jul 17 14:33:01 2017 -0700

--
 .../runners/dataflow/dataflow_runner.py   | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)
--




[50/50] [abbrv] beam git commit: This closes #3603

2017-07-20 Thread jbonofre
This closes #3603


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ada24c05
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ada24c05
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ada24c05

Branch: refs/heads/DSL_SQL
Commit: ada24c059b1337fe02517c9f66fa9d29fb8bcc61
Parents: 152115e afeba37
Author: Jean-Baptiste Onofré 
Authored: Thu Jul 20 21:52:35 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Thu Jul 20 21:52:35 2017 +0200

--
 .github/PULL_REQUEST_TEMPLATE.md|   16 +-
 .gitignore  |2 +-
 .../jenkins/common_job_properties.groovy|9 +-
 .../job_beam_PerformanceTests_Python.groovy |   58 +
 ..._beam_PostCommit_Java_JDKVersionsTest.groovy |2 +
 ..._PostCommit_Java_MavenInstall_Windows.groovy |3 +-
 .../job_beam_PreCommit_Website_Merge.groovy |   59 +
 README.md   |4 +-
 examples/java/pom.xml   |   32 +-
 .../org/apache/beam/examples/WordCount.java |4 +
 .../examples/common/WriteOneFilePerWindow.java  |   59 +-
 .../apache/beam/examples/complete/TfIdf.java|3 +-
 .../examples/complete/TopWikipediaSessions.java |   24 +-
 .../beam/examples/complete/TrafficRoutes.java   |   19 +
 .../beam/examples/cookbook/TriggerExample.java  |6 +-
 .../beam/examples/DebuggingWordCountTest.java   |   11 +-
 .../beam/examples/WindowedWordCountIT.java  |4 +-
 examples/java8/pom.xml  |   20 +-
 .../complete/game/utils/WriteToText.java|   49 +-
 .../examples/complete/game/LeaderBoardTest.java |2 +
 examples/pom.xml|2 +-
 pom.xml |  127 +-
 runners/apex/pom.xml|   20 +-
 .../apache/beam/runners/apex/ApexRunner.java|   61 +-
 .../translation/ApexPipelineTranslator.java |   16 +-
 .../apex/translation/TranslationContext.java|4 +-
 .../operators/ApexParDoOperator.java|   21 +-
 .../runners/apex/examples/WordCountTest.java|8 +-
 .../utils/ApexStateInternalsTest.java   |  411 ++-
 runners/core-construction-java/pom.xml  |2 +-
 .../CreatePCollectionViewTranslation.java   |   15 +-
 .../construction/ElementAndRestriction.java |   42 -
 .../ElementAndRestrictionCoder.java |   88 --
 .../construction/PCollectionTranslation.java|   16 +
 .../core/construction/PTransformMatchers.java   |  109 +-
 .../construction/PTransformTranslation.java |   11 +-
 .../core/construction/ParDoTranslation.java |   82 +-
 .../construction/RunnerPCollectionView.java |   31 +-
 .../core/construction/SplittableParDo.java  |  124 +-
 .../construction/TestStreamTranslation.java |   49 +-
 .../core/construction/TransformInputs.java  |   50 +
 .../WindowingStrategyTranslation.java   |   27 +-
 .../construction/WriteFilesTranslation.java |   67 +-
 .../construction/metrics/MetricFiltering.java   |  102 ++
 .../core/construction/metrics/MetricKey.java|   43 +
 .../core/construction/metrics/package-info.java |   22 +
 .../runners/core/metrics/MetricFiltering.java   |  102 --
 .../beam/runners/core/metrics/MetricKey.java|   43 -
 .../beam/runners/core/metrics/package-info.java |   22 -
 .../ElementAndRestrictionCoderTest.java |  126 --
 .../PCollectionTranslationTest.java |   22 +
 .../construction/PTransformMatchersTest.java|   54 +-
 .../core/construction/ParDoTranslationTest.java |   28 +-
 .../core/construction/SplittableParDoTest.java  |   18 +-
 .../core/construction/TransformInputsTest.java  |  166 +++
 .../WindowingStrategyTranslationTest.java   |3 +
 .../construction/WriteFilesTranslationTest.java |   68 +-
 .../metrics/MetricFilteringTest.java|  148 +++
 .../core/metrics/MetricFilteringTest.java   |  148 ---
 runners/core-java/pom.xml   |2 +-
 .../runners/core/InMemoryTimerInternals.java|9 +
 .../core/LateDataDroppingDoFnRunner.java|   33 +-
 ...eBoundedSplittableProcessElementInvoker.java |   40 +-
 .../beam/runners/core/ProcessFnRunner.java  |   16 +-
 .../beam/runners/core/ReduceFnRunner.java   |  135 ++-
 .../beam/runners/core/SimpleDoFnRunner.java |   20 +
 .../core/SplittableParDoViaKeyedWorkItems.java  |   58 +-
 .../core/SplittableProcessElementInvoker.java   |   25 +-
 .../org/apache/beam/runners/core/StateTags.java |3 +
 .../beam/runners/core/SystemReduceFn.java   |6 +
 .../runners/core/metrics/MetricUpdates.java |1 +
 .../core/metrics/MetricsContainerImpl.java  |1 +
 .../core/metrics/MetricsContainerStepMap.java   |2 +
 .../core/triggers/AfterAllStateMachine.java |   25 +-
 .../AfterDelayFromFirstElementStateMachine.java |

[06/50] [abbrv] beam git commit: Fix split package in SDK harness

2017-07-20 Thread jbonofre
http://git-wip-us.apache.org/repos/asf/beam/blob/f1b4700f/sdks/java/harness/src/main/java/org/apache/beam/runners/core/FnApiDoFnRunner.java
--
diff --git 
a/sdks/java/harness/src/main/java/org/apache/beam/runners/core/FnApiDoFnRunner.java
 
b/sdks/java/harness/src/main/java/org/apache/beam/runners/core/FnApiDoFnRunner.java
deleted file mode 100644
index b3cf3a7..000
--- 
a/sdks/java/harness/src/main/java/org/apache/beam/runners/core/FnApiDoFnRunner.java
+++ /dev/null
@@ -1,547 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.runners.core;
-
-import static com.google.common.base.Preconditions.checkArgument;
-
-import com.google.auto.service.AutoService;
-import com.google.common.collect.Collections2;
-import com.google.common.collect.ImmutableMap;
-import com.google.common.collect.ImmutableMultimap;
-import com.google.common.collect.Multimap;
-import com.google.protobuf.ByteString;
-import com.google.protobuf.BytesValue;
-import com.google.protobuf.InvalidProtocolBufferException;
-import java.util.Collection;
-import java.util.HashSet;
-import java.util.Iterator;
-import java.util.Map;
-import java.util.Objects;
-import java.util.function.Consumer;
-import java.util.function.Supplier;
-import org.apache.beam.fn.harness.data.BeamFnDataClient;
-import org.apache.beam.fn.harness.fn.ThrowingConsumer;
-import org.apache.beam.fn.harness.fn.ThrowingRunnable;
-import org.apache.beam.runners.core.construction.ParDoTranslation;
-import org.apache.beam.runners.dataflow.util.DoFnInfo;
-import org.apache.beam.sdk.common.runner.v1.RunnerApi;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.state.State;
-import org.apache.beam.sdk.state.TimeDomain;
-import org.apache.beam.sdk.state.Timer;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.DoFn.OnTimerContext;
-import org.apache.beam.sdk.transforms.DoFn.ProcessContext;
-import org.apache.beam.sdk.transforms.reflect.DoFnInvoker;
-import org.apache.beam.sdk.transforms.reflect.DoFnInvokers;
-import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.PaneInfo;
-import org.apache.beam.sdk.util.SerializableUtils;
-import org.apache.beam.sdk.util.UserCodeException;
-import org.apache.beam.sdk.util.WindowedValue;
-import org.apache.beam.sdk.values.PCollectionView;
-import org.apache.beam.sdk.values.TupleTag;
-import org.apache.beam.sdk.values.WindowingStrategy;
-import org.joda.time.Instant;
-
-/**
- * A {@link DoFnRunner} specific to integrating with the Fn Api. This is to 
remove the layers
- * of abstraction caused by StateInternals/TimerInternals since they model 
state and timer
- * concepts differently.
- */
-public class FnApiDoFnRunner implements DoFnRunner {
-  /**
-   * A registrar which provides a factory to handle Java {@link DoFn}s.
-   */
-  @AutoService(PTransformRunnerFactory.Registrar.class)
-  public static class Registrar implements
-  PTransformRunnerFactory.Registrar {
-
-@Override
-public Map getPTransformRunnerFactories() 
{
-  return ImmutableMap.of(ParDoTranslation.CUSTOM_JAVA_DO_FN_URN, new 
Factory());
-}
-  }
-
-  /** A factory for {@link FnApiDoFnRunner}. */
-  static class Factory
-  implements PTransformRunnerFactory> {
-
-@Override
-public DoFnRunner createRunnerForPTransform(
-PipelineOptions pipelineOptions,
-BeamFnDataClient beamFnDataClient,
-String pTransformId,
-RunnerApi.PTransform pTransform,
-Supplier processBundleInstructionId,
-Map pCollections,
-Map coders,
-Multimap>> 
pCollectionIdsToConsumers,
-Consumer addStartFunction,
-Consumer addFinishFunction) {
-
-  // For every output PCollection, create a map from output name to 
Consumer
-  ImmutableMap.Builder>>>
-  outputMapBuilder = ImmutableMap.builder();
-  for (Map.Entry entry : 
pTransform.getOutputsMap().entrySet()) {
-outputMapBuilder.put(
-entry.getKey(),
-pCollectionIds

[28/50] [abbrv] beam git commit: This closes #3455

2017-07-20 Thread jbonofre
This closes #3455


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/dd9e866e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/dd9e866e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/dd9e866e

Branch: refs/heads/DSL_SQL
Commit: dd9e866e087351e395e464e07d512b2e8db107c4
Parents: 2d5b6d7 111603a
Author: Thomas Groh 
Authored: Tue Jul 18 14:49:56 2017 -0700
Committer: Thomas Groh 
Committed: Tue Jul 18 14:49:56 2017 -0700

--
 .../beam/runners/dataflow/DataflowPipelineJob.java  | 14 --
 .../beam/runners/dataflow/DataflowRunner.java   |  3 ++-
 .../beam/runners/dataflow/util/MonitoringUtil.java  | 16 +---
 .../dataflow/BatchStatefulParDoOverridesTest.java   |  1 +
 .../dataflow/DataflowPipelineTranslatorTest.java|  1 +
 .../runners/dataflow/internal/apiclient.py  |  7 +--
 .../runners/dataflow/test_dataflow_runner.py|  5 +++--
 7 files changed, 37 insertions(+), 10 deletions(-)
--




[39/50] [abbrv] beam git commit: This closes #3531: [BEAM-2306] Fail build when @Deprecated is used without @deprecated javadoc

2017-07-20 Thread jbonofre
This closes #3531: [BEAM-2306] Fail build when @Deprecated is used without 
@deprecated javadoc

  [BEAM-2306] Add checkstyle check to fail the build when @Deprecated is used 
without @deprecated javadoc (or vice versa).


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a6f460fe
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a6f460fe
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a6f460fe

Branch: refs/heads/DSL_SQL
Commit: a6f460fe3b760aafbc748ae18956f0f2c1fedfad
Parents: 7fde976 d290114
Author: Kenneth Knowles 
Authored: Wed Jul 19 09:03:48 2017 -0700
Committer: Kenneth Knowles 
Committed: Wed Jul 19 09:03:48 2017 -0700

--
 .../construction/CreatePCollectionViewTranslation.java  | 11 ++-
 .../core/construction/PTransformTranslation.java|  4 
 .../beam/runners/core/InMemoryTimerInternals.java   |  9 +
 .../java/org/apache/beam/runners/core/StateTags.java|  3 +++
 .../beam/runners/direct/DirectTimerInternals.java   |  9 +
 .../translation/wrappers/streaming/DoFnOperator.java|  9 +
 .../apache/beam/runners/dataflow/DataflowRunner.java|  3 ++-
 .../options/DataflowPipelineWorkerPoolOptions.java  |  3 +++
 .../build-tools/src/main/resources/beam/checkstyle.xml  |  8 
 .../src/main/java/org/apache/beam/sdk/coders/Coder.java | 12 +++-
 .../java/org/apache/beam/sdk/coders/CoderRegistry.java  |  9 +
 .../main/java/org/apache/beam/sdk/io/AvroSource.java|  6 --
 .../main/java/org/apache/beam/sdk/testing/PAssert.java  |  5 +++--
 .../java/org/apache/beam/sdk/testing/StreamingIT.java   |  4 
 .../java/org/apache/beam/sdk/transforms/Combine.java|  1 -
 .../main/java/org/apache/beam/sdk/transforms/DoFn.java  |  3 +++
 .../main/java/org/apache/beam/sdk/transforms/View.java  |  2 +-
 .../beam/sdk/transforms/reflect/DoFnInvokers.java   |  9 -
 .../java/org/apache/beam/sdk/util/IdentityWindowFn.java |  1 -
 .../org/apache/beam/sdk/values/PCollectionViews.java|  1 -
 .../main/java/org/apache/beam/sdk/values/PValue.java|  4 ++--
 .../org/apache/beam/sdk/coders/DefaultCoderTest.java|  3 ++-
 .../org/apache/beam/fn/harness/BoundedSourceRunner.java |  6 +++---
 23 files changed, 95 insertions(+), 30 deletions(-)
--




[37/50] [abbrv] beam git commit: This closes #3584: [BEAM-2532] add a Serializable TableSchema Supplier in BigQuerySourceBase

2017-07-20 Thread jbonofre
This closes #3584: [BEAM-2532] add a Serializable TableSchema Supplier in 
BigQuerySourceBase


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7fde976d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7fde976d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7fde976d

Branch: refs/heads/DSL_SQL
Commit: 7fde976d14fe697dd88d2b161540c73d5cb01517
Parents: d510175 e86c004
Author: Eugene Kirpichov 
Authored: Tue Jul 18 22:33:58 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Jul 18 22:33:58 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java  | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)
--




[20/50] [abbrv] beam git commit: This closes #3442: Splits large TextIOTest into TextIOReadTest and TextIOWriteTest

2017-07-20 Thread jbonofre
This closes #3442: Splits large TextIOTest into TextIOReadTest and 
TextIOWriteTest


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7c363181
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7c363181
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7c363181

Branch: refs/heads/DSL_SQL
Commit: 7c3631810a604ba58ec16c3b3aa9a346bd6d9f17
Parents: 0f06eb2 d495d15
Author: Kenneth Knowles 
Authored: Mon Jul 17 19:43:20 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Jul 17 19:43:20 2017 -0700

--
 .../org/apache/beam/sdk/io/TextIOReadTest.java  |  847 +++
 .../java/org/apache/beam/sdk/io/TextIOTest.java | 1353 +-
 .../org/apache/beam/sdk/io/TextIOWriteTest.java |  604 
 3 files changed, 1460 insertions(+), 1344 deletions(-)
--




[33/50] [abbrv] beam git commit: This closes #3576

2017-07-20 Thread jbonofre
This closes #3576


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0d927ef6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0d927ef6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0d927ef6

Branch: refs/heads/DSL_SQL
Commit: 0d927ef6ab0fa5dd03a6b38ea9fe9bfeacd8
Parents: be5b934 1e94704
Author: Thomas Groh 
Authored: Tue Jul 18 17:52:56 2017 -0700
Committer: Thomas Groh 
Committed: Tue Jul 18 17:52:56 2017 -0700

--
 .../beam/sdk/transforms/GroupByKeyTest.java | 156 +++
 1 file changed, 122 insertions(+), 34 deletions(-)
--




[24/50] [abbrv] beam git commit: This closes #3587: Change PR template from 1234 to XXX

2017-07-20 Thread jbonofre
This closes #3587: Change PR template from 1234 to XXX


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5a0b74c9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5a0b74c9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5a0b74c9

Branch: refs/heads/DSL_SQL
Commit: 5a0b74c9b8654cd034a55145a60e666c579caab6
Parents: d2201f9 b827f65
Author: Eugene Kirpichov 
Authored: Tue Jul 18 11:18:53 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Jul 18 11:18:53 2017 -0700

--
 .github/PULL_REQUEST_TEMPLATE.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[18/50] [abbrv] beam git commit: Splits large TextIOTest into TextIOReadTest and TextIOWriteTest

2017-07-20 Thread jbonofre
http://git-wip-us.apache.org/repos/asf/beam/blob/d495d151/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOWriteTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOWriteTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOWriteTest.java
new file mode 100644
index 000..a73ed7d
--- /dev/null
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOWriteTest.java
@@ -0,0 +1,604 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static com.google.common.base.MoreObjects.firstNonNull;
+import static org.apache.beam.sdk.TestUtils.LINES2_ARRAY;
+import static org.apache.beam.sdk.TestUtils.LINES_ARRAY;
+import static org.apache.beam.sdk.TestUtils.NO_LINES_ARRAY;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem;
+import static org.hamcrest.Matchers.containsInAnyOrder;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.google.common.base.Function;
+import com.google.common.base.Functions;
+import com.google.common.base.Predicate;
+import com.google.common.base.Predicates;
+import com.google.common.collect.FluentIterable;
+import com.google.common.collect.Iterables;
+import com.google.common.collect.Lists;
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.coders.AvroCoder;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.DefaultCoder;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.FileBasedSink.WritableByteChannelFactory;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.io.fs.MatchResult.Metadata;
+import org.apache.beam.sdk.io.fs.ResolveOptions;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.util.CoderUtils;
+import org.apache.beam.sdk.values.PCollection;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.ExpectedException;
+
+/** Tests for {@link TextIO.Write}. */
+public class TextIOWriteTest {
+  private static final String MY_HEADER = "myHeader";
+  private static final String MY_FOOTER = "myFooter";
+
+  private static Path tempFolder;
+
+  @Rule public TestPipeline p = TestPipeline.create();
+
+  @Rule public ExpectedException expectedException = ExpectedException.none();
+
+  @BeforeClass
+  public static void setupClass() throws IOException {
+tempFolder = Files.createTempDirectory("TextIOTest");
+  }
+
+  @AfterClass
+  public static void teardownClass() throws IOException {
+Files.walkFileTree(
+tempFolder,
+new SimpleFileVisitor() {
+  @Override
+  public FileVisitResult visitFile(Path file, BasicFileAttributes 
attrs)
+  throws IOException {
+Files.delete(file);
+return FileVisitResult.CONTINUE;
+  }
+
+  @Override
+  public FileVisitResult postVisitDirectory(Path dir, IOException exc) 
throws IOException {
+Files.delete(dir);
+return FileVisitResult.CONTINUE;
+  }
+});
+  }
+
+  static class TestDynamicDesti

[11/50] [abbrv] beam git commit: [BEAM-1502] GroupByKey should not return bare lists in DirectRunner.

2017-07-20 Thread jbonofre
[BEAM-1502] GroupByKey should not return bare lists in DirectRunner.

This leads to invalidated expectations on other runners.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e7059e5c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e7059e5c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e7059e5c

Branch: refs/heads/DSL_SQL
Commit: e7059e5cb3cd07855582641798c58fc3cf5cd682
Parents: 532256e
Author: Robert Bradshaw 
Authored: Mon Jul 17 13:44:40 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon Jul 17 15:08:02 2017 -0700

--
 .../apache_beam/examples/snippets/snippets.py   |  2 +-
 sdks/python/apache_beam/transforms/core.py  |  2 +-
 sdks/python/apache_beam/transforms/trigger.py   | 21 +++-
 3 files changed, 18 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e7059e5c/sdks/python/apache_beam/examples/snippets/snippets.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py 
b/sdks/python/apache_beam/examples/snippets/snippets.py
index 3a5f9b1..27b8120 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets.py
@@ -1136,7 +1136,7 @@ def model_group_by_key(contents, output_path):
 grouped_words = words_and_counts | beam.GroupByKey()
 # [END model_group_by_key_transform]
 (grouped_words
- | 'count words' >> beam.Map(lambda (word, counts): (word, len(counts)))
+ | 'count words' >> beam.Map(lambda (word, counts): (word, sum(counts)))
  | beam.io.WriteToText(output_path))
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/e7059e5c/sdks/python/apache_beam/transforms/core.py
--
diff --git a/sdks/python/apache_beam/transforms/core.py 
b/sdks/python/apache_beam/transforms/core.py
index 8018219..92b8737 100644
--- a/sdks/python/apache_beam/transforms/core.py
+++ b/sdks/python/apache_beam/transforms/core.py
@@ -1017,7 +1017,7 @@ class CombineValuesDoFn(DoFn):
self.combinefn.apply(element[1], *args, **kwargs))]
 
 # Add the elements into three accumulators (for testing of merge).
-elements = element[1]
+elements = list(element[1])
 accumulators = []
 for k in range(3):
   if len(elements) <= k:

http://git-wip-us.apache.org/repos/asf/beam/blob/e7059e5c/sdks/python/apache_beam/transforms/trigger.py
--
diff --git a/sdks/python/apache_beam/transforms/trigger.py 
b/sdks/python/apache_beam/transforms/trigger.py
index f77fa1a..c1fbfc5 100644
--- a/sdks/python/apache_beam/transforms/trigger.py
+++ b/sdks/python/apache_beam/transforms/trigger.py
@@ -24,6 +24,7 @@ from abc import ABCMeta
 from abc import abstractmethod
 import collections
 import copy
+import itertools
 
 from apache_beam.coders import observable
 from apache_beam.transforms import combiners
@@ -878,6 +879,17 @@ class _UnwindowedValues(observable.ObservableMixin):
   def __reduce__(self):
 return list, (list(self),)
 
+  def __eq__(self, other):
+if isinstance(other, collections.Iterable):
+  return all(
+  a == b
+  for a, b in itertools.izip_longest(self, other, fillvalue=object()))
+else:
+  return NotImplemented
+
+  def __ne__(self, other):
+return not self == other
+
 
 class DefaultGlobalBatchTriggerDriver(TriggerDriver):
   """Breaks a bundles into window (pane)s according to the default triggering.
@@ -888,11 +900,10 @@ class DefaultGlobalBatchTriggerDriver(TriggerDriver):
 pass
 
   def process_elements(self, state, windowed_values, unused_output_watermark):
-if isinstance(windowed_values, list):
-  unwindowed = [wv.value for wv in windowed_values]
-else:
-  unwindowed = _UnwindowedValues(windowed_values)
-yield WindowedValue(unwindowed, MIN_TIMESTAMP, self.GLOBAL_WINDOW_TUPLE)
+yield WindowedValue(
+_UnwindowedValues(windowed_values),
+MIN_TIMESTAMP,
+self.GLOBAL_WINDOW_TUPLE)
 
   def process_timer(self, window_id, name, time_domain, timestamp, state):
 raise TypeError('Triggers never set or called for batch default 
windowing.')



[15/50] [abbrv] beam git commit: This closes #3575: Adjust pull request template for Jenkins and mergebot world

2017-07-20 Thread jbonofre
This closes #3575: Adjust pull request template for Jenkins and mergebot world


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/04d364d3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/04d364d3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/04d364d3

Branch: refs/heads/DSL_SQL
Commit: 04d364d31959f044c7ccc7b9fc52884f4ae501d7
Parents: 1996869 4c6fa39
Author: Kenneth Knowles 
Authored: Mon Jul 17 16:00:09 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Jul 17 16:00:09 2017 -0700

--
 .github/PULL_REQUEST_TEMPLATE.md | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)
--




[45/50] [abbrv] beam git commit: This closes #3595

2017-07-20 Thread jbonofre
This closes #3595


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2e51bde5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2e51bde5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2e51bde5

Branch: refs/heads/DSL_SQL
Commit: 2e51bde5bd3fc2589b0e04f2ced8bd7c24d1046a
Parents: eb0850e d128c3b
Author: Ahmet Altay 
Authored: Wed Jul 19 14:08:01 2017 -0700
Committer: Ahmet Altay 
Committed: Wed Jul 19 14:08:01 2017 -0700

--
 sdks/python/apache_beam/runners/dataflow/dataflow_runner.py | 3 +++
 1 file changed, 3 insertions(+)
--




[48/50] [abbrv] beam git commit: Introduces SpannerIO.readAll()

2017-07-20 Thread jbonofre
Introduces SpannerIO.readAll()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/95e9c28c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/95e9c28c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/95e9c28c

Branch: refs/heads/DSL_SQL
Commit: 95e9c28ca4da5bac31f3d768595693e43b464c1c
Parents: c8e3744
Author: Mairbek Khadikov 
Authored: Tue Jul 18 16:23:58 2017 -0700
Committer: Eugene Kirpichov 
Committed: Thu Jul 20 10:58:51 2017 -0700

--
 .../sdk/io/gcp/spanner/NaiveSpannerReadFn.java  |  35 ++--
 .../beam/sdk/io/gcp/spanner/ReadOperation.java  |  96 ++
 .../beam/sdk/io/gcp/spanner/SpannerIO.java  | 187 ++-
 .../sdk/io/gcp/spanner/SpannerIOReadTest.java   | 145 +-
 4 files changed, 353 insertions(+), 110 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/95e9c28c/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/NaiveSpannerReadFn.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/NaiveSpannerReadFn.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/NaiveSpannerReadFn.java
index d193b95..92b3fe3 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/NaiveSpannerReadFn.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/NaiveSpannerReadFn.java
@@ -22,44 +22,53 @@ import com.google.cloud.spanner.ResultSet;
 import com.google.cloud.spanner.Struct;
 import com.google.cloud.spanner.TimestampBound;
 import com.google.common.annotations.VisibleForTesting;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.values.PCollectionView;
 
 /** A simplest read function implementation. Parallelism support is coming. */
 @VisibleForTesting
-class NaiveSpannerReadFn extends AbstractSpannerFn {
-  private final SpannerIO.Read config;
+class NaiveSpannerReadFn extends AbstractSpannerFn {
+  private final SpannerConfig config;
+  @Nullable private final PCollectionView transaction;
 
-  NaiveSpannerReadFn(SpannerIO.Read config) {
+  NaiveSpannerReadFn(SpannerConfig config, @Nullable 
PCollectionView transaction) {
 this.config = config;
+this.transaction = transaction;
+  }
+
+  NaiveSpannerReadFn(SpannerConfig config) {
+this(config, null);
   }
 
   SpannerConfig getSpannerConfig() {
-return config.getSpannerConfig();
+return config;
   }
 
   @ProcessElement
   public void processElement(ProcessContext c) throws Exception {
 TimestampBound timestampBound = TimestampBound.strong();
-if (config.getTransaction() != null) {
-  Transaction transaction = c.sideInput(config.getTransaction());
+if (transaction != null) {
+  Transaction transaction = c.sideInput(this.transaction);
   timestampBound = TimestampBound.ofReadTimestamp(transaction.timestamp());
 }
+ReadOperation op = c.element();
 try (ReadOnlyTransaction readOnlyTransaction =
 databaseClient().readOnlyTransaction(timestampBound)) {
-  ResultSet resultSet = execute(readOnlyTransaction);
+  ResultSet resultSet = execute(op, readOnlyTransaction);
   while (resultSet.next()) {
 c.output(resultSet.getCurrentRowAsStruct());
   }
 }
   }
 
-  private ResultSet execute(ReadOnlyTransaction readOnlyTransaction) {
-if (config.getQuery() != null) {
-  return readOnlyTransaction.executeQuery(config.getQuery());
+  private ResultSet execute(ReadOperation op, ReadOnlyTransaction 
readOnlyTransaction) {
+if (op.getQuery() != null) {
+  return readOnlyTransaction.executeQuery(op.getQuery());
 }
-if (config.getIndex() != null) {
+if (op.getIndex() != null) {
   return readOnlyTransaction.readUsingIndex(
-  config.getTable(), config.getIndex(), config.getKeySet(), 
config.getColumns());
+  op.getTable(), op.getIndex(), op.getKeySet(), op.getColumns());
 }
-return readOnlyTransaction.read(config.getTable(), config.getKeySet(), 
config.getColumns());
+return readOnlyTransaction.read(op.getTable(), op.getKeySet(), 
op.getColumns());
   }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/95e9c28c/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/ReadOperation.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/ReadOperation.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/ReadOperation.java
new file mode 100644
index 000..3b2bb6b
--- /dev/null
+++ 
b/sdks/java/io/google-cloud-platf

[41/50] [abbrv] beam git commit: [BEAM-2642] Update Google Auth to 0.7.1

2017-07-20 Thread jbonofre
[BEAM-2642] Update Google Auth to 0.7.1

This closes #3596


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4d1db226
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4d1db226
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4d1db226

Branch: refs/heads/DSL_SQL
Commit: 4d1db2265298af324372e5212ec06cd10b4f4908
Parents: a6f460f 51427a6
Author: Luke Cwik 
Authored: Wed Jul 19 13:09:13 2017 -0700
Committer: Luke Cwik 
Committed: Wed Jul 19 13:09:13 2017 -0700

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[14/50] [abbrv] beam git commit: Adjust pull request template for Jenkins and mergebot world

2017-07-20 Thread jbonofre
Adjust pull request template for Jenkins and mergebot world

Adds details about making a good series of commits, while removing advice that
the user do things that Jenkins will do for them.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4c6fa39f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4c6fa39f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4c6fa39f

Branch: refs/heads/DSL_SQL
Commit: 4c6fa39f619709ff127ca8418121ad91afa2041b
Parents: 7e4719c
Author: Kenneth Knowles 
Authored: Mon Jul 17 13:06:26 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Jul 17 15:59:39 2017 -0700

--
 .github/PULL_REQUEST_TEMPLATE.md | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/4c6fa39f/.github/PULL_REQUEST_TEMPLATE.md
--
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
index 868edd1..750 100644
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,12 +1,10 @@
-Be sure to do all of the following to help us incorporate your contribution
-quickly and easily:
+Follow this checklist to help us incorporate your contribution quickly and 
easily:
 
- - [ ] Make sure the PR title is formatted like:
-   `[BEAM-] Description of pull request`
- - [ ] Make sure tests pass via `mvn clean verify`.
- - [ ] Replace `` in the title with the actual Jira issue
-   number, if there is one.
- - [ ] If this contribution is large, please file an Apache
-   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).
+ - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
+ - [ ] Each commit in the pull request should have a meaningful subject line 
and body.
+ - [ ] Format the pull request title like `[BEAM-1234] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-1234` with the appropriate JIRA 
issue.
+ - [ ] Write a pull request description that is detailed enough to understand 
what the pull request does, how, and why.
+ - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough 
check will be performed on your pull request automatically.
+ - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
 
 ---



[29/50] [abbrv] beam git commit: Accept Region in Dataflow Monitoring Page URL

2017-07-20 Thread jbonofre
Accept Region in Dataflow Monitoring Page URL

Update Google Cloud Dataflow FE URLs from the Dataflow Runners to
regionalized paths.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/111603a9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/111603a9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/111603a9

Branch: refs/heads/DSL_SQL
Commit: 111603a9952f415fa1386046f7a2d3bde5b6532d
Parents: 2d5b6d7
Author: Robert Burke 
Authored: Tue Jun 27 15:41:56 2017 -0700
Committer: Thomas Groh 
Committed: Tue Jul 18 14:49:56 2017 -0700

--
 .../beam/runners/dataflow/DataflowPipelineJob.java  | 14 --
 .../beam/runners/dataflow/DataflowRunner.java   |  3 ++-
 .../beam/runners/dataflow/util/MonitoringUtil.java  | 16 +---
 .../dataflow/BatchStatefulParDoOverridesTest.java   |  1 +
 .../dataflow/DataflowPipelineTranslatorTest.java|  1 +
 .../runners/dataflow/internal/apiclient.py  |  7 +--
 .../runners/dataflow/test_dataflow_runner.py|  5 +++--
 7 files changed, 37 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/111603a9/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
index e30d426..e736373 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineJob.java
@@ -169,6 +169,13 @@ public class DataflowPipelineJob implements PipelineResult 
{
   }
 
   /**
+   * Get the region this job exists in.
+   */
+  public String getRegion() {
+return dataflowOptions.getRegion();
+  }
+
+  /**
* Returns a new {@link DataflowPipelineJob} for the job that replaced this 
one, if applicable.
*
* @throws IllegalStateException if called before the job has terminated or 
if the job terminated
@@ -344,7 +351,9 @@ public class DataflowPipelineJob implements PipelineResult {
   getJobId(),
   getReplacedByJob().getJobId(),
   MonitoringUtil.getJobMonitoringPageURL(
-  getReplacedByJob().getProjectId(), 
getReplacedByJob().getJobId()));
+  getReplacedByJob().getProjectId(),
+  getRegion(),
+  getReplacedByJob().getJobId()));
   break;
 default:
   LOG.info("Job {} failed with status {}.", getJobId(), state);
@@ -422,7 +431,8 @@ public class DataflowPipelineJob implements PipelineResult {
 "Failed to cancel job in state %s, "
 + "please go to the Developers Console to cancel it 
manually: %s",
 state,
-MonitoringUtil.getJobMonitoringPageURL(getProjectId(), 
getJobId()));
+MonitoringUtil.getJobMonitoringPageURL(
+getProjectId(), getRegion(), getJobId()));
 LOG.warn(errorMsg);
 throw new IOException(errorMsg, e);
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/111603a9/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
index 8935759..57a5ea5 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java
@@ -679,7 +679,8 @@ public class DataflowRunner extends 
PipelineRunner {
 }
 
 LOG.info("To access the Dataflow monitoring console, please navigate to 
{}",
-MonitoringUtil.getJobMonitoringPageURL(options.getProject(), 
jobResult.getId()));
+MonitoringUtil.getJobMonitoringPageURL(
+  options.getProject(), options.getRegion(), jobResult.getId()));
 System.out.println("Submitted job: " + jobResult.getId());
 
 LOG.info("To cancel the job using the 'gcloud' tool, run:\n> {}",

http://git-wip-us.apache.org/repos/asf/beam/blob/111603a9/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/MonitoringUtil.java
---

[32/50] [abbrv] beam git commit: Add GroupByKey tests for Multiple & Merging windows

2017-07-20 Thread jbonofre
Add GroupByKey tests for Multiple & Merging windows

This gives explicit coverage to a GroupByKey where the elements are in
multiple windows, or in merging windows.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1e947045
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1e947045
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1e947045

Branch: refs/heads/DSL_SQL
Commit: 1e947045a54bd59b449fd56f8f5f50879b6d9c4c
Parents: be5b934
Author: Thomas Groh 
Authored: Mon Jul 17 13:38:11 2017 -0700
Committer: Thomas Groh 
Committed: Tue Jul 18 17:52:55 2017 -0700

--
 .../beam/sdk/transforms/GroupByKeyTest.java | 156 +++
 1 file changed, 122 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1e947045/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java
index 4b5d5f5..8fcb4c0 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/GroupByKeyTest.java
@@ -23,18 +23,20 @@ import static org.hamcrest.CoreMatchers.equalTo;
 import static org.hamcrest.CoreMatchers.hasItem;
 import static org.hamcrest.Matchers.empty;
 import static 
org.hamcrest.collection.IsIterableContainingInAnyOrder.containsInAnyOrder;
-import static org.hamcrest.core.Is.is;
 import static org.junit.Assert.assertThat;
 
 import com.google.common.base.Function;
+import com.google.common.collect.ImmutableList;
 import com.google.common.collect.Iterables;
 import java.io.DataInputStream;
 import java.io.DataOutputStream;
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;
+import java.io.Serializable;
 import java.util.ArrayList;
 import java.util.Arrays;
+import java.util.Collection;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.ThreadLocalRandom;
@@ -56,9 +58,12 @@ import org.apache.beam.sdk.testing.ValidatesRunner;
 import org.apache.beam.sdk.transforms.display.DisplayData;
 import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
 import org.apache.beam.sdk.transforms.windowing.FixedWindows;
+import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
+import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
 import org.apache.beam.sdk.transforms.windowing.InvalidWindows;
 import org.apache.beam.sdk.transforms.windowing.Repeatedly;
 import org.apache.beam.sdk.transforms.windowing.Sessions;
+import org.apache.beam.sdk.transforms.windowing.SlidingWindows;
 import org.apache.beam.sdk.transforms.windowing.TimestampCombiner;
 import org.apache.beam.sdk.transforms.windowing.Window;
 import org.apache.beam.sdk.values.KV;
@@ -67,6 +72,7 @@ import org.apache.beam.sdk.values.PCollection;
 import org.apache.beam.sdk.values.TimestampedValue;
 import org.apache.beam.sdk.values.TypeDescriptor;
 import org.apache.beam.sdk.values.WindowingStrategy;
+import org.hamcrest.Matcher;
 import org.joda.time.Duration;
 import org.joda.time.Instant;
 import org.junit.Assert;
@@ -82,13 +88,13 @@ import org.junit.runners.JUnit4;
  */
 @RunWith(JUnit4.class)
 @SuppressWarnings({"rawtypes", "unchecked"})
-public class GroupByKeyTest {
+public class GroupByKeyTest implements Serializable {
 
   @Rule
-  public final TestPipeline p = TestPipeline.create();
+  public transient TestPipeline p = TestPipeline.create();
 
   @Rule
-  public ExpectedException thrown = ExpectedException.none();
+  public transient ExpectedException thrown = ExpectedException.none();
 
   @Test
   @Category(ValidatesRunner.class)
@@ -109,27 +115,18 @@ public class GroupByKeyTest {
 PCollection>> output =
 input.apply(GroupByKey.create());
 
-PAssert.that(output)
-.satisfies(new AssertThatHasExpectedContentsForTestGroupByKey());
+SerializableFunction>>, Void> 
checker =
+containsKvs(
+kv("k1", 3, 4),
+kv("k5", Integer.MIN_VALUE, Integer.MAX_VALUE),
+kv("k2", 66, -33),
+kv("k3", 0));
+PAssert.that(output).satisfies(checker);
+PAssert.that(output).inWindow(GlobalWindow.INSTANCE).satisfies(checker);
 
 p.run();
   }
 
-  static class AssertThatHasExpectedContentsForTestGroupByKey
-  implements SerializableFunction>>,
-  Void> {
-@Override
-public Void apply(Iterable>> actual) {
-  assertThat(actual, containsInAnyOrder(
-  isKv(is("k1"), containsInAnyOrder(3, 4)),
-  isKv(is("k5"), containsInAnyOrder(Integer.MAX_VALUE,
- 

[10/50] [abbrv] beam git commit: Closes #3578

2017-07-20 Thread jbonofre
Closes #3578


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/02905c27
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/02905c27
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/02905c27

Branch: refs/heads/DSL_SQL
Commit: 02905c27bfc59aa90ebe9c929fa060e705ff2fc3
Parents: 532256e e7059e5
Author: Robert Bradshaw 
Authored: Mon Jul 17 15:08:02 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon Jul 17 15:08:02 2017 -0700

--
 .../apache_beam/examples/snippets/snippets.py   |  2 +-
 sdks/python/apache_beam/transforms/core.py  |  2 +-
 sdks/python/apache_beam/transforms/trigger.py   | 21 +++-
 3 files changed, 18 insertions(+), 7 deletions(-)
--




[44/50] [abbrv] beam git commit: [BEAM-2636] Make sure we only override the correct class

2017-07-20 Thread jbonofre
[BEAM-2636] Make sure we only override the correct class


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d128c3b3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d128c3b3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d128c3b3

Branch: refs/heads/DSL_SQL
Commit: d128c3b378a58b0c2c31c2d30fd29e211e118324
Parents: eb0850e
Author: Sourabh Bajaj 
Authored: Wed Jul 19 10:08:14 2017 -0700
Committer: Ahmet Altay 
Committed: Wed Jul 19 14:07:54 2017 -0700

--
 sdks/python/apache_beam/runners/dataflow/dataflow_runner.py | 3 +++
 1 file changed, 3 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d128c3b3/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index 89c18d4..aec7d00 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -418,6 +418,9 @@ class DataflowRunner(PipelineRunner):
   PropertyNames.OUTPUT_NAME: PropertyNames.OUT}])
 
   def apply_WriteToBigQuery(self, transform, pcoll):
+# Make sure this is the WriteToBigQuery class that we expected
+if not isinstance(transform, beam.io.WriteToBigQuery):
+  return self.apply_PTransform(transform, pcoll)
 standard_options = pcoll.pipeline._options.view_as(StandardOptions)
 if standard_options.streaming:
   if (transform.write_disposition ==



[47/50] [abbrv] beam git commit: This closes #3585

2017-07-20 Thread jbonofre
This closes #3585


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c8e3744a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c8e3744a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c8e3744a

Branch: refs/heads/DSL_SQL
Commit: c8e3744adc6dfdfcfd32221bbbd05ed5b2511c81
Parents: 2e51bde 0a5157e
Author: chamik...@google.com 
Authored: Thu Jul 20 10:18:55 2017 -0700
Committer: chamik...@google.com 
Committed: Thu Jul 20 10:18:55 2017 -0700

--
 .../io/gcp/datastore/v1/datastoreio.py  | 84 ++---
 .../io/gcp/datastore/v1/datastoreio_test.py | 53 +--
 .../apache_beam/io/gcp/datastore/v1/helper.py   | 35 ++--
 .../apache_beam/io/gcp/datastore/v1/util.py | 95 
 .../io/gcp/datastore/v1/util_test.py| 67 ++
 5 files changed, 310 insertions(+), 24 deletions(-)
--




[21/50] [abbrv] beam git commit: [BEAM-2084] Adding querying facility for distribution metrics in Java

2017-07-20 Thread jbonofre
[BEAM-2084] Adding querying facility for distribution metrics in Java


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a48eefac
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a48eefac
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a48eefac

Branch: refs/heads/DSL_SQL
Commit: a48eeface8c5257f34e85c22f312ec03801b0f82
Parents: 7c36318
Author: Pablo 
Authored: Thu May 4 14:56:14 2017 -0700
Committer: Ben Chambers 
Committed: Tue Jul 18 09:58:47 2017 -0700

--
 .../org/apache/beam/examples/WordCount.java |   4 +
 pom.xml |   2 +-
 .../beam/runners/dataflow/DataflowMetrics.java  | 310 +--
 .../runners/dataflow/DataflowPipelineJob.java   |   4 +
 .../runners/dataflow/DataflowMetricsTest.java   | 174 ++-
 .../beam/sdk/metrics/MetricResultsMatchers.java |   2 +-
 6 files changed, 388 insertions(+), 108 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a48eefac/examples/java/src/main/java/org/apache/beam/examples/WordCount.java
--
diff --git 
a/examples/java/src/main/java/org/apache/beam/examples/WordCount.java 
b/examples/java/src/main/java/org/apache/beam/examples/WordCount.java
index bfa7eb3..2d568ce 100644
--- a/examples/java/src/main/java/org/apache/beam/examples/WordCount.java
+++ b/examples/java/src/main/java/org/apache/beam/examples/WordCount.java
@@ -21,6 +21,7 @@ import org.apache.beam.examples.common.ExampleUtils;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.io.TextIO;
 import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Distribution;
 import org.apache.beam.sdk.metrics.Metrics;
 import org.apache.beam.sdk.options.Default;
 import org.apache.beam.sdk.options.Description;
@@ -88,9 +89,12 @@ public class WordCount {
*/
   static class ExtractWordsFn extends DoFn {
 private final Counter emptyLines = Metrics.counter(ExtractWordsFn.class, 
"emptyLines");
+private final Distribution lineLenDist = Metrics.distribution(
+ExtractWordsFn.class, "lineLenDistro");
 
 @ProcessElement
 public void processElement(ProcessContext c) {
+  lineLenDist.update(c.element().length());
   if (c.element().trim().isEmpty()) {
 emptyLines.inc();
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/a48eefac/pom.xml
--
diff --git a/pom.xml b/pom.xml
index d9ab9ae..d27d367 100644
--- a/pom.xml
+++ b/pom.xml
@@ -112,7 +112,7 @@
 v1-rev6-1.22.0
 0.1.0
 v2-rev8-1.22.0
-v1b3-rev196-1.22.0
+v1b3-rev198-1.20.0
 0.5.160222
 1.4.0
 1.3.0

http://git-wip-us.apache.org/repos/asf/beam/blob/a48eefac/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
index 330cc7e..4c9c493 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowMetrics.java
@@ -19,7 +19,9 @@ package org.apache.beam.runners.dataflow;
 
 import static com.google.common.base.MoreObjects.firstNonNull;
 
+import com.google.api.client.util.ArrayMap;
 import com.google.api.services.dataflow.model.JobMetrics;
+import com.google.api.services.dataflow.model.MetricUpdate;
 import com.google.auto.value.AutoValue;
 import com.google.common.base.Objects;
 import com.google.common.collect.ImmutableList;
@@ -28,6 +30,7 @@ import java.util.Collections;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
+import javax.annotation.Nullable;
 import org.apache.beam.runners.core.construction.metrics.MetricFiltering;
 import org.apache.beam.runners.core.construction.metrics.MetricKey;
 import org.apache.beam.sdk.metrics.DistributionResult;
@@ -73,39 +76,6 @@ class DataflowMetrics extends MetricResults {
   }
 
   /**
-   * Build an immutable map that serves as a hash key for a metric update.
-   * @return a {@link MetricKey} that can be hashed and used to identify a 
metric.
-   */
-  private MetricKey metricHashKey(
-  com.google.api.services.dataflow.model.MetricUpdate metricUpdate) {
-String fullStepName = metricUpdate.getName().getContext().get("step");
-if (dataflowPipelineJob.transformStepNames == null
-|| 
!dataflowPipelineJob.transformStepNames.inverse().containsKey(fullStepName)) {
-  /

[38/50] [abbrv] beam git commit: [BEAM-2306] Add checkstyle check to fail the build when @Deprecated is used without @deprecated javadoc (or vice versa).

2017-07-20 Thread jbonofre
[BEAM-2306] Add checkstyle check to fail the build when @Deprecated is used 
without @deprecated javadoc (or vice versa).

The check is disabled for existing violations where reason for deprecation 
and/or alternative is not clear.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d2901145
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d2901145
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d2901145

Branch: refs/heads/DSL_SQL
Commit: d290114549c0b379774dbabe119a79d3ee1b2b56
Parents: 7fde976
Author: Alex Filatov 
Authored: Mon Jul 10 13:20:49 2017 +0300
Committer: Kenneth Knowles 
Committed: Wed Jul 19 09:03:31 2017 -0700

--
 .../construction/CreatePCollectionViewTranslation.java  | 11 ++-
 .../core/construction/PTransformTranslation.java|  4 
 .../beam/runners/core/InMemoryTimerInternals.java   |  9 +
 .../java/org/apache/beam/runners/core/StateTags.java|  3 +++
 .../beam/runners/direct/DirectTimerInternals.java   |  9 +
 .../translation/wrappers/streaming/DoFnOperator.java|  9 +
 .../apache/beam/runners/dataflow/DataflowRunner.java|  3 ++-
 .../options/DataflowPipelineWorkerPoolOptions.java  |  3 +++
 .../build-tools/src/main/resources/beam/checkstyle.xml  |  8 
 .../src/main/java/org/apache/beam/sdk/coders/Coder.java | 12 +++-
 .../java/org/apache/beam/sdk/coders/CoderRegistry.java  |  9 +
 .../main/java/org/apache/beam/sdk/io/AvroSource.java|  6 --
 .../main/java/org/apache/beam/sdk/testing/PAssert.java  |  5 +++--
 .../java/org/apache/beam/sdk/testing/StreamingIT.java   |  4 
 .../java/org/apache/beam/sdk/transforms/Combine.java|  1 -
 .../main/java/org/apache/beam/sdk/transforms/DoFn.java  |  3 +++
 .../main/java/org/apache/beam/sdk/transforms/View.java  |  2 +-
 .../beam/sdk/transforms/reflect/DoFnInvokers.java   |  9 -
 .../java/org/apache/beam/sdk/util/IdentityWindowFn.java |  1 -
 .../org/apache/beam/sdk/values/PCollectionViews.java|  1 -
 .../main/java/org/apache/beam/sdk/values/PValue.java|  4 ++--
 .../org/apache/beam/sdk/coders/DefaultCoderTest.java|  3 ++-
 .../org/apache/beam/fn/harness/BoundedSourceRunner.java |  6 +++---
 23 files changed, 95 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d2901145/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/CreatePCollectionViewTranslation.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/CreatePCollectionViewTranslation.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/CreatePCollectionViewTranslation.java
index 8fc99b9..c67d688 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/CreatePCollectionViewTranslation.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/CreatePCollectionViewTranslation.java
@@ -86,6 +86,10 @@ public class CreatePCollectionViewTranslation {
 PCollectionView.class.getSimpleName());
   }
 
+  /**
+   * @deprecated runners should move away from translating 
`CreatePCollectionView` and treat this
+   * as part of the translation for a `ParDo` side input.
+   */
   @Deprecated
   static class CreatePCollectionViewTranslator
   implements TransformPayloadTranslator> {
@@ -112,7 +116,12 @@ public class CreatePCollectionViewTranslation {
 }
   }
 
-  /** Registers {@link CreatePCollectionViewTranslator}. */
+  /**
+   * Registers {@link CreatePCollectionViewTranslator}.
+   *
+   * @deprecated runners should move away from translating 
`CreatePCollectionView` and treat this
+   * as part of the translation for a `ParDo` side input.
+   */
   @AutoService(TransformPayloadTranslatorRegistrar.class)
   @Deprecated
   public static class Registrar implements TransformPayloadTranslatorRegistrar 
{

http://git-wip-us.apache.org/repos/asf/beam/blob/d2901145/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
--
diff --git 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
index bae7b05..0b4a2ab 100644
--- 
a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java
+++ 
b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTra

[25/50] [abbrv] beam git commit: This closes #3577: Fix split package in SDK harness

2017-07-20 Thread jbonofre
This closes #3577: Fix split package in SDK harness


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2c2d8a35
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2c2d8a35
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2c2d8a35

Branch: refs/heads/DSL_SQL
Commit: 2c2d8a35f154a6c5615917a859f58f8fcf7f2789
Parents: 5a0b74c f1b4700
Author: Kenneth Knowles 
Authored: Tue Jul 18 12:52:32 2017 -0700
Committer: Kenneth Knowles 
Committed: Tue Jul 18 12:52:32 2017 -0700

--
 .../beam/fn/harness/BeamFnDataReadRunner.java   | 173 ++
 .../beam/fn/harness/BeamFnDataWriteRunner.java  | 159 ++
 .../beam/fn/harness/BoundedSourceRunner.java| 167 ++
 .../apache/beam/fn/harness/FnApiDoFnRunner.java | 548 +++
 .../fn/harness/PTransformRunnerFactory.java |  81 +++
 .../harness/control/ProcessBundleHandler.java   |   4 +-
 .../beam/runners/core/BeamFnDataReadRunner.java | 173 --
 .../runners/core/BeamFnDataWriteRunner.java | 159 --
 .../beam/runners/core/BoundedSourceRunner.java  | 167 --
 .../beam/runners/core/FnApiDoFnRunner.java  | 547 --
 .../runners/core/PTransformRunnerFactory.java   |  81 ---
 .../apache/beam/runners/core/package-info.java  |  22 -
 .../fn/harness/BeamFnDataReadRunnerTest.java| 281 ++
 .../fn/harness/BeamFnDataWriteRunnerTest.java   | 269 +
 .../fn/harness/BoundedSourceRunnerTest.java | 187 +++
 .../beam/fn/harness/FnApiDoFnRunnerTest.java| 210 +++
 .../control/ProcessBundleHandlerTest.java   |   2 +-
 .../runners/core/BeamFnDataReadRunnerTest.java  | 281 --
 .../runners/core/BeamFnDataWriteRunnerTest.java | 269 -
 .../runners/core/BoundedSourceRunnerTest.java   | 187 ---
 .../beam/runners/core/FnApiDoFnRunnerTest.java  | 210 ---
 21 files changed, 2078 insertions(+), 2099 deletions(-)
--




[34/50] [abbrv] beam git commit: [BEAM-2630] TestPipeline: construct job/app names based on Description in junit TestRule.

2017-07-20 Thread jbonofre
[BEAM-2630] TestPipeline: construct job/app names based on Description in junit 
TestRule.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bdf5bd6e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bdf5bd6e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bdf5bd6e

Branch: refs/heads/DSL_SQL
Commit: bdf5bd6e50fa9f44ad7560714cd41ac3f346d124
Parents: 0d927ef
Author: Pei He 
Authored: Mon Jul 17 23:34:27 2017 +0800
Committer: Pei He 
Committed: Wed Jul 19 11:30:12 2017 +0800

--
 .../apache/beam/sdk/testing/TestPipeline.java   | 63 
 .../beam/sdk/testing/TestPipelineTest.java  | 38 +---
 2 files changed, 13 insertions(+), 88 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/bdf5bd6e/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java
index 9206e04..34f1c83 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/TestPipeline.java
@@ -31,9 +31,7 @@ import com.google.common.base.Predicate;
 import com.google.common.base.Predicates;
 import com.google.common.base.Strings;
 import com.google.common.collect.FluentIterable;
-import com.google.common.collect.Iterators;
 import java.io.IOException;
-import java.lang.reflect.Method;
 import java.util.ArrayList;
 import java.util.Iterator;
 import java.util.LinkedList;
@@ -307,6 +305,7 @@ public class TestPipeline extends Pipeline implements 
TestRule {
 
   @Override
   public void evaluate() throws Throwable {
+
options.as(ApplicationNameOptions.class).setAppName(getAppName(description));
 
 setDeducedEnforcementLevel();
 
@@ -402,7 +401,6 @@ public class TestPipeline extends Pipeline implements 
TestRule {
   MAPPER.readValue(beamTestPipelineOptions, String[].class))
   .as(TestPipelineOptions.class);
 
-  options.as(ApplicationNameOptions.class).setAppName(getAppName());
   // If no options were specified, set some reasonable defaults
   if (Strings.isNullOrEmpty(beamTestPipelineOptions)) {
 // If there are no provided options, check to see if a dummy runner 
should be used.
@@ -450,56 +448,17 @@ public class TestPipeline extends Pipeline implements 
TestRule {
 }
   }
 
-  /** Returns the class + method name of the test, or a default name. */
-  private static String getAppName() {
-Optional stackTraceElement = findCallersStackTrace();
-if (stackTraceElement.isPresent()) {
-  String methodName = stackTraceElement.get().getMethodName();
-  String className = stackTraceElement.get().getClassName();
-  if (className.contains(".")) {
-className = className.substring(className.lastIndexOf(".") + 1);
-  }
-  return className + "-" + methodName;
-}
-return "UnitTest";
-  }
-
-  /** Returns the {@link StackTraceElement} of the calling class. */
-  private static Optional findCallersStackTrace() {
-Iterator elements =
-Iterators.forArray(Thread.currentThread().getStackTrace());
-// First find the TestPipeline class in the stack trace.
-while (elements.hasNext()) {
-  StackTraceElement next = elements.next();
-  if (TestPipeline.class.getName().equals(next.getClassName())) {
-break;
-  }
-}
-// Then find the first instance after that is not the TestPipeline
-Optional firstInstanceAfterTestPipeline = 
Optional.absent();
-while (elements.hasNext()) {
-  StackTraceElement next = elements.next();
-  if (!TestPipeline.class.getName().equals(next.getClassName())) {
-if (!firstInstanceAfterTestPipeline.isPresent()) {
-  firstInstanceAfterTestPipeline = Optional.of(next);
-}
-try {
-  Class nextClass = Class.forName(next.getClassName());
-  for (Method method : nextClass.getMethods()) {
-if (method.getName().equals(next.getMethodName())) {
-  if (method.isAnnotationPresent(org.junit.Test.class)) {
-return Optional.of(next);
-  } else if (method.isAnnotationPresent(org.junit.Before.class)) {
-break;
-  }
-}
-  }
-} catch (Throwable t) {
-  break;
-}
-  }
+  /** Returns the class + method name of the test. */
+  private String getAppName(Description description) {
+String methodName = description.getMethodName();
+Class testClass = description.getTestClass();
+if (testClass.isMemberClass()) {
+  return String

[42/50] [abbrv] beam git commit: Increase the gRPC message size to max value

2017-07-20 Thread jbonofre
Increase the gRPC message size to max value


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b424aa04
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b424aa04
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b424aa04

Branch: refs/heads/DSL_SQL
Commit: b424aa0409b507fe1c0c56a5f652d9be6458de66
Parents: 4d1db22
Author: Vikas Kedigehalli 
Authored: Tue Jul 18 10:06:46 2017 -0700
Committer: Luke Cwik 
Committed: Wed Jul 19 13:17:37 2017 -0700

--
 .../beam/fn/harness/channel/ManagedChannelFactory.java   | 6 ++
 sdks/python/apache_beam/runners/worker/data_plane.py | 8 +++-
 2 files changed, 13 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b424aa04/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/channel/ManagedChannelFactory.java
--
diff --git 
a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/channel/ManagedChannelFactory.java
 
b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/channel/ManagedChannelFactory.java
index d26f4a5..3138bab 100644
--- 
a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/channel/ManagedChannelFactory.java
+++ 
b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/channel/ManagedChannelFactory.java
@@ -61,6 +61,9 @@ public abstract class ManagedChannelFactory {
   ? EpollDomainSocketChannel.class : EpollSocketChannel.class)
   .eventLoopGroup(new EpollEventLoopGroup())
   .usePlaintext(true)
+  // Set the message size to max value here. The actual size is 
governed by the
+  // buffer size in the layers above.
+  .maxInboundMessageSize(Integer.MAX_VALUE)
   .build();
 }
   }
@@ -74,6 +77,9 @@ public abstract class ManagedChannelFactory {
 public ManagedChannel forDescriptor(ApiServiceDescriptor 
apiServiceDescriptor) {
   return ManagedChannelBuilder.forTarget(apiServiceDescriptor.getUrl())
   .usePlaintext(true)
+  // Set the message size to max value here. The actual size is 
governed by the
+  // buffer size in the layers above.
+  .maxInboundMessageSize(Integer.MAX_VALUE)
   .build();
 }
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/b424aa04/sdks/python/apache_beam/runners/worker/data_plane.py
--
diff --git a/sdks/python/apache_beam/runners/worker/data_plane.py 
b/sdks/python/apache_beam/runners/worker/data_plane.py
index 26f65ee..e713041 100644
--- a/sdks/python/apache_beam/runners/worker/data_plane.py
+++ b/sdks/python/apache_beam/runners/worker/data_plane.py
@@ -269,7 +269,13 @@ class GrpcClientDataChannelFactory(DataChannelFactory):
 url = remote_grpc_port.api_service_descriptor.url
 if url not in self._data_channel_cache:
   logging.info('Creating channel for %s', url)
-  grpc_channel = grpc.insecure_channel(url)
+  grpc_channel = grpc.insecure_channel(
+  url,
+  # Options to have no limits (-1) on the size of the messages
+  # received or sent over the data plane. The actual buffer size is
+  # controlled in a layer above.
+  options=[("grpc.max_receive_message_length", -1),
+   ("grpc.max_send_message_length", -1)])
   self._data_channel_cache[url] = GrpcClientDataChannel(
   beam_fn_api_pb2.BeamFnDataStub(grpc_channel))
 return self._data_channel_cache[url]



[19/50] [abbrv] beam git commit: Splits large TextIOTest into TextIOReadTest and TextIOWriteTest

2017-07-20 Thread jbonofre
Splits large TextIOTest into TextIOReadTest and TextIOWriteTest


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d495d151
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d495d151
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d495d151

Branch: refs/heads/DSL_SQL
Commit: d495d1511fe86a2199eb247df95ff0c876803c67
Parents: 0f06eb2
Author: Eugene Kirpichov 
Authored: Fri Jun 23 18:01:53 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Jul 17 17:08:00 2017 -0700

--
 .../org/apache/beam/sdk/io/TextIOReadTest.java  |  847 +++
 .../java/org/apache/beam/sdk/io/TextIOTest.java | 1353 +-
 .../org/apache/beam/sdk/io/TextIOWriteTest.java |  604 
 3 files changed, 1460 insertions(+), 1344 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d495d151/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOReadTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOReadTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOReadTest.java
new file mode 100644
index 000..8b53111
--- /dev/null
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOReadTest.java
@@ -0,0 +1,847 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static org.apache.beam.sdk.TestUtils.LINES_ARRAY;
+import static org.apache.beam.sdk.TestUtils.NO_LINES_ARRAY;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.AUTO;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.BZIP2;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.DEFLATE;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.GZIP;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.UNCOMPRESSED;
+import static org.apache.beam.sdk.io.TextIO.CompressionType.ZIP;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem;
+import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasValue;
+import static org.hamcrest.Matchers.containsInAnyOrder;
+import static org.hamcrest.Matchers.equalTo;
+import static org.hamcrest.Matchers.greaterThan;
+import static org.hamcrest.Matchers.hasItem;
+import static org.hamcrest.Matchers.hasSize;
+import static org.hamcrest.Matchers.startsWith;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Iterables;
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.PrintStream;
+import java.nio.charset.StandardCharsets;
+import java.nio.file.FileVisitResult;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.SimpleFileVisitor;
+import java.nio.file.attribute.BasicFileAttributes;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.Set;
+import java.util.zip.GZIPOutputStream;
+import java.util.zip.ZipEntry;
+import java.util.zip.ZipOutputStream;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.BoundedSource.BoundedReader;
+import org.apache.beam.sdk.io.TextIO.CompressionType;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.SourceTestUtils;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.ValidatesRunner;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.tra

[16/50] [abbrv] beam git commit: This closes #3463

2017-07-20 Thread jbonofre
This closes #3463


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0f06eb25
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0f06eb25
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0f06eb25

Branch: refs/heads/DSL_SQL
Commit: 0f06eb25bcc9c6bf9fb596a6ddc3a853f339b74d
Parents: 04d364d c5ebbff
Author: Thomas Groh 
Authored: Mon Jul 17 16:01:38 2017 -0700
Committer: Thomas Groh 
Committed: Mon Jul 17 16:01:38 2017 -0700

--
 .../beam/runners/dataflow/DataflowMetrics.java  | 30 +++
 .../runners/dataflow/DataflowMetricsTest.java   | 53 +++-
 2 files changed, 59 insertions(+), 24 deletions(-)
--




[GitHub] beam pull request #3603: redo PR 3553

2017-07-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3603


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[27/50] [abbrv] beam git commit: This closes #3589

2017-07-20 Thread jbonofre
This closes #3589


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2d5b6d74
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2d5b6d74
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2d5b6d74

Branch: refs/heads/DSL_SQL
Commit: 2d5b6d745cec07fc59c77eacad7bb90880a0946a
Parents: 2c2d8a3 d14cef0
Author: Ahmet Altay 
Authored: Tue Jul 18 13:11:34 2017 -0700
Committer: Ahmet Altay 
Committed: Tue Jul 18 13:11:34 2017 -0700

--
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--




[26/50] [abbrv] beam git commit: [BEAM-1963] Update Quickstart link in README

2017-07-20 Thread jbonofre
[BEAM-1963] Update Quickstart link in README


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d14cef0c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d14cef0c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d14cef0c

Branch: refs/heads/DSL_SQL
Commit: d14cef0c8fd963db9865ba6a5aad647fdc6f954e
Parents: 2c2d8a3
Author: Mark Liu 
Authored: Tue Jul 18 12:25:42 2017 -0700
Committer: Ahmet Altay 
Committed: Tue Jul 18 13:11:03 2017 -0700

--
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/d14cef0c/README.md
--
diff --git a/README.md b/README.md
index 52c056f..8190baf 100644
--- a/README.md
+++ b/README.md
@@ -69,7 +69,7 @@ Have ideas for new Runners? See the 
[JIRA](https://issues.apache.org/jira/browse
 
 ## Getting Started
 
-Please refer to the 
[Quickstart](http://beam.apache.org/get-started/quickstart/) available on our 
website.
+Please refer to the 
Quickstart[[Java](https://beam.apache.org/get-started/quickstart-java), 
[Python](https://beam.apache.org/get-started/quickstart-py)] available on our 
website.
 
 If you'd like to build and install the whole project from the source 
distribution, you may need some additional tools installed
 in your system. In a Debian-based distribution:
@@ -102,4 +102,4 @@ We also have a [contributor's 
guide](https://beam.apache.org/contribute/contribu
 
 * [Apache Beam](http://beam.apache.org)
 * [Overview](http://beam.apache.org/use/beam-overview/)
-* [Quickstart](http://beam.apache.org/use/quickstart/)
+* Quickstart: [Java](https://beam.apache.org/get-started/quickstart-java), 
[Python](https://beam.apache.org/get-started/quickstart-py)



[31/50] [abbrv] beam git commit: This closes #3475: [BEAM-2544] Fix flaky AvroIOTest

2017-07-20 Thread jbonofre
This closes #3475: [BEAM-2544] Fix flaky AvroIOTest


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/be5b9347
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/be5b9347
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/be5b9347

Branch: refs/heads/DSL_SQL
Commit: be5b9347bc44de1f042c76e1ba3f47a13772c72b
Parents: dd9e866 911edba
Author: Eugene Kirpichov 
Authored: Tue Jul 18 15:49:54 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Jul 18 15:49:54 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 46 +++-
 1 file changed, 25 insertions(+), 21 deletions(-)
--




[09/50] [abbrv] beam git commit: Improving labeling of side inputs for Dataflow

2017-07-20 Thread jbonofre
Improving labeling of side inputs for Dataflow


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7257507d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7257507d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7257507d

Branch: refs/heads/DSL_SQL
Commit: 7257507d939271a91287837c20fcdde37dc1ddeb
Parents: 7e4719c
Author: Pablo 
Authored: Fri Jul 7 13:49:47 2017 -0700
Committer: Robert Bradshaw 
Committed: Mon Jul 17 14:33:01 2017 -0700

--
 .../runners/dataflow/dataflow_runner.py   | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7257507d/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index 059e139..89c18d4 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -21,6 +21,7 @@ The runner will create a JSON description of the job graph 
and then submit it
 to the Dataflow Service for remote execution by a worker.
 """
 
+from collections import defaultdict
 import logging
 import threading
 import time
@@ -485,11 +486,24 @@ class DataflowRunner(PipelineRunner):
 si_dict = {}
 # We must call self._cache.get_pvalue exactly once due to refcounting.
 si_labels = {}
+full_label_counts = defaultdict(int)
 lookup_label = lambda side_pval: si_labels[side_pval]
 for side_pval in transform_node.side_inputs:
   assert isinstance(side_pval, AsSideInput)
-  si_label = 'SideInput-' + self._get_unique_step_name()
-  si_full_label = '%s/%s' % (transform_node.full_label, si_label)
+  step_number = self._get_unique_step_name()
+  si_label = 'SideInput-' + step_number
+  pcollection_label = '%s.%s' % (
+  side_pval.pvalue.producer.full_label.split('/')[-1],
+  side_pval.pvalue.tag if side_pval.pvalue.tag else 'out')
+  si_full_label = '%s/%s(%s.%s)' % (transform_node.full_label,
+side_pval.__class__.__name__,
+pcollection_label,
+full_label_counts[pcollection_label])
+
+  # Count the number of times the same PCollection is a side input
+  # to the same ParDo.
+  full_label_counts[pcollection_label] += 1
+
   self._add_singleton_step(
   si_label, si_full_label, side_pval.pvalue.tag,
   self._cache.get_pvalue(side_pval.pvalue))



[12/50] [abbrv] beam git commit: [BEAM-933] Fix and enable findbugs in Java examples

2017-07-20 Thread jbonofre
[BEAM-933] Fix and enable findbugs in Java examples


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f6daad4f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f6daad4f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f6daad4f

Branch: refs/heads/DSL_SQL
Commit: f6daad4fc95cb633794c60254c6c335602f1df31
Parents: 02905c2
Author: eralmas7 
Authored: Sun Jul 9 11:50:52 2017 +0530
Committer: Kenneth Knowles 
Committed: Mon Jul 17 15:52:08 2017 -0700

--
 examples/java/pom.xml   | 12 --
 .../apache/beam/examples/complete/TfIdf.java|  3 ++-
 .../examples/complete/TopWikipediaSessions.java | 24 ++--
 .../beam/examples/complete/TrafficRoutes.java   | 19 
 .../beam/examples/cookbook/TriggerExample.java  |  6 +++--
 5 files changed, 37 insertions(+), 27 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f6daad4f/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index ae64a79..12fe06f 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -365,18 +365,6 @@
   
 
   
-
-  
-
-
-  org.codehaus.mojo
-  findbugs-maven-plugin
-  
-true
-  
-
-  
-
 
 
   

http://git-wip-us.apache.org/repos/asf/beam/blob/f6daad4f/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java
--
diff --git 
a/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java 
b/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java
index 7552b94..435ffab 100644
--- a/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java
+++ b/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java
@@ -17,6 +17,7 @@
  */
 package org.apache.beam.examples.complete;
 
+import com.google.common.base.Optional;
 import java.io.File;
 import java.io.IOException;
 import java.net.URI;
@@ -121,7 +122,7 @@ public class TfIdf {
 Set uris = new HashSet<>();
 if (absoluteUri.getScheme().equals("file")) {
   File directory = new File(absoluteUri);
-  for (String entry : directory.list()) {
+  for (String entry : Optional.fromNullable(directory.list()).or(new 
String[] {})) {
 File path = new File(directory, entry);
 uris.add(path.toURI());
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/f6daad4f/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
--
diff --git 
a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
 
b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
index 478e2dc..3691e53 100644
--- 
a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
+++ 
b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java
@@ -162,17 +162,18 @@ public class TopWikipediaSessions {
 public PCollection expand(PCollection input) {
   return input
   .apply(ParDo.of(new ExtractUserAndTimestamp()))
-
-  .apply("SampleUsers", ParDo.of(
-  new DoFn() {
-@ProcessElement
-public void processElement(ProcessContext c) {
-  if (Math.abs(c.element().hashCode()) <= Integer.MAX_VALUE * 
samplingThreshold) {
-c.output(c.element());
-  }
-}
-  }))
-
+  .apply(
+  "SampleUsers",
+  ParDo.of(
+  new DoFn() {
+@ProcessElement
+public void processElement(ProcessContext c) {
+  if (Math.abs((long) c.element().hashCode())
+  <= Integer.MAX_VALUE * samplingThreshold) {
+c.output(c.element());
+  }
+}
+  }))
   .apply(new ComputeSessions())
   .apply("SessionsToStrings", ParDo.of(new SessionsToStringsDoFn()))
   .apply(new TopPerMonth())
@@ -191,7 +192,6 @@ public class TopWikipediaSessions {
 @Default.String(EXPORTED_WIKI_TABLE)
 String getInput();
 void setInput(String value);
-
 @Description("File to output results to")
 @Validation.Required
 String getOutput();

http://git-wip-us.apache.org/repos/asf/beam/blob/f6daad4f/examples/java/src/main/java/org/apache/beam/examples/complete/TrafficRoutes.java
--
diff --git 

[30/50] [abbrv] beam git commit: [BEAM-2544] Fix flaky AvroIOTest by eliminating race condition in "write then read" tests.

2017-07-20 Thread jbonofre
[BEAM-2544] Fix flaky AvroIOTest by eliminating race condition in "write then 
read" tests.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/911edbad
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/911edbad
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/911edbad

Branch: refs/heads/DSL_SQL
Commit: 911edbade388a63626e0ad6f8b7c2ad7a9f9b7c2
Parents: dd9e866
Author: Alex Filatov 
Authored: Thu Jun 29 23:23:04 2017 +0300
Committer: Eugene Kirpichov 
Committed: Tue Jul 18 15:49:44 2017 -0700

--
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 46 +++-
 1 file changed, 25 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/911edbad/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
index 4a1386c..4380c57 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroIOTest.java
@@ -90,7 +90,11 @@ import org.junit.runners.JUnit4;
 @RunWith(JUnit4.class)
 public class AvroIOTest {
 
-  @Rule public TestPipeline p = TestPipeline.create();
+  @Rule
+  public TestPipeline writePipeline = TestPipeline.create();
+
+  @Rule
+  public TestPipeline readPipeline = TestPipeline.create();
 
   @Rule public TemporaryFolder tmpFolder = new TemporaryFolder();
 
@@ -144,15 +148,15 @@ public class AvroIOTest {
 ImmutableList.of(new GenericClass(3, "hi"), new GenericClass(5, 
"bar"));
 File outputFile = tmpFolder.newFile("output.avro");
 
-p.apply(Create.of(values))
+writePipeline.apply(Create.of(values))
 
.apply(AvroIO.write(GenericClass.class).to(outputFile.getAbsolutePath()).withoutSharding());
-p.run();
+writePipeline.run().waitUntilFinish();
 
 PCollection input =
-
p.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
+
readPipeline.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
 
 PAssert.that(input).containsInAnyOrder(values);
-p.run();
+readPipeline.run();
   }
 
   @Test
@@ -163,19 +167,19 @@ public class AvroIOTest {
 ImmutableList.of(new GenericClass(3, "hi"), new GenericClass(5, 
"bar"));
 File outputFile = tmpFolder.newFile("output.avro");
 
-p.apply(Create.of(values))
+writePipeline.apply(Create.of(values))
 .apply(
 AvroIO.write(GenericClass.class)
 .to(outputFile.getAbsolutePath())
 .withoutSharding()
 .withCodec(CodecFactory.deflateCodec(9)));
-p.run();
+writePipeline.run().waitUntilFinish();
 
 PCollection input =
-
p.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
+
readPipeline.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
 
 PAssert.that(input).containsInAnyOrder(values);
-p.run();
+readPipeline.run();
 DataFileStream dataFileStream =
 new DataFileStream(new FileInputStream(outputFile), new 
GenericDatumReader());
 assertEquals("deflate", dataFileStream.getMetaString("avro.codec"));
@@ -189,19 +193,19 @@ public class AvroIOTest {
 ImmutableList.of(new GenericClass(3, "hi"), new GenericClass(5, 
"bar"));
 File outputFile = tmpFolder.newFile("output.avro");
 
-p.apply(Create.of(values))
+writePipeline.apply(Create.of(values))
 .apply(
 AvroIO.write(GenericClass.class)
 .to(outputFile.getAbsolutePath())
 .withoutSharding()
 .withCodec(CodecFactory.nullCodec()));
-p.run();
+writePipeline.run().waitUntilFinish();
 
 PCollection input =
-
p.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
+
readPipeline.apply(AvroIO.read(GenericClass.class).from(outputFile.getAbsolutePath()));
 
 PAssert.that(input).containsInAnyOrder(values);
-p.run();
+readPipeline.run();
 DataFileStream dataFileStream =
 new DataFileStream(new FileInputStream(outputFile), new 
GenericDatumReader());
 assertEquals("null", dataFileStream.getMetaString("avro.codec"));
@@ -261,18 +265,18 @@ public class AvroIOTest {
 ImmutableList.of(new GenericClass(3, "hi"), new GenericClass(5, 
"bar"));
 File outputFile = tmpFolder.newFile("output.avro");
 
-p.apply(Create.of(values))
+writePipeline.apply(Create.of(values))
 
.apply(AvroIO.write(GenericClass.class).to(outputFile.getAbsolutePath()).withoutSharding());
-p.run();
+writePipeline.run().waitUntilFinish();

[03/50] [abbrv] beam git commit: datastoreio: retry on socket errors

2017-07-20 Thread jbonofre
datastoreio: retry on socket errors


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/095e7916
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/095e7916
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/095e7916

Branch: refs/heads/DSL_SQL
Commit: 095e7916d23e49859acb42b9316ddf4222fbc9d9
Parents: ae0de1b
Author: Vikas Kedigehalli 
Authored: Thu Jul 13 10:29:23 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jul 17 09:16:06 2017 -0700

--
 .../apache_beam/io/gcp/datastore/v1/helper.py   |  8 +++
 .../io/gcp/datastore/v1/helper_test.py  | 22 
 2 files changed, 26 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/095e7916/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
--
diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py 
b/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
index f977536..996dace 100644
--- a/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
+++ b/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
@@ -19,6 +19,9 @@
 
 For internal use only; no backwards-compatibility guarantees.
 """
+
+import errno
+from socket import error as SocketError
 import sys
 
 # Protect against environments where datastore library is not available.
@@ -130,6 +133,11 @@ def retry_on_rpc_error(exception):
 err_code == code_pb2.UNAVAILABLE or
 err_code == code_pb2.UNKNOWN or
 err_code == code_pb2.INTERNAL)
+
+  if isinstance(exception, SocketError):
+return (exception.errno == errno.ECONNRESET or
+exception.errno == errno.ETIMEDOUT)
+
   return False
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/095e7916/sdks/python/apache_beam/io/gcp/datastore/v1/helper_test.py
--
diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1/helper_test.py 
b/sdks/python/apache_beam/io/gcp/datastore/v1/helper_test.py
index a804c09..a8b1bb1 100644
--- a/sdks/python/apache_beam/io/gcp/datastore/v1/helper_test.py
+++ b/sdks/python/apache_beam/io/gcp/datastore/v1/helper_test.py
@@ -16,6 +16,9 @@
 #
 
 """Tests for datastore helper."""
+import errno
+import random
+from socket import error as SocketError
 import sys
 import unittest
 
@@ -49,6 +52,16 @@ class HelperTest(unittest.TestCase):
 self._query = query_pb2.Query()
 self._query.kind.add().name = 'dummy_kind'
 patch_retry(self, helper)
+self._retriable_errors = [
+RPCError("dummy", code_pb2.INTERNAL, "failed"),
+SocketError(errno.ECONNRESET, "Connection Reset"),
+SocketError(errno.ETIMEDOUT, "Timed out")
+]
+
+self._non_retriable_errors = [
+RPCError("dummy", code_pb2.UNAUTHENTICATED, "failed"),
+SocketError(errno.EADDRNOTAVAIL, "Address not available")
+]
 
   def permanent_retriable_datastore_failure(self, req):
 raise RPCError("dummy", code_pb2.UNAVAILABLE, "failed")
@@ -56,12 +69,12 @@ class HelperTest(unittest.TestCase):
   def transient_retriable_datastore_failure(self, req):
 if self._transient_fail_count:
   self._transient_fail_count -= 1
-  raise RPCError("dummy", code_pb2.INTERNAL, "failed")
+  raise random.choice(self._retriable_errors)
 else:
   return datastore_pb2.RunQueryResponse()
 
   def non_retriable_datastore_failure(self, req):
-raise RPCError("dummy", code_pb2.UNAUTHENTICATED, "failed")
+raise random.choice(self._non_retriable_errors)
 
   def test_query_iterator(self):
 self._mock_datastore.run_query.side_effect = (
@@ -76,7 +89,7 @@ class HelperTest(unittest.TestCase):
 self.transient_retriable_datastore_failure)
 query_iterator = helper.QueryIterator("project", None, self._query,
   self._mock_datastore)
-fail_count = 2
+fail_count = 5
 self._transient_fail_count = fail_count
 for _ in query_iterator:
   pass
@@ -89,7 +102,8 @@ class HelperTest(unittest.TestCase):
 self.non_retriable_datastore_failure)
 query_iterator = helper.QueryIterator("project", None, self._query,
   self._mock_datastore)
-self.assertRaises(RPCError, iter(query_iterator).next)
+self.assertRaises(tuple(map(type, self._non_retriable_errors)),
+  iter(query_iterator).next)
 self.assertEqual(1, len(self._mock_datastore.run_query.call_args_list))
 
   def test_query_iterator_with_single_batch(self):



  1   2   >