[GitHub] incubator-beam pull request #1680: [BEAM-XXX] Make KVCoder more efficient by...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1680 [BEAM-XXX] Make KVCoder more efficient by removing unnecessary nesting See [BEAM-469] for more information about why this is correct. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam efficient-nested-coders Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1680.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1680 commit 621e8250c9535d773c4f4440a34ea0833912b51f Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-21T23:37:49Z [BEAM-XXX] Make KVCoder more efficient by removing unnecessary nesting See [BEAM-469] for more information about why this is correct. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1679: [BEAM-1201] Remove BoundedSource.produces...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1679 [BEAM-1201] Remove BoundedSource.producesSortedKeys R: @jkff You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam remove-produces-sorted-keys Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1679.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1679 commit ee15138543f8b9926466cf4e4dc6857b3173345e Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-21T23:32:38Z [BEAM-1201] Remove BoundedSource.producesSortedKeys Unused and unclear; for more information see the linked JIRA. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1671: [BEAM-1194] Add DataflowLocationIT
Github user dhalperi closed the pull request at: https://github.com/apache/incubator-beam/pull/1671 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1671: [BEAM-1194] Add DataflowLocationIT
GitHub user dhalperi reopened a pull request: https://github.com/apache/incubator-beam/pull/1671 [BEAM-1194] Add DataflowLocationIT Opening PR to test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam trailing-slash Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1671.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1671 commit 7ad2ebd8e96a469e38c156de2d2701e500c3d955 Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-21T01:56:00Z Add DataflowLocationIT --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1671: [BEAM-XXX] Add DataflowLocationIT
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1671 [BEAM-XXX] Add DataflowLocationIT Opening PR to test. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam trailing-slash Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1671.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1671 commit ee8c98806e348a90454676ed31c0ab5489c2c62b Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-21T01:56:00Z Add DataflowLocationIT --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1656
Closes #1656 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b3de17b3 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b3de17b3 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b3de17b3 Branch: refs/heads/master Commit: b3de17b3d1a394563d680af9ac34ecfe801c25c2 Parents: 28d7913 85422f9 Author: Dan HalperinAuthored: Mon Dec 19 16:24:09 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 19 16:24:09 2016 -0800 -- .jenkins/common_job_properties.groovy | 4 1 file changed, 4 insertions(+) --
[1/2] incubator-beam git commit: Disable automatic archiving of Maven builds
Repository: incubator-beam Updated Branches: refs/heads/master 28d7913be -> b3de17b3d Disable automatic archiving of Maven builds >From the Web UI: > If checked, Jenkins will not automatically archive all artifacts generated by > this project. If you wish to archive the results of this build within > Jenkins, you will need to use the "Archive the artifacts" post-build action > below. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/85422f99 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/85422f99 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/85422f99 Branch: refs/heads/master Commit: 85422f999752fc827113609be0ba72cc64a9d3b3 Parents: 28d7913 Author: Daniel HalperinAuthored: Mon Dec 19 11:13:49 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 19 16:16:48 2016 -0800 -- .jenkins/common_job_properties.groovy | 4 1 file changed, 4 insertions(+) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/85422f99/.jenkins/common_job_properties.groovy -- diff --git a/.jenkins/common_job_properties.groovy b/.jenkins/common_job_properties.groovy index 3880236..e1688ec 100644 --- a/.jenkins/common_job_properties.groovy +++ b/.jenkins/common_job_properties.groovy @@ -140,6 +140,10 @@ class common_job_properties { context.rootPOM('pom.xml') // Use a repository local to the workspace for better isolation of jobs. context.localRepository(LocalRepositoryLocation.LOCAL_TO_WORKSPACE) +// Disable archiving the built artifacts by default, as this is slow and flaky. +// We can usually recreate them easily, and we can also opt-in individual jobs +// to artifact archiving. +context.archivingDisabled(true) } // Sets common config for PreCommit jobs.
[GitHub] incubator-beam pull request #1656: Disable automatic archiving of Maven buil...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1656 Disable automatic archiving of Maven builds From the Web UI: > If checked, Jenkins will not automatically archive all artifacts generated by this project. If you wish to archive the results of this build within Jenkins, you will need to use the "Archive the artifacts" post-build action below. Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam patch-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1656.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1656 commit b489873cd3168ae8b358c36315e54c41c556e0e3 Author: Daniel Halperin <dhalp...@users.noreply.github.com> Date: 2016-12-19T19:13:49Z Disable automatic archiving of Maven builds From the Web UI: > If checked, Jenkins will not automatically archive all artifacts generated by this project. If you wish to archive the results of this build within Jenkins, you will need to use the "Archive the artifacts" post-build action below. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Closes #1651
Repository: incubator-beam Updated Branches: refs/heads/master 4206408bf -> 5255a3381 Closes #1651 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5255a338 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5255a338 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5255a338 Branch: refs/heads/master Commit: 5255a33812758bbb9d081962675bd0180802c82b Parents: 4206408 5fb4f5d Author: Dan HalperinAuthored: Fri Dec 16 23:53:49 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 23:53:49 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 23 +-- .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 72 2 files changed, 63 insertions(+), 32 deletions(-) --
[GitHub] incubator-beam pull request #1653: [BEAM-545] PipelineOptions: fix parameter...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1653 [BEAM-545] PipelineOptions: fix parameter name Seems like a cut and paste error. R: @peihe You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1653.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1653 commit 65afadb15cc320acc4e1562aec0de0c82fd102bd Author: Daniel Halperin <dhalp...@users.noreply.github.com> Date: 2016-12-17T07:47:56Z [BEAM-545] PipelineOptions: fix parameter name Seems like a cut and paste error. R: @peihe --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1650
Closes #1650 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7d1976b2 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7d1976b2 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7d1976b2 Branch: refs/heads/master Commit: 7d1976b2628e0d560df57610b8ed8a6b8443fb7b Parents: abdbee6 6a4a699 Author: Dan HalperinAuthored: Fri Dec 16 17:41:51 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 17:41:51 2016 -0800 -- .../core/src/main/java/org/apache/beam/sdk/transforms/View.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[1/2] incubator-beam git commit: View.asMap: minor javadoc fixes
Repository: incubator-beam Updated Branches: refs/heads/master abdbee61c -> 7d1976b26 View.asMap: minor javadoc fixes Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6a4a6997 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6a4a6997 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6a4a6997 Branch: refs/heads/master Commit: 6a4a699796fcf8a294ee0886658e6597bede0207 Parents: abdbee6 Author: Dan HalperinAuthored: Fri Dec 16 16:26:27 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 16:49:19 2016 -0800 -- .../core/src/main/java/org/apache/beam/sdk/transforms/View.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6a4a6997/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java index 126679d..d18a0c6 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/View.java @@ -189,9 +189,9 @@ public class View { /** * Returns a {@link View.AsMap} transform that takes a - * {@link PCollection PCollectionKVK V} as + * {@link PCollection PCollectionKVK, V} as * input and produces a {@link PCollectionView} mapping each window to - * a {@link Map MapK, V}. It is required that each key of the input be + * a {@link Map MapK, V}. It is required that each key of the input be * associated with a single value, per window. If this is not the case, precede this * view with {@code Combine.perKey}, as in the example below, or alternatively * use {@link View#asMultimap()}.
[GitHub] incubator-beam pull request #1650: [BEAM-475] View.asMap: minor javadoc fixe...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1650 [BEAM-475] View.asMap: minor javadoc fixes You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam javadoc-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1650.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1650 commit b0ceccb6659d60822aa9b8a84b93384c802bdefa Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-17T00:26:27Z View.asMap: minor javadoc fixes --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: [BEAM-1108] Remove outdated language about experimental autoscaling
Repository: incubator-beam Updated Branches: refs/heads/master 33b7ca792 -> beed6080b [BEAM-1108] Remove outdated language about experimental autoscaling Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b723 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b723 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b723 Branch: refs/heads/master Commit: b7231fd4d79063b2feae1ac59d5c54f2b337 Parents: 33b7ca7 Author: Dan HalperinAuthored: Fri Dec 16 08:23:22 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 12:47:52 2016 -0800 -- .../apache/beam/examples/complete/TopWikipediaSessions.java | 9 - 1 file changed, 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b723/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java -- diff --git a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java index df7f81e..8e0b815 100644 --- a/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java +++ b/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java @@ -62,15 +62,6 @@ import org.joda.time.Instant; * * The default input is {@code gs://apache-beam-samples/wikipedia_edits/*.json} and can be * overridden with {@code --input}. - * - * The input for this example is large enough that it's a good place to enable (experimental) - * autoscaling: - * {@code - * --autoscalingAlgorithm=BASIC - * --maxNumWorkers=20 - * } - * - * This will automatically scale the number of workers up over time until the job completes. */ public class TopWikipediaSessions { private static final String EXPORTED_WIKI_TABLE =
[2/2] incubator-beam git commit: Closes #1646
Closes #1646 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/beed6080 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/beed6080 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/beed6080 Branch: refs/heads/master Commit: beed6080b157ed6ff6157386927c3455c9aa347b Parents: 33b7ca7 b72 Author: Dan HalperinAuthored: Fri Dec 16 12:47:53 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 12:47:53 2016 -0800 -- .../apache/beam/examples/complete/TopWikipediaSessions.java | 9 - 1 file changed, 9 deletions(-) --
[2/2] incubator-beam git commit: [BEAM-450] Shade modules to separate paths
[BEAM-450] Shade modules to separate paths Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/235027b9 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/235027b9 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/235027b9 Branch: refs/heads/master Commit: 235027b9fc6e322c469b099d168e60bf72a567db Parents: 5ebbd50 Author: Dan HalperinAuthored: Thu Dec 15 13:50:39 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 08:28:40 2016 -0800 -- runners/core-java/pom.xml | 4 ++-- runners/google-cloud-dataflow-java/pom.xml | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/235027b9/runners/core-java/pom.xml -- diff --git a/runners/core-java/pom.xml b/runners/core-java/pom.xml index b5c610b..704aeaf 100644 --- a/runners/core-java/pom.xml +++ b/runners/core-java/pom.xml @@ -90,11 +90,11 @@ the second relocation. --> com.google.common - org.apache.beam.sdk.repackaged.com.google.common + org.apache.beam.runners.core.repackaged.com.google.common com.google.thirdparty - org.apache.beam.sdk.repackaged.com.google.thirdparty + org.apache.beam.runners.core.repackaged.com.google.thirdparty http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/235027b9/runners/google-cloud-dataflow-java/pom.xml -- diff --git a/runners/google-cloud-dataflow-java/pom.xml b/runners/google-cloud-dataflow-java/pom.xml index 77187d6..46ac7ef 100644 --- a/runners/google-cloud-dataflow-java/pom.xml +++ b/runners/google-cloud-dataflow-java/pom.xml @@ -133,15 +133,15 @@ com.google.common.**.testing.* - org.apache.beam.sdk.repackaged.com.google.common + org.apache.beam.runners.dataflow.repackaged.com.google.common com.google.thirdparty - org.apache.beam.sdk.repackaged.com.google.thirdparty + org.apache.beam.runners.dataflow.repackaged.com.google.thirdparty com.google.cloud.bigtable - org.apache.beam.sdk.repackaged.com.google.cloud.bigtable + org.apache.beam.runners.dataflow.repackaged.com.google.cloud.bigtable com.google.cloud.bigtable.config.BigtableOptions* com.google.cloud.bigtable.config.CredentialOptions*
[1/2] incubator-beam git commit: Closes #1633
Repository: incubator-beam Updated Branches: refs/heads/master 5ebbd500c -> 33b7ca792 Closes #1633 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/33b7ca79 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/33b7ca79 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/33b7ca79 Branch: refs/heads/master Commit: 33b7ca7924e6f3ac7e5a9380e6330de3c316c138 Parents: 5ebbd50 235027b Author: Dan HalperinAuthored: Fri Dec 16 08:28:40 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 16 08:28:40 2016 -0800 -- runners/core-java/pom.xml | 4 ++-- runners/google-cloud-dataflow-java/pom.xml | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) --
[GitHub] incubator-beam pull request #1646: [BEAM-1108] Remove outdated language abou...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1646 [BEAM-1108] Remove outdated language about experimental autoscaling R: @lukecwik or @kennknowles or @tgroh You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam autoscaling-language Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1646.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1646 commit 5d33aa79663a3f30dbd11ae9e8733181edde1a2c Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-16T16:23:22Z [BEAM-1108] Remove outdated language about experimental autoscaling --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: [BEAM-1022] Add testing coverage for BigQuery streaming writes
Repository: incubator-beam Updated Branches: refs/heads/master 3c4b6930e -> 3e1a62815 [BEAM-1022] Add testing coverage for BigQuery streaming writes Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/51900830 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/51900830 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/51900830 Branch: refs/heads/master Commit: 519008303f9cefd3f8f4a8a7a98a9a79717f57ff Parents: 3c4b693 Author: Reuven LaxAuthored: Thu Nov 17 10:57:41 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 15 11:45:45 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 48 +- .../sdk/io/gcp/bigquery/BigQueryServices.java | 7 +- .../io/gcp/bigquery/BigQueryServicesImpl.java | 121 - .../io/gcp/bigquery/BigQueryTableInserter.java | 217 - .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 456 +++ .../gcp/bigquery/BigQueryServicesImplTest.java | 139 +- .../gcp/bigquery/BigQueryTableInserterTest.java | 245 -- .../sdk/io/gcp/bigquery/BigQueryUtilTest.java | 50 +- 8 files changed, 655 insertions(+), 628 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/51900830/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java index 0be8567..28049ed 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java @@ -24,7 +24,6 @@ import static com.google.common.base.Preconditions.checkState; import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; import com.google.api.client.json.JsonFactory; -import com.google.api.services.bigquery.Bigquery; import com.google.api.services.bigquery.model.Job; import com.google.api.services.bigquery.model.JobConfigurationExtract; import com.google.api.services.bigquery.model.JobConfigurationLoad; @@ -33,6 +32,7 @@ import com.google.api.services.bigquery.model.JobConfigurationTableCopy; import com.google.api.services.bigquery.model.JobReference; import com.google.api.services.bigquery.model.JobStatistics; import com.google.api.services.bigquery.model.JobStatus; +import com.google.api.services.bigquery.model.Table; import com.google.api.services.bigquery.model.TableReference; import com.google.api.services.bigquery.model.TableRow; import com.google.api.services.bigquery.model.TableSchema; @@ -1796,8 +1796,8 @@ public class BigQueryIO { * Does not modify this object. */ public Bound withCreateDisposition(CreateDisposition createDisposition) { -return new Bound(name, jsonTableRef, tableRefFunction, jsonSchema, createDisposition, -writeDisposition, validate, bigQueryServices); +return new Bound(name, jsonTableRef, tableRefFunction, jsonSchema, +createDisposition, writeDisposition, validate, bigQueryServices); } /** @@ -1806,8 +1806,8 @@ public class BigQueryIO { * Does not modify this object. */ public Bound withWriteDisposition(WriteDisposition writeDisposition) { -return new Bound(name, jsonTableRef, tableRefFunction, jsonSchema, createDisposition, -writeDisposition, validate, bigQueryServices); +return new Bound(name, jsonTableRef, tableRefFunction, jsonSchema, +createDisposition, writeDisposition, validate, bigQueryServices); } /** @@ -2136,7 +2136,8 @@ public class BigQueryIO { /** Returns the table reference, or {@code null}. */ @Nullable public ValueProvider getTable() { -return NestedValueProvider.of(jsonTableRef, new JsonTableRefToTableRef()); +return jsonTableRef == null ? null : +NestedValueProvider.of(jsonTableRef, new JsonTableRefToTableRef()); } /** Returns {@code true} if table validation is enabled. */ @@ -2550,6 +2551,13 @@ public class BigQueryIO { } } + /** + * Clear the cached map of created tables. Used for testing. + */ + @VisibleForTesting + static void clearCreatedTables() { +StreamingWriteFn.clearCreatedTables(); + } / /** @@ -2585,6 +2593,15 @@ public class BigQueryIO { this.bqServices =
[2/2] incubator-beam git commit: Closes #1400
Closes #1400 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3e1a6281 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3e1a6281 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3e1a6281 Branch: refs/heads/master Commit: 3e1a62815ca467951647788d59c00921bd02803a Parents: 3c4b693 5190083 Author: Dan HalperinAuthored: Thu Dec 15 11:46:03 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 15 11:46:03 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 48 +- .../sdk/io/gcp/bigquery/BigQueryServices.java | 7 +- .../io/gcp/bigquery/BigQueryServicesImpl.java | 121 - .../io/gcp/bigquery/BigQueryTableInserter.java | 217 - .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 456 +++ .../gcp/bigquery/BigQueryServicesImplTest.java | 139 +- .../gcp/bigquery/BigQueryTableInserterTest.java | 245 -- .../sdk/io/gcp/bigquery/BigQueryUtilTest.java | 50 +- 8 files changed, 655 insertions(+), 628 deletions(-) --
[2/2] incubator-beam git commit: Fail to split in FileBasedSource if filePattern expands to empty.
Fail to split in FileBasedSource if filePattern expands to empty. Typically, input file patterns are validated during Pipeline construction, but standard Read transforms include an option to disable validation. This is generally useful but can lead to cases where a Pipeline executes successfully with empty inputs. This changes the behavior to fail execution on empty file-based inputs even when validation is disabled. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/cf4229a6 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/cf4229a6 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/cf4229a6 Branch: refs/heads/master Commit: cf4229a6d7e1b79416a1be4e78f5c90c38dd77b0 Parents: 46566fc Author: Scott WegnerAuthored: Wed Dec 14 14:52:34 2016 -0800 Committer: Dan Halperin Committed: Wed Dec 14 16:35:59 2016 -0800 -- .../java/org/apache/beam/sdk/io/FileBasedSource.java| 6 +- .../org/apache/beam/sdk/io/FileBasedSourceTest.java | 12 2 files changed, 17 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cf4229a6/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java index d835f9b..5659d5b 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java @@ -331,7 +331,11 @@ public abstract class FileBasedSource extends OffsetBasedSource { try { checkState(fileOrPatternSpec.isAccessible(), "Bundle splitting should only happen at execution time."); -for (final String file : FileBasedSource.expandFilePattern(fileOrPatternSpec.get())) { +Collection expandedFiles = +FileBasedSource.expandFilePattern(fileOrPatternSpec.get()); +checkArgument(!expandedFiles.isEmpty(), +"Unable to find any files matching %s", fileOrPatternSpec.get()); +for (final String file : expandedFiles) { futures.add(createFutureForFileSplit(file, desiredBundleSizeBytes, options, service)); } List> splitResults = http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cf4229a6/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileBasedSourceTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileBasedSourceTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileBasedSourceTest.java index a065191..f4b8574 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileBasedSourceTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileBasedSourceTest.java @@ -59,6 +59,7 @@ import org.apache.beam.sdk.values.PCollection; import org.junit.Rule; import org.junit.Test; import org.junit.experimental.categories.Category; +import org.junit.rules.ExpectedException; import org.junit.rules.TemporaryFolder; import org.junit.runner.RunWith; import org.junit.runners.JUnit4; @@ -73,6 +74,7 @@ public class FileBasedSourceTest { Random random = new Random(0L); @Rule public TemporaryFolder tempFolder = new TemporaryFolder(); + @Rule public ExpectedException thrown = ExpectedException.none(); /** * If {@code splitHeader} is null, this is just a simple line-based reader. Otherwise, the file is @@ -418,6 +420,16 @@ public class FileBasedSourceTest { } @Test + public void testSplittingFailsOnEmptyFileExpansion() throws Exception { +PipelineOptions options = PipelineOptionsFactory.create(); +String missingFilePath = tempFolder.newFolder().getAbsolutePath() + "/missing.txt"; +TestFileBasedSource source = new TestFileBasedSource(missingFilePath, Long.MAX_VALUE, null); +thrown.expect(IllegalArgumentException.class); +thrown.expectMessage(String.format("Unable to find any files matching %s", missingFilePath)); +source.splitIntoBundles(1234, options); + } + + @Test public void testFractionConsumedWhenReadingFilepattern() throws IOException { List data1 = createStringDataset(3, 1000); File file1 = createFileWithData("file1", data1);
[1/2] incubator-beam git commit: Closes #1621
Repository: incubator-beam Updated Branches: refs/heads/master 46566fc71 -> 1ad638e51 Closes #1621 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1ad638e5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1ad638e5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1ad638e5 Branch: refs/heads/master Commit: 1ad638e514adda6dc6d4c2133d47a3434a565410 Parents: 46566fc cf4229a Author: Dan HalperinAuthored: Wed Dec 14 16:35:59 2016 -0800 Committer: Dan Halperin Committed: Wed Dec 14 16:35:59 2016 -0800 -- .../java/org/apache/beam/sdk/io/FileBasedSource.java| 6 +- .../org/apache/beam/sdk/io/FileBasedSourceTest.java | 12 2 files changed, 17 insertions(+), 1 deletion(-) --
[2/2] incubator-beam git commit: [BEAM-1153] GcsUtil: use non-batch API for single file size requests.
[BEAM-1153] GcsUtil: use non-batch API for single file size requests. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/028408f5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/028408f5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/028408f5 Branch: refs/heads/master Commit: 028408f5a9955879f03e5bb65c54813922ee4672 Parents: 9fbf429 Author: Pei HeAuthored: Tue Dec 13 18:29:17 2016 -0800 Committer: Dan Halperin Committed: Wed Dec 14 14:56:12 2016 -0800 -- .../java/org/apache/beam/sdk/util/GcsUtil.java | 29 - .../org/apache/beam/sdk/util/GcsUtilTest.java | 65 +++- 2 files changed, 92 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/028408f5/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java index 2edb1d6..dcdba46 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/GcsUtil.java @@ -267,7 +267,34 @@ public class GcsUtil { * if the resource does not exist. */ public long fileSize(GcsPath path) throws IOException { -return fileSizes(ImmutableList.of(path)).get(0); +return fileSize( +path, +BACKOFF_FACTORY.backoff(), +Sleeper.DEFAULT); + } + + /** + * Returns the file size from GCS or throws {@link FileNotFoundException} + * if the resource does not exist. + */ + @VisibleForTesting + long fileSize(GcsPath path, BackOff backoff, Sleeper sleeper) throws IOException { +Storage.Objects.Get getObject = +storageClient.objects().get(path.getBucket(), path.getObject()); +try { + StorageObject object = ResilientOperation.retry( + ResilientOperation.getGoogleRequestCallable(getObject), + backoff, + RetryDeterminer.SOCKET_ERRORS, + IOException.class, + sleeper); + return object.getSize().longValue(); +} catch (Exception e) { + if (e instanceof IOException && errorExtractor.itemNotFound((IOException) e)) { +throw new FileNotFoundException(path.toString()); + } + throw new IOException("Unable to get file size", e); +} } /** http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/028408f5/sdks/java/core/src/test/java/org/apache/beam/sdk/util/GcsUtilTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/GcsUtilTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/GcsUtilTest.java index c8ed402..6ca87f9 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/GcsUtilTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/GcsUtilTest.java @@ -57,6 +57,7 @@ import com.google.common.collect.ImmutableList; import com.google.common.collect.Lists; import java.io.FileNotFoundException; import java.io.IOException; +import java.math.BigInteger; import java.net.SocketTimeoutException; import java.nio.channels.SeekableByteChannel; import java.nio.file.AccessDeniedException; @@ -320,7 +321,69 @@ public class GcsUtilTest { } @Test - public void testGetSizeBytesWhenFileNotFound() throws Exception { + public void testFileSizeNonBatch() throws Exception { +GcsOptions pipelineOptions = gcsOptionsWithTestCredential(); +GcsUtil gcsUtil = pipelineOptions.getGcsUtil(); + +Storage mockStorage = Mockito.mock(Storage.class); +gcsUtil.setStorageClient(mockStorage); + +Storage.Objects mockStorageObjects = Mockito.mock(Storage.Objects.class); +Storage.Objects.Get mockStorageGet = Mockito.mock(Storage.Objects.Get.class); + +when(mockStorage.objects()).thenReturn(mockStorageObjects); +when(mockStorageObjects.get("testbucket", "testobject")).thenReturn(mockStorageGet); +when(mockStorageGet.execute()).thenReturn( +new StorageObject().setSize(BigInteger.valueOf(1000))); + +assertEquals(1000, gcsUtil.fileSize(GcsPath.fromComponents("testbucket", "testobject"))); + } + + @Test + public void testFileSizeWhenFileNotFoundNonBatch() throws Exception { +MockLowLevelHttpResponse notFoundResponse = new MockLowLevelHttpResponse(); +notFoundResponse.setContent(""); +notFoundResponse.setStatusCode(HttpStatusCodes.STATUS_CODE_NOT_FOUND); + +MockHttpTransport mockTransport = +new MockHttpTransport.Builder().setLowLevelHttpResponse(notFoundResponse).build(); + +GcsOptions
[1/2] incubator-beam git commit: Closes #1611
Repository: incubator-beam Updated Branches: refs/heads/master 9fbf429b2 -> 46566fc71 Closes #1611 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/46566fc7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/46566fc7 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/46566fc7 Branch: refs/heads/master Commit: 46566fc71aad7cb3937bc764ff88510c7c0fa666 Parents: 9fbf429 028408f Author: Dan HalperinAuthored: Wed Dec 14 14:56:12 2016 -0800 Committer: Dan Halperin Committed: Wed Dec 14 14:56:12 2016 -0800 -- .../java/org/apache/beam/sdk/util/GcsUtil.java | 29 - .../org/apache/beam/sdk/util/GcsUtilTest.java | 65 +++- 2 files changed, 92 insertions(+), 2 deletions(-) --
[1/2] incubator-beam git commit: starter: fix typo in pom.xml
Repository: incubator-beam Updated Branches: refs/heads/master b742a2c09 -> ce3aa657a starter: fix typo in pom.xml Manual edit introduced in https://github.com/apache/incubator-beam/commit/25215889381f7da61766054af68c84ffed4c0c71 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/60c33dd5 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/60c33dd5 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/60c33dd5 Branch: refs/heads/master Commit: 60c33dd5a40a80e2d782cddbbd6940b96f34d975 Parents: b742a2c Author: Dan HalperinAuthored: Tue Dec 13 17:27:07 2016 -0800 Committer: Dan Halperin Committed: Tue Dec 13 17:27:07 2016 -0800 -- .../starter/src/main/resources/archetype-resources/pom.xml| 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/60c33dd5/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml -- diff --git a/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml b/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml index c59ffee..45aa1f8 100644 --- a/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml +++ b/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml @@ -1,4 +1,5 @@ -
[GitHub] incubator-beam pull request #1610: starter: fix typo in pom.xml
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1610 starter: fix typo in pom.xml Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- Manual edit introduced in https://github.com/apache/incubator-beam/commit/25215889381f7da61766054af68c84ffed4c0c71 You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam fix-starter Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1610.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1610 commit 60c33dd5a40a80e2d782cddbbd6940b96f34d975 Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-14T01:27:07Z starter: fix typo in pom.xml Manual edit introduced in https://github.com/apache/incubator-beam/commit/25215889381f7da61766054af68c84ffed4c0c71 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1327
Closes #1327 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a5f2df2b Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a5f2df2b Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a5f2df2b Branch: refs/heads/master Commit: a5f2df2bdc05a3e2a3bc32a7a0663c6fdfde8c33 Parents: 0bdf7fc 5a04492 Author: Dan HalperinAuthored: Tue Dec 13 15:42:11 2016 -0800 Committer: Dan Halperin Committed: Tue Dec 13 15:42:11 2016 -0800 -- sdks/java/extensions/sorter/README.md | 2 +- sdks/java/extensions/sorter/pom.xml | 8 ++ .../sorter/BufferedExternalSorter.java | 6 - .../sdk/extensions/sorter/ExternalSorter.java | 15 ++- .../sdk/extensions/sorter/InMemorySorter.java | 26 ++-- .../sorter/BufferedExternalSorterTest.java | 16 .../extensions/sorter/ExternalSorterTest.java | 16 .../extensions/sorter/InMemorySorterTest.java | 8 ++ 8 files changed, 75 insertions(+), 22 deletions(-) --
[1/2] incubator-beam git commit: Some minor changes and fixes for sorter module
Repository: incubator-beam Updated Branches: refs/heads/master 0bdf7fc04 -> a5f2df2bd Some minor changes and fixes for sorter module Includes: * Limit max memory for ExternalSorter and BufferedExternalSorter to 2047 MB to prevent int overflow within Hadoop's sorting library * Fix int overflow for large memory values in InMemorySorter * Add note about estimated disk use to README.MD * Fix to make Hadoop's sorting library put all temp files under the specified directory * Have Hadoop clean up the temp directory on exit * Stop shading hadoop dependencies. Some context: ** The existing shading is broken (modules that depend on this one cannot use it successfully). ** Hadoop's use of reflection in several instances makes shading the dependency "in a good way" nearly impossible. It requires a couple of rather brittle hacks, and, for clients that depend on certain conflicting versions of hadoop these hacks can mean it doesn't meet its intended goal of preventing conflicts anyway. ** From what I can tell, there's no good way to shade this to make it universally usable, so leaving it unshaded seems like a reasonable default. ** Without shading Hadoop, this module can be successfully used from Beam's wordcount example (which actually does have pre-existing hadoop dependencies already). Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5a04492e Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5a04492e Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5a04492e Branch: refs/heads/master Commit: 5a04492e5b7c5d5b4cb2da0f7a80ed8f0c2f2eb4 Parents: 0bdf7fc Author: Mitch ShanklinAuthored: Wed Nov 9 14:09:49 2016 -0800 Committer: Dan Halperin Committed: Tue Dec 13 15:42:10 2016 -0800 -- sdks/java/extensions/sorter/README.md | 2 +- sdks/java/extensions/sorter/pom.xml | 8 ++ .../sorter/BufferedExternalSorter.java | 6 - .../sdk/extensions/sorter/ExternalSorter.java | 15 ++- .../sdk/extensions/sorter/InMemorySorter.java | 26 ++-- .../sorter/BufferedExternalSorterTest.java | 16 .../extensions/sorter/ExternalSorterTest.java | 16 .../extensions/sorter/InMemorySorterTest.java | 8 ++ 8 files changed, 75 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a04492e/sdks/java/extensions/sorter/README.md -- diff --git a/sdks/java/extensions/sorter/README.md b/sdks/java/extensions/sorter/README.md index 18bd0d2..6ff3dbe 100644 --- a/sdks/java/extensions/sorter/README.md +++ b/sdks/java/extensions/sorter/README.md @@ -22,7 +22,7 @@ This module provides the SortValues transform, which takes a `PCollection >` is sorted on a single worker using local memory and disk. This means that `SortValues` may be a performance and/or scalability bottleneck when used in different pipelines. For example, users are discouraged from using `SortValues` on a `PCollection` of a single element to globally sort a large `PCollection`. +* Each `Iterable >` is sorted on a single worker using local memory and disk. This means that `SortValues` may be a performance and/or scalability bottleneck when used in different pipelines. For example, users are discouraged from using `SortValues` on a `PCollection` of a single element to globally sort a large `PCollection`. A (rough) estimate of the number of bytes of disk space utilized if sorting spills to disk is `numRecords * (numSecondaryKeyBytesPerRecord + numValueBytesPerRecord + 16) * 3`. ##Options * The user can customize the temporary location used if sorting requires spilling to disk and the maximum amount of memory to use by creating a custom instance of `BufferedExternalSorter.Options` to pass into `SortValues.create`. http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a04492e/sdks/java/extensions/sorter/pom.xml -- diff --git a/sdks/java/extensions/sorter/pom.xml b/sdks/java/extensions/sorter/pom.xml index a99a793..c8dfd52 100644 --- a/sdks/java/extensions/sorter/pom.xml +++ b/sdks/java/extensions/sorter/pom.xml @@ -69,8 +69,6 @@ true - org.apache.hadoop:hadoop-mapreduce-client-core -
[2/2] incubator-beam git commit: [BEAM-909] improve starter archetype
[BEAM-909] improve starter archetype Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6c8d93b0 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6c8d93b0 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6c8d93b0 Branch: refs/heads/master Commit: 6c8d93b018a040591b58e8f38d2a2442a7589692 Parents: cd8eeea Author: Dan HalperinAuthored: Tue Dec 13 09:50:33 2016 -0800 Committer: Dan Halperin Committed: Tue Dec 13 11:56:59 2016 -0800 -- .../src/main/resources/archetype-resources/pom.xml | 16 +--- .../test/resources/projects/basic/reference/pom.xml | 16 +--- 2 files changed, 26 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6c8d93b0/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml -- diff --git a/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml b/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml index 4fae02c..efafeca 100644 --- a/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml +++ b/sdks/java/maven-archetypes/starter/src/main/resources/archetype-resources/pom.xml @@ -24,6 +24,10 @@ ${artifactId} ${version} + +0.4.0-incubating-SNAPSHOT + + apache.snapshots @@ -69,14 +73,20 @@ org.apache.beam beam-sdks-java-core - 0.4.0-incubating-SNAPSHOT + ${beam.version} - + org.apache.beam beam-runners-direct-java - 0.4.0-incubating-SNAPSHOT + ${beam.version} runtime http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6c8d93b0/sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml -- diff --git a/sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml b/sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml index 4656e63..a86bd11 100644 --- a/sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml +++ b/sdks/java/maven-archetypes/starter/src/test/resources/projects/basic/reference/pom.xml @@ -24,6 +24,10 @@ basic 0.1 + +0.4.0-incubating-SNAPSHOT + + apache.snapshots @@ -69,14 +73,20 @@ org.apache.beam beam-sdks-java-core - 0.4.0-incubating-SNAPSHOT + ${beam.version} - + org.apache.beam beam-runners-direct-java - 0.4.0-incubating-SNAPSHOT + ${beam.version} runtime
[1/2] incubator-beam git commit: Closes #1596
Repository: incubator-beam Updated Branches: refs/heads/master cd8eeea95 -> 2e22a4875 Closes #1596 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2e22a487 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2e22a487 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2e22a487 Branch: refs/heads/master Commit: 2e22a4875129ad7c8afef2c441954156cfdf8125 Parents: cd8eeea 6c8d93b Author: Dan HalperinAuthored: Tue Dec 13 11:56:59 2016 -0800 Committer: Dan Halperin Committed: Tue Dec 13 11:56:59 2016 -0800 -- .../src/main/resources/archetype-resources/pom.xml | 16 +--- .../test/resources/projects/basic/reference/pom.xml | 16 +--- 2 files changed, 26 insertions(+), 6 deletions(-) --
[GitHub] incubator-beam pull request #1596: [BEAM-909] improve starter archetype
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1596 [BEAM-909] improve starter archetype R: @davorbonaci You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam starter-comment Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1596.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1596 commit 1fb74438fa4993471ee2bce3e95b884b91c658ba Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-13T17:50:33Z [BEAM-909] improve starter archetype --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: BigQueryIO.Write: support runtime schema and table
Repository: incubator-beam Updated Branches: refs/heads/master 437393712 -> 321547fb1 BigQueryIO.Write: support runtime schema and table Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/fd6d09c3 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/fd6d09c3 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/fd6d09c3 Branch: refs/heads/master Commit: fd6d09c32f6bcf67c63ec74548373ee90d67f2bd Parents: 4373937 Author: Sam McVeetyAuthored: Sun Dec 4 14:16:23 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 11:14:20 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 217 +-- .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 60 - 2 files changed, 206 insertions(+), 71 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/fd6d09c3/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java index f99ca78..0be8567 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java @@ -321,6 +321,23 @@ public class BigQueryIO { return sb.toString(); } + @VisibleForTesting + static class JsonSchemaToTableSchema + implements SerializableFunction { +@Override +public TableSchema apply(String from) { + return fromJsonString(from, TableSchema.class); +} + } + + private static class TableSchemaToJsonSchema + implements SerializableFunction { +@Override +public String apply(TableSchema from) { + return toJsonString(from); +} + } + private static class JsonTableRefToTableRef implements SerializableFunction { @Override @@ -329,6 +346,14 @@ public class BigQueryIO { } } + private static class TableRefToTableSpec + implements SerializableFunction { +@Override +public String apply(TableReference from) { + return toTableSpec(from); +} + } + private static class TableRefToJson implements SerializableFunction { @Override @@ -353,6 +378,15 @@ public class BigQueryIO { } } + @Nullable + private static ValueProvider displayTable( + @Nullable ValueProvider table) { +if (table == null) { + return null; +} +return NestedValueProvider.of(table, new TableRefToTableSpec()); + } + /** * A {@link PTransform} that reads from a BigQuery table and returns a * {@link PCollection} of {@link TableRow TableRows} containing each of the rows of the table. @@ -659,11 +693,11 @@ public class BigQueryIO { .setProjectId(executingProject) .setDatasetId(queryTempDatasetId) .setTableId(queryTempTableId); + String jsonTableRef = toJsonString(queryTempTableRef); source = BigQueryQuerySource.create( jobIdToken, query, NestedValueProvider.of( - StaticValueProvider.of( - toJsonString(queryTempTableRef)), new JsonTableRefToTableRef()), + StaticValueProvider.of(jsonTableRef), new JsonTableRefToTableRef()), flattenResults, useLegacySql, extractDestinationDir, bqServices); } else { ValueProvider inputTable = getTableWithDefaultProject(bqOptions); @@ -712,17 +746,10 @@ public class BigQueryIO { @Override public void populateDisplayData(DisplayData.Builder builder) { super.populateDisplayData(builder); -TableReference table = getTable(); - -if (table != null) { - builder.add(DisplayData.item("table", toTableSpec(table)) -.withLabel("Table")); -} -String queryString = query == null -? null : query.isAccessible() -? query.get() : query.toString(); builder -.addIfNotNull(DisplayData.item("query", queryString) +.addIfNotNull(DisplayData.item("table", displayTable(getTableProvider())) + .withLabel("Table")) +.addIfNotNull(DisplayData.item("query", query) .withLabel("Query")) .addIfNotNull(DisplayData.item("flattenResults", flattenResults) .withLabel("Flatten Query Results")) @@ -752,10
[2/2] incubator-beam git commit: Closes #1513
Closes #1513 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/321547fb Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/321547fb Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/321547fb Branch: refs/heads/master Commit: 321547fb15c358fcd196954779548f6644aa3c08 Parents: 4373937 fd6d09c Author: Dan HalperinAuthored: Mon Dec 12 11:14:41 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 11:14:41 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 217 +-- .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 60 - 2 files changed, 206 insertions(+), 71 deletions(-) --
[2/2] incubator-beam git commit: Fix handling of null ValueProviders in DisplayData
Fix handling of null ValueProviders in DisplayData Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a47eac91 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a47eac91 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a47eac91 Branch: refs/heads/master Commit: a47eac91c70846d2aa3a945e327e2b148b16ca5f Parents: 307be5f Author: Sam McVeetyAuthored: Wed Dec 7 15:31:52 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 11:08:42 2016 -0800 -- .../org/apache/beam/sdk/transforms/display/DisplayData.java | 8 +++- .../apache/beam/sdk/transforms/display/DisplayDataTest.java | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a47eac91/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/display/DisplayData.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/display/DisplayData.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/display/DisplayData.java index f0040f7..d3bfe93 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/display/DisplayData.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/display/DisplayData.java @@ -866,9 +866,15 @@ public class DisplayData implements Serializable { /** * Create a display item for the specified key and {@link ValueProvider}. */ - public static ItemSpec item(String key, ValueProvider value) { + public static ItemSpec item(String key, @Nullable ValueProvider value) { +if (value == null) { + return item(key, Type.STRING, null); +} if (value.isAccessible()) { Object got = value.get(); + if (got == null) { +return item(key, Type.STRING, null); + } Type type = inferType(got); if (type == null) { throw new RuntimeException(String.format("Unknown value type: %s", got)); http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a47eac91/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java index f5c1e73..06b2bce 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/display/DisplayDataTest.java @@ -379,6 +379,8 @@ public class DisplayDataTest implements Serializable { public void populateDisplayData(Builder builder) { builder .addIfNotNull(DisplayData.item("nullString", (String) null)) +.addIfNotNull(DisplayData.item("nullVPString", (ValueProvider) null)) +.addIfNotNull(DisplayData.item("nullierVPString", StaticValueProvider.of(null))) .addIfNotNull(DisplayData.item("notNullString", "foo")) .addIfNotNull(DisplayData.item("nullLong", (Long) null)) .addIfNotNull(DisplayData.item("notNullLong", 1234L))
[1/2] incubator-beam git commit: Closes #1549
Repository: incubator-beam Updated Branches: refs/heads/master 307be5ff9 -> 437393712 Closes #1549 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/43739371 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/43739371 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/43739371 Branch: refs/heads/master Commit: 437393712ebcc28f69b45fedff7a93148d944c6e Parents: 307be5f a47eac9 Author: Dan HalperinAuthored: Mon Dec 12 11:08:42 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 11:08:42 2016 -0800 -- .../org/apache/beam/sdk/transforms/display/DisplayData.java | 8 +++- .../apache/beam/sdk/transforms/display/DisplayDataTest.java | 2 ++ 2 files changed, 9 insertions(+), 1 deletion(-) --
[2/2] incubator-beam git commit: [BEAM-551] Fix handling of default for VP
[BEAM-551] Fix handling of default for VP Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/66c29e4d Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/66c29e4d Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/66c29e4d Branch: refs/heads/master Commit: 66c29e4d8fd3654899fed6dc0054194f9e6a9b74 Parents: 2f2617c Author: Sam McVeetyAuthored: Sat Dec 10 09:16:57 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 10:28:40 2016 -0800 -- .../org/apache/beam/sdk/options/ValueProvider.java | 13 ++--- .../org/apache/beam/sdk/options/ValueProviderTest.java | 12 2 files changed, 22 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/66c29e4d/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java index 3d36a29..93fcaf8 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java @@ -17,7 +17,6 @@ */ package org.apache.beam.sdk.options; -import static com.google.common.base.MoreObjects.firstNonNull; import static com.google.common.base.Preconditions.checkNotNull; import com.fasterxml.jackson.core.JsonGenerator; @@ -222,8 +221,16 @@ public interface ValueProvider extends Serializable { Method method = klass.getMethod(methodName); PipelineOptions methodOptions = options.as(klass); InvocationHandler handler = Proxy.getInvocationHandler(methodOptions); -T value = ((ValueProvider) handler.invoke(methodOptions, method, null)).get(); -return firstNonNull(value, defaultValue); +ValueProvider result = +(ValueProvider) handler.invoke(methodOptions, method, null); +// Two cases: If we have deserialized a new value from JSON, it will +// be wrapped in a StaticValueProvider, which we can provide here. If +// not, there was no JSON value, and we return the default, whether or +// not it is null. +if (result instanceof StaticValueProvider) { + return result.get(); +} +return defaultValue; } catch (Throwable e) { throw new RuntimeException("Unable to load runtime value.", e); } http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/66c29e4d/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java index 7ec40be..ea5cc54 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java @@ -149,6 +149,18 @@ public class ValueProviderTest { assertEquals("quux", provider.get()); } + @Test + public void testDefaultRuntimeProviderWithoutOverride() throws Exception { +TestOptions runtime = PipelineOptionsFactory.as(TestOptions.class); +TestOptions options = PipelineOptionsFactory.as(TestOptions.class); +runtime.setOptionsId(options.getOptionsId()); +RuntimeValueProvider.setRuntimeOptions(runtime); + +ValueProvider provider = options.getBar(); +assertTrue(provider.isAccessible()); +assertEquals("bar", provider.get()); + } + /** A test interface. */ public interface BadOptionsRuntime extends PipelineOptions { RuntimeValueProvider getBar();
[1/2] incubator-beam git commit: Closes #1575
Repository: incubator-beam Updated Branches: refs/heads/master 2f2617c36 -> 307be5ff9 Closes #1575 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/307be5ff Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/307be5ff Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/307be5ff Branch: refs/heads/master Commit: 307be5ff98d53a0c1f82066e40e2d2ee70421adb Parents: 2f2617c 66c29e4 Author: Dan HalperinAuthored: Mon Dec 12 10:28:40 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 10:28:40 2016 -0800 -- .../org/apache/beam/sdk/options/ValueProvider.java | 13 ++--- .../org/apache/beam/sdk/options/ValueProviderTest.java | 12 2 files changed, 22 insertions(+), 3 deletions(-) --
[2/2] incubator-beam git commit: Closes #1562
Closes #1562 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2f2617c3 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2f2617c3 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2f2617c3 Branch: refs/heads/master Commit: 2f2617c361b0b8ae25e052d1b2d186fbd2b7370b Parents: 0afadf6 cfcfa2f Author: Dan HalperinAuthored: Mon Dec 12 09:56:53 2016 -0800 Committer: Dan Halperin Committed: Mon Dec 12 09:56:53 2016 -0800 -- .../beam/runners/dataflow/DataflowRunner.java | 14 +-- .../runners/dataflow/DataflowRunnerInfo.java| 92 .../DataflowPipelineWorkerPoolOptions.java | 6 +- .../beam/runners/dataflow/dataflow.properties | 23 + .../dataflow/DataflowRunnerInfoTest.java| 51 +++ .../org/apache/beam/sdk/util/ReleaseInfo.java | 4 - 6 files changed, 172 insertions(+), 18 deletions(-) --
[1/2] incubator-beam git commit: [BEAM-1120] Move some DataflowRunner configurations from code to properties
Repository: incubator-beam Updated Branches: refs/heads/master 0afadf64f -> 2f2617c36 [BEAM-1120] Move some DataflowRunner configurations from code to properties Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/cfcfa2f3 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/cfcfa2f3 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/cfcfa2f3 Branch: refs/heads/master Commit: cfcfa2f3e739a3a71b1ec9edf31f8023e1a5ed3f Parents: 0afadf6 Author: Dan HalperinAuthored: Fri Dec 9 18:35:52 2016 +0800 Committer: Dan Halperin Committed: Mon Dec 12 09:56:52 2016 -0800 -- .../beam/runners/dataflow/DataflowRunner.java | 14 +-- .../runners/dataflow/DataflowRunnerInfo.java| 92 .../DataflowPipelineWorkerPoolOptions.java | 6 +- .../beam/runners/dataflow/dataflow.properties | 23 + .../dataflow/DataflowRunnerInfoTest.java| 51 +++ .../org/apache/beam/sdk/util/ReleaseInfo.java | 4 - 6 files changed, 172 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cfcfa2f3/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java index d902ccb..711b1b0 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java @@ -205,16 +205,6 @@ public class DataflowRunner extends PipelineRunner { /** A set of user defined functions to invoke at different points in execution. */ private DataflowRunnerHooks hooks; - // Environment version information. - private static final String ENVIRONMENT_MAJOR_VERSION = "6"; - - // Default Docker container images that execute Dataflow worker harness, residing in Google - // Container Registry, separately for Batch and Streaming. - public static final String BATCH_WORKER_HARNESS_CONTAINER_IMAGE = - "dataflow.gcr.io/v1beta3/beam-java-batch:beam-master-20161205"; - public static final String STREAMING_WORKER_HARNESS_CONTAINER_IMAGE = - "dataflow.gcr.io/v1beta3/beam-java-streaming:beam-master-20161205"; - // The limit of CreateJob request size. private static final int CREATE_JOB_REQUEST_LIMIT_BYTES = 10 * 1024 * 1024; @@ -546,7 +536,9 @@ public class DataflowRunner extends PipelineRunner { // Requirements about the service. Map environmentVersion = new HashMap<>(); -environmentVersion.put(PropertyNames.ENVIRONMENT_VERSION_MAJOR_KEY, ENVIRONMENT_MAJOR_VERSION); +environmentVersion.put( +PropertyNames.ENVIRONMENT_VERSION_MAJOR_KEY, + DataflowRunnerInfo.getDataflowRunnerInfo().getEnvironmentMajorVersion()); newJob.getEnvironment().setVersion(environmentVersion); // Default jobType is JAVA_BATCH_AUTOSCALING: A Java job with workers that the job can // autoscale if specified. http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cfcfa2f3/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunnerInfo.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunnerInfo.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunnerInfo.java new file mode 100644 index 000..59cb8a4 --- /dev/null +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunnerInfo.java @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the
[GitHub] incubator-beam pull request #1562: [BEAM-1120] Move some DataflowRunner conf...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1562 [BEAM-1120] Move some DataflowRunner configurations from code to properties This is just a draft. Thoughts? R: @davorbonaci R: @lukecwik You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam dataflow-properties Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1562.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1562 commit 63edf68a95aefd280da31ccddc56f52d25f6dace Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-09T10:35:52Z [BEAM-1120] Move some DataflowRunner configurations from code to properties --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1560
Closes #1560 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/63d197cd Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/63d197cd Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/63d197cd Branch: refs/heads/master Commit: 63d197cd0cff332b62a5f4398b1693b6839a348b Parents: 9bab78b 9bcba39 Author: Dan HalperinAuthored: Fri Dec 9 01:43:16 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 9 01:43:16 2016 -0800 -- .../java/org/apache/beam/runners/dataflow/DataflowRunner.java| 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --
[GitHub] incubator-beam pull request #1560: DataflowRunner: bump environment major ve...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1560 DataflowRunner: bump environment major version R: @davorbonaci You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam dataflow-upgrade Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1560.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1560 commit 9bcba398c7516437c00517e03d75e27544b01166 Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-09T07:15:19Z DataflowRunner: bump environment major version --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1546
Closes #1546 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3b2e0290 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3b2e0290 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3b2e0290 Branch: refs/heads/master Commit: 3b2e0290ddbedd199926a70f36c02ea4515841cb Parents: b44a7ac 6439f70 Author: Dan HalperinAuthored: Wed Dec 7 17:18:11 2016 -0800 Committer: Dan Halperin Committed: Wed Dec 7 17:18:11 2016 -0800 -- .../dataflow/DataflowPipelineTranslator.java| 4 -- .../DataflowPipelineWorkerPoolOptions.java | 45 2 files changed, 49 deletions(-) --
[1/2] incubator-beam git commit: [BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control
Repository: incubator-beam Updated Branches: refs/heads/master b44a7ac4a -> 3b2e0290d [BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6439f701 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6439f701 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6439f701 Branch: refs/heads/master Commit: 6439f701d1008d6a0432828e11e0fcc8a4fe6ecc Parents: b44a7ac Author: Dan HalperinAuthored: Thu Dec 8 07:40:58 2016 +0800 Committer: Dan Halperin Committed: Thu Dec 8 07:43:13 2016 +0800 -- .../dataflow/DataflowPipelineTranslator.java| 4 -- .../DataflowPipelineWorkerPoolOptions.java | 45 2 files changed, 49 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6439f701/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java index 8783056..8048df9 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java @@ -424,10 +424,6 @@ public class DataflowPipelineTranslator { WorkerPool workerPool = new WorkerPool(); - if (options.getTeardownPolicy() != null) { - workerPool.setTeardownPolicy(options.getTeardownPolicy().getTeardownPolicyName()); - } - if (options.isStreaming()) { job.setType("JOB_TYPE_STREAMING"); } else { http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6439f701/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java index ffb5a3a..157321a 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java @@ -191,51 +191,6 @@ public interface DataflowPipelineWorkerPoolOptions extends PipelineOptions { void setWorkerMachineType(String value); /** - * The policy for tearing down the workers spun up by the service. - * - * @deprecated Dataflow Service will only support TEARDOWN_ALWAYS policy in the future. - */ - @Deprecated - enum TeardownPolicy { -/** - * All VMs created for a Dataflow job are deleted when the job finishes, regardless of whether - * it fails or succeeds. - */ -TEARDOWN_ALWAYS("TEARDOWN_ALWAYS"), -/** - * All VMs created for a Dataflow job are left running when the job finishes, regardless of - * whether it fails or succeeds. - */ -TEARDOWN_NEVER("TEARDOWN_NEVER"), -/** - * All VMs created for a Dataflow job are deleted when the job succeeds, but are left running - * when it fails. (This is typically used for debugging failing jobs by SSHing into the - * workers.) - */ -TEARDOWN_ON_SUCCESS("TEARDOWN_ON_SUCCESS"); - -private final String teardownPolicy; - -TeardownPolicy(String teardownPolicy) { - this.teardownPolicy = teardownPolicy; -} - -public String getTeardownPolicyName() { - return this.teardownPolicy; -} - } - - /** - * The teardown policy for the VMs. - * - * If unset, the Dataflow service will choose a reasonable default. - */ - @Description("The teardown policy for the VMs. If unset, the Dataflow service will " - + "choose a reasonable default.") - TeardownPolicy getTeardownPolicy(); - void setTeardownPolicy(TeardownPolicy value); - - /** * List of local files to make available to workers. * * Files are placed on the worker's classpath.
[GitHub] incubator-beam pull request #1546: [BEAM-1108] DataflowRunner: remove deprec...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1546 [BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control R: @davorbonaci CC: @pjesa You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam teardown-policy Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1546.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1546 commit 6439f701d1008d6a0432828e11e0fcc8a4fe6ecc Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-07T23:40:58Z [BEAM-1108] DataflowRunner: remove deprecated TEARDOWN_POLICY control --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: [BEAM-905] Add shading config to examples archetype and enable it for Flink
Repository: incubator-beam Updated Branches: refs/heads/master 5b31a3699 -> b44a7ac4a [BEAM-905] Add shading config to examples archetype and enable it for Flink This makes the Flink quickstart work out of the box. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/43fef277 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/43fef277 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/43fef277 Branch: refs/heads/master Commit: 43fef2775145f67def3ab8a246ecca192a7d650b Parents: 5b31a36 Author: Dan HalperinAuthored: Wed Dec 7 20:06:57 2016 +0800 Committer: Dan Halperin Committed: Wed Dec 7 14:55:02 2016 -0800 -- .../main/resources/archetype-resources/pom.xml | 40 1 file changed, 40 insertions(+) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/43fef277/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml -- diff --git a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml index df2e9f3..95d163c 100644 --- a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml +++ b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml @@ -85,6 +85,38 @@ false + + + + org.apache.maven.plugins + maven-shade-plugin + 2.4.1 + + + package + +shade + + + ${project.artifactId}-bundled-${project.version} + + +*:* + + META-INF/LICENSE + META-INF/*.SF + META-INF/*.DSA + META-INF/*.RSA + + + + + + + @@ -140,6 +172,14 @@ runtime + + + +org.apache.maven.plugins +maven-shade-plugin + + +
[GitHub] incubator-beam pull request #1533: [BEAM-905] Add shading config to examples...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1533 [BEAM-905] Add shading config to examples archetype and enable it for⦠R: @davorbonaci AND @aljoscha You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam flink-package-examples Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1533.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1533 commit 148f906e2ccecb9038010a8227f8b7f76a3b0ba3 Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-07T12:06:57Z [BEAM-905] Add shading config to examples archetype and enable it for Flink This makes the Flink quickstart work out of the box. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
incubator-beam git commit: BigQueryIO.Read: support runtime options
Repository: incubator-beam Updated Branches: refs/heads/master b2b570f27 -> 0a2ed832c BigQueryIO.Read: support runtime options Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/0a2ed832 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/0a2ed832 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/0a2ed832 Branch: refs/heads/master Commit: 0a2ed832ce5af7556db605e99b985ed4ffc1b152 Parents: b2b570f Author: Sam McVeetyAuthored: Sun Oct 30 11:58:44 2016 -0700 Committer: Dan Halperin Committed: Tue Dec 6 18:05:42 2016 -0800 -- .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 208 ++- .../apache/beam/sdk/io/gcp/ApiSurfaceTest.java | 2 + .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 33 +-- 3 files changed, 176 insertions(+), 67 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/0a2ed832/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java index c00c19d..8bfbd53 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java @@ -89,6 +89,9 @@ import org.apache.beam.sdk.io.gcp.bigquery.BigQueryServices.JobService; import org.apache.beam.sdk.options.BigQueryOptions; import org.apache.beam.sdk.options.GcpOptions; import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.ValueProvider; +import org.apache.beam.sdk.options.ValueProvider.NestedValueProvider; +import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider; import org.apache.beam.sdk.runners.PipelineRunner; import org.apache.beam.sdk.transforms.Aggregator; import org.apache.beam.sdk.transforms.Create; @@ -318,6 +321,38 @@ public class BigQueryIO { return sb.toString(); } + private static class JsonTableRefToTableRef + implements SerializableFunction { +@Override +public TableReference apply(String from) { + return fromJsonString(from, TableReference.class); +} + } + + private static class TableRefToJson + implements SerializableFunction { +@Override +public String apply(TableReference from) { + return toJsonString(from); +} + } + + private static class TableRefToProjectId + implements SerializableFunction { +@Override +public String apply(TableReference from) { + return from.getProjectId(); +} + } + + private static class TableSpecToTableRef + implements SerializableFunction { +@Override +public TableReference apply(String from) { + return parseTableSpec(from); +} + } + /** * A {@link PTransform} that reads from a BigQuery table and returns a * {@link PCollection} of {@link TableRow TableRows} containing each of the rows of the table. @@ -345,6 +380,13 @@ public class BigQueryIO { * {@code "[dataset_id].[table_id]"} for tables within the current project. */ public static Bound from(String tableSpec) { + return new Bound().from(StaticValueProvider.of(tableSpec)); +} + +/** + * Same as {@code from(String)}, but with a {@link ValueProvider}. + */ +public static Bound from(ValueProvider tableSpec) { return new Bound().from(tableSpec); } @@ -352,6 +394,13 @@ public class BigQueryIO { * Reads results received after executing the given query. */ public static Bound fromQuery(String query) { + return new Bound().fromQuery(StaticValueProvider.of(query)); +} + +/** + * Same as {@code from(String)}, but with a {@link ValueProvider}. + */ +public static Bound fromQuery(ValueProvider query) { return new Bound().fromQuery(query); } @@ -374,8 +423,8 @@ public class BigQueryIO { * {@link PCollection} of {@link TableRow TableRows}. */ public static class Bound extends PTransform { - @Nullable final String jsonTableRef; - @Nullable final String query; + @Nullable final ValueProvider jsonTableRef; + @Nullable final ValueProvider query; /** * Disable validation that the table exists or the query succeeds prior to pipeline @@ -403,7 +452,8 @@ public class BigQueryIO { } private Bound( -
incubator-beam git commit: BEAM-1083: Removing the link for the DatastoreWordCount in the README
Repository: incubator-beam Updated Branches: refs/heads/master a13024c40 -> 8f712fd62 BEAM-1083: Removing the link for the DatastoreWordCount in the README Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8f712fd6 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8f712fd6 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8f712fd6 Branch: refs/heads/master Commit: 8f712fd6291803bfcda312ad7c31cb5c811c6508 Parents: a13024c Author: Neelesh Srinivas SalianAuthored: Sat Dec 3 09:08:54 2016 -0800 Committer: Neelesh Srinivas Salian Committed: Sat Dec 3 09:08:54 2016 -0800 -- .../java/src/main/java/org/apache/beam/examples/cookbook/README.md | 2 -- 1 file changed, 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8f712fd6/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md -- diff --git a/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md b/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md index e709955..105fb4b 100644 --- a/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md +++ b/examples/java/src/main/java/org/apache/beam/examples/cookbook/README.md @@ -37,8 +37,6 @@ larger Dataflow pipeline. They include: transform, which lets you combine the values in a key-grouped PCollection. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/cookbook/DatastoreWordCount.java;>DatastoreWordCount - An example that shows you how to read from Google Cloud Datastore. https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/cookbook/DistinctExample.java;>DistinctExample An example that uses Shakespeare's plays as plain text files, and removes duplicate lines across all the files. Demonstrates the
[1/2] incubator-beam git commit: BEAM-1078: Changing the links from GCP to incubator-beam in the project
Repository: incubator-beam Updated Branches: refs/heads/master 8a7919b5a -> a13024c40 BEAM-1078: Changing the links from GCP to incubator-beam in the project Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5a997a1a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5a997a1a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5a997a1a Branch: refs/heads/master Commit: 5a997a1a5d5d977bb84af1737db1128df916de7a Parents: 8a7919b Author: Neelesh Srinivas SalianAuthored: Fri Dec 2 17:43:34 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 21:44:33 2016 -0800 -- .travis/README.md | 2 +- .../java/org/apache/beam/examples/complete/README.md | 14 +++--- .../java/org/apache/beam/examples/cookbook/README.md | 14 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a997a1a/.travis/README.md -- diff --git a/.travis/README.md b/.travis/README.md index e0c13f2..536692d 100644 --- a/.travis/README.md +++ b/.travis/README.md @@ -19,5 +19,5 @@ # Travis Scripts -This directory contains scripts used for [Travis CI](https://travis-ci.org/GoogleCloudPlatform/DataflowJavaSDK) +This directory contains scripts used for [Travis CI](https://travis-ci.org/apache/incubator-beam/) testing. http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a997a1a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md -- diff --git a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md b/examples/java/src/main/java/org/apache/beam/examples/complete/README.md index b98be7a..b0b6f9d 100644 --- a/examples/java/src/main/java/org/apache/beam/examples/complete/README.md +++ b/examples/java/src/main/java/org/apache/beam/examples/complete/README.md @@ -22,34 +22,34 @@ This directory contains end-to-end example pipelines that perform complex data processing tasks. They include: - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/AutoComplete.java;>AutoComplete + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/AutoComplete.java;>AutoComplete An example that computes the most popular hash tags for every prefix, which can be used for auto-completion. Demonstrates how to use the same pipeline in both streaming and batch, combiners, and composite transforms. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/StreamingWordExtract.java;>StreamingWordExtract + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/StreamingWordExtract.java;>StreamingWordExtract A streaming pipeline example that inputs lines of text from a Cloud Pub/Sub topic, splits each line into individual words, capitalizes those words, and writes the output to a BigQuery table. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TfIdf.java;>TfIdf + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TfIdf.java;>TfIdf An example that computes a basic TF-IDF search table for a directory or Cloud Storage prefix. Demonstrates joining data, side inputs, and logging. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TopWikipediaSessions.java;>TopWikipediaSessions + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TopWikipediaSessions.java;>TopWikipediaSessions An example that reads Wikipedia edit data from Cloud Storage and computes the user with the longest string of edits separated by no more than an hour within each month. Demonstrates using Cloud Dataflow Windowing to perform time-based aggregations of data. - https://github.com/GoogleCloudPlatform/DataflowJavaSDK/blob/master/examples/src/main/java/com/google/cloud/dataflow/examples/complete/TrafficMaxLaneFlow.java;>TrafficMaxLaneFlow + https://github.com/apache/incubator-beam/blob/master/examples/java/src/main/java/org/apache/beam/examples/complete/TrafficMaxLaneFlow.java;>TrafficMaxLaneFlow A streaming Beam Example using BigQuery output in the traffic sensor domain.
[2/2] incubator-beam git commit: [BEAM-1078] Closes #1498
[BEAM-1078] Closes #1498 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a13024c4 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a13024c4 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a13024c4 Branch: refs/heads/master Commit: a13024c40f73b6065ea4094d6e750b50c5027bb2 Parents: 8a7919b 5a997a1 Author: Dan HalperinAuthored: Fri Dec 2 21:45:31 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 21:45:31 2016 -0800 -- .travis/README.md | 2 +- .../java/org/apache/beam/examples/complete/README.md | 14 +++--- .../java/org/apache/beam/examples/cookbook/README.md | 14 +++--- 3 files changed, 15 insertions(+), 15 deletions(-) --
[2/2] incubator-beam git commit: Closes #1475
Closes #1475 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c8404557 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c8404557 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c8404557 Branch: refs/heads/master Commit: c84045573948a7cba72e37e5e562c7f63375e9ea Parents: 26eb435 9a038c4 Author: Dan HalperinAuthored: Fri Dec 2 17:25:36 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 17:25:36 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 22 +-- .../java/org/apache/beam/sdk/io/TextIO.java | 28 .../java/org/apache/beam/sdk/io/XmlSink.java| 4 +-- .../org/apache/beam/sdk/io/XmlSinkTest.java | 6 ++--- 4 files changed, 42 insertions(+), 18 deletions(-) --
[1/2] incubator-beam git commit: Add TextIO.Write support for runtime-valued output prefix
Repository: incubator-beam Updated Branches: refs/heads/master 26eb4354c -> c84045573 Add TextIO.Write support for runtime-valued output prefix * Updates to TextIO * Updates for FileBasedSink to support this change * Updates to other FileBasedSinks that do not yet support runtime values but need to be aware that values are now ValueProvider instead of String Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9a038c4f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9a038c4f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9a038c4f Branch: refs/heads/master Commit: 9a038c4f3404a3707eca29c5e898014df7fafbf4 Parents: 26eb435 Author: Sam McVeetyAuthored: Wed Nov 30 14:06:59 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 17:24:12 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 22 +-- .../java/org/apache/beam/sdk/io/TextIO.java | 28 .../java/org/apache/beam/sdk/io/XmlSink.java| 4 +-- .../org/apache/beam/sdk/io/XmlSinkTest.java | 6 ++--- 4 files changed, 42 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9a038c4f/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java index 5375b90..1396ab6 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java @@ -41,6 +41,8 @@ import javax.annotation.Nullable; import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.coders.SerializableCoder; import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.ValueProvider; +import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider; import org.apache.beam.sdk.transforms.display.DisplayData; import org.apache.beam.sdk.util.IOChannelFactory; import org.apache.beam.sdk.util.IOChannelUtils; @@ -135,7 +137,7 @@ public abstract class FileBasedSink extends Sink { /** * Base filename for final output files. */ - protected final String baseOutputFilename; + protected final ValueProvider baseOutputFilename; /** * The extension to be used for the final output files. @@ -162,7 +164,8 @@ public abstract class FileBasedSink extends Sink { */ public FileBasedSink(String baseOutputFilename, String extension, WritableByteChannelFactory writableByteChannelFactory) { -this(baseOutputFilename, extension, ShardNameTemplate.INDEX_OF_MAX, writableByteChannelFactory); +this(StaticValueProvider.of(baseOutputFilename), extension, +ShardNameTemplate.INDEX_OF_MAX, writableByteChannelFactory); } /** @@ -173,7 +176,8 @@ public abstract class FileBasedSink extends Sink { * See {@link ShardNameTemplate} for a description of file naming templates. */ public FileBasedSink(String baseOutputFilename, String extension, String fileNamingTemplate) { -this(baseOutputFilename, extension, fileNamingTemplate, CompressionType.UNCOMPRESSED); +this(StaticValueProvider.of(baseOutputFilename), extension, fileNamingTemplate, +CompressionType.UNCOMPRESSED); } /** @@ -182,8 +186,8 @@ public abstract class FileBasedSink extends Sink { * * See {@link ShardNameTemplate} for a description of file naming templates. */ - public FileBasedSink(String baseOutputFilename, String extension, String fileNamingTemplate, - WritableByteChannelFactory writableByteChannelFactory) { + public FileBasedSink(ValueProvider baseOutputFilename, String extension, + String fileNamingTemplate, WritableByteChannelFactory writableByteChannelFactory) { this.writableByteChannelFactory = writableByteChannelFactory; this.baseOutputFilename = baseOutputFilename; if (!isNullOrEmpty(writableByteChannelFactory.getFilenameSuffix())) { @@ -198,7 +202,7 @@ public abstract class FileBasedSink extends Sink { * Returns the base output filename for this file based sink. */ public String getBaseOutputFilename() { -return baseOutputFilename; +return baseOutputFilename.get(); } @Override @@ -216,7 +220,9 @@ public abstract class FileBasedSink extends Sink { super.populateDisplayData(builder); String fileNamePattern = String.format("%s%s%s", -baseOutputFilename, fileNamingTemplate, getFileExtension(extension)); +baseOutputFilename.isAccessible() +? baseOutputFilename.get() : baseOutputFilename.toString(), +
[2/2] incubator-beam git commit: Closes #1431
Closes #1431 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/fd6a52c1 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/fd6a52c1 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/fd6a52c1 Branch: refs/heads/python-sdk Commit: fd6a52c15df5741d6b6661ea98c680a94775f7f9 Parents: 2363ee5 1688690 Author: Dan HalperinAuthored: Fri Dec 2 16:13:28 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 16:13:28 2016 -0800 -- sdks/python/apache_beam/io/filebasedsource.py | 3 ++- sdks/python/apache_beam/io/fileio.py | 7 --- sdks/python/apache_beam/io/gcsio.py | 6 -- sdks/python/apache_beam/io/gcsio_test.py | 7 +++ 4 files changed, 17 insertions(+), 6 deletions(-) --
[1/2] incubator-beam git commit: Do not need to list all files in GCS for validation. Add limit field to fileIO
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 2363ee510 -> fd6a52c15 Do not need to list all files in GCS for validation. Add limit field to fileIO Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/16886904 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/16886904 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/16886904 Branch: refs/heads/python-sdk Commit: 16886904df9fd1d3f92e1f7aabd134a28d6c1c00 Parents: 2363ee5 Author: Sourabh BajajAuthored: Fri Dec 2 13:56:42 2016 -0800 Committer: Sourabh Bajaj Committed: Fri Dec 2 13:56:42 2016 -0800 -- sdks/python/apache_beam/io/filebasedsource.py | 3 ++- sdks/python/apache_beam/io/fileio.py | 7 --- sdks/python/apache_beam/io/gcsio.py | 6 -- sdks/python/apache_beam/io/gcsio_test.py | 7 +++ 4 files changed, 17 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/filebasedsource.py -- diff --git a/sdks/python/apache_beam/io/filebasedsource.py b/sdks/python/apache_beam/io/filebasedsource.py index 14c2b06..8921801 100644 --- a/sdks/python/apache_beam/io/filebasedsource.py +++ b/sdks/python/apache_beam/io/filebasedsource.py @@ -175,7 +175,8 @@ class FileBasedSource(iobase.BoundedSource): def _validate(self): """Validate if there are actual files in the specified glob pattern """ -if len(fileio.ChannelFactory.glob(self._pattern)) <= 0: +# Limit the responses as we only want to check if something exists +if len(fileio.ChannelFactory.glob(self._pattern, limit=1)) <= 0: raise IOError( 'No files found based on the file pattern %s' % self._pattern) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/fileio.py -- diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py index c71a730..82e7813 100644 --- a/sdks/python/apache_beam/io/fileio.py +++ b/sdks/python/apache_beam/io/fileio.py @@ -588,11 +588,12 @@ class ChannelFactory(object): raise IOError(err) @staticmethod - def glob(path): + def glob(path, limit=None): if path.startswith('gs://'): - return gcsio.GcsIO().glob(path) + return gcsio.GcsIO().glob(path, limit) else: - return glob.glob(path) + files = glob.glob(path) + return files[:limit] @staticmethod def size_in_bytes(path): http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/gcsio.py -- diff --git a/sdks/python/apache_beam/io/gcsio.py b/sdks/python/apache_beam/io/gcsio.py index 9adb946..748465f 100644 --- a/sdks/python/apache_beam/io/gcsio.py +++ b/sdks/python/apache_beam/io/gcsio.py @@ -142,7 +142,7 @@ class GcsIO(object): @retry.with_exponential_backoff( retry_filter=retry.retry_on_server_errors_and_timeout_filter) - def glob(self, pattern): + def glob(self, pattern, limit=None): """Return the GCS path names matching a given path name pattern. Path name patterns are those recognized by fnmatch.fnmatch(). The path @@ -166,9 +166,11 @@ class GcsIO(object): object_paths.append('gs://%s/%s' % (item.bucket, item.name)) if response.nextPageToken: request.pageToken = response.nextPageToken +if limit is not None and len(object_paths) >= limit: + break else: break -return object_paths +return object_paths[:limit] @retry.with_exponential_backoff( retry_filter=retry.retry_on_server_errors_and_timeout_filter) http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/16886904/sdks/python/apache_beam/io/gcsio_test.py -- diff --git a/sdks/python/apache_beam/io/gcsio_test.py b/sdks/python/apache_beam/io/gcsio_test.py index 9d44e17..5af13c6 100644 --- a/sdks/python/apache_beam/io/gcsio_test.py +++ b/sdks/python/apache_beam/io/gcsio_test.py @@ -652,6 +652,13 @@ class TestGCSIO(unittest.TestCase): self.assertEqual( set(self.gcs.glob(file_pattern)), set(expected_file_names)) +# Check if limits are followed correctly +limit = 3 +for file_pattern, expected_object_names in test_cases: + expected_num_items = min(len(expected_object_names), limit) + self.assertEqual( + len(self.gcs.glob(file_pattern, limit)), expected_num_items) + def test_size_of_files_in_glob(self): bucket_name =
[2/2] incubator-beam git commit: Closes #1489
Closes #1489 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e04cd47d Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e04cd47d Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e04cd47d Branch: refs/heads/master Commit: e04cd47ddf8fb5f04f1f684219724031179a55ec Parents: 1abbb90 e3dca4c Author: Dan HalperinAuthored: Fri Dec 2 15:20:17 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 15:20:17 2016 -0800 -- .../beam/examples/cookbook/DeDupExample.java| 96 .../beam/examples/cookbook/DistinctExample.java | 96 2 files changed, 96 insertions(+), 96 deletions(-) --
[2/2] incubator-beam git commit: Closes #1482
Closes #1482 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/0fb56106 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/0fb56106 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/0fb56106 Branch: refs/heads/master Commit: 0fb561068a5420cc8ee668be498e53eb8665fe29 Parents: f70fc40 d6eb514 Author: Dan HalperinAuthored: Fri Dec 2 12:52:59 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 12:52:59 2016 -0800 -- .travis.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[1/2] incubator-beam git commit: travis.yml: disable skipping things that no longer run
Repository: incubator-beam Updated Branches: refs/heads/master f70fc4099 -> 0fb561068 travis.yml: disable skipping things that no longer run Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d6eb5143 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d6eb5143 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d6eb5143 Branch: refs/heads/master Commit: d6eb5143b17eca9e5a59eaf6d2e3cd696e8bb38c Parents: f70fc40 Author: Dan HalperinAuthored: Thu Dec 1 10:04:38 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 12:52:58 2016 -0800 -- .travis.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d6eb5143/.travis.yml -- diff --git a/.travis.yml b/.travis.yml index 9e1406c..a806477 100644 --- a/.travis.yml +++ b/.travis.yml @@ -30,7 +30,7 @@ notifications: env: global: - - MAVEN_OVERRIDE="--settings=.travis/settings.xml -Dmaven.javadoc.skip=true -Dcheckstyle.skip=true -Dfindbugs.skip=true" + - MAVEN_OVERRIDE="--settings=.travis/settings.xml" - MAVEN_CONTAINER_OVERRIDE="-DbeamSurefireArgline='-Xmx512m'" matrix:
[GitHub] incubator-beam-site pull request #100: Website: minor typographical correcti...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam-site/pull/100 Website: minor typographical corrections You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam-site minor-mat-model-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam-site/pull/100.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #100 commit 5438941e1b5234336710224c7b664e29578fba5b Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-02T18:30:36Z Website: minor typographical corrections --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Closes #1483
Repository: incubator-beam Updated Branches: refs/heads/master 7ad787797 -> f70fc4099 Closes #1483 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f70fc409 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f70fc409 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f70fc409 Branch: refs/heads/master Commit: f70fc40992b4ded37ca77c44dc2569666936b30d Parents: 7ad7877 8fd520c Author: Dan HalperinAuthored: Fri Dec 2 09:17:11 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 09:17:11 2016 -0800 -- .../org/apache/beam/runners/dataflow/DataflowRunner.java | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) --
[2/2] incubator-beam git commit: DataflowRunner: reject job submission when the version has not been properly set
DataflowRunner: reject job submission when the version has not been properly set Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8fd520c0 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8fd520c0 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8fd520c0 Branch: refs/heads/master Commit: 8fd520c07e464c4308d8d32cc0e88e2ecd96c8d2 Parents: 7ad7877 Author: Dan HalperinAuthored: Thu Dec 1 11:21:30 2016 -0800 Committer: Dan Halperin Committed: Fri Dec 2 09:17:11 2016 -0800 -- .../org/apache/beam/runners/dataflow/DataflowRunner.java | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/8fd520c0/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java index 6ed386a..0357b46 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java @@ -513,10 +513,14 @@ public class DataflowRunner extends PipelineRunner { Job newJob = jobSpecification.getJob(); newJob.setClientRequestId(requestId); -String version = ReleaseInfo.getReleaseInfo().getVersion(); +ReleaseInfo releaseInfo = ReleaseInfo.getReleaseInfo(); +String version = releaseInfo.getVersion(); +checkState( +!version.equals("${pom.version}"), +"Unable to submit a job to the Dataflow service with unset version ${pom.version}"); System.out.println("Dataflow SDK version: " + version); -newJob.getEnvironment().setUserAgent(ReleaseInfo.getReleaseInfo()); +newJob.getEnvironment().setUserAgent(releaseInfo); // The Dataflow Service may write to the temporary directory directly, so // must be verified. if (!isNullOrEmpty(options.getGcpTempLocation())) {
[2/2] incubator-beam git commit: Closes #1488
Closes #1488 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7ad78779 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7ad78779 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7ad78779 Branch: refs/heads/master Commit: 7ad7877978e94c2b167f12010842e36374400775 Parents: c0c5802 ffa81ed Author: Dan HalperinAuthored: Thu Dec 1 23:17:33 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 23:17:33 2016 -0800 -- runners/apex/pom.xml | 6 -- runners/flink/runner/pom.xml | 11 --- runners/google-cloud-dataflow-java/pom.xml | 6 -- runners/spark/pom.xml | 6 -- 4 files changed, 20 insertions(+), 9 deletions(-) --
[1/2] incubator-beam git commit: Fix pom syntax for excludedGroups for SplittableParDo
Repository: incubator-beam Updated Branches: refs/heads/master c0c580227 -> 7ad787797 Fix pom syntax for excludedGroups for SplittableParDo Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ffa81edd Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ffa81edd Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ffa81edd Branch: refs/heads/master Commit: ffa81edd0ec4d9a8150280efdb6a6de412114743 Parents: c0c5802 Author: Kenneth KnowlesAuthored: Thu Dec 1 21:03:04 2016 -0800 Committer: Kenneth Knowles Committed: Thu Dec 1 21:03:04 2016 -0800 -- runners/apex/pom.xml | 6 -- runners/flink/runner/pom.xml | 11 --- runners/google-cloud-dataflow-java/pom.xml | 6 -- runners/spark/pom.xml | 6 -- 4 files changed, 20 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ffa81edd/runners/apex/pom.xml -- diff --git a/runners/apex/pom.xml b/runners/apex/pom.xml index 983781d..629e890 100644 --- a/runners/apex/pom.xml +++ b/runners/apex/pom.xml @@ -185,8 +185,10 @@ org.apache.beam.sdk.testing.RunnableOnService - org.apache.beam.sdk.testing.UsesStatefulParDo - org.apache.beam.sdk.testing.UsesSplittableParDo + +org.apache.beam.sdk.testing.UsesStatefulParDo, +org.apache.beam.sdk.testing.UsesSplittableParDo + none true http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ffa81edd/runners/flink/runner/pom.xml -- diff --git a/runners/flink/runner/pom.xml b/runners/flink/runner/pom.xml index 3e3dd7e..615d5f1 100644 --- a/runners/flink/runner/pom.xml +++ b/runners/flink/runner/pom.xml @@ -53,8 +53,10 @@ org.apache.beam.sdk.testing.RunnableOnService - org.apache.beam.sdk.testing.UsesStatefulParDo - org.apache.beam.sdk.testing.UsesSplittableParDo + +org.apache.beam.sdk.testing.UsesStatefulParDo, +org.apache.beam.sdk.testing.UsesSplittableParDo + none true @@ -80,7 +82,10 @@ org.apache.beam.sdk.testing.RunnableOnService - org.apache.beam.sdk.testing.UsesStatefulParDo + +org.apache.beam.sdk.testing.UsesStatefulParDo, +org.apache.beam.sdk.testing.UsesSplittableParDo + none true http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ffa81edd/runners/google-cloud-dataflow-java/pom.xml -- diff --git a/runners/google-cloud-dataflow-java/pom.xml b/runners/google-cloud-dataflow-java/pom.xml index 8547499..adebb2a 100644 --- a/runners/google-cloud-dataflow-java/pom.xml +++ b/runners/google-cloud-dataflow-java/pom.xml @@ -77,8 +77,10 @@ runnable-on-service-tests - org.apache.beam.sdk.testing.UsesStatefulParDo - org.apache.beam.sdk.testing.UsesSplittableParDo + +org.apache.beam.sdk.testing.UsesStatefulParDo, +org.apache.beam.sdk.testing.UsesSplittableParDo + org.apache.beam.sdk.transforms.FlattenTest http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ffa81edd/runners/spark/pom.xml -- diff --git a/runners/spark/pom.xml b/runners/spark/pom.xml index dc000bf..e34af15 100644 --- a/runners/spark/pom.xml +++ b/runners/spark/pom.xml @@ -72,8 +72,10 @@ org.apache.beam.sdk.testing.RunnableOnService - org.apache.beam.sdk.testing.UsesStatefulParDo - org.apache.beam.sdk.testing.UsesSplittableParDo + +org.apache.beam.sdk.testing.UsesStatefulParDo, +org.apache.beam.sdk.testing.UsesSplittableParDo + 1 false true
[2/3] incubator-beam-site git commit: [BEAM-506] Fill in the documentation/runners/flink portion of the website
[BEAM-506] Fill in the documentation/runners/flink portion of the website Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/ac0c4e06 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/ac0c4e06 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/ac0c4e06 Branch: refs/heads/asf-site Commit: ac0c4e063459ca251354b94eed866c0934548fec Parents: 1b458f1 Author: Aljoscha KrettekAuthored: Tue Nov 29 16:23:03 2016 +0100 Committer: Dan Halperin Committed: Thu Dec 1 14:30:22 2016 -0800 -- src/documentation/runners/flink.md | 136 +++- 1 file changed, 135 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/ac0c4e06/src/documentation/runners/flink.md -- diff --git a/src/documentation/runners/flink.md b/src/documentation/runners/flink.md index 4145be6..a984bb4 100644 --- a/src/documentation/runners/flink.md +++ b/src/documentation/runners/flink.md @@ -6,4 +6,138 @@ redirect_from: /learn/runners/flink/ --- # Using the Apache Flink Runner -This page is under construction ([BEAM-506](https://issues.apache.org/jira/browse/BEAM-506)). +The Apache Flink Runner can be used to execute Beam pipelines using [Apache Flink](https://flink.apache.org). When using the Flink Runner you will create a jar file containing your job that can be executed on a regular Flink cluster. It's also possible to execute a Beam pipeline using Flink's local execution mode without setting up a cluster. This is helpful for development and debugging of your pipeline. + +The Flink Runner and Flink are suitable for large scale, continuous jobs, and provide: + +* A streaming-first runtime that supports both batch processing and data streaming programs +* A runtime that supports very high throughput and low event latency at the same time +* Fault-tolerance with *exactly-once* processing guarantees +* Natural back-pressure in streaming programs +* Custom memory management for efficient and robust switching between in-memory and out-of-core data processing algorithms +* Integration with YARN and other components of the Apache Hadoop ecosystem + +The [Beam Capability Matrix]({{ site.baseurl }}/documentation/runners/capability-matrix/) documents the supported capabilities of the Flink Runner. + +## Flink Runner prerequisites and setup + +If you want to use the local execution mode with the Flink runner to don't have to complete any setup. + +To use the Flink Runner for executing on a cluster, you have to setup a Flink cluster by following the Flink [setup quickstart](https://ci.apache.org/projects/flink/flink-docs-release-1.1/quickstart/setup_quickstart.html). + +To find out which version of Flink you need you can run this command to check the version of the Flink dependency that your project is using: +``` +$ mvn dependency:tree -Pflink-runner |grep flink +... +[INFO] | +- org.apache.flink:flink-streaming-java_2.10:jar:1.1.2:runtime +... +``` +Here, we would need Flink 1.1.2. + +For more information, the [Flink Documentation](https://ci.apache.org/projects/flink/flink-docs-release-1.1/) can be helpful. + +### Specify your dependency + +You must specify your dependency on the Flink Runner. + +```java + + org.apache.beam + beam-runners-flink_2.10 + {{ site.release_latest }} + runtime + +``` + +## Executing a pipeline on a Flink cluster + +For executing a pipeline on a Flink cluster you need to package your program along will all dependencies in a so-called fat jar. How you do this depends on your build system but if you follow along the [Beam Quickstart]({{ site.baseurl }}/get-started/quickstart/) this is the command that you have to run: + +``` +$ mvn package -Pflink-runner +``` +The Beam Quickstart Maven project is setup to use the Maven Shade plugin to create a fat jar and the `-Pflink-runner` argument makes sure to include the dependency on the Flink Runner. + +For actually running the pipeline you would use this command +``` +$ mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ +-Pflink-runner \ +-Dexec.args="--runner=FlinkRunner \ + --inputFile=/path/to/pom.xml \ + --output=/path/to/counts \ + --flinkMaster= \ + --filesToStage=target/word-count-beam--bundled-0.1.jar" +``` +If you have a Flink `JobManager` running on your local machine you can give `localhost:6123` for +`flinkMaster`. + +## Pipeline options for the Flink Runner + +When executing your pipeline with the Flink Runner, you can set these pipeline options. + + + + Field + Description + Default Value + + + runner + The pipeline runner to use. This option
[3/3] incubator-beam-site git commit: Regenerate website
Regenerate website Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/f439af09 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/f439af09 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/f439af09 Branch: refs/heads/asf-site Commit: f439af099412e73da73a288cd212ff8e93221e35 Parents: 7e96f7b Author: Dan HalperinAuthored: Thu Dec 1 14:31:10 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 14:31:10 2016 -0800 -- content/documentation/runners/flink/index.html | 138 +++- 1 file changed, 137 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/f439af09/content/documentation/runners/flink/index.html -- diff --git a/content/documentation/runners/flink/index.html b/content/documentation/runners/flink/index.html index 6ccaff7..edd5bcd 100644 --- a/content/documentation/runners/flink/index.html +++ b/content/documentation/runners/flink/index.html @@ -146,7 +146,143 @@ Using the Apache Flink Runner -This page is under construction (https://issues.apache.org/jira/browse/BEAM-506;>BEAM-506). +The Apache Flink Runner can be used to execute Beam pipelines using https://flink.apache.org;>Apache Flink. When using the Flink Runner you will create a jar file containing your job that can be executed on a regular Flink cluster. Itâs also possible to execute a Beam pipeline using Flinkâs local execution mode without setting up a cluster. This is helpful for development and debugging of your pipeline. + +The Flink Runner and Flink are suitable for large scale, continuous jobs, and provide: + + + A streaming-first runtime that supports both batch processing and data streaming programs + A runtime that supports very high throughput and low event latency at the same time + Fault-tolerance with exactly-once processing guarantees + Natural back-pressure in streaming programs + Custom memory management for efficient and robust switching between in-memory and out-of-core data processing algorithms + Integration with YARN and other components of the Apache Hadoop ecosystem + + +The Beam Capability Matrix documents the supported capabilities of the Flink Runner. + +Flink Runner prerequisites and setup + +If you want to use the local execution mode with the Flink runner to donât have to complete any setup. + +To use the Flink Runner for executing on a cluster, you have to setup a Flink cluster by following the Flink https://ci.apache.org/projects/flink/flink-docs-release-1.1/quickstart/setup_quickstart.html;>setup quickstart. + +To find out which version of Flink you need you can run this command to check the version of the Flink dependency that your project is using: +$ mvn dependency:tree -Pflink-runner |grep flink +... +[INFO] | +- org.apache.flink:flink-streaming-java_2.10:jar:1.1.2:runtime +... + + +Here, we would need Flink 1.1.2. + +For more information, the https://ci.apache.org/projects/flink/flink-docs-release-1.1/;>Flink Documentation can be helpful. + +Specify your dependency + +You must specify your dependency on the Flink Runner. + +dependency + groupIdorg.apache.beam/groupId + artifactIdbeam-runners-flink_2.10/artifactId + version0.3.0-incubating/version + scoperuntime/scope +/dependency + + + +Executing a pipeline on a Flink cluster + +For executing a pipeline on a Flink cluster you need to package your program along will all dependencies in a so-called fat jar. How you do this depends on your build system but if you follow along the Beam Quickstart this is the command that you have to run: + +$ mvn package -Pflink-runner + + +The Beam Quickstart Maven project is setup to use the Maven Shade plugin to create a fat jar and the -Pflink-runner argument makes sure to include the dependency on the Flink Runner. + +For actually running the pipeline you would use this command +$ mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ +-Pflink-runner \ +-Dexec.args="--runner=FlinkRunner \ + --inputFile=/path/to/pom.xml \ + --output=/path/to/counts \ + --flinkMaster=flink master url \ + --filesToStage=target/word-count-beam--bundled-0.1.jar" + + +If you have a Flink JobManager running on your local machine you can give localhost:6123 for +flinkMaster. + +Pipeline options for the Flink Runner + +When executing your pipeline with the Flink Runner, you can set these pipeline options. + + + + Field + Description + Default Value + + + runner + The pipeline runner to use. This option allows you to determine the pipeline runner at runtime. + Set to FlinkRunner to run using
[1/3] incubator-beam-site git commit: Closes #97
Repository: incubator-beam-site Updated Branches: refs/heads/asf-site 1b458f102 -> f439af099 Closes #97 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/7e96f7b9 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/7e96f7b9 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/7e96f7b9 Branch: refs/heads/asf-site Commit: 7e96f7b90d569e57e3c9711a51014b5e072d7188 Parents: 1b458f1 ac0c4e0 Author: Dan HalperinAuthored: Thu Dec 1 14:30:22 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 14:30:22 2016 -0800 -- src/documentation/runners/flink.md | 136 +++- 1 file changed, 135 insertions(+), 1 deletion(-) --
[GitHub] incubator-beam pull request #1483: [BEAM-1072] DataflowRunner: reject job su...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1483 [BEAM-1072] DataflowRunner: reject job submission when the version has not been properly set R: @davorbonaci You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam dataflow-runner-require-version Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1483.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1483 commit 6fd0782fbb74d65ee7a535f721804a38badfa1d9 Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-01T19:21:30Z DataflowRunner: reject job submission when the version has not been properly set --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/3] incubator-beam git commit: Revert "Move resource filtering later to avoid spurious rebuilds"
Revert "Move resource filtering later to avoid spurious rebuilds" This reverts commit 2422365719c71cade97e1e74f1fb7f42b264244f. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b36048bd Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b36048bd Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b36048bd Branch: refs/heads/master Commit: b36048bd0e558fea281a1ec42aa8435db09dbe64 Parents: 1094fa6 Author: Dan HalperinAuthored: Thu Dec 1 10:22:15 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 13:10:56 2016 -0800 -- sdks/java/core/pom.xml | 29 +++-- 1 file changed, 7 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/b36048bd/sdks/java/core/pom.xml -- diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml index f842be7..ad84846 100644 --- a/sdks/java/core/pom.xml +++ b/sdks/java/core/pom.xml @@ -40,6 +40,13 @@ + + +src/main/resources +true + + + @@ -74,28 +81,6 @@ org.apache.maven.plugins -maven-resources-plugin - - -resources -compile - - resources - - - - - src/main/resources - true - - - - - - - - -org.apache.maven.plugins maven-jar-plugin
[3/3] incubator-beam git commit: Closes #1480
Closes #1480 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/fd4b631f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/fd4b631f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/fd4b631f Branch: refs/heads/master Commit: fd4b631f1b4aa1538b779c4de591bd9b18526cd6 Parents: 48130f7 b36048b Author: Dan HalperinAuthored: Thu Dec 1 13:10:56 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 13:10:56 2016 -0800 -- sdks/java/core/pom.xml | 29 +++-- .../apache/beam/sdk/util/ReleaseInfoTest.java | 45 2 files changed, 52 insertions(+), 22 deletions(-) --
[1/3] incubator-beam git commit: Add a test of ReleaseInfo
Repository: incubator-beam Updated Branches: refs/heads/master 48130f718 -> fd4b631f1 Add a test of ReleaseInfo Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1094fa6a Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1094fa6a Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1094fa6a Branch: refs/heads/master Commit: 1094fa6ac32046b4c092294b3cee046c91aea5a1 Parents: 48130f7 Author: Dan HalperinAuthored: Thu Dec 1 09:15:28 2016 -0800 Committer: Dan Halperin Committed: Thu Dec 1 13:10:55 2016 -0800 -- .../apache/beam/sdk/util/ReleaseInfoTest.java | 45 1 file changed, 45 insertions(+) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1094fa6a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/ReleaseInfoTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/ReleaseInfoTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/ReleaseInfoTest.java new file mode 100644 index 000..fabb7e2 --- /dev/null +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/ReleaseInfoTest.java @@ -0,0 +1,45 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.util; + +import static org.hamcrest.Matchers.containsString; +import static org.junit.Assert.assertThat; +import static org.junit.Assert.assertTrue; + +import org.junit.Test; + +/** + * Tests for {@link ReleaseInfo}. + */ +public class ReleaseInfoTest { + + @Test + public void getReleaseInfo() throws Exception { +ReleaseInfo info = ReleaseInfo.getReleaseInfo(); + +// Validate name +assertThat(info.getName(), containsString("Beam")); + +// Validate semantic version +String version = info.getVersion(); +String pattern = "\\d+\\.\\d+\\.\\d+.*"; +assertTrue( +String.format("%s does not match pattern %s", version, pattern), +version.matches(pattern)); + } +}
[GitHub] incubator-beam pull request #1482: travis.yml: disable skipping things that ...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1482 travis.yml: disable skipping things that no longer run R: @kennknowles https://travis-ci.org/dhalperi/incubator-beam/builds/180475125 is the link on my repo, with (unsurprisingly) 1 network-related flake You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam travis Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1482.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1482 commit bd9b295c27cab04a0619696b22dec82c7203516e Author: Dan Halperin <dhalp...@google.com> Date: 2016-12-01T18:04:38Z travis.yml: disable skipping things that no longer run --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Update examples archetype with runner profiles
Update examples archetype with runner profiles This makes it possible to run the examples on all runners. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/265c7924 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/265c7924 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/265c7924 Branch: refs/heads/master Commit: 265c79241f802f4d895648c1b1c4b75e6846d245 Parents: a20bc47 Author: Dan HalperinAuthored: Wed Nov 30 11:10:20 2016 -0800 Committer: Dan Halperin Committed: Wed Nov 30 16:28:19 2016 -0800 -- .../main/resources/archetype-resources/pom.xml | 123 --- 1 file changed, 103 insertions(+), 20 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/265c7924/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml -- diff --git a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml index 031ee88..df2e9f3 100644 --- a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml +++ b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml @@ -26,6 +26,10 @@ jar + +0.4.0-incubating-SNAPSHOT + + apache.snapshots @@ -85,36 +89,108 @@ - - - - org.apache.beam - beam-sdks-java-core - 0.4.0-incubating-SNAPSHOT - + + + direct-runner + + + + org.apache.beam + beam-runners-direct-java + ${beam.version} + runtime + + + - - - org.apache.beam - beam-runners-direct-java - 0.4.0-incubating-SNAPSHOT - runtime - + + apex-runner + + + + org.apache.beam + beam-runners-apex + ${beam.version} + runtime + + + + + + dataflow-runner + + + + org.apache.beam + beam-runners-google-cloud-dataflow-java + ${beam.version} + runtime + + + + + flink-runner + + + + org.apache.beam + beam-runners-flink_2.10 + ${beam.version} + runtime + + + + + + spark-runner + + + + org.apache.beam + beam-runners-spark + ${beam.version} + runtime + + + org.apache.spark + spark-streaming_2.10 + 1.6.2 + runtime + + + org.slf4j + jul-to-slf4j + + + + + com.fasterxml.jackson.module + jackson-module-scala_2.10 + 2.7.2 + runtime + + + + + + + org.apache.beam - beam-runners-google-cloud-dataflow-java - 0.4.0-incubating-SNAPSHOT - runtime + beam-sdks-java-core + ${beam.version} - + org.apache.beam beam-sdks-java-io-google-cloud-platform - 0.4.0-incubating-SNAPSHOT + ${beam.version} + com.google.api-client google-api-client @@ -129,7 +205,6 @@ - com.google.apis google-api-services-bigquery @@ -212,5 +287,13 @@ junit 4.11 + + + + org.apache.beam + beam-runners-direct-java + ${beam.version} + test +
[1/2] incubator-beam git commit: Closes #1465
Repository: incubator-beam Updated Branches: refs/heads/master a20bc4793 -> 711c68092 Closes #1465 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/711c6809 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/711c6809 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/711c6809 Branch: refs/heads/master Commit: 711c68092fd771c3f9be4a5d0dd0ecf077f1aeab Parents: a20bc47 265c792 Author: Dan HalperinAuthored: Wed Nov 30 16:28:19 2016 -0800 Committer: Dan Halperin Committed: Wed Nov 30 16:28:19 2016 -0800 -- .../main/resources/archetype-resources/pom.xml | 123 --- 1 file changed, 103 insertions(+), 20 deletions(-) --
[1/2] incubator-beam git commit: Shutdown DynamicSplit Executor in Cleanup
Repository: incubator-beam Updated Branches: refs/heads/master c8f2cdb22 -> 565e99fbf Shutdown DynamicSplit Executor in Cleanup This ensures that the threads will be shut off when the pipeline shuts down, enabling a JVM with no more work to do to shut down as well. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6ef9a288 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6ef9a288 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6ef9a288 Branch: refs/heads/master Commit: 6ef9a288e281a423905c2cba520274d1c4e4747b Parents: c8f2cdb Author: Thomas GrohAuthored: Wed Nov 30 14:30:14 2016 -0800 Committer: Dan Halperin Committed: Wed Nov 30 16:08:20 2016 -0800 -- .../beam/runners/direct/BoundedReadEvaluatorFactory.java | 6 -- .../beam/runners/direct/BoundedReadEvaluatorFactoryTest.java | 6 ++ 2 files changed, 10 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6ef9a288/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java -- diff --git a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java index 65b622f..8874a04 100644 --- a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java +++ b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java @@ -58,7 +58,7 @@ final class BoundedReadEvaluatorFactory implements TransformEvaluatorFactory { */ private static final long REQUIRED_DYNAMIC_SPLIT_ORIGINAL_SIZE = 0; private final EvaluationContext evaluationContext; - private final ExecutorService executor = Executors.newCachedThreadPool(); + @VisibleForTesting final ExecutorService executor = Executors.newCachedThreadPool(); private final long minimumDynamicSplitSize; @@ -87,7 +87,9 @@ final class BoundedReadEvaluatorFactory implements TransformEvaluatorFactory { } @Override - public void cleanup() {} + public void cleanup() { +executor.shutdown(); + } /** * A {@link BoundedReadEvaluator} produces elements from an underlying {@link BoundedSource}, http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6ef9a288/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java -- diff --git a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java index dee95a7..b1ff689 100644 --- a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java +++ b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactoryTest.java @@ -335,6 +335,12 @@ public class BoundedReadEvaluatorFactoryTest { assertThat(TestSource.readerClosed, is(true)); } + @Test + public void cleanupShutsDownExecutor() { +factory.cleanup(); +assertThat(factory.executor.isShutdown(), is(true)); + } + private static class TestSource extends OffsetBasedSource { private static boolean readerClosed; private final Coder coder;
[2/2] incubator-beam git commit: Closes #1470
Closes #1470 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/565e99fb Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/565e99fb Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/565e99fb Branch: refs/heads/master Commit: 565e99fbf8f7a9e9863bdfcfb514e2098365bbc6 Parents: c8f2cdb 6ef9a28 Author: Dan HalperinAuthored: Wed Nov 30 16:08:22 2016 -0800 Committer: Dan Halperin Committed: Wed Nov 30 16:08:22 2016 -0800 -- .../beam/runners/direct/BoundedReadEvaluatorFactory.java | 6 -- .../beam/runners/direct/BoundedReadEvaluatorFactoryTest.java | 6 ++ 2 files changed, 10 insertions(+), 2 deletions(-) --
[GitHub] incubator-beam pull request #1465: Update examples archetype with runner pro...
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1465 Update examples archetype with runner profiles This makes it possible to run the examples on all runners. R: @davorbonaci @bjchambers please be skeptical. You can merge this pull request into a Git repository by running: $ git pull https://github.com/dhalperi/incubator-beam starter-pom Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1465.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1465 commit 2a9b633f567cf955c46e3380b8ac174ccfedb415 Author: Dan Halperin <dhalp...@google.com> Date: 2016-11-30T19:10:20Z Update examples archetype with runner profiles This makes it possible to run the examples on all runners. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[2/2] incubator-beam git commit: Closes #1460
Closes #1460 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8042d52f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8042d52f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8042d52f Branch: refs/heads/master Commit: 8042d52fcb377922a11b9cc5f548690da83a2b1c Parents: b1f7013 98ab559 Author: Dan HalperinAuthored: Tue Nov 29 17:40:51 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 17:40:51 2016 -0800 -- .../direct/FlattenEvaluatorFactoryTest.java | 8 ++--- .../beam/runners/dataflow/DataflowRunner.java | 10 +++--- .../org/apache/beam/sdk/util/WindowedValue.java | 33 +--- .../beam/sdk/testing/PaneExtractorsTest.java| 2 +- .../apache/beam/sdk/util/WindowedValueTest.java | 10 ++ 5 files changed, 49 insertions(+), 14 deletions(-) --
[1/2] incubator-beam git commit: Revert "Remove WindowedValue.valueInEmptyWindows"
Repository: incubator-beam Updated Branches: refs/heads/master b1f7013d8 -> 8042d52fc Revert "Remove WindowedValue.valueInEmptyWindows" This reverts commit 0e49b150e83d85ae432c640da937a9497068e71b, which breaks some DataflowRunner integration tests. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/98ab5594 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/98ab5594 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/98ab5594 Branch: refs/heads/master Commit: 98ab559410bde425c9c1944bcd2f09293c3764dc Parents: b1f7013 Author: Kenneth KnowlesAuthored: Tue Nov 29 16:57:09 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 17:40:50 2016 -0800 -- .../direct/FlattenEvaluatorFactoryTest.java | 8 ++--- .../beam/runners/dataflow/DataflowRunner.java | 10 +++--- .../org/apache/beam/sdk/util/WindowedValue.java | 33 +--- .../beam/sdk/testing/PaneExtractorsTest.java| 2 +- .../apache/beam/sdk/util/WindowedValueTest.java | 10 ++ 5 files changed, 49 insertions(+), 14 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/98ab5594/runners/direct-java/src/test/java/org/apache/beam/runners/direct/FlattenEvaluatorFactoryTest.java -- diff --git a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/FlattenEvaluatorFactoryTest.java b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/FlattenEvaluatorFactoryTest.java index 39c7cab..cb27fbc 100644 --- a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/FlattenEvaluatorFactoryTest.java +++ b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/FlattenEvaluatorFactoryTest.java @@ -78,9 +78,9 @@ public class FlattenEvaluatorFactoryTest { rightSideEvaluator.processElement(WindowedValue.valueInGlobalWindow(-1)); leftSideEvaluator.processElement( WindowedValue.timestampedValueInGlobalWindow(2, new Instant(1024))); -leftSideEvaluator.processElement(WindowedValue.valueInGlobalWindow(4, PaneInfo.NO_FIRING)); +leftSideEvaluator.processElement(WindowedValue.valueInEmptyWindows(4, PaneInfo.NO_FIRING)); rightSideEvaluator.processElement( -WindowedValue.valueInGlobalWindow(2, PaneInfo.ON_TIME_AND_ONLY_FIRING)); +WindowedValue.valueInEmptyWindows(2, PaneInfo.ON_TIME_AND_ONLY_FIRING)); rightSideEvaluator.processElement( WindowedValue.timestampedValueInGlobalWindow(-4, new Instant(-4096))); @@ -104,12 +104,12 @@ public class FlattenEvaluatorFactoryTest { flattenedLeftBundle.commit(Instant.now()).getElements(), containsInAnyOrder( WindowedValue.timestampedValueInGlobalWindow(2, new Instant(1024)), -WindowedValue.valueInGlobalWindow(4, PaneInfo.NO_FIRING), +WindowedValue.valueInEmptyWindows(4, PaneInfo.NO_FIRING), WindowedValue.valueInGlobalWindow(1))); assertThat( flattenedRightBundle.commit(Instant.now()).getElements(), containsInAnyOrder( -WindowedValue.valueInGlobalWindow(2, PaneInfo.ON_TIME_AND_ONLY_FIRING), +WindowedValue.valueInEmptyWindows(2, PaneInfo.ON_TIME_AND_ONLY_FIRING), WindowedValue.timestampedValueInGlobalWindow(-4, new Instant(-4096)), WindowedValue.valueInGlobalWindow(-1))); } http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/98ab5594/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java index 641daf4..0099856 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java @@ -23,7 +23,7 @@ import static com.google.common.base.Preconditions.checkState; import static com.google.common.base.Strings.isNullOrEmpty; import static org.apache.beam.sdk.util.StringUtils.approximatePTransformName; import static org.apache.beam.sdk.util.StringUtils.approximateSimpleName; -import static org.apache.beam.sdk.util.WindowedValue.valueInGlobalWindow; +import static org.apache.beam.sdk.util.WindowedValue.valueInEmptyWindows; import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; @@ -1230,7 +1230,7 @@ public class DataflowRunner extends
[2/2] incubator-beam git commit: Update googledatastore version
Update googledatastore version Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6c8c17a1 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6c8c17a1 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6c8c17a1 Branch: refs/heads/python-sdk Commit: 6c8c17a1c1977ed69860d25dc8ab45640e7a1c53 Parents: ad4dc87 Author: Vikas KedigehalliAuthored: Tue Nov 29 09:54:00 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 14:01:50 2016 -0800 -- sdks/python/setup.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6c8c17a1/sdks/python/setup.py -- diff --git a/sdks/python/setup.py b/sdks/python/setup.py index 525f59c..add6dc0 100644 --- a/sdks/python/setup.py +++ b/sdks/python/setup.py @@ -87,7 +87,7 @@ REQUIRED_PACKAGES = [ 'avro>=1.7.7,<2.0.0', 'dill>=0.2.5,<0.3', 'google-apitools>=0.5.2,<1.0.0', -'googledatastore==6.4.0', +'googledatastore>=6.4.1,<7.0.0', 'httplib2>=0.8,<0.10', 'mock>=1.0.1,<3.0.0', 'oauth2client>=2.0.1,<4.0.0',
[1/2] incubator-beam git commit: Closes #1453
Repository: incubator-beam Updated Branches: refs/heads/python-sdk ad4dc87a4 -> 5ce75a2ea Closes #1453 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5ce75a2e Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5ce75a2e Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5ce75a2e Branch: refs/heads/python-sdk Commit: 5ce75a2eae31dbab4d07d301716b4d7e3218b8b9 Parents: ad4dc87 6c8c17a Author: Dan HalperinAuthored: Tue Nov 29 14:01:50 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 14:01:50 2016 -0800 -- sdks/python/setup.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[2/2] incubator-beam git commit: Closes #1452
Closes #1452 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/8d127beb Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/8d127beb Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/8d127beb Branch: refs/heads/master Commit: 8d127beb867380b53859c98deba74172db57cc0a Parents: 4ce85ed 74682c9 Author: Dan HalperinAuthored: Tue Nov 29 12:22:01 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 12:22:01 2016 -0800 -- .../apache/beam/sdk/options/ValueProvider.java| 2 +- .../beam/sdk/options/ValueProviderTest.java | 18 ++ 2 files changed, 19 insertions(+), 1 deletion(-) --
[1/2] incubator-beam git commit: Add a test demonstrating how to use ValueProvider with non-serializable data
Repository: incubator-beam Updated Branches: refs/heads/master 4ce85ed94 -> 8d127beb8 Add a test demonstrating how to use ValueProvider with non-serializable data Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/74682c92 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/74682c92 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/74682c92 Branch: refs/heads/master Commit: 74682c92d3d2bc5cd5385812ce985a8a75ee4899 Parents: 4ce85ed Author: Sam McVeetyAuthored: Tue Nov 1 17:58:16 2016 -0700 Committer: Dan Halperin Committed: Tue Nov 29 12:22:00 2016 -0800 -- .../apache/beam/sdk/options/ValueProvider.java| 2 +- .../beam/sdk/options/ValueProviderTest.java | 18 ++ 2 files changed, 19 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/74682c92/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java index 2f52ad4..3a2e7ed 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ValueProvider.java @@ -51,7 +51,7 @@ import org.apache.beam.sdk.transforms.SerializableFunction; */ @JsonSerialize(using = ValueProvider.Serializer.class) @JsonDeserialize(using = ValueProvider.Deserializer.class) -public interface ValueProvider { +public interface ValueProvider extends Serializable { /** * Return the value wrapped by this {@link ValueProvider}. */ http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/74682c92/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java -- diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java index be0f076..31532b9 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/options/ValueProviderTest.java @@ -29,6 +29,7 @@ import org.apache.beam.sdk.options.ValueProvider.NestedValueProvider; import org.apache.beam.sdk.options.ValueProvider.RuntimeValueProvider; import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider; import org.apache.beam.sdk.transforms.SerializableFunction; +import org.apache.beam.sdk.util.SerializableUtils; import org.junit.Rule; import org.junit.Test; import org.junit.rules.ExpectedException; @@ -250,4 +251,21 @@ public class ValueProviderTest { expectedException.expectMessage("Not called from a runtime context"); nvp.get(); } + + private static class NonSerializable {} + + private static class NonSerializableTranslator + implements SerializableFunction { +@Override +public NonSerializable apply(String from) { + return new NonSerializable(); +} + } + + @Test + public void testNestedValueProviderSerialize() throws Exception { +ValueProvider nvp = NestedValueProvider.of( +StaticValueProvider.of("foo"), new NonSerializableTranslator()); +SerializableUtils.ensureSerializable(nvp); + } }
[1/2] incubator-beam git commit: Fix double-close bug
Repository: incubator-beam Updated Branches: refs/heads/master 0d95d8c56 -> 4ce85ed94 Fix double-close bug The WritableByteChannel returned for GCS locations has a bug where calling close twice throws an Exception, so we cannot safely use AutoCloseable here. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/01236906 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/01236906 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/01236906 Branch: refs/heads/master Commit: 0123690600ebd5f83cf485c92d6a35762428cd84 Parents: 0d95d8c Author: sammcveetyAuthored: Mon Nov 28 11:26:19 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 12:19:45 2016 -0800 -- .../org/apache/beam/runners/dataflow/DataflowRunner.java| 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/01236906/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java -- diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java index 03c503d..641daf4 100644 --- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java +++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java @@ -59,7 +59,6 @@ import java.net.URISyntaxException; import java.net.URL; import java.net.URLClassLoader; import java.nio.channels.Channels; -import java.nio.channels.WritableByteChannel; import java.util.ArrayList; import java.util.Arrays; import java.util.Collection; @@ -566,15 +565,13 @@ public class DataflowRunner extends PipelineRunner { String.format( "Location must be local or on Cloud Storage, got {}.", fileLocation)); String workSpecJson = DataflowPipelineTranslator.jobToString(newJob); - try ( - WritableByteChannel writer = - IOChannelUtils.create(fileLocation, MimeTypes.TEXT); - PrintWriter printWriter = new PrintWriter(Channels.newOutputStream(writer))) { + try (PrintWriter printWriter = new PrintWriter( + Channels.newOutputStream(IOChannelUtils.create(fileLocation, MimeTypes.TEXT { printWriter.print(workSpecJson); LOG.info("Printed job specification to {}", fileLocation); } catch (IOException ex) { String error = -String.format("Cannot create output file at {}", fileLocation); +String.format("Cannot create output file at %s", fileLocation); if (isTemplate) { throw new RuntimeException(error, ex); } else {
[2/2] incubator-beam git commit: Closes #1441
Closes #1441 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4ce85ed9 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4ce85ed9 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4ce85ed9 Branch: refs/heads/master Commit: 4ce85ed949e6a955433d5ff307cc2af3c38348c8 Parents: 0d95d8c 0123690 Author: Dan HalperinAuthored: Tue Nov 29 12:19:46 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 29 12:19:46 2016 -0800 -- .../org/apache/beam/runners/dataflow/DataflowRunner.java| 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) --
[GitHub] incubator-beam pull request #1449: Demonstrate serializing issue
Github user dhalperi closed the pull request at: https://github.com/apache/incubator-beam/pull/1449 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-beam pull request #1449: Demonstrate serializing issue
GitHub user dhalperi opened a pull request: https://github.com/apache/incubator-beam/pull/1449 Demonstrate serializing issue Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/sammcveety/incubator-beam sgmc/nvp_repro Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1449.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1449 commit c9a5f8d8bfbe48921b1327acb4d2bfe8fa31f12e Author: Sam McVeety <s...@google.com> Date: 2016-11-02T00:58:16Z Demonstrate serializing issue --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[1/2] incubator-beam git commit: Demonstrate PubsubIO with NVP
Repository: incubator-beam Updated Branches: refs/heads/master ae06f759f -> aeff1d5c2 Demonstrate PubsubIO with NVP Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f9225981 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f9225981 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f9225981 Branch: refs/heads/master Commit: f92259814964fb4d3b2381187247b3f11b5fe33f Parents: ae06f75 Author: Sam McVeetyAuthored: Sat Oct 29 19:02:51 2016 -0700 Committer: Dan Halperin Committed: Mon Nov 28 21:14:33 2016 -0800 -- .../java/org/apache/beam/sdk/io/PubsubIO.java | 176 --- .../apache/beam/sdk/io/PubsubUnboundedSink.java | 23 ++- .../beam/sdk/io/PubsubUnboundedSource.java | 40 +++-- .../org/apache/beam/sdk/io/PubsubIOTest.java| 43 +++-- .../beam/sdk/io/PubsubUnboundedSinkTest.java| 20 ++- .../beam/sdk/io/PubsubUnboundedSourceTest.java | 14 +- 6 files changed, 232 insertions(+), 84 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/f9225981/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubIO.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubIO.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubIO.java index 72a6399..9768788 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubIO.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubIO.java @@ -31,11 +31,15 @@ import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.coders.StringUtf8Coder; import org.apache.beam.sdk.coders.VoidCoder; import org.apache.beam.sdk.options.PubsubOptions; +import org.apache.beam.sdk.options.ValueProvider; +import org.apache.beam.sdk.options.ValueProvider.NestedValueProvider; +import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider; import org.apache.beam.sdk.runners.PipelineRunner; import org.apache.beam.sdk.transforms.Create; import org.apache.beam.sdk.transforms.DoFn; import org.apache.beam.sdk.transforms.PTransform; import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.transforms.SerializableFunction; import org.apache.beam.sdk.transforms.display.DisplayData; import org.apache.beam.sdk.transforms.windowing.AfterWatermark; import org.apache.beam.sdk.util.CoderUtils; @@ -134,7 +138,7 @@ public class PubsubIO { * Populate common {@link DisplayData} between Pubsub source and sink. */ private static void populateCommonDisplayData(DisplayData.Builder builder, - String timestampLabel, String idLabel, PubsubTopic topic) { + String timestampLabel, String idLabel, String topic) { builder .addIfNotNull(DisplayData.item("timestampLabel", timestampLabel) .withLabel("Timestamp Label Attribute")) @@ -142,7 +146,7 @@ public class PubsubIO { .withLabel("ID Label Attribute")); if (topic != null) { - builder.add(DisplayData.item("topic", topic.asPath()) + builder.add(DisplayData.item("topic", topic) .withLabel("Pubsub Topic")); } } @@ -253,6 +257,61 @@ public class PubsubIO { } /** + * Used to build a {@link ValueProvider} for {@link PubsubSubscription}. + */ + private static class SubscriptionTranslator + implements SerializableFunction { +@Override +public PubsubSubscription apply(String from) { + return PubsubSubscription.fromPath(from); +} + } + + /** + * Used to build a {@link ValueProvider} for {@link SubscriptionPath}. + */ + private static class SubscriptionPathTranslator + implements SerializableFunction { +@Override +public SubscriptionPath apply(PubsubSubscription from) { + return PubsubClient.subscriptionPathFromName(from.project, from.subscription); +} + } + + /** + * Used to build a {@link ValueProvider} for {@link PubsubTopic}. + */ + private static class TopicTranslator + implements SerializableFunction { +@Override +public PubsubTopic apply(String from) { + return PubsubTopic.fromPath(from); +} + } + + /** + * Used to build a {@link ValueProvider} for {@link TopicPath}. + */ + private static class TopicPathTranslator + implements SerializableFunction { +@Override +public TopicPath apply(PubsubTopic from) { + return PubsubClient.topicPathFromName(from.project, from.topic); +} + } + + /** + * Used to build a {@link ValueProvider} for {@link ProjectPath}. + */ + private static class ProjectPathTranslator + implements
[2/2] incubator-beam git commit: Closes #1230
Closes #1230 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/aeff1d5c Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/aeff1d5c Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/aeff1d5c Branch: refs/heads/master Commit: aeff1d5c219385cce20a275a4e47d9184f5cf59c Parents: ae06f75 f922598 Author: Dan HalperinAuthored: Mon Nov 28 21:21:35 2016 -0800 Committer: Dan Halperin Committed: Mon Nov 28 21:21:35 2016 -0800 -- .../java/org/apache/beam/sdk/io/PubsubIO.java | 176 --- .../apache/beam/sdk/io/PubsubUnboundedSink.java | 23 ++- .../beam/sdk/io/PubsubUnboundedSource.java | 40 +++-- .../org/apache/beam/sdk/io/PubsubIOTest.java| 43 +++-- .../beam/sdk/io/PubsubUnboundedSinkTest.java| 20 ++- .../beam/sdk/io/PubsubUnboundedSourceTest.java | 14 +- 6 files changed, 232 insertions(+), 84 deletions(-) --
[1/2] incubator-beam git commit: Add method to output runtime options
Repository: incubator-beam Updated Branches: refs/heads/master cdb7ba165 -> ae06f759f Add method to output runtime options Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ee52318f Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ee52318f Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ee52318f Branch: refs/heads/master Commit: ee52318f2512c6661823e4f546f84dbc2caa955b Parents: cdb7ba1 Author: sammcveetyAuthored: Fri Oct 21 12:50:01 2016 -0400 Committer: Dan Halperin Committed: Mon Nov 28 20:24:38 2016 -0800 -- .../beam/sdk/options/PipelineOptions.java | 7 ++ .../sdk/options/PipelineOptionsFactory.java | 1 + .../sdk/options/ProxyInvocationHandler.java | 26 .../beam/sdk/options/PipelineOptionsTest.java | 24 ++ 4 files changed, 58 insertions(+) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ee52318f/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java index 2139ed9..ddb040d 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java @@ -24,6 +24,7 @@ import com.fasterxml.jackson.databind.annotation.JsonSerialize; import com.google.auto.service.AutoService; import com.google.common.base.MoreObjects; import java.lang.reflect.Proxy; +import java.util.Map; import java.util.ServiceLoader; import java.util.concurrent.ThreadLocalRandom; import java.util.concurrent.atomic.AtomicLong; @@ -322,6 +323,12 @@ public interface PipelineOptions extends HasDisplayData { } /** + * Returns a map of properties which correspond to {@link ValueProvider.RuntimeValueProvider}, + * keyed by the property name. The value is a map containing type and default information. + */ + Map > outputRuntimeOptions(); + + /** * Provides a unique ID for this {@link PipelineOptions} object, assigned at graph * construction time. */ http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ee52318f/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java index 6009867..9805489 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptionsFactory.java @@ -1219,6 +1219,7 @@ public class PipelineOptionsFactory { // Ignore methods on the base PipelineOptions interface. try { knownMethods.add(iface.getMethod("as", Class.class)); + knownMethods.add(iface.getMethod("outputRuntimeOptions")); knownMethods.add(iface.getMethod("populateDisplayData", DisplayData.Builder.class)); } catch (NoSuchMethodException | SecurityException e) { throw new RuntimeException(e); http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ee52318f/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java index 86f9918..a0e3ec2 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/options/ProxyInvocationHandler.java @@ -48,6 +48,7 @@ import java.io.IOException; import java.lang.annotation.Annotation; import java.lang.reflect.InvocationHandler; import java.lang.reflect.Method; +import java.lang.reflect.ParameterizedType; import java.lang.reflect.Proxy; import java.lang.reflect.Type; import java.util.Arrays; @@ -130,6 +131,8 @@ class ProxyInvocationHandler implements InvocationHandler { return equals(args[0]); } else if (args == null && "hashCode".equals(method.getName())) { return hashCode(); +} else if (args == null && "outputRuntimeOptions".equals(method.getName())) { + return outputRuntimeOptions((PipelineOptions) proxy); } else if (args != null && "as".equals(method.getName()) && args[0]
[2/2] incubator-beam git commit: Closes #1156
Closes #1156 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ae06f759 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ae06f759 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ae06f759 Branch: refs/heads/master Commit: ae06f759fbf72fa31e02dc943ab46afe03471904 Parents: cdb7ba1 ee52318 Author: Dan HalperinAuthored: Mon Nov 28 20:51:48 2016 -0800 Committer: Dan Halperin Committed: Mon Nov 28 20:51:48 2016 -0800 -- .../beam/sdk/options/PipelineOptions.java | 7 ++ .../sdk/options/PipelineOptionsFactory.java | 1 + .../sdk/options/ProxyInvocationHandler.java | 26 .../beam/sdk/options/PipelineOptionsTest.java | 24 ++ 4 files changed, 58 insertions(+) --
[2/2] incubator-beam git commit: datastoreio write/delete ptransform
datastoreio write/delete ptransform update datastore_wordcount example to include writes Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d46203b7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d46203b7 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d46203b7 Branch: refs/heads/python-sdk Commit: d46203b7fcdc9895c9cee1d82710f48aba31a748 Parents: 3dbeb8e Author: Vikas KedigehalliAuthored: Wed Nov 23 14:09:09 2016 -0800 Committer: Dan Halperin Committed: Mon Nov 28 15:54:27 2016 -0800 -- .../apache_beam/examples/datastore_wordcount.py | 137 +++ .../apache_beam/io/datastore/v1/datastoreio.py | 104 +- .../io/datastore/v1/datastoreio_test.py | 46 +++ .../io/datastore/v1/fake_datastore.py | 17 +++ .../apache_beam/io/datastore/v1/helper.py | 35 - .../apache_beam/io/datastore/v1/helper_test.py | 36 + .../io/datastore/v1/query_splitter.py | 7 +- 7 files changed, 349 insertions(+), 33 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d46203b7/sdks/python/apache_beam/examples/datastore_wordcount.py -- diff --git a/sdks/python/apache_beam/examples/datastore_wordcount.py b/sdks/python/apache_beam/examples/datastore_wordcount.py index af75b1c..6b9779b 100644 --- a/sdks/python/apache_beam/examples/datastore_wordcount.py +++ b/sdks/python/apache_beam/examples/datastore_wordcount.py @@ -22,14 +22,18 @@ from __future__ import absolute_import import argparse import logging import re +import uuid + +from google.datastore.v1 import entity_pb2 +from google.datastore.v1 import query_pb2 +from googledatastore import helper as datastore_helper, PropertyFilter import apache_beam as beam from apache_beam.io.datastore.v1.datastoreio import ReadFromDatastore +from apache_beam.io.datastore.v1.datastoreio import WriteToDatastore from apache_beam.utils.options import GoogleCloudOptions from apache_beam.utils.options import PipelineOptions from apache_beam.utils.options import SetupOptions -from google.datastore.v1 import query_pb2 - empty_line_aggregator = beam.Aggregator('emptyLines') average_word_size_aggregator = beam.Aggregator('averageWordLength', @@ -41,7 +45,7 @@ class WordExtractingDoFn(beam.DoFn): """Parse each line of input text into words.""" def process(self, context): -"""Returns an iterator over the words of this element. +"""Returns an iterator over words in contents of Cloud Datastore entity. The element is a line of text. If the line is blank, note that, too. Args: context: the call-specific context: data and aggregator. @@ -61,10 +65,100 @@ class WordExtractingDoFn(beam.DoFn): return words +class EntityWrapper(object): + """Create a Cloud Datastore entity from the given string.""" + def __init__(self, namespace, kind, ancestor): +self._namespace = namespace +self._kind = kind +self._ancestor = ancestor + + def make_entity(self, content): +entity = entity_pb2.Entity() +if self._namespace is not None: + entity.key.partition_id.namespace_id = self._namespace + +# All entities created will have the same ancestor +datastore_helper.add_key_path(entity.key, self._kind, self._ancestor, + self._kind, str(uuid.uuid4())) + +datastore_helper.add_properties(entity, {"content": unicode(content)}) +return entity + + +def write_to_datastore(project, user_options, pipeline_options): + """Creates a pipeline that writes entities to Cloud Datastore.""" + p = beam.Pipeline(options=pipeline_options) + + # pylint: disable=expression-not-assigned + (p + | 'read' >> beam.io.Read(beam.io.TextFileSource(user_options.input)) + | 'create entity' >> beam.Map( + EntityWrapper(user_options.namespace, user_options.kind, + user_options.ancestor).make_entity) + | 'write to datastore' >> WriteToDatastore(project)) + + # Actually run the pipeline (all operations above are deferred). + p.run() + + +def make_ancestor_query(kind, namespace, ancestor): + """Creates a Cloud Datastore ancestor query. + + The returned query will fetch all the entities that have the parent key name + set to the given `ancestor`. + """ + ancestor_key = entity_pb2.Key() + datastore_helper.add_key_path(ancestor_key, kind, ancestor) + if namespace is not None: +ancestor_key.partition_id.namespace_id = namespace + + query = query_pb2.Query() + query.kind.add().name = kind + + datastore_helper.set_property_filter( + query.filter, '__key__', PropertyFilter.HAS_ANCESTOR, ancestor_key) + + return query + + +def
[1/2] incubator-beam git commit: Closes #1433
Repository: incubator-beam Updated Branches: refs/heads/python-sdk 3dbeb8edf -> 1530a1727 Closes #1433 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1530a172 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1530a172 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1530a172 Branch: refs/heads/python-sdk Commit: 1530a17279d098ae7459f689ef02401f5116e54e Parents: 3dbeb8e d46203b Author: Dan HalperinAuthored: Mon Nov 28 15:54:27 2016 -0800 Committer: Dan Halperin Committed: Mon Nov 28 15:54:27 2016 -0800 -- .../apache_beam/examples/datastore_wordcount.py | 137 +++ .../apache_beam/io/datastore/v1/datastoreio.py | 104 +- .../io/datastore/v1/datastoreio_test.py | 46 +++ .../io/datastore/v1/fake_datastore.py | 17 +++ .../apache_beam/io/datastore/v1/helper.py | 35 - .../apache_beam/io/datastore/v1/helper_test.py | 36 + .../io/datastore/v1/query_splitter.py | 7 +- 7 files changed, 349 insertions(+), 33 deletions(-) --
incubator-beam git commit: Revert "Closes #1356"
Repository: incubator-beam Updated Branches: refs/heads/master deef3faf7 -> bb8887398 Revert "Closes #1356" This reverts commit deef3faf7df4885395417cb8f6b4aed6ec3d04e1, reversing changes made to 2bc66f903cdfa328c4bb3546befbaa0f58bdd6fa. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/bb888739 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/bb888739 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/bb888739 Branch: refs/heads/master Commit: bb888739841103115c7182f63bdc4858f68b298e Parents: deef3fa Author: Dan HalperinAuthored: Tue Nov 15 06:28:35 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 15 06:28:35 2016 -0800 -- sdks/java/io/mongodb/pom.xml | 13 + .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java | 4 ++-- 2 files changed, 15 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bb888739/sdks/java/io/mongodb/pom.xml -- diff --git a/sdks/java/io/mongodb/pom.xml b/sdks/java/io/mongodb/pom.xml index 4b100a9..17dc6e7 100644 --- a/sdks/java/io/mongodb/pom.xml +++ b/sdks/java/io/mongodb/pom.xml @@ -31,6 +31,19 @@ IO to read and write on MongoDB. + + + + + org.codehaus.mojo + findbugs-maven-plugin + +true + + + + + org.apache.maven.plugins http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bb888739/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java -- diff --git a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java index 71c017d..2729602 100644 --- a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java +++ b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java @@ -224,7 +224,7 @@ public class MongoDbIO { BasicDBObject stat = new BasicDBObject(); stat.append("collStats", spec.collection()); Document stats = mongoDatabase.runCommand(stat); - return Long.parseLong(stats.get("size").toString()); + return Long.valueOf(stats.get("size").toString()); } @Override @@ -456,7 +456,7 @@ public class MongoDbIO { private static class WriteFn extends DoFn { private final Write spec; - private transient MongoClient client; + private MongoClient client; private List batch; public WriteFn(Write spec) {
[2/2] incubator-beam git commit: [BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO
[BEAM-930] Fix findbugs and re-enable Maven plugin in MongoDbIO Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7c87c662 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7c87c662 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7c87c662 Branch: refs/heads/master Commit: 7c87c662db99e581e28e3198c90d2f43a8eebe6d Parents: 2bc66f9 Author: Jean-Baptiste OnofréAuthored: Mon Nov 14 16:10:53 2016 +0100 Committer: Dan Halperin Committed: Tue Nov 15 04:02:08 2016 -0800 -- sdks/java/io/mongodb/pom.xml | 13 - .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java | 4 ++-- 2 files changed, 2 insertions(+), 15 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7c87c662/sdks/java/io/mongodb/pom.xml -- diff --git a/sdks/java/io/mongodb/pom.xml b/sdks/java/io/mongodb/pom.xml index 17dc6e7..4b100a9 100644 --- a/sdks/java/io/mongodb/pom.xml +++ b/sdks/java/io/mongodb/pom.xml @@ -31,19 +31,6 @@ IO to read and write on MongoDB. - - - - - org.codehaus.mojo - findbugs-maven-plugin - -true - - - - - org.apache.maven.plugins http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7c87c662/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java -- diff --git a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java index 2729602..71c017d 100644 --- a/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java +++ b/sdks/java/io/mongodb/src/main/java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java @@ -224,7 +224,7 @@ public class MongoDbIO { BasicDBObject stat = new BasicDBObject(); stat.append("collStats", spec.collection()); Document stats = mongoDatabase.runCommand(stat); - return Long.valueOf(stats.get("size").toString()); + return Long.parseLong(stats.get("size").toString()); } @Override @@ -456,7 +456,7 @@ public class MongoDbIO { private static class WriteFn extends DoFn { private final Write spec; - private MongoClient client; + private transient MongoClient client; private List batch; public WriteFn(Write spec) {
[1/2] incubator-beam git commit: Closes #1356
Repository: incubator-beam Updated Branches: refs/heads/master 2bc66f903 -> deef3faf7 Closes #1356 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/deef3faf Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/deef3faf Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/deef3faf Branch: refs/heads/master Commit: deef3faf7df4885395417cb8f6b4aed6ec3d04e1 Parents: 2bc66f9 7c87c66 Author: Dan HalperinAuthored: Tue Nov 15 04:02:08 2016 -0800 Committer: Dan Halperin Committed: Tue Nov 15 04:02:08 2016 -0800 -- sdks/java/io/mongodb/pom.xml | 13 - .../java/org/apache/beam/sdk/io/mongodb/MongoDbIO.java | 4 ++-- 2 files changed, 2 insertions(+), 15 deletions(-) --
[1/2] incubator-beam git commit: Use Avro serializer for Kafka checkpoint mark.
Repository: incubator-beam Updated Branches: refs/heads/master b25131422 -> f0f4af581 Use Avro serializer for Kafka checkpoint mark. This is more partable. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/937ac3b2 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/937ac3b2 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/937ac3b2 Branch: refs/heads/master Commit: 937ac3b2ddc60fd9446440c9354139c6234cb625 Parents: b251314 Author: Raghu AngadiAuthored: Tue Nov 8 07:08:32 2016 -0800 Committer: Dan Halperin Committed: Fri Nov 11 16:14:07 2016 -0800 -- .../beam/sdk/io/kafka/KafkaCheckpointMark.java | 32 +--- .../org/apache/beam/sdk/io/kafka/KafkaIO.java | 18 ++- 2 files changed, 32 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/937ac3b2/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCheckpointMark.java -- diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCheckpointMark.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCheckpointMark.java index 4f9e96f..763a98a 100644 --- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCheckpointMark.java +++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaCheckpointMark.java @@ -20,19 +20,21 @@ package org.apache.beam.sdk.io.kafka; import java.io.IOException; import java.io.Serializable; import java.util.List; + +import org.apache.beam.sdk.coders.AvroCoder; import org.apache.beam.sdk.coders.DefaultCoder; -import org.apache.beam.sdk.coders.SerializableCoder; import org.apache.beam.sdk.io.UnboundedSource; -import org.apache.kafka.common.TopicPartition; /** * Checkpoint for an unbounded KafkaIO.Read. Consists of Kafka topic name, partition id, * and the latest offset consumed so far. */ -@DefaultCoder(SerializableCoder.class) -public class KafkaCheckpointMark implements UnboundedSource.CheckpointMark, Serializable { +@DefaultCoder(AvroCoder.class) +public class KafkaCheckpointMark implements UnboundedSource.CheckpointMark { + + private List partitions; - private final List partitions; + private KafkaCheckpointMark() {} // for Avro public KafkaCheckpointMark(List partitions) { this.partitions = partitions; @@ -55,16 +57,24 @@ public class KafkaCheckpointMark implements UnboundedSource.CheckpointMark, Seri * for a single partition. */ public static class PartitionMark implements Serializable { -private final TopicPartition topicPartition; -private final long nextOffset; +private String topic; +private int partition; +private long nextOffset; + +private PartitionMark() {} // for Avro -public PartitionMark(TopicPartition topicPartition, long offset) { - this.topicPartition = topicPartition; +public PartitionMark(String topic, int partition, long offset) { + this.topic = topic; + this.partition = partition; this.nextOffset = offset; } -public TopicPartition getTopicPartition() { - return topicPartition; +public String getTopic() { + return topic; +} + +public int getPartition() { + return partition; } public long getNextOffset() { http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/937ac3b2/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java -- diff --git a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java index 834104e..4212d59 100644 --- a/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java +++ b/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java @@ -49,11 +49,12 @@ import java.util.concurrent.SynchronousQueue; import java.util.concurrent.TimeUnit; import java.util.concurrent.atomic.AtomicBoolean; import javax.annotation.Nullable; + +import org.apache.beam.sdk.coders.AvroCoder; import org.apache.beam.sdk.coders.ByteArrayCoder; import org.apache.beam.sdk.coders.Coder; import org.apache.beam.sdk.coders.CoderException; import org.apache.beam.sdk.coders.KvCoder; -import org.apache.beam.sdk.coders.SerializableCoder; import org.apache.beam.sdk.coders.VoidCoder; import org.apache.beam.sdk.io.Read.Unbounded; import org.apache.beam.sdk.io.UnboundedSource; @@ -721,7 +722,7 @@ public class KafkaIO { @Override public Coder getCheckpointMarkCoder() { - return SerializableCoder.of(KafkaCheckpointMark.class); +
[2/2] incubator-beam git commit: Closes #1312
Closes #1312 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f0f4af58 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f0f4af58 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f0f4af58 Branch: refs/heads/master Commit: f0f4af581f2cb6317ded367d4ddda35df94a7451 Parents: b251314 937ac3b Author: Dan HalperinAuthored: Fri Nov 11 16:14:15 2016 -0800 Committer: Dan Halperin Committed: Fri Nov 11 16:14:15 2016 -0800 -- .../beam/sdk/io/kafka/KafkaCheckpointMark.java | 32 +--- .../org/apache/beam/sdk/io/kafka/KafkaIO.java | 18 ++- 2 files changed, 32 insertions(+), 18 deletions(-) --
[1/2] incubator-beam git commit: [BEAM-954] FileBasedSink: remove unused code of TemporaryFileRetention.
Repository: incubator-beam Updated Branches: refs/heads/master 821923334 -> 6814a99c2 [BEAM-954] FileBasedSink: remove unused code of TemporaryFileRetention. Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2a151127 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2a151127 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2a151127 Branch: refs/heads/master Commit: 2a151127f04733e6a1f87914901ae6b88c329935 Parents: 8219233 Author: Pei HeAuthored: Wed Nov 9 20:12:24 2016 -0800 Committer: Dan Halperin Committed: Fri Nov 11 15:30:03 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 49 ++ .../apache/beam/sdk/io/FileBasedSinkTest.java | 67 2 files changed, 21 insertions(+), 95 deletions(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2a151127/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java -- diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java index e6c37de..2d058ae 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java @@ -279,12 +279,6 @@ public abstract class FileBasedSink extends Sink { * Subclass implementations can change the file naming template by supplying a value for * {@link FileBasedSink#fileNamingTemplate}. * - * Temporary Bundle File Handling: - * - * {@link FileBasedSink.FileBasedWriteOperation#temporaryFileRetention} controls the behavior - * for managing temporary files. By default, temporary files will be removed. Subclasses can - * provide a different value to the constructor. - * * Note that in the case of permanent failure of a bundle's write, no clean up of temporary * files will occur. * @@ -294,23 +288,10 @@ public abstract class FileBasedSink extends Sink { */ public abstract static class FileBasedWriteOperation extends WriteOperation { /** - * Options for handling of temporary output files. - */ -public enum TemporaryFileRetention { - KEEP, - REMOVE -} - -/** * The Sink that this WriteOperation will write to. */ protected final FileBasedSink sink; -/** - * Option to keep or remove temporary output files. - */ -protected final TemporaryFileRetention temporaryFileRetention; - /** Directory for temporary output files. */ protected final String tempDirectory; @@ -350,27 +331,14 @@ public abstract class FileBasedSink extends Sink { } /** - * Construct a FileBasedWriteOperation. - * - * @param sink the FileBasedSink that will be used to configure this write operation. - * @param tempDirectory the base directory to be used for temporary output files. - */ -public FileBasedWriteOperation(FileBasedSink sink, String tempDirectory) { - this(sink, tempDirectory, TemporaryFileRetention.REMOVE); -} - -/** * Create a new FileBasedWriteOperation. * * @param sink the FileBasedSink that will be used to configure this write operation. * @param tempDirectory the base directory to be used for temporary output files. - * @param temporaryFileRetention defines how temporary files are handled. */ -public FileBasedWriteOperation(FileBasedSink sink, String tempDirectory, -TemporaryFileRetention temporaryFileRetention) { +public FileBasedWriteOperation(FileBasedSink sink, String tempDirectory) { this.sink = sink; this.tempDirectory = tempDirectory; - this.temporaryFileRetention = temporaryFileRetention; } /** @@ -415,15 +383,12 @@ public abstract class FileBasedSink extends Sink { } copyToOutputFiles(files, options); - // Optionally remove temporary files. - if (temporaryFileRetention == TemporaryFileRetention.REMOVE) { -// We remove the entire temporary directory, rather than specifically removing the files -// from writerResults, because writerResults includes only successfully completed bundles, -// and we'd like to clean up the failed ones too. -// Note that due to GCS eventual consistency, matching files in the temp directory is also -// currently non-perfect and may fail to delete some files. -removeTemporaryFiles(files, options); - } + // We remove the entire temporary directory, rather than specifically removing the files + // from writerResults, because writerResults
[2/2] incubator-beam git commit: Closes #1331
Closes #1331 Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6814a99c Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6814a99c Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6814a99c Branch: refs/heads/master Commit: 6814a99c22264e3c45864d7deb237108b1bd27d2 Parents: 8219233 2a15112 Author: Dan HalperinAuthored: Fri Nov 11 15:30:04 2016 -0800 Committer: Dan Halperin Committed: Fri Nov 11 15:30:04 2016 -0800 -- .../org/apache/beam/sdk/io/FileBasedSink.java | 49 ++ .../apache/beam/sdk/io/FileBasedSinkTest.java | 67 2 files changed, 21 insertions(+), 95 deletions(-) --
[2/2] incubator-beam git commit: Make BigQueryIO.parseTableSpec public
Make BigQueryIO.parseTableSpec public Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3d622323 Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3d622323 Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3d622323 Branch: refs/heads/master Commit: 3d62232389d44e669671585bc653581fbf1da62b Parents: 703816d Author: Andrew MartinAuthored: Tue Nov 8 15:09:50 2016 -0500 Committer: Dan Halperin Committed: Fri Nov 11 14:50:31 2016 -0800 -- .../main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/3d622323/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java -- diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java index f30825f..7c9b3e0 100644 --- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java +++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java @@ -290,7 +290,7 @@ public class BigQueryIO { * * If the project id is omitted, the default project id is used. */ - static TableReference parseTableSpec(String tableSpec) { + public static TableReference parseTableSpec(String tableSpec) { Matcher match = TABLE_SPEC.matcher(tableSpec); if (!match.matches()) { throw new IllegalArgumentException(