GitHub user manuzhang opened a pull request:
https://github.com/apache/incubator-beam/pull/943
sync gearpump-runner branch with master
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
- [ ] Make sure the PR title is formatted like:
`[BEAM-<Jira issue #>] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
number, if there is one.
- [ ] If this contribution is large, please file an Apache
[Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).
---
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/manuzhang/incubator-beam gearpump-runner-sync
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/943.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #943
----
commit 4e7814f716bba02a2eb1d71f61a95d081b035346
Author: gaurav gupta <[email protected]>
Date: 2016-08-10T23:43:03Z
Made byteArrayCoder final static
commit 869ba7d82390d2af0fbf67f3e6fde81b5fea0d64
Author: Dan Halperin <[email protected]>
Date: 2016-08-11T05:11:12Z
Closes #818
commit 2ae1a7478df037cf558a808816216e7002b33b47
Author: Dan Halperin <[email protected]>
Date: 2016-08-11T00:58:09Z
CompressedSource: CompressedReader is never splittable
The only way it's safe to split a compressed file is if the file is not
compressed. This can only happen when the source itself is splittable,
and that in turn will result in the inner source's reader being returned.
A CompressedReader will only be created in the event that the file is NOT
splittable. So remove all the logic handling splittable compressed
readers, and instead go with the logic when we know/assume the file is
compressed.
* TextIO: test compression with larger files
It is important for correctness that we test with large files
because otherwise the compressed file may be larger than the
uncompressed file, which could mask bugs
* TextIOTest: flesh out more
* TextIOTest: add large uncompressed file
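The point about file size in the commit message above can be illustrated with a minimal, standalone sketch. It is not taken from the Beam test suite: the class name `GzipSizeDemo` and its helpers are hypothetical, and only the JDK's `java.util.zip.GZIPOutputStream` is used. Because gzip adds a fixed header and trailer, a very small input comes out larger than it went in, while a large repetitive input shrinks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of Beam.
public class GzipSizeDemo {
  /** Returns the number of bytes produced by gzip-compressing the input. */
  static int gzippedSize(byte[] input) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
      gzip.write(input);
    } catch (IOException e) {
      // In-memory streams should not fail; rethrow unchecked if they do.
      throw new UncheckedIOException(e);
    }
    return bytes.size();
  }

  /** Builds a repetitive text payload with the given number of lines. */
  static byte[] lines(int count) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < count; i++) {
      sb.append("the quick brown fox\n");
    }
    return sb.toString().getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] tiny = lines(1);
    byte[] large = lines(10_000);
    // The tiny payload grows under gzip; the large one shrinks.
    System.out.println("tiny:  " + tiny.length + " -> " + gzippedSize(tiny));
    System.out.println("large: " + large.length + " -> " + gzippedSize(large));
  }
}
```

A test that only uses tiny inputs would therefore exercise the "compressed is bigger than uncompressed" case and could miss bugs that only appear when the compressed file is the smaller one.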
commit 84a0dd1714028370befa80dea16f720edce05252
Author: Dan Halperin <[email protected]>
Date: 2016-08-11T05:59:51Z
Closes #794
commit 169f7a21336001f423ea2a741b15361bb01de3dc
Author: David Rieber <[email protected]>
Date: 2016-08-09T21:05:25Z
Do not add DataDisks to windmill service jobs.
commit 6b07ab17520c8ca88c1ac7b4fb96b327f848a8c7
Author: Dan Halperin <[email protected]>
Date: 2016-08-11T16:25:54Z
Closes #804
commit 8d4e91009555cacf2e2badc94475fb7655c26438
Author: Thomas Groh <[email protected]>
Date: 2016-08-11T16:16:55Z
Remove timeout in DirectRunnerTest
If the test hangs due to bugs, the infrastructure should kill it.
commit 1d9ad85cceb1f48bbe9ecd44f9d1b9a0668d3f82
Author: Luke Cwik <[email protected]>
Date: 2016-08-11T17:07:53Z
Remove timeout in DirectRunnerTest
This closes #819
commit d20a7ada7eb3ee714917e7c334e1b15ecc2c3b03
Author: bchambers <[email protected]>
Date: 2016-07-29T16:41:17Z
Remove Counter and associated code
Aggregator is the model level concept. Counter was specific to the
Dataflow Runner, and is now not needed as part of Beam.
commit 0e35a9b5e2e7e3c064ffe0beae7176923d1b9679
Author: Thomas Groh <[email protected]>
Date: 2016-08-09T02:09:58Z
Improve Write Error Message
If provided with an Unbounded PCollection, Write will fail due to the
restriction of calling finalize only once. Currently this failure
surfaces in a deep stack trace, based on it not being possible to apply a
GroupByKey. Instead, throw immediately on application with a specific
error message.
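The fail-fast idea in this commit message can be sketched as follows. This is an illustration only, not the code from the commit: `WriteValidationSketch`, `Boundedness`, `Collection`, and `applyWrite` are hypothetical stand-ins for the corresponding Beam concepts, and the error text is invented for the example.

```java
// Hypothetical sketch; none of these types are Beam's own.
public class WriteValidationSketch {
  /** Stand-in for a collection's boundedness. */
  enum Boundedness { BOUNDED, UNBOUNDED }

  /** Minimal stand-in for a PCollection: a name plus its boundedness. */
  static final class Collection {
    final String name;
    final Boundedness boundedness;

    Collection(String name, Boundedness boundedness) {
      this.name = name;
      this.boundedness = boundedness;
    }
  }

  /**
   * Validates the input when the transform is applied, so the user sees a
   * specific message instead of a failure deep inside transform expansion.
   */
  static void applyWrite(Collection input) {
    if (input.boundedness != Boundedness.BOUNDED) {
      throw new IllegalArgumentException(
          "Write can only be applied to a bounded collection, but '"
              + input.name + "' is unbounded");
    }
    // The real transform would expand into its sub-transforms here.
  }

  public static void main(String[] args) {
    applyWrite(new Collection("batch-input", Boundedness.BOUNDED));
    try {
      applyWrite(new Collection("stream-input", Boundedness.UNBOUNDED));
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected at application time: " + e.getMessage());
    }
  }
}
```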
commit aa380d87d4cc429277482ee67118c0515633f8cb
Author: Thomas Groh <[email protected]>
Date: 2016-08-09T17:47:09Z
Remove Streaming Write Overrides in DataflowRunner
These writes should be forbidden based on the boundedness of the input
PCollection. As Write explicitly forbids the application of the
transform to an Unbounded PCollection, this will be equivalent in most
cases; in cases where the input PCollection is Bounded, due to an
UnboundedReadFromBoundedSource, the write will function as expected and
does not need to be forbidden.
commit 3a858ee9eb9f2ebd8a715d048c0abd90b1328a1f
Author: Luke Cwik <[email protected]>
Date: 2016-08-11T17:32:09Z
Improve Write Failure Message
This closes #802
commit a0769ad2a348c1296086b9dc8994e32ba5a06760
Author: bchambers <[email protected]>
Date: 2016-08-11T17:28:04Z
This closes #815
commit c9a32e8b8b4ca182721bf81639bd2a28e53f9525
Author: Mark Liu <[email protected]>
Date: 2016-08-03T00:25:14Z
[BEAM-495] Create General Verifier for File Checksum
commit 0b5da70d296543c00c8c4460107d1c2410c4e55f
Author: Mark Liu <[email protected]>
Date: 2016-08-03T00:47:46Z
Add output checksum to WordCountITOptions
commit a98bbb26c12f96446b314f8229d9218236f0ce06
Author: Mark Liu <[email protected]>
Date: 2016-08-11T18:26:28Z
More unit test and code style fix
commit d7566c53d24b76bcdd2e3d61b436edea31bdb752
Author: Mark Liu <[email protected]>
Date: 2016-08-11T18:55:17Z
Using IOChannelUtils to resolve file path
commit ad449ffd54483c2baf3a334980606b27d18fe386
Author: Luke Cwik <[email protected]>
Date: 2016-08-11T20:57:30Z
[BEAM-495] Create General Verifier for File Checksum
This closes #772
commit 705b72ed0fa0644c6130a6dffe741772d1686d83
Author: Ian Zhou <[email protected]>
Date: 2016-08-05T22:31:59Z
Added unit tests and error handling in removeTemporaryTables
commit 9467de4fb9be16877e8d8e5cec83c1373de58dcc
Author: Dan Halperin <[email protected]>
Date: 2016-08-12T15:18:07Z
Closes #801
commit e21b1c594f4307b2bc5e615d40e1d67f209c527b
Author: Maximilian Michels <[email protected]>
Date: 2016-08-12T15:51:02Z
[flink] add missing maven config to example pom
commit 2bcbf379f3c449e34306d3fee3f108f3abc075c7
Author: Maximilian Michels <[email protected]>
Date: 2016-08-12T15:57:49Z
This closes #821
commit a83738ba5fc631cc9be8c5294963e2ac2e82429d
Author: Pei He <[email protected]>
Date: 2016-08-01T20:41:59Z
Remove DataflowPipelineJob from examples
commit def2526e4512ff31b727192007bf3410b69bcc5d
Author: Dan Halperin <[email protected]>
Date: 2016-08-12T16:47:43Z
Closes #771
commit 54b0514371a1028b271d03cc309e6f66fb909e28
Author: mariusz89016 <[email protected]>
Date: 2016-08-13T22:35:19Z
[BEAM-432] Corrected BigQueryIO javadoc
commit 0b1f664216a3f9403f38ee5263f47f82a4460a50
Author: Dan Halperin <[email protected]>
Date: 2016-08-15T03:30:36Z
Closes #823
commit 12abb1b02246b8d36021c7b1a970daf1b64ba4b9
Author: Thomas Groh <[email protected]>
Date: 2016-07-14T21:51:02Z
Add DoFn @Setup and @Teardown
Methods annotated with these annotations are used to perform expensive
setup work and clean up a DoFn after another method throws an exception
or the DoFn is discarded.
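The lifecycle described above can be sketched in a self-contained way. The annotations below are local stand-ins declared inside the example, not Beam's own `@Setup` and `@Teardown`, and the reflection-based `run` helper is an assumption about how a runner might honor them, not the implementation from this commit.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; the annotations here are local stand-ins.
public class LifecycleSketch {
  @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Setup {}
  @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD) @interface Teardown {}

  /** A function object with expensive setup and matching cleanup. */
  static class UppercaseFn {
    final List<String> events = new ArrayList<>();

    @Setup void open() { events.add("setup"); }

    String process(String element) {
      if (element == null) {
        throw new IllegalArgumentException("null element");
      }
      events.add("process");
      return element.toUpperCase(Locale.ROOT);
    }

    @Teardown void close() { events.add("teardown"); }
  }

  /** Invokes every no-argument method on fn carrying the given annotation. */
  static void invokeAnnotated(Object fn, Class<? extends Annotation> annotation) {
    for (Method m : fn.getClass().getDeclaredMethods()) {
      if (m.isAnnotationPresent(annotation)) {
        try {
          m.setAccessible(true);
          m.invoke(fn);
        } catch (ReflectiveOperationException e) {
          throw new IllegalStateException(e);
        }
      }
    }
  }

  /** Runs setup once, processes all inputs, and always runs teardown. */
  static List<String> run(UppercaseFn fn, List<String> inputs) {
    List<String> out = new ArrayList<>();
    invokeAnnotated(fn, Setup.class);
    try {
      for (String in : inputs) {
        out.add(fn.process(in));
      }
    } finally {
      // Teardown runs even when process throws, mirroring the commit message.
      invokeAnnotated(fn, Teardown.class);
    }
    return out;
  }
}
```

The `finally` block is the part that matters: cleanup happens both on normal completion and after another method throws.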
commit 12b19677280c11b0dca203ef266769b05c90937e
Author: Thomas Groh <[email protected]>
Date: 2016-07-15T18:27:00Z
Add TransformEvaluatorFactory#cleanup
This cleans up any state stored within the Transform Evaluator Factory.
commit cf0bf3bf9fcab2b01d69ff90d9ba3f602a8a5bd4
Author: Thomas Groh <[email protected]>
Date: 2016-07-19T18:03:15Z
Replace CloningThreadLocal with DoFnLifecycleManager
This is a more focused interface that interacts with a DoFn before it
is available for use and after it has completed and the reference is
lost. It is required to properly support setup and teardown, as the
fields in a ThreadLocal cannot all be cleaned up without additional
tracking.
Part of BEAM-452.
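To illustrate why explicit tracking helps, here is a minimal sketch of a per-thread lifecycle manager. It is an assumption about the general shape of such a class, not the actual `DoFnLifecycleManager` from the commit: the names `Manager`, `Fn`, `CountingFn`, and `demo` are hypothetical. Unlike a plain `ThreadLocal`, the map keeps a reference to every instance handed out, so instances created on threads that have since finished can still be torn down.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch; not the Beam implementation.
public class LifecycleManagerSketch {
  interface Fn {
    void setup();
    void teardown();
  }

  static final class Manager<T extends Fn> {
    private final Supplier<T> factory;
    private final Map<Thread, T> outstanding = new ConcurrentHashMap<>();

    Manager(Supplier<T> factory) {
      this.factory = factory;
    }

    /** Returns this thread's instance, creating and setting it up on first use. */
    T get() {
      return outstanding.computeIfAbsent(Thread.currentThread(), t -> {
        T fn = factory.get();
        fn.setup();
        return fn;
      });
    }

    /** Tears down and forgets the calling thread's instance, if any. */
    void remove() {
      T fn = outstanding.remove(Thread.currentThread());
      if (fn != null) {
        fn.teardown();
      }
    }

    /** Tears down every tracked instance and returns how many were cleaned up. */
    int removeAll() {
      int count = 0;
      for (Thread t : new ArrayList<>(outstanding.keySet())) {
        T fn = outstanding.remove(t);
        if (fn != null) {
          fn.teardown();
          count++;
        }
      }
      return count;
    }
  }

  /** Counts instances that have been set up but not yet torn down. */
  static final class CountingFn implements Fn {
    static final AtomicInteger live = new AtomicInteger();
    @Override public void setup() { live.incrementAndGet(); }
    @Override public void teardown() { live.decrementAndGet(); }
  }

  /** Creates instances on two threads, then cleans up both from the caller. */
  static int demo() {
    Manager<CountingFn> manager = new Manager<>(CountingFn::new);
    manager.get();
    Thread worker = new Thread(manager::get);
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    // The worker thread is gone, but its instance is still tracked.
    return manager.removeAll();
  }
}
```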
commit 29cbdceb5b78ce86ad0d90050d7542b0d5b45362
Author: Thomas Groh <[email protected]>
Date: 2016-08-11T17:45:43Z
Move ParDo Lifecycle tests to their own file
These tests are not yet functional in all runners, and this makes them
easier to ignore.
----