[GitHub] [beam] kileys opened a new pull request #12899: [BEAM-8024] Add JPMS E2E test

2020-09-21 Thread GitBox


kileys opened a new pull request #12899:
URL: https://github.com/apache/beam/pull/12899


   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
 | ---
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i
 
con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 Status](htt
 
ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | ![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
 

[GitHub] [beam] kileys commented on pull request #12899: [BEAM-8024] Add JPMS E2E test

2020-09-21 Thread GitBox


kileys commented on pull request #12899:
URL: https://github.com/apache/beam/pull/12899#issuecomment-696393175


   Run Java PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #11856: [BEAM-7505] SideInput Python Load Test job

2020-09-21 Thread GitBox


lukecwik commented on pull request #11856:
URL: https://github.com/apache/beam/pull/11856#issuecomment-696392541


   > Thanks @boyuanzz, it makes sense now why it didn't work. Unfortunately, I 
can't use V2 runner, because I want to run these tests in batch mode as well 
(V2 runner supports streaming only)
   > 
   > @tysonjh Can we move forward with the first six tests (with one, global 
window) now and skip those with 1000 windows? We could wait until batch runner 
supports SDF fully or until we come up with other idea.
   
   V2 runner supports batch.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696399245


   Run Python_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb opened a new pull request #12900: [BEAM-10941] Use standard sharding conventions for fileio writes.

2020-09-21 Thread GitBox


robertwb opened a new pull request #12900:
URL: https://github.com/apache/beam/pull/12900


   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
 | ---
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i
 
con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 Status](htt
 
ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | ![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
 

[GitHub] [beam] codecov[bot] edited a comment on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696398122


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=h1) Report
   > Merging 
[#12896](https://codecov.io/gh/apache/beam/pull/12896?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12896/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12896  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.02% 
   ==
 Files 452  452  
 Lines   5401654016  
   ==
   - Hits4448144473   -8 
   - Misses   9535 9543   +8 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.54%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...eam/runners/portability/fn\_api\_runner/fn\_runner.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2ZuX3J1bm5lci5weQ==)
 | `89.53% <0.00%> (-0.21%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=footer). Last 
update 
[3df7495...aaecf18](https://codecov.io/gh/apache/beam/pull/12896?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] boyuanzz commented on pull request #12888: [BEAM-10861]Moves PubSub Runner API encoding to Read/Write transforms

2020-09-21 Thread GitBox


boyuanzz commented on pull request #12888:
URL: https://github.com/apache/beam/pull/12888#issuecomment-696410670


   Discussed with Luke and Cham separately, we can get ride of 
`with_attributes` and `serialized_attribute_fn ` from both Read and Write. 
There are corresponding changes for Write: 
https://github.com/apache/beam/pull/12806. We should also similar changes to 
Read.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK and Flatten types before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12880  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.02% 
   ==
 Files 452  452  
 Lines   5401654019   +3 
   ==
   - Hits4448144474   -7 
   - Misses   9535 9545  +10 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.27% <100.00%> (+0.09%)` | :arrow_up: |
   | 
[...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==)
 | `88.68% <0.00%> (-1.23%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.54%)` | :arrow_down: |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `80.57% <0.00%> (-0.18%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.45% <0.00%> (-0.14%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[067cba8...38a941d](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

2020-09-21 Thread GitBox


pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696413272


   Run Python 3.8 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] trucleduc opened a new pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


trucleduc opened a new pull request #12902:
URL: https://github.com/apache/beam/pull/12902


   **Please** add a meaningful description for your change here
   
   In #12721 , we delete all FHIR stores in the clean up which may interfere 
with other integration tests. This PR fixes this by only deleting FHIR stores 
generated by the test.
   
   R: @pabloem
   R: @TheNeuralBit 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
 | ---
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i
 
con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 Status](htt
 
ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | ![Build 

[GitHub] [beam] TheNeuralBit commented on pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12902:
URL: https://github.com/apache/beam/pull/12902#issuecomment-696415795


   Run Java PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on a change in pull request #12637: [BEAM-10768] Don't assert the order in which elements are received.

2020-09-21 Thread GitBox


ibzib commented on a change in pull request #12637:
URL: https://github.com/apache/beam/pull/12637#discussion_r492386778



##
File path: sdks/python/apache_beam/runners/worker/data_plane_test.py
##
@@ -108,16 +106,28 @@ def send(instruction_id, transform_id, data):
 ])
 
 # Multiple interleaved writes to multiple instructions.
-send('1', transform_1, b'abc')
-send('2', transform_1, b'def')
+stream11 = from_channel.output_stream('1', transform_1)
+stream11.write(b'abc')
+stream21 = from_channel.output_stream('2', transform_1)
+stream21.write(b'def')
+if not time_based_flush:
+  stream11.close()
 self.assertEqual(
 list(
 itertools.islice(to_channel.input_elements('1', [transform_1]), 
1)),
 [
 beam_fn_api_pb2.Elements.Data(
 instruction_id='1', transform_id=transform_1, data=b'abc')
 ])
-send('2', transform_2, b'ghi')
+if time_based_flush:

Review comment:
   Write does not provide ordering guarantees in this case.
   
   Elements are stored in a 
[queue](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L371)
 before being sent, to enable batching. Elements aren't added to that queue 
until the [flush 
callback](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L493)
 is invoked. Because the flush callback is [invoked 
periodically](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L182)
 starting from when a stream is constructed, there is no guarantee that one 
stream's callback is called before the other.
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] trucleduc commented on pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


trucleduc commented on pull request #12902:
URL: https://github.com/apache/beam/pull/12902#issuecomment-696417081


   @TheNeuralBit Would you please hold on merging, I realized that there could 
be flakiness issue for the existing tests. I'll investigate.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] trucleduc commented on pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


trucleduc commented on pull request #12902:
URL: https://github.com/apache/beam/pull/12902#issuecomment-696419895


   @TheNeuralBit ok fixed. I tried to run the test 10 times and they all passed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696422086


   Run Python_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12902:
URL: https://github.com/apache/beam/pull/12902#issuecomment-696421674


   Run Java PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on pull request #12892: [BEAM-10937] Add first introductory notebook

2020-09-21 Thread GitBox


aaltay commented on pull request #12892:
URL: https://github.com/apache/beam/pull/12892#issuecomment-696425242


   /cc @KevinGG 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12727: [BEAM-10844] Add experiment option prebuild_sdk_container to prebuild python sdk container with dependencies.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12727:
URL: https://github.com/apache/beam/pull/12727#issuecomment-683229966


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=h1) Report
   > Merging 
[#12727](https://codecov.io/gh/apache/beam/pull/12727?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/5ef04e945fb1a3948d665af5d6b7f13b86487616?el=desc)
 will **decrease** coverage by `0.18%`.
   > The diff coverage is `52.77%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12727/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12727  +/-   ##
   ==
   - Coverage   82.36%   82.18%   -0.19% 
   ==
 Files 451  453   +2 
 Lines   5375354177 +424 
   ==
   + Hits4427344524 +251 
   - Misses   9480 9653 +173 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...apache\_beam/runners/dataflow/internal/apiclient.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9hcGljbGllbnQucHk=)
 | `72.60% <0.00%> (ø)` | |
   | 
[...eam/testing/benchmarks/nexmark/nexmark\_launcher.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19sYXVuY2hlci5weQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_perf.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19wZXJmLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_util.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya191dGlsLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...\_beam/runners/portability/sdk\_container\_builder.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9zZGtfY29udGFpbmVyX2J1aWxkZXIucHk=)
 | `31.78% <31.78%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.00% <50.00%> (-0.18%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/environments.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9lbnZpcm9ubWVudHMucHk=)
 | `83.49% <66.66%> (-0.33%)` | :arrow_down: |
   | 
[...dks/python/apache\_beam/options/pipeline\_options.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy9waXBlbGluZV9vcHRpb25zLnB5)
 | `95.01% <70.00%> (+0.02%)` | :arrow_up: |
   | 
[...beam/runners/interactive/background\_caching\_job.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9iYWNrZ3JvdW5kX2NhY2hpbmdfam9iLnB5)
 | `94.78% <90.90%> (ø)` | |
   | 
[sdks/python/apache\_beam/utils/histogram.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaGlzdG9ncmFtLnB5)
 | `94.28% <94.28%> (ø)` | |
   | ... and [70 
more](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=footer). Last 
update 
[68e7f2e...4f63252](https://codecov.io/gh/apache/beam/pull/12727?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12799: [BEAM-10603] Add record_pipeline, clear to RM and fix duration limiter

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12799:
URL: https://github.com/apache/beam/pull/12799#issuecomment-692960218


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12799?src=pr=h1) Report
   > Merging 
[#12799](https://codecov.io/gh/apache/beam/pull/12799?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/e0e5fb529d3c7184151142c3c9bbc23ed3b20644?el=desc)
 will **decrease** coverage by `0.04%`.
   > The diff coverage is `72.34%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12799/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12799?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12799  +/-   ##
   ==
   - Coverage   82.35%   82.31%   -0.05% 
   ==
 Files 450  451   +1 
 Lines   5370853878 +170 
   ==
   + Hits4423044347 +117 
   - Misses   9478 9531  +53 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12799?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/snowflake.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc25vd2ZsYWtlLnB5)
 | `64.15% <ø> (ø)` | |
   | 
[...beam/runners/interactive/background\_caching\_job.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9iYWNrZ3JvdW5kX2NhY2hpbmdfam9iLnB5)
 | `96.52% <ø> (+1.73%)` | :arrow_up: |
   | 
[...eam/runners/portability/fn\_api\_runner/fn\_runner.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2ZuX3J1bm5lci5weQ==)
 | `89.53% <ø> (-0.21%)` | :arrow_down: |
   | 
[...apache\_beam/runners/portability/portable\_runner.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9wb3J0YWJsZV9ydW5uZXIucHk=)
 | `77.37% <ø> (ø)` | |
   | 
[...beam/testing/benchmarks/nexmark/queries/query10.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvcXVlcmllcy9xdWVyeTEwLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[sdks/python/apache\_beam/transforms/stats.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9zdGF0cy5weQ==)
 | `87.39% <20.00%> (-3.03%)` | :arrow_down: |
   | 
[...on/apache\_beam/runners/direct/sdf\_direct\_runner.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3Qvc2RmX2RpcmVjdF9ydW5uZXIucHk=)
 | `36.06% <50.00%> (ø)` | |
   | 
[...pache\_beam/runners/interactive/interactive\_beam.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9iZWFtLnB5)
 | `76.02% <64.40%> (-10.53%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=)
 | `92.16% <66.66%> (+0.01%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/dataframe/frames.py](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZGF0YWZyYW1lL2ZyYW1lcy5weQ==)
 | `89.90% <83.33%> (-0.59%)` | :arrow_down: |
   | ... and [38 
more](https://codecov.io/gh/apache/beam/pull/12799/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12799?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12799?src=pr=footer). Last 
update 
[126554d...f119319](https://codecov.io/gh/apache/beam/pull/12799?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12782: Overriding Dataflow Native BQSource.

2020-09-21 Thread GitBox


pabloem commented on pull request #12782:
URL: https://github.com/apache/beam/pull/12782#issuecomment-696442124


   Run Python 3.8 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12727: [BEAM-10844] Add experiment option prebuild_sdk_container to prebuild python sdk container with dependencies.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12727:
URL: https://github.com/apache/beam/pull/12727#issuecomment-683229966


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=h1) Report
   > Merging 
[#12727](https://codecov.io/gh/apache/beam/pull/12727?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/5ef04e945fb1a3948d665af5d6b7f13b86487616?el=desc)
 will **decrease** coverage by `0.18%`.
   > The diff coverage is `52.77%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12727/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12727  +/-   ##
   ==
   - Coverage   82.36%   82.18%   -0.19% 
   ==
 Files 451  453   +2 
 Lines   5375354177 +424 
   ==
   + Hits4427344524 +251 
   - Misses   9480 9653 +173 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...apache\_beam/runners/dataflow/internal/apiclient.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9hcGljbGllbnQucHk=)
 | `72.60% <0.00%> (ø)` | |
   | 
[...eam/testing/benchmarks/nexmark/nexmark\_launcher.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19sYXVuY2hlci5weQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_perf.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19wZXJmLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_util.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya191dGlsLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...\_beam/runners/portability/sdk\_container\_builder.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9zZGtfY29udGFpbmVyX2J1aWxkZXIucHk=)
 | `31.78% <31.78%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.00% <50.00%> (-0.18%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/environments.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9lbnZpcm9ubWVudHMucHk=)
 | `83.49% <66.66%> (-0.33%)` | :arrow_down: |
   | 
[...dks/python/apache\_beam/options/pipeline\_options.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy9waXBlbGluZV9vcHRpb25zLnB5)
 | `95.01% <70.00%> (+0.02%)` | :arrow_up: |
   | 
[...beam/runners/interactive/background\_caching\_job.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9iYWNrZ3JvdW5kX2NhY2hpbmdfam9iLnB5)
 | `94.78% <90.90%> (ø)` | |
   | 
[sdks/python/apache\_beam/utils/histogram.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaGlzdG9ncmFtLnB5)
 | `94.28% <94.28%> (ø)` | |
   | ... and [70 
more](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=footer). Last 
update 
[68e7f2e...4f63252](https://codecov.io/gh/apache/beam/pull/12727?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #12794: [BEAM-10865] Support for Kafka deserialization API with headers (since Kafka API 2.1.0)

2020-09-21 Thread GitBox


lukecwik commented on pull request #12794:
URL: https://github.com/apache/beam/pull/12794#issuecomment-696393557


   > Thanks for the feedback and the merge Luke. I'll address the updates in a 
PR.
   > 
   > I agree all tests makes more sense as one can never have enough coverage. 
That said, I'm been thinking about the `ConsumerSpEL` and the fact the in the 
test suite it until now only executed against Kafka clients API 1.0.0 
(`library.java.kafka_clients`) for each run. It's possible to also have the 
tasks and configurations dynamically created for a range of Kafka API versions 
too now that a pattern emerged, but that also would add quite a lot of time to 
test runs.
   > 
   > Thoughts?
   
   I think that is a good idea to enumerate the most popular kafka client 
versions. I don't think the unit tests are that slow and the community can 
always break-up the set that run to use more jenkins executors in parallel if 
it becomes a large enough issue. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

2020-09-21 Thread GitBox


chamikaramj commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r492364337



##
File path: sdks/python/apache_beam/io/gcp/spanner.py
##
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python 
SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) 
or
+  build (when running from a Beam Git clone) a expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+'WriteToSpanner',
+'ReadFromSpanner',
+'MutationCreator',
+'TimestampBoundMode',
+'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+  'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+'WriteToSpannerSchema',
+[
+('instance_id', unicode),
+('database_id', unicode),
+('project_id', Optional[unicode]),
+('batch_size_bytes', Optional[int]),
+('max_num_mutations', Optional[int]),
+('max_num_rows', Optional[int]),
+('grouping_factor', Optional[int]),
+('host', Optional[unicode]),
+('emulator_host', Optional[unicode]),
+('commit_deadline', Optional[int]),
+('max_cumulative_backoff', Optional[int]),
+],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
   Probably it makes sense to converge into one implementation. I'd 

[GitHub] [beam] codecov[bot] edited a comment on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696398122


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=h1) Report
   > Merging 
[#12896](https://codecov.io/gh/apache/beam/pull/12896?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12896/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12896  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.02% 
   ==
 Files 452  452  
 Lines   5401654016  
   ==
   - Hits4448144473   -8 
   - Misses   9535 9543   +8 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.54%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...eam/runners/portability/fn\_api\_runner/fn\_runner.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2ZuX3J1bm5lci5weQ==)
 | `89.53% <0.00%> (-0.21%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=footer). Last 
update 
[3df7495...aaecf18](https://codecov.io/gh/apache/beam/pull/12896?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


codecov[bot] commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696398122


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=h1) Report
   > Merging 
[#12896](https://codecov.io/gh/apache/beam/pull/12896?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12896/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12896  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.02% 
   ==
 Files 452  452  
 Lines   5401654016  
   ==
   - Hits4448144473   -8 
   - Misses   9535 9543   +8 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12896?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.54%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...eam/runners/portability/fn\_api\_runner/fn\_runner.py](https://codecov.io/gh/apache/beam/pull/12896/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2ZuX3J1bm5lci5weQ==)
 | `89.53% <0.00%> (-0.21%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12896?src=pr=footer). Last 
update 
[3df7495...aaecf18](https://codecov.io/gh/apache/beam/pull/12896?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb commented on pull request #12873: Remove experimental declarations from fileio.

2020-09-21 Thread GitBox


robertwb commented on pull request #12873:
URL: https://github.com/apache/beam/pull/12873#issuecomment-696399851


   Sure, we can keep it experimental a bit longer. I would like to fix 
https://github.com/apache/beam/pull/12900 though.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib opened a new pull request #12901: Move ZetaSQL UDF tests into separate class.

2020-09-21 Thread GitBox


ibzib opened a new pull request #12901:
URL: https://github.com/apache/beam/pull/12901


   R: @amaliujia 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
 | ---
   Java | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/i
 
con)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 Status](htt
 
ps://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
 | [![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | ![Build 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)[![Build
 

[GitHub] [beam] TheNeuralBit commented on pull request #12721: [BEAM-10871] Add deidentify for FhirIO connector

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12721:
URL: https://github.com/apache/beam/pull/12721#issuecomment-696407136


   I think this broke Fhir integration tests in PostCommit: 
https://ci-beam.apache.org/job/beam_PostCommit_Java/6624/



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] trucleduc commented on pull request #12721: [BEAM-10871] Add deidentify for FhirIO connector

2020-09-21 Thread GitBox


trucleduc commented on pull request #12721:
URL: https://github.com/apache/beam/pull/12721#issuecomment-696408469


   I guess the issue is because we delete all FHIR stores after test. I'll send 
a PR to fix it.
   
   
https://github.com/apache/beam/blob/7f474b544f868d2addbc2463cb26e0fed31061b1/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/healthcare/FhirIOLROIT.java#L74



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kennknowles commented on pull request #12877: Upgrade GCS IO to 2.1.5 and Google OAuth to 1.31.0

2020-09-21 Thread GitBox


kennknowles commented on pull request #12877:
URL: https://github.com/apache/beam/pull/12877#issuecomment-696412239


   Ran `./gradlew -Ppublishing 
-PjavaLinkageArtifactIds=beam-sdks-java-io-google-cloud-platform,beam-runners-google-cloud-dataflow-java
 :checkJavaLinkage > ~/tmp/linkage-after` against `master` and this PR. Both 
fail, but here is the diff, with no additional errors:
   
   ```
   13094,13096c13115,13117
   < com.google.cloud.hadoop.gcsio.cooplock.CoopLockOperationDao 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   < com.google.cloud.hadoop.gcsio.cooplock.CoopLockRecordsDao 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   < com.google.cloud.hadoop.gcsio.testing.InMemoryObjectEntry 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   ---
   > com.google.cloud.hadoop.gcsio.cooplock.CoopLockOperationDao 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   > com.google.cloud.hadoop.gcsio.cooplock.CoopLockRecordsDao 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   > com.google.cloud.hadoop.gcsio.testing.InMemoryObjectEntry 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   13100c13121
   <   unselected: 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.3 (compile) / 
com.google.guava:guava:29.0-jre (compile)
   ---
   >   unselected: 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.5 (compile) / 
com.google.guava:guava:29.0-jre (compile)
   13122c13143
   <   and 696 other dependency paths.
   ---
   >   and 714 other dependency paths.
   13153,13154c13174,13175
   < com.google.cloud.bigdataoss:gcsio:2.1.3 is at:
   <   org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.3 (compile)
   ---
   > com.google.cloud.bigdataoss:gcsio:2.1.5 is at:
   >   org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.5 (compile)
   13163c13184
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on pull request #12895: [BEAM-9372][BEAM-9980] Switches Flink VR suite to Py36 and makes the version configurable.

2020-09-21 Thread GitBox


tvalentyn commented on pull request #12895:
URL: https://github.com/apache/beam/pull/12895#issuecomment-696411840


   Run Seed Job



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on a change in pull request #12880: [BEAM-10933] Adjust GBK and Flatten types before creating the pipeline proto

2020-09-21 Thread GitBox


chamikaramj commented on a change in pull request #12880:
URL: https://github.com/apache/beam/pull/12880#discussion_r492380763



##
File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
##
@@ -488,12 +497,15 @@ def run_pipeline(self, pipeline, options):
   # Cross language transform require using a pipeline object constructed
   # from the full pipeline proto to make sure that expanded version of
   # external transforms are reflected in the Pipeline job graph.
+  # TODO(chamikara): remove following pipeline and pipeline proto 
recreation
+  # after portable job submission path is fully in place.
   from apache_beam import Pipeline
   pipeline = Pipeline.from_runner_api(
   self.proto_pipeline,
   pipeline.runner,
   options,
   allow_proto_holders=True)
+  self._adjust_types_for_dataflow(pipeline)

Review comment:
   Verified that types are preserved in the pipeline->proto->pipeline 
roundtrip and removed this.

##
File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
##
@@ -430,6 +430,11 @@ def _check_for_unsupported_fnapi_features(self, 
pipeline_proto):
 components.coders[windowing_strategy.window_coder_id].spec.urn,
 windowing_strategy.window_fn.urn))
 
+  def _adjust_types_for_dataflow(self, pipeline):

Review comment:
   Done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] trucleduc commented on pull request #12721: [BEAM-10871] Add deidentify for FhirIO connector

2020-09-21 Thread GitBox


trucleduc commented on pull request #12721:
URL: https://github.com/apache/beam/pull/12721#issuecomment-696414909


   Sent https://github.com/apache/beam/pull/12902



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12902: [BEAM-10871] Fix FhirLROIT tests

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12902:
URL: https://github.com/apache/beam/pull/12902#issuecomment-696417351


   ok I'll hold off, thanks



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] rohdesamuel commented on pull request #12799: [BEAM-10603] Add record_pipeline, clear to RM and fix duration limiter

2020-09-21 Thread GitBox


rohdesamuel commented on pull request #12799:
URL: https://github.com/apache/beam/pull/12799#issuecomment-696426921


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696430017


   PVR_Flink is now disabled: 
https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Commit I don't 
think the failure is related.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit merged pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit merged pull request #12896:
URL: https://github.com/apache/beam/pull/12896


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


ibzib commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696436679


   > PVR_Flink is now disabled: 
https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Commit I don't 
think the failure is related.
   
   That's unexpected.. I wonder if it was disabled accidentally (perhaps 
because of Python version removals?)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12727: [BEAM-10844] Add experiment option prebuild_sdk_container to prebuild python sdk container with dependencies.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12727:
URL: https://github.com/apache/beam/pull/12727#issuecomment-683229966


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=h1) Report
   > Merging 
[#12727](https://codecov.io/gh/apache/beam/pull/12727?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/5ef04e945fb1a3948d665af5d6b7f13b86487616?el=desc)
 will **decrease** coverage by `0.18%`.
   > The diff coverage is `52.77%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12727/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12727  +/-   ##
   ==
   - Coverage   82.36%   82.18%   -0.19% 
   ==
 Files 451  453   +2 
 Lines   5375354177 +424 
   ==
   + Hits4427344524 +251 
   - Misses   9480 9653 +173 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12727?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...apache\_beam/runners/dataflow/internal/apiclient.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9pbnRlcm5hbC9hcGljbGllbnQucHk=)
 | `72.60% <0.00%> (ø)` | |
   | 
[...eam/testing/benchmarks/nexmark/nexmark\_launcher.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19sYXVuY2hlci5weQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_perf.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19wZXJmLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...he\_beam/testing/benchmarks/nexmark/nexmark\_util.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya191dGlsLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[...\_beam/runners/portability/sdk\_container\_builder.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9zZGtfY29udGFpbmVyX2J1aWxkZXIucHk=)
 | `31.78% <31.78%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.00% <50.00%> (-0.18%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/environments.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9lbnZpcm9ubWVudHMucHk=)
 | `83.49% <66.66%> (-0.33%)` | :arrow_down: |
   | 
[...dks/python/apache\_beam/options/pipeline\_options.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy9waXBlbGluZV9vcHRpb25zLnB5)
 | `95.01% <70.00%> (+0.02%)` | :arrow_up: |
   | 
[...beam/runners/interactive/background\_caching\_job.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9iYWNrZ3JvdW5kX2NhY2hpbmdfam9iLnB5)
 | `94.78% <90.90%> (ø)` | |
   | 
[sdks/python/apache\_beam/utils/histogram.py](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaGlzdG9ncmFtLnB5)
 | `94.28% <94.28%> (ø)` | |
   | ... and [70 
more](https://codecov.io/gh/apache/beam/pull/12727/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12727?src=pr=footer). Last 
update 
[68e7f2e...4f63252](https://codecov.io/gh/apache/beam/pull/12727?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12889:
URL: https://github.com/apache/beam/pull/12889#issuecomment-696446965


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=h1) Report
   > Merging 
[#12889](https://codecov.io/gh/apache/beam/pull/12889?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12889/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12889   +/-   ##
   ===
 Coverage   82.32%   82.33%   
   ===
 Files 452  453+1 
 Lines   5401654040   +24 
   ===
   + Hits4447144496   +25 
   + Misses   9545 9544-1 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.18%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.58% <0.00%> (+0.13%)` | :arrow_up: |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `80.75% <0.00%> (+0.17%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   | 
[...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==)
 | `89.90% <0.00%> (+1.22%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=footer). Last 
update 
[1b26660...4146e55](https://codecov.io/gh/apache/beam/pull/12889?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12900: [BEAM-10941] Use standard sharding conventions for fileio writes.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12900:
URL: https://github.com/apache/beam/pull/12900#issuecomment-696448479


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=h1) Report
   > Merging 
[#12900](https://codecov.io/gh/apache/beam/pull/12900?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12900/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12900?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12900   +/-   ##
   ===
 Coverage   82.34%   82.34%   
   ===
 Files 452  452   
 Lines   5401654018+2 
   ===
   + Hits4448144483+2 
 Misses   9535 9535   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12900?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/fileio.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmlsZWlvLnB5)
 | `95.84% <100.00%> (+1.77%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.45% <0.00%> (-0.14%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=footer). Last 
update 
[067cba8...e29eaad](https://codecov.io/gh/apache/beam/pull/12900?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


chamikaramj commented on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-696451661


   Run Python_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

2020-09-21 Thread GitBox


pabloem commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696456903


   Run Python 3.8 PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] pabloem commented on pull request #12782: Overriding Dataflow Native BQSource.

2020-09-21 Thread GitBox


pabloem commented on pull request #12782:
URL: https://github.com/apache/beam/pull/12782#issuecomment-696456995


   Run Python2_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **decrease** coverage by `0.00%`.
   > The diff coverage is `92.59%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12880  +/-   ##
   ==
   - Coverage   82.32%   82.32%   -0.01% 
   ==
 Files 452  453   +1 
 Lines   5401654042  +26 
   ==
   + Hits4447144492  +21 
   - Misses   9545 9550   +5 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.24% <100.00%> (+0.06%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=)
 | `91.97% <0.00%> (-0.19%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[096f695...83a4d41](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12880   +/-   ##
   ===
 Coverage   82.32%   82.32%   
   ===
 Files 452  452   
 Lines   5401654018+2 
   ===
   + Hits4447144473+2 
 Misses   9545 9545   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.24% <100.00%> (+0.06%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.18% <0.00%> (-0.27%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.18%)` | :arrow_down: |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `80.75% <0.00%> (+0.17%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   | 
[...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==)
 | `89.90% <0.00%> (+1.22%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[096f695...83a4d41](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj merged pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


chamikaramj merged pull request #12880:
URL: https://github.com/apache/beam/pull/12880


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **decrease** coverage by `0.00%`.
   > The diff coverage is `92.59%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12880  +/-   ##
   ==
   - Coverage   82.32%   82.32%   -0.01% 
   ==
 Files 452  453   +1 
 Lines   5401654042  +26 
   ==
   + Hits4447144492  +21 
   - Misses   9545 9550   +5 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.24% <100.00%> (+0.06%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=)
 | `91.97% <0.00%> (-0.19%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[096f695...83a4d41](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #9907: [BEAM-4091] Pass type hints in ptransform_fn

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #9907:
URL: https://github.com/apache/beam/pull/9907#issuecomment-682206351


   # [Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=h1) Report
   > Merging 
[#9907](https://codecov.io/gh/apache/beam/pull/9907?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **increase** coverage by `0.01%`.
   > The diff coverage is `93.67%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/9907/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/9907?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#9907  +/-   ##
   ==
   + Coverage   82.32%   82.34%   +0.01% 
   ==
 Files 452  453   +1 
 Lines   5401654096  +80 
   ==
   + Hits4447144547  +76 
   - Misses   9545 9549   +4 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/9907?src=pr=tree) | Coverage Δ 
| |
   |---|---|---|
   | 
[...dks/python/apache\_beam/options/pipeline\_options.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vb3B0aW9ucy9waXBlbGluZV9vcHRpb25zLnB5)
 | `94.75% <90.90%> (-0.24%)` | :arrow_down: |
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[...n/apache\_beam/typehints/typed\_pipeline\_test\_py3.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL3R5cGVkX3BpcGVsaW5lX3Rlc3RfcHkzLnB5)
 | `90.62% <95.23%> (+0.32%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/transforms/ptransform.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wdHJhbnNmb3JtLnB5)
 | `92.00% <100.00%> (+0.96%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5)
 | `95.44% <100.00%> (ø)` | |
   | 
[sdks/python/apache\_beam/typehints/decorators.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL2RlY29yYXRvcnMucHk=)
 | `83.37% <100.00%> (+0.08%)` | :arrow_up: |
   | 
[...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==)
 | `93.38% <0.00%> (-0.74%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   | ... and [2 
more](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=footer). Last 
update 
[096f695...e462af6](https://codecov.io/gh/apache/beam/pull/9907?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on pull request #12895: [BEAM-9372][BEAM-9980] Switches Flink VR suite to Py36 and makes the version configurable.

2020-09-21 Thread GitBox


tvalentyn commented on pull request #12895:
URL: https://github.com/apache/beam/pull/12895#issuecomment-696467526


   Run Python Flink ValidatesRunner



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] aaltay commented on a change in pull request #9907: [BEAM-4091] Pass type hints in ptransform_fn

2020-09-21 Thread GitBox


aaltay commented on a change in pull request #9907:
URL: https://github.com/apache/beam/pull/9907#discussion_r492447574



##
File path: sdks/python/apache_beam/options/pipeline_options.py
##
@@ -476,6 +499,26 @@ def _add_argparse_args(cls, parser):
 'time. NOTE: only supported with portable runners '
 '(including the DirectRunner)')
 
+  def validate(self, unused_validator):
+errors = []
+if beam.version.__version__ >= '3':

Review comment:
   I think when/if we reach to Beam version 3, we can remove this warning, 
and all and etc and and enable by default without an option to disable. So 
maybe we do not need this?

##
File path: CHANGES.md
##
@@ -73,6 +73,10 @@
 * In Interactive Beam, ib.show() and ib.collect() now have "n" and "duration" 
as parameters. These mean read only up to "n" elements and up to "duration" 
seconds of data read from the recording 
([BEAM-10603](https://issues.apache.org/jira/browse/BEAM-10603)).
 * Initial preview of 
[Dataframes](https://s.apache.org/simpler-python-pipelines-2020#slide=id.g905ac9257b_1_21)
 support.
 See also example at apache_beam/examples/wordcount_dataframe.py
+* Fixed support for type hints on `@ptransform_fn` decorators in the Python 
SDK.

Review comment:
   Maybe warn that the default might change in a 2.x version later?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kennknowles commented on pull request #12877: Upgrade GCS IO to 2.1.5 and Google OAuth to 1.31.0

2020-09-21 Thread GitBox


kennknowles commented on pull request #12877:
URL: https://github.com/apache/beam/pull/12877#issuecomment-696412239


   Ran `./gradlew -Ppublishing 
-PjavaLinkageArtifactIds=beam-sdks-java-io-google-cloud-platform,beam-runners-google-cloud-dataflow-java
 :checkJavaLinkage > ~/tmp/linkage-after` against `master` and this PR. Both 
fail, but here is the diff, with no additional errors:
   
   ```
   13094,13096c13115,13117
   < com.google.cloud.hadoop.gcsio.cooplock.CoopLockOperationDao 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   < com.google.cloud.hadoop.gcsio.cooplock.CoopLockRecordsDao 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   < com.google.cloud.hadoop.gcsio.testing.InMemoryObjectEntry 
(com.google.cloud.bigdataoss:gcsio:2.1.3)
   ---
   > com.google.cloud.hadoop.gcsio.cooplock.CoopLockOperationDao 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   > com.google.cloud.hadoop.gcsio.cooplock.CoopLockRecordsDao 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   > com.google.cloud.hadoop.gcsio.testing.InMemoryObjectEntry 
(com.google.cloud.bigdataoss:gcsio:2.1.5)
   13100c13121
   <   unselected: 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.3 (compile) / 
com.google.guava:guava:29.0-jre (compile)
   ---
   >   unselected: 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.5 (compile) / 
com.google.guava:guava:29.0-jre (compile)
   13122c13143
   <   and 696 other dependency paths.
   ---
   >   and 714 other dependency paths.
   13153,13154c13174,13175
   < com.google.cloud.bigdataoss:gcsio:2.1.3 is at:
   <   org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.3 (compile)
   ---
   > com.google.cloud.bigdataoss:gcsio:2.1.5 is at:
   >   org.apache.beam:beam-sdks-java-io-google-cloud-platform:2.25.0-SNAPSHOT 
(compile) / 
org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:2.25.0-SNAPSHOT
 (compile) / com.google.cloud.bigdataoss:gcsio:2.1.5 (compile)
   13163c13184
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb commented on pull request #12873: Remove experimental declarations from fileio.

2020-09-21 Thread GitBox


robertwb commented on pull request #12873:
URL: https://github.com/apache/beam/pull/12873#issuecomment-696399851


   Sure, we can keep it experimental a bit longer. I would like to fix 
https://github.com/apache/beam/pull/12900 though.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12889:
URL: https://github.com/apache/beam/pull/12889#issuecomment-696446965







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kileys commented on pull request #12899: [BEAM-8024] Add JPMS E2E test

2020-09-21 Thread GitBox


kileys commented on pull request #12899:
URL: https://github.com/apache/beam/pull/12899#issuecomment-696393175


   Run Java PostCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit merged pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit merged pull request #12896:
URL: https://github.com/apache/beam/pull/12896


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] kennknowles merged pull request #12875: Clarify Beam's use of semantic versioning

2020-09-21 Thread GitBox


kennknowles merged pull request #12875:
URL: https://github.com/apache/beam/pull/12875


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on pull request #12893: [BEAM-6103] Enable BQ streaming insert timeouts

2020-09-21 Thread GitBox


udim commented on pull request #12893:
URL: https://github.com/apache/beam/pull/12893#issuecomment-696321660


   R: @chamikaramj 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] y1chi commented on a change in pull request #12727: [BEAM-10844] Add experiment option prebuild_sdk_container to prebuild python sdk container with dependencies.

2020-09-21 Thread GitBox


y1chi commented on a change in pull request #12727:
URL: https://github.com/apache/beam/pull/12727#discussion_r492220801



##
File path: sdks/python/apache_beam/transforms/environments.py
##
@@ -252,6 +254,14 @@ def from_runner_api_parameter(payload, capabilities, 
artifacts, context):
   @classmethod
   def from_options(cls, options):
 # type: (PipelineOptions) -> DockerEnvironment
+if options.view_as(DebugOptions).lookup_experiment(
+'prebuild_sdk_container'):
+  prebuilt_container_image = SdkContainerBuilder.build_container_imge(

Review comment:
   Changed to invoking from_options in dataflow runner.

##
File path: sdks/python/apache_beam/runners/portability/sdk_container_builder.py
##
@@ -0,0 +1,275 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""SdkContainerBuilder builds the portable SDK container with dependencies.
+
+It copies the right boot dependencies, namely: apache beam sdk, python packages
+from requirements.txt, python packages from extra_packages.txt, workflow
+tarball, into the latest public python sdk container image, and run the
+dependencies installation in advance with the boot program in setup only mode
+to build the new image.
+"""
+
+from __future__ import absolute_import
+
+import json
+import logging
+import os
+import shutil
+import subprocess
+import sys
+import tarfile
+import tempfile
+import time
+import uuid
+
+from google.protobuf.duration_pb2 import Duration
+from google.protobuf.json_format import MessageToJson
+
+from apache_beam.internal.gcp.auth import get_service_credentials
+from apache_beam.internal.http_client import get_new_http
+from apache_beam.io.gcp.internal.clients import storage
+from apache_beam.options.pipeline_options import DebugOptions
+from apache_beam.options.pipeline_options import GoogleCloudOptions
+from apache_beam.options.pipeline_options import PipelineOptions  # pylint: 
disable=unused-import
+from apache_beam.portability import common_urns
+from apache_beam.portability.api import beam_runner_api_pb2
+from apache_beam.runners.portability.stager import Stager
+
+ARTIFACTS_CONTAINER_DIR = '/opt/apache/beam/artifacts'
+ARTIFACTS_MANIFEST_FILE = 'artifacts_info.json'
+SDK_CONTAINER_ENTRYPOINT = '/opt/apache/beam/boot'
+DOCKERFILE_TEMPLATE = (
+"""FROM apache/beam_python{major}.{minor}_sdk:latest
+RUN mkdir -p {workdir}
+COPY ./* {workdir}/
+RUN {entrypoint} --setup_only --artifacts {workdir}/{manifest_file}
+""")
+
+SOURCE_FOLDER = 'source'
+_LOGGER = logging.getLogger(__name__)
+
+
+class SdkContainerBuilder(object):
+  def __init__(self, options):
+self._options = options
+self._temp_src_dir = tempfile.mkdtemp()
+self._docker_registry_push_url = self._options.view_as(
+DebugOptions).lookup_experiment('docker_registry_push_url')
+
+  def build(self):
+container_id = str(uuid.uuid4())
+container_tag = os.path.join(
+self._docker_registry_push_url or '',
+'beam_python_prebuilt_sdk:%s' % container_id)
+self.prepare_dependencies()
+self.invoke_docker_build_and_push(container_id, container_tag)
+
+return container_tag
+
+  def prepare_dependencies(self):
+tmp = tempfile.mkdtemp()
+resources = Stager.create_job_resources(self._options, tmp)
+# make a copy of the staged artifacts into the temp source folder.
+for path, name in resources:
+  shutil.copyfile(path, os.path.join(self._temp_src_dir, name))
+with open(os.path.join(self._temp_src_dir, 'Dockerfile'), 'w') as file:
+  file.write(
+  DOCKERFILE_TEMPLATE.format(
+  major=sys.version_info[0],
+  minor=sys.version_info[1],
+  workdir=ARTIFACTS_CONTAINER_DIR,
+  manifest_file=ARTIFACTS_MANIFEST_FILE,
+  entrypoint=SDK_CONTAINER_ENTRYPOINT))
+self.generate_artifacts_manifests_json_file(resources, self._temp_src_dir)
+
+  def invoke_docker_build_and_push(self, container_id, container_tag):
+raise NotImplementedError
+
+  @staticmethod
+  def generate_artifacts_manifests_json_file(resources, temp_dir):
+infos = []
+for _, name in resources:
+  info = beam_runner_api_pb2.ArtifactInformation(
+  

[GitHub] [beam] ihji commented on pull request #12822: [BEAM-10880] Log error counts to debug BigQuery streaming insert requ…

2020-09-21 Thread GitBox


ihji commented on pull request #12822:
URL: https://github.com/apache/beam/pull/12822#issuecomment-696326864


   @ajamato PTAL.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb merged pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


robertwb merged pull request #12889:
URL: https://github.com/apache/beam/pull/12889


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib merged pull request #12876: [BEAM-10931] Remove obsolete ZetaSQL precommit Gradle task.

2020-09-21 Thread GitBox


ibzib merged pull request #12876:
URL: https://github.com/apache/beam/pull/12876


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12880: [BEAM-10933] Adjust GBK and Flatten types before creating the pipeline proto

2020-09-21 Thread GitBox


chamikaramj commented on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-696230350







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on a change in pull request #12830: [BEAM-10716] TestPubsub/TestPubsubSignal clean up subscriptions

2020-09-21 Thread GitBox


TheNeuralBit commented on a change in pull request #12830:
URL: https://github.com/apache/beam/pull/12830#discussion_r492339081



##
File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/TestPubsubSignal.java
##
@@ -239,6 +245,12 @@ public void waitForSuccess(Duration duration) throws 
IOException {
 
 String result = pollForResultForDuration(resultSubscriptionPath, duration);
 
+try {
+  pubsub.deleteSubscription(resultSubscriptionPath);
+} catch (IOException e) {
+  LOG.warn(String.format("Leaked PubSub subscription '%s'", 
resultSubscriptionPath));

Review comment:
   Good idea, done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12721: [BEAM-10871] Add deidentify for FhirIO connector

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12721:
URL: https://github.com/apache/beam/pull/12721#issuecomment-696407136


   I think this broke Fhir integration tests in PostCommit: 
https://ci-beam.apache.org/job/beam_PostCommit_Java/6624/



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on a change in pull request #12882: [BEAM-10814] DataframeTransform outputs elements

2020-09-21 Thread GitBox


TheNeuralBit commented on a change in pull request #12882:
URL: https://github.com/apache/beam/pull/12882#discussion_r492219317



##
File path: sdks/python/apache_beam/dataframe/schemas.py
##
@@ -55,17 +159,149 @@ def expand(self, pcoll):
 lambda batch: pd.DataFrame.from_records(batch, columns=columns))
 
 
-def _make_empty_series(name, typ):
-  try:
-return pd.Series(name=name, dtype=typ)
-  except TypeError:
-raise TypeError("Unable to convert type '%s' for field '%s'" % (name, typ))
+def _make_proxy_series(name, typehint):
+  # Default to np.object. This is lossy, we won't be able to recover the type
+  # at the output.
+  dtype = BEAM_TO_PANDAS.get(typehint, np.object)
+
+  return pd.Series(name=name, dtype=dtype)
 
 
 def generate_proxy(element_type):
   # type: (type) -> pd.DataFrame
-  return pd.DataFrame({
-  name: _make_empty_series(name, typ)
-  for name,
-  typ in named_fields_from_element_type(element_type)
-  })
+
+  """ Generate a proxy pandas object for the given PCollection element_type.
+
+  Currently only supports generating a DataFrame proxy from a schema-aware
+  PCollection."""
+  fields = named_fields_from_element_type(element_type)
+  return pd.DataFrame(
+  {name: _make_proxy_series(name, typehint)
+   for name, typehint in fields},
+  columns=[name for name, _ in fields])
+
+
+def element_type_from_proxy(proxy):
+  # type: (pd.DataFrame) -> type
+
+  """ Generate an element_type for an element-wise PCollection from a proxy
+  pandas object. Currently only supports converting the element_type for
+  a schema-aware PCollection to a proxy DataFrame.
+
+  Currently only supports generating a DataFrame proxy from a schema-aware
+  PCollection."""
+  indices = [] if proxy.index.names == (None, ) else [

Review comment:
   I thought the MultiIndex or named case was important since otherwise 
we'll drop the grouped column(s) when unbatching the result of a grouped 
aggregation.
   
   It raise some tricky issues though:
   - Index names are not required to be unique.
   - It looks like my assumption that all MultiIndexes are named is wrong. It's 
possible to create a `MultiIndex` with `names=[None, None, 'foo']`, which would 
break this badly.
   - Type information is not necessarily preserved in indexes. e.g. Int64Index 
doesn't support nulls like Series with Int64Dtype does. if one is added it's 
converted to a Float64Index with nans.
   
   Maybe including the index shouldn't be the default until we have a better 
handle on these edge cases.

##
File path: sdks/python/apache_beam/dataframe/schemas.py
##
@@ -55,17 +159,149 @@ def expand(self, pcoll):
 lambda batch: pd.DataFrame.from_records(batch, columns=columns))
 
 
-def _make_empty_series(name, typ):
-  try:
-return pd.Series(name=name, dtype=typ)
-  except TypeError:
-raise TypeError("Unable to convert type '%s' for field '%s'" % (name, typ))
+def _make_proxy_series(name, typehint):
+  # Default to np.object. This is lossy, we won't be able to recover the type
+  # at the output.
+  dtype = BEAM_TO_PANDAS.get(typehint, np.object)
+
+  return pd.Series(name=name, dtype=dtype)
 
 
 def generate_proxy(element_type):
   # type: (type) -> pd.DataFrame
-  return pd.DataFrame({
-  name: _make_empty_series(name, typ)
-  for name,
-  typ in named_fields_from_element_type(element_type)
-  })
+
+  """ Generate a proxy pandas object for the given PCollection element_type.
+
+  Currently only supports generating a DataFrame proxy from a schema-aware
+  PCollection."""
+  fields = named_fields_from_element_type(element_type)
+  return pd.DataFrame(
+  {name: _make_proxy_series(name, typehint)
+   for name, typehint in fields},
+  columns=[name for name, _ in fields])
+
+
+def element_type_from_proxy(proxy):
+  # type: (pd.DataFrame) -> type
+
+  """ Generate an element_type for an element-wise PCollection from a proxy
+  pandas object. Currently only supports converting the element_type for
+  a schema-aware PCollection to a proxy DataFrame.
+
+  Currently only supports generating a DataFrame proxy from a schema-aware
+  PCollection."""
+  indices = [] if proxy.index.names == (None, ) else [

Review comment:
   We could log a warning if there's a named index in the result and 
`include_indexes` is `False`

##
File path: sdks/python/apache_beam/dataframe/schemas.py
##
@@ -55,17 +159,149 @@ def expand(self, pcoll):
 lambda batch: pd.DataFrame.from_records(batch, columns=columns))
 
 
-def _make_empty_series(name, typ):
-  try:
-return pd.Series(name=name, dtype=typ)
-  except TypeError:
-raise TypeError("Unable to convert type '%s' for field '%s'" % (name, typ))
+def _make_proxy_series(name, typehint):
+  # Default to np.object. This is lossy, we won't be able to recover the type
+  # at the output.
+  dtype = BEAM_TO_PANDAS.get(typehint, np.object)
+
+  return pd.Series(name=name, dtype=dtype)
 

[GitHub] [beam] aromanenko-dev commented on pull request #11459: [BEAM-2546] Add InfluxDbIO

2020-09-21 Thread GitBox


aromanenko-dev commented on pull request #11459:
URL: https://github.com/apache/beam/pull/11459#issuecomment-696273839


   The only minor thing that is missing -  update of `CHANGES.md` about this 
new IO.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on pull request #12367: [BEAM-10564] Support more Avro field name formats when mapping to Jav…

2020-09-21 Thread GitBox


lukecwik commented on pull request #12367:
URL: https://github.com/apache/beam/pull/12367#issuecomment-696223362


   We have a merge on green policy so I have been rerunning the tests to get 
past a known issue.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on a change in pull request #12611: [BEAM-10139][BEAM-10140] Add cross-language support for Java SpannerIO with python wrapper

2020-09-21 Thread GitBox


chamikaramj commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r492364337



##
File path: sdks/python/apache_beam/io/gcp/spanner.py
##
@@ -0,0 +1,504 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""PTransforms for supporting Spanner in Python pipelines.
+
+  These transforms are currently supported by Beam portable
+  Flink and Spark runners.
+
+  **Setup**
+
+  Transforms provided in this module are cross-language transforms
+  implemented in the Beam Java SDK. During the pipeline construction, Python 
SDK
+  will connect to a Java expansion service to expand these transforms.
+  To facilitate this, a small amount of setup is needed before using these
+  transforms in a Beam Python pipeline.
+
+  There are several ways to setup cross-language Spanner transforms.
+
+  * Option 1: use the default expansion service
+  * Option 2: specify a custom expansion service
+
+  See below for details regarding each of these options.
+
+  *Option 1: Use the default expansion service*
+
+  This is the recommended and easiest setup option for using Python Spanner
+  transforms. This option is only available for Beam 2.25.0 and later.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Install Java runtime in the computer from where the pipeline is constructed
+and make sure that 'java' command is available.
+
+  In this option, Python SDK will either download (for released Beam version) 
or
+  build (when running from a Beam Git clone) a expansion service jar and use
+  that to expand transforms. Currently Spanner transforms use the
+  'beam-sdks-java-io-google-cloud-platform-expansion-service' jar for this
+  purpose.
+
+  *Option 2: specify a custom expansion service*
+
+  In this option, you startup your own expansion service and provide that as
+  a parameter when using the transforms provided in this module.
+
+  This option requires following pre-requisites before running the Beam
+  pipeline.
+
+  * Startup your own expansion service.
+  * Update your pipeline to provide the expansion service address when
+initiating Spanner transforms provided in this module.
+
+  Flink Users can use the built-in Expansion Service of the Flink Runner's
+  Job Server. If you start Flink's Job Server, the expansion service will be
+  started on port 8097. For a different address, please set the
+  expansion_service parameter.
+
+  **More information**
+
+  For more information regarding cross-language transforms see:
+  - https://beam.apache.org/roadmap/portability/
+
+  For more information specific to Flink runner see:
+  - https://beam.apache.org/documentation/runners/flink/
+"""
+
+# pytype: skip-file
+
+from __future__ import absolute_import
+
+import typing
+import uuid
+from typing import List
+from typing import NamedTuple
+from typing import Optional
+
+from past.builtins import unicode
+
+from apache_beam import coders
+from apache_beam.transforms.external import BeamJarExpansionService
+from apache_beam.transforms.external import ExternalTransform
+from apache_beam.transforms.external import NamedTupleBasedPayloadBuilder
+from apache_beam.typehints.schemas import named_tuple_to_schema
+
+__all__ = [
+'WriteToSpanner',
+'ReadFromSpanner',
+'MutationCreator',
+'TimestampBoundMode',
+'TimeUnit',
+]
+
+
+def default_io_expansion_service():
+  return BeamJarExpansionService(
+  'sdks:java:io:google-cloud-platform:expansion-service:shadowJar')
+
+
+WriteToSpannerSchema = typing.NamedTuple(
+'WriteToSpannerSchema',
+[
+('instance_id', unicode),
+('database_id', unicode),
+('project_id', Optional[unicode]),
+('batch_size_bytes', Optional[int]),
+('max_num_mutations', Optional[int]),
+('max_num_rows', Optional[int]),
+('grouping_factor', Optional[int]),
+('host', Optional[unicode]),
+('emulator_host', Optional[unicode]),
+('commit_deadline', Optional[int]),
+('max_cumulative_backoff', Optional[int]),
+],
+)
+
+
+class WriteToSpanner(ExternalTransform):

Review comment:
   Probably it makes sense to converge into one implementation. I'd 

[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492474752



##
File path: 
sdks/java/io/contextual-text-io/src/main/java/org/apache/beam/sdk/io/contextualtextio/RecordWithMetadata.java
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.contextualtextio;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.schemas.AutoValueSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
+
+/**
+ * Helper Class based on {@link AutoValueSchema}, it provides Metadata 
associated with each Record
+ * when reading from file(s) using {@link ContextualTextIO}.
+ *
+ * Fields:
+ *
+ * 
+ *   recordOffset: The offset of a record (the byte at which the record 
begins) in a file. This
+ *   information can be useful if you wish to reconstruct the file. {@link
+ *   RecordWithMetadata#getRecordOffset()}
+ *   recordNum: The ordinal number of the record in its file. {@link
+ *   RecordWithMetadata#getRecordNum()}
+ *   recordValue: The value / contents of the record {@link 
RecordWithMetadata#getValue()}
+ *   rangeOffset: The starting offset of the range (split), which 
contained the record, when the
+ *   record was read. {@link RecordWithMetadata#getRangeOffset()}
+ *   recordNumInOffset: The record number relative to the Range. (line 
number within the range)
+ *   {@link RecordWithMetadata#getRecordNumInOffset()}
+ *   fileName: Name of the file to which the record belongs (this is the 
full filename,
+ *   eg:path/to/file.txt) {@link RecordWithMetadata#getFileName()}
+ * 
+ */
+@Experimental(Experimental.Kind.SCHEMAS)
+@DefaultSchema(AutoValueSchema.class)
+@AutoValue
+public abstract class RecordWithMetadata {
+  public abstract long getRecordOffset();
+
+  public abstract long getRecordNum();
+
+  public abstract String getValue();
+
+  public abstract long getRangeOffset();
+
+  public abstract long getRecordNumInOffset();
+
+  public abstract Builder toBuilder();
+
+  public abstract String getFileName();
+
+  public static Builder newBuilder() {
+return new AutoValue_RecordWithMetadata.Builder();
+  }
+
+  @AutoValue.Builder
+  public abstract static class Builder {
+public abstract Builder setRecordNum(long lineNum);
+
+public abstract Builder setRecordOffset(long recordOffset);
+
+public abstract Builder setValue(String Value);

Review comment:
   Done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492474814



##
File path: 
sdks/java/io/contextual-text-io/src/main/java/org/apache/beam/sdk/io/contextualtextio/ContextualTextIO.java
##
@@ -0,0 +1,631 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.contextualtextio;
+
+import static org.apache.beam.sdk.io.FileIO.ReadMatches.DirectoryTreatment;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.auto.value.AutoValue;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.SortedMap;
+import java.util.TreeMap;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.CompressedSource;
+import org.apache.beam.sdk.io.Compression;
+import org.apache.beam.sdk.io.FileBasedSource;
+import org.apache.beam.sdk.io.FileIO;
+import org.apache.beam.sdk.io.FileIO.MatchConfiguration;
+import org.apache.beam.sdk.io.ReadAllViaFileBasedSource;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.io.fs.EmptyMatchTreatment;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.View;
+import org.apache.beam.sdk.transforms.Watch.Growth.TerminationCondition;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link PTransform}s that read text files and collect contextual information 
of the elements in
+ * the input.
+ *
+ * Use {@link TextIO} when not reading file with Multiline Records or 
additional metadata is not
+ * required.
+ *
+ * Reading from text files
+ *
+ * To read a {@link PCollection} from one or more text files, use {@code
+ * ContextualTextIO.read()}. To instantiate a transform use {@link
+ * ContextualTextIO.Read#from(String)} and specify the path of the file(s) to 
be read.
+ * Alternatively, if the filenames to be read are themselves in a {@link 
PCollection} you can use
+ * {@link FileIO} to match them and {@link ContextualTextIO#readFiles()} to 
read them.
+ *
+ * {@link #read} returns a {@link PCollection} of {@link RecordWithMetadata 
RecordWithMetadata},
+ * each corresponding to one line of an input UTF-8 text file (split into 
lines delimited by '\n',
+ * '\r', '\r\n', or specified delimiter see {@link 
ContextualTextIO.Read#withDelimiter})
+ *
+ * Filepattern expansion and watching
+ *
+ * By default, the filepatterns are expanded only once. The combination of 
{@link
+ * FileIO.Match#continuously(Duration, TerminationCondition)} and {@link 
#readFiles()} allow
+ * streaming of new files matching the filepattern(s).
+ *
+ * By default, {@link #read} prohibits filepatterns that match no files, 
and {@link #readFiles()}
+ * allows them in case the filepattern contains a glob wildcard character. Use 
{@link
+ * ContextualTextIO.Read#withEmptyMatchTreatment} or {@link
+ * FileIO.Match#withEmptyMatchTreatment(EmptyMatchTreatment)} plus {@link 
#readFiles()} to configure
+ * this behavior.
+ *
+ * Example 1: reading a file or filepattern.
+ *
+ * {@code
+ * Pipeline p = ...;

[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492475231



##
File path: sdks/java/io/contextual-text-io/build.gradle
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature(
+automaticModuleName: 'org.apache.beam.sdk.io.contextual-text-io',
+enableChecker: false,
+ignoreRawtypeErrors: true)
+
+description = "Apache Beam :: SDKs :: Java :: Contextual-Text-IO"
+ext.summary = "Context-aware Text IO."
+
+dependencies {
+
+compile library.java.vendored_guava_26_0_jre
+compile library.java.protobuf_java
+compile project(path: ":sdks:java:core", configuration: "shadow")
+testCompile project(path: ":sdks:java:core", configuration: "shadowTest")
+
+testCompile library.java.guava_testlib
+testCompile library.java.junit
+testCompile library.java.hamcrest_core
+testRuntimeOnly library.java.slf4j_jdk14
+testCompile project(path: ":runners:direct-java", configuration: "shadow")
+
+}

Review comment:
   Done

##
File path: sdks/java/io/contextual-text-io/build.gradle
##
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature(
+automaticModuleName: 'org.apache.beam.sdk.io.contextual-text-io',
+enableChecker: false,
+ignoreRawtypeErrors: true)
+
+description = "Apache Beam :: SDKs :: Java :: Contextual-Text-IO"
+ext.summary = "Context-aware Text IO."
+
+dependencies {
+

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492474704



##
File path: 
sdks/java/io/contextual-text-io/src/main/java/org/apache/beam/sdk/io/contextualtextio/ContextualTextIOSource.java
##
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.contextualtextio;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.nio.channels.ReadableByteChannel;
+import java.nio.channels.SeekableByteChannel;
+import java.util.NoSuchElementException;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.io.FileBasedSource;
+import org.apache.beam.sdk.io.fs.EmptyMatchTreatment;
+import org.apache.beam.sdk.io.fs.MatchResult;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.schemas.SchemaRegistry;
+import org.apache.beam.vendor.grpc.v1p26p0.com.google.protobuf.ByteString;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation detail of {@link ContextualTextIO.Read}.
+ *
+ * A {@link FileBasedSource} which can decode records delimited by newline 
characters.
+ *
+ * This source splits the data into records using {@code UTF-8} {@code \n}, 
{@code \r}, or {@code
+ * \r\n} as the delimiter. This source is not strict and supports decoding the 
last record even if
+ * it is not delimited. Finally, no records are decoded if the stream is empty.
+ *
+ * This source supports reading from any arbitrary byte position within the 
stream. If the
+ * starting position is not {@code 0}, then bytes are skipped until the first 
delimiter is found
+ * representing the beginning of the first record to be decoded.
+ */
+@VisibleForTesting
+class ContextualTextIOSource extends FileBasedSource {
+  byte[] delimiter;
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(ContextualTextIOSource.class);
+
+  // Used to Override isSplittable
+  private boolean hasMultilineCSVRecords;
+
+  @Override
+  protected boolean isSplittable() throws Exception {
+if (hasMultilineCSVRecords) {
+  // When Having Multiline CSV Records,
+  // Splitting the file may cause a split to be within a record,
+  // Disabling split prevents this from happening
+  return false;
+}
+return super.isSplittable();
+  }
+
+  ContextualTextIOSource(
+  ValueProvider fileSpec,
+  EmptyMatchTreatment emptyMatchTreatment,
+  byte[] delimiter,
+  boolean hasMultilineCSVRecords) {
+super(fileSpec, emptyMatchTreatment, 1L);
+this.delimiter = delimiter;
+this.hasMultilineCSVRecords = hasMultilineCSVRecords;
+  }
+
+  private ContextualTextIOSource(
+  MatchResult.Metadata metadata,
+  long start,
+  long end,
+  byte[] delimiter,
+  boolean hasMultilineCSVRecords) {
+super(metadata, 1L, start, end);
+this.delimiter = delimiter;
+this.hasMultilineCSVRecords = hasMultilineCSVRecords;
+  }
+
+  @Override
+  protected FileBasedSource createForSubrangeOfFile(
+  MatchResult.Metadata metadata, long start, long end) {
+return new ContextualTextIOSource(metadata, start, end, delimiter, 
hasMultilineCSVRecords);
+  }
+
+  @Override
+  protected FileBasedReader 
createSingleFileReader(PipelineOptions options) {
+return new MultiLineTextBasedReader(this, delimiter, 
hasMultilineCSVRecords);
+  }
+
+  @Override
+  public Coder getOutputCoder() {
+SchemaCoder coder = null;
+try {
+  coder = 
SchemaRegistry.createDefault().getSchemaCoder(RecordWithMetadata.class);
+} catch (NoSuchSchemaException e) {
+  LOG.error("No Coder Found for RecordWithMetadata");
+}
+return coder;
+  }
+
+  /**
+   * A {@link FileBasedReader FileBasedReader} which can decode records 
delimited by delimiter
+   * 

[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492474778



##
File path: 
sdks/java/io/contextual-text-io/src/main/java/org/apache/beam/sdk/io/contextualtextio/ContextualTextIO.java
##
@@ -0,0 +1,631 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.contextualtextio;
+
+import static org.apache.beam.sdk.io.FileIO.ReadMatches.DirectoryTreatment;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+import static 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.auto.value.AutoValue;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.SortedMap;
+import java.util.TreeMap;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.io.CompressedSource;
+import org.apache.beam.sdk.io.Compression;
+import org.apache.beam.sdk.io.FileBasedSource;
+import org.apache.beam.sdk.io.FileIO;
+import org.apache.beam.sdk.io.FileIO.MatchConfiguration;
+import org.apache.beam.sdk.io.ReadAllViaFileBasedSource;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.io.fs.EmptyMatchTreatment;
+import org.apache.beam.sdk.options.ValueProvider;
+import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider;
+import org.apache.beam.sdk.schemas.NoSuchSchemaException;
+import org.apache.beam.sdk.schemas.SchemaCoder;
+import org.apache.beam.sdk.transforms.Count;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.transforms.View;
+import org.apache.beam.sdk.transforms.Watch.Growth.TerminationCondition;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PBegin;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+import org.checkerframework.checker.nullness.qual.Nullable;
+import org.joda.time.Duration;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * {@link PTransform}s that read text files and collect contextual information 
of the elements in
+ * the input.
+ *
+ * Use {@link TextIO} when not reading file with Multiline Records or 
additional metadata is not
+ * required.
+ *
+ * Reading from text files
+ *
+ * To read a {@link PCollection} from one or more text files, use {@code
+ * ContextualTextIO.read()}. To instantiate a transform use {@link
+ * ContextualTextIO.Read#from(String)} and specify the path of the file(s) to 
be read.
+ * Alternatively, if the filenames to be read are themselves in a {@link 
PCollection} you can use
+ * {@link FileIO} to match them and {@link ContextualTextIO#readFiles()} to 
read them.
+ *
+ * {@link #read} returns a {@link PCollection} of {@link RecordWithMetadata 
RecordWithMetadata},
+ * each corresponding to one line of an input UTF-8 text file (split into 
lines delimited by '\n',
+ * '\r', '\r\n', or specified delimiter see {@link 
ContextualTextIO.Read#withDelimiter})
+ *
+ * Filepattern expansion and watching
+ *
+ * By default, the filepatterns are expanded only once. The combination of 
{@link
+ * FileIO.Match#continuously(Duration, TerminationCondition)} and {@link 
#readFiles()} allow
+ * streaming of new files matching the filepattern(s).
+ *
+ * By default, {@link #read} prohibits filepatterns that match no files, 
and {@link #readFiles()}
+ * allows them in case the filepattern contains a glob wildcard character. Use 
{@link
+ * ContextualTextIO.Read#withEmptyMatchTreatment} or {@link
+ * FileIO.Match#withEmptyMatchTreatment(EmptyMatchTreatment)} plus {@link 
#readFiles()} to configure
+ * this behavior.
+ *
+ * Example 1: reading a file or filepattern.
+ *
+ * {@code
+ * Pipeline p = ...;

[GitHub] [beam] robertwb commented on a change in pull request #12884: [BEAM-7746] Add type checking to coders

2020-09-21 Thread GitBox


robertwb commented on a change in pull request #12884:
URL: https://github.com/apache/beam/pull/12884#discussion_r492465887



##
File path: sdks/python/apache_beam/coders/coder_impl.py
##
@@ -725,7 +728,7 @@ def __init__(self, key_coder_impl, window_coder_impl):
 self._tag_coder_impl = StrUtf8Coder().get_impl()
 
   def encode_to_stream(self, value, out, nested):
-# type: (dict, create_OutputStream, bool) -> None
+# type: (userstate.Timer, create_OutputStream, bool) -> None

Review comment:
   Hmm... these probably were right at the time. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12645:
URL: https://github.com/apache/beam/pull/12645#issuecomment-688630083


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=h1) Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@2b2b8e7`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12645/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12645?src=pr=tree)
   
   ```diff
   @@Coverage Diff@@
   ## master   #12645   +/-   ##
   =
 Coverage  ?   34.47%   
   =
 Files ?  684   
 Lines ?81483   
 Branches  ? 9180   
   =
 Hits  ?28090   
 Misses?52972   
 Partials  ?  421   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12645?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[io/textio\_test.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-aW8vdGV4dGlvX3Rlc3QucHk=)
 | `16.96% <0.00%> (ø)` | |
   | 
[io/gcp/bigquery\_io\_read\_it\_test.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-aW8vZ2NwL2JpZ3F1ZXJ5X2lvX3JlYWRfaXRfdGVzdC5weQ==)
 | `75.00% <0.00%> (ø)` | |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-cnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `34.90% <0.00%> (ø)` | |
   | 
[examples/avro\_bitcoin.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-ZXhhbXBsZXMvYXZyb19iaXRjb2luLnB5)
 | `0.00% <0.00%> (ø)` | |
   | 
[.../snippets/transforms/elementwise/partition\_test.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-ZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS9wYXJ0aXRpb25fdGVzdC5weQ==)
 | `46.66% <0.00%> (ø)` | |
   | 
[io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-aW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=)
 | `23.36% <0.00%> (ø)` | |
   | 
[ml/gcp/visionml.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-bWwvZ2NwL3Zpc2lvbm1sLnB5)
 | `47.61% <0.00%> (ø)` | |
   | 
[runners/interactive/pipeline\_analyzer.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-cnVubmVycy9pbnRlcmFjdGl2ZS9waXBlbGluZV9hbmFseXplci5weQ==)
 | `20.00% <0.00%> (ø)` | |
   | 
[transforms/trigger.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-dHJhbnNmb3Jtcy90cmlnZ2VyLnB5)
 | `37.84% <0.00%> (ø)` | |
   | 
[runners/common\_test.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-cnVubmVycy9jb21tb25fdGVzdC5weQ==)
 | `24.29% <0.00%> (ø)` | |
   | ... and [674 
more](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=footer). Last 
update 
[2b2b8e7...3bed6b7](https://codecov.io/gh/apache/beam/pull/12645?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb commented on pull request #12884: [BEAM-7746] Add type checking to coders

2020-09-21 Thread GitBox


robertwb commented on pull request #12884:
URL: https://github.com/apache/beam/pull/12884#issuecomment-696504555


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb commented on a change in pull request #12881: [BEAM-7746] Get mypy passing on runners.worker

2020-09-21 Thread GitBox


robertwb commented on a change in pull request #12881:
URL: https://github.com/apache/beam/pull/12881#discussion_r492466948



##
File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
##
@@ -119,10 +125,10 @@ class RunnerIOOperation(operations.Operation):
 
   def __init__(self,
name_context,  # type: Union[str, common.NameContext]
-   step_name,
+   step_name,  # type: Any

Review comment:
   Is this not str or Optional[str]?

##
File path: sdks/python/apache_beam/runners/worker/data_plane.py
##
@@ -331,6 +369,7 @@ def add_to_inverse_output(timer):
 is_last=False))
 
 def close_stream(timer):
+  # type: (bytes) -> None

Review comment:
   I wonder if we should call this encoded_timer[s]? 

##
File path: sdks/python/apache_beam/coders/coder_impl.py
##
@@ -725,7 +726,7 @@ def __init__(self, key_coder_impl, window_coder_impl):
 self._tag_coder_impl = StrUtf8Coder().get_impl()
 
   def encode_to_stream(self, value, out, nested):
-# type: (dict, create_OutputStream, bool) -> None
+# type: (userstate.Timer, create_OutputStream, bool) -> None

Review comment:
   I think it used to be correct back when timers were being implemented. 
This code changed a couple of months ago too. 

##
File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
##
@@ -1070,23 +1078,24 @@ def delayed_bundle_application(self,
 return beam_fn_api_pb2.DelayedBundleApplication(
 requested_time_delay=proto_deferred_watermark,
 application=self.construct_bundle_application(
-op, current_watermark, element_and_restriction))
+op.input_info, current_watermark, element_and_restriction))
 
   def bundle_application(self,
  op,  # type: operations.DoOperation
  primary  # type: SplitResultPrimary
 ):
 # type: (...) -> beam_fn_api_pb2.BundleApplication
-return self.construct_bundle_application(op, None, primary.primary_value)
+assert op.input_info is not None
+return self.construct_bundle_application(
+op.input_info, None, primary.primary_value)
 
   def construct_bundle_application(self,
-   op,  # type: operations.DoOperation
+   op_input_info,  # type: 
operations.OpInputInfo

Review comment:
   Sounds good to me.

##
File path: sdks/python/apache_beam/runners/worker/data_plane.py
##
@@ -81,7 +85,11 @@
 
 class ClosableOutputStream(OutputStream):
   """A Outputstream for use with CoderImpls that has a close() method."""
-  def __init__(self, close_callback=None):
+  def __init__(
+  self,
+  close_callback=None  # type: Optional[Optional[Callable[[bytes], None]]]

Review comment:
   Why the double Optional (here and elsewhere below)?

##
File path: sdks/python/apache_beam/runners/worker/data_plane.py
##
@@ -218,7 +249,7 @@ class DataChannel(with_metaclass(abc.ABCMeta, object)):  # 
type: ignore[misc]
   @abc.abstractmethod
   def input_elements(self,
  instruction_id,  # type: str
- expected_inputs,  # type: Collection[str]
+ expected_inputs,  # type: Sized

Review comment:
   Don't we call `__contains__` as well? I'd rather keep it more fully 
typed. 

##
File path: sdks/python/apache_beam/runners/worker/bundle_processor.py
##
@@ -947,11 +955,11 @@ def process_bundle(self, instruction_id):
   # (transform_id, timer_family_id).
   data_channels = collections.defaultdict(
   list
-  )  # type: DefaultDict[data_plane.GrpcClientDataChannel, List[str]]
+  )  # type: DefaultDict[data_plane.GrpcClientDataChannel, List[Union[str, 
Tuple[str, str

Review comment:
   Yep, this changed a couple of months ago. It'll be good to finally have 
these type annotations checked. 

##
File path: sdks/python/apache_beam/runners/worker/data_plane.py
##
@@ -243,7 +274,7 @@ def output_stream(
   instruction_id,  # type: str
   transform_id  # type: str
   ):
-# type: (...) -> ClosableOutputStream
+# type: (...) -> SizeBasedBufferingClosableOutputStream

Review comment:
   We're also thinking about adding a time-based one. 
   
   Let's add a no-op maybe_flush method to the baseclass and keep 
ClosableOutputStream everywhere. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] commented on pull request #12884: [BEAM-7746] Add type checking to coders

2020-09-21 Thread GitBox


codecov[bot] commented on pull request #12884:
URL: https://github.com/apache/beam/pull/12884#issuecomment-696510544


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12884?src=pr=h1) Report
   > Merging 
[#12884](https://codecov.io/gh/apache/beam/pull/12884?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.01%`.
   > The diff coverage is `83.33%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12884/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12884?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12884  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.02% 
   ==
 Files 452  453   +1 
 Lines   5401654058  +42 
   ==
   + Hits4448144509  +28 
   - Misses   9535 9549  +14 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12884?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/coders/coder\_impl.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vY29kZXJzL2NvZGVyX2ltcGwucHk=)
 | `95.13% <75.00%> (-0.12%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/coders/coders.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vY29kZXJzL2NvZGVycy5weQ==)
 | `85.41% <100.00%> (+0.02%)` | :arrow_up: |
   | 
[.../python/apache\_beam/transforms/periodicsequence.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wZXJpb2RpY3NlcXVlbmNlLnB5)
 | `96.49% <0.00%> (-1.76%)` | :arrow_down: |
   | 
[...dks/python/apache\_beam/runners/pipeline\_context.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9waXBlbGluZV9jb250ZXh0LnB5)
 | `92.80% <0.00%> (-1.01%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.98% <0.00%> (-0.36%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.04% <0.00%> (-0.29%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.45% <0.00%> (-0.14%)` | :arrow_down: |
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <0.00%> (ø)` | |
   | ... and [1 
more](https://codecov.io/gh/apache/beam/pull/12884/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12884?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12884?src=pr=footer). Last 
update 
[067cba8...e2952c5](https://codecov.io/gh/apache/beam/pull/12884?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12889:
URL: https://github.com/apache/beam/pull/12889#issuecomment-696446965


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=h1) Report
   > Merging 
[#12889](https://codecov.io/gh/apache/beam/pull/12889?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12889/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12889   +/-   ##
   ===
 Coverage   82.32%   82.33%   
   ===
 Files 452  453+1 
 Lines   5401654040   +24 
   ===
   + Hits4447144496   +25 
   + Misses   9545 9544-1 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.18%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.58% <0.00%> (+0.13%)` | :arrow_up: |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `80.75% <0.00%> (+0.17%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   | 
[...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==)
 | `89.90% <0.00%> (+1.22%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=footer). Last 
update 
[1b26660...4146e55](https://codecov.io/gh/apache/beam/pull/12889?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] commented on pull request #12762: Ensuring that BigQuery jobs are tagged with the Dataflow step that launches them

2020-09-21 Thread GitBox


codecov[bot] commented on pull request #12762:
URL: https://github.com/apache/beam/pull/12762#issuecomment-696459985


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr=h1) Report
   > Merging 
[#12762](https://codecov.io/gh/apache/beam/pull/12762?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/958e445ae49da6cf5b67f769e520d90fd8aed60d?el=desc)
 will **increase** coverage by `41.69%`.
   > The diff coverage is `44.18%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12762/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12762?src=pr=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master   #12762   +/-   ##
   ===
   + Coverage   40.30%   81.99%   +41.69% 
   ===
 Files 451  459+8 
 Lines   5316854387 +1219 
   ===
   + Hits2142944597+23168 
   + Misses  31739 9790-21949 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12762?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/azure/blobstorageio.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2Vpby5weQ==)
 | `26.95% <26.95%> (ø)` | |
   | 
[sdks/python/apache\_beam/io/filesystems.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmlsZXN5c3RlbXMucHk=)
 | `87.50% <50.00%> (+29.74%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5)
 | `79.78% <56.52%> (+53.01%)` | :arrow_up: |
   | 
[...thon/apache\_beam/io/azure/blobstoragefilesystem.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvYmxvYnN0b3JhZ2VmaWxlc3lzdGVtLnB5)
 | `77.31% <77.31%> (ø)` | |
   | 
[sdks/python/apache\_beam/io/azure/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vYXp1cmUvX19pbml0X18ucHk=)
 | `100.00% <100.00%> (ø)` | |
   | 
[...s/python/apache\_beam/io/gcp/bigquery\_file\_loads.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2ZpbGVfbG9hZHMucHk=)
 | `89.97% <100.00%> (+66.61%)` | :arrow_up: |
   | 
[.../python/apache\_beam/io/gcp/bigquery\_io\_metadata.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5X2lvX21ldGFkYXRhLnB5)
 | `90.62% <100.00%> (+47.14%)` | :arrow_up: |
   | 
[setup.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2V0dXAucHk=)
 | `0.00% <0.00%> (ø)` | |
   | 
[sdks/python/apache\_beam/portability/python\_urns.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvcHl0aG9uX3VybnMucHk=)
 | `100.00% <0.00%> (ø)` | |
   | 
[...apache\_beam/portability/api/beam\_runner\_api\_pb2.py](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fcnVubmVyX2FwaV9wYjIucHk=)
 | `100.00% <0.00%> (ø)` | |
   | ... and [290 
more](https://codecov.io/gh/apache/beam/pull/12762/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12762?src=pr=footer). Last 
update 
[bae1e7b...a5d8020](https://codecov.io/gh/apache/beam/pull/12762?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] commented on pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


codecov[bot] commented on pull request #12889:
URL: https://github.com/apache/beam/pull/12889#issuecomment-696446965


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=h1) Report
   > Merging 
[#12889](https://codecov.io/gh/apache/beam/pull/12889?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **decrease** coverage by `0.00%`.
   > The diff coverage is `91.66%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12889/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12889  +/-   ##
   ==
   - Coverage   82.34%   82.33%   -0.01% 
   ==
 Files 452  453   +1 
 Lines   5401654040  +24 
   ==
   + Hits4448144496  +15 
   - Misses   9535 9544   +9 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12889?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.54%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12889/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12889?src=pr=footer). Last 
update 
[1b26660...4146e55](https://codecov.io/gh/apache/beam/pull/12889?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] chamikaramj commented on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


chamikaramj commented on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-696447942


   Run Python_PVR_Flink PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb merged pull request #12889: Dataframe wordcount example.

2020-09-21 Thread GitBox


robertwb merged pull request #12889:
URL: https://github.com/apache/beam/pull/12889


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] udim commented on pull request #9907: [BEAM-4091] Pass type hints in ptransform_fn

2020-09-21 Thread GitBox


udim commented on pull request #9907:
URL: https://github.com/apache/beam/pull/9907#issuecomment-696465531


   PTAL, changes are in the last commit.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12645:
URL: https://github.com/apache/beam/pull/12645#issuecomment-688630083


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=h1) Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@2b2b8e7`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12645/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12645?src=pr=tree)
   
   ```diff
   @@Coverage Diff@@
   ## master   #12645   +/-   ##
   =
 Coverage  ?   82.33%   
   =
 Files ?  453   
 Lines ?54054   
 Branches  ?0   
   =
 Hits  ?44506   
 Misses? 9548   
 Partials  ?0   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12645?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/coders/\_\_init\_\_.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vY29kZXJzL19faW5pdF9fLnB5)
 | `100.00% <0.00%> (ø)` | |
   | 
[...m/runners/portability/spark\_uber\_jar\_job\_server.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9zcGFya191YmVyX2phcl9qb2Jfc2VydmVyLnB5)
 | `85.60% <0.00%> (ø)` | |
   | 
[...am/examples/snippets/transforms/aggregation/sum.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9zdW0ucHk=)
 | `100.00% <0.00%> (ø)` | |
   | 
[...eam/testing/benchmarks/nexmark/nexmark\_launcher.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy9iZW5jaG1hcmtzL25leG1hcmsvbmV4bWFya19sYXVuY2hlci5weQ==)
 | `0.00% <0.00%> (ø)` | |
   | 
[sdks/python/apache\_beam/io/gcp/bigquery.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2JpZ3F1ZXJ5LnB5)
 | `79.78% <0.00%> (ø)` | |
   | 
[...ache\_beam/io/gcp/datastore/v1new/query\_splitter.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZ2NwL2RhdGFzdG9yZS92MW5ldy9xdWVyeV9zcGxpdHRlci5weQ==)
 | `94.11% <0.00%> (ø)` | |
   | 
[...eam/portability/api/beam\_expansion\_api\_pb2\_grpc.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcG9ydGFiaWxpdHkvYXBpL2JlYW1fZXhwYW5zaW9uX2FwaV9wYjJfZ3JwYy5weQ==)
 | `61.90% <0.00%> (ø)` | |
   | 
[...n/apache\_beam/typehints/typed\_pipeline\_test\_py3.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL3R5cGVkX3BpcGVsaW5lX3Rlc3RfcHkzLnB5)
 | `90.30% <0.00%> (ø)` | |
   | 
[...s/snippets/transforms/aggregation/combinevalues.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9hZ2dyZWdhdGlvbi9jb21iaW5ldmFsdWVzLnB5)
 | `94.73% <0.00%> (ø)` | |
   | 
[...examples/snippets/transforms/elementwise/values.py](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS92YWx1ZXMucHk=)
 | `100.00% <0.00%> (ø)` | |
   | ... and [443 
more](https://codecov.io/gh/apache/beam/pull/12645/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12645?src=pr=footer). Last 
update 
[2b2b8e7...3bed6b7](https://codecov.io/gh/apache/beam/pull/12645?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] abhiy13 commented on a change in pull request #12645: [BEAM-10124] Add ContextualTextIO

2020-09-21 Thread GitBox


abhiy13 commented on a change in pull request #12645:
URL: https://github.com/apache/beam/pull/12645#discussion_r492475183



##
File path: 
sdks/java/io/contextual-text-io/src/main/java/org/apache/beam/sdk/io/contextualtextio/RecordWithMetadata.java
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.contextualtextio;
+
+import com.google.auto.value.AutoValue;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.schemas.AutoValueSchema;
+import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
+
+/**
+ * Helper Class based on {@link AutoValueSchema}, it provides Metadata 
associated with each Record
+ * when reading from file(s) using {@link ContextualTextIO}.
+ *
+ * Fields:
+ *
+ * 
+ *   recordOffset: The offset of a record (the byte at which the record 
begins) in a file. This
+ *   information can be useful if you wish to reconstruct the file. {@link
+ *   RecordWithMetadata#getRecordOffset()}
+ *   recordNum: The ordinal number of the record in its file. {@link
+ *   RecordWithMetadata#getRecordNum()}
+ *   recordValue: The value / contents of the record {@link 
RecordWithMetadata#getRecordValue()}
+ *   rangeOffset: The starting offset of the range (split), which 
contained the record, when the
+ *   record was read. {@link RecordWithMetadata#getRangeOffset()}
+ *   recordNumInOffset: The record number relative to the Range. (line 
number within the range)
+ *   {@link RecordWithMetadata#getRecordNumInOffset()}
+ *   fileName: Name of the file to which the record belongs (this is the 
full filename,
+ *   eg:path/to/file.txt) {@link RecordWithMetadata#getFileName()}
+ * 
+ */
+@Experimental(Experimental.Kind.SCHEMAS)
+@DefaultSchema(AutoValueSchema.class)
+@AutoValue
+public abstract class RecordWithMetadata {
+  public abstract Long getRecordOffset();

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] robertwb commented on a change in pull request #12727: [BEAM-10844] Add experiment option prebuild_sdk_container to prebuild python sdk container with dependencies.

2020-09-21 Thread GitBox


robertwb commented on a change in pull request #12727:
URL: https://github.com/apache/beam/pull/12727#discussion_r492422441



##
File path: sdks/python/apache_beam/runners/portability/stager.py
##
@@ -119,6 +119,7 @@ def create_job_resources(options,  # type: PipelineOptions
temp_dir,  # type: str
build_setup_args=None,  # type: Optional[List[str]]
populate_requirements_cache=None,  # type: 
Optional[str]
+   skip_boot_dependencies=False, # type: Optional[bool]

Review comment:
   Hmm... what we need to do is fix Dataflow to not be so weird. But 
probably not in this PR. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on pull request #12895: [BEAM-9372][BEAM-9980] Switches Flink VR suite to Py36 and makes the version configurable.

2020-09-21 Thread GitBox


tvalentyn commented on pull request #12895:
URL: https://github.com/apache/beam/pull/12895#issuecomment-696458288


   Run Seed Job



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **decrease** coverage by `0.00%`.
   > The diff coverage is `92.59%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master   #12880  +/-   ##
   ==
   - Coverage   82.32%   82.32%   -0.01% 
   ==
 Files 452  453   +1 
 Lines   5401654042  +26 
   ==
   + Hits4447144492  +21 
   - Misses   9545 9550   +5 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...python/apache\_beam/examples/wordcount\_dataframe.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvd29yZGNvdW50X2RhdGFmcmFtZS5weQ==)
 | `91.66% <91.66%> (ø)` | |
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.24% <100.00%> (+0.06%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/combiners.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jb21iaW5lcnMucHk=)
 | `91.97% <0.00%> (-0.19%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[096f695...83a4d41](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12900: [BEAM-10941] Use standard sharding conventions for fileio writes.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12900:
URL: https://github.com/apache/beam/pull/12900#issuecomment-696448479







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on a change in pull request #12895: [BEAM-9372][BEAM-9980] Switches Flink VR suite to Py36 and makes the version configurable.

2020-09-21 Thread GitBox


tvalentyn commented on a change in pull request #12895:
URL: https://github.com/apache/beam/pull/12895#discussion_r492360722



##
File path: sdks/python/test-suites/gradle.properties
##
@@ -27,3 +27,6 @@ dataflow_chicago_taxi_example_task_py_versions=3.7
 
 # direct test-suites
 direct_mongodbio_it_task_py_versions=3.5
+
+# portable test-suites
+portable_flink_validates_runner_py_versions=3.6

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] ibzib commented on a change in pull request #12637: [BEAM-10768] Don't assert the order in which elements are received.

2020-09-21 Thread GitBox


ibzib commented on a change in pull request #12637:
URL: https://github.com/apache/beam/pull/12637#discussion_r492386778



##
File path: sdks/python/apache_beam/runners/worker/data_plane_test.py
##
@@ -108,16 +106,28 @@ def send(instruction_id, transform_id, data):
 ])
 
 # Multiple interleaved writes to multiple instructions.
-send('1', transform_1, b'abc')
-send('2', transform_1, b'def')
+stream11 = from_channel.output_stream('1', transform_1)
+stream11.write(b'abc')
+stream21 = from_channel.output_stream('2', transform_1)
+stream21.write(b'def')
+if not time_based_flush:
+  stream11.close()
 self.assertEqual(
 list(
 itertools.islice(to_channel.input_elements('1', [transform_1]), 
1)),
 [
 beam_fn_api_pb2.Elements.Data(
 instruction_id='1', transform_id=transform_1, data=b'abc')
 ])
-send('2', transform_2, b'ghi')
+if time_based_flush:

Review comment:
   Write does not provide ordering guarantees in this case.
   
   Elements are stored in a 
[queue](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L371)
 before being sent, to enable batching. Elements aren't added to that queue 
until the [flush 
callback](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L493)
 is invoked. Because the flush callback is [invoked 
periodically](https://github.com/apache/beam/blob/7b3d4251d244c10545fb37f1d93ebcad84a98681/sdks/python/apache_beam/runners/worker/data_plane.py#L182)
 starting from when a stream is constructed, there is no guarantee that one 
stream's callback is called before the other.
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] TheNeuralBit commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


TheNeuralBit commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696446062


   I think that job is actually a new one added in 
https://github.com/apache/beam/pull/12898. There's still 
https://ci-beam.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12900: [BEAM-10941] Use standard sharding conventions for fileio writes.

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12900:
URL: https://github.com/apache/beam/pull/12900#issuecomment-696448479


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=h1) Report
   > Merging 
[#12900](https://codecov.io/gh/apache/beam/pull/12900?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/067cba8229694e7fb9693313f51ca686746b620a?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12900/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12900?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12900   +/-   ##
   ===
 Coverage   82.34%   82.34%   
   ===
 Files 452  452   
 Lines   5401654018+2 
   ===
   + Hits4448144483+2 
 Misses   9535 9535   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12900?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[sdks/python/apache\_beam/io/fileio.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vZmlsZWlvLnB5)
 | `95.84% <100.00%> (+1.77%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5)
 | `90.90% <0.00%> (-0.76%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12900/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.45% <0.00%> (-0.14%)` | :arrow_down: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12900?src=pr=footer). Last 
update 
[067cba8...e29eaad](https://codecov.io/gh/apache/beam/pull/12900?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on pull request #12896: Update indexing skips for pandas 1.x

2020-09-21 Thread GitBox


tvalentyn commented on pull request #12896:
URL: https://github.com/apache/beam/pull/12896#issuecomment-696457581


   > That's unexpected.. I wonder if it was disabled accidentally (perhaps 
because of Python version removals?)
   
   Yes, sorry about that - I was testing my changes and ran a seed job on  
#12898 
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #12880: [BEAM-10933] Adjust GBK type before creating the pipeline proto

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #12880:
URL: https://github.com/apache/beam/pull/12880#issuecomment-695346831


   # [Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=h1) Report
   > Merging 
[#12880](https://codecov.io/gh/apache/beam/pull/12880?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/12880/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree)
   
   ```diff
   @@   Coverage Diff   @@
   ##   master   #12880   +/-   ##
   ===
 Coverage   82.32%   82.32%   
   ===
 Files 452  452   
 Lines   5401654018+2 
   ===
   + Hits4447144473+2 
 Misses   9545 9545   
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/12880?src=pr=tree) | Coverage 
Δ | |
   |---|---|---|
   | 
[...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=)
 | `77.24% <100.00%> (+0.06%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/runners/common.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9jb21tb24ucHk=)
 | `88.75% <0.00%> (-0.45%)` | :arrow_down: |
   | 
[...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==)
 | `94.18% <0.00%> (-0.27%)` | :arrow_down: |
   | 
[...ks/python/apache\_beam/runners/worker/sdk\_worker.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvc2RrX3dvcmtlci5weQ==)
 | `88.80% <0.00%> (-0.18%)` | :arrow_down: |
   | 
[...nners/portability/fn\_api\_runner/worker\_handlers.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL3dvcmtlcl9oYW5kbGVycy5weQ==)
 | `80.75% <0.00%> (+0.17%)` | :arrow_up: |
   | 
[sdks/python/apache\_beam/io/iobase.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vaW9iYXNlLnB5)
 | `84.33% <0.00%> (+0.28%)` | :arrow_up: |
   | 
[...ks/python/apache\_beam/runners/worker/data\_plane.py](https://codecov.io/gh/apache/beam/pull/12880/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvZGF0YV9wbGFuZS5weQ==)
 | `89.90% <0.00%> (+1.22%)` | :arrow_up: |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/12880?src=pr=footer). Last 
update 
[096f695...83a4d41](https://codecov.io/gh/apache/beam/pull/12880?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] codecov[bot] edited a comment on pull request #9907: [BEAM-4091] Pass type hints in ptransform_fn

2020-09-21 Thread GitBox


codecov[bot] edited a comment on pull request #9907:
URL: https://github.com/apache/beam/pull/9907#issuecomment-682206351


   # [Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=h1) Report
   > Merging 
[#9907](https://codecov.io/gh/apache/beam/pull/9907?src=pr=desc) into 
[master](https://codecov.io/gh/apache/beam/commit/1b266601c39f243f48dfd085b565b984f936d02c?el=desc)
 will **decrease** coverage by `42.03%`.
   > The diff coverage is `24.13%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/beam/pull/9907/graphs/tree.svg?width=650=150=pr=qcbbAh8Fj1)](https://codecov.io/gh/apache/beam/pull/9907?src=pr=tree)
   
   ```diff
   @@ Coverage Diff @@
   ##   master#9907   +/-   ##
   ===
   - Coverage   82.32%   40.29%   -42.04% 
   ===
 Files 452  451-1 
 Lines   5401653197  -819 
   ===
   - Hits4447121434-23037 
   - Misses   954531763+22218 
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/beam/pull/9907?src=pr=tree) | Coverage Δ 
| |
   |---|---|---|
   | 
[sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5)
 | `39.08% <0.00%> (-56.36%)` | :arrow_down: |
   | 
[...n/apache\_beam/typehints/typed\_pipeline\_test\_py3.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL3R5cGVkX3BpcGVsaW5lX3Rlc3RfcHkzLnB5)
 | `16.87% <9.52%> (-73.43%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/typehints/decorators.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL2RlY29yYXRvcnMucHk=)
 | `39.45% <50.00%> (-43.84%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/transforms/ptransform.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9wdHJhbnNmb3JtLnB5)
 | `38.27% <80.00%> (-52.78%)` | :arrow_down: |
   | 
[...python/apache\_beam/examples/complete/distribopt.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvY29tcGxldGUvZGlzdHJpYm9wdC5weQ==)
 | `0.00% <0.00%> (-98.59%)` | :arrow_down: |
   | 
[...dks/python/apache\_beam/transforms/create\_source.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy9jcmVhdGVfc291cmNlLnB5)
 | `0.00% <0.00%> (-98.19%)` | :arrow_down: |
   | 
[...on/apache\_beam/runners/direct/helper\_transforms.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvaGVscGVyX3RyYW5zZm9ybXMucHk=)
 | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | 
[...e\_beam/runners/interactive/testing/mock\_ipython.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS90ZXN0aW5nL21vY2tfaXB5dGhvbi5weQ==)
 | `7.14% <0.00%> (-92.86%)` | :arrow_down: |
   | 
[.../examples/snippets/transforms/elementwise/pardo.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvc25pcHBldHMvdHJhbnNmb3Jtcy9lbGVtZW50d2lzZS9wYXJkby5weQ==)
 | `11.36% <0.00%> (-88.64%)` | :arrow_down: |
   | 
[sdks/python/apache\_beam/typehints/opcodes.py](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHlwZWhpbnRzL29wY29kZXMucHk=)
 | `0.00% <0.00%> (-87.92%)` | :arrow_down: |
   | ... and [291 
more](https://codecov.io/gh/apache/beam/pull/9907/diff?src=pr=tree-more) | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/beam/pull/9907?src=pr=footer). Last 
update 
[1b26660...138efc5](https://codecov.io/gh/apache/beam/pull/9907?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] lukecwik commented on a change in pull request #12637: [BEAM-10768] Don't assert the order in which elements are received.

2020-09-21 Thread GitBox


lukecwik commented on a change in pull request #12637:
URL: https://github.com/apache/beam/pull/12637#discussion_r492359619



##
File path: sdks/python/apache_beam/runners/worker/data_plane_test.py
##
@@ -108,16 +106,28 @@ def send(instruction_id, transform_id, data):
 ])
 
 # Multiple interleaved writes to multiple instructions.
-send('1', transform_1, b'abc')
-send('2', transform_1, b'def')
+stream11 = from_channel.output_stream('1', transform_1)
+stream11.write(b'abc')
+stream21 = from_channel.output_stream('2', transform_1)
+stream21.write(b'def')
+if not time_based_flush:
+  stream11.close()
 self.assertEqual(
 list(
 itertools.islice(to_channel.input_elements('1', [transform_1]), 
1)),
 [
 beam_fn_api_pb2.Elements.Data(
 instruction_id='1', transform_id=transform_1, data=b'abc')
 ])
-send('2', transform_2, b'ghi')
+if time_based_flush:

Review comment:
   Why do we need to wait for the flush, shouldn't the earlier 
`stream21.write(b'def')` provide the correct ordering?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [beam] tvalentyn commented on pull request #12898: [BEAM-7372][BEAM-9980] Cleans up Flink precommit VR suite definition and makes Python version configurable.

2020-09-21 Thread GitBox


tvalentyn commented on pull request #12898:
URL: https://github.com/apache/beam/pull/12898#issuecomment-696385996







This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




  1   2   3   4   >