[jira] [Work started] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-5381 started by Robert Burke. -- > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Assignee: Robert Burke >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
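The duplicate reported above can be checked mechanically before a job is submitted. A minimal sketch, assuming the {{--dry_run}} output has been parsed into a list of dicts that each carry a "name" key; the helper name, the sample steps, and their "kind" values are hypothetical and are not taken from the Go SDK or from the attached gist:

```python
from collections import Counter


def duplicate_step_names(steps):
    """Return the sorted step names that occur more than once."""
    counts = Counter(step['name'] for step in steps)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical steps, shaped like the situation described in the issue.
steps = [
    {'name': 'e4', 'kind': 'ParallelDo'},
    {'name': 'e5', 'kind': 'GroupByKey'},
    {'name': 'e5', 'kind': 'ParallelDo'},
    {'name': 'e6', 'kind': 'ParallelDo'},
]

print(duplicate_step_names(steps))
```

A check like this only detects the collision; it does not say why the scoped CoGBK receives the same ID twice.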
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=224001=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224001 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 06/Apr/19 05:56 Start Date: 06/Apr/19 05:56 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238#issuecomment-480477067 R: @youngoli CC: @robinp This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 224001) Time Spent: 2h (was: 1h 50m) > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Assignee: Robert Burke >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6493) examples in Kotlin
[ https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=223999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223999 ] ASF GitHub Bot logged work on BEAM-6493: Author: ASF GitHub Bot Created on: 06/Apr/19 05:47 Start Date: 06/Apr/19 05:47 Worklog Time Spent: 10m Work Description: harshithdwivedi commented on issue #8034: [BEAM-6493] Convert the WordCount samples to Kotlin URL: https://github.com/apache/beam/pull/8034#issuecomment-480476685 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223999) Time Spent: 6.5h (was: 6h 20m) Remaining Estimate: 498.5h (was: 498h 40m) > examples in Kotlin > -- > > Key: BEAM-6493 > URL: https://issues.apache.org/jira/browse/BEAM-6493 > Project: Beam > Issue Type: Task > Components: examples-java >Affects Versions: Not applicable >Reporter: Harshit Dwivedi >Assignee: Harshit Dwivedi >Priority: Minor > Labels: documentation, triaged > Fix For: Not applicable > > Original Estimate: 504h > Time Spent: 6.5h > Remaining Estimate: 498.5h > > I have been using Apache Beam for few of my projects in production since the > past 6 months and apart from Java, [Kotlin|https://kotlinlang.org/] also > seems to work as well with no issues whatsoever. > But currently, the Github Repository of Apache Beam contains examples only in > Java which might be an issue for other developers who want to use Apache Beam > SDK with kotlin as there are no sample resources available. > That said, I would love to go ahead and add kotlin examples alongside the > current java examples in the [Beam > repository|https://github.com/apache/beam/tree/master/examples/java]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6904) Test all Coder structuralValue implementations
[ https://issues.apache.org/jira/browse/BEAM-6904?focusedWorklogId=223998=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223998 ] ASF GitHub Bot logged work on BEAM-6904: Author: ASF GitHub Bot Created on: 06/Apr/19 05:41 Start Date: 06/Apr/19 05:41 Worklog Time Spent: 10m Work Description: AlexKbit commented on issue #8208: [BEAM-6904] Add tests for structuralValue implementation in coders URL: https://github.com/apache/beam/pull/8208#issuecomment-480476385 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223998) Time Spent: 1h 50m (was: 1h 40m) > Test all Coder structuralValue implementations > -- > > Key: BEAM-6904 > URL: https://issues.apache.org/jira/browse/BEAM-6904 > Project: Beam > Issue Type: Test > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Alexander Savchenko >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Here is a test helper that checks that structuralValue is consistent with > equals: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L200 > And here is one that tests it another way: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L226 > With the deprecation of consistentWithEquals and implementing all the > structuralValue methods, we should add these tests to every coder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223982 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 06/Apr/19 01:43 Start Date: 06/Apr/19 01:43 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480464114 Python SDK PostCommit Tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223982) Time Spent: 2h 10m (was: 2h) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223981 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 06/Apr/19 01:42 Start Date: 06/Apr/19 01:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480464028 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223981) Time Spent: 2h (was: 1h 50m) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223980 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 06/Apr/19 01:42 Start Date: 06/Apr/19 01:42 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480464016 Python SDK PostCommit Tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223980) Time Spent: 1h 50m (was: 1h 40m) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223979=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223979 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 06/Apr/19 01:35 Start Date: 06/Apr/19 01:35 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480106740 R: [removed reviewers for now to avoid spammy notifications] This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223979) Time Spent: 1h 40m (was: 1.5h) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=223973=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223973 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 06/Apr/19 01:10 Start Date: 06/Apr/19 01:10 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238#issuecomment-480461831 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223973) Time Spent: 1h 50m (was: 1h 40m) > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Assignee: Robert Burke >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6825) Improve pipeline construction time error messages in Go SDK.
[ https://issues.apache.org/jira/browse/BEAM-6825?focusedWorklogId=223972=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223972 ] ASF GitHub Bot logged work on BEAM-6825: Author: ASF GitHub Bot Created on: 06/Apr/19 01:06 Start Date: 06/Apr/19 01:06 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8243: [BEAM-6825] Improve Combine error messages. URL: https://github.com/apache/beam/pull/8243#issuecomment-480461616 R: @youngoli Please review and ask all the questions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223972) Time Spent: 20m (was: 10m) > Improve pipeline construction time error messages in Go SDK. > > > Key: BEAM-6825 > URL: https://issues.apache.org/jira/browse/BEAM-6825 > Project: Beam > Issue Type: New Feature > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Minor > Labels: triaged > Time Spent: 20m > Remaining Estimate: 0h > > Many error messages for common pipeline construction mistakes are unclear and > unhelpful. They need to be improved to provide more context, especially for > newer users. This bug tracks these error message improvements. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6769) BigQuery IO does not support bytes in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6769?focusedWorklogId=223970=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223970 ] ASF GitHub Bot logged work on BEAM-6769: Author: ASF GitHub Bot Created on: 06/Apr/19 00:56 Start Date: 06/Apr/19 00:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8047: [BEAM-6769] write bytes to bigquery in python 2 URL: https://github.com/apache/beam/pull/8047#discussion_r272777024 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py ## @@ -115,8 +117,11 @@ def _setup_new_types_env(self): table_data = [ {'bytes':b'xyw=', 'date':'2011-01-01', 'time':'23:59:59.99'}, {'bytes':b'abc=', 'date':'2000-01-01', 'time':'00:00:00'}, -{'bytes':b'dec=', 'date':'3000-12-31', 'time':'23:59:59.99'} +{'bytes':b'dec=', 'date':'3000-12-31', 'time':'23:59:59.99'}, +{'bytes':b'\xab\xac\xad', 'date':'2000-01-01', 'time':'00:00:00'} ] +for row in table_data: Review comment: Let's add a comment here that a bigquery client requires the user to base64-encode bytes sequences before passing them to BQ. If you have a link handy, please include it. Thank you. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223970) Time Spent: 4h (was: 3h 50m) > BigQuery IO does not support bytes in Python 3 > -- > > Key: BEAM-6769 > URL: https://issues.apache.org/jira/browse/BEAM-6769 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Juta Staes >Assignee: Juta Staes >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > In Python 2 you could write bytes data to BigQuery. 
This is tested in > > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py#L186] > Python 3 does not support > {noformat} > json.dumps({'test': b'test'}){noformat} > which is used to encode the data in > > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L959] > > How should writing bytes to BigQuery be handled in Python 3? > * Forbid writing bytes into BigQuery on Python 3 > * Guess the encoding (utf-8?) > * Pass the encoding to BigQuery > cc: [~tvalentyn] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
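The failure described in this issue, and the base64 workaround mentioned in the review comment, can be reproduced with the standard library alone. A minimal sketch, assuming a row shaped like the test data in the diff; the helper name encode_bytes_fields is hypothetical and is not the code in bigquery_tools.py:

```python
import base64
import json


def encode_bytes_fields(row):
    """Return a copy of row with bytes values replaced by base64 text."""
    return {
        key: base64.b64encode(value).decode('ascii')
        if isinstance(value, bytes) else value
        for key, value in row.items()
    }


row = {'bytes': b'\xab\xac\xad', 'date': '2000-01-01', 'time': '00:00:00'}

# On Python 3, json.dumps raises TypeError for bytes values.
try:
    json.dumps(row)
except TypeError as err:
    print('json.dumps rejects bytes:', err)

# After base64-encoding, the row serializes and round-trips losslessly.
payload = json.dumps(encode_bytes_fields(row))
print(payload)
assert base64.b64decode(json.loads(payload)['bytes']) == row['bytes']
```

Unlike guessing a text encoding, base64 also works for byte sequences that are not valid UTF-8.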
[jira] [Work logged] (BEAM-6769) BigQuery IO does not support bytes in Python 3
[ https://issues.apache.org/jira/browse/BEAM-6769?focusedWorklogId=223969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223969 ] ASF GitHub Bot logged work on BEAM-6769: Author: ASF GitHub Bot Created on: 06/Apr/19 00:56 Start Date: 06/Apr/19 00:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8047: [BEAM-6769] write bytes to bigquery in python 2 URL: https://github.com/apache/beam/pull/8047#discussion_r272776978 ## File path: sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py ## @@ -48,12 +49,13 @@ NEW_TYPES_OUTPUT_SCHEMA = ( '{"fields": [{"name": "bytes","type": "BYTES"},' '{"name": "date","type": "DATE"},{"name": "time","type": "TIME"}]}') -NEW_TYPES_OUTPUT_VERIFY_QUERY = ('SELECT date FROM `%s`;') -# There are problems with query time and bytes with current version of bigquery. +NEW_TYPES_OUTPUT_VERIFY_QUERY = ('SELECT bytes, date, time FROM `%s`;') NEW_TYPES_OUTPUT_EXPECTED = [ -(datetime.date(2000, 1, 1),), -(datetime.date(2011, 1, 1),), -(datetime.date(3000, 12, 31),)] +(b'xyw=', datetime.date(2011, 1, 1), datetime.time(23, 59, 59, 99),), Review comment: Let's remove the `=` sign after `xyw` and others here to avoid confusion that these strings need to be base64-encoded (odd-length base64 encoded strings are padded with `=`). Also, let's replace one of the usecases with a utf-8-decodable byte-sequence. This use-case will help to make sure that this sequence doesn't get accidentally decoded somewhere. For example, we can use `b'\xe4\xbd\xa0\xe5\xa5\xbd'`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223969) Time Spent: 3h 50m (was: 3h 40m) > BigQuery IO does not support bytes in Python 3 > -- > > Key: BEAM-6769 > URL: https://issues.apache.org/jira/browse/BEAM-6769 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: Juta Staes >Assignee: Juta Staes >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > In Python 2 you could write bytes data to BigQuery. This is tested in > > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py#L186] > Python 3 does not support > {noformat} > json.dumps({'test': b'test'}){noformat} > which is used to encode the data in > > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery_tools.py#L959] > > How should writing bytes to BigQuery be handled in Python 3? > * Forbid writing bytes into BigQuery on Python 3 > * Guess the encoding (utf-8?) > * Pass the encoding to BigQuery > cc: [~tvalentyn] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
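The two points in the review comment above can be illustrated with the standard library. A minimal sketch: padding with `=` appears when the input length is not a multiple of 3 bytes, and the suggested sequence b'\xe4\xbd\xa0\xe5\xa5\xbd' is valid UTF-8 while b'\xab\xac\xad' from the diff is not. The helper name is_utf8_decodable is hypothetical:

```python
import base64


def is_utf8_decodable(raw):
    """Report whether raw can be decoded as UTF-8 text."""
    try:
        raw.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


# Inputs of 1 and 2 bytes are padded; an input of 3 bytes is not.
for raw in (b'x', b'xy', b'xyw'):
    print(len(raw), base64.b64encode(raw))

# A decodable sequence in the test data shows whether some layer
# silently turns the bytes into text.
print(is_utf8_decodable(b'\xe4\xbd\xa0\xe5\xa5\xbd'))
print(is_utf8_decodable(b'\xab\xac\xad'))
```

Having both kinds of sequences in the expected output covers accidental decoding as well as outright decode failures.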
[jira] [Work logged] (BEAM-6825) Improve pipeline construction time error messages in Go SDK.
[ https://issues.apache.org/jira/browse/BEAM-6825?focusedWorklogId=223966=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223966 ] ASF GitHub Bot logged work on BEAM-6825: Author: ASF GitHub Bot Created on: 06/Apr/19 00:47 Start Date: 06/Apr/19 00:47 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8243: [BEAM-6825] Improve Combine error messages. URL: https://github.com/apache/beam/pull/8243 * Makes the error messages for CombineFns much clearer * Add significant testing around various "bad" and "good" CombineFns * Start returning better errors than just string errors, which can then be used to improve upstream errors. * Fixes an incorrect combiner implementation of stats.Mean * Make the funcx signature utilities accept arbitrary funcs, funcx.Fns, and reflectx.Funcs. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Work logged] (BEAM-6994) SamzaRunner: further improvements for upgrading Samza
[ https://issues.apache.org/jira/browse/BEAM-6994?focusedWorklogId=223965=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223965 ] ASF GitHub Bot logged work on BEAM-6994: Author: ASF GitHub Bot Created on: 06/Apr/19 00:47 Start Date: 06/Apr/19 00:47 Worklog Time Spent: 10m Work Description: xinyuiscool commented on pull request #8221: [BEAM-6994] SamzaRunner: further improvements for upgrading Samza URL: https://github.com/apache/beam/pull/8221#discussion_r272776495 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/translation/ConfigBuilder.java ## @@ -255,9 +230,40 @@ private static void validateYarnRun(Map config) { configBuilder.put("job.host-affinity.enabled", "true"); } +switch (options.getSamzaExecutionEnvironment()) { Review comment: Makes sense. Added more logs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223965) Time Spent: 1h (was: 50m) > SamzaRunner: further improvements for upgrading Samza > - > > Key: BEAM-6994 > URL: https://issues.apache.org/jira/browse/BEAM-6994 > Project: Beam > Issue Type: Improvement > Components: runner-samza >Reporter: Xinyu Liu >Assignee: Xinyu Liu >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > More improvements for SamzaRunner: > - Life cycle methods for the pipeline runtime > - Hook up Samza ExternalContext for LinkedIn use cases > - Support metrics reporters in pipeline options > - Some bug fixes for the state key in Samza -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6994) SamzaRunner: further improvements for upgrading Samza
[ https://issues.apache.org/jira/browse/BEAM-6994?focusedWorklogId=223963=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223963 ] ASF GitHub Bot logged work on BEAM-6994: Author: ASF GitHub Bot Created on: 06/Apr/19 00:47 Start Date: 06/Apr/19 00:47 Worklog Time Spent: 10m Work Description: xinyuiscool commented on pull request #8221: [BEAM-6994] SamzaRunner: further improvements for upgrading Samza URL: https://github.com/apache/beam/pull/8221#discussion_r272776111 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineResult.java ## @@ -70,6 +75,11 @@ public State waitUntilFinish(@Nullable Duration duration) { } final StateInfo stateInfo = getStateInfo(); + +if (listener != null && (stateInfo.state == State.DONE || stateInfo.state == State.FAILED)) { Review comment: Oh, the status can be RUNNING, since it can be a timed waitUntilFinish(). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223963) Time Spent: 50m (was: 40m) > SamzaRunner: further improvements for upgrading Samza > - > > Key: BEAM-6994 > URL: https://issues.apache.org/jira/browse/BEAM-6994 > Project: Beam > Issue Type: Improvement > Components: runner-samza >Reporter: Xinyu Liu >Assignee: Xinyu Liu >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > More improvements for SamzaRunner: > - Life cycle methods for the pipeline runtime > - Hook up Samza ExternalContext for LinkedIn use cases > - Support metrics reporters in pipeline options > - Some bug fixes for the state key in Samza -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6994) SamzaRunner: further improvements for upgrading Samza
[ https://issues.apache.org/jira/browse/BEAM-6994?focusedWorklogId=223964=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223964 ] ASF GitHub Bot logged work on BEAM-6994: Author: ASF GitHub Bot Created on: 06/Apr/19 00:47 Start Date: 06/Apr/19 00:47 Worklog Time Spent: 10m Work Description: xinyuiscool commented on pull request #8221: [BEAM-6994] SamzaRunner: further improvements for upgrading Samza URL: https://github.com/apache/beam/pull/8221#discussion_r272776155 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/GroupByKeyOp.java ## @@ -65,6 +66,8 @@ private final Coder keyCoder; private final SystemReduceFn reduceFn; private final String stepName; + private final String stepId; + private final PCollection.IsBounded isBounded; Review comment: Fixed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223964) > SamzaRunner: further improvements for upgrading Samza > - > > Key: BEAM-6994 > URL: https://issues.apache.org/jira/browse/BEAM-6994 > Project: Beam > Issue Type: Improvement > Components: runner-samza >Reporter: Xinyu Liu >Assignee: Xinyu Liu >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > More improvements for SamzaRunner: > - Life cycle methods for the pipeline runtime > - Hook up Samza ExternalContext for LinkedIn use cases > - Support metrics reporters in pipeline options > - Some bug fixes for the state key in Samza -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223961 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 06/Apr/19 00:34 Start Date: 06/Apr/19 00:34 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#issuecomment-480459075 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223961) Time Spent: 1h 20m (was: 1h 10m) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6493) examples in Kotlin
[ https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=223942=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223942 ] ASF GitHub Bot logged work on BEAM-6493: Author: ASF GitHub Bot Created on: 05/Apr/19 23:52 Start Date: 05/Apr/19 23:52 Worklog Time Spent: 10m Work Description: pabloem commented on issue #8034: [BEAM-6493] Convert the WordCount samples to Kotlin URL: https://github.com/apache/beam/pull/8034#issuecomment-480454357 @harshithdwivedi I asked the Beam community to add their feedback (as you may have noticed), because WordCount is probably the main piece of documentation for starting users, so I'm making sure to have a good amount of feedback before merging. Sorry if it's troublesome to get so many new comments. I'll merge the change on Monday or Tuesday : ) - thanks a lot for the contribution This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223942) Time Spent: 6h 20m (was: 6h 10m) Remaining Estimate: 498h 40m (was: 498h 50m) > examples in Kotlin > -- > > Key: BEAM-6493 > URL: https://issues.apache.org/jira/browse/BEAM-6493 > Project: Beam > Issue Type: Task > Components: examples-java >Affects Versions: Not applicable >Reporter: Harshit Dwivedi >Assignee: Harshit Dwivedi >Priority: Minor > Labels: documentation, triaged > Fix For: Not applicable > > Original Estimate: 504h > Time Spent: 6h 20m > Remaining Estimate: 498h 40m > > I have been using Apache Beam for a few of my projects in production for the > past 6 months and, apart from Java, [Kotlin|https://kotlinlang.org/] also > seems to work well with no issues whatsoever. 
> But currently, the GitHub repository of Apache Beam contains examples only in > Java, which might be an issue for other developers who want to use the Apache Beam > SDK with Kotlin, as there are no sample resources available. > That said, I would love to go ahead and add Kotlin examples alongside the > current Java examples in the [Beam > repository|https://github.com/apache/beam/tree/master/examples/java]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7020) Reduce the log severity of profiling agent discovery
[ https://issues.apache.org/jira/browse/BEAM-7020?focusedWorklogId=223939=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223939 ] ASF GitHub Bot logged work on BEAM-7020: Author: ASF GitHub Bot Created on: 05/Apr/19 23:44 Start Date: 05/Apr/19 23:44 Worklog Time Spent: 10m Work Description: pabloem commented on issue #8241: [BEAM-7020] Reduced log severity for profiling agent discovery URL: https://github.com/apache/beam/pull/8241#issuecomment-480453440 Thanks @davidyan74 - the testing environment for Beam is in a bad state. I'll merge this during the weekend after rerunning the tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223939) Time Spent: 0.5h (was: 20m) > Reduce the log severity of profiling agent discovery > > > Key: BEAM-7020 > URL: https://issues.apache.org/jira/browse/BEAM-7020 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: David Yan >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Example: > [https://github.com/apache/beam/blob/b953645ed6db837d24284d7fe1fe091e7309f821/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/profiler/ScopedProfiler.java#L138] > These should not be at warning severity, even if the profiling agent is not > present, since in most cases users do not run their jobs with profiling. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7017) Improve Apache Rat failure message during build time
[ https://issues.apache.org/jira/browse/BEAM-7017?focusedWorklogId=223932=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223932 ] ASF GitHub Bot logged work on BEAM-7017: Author: ASF GitHub Bot Created on: 05/Apr/19 23:11 Start Date: 05/Apr/19 23:11 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8240: [BEAM-7017] Upgrade Apache Rat plugin version to have improved build output for failures. URL: https://github.com/apache/beam/pull/8240#issuecomment-480448914 please run tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223932) Time Spent: 0.5h (was: 20m) > Improve Apache Rat failure message during build time > > > Key: BEAM-7017 > URL: https://issues.apache.org/jira/browse/BEAM-7017 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > 0.4.0 of the plugin embeds the filenames with invalid licenses in the > exception message of the failing task so it now appears at the bottom of the > build. > {code} > > Task :rat FAILED > FAILURE: Build failed with an exception. > * What went wrong: > Execution failed for task ':rat'. > > A failure occurred while executing org.nosphere.apache.rat.RatWork >> Apache Rat audit failure > > * > Summary > --- > Generated at: 2019-04-05T14:39:11-07:00 > > Notes: 5 > Binaries: 123 > Archives: 4 > Standards: 5105 > > Apache Licensed: 5104 > Generated Documents: 0 > > JavaDocs are generated, thus a license header is optional. > Generated files do not require license headers. 
> > 1 Unknown Licenses > > * > > Files with unapproved licenses: > > > /usr/local/google/home/lcwik/git/beam/examples/java/src/main/java/org/apache/beam/examples/WordCount.java > > * > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7017) Improve Apache Rat failure message during build time
[ https://issues.apache.org/jira/browse/BEAM-7017?focusedWorklogId=223928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223928 ] ASF GitHub Bot logged work on BEAM-7017: Author: ASF GitHub Bot Created on: 05/Apr/19 23:09 Start Date: 05/Apr/19 23:09 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8240: [BEAM-7017] Upgrade Apache Rat plugin version to have improved build output for failures. URL: https://github.com/apache/beam/pull/8240#issuecomment-480448914 please run tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223928) Time Spent: 20m (was: 10m) > Improve Apache Rat failure message during build time > > > Key: BEAM-7017 > URL: https://issues.apache.org/jira/browse/BEAM-7017 > Project: Beam > Issue Type: Improvement > Components: build-system >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > 0.4.0 of the plugin embeds the filenames with invalid licenses in the > exception message of the failing task so it now appears at the bottom of the > build. > {code} > > Task :rat FAILED > FAILURE: Build failed with an exception. > * What went wrong: > Execution failed for task ':rat'. > > A failure occurred while executing org.nosphere.apache.rat.RatWork >> Apache Rat audit failure > > * > Summary > --- > Generated at: 2019-04-05T14:39:11-07:00 > > Notes: 5 > Binaries: 123 > Archives: 4 > Standards: 5105 > > Apache Licensed: 5104 > Generated Documents: 0 > > JavaDocs are generated, thus a license header is optional. > Generated files do not require license headers. 
> > 1 Unknown Licenses > > * > > Files with unapproved licenses: > > > /usr/local/google/home/lcwik/git/beam/examples/java/src/main/java/org/apache/beam/examples/WordCount.java > > * > > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5433) Cleanup Environment.url from beam_runner_api.proto
[ https://issues.apache.org/jira/browse/BEAM-5433?focusedWorklogId=223924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223924 ] ASF GitHub Bot logged work on BEAM-5433: Author: ASF GitHub Bot Created on: 05/Apr/19 23:08 Start Date: 05/Apr/19 23:08 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8213: [BEAM-5433] Deprecate environment url field. URL: https://github.com/apache/beam/pull/8213#issuecomment-480448670 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223924) Time Spent: 1h (was: 50m) > Cleanup Environment.url from beam_runner_api.proto > -- > > Key: BEAM-5433 > URL: https://issues.apache.org/jira/browse/BEAM-5433 > Project: Beam > Issue Type: Task > Components: beam-model >Reporter: Ankur Goenka >Assignee: Luke Cwik >Priority: Major > Labels: triaged > Time Spent: 1h > Remaining Estimate: 0h > > The Environment URL field is deprecated and should be removed ASAP. > The current blocker to removing the field is compatibility with Dataflow, as > Dataflow has internal code which relies on it. > A vote has also passed to move the affected Dataflow code to open source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223925 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 23:08 Start Date: 05/Apr/19 23:08 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480448687 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223925) Time Spent: 2h 10m (was: 2h) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223926=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223926 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 23:08 Start Date: 05/Apr/19 23:08 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480448702 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223926) Time Spent: 2h 20m (was: 2h 10m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5174) Website feed is broken due to license header
[ https://issues.apache.org/jira/browse/BEAM-5174?focusedWorklogId=223923=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223923 ] ASF GitHub Bot logged work on BEAM-5174: Author: ASF GitHub Bot Created on: 05/Apr/19 23:04 Start Date: 05/Apr/19 23:04 Worklog Time Spent: 10m Work Description: melap commented on pull request #8231: [BEAM-5174] Fixes website feed URL: https://github.com/apache/beam/pull/8231 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223923) Time Spent: 40m (was: 0.5h) > Website feed is broken due to license header > > > Key: BEAM-5174 > URL: https://issues.apache.org/jira/browse/BEAM-5174 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Maximilian Michels >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > The feed at https://beam.apache.org/feed.xml starts out with a license header > which breaks the XML support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
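One way a leading license header can break a feed is by pushing the XML declaration off the first position of the document. The sketch below shows that failure mode with Python's standard-library parser. It assumes the header is an XML comment placed before the declaration, which the issue does not state; `LICENSE_HEADER`, `FEED_BODY` and `parses` are illustrative names, not the contents of the real feed.

```python
import xml.etree.ElementTree as ET

# Assumed shapes, for illustration only: the real feed.xml is not shown in the issue.
LICENSE_HEADER = "<!-- Licensed under the Apache License, Version 2.0 -->\n"
FEED_BODY = '<?xml version="1.0"?><feed><title>Beam</title></feed>'


def parses(document):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        # An XML declaration is only allowed at the very start of the
        # document, so any content before it is rejected.
        return False
```

Under these assumptions `parses(FEED_BODY)` succeeds while `parses(LICENSE_HEADER + FEED_BODY)` fails; placing the comment after the declaration would parse again.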
[jira] [Work logged] (BEAM-7010) MAX/MIN(STRING)
[ https://issues.apache.org/jira/browse/BEAM-7010?focusedWorklogId=223921=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223921 ] ASF GitHub Bot logged work on BEAM-7010: Author: ASF GitHub Bot Created on: 05/Apr/19 22:59 Start Date: 05/Apr/19 22:59 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #8230: [BEAM-7010] MAX/MIN(STRING) URL: https://github.com/apache/beam/pull/8230#issuecomment-480138058 R: @kanterov @kennknowles This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223921) Time Spent: 0.5h (was: 20m) > MAX/MIN(STRING) > --- > > Key: BEAM-7010 > URL: https://issues.apache.org/jira/browse/BEAM-7010 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/transform/BeamBuiltinAggregations.java#L108 > BeamSQL does not accept STRING as MIN/MAX's input parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7026) Python SDK: Unable to obtain the PCollection for output tags which are not consumed by a downstream step.
Alex Amato created BEAM-7026: Summary: Python SDK: Unable to obtain the PCollection for output tags which are not consumed by a downstream step. Key: BEAM-7026 URL: https://issues.apache.org/jira/browse/BEAM-7026 Project: Beam Issue Type: New Feature Components: sdk-py-harness Reporter: Alex Amato I noticed that we are not able to convert the output tag+transform to the PCollection name for metrics (element count/mean byte count) if the PCollections for the outputted tags are not consumed by a downstream step. This isn't critical, as (1) arguably there is no PCollection at all, and (2) output-but-not-consumed PCollections are not critical to count metrics on, as those can be optimized away entirely (no need to do any work, collect metrics, etc. for an unconsumed PCollection). However, we are able to count this, but we are unable to assign a PCollection name for it, as in this case there is no information about that output tag defined in the bundle descriptor. The alternative fix is to make sure that it's always available, even if not consumed. Pablo and I looked into this a bit, and he believed it would be possible in pvalue.py's DoOutputsTuple class. This fix would require calling __getitem__ on all tags to initialize them properly. However, I had some trouble doing this, as this class is a bit strange since it overrides __getattr__. I found weird behaviors when adding functionality to this code. I don't really get how the code functions today, as its own instance variable usage should trigger the custom __getattr__ code, yet we seem to be using these attrs normally with self.X usages. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
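On the `__getattr__` puzzle raised above: in Python, `__getattr__` is called only after normal attribute lookup fails, so attributes stored on the instance by `__init__` are read through `self.X` without reaching it (`__getattribute__` is the hook that intercepts every access). The sketch below demonstrates this with a hypothetical `TaggedOutputs` class; it is not Beam's `DoOutputsTuple`, and `materialize_all` is an assumed helper showing what calling `__getitem__` on every tag could look like.

```python
class TaggedOutputs:
    """Hypothetical sketch of a tag container; not Beam's DoOutputsTuple."""

    def __init__(self, tags):
        # These live in the instance __dict__, so reading them via self.X
        # succeeds through normal lookup and never reaches __getattr__.
        self._tags = tuple(tags)
        self._pcolls = {}
        self.fallback_lookups = []

    def __getattr__(self, name):
        # Reached only when normal attribute lookup has already failed.
        if name.startswith("_"):
            raise AttributeError(name)
        self.fallback_lookups.append(name)
        return self[name]

    def __getitem__(self, tag):
        # Lazily create a placeholder "PCollection" for the tag.
        if tag not in self._pcolls:
            self._pcolls[tag] = "pcollection for %s" % tag
        return self._pcolls[tag]

    def materialize_all(self):
        # Assumed helper: touch every declared tag so each one is initialized.
        for tag in self._tags:
            self[tag]
        return dict(self._pcolls)
```

Reading `outs._tags` leaves `fallback_lookups` empty, while `outs.errors` goes through `__getattr__` and records the lookup.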
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223917 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:51 Start Date: 05/Apr/19 22:51 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#discussion_r272764908 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java ## @@ -156,8 +156,26 @@ public static boolean isStringType(FieldType fieldType) { .put(TIMESTAMP_WITH_LOCAL_TZ, SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE) .build(); - private static final BiMap CALCITE_TO_BEAM_TYPE_MAPPING = - BEAM_TO_CALCITE_TYPE_MAPPING.inverse(); + private static final ImmutableMap CALCITE_TO_BEAM_TYPE_MAPPING = + ImmutableMap.builder() + .put(SqlTypeName.TINYINT, TINY_INT) + .put(SqlTypeName.SMALLINT, SMALL_INT) + .put(SqlTypeName.INTEGER, INTEGER) + .put(SqlTypeName.BIGINT, BIG_INT) + .put(SqlTypeName.FLOAT, FLOAT) + .put(SqlTypeName.DOUBLE, DOUBLE) + .put(SqlTypeName.DECIMAL, DECIMAL) + .put(SqlTypeName.BOOLEAN, BOOLEAN) + .put(SqlTypeName.VARBINARY, VARBINARY) + .put(SqlTypeName.BINARY, VARBINARY) Review comment: Agree on the logical types and leave it for future work for now. It's a legacy code that defines `public static final FieldType VARBINARY = FieldType.BYTES;`. I am reusing this style. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223917) Time Spent: 1h 10m (was: 1h) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7025) Python pipelines should not be able to use output tags that are not defined in with_outputs.
[ https://issues.apache.org/jira/browse/BEAM-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Amato updated BEAM-7025: - Description: This is an indication of a user misconfiguring a Beam pipeline. This is because it's not possible to get a handle to use the produced PCollection for that output tag if .with_outputs is not used. So this should be disallowed entirely; a runtime exception should be thrown. Note: The bundle descriptor knows which tags are available for each step, so this can be detected at runtime. But we need to be careful not to test it on every element, for performance reasons. I suspect it's possible to detect it statically, but that may require collecting more information. There should already be a code path that collects the bundle's elements into the different tags as they are output; at that point, at the end of bundle execution, we can check for it, which would be cheap. was: This is an indication of a user misconfiguring a Beam pipeline. This is because it's not possible to get a handle to use the produced PCollection for that output tag if .with_outputs is not used. So this should be disallowed entirely; a runtime exception should be thrown. > Python pipelines should not be able to use output tags that are not defined > in with_outputs. > > > Key: BEAM-7025 > URL: https://issues.apache.org/jira/browse/BEAM-7025 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Alex Amato >Priority: Major > > This is an indication of a user misconfiguring a Beam pipeline. > This is because it's not possible to get a handle to use the produced > PCollection for that output tag if .with_outputs is not used. So this should > be disallowed entirely; a runtime exception should be thrown. > Note: > The bundle descriptor knows which tags are available for each step, so this > can be detected at runtime. 
But we need to be careful not to test it on every > element, for performance reasons. > > I suspect it's possible to detect it statically, but that may require collecting > more information. > > There should already be a code path that collects the bundle's elements > into the different tags as they are output; at that point, at the end of > bundle execution, we can check for it, which would be cheap. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
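The end-of-bundle check proposed above could look roughly like the following sketch, which validates tags once per bundle instead of once per element. `BundleOutputs`, `UndeclaredTagError`, `emit` and `finish_bundle` are hypothetical names chosen for illustration; they are not the Beam SDK harness API.

```python
from collections import defaultdict


class UndeclaredTagError(RuntimeError):
    """Raised when a bundle emitted to a tag that was never declared."""


class BundleOutputs:
    """Hypothetical per-bundle output collector; not a Beam SDK class."""

    def __init__(self, declared_tags):
        # Tags the step declared up front (e.g. via with_outputs).
        self._declared = frozenset(declared_tags)
        self._by_tag = defaultdict(list)

    def emit(self, tag, element):
        # Hot path: no validation here, so per-element cost stays flat.
        self._by_tag[tag].append(element)

    def finish_bundle(self):
        # One set difference per bundle, independent of the element count.
        undeclared = sorted(set(self._by_tag) - self._declared)
        if undeclared:
            raise UndeclaredTagError(
                "output tags used but not declared: %s" % ", ".join(undeclared))
        return dict(self._by_tag)
```

The trade-off in this sketch is that a misconfigured tag is reported only when the bundle finishes, not at the first offending element.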
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223914=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223914 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:47 Start Date: 05/Apr/19 22:47 Worklog Time Spent: 10m Work Description: kennknowles commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#discussion_r272764191 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java ## @@ -156,8 +156,26 @@ public static boolean isStringType(FieldType fieldType) { .put(TIMESTAMP_WITH_LOCAL_TZ, SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE) .build(); - private static final BiMap CALCITE_TO_BEAM_TYPE_MAPPING = - BEAM_TO_CALCITE_TYPE_MAPPING.inverse(); + private static final ImmutableMap CALCITE_TO_BEAM_TYPE_MAPPING = + ImmutableMap.builder() + .put(SqlTypeName.TINYINT, TINY_INT) + .put(SqlTypeName.SMALLINT, SMALL_INT) + .put(SqlTypeName.INTEGER, INTEGER) + .put(SqlTypeName.BIGINT, BIG_INT) + .put(SqlTypeName.FLOAT, FLOAT) + .put(SqlTypeName.DOUBLE, DOUBLE) + .put(SqlTypeName.DECIMAL, DECIMAL) + .put(SqlTypeName.BOOLEAN, BOOLEAN) + .put(SqlTypeName.VARBINARY, VARBINARY) + .put(SqlTypeName.BINARY, VARBINARY) Review comment: OK. It is fine and expected that the mapping might not be invertible. We want users to be able to feed schema PCollection in to SQL, but also compile SQL down to schema. But they don't have to be an exact match. In fact, I think Beam schema should probably reduce to only very simple types and the rest should be logical types that SQL defines. TODO later. You will need to replace `VARBINARY` with `BYTES` on the right hand side, no? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223914) Time Spent: 1h (was: 50m) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
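The point about the mapping not being invertible can be shown with plain dictionaries: once two Calcite types map to the same Beam type, the reverse table has to be written out separately, and inverting it again loses an entry. In the sketch below the type names are plain strings used for illustration (the three-entry table is an assumed subset), and `invert` is a hypothetical helper; this is not the `CalciteUtils` code.

```python
# Illustrative subset of a Beam -> Calcite table; strings stand in for the real types.
BEAM_TO_CALCITE = {
    "BYTES": "VARBINARY",
    "INT32": "INTEGER",
    "BOOLEAN": "BOOLEAN",
}


def invert(mapping):
    """Swap keys and values; later duplicates silently overwrite earlier ones."""
    return {value: key for key, value in mapping.items()}


# The reverse direction starts from the inverse and then adds the extra
# many-to-one entry: both BINARY and VARBINARY become BYTES.
CALCITE_TO_BEAM = dict(invert(BEAM_TO_CALCITE), BINARY="BYTES")

# Inverting the reverse table no longer round-trips: two Calcite names
# collapse onto the single Beam name BYTES.
ROUND_TRIP = invert(CALCITE_TO_BEAM)
```

Here `CALCITE_TO_BEAM` has four entries but `ROUND_TRIP` only three, which is why a strict one-to-one structure cannot hold both BINARY and VARBINARY.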
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223912=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223912 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:44 Start Date: 05/Apr/19 22:44 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#discussion_r272763788 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java ## @@ -156,8 +156,26 @@ public static boolean isStringType(FieldType fieldType) { .put(TIMESTAMP_WITH_LOCAL_TZ, SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE) .build(); - private static final BiMap CALCITE_TO_BEAM_TYPE_MAPPING = - BEAM_TO_CALCITE_TYPE_MAPPING.inverse(); + private static final ImmutableMap CALCITE_TO_BEAM_TYPE_MAPPING = + ImmutableMap.builder() + .put(SqlTypeName.TINYINT, TINY_INT) + .put(SqlTypeName.SMALLINT, SMALL_INT) + .put(SqlTypeName.INTEGER, INTEGER) + .put(SqlTypeName.BIGINT, BIG_INT) + .put(SqlTypeName.FLOAT, FLOAT) + .put(SqlTypeName.DOUBLE, DOUBLE) + .put(SqlTypeName.DECIMAL, DECIMAL) + .put(SqlTypeName.BOOLEAN, BOOLEAN) + .put(SqlTypeName.VARBINARY, VARBINARY) + .put(SqlTypeName.BINARY, VARBINARY) Review comment: BINARY might generate from `select b'test_string'`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223912) Time Spent: 40m (was: 0.5h) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223913 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:44 Start Date: 05/Apr/19 22:44 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#discussion_r272763788 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java ## @@ -156,8 +156,26 @@ public static boolean isStringType(FieldType fieldType) { .put(TIMESTAMP_WITH_LOCAL_TZ, SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE) .build(); - private static final BiMap CALCITE_TO_BEAM_TYPE_MAPPING = - BEAM_TO_CALCITE_TYPE_MAPPING.inverse(); + private static final ImmutableMap CALCITE_TO_BEAM_TYPE_MAPPING = + ImmutableMap.builder() + .put(SqlTypeName.TINYINT, TINY_INT) + .put(SqlTypeName.SMALLINT, SMALL_INT) + .put(SqlTypeName.INTEGER, INTEGER) + .put(SqlTypeName.BIGINT, BIG_INT) + .put(SqlTypeName.FLOAT, FLOAT) + .put(SqlTypeName.DOUBLE, DOUBLE) + .put(SqlTypeName.DECIMAL, DECIMAL) + .put(SqlTypeName.BOOLEAN, BOOLEAN) + .put(SqlTypeName.VARBINARY, VARBINARY) + .put(SqlTypeName.BINARY, VARBINARY) Review comment: BINARY might generate from `select b'test_string'`. Calcite treats it as a fixed length byte array. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223913) Time Spent: 50m (was: 40m) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223908 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:43 Start Date: 05/Apr/19 22:43 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242 Currently the mapping between Beam FieldType and Calcite SqlTypeName is one-to-one. However, Calcite has both BINARY and VARBINARY, and both should map to BYTES in the Beam schema. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | --- | --- | --- | --- Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/) [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) | --- | --- | --- Pre-Commit Tests Status (on master branch)
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223909 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:43 Start Date: 05/Apr/19 22:43 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#discussion_r272763687 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/utils/CalciteUtils.java ## @@ -156,8 +156,26 @@ public static boolean isStringType(FieldType fieldType) { .put(TIMESTAMP_WITH_LOCAL_TZ, SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE) .build(); - private static final BiMap CALCITE_TO_BEAM_TYPE_MAPPING = - BEAM_TO_CALCITE_TYPE_MAPPING.inverse(); + private static final ImmutableMap CALCITE_TO_BEAM_TYPE_MAPPING = + ImmutableMap.builder() + .put(SqlTypeName.TINYINT, TINY_INT) + .put(SqlTypeName.SMALLINT, SMALL_INT) + .put(SqlTypeName.INTEGER, INTEGER) + .put(SqlTypeName.BIGINT, BIG_INT) + .put(SqlTypeName.FLOAT, FLOAT) + .put(SqlTypeName.DOUBLE, DOUBLE) + .put(SqlTypeName.DECIMAL, DECIMAL) + .put(SqlTypeName.BOOLEAN, BOOLEAN) + .put(SqlTypeName.VARBINARY, VARBINARY) + .put(SqlTypeName.BINARY, VARBINARY) Review comment: Both BINARY and VARBINARY should be mapped to `VARBINARY`, which is FieldType.BYTES. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223909) Time Spent: 20m (was: 10m) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
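The mapping change discussed in this review can be pictured with a small Python analogy. This is only a sketch: the string names below are hypothetical stand-ins for the Beam field types and Calcite `SqlTypeName` values in the diff, not the real Java API. Inverting a one-to-one mapping yields exactly one Calcite name per Beam type, so a second Calcite type that shares a Beam type has to be added to an explicit reverse mapping.

```python
# Hypothetical stand-ins: keys play the role of Beam field types,
# values the role of Calcite SqlTypeName entries.
BEAM_TO_CALCITE = {
    "INT32": "INTEGER",
    "INT64": "BIGINT",
    "BOOLEAN": "BOOLEAN",
    "BYTES": "VARBINARY",
}

# Inverting the one-to-one mapping only covers VARBINARY ...
CALCITE_TO_BEAM = {calcite: beam for beam, calcite in BEAM_TO_CALCITE.items()}

# ... so BINARY is added explicitly and shares the same Beam type.
CALCITE_TO_BEAM["BINARY"] = "BYTES"


def to_beam_type(calcite_type: str) -> str:
    """Look up the Beam type for a Calcite type name, failing loudly if unmapped."""
    try:
        return CALCITE_TO_BEAM[calcite_type]
    except KeyError:
        raise ValueError(f"No Beam type mapped for Calcite type {calcite_type}") from None
```

The reverse direction is no longer derivable from the forward one, which is why the diff spells out every entry instead of calling `inverse()`.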
[jira] [Work logged] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
[ https://issues.apache.org/jira/browse/BEAM-7024?focusedWorklogId=223910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223910 ] ASF GitHub Bot logged work on BEAM-7024: Author: ASF GitHub Bot Created on: 05/Apr/19 22:43 Start Date: 05/Apr/19 22:43 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #8242: [BEAM-7024] Calcite BINARY to Beam Schema BYTES missing in CalciteUtils URL: https://github.com/apache/beam/pull/8242#issuecomment-480444718 @kennknowles @apilloud @akedin This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223910) Time Spent: 0.5h (was: 20m) > Calcite BINARY to Beam Schema BYTES missing in CalciteUtils > --- > > Key: BEAM-7024 > URL: https://issues.apache.org/jira/browse/BEAM-7024 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223906&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223906 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 22:40 Start Date: 05/Apr/19 22:40 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480444051 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223906) Time Spent: 2h (was: 1h 50m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7025) Python pipelines should not be able to use output tags that are not defined in with_outputs.
[ https://issues.apache.org/jira/browse/BEAM-7025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Amato updated BEAM-7025: - Summary: Python pipelines should not be able to use output tags that are not defined in with_outputs. (was: Python pipelines should not be able to use output tags that are not defiend in with_outputs.) > Python pipelines should not be able to use output tags that are not defined > in with_outputs. > > > Key: BEAM-7025 > URL: https://issues.apache.org/jira/browse/BEAM-7025 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Alex Amato >Priority: Major > > Using such a tag indicates that a user has misconfigured a Beam pipeline, > because it is not possible to get a handle to the PCollection produced > for that output tag if .with_outputs is not used. So this should > be disallowed entirely, and a runtime exception should be thrown. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7025) Python pipelines should not be able to use output tags that are not defiend in with_outputs.
Alex Amato created BEAM-7025: Summary: Python pipelines should not be able to use output tags that are not defiend in with_outputs. Key: BEAM-7025 URL: https://issues.apache.org/jira/browse/BEAM-7025 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Alex Amato Using such a tag indicates that a user has misconfigured a Beam pipeline, because it is not possible to get a handle to the PCollection produced for that output tag if .with_outputs is not used. So this should be disallowed entirely, and a runtime exception should be thrown. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
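The check requested in BEAM-7025 could look roughly like the following. This is a minimal sketch in plain Python, not Beam code: `UndeclaredOutputTagError` and `check_output_tag` are hypothetical names, and the set of declared tags stands in for whatever was passed to `.with_outputs`.

```python
class UndeclaredOutputTagError(RuntimeError):
    """Raised when an element is emitted to a tag that was never declared."""


def check_output_tag(tag, declared_tags):
    """Return the tag if it was declared up front, otherwise raise at run time."""
    if tag not in declared_tags:
        raise UndeclaredOutputTagError(
            f"Output tag {tag!r} is not declared; declared tags: {sorted(declared_tags)}"
        )
    return tag
```

Raising instead of silently dropping the element makes the misconfiguration visible at the point where the undeclared tag is first used.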
[jira] [Assigned] (BEAM-7021) ToString transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay reassigned BEAM-7021: - Assignee: (was: Ahmet Altay) > ToString transform for Python SDK > - > > Key: BEAM-7021 > URL: https://issues.apache.org/jira/browse/BEAM-7021 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Rose Nguyen >Priority: Minor > > PTransforms for converting a PCollection or PCollection > Iterable to a PCollection String > It should offer the same API as its Java counterpart: > [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7021) ToString transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay updated BEAM-7021: -- Labels: starter (was: ) > ToString transform for Python SDK > - > > Key: BEAM-7021 > URL: https://issues.apache.org/jira/browse/BEAM-7021 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Rose Nguyen >Priority: Minor > Labels: starter > > PTransforms for converting a PCollection or PCollection > Iterable to a PCollection String > It should offer the same API as its Java counterpart: > [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-7023) WithKeys transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay reassigned BEAM-7023: - Assignee: (was: Ahmet Altay) > WithKeys transform for Python SDK > - > > Key: BEAM-7023 > URL: https://issues.apache.org/jira/browse/BEAM-7023 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Rose Nguyen >Priority: Minor > Labels: starter > > WithKeys takes a PCollection<V> and either a constant key of type K or a > function from V to K, and returns a PCollection<KV<K, V>>, where each of the > values in the input PCollection has been paired with either the constant key > or a key computed from the value. > It should offer the same API as its Java counterpart: > [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/WithKeys.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7023) WithKeys transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Altay updated BEAM-7023: -- Labels: starter (was: ) > WithKeys transform for Python SDK > - > > Key: BEAM-7023 > URL: https://issues.apache.org/jira/browse/BEAM-7023 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Rose Nguyen >Assignee: Ahmet Altay >Priority: Minor > Labels: starter > > WithKeys takes a PCollection<V> and either a constant key of type K or a > function from V to K, and returns a PCollection<KV<K, V>>, where each of the > values in the input PCollection has been paired with either the constant key > or a key computed from the value. > It should offer the same API as its Java counterpart: > [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/WithKeys.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5174) Website feed is broken due to license header
[ https://issues.apache.org/jira/browse/BEAM-5174?focusedWorklogId=223902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223902 ] ASF GitHub Bot logged work on BEAM-5174: Author: ASF GitHub Bot Created on: 05/Apr/19 22:28 Start Date: 05/Apr/19 22:28 Worklog Time Spent: 10m Work Description: melap commented on issue #8231: [BEAM-5174] Fixes website feed URL: https://github.com/apache/beam/pull/8231#issuecomment-480441835 Run Website_Stage_GCS PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223902) Time Spent: 0.5h (was: 20m) > Website feed is broken due to license header > > > Key: BEAM-5174 > URL: https://issues.apache.org/jira/browse/BEAM-5174 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Maximilian Michels >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > The feed at https://beam.apache.org/feed.xml starts out with a license header > which breaks the XML support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7024) Calcite BINARY to Beam Schema BYTES missing in CalciteUtils
Rui Wang created BEAM-7024: -- Summary: Calcite BINARY to Beam Schema BYTES missing in CalciteUtils Key: BEAM-7024 URL: https://issues.apache.org/jira/browse/BEAM-7024 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Rui Wang Assignee: Rui Wang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6976) Incorrect Doc/Description of HadoopFormatIO On Partitioner
[ https://issues.apache.org/jira/browse/BEAM-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-6976: --- Summary: Incorrect Doc/Description of HadoopFormatIO On Partitioner (was: Incorrect Doc/Description of HadoopFormatIO On Paritioner) > Incorrect Doc/Description of HadoopFormatIO On Partitioner > -- > > Key: BEAM-6976 > URL: https://issues.apache.org/jira/browse/BEAM-6976 > Project: Beam > Issue Type: Bug > Components: io-java-hadoop >Affects Versions: 2.10.0, 2.11.0 >Reporter: zhenzhao wang >Priority: Minor > Labels: easyfix > Fix For: Not applicable > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the current doc > {code:java} > myHadoopConfiguration.setClass("mapreduce.job.output.key.class", >MyDbOutputFormatKeyClass, Object.class); > myHadoopConfiguration.setClass("mapreduce.job.output.value.class", >MyDbOutputFormatValueClass, Object.class); > myHadoopConfiguration.setClass("mapreduce.job.output.value.class", >MyPartitionerClass, Object.class); > {code} > The third call should instead be myHadoopConfiguration.setClass("mapreduce.job.partitioner.class", >MyPartitionerClass, Object.class); > The error is found in both website/src/documentation/io/built-in-hadoop.md > and > sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
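Why the documented snippet is harmful can be shown with a plain dictionary. This is a sketch that assumes the configuration behaves like a key-value store in which a later set on the same key replaces the earlier value; `build_config` is a hypothetical helper, and the class names are kept as strings.

```python
def build_config(partitioner_key: str) -> dict:
    """Apply the three documented settings, using the given key for the partitioner."""
    config = {}
    config["mapreduce.job.output.key.class"] = "MyDbOutputFormatKeyClass"
    config["mapreduce.job.output.value.class"] = "MyDbOutputFormatValueClass"
    config[partitioner_key] = "MyPartitionerClass"
    return config


# As documented: the partitioner overwrites the output value class.
documented = build_config("mapreduce.job.output.value.class")

# As proposed in the issue: the partitioner gets its own key.
corrected = build_config("mapreduce.job.partitioner.class")
```

With the documented key, the value class is lost and no partitioner is registered at all; with the corrected key, all three settings survive.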
[jira] [Created] (BEAM-7023) WithKeys transform for Python SDK
Rose Nguyen created BEAM-7023: - Summary: WithKeys transform for Python SDK Key: BEAM-7023 URL: https://issues.apache.org/jira/browse/BEAM-7023 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Rose Nguyen Assignee: Ahmet Altay WithKeys takes a PCollection<V> and either a constant key of type K or a function from V to K, and returns a PCollection<KV<K, V>>, where each of the values in the input PCollection has been paired with either the constant key or a key computed from the value. It should offer the same API as its Java counterpart: [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/WithKeys.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
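The semantics described for WithKeys can be sketched over an ordinary list. This is not a Beam transform: `with_keys` is a hypothetical helper that only illustrates the pairing rule, with a Python list standing in for the PCollection and tuples standing in for KV pairs.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar, Union

K = TypeVar("K")
V = TypeVar("V")


def with_keys(values: Iterable[V], key: Union[K, Callable[[V], K]]) -> List[Tuple[K, V]]:
    """Pair each value with a constant key, or with a key computed from the value."""
    if callable(key):
        # Key function: compute a key per element.
        return [(key(v), v) for v in values]
    # Constant key: every element gets the same key.
    return [(key, v) for v in values]
```

For example, `with_keys(["a", "bb"], len)` keys each string by its length, while `with_keys([1, 2], "k")` gives every element the key `"k"`. One design consequence of dispatching on `callable(key)` is that a constant key which is itself callable would be treated as a key function; an actual API might prefer two separate entry points.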
[jira] [Created] (BEAM-7022) BigQueryIO.write.withExtendedErrorInfo() does not catch missing tables (and maybe datasets)
Pierig Le Saux created BEAM-7022: Summary: BigQueryIO.write.withExtendedErrorInfo() does not catch missing tables (and maybe datasets) Key: BEAM-7022 URL: https://issues.apache.org/jira/browse/BEAM-7022 Project: Beam Issue Type: Bug Components: io-java-gcp Affects Versions: 2.11.0 Reporter: Pierig Le Saux In Beam versions 2.7.0 to 2.9.0, a missing table produces an exception in Dataflow that .getFailedInsertsWithErr() doesn't catch. We get a GoogleJsonResponseException 404. In Beam versions 2.10.0 and 2.11.0, the step just silently gets stuck (logging every 5 minutes). I would expect this error to be "catchable", just as a missing field in a table is. I haven't checked the behaviour with a missing dataset yet, but I would like to be able to take action on this kind of error as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7021) ToString transform for Python SDK
Rose Nguyen created BEAM-7021: - Summary: ToString transform for Python SDK Key: BEAM-7021 URL: https://issues.apache.org/jira/browse/BEAM-7021 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Rose Nguyen Assignee: Ahmet Altay PTransforms for converting a PCollection or PCollection Iterable to a PCollection String It should offer the same API as its Java counterpart: [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ToString.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
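The conversions that a ToString transform would apply per element can be sketched as plain functions. These are hypothetical helpers, not Beam code, and the comma used as the default delimiter is an assumption made for this illustration.

```python
def element_to_string(element) -> str:
    """Convert a single element to its string form."""
    return str(element)


def kv_to_string(kv, delimiter: str = ",") -> str:
    """Join the key and value of a pair with the delimiter."""
    key, value = kv
    return f"{key}{delimiter}{value}"


def iterable_to_string(items, delimiter: str = ",") -> str:
    """Join the string forms of all items with the delimiter."""
    return delimiter.join(str(item) for item in items)
```

For example, `kv_to_string(("a", 1))` yields `"a,1"` and `iterable_to_string([1, 2, 3], "|")` yields `"1|2|3"`.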
[jira] [Work logged] (BEAM-7020) Reduce the log severity of profiling agent discovery
[ https://issues.apache.org/jira/browse/BEAM-7020?focusedWorklogId=223888&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223888 ] ASF GitHub Bot logged work on BEAM-7020: Author: ASF GitHub Bot Created on: 05/Apr/19 21:59 Start Date: 05/Apr/19 21:59 Worklog Time Spent: 10m Work Description: davidyan74 commented on issue #8241: [BEAM-7020] Reduced log severity for profiling agent discovery URL: https://github.com/apache/beam/pull/8241#issuecomment-480435208 R: @pabloem This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223888) Time Spent: 20m (was: 10m) > Reduce the log severity of profiling agent discovery > > > Key: BEAM-7020 > URL: https://issues.apache.org/jira/browse/BEAM-7020 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: David Yan >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Example: > [https://github.com/apache/beam/blob/b953645ed6db837d24284d7fe1fe091e7309f821/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/profiler/ScopedProfiler.java#L138] > These should not be at warning severity, even if the profiling agent is not > present, since in most cases users do not run their jobs with profiling. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7020) Reduce the log severity of profiling agent discovery
[ https://issues.apache.org/jira/browse/BEAM-7020?focusedWorklogId=223887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223887 ] ASF GitHub Bot logged work on BEAM-7020: Author: ASF GitHub Bot Created on: 05/Apr/19 21:58 Start Date: 05/Apr/19 21:58 Worklog Time Spent: 10m Work Description: davidyan74 commented on pull request #8241: [BEAM-7020] Reduced log severity for profiling agent discovery URL: https://github.com/apache/beam/pull/8241 These should not be at warning severity, even if the profiling agent is not present, since in most cases users do not run their jobs with profiling.
[jira] [Created] (BEAM-7020) Reduce the log severity of profiling agent discovery
David Yan created BEAM-7020: --- Summary: Reduce the log severity of profiling agent discovery Key: BEAM-7020 URL: https://issues.apache.org/jira/browse/BEAM-7020 Project: Beam Issue Type: Improvement Components: runner-dataflow Reporter: David Yan Example: [https://github.com/apache/beam/blob/b953645ed6db837d24284d7fe1fe091e7309f821/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/profiler/ScopedProfiler.java#L138] These should not be at warning severity, even if the profiling agent is not present, since in most cases users do not run their jobs with profiling. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
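The idea behind BEAM-7020 can be sketched with Python's standard `logging` module. The affected worker code is Java; the logger name and the `discover_profiling_agent` function below are hypothetical and only illustrate logging the common "agent absent" case below warning severity.

```python
import logging

logger = logging.getLogger("profiler_discovery_sketch")


def discover_profiling_agent(agent_present: bool) -> bool:
    """Report whether profiling is available without alarming users who never enabled it."""
    if agent_present:
        logger.info("Profiling agent found; profiling is available.")
        return True
    # Absence is the common case, so it is logged below warning severity.
    logger.debug("Profiling agent not found; profiling is disabled.")
    return False
```

With a typical configuration that shows warnings and above, jobs run without profiling would then produce no log noise from agent discovery.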
[jira] [Assigned] (BEAM-7018) Regex transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rose Nguyen reassigned BEAM-7018: - Assignee: Ahmet Altay > Regex transform for Python SDK > -- > > Key: BEAM-7018 > URL: https://issues.apache.org/jira/browse/BEAM-7018 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Rose Nguyen >Assignee: Ahmet Altay >Priority: Minor > > PTransforms to use Regular Expressions to process elements in a PCollection > It should offer the same API as its Java counterpart: > [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7019) Reify transform for Python SDK
Rose Nguyen created BEAM-7019: - Summary: Reify transform for Python SDK Key: BEAM-7019 URL: https://issues.apache.org/jira/browse/BEAM-7019 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Rose Nguyen Assignee: Ahmet Altay PTransforms for converting between explicit and implicit form of various Beam values. It should offer the same API as its Java counterpart: [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reify.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7018) Regex transform for Python SDK
Rose Nguyen created BEAM-7018: - Summary: Regex transform for Python SDK Key: BEAM-7018 URL: https://issues.apache.org/jira/browse/BEAM-7018 Project: Beam Issue Type: New Feature Components: sdk-py-core Reporter: Rose Nguyen PTransforms to use Regular Expressions to process elements in a PCollection It should offer the same API as its Java counterpart: [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
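Two of the operations such a Regex transform might offer can be sketched with the standard `re` module. The helpers below are hypothetical and operate on a plain list rather than a PCollection; the split between a full-line match and a search anywhere in the line is an assumption about the intended behaviour.

```python
import re
from typing import Iterable, List


def regex_matches(lines: Iterable[str], pattern: str, group: int = 0) -> List[str]:
    """Keep the requested group of every line that the pattern matches in full."""
    compiled = re.compile(pattern)
    results = []
    for line in lines:
        match = compiled.fullmatch(line)
        if match:
            results.append(match.group(group))
    return results


def regex_find(lines: Iterable[str], pattern: str, group: int = 0) -> List[str]:
    """Keep the requested group of the first match found anywhere in each line."""
    compiled = re.compile(pattern)
    results = []
    for line in lines:
        match = compiled.search(line)
        if match:
            results.append(match.group(group))
    return results
```

Lines without a match are dropped in both helpers, so the output can be shorter than the input.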
[jira] [Work logged] (BEAM-7017) Improve Apache Rat failure message during build time
[ https://issues.apache.org/jira/browse/BEAM-7017?focusedWorklogId=223878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223878 ] ASF GitHub Bot logged work on BEAM-7017: Author: ASF GitHub Bot Created on: 05/Apr/19 21:47 Start Date: 05/Apr/19 21:47 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8240: [BEAM-7017] Upgrade Apache Rat plugin version to have improved build output for failures. URL: https://github.com/apache/beam/pull/8240 Task output now looks like: > Task :rat FAILED FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':rat'. > A failure occurred while executing org.nosphere.apache.rat.RatWork > Apache Rat audit failure * Summary --- Generated at: 2019-04-05T14:39:11-07:00 Notes: 5 Binaries: 123 Archives: 4 Standards: 5105 Apache Licensed: 5104 Generated Documents: 0 JavaDocs are generated, thus a license header is optional. Generated files do not require license headers. 1 Unknown Licenses * Files with unapproved licenses: /usr/local/google/home/lcwik/git/beam/examples/java/src/main/java/org/apache/beam/examples/WordCount.java *
[jira] [Created] (BEAM-7017) Improve Apache Rat failure message during build time
Luke Cwik created BEAM-7017: --- Summary: Improve Apache Rat failure message during build time Key: BEAM-7017 URL: https://issues.apache.org/jira/browse/BEAM-7017 Project: Beam Issue Type: Improvement Components: build-system Reporter: Luke Cwik Assignee: Luke Cwik 0.4.0 of the plugin embeds the filenames with invalid licenses in the exception message of the failing task so it now appears at the bottom of the build. {code} > Task :rat FAILED FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':rat'. > A failure occurred while executing org.nosphere.apache.rat.RatWork > Apache Rat audit failure * Summary --- Generated at: 2019-04-05T14:39:11-07:00 Notes: 5 Binaries: 123 Archives: 4 Standards: 5105 Apache Licensed: 5104 Generated Documents: 0 JavaDocs are generated, thus a license header is optional. Generated files do not require license headers. 1 Unknown Licenses * Files with unapproved licenses: /usr/local/google/home/lcwik/git/beam/examples/java/src/main/java/org/apache/beam/examples/WordCount.java * {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6904) Test all Coder structuralValue implementations
[ https://issues.apache.org/jira/browse/BEAM-6904?focusedWorklogId=223865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223865 ] ASF GitHub Bot logged work on BEAM-6904: Author: ASF GitHub Bot Created on: 05/Apr/19 21:26 Start Date: 05/Apr/19 21:26 Worklog Time Spent: 10m Work Description: AlexKbit commented on issue #8208: [BEAM-6904] Add tests for structuralValue implementation in coders URL: https://github.com/apache/beam/pull/8208#issuecomment-480427331 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223865) Time Spent: 1h 20m (was: 1h 10m) > Test all Coder structuralValue implementations > -- > > Key: BEAM-6904 > URL: https://issues.apache.org/jira/browse/BEAM-6904 > Project: Beam > Issue Type: Test > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Alexander Savchenko >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Here is a test helper that checks that structuralValue is consistent with > equals: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L200 > And here is one that tests it another way: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L226 > With the deprecation of consistentWithEquals and implementing all the > structuralValue methods, we should add these tests to every coder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6904) Test all Coder structuralValue implementations
[ https://issues.apache.org/jira/browse/BEAM-6904?focusedWorklogId=223866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223866 ] ASF GitHub Bot logged work on BEAM-6904: Author: ASF GitHub Bot Created on: 05/Apr/19 21:26 Start Date: 05/Apr/19 21:26 Worklog Time Spent: 10m Work Description: AlexKbit commented on issue #8208: [BEAM-6904] Add tests for structuralValue implementation in coders URL: https://github.com/apache/beam/pull/8208#issuecomment-480427331 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223866) Time Spent: 1.5h (was: 1h 20m) > Test all Coder structuralValue implementations > -- > > Key: BEAM-6904 > URL: https://issues.apache.org/jira/browse/BEAM-6904 > Project: Beam > Issue Type: Test > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Alexander Savchenko >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Here is a test helper that checks that structuralValue is consistent with > equals: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L200 > And here is one that tests it another way: > https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/CoderProperties.java#L226 > With the deprecation of consistentWithEquals and implementing all the > structuralValue methods, we should add these tests to every coder. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
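The property that BEAM-6904 asks to test for every coder can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in Beam's CoderProperties: the `MiniCoder` interface, `ByteArrayMiniCoder`, and `StructuralValueCheck` are hypothetical names invented here, and the only idea taken from the issue is that values which are equal by content should yield equal structural values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a coder; this is NOT Beam's Coder API.
interface MiniCoder<T> {
    // Returns an object whose equals() compares the value by content.
    Object structuralValue(T value);
}

// byte[] inherits identity-based equals(), so the structural value
// copies the bytes into a List, which compares element by element.
class ByteArrayMiniCoder implements MiniCoder<byte[]> {
    @Override
    public Object structuralValue(byte[] value) {
        List<Byte> boxed = new ArrayList<>(value.length);
        for (byte b : value) {
            boxed.add(b);
        }
        return boxed;
    }
}

// Hypothetical helper: reports whether two values have equal structural values.
class StructuralValueCheck {
    static <T> boolean structuralValuesMatch(MiniCoder<T> coder, T a, T b) {
        return Objects.equals(coder.structuralValue(a), coder.structuralValue(b));
    }
}
```

A per-coder test would then assert that `structuralValuesMatch` returns true for two separately allocated arrays with the same contents and false for arrays that differ.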
[jira] [Work logged] (BEAM-5433) Cleanup Environment.url from beam_runner_api.proto
[ https://issues.apache.org/jira/browse/BEAM-5433?focusedWorklogId=223862&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223862 ] ASF GitHub Bot logged work on BEAM-5433: Author: ASF GitHub Bot Created on: 05/Apr/19 21:19 Start Date: 05/Apr/19 21:19 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8213: [BEAM-5433] Deprecate environment url field. URL: https://github.com/apache/beam/pull/8213#issuecomment-480425517 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223862) Time Spent: 50m (was: 40m) > Cleanup Environment.url from beam_runner_api.proto > -- > > Key: BEAM-5433 > URL: https://issues.apache.org/jira/browse/BEAM-5433 > Project: Beam > Issue Type: Task > Components: beam-model >Reporter: Ankur Goenka >Assignee: Luke Cwik >Priority: Major > Labels: triaged > Time Spent: 50m > Remaining Estimate: 0h > > The Environment URL field is deprecated and should be removed ASAP. > The current blocker to removing the field is compatibility with Dataflow, as > Dataflow has internal code which relies on it. > A vote has also passed to move the affected Dataflow code to open source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223860 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 05/Apr/19 21:15 Start Date: 05/Apr/19 21:15 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480424440 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223860) Time Spent: 1.5h (was: 1h 20m) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7001) Remove MonitoringInfoUrns proto from FnApi metrics.proto
[ https://issues.apache.org/jira/browse/BEAM-7001?focusedWorklogId=223853=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223853 ] ASF GitHub Bot logged work on BEAM-7001: Author: ASF GitHub Bot Created on: 05/Apr/19 21:05 Start Date: 05/Apr/19 21:05 Worklog Time Spent: 10m Work Description: pabloem commented on pull request #8203: [BEAM-7001] Remove MonitoringInfoUrns proto. URL: https://github.com/apache/beam/pull/8203 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223853) Time Spent: 1h 10m (was: 1h) > Remove MonitoringInfoUrns proto from FnApi metrics.proto > > > Key: BEAM-7001 > URL: https://issues.apache.org/jira/browse/BEAM-7001 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > All urns are defined in MonitoringInfoSpecs. We want to remove duplicate > definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223849=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223849 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 21:01 Start Date: 05/Apr/19 21:01 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480420771 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223849) Time Spent: 1h 50m (was: 1h 40m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223848=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223848 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 21:01 Start Date: 05/Apr/19 21:01 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480420688 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223848) Time Spent: 1h 40m (was: 1.5h) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223847=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223847 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 20:58 Start Date: 05/Apr/19 20:58 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480419803 Yes, I use go 1.12 locally, didn't know that the fmt had changed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223847) Time Spent: 1.5h (was: 1h 20m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223845=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223845 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 20:54 Start Date: 05/Apr/19 20:54 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480418796 Note: I think the formatting changes in the go code are from go version differences for whomever is running `go fmt`. I think many of these changed (and I updated in a PR) after I updated my local go to 1.12. Either way, they're benign if a bit noisy, and not worth nit-ing over. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223845) Time Spent: 1h 20m (was: 1h 10m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7016) DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never initialized
[ https://issues.apache.org/jira/browse/BEAM-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-7016: Labels: starter (was: ) > DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never > initialized > - > > Key: BEAM-7016 > URL: https://issues.apache.org/jira/browse/BEAM-7016 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Luke Cwik >Priority: Minor > Labels: starter > > DataflowWorkerLoggingInitializer should guard so that reset() can't run > unless initialize() is first run. We should also make initialize() not able > to be run unless it has never been run or reset() has been run. > Further details in ML thread: > https://lists.apache.org/thread.html/e0c04e1e5f1efb623256ace2393b5b9a899174e726e72f9903fedf19@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223844=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223844 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 20:52 Start Date: 05/Apr/19 20:52 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480418323 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223844) Time Spent: 1h 10m (was: 1h) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7016) DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never initialized
[ https://issues.apache.org/jira/browse/BEAM-7016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-7016: Description: DataflowWorkerLoggingInitializer should guard so that reset() can't run unless initialize() is first run. We should also make initialize() not able to be run unless it has never been run or reset() has been run. Further details in ML thread: https://lists.apache.org/thread.html/e0c04e1e5f1efb623256ace2393b5b9a899174e726e72f9903fedf19@%3Cdev.beam.apache.org%3E was:DataflowWorkerLoggingInitializer should guard so that reset() can't run unless initialize() is first run. We should also make initialize() not able to be run unless it has never been run or reset() has been run. > DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never > initialized > - > > Key: BEAM-7016 > URL: https://issues.apache.org/jira/browse/BEAM-7016 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Luke Cwik >Priority: Minor > > DataflowWorkerLoggingInitializer should guard so that reset() can't run > unless initialize() is first run. We should also make initialize() not able > to be run unless it has never been run or reset() has been run. > Further details in ML thread: > https://lists.apache.org/thread.html/e0c04e1e5f1efb623256ace2393b5b9a899174e726e72f9903fedf19@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7016) DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never initialized
Luke Cwik created BEAM-7016: --- Summary: DataflowWorkerLoggingInitializer.reset() sets System.out/err to null if never initialized Key: BEAM-7016 URL: https://issues.apache.org/jira/browse/BEAM-7016 Project: Beam Issue Type: Bug Components: runner-dataflow Reporter: Luke Cwik DataflowWorkerLoggingInitializer should guard so that reset() can't run unless initialize() is first run. We should also make initialize() not able to be run unless it has never been run or reset() has been run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
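The guard described in BEAM-7016 could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the actual DataflowWorkerLoggingInitializer: the class name `GuardedLoggingInitializer` and the `isInitialized()` accessor are hypothetical, and throwing `IllegalStateException` is one possible choice (a silent no-op would also satisfy the issue text).

```java
import java.io.PrintStream;

// Hypothetical guard; NOT the real DataflowWorkerLoggingInitializer.
final class GuardedLoggingInitializer {
    private static boolean initialized = false;
    private static PrintStream originalOut;
    private static PrintStream originalErr;

    private GuardedLoggingInitializer() {}

    // May run only if never run before, or after reset() has run.
    static synchronized void initialize() {
        if (initialized) {
            throw new IllegalStateException("initialize() already ran; call reset() first");
        }
        // Remember the current streams so reset() can restore them.
        originalOut = System.out;
        originalErr = System.err;
        // A real initializer would install logging-backed streams here.
        initialized = true;
    }

    // May run only after initialize(), so the saved streams are never null.
    static synchronized void reset() {
        if (!initialized) {
            throw new IllegalStateException("reset() called before initialize()");
        }
        System.setOut(originalOut);
        System.setErr(originalErr);
        originalOut = null;
        originalErr = null;
        initialized = false;
    }

    static synchronized boolean isInitialized() {
        return initialized;
    }
}
```

With this guard, a stray `reset()` fails fast instead of assigning the never-saved (null) streams to `System.out` and `System.err`.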
[jira] [Assigned] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Burke reassigned BEAM-5381: -- Assignee: Robert Burke > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Assignee: Robert Burke >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-7015) Have only a single definition of standard_coders.yaml
Luke Cwik created BEAM-7015: --- Summary: Have only a single definition of standard_coders.yaml Key: BEAM-7015 URL: https://issues.apache.org/jira/browse/BEAM-7015 Project: Beam Issue Type: Improvement Components: beam-model, sdk-py-core Reporter: Luke Cwik Assignee: Robert Bradshaw There are two copies of standard_coders.yaml defined: * https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml * https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/data/standard_coders.yaml The Python SDK specific instance should be removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223839 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 20:32 Start Date: 05/Apr/19 20:32 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480413083 R: @lostluck This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223839) Time Spent: 1h (was: 50m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=223833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223833 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 05/Apr/19 20:24 Start Date: 05/Apr/19 20:24 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238#issuecomment-480410815 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223833) Time Spent: 1h 40m (was: 1.5h) > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-6724) Go SDK on Dataflow processing step emits but it doesn't reach framework
[ https://issues.apache.org/jira/browse/BEAM-6724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Burke updated BEAM-6724: --- Component/s: sdk-go > Go SDK on Dataflow processing step emits but it doesn't reach framework > --- > > Key: BEAM-6724 > URL: https://issues.apache.org/jira/browse/BEAM-6724 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, sdk-go >Reporter: Robin Palotai >Priority: Minor > > When sending a job with a larger (not so large, 30MB) input to Dataflow > runner, I can see the worker logs that it emits everything in a given step, > but then the framework (not sure which one, harness or above) doesn't seem to > register that it reached the finish state (or maybe it doesn't reach the > finish state). > For a smaller input (~1MB) the whole pipeline runs fine. > Do you have any pointers how to debug (say, add logging) the cause of the > stuckness? Maybe there are some buffers not flushed? A state not > transitioned? Generally, which part of the beam go codebase is responsible > for these transitions? > Thank you! > Version: current Go SDK from HEAD + https://github.com/apache/beam/pull/7889 > patches to make plan check > Cloud dataflow console says "Apache Beam SDK for Go 0.5.0". > Logs: > 16:35:39.788 CET Starting MapTask stage s02 > 16:35:44.508 CET > 16:36:34.588 CET > 16:36:34.963 CET "DataSource: 2 elements in 53507659897 ns" > 16:40:45.227 CET > Initializing Go harness: /opt/apache/beam/boot --id=1 > --logging_endpoint=localhost:12370 --control_endpoint=localhost:12371 > --artifact_endpoint=localhost:12372 --provision_endpoint=localhost:12373 > --semi_persist_dir=/var/opt/google undefined > 16:42:29.027 CET Processing stuck in step s02 for at least 05m00s without > outputting or completing in state finish -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6976) Incorrect Doc/Description of HadoopFormatIO On Paritioner
[ https://issues.apache.org/jira/browse/BEAM-6976?focusedWorklogId=223826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223826 ] ASF GitHub Bot logged work on BEAM-6976: Author: ASF GitHub Bot Created on: 05/Apr/19 20:20 Start Date: 05/Apr/19 20:20 Worklog Time Spent: 10m Work Description: JohnZZGithub commented on issue #8201: [BEAM-6976] Fix incorrect doc of HadoopFormatIO on partitioner URL: https://github.com/apache/beam/pull/8201#issuecomment-480409639 Thanks for reviewing and merging it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223826) Time Spent: 40m (was: 0.5h) > Incorrect Doc/Description of HadoopFormatIO On Paritioner > - > > Key: BEAM-6976 > URL: https://issues.apache.org/jira/browse/BEAM-6976 > Project: Beam > Issue Type: Bug > Components: io-java-hadoop >Affects Versions: 2.10.0, 2.11.0 >Reporter: zhenzhao wang >Priority: Minor > Labels: easyfix > Fix For: Not applicable > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the current doc > {code:java} > myHadoopConfiguration.setClass("mapreduce.job.output.key.class", >MyDbOutputFormatKeyClass, Object.class); > myHadoopConfiguration.setClass("mapreduce.job.output.value.class", >MyDbOutputFormatValueClass, Object.class); > myHadoopConfiguration.setClass("mapreduce.job.output.value.class", >MyPartitionerClass, Object.class); > {code} > It should be myHadoopConfiguration.setClass("mapreduce.job.partitioner.class", >MyPartitionerClass, Object.class); > The errors are found in both website/src/documentation/io/built-in-hadoop.md > and > sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
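Why the documented snippet in BEAM-6976 is harmful can be shown with a small stand-in. The sketch below assumes the configuration keeps a single value per key, as a map does; it uses a plain `HashMap` rather than Hadoop's `Configuration`, and the class names `ConfigSketch`, `KeyClass`, `ValueClass`, and `PartitionerClass` are hypothetical placeholders. Only the three property names come from the issue.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a keyed configuration; Hadoop's Configuration is NOT used here.
class ConfigSketch {
    // Hypothetical placeholder classes for the key, value, and partitioner types.
    static class KeyClass {}
    static class ValueClass {}
    static class PartitionerClass {}

    // Mirrors the documented snippet: the value-class key is set twice.
    static Map<String, Class<?>> documentedConfig() {
        Map<String, Class<?>> conf = new HashMap<>();
        conf.put("mapreduce.job.output.key.class", KeyClass.class);
        conf.put("mapreduce.job.output.value.class", ValueClass.class);
        // Same key again: the value class is overwritten by the partitioner,
        // and no partitioner key is ever set.
        conf.put("mapreduce.job.output.value.class", PartitionerClass.class);
        return conf;
    }

    // Mirrors the fix proposed in the issue: a separate partitioner key.
    static Map<String, Class<?>> correctedConfig() {
        Map<String, Class<?>> conf = new HashMap<>();
        conf.put("mapreduce.job.output.key.class", KeyClass.class);
        conf.put("mapreduce.job.output.value.class", ValueClass.class);
        conf.put("mapreduce.job.partitioner.class", PartitionerClass.class);
        return conf;
    }
}
```

In the documented variant the output value class ends up pointing at the partitioner, which is why the issue asks for the third call to use `mapreduce.job.partitioner.class`.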
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=223814=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223814 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 05/Apr/19 20:15 Start Date: 05/Apr/19 20:15 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238#issuecomment-480407848 Run Go PostCommit Validating the test catches the issue, then the PR will be updated with the uncommented fix. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223814) Time Spent: 1.5h (was: 1h 20m) > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=223811=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223811 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 05/Apr/19 20:14 Start Date: 05/Apr/19 20:14 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238#issuecomment-480407848 Run Go PostCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223811) Time Spent: 1h 20m (was: 1h 10m) > Dataflow runner creates duplicate CoGBK step IDs > > > Key: BEAM-5381 > URL: https://issues.apache.org/jira/browse/BEAM-5381 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Cody Schroeder >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > https://gist.github.com/schroederc/699f42e407702cf9584b15d6885ad297 > If the attached {{beam_dataflow_err.go}} pipeline is executed with the > {{dataflow}} runner, GCP reports the following error: > {code} > Step with name e5 already exists. Duplicates are not allowed. > {code} > Executing the pipeline in {{--dry_run}} mode shows that "e5" is indeed > duplicated. If the CoGBK in the pipeline is not scoped, the duplication is > fixed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-5381) Dataflow runner creates duplicate CoGBK step IDs
[ https://issues.apache.org/jira/browse/BEAM-5381?focusedWorklogId=223810=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223810 ] ASF GitHub Bot logged work on BEAM-5381: Author: ASF GitHub Bot Created on: 05/Apr/19 20:14 Start Date: 05/Apr/19 20:14 Worklog Time Spent: 10m Work Description: lostluck commented on pull request #8238: [BEAM-5381] Fix duplicate nodes ID for CoGBKs with Scopes. URL: https://github.com/apache/beam/pull/8238 * The wrong ID was returned for CoGBK expansions, so the expanded transform didn't end up in the subtransform of the parent Scope. * CoGBK inject node outputs need to include the edge name as well, not just the input node name, otherwise there's an output name collision (caught by the Flink runner's consistency check). Originally in https://github.com/apache/beam/pull/7889, but includes an update to an integration test to prevent regression. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | --- | --- | --- | --- Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python3_Verify/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/) [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build
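Editor's sketch, not part of the original message: the second bullet of the PR #8238 description (inject node outputs must include the edge name, not just the input node name) can be illustrated with a toy naming function. The naming scheme, the `inject_output_name` and `duplicates` helpers, and the edge and node IDs below are hypothetical; they are not taken from the Go SDK.

```python
from collections import Counter

def inject_output_name(edge_name, input_node_name, include_edge):
    """Build a name for the output of a CoGBK inject step (hypothetical scheme)."""
    if include_edge:
        return f"{edge_name}_{input_node_name}_inject"
    return f"{input_node_name}_inject"

def duplicates(names):
    """Return the names that occur more than once, sorted."""
    return sorted(name for name, count in Counter(names).items() if count > 1)

# Two CoGBK edges that both consume the same input node (IDs are made up).
edges = [("e5", "n3"), ("e7", "n3")]

without_edge = [inject_output_name(e, n, include_edge=False) for e, n in edges]
with_edge = [inject_output_name(e, n, include_edge=True) for e, n in edges]

print(duplicates(without_edge))
print(duplicates(with_edge))
```

When only the input node name is used, both edges produce the same output name; adding the edge name keeps them distinct, which is the kind of collision a runner-side consistency check would reject.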
[jira] [Work logged] (BEAM-6493) examples in Kotlin
[ https://issues.apache.org/jira/browse/BEAM-6493?focusedWorklogId=223800=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223800 ] ASF GitHub Bot logged work on BEAM-6493: Author: ASF GitHub Bot Created on: 05/Apr/19 19:58 Start Date: 05/Apr/19 19:58 Worklog Time Spent: 10m Work Description: harshithdwivedi commented on issue #8034: [BEAM-6493] Convert the WordCount samples to Kotlin URL: https://github.com/apache/beam/pull/8034#issuecomment-480403509 run java precommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223800) Time Spent: 6h 10m (was: 6h) Remaining Estimate: 498h 50m (was: 499h) > examples in Kotlin > -- > > Key: BEAM-6493 > URL: https://issues.apache.org/jira/browse/BEAM-6493 > Project: Beam > Issue Type: Task > Components: examples-java >Affects Versions: Not applicable >Reporter: Harshit Dwivedi >Assignee: Harshit Dwivedi >Priority: Minor > Labels: documentation, triaged > Fix For: Not applicable > > Original Estimate: 504h > Time Spent: 6h 10m > Remaining Estimate: 498h 50m > > I have been using Apache Beam for a few of my projects in production for the > past 6 months and, apart from Java, [Kotlin|https://kotlinlang.org/] also > seems to work well, with no issues whatsoever. > But currently, the GitHub repository of Apache Beam contains examples only in > Java, which might be an issue for other developers who want to use the Apache Beam > SDK with Kotlin, as there are no sample resources available. > That said, I would love to go ahead and add Kotlin examples alongside the > current Java examples in the [Beam > repository|https://github.com/apache/beam/tree/master/examples/java]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5328) Java starter archetype does not contain dependency versions
[ https://issues.apache.org/jira/browse/BEAM-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-5328: --- Assignee: Mikhail Ivanov > Java starter archetype does not contain dependency versions > --- > > Key: BEAM-5328 > URL: https://issues.apache.org/jira/browse/BEAM-5328 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.6.0, 2.7.0 >Reporter: Luke Cwik >Assignee: Mikhail Ivanov >Priority: Major > Labels: newbie, starter, triaged > Time Spent: 10m > Remaining Estimate: 0h > > The starter archetype contains resource annotation markers instead of > versions: > {code:java} > @maven-compiler-plugin.version@ > @maven-exec-plugin.version@ > @slf4j.version@ > {code} > in the properties block at the top. > > This means that the starter project is broken without the user manually > editing the pom.xml that is generated and populating the versions at the top. > > We also lack testing that validates that the starter archetype works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
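Editor's sketch, not part of the original message: the issue notes that nothing validates the generated starter project. One minimal check, assuming the generated `pom.xml` is available as text, is to scan it for unresolved `@name@` tokens. The regular expression, the `unresolved_placeholders` helper, and the XML element names in the sample are assumptions; only the three tokens come from the issue.

```python
import re

# Matches resource-filtering markers such as @slf4j.version@.
PLACEHOLDER = re.compile(r"@[A-Za-z0-9_.\-]+@")

def unresolved_placeholders(pom_text):
    """Return the distinct @name@ tokens left in the text, sorted."""
    return sorted(set(PLACEHOLDER.findall(pom_text)))

# Hypothetical properties block; the tokens are the ones quoted in the issue.
sample_pom = """
<properties>
  <maven-compiler-plugin.version>@maven-compiler-plugin.version@</maven-compiler-plugin.version>
  <maven-exec-plugin.version>@maven-exec-plugin.version@</maven-exec-plugin.version>
  <slf4j.version>@slf4j.version@</slf4j.version>
</properties>
"""

leftover = unresolved_placeholders(sample_pom)
print(leftover)
```

A test harness could fail the build whenever the returned list is non-empty; this does not replace actually building the generated project, it only catches the specific symptom described in the issue.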
[jira] [Resolved] (BEAM-6978) Make Dataflow service receive required restriction encoding parameter on SplittableParDo
[ https://issues.apache.org/jira/browse/BEAM-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-6978. - Resolution: Fixed Fix Version/s: 2.13.0 > Make Dataflow service receive required restriction encoding parameter on > SplittableParDo > > > Key: BEAM-6978 > URL: https://issues.apache.org/jira/browse/BEAM-6978 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Labels: portability > Fix For: 2.13.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > The Dataflow service expects a "restriction_encoding" field to be populated > with the coder CloudObject representation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6978) Make Dataflow service receive required restriction encoding parameter on SplittableParDo
[ https://issues.apache.org/jira/browse/BEAM-6978?focusedWorklogId=223797=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223797 ] ASF GitHub Bot logged work on BEAM-6978: Author: ASF GitHub Bot Created on: 05/Apr/19 19:41 Start Date: 05/Apr/19 19:41 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8204: [BEAM-6978] Have dataflow service receive additional arguments needed to support SplittableDoFn. URL: https://github.com/apache/beam/pull/8204 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223797) Time Spent: 1.5h (was: 1h 20m) > Make Dataflow service receive required restriction encoding parameter on > SplittableParDo > > > Key: BEAM-6978 > URL: https://issues.apache.org/jira/browse/BEAM-6978 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 1.5h > Remaining Estimate: 0h > > The Dataflow service expects a "restriction_encoding" field to be populated > with the coder CloudObject representation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6695) Latest transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=223789=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223789 ] ASF GitHub Bot logged work on BEAM-6695: Author: ASF GitHub Bot Created on: 05/Apr/19 19:19 Start Date: 05/Apr/19 19:19 Worklog Time Spent: 10m Work Description: ttanay commented on pull request #8206: [BEAM-6695] Latest PTransform for Python SDK URL: https://github.com/apache/beam/pull/8206#discussion_r272713759 ## File path: sdks/python/apache_beam/transforms/combiners_test.py ## @@ -392,5 +394,92 @@ def test_global_fanout(self): assert_that(result, equal_to([49.5])) +class LatestTest(unittest.TestCase): + + def test_globally(self): +l = [window.GlobalWindows.windowed_value(1, 100), + window.GlobalWindows.windowed_value(2, 200), + window.GlobalWindows.windowed_value(3, 300)] +with TestPipeline() as p: + pc = p | Create(l) + latest = pc | combine.Latest.Globally() + assert_that(latest, equal_to([3])) + + def test_globally_empty(self): +l = [] +with TestPipeline() as p: + pc = p | Create(l) + latest = pc | combine.Latest.Globally() + assert_that(latest, equal_to([None])) + + def test_per_key(self): +l = [window.GlobalWindows.windowed_value(('a', 1), 100), + window.GlobalWindows.windowed_value(('b', 2), 200), + window.GlobalWindows.windowed_value(('a', 3), 300)] +with TestPipeline() as p: + pc = p | Create(l) + latest = pc | combine.Latest.PerKey() + assert_that(latest, equal_to([('a', 3), ('b', 2)])) + + def test_per_key_empty(self): +l = [] +with TestPipeline() as p: + pc = p | Create(l) + latest = pc | combine.Latest.PerKey() + assert_that(latest, equal_to([])) + + def test_per_key_type_violation(self): +l_dict = [window.GlobalWindows.windowed_value({'a': 1}, 100), + window.GlobalWindows.windowed_value({'b': 2}, 100), + window.GlobalWindows.windowed_value({'a': 3}, 100)] +l_3_tuple = [window.GlobalWindows.windowed_value((1, 2, 3), 100), + window.GlobalWindows.windowed_value((4, 5, 6), 100), + 
window.GlobalWindows.windowed_value((7, 8, 9), 100)] +with self.assertRaises(TypeCheckError): + with TestPipeline() as p: +pc = p | Create(l_dict) +_ = pc | combine.Latest.PerKey() + +with self.assertRaises(TypeCheckError): + with TestPipeline() as p: +pc = p | Create(l_3_tuple) +_ = pc | combine.Latest.PerKey() + + +class LatestCombineFnTest(unittest.TestCase): + + def setUp(self): +self.fn = combine.LatestCombineFn() + + def test_create_accumulator(self): +accumulator = self.fn.create_accumulator() +self.assertEquals(accumulator, (None, window.MIN_TIMESTAMP)) + + def test_add_input(self): +accumulator = self.fn.create_accumulator() +element = (1, 100) +new_accumulator = self.fn.add_input(accumulator, element) +self.assertEquals(new_accumulator, (1, 100)) + + def test_merge_accumulators(self): +accumulators = [(1, 100), +(2, 200), +(3, 300)] +merged_accumulator = self.fn.merge_accumulators(accumulators) +self.assertEquals(merged_accumulator, (3, 300)) + + def test_extract_output(self): +accumulator = (1, 100) +output = self.fn.extract_output(accumulator) +self.assertEquals(output, 1) + + def test_input_type_violation(self): Review comment: This test - `test_input_type_violation` is used to test the type imposed on the input to the CombineFn [here](https://github.com/ttanay/beam/blob/e99e4845410e78d77f9e1f82246c851cf38f308c/sdks/python/apache_beam/transforms/combiners.py#L897). The test - `test_per_key_type_violation` is used to check that the input is a KV pair in the DoFn - `add_timestamp` of class `PerKey` [here](https://github.com/ttanay/beam/blob/e99e4845410e78d77f9e1f82246c851cf38f308c/sdks/python/apache_beam/transforms/combiners.py#L887). I think `test_per_key_type_violation` can be written better to display the same behaviour. I'll make that change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223789) Time Spent: 2.5h (was: 2h 20m) > Latest transform for Python SDK > --- > > Key: BEAM-6695 > URL: https://issues.apache.org/jira/browse/BEAM-6695 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >
[jira] [Work logged] (BEAM-6747) Adding ExternalTransform in JavaSDK
[ https://issues.apache.org/jira/browse/BEAM-6747?focusedWorklogId=223775=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223775 ] ASF GitHub Bot logged work on BEAM-6747: Author: ASF GitHub Bot Created on: 05/Apr/19 18:56 Start Date: 05/Apr/19 18:56 Worklog Time Spent: 10m Work Description: ihji commented on pull request #7954: [BEAM-6747] Adding ExternalTransform in JavaSDK URL: https://github.com/apache/beam/pull/7954#discussion_r272707041 ## File path: runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/ExternalTest.java ## @@ -0,0 +1,187 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.core.construction; + +import com.google.auto.service.AutoService; +import java.io.IOException; +import java.io.Serializable; +import java.nio.charset.StandardCharsets; +import java.util.Map; +import org.apache.beam.runners.core.construction.expansion.ExpansionService; +import org.apache.beam.sdk.testing.PAssert; +import org.apache.beam.sdk.testing.TestPipeline; +import org.apache.beam.sdk.testing.UsesCrossLanguageTransforms; +import org.apache.beam.sdk.testing.ValidatesRunner; +import org.apache.beam.sdk.transforms.Create; +import org.apache.beam.sdk.transforms.DoFn; +import org.apache.beam.sdk.transforms.Filter; +import org.apache.beam.sdk.transforms.MapElements; +import org.apache.beam.sdk.transforms.ParDo; +import org.apache.beam.sdk.values.PCollection; +import org.apache.beam.sdk.values.PCollectionTuple; +import org.apache.beam.sdk.values.TupleTag; +import org.apache.beam.sdk.values.TupleTagList; +import org.apache.beam.sdk.values.TypeDescriptor; +import org.apache.beam.sdk.values.TypeDescriptors; +import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ConnectivityState; +import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ManagedChannel; +import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ManagedChannelBuilder; +import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Server; +import org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ServerBuilder; +import org.apache.beam.vendor.guava.v20_0.com.google.common.collect.ImmutableMap; +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.Rule; +import org.junit.Test; +import org.junit.experimental.categories.Category; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Test External transforms. 
*/ +@RunWith(JUnit4.class) +public class ExternalTest implements Serializable { + @Rule public transient TestPipeline testPipeline = TestPipeline.create(); + + private static Server expansionServer; + private static final int EXPANSION_PORT = 8096; + private static final String EXPANSION_ADDR = String.format("localhost:%s", EXPANSION_PORT); + private static String pythonServerCommand; + + @BeforeClass + public static void setUp() throws IOException { +pythonServerCommand = System.getProperty("pythonTestExpansionCommand"); + +expansionServer = +ServerBuilder.forPort(EXPANSION_PORT).addService(new ExpansionService()).build(); +expansionServer.start(); + } + + @AfterClass + public static void tearDown() { +expansionServer.shutdownNow(); + } + + @Test + @Category({ValidatesRunner.class, UsesCrossLanguageTransforms.class}) + public void expandSingleTest() { +PCollection col = +testPipeline +.apply(Create.of(1, 2, 3)) +.apply(External.of("simple", new byte[] {}, EXPANSION_ADDR)); +PAssert.that(col).containsInAnyOrder(2, 3, 4); +testPipeline.run(); + } + + @Test + @Category({ValidatesRunner.class, UsesCrossLanguageTransforms.class}) + public void expandMultipleTest() { +PCollection pcol = +testPipeline +.apply(Create.of(1, 2, 3)) +.apply("add one", External.of("simple", new byte[] {}, EXPANSION_ADDR)) +.apply( +"filter <=3", +External.of("le", "3".getBytes(StandardCharsets.UTF_8), EXPANSION_ADDR)); + +PAssert.that(pcol).containsInAnyOrder(2, 3); +testPipeline.run(); + } + + @Test +
[jira] [Work logged] (BEAM-6747) Adding ExternalTransform in JavaSDK
[ https://issues.apache.org/jira/browse/BEAM-6747?focusedWorklogId=223774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223774 ] ASF GitHub Bot logged work on BEAM-6747: Author: ASF GitHub Bot Created on: 05/Apr/19 18:53 Start Date: 05/Apr/19 18:53 Worklog Time Spent: 10m Work Description: ihji commented on pull request #7954: [BEAM-6747] Adding ExternalTransform in JavaSDK URL: https://github.com/apache/beam/pull/7954#discussion_r272706363 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/SdkComponents.java ## @@ -54,9 +54,11 @@ private final Set reservedIds = new HashSet<>(); + private String defaultEnvironmentId; Review comment: The information only available in `getOnlyEnvironmentId()` is insufficient for providing a default environment id. I couldn't come up with any better idea. Can you elaborate more on how to handle this directly in the method? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223774) Time Spent: 4h 10m (was: 4h) > Adding ExternalTransform in JavaSDK > --- > > Key: BEAM-6747 > URL: https://issues.apache.org/jira/browse/BEAM-6747 > Project: Beam > Issue Type: Improvement > Components: runner-core >Reporter: Heejong Lee >Assignee: Heejong Lee >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > Adding Java counterpart of Python ExternalTransform for testing Python > transforms from pipelines in Java SDK. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6695) Latest transform for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-6695?focusedWorklogId=223772=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223772 ] ASF GitHub Bot logged work on BEAM-6695: Author: ASF GitHub Bot Created on: 05/Apr/19 18:47 Start Date: 05/Apr/19 18:47 Worklog Time Spent: 10m Work Description: ttanay commented on pull request #8206: [BEAM-6695] Latest PTransform for Python SDK URL: https://github.com/apache/beam/pull/8206#discussion_r272704525 ## File path: sdks/python/apache_beam/transforms/combiners_test.py ## @@ -392,5 +394,92 @@ def test_global_fanout(self): assert_that(result, equal_to([49.5])) +class LatestTest(unittest.TestCase): + + def test_globally(self): +l = [window.GlobalWindows.windowed_value(1, 100), Review comment: Sure! I'll make the change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223772) Time Spent: 2h 20m (was: 2h 10m) > Latest transform for Python SDK > --- > > Key: BEAM-6695 > URL: https://issues.apache.org/jira/browse/BEAM-6695 > Project: Beam > Issue Type: New Feature > Components: sdk-py-core >Reporter: Ahmet Altay >Assignee: Tanay Tummalapalli >Priority: Minor > Time Spent: 2h 20m > Remaining Estimate: 0h > > Add a PTransform and Combine.CombineFn for computing the latest element in a > PCollection. > It should offer the same API as its Java counterpart: > https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Latest.java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
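Editor's sketch, not part of the original message: the issue description above asks for a CombineFn that computes the latest element, and the unit tests quoted in the review thread for PR #8206 suggest an accumulator of the form `(value, timestamp)`. The class below is a plain-Python sketch under that assumption. It does not import `apache_beam`, `LatestCombineFnSketch` is a made-up name, and `float("-inf")` is used as a stand-in for `window.MIN_TIMESTAMP`; the actual code in the PR may differ.

```python
MIN_TIMESTAMP = float("-inf")  # stand-in for window.MIN_TIMESTAMP (assumption)

class LatestCombineFnSketch:
    """Keeps the (value, timestamp) pair with the greatest timestamp."""

    def create_accumulator(self):
        # No value seen yet, paired with the smallest possible timestamp.
        return (None, MIN_TIMESTAMP)

    def add_input(self, accumulator, element):
        # element is a (value, timestamp) pair; on a tie the newer input wins.
        if accumulator[1] > element[1]:
            return accumulator
        return element

    def merge_accumulators(self, accumulators):
        result = self.create_accumulator()
        for accumulator in accumulators:
            result = self.add_input(result, accumulator)
        return result

    def extract_output(self, accumulator):
        # Drop the timestamp and return only the value.
        return accumulator[0]

fn = LatestCombineFnSketch()
merged = fn.merge_accumulators([(1, 100), (2, 200), (3, 300)])
print(merged, fn.extract_output(merged))
```

The inputs mirror the values used in the quoted `LatestCombineFnTest` cases, so the sketch can be checked against the same expectations.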
[jira] [Work logged] (BEAM-2939) Fn API SDF support
[ https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=223766=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223766 ] ASF GitHub Bot logged work on BEAM-2939: Author: ASF GitHub Bot Created on: 05/Apr/19 18:28 Start Date: 05/Apr/19 18:28 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8235: [BEAM-2939] Add split_and_size method for SDFs. URL: https://github.com/apache/beam/pull/8235#discussion_r272698413 ## File path: sdks/python/apache_beam/transforms/core.py ## @@ -279,6 +276,12 @@ def restriction_size(self, element, restriction): """ return self.create_tracker(restriction).default_size() + def split_and_size(self, element, restriction): Review comment: Add this "optional" method to the list on sdks/python/apache_beam/transforms/core.py:204 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223766) Time Spent: 15h 10m (was: 15h) > Fn API SDF support > -- > > Key: BEAM-2939 > URL: https://issues.apache.org/jira/browse/BEAM-2939 > Project: Beam > Issue Type: Improvement > Components: beam-model >Reporter: Henning Rohde >Assignee: Luke Cwik >Priority: Major > Labels: portability, triaged > Time Spent: 15h 10m > Remaining Estimate: 0h > > The Fn API should support streaming SDF. Detailed design TBD. > Once design is ready, expand subtasks similarly to BEAM-2822. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
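Editor's sketch, not part of the original message: the diff context above shows a `restriction_size` method next to the new `split_and_size(self, element, restriction)` signature. One plausible reading is that `split_and_size` pairs each split restriction with its size. The toy provider below illustrates that reading with `(start, stop)` tuples as restrictions; the class name, the chunking logic, and the method bodies are assumptions, not the code from PR #8235.

```python
class OffsetRangeProviderSketch:
    """Toy restriction provider; a restriction is a (start, stop) tuple."""

    def __init__(self, chunk):
        self.chunk = chunk

    def split(self, element, restriction):
        # Cut the range into pieces of at most `chunk` offsets.
        start, stop = restriction
        while start < stop:
            end = min(start + self.chunk, stop)
            yield (start, end)
            start = end

    def restriction_size(self, element, restriction):
        start, stop = restriction
        return stop - start

    def split_and_size(self, element, restriction):
        # Pair every split restriction with its size in a single pass.
        for part in self.split(element, restriction):
            yield part, self.restriction_size(element, part)

provider = OffsetRangeProviderSketch(chunk=10)
parts = list(provider.split_and_size("some-element", (0, 25)))
print(parts)
```

Exposing the combined method lets a provider override it when computing the split and the size together is cheaper than computing them separately, while the default stays a simple composition of the two existing methods.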
[jira] [Work logged] (BEAM-5433) Cleanup Environment.url from beam_runner_api.proto
[ https://issues.apache.org/jira/browse/BEAM-5433?focusedWorklogId=223764=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223764 ] ASF GitHub Bot logged work on BEAM-5433: Author: ASF GitHub Bot Created on: 05/Apr/19 18:16 Start Date: 05/Apr/19 18:16 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8213: [BEAM-5433] Deprecate environment url field. URL: https://github.com/apache/beam/pull/8213#issuecomment-480373777 The Go code generation was broken when metrics.proto was created; I will attempt to fix it as part of a separate PR and regenerate the code there. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223764) Time Spent: 40m (was: 0.5h) > Cleanup Environment.url from beam_runner_api.proto > -- > > Key: BEAM-5433 > URL: https://issues.apache.org/jira/browse/BEAM-5433 > Project: Beam > Issue Type: Task > Components: beam-model >Reporter: Ankur Goenka >Assignee: Luke Cwik >Priority: Major > Labels: triaged > Time Spent: 40m > Remaining Estimate: 0h > > The Environment URL field is deprecated and should be removed ASAP. > The current blocker to removing the field is compatibility with Dataflow, as > Dataflow has internal code which relies on it. > A vote has also passed to move the affected Dataflow code to open source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223743=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223743 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 18:09 Start Date: 05/Apr/19 18:09 Worklog Time Spent: 10m Work Description: lostluck commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480371522 There are, but most IDEs configured for Go should automatically download and run them on save. The one that handles imports specifically is [goimports](https://godoc.org/golang.org/x/tools/cmd/goimports) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223743) Time Spent: 50m (was: 40m) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223738 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 05/Apr/19 17:56 Start Date: 05/Apr/19 17:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225 This PR: - fixes a bug in the pipeline options implementation that makes changes to multi-valued pipeline options in one view invisible to other views of the same pipeline options. The bug manifests when multi-valued options are not empty before modification. - adds documentation and unit tests for codepaths affected by the bug and the fix. - cleans up experiment propagation in the Dataflow runner and removes the no longer required propagation of dataflow_kms_key. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
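The class of bug described in PR #8225 (a change to a multi-valued option made through one view not being visible through another) can be illustrated with a toy model. This is only a sketch: the `Options` class, its `view()` method and the `share_state` switch are hypothetical names invented for the illustration, and the deep-copy variant is just one way such a divergence could arise, not necessarily what happened in Beam.

```python
import copy


class Options:
    """Toy options holder (hypothetical, not Beam's PipelineOptions)."""

    def __init__(self, values=None, share_state=True):
        self._values = values if values is not None else {}
        self._share_state = share_state

    def view(self):
        # A well-behaved view reuses the same underlying dict, so list-valued
        # options are the very same objects in every view.
        if self._share_state:
            return Options(self._values, share_state=True)
        # A misbehaving view works on a deep copy, so later edits diverge.
        return Options(copy.deepcopy(self._values), share_state=False)

    def get(self, name):
        return self._values.setdefault(name, [])


def add_experiment(options, experiment):
    """Appends to the multi-valued 'experiments' option through a view."""
    options.view().get("experiments").append(experiment)


# The option is non-empty before modification in both cases.
shared = Options({"experiments": ["a"]})
add_experiment(shared, "b")

copied = Options({"experiments": ["a"]}, share_state=False)
add_experiment(copied, "b")

print(shared.get("experiments"))  # the append is visible
print(copied.get("experiments"))  # the append was lost
```

In the shared-state case the appended value is seen by the original object; in the copying case it is silently dropped, which is the symptom the PR description reports.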
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223736=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223736 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 05/Apr/19 17:56 Start Date: 05/Apr/19 17:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on issue #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225#issuecomment-480367226 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223736) Time Spent: 1h (was: 50m) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6942) Pipeline options to experiment propagation is not working in Dataflow runner.
[ https://issues.apache.org/jira/browse/BEAM-6942?focusedWorklogId=223737=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223737 ] ASF GitHub Bot logged work on BEAM-6942: Author: ASF GitHub Bot Created on: 05/Apr/19 17:56 Start Date: 05/Apr/19 17:56 Worklog Time Spent: 10m Work Description: tvalentyn commented on pull request #8225: [BEAM-6942] Make modifications to pipeline options to be visible to all views. URL: https://github.com/apache/beam/pull/8225 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223737) Time Spent: 1h 10m (was: 1h) > Pipeline options to experiment propagation is not working in Dataflow runner. > - > > Key: BEAM-6942 > URL: https://issues.apache.org/jira/browse/BEAM-6942 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Valentyn Tymofieiev >Assignee: Valentyn Tymofieiev >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Relevant code: > [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py#L356-L388] > 3 experiments/options are affected. We need to fix it in 2.12.0 > cc: [~altay] [~apilloud] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7011) Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side input type materialization
[ https://issues.apache.org/jira/browse/BEAM-7011?focusedWorklogId=223732=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223732 ] ASF GitHub Bot logged work on BEAM-7011: Author: ASF GitHub Bot Created on: 05/Apr/19 17:54 Start Date: 05/Apr/19 17:54 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #8232: [BEAM-7011] Update Beam SDKs to use the StandardSideInputType enums. URL: https://github.com/apache/beam/pull/8232#issuecomment-480366641 Thanks Robert Is there tooling that does this automatically? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223732) Time Spent: 40m (was: 0.5h) > Replace "urn:beam:sideinput:materialization:multimap:0.1" with standard side > input type materialization > --- > > Key: BEAM-7011 > URL: https://issues.apache.org/jira/browse/BEAM-7011 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-core, sdk-py-core >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Use the StandardSideInput defined within beam_runner_api.proto: > https://github.com/apache/beam/blob/206d98b0765ac662730edd28d669b3db24dd851d/model/pipeline/src/main/proto/beam_runner_api.proto#L311 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-7014) Flake in gcsio.py / filesystemio.py - NotImplementedError: offset: 0, whence: 0
[ https://issues.apache.org/jira/browse/BEAM-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Valentyn Tymofieiev updated BEAM-7014: -- Description: The flake was observed in Precommit Direct Runner IT (wordcount). Full log output: https://pastebin.com/raw/DP5J7Uch. {noformat} Traceback (most recent call last): 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/gcsio.py", line 583, in _start_upload 08:42:57 self._client.objects.Insert(self._insert_request, upload=self._upload) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1154, in Insert 08:42:57 upload=upload, upload_config=upload_config) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/base_api.py", line 715, in _RunMethod 08:42:57 http_request, client=self.client) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 885, in InitializeUpload 08:42:57 return self.StreamInChunks() 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 997, in StreamInChunks 08:42:57 additional_headers=additional_headers) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 948, in __StreamMedia 08:42:57 self.RefreshResumableUploadState() 08:42:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 850, in RefreshResumableUploadState 08:42:57 self.stream.seek(self.progress) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/filesystemio.py", line 269, in seek 08:42:57 offset, whence, self.position, self.last_position)) 08:42:57 NotImplementedError: offset: 0, whence: 0, position: 48944, last: 0 {noformat} [~chamikara] Might have context to triage this. was: The flake was observed in Precommit Direct Runner IT (wordcount). Full log output: https://pastebin.com/raw/DP5J7Uch. {noformat} 08:42:57 root: ERROR: Error in _start_upload while inserting file gs://temp-storage-for-end-to-end-tests/py-it-cloud/output/1554478948137/beam-temp-results-6bdc71e057b911e9aa9f42010a88/3d645314-4633-41a7-8f0d-8f460124f2f6.results: Traceback (most recent call last): 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/gcsio.py", line 583, in _start_upload 08:42:57 self._client.objects.Insert(self._insert_request, upload=self._upload) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1154, in Insert 08:42:57 upload=upload, upload_config=upload_config) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/base_api.py", line 715, in _RunMethod 08:42:57 http_request, client=self.client) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 885, in InitializeUpload 08:42:57 return self.StreamInChunks() 08:42:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 997, in StreamInChunks 08:42:57 additional_headers=additional_headers) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 948, in __StreamMedia 08:42:57 self.RefreshResumableUploadState() 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 850, in RefreshResumableUploadState 08:42:57 self.stream.seek(self.progress) 08:42:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/filesystemio.py", line 269, in seek 08:42:57 offset, whence, self.position, self.last_position)) 08:42:57 NotImplementedError:
[jira] [Created] (BEAM-7014) Flake in gcsio.py / filesystemio.py - NotImplementedError: offset: 0, whence: 0
Valentyn Tymofieiev created BEAM-7014:

Summary: Flake in gcsio.py / filesystemio.py - NotImplementedError: offset: 0, whence: 0
Key: BEAM-7014
URL: https://issues.apache.org/jira/browse/BEAM-7014
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Reporter: Valentyn Tymofieiev
Assignee: Chamikara Jayalath

The flake was observed in Precommit Direct Runner IT (wordcount). Full log output: https://pastebin.com/raw/DP5J7Uch.

{noformat}
08:42:57 root: ERROR: Error in _start_upload while inserting file gs://temp-storage-for-end-to-end-tests/py-it-cloud/output/1554478948137/beam-temp-results-6bdc71e057b911e9aa9f42010a88/3d645314-4633-41a7-8f0d-8f460124f2f6.results: Traceback (most recent call last):
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/gcsio.py", line 583, in _start_upload
08:42:57     self._client.objects.Insert(self._insert_request, upload=self._upload)
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/gcp/internal/clients/storage/storage_v1_client.py", line 1154, in Insert
08:42:57     upload=upload, upload_config=upload_config)
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/base_api.py", line 715, in _RunMethod
08:42:57     http_request, client=self.client)
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 885, in InitializeUpload
08:42:57     return self.StreamInChunks()
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 997, in StreamInChunks
08:42:57     additional_headers=additional_headers)
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 948, in __StreamMedia
08:42:57     self.RefreshResumableUploadState()
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/build/gradleenv/1327086738/local/lib/python2.7/site-packages/apitools/base/py/transfer.py", line 850, in RefreshResumableUploadState
08:42:57     self.stream.seek(self.progress)
08:42:57   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify_PR/src/sdks/python/apache_beam/io/filesystemio.py", line 269, in seek
08:42:57     offset, whence, self.position, self.last_position))
08:42:57 NotImplementedError: offset: 0, whence: 0, position: 48944, last: 0
{noformat}

[~chamikara] Might have context to triage this.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
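One plausible reading of the error values in BEAM-7014 is that the upload code asked the stream to go back to offset 0 (whence 0 is `os.SEEK_SET`) after 48944 bytes had already been consumed, and the stream could not rewind. The sketch below reproduces that shape of failure with a toy forward-only stream. `ForwardOnlyStream` and `retry_from` are hypothetical names made up for this illustration; this is not the code in filesystemio.py or apitools.

```python
import os


class ForwardOnlyStream:
    """Toy read stream that tracks a position but cannot rewind."""

    def __init__(self, data):
        self._data = data
        self.position = 0

    def read(self, size):
        chunk = self._data[self.position:self.position + size]
        self.position += len(chunk)
        return chunk

    def seek(self, offset, whence=os.SEEK_SET):
        # Seeking to the current position is a no-op; anything else would
        # need data that has already been consumed and discarded.
        if whence == os.SEEK_SET and offset == self.position:
            return
        raise NotImplementedError(
            "offset: %s, whence: %s, position: %s"
            % (offset, whence, self.position))


def retry_from(stream, progress):
    """Mimics a caller that rewinds to its last confirmed offset."""
    try:
        stream.seek(progress)
        return "ok"
    except NotImplementedError as exc:
        return str(exc)


stream = ForwardOnlyStream(b"x" * 100)
stream.read(40)
print(retry_from(stream, 40))  # caller and stream agree on the offset
print(retry_from(stream, 0))   # caller believes nothing was sent yet
```

Under this reading the failure is intermittent because it only occurs when the caller's notion of progress falls behind the bytes the stream has already handed out, for example after a transient error during a resumable upload.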
[jira] [Work logged] (BEAM-6978) Make Dataflow service receive required restriction encoding parameter on SplittableParDo
[ https://issues.apache.org/jira/browse/BEAM-6978?focusedWorklogId=223726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223726 ] ASF GitHub Bot logged work on BEAM-6978: Author: ASF GitHub Bot Created on: 05/Apr/19 17:39 Start Date: 05/Apr/19 17:39 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8204: [BEAM-6978] Have dataflow service receive additional arguments needed to support SplittableDoFn. URL: https://github.com/apache/beam/pull/8204#discussion_r272682734 ## File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py ## @@ -781,6 +783,16 @@ def run_ParDo(self, transform_node, options): step.add_property(PropertyNames.OUTPUT_INFO, outputs) +# Add the restriction encoding if we are a splittable DoFn +# and are using the Fn API on the unified worker. +from apache_beam.runners.common import DoFnSignature +signature = DoFnSignature(transform_node.transform.fn) +if (use_fnapi and use_unified_worker and signature.is_splittable_dofn()): + restriction_coder = ( + signature.get_restriction_provider().restriction_coder()) + step.add_property('restriction_encoding', Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223726) Time Spent: 1h 10m (was: 1h) > Make Dataflow service receive required restriction encoding parameter on > SplittableParDo > > > Key: BEAM-6978 > URL: https://issues.apache.org/jira/browse/BEAM-6978 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 1h 10m > Remaining Estimate: 0h > > The Dataflow service expects a "restriction_encoding" field to be populated > with the coder CloudObject representation. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-6978) Make Dataflow service receive required restriction encoding parameter on SplittableParDo
[ https://issues.apache.org/jira/browse/BEAM-6978?focusedWorklogId=223727=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223727 ] ASF GitHub Bot logged work on BEAM-6978: Author: ASF GitHub Bot Created on: 05/Apr/19 17:39 Start Date: 05/Apr/19 17:39 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #8204: [BEAM-6978] Have dataflow service receive additional arguments needed to support SplittableDoFn. URL: https://github.com/apache/beam/pull/8204#discussion_r272682734 ## File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py ## @@ -781,6 +783,16 @@ def run_ParDo(self, transform_node, options): step.add_property(PropertyNames.OUTPUT_INFO, outputs) +# Add the restriction encoding if we are a splittable DoFn +# and are using the Fn API on the unified worker. +from apache_beam.runners.common import DoFnSignature +signature = DoFnSignature(transform_node.transform.fn) +if (use_fnapi and use_unified_worker and signature.is_splittable_dofn()): + restriction_coder = ( + signature.get_restriction_provider().restriction_coder()) + step.add_property('restriction_encoding', Review comment: Done Thanks for pointing this out. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223727) Time Spent: 1h 20m (was: 1h 10m) > Make Dataflow service receive required restriction encoding parameter on > SplittableParDo > > > Key: BEAM-6978 > URL: https://issues.apache.org/jira/browse/BEAM-6978 > Project: Beam > Issue Type: Sub-task > Components: runner-dataflow >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 1h 20m > Remaining Estimate: 0h > > The Dataflow service expects a "restriction_encoding" field to be populated > with the coder CloudObject representation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
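The condition in the reviewed hunk can be read as: attach `restriction_encoding` to the step only when the DoFn is splittable and both the Fn API and the unified worker are in use. A minimal sketch of that gating follows. `Step`, `maybe_add_restriction_encoding` and the example coder dictionary are hypothetical stand-ins for illustration, not Beam's or Dataflow's actual classes or CloudObject format.

```python
class Step:
    """Hypothetical stand-in for a Dataflow job step."""

    def __init__(self):
        self.properties = {}

    def add_property(self, name, value):
        self.properties[name] = value


def maybe_add_restriction_encoding(step, is_splittable, use_fnapi,
                                   use_unified_worker, coder_repr):
    """Adds 'restriction_encoding' only when all three conditions hold."""
    if use_fnapi and use_unified_worker and is_splittable:
        step.add_property("restriction_encoding", coder_repr)
    return step


# Placeholder value; the real field carries the coder's CloudObject form.
example_coder = {"@type": "example_coder"}

sdf_step = maybe_add_restriction_encoding(
    Step(), True, True, True, example_coder)
plain_step = maybe_add_restriction_encoding(
    Step(), False, True, True, example_coder)

print(sdf_step.properties)
print(plain_step.properties)
```

Steps for non-splittable DoFns, or for jobs not on the unified worker, are left without the property in this sketch.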
[jira] [Work logged] (BEAM-7001) Remove MonitoringInfoUrns proto from FnApi metrics.proto
[ https://issues.apache.org/jira/browse/BEAM-7001?focusedWorklogId=223721=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223721 ] ASF GitHub Bot logged work on BEAM-7001: Author: ASF GitHub Bot Created on: 05/Apr/19 17:30 Start Date: 05/Apr/19 17:30 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #8203: [BEAM-7001] Remove MonitoringInfoUrns proto. URL: https://github.com/apache/beam/pull/8203#issuecomment-480358097 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223721) Time Spent: 50m (was: 40m) > Remove MonitoringInfoUrns proto from FnApi metrics.proto > > > Key: BEAM-7001 > URL: https://issues.apache.org/jira/browse/BEAM-7001 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > All urns are defined in MonitoringInfoSpecs. We want to remove duplicate > definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-7001) Remove MonitoringInfoUrns proto from FnApi metrics.proto
[ https://issues.apache.org/jira/browse/BEAM-7001?focusedWorklogId=223722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223722 ] ASF GitHub Bot logged work on BEAM-7001: Author: ASF GitHub Bot Created on: 05/Apr/19 17:30 Start Date: 05/Apr/19 17:30 Worklog Time Spent: 10m Work Description: Ardagan commented on issue #8203: [BEAM-7001] Remove MonitoringInfoUrns proto. URL: https://github.com/apache/beam/pull/8203#issuecomment-480358097 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 223722) Time Spent: 1h (was: 50m) > Remove MonitoringInfoUrns proto from FnApi metrics.proto > > > Key: BEAM-7001 > URL: https://issues.apache.org/jira/browse/BEAM-7001 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Mikhail Gryzykhin >Assignee: Mikhail Gryzykhin >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > All urns are defined in MonitoringInfoSpecs. We want to remove duplicate > definition. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-6753) Create proto representation for schemas
[ https://issues.apache.org/jira/browse/BEAM-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811068#comment-16811068 ] Thomas Weise commented on BEAM-6753: [~reuvenlax] seeing that the PR was merged, could you add a bit of context to this ticket what the proto support will enable? How far are we from being able to use Beam SQL from a Python pipeline? > Create proto representation for schemas > --- > > Key: BEAM-6753 > URL: https://issues.apache.org/jira/browse/BEAM-6753 > Project: Beam > Issue Type: Sub-task > Components: beam-model >Reporter: Reuven Lax >Assignee: Reuven Lax >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)