[jira] [Updated] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle
[ https://issues.apache.org/jira/browse/BEAM-9979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9979: Description: When the BeamFnDataReadRunner/DataInputOperation is reused there is a short period of time when a progress request could happen before the start function is called resetting the read index to -1. I believe there should be a way to *reset* an operator before it gets added to the set of cached bundle processors, instead of placing clean-up in the *start* functions those operators rely on, so that details of those operators are not exposed before *start* has been invoked. was:When the BeamFnDataReadRunner/DataInputOperation is reused there is a short period of time when a progress request could happen before the start function is called resetting the read index to -1. > Fix race condition where the read index may be reported from the last executed > bundle > > > Key: BEAM-9979 > URL: https://issues.apache.org/jira/browse/BEAM-9979 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Priority: Minor > Fix For: 2.22.0 > > > When the BeamFnDataReadRunner/DataInputOperation is reused there is a short > period of time when a progress request could happen before the start > function is called resetting the read index to -1. > I believe there should be a way to *reset* an operator before it gets added > to the set of cached bundle processors, instead of placing clean-up in the > *start* functions those operators rely on, so that details of those operators > are not exposed before *start* has been invoked. -- This message was sent by Atlassian Jira (v8.3.4#803005)
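The race described in this issue can be sketched in plain Python. This is an illustrative toy, not the actual BeamFnDataReadRunner/DataInputOperation code: the class and method names are hypothetical, but it shows how a progress request arriving between bundles can observe the previous bundle's read index when the reset only happens in start, and how a separate reset hook avoids that.

```python
# Hypothetical sketch of the race (names are illustrative, not Beam's classes).
# If the read index is only reset in start(), a progress request that arrives
# after the processor is cached but before start() runs for the next bundle
# reports the index from the PREVIOUS bundle.

class DataInputOperationSketch:
    def __init__(self):
        self.read_index = -1

    def start(self):
        # Resetting here is too late: progress may be polled before start().
        self.read_index = -1

    def reset(self):
        # Proposed fix: invoked before the processor is returned to the cache,
        # so a stale index is never observable.
        self.read_index = -1

    def process(self, elements):
        for _ in elements:
            self.read_index += 1

    def progress(self):
        return self.read_index


op = DataInputOperationSketch()
op.start()
op.process(["a", "b", "c"])
stale = op.progress()   # a progress request now reports the old bundle's index (2)
op.reset()              # resetting before caching removes the stale value
assert op.progress() == -1
```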
[jira] [Updated] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle
[ https://issues.apache.org/jira/browse/BEAM-9979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9979: Status: Open (was: Triage Needed) > Fix race condition where the read index may be reported from the last executed > bundle > > > Key: BEAM-9979 > URL: https://issues.apache.org/jira/browse/BEAM-9979 > Project: Beam > Issue Type: Bug > Components: sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Priority: Minor > Fix For: 2.22.0 > > > When the BeamFnDataReadRunner/DataInputOperation is reused there is a short > period of time when a progress request could happen before the start > function is called resetting the read index to -1. > I believe there should be a way to *reset* an operator before it gets added > to the set of cached bundle processors, instead of placing clean-up in the > *start* functions those operators rely on, so that details of those operators > are not exposed before *start* has been invoked. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle
Luke Cwik created BEAM-9979: --- Summary: Fix race condition where the read index may be reported from the last executed bundle Key: BEAM-9979 URL: https://issues.apache.org/jira/browse/BEAM-9979 Project: Beam Issue Type: Bug Components: sdk-java-harness, sdk-py-harness Reporter: Luke Cwik Fix For: 2.22.0 When the BeamFnDataReadRunner/DataInputOperation is reused there is a short period of time when a progress request could happen before the start function is called resetting the read index to -1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105926#comment-17105926 ] Luke Cwik edited comment on BEAM-9945 at 5/13/20, 3:23 AM: --- When testing the implementation against runner v2, I found that the Go SDK's read index ends at *# elements* or the *stop index* when splitting, while the Python/Java implementations always stop at *# elements - 1* or *stop index - 1*. I think it makes sense to use Go's definition. was (Author: lcwik): When testing the implementation against runner v2, I found that the Go SDK's read index ends at *# elements* or the *stop index* when splitting, while the Python/Java implementations always stop at *# elements - 1* or *stop index - 1*. I think it makes sense to use Go's definition. > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
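The two read-index conventions in the comment can be illustrated with a toy Python sketch (function names are made up for illustration): the Go style ends one past the last element, at the stop index, while the Python/Java style reports the index of the last element read.

```python
# Illustrative only: two conventions for the final read index over n elements.

def final_index_exclusive(n):
    # Go-style: the index ends at n, i.e. the stop index (one past the last element).
    i = 0
    for _ in range(n):
        i += 1
    return i

def final_index_inclusive(n):
    # Python/Java-style: the index of the last element read, i.e. n - 1.
    i = -1
    for _ in range(n):
        i += 1
    return i

assert final_index_exclusive(4) == 4   # "# elements" / stop index
assert final_index_inclusive(4) == 3   # "# elements - 1" / stop index - 1
```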
[jira] [Commented] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105937#comment-17105937 ] Luke Cwik commented on BEAM-9945: - Fix in https://github.com/apache/beam/pull/11689 > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9945: Status: Open (was: Triage Needed) > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-9945: - When testing the implementation against runner v2, I found that the Go SDK's read index ends at *# elements* or the *stop index* when splitting, while the Python/Java implementations always stop at *# elements - 1* or *stop index - 1*. I think it makes sense to use Go's definition. > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression
[ https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri closed BEAM-8964. --- Fix Version/s: Not applicable Resolution: Fixed > DeprecationWarning: Flags not at the start of the expression > - > > Key: BEAM-8964 > URL: https://issues.apache.org/jira/browse/BEAM-8964 > Project: Beam > Issue Type: Bug > Components: io-py-files >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Labels: beginner, easy, newbie, starter > Fix For: Not applicable > > > I see lots of these warnings in our precommits. > {code} > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp2jslxh39\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > 19:09:37 > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp03vpdu3z\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
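For context on the warning above: Python emits "DeprecationWarning: Flags not at the start of the expression" when a global inline flag such as `(?i)` appears anywhere but the start of a pattern, and recent Python versions (3.11+) reject such patterns outright. A minimal illustration (the paths here are made up, not the ones from the precommit logs); the portable fix is to put the flag first or pass it via the `flags` argument:

```python
import re

# A global inline flag in the middle of a pattern triggers the
# DeprecationWarning on Python 3.6-3.10 and re.error on 3.11+:
bad = r'/tmp/foo/(?i)bar'           # do not compile this on modern Python

# Fix 1: move the global flag to the start of the expression.
good = r'(?i)/tmp/foo/bar'

# Fix 2: pass the flag through the flags argument instead.
also_good = re.compile(r'/tmp/foo/bar', re.IGNORECASE)

assert re.compile(good).match('/TMP/FOO/BAR') is not None
assert also_good.match('/tmp/foo/BAR') is not None
```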
[jira] [Assigned] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression
[ https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Udi Meiri reassigned BEAM-8964: --- Assignee: Udi Meiri > DeprecationWarning: Flags not at the start of the expression > - > > Key: BEAM-8964 > URL: https://issues.apache.org/jira/browse/BEAM-8964 > Project: Beam > Issue Type: Bug > Components: io-py-files >Reporter: Udi Meiri >Assignee: Udi Meiri >Priority: Major > Labels: beginner, easy, newbie, starter > > I see lots of these warnings in our precommits. > {code} > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp2jslxh39\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > 19:09:37 > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp03vpdu3z\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression
[ https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105915#comment-17105915 ] Udi Meiri commented on BEAM-8964: - Oops, I fixed this in https://github.com/apache/beam/pull/11016, but forgot to close the bug. I don't see this warning at all in https://builds.apache.org/job/beam_PreCommit_Python_Cron/2746/ > DeprecationWarning: Flags not at the start of the expression > - > > Key: BEAM-8964 > URL: https://issues.apache.org/jira/browse/BEAM-8964 > Project: Beam > Issue Type: Bug > Components: io-py-files >Reporter: Udi Meiri >Priority: Major > Labels: beginner, easy, newbie, starter > > I see lots of these warnings in our precommits. > {code} > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp2jslxh39\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > 19:09:37 > 19:09:37 apache_beam/io/filesystem.py:583 > 19:09:37 > /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583: > DeprecationWarning: Flags not at the start of the expression > '\\/tmp\\/tmp03vpdu3z\\/' (truncated) > 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern)) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9978) Add offset range restrictions to the Go SDK proper.
[ https://issues.apache.org/jira/browse/BEAM-9978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9978: -- Status: Open (was: Triage Needed) > Add offset range restrictions to the Go SDK proper. > --- > > Key: BEAM-9978 > URL: https://issues.apache.org/jira/browse/BEAM-9978 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > > Currently these are part of the stringsplit example, but they should probably > be generalized and moved into the actual SDK, with adequate testing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9978) Add offset range restrictions to the Go SDK.
[ https://issues.apache.org/jira/browse/BEAM-9978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Oliveira updated BEAM-9978: -- Summary: Add offset range restrictions to the Go SDK. (was: Add offset range restrictions to the Go SDK proper.) > Add offset range restrictions to the Go SDK. > > > Key: BEAM-9978 > URL: https://issues.apache.org/jira/browse/BEAM-9978 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > > Currently these are part of the stringsplit example, but they should probably > be generalized and moved into the actual SDK, with adequate testing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9978) Add offset range restrictions to the Go SDK proper.
Daniel Oliveira created BEAM-9978: - Summary: Add offset range restrictions to the Go SDK proper. Key: BEAM-9978 URL: https://issues.apache.org/jira/browse/BEAM-9978 Project: Beam Issue Type: Improvement Components: sdk-go Reporter: Daniel Oliveira Assignee: Daniel Oliveira Currently these are part of the stringsplit example, but they should probably be generalized and moved into the actual SDK, with adequate testing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
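A minimal Python sketch of the kind of offset-range restriction and tracker the issue proposes generalizing out of the stringsplit example (all names are illustrative; the real Go SDK API may differ):

```python
# Illustrative offset-range restriction and tracker; not the Go SDK API.

class OffsetRange:
    """A half-open range [start, stop) of offsets to process."""
    def __init__(self, start, stop):
        self.start, self.stop = start, stop

    def size(self):
        return self.stop - self.start

class OffsetRangeTracker:
    """Tracks claimed offsets; claims must be increasing and inside the range."""
    def __init__(self, restriction):
        self.restriction = restriction
        self.claimed = restriction.start - 1

    def try_claim(self, offset):
        if offset <= self.claimed or offset >= self.restriction.stop:
            return False
        self.claimed = offset
        return True


tracker = OffsetRangeTracker(OffsetRange(0, 3))
assert tracker.try_claim(0) and tracker.try_claim(1) and tracker.try_claim(2)
assert not tracker.try_claim(3)   # past the end of the restriction
```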
[jira] [Comment Edited] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894 ] Ahmet Altay edited comment on BEAM-9975 at 5/13/20, 2:26 AM: - Could we add a log error before the raise and log the value instead of catching the error? I agree with Kyle, unless we add more logging, it will be hard to find the root cause. was (Author: altay): Could we add a log error before the raise and log the value instead of catching the error? > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > 
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
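A plain-Python sketch of the type dispatch that produces this ParseError, with the log-before-raise suggested in the comment added. This loosely re-implements the shape of `json_format._ConvertValueMessage` for illustration only; it is not the protobuf code itself:

```python
# Illustrative sketch: any value that is not dict/list/None/bool/str/number
# raises ParseError, which is why a stray object (e.g. a generator, as in the
# flaky trace) in the pipeline options blows up. Logging the offending value
# before raising would make the flake diagnosable.
import logging

class ParseError(Exception):
    pass

def convert_value(value):
    if value is None or isinstance(value, (dict, list, bool, str, int, float)):
        return value  # supported Struct/Value kinds
    logging.error("Unexpected type for Value message: %r", value)
    raise ParseError('Unexpected type for Value message.')


assert convert_value({'a': 1}) == {'a': 1}
try:
    convert_value(x for x in range(3))   # a generator, as in the traceback
except ParseError:
    pass
```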
[jira] [Resolved] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.
[ https://issues.apache.org/jira/browse/BEAM-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Oliveira resolved BEAM-9775. --- Fix Version/s: Not applicable Resolution: Fixed This seems complete now. We have an example SDF and it runs on both those runners. > Add capability to run SDFs on Flink Runner and Python FnApiRunner. > -- > > Key: BEAM-9775 > URL: https://issues.apache.org/jira/browse/BEAM-9775 > Project: Beam > Issue Type: Sub-task > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Fix For: Not applicable > > Time Spent: 2h 50m > Remaining Estimate: 0h > > This is pretty simple: Add any missing requirements for being able to > actually execute SDFs, and _maybe_ an example SDF or something that actually > works. This can be marked completed when we can run a simple SDF with the Go > SDK. > I'm still hesitant on the example SDF because it may imply that the feature > is ready for general usage, so it might need to be bundled with the > documentation PR, but we'll see. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894 ] Ahmet Altay commented on BEAM-9975: --- Could we add a log error before the raise and log the value instead of catching the error? > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > 
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9799) Add automated RTracker validation to Go SDF
[ https://issues.apache.org/jira/browse/BEAM-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Oliveira resolved BEAM-9799. --- Fix Version/s: Not applicable Resolution: Fixed > Add automated RTracker validation to Go SDF > --- > > Key: BEAM-9799 > URL: https://issues.apache.org/jira/browse/BEAM-9799 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Reporter: Daniel Oliveira >Assignee: Daniel Oliveira >Priority: Major > Fix For: Not applicable > > Time Spent: 40m > Remaining Estimate: 0h > > After finishing executing an SDF we should be validating the restriction > trackers to make sure they succeeded without errors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations
[ https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105866#comment-17105866 ] Luke Cwik commented on BEAM-9934: - 2.22 > Resolve differences in beam:metric:element_count:v1 implementations > --- > > Key: BEAM-9934 > URL: https://issues.apache.org/jira/browse/BEAM-9934 > Project: Beam > Issue Type: Bug > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Fix For: 2.22.0 > > > The [element > count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206] > metric represents the number of elements within a PCollection and is > interpreted differently across the Beam SDK versions. > In the [Java > SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207] > this represents the number of elements and includes how many windows those > elements are in. This metric is incremented as soon as the element has been > output. > In the [Python > SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247] > this represents the number of elements and doesn't include how many windows > those elements are in. The metric is also only incremented after the element > has finished processing. > The [Go > SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260] > does the same thing as Python. > Traditionally in Dataflow this has always been the exploded window element > count and the counter is incremented as soon as the element is output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
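The difference between the element_count conventions described in this issue can be shown with a toy Python count (the data and names are illustrative): an element in k windows counts as k under the Java harness's window-exploded convention, but as 1 under the Python/Go convention.

```python
# Illustrative only: window-exploded vs. per-element counting.
elements = [
    {"value": "a", "windows": ["w1", "w2"]},   # one element in two windows
    {"value": "b", "windows": ["w1"]},
]

# Java-harness style: count once per (element, window) pair.
java_style_count = sum(len(e["windows"]) for e in elements)

# Python/Go-harness style: count once per element, windows ignored.
python_style_count = len(elements)

assert java_style_count == 3
assert python_style_count == 2
```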
[jira] [Updated] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations
[ https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9934: Fix Version/s: (was: 2.21.0) 2.22.0 > Resolve differences in beam:metric:element_count:v1 implementations > --- > > Key: BEAM-9934 > URL: https://issues.apache.org/jira/browse/BEAM-9934 > Project: Beam > Issue Type: Bug > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Fix For: 2.22.0 > > > The [element > count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206] > metric represents the number of elements within a PCollection and is > interpreted differently across the Beam SDK versions. > In the [Java > SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207] > this represents the number of elements and includes how many windows those > elements are in. This metric is incremented as soon as the element has been > output. > In the [Python > SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247] > this represents the number of elements and doesn't include how many windows > those elements are in. The metric is also only incremented after the element > has finished processing. > The [Go > SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260] > does the same thing as Python. > Traditionally in Dataflow this has always been the exploded window element > count and the counter is incremented as soon as the element is output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9767) test_streaming_wordcount flaky timeouts
[ https://issues.apache.org/jira/browse/BEAM-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105858#comment-17105858 ] Ahmet Altay commented on BEAM-9767: --- https://github.com/apache/beam/pull/11663 is merged. Does this address the flake? > test_streaming_wordcount flaky timeouts > --- > > Key: BEAM-9767 > URL: https://issues.apache.org/jira/browse/BEAM-9767 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures >Reporter: Udi Meiri >Assignee: Sam Rohde >Priority: Critical > Time Spent: 2h 40m > Remaining Estimate: 0h > > Timed out after 600s, typically completes in 2.8s on my workstation. > https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/ > {code} > self = > testMethod=test_streaming_wordcount> > @unittest.skipIf( > sys.version_info < (3, 5, 3), > 'The tests require at least Python 3.6 to work.') > def test_streaming_wordcount(self): > class WordExtractingDoFn(beam.DoFn): > def process(self, element): > text_line = element.strip() > words = text_line.split() > return words > > # Add the TestStream so that it can be cached. > ib.options.capturable_sources.add(TestStream) > ib.options.capture_duration = timedelta(seconds=5) > > p = beam.Pipeline( > runner=interactive_runner.InteractiveRunner(), > options=StandardOptions(streaming=True)) > > data = ( > p > | TestStream() > .advance_watermark_to(0) > .advance_processing_time(1) > .add_elements(['to', 'be', 'or', 'not', 'to', 'be']) > .advance_watermark_to(20) > .advance_processing_time(1) > .add_elements(['that', 'is', 'the', 'question']) > | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable > > counts = ( > data > | 'split' >> beam.ParDo(WordExtractingDoFn()) > | 'pair_with_one' >> beam.Map(lambda x: (x, 1)) > | 'group' >> beam.GroupByKey() > | 'count' >> beam.Map(lambda wordones: (wordones[0], > sum(wordones[1] > > # Watch the local scope for Interactive Beam so that referenced > PCollections > # will be cached. 
> ib.watch(locals()) > > # This is normally done in the interactive_utils when a transform is > # applied but needs an IPython environment. So we manually run this > here. > ie.current_env().track_user_pipelines() > > # Create a fake limiter that cancels the BCJ once the main job receives > the > # expected amount of results. > class FakeLimiter: > def __init__(self, p, pcoll): > self.p = p > self.pcoll = pcoll > > def is_triggered(self): > result = ie.current_env().pipeline_result(self.p) > if result: > try: > results = result.get(self.pcoll) > except ValueError: > return False > return len(results) >= 10 > return False > > # This sets the limiters to stop reading when the test receives 10 > elements > # or after 5 seconds have elapsed (to eliminate the possibility of > hanging). > ie.current_env().options.capture_control.set_limiters_for_test( > [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))]) > > # This tests that the data was correctly cached. > pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0) > expected_data_df = pd.DataFrame([ > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('or', 0, [IntervalWindow(0, 10)], pane_info), > ('not', 0, [IntervalWindow(0, 10)], pane_info), > ('to', 0, [IntervalWindow(0, 10)], pane_info), > ('be', 0, [IntervalWindow(0, 10)], pane_info), > ('that', 2000, [IntervalWindow(20, 30)], pane_info), > ('is', 2000, [IntervalWindow(20, 30)], pane_info), > ('the', 2000, [IntervalWindow(20, 30)], pane_info), > ('question', 2000, [IntervalWindow(20, 30)], pane_info) > ], columns=[0, 'event_time', 'windows', 'pane_info']) # yapf: disable > > > data_df = ib.collect(data, include_window_info=True) > apache_beam/runners/interactive/interactive_runner_test.py:237: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/runners/interactive/interactive_beam.py:451: in collect > return head(pcoll, n=-1, 
include_window_info=include_window_info) > apache_beam/runners/interactive/utils.py:204: in run_within_progress_indicator > return
[jira] [Commented] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations
[ https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105846#comment-17105846 ] Kyle Weaver commented on BEAM-9934: --- What's the status on this one? Are we going to try to fix this for 2.21.0, or wait until 2.22.0? > Resolve differences in beam:metric:element_count:v1 implementations > --- > > Key: BEAM-9934 > URL: https://issues.apache.org/jira/browse/BEAM-9934 > Project: Beam > Issue Type: Bug > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Major > Fix For: 2.21.0 > > > The [element > count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206] > metric represents the number of elements within a PCollection and is > interpreted differently across the Beam SDK versions. > In the [Java > SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207] > this represents the number of elements and includes how many windows those > elements are in. This metric is incremented as soon as the element has been > output. > In the [Python > SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247] > this represents the number of elements and doesn't include how many windows > those elements are in. The metric is also only incremented after the element > has finished processing. > The [Go > SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260] > does the same thing as Python. > Traditionally in Dataflow this has always been the exploded window element > count and the counter is incremented as soon as the element is output. -- This message was sent by Atlassian Jira (v8.3.4#803005)
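The divergence described above is easy to see with a toy bundle. This sketch is illustrative only (plain tuples stand in for Beam's WindowedValue; the function names are not Beam APIs): Java/Dataflow count one increment per (element, window) pair, while Python and Go count one increment per element regardless of windows.

```python
# Hypothetical sketch of the two element_count interpretations.
# Each bundle entry is (value, [windows it belongs to]); the window
# strings are stand-ins, not real IntervalWindow objects.

def count_elements(bundle):
    """Python/Go-style: one increment per element, windows ignored."""
    return len(bundle)

def count_window_exploded(bundle):
    """Java/Dataflow-style: one increment per (element, window) pair."""
    return sum(len(windows) for _, windows in bundle)

bundle = [("a", ["w1"]), ("b", ["w1", "w2"]), ("c", ["w2", "w3"])]

print(count_elements(bundle))         # 3
print(count_window_exploded(bundle))  # 5
```

The timing difference compounds this: Java increments on output, Python/Go only after the element finishes processing, so the counters can disagree even mid-bundle for identical data.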
[jira] [Commented] (BEAM-9935) Resolve differences in allowed_split_point implementations
[ https://issues.apache.org/jira/browse/BEAM-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105842#comment-17105842 ] Kyle Weaver commented on BEAM-9935: --- https://github.com/apache/beam/pull/11671 solved this for Python, which is all we needed in 2.21 IIUC. Moving fix version to 2.22 for Java and Go. > Resolve differences in allowed_split_point implementations > -- > > Key: BEAM-9935 > URL: https://issues.apache.org/jira/browse/BEAM-9935 > Project: Beam > Issue Type: Bug > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Robert Bradshaw >Priority: Blocker > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > [Java SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223] > doesn't support it yet, which is also safe. > [Go SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273] > only supports splits if points are specified and it doesn't use the fraction > at all. > [Python SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947] > ignores the split points, meaning that it may return an invalid split > location based upon the runner's limitations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9935) Resolve differences in allowed_split_point implementations
[ https://issues.apache.org/jira/browse/BEAM-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-9935: -- Fix Version/s: (was: 2.21.0) 2.22.0 > Resolve differences in allowed_split_point implementations > -- > > Key: BEAM-9935 > URL: https://issues.apache.org/jira/browse/BEAM-9935 > Project: Beam > Issue Type: Bug > Components: sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Luke Cwik >Assignee: Robert Bradshaw >Priority: Blocker > Fix For: 2.22.0 > > Time Spent: 20m > Remaining Estimate: 0h > > [Java SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223] > doesn't support it yet, which is also safe. > [Go SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273] > only supports splits if points are specified and it doesn't use the fraction > at all. > [Python SDK > harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947] > ignores the split points, meaning that it may return an invalid split > location based upon the runner's limitations. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver resolved BEAM-9945. --- Resolution: Fixed > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105836#comment-17105836 ] Kyle Weaver commented on BEAM-9975: --- The error message makes it impossible to tell which option is actually the problem. I'll try submitting a patch to the protobuf formatter. In the meantime, we could consider catching the error and printing the offending options. > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > 
in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
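The type dispatch in the traceback above can be reproduced without Beam. The following is a standalone, simplified mimic of protobuf json_format's Value conversion (not the real library code): a `struct_pb2.Value` can only hold dicts, lists, None, bools, strings, and numbers, so any pipeline option carrying some other type (e.g. bytes, or an arbitrary object) hits the "Unexpected type" branch. The `classify_value` helper and its return labels are illustrative, not part of any API.

```python
# Simplified, hypothetical mimic of protobuf json_format's Value dispatch.
# It only demonstrates which Python types are JSON-representable; the real
# _ConvertValueMessage sets fields on a protobuf Value message instead.

class ParseError(Exception):
    pass

def classify_value(value):
    if isinstance(value, dict):
        return "struct_value"
    elif isinstance(value, list):
        return "list_value"
    elif value is None:
        return "null_value"
    elif isinstance(value, bool):  # must precede int: bool is an int subclass
        return "bool_value"
    elif isinstance(value, str):
        return "string_value"
    elif isinstance(value, (int, float)):
        return "number_value"
    else:
        # Note the real error names neither the key nor the value,
        # which is exactly the diagnosability complaint in the comment.
        raise ParseError('Unexpected type for Value message.')

print(classify_value(True))    # bool_value (not number_value)
try:
    classify_value(b"bytes")   # bytes are not JSON-representable
except ParseError as e:
    print(e)
```

A patch that includes `repr(value)` (and ideally the offending struct key) in the raised message would make flakes like this one self-diagnosing.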
[jira] [Commented] (BEAM-9707) Hardcode runner harness container image for unified worker
[ https://issues.apache.org/jira/browse/BEAM-9707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105830#comment-17105830 ] Brian Hulette commented on BEAM-9707: - What happens if 2.21 isn't released before the 2.22 cut date? Can this just be pushed again to 2.23? > Hardcode runner harness container image for unified worker > -- > > Key: BEAM-9707 > URL: https://issues.apache.org/jira/browse/BEAM-9707 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, sdk-py-core >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.22.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Hardcode runner harness image temporarily to support usage of unified worker > on head. > Remove this hardcoding once 2.21 is released. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform
[ https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette resolved BEAM-9622. - Fix Version/s: 2.22.0 Resolution: Fixed > Support for consuming tagged PCollections in Python SqlTransform > > > Key: BEAM-9622 > URL: https://issues.apache.org/jira/browse/BEAM-9622 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > Fix For: 2.22.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kyle Weaver updated BEAM-9975: -- Description: Error looks similar to the one in BEAM-9907. Example from https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732 {code} apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ apache_beam/pipeline.py:550: in __exit__ self.run().wait_until_finish() apache_beam/pipeline.py:529: in run return self.runner.run_pipeline(self, self._options) apache_beam/runners/portability/portable_runner.py:426: in run_pipeline job_service_handle.submit(proto_pipeline) apache_beam/runners/portability/portable_runner.py:107: in submit prepare_response = self.prepare(proto_pipeline) apache_beam/runners/portability/portable_runner.py:184: in prepare pipeline_options=self.get_pipeline_options()), apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options return job_utils.dict_to_struct(p_options) apache_beam/runners/job/utils.py:33: in dict_to_struct return json_format.ParseDict(dict_obj, struct_pb2.Struct()) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: in ParseDict parser.ConvertMessage(js_dict, message) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: in ConvertMessage methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: in _ConvertStructMessage self._ConvertValueMessage(value[key], message.fields[key]) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = value = message = def _ConvertValueMessage(self, value, message): """Convert a JSON representation into Value message.""" if isinstance(value, dict): self._ConvertStructMessage(value, message.struct_value) 
elif isinstance(value, list): self. _ConvertListValueMessage(value, message.list_value) elif value is None: message.null_value = 0 elif isinstance(value, bool): message.bool_value = value elif isinstance(value, six.string_types): message.string_value = value elif isinstance(value, _INT_OR_FLOAT): message.number_value = value else: > raise ParseError('Unexpected type for Value message.') E google.protobuf.json_format.ParseError: Unexpected type for Value message. {code} was: Error looks similar to the one in BEAM-9907. Example from https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732: {code} apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ apache_beam/pipeline.py:550: in __exit__ self.run().wait_until_finish() apache_beam/pipeline.py:529: in run return self.runner.run_pipeline(self, self._options) apache_beam/runners/portability/portable_runner.py:426: in run_pipeline job_service_handle.submit(proto_pipeline) apache_beam/runners/portability/portable_runner.py:107: in submit prepare_response = self.prepare(proto_pipeline) apache_beam/runners/portability/portable_runner.py:184: in prepare pipeline_options=self.get_pipeline_options()), apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options return job_utils.dict_to_struct(p_options) apache_beam/runners/job/utils.py:33: in dict_to_struct return json_format.ParseDict(dict_obj, struct_pb2.Struct()) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: in ParseDict parser.ConvertMessage(js_dict, message) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: in ConvertMessage methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: in _ConvertStructMessage self._ConvertValueMessage(value[key], 
message.fields[key]) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = value = message = def _ConvertValueMessage(self, value, message): """Convert a JSON representation into Value message.""" if isinstance(value, dict): self._ConvertStructMessage(value, message.struct_value) elif isinstance(value, list): self. _ConvertListValueMessage(value, message.list_value) elif value is None: message.null_value = 0 elif isinstance(value, bool): message.bool_value = value elif isinstance(value, six.string_types): message.string_value = value elif isinstance(value, _INT_OR_FLOAT): message.number_value = value
[jira] [Assigned] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette reassigned BEAM-9975: --- Assignee: (was: Brian Hulette) > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732: > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > 
self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn
Boyuan Zhang created BEAM-9977: -- Summary: Build Kafka Read on top of Java SplittableDoFn Key: BEAM-9977 URL: https://issues.apache.org/jira/browse/BEAM-9977 Project: Beam Issue Type: New Feature Components: io-java-kafka Reporter: Boyuan Zhang Assignee: Boyuan Zhang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn
[ https://issues.apache.org/jira/browse/BEAM-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9977: -- Status: Open (was: Triage Needed) > Build Kafka Read on top of Java SplittableDoFn > -- > > Key: BEAM-9977 > URL: https://issues.apache.org/jira/browse/BEAM-9977 > Project: Beam > Issue Type: New Feature > Components: io-java-kafka >Reporter: Boyuan Zhang >Assignee: Boyuan Zhang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9488) Python SDK sending unexpected MonitoringInfo
[ https://issues.apache.org/jira/browse/BEAM-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-9488: Fix Version/s: (was: 2.22.0) 2.21.0 > Python SDK sending unexpected MonitoringInfo > > > Key: BEAM-9488 > URL: https://issues.apache.org/jira/browse/BEAM-9488 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ruoyun Huang >Assignee: Luke Cwik >Priority: Minor > Labels: portability > Fix For: 2.21.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > The element_count metric is supposed to be tied to pcollection ids, but by > inspecting what is sent over by the python sdk, we see there are MonitoringInfos > sent with ptransforms in them. > [Double checked the job graph, these seem to be redundant, i.e. the > corresponding pcollection does have its own MonitoringInfo reported.] > Likely a bug. > Proof: > urn: "beam:metric:element_count:v1" > type: "beam:metrics:sum_int_64" > metric { > counter_data { > int64_value: 1 > } > } > labels { > key: "PTRANSFORM" > value: "start/MaybeReshuffle/Reshuffle/RemoveRandomKeys-ptransform-85" > } > labels { > key: "TAG" > value: "None" > } > timestamp { > seconds: 1583949073 > nanos: 842402935 > } -- This message was sent by Atlassian Jira (v8.3.4#803005)
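One way to see the redundancy the report describes: element_count MonitoringInfos should carry a PCOLLECTION label, so the PTRANSFORM-labeled duplicates can be filtered out on the label key. This is a hedged sketch only, with MonitoringInfos modeled as plain dicts rather than the real protos, and the helper name is hypothetical, not an SDK function.

```python
# Hypothetical filter over MonitoringInfos represented as plain dicts
# (urn, labels, value) rather than real MonitoringInfo protos.

ELEMENT_COUNT_URN = "beam:metric:element_count:v1"

def drop_redundant_element_counts(monitoring_infos):
    """Keep element_count infos only when labeled with a PCOLLECTION;
    pass every other urn through untouched."""
    return [
        mi for mi in monitoring_infos
        if mi["urn"] != ELEMENT_COUNT_URN or "PCOLLECTION" in mi["labels"]
    ]

infos = [
    {"urn": ELEMENT_COUNT_URN, "labels": {"PCOLLECTION": "pcoll-1"},  # valid
     "value": 4},
    {"urn": ELEMENT_COUNT_URN,  # the redundant shape from the report
     "labels": {"PTRANSFORM":
                "start/MaybeReshuffle/Reshuffle/RemoveRandomKeys-ptransform-85",
                "TAG": "None"},
     "value": 1},
]

kept = drop_redundant_element_counts(infos)
print(len(kept))  # 1: only the PCOLLECTION-labeled counter survives
```

The actual fix belongs in the SDK harness's metric reporting, of course; filtering downstream only masks the extra infos.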
[jira] [Commented] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs
[ https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105750#comment-17105750 ] Robert Burke commented on BEAM-9959: The right overall fix for that is to check for cycles WRT the composites after the topological sort, and print out that there's a cycle involving the *composite* node represented by the scope. Anything without the full cycle is much harder to debug. Further, the individual PTransforms involved should be fully qualified with their composite parent hierarchies to make it easier to find where these are coming from, and the error should recommend either merging the two scopes or moving the new scope objects into their own functions, with one scope per function. This makes the bad construction impossible. > Mistakes Computing Composite Inputs and Outputs > --- > > Key: BEAM-9959 > URL: https://issues.apache.org/jira/browse/BEAM-9959 > Project: Beam > Issue Type: Bug > Components: sdk-go >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > > The Go SDK uses a Scope object to manage beam Composites. > A bug was discovered when consuming a PCollection in both the composite that > created it, and in a separate composite. > Further, the Go SDK should verify that the root hypergraph structure is a DAG > and provide a reasonable error. In particular, the leaf nodes of the graph > could form a DAG, but due to how the beam.Scope object is used, might cause > the hypergraph to not be a DAG. > E.g. it's possible to write the following in the Go SDK. > PTransforms A, B, C and PCollections colA, colB, and Composites a, b. > A and C are in a, and B are in b. > A generates colA > B consumes colA, and generates colB. > C consumes colA and colB. 
> ``` > a := s.Scope(a) > b := s.Scope(b) > colA := beam.Impulse(*a*) > colB := beam.ParDo(*b*, , colA) > beam.ParDo0(*a*, , colA, beam.SideInput{colB}) > ``` > If it doesn't already, the Go SDK must emit a clear error, and fail pipeline > construction. > If the affected composites are roots in the graph, the cycle prevents being > able to topologically sort the root ptransforms for the pipeline graph, which > can adversely affect runners. > The recommendation is always to wrap uses of scope in functions or other > scopes to prevent such incorrect constructions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
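The check proposed in the comment above (find cycles among composites after the topological sort) can be sketched language-agnostically. This is an illustrative Python sketch, not the Go SDK implementation: leaf-level data flow is lifted to composite-to-composite edges, Kahn's algorithm is run over them, and any composites left unsorted lie on a cycle. The a/b example above produces exactly the edges `a -> b` (colA feeds B) and `b -> a` (colB feeds C).

```python
# Illustrative cycle check over composite-level edges (hypothetical helper,
# not a Beam API). Returns the set of composites involved in a cycle,
# empty if the composite hypergraph is a DAG.

from collections import defaultdict, deque

def composite_cycle(edges):
    indegree = defaultdict(int)
    out = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        nodes.update((src, dst))
        out[src].append(dst)
        indegree[dst] += 1
    # Kahn's algorithm: repeatedly remove nodes with no remaining inputs.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for m in out[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    # Nodes never reaching indegree 0 are on (or downstream of) a cycle.
    return {n for n in nodes if indegree[n] > 0} if visited < len(nodes) else set()

# colA flows a -> b, colB flows b -> a: the two composites form a cycle,
# so neither can be topologically sorted and both are reported.
print(composite_cycle([("a", "b"), ("b", "a")]))
```

Reporting the whole leftover set, rather than the first unsortable node, is what lets the error name the full cycle as the comment recommends.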
[jira] [Created] (BEAM-9976) FlinkSavepointTest timeout flake
Kyle Weaver created BEAM-9976: - Summary: FlinkSavepointTest timeout flake Key: BEAM-9976 URL: https://issues.apache.org/jira/browse/BEAM-9976 Project: Beam Issue Type: Bug Components: runner-flink, test-failures Reporter: Kyle Weaver org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable org.junit.runners.model.TestTimedOutException: test timed out after 60 seconds at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:188) at org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable(FlinkSavepointTest.java:154) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105743#comment-17105743 ] Brian Hulette commented on BEAM-9975: - In BEAM-9907 the underlying issue was that PipelineOptions picked up sys.argv, which was the argv for the test framework, not Beam. I don't see any clear places where we could be creating PipelineOptions that read sys.argv in these PortableRunnerTest methods. > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732: > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > 
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in _ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
Brian Hulette created BEAM-9975: --- Summary: PortableRunnerTest flake "ParseError: Unexpected type for Value message." Key: BEAM-9975 URL: https://issues.apache.org/jira/browse/BEAM-9975 Project: Beam Issue Type: Bug Components: sdk-py-core Reporter: Brian Hulette Assignee: Brian Hulette Error looks similar to the one in BEAM-9907. Example from https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732: {code} apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ apache_beam/pipeline.py:550: in __exit__ self.run().wait_until_finish() apache_beam/pipeline.py:529: in run return self.runner.run_pipeline(self, self._options) apache_beam/runners/portability/portable_runner.py:426: in run_pipeline job_service_handle.submit(proto_pipeline) apache_beam/runners/portability/portable_runner.py:107: in submit prepare_response = self.prepare(proto_pipeline) apache_beam/runners/portability/portable_runner.py:184: in prepare pipeline_options=self.get_pipeline_options()), apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options return job_utils.dict_to_struct(p_options) apache_beam/runners/job/utils.py:33: in dict_to_struct return json_format.ParseDict(dict_obj, struct_pb2.Struct()) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: in ParseDict parser.ConvertMessage(js_dict, message) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: in ConvertMessage methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: in _ConvertStructMessage self._ConvertValueMessage(value[key], message.fields[key]) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = value = message = def _ConvertValueMessage(self, value, message): """Convert a JSON 
representation into Value message.""" if isinstance(value, dict): self._ConvertStructMessage(value, message.struct_value) elif isinstance(value, list): self. _ConvertListValueMessage(value, message.list_value) elif value is None: message.null_value = 0 elif isinstance(value, bool): message.bool_value = value elif isinstance(value, six.string_types): message.string_value = value elif isinstance(value, _INT_OR_FLOAT): message.number_value = value else: > raise ParseError('Unexpected type for Value message.') E google.protobuf.json_format.ParseError: Unexpected type for Value message. {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."
[ https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9975: -- Status: Open (was: Triage Needed) > PortableRunnerTest flake "ParseError: Unexpected type for Value message." > - > > Key: BEAM-9975 > URL: https://issues.apache.org/jira/browse/BEAM-9975 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > > Error looks similar to the one in BEAM-9907. Example from > https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732: > {code} > apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > apache_beam/pipeline.py:550: in __exit__ > self.run().wait_until_finish() > apache_beam/pipeline.py:529: in run > return self.runner.run_pipeline(self, self._options) > apache_beam/runners/portability/portable_runner.py:426: in run_pipeline > job_service_handle.submit(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:107: in submit > prepare_response = self.prepare(proto_pipeline) > apache_beam/runners/portability/portable_runner.py:184: in prepare > pipeline_options=self.get_pipeline_options()), > apache_beam/runners/portability/portable_runner.py:174: in > get_pipeline_options > return job_utils.dict_to_struct(p_options) > apache_beam/runners/job/utils.py:33: in dict_to_struct > return json_format.ParseDict(dict_obj, struct_pb2.Struct()) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450: > in ParseDict > parser.ConvertMessage(js_dict, message) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479: > in ConvertMessage > methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self) > target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667: > in 
_ConvertStructMessage > self._ConvertValueMessage(value[key], message.fields[key]) > _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > _ > self = > value = 0x7f69eb7b3ac8> > message = > def _ConvertValueMessage(self, value, message): > """Convert a JSON representation into Value message.""" > if isinstance(value, dict): > self._ConvertStructMessage(value, message.struct_value) > elif isinstance(value, list): > self. _ConvertListValueMessage(value, message.list_value) > elif value is None: > message.null_value = 0 > elif isinstance(value, bool): > message.bool_value = value > elif isinstance(value, six.string_types): > message.string_value = value > elif isinstance(value, _INT_OR_FLOAT): > message.number_value = value > else: > > raise ParseError('Unexpected type for Value message.') > E google.protobuf.json_format.ParseError: Unexpected type for Value > message. > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
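The failing branch in the traceback above is protobuf's JSON-to-`Value` type dispatch: any pipeline-option value that is not a dict, list, None, bool, string, or number falls through to the `ParseError`. A minimal sketch of that dispatch (a hypothetical re-implementation for illustration, not the protobuf source):

```python
# Hypothetical sketch of json_format._ConvertValueMessage's type dispatch.
# A pipeline option whose value is any other type (e.g. bytes, set, or a
# custom object) reaches the final branch and raises the error seen above.

class ParseError(Exception):
    pass

def classify_value(value):
    """Return the Struct Value field a Python object would be stored in."""
    if isinstance(value, dict):
        return "struct_value"
    if isinstance(value, list):
        return "list_value"
    if value is None:
        return "null_value"
    if isinstance(value, bool):  # must precede int: bool subclasses int
        return "bool_value"
    if isinstance(value, str):
        return "string_value"
    if isinstance(value, (int, float)):
        return "number_value"
    raise ParseError('Unexpected type for Value message.')
```

Under this reading, the flake likely depends on which option values reach `dict_to_struct`; a bytes value, for example, is not caught by any branch on Python 3.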
[jira] [Commented] (BEAM-9843) Flink UnboundedSourceWrapperTest flaky due to a timeout
[ https://issues.apache.org/jira/browse/BEAM-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105723#comment-17105723 ] Maximilian Michels commented on BEAM-9843: -- Thanks for reporting this. This should be fixed. Feel free to ping me here if it fails again. > Flink UnboundedSourceWrapperTest flaky due to a timeout > --- > > Key: BEAM-9843 > URL: https://issues.apache.org/jira/browse/BEAM-9843 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Chamikara Madhusanka Jayalath >Priority: Major > Fix For: Not applicable > > > For example, > [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/] > [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2684/] > [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2682/] > [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2680/] > [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/testReport/junit/org.apache.beam.runners.flink.translation.wrappers.streaming.io/UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest/testWatermarkEmission_numTasks___4__numSplits_4_/] > org.junit.runners.model.TestTimedOutException: test timed out after 3 > milliseconds at sun.misc.Unsafe.park(Native Method) at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) > at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at > org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest.testWatermarkEmission(UnboundedSourceWrapperTest.java:354) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) at > java.lang.Thread.run(Thread.java:748) > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
[ https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved BEAM-9164. -- Fix Version/s: Not applicable Resolution: Fixed This should be fixed but feel free to re-open if it pops up again. > [PreCommit_Java] [Flake] > UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest > --- > > Key: BEAM-9164 > URL: https://issues.apache.org/jira/browse/BEAM-9164 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kirill Kozlov >Priority: Critical > Labels: flake > Fix For: Not applicable > > > Test: > org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest > >> testWatermarkEmission[numTasks = 1; numSplits=1] > Fails with the following exception: > {code:java} > org.junit.runners.model.TestTimedOutException: test timed out after 3 > milliseconds{code} > Affected Jenkins job: > [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/] > Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions
[ https://issues.apache.org/jira/browse/BEAM-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9969: --- Description: There are four types of implementation of table-valued function: 1. FixedSchemaOutput. The output of the TVF is pre-defined. 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as input schema. 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF is the same as input schema plus columns appended. 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, etc.) was: There are three types of implementation of table-valued function: 1. FixedSchemaOutput. The output of the TVF is pre-defined. 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as input schema. 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF is the same as input schema plus columns appended. 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, etc.) > Beam ZetaSQL supports pure SQL user-defined table functions > --- > > Key: BEAM-9969 > URL: https://issues.apache.org/jira/browse/BEAM-9969 > Project: Beam > Issue Type: Task > Components: dsl-sql-zetasql >Reporter: Rui Wang >Priority: Major > > There are four types of implementation of table-valued function: > 1. FixedSchemaOutput. The output of the TVF is pre-defined. > 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as > input schema. > 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the > TVF is the same as input schema plus columns appended. > 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, > etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9974) beam_PostRelease_NightlySnapshot failing
Kyle Weaver created BEAM-9974: - Summary: beam_PostRelease_NightlySnapshot failing Key: BEAM-9974 URL: https://issues.apache.org/jira/browse/BEAM-9974 Project: Beam Issue Type: Bug Components: test-failures Reporter: Kyle Weaver Another failure mode: 07:02:29 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow 07:02:29 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project word-count-beam: An exception occured while executing the Java class. No filesystem found for scheme gs -> [Help 1] 07:02:29 [ERROR] 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug logging. 07:02:29 [ERROR] 07:02:29 [ERROR] For more information about the errors and possible solutions, please read the following articles: 07:02:29 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException 07:02:29 [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project word-count-beam: An exception occured while executing the Java class. No filesystem found for scheme gs -> [Help 1] 07:02:29 [ERROR] 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug logging. 07:02:29 [ERROR] 07:02:29 [ERROR] For more information about the errors and possible solutions, please read the following articles: 07:02:29 [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException 07:02:29 [ERROR] Failed command 07:02:29 07:02:29 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow FAILED -- This message was sent by Atlassian Jira (v8.3.4#803005)
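"No filesystem found for scheme gs" generally means no filesystem implementation for the `gs://` scheme was registered when the Java class ran (typically a missing GCS dependency on the classpath). The scheme-keyed lookup can be sketched as follows (hypothetical names, not Beam's actual `FileSystems` class):

```python
# Hypothetical sketch of scheme-based filesystem lookup. In Beam, registrars
# on the classpath populate the table; if the GCS registrar is absent, the
# "gs" scheme is never registered and any gs:// path fails as in this build.
REGISTERED = {"file": "LocalFileSystem"}  # gcs registrar missing

def filesystem_for(path):
    """Resolve a path to a registered filesystem by its URI scheme."""
    scheme = path.split("://", 1)[0] if "://" in path else "file"
    if scheme not in REGISTERED:
        raise RuntimeError("No filesystem found for scheme " + scheme)
    return REGISTERED[scheme]
```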
[jira] [Created] (BEAM-9973) beam_PostCommit_Py_ValCont flakes (before running tests)
Kyle Weaver created BEAM-9973: - Summary: beam_PostCommit_Py_ValCont flakes (before running tests) Key: BEAM-9973 URL: https://issues.apache.org/jira/browse/BEAM-9973 Project: Beam Issue Type: Bug Components: test-failures Reporter: Kyle Weaver 14:05:15 - Last 20 lines from daemon log file - daemon-2143.out.log - 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/metrics_pb2_urns.py 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_artifact_api_pb2_urns.py 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/standard_window_fns_pb2_urns.py 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_fn_api_pb2_urns.py 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_job_api_pb2_urns.py 14:05:15 INFO:gen_protos:Writing urn stubs: /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_runner_api_pb2_urns.py 14:05:15 warning: no files found matching 'README.md' 14:05:15 warning: no files found matching 'NOTICE' 14:05:15 warning: no files found matching 'LICENSE' 14:05:15 warning: sdist: standard file not found: should have one of README, README.rst, README.txt, README.md 14:05:15 14:05:15 Create distribution tar file apache-beam.tar.gz in /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/build 14:05:15 :sdks:python:sdist (Thread[Execution worker for ':',5,main]) completed. Took 7.453 secs. 
14:05:15 :sdks:python:container:py2:copyDockerfileDependencies (Thread[Execution worker for ':',5,main]) started. 14:05:15 Build cache key for task ':sdks:python:container:py2:copyDockerfileDependencies' is ea7f5d2ce156f4297b3b2c5c3f21611d 14:05:15 Caching disabled for task ':sdks:python:container:py2:copyDockerfileDependencies': Caching has not been enabled for the task 14:05:15 Task ':sdks:python:container:py2:copyDockerfileDependencies' is not up-to-date because: 14:05:15 No history is available. 14:05:15 :sdks:python:container:py2:copyDockerfileDependencies (Thread[Execution worker for ':',5,main]) completed. Took 0.012 secs. 14:05:15 Daemon vm is shutting down... The daemon has exited normally or was terminated in response to a user interrupt. 14:05:15 - End of the daemon log - 14:05:15 14:05:15 14:05:15 FAILURE: Build failed with an exception. 14:05:15 14:05:15 * What went wrong: 14:05:15 Gradle build daemon disappeared unexpectedly (it may have been killed or may have crashed) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9972) Test pure SQL UDF for all data types
Rui Wang created BEAM-9972: -- Summary: Test pure SQL UDF for all data types Key: BEAM-9972 URL: https://issues.apache.org/jira/browse/BEAM-9972 Project: Beam Issue Type: Task Components: dsl-sql-zetasql Reporter: Rui Wang The following are the data types that the UDF will support: * ARRAY * BOOL * BYTES * DATE * TIME * FLOAT64 * INT64 * NUMERIC * STRING * STRUCT * TIMESTAMP We should write unit tests against these types. -- This message was sent by Atlassian Jira (v8.3.4#803005)
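A per-type test could be parameterized over one sample SQL literal per listed type. The sketch below is hypothetical (`SAMPLE_LITERALS` and the query shape are illustrative; a real test would run each query through the Beam ZetaSQL planner and check the result):

```python
# One sample ZetaSQL literal per type listed in BEAM-9972; each query
# exercises the UDF once with that type as its argument.
SAMPLE_LITERALS = {
    "ARRAY": "ARRAY<INT64>[1, 2, 3]",
    "BOOL": "TRUE",
    "BYTES": "b'abc'",
    "DATE": "DATE '2020-05-12'",
    "TIME": "TIME '12:34:56'",
    "FLOAT64": "1.5",
    "INT64": "42",
    "NUMERIC": "NUMERIC '1.23'",
    "STRING": "'hello'",
    "STRUCT": "STRUCT(1 AS a, 'b' AS b)",
    "TIMESTAMP": "TIMESTAMP '2020-05-12 00:00:00'",
}

def udf_test_queries(udf_name):
    """Yield (type, query) pairs covering the UDF once per supported type."""
    for type_name, literal in SAMPLE_LITERALS.items():
        yield type_name, "SELECT {}({})".format(udf_name, literal)
```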
[jira] [Updated] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions
[ https://issues.apache.org/jira/browse/BEAM-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9969: --- Status: Open (was: Triage Needed) > Beam ZetaSQL supports pure SQL user-defined table functions > --- > > Key: BEAM-9969 > URL: https://issues.apache.org/jira/browse/BEAM-9969 > Project: Beam > Issue Type: Task > Components: dsl-sql-zetasql >Reporter: Rui Wang >Priority: Major > > There are four types of implementation of table-valued function: > 1. FixedSchemaOutput. The output of the TVF is pre-defined. > 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as > input schema. > 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the > TVF is the same as input schema plus columns appended. > 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, > etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)
[ https://issues.apache.org/jira/browse/BEAM-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9971: -- Status: Open (was: Triage Needed) > beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file) > -- > > Key: BEAM-9971 > URL: https://issues.apache.org/jira/browse/BEAM-9971 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > > This happens sporadically. One time the issue affected 14 tests; another time > it affected 112 tests. > java.lang.RuntimeException: The Runner experienced the following error during > execution: > java.io.FileNotFoundException: > /tmp/spark-0812a463-8d6b-4c97-be4b-de43baf67108/userFiles-b90ca2e1-2041-442d-ae78-c8e9c30bff49/beam-runners-spark-2.22.0-SNAPSHOT.jar > (No such file or directory) > at > org.apache.beam.runners.portability.JobServicePipelineResult.propagateErrors(JobServicePipelineResult.java:165) > at > org.apache.beam.runners.portability.JobServicePipelineResult.waitUntilFinish(JobServicePipelineResult.java:110) > at > org.apache.beam.runners.portability.testing.TestPortableRunner.run(TestPortableRunner.java:83) > at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) > at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350) > at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331) > at > org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics(MetricsPusherTest.java:70) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) > at org.junit.runners.ParentRunner.run(ParentRunner.java:412) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) > at > org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) > at > org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) > at > org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) > at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:498) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) > at > org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) > at > org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) > at > org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) > at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) >
[jira] [Created] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)
Kyle Weaver created BEAM-9971: - Summary: beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file) Key: BEAM-9971 URL: https://issues.apache.org/jira/browse/BEAM-9971 Project: Beam Issue Type: Bug Components: test-failures Reporter: Kyle Weaver Assignee: Kyle Weaver This happens sporadically. One time the issue affected 14 tests; another time it affected 112 tests. java.lang.RuntimeException: The Runner experienced the following error during execution: java.io.FileNotFoundException: /tmp/spark-0812a463-8d6b-4c97-be4b-de43baf67108/userFiles-b90ca2e1-2041-442d-ae78-c8e9c30bff49/beam-runners-spark-2.22.0-SNAPSHOT.jar (No such file or directory) at org.apache.beam.runners.portability.JobServicePipelineResult.propagateErrors(JobServicePipelineResult.java:165) at org.apache.beam.runners.portability.JobServicePipelineResult.waitUntilFinish(JobServicePipelineResult.java:110) at org.apache.beam.runners.portability.testing.TestPortableRunner.run(TestPortableRunner.java:83) at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317) at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350) at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331) at org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics(MetricsPusherTest.java:70) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305) at org.junit.runners.ParentRunner.run(ParentRunner.java:412) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58) at org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38) at org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62) at org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51) at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35) at org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24) at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32) at org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93) at com.sun.proxy.$Proxy2.processTestClass(Unknown Source) at org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118) at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
[jira] [Created] (BEAM-9970) beam_PostRelease_NightlySnapshot failing
Kyle Weaver created BEAM-9970: - Summary: beam_PostRelease_NightlySnapshot failing Key: BEAM-9970 URL: https://issues.apache.org/jira/browse/BEAM-9970 Project: Beam Issue Type: Bug Components: test-failures Reporter: Kyle Weaver 07:44:55 gcloud dataflow jobs cancel $(gcloud dataflow jobs list | grep leaderboard-validation-1588332285874-332 | grep Running | cut -d' ' -f1) 07:44:59 ERROR: (gcloud.dataflow.jobs.cancel) argument JOB_ID [JOB_ID ...]: Must be specified. 07:44:59 Usage: gcloud dataflow jobs cancel JOB_ID [JOB_ID ...] [optional flags] 07:44:59 optional flags may be --help | --region 07:44:59 07:44:59 For detailed information on this command and its flags, run: 07:44:59 gcloud dataflow jobs cancel --help 07:44:59 ERROR: (gcloud.dataflow.jobs.cancel) argument JOB_ID [JOB_ID ...]: Must be specified. 07:44:59 Usage: gcloud dataflow jobs cancel JOB_ID [JOB_ID ...] [optional flags] 07:44:59 optional flags may be --help | --region 07:44:59 07:44:59 For detailed information on this command and its flags, run: 07:44:59 gcloud dataflow jobs cancel --help 07:45:00 [ERROR] Failed command 07:45:00 07:45:00 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow FAILED -- This message was sent by Atlassian Jira (v8.3.4#803005)
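The cancel fails because the `grep` pipeline matched no Running job, so the `$(...)` substitution expanded to nothing and `gcloud` received no JOB_ID. A guard can be sketched as follows (a hypothetical helper, shown in Python rather than the release script's bash):

```python
# Build the cancel command only when the job lookup actually returned an id.
# An empty expansion is exactly what produces
# "argument JOB_ID [JOB_ID ...]: Must be specified" in the log above.
def cancel_command(job_id):
    job_id = (job_id or "").strip()
    if not job_id:
        return None  # nothing matched: skip the cancel instead of calling gcloud
    return ["gcloud", "dataflow", "jobs", "cancel", job_id]
```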
[jira] [Assigned] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
[ https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles reassigned BEAM-9968: - Assignee: Luke Cwik > beam_PreCommit_Java_Cron > org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit > Failure > -- > > Key: BEAM-9968 > URL: https://issues.apache.org/jira/browse/BEAM-9968 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kenneth Knowles >Assignee: Luke Cwik >Priority: Critical > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] > * [https://gradle.com/s/ij5a4z7nhwjn2] > * > [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > Failure is: `AssertionError` since an assertion was used without a meaningful > error message. It is asserting that the splits are non-empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
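The failure mode called out above (a bare assertion, so the report says only `AssertionError`) and its fix can be illustrated generically; the sketch below is Python for illustration, while the actual test is JUnit Java:

```python
# A bare "assert splits" fails with no context, as in this Jenkins run.
# Attaching a message makes the failure self-describing.
def check_nonempty(splits):
    assert splits, "expected at least one split, got {!r}".format(splits)
```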
[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
[ https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105664#comment-17105664 ] Kenneth Knowles commented on BEAM-9968: --- Actually stderr is flooded with a bunch of errors. Not totally clear to me whether those errors are just the blast radius of the assertion error. > beam_PreCommit_Java_Cron > org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit > Failure > -- > > Key: BEAM-9968 > URL: https://issues.apache.org/jira/browse/BEAM-9968 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kenneth Knowles >Priority: Critical > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] > * [https://gradle.com/s/ij5a4z7nhwjn2] > * > [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > Failure is: `AssertionError` since an assertion was used without a meaningful > error message. It is asserting that the splits are non-empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions
Rui Wang created BEAM-9969: -- Summary: Beam ZetaSQL supports pure SQL user-defined table functions Key: BEAM-9969 URL: https://issues.apache.org/jira/browse/BEAM-9969 Project: Beam Issue Type: Task Components: dsl-sql-zetasql Reporter: Rui Wang There are four types of implementation of table-valued function: 1. FixedSchemaOutput. The output of the TVF is pre-defined. 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as input schema. 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF is the same as input schema plus columns appended. 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, etc.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform
[ https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-9622 started by Brian Hulette. --- > Support for consuming tagged PCollections in Python SqlTransform > > > Key: BEAM-9622 > URL: https://issues.apache.org/jira/browse/BEAM-9622 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform
[ https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brian Hulette reassigned BEAM-9622: --- Assignee: Brian Hulette > Support for consuming tagged PCollections in Python SqlTransform > > > Key: BEAM-9622 > URL: https://issues.apache.org/jira/browse/BEAM-9622 > Project: Beam > Issue Type: Improvement > Components: dsl-sql, sdk-py-core >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
[ https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105661#comment-17105661 ] Kenneth Knowles commented on BEAM-9968: --- I did not see an obvious suspect commit. It has failed twice in a row now. > beam_PreCommit_Java_Cron > org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit > Failure > -- > > Key: BEAM-9968 > URL: https://issues.apache.org/jira/browse/BEAM-9968 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kenneth Knowles >Priority: Critical > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] > * [https://gradle.com/s/ij5a4z7nhwjn2] > * > [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > Failure is: `AssertionError` since an assertion was used without a meaningful > error message. It is asserting that the splits are non-empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
[ https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9968: -- Description: _Use this form to file an issue for test failure:_ * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] * [https://gradle.com/s/ij5a4z7nhwjn2] * [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) Failure is: `AssertionError` since an assertion was used without a meaningful error message. It is asserting that the splits are non-empty. was: _Use this form to file an issue for test failure:_ * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] * [https://gradle.com/s/ij5a4z7nhwjn2] * [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > beam_PreCommit_Java_Cron > org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit > Failure > -- > > Key: BEAM-9968 > URL: https://issues.apache.org/jira/browse/BEAM-9968 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kenneth Knowles >Priority: Critical > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] > * [https://gradle.com/s/ij5a4z7nhwjn2] > * > [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > Failure is: `AssertionError` since an assertion was used without a meaningful > error message. It is asserting that the splits are non-empty. 
[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
[ https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105660#comment-17105660 ] Kenneth Knowles commented on BEAM-9968: --- CC [~lcwik] and [~chamikara]. Any ideas on this one? > beam_PreCommit_Java_Cron > org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit > Failure > -- > > Key: BEAM-9968 > URL: https://issues.apache.org/jira/browse/BEAM-9968 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Kenneth Knowles >Priority: Critical > Labels: currently-failing > > _Use this form to file an issue for test failure:_ > * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] > * [https://gradle.com/s/ij5a4z7nhwjn2] > * > [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) > Failure is: `AssertionError` since an assertion was used without a meaningful > error message. It is asserting that the splits are non-empty. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure
Kenneth Knowles created BEAM-9968: - Summary: beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure Key: BEAM-9968 URL: https://issues.apache.org/jira/browse/BEAM-9968 Project: Beam Issue Type: Bug Components: test-failures Reporter: Kenneth Knowles _Use this form to file an issue for test failure:_ * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/] * [https://gradle.com/s/ij5a4z7nhwjn2] * [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358) -- This message was sent by Atlassian Jira (v8.3.4#803005)
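[Editor's note] The failure above is hard to triage precisely because the test asserts with no message. A minimal sketch of the fix the report implies, in Python for brevity; the function and variable names (`check_splits_*`, `splits`) are illustrative, not taken from RemoteExecutionTest:

```python
def check_splits_bare(splits):
    # Fails with just "AssertionError" -- no context in the CI log,
    # which is exactly the situation described in the report.
    assert len(splits) > 0

def check_splits_with_message(splits):
    # Fails with an explanation, so the Jenkins log is actionable.
    assert len(splits) > 0, (
        "expected at least one split, got none; splits=%r" % (splits,))
```

With a message attached, the console log of a flaky run states what was expected instead of a bare stack trace.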
[jira] [Updated] (BEAM-9954) Beam ZetaSQL supports pure SQL user-defined aggregation functions
[ https://issues.apache.org/jira/browse/BEAM-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9954: --- Description: One naive example is {code:sql} CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a)); {code} > Beam ZetaSQL supports pure SQL user-defined aggregation functions > - > > Key: BEAM-9954 > URL: https://issues.apache.org/jira/browse/BEAM-9954 > Project: Beam > Issue Type: Task > Components: dsl-sql-zetasql >Reporter: Rui Wang >Priority: Major > > One naive example is > {code:sql} > CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a)); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9954) Beam ZetaSQL supports pure SQL user-defined aggregation functions
[ https://issues.apache.org/jira/browse/BEAM-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Wang updated BEAM-9954: --- Description: One naive example is {code:sql} CREATE AGGREGATE FUNCTION fun (a INT64, b INT64) AS (SUM(a)/AVG(b)); {code} was: One naive example is {code:sql} CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a)); {code} > Beam ZetaSQL supports pure SQL user-defined aggregation functions > - > > Key: BEAM-9954 > URL: https://issues.apache.org/jira/browse/BEAM-9954 > Project: Beam > Issue Type: Task > Components: dsl-sql-zetasql >Reporter: Rui Wang >Priority: Major > > One naive example is > {code:sql} > CREATE AGGREGATE FUNCTION fun (a INT64, b INT64) AS (SUM(a)/AVG(b)); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
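[Editor's note] To make the requested semantics concrete, here is what `CREATE AGGREGATE FUNCTION fun (a INT64, b INT64) AS (SUM(a)/AVG(b))` computes, sketched with Python's `sqlite3` user-defined aggregates as a stand-in (this is not Beam ZetaSQL code; table and function names are illustrative):

```python
import sqlite3

class SumOverAvg:
    """Accumulates SUM(a) and AVG(b), returning SUM(a)/AVG(b)."""
    def __init__(self):
        self.sum_a = 0
        self.sum_b = 0
        self.count = 0

    def step(self, a, b):
        self.sum_a += a
        self.sum_b += b
        self.count += 1

    def finalize(self):
        avg_b = self.sum_b / self.count
        return self.sum_a / avg_b

conn = sqlite3.connect(":memory:")
conn.create_aggregate("fun", 2, SumOverAvg)
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, 4), (5, 6)])
(result,) = conn.execute("SELECT fun(a, b) FROM t").fetchone()
print(result)  # SUM(a)=9, AVG(b)=4 -> 2.25
```

The appeal of the pure-SQL form in the issue is that no such handwritten accumulator class is needed at all.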
[jira] [Updated] (BEAM-9967) Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms
[ https://issues.apache.org/jira/browse/BEAM-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9967: -- Status: Open (was: Triage Needed) > Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms > > > Key: BEAM-9967 > URL: https://issues.apache.org/jira/browse/BEAM-9967 > Project: Beam > Issue Type: Bug > Components: io-py-gcp >Reporter: Pablo Estrada >Assignee: Pablo Estrada >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9967) Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms
Pablo Estrada created BEAM-9967: --- Summary: Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms Key: BEAM-9967 URL: https://issues.apache.org/jira/browse/BEAM-9967 Project: Beam Issue Type: Bug Components: io-py-gcp Reporter: Pablo Estrada Assignee: Pablo Estrada -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9807) PostRelease_NightlySnapshot failing due to missing "region" flag
[ https://issues.apache.org/jira/browse/BEAM-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9807: -- Priority: Critical (was: Major) > PostRelease_NightlySnapshot failing due to missing "region" flag > > > Key: BEAM-9807 > URL: https://issues.apache.org/jira/browse/BEAM-9807 > Project: Beam > Issue Type: Bug > Components: test-failures >Reporter: Chamikara Madhusanka Jayalath >Assignee: Kyle Weaver >Priority: Critical > Fix For: 2.22.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > [https://builds.apache.org/job/beam_PostRelease_NightlySnapshot/959/] > [https://builds.apache.org/job/beam_PostRelease_NightlySnapshot/958/] > [https://scans.gradle.com/s/tth6vapotb5vw/console-log?task=:runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow] > [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java > (default-cli) on project word-count-beam: An exception occured while > executing the Java class. Failed to construct instance from factory method > DataflowRunner#fromOptions(interface > org.apache.beam.sdk.options.PipelineOptions): InvocationTargetException: > Missing required values: region -> [Help 1] > > Probably due to [https://github.com/apache/beam/pull/11281]. > > Kyle, can you take a look ? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9760) KafkaIO supports consumer group?
[ https://issues.apache.org/jira/browse/BEAM-9760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105641#comment-17105641 ] Alexey Romanenko commented on BEAM-9760: Hi [~wongkawah], thanks for details. I see your point but I'm not sure that we can use "subscribe()" method in KafkaIO because of some internal Beam limitations. We had a similar discussion a while ago ([see here|https://issues.apache.org/jira/browse/BEAM-5786?focusedCommentId=16655883=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16655883]) > KafkaIO supports consumer group? > > > Key: BEAM-9760 > URL: https://issues.apache.org/jira/browse/BEAM-9760 > Project: Beam > Issue Type: Improvement > Components: io-java-kafka >Reporter: Ka Wah WONG >Priority: Minor > > It seems only assign method of Kafka Consumer class is called in > org.apache.beam.sdk.io.kafka.ConsumerSpEL class. According to documentation > of org.apache.kafka.clients.consumer.KafkaConsumer, manual topic assignment > through this assign method does not use the consumer's group management > functionality. > May I ask if KafkaIO will be enhanced to support consumer's group management > with using Kafka consumer's subscribe method? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests
[ https://issues.apache.org/jira/browse/BEAM-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9966: -- Status: Open (was: Triage Needed) > Investigate variance in checkpoint duration of ParDo streaming tests > > > Key: BEAM-9966 > URL: https://issues.apache.org/jira/browse/BEAM-9966 > Project: Beam > Issue Type: Bug > Components: build-system, runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > > We need to take a closer look at the variance in checkpoint duration which, > for different test runs, fluctuates between one second and one minute. > https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests
Maximilian Michels created BEAM-9966: Summary: Investigate variance in checkpoint duration of ParDo streaming tests Key: BEAM-9966 URL: https://issues.apache.org/jira/browse/BEAM-9966 Project: Beam Issue Type: Bug Components: build-system, runner-flink Reporter: Maximilian Michels Assignee: Maximilian Michels We need to take a closer look at the variance in checkpoint duration which, for different test runs, fluctuates between one second and one minute. https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.
[ https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105629#comment-17105629 ] Luke Cwik commented on BEAM-6327: - I'm not sure if we want to add it to the fuser since the pipeline runner may want to choose the order in which they perform transform expansion/replacement. > Don't attempt to fuse subtransforms of primitive/known transforms. > -- > > Key: BEAM-6327 > URL: https://issues.apache.org/jira/browse/BEAM-6327 > Project: Beam > Issue Type: New Feature > Components: runner-direct >Reporter: Robert Bradshaw >Assignee: Kyle Weaver >Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > > Currently we must remove all sub-components of any known transform that may > have an optional substructure, e.g. > [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126] > (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.
[ https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105627#comment-17105627 ] Luke Cwik commented on BEAM-6327: - I think this shouldn't be considered "trimming" but native transform replacement and have crafted https://github.com/apache/beam/pull/11670 which repurposes this as such. The version that was in the trimmer was a trivial implementation that covers many scenarios where the runner doesn't need custom replacement/expansion logic. > Don't attempt to fuse subtransforms of primitive/known transforms. > -- > > Key: BEAM-6327 > URL: https://issues.apache.org/jira/browse/BEAM-6327 > Project: Beam > Issue Type: New Feature > Components: runner-direct >Reporter: Robert Bradshaw >Assignee: Kyle Weaver >Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > > Currently we must remove all sub-components of any known transform that may > have an optional substructure, e.g. > [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126] > (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.
[ https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-6327: --- Assignee: Kyle Weaver (was: Luke Cwik) > Don't attempt to fuse subtransforms of primitive/known transforms. > -- > > Key: BEAM-6327 > URL: https://issues.apache.org/jira/browse/BEAM-6327 > Project: Beam > Issue Type: New Feature > Components: runner-direct >Reporter: Robert Bradshaw >Assignee: Kyle Weaver >Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > > Currently we must remove all sub-components of any known transform that may > have an optional substructure, e.g. > [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126] > (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.
[ https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-6327: --- Assignee: Luke Cwik (was: Kyle Weaver) > Don't attempt to fuse subtransforms of primitive/known transforms. > -- > > Key: BEAM-6327 > URL: https://issues.apache.org/jira/browse/BEAM-6327 > Project: Beam > Issue Type: New Feature > Components: runner-direct >Reporter: Robert Bradshaw >Assignee: Luke Cwik >Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > > Currently we must remove all sub-components of any known transform that may > have an optional substructure, e.g. > [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126] > (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag
[ https://issues.apache.org/jira/browse/BEAM-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105620#comment-17105620 ] Maximilian Michels commented on BEAM-9900: -- For context: https://github.com/apache/beam/pull/11558/commits/d106f263e625e5d7c4f3848f16da301871f65142 > Remove the need for shutdownSourcesOnFinalWatermark flag > > > Key: BEAM-9900 > URL: https://issues.apache.org/jira/browse/BEAM-9900 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.22.0 > > > The {{shutdownSourcesOnFinalWatermark}} has caused some confusion in the > past. It is generally used for testing pipelines to ensure that the pipeline > and the testing cluster shuts down at the end of the job. Without it, the > pipeline will run forever in streaming mode, regardless of whether the input > is finite or not. > We didn't want to enable the flag by default because shutting down any > operators including sources in Flink will prevent checkpointing from working. > If we have side input, for example, that may be the case even for > long-running pipelines. However, we can detect whether checkpointing is > enabled and set the flag automatically. > The only situation where we may want the flag to be disabled is when users do > not have checkpointing enabled but want to take a savepoint. This should be > rare and users can mitigate by setting the flag to false to prevent operators > from shutting down. -- This message was sent by Atlassian Jira (v8.3.4#803005)
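[Editor's note] The defaulting rule proposed in the issue can be sketched as follows. This is a hedged illustration of the described behavior, not the actual Flink runner code:

```python
def shutdown_sources(checkpointing_enabled, user_flag=None):
    """Derive shutdownSourcesOnFinalWatermark automatically.

    Illustrative only: an explicit user setting always wins, e.g. a user
    without checkpointing who still wants to take a savepoint can force
    the flag to False to keep operators alive.
    """
    if user_flag is not None:
        return user_flag
    # Shutting down operators breaks Flink checkpointing, so only shut
    # sources down automatically when checkpointing is off.
    return not checkpointing_enabled

print(shutdown_sources(checkpointing_enabled=False))  # True
print(shutdown_sources(checkpointing_enabled=True))   # False
```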
[jira] [Created] (BEAM-9965) Explain direct runner running modes
Kyle Weaver created BEAM-9965: - Summary: Explain direct runner running modes Key: BEAM-9965 URL: https://issues.apache.org/jira/browse/BEAM-9965 Project: Beam Issue Type: Improvement Components: website Reporter: Kyle Weaver We should add reasoning for which direct_running_mode a user should choose. https://beam.apache.org/documentation/runners/direct/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs
[ https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Burke updated BEAM-9959: --- Description: The Go SDK uses a Scope object to manage beam Composites. A bug was discovered when consuming a PCollection in both the composite that created it, and in a separate composite. Further, the Go SDK should verify that the root hypergraph structure is a DAG and provides a reasonable error. In particular, the leaf nodes of the graph could form a DAG, but due to how the beam.Scope object is used, might cause the hypergraph to not be a DAG. Eg. It's possible to write the following in the Go SDK. PTransforms A, B, C and PCollections colA, colB, and Composites a, b. A and C are in a, and B are in b. A generates colA B consumes colA, and generates colB. C consumes colA and colB. ``` a := s.Scope(a) b := s.Scope(b) colA := beam.Impulse(*a*) colB := beam.ParDo(*b*, , colA) beam.ParDo0(*a*, , colA, beam.SideInput{colB}) ``` If it doesn't already, the Go SDK must emit a clear error, and fail pipeline construction. If the affected composites are roots in the graph, the cycle prevents being able to topologically sort the root ptransforms for the pipeline graph, which can adversely affect runners. The recommendation is always to wrap uses of scope in functions or other scopes to prevent such incorrect constructions. was: The Go SDK uses a Scope object to manage beam Composites. A bug was discovered when consuming a PCollection in both the composite that created it, and in a separate composite. Further, the Go SDK should verify that the root hypergraph structure is a DAG and provides a reasonable error. In particular, the leaf nodes of the graph could form a DAG, but due to how the beam.Scope object is used, might cause the hypergraph to not be a DAG. Eg. It's possible to write the following in the Go SDK. PTransforms A, B, C and PCollections colA, colB, and Composites a, b. A and C are in a, and B are in b. 
A generates colA B consumes colA, and generates colB. C consumes colB. ``` a := s.Scope(a) b := s.Scope(b) colA := beam.Impulse(*a*) colB := beam.ParDo(*b*, , colA) beam.ParDo0(*a*, , colA) ``` If it doesn't already the Go SDK must emit a clear error, and fail pipeline construction. If the affected composites are roots in the graph, the cycle prevents being able to topologically sort the root ptransforms for the pipeline graph, which can adversely affect runners. The recommendation is always to wrap uses of scope in functions or other scopes to prevent such incorrect constructions. > Mistakes Computing Composite Inputs and Outputs > --- > > Key: BEAM-9959 > URL: https://issues.apache.org/jira/browse/BEAM-9959 > Project: Beam > Issue Type: Bug > Components: sdk-go >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > > The Go SDK uses a Scope object to manage beam Composites. > A bug was discovered when consuming a PCollection in both the composite that > created it, and in a separate composite. > Further, the Go SDK should verify that the root hypergraph structure is a DAG > and provides a reasonable error. In particular, the leaf nodes of the graph > could form a DAG, but due to how the beam.Scope object is used, might cause > the hypergraph to not be a DAG. > Eg. It's possible to write the following in the Go SDK. > PTransforms A, B, C and PCollections colA, colB, and Composites a, b. > A and C are in a, and B are in b. > A generates colA > B consumes colA, and generates colB. > C consumes colA and colB. > ``` > a := s.Scope(a) > b := s.Scope(b) > colA := beam.Impulse(*a*) > colB := beam.ParDo(*b*, , colA) > beam.ParDo0(*a*, , colA, beam.SideInput{colB}) > ``` > If it doesn't already, the Go SDK must emit a clear error, and fail pipeline > construction. > If the affected composites are roots in the graph, the cycle prevents being > able to topologically sort the root ptransforms for the pipeline graph, which > can adversely affect runners. 
> The recommendation is always to wrap uses of scope in functions or other > scopes to prevent such incorrect constructions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
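[Editor's note] The topological-sort failure described above can be demonstrated with a small Kahn's-algorithm sketch (plain Python, not the Go SDK's actual graph code). Composites `a` and `b` depend on each other — `a` produces colA consumed by `b`, and `b` produces colB consumed back inside `a` — so no ordering of the root transforms exists:

```python
from collections import defaultdict, deque

def topo_sort(nodes, edges):
    # Kahn's algorithm: repeatedly emit nodes with no remaining inputs.
    adj = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        adj[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in adj[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        # Leftover nodes all sit on a cycle: fail loudly, as the issue asks.
        raise ValueError(
            "cycle among composites: %r" % sorted(set(nodes) - set(order)))
    return order

# A valid DAG of composites sorts fine:
print(topo_sort(["a", "b"], [("a", "b")]))  # ['a', 'b']

# The construction from the issue: a feeds b, and b feeds back into a.
try:
    topo_sort(["a", "b"], [("a", "b"), ("b", "a")])
except ValueError as e:
    print(e)
```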
[jira] [Updated] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
[ https://issues.apache.org/jira/browse/BEAM-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omar Ismail updated BEAM-9964: -- Description: Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to make it allowable to change the cache value in Streaming when setting -workerCacheMB. I've never made changes to the Beam SDK, so I am super excited to work on this! [[1] https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] was: Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to make it allowable to change the cache value in Streaming when setting --workerCacheMB. I've never made changes to the Beam SDK, so I am super excited to work on this! [[1] https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] > Setting workerCacheMb to make its way to the WindmillStateCache Constructor > --- > > Key: BEAM-9964 > URL: https://issues.apache.org/jira/browse/BEAM-9964 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Omar Ismail >Priority: Minor > > Setting --workerCacheMB seems to affect batch pipelines only. 
For Streaming, > the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to > make it allowable to change the cache value in Streaming when setting > -workerCacheMB. > I've never made changes to the Beam SDK, so I am super excited to work on > this! > > [[1] > https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor
Omar Ismail created BEAM-9964: - Summary: Setting workerCacheMb to make its way to the WindmillStateCache Constructor Key: BEAM-9964 URL: https://issues.apache.org/jira/browse/BEAM-9964 Project: Beam Issue Type: Improvement Components: runner-dataflow Reporter: Omar Ismail Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to make it allowable to change the cache value in Streaming when setting --workerCacheMB. I've never made changes to the Beam SDK, so I am super excited to work on this! [[1] https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73] -- This message was sent by Atlassian Jira (v8.3.4#803005)
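[Editor's note] The change being requested is to plumb the option through to the constructor instead of the hardcoded 100 MB. A minimal sketch of that plumbing, in Python for brevity; the class and option names mirror the issue but are illustrative, not the actual Dataflow worker code:

```python
DEFAULT_CACHE_MB = 100  # mirrors the hardcoded value the reporter points at

class WindmillStateCacheSketch:
    """Stand-in for the real cache: only records its configured size."""
    def __init__(self, size_mb):
        self.max_weight_bytes = size_mb * 1024 * 1024

def build_cache(options):
    # Fall back to the old 100 MB default when the option is unset, so
    # existing streaming pipelines keep their current behavior.
    size_mb = options.get("workerCacheMb", DEFAULT_CACHE_MB)
    return WindmillStateCacheSketch(size_mb)

print(build_cache({}).max_weight_bytes)                      # 104857600
print(build_cache({"workerCacheMb": 200}).max_weight_bytes)  # 209715200
```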
[jira] [Commented] (BEAM-9945) Use consistent element count for progress counter.
[ https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105512#comment-17105512 ] Luke Cwik commented on BEAM-9945: - PR 11652 needs to be cherry picked into 2.21 release branch > Use consistent element count for progress counter. > -- > > Key: BEAM-9945 > URL: https://issues.apache.org/jira/browse/BEAM-9945 > Project: Beam > Issue Type: Improvement > Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Robert Bradshaw >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This influences how the SDK communicates progress and how splitting is > performed. > We currently inspect the element count of the output (which has inconsistent > definitions across SDKs, see BEAM-9934). Instead, we can move to using an > explicit, separate metric. > This can currently lead to incorrect progress reporting and in some cases > even a crash in the UW depending on the SDK, and makes it more difficult (but > not impossible) to fix element counts in the future. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-8133) Create metrics publisher in Java SDK
[ https://issues.apache.org/jira/browse/BEAM-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski reassigned BEAM-8133: -- Assignee: Pawel Pasterz > Create metrics publisher in Java SDK > > > Key: BEAM-8133 > URL: https://issues.apache.org/jira/browse/BEAM-8133 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Pawel Pasterz >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8132) Create metrics publisher in Python SDK
[ https://issues.apache.org/jira/browse/BEAM-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski closed BEAM-8132. -- > Create metrics publisher in Python SDK > -- > > Key: BEAM-8132 > URL: https://issues.apache.org/jira/browse/BEAM-8132 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Fix For: Not applicable > > Time Spent: 6h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-8132) Create metrics publisher in Python SDK
[ https://issues.apache.org/jira/browse/BEAM-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski resolved BEAM-8132. Fix Version/s: Not applicable Resolution: Fixed > Create metrics publisher in Python SDK > -- > > Key: BEAM-8132 > URL: https://issues.apache.org/jira/browse/BEAM-8132 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Fix For: Not applicable > > Time Spent: 6h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8133) Create metrics publisher in Java SDK
[ https://issues.apache.org/jira/browse/BEAM-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski closed BEAM-8133. -- Fix Version/s: Not applicable Resolution: Fixed > Create metrics publisher in Java SDK > > > Key: BEAM-8133 > URL: https://issues.apache.org/jira/browse/BEAM-8133 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Pawel Pasterz >Priority: Major > Fix For: Not applicable > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (BEAM-8131) Provide Kubernetes setup with Prometheus
[ https://issues.apache.org/jira/browse/BEAM-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski closed BEAM-8131. -- > Provide Kubernetes setup with Prometheus > > > Key: BEAM-8131 > URL: https://issues.apache.org/jira/browse/BEAM-8131 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Fix For: 2.17.0 > > Time Spent: 6.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-8742) Add stateful processing to ParDo load test
[ https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-8742: - Description: So far, the ParDo load test is not stateful. We should add a basic counter to test the stateful processing. The test should work in streaming mode and with checkpointing. was:So far, the ParDo load test is not stateful. We should add a basic counter to test the stateful processing. > Add stateful processing to ParDo load test > -- > > Key: BEAM-8742 > URL: https://issues.apache.org/jira/browse/BEAM-8742 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.22.0 > > Time Spent: 16h 10m > Remaining Estimate: 0h > > So far, the ParDo load test is not stateful. We should add a basic counter to > test the stateful processing. > The test should work in streaming mode and with checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test
[ https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105339#comment-17105339 ] Maximilian Michels commented on BEAM-8742: -- Cron: https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/ PR: https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/ > Add stateful processing to ParDo load test > -- > > Key: BEAM-8742 > URL: https://issues.apache.org/jira/browse/BEAM-8742 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.22.0 > > Time Spent: 16h 10m > Remaining Estimate: 0h > > So far, the ParDo load test is not stateful. We should add a basic counter to > test the stateful processing. > The test should work in streaming mode and with checkpointing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (BEAM-9963) Flink streaming load tests fail with TypeError
[ https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved BEAM-9963. -- Fix Version/s: Not applicable Resolution: Fixed > Flink streaming load tests fail with TypeError > -- > > Key: BEAM-9963 > URL: https://issues.apache.org/jira/browse/BEAM-9963 > Project: Beam > Issue Type: Test > Components: build-system, runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: Not applicable > > > The newly added load tests now fail on the latest master. > https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console > {noformat} > 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state > changed to FAILED > 16:00:38 Traceback (most recent call last): > 16:00:38 File "/usr/lib/python3.7/runpy.py", line 193, in > _run_module_as_main > 16:00:38 "__main__", mod_spec) > 16:00:38 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code > 16:00:38 exec(code, run_globals) > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py", > line 225, in > 16:00:38 ParDoTest().run() > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py", > line 115, in run > 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms) > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 583, in wait_until_finish > 16:00:38 raise self._runtime_exception > 16:00:38 RuntimeError: Pipeline > load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 2: Traceback (most 
recent call last): > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 245, in _execute > 16:00:38 response = task() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 302, in > 16:00:38 lambda: self.create_worker().do_instruction(request), request) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 471, in do_instruction > 16:00:38 getattr(request, request_type), request.instruction_id) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 506, in process_bundle > 16:00:38 bundle_processor.process_bundle(instruction_id)) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 918, in process_bundle > 16:00:38 op.finish() > 16:00:38 File "apache_beam/runners/worker/operations.py", line 697, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File "apache_beam/runners/worker/operations.py", line 699, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File "apache_beam/runners/worker/operations.py", line 702, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 709, in commit > 16:00:38 state.commit() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 414, in commit > 16:00:38 self._underlying_bag_state.commit() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 481, in commit > 16:00:38 is_cached=True) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 951, in extend > 16:00:38 coder.encode_to_stream(element, out, True) > 16:00:38 File 
"apache_beam/coders/coder_impl.py", line 690, in > apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream > 16:00:38 File "apache_beam/coders/coder_impl.py", line 692, in > apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream > 16:00:38 TypeError: an integer is required > 16:00:38 > {noformat} > Cron fails as well: > https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9963) Flink streaming load tests fail with TypeError
[ https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-9963: - Description: The newly added load tests now fail on the latest master. https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console {noformat} 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED 16:00:38 Traceback (most recent call last): 16:00:38 File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main 16:00:38 "__main__", mod_spec) 16:00:38 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code 16:00:38 exec(code, run_globals) 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py", line 225, in 16:00:38 ParDoTest().run() 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line 115, in run 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms) 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py", line 583, in wait_until_finish 16:00:38 raise self._runtime_exception 16:00:38 RuntimeError: Pipeline load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa failed in state FAILED: java.lang.RuntimeException: Error received from SDK harness for instruction 2: Traceback (most recent call last): 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 245, in _execute 16:00:38 response = task() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 302, in 16:00:38 lambda: self.create_worker().do_instruction(request), request) 16:00:38 File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction 16:00:38 getattr(request, request_type), request.instruction_id) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle 16:00:38 bundle_processor.process_bundle(instruction_id)) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 918, in process_bundle 16:00:38 op.finish() 16:00:38 File "apache_beam/runners/worker/operations.py", line 697, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "apache_beam/runners/worker/operations.py", line 699, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "apache_beam/runners/worker/operations.py", line 702, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 709, in commit 16:00:38 state.commit() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 414, in commit 16:00:38 self._underlying_bag_state.commit() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 481, in commit 16:00:38 is_cached=True) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 951, in extend 16:00:38 coder.encode_to_stream(element, out, True) 16:00:38 File "apache_beam/coders/coder_impl.py", line 690, in apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream 16:00:38 File "apache_beam/coders/coder_impl.py", line 692, in apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream 16:00:38 TypeError: an integer is required 16:00:38 {noformat} Cron fails as well: https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console was: The newly added load tests now fail on the 
latest master. https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console {noformat} 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED 16:00:38 Traceback (most recent call last): 16:00:38 File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main 16:00:38 "__main__", mod_spec) 16:00:38 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code 16:00:38 exec(code, run_globals) 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py", line 225, in 16:00:38 ParDoTest().run() 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line
[jira] [Created] (BEAM-9963) Flink streaming load tests fail with TypeError
Maximilian Michels created BEAM-9963: Summary: Flink streaming load tests fail with TypeError Key: BEAM-9963 URL: https://issues.apache.org/jira/browse/BEAM-9963 Project: Beam Issue Type: Test Components: build-system, runner-flink Reporter: Maximilian Michels Assignee: Maximilian Michels The newly added load tests now fail on the latest master. https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console {noformat} 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed to FAILED 16:00:38 Traceback (most recent call last): 16:00:38 File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main 16:00:38 "__main__", mod_spec) 16:00:38 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code 16:00:38 exec(code, run_globals) 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py", line 225, in 16:00:38 ParDoTest().run() 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py", line 115, in run 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms) 16:00:38 File "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py", line 583, in wait_until_finish 16:00:38 raise self._runtime_exception 16:00:38 RuntimeError: Pipeline load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa failed in state FAILED: java.lang.RuntimeException: Error received from SDK harness for instruction 2: Traceback (most recent call last): 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 245, in _execute 16:00:38 response = task() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 302, in 
16:00:38 lambda: self.create_worker().do_instruction(request), request) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 471, in do_instruction 16:00:38 getattr(request, request_type), request.instruction_id) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 506, in process_bundle 16:00:38 bundle_processor.process_bundle(instruction_id)) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 918, in process_bundle 16:00:38 op.finish() 16:00:38 File "apache_beam/runners/worker/operations.py", line 697, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "apache_beam/runners/worker/operations.py", line 699, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "apache_beam/runners/worker/operations.py", line 702, in apache_beam.runners.worker.operations.DoOperation.finish 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 709, in commit 16:00:38 state.commit() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 414, in commit 16:00:38 self._underlying_bag_state.commit() 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 481, in commit 16:00:38 is_cached=True) 16:00:38 File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 951, in extend 16:00:38 coder.encode_to_stream(element, out, True) 16:00:38 File "apache_beam/coders/coder_impl.py", line 690, in apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream 16:00:38 File "apache_beam/coders/coder_impl.py", line 692, in apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream 16:00:38 TypeError: an integer is required 16:00:38 {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9963) Flink streaming load tests fail with TypeError
[ https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9963: -- Status: Open (was: Triage Needed) > Flink streaming load tests fail with TypeError > -- > > Key: BEAM-9963 > URL: https://issues.apache.org/jira/browse/BEAM-9963 > Project: Beam > Issue Type: Test > Components: build-system, runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > > The newly added load tests now fail on the latest master. > https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console > {noformat} > 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state > changed to FAILED > 16:00:38 Traceback (most recent call last): > 16:00:38 File "/usr/lib/python3.7/runpy.py", line 193, in > _run_module_as_main > 16:00:38 "__main__", mod_spec) > 16:00:38 File "/usr/lib/python3.7/runpy.py", line 85, in _run_code > 16:00:38 exec(code, run_globals) > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py", > line 225, in > 16:00:38 ParDoTest().run() > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py", > line 115, in run > 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms) > 16:00:38 File > "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py", > line 583, in wait_until_finish > 16:00:38 raise self._runtime_exception > 16:00:38 RuntimeError: Pipeline > load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa > failed in state FAILED: java.lang.RuntimeException: Error received from SDK > harness for instruction 2: Traceback (most recent call last): > 16:00:38 File > 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 245, in _execute > 16:00:38 response = task() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 302, in > 16:00:38 lambda: self.create_worker().do_instruction(request), request) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 471, in do_instruction > 16:00:38 getattr(request, request_type), request.instruction_id) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 506, in process_bundle > 16:00:38 bundle_processor.process_bundle(instruction_id)) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 918, in process_bundle > 16:00:38 op.finish() > 16:00:38 File "apache_beam/runners/worker/operations.py", line 697, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File "apache_beam/runners/worker/operations.py", line 699, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File "apache_beam/runners/worker/operations.py", line 702, in > apache_beam.runners.worker.operations.DoOperation.finish > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 709, in commit > 16:00:38 state.commit() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 414, in commit > 16:00:38 self._underlying_bag_state.commit() > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", > line 481, in commit > 16:00:38 is_cached=True) > 16:00:38 File > "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", > line 951, in extend > 16:00:38 coder.encode_to_stream(element, out, True) > 16:00:38 File "apache_beam/coders/coder_impl.py", line 690, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream > 16:00:38 File "apache_beam/coders/coder_impl.py", line 692, in > apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream > 16:00:38 TypeError: an integer is required > 16:00:38 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
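The `TypeError: an integer is required` at the bottom of the trace comes from an integer-only varint coder receiving a non-integer state value at commit time. The following is a minimal sketch of unsigned varint encoding, mimicking the failure mode of `VarIntCoderImpl` — it is illustrative only, not Beam's actual implementation:

```python
def encode_varint(value, out):
    """Toy unsigned varint writer (base-128, little-endian groups).

    Mirrors the failure mode in the traceback above: a non-integer
    value reaching an integer-only coder raises TypeError.
    This is a sketch, not Beam's real VarIntCoderImpl.
    """
    if not isinstance(value, int):
        raise TypeError("an integer is required")
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)  # more bytes follow
        else:
            out.append(bits)         # final byte
            return

buf = bytearray()
encode_varint(300, buf)              # 300 encodes as b"\xac\x02"

try:
    # A str sneaking into int-typed state reproduces the error class:
    encode_varint("300", bytearray())
except TypeError as e:
    print(e)                         # an integer is required
```

In the load test this pointed at the stateful counter's bag state holding something other than an `int` when the bundle was committed.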
[jira] [Resolved] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding
[ https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved BEAM-9947. -- Resolution: Fixed > Timer coder contains a faulty key coder leading to a corrupted encoding > --- > > Key: BEAM-9947 > URL: https://issues.apache.org/jira/browse/BEAM-9947 > Project: Beam > Issue Type: Bug > Components: java-fn-execution, sdk-py-core >Affects Versions: 2.21.0 >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Critical > Fix For: 2.21.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The coder for timers contains a key coder which may have to be > length-prefixed in the case of non-standard coders. We noticed that this was > not reflected in the {{ProcessBundleDescriptor}}, leading to errors like this > one for non-standard coders, e.g. Python's {{FastPrimitivesCoder}}: > {noformat} > Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: > reached end of stream after reading 36 bytes; 68 bytes expected > at > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) > at > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) > at > org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194) > at > org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81) > at > org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31) > at > org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180) > at > org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251) > at > 
org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ... 1 more > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
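The `EOFException: reached end of stream after reading N bytes; M bytes expected` in the trace is what happens when the writer and reader disagree about length-prefixing. A toy model of that mismatch (not Beam's wire format — names and the 4-byte prefix here are illustrative assumptions):

```python
import io
import struct

def write_key(out, key, length_prefix):
    """Writer side: optionally wrap the key in a 4-byte length prefix,
    as a runner would for a non-standard (length-prefixed) coder."""
    if length_prefix:
        out.write(struct.pack(">I", len(key)))
    out.write(key)

def read_key(inp, expected_len):
    """Reader side: consume exactly expected_len bytes or fail the way
    the Java decoder does in the stack trace above."""
    data = inp.read(expected_len)
    if len(data) < expected_len:
        raise EOFError(
            "reached end of stream after reading %d bytes; %d bytes expected"
            % (len(data), expected_len))
    return data

# The reader expects a length prefix the writer never emitted, so the
# first four key bytes get misread as a (huge) length:
buf = io.BytesIO()
write_key(buf, b"key-" + b"x" * 28, length_prefix=False)  # 32 raw bytes
buf.seek(0)
claimed = struct.unpack(">I", buf.read(4))[0]
try:
    read_key(buf, claimed)
except EOFError as e:
    print(e)  # far more bytes "expected" than the stream contains
```

The fix referenced by this ticket makes the `ProcessBundleDescriptor` reflect the length-prefixing of the timer's key coder so both sides agree.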
[jira] [Updated] (BEAM-9917) BigQueryBatchFileLoads dynamic destination
[ https://issues.apache.org/jira/browse/BEAM-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tord Sætren updated BEAM-9917: -- Priority: Minor (was: Major) > BigQueryBatchFileLoads dynamic destination > -- > > Key: BEAM-9917 > URL: https://issues.apache.org/jira/browse/BEAM-9917 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.17.0 >Reporter: Tord Sætren >Priority: Minor > Labels: beginner, documentation > > I am trying to use BigQueryBatchFileLoads to upload data from pubsub. It > works fine for a single table, but when I try to use a dynamic destination > such as > destination=lambda elem: "my_project:my_dataset." + elem["sensor_key"], > it just creates a new table every time the triggering_frequency fires. I > know it makes temporary tables before loading it all into one, but it never > loads them. It just creates more and more tables. -- This message was sent by Atlassian Jira (v8.3.4#803005)
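The destination callable in the report maps each element to a table name; the file-loads step is then expected to bucket elements per destination and issue one load job per table. A plain-Python sketch of that routing (element shape and table names are assumptions taken from the report, not a Beam API):

```python
from collections import defaultdict

# The reporter's callable: one table per sensor_key.
destination = lambda elem: "my_project:my_dataset." + elem["sensor_key"]

# Hypothetical Pub/Sub elements with a "sensor_key" field:
elems = [
    {"sensor_key": "sensor_a", "value": 1},
    {"sensor_key": "sensor_b", "value": 2},
    {"sensor_key": "sensor_a", "value": 3},
]

# Expected behavior: bucket elements by destination, then run one load
# job per bucket -- not one fresh temp table per triggering_frequency
# firing that is never loaded into the final table.
buckets = defaultdict(list)
for elem in elems:
    buckets[destination(elem)].append(elem)

print(sorted(buckets))  # two destinations, sensor_a rows grouped together
```

The bug described is that each trigger firing leaves behind a new temporary table instead of copying those buckets into the final per-sensor tables.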
[jira] [Resolved] (BEAM-6710) Add Landing page to community metrics dashboard
[ https://issues.apache.org/jira/browse/BEAM-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski resolved BEAM-6710. Fix Version/s: Not applicable Resolution: Fixed > Add Landing page to community metrics dashboard > > > Key: BEAM-6710 > URL: https://issues.apache.org/jira/browse/BEAM-6710 > Project: Beam > Issue Type: New Feature > Components: community-metrics, project-management >Reporter: Mikhail Gryzykhin >Assignee: Kamil Wasilewski >Priority: Major > Fix For: Not applicable > > Time Spent: 50m > Remaining Estimate: 0h > > The community metrics dashboard sends the user to a list of recently opened > dashboards, which is empty. This confuses new users. > We want to add a landing page with links to the relevant dashboards. > Link: http://104.154.241.245/ > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test
[ https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105273#comment-17105273 ] Maximilian Michels commented on BEAM-8742: -- I just saw your comment here. Thank you. I added streaming tests in https://github.com/apache/beam/pull/11558. Have a look if you want. I've created a dashboard here: https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056 > Add stateful processing to ParDo load test > -- > > Key: BEAM-8742 > URL: https://issues.apache.org/jira/browse/BEAM-8742 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.22.0 > > Time Spent: 16h 10m > Remaining Estimate: 0h > > So far, the ParDo load test is not stateful. We should add a basic counter to > test the stateful processing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9962) CI/CD for Beam
Ankur Goenka created BEAM-9962: -- Summary: CI/CD for Beam Key: BEAM-9962 URL: https://issues.apache.org/jira/browse/BEAM-9962 Project: Beam Issue Type: Bug Components: sdk-ideas Reporter: Ankur Goenka Uber task for CI CD -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-7438) Distribution and Gauge metrics are not being exported to Flink dashboard neither Prometheus IO
[ https://issues.apache.org/jira/browse/BEAM-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105192#comment-17105192 ] Akshay Iyangar commented on BEAM-7438: -- [~mxm] - Facing the same issue; will have a PR out for this soon. > Distribution and Gauge metrics are not being exported to Flink dashboard > neither Prometheus IO > -- > > Key: BEAM-7438 > URL: https://issues.apache.org/jira/browse/BEAM-7438 > Project: Beam > Issue Type: Bug > Components: runner-flink >Affects Versions: 2.12.0, 2.13.0 >Reporter: Ricardo Bordon >Assignee: Akshay Iyangar >Priority: Major > Attachments: image-2019-05-29-11-24-36-911.png, > image-2019-05-29-11-26-49-685.png > > > Distribution and gauge metrics are not visible in the Flink dashboard nor in > Prometheus IO. > I was able to debug the runner code and see that these metrics are being > updated via *FlinkMetricContainer#updateDistributions()* and > *FlinkMetricContainer#updateGauges()* (meaning they are treated as "attempted > metrics"), but they are not visible when looking at them in the Flink > Dashboard or Prometheus. On the other hand, *Counter* metrics work as > expected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
[ https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105184#comment-17105184 ] Maximilian Michels commented on BEAM-9164: -- I was not aware that tests had been disabled. I'm a bit shocked that we just disabled them without mentioning relevant folks or at least posting in the JIRA. I changed logic around the watermark propagation in BEAM-9900 which could have removed the flakiness. I'll re-enable the tests. > [PreCommit_Java] [Flake] > UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest > --- > > Key: BEAM-9164 > URL: https://issues.apache.org/jira/browse/BEAM-9164 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kirill Kozlov >Priority: Critical > Labels: flake > > Test: > org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest > >> testWatermarkEmission[numTasks = 1; numSplits=1] > Fails with the following exception: > {code:java} > org.junit.runners.model.TestTimedOutException: test timed out after 3 > milliseconds{code} > Affected Jenkins job: > [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/] > Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (BEAM-7438) Distribution and Gauge metrics are not being exported to Flink dashboard neither Prometheus IO
[ https://issues.apache.org/jira/browse/BEAM-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akshay Iyangar reassigned BEAM-7438: Assignee: Akshay Iyangar > Distribution and Gauge metrics are not being exported to Flink dashboard > neither Prometheus IO > -- > > Key: BEAM-7438 > URL: https://issues.apache.org/jira/browse/BEAM-7438 > Project: Beam > Issue Type: Bug > Components: runner-flink >Affects Versions: 2.12.0, 2.13.0 >Reporter: Ricardo Bordon >Assignee: Akshay Iyangar >Priority: Major > Attachments: image-2019-05-29-11-24-36-911.png, > image-2019-05-29-11-26-49-685.png > > > Distribution and gauge metrics are not visible in the Flink dashboard nor in > Prometheus IO. > I was able to debug the runner code and see that these metrics are being > updated via *FlinkMetricContainer#updateDistributions()* and > *FlinkMetricContainer#updateGauges()* (meaning they are treated as "attempted > metrics"), but they are not visible when looking at them in the Flink > Dashboard or Prometheus. On the other hand, *Counter* metrics work as > expected. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9961) Python MongoDBIO does not apply projection
Corvin Deboeser created BEAM-9961: - Summary: Python MongoDBIO does not apply projection Key: BEAM-9961 URL: https://issues.apache.org/jira/browse/BEAM-9961 Project: Beam Issue Type: Bug Components: sdk-py-core Affects Versions: 2.20.0 Reporter: Corvin Deboeser ReadFromMongoDB does not apply the provided projection when reading from the client - only filter is being applied as you can see here: https://github.com/apache/beam/blob/9f0cb649d39ee6236ea27f111acb4b66591a80ec/sdks/python/apache_beam/io/mongodbio.py#L204 -- This message was sent by Atlassian Jira (v8.3.4#803005)
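For reference, this sketch mimics the inclusion-style projection semantics that `ReadFromMongoDB` is expected to apply (for flat documents only; the real fix is simply to pass the `projection` argument through to pymongo's `collection.find(filter, projection)` at the linked line):

```python
def apply_projection(doc, projection):
    """Mimic MongoDB inclusion projection on a flat document.

    Illustrative sketch only -- not mongodbio code. `projection` is a
    list of field names to keep; MongoDB keeps `_id` unless it is
    explicitly excluded, which this sketch also does.
    """
    if not projection:
        return dict(doc)            # no projection: full document
    kept = {field: doc[field] for field in projection if field in doc}
    if "_id" in doc and "_id" not in projection:
        kept["_id"] = doc["_id"]    # _id is included by default
    return kept

doc = {"_id": 1, "temp": 21.5, "raw_payload": "..."}
print(apply_projection(doc, ["temp"]))   # drops raw_payload, keeps _id
```

The bug report is that the pipeline behaves as if `projection` were always `None`: every document comes back whole regardless of what was requested.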
[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb
[ https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corvin Deboeser updated BEAM-9960: -- Description: When using MongoDBIO on a large collection and the source bundle size was determined to be 1, then the response from the split vector command can be larger than 16mb which is not supported by pymongo / MongoDB: {{pymongo.errors.ProtocolError: Message length (33699186) is larger than server max message size (33554432)}} Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. was: When using MongoDBIO on a large collection and the source bundle size was determined to be 1, then the response from the split vector command can be larger than 16mb which is not supported by pymongo / MongoDB: {{pymongo.errors.ProtocolError: Message length (33699186) is larger than server max message size (33554432)}} > Python MongoDBIO fails when response of split vector command is larger than > 16mb > > > Key: BEAM-9960 > URL: https://issues.apache.org/jira/browse/BEAM-9960 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Affects Versions: 2.20.0 >Reporter: Corvin Deboeser >Priority: Major > > When using MongoDBIO on a large collection and the source bundle size was > determined to be 1, then the response from the split vector command can be > larger than 16mb which is not supported by pymongo / MongoDB: > {{pymongo.errors.ProtocolError: Message length (33699186) is larger than > server max message size (33554432)}} > > Environment: Was running this on Google Dataflow / Beam Python SDK 2.20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb
Corvin Deboeser created BEAM-9960: - Summary: Python MongoDBIO fails when response of split vector command is larger than 16mb Key: BEAM-9960 URL: https://issues.apache.org/jira/browse/BEAM-9960 Project: Beam Issue Type: Bug Components: sdk-py-core Affects Versions: 2.20.0 Reporter: Corvin Deboeser When using MongoDBIO on a large collection and the source bundle size was determined to be 1, then the response from the split vector command can be larger than 16mb which is not supported by pymongo / MongoDB: {{pymongo.errors.ProtocolError: Message length (33699186) is larger than server max message size (33554432)}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs
[ https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kenneth Knowles updated BEAM-9959: -- Status: Open (was: Triage Needed) > Mistakes Computing Composite Inputs and Outputs > --- > > Key: BEAM-9959 > URL: https://issues.apache.org/jira/browse/BEAM-9959 > Project: Beam > Issue Type: Bug > Components: sdk-go >Reporter: Robert Burke >Assignee: Robert Burke >Priority: Major > > The Go SDK uses a Scope object to manage beam Composites. > A bug was discovered when consuming a PCollection in both the composite that > created it, and in a separate composite. > Further, the Go SDK should verify that the root hypergraph structure is a DAG > and provides a reasonable error. In particular, the leaf nodes of the graph > could form a DAG, but due to how the beam.Scope object is used, might cause > the hypergraph to not be a DAG. > Eg. It's possible to write the following in the Go SDK. > PTransforms A, B, C and PCollections colA, colB, and Composites a, b. > A and C are in a, and B are in b. > A generates colA > B consumes colA, and generates colB. > C consumes colB. > ``` > a := s.Scope(a) > b := s.Scope(b) > colA := beam.Impulse(*a*) > colB := beam.ParDo(*b*, , colA) > beam.ParDo0(*a*, , colA) > ``` > If it doesn't already the Go SDK must emit a clear error, and fail pipeline > construction. > If the affected composites are roots in the graph, the cycle prevents being > able to topologically sort the root ptransforms for the pipeline graph, which > can adversely affect runners. > The recommendation is always to wrap uses of scope in functions or other > scopes to prevent such incorrect constructions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs
Robert Burke created BEAM-9959: -- Summary: Mistakes Computing Composite Inputs and Outputs Key: BEAM-9959 URL: https://issues.apache.org/jira/browse/BEAM-9959 Project: Beam Issue Type: Bug Components: sdk-go Reporter: Robert Burke Assignee: Robert Burke The Go SDK uses a Scope object to manage beam Composites. A bug was discovered when consuming a PCollection in both the composite that created it, and in a separate composite. Further, the Go SDK should verify that the root hypergraph structure is a DAG and provides a reasonable error. In particular, the leaf nodes of the graph could form a DAG, but due to how the beam.Scope object is used, might cause the hypergraph to not be a DAG. Eg. It's possible to write the following in the Go SDK. PTransforms A, B, C and PCollections colA, colB, and Composites a, b. A and C are in a, and B are in b. A generates colA B consumes colA, and generates colB. C consumes colB. ``` a := s.Scope(a) b := s.Scope(b) colA := beam.Impulse(*a*) colB := beam.ParDo(*b*, , colA) beam.ParDo0(*a*, , colA) ``` If it doesn't already the Go SDK must emit a clear error, and fail pipeline construction. If the affected composites are roots in the graph, the cycle prevents being able to topologically sort the root ptransforms for the pipeline graph, which can adversely affect runners. The recommendation is always to wrap uses of scope in functions or other scopes to prevent such incorrect constructions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
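The validation the ticket asks for — verifying that the composite (scope) graph is a DAG before submission — amounts to cycle detection over composites. A language-neutral sketch in Python using Kahn's algorithm (edge pairs here, like the composites `a` and `b` above, are placeholders):

```python
from collections import defaultdict, deque

def is_dag(edges):
    """Kahn's algorithm: True iff the directed graph has no cycle.

    `edges` is a list of (src, dst) pairs between composites. A pipeline
    SDK would run a check like this and fail construction with a clear
    error when it returns False.
    """
    indegree = defaultdict(int)
    adjacent = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        adjacent[src].append(dst)
        indegree[dst] += 1
        nodes.update((src, dst))
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        node = ready.popleft()
        visited += 1
        for nxt in adjacent[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # If a cycle exists, its members never reach indegree 0.
    return visited == len(nodes)

# The scenario above: composite a feeds b (colA) and b feeds a (colB),
# a cycle at the composite level even though the leaf transforms are a DAG.
print(is_dag([("a", "b"), ("b", "a")]))  # False -> fail construction
print(is_dag([("a", "b")]))              # True  -> safe to topo-sort
```

When the check passes, the same traversal order doubles as the topological sort of root transforms that runners rely on.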