[jira] [Updated] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-9979:

Description: 
When the BeamFnDataReadRunner/DataInputOperation is reused, there is a short 
window during which a progress request can arrive before the start function 
has reset the read index to -1.

I believe there should be a way to *reset* an operator before it is added to 
the set of cached bundle processors, instead of placing clean-up in whatever 
*start* functions those operators rely on; this avoids exposing details of 
those operators before *start* has been invoked.
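A minimal Python sketch of the proposal (hypothetical names, not the actual harness API): clear bundle-scoped state in a dedicated reset hook before the processor becomes visible for reuse, so a progress request can never observe the previous bundle's read index:

```python
import threading


class DataInputOperation:
    """Simplified stand-in for the harness read operation (not Beam's class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._read_index = -1

    def progress(self):
        # A runner progress request may call this at any time.
        with self._lock:
            return self._read_index

    def advance(self):
        with self._lock:
            self._read_index += 1

    def reset(self):
        # Clear bundle-scoped state in a dedicated hook, rather than in start().
        with self._lock:
            self._read_index = -1


def release_to_cache(op, cache):
    op.reset()        # reset first ...
    cache.append(op)  # ... only then make the processor visible for reuse
```

Because reset() runs before the operation re-enters the cache, there is no window in which progress() can report the previous bundle's index.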

  was:When the BeamFnDataReadRunner/DataInputOperation is reused there is a 
short period of time when a progress request could happen before the start 
function is called resetting the read index to -1.


> Fix race condition where the read index may be reported from the last executed 
> bundle
> 
>
> Key: BEAM-9979
> URL: https://issues.apache.org/jira/browse/BEAM-9979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Priority: Minor
> Fix For: 2.22.0
>
>
> When the BeamFnDataReadRunner/DataInputOperation is reused, there is a short 
> window during which a progress request can arrive before the start function 
> has reset the read index to -1.
> I believe there should be a way to *reset* an operator before it is added 
> to the set of cached bundle processors, instead of placing clean-up in 
> whatever *start* functions those operators rely on; this avoids exposing 
> details of those operators before *start* has been invoked.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-9979:

Status: Open  (was: Triage Needed)

> Fix race condition where the read index may be reported from the last executed 
> bundle
> 
>
> Key: BEAM-9979
> URL: https://issues.apache.org/jira/browse/BEAM-9979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Priority: Minor
> Fix For: 2.22.0
>
>
> When the BeamFnDataReadRunner/DataInputOperation is reused, there is a short 
> window during which a progress request can arrive before the start function 
> has reset the read index to -1.
> I believe there should be a way to *reset* an operator before it is added 
> to the set of cached bundle processors, instead of placing clean-up in 
> whatever *start* functions those operators rely on; this avoids exposing 
> details of those operators before *start* has been invoked.





[jira] [Created] (BEAM-9979) Fix race condition where the read index may be reported from the last executed bundle

2020-05-12 Thread Luke Cwik (Jira)
Luke Cwik created BEAM-9979:
---

 Summary: Fix race condition where the read index may be reported 
from the last executed bundle
 Key: BEAM-9979
 URL: https://issues.apache.org/jira/browse/BEAM-9979
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-harness, sdk-py-harness
Reporter: Luke Cwik
 Fix For: 2.22.0


When the BeamFnDataReadRunner/DataInputOperation is reused, there is a short 
window during which a progress request can arrive before the start function 
has reset the read index to -1.





[jira] [Comment Edited] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105926#comment-17105926
 ] 

Luke Cwik edited comment on BEAM-9945 at 5/13/20, 3:23 AM:
---

When testing the implementation against runner v2, I found that the Go SDK's 
read index ends at *# elements* or the *stop index* when splitting while the 
Python/Java implementations always stop at *# elements - 1* or *stop index - 1*

I think it makes sense to use Go's definition.
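Illustrative only (the helper names and numbers are made up): at the end of a bundle, the two conventions differ by exactly one:

```python
def final_read_index_go(num_elements):
    # Go SDK: the read index ends at the element count (or the stop index
    # when splitting), i.e. one past the last element read.
    return num_elements


def final_read_index_python_java(num_elements):
    # Python/Java SDKs: the read index stops at the last element read
    # (or at stop index - 1 when splitting).
    return num_elements - 1
```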


was (Author: lcwik):
When testing the implementation against runner v2, I found that the Go SDK's 
read index ends at *# elements* or the *stop index* when splitting while the 
Python/Java implementations always stop at *# elements - 1* or *stop index - 1*

I think it makes sense to use Go's definition.

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Commented] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105937#comment-17105937
 ] 

Luke Cwik commented on BEAM-9945:
-

Fix in https://github.com/apache/beam/pull/11689

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Updated] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-9945:

Status: Open  (was: Triage Needed)

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Reopened] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reopened BEAM-9945:
-

When testing the implementation against runner v2, I found that the Go SDK's 
read index ends at *# elements* or the *stop index* when splitting while the 
Python/Java implementations always stop at *# elements - 1* or *stop index - 1*

I think it makes sense to use Go's definition.

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Closed] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression

2020-05-12 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri closed BEAM-8964.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> DeprecationWarning: Flags not at the start of the expression 
> -
>
> Key: BEAM-8964
> URL: https://issues.apache.org/jira/browse/BEAM-8964
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Labels: beginner, easy, newbie, starter
> Fix For: Not applicable
>
>
> I see lots of these warnings in our precommits.
> {code}
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp2jslxh39\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> 19:09:37 
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp03vpdu3z\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> {code}
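This warning comes from a global inline flag (e.g. `(?s)`) appearing somewhere other than the start of a pattern; moving the flag to the front silences it. A small standalone reproduction (the pattern strings are made up, not what translate_pattern actually emits):

```python
import re
import warnings

BAD = r"/tmp/x/.*(?s)"   # global inline flag not at the start: deprecated
GOOD = r"(?s)/tmp/x/.*"  # same flag moved to the front: clean


def compiles_cleanly(pattern):
    """True if compiling `pattern` raises no error and emits no warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            re.compile(pattern)
        except re.error:
            # Python 3.11+ rejects misplaced global flags outright.
            return False
    return not caught
```

On Python 3.6-3.10 the misplaced flag only triggers a DeprecationWarning; on 3.11+ it is a hard `re.error`.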





[jira] [Assigned] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression

2020-05-12 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri reassigned BEAM-8964:
---

Assignee: Udi Meiri

> DeprecationWarning: Flags not at the start of the expression 
> -
>
> Key: BEAM-8964
> URL: https://issues.apache.org/jira/browse/BEAM-8964
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Labels: beginner, easy, newbie, starter
>
> I see lots of these warnings in our precommits.
> {code}
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp2jslxh39\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> 19:09:37 
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp03vpdu3z\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> {code}





[jira] [Commented] (BEAM-8964) DeprecationWarning: Flags not at the start of the expression

2020-05-12 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105915#comment-17105915
 ] 

Udi Meiri commented on BEAM-8964:
-

Oops, I fixed this in https://github.com/apache/beam/pull/11016 but forgot to 
close the bug.
I don't see this warning at all in 
https://builds.apache.org/job/beam_PreCommit_Python_Cron/2746/

 

> DeprecationWarning: Flags not at the start of the expression 
> -
>
> Key: BEAM-8964
> URL: https://issues.apache.org/jira/browse/BEAM-8964
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Reporter: Udi Meiri
>Priority: Major
>  Labels: beginner, easy, newbie, starter
>
> I see lots of these warnings in our precommits.
> {code}
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp2jslxh39\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> 19:09:37 
> 19:09:37 apache_beam/io/filesystem.py:583
> 19:09:37   
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/apache_beam/io/filesystem.py:583:
>  DeprecationWarning: Flags not at the start of the expression 
> '\\/tmp\\/tmp03vpdu3z\\/' (truncated)
> 19:09:37 re_pattern = re.compile(self.translate_pattern(pattern))
> {code}





[jira] [Updated] (BEAM-9978) Add offset range restrictions to the Go SDK proper.

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9978:
--
Status: Open  (was: Triage Needed)

> Add offset range restrictions to the Go SDK proper.
> ---
>
> Key: BEAM-9978
> URL: https://issues.apache.org/jira/browse/BEAM-9978
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>
> Currently these are part of the stringsplit example, but they should probably 
> be generalized into the actual SDK, with adequate testing.
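A sketch of what a generalized offset-range restriction might look like, written in Python for brevity (the work here targets the Go SDK; the class and method names are illustrative, not Beam's API):

```python
class OffsetRange:
    """Half-open restriction [start, stop) over element offsets."""

    def __init__(self, start, stop):
        if start > stop:
            raise ValueError("start must be <= stop")
        self.start = start
        self.stop = stop

    def size(self):
        return self.stop - self.start

    def split_at(self, offset):
        # Primary keeps [start, offset); the residual takes [offset, stop).
        if not self.start <= offset <= self.stop:
            raise ValueError("split point outside the range")
        return OffsetRange(self.start, offset), OffsetRange(offset, self.stop)
```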





[jira] [Updated] (BEAM-9978) Add offset range restrictions to the Go SDK.

2020-05-12 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-9978:
--
Summary: Add offset range restrictions to the Go SDK.  (was: Add offset 
range restrictions to the Go SDK proper.)

> Add offset range restrictions to the Go SDK.
> 
>
> Key: BEAM-9978
> URL: https://issues.apache.org/jira/browse/BEAM-9978
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>
> Currently these are part of the stringsplit example, but they should probably 
> be generalized into the actual SDK, with adequate testing.





[jira] [Created] (BEAM-9978) Add offset range restrictions to the Go SDK proper.

2020-05-12 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-9978:
-

 Summary: Add offset range restrictions to the Go SDK proper.
 Key: BEAM-9978
 URL: https://issues.apache.org/jira/browse/BEAM-9978
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


Currently these are part of the stringsplit example, but they should probably 
be generalized into the actual SDK, with adequate testing.





[jira] [Comment Edited] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894
 ] 

Ahmet Altay edited comment on BEAM-9975 at 5/13/20, 2:26 AM:
-

Could we log an error with the value before the raise, instead of catching 
the error?

I agree with Kyle: unless we add more logging, it will be hard to find the root 
cause.
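A sketch of the suggestion, with a stand-in `convert` argument in place of `json_format.ParseDict` (hypothetical wrapper, not Beam code): log the offending value, then re-raise rather than swallow the error:

```python
import logging


def dict_to_struct_with_logging(dict_obj, convert):
    """Run `convert` (standing in for json_format.ParseDict) on `dict_obj`.

    On failure, log the offending input before re-raising, so the flaky
    ParseError becomes diagnosable without being swallowed.
    """
    try:
        return convert(dict_obj)
    except Exception:
        logging.error("Failed to convert pipeline options: %r", dict_obj)
        raise
```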


was (Author: altay):
Could we log an error with the value before the raise, instead of catching 
the error?

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Resolved] (BEAM-9775) Add capability to run SDFs on Flink Runner and Python FnApiRunner.

2020-05-12 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-9775.
---
Fix Version/s: Not applicable
   Resolution: Fixed

This seems complete now. We have an example SDF and it runs on both those 
runners.

> Add capability to run SDFs on Flink Runner and Python FnApiRunner.
> --
>
> Key: BEAM-9775
> URL: https://issues.apache.org/jira/browse/BEAM-9775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> This is pretty simple: Add any missing requirements for being able to 
> actually execute SDFs, and _maybe_ an example SDF or something that actually 
> works. This can be marked completed when we can run a simple SDF with the Go 
> SDK.
> I'm still hesitant on the example SDF because it may imply that the feature 
> is ready for general usage, so it might need to be bundled with the 
> documentation PR, but we'll see.





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894
 ] 

Ahmet Altay commented on BEAM-9975:
---

Could we log an error with the value before the raise, instead of catching 
the error?

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Resolved] (BEAM-9799) Add automated RTracker validation to Go SDF

2020-05-12 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira resolved BEAM-9799.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Add automated RTracker validation to Go SDF
> ---
>
> Key: BEAM-9799
> URL: https://issues.apache.org/jira/browse/BEAM-9799
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> After finishing executing an SDF we should be validating the restriction 
> trackers to make sure they succeeded without errors.





[jira] [Commented] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105866#comment-17105866
 ] 

Luke Cwik commented on BEAM-9934:
-

2.22

> Resolve differences in beam:metric:element_count:v1 implementations
> ---
>
> Key: BEAM-9934
> URL: https://issues.apache.org/jira/browse/BEAM-9934
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.22.0
>
>
> The [element 
> count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206]
>  metric represents the number of elements within a PCollection and is 
> interpreted differently across the Beam SDK versions.
> In the [Java 
> SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207]
>  this represents the number of elements and includes how many windows those 
> elements are in. This metric is incremented as soon as the element has been 
> output.
> In the [Python 
> SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247]
>  this represents the number of elements and doesn't include how many windows 
> those elements are in. The metric is also only incremented after the element 
> has finished processing.
> The [Go 
> SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260]
>  does the same thing as Python.
> Traditionally in Dataflow this has always been the exploded window element 
> count and the counter is incremented as soon as the element is output.
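Illustrative only (hypothetical helper names): given one entry per element, each entry being that element's window count, the two conventions count as follows:

```python
def element_count_java(windows_per_element):
    # Java harness: counts each (element, window) pair, incremented as soon
    # as the element is output (window-exploded count, as in Dataflow).
    return sum(windows_per_element)


def element_count_python_go(windows_per_element):
    # Python/Go harnesses: count each element once, regardless of windowing,
    # incremented only after the element finishes processing.
    return len(windows_per_element)
```

Three elements falling into 1, 3, and 2 windows respectively would be counted as 6 by Java but 3 by Python and Go.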





[jira] [Updated] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-9934:

Fix Version/s: (was: 2.21.0)
   2.22.0

> Resolve differences in beam:metric:element_count:v1 implementations
> ---
>
> Key: BEAM-9934
> URL: https://issues.apache.org/jira/browse/BEAM-9934
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.22.0
>
>
> The [element 
> count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206]
>  metric represents the number of elements within a PCollection and is 
> interpreted differently across the Beam SDK versions.
> In the [Java 
> SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207]
>  this represents the number of elements and includes how many windows those 
> elements are in. This metric is incremented as soon as the element has been 
> output.
> In the [Python 
> SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247]
>  this represents the number of elements and doesn't include how many windows 
> those elements are in. The metric is also only incremented after the element 
> has finished processing.
> The [Go 
> SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260]
>  does the same thing as Python.
> Traditionally in Dataflow this has always been the exploded window element 
> count and the counter is incremented as soon as the element is output.





[jira] [Commented] (BEAM-9767) test_streaming_wordcount flaky timeouts

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105858#comment-17105858
 ] 

Ahmet Altay commented on BEAM-9767:
---

https://github.com/apache/beam/pull/11663 is merged. Does this address the 
flake?

> test_streaming_wordcount flaky timeouts
> ---
>
> Key: BEAM-9767
> URL: https://issues.apache.org/jira/browse/BEAM-9767
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Udi Meiri
>Assignee: Sam Rohde
>Priority: Critical
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Timed out after 600s, typically completes in 2.8s on my workstation.
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/
> {code}
> self = 
>   testMethod=test_streaming_wordcount>
> @unittest.skipIf(
> sys.version_info < (3, 5, 3),
> 'The tests require at least Python 3.6 to work.')
> def test_streaming_wordcount(self):
>   class WordExtractingDoFn(beam.DoFn):
> def process(self, element):
>   text_line = element.strip()
>   words = text_line.split()
>   return words
> 
>   # Add the TestStream so that it can be cached.
>   ib.options.capturable_sources.add(TestStream)
>   ib.options.capture_duration = timedelta(seconds=5)
> 
>   p = beam.Pipeline(
>   runner=interactive_runner.InteractiveRunner(),
>   options=StandardOptions(streaming=True))
> 
>   data = (
>   p
>   | TestStream()
>   .advance_watermark_to(0)
>   .advance_processing_time(1)
>   .add_elements(['to', 'be', 'or', 'not', 'to', 'be'])
>   .advance_watermark_to(20)
>   .advance_processing_time(1)
>   .add_elements(['that', 'is', 'the', 'question'])
>   | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable
> 
>   counts = (
>   data
>   | 'split' >> beam.ParDo(WordExtractingDoFn())
>   | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
>   | 'group' >> beam.GroupByKey()
>   | 'count' >> beam.Map(lambda wordones: (wordones[0], sum(wordones[1]))))
> 
>   # Watch the local scope for Interactive Beam so that referenced 
> PCollections
>   # will be cached.
>   ib.watch(locals())
> 
>   # This is normally done in the interactive_utils when a transform is
>   # applied but needs an IPython environment. So we manually run this 
> here.
>   ie.current_env().track_user_pipelines()
> 
>   # Create a fake limiter that cancels the BCJ once the main job receives 
> the
>   # expected amount of results.
>   class FakeLimiter:
> def __init__(self, p, pcoll):
>   self.p = p
>   self.pcoll = pcoll
> 
> def is_triggered(self):
>   result = ie.current_env().pipeline_result(self.p)
>   if result:
> try:
>   results = result.get(self.pcoll)
> except ValueError:
>   return False
> return len(results) >= 10
>   return False
> 
>   # This sets the limiters to stop reading when the test receives 10 
> elements
>   # or after 5 seconds have elapsed (to eliminate the possibility of 
> hanging).
>   ie.current_env().options.capture_control.set_limiters_for_test(
>   [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))])
> 
>   # This tests that the data was correctly cached.
>   pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0)
>   expected_data_df = pd.DataFrame([
>   ('to', 0, [IntervalWindow(0, 10)], pane_info),
>   ('be', 0, [IntervalWindow(0, 10)], pane_info),
>   ('or', 0, [IntervalWindow(0, 10)], pane_info),
>   ('not', 0, [IntervalWindow(0, 10)], pane_info),
>   ('to', 0, [IntervalWindow(0, 10)], pane_info),
>   ('be', 0, [IntervalWindow(0, 10)], pane_info),
>   ('that', 2000, [IntervalWindow(20, 30)], pane_info),
>   ('is', 2000, [IntervalWindow(20, 30)], pane_info),
>   ('the', 2000, [IntervalWindow(20, 30)], pane_info),
>   ('question', 2000, [IntervalWindow(20, 30)], pane_info)
>   ], columns=[0, 'event_time', 'windows', 'pane_info']) # yapf: disable
> 
> > data_df = ib.collect(data, include_window_info=True)
> apache_beam/runners/interactive/interactive_runner_test.py:237: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/runners/interactive/interactive_beam.py:451: in collect
> return head(pcoll, n=-1, include_window_info=include_window_info)
> apache_beam/runners/interactive/utils.py:204: in run_within_progress_indicator
> return 

[jira] [Commented] (BEAM-9934) Resolve differences in beam:metric:element_count:v1 implementations

2020-05-12 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105846#comment-17105846
 ] 

Kyle Weaver commented on BEAM-9934:
---

What's the status on this one? Are we going to try to fix this for 2.21.0, or 
wait until 2.22.0?

> Resolve differences in beam:metric:element_count:v1 implementations
> ---
>
> Key: BEAM-9934
> URL: https://issues.apache.org/jira/browse/BEAM-9934
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: 2.21.0
>
>
> The [element 
> count|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/model/pipeline/src/main/proto/metrics.proto#L206]
>  metric represents the number of elements within a PCollection and is 
> interpreted differently across the Beam SDK versions.
> In the [Java 
> SDK|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/data/PCollectionConsumerRegistry.java#L207]
>  this represents the number of elements and includes how many windows those 
> elements are in. This metric is incremented as soon as the element has been 
> output.
> In the [Python 
> SDK|https://github.com/apache/beam/blame/bfd151aa4c3aad29f3aea6482212ff8543ded8d7/sdks/python/apache_beam/runners/worker/opcounters.py#L247]
>  this represents the number of elements and doesn't include how many windows 
> those elements are in. The metric is also only incremented after the element 
> has finished processing.
> The [Go 
> SDK|https://github.com/apache/beam/blob/7097850daa46674b88425a124bc442fc8ce0dcb8/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L260]
>  does the same thing as Python.
> Traditionally in Dataflow this has always been the exploded window element 
> count and the counter is incremented as soon as the element is output.
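To make the divergence concrete, here is a minimal sketch (plain Python, not Beam or Dataflow code; the bundle representation is invented for illustration) of the two interpretations applied to the same bundle:

```python
# Hypothetical sketch of the two element_count interpretations described
# above. Each bundle entry is (value, list_of_windows_the_element_is_in).

def count_elements(bundle):
    # Python/Go style: one increment per element, windows ignored.
    return len(bundle)

def count_exploded(bundle):
    # Java/Dataflow style: one increment per (element, window) pair.
    return sum(len(windows) for _, windows in bundle)

bundle = [("a", ["w1"]), ("b", ["w1", "w2"]), ("c", ["w1", "w2", "w3"])]

assert count_elements(bundle) == 3  # element count
assert count_exploded(bundle) == 6  # exploded window element count
```

The two counters agree only when every element is in exactly one window, which is why the discrepancy is easy to miss in simple pipelines.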



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9935) Resolve differences in allowed_split_point implementations

2020-05-12 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105842#comment-17105842
 ] 

Kyle Weaver commented on BEAM-9935:
---

https://github.com/apache/beam/pull/11671 solved this for Python, which is all 
we needed in 2.21 IIUC. Moving fix version to 2.22 for Java and Go.

> Resolve differences in allowed_split_point implementations
> --
>
> Key: BEAM-9935
> URL: https://issues.apache.org/jira/browse/BEAM-9935
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [Java SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223]
>  doesn't support it yet, which is also safe.
> [Go SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273]
>  only supports splits if points are specified and it doesn't use the fraction 
> at all.
> [Python SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947]
>  ignores the split points, meaning it may return an invalid split 
> location based upon the runner's limitations.
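For illustration, a hedged sketch of what honoring allowed split points could look like: round the candidate index derived from the requested fraction to the nearest allowed point, rather than ignoring the constraint. `choose_split_index` and its parameters are hypothetical names, not the actual harness API.

```python
# Hypothetical sketch: pick a split index from a fraction of remaining work,
# constrained to the runner-provided allowed_split_points.

def choose_split_index(current_index, total, fraction, allowed_split_points=None):
    # Candidate index from the requested fraction of remaining elements.
    candidate = current_index + 1 + int((total - current_index - 1) * fraction)
    if not allowed_split_points:
        return candidate  # no constraint: split anywhere after current_index
    # Keep only points we have not already passed.
    valid = sorted(p for p in allowed_split_points if p > current_index)
    if not valid:
        return None  # nothing left that the runner would accept
    # Nearest allowed point to the candidate.
    return min(valid, key=lambda p: abs(p - candidate))

assert choose_split_index(0, 100, 0.5) == 50
assert choose_split_index(0, 100, 0.5, allowed_split_points=[25, 60, 90]) == 60
assert choose_split_index(95, 100, 0.5, allowed_split_points=[25, 60, 90]) is None
```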





[jira] [Updated] (BEAM-9935) Resolve differences in allowed_split_point implementations

2020-05-12 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9935:
--
Fix Version/s: (was: 2.21.0)
   2.22.0

> Resolve differences in allowed_split_point implementations
> --
>
> Key: BEAM-9935
> URL: https://issues.apache.org/jira/browse/BEAM-9935
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [Java SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223]
>  doesn't support it yet, which is also safe.
> [Go SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273]
>  only supports splits if points are specified and it doesn't use the fraction 
> at all.
> [Python SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947]
>  ignores the split points, meaning it may return an invalid split 
> location based upon the runner's limitations.





[jira] [Resolved] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-9945.
---
Resolution: Fixed

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105836#comment-17105836
 ] 

Kyle Weaver commented on BEAM-9975:
---

The error message makes it impossible to tell which option is actually the 
problem. I'll try submitting a patch to the protobuf formatter. In the 
meantime, we could consider catching the error and printing the offending options.
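A rough sketch of that "catch and print the offending options" idea. `find_offending_options` and `fake_convert` are hypothetical names; the fake converter only emulates the type checks of protobuf's `_ConvertValueMessage` (quoted in the issue below) so the sketch stays self-contained.

```python
# Hypothetical helper: try the struct conversion per option so the offending
# key is reported, instead of one opaque ParseError for the whole dict.
# `convert` stands in for json_format.ParseDict in the real code.

def find_offending_options(options, convert):
    bad = {}
    for key, value in options.items():
        try:
            convert({key: value})
        except Exception as exc:  # ParseError in the real code
            bad[key] = (type(value).__name__, str(exc))
    return bad

def fake_convert(d):
    # Emulates the accepted JSON value types from _ConvertValueMessage.
    for v in d.values():
        if not isinstance(v, (dict, list, bool, str, int, float, type(None))):
            raise ValueError("Unexpected type for Value message.")

opts = {"runner": "PortableRunner", "streaming": True, "oops": object()}
assert list(find_offending_options(opts, fake_convert)) == ["oops"]
```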

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Commented] (BEAM-9707) Hardcode runner harness container image for unified worker

2020-05-12 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105830#comment-17105830
 ] 

Brian Hulette commented on BEAM-9707:
-

What happens if 2.21 isn't released before the 2.22 cut date? Can this just be 
pushed again to 2.23?

> Hardcode runner harness container image for unified worker
> --
>
> Key: BEAM-9707
> URL: https://issues.apache.org/jira/browse/BEAM-9707
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Hardcode runner harness image temporarily to support usage of unified worker 
> on head.
> Remove this hardcoding once 2.21 is released.





[jira] [Resolved] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform

2020-05-12 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette resolved BEAM-9622.
-
Fix Version/s: 2.22.0
   Resolution: Fixed

> Support for consuming tagged PCollections in Python SqlTransform
> 
>
> Key: BEAM-9622
> URL: https://issues.apache.org/jira/browse/BEAM-9622
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
> Fix For: 2.22.0
>
>






[jira] [Updated] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9975:
--
Description: 
Error looks similar to the one in BEAM-9907. Example from 
https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732

{code}
apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
apache_beam/pipeline.py:550: in __exit__
self.run().wait_until_finish()
apache_beam/pipeline.py:529: in run
return self.runner.run_pipeline(self, self._options)
apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
job_service_handle.submit(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:107: in submit
prepare_response = self.prepare(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:184: in prepare
pipeline_options=self.get_pipeline_options()),
apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options
return job_utils.dict_to_struct(p_options)
apache_beam/runners/job/utils.py:33: in dict_to_struct
return json_format.ParseDict(dict_obj, struct_pb2.Struct())
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
 in ParseDict
parser.ConvertMessage(js_dict, message)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
 in ConvertMessage
methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
 in _ConvertStructMessage
self._ConvertValueMessage(value[key], message.fields[key])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
value = 
message = 

def _ConvertValueMessage(self, value, message):
  """Convert a JSON representation into Value message."""
  if isinstance(value, dict):
self._ConvertStructMessage(value, message.struct_value)
  elif isinstance(value, list):
self._ConvertListValueMessage(value, message.list_value)
  elif value is None:
message.null_value = 0
  elif isinstance(value, bool):
message.bool_value = value
  elif isinstance(value, six.string_types):
message.string_value = value
  elif isinstance(value, _INT_OR_FLOAT):
message.number_value = value
  else:
>   raise ParseError('Unexpected type for Value message.')
E   google.protobuf.json_format.ParseError: Unexpected type for Value 
message.
{code}

  was:
Error looks similar to the one in BEAM-9907. Example from 
https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732:

{code}
apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
apache_beam/pipeline.py:550: in __exit__
self.run().wait_until_finish()
apache_beam/pipeline.py:529: in run
return self.runner.run_pipeline(self, self._options)
apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
job_service_handle.submit(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:107: in submit
prepare_response = self.prepare(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:184: in prepare
pipeline_options=self.get_pipeline_options()),
apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options
return job_utils.dict_to_struct(p_options)
apache_beam/runners/job/utils.py:33: in dict_to_struct
return json_format.ParseDict(dict_obj, struct_pb2.Struct())
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
 in ParseDict
parser.ConvertMessage(js_dict, message)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
 in ConvertMessage
methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
 in _ConvertStructMessage
self._ConvertValueMessage(value[key], message.fields[key])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
value = 
message = 

def _ConvertValueMessage(self, value, message):
  """Convert a JSON representation into Value message."""
  if isinstance(value, dict):
self._ConvertStructMessage(value, message.struct_value)
  elif isinstance(value, list):
self._ConvertListValueMessage(value, message.list_value)
  elif value is None:
message.null_value = 0
  elif isinstance(value, bool):
message.bool_value = value
  elif isinstance(value, six.string_types):
message.string_value = value
  elif isinstance(value, _INT_OR_FLOAT):
message.number_value = value
  

[jira] [Assigned] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reassigned BEAM-9975:
---

Assignee: (was: Brian Hulette)

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732:
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Created] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-12 Thread Boyuan Zhang (Jira)
Boyuan Zhang created BEAM-9977:
--

 Summary: Build Kafka Read on top of Java SplittableDoFn
 Key: BEAM-9977
 URL: https://issues.apache.org/jira/browse/BEAM-9977
 Project: Beam
  Issue Type: New Feature
  Components: io-java-kafka
Reporter: Boyuan Zhang
Assignee: Boyuan Zhang








[jira] [Updated] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9977:
--
Status: Open  (was: Triage Needed)

> Build Kafka Read on top of Java SplittableDoFn
> --
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>






[jira] [Updated] (BEAM-9488) Python SDK sending unexpected MonitoringInfo

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-9488:

Fix Version/s: (was: 2.22.0)
   2.21.0

> Python SDK sending unexpected MonitoringInfo
> 
>
> Key: BEAM-9488
> URL: https://issues.apache.org/jira/browse/BEAM-9488
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ruoyun Huang
>Assignee: Luke Cwik
>Priority: Minor
>  Labels: portability
> Fix For: 2.21.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The element_count metric is supposed to be tied to pcollection ids, but by 
> inspecting what the Python SDK sends over, we see there are MonitoringInfos 
> sent with ptransforms in them. 
> [Double checked the job graph, these seem to be redundant. i.e. the 
> corresponding pcollection does have its own MonitoringInfo reported.]
> Likely a bug. 
> Proof: 
> urn: "beam:metric:element_count:v1"
> type: "beam:metrics:sum_int_64"
> metric {
>   counter_data {
> int64_value: 1
>   }
> }
> labels {
>   key: "PTRANSFORM"
>   value: "start/MaybeReshuffle/Reshuffle/RemoveRandomKeys-ptransform-85"
> }
> labels {
>   key: "TAG"
>   value: "None"
> }
> timestamp {
>   seconds: 1583949073
>   nanos: 842402935
> }





[jira] [Commented] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs

2020-05-12 Thread Robert Burke (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105750#comment-17105750
 ] 

Robert Burke commented on BEAM-9959:


The right overall fix for that is to check for cycles WRT the composites after 
the topological sort, and to print out that there's a cycle involving the 
*composite* node represented by the scope; anything short of the full cycle is 
much harder to debug. Further, the individual PTransforms involved should be 
fully qualified with their composite parent hierarchies to make it easier to 
find where they are coming from. We should then recommend either merging the 
two scopes, or moving each new scope object into its own function with one 
scope per function, which makes the bad construction impossible.
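As a sketch of that cycle check (written in Python for brevity, not Go SDK code; `find_cycle_members` and the composite dependency map are invented for illustration): run Kahn's algorithm over the composite hypergraph, and whatever never reaches indegree zero sits on, or downstream of, a cycle and can be reported by its fully qualified name.

```python
# Hypothetical sketch: Kahn's topological sort over composite (scope)
# dependencies; leftover nodes indicate a cycle among the composites.
from collections import defaultdict, deque

def find_cycle_members(edges):
    """edges: dict mapping composite -> set of composites it depends on."""
    indegree = defaultdict(int)
    dependents = defaultdict(set)
    nodes = set(edges)
    for node, deps in edges.items():
        nodes.update(deps)
        for dep in deps:
            indegree[node] += 1
            dependents[dep].add(node)
    queue = deque(n for n in nodes if indegree[n] == 0)
    while queue:
        n = queue.popleft()
        for m in dependents[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    # Nodes that never reached indegree 0 are on (or downstream of) a cycle.
    return sorted(n for n in nodes if indegree[n] > 0)

# The BEAM-9959 shape: composite a feeds b (via colA) and b feeds a (via colB).
assert find_cycle_members({"a": {"b"}, "b": {"a"}}) == ["a", "b"]
assert find_cycle_members({"b": {"a"}}) == []
```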

> Mistakes Computing Composite Inputs and Outputs
> ---
>
> Key: BEAM-9959
> URL: https://issues.apache.org/jira/browse/BEAM-9959
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>
> The Go SDK uses a Scope object to manage beam Composites.
> A bug was discovered when consuming a PCollection in both the composite that 
> created it, and in a separate composite.
> Further, the Go SDK should verify that the root hypergraph structure is a DAG 
> and provide a reasonable error.  In particular, the leaf nodes of the graph 
> could form a DAG, but due to how the beam.Scope object is used, might cause 
> the hypergraph to not be a DAG.
> Eg. It's possible to write the following in the Go SDK.
>  PTransforms A, B, C and PCollections colA, colB, and Composites a, b.
> A and C are in a, and B is in b.
> A generates colA
> B consumes colA, and generates colB.
> C consumes colA and colB.
> ```
> a := s.Scope(a)
> b := s.Scope(b)
> colA := beam.Impulse(*a*)
> colB := beam.ParDo(*b*, , colA)
> beam.ParDo0(*a*, , colA, beam.SideInput{colB})
> ```
> If it doesn't already, the Go SDK must emit a clear error, and fail pipeline 
> construction.
> If the affected composites are roots in the graph, the cycle prevents being 
> able to topologically sort the root ptransforms for the pipeline graph, which 
> can adversely affect runners.
> The recommendation is always to wrap uses of scope in functions or other 
> scopes to prevent such incorrect constructions.





[jira] [Created] (BEAM-9976) FlinkSavepointTest timeout flake

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9976:
-

 Summary: FlinkSavepointTest timeout flake
 Key: BEAM-9976
 URL: https://issues.apache.org/jira/browse/BEAM-9976
 Project: Beam
  Issue Type: Bug
  Components: runner-flink, test-failures
Reporter: Kyle Weaver


org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable

org.junit.runners.model.TestTimedOutException: test timed out after 60 seconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at 
org.apache.beam.runners.flink.FlinkSavepointTest.runSavepointAndRestore(FlinkSavepointTest.java:188)
at 
org.apache.beam.runners.flink.FlinkSavepointTest.testSavepointRestorePortable(FlinkSavepointTest.java:154)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)






[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105743#comment-17105743
 ] 

Brian Hulette commented on BEAM-9975:
-

In BEAM-9907 the underlying issue was that PipelineOptions picked up sys.argv, 
which was the argv for the test framework, not Beam. I don't see any clear 
places where we could be creating PipelineOptions that read sys.argv in these 
PortableRunnerTest methods.

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732:
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Brian Hulette (Jira)
Brian Hulette created BEAM-9975:
---

 Summary: PortableRunnerTest flake "ParseError: Unexpected type for 
Value message."
 Key: BEAM-9975
 URL: https://issues.apache.org/jira/browse/BEAM-9975
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Brian Hulette
Assignee: Brian Hulette


Error looks similar to the one in BEAM-9907. Example from 
https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732:

{code}
apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
apache_beam/pipeline.py:550: in __exit__
self.run().wait_until_finish()
apache_beam/pipeline.py:529: in run
return self.runner.run_pipeline(self, self._options)
apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
job_service_handle.submit(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:107: in submit
prepare_response = self.prepare(proto_pipeline)
apache_beam/runners/portability/portable_runner.py:184: in prepare
pipeline_options=self.get_pipeline_options()),
apache_beam/runners/portability/portable_runner.py:174: in get_pipeline_options
return job_utils.dict_to_struct(p_options)
apache_beam/runners/job/utils.py:33: in dict_to_struct
return json_format.ParseDict(dict_obj, struct_pb2.Struct())
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
 in ParseDict
parser.ConvertMessage(js_dict, message)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
 in ConvertMessage
methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
 in _ConvertStructMessage
self._ConvertValueMessage(value[key], message.fields[key])
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
value = 
message = 

def _ConvertValueMessage(self, value, message):
  """Convert a JSON representation into Value message."""
  if isinstance(value, dict):
self._ConvertStructMessage(value, message.struct_value)
  elif isinstance(value, list):
self._ConvertListValueMessage(value, message.list_value)
  elif value is None:
message.null_value = 0
  elif isinstance(value, bool):
message.bool_value = value
  elif isinstance(value, six.string_types):
message.string_value = value
  elif isinstance(value, _INT_OR_FLOAT):
message.number_value = value
  else:
>   raise ParseError('Unexpected type for Value message.')
E   google.protobuf.json_format.ParseError: Unexpected type for Value 
message.
{code}
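For context, the type dispatch that raises this error can be sketched in plain Python. This is a hedged sketch of the branching in protobuf's `_ConvertValueMessage` shown above, not the actual protobuf code; the message-building and `_INT_OR_FLOAT` details are simplified away:

```python
def convert_value(value):
    """Sketch of the type dispatch in protobuf's _ConvertValueMessage.

    Any value that is not a dict, list, None, bool, str, int, or float
    falls through to the error seen in the traceback above.
    """
    if isinstance(value, dict):
        return {k: convert_value(v) for k, v in value.items()}
    elif isinstance(value, list):
        return [convert_value(v) for v in value]
    elif value is None:
        return None
    elif isinstance(value, bool):  # bool checked before int/float on purpose
        return value
    elif isinstance(value, str):
        return value
    elif isinstance(value, (int, float)):
        return float(value)  # Struct stores all numbers as doubles
    else:
        raise ValueError('Unexpected type for Value message.')
```

The flake therefore suggests that some pipeline option value reaching `dict_to_struct` is not one of these JSON-compatible types.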





[jira] [Updated] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9975:
--
Status: Open  (was: Triage Needed)

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732:
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
> self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
> self._ConvertListValueMessage(value, message.list_value)
>   elif value is None:
> message.null_value = 0
>   elif isinstance(value, bool):
> message.bool_value = value
>   elif isinstance(value, six.string_types):
> message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
> message.number_value = value
>   else:
> >   raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value 
> message.
> {code}





[jira] [Commented] (BEAM-9843) Flink UnboundedSourceWrapperTest flaky due to a timeout

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105723#comment-17105723
 ] 

Maximilian Michels commented on BEAM-9843:
--

Thanks for reporting this. This should be fixed. Feel free to ping me here if 
it fails again.

> Flink UnboundedSourceWrapperTest flaky due to a timeout
> ---
>
> Key: BEAM-9843
> URL: https://issues.apache.org/jira/browse/BEAM-9843
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
> Fix For: Not applicable
>
>
> For example,
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2684/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2682/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2680/]
> [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2685/testReport/junit/org.apache.beam.runners.flink.translation.wrappers.streaming.io/UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest/testWatermarkEmission_numTasks___4__numSplits_4_/]
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds at sun.misc.Unsafe.park(Native Method) at 
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at 
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest.testWatermarkEmission(UnboundedSourceWrapperTest.java:354)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.lang.Thread.run(Thread.java:748)
>  
>  
>  
>  
>  
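The timeout above comes from the test thread blocking on a `CountDownLatch` that the source is expected to count down. A minimal sketch of that pattern (illustrative Python analogue, not the Flink test itself; `wait_for_watermark` and `emit` are hypothetical names):

```python
import threading

def wait_for_watermark(emit, timeout_s):
    """Latch pattern behind this kind of flake: the test thread blocks
    until a worker signals, and a slow CI executor can exhaust the
    timeout before the signal arrives."""
    latch = threading.Event()

    def worker():
        emit()       # stand-in for the source emitting its watermark
        latch.set()  # analogous to CountDownLatch.countDown()

    t = threading.Thread(target=worker)
    t.start()
    reached = latch.wait(timeout=timeout_s)  # analogous to CountDownLatch.await
    t.join()
    return reached
```

When `reached` is False the test fails with a timeout even though nothing is functionally wrong, which matches the flaky behavior reported here.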





[jira] [Resolved] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9164.
--
Fix Version/s: Not applicable
   Resolution: Fixed

This should be fixed but feel free to re-open if it pops up again.

> [PreCommit_Java] [Flake] 
> UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
> ---
>
> Key: BEAM-9164
> URL: https://issues.apache.org/jira/browse/BEAM-9164
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kirill Kozlov
>Priority: Critical
>  Labels: flake
> Fix For: Not applicable
>
>
> Test:
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
>  >> testWatermarkEmission[numTasks = 1; numSplits=1]
> Fails with the following exception:
> {code:java}
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds{code}
> Affected Jenkins job: 
> [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/]
> Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q]





[jira] [Updated] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions

2020-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9969:
---
Description: 
There are four types of implementation of table-valued function:
1. FixedSchemaOutput. The output of the TVF is pre-defined.
2. InputSchemaForwardToOutput. The output schema of the TVF is the same as 
input schema.
3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF 
is the same as input schema plus columns appended.
4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, 
etc.)

  was:
There are three types of implementation of table-valued function:
1. FixedSchemaOutput. The output of the TVF is pre-defined.
2. InputSchemaForwardToOutput. The output schema of the TVF is the same as 
input schema.
3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF 
is the same as input schema plus columns appended.
4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, 
etc.)


> Beam ZetaSQL supports pure SQL user-defined table functions
> ---
>
> Key: BEAM-9969
> URL: https://issues.apache.org/jira/browse/BEAM-9969
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Priority: Major
>
> There are four types of implementation of table-valued function:
> 1. FixedSchemaOutput. The output of the TVF is pre-defined.
> 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as 
> input schema.
> 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the 
> TVF is the same as input schema plus columns appended.
> 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, 
> etc.)
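As an illustration only (a hypothetical helper, not Beam ZetaSQL code), the first three kinds differ only in how the output schema is derived, while the templated kind cannot be resolved statically:

```python
def tvf_output_schema(kind, input_schema=(), fixed=(), appended=()):
    """Hypothetical sketch of output-schema derivation for the TVF kinds above."""
    if kind == "FixedSchemaOutput":
        return list(fixed)                           # output is pre-defined
    if kind == "InputSchemaForwardToOutput":
        return list(input_schema)                    # input schema forwarded unchanged
    if kind == "InputSchemaForwardToOutputWithAppendedColumns":
        return list(input_schema) + list(appended)   # forwarded plus appended columns
    raise NotImplementedError(
        "TemplatedTVF: schema must be resolved per call site (ANY TYPE/ANY TABLE)")
```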





[jira] [Created] (BEAM-9974) beam_PostRelease_NightlySnapshot failing

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9974:
-

 Summary: beam_PostRelease_NightlySnapshot failing
 Key: BEAM-9974
 URL: https://issues.apache.org/jira/browse/BEAM-9974
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kyle Weaver


Another failure mode:

07:02:29 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow
07:02:29 [ERROR] Failed to execute goal 
org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
word-count-beam: An exception occured while executing the Java class. No 
filesystem found for scheme gs -> [Help 1]
07:02:29 [ERROR] 
07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
the -e switch.
07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
07:02:29 [ERROR] 
07:02:29 [ERROR] For more information about the errors and possible solutions, 
please read the following articles:
07:02:29 [ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
07:02:29 [ERROR] Failed to execute goal 
org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
word-count-beam: An exception occured while executing the Java class. No 
filesystem found for scheme gs -> [Help 1]
07:02:29 [ERROR] 
07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
the -e switch.
07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
07:02:29 [ERROR] 
07:02:29 [ERROR] For more information about the errors and possible solutions, 
please read the following articles:
07:02:29 [ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
07:02:29 [ERROR] Failed command
07:02:29 
07:02:29 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow 
FAILED






[jira] [Created] (BEAM-9973) beam_PostCommit_Py_ValCont flakes (before running tests)

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9973:
-

 Summary: beam_PostCommit_Py_ValCont flakes (before running tests)
 Key: BEAM-9973
 URL: https://issues.apache.org/jira/browse/BEAM-9973
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kyle Weaver


14:05:15 - Last  20 lines from daemon log file - daemon-2143.out.log -
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/metrics_pb2_urns.py
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_artifact_api_pb2_urns.py
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/standard_window_fns_pb2_urns.py
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_fn_api_pb2_urns.py
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_job_api_pb2_urns.py
14:05:15 INFO:gen_protos:Writing urn stubs: 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/portability/api/beam_runner_api_pb2_urns.py
14:05:15 warning: no files found matching 'README.md'
14:05:15 warning: no files found matching 'NOTICE'
14:05:15 warning: no files found matching 'LICENSE'
14:05:15 warning: sdist: standard file not found: should have one of README, 
README.rst, README.txt, README.md
14:05:15 
14:05:15 Create distribution tar file apache-beam.tar.gz in 
/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/build
14:05:15 :sdks:python:sdist (Thread[Execution worker for ':',5,main]) 
completed. Took 7.453 secs.
14:05:15 :sdks:python:container:py2:copyDockerfileDependencies 
(Thread[Execution worker for ':',5,main]) started.
14:05:15 Build cache key for task 
':sdks:python:container:py2:copyDockerfileDependencies' is 
ea7f5d2ce156f4297b3b2c5c3f21611d
14:05:15 Caching disabled for task 
':sdks:python:container:py2:copyDockerfileDependencies': Caching has not been 
enabled for the task
14:05:15 Task ':sdks:python:container:py2:copyDockerfileDependencies' is not 
up-to-date because:
14:05:15   No history is available.
14:05:15 :sdks:python:container:py2:copyDockerfileDependencies 
(Thread[Execution worker for ':',5,main]) completed. Took 0.012 secs.
14:05:15 Daemon vm is shutting down... The daemon has exited normally or was 
terminated in response to a user interrupt.
14:05:15 - End of the daemon log -
14:05:15 
14:05:15 
14:05:15 FAILURE: Build failed with an exception.
14:05:15 
14:05:15 * What went wrong:
14:05:15 Gradle build daemon disappeared unexpectedly (it may have been killed 
or may have crashed)






[jira] [Created] (BEAM-9972) Test pure SQL UDF for all data types

2020-05-12 Thread Rui Wang (Jira)
Rui Wang created BEAM-9972:
--

 Summary: Test pure SQL UDF for all data types
 Key: BEAM-9972
 URL: https://issues.apache.org/jira/browse/BEAM-9972
 Project: Beam
  Issue Type: Task
  Components: dsl-sql-zetasql
Reporter: Rui Wang


The following are the data types that UDF will support:
* ARRAY
* BOOL
* BYTES
* DATE
* TIME
* FLOAT64
* INT64
* NUMERIC
* STRING
* STRUCT
* TIMESTAMP

We should write unit tests against these types.
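A simple coverage check can keep the test suite honest as types are added (hypothetical helper for illustration, not part of the Beam codebase):

```python
# The supported UDF data types listed above.
UDF_TYPES = {"ARRAY", "BOOL", "BYTES", "DATE", "TIME", "FLOAT64",
             "INT64", "NUMERIC", "STRING", "STRUCT", "TIMESTAMP"}

def untested_types(tested):
    """Return the supported types that still lack a unit test."""
    return sorted(UDF_TYPES - set(tested))
```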






[jira] [Updated] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions

2020-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9969:
---
Status: Open  (was: Triage Needed)

> Beam ZetaSQL supports pure SQL user-defined table functions
> ---
>
> Key: BEAM-9969
> URL: https://issues.apache.org/jira/browse/BEAM-9969
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Priority: Major
>
> There are three types of implementation of table-valued function:
> 1. FixedSchemaOutput. The output of the TVF is pre-defined.
> 2. InputSchemaForwardToOutput. The output schema of the TVF is the same as 
> input schema.
> 3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the 
> TVF is the same as input schema plus columns appended.
> 4. TemplatedTVF. Parameters of TVF are templated. (e.g. ANY TYPE, ANY TABLE, 
> etc.)





[jira] [Updated] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9971:
--
Status: Open  (was: Triage Needed)

> beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)
> --
>
> Key: BEAM-9971
> URL: https://issues.apache.org/jira/browse/BEAM-9971
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> This happens sporadically. One time the issue affected 14 tests; another time 
> it affected 112 tests.
> java.lang.RuntimeException: The Runner experienced the following error during 
> execution:
> java.io.FileNotFoundException: 
> /tmp/spark-0812a463-8d6b-4c97-be4b-de43baf67108/userFiles-b90ca2e1-2041-442d-ae78-c8e9c30bff49/beam-runners-spark-2.22.0-SNAPSHOT.jar
>  (No such file or directory)
>   at 
> org.apache.beam.runners.portability.JobServicePipelineResult.propagateErrors(JobServicePipelineResult.java:165)
>   at 
> org.apache.beam.runners.portability.JobServicePipelineResult.waitUntilFinish(JobServicePipelineResult.java:110)
>   at 
> org.apache.beam.runners.portability.testing.TestPortableRunner.run(TestPortableRunner.java:83)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
>   at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
>   at 
> org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics(MetricsPusherTest.java:70)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>   at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>   

[jira] [Created] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9971:
-

 Summary: beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)
 Key: BEAM-9971
 URL: https://issues.apache.org/jira/browse/BEAM-9971
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kyle Weaver
Assignee: Kyle Weaver


This happens sporadically. One time the issue affected 14 tests; another time 
it affected 112 tests.

java.lang.RuntimeException: The Runner experienced the following error during 
execution:
java.io.FileNotFoundException: 
/tmp/spark-0812a463-8d6b-4c97-be4b-de43baf67108/userFiles-b90ca2e1-2041-442d-ae78-c8e9c30bff49/beam-runners-spark-2.22.0-SNAPSHOT.jar
 (No such file or directory)
at 
org.apache.beam.runners.portability.JobServicePipelineResult.propagateErrors(JobServicePipelineResult.java:165)
at 
org.apache.beam.runners.portability.JobServicePipelineResult.waitUntilFinish(JobServicePipelineResult.java:110)
at 
org.apache.beam.runners.portability.testing.TestPortableRunner.run(TestPortableRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics(MetricsPusherTest.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:319)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
at 
org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.GeneratedMethodAccessor161.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 
org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
at 
org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:118)
at sun.reflect.GeneratedMethodAccessor160.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at 

[jira] [Created] (BEAM-9970) beam_PostRelease_NightlySnapshot failing

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9970:
-

 Summary: beam_PostRelease_NightlySnapshot failing
 Key: BEAM-9970
 URL: https://issues.apache.org/jira/browse/BEAM-9970
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kyle Weaver


07:44:55 gcloud dataflow jobs cancel $(gcloud dataflow jobs list | grep 
leaderboard-validation-1588332285874-332 | grep Running | cut -d' ' -f1)
07:44:59 ERROR: (gcloud.dataflow.jobs.cancel) argument JOB_ID [JOB_ID ...]: 
Must be specified.
07:44:59 Usage: gcloud dataflow jobs cancel JOB_ID [JOB_ID ...] [optional flags]
07:44:59   optional flags may be  --help | --region
07:44:59 
07:44:59 For detailed information on this command and its flags, run:
07:44:59   gcloud dataflow jobs cancel --help
07:44:59 ERROR: (gcloud.dataflow.jobs.cancel) argument JOB_ID [JOB_ID ...]: 
Must be specified.
07:44:59 Usage: gcloud dataflow jobs cancel JOB_ID [JOB_ID ...] [optional flags]
07:44:59   optional flags may be  --help | --region
07:44:59 
07:44:59 For detailed information on this command and its flags, run:
07:44:59   gcloud dataflow jobs cancel --help
07:45:00 [ERROR] Failed command
07:45:00 
07:45:00 > Task :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow 
FAILED
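The `(gcloud.dataflow.jobs.cancel) argument JOB_ID [JOB_ID ...]: Must be specified` error fires because the `grep | cut` pipeline matched nothing, so `gcloud` is invoked with an empty argument. A hedged sketch of the guard that avoids this (hypothetical helper, not the actual release script):

```python
def build_cancel_command(job_ids):
    """Only build the cancel invocation when the job listing actually
    matched something; an empty JOB_ID list makes gcloud error out."""
    ids = [j for j in job_ids if j]  # drop empty strings from the grep/cut pipeline
    if not ids:
        return None  # nothing running: skip the gcloud call entirely
    return ["gcloud", "dataflow", "jobs", "cancel"] + ids
```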






[jira] [Assigned] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-9968:
-

Assignee: Luke Cwik

> beam_PreCommit_Java_Cron 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
> Failure
> --
>
> Key: BEAM-9968
> URL: https://issues.apache.org/jira/browse/BEAM-9968
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Assignee: Luke Cwik
>Priority: Critical
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
>  * [https://gradle.com/s/ij5a4z7nhwjn2]
>  * 
> [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)
> Failure is: `AssertionError` since an assertion was used without a meaningful 
> error message. It is asserting that the splits are non-empty.
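The fix implied here is to attach a message to the assertion so the failure report says what was empty. A sketch only, not the actual `RemoteExecutionTest` code (`assert_nonempty_splits` is a hypothetical name):

```python
def assert_nonempty_splits(splits):
    """Assert with a descriptive message, unlike the bare assertion
    behind the opaque AssertionError in this report."""
    assert len(splits) > 0, (
        "expected at least one split from the SDK harness, got none")
```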





[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105664#comment-17105664
 ] 

Kenneth Knowles commented on BEAM-9968:
---

Actually stderr is flooded with a bunch of errors. Not totally clear to me 
whether those errors are just the blast radius of the assertion error.

> beam_PreCommit_Java_Cron 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
> Failure
> --
>
> Key: BEAM-9968
> URL: https://issues.apache.org/jira/browse/BEAM-9968
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
>  * [https://gradle.com/s/ij5a4z7nhwjn2]
>  * 
> [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)
> Failure is: `AssertionError` since an assertion was used without a meaningful 
> error message. It is asserting that the splits are non-empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9969) Beam ZetaSQL supports pure SQL user-defined table functions

2020-05-12 Thread Rui Wang (Jira)
Rui Wang created BEAM-9969:
--

 Summary: Beam ZetaSQL supports pure SQL user-defined table 
functions
 Key: BEAM-9969
 URL: https://issues.apache.org/jira/browse/BEAM-9969
 Project: Beam
  Issue Type: Task
  Components: dsl-sql-zetasql
Reporter: Rui Wang


There are four types of table-valued function (TVF) implementations:
1. FixedSchemaOutput. The output schema of the TVF is pre-defined.
2. InputSchemaForwardToOutput. The output schema of the TVF is the same as the 
input schema.
3. InputSchemaForwardToOutputWithAppendedColumns. The output schema of the TVF 
is the input schema plus appended columns.
4. TemplatedTVF. Parameters of the TVF are templated (e.g. ANY TYPE, ANY TABLE, 
etc.).
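A hedged sketch of how these implementation kinds could determine a TVF's output schema; the function and kind names below mirror the list above but are illustrative only, not Beam ZetaSQL's actual API:

```python
# Illustrative model of the four TVF kinds; names are hypothetical and do
# not correspond to real Beam ZetaSQL classes.
def tvf_output_schema(kind, input_schema=None, fixed_schema=None, appended=()):
    if kind == "FixedSchemaOutput":
        return list(fixed_schema)                    # output is pre-defined
    if kind == "InputSchemaForwardToOutput":
        return list(input_schema)                    # output mirrors the input
    if kind == "InputSchemaForwardToOutputWithAppendedColumns":
        return list(input_schema) + list(appended)   # input plus extra columns
    if kind == "TemplatedTVF":
        return None                                  # resolved per call site
    raise ValueError("unknown TVF kind: %s" % kind)

print(tvf_output_schema("InputSchemaForwardToOutputWithAppendedColumns",
                        input_schema=["id"], appended=["score"]))
```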



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform

2020-05-12 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9622 started by Brian Hulette.
---
> Support for consuming tagged PCollections in Python SqlTransform
> 
>
> Key: BEAM-9622
> URL: https://issues.apache.org/jira/browse/BEAM-9622
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9622) Support for consuming tagged PCollections in Python SqlTransform

2020-05-12 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reassigned BEAM-9622:
---

Assignee: Brian Hulette

> Support for consuming tagged PCollections in Python SqlTransform
> 
>
> Key: BEAM-9622
> URL: https://issues.apache.org/jira/browse/BEAM-9622
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105661#comment-17105661
 ] 

Kenneth Knowles commented on BEAM-9968:
---

I did not see an obvious suspect commit. It has failed twice in a row now.

> beam_PreCommit_Java_Cron 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
> Failure
> --
>
> Key: BEAM-9968
> URL: https://issues.apache.org/jira/browse/BEAM-9968
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
>  * [https://gradle.com/s/ij5a4z7nhwjn2]
>  * 
> [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)
> Failure is: `AssertionError` since an assertion was used without a meaningful 
> error message. It is asserting that the splits are non-empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9968:
--
Description: 
_Use this form to file an issue for test failure:_
 * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
 * [https://gradle.com/s/ij5a4z7nhwjn2]
 * 
[RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)

Failure is: `AssertionError` since an assertion was used without a meaningful 
error message. It is asserting that the splits are non-empty.

  was:
_Use this form to file an issue for test failure:_
 * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
 * [https://gradle.com/s/ij5a4z7nhwjn2]
 * 
[RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)


> beam_PreCommit_Java_Cron 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
> Failure
> --
>
> Key: BEAM-9968
> URL: https://issues.apache.org/jira/browse/BEAM-9968
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
>  * [https://gradle.com/s/ij5a4z7nhwjn2]
>  * 
> [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)
> Failure is: `AssertionError` since an assertion was used without a meaningful 
> error message. It is asserting that the splits are non-empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105660#comment-17105660
 ] 

Kenneth Knowles commented on BEAM-9968:
---

CC [~lcwik] and [~chamikara]. Any ideas on this one?

> beam_PreCommit_Java_Cron 
> org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
> Failure
> --
>
> Key: BEAM-9968
> URL: https://issues.apache.org/jira/browse/BEAM-9968
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
>  * [https://gradle.com/s/ij5a4z7nhwjn2]
>  * 
> [RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)
> Failure is: `AssertionError` since an assertion was used without a meaningful 
> error message. It is asserting that the splits are non-empty.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9968) beam_PreCommit_Java_Cron org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit Failure

2020-05-12 Thread Kenneth Knowles (Jira)
Kenneth Knowles created BEAM-9968:
-

 Summary: beam_PreCommit_Java_Cron 
org.apache.beam.runners.fnexecution.control.RemoteExecutionTest.testSplit 
Failure
 Key: BEAM-9968
 URL: https://issues.apache.org/jira/browse/BEAM-9968
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kenneth Knowles


_Use this form to file an issue for test failure:_
 * [https://builds.apache.org/job/beam_PreCommit_Java_Cron/2741/]
 * [https://gradle.com/s/ij5a4z7nhwjn2]
 * 
[RemoteExecutionTest.testSplit](https://github.com/apache/beam/blob/820f0f5c12146195ed80617763a354ddb75f0bc1/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L1358)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9954) Beam ZetaSQL supports pure SQL user-defined aggregation functions

2020-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9954:
---
Description: 
One naive example is


{code:sql}
CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a));
{code}


> Beam ZetaSQL supports pure SQL user-defined aggregation functions
> -
>
> Key: BEAM-9954
> URL: https://issues.apache.org/jira/browse/BEAM-9954
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Priority: Major
>
> One naive example is
> {code:sql}
> CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a));
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9954) Beam ZetaSQL supports pure SQL user-defined aggregation functions

2020-05-12 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-9954:
---
Description: 
One naive example is


{code:sql}
CREATE AGGREGATE FUNCTION fun (a INT64, b INT64) AS (SUM(a)/AVG(b));
{code}


  was:
One naive example is


{code:sql}
CREATE AGGREGATE FUNCTION fun (a INT64) AS (SUM(a));
{code}



> Beam ZetaSQL supports pure SQL user-defined aggregation functions
> -
>
> Key: BEAM-9954
> URL: https://issues.apache.org/jira/browse/BEAM-9954
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql-zetasql
>Reporter: Rui Wang
>Priority: Major
>
> One naive example is
> {code:sql}
> CREATE AGGREGATE FUNCTION fun (a INT64, b INT64) AS (SUM(a)/AVG(b));
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9967) Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9967:
--
Status: Open  (was: Triage Needed)

> Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms
> 
>
> Key: BEAM-9967
> URL: https://issues.apache.org/jira/browse/BEAM-9967
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9967) Add support for BigQuery job labels on ReadFrom/WriteTo(BigQuery) transforms

2020-05-12 Thread Pablo Estrada (Jira)
Pablo Estrada created BEAM-9967:
---

 Summary: Add support for BigQuery job labels on 
ReadFrom/WriteTo(BigQuery) transforms
 Key: BEAM-9967
 URL: https://issues.apache.org/jira/browse/BEAM-9967
 Project: Beam
  Issue Type: Bug
  Components: io-py-gcp
Reporter: Pablo Estrada
Assignee: Pablo Estrada






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9807) PostRelease_NightlySnapshot failing due to missing "region" flag

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9807:
--
Priority: Critical  (was: Major)

> PostRelease_NightlySnapshot failing due to missing "region" flag
> 
>
> Key: BEAM-9807
> URL: https://issues.apache.org/jira/browse/BEAM-9807
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Kyle Weaver
>Priority: Critical
> Fix For: 2.22.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostRelease_NightlySnapshot/959/]
> [https://builds.apache.org/job/beam_PostRelease_NightlySnapshot/958/]
> [https://scans.gradle.com/s/tth6vapotb5vw/console-log?task=:runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow]
> [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java 
> (default-cli) on project word-count-beam: An exception occured while 
> executing the Java class. Failed to construct instance from factory method 
> DataflowRunner#fromOptions(interface 
> org.apache.beam.sdk.options.PipelineOptions): InvocationTargetException: 
> Missing required values: region -> [Help 1]
>  
> Probably due to [https://github.com/apache/beam/pull/11281].
>  
> Kyle, can you take a look ?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9760) KafkaIO supports consumer group?

2020-05-12 Thread Alexey Romanenko (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105641#comment-17105641
 ] 

Alexey Romanenko commented on BEAM-9760:


Hi [~wongkawah], thanks for the details. I see your point, but I'm not sure that 
we can use the "subscribe()" method in KafkaIO because of some internal Beam 
limitations. We had a similar discussion a while ago ([see 
here|https://issues.apache.org/jira/browse/BEAM-5786?focusedCommentId=16655883=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16655883])

> KafkaIO supports consumer group?
> 
>
> Key: BEAM-9760
> URL: https://issues.apache.org/jira/browse/BEAM-9760
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-kafka
>Reporter: Ka Wah WONG
>Priority: Minor
>
> It seems only the assign() method of the Kafka Consumer class is called in 
> the org.apache.beam.sdk.io.kafka.ConsumerSpEL class. According to the 
> documentation of org.apache.kafka.clients.consumer.KafkaConsumer, manual 
> topic assignment through assign() does not use the consumer's group 
> management functionality.
> May I ask whether KafkaIO will be enhanced to support the consumer's group 
> management by using the Kafka consumer's subscribe() method?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9966:
--
Status: Open  (was: Triage Needed)

> Investigate variance in checkpoint duration of ParDo streaming tests
> 
>
> Key: BEAM-9966
> URL: https://issues.apache.org/jira/browse/BEAM-9966
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> We need to take a closer look at the variance in checkpoint duration which, 
> for different test runs, fluctuates between one second and one minute. 
> https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9966) Investigate variance in checkpoint duration of ParDo streaming tests

2020-05-12 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9966:


 Summary: Investigate variance in checkpoint duration of ParDo 
streaming tests
 Key: BEAM-9966
 URL: https://issues.apache.org/jira/browse/BEAM-9966
 Project: Beam
  Issue Type: Bug
  Components: build-system, runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


We need to take a closer look at the variance in checkpoint duration which, for 
different test runs, fluctuates between one second and one minute. 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105629#comment-17105629
 ] 

Luke Cwik commented on BEAM-6327:
-

I'm not sure we want to add it to the fuser, since the pipeline runner may want 
to choose the order in which it performs transform expansion/replacement.

> Don't attempt to fuse subtransforms of primitive/known transforms.
> --
>
> Key: BEAM-6327
> URL: https://issues.apache.org/jira/browse/BEAM-6327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct
>Reporter: Robert Bradshaw
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently we must remove all sub-components of any known transform that may 
> have an optional substructure, e.g. 
> [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126]
>  (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105627#comment-17105627
 ] 

Luke Cwik commented on BEAM-6327:
-

I think this shouldn't be considered "trimming" but rather native transform 
replacement, and I have crafted https://github.com/apache/beam/pull/11670, which 
repurposes it as such. The version that was in the trimmer was a trivial 
implementation covering many scenarios where the runner doesn't need custom 
replacement/expansion logic.

> Don't attempt to fuse subtransforms of primitive/known transforms.
> --
>
> Key: BEAM-6327
> URL: https://issues.apache.org/jira/browse/BEAM-6327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct
>Reporter: Robert Bradshaw
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently we must remove all sub-components of any known transform that may 
> have an optional substructure, e.g. 
> [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126]
>  (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-6327:
---

Assignee: Kyle Weaver  (was: Luke Cwik)

> Don't attempt to fuse subtransforms of primitive/known transforms.
> --
>
> Key: BEAM-6327
> URL: https://issues.apache.org/jira/browse/BEAM-6327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct
>Reporter: Robert Bradshaw
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently we must remove all sub-components of any known transform that may 
> have an optional substructure, e.g. 
> [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126]
>  (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-6327) Don't attempt to fuse subtransforms of primitive/known transforms.

2020-05-12 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-6327:
---

Assignee: Luke Cwik  (was: Kyle Weaver)

> Don't attempt to fuse subtransforms of primitive/known transforms.
> --
>
> Key: BEAM-6327
> URL: https://issues.apache.org/jira/browse/BEAM-6327
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-direct
>Reporter: Robert Bradshaw
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently we must remove all sub-components of any known transform that may 
> have an optional substructure, e.g. 
> [https://github.com/apache/beam/blob/release-2.9.0/sdks/python/apache_beam/runners/portability/portable_runner.py#L126]
>  (for GBK) and [https://github.com/apache/beam/pull/7360] (Reshuffle).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9900) Remove the need for shutdownSourcesOnFinalWatermark flag

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105620#comment-17105620
 ] 

Maximilian Michels commented on BEAM-9900:
--

For context: 
https://github.com/apache/beam/pull/11558/commits/d106f263e625e5d7c4f3848f16da301871f65142

> Remove the need for shutdownSourcesOnFinalWatermark flag
> 
>
> Key: BEAM-9900
> URL: https://issues.apache.org/jira/browse/BEAM-9900
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>
> The {{shutdownSourcesOnFinalWatermark}} flag has caused some confusion in the 
> past. It is generally used for testing pipelines to ensure that the pipeline 
> and the testing cluster shut down at the end of the job. Without it, the 
> pipeline will run forever in streaming mode, regardless of whether the input 
> is finite.
> We didn't want to enable the flag by default because shutting down any 
> operators, including sources, prevents checkpointing from working in Flink. 
> With side inputs, for example, operators may finish early even in 
> long-running pipelines. However, we can detect whether checkpointing is 
> enabled and set the flag automatically.
> The only situation where we may want the flag to be disabled is when users do 
> not have checkpointing enabled but want to take a savepoint. This should be 
> rare and users can mitigate by setting the flag to false to prevent operators 
> from shutting down.
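The proposed defaulting behavior can be sketched as follows (hypothetical names, not the Flink runner's actual code): an explicit user setting always wins, and otherwise sources shut down exactly when checkpointing is enabled.

```python
# Hypothetical sketch of the proposed defaulting logic.
def effective_shutdown_sources(user_flag, checkpointing_enabled):
    if user_flag is not None:
        return user_flag            # explicit user choice wins
    return checkpointing_enabled    # auto-enable only when checkpointing is on

print(effective_shutdown_sources(None, True))   # auto-enabled
print(effective_shutdown_sources(False, True))  # user opted out (savepoint case)
```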



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9965) Explain direct runner running modes

2020-05-12 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9965:
-

 Summary: Explain direct runner running modes
 Key: BEAM-9965
 URL: https://issues.apache.org/jira/browse/BEAM-9965
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Kyle Weaver


We should add guidance on which direct_running_mode a user should choose.

https://beam.apache.org/documentation/runners/direct/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs

2020-05-12 Thread Robert Burke (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke updated BEAM-9959:
---
Description: 
The Go SDK uses a Scope object to manage beam Composites.

A bug was discovered when consuming a PCollection in both the composite that 
created it, and in a separate composite.

Further, the Go SDK should verify that the root hypergraph structure is a DAG 
and provide a reasonable error. In particular, the leaf nodes of the graph 
could form a DAG, but due to how the beam.Scope object is used, the hypergraph 
itself might not be a DAG.

E.g., it's possible to write the following in the Go SDK.

 PTransforms A, B, C; PCollections colA, colB; and Composites a, b.
A and C are in a, and B is in b.
A generates colA.
B consumes colA and generates colB.
C consumes colA and colB.

```
a := s.Scope("a")
b := s.Scope("b")
colA := beam.Impulse(a)                                   // in composite a
colB := beam.ParDo(b, dofnB, colA)                        // in composite b; the DoFn was elided in the original, dofnB is a placeholder
beam.ParDo0(a, dofnC, colA, beam.SideInput{Input: colB})  // back in composite a; dofnC is a placeholder
```

If it doesn't already, the Go SDK must emit a clear error, and fail pipeline 
construction.

If the affected composites are roots in the graph, the cycle prevents being 
able to topologically sort the root ptransforms for the pipeline graph, which 
can adversely affect runners.

The recommendation is always to wrap uses of scope in functions or other scopes 
to prevent such incorrect constructions.




  was:
The Go SDK uses a Scope object to manage beam Composites.

A bug was discovered when consuming a PCollection in both the composite that 
created it, and in a separate composite.

Further, the Go SDK should verify that the root hypergraph structure is a DAG 
and provides a reasonable error.  In particular, the leaf nodes of the graph 
could form a DAG, but due to how the beam.Scope object is used, might cause the 
hypergraph to not be a DAG.

Eg. It's possible to write the following in the Go SDK.

 PTransforms A, B, C and PCollections colA, colB, and Composites a, b.
A and C are in a, and B are in b.
A generates colA
B consumes colA, and generates colB.
C consumes colB.

```
a := s.Scope(a)
b := s.Scope(b)
colA := beam.Impulse(*a*)
colB := beam.ParDo(*b*, , colA)
beam.ParDo0(*a*, , colA)
```

If it doesn't already the Go SDK must emit a clear error, and fail pipeline 
construction.

If the affected composites are roots in the graph, the cycle prevents being 
able to topologically sort the root ptransforms for the pipeline graph, which 
can adversely affect runners.

The recommendation is always to wrap uses of scope in functions or other scopes 
to prevent such incorrect constructions.





> Mistakes Computing Composite Inputs and Outputs
> ---
>
> Key: BEAM-9959
> URL: https://issues.apache.org/jira/browse/BEAM-9959
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>
> The Go SDK uses a Scope object to manage beam Composites.
> A bug was discovered when consuming a PCollection in both the composite that 
> created it, and in a separate composite.
> Further, the Go SDK should verify that the root hypergraph structure is a DAG 
> and provides a reasonable error.  In particular, the leaf nodes of the graph 
> could form a DAG, but due to how the beam.Scope object is used, might cause 
> the hypergraph to not be a DAG.
> E.g., it's possible to write the following in the Go SDK.
>  PTransforms A, B, C; PCollections colA, colB; and Composites a, b.
> A and C are in a, and B is in b.
> A generates colA.
> B consumes colA and generates colB.
> C consumes colA and colB.
> ```
> a := s.Scope("a")
> b := s.Scope("b")
> colA := beam.Impulse(a)
> colB := beam.ParDo(b, dofnB, colA)                        // dofnB: placeholder for the DoFn elided in the original
> beam.ParDo0(a, dofnC, colA, beam.SideInput{Input: colB})  // dofnC: placeholder
> ```
> If it doesn't already, the Go SDK must emit a clear error, and fail pipeline 
> construction.
> If the affected composites are roots in the graph, the cycle prevents being 
> able to topologically sort the root ptransforms for the pipeline graph, which 
> can adversely affect runners.
> The recommendation is always to wrap uses of scope in functions or other 
> scopes to prevent such incorrect constructions.
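The cycle between composites a and b in the example above (colA flows a to b, while the colB side input flows b back to a) can be checked with a generic cycle-detection pass; this is a hypothetical helper, not part of the Go SDK:

```python
# Hypothetical sketch: detect whether the composite-level hypergraph has a
# cycle. An edge x -> y means composite x produces a PCollection consumed by y.
def has_cycle(edges):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in edges}

    def visit(n):
        color[n] = GRAY
        for m in edges.get(n, ()):
            if color.get(m, WHITE) == GRAY:
                return True  # back edge found: the graph is not a DAG
            if color.get(m, WHITE) == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(edges))

# The example above: a feeds b (colA), and b feeds a (colB side input).
print(has_cycle({"a": ["b"], "b": ["a"]}))  # True
```

A runner that topologically sorts root transforms would hit exactly this condition, which is why construction should fail with a clear error instead.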



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-12 Thread Omar Ismail (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omar Ismail updated BEAM-9964:
--
Description: 
Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like to 
make the cache size configurable for streaming as well when --workerCacheMb is 
set.

I've never made changes to the Beam SDK, so I am super excited to work on this! 

 

[[1] 
https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]

  was:
Setting --workerCacheMB seems to affect batch pipelines only. For Streaming, 
the cache seems to be hardcoded to 100Mb [1]. If possible, I would like to make 
it allowable to change the cache value in Streaming when setting 
--workerCacheMB.

I've never made changes to the Beam SDK, so I am super excited to work on this! 

 

[[1] 
https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]


> Setting workerCacheMb to make its way to the WindmillStateCache Constructor
> ---
>
> Key: BEAM-9964
> URL: https://issues.apache.org/jira/browse/BEAM-9964
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Omar Ismail
>Priority: Minor
>
> Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
> the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like 
> to make the cache size configurable for streaming as well when 
> --workerCacheMb is set.
> I've never made changes to the Beam SDK, so I am super excited to work on 
> this! 
>  
> [[1] 
> https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]
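A minimal sketch of the requested change (hypothetical names; the real wiring lives in the Java Dataflow worker's WindmillStateCache construction): read the option and fall back to the current hardcoded default only when it is unset.

```python
# Hypothetical sketch: honor the pipeline option instead of the hardcoded
# 100 MB default when sizing the streaming state cache.
DEFAULT_CACHE_MB = 100

def cache_size_mb(options):
    # `options` is a plain dict standing in for DataflowWorkerHarnessOptions.
    size = options.get("workerCacheMb")
    return size if size else DEFAULT_CACHE_MB

print(cache_size_mb({}))                      # falls back to the default
print(cache_size_mb({"workerCacheMb": 250}))  # user-provided value wins
```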



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9964) Setting workerCacheMb to make its way to the WindmillStateCache Constructor

2020-05-12 Thread Omar Ismail (Jira)
Omar Ismail created BEAM-9964:
-

 Summary: Setting workerCacheMb to make its way to the 
WindmillStateCache Constructor
 Key: BEAM-9964
 URL: https://issues.apache.org/jira/browse/BEAM-9964
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Omar Ismail


Setting --workerCacheMb seems to affect batch pipelines only. For streaming, 
the cache size seems to be hardcoded to 100 MB [1]. If possible, I would like to 
make the cache size configurable for streaming as well when --workerCacheMb is 
set.

I've never made changes to the Beam SDK, so I am super excited to work on this! 

 

[[1] 
https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73|https://github.com/apache/beam/blob/5e659bb80bcbf70795f6806e05a255ee72706d9f/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateCache.java#L73]
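The request above is essentially to turn a hard-coded cache limit into a constructor parameter. As an illustration only (a Python sketch, not Beam's actual Java WindmillStateCache), a weight-bounded LRU cache whose maximum size is injected rather than hard-coded might look like:

```python
from collections import OrderedDict

class WeightBoundedCache:
    """LRU cache bounded by total entry weight (e.g. bytes).

    Conceptual sketch of a workerCacheMb-style limit: the maximum
    weight is a constructor parameter instead of a hard-coded constant.
    """

    def __init__(self, max_weight_bytes):
        self.max_weight = max_weight_bytes
        self.weight = 0
        self._entries = OrderedDict()  # key -> (value, weight)

    def put(self, key, value, weight):
        if key in self._entries:
            self.weight -= self._entries.pop(key)[1]
        self._entries[key] = (value, weight)
        self.weight += weight
        while self.weight > self.max_weight:
            _, (_, w) = self._entries.popitem(last=False)  # evict LRU entry
            self.weight -= w

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][0]
```

With the limit injected like this, a --workerCacheMB-style option only needs to be threaded through to the constructor.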





[jira] [Commented] (BEAM-9945) Use consistent element count for progress counter.

2020-05-12 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105512#comment-17105512
 ] 

Luke Cwik commented on BEAM-9945:
-

PR 11652 needs to be cherry-picked into the 2.21 release branch.

> Use consistent element count for progress counter.
> --
>
> Key: BEAM-9945
> URL: https://issues.apache.org/jira/browse/BEAM-9945
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model, sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This influences how the SDK communicates progress and how splitting is 
> performed.
> We currently inspect the element count of the output (which has inconsistent 
> definitions across SDKs, see BEAM-9934). Instead, we can move to using an 
> explicit, separate metric. 
> This can currently lead to incorrect progress reporting and in some cases 
> even a crash in the UW depending on the SDK, and makes it more difficult (but 
> not impossible) to fix element counts in the future. 





[jira] [Assigned] (BEAM-8133) Create metrics publisher in Java SDK

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski reassigned BEAM-8133:
--

Assignee: Pawel Pasterz

> Create metrics publisher in Java SDK
> 
>
> Key: BEAM-8133
> URL: https://issues.apache.org/jira/browse/BEAM-8133
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Pawel Pasterz
>Priority: Major
>






[jira] [Closed] (BEAM-8132) Create metrics publisher in Python SDK

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski closed BEAM-8132.
--

> Create metrics publisher in Python SDK
> --
>
> Key: BEAM-8132
> URL: https://issues.apache.org/jira/browse/BEAM-8132
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (BEAM-8132) Create metrics publisher in Python SDK

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski resolved BEAM-8132.

Fix Version/s: Not applicable
   Resolution: Fixed

> Create metrics publisher in Python SDK
> --
>
> Key: BEAM-8132
> URL: https://issues.apache.org/jira/browse/BEAM-8132
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Closed] (BEAM-8133) Create metrics publisher in Java SDK

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski closed BEAM-8133.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Create metrics publisher in Java SDK
> 
>
> Key: BEAM-8133
> URL: https://issues.apache.org/jira/browse/BEAM-8133
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Pawel Pasterz
>Priority: Major
> Fix For: Not applicable
>
>






[jira] [Closed] (BEAM-8131) Provide Kubernetes setup with Prometheus

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski closed BEAM-8131.
--

> Provide Kubernetes setup with Prometheus
> 
>
> Key: BEAM-8131
> URL: https://issues.apache.org/jira/browse/BEAM-8131
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-8742:
-
Description: 
So far, the ParDo load test is not stateful. We should add a basic counter to 
test the stateful processing.

The test should work in streaming mode and with checkpointing.

  was:So far, the ParDo load test is not stateful. We should add a basic 
counter to test the stateful processing.


> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.
> The test should work in streaming mode and with checkpointing.





[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105339#comment-17105339
 ] 

Maximilian Michels commented on BEAM-8742:
--

Cron: https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/
PR: 
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/

> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.
> The test should work in streaming mode and with checkpointing.





[jira] [Resolved] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9963.
--
Fix Version/s: Not applicable
   Resolution: Fixed

> Flink streaming load tests fail with TypeError
> --
>
> Key: BEAM-9963
> URL: https://issues.apache.org/jira/browse/BEAM-9963
> Project: Beam
>  Issue Type: Test
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: Not applicable
>
>
> The newly added load tests now fail on the latest master.
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console
> {noformat}
> 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state 
> changed to FAILED
> 16:00:38 Traceback (most recent call last):
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in 
> _run_module_as_main
> 16:00:38 "__main__", mod_spec)
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
> 16:00:38 exec(code, run_globals)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 225, in 
> 16:00:38 ParDoTest().run()
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
>  line 115, in run
> 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 583, in wait_until_finish
> 16:00:38 raise self._runtime_exception
> 16:00:38 RuntimeError: Pipeline 
> load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
>  failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
> harness for instruction 2: Traceback (most recent call last):
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 245, in _execute
> 16:00:38 response = task()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 302, in 
> 16:00:38 lambda: self.create_worker().do_instruction(request), request)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 471, in do_instruction
> 16:00:38 getattr(request, request_type), request.instruction_id)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 506, in process_bundle
> 16:00:38 bundle_processor.process_bundle(instruction_id))
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 918, in process_bundle
> 16:00:38 op.finish()
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 709, in commit
> 16:00:38 state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 414, in commit
> 16:00:38 self._underlying_bag_state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 481, in commit
> 16:00:38 is_cached=True)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 951, in extend
> 16:00:38 coder.encode_to_stream(element, out, True)
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38 TypeError: an integer is required
> 16:00:38 
> {noformat}
> Cron fails as well: 
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console
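The `TypeError: an integer is required` in the traceback above is the varint coder rejecting a non-int element. A minimal varint encoder (an illustrative sketch, not Beam's actual `VarIntCoderImpl`) reproduces the failure mode:

```python
def encode_varint(value, out):
    # Encode a non-negative int as a base-128 varint: 7 bits per byte,
    # high bit set on every byte except the last.
    if not isinstance(value, int):
        # Same class of error the coder raises for a non-int element.
        raise TypeError("an integer is required")
    while True:
        bits = value & 0x7F
        value >>= 7
        if value:
            out.append(bits | 0x80)
        else:
            out.append(bits)
            return
```

Feeding a string element where the coder expects a count is enough to trigger the error, which is why a type mismatch in the state values surfaces only at encode time.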





[jira] [Updated] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-9963:
-
Description: 
The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in 
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 115, in run
16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
 line 583, in wait_until_finish
16:00:38 raise self._runtime_exception
16:00:38 RuntimeError: Pipeline 
load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
 failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
harness for instruction 2: Traceback (most recent call last):
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 245, in _execute
16:00:38 response = task()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 302, in 
16:00:38 lambda: self.create_worker().do_instruction(request), request)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 471, in do_instruction
16:00:38 getattr(request, request_type), request.instruction_id)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 506, in process_bundle
16:00:38 bundle_processor.process_bundle(instruction_id))
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 918, in process_bundle
16:00:38 op.finish()
16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 709, in commit
16:00:38 state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 414, in commit
16:00:38 self._underlying_bag_state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 481, in commit
16:00:38 is_cached=True)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 951, in extend
16:00:38 coder.encode_to_stream(element, out, True)
16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38 TypeError: an integer is required
16:00:38 
{noformat}

Cron fails as well: 
https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming/5/console

  was:
The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in 
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 

[jira] [Created] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Maximilian Michels (Jira)
Maximilian Michels created BEAM-9963:


 Summary: Flink streaming load tests fail with TypeError
 Key: BEAM-9963
 URL: https://issues.apache.org/jira/browse/BEAM-9963
 Project: Beam
  Issue Type: Test
  Components: build-system, runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


The newly added load tests now fail on the latest master.

https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console

{noformat}
16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state changed 
to FAILED
16:00:38 Traceback (most recent call last):
16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
16:00:38 "__main__", mod_spec)
16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
16:00:38 exec(code, run_globals)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
 line 225, in 
16:00:38 ParDoTest().run()
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
 line 115, in run
16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
16:00:38   File 
"/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
 line 583, in wait_until_finish
16:00:38 raise self._runtime_exception
16:00:38 RuntimeError: Pipeline 
load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
 failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
harness for instruction 2: Traceback (most recent call last):
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 245, in _execute
16:00:38 response = task()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 302, in 
16:00:38 lambda: self.create_worker().do_instruction(request), request)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 471, in do_instruction
16:00:38 getattr(request, request_type), request.instruction_id)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 506, in process_bundle
16:00:38 bundle_processor.process_bundle(instruction_id))
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 918, in process_bundle
16:00:38 op.finish()
16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
apache_beam.runners.worker.operations.DoOperation.finish
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 709, in commit
16:00:38 state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 414, in commit
16:00:38 self._underlying_bag_state.commit()
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
 line 481, in commit
16:00:38 is_cached=True)
16:00:38   File 
"/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
 line 951, in extend
16:00:38 coder.encode_to_stream(element, out, True)
16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
16:00:38 TypeError: an integer is required
16:00:38 
{noformat}





[jira] [Updated] (BEAM-9963) Flink streaming load tests fail with TypeError

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9963:
--
Status: Open  (was: Triage Needed)

> Flink streaming load tests fail with TypeError
> --
>
> Key: BEAM-9963
> URL: https://issues.apache.org/jira/browse/BEAM-9963
> Project: Beam
>  Issue Type: Test
>  Components: build-system, runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>
> The newly added load tests now fail on the latest master.
> https://builds.apache.org/job/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/39/console
> {noformat}
> 16:00:38 INFO:apache_beam.runners.portability.portable_runner:Job state 
> changed to FAILED
> 16:00:38 Traceback (most recent call last):
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 193, in 
> _run_module_as_main
> 16:00:38 "__main__", mod_spec)
> 16:00:38   File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
> 16:00:38 exec(code, run_globals)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/pardo_test.py",
>  line 225, in 
> 16:00:38 ParDoTest().run()
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/testing/load_tests/load_test.py",
>  line 115, in run
> 16:00:38 self.result.wait_until_finish(duration=self.timeout_ms)
> 16:00:38   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_ParDo_Flink_Streaming_PR/src/sdks/python/apache_beam/runners/portability/portable_runner.py",
>  line 583, in wait_until_finish
> 16:00:38 raise self._runtime_exception
> 16:00:38 RuntimeError: Pipeline 
> load-tests-python-flink-streaming-pardo-5-0508133349_3afe3b8e-23f7-474b-af08-4ee7feb3adfa
>  failed in state FAILED: java.lang.RuntimeException: Error received from SDK 
> harness for instruction 2: Traceback (most recent call last):
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 245, in _execute
> 16:00:38 response = task()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 302, in 
> 16:00:38 lambda: self.create_worker().do_instruction(request), request)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 471, in do_instruction
> 16:00:38 getattr(request, request_type), request.instruction_id)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 506, in process_bundle
> 16:00:38 bundle_processor.process_bundle(instruction_id))
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 918, in process_bundle
> 16:00:38 op.finish()
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 697, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 699, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File "apache_beam/runners/worker/operations.py", line 702, in 
> apache_beam.runners.worker.operations.DoOperation.finish
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 709, in commit
> 16:00:38 state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 414, in commit
> 16:00:38 self._underlying_bag_state.commit()
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 481, in commit
> 16:00:38 is_cached=True)
> 16:00:38   File 
> "/usr/local/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 951, in extend
> 16:00:38 coder.encode_to_stream(element, out, True)
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 690, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38   File "apache_beam/coders/coder_impl.py", line 692, in 
> apache_beam.coders.coder_impl.VarIntCoderImpl.encode_to_stream
> 16:00:38 TypeError: an integer is required
> 16:00:38 
> {noformat}





[jira] [Resolved] (BEAM-9947) Timer coder contains a faulty key coder leading to a corrupted encoding

2020-05-12 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-9947.
--
Resolution: Fixed

> Timer coder contains a faulty key coder leading to a corrupted encoding
> ---
>
> Key: BEAM-9947
> URL: https://issues.apache.org/jira/browse/BEAM-9947
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, sdk-py-core
>Affects Versions: 2.21.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Critical
> Fix For: 2.21.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The coder for timers contains a key coder which may have to be 
> length-prefixed in case of a non-standard coders. We noticed that this was 
> not reflected in the {{ProcessBundleDescriptor}} leading to errors like this 
> one for non-standard coders, e.g. Python's {{FastPrimitivesCoder}}:
> {noformat}
> Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException: 
> reached end of stream after reading 36 bytes; 68 bytes expected
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104)
>   at 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:194)
>   at 
> org.apache.beam.runners.core.construction.Timer$Coder.decode(Timer.java:157)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:81)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.accept(BeamFnDataInboundObserver.java:31)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:180)
>   at 
> org.apache.beam.sdk.fn.data.BeamFnDataGrpcMultiplexer$InboundObserver.onNext(BeamFnDataGrpcMultiplexer.java:127)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:251)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:309)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:292)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:782)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.apache.beam.vendor.grpc.v1p26p0.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   ... 1 more
> {noformat}
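The EOFException above is a writer/reader disagreement over whether the key coder's output is length-prefixed. A simplified sketch (hypothetical helpers, not Beam's actual Timer coder or wire format) of why the prefix matters:

```python
import io

def write_varint(out, n):
    # Varint length prefix (assumes n >= 0): 7 bits per byte,
    # high bit marks a continuation byte.
    while True:
        bits = n & 0x7F
        n >>= 7
        if n:
            out.write(bytes([bits | 0x80]))
        else:
            out.write(bytes([bits]))
            return

def read_varint(inp):
    shift = result = 0
    while True:
        b = inp.read(1)[0]
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result
        shift += 7

def encode_key(key: bytes, length_prefixed: bool) -> bytes:
    # If the writer omits the prefix the reader expects (or vice versa),
    # the reader consumes the wrong number of bytes and everything
    # after the key is decoded from a misaligned position.
    out = io.BytesIO()
    if length_prefixed:
        write_varint(out, len(key))
    out.write(key)
    return out.getvalue()
```

With raw (unprefixed) bytes, a prefix-expecting reader treats the first payload byte as a length and attempts a short read, which is exactly the "reached end of stream after reading N bytes; M bytes expected" shape of the error above.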





[jira] [Updated] (BEAM-9917) BigQueryBatchFileLoads dynamic destination

2020-05-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tord Sætren updated BEAM-9917:
--
Priority: Minor  (was: Major)

> BigQueryBatchFileLoads dynamic destination
> --
>
> Key: BEAM-9917
> URL: https://issues.apache.org/jira/browse/BEAM-9917
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.17.0
>Reporter: Tord Sætren
>Priority: Minor
>  Labels: beginner, documentation
>
> I am trying to use BigQueryBatchFileLoads to upload data from pubsub. It 
> works fine for a single table, but when I try to use a dynamic destination 
> such as
> destination=lambda elem: "my_project:my_dataset." + elem["sensor_key"],
> it just creates a new table each time the triggering_frequency fires. I know 
> it creates temporary tables before loading them all into the destination, but 
> it never loads them; it just creates more and more tables.





[jira] [Resolved] (BEAM-6710) Add Landing page to community metrics dashboard

2020-05-12 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski resolved BEAM-6710.

Fix Version/s: Not applicable
   Resolution: Fixed

> Add Landing page to  community metrics dashboard
> 
>
> Key: BEAM-6710
> URL: https://issues.apache.org/jira/browse/BEAM-6710
> Project: Beam
>  Issue Type: New Feature
>  Components: community-metrics, project-management
>Reporter: Mikhail Gryzykhin
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The community metrics dashboard sends users to a list of recently opened 
> dashboards, which is empty. This confuses new users.
> We want to add a landing page with links to the relevant dashboards.
> Link: http://104.154.241.245/
>  





[jira] [Commented] (BEAM-8742) Add stateful processing to ParDo load test

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105273#comment-17105273
 ] 

Maximilian Michels commented on BEAM-8742:
--

I just saw your comment here. Thank you. I added streaming tests in 
https://github.com/apache/beam/pull/11558. Have a look if you want.

I've created a dashboard here: 
https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056

> Add stateful processing to ParDo load test
> --
>
> Key: BEAM-8742
> URL: https://issues.apache.org/jira/browse/BEAM-8742
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.22.0
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> So far, the ParDo load test is not stateful. We should add a basic counter to 
> test the stateful processing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9962) CI/CD for Beam

2020-05-12 Thread Ankur Goenka (Jira)
Ankur Goenka created BEAM-9962:
--

 Summary: CI/CD for Beam
 Key: BEAM-9962
 URL: https://issues.apache.org/jira/browse/BEAM-9962
 Project: Beam
  Issue Type: Bug
  Components: sdk-ideas
Reporter: Ankur Goenka


Uber task for CI/CD.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7438) Distribution and Gauge metrics are not being exported to Flink dashboard neither Prometheus IO

2020-05-12 Thread Akshay Iyangar (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105192#comment-17105192
 ] 

Akshay Iyangar commented on BEAM-7438:
--

[~mxm] - Facing the same issue will have a PR out for this soon.

> Distribution and Gauge metrics are not being exported to Flink dashboard 
> neither Prometheus IO
> --
>
> Key: BEAM-7438
> URL: https://issues.apache.org/jira/browse/BEAM-7438
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0, 2.13.0
>Reporter: Ricardo Bordon
>Assignee: Akshay Iyangar
>Priority: Major
> Attachments: image-2019-05-29-11-24-36-911.png, 
> image-2019-05-29-11-26-49-685.png
>
>
> Distribution and gauge metrics are not visible in the Flink dashboard nor in 
> Prometheus.
> I was able to debug the runner code and see that these metrics are being 
> updated via *FlinkMetricContainer#updateDistributions()* and 
> *FlinkMetricContainer#updateGauges()* (meaning they are treated as "attempted 
> metrics"), but they are not visible in the Flink Dashboard or Prometheus. On 
> the other hand, *Counter* metrics work as expected.
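For reference, a minimal sketch of the two metric types involved (assumed semantics only, not Beam's actual FlinkMetricContainer code): a distribution accumulates count/sum/min/max over all reported values, while a gauge keeps only the most recently set value.

```python
from dataclasses import dataclass


@dataclass
class DistributionData:
    # Running aggregate for a distribution metric: count, sum, min, max.
    count: int = 0
    sum: int = 0
    min: int = 2**63 - 1
    max: int = -(2**63)

    def update(self, value: int) -> None:
        self.count += 1
        self.sum += value
        self.min = min(self.min, value)
        self.max = max(self.max, value)


@dataclass
class GaugeData:
    # A gauge only retains the latest reported value.
    value: int = 0

    def set(self, value: int) -> None:
        self.value = value


d = DistributionData()
for v in (3, 7, 5):
    d.update(v)

g = GaugeData()
g.set(42)
```

These aggregates are what the dashboard would be expected to surface once the export path handles them.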



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9164) [PreCommit_Java] [Flake] UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest

2020-05-12 Thread Maximilian Michels (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105184#comment-17105184
 ] 

Maximilian Michels commented on BEAM-9164:
--

I was not aware that these tests had been disabled. I'm a bit shocked that we 
disabled them without notifying the relevant folks or at least posting in the 
JIRA.

I changed the logic around watermark propagation in BEAM-9900, which may have 
removed the flakiness. I'll re-enable the tests.

> [PreCommit_Java] [Flake] 
> UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
> ---
>
> Key: BEAM-9164
> URL: https://issues.apache.org/jira/browse/BEAM-9164
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Kirill Kozlov
>Priority: Critical
>  Labels: flake
>
> Test:
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapperTest$ParameterizedUnboundedSourceWrapperTest
>  >> testWatermarkEmission[numTasks = 1; numSplits=1]
> Fails with the following exception:
> {code:java}
> org.junit.runners.model.TestTimedOutException: test timed out after 3 
> milliseconds{code}
> Affected Jenkins job: 
> [https://builds.apache.org/job/beam_PreCommit_Java_Phrase/1665/]
> Gradle build scan: [https://scans.gradle.com/s/nvgeb425fe63q]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-7438) Distribution and Gauge metrics are not being exported to Flink dashboard neither Prometheus IO

2020-05-12 Thread Akshay Iyangar (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay Iyangar reassigned BEAM-7438:


Assignee: Akshay Iyangar

> Distribution and Gauge metrics are not being exported to Flink dashboard 
> neither Prometheus IO
> --
>
> Key: BEAM-7438
> URL: https://issues.apache.org/jira/browse/BEAM-7438
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.12.0, 2.13.0
>Reporter: Ricardo Bordon
>Assignee: Akshay Iyangar
>Priority: Major
> Attachments: image-2019-05-29-11-24-36-911.png, 
> image-2019-05-29-11-26-49-685.png
>
>
> Distribution and gauge metrics are not visible in the Flink dashboard nor in 
> Prometheus.
> I was able to debug the runner code and see that these metrics are being 
> updated via *FlinkMetricContainer#updateDistributions()* and 
> *FlinkMetricContainer#updateGauges()* (meaning they are treated as "attempted 
> metrics"), but they are not visible in the Flink Dashboard or Prometheus. On 
> the other hand, *Counter* metrics work as expected.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9961) Python MongoDBIO does not apply projection

2020-05-12 Thread Corvin Deboeser (Jira)
Corvin Deboeser created BEAM-9961:
-

 Summary: Python MongoDBIO does not apply projection
 Key: BEAM-9961
 URL: https://issues.apache.org/jira/browse/BEAM-9961
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Affects Versions: 2.20.0
Reporter: Corvin Deboeser


ReadFromMongoDB does not apply the provided projection when reading from the 
client; only the filter is applied, as you can see here:

https://github.com/apache/beam/blob/9f0cb649d39ee6236ea27f111acb4b66591a80ec/sdks/python/apache_beam/io/mongodbio.py#L204
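To illustrate what the missing projection should do, here is a client-side simulation of MongoDB's inclusion-projection semantics (a hypothetical helper for illustration only, not the mongodbio fix): only the listed fields are kept, plus `_id` unless it is explicitly excluded.

```python
def apply_projection(doc: dict, projection: dict) -> dict:
    # Simulate a MongoDB *inclusion* projection client-side.
    # projection maps field name -> 1 (include) or 0 (exclude; 0 is
    # only allowed for _id when mixed with inclusions).
    included = {k for k, v in projection.items() if v == 1}
    out = {k: doc[k] for k in included if k in doc}
    # _id is returned by default unless explicitly excluded.
    if projection.get("_id", 1) == 1 and "_id" in doc:
        out["_id"] = doc["_id"]
    return out


doc = {"_id": 1, "name": "sensor", "reading": 42.0, "raw": b"..."}
slim = apply_projection(doc, {"name": 1, "_id": 0})
```

In pymongo itself the projection is applied server-side by passing it as the second argument to `Collection.find(filter, projection)`, which is what the linked code path currently omits.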



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb

2020-05-12 Thread Corvin Deboeser (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Corvin Deboeser updated BEAM-9960:
--
Description: 
When using MongoDBIO on a large collection, if the source bundle size is 
determined to be 1, the response from the split vector command can be larger 
than 16mb, which is not supported by pymongo / MongoDB:

{{pymongo.errors.ProtocolError: Message length (33699186) is larger than server 
max message size (33554432)}}

 

Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.

  was:
When using MongoDBIO on a large collection and the source bundle size was 
determined to be 1, then the response from the split vector command can be 
larger than 16mb which is not supported by pymongo / MongoDB:

{{pymongo.errors.ProtocolError: Message length (33699186) is larger than server 
max message size (33554432)}}

 

 

 


> Python MongoDBIO fails when response of split vector command is larger than 
> 16mb
> 
>
> Key: BEAM-9960
> URL: https://issues.apache.org/jira/browse/BEAM-9960
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.20.0
>Reporter: Corvin Deboeser
>Priority: Major
>
> When using MongoDBIO on a large collection, if the source bundle size is 
> determined to be 1, the response from the split vector command can be larger 
> than 16mb, which is not supported by pymongo / MongoDB:
> {{pymongo.errors.ProtocolError: Message length (33699186) is larger than 
> server max message size (33554432)}}
>  
> Environment: Was running this on Google Dataflow / Beam Python SDK 2.20.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9960) Python MongoDBIO fails when response of split vector command is larger than 16mb

2020-05-12 Thread Corvin Deboeser (Jira)
Corvin Deboeser created BEAM-9960:
-

 Summary: Python MongoDBIO fails when response of split vector 
command is larger than 16mb
 Key: BEAM-9960
 URL: https://issues.apache.org/jira/browse/BEAM-9960
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Affects Versions: 2.20.0
Reporter: Corvin Deboeser


When using MongoDBIO on a large collection, if the source bundle size is 
determined to be 1, the response from the split vector command can be larger 
than 16mb, which is not supported by pymongo / MongoDB:

{{pymongo.errors.ProtocolError: Message length (33699186) is larger than server 
max message size (33554432)}}
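One possible mitigation pattern (a sketch under the assumption that the split keys can be requested or re-aggregated in batches; this is not MongoDBIO code): group the payload so that no single message exceeds the server's maximum message size.

```python
def chunk_by_size(items, max_message_size):
    # Group byte items into batches whose total payload never
    # exceeds max_message_size (a single oversized item still
    # gets its own batch).
    batches, current, current_size = [], [], 0
    for item in items:
        if current and current_size + len(item) > max_message_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(item)
        current_size += len(item)
    if current:
        batches.append(current)
    return batches


# Ten 10-byte split keys, batched under a 30-byte message limit.
keys = [b"k" * 10 for _ in range(10)]
batches = chunk_by_size(keys, 30)
```

Applied to the split vector response, this keeps each message safely under the 33554432-byte server limit seen in the error above.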

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs

2020-05-12 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-9959:
--
Status: Open  (was: Triage Needed)

> Mistakes Computing Composite Inputs and Outputs
> ---
>
> Key: BEAM-9959
> URL: https://issues.apache.org/jira/browse/BEAM-9959
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>
> The Go SDK uses a Scope object to manage Beam composites.
> A bug was discovered when consuming a PCollection both in the composite that 
> created it and in a separate composite.
> Further, the Go SDK should verify that the root hypergraph structure is a DAG 
> and provide a reasonable error.  In particular, the leaf nodes of the graph 
> could form a DAG, but the way the beam.Scope object is used might cause 
> the hypergraph to not be a DAG.
> E.g. it's possible to write the following in the Go SDK, with PTransforms A, 
> B, C, PCollections colA and colB, and Composites a and b.
> A and C are in a, and B is in b.
> A generates colA.
> B consumes colA and generates colB.
> C consumes colB.
> ```
> a := s.Scope("a")
> b := s.Scope("b")
> colA := beam.Impulse(a)
> colB := beam.ParDo(b, &B{}, colA)
> beam.ParDo0(a, &C{}, colB)
> ```
> If it doesn't already, the Go SDK must emit a clear error and fail pipeline 
> construction.
> If the affected composites are roots in the graph, the cycle prevents 
> topologically sorting the root ptransforms for the pipeline graph, which can 
> adversely affect runners.
> The recommendation is always to wrap uses of scope in functions or other 
> scopes to prevent such incorrect constructions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-9959) Mistakes Computing Composite Inputs and Outputs

2020-05-12 Thread Robert Burke (Jira)
Robert Burke created BEAM-9959:
--

 Summary: Mistakes Computing Composite Inputs and Outputs
 Key: BEAM-9959
 URL: https://issues.apache.org/jira/browse/BEAM-9959
 Project: Beam
  Issue Type: Bug
  Components: sdk-go
Reporter: Robert Burke
Assignee: Robert Burke


The Go SDK uses a Scope object to manage Beam composites.

A bug was discovered when consuming a PCollection both in the composite that 
created it and in a separate composite.

Further, the Go SDK should verify that the root hypergraph structure is a DAG 
and provide a reasonable error.  In particular, the leaf nodes of the graph 
could form a DAG, but the way the beam.Scope object is used might cause the 
hypergraph to not be a DAG.

E.g. it's possible to write the following in the Go SDK, with PTransforms A, B, 
C, PCollections colA and colB, and Composites a and b.
A and C are in a, and B is in b.
A generates colA.
B consumes colA and generates colB.
C consumes colB.

```
a := s.Scope("a")
b := s.Scope("b")
colA := beam.Impulse(a)
colB := beam.ParDo(b, &B{}, colA)
beam.ParDo0(a, &C{}, colB)
```

If it doesn't already, the Go SDK must emit a clear error and fail pipeline 
construction.

If the affected composites are roots in the graph, the cycle prevents 
topologically sorting the root ptransforms for the pipeline graph, which can 
adversely affect runners.

The recommendation is always to wrap uses of scope in functions or other scopes 
to prevent such incorrect constructions.
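The ordering problem can be sketched standalone (plain Python with hypothetical names, not the Go SDK's actual code): Kahn's topological sort over the composite-level edges fails exactly when composites a and b each feed a PCollection into the other.

```python
from collections import defaultdict, deque


def topo_sort(nodes, edges):
    # Kahn's algorithm: return a topological ordering of nodes,
    # or None if the edge set contains a cycle.
    indegree = {n: 0 for n in nodes}
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
        indegree[dst] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in adj[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)
    # Nodes left with nonzero indegree lie on a cycle.
    return order if len(order) == len(nodes) else None


# Composite "a" feeds colA into "b"; "b" feeds colB back into "a".
cyclic = topo_sort(["a", "b"], [("a", "b"), ("b", "a")])
acyclic = topo_sort(["a", "b"], [("a", "b")])
```

A runner that needs the root ptransforms in topological order hits the `None` case for the construction above, which is why the SDK should detect it and fail at pipeline-construction time.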






--
This message was sent by Atlassian Jira
(v8.3.4#803005)