[jira] [Commented] (BEAM-9388) Consider using github actions for building python wheels and more (aka. Transition from Travis)

2020-06-01 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17121328#comment-17121328
 ] 

Ahmet Altay commented on BEAM-9388:
---

Adding more details:

We can add a Beam setup for running GitHub Actions:

- One action takes the Beam repo master branch at a recent commit, builds the 
wheel files, and optionally publishes them to a temporary GCS location.
- It is possible to manually trigger the same job on the release branch. This 
version needs to stage its output to a given GCS location. Signing is out of 
scope.
- The action will produce the same wheel set as 
https://github.com/apache/beam-wheels (e.g. different Python versions on 
linux/mac) and an additional wheel version for Windows.
- The same GitHub action also produces a tarball of the SDK.
- https://github.com/apache/beam-wheels is deprecated (i.e. removed from 
release notes, and the repo is either deleted or has a readme saying not to 
use it).
- Beam docs are updated to explain how to use this GitHub action.

> Consider using github actions for building python wheels and more (aka. 
> Transition from Travis)
> ---
>
> Key: BEAM-9388
> URL: https://issues.apache.org/jira/browse/BEAM-9388
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-py-core
>Reporter: Ahmet Altay
>Priority: P2
>
> Context on the mailing list: 
> https://lists.apache.org/thread.html/r4a7d34e64a34e9fe589d06aec74d9b464d252c516fe96c35b2d6c9ae%40%3Cdev.beam.apache.org%3E
> Consider using github actions instead of travis for building python wheels 
> during releases. This will have the following advantages:
> - We will eliminate one repo. (If you don't know, we have 
> https://github.com/apache/beam-wheels for the sole purpose of building wheel 
> files.)
> - Workflow will be stored in the same repo. This will prevent bit rot that is 
> only discovered at release times. (happened a few times, although usually 
> easy to fix.)
> - github actions supports ubuntu, mac, windows environments. We could try to 
> build wheels for windows as well. (Travis also supports the same environments 
> but we only use linux and mac environments. Maybe there are other blockers 
> for building wheels for Windows.)
> - We could do more, like daily python builds.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-10055) Add --region to 3 of the python examples

2020-05-21 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-10055:
--

Assignee: Ted Romer

> Add --region to 3 of the python examples
> 
>
> Key: BEAM-10055
> URL: https://issues.apache.org/jira/browse/BEAM-10055
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ted Romer
>Assignee: Ted Romer
>Priority: P3
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Proposed fix: 
> https://github.com/tedromer/beam/compare/tedromer:ef811fe...tedromer:1f39865





[jira] [Updated] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-:
--
Labels: beam-fixit  (was: )

> Remove support for EOLed runners (Apex, etc.)
> -
>
> Key: BEAM-
> URL: https://issues.apache.org/jira/browse/BEAM-
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex, runner-core
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>Priority: P2
>  Labels: beam-fixit
>
> These runners look EOLed, not maintained:
> - Apex (last release 2+ years ago)
> - Gearpump (last release 1+ year ago)
> Removing support for these could reduce the code base size, reduce flaky 
> tests, and make it easier to add new features.
> /cc [~kenn][~tysonjh]





[jira] [Commented] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-15 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108766#comment-17108766
 ] 

Ahmet Altay commented on BEAM-:
---

Thank you for the quick comments. Do you think it is worth bringing this to 
the user or dev list? Or is this sufficient information to remove support?

One additional benefit is that some of the ignored tests could also be 
removed.

> Remove support for EOLed runners (Apex, etc.)
> -
>
> Key: BEAM-
> URL: https://issues.apache.org/jira/browse/BEAM-
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex, runner-core
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
>Priority: Major
>
> These runners look EOLed, not maintained:
> - Apex (last release 2+ years ago)
> - Gearpump (last release 1+ year ago)
> Removing support for these could reduce the code base size, reduce flaky 
> tests, and make it easier to add new features.
> /cc [~kenn][~tysonjh]





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-15 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108506#comment-17108506
 ] 

Ahmet Altay commented on BEAM-9975:
---

Thank you Brian. 
- Solving the subclass problem is probably a larger question. If we drop the 
current mechanism of options discovery, we will break users and need to find a 
less fragile way of registering known pipeline options. (/cc [~tvalentyn] and 
[~robertwb] on this one.)
- Encoding valueproviders by calling get and using their value instead should 
be safe. As long as value providers have a value, it is fine to use it in place 
of the valueprovider itself.
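
To illustrate the second point, here is a minimal sketch of swapping value 
providers for their values before the options dict is converted. It only 
relies on the is_accessible()/get() shape of a value provider. 
FakeValueProvider and resolve_value_providers are hypothetical names made up 
for this example, not existing Beam code, and dropping providers that have no 
value yet is an assumption of the sketch.

```python
import json


class FakeValueProvider:
  """Hypothetical stand-in exposing is_accessible() and get()."""
  def __init__(self, value=None, accessible=True):
    self._value = value
    self._accessible = accessible

  def is_accessible(self):
    return self._accessible

  def get(self):
    if not self._accessible:
      raise RuntimeError('value is not available yet')
    return self._value


def resolve_value_providers(options):
  """Returns a copy of options with accessible providers replaced by values.

  Providers that have no value yet are dropped (an assumption of this sketch).
  """
  resolved = {}
  for key, value in options.items():
    if hasattr(value, 'is_accessible') and hasattr(value, 'get'):
      if value.is_accessible():
        resolved[key] = value.get()
    else:
      resolved[key] = value
  return resolved


options = {
    'runner': 'PortableRunner',
    'input': FakeValueProvider('gs://bucket/in.txt'),
    'output': FakeValueProvider(accessible=False),
}
resolved = resolve_value_providers(options)
# Only plain values remain, so a JSON-style conversion no longer trips over
# provider objects.
encoded = json.dumps(resolved, sort_keys=True)
```

Whether unresolved providers should be dropped, kept as None, or reported is 
a separate decision that this sketch does not settle.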

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E       google.protobuf.json_format.ParseError: Unexpected type for Value message.
> {code}





[jira] [Commented] (BEAM-10007) PortableRunner doesn't handle ValueProvider instances when converting pipeline options

2020-05-15 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108503#comment-17108503
 ] 

Ahmet Altay commented on BEAM-10007:


Converting valueproviders to actual values should always be fine. Consumers 
usually expect either a value or a valueprovider. If a valueprovider already 
has a ready value, we can convert it.

> PortableRunner doesn't handle ValueProvider instances when converting 
> pipeline options
> --
>
> Key: BEAM-10007
> URL: https://issues.apache.org/jira/browse/BEAM-10007
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> We attempt to convert ValueProvider instances directly to JSON with 
> json_format, leading to errors like the one described in BEAM-9975.





[jira] [Commented] (BEAM-10006) PipelineOptions can pick up definitions from unrelated tests

2020-05-15 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108499#comment-17108499
 ] 

Ahmet Altay commented on BEAM-10006:


The flakes probably depend on the test order. This is a known implementation 
limitation. Adding namespaces might improve things a bit 
(https://issues.apache.org/jira/browse/BEAM-6531). Alternatively, a new way to 
discover options needs to be implemented (e.g. based on registering options), 
but that would be a bigger change.
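
A small self-contained sketch of the difference between the two discovery 
styles. The Options class below is a hypothetical toy, not Beam's 
PipelineOptions; it only shows why __subclasses__ sees classes created by 
unrelated code while an explicit registry does not.

```python
class Options:
  """Hypothetical base class; not Beam's PipelineOptions."""
  _registry = []

  @classmethod
  def register(cls, subclass):
    """Class decorator that records a subclass explicitly."""
    cls._registry.append(subclass)
    return subclass

  @classmethod
  def discovered(cls):
    """Implicit discovery: every live direct subclass, wherever defined."""
    return {c.__name__ for c in cls.__subclasses__()}

  @classmethod
  def registered(cls):
    """Explicit discovery: only subclasses that opted in."""
    return {c.__name__ for c in cls._registry}


@Options.register
class ProductionOptions(Options):
  pass


def some_unrelated_test():
  # A subclass defined inside one test body...
  class TestOnlyOptions(Options):
    pass
  return TestOnlyOptions


# ...is still visible afterwards. Holding a reference keeps the class alive
# here, which makes the effect deterministic for this demo.
leaked = some_unrelated_test()

implicit = Options.discovered()
explicit = Options.registered()
```

Here implicit contains both class names, while explicit contains only 
ProductionOptions. The cost of the registry is that every existing options 
class would have to opt in, which is the "bigger change" mentioned above.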

> PipelineOptions can pick up definitions from unrelated tests
> 
>
> Key: BEAM-10006
> URL: https://issues.apache.org/jira/browse/BEAM-10006
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Since PipelineOptions uses {{\_\_subclasses\_\_}} to look for all 
> definitions, when used in tests it can sometimes pick up sub-classes that 
> were created in previously executed tests.
> See BEAM-9975 for more details.





[jira] [Commented] (BEAM-10003) Need two PR to submit snippets to website

2020-05-15 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108469#comment-17108469
 ] 

Ahmet Altay commented on BEAM-10003:


Nam, could you change the script to generate snippets from the branch (PR 
branch or current branch) instead of what is in master?
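
As a rough illustration of that suggestion, the sketch below copies a snippet 
from a local checkout instead of downloading it, using the same flattened 
file name as the script quoted in the issue description. It is written in 
Python only for illustration (the real script is a shell script). The 
function names and the '/apache/beam/master/' prefix are assumptions about 
the shape of the $url values, not taken from build_github_samples.sh.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_name(url_path):
  """Mirrors the script's sed substitution: every '/' becomes '_'."""
  return url_path.replace('/', '_')


def copy_snippet(url_path, repo_root, dist_dir, prefix='/apache/beam/master/'):
  """Copies a snippet from the local checkout instead of downloading it.

  The default prefix is an assumption about how snippet paths are written.
  """
  if not url_path.startswith(prefix):
    raise ValueError('unexpected snippet path: %s' % url_path)
  source = Path(repo_root) / url_path[len(prefix):]
  target = Path(dist_dir) / flatten_name(url_path)
  target.parent.mkdir(parents=True, exist_ok=True)
  shutil.copyfile(source, target)
  return target


# Demo against a throwaway directory standing in for a Beam checkout.
with tempfile.TemporaryDirectory() as tmp:
  repo = Path(tmp) / 'beam'
  snippet = repo / 'sdks' / 'python' / 'example.py'
  snippet.parent.mkdir(parents=True)
  snippet.write_text('print("hello")\n')

  copied = copy_snippet(
      '/apache/beam/master/sdks/python/example.py', repo, Path(tmp) / 'dist')
  copied_name = copied.name
  copied_text = copied.read_text()
```

Reading from the working tree means a snippet change and the page that uses 
it can be tested together in a single PR.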

> Need two PR to submit snippets to website
> -
>
> Key: BEAM-10003
> URL: https://issues.apache.org/jira/browse/BEAM-10003
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Reza ardeshir rokni
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Minor
>
> Looks like build_github_samples.sh uses code already on the repo to build 
> local serving;
> do
>   fileName=$(echo "$url" | sed -e 's/\//_/g')
>   curl -o "$DIST_DIR"/"$fileName" "https://raw.githubusercontent.com$url";
> done
> So when trying to test locally, the code needs to have already been in Beam. 
> Ideally the script should make use of local code when building, so that:
> 1- It is easier to build & test changes.
> 2- There is no need to raise two PRs for what is a single change.
>  





[jira] [Assigned] (BEAM-10003) Need two PR to submit snippets to website

2020-05-15 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-10003:
--

Assignee: Aizhamal Nurmamat kyzy

> Need two PR to submit snippets to website
> -
>
> Key: BEAM-10003
> URL: https://issues.apache.org/jira/browse/BEAM-10003
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Reza ardeshir rokni
>Assignee: Aizhamal Nurmamat kyzy
>Priority: Minor
>
> Looks like build_github_samples.sh uses code already on the repo to build 
> local serving;
> do
>   fileName=$(echo "$url" | sed -e 's/\//_/g')
>   curl -o "$DIST_DIR"/"$fileName" "https://raw.githubusercontent.com$url";
> done
> So when trying to test locally, the code needs to have already been in Beam. 
> Ideally the script should make use of local code when building, so that:
> 1- It is easier to build & test changes.
> 2- There is no need to raise two PRs for what is a single change.
>  





[jira] [Created] (BEAM-9999) Remove support for EOLed runners (Apex, etc.)

2020-05-14 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-:
-

 Summary: Remove support for EOLed runners (Apex, etc.)
 Key: BEAM-
 URL: https://issues.apache.org/jira/browse/BEAM-
 Project: Beam
  Issue Type: Bug
  Components: runner-apex, runner-core
Reporter: Ahmet Altay


These runners look EOLed, not maintained:
- Apex (last release 2+ years ago)
- Gearpump (last release 1+ year ago)

Removing support for these could reduce the code base size, reduce flaky 
tests, and make it easier to add new features.

/cc [~kenn][~tysonjh]





[jira] [Comment Edited] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894
 ] 

Ahmet Altay edited comment on BEAM-9975 at 5/13/20, 2:26 AM:
-

Could we add a log error before the raise and log the value instead of catching 
the error?

I agree with Kyle, unless we add more logging, it will be hard to find the root 
cause.
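
A minimal sketch of the kind of logging meant here, assuming it runs on the 
options dict right before the conversion. log_unexpected_values is a 
hypothetical helper, not an existing Beam or protobuf function. It does not 
catch anything, so the conversion would still raise as before. Values nested 
inside lists are not inspected in this sketch.

```python
import logging

_LOGGER = logging.getLogger(__name__)

# Python types that map onto a JSON-style value.
_JSON_TYPES = (dict, list, str, bool, int, float, type(None))


def log_unexpected_values(dict_obj, path=''):
  """Logs each entry with a non-JSON type and returns the offending paths."""
  bad_paths = []
  for key, value in dict_obj.items():
    full_path = '%s.%s' % (path, key) if path else str(key)
    if isinstance(value, dict):
      # Recurse so that nested option dicts are reported with a dotted path.
      bad_paths.extend(log_unexpected_values(value, full_path))
    elif not isinstance(value, _JSON_TYPES):
      _LOGGER.error(
          'Unexpected option %s=%r of type %s',
          full_path, value, type(value).__name__)
      bad_paths.append(full_path)
  return bad_paths


class Opaque:
  """Stand-in for whatever object ends up in the options by accident."""


options = {'job_name': 'demo', 'nested': {'provider': Opaque()}}
bad = log_unexpected_values(options)
```

With the key and the repr of the value in the log, a flaky run would show 
which option carried the unexpected object.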


was (Author: altay):
Could we add a log error before the raise and log the value instead of catching 
the error?

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E       google.protobuf.json_format.ParseError: Unexpected type for Value message.
> {code}





[jira] [Commented] (BEAM-9975) PortableRunnerTest flake "ParseError: Unexpected type for Value message."

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105894#comment-17105894
 ] 

Ahmet Altay commented on BEAM-9975:
---

Could we add a log error before the raise and log the value instead of catching 
the error?

> PortableRunnerTest flake "ParseError: Unexpected type for Value message."
> -
>
> Key: BEAM-9975
> URL: https://issues.apache.org/jira/browse/BEAM-9975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> Error looks similar to the one in BEAM-9907. Example from 
> https://builds.apache.org/job/beam_PreCommit_Python_Cron/2732
> {code}
> apache_beam/runners/portability/fn_api_runner/fn_runner_test.py:569: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/pipeline.py:550: in __exit__
> self.run().wait_until_finish()
> apache_beam/pipeline.py:529: in run
> return self.runner.run_pipeline(self, self._options)
> apache_beam/runners/portability/portable_runner.py:426: in run_pipeline
> job_service_handle.submit(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:107: in submit
> prepare_response = self.prepare(proto_pipeline)
> apache_beam/runners/portability/portable_runner.py:184: in prepare
> pipeline_options=self.get_pipeline_options()),
> apache_beam/runners/portability/portable_runner.py:174: in 
> get_pipeline_options
> return job_utils.dict_to_struct(p_options)
> apache_beam/runners/job/utils.py:33: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py36-cython/py36-cython/lib/python3.6/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f69eb7b3ac8>
> message = 
>     def _ConvertValueMessage(self, value, message):
>       """Convert a JSON representation into Value message."""
>       if isinstance(value, dict):
>         self._ConvertStructMessage(value, message.struct_value)
>       elif isinstance(value, list):
>         self._ConvertListValueMessage(value, message.list_value)
>       elif value is None:
>         message.null_value = 0
>       elif isinstance(value, bool):
>         message.bool_value = value
>       elif isinstance(value, six.string_types):
>         message.string_value = value
>       elif isinstance(value, _INT_OR_FLOAT):
>         message.number_value = value
>       else:
> >       raise ParseError('Unexpected type for Value message.')
> E       google.protobuf.json_format.ParseError: Unexpected type for Value message.
> {code}





[jira] [Commented] (BEAM-9767) test_streaming_wordcount flaky timeouts

2020-05-12 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105858#comment-17105858
 ] 

Ahmet Altay commented on BEAM-9767:
---

https://github.com/apache/beam/pull/11663 is merged. Does this address the 
flake?

> test_streaming_wordcount flaky timeouts
> ---
>
> Key: BEAM-9767
> URL: https://issues.apache.org/jira/browse/BEAM-9767
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Udi Meiri
>Assignee: Sam Rohde
>Priority: Critical
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Timed out after 600s, typically completes in 2.8s on my workstation.
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/
> {code}
> self = 
>   testMethod=test_streaming_wordcount>
>     @unittest.skipIf(
>         sys.version_info < (3, 5, 3),
>         'The tests require at least Python 3.6 to work.')
>     def test_streaming_wordcount(self):
>       class WordExtractingDoFn(beam.DoFn):
>         def process(self, element):
>           text_line = element.strip()
>           words = text_line.split()
>           return words
> 
>       # Add the TestStream so that it can be cached.
>       ib.options.capturable_sources.add(TestStream)
>       ib.options.capture_duration = timedelta(seconds=5)
> 
>       p = beam.Pipeline(
>           runner=interactive_runner.InteractiveRunner(),
>           options=StandardOptions(streaming=True))
> 
>       data = (
>           p
>           | TestStream()
>               .advance_watermark_to(0)
>               .advance_processing_time(1)
>               .add_elements(['to', 'be', 'or', 'not', 'to', 'be'])
>               .advance_watermark_to(20)
>               .advance_processing_time(1)
>               .add_elements(['that', 'is', 'the', 'question'])
>           | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable
> 
>       counts = (
>           data
>           | 'split' >> beam.ParDo(WordExtractingDoFn())
>           | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
>           | 'group' >> beam.GroupByKey()
>           | 'count' >> beam.Map(lambda wordones: (wordones[0], sum(wordones[1]))))
> 
>       # Watch the local scope for Interactive Beam so that referenced PCollections
>       # will be cached.
>       ib.watch(locals())
> 
>       # This is normally done in the interactive_utils when a transform is
>       # applied but needs an IPython environment. So we manually run this here.
>       ie.current_env().track_user_pipelines()
> 
>       # Create a fake limiter that cancels the BCJ once the main job receives the
>       # expected amount of results.
>       class FakeLimiter:
>         def __init__(self, p, pcoll):
>           self.p = p
>           self.pcoll = pcoll
> 
>         def is_triggered(self):
>           result = ie.current_env().pipeline_result(self.p)
>           if result:
>             try:
>               results = result.get(self.pcoll)
>             except ValueError:
>               return False
>             return len(results) >= 10
>           return False
> 
>       # This sets the limiters to stop reading when the test receives 10 elements
>       # or after 5 seconds have elapsed (to eliminate the possibility of hanging).
>       ie.current_env().options.capture_control.set_limiters_for_test(
>           [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))])
> 
>       # This tests that the data was correctly cached.
>       pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0)
>       expected_data_df = pd.DataFrame([
>           ('to', 0, [IntervalWindow(0, 10)], pane_info),
>           ('be', 0, [IntervalWindow(0, 10)], pane_info),
>           ('or', 0, [IntervalWindow(0, 10)], pane_info),
>           ('not', 0, [IntervalWindow(0, 10)], pane_info),
>           ('to', 0, [IntervalWindow(0, 10)], pane_info),
>           ('be', 0, [IntervalWindow(0, 10)], pane_info),
>           ('that', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('is', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('the', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('question', 2000, [IntervalWindow(20, 30)], pane_info)
>       ], columns=[0, 'event_time', 'windows', 'pane_info']) # yapf: disable
> 
> >     data_df = ib.collect(data, include_window_info=True)
> apache_beam/runners/interactive/interactive_runner_test.py:237: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/runners/interactive/interactive_beam.py:451: in collect
> return head(pcoll, n=-1, include_window_info=include_window_info)
> apache_beam/runners/interactive/utils.py:204: in run_within_progress_indicator
> return 

[jira] [Commented] (BEAM-9001) Allow setting environment ID to all transforms in the SDK

2020-05-11 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104777#comment-17104777
 ] 

Ahmet Altay commented on BEAM-9001:
---

Does this need to be a blocker for 2.21? Is it a regression? Which users are 
affected by it?

> Allow setting environment ID to all transforms in the SDK
> -
>
> Key: BEAM-9001
> URL: https://issues.apache.org/jira/browse/BEAM-9001
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-harness, sdk-py-core, 
> sdk-py-harness
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Blocker
> Fix For: 2.21.0
>
>
> Currently Beam SDKs set the environment in a known set of transforms and do 
> not set it in others. Runners expect certain transforms to not resolve to 
> an environment.
> It might be cleaner to set the environment in all transforms by default (at 
> the SDKs) and allow runners to override this for transforms that are natively 
> implemented in the corresponding runners.





[jira] [Commented] (BEAM-9767) test_streaming_wordcount flaky timeouts

2020-05-07 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102163#comment-17102163
 ] 

Ahmet Altay commented on BEAM-9767:
---

https://github.com/apache/beam/pull/11624 is merged. Can we close this now?

> test_streaming_wordcount flaky timeouts
> ---
>
> Key: BEAM-9767
> URL: https://issues.apache.org/jira/browse/BEAM-9767
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Udi Meiri
>Assignee: Sam Rohde
>Priority: Critical
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Timed out after 600s, typically completes in 2.8s on my workstation.
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/
> {code}
> self = 
>   testMethod=test_streaming_wordcount>
>     @unittest.skipIf(
>         sys.version_info < (3, 5, 3),
>         'The tests require at least Python 3.6 to work.')
>     def test_streaming_wordcount(self):
>       class WordExtractingDoFn(beam.DoFn):
>         def process(self, element):
>           text_line = element.strip()
>           words = text_line.split()
>           return words
> 
>       # Add the TestStream so that it can be cached.
>       ib.options.capturable_sources.add(TestStream)
>       ib.options.capture_duration = timedelta(seconds=5)
> 
>       p = beam.Pipeline(
>           runner=interactive_runner.InteractiveRunner(),
>           options=StandardOptions(streaming=True))
> 
>       data = (
>           p
>           | TestStream()
>               .advance_watermark_to(0)
>               .advance_processing_time(1)
>               .add_elements(['to', 'be', 'or', 'not', 'to', 'be'])
>               .advance_watermark_to(20)
>               .advance_processing_time(1)
>               .add_elements(['that', 'is', 'the', 'question'])
>           | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable
> 
>       counts = (
>           data
>           | 'split' >> beam.ParDo(WordExtractingDoFn())
>           | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
>           | 'group' >> beam.GroupByKey()
>           | 'count' >> beam.Map(lambda wordones: (wordones[0], sum(wordones[1]))))
> 
>       # Watch the local scope for Interactive Beam so that referenced PCollections
>       # will be cached.
>       ib.watch(locals())
> 
>       # This is normally done in the interactive_utils when a transform is
>       # applied but needs an IPython environment. So we manually run this here.
>       ie.current_env().track_user_pipelines()
> 
>       # Create a fake limiter that cancels the BCJ once the main job receives the
>       # expected amount of results.
>       class FakeLimiter:
>         def __init__(self, p, pcoll):
>           self.p = p
>           self.pcoll = pcoll
> 
>         def is_triggered(self):
>           result = ie.current_env().pipeline_result(self.p)
>           if result:
>             try:
>               results = result.get(self.pcoll)
>             except ValueError:
>               return False
>             return len(results) >= 10
>           return False
> 
>       # This sets the limiters to stop reading when the test receives 10 elements
>       # or after 5 seconds have elapsed (to eliminate the possibility of hanging).
>       ie.current_env().options.capture_control.set_limiters_for_test(
>           [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))])
> 
>       # This tests that the data was correctly cached.
>       pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0)
>       expected_data_df = pd.DataFrame([
>           ('to', 0, [IntervalWindow(0, 10)], pane_info),
>           ('be', 0, [IntervalWindow(0, 10)], pane_info),
>           ('or', 0, [IntervalWindow(0, 10)], pane_info),
>           ('not', 0, [IntervalWindow(0, 10)], pane_info),
>           ('to', 0, [IntervalWindow(0, 10)], pane_info),
>           ('be', 0, [IntervalWindow(0, 10)], pane_info),
>           ('that', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('is', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('the', 2000, [IntervalWindow(20, 30)], pane_info),
>           ('question', 2000, [IntervalWindow(20, 30)], pane_info)
>       ], columns=[0, 'event_time', 'windows', 'pane_info']) # yapf: disable
> 
> >     data_df = ib.collect(data, include_window_info=True)
> apache_beam/runners/interactive/interactive_runner_test.py:237: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/runners/interactive/interactive_beam.py:451: in collect
> return head(pcoll, n=-1, include_window_info=include_window_info)
> apache_beam/runners/interactive/utils.py:204: in run_within_progress_indicator
> return func(*args, 

[jira] [Commented] (BEAM-9907) apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky

2020-05-07 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17102161#comment-17102161
 ] 

Ahmet Altay commented on BEAM-9907:
---

https://github.com/apache/beam/pull/11631 is merged. Can we close this now?

> apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky
> 
>
> Key: BEAM-9907
> URL: https://issues.apache.org/jira/browse/BEAM-9907
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core
>Reporter: Ning Kang
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Example test failures:
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12682/
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12684/
> A stacktrace
> {code:bash}
> apache_beam.transforms.external_test.ExternalTransformTest.test_nested (from 
> py37-cloud)
> Failing for the past 1 build (Since Failed#12682 )
> Took 54 ms.
> Error Message
> google.protobuf.json_format.ParseError: Unexpected type for Value message.
> Stacktrace
> self =  testMethod=test_nested>
> def test_nested(self):
>   with beam.Pipeline() as p:
> >   assert_that(p | FibTransform(6), equal_to([8]))
> apache_beam/transforms/external_test.py:250: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/transforms/ptransform.py:562: in __ror__
> result = p.apply(self, pvalueish, label)
> apache_beam/pipeline.py:651: in apply
> pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
> apache_beam/runners/runner.py:198: in apply
> return m(transform, input, options)
> apache_beam/runners/runner.py:228: in apply_PTransform
> return transform.expand(input)
> apache_beam/runners/portability/expansion_service_test.py:257: in expand
> expansion_service.ExpansionServiceServicer())
> apache_beam/pvalue.py:140: in __or__
> return self.pipeline.apply(ptransform, self)
> apache_beam/pipeline.py:598: in apply
> transform.transform, pvalueish, label or transform.label)
> apache_beam/pipeline.py:608: in apply
> return self.apply(transform, pvalueish)
> apache_beam/pipeline.py:651: in apply
> pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
> apache_beam/runners/runner.py:198: in apply
> return m(transform, input, options)
> apache_beam/runners/runner.py:228: in apply_PTransform
> return transform.expand(input)
> apache_beam/transforms/external.py:322: in expand
> pipeline_options=job_utils.pipeline_options_dict_to_struct(options))
> apache_beam/runners/job/utils.py:38: in pipeline_options_dict_to_struct
> v in options.items() if v is not None
> apache_beam/runners/job/utils.py:44: in dict_to_struct
> return json_format.ParseDict(dict_obj, struct_pb2.Struct())
> target/.tox-py37-cloud/py37-cloud/lib/python3.7/site-packages/google/protobuf/json_format.py:450:
>  in ParseDict
> parser.ConvertMessage(js_dict, message)
> target/.tox-py37-cloud/py37-cloud/lib/python3.7/site-packages/google/protobuf/json_format.py:479:
>  in ConvertMessage
> methodcaller(_WKTJSONMETHODS[full_name][1], value, message)(self)
> target/.tox-py37-cloud/py37-cloud/lib/python3.7/site-packages/google/protobuf/json_format.py:667:
>  in _ConvertStructMessage
> self._ConvertValueMessage(value[key], message.fields[key])
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> value =  0x7f35a4c00390>
> message = 
> def _ConvertValueMessage(self, value, message):
>   """Convert a JSON representation into Value message."""
>   if isinstance(value, dict):
>     self._ConvertStructMessage(value, message.struct_value)
>   elif isinstance(value, list):
>     self. _ConvertListValueMessage(value, message.list_value)
>   elif value is None:
>     message.null_value = 0
>   elif isinstance(value, bool):
>     message.bool_value = value
>   elif isinstance(value, six.string_types):
>     message.string_value = value
>   elif isinstance(value, _INT_OR_FLOAT):
>     message.number_value = value
>   else:
> >     raise ParseError('Unexpected type for Value message.')
> E   google.protobuf.json_format.ParseError: Unexpected type for Value message.
> target/.tox-py37-cloud/py37-cloud/lib/python3.7/site-packages/google/protobuf/json_format.py:647:
>  ParseError
> {code}
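
The trace above ends in protobuf's `_ConvertValueMessage`, which accepts only dicts, lists, None, bools, strings and numbers, so any other object in the pipeline options dict raises `ParseError`. A minimal sketch of that type check follows, useful for finding which option is the offender. `find_unsupported_values` is a hypothetical helper written for illustration; it is not part of Beam or protobuf and is not the fix that was merged.

```python
def find_unsupported_values(obj, path='options'):
  """Return the paths of values that a Struct conversion would reject.

  Mirrors the isinstance chain shown in the trace: dict, list, None,
  bool, str and int/float are accepted; everything else is reported.
  """
  if isinstance(obj, dict):
    bad = []
    for key, value in obj.items():
      bad.extend(find_unsupported_values(value, '%s[%r]' % (path, key)))
    return bad
  if isinstance(obj, list):
    bad = []
    for index, value in enumerate(obj):
      bad.extend(find_unsupported_values(value, '%s[%d]' % (path, index)))
    return bad
  if obj is None or isinstance(obj, (bool, str, int, float)):
    return []
  return [path]


if __name__ == '__main__':
  # Hypothetical options dict; object() stands in for the stray value.
  options = {'runner': 'DirectRunner', 'hook': object()}
  print(find_unsupported_values(options))
```

Running a check like this on the dict passed to `pipeline_options_dict_to_struct` would name the key whose value is not JSON-compatible, which the protobuf error message itself does not do.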



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-9907) apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky

2020-05-07 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101931#comment-17101931
 ] 

Ahmet Altay commented on BEAM-9907:
---

[~bhulette] - mentioned that this could be related to 
https://github.com/apache/beam/pull/11574

> apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky
> 
>
> Key: BEAM-9907
> URL: https://issues.apache.org/jira/browse/BEAM-9907
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core
>Reporter: Ning Kang
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Example test failures:
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12682/
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12684/


[jira] [Commented] (BEAM-9907) apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky

2020-05-07 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101898#comment-17101898
 ] 

Ahmet Altay commented on BEAM-9907:
---

Any updates on this issue? What is the next action here?

> apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky
> 
>
> Key: BEAM-9907
> URL: https://issues.apache.org/jira/browse/BEAM-9907
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core
>Reporter: Ning Kang
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Example test failures:
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12682/
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12684/


[jira] [Commented] (BEAM-9907) apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky

2020-05-06 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101104#comment-17101104
 ] 

Ahmet Altay commented on BEAM-9907:
---

[~chamikara] / [~heejong] - Could you please fix or disable this test?

> apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky
> 
>
> Key: BEAM-9907
> URL: https://issues.apache.org/jira/browse/BEAM-9907
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core
>Reporter: Ning Kang
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Example test failures:
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12682/
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12684/


[jira] [Assigned] (BEAM-9907) apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky

2020-05-06 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9907:
-

Assignee: Chamikara Madhusanka Jayalath

> apache_beam.transforms.external_test.ExternalTransformTest.test_nested flaky
> 
>
> Key: BEAM-9907
> URL: https://issues.apache.org/jira/browse/BEAM-9907
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core
>Reporter: Ning Kang
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Example test failures:
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12682/
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12684/


[jira] [Commented] (BEAM-9767) test_streaming_wordcount flaky timeouts

2020-05-06 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101103#comment-17101103
 ] 

Ahmet Altay commented on BEAM-9767:
---

Sam, could you please disable the flaky test? This is resulting in pre-commit 
failures.
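
One common way to disable a flaky test without deleting it is `unittest.skip` with the issue key as the reason. The sketch below assumes a stand-in test class with an elided body; it is an illustration only and not necessarily how the test was actually disabled in Beam.

```python
import unittest


class InteractiveRunnerTest(unittest.TestCase):
  # Hypothetical stand-in for the real test class; the body is elided.
  @unittest.skip('BEAM-9767: flaky timeouts in pre-commit')
  def test_streaming_wordcount(self):
    self.fail('never runs while the skip decorator is in place')


def run_suite():
  """Run the test case and return the unittest.TestResult."""
  loader = unittest.defaultTestLoader
  suite = loader.loadTestsFromTestCase(InteractiveRunnerTest)
  result = unittest.TestResult()
  suite.run(result)
  return result


if __name__ == '__main__':
  result = run_suite()
  print('skipped:', len(result.skipped))
```

Keeping the issue key in the skip reason makes it easy to find and re-enable the test once the timeout is understood.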

> test_streaming_wordcount flaky timeouts
> ---
>
> Key: BEAM-9767
> URL: https://issues.apache.org/jira/browse/BEAM-9767
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Udi Meiri
>Assignee: Sam Rohde
>Priority: Critical
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Timed out after 600s, typically completes in 2.8s on my workstation.
> https://builds.apache.org/job/beam_PreCommit_Python_Commit/12376/
> {code}
> self = 
>   testMethod=test_streaming_wordcount>
> @unittest.skipIf(
> sys.version_info < (3, 5, 3),
> 'The tests require at least Python 3.6 to work.')
> def test_streaming_wordcount(self):
>   class WordExtractingDoFn(beam.DoFn):
>     def process(self, element):
>       text_line = element.strip()
>       words = text_line.split()
>       return words
> 
>   # Add the TestStream so that it can be cached.
>   ib.options.capturable_sources.add(TestStream)
>   ib.options.capture_duration = timedelta(seconds=5)
> 
>   p = beam.Pipeline(
>       runner=interactive_runner.InteractiveRunner(),
>       options=StandardOptions(streaming=True))
> 
>   data = (
>       p
>       | TestStream()
>           .advance_watermark_to(0)
>           .advance_processing_time(1)
>           .add_elements(['to', 'be', 'or', 'not', 'to', 'be'])
>           .advance_watermark_to(20)
>           .advance_processing_time(1)
>           .add_elements(['that', 'is', 'the', 'question'])
>       | beam.WindowInto(beam.window.FixedWindows(10))) # yapf: disable
> 
>   counts = (
>       data
>       | 'split' >> beam.ParDo(WordExtractingDoFn())
>       | 'pair_with_one' >> beam.Map(lambda x: (x, 1))
>       | 'group' >> beam.GroupByKey()
>       | 'count' >> beam.Map(lambda wordones: (wordones[0], sum(wordones[1]))))
> 
>   # Watch the local scope for Interactive Beam so that referenced PCollections
>   # will be cached.
>   ib.watch(locals())
> 
>   # This is normally done in the interactive_utils when a transform is
>   # applied but needs an IPython environment. So we manually run this here.
>   ie.current_env().track_user_pipelines()
> 
>   # Create a fake limiter that cancels the BCJ once the main job receives the
>   # expected amount of results.
>   class FakeLimiter:
>     def __init__(self, p, pcoll):
>       self.p = p
>       self.pcoll = pcoll
> 
>     def is_triggered(self):
>       result = ie.current_env().pipeline_result(self.p)
>       if result:
>         try:
>           results = result.get(self.pcoll)
>         except ValueError:
>           return False
>         return len(results) >= 10
>       return False
> 
>   # This sets the limiters to stop reading when the test receives 10 elements
>   # or after 5 seconds have elapsed (to eliminate the possibility of hanging).
>   ie.current_env().options.capture_control.set_limiters_for_test(
>       [FakeLimiter(p, data), DurationLimiter(timedelta(seconds=5))])
> 
>   # This tests that the data was correctly cached.
>   pane_info = PaneInfo(True, True, PaneInfoTiming.UNKNOWN, 0, 0)
>   expected_data_df = pd.DataFrame([
>       ('to', 0, [IntervalWindow(0, 10)], pane_info),
>       ('be', 0, [IntervalWindow(0, 10)], pane_info),
>       ('or', 0, [IntervalWindow(0, 10)], pane_info),
>       ('not', 0, [IntervalWindow(0, 10)], pane_info),
>       ('to', 0, [IntervalWindow(0, 10)], pane_info),
>       ('be', 0, [IntervalWindow(0, 10)], pane_info),
>       ('that', 2000, [IntervalWindow(20, 30)], pane_info),
>       ('is', 2000, [IntervalWindow(20, 30)], pane_info),
>       ('the', 2000, [IntervalWindow(20, 30)], pane_info),
>       ('question', 2000, [IntervalWindow(20, 30)], pane_info)
>   ], columns=[0, 'event_time', 'windows', 'pane_info']) # yapf: disable
> 
> > data_df = ib.collect(data, include_window_info=True)
> apache_beam/runners/interactive/interactive_runner_test.py:237: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> apache_beam/runners/interactive/interactive_beam.py:451: in collect
> return head(pcoll, n=-1, include_window_info=include_window_info)
> apache_beam/runners/interactive/utils.py:204: in run_within_progress_indicator
> return 

[jira] [Updated] (BEAM-7246) Add Google Spanner IO on Python SDK

2020-05-01 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-7246:
--
Fix Version/s: (was: 2.20.0)
   2.21.0

> Add Google Spanner IO on Python SDK 
> 
>
> Key: BEAM-7246
> URL: https://issues.apache.org/jira/browse/BEAM-7246
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Reuven Lax
>Assignee: Shoaib Zafar
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 22.5h
>  Remaining Estimate: 0h
>
> Add I/O support for Google Cloud Spanner for the Python SDK (Batch Only).
> Testing in this work item will be in the form of DirectRunner tests and 
> manual testing.
> Integration and performance tests are a separate work item (not included 
> here).
> See https://beam.apache.org/documentation/io/built-in/. The goal is to add 
> Google Cloud Spanner to the Database column for the Python/Batch row.





[jira] [Updated] (BEAM-6795) Improve Release Scripts

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6795:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Improvement)

> Improve Release Scripts
> ---
>
> Key: BEAM-6795
> URL: https://issues.apache.org/jira/browse/BEAM-6795
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ahmet Altay
>Priority: Major
>
> - Scripts use sudo to install binaries. This could be improved by local 
> installations, or perhaps by using a container to build the release.
> - Scripts make changes to the bashrc file (e.g. alias hub to git); these could 
> be avoided. Even though the scripts attempt to make a backup file, it is easy 
> to overwrite it if the script is cancelled.
> - There are too many yes/no questions and configuration questions for 
> validations. They are not set-and-forget and require attention. (Possible 
> solution: use command line arguments.)
> - Once the script fails at any step (e.g. invalid password at a step) it fails 
> without giving a second chance and requires re-running from the top. 
> (Possible idea: use breadcrumbs to continue the script from its last known 
> location.)
> - Signing with GPG is not friendly when used from a remote terminal. It has 
> modal dialogs and does not interact well with gradle.
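
Two of the ideas in the list above, command line arguments instead of interactive questions and breadcrumbs for resuming after a failure, can be sketched roughly as follows. The flag names, the state file name and the `run_steps` helper are all hypothetical; the existing release scripts are shell scripts and this is not their interface.

```python
import argparse
import json
import os


def parse_args(argv):
  """Collect up front what the scripts currently ask interactively."""
  parser = argparse.ArgumentParser(description='Hypothetical release driver.')
  parser.add_argument('--release', required=True, help='Version, e.g. 2.21.0')
  parser.add_argument('--rc', type=int, default=1, help='RC number')
  parser.add_argument('--state-file', default='.release_breadcrumbs.json')
  return parser.parse_args(argv)


def run_steps(steps, state_file):
  """Run (name, callable) steps in order, skipping ones already recorded.

  Completed step names are written to state_file after each step, so a
  rerun after a failure resumes at the step that failed.
  """
  done = []
  if os.path.exists(state_file):
    with open(state_file) as f:
      done = json.load(f)
  for name, step in steps:
    if name in done:
      continue
    step()  # An exception here leaves the earlier breadcrumbs intact.
    done.append(name)
    with open(state_file, 'w') as f:
      json.dump(done, f)
  return done
```

With this shape, a wrong password at one step costs only a rerun of that step, and deleting the state file forces a run from the top.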





[jira] [Updated] (BEAM-6613) build_release_candidate.sh fails if RC > 1 but there's no SVN directory on dist.apache.org

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6613:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Bug)

> build_release_candidate.sh fails if RC > 1 but there's no SVN directory on 
> dist.apache.org
> --
>
> Key: BEAM-6613
> URL: https://issues.apache.org/jira/browse/BEAM-6613
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>
> Currently the logic is "if RC > 1 then delete the dist.apache.org staging 
> directory".
> Actually the logic should be "if the staging directory exists and we want to 
> stage a new thing, delete the staging directory".
> (the overall flow might change but this is surgical)
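
The proposed condition can be sketched as below. This is only an illustration of the logic: a local directory stands in for the SVN staging directory on dist.apache.org, and `prepare_staging_dir` is a hypothetical name, not a function in build_release_candidate.sh.

```python
import shutil
from pathlib import Path


def prepare_staging_dir(staging_dir):
  """Delete a leftover staging directory, if any, then create a fresh one.

  The decision depends only on whether the directory exists, not on the
  RC number. Returns True when an existing directory was removed.
  """
  path = Path(staging_dir)
  existed = path.exists()
  if existed:
    shutil.rmtree(path)
  path.mkdir(parents=True)
  return existed
```

Because the check no longer looks at the RC number, an RC2 with no earlier staging directory and an RC1 rerun with a leftover directory are both handled.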





[jira] [Updated] (BEAM-6441) cut_release_branch.sh should not push to master without verification and a PR

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6441:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Bug)

> cut_release_branch.sh should not push to master without verification and a PR
> -
>
> Key: BEAM-6441
> URL: https://issues.apache.org/jira/browse/BEAM-6441
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently, the cut_release_branch.sh does many things:
>  - Edits files in place to update the version
>  - Makes a local commit
>  - Pushes the local commit to master
>  - Creates a new branch
>  - Edits files in place to update the version
>  - Pushes the release branch
> I think all of this except the push to master is OK. It is possible that we 
> have something - website, examples, new places where the version is 
> hardcoded, etc. - that gets broken in this process. Moving from x-SNAPSHOT to 
> (x+1)-SNAPSHOT is easy to do in a pull request and safe. The release branch 
> creation does not need to be synchronized with this.
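
The x-SNAPSHOT to (x+1)-SNAPSHOT step is small enough to compute and review in a pull request. A minimal sketch, assuming that "x+1" means bumping the minor version and resetting the patch version; `next_snapshot` is a hypothetical helper, not part of cut_release_branch.sh.

```python
import re

_SNAPSHOT = re.compile(r'^(\d+)\.(\d+)\.(\d+)-SNAPSHOT$')


def next_snapshot(version):
  """Return the next development version for a major.minor.patch-SNAPSHOT."""
  match = _SNAPSHOT.match(version)
  if match is None:
    raise ValueError('not a SNAPSHOT version: %r' % version)
  major, minor, _ = (int(group) for group in match.groups())
  # Assumption: the next development line bumps minor and resets patch.
  return '%d.%d.0-SNAPSHOT' % (major, minor + 1)
```

Rejecting anything that is not a SNAPSHOT version keeps the helper from silently rewriting an already released version string.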





[jira] [Updated] (BEAM-6614) Release candidate does not require --no-parallel

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6614:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Bug)

> Release candidate does not require --no-parallel
> 
>
> Key: BEAM-6614
> URL: https://issues.apache.org/jira/browse/BEAM-6614
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>






[jira] [Updated] (BEAM-6598) build_release_candidate.sh fails because of leftover RC tags in working git clone

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6598:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Bug)

> build_release_candidate.sh fails because of leftover RC tags in working git 
> clone
> -
>
> Key: BEAM-6598
> URL: https://issues.apache.org/jira/browse/BEAM-6598
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>
> Currently, the build_release_candidate.sh re-uses 
> $HOME/build_release_candidate/beam as the git clone. If the RC tag already 
> exists due to a prior build, it crashes. Instead, since that clone failed, it 
> should just not be used the next time. The workflow that makes sense to me is:
>  - Locally tag the intended RC commit
>  - Try to build the RC from that
>  - If the RC build fails, remove tag / discard working branch / etc
>  - If the RC succeeds, push the tag (and if it is on a gradle release plugin 
> commit, those commits)
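
The workflow proposed above could be sketched roughly as follows. This is not the contents of build_release_candidate.sh; the function `build_release_candidate`, the `run` parameter, and the build command are hypothetical names chosen for illustration. Only standard `git tag` and `git push` invocations are used.

```python
import subprocess
from typing import Callable, Sequence


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def build_release_candidate(
    tag: str,
    commit: str,
    build_cmd: Sequence[str],
    remote: str = "origin",
    run: Callable[[Sequence[str]], None] = run_checked,
) -> bool:
    """Tag locally, build, and push the tag only if the build succeeds."""
    run(["git", "tag", tag, commit])        # 1. tag the intended RC commit locally
    try:
        run(build_cmd)                      # 2. try to build the RC
    except subprocess.CalledProcessError:
        run(["git", "tag", "-d", tag])      # 3. build failed: remove the local tag
        return False
    run(["git", "push", remote, tag])       # 4. build succeeded: publish the tag
    return True
```

Because the tag is pushed only after a successful build, a failed attempt leaves nothing behind on the remote, and the next attempt does not trip over a leftover tag.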





[jira] [Updated] (BEAM-6595) build_release_candidate.sh should not push to apache org on github

2020-04-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-6595:
--
Parent: BEAM-7887
Issue Type: Sub-task  (was: Bug)

> build_release_candidate.sh should not push to apache org on github
> --
>
> Key: BEAM-6595
> URL: https://issues.apache.org/jira/browse/BEAM-6595
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Kenneth Knowles
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, build_release_candidate.sh does many things beyond the build:
>  - Edits files in place to update the version from SNAPSHOT to non-SNAPSHOT
>  - Makes a local commit
>  - Pushes commits to release branch
>  - Reverts on failure, pushes those to release branch
> Instead, the release manager should determine what gets pushed. That makes 
> the process less fragile and avoids cruft getting pushed and churning the 
> branch. The only thing the plugin is really good for is flipping SNAPSHOT 
> away and back. And it isn't even that great at that, because it is Java-only 
> and other languages are at non-SNAPSHOT anyhow.
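
To show how small the "flipping SNAPSHOT away and back" step is, here is a minimal sketch in plain string handling. The helper names `to_release_version` and `to_snapshot_version` are hypothetical and are not part of the gradle release plugin or the Beam scripts.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def to_release_version(version: str) -> str:
    """Drop the -SNAPSHOT suffix, e.g. for the commit an RC is built from."""
    if not version.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"{version!r} is not a SNAPSHOT version")
    return version[: -len(SNAPSHOT_SUFFIX)]


def to_snapshot_version(version: str) -> str:
    """Restore the -SNAPSHOT suffix after the release build."""
    if version.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"{version!r} is already a SNAPSHOT version")
    return version + SNAPSHOT_SUFFIX


print(to_release_version("2.21.0-SNAPSHOT"))
print(to_snapshot_version("2.21.0"))
```

With the flip this simple, the release manager can apply it by hand or in a reviewed commit and decide separately what, if anything, gets pushed.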





[jira] [Commented] (BEAM-6586) Design and implement a release process for Beam SDK harness containers.

2020-04-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17088191#comment-17088191
 ] 

Ahmet Altay commented on BEAM-6586:
---

I believe this is fixed by [~hannahjiang]'s work. Anything else to do here?

> Design and implement a release process for Beam SDK harness containers.
> ---
>
> Key: BEAM-6586
> URL: https://issues.apache.org/jira/browse/BEAM-6586
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Related discussion: 
> [https://lists.apache.org/thread.html/770496ee9cf1096d78806fece8dd37716279b51ca5bb600dfa263c55@%3Cdev.beam.apache.org%3E]
> cc: [~angoenka]





[jira] [Commented] (BEAM-9685) Don't release Go SDK container until Go is officially supported.

2020-04-16 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17085101#comment-17085101
 ] 

Ahmet Altay commented on BEAM-9685:
---

Kyle, Hannah, is this fixed? Or should this be re-opened?

> Don't release Go SDK container until Go is officially supported.
> 
>
> Key: BEAM-9685
> URL: https://issues.apache.org/jira/browse/BEAM-9685
> Project: Beam
>  Issue Type: Task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.21.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 1. Remove Go SDK container from release process.
> 2. Update document about it.





[jira] [Created] (BEAM-9771) colab links in example notebooks don't work

2020-04-16 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9771:
-

 Summary: colab links in example notebooks don't work
 Key: BEAM-9771
 URL: https://issues.apache.org/jira/browse/BEAM-9771
 Project: Beam
  Issue Type: Bug
  Components: examples-python
Reporter: Ahmet Altay
Assignee: David Cavazos


Example:
https://github.com/apache/beam/blob/master/examples/notebooks/documentation/transforms/python/elementwise/map-py.ipynb

Error:
Notebook not found
There was an error loading this notebook. Ensure that the file is accessible 
and try again.
Ensure that you have permission to view this notebook in GitHub and authorize 
Colaboratory to use the GitHub API.

https://github.com/apache/beam/blob/master/Users/dcavazos/src/beam/examples/notebooks/documentation/transforms/python/elementwise/map-py.ipynb

I believe this is true for all files in that folder at least. I did not check 
other places.
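
The broken link above contains what looks like a local checkout path (Users/dcavazos/src/beam) after blob/master/, which suggests the link was built from an absolute path instead of a repo-relative one. That cause is a guess. A minimal sketch of building the link from a repo-relative path follows; the function `colab_url` and its parameters are hypothetical and this is not the actual notebook generator.

```python
import posixpath

COLAB_PREFIX = "https://colab.research.google.com/github"


def colab_url(repo_root: str, notebook_path: str,
              repo: str = "apache/beam", branch: str = "master") -> str:
    """Build a Colab link using the notebook path relative to the repo root."""
    relative = posixpath.relpath(notebook_path, start=repo_root)
    if relative.startswith(".."):
        raise ValueError(f"{notebook_path!r} is outside {repo_root!r}")
    return f"{COLAB_PREFIX}/{repo}/blob/{branch}/{relative}"


print(colab_url(
    "/Users/dcavazos/src/beam",
    "/Users/dcavazos/src/beam/examples/notebooks/documentation/"
    "transforms/python/elementwise/map-py.ipynb",
))
```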





[jira] [Created] (BEAM-9750) Streaming Word Count Example Documents is out of date (Python)

2020-04-13 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9750:
-

 Summary: Streaming Word Count Example Documents is out of date 
(Python)
 Key: BEAM-9750
 URL: https://issues.apache.org/jira/browse/BEAM-9750
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, website
Reporter: Ahmet Altay
Assignee: Rose Nguyen


Flink runners are listed as "This runner is not yet available for the Python 
SDK." This is not accurate: the Flink runner supports streaming with Python.

Link: 
https://beam.apache.org/get-started/wordcount-example/#streamingwordcount-example

/cc [~ibzib]





[jira] [Commented] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits

2020-04-08 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078910#comment-17078910
 ] 

Ahmet Altay commented on BEAM-9484:
---

Any update on this? This is one of the top issues impacting Python 
postcommits. Should we sickbay it?
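
For readers unfamiliar with the term, one common way to sickbay a test is to mark it as skipped with a pointer to the tracking issue. The sketch below uses only the standard `unittest` module. The class and method names mirror the ones in the traceback, but the body is a placeholder and the helper `run_suite` is hypothetical; Beam's own test suites may use a different mechanism.

```python
import unittest


class BigQueryWriteIntegrationTests(unittest.TestCase):

    @unittest.skip("BEAM-9484: flaky in DirectRunner postcommits")
    def test_big_query_write(self):
        # Placeholder body; it never runs while the test is skipped.
        self.fail("should not run while sickbayed")


def run_suite() -> unittest.TestResult:
    """Run the test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        BigQueryWriteIntegrationTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The skip reason keeps the issue key visible in test reports, which makes it easier to remove the skip once the flakiness is fixed.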

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is 
> flaky in DirectRunner Postcommits
> 
>
> Key: BEAM-9484
> URL: https://issues.apache.org/jira/browse/BEAM-9484
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp, test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>
> From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/: 
> {noformat}
>  ==
> 04:40:28  FAIL: test_big_query_write 
> (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> 04:40:28  
> --
> 04:40:28  Traceback (most recent call last):
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py",
>  line 167, in test_big_query_write
> 04:40:28  write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 522, in __exit__
> 04:40:28  self.run().wait_until_finish()
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 495, in run
> 04:40:28  self._options).run(False)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 508, in run
> 04:40:28  return self.runner.run_pipeline(self, self._options)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 53, in run_pipeline
> 04:40:28  hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 04:40:28  AssertionError: 
> 04:40:28  Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')])
> 04:40:28   but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')] Actual data is []
> 04:40:28  
> {noformat}





[jira] [Comment Edited] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits

2020-04-08 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17078910#comment-17078910
 ] 

Ahmet Altay edited comment on BEAM-9484 at 4/9/20, 4:43 AM:


Any update on this? This is one of the top issues impacting Python 
postcommits. Should we sickbay it temporarily?


was (Author: altay):
Any update on this? This is one of the top issues impacting Python 
postcommits. Should we sickbay it?

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is 
> flaky in DirectRunner Postcommits
> 
>
> Key: BEAM-9484
> URL: https://issues.apache.org/jira/browse/BEAM-9484
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp, test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Pablo Estrada
>Priority: Major
>
> From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/: 
> {noformat}
>  ==
> 04:40:28  FAIL: test_big_query_write 
> (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> 04:40:28  
> --
> 04:40:28  Traceback (most recent call last):
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py",
>  line 167, in test_big_query_write
> 04:40:28  write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 522, in __exit__
> 04:40:28  self.run().wait_until_finish()
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 495, in run
> 04:40:28  self._options).run(False)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 508, in run
> 04:40:28  return self.runner.run_pipeline(self, self._options)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 53, in run_pipeline
> 04:40:28  hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 04:40:28  AssertionError: 
> 04:40:28  Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')])
> 04:40:28   but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')] Actual data is []
> 04:40:28  
> {noformat}





[jira] [Commented] (BEAM-9626) pymongo should be an optional requirement

2020-03-27 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069151#comment-17069151
 ] 

Ahmet Altay commented on BEAM-9626:
---

Why do we want to make this change, especially retroactively, for an IO that 
is already in the core package? What about the other IOs?

I do not think we would like to set a precedent for moving packages out of Beam 
core unless there is a deeper issue. Also, specifically for Python IOs, the 
growth rate outside of the gcp package has not been significant so far.

> pymongo should be an optional requirement
> -
>
> Key: BEAM-9626
> URL: https://issues.apache.org/jira/browse/BEAM-9626
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The pymongo driver is installed by default, but as the number of IO 
> connectors in the Python SDK grows, I don't think this is the precedent we 
> want to set. We already have "extra" packages for gcp, aws, and interactive; 
> we should also add one for mongo. 
>  
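
For context, an optional requirement usually has two parts: an entry in the package's extras, and a guarded import in the module that needs the driver. The sketch below shows both under stated assumptions. The extras name "mongo", the `EXTRAS_REQUIRE` fragment, and the helper `require_pymongo` are hypothetical and do not reflect Beam's actual setup.py.

```python
# Hypothetical extras fragment; version bounds are deliberately omitted.
EXTRAS_REQUIRE = {
    "mongo": ["pymongo"],
}

try:
    import pymongo  # optional dependency
except ImportError:
    pymongo = None


def require_pymongo():
    """Return the pymongo module, or fail with an actionable message."""
    if pymongo is None:
        raise ImportError(
            "pymongo is not installed; install the (hypothetical) "
            "'mongo' extra to use the MongoDB connector.")
    return pymongo
```

The trade-off raised in the comment above still applies: users who currently get pymongo by default would have to opt in after such a change.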





[jira] [Commented] (BEAM-9557) Error setting processing time timers near end-of-window

2020-03-23 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064984#comment-17064984
 ] 

Ahmet Altay commented on BEAM-9557:
---

For reference, the commit from the description 
(a005fd765a762183ca88df90f261f6d4a20cf3e0) was added in 
https://github.com/apache/beam/pull/10627

[~reuvenlax]

> Error setting processing time timers near end-of-window
> ---
>
> Key: BEAM-9557
> URL: https://issues.apache.org/jira/browse/BEAM-9557
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Affects Versions: 2.20.0
>Reporter: Steve Niemitz
>Priority: Critical
>
> Previously, it was possible to set a processing time timer past the end of a 
> window, and it would simply not fire.
> However, now, this results in an error:
> {code:java}
> java.lang.IllegalArgumentException: Attempted to set event time timer that 
> outputs for 2020-03-19T18:01:35.000Z but that is after the expiration of 
> window 2020-03-19T17:59:59.999Z
> 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:440)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setAndVerifyOutputTimestamp(SimpleDoFnRunner.java:1011)
> 
> org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$TimerInternalsTimer.setRelative(SimpleDoFnRunner.java:934)
> .processElement(???.scala:187)
>  {code}
>  
> I think the regression was introduced in commit 
> a005fd765a762183ca88df90f261f6d4a20cf3e0. Also notice that the error message 
> is wrong: it says "event time timer", but the timer is in the processing 
> time domain.





[jira] [Assigned] (BEAM-9549) Flaky portableWordCountBatch and portableWordCountStreaming tests

2020-03-18 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9549:
-

Assignee: Ankur Goenka

> Flaky portableWordCountBatch and portableWordCountStreaming tests
> -
>
> Key: BEAM-9549
> URL: https://issues.apache.org/jira/browse/BEAM-9549
> Project: Beam
>  Issue Type: Test
>  Components: test-failures
>Reporter: Ning Kang
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: Sr5cNnx8sAW.png
>
>
> The tests :sdks:python:test-suites:portable:py2:portableWordCountBatch and 
> :sdks:python:test-suites:portable:py2:portableWordCountStreaming are flaky 
> and sometimes throw gRPC errors.
> Stacktrace
> {code:java}
> INFO:root:Using Python SDK docker image: apache/beam_python2.7_sdk:2.21.0.dev. If the image is not available at local, we will try to pull from hub.docker.com
> INFO:apache_beam.runners.portability.fn_api_runner_transforms:
> INFO:apache_beam.utils.subprocess_server:Starting service with ['docker' 'run' '-v' '/usr/bin/docker:/bin/docker' '-v' '/var/run/docker.sock:/var/run/docker.sock' '--network=host' 'apache/beam_flink1.9_job_server:latest' '--job-host' 'localhost' '--job-port' '58753' '--artifact-port' '60175' '--expansion-port' '33067']
> INFO:apache_beam.utils.subprocess_server:[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - ArtifactStagingService started on localhost:60175
> INFO:apache_beam.utils.subprocess_server:[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - Java ExpansionService started on localhost:33067
> INFO:apache_beam.utils.subprocess_server:[main] INFO org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver - JobService started on localhost:58753
> ERROR:grpc._common:Exception deserializing message!
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_common.py", line 84, in _transform
>     return transformer(message)
> DecodeError: Error parsing message
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py", line 142, in
>     run()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/lib/python2.7/site-packages/apache_beam/examples/wordcount.py", line 121, in run
>     result = p.run()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 495, in run
>     self._options).run(False)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/pipeline.py", line 508, in run
>     return self.runner.run_pipeline(self, self._options)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 401, in run_pipeline
>     job_service_handle.submit(proto_pipeline)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 102, in submit
>     prepare_response = self.prepare(proto_pipeline)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/apache_beam/runners/portability/portable_runner.py", line 179, in prepare
>     timeout=self.timeout)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 826, in __call__
>     return _end_unary_response_blocking(state, call, False, None)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Portable_Python_Commit/src/build/gradleenv/1866363813/local/lib/python2.7/site-packages/grpc/_channel.py", line 729, in _end_unary_response_blocking
>     raise _InactiveRpcError(state)
> grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
>   status = StatusCode.INTERNAL

[jira] [Commented] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits

2020-03-17 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061242#comment-17061242
 ] 

Ahmet Altay commented on BEAM-9484:
---

cc: [~pabloem]

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is 
> flaky in DirectRunner Postcommits
> 
>
> Key: BEAM-9484
> URL: https://issues.apache.org/jira/browse/BEAM-9484
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp, test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/: 
> {noformat}
>  ==
> 04:40:28  FAIL: test_big_query_write 
> (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> 04:40:28  
> --
> 04:40:28  Traceback (most recent call last):
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py",
>  line 167, in test_big_query_write
> 04:40:28  write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 522, in __exit__
> 04:40:28  self.run().wait_until_finish()
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 495, in run
> 04:40:28  self._options).run(False)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 508, in run
> 04:40:28  return self.runner.run_pipeline(self, self._options)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 53, in run_pipeline
> 04:40:28  hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 04:40:28  AssertionError: 
> 04:40:28  Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')])
> 04:40:28   but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')] Actual data is []
> 04:40:28  
> {noformat}





[jira] [Assigned] (BEAM-9484) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is flaky in DirectRunner Postcommits

2020-03-17 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9484:
-

Assignee: Chamikara Madhusanka Jayalath

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests is 
> flaky in DirectRunner Postcommits
> 
>
> Key: BEAM-9484
> URL: https://issues.apache.org/jira/browse/BEAM-9484
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp, test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> From https://builds.apache.org/job/beam_PostCommit_Python37_PR/99/: 
> {noformat}
>  ==
> 04:40:28  FAIL: test_big_query_write 
> (apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests)
> 04:40:28  
> --
> 04:40:28  Traceback (most recent call last):
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/io/gcp/bigquery_write_it_test.py",
>  line 167, in test_big_query_write
> 04:40:28  write_disposition=beam.io.BigQueryDisposition.WRITE_EMPTY))
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 522, in __exit__
> 04:40:28  self.run().wait_until_finish()
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 495, in run
> 04:40:28  self._options).run(False)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/pipeline.py",
>  line 508, in run
> 04:40:28  return self.runner.run_pipeline(self, self._options)
> 04:40:28File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37_PR/src/sdks/python/apache_beam/runners/direct/test_direct_runner.py",
>  line 53, in run_pipeline
> 04:40:28  hc_assert_that(self.result, pickler.loads(on_success_matcher))
> 04:40:28  AssertionError: 
> 04:40:28  Expected: (Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')])
> 04:40:28   but: Expected data is [(1, 'abc'), (2, 'def'), (3, '你好'), (4, 
> 'привет')] Actual data is []
> 04:40:28  
> {noformat}





[jira] [Commented] (BEAM-9507) Beam dependency check failing

2020-03-16 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060556#comment-17060556
 ] 

Ahmet Altay commented on BEAM-9507:
---

cc: [~yifanzou]

> Beam dependency check failing
> -
>
> Key: BEAM-9507
> URL: https://issues.apache.org/jira/browse/BEAM-9507
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Michał Walenia
>Priority: Major
>
> Here are the logs:
> [https://builds.apache.org/job/beam_Dependency_Check/257/console]
>  
>     from grpc_tools import protoc
> *13:04:25* ImportError: No module named 'grpc_tools'
> *13:04:25*
> *13:04:25* During handling of the above exception, another exception occurred:
> *13:04:25*
> *13:04:25* Traceback (most recent call last):
> *13:04:25*   File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
> *13:04:25*     self.run()
> *13:04:25*   File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
> *13:04:25*     self._target(*self._args, **self._kwargs)
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/gen_protos.py", line 378, in _install_grpcio_tools_and_generate_proto_files
> *13:04:25*     generate_proto_files(force=force)
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/gen_protos.py", line 315, in generate_proto_files
> *13:04:25*     protoc_gen_mypy = _find_protoc_gen_mypy()
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/gen_protos.py", line 233, in _find_protoc_gen_mypy
> *13:04:25*     (fname, ', '.join(search_paths)))
> *13:04:25* RuntimeError: Could not find protoc-gen-mypy in /home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/sdks/python/bin, /home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/sdks/python/bin, /home/jenkins/tools/java/latest1.8/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin, /usr/games, /usr/local/games
> *13:04:25* Traceback (most recent call last):
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/gen_protos.py", line 292, in generate_proto_files
> *13:04:25*     from grpc_tools import protoc
> *13:04:25* ImportError: No module named 'grpc_tools'
> *13:04:25*
> *13:04:25* During handling of the above exception, another exception occurred:
> *13:04:25*
> *13:04:25* Traceback (most recent call last):
> *13:04:25*   File "", line 1, in
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/setup.py", line 315, in
> *13:04:25*     'mypy': generate_protos_first(mypy),
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/sdks/python/lib/python3.5/site-packages/setuptools/__init__.py", line 144, in setup
> *13:04:25*     return distutils.core.setup(**attrs)
> *13:04:25*   File "/usr/lib/python3.5/distutils/core.py", line 148, in setup
> *13:04:25*     dist.run_commands()
> *13:04:25*   File "/usr/lib/python3.5/distutils/dist.py", line 955, in run_commands
> *13:04:25*     self.run_command(cmd)
> *13:04:25*   File "/usr/lib/python3.5/distutils/dist.py", line 974, in run_command
> *13:04:25*     cmd_obj.run()
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/setup.py", line 239, in run
> *13:04:25*     gen_protos.generate_proto_files()
> *13:04:25*   File "/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/gen_protos.py", line 310, in generate_proto_files
> *13:04:25*     raise ValueError("Proto generation failed (see log for details).")
> *13:04:25* ValueError: Proto generation failed (see log for details).
> *13:04:25*
> *13:04:25* ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
> *13:04:25*
> *13:04:25* > *Task :sdks:python:dependencyUpdates* FAILED
> *13:04:25*
> *13:04:25* FAILURE: Build failed with an exception.
> *13:04:25*
> *13:04:25* * Where:
> *13:04:25* Build file '/home/jenkins/jenkins-slave/workspace/beam_Dependency_Check/src/sdks/python/build.gradle' line: 94
> *13:04:25*
> *13:04:25* * What went wrong:
> *13:04:25* Execution failed for task ':sdks:python:dependencyUpdates'.
> *13:04:25* > Process 'command 'sh'' finished with non-zero exit value 1





[jira] [Assigned] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-16 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9510:
-

Assignee: Hannah Jiang

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: Hannah Jiang
>Priority: Major
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.
>  
>  
>  
>  
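
The four errors above can be reproduced mechanically. The sketch below is a simplified checker, not pip's resolver: it compares numeric version tuples only, so the "dev" markers in bounds such as 2.0dev and 0.30dev are dropped. The names `INSTALLED`, `REQUIREMENTS`, and `find_conflicts` are hypothetical; the versions and bounds are copied from the error messages.

```python
import operator

OPS = {">=": operator.ge, "<": operator.lt, "==": operator.eq}


def parse(version: str) -> tuple:
    """Turn '1.24.3' into (1, 24, 3). Simplification: numeric parts only."""
    return tuple(int(part) for part in version.split("."))


# Versions that pip reported as the ones "you'll have".
INSTALLED = {"google-cloud-core": "1.0.2", "grpcio": "1.22.0", "scipy": "1.2.2"}

# (requiring package, dependency, constraints), taken from the error messages.
REQUIREMENTS = [
    ("google-cloud-bigquery 1.24.0", "google-cloud-core",
     [(">=", "1.1.0"), ("<", "2.0")]),
    ("google-cloud-bigtable 0.32.1", "google-cloud-core",
     [(">=", "0.29.0"), ("<", "0.30")]),
    ("tensorboard 2.1.1", "grpcio", [(">=", "1.24.3")]),
    ("tensorflow 2.1.0", "scipy", [("==", "1.4.1")]),
]


def find_conflicts(installed, requirements):
    """Return (requirer, dependency, installed version) for each violation."""
    conflicts = []
    for requirer, dependency, constraints in requirements:
        have = parse(installed[dependency])
        if not all(OPS[op](have, parse(bound)) for op, bound in constraints):
            conflicts.append((requirer, dependency, installed[dependency]))
    return conflicts


for requirer, dependency, have in find_conflicts(INSTALLED, REQUIREMENTS):
    print(f"{requirer} is incompatible with {dependency} {have}")
```

Note that the two google-cloud-core requirements (>=1.1.0 and <0.30) cannot both be met by any single version, so at least one of the pinned packages has to move.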





[jira] [Commented] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-16 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060457#comment-17060457
 ] 

Ahmet Altay commented on BEAM-9510:
---

This file looks out of date. We can update it to match 
(https://cloud.google.com/dataflow/docs/concepts/sdk-worker-dependencies#sdk-for-python)

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Priority: Major
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.
>  
>  
>  
>  





[jira] [Created] (BEAM-9497) Update "Managing Python Pipeline Dependencies" page

2020-03-12 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9497:
-

 Summary: Update "Managing Python Pipeline Dependencies" page
 Key: BEAM-9497
 URL: https://issues.apache.org/jira/browse/BEAM-9497
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness, website
Reporter: Ahmet Altay


Update https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/ 
with concrete examples and make it easier to use.

Additional items to cover in this page:
- pickling / managing main session
- containers to the rescue: how they could help with this, and how to run 
containers locally (links?)





[jira] [Updated] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

2020-03-11 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9487:
--
Labels: EaseOfUse starter  (was: starter)

> GBKs on unbounded pcolls with global windows and no triggers should fail
> 
>
> Key: BEAM-9487
> URL: https://issues.apache.org/jira/browse/BEAM-9487
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>  Labels: EaseOfUse, starter
>
> This, according to "4.2.2.1 GroupByKey and unbounded PCollections" in 
> https://beam.apache.org/documentation/programming-guide/.
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded 
> PCollections without setting either a non-global windowing strategy, a 
> trigger strategy, or both for each collection, Beam generates an 
> IllegalStateException error at pipeline construction time.
> Example where this doesn't happen in Python SDK: 
> https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is 
> unbounded, uses global window, and has no triggers.
> {code}
> def test_global_window_gbk_fail(self):
>   with TestPipeline() as p:
>     test_stream = TestStream()
>     _ = p | test_stream | GroupByKey()
> {code}
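The rule quoted from the programming guide can be illustrated with a toy model. This is a sketch under stated assumptions, not Beam code: `PCollectionInfo` and `check_group_by_key` are hypothetical names, and the three flags stand in for whatever the SDK would inspect at pipeline construction time.

```python
from dataclasses import dataclass


@dataclass
class PCollectionInfo:
    """Toy description of a PCollection; not a Beam class."""
    is_bounded: bool
    uses_global_window: bool
    has_non_default_trigger: bool


def check_group_by_key(pcoll):
    """Reject the combination the programming guide says must fail."""
    unsafe = (not pcoll.is_bounded
              and pcoll.uses_global_window
              and not pcoll.has_non_default_trigger)
    if unsafe:
        raise ValueError(
            "GroupByKey on an unbounded PCollection needs a non-global "
            "window, a trigger, or both")


# Unbounded input in the global window with no trigger, as in the test above.
stream = PCollectionInfo(
    is_bounded=False, uses_global_window=True, has_non_default_trigger=False)
try:
    check_group_by_key(stream)
except ValueError as error:
    print("rejected:", error)
```

Under this model, setting either a non-global window or a trigger is enough to make the check pass, which matches the wording of the guide.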





[jira] [Updated] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

2020-03-11 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9487:
--
Labels: EaseOf starter  (was: EaseOfUse starter)

> GBKs on unbounded pcolls with global windows and no triggers should fail
> 
>
> Key: BEAM-9487
> URL: https://issues.apache.org/jira/browse/BEAM-9487
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>  Labels: EaseOf, starter
>
> This, according to "4.2.2.1 GroupByKey and unbounded PCollections" in 
> https://beam.apache.org/documentation/programming-guide/.
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded 
> PCollections without setting either a non-global windowing strategy, a 
> trigger strategy, or both for each collection, Beam generates an 
> IllegalStateException error at pipeline construction time.
> Example where this doesn't happen in Python SDK: 
> https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is 
> unbounded, uses global window, and has no triggers.
> {code}
> def test_global_window_gbk_fail(self):
>   with TestPipeline() as p:
>     test_stream = TestStream()
>     _ = p | test_stream | GroupByKey()
> {code}





[jira] [Updated] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

2020-03-11 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9487:
--
Labels: starter  (was: EaseOf starter)

> GBKs on unbounded pcolls with global windows and no triggers should fail
> 
>
> Key: BEAM-9487
> URL: https://issues.apache.org/jira/browse/BEAM-9487
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>  Labels: starter
>
> This, according to "4.2.2.1 GroupByKey and unbounded PCollections" in 
> https://beam.apache.org/documentation/programming-guide/.
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded 
> PCollections without setting either a non-global windowing strategy, a 
> trigger strategy, or both for each collection, Beam generates an 
> IllegalStateException error at pipeline construction time.
> Example where this doesn't happen in Python SDK: 
> https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is 
> unbounded, uses global window, and has no triggers.
> {code}
> def test_global_window_gbk_fail(self):
>   with TestPipeline() as p:
>     test_stream = TestStream()
>     _ = p | test_stream | GroupByKey()
> {code}





[jira] [Created] (BEAM-9388) Consider using github actions for building python wheels and more (aka. Transition from Travis)

2020-02-26 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9388:
-

 Summary: Consider using github actions for building python wheels 
and more (aka. Transition from Travis)
 Key: BEAM-9388
 URL: https://issues.apache.org/jira/browse/BEAM-9388
 Project: Beam
  Issue Type: Bug
  Components: build-system, sdk-py-core
Reporter: Ahmet Altay


Context on the mailing list: 
https://lists.apache.org/thread.html/r4a7d34e64a34e9fe589d06aec74d9b464d252c516fe96c35b2d6c9ae%40%3Cdev.beam.apache.org%3E

Use GitHub Actions instead of Travis for building Python wheels during 
releases. This will have the following advantages:

- We will eliminate one repo. (If you don't know, we have 
https://github.com/apache/beam-wheels for the sole purpose of building wheel 
files.)
- The workflow will be stored in the same repo. This will prevent bit rot that 
is only discovered at release time. (This has happened a few times, although it 
is usually easy to fix.)
- GitHub Actions supports Ubuntu, macOS, and Windows environments. We could try 
to build wheels for Windows as well. (Travis also supports the same 
environments, but we only use the Linux and macOS environments. Maybe there are 
other blockers for building wheels for Windows.)
- We could do more, like daily Python builds.






[jira] [Resolved] (BEAM-9248) [Python] PTransform that integrates Cloud Natural Language functionality

2020-02-26 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-9248.
---
Resolution: Fixed

> [Python] PTransform that integrates Cloud Natural Language functionality
> 
>
> Key: BEAM-9248
> URL: https://issues.apache.org/jira/browse/BEAM-9248
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-py-gcp
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The goal is to create a PTransform that integrates Google Cloud Natural 
> Language API functionality [1].
> [1] https://cloud.google.com/natural-language/docs/





[jira] [Updated] (BEAM-9248) [Python] PTransform that integrates Cloud Natural Language functionality

2020-02-26 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9248:
--
Fix Version/s: 2.20.0

> [Python] PTransform that integrates Cloud Natural Language functionality
> 
>
> Key: BEAM-9248
> URL: https://issues.apache.org/jira/browse/BEAM-9248
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-py-gcp
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The goal is to create a PTransform that integrates Google Cloud Natural 
> Language API functionality [1].
> [1] https://cloud.google.com/natural-language/docs/





[jira] [Commented] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-02-25 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044707#comment-17044707
 ] 

Ahmet Altay commented on BEAM-8618:
---

Do we need a fix version set for this? This appears to be an improvement. Could 
we remove the fix version tag?

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Per the discussion on the mailing list (details in [1]), the teardown of DoFns 
> should be supported in the portability framework. It happens in two places:
> 1) Upon control service termination
> 2) Periodically, for unused DoFns
> The aim of this JIRA is to add support for tearing down unused DoFns 
> periodically in the Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E
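The periodic teardown described above could look roughly like the following. This is only a sketch of the idea, not the SDK harness implementation: `IdleTeardownCache`, `FakeDoFn`, and the 60-second threshold are hypothetical, and the clock is injectable so the behaviour can be shown without sleeping.

```python
import time


class IdleTeardownCache:
    """Keeps objects by key and tears down those unused for too long."""

    def __init__(self, idle_seconds, clock=time.monotonic):
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._entries = {}  # key -> (object, time of last use)

    def get(self, key, factory):
        """Return the cached object for key, creating it on first use."""
        if key in self._entries:
            obj, _ = self._entries[key]
        else:
            obj = factory()
        self._entries[key] = (obj, self._clock())
        return obj

    def evict_idle(self):
        """Tear down and drop every entry idle for idle_seconds or more."""
        now = self._clock()
        stale = [key for key, (_, used) in self._entries.items()
                 if now - used >= self._idle_seconds]
        for key in stale:
            obj, _ = self._entries.pop(key)
            obj.teardown()
        return stale


class FakeDoFn:
    """Hypothetical stand-in that only records whether teardown ran."""

    def __init__(self):
        self.torn_down = False

    def teardown(self):
        self.torn_down = True


# Drive the cache with a fake clock instead of real time.
now = [0.0]
cache = IdleTeardownCache(idle_seconds=60, clock=lambda: now[0])
first = cache.get("first", FakeDoFn)
now[0] = 30.0
second = cache.get("second", FakeDoFn)
now[0] = 70.0
evicted = cache.evict_idle()
print(evicted, first.torn_down, second.torn_down)
```

In a real harness `evict_idle` would be called from a timer or background thread, and access to the entries would need a lock.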





[jira] [Commented] (BEAM-9250) Improve beam release script based on 2.19.0 release experience

2020-02-25 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044706#comment-17044706
 ] 

Ahmet Altay commented on BEAM-9250:
---

Is this a blocker for 2.20?

> Improve beam release script based on 2.19.0 release experience
> --
>
> Key: BEAM-9250
> URL: https://issues.apache.org/jira/browse/BEAM-9250
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041441#comment-17041441
 ] 

Ahmet Altay commented on BEAM-9319:
---

Thank you!

Do we still need to address the leaking problem?

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Brian Hulette
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Commented] (BEAM-9119) apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTest[...].test_large_elements is flaky

2020-02-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041277#comment-17041277
 ] 

Ahmet Altay commented on BEAM-9119:
---

This came up again here: https://github.com/apache/beam/pull/10731

Should we disable this test?

> apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerTest[...].test_large_elements
>  is flaky
> 
>
> Key: BEAM-9119
> URL: https://issues.apache.org/jira/browse/BEAM-9119
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Robert Bradshaw
>Priority: Major
>
> Saw 3 errors today, all manifesting as:
> IndexError: index out of range in "apache_beam/coders/slow_stream.py", line 
> 169, in read_byte_py3.
> https://builds.apache.org/job/beam_PreCommit_Python_Phrase/1369
> https://builds.apache.org/job/beam_PreCommit_Python_Phrase/1365
> https://builds.apache.org/job/beam_PreCommit_Python_Phrase/1370
> Sample logs:
> {noformat}
> 12:10:27  === FAILURES 
> ===
> 12:10:27   FnApiRunnerTestWithDisabledCaching.test_large_elements 
> 
> 12:10:27  [gw0] linux -- Python 3.6.8 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Phrase/src/sdks/python/test-suites/tox/py36/build/srcs/sdks/python/target/.tox-py36-gcp-pytest/py36-gcp-pytest/bin/python
> 12:10:27  
> 12:10:27  self = 
>   testMethod=test_large_elements>
> 12:10:27  
> 12:10:27  def test_large_elements(self):
> 12:10:27with self.create_pipeline() as p:
> 12:10:27  big = (p
> 12:10:27 | beam.Create(['a', 'a', 'b'])
> 12:10:27 | beam.Map(lambda x: (
> 12:10:27 x, x * 
> data_plane._DEFAULT_SIZE_FLUSH_THRESHOLD)))
> 12:10:27  
> 12:10:27  side_input_res = (
> 12:10:27  big
> 12:10:27  | beam.Map(lambda x, side: (x[0], side.count(x[0])),
> 12:10:27 beam.pvalue.AsList(big | beam.Map(lambda x: 
> x[0]
> 12:10:27  assert_that(side_input_res,
> 12:10:27  equal_to([('a', 2), ('a', 2), ('b', 1)]), 
> label='side')
> 12:10:27  
> 12:10:27  gbk_res = (
> 12:10:27  big
> 12:10:27  | beam.GroupByKey()
> 12:10:27  | beam.Map(lambda x: x[0]))
> 12:10:27  >   assert_that(gbk_res, equal_to(['a', 'b']), label='gbk')
> 12:10:27  
> 12:10:27  apache_beam/runners/portability/fn_api_runner_test.py:617: 
> 12:10:27  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ _ _ _ _ _ 
> 12:10:27  apache_beam/pipeline.py:479: in __exit__
> 12:10:27  self.run().wait_until_finish()
> 12:10:27  apache_beam/pipeline.py:459: in run
> 12:10:27  self._options).run(False)
> 12:10:27  apache_beam/pipeline.py:472: in run
> 12:10:27  return self.runner.run_pipeline(self, self._options)
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:472: in 
> run_pipeline
> 12:10:27  default_environment=self._default_environment))
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:480: in 
> run_via_runner_api
> 12:10:27  return self.run_stages(stage_context, stages)
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:569: in run_stages
> 12:10:27  stage_context.safe_coders)
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:889: in _run_stage
> 12:10:27  result, splits = bundle_manager.process_bundle(data_input, 
> data_output)
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:2076: in 
> process_bundle
> 12:10:27  part, expected_outputs), part_inputs):
> 12:10:27  /usr/lib/python3.6/concurrent/futures/_base.py:586: in 
> result_iterator
> 12:10:27  yield fs.pop().result()
> 12:10:27  /usr/lib/python3.6/concurrent/futures/_base.py:432: in result
> 12:10:27  return self.__get_result()
> 12:10:27  /usr/lib/python3.6/concurrent/futures/_base.py:384: in __get_result
> 12:10:27  raise self._exception
> 12:10:27  apache_beam/utils/thread_pool_executor.py:44: in run
> 12:10:27  self._future.set_result(self._fn(*self._fn_args, 
> **self._fn_kwargs))
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:2076: in 
> 12:10:27  part, expected_outputs), part_inputs):
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:2020: in 
> process_bundle
> 12:10:27  expected_outputs[output.transform_id]).append(output.data)
> 12:10:27  apache_beam/runners/portability/fn_api_runner.py:285: in append
> 12:10:27  windowed_key_value = 
> coder_impl.decode_from_stream(input_stream, True)
> 12:10:27  apache_beam/coders/coder_impl.py:1153: in decode_from_stream
> 12:10:27  value = 

[jira] [Reopened] (BEAM-1080) python sdk apiclient needs proper unit tests

2020-02-18 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reopened BEAM-1080:
---
  Assignee: (was: Pablo Estrada)

> python sdk apiclient needs proper unit tests
> 
>
> Key: BEAM-1080
> URL: https://issues.apache.org/jira/browse/BEAM-1080
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Vikas Kedigehalli
>Priority: Major
>  Labels: ccoss2019, newbie, starter
> Fix For: 2.17.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There is only one unit test right now that tries to fetch actual gcp 
> credentials instead of mocking. This test fails when the credentials are not 
> available on the machine in which it is running. 
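A test that mocks the credential lookup instead of fetching real credentials might be shaped like this. It is a sketch under assumptions: `ApiClient`, `_fetch_credentials`, and `auth_header` are hypothetical names, not the actual apiclient code, and the lookup is made to fail on purpose to mimic a machine without credentials.

```python
import unittest
from unittest import mock


class ApiClient:
    """Hypothetical client; stands in for code that needs GCP credentials."""

    def _fetch_credentials(self):
        # Simulates a machine where no credentials are available.
        raise RuntimeError("no credentials available on this machine")

    def auth_header(self):
        return "Bearer " + self._fetch_credentials()


class ApiClientTest(unittest.TestCase):

    def test_auth_header_uses_credentials(self):
        # Replace the lookup for the duration of the test only.
        with mock.patch.object(
                ApiClient, "_fetch_credentials", return_value="fake-token"):
            self.assertEqual(ApiClient().auth_header(), "Bearer fake-token")
```

Because the patch is scoped to the `with` block, the test passes on any machine and the real lookup is restored afterwards.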





[jira] [Updated] (BEAM-1080) python sdk apiclient needs proper unit tests

2020-02-18 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-1080:
--
Fix Version/s: (was: 2.17.0)

> python sdk apiclient needs proper unit tests
> 
>
> Key: BEAM-1080
> URL: https://issues.apache.org/jira/browse/BEAM-1080
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Vikas Kedigehalli
>Priority: Major
>  Labels: ccoss2019, newbie, starter
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There is only one unit test right now that tries to fetch actual gcp 
> credentials instead of mocking. This test fails when the credentials are not 
> available on the machine in which it is running. 





[jira] [Commented] (BEAM-9185) Publish pre-release python artifacts (RCs) to PyPI

2020-02-18 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039269#comment-17039269
 ] 

Ahmet Altay commented on BEAM-9185:
---

Thank you [~zhitao]. During an RC validation there will be a fixed link 
(example: 
https://dist.apache.org/repos/dist/dev/beam/2.19.0/python/apache-beam-2.19.0.zip).
The same folder will also contain the wheel files.

> Publish pre-release python artifacts (RCs) to PyPI
> --
>
> Key: BEAM-9185
> URL: https://issues.apache.org/jira/browse/BEAM-9185
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> This was discussed in the mailing list and there was consensus [1].
> Remaining part for pypi would be updating the release process such as:
>  * New RC versioned artifacts are generated along with actual non-RC 
> versioned artifacts.
>  * RC versioned artifacts are published to pypi.
>  [1] 
> [https://lists.apache.org/thread.html/f071f8ab9f115636b9e6a6cabcfccbe2bb980d4394fe5581c59a4db6%40%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-9185) Publish pre-release python artifacts (RCs) to PyPI

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037402#comment-17037402
 ] 

Ahmet Altay commented on BEAM-9185:
---

As an alternative, could we ask users to run 

`pip install 
https://dist.apache.org/repos/dist/dev/beam/latest/python/apache-beam-2.19.0.zip`
 as part of the RC validation during the RC process? However, if we would like 
users to integrate this into their nightly tests or similar, we need to remove 
the version-specific part (2.19.0 etc.) and replace it with a generic string 
(like latest).

cc: [~katsia...@google.com] -- We know this is important for TFX. Could you 
include the relevant folks so that we can get input on this?

> Publish pre-release python artifacts (RCs) to PyPI
> --
>
> Key: BEAM-9185
> URL: https://issues.apache.org/jira/browse/BEAM-9185
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> This was discussed in the mailing list and there was consensus [1].
> Remaining part for pypi would be updating the release process such as:
>  * New RC versioned artifacts are generated along with actual non-RC 
> versioned artifacts.
>  * RC versioned artifacts are published to pypi.
>  [1] 
> [https://lists.apache.org/thread.html/f071f8ab9f115636b9e6a6cabcfccbe2bb980d4394fe5581c59a4db6%40%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-9252) Problem shading Beam pipeline with Beam 2.20.0-SNAPSHOT

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037396#comment-17037396
 ] 

Ahmet Altay commented on BEAM-9252:
---

Would this be resolved when https://issues.apache.org/jira/browse/BEAM-9288 is 
fixed?

> Problem shading Beam pipeline with Beam 2.20.0-SNAPSHOT
> ---
>
> Key: BEAM-9252
> URL: https://issues.apache.org/jira/browse/BEAM-9252
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Affects Versions: 2.20.0
>Reporter: Ismaël Mejía
>Priority: Critical
> Fix For: 2.20.0
>
>
> I was checking today a pipeline against the latest 2.20.0-SNAPSHOT and I 
> found that it works perfectly with version 2.19.0, but it is failing with a  
> shade related exception that refers to grpc 1.26.0:
> {{[ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-shade-plugin:3.2.1:shade (default) on project 
> EventsToIOs: Error creating shaded jar: Problem shading JAR 
> /home/ismael/.m2/repository/org/apache/beam/beam-vendor-grpc-1_26_0/0.1/beam-vendor-grpc-1_26_0-0.1.jar
>  entry org/apache/beam/vendor/grpc/v1p26p0/org/jboss/modules/Main.class: 
> org.apache.maven.plugin.MojoExecutionException: Error in ASM processing class 
> org/apache/beam/vendor/grpc/v1p26p0/org/jboss/modules/Main.class: 65536 -> 
> [Help 1]}}
> {{There is also a warning that is not present in the build against 2.19.0}}
> {{[WARNING] Discovered module-info.class. Shading will break its strong 
> encapsulation.}}
>  
> I wonder if we are not doing something wrong during our vendoring, can 
> someone take a look please.
> This is relatively easy to reproduce with the beam-samples repo, just clone 
> it and run:
> {noformat}
> git clone https://github.com/jbonofre/beam-samples
> cd beam-samples
> mvn clean verify -Pbeam-release-repo -Dbeam.version=2.20.0-SNAPSHOT
> {noformat}
> Available logs of the latest run:
> [https://github.com/jbonofre/beam-samples/runs/427537544?check_suite_focus=true]
>  





[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037390#comment-17037390
 ] 

Ahmet Altay commented on BEAM-9319:
---

Brian, can you clean up some older topics to at least unblock the failing tests? 
We can temporarily disable the pubsubio test if we think it is consistently leaking.

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037389#comment-17037389
 ] 

Ahmet Altay commented on BEAM-9319:
---

I do not know if Pub/Sub supports TTLs for topics. Subscriptions have a 31-day 
default TTL.

[~dpcollins-google] might actually know the answer.

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037246#comment-17037246
 ] 

Ahmet Altay commented on BEAM-9319:
---

- Let's start by cleaning up the leaked resources.
- If possible, it would be good to add a TTL to all new topics (7 days is 
probably a good value).
- I am pretty sure the tests have logic to clean up. The leaked topics are from 
various dates. If there were a leak on the normal path, this would have failed 
much earlier.
- A test framework improvement might be needed. Is it possible that some 
teardowns are not executed? (Maybe Jenkins stops execution?)
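
On the question of teardowns that never run: in Python's `unittest`, `tearDown` is skipped when `setUp` itself raises, while callbacks registered with `addCleanup` still run. The traceback in this issue fails inside `setUp`, so a topic created earlier in the same `setUp` could leak that way. A minimal sketch of the difference, assuming a hypothetical `FakePublisher` stand-in (not the Pub/Sub client) with a quota of one topic, so that it runs without GCP access:

```python
import io
import unittest


class FakePublisher(object):
    """Hypothetical stand-in for a Pub/Sub publisher with a topic quota."""

    def __init__(self, max_topics):
        self.max_topics = max_topics
        self.topics = set()

    def create_topic(self, name):
        if len(self.topics) >= self.max_topics:
            raise RuntimeError('topics-per-project quota exceeded')
        self.topics.add(name)

    def delete_topic(self, name):
        self.topics.discard(name)


def leaked_topics(use_add_cleanup):
    """Runs one test whose setUp hits the quota; returns leftover topics."""
    publisher = FakePublisher(max_topics=1)

    class Case(unittest.TestCase):

        def setUp(self):
            for name in ('input', 'output'):
                # The second call raises because the quota is 1.
                publisher.create_topic(name)
                if use_add_cleanup:
                    # Registered right after creation, so it still runs
                    # when a later line of setUp raises.
                    self.addCleanup(publisher.delete_topic, name)

        def tearDown(self):
            # Not reached when setUp raises.
            for name in ('input', 'output'):
                publisher.delete_topic(name)

        def test_noop(self):
            pass

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Case)
    unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return sorted(publisher.topics)
```

With `use_add_cleanup=False` the `input` topic is left behind; with `use_add_cleanup=True` nothing remains. Whether this is the actual cause of the leaked topics here is not established.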

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037189#comment-17037189
 ] 

Ahmet Altay commented on BEAM-9319:
---

Looking at 
https://console.cloud.google.com/cloudpubsub/topic/list?folder===apache-beam-testing

There are some leaky tests:
exercise_streaming_metrics_topic_input000ec192-97fe-4885-b5a4-6bfb92a9ea7e
integ-test-PubsubJsonIT-testSelectsPayloadContent-2019-10-03-16-11-52-408-start--2927088586724238875
integ-test-PubsubJsonIT-testSQLInsertJsonRowsToPubsubFlat-2020-01-16-19-41-19-267-start-7261005531427738352
...

In particular, integ-test-PubsubJsonIT-* has leaked 1000s of topics. Should we 
clean those up?
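
A rough sketch of how stale topics of this kind could be picked out by name. It assumes that the `yyyy-MM-dd-HH-mm-ss-SSS` stamp embedded in the `integ-test-*` names is the creation time, which is an assumption on my part; `created_at` and `stale_topics` are hypothetical helpers, and the actual deletion call is left out on purpose:

```python
import re
from datetime import datetime, timedelta

# Matches the yyyy-MM-dd-HH-mm-ss-SSS stamp seen in the integ-test-* names.
_STAMP = re.compile(
    r'-(\d{4})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-\d{3}-')


def created_at(topic_name):
    """Returns the datetime embedded in the name, or None if absent."""
    match = _STAMP.search(topic_name)
    if match is None:
        return None
    return datetime(*(int(part) for part in match.groups()))


def stale_topics(topic_names, now, max_age=timedelta(days=7)):
    """Returns the names whose embedded stamp is older than max_age."""
    stale = []
    for name in topic_names:
        created = created_at(name)
        # Names without a recognizable stamp are skipped, not deleted.
        if created is not None and now - created > max_age:
            stale.append(name)
    return stale
```

Topics without a stamp, such as the `exercise_streaming_metrics_topic_input*` one, would need a different rule.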

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Commented] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17037170#comment-17037170
 ] 

Ahmet Altay commented on BEAM-9319:
---

/cc: [~markflyhigh]

> ResourceExhausted: topics-per-project
> -
>
> Key: BEAM-9319
> URL: https://issues.apache.org/jira/browse/BEAM-9319
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Major
>
> Tests are failing due to quota issues. Do we need to clean up topics after 
> tests or set a shorter TTL?
> Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/
> Error: 
> 08:24:40 
> ==
> 08:24:40 ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> 08:24:40 
> --
> 08:24:40 Traceback (most recent call last):
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 58, in setUp
> 08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + 
> self.uuid))
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
>  line 40, in <lambda>
> 08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # 
> noqa
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
>  line 332, in create_topic
> 08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
>  line 143, in __call__
> 08:24:40 return wrapped_func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 286, in retry_wrapped_func
> 08:24:40 on_error=on_error,
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
>  line 184, in retry_target
> 08:24:40 return target()
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
>  line 214, in func_with_timeout
> 08:24:40 return func(*args, **kwargs)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
>  line 59, in error_remapped_callable
> 08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
> 08:24:40   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
>  line 738, in raise_from
> 08:24:40 raise value
> 08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
> (type="topics-per-project", current=1, maximum=1).





[jira] [Created] (BEAM-9319) ResourceExhausted: topics-per-project

2020-02-14 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9319:
-

 Summary: ResourceExhausted: topics-per-project
 Key: BEAM-9319
 URL: https://issues.apache.org/jira/browse/BEAM-9319
 Project: Beam
  Issue Type: Bug
  Components: test-failures, testing
Reporter: Ahmet Altay
Assignee: Yifan Zou


Tests are failing due to quota issues. Do we need to clean up topics after 
tests or set a shorter TTL?

Log: https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/

Error: 
08:24:40 ==
08:24:40 ERROR: test_streaming_wordcount_it 
(apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
08:24:40 --
08:24:40 Traceback (most recent call last):
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
 line 58, in setUp
08:24:40 self.pub_client.topic_path(self.project, INPUT_TOPIC + self.uuid))
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/_gapic.py",
 line 40, in <lambda>
08:24:40 fx = lambda self, *a, **kw: wrapped_fx(self.api, *a, **kw)  # noqa
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/cloud/pubsub_v1/gapic/publisher_client.py",
 line 332, in create_topic
08:24:40 request, retry=retry, timeout=timeout, metadata=metadata
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/gapic_v1/method.py",
 line 143, in __call__
08:24:40 return wrapped_func(*args, **kwargs)
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
 line 286, in retry_wrapped_func
08:24:40 on_error=on_error,
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/retry.py",
 line 184, in retry_target
08:24:40 return target()
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/timeout.py",
 line 214, in func_with_timeout
08:24:40 return func(*args, **kwargs)
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/google/api_core/grpc_helpers.py",
 line 59, in error_remapped_callable
08:24:40 six.raise_from(exceptions.from_grpc_error(exc), exc)
08:24:40   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/build/gradleenv/-194514014/local/lib/python2.7/site-packages/six.py",
 line 738, in raise_from
08:24:40 raise value
08:24:40 ResourceExhausted: 429 Your project has exceeded a limit: 
(type="topics-per-project", current=1, maximum=1).





[jira] [Created] (BEAM-9318) Py 2 Precommit Flake: PortableRunnerTestWithLocalDocker test flaky

2020-02-14 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9318:
-

 Summary: Py 2 Precommit Flake: PortableRunnerTestWithLocalDocker 
test flaky
 Key: BEAM-9318
 URL: https://issues.apache.org/jira/browse/BEAM-9318
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, testing
Reporter: Ahmet Altay
Assignee: Robert Bradshaw


Log: [https://builds.apache.org/job/beam_PreCommit_Python_Commit/11178/]

Precommit PR: (This looks like an unrelated change): 
https://github.com/apache/beam/pull/10856

Maybe result object is not available yet?

Error:
08:29:55 self = 

08:29:55 check_gauge = True
08:29:55 
08:29:55 def test_metrics(self, check_gauge=True):
08:29:55 p = self.create_pipeline()
08:29:55 
08:29:55 counter = beam.metrics.Metrics.counter('ns', 'counter')
08:29:55 distribution = beam.metrics.Metrics.distribution('ns', 'distribution')
08:29:55 gauge = beam.metrics.Metrics.gauge('ns', 'gauge')
08:29:55 
08:29:55 pcoll = p | beam.Create(['a', 'zzz'])
08:29:55 # pylint: disable=expression-not-assigned
08:29:55 pcoll | 'count1' >> beam.FlatMap(lambda x: counter.inc())
08:29:55 pcoll | 'count2' >> beam.FlatMap(lambda x: counter.inc(len(x)))
08:29:55 pcoll | 'dist' >> beam.FlatMap(lambda x: distribution.update(len(x)))
08:29:55 pcoll | 'gauge' >> beam.FlatMap(lambda x: gauge.set(3))
08:29:55 
08:29:55 res = p.run()
08:29:55 res.wait_until_finish()
08:29:55 > c1, = 
res.metrics().query(beam.metrics.MetricsFilter().with_step('count1'))[
08:29:55 'counters']
08:29:55 
08:29:55 apache_beam/runners/portability/fn_api_runner_test.py:699: 
08:29:55 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ 
08:29:55 apache_beam/runners/portability/portable_runner.py:415: in metrics
08:29:55 beam_job_api_pb2.GetJobMetricsRequest(job_id=self._job_id))
08:29:55 _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ 
08:29:55 
08:29:55 self = 

08:29:55 request = job_id: "job-be9e2a01-154f-4707-b04a-3d7ffbc39afb"
08:29:55 , context = None
08:29:55 
08:29:55 def GetJobMetrics(self, request, context=None):
08:29:55 if request.job_id not in self._jobs:
08:29:55 raise LookupError("Job {} does not exist".format(request.job_id))
08:29:55 
08:29:55 result = self._jobs[request.job_id].result
08:29:55 monitoring_info_list = []
08:29:55 > for mi in result._monitoring_infos_by_stage.values():
08:29:55 E AttributeError: 'NoneType' object has no attribute 
'_monitoring_infos_by_stage'
08:29:55 
08:29:55 apache_beam/runners/portability/local_job_service.py:157: 
AttributeError
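
If the hypothesis above holds (the job's `result` is still `None` when metrics are requested), one defensive option is to report that explicitly instead of failing with `AttributeError`. The sketch below is only an illustration of that idea: `Job`, `JobNotReadyError`, and `get_job_metrics` are hypothetical names, and this is not the local job service code or its eventual fix.

```python
class JobNotReadyError(LookupError):
    """Raised when metrics are requested before a job has a result."""


class Job(object):
    """Hypothetical job record; `result` stays None until the run ends."""

    def __init__(self):
        self.result = None


def get_job_metrics(jobs, job_id):
    """Returns a flat list of monitoring infos for a finished job."""
    if job_id not in jobs:
        raise LookupError('Job {} does not exist'.format(job_id))
    result = jobs[job_id].result
    if result is None:
        # Explicit, descriptive error instead of AttributeError on None.
        raise JobNotReadyError('Job {} has no result yet'.format(job_id))
    monitoring_info_list = []
    # Here `result` is modeled as a dict of stage name -> list of infos.
    for infos in result.values():
        monitoring_info_list.extend(infos)
    return monitoring_info_list
```

A caller could then retry on `JobNotReadyError`, which would also confirm or rule out the timing hypothesis.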






[jira] [Commented] (BEAM-9247) [Python] PTransform that integrates Cloud Vision functionality

2020-02-13 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17036348#comment-17036348
 ] 

Ahmet Altay commented on BEAM-9247:
---

I think this will be a confusing API. I would suggest two PTransforms of the form:
 # input type: Tuple[Union[text_type, binary_type], vision.types.ImageContext]
 # input type: Union[text_type, binary_type], plus a side input with 
vision.types.ImageContext (the side input could be provided statically or 
dynamically).

If it is possible to build a unified PTransform with reduced code, the above 
two could be offered with minimal wrapper code on top of it.
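
To make the shape of the suggestion concrete, here is a plain-Python sketch of the two input forms, with the second one written as a thin wrapper around the first. It uses ordinary generator functions in place of PTransforms so that it runs without Beam or the Vision client; `build_requests_with_context`, `build_requests`, and the request dict layout are hypothetical, and `str`/`bytes` stand in for `text_type`/`binary_type`.

```python
from typing import Any, Iterable, Iterator, Optional, Tuple, Union

ImageRef = Union[str, bytes]  # image URI or raw image bytes
ImageContext = Any            # placeholder for vision.types.ImageContext


def build_requests_with_context(
        elements: Iterable[Tuple[ImageRef, Optional[ImageContext]]]
) -> Iterator[dict]:
    """Form 1: every element carries its own context."""
    for image, context in elements:
        key = 'uri' if isinstance(image, str) else 'content'
        yield {key: image, 'context': context}


def build_requests(
        images: Iterable[ImageRef],
        context: Optional[ImageContext] = None) -> Iterator[dict]:
    """Form 2: one shared context, supplied the way a side input would be."""
    # Pair each image with the shared context and reuse form 1.
    return build_requests_with_context(
        (image, context) for image in images)
```

The point of the sketch is only that form 2 needs no logic of its own once form 1 exists.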

> [Python] PTransform that integrates Cloud Vision functionality
> --
>
> Key: BEAM-9247
> URL: https://issues.apache.org/jira/browse/BEAM-9247
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-py-gcp
>Reporter: Kamil Wasilewski
>Assignee: Elias Djurfeldt
>Priority: Major
>
> The goal is to create a PTransform that integrates Google Cloud Vision API 
> functionality [1].
> [1] https://cloud.google.com/vision/docs/





[jira] [Resolved] (BEAM-9202) lintPy37 precommit broken

2020-01-27 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-9202.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> lintPy37 precommit broken
> -
>
> Key: BEAM-9202
> URL: https://issues.apache.org/jira/browse/BEAM-9202
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Udi Meiri
>Assignee: Chad Dombrova
>Priority: Major
> Fix For: Not applicable
>
>
> Culprit: https://github.com/apache/beam/pull/10683
> Jenkins tests are not started automatically.
> {code}
> 09:47:37 > Task :sdks:python:test-suites:tox:py37:lintPy37
> 09:47:37 * Module apache_beam.io.gcp.datastore.v1new.types
> 09:47:37 apache_beam/io/gcp/datastore/v1new/types.py:47:0: C0301: Line too 
> long (87/80) (line-too-long)
> {code}
> https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2033/console





[jira] [Commented] (BEAM-9202) lintPy37 precommit broken

2020-01-27 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024807#comment-17024807
 ] 

Ahmet Altay commented on BEAM-9202:
---

Yes, it is fixed here: [https://github.com/apache/beam/pull/10697]

> lintPy37 precommit broken
> -
>
> Key: BEAM-9202
> URL: https://issues.apache.org/jira/browse/BEAM-9202
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Udi Meiri
>Assignee: Chad Dombrova
>Priority: Major
>
> Culprit: https://github.com/apache/beam/pull/10683
> Jenkins tests are not started automatically.
> {code}
> 09:47:37 > Task :sdks:python:test-suites:tox:py37:lintPy37
> 09:47:37 * Module apache_beam.io.gcp.datastore.v1new.types
> 09:47:37 apache_beam/io/gcp/datastore/v1new/types.py:47:0: C0301: Line too 
> long (87/80) (line-too-long)
> {code}
> https://builds.apache.org/job/beam_PreCommit_PythonLint_Commit/2033/console





[jira] [Created] (BEAM-9185) Publish pre-release python artifacts (RCs) to PyPI

2020-01-23 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-9185:
-

 Summary: Publish pre-release python artifacts (RCs) to PyPI
 Key: BEAM-9185
 URL: https://issues.apache.org/jira/browse/BEAM-9185
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Ahmet Altay
Assignee: Valentyn Tymofieiev


This was discussed on the mailing list and there was consensus [1].

The remaining work for PyPI is updating the release process so that:
 * New RC-versioned artifacts are generated along with the actual 
non-RC-versioned artifacts.
 * RC-versioned artifacts are published to PyPI.

 [1] 
[https://lists.apache.org/thread.html/f071f8ab9f115636b9e6a6cabcfccbe2bb980d4394fe5581c59a4db6%40%3Cdev.beam.apache.org%3E]
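
For illustration, an RC-versioned artifact would carry a PEP 440 pre-release version such as `2.19.0rc1`. A minimal sketch of deriving that string from the final version; `rc_version` is a hypothetical helper, not part of the Beam release scripts:

```python
import re

_FINAL = re.compile(r'^\d+\.\d+\.\d+$')


def rc_version(final_version, rc_number):
    """Returns the PEP 440 release-candidate version, e.g. 2.19.0rc1."""
    if not _FINAL.match(final_version):
        raise ValueError('expected X.Y.Z, got %r' % (final_version,))
    if rc_number < 1:
        raise ValueError('rc_number must be >= 1')
    return '{}rc{}'.format(final_version, rc_number)
```

By default pip skips pre-release versions unless `--pre` is passed or the version is pinned explicitly, so publishing RCs this way should not change what regular users install.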





[jira] [Updated] (BEAM-9137) beam_PostCommit_Py_ValCont should run with dataflow_worker_jar

2020-01-17 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9137:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> beam_PostCommit_Py_ValCont should run with dataflow_worker_jar
> --
>
> Key: BEAM-9137
> URL: https://issues.apache.org/jira/browse/BEAM-9137
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Boyuan Zhang
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> For the first failure, please refer to 
> https://builds.apache.org/job/beam_PostCommit_Py_ValCont/5172/#showFailuresLink





[jira] [Assigned] (BEAM-9137) beam_PostCommit_Py_ValCont should run with dataflow_worker_jar

2020-01-17 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9137:
-

Assignee: Valentyn Tymofieiev

> beam_PostCommit_Py_ValCont should run with dataflow_worker_jar
> --
>
> Key: BEAM-9137
> URL: https://issues.apache.org/jira/browse/BEAM-9137
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Boyuan Zhang
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> For the first failure, please refer to 
> https://builds.apache.org/job/beam_PostCommit_Py_ValCont/5172/#showFailuresLink





[jira] [Updated] (BEAM-7689) Temporary directory for WriteOperation may not be unique in FileBaseSink

2020-01-17 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-7689:
--
Affects Version/s: 2.5.0
   2.6.0
   2.7.0
   2.8.0
   2.9.0
   2.10.0
   2.11.0
   2.12.0
   2.13.0

> Temporary directory for WriteOperation may not be unique in FileBaseSink
> 
>
> Key: BEAM-7689
> URL: https://issues.apache.org/jira/browse/BEAM-7689
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-files
>Affects Versions: 2.5.0, 2.6.0, 2.7.0, 2.8.0, 2.9.0, 2.10.0, 2.11.0, 
> 2.12.0, 2.13.0
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Blocker
> Fix For: 2.14.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The temporary directory for WriteOperation in FileBasedSink is generated from a 
> second-granularity timestamp (yyyy-MM-dd_HH-mm-ss) and a unique increasing 
> index. Such granularity is not enough to make temporary directories unique 
> between different jobs. When two jobs share the same temporary directory, 
> the output file may not be produced by one job because the required temporary 
> file can be deleted by another job.
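
The collision described above can be sketched in a few lines. The sink itself is Java; this Python version only illustrates the naming problem, and the directory layout, `timestamp_only_dir`, and `unique_dir` are hypothetical. Appending a random UUID is one possible way to separate jobs, not necessarily what the fix does.

```python
import uuid
from datetime import datetime


def timestamp_only_dir(base, now, index):
    """Second-granularity stamp plus index, as in the description above."""
    stamp = now.strftime('%Y-%m-%d_%H-%M-%S')
    return '{}/temp-{}-{}'.format(base, stamp, index)


def unique_dir(base, now, index):
    """Same scheme with a random suffix, so separate jobs do not collide."""
    return '{}-{}'.format(
        timestamp_only_dir(base, now, index), uuid.uuid4().hex)
```

Two jobs that start within the same second and both begin their index at the same value get the same `timestamp_only_dir`, while `unique_dir` differs between them.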



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-9078) Large Tarball Artifacts Should Use GCS Resumable Upload

2020-01-09 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-9078:
-

Assignee: Brad West

> Large Tarball Artifacts Should Use GCS Resumable Upload
> ---
>
> Key: BEAM-9078
> URL: https://issues.apache.org/jira/browse/BEAM-9078
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.17.0
>Reporter: Brad West
>Assignee: Brad West
>Priority: Major
> Fix For: 2.19.0
>
>   Original Estimate: 1h
>  Time Spent: 20m
>  Remaining Estimate: 40m
>
> It's possible for the tarball uploaded to GCS to be quite large. An example 
> is a user vendoring multiple dependencies in their tarball so as to achieve a 
> more stable deployable artifact.
> Before this change the GCS upload API call executed a multipart upload, which 
> the Google documentation 
> (https://cloud.google.com/storage/docs/json_api/v1/how-tos/upload)
> states should be used when the file is small enough to upload again when the 
> connection fails. For large tarballs, we will hit 60 second socket timeouts 
> before completing the multipart upload. By passing `total_size`, apitools 
> first checks if the size exceeds the resumable upload threshold, and executes 
> the more robust resumable upload rather than a multipart, avoiding 
> socket timeouts.
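
The size check described above amounts to something like the following sketch. It is not the apitools code: `choose_upload_strategy` is a hypothetical function, and the 5 MiB threshold is an assumed value used only for illustration.

```python
# Assumed threshold in bytes, for illustration only.
RESUMABLE_UPLOAD_THRESHOLD = 5 * 1024 * 1024


def choose_upload_strategy(total_size=None,
                           threshold=RESUMABLE_UPLOAD_THRESHOLD):
    """Picks 'resumable' for known-large payloads, 'multipart' otherwise."""
    if total_size is not None and total_size > threshold:
        return 'resumable'
    # Without a known size there is nothing to compare against, which is
    # why passing the size matters for large tarballs.
    return 'multipart'
```

In this sketch a tarball of a few hundred megabytes is uploaded as `multipart` when no size is passed and as `resumable` once the size is known.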





[jira] [Updated] (BEAM-9013) Multi-output TestStream breaks the DataflowRunner

2020-01-06 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-9013:
--
Fix Version/s: 2.18.0

> Multi-output TestStream breaks the DataflowRunner
> -
>
> Key: BEAM-9013
> URL: https://issues.apache.org/jira/browse/BEAM-9013
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.17.0
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
> Fix For: 2.17.0, 2.18.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-8944) Python SDK harness performance degradation with UnboundedThreadPoolExecutor

2019-12-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001589#comment-17001589
 ] 

Ahmet Altay commented on BEAM-8944:
---

Could this be closed after the cherry-pick PR 
([https://github.com/apache/beam/pull/10430])?

> Python SDK harness performance degradation with UnboundedThreadPoolExecutor
> ---
>
> Key: BEAM-8944
> URL: https://issues.apache.org/jira/browse/BEAM-8944
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Affects Versions: 2.18.0
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Blocker
> Fix For: 2.18.0
>
> Attachments: profiling.png, profiling_one_thread.png, 
> profiling_twelve_threads.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> We are seeing a performance degradation in the Python streaming word count 
> load tests.
>
> After some investigation, it appears to be caused by swapping the original 
> ThreadPoolExecutor for UnboundedThreadPoolExecutor in the SDK worker. The 
> suspicion is that Python performs worse with more threads on CPU-bound tasks.
>
> A simple test comparing the performance of the different thread pool executors:
>  
> {code:python}
> import queue
> import time
> from concurrent import futures
>
> from apache_beam.utils.thread_pool_executor import UnboundedThreadPoolExecutor
>
>
> def test_performance(self):
>   def run_perf(executor):
>     total_number = 100
>     q = queue.Queue()
>
>     def task(number):
>       # CPU-bound work; each task also enqueues follow-up work.
>       hash(number)
>       q.put(number + 200)
>       return number
>
>     t = time.time()
>     count = 0
>     # Seed the queue so the submit loop below always has work to pull.
>     for i in range(200):
>       q.put(i)
>     while count < total_number:
>       executor.submit(task, q.get(block=True))
>       count += 1
>     print('%s uses %s' % (executor, time.time() - t))
>
>   with UnboundedThreadPoolExecutor() as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=1) as executor:
>     run_perf(executor)
>   with futures.ThreadPoolExecutor(max_workers=12) as executor:
>     run_perf(executor)
> {code}
> Results:
>  0x7fab400dbe50> uses 268.160675049
>  uses 79.904583931
>  uses 191.179054976
> Profiling:
> UnboundedThreadPoolExecutor:
>  !profiling.png! 
> 1 Thread ThreadPoolExecutor:
>  !profiling_one_thread.png! 
> 12 Threads ThreadPoolExecutor:
>  !profiling_twelve_threads.png! 





[jira] [Commented] (BEAM-8974) apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info is flaky

2019-12-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001562#comment-17001562
 ] 

Ahmet Altay commented on BEAM-8974:
---

What is the next action here with respect to the 2.18 release?
 * Revert the cherry-pick to the release branch?
 * Fix forward in the release branch? Do we know what the fix is?
 * Leave it as it is? Is this just test flakiness? Would this affect end 
users?

> apache_beam.runners.worker.log_handler_test.FnApiLogRecordHandlerTest.test_exc_info
>  is flaky
> 
>
> Key: BEAM-8974
> URL: https://issues.apache.org/jira/browse/BEAM-8974
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The test is failing at apache_beam/runners/worker/log_handler_test.py:110: 
> IndexError
> Added in https://github.com/apache/beam/pull/10292
> Sample job: [https://builds.apache.org/job/beam_PreCommit_Python_Cron/2160/]
> Console logs:
>  {noformat}
> 06:37:37 === FAILURES 
> ===
> 06:37:37 ___ FnApiLogRecordHandlerTest.test_exc_info 
> 
> 06:37:37 [gw1] linux2 -- Python 2.7.12 
> /home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/target/.tox-py27-gcp-pytest/py27-gcp-pytest/bin/python
> 06:37:37
> 06:37:37 self = 
>  testMethod=test_exc_info>
> 06:37:37
> 06:37:37 def test_exc_info(self):
> 06:37:37   try:
> 06:37:37 raise ValueError('some message')
> 06:37:37   except ValueError:
> 06:37:37 _LOGGER.error('some error', exc_info=True)
> 06:37:37
> 06:37:37   self.fn_log_handler.close()
> 06:37:37
> 06:37:37 > log_entry = 
> self.test_logging_service.log_records_received[0].log_entries[0]
> 06:37:37 E IndexError: list index out of range
> 06:37:37
> 06:37:37 apache_beam/runners/worker/log_handler_test.py:110: IndexError
> 06:37:37 - Captured stderr call 
> -
> 06:37:37 ERROR:apache_beam.runners.worker.log_handler_test:some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> 06:37:37 -- Captured log call 
> ---
> 06:37:37 ERROR
> apache_beam.runners.worker.log_handler_test:log_handler_test.py:106 some error
> 06:37:37 Traceback (most recent call last):
> 06:37:37   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/apache_beam/runners/worker/log_handler_test.py",
>  line 104, in test_exc_info
> 06:37:37 raise ValueError('some message')
> 06:37:37 ValueError: some message
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8337) Add Flink job server container images to release process

2019-12-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001455#comment-17001455
 ] 

Ahmet Altay commented on BEAM-8337:
---

Do we have containers built?

> Add Flink job server container images to release process
> 
>
> Key: BEAM-8337
> URL: https://issues.apache.org/jira/browse/BEAM-8337
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Could be added to the release process similar to how we now publish SDK 
> worker images.





[jira] [Commented] (BEAM-8825) OOM when writing large numbers of 'narrow' rows

2019-12-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001437#comment-17001437
 ] 

Ahmet Altay commented on BEAM-8825:
---

Closing this. The cherry-pick PR is merged.

> OOM when writing large numbers of 'narrow' rows
> ---
>
> Key: BEAM-8825
> URL: https://issues.apache.org/jira/browse/BEAM-8825
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 
> 2.16.0, 2.17.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> SpannerIO can OOM when writing large numbers of 'narrow' rows. 
>  
> SpannerIO puts input mutation elements into batches for efficient writing.
> These batches are limited by number of cells mutated, and size of data 
> written (5000 cells, 1MB data). SpannerIO groups enough mutations to build 
> 1000 of these batches (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and size of data are very small (<5 cells, <100 
> bytes), the memory overhead of storing millions of mutations for batching is 
> significant, and can lead to OOMs.
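
To make the numbers in the description concrete, here is a rough back-of-the-envelope sketch in Python (it is not SpannerIO code). It assumes that grouping stops as soon as either the cell limit or the byte limit is reached, and that 1MB means 1,000,000 bytes; the function name and constants are made up for illustration.

```python
# Limits quoted in the issue description.
MAX_CELLS_PER_BATCH = 5000
MAX_BYTES_PER_BATCH = 1_000_000  # "1MB"; decimal megabyte is an assumption
GROUPING_FACTOR = 1000           # number of batches grouped before sorting


def buffered_mutations(cells_per_mutation, bytes_per_mutation):
    """Estimate how many mutations are held in memory before sorting.

    Assumes buffering stops when either the cell budget (5M cells) or the
    byte budget (1GB) is exhausted, whichever comes first.
    """
    max_cells = MAX_CELLS_PER_BATCH * GROUPING_FACTOR
    max_bytes = MAX_BYTES_PER_BATCH * GROUPING_FACTOR
    by_cells = max_cells // cells_per_mutation
    by_bytes = max_bytes // bytes_per_mutation
    return min(by_cells, by_bytes)


if __name__ == "__main__":
    # A 'narrow' row at the upper bound from the description: 5 cells, 100 bytes.
    print(buffered_mutations(5, 100))
    # An even narrower row: 1 cell, 50 bytes.
    print(buffered_mutations(1, 50))
```

Under these assumptions the cell limit is hit first for narrow rows, so the number of buffered mutation objects is in the millions, and the per-object overhead (not modeled here) dominates the payload size.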





[jira] [Resolved] (BEAM-8825) OOM when writing large numbers of 'narrow' rows

2019-12-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-8825.
---
Resolution: Fixed

> OOM when writing large numbers of 'narrow' rows
> ---
>
> Key: BEAM-8825
> URL: https://issues.apache.org/jira/browse/BEAM-8825
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 
> 2.16.0, 2.17.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> SpannerIO can OOM when writing large numbers of 'narrow' rows. 
>  
> SpannerIO puts input mutation elements into batches for efficient writing.
> These batches are limited by number of cells mutated, and size of data 
> written (5000 cells, 1MB data). SpannerIO groups enough mutations to build 
> 1000 of these batches (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and size of data are very small (<5 cells, <100 
> bytes), the memory overhead of storing millions of mutations for batching is 
> significant, and can lead to OOMs.





[jira] [Resolved] (BEAM-8882) Allow Dataflow to automatically choose portability or not.

2019-12-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-8882.
---
Resolution: Fixed

> Allow Dataflow to automatically choose portability or not.
> --
>
> Key: BEAM-8882
> URL: https://issues.apache.org/jira/browse/BEAM-8882
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.18.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> We would like the Dataflow service to be able to automatically choose whether 
> to run pipelines in a portable way. In order to do this, we need to provide 
> more information even if portability is not explicitly requested. 





[jira] [Commented] (BEAM-8882) Allow Dataflow to automatically choose portability or not.

2019-12-20 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001365#comment-17001365
 ] 

Ahmet Altay commented on BEAM-8882:
---

Closing this. I do not see any other open PRs related to this JIRA.

> Allow Dataflow to automatically choose portability or not.
> --
>
> Key: BEAM-8882
> URL: https://issues.apache.org/jira/browse/BEAM-8882
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: 2.18.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> We would like the Dataflow service to be able to automatically choose whether 
> to run pipelines in a portable way. In order to do this, we need to provide 
> more information even if portability is not explicitly requested. 





[jira] [Assigned] (BEAM-8906) Long BigQuery dry runs cause avalanche delay

2019-12-20 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-8906:
-

Assignee: Chamikara Madhusanka Jayalath

> Long BigQuery dry runs cause avalanche delay
> 
>
> Key: BEAM-8906
> URL: https://issues.apache.org/jira/browse/BEAM-8906
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.16.0
> Environment: Google Cloud Platform
>Reporter: June Oh
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Major
>
> Reproduction Steps:
> 1. Compose a BigQuery SELECT query that will take over 80 seconds for a dry 
> run.
> 2. Run the query with Beam SDK's BigQueryIO.
> 3. Observe the 10+ minute delay before the actual query job is created.
> When running readTableRows(), BigQueryIO attempts to estimate the query size 
> by performing a dry run, even if withoutValidation() is set. If the request 
> takes over 80 seconds (RetryHttpRequestInitializer.HANGING_GET_TIMEOUT_SEC), 
> RetryHttpRequestInitializer will time out and retry, up to 9 times 
> (BigQueryServicesImpl.MAX_RPC_RETRIES). Hence, once a dry run duration 
> crosses the 80 second tipping point, it causes an inevitable avalanche of a 
> 720-second delay. Considering the fact that size estimation is not a 
> requirement in running the query [1], BigQueryIO should provide a way to 
> circumvent the redundant delay, especially in consideration of time-critical 
> enterprise workloads.
> There can be several ways to address this:
> - increasing the timeout threshold (which will still create a tipping point);
> - preventing the dry run requests from retrying; or
> - adding an option to skip the size estimation within 
> serializeToCloudSource().
> [1] 
> https://github.com/apache/beam/blob/2ec3b0495c191597c9a88830d25a2c360b3277e0/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/CustomSources.java#L75
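
The tipping point described above can be illustrated with a small Python model (a sketch only, not BigQueryIO code). The two constants take their names and values from the description; the function name is hypothetical, and the model assumes every attempt of a slow dry run times out after exactly 80 seconds.

```python
# Values quoted in the issue description.
HANGING_GET_TIMEOUT_SEC = 80  # RetryHttpRequestInitializer.HANGING_GET_TIMEOUT_SEC
MAX_RPC_RETRIES = 9           # BigQueryServicesImpl.MAX_RPC_RETRIES


def dry_run_delay_sec(dry_run_duration_sec):
    """Return the modeled delay before the real query job is created.

    A dry run that finishes within the timeout costs only its own duration.
    A dry run that exceeds the timeout is assumed to time out on every
    attempt, so the delay is timeout * attempts.
    """
    if dry_run_duration_sec <= HANGING_GET_TIMEOUT_SEC:
        return dry_run_duration_sec
    return HANGING_GET_TIMEOUT_SEC * MAX_RPC_RETRIES


if __name__ == "__main__":
    for duration in (60, 80, 81):
        print(duration, "->", dry_run_delay_sec(duration))
```

In this model a dry run of 80 seconds costs 80 seconds, while one of 81 seconds costs 720 seconds, which is the avalanche the report refers to.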





[jira] [Commented] (BEAM-8825) OOM when writing large numbers of 'narrow' rows

2019-12-19 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17000338#comment-17000338
 ] 

Ahmet Altay commented on BEAM-8825:
---

Could this issue be closed?

> OOM when writing large numbers of 'narrow' rows
> ---
>
> Key: BEAM-8825
> URL: https://issues.apache.org/jira/browse/BEAM-8825
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.9.0, 2.10.0, 2.11.0, 2.12.0, 2.13.0, 2.14.0, 2.15.0, 
> 2.16.0, 2.17.0
>Reporter: Niel Markwick
>Assignee: Niel Markwick
>Priority: Major
> Fix For: 2.18.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> SpannerIO can OOM when writing large numbers of 'narrow' rows. 
>  
> SpannerIO puts input mutation elements into batches for efficient writing.
> These batches are limited by number of cells mutated, and size of data 
> written (5000 cells, 1MB data). SpannerIO groups enough mutations to build 
> 1000 of these batches (5M cells, 1GB data), then sorts and batches them.
> When the number of cells and size of data are very small (<5 cells, <100 
> bytes), the memory overhead of storing millions of mutations for batching is 
> significant, and can lead to OOMs.





[jira] [Commented] (BEAM-8195) Quota exceeded for create requests

2019-12-19 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17000330#comment-17000330
 ] 

Ahmet Altay commented on BEAM-8195:
---

I believe dataflow_v1b3_client.py is an auto-generated file. Is it possible to 
disable these retries at runtime by calling an API or something similar?

> Quota exceeded for create requests
> --
>
> Key: BEAM-8195
> URL: https://issues.apache.org/jira/browse/BEAM-8195
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, testing
>Reporter: Ahmet Altay
>Assignee: Yifan Zou
>Priority: Critical
>
> Post commits failed with the following error:
> HttpError accessing 
> :
>  response: <{'server': 'ESF', '-content-encoding': 'gzip', 'content-type': 
> 'application/json; charset=UTF-8', 'content-length': '598', 
> 'transfer-encoding': 'chunked', 'cache-control': 'private', 
> 'x-xss-protection': '0', 'date': 'Tue, 10 Sep 2019 12:02:24 GMT', 'vary': 
> 'Origin, X-Origin, Referer', 'x-frame-options': 'SAMEORIGIN', 'status': 
> '429', 'x-content-type-options': 'nosniff'}>, content <{
>   "error": {
> "code": 429,
> "message": "Quota exceeded for quota metric 
> 'dataflow.googleapis.com/create_requests' and limit 
> 'CreateRequestsPerMinutePerUser' of service 'dataflow.googleapis.com' for 
> consumer 'project_number:844138762903'.",
> "status": "RESOURCE_EXHAUSTED",
> "details": [
>   {
> Could we increase the quota?
> /cc [~alanmyrvold] [~kenn]





[jira] [Updated] (BEAM-8912) PreCommit_Python2_PVR_Flink_Commit flaky

2019-12-06 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8912:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> PreCommit_Python2_PVR_Flink_Commit flaky
> 
>
> Key: BEAM-8912
> URL: https://issues.apache.org/jira/browse/BEAM-8912
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Ahmet Altay
>Assignee: Kyle Weaver
>Priority: Critical
>
> cc: [~angoenka]
> Logs: 
> [https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/1687/console]
> Error:
> 42assert_that/Group/GroupByKey/GroupByWindow.None/beam:env:external:v1:0:beam:metric:sampled_byte_size:v1 {PCOLLECTION=ref_PCollection_PCollection_27}: DistributionResult{sum=59, count=1, min=59, max=59}))
> 19:44:02 [flink-runner-job-invoker] WARN org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - Failed to remove job staging directory for token {"sessionId":"job_b05dbc6c-00d8-4df1-9bf1-efdf35899fa6","basePath":"/tmp/flinktestdPbkyj"}: {}
> 19:44:02 java.io.FileNotFoundException: /tmp/flinktestdPbkyj/job_b05dbc6c-00d8-4df1-9bf1-efdf35899fa6/MANIFEST (No such file or directory)
> 19:44:02   at java.io.FileInputStream.open0(Native Method)
> 19:44:02   at java.io.FileInputStream.open(FileInputStream.java:195)
> 19:44:02   at java.io.FileInputStream.<init>(FileInputStream.java:138)
> 19:44:02   at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:118)
> 19:44:02   at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:82)
> 19:44:02   at org.apache.beam.sdk.io.FileSystems.open(FileSystems.java:252)
> 19:44:02   at org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService.loadManifest(BeamFileSystemArtifactRetrievalService.java:88)
> 19:44:02   at org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService.removeArtifacts(BeamFileSystemArtifactStagingService.java:92)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver.lambda$createJobService$0(JobServerDriver.java:63)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.lambda$run$0(InMemoryJobService.java:201)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.setState(JobInvocation.java:241)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.access$200(JobInvocation.java:48)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:110)
> 19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:96)
> 19:44:02   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1058)
> 19:44:02   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 19:44:02   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 19:44:02   at java.lang.Thread.run(Thread.java:748)
> 19:44:02 INFO:apache_beam.runners.portability.portable_runner:Job state changed to DONE
> 19:44:02 .INFO:__main__:removing conf dir: /tmp/flinktest-confGy17Mj





[jira] [Created] (BEAM-8912) PreCommit_Python2_PVR_Flink_Commit flaky

2019-12-06 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8912:
-

 Summary: PreCommit_Python2_PVR_Flink_Commit flaky
 Key: BEAM-8912
 URL: https://issues.apache.org/jira/browse/BEAM-8912
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Ahmet Altay
Assignee: Kyle Weaver


cc: [~angoenka]

Logs: 
[https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Commit/1687/console]

Error:
42assert_that/Group/GroupByKey/GroupByWindow.None/beam:env:external:v1:0:beam:metric:sampled_byte_size:v1 {PCOLLECTION=ref_PCollection_PCollection_27}: DistributionResult{sum=59, count=1, min=59, max=59}))
19:44:02 [flink-runner-job-invoker] WARN org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService - Failed to remove job staging directory for token {"sessionId":"job_b05dbc6c-00d8-4df1-9bf1-efdf35899fa6","basePath":"/tmp/flinktestdPbkyj"}: {}
19:44:02 java.io.FileNotFoundException: /tmp/flinktestdPbkyj/job_b05dbc6c-00d8-4df1-9bf1-efdf35899fa6/MANIFEST (No such file or directory)
19:44:02   at java.io.FileInputStream.open0(Native Method)
19:44:02   at java.io.FileInputStream.open(FileInputStream.java:195)
19:44:02   at java.io.FileInputStream.<init>(FileInputStream.java:138)
19:44:02   at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:118)
19:44:02   at org.apache.beam.sdk.io.LocalFileSystem.open(LocalFileSystem.java:82)
19:44:02   at org.apache.beam.sdk.io.FileSystems.open(FileSystems.java:252)
19:44:02   at org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService.loadManifest(BeamFileSystemArtifactRetrievalService.java:88)
19:44:02   at org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactStagingService.removeArtifacts(BeamFileSystemArtifactStagingService.java:92)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobServerDriver.lambda$createJobService$0(JobServerDriver.java:63)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService.lambda$run$0(InMemoryJobService.java:201)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.setState(JobInvocation.java:241)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.access$200(JobInvocation.java:48)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:110)
19:44:02   at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation$1.onSuccess(JobInvocation.java:96)
19:44:02   at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1058)
19:44:02   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
19:44:02   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
19:44:02   at java.lang.Thread.run(Thread.java:748)
19:44:02 INFO:apache_beam.runners.portability.portable_runner:Job state changed to DONE
19:44:02 .INFO:__main__:removing conf dir: /tmp/flinktest-confGy17Mj





[jira] [Closed] (BEAM-6694) ApproximateQuantiles transform for Python SDK

2019-12-03 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay closed BEAM-6694.
-
Fix Version/s: 2.16.0
   Resolution: Fixed

> ApproximateQuantiles transform for Python SDK
> -
>
> Key: BEAM-6694
> URL: https://issues.apache.org/jira/browse/BEAM-6694
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Shehzaad Nakhoda
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Add PTransforms for getting an idea of a PCollection's data distribution 
> using approximate N-tiles (e.g. quartiles, percentiles, etc.), either 
> globally or per-key.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateQuantiles.java
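
As an illustration of what "N-tiles, either globally or per-key" means, here is a plain-Python sketch that computes exact (not approximate) quantiles over an in-memory list. It is not the Beam transform and makes no claim about its API; the function names are hypothetical, and the convention that the result has num_quantiles entries with the minimum first and the maximum last is an assumption of this sketch.

```python
from collections import defaultdict


def quantiles(values, num_quantiles):
    """Return num_quantiles evenly spaced order statistics (min first, max last).

    Exact computation by sorting; requires num_quantiles >= 2.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    last = len(ordered) - 1
    # Integer arithmetic keeps index selection deterministic.
    return [ordered[(i * last) // (num_quantiles - 1)]
            for i in range(num_quantiles)]


def quantiles_per_key(pairs, num_quantiles):
    """Group (key, value) pairs by key and compute quantiles for each key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: quantiles(vals, num_quantiles)
            for key, vals in grouped.items()}


if __name__ == "__main__":
    # Quartile boundaries of 0..100.
    print(quantiles(range(101), 5))
    print(quantiles_per_key([("a", 3), ("a", 1), ("a", 2), ("b", 10)], 3))
```

A distributed transform cannot sort the whole collection in memory like this, which is why an approximate algorithm is used in practice.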





[jira] [Resolved] (BEAM-7018) Regex transform for Python SDK

2019-12-03 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-7018.
---
Fix Version/s: 2.16.0
   Resolution: Fixed

> Regex transform for Python SDK
> --
>
> Key: BEAM-7018
> URL: https://issues.apache.org/jira/browse/BEAM-7018
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Rose Nguyen
>Assignee: Shehzaad Nakhoda
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> PTransforms that use regular expressions to process elements in a PCollection.
> It should offer the same API as its Java counterpart: 
> [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java]
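
The kind of per-element processing such transforms perform can be sketched with the standard re module over a plain iterable. This is only an illustration and does not use Beam; the function names below are hypothetical and are not the SDK's API.

```python
import re


def regex_matches(elements, pattern, group=0):
    """Yield the requested group for each element that matches the pattern in full."""
    compiled = re.compile(pattern)
    for element in elements:
        match = compiled.fullmatch(element)
        if match:
            yield match.group(group)


def regex_find(elements, pattern, group=0):
    """Yield the requested group of the first match found anywhere in each element."""
    compiled = re.compile(pattern)
    for element in elements:
        match = compiled.search(element)
        if match:
            yield match.group(group)


def regex_replace_all(elements, pattern, replacement):
    """Yield each element with every match of the pattern replaced."""
    compiled = re.compile(pattern)
    for element in elements:
        yield compiled.sub(replacement, element)


if __name__ == "__main__":
    print(list(regex_matches(["a1", "b", "c22"], r"[a-z]\d+")))
    print(list(regex_find(["xx a1 yy", "none"], r"[a-z]\d")))
    print(list(regex_replace_all(["a1b2"], r"\d", "#")))
```

Elements that do not match are simply dropped by the first two functions, which mirrors the filter-like behaviour one would expect from a matching transform.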





[jira] [Commented] (BEAM-7018) Regex transform for Python SDK

2019-12-03 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987110#comment-16987110
 ] 

Ahmet Altay commented on BEAM-7018:
---

Can we close this? I believe this is completed.

> Regex transform for Python SDK
> --
>
> Key: BEAM-7018
> URL: https://issues.apache.org/jira/browse/BEAM-7018
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Rose Nguyen
>Assignee: Shehzaad Nakhoda
>Priority: Minor
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> PTransforms that use regular expressions to process elements in a PCollection.
> It should offer the same API as its Java counterpart: 
> [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java]





[jira] [Commented] (BEAM-7018) Regex transform for Python SDK

2019-12-03 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987109#comment-16987109
 ] 

Ahmet Altay commented on BEAM-7018:
---

Can we close this? I believe this is completed.

> Regex transform for Python SDK
> --
>
> Key: BEAM-7018
> URL: https://issues.apache.org/jira/browse/BEAM-7018
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Rose Nguyen
>Assignee: Shehzaad Nakhoda
>Priority: Minor
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> PTransforms that use regular expressions to process elements in a PCollection.
> It should offer the same API as its Java counterpart: 
> [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java]





[jira] [Commented] (BEAM-6694) ApproximateQuantiles transform for Python SDK

2019-12-03 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-6694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987107#comment-16987107
 ] 

Ahmet Altay commented on BEAM-6694:
---

Can we close this? I believe this is completed.

> ApproximateQuantiles transform for Python SDK
> -
>
> Key: BEAM-6694
> URL: https://issues.apache.org/jira/browse/BEAM-6694
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Shehzaad Nakhoda
>Priority: Minor
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Add PTransforms for getting an idea of a PCollection's data distribution 
> using approximate N-tiles (e.g. quartiles, percentiles, etc.), either 
> globally or per-key.
> It should offer the same API as its Java counterpart: 
> https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ApproximateQuantiles.java





[jira] [Updated] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-03 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8877:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> beam_PostCommit_Py_VR_Dataflow is timing out
> 
>
> Key: BEAM-8877
> URL: https://issues.apache.org/jira/browse/BEAM-8877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core, test-failures
>Reporter: Ahmet Altay
>Assignee: Valentyn Tymofieiev
>Priority: Critical
>
> Error:
> 06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
> 06:47:45 Build was aborted
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]
>  
> Should we increase the timeout here, similar to 
> [https://github.com/apache/beam/pull/10234]?
> cc: [~Ardagan]
>  





[jira] [Created] (BEAM-8877) beam_PostCommit_Py_VR_Dataflow is timing out

2019-12-03 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8877:
-

 Summary: beam_PostCommit_Py_VR_Dataflow is timing out
 Key: BEAM-8877
 URL: https://issues.apache.org/jira/browse/BEAM-8877
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core, test-failures
Reporter: Ahmet Altay
Assignee: Valentyn Tymofieiev


Error:

06:47:45 Build timed out (after 100 minutes). Marking the build as aborted.
06:47:45 Build was aborted

Log: [https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/5214/console]

 

Should we increase the timeout here, similar to 
[https://github.com/apache/beam/pull/10234]?

cc: [~Ardagan]

 





[jira] [Assigned] (BEAM-8653) installGcpTest task is flaky

2019-12-02 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-8653:
-

Assignee: Udi Meiri

> installGcpTest task is flaky
> 
>
> Key: BEAM-8653
> URL: https://issues.apache.org/jira/browse/BEAM-8653
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Udi Meiri
>Priority: Major
>
> Logs: 
> [https://builds.apache.org/job/beam_PostCommit_Python35/984/console#gradle-task-126]
>  
> 11:11:20 > Task :sdks:python:test-suites:portable:py35:installGcpTest FAILED
> 11:11:20 Obtaining 
> file:///home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python
> 11:11:20 ERROR: Command errored out with exit status 1:
> 11:11:20  command: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/bin/python3.5
>  -c 'import sys, setuptools, tokenize; sys.argv[0] = 
> '"'"'/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py'"'"';
>  
> __file__='"'"'/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py'"'"';f=getattr(tokenize,
>  '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', 
> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' 
> egg_info
> 11:11:20  cwd: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/
> 11:11:20 Complete output (37 lines):
> 11:11:20 Traceback (most recent call last):
> 11:11:20   File "<string>", line 1, in <module>
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py",
>  line 264, in <module>
> 11:11:20 'test': generate_protos_first(test),
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/__init__.py",
>  line 144, in setup
> 11:11:20 _install_setup_requires(attrs)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/__init__.py",
>  line 139, in _install_setup_requires
> 11:11:20 dist.fetch_build_eggs(dist.setup_requires)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/dist.py",
>  line 720, in fetch_build_eggs
> 11:11:20 replace_conflicting=True,
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 782, in resolve
> 11:11:20 replace_conflicting=replace_conflicting
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 1065, in best_match
> 11:11:20 return self.obtain(req, installer)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 1077, in obtain
> 11:11:20 return installer(requirement)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/dist.py",
>  line 787, in fetch_build_egg
> 11:11:20 return cmd.easy_install(req)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 679, in easy_install
> 11:11:20 return self.install_item(spec, dist.location, tmpdir, deps)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 705, in install_item
> 11:11:20 dists = self.install_eggs(spec, download, tmpdir)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 855, in install_eggs
> 11:11:20 return [self.install_wheel(dist_filename, tmpdir)]
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 1073, in install_wheel
> 11:11:20 os.path.dirname(destination)
> 11:11:20   File "/usr/lib/python3.5/distutils/cmd.py", line 336, in 
> execute
> 11:11:20 util.execute(func, args, msg, 

[jira] [Commented] (BEAM-8653) installGcpTest task is flaky

2019-12-02 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986564#comment-16986564
 ] 

Ahmet Altay commented on BEAM-8653:
---

[~udim] is this fixed with your recent changes to setup_requires and 
tests_require?

> installGcpTest task is flaky
> 
>
> Key: BEAM-8653
> URL: https://issues.apache.org/jira/browse/BEAM-8653
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Udi Meiri
>Priority: Major
>
> Logs: 
> [https://builds.apache.org/job/beam_PostCommit_Python35/984/console#gradle-task-126]
>  
> 11:11:20 > Task :sdks:python:test-suites:portable:py35:installGcpTest FAILED
> 11:11:20 Obtaining 
> file:///home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python
> 11:11:20 ERROR: Command errored out with exit status 1:
> 11:11:20  command: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/bin/python3.5
>  -c 'import sys, setuptools, tokenize; sys.argv[0] = 
> '"'"'/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py'"'"';
>  
> __file__='"'"'/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py'"'"';f=getattr(tokenize,
>  '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', 
> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' 
> egg_info
> 11:11:20  cwd: 
> /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/
> 11:11:20 Complete output (37 lines):
> 11:11:20 Traceback (most recent call last):
> 11:11:20   File "<string>", line 1, in <module>
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/sdks/python/setup.py",
>  line 264, in <module>
> 11:11:20 'test': generate_protos_first(test),
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/__init__.py",
>  line 144, in setup
> 11:11:20 _install_setup_requires(attrs)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/__init__.py",
>  line 139, in _install_setup_requires
> 11:11:20 dist.fetch_build_eggs(dist.setup_requires)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/dist.py",
>  line 720, in fetch_build_eggs
> 11:11:20 replace_conflicting=True,
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 782, in resolve
> 11:11:20 replace_conflicting=replace_conflicting
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 1065, in best_match
> 11:11:20 return self.obtain(req, installer)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/pkg_resources/__init__.py",
>  line 1077, in obtain
> 11:11:20 return installer(requirement)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/dist.py",
>  line 787, in fetch_build_egg
> 11:11:20 return cmd.easy_install(req)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 679, in easy_install
> 11:11:20 return self.install_item(spec, dist.location, tmpdir, deps)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 705, in install_item
> 11:11:20 dists = self.install_eggs(spec, download, tmpdir)
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 855, in install_eggs
> 11:11:20 return [self.install_wheel(dist_filename, tmpdir)]
> 11:11:20   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python35/src/build/gradleenv/2022703439/lib/python3.5/site-packages/setuptools/command/easy_install.py",
>  line 1073, in install_wheel
> 11:11:20 os.path.dirname(destination)
> 11:11:20   File 

[jira] [Updated] (BEAM-8868) BigQueryWriteIntegrationTests.test_big_query_write_without_schema - flaky post commit

2019-12-02 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8868:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> BigQueryWriteIntegrationTests.test_big_query_write_without_schema - flaky 
> post commit
> -
>
> Key: BEAM-8868
> URL: https://issues.apache.org/jira/browse/BEAM-8868
> Project: Beam
>  Issue Type: Sub-task
>  Components: io-py-gcp, test-failures
>Reporter: Ahmet Altay
>Assignee: Chamikara Madhusanka Jayalath
>Priority: Critical
>
> Log: 
> [https://builds.apache.org/job/beam_PostCommit_Python37/1047/testReport/junit/apache_beam.io.gcp.bigquery_write_it_test/BigQueryWriteIntegrationTests/test_big_query_write_without_schema/]
> Error Message
> Expected: (Expected data is [(b'xyw', datetime.date(2011, 1, 1), 
> datetime.time(23, 59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), 
> datetime.time(0, 0)), (b'\xe4\xbd\xa0\xe5\xa5\xbd', datetime.date(3000, 12, 
> 31), datetime.time(23, 59, 59)), (b'\xab\xac\xad', datetime.date(2000, 1, 1), 
> datetime.time(0, 0))])
>  but: Expected data is [(b'xyw', datetime.date(2011, 1, 1), datetime.time(23, 
> 59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), datetime.time(0, 0)), 
> (b'\xe4\xbd\xa0\xe5\xa5\xbd', datetime.date(3000, 12, 31), datetime.time(23, 
> 59, 59)), (b'\xab\xac\xad', datetime.date(2000, 1, 1), datetime.time(0, 0))] 
> Actual data is []



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8868) BigQueryWriteIntegrationTests.test_big_query_write_without_schema - flaky post commit

2019-12-02 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8868:
-

 Summary: 
BigQueryWriteIntegrationTests.test_big_query_write_without_schema - flaky post 
commit
 Key: BEAM-8868
 URL: https://issues.apache.org/jira/browse/BEAM-8868
 Project: Beam
  Issue Type: Bug
  Components: io-py-gcp, test-failures
Reporter: Ahmet Altay
Assignee: Chamikara Madhusanka Jayalath


Log: 
[https://builds.apache.org/job/beam_PostCommit_Python37/1047/testReport/junit/apache_beam.io.gcp.bigquery_write_it_test/BigQueryWriteIntegrationTests/test_big_query_write_without_schema/]

Error Message
Expected: (Expected data is [(b'xyw', datetime.date(2011, 1, 1), 
datetime.time(23, 59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), 
datetime.time(0, 0)), (b'\xe4\xbd\xa0\xe5\xa5\xbd', datetime.date(3000, 12, 
31), datetime.time(23, 59, 59)), (b'\xab\xac\xad', datetime.date(2000, 1, 1), 
datetime.time(0, 0))])
 but: Expected data is [(b'xyw', datetime.date(2011, 1, 1), datetime.time(23, 
59, 59, 99)), (b'abc', datetime.date(2000, 1, 1), datetime.time(0, 0)), 
(b'\xe4\xbd\xa0\xe5\xa5\xbd', datetime.date(3000, 12, 31), datetime.time(23, 
59, 59)), (b'\xab\xac\xad', datetime.date(2000, 1, 1), datetime.time(0, 0))] 
Actual data is []





[jira] [Updated] (BEAM-8867) sparkValidatesRunner - python post commit flaky

2019-12-02 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8867:
--
Parent: BEAM-8193
Issue Type: Sub-task  (was: Bug)

> sparkValidatesRunner - python post commit flaky
> ---
>
> Key: BEAM-8867
> URL: https://issues.apache.org/jira/browse/BEAM-8867
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-spark
>Reporter: Ahmet Altay
>Assignee: Kyle Weaver
>Priority: Critical
>
> Logs: 
> [https://scans.gradle.com/s/iwdvlnwg6euhk/console-log?task=:sdks:python:test-suites:portable:py2:sparkValidatesRunner]
>  
> Error:
> ERROR org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService: Failed to remove job staging directory for token {"sessionId":"job_79e29409-03c8-4203-bac4-ca3c30f00b35","basePath":"/tmp/sparktestzSz0RX"}: {}
> ERROR org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService: Failed to remove job staging directory for token {"sessionId":"job_79e29409-03c8-4203-bac4-ca3c30f00b35","basePath":"/tmp/sparktestzSz0RX"}: {}
> java.io.FileNotFoundException: /tmp/sparktestzSz0RX/job_79e29409-03c8-4203-bac4-ca3c30f00b35/MANIFEST (No such file or directory)





[jira] [Created] (BEAM-8867) sparkValidatesRunner - python post commit flaky

2019-12-02 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8867:
-

 Summary: sparkValidatesRunner - python post commit flaky
 Key: BEAM-8867
 URL: https://issues.apache.org/jira/browse/BEAM-8867
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Reporter: Ahmet Altay
Assignee: Kyle Weaver


Logs: 
[https://scans.gradle.com/s/iwdvlnwg6euhk/console-log?task=:sdks:python:test-suites:portable:py2:sparkValidatesRunner]

 

Error:

ERROR org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService: Failed to remove job staging directory for token {"sessionId":"job_79e29409-03c8-4203-bac4-ca3c30f00b35","basePath":"/tmp/sparktestzSz0RX"}: {}
ERROR org.apache.beam.runners.fnexecution.jobsubmission.InMemoryJobService: Failed to remove job staging directory for token {"sessionId":"job_79e29409-03c8-4203-bac4-ca3c30f00b35","basePath":"/tmp/sparktestzSz0RX"}: {}
java.io.FileNotFoundException: /tmp/sparktestzSz0RX/job_79e29409-03c8-4203-bac4-ca3c30f00b35/MANIFEST (No such file or directory)





[jira] [Created] (BEAM-8866) portableWordCountFlinkRunnerStreaming - flaky post commits

2019-12-02 Thread Ahmet Altay (Jira)
Ahmet Altay created BEAM-8866:
-

 Summary: portableWordCountFlinkRunnerStreaming - flaky post commits
 Key: BEAM-8866
 URL: https://issues.apache.org/jira/browse/BEAM-8866
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Ahmet Altay
Assignee: Kyle Weaver


Logs: 
[https://scans.gradle.com/s/rkdiftvzvr7cy/console-log?task=:sdks:python:test-suites:portable:py36:portableWordCountFlinkRunnerStreaming]

Error:

..

  File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/io/localfilesystem.py", line 335, in delete
  File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python36/src/sdks/python/apache_beam/io/localfilesystem.py", line 335, in delete
    raise BeamIOError("Delete operation failed", exceptions)
apache_beam.io.filesystem.BeamIOError: Delete operation failed with exceptions {'/tmp/py-wordcount-direct-1-of-2': OSError('No files found to delete under: /tmp/py-wordcount-direct-1-of-2',), '/tmp/py-wordcount-direct-0-of-2': OSError('No files found to delete under: /tmp/py-wordcount-direct-0-of-2',)}

During handling of the above exception, another exception occurred:

 




