[jira] [Commented] (BEAM-8551) Beam Python containers should include all Beam SDK dependencies, and do not have conflicting dependencies

2020-03-17 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061206#comment-17061206
 ] 

David Yan commented on BEAM-8551:
-

`pip check` is another way to check for broken dependencies.

> Beam Python containers should include all Beam SDK dependencies, and do not 
> have conflicting dependencies
> -
>
> Key: BEAM-8551
> URL: https://issues.apache.org/jira/browse/BEAM-8551
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Checks could be introduced during container creation, and be enforced by 
> ValidatesContainer test suites. We could:
> - Check pip output or status code for incompatible dependency errors.
> - Remove internet access when installing apache-beam in the container, to 
> make sure all dependencies are installed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-9530) Add `pip check` to ensure good python dependencies

2020-03-17 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed BEAM-9530.
---
Fix Version/s: Not applicable
   Resolution: Duplicate

> Add `pip check` to ensure good python dependencies
> --
>
> Key: BEAM-9530
> URL: https://issues.apache.org/jira/browse/BEAM-9530
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: David Yan
>Priority: Major
> Fix For: Not applicable
>
>
> We should add {{pip check}} after pip install in our tests to make sure there 
> is no incompatibility.  {{pip install}} does not return an error exit code 
> for broken dependencies for historical reasons.





[jira] [Commented] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-17 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17061158#comment-17061158
 ] 

David Yan commented on BEAM-9510:
-

Also related: BEAM-9530

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.





[jira] [Created] (BEAM-9530) Add `pip check` to ensure good python dependencies

2020-03-17 Thread David Yan (Jira)
David Yan created BEAM-9530:
---

 Summary: Add `pip check` to ensure good python dependencies
 Key: BEAM-9530
 URL: https://issues.apache.org/jira/browse/BEAM-9530
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-harness
Reporter: David Yan


We should add {{pip check}} after pip install in our tests to make sure there 
is no incompatibility.  {{pip install}} does not return an error exit code for 
broken dependencies for historical reasons.
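
A minimal sketch of such a check, assuming the test harness can shell out to pip. The helper name `run_dependency_check` and its `cmd` parameter are illustrative and not part of Beam. `pip check` exits with a non-zero status when it finds missing or incompatible requirements, so the exit code can gate the test.

```python
import subprocess
import sys


def run_dependency_check(cmd=None):
    """Run a dependency check command and report whether it passed.

    By default this runs `pip check` with the current interpreter.
    Returns (ok, output), where ok is True when the exit code is 0.
    `cmd` is a hypothetical override hook, useful for testing.
    """
    if cmd is None:
        cmd = [sys.executable, "-m", "pip", "check"]
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True)
    return result.returncode == 0, result.stdout
```

A test suite could call `run_dependency_check()` right after its `pip install` step and fail the build, printing the captured output, whenever the first element of the result is False.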





[jira] [Updated] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-16 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-9510:

Summary: Dependencies in base_image_requirements.txt are not compatible 
with each other  (was: Dependencies in base_image_requirements.txt are not 
compatible with apache-beam pypi deps)

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Priority: Major
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.





[jira] [Created] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with apache-beam pypi deps

2020-03-16 Thread David Yan (Jira)
David Yan created BEAM-9510:
---

 Summary: Dependencies in base_image_requirements.txt are not 
compatible with apache-beam pypi deps
 Key: BEAM-9510
 URL: https://issues.apache.org/jira/browse/BEAM-9510
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: David Yan


[https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]

says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0

But they are incompatible with each other:

ERROR: google-cloud-bigquery 1.24.0 has requirement 
google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 which 
is incompatible.

ERROR: google-cloud-bigtable 0.32.1 has requirement 
google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
which is incompatible.

ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have grpcio 
1.22.0 which is incompatible.

ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
but you'll have scipy 1.2.2 which is incompatible.
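
The conflict lines above follow a regular shape, so a test could extract them from pip's output instead of relying on someone reading the log. The following is only a sketch: the function name `find_conflicts` and the regular expression are assumptions based on the four messages quoted in this ticket, not on a documented pip output format.

```python
import re

# Matches messages of the form quoted in this ticket:
#   ERROR: <pkg> <ver> has requirement <req>, but you'll have <dep> <ver>
#   which is incompatible.
_CONFLICT_RE = re.compile(
    r"ERROR: (?P<package>\S+) (?P<version>\S+) has requirement "
    r"(?P<requirement>.+?), but you'll have "
    r"(?P<installed>\S+) (?P<installed_version>\S+) which is incompatible\.")


def find_conflicts(pip_output):
    """Return one dict per conflict message found in pip's output."""
    # Collapse whitespace first so messages wrapped across lines still match.
    text = " ".join(pip_output.split())
    return [m.groupdict() for m in _CONFLICT_RE.finditer(text)]
```

An empty list means no conflict message of this shape was found; it does not prove the environment is consistent.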

 

 

 

 





[jira] [Commented] (BEAM-9508) Python installation fails if grpc_tools is not installed

2020-03-16 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17060427#comment-17060427
 ] 

David Yan commented on BEAM-9508:
-

This is fixed by installing mypy-protobuf, which is not immediately obvious 
from the stacktrace.

I'll leave this ticket open for a better error message.

> Python installation fails if grpc_tools is not installed
> 
>
> Key: BEAM-9508
> URL: https://issues.apache.org/jira/browse/BEAM-9508
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: David Yan
>Priority: Major
>
> When installing from the master branch, I get the exception below. It looks 
> like the ImportError exception handler itself raises an exception. I'll 
> manually install grpc_tools and try again, but the handling of ImportError 
> has issues.
>  
> ```
> Traceback (most recent call last):
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 292, in generate_proto_files
>     from grpc_tools import protoc
> ModuleNotFoundError: No module named 'grpc_tools'
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
>     self.run()
>   File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in run
>     self._target(*self._args, **self._kwargs)
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 378, in _install_grpcio_tools_and_generate_proto_files
>     generate_proto_files(force=force)
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 315, in generate_proto_files
>     protoc_gen_mypy = _find_protoc_gen_mypy()
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 233, in _find_protoc_gen_mypy
>     (fname, ', '.join(search_paths)))
> RuntimeError: Could not find protoc-gen-mypy in /root/apache-beam-custom/bin, /root/apache-beam-custom/bin, /usr/local/bin, /opt/conda/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
> ```





[jira] [Created] (BEAM-9508) Python installation fails if grpc_tools is not installed

2020-03-16 Thread David Yan (Jira)
David Yan created BEAM-9508:
---

 Summary: Python installation fails if grpc_tools is not installed
 Key: BEAM-9508
 URL: https://issues.apache.org/jira/browse/BEAM-9508
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: David Yan


When installing from the master branch, I get the exception below. It looks like 
the ImportError exception handler itself raises an exception. I'll manually 
install grpc_tools and try again, but the handling of ImportError has issues.
 
```
Traceback (most recent call last):
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 292, in generate_proto_files
    from grpc_tools import protoc
ModuleNotFoundError: No module named 'grpc_tools'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 378, in _install_grpcio_tools_and_generate_proto_files
    generate_proto_files(force=force)
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 315, in generate_proto_files
    protoc_gen_mypy = _find_protoc_gen_mypy()
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 233, in _find_protoc_gen_mypy
    (fname, ', '.join(search_paths)))
RuntimeError: Could not find protoc-gen-mypy in /root/apache-beam-custom/bin, /root/apache-beam-custom/bin, /usr/local/bin, /opt/conda/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
```
```
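
One way to make this kind of failure easier to diagnose is to turn the ImportError into an error that names the missing module and says how to fix it, while keeping the original exception as the cause. This is a sketch only: `require_module` and the hint text are hypothetical and do not reflect how gen_protos.py is actually written.

```python
import importlib


def require_module(name, hint):
    """Import `name`, or raise a RuntimeError that explains how to fix it.

    `hint` is free-form text supplied by the caller, for example the
    pip command that installs the missing package.
    """
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # `from exc` keeps the original ImportError visible in the traceback.
        raise RuntimeError(
            "Could not import %r. %s" % (name, hint)) from exc
```

A caller might use `require_module("grpc_tools", "Try installing grpcio-tools and mypy-protobuf.")`, where the wording of the hint is an assumption drawn from this ticket.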





[jira] [Updated] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

2020-03-11 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-9487:

Labels: EaseOfUse starter  (was: starter)

> GBKs on unbounded pcolls with global windows and no triggers should fail
> 
>
> Key: BEAM-9487
> URL: https://issues.apache.org/jira/browse/BEAM-9487
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>  Labels: EaseOfUse, starter
>
> This is according to "4.2.2.1 GroupByKey and unbounded PCollections" in 
> https://beam.apache.org/documentation/programming-guide/.
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded 
> PCollections without setting either a non-global windowing strategy, a 
> trigger strategy, or both for each collection, Beam generates an 
> IllegalStateException error at pipeline construction time.
> Example where this doesn't happen in Python SDK: 
> https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is 
> unbounded, uses global window, and has no triggers.
> {code}
>   def test_global_window_gbk_fail(self):
>     with TestPipeline() as p:
>       test_stream = TestStream()
>       _ = p | test_stream | GroupByKey()
> {code}
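
The rule quoted from the programming guide can be written as a small predicate. The sketch below is independent of the Beam SDK: `CollectionInfo` and `check_group_by_key` are hypothetical names, and the three boolean fields stand in for whatever the SDK actually tracks about boundedness, windowing and triggers.

```python
from dataclasses import dataclass


@dataclass
class CollectionInfo:
    """Hypothetical summary of the properties the rule depends on."""
    is_bounded: bool
    uses_global_window: bool
    has_non_default_trigger: bool


def check_group_by_key(info):
    """Raise ValueError when the quoted rule says grouping must be rejected."""
    if (not info.is_bounded
            and info.uses_global_window
            and not info.has_non_default_trigger):
        raise ValueError(
            "GroupByKey cannot be applied to an unbounded PCollection "
            "with global windowing and no trigger. Set a non-global "
            "windowing strategy, a trigger, or both.")
```

Under this rule the `test_stream` case in the unit test above would be rejected, because it is unbounded, globally windowed and has no trigger.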





[jira] [Resolved] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-3453.
-
Fix Version/s: 2.20.0
   Resolution: Fixed

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Assignee: David Yan
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Assigned] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan reassigned BEAM-3453:
---

Assignee: David Yan

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Assignee: David Yan
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Commented] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033859#comment-17033859
 ] 

David Yan commented on BEAM-3453:
-

This is fixed by [GitHub Pull Request 
#10762|https://github.com/apache/beam/pull/10762].

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Created] (BEAM-8415) Improve error message when adding a PTransform with a name that already exists in the pipeline

2019-10-16 Thread David Yan (Jira)
David Yan created BEAM-8415:
---

 Summary: Improve error message when adding a PTransform with a 
name that already exists in the pipeline
 Key: BEAM-8415
 URL: https://issues.apache.org/jira/browse/BEAM-8415
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: David Yan


Currently, when trying to apply a PTransform with a name that already exists in 
the pipeline, it returns a confusing error:

Transform "XXX" does not have a stable unique label. This will prevent updating 
of pipelines. To apply a transform with a specified label write pvalue | 
"label" >> transform

We'd like to improve this error message.
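
To illustrate what a clearer message could look like, here is a sketch of a label registry that states the actual problem, a duplicate label, and suggests the fix. `LabelRegistry` and `DuplicateLabelError` are hypothetical names and are not how the Beam pipeline tracks applied labels.

```python
class DuplicateLabelError(ValueError):
    """Raised when a transform label is applied twice (hypothetical)."""


class LabelRegistry:
    """Tracks the labels already used in a pipeline (illustrative only)."""

    def __init__(self):
        self._labels = set()

    def register(self, label):
        if label in self._labels:
            raise DuplicateLabelError(
                'A transform with label "%s" already exists in the '
                'pipeline. Give this transform a unique label, for '
                'example: pvalue | "another label" >> transform' % label)
        self._labels.add(label)
```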





[jira] [Created] (BEAM-7982) Dataflow runner needs to identify the new format of metric names for distribution metrics

2019-08-14 Thread David Yan (JIRA)
David Yan created BEAM-7982:
---

 Summary: Dataflow runner needs to identify the new format of 
metric names for distribution metrics
 Key: BEAM-7982
 URL: https://issues.apache.org/jira/browse/BEAM-7982
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan


For example, 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_metrics.py#L157]

uses [MAX], [MIN], etc. but the new format will be _MAX, _MIN, etc.
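
A sketch of a parser that accepts both spellings follows. The function name `split_distribution_name` is hypothetical, and the set of suffixes beyond MAX and MIN is an assumption, since the ticket only says "etc.".

```python
import re

# MAX and MIN come from the ticket; MEAN and COUNT are assumed.
_KINDS = ("MAX", "MIN", "MEAN", "COUNT")
_SUFFIX_RE = re.compile(
    r"^(?P<base>.+?)(?:\[(?P<old>%s)\]|_(?P<new>%s))$"
    % ("|".join(_KINDS), "|".join(_KINDS)))


def split_distribution_name(name):
    """Split a metric name into (base, kind); kind is None if no suffix."""
    match = _SUFFIX_RE.match(name)
    if match is None:
        return name, None
    return match.group("base"), match.group("old") or match.group("new")
```

Note that the underscore form is ambiguous for a metric whose own name happens to end in `_MAX`, so a real implementation would need some other signal to tell the two cases apart.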



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7957) Warn at job submit time if a step is named with a / or empty in DataflowRunner

2019-08-12 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7957:

Summary: Warn at job submit time if a step is named with a / or empty in 
DataflowRunner  (was: Warn users if a step is named with a / or empty in 
DataflowRunner)

> Warn at job submit time if a step is named with a / or empty in DataflowRunner
> --
>
> Key: BEAM-7957
> URL: https://issues.apache.org/jira/browse/BEAM-7957
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: David Yan
>Priority: Major
>
> When a job has an empty step name or a step name that contains a "/", it 
> quietly breaks the job graph in the Dataflow UI. We should at least warn the 
> user at job submit time.





[jira] [Created] (BEAM-7957) Warn users if a step is named with a / or empty in DataflowRunner

2019-08-12 Thread David Yan (JIRA)
David Yan created BEAM-7957:
---

 Summary: Warn users if a step is named with a / or empty in 
DataflowRunner
 Key: BEAM-7957
 URL: https://issues.apache.org/jira/browse/BEAM-7957
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan


When a job has an empty step name or a step name that contains a "/", it 
quietly breaks the job graph in the Dataflow UI. We should at least warn the 
user at job submit time.
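
The check itself is small. The sketch below assumes the step names are available as plain strings at submit time; `find_bad_step_names` and `warn_on_bad_step_names` are hypothetical helpers, not DataflowRunner APIs.

```python
import warnings


def find_bad_step_names(step_names):
    """Return (name, reason) pairs for names that are empty or contain "/"."""
    problems = []
    for name in step_names:
        if not name:
            problems.append((name, "step name is empty"))
        elif "/" in name:
            problems.append((name, 'step name contains "/"'))
    return problems


def warn_on_bad_step_names(step_names):
    """Emit one warning per problematic step name."""
    for name, reason in find_bad_step_names(step_names):
        warnings.warn("Step %r: %s" % (name, reason))
```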





[jira] [Resolved] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-7876.
-
   Resolution: Fixed
Fix Version/s: 2.15.0

> Interactive Beam example does not work with Python3
> ---
>
> Key: BEAM-7876
> URL: https://issues.apache.org/jira/browse/BEAM-7876
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: David Yan
>Priority: Major
> Fix For: 2.15.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When going through the example  
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
>  using Jupyter Notebook running in Python 3, the run() method throws the 
> following error:
> {{TypeError Traceback (most recent call last)}}
> {{ in }}
> {{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
> {{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
> {{> 5 result = p.run()}}
> {{ 6 result.wait_until_finish()}}
> {{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
> {{ 404 self.to_runner_api(use_fake_coders=True),}}
> {{ 405 self.runner,}}
> {{--> 406 self._options).run(False)}}
> {{ 407 }}
> {{ 408 if self._options.view_as(TypeOptions).runtime_type_check:}}
> {{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
> {{ 417 finally:}}
> {{ 418 shutil.rmtree(tmpdir)}}
> {{--> 419 return self.runner.run_pipeline(self, self._options)}}
> {{ 420 }}
> {{ 421 def __enter__(self):}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in run_pipeline(self, pipeline, options)}}
> {{ 142 cache_manager=self._cache_manager,}}
> {{ 143 pipeline_graph_renderer=self._renderer)}}
> {{--> 144 display.start_periodic_update()}}
> {{ 145 result = pipeline_to_execute.run()}}
> {{ 146 result.wait_until_finish()}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in start_periodic_update(self)}}
> {{ 158 def start_periodic_update(self):}}
> {{ 159 """Start a thread that periodically updates the display."""}}
> {{--> 160 self.update_display(True)}}
> {{ 161 self._periodic_update = True}}
> {{ 162}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in update_display(self, force)}}
> {{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
> {{ 150 self._pipeline_graph)}}
> {{--> 151 display.display(display.HTML(rendered_graph))}}
> {{ 152 }}
> {{ 153 _display_progress('Running...')}}
> {{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)}}
> {{ 691 return prefix.startswith("")}}
> {{ 692 }}
> {{--> 693 if warn():}}
> {{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
> {{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, metadata=metadata)}}
> {{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in warn()}}
> {{ 689 prefix = data[:10].lower()}}
> {{ 690 suffix = data[-10:].lower()}}
> {{--> 691 return prefix.startswith(" suffix.endswith("")}}
> {{ 692 }}
> {{ 693 if warn():}}
> {{TypeError: startswith first arg must be bytes or a tuple of bytes, not str}}
>  
>  
>  
> This does not happen with Python 2.
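
The final TypeError is the one Python 3 raises when `bytes.startswith` is given a `str` argument, which suggests the rendered graph reached `display.HTML` as bytes. A minimal sketch of one possible remedy is to decode before displaying. The helper `ensure_text` is hypothetical and is not necessarily the change that resolved this ticket.

```python
def ensure_text(data, encoding="utf-8"):
    """Return `data` as str, decoding it first if it is bytes."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Reproduce the failure mode on a stand-in value, then apply the helper.
rendered = b"<svg></svg>"
try:
    rendered[:10].lower().startswith("<svg")
except TypeError as exc:
    print("bytes input fails:", exc)
print("decoded input works:", ensure_text(rendered).startswith("<svg"))
```

In Python 2 the `str` type is itself a byte string, so the same comparison succeeds there, which is consistent with the report that the problem only appears under Python 3.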



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Status: Open  (was: Triage Needed)

> Interactive Beam example does not work with Python3
> ---
>
> Key: BEAM-7876
> URL: https://issues.apache.org/jira/browse/BEAM-7876
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: David Yan
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When going through the example  
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
>  using Jupyter Notebook running in Python 3, the run() method throws the 
> following error:
> {{TypeError Traceback (most recent call last)}}
> {{ in }}
> {{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
> {{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
> {{> 5 result = p.run()}}
> {{ 6 result.wait_until_finish()}}
> {{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
> {{ 404 self.to_runner_api(use_fake_coders=True),}}
> {{ 405 self.runner,}}
> {{--> 406 self._options).run(False)}}
> {{ 407 }}
> {{ 408 if self._options.view_as(TypeOptions).runtime_type_check:}}
> {{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
> {{ 417 finally:}}
> {{ 418 shutil.rmtree(tmpdir)}}
> {{--> 419 return self.runner.run_pipeline(self, self._options)}}
> {{ 420 }}
> {{ 421 def __enter__(self):}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in run_pipeline(self, pipeline, options)}}
> {{ 142 cache_manager=self._cache_manager,}}
> {{ 143 pipeline_graph_renderer=self._renderer)}}
> {{--> 144 display.start_periodic_update()}}
> {{ 145 result = pipeline_to_execute.run()}}
> {{ 146 result.wait_until_finish()}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in start_periodic_update(self)}}
> {{ 158 def start_periodic_update(self):}}
> {{ 159 """Start a thread that periodically updates the display."""}}
> {{--> 160 self.update_display(True)}}
> {{ 161 self._periodic_update = True}}
> {{ 162}}
> {{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in update_display(self, force)}}
> {{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
> {{ 150 self._pipeline_graph)}}
> {{--> 151 display.display(display.HTML(rendered_graph))}}
> {{ 152 }}
> {{ 153 _display_progress('Running...')}}
> {{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)}}
> {{ 691 return prefix.startswith("")}}
> {{ 692 }}
> {{--> 693 if warn():}}
> {{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
> {{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, metadata=metadata)}}
> {{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in warn()}}
> {{ 689 prefix = data[:10].lower()}}
> {{ 690 suffix = data[-10:].lower()}}
> {{--> 691 return prefix.startswith(" suffix.endswith("")}}
> {{ 692 }}
> {{ 693 if warn():}}
> {{TypeError: startswith first arg must be bytes or a tuple of bytes, not str}}
>  
>  
>  
> This does not happen with Python 2.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Description: 
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws the 
following error:

{{TypeError Traceback (most recent call last)}}
{{ in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{> 5 result = p.run()}}
{{ 6 result.wait_until_finish()}}
{{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{--> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if self._options.view_as(TypeOptions).runtime_type_check:}}
{{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{--> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def __enter__(self):}}
{{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{--> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 result.wait_until_finish()}}
{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{--> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 162}}
{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{--> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 _display_progress('Running...')}}
{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 }}
{{--> 693 if warn():}}
{{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
{{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, metadata=metadata)}}
{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in warn()}}
{{ 689 prefix = data[:10].lower()}}
{{ 690 suffix = data[-10:].lower()}}
{{--> 691 return prefix.startswith("")}}
{{ 692 }}
{{ 693 if warn():}}
{{TypeError: startswith first arg must be bytes or a tuple of bytes, not str}}

 

 

 

This does not happen with Python 2.

  was:
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws an error 
the following error:

{{TypeError Traceback (most recent call last)}}
{{  in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{ > 5 result = p.run()}}
{{ 6 result.wait_until_finish()~/beam/sdks/python/apache_beam/pipeline.py in 
run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{ --> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if 
self._options.view_as(TypeOptions).runtime_type_check:~/beam/sdks/python/apache_beam/pipeline.py
 in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{ --> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def 
__enter__(self):~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
 in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{ --> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 
result.wait_until_finish()~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{ --> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 
162~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{ --> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 
_display_progress('Running...')~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 

[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Description: 
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws the 
following error:

{{TypeError Traceback (most recent call last)}}
{{  in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{ > 5 result = p.run()}}
{{ 6 result.wait_until_finish()}}

{{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{ --> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if self._options.view_as(TypeOptions).runtime_type_check:}}

{{~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{ --> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def __enter__(self):}}

{{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{ --> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 result.wait_until_finish()}}

{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{ --> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 162}}

{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{ --> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 _display_progress('Running...')}}

{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 }}
{{ --> 693 if warn():}}
{{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
{{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, metadata=metadata)}}

{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py in warn()}}
{{ 689 prefix = data[:10].lower()}}
{{ 690 suffix = data[-10:].lower()}}
{{ --> 691 return prefix.startswith("")}}
{{ 692 }}
{{ 693 if warn():}}

{{TypeError: startswith first arg must be bytes or a tuple of bytes, not str}}

 

 

 

This does not happen with Python 2.

  was:
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws an error:

TypeError Traceback (most recent call last)
 in 
 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)
 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)
> 5 result = p.run()
 6 result.wait_until_finish()

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 404 self.to_runner_api(use_fake_coders=True),
 405 self.runner,
--> 406 self._options).run(False)
 407 
 408 if self._options.view_as(TypeOptions).runtime_type_check:

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 417 finally:
 418 shutil.rmtree(tmpdir)
--> 419 return self.runner.run_pipeline(self, self._options)
 420 
 421 def __enter__(self):

~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in 
run_pipeline(self, pipeline, options)
 142 cache_manager=self._cache_manager,
 143 pipeline_graph_renderer=self._renderer)
--> 144 display.start_periodic_update()
 145 result = pipeline_to_execute.run()
 146 result.wait_until_finish()

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in start_periodic_update(self)
 158 def start_periodic_update(self):
 159 """Start a thread that periodically updates the display."""
--> 160 self.update_display(True)
 161 self._periodic_update = True
 162

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in update_display(self, force)
 149 rendered_graph = self._renderer.render_pipeline_graph(
 150 self._pipeline_graph)
--> 151 display.display(display.HTML(rendered_graph))
 152 
 153 _display_progress('Running...')

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)
 691 return prefix.startswith("")
 692 
--> 693 if warn():
 694 warnings.warn("Consider using IPython.display.IFrame instead")
 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 

[jira] [Created] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)
David Yan created BEAM-7876:
---

 Summary: Interactive Beam example does not work with Python3
 Key: BEAM-7876
 URL: https://issues.apache.org/jira/browse/BEAM-7876
 Project: Beam
  Issue Type: Bug
  Components: examples-python
Reporter: David Yan


When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws an error:

TypeError Traceback (most recent call last)
 in 
 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)
 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)
> 5 result = p.run()
 6 result.wait_until_finish()

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 404 self.to_runner_api(use_fake_coders=True),
 405 self.runner,
--> 406 self._options).run(False)
 407 
 408 if self._options.view_as(TypeOptions).runtime_type_check:

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 417 finally:
 418 shutil.rmtree(tmpdir)
--> 419 return self.runner.run_pipeline(self, self._options)
 420 
 421 def __enter__(self):

~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in 
run_pipeline(self, pipeline, options)
 142 cache_manager=self._cache_manager,
 143 pipeline_graph_renderer=self._renderer)
--> 144 display.start_periodic_update()
 145 result = pipeline_to_execute.run()
 146 result.wait_until_finish()

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in start_periodic_update(self)
 158 def start_periodic_update(self):
 159 """Start a thread that periodically updates the display."""
--> 160 self.update_display(True)
 161 self._periodic_update = True
 162

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in update_display(self, force)
 149 rendered_graph = self._renderer.render_pipeline_graph(
 150 self._pipeline_graph)
--> 151 display.display(display.HTML(rendered_graph))
 152 
 153 _display_progress('Running...')

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)
 691 return prefix.startswith("")
 692 
--> 693 if warn():
 694 warnings.warn("Consider using IPython.display.IFrame instead")
 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
metadata=metadata)

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in warn()
 689 prefix = data[:10].lower()
 690 suffix = data[-10:].lower()
--> 691 return prefix.startswith("")
 692 
 693 if warn():

TypeError: startswith first arg must be bytes or a tuple of bytes, not str
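
The last frame suggests that `data` reaches IPython's `HTML` constructor as `bytes` rather than `str`, which would also explain why Python 2 (where the two are the same type) is unaffected. A minimal sketch of that failure mode, assuming a bytes payload; `rendered_graph` below is a hypothetical stand-in value, not the output of Beam's renderer, and decoding it is only one possible workaround, not necessarily the fix Beam adopted:

```python
# Hypothetical stand-in for the rendered pipeline graph (assumed to be bytes).
rendered_graph = b"<svg>...</svg>"


def starts_with_tag(data):
    """Mimic the str-prefix check from the traceback; report a TypeError if raised."""
    prefix = data[:10].lower()
    try:
        return prefix.startswith("<")
    except TypeError as exc:
        return "TypeError: %s" % exc


# On Python 3, bytes.startswith() rejects a str argument.
print(starts_with_tag(rendered_graph))

# Decoding to str first (UTF-8 is an assumption) lets the same check succeed.
print(starts_with_tag(rendered_graph.decode("utf-8")))
```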

 

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (BEAM-7408) Beam Programming Guide inconsistencies

2019-06-05 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-7408.
-
Resolution: Fixed

> Beam Programming Guide inconsistencies
> --
>
> Key: BEAM-7408
> URL: https://issues.apache.org/jira/browse/BEAM-7408
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Affects Versions: Not applicable
>Reporter: David Yan
>Priority: Major
>  Labels: documentation, newbie
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://beam.apache.org/documentation/programming-guide/]
>  
> Pipeline option example:
>  
> Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
> while Python and Go have "input" and "output".
>  
> When Python is chosen, the doc says --myCustomOption=value is supported, 
> which only corresponds to the Java example.
>  
> Reading from external source:
>  
> Java, Python and Go are not consistent. Python example reads from a GCS file, 
> while others specify a generic file.
> [https://beam.apache.org/documentation/programming-guide/#applying-transforms]:
>  The last workflow graph does not correspond to the code example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7408) Beam Programming Guide inconsistencies

2019-06-05 Thread David Yan (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16857005#comment-16857005
 ] 

David Yan commented on BEAM-7408:
-

Yes, thank you. :)

> Beam Programming Guide inconsistencies
> --
>
> Key: BEAM-7408
> URL: https://issues.apache.org/jira/browse/BEAM-7408
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Affects Versions: Not applicable
>Reporter: David Yan
>Priority: Major
>  Labels: documentation, newbie
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://beam.apache.org/documentation/programming-guide/]
>  
> Pipeline option example:
>  
> Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
> while Python and Go have "input" and "output".
>  
> When Python is chosen, the doc says --myCustomOption=value is supported, 
> which only corresponds to the Java example.
>  
> Reading from external source:
>  
> Java, Python and Go are not consistent. Python example reads from a GCS file, 
> while others specify a generic file.
> [https://beam.apache.org/documentation/programming-guide/#applying-transforms]:
>  The last workflow graph does not correspond to the code example.





[jira] [Created] (BEAM-7408) Beam Programming Guide inconsistencies

2019-05-23 Thread David Yan (JIRA)
David Yan created BEAM-7408:
---

 Summary: Beam Programming Guide inconsistencies
 Key: BEAM-7408
 URL: https://issues.apache.org/jira/browse/BEAM-7408
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: David Yan


[https://beam.apache.org/documentation/programming-guide/]

 

Pipeline option example:

 

Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
while Python and Go have "input" and "output".

 

When Python is chosen, the doc says --myCustomOption=value is supported, which 
only corresponds to the Java example.

 

Reading from external source:

 

Java, Python and Go are not consistent. Python example reads from a GCS file, 
while others specify a generic file.


[https://beam.apache.org/documentation/programming-guide/#applying-transforms]: 
The last workflow graph does not correspond to the code example.
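
To make the pipeline-option mismatch concrete: the Python SDK registers custom options through argparse-style `add_argument` calls, so a Python example matching the `--myCustomOption=value` text would have to declare that flag itself. A rough sketch using only the standard-library `argparse` rather than Beam's `PipelineOptions`; the flag name is borrowed from the Java example and the default value is hypothetical:

```python
import argparse

# Plain argparse stand-in; Beam's PipelineOptions is not used here.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--myCustomOption",
    default="default-value",  # hypothetical default
    help="Illustrative custom option mirroring the Java example's name.")

# parse_known_args() leaves unrelated flags (e.g. --input) for other consumers.
args, remaining = parser.parse_known_args(
    ["--myCustomOption=value", "--input=some-file.txt"])

print(args.myCustomOption)
print(remaining)
```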





[jira] [Created] (BEAM-7215) Wordcount example page does not tell the user to create the maven project using archetype

2019-05-02 Thread David Yan (JIRA)
David Yan created BEAM-7215:
---

 Summary: Wordcount example page does not tell the user to create 
the maven project using archetype
 Key: BEAM-7215
 URL: https://issues.apache.org/jira/browse/BEAM-7215
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: David Yan


[https://beam.apache.org/get-started/wordcount-example/#wordcount-example] does 
not link back to 
[https://beam.apache.org/get-started/quickstart-java/#get-the-wordcount-code]. 
A user who reaches the first page directly (from a search engine, say) and 
follows only its instructions gets:

{{$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount 
-Dexec.args="--runner=DataflowRunner 
--gcpTempLocation=gs://clouddfe-test/staging-$USER 
--inputFile=gs://apache-beam-samples/shakespeare/* 
--output=gs://world-readable-mkcq69tkcu/$USER/result.txt" -Pdataflow-runner 
[INFO] Scanning for projects...
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 0.068 s
[INFO] Finished at: 2019-05-02T13:32:15-07:00
[INFO] Final Memory: 23M/1948M
[INFO] 
[WARNING] The requested profile "dataflow-runner" could not be activated 
because it does not exist.
[ERROR] The goal you specified requires a project to execute but there is no 
POM in this directory (/usr/local/google/home/davidyan/beam). Please verify you 
invoked Maven from the correct directory. -> [Help 1]}}





[jira] [Created] (BEAM-7020) Reduce the log severity of profiling agent discovery

2019-04-05 Thread David Yan (JIRA)
David Yan created BEAM-7020:
---

 Summary: Reduce the log severity of profiling agent discovery
 Key: BEAM-7020
 URL: https://issues.apache.org/jira/browse/BEAM-7020
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan


Example:

[https://github.com/apache/beam/blob/b953645ed6db837d24284d7fe1fe091e7309f821/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/profiler/ScopedProfiler.java#L138]

These messages should not be logged at warning severity, even if the profiling 
agent is not present, since in most cases users do not run their jobs with 
profiling.





[jira] [Created] (BEAM-6918) Github link requires login and example link is broken

2019-03-26 Thread David Yan (JIRA)
David Yan created BEAM-6918:
---

 Summary: Github link requires login and example link is broken
 Key: BEAM-6918
 URL: https://issues.apache.org/jira/browse/BEAM-6918
 Project: Beam
  Issue Type: Improvement
  Components: examples-python
Reporter: David Yan


Two minor issues in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]

1. The SSH form, git clone g...@github.com:apache/beam.git, requires the user to 
be authenticated with GitHub, while cloning from https://github.com/apache/beam 
does not.
2. Spaces in the example link need to be escaped.
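
For the second point, spaces in a URL path are percent-encoded as `%20`. A small sketch using Python's standard `urllib.parse.quote`; the notebook path below is an assumed example, not necessarily the exact link in the README:

```python
from urllib.parse import quote

# Assumed example path containing spaces.
path = "examples/Interactive Beam Example.ipynb"

# quote() leaves "/" unescaped by default and encodes each space as %20.
escaped = quote(path)
print(escaped)
```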


