[jira] [Assigned] (BEAM-10291) Lull detection log to include full thread dump

2020-10-20 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan reassigned BEAM-10291:


Assignee: David Yan

> Lull detection log to include full thread dump
> --
>
> Key: BEAM-10291
> URL: https://issues.apache.org/jira/browse/BEAM-10291
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: David Yan
>Assignee: David Yan
>Priority: P2
>  Labels: stale-P2
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> What we have today is a thread dump of the thread that's stuck, but in many 
> cases (most notably BQ) I/O happens in a separate thread that is not included 
> in the dump. Ideally, we'd need to have a full thread dump of the entire 
> process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-10291) Lull detection log to include full thread dump

2020-10-20 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-10291:
-
Status: Resolved  (was: Open)

> Lull detection log to include full thread dump
> --
>
> Key: BEAM-10291
> URL: https://issues.apache.org/jira/browse/BEAM-10291
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: David Yan
>Assignee: David Yan
>Priority: P2
>  Labels: stale-P2
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> What we have today is a thread dump of the thread that's stuck, but in many 
> cases (most notably BQ) I/O happens in a separate thread that is not included 
> in the dump. Ideally, we'd need to have a full thread dump of the entire 
> process.





[jira] [Comment Edited] (BEAM-8551) Beam Python containers should include all Beam SDK dependencies, and not have conflicting dependencies

2020-08-27 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186178#comment-17186178
 ] 

David Yan edited comment on BEAM-8551 at 8/28/20, 12:16 AM:


Also BEAM-10827 is the latest issue that is caused by lack of dependency 
presubmit check. I'm raising the priority of this issue.


was (Author: davidyan):
Also BEAM-10827 is another issue that is caused by lack of dependency presubmit 
check. I'm raising the priority of this issue.

> Beam Python containers should include all Beam SDK dependencies, and not have 
> conflicting dependencies
> --
>
> Key: BEAM-8551
> URL: https://issues.apache.org/jira/browse/BEAM-8551
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P1
>
> Checks could be introduced during container creation, and be enforced by 
> ValidatesContainer test suites. We could:
> - Check pip output or status code for incompatible dependency errors.
> - Remove internet access when installing apache-beam in the container, to 
> make sure all dependencies are installed.





[jira] [Commented] (BEAM-8551) Beam Python containers should include all Beam SDK dependencies, and not have conflicting dependencies

2020-08-27 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186178#comment-17186178
 ] 

David Yan commented on BEAM-8551:
-

Also BEAM-10827 is another issue that is caused by lack of dependency presubmit 
check. I'm raising the priority of this issue.

> Beam Python containers should include all Beam SDK dependencies, and not have 
> conflicting dependencies
> --
>
> Key: BEAM-8551
> URL: https://issues.apache.org/jira/browse/BEAM-8551
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P2
>
> Checks could be introduced during container creation, and be enforced by 
> ValidatesContainer test suites. We could:
> - Check pip output or status code for incompatible dependency errors.
> - Remove internet access when installing apache-beam in the container, to 
> make sure all dependencies are installed.





[jira] [Updated] (BEAM-8551) Beam Python containers should include all Beam SDK dependencies, and not have conflicting dependencies

2020-08-27 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-8551:

Priority: P1  (was: P2)

> Beam Python containers should include all Beam SDK dependencies, and not have 
> conflicting dependencies
> --
>
> Key: BEAM-8551
> URL: https://issues.apache.org/jira/browse/BEAM-8551
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P1
>
> Checks could be introduced during container creation, and be enforced by 
> ValidatesContainer test suites. We could:
> - Check pip output or status code for incompatible dependency errors.
> - Remove internet access when installing apache-beam in the container, to 
> make sure all dependencies are installed.





[jira] [Commented] (BEAM-8415) Improve error message when adding a PTransform with a name that already exists in the pipeline

2020-07-20 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161538#comment-17161538
 ] 

David Yan commented on BEAM-8415:
-

For Java, it looks like the check is done when the pipeline is in the 
[validate|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java#L591]
stage, rather than when the PTransform is added to the pipeline as in Python, so 
we cannot simply reuse the Python error message for Java.

Should we change the term "stable unique" to just "unique"? I'm not sure 
what "stable unique" means, since AFAIK the PTransform label cannot be changed 
after the pipeline has been submitted.

> Improve error message when adding a PTransform with a name that already 
> exists in the pipeline
> --
>
> Key: BEAM-8415
> URL: https://issues.apache.org/jira/browse/BEAM-8415
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-py-core
>Reporter: David Yan
>Priority: P2
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, when trying to apply a PTransform with a name that already exists 
> in the pipeline, it returns a confusing error:
> Transform "XXX" does not have a stable unique label. This will prevent 
> updating of pipelines. To apply a transform with a specified label write 
> pvalue | "label" >> transform
> We'd like to improve this error message.
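
As an illustration of the kind of message this ticket asks for, here is a minimal sketch in plain Python. It does not use Beam; `LabelRegistry` and `register` are hypothetical names standing in for whatever bookkeeping the pipeline does when a transform is applied, and the wording of the message is only a suggestion.

```python
class LabelRegistry:
    """Hypothetical stand-in for a pipeline's record of applied labels."""

    def __init__(self):
        self._labels = set()

    def register(self, label):
        # Fail at apply time and say plainly that the label is a duplicate,
        # then show the labeling syntax the existing message already mentions.
        if label in self._labels:
            raise RuntimeError(
                'A transform with label "%s" already exists in the pipeline. '
                'Give each application its own label, for example: '
                'pvalue | "unique label" >> transform' % label)
        self._labels.add(label)
```

Naming the duplicate label and the cause first, and the remedy second, is the main design choice here; whether "stable" should stay in the wording is the open question from the comment above.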





[jira] [Created] (BEAM-10291) Lull detection log to include full thread dump

2020-06-21 Thread David Yan (Jira)
David Yan created BEAM-10291:


 Summary: Lull detection log to include full thread dump
 Key: BEAM-10291
 URL: https://issues.apache.org/jira/browse/BEAM-10291
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan
Assignee: David Yan


What we have today is a thread dump of the thread that's stuck, but in many 
cases (most notably BQ) I/O happens in a separate thread that is not included 
in the dump. Ideally, we'd need to have a full thread dump of the entire 
process.
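
To make the request concrete, here is a minimal sketch of a process-wide thread dump in Python. It is not the Dataflow worker's implementation; `full_thread_dump` is a hypothetical helper, and it relies on `sys._current_frames()`, which is specific to CPython.

```python
import sys
import threading
import traceback


def full_thread_dump():
    """Return the current stack of every live thread as one string."""
    # Map thread ids to names so each section of the dump is identifiable.
    names = {t.ident: t.name for t in threading.enumerate()}
    sections = []
    # sys._current_frames() returns the topmost frame of each thread,
    # keyed by thread id (CPython-specific).
    for thread_id, frame in sys._current_frames().items():
        header = "--- Thread %s (%s) ---" % (
            thread_id, names.get(thread_id, "unknown"))
        stack = "".join(traceback.format_stack(frame))
        sections.append(header + "\n" + stack)
    return "\n".join(sections)


if __name__ == "__main__":
    print(full_thread_dump())
```

A lull log that included this output would show the separate I/O thread as well as the thread that appears stuck.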





[jira] [Resolved] (BEAM-10247) google-api-core 1.20.0 is incompatible with the pinned version of grpc

2020-06-15 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-10247.
--
Fix Version/s: 2.23.0
   Resolution: Fixed

> google-api-core 1.20.0 is incompatible with the pinned version of grpc
> --
>
> Key: BEAM-10247
> URL: https://issues.apache.org/jira/browse/BEAM-10247
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: David Yan
>Priority: P1
> Fix For: 2.23.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> It looks like the google-api-core 1.20.0 has an issue with required 
> dependency or the lack thereof. This is causing this issue when using 
> datastore:
>  
> {{Traceback (most recent call last):}}
> {{ File "./query_license.py", line 11, in <module>}}
> {{ from google.cloud import datastore}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py", line 62, in <module>}}
> {{ from google.cloud.datastore.batch import Batch}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>}}
> {{ from google.cloud.datastore import helpers}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>}}
> {{ from google.cloud.datastore_v1.proto import datastore_pb2}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/__init__.py", line 18, in <module>}}
> {{ from google.cloud.datastore_v1.gapic import datastore_client}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 22, in <module>}}
> {{ import google.api_core.gapic_v1.client_info}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/__init__.py", line 26, in <module>}}
> {{ from google.api_core.gapic_v1 import method_async # noqa: F401}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/method_async.py", line 20, in <module>}}
> {{ from google.api_core import general_helpers, grpc_helpers_async}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/grpc_helpers_async.py", line 25, in <module>}}
> {{ from grpc.experimental import aio}}
> {{ ImportError: cannot import name 'aio' from 'grpc.experimental' (/root/apache-beam-custom/lib/python3.7/site-packages/grpc/experimental/__init__.py)}}





[jira] [Commented] (BEAM-10247) google-api-core 1.20.0 is incompatible with the pinned version of grpc

2020-06-11 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133814#comment-17133814
 ] 

David Yan commented on BEAM-10247:
--

This issue is the exact same issue described in 
[https://github.com/googleapis/python-api-core/issues/40]

> google-api-core 1.20.0 is incompatible with the pinned version of grpc
> --
>
> Key: BEAM-10247
> URL: https://issues.apache.org/jira/browse/BEAM-10247
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: David Yan
>Priority: P1
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It looks like the google-api-core 1.20.0 has an issue with required 
> dependency or the lack thereof. This is causing this issue when using 
> datastore:
>  
> {{Traceback (most recent call last):}}
> {{ File "./query_license.py", line 11, in <module>}}
> {{ from google.cloud import datastore}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py", line 62, in <module>}}
> {{ from google.cloud.datastore.batch import Batch}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>}}
> {{ from google.cloud.datastore import helpers}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>}}
> {{ from google.cloud.datastore_v1.proto import datastore_pb2}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/__init__.py", line 18, in <module>}}
> {{ from google.cloud.datastore_v1.gapic import datastore_client}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 22, in <module>}}
> {{ import google.api_core.gapic_v1.client_info}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/__init__.py", line 26, in <module>}}
> {{ from google.api_core.gapic_v1 import method_async # noqa: F401}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/method_async.py", line 20, in <module>}}
> {{ from google.api_core import general_helpers, grpc_helpers_async}}
> {{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/grpc_helpers_async.py", line 25, in <module>}}
> {{ from grpc.experimental import aio}}
> {{ ImportError: cannot import name 'aio' from 'grpc.experimental' (/root/apache-beam-custom/lib/python3.7/site-packages/grpc/experimental/__init__.py)}}





[jira] [Updated] (BEAM-10247) google-api-core 1.20.0 is incompatible with the pinned version of grpc

2020-06-11 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-10247:
-
Description: 
It looks like the google-api-core 1.20.0 has an issue with required dependency 
or the lack thereof. This is causing this issue when using datastore:

 

{{Traceback (most recent call last):}}
{{ File "./query_license.py", line 11, in <module>}}
{{ from google.cloud import datastore}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py", line 62, in <module>}}
{{ from google.cloud.datastore.batch import Batch}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>}}
{{ from google.cloud.datastore import helpers}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>}}
{{ from google.cloud.datastore_v1.proto import datastore_pb2}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/__init__.py", line 18, in <module>}}
{{ from google.cloud.datastore_v1.gapic import datastore_client}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 22, in <module>}}
{{ import google.api_core.gapic_v1.client_info}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/__init__.py", line 26, in <module>}}
{{ from google.api_core.gapic_v1 import method_async # noqa: F401}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/method_async.py", line 20, in <module>}}
{{ from google.api_core import general_helpers, grpc_helpers_async}}
{{ File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/grpc_helpers_async.py", line 25, in <module>}}
{{ from grpc.experimental import aio}}
{{ ImportError: cannot import name 'aio' from 'grpc.experimental' (/root/apache-beam-custom/lib/python3.7/site-packages/grpc/experimental/__init__.py)}}

  was:
It looks like the google-api-core 1.20.0 has an issue with required dependency 
or the lack thereof. This is causing this issue when using datastore:

```
Traceback (most recent call last):
  File "./query_license.py", line 11, in <module>
    from google.cloud import datastore
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py", line 62, in <module>
    from google.cloud.datastore.batch import Batch
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>
    from google.cloud.datastore import helpers
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/__init__.py", line 18, in <module>
    from google.cloud.datastore_v1.gapic import datastore_client
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 22, in <module>
    import google.api_core.gapic_v1.client_info
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/__init__.py", line 26, in <module>
    from google.api_core.gapic_v1 import method_async  # noqa: F401
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/method_async.py", line 20, in <module>
    from google.api_core import general_helpers, grpc_helpers_async
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/grpc_helpers_async.py", line 25, in <module>
    from grpc.experimental import aio
ImportError: cannot import name 'aio' from 'grpc.experimental' (/root/apache-beam-custom/lib/python3.7/site-packages/grpc/experimental/__init__.py)
```


> google-api-core 1.20.0 is incompatible with the pinned version of grpc
> --
>
> Key: BEAM-10247
> URL: https://issues.apache.org/jira/browse/BEAM-10247
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: David Yan
>Priority: P1
>
> It looks like the google-api-core 1.20.0 has an issue with required 
> dependency or the lack thereof. This is causing this issue when using 
> datastore:
>  
> {{Traceback (most recent call last):}}
> {{ File "./query_license.py", line 11, in }}
> {{ from google.cloud import datastore}}
> {{ File 
> "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py",
>  line 62, in }}
> {{ from google.cloud.datastore.batch import Batch}}
> {{ File 
> "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py",
>  line 24, in }}
> {{ from google.cloud.datastore import helpers}}
> {{ File 
> "/root/apache-beam-custom/lib/python3.

[jira] [Created] (BEAM-10247) google-api-core 1.20.0 is incompatible with the pinned version of grpc

2020-06-11 Thread David Yan (Jira)
David Yan created BEAM-10247:


 Summary: google-api-core 1.20.0 is incompatible with the pinned 
version of grpc
 Key: BEAM-10247
 URL: https://issues.apache.org/jira/browse/BEAM-10247
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: David Yan
Assignee: David Yan


It looks like the google-api-core 1.20.0 has an issue with required dependency 
or the lack thereof. This is causing this issue when using datastore:

```
Traceback (most recent call last):
  File "./query_license.py", line 11, in <module>
    from google.cloud import datastore
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/__init__.py", line 62, in <module>
    from google.cloud.datastore.batch import Batch
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/batch.py", line 24, in <module>
    from google.cloud.datastore import helpers
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore/helpers.py", line 29, in <module>
    from google.cloud.datastore_v1.proto import datastore_pb2
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/__init__.py", line 18, in <module>
    from google.cloud.datastore_v1.gapic import datastore_client
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/cloud/datastore_v1/gapic/datastore_client.py", line 22, in <module>
    import google.api_core.gapic_v1.client_info
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/__init__.py", line 26, in <module>
    from google.api_core.gapic_v1 import method_async  # noqa: F401
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/gapic_v1/method_async.py", line 20, in <module>
    from google.api_core import general_helpers, grpc_helpers_async
  File "/root/apache-beam-custom/lib/python3.7/site-packages/google/api_core/grpc_helpers_async.py", line 25, in <module>
    from grpc.experimental import aio
ImportError: cannot import name 'aio' from 'grpc.experimental' (/root/apache-beam-custom/lib/python3.7/site-packages/grpc/experimental/__init__.py)
```
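
One quick way to confirm this kind of mismatch in a given environment is to probe the imports directly. The sketch below is only a diagnostic aid, not part of any fix; `can_import` is a hypothetical helper name, and the module names probed are the ones from the traceback above.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package and a missing submodule are both reported as False.
        return False
    return True


if __name__ == "__main__":
    # In an environment showing the failure above, the second probe
    # would be expected to report False.
    for name in ("grpc", "grpc.experimental.aio"):
        print(name, can_import(name))
```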





[jira] [Commented] (BEAM-8551) Beam Python containers should include all Beam SDK dependencies, and do not have conflicting dependencies

2020-03-17 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061206#comment-17061206
 ] 

David Yan commented on BEAM-8551:
-

`pip check` is another way to check for broken dependencies.

> Beam Python containers should include all Beam SDK dependencies, and do not 
> have conflicting dependencies
> -
>
> Key: BEAM-8551
> URL: https://issues.apache.org/jira/browse/BEAM-8551
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> Checks could be introduced during container creation, and be enforced by 
> ValidatesContainer test suites. We could:
> - Check pip output or status code for incompatible dependency errors.
> - Remove internet access when installing apache-beam in the container, to 
> make sure all dependencies are installed.





[jira] [Closed] (BEAM-9530) Add `pip check` to ensure good python dependencies

2020-03-17 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed BEAM-9530.
---
Fix Version/s: Not applicable
   Resolution: Duplicate

> Add `pip check` to ensure good python dependencies
> --
>
> Key: BEAM-9530
> URL: https://issues.apache.org/jira/browse/BEAM-9530
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: David Yan
>Priority: Major
> Fix For: Not applicable
>
>
> We should add {{pip check}} after pip install in our tests to make sure there 
> is no incompatibility.  {{pip install}} does not return an error exit code 
> for broken dependencies for historical reasons.





[jira] [Commented] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-17 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061158#comment-17061158
 ] 

David Yan commented on BEAM-9510:
-

Also related: BEAM-9530

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.
>  
>  
>  
>  





[jira] [Created] (BEAM-9530) Add `pip check` to ensure good python dependencies

2020-03-17 Thread David Yan (Jira)
David Yan created BEAM-9530:
---

 Summary: Add `pip check` to ensure good python dependencies
 Key: BEAM-9530
 URL: https://issues.apache.org/jira/browse/BEAM-9530
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-harness
Reporter: David Yan


We should add {{pip check}} after pip install in our tests to make sure there 
is no incompatibility.  {{pip install}} does not return an error exit code for 
broken dependencies for historical reasons.
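
A minimal sketch of such a test step, assuming the tests can shell out to pip. `pip_check` is a hypothetical helper name; the `run` parameter exists only so the call can be replaced in tests. It relies on `pip check` exiting with a non-zero status when it finds broken requirements.

```python
import subprocess
import sys


def pip_check(run=subprocess.run):
    """Run `pip check` for the current interpreter; return (ok, report)."""
    result = run(
        [sys.executable, "-m", "pip", "check"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # A zero exit status means pip found no broken requirements.
    return result.returncode == 0, result.stdout


if __name__ == "__main__":
    ok, report = pip_check()
    print(report)
    sys.exit(0 if ok else 1)
```

Running this right after `pip install` lets the test fail on the exit status instead of depending on someone reading the install log.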





[jira] [Updated] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with each other

2020-03-16 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-9510:

Summary: Dependencies in base_image_requirements.txt are not compatible 
with each other  (was: Dependencies in base_image_requirements.txt are not 
compatible with apache-beam pypi deps)

> Dependencies in base_image_requirements.txt are not compatible with each other
> --
>
> Key: BEAM-9510
> URL: https://issues.apache.org/jira/browse/BEAM-9510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: David Yan
>Priority: Major
>
> [https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]
> says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
> google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0
> But they are incompatible with each other:
> ERROR: google-cloud-bigquery 1.24.0 has requirement 
> google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: google-cloud-bigtable 0.32.1 has requirement 
> google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
> which is incompatible.
> ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have 
> grpcio 1.22.0 which is incompatible.
> ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
> but you'll have scipy 1.2.2 which is incompatible.
>  
>  
>  
>  





[jira] [Created] (BEAM-9510) Dependencies in base_image_requirements.txt are not compatible with apache-beam pypi deps

2020-03-16 Thread David Yan (Jira)
David Yan created BEAM-9510:
---

 Summary: Dependencies in base_image_requirements.txt are not 
compatible with apache-beam pypi deps
 Key: BEAM-9510
 URL: https://issues.apache.org/jira/browse/BEAM-9510
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: David Yan


[https://github.com/apache/beam/blob/master/sdks/python/container/base_image_requirements.txt#L56]

says it requires google-cloud-bigquery==1.24.0, google-cloud-core==1.0.2, 
google-cloud-bigtable==0.32.1, grpc-1.22.0 and tensorflow-2.1.0

But they are incompatible with each other:

ERROR: google-cloud-bigquery 1.24.0 has requirement 
google-cloud-core<2.0dev,>=1.1.0, but you'll have google-cloud-core 1.0.2 which 
is incompatible.

ERROR: google-cloud-bigtable 0.32.1 has requirement 
google-cloud-core<0.30dev,>=0.29.0, but you'll have google-cloud-core 1.0.2 
which is incompatible.

ERROR: tensorboard 2.1.1 has requirement grpcio>=1.24.3, but you'll have grpcio 
1.22.0 which is incompatible.

ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", 
but you'll have scipy 1.2.2 which is incompatible.
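
Since `pip install` prints these errors without failing, a container build could scan its output for them. The sketch below is an assumption-laden illustration: `find_conflicts` is a hypothetical helper, and the regular expression matches only the message format quoted above, which other pip versions may word differently.

```python
import re

# Matches lines of the form quoted in this ticket:
# ERROR: <pkg> <ver> has requirement <req>, but you'll have <pkg> <ver>
# which is incompatible.
_CONFLICT = re.compile(
    r"ERROR: (?P<package>\S+) (?P<version>\S+) has requirement "
    r"(?P<requirement>.+?), but you'll have (?P<installed>\S+) "
    r"(?P<installed_version>\S+) which is incompatible\."
)


def find_conflicts(pip_output):
    """Return one dict per incompatibility message found in pip_output."""
    return [m.groupdict() for m in _CONFLICT.finditer(pip_output)]
```

A build step could then fail whenever `find_conflicts` returns a non-empty list.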

 

 

 

 





[jira] [Commented] (BEAM-9508) Python installation fails if grpc_tools is not installed

2020-03-16 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060427#comment-17060427
 ] 

David Yan commented on BEAM-9508:
-

This is fixed by installing mypy-protobuf, which is not immediately obvious 
from the stacktrace.

I'll leave this ticket open for a better error message.
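
As a rough idea of what a better error message could look like, here is a minimal sketch. It is not the code in gen_protos.py; `find_tool` is a hypothetical helper that looks up an executable on PATH and, on failure, names the package that provides it.

```python
import shutil


def find_tool(executable, providing_package):
    """Return the path to executable, or raise naming the package to install."""
    path = shutil.which(executable)
    if path is None:
        raise RuntimeError(
            "Could not find %s on PATH. It is provided by the %s package; "
            "try: pip install %s"
            % (executable, providing_package, providing_package))
    return path
```

For the case in this ticket the call would be `find_tool("protoc-gen-mypy", "mypy-protobuf")`, so the missing package is named in the error itself.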

> Python installation fails if grpc_tools is not installed
> 
>
> Key: BEAM-9508
> URL: https://issues.apache.org/jira/browse/BEAM-9508
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: David Yan
>Priority: Major
>
> When installing from the master branch, I get the exception below. It looks 
> like the ImportError exception handler itself throws an exception. I'll 
> manually install grpc_tools and try again, but the handling of ImportError has 
> issues.
>  
> ```
> Traceback (most recent call last):
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 292, in generate_proto_files
>     from grpc_tools import protoc
> ModuleNotFoundError: No module named 'grpc_tools'
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
>     self.run()
>   File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in run
>     self._target(*self._args, **self._kwargs)
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 378, in _install_grpcio_tools_and_generate_proto_files
>     generate_proto_files(force=force)
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 315, in generate_proto_files
>     protoc_gen_mypy = _find_protoc_gen_mypy()
>   File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", line 233, in _find_protoc_gen_mypy
>     (fname, ', '.join(search_paths)))
> RuntimeError: Could not find protoc-gen-mypy in /root/apache-beam-custom/bin, /root/apache-beam-custom/bin, /usr/local/bin, /opt/conda/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, /bin
> ```





[jira] [Created] (BEAM-9508) Python installation fails if grpc_tools is not installed

2020-03-16 Thread David Yan (Jira)
David Yan created BEAM-9508:
---

 Summary: Python installation fails if grpc_tools is not installed
 Key: BEAM-9508
 URL: https://issues.apache.org/jira/browse/BEAM-9508
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: David Yan


When installing from the master branch, I get the exception below. It looks like 
the ImportError exception handler itself throws an exception. I'll manually 
install grpc_tools and try again, but the handling of ImportError has issues.
 
```
Traceback (most recent call last):
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", 
line 292, in generate_proto_files
from grpc_tools import protoc
ModuleNotFoundError: No module named 'grpc_tools'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in 
_bootstrap
self.run()
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in 
run
self._target(*self._args, **self._kwargs)
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", 
line 378, in _install_grpcio_tools_and_generate_proto_files
generate_proto_files(force=force)
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", 
line 315, in generate_proto_files
protoc_gen_mypy = _find_protoc_gen_mypy()
  File "/root/apache-beam-custom/packages/beam/sdks/python/gen_protos.py", 
line 233, in _find_protoc_gen_mypy
(fname, ', '.join(search_paths)))
RuntimeError: Could not find protoc-gen-mypy in 
/root/apache-beam-custom/bin, /root/apache-beam-custom/bin, /usr/local/bin, 
/opt/conda/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, /usr/bin, /sbin, 
/bin
```
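
The two-part traceback above is what Python prints when a second exception is raised while an earlier one is still being handled. A minimal sketch of that pattern follows; the module name and the `find_tool` helper are hypothetical and are not taken from gen_protos.py.

```python
def find_tool(search_paths):
    # Hypothetical stand-in for a lookup step that can fail on its own.
    raise RuntimeError("Could not find tool in " + ", ".join(search_paths))


def generate():
    try:
        # Hypothetical module name, chosen so the import always fails.
        import grpc_tools_not_installed_example  # noqa: F401
    except ImportError:
        # Work done inside the handler can raise too. Python then reports
        # both errors, joined by "During handling of the above exception,
        # another exception occurred".
        find_tool(["/usr/local/bin", "/usr/bin"])


def describe_failure():
    try:
        generate()
    except RuntimeError as err:
        # __context__ holds the exception that was being handled.
        return type(err).__name__, isinstance(err.__context__, ImportError)
```

Calling `describe_failure()` shows that the RuntimeError carries the original import failure as its context, which is why both tracebacks appear in the log.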





[jira] [Updated] (BEAM-9487) GBKs on unbounded pcolls with global windows and no triggers should fail

2020-03-11 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-9487:

Labels: EaseOfUse starter  (was: starter)

> GBKs on unbounded pcolls with global windows and no triggers should fail
> 
>
> Key: BEAM-9487
> URL: https://issues.apache.org/jira/browse/BEAM-9487
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Priority: Major
>  Labels: EaseOfUse, starter
>
> This is according to "4.2.2.1 GroupByKey and unbounded PCollections" in 
> https://beam.apache.org/documentation/programming-guide/.
> bq. If you do apply GroupByKey or CoGroupByKey to a group of unbounded 
> PCollections without setting either a non-global windowing strategy, a 
> trigger strategy, or both for each collection, Beam generates an 
> IllegalStateException error at pipeline construction time.
> Example where this doesn't happen in Python SDK: 
> https://stackoverflow.com/questions/60623246/merge-pcollection-with-apache-beam
> I also believe that this unit test should fail, since test_stream is 
> unbounded, uses global window, and has no triggers.
> {code}
>   def test_global_window_gbk_fail(self):
> with TestPipeline() as p:
>   test_stream = TestStream()
>   _ = p | test_stream | GroupByKey()
> {code}
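
The rule quoted from the programming guide can be sketched as a construction-time check. This is only an illustration under stated assumptions: `CollectionInfo` and `check_group_by_key` are hypothetical names, not Beam classes or functions, and the error type is chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass
class CollectionInfo:
    # Hypothetical summary of a PCollection; not a Beam class.
    is_bounded: bool
    uses_global_window: bool
    has_non_default_trigger: bool


def check_group_by_key(info):
    """Raise at construction time if the grouping could never emit output."""
    if (not info.is_bounded and info.uses_global_window
            and not info.has_non_default_trigger):
        raise ValueError(
            "GroupByKey on an unbounded collection needs a non-global "
            "window, a trigger, or both.")
```

Under this sketch, the test_stream case in the issue (unbounded, global window, no trigger) would be rejected, while a bounded or windowed input would pass.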





[jira] [Resolved] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-3453.
-
Fix Version/s: 2.20.0
   Resolution: Fixed

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Assignee: David Yan
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Assigned] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan reassigned BEAM-3453:
---

Assignee: David Yan

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Assignee: David Yan
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Commented] (BEAM-3453) Allow usage of public Google PubSub topics in Python DirectRunner

2020-02-10 Thread David Yan (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033859#comment-17033859
 ] 

David Yan commented on BEAM-3453:
-

This is fixed by [GitHub Pull Request 
#10762|https://github.com/apache/beam/pull/10762].

> Allow usage of public Google PubSub topics in Python DirectRunner
> -
>
> Key: BEAM-3453
> URL: https://issues.apache.org/jira/browse/BEAM-3453
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Affects Versions: 2.2.0
>Reporter: Charles Chen
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, the Beam Python DirectRunner does not allow the usage of data from 
> public Google Cloud PubSub topics.  We should allow this functionality so 
> that users can more easily test Beam Python's streaming functionality.





[jira] [Created] (BEAM-8415) Improve error message when adding a PTransform with a name that already exists in the pipeline

2019-10-16 Thread David Yan (Jira)
David Yan created BEAM-8415:
---

 Summary: Improve error message when adding a PTransform with a 
name that already exists in the pipeline
 Key: BEAM-8415
 URL: https://issues.apache.org/jira/browse/BEAM-8415
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: David Yan


Currently, when trying to apply a PTransform with a name that already exists in 
the pipeline, it returns a confusing error:

Transform "XXX" does not have a stable unique label. This will prevent updating 
of pipelines. To apply a transform with a specified label write pvalue | 
"label" >> transform

We'd like to improve this error message.
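
One possible shape for a clearer message, as a sketch only: `LabelRegistry` is a hypothetical stand-in for the pipeline's bookkeeping of applied labels, and the wording of the error is a suggestion rather than the message Beam adopted.

```python
class LabelRegistry:
    """Hypothetical stand-in for a pipeline's set of applied labels."""

    def __init__(self):
        self._labels = set()

    def apply(self, label):
        if label in self._labels:
            # Say what went wrong first, then how to fix it.
            raise RuntimeError(
                'A transform with label "%s" already exists in the pipeline. '
                'To apply a transform with a unique label, write '
                'pvalue | "new label" >> transform.' % label)
        self._labels.add(label)
        return label
```

The point of the sketch is that the message names the duplicate label as the cause before mentioning the fix.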





[jira] [Created] (BEAM-7982) Dataflow runner needs to identify the new format of metric names for distribution metrics

2019-08-14 Thread David Yan (JIRA)
David Yan created BEAM-7982:
---

 Summary: Dataflow runner needs to identify the new format of 
metric names for distribution metrics
 Key: BEAM-7982
 URL: https://issues.apache.org/jira/browse/BEAM-7982
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan


For example, 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/dataflow_metrics.py#L157]

uses [MAX], [MIN], etc. but the new format will be _MAX, _MIN, etc.
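
A minimal sketch of accepting both spellings while the format changes. Only the `[MAX]`/`[MIN]` and `_MAX`/`_MIN` suffixes come from the issue text; the function name and the example metric names are hypothetical, and this is not the code in dataflow_metrics.py.

```python
AGGREGATIONS = ("MAX", "MIN")  # the issue names these; extend as needed


def split_distribution_name(name):
    """Return (base_name, aggregation), or (name, None) if no suffix matches."""
    for agg in AGGREGATIONS:
        # Try the current "[MAX]" style first, then the new "_MAX" style.
        for suffix in ("[%s]" % agg, "_%s" % agg):
            if name.endswith(suffix):
                return name[:-len(suffix)], agg
    return name, None
```

For example, `split_distribution_name("latency[MAX]")` and `split_distribution_name("latency_MAX")` both yield `("latency", "MAX")`.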



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (BEAM-7957) Warn at job submit time if a step is named with a / or empty in DataflowRunner

2019-08-12 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7957:

Summary: Warn at job submit time if a step is named with a / or empty in 
DataflowRunner  (was: Warn users if a step is named with a / or empty in 
DataflowRunner)

> Warn at job submit time if a step is named with a / or empty in DataflowRunner
> --
>
> Key: BEAM-7957
> URL: https://issues.apache.org/jira/browse/BEAM-7957
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: David Yan
>Priority: Major
>
> When a job has an empty step name or a step name that contains a "/", it 
> quietly breaks the job graph in the Dataflow UI. We should at least warn the 
> user at job submit time.





[jira] [Created] (BEAM-7957) Warn users if a step is named with a / or empty in DataflowRunner

2019-08-12 Thread David Yan (JIRA)
David Yan created BEAM-7957:
---

 Summary: Warn users if a step is named with a / or empty in 
DataflowRunner
 Key: BEAM-7957
 URL: https://issues.apache.org/jira/browse/BEAM-7957
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: David Yan


When a job has an empty step name or a step name that contains a "/", it 
quietly breaks the job graph in the Dataflow UI. We should at least warn the 
user at job submit time.
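
The requested check could look roughly like the sketch below. `check_step_names` is a hypothetical helper, not part of DataflowRunner, and the warning text is only a suggestion.

```python
import warnings


def check_step_names(step_names):
    """Warn about names that are empty or contain "/"; return the bad ones."""
    bad = [name for name in step_names if not name or "/" in name]
    for name in bad:
        warnings.warn(
            "Step name %r is empty or contains '/'; the job graph may not "
            "render correctly in the Dataflow UI." % name)
    return bad
```

Running this over the step names at job submit time would surface the problem before the graph is displayed.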





[jira] [Resolved] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-7876.
-
   Resolution: Fixed
Fix Version/s: 2.15.0

> Interactive Beam example does not work with Python3
> ---
>
> Key: BEAM-7876
> URL: https://issues.apache.org/jira/browse/BEAM-7876
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: David Yan
>Priority: Major
> Fix For: 2.15.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When going through the example  
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
>  using Jupyter Notebook running in Python 3, the run() method throws 
> the following error:
> {{TypeError Traceback (most recent call last)}}
> {{ in }}
> {{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
> {{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
> {{> 5 result = p.run()}}
> {{ 6 result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/pipeline.py 
> in run(self, test_runner_api)}}
> {{ 404 self.to_runner_api(use_fake_coders=True),}}
> {{ 405 self.runner,}}
> {{--> 406 self._options).run(False)}}
> {{ 407 }}
> {{ 408 if 
> self._options.view_as(TypeOptions).runtime_type_check:}}{{~/beam/sdks/python/apache_beam/pipeline.py
>  in run(self, test_runner_api)}}
> {{ 417 finally:}}
> {{ 418 shutil.rmtree(tmpdir)}}
> {{--> 419 return self.runner.run_pipeline(self, self._options)}}
> {{ 420 }}
> {{ 421 def 
> __enter__(self):}}{{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
>  in run_pipeline(self, pipeline, options)}}
> {{ 142 cache_manager=self._cache_manager,}}
> {{ 143 pipeline_graph_renderer=self._renderer)}}
> {{--> 144 display.start_periodic_update()}}
> {{ 145 result = pipeline_to_execute.run()}}
> {{ 146 
> result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
>  in start_periodic_update(self)}}
> {{ 158 def start_periodic_update(self):}}
> {{ 159 """Start a thread that periodically updates the display."""}}
> {{--> 160 self.update_display(True)}}
> {{ 161 self._periodic_update = True}}
> {{ 
> 162}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
>  in update_display(self, force)}}
> {{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
> {{ 150 self._pipeline_graph)}}
> {{--> 151 display.display(display.HTML(rendered_graph))}}
> {{ 152 }}
> {{ 153 
> _display_progress('Running...')}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
>  in __init__(self, data, url, filename, metadata)}}
> {{ 691 return prefix.startswith("")}}
> {{ 692 }}
> {{--> 693 if warn():}}
> {{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
> {{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
> metadata=metadata)}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
>  in warn()}}
> {{ 689 prefix = data[:10].lower()}}
> {{ 690 suffix = data[-10:].lower()}}
> {{--> 691 return prefix.startswith(" suffix.endswith("")}}
> {{ 692 }}
> {{ 693 if warn():}}{{TypeError: startswith first arg must be bytes or a tuple 
> of bytes, not str}}
>  
>  
>  
> This does not happen with Python 2.





[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Status: Open  (was: Triage Needed)

> Interactive Beam example does not work with Python3
> ---
>
> Key: BEAM-7876
> URL: https://issues.apache.org/jira/browse/BEAM-7876
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: David Yan
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When going through the example  
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
>  using Jupyter Notebook running in Python 3, the run() method throws 
> the following error:
> {{TypeError Traceback (most recent call last)}}
> {{ in }}
> {{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
> {{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
> {{> 5 result = p.run()}}
> {{ 6 result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/pipeline.py 
> in run(self, test_runner_api)}}
> {{ 404 self.to_runner_api(use_fake_coders=True),}}
> {{ 405 self.runner,}}
> {{--> 406 self._options).run(False)}}
> {{ 407 }}
> {{ 408 if 
> self._options.view_as(TypeOptions).runtime_type_check:}}{{~/beam/sdks/python/apache_beam/pipeline.py
>  in run(self, test_runner_api)}}
> {{ 417 finally:}}
> {{ 418 shutil.rmtree(tmpdir)}}
> {{--> 419 return self.runner.run_pipeline(self, self._options)}}
> {{ 420 }}
> {{ 421 def 
> __enter__(self):}}{{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
>  in run_pipeline(self, pipeline, options)}}
> {{ 142 cache_manager=self._cache_manager,}}
> {{ 143 pipeline_graph_renderer=self._renderer)}}
> {{--> 144 display.start_periodic_update()}}
> {{ 145 result = pipeline_to_execute.run()}}
> {{ 146 
> result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
>  in start_periodic_update(self)}}
> {{ 158 def start_periodic_update(self):}}
> {{ 159 """Start a thread that periodically updates the display."""}}
> {{--> 160 self.update_display(True)}}
> {{ 161 self._periodic_update = True}}
> {{ 
> 162}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
>  in update_display(self, force)}}
> {{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
> {{ 150 self._pipeline_graph)}}
> {{--> 151 display.display(display.HTML(rendered_graph))}}
> {{ 152 }}
> {{ 153 
> _display_progress('Running...')}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
>  in __init__(self, data, url, filename, metadata)}}
> {{ 691 return prefix.startswith("")}}
> {{ 692 }}
> {{--> 693 if warn():}}
> {{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
> {{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
> metadata=metadata)}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
>  in warn()}}
> {{ 689 prefix = data[:10].lower()}}
> {{ 690 suffix = data[-10:].lower()}}
> {{--> 691 return prefix.startswith(" suffix.endswith("")}}
> {{ 692 }}
> {{ 693 if warn():}}{{TypeError: startswith first arg must be bytes or a tuple 
> of bytes, not str}}
>  
>  
>  
> This does not happen with Python 2.





[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Description: 
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws 
the following error:

{{TypeError Traceback (most recent call last)}}
{{ in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{> 5 result = p.run()}}
{{ 6 result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/pipeline.py 
in run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{--> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if 
self._options.view_as(TypeOptions).runtime_type_check:}}{{~/beam/sdks/python/apache_beam/pipeline.py
 in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{--> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def 
__enter__(self):}}{{~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
 in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{--> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 
result.wait_until_finish()}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{--> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 
162}}{{~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{--> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 
_display_progress('Running...')}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 }}
{{--> 693 if warn():}}
{{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
{{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
metadata=metadata)}}{{~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in warn()}}
{{ 689 prefix = data[:10].lower()}}
{{ 690 suffix = data[-10:].lower()}}
{{--> 691 return prefix.startswith("")}}
{{ 692 }}
{{ 693 if warn():}}{{TypeError: startswith first arg must be bytes or a tuple 
of bytes, not str}}

 

 

 

This does not happen with Python 2.

  was:
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws 
the following error:

{{TypeError Traceback (most recent call last)}}
{{  in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{ > 5 result = p.run()}}
{{ 6 result.wait_until_finish()~/beam/sdks/python/apache_beam/pipeline.py in 
run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{ --> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if 
self._options.view_as(TypeOptions).runtime_type_check:~/beam/sdks/python/apache_beam/pipeline.py
 in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{ --> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def 
__enter__(self):~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
 in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{ --> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 
result.wait_until_finish()~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{ --> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 
162~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{ --> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 
_display_progress('Running...')~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 }

[jira] [Updated] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated BEAM-7876:

Description: 
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws 
the following error:

{{TypeError Traceback (most recent call last)}}
{{  in }}
{{ 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)}}
{{ 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)}}
{{ > 5 result = p.run()}}
{{ 6 result.wait_until_finish()~/beam/sdks/python/apache_beam/pipeline.py in 
run(self, test_runner_api)}}
{{ 404 self.to_runner_api(use_fake_coders=True),}}
{{ 405 self.runner,}}
{{ --> 406 self._options).run(False)}}
{{ 407 }}
{{ 408 if 
self._options.view_as(TypeOptions).runtime_type_check:~/beam/sdks/python/apache_beam/pipeline.py
 in run(self, test_runner_api)}}
{{ 417 finally:}}
{{ 418 shutil.rmtree(tmpdir)}}
{{ --> 419 return self.runner.run_pipeline(self, self._options)}}
{{ 420 }}
{{ 421 def 
__enter__(self):~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py
 in run_pipeline(self, pipeline, options)}}
{{ 142 cache_manager=self._cache_manager,}}
{{ 143 pipeline_graph_renderer=self._renderer)}}
{{ --> 144 display.start_periodic_update()}}
{{ 145 result = pipeline_to_execute.run()}}
{{ 146 
result.wait_until_finish()~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in start_periodic_update(self)}}
{{ 158 def start_periodic_update(self):}}
{{ 159 """Start a thread that periodically updates the display."""}}
{{ --> 160 self.update_display(True)}}
{{ 161 self._periodic_update = True}}
{{ 
162~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py
 in update_display(self, force)}}
{{ 149 rendered_graph = self._renderer.render_pipeline_graph(}}
{{ 150 self._pipeline_graph)}}
{{ --> 151 display.display(display.HTML(rendered_graph))}}
{{ 152 }}
{{ 153 
_display_progress('Running...')~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)}}
{{ 691 return prefix.startswith("")}}
{{ 692 }}
{{ --> 693 if warn():}}
{{ 694 warnings.warn("Consider using IPython.display.IFrame instead")}}
{{ 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
metadata=metadata)~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in warn()}}
{{ 689 prefix = data[:10].lower()}}
{{ 690 suffix = data[-10:].lower()}}
{{ --> 691 return prefix.startswith("")}}
{{ 692 }}
{{ 693 if warn():TypeError: startswith first arg must be bytes or a tuple of 
bytes, not str  }}

 

 

 

This does not happen with Python 2.

  was:
When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws an error:

TypeError Traceback (most recent call last)
 in 
 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)
 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)
> 5 result = p.run()
 6 result.wait_until_finish()

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 404 self.to_runner_api(use_fake_coders=True),
 405 self.runner,
--> 406 self._options).run(False)
 407 
 408 if self._options.view_as(TypeOptions).runtime_type_check:

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 417 finally:
 418 shutil.rmtree(tmpdir)
--> 419 return self.runner.run_pipeline(self, self._options)
 420 
 421 def __enter__(self):

~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in 
run_pipeline(self, pipeline, options)
 142 cache_manager=self._cache_manager,
 143 pipeline_graph_renderer=self._renderer)
--> 144 display.start_periodic_update()
 145 result = pipeline_to_execute.run()
 146 result.wait_until_finish()

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in start_periodic_update(self)
 158 def start_periodic_update(self):
 159 """Start a thread that periodically updates the display."""
--> 160 self.update_display(True)
 161 self._periodic_update = True
 162

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in update_display(self, force)
 149 rendered_graph = self._renderer.render_pipeline_graph(
 150 self._pipeline_graph)
--> 151 display.display(display.HTML(rendered_graph))
 152 
 153 _display_progress('Running...')

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)
 691 return prefix.startswith("")
 692 
--> 693 if warn():
 694 warnings.warn("Consider using IPython.display.IFrame instead")
 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
metadata=metada

[jira] [Created] (BEAM-7876) Interactive Beam example does not work with Python3

2019-08-01 Thread David Yan (JIRA)
David Yan created BEAM-7876:
---

 Summary: Interactive Beam example does not work with Python3
 Key: BEAM-7876
 URL: https://issues.apache.org/jira/browse/BEAM-7876
 Project: Beam
  Issue Type: Bug
  Components: examples-python
Reporter: David Yan


When going through the example  
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]
 using Jupyter Notebook running in Python 3, the run() method throws an error:

TypeError Traceback (most recent call last)
 in 
 3 squares = init_pcoll | 'Square' >> beam.Map(lambda x: x*x)
 4 cubes = init_pcoll | 'Cube' >> beam.Map(lambda x: x**3)
> 5 result = p.run()
 6 result.wait_until_finish()

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 404 self.to_runner_api(use_fake_coders=True),
 405 self.runner,
--> 406 self._options).run(False)
 407 
 408 if self._options.view_as(TypeOptions).runtime_type_check:

~/beam/sdks/python/apache_beam/pipeline.py in run(self, test_runner_api)
 417 finally:
 418 shutil.rmtree(tmpdir)
--> 419 return self.runner.run_pipeline(self, self._options)
 420 
 421 def __enter__(self):

~/beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py in 
run_pipeline(self, pipeline, options)
 142 cache_manager=self._cache_manager,
 143 pipeline_graph_renderer=self._renderer)
--> 144 display.start_periodic_update()
 145 result = pipeline_to_execute.run()
 146 result.wait_until_finish()

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in start_periodic_update(self)
 158 def start_periodic_update(self):
 159 """Start a thread that periodically updates the display."""
--> 160 self.update_display(True)
 161 self._periodic_update = True
 162

~/beam/sdks/python/apache_beam/runners/interactive/display/display_manager.py 
in update_display(self, force)
 149 rendered_graph = self._renderer.render_pipeline_graph(
 150 self._pipeline_graph)
--> 151 display.display(display.HTML(rendered_graph))
 152 
 153 _display_progress('Running...')

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in __init__(self, data, url, filename, metadata)
 691 return prefix.startswith("")
 692 
--> 693 if warn():
 694 warnings.warn("Consider using IPython.display.IFrame instead")
 695 super(HTML, self).__init__(data=data, url=url, filename=filename, 
metadata=metadata)

~/beam/sdks/python/notebook3/lib/python3.6/site-packages/IPython/core/display.py
 in warn()
 689 prefix = data[:10].lower()
 690 suffix = data[-10:].lower()
--> 691 return prefix.startswith("")
 692 
 693 if warn():

TypeError: startswith first arg must be bytes or a tuple of bytes, not str
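
The final TypeError means that `data` was a bytes object while the argument to startswith() was a str. The sketch below reproduces that mismatch and shows one way to make a prefix check accept both types. The function names and the `<svg` tag are hypothetical, and this is not the fix that was applied in Beam or IPython.

```python
def reproduce_type_error():
    """Trigger the same TypeError by giving bytes.startswith() a str."""
    try:
        b"<svg></svg>".startswith("<svg")
    except TypeError as err:
        return str(err)
    return None


def starts_with_tag(data, tag="<svg"):
    """Prefix check that accepts either bytes or str input."""
    if isinstance(data, bytes):
        # Decode first so that both operands are str.
        data = data.decode("utf-8")
    return data[:10].lower().startswith(tag)
```

In Python 2, str and bytes are the same type, which is consistent with the report that the problem does not occur there.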

 

 





[jira] [Resolved] (BEAM-7408) Beam Programming Guide inconsistencies

2019-06-05 Thread David Yan (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan resolved BEAM-7408.
-
Resolution: Fixed

> Beam Programming Guide inconsistencies
> --
>
> Key: BEAM-7408
> URL: https://issues.apache.org/jira/browse/BEAM-7408
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Affects Versions: Not applicable
>Reporter: David Yan
>Priority: Major
>  Labels: documentation, newbie
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://beam.apache.org/documentation/programming-guide/]
>  
> Pipeline option example:
>  
> Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
> while Python and Go have "input" and "output".
>  
> When Python is chosen, the doc says --myCustomOption=value is supported, 
> which only corresponds to the Java example.
>  
> Reading from external source:
>  
> Java, Python and Go are not consistent. Python example reads from a GCS file, 
> while others specify a generic file.
> [https://beam.apache.org/documentation/programming-guide/#applying-transforms]:
>  The last workflow graph does not correspond to the code example.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-7408) Beam Programming Guide inconsistencies

2019-06-05 Thread David Yan (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857005#comment-16857005
 ] 

David Yan commented on BEAM-7408:
-

Yes, thank you. :)

> Beam Programming Guide inconsistencies
> --
>
> Key: BEAM-7408
> URL: https://issues.apache.org/jira/browse/BEAM-7408
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Affects Versions: Not applicable
>Reporter: David Yan
>Priority: Major
>  Labels: documentation, newbie
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://beam.apache.org/documentation/programming-guide/]
>  
> Pipeline option example:
>  
> Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
> while Python and Go have "input" and "output".
>  
> When Python is chosen, the doc says --myCustomOption=value is supported, 
> which only corresponds to the Java example.
>  
> Reading from external source:
>  
> Java, Python and Go are not consistent. Python example reads from a GCS file, 
> while others specify a generic file.
> [https://beam.apache.org/documentation/programming-guide/#applying-transforms]:
>  The last workflow graph does not correspond to the code example.





[jira] [Created] (BEAM-7408) Beam Programming Guide inconsistencies

2019-05-23 Thread David Yan (JIRA)
David Yan created BEAM-7408:
---

 Summary: Beam Programming Guide inconsistencies
 Key: BEAM-7408
 URL: https://issues.apache.org/jira/browse/BEAM-7408
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: David Yan


[https://beam.apache.org/documentation/programming-guide/]

 

Pipeline option example:

 

Examples in Java, Python and Go are not consistent. Java has myCustomOption, 
while Python and Go have "input" and "output".

 

When Python is chosen, the doc says --myCustomOption=value is supported, which 
only corresponds to the Java example.

 

Reading from external source:

 

Java, Python and Go are not consistent. Python example reads from a GCS file, 
while others specify a generic file.


[https://beam.apache.org/documentation/programming-guide/#applying-transforms]: 
The last workflow graph does not correspond to the code example.





[jira] [Created] (BEAM-7215) Wordcount example page does not tell the user to create the maven project using archetype

2019-05-02 Thread David Yan (JIRA)
David Yan created BEAM-7215:
---

    Summary: Wordcount example page does not tell the user to create the maven project using archetype
        Key: BEAM-7215
        URL: https://issues.apache.org/jira/browse/BEAM-7215
    Project: Beam
 Issue Type: Improvement
 Components: website
   Reporter: David Yan


[https://beam.apache.org/get-started/wordcount-example/#wordcount-example] does 
not link back to 
[https://beam.apache.org/get-started/quickstart-java/#get-the-wordcount-code]. 
A user who follows only the instructions on the first page (for example, after 
arriving from a search engine) gets:

{{$ mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount 
-Dexec.args="--runner=DataflowRunner 
--gcpTempLocation=gs://clouddfe-test/staging-$USER 
--inputFile=gs://apache-beam-samples/shakespeare/* 
--output=gs://world-readable-mkcq69tkcu/$USER/result.txt" -Pdataflow-runner 
[INFO] Scanning for projects...
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 0.068 s
[INFO] Finished at: 2019-05-02T13:32:15-07:00
[INFO] Final Memory: 23M/1948M
[INFO]
[WARNING] The requested profile "dataflow-runner" could not be activated because it does not exist.
[ERROR] The goal you specified requires a project to execute but there is no POM in this directory (/usr/local/google/home/davidyan/beam). Please verify you invoked Maven from the correct directory. -> [Help 1]}}





[jira] [Created] (BEAM-7020) Reduce the log severity of profiling agent discovery

2019-04-05 Thread David Yan (JIRA)
David Yan created BEAM-7020:
---

    Summary: Reduce the log severity of profiling agent discovery
        Key: BEAM-7020
        URL: https://issues.apache.org/jira/browse/BEAM-7020
    Project: Beam
 Issue Type: Improvement
 Components: runner-dataflow
   Reporter: David Yan


Example:

[https://github.com/apache/beam/blob/b953645ed6db837d24284d7fe1fe091e7309f821/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/profiler/ScopedProfiler.java#L138]

These should not be logged at warning severity, even if the profiling agent is 
not present, since in most cases users do not run their jobs with profiling.
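
As a language-neutral illustration of the request, here is a small Python sketch using the standard logging module. It is not the Dataflow worker's Java code: the function names, the loader callable and the choice of INFO as the lower level are all assumptions made for the example. The idea is only that the absence of an optional component is the expected case and is reported below WARNING.

```python
import logging

logger = logging.getLogger("profiler_sketch")


def load_profiling_agent(loader):
    """Try to load an optional agent; return it, or None if it is absent.

    `loader` is a hypothetical zero-argument callable standing in for
    whatever actually loads the agent.
    """
    try:
        return loader()
    except (ImportError, OSError) as exc:
        # A missing optional agent is the common case, so report it at
        # INFO (an assumed level) rather than WARNING.
        logger.info("Profiling agent not found, profiling disabled: %s", exc)
        return None


def missing_agent():
    # Hypothetical loader that simulates an absent agent library.
    raise OSError("agent library not found")


agent = load_profiling_agent(missing_agent)
```

With this shape, a job that never asked for profiling produces no warning-level noise, while a job that did ask for it can still find the message at the lower level.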





[jira] [Created] (BEAM-6918) Github link requires login and example link is broken

2019-03-26 Thread David Yan (JIRA)
David Yan created BEAM-6918:
---

    Summary: Github link requires login and example link is broken
        Key: BEAM-6918
        URL: https://issues.apache.org/jira/browse/BEAM-6918
    Project: Beam
 Issue Type: Improvement
 Components: examples-python
   Reporter: David Yan


Two minor issues in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/README.md]

1. git clone g...@github.com:apache/beam.git requires the user to be logged in, 
while https://github.com/apache/beam does not.
2. Spaces in the example link need to be escaped.
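
For the second point, a minimal sketch of escaping spaces with Python's standard urllib.parse.quote. The base URL and file name below are hypothetical placeholders, not the actual link from the README.

```python
from urllib.parse import quote

# Hypothetical base URL and file name containing spaces (assumptions
# for illustration, not taken from the README).
base = "https://example.com/notebooks/"
name = "My Example Notebook.ipynb"

# quote() percent-encodes characters such as spaces so the result is a
# valid URL path segment.
link = base + quote(name)
print(link)
```

Only the file name is passed to quote(), so the scheme and host of the base URL are left untouched.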


