[jira] [Work logged] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8456?focusedWorklogId=331785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331785
 ]

ASF GitHub Bot logged work on BEAM-8456:


Author: ASF GitHub Bot
Created on: 22/Oct/19 03:57
Start Date: 22/Oct/19 03:57
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9849: [BEAM-8456] Add 
pipeline option to have Data Catalog truncate sub-millisecond precision
URL: https://github.com/apache/beam/pull/9849#issuecomment-544795948
 
 
   run sql postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331785)
Time Spent: 40m  (was: 0.5h)

> BigQuery to Beam SQL timestamp has the wrong default: truncation makes the 
> most sense
> -
>
> Key: BEAM-8456
> URL: https://issues.apache.org/jira/browse/BEAM-8456
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Most of the time, a user reading a timestamp from BigQuery with 
> higher-than-millisecond precision timestamps may not even realize that the 
> data source created these high precision timestamps. They are probably 
> timestamps on log entries generated by a system with higher precision.
> If they are using it with Beam SQL, which only supports millisecond 
> precision, it makes sense to "just work" by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8456?focusedWorklogId=331784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331784
 ]

ASF GitHub Bot logged work on BEAM-8456:


Author: ASF GitHub Bot
Created on: 22/Oct/19 03:49
Start Date: 22/Oct/19 03:49
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9849: 
[BEAM-8456] Add pipeline option to have Data Catalog truncate sub-millisecond 
precision
URL: https://github.com/apache/beam/pull/9849#discussion_r337322337
 
 

 ##
 File path: 
sdks/java/extensions/sql/datacatalog/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/datacatalog/DataCatalogTableProvider.java
 ##
 @@ -138,8 +143,41 @@ private Table loadTableFromDC(String tableName) {
     }
   }
 
-  @Override
-  public BeamSqlTable buildBeamSqlTable(Table table) {
-    return delegateProviders.get(table.getType()).buildBeamSqlTable(table);
+  private static DataCatalogBlockingStub createDataCatalogClient(
+      DataCatalogPipelineOptions options) {
+    return DataCatalogGrpc.newBlockingStub(
+            ManagedChannelBuilder.forTarget(options.getDataCatalogEndpoint()).build())
+        .withCallCredentials(
+            MoreCallCredentials.from(options.as(GcpOptions.class).getGcpCredential()));
+  }
+
+  private static Map<String, TableProvider> getSupportedProviders() {
+    return Stream.of(
+            new PubsubJsonTableProvider(), new BigQueryTableProvider(), new TextTableProvider())
+        .collect(toMap(TableProvider::getTableType, p -> p));
+  }
+
+  private Table toCalciteTable(String tableName, Entry entry) {
+    if (entry.getSchema().getColumnsCount() == 0) {
+      throw new UnsupportedOperationException(
+          "Entry doesn't have a schema. Please attach a schema to '"
+              + tableName
+              + "' in Data Catalog: "
+              + entry.toString());
+    }
+    Schema schema = SchemaUtils.fromDataCatalog(entry.getSchema());
+
+    Optional<Table.Builder> tableBuilder = tableFactory.tableBuilder(entry);
+    if (tableBuilder.isPresent()) {
+      return tableBuilder.get().schema(schema).name(tableName).build();
+    } else {
+      throw new UnsupportedOperationException(
+          String.format(
+              "Unsupported Data Catalog entry: %s",
+              MoreObjects.toStringHelper(entry)
 
 Review comment:
   The proto lib is compiled for the lite runtime so it has no `toString` of 
its own. Added just the bits used in the table factories.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331784)
Time Spent: 0.5h  (was: 20m)

> BigQuery to Beam SQL timestamp has the wrong default: truncation makes the 
> most sense
> -
>
> Key: BEAM-8456
> URL: https://issues.apache.org/jira/browse/BEAM-8456
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Most of the time, a user reading a timestamp from BigQuery with 
> higher-than-millisecond precision timestamps may not even realize that the 
> data source created these high precision timestamps. They are probably 
> timestamps on log entries generated by a system with higher precision.
> If they are using it with Beam SQL, which only supports millisecond 
> precision, it makes sense to "just work" by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8456?focusedWorklogId=331781&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331781
 ]

ASF GitHub Bot logged work on BEAM-8456:


Author: ASF GitHub Bot
Created on: 22/Oct/19 03:48
Start Date: 22/Oct/19 03:48
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9849: 
[BEAM-8456] Add pipeline option to have Data Catalog truncate sub-millisecond 
precision
URL: https://github.com/apache/beam/pull/9849
 
 
   Currently we default to crashing when we encounter sub-millisecond 
precision. This default is safe, but wrong for almost all users. Making 
truncation, which "drops data", the built-in default also feels wrong. So 
instead, this adds a pipeline option so that any system that supplies a 
default set of pipeline options can make truncation the default.
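   The change itself is to the Java DataCatalogPipelineOptions; for readers 
more familiar with the Python SDK, the "pipeline option plumbed from outside" 
pattern looks roughly like the sketch below (the flag name is hypothetical, 
not the option added by this PR):

{code}
from apache_beam.options.pipeline_options import PipelineOptions


class TruncateOptions(PipelineOptions):
  """Hypothetical options class; shows the pattern, not this PR's actual flag."""

  @classmethod
  def _add_argparse_args(cls, parser):
    parser.add_argument(
        '--truncate_sub_millisecond_timestamps',
        default=False,
        action='store_true',
        help='Truncate timestamps to millisecond precision instead of failing.')


# Anything that builds the default PipelineOptions (a notebook, a launcher, a
# wrapper script) can flip the flag without touching the pipeline code itself:
options = PipelineOptions(['--truncate_sub_millisecond_timestamps'])
print(options.view_as(TruncateOptions).truncate_sub_millisecond_timestamps)  # True
{code}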
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more 
tips on [how to make review process 
smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8456?focusedWorklogId=331782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331782
 ]

ASF GitHub Bot logged work on BEAM-8456:


Author: ASF GitHub Bot
Created on: 22/Oct/19 03:48
Start Date: 22/Oct/19 03:48
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9849: [BEAM-8456] Add 
pipeline option to have Data Catalog truncate sub-millisecond precision
URL: https://github.com/apache/beam/pull/9849#issuecomment-544794704
 
 
   Also CC @TheNeuralBit and @amaliujia 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331782)
Time Spent: 20m  (was: 10m)

> BigQuery to Beam SQL timestamp has the wrong default: truncation makes the 
> most sense
> -
>
> Key: BEAM-8456
> URL: https://issues.apache.org/jira/browse/BEAM-8456
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Most of the time, a user reading a timestamp from BigQuery with 
> higher-than-millisecond precision timestamps may not even realize that the 
> data source created these high precision timestamps. They are probably 
> timestamps on log entries generated by a system with higher precision.
> If they are using it with Beam SQL, which only supports millisecond 
> precision, it makes sense to "just work" by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8456:
--
Status: Open  (was: Triage Needed)

> BigQuery to Beam SQL timestamp has the wrong default: truncation makes the 
> most sense
> -
>
> Key: BEAM-8456
> URL: https://issues.apache.org/jira/browse/BEAM-8456
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> Most of the time, a user reading a timestamp from BigQuery with 
> higher-than-millisecond precision timestamps may not even realize that the 
> data source created these high precision timestamps. They are probably 
> timestamps on log entries generated by a system with higher precision.
> If they are using it with Beam SQL, which only supports millisecond 
> precision, it makes sense to "just work" by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956614#comment-16956614
 ] 

Kenneth Knowles commented on BEAM-8456:
---

I cannot bear to make this a default, but at least we need a pipeline option so 
it can be plumbed from outside if you are doing something non-programmatic.

> BigQuery to Beam SQL timestamp has the wrong default: truncation makes the 
> most sense
> -
>
> Key: BEAM-8456
> URL: https://issues.apache.org/jira/browse/BEAM-8456
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>
> Most of the time, a user reading a timestamp from BigQuery with 
> higher-than-millisecond precision timestamps may not even realize that the 
> data source created these high precision timestamps. They are probably 
> timestamps on log entries generated by a system with higher precision.
> If they are using it with Beam SQL, which only supports millisecond 
> precision, it makes sense to "just work" by default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8456) BigQuery to Beam SQL timestamp has the wrong default: truncation makes the most sense

2019-10-21 Thread Kenneth Knowles (Jira)
Kenneth Knowles created BEAM-8456:
-

 Summary: BigQuery to Beam SQL timestamp has the wrong default: 
truncation makes the most sense
 Key: BEAM-8456
 URL: https://issues.apache.org/jira/browse/BEAM-8456
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles


Most of the time, a user reading a timestamp from BigQuery with 
higher-than-millisecond precision timestamps may not even realize that the data 
source created these high precision timestamps. They are probably timestamps on 
log entries generated by a system with higher precision.

If they are using it with Beam SQL, which only supports millisecond precision, 
it makes sense to "just work" by default.
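To make the intended default concrete, here is a minimal hand-written sketch 
(not Beam code) of what "truncate rather than reject" means for a 
microsecond-precision value:

{code}
from datetime import datetime, timezone


def truncate_to_millis(ts):
  """Drop sub-millisecond precision by truncation (not rounding)."""
  return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# A BigQuery-style timestamp carrying microsecond precision:
ts = datetime(2019, 10, 21, 3, 57, 12, 123456, tzinfo=timezone.utc)
print(truncate_to_millis(ts).isoformat())  # 2019-10-21T03:57:12.123000+00:00
{code}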



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-21 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956599#comment-16956599
 ] 

Ahmet Altay commented on BEAM-8368:
---

Thank you for the quick reply. Let us know if we can help.

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Brian Hulette
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs, but can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-21 Thread Wes McKinney (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956593#comment-16956593
 ] 

Wes McKinney commented on BEAM-8368:


It's on d...@arrow.apache.org. With luck we will cut an RC in the next 24 hours

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Brian Hulette
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs, but can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-8405) Python: Datastore: add support for embedded entities

2019-10-21 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri resolved BEAM-8405.
-
Fix Version/s: 2.17.0
   Resolution: Fixed

Resolved, though we should have a proper IT for new features such as this:
https://issues.apache.org/jira/browse/BEAM-8447

> Python: Datastore: add support for embedded entities 
> -
>
> Key: BEAM-8405
> URL: https://issues.apache.org/jira/browse/BEAM-8405
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The conversion methods to/from the client entity type should be updated to 
> support an embedded Entity.
> https://github.com/apache/beam/blob/603d68aafe9bdcd124d28ad62ad36af01e7a7403/sdks/python/apache_beam/io/gcp/datastore/v1new/types.py#L216-L240
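Purely as an illustration of the shape of the fix (simplified, not Beam's 
actual converter code), supporting an embedded entity means recursing when a 
property value is itself a Datastore entity instead of a plain value:

{code}
def convert_property_value(value, to_beam_entity):
  """Convert one client property value; names here are hypothetical.

  `to_beam_entity` is whatever converts a top-level client Entity.
  """
  # google.cloud.datastore.Entity is dict-like and carries a key attribute.
  if hasattr(value, 'key') and hasattr(value, 'items'):
    return to_beam_entity(value)  # embedded entity: convert recursively
  if isinstance(value, list):
    # Array properties may themselves contain embedded entities.
    return [convert_property_value(v, to_beam_entity) for v in value]
  return value  # plain scalars pass through unchanged
{code}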



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8368) [Python] libprotobuf-generated exception when importing apache_beam

2019-10-21 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956590#comment-16956590
 ] 

Ahmet Altay commented on BEAM-8368:
---

Is there an issue/thread for tracking the progress of the Arrow 0.15.1 release?

> [Python] libprotobuf-generated exception when importing apache_beam
> ---
>
> Key: BEAM-8368
> URL: https://issues.apache.org/jira/browse/BEAM-8368
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.17.0
>Reporter: Ubaier Bhat
>Assignee: Brian Hulette
>Priority: Blocker
> Fix For: 2.17.0
>
> Attachments: error_log.txt
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Unable to import apache_beam after upgrading to macOS 10.15 (Catalina). 
> Cleared all the pipenvs, but can't get it working again.
> {code}
> import apache_beam as beam
> /Users/***/.local/share/virtualenvs/beam-etl-ims6DitU/lib/python3.7/site-packages/apache_beam/__init__.py:84:
>  UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.
>   'Some syntactic constructs of Python 3 are not yet fully supported by '
> [libprotobuf ERROR google/protobuf/descriptor_database.cc:58] File already 
> exists in database: 
> [libprotobuf FATAL google/protobuf/descriptor.cc:1370] CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> libc++abi.dylib: terminating with uncaught exception of type 
> google::protobuf::FatalException: CHECK failed: 
> GeneratedDatabase()->Add(encoded_file_descriptor, size): 
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8405) Python: Datastore: add support for embedded entities

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8405?focusedWorklogId=331764&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331764
 ]

ASF GitHub Bot logged work on BEAM-8405:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:44
Start Date: 22/Oct/19 01:44
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9805: [BEAM-8405] 
Support embedded Datastore entities
URL: https://github.com/apache/beam/pull/9805
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331764)
Time Spent: 1.5h  (was: 1h 20m)

> Python: Datastore: add support for embedded entities 
> -
>
> Key: BEAM-8405
> URL: https://issues.apache.org/jira/browse/BEAM-8405
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The conversion methods to/from the client entity type should be updated to 
> support an embedded Entity.
> https://github.com/apache/beam/blob/603d68aafe9bdcd124d28ad62ad36af01e7a7403/sdks/python/apache_beam/io/gcp/datastore/v1new/types.py#L216-L240



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7981) ParDo function wrapper doesn't support Iterable output types

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7981?focusedWorklogId=331760&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331760
 ]

ASF GitHub Bot logged work on BEAM-7981:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:32
Start Date: 22/Oct/19 01:32
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9708: [BEAM-7981] Fix 
double iterable stripping
URL: https://github.com/apache/beam/pull/9708#discussion_r337305179
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -585,7 +585,7 @@ def default_type_hints(self):
       try:
         fn_type_hints.strip_iterable()
       except ValueError as e:
-        raise ValueError('Return value not iterable: %s: %s' % (self, e))
+        logging.warning('%s: %s', self.default_label(), e)
 
 Review comment:
   I've reverted this case back to raise an exception.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331760)
Time Spent: 1h 40m  (was: 1.5h)

> ParDo function wrapper doesn't support Iterable output types
> 
>
> Key: BEAM-7981
> URL: https://issues.apache.org/jira/browse/BEAM-7981
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> I believe the bug is in CallableWrapperDoFn.default_type_hints, which 
> converts Iterable[str] to str.
> This test will be included (commented out) in 
> https://github.com/apache/beam/pull/9283
> {code}
>   def test_typed_callable_iterable_output(self):
> @typehints.with_input_types(int)
> @typehints.with_output_types(typehints.Iterable[str])
> def do_fn(element):
>   return [[str(element)] * 2]
> result = [1, 2] | beam.ParDo(do_fn)
> self.assertEqual([['1', '1'], ['2', '2']], sorted(result))
> {code}
> Result:
> {code}
> ==
> ERROR: test_typed_callable_iterable_output 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 104, in test_typed_callable_iterable_output
> result = [1, 2] | beam.ParDo(do_fn)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 519, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 406, in run
> self._options).run(False)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 419, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 129, in run_pipeline
> return runner.run_pipeline(pipeline, options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 366, in run_pipeline
> default_environment=self._default_environment))
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 373, in run_via_runner_api
> return self.run_stages(stage_context, stages)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 455, in run_stages
> stage_context.safe_coders)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 733, in _run_stage
> result, splits = bundle_manager.process_bundle(data_input, data_output)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 1663, in process_bundle
> part, expected_outputs), part_inputs):
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
> return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
> __get_result
> raise self._exception
>   File 

[jira] [Work logged] (BEAM-7981) ParDo function wrapper doesn't support Iterable output types

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7981?focusedWorklogId=331759&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331759
 ]

ASF GitHub Bot logged work on BEAM-7981:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:32
Start Date: 22/Oct/19 01:32
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9708: [BEAM-7981] Fix 
double iterable stripping
URL: https://github.com/apache/beam/pull/9708#discussion_r337305123
 
 

 ##
 File path: sdks/python/apache_beam/typehints/typed_pipeline_test_py3.py
 ##
 @@ -107,14 +105,16 @@ class MyDoFn(beam.DoFn):
       def process(self, element: int) -> str:
         return str(element)
 
-    with self.assertRaisesRegex(ValueError, r'str.*is not iterable'):
+    with self.assertLogs() as cm:
 
 Review comment:
   reverted
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331759)
Time Spent: 1.5h  (was: 1h 20m)

> ParDo function wrapper doesn't support Iterable output types
> 
>
> Key: BEAM-7981
> URL: https://issues.apache.org/jira/browse/BEAM-7981
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I believe the bug is in CallableWrapperDoFn.default_type_hints, which 
> converts Iterable[str] to str.
> This test will be included (commented out) in 
> https://github.com/apache/beam/pull/9283
> {code}
>   def test_typed_callable_iterable_output(self):
> @typehints.with_input_types(int)
> @typehints.with_output_types(typehints.Iterable[str])
> def do_fn(element):
>   return [[str(element)] * 2]
> result = [1, 2] | beam.ParDo(do_fn)
> self.assertEqual([['1', '1'], ['2', '2']], sorted(result))
> {code}
> Result:
> {code}
> ==
> ERROR: test_typed_callable_iterable_output 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 104, in test_typed_callable_iterable_output
> result = [1, 2] | beam.ParDo(do_fn)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 519, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 406, in run
> self._options).run(False)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 419, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 129, in run_pipeline
> return runner.run_pipeline(pipeline, options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 366, in run_pipeline
> default_environment=self._default_environment))
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 373, in run_via_runner_api
> return self.run_stages(stage_context, stages)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 455, in run_stages
> stage_context.safe_coders)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 733, in _run_stage
> result, splits = bundle_manager.process_bundle(data_input, data_output)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 1663, in process_bundle
> part, expected_outputs), part_inputs):
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
> return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
> __get_result
> raise self._exception
>   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
> result = self.fn(*self.args, 

[jira] [Work logged] (BEAM-7981) ParDo function wrapper doesn't support Iterable output types

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7981?focusedWorklogId=331758&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331758
 ]

ASF GitHub Bot logged work on BEAM-7981:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:32
Start Date: 22/Oct/19 01:32
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9708: [BEAM-7981] Fix 
double iterable stripping
URL: https://github.com/apache/beam/pull/9708#discussion_r337304641
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -676,22 +676,17 @@ def __repr__(self):
 
   def default_type_hints(self):
     fn_type_hints = typehints.decorators.IOTypeHints.from_callable(self._fn)
-    if fn_type_hints is not None:
-      try:
-        fn_type_hints.strip_iterable()
-      except ValueError as e:
-        raise ValueError('Return value not iterable: %s: %s' % (self._fn, e))
     type_hints = get_type_hints(self._fn).with_defaults(fn_type_hints)
-    # If the fn was a DoFn annotated with a type-hint that hinted a return
-    # type compatible with Iterable[Any], then we strip off the outer
-    # container type due to the 'flatten' portion of FlatMap.
-    # TODO(robertwb): Should we require an iterable specification for FlatMap?
-    if type_hints.output_types:
-      args, kwargs = type_hints.output_types
-      if len(args) == 1 and is_consistent_with(
-          args[0], typehints.Iterable[typehints.Any]):
-        type_hints = type_hints.copy()
-        type_hints.set_output_types(element_type(args[0]), **kwargs)
+    # Do not modify self._fn's type hints object. This method may be called
+    # more than once (such as once for input and once for output type checks)
+    # and strip_iterable modifies the object.
 
 Review comment:
   done
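   For readers skimming the diff above, here is a simplified illustration (a 
toy class and Python 3.8+ typing.get_args, not Beam's real IOTypeHints) of why 
mutating the shared hints object is a problem when the method runs more than 
once:

{code}
from typing import Iterable, List, get_args


class ToyHints(object):
  def __init__(self, output_type):
    self.output_type = output_type

  def strip_iterable(self):
    # Remove the outer iterable layer *in place*: Iterable[X] -> X.
    args = get_args(self.output_type)
    if args:
      self.output_type = args[0]


hints = ToyHints(Iterable[List[str]])
hints.strip_iterable()  # first call (e.g. output type check): List[str]
hints.strip_iterable()  # second call on the same object: str -- one layer too many
print(hints.output_type)
{code}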
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331758)
Time Spent: 1h 20m  (was: 1h 10m)

> ParDo function wrapper doesn't support Iterable output types
> 
>
> Key: BEAM-7981
> URL: https://issues.apache.org/jira/browse/BEAM-7981
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> I believe the bug is in CallableWrapperDoFn.default_type_hints, which 
> converts Iterable[str] to str.
> This test will be included (commented out) in 
> https://github.com/apache/beam/pull/9283
> {code}
>   def test_typed_callable_iterable_output(self):
> @typehints.with_input_types(int)
> @typehints.with_output_types(typehints.Iterable[str])
> def do_fn(element):
>   return [[str(element)] * 2]
> result = [1, 2] | beam.ParDo(do_fn)
> self.assertEqual([['1', '1'], ['2', '2']], sorted(result))
> {code}
> Result:
> {code}
> ==
> ERROR: test_typed_callable_iterable_output 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 104, in test_typed_callable_iterable_output
> result = [1, 2] | beam.ParDo(do_fn)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 519, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 406, in run
> self._options).run(False)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 419, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 129, in run_pipeline
> return runner.run_pipeline(pipeline, options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 366, in run_pipeline
> default_environment=self._default_environment))
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 373, in run_via_runner_api
> return self.run_stages(stage_context, stages)
>   File 
> 

[jira] [Work logged] (BEAM-7981) ParDo function wrapper doesn't support Iterable output types

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7981?focusedWorklogId=331753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331753
 ]

ASF GitHub Bot logged work on BEAM-7981:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:11
Start Date: 22/Oct/19 01:11
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9708: [BEAM-7981] Fix 
double iterable stripping
URL: https://github.com/apache/beam/pull/9708#discussion_r337302994
 
 

 ##
 File path: sdks/python/apache_beam/transforms/core.py
 ##
 @@ -585,7 +585,7 @@ def default_type_hints(self):
       try:
         fn_type_hints.strip_iterable()
       except ValueError as e:
-        raise ValueError('Return value not iterable: %s: %s' % (self, e))
+        logging.warning('%s: %s', self.default_label(), e)
 
 Review comment:
   I automatically made this a warning to be consistent with the warning for 
the other invocation of strip_iterable below.
   Below, the code historically allowed output type hints of DoFns to not be 
iterable. I would like it to be an exception when the output type is not 
iterable, but it seems that it would break existing pipelines (at least in the 
case below).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331753)
Time Spent: 1h 10m  (was: 1h)

> ParDo function wrapper doesn't support Iterable output types
> 
>
> Key: BEAM-7981
> URL: https://issues.apache.org/jira/browse/BEAM-7981
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I believe the bug is in CallableWrapperDoFn.default_type_hints, which 
> converts Iterable[str] to str.
> This test will be included (commented out) in 
> https://github.com/apache/beam/pull/9283
> {code}
>   def test_typed_callable_iterable_output(self):
> @typehints.with_input_types(int)
> @typehints.with_output_types(typehints.Iterable[str])
> def do_fn(element):
>   return [[str(element)] * 2]
> result = [1, 2] | beam.ParDo(do_fn)
> self.assertEqual([['1', '1'], ['2', '2']], sorted(result))
> {code}
> Result:
> {code}
> ==
> ERROR: test_typed_callable_iterable_output 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest)
> --
> Traceback (most recent call last):
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 104, in test_typed_callable_iterable_output
> result = [1, 2] | beam.ParDo(do_fn)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/transforms/ptransform.py",
>  line 519, in __ror__
> p.run().wait_until_finish()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 406, in run
> self._options).run(False)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/pipeline.py", 
> line 419, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/direct/direct_runner.py",
>  line 129, in run_pipeline
> return runner.run_pipeline(pipeline, options)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 366, in run_pipeline
> default_environment=self._default_environment))
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 373, in run_via_runner_api
> return self.run_stages(stage_context, stages)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 455, in run_stages
> stage_context.safe_coders)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 733, in _run_stage
> result, splits = bundle_manager.process_bundle(data_input, data_output)
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py",
>  line 1663, in process_bundle
> part, expected_outputs), part_inputs):
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> 

[jira] [Work logged] (BEAM-8405) Python: Datastore: add support for embedded entities

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8405?focusedWorklogId=331752&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331752
 ]

ASF GitHub Bot logged work on BEAM-8405:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:10
Start Date: 22/Oct/19 01:10
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #9805: [BEAM-8405] 
Support embedded Datastore entities
URL: https://github.com/apache/beam/pull/9805#issuecomment-544770107
 
 
   LGTM. Thanks.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331752)
Time Spent: 1h 20m  (was: 1h 10m)

> Python: Datastore: add support for embedded entities 
> -
>
> Key: BEAM-8405
> URL: https://issues.apache.org/jira/browse/BEAM-8405
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The conversion methods to/from the client entity type should be updated to 
> support an embedded Entity.
> https://github.com/apache/beam/blob/603d68aafe9bdcd124d28ad62ad36af01e7a7403/sdks/python/apache_beam/io/gcp/datastore/v1new/types.py#L216-L240



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331743&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331743
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:03
Start Date: 22/Oct/19 01:03
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9720: [BEAM-8335] Add 
initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337284121
 
 

 ##
 File path: sdks/python/apache_beam/testing/interactive_stream_test.py
 ##
 @@ -0,0 +1,118 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import unittest
+
+import grpc
+from google.protobuf import timestamp_pb2
+
+from apache_beam import coders
+from apache_beam.portability.api import beam_interactive_api_pb2 as interactive_api
+from apache_beam.portability.api import beam_interactive_api_pb2_grpc as interactive_api_grpc
+from apache_beam.portability.api.beam_interactive_api_pb2 import InteractiveStreamHeader
+from apache_beam.portability.api.beam_interactive_api_pb2 import InteractiveStreamRecord
+from apache_beam.portability.api.beam_runner_api_pb2 import TestStreamPayload
+from apache_beam.runners.interactive.caching.streaming_cache import StreamingCache
+from apache_beam.testing.interactive_stream import InteractiveStreamController
+
+
+def get_open_port():
+  import socket
+  s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
+  s.bind(('', 0))
+  s.listen(1)
+  port = s.getsockname()[1]
+  s.close()
+  return port
 
 Review comment:
   Couldn't the port returned here be taken by another process after it is 
returned and before it is used?
   
   I believe there is a way to pick an open port with gRPC using: 
https://github.com/apache/beam/blob/e4aab40378f779ee7d6b6394301dd19db6bb0b82/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1224
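   A sketch of the alternative the link points at: let gRPC bind port 0 and 
report the port it chose, which avoids the close-then-rebind race (standalone 
example, not the PR's code):

{code}
from concurrent.futures import ThreadPoolExecutor

import grpc

server = grpc.server(ThreadPoolExecutor(max_workers=2))
# Port 0 asks the OS for a free port; add_insecure_port returns the chosen
# port, and the server keeps it bound, so nothing else can grab it in between.
port = server.add_insecure_port('localhost:0')
server.start()
print('serving on port', port)
server.stop(0)
{code}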
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331743)
Time Spent: 5.5h  (was: 5h 20m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331742&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331742
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:03
Start Date: 22/Oct/19 01:03
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9720: [BEAM-8335] Add 
initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337301553
 
 

 ##
 File path: sdks/python/apache_beam/testing/interactive_stream.py
 ##
 @@ -0,0 +1,125 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import time
+from concurrent.futures import ThreadPoolExecutor
+
+import grpc
+
+from apache_beam.portability.api import beam_interactive_api_pb2
+from apache_beam.portability.api import beam_interactive_api_pb2_grpc
+from apache_beam.portability.api.beam_interactive_api_pb2_grpc import InteractiveServiceServicer
+
+STRING_TO_API_STATE = {
+    'STOPPED': beam_interactive_api_pb2.StatusResponse.STOPPED,
+    'PAUSED': beam_interactive_api_pb2.StatusResponse.PAUSED,
+    'RUNNING': beam_interactive_api_pb2.StatusResponse.RUNNING,
+}
+
+
+class InteractiveStreamController(InteractiveServiceServicer):
+  def __init__(self, endpoint, streaming_cache):
+    self._endpoint = endpoint
+    self._server = grpc.server(ThreadPoolExecutor(max_workers=2))
+    beam_interactive_api_pb2_grpc.add_InteractiveServiceServicer_to_server(
+        self, self._server)
+    self._server.add_insecure_port(self._endpoint)
+    self._server.start()
+
+    self._streaming_cache = streaming_cache
+    self._state = 'STOPPED'
+    self._playback_speed = 1.0
+
+  def Start(self, request, context):
+    """Requests that the Service starts emitting elements.
+    """
+
+    self._next_state('RUNNING')
+    self._playback_speed = request.playback_speed or 1.0
+    self._playback_speed = 1.0 / max(min(self._playback_speed, 100.0), 0.1)
+    return beam_interactive_api_pb2.StartResponse()
+
+  def Stop(self, request, context):
+    """Requests that the Service stop emitting elements.
+    """
+    self._next_state('STOPPED')
+    return beam_interactive_api_pb2.StartResponse()
+
+  def Pause(self, request, context):
+    """Requests that the Service pause emitting elements.
+    """
+    self._next_state('PAUSED')
+    return beam_interactive_api_pb2.PauseResponse()
+
+  def Step(self, request, context):
+    """Requests that the Service emit a single element from each cached source.
+    """
+    self._next_state('STEP')
+    return beam_interactive_api_pb2.StepResponse()
+
+  def Status(self, request, context):
+    """Returns the status of the service.
+    """
+    resp = beam_interactive_api_pb2.StatusResponse()
+    resp.stream_time.GetCurrentTime()
+    resp.state = STRING_TO_API_STATE[self._state]
+    return resp
+
+  def _reset_state(self):
+    self._reader = None
+    self._playback_speed = 1.0
+    self._state = 'STOPPED'
+
+  def _next_state(self, state):
+    if self._state == 'STOPPED':
+      if state == 'RUNNING' or state == 'STEP':
+        self._reader = self._streaming_cache.reader()
+    elif self._state == 'RUNNING':
+      if state == 'STOPPED':
+        self._reset_state()
+    self._state = state
+
+  def Events(self, request, context):
+    # The TestStream will wait until the stream starts.
+    while self._state != 'RUNNING' and self._state != 'STEP':
+      time.sleep(0.01)
+
+    events = self._reader.read()
+    if events:
+      for e in events:
+        # Here we assume that the first event is the processing_time_event so
+        # that we can sleep and then emit the element. Thereby, trying to
+        # emulate the original stream.
+        if e.HasField('processing_time_event'):
+          sleep_duration = (
+              e.processing_time_event.advance_duration * self._playback_speed
+          ) * 10**-6
+          time.sleep(sleep_duration)
+        yield beam_interactive_api_pb2.EventsResponse(events=[e])
+    else:
+      resp = 

[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331741&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331741
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:03
Start Date: 22/Oct/19 01:03
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9720: [BEAM-8335] Add 
initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337301423
 
 

 ##
 File path: sdks/python/apache_beam/testing/interactive_stream.py
 ##
 @@ -0,0 +1,125 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from __future__ import absolute_import
+
+import time
+from concurrent.futures import ThreadPoolExecutor
+
+import grpc
+
+from apache_beam.portability.api import beam_interactive_api_pb2
+from apache_beam.portability.api import beam_interactive_api_pb2_grpc
+from apache_beam.portability.api.beam_interactive_api_pb2_grpc import InteractiveServiceServicer
+
+STRING_TO_API_STATE = {
+    'STOPPED': beam_interactive_api_pb2.StatusResponse.STOPPED,
+    'PAUSED': beam_interactive_api_pb2.StatusResponse.PAUSED,
+    'RUNNING': beam_interactive_api_pb2.StatusResponse.RUNNING,
+}
+
+
+class InteractiveStreamController(InteractiveServiceServicer):
+  def __init__(self, endpoint, streaming_cache):
+    self._endpoint = endpoint
+    self._server = grpc.server(ThreadPoolExecutor(max_workers=2))
+    beam_interactive_api_pb2_grpc.add_InteractiveServiceServicer_to_server(
+        self, self._server)
+    self._server.add_insecure_port(self._endpoint)
+    self._server.start()
+
+    self._streaming_cache = streaming_cache
+    self._state = 'STOPPED'
+    self._playback_speed = 1.0
+
+  def Start(self, request, context):
+    """Requests that the Service starts emitting elements.
+    """
+
+    self._next_state('RUNNING')
+    self._playback_speed = request.playback_speed or 1.0
+    self._playback_speed = 1.0 / max(min(self._playback_speed, 100.0), 0.1)
+    return beam_interactive_api_pb2.StartResponse()
+
+  def Stop(self, request, context):
+    """Requests that the Service stop emitting elements.
+    """
+    self._next_state('STOPPED')
+    return beam_interactive_api_pb2.StartResponse()
+
+  def Pause(self, request, context):
+    """Requests that the Service pause emitting elements.
+    """
+    self._next_state('PAUSED')
+    return beam_interactive_api_pb2.PauseResponse()
+
+  def Step(self, request, context):
+    """Requests that the Service emit a single element from each cached source.
+    """
+    self._next_state('STEP')
+    return beam_interactive_api_pb2.StepResponse()
+
+  def Status(self, request, context):
+    """Returns the status of the service.
+    """
+    resp = beam_interactive_api_pb2.StatusResponse()
+    resp.stream_time.GetCurrentTime()
+    resp.state = STRING_TO_API_STATE[self._state]
+    return resp
+
+  def _reset_state(self):
+    self._reader = None
+    self._playback_speed = 1.0
+    self._state = 'STOPPED'
+
+  def _next_state(self, state):
+    if self._state == 'STOPPED':
+      if state == 'RUNNING' or state == 'STEP':
+        self._reader = self._streaming_cache.reader()
+    elif self._state == 'RUNNING':
+      if state == 'STOPPED':
+        self._reset_state()
+    self._state = state
+
+  def Events(self, request, context):
+    # The TestStream will wait until the stream starts.
+    while self._state != 'RUNNING' and self._state != 'STEP':
+      time.sleep(0.01)
 
 Review comment:
   Would it make sense to increase the delay here? I am worried that polling 
every 0.01 seconds might be too frequent for a check.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331741)
Time Spent: 5.5h  (was: 5h 20m)

> Add streaming support to Interactive Beam
> 

[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331744&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331744
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 22/Oct/19 01:03
Start Date: 22/Oct/19 01:03
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9720: [BEAM-8335] Add 
initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337248607
 
 

 ##
 File path: model/pipeline/src/main/proto/beam_runner_api.proto
 ##
 @@ -500,6 +500,7 @@ message TestStreamPayload {
 
 message AdvanceWatermark {
   int64 new_watermark = 1;
+  string tag = 2;
 
 Review comment:
   How would the TestStream primitive behave when it only manages a single 
output (the current mode)? Would the tag be optional?
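
   For reference, proto3 string fields read back as the empty string when they 
are never set, so one backwards-compatible option is to treat an empty tag as 
the single default output. A hedged sketch of that convention follows; the 
constant and function below are illustrative, not part of the proto:

{code:python}
# Sketch: map an unset/empty tag to a default output so the current
# single-output TestStream behavior keeps working unchanged.
DEFAULT_OUTPUT_TAG = 'None'  # hypothetical name for the main output


def watermark_target(advance_watermark_event):
  tag = advance_watermark_event.tag  # '' when the producer never set it
  return tag if tag else DEFAULT_OUTPUT_TAG
{code}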
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331744)
Time Spent: 5.5h  (was: 5h 20m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8405) Python: Datastore: add support for embedded entities

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8405?focusedWorklogId=331740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331740
 ]

ASF GitHub Bot logged work on BEAM-8405:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:56
Start Date: 22/Oct/19 00:56
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9805: [BEAM-8405] Support 
embedded Datastore entities
URL: https://github.com/apache/beam/pull/9805#issuecomment-544767624
 
 
   @chamikaramj does this LGTY?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331740)
Time Spent: 1h 10m  (was: 1h)

> Python: Datastore: add support for embedded entities 
> -
>
> Key: BEAM-8405
> URL: https://issues.apache.org/jira/browse/BEAM-8405
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The conversion methods to/from the client entity type should be updated to 
> support an embedded Entity.
> https://github.com/apache/beam/blob/603d68aafe9bdcd124d28ad62ad36af01e7a7403/sdks/python/apache_beam/io/gcp/datastore/v1new/types.py#L216-L240



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331739=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331739
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:55
Start Date: 22/Oct/19 00:55
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337300560
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##
 @@ -0,0 +1,258 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Module visualizes PCollection data.
+
+For internal use only; no backwards-compatibility guarantees.
+Only works with Python 3.5+.
+"""
+from __future__ import absolute_import
+
+import base64
+import logging
+from datetime import timedelta
+
+from pandas.io.json import json_normalize
+
+from apache_beam import pvalue
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive import pipeline_instrument as instr
+from facets_overview.generic_feature_statistics_generator import 
GenericFeatureStatisticsGenerator
+from IPython.core.display import HTML
+from IPython.core.display import Javascript
+from IPython.core.display import display
+from IPython.core.display import display_javascript
+from IPython.core.display import update_display
+from timeloop import Timeloop
+
+# jsons doesn't support < Python 3.5. Work around with json for legacy tests.
 
 Review comment:
   Can we at least print a warning that says "interactive is not supported in 
py2", like we do for python 3 with beam <= 2.15?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331739)
Time Spent: 8h 10m  (was: 8h)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> as materialized pcoll.
> e.g., visualize(pcoll)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331738=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331738
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:51
Start Date: 22/Oct/19 00:51
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r33733
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##
 @@ -0,0 +1,258 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Module visualizes PCollection data.
+
+For internal use only; no backwards-compatibility guarantees.
+Only works with Python 3.5+.
+"""
+from __future__ import absolute_import
+
+import base64
+import logging
+from datetime import timedelta
+
+from pandas.io.json import json_normalize
+
+from apache_beam import pvalue
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive import pipeline_instrument as instr
+from facets_overview.generic_feature_statistics_generator import 
GenericFeatureStatisticsGenerator
+from IPython.core.display import HTML
+from IPython.core.display import Javascript
+from IPython.core.display import display
+from IPython.core.display import display_javascript
+from IPython.core.display import update_display
+from timeloop import Timeloop
+
+# jsons doesn't support < Python 3.5. Work around with json for legacy tests.
 
 Review comment:
   This will still work in Py2, though. And a sys.exit() would probably break 
some Gradle tasks that expect a 0 exit code.
   I've put a TODO item here; it points to the general Interactive Beam ticket 
for marking things to clean up once py2 is completely deprecated from Beam.
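
   A minimal sketch of the warn-instead-of-exit option discussed here, assuming 
the warning is emitted at import time so that Gradle tasks importing the module 
under py2 still finish with exit code 0:

{code:python}
# Sketch only: log a clear warning under Python 2 instead of calling
# sys.exit(), which would fail tasks that expect a zero exit code.
from __future__ import absolute_import

import logging
import sys

if sys.version_info[0] < 3:
  logging.getLogger(__name__).warning(
      'Interactive Beam visualization requires Python 3.5+; '
      'running under Python 2 is not supported.')
{code}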
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331738)
Time Spent: 8h  (was: 7h 50m)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> as materialized pcoll.
> e.g., visualize(pcoll)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331734
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:28
Start Date: 22/Oct/19 00:28
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9835: 
[BEAM-8433] Use JUnit parameterized runner for dialect-sensitive integration 
tests
URL: https://github.com/apache/beam/pull/9835
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331734)
Time Spent: 3h 10m  (was: 3h)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331733=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331733
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:14
Start Date: 22/Oct/19 00:14
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337293449
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##
 @@ -0,0 +1,258 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Module visualizes PCollection data.
+
+For internal use only; no backwards-compatibility guarantees.
+Only works with Python 3.5+.
+"""
+from __future__ import absolute_import
+
+import base64
+import logging
+from datetime import timedelta
+
+from pandas.io.json import json_normalize
+
+from apache_beam import pvalue
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive import pipeline_instrument as instr
+from facets_overview.generic_feature_statistics_generator import 
GenericFeatureStatisticsGenerator
+from IPython.core.display import HTML
+from IPython.core.display import Javascript
+from IPython.core.display import display
+from IPython.core.display import display_javascript
+from IPython.core.display import update_display
+from timeloop import Timeloop
+
+# jsons doesn't support < Python 3.5. Work around with json for legacy tests.
 
 Review comment:
   Can we just print a warning and exit, or exclude this from running in py2? 
We probably don't want to give the impression that this will work with py2.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331733)
Time Spent: 7h 50m  (was: 7h 40m)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> as materialized pcoll.
> e.g., visualize(pcoll)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331730=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331730
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 22/Oct/19 00:09
Start Date: 22/Oct/19 00:09
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337292394
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+import "google/protobuf/duration.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  // How quickly the stream will be played back, e.g. if playback_speed == 2,
+  // then the stream will replay events twice as fast as they were recorded.
+  double playback_speed = 1;
+
+  oneof starting_placement {
+// (Optional) if present, will start the stream at the specified timestamp.
+google.protobuf.Timestamp start_at = 2;
+
+// (Optional) if present, will advance the stream by replaying events as
+// quickly as possible until the stream timestamp has advanced by the
+// specified amount.
+google.protobuf.Duration advance_by = 3;
 
 Review comment:
   I think we want to make that a separate method because the user might want 
to advance to an arbitrary timestamp during the course of the replay, and they 
might want to do so multiple times (e.g. start replay, ..., advance to time A, 
... advance to time B, ...).
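
   For context, here is a hedged client-side sketch of the starting_placement 
oneof as proposed in this revision. The module and stub names assume the usual 
gRPC Python codegen for the InteractiveService above; this illustrates the 
current proposal, not a final API:

{code:python}
# Sketch: only one of start_at / advance_by may be set, since they share
# the starting_placement oneof in StartRequest.
import grpc
from google.protobuf import duration_pb2

from apache_beam.portability.api import beam_interactive_api_pb2
from apache_beam.portability.api import beam_interactive_api_pb2_grpc


def start_and_skip_ahead(endpoint, seconds):
  channel = grpc.insecure_channel(endpoint)
  stub = beam_interactive_api_pb2_grpc.InteractiveServiceStub(channel)
  request = beam_interactive_api_pb2.StartRequest(
      playback_speed=2.0,
      advance_by=duration_pb2.Duration(seconds=seconds))
  return stub.Start(request)
{code}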
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331730)
Time Spent: 5h 20m  (was: 5h 10m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached 

[jira] [Commented] (BEAM-8409) docker-credential-gcloud not installed or not available in PATH

2019-10-21 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956549#comment-16956549
 ] 

Udi Meiri commented on BEAM-8409:
-

Do we understand why only apache-beam-jenkins-15 seems to be affected?


> docker-credential-gcloud not installed or not available in PATH
> ---
>
> Key: BEAM-8409
> URL: https://issues.apache.org/jira/browse/BEAM-8409
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kamil Wasilewski
>Assignee: Yifan Zou
>Priority: Major
>  Labels: currently-failing
>
> _Use this form to file an issue for test failure:_
>  * 
> [beam_PreCommit_CommunityMetrics_Commit|https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_PreCommit_CommunityMetrics_Commit/1355/]
>  * 
> [beam_PostCommit_Python2_PR|https://builds.apache.org/job/beam_PostCommit_Python2_PR]
> Initial investigation:
> Jenkins job fails when executing docker-compose script.
> It seems the only Jenkins worker affected is *apache-beam-jenkins-15.*
>  
> Relevant logs:
> 1)
>  
> {code:java}
> 11:56:24 Execution failed for task ':beam-test-infra-metrics:composeUp'.
> 11:56:24 > Exit-code 255 when calling docker-compose, stdout: postgresql uses 
> an image, skipping
> 11:56:24   prometheus uses an image, skipping
> 11:56:24   pushgateway uses an image, skipping
> 11:56:24   alertmanager uses an image, skipping
> 11:56:24   Building grafana
> 11:56:24   [17038] Failed to execute script docker-compose
> 11:56:24   Traceback (most recent call last):
> 11:56:24 File "bin/docker-compose", line 6, in 
> 11:56:24 File "compose/cli/main.py", line 71, in main
> 11:56:24 File "compose/cli/main.py", line 127, in perform_command
> 11:56:24 File "compose/cli/main.py", line 287, in build
> 11:56:24 File "compose/project.py", line 386, in build
> 11:56:24 File "compose/project.py", line 368, in build_service
> 11:56:24 File "compose/service.py", line 1084, in build
> 11:56:24 File "site-packages/docker/api/build.py", line 260, in build
> 11:56:24 File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 11:56:24 File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 11:56:24 File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 11:56:24 File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 11:56:24 File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 11:56:24   dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {code}
> 2)
> {code:java}
> 16:26:08 [9316] Failed to execute script docker-compose
> 16:26:08 Traceback (most recent call last):
> 16:26:08   File "bin/docker-compose", line 6, in 
> 16:26:08   File "compose/cli/main.py", line 71, in main
> 16:26:08   File "compose/cli/main.py", line 127, in perform_command
> 16:26:08   File "compose/cli/main.py", line 287, in build
> 16:26:08   File "compose/project.py", line 386, in build
> 16:26:08   File "compose/project.py", line 368, in build_service
> 16:26:08   File "compose/service.py", line 1084, in build
> 16:26:08   File "site-packages/docker/api/build.py", line 260, in build
> 16:26:08   File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 16:26:08   File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 16:26:08   File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 16:26:08   File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 16:26:08   File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 16:26:08 dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {code}
>  **
>  
> 
> _After you've filled out the above details, please [assign the issue to an 
> individual|https://beam.apache.org/contribute/postcommits-guides/index.html#find_specialist].
>  Assignee should [treat test failures as 
> high-priority|https://beam.apache.org/contribute/postcommits-policies/#assigned-failing-test],
>  helping to fix the issue or find a more appropriate owner. See [Apache Beam 
> Post-Commit 
> Policies|https://beam.apache.org/contribute/postcommits-policies]._



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-8455) docker-credential-gcloud missing

2019-10-21 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri closed BEAM-8455.
---
Fix Version/s: Not applicable
   Resolution: Duplicate

> docker-credential-gcloud missing
> 
>
> Key: BEAM-8455
> URL: https://issues.apache.org/jira/browse/BEAM-8455
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Udi Meiri
>Priority: Major
> Fix For: Not applicable
>
>
> Task: :sdks:python:test-suites:direct:py37:hdfsIntegrationTest
> Machine: apache-beam-jenkins-15
> {code}
> 14:16:01 [16026] Failed to execute script docker-compose
> 14:16:01 Traceback (most recent call last):
> 14:16:01   File "bin/docker-compose", line 6, in 
> 14:16:01   File "compose/cli/main.py", line 71, in main
> 14:16:01   File "compose/cli/main.py", line 127, in perform_command
> 14:16:01   File "compose/cli/main.py", line 287, in build
> 14:16:01   File "compose/project.py", line 386, in build
> 14:16:01   File "compose/project.py", line 368, in build_service
> 14:16:01   File "compose/service.py", line 1084, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 260, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 14:16:01   File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 14:16:01   File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 14:16:01   File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 14:16:01   File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 14:16:01 dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {code}
> https://builds.apache.org/job/beam_PostCommit_Python37/745/console



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8455) docker-credential-gcloud missing

2019-10-21 Thread Udi Meiri (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri updated BEAM-8455:

Summary: docker-credential-gcloud missing  (was: hdfsIntegrationTest: 
docker-credential-gcloud missing)

> docker-credential-gcloud missing
> 
>
> Key: BEAM-8455
> URL: https://issues.apache.org/jira/browse/BEAM-8455
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Udi Meiri
>Priority: Major
>
> Task: :sdks:python:test-suites:direct:py37:hdfsIntegrationTest
> Machine: apache-beam-jenkins-15
> {code}
> 14:16:01 [16026] Failed to execute script docker-compose
> 14:16:01 Traceback (most recent call last):
> 14:16:01   File "bin/docker-compose", line 6, in 
> 14:16:01   File "compose/cli/main.py", line 71, in main
> 14:16:01   File "compose/cli/main.py", line 127, in perform_command
> 14:16:01   File "compose/cli/main.py", line 287, in build
> 14:16:01   File "compose/project.py", line 386, in build
> 14:16:01   File "compose/project.py", line 368, in build_service
> 14:16:01   File "compose/service.py", line 1084, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 260, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 14:16:01   File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 14:16:01   File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 14:16:01   File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 14:16:01   File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 14:16:01 dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {code}
> https://builds.apache.org/job/beam_PostCommit_Python37/745/console



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8455) hdfsIntegrationTest: docker-credential-gcloud missing

2019-10-21 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956533#comment-16956533
 ] 

Udi Meiri commented on BEAM-8455:
-

Same problem for this job: 
https://builds.apache.org/job/beam_PreCommit_CommunityMetrics_Commit/1395/console

> hdfsIntegrationTest: docker-credential-gcloud missing
> -
>
> Key: BEAM-8455
> URL: https://issues.apache.org/jira/browse/BEAM-8455
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Udi Meiri
>Priority: Major
>
> Task: :sdks:python:test-suites:direct:py37:hdfsIntegrationTest
> Machine: apache-beam-jenkins-15
> {code}
> 14:16:01 [16026] Failed to execute script docker-compose
> 14:16:01 Traceback (most recent call last):
> 14:16:01   File "bin/docker-compose", line 6, in 
> 14:16:01   File "compose/cli/main.py", line 71, in main
> 14:16:01   File "compose/cli/main.py", line 127, in perform_command
> 14:16:01   File "compose/cli/main.py", line 287, in build
> 14:16:01   File "compose/project.py", line 386, in build
> 14:16:01   File "compose/project.py", line 368, in build_service
> 14:16:01   File "compose/service.py", line 1084, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 260, in build
> 14:16:01   File "site-packages/docker/api/build.py", line 307, in 
> _set_auth_headers
> 14:16:01   File "site-packages/docker/auth.py", line 310, in 
> get_all_credentials
> 14:16:01   File "site-packages/docker/auth.py", line 262, in 
> _resolve_authconfig_credstore
> 14:16:01   File "site-packages/docker/auth.py", line 287, in 
> _get_store_instance
> 14:16:01   File "site-packages/dockerpycreds/store.py", line 25, in __init__
> 14:16:01 dockerpycreds.errors.InitializationError: docker-credential-gcloud 
> not installed or not available in PATH
> {code}
> https://builds.apache.org/job/beam_PostCommit_Python37/745/console



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8455) hdfsIntegrationTest: docker-credential-gcloud missing

2019-10-21 Thread Udi Meiri (Jira)
Udi Meiri created BEAM-8455:
---

 Summary: hdfsIntegrationTest: docker-credential-gcloud missing
 Key: BEAM-8455
 URL: https://issues.apache.org/jira/browse/BEAM-8455
 Project: Beam
  Issue Type: Sub-task
  Components: testing
Reporter: Udi Meiri


Task: :sdks:python:test-suites:direct:py37:hdfsIntegrationTest
Machine: apache-beam-jenkins-15
{code}
14:16:01 [16026] Failed to execute script docker-compose
14:16:01 Traceback (most recent call last):
14:16:01   File "bin/docker-compose", line 6, in 
14:16:01   File "compose/cli/main.py", line 71, in main
14:16:01   File "compose/cli/main.py", line 127, in perform_command
14:16:01   File "compose/cli/main.py", line 287, in build
14:16:01   File "compose/project.py", line 386, in build
14:16:01   File "compose/project.py", line 368, in build_service
14:16:01   File "compose/service.py", line 1084, in build
14:16:01   File "site-packages/docker/api/build.py", line 260, in build
14:16:01   File "site-packages/docker/api/build.py", line 307, in 
_set_auth_headers
14:16:01   File "site-packages/docker/auth.py", line 310, in get_all_credentials
14:16:01   File "site-packages/docker/auth.py", line 262, in 
_resolve_authconfig_credstore
14:16:01   File "site-packages/docker/auth.py", line 287, in _get_store_instance
14:16:01   File "site-packages/dockerpycreds/store.py", line 25, in __init__
14:16:01 dockerpycreds.errors.InitializationError: docker-credential-gcloud not 
installed or not available in PATH
{code}
https://builds.apache.org/job/beam_PostCommit_Python37/745/console



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8416) ZipFileArtifactServiceTest.test_concurrent_requests flaky

2019-10-21 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956527#comment-16956527
 ] 

Udi Meiri commented on BEAM-8416:
-

Still happening:
{code}
11:33:07 ==
11:33:07 ERROR: test_concurrent_requests 
(apache_beam.runners.portability.artifact_service_test.ZipFileArtifactServiceTest)
11:33:07 --
11:33:07 Traceback (most recent call last):
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
 line 215, in test_concurrent_requests
11:33:07 _ = list(pool.map(check, range(100)))
11:33:07   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
result_iterator
11:33:07 yield fs.pop().result()
11:33:07   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 425, in 
result
11:33:07 return self.__get_result()
11:33:07   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
__get_result
11:33:07 raise self._exception
11:33:07   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in 
run
11:33:07 result = self.fn(*self.args, **self.kwargs)
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
 line 208, in check
11:33:07 self._service, tokens[session(index)], name(index)))
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
 line 73, in retrieve_artifact
11:33:07 name=name)))
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
 line 70, in 
11:33:07 return b''.join(chunk.data for chunk in 
retrieval_service.GetArtifact(
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service.py",
 line 126, in GetArtifact
11:33:07 for artifact in 
self._get_manifest_proxy(request.retrieval_token).location:
11:33:07   File 
"/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service.py",
 line 77, in _get_manifest_proxy
11:33:07 fin.read().decode('utf-8'), beam_artifact_api_pb2.ProxyManifest())
11:33:07   File "/usr/lib/python3.7/zipfile.py", line 885, in read
11:33:07 buf += self._read1(self.MAX_N)
11:33:07   File "/usr/lib/python3.7/zipfile.py", line 989, in _read1
11:33:07 self._update_crc(data)
11:33:07   File "/usr/lib/python3.7/zipfile.py", line 917, in _update_crc
11:33:07 raise BadZipFile("Bad CRC-32 for file %r" % self.name)
11:33:07 zipfile.BadZipFile: Bad CRC-32 for file 
'/3e3ff9aa4fe679c1bf76383e69bfb5e2167afb945aa30e15f05406cc8f55ad14/MANIFEST'
{code}
https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/

> ZipFileArtifactServiceTest.test_concurrent_requests flaky
> -
>
> Key: BEAM-8416
> URL: https://issues.apache.org/jira/browse/BEAM-8416
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {code}
> Traceback (most recent call last):
>   File "/usr/lib/python3.7/unittest/case.py", line 59, in testPartExecutor
> yield
>   File "/usr/lib/python3.7/unittest/case.py", line 615, in run
> testMethod()
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 215, in test_concurrent_requests
> _ = list(pool.map(check, range(100)))
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 425, in result
> return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
> __get_result
> raise self._exception
>   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
> result = self.fn(*self.args, **self.kwargs)
>   File 
> 

[jira] [Commented] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Igor Durovic (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956528#comment-16956528
 ] 

Igor Durovic commented on BEAM-8451:


fyi I elaborated more on the cause of this issue in my email to dev@. Let me 
know if I should add that to the ticket description.

> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python, runner-py-interactive
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
>  
> RecursionError: maximum recursion depth exceeded in __instancecheck__
> at 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py#L405]
>  
> This occurred after the execution of the last cell in 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956526#comment-16956526
 ] 

Ahmet Altay commented on BEAM-8451:
---

cc: [~davidyan] [~ningk]

> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python, runner-py-interactive
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
>  
> RecursionError: maximum recursion depth exceeded in __instancecheck__
> at 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py#L405]
>  
> This occurred after the execution of the last cell in 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Ahmet Altay (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-8451:
--
Component/s: runner-py-interactive

> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python, runner-py-interactive
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
>  
> RecursionError: maximum recursion depth exceeded in __instancecheck__
> at 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py#L405]
>  
> This occurred after the execution of the last cell in 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=331711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331711
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:31
Start Date: 21/Oct/19 23:31
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9756: [BEAM-3713] Add 
pytest for unit tests
URL: https://github.com/apache/beam/pull/9756#discussion_r337284238
 
 

 ##
 File path: sdks/python/conftest.py
 ##
 @@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+"""Pytest configuration and custom hooks."""
+
+from __future__ import absolute_import
+
+import sys
+
+# See pytest.ini for main collection rules.
+collect_ignore_glob = []
+if sys.version_info < (3,):
+  collect_ignore_glob.append('*_py3.py')
+for minor in [5, 6, 7, 8, 9]:
+  if sys.version_info < (3, minor):
+collect_ignore_glob.append('*_py3%d.py' % minor)
 
 Review comment:
   > But maintainability aside, it seems a bit odd to test for our lower bound 
in a loop when we know precisely what the lower bound is: 
`sys.version_info.minor +1`
   
   You're right. 
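
   A sketch of what that simplification might look like: derive the ignore 
globs directly from sys.version_info instead of testing each minor version in 
a loop. The resulting ignore set is the same, and the upper bound of 9 is kept 
from the snippet above:

{code:python}
# Sketch: ignore every *_py3N.py module whose minimum minor version is
# higher than the running interpreter's.
import sys

collect_ignore_glob = []
if sys.version_info[0] < 3:
  collect_ignore_glob.append('*_py3.py')
  first_unsupported_minor = 5
else:
  first_unsupported_minor = sys.version_info[1] + 1
for minor in range(first_unsupported_minor, 10):
  collect_ignore_glob.append('*_py3%d.py' % minor)
{code}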
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331711)
Time Spent: 10h 10m  (was: 10h)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Per 
> [https://nose.readthedocs.io/en/latest/|https://nose.readthedocs.io/en/latest/,]
>  , nose is in maintenance mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331708=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331708
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:20
Start Date: 21/Oct/19 23:20
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #9720: 
[BEAM-8335] Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337281724
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
 
 Review comment:
   Good point, I added the option to either start at a specified timestamp or 
to advance the stream in the StartRequest.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331708)
Time Spent: 5h 10m  (was: 5h)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331707=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331707
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:20
Start Date: 21/Oct/19 23:20
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #9720: 
[BEAM-8335] Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337281718
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  double playback_speed = 1;
+
+  google.protobuf.Timestamp start_time = 2;
+}
+message StartResponse { }
+
+message StopRequest { }
+message StopResponse { }
+
+message PauseRequest { }
+message PauseResponse { }
+
+message StatusRequest { }
+message StatusResponse {
+
+  // The current timestamp of the replay stream. Is MIN_TIMESTAMP when state
+  // is STOPPED.
+  google.protobuf.Timestamp stream_time = 1;
+
+  // The minimum watermark across all of the faked replayable unbounded 
sources.
+  // Is MIN_TIMESTAMP when state is STOPPED.
+  google.protobuf.Timestamp watermark = 2;
+
+  // The latest timestamp of the recording stream. Is MIN_TIMESTAMP if there is
+  // no recording.
+  google.protobuf.Timestamp recording_time = 3;
+
+  double playback_speed = 4;
+
+  enum State {
+// The InteractiveService is not replaying. Goes to RUNNING with a
+// StartRequest.
+STOPPED = 0;
+
+// The InteractiveService is replaying events. Goes to PAUSED with a
+// PauseRequest. Goes to STOPPED with a StopRequest.
+RUNNING = 1;
+
+// The InteractiveService is paused from replaying events. Goes to RUNNING
+// with either a StartRequest or a StepRequest. Goes to STOPPED with a
+// StopRequest.
+PAUSED = 2;
+  }
+  State state = 5;
+}
+
+message StepRequest { }
 
 Review comment:
   Done.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331707)
Time Spent: 5h  (was: 4h 50m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>

[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331706
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:19
Start Date: 21/Oct/19 23:19
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544746144
 
 
   Got multiple local successes. Running here: 
https://gradle.com/s/l572gifqnx5pg
   
   The metadata provider issue appears to be a flake. The two planners set up 
metadata totally differently. We should fix that.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331706)
Time Spent: 3h  (was: 2h 50m)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331705=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331705
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:12
Start Date: 21/Oct/19 23:12
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544746144
 
 
   Got multiple local successes. Running here: 
https://gradle.com/s/l572gifqnx5pg
   
   The metadata provider issue appears to be a flake. The two planners set up 
metadata totally differently. We should fix that.
   
   Here one success: https://gradle.com/s/l572gifqnx5pg
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331705)
Time Spent: 2h 50m  (was: 2h 40m)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-8446) apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests.test_big_query_write_new_types is flaky

2019-10-21 Thread Pablo Estrada (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada reassigned BEAM-8446:
---

Assignee: Pablo Estrada  (was: Juta Staes)

> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests.test_big_query_write_new_types
>  is flaky
> ---
>
> Key: BEAM-8446
> URL: https://issues.apache.org/jira/browse/BEAM-8446
> Project: Beam
>  Issue Type: New Feature
>  Components: test-failures
>Reporter: Boyuan Zhang
>Assignee: Pablo Estrada
>Priority: Major
>
> test_big_query_write_new_types appears to be flaky in 
> beam_PostCommit_Python37 test suite.
> https://builds.apache.org/job/beam_PostCommit_Python37/733/
> https://builds.apache.org/job/beam_PostCommit_Python37/739/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331704
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:07
Start Date: 21/Oct/19 23:07
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544745017
 
 
   run sql postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331704)
Time Spent: 2h 40m  (was: 2.5h)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8399) Python HDFS implementation should support filenames of the format "hdfs://namenodehost/parent/child"

2019-10-21 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956515#comment-16956515
 ] 

Kenneth Knowles commented on BEAM-8399:
---

At the very least, to be a valid URL, you would need the no-namenode version to 
have three slashes.

> Python HDFS implementation should support filenames of the format 
> "hdfs://namenodehost/parent/child"
> 
>
> Key: BEAM-8399
> URL: https://issues.apache.org/jira/browse/BEAM-8399
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Priority: Major
>
> "hdfs://namenodehost/parent/child" and "/parent/child" seems to be the 
> correct filename formats for HDFS based on [1] but we currently support 
> format "hdfs://parent/child".
> To not break existing users, we have to either (1) somehow support both 
> versions by default (based on [2] seems like HDFS does not allow colons in 
> file path so this might be possible) (2) make  
> "hdfs://namenodehost/parent/child" optional for now and change it to default 
> after few versions.
> We should also make sure that Beam Java and Python HDFS file-system 
> implementations are consistent in this regard.
>  
> [1][https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html]
> [2] https://issues.apache.org/jira/browse/HDFS-13
>  
> cc: [~udim]
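
To make the distinction concrete, here is a minimal, hypothetical sketch (plain
java.net.URI, not Beam SDK code) showing how the different forms parse: with
only two slashes the first path segment is taken as the authority, which is why
"hdfs://parent/child" is ambiguous and why the no-namenode form needs three
slashes.

{code:java}
import java.net.URI;

public class HdfsUriFormats {
  public static void main(String[] args) {
    // Explicit namenode host: authority is "namenodehost", path is "/parent/child".
    URI withHost = URI.create("hdfs://namenodehost/parent/child");
    System.out.println(withHost.getAuthority() + " | " + withHost.getPath());
    // namenodehost | /parent/child

    // Currently supported form: "parent" is parsed as the authority, not a directory.
    URI twoSlashes = URI.create("hdfs://parent/child");
    System.out.println(twoSlashes.getAuthority() + " | " + twoSlashes.getPath());
    // parent | /child

    // No-namenode form with three slashes: no authority, full path preserved.
    URI threeSlashes = URI.create("hdfs:///parent/child");
    System.out.println(threeSlashes.getAuthority() + " | " + threeSlashes.getPath());
    // null | /parent/child
  }
}
{code}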



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7520) DirectRunner timers are not strictly time ordered

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7520?focusedWorklogId=331702=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331702
 ]

ASF GitHub Bot logged work on BEAM-7520:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:02
Start Date: 21/Oct/19 23:02
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9190: [BEAM-7520] Fix 
timer firing order in DirectRunner
URL: https://github.com/apache/beam/pull/9190#issuecomment-544743879
 
 
   Looks like you need one more "ignore" for the Flink portable ValidatesRunner 
and Jira for the bug to that runner.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331702)
Time Spent: 17h  (was: 16h 50m)

> DirectRunner timers are not strictly time ordered
> -
>
> Key: BEAM-7520
> URL: https://issues.apache.org/jira/browse/BEAM-7520
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.13.0
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 17h
>  Remaining Estimate: 0h
>
> Let's suppose we have the following situation:
>  - stateful ParDo with two timers - timerA and timerB
>  - timerA is set for window.maxTimestamp() + 1
>  - timerB is set anywhere between  timerB.timestamp
>  - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
>  - timerB
>  - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the 
> order of timers will be:
>  - timerB (timerB.timestamp)
>  - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
>  - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason is that when the input 
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will 
> produce both timerA and timerB. That would be correct, but when timerB sets 
> another timer, the new timer breaks the ordering.
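
To make the scenario concrete, below is a hypothetical sketch of such a
stateful ParDo using the standard Beam Java timer API. It is illustrative only
(the class and timer names are made up) and is not the actual test or fix from
the PR.

{code:java}
import org.apache.beam.sdk.state.TimeDomain;
import org.apache.beam.sdk.state.Timer;
import org.apache.beam.sdk.state.TimerSpec;
import org.apache.beam.sdk.state.TimerSpecs;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
import org.apache.beam.sdk.values.KV;
import org.joda.time.Duration;

// Hypothetical stateful DoFn reproducing the scenario from the description above.
class TwoTimersFn extends DoFn<KV<String, String>, String> {

  @TimerId("timerA")
  private final TimerSpec timerASpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

  @TimerId("timerB")
  private final TimerSpec timerBSpec = TimerSpecs.timer(TimeDomain.EVENT_TIME);

  @ProcessElement
  public void process(
      ProcessContext ctx,
      BoundedWindow window,
      @TimerId("timerA") Timer timerA,
      @TimerId("timerB") Timer timerB) {
    // timerA fires at window.maxTimestamp() + 1; timerB fires strictly earlier.
    timerA.set(window.maxTimestamp().plus(Duration.millis(1)));
    timerB.set(ctx.timestamp());
  }

  @OnTimer("timerB")
  public void onTimerB(OnTimerContext ctx, @TimerId("timerB") Timer timerB) {
    ctx.output("timerB fired at " + ctx.timestamp());
    // Re-setting timerB for a later (but still earlier-than-timerA) timestamp is
    // what exposes the ordering problem: it can end up firing after timerA.
    timerB.set(ctx.timestamp().plus(Duration.millis(1)));
  }

  @OnTimer("timerA")
  public void onTimerA(OnTimerContext ctx) {
    // Expected to fire last, once the watermark passes the end of the window.
    ctx.output("timerA fired at " + ctx.timestamp());
  }
}
{code}

With strict time ordering, the second firing of timerB should still be
delivered before timerA, since its timestamp is earlier.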



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331698
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:00
Start Date: 21/Oct/19 23:00
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337277051
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  double playback_speed = 1;
+
+  google.protobuf.Timestamp start_time = 2;
+}
+message StartResponse { }
+
+message StopRequest { }
+message StopResponse { }
+
+message PauseRequest { }
+message PauseResponse { }
+
+message StatusRequest { }
+message StatusResponse {
+
+  // The current timestamp of the replay stream. Is MIN_TIMESTAMP when state
+  // is STOPPED.
+  google.protobuf.Timestamp stream_time = 1;
+
+  // The minimum watermark across all of the faked replayable unbounded sources.
+  // Is MIN_TIMESTAMP when state is STOPPED.
+  google.protobuf.Timestamp watermark = 2;
+
 
 Review comment:
   Thanks. I think the information is necessary to explain to the user why, e.g., 
they can't replay past a certain time, or why the current playback speed != the 
playback speed they specified in StartRequest.
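
For context on how a caller might consume these fields, here is a hypothetical
client sketch. It assumes the stub and message classes that standard
protoc/grpc-java generation would produce from the proto under review
(InteractiveServiceGrpc, BeamInteractiveApi.*); those names, the endpoint, and
the port are illustrative, not an existing Beam API. The playback_speed and
state fields appear in the fuller hunk quoted in the next comment.

{code:java}
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import org.apache.beam.model.interactive.v1.BeamInteractiveApi.StartRequest;
import org.apache.beam.model.interactive.v1.BeamInteractiveApi.StatusRequest;
import org.apache.beam.model.interactive.v1.BeamInteractiveApi.StatusResponse;
import org.apache.beam.model.interactive.v1.InteractiveServiceGrpc;

public class InteractiveClientSketch {
  public static void main(String[] args) {
    // Connect to a locally running InteractiveService (address and port are made up).
    ManagedChannel channel =
        ManagedChannelBuilder.forAddress("localhost", 12345).usePlaintext().build();
    InteractiveServiceGrpc.InteractiveServiceBlockingStub stub =
        InteractiveServiceGrpc.newBlockingStub(channel);

    // Start replay at 2x speed.
    stub.start(StartRequest.newBuilder().setPlaybackSpeed(2.0).build());

    // Poll Status so a UI can show stream time, watermark, actual playback speed
    // and state: the fields discussed in this review thread.
    StatusResponse status = stub.status(StatusRequest.newBuilder().build());
    System.out.println(
        "state=" + status.getState()
            + " streamTime=" + status.getStreamTime()
            + " watermark=" + status.getWatermark()
            + " playbackSpeed=" + status.getPlaybackSpeed());

    channel.shutdown();
  }
}
{code}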
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331698)
Time Spent: 4h 50m  (was: 4h 40m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to 

[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331700=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331700
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:00
Start Date: 21/Oct/19 23:00
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337276328
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  double playback_speed = 1;
+
+  google.protobuf.Timestamp start_time = 2;
+}
+message StartResponse { }
+
+message StopRequest { }
+message StopResponse { }
+
+message PauseRequest { }
+message PauseResponse { }
+
+message StatusRequest { }
+message StatusResponse {
+
+  // The current timestamp of the replay stream. Is MIN_TIMESTAMP when state
+  // is STOPPED.
+  google.protobuf.Timestamp stream_time = 1;
+
+  // The minimum watermark across all of the faked replayable unbounded sources.
+  // Is MIN_TIMESTAMP when state is STOPPED.
+  google.protobuf.Timestamp watermark = 2;
+
+  // The latest timestamp of the recording stream. Is MIN_TIMESTAMP if there is
+  // no recording.
+  google.protobuf.Timestamp recording_time = 3;
+
+  double playback_speed = 4;
+
+  enum State {
+// The InteractiveService is not replaying. Goes to RUNNING with a
+// StartRequest.
+STOPPED = 0;
+
+// The InteractiveService is replaying events. Goes to PAUSED with a
+// PauseRequest. Goes to STOPPED with a StopRequest.
+RUNNING = 1;
+
+// The InteractiveService is paused from replaying events. Goes to RUNNING
+// with either a StartRequest or a StepRequest. Goes to STOPPED with a
+// StopRequest.
+PAUSED = 2;
+  }
+  State state = 5;
+}
+
+message StepRequest { }
 
 Review comment:
   Should we add the number of elements the user wishes to step?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331700)
Time Spent: 4h 50m  (was: 4h 40m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: 

[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331699
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 23:00
Start Date: 21/Oct/19 23:00
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337276555
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,130 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
 
 Review comment:
   Do we want to add another method for the user to advance to a certain 
timestamp?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331699)
Time Spent: 4h 50m  (was: 4h 40m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8453) Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8453:
--
Description: 
{code}
Caused by: java.lang.NullPointerException: (No message provided)
at 
org.apache.beam.sdk.io.jms.JmsIOTest.lambda$withSlowAcks$2(JmsIOTest.java:463)
at org.apache.beam.sdk.io.jms.JmsIOTest.lambda$proxyMethod$6(JmsIOTest.java:489)
at com.sun.proxy.$Proxy62.receiveNoWait(Unknown Source)
at org.apache.beam.sdk.io.jms.JmsIO$UnboundedJmsReader.advance(JmsIO.java:512)
at 
org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety(JmsIOTest.java:381)
...
{code}

stdout:
{code}
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
doStartPersistenceAdapter
INFO: Using Persistence Adapter: MemoryPersistenceAdapter
Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStart
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
INFO: Connector vm://localhost started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: For help or more information please see: http://activemq.apache.org
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
INFO: Connector vm://localhost stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStop
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutdown
{code}

  was:
{code}
Caused by: java.lang.NullPointerException: (No message provided)Close stacktrace
at 
org.apache.beam.sdk.io.jms.JmsIOTest.lambda$withSlowAcks$2(JmsIOTest.java:463)
at org.apache.beam.sdk.io.jms.JmsIOTest.lambda$proxyMethod$6(JmsIOTest.java:489)
at com.sun.proxy.$Proxy62.receiveNoWait(Unknown Source)
at org.apache.beam.sdk.io.jms.JmsIO$UnboundedJmsReader.advance(JmsIO.java:512)
at 
org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety(JmsIOTest.java:381)
...
{code}

{code}
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
doStartPersistenceAdapter
INFO: Using Persistence Adapter: MemoryPersistenceAdapter
Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStart
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
INFO: Connector vm://localhost started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: For help or more information please see: http://activemq.apache.org
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
INFO: Connector vm://localhost stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStop
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 

[jira] [Updated] (BEAM-8454) Failure in org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8454:
--
Description: 
https://scans.gradle.com/s/7gstrbxhaki2o/tests/gslhb4gommrdu-5b76qxgixyl6k?openStackTraces=WzBd

{code:java}
:sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
testLaunchFnHarnessAndTeardownCleanly (10.801s)
org.junit.runners.model.TestTimedOutException: test timed out after 1 
milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient$LogRecordHandler.close(BeamFnLoggingClient.java:295)
at java.util.logging.LogManager.resetLogger(LogManager.java:1346)
at java.util.logging.LogManager.reset(LogManager.java:1332)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1406)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1303)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient.close(BeamFnLoggingClient.java:153)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:209)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:140)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:110)
at 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly(FnHarnessTest.java:164)
...
{code}

stdout:
{code}
SDK Fn Harness started
Harness ID id
Logging location url: "localhost:43439"
Control location url: "localhost:41585"
Pipeline options {
  "beam:option:app_name:v1": "FnHarnessTest",
  "beam:option:options_id:v1": 12.0
}
{code}

  was:
https://scans.gradle.com/s/7gstrbxhaki2o/tests/gslhb4gommrdu-5b76qxgixyl6k?openStackTraces=WzBd

{code:java}
:sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
testLaunchFnHarnessAndTeardownCleanly (10.801s)
org.junit.runners.model.TestTimedOutException: test timed out after 1 
millisecondsClose stacktrace
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient$LogRecordHandler.close(BeamFnLoggingClient.java:295)
at java.util.logging.LogManager.resetLogger(LogManager.java:1346)
at java.util.logging.LogManager.reset(LogManager.java:1332)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1406)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1303)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient.close(BeamFnLoggingClient.java:153)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:209)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:140)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:110)
at 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly(FnHarnessTest.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
SDK Fn Harness started
Harness ID id
Logging location url: "localhost:43439"
Control location url: "localhost:41585"
Pipeline options {
  "beam:option:app_name:v1": "FnHarnessTest",
  "beam:option:options_id:v1": 12.0
}
{code}


> Failure in 
> org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly
> -
>
> Key: BEAM-8454
> URL: https://issues.apache.org/jira/browse/BEAM-8454
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: flake
>
> https://scans.gradle.com/s/7gstrbxhaki2o/tests/gslhb4gommrdu-5b76qxgixyl6k?openStackTraces=WzBd
> {code:java}
> :sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
> 

[jira] [Updated] (BEAM-8453) Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8453:
--
Description: 
{code}
Caused by: java.lang.NullPointerException: (No message provided)Close stacktrace
at 
org.apache.beam.sdk.io.jms.JmsIOTest.lambda$withSlowAcks$2(JmsIOTest.java:463)
at org.apache.beam.sdk.io.jms.JmsIOTest.lambda$proxyMethod$6(JmsIOTest.java:489)
at com.sun.proxy.$Proxy62.receiveNoWait(Unknown Source)
at org.apache.beam.sdk.io.jms.JmsIO$UnboundedJmsReader.advance(JmsIO.java:512)
at 
org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety(JmsIOTest.java:381)
...
{code}

{code}
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
doStartPersistenceAdapter
INFO: Using Persistence Adapter: MemoryPersistenceAdapter
Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStart
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
INFO: Connector vm://localhost started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: For help or more information please see: http://activemq.apache.org
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
INFO: Connector vm://localhost stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStop
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutdown
{code}

  was:
{code}
:sdks:java:io:jms:testorg.apache.beam.sdk.io.jms.JmsIOTest » 
testCheckpointMarkSafety (0.674s)
java.io.IOException: java.lang.NullPointerExceptionOpen stacktrace
Caused by: java.lang.NullPointerException: (No message provided)Open stacktrace
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
doStartPersistenceAdapter
INFO: Using Persistence Adapter: MemoryPersistenceAdapter
Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStart
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
INFO: Connector vm://localhost started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: For help or more information please see: http://activemq.apache.org
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
INFO: Connector vm://localhost stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStop
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutdown
{code}


> Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety
> 
>
> Key: BEAM-8453
> 

[jira] [Updated] (BEAM-8454) Failure in org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8454:
--
Description: 
https://scans.gradle.com/s/7gstrbxhaki2o/tests/gslhb4gommrdu-5b76qxgixyl6k?openStackTraces=WzBd

{code:java}
:sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
testLaunchFnHarnessAndTeardownCleanly (10.801s)
org.junit.runners.model.TestTimedOutException: test timed out after 1 
millisecondsClose stacktrace
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient$LogRecordHandler.close(BeamFnLoggingClient.java:295)
at java.util.logging.LogManager.resetLogger(LogManager.java:1346)
at java.util.logging.LogManager.reset(LogManager.java:1332)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1406)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1303)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient.close(BeamFnLoggingClient.java:153)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:209)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:140)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:110)
at 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly(FnHarnessTest.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
SDK Fn Harness started
Harness ID id
Logging location url: "localhost:43439"
Control location url: "localhost:41585"
Pipeline options {
  "beam:option:app_name:v1": "FnHarnessTest",
  "beam:option:options_id:v1": 12.0
}
{code}

  was:
{code:java}
:sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
testLaunchFnHarnessAndTeardownCleanly (10.801s)
org.junit.runners.model.TestTimedOutException: test timed out after 1 
millisecondsClose stacktrace
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient$LogRecordHandler.close(BeamFnLoggingClient.java:295)
at java.util.logging.LogManager.resetLogger(LogManager.java:1346)
at java.util.logging.LogManager.reset(LogManager.java:1332)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1406)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1303)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient.close(BeamFnLoggingClient.java:153)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:209)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:140)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:110)
at 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly(FnHarnessTest.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
SDK Fn Harness started
Harness ID id
Logging location url: "localhost:43439"

[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331696=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331696
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 22:58
Start Date: 21/Oct/19 22:58
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544742874
 
 
   Filed https://issues.apache.org/jira/browse/BEAM-8453 for `JmsIOTest` 
failure.
   
   Filed https://issues.apache.org/jira/browse/BEAM-8454 for Java SDK harness 
failure.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331696)
Time Spent: 2h 20m  (was: 2h 10m)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331697=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331697
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 22:58
Start Date: 21/Oct/19 22:58
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544742899
 
 
   run java precommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331697)
Time Spent: 2.5h  (was: 2h 20m)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8454) Failure in org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly

2019-10-21 Thread Kenneth Knowles (Jira)
Kenneth Knowles created BEAM-8454:
-

 Summary: Failure in 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly
 Key: BEAM-8454
 URL: https://issues.apache.org/jira/browse/BEAM-8454
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-harness
Reporter: Kenneth Knowles


{code:java}
:sdks:java:harness:testorg.apache.beam.fn.harness.FnHarnessTest » 
testLaunchFnHarnessAndTeardownCleanly (10.801s)
org.junit.runners.model.TestTimedOutException: test timed out after 1 
millisecondsClose stacktrace
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient$LogRecordHandler.close(BeamFnLoggingClient.java:295)
at java.util.logging.LogManager.resetLogger(LogManager.java:1346)
at java.util.logging.LogManager.reset(LogManager.java:1332)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1406)
at java.util.logging.LogManager.readConfiguration(LogManager.java:1303)
at 
org.apache.beam.fn.harness.logging.BeamFnLoggingClient.close(BeamFnLoggingClient.java:153)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:209)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:140)
at org.apache.beam.fn.harness.FnHarness.main(FnHarness.java:110)
at 
org.apache.beam.fn.harness.FnHarnessTest.testLaunchFnHarnessAndTeardownCleanly(FnHarnessTest.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
SDK Fn Harness started
Harness ID id
Logging location url: "localhost:43439"
Control location url: "localhost:41585"
Pipeline options {
  "beam:option:app_name:v1": "FnHarnessTest",
  "beam:option:options_id:v1": 12.0
}
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8453) Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety

2019-10-21 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-8453:
--
Priority: Critical  (was: Major)

> Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety
> 
>
> Key: BEAM-8453
> URL: https://issues.apache.org/jira/browse/BEAM-8453
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jms
>Reporter: Kenneth Knowles
>Priority: Critical
>  Labels: flake
>
> {code}
> :sdks:java:io:jms:testorg.apache.beam.sdk.io.jms.JmsIOTest » 
> testCheckpointMarkSafety (0.674s)
> java.io.IOException: java.lang.NullPointerExceptionOpen stacktrace
> Caused by: java.lang.NullPointerException: (No message provided)Open 
> stacktrace
> Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
> doStartPersistenceAdapter
> INFO: Using Persistence Adapter: MemoryPersistenceAdapter
> Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
> doStart
> INFO: 
> PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
>  started
> Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
> INFO: Apache ActiveMQ 5.13.1 (localhost, 
> ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
> Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
> INFO: Connector vm://localhost started
> Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
> INFO: Apache ActiveMQ 5.13.1 (localhost, 
> ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
> Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
> INFO: For help or more information please see: http://activemq.apache.org
> Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
> INFO: Apache ActiveMQ 5.13.1 (localhost, 
> ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
> Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
> INFO: Connector vm://localhost stopped
> Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
> doStop
> INFO: 
> PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
>  stopped
> Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
> INFO: Apache ActiveMQ 5.13.1 (localhost, 
> ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
> Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
> INFO: Apache ActiveMQ 5.13.1 (localhost, 
> ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutdown
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8453) Failure in org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety

2019-10-21 Thread Kenneth Knowles (Jira)
Kenneth Knowles created BEAM-8453:
-

 Summary: Failure in 
org.apache.beam.sdk.io.jms.JmsIOTest.testCheckpointMarkSafety
 Key: BEAM-8453
 URL: https://issues.apache.org/jira/browse/BEAM-8453
 Project: Beam
  Issue Type: Bug
  Components: io-java-jms
Reporter: Kenneth Knowles


{code}
:sdks:java:io:jms:testorg.apache.beam.sdk.io.jms.JmsIOTest » 
testCheckpointMarkSafety (0.674s)
java.io.IOException: java.lang.NullPointerExceptionOpen stacktrace
Caused by: java.lang.NullPointerException: (No message provided)Open stacktrace
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService 
doStartPersistenceAdapter
INFO: Using Persistence Adapter: MemoryPersistenceAdapter
Oct 21, 2019 9:52:32 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStart
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is starting
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.TransportConnector start
INFO: Connector vm://localhost started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) started
Oct 21, 2019 9:52:32 PM org.apache.activemq.broker.BrokerService doStartBroker
INFO: For help or more information please see: http://activemq.apache.org
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutting down
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.TransportConnector stop
INFO: Connector vm://localhost stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.store.kahadb.plist.PListStoreImpl 
doStop
INFO: 
PListStore:[/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Java_Commit@2/src/sdks/java/io/jms/activemq-data/localhost/tmp_storage]
 stopped
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) uptime 0.512 seconds
Oct 21, 2019 9:52:33 PM org.apache.activemq.broker.BrokerService stop
INFO: Apache ActiveMQ 5.13.1 (localhost, 
ID:apache-beam-jenkins-8-45641-1571694713139-0:6) is shutdown
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956510#comment-16956510
 ] 

Valentyn Tymofieiev commented on BEAM-8397:
---

Hmm, actually the problem may be related to something other than super. The 
workaround used in https://github.com/apache/beam/pull/9513 does not seem to 
work here.

> DataflowRunnerTest.test_remote_runner_display_data fails due to infinite 
> recursion during pickling.
> ---
>
> Key: BEAM-8397
> URL: https://issues.apache.org/jira/browse/BEAM-8397
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> `python ./setup.py test -s 
> apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
>  passes.
> `tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
> depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
> 'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
>  fails currently if run on master.
> The failure indicates infinite recursion during pickling:
> {noformat}
> test_remote_runner_display_data 
> (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
> Fatal Python error: Cannot recover from stack overflow.
> Current thread 0x7f9d700ed740 (most recent call first):
>   File "/usr/lib/python3.7/pickle.py", line 479 in get
>   File "/usr/lib/python3.7/pickle.py", line 497 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 114 in wrapper
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1137 in save_cell
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
> ...
> {noformat}
> cc: [~yoshiki.obata]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956493#comment-16956493
 ] 

Valentyn Tymofieiev edited comment on BEAM-8397 at 10/21/19 10:52 PM:
--

I think the cause of the error here is a super() reference in 
test_remote_runner_display_data[1]. This issue demonstrates another case where  
dill is not able to pickle classes with superclass constructor calls on Python 
3. 
Similar errors were reported in [2,3,4]. The difference here is that:
- this error appears only on Python 3.7
- this error appears even though we don't use save_main_session, i.e. 

We should file a follow-up issue for [3]. The workaround here is to not use 
super(). Will send a PR shortly with the workaround.

[1] 
https://github.com/apache/beam/blob/e4aab40378f779ee7d6b6394301dd19db6bb0b82/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py#L242
[2] https://github.com/uqfoundation/dill/issues/300
[3] https://github.com/uqfoundation/dill/issues/75
[4] https://issues.apache.org/jira/browse/BEAM-6158



was (Author: tvalentyn):
The cause of the error here is a super() reference in 
test_remote_runner_display_data[1]. This issue demonstrates another case where  
dill is not able to pickle classes with superclass constructor calls on Python 
3. 
Similar errors were reported in [2,3,4]. The difference here is that:
- this error appears only on Python 3.7
- this error appears even though we don't use save_main_session, i.e. 

We should file a follow-up issue for [3]. The workaround here is to not use 
super(). Will send a PR shortly with the workaround.

[1] 
https://github.com/apache/beam/blob/e4aab40378f779ee7d6b6394301dd19db6bb0b82/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py#L242
[2] https://github.com/uqfoundation/dill/issues/300
[3] https://github.com/uqfoundation/dill/issues/75
[4] https://issues.apache.org/jira/browse/BEAM-6158


> DataflowRunnerTest.test_remote_runner_display_data fails due to infinite 
> recursion during pickling.
> ---
>
> Key: BEAM-8397
> URL: https://issues.apache.org/jira/browse/BEAM-8397
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> `python ./setup.py test -s 
> apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
>  passes.
> `tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
> depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
> 'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
>  fails currently if run on master.
> The failure indicates infinite recursion during pickling:
> {noformat}
> test_remote_runner_display_data 
> (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
> Fatal Python error: Cannot recover from stack overflow.
> Current thread 0x7f9d700ed740 (most recent call first):
>   File "/usr/lib/python3.7/pickle.py", line 479 in get
>   File "/usr/lib/python3.7/pickle.py", line 497 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 114 in wrapper
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1137 in save_cell
>   File "/usr/lib/python3.7/pickle.py", line 504 in 

[jira] [Work logged] (BEAM-8341) basic bundling support for samza portable runner

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8341?focusedWorklogId=331694=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331694
 ]

ASF GitHub Bot logged work on BEAM-8341:


Author: ASF GitHub Bot
Created on: 21/Oct/19 22:48
Start Date: 21/Oct/19 22:48
Worklog Time Spent: 10m 
  Work Description: lhaiesp commented on issue #9777: [BEAM-8341]: basic 
bundling support for portable runner
URL: https://github.com/apache/beam/pull/9777#issuecomment-544740427
 
 
   Looking at the end-to-end test, but it may be hard to write one just for
bundling. Instead, I've been looking into validatesPortableRunner and the
testPipelineJar example from the Flink runner. The latter seems more achievable
at this point, while validatesPortableRunner seems more like the real long-term
solution.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331694)
Time Spent: 1h 40m  (was: 1.5h)

> basic bundling support for samza portable runner
> 
>
> Key: BEAM-8341
> URL: https://issues.apache.org/jira/browse/BEAM-8341
> Project: Beam
>  Issue Type: Task
>  Components: runner-samza
>Reporter: Hai Lu
>Assignee: Hai Lu
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> bundling support for samza portable runner



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331691=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331691
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 22:37
Start Date: 21/Oct/19 22:37
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #9720: 
[BEAM-8335] Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337270608
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.fn_execution.v1;
+
+option go_package = "fnexecution_v1";
+option java_package = "org.apache.beam.model.fnexecution.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  rpc Start (StartRequest) returns (StartResponse) {}
 
 Review comment:
   Thanks for your clarification. We are actually on the same page. To clarify 
my position, the service will be sitting in the InteractiveRunner and the 
TestStream will ask for events from it using the EventsRequest. A user can then 
start replaying the stream using the StartRequest. At a later time, the user 
can also pause or prematurely stop the replay.
   
   I added an endpoint to the TestStreamPayload.
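   
   For illustration, a minimal client sketch of that flow against the service
above, assuming the proto is compiled to modules named
beam_interactive_api_pb2 / beam_interactive_api_pb2_grpc and that the service
listens on a local port (both are assumptions, not part of this PR):
   
{code}
# Hypothetical client: ask the service to start replay, then consume events.
# Module, stub, and endpoint names are assumptions for illustration only.
import grpc

from beam_interactive_api_pb2 import EventsRequest, StartRequest
from beam_interactive_api_pb2_grpc import InteractiveServiceStub

channel = grpc.insecure_channel('localhost:12345')  # illustrative endpoint
stub = InteractiveServiceStub(channel)

stub.Start(StartRequest(playback_speed=1.0))   # begin replaying cached records
for response in stub.Events(EventsRequest()):  # server streams EventsResponse
    print(response)
{code}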
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331691)
Time Spent: 4h 40m  (was: 4.5h)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331685=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331685
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 22:31
Start Date: 21/Oct/19 22:31
Worklog Time Spent: 10m 
  Work Description: rohdesamuel commented on pull request #9720: 
[BEAM-8335] Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337268901
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  double playback_speed = 1;
+
+  google.protobuf.Timestamp start_time = 2;
+}
+message StartResponse { }
+
+message StopRequest { }
+message StopResponse { }
+
+message PauseRequest { }
+message PauseResponse { }
+
+message StatusRequest { }
+message StatusResponse {
+
+  // The current timestamp of the replay stream. Is MIN_TIMESTAMP when state
+  // is STOPPED.
+  google.protobuf.Timestamp stream_time = 1;
+
+  // The minimum watermark across all of the faked replayable unbounded 
sources.
+  // Is MIN_TIMESTAMP when state is STOPPED.
+  google.protobuf.Timestamp watermark = 2;
+
 
 Review comment:
   My intention is that this will be used to interact with the TestStream 
executing in the runner, but I see no harm in adding extra information here.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331685)
Time Spent: 4.5h  (was: 4h 20m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> 

[jira] [Assigned] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev reassigned BEAM-8397:
-

Assignee: Valentyn Tymofieiev

> DataflowRunnerTest.test_remote_runner_display_data fails due to infinite 
> recursion during pickling.
> ---
>
> Key: BEAM-8397
> URL: https://issues.apache.org/jira/browse/BEAM-8397
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> `python ./setup.py test -s 
> apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
>  passes.
> `tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
> depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
> 'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
>  fails currently if run on master.
> The failure indicates infinite recursion during pickling:
> {noformat}
> test_remote_runner_display_data 
> (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
> Fatal Python error: Cannot recover from stack overflow.
> Current thread 0x7f9d700ed740 (most recent call first):
>   File "/usr/lib/python3.7/pickle.py", line 479 in get
>   File "/usr/lib/python3.7/pickle.py", line 497 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 114 in wrapper
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1137 in save_cell
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
> ...
> {noformat}
> cc: [~yoshiki.obata]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956493#comment-16956493
 ] 

Valentyn Tymofieiev commented on BEAM-8397:
---

The cause of the error here is a super() reference in
test_remote_runner_display_data[1]. This is another case where dill is not able
to pickle classes that call the superclass constructor on Python 3.
Similar errors were reported in [2,3,4]. The differences here are that:
- this error appears only on Python 3.7
- this error appears even though we don't use save_main_session.

I think we can file a follow-up issue for [3]. The workaround here is to not
use super(). Will send a PR shortly with the workaround.

[1] 
https://github.com/apache/beam/blob/e4aab40378f779ee7d6b6394301dd19db6bb0b82/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py#L242
[2] https://github.com/uqfoundation/dill/issues/300
[3] https://github.com/uqfoundation/dill/issues/75
[4] https://issues.apache.org/jira/browse/BEAM-6158
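
For reference, a minimal sketch of the shape of that workaround (class and
argument names here are illustrative, not the actual test code): a
zero-argument super() call captures an implicit __class__ closure cell, which
dill then tries to pickle recursively, while calling the base class explicitly
avoids the cell.

{code}
# Hypothetical sketch of the workaround; SpecialParDo and its arguments are
# illustrative names, not the real test helpers.
import apache_beam as beam


class SpecialParDo(beam.ParDo):
  def __init__(self, fn, now):
    # super().__init__(fn)         # zero-arg super() adds a __class__ cell
    #                              # that dill recurses on under Python 3.7
    beam.ParDo.__init__(self, fn)  # explicit base call sidesteps the cell
    self.fn = fn
    self.now = now
{code}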


> DataflowRunnerTest.test_remote_runner_display_data fails due to infinite 
> recursion during pickling.
> ---
>
> Key: BEAM-8397
> URL: https://issues.apache.org/jira/browse/BEAM-8397
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> `python ./setup.py test -s 
> apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
>  passes.
> `tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
> depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
> 'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
>  fails currently if run on master.
> The failure indicates infinite recursion during pickling:
> {noformat}
> test_remote_runner_display_data 
> (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
> Fatal Python error: Cannot recover from stack overflow.
> Current thread 0x7f9d700ed740 (most recent call first):
>   File "/usr/lib/python3.7/pickle.py", line 479 in get
>   File "/usr/lib/python3.7/pickle.py", line 497 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 114 in wrapper
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1137 in save_cell
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
> ...
> {noformat}
> cc: [~yoshiki.obata]



--

[jira] [Work logged] (BEAM-8428) [SQL] BigQuery should support project push-down in DIRECT_READ mode

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8428?focusedWorklogId=331677=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331677
 ]

ASF GitHub Bot logged work on BEAM-8428:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:56
Start Date: 21/Oct/19 21:56
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #9823: [BEAM-8428] 
[SQL] Add project push-down for BigQuery
URL: https://github.com/apache/beam/pull/9823
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331677)
Time Spent: 1h 10m  (was: 1h)

> [SQL] BigQuery should support project push-down in DIRECT_READ mode
> ---
>
> Key: BEAM-8428
> URL: https://issues.apache.org/jira/browse/BEAM-8428
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> BigQuery should perform project push-down for read pipelines when applicable.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8436) Interactive runner incompatible with experiments=beam_fn_api

2019-10-21 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw updated BEAM-8436:
--
Component/s: (was: sdk-py-core)
 runner-py-interactive

> Interactive runner incompatible with experiments=beam_fn_api
> 
>
> Key: BEAM-8436
> URL: https://issues.apache.org/jira/browse/BEAM-8436
> Project: Beam
>  Issue Type: Bug
>  Components: runner-py-interactive
>Reporter: Robert Bradshaw
>Priority: Major
>
> When this is enabled one gets
> {code}
> ERROR: test_wordcount 
> (apache_beam.runners.interactive.interactive_runner_test.InteractiveRunnerTest)
> --
> Traceback (most recent call last):
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/interactive_runner_test.py",
>  line 85, in test_wordcount
> result = p.run()
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> self._options).run(False)
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> return self.runner.run_pipeline(self, self._options)
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/interactive_runner.py",
>  line 136, in run_pipeline
> self._desired_cache_labels)
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 73, in __init__
> self._analyze_pipeline()
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 93, in _analyze_pipeline
> desired_pcollections = self._desired_pcollections(self._pipeline_info)
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 313, in _desired_pcollections
> cache_label = pipeline_info.cache_label(pcoll_id)
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 397, in cache_label
> return self._derivation(pcoll_id).cache_label()
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 405, in _derivation
> for input_tag, input_id in transform_proto.inputs.items()
> ...
>   File 
> "/Users/robertwb/Work/beam/incubator-beam/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py",
>  line 405, in _derivation
> for input_tag, input_id in transform_proto.inputs.items()
>   File 
> "/Users/robertwb/Work/beam/venv3/bin/../lib/python3.6/_collections_abc.py", 
> line 678, in items
> return ItemsView(self)
> RecursionError: maximum recursion depth exceeded while calling a Python object
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8452) TriggerLoadJobs.process in bigquery_file_loads schema is type str

2019-10-21 Thread Noah Goodrich (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956470#comment-16956470
 ] 

Noah Goodrich commented on BEAM-8452:
-

I would like to propose the following as a fix in 
bigquery_file_loads.TriggerLoadJobs.process:

 

 
{code:java}
# Each load job is assumed to have files respecting these constraints:
# 1. Total size of all files < 15 TB (Max size for load jobs)
# 2. Total no. of files in a single load job < 10,000
# This assumption means that there will always be a single load job
# triggered for each partition of files.
destination = element[0]
files = element[1]

if callable(self.schema):
  schema = self.schema(destination, *schema_side_inputs)
elif isinstance(self.schema, vp.ValueProvider):
  schema = self.schema.get()
else:
  schema = self.schema

if isinstance(schema, (str, unicode)):
  schema = bigquery_tools.parse_table_schema_from_json(schema)
elif isinstance(schema, dict):
  schema = bigquery_tools.parse_table_schema_from_json(json.dumps(schema))
{code}
 

 

 

> TriggerLoadJobs.process in bigquery_file_loads schema is type str
> -
>
> Key: BEAM-8452
> URL: https://issues.apache.org/jira/browse/BEAM-8452
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0, 2.16.0
>Reporter: Noah Goodrich
>Assignee: Noah Goodrich
>Priority: Major
>
>  I've found a first issue with the BigQueryFileLoads Transform and the type 
> of the schema parameter.
> {code:java}
> Triggering job 
> beam_load_2019_10_11_140829_19_157670e4d458f0ff578fbe971a91b30a_1570802915 to 
> load data to BigQuery table <TableReference
>  datasetId: 'pyr_monat_dev'
>  projectId: 'icentris-ml-dev'
>  tableId: 'tree_user_types'>.Schema: {"fields": [{"name": "id", "type": 
> "INTEGER", "mode": "required"}, {"name": "description", "type": "STRING", 
> "mode": "nullable"}]}. Additional parameters: {}
> Retry with exponential backoff: waiting for 4.875033410381894 seconds before 
> retrying _insert_load_job because we caught exception: 
> apitools.base.protorpclite.messages.ValidationError: Expected type <class
> 'apache_beam.io.gcp.internal.clients.bigquery.bigquery_v2_messages.TableSchema'>
>  for field schema, found {"fields": [{"name": "id", "type": "INTEGER", 
> "mode": "required"}, {"name": "description", "type":
> "STRING", "mode": "nullable"}]} (type <class 'str'>)
>  Traceback for above exception (most recent call last):
>   File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/retry.py", 
> line 206, in wrapper
>     return fun(*args, **kwargs)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py",
>  line 344, in _insert_load_job
>     **additional_load_parameters
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 791, in __init__
>     setattr(self, name, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 973, in __setattr__
>     object.__setattr__(self, name, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1652, in __set__
>     super(MessageField, self).__set__(message_instance, value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1293, in __set__
>     value = self.validate(value)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1400, in validate
>     return self.__validate(value, self.validate_element)
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1358, in __validate
>     return validate_element(value)   
>   File 
> "/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1340, in validate_element
>     (self.type, name, value, type(value)))
>  
> {code}
>  
> The triggering code looks like this:
>  
> options.view_as(DebugOptions).experiments = ['use_beam_bq_sink']
>         # Save main session state so pickled functions and classes
>         # defined in __main__ can be unpickled
>         options.view_as(SetupOptions).save_main_session = True
>         custom_options = options.view_as(LoadSqlToBqOptions)
>         with beam.Pipeline(options=options) as p:
>             (p
>                 | "Initializing with empty collection" >> beam.Create([1])
>                 | "Reading records from CloudSql" >> 
> beam.ParDo(ReadFromRelationalDBFn(
>                     username=custom_options.user,
>                     password=custom_options.password,
>                     database=custom_options.database,
>                     table=custom_options.table,
>                     key_field=custom_options.key_field,
>                     

[jira] [Updated] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Igor Durovic (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Durovic updated BEAM-8451:
---
Description: 
 

RecursionError: maximum recursion depth exceeded in __instancecheck__

at 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py#L405]

 

This occurred after the execution of the last cell in 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb]

  was:
~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
 in _derivation(self, pcoll_id)

403 self._derivations[pcoll_id] = self.Derivation({  404 input_tag: 
self._derivation(input_id) --> 405 for input_tag, input_id in 
transform_proto.inputs.items()  406 }, transform_proto, output_tag)  407 return 
self._derivations[pcoll_id]

RecursionError: maximum recursion depth exceeded in __instancecheck__


> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
>  
> RecursionError: maximum recursion depth exceeded in __instancecheck__
> at 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/pipeline_analyzer.py#L405]
>  
> This occurred after the execution of the last cell in 
> [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/interactive/examples/Interactive%20Beam%20Example.ipynb]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8372) Allow submission of Flink UberJar directly to flink cluster.

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8372?focusedWorklogId=331664=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331664
 ]

ASF GitHub Bot logged work on BEAM-8372:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:44
Start Date: 21/Oct/19 21:44
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #9844: [BEAM-8372] Support 
both flink_master and flink_master_url parameter
URL: https://github.com/apache/beam/pull/9844#issuecomment-544721281
 
 
   There's also the question of whether this should include the `http[s]://` 
portion. (Especially the 's' part is potentially ambiguous.)
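   
   For illustration only, one shape such handling could take (a hypothetical
helper, not the behavior of any existing Beam option):
   
{code}
# Hypothetical sketch: accept a flink master address with or without a scheme,
# remember whether TLS was requested, and hand back plain host:port.
def parse_flink_master(value):
    use_https = value.startswith('https://')
    for prefix in ('https://', 'http://'):
        if value.startswith(prefix):
            value = value[len(prefix):]
    return value, use_https

print(parse_flink_master('https://localhost:8081'))  # ('localhost:8081', True)
print(parse_flink_master('localhost:8081'))          # ('localhost:8081', False)
{code}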
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331664)
Time Spent: 6h  (was: 5h 50m)

> Allow submission of Flink UberJar directly to flink cluster.
> 
>
> Key: BEAM-8372
> URL: https://issues.apache.org/jira/browse/BEAM-8372
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8452) TriggerLoadJobs.process in bigquery_file_loads schema is type str

2019-10-21 Thread Noah Goodrich (Jira)
Noah Goodrich created BEAM-8452:
---

 Summary: TriggerLoadJobs.process in bigquery_file_loads schema is 
type str
 Key: BEAM-8452
 URL: https://issues.apache.org/jira/browse/BEAM-8452
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Affects Versions: 2.15.0, 2.16.0
Reporter: Noah Goodrich
Assignee: Noah Goodrich


 I've found a first issue with the BigQueryFileLoads Transform and the type of 
the schema parameter.
{code:java}
Triggering job 
beam_load_2019_10_11_140829_19_157670e4d458f0ff578fbe971a91b30a_1570802915 to 
load data to BigQuery table .Schema: {"fields": [{"name": "id", "type": 
"INTEGER", "mode": "required"}, {"name": "description", "type": "STRING", 
"mode": "nullable"}]}. Additional parameters: {}
Retry with exponential backoff: waiting for 4.875033410381894 seconds before 
retrying _insert_load_job because we caught exception: 
apitools.base.protorpclite.messages.ValidationError: Expected type <class
'apache_beam.io.gcp.internal.clients.bigquery.bigquery_v2_messages.TableSchema'>
 for field schema, found {"fields": [{"name": "id", "type": "INTEGER", "mode": 
"required"}, {"name": "description", "type":
"STRING", "mode": "nullable"}]} (type <class 'str'>)
 Traceback for above exception (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/apache_beam/utils/retry.py", 
line 206, in wrapper
    return fun(*args, **kwargs)
  File 
"/opt/conda/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery_tools.py", 
line 344, in _insert_load_job
    **additional_load_parameters
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 791, in __init__
    setattr(self, name, value)
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 973, in __setattr__
    object.__setattr__(self, name, value)
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 1652, in __set__
    super(MessageField, self).__set__(message_instance, value)
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 1293, in __set__
    value = self.validate(value)
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 1400, in validate
    return self.__validate(value, self.validate_element)
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 1358, in __validate
    return validate_element(value)   
  File 
"/opt/conda/lib/python3.7/site-packages/apitools/base/protorpclite/messages.py",
 line 1340, in validate_element
    (self.type, name, value, type(value)))
 
{code}
 

The triggering code looks like this:

 
{code:java}
options.view_as(DebugOptions).experiments = ['use_beam_bq_sink']
# Save main session state so pickled functions and classes
# defined in __main__ can be unpickled
options.view_as(SetupOptions).save_main_session = True
custom_options = options.view_as(LoadSqlToBqOptions)
with beam.Pipeline(options=options) as p:
    (p
        | "Initializing with empty collection" >> beam.Create([1])
        | "Reading records from CloudSql" >> beam.ParDo(ReadFromRelationalDBFn(
            username=custom_options.user,
            password=custom_options.password,
            database=custom_options.database,
            table=custom_options.table,
            key_field=custom_options.key_field,
            batch_size=custom_options.batch_size))
        | "Converting Row Object for BigQuery" >> beam.ParDo(BuildForBigQueryFn(custom_options.bq_schema))
        | "Writing to BigQuery" >> beam.io.WriteToBigQuery(
            table=custom_options.bq_table,
            schema=custom_options.bq_schema,
            write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))
{code}
 
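
As a stopgap on the pipeline side (a sketch only, assuming the schema arrives
as the JSON string shown in the error above), the string can be parsed into a
TableSchema before it is handed to WriteToBigQuery:

{code}
# Sketch of a user-level workaround: convert the JSON schema string to a
# TableSchema up front so the sink never receives a plain str.
from apache_beam.io.gcp import bigquery_tools

schema_json = ('{"fields": [{"name": "id", "type": "INTEGER", "mode": "required"}, '
               '{"name": "description", "type": "STRING", "mode": "nullable"}]}')
table_schema = bigquery_tools.parse_table_schema_from_json(schema_json)

# ... then pass schema=table_schema to beam.io.WriteToBigQuery(...) above.
{code}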

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Igor Durovic (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Durovic updated BEAM-8451:
---
Description: 
~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
 in _derivation(self, pcoll_id)

403 self._derivations[pcoll_id] = self.Derivation({  404 input_tag: 
self._derivation(input_id) --> 405 for input_tag, input_id in 
transform_proto.inputs.items()  406 }, transform_proto, output_tag)  407 return 
self._derivations[pcoll_id]

RecursionError: maximum recursion depth exceeded in __instancecheck__

  was:
~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
 in _derivation(self, pcoll_id)

  403   self._derivations[pcoll_id] = self.Derivation({
  404 input_tag: self._derivation(input_id)
--> 405 for input_tag, input_id in transform_proto.inputs.items()
  406   }, transform_proto, output_tag)
  407 return self._derivations[pcoll_id]

RecursionError: maximum recursion depth exceeded in __instancecheck__


> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
> ~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
>  in _derivation(self, pcoll_id)
> 403 self._derivations[pcoll_id] = self.Derivation({  404 input_tag: 
> self._derivation(input_id) --> 405 for input_tag, input_id in 
> transform_proto.inputs.items()  406 }, transform_proto, output_tag)  407 
> return self._derivations[pcoll_id]
> RecursionError: maximum recursion depth exceeded in __instancecheck__



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Igor Durovic (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Durovic updated BEAM-8451:
---
Description: 
~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
 in _derivation(self, pcoll_id)

  403   self._derivations[pcoll_id] = self.Derivation({
  404 input_tag: self._derivation(input_id)
--> 405 for input_tag, input_id in transform_proto.inputs.items()
  406   }, transform_proto, output_tag)
  407 return self._derivations[pcoll_id]

RecursionError: maximum recursion depth exceeded in __instancecheck__

> Interactive Beam example failing from stack overflow
> 
>
> Key: BEAM-8451
> URL: https://issues.apache.org/jira/browse/BEAM-8451
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: Igor Durovic
>Assignee: Igor Durovic
>Priority: Major
>
> ~/tmp/beam_venv_dir/lib/python3.6/site-packages/apache_beam-2.17.0.dev0-py3.6.egg/apache_beam/runners/interactive/pipeline_analyzer.py
>  in _derivation(self, pcoll_id)
>   403   self._derivations[pcoll_id] = self.Derivation({
>   404 input_tag: self._derivation(input_id)
> --> 405 for input_tag, input_id in transform_proto.inputs.items()
>   406   }, transform_proto, output_tag)
>   407 return self._derivations[pcoll_id]
> RecursionError: maximum recursion depth exceeded in __instancecheck__



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-8451) Interactive Beam example failing from stack overflow

2019-10-21 Thread Igor Durovic (Jira)
Igor Durovic created BEAM-8451:
--

 Summary: Interactive Beam example failing from stack overflow
 Key: BEAM-8451
 URL: https://issues.apache.org/jira/browse/BEAM-8451
 Project: Beam
  Issue Type: Bug
  Components: examples-python
Reporter: Igor Durovic
Assignee: Igor Durovic






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331662=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331662
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:30
Start Date: 21/Oct/19 21:30
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9296: [BEAM-7730] Introduce 
Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-544716246
 
 
   Run Java Flink PortableValidatesRunner Streaming
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331662)
Time Spent: 7h 50m  (was: 7h 40m)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will be coming, and it would be good to add a Flink 1.9 build target 
> and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 release. 
> I would appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331663=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331663
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:30
Start Date: 21/Oct/19 21:30
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9296: [BEAM-7730] Introduce 
Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-544716271
 
 
   Run Java Flink PortableValidatesRunner Batch
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331663)
Time Spent: 8h  (was: 7h 50m)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will be coming, and it would be good to add a Flink 1.9 build target 
> and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 release. 
> I would appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331659=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331659
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:29
Start Date: 21/Oct/19 21:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9296: [BEAM-7730] Introduce 
Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-544716127
 
 
   Run ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331659)
Time Spent: 7h 40m  (was: 7.5h)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will be coming, and it would be good to add a Flink 1.9 build target 
> and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 release. 
> I would appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331657
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:29
Start Date: 21/Oct/19 21:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9296: [BEAM-7730] Introduce 
Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-544715941
 
 
   Thanks @dmvk.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331657)
Time Spent: 7h 20m  (was: 7h 10m)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will be coming, and it would be good to add a Flink 1.9 build target 
> and make the Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 release. 
> I would appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331658=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331658
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:29
Start Date: 21/Oct/19 21:29
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9296: [BEAM-7730] Introduce 
Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-544715978
 
 
   Retest this please
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331658)
Time Spent: 7.5h  (was: 7h 20m)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will coming and it's better to add Flink 1.9 build target 
> and make Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 released. 
> And I appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=331655=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331655
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:28
Start Date: 21/Oct/19 21:28
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9296: [BEAM-7730] 
Introduce Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#discussion_r337248403
 
 

 ##
 File path: 
runners/flink/src/test/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperatorTest.java
 ##
 @@ -1038,7 +1039,7 @@ void pushbackDataCheckpointing(
 
 assertThat(
 stripStreamRecordFromWindowedValue(testHarness.getOutput()),
-contains(helloElement, worldElement));
+containsInAnyOrder(helloElement, worldElement));
 
 Review comment:
   We can't guarantee the order here for keys in the state backend after 
restoring from a savepoint. This just so happens to work for older Flink 
versions. So seems fair to change this.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331655)
Time Spent: 7h 10m  (was: 7h)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: David Moravek
>Priority: Major
> Fix For: 2.17.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Apache Flink 1.9 will coming and it's better to add Flink 1.9 build target 
> and make Flink Runner compatible with Flink 1.9.
> I will add the brief changes after the Flink 1.9.0 released. 
> And I appreciate it if you can leave your suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956457#comment-16956457
 ] 

Valentyn Tymofieiev commented on BEAM-8397:
---

So, the test_remote_runner_display_data test appears to be sensitive to the value
of sys.getrecursionlimit(). The picture is roughly as follows (value boundaries
are not precise):

sys.setrecursionlimit < 80 - the test fails with
RecursionError('maximum recursion depth exceeded'). This is expected, since our
stack is deeper than that.
sys.setrecursionlimit > 15000 - the test fails, likely due to an overflow of the C
stack, which is what the recursion limit is designed to protect us from.

For values in between, the test fails for 1-2 out of 10 values of
sys.getrecursionlimit(). For example, with dill == 0.3.1.1, the test fails for
recursion limit values set to 1000, 1006, 1010, 1017, 1032, 12501, and 12505, but
passes for 1001, 1007, 12500, etc. When the test is failing, we can make it
pass by adding another function call to the stack, for example:

{noformat}
  def test_remote_runner_display_data(self):

    def run_test():
      ... test code ...

    run_test()
{noformat}

So, I think commits [1-2] are not responsible for the error; they just change
the shape of the call stack (in the case of [2]) or the value of
sys.getrecursionlimit() (in the case of [1]). This also explains why the test may
pass when run via python ./setup.py nosetests --tests
'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data',
 but fail when run via tox -e py37-gcp - the shape of the call stack is
slightly different in these invocations.

 

[1] 
https://github.com/ipython/ipython/commit/3ff1be2ea8ef180a6f17a6a03a3f8452303b9abe
[2] 
https://github.com/uqfoundation/dill/commit/1cc66b404b539df76f8332440547c567a09b8b28

> DataflowRunnerTest.test_remote_runner_display_data fails due to infinite 
> recursion during pickling.
> ---
>
> Key: BEAM-8397
> URL: https://issues.apache.org/jira/browse/BEAM-8397
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> `python ./setup.py test -s 
> apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
>  passes.
> `tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
> depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
> 'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
>  fails currently if run on master.
> The failure indicates infinite recursion during pickling:
> {noformat}
> test_remote_runner_display_data 
> (apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
> Fatal Python error: Cannot recover from stack overflow.
> Current thread 0x7f9d700ed740 (most recent call first):
>   File "/usr/lib/python3.7/pickle.py", line 479 in get
>   File "/usr/lib/python3.7/pickle.py", line 497 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1394 in save_function
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
>   File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 910 in save_module_dict
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 198 in new_save_module_dict
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
>  line 114 in wrapper
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
>   File 
> "/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
>  line 1137 in save_cell
>   File "/usr/lib/python3.7/pickle.py", line 504 in save
>   File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
>   

[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=331654=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331654
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:25
Start Date: 21/Oct/19 21:25
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #9756: [BEAM-3713] 
Add pytest for unit tests
URL: https://github.com/apache/beam/pull/9756#discussion_r337247329
 
 

 ##
 File path: sdks/python/conftest.py
 ##
 @@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+"""Pytest configuration and custom hooks."""
+
+from __future__ import absolute_import
+
+import sys
+
+# See pytest.ini for main collection rules.
+collect_ignore_glob = []
+if sys.version_info < (3,):
+  collect_ignore_glob.append('*_py3.py')
+for minor in [5, 6, 7, 8, 9]:
+  if sys.version_info < (3, minor):
+    collect_ignore_glob.append('*_py3%d.py' % minor)
 
 Review comment:
   > IDK. It's more verbose but I don't have a preference.
   
   I think it'll be easier to maintain long term: just update the max supported 
Python version.
   
   But maintainability aside, it seems a bit odd to test for our lower bound 
in a loop when we know precisely what the lower bound is: 
`sys.version_info.minor + 1`.
   
   > Ideally, I'd like a single source of truth for the list of supported Python 
versions (python_requires in setup.py is such a source but it's hard to parse).
   
   True.
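
As an aside, a minimal sketch of the alternative being discussed, assuming a single
MAX_SUPPORTED_MINOR constant as the (hypothetical) source of truth; this is not the
code in the PR:

{noformat}
import sys

MAX_SUPPORTED_MINOR = 9  # hypothetical single source of truth

collect_ignore_glob = []
if sys.version_info[0] < 3:
  # On Python 2, ignore every Python-3-only module.
  collect_ignore_glob.append('*_py3.py')
  ignored_minors = range(5, MAX_SUPPORTED_MINOR + 1)
else:
  # On Python 3.x, ignore modules that require a newer minor version than x.
  ignored_minors = range(sys.version_info[1] + 1, MAX_SUPPORTED_MINOR + 1)
for minor in ignored_minors:
  collect_ignore_glob.append('*_py3%d.py' % minor)
{noformat}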
   
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331654)
Time Spent: 10h  (was: 9h 50m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> Per 
> [https://nose.readthedocs.io/en/latest/|https://nose.readthedocs.io/en/latest/,]
>  , nose is in maintenance mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8383) Add metrics to Python state cache

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8383?focusedWorklogId=331653=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331653
 ]

ASF GitHub Bot logged work on BEAM-8383:


Author: ASF GitHub Bot
Created on: 21/Oct/19 21:21
Start Date: 21/Oct/19 21:21
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9769: [BEAM-8383] Add metrics 
to the Python state cache
URL: https://github.com/apache/beam/pull/9769#issuecomment-544713054
 
 
   I've run some cluster tests with this. The metrics provide pretty good 
insight into how well the state caching performs. The hit/miss gauges as 
well as the current state size/capacity allow users to tune the cache to a 
decent size. The metrics are available per-bundle and as totals.
   
   This could use a review now :)
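
For readers unfamiliar with these metrics, a minimal, self-contained sketch (plain
Python, not the Beam implementation) of the kind of bookkeeping they expose: hit and
miss counts plus the current size and the configured capacity of an LRU cache.

{noformat}
import collections


class CountingLRUCache(object):
  """Toy LRU cache that tracks the statistics exposed by the new metrics."""

  def __init__(self, capacity):
    self.capacity = capacity
    self.hits = 0
    self.misses = 0
    self._data = collections.OrderedDict()

  def get(self, key):
    if key in self._data:
      self.hits += 1
      self._data.move_to_end(key)  # mark as most recently used
      return self._data[key]
    self.misses += 1
    return None

  def put(self, key, value):
    self._data[key] = value
    self._data.move_to_end(key)
    if len(self._data) > self.capacity:
      self._data.popitem(last=False)  # evict the least recently used entry

  def describe(self):
    return {'hits': self.hits, 'misses': self.misses,
            'size': len(self._data), 'capacity': self.capacity}
{noformat}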
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331653)
Time Spent: 50m  (was: 40m)

> Add metrics to Python state cache
> -
>
> Key: BEAM-8383
> URL: https://issues.apache.org/jira/browse/BEAM-8383
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink, sdk-py-core
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For more insight into how effectively the cache works, metrics should be 
> added to the Python SDK. All the state operations should be counted, as well 
> as metrics like the current size of the cache, cache hits/misses, and the 
> capacity of the cache.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8397) DataflowRunnerTest.test_remote_runner_display_data fails due to infinite recursion during pickling.

2019-10-21 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-8397:
--
Description: 
`python ./setup.py test -s 
apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
 passes.
`tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
 fails currently if run on master.

The failure indicates infinite recursion during pickling:
{noformat}
test_remote_runner_display_data 
(apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
Fatal Python error: Cannot recover from stack overflow.

Current thread 0x7f9d700ed740 (most recent call first):
  File "/usr/lib/python3.7/pickle.py", line 479 in get
  File "/usr/lib/python3.7/pickle.py", line 497 in save
  File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 1394 in save_function
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
  File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 910 in save_module_dict
  File 
"/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
 line 198 in new_save_module_dict
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
  File 
"/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
 line 114 in wrapper
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 1137 in save_cell
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 771 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 1394 in save_function
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
  File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 910 in save_module_dict
  File 
"/usr/local/google/home/valentyn/projects/beam/clean/beam/sdks/python/apache_beam/internal/pickler.py",
 line 198 in new_save_module_dict
...
{noformat}

cc: [~yoshiki.obata]


  was:
`python ./setup.py test -s 
apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest.test_remote_runner_display_data`
 passes.
`tox -e py37-gcp` passes if Beam depends on dill==0.3.0, but fails if Beam 
depends on dill==0.3.1.1.`python ./setup.py nosetests --tests 
'apache_beam/runners/dataflow/dataflow_runner_test.py:DataflowRunnerTest.test_remote_runner_display_data`
 fails currently if run on master.

The failure indicates infinite recursion during pickling:
{noformat}
test_remote_runner_display_data 
(apache_beam.runners.dataflow.dataflow_runner_test.DataflowRunnerTest) ... 
Fatal Python error: Cannot recover from stack overflow.

Current thread 0x7f9d700ed740 (most recent call first):
  File "/usr/lib/python3.7/pickle.py", line 479 in get
  File "/usr/lib/python3.7/pickle.py", line 497 in save
  File "/usr/lib/python3.7/pickle.py", line 786 in save_tuple
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 638 in save_reduce
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 1394 in save_function
  File "/usr/lib/python3.7/pickle.py", line 504 in save
  File "/usr/lib/python3.7/pickle.py", line 882 in _batch_setitems
  File "/usr/lib/python3.7/pickle.py", line 856 in save_dict
  File 
"/usr/local/google/home/valentyn/tmp/py37env/lib/python3.7/site-packages/dill/_dill.py",
 line 910 in save_module_dict
  File 

[jira] [Commented] (BEAM-8416) ZipFileArtifactServiceTest.test_concurrent_requests flaky

2019-10-21 Thread Ahmet Altay (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956443#comment-16956443
 ] 

Ahmet Altay commented on BEAM-8416:
---

Could this be closed?

> ZipFileArtifactServiceTest.test_concurrent_requests flaky
> -
>
> Key: BEAM-8416
> URL: https://issues.apache.org/jira/browse/BEAM-8416
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Robert Bradshaw
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {code}
> Traceback (most recent call last):
>   File "/usr/lib/python3.7/unittest/case.py", line 59, in testPartExecutor
> yield
>   File "/usr/lib/python3.7/unittest/case.py", line 615, in run
> testMethod()
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 215, in test_concurrent_requests
> _ = list(pool.map(check, range(100)))
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 425, in result
> return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
> __get_result
> raise self._exception
>   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
> result = self.fn(*self.args, **self.kwargs)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 208, in check
> self._service, tokens[session(index)], name(index)))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 73, in retrieve_artifact
> name=name)))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 70, in 
> return b''.join(chunk.data for chunk in retrieval_service.GetArtifact(
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service.py",
>  line 133, in GetArtifact
> chunk = fin.read(self._chunk_size)
>   File "/usr/lib/python3.7/zipfile.py", line 899, in read
> data = self._read1(n)
>   File "/usr/lib/python3.7/zipfile.py", line 989, in _read1
> self._update_crc(data)
>   File "/usr/lib/python3.7/zipfile.py", line 917, in _update_crc
> raise BadZipFile("Bad CRC-32 for file %r" % self.name)
> zipfile.BadZipFile: Bad CRC-32 for file 
> '/3b2b55eb92de23535010b7ac80d553ec2d4bae872ac5606bc3042ce9313dff87/e1d492628cc0c1d0c1b736184f689be54fa03a996de918268ad834560e77305f'
> {code}
> and:
> {code}
> Traceback (most recent call last):
>   File "/usr/lib/python3.7/unittest/case.py", line 59, in testPartExecutor
> yield
>   File "/usr/lib/python3.7/unittest/case.py", line 615, in run
> testMethod()
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 215, in test_concurrent_requests
> _ = list(pool.map(check, range(100)))
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in 
> result_iterator
> yield fs.pop().result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 425, in result
> return self.__get_result()
>   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in 
> __get_result
> raise self._exception
>   File "/usr/lib/python3.7/concurrent/futures/thread.py", line 57, in run
> result = self.fn(*self.args, **self.kwargs)
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 208, in check
> self._service, tokens[session(index)], name(index)))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 73, in retrieve_artifact
> name=name)))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/runners/portability/artifact_service_test.py",
>  line 

[jira] [Work logged] (BEAM-8449) Document known issue with macOS installation

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8449?focusedWorklogId=331648=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331648
 ]

ASF GitHub Bot logged work on BEAM-8449:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:57
Start Date: 21/Oct/19 20:57
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9847: [BEAM-8449] 
Document known issue with macOS installation
URL: https://github.com/apache/beam/pull/9847#discussion_r337236072
 
 

 ##
 File path: website/src/_posts/2019-07-31-beam-2.14.0.md
 ##
 @@ -79,6 +79,7 @@ For more information on changes in 2.14.0, check out the
 ### Known Issues
 
 * Do **NOT** use Python MongoDB source in this release. Python MongoDB source 
[added](https://issues.apache.org/jira/browse/BEAM-5148) in this release has a 
known issue that can result in data loss. See 
([BEAM-7866](https://issues.apache.org/jira/browse/BEAM-7866)) for details.
+* Can't install the Python SDK on macOS 10.15 
([BEAM-8368](https://issues.apache.org/jira/browse/BEAM-8368))
 
 Review comment:
   I think this is fine.
   
   @soyrice you are missing a dot at the end of the sentence. Could you add 
that?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331648)
Time Spent: 50m  (was: 40m)

> Document known issue with macOS installation
> 
>
> Key: BEAM-8449
> URL: https://issues.apache.org/jira/browse/BEAM-8449
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Cyrus Maden
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> For details about the original issue, see 
> [BEAM-8363|https://issues.apache.org/jira/browse/BEAM-8368]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8433) DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8433?focusedWorklogId=331644=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331644
 ]

ASF GitHub Bot logged work on BEAM-8433:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:51
Start Date: 21/Oct/19 20:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #9835: [BEAM-8433] Use 
JUnit parameterized runner for dialect-sensitive integration tests
URL: https://github.com/apache/beam/pull/9835#issuecomment-544702408
 
 
   run sql postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331644)
Time Spent: 2h 10m  (was: 2h)

> DataCatalogBigQueryIT runs for both Calcite and ZetaSQL dialects
> 
>
> Key: BEAM-8433
> URL: https://issues.apache.org/jira/browse/BEAM-8433
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8445) ZetaSQL translator returns null from unsupported methods

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8445?focusedWorklogId=331643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331643
 ]

ASF GitHub Bot logged work on BEAM-8445:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:51
Start Date: 21/Oct/19 20:51
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9837: 
[BEAM-8445] Fix some not allowed nulls in ZetaSQL translator
URL: https://github.com/apache/beam/pull/9837
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331643)
Time Spent: 20m  (was: 10m)

> ZetaSQL translator returns null from unsupported methods
> 
>
> Key: BEAM-8445
> URL: https://issues.apache.org/jira/browse/BEAM-8445
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is an error-prone style that was missed because our null analysis is 
> disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331642=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331642
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:48
Start Date: 21/Oct/19 20:48
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337232020
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##
 @@ -0,0 +1,258 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Module visualizes PCollection data.
+
+For internal use only; no backwards-compatibility guarantees.
+Only works with Python 3.5+.
+"""
+from __future__ import absolute_import
+
+import base64
+import logging
+from datetime import timedelta
+
+from pandas.io.json import json_normalize
+
+from apache_beam import pvalue
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive import pipeline_instrument as instr
+from facets_overview.generic_feature_statistics_generator import 
GenericFeatureStatisticsGenerator
+from IPython.core.display import HTML
+from IPython.core.display import Javascript
+from IPython.core.display import display
+from IPython.core.display import display_javascript
+from IPython.core.display import update_display
+from timeloop import Timeloop
+
+# jsons doesn't support < Python 3.5. Work around with json for legacy tests.
 
 Review comment:
   All the Gradle tasks still support py2.
   So you still have docs, lint and other tasks running in a py2 virtual env. We'll 
have to pass those pre-commit checks.
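
The workaround referenced in the excerpt above generally takes the following shape.
This is an illustrative sketch only, not the exact Beam code: prefer the third-party
jsons library where available and fall back to the stdlib json so that py2-based
lint/docs environments can still import the module.

{noformat}
try:
  import jsons  # third-party, Python 3.5+ only

  def to_json(obj):
    return jsons.dumps(obj)
except ImportError:
  import json

  def to_json(obj):
    # Best-effort fallback serialization for legacy environments.
    return json.dumps(obj, default=lambda o: getattr(o, '__dict__', str(o)))
{noformat}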
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331642)
Time Spent: 7h 40m  (was: 7.5h)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline is defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> of the materialized pcoll.
> e.g., visualize(pcoll)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331640
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:46
Start Date: 21/Oct/19 20:46
Worklog Time Spent: 10m 
  Work Description: KevinGG commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337231031
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization_test.py
 ##
 @@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Tests for apache_beam.runners.interactive.display.pcoll_visualization."""
+from __future__ import absolute_import
+
+import sys
+import time
+import unittest
+
+import apache_beam as beam  # pylint: disable=ungrouped-imports
+import timeloop
+from apache_beam.runners import runner
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive.display import pcoll_visualization as pv
+
+# Work around nose tests using Python2 without unittest.mock module.
+try:
+  from unittest.mock import patch
+except ImportError:
+  from mock import patch
+
+
+class PCollVisualizationTest(unittest.TestCase):
+
+  def setUp(self):
+self._p = beam.Pipeline()
+# pylint: disable=range-builtin-not-iterating
+self._pcoll = self._p | 'Create' >> beam.Create(range(1000))
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  def test_raise_error_for_non_pcoll_input(self):
+class Foo(object):
+  pass
+
+with self.assertRaises(AssertionError) as ctx:
+  pv.PCollVisualization(Foo())
+  self.assertTrue('pcoll should be apache_beam.pvalue.PCollection' in
+  ctx.exception)
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  def test_pcoll_visualization_generate_unique_display_id(self):
+pv_1 = pv.PCollVisualization(self._pcoll)
+pv_2 = pv.PCollVisualization(self._pcoll)
+self.assertNotEqual(pv_1._dive_display_id, pv_2._dive_display_id)
+self.assertNotEqual(pv_1._overview_display_id, pv_2._overview_display_id)
+self.assertNotEqual(pv_1._df_display_id, pv_2._df_display_id)
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', lambda x: [1, 2, 3])
+  def test_one_shot_visualization_not_return_handle(self):
+self.assertIsNone(pv.visualize(self._pcoll))
+
+  def _mock_to_element_list(self):
+yield [1, 2, 3]
+yield [1, 2, 3, 4]
+yield [1, 2, 3, 4, 5]
+yield [1, 2, 3, 4, 5, 6]
+yield [1, 2, 3, 4, 5, 6, 7]
+yield [1, 2, 3, 4, 5, 6, 7, 8]
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', _mock_to_element_list)
+  def test_dynamic_plotting_return_handle(self):
+h = pv.visualize(self._pcoll, dynamic_plotting_interval=1)
+self.assertIsInstance(h, timeloop.Timeloop)
+h.stop()
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', _mock_to_element_list)
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization.display_facets')
+  def test_dynamic_plotting_update_same_display(self,
+mocked_display_facets):
+fake_pipeline_result = runner.PipelineResult(runner.PipelineState.RUNNING)
+ie.current_env().set_pipeline_result(self._p, fake_pipeline_result)
+# Starts async dynamic plotting that never 

[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331641=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331641
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:46
Start Date: 21/Oct/19 20:46
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337230234
 
 

 ##
 File path: model/interactive/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,124 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.interactive.v1;
+
+option go_package = "interactive_v1";
+option java_package = "org.apache.beam.model.interactive.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  // Starts the stream of events to the EventsRequest.
+  rpc Start (StartRequest) returns (StartResponse) {}
+
+  // Stops and resets the stream to the beginning.
+  rpc Stop (StopRequest) returns (StopResponse) {}
+
+  // Pauses the stream of events to the EventsRequest. If there is already an
+  // outstanding EventsRequest streaming events, then the stream will pause
+  // after the EventsResponse is completed.
+  rpc Pause (PauseRequest) returns (PauseResponse) {}
+
+  // Sends a single element to the EventsRequest then closes the stream.
+  rpc Step (StepRequest) returns (StepResponse) {}
+
+  // Responds with debugging and other cache-specific metadata.
+  rpc Status (StatusRequest) returns (StatusResponse) {}
+}
+
+message StartRequest {
+  double playback_speed = 1;
+
+  google.protobuf.Timestamp start_time = 2;
+}
+message StartResponse { }
+
+message StopRequest { }
+message StopResponse { }
+
+message PauseRequest { }
+message PauseResponse { }
+
+message StatusRequest { }
+message StatusResponse {
+
+  // The current timestamp of the replay stream. Is MIN_TIMESTAMP when state
+  // is STOPPED.
+  google.protobuf.Timestamp stream_time = 1;
+
+  // The minimum watermark across all of the faked replayable unbounded 
sources.
+  // Is MIN_TIMESTAMP when state is STOPPED.
+  google.protobuf.Timestamp watermark = 2;
+
 
 Review comment:
   Do we need more info in the StatusResponse like:
   1. the latest timestamp of the replayable stream (i.e. the current time if 
the recording job is still running, or the time when the recording job stopped)
   2. the current playback speed if running (either playback speed of 
StartRequest when stream_time < latest timestamp of the replayable stream, or 1 
if stream_time == latest timestamp of the replayable stream and the recording 
job is still running, or 0 if not running).
   
   I think these two fields will be valuable for the user to find out more 
about the replay.
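
A hedged sketch of the playback-speed rule proposed above; the function and parameter
names are hypothetical and not part of the proto or of any Beam API:

{noformat}
def current_playback_speed(stream_time, latest_time, requested_speed,
                           recording_running):
  """Effective playback speed for a StatusResponse-like reply."""
  if stream_time < latest_time:
    # Still replaying recorded events at the speed requested in StartRequest.
    return requested_speed
  if recording_running:
    # Caught up with the recording job; follow it in real time.
    return 1.0
  # The recording has finished and has been fully replayed.
  return 0.0


# Example: caught up with a recording job that is still running.
print(current_playback_speed(stream_time=100, latest_time=100,
                             requested_speed=4.0, recording_running=True))
{noformat}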
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331641)
Time Spent: 4h 20m  (was: 4h 10m)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h

[jira] [Work logged] (BEAM-8449) Document known issue with macOS installation

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8449?focusedWorklogId=331639=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331639
 ]

ASF GitHub Bot logged work on BEAM-8449:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:43
Start Date: 21/Oct/19 20:43
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #9847: 
[BEAM-8449] Document known issue with macOS installation
URL: https://github.com/apache/beam/pull/9847#discussion_r337229867
 
 

 ##
 File path: website/src/_posts/2019-07-31-beam-2.14.0.md
 ##
 @@ -79,6 +79,7 @@ For more information on changes in 2.14.0, check out the
 ### Known Issues
 
 * Do **NOT** use Python MongoDB source in this release. Python MongoDB source 
[added](https://issues.apache.org/jira/browse/BEAM-5148) in this release has a 
known issue that can result in data loss. See 
([BEAM-7866](https://issues.apache.org/jira/browse/BEAM-7866)) for details.
+* Can't install the Python SDK on macOS 10.15 
([BEAM-8368](https://issues.apache.org/jira/browse/BEAM-8368))
 
 Review comment:
   I'm fine with leaving it as is
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331639)
Time Spent: 40m  (was: 0.5h)

> Document known issue with macOS installation
> 
>
> Key: BEAM-8449
> URL: https://issues.apache.org/jira/browse/BEAM-8449
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Cyrus Maden
>Assignee: Cyrus Maden
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> For details about the original issue, see 
> [BEAM-8363|https://issues.apache.org/jira/browse/BEAM-8368]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=331637=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331637
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:39
Start Date: 21/Oct/19 20:39
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9756: [BEAM-3713] Add pytest 
for unit tests
URL: https://github.com/apache/beam/pull/9756#issuecomment-544697773
 
 
   PTAL
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331637)
Time Spent: 9h 50m  (was: 9h 40m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Per 
> [https://nose.readthedocs.io/en/latest/|https://nose.readthedocs.io/en/latest/,]
>  , nose is in maintenance mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8335) Add streaming support to Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8335?focusedWorklogId=331638=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331638
 ]

ASF GitHub Bot logged work on BEAM-8335:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:39
Start Date: 21/Oct/19 20:39
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9720: [BEAM-8335] 
Add initial modules for interactive streaming support
URL: https://github.com/apache/beam/pull/9720#discussion_r337227974
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_interactive_api.proto
 ##
 @@ -0,0 +1,106 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Protocol Buffers describing a service that can be used in conjunction with
+ * the TestStream class in order to control a pipeline remotely.
+ */
+
+syntax = "proto3";
+
+package org.apache.beam.model.fn_execution.v1;
+
+option go_package = "fnexecution_v1";
+option java_package = "org.apache.beam.model.fnexecution.v1";
+option java_outer_classname = "BeamInteractiveApi";
+
+import "beam_runner_api.proto";
+import "google/protobuf/timestamp.proto";
+
+service InteractiveService {
+
+  // A TestStream will request for events using this RPC.
+  rpc Events(EventsRequest) returns (stream EventsResponse) {}
+
+  rpc Start (StartRequest) returns (StartResponse) {}
 
 Review comment:
   Sorry for not being more clear. 
   
   Currently, we have a TestStream that reads its elements from memory. We are 
hoping to augment that so that its elements can be supplied externally, over 
GRPC, so we have
   
   TestStreamImpl <--- TestStreamPayload.Event --- Driver
   
   Where TestStreamImpl and Driver sit in separate processes. To do this, we 
should augment TestStreamPayload with an (optional) address (you can probably 
reuse Endpoint) to which it should connect over GRPC to get a stream of 
TestStreamPayload.Events. 
   
   This Service seems to be trying to split Driver across processes, which 
seems to introduce unnecessary complexity. 
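
To make the distinction concrete, a small illustrative sketch (plain Python, no gRPC,
and not Beam's actual TestStream): the only difference between the in-memory variant
and the externally driven one is where the event iterator comes from; in practice it
could be an iterator draining a server-streaming RPC of TestStreamPayload.Event
messages.

{noformat}
class FakeTestStream(object):
  """Toy stand-in for a TestStream that consumes events from any iterable."""

  def __init__(self, events):
    # 'events' may be an in-memory list or an iterator backed by a gRPC stream.
    self._events = iter(events)

  def run(self, apply_event):
    for event in self._events:
      apply_event(event)


# In-memory usage; an external driver would hand in a streaming iterator instead.
FakeTestStream(['advance_watermark', 'emit:a', 'emit:b']).run(print)
{noformat}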
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331638)
Time Spent: 4h 10m  (was: 4h)

> Add streaming support to Interactive Beam
> -
>
> Key: BEAM-8335
> URL: https://issues.apache.org/jira/browse/BEAM-8335
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-py-interactive
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: Major
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> This issue tracks the work items to introduce streaming support to the 
> Interactive Beam experience. This will allow users to:
>  * Write and run a streaming job in IPython
>  * Automatically cache records from unbounded sources
>  * Add a replay experience that replays all cached records to simulate the 
> original pipeline execution
>  * Add controls to play/pause/stop/step individual elements from the cached 
> records
>  * Add ability to inspect/visualize unbounded PCollections



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=331636=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331636
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:34
Start Date: 21/Oct/19 20:34
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9756: [BEAM-3713] Add 
pytest for unit tests
URL: https://github.com/apache/beam/pull/9756#discussion_r337225683
 
 

 ##
 File path: sdks/python/conftest.py
 ##
 @@ -0,0 +1,29 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+"""Pytest configuration and custom hooks."""
+
+from __future__ import absolute_import
+
+import sys
+
+# See pytest.ini for main collection rules.
+collect_ignore_glob = []
+if sys.version_info < (3,):
+  collect_ignore_glob.append('*_py3.py')
+for minor in [5, 6, 7, 8, 9]:
+  if sys.version_info < (3, minor):
+    collect_ignore_glob.append('*_py3%d.py' % minor)
 
 Review comment:
   IDK. It's more verbose but I don't have a preference.
   
   Ideally, I'd like a single source of truth for the list of supported Python 
versions (`python_requires` in setup.py is such a source but it's hard to 
parse).
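
For what it's worth, python_requires can be parsed with the packaging library; a
hedged sketch follows, where the python_requires string is only an example value and
not necessarily Beam's actual specifier:

{noformat}
from packaging.specifiers import SpecifierSet

# Example value only.
python_requires = '>=2.7,!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,<3.8'
spec = SpecifierSet(python_requires)

# Python 3 minor versions that satisfy the specifier.
supported_py3_minors = [m for m in range(5, 10) if '3.%d' % m in spec]
print(supported_py3_minors)  # e.g. [5, 6, 7]
{noformat}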
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331636)
Time Spent: 9h 40m  (was: 9.5h)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Per 
> [https://nose.readthedocs.io/en/latest/|https://nose.readthedocs.io/en/latest/,]
>  , nose is in maintenance mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8450) ParDoLifecycleTest does not allow for empty bundles

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8450?focusedWorklogId=331635=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331635
 ]

ASF GitHub Bot logged work on BEAM-8450:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:32
Start Date: 21/Oct/19 20:32
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9848: [BEAM-8450] Allow empty 
bundles in ParDoLifecycleTest
URL: https://github.com/apache/beam/pull/9848#issuecomment-544695038
 
 
   R: @lukecwik @robertwb 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331635)
Time Spent: 1h 10m  (was: 1h)

> ParDoLifecycleTest does not allow for empty bundles
> ---
>
> Key: BEAM-8450
> URL: https://issues.apache.org/jira/browse/BEAM-8450
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> ParDoLifecycleTest should allow empty bundles produced by runner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8450) ParDoLifecycleTest does not allow for empty bundles

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8450?focusedWorklogId=331631=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331631
 ]

ASF GitHub Bot logged work on BEAM-8450:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:31
Start Date: 21/Oct/19 20:31
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9848: [BEAM-8450] Allow empty 
bundles in ParDoLifecycleTest
URL: https://github.com/apache/beam/pull/9848#issuecomment-544694642
 
 
   Run Samza ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331631)
Time Spent: 50m  (was: 40m)

> ParDoLifecycleTest does not allow for empty bundles
> ---
>
> Key: BEAM-8450
> URL: https://issues.apache.org/jira/browse/BEAM-8450
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ParDoLifecycleTest should allow empty bundles produced by runner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8450) ParDoLifecycleTest does not allow for empty bundles

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8450?focusedWorklogId=331632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331632
 ]

ASF GitHub Bot logged work on BEAM-8450:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:31
Start Date: 21/Oct/19 20:31
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9848: [BEAM-8450] Allow empty 
bundles in ParDoLifecycleTest
URL: https://github.com/apache/beam/pull/9848#issuecomment-544694698
 
 
   Run Dataflow ValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331632)
Time Spent: 1h  (was: 50m)

> ParDoLifecycleTest does not allow for empty bundles
> ---
>
> Key: BEAM-8450
> URL: https://issues.apache.org/jira/browse/BEAM-8450
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> ParDoLifecycleTest should allow empty bundles produced by runner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8450) ParDoLifecycleTest does not allow for empty bundles

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8450?focusedWorklogId=331630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331630
 ]

ASF GitHub Bot logged work on BEAM-8450:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:31
Start Date: 21/Oct/19 20:31
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9848: [BEAM-8450] Allow empty 
bundles in ParDoLifecycleTest
URL: https://github.com/apache/beam/pull/9848#issuecomment-544694578
 
 
   Run Java Flink PortableValidatesRunner
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331630)
Time Spent: 40m  (was: 0.5h)

> ParDoLifecycleTest does not allow for empty bundles
> ---
>
> Key: BEAM-8450
> URL: https://issues.apache.org/jira/browse/BEAM-8450
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> ParDoLifecycleTest should allow empty bundles produced by runner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331633=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331633
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:31
Start Date: 21/Oct/19 20:31
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337223563
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##
 @@ -0,0 +1,258 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Module visualizes PCollection data.
+
+For internal use only; no backwards-compatibility guarantees.
+Only works with Python 3.5+.
+"""
+from __future__ import absolute_import
+
+import base64
+import logging
+from datetime import timedelta
+
+from pandas.io.json import json_normalize
+
+from apache_beam import pvalue
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive import pipeline_instrument as instr
+from facets_overview.generic_feature_statistics_generator import 
GenericFeatureStatisticsGenerator
+from IPython.core.display import HTML
+from IPython.core.display import Javascript
+from IPython.core.display import display
+from IPython.core.display import display_javascript
+from IPython.core.display import update_display
+from timeloop import Timeloop
+
+# jsons doesn't support < Python 3.5. Work around with json for legacy tests.
 
 Review comment:
   Why do we need this workaround if we only support Python 3.5+?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 331633)
Time Spent: 7h 10m  (was: 7h)

> Visualize PCollection with Interactive Beam
> ---
>
> Key: BEAM-7926
> URL: https://issues.apache.org/jira/browse/BEAM-7926
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-py-interactive
>Reporter: Ning Kang
>Assignee: Ning Kang
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Support auto plotting / charting of materialized data of a given PCollection 
> with Interactive Beam.
> Say an Interactive Beam pipeline is defined as
> p = create_pipeline()
> pcoll = p | 'Transform' >> transform()
> The user can call a single function and get auto-magical charting of the data 
> of the materialized pcoll.
> e.g., visualize(pcoll)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7926) Visualize PCollection with Interactive Beam

2019-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7926?focusedWorklogId=331634=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-331634
 ]

ASF GitHub Bot logged work on BEAM-7926:


Author: ASF GitHub Bot
Created on: 21/Oct/19 20:31
Start Date: 21/Oct/19 20:31
Worklog Time Spent: 10m 
  Work Description: davidyan74 commented on pull request #9741: [BEAM-7926] 
Visualize PCollection
URL: https://github.com/apache/beam/pull/9741#discussion_r337223191
 
 

 ##
 File path: 
sdks/python/apache_beam/runners/interactive/display/pcoll_visualization_test.py
 ##
 @@ -0,0 +1,152 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Tests for apache_beam.runners.interactive.display.pcoll_visualization."""
+from __future__ import absolute_import
+
+import sys
+import time
+import unittest
+
+import apache_beam as beam  # pylint: disable=ungrouped-imports
+import timeloop
+from apache_beam.runners import runner
+from apache_beam.runners.interactive import interactive_environment as ie
+from apache_beam.runners.interactive.display import pcoll_visualization as pv
+
+# Work around nose tests using Python2 without unittest.mock module.
+try:
+  from unittest.mock import patch
+except ImportError:
+  from mock import patch
+
+
+class PCollVisualizationTest(unittest.TestCase):
+
+  def setUp(self):
+self._p = beam.Pipeline()
+# pylint: disable=range-builtin-not-iterating
+self._pcoll = self._p | 'Create' >> beam.Create(range(1000))
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  def test_raise_error_for_non_pcoll_input(self):
+class Foo(object):
+  pass
+
+with self.assertRaises(AssertionError) as ctx:
+  pv.PCollVisualization(Foo())
+  self.assertTrue('pcoll should be apache_beam.pvalue.PCollection' in
+  ctx.exception)
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  def test_pcoll_visualization_generate_unique_display_id(self):
+pv_1 = pv.PCollVisualization(self._pcoll)
+pv_2 = pv.PCollVisualization(self._pcoll)
+self.assertNotEqual(pv_1._dive_display_id, pv_2._dive_display_id)
+self.assertNotEqual(pv_1._overview_display_id, pv_2._overview_display_id)
+self.assertNotEqual(pv_1._df_display_id, pv_2._df_display_id)
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', lambda x: [1, 2, 3])
+  def test_one_shot_visualization_not_return_handle(self):
+self.assertIsNone(pv.visualize(self._pcoll))
+
+  def _mock_to_element_list(self):
+yield [1, 2, 3]
+yield [1, 2, 3, 4]
+yield [1, 2, 3, 4, 5]
+yield [1, 2, 3, 4, 5, 6]
+yield [1, 2, 3, 4, 5, 6, 7]
+yield [1, 2, 3, 4, 5, 6, 7, 8]
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', _mock_to_element_list)
+  def test_dynamic_plotting_return_handle(self):
+h = pv.visualize(self._pcoll, dynamic_plotting_interval=1)
+self.assertIsInstance(h, timeloop.Timeloop)
+h.stop()
+
+  @unittest.skipIf(sys.version_info < (3, 5, 3),
+   'PCollVisualization is not supported on Python 2.')
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization._to_element_list', _mock_to_element_list)
+  @patch('apache_beam.runners.interactive.display.pcoll_visualization'
+ '.PCollVisualization.display_facets')
+  def test_dynamic_plotting_update_same_display(self,
+mocked_display_facets):
+fake_pipeline_result = runner.PipelineResult(runner.PipelineState.RUNNING)
+ie.current_env().set_pipeline_result(self._p, fake_pipeline_result)
+# Starts async dynamic plotting that 
