[jira] [Work logged] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?focusedWorklogId=380320=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380320
 ]

ASF GitHub Bot logged work on BEAM-9240:


Author: ASF GitHub Bot
Created on: 01/Feb/20 07:55
Start Date: 01/Feb/20 07:55
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #10744: [BEAM-9240]: Check 
for Nullability in typesEqual() method of FieldTyp…
URL: https://github.com/apache/beam/pull/10744#issuecomment-581005641
 
 
   Can you explain how this would work? equals() checks field names and
   typesEqual does not.
   
   On Fri, Jan 31, 2020 at 11:48 PM Rahul Patwari wrote:
   
   > Also, the typesEqual() method of the FieldType class is redundant, as its
   > behaviour is a subset of the equals() method.
   >
   > Instead of doing this:
   >
   > public boolean typesEqual(Field other) {
   >   return getType().typesEqual(other.getType());
   > }
   >
   > we can test type equality by doing:
   >
   > public boolean typesEqual(Field other) {
   >   return getType().equals(other.getType());
   > }
   >
   > and we can *remove* the typesEqual() method in the FieldType class.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 380320)
Time Spent: 0.5h  (was: 20m)

> Check for Nullability in typesEqual() method of FieldType class
> ---
>
> Key: BEAM-9240
> URL: https://issues.apache.org/jira/browse/BEAM-9240
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.18.0
>Reporter: Rahul Patwari
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {{If two schemas are created like this:}}
> {{Schema schema1 = Schema.builder().addStringField("col1").build();}}
>  {{Schema schema2 = Schema.builder().addNullableField("col1", 
> FieldType.STRING).build();}}
>  
> {{schema1.typeEquals(schema2) returns "true" even though the schemas differ 
> by Nullability}}
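
To illustrate the distinction discussed in this thread, here is a minimal, self-contained Python sketch (not Beam's Java code). `FieldType`, `Field`, `types_equal` and `row_fields` are simplified stand-ins invented for this example. It assumes that type equality should compare type name and nullability while ignoring nested field names, whereas full equality also compares the names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FieldType:
    """Hypothetical stand-in for a schema field type (not the Beam class)."""
    type_name: str
    nullable: bool = False
    # Nested fields, used only when type_name == "ROW".
    row_fields: Tuple["Field", ...] = ()

    def types_equal(self, other: "FieldType") -> bool:
        """Compare type name, nullability and nested types, ignoring names."""
        if self.type_name != other.type_name:
            return False
        # The check the issue asks for: nullability is part of the type.
        if self.nullable != other.nullable:
            return False
        if len(self.row_fields) != len(other.row_fields):
            return False
        return all(
            a.type.types_equal(b.type)
            for a, b in zip(self.row_fields, other.row_fields))


@dataclass(frozen=True)
class Field:
    """Hypothetical named field; the generated __eq__ also compares names."""
    name: str
    type: FieldType


string = FieldType("STRING")
nullable_string = FieldType("STRING", nullable=True)
row_a = FieldType("ROW", row_fields=(Field("col1", string),))
row_b = FieldType("ROW", row_fields=(Field("renamed", string),))

print(string.types_equal(nullable_string))  # False: nullability differs
print(row_a.types_equal(row_b))             # True: only field names differ
print(row_a == row_b)                       # False: equality compares names
```

In this model, replacing `types_equal` with `==` would start rejecting row types that differ only in field names, which is the concern raised in the reply above.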



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9219) Streamline creation of Python and Java dependencies pages

2020-01-31 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9219:
--
Description: 
This issue is about keeping the Python and Java SDK dependency pages relevant 
and up to date while reducing the time it takes to provide that information. 
The current method of scraping and copying dependencies into a table for every 
release is a non-trivial task because of the semi-automated workflows done by 
the tech writers on the website.

To provide accurate dependency listings that are always in sync with SDK 
releases, referring people to the appropriate places in the source code (or to 
CLI commands) should give them the information they are looking for without 
requiring the creation and maintenance of automated tooling to generate the 
dependency tables.

  was:
This issue is about the need to address keeping both Python and Java SDK 
dependency pages more relevant and up-to-date while reducing the amount of time 
it takes to provide that information. The current method of scraping and 
copying dependencies into a table for every release is a non-trivial task 
because of some semi-automated workflows done by the tech writers on the 
website.

 

In an effort to provide accurate dependency listings that are always in sync 
with SDK releases, referring people to the appropriate places in the source 
code (or through CLI commands).


> Streamline creation of Python and Java dependencies pages
> -
>
> Key: BEAM-9219
> URL: https://issues.apache.org/jira/browse/BEAM-9219
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Wrede
>Priority: Minor
>
> This issue is about keeping the Python and Java SDK dependency pages relevant 
> and up to date while reducing the time it takes to provide that information. 
> The current method of scraping and copying dependencies into a table for every 
> release is a non-trivial task because of the semi-automated workflows done by 
> the tech writers on the website.
> To provide accurate dependency listings that are always in sync with SDK 
> releases, referring people to the appropriate places in the source code (or to 
> CLI commands) should give them the information they are looking for without 
> requiring the creation and maintenance of automated tooling to generate the 
> dependency tables.





[jira] [Updated] (BEAM-9219) Streamline creation of Python and Java dependencies pages

2020-01-31 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9219:
--
Summary: Streamline creation of Python and Java dependencies pages  (was: 
Update Python and Java dependencies pages)

> Streamline creation of Python and Java dependencies pages
> -
>
> Key: BEAM-9219
> URL: https://issues.apache.org/jira/browse/BEAM-9219
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Wrede
>Priority: Minor
>
> This issue is about the need to address keeping both Python and Java SDK 
> dependency pages more relevant and up-to-date while reducing the amount of 
> time it takes to provide that information. The current method of scraping and 
> copying dependencies into a table for every release is a non-trivial task 
> because of some semi-automated workflows done by the tech writers on the 
> website.
>  
> In an effort to provide accurate dependency listings that are always in sync 
> with SDK releases, referring people to the appropriate places in the source 
> code (or through CLI commands).





[jira] [Work logged] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?focusedWorklogId=380319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380319
 ]

ASF GitHub Bot logged work on BEAM-9240:


Author: ASF GitHub Bot
Created on: 01/Feb/20 07:48
Start Date: 01/Feb/20 07:48
Worklog Time Spent: 10m 
  Work Description: rahul8383 commented on issue #10744: [BEAM-9240]: Check 
for Nullability in typesEqual() method of FieldTyp…
URL: https://github.com/apache/beam/pull/10744#issuecomment-581005142
 
 
   Also, the typesEqual() method of the FieldType class is redundant, as its 
behaviour is a subset of the equals() method.
   
   Instead of doing this:
   ```
   public boolean typesEqual(Field other) {
     return getType().typesEqual(other.getType());
   }
   ```
   we can test type equality by doing:
   ```
   public boolean typesEqual(Field other) {
     return getType().equals(other.getType());
   }
   ```
   and we can **remove** the typesEqual() method in the FieldType class.
 



Issue Time Tracking
---

Worklog Id: (was: 380319)
Time Spent: 20m  (was: 10m)

> Check for Nullability in typesEqual() method of FieldType class
> ---
>
> Key: BEAM-9240
> URL: https://issues.apache.org/jira/browse/BEAM-9240
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.18.0
>Reporter: Rahul Patwari
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.19.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{If two schemas are created like this:}}
> {{Schema schema1 = Schema.builder().addStringField("col1").build();}}
>  {{Schema schema2 = Schema.builder().addNullableField("col1", 
> FieldType.STRING).build();}}
>  
> {{schema1.typeEquals(schema2) returns "true" even though the schemas differ 
> by Nullability}}





[jira] [Updated] (BEAM-9219) Update Python and Java dependencies pages

2020-01-31 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9219:
--
Description: 
This issue is about the need to address keeping both Python and Java SDK 
dependency pages more relevant and up-to-date while reducing the amount of time 
it takes to provide that information. The current method of scraping and 
copying dependencies into a table for every release is a non-trivial task 
because of some semi-automated workflows done by the tech writers on the 
website.

 

In an effort to provide accurate dependency listings that are always in sync 
with SDK releases, referring people to the appropriate places in the source 
code (or through CLI commands).

  was:
This issue is about the need to address keeping both Python and Java SDK 
dependency pages more relevant and up-to-date. The current method of scraping 
and copying dependencies into a table for every release is a non-trivial task 
because of some semi-automated workflows that have been done by the core 
writers of the documentation.

 

by referring people to the appropriate places in the source code (or through 
CLI commands).


> Update Python and Java dependencies pages
> -
>
> Key: BEAM-9219
> URL: https://issues.apache.org/jira/browse/BEAM-9219
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Wrede
>Priority: Minor
>
> This issue is about the need to address keeping both Python and Java SDK 
> dependency pages more relevant and up-to-date while reducing the amount of 
> time it takes to provide that information. The current method of scraping and 
> copying dependencies into a table for every release is a non-trivial task 
> because of some semi-automated workflows done by the tech writers on the 
> website.
>  
> In an effort to provide accurate dependency listings that are always in sync 
> with SDK releases, referring people to the appropriate places in the source 
> code (or through CLI commands).





[jira] [Updated] (BEAM-9219) Update Python and Java dependencies pages

2020-01-31 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9219:
--
Description: 
This issue is about the need to address keeping both Python and Java SDK 
dependency pages more relevant and up-to-date. The current method of scraping 
and copying dependencies into a table for every release is a non-trivial task 
because of some semi-automated workflows that have been done by the core 
writers of the documentation.

 

by referring people to the appropriate places in the source code (or through 
CLI commands).

> Update Python and Java dependencies pages
> -
>
> Key: BEAM-9219
> URL: https://issues.apache.org/jira/browse/BEAM-9219
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Wrede
>Priority: Minor
>
> This issue is about the need to address keeping both Python and Java SDK 
> dependency pages more relevant and up-to-date. The current method of scraping 
> and copying dependencies into a table for every release is a non-trivial task 
> because of some semi-automated workflows that have been done by the core 
> writers of the documentation.
>  
> by referring people to the appropriate places in the source code (or through 
> CLI commands).





[jira] [Updated] (BEAM-9219) Update Python and Java dependencies pages

2020-01-31 Thread David Wrede (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wrede updated BEAM-9219:
--
Summary: Update Python and Java dependencies pages  (was: Update Python 
dependencies page for 2.18.0)

> Update Python and Java dependencies pages
> -
>
> Key: BEAM-9219
> URL: https://issues.apache.org/jira/browse/BEAM-9219
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: David Wrede
>Priority: Minor
>






[jira] [Work logged] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?focusedWorklogId=380318=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380318
 ]

ASF GitHub Bot logged work on BEAM-9240:


Author: ASF GitHub Bot
Created on: 01/Feb/20 07:39
Start Date: 01/Feb/20 07:39
Worklog Time Spent: 10m 
  Work Description: rahul8383 commented on pull request #10744: 
[BEAM-9240]: Check for Nullability in typesEqual() method of FieldTyp…
URL: https://github.com/apache/beam/pull/10744
 
 
   
   
   Added logic in typesEqual() method of FieldType class to return "false" if 
the FieldTypes differ in Nullability. Also added a Unit Test to validate the 
logic.
   
   

[jira] [Updated] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread Rahul Patwari (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Patwari updated BEAM-9240:

Description: 
{{If two schemas are created like this:}}

{{Schema schema1 = Schema.builder().addStringField("col1").build();}}
 {{Schema schema2 = Schema.builder().addNullableField("col1", 
FieldType.STRING).build();}}

 

{{schema1.typeEquals(schema2) returns "true" even though the schemas differ by 
Nullability}}

  was:
{{If two schemas are created like this:}}

Schema schema1 = Schema.builder().addStringField("col1").build();
 {{Schema schema2 = Schema.builder().addNullableField("col1", 
FieldType.STRING).build();}}

 

{{schema1.typeEquals(schema2) returns "true" even though the schemas differ by 
Nullability}}


> Check for Nullability in typesEqual() method of FieldType class
> ---
>
> Key: BEAM-9240
> URL: https://issues.apache.org/jira/browse/BEAM-9240
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.18.0
>Reporter: Rahul Patwari
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.19.0
>
>
> {{If two schemas are created like this:}}
> {{Schema schema1 = Schema.builder().addStringField("col1").build();}}
>  {{Schema schema2 = Schema.builder().addNullableField("col1", 
> FieldType.STRING).build();}}
>  
> {{schema1.typeEquals(schema2) returns "true" even though the schemas differ 
> by Nullability}}





[jira] [Updated] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread Rahul Patwari (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Patwari updated BEAM-9240:

Description: 
{{If two schemas are created like this:}}

Schema schema1 = Schema.builder().addStringField("col1").build();
 {{Schema schema2 = Schema.builder().addNullableField("col1", 
FieldType.STRING).build();}}

 

{{schema1.typeEquals(schema2) returns "true" even though the schemas differ by 
Nullability}}

  was:
{{If two schemas are reated like this:}}

Schema schema1 = Schema.builder().addStringField("col1").build();
{{Schema schema2 = Schema.builder().addNullableField("col1", 
FieldType.STRING).build();}}

 

{{schema1.typeEquals(schema2) returns "true" even though the schemas differ by 
Nullability}}


> Check for Nullability in typesEqual() method of FieldType class
> ---
>
> Key: BEAM-9240
> URL: https://issues.apache.org/jira/browse/BEAM-9240
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql
>Affects Versions: 2.18.0
>Reporter: Rahul Patwari
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.19.0
>
>
> {{If two schemas are created like this:}}
> Schema schema1 = Schema.builder().addStringField("col1").build();
>  {{Schema schema2 = Schema.builder().addNullableField("col1", 
> FieldType.STRING).build();}}
>  
> {{schema1.typeEquals(schema2) returns "true" even though the schemas differ 
> by Nullability}}





[jira] [Created] (BEAM-9240) Check for Nullability in typesEqual() method of FieldType class

2020-01-31 Thread Rahul Patwari (Jira)
Rahul Patwari created BEAM-9240:
---

 Summary: Check for Nullability in typesEqual() method of FieldType 
class
 Key: BEAM-9240
 URL: https://issues.apache.org/jira/browse/BEAM-9240
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Affects Versions: 2.18.0
Reporter: Rahul Patwari
Assignee: Rahul Patwari
 Fix For: 2.19.0


{{If two schemas are reated like this:}}

Schema schema1 = Schema.builder().addStringField("col1").build();
{{Schema schema2 = Schema.builder().addNullableField("col1", 
FieldType.STRING).build();}}

 

{{schema1.typeEquals(schema2) returns "true" even though the schemas differ by 
Nullability}}





[jira] [Work logged] (BEAM-9231) Add Experimental portability kind and tag related classes

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?focusedWorklogId=380307=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380307
 ]

ASF GitHub Bot logged work on BEAM-9231:


Author: ASF GitHub Bot
Created on: 01/Feb/20 06:24
Start Date: 01/Feb/20 06:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10739: [BEAM-9231] Add 
Experimental portability kind and tag related classes
URL: https://github.com/apache/beam/pull/10739#issuecomment-580999683
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380307)
Time Spent: 20m  (was: 10m)

> Add Experimental portability kind and tag related classes
> -
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Labels: portability
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> For some extra context: I was studying the evolution of our APIs between 
> versions, in particular for beam-sdks-java-core, and noticed that some parts 
> were not properly marked as Experimental, in particular classes (and 
> transforms) for portability and SplittableDoFn.





[jira] [Work logged] (BEAM-9236) Mark missing Schema based classes and methods as Experimental

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9236?focusedWorklogId=380308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380308
 ]

ASF GitHub Bot logged work on BEAM-9236:


Author: ASF GitHub Bot
Created on: 01/Feb/20 06:24
Start Date: 01/Feb/20 06:24
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10741: [BEAM-9236] Mark 
missing Schema based classes and methods as Experimental
URL: https://github.com/apache/beam/pull/10741#issuecomment-580999697
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380308)
Time Spent: 20m  (was: 10m)

> Mark missing Schema based classes and methods as Experimental
> -
>
> Key: BEAM-9236
> URL: https://issues.apache.org/jira/browse/BEAM-9236
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-2546) Create InfluxDbIO

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2546?focusedWorklogId=380302=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380302
 ]

ASF GitHub Bot logged work on BEAM-2546:


Author: ASF GitHub Bot
Created on: 01/Feb/20 04:48
Start Date: 01/Feb/20 04:48
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #10604: BEAM-2546 
Initial Commit
URL: https://github.com/apache/beam/pull/10604#issuecomment-580992086
 
 
   I think @rezarokni might be a great choice for reviewing IO for a time 
series DB.
 



Issue Time Tracking
---

Worklog Id: (was: 380302)
Time Spent: 1h 20m  (was: 1h 10m)

> Create InfluxDbIO
> -
>
> Key: BEAM-2546
> URL: https://issues.apache.org/jira/browse/BEAM-2546
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Assigned] (BEAM-8543) Dataflow streaming timers are not strictly time ordered when set earlier mid-bundle

2020-01-31 Thread Kenneth Knowles (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-8543:
-

Assignee: (was: Kenneth Knowles)

> Dataflow streaming timers are not strictly time ordered when set earlier 
> mid-bundle
> ---
>
> Key: BEAM-8543
> URL: https://issues.apache.org/jira/browse/BEAM-8543
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.13.0
>Reporter: Jan Lukavský
>Priority: Major
>
> Let's suppose we have the following situation:
>  - stateful ParDo with two timers - timerA and timerB
>  - timerA is set for window.maxTimestamp() + 1
>  - timerB is set anywhere between  timerB.timestamp
>  - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
>  - timerB
>  - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the 
> order of timers will be:
>  - timerB (timerB.timestamp)
>  - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
>  - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason for this is that when the input 
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will 
> produce both timerA and timerB. That would be correct, but when timerB sets 
> another timer, the ordering breaks.





[jira] [Commented] (BEAM-8543) Dataflow streaming timers are not strictly time ordered when set earlier mid-bundle

2020-01-31 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027955#comment-17027955
 ] 

Kenneth Knowles commented on BEAM-8543:
---

Yea, that's quite severe. If you have bandwidth and want to take this one, 
please do. I will unassign for now. I will take it up again if I have time to 
direct code on it. We've discussed solutions plenty - "put a priority queue on 
it" - so I think there's not really any long design phase or anything.

> Dataflow streaming timers are not strictly time ordered when set earlier 
> mid-bundle
> ---
>
> Key: BEAM-8543
> URL: https://issues.apache.org/jira/browse/BEAM-8543
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.13.0
>Reporter: Jan Lukavský
>Assignee: Kenneth Knowles
>Priority: Major
>
> Let's suppose we have the following situation:
>  - stateful ParDo with two timers - timerA and timerB
>  - timerA is set for window.maxTimestamp() + 1
>  - timerB is set anywhere between  timerB.timestamp
>  - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
>  - timerB
>  - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the 
> order of timers will be:
>  - timerB (timerB.timestamp)
>  - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
>  - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason for this is that when the input 
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will 
> produce both timerA and timerB. That would be correct, but when timerB sets 
> another timer, the ordering breaks.
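
The "put a priority queue on it" idea mentioned in the comment can be sketched as follows. This is a minimal Python model, not the Dataflow runner's implementation; `fire_timers`, `on_timer` and the integer timestamps are hypothetical. It assumes that timers set while firing go back into the same queue, so an earlier timer set mid-bundle still fires before a later one.

```python
import heapq


def fire_timers(initial_timers, watermark, on_timer):
    """Fire every timer with timestamp <= watermark, in timestamp order.

    initial_timers: iterable of (timestamp, name) tuples.
    on_timer: callback returning the (timestamp, name) timers it sets.
    """
    heap = list(initial_timers)
    heapq.heapify(heap)
    fired = []
    while heap and heap[0][0] <= watermark:
        timestamp, name = heapq.heappop(heap)
        fired.append((name, timestamp))
        # Newly set timers are queued with the pending ones, so the next
        # pop always yields the earliest eligible timer.
        for new_timer in on_timer(name, timestamp):
            heapq.heappush(heap, new_timer)
    return fired


def on_timer(name, timestamp):
    # Hypothetical behaviour: the first timerB firing sets timerB again.
    if name == "timerB" and timestamp == 10:
        return [(11, "timerB")]
    return []


order = fire_timers([(100, "timerA"), (10, "timerB")], 1000, on_timer)
print(order)  # [('timerB', 10), ('timerB', 11), ('timerA', 100)]
```

Extracting all eligible timers up front, as described in the issue, would instead fire timerA before the second timerB.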





[jira] [Work logged] (BEAM-8618) Tear down unused DoFns periodically in Python SDK harness

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8618?focusedWorklogId=380289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380289
 ]

ASF GitHub Bot logged work on BEAM-8618:


Author: ASF GitHub Bot
Created on: 01/Feb/20 02:01
Start Date: 01/Feb/20 02:01
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on pull request #10655: 
[BEAM-8618] Tear down unused DoFns periodically in Python SDK harness.
URL: https://github.com/apache/beam/pull/10655#discussion_r373749786
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -315,18 +322,49 @@ def release(self, instruction_id):
      """
      descriptor_id, processor = self.active_bundle_processors.pop(instruction_id)
      processor.reset()
 +    self.cached_bundle_processors_last_access_time[descriptor_id] = time.time()
      self.cached_bundle_processors[descriptor_id].append(processor)
  
    def shutdown(self):
      """
      Shutdown all ``BundleProcessor``s in the cache.
      """
 +    if self.periodic_shutdown:
 +      self.periodic_shutdown.cancel()
 +      self.periodic_shutdown.join()
 +      self.periodic_shutdown = None
 +
      for instruction_id in self.active_bundle_processors:
        self.active_bundle_processors[instruction_id][1].shutdown()
        del self.active_bundle_processors[instruction_id]
      for cached_bundle_processors in self.cached_bundle_processors.values():
 -      while len(cached_bundle_processors) > 0:
 -        cached_bundle_processors.pop().shutdown()
 +      BundleProcessorCache._shutdown_cached_bundle_processors(
 +          cached_bundle_processors)
 +
 +  def _schedule_periodic_shutdown(self):
 +    def shutdown_inactive_bundle_processors():
 +      for descriptor_id, last_access_time in \
 +          self.cached_bundle_processors_last_access_time.items():
 
 Review comment:
   It seems that it doesn't support parentheses in the for loop?
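
For reference, it is not clear from the comment what "it" refers to (the interpreter or a lint tool). Standard Python itself accepts a parenthesized expression that spans several lines as the iterable of a `for` statement, which avoids the backslash continuation. The sketch below is a hypothetical, simplified version of the loop in the diff; `find_inactive`, `now` and `timeout_secs` are assumptions made for illustration, not names from the pull request.

```python
def find_inactive(last_access_time_by_id, now, timeout_secs):
    """Return the ids whose last access is older than timeout_secs."""
    inactive = []
    # The parentheses let the iterable continue on the next line
    # without a trailing backslash.
    for descriptor_id, last_access_time in (
            last_access_time_by_id.items()):
        if now - last_access_time > timeout_secs:
            inactive.append(descriptor_id)
    return inactive


print(find_inactive({"a": 0.0, "b": 95.0}, now=100.0, timeout_secs=10.0))
# ['a']
```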
 



Issue Time Tracking
---

Worklog Id: (was: 380289)
Time Spent: 3h 50m  (was: 3h 40m)

> Tear down unused DoFns periodically in Python SDK harness
> -
>
> Key: BEAM-8618
> URL: https://issues.apache.org/jira/browse/BEAM-8618
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Per the discussion on the mailing list (details in [1]), the teardown of DoFns 
> should be supported in the portability framework. It happens in two places:
> 1) Upon the control service termination
> 2) Periodically, for unused DoFns
> The aim of this JIRA is to add support for tearing down unused DoFns 
> periodically in the Python SDK harness.
> [1] 
> https://lists.apache.org/thread.html/0c4a4cf83cf2e35c3dfeb9d906e26cd82d3820968ba6f862f91739e4@%3Cdev.beam.apache.org%3E
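
A minimal sketch of the periodic teardown idea described above, assuming a cache keyed by descriptor id that records a last-access time on release. The class name `IdleProcessorCache`, the `max_idle_seconds` parameter and the injectable `clock` are hypothetical and are not taken from `sdk_worker.py`; the periodic scheduling itself (for example a timer thread) is left out.

```python
import threading
import time
from collections import defaultdict


class IdleProcessorCache(object):
  """Hypothetical cache that tears down entries that were idle for too long."""

  def __init__(self, max_idle_seconds, clock=time.time):
    self._lock = threading.Lock()
    self._clock = clock  # injectable so the behaviour can be tested
    self._max_idle = max_idle_seconds
    self._cached = defaultdict(list)  # descriptor_id -> idle processors
    self._last_access = {}  # descriptor_id -> time of the last release

  def release(self, descriptor_id, processor):
    """Returns a processor to the cache and records the access time."""
    with self._lock:
      self._last_access[descriptor_id] = self._clock()
      self._cached[descriptor_id].append(processor)

  def shutdown_inactive(self):
    """Shuts down processors whose descriptor has been idle for too long."""
    now = self._clock()
    with self._lock:
      # The brackets let the comprehension span lines without a backslash.
      expired = [
          descriptor_id
          for descriptor_id, last_access in self._last_access.items()
          if now - last_access > self._max_idle
      ]
      to_shutdown = []
      for descriptor_id in expired:
        to_shutdown.extend(self._cached.pop(descriptor_id, []))
        del self._last_access[descriptor_id]
    # Shut down outside the lock so slow teardown does not block release().
    for processor in to_shutdown:
      processor.shutdown()
    return expired
```

A caller would invoke `shutdown_inactive()` from whatever periodic mechanism it prefers; collecting the expired ids first avoids mutating the dictionary while iterating over it.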



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8831) Python PreCommit Failures: Could not copy file '/some/path/file.egg'

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8831?focusedWorklogId=380285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380285
 ]

ASF GitHub Bot logged work on BEAM-8831:


Author: ASF GitHub Bot
Created on: 01/Feb/20 01:16
Start Date: 01/Feb/20 01:16
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #10230: [BEAM-8831] 
Exclude generated files for Python source copy
URL: https://github.com/apache/beam/pull/10230#issuecomment-580973893
 
 
   This change is still valid. I'll get back to it later.
 



Issue Time Tracking
---

Worklog Id: (was: 380285)
Time Spent: 1h 50m  (was: 1h 40m)

> Python PreCommit Failures: Could not copy file '/some/path/file.egg'
> 
>
> Key: BEAM-8831
> URL: https://issues.apache.org/jira/browse/BEAM-8831
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Luke Cwik
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Several precommits fail due to "Could not copy file '/some/path/file.egg'"
> Examples
> [https://scans.gradle.com/s/ihfmrxr7evslw/failure?openFailures=WzFd=WzZd#top=0]
> [https://scans.gradle.com/s/ihfmrxr7evslw]
> {code}
> > Cannot create directory 
> > '/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Cron/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/.eggs/timeloop-1.0.2-py3.7.egg'
> >  as it already exists, but is not a directory
> {code}





[jira] [Work logged] (BEAM-7961) Add tests for all runner native transforms and some widely used composite transforms to cross-language validates runner test suite

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7961?focusedWorklogId=380284=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380284
 ]

ASF GitHub Bot logged work on BEAM-7961:


Author: ASF GitHub Bot
Created on: 01/Feb/20 01:06
Start Date: 01/Feb/20 01:06
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10051: [BEAM-7961] Add 
tests for all runner native transforms for XLang
URL: https://github.com/apache/beam/pull/10051#issuecomment-580972689
 
 
   Run XVR_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380284)
Time Spent: 23h  (was: 22h 50m)

> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite
> --
>
> Key: BEAM-7961
> URL: https://issues.apache.org/jira/browse/BEAM-7961
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 23h
>  Remaining Estimate: 0h
>
> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite





[jira] [Commented] (BEAM-8543) Dataflow streaming timers are not strictly time ordered when set earlier mid-bundle

2020-01-31 Thread Reuven Lax (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027918#comment-17027918
 ] 

Reuven Lax commented on BEAM-8543:
--

The problem is more general: if an input bundle contains both elements and 
timers, the effect of processing the elements does not affect the timers in the 
bundle. So assume an input bundle contains an element E and a timer T set for 
12pm. While processing E, the user resets the timer T to be at 1pm. Correct 
behavior would be to skip the timer in this bundle, as it's no longer eligible 
to fire. Today, however, it will still fire.
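
The expected behavior can be illustrated with a toy model. This is only a sketch under stated assumptions: `process_bundle`, `on_element`, `on_timer` and the `set_timer` callback are hypothetical names, firing times are plain numbers, and none of this is the Dataflow runner's actual code.

```python
def process_bundle(elements, timers, on_element, on_timer):
  """Processes elements, then fires only the timers that are still eligible.

  timers: dict of timer id -> firing time as delivered in the input bundle.
  on_element(element, set_timer): user code; set_timer(timer_id, fire_time)
    resets a timer while the element is being processed.
  on_timer(timer_id, fire_time): user callback for a firing timer.
  Returns the ids of the timers that actually fired, in time order.
  """
  current = dict(timers)  # latest known firing time per timer

  def set_timer(timer_id, fire_time):
    current[timer_id] = fire_time

  for element in elements:
    on_element(element, set_timer)

  fired = []
  for timer_id, delivered_time in sorted(timers.items(), key=lambda kv: kv[1]):
    if current.get(timer_id) != delivered_time:
      # The timer was reset while processing this bundle, so the delivered
      # firing is stale and is skipped.
      continue
    on_timer(timer_id, delivered_time)
    fired.append(timer_id)
  return fired
```

With an element E that resets T from 12 to 13, `process_bundle(["E"], {"T": 12}, ...)` fires nothing in this bundle; T would be delivered again later at its new time.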

> Dataflow streaming timers are not strictly time ordered when set earlier 
> mid-bundle
> ---
>
> Key: BEAM-8543
> URL: https://issues.apache.org/jira/browse/BEAM-8543
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.13.0
>Reporter: Jan Lukavský
>Assignee: Kenneth Knowles
>Priority: Major
>
> Let's suppose we have the following situation:
>  - stateful ParDo with two timers - timerA and timerB
>  - timerA is set for window.maxTimestamp() + 1
>  - timerB is set anywhere between  timerB.timestamp
>  - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
>  - timerB
>  - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the 
> order of timers will be:
>  - timerB (timerB.timestamp)
>  - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
>  - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason for this is that when the input 
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will 
> produce both timerA and timerB. That would be correct on its own, but when 
> timerB sets another timer, the ordering breaks.
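
One way to keep firings ordered, sketched here as an assumption rather than as the runner's implementation, is to re-examine the earliest pending timer after every firing instead of extracting all eligible timers up front. The function `fire_timers` and its callback signature are hypothetical; timestamps are plain integers.

```python
import heapq


def fire_timers(initial_timers, watermark, on_fire):
  """Fires timers in timestamp order up to and including the watermark.

  initial_timers: iterable of (timestamp, name) tuples.
  on_fire(timestamp, name): returns an iterable of newly set
    (timestamp, name) timers.
  Returns the list of (timestamp, name) firings in the order they happened.
  """
  heap = list(initial_timers)
  heapq.heapify(heap)
  order = []
  # Only the earliest timer is taken each time, so timers set by a callback
  # are merged into the ordering before anything later fires.
  while heap and heap[0][0] <= watermark:
    timestamp, name = heapq.heappop(heap)
    order.append((timestamp, name))
    for new_timer in on_fire(timestamp, name):
      heapq.heappush(heap, new_timer)
  return order
```

In the scenario from the description, timerB at time 10 sets another timerB at 11; with this loop the second timerB fires before timerA at 100.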





[jira] [Commented] (BEAM-9085) Investigate performance difference between Python 2/3 on Dataflow

2020-01-31 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027900#comment-17027900
 ] 

Valentyn Tymofieiev commented on BEAM-9085:
---

It appears that the increase in time is likely caused by either generating the 
synthetic input, or reading the synthetic input.

The problem can be reproduced on the Direct runner as well, so I don't think it's 
Dataflow-specific.

The following command shows a difference of 14 (Py2) vs. 40 (Py3) seconds on my machine.

{noformat}
python setup.py nosetests \
--test-pipeline-options="
--iterations=10
--number_of_counters=1
--number_of_counter_operations=1
--project=big-query-project
--publish_to_big_query=false
--metrics_dataset=python_load_tests
--metrics_table=pardo
--input_options='{
\"num_records\": 20,
\"key_size\": 10,
\"value_size\":90,
\"bundle_size_distribution_type\": \"const\",
\"bundle_size_distribution_param\": 1,
\"force_initial_num_bundles\": 0
}'" \
--tests apache_beam.testing.load_tests.pardo_test
{noformat}

> Investigate performance difference between Python 2/3 on Dataflow
> -
>
> Key: BEAM-9085
> URL: https://issues.apache.org/jira/browse/BEAM-9085
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Kamil Wasilewski
>Assignee: Valentyn Tymofieiev
>Priority: Major
>
> Tests show that the performance of core Beam operations in Python 3.x on 
> Dataflow can be a few times slower than in Python 2.7. We should investigate 
> the cause of the problem.
> Currently, we have one ParDo test that is run both in Py3 and Py2 [1]. A 
> dashboard with runtime results can be found here [2].
> [1] sdks/python/apache_beam/testing/load_tests/pardo_test.py
> [2] https://apache-beam-testing.appspot.com/explore?dashboard=5678187241537536





[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=380267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380267
 ]

ASF GitHub Bot logged work on BEAM-9008:


Author: ASF GitHub Bot
Created on: 01/Feb/20 00:05
Start Date: 01/Feb/20 00:05
Worklog Time Spent: 10m 
  Work Description: vmarquez commented on issue #10546: [BEAM-9008] Add 
CassandraIO readAll method
URL: https://github.com/apache/beam/pull/10546#issuecomment-580962843
 
 
   > Couldn't the user in the specific case you mention of many Reads generate 
queries for RingRanges that match more 'regions' and achieve the same result?
   
I don't think so. A user might be querying specific partition keys (only 
one key that returns multiple results) that are not contiguous, some of which 
have data and many of which may have no data. 
 



Issue Time Tracking
---

Worklog Id: (was: 380267)
Time Spent: 4h 50m  (was: 4h 40m)

> Add readAll() method to CassandraIO
> ---
>
> Key: BEAM-9008
> URL: https://issues.apache.org/jira/browse/BEAM-9008
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-cassandra
>Affects Versions: 2.16.0
>Reporter: vincent marquez
>Assignee: vincent marquez
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> When querying a large Cassandra database, it's often *much* more useful to 
> programmatically generate the queries that need to be run rather than reading 
> all partitions and attempting some filtering.  
> As an example:
> {code:java}
> public class Event { 
>@PartitionKey(0) public UUID accountId;
>@PartitionKey(1)public String yearMonthDay; 
>@ClusteringKey public UUID eventId;  
>//other data...
> }{code}
> If there is ten years worth of data, you may want to only query one year's 
> worth.  Here each token range would represent one 'token' but all events for 
> the day. 
> {code:java}
> Set accounts = getRelevantAccounts();
> Set dateRange = generateDateRange("2018-01-01", "2019-01-01");
> PCollection tokens = generateTokens(accounts, dateRange); 
> {code}
>  
>  I propose an additional _readAll()_ PTransform that can take a PCollection 
> of token ranges and can return a PCollection of what the query would 
> return. 
> *Question: How much code should be in common between both methods?* 
> Currently the read connector already groups all partitions into a List of 
> Token Ranges, so it would be simple to refactor the current read() based 
> method to a 'ParDo' based one and have them both share the same function.  
> Reasons against sharing code between read and readAll
>  * Not having the read based method return a BoundedSource connector would 
> mean losing the ability to know the size of the data returned
>  * Currently the CassandraReader executes all the grouped TokenRange queries 
> *asynchronously* which is (maybe?) fine when all that's happening is 
> splitting up all the partition ranges but terrible for executing potentially 
> millions of queries. 
>  Reasons _for_ sharing code would be a simplified code base and that both of 
> the above issues would most likely have a negligible performance impact. 
>  
>  
>  
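
The query-generation step in the description can be sketched in a few lines. This is an illustration only: the helper names are hypothetical, the date range is assumed to be half-open (the end day is excluded), and the result is a list of (account, day) partition-key pairs rather than real Cassandra token ranges.

```python
from datetime import date, timedelta
from itertools import product


def generate_date_range(start, end):
  """Yields ISO-formatted days in the half-open range [start, end)."""
  day = date.fromisoformat(start)
  stop = date.fromisoformat(end)
  while day < stop:
    yield day.isoformat()
    day += timedelta(days=1)


def generate_partition_keys(accounts, start, end):
  """Returns one (account_id, year_month_day) pair per account and day."""
  days = list(generate_date_range(start, end))
  # Sorting the accounts makes the output deterministic for a set input.
  return list(product(sorted(accounts), days))
```

Each pair corresponds to one partition of the `Event` table above, so a year of data for a handful of accounts becomes a few thousand targeted queries instead of a full scan followed by filtering.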





[jira] [Work logged] (BEAM-2546) Create InfluxDbIO

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2546?focusedWorklogId=380260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380260
 ]

ASF GitHub Bot logged work on BEAM-2546:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:52
Start Date: 31/Jan/20 23:52
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #10604: BEAM-2546 Initial 
Commit
URL: https://github.com/apache/beam/pull/10604#issuecomment-580960072
 
 
   cc: @kennknowles @chamikaramj -- for reviewer suggestions.
 



Issue Time Tracking
---

Worklog Id: (was: 380260)
Time Spent: 1h 10m  (was: 1h)

> Create InfluxDbIO
> -
>
> Key: BEAM-2546
> URL: https://issues.apache.org/jira/browse/BEAM-2546
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9008) Add readAll() method to CassandraIO

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9008?focusedWorklogId=380259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380259
 ]

ASF GitHub Bot logged work on BEAM-9008:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:52
Start Date: 31/Jan/20 23:52
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #10546: [BEAM-9008] Add 
CassandraIO readAll method
URL: https://github.com/apache/beam/pull/10546#issuecomment-580960037
 
 
   Cassandra is extremely slow at opening new connections. However, the average case 
would be that for each `Read` there would be many `T` outputs, which in 
principle would amortize the slow connection time. Couldn't the user in the 
specific case you mention of many Reads generate queries for RingRanges that 
match more 'regions' and achieve the same result?
   
   If this really ends up being a problem, one eventual workaround would be 
to batch requests and create some sort of pool of connections, but hopefully 
this won't be needed.
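
The batching workaround could look roughly like the sketch below, which opens one connection per batch of queries so the connection cost is amortized. The names `batched`, `read_all`, `connect` and `execute` are hypothetical stand-ins and not part of CassandraIO or any Cassandra driver; a connection pool would be a further step on top of this.

```python
from itertools import islice


def batched(items, batch_size):
  """Yields lists of up to batch_size consecutive items."""
  if batch_size < 1:
    raise ValueError("batch_size must be at least 1")
  iterator = iter(items)
  while True:
    batch = list(islice(iterator, batch_size))
    if not batch:
      return
    yield batch


def read_all(queries, batch_size, connect, execute):
  """Runs all queries, opening one connection per batch instead of per query.

  connect(): returns an object with a close() method (assumed interface).
  execute(connection, query): returns an iterable of result rows.
  """
  results = []
  for batch in batched(queries, batch_size):
    connection = connect()
    try:
      for query in batch:
        results.extend(execute(connection, query))
    finally:
      connection.close()
  return results
```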
 



Issue Time Tracking
---

Worklog Id: (was: 380259)
Time Spent: 4h 40m  (was: 4.5h)

> Add readAll() method to CassandraIO
> ---
>
> Key: BEAM-9008
> URL: https://issues.apache.org/jira/browse/BEAM-9008
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-cassandra
>Affects Versions: 2.16.0
>Reporter: vincent marquez
>Assignee: vincent marquez
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> When querying a large Cassandra database, it's often *much* more useful to 
> programmatically generate the queries that need to be run rather than reading 
> all partitions and attempting some filtering.  
> As an example:
> {code:java}
> public class Event { 
>@PartitionKey(0) public UUID accountId;
>@PartitionKey(1)public String yearMonthDay; 
>@ClusteringKey public UUID eventId;  
>//other data...
> }{code}
> If there is ten years worth of data, you may want to only query one year's 
> worth.  Here each token range would represent one 'token' but all events for 
> the day. 
> {code:java}
> Set accounts = getRelevantAccounts();
> Set dateRange = generateDateRange("2018-01-01", "2019-01-01");
> PCollection tokens = generateTokens(accounts, dateRange); 
> {code}
>  
>  I propose an additional _readAll()_ PTransform that can take a PCollection 
> of token ranges and can return a PCollection of what the query would 
> return. 
> *Question: How much code should be in common between both methods?* 
> Currently the read connector already groups all partitions into a List of 
> Token Ranges, so it would be simple to refactor the current read() based 
> method to a 'ParDo' based one and have them both share the same function.  
> Reasons against sharing code between read and readAll
>  * Not having the read based method return a BoundedSource connector would 
> mean losing the ability to know the size of the data returned
>  * Currently the CassandraReader executes all the grouped TokenRange queries 
> *asynchronously* which is (maybe?) fine when all that's happening is 
> splitting up all the partition ranges but terrible for executing potentially 
> millions of queries. 
>  Reasons _for_ sharing code would be a simplified code base and that both of 
> the above issues would most likely have a negligible performance impact. 
>  
>  
>  





[jira] [Created] (BEAM-9239) Dependency conflict with Spark using aws io

2020-01-31 Thread David McIntosh (Jira)
David McIntosh created BEAM-9239:


 Summary: Dependency conflict with Spark using aws io
 Key: BEAM-9239
 URL: https://issues.apache.org/jira/browse/BEAM-9239
 Project: Beam
  Issue Type: Bug
  Components: io-java-aws, runner-spark
Affects Versions: 2.17.0
Reporter: David McIntosh


Starting with Beam 2.17.0, I get this error in the Spark 2.4.4 driver when aws 
io is also used:

{noformat}
java.lang.NoSuchMethodError: 
com.fasterxml.jackson.databind.jsontype.TypeSerializer.typeId(Ljava/lang/Object;Lcom/fasterxml/jackson/core/JsonToken;)Lcom/fasterxml/jackson/core/type/WritableTypeId;
at 
org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:163)
at 
org.apache.beam.sdk.io.aws.options.AwsModule$AWSCredentialsProviderSerializer.serializeWithType(AwsModule.java:134)
at 
com.fasterxml.jackson.databind.ser.impl.TypeWrappedSerializer.serialize(TypeWrappedSerializer.java:32)
at 
com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
at 
com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
at 
com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
at 
org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:721)
at 
org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:647)
at 
org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:635)
at 
com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:130)
at 
com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
at 
com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
at 
org.apache.beam.runners.core.construction.SerializablePipelineOptions.serializeToJson(SerializablePipelineOptions.java:67)
at 
org.apache.beam.runners.core.construction.SerializablePipelineOptions.<init>(SerializablePipelineOptions.java:43)
at 
org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:71)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:215)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:90)
{noformat}

The cause seems to be that the Spark driver environment uses an older version 
of Jackson. I tried to update Jackson on the Spark cluster, but that led to 
several other errors. 

The change that started causing this was:
https://github.com/apache/beam/commit/b68d70a47b68ad84efcd9405c1799002739bd116

After reverting that change I was able to successfully run my job.





[jira] [Updated] (BEAM-9231) Add Experimental portability kind and tag related classes

2020-01-31 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9231:
---
Labels: portability  (was: )

> Add Experimental portability kind and tag related classes
> -
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Labels: portability
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For some extra context I was studying the evolution of our APIs between 
> versions in particular for beam-sdks-java-core and noticed that some parts 
> were not well classified as Experimental in particular classes (and 
> transforms)  for portability and SplittableDoFn.





[jira] [Work logged] (BEAM-9236) Mark missing Schema based classes and methods as Experimental

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9236?focusedWorklogId=380257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380257
 ]

ASF GitHub Bot logged work on BEAM-9236:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:21
Start Date: 31/Jan/20 23:21
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10741: [BEAM-9236] 
Mark missing Schema based classes and methods as Experimental
URL: https://github.com/apache/beam/pull/10741
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 380257)
Remaining Estimate: 0h
Time Spent: 10m

> Mark missing Schema based classes and methods as Experimental
> -
>
> Key: BEAM-9236
> URL: https://issues.apache.org/jira/browse/BEAM-9236
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8979) protoc-gen-mypy: program not found or is not executable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8979?focusedWorklogId=380254=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380254
 ]

ASF GitHub Bot logged work on BEAM-8979:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:12
Start Date: 31/Jan/20 23:12
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10734: [BEAM-8979] reintroduce 
mypy-protobuf stub generation
URL: https://github.com/apache/beam/pull/10734#issuecomment-580951255
 
 
   > @udim I can't trigger the jenkins jobs. Also, it's unclear to me how to 
run this job locally via gradle. Can you give me some advice?
   > 
   > Also, do you have any idea what's different about the way these two jobs 
are run that would cause their virtualenv's bin dir to not be on the search 
PATH?
   
   I ran:
   ```
   $ git fetch origin pull/10734/head:pr10734
   $ git checkout pr10734
   $ ./gradlew :sdks:python:sdist
   ...
   > Task :sdks:python:sdist
   setup.py:244: UserWarning: You are using Apache Beam with Python 2. New 
releases of Apache Beam will soon support Python 3 only.
 'You are using Apache Beam with Python 2. '
   
/usr/local/google/home/ehudm/src/beam/build/gradleenv/192237/local/lib/python2.7/site-packages/setuptools/dist.py:476:
 UserWarning: Normalizing '2.20.0.dev' to '2.20.0.dev0'
 normalized_version,
   No handlers could be found for logger "gen_protos"
   Traceback (most recent call last):
 File "setup.py", line 308, in <module>
   'mypy': generate_protos_first(mypy),
 File 
"/usr/local/google/home/ehudm/src/beam/build/gradleenv/192237/local/lib/python2.7/site-packages/setuptools/__init__.py",
 line 145, in setup
   return distutils.core.setup(**attrs)
 File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
   dist.run_commands()
 File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
   self.run_command(cmd)
 File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
   cmd_obj.run()
 File 
"/usr/local/google/home/ehudm/src/beam/build/gradleenv/192237/local/lib/python2.7/site-packages/setuptools/command/sdist.py",
 line 44, in run
   self.run_command('egg_info')
 File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
   self.distribution.run_command(command)
 File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
   cmd_obj.run()
 File "setup.py", line 232, in run
   gen_protos.generate_proto_files()
 File "/usr/local/google/home/ehudm/src/beam/sdks/python/gen_protos.py", 
line 314, in generate_proto_files
   protoc_gen_mypy = _find_protoc_gen_mypy()
 File "/usr/local/google/home/ehudm/src/beam/sdks/python/gen_protos.py", 
line 233, in _find_protoc_gen_mypy
   (fname, ', '.join(search_paths)))
   RuntimeError: Could not find protoc-gen-mypy in 
/usr/local/google/home/ehudm/src/beam/build/gradleenv/192237/bin, 
/usr/local/google/home/ehudm/src/beam/build/gradleenv/192237/bin, 
/usr/local/google/home/ehudm/.pyenv/plugins/pyenv-virtualenv/shims, 
/usr/local/google/home/ehudm/.pyenv/shims, 
/usr/local/google/home/ehudm/.pyenv/bin, /usr/lib/google-golang/bin, 
/usr/local/buildtools/java/jdk/bin, /usr/local/sbin, /usr/local/bin, /usr/sbin, 
/usr/bin, /sbin, /bin, /google/data/ro/teams/iblaze, 
/google/data/ro/projects/devtools/rebaser, /usr/local/google/home/ehudm/bin
   ```
   Scan of subsequent run:
   https://gradle.com/s/nobazytmko36c
   
   It fails at the sdist task.
   Perhaps your shell already has this binary installed, which is why it works 
for you.
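
For illustration, a lookup that does not depend on the shell's PATH alone could check the running interpreter's own bin directory first. This is a sketch under assumptions, not the code in `gen_protos.py`: `find_plugin` and `extra_dirs` are hypothetical names, and `shutil.which` requires Python 3.

```python
import os
import shutil
import sys


def find_plugin(name, extra_dirs=()):
  """Returns the path of an executable called name.

  Search order: extra_dirs, the directory of the running interpreter (the
  virtualenv's bin dir when run inside one), then the entries of PATH.
  """
  search_dirs = list(extra_dirs)
  search_dirs.append(os.path.dirname(os.path.abspath(sys.executable)))
  search_dirs.extend(
      p for p in os.environ.get("PATH", "").split(os.pathsep) if p)
  found = shutil.which(name, path=os.pathsep.join(search_dirs))
  if found is None:
    raise RuntimeError(
        "Could not find %s in %s" % (name, ", ".join(search_dirs)))
  return found
```

Calling `find_plugin("protoc-gen-mypy")` from the interpreter that installed mypy-protobuf would then succeed even when the virtualenv's bin directory was never added to PATH.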
 



Issue Time Tracking
---

Worklog Id: (was: 380254)
Time Spent: 3h 50m  (was: 3h 40m)

> protoc-gen-mypy: program not found or is not executable
> ---
>
> Key: BEAM-8979
> URL: https://issues.apache.org/jira/browse/BEAM-8979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Kamil Wasilewski
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> In some tests, `:sdks:python:sdist:` task fails due to problems in finding 
> protoc-gen-mypy. The following tests are affected (there might be more):
>  * 
> [https://builds.apache.org/job/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/]
>  * 
> [https://builds.apache.org/job/beam_BiqQueryIO_Write_Performance_Test_Python_Batch/
>  
> 

[jira] [Work logged] (BEAM-8979) protoc-gen-mypy: program not found or is not executable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8979?focusedWorklogId=380253=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380253
 ]

ASF GitHub Bot logged work on BEAM-8979:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:09
Start Date: 31/Jan/20 23:09
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10734: [BEAM-8979] reintroduce 
mypy-protobuf stub generation
URL: https://github.com/apache/beam/pull/10734#issuecomment-580950566
 
 
   > @udim thanks for the heads up. My guess is that the mypy-protoc executable 
is not getting on PATH.
   > 
   > The tests did not show up in the PRs until you requested them, but I can 
see them here now. Can I run the test phrases myself now?
   
   I think the phrases are limited to Apache committers sadly. Not sure if this 
requirement will ever go away.
 



Issue Time Tracking
---

Worklog Id: (was: 380253)
Time Spent: 3h 40m  (was: 3.5h)

> protoc-gen-mypy: program not found or is not executable
> ---
>
> Key: BEAM-8979
> URL: https://issues.apache.org/jira/browse/BEAM-8979
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Kamil Wasilewski
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> In some tests, `:sdks:python:sdist:` task fails due to problems in finding 
> protoc-gen-mypy. The following tests are affected (there might be more):
>  * 
> [https://builds.apache.org/job/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/]
>  * 
> [https://builds.apache.org/job/beam_BiqQueryIO_Write_Performance_Test_Python_Batch/
>  
> |https://builds.apache.org/job/beam_BiqQueryIO_Write_Performance_Test_Python_Batch/]
> Relevant logs:
> {code:java}
> 10:46:32 > Task :sdks:python:sdist FAILED
> 10:46:32 Requirement already satisfied: mypy-protobuf==1.12 in 
> /home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/src/build/gradleenv/192237/lib/python3.7/site-packages
>  (1.12)
> 10:46:32 beam_fn_api.proto: warning: Import google/protobuf/descriptor.proto 
> but not used.
> 10:46:32 beam_fn_api.proto: warning: Import google/protobuf/wrappers.proto 
> but not used.
> 10:46:32 protoc-gen-mypy: program not found or is not executable
> 10:46:32 --mypy_out: protoc-gen-mypy: Plugin failed with status code 1.
> 10:46:32 
> /home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/src/build/gradleenv/192237/lib/python3.7/site-packages/setuptools/dist.py:476:
>  UserWarning: Normalizing '2.19.0.dev' to '2.19.0.dev0'
> 10:46:32   normalized_version,
> 10:46:32 Traceback (most recent call last):
> 10:46:32   File "setup.py", line 295, in <module>
> 10:46:32 'mypy': generate_protos_first(mypy),
> 10:46:32   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/src/build/gradleenv/192237/lib/python3.7/site-packages/setuptools/__init__.py",
>  line 145, in setup
> 10:46:32 return distutils.core.setup(**attrs)
> 10:46:32   File "/usr/lib/python3.7/distutils/core.py", line 148, in setup
> 10:46:32 dist.run_commands()
> 10:46:32   File "/usr/lib/python3.7/distutils/dist.py", line 966, in 
> run_commands
> 10:46:32 self.run_command(cmd)
> 10:46:32   File "/usr/lib/python3.7/distutils/dist.py", line 985, in 
> run_command
> 10:46:32 cmd_obj.run()
> 10:46:32   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/src/build/gradleenv/192237/lib/python3.7/site-packages/setuptools/command/sdist.py",
>  line 44, in run
> 10:46:32 self.run_command('egg_info')
> 10:46:32   File "/usr/lib/python3.7/distutils/cmd.py", line 313, in 
> run_command
> 10:46:32 self.distribution.run_command(command)
> 10:46:32   File "/usr/lib/python3.7/distutils/dist.py", line 985, in 
> run_command
> 10:46:32 cmd_obj.run()
> 10:46:32   File "setup.py", line 220, in run
> 10:46:32 gen_protos.generate_proto_files(log=log)
> 10:46:32   File 
> "/home/jenkins/jenkins-slave/workspace/beam_LoadTests_Python_37_ParDo_Dataflow_Batch_PR/src/sdks/python/gen_protos.py",
>  line 144, in generate_proto_files
> 10:46:32 '%s' % ret_code)
> 10:46:32 RuntimeError: Protoc returned non-zero status (see logs for 
> details): 1
> {code}
>  
> This is what I have tried so far to resolve this (without being successful):
>  * Including 

[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380251
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 31/Jan/20 23:00
Start Date: 31/Jan/20 23:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10702: [BEAM-5605] 
Migrate splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702#discussion_r373722630
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##
 @@ -871,13 +960,20 @@ public Duration getAllowedTimestampSkew() {
* Annotation for the method that creates a new {@link RestrictionTracker} 
for the restriction of
* a <a href="https://s.apache.org/splittable-do-fn">splittable</a> {@link 
DoFn}.
*
-   * Signature: {@code MyRestrictionTracker newTracker(RestrictionT 
restriction, );} where {@code MyRestrictionTracker} must be a subtype of 
{@code
-   * RestrictionTracker}.
+   * Signature: {@code MyRestrictionTracker newTracker();}
*
-   * The optional arguments are allowed to be:
+   * This method must satisfy the following constraints:
*
* 
+   *   The return type must be a subtype of {@code 
RestrictionTracker}.
+   *   It is suggested to use as narrow of a return type definition as 
possible (for example
+   *   prefer to use a square type over a shape type as a square is a type 
of a shape).
+   *   If one of its arguments is tagged with the {@link Element} 
annotation, then it will be
 
 Review comment:
   No.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 380251)
Time Spent: 11.5h  (was: 11h 20m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9167) Reduce overhead of Go SDK side metrics

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9167?focusedWorklogId=380247=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380247
 ]

ASF GitHub Bot logged work on BEAM-9167:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:53
Start Date: 31/Jan/20 22:53
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #10716: [BEAM-9167] 
Metrics extraction refactoring.
URL: https://github.com/apache/beam/pull/10716
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 380247)
Time Spent: 3h 20m  (was: 3h 10m)

> Reduce overhead of Go SDK side metrics
> --
>
> Key: BEAM-9167
> URL: https://issues.apache.org/jira/browse/BEAM-9167
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Locking overhead due to the global store and local caches of SDK counter data 
> can dominate certain workloads, which means we can do better.
> Instead of having a global store of metrics data to extract counters, we 
> should use per ptransform (or per bundle) counter sets, which would avoid 
> requiring locking per counter operation. The main detriment compared to the 
> current implementation is that a user would need to add their own locking if 
> they were to spawn multiple goroutines to process a Bundle's work in a DoFn.
> Given that self-multithreaded DoFns aren't recommended/safe in Java, are largely 
> impossible in Python, and the other constructs provided by the Beam Go SDK (like 
> Iterators and Emitters) are not thread-safe, this is a small concern, 
> provided the documentation is clear on this.
> Removing the locking and switching to atomic ops reduces the overhead 
> significantly in example jobs and in the benchmarks.
> A second part of this change should be to move the exec package to manage 
> its own per-bundle state, rather than relying on a global datastore to 
> extract the per-bundle, per-ptransform values.
> Related: https://issues.apache.org/jira/browse/BEAM-6541 





[jira] [Work logged] (BEAM-9175) Introduce an autoformatting tool to Python SDK

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9175?focusedWorklogId=380246=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380246
 ]

ASF GitHub Bot logged work on BEAM-9175:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:52
Start Date: 31/Jan/20 22:52
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10684: [BEAM-9175] Introduce 
an autoformatting tool to Python SDK
URL: https://github.com/apache/beam/pull/10684#issuecomment-580946013
 
 
   LGTM
 



Issue Time Tracking
---

Worklog Id: (was: 380246)
Time Spent: 6h 20m  (was: 6h 10m)

> Introduce an autoformatting tool to Python SDK
> --
>
> Key: BEAM-9175
> URL: https://issues.apache.org/jira/browse/BEAM-9175
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Michał Walenia
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> It seems there are three main options:
>  * black - very simple, but not configurable at all (except for line length), 
> would drastically change code style
>  * yapf - more options to tweak, can omit parts of code
>  * autopep8 - more similar to spotless - only touches code that breaks 
> formatting guidelines, can use pycodestyle and flake8 as configuration
>  The rigidity of Black makes it unusable for Beam.





[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380239
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:39
Start Date: 31/Jan/20 22:39
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #10702: [BEAM-5605] 
Migrate splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702#discussion_r372145661
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
 ##
 @@ -871,13 +960,20 @@ public Duration getAllowedTimestampSkew() {
* Annotation for the method that creates a new {@link RestrictionTracker} 
for the restriction of
* a <a href="https://s.apache.org/splittable-do-fn">splittable</a> {@link 
DoFn}.
*
-   * Signature: {@code MyRestrictionTracker newTracker(RestrictionT 
restriction, );} where {@code MyRestrictionTracker} must be a subtype of 
{@code
-   * RestrictionTracker}.
+   * Signature: {@code MyRestrictionTracker newTracker();}
*
-   * The optional arguments are allowed to be:
+   * This method must satisfy the following constraints:
*
* 
+   *   The return type must be a subtype of {@code 
RestrictionTracker}.
+   *   It is suggested to use as narrow of a return type definition as 
possible (for example
+   *   prefer to use a square type over a shape type as a square is a type 
of a shape).
+   *   If one of its arguments is tagged with the {@link Element} 
annotation, then it will be
 
 Review comment:
   Does creating tracker require passing in `element`?
 



Issue Time Tracking
---

Worklog Id: (was: 380239)
Time Spent: 11h 20m  (was: 11h 10m)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch





[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380237=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380237
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:37
Start Date: 31/Jan/20 22:37
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 380237)
Time Spent: 4h 40m  (was: 4.5h)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> At the moment, a Custom Source is split and serialized sequentially. With 
> many splits, processing all of them takes a long time. 
>  
> Example: it takes 2s to calculate the size of and serialize a CassandraSource 
> due to connection setup and teardown. With 100+ splits, that adds up to more 
> than 200s spent on a single worker. 
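
The associated PR title mentions caching the size of the table. A minimal sketch of that idea, assuming the expensive step is a size lookup that opens and closes a connection: the lookup runs once and every split reuses the cached result. The tableSizer type and its fetch function are hypothetical and are not taken from CassandraIO.

```go
package main

import (
	"fmt"
	"sync"
)

// tableSizer is a hypothetical wrapper that caches an expensive size lookup.
type tableSizer struct {
	once  sync.Once
	size  int64
	err   error
	calls int                   // number of real lookups, for illustration
	fetch func() (int64, error) // stand-in for connect + query + teardown
}

// Size performs the lookup on the first call and returns the cached
// result (including any error) on every later call.
func (t *tableSizer) Size() (int64, error) {
	t.once.Do(func() {
		t.calls++
		t.size, t.err = t.fetch()
	})
	return t.size, t.err
}

func main() {
	sizer := &tableSizer{fetch: func() (int64, error) {
		return 1 << 20, nil // pretend the table is 1 MiB
	}}
	const splits = 100
	var perSplit int64
	for i := 0; i < splits; i++ {
		total, err := sizer.Size()
		if err != nil {
			fmt.Println("size lookup failed:", err)
			return
		}
		perSplit = total / splits
	}
	fmt.Println("lookups:", sizer.calls, "bytes per split:", perSplit)
}
```

With a 2s lookup, this turns roughly 100 x 2s of sequential work into a single 2s lookup, at the cost of a size estimate that can go stale if the table changes during splitting.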





[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380236
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:36
Start Date: 31/Jan/20 22:36
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701#issuecomment-580941492
 
 
   All tests passed. I'll go ahead to merge this PR.
   Thanks for your contribution!
 



Issue Time Tracking
---

Worklog Id: (was: 380236)
Time Spent: 4.5h  (was: 4h 20m)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> At the moment, a Custom Source is split and serialized sequentially. With 
> many splits, processing all of them takes a long time. 
>  
> Example: it takes 2s to calculate the size of and serialize a CassandraSource 
> due to connection setup and teardown. With 100+ splits, that adds up to more 
> than 200s spent on a single worker. 





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380234
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:34
Start Date: 31/Jan/20 22:34
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10735: [BEAM-8280][BEAM-8629] 
Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#issuecomment-580940695
 
 
   Postcommit results (on previous commit) seem okay. The failure in 
Python37_PC seems transient.
 



Issue Time Tracking
---

Worklog Id: (was: 380234)
Time Spent: 1h 50m  (was: 1h 40m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380232=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380232
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:32
Start Date: 31/Jan/20 22:32
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #10735: 
[BEAM-8280][BEAM-8629] Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#discussion_r373714584
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -283,16 +301,19 @@ def from_callable(fn):
   output_args.append(typehints.Any)
 
 return IOTypeHints(input_types=(tuple(input_args), input_kwargs),
-   output_types=(tuple(output_args), {}))
+   output_types=(tuple(output_args), {}),
+   origin=cls._make_traceback(None))
 
-  def set_input_types(self, *args, **kwargs):
-self.input_types = args, kwargs
+  def with_input_types(self, *args, **kwargs):  # type: (...) -> IOTypeHints
 
 Review comment:
   ```suggestion
 def with_input_types(self, *args, **kwargs):
   # type: (...) -> IOTypeHints
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 380232)
Time Spent: 1h 40m  (was: 1.5h)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380231
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:31
Start Date: 31/Jan/20 22:31
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #10735: 
[BEAM-8280][BEAM-8629] Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#discussion_r373714477
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -283,16 +301,19 @@ def from_callable(fn):
   output_args.append(typehints.Any)
 
 return IOTypeHints(input_types=(tuple(input_args), input_kwargs),
-   output_types=(tuple(output_args), {}))
+   output_types=(tuple(output_args), {}),
+   origin=cls._make_traceback(None))
 
-  def set_input_types(self, *args, **kwargs):
-self.input_types = args, kwargs
+  def with_input_types(self, *args, **kwargs):  # type: (...) -> IOTypeHints
+return self._replace(input_types=(args, kwargs),
+ origin=self._make_traceback(self))
 
-  def set_output_types(self, *args, **kwargs):
-self.output_types = args, kwargs
+  def with_output_types(self, *args, **kwargs):  # type: (...) -> IOTypeHints
 
 Review comment:
   ```suggestion
 def with_output_types(self, *args, **kwargs):
   # type: (...) -> IOTypeHints
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 380231)
Time Spent: 1.5h  (was: 1h 20m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-8550) @RequiresTimeSortedInput DoFn annotation

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8550?focusedWorklogId=380230=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380230
 ]

ASF GitHub Bot logged work on BEAM-8550:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:27
Start Date: 31/Jan/20 22:27
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #8774: [BEAM-8550] Requires 
time sorted input
URL: https://github.com/apache/beam/pull/8774#issuecomment-580938807
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380230)
Time Spent: 10h 40m  (was: 10.5h)

> @RequiresTimeSortedInput DoFn annotation
> 
>
> Key: BEAM-8550
> URL: https://issues.apache.org/jira/browse/BEAM-8550
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model, sdk-java-core
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Implement new annotation {{@RequiresTimeSortedInput}} for stateful DoFn as 
> described in [design 
> document|https://docs.google.com/document/d/1ObLVUFsf1NcG8ZuIZE4aVy2RYKx2FfyMhkZYWPnI9-c/edit?usp=sharing].
>  First implementation will assume that:
>   - time is defined by timestamp in associated WindowedValue
>   - allowed lateness is explicitly zero and all late elements are dropped 
> (due to being out of order)
> The above properties are considered temporary and will be resolved by 
> subsequent extensions (backwards compatible).
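
A minimal sketch of the two assumptions listed above (ordering by element timestamp, zero allowed lateness), not the Beam implementation: elements are buffered, released in timestamp order once the watermark passes them, and dropped if they arrive behind the watermark. The sortedBuffer type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// element pairs a value with its event timestamp.
type element struct {
	ts    int64
	value string
}

// sortedBuffer is a hypothetical buffer that releases elements in
// timestamp order as the watermark advances.
type sortedBuffer struct {
	watermark int64
	pending   []element
}

// Add buffers e, or drops it when it is already behind the watermark
// (allowed lateness is zero). It reports whether e was kept.
func (b *sortedBuffer) Add(e element) bool {
	if e.ts < b.watermark {
		return false
	}
	b.pending = append(b.pending, e)
	return true
}

// Advance moves the watermark forward to wm (it never moves backwards)
// and returns every buffered element with ts < watermark, sorted by ts.
func (b *sortedBuffer) Advance(wm int64) []element {
	if wm > b.watermark {
		b.watermark = wm
	}
	sort.SliceStable(b.pending, func(i, j int) bool {
		return b.pending[i].ts < b.pending[j].ts
	})
	n := sort.Search(len(b.pending), func(i int) bool {
		return b.pending[i].ts >= b.watermark
	})
	ready := append([]element(nil), b.pending[:n]...)
	b.pending = b.pending[n:]
	return ready
}

func main() {
	b := &sortedBuffer{}
	b.Add(element{ts: 3, value: "c"})
	b.Add(element{ts: 1, value: "a"})
	b.Add(element{ts: 2, value: "b"})
	for _, e := range b.Advance(3) {
		fmt.Println(e.ts, e.value)
	}
	kept := b.Add(element{ts: 2, value: "late"})
	fmt.Println("late element kept:", kept)
}
```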





[jira] [Work logged] (BEAM-9167) Reduce overhead of Go SDK side metrics

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9167?focusedWorklogId=380229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380229
 ]

ASF GitHub Bot logged work on BEAM-9167:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:26
Start Date: 31/Jan/20 22:26
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #10716: [BEAM-9167] Metrics 
extraction refactoring.
URL: https://github.com/apache/beam/pull/10716#issuecomment-580938268
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 380229)
Time Spent: 3h 10m  (was: 3h)

> Reduce overhead of Go SDK side metrics
> --
>
> Key: BEAM-9167
> URL: https://issues.apache.org/jira/browse/BEAM-9167
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Locking overhead due to the global store and local caches of SDK counter data 
> can dominate certain workloads, which means we can do better.
> Instead of having a global store of metrics data to extract counters, we 
> should use per ptransform (or per bundle) counter sets, which would avoid 
> requiring locking per counter operation. The main detriment compared to the 
> current implementation is that a user would need to add their own locking if 
> they were to spawn multiple goroutines to process a Bundle's work in a DoFn.
> Given that self-multithreaded DoFns aren't recommended/safe in Java, are largely 
> impossible in Python, and the other constructs provided by the Beam Go SDK (like 
> Iterators and Emitters) are not thread-safe, this is a small concern, 
> provided the documentation is clear on this.
> Removing the locking and switching to atomic ops reduces the overhead 
> significantly in example jobs and in the benchmarks.
> A second part of this change should be to move the exec package to manage 
> its own per-bundle state, rather than relying on a global datastore to 
> extract the per-bundle, per-ptransform values.
> Related: https://issues.apache.org/jira/browse/BEAM-6541 





[jira] [Updated] (BEAM-9237) Environment-sensitive provisioning

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee updated BEAM-9237:
--
Parent: BEAM-9238
Issue Type: Sub-task  (was: Improvement)

> Environment-sensitive provisioning
> --
>
> Key: BEAM-9237
> URL: https://issues.apache.org/jira/browse/BEAM-9237
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>
> * Extending provision service
>  # Extending provision request proto to add an environment ID parameter
>  # Committing manifest per environment
>  # Passing an environment ID to SDK harness
>  # Modifying Job API to keep multiple retrieval tokens
>  # Retrieving only the metadata relevant to the given environment from the SDK 
> harness using an environment ID





[jira] [Updated] (BEAM-9056) Staging artifacts from environment

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee updated BEAM-9056:
--
Parent: BEAM-9238
Issue Type: Sub-task  (was: Improvement)

> Staging artifacts from environment
> --
>
> Key: BEAM-9056
> URL: https://issues.apache.org/jira/browse/BEAM-9056
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Staging artifacts from the artifact information embedded in the environment proto.
> Details: 
> https://docs.google.com/document/d/1L7MJcfyy9mg2Ahfw5XPhUeBe-dyvAPMOYOiFA1-kAog



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9167) Reduce overhead of Go SDK side metrics

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9167?focusedWorklogId=380226=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380226
 ]

ASF GitHub Bot logged work on BEAM-9167:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:24
Start Date: 31/Jan/20 22:24
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #10716: [BEAM-9167] 
Metrics extraction refactoring.
URL: https://github.com/apache/beam/pull/10716#discussion_r373708761
 
 

 ##
 File path: sdks/go/pkg/beam/core/metrics/dumper.go
 ##
 @@ -0,0 +1,130 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package metrics
+
+import (
+   "context"
+   "fmt"
+   "sort"
+   "time"
+
+   "github.com/apache/beam/sdks/go/pkg/beam/log"
+)
+
+// DumpToLog is a debugging function that outputs all metrics available locally
+// to beam.Log.
+func DumpToLog(ctx context.Context) {
+   store := GetStore(ctx)
+   if store == nil {
+   log.Errorf(ctx, "Unable to dump metrics: provided context 
doesn't contain metrics Store.")
+   return
+   }
+   DumpToLogFromStore(ctx, store)
+}
+
+// DumpToLogFromStore dumps the metrics in the provided Store to beam.Log.
+func DumpToLogFromStore(ctx context.Context, store *Store) {
+   dumperExtractor(store, func(format string, args ...interface{}) {
+   log.Errorf(ctx, format, args...)
+   })
+}
+
+// DumpToOutFromContext is a debugging function that outputs all metrics
+// available locally to std out,
+// extracting the metric store from the context.
+func DumpToOutFromContext(ctx context.Context) {
+   store := GetStore(ctx)
+   if store == nil {
+   fmt.Printf("Unable to dump metrics: provided context doesn't 
contain metrics Store.")
+   return
+   }
+   DumpToOutFromStore(store)
+}
+
+// DumpToOutFromStore is a debugging function that outputs all metrics
+// available locally to std out directly from the store.
+func DumpToOutFromStore(store *Store) {
+   dumperExtractor(store, func(format string, args ...interface{}) {
+   fmt.Printf(format+"\n", args...)
+   })
+}
+
+func dumperExtractor(store *Store, p func(format string, args ...interface{})) 
{
+   m := make(map[Labels]interface{})
+   e := {
+   SumInt64: func(l Labels, v int64) {
+   m[l] = {value: v}
+   },
+   DistributionInt64: func(l Labels, count, sum, min, max int64) {
+   m[l] = {count: count, sum: sum, min: min, 
max: max}
+   },
+   GaugeInt64: func(l Labels, v int64, t time.Time) {
+   m[l] = {v: v, t: t}
+   },
+   }
+   e.ExtractFrom(store)
+   dumpTo(m, p)
+}
+
+type metricDumper struct {
 
 Review comment:
   Good catch. Oversight from a previous prototype.
 



Issue Time Tracking
---

Worklog Id: (was: 380226)
Time Spent: 2h 50m  (was: 2h 40m)

> Reduce overhead of Go SDK side metrics
> --
>
> Key: BEAM-9167
> URL: https://issues.apache.org/jira/browse/BEAM-9167
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Locking overhead due to the global store and local caches of SDK counter data 
> can dominate certain workloads, which means we can do better.
> Instead of having a global store of metrics data to extract counters, we 
> should use per ptransform (or per bundle) counter sets, which would avoid 
> requiring locking per counter operation. The main 

[jira] [Work logged] (BEAM-9167) Reduce overhead of Go SDK side metrics

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9167?focusedWorklogId=380228=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380228
 ]

ASF GitHub Bot logged work on BEAM-9167:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:24
Start Date: 31/Jan/20 22:24
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #10716: [BEAM-9167] 
Metrics extraction refactoring.
URL: https://github.com/apache/beam/pull/10716#discussion_r373709779
 
 

 ##
 File path: sdks/go/pkg/beam/core/metrics/metrics.go
 ##
 @@ -144,63 +119,55 @@ func (ctx *beamCtx) String() string {
return fmt.Sprintf("beamCtx[%s;%s]", ctx.bundleID, ctx.ptransformID)
 }
 
-// SetBundleID sets the id of the current Bundle.
+// SetBundleID sets the id of the current Bundle, and populates the store.
 func SetBundleID(ctx context.Context, id string) context.Context {
// Checking for *beamCtx is an optimization, so we don't dig deeply
// for ids if not necessary.
if bctx, ok := ctx.(*beamCtx); ok {
-   return &beamCtx{Context: bctx.Context, bundleID: id, bs: 
{}, ptransformID: bctx.ptransformID}
+   return &beamCtx{Context: bctx.Context, bundleID: id, store: 
newStore(), ptransformID: bctx.ptransformID}
}
-   return &beamCtx{Context: ctx, bundleID: id, bs: {}}
+   return &beamCtx{Context: ctx, bundleID: id, store: newStore()}
 }
 
 // SetPTransformID sets the id of the current PTransform.
-// Must only be called on a context returened by SetBundleID.
+// Must only be called on a context returned by SetBundleID.
 func SetPTransformID(ctx context.Context, id string) context.Context {
// Checking for *beamCtx is an optimization, so we don't dig deeply
// for ids if not necessary.
if bctx, ok := ctx.(*beamCtx); ok {
-   return &beamCtx{Context: bctx.Context, bundleID: bctx.bundleID, 
bs: bctx.bs, ptransformID: id}
+   return &beamCtx{Context: bctx.Context, bundleID: bctx.bundleID, 
store: bctx.store, ptransformID: id}
+   }
+   // Avoid breaking if the bundle is unset in testing.
+   return &beamCtx{Context: ctx, bundleID: bundleIDUnset, store: 
newStore(), ptransformID: id}
+}
+
+// GetStore extracts the metrics Store for the given context for a bundle.
+//
+// Returns nil if the context doesn't contain a metric Store.
+func GetStore(ctx context.Context) *Store {
+   if bctx, ok := ctx.(*beamCtx); ok {
+   return bctx.store
+   }
+   if v := ctx.Value(storeKey); v != nil {
+   return v.(*Store)
}
-   panic(fmt.Sprintf("SetPTransformID called before SetBundleID for %v", 
id))
-   return nil // never runs.
+   return nil
 }
 
 const (
bundleIDUnset = "(bundle id unset)"
ptransformIDUnset = "(ptransform id unset)"
 )
 
-func getContextKey(ctx context.Context, n name) key {
-   key := key{name: n, bundle: bundleIDUnset, ptransform: 
ptransformIDUnset}
-   if id := ctx.Value(bundleKey); id != nil {
-   key.bundle = id.(string)
-   }
-   if id := ctx.Value(ptransformKey); id != nil {
-   key.ptransform = id.(string)
-   }
-   return key
-}
-
 func getCounterSet(ctx context.Context) *ptCounterSet {
-   if id := ctx.Value(counterSetKey); id != nil {
-   return id.(*ptCounterSet)
+   if bctx, ok := ctx.(*beamCtx); ok && bctx.cs != nil {
+   return bctx.cs
}
-   // It's not set anywhere and wasn't hoisted, so create it.
-   if bctx, ok := ctx.(*beamCtx); ok {
-   bctx.bs.mu.Lock()
-   cs := {
-   counters:  make(map[nameHash]*counter),
-   distributions: make(map[nameHash]*distribution),
-   gauges:make(map[nameHash]*gauge),
-   }
-   bctx.bs.css = append(bctx.bs.css, cs)
-   bctx.cs = cs
-   bctx.bs.mu.Unlock()
-   return cs
+   if set := ctx.Value(counterSetKey); set != nil {
+   return set.(*ptCounterSet)
}
-   panic("counterSet missing, beam isn't set up properly.")
-   return nil // never runs.
+   // This isn't a beam context, so we can't store the metric.
 
 Review comment:
   The counterset is what's used to store the metric for bundle access, so it's 
accurate, but you're right that it's confusing. Rewording.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 380228)

> Reduce overhead of Go SDK side metrics
> 

[jira] [Work logged] (BEAM-9167) Reduce overhead of Go SDK side metrics

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9167?focusedWorklogId=380227=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380227
 ]

ASF GitHub Bot logged work on BEAM-9167:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:24
Start Date: 31/Jan/20 22:24
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #10716: [BEAM-9167] 
Metrics extraction refactoring.
URL: https://github.com/apache/beam/pull/10716#discussion_r373711557
 
 

 ##
 File path: sdks/go/pkg/beam/runners/direct/direct.go
 ##
 @@ -72,7 +72,9 @@ func Execute(ctx context.Context, p *beam.Pipeline) error {
if err = plan.Down(ctx); err != nil {
return err
}
-   metrics.DumpToLog(ctx)
+   // TODO(lostluck) 2020/01/24: What's the right way to expose the
+   // metrics store for the direct runner?
+   metrics.DumpToLogFromStore(ctx, plan.Store)
 
 Review comment:
   It's more of a general comment about how we give users programmatic access 
to the metrics after pipeline completion. There's nothing wrong with the way 
the direct runner dumps the metrics every time; that's fine.
   
   We likely need an extractor registration setup to go along with the runner 
Execute registration, since in the absence of a reliable job management server 
there's no common way for runners to return metrics to users.
 



Issue Time Tracking
---

Worklog Id: (was: 380227)
Time Spent: 3h  (was: 2h 50m)

> Reduce overhead of Go SDK side metrics
> --
>
> Key: BEAM-9167
> URL: https://issues.apache.org/jira/browse/BEAM-9167
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Locking overhead due to the global store and local caches of SDK counter data 
> can dominate certain workloads, which means we can do better.
> Instead of having a global store of metrics data to extract counters, we 
> should use per ptransform (or per bundle) counter sets, which would avoid 
> requiring locking per counter operation. The main detriment compared to the 
> current implementation is that a user would need to add their own locking if 
> they were to spawn multiple goroutines to process a Bundle's work in a DoFn.
> Given that self-multithreaded DoFns aren't recommended/safe in Java and are largely 
> impossible in Python, and that other constructs provided by the Beam Go SDK (like 
> Iterators and Emitters) are not thread-safe, this is a small concern, 
> provided the documentation is clear on this.
> Removing the locking and switching to atomic ops reduces the overhead 
> significantly in example jobs and in the benchmarks.
> A second part of this change should be to move the exec package to manage 
> its own per-bundle state, rather than relying on a global datastore to 
> extract the per-bundle, per-ptransform values.
> Related: https://issues.apache.org/jira/browse/BEAM-6541 
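
The trade-off described above can be illustrated with a small sketch. It contrasts a shared, mutex-guarded store with a counter updated through atomic operations. The types lockedStore and atomicCounter are hypothetical stand-ins, not the SDK's actual metrics types:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// lockedStore is a hypothetical global store: every update takes one shared mutex.
type lockedStore struct {
	mu       sync.Mutex
	counters map[string]int64
}

func (s *lockedStore) inc(name string, n int64) {
	s.mu.Lock()
	s.counters[name] += n
	s.mu.Unlock()
}

// atomicCounter is a hypothetical per-ptransform counter updated without a lock.
type atomicCounter struct {
	value int64
}

func (c *atomicCounter) inc(n int64) { atomic.AddInt64(&c.value, n) }
func (c *atomicCounter) get() int64  { return atomic.LoadInt64(&c.value) }

// countLocked increments one counter in the shared store from several goroutines.
func countLocked(workers, perWorker int) int64 {
	s := &lockedStore{counters: make(map[string]int64)}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				s.inc("elements", 1)
			}
		}()
	}
	wg.Wait()
	return s.counters["elements"]
}

// countAtomic performs the same increments against a single atomic counter.
func countAtomic(workers, perWorker int) int64 {
	c := &atomicCounter{}
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.inc(1)
			}
		}()
	}
	wg.Wait()
	return c.get()
}

func main() {
	fmt.Println(countLocked(4, 1000), countAtomic(4, 1000)) // prints: 4000 4000
}
```

Both variants produce the same totals. The difference is that the atomic version never blocks on a shared lock, which is the overhead this issue aims to remove.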



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9229) Adding dependency information to Environment proto

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee updated BEAM-9229:
--
Parent: BEAM-9238
Issue Type: Sub-task  (was: Improvement)

> Adding dependency information to Environment proto
> --
>
> Key: BEAM-9229
> URL: https://issues.apache.org/jira/browse/BEAM-9229
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Adding dependency information to Environment proto.





[jira] [Updated] (BEAM-9238) Cross-language pipeline dependency management

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee updated BEAM-9238:
--
Status: Open  (was: Triage Needed)

> Cross-language pipeline dependency management
> -
>
> Key: BEAM-9238
> URL: https://issues.apache.org/jira/browse/BEAM-9238
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>
> Meta-issue for tracking cross-language pipeline dependency management





[jira] [Created] (BEAM-9238) Cross-language pipeline dependency management

2020-01-31 Thread Heejong Lee (Jira)
Heejong Lee created BEAM-9238:
-

 Summary: Cross-language pipeline dependency management
 Key: BEAM-9238
 URL: https://issues.apache.org/jira/browse/BEAM-9238
 Project: Beam
  Issue Type: Improvement
  Components: java-fn-execution
Reporter: Heejong Lee


Meta-issue for tracking cross-language pipeline dependency management





[jira] [Assigned] (BEAM-9238) Cross-language pipeline dependency management

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee reassigned BEAM-9238:
-

Assignee: Heejong Lee

> Cross-language pipeline dependency management
> -
>
> Key: BEAM-9238
> URL: https://issues.apache.org/jira/browse/BEAM-9238
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>
> Meta-issue for tracking cross-language pipeline dependency management





[jira] [Created] (BEAM-9237) Environment-sensitive provisioning

2020-01-31 Thread Heejong Lee (Jira)
Heejong Lee created BEAM-9237:
-

 Summary: Environment-sensitive provisioning
 Key: BEAM-9237
 URL: https://issues.apache.org/jira/browse/BEAM-9237
 Project: Beam
  Issue Type: Improvement
  Components: java-fn-execution
Reporter: Heejong Lee
Assignee: Heejong Lee


* Extending provision service
 # Extending provision request proto to add an environment ID parameter
 # Committing manifest per environment
 # Passing an environment ID to SDK harness
 # Modifying Job API to keep multiple retrieval tokens
 # Retrieving only the metadata relevant to the given environment from the SDK 
harness with an environment ID





[jira] [Updated] (BEAM-9237) Environment-sensitive provisioning

2020-01-31 Thread Heejong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Heejong Lee updated BEAM-9237:
--
Status: Open  (was: Triage Needed)

> Environment-sensitive provisioning
> --
>
> Key: BEAM-9237
> URL: https://issues.apache.org/jira/browse/BEAM-9237
> Project: Beam
>  Issue Type: Improvement
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>
> * Extending provision service
>  # Extending provision request proto to add an environment ID parameter
>  # Committing manifest per environment
>  # Passing an environment ID to SDK harness
>  # Modifying Job API to keep multiple retrieval tokens
>  # Retrieving only the metadata relevant to the given environment from the SDK 
> harness with an environment ID





[jira] [Updated] (BEAM-9236) Mark missing Schema based classes and methods as Experimental

2020-01-31 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9236:
---
Status: Open  (was: Triage Needed)

> Mark missing Schema based classes and methods as Experimental
> -
>
> Key: BEAM-9236
> URL: https://issues.apache.org/jira/browse/BEAM-9236
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>






[jira] [Created] (BEAM-9236) Mark missing Schema based classes and methods as Experimental

2020-01-31 Thread Jira
Ismaël Mejía created BEAM-9236:
--

 Summary: Mark missing Schema based classes and methods as 
Experimental
 Key: BEAM-9236
 URL: https://issues.apache.org/jira/browse/BEAM-9236
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Ismaël Mejía
Assignee: Ismaël Mejía








[jira] [Closed] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread Robert Burke (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke closed BEAM-9233.
--
Fix Version/s: Not applicable
   Resolution: Fixed

Fixed by linked patch. Thanks!

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.
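
For context, one general way to avoid depending on debug info for function lookup is an explicit table of registered functions. The sketch below is only an illustration of that idea. The helpers register, resolve and symbolName are hypothetical and do not represent the patch mentioned above or the SDK's own registration API:

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// registry maps a function's symbol name to the function value itself.
var registry = map[string]interface{}{}

// symbolName returns the runtime's name for fn. It uses the runtime's
// function table rather than DWARF data. fn must be a function value.
func symbolName(fn interface{}) string {
	return runtime.FuncForPC(reflect.ValueOf(fn).Pointer()).Name()
}

// register records fn under its symbol name and returns that name.
func register(fn interface{}) string {
	name := symbolName(fn)
	registry[name] = fn
	return name
}

// resolve returns the function registered under name, if any.
func resolve(name string) (interface{}, error) {
	fn, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("function %q is not registered", name)
	}
	return fn, nil
}

func double(x int) int { return 2 * x }

func main() {
	name := register(double)
	fn, err := resolve(name)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, fn.(func(int) int)(21))
}
```

A lookup through such a table needs no symbol resolution at call time, so stripping debug info does not affect it.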





[jira] [Closed] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik closed BEAM-9235.
---
Fix Version/s: Not applicable
 Assignee: Luke Cwik
   Resolution: Fixed

> Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs
> --
>
> Key: BEAM-9235
> URL: https://issues.apache.org/jira/browse/BEAM-9235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5605) Support Portable SplittableDoFn for batch

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5605?focusedWorklogId=380224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380224
 ]

ASF GitHub Bot logged work on BEAM-5605:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:09
Start Date: 31/Jan/20 22:09
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10702: [BEAM-5605] Migrate 
splittable DoFn methods to use "new" DoFn style argument providing.
URL: https://github.com/apache/beam/pull/10702#issuecomment-580932788
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380224)
Time Spent: 11h 10m  (was: 11h)

> Support Portable SplittableDoFn for batch
> -
>
> Key: BEAM-5605
> URL: https://issues.apache.org/jira/browse/BEAM-5605
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Luke Cwik
>Priority: Major
>  Labels: portability
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Roll-up item tracking work towards supporting portable SplittableDoFn for 
> batch





[jira] [Commented] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread Luke Cwik (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027860#comment-17027860
 ] 

Luke Cwik commented on BEAM-9235:
-

It looks like this has recovered now: 
[https://builds.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Cron/1838/]

> Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs
> --
>
> Key: BEAM-9235
> URL: https://issues.apache.org/jira/browse/BEAM-9235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread Robert Burke (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke updated BEAM-9233:
---
Affects Version/s: (was: 2.18.0)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9235?focusedWorklogId=380223=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380223
 ]

ASF GitHub Bot logged work on BEAM-9235:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:06
Start Date: 31/Jan/20 22:06
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10738: [BEAM-9235] 
Disable failing Dataflow precommit
URL: https://github.com/apache/beam/pull/10738
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 380223)
Time Spent: 40m  (was: 0.5h)

> Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs
> --
>
> Key: BEAM-9235
> URL: https://issues.apache.org/jira/browse/BEAM-9235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380220
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 22:03
Start Date: 31/Jan/20 22:03
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701#issuecomment-580930393
 
 
   Run Spotless PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380220)
Time Spent: 4h 20m  (was: 4h 10m)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> At this moment a Custom Source is being split and serialized in sequence. If 
> there are many splits, it takes time to process all splits. 
>  
> Example: it takes 2s to calculate size and serialize CassandraSource due to 
> connection setup and teardown. With 100+ splits, it's a lot of time spent in 
> 1 worker. 
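
To make the arithmetic concrete: at roughly 2s per split, 100 splits handled one after another take about 200s, while 10 concurrent workers would ideally bring that near 20s. The sketch below illustrates the general idea with a bounded pool of goroutines. It is not the Java change under review, and estimate, sequential and concurrent are hypothetical names:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// estimate stands in for the per-split work (size calculation plus
// serialization); cost simulates connection setup and teardown.
func estimate(split int, cost time.Duration) int {
	time.Sleep(cost)
	return split * 10
}

// sequential processes the splits one after another.
func sequential(splits []int, cost time.Duration) []int {
	out := make([]int, len(splits))
	for i, s := range splits {
		out[i] = estimate(s, cost)
	}
	return out
}

// concurrent processes the splits with at most `workers` goroutines in
// flight and keeps results in input order.
func concurrent(splits []int, cost time.Duration, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	out := make([]int, len(splits))
	sem := make(chan struct{}, workers) // limits the number of running goroutines
	var wg sync.WaitGroup
	for i, s := range splits {
		wg.Add(1)
		sem <- struct{}{}
		go func(i, s int) {
			defer wg.Done()
			defer func() { <-sem }()
			out[i] = estimate(s, cost) // each goroutine writes its own slot
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	splits := make([]int, 100)
	for i := range splits {
		splits[i] = i
	}
	cost := 2 * time.Millisecond // scaled down from the 2s in the example

	start := time.Now()
	sequential(splits, cost)
	fmt.Println("sequential:", time.Since(start))

	start = time.Now()
	concurrent(splits, cost, 10)
	fmt.Println("10 workers:", time.Since(start))
}
```

The actual speedup depends on how many connections the backing store tolerates at once, so the worker count here is only a placeholder.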





[jira] [Updated] (BEAM-9231) Add Experimental portability kind and tag related classes

2020-01-31 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9231:
---
Description: For some extra context: I was studying the evolution of our 
APIs between versions, in particular for beam-sdks-java-core, and noticed that 
some parts were not properly marked as Experimental, in particular classes (and 
transforms) for portability and SplittableDoFn.

> Add Experimental portability kind and tag related classes
> -
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> For some extra context: I was studying the evolution of our APIs between 
> versions, in particular for beam-sdks-java-core, and noticed that some parts 
> were not properly marked as Experimental, in particular classes (and 
> transforms) for portability and SplittableDoFn.





[jira] [Work logged] (BEAM-9231) Add Experimental portability kind and tag related classes

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?focusedWorklogId=380218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380218
 ]

ASF GitHub Bot logged work on BEAM-9231:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:57
Start Date: 31/Jan/20 21:57
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #10739: [BEAM-9231] 
Add Experimental portability kind and tag related classes
URL: https://github.com/apache/beam/pull/10739
 
 
   Also make `SplittableDoFn` classes in `beam-sdks-java-core` Experimental and 
make Experimental annotations homogeneous in beam-sdks-java-core
   
 



Issue Time Tracking
---

Worklog Id: (was: 380218)
Remaining Estimate: 0h
Time Spent: 10m

> Add Experimental portability kind and tag related classes
> -
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-9231) Add Experimental portability kind and tag related classes

2020-01-31 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-9231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-9231:
---
Summary: Add Experimental portability kind and tag related classes  (was: 
Refine Experimental annotations and tag portability related classes)

> Add Experimental portability kind and tag related classes
> -
>
> Key: BEAM-9231
> URL: https://issues.apache.org/jira/browse/BEAM-9231
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>






[jira] [Work logged] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9235?focusedWorklogId=380216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380216
 ]

ASF GitHub Bot logged work on BEAM-9235:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:51
Start Date: 31/Jan/20 21:51
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10738: [BEAM-9235] 
Disable failing Dataflow precommit
URL: https://github.com/apache/beam/pull/10738#issuecomment-580926418
 
 
   I just got lucky with my pull request; it got through the DF tests...
 



Issue Time Tracking
---

Worklog Id: (was: 380216)
Time Spent: 0.5h  (was: 20m)

> Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs
> --
>
> Key: BEAM-9235
> URL: https://issues.apache.org/jira/browse/BEAM-9235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9235?focusedWorklogId=380213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380213
 ]

ASF GitHub Bot logged work on BEAM-9235:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:44
Start Date: 31/Jan/20 21:44
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #10738: [BEAM-9235] 
Disable failing Dataflow precommit
URL: https://github.com/apache/beam/pull/10738
 
 
   
   
   

[jira] [Work logged] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9235?focusedWorklogId=380214=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380214
 ]

ASF GitHub Bot logged work on BEAM-9235:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:44
Start Date: 31/Jan/20 21:44
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #10738: [BEAM-9235] Disable 
failing Dataflow precommit
URL: https://github.com/apache/beam/pull/10738#issuecomment-580924100
 
 
   R: @alexvanboxel 
 



Issue Time Tracking
---

Worklog Id: (was: 380214)
Time Spent: 20m  (was: 10m)

> Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs
> --
>
> Key: BEAM-9235
> URL: https://issues.apache.org/jira/browse/BEAM-9235
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-9235) Re-enable windmill precommit for *Java_Examples_Dataflow* Jenkins jobs

2020-01-31 Thread Luke Cwik (Jira)
Luke Cwik created BEAM-9235:
---

 Summary: Re-enable windmill precommit for *Java_Examples_Dataflow* 
Jenkins jobs
 Key: BEAM-9235
 URL: https://issues.apache.org/jira/browse/BEAM-9235
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Luke Cwik








[jira] [Updated] (BEAM-9234) beam_PerformanceTests_WordCountIT_Py* failing

2020-01-31 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9234:
--
Status: Open  (was: Triage Needed)

> beam_PerformanceTests_WordCountIT_Py* failing
> -
>
> Key: BEAM-9234
> URL: https://issues.apache.org/jira/browse/BEAM-9234
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kamil Wasilewski
>Priority: Major
>
> https://builds.apache.org/job/beam_PerformanceTests_WordCountIT_Py37/1015
> 11:17:23 Traceback (most recent call last):
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/pkb.py",
>  line 19, in <module>
> 11:17:23 from perfkitbenchmarker.pkb import Main
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/pkb.py",
>  line 75, in <module>
> 11:17:23 from perfkitbenchmarker import archive
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/archive.py",
>  line 24, in <module>
> 11:17:23 from perfkitbenchmarker.providers.aws.util import AWS_PATH
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/providers/__init__.py",
>  line 19, in <module>
> 11:17:23 from perfkitbenchmarker import events
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/events.py",
>  line 26, in <module>
> 11:17:23 from perfkitbenchmarker import data
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/data/__init__.py",
>  line 34, in <module>
> 11:17:23 from perfkitbenchmarker import temp_dir
> 11:17:23   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/temp_dir.py",
>  line 29, in <module>
> 11:17:23 import functools32 as functools
> 11:17:23 ImportError: No module named functools32





[jira] [Created] (BEAM-9234) beam_PerformanceTests_WordCountIT_Py* failing

2020-01-31 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-9234:
-

 Summary: beam_PerformanceTests_WordCountIT_Py* failing
 Key: BEAM-9234
 URL: https://issues.apache.org/jira/browse/BEAM-9234
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Kyle Weaver
Assignee: Kamil Wasilewski


https://builds.apache.org/job/beam_PerformanceTests_WordCountIT_Py37/1015

11:17:23 Traceback (most recent call last):
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/pkb.py", line 19, in <module>
11:17:23     from perfkitbenchmarker.pkb import Main
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 75, in <module>
11:17:23     from perfkitbenchmarker import archive
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/archive.py", line 24, in <module>
11:17:23     from perfkitbenchmarker.providers.aws.util import AWS_PATH
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/providers/__init__.py", line 19, in <module>
11:17:23     from perfkitbenchmarker import events
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/events.py", line 26, in <module>
11:17:23     from perfkitbenchmarker import data
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/data/__init__.py", line 34, in <module>
11:17:23     from perfkitbenchmarker import temp_dir
11:17:23   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_WordCountIT_Py37/PerfKitBenchmarker/perfkitbenchmarker/temp_dir.py", line 29, in <module>
11:17:23     import functools32 as functools
11:17:23 ImportError: No module named functools32
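
The traceback ends at an unconditional import of the `functools32` backport package, which the interpreter on the build machine could not find. As a rough illustration only (this is not the actual PerfKitBenchmarker code, and `load_functools` is a hypothetical helper name), a guarded import that falls back to the standard-library `functools` could look like this:

```python
def load_functools():
    """Return functools32 when it is installed, else the standard functools.

    Hypothetical helper for illustration; not part of PerfKitBenchmarker.
    """
    try:
        import functools32 as functools_impl  # optional backport package
    except ImportError:
        import functools as functools_impl  # standard-library fallback
    return functools_impl


functools = load_functools()

# functools.partial exists in both modules, so this works with either one.
add_one = functools.partial(lambda a, b: a + b, 1)
```

Whether a fallback is appropriate depends on which `functools32` features the caller relies on; installing the missing package on the worker is the other obvious option.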






[jira] [Commented] (BEAM-9206) Easy way to run checkJavaLinkage?

2020-01-31 Thread Tomo Suzuki (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027845#comment-17027845
 ] 

Tomo Suzuki commented on BEAM-9206:
---

Thank you for the input. I started to work on Option 1.
https://github.com/GoogleCloudPlatform/cloud-opensource-java/issues/1167

> Easy way to run checkJavaLinkage?
> -
>
> Key: BEAM-9206
> URL: https://issues.apache.org/jira/browse/BEAM-9206
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Tomo Suzuki
>Assignee: Tomo Suzuki
>Priority: Major
> Attachments: r2tG83tyDrn.png
>
>
> Follow-up to iemejia's comment: 
> [https://github.com/apache/beam/pull/10643#issuecomment-579276082]
> {quote}I just want some sort of ./gradlew :checkJavaLinkage that works for 
> the whole set of modules of the project. Is this 'feasible' with gradlew + 
> Beam?
> {quote}
> h1. Considerations
>  * Something that can run on Jenkins
>  * Comparison with the result of origin/master
>  * Simple way to run checkJavaLinkage for all modules
> h1. Options
> h2. 1. A shell script that runs checkJavaLinkage
> Short-term solution to help iemejia's 31 modules. 
> [https://github.com/apache/beam/pull/10643#issuecomment-578167314] .
> h2. 2. Jenkins plugin
> Jenkins seems to have the feature to compare build result with a certain 
> "reference build".
> !r2tG83tyDrn.png|width=618,height=389!
>  
> h2. 3. LinkageCheckerMain to take ignore exception list
> * LinkageCheckerMain to take an option to output a JSON file containing linkage 
> errors.
>   The file is checked in to the Git repository.
> * LinkageCheckerMain to take a JSON file listing linkage errors to ignore.
>   The class returns non-zero status if there are linkage errors outside the 
> {{ignore}} file.
> Leveraging the fact that a Java class name or method name does not contain 
> "{{/}}", can we use {{.gitignore}} syntax to specify linkage errors to 
> ignore?
> {noformat}
> com.google.guava:guava:25.1-jre/com.google.common.collect.ImmutableList/size
> com.google.guava:guava:*/**
> *weld-osgi-bundle*/**
> */com.github.luben.zstd.ZstdInputStream
> */com.github.luben.zstd.ZstdOutputStream
> */org.apache.beam.vendor.bytebuddy.v1_9_3.net.bytebuddy.jar.asm.commons.ModuleHashesAttribute
> {noformat}
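
To make the idea of option 3 concrete, here is a small sketch of how such ignore patterns might be matched. It assumes each linkage error is keyed as `artifact/class` or `artifact/class/member` (a format inferred from the example patterns, not confirmed by the Linkage Checker), and it handles only a simplified subset of the wildcard syntax: `**` matches anything, `*` matches anything except `/`. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified ignore pattern into a compiled regex.

    "**" matches any characters (including "/");
    "*" matches any characters except "/".
    Everything else is matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    # \Z anchors the end; re.match anchors the start.
    return re.compile("".join(parts) + r"\Z")


def is_ignored(error_key, patterns):
    """Return True if error_key matches at least one ignore pattern."""
    return any(pattern_to_regex(p).match(error_key) for p in patterns)


IGNORE_PATTERNS = [
    "com.google.guava:guava:*/**",
    "*/com.github.luben.zstd.ZstdInputStream",
]
```

With these two patterns, every error under any Guava version is ignored, while an error in another artifact is ignored only when the class name matches exactly. Real `.gitignore` matching has more rules (negation, directory-only patterns, and so on) that this sketch leaves out.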
>  
>  





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380208=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380208
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:12
Start Date: 31/Jan/20 21:12
Worklog Time Spent: 10m 
  Work Description: ianlancetaylor commented on issue #10737: [BEAM-9233] 
Support -buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737#issuecomment-580913576
 
 
   Thanks!
 



Issue Time Tracking
---

Worklog Id: (was: 380208)
Time Spent: 1h 10m  (was: 1h)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?focusedWorklogId=380207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380207
 ]

ASF GitHub Bot logged work on BEAM-9037:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:01
Start Date: 31/Jan/20 21:01
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10486: [BEAM-9037] 
Instant and duration as logical type
URL: https://github.com/apache/beam/pull/10486#issuecomment-580909721
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380207)
Time Spent: 4h  (was: 3h 50m)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp and 
> Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Work logged] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?focusedWorklogId=380206=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380206
 ]

ASF GitHub Bot logged work on BEAM-9037:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:01
Start Date: 31/Jan/20 21:01
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10486: [BEAM-9037] 
Instant and duration as logical type
URL: https://github.com/apache/beam/pull/10486#issuecomment-580909218
 
 
   Run Java_Examples_Dataflow PreCommit
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 380206)
Time Spent: 3h 50m  (was: 3h 40m)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp and 
> Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Work logged] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?focusedWorklogId=380204=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380204
 ]

ASF GitHub Bot logged work on BEAM-9037:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:00
Start Date: 31/Jan/20 21:00
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10486: [BEAM-9037] 
Instant and duration as logical type
URL: https://github.com/apache/beam/pull/10486#issuecomment-580844716
 
 
   Run Java_Examples_Dataflow PreCommit
   
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 380204)
Time Spent: 3.5h  (was: 3h 20m)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp and 
> Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Work logged] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?focusedWorklogId=380205=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380205
 ]

ASF GitHub Bot logged work on BEAM-9037:


Author: ASF GitHub Bot
Created on: 31/Jan/20 21:00
Start Date: 31/Jan/20 21:00
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10486: [BEAM-9037] 
Instant and duration as logical type
URL: https://github.com/apache/beam/pull/10486#issuecomment-580836423
 
 
   Run Java_Examples_Dataflow PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380205)
Time Spent: 3h 40m  (was: 3.5h)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp and 
> Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Work logged] (BEAM-9037) Instant and duration as logical type

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9037?focusedWorklogId=380203=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380203
 ]

ASF GitHub Bot logged work on BEAM-9037:


Author: ASF GitHub Bot
Created on: 31/Jan/20 20:59
Start Date: 31/Jan/20 20:59
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on issue #10486: [BEAM-9037] 
Instant and duration as logical type
URL: https://github.com/apache/beam/pull/10486#issuecomment-580909218
 
 
   Run Java_Examples_Dataflow PreCommit
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 380203)
Time Spent: 3h 20m  (was: 3h 10m)

> Instant and duration as logical type 
> -
>
> Key: BEAM-9037
> URL: https://issues.apache.org/jira/browse/BEAM-9037
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
> Fix For: 2.20.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The proto schema includes Timestamp and Duration with nanosecond precision. The 
> logical types should be promoted to the core logical types, so they can be 
> handled by the various IOs as standard mandatory conversions.
> This means that the logical type should not use the proto-specific Timestamp and 
> Duration but the Java 8 Instant and Duration.
> See discussion in the design document:
> [https://docs.google.com/document/d/1uu9pJktzT_O3DxGd1-Q2op4nRk4HekIZbzi-0oTAips/edit#heading=h.9uhml95iygqr]





[jira] [Work logged] (BEAM-8933) BigQuery IO should support read/write in Arrow format

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8933?focusedWorklogId=380202=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380202
 ]

ASF GitHub Bot logged work on BEAM-8933:


Author: ASF GitHub Bot
Created on: 31/Jan/20 20:57
Start Date: 31/Jan/20 20:57
Worklog Time Spent: 10m 
  Work Description: alexvanboxel commented on pull request #10384: 
[BEAM-8933] Utilities for converting Arrow schemas and reading Arrow batches as 
Rows
URL: https://github.com/apache/beam/pull/10384#discussion_r373632452
 
 

 ##
 File path: 
sdks/java/extensions/arrow/src/main/java/org/apache/beam/sdk/extensions/arrow/ArrowConversion.java
 ##
 @@ -0,0 +1,448 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.arrow;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import java.util.Iterator;
+import java.util.List;
+import java.util.function.Function;
+import java.util.stream.Collectors;
+import javax.annotation.Nullable;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.util.Text;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.schemas.CachingFactory;
+import org.apache.beam.sdk.schemas.Factory;
+import org.apache.beam.sdk.schemas.FieldValueGetter;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.Schema.Field;
+import org.apache.beam.sdk.schemas.Schema.FieldType;
+import org.apache.beam.sdk.schemas.logicaltypes.FixedBytes;
+import org.apache.beam.sdk.values.Row;
+import org.joda.time.DateTime;
+import org.joda.time.DateTimeZone;
+
+/**
+ * Utilities to create {@link Iterable}s of Beam {@link Row} instances backed by Arrow record
+ * batches.
+ */
+@Experimental(Experimental.Kind.SCHEMAS)
+public class ArrowConversion {
+  /** Converts Arrow schema to Beam row schema. */
+  public static Schema toBeamSchema(org.apache.arrow.vector.types.pojo.Schema schema) {
 
 Review comment:
   I would consider moving the schema generation into a separate class (for example, 
ArrowSchemaTranslator).
 



Issue Time Tracking
---

Worklog Id: (was: 380202)
Time Spent: 10h 40m  (was: 10.5h)

> BigQuery IO should support read/write in Arrow format
> -
>
> Key: BEAM-8933
> URL: https://issues.apache.org/jira/browse/BEAM-8933
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Reporter: Kirill Kozlov
>Assignee: Kirill Kozlov
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> BigQueryIO currently uses the Avro format for reading and writing.
> We should add a config option to BigQueryIO to specify which format to use: Arrow or 
> Avro (with Avro as the default).





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380196=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380196
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 20:24
Start Date: 31/Jan/20 20:24
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #10737: [BEAM-9233] 
Support -buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 380196)
Time Spent: 1h  (was: 50m)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380194=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380194
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 20:16
Start Date: 31/Jan/20 20:16
Worklog Time Spent: 10m 
  Work Description: chadrik commented on pull request #10735: 
[BEAM-8280][BEAM-8629] Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#discussion_r373667044
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -305,7 +326,7 @@ def has_simple_output_type(self):
 return (self.output_types and len(self.output_types[0]) == 1 and
 not self.output_types[1])
 
-  def strip_iterable(self):
+  def strip_iterable(self):  # type: (...) -> IOTypeHints
 
 Review comment:
   The type comment should go on the line after the `def` line.  There is no need for `...` unless you are 
skipping the definition of an argument (the type of `self` is implicit). 
   
   ```suggestion
 def strip_iterable(self):
   # type: () -> IOTypeHints
   ```
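
   For context, a minimal sketch of the placement described above, using a hypothetical stand-in class (`IOTypeHintsSketch` is not the real `IOTypeHints`). Under PEP 484, the type comment goes on the first line of the function body, and `self` is left out of the argument list:

```python
class IOTypeHintsSketch(object):
    """Hypothetical stand-in for IOTypeHints, for illustration only."""

    def __init__(self, output_types=None):
        # type: (object) -> None
        self.output_types = output_types

    def strip_iterable(self):
        # type: () -> IOTypeHintsSketch
        # No arguments besides self, so the argument list is empty.
        return IOTypeHintsSketch(self.output_types)

    def with_output(self, output_types):
        # type: (object) -> IOTypeHintsSketch
        # One argument besides self, so one type is listed.
        return IOTypeHintsSketch(output_types)
```

   The `(...)` form is reserved for cases where the argument types are deliberately left unspecified.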
 



Issue Time Tracking
---

Worklog Id: (was: 380194)
Time Spent: 1h 20m  (was: 1h 10m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380192=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380192
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:47
Start Date: 31/Jan/20 19:47
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #10737: [BEAM-9233] Support 
-buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737#issuecomment-580882040
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 380192)
Time Spent: 50m  (was: 40m)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380191=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380191
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:44
Start Date: 31/Jan/20 19:44
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #10737: [BEAM-9233] Support 
-buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737#issuecomment-580882040
 
 
   Run Go Postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 380191)
Time Spent: 40m  (was: 0.5h)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380190=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380190
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:44
Start Date: 31/Jan/20 19:44
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #10737: [BEAM-9233] Support 
-buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737#issuecomment-580881977
 
 
   Run Go Precommit
 



Issue Time Tracking
---

Worklog Id: (was: 380190)
Time Spent: 0.5h  (was: 20m)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Updated] (BEAM-9227) Perform bounded source computations on the worker.

2020-01-31 Thread Boyuan Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyuan Zhang updated BEAM-9227:
---
Fix Version/s: 2.20.0

> Perform bounded source computations on the worker.
> --
>
> Key: BEAM-9227
> URL: https://issues.apache.org/jira/browse/BEAM-9227
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.18.0
>Reporter: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.20.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In particular, this breaks templates.





[jira] [Closed] (BEAM-9227) Perform bounded source computations on the worker.

2020-01-31 Thread Boyuan Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyuan Zhang closed BEAM-9227.
--
Resolution: Fixed

> Perform bounded source computations on the worker.
> --
>
> Key: BEAM-9227
> URL: https://issues.apache.org/jira/browse/BEAM-9227
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.18.0
>Reporter: Robert Bradshaw
>Priority: Blocker
> Fix For: 2.20.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In particular, this breaks templates.





[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380188=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380188
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:37
Start Date: 31/Jan/20 19:37
Worklog Time Spent: 10m 
  Work Description: ianlancetaylor commented on pull request #10737: 
[BEAM-9233] Support -buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737
 
 
   Use symbol tables, rather than DWARF, to map between symbol names and 
addresses.
   This will work when the DWARF information is stripped.
   It should also be more efficient, as it does not require parsing the DWARF 
information.
   
   
   

[jira] [Work logged] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9233?focusedWorklogId=380189=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380189
 ]

ASF GitHub Bot logged work on BEAM-9233:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:37
Start Date: 31/Jan/20 19:37
Worklog Time Spent: 10m 
  Work Description: ianlancetaylor commented on issue #10737: [BEAM-9233] 
Support -buildmode=pie -ldflags=-w with unregistered Go functions
URL: https://github.com/apache/beam/pull/10737#issuecomment-580879513
 
 
   R: @lostluck 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 380189)
Time Spent: 20m  (was: 10m)

> Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w
> 
>
> Key: BEAM-9233
> URL: https://issues.apache.org/jira/browse/BEAM-9233
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Affects Versions: 2.18.0
> Environment: GNU/Linux
>Reporter: Ian Lance Taylor
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
> transfers an unregistered function fails.  It tries to look up the symbol in 
> the DWARF debug info, but that info has been stripped because of the -w flag. 
>  This causes a program crash when calling the function.
> I have a patch for this problem that I will send shortly.





[jira] [Created] (BEAM-9233) Go: unregistered Go functions fail when using -buildmode=pie -ldflags=-w

2020-01-31 Thread Ian Lance Taylor (Jira)
Ian Lance Taylor created BEAM-9233:
--

 Summary: Go: unregistered Go functions fail when using 
-buildmode=pie -ldflags=-w
 Key: BEAM-9233
 URL: https://issues.apache.org/jira/browse/BEAM-9233
 Project: Beam
  Issue Type: Bug
  Components: sdk-go
Affects Versions: 2.18.0
 Environment: GNU/Linux
Reporter: Ian Lance Taylor


If a Go program is built with -buildmode=pie -ldflags=-w, the code that 
transfers an unregistered function fails.  It tries to look up the symbol in 
the DWARF debug info, but that info has been stripped because of the -w flag.  
This causes a program crash when calling the function.

I have a patch for this problem that I will send shortly.





[jira] [Work logged] (BEAM-7961) Add tests for all runner native transforms and some widely used composite transforms to cross-language validates runner test suite

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7961?focusedWorklogId=380177=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380177
 ]

ASF GitHub Bot logged work on BEAM-7961:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:22
Start Date: 31/Jan/20 19:22
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #10051: [BEAM-7961] Add 
tests for all runner native transforms for XLang
URL: https://github.com/apache/beam/pull/10051#issuecomment-580873951
 
 
   Retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 380177)
Time Spent: 22h 50m  (was: 22h 40m)

> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite
> --
>
> Key: BEAM-7961
> URL: https://issues.apache.org/jira/browse/BEAM-7961
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 22h 50m
>  Remaining Estimate: 0h
>
> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite





[jira] [Work logged] (BEAM-7961) Add tests for all runner native transforms and some widely used composite transforms to cross-language validates runner test suite

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7961?focusedWorklogId=380176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380176
 ]

ASF GitHub Bot logged work on BEAM-7961:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:19
Start Date: 31/Jan/20 19:19
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #10051: [BEAM-7961] Add tests 
for all runner native transforms for XLang
URL: https://github.com/apache/beam/pull/10051#issuecomment-580873075
 
 
   @chamikaramj Fixed.
 



Issue Time Tracking
---

Worklog Id: (was: 380176)
Time Spent: 22h 40m  (was: 22.5h)

> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite
> --
>
> Key: BEAM-7961
> URL: https://issues.apache.org/jira/browse/BEAM-7961
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 22h 40m
>  Remaining Estimate: 0h
>
> Add tests for all runner native transforms and some widely used composite 
> transforms to cross-language validates runner test suite





[jira] [Work logged] (BEAM-8691) Beam Dependency Update Request: com.google.cloud.bigtable:bigtable-client-core

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8691?focusedWorklogId=380175=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380175
 ]

ASF GitHub Bot logged work on BEAM-8691:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:15
Start Date: 31/Jan/20 19:15
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10714: [BEAM-8691] 
bigtable-client-core dependency upgrade
URL: https://github.com/apache/beam/pull/10714#issuecomment-580871418
 
 
   > Dataflow example still fails. This time it's "No space left on device":
   > 
   > ```
   > 12:39:08 > Task 
:runners:google-cloud-dataflow-java:examples:preCommitLegacyWorker
   > 12:39:08 java.io.IOException: No space left on device
   > 12:39:08 com.esotericsoftware.kryo.KryoException: java.io.IOException: No 
space left on device
   > ```
   > 
   > 
https://builds.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Phrase/268/consoleFull
   > 
   > Is this specific to this PR? I'll confirm the word-count example in my 
local environment with my branch.
   > 
   ```
   jenkins@apache-beam-jenkins-7:~$ df -h .
   Filesystem  Size  Used Avail Use% Mounted on
   /dev/sda1   485G  484G 1001M 100% /
   ```
   Probably not specific to your PR.
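
The `df -h` output above shows the Jenkins worker's root filesystem essentially full. A minimal sketch of the same check in Python, assuming a build wrapper wanted to fail fast before starting a job. The function name `free_space_report` and the 5 GiB threshold are illustrative and are not part of Beam's tooling.

```python
import shutil

GIB = 1024 ** 3


def free_space_report(path="/", min_free_bytes=5 * GIB):
    """Return (used_percent, free_bytes, ok) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return used_percent, usage.free, usage.free >= min_free_bytes


if __name__ == "__main__":
    pct, free, ok = free_space_report()
    print(f"{pct:.0f}% used, {free / GIB:.1f} GiB free, ok={ok}")
```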
 



Issue Time Tracking
---

Worklog Id: (was: 380175)
Time Spent: 6.5h  (was: 6h 20m)

> Beam Dependency Update Request: com.google.cloud.bigtable:bigtable-client-core
> --
>
> Key: BEAM-8691
> URL: https://issues.apache.org/jira/browse/BEAM-8691
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:39:51.523448 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:05:43.901882 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:30.163557 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:37.979355 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:39.422837 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:06:11.312353 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:45.627449 
> -
> Please consider 

[jira] [Work logged] (BEAM-9203) Programmatically determine if SQL exception is user error, unsupported, or bug

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9203?focusedWorklogId=380174=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380174
 ]

ASF GitHub Bot logged work on BEAM-9203:


Author: ASF GitHub Bot
Created on: 31/Jan/20 19:14
Start Date: 31/Jan/20 19:14
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #10699: [BEAM-9203] 
Clarify exceptions in SQL modules
URL: https://github.com/apache/beam/pull/10699#issuecomment-580871171
 
 
   I have rebased and resolved the conflict that came up. Please take another 
look.
 



Issue Time Tracking
---

Worklog Id: (was: 380174)
Time Spent: 50m  (was: 40m)

> Programmatically determine if SQL exception is user error, unsupported, or bug
> --
>
> Key: BEAM-9203
> URL: https://issues.apache.org/jira/browse/BEAM-9203
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, dsl-sql-zetasql
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Right now there are a lot of exceptions thrown by the Calcite SQL dialect and 
> ZetaSQL dialect of Beam SQL. It is hard to catch just the errors that are 
> user errors, or just the errors that are unsupported operations.
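
One way to make this distinction programmatic is a small exception hierarchy that callers can catch selectively. The sketch below is illustrative only: the class names (`SqlError`, `SqlUserError`, `SqlUnsupportedError`) and the `classify` helper are hypothetical, and Beam SQL itself is written in Java, so this is not its actual API.

```python
class SqlError(Exception):
    """Base class for errors raised while planning a query (hypothetical)."""


class SqlUserError(SqlError):
    """The query itself is invalid, e.g. a syntax error or unknown column."""


class SqlUnsupportedError(SqlError):
    """The query is valid SQL but uses a feature that is not implemented."""


def classify(exc):
    """Map an exception to one of: 'user error', 'unsupported', 'bug'."""
    if isinstance(exc, SqlUserError):
        return "user error"
    if isinstance(exc, SqlUnsupportedError):
        return "unsupported"
    # Anything else, including a bare SqlError, is treated as an internal bug.
    return "bug"
```

A caller that only wants to surface user mistakes can then catch `SqlUserError` alone and let everything else propagate.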





[jira] [Commented] (BEAM-1251) Python 3 Support

2020-01-31 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027744#comment-17027744
 ] 

Valentyn Tymofieiev commented on BEAM-1251:
---

Apache Beam 2.18.0 has been released recently. It adds support for keyword-only 
arguments in Python 3 Beam pipelines (BEAM-5878). Thanks, [~yoshiki.obata], for 
this contribution.
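
For readers unfamiliar with the feature: a keyword-only argument is any parameter declared after a bare `*` in a Python 3 signature. The sketch below is plain Python and does not use the Beam API. `scale` and its `factor` parameter are illustrative names, and `functools.partial` stands in for however a pipeline would bind the extra argument.

```python
from functools import partial


def scale(element, *, factor=1):
    """Multiply `element` by `factor`; `factor` can only be passed by keyword."""
    return element * factor


# Bind the keyword-only argument once, then apply the function per element.
triple = partial(scale, factor=3)
results = [triple(x) for x in [1, 2, 3]]

# Passing `factor` positionally is rejected by the interpreter.
try:
    scale(2, 3)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```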



> Python 3 Support
> 
>
> Key: BEAM-1251
> URL: https://issues.apache.org/jira/browse/BEAM-1251
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Eyad Sibai
>Assignee: Valentyn Tymofieiev
>Priority: Major
> Fix For: 2.11.0
>
>  Time Spent: 30h 10m
>  Remaining Estimate: 0h
>
> FAQ
>  Does Apache Beam support Python 3?
>  - *Yes!*
> Is there any remaining work?
>  - We continue to improve the experience of Python 3 users, add support for 
> new Python minor versions, and phase out support for old ones.
>  Check out the Python SDK roadmap on [how to contribute or report a Python 3 
> issue|https://beam.apache.org/roadmap/python-sdk/#python-3-support]!
> Which SDK version should I use?
>  - For the best experience, use the latest released SDK. For a summary of 
> Py3-related changes, read this thread.
> Help! I am getting a pickling error in StockUnpickler.find_class() on Python 
> 3.
>  - Does the error happen in the load_session call? See BEAM-6158. Are you 
> using a Beam SDK older than 2.17.0? See BEAM-8651.
> My streaming pipelines are stuck on Python 3: 
>  - Are you using a Beam SDK older than 2.17.0? If so, please upgrade to 
> 2.17.0. See BEAM-8651.





[jira] [Resolved] (BEAM-9218) Template staging broken on Beam 2.18.0

2020-01-31 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw resolved BEAM-9218.
---
Fix Version/s: 2.20.0
   Resolution: Fixed

> Template staging broken on Beam 2.18.0
> --
>
> Key: BEAM-9218
> URL: https://issues.apache.org/jira/browse/BEAM-9218
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.18.0
>Reporter: Michael Charkin
>Assignee: Robert Bradshaw
>Priority: Major
> Fix For: 2.20.0
>
>
> Beam 2.18.0 cannot stage Cloud Dataflow templates.
>  
> It looks like it is trying to access the RuntimeValueProvider during staging, 
> causing a 'not accessible' error.
>  
> Repo with code to reproduce the issue: 
> [https://github.com/firemuzzy/dataflow-templates-bug]
>  
> With the help of Stack Overflow, the issue was narrowed down to the latest Beam 
> release rather than the Python version:
> [https://stackoverflow.com/questions/59940069/how-do-you-create-a-google-cloud-dataflow-template-with-python-3?noredirect=1#59940069]
>  





[jira] [Commented] (BEAM-9218) Template staging broken on Beam 2.18.0

2020-01-31 Thread Robert Bradshaw (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027745#comment-17027745
 ] 

Robert Bradshaw commented on BEAM-9218:
---

Thanks, Elias, for verifying this. 

It's possible that some of the work triggered by requirements_file (e.g. 
actually downloading the dependencies and staging them) could be deferred in 
the template case, and simply has not been. I think it's fair to call this a 
(separate) bug. 

> Template staging broken on Beam 2.18.0
> --
>
> Key: BEAM-9218
> URL: https://issues.apache.org/jira/browse/BEAM-9218
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.18.0
>Reporter: Michael Charkin
>Assignee: Robert Bradshaw
>Priority: Major
>
> Beam 2.18.0 cannot stage Cloud Dataflow templates.
>  
> It looks like it is trying to access the RuntimeValueProvider during staging, 
> causing a 'not accessible' error.
>  
> Repo with code to reproduce the issue: 
> [https://github.com/firemuzzy/dataflow-templates-bug]
>  
> With the help of Stack Overflow, the issue was narrowed down to the latest Beam 
> release rather than the Python version:
> [https://stackoverflow.com/questions/59940069/how-do-you-create-a-google-cloud-dataflow-template-with-python-3?noredirect=1#59940069]
>  





[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380151
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:53
Start Date: 31/Jan/20 18:53
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701#issuecomment-580863133
 
 
   Run Spotless PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380151)
Time Spent: 4h 10m  (was: 4h)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> At the moment, a Custom Source is split and serialized sequentially. If 
> there are many splits, processing all of them takes a long time. 
>  
> Example: it takes 2s to calculate the size and serialize a CassandraSource due to 
> connection setup and teardown. With 100+ splits, that is a lot of time spent on 
> a single worker. 
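
The pull request linked above (#10701) proposes caching the size of the table so that it is not recomputed for every split. A minimal sketch of that idea, assuming the size estimate depends only on the table and not on the individual split. `estimate_table_size` and `plan_splits` are hypothetical names, a counter stands in for the connection setup and teardown, and this is not the CassandraIO implementation.

```python
from functools import lru_cache

connections_opened = 0  # stands in for the expensive connect/teardown cycle


@lru_cache(maxsize=None)
def estimate_table_size(keyspace, table):
    """Pretend to query the cluster for a table's size in bytes."""
    global connections_opened
    connections_opened += 1
    return 1_000_000  # placeholder value, not a real measurement


def plan_splits(keyspace, table, num_splits):
    """Return (split_index, bytes_per_split) pairs, reusing the cached size."""
    plans = []
    for i in range(num_splits):
        total = estimate_table_size(keyspace, table)  # cached after first call
        plans.append((i, total // num_splits))
    return plans


plans = plan_splits("ks", "events", 100)
```

With the figures from the description, 100 splits at 2s each is roughly 200s of sequential work. With the size cached, the size lookup is paid once, although serialization still happens per split.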





[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380140=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380140
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:39
Start Date: 31/Jan/20 18:39
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701#issuecomment-580857887
 
 
   Run Spotless PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 380140)
Time Spent: 4h  (was: 3h 50m)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> At the moment, a Custom Source is split and serialized sequentially. If 
> there are many splits, processing all of them takes a long time. 
>  
> Example: it takes 2s to calculate the size and serialize a CassandraSource due to 
> connection setup and teardown. With 100+ splits, that is a lot of time spent on 
> a single worker. 





[jira] [Work logged] (BEAM-9188) Improving speed of splitting for Custom Sources

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9188?focusedWorklogId=380138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380138
 ]

ASF GitHub Bot logged work on BEAM-9188:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:38
Start Date: 31/Jan/20 18:38
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #10701: [BEAM-9188] 
CassandraIO split performance improvement - cache size of the table
URL: https://github.com/apache/beam/pull/10701#issuecomment-580857276
 
 
   retest this please
 



Issue Time Tracking
---

Worklog Id: (was: 380138)
Time Spent: 3h 50m  (was: 3h 40m)

> Improving speed of splitting for Custom Sources
> ---
>
> Key: BEAM-9188
> URL: https://issues.apache.org/jira/browse/BEAM-9188
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Radosław Stankiewicz
>Assignee: Radosław Stankiewicz
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> At the moment, a Custom Source is split and serialized sequentially. If 
> there are many splits, processing all of them takes a long time. 
>  
> Example: it takes 2s to calculate the size and serialize a CassandraSource due to 
> connection setup and teardown. With 100+ splits, that is a lot of time spent on 
> a single worker. 





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380131=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380131
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:26
Start Date: 31/Jan/20 18:26
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10735: [BEAM-8280][BEAM-8629] 
Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#issuecomment-580852304
 
 
   R: @chadrik 
 



Issue Time Tracking
---

Worklog Id: (was: 380131)
Time Spent: 1h 10m  (was: 1h)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-8691) Beam Dependency Update Request: com.google.cloud.bigtable:bigtable-client-core

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8691?focusedWorklogId=380133=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380133
 ]

ASF GitHub Bot logged work on BEAM-8691:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:26
Start Date: 31/Jan/20 18:26
Worklog Time Spent: 10m 
  Work Description: suztomo commented on issue #10714: [BEAM-8691] 
bigtable-client-core dependency upgrade
URL: https://github.com/apache/beam/pull/10714#issuecomment-580852388
 
 
   Dataflow example still fails. This time it's "No space left on device":
   ```
   12:39:08 > Task 
:runners:google-cloud-dataflow-java:examples:preCommitLegacyWorker
   12:39:08 java.io.IOException: No space left on device
   12:39:08 com.esotericsoftware.kryo.KryoException: java.io.IOException: No 
space left on device
   ```
   
https://builds.apache.org/job/beam_PreCommit_Java_Examples_Dataflow_Phrase/268/consoleFull
   
   Is this specific to this PR? I'll confirm the word-count example in my local 
environment with my branch.
   
   @udim Thank you.
 



Issue Time Tracking
---

Worklog Id: (was: 380133)
Time Spent: 6h 20m  (was: 6h 10m)

> Beam Dependency Update Request: com.google.cloud.bigtable:bigtable-client-core
> --
>
> Key: BEAM-8691
> URL: https://issues.apache.org/jira/browse/BEAM-8691
> Project: Beam
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Beam JIRA Bot
>Assignee: Tomo Suzuki
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
>  - 2019-11-15 19:39:51.523448 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-11-19 21:05:43.901882 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-02 12:11:30.163557 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-09 12:10:37.979355 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-23 12:10:39.422837 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2019-12-30 14:06:11.312353 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam Dependency Guide 
> |https://beam.apache.org/contribute/dependencies/]for more information. 
> Do Not Modify The Description Above. 
>  - 2020-01-06 12:09:45.627449 
> -
> Please consider upgrading the dependency 
> com.google.cloud.bigtable:bigtable-client-core. 
> The current version is 1.8.0. The latest version is 1.12.1 
> cc: 
>  Please refer to [Beam 

[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-01-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=380129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-380129
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 31/Jan/20 18:25
Start Date: 31/Jan/20 18:25
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #10735: [BEAM-8280][BEAM-8629] 
Make IOTypeHints immutable
URL: https://github.com/apache/beam/pull/10735#issuecomment-580851957
 
 
   run python 2 postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 380129)
Time Spent: 50m  (was: 40m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279




