[jira] [Work logged] (BEAM-9978) Add offset range restrictions to the Go SDK.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9978?focusedWorklogId=436376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436376
 ]

ASF GitHub Bot logged work on BEAM-9978:


Author: ASF GitHub Bot
Created on: 22/May/20 04:07
Start Date: 22/May/20 04:07
Worklog Time Spent: 10m 
  Work Description: youngoli merged pull request #11763:
URL: https://github.com/apache/beam/pull/11763


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436376)
Time Spent: 1h  (was: 50m)

> Add offset range restrictions to the Go SDK.
> 
>
> Key: BEAM-9978
> URL: https://issues.apache.org/jira/browse/BEAM-9978
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently these are part of the stringsplit example, but they should probably 
> be generalized and moved into the actual SDK, and should have adequate testing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9978) Add offset range restrictions to the Go SDK.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9978?focusedWorklogId=436366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436366
 ]

ASF GitHub Bot logged work on BEAM-9978:


Author: ASF GitHub Bot
Created on: 22/May/20 03:37
Start Date: 22/May/20 03:37
Worklog Time Spent: 10m 
  Work Description: youngoli commented on a change in pull request #11763:
URL: https://github.com/apache/beam/pull/11763#discussion_r429026079



##
File path: sdks/go/pkg/beam/io/rtrackers/offsetrange/offsetrange_test.go
##
@@ -0,0 +1,212 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package offsetrange
+
+import (
+	"fmt"
+	"github.com/google/go-cmp/cmp"
+	"testing"
+)
+
+// TestRestriction_EvenSplits tests various splits and checks that they all
+// follow the contract for EvenSplits. This means that all restrictions are
+// evenly split, that each restriction has at least one element, and that each
+// element is present in the split restrictions.
+func TestRestriction_EvenSplits(t *testing.T) {
+	tests := []struct {
+		rest Restriction
+		num  int64
+	}{
+		{rest: Restriction{Start: 0, End: 21}, num: 4},
+		{rest: Restriction{Start: 21, End: 42}, num: 4},
+		{rest: Restriction{Start: 0, End: 5}, num: 10},
+		{rest: Restriction{Start: 0, End: 21}, num: -1},
+	}
+	for _, test := range tests {
+		test := test
+		t.Run(fmt.Sprintf("(rest[%v, %v], splits = %v)",
+			test.rest.Start, test.rest.End, test.num), func(t *testing.T) {
+			r := test.rest
+
+			// Get the minimum size that a split restriction can be. Max size
+			// should be min + 1. This way we can check the size of each split.
+			num := test.num
+			if num <= 1 {
+				num = 1
+			}
+			min := (r.End - r.Start) / num
+
+			splits := r.EvenSplits(test.num)
+			prevEnd := r.Start
+			for _, split := range splits {
+				size := split.End - split.Start
+				// Check: Each restriction has at least 1 element.
+				if size == 0 {
+					t.Errorf("split restriction [%v, %v] is empty, size must be greater than 0.",
+						split.Start, split.End)
+				}
+				// Check: Restrictions are evenly split.
+				if size != min && size != min+1 {
+					t.Errorf("split restriction [%v, %v] has unexpected size. got: %v, want: %v or %v",
+						split.Start, split.End, size, min, min+1)
+				}
+				// Check: All elements are still in the split restrictions. This
+				// logic assumes that the splits are returned in order which
+				// isn't guaranteed by EvenSplits, but this check is way easier
+				// with the assumption.
+				if split.Start != prevEnd {
+					t.Errorf("restriction range [%v, %v] missing after splits.",
+						prevEnd, split.Start)
+				} else {
+					prevEnd = split.End
+				}
+			}
+			if prevEnd != r.End {
+				t.Errorf("restriction range [%v, %v] missing after splits.",
+					prevEnd, r.End)
+			}
+		})
+	}
+}
+
+// TestTracker_TryClaim validates both success and failure cases for TryClaim.
+func TestTracker_TryClaim(t 
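
The contract the test above checks (pieces of near-equal size, no empty pieces, no element lost) can be illustrated with a small standalone sketch. This is not the implementation from the pull request, which is not shown in this message; the `Restriction` type here is a hypothetical stand-in that assumes a half-open range `[Start, End)` with `int64` bounds, and the integer-arithmetic splitting scheme is only one way to satisfy the contract.

```go
package main

import "fmt"

// Restriction is a hypothetical half-open offset range [Start, End).
// It is a stand-in for illustration, not the SDK's type.
type Restriction struct {
	Start, End int64
}

// EvenSplits divides r into at most num contiguous, non-empty pieces whose
// sizes differ by at most one. A num of 1 or less returns r unchanged.
func (r Restriction) EvenSplits(num int64) []Restriction {
	if num <= 1 {
		return []Restriction{r}
	}
	size := r.End - r.Start
	var splits []Restriction
	for i := int64(0); i < num; i++ {
		s := Restriction{
			Start: r.Start + i*size/num,
			End:   r.Start + (i+1)*size/num,
		}
		// Drop pieces that would be empty (happens when num > size).
		if s.End > s.Start {
			splits = append(splits, s)
		}
	}
	return splits
}

func main() {
	fmt.Println(Restriction{Start: 0, End: 21}.EvenSplits(4)) // [{0 5} {5 10} {10 15} {15 21}]
	fmt.Println(len(Restriction{Start: 0, End: 5}.EvenSplits(10)))
}
```

Because each boundary is computed as `Start + i*size/num`, consecutive pieces share an endpoint, so the pieces come back in order and cover the whole range, which is the ordering assumption the test relies on.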

[jira] [Work logged] (BEAM-9978) Add offset range restrictions to the Go SDK.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9978?focusedWorklogId=436365&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436365
 ]

ASF GitHub Bot logged work on BEAM-9978:


Author: ASF GitHub Bot
Created on: 22/May/20 03:36
Start Date: 22/May/20 03:36
Worklog Time Spent: 10m 
  Work Description: youngoli commented on a change in pull request #11763:
URL: https://github.com/apache/beam/pull/11763#discussion_r429025895




[jira] [Work logged] (BEAM-9935) Resolve differences in allowed_split_point implementations

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9935?focusedWorklogId=436336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436336
 ]

ASF GitHub Bot logged work on BEAM-9935:


Author: ASF GitHub Bot
Created on: 22/May/20 03:01
Start Date: 22/May/20 03:01
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11791:
URL: https://github.com/apache/beam/pull/11791#issuecomment-632451757


   For reference, the tests I'm trying to match: 
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/bundle_processor_test.py#L61





Issue Time Tracking
---

Worklog Id: (was: 436336)
Time Spent: 1h 50m  (was: 1h 40m)

> Resolve differences in allowed_split_point implementations
> --
>
> Key: BEAM-9935
> URL: https://issues.apache.org/jira/browse/BEAM-9935
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> [Java SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223]
>  doesn't support it yet which is also safe.
> [Go SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273]
>  only supports splits if points are specified and it doesn't use the fraction 
> at all.
> [Python SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947]
>  ignores the split points, meaning that it may return an invalid split 
> location based upon the runner's limitations.
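
To make the difference between the two inputs concrete, here is a minimal sketch of a split decision that honors both a fraction of the remainder and an optional list of allowed split points. It is not the logic of any Beam SDK harness; the function name `chooseSplit`, its parameters, and the "closest allowed point" tie-breaking are all assumptions made for illustration, and it ignores sub-element progress entirely.

```go
package main

import "fmt"

// abs returns the absolute value of x.
func abs(x int64) int64 {
	if x < 0 {
		return -x
	}
	return x
}

// chooseSplit returns the index of the first element handed back to the
// runner, or false if no valid split exists. Hypothetical helper: current is
// the index of the element being processed, total is the number of elements,
// fraction is the share of the remaining elements to keep, and allowed
// optionally restricts which indices may be used.
func chooseSplit(current, total int64, fraction float64, allowed []int64) (int64, bool) {
	lo, hi := current+1, total-1 // a split must leave work on both sides
	if lo > hi {
		return 0, false
	}
	if fraction < 0 {
		fraction = 0
	} else if fraction > 1 {
		fraction = 1
	}
	remainder := total - current - 1
	desired := lo + int64(fraction*float64(remainder))
	if desired > hi {
		desired = hi
	}
	if len(allowed) == 0 {
		return desired, true // no restriction: use the fraction directly
	}
	// Otherwise pick the allowed point in range that is closest to desired.
	best, found := int64(0), false
	for _, p := range allowed {
		if p < lo || p > hi {
			continue
		}
		if !found || abs(p-desired) < abs(best-desired) {
			best, found = p, true
		}
	}
	return best, found
}

func main() {
	fmt.Println(chooseSplit(2, 10, 0.5, nil))            // 6 true
	fmt.Println(chooseSplit(2, 10, 0.5, []int64{4, 8}))  // 4 true
	fmt.Println(chooseSplit(2, 10, 0.5, []int64{1, 10})) // 0 false
}
```

In these terms, the Go behavior described above corresponds to ignoring `fraction`, and the Python behavior corresponds to ignoring `allowed`.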





[jira] [Work logged] (BEAM-9935) Resolve differences in allowed_split_point implementations

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9935?focusedWorklogId=436334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436334
 ]

ASF GitHub Bot logged work on BEAM-9935:


Author: ASF GitHub Bot
Created on: 22/May/20 02:59
Start Date: 22/May/20 02:59
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #11791:
URL: https://github.com/apache/beam/pull/11791#issuecomment-632451211


   R: @lostluck 
   CC: @robertwb @lukecwik @boyuanzz 





Issue Time Tracking
---

Worklog Id: (was: 436334)
Time Spent: 1h 40m  (was: 1.5h)

> Resolve differences in allowed_split_point implementations
> --
>
> Key: BEAM-9935
> URL: https://issues.apache.org/jira/browse/BEAM-9935
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: P2
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> [Java SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/BeamFnDataReadRunner.java#L223]
>  doesn't support it yet which is also safe.
> [Go SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/go/pkg/beam/core/runtime/exec/datasource.go#L273]
>  only supports splits if points are specified and it doesn't use the fraction 
> at all.
> [Python SDK 
> harness|https://github.com/apache/beam/blob/d82d061aa303430f3d2853f397f3130fae6200cd/sdks/python/apache_beam/runners/worker/bundle_processor.py#L947]
>  ignores the split points, meaning that it may return an invalid split 
> location based upon the runner's limitations.





[jira] [Work logged] (BEAM-9935) Resolve differences in allowed_split_point implementations

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9935?focusedWorklogId=436333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436333
 ]

ASF GitHub Bot logged work on BEAM-9935:


Author: ASF GitHub Bot
Created on: 22/May/20 02:58
Start Date: 22/May/20 02:58
Worklog Time Spent: 10m 
  Work Description: youngoli opened a new pull request #11791:
URL: https://github.com/apache/beam/pull/11791


   Adds code to more closely align with the implementations of splitting in
   Python and Java. Note that not all cases are implemented. There is no
   measurement of sub-element progress yet, nor is there sub-element (SDF)
   splitting yet.
   
   
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
   - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
   - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
   - [ ] Update `CHANGES.md` with noteworthy changes.
   - [x] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[jira] [Work logged] (BEAM-9946) Enhance Partition transform to provide partitionfn with SideInputs

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9946?focusedWorklogId=436331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436331
 ]

ASF GitHub Bot logged work on BEAM-9946:


Author: ASF GitHub Bot
Created on: 22/May/20 02:43
Start Date: 22/May/20 02:43
Worklog Time Spent: 10m 
  Work Description: darshanj commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-632447130


   @apilloud @aaltay Looks like passing a class instead of partitionFn makes 
the Partition transform unserializable. I have reverted the changes made for that 
suggestion. Serializing a function object or a class should be OK in terms of size, 
I feel.
   
   Can you please rerun the tests?





Issue Time Tracking
---

Worklog Id: (was: 436331)
Remaining Estimate: 93.5h  (was: 93h 40m)
Time Spent: 2.5h  (was: 2h 20m)

> Enhance Partition transform to provide partitionfn with SideInputs
> --
>
> Key: BEAM-9946
> URL: https://issues.apache.org/jira/browse/BEAM-9946
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: P2
>   Original Estimate: 96h
>  Time Spent: 2.5h
>  Remaining Estimate: 93.5h
>
> Currently the _Partition_ transform can partition a collection into n collections, 
> but _PartitionFn_ sees only the _element_ value when deciding which partition a 
> particular element belongs to.
> {code:java}
> public interface PartitionFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions);
> }
> public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> partitionFn) {
>   return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
> }
> {code}
> It would be useful to introduce a new API that also provides _sideInputs_ 
> to the partition function. Users would then be able to write logic that uses both the 
> _element_ value and the _sideInputs_ to decide which partition a particular element 
> belongs to.
> Option-1: Proposed new API:
> {code:java}
> public interface PartitionWithSideInputsFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions, Context c);
> }
> public static <T> Partition<T> of(int numPartitions, 
>     PartitionWithSideInputsFn<? super T> partitionFn, Requirements requirements) {
>   ...
> }
> {code}
> Users can use either of the two APIs, depending on their partitioning logic.
> Option-2: Redesign the old API with a builder pattern that can optionally take 
> a _Requirements_ with _sideInputs_. Deprecate the old API.
> {code:java}
> // using side views
> Partition.into(numberOfPartitions).via(
>     fn(
>       (input, c) -> {
>         // use c.sideInput(view)
>         // use input
>         // return partition number
>       }, requiresSideInputs(view))
> )
> // without using side views
> Partition.into(numberOfPartitions).via(
>     fn((input, c) -> {
>       // use input
>       // return partition number
>     })
> )
> {code}
>  
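
The idea behind both options can be sketched independently of the Beam Java API. The following plain-Go illustration does not use Beam at all; `partitionFn`, `partitionWithSide`, and `belowOrAbove` are hypothetical names, and a single `float64` stands in for whatever a side input would supply (here, a mean computed elsewhere). It only shows a partition function that looks at both the element and an extra value when choosing a partition.

```go
package main

import "fmt"

// partitionFn mirrors the shape of the proposed
// partitionFor(elem, numPartitions, context) call; side stands in for a
// value that would come from a side input.
type partitionFn func(elem, numPartitions int, side float64) int

// partitionWithSide routes each element to the partition chosen by fn and
// rejects partition numbers outside [0, numPartitions).
func partitionWithSide(elems []int, numPartitions int, side float64, fn partitionFn) ([][]int, error) {
	if numPartitions <= 0 {
		return nil, fmt.Errorf("numPartitions must be positive, got %d", numPartitions)
	}
	parts := make([][]int, numPartitions)
	for _, e := range elems {
		p := fn(e, numPartitions, side)
		if p < 0 || p >= numPartitions {
			return nil, fmt.Errorf("partition %d out of range [0, %d) for element %d", p, numPartitions, e)
		}
		parts[p] = append(parts[p], e)
	}
	return parts, nil
}

// belowOrAbove sends elements below the side value to partition 0 and all
// others to partition 1.
func belowOrAbove(elem, numPartitions int, mean float64) int {
	if float64(elem) < mean {
		return 0
	}
	return 1
}

func main() {
	parts, err := partitionWithSide([]int{1, 5, 9}, 2, 5.0, belowOrAbove)
	fmt.Println(parts, err) // [[1] [5 9]] <nil>
}
```

Without the extra argument, `belowOrAbove` would have to hard-code the threshold, which is the limitation of the element-only _PartitionFn_ that the issue describes.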





[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Pulasthi Wickramasinghe (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113696#comment-17113696
 ] 

Pulasthi Wickramasinghe commented on BEAM-7304:
---

[~iemejia] I will check with Java 11. Twister2 does support Java 11, so it should 
work. I will test and let you know. Would we be able to merge this into 
master for now so we can get it into the 2.23.0 release? No worries about missing 
the 2.22.0 cut :) 

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]





[jira] [Work logged] (BEAM-9785) Add PostCommit suite for Python 3.8

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9785?focusedWorklogId=436329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436329
 ]

ASF GitHub Bot logged work on BEAM-9785:


Author: ASF GitHub Bot
Created on: 22/May/20 02:15
Start Date: 22/May/20 02:15
Worklog Time Spent: 10m 
  Work Description: epicfaace opened a new pull request #11788:
URL: https://github.com/apache/beam/pull/11788


   Add Python 3.8 postcommit tests.
   
   
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
   - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
   - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
   - [ ] Update `CHANGES.md` with noteworthy changes.
   - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   

[jira] [Work logged] (BEAM-9948) Add Firefly to website

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9948?focusedWorklogId=436327&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436327
 ]

ASF GitHub Bot logged work on BEAM-9948:


Author: ASF GitHub Bot
Created on: 22/May/20 02:06
Start Date: 22/May/20 02:06
Worklog Time Spent: 10m 
  Work Description: aijamalnk commented on pull request #11780:
URL: https://github.com/apache/beam/pull/11780#issuecomment-632437739


   @iemejia could you review please?





Issue Time Tracking
---

Worklog Id: (was: 436327)
Time Spent: 20m  (was: 10m)

> Add Firefly to website
> --
>
> Key: BEAM-9948
> URL: https://issues.apache.org/jira/browse/BEAM-9948
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Kyle Weaver
>Assignee: Aizhamal Nurmamat kyzy
>Priority: P2
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Beam has an adorable new mascot, the Firefly: 
> https://blogs.apache.org/foundation/entry/success-at-apache-bringing-the
> We should add a usage guide for the Firefly to the website along with our 
> logos. (The blog post linked contains a model sheet for the Firefly we can 
> use.)





[jira] [Work logged] (BEAM-9948) Add Firefly to website

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9948?focusedWorklogId=436326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436326
 ]

ASF GitHub Bot logged work on BEAM-9948:


Author: ASF GitHub Bot
Created on: 22/May/20 02:06
Start Date: 22/May/20 02:06
Worklog Time Spent: 10m 
  Work Description: aijamalnk commented on pull request #11780:
URL: https://github.com/apache/beam/pull/11780#issuecomment-632437689


   Staged site: 
http://apache-beam-website-pull-requests.storage.googleapis.com/11780/community/mascot/index.html





Issue Time Tracking
---

Worklog Id: (was: 436326)
Remaining Estimate: 0h
Time Spent: 10m

> Add Firefly to website
> --
>
> Key: BEAM-9948
> URL: https://issues.apache.org/jira/browse/BEAM-9948
> Project: Beam
>  Issue Type: New Feature
>  Components: website
>Reporter: Kyle Weaver
>Assignee: Aizhamal Nurmamat kyzy
>Priority: P2
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Beam has an adorable new mascot, the Firefly: 
> https://blogs.apache.org/foundation/entry/success-at-apache-bringing-the
> We should add a usage guide for the Firefly to the website along with our 
> logos. (The blog post linked contains a model sheet for the Firefly we can 
> use.)





[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=436323&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436323
 ]

ASF GitHub Bot logged work on BEAM-9977:


Author: ASF GitHub Bot
Created on: 22/May/20 01:52
Start Date: 22/May/20 01:52
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-632434300


   Thanks for answering and for the clear explanation @boyuanzz 
   
   I would have tended towards having the additional complexity in 
`OffsetRangeTracker` just because it is the de-facto reference, but I 
understand the different preference.
   
   Nice to see this and the Kafka SDF happening, congrats!





Issue Time Tracking
---

Worklog Id: (was: 436323)
Time Spent: 4.5h  (was: 4h 20m)

> Build Kafka Read on top of Java SplittableDoFn
> --
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: P2
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=436321=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436321
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 22/May/20 01:50
Start Date: 22/May/20 01:50
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #11070:
URL: https://github.com/apache/beam/pull/11070#issuecomment-632433800


   I've squashed the commits to more easily rename the file. Optimistically 
merging by May 26! :)





Issue Time Tracking
---

Worklog Id: (was: 436321)
Time Spent: 14h 40m  (was: 14.5h)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: P2
> Fix For: 2.21.0
>
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=436314=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436314
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 22/May/20 01:22
Start Date: 22/May/20 01:22
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #11070:
URL: https://github.com/apache/beam/pull/11070#discussion_r428996230



##
File path: website/src/_posts/2020-03-06-python-typing.md
##
@@ -0,0 +1,117 @@
+---
+layout: post
+title:  "Python SDK Typing Changes"
+date:   2020-03-06 00:00:01 -0800
+excerpt_separator: 
+categories: blog python typing
+authors:
+  - chadrik
+  - udim
+
+---
+
+
+TODO excerpt
+
+
+
+Python supports type annotations on functions (PEP 484). Static type checkers,
+such as mypy, are used to verify adherence to these types.
+For example:
+```py
+def f(v: int) -> int:
+  return v[0]
+```
+Running mypy on the above code will give the error:
+`Value of type "int" is not indexable`.
+
+We've recently made changes to Beam in 2 areas:
+
+Adding type hints throughout Beam. TODO expand
+
+Second, we've added support for Python 3 type annotations. This allows SDK
+users to specify a DoFn's type hints in one place. 
+We've also expanded Beam's support of `typing` module types.
+
+For more background see: 
+[Ensuring Python Type Safety](https://beam.apache.org/documentation/sdks/python-type-safety/).
+
+# Beam Is Typed
+
+TODO
+
+# New Ways to Annotate
+
+## Python 3 Syntax Annotations
+
+Coming in Beam 2.21 (BEAM-8280), you will be able to use Python annotation
+syntax to specify input and output types.
+
+For example, this new form:
+```py
+class MyDoFn(beam.DoFn):
+  def process(self, element: int) -> typing.Text:
+    yield str(element)
+```
+is equivalent to this:
+```py
+@beam.typehints.with_input_types(int)
+@beam.typehints.with_output_types(typing.Text)
+class MyDoFn(beam.DoFn):
+  def process(self, element):
+    yield str(element)
+```
+
+One of the advantages of the new form is that you may already be using it
+in tandem with a static type checker such as mypy, thus getting additional
+type checking for free.
+
+This feature will be enabled by default, and there will be 2 mechanisms in
+place to disable it:
+1. Calling `apache_beam.typehints.disable_type_annotations()` before pipeline
+construction will disable the new feature completely.
+1. Decorating a function with `@apache_beam.typehints.no_annotations` will
+tell Beam to ignore annotations for it. 
+ 
+Uses of Beam's `with_input_types`, `with_output_types` methods and decorators will
+still work and take precedence over annotations.
+
+Sidebar:
+
+> You might ask: couldn't we use mypy to type check Beam pipelines? The main issue
+is that such a tool would have to understand type relations between
+pipeline graph nodes, e.g., that the type of element passed to a transform
+should be consistent with its annotated input type.

Review comment:
   I'm thinking of rephrasing this to just mention dynamically generated 
pipelines.
   
   As an aside, do you think that adding types to PCollections (such as 
PCollection[Tuple[K, Iterable[V]]]) would obviate the need for a plugin?
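
   To make the aside concrete, here is a minimal sketch of what a generically typed collection could look like to a static checker. `PColl` and `group_by_key` are hypothetical stand-ins written with the standard library only; they are not Beam classes, and whether this approach would actually remove the need for a plugin remains the open question above.

```python
from typing import Callable, Dict, Generic, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
U = TypeVar("U")
K = TypeVar("K")
V = TypeVar("V")


class PColl(Generic[T]):
    """Hypothetical stand-in for a typed PCollection (not a Beam class)."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items: List[T] = list(items)

    def map(self, fn: Callable[[T], U]) -> "PColl[U]":
        # A static checker can verify that fn accepts T and infer PColl[U].
        return PColl(fn(x) for x in self._items)

    def to_list(self) -> List[T]:
        return list(self._items)


def group_by_key(pc: PColl[Tuple[K, V]]) -> PColl[Tuple[K, List[V]]]:
    """Group values by key, preserving first-seen key order."""
    grouped: Dict[K, List[V]] = {}
    for key, value in pc.to_list():
        grouped.setdefault(key, []).append(value)
    return PColl(grouped.items())


result = group_by_key(PColl([("a", 1), ("b", 2), ("a", 3)])).to_list()
```

   With annotations like these, a checker such as mypy could in principle flag `group_by_key(PColl([1, 2, 3]))`, because `int` is not a `Tuple[K, V]`.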


[jira] [Work logged] (BEAM-8280) re-enable IOTypeHints.from_callable

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8280?focusedWorklogId=436310=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436310
 ]

ASF GitHub Bot logged work on BEAM-8280:


Author: ASF GitHub Bot
Created on: 22/May/20 01:11
Start Date: 22/May/20 01:11
Worklog Time Spent: 10m 
  Work Description: udim commented on a change in pull request #11070:
URL: https://github.com/apache/beam/pull/11070#discussion_r428993726



##
File path: website/src/_posts/2020-03-06-python-typing.md
##
@@ -0,0 +1,117 @@
+---
+layout: post
+title:  "Python SDK Typing Changes"
+date:   2020-03-06 00:00:01 -0800
+excerpt_separator: 
+categories: blog python typing
+authors:
+  - chadrik
+  - udim
+
+---
+
+
+TODO excerpt
+
+
+
+Python supports type annotations on functions (PEP 484). Static type checkers,
+such as mypy, are used to verify adherence to these types.
+For example:
+```py
+def f(v: int) -> int:
+  return v[0]
+```
+Running mypy on the above code will give the error:
+`Value of type "int" is not indexable`.
+
+We've recently made changes to Beam in 2 areas:
+
+Adding type hints throughout Beam. TODO expand
+
+Second, we've added support for Python 3 type annotations. This allows SDK
+users to specify a DoFn's type hints in one place. 
+We've also expanded Beam's support of `typing` module types.
+
+For more background see: 
+[Ensuring Python Type Safety](https://beam.apache.org/documentation/sdks/python-type-safety/).
+
+# Beam Is Typed
+
+TODO
+
+# New Ways to Annotate
+
+## Python 3 Syntax Annotations
+
+Coming in Beam 2.21 (BEAM-8280), you will be able to use Python annotation
+syntax to specify input and output types.
+
+For example, this new form:
+```py
+class MyDoFn(beam.DoFn):
+  def process(self, element: int) -> typing.Text:
+    yield str(element)
+```
+is equivalent to this:
+```py
+@beam.typehints.with_input_types(int)
+@beam.typehints.with_output_types(typing.Text)
+class MyDoFn(beam.DoFn):
+  def process(self, element):
+    yield str(element)
+```
+
+One of the advantages of the new form is that you may already be using it
+in tandem with a static type checker such as mypy, thus getting additional
+type checking for free.

Review comment:
   Changed.
   
   Runtime is a relative term. :)
   In Beam I like to say that we have static type checking during pipeline 
construction, and runtime type checking when the pipeline is running (the 
latter is off by default).
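
   The equivalence between the annotated and decorated forms quoted above can be illustrated without Beam. The following is a minimal sketch, using only the standard library, of how input and output types might be read from a callable's annotations. `hints_from_callable` is a hypothetical helper for illustration, not Beam's `IOTypeHints.from_callable`, and the two classes are plain stand-ins for `beam.DoFn` subclasses.

```python
import inspect
import typing


def hints_from_callable(fn):
    """Return (input_types, output_type) read from fn's annotations.

    Hypothetical helper for illustration only; this is not Beam's API.
    Unannotated parameters and return values fall back to typing.Any.
    """
    hints = typing.get_type_hints(fn)
    params = [
        name for name in inspect.signature(fn).parameters if name != "self"
    ]
    input_types = tuple(hints.get(name, typing.Any) for name in params)
    output_type = hints.get("return", typing.Any)
    return input_types, output_type


class MyDoFn:
    # Stand-in for a beam.DoFn subclass; no Beam import is needed here.
    def process(self, element: int) -> typing.Iterator[str]:
        yield str(element)


class UntypedDoFn:
    def process(self, element):
        yield str(element)


annotated = hints_from_callable(MyDoFn.process)
untyped = hints_from_callable(UntypedDoFn.process)
```

   Because the hints live on the function itself, a static checker and a pipeline-construction-time check can both read the same annotations, which is the "one place" advantage the post describes.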







Issue Time Tracking
---

Worklog Id: (was: 436310)
Time Spent: 14h 20m  (was: 14h 10m)

> re-enable IOTypeHints.from_callable
> ---
>
> Key: BEAM-8280
> URL: https://issues.apache.org/jira/browse/BEAM-8280
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: P2
> Fix For: 2.21.0
>
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> See https://issues.apache.org/jira/browse/BEAM-8279





[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113661#comment-17113661
 ] 

Ismaël Mejía commented on BEAM-7304:


[~pulasthisupun] Have you tried to run the runner tests with Java 11? I ask 
with possible maintainability issues in mind, now that Beam is starting to 
get better Java 11 support.

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]





[jira] [Work logged] (BEAM-10063) Run pandas doctests for Beam dataframes API.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10063?focusedWorklogId=436308=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436308
 ]

ASF GitHub Bot logged work on BEAM-10063:
-

Author: ASF GitHub Bot
Created on: 22/May/20 00:56
Start Date: 22/May/20 00:56
Worklog Time Spent: 10m 
  Work Description: robertwb opened a new pull request #11787:
URL: https://github.com/apache/beam/pull/11787


   
   
   

[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113660#comment-17113660
 ] 

Ismaël Mejía commented on BEAM-7304:


Don't worry Brian, I did not interpret it as pedantic; my answer was just to 
make clear that I am aware of the rules. I think we should not skip the rules, 
to avoid people asking for this in every release and making the task of the 
release manager harder. But when I put on the user/contributor hat I also see 
the benefit of having more features. Maybe it is because I come from the old 
world where releases were for features and not just scheduled in time.

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]





[jira] [Resolved] (BEAM-9688) subprocess_server_test.py fails on Windows platform

2020-05-21 Thread Valentyn Tymofieiev (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev resolved BEAM-9688.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> subprocess_server_test.py fails on Windows platform
> ---
>
> Key: BEAM-9688
> URL: https://issues.apache.org/jira/browse/BEAM-9688
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: P2
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {noformat}
> File "C:\projects\apache_beam\utils\subprocess_server_test.py", line 65, in 
> test_gradle_jar_dev
> 'sdks:java:fake:fatJar', version='VERSION.dev')
>   File "C:\python27_x64\Lib\unittest\case.py", line 127, in __exit__
> (expected_regexp.pattern, str(exc_value)))
> AssertionError: 
> "sdks/java/fake/build/libs/beam-sdks-java-fake-VERSION-SNAPSHOT.jar not 
> found." does not match 
> "sdks\java\fake\build\libs\beam-sdks-java-fake-VERSION-SNAPSHOT.jar not 
> found. Please build the server with
>   cd C:\projects; ./gradlew sdks:java:fake:fatJar"
> {noformat}
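
The assertion failure above comes from matching a pattern written with forward slashes against a message whose path was built with the Windows separator. The following is a minimal sketch of that mismatch and of one platform-independent way to build the expected pattern. It uses `ntpath` to simulate Windows path joining on any OS; the path components come from the traceback, the variable names are illustrative, and this is not necessarily how the issue was fixed.

```python
import ntpath
import re

parts = ["sdks", "java", "fake", "build", "libs",
         "beam-sdks-java-fake-VERSION-SNAPSHOT.jar"]

# What code using os.path.join produces on Windows (ntpath is os.path there).
actual_message = ntpath.join(*parts) + " not found."

# A pattern hard-coded with forward slashes does not match on Windows.
hardcoded_pattern = "/".join(parts) + " not found."
hardcoded_matches = re.search(hardcoded_pattern, actual_message) is not None

# Building the pattern with the same join and escaping it does match.
# re.escape is needed because backslashes are regex metacharacters.
expected_pattern = re.escape(ntpath.join(*parts) + " not found.")
portable_matches = re.search(expected_pattern, actual_message) is not None
```

In a real test, `os.path.join` would take the place of `ntpath.join`, so the expected pattern follows whichever separator the platform uses.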





[jira] [Work logged] (BEAM-9974) beam_PostRelease_NightlySnapshot failing

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9974?focusedWorklogId=436305=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436305
 ]

ASF GitHub Bot logged work on BEAM-9974:


Author: ASF GitHub Bot
Created on: 22/May/20 00:24
Start Date: 22/May/20 00:24
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11786:
URL: https://github.com/apache/beam/pull/11786#issuecomment-632412040









Issue Time Tracking
---

Worklog Id: (was: 436305)
Time Spent: 20m  (was: 10m)

> beam_PostRelease_NightlySnapshot failing
> 
>
> Key: BEAM-9974
> URL: https://issues.apache.org/jira/browse/BEAM-9974
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Brian Hulette
>Priority: P1
>  Labels: currently-failing
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Another failure mode:
> 07:02:29 > Task 
> :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow
> 07:02:29 [ERROR] Failed to execute goal 
> org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
> word-count-beam: An exception occured while executing the Java class. No 
> filesystem found for scheme gs -> [Help 1]
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 07:02:29 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 07:02:29 [ERROR] Failed to execute goal 
> org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
> word-count-beam: An exception occured while executing the Java class. No 
> filesystem found for scheme gs -> [Help 1]
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 07:02:29 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 07:02:29 [ERROR] Failed command
> 07:02:29 
> 07:02:29 > Task 
> :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow FAILED





[jira] [Work logged] (BEAM-9974) beam_PostRelease_NightlySnapshot failing

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9974?focusedWorklogId=436304=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436304
 ]

ASF GitHub Bot logged work on BEAM-9974:


Author: ASF GitHub Bot
Created on: 22/May/20 00:21
Start Date: 22/May/20 00:21
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit opened a new pull request #11786:
URL: https://github.com/apache/beam/pull/11786



[jira] [Assigned] (BEAM-9974) beam_PostRelease_NightlySnapshot failing

2020-05-21 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette reassigned BEAM-9974:
---

Assignee: Brian Hulette

> beam_PostRelease_NightlySnapshot failing
> 
>
> Key: BEAM-9974
> URL: https://issues.apache.org/jira/browse/BEAM-9974
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Brian Hulette
>Priority: P1
>  Labels: currently-failing
>
> Another failure mode:
> 07:02:29 > Task 
> :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow
> 07:02:29 [ERROR] Failed to execute goal 
> org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
> word-count-beam: An exception occured while executing the Java class. No 
> filesystem found for scheme gs -> [Help 1]
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 07:02:29 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 07:02:29 [ERROR] Failed to execute goal 
> org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project 
> word-count-beam: An exception occured while executing the Java class. No 
> filesystem found for scheme gs -> [Help 1]
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] To see the full stack trace of the errors, re-run Maven with 
> the -e switch.
> 07:02:29 [ERROR] Re-run Maven using the -X switch to enable full debug 
> logging.
> 07:02:29 [ERROR] 
> 07:02:29 [ERROR] For more information about the errors and possible 
> solutions, please read the following articles:
> 07:02:29 [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> 07:02:29 [ERROR] Failed command
> 07:02:29 
> 07:02:29 > Task 
> :runners:google-cloud-dataflow-java:runMobileGamingJavaDataflow FAILED





[jira] [Work logged] (BEAM-10063) Run pandas doctests for Beam dataframes API.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10063?focusedWorklogId=436301=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436301
 ]

ASF GitHub Bot logged work on BEAM-10063:
-

Author: ASF GitHub Bot
Created on: 22/May/20 00:11
Start Date: 22/May/20 00:11
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11785:
URL: https://github.com/apache/beam/pull/11785#issuecomment-632408257


   R: @TheNeuralBit 





Issue Time Tracking
---

Worklog Id: (was: 436301)
Time Spent: 20m  (was: 10m)

> Run pandas doctests for Beam dataframes API.
> 
>
> Key: BEAM-10063
> URL: https://issues.apache.org/jira/browse/BEAM-10063
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: P2
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-10063) Run pandas doctests for Beam dataframes API.

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10063?focusedWorklogId=436300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436300
 ]

ASF GitHub Bot logged work on BEAM-10063:
-

Author: ASF GitHub Bot
Created on: 22/May/20 00:10
Start Date: 22/May/20 00:10
Worklog Time Spent: 10m 
  Work Description: robertwb opened a new pull request #11785:
URL: https://github.com/apache/beam/pull/11785


   It's not a perfect signal, but will still cover a lot of doctests.
   (It's also a bit hacky, but the TestRunner doesn't offer very good hooks for
   customization here.)
   
   
   
   
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=436299=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436299
 ]

ASF GitHub Bot logged work on BEAM-9977:


Author: ASF GitHub Bot
Created on: 22/May/20 00:10
Start Date: 22/May/20 00:10
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-632408026


   > What is the intrinsic limitation that did not allow old 
`OffsetRangeTracker` to be refactored for this use case? or why we want to have 
both?
   > 
   `GrowableOffsetRangeTracker` and `OffsetRangeTracker` apply to different 
kinds of `OffsetRange`. `GrowableOffsetRangeTracker` is for an `OffsetRange` 
whose end can change during execution, which mostly happens in the streaming 
case. `OffsetRangeTracker` is for a range whose exact end is known, which is a 
good fit for batch. I didn't merge them into one because I don't want to 
introduce additional complexity to `OffsetRangeTracker`. It's doable to treat 
the dynamic one as the general case, with the fixed-end range as a special 
case, but I want to keep them separate to reduce confusion.
   
   > Does this mean also that we might need `GrowableBytekeyRangeTracker` and 
basically 'dynamic' versions for every `RestrictionTracker` ?
   
   I think it will depend on the actual usage. If we have an application 
scenario that requires a dynamic version, then I would say yes. 
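
   To make the distinction in the comment above concrete, here is a minimal 
sketch in plain Python. It is not Beam's implementation: the class names 
`FixedEndTracker` and `GrowableEndTracker` and the `estimate_end` callback are 
hypothetical, and the real trackers also handle splitting and progress 
reporting. The sketch only shows a tracker whose end is fixed up front next to 
one that re-reads its end on every claim.

```python
from typing import Callable, Optional


class FixedEndTracker:
    """Hypothetical tracker for [start, end) where end is known up front."""

    def __init__(self, start: int, end: int) -> None:
        self.start = start
        self._end = end
        self.last_claimed: Optional[int] = None

    def current_end(self) -> int:
        return self._end

    def try_claim(self, offset: int) -> bool:
        # Claims must move forward and never fall before the range start.
        floor = self.start if self.last_claimed is None else self.last_claimed + 1
        if offset < floor:
            raise ValueError(f"offset {offset} is before {floor}")
        # An offset at or past the current end is rejected, not an error.
        if offset >= self.current_end():
            return False
        self.last_claimed = offset
        return True


class GrowableEndTracker(FixedEndTracker):
    """Hypothetical tracker whose end is re-estimated on every call."""

    def __init__(self, start: int, estimate_end: Callable[[], int]) -> None:
        super().__init__(start, start)
        self._estimate_end = estimate_end

    def current_end(self) -> int:
        # Ask the (assumed) estimator each time; never report an end below start.
        return max(self.start, self._estimate_end())
```

   With a fixed end, a rejected claim stays rejected. With the growable 
variant, the same offset can be claimed later once the estimator reports a 
larger end, which is the streaming behaviour described above.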



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436299)
Time Spent: 4h 20m  (was: 4h 10m)

> Build Kafka Read on top of Java SplittableDoFn
> --
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: P2
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113643#comment-17113643
 ] 

Brian Hulette commented on BEAM-7304:
-

[~iemejia] sorry I wasn't trying to be pedantic, just sharing the policy with 
[~pulasthisupun]

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10038) Add script to mass-comment Jenkins triggers on PR

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10038?focusedWorklogId=436298=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436298
 ]

ASF GitHub Bot logged work on BEAM-10038:
-

Author: ASF GitHub Bot
Created on: 22/May/20 00:03
Start Date: 22/May/20 00:03
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11755:
URL: https://github.com/apache/beam/pull/11755#issuecomment-632405988


   > Yes, we probably need those Jenkins uber jobs (Flink too). Another use 
case where this script would prove really handy is users asking committers to 
trigger tests; it may be worth announcing it on the ML. Of course this is not 
its intended goal, but realistically it is a more useful one until we have our 
own Jenkins instances. Maybe we can improve it to make that task easier, WDYT?
   
   What kind of improvements do you have in mind? It's going to be a little bit 
of work for the user no matter what to specify the list of jobs, so the script 
is only going to be useful when there are many jobs (as is the case with the 
release) or if they need to be run very often for some reason (which is not 
common).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436298)
Time Spent: 1h 50m  (was: 1h 40m)

> Add script to mass-comment Jenkins triggers on PR
> -
>
> Key: BEAM-10038
> URL: https://issues.apache.org/jira/browse/BEAM-10038
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: P2
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This is a work in progress, it just needs to be touched up and added to the 
> Beam repo:
> https://gist.github.com/Ardagan/13e6031e8d1c9ebbd3029bf365c1a517



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10038) Add script to mass-comment Jenkins triggers on PR

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10038?focusedWorklogId=436296=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436296
 ]

ASF GitHub Bot logged work on BEAM-10038:
-

Author: ASF GitHub Bot
Created on: 21/May/20 23:59
Start Date: 21/May/20 23:59
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11755:
URL: https://github.com/apache/beam/pull/11755#issuecomment-632404805


   Yes, we probably need those Jenkins uber jobs (Flink too). Another use case 
where this script would prove really handy is users asking committers to 
trigger tests; it may be worth announcing it on the ML. Of course this is not 
its intended goal, but realistically it is a more useful one until we have our 
own Jenkins instances. Maybe we can improve it to make that task easier, WDYT?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436296)
Time Spent: 1h 40m  (was: 1.5h)

> Add script to mass-comment Jenkins triggers on PR
> -
>
> Key: BEAM-10038
> URL: https://issues.apache.org/jira/browse/BEAM-10038
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: P2
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> This is a work in progress, it just needs to be touched up and added to the 
> Beam repo:
> https://gist.github.com/Ardagan/13e6031e8d1c9ebbd3029bf365c1a517



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113628#comment-17113628
 ] 

Chamikara Madhusanka Jayalath commented on BEAM-8019:
-

Marking this as fixed since the basic framework is in place.

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: P1
> Fix For: 2.22.0
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath updated BEAM-8019:

Priority: P2  (was: P1)

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: P2
> Fix For: 2.22.0
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9971?focusedWorklogId=436293=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436293
 ]

ASF GitHub Bot logged work on BEAM-9971:


Author: ASF GitHub Bot
Created on: 21/May/20 23:55
Start Date: 21/May/20 23:55
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #11784:
URL: https://github.com/apache/beam/pull/11784#issuecomment-632403660


   Run Java Spark PortableValidatesRunner Batch



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436293)
Time Spent: 20m  (was: 10m)

> beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)
> --
>
> Key: BEAM-9971
> URL: https://issues.apache.org/jira/browse/BEAM-9971
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: P2
>  Labels: portability-spark
> Fix For: 2.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This happens sporadically. One time the issue affected 14 tests; another time 
> it affected 112 tests.
> It looks like the ClassLoader is sometimes contaminated with jars from 
> /tmp/spark-*, which have already been deleted.
> 20/05/21 13:54:27 ERROR org.apache.beam.runners.jobsubmission.JobInvocation: 
> Error during job invocation 
> pipelinetest0testidentitytransform-kcweaver-0521205426-f4de06c4_51aced77-c171-4842-be1f-6c79226872e5.
> java.util.ServiceConfigurationError: 
> org.apache.beam.runners.core.construction.NativeTransforms$IsNativeTransform: 
> Error reading configuration file
>   at java.util.ServiceLoader.fail(ServiceLoader.java:232)
>   at java.util.ServiceLoader.parse(ServiceLoader.java:309)
>   at java.util.ServiceLoader.access$200(ServiceLoader.java:185)
>   at 
> java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
>   at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
>   at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
>   at 
> org.apache.beam.runners.core.construction.NativeTransforms.isNative(NativeTransforms.java:50)
>   at 
> org.apache.beam.runners.core.construction.graph.QueryablePipeline.isPrimitiveTransform(QueryablePipeline.java:189)
>   at 
> org.apache.beam.runners.core.construction.graph.QueryablePipeline.getPrimitiveTransformIds(QueryablePipeline.java:137)
>   at 
> org.apache.beam.runners.core.construction.graph.QueryablePipeline.forPrimitivesIn(QueryablePipeline.java:90)
>   at 
> org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.<init>(GreedyPipelineFuser.java:67)
>   at 
> org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.fuse(GreedyPipelineFuser.java:90)
>   at 
> org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:94)
>   at 
> org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
>   at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: 
> /tmp/spark-5e8a8a9a-22d6-48d5-b398-1a4f5582d954/userFiles-ec74cac1-21b5-4127-b764-540636b733d0/beam-runners-core-construction-java-2.22.0-SNAPSHOT-tests.jar
>  (No such file or directory)
>   at java.util.zip.ZipFile.open(Native Method)
>   at java.util.zip.ZipFile.<init>(ZipFile.java:230)
>   at java.util.zip.ZipFile.<init>(ZipFile.java:155)
>   at java.util.jar.JarFile.<init>(JarFile.java:167)
>   at java.util.jar.JarFile.<init>(JarFile.java:104)
>   at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
>   at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
>   at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:84)
>   at 
> 

[jira] [Resolved] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath resolved BEAM-8019.
-
Resolution: Fixed

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: P1
> Fix For: 2.22.0
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10052) check hash and avoid duplicates when uploading artifact in Python Dataflow Runner

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10052?focusedWorklogId=436290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436290
 ]

ASF GitHub Bot logged work on BEAM-10052:
-

Author: ASF GitHub Bot
Created on: 21/May/20 23:54
Start Date: 21/May/20 23:54
Worklog Time Spent: 10m 
  Work Description: chamikaramj merged pull request #11771:
URL: https://github.com/apache/beam/pull/11771


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436290)
Time Spent: 1h 40m  (was: 1.5h)

> check hash and avoid duplicates when uploading artifact in Python Dataflow 
> Runner
> -
>
> Key: BEAM-10052
> URL: https://issues.apache.org/jira/browse/BEAM-10052
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: P2
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> An xlang pipeline could have many duplicated jars. It would be great if we 
> checked the hash and avoided duplicate uploads.
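
The idea in the issue description above can be sketched as follows. This is 
only an illustration under stated assumptions, not the change merged in the 
pull request: `stage_unique` and the `upload` callback are hypothetical names, 
and artifacts are held in memory as bytes, whereas a real stager would likely 
hash files in chunks.

```python
import hashlib
from typing import Callable, Dict, Iterable, Tuple


def stage_unique(
    artifacts: Iterable[Tuple[str, bytes]],
    upload: Callable[[str, bytes], None],
) -> Dict[str, str]:
    """Uploads each distinct payload once.

    Returns a mapping from every artifact name to the name under which
    its content was actually staged.
    """
    staged_by_hash: Dict[str, str] = {}
    staged_name: Dict[str, str] = {}
    for name, data in artifacts:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in staged_by_hash:
            # First time this content is seen: upload it and remember the hash.
            upload(name, data)
            staged_by_hash[digest] = name
        # Duplicates point at the copy that was already staged.
        staged_name[name] = staged_by_hash[digest]
    return staged_name
```

Keying on the content hash rather than the file name means two jars with 
different names but identical bytes are uploaded only once.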



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9971?focusedWorklogId=436291=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436291
 ]

ASF GitHub Bot logged work on BEAM-9971:


Author: ASF GitHub Bot
Created on: 21/May/20 23:54
Start Date: 21/May/20 23:54
Worklog Time Spent: 10m 
  Work Description: ibzib opened a new pull request #11784:
URL: https://github.com/apache/beam/pull/11784


   **Please** add a meaningful description for your change here
   
   
   

[jira] [Updated] (BEAM-8019) Support cross-language transforms for DataflowRunner

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath updated BEAM-8019:

Fix Version/s: 2.22.0

> Support cross-language transforms for DataflowRunner
> 
>
> Key: BEAM-8019
> URL: https://issues.apache.org/jira/browse/BEAM-8019
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chamikara Madhusanka Jayalath
>Assignee: Chamikara Madhusanka Jayalath
>Priority: P1
> Fix For: 2.22.0
>
>  Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is to capture the Beam changes needed for this task.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-9383) Staging Dataflow artifacts from environment

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath updated BEAM-9383:

Fix Version/s: 2.22.0

> Staging Dataflow artifacts from environment
> ---
>
> Key: BEAM-9383
> URL: https://issues.apache.org/jira/browse/BEAM-9383
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: P2
> Fix For: 2.22.0
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Staging Dataflow artifacts from environment



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (BEAM-9383) Staging Dataflow artifacts from environment

2020-05-21 Thread Chamikara Madhusanka Jayalath (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Madhusanka Jayalath resolved BEAM-9383.
-
Resolution: Fixed

> Staging Dataflow artifacts from environment
> ---
>
> Key: BEAM-9383
> URL: https://issues.apache.org/jira/browse/BEAM-9383
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: P2
> Fix For: 2.22.0
>
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> Staging Dataflow artifacts from environment



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-10063) Run pandas doctests for Beam dataframes API.

2020-05-21 Thread Robert Bradshaw (Jira)
Robert Bradshaw created BEAM-10063:
--

 Summary: Run pandas doctests for Beam dataframes API.
 Key: BEAM-10063
 URL: https://issues.apache.org/jira/browse/BEAM-10063
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-py-core
Reporter: Robert Bradshaw






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9977) Build Kafka Read on top of Java SplittableDoFn

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9977?focusedWorklogId=436289=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436289
 ]

ASF GitHub Bot logged work on BEAM-9977:


Author: ASF GitHub Bot
Created on: 21/May/20 23:52
Start Date: 21/May/20 23:52
Worklog Time Spent: 10m 
  Work Description: iemejia commented on pull request #11715:
URL: https://github.com/apache/beam/pull/11715#issuecomment-632402839


   Now that this is merged, can I ask a question? What is the intrinsic 
limitation that did not allow the old `OffsetRangeTracker` to be refactored for 
this use case? Or why do we want to have both?
   
   Does this also mean that we might need `GrowableBytekeyRangeTracker` and 
basically 'dynamic' versions of every `RestrictionTracker`?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436289)
Time Spent: 4h 10m  (was: 4h)

> Build Kafka Read on top of Java SplittableDoFn
> --
>
> Key: BEAM-9977
> URL: https://issues.apache.org/jira/browse/BEAM-9977
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: P2
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (BEAM-10063) Run pandas doctests for Beam dataframes API.

2020-05-21 Thread Robert Bradshaw (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw reassigned BEAM-10063:
--

Assignee: Robert Bradshaw

> Run pandas doctests for Beam dataframes API.
> 
>
> Key: BEAM-10063
> URL: https://issues.apache.org/jira/browse/BEAM-10063
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: P2
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113624#comment-17113624
 ] 

Ismaël Mejía commented on BEAM-7304:


I understand the rules, Brian; that's the reason why in the first message I 
mentioned you to ask if we could do this 'exceptionally'. I understand that 
this also has risks and I am afraid of rushing things too, but I wanted to get 
this feature in because we (and me concretely) have had this contribution 
waiting for a long time.

Given the circumstances it seems like the best path forward is to try to get 
this into master rapidly so it gets released soon. Sorry [~pulasthisupun].


> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436286=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436286
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 23:44
Start Date: 21/May/20 23:44
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #11783:
URL: https://github.com/apache/beam/pull/11783#issuecomment-632400825


   Tests passed



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436286)
Time Spent: 1h  (was: 50m)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436287
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 23:44
Start Date: 21/May/20 23:44
Worklog Time Spent: 10m 
  Work Description: angoenka merged pull request #11783:
URL: https://github.com/apache/beam/pull/11783


   





Issue Time Tracking
---

Worklog Id: (was: 436287)
Time Spent: 1h 10m  (was: 1h)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi





[jira] [Updated] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3489:
---
Status: Open  (was: Triage Needed)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: P3
>  Labels: newbie, starter
> Fix For: 2.16.0
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976
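
The three steps in the description (add a field, update the coder, carry the id through the translation) can be illustrated with a minimal Python sketch. This is not the Java change itself: `Message`, `encode`, `decode`, and `from_proto_dict` are hypothetical names, the received message is modelled as a plain dict, and the JSON wire format is an assumption chosen only for readability.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Message:
    """Hypothetical stand-in for the Java PubsubMessage."""
    payload: bytes
    attributes: Dict[str, str] = field(default_factory=dict)
    message_id: Optional[str] = None  # the new field; None when unknown


def encode(msg: Message) -> bytes:
    """Coder sketch: the message id is written alongside payload and attributes."""
    record = {
        "payload": base64.b64encode(msg.payload).decode("ascii"),
        "attributes": msg.attributes,
        "message_id": msg.message_id,
    }
    return json.dumps(record, sort_keys=True).encode("utf-8")


def decode(data: bytes) -> Message:
    """Inverse of encode; a missing id decodes to None."""
    record = json.loads(data.decode("utf-8"))
    return Message(
        payload=base64.b64decode(record["payload"]),
        attributes=record["attributes"],
        message_id=record.get("message_id"),
    )


def from_proto_dict(proto: dict) -> Message:
    """Translation sketch: keep the id instead of dropping it."""
    return Message(
        payload=proto["data"],
        attributes=dict(proto.get("attributes", {})),
        message_id=proto.get("message_id"),
    )
```

Making the id optional lets messages without an id still round-trip through the coder.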





[jira] [Updated] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-10060:

Status: Open  (was: Triage Needed)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi





[jira] [Assigned] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-10060:
---

Assignee: Ankur Goenka

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi





[jira] [Updated] (BEAM-3489) Expose the message id of received messages within PubsubMessage

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3489:
---
Fix Version/s: (was: 2.16.0)

> Expose the message id of received messages within PubsubMessage
> ---
>
> Key: BEAM-3489
> URL: https://issues.apache.org/jira/browse/BEAM-3489
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-gcp
>Reporter: Luke Cwik
>Assignee: Thinh Ha
>Priority: P3
>  Labels: newbie, starter
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> This task is about passing forward the message id from the pubsub proto to 
> the java PubsubMessage.
> Add a message id field to PubsubMessage.
> Update the coder for PubsubMessage to encode the message id.
> Update the translation from the Pubsub proto message to the Dataflow message:
> https://github.com/apache/beam/blob/2e275264b21db45787833502e5e42907b05e28b8/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L976





[jira] [Updated] (BEAM-10058) VideoIntelligenceMlTestIT.test_label_detection_with_video_context is flaky

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-10058:

Status: Open  (was: Triage Needed)

> VideoIntelligenceMlTestIT.test_label_detection_with_video_context is flaky
> --
>
> Key: BEAM-10058
> URL: https://issues.apache.org/jira/browse/BEAM-10058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures
>Reporter: Brian Hulette
>Assignee: Kamil Wasilewski
>Priority: P2
>
> Example failure: https://builds.apache.org/job/beam_PostCommit_Python37/2371/
> {code}
> Dataflow pipeline failed. State: FAILED, Error:
> Traceback (most recent call last):
>   File "apache_beam/runners/common.py", line 961, in apache_beam.runners.common.DoFnRunner.process
>   File "apache_beam/runners/common.py", line 554, in apache_beam.runners.common.SimpleInvoker.invoke_process
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/transforms/core.py", line 1511, in <lambda>
>     wrapper = lambda x: [fn(x)]
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python37/src/sdks/python/apache_beam/testing/util.py", line 218, in _matches
>     hamcrest_assert(actual, contains_inanyorder(*expected_list))
>   File "/usr/local/lib/python3.7/site-packages/hamcrest/core/assert_that.py", line 44, in assert_that
>     _assert_match(actual=arg1, matcher=arg2, reason=arg3)
>   File "/usr/local/lib/python3.7/site-packages/hamcrest/core/assert_that.py", line 60, in _assert_match
>     raise AssertionError(description)
> AssertionError: 
> Expected: a sequence over [(a sequence containing 'bicycle' and a sequence containing 'dinosaur')] in any order
>      but: not matched: <['land vehicle', 'animal']>
> {code}
> At least the error is amusing :)





[jira] [Updated] (BEAM-10051) Misordered check WRT closed data readers.

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-10051:

Status: Open  (was: Triage Needed)

> Misordered check WRT closed data readers.
> -
>
> Key: BEAM-10051
> URL: https://issues.apache.org/jira/browse/BEAM-10051
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: P2
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This check 
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/harness/datamgr.go#L269
> in its current position prevents the "normal teardown" that the reader 
> expects. This means that readers for instructions that terminate early, such 
> as due to splitting, stay resident in memory and never close.
> In practice this is benign as the buffer would already be closed, but with 
> streaming this memory leak would become noticeable.
> The fix is to move the check to after the sentinel check, and additionally 
> check there for early termination to avoid closing the buffer twice.
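
The reordering described above can be sketched in Python (the real code is Go, in datamgr.go). Everything here is a hypothetical model: `Reader`, `handle_element`, and the convention that an empty payload is the end-of-stream sentinel are assumptions made for illustration, not the SDK's actual types.

```python
from typing import Dict, List


class Reader:
    """Hypothetical per-instruction data reader."""

    def __init__(self) -> None:
        self.buffer: List[bytes] = []
        self.terminated_early = False  # e.g. set after a split
        self.close_calls = 0

    def close_buffer(self) -> None:
        self.close_calls += 1

    def terminate_early(self) -> None:
        # Early termination closes the buffer right away.
        self.terminated_early = True
        self.close_buffer()


def handle_element(readers: Dict[str, Reader], reader_id: str, data: bytes) -> None:
    reader = readers[reader_id]
    if len(data) == 0:
        # Sentinel check comes first so teardown always runs.
        if not reader.terminated_early:
            reader.close_buffer()  # skipped if already closed: no double close
        del readers[reader_id]  # normal teardown: the reader is released
        return
    if reader.terminated_early:
        return  # data for a reader that stopped early is dropped
    reader.buffer.append(data)
```

If the early-termination check ran before the sentinel check, the function would return before the `del`, which is the leak the issue describes.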





[jira] [Assigned] (BEAM-10061) ReadAllFromTextWithFilename

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-10061:
---

Assignee: (was: Aizhamal Nurmamat kyzy)

> ReadAllFromTextWithFilename
> ---
>
> Key: BEAM-10061
> URL: https://issues.apache.org/jira/browse/BEAM-10061
> Project: Beam
>  Issue Type: New Feature
>  Components: io-py-gcp, sdk-py-core
> Environment: Dataflow with Python
>Reporter: Ryan Canty
>Priority: P2
>
> I am trying to create a job that reads from GCS, executes some code against 
> each line, and creates a PCollection with the line and the file. So basically 
> what I'd like is a combination of textio.ReadTextWithFilename and 
> textio.ReadAllFromText.
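
The requested behaviour (many file patterns in, `(filename, line)` pairs out) can be sketched in plain Python. This is only an illustration of the semantics, not a Beam transform: `read_all_with_filename` is a hypothetical helper, and it reads the local filesystem with `glob` rather than GCS.

```python
import glob
from typing import Iterable, Iterator, Tuple


def read_all_with_filename(patterns: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (path, line) for every line of every file matching any pattern."""
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    # Strip only the trailing newline so the content is preserved.
                    yield path, line.rstrip("\n")
```

Per-line processing can then be applied to each pair while the originating file name stays attached to the line.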





[jira] [Updated] (BEAM-10061) ReadAllFromTextWithFilename

2020-05-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-10061:

Component/s: (was: beam-community)
 sdk-py-core
 io-py-gcp

> ReadAllFromTextWithFilename
> ---
>
> Key: BEAM-10061
> URL: https://issues.apache.org/jira/browse/BEAM-10061
> Project: Beam
>  Issue Type: New Feature
>  Components: io-py-gcp, sdk-py-core
> Environment: Dataflow with Python
>Reporter: Ryan Canty
>Assignee: Aizhamal Nurmamat kyzy
>Priority: P2
>
> I am trying to create a job that reads from GCS, executes some code against 
> each line, and creates a PCollection with the line and the file. So basically 
> what I'd like is a combination of textio.ReadTextWithFilename and 
> textio.ReadAllFromText.





[jira] [Created] (BEAM-10062) BeamRelNode with different pipeline options

2020-05-21 Thread Andrew Pilloud (Jira)
Andrew Pilloud created BEAM-10062:
-

 Summary: BeamRelNode with different pipeline options
 Key: BEAM-10062
 URL: https://issues.apache.org/jira/browse/BEAM-10062
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql-zetasql
Reporter: Andrew Pilloud


two failures in shard 37
{code:java}
INFO: Processing Sql statement: SELECT t1.id, t2.id
FROM (SELECT 1 as id) t1 FULL JOIN (SELECT id FROM R WHERE FALSE) t2
ON t1.id = t2.id
May 21, 2020 1:40:06 PM 
cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
executeQuery
SEVERE: null
java.lang.AssertionError
at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode.getPipelineOptions(BeamRelNode.java:63)
at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode.getPipelineOptions(BeamRelNode.java:61)
at 
org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127)
at 
cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:262)
at 
com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
at 
com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
at 
com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
at 
com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
at 
com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
 {code}





[jira] [Updated] (BEAM-10062) BeamRelNode with different pipeline options

2020-05-21 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud updated BEAM-10062:
--
Status: Open  (was: Triage Needed)

> BeamRelNode with different pipeline options
> ---
>
> Key: BEAM-10062
> URL: https://issues.apache.org/jira/browse/BEAM-10062
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Priority: P2
>  Labels: zetasql-compliance
>
> two failures in shard 37
> {code:java}
> INFO: Processing Sql statement: SELECT t1.id, t2.id
> FROM (SELECT 1 as id) t1 FULL JOIN (SELECT id FROM R WHERE FALSE) t2
> ON t1.id = t2.id
> May 21, 2020 1:40:06 PM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> SEVERE: null
> java.lang.AssertionError
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode.getPipelineOptions(BeamRelNode.java:63)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamRelNode.getPipelineOptions(BeamRelNode.java:61)
>   at 
> org.apache.beam.sdk.extensions.sql.impl.rel.BeamEnumerableConverter.toRowList(BeamEnumerableConverter.java:127)
>   at 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:262)
>   at 
> com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
>   at 
> com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>   at 
> com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>   at 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
>   at 
> com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>  {code}





[jira] [Work logged] (BEAM-7746) Add type hints to python code

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7746?focusedWorklogId=436283&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436283
 ]

ASF GitHub Bot logged work on BEAM-7746:


Author: ASF GitHub Bot
Created on: 21/May/20 23:37
Start Date: 21/May/20 23:37
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #11632:
URL: https://github.com/apache/beam/pull/11632#issuecomment-632398759


   Ping.





Issue Time Tracking
---

Worklog Id: (was: 436283)
Time Spent: 83.5h  (was: 83h 20m)

> Add type hints to python code
> -
>
> Key: BEAM-7746
> URL: https://issues.apache.org/jira/browse/BEAM-7746
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: P2
>  Time Spent: 83.5h
>  Remaining Estimate: 0h
>
> As a developer of the beam source code, I would like the code to use pep484 
> type hints so that I can clearly see what types are required, get completion 
> in my IDE, and enforce code correctness via a static analyzer like mypy.
> This may be considered a precursor to BEAM-7060
> Work has been started here:  [https://github.com/apache/beam/pull/9056]
>  
>  
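
As a small illustration of what PEP 484 annotations provide, here is a hedged sketch. `count_words` is a hypothetical function, not taken from the Beam code base.

```python
from typing import Dict, Iterable


def count_words(lines: Iterable[str]) -> Dict[str, int]:
    """Count whitespace-separated words across all lines."""
    counts: Dict[str, int] = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts
```

With these annotations a static analyzer such as mypy can reject a call like `count_words(42)` before the code runs, and an IDE can offer completion on the returned dictionary.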





[jira] [Updated] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-21 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9971:
--
Description: 
This happens sporadically. One time the issue affected 14 tests; another time 
it affected 112 tests.

It looks like the ClassLoader is sometimes contaminated with jars from 
/tmp/spark-*, which have already been deleted.

20/05/21 13:54:27 ERROR org.apache.beam.runners.jobsubmission.JobInvocation: 
Error during job invocation 
pipelinetest0testidentitytransform-kcweaver-0521205426-f4de06c4_51aced77-c171-4842-be1f-6c79226872e5.
java.util.ServiceConfigurationError: 
org.apache.beam.runners.core.construction.NativeTransforms$IsNativeTransform: 
Error reading configuration file
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.parse(ServiceLoader.java:309)
at java.util.ServiceLoader.access$200(ServiceLoader.java:185)
at 
java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
at 
org.apache.beam.runners.core.construction.NativeTransforms.isNative(NativeTransforms.java:50)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.isPrimitiveTransform(QueryablePipeline.java:189)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.getPrimitiveTransformIds(QueryablePipeline.java:137)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.forPrimitivesIn(QueryablePipeline.java:90)
at 
org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.<init>(GreedyPipelineFuser.java:67)
at 
org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.fuse(GreedyPipelineFuser.java:90)
at 
org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:94)
at 
org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: 
/tmp/spark-5e8a8a9a-22d6-48d5-b398-1a4f5582d954/userFiles-ec74cac1-21b5-4127-b764-540636b733d0/beam-runners-core-construction-java-2.22.0-SNAPSHOT-tests.jar
 (No such file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:230)
at java.util.zip.ZipFile.<init>(ZipFile.java:155)
at java.util.jar.JarFile.<init>(JarFile.java:167)
at java.util.jar.JarFile.<init>(JarFile.java:104)
at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:84)
at 
sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
at 
sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
at java.net.URL.openStream(URL.java:1045)
at java.util.ServiceLoader.parse(ServiceLoader.java:304)
... 18 more


  was:
This happens sporadically. One time the issue affected 14 tests; another time 
it affected 112 tests.

20/05/21 13:54:27 ERROR org.apache.beam.runners.jobsubmission.JobInvocation: 
Error during job invocation 
pipelinetest0testidentitytransform-kcweaver-0521205426-f4de06c4_51aced77-c171-4842-be1f-6c79226872e5.
java.util.ServiceConfigurationError: 
org.apache.beam.runners.core.construction.NativeTransforms$IsNativeTransform: 
Error reading configuration file
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.parse(ServiceLoader.java:309)
at java.util.ServiceLoader.access$200(ServiceLoader.java:185)
at 
java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
at 
org.apache.beam.runners.core.construction.NativeTransforms.isNative(NativeTransforms.java:50)
at 

[jira] [Work logged] (BEAM-7390) Colab examples for aggregation transforms (Python)

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7390?focusedWorklogId=436281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436281
 ]

ASF GitHub Bot logged work on BEAM-7390:


Author: ASF GitHub Bot
Created on: 21/May/20 23:36
Start Date: 21/May/20 23:36
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #10165:
URL: https://github.com/apache/beam/pull/10165#issuecomment-632398599


   Run Python PreCommit





Issue Time Tracking
---

Worklog Id: (was: 436281)
Time Spent: 11.5h  (was: 11h 20m)

> Colab examples for aggregation transforms (Python)
> --
>
> Key: BEAM-7390
> URL: https://issues.apache.org/jira/browse/BEAM-7390
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: P3
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Merge aggregation Colabs into the transform catalog





[jira] [Commented] (BEAM-9994) Cannot create a virtualenv using Python 3.8 on Jenkins machines

2020-05-21 Thread Valentyn Tymofieiev (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113618#comment-17113618
 ] 

Valentyn Tymofieiev commented on BEAM-9994:
---

Sharing at 
https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips#JenkinsTips-Jenkinsinfrastructuresetup.
 Please take a look and feel free to update the instructions.

> Cannot create a virtualenv using Python 3.8 on Jenkins machines
> ---
>
> Key: BEAM-9994
> URL: https://issues.apache.org/jira/browse/BEAM-9994
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: P2
>
> Command: *virtualenv --python /usr/bin/python3.8 env*
> Output:
> {noformat}
> Running virtualenv with interpreter /usr/bin/python3.8
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.5/dist-packages/virtualenv.py", line 22, in <module>
>     import distutils.spawn
> ModuleNotFoundError: No module named 'distutils.spawn'
> {noformat}
> Example test affected: 
> https://builds.apache.org/job/beam_PreCommit_PythonFormatter_Commit/1723/console
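
A quick way to confirm whether a given interpreter can import the module named in the traceback is a check like the following, run with that interpreter. This is a diagnostic sketch only: `module_available` is a hypothetical helper and does not fix the Jenkins machines.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the running interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "distutils") is itself missing.
        return False
```

Calling `module_available("distutils.spawn")` under `/usr/bin/python3.8` would return `False` on a machine showing the failure above.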





[jira] [Updated] (BEAM-9514) AssertionError type mismatch from AggregateScanConverter

2020-05-21 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud updated BEAM-9514:
-
Summary: AssertionError type mismatch from AggregateScanConverter  (was: 
AssertionError type mismatch from SUM)

> AssertionError type mismatch from AggregateScanConverter
> 
>
> Key: BEAM-9514
> URL: https://issues.apache.org/jira/browse/BEAM-9514
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: P1
>  Labels: zetasql-compliance
> Fix For: 2.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Six failures in shard 31
> {code}
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:1984)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSubset.add(RelSubset.java:284)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSet.add(RelSet.java:148)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1806)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.reregister(VolcanoPlanner.java:1480)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSet.mergeWith(RelSet.java:331)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.merge(VolcanoPlanner.java:1571)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:863)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1927)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.rules.AggregateRemoveRule.onMatch(AggregateRemoveRule.java:126)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:328)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.transform(ZetaSQLPlannerImpl.java:180)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal(ZetaSQLQueryPlanner.java:150)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:115)
>   at 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:242)
>   at 
> com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
>   at 
> com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>   at 
> com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>   at 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
>   at 
> com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> 1:
> {code}
> Apr 01, 2020 11:48:56 AM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> INFO: Processing Sql statement: select sum(distinct_4) from TableDistincts
> group by distinct_2
> having false
> Apr 01, 2020 11:48:57 AM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> SEVERE:  Type mismatch:
> rowtype of new rel:
> RecordType(BIGINT distinct_2, BIGINT $col1) NOT NULL
> rowtype of set:
> RecordType(BIGINT distinct_2, BIGINT NOT NULL $col1) 

[jira] [Updated] (BEAM-9971) beam_PostCommit_Java_PVR_Spark_Batch flakes (no such file)

2020-05-21 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-9971:
--
Description: 
This happens sporadically. One time the issue affected 14 tests; another time 
it affected 112 tests.

20/05/21 13:54:27 ERROR org.apache.beam.runners.jobsubmission.JobInvocation: 
Error during job invocation 
pipelinetest0testidentitytransform-kcweaver-0521205426-f4de06c4_51aced77-c171-4842-be1f-6c79226872e5.
java.util.ServiceConfigurationError: 
org.apache.beam.runners.core.construction.NativeTransforms$IsNativeTransform: 
Error reading configuration file
at java.util.ServiceLoader.fail(ServiceLoader.java:232)
at java.util.ServiceLoader.parse(ServiceLoader.java:309)
at java.util.ServiceLoader.access$200(ServiceLoader.java:185)
at 
java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:357)
at java.util.ServiceLoader$LazyIterator.hasNext(ServiceLoader.java:393)
at java.util.ServiceLoader$1.hasNext(ServiceLoader.java:474)
at 
org.apache.beam.runners.core.construction.NativeTransforms.isNative(NativeTransforms.java:50)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.isPrimitiveTransform(QueryablePipeline.java:189)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.getPrimitiveTransformIds(QueryablePipeline.java:137)
at 
org.apache.beam.runners.core.construction.graph.QueryablePipeline.forPrimitivesIn(QueryablePipeline.java:90)
at 
org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.<init>(GreedyPipelineFuser.java:67)
at 
org.apache.beam.runners.core.construction.graph.GreedyPipelineFuser.fuse(GreedyPipelineFuser.java:90)
at 
org.apache.beam.runners.spark.SparkPipelineRunner.run(SparkPipelineRunner.java:94)
at 
org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:83)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
at 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: 
/tmp/spark-5e8a8a9a-22d6-48d5-b398-1a4f5582d954/userFiles-ec74cac1-21b5-4127-b764-540636b733d0/beam-runners-core-construction-java-2.22.0-SNAPSHOT-tests.jar
 (No such file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:230)
at java.util.zip.ZipFile.<init>(ZipFile.java:155)
at java.util.jar.JarFile.<init>(JarFile.java:167)
at java.util.jar.JarFile.<init>(JarFile.java:104)
at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:84)
at 
sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
at 
sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
at java.net.URL.openStream(URL.java:1045)
at java.util.ServiceLoader.parse(ServiceLoader.java:304)
... 18 more


  was:
This happens sporadically. One time the issue affected 14 tests; another time 
it affected 112 tests.

java.lang.RuntimeException: The Runner experienced the following error during 
execution:
java.io.FileNotFoundException: 
/tmp/spark-0812a463-8d6b-4c97-be4b-de43baf67108/userFiles-b90ca2e1-2041-442d-ae78-c8e9c30bff49/beam-runners-spark-2.22.0-SNAPSHOT.jar
 (No such file or directory)
at 
org.apache.beam.runners.portability.JobServicePipelineResult.propagateErrors(JobServicePipelineResult.java:165)
at 
org.apache.beam.runners.portability.JobServicePipelineResult.waitUntilFinish(JobServicePipelineResult.java:110)
at 
org.apache.beam.runners.portability.testing.TestPortableRunner.run(TestPortableRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:350)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:331)
at 
org.apache.beam.runners.core.metrics.MetricsPusherTest.pushesUserMetrics(MetricsPusherTest.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 

[jira] [Updated] (BEAM-9514) AssertionError type mismatch from AggregateScanConverter

2020-05-21 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud updated BEAM-9514:
-
Status: Open  (was: Triage Needed)

> AssertionError type mismatch from AggregateScanConverter
> 
>
> Key: BEAM-9514
> URL: https://issues.apache.org/jira/browse/BEAM-9514
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: P1
>  Labels: zetasql-compliance
> Fix For: 2.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Six failures in shard 31
> {code}
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:1984)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSubset.add(RelSubset.java:284)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSet.add(RelSet.java:148)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.addRelToSet(VolcanoPlanner.java:1806)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.reregister(VolcanoPlanner.java:1480)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSet.mergeWith(RelSet.java:331)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.merge(VolcanoPlanner.java:1571)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:863)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1927)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:129)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:236)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.rules.AggregateRemoveRule.onMatch(AggregateRemoveRule.java:126)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:208)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:631)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:328)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.transform(ZetaSQLPlannerImpl.java:180)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal(ZetaSQLQueryPlanner.java:150)
>   at 
> org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:115)
>   at 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:242)
>   at 
> com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
>   at 
> com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
>   at 
> com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
>   at 
> com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
>   at 
> com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> 1:
> {code}
> Apr 01, 2020 11:48:56 AM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> INFO: Processing Sql statement: select sum(distinct_4) from TableDistincts
> group by distinct_2
> having false
> Apr 01, 2020 11:48:57 AM 
> cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl 
> executeQuery
> SEVERE:  Type mismatch:
> rowtype of new rel:
> RecordType(BIGINT distinct_2, BIGINT $col1) NOT NULL
> rowtype of set:
> RecordType(BIGINT distinct_2, BIGINT NOT NULL $col1) NOT NULL
> java.lang.AssertionError: Type mismatch:
> rowtype of new rel:
> 

[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=436279=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436279
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 21/May/20 23:34
Start Date: 21/May/20 23:34
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632397951


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436279)
Remaining Estimate: 130.5h  (was: 130h 40m)
Time Spent: 37.5h  (was: 37h 20m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: P2
>  Labels: gcs
> Fix For: 2.22.0
>
>   Original Estimate: 168h
>  Time Spent: 37.5h
>  Remaining Estimate: 130.5h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The current 
> implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability and which eventually 
> creates the channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channels, which is expected to be 
> more flexible because it can pick up new changes (e.g. a new channel 
> implementation using a new protocol) without code changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (BEAM-9514) AssertionError type mismatch from SUM

2020-05-21 Thread Andrew Pilloud (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Pilloud reopened BEAM-9514:
--

Still seeing this class of issue:

11 failures in shard 4, 2 failures in shard 8, 1 failure in shard 23, 4 
failures in shard 30, 
{code}
SELECT COUNT(bool_val) FROM TableAllNull
BIGINT NOT NULL
at 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
at 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1958)
at 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Aggregate.typeMatchesInferred(Aggregate.java:434)
at 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Aggregate.<init>(Aggregate.java:159)
at 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.logical.LogicalAggregate.<init>(LogicalAggregate.java:65)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.AggregateScanConverter.convert(AggregateScanConverter.java:110)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.AggregateScanConverter.convert(AggregateScanConverter.java:50)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode(QueryStatementConverter.java:98)
at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Collections$2.tryAdvance(Collections.java:4717)
at java.util.Collections$2.forEachRemaining(Collections.java:4725)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at 
java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at 
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode(QueryStatementConverter.java:97)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convert(QueryStatementConverter.java:85)
at 
org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertRootQuery(QueryStatementConverter.java:51)
at 
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel(ZetaSQLPlannerImpl.java:160)
at 
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal(ZetaSQLQueryPlanner.java:131)
at 
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:115)
at 
cloud.dataflow.sql.ExecuteQueryServiceServer$SqlComplianceServiceImpl.executeQuery(ExecuteQueryServiceServer.java:241)
at 
com.google.zetasql.testing.SqlComplianceServiceGrpc$MethodHandlers.invoke(SqlComplianceServiceGrpc.java:423)
at 
com.google.zetasql.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
at 
com.google.zetasql.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
at 
com.google.zetasql.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:711)
at 
com.google.zetasql.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
com.google.zetasql.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}


> AssertionError type mismatch from SUM
> -
>
> Key: BEAM-9514
> URL: https://issues.apache.org/jira/browse/BEAM-9514
> Project: Beam
>  Issue Type: Bug
>  Components: dsl-sql-zetasql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: P1
>  Labels: zetasql-compliance
> Fix For: 2.22.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Six failures in shard 31
> {code}
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptUtil.equal(RelOptUtil.java:1984)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSubset.add(RelSubset.java:284)
>   at 
> org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.volcano.RelSet.add(RelSet.java:148)
>   at 
> 

[jira] [Work logged] (BEAM-9603) Support Dynamic Timer in Java SDK over FnApi

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9603?focusedWorklogId=436274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436274
 ]

ASF GitHub Bot logged work on BEAM-9603:


Author: ASF GitHub Bot
Created on: 21/May/20 23:28
Start Date: 21/May/20 23:28
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632396377


   retest all please





Issue Time Tracking
---

Worklog Id: (was: 436274)
Time Spent: 6h 20m  (was: 6h 10m)

> Support Dynamic Timer in Java SDK over FnApi
> 
>
> Key: BEAM-9603
> URL: https://issues.apache.org/jira/browse/BEAM-9603
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness
>Reporter: Boyuan Zhang
>Assignee: Yichi Zhang
>Priority: P2
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=436264=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436264
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 21/May/20 23:09
Start Date: 21/May/20 23:09
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632390617









Issue Time Tracking
---

Worklog Id: (was: 436264)
Time Spent: 35h  (was: 34h 50m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: P2
>  Labels: portability
>  Time Spent: 35h
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.





[jira] [Work logged] (BEAM-9322) Python SDK ignores manually set PCollection tags

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9322?focusedWorklogId=436260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436260
 ]

ASF GitHub Bot logged work on BEAM-9322:


Author: ASF GitHub Bot
Created on: 21/May/20 23:04
Start Date: 21/May/20 23:04
Worklog Time Spent: 10m 
  Work Description: robertwb commented on a change in pull request #11765:
URL: https://github.com/apache/beam/pull/11765#discussion_r428960780



##
File path: sdks/python/apache_beam/transforms/ptransform.py
##
@@ -270,11 +256,19 @@ def get_named_nested_pvalues(pvalueish):
     tagged_values = pvalueish.items()
   else:
     if isinstance(pvalueish, (pvalue.PValue, pvalue.DoOutputsTuple)):
-      yield None, pvalueish
+      # For transforms that only have a tagged PCollection as an output,
+      # propagate that tag forward.
+      if first_iteration and isinstance(pvalueish, pvalue.PValue):
+        yield pvalueish.tag, pvalueish

Review comment:
   I think this may break some google3 runners. Can you ensure that this 
imports correctly? (Could you also explain why this is needed?)







Issue Time Tracking
---

Worklog Id: (was: 436260)
Time Spent: 4.5h  (was: 4h 20m)

> Python SDK ignores manually set PCollection tags
> 
>
> Key: BEAM-9322
> URL: https://issues.apache.org/jira/browse/BEAM-9322
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Sam Rohde
>Assignee: Sam Rohde
>Priority: P1
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The Python SDK currently ignores any tags set on PCollections manually when 
> applying PTransforms when adding the PCollection to the PTransform 
> [outputs|[https://github.com/apache/beam/blob/688a4ea53f315ec2aa2d37602fd78496fca8bb4f/sdks/python/apache_beam/pipeline.py#L595]].
>  In the 
> [add_output|[https://github.com/apache/beam/blob/688a4ea53f315ec2aa2d37602fd78496fca8bb4f/sdks/python/apache_beam/pipeline.py#L872]]
>  method, the tag is set to None for all PValues, meaning the output tags are 
> set to an enumeration index over the PCollection outputs. The tags are not 
> propagated correctly, which can be a problem when relying on the output 
> PCollection tags to match the user-set values.
> The fix is to correct BEAM-1833 and always pass in the tags. However, that 
> doesn't fix the problem for nested PCollections. If you have a dict of lists 
> of PCollections, what should their tags be set to? In order to fix 
> this, first propagate the correct tag, then talk with the community about the 
> best auto-generated tags.
> Some users may rely on the old implementation, so a flag will be created, 
> "force_generated_pcollection_output_ids", set to False by default. If 
> True, this will fall back to the old implementation and generate tags for 
> PCollections.





[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113603#comment-17113603
 ] 

Brian Hulette commented on BEAM-7304:
-

Cherry-picks are just for release blockers: bugs representing regressions, or 
features that the community has agreed it's worth delaying the release for

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]





[jira] [Comment Edited] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113603#comment-17113603
 ] 

Brian Hulette edited comment on BEAM-7304 at 5/21/20, 10:44 PM:


Cherry-picks are just for release blockers: bugs representing regressions, or 
features that the community has agreed it's worth delaying the release for: 
https://beam.apache.org/contribute/release-blocking/


was (Author: bhulette):
Cherry-picks are just for release blockers: bugs representing regressions, or 
features that the community has agreed it's worth delaying the release for

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]





[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=436250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436250
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 21/May/20 22:42
Start Date: 21/May/20 22:42
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632382830


   Run Python PreCommit





Issue Time Tracking
---

Worklog Id: (was: 436250)
Remaining Estimate: 130h 40m  (was: 130h 50m)
Time Spent: 37h 20m  (was: 37h 10m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: P2
>  Labels: gcs
> Fix For: 2.22.0
>
>   Original Estimate: 168h
>  Time Spent: 37h 20m
>  Remaining Estimate: 130h 40m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The current 
> implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class providing basic IO capability and which eventually 
> creates the channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create read and write channels, which is expected to be 
> more flexible because it can pick up new changes (e.g. a new channel 
> implementation using a new protocol) without code changes.





[jira] [Work logged] (BEAM-9722) Add batch SnowflakeIO.Read to Java SDK

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9722?focusedWorklogId=436249=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436249
 ]

ASF GitHub Bot logged work on BEAM-9722:


Author: ASF GitHub Bot
Created on: 21/May/20 22:39
Start Date: 21/May/20 22:39
Worklog Time Spent: 10m 
  Work Description: chamikaramj merged pull request #11360:
URL: https://github.com/apache/beam/pull/11360


   





Issue Time Tracking
---

Worklog Id: (was: 436249)
Time Spent: 8.5h  (was: 8h 20m)

> Add batch SnowflakeIO.Read to Java SDK
> --
>
> Key: BEAM-9722
> URL: https://issues.apache.org/jira/browse/BEAM-9722
> Project: Beam
>  Issue Type: New Feature
>  Components: io-ideas
>Reporter: Kasia Kucharczyk
>Assignee: Dariusz Aniszewski
>Priority: P2
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-10052) check hash and avoid duplicates when uploading artifact in Python Dataflow Runner

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10052?focusedWorklogId=436248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436248
 ]

ASF GitHub Bot logged work on BEAM-10052:
-

Author: ASF GitHub Bot
Created on: 21/May/20 22:36
Start Date: 21/May/20 22:36
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #11771:
URL: https://github.com/apache/beam/pull/11771#issuecomment-632381115


   Retest this please





Issue Time Tracking
---

Worklog Id: (was: 436248)
Time Spent: 1.5h  (was: 1h 20m)

> check hash and avoid duplicates when uploading artifact in Python Dataflow 
> Runner
> -
>
> Key: BEAM-10052
> URL: https://issues.apache.org/jira/browse/BEAM-10052
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: P2
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> An xlang pipeline could have many duplicated jars. It would be great if we 
> checked hashes and avoided duplicate uploads.
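
The idea of checking a hash before uploading can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Dataflow runner's code: the names file_sha256 and stage_unique and the upload callback are hypothetical, and SHA-256 is assumed as the content hash.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_unique(paths, upload):
    """Call upload(path, digest) once per distinct file content.

    `upload` is a hypothetical callback standing in for the real staging
    step. Returns a dict mapping every input path to its digest, so that
    duplicates can still be resolved to the single staged copy.
    """
    staged = set()
    digests = {}
    for path in paths:
        digest = file_sha256(path)
        digests[path] = digest
        if digest not in staged:
            upload(path, digest)
            staged.add(digest)
    return digests
```

Keying the staged set on content rather than on file name means two jars with different names but identical bytes are uploaded only once.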





[jira] [Commented] (BEAM-10061) ReadAllFromTextWithFilename

2020-05-21 Thread Ryan Canty (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113593#comment-17113593
 ] 

Ryan Canty commented on BEAM-10061:
---

CC [~data-runner0]

> ReadAllFromTextWithFilename
> ---
>
> Key: BEAM-10061
> URL: https://issues.apache.org/jira/browse/BEAM-10061
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-community
> Environment: Dataflow with Python
>Reporter: Ryan Canty
>Assignee: Aizhamal Nurmamat kyzy
>Priority: P2
>
> I am trying to create a job that reads from GCS, executes some code against 
> each line, and creates a PCollection with the line and the file. So basically 
> what I'd like is a combination of textio.ReadTextWithFilename and 
> textio.ReadAllFromText





[jira] [Updated] (BEAM-10061) ReadAllFromTextWithFilename

2020-05-21 Thread Ryan Canty (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Canty updated BEAM-10061:
--
Summary: ReadAllFromTextWithFilename  (was: ReadAllFromTextWithFilenam)

> ReadAllFromTextWithFilename
> ---
>
> Key: BEAM-10061
> URL: https://issues.apache.org/jira/browse/BEAM-10061
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-community
> Environment: Dataflow with Python
>Reporter: Ryan Canty
>Assignee: Aizhamal Nurmamat kyzy
>Priority: P2
>
> I am trying to create a job that reads from GCS, executes some code against 
> each line, and creates a PCollection with the line and the file. So basically 
> what I'd like is a combination of textio.ReadTextWithFilename and 
> textio.ReadAllFromText





[jira] [Created] (BEAM-10061) ReadAllFromTextWithFilenam

2020-05-21 Thread Ryan Canty (Jira)
Ryan Canty created BEAM-10061:
-

 Summary: ReadAllFromTextWithFilenam
 Key: BEAM-10061
 URL: https://issues.apache.org/jira/browse/BEAM-10061
 Project: Beam
  Issue Type: New Feature
  Components: beam-community
 Environment: Dataflow with Python
Reporter: Ryan Canty
Assignee: Aizhamal Nurmamat kyzy


I am trying to create a job that reads from GCS, executes some code against each 
line, and creates a PCollection with the line and the file. So basically what 
I'd like is a combination of textio.ReadTextWithFilename and 
textio.ReadAllFromText
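
The element shape being requested can be sketched in plain Python. This is not Beam API: read_all_with_filename and its open_fn parameter are hypothetical names used only to show the intended (filename, line) output for a collection of file names. On GCS, a filesystem-aware opener would have to replace the built-in open.

```python
def read_all_with_filename(paths, open_fn=open):
    """Yield (filename, line) pairs for every line of every file in paths.

    `open_fn` is a hypothetical hook; it defaults to the built-in open and
    only needs to return a text-mode file object for a given path.
    """
    for path in paths:
        with open_fn(path, "r") as handle:
            for line in handle:
                # Strip only the trailing newline so the line text is kept intact.
                yield path, line.rstrip("\n")
```

In a pipeline, the loop body would presumably live in a step applied to each incoming file name, so that the file name travels with every line it produces.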





[jira] [Work logged] (BEAM-7390) Colab examples for aggregation transforms (Python)

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7390?focusedWorklogId=436243=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436243
 ]

ASF GitHub Bot logged work on BEAM-7390:


Author: ASF GitHub Bot
Created on: 21/May/20 22:26
Start Date: 21/May/20 22:26
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #10165:
URL: https://github.com/apache/beam/pull/10165#issuecomment-632377660


   retest this please





Issue Time Tracking
---

Worklog Id: (was: 436243)
Time Spent: 11h 20m  (was: 11h 10m)

> Colab examples for aggregation transforms (Python)
> --
>
> Key: BEAM-7390
> URL: https://issues.apache.org/jira/browse/BEAM-7390
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: P3
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Merge aggregation Colabs into the transform catalog





[jira] [Work logged] (BEAM-9825) Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9825?focusedWorklogId=436244=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436244
 ]

ASF GitHub Bot logged work on BEAM-9825:


Author: ASF GitHub Bot
Created on: 21/May/20 22:26
Start Date: 21/May/20 22:26
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #11610:
URL: https://github.com/apache/beam/pull/11610#issuecomment-632377843


   Run Java PreCommit





Issue Time Tracking
---

Worklog Id: (was: 436244)
Remaining Estimate: 85.5h  (was: 85h 40m)
Time Spent: 10.5h  (was: 10h 20m)

> Transforms for Intersect, IntersectAll, Except, ExceptAll, Union, UnionAll
> --
>
> Key: BEAM-9825
> URL: https://issues.apache.org/jira/browse/BEAM-9825
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: P2
>   Original Estimate: 96h
>  Time Spent: 10.5h
>  Remaining Estimate: 85.5h
>
> I'd like to propose the following new high-level transforms.
>  * Intersect
> Compute the intersection between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are common to both _leftCollection_ and 
> _rightCollection_
>  
>  * Except
> Compute the difference between elements of two PCollections.
> Given _leftCollection_ and _rightCollection_, this transform returns a 
> collection containing elements that are in _leftCollection_ but not in 
> _rightCollection_
>  * Union
> Find the elements that are in either of two PCollections.
> Implement IntersectAll, ExceptAll and UnionAll variants of the transforms.
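
The difference between the plain and the "All" variants can be illustrated on ordinary Python lists. This is a sketch that assumes the proposal follows SQL set-operation semantics (distinct results for Intersect, Except and Union; multiset results for the All variants); the function names are hypothetical and none of this is Beam code.

```python
from collections import Counter


def intersect(left, right):
    """Distinct elements present in both inputs."""
    return sorted(set(left) & set(right))


def intersect_all(left, right):
    """Each element repeated min(count in left, count in right) times."""
    return sorted((Counter(left) & Counter(right)).elements())


def except_distinct(left, right):
    """Distinct elements of left that do not occur in right."""
    return sorted(set(left) - set(right))


def except_all(left, right):
    """Each element repeated max(count in left - count in right, 0) times."""
    return sorted((Counter(left) - Counter(right)).elements())


def union(left, right):
    """Distinct elements present in either input."""
    return sorted(set(left) | set(right))


def union_all(left, right):
    """Every element of both inputs, duplicates kept."""
    return sorted(list(left) + list(right))
```

Results are sorted only to make the output deterministic here; a PCollection is unordered, so a real transform would make no ordering promise.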





[jira] [Assigned] (BEAM-10051) Misordered check WRT closed data readers.

2020-05-21 Thread Robert Burke (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Burke reassigned BEAM-10051:
---

Assignee: Robert Burke

> Misordered check WRT closed data readers.
> -
>
> Key: BEAM-10051
> URL: https://issues.apache.org/jira/browse/BEAM-10051
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: P2
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This check 
> https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/harness/datamgr.go#L269
> in its current position prevents the "normal teardown" that the reader 
> expects. This means that readers for instructions that terminate early, such 
> as due to splitting, stay resident in memory and never close.
> In practice this is benign as the buffer would already be closed, but with 
> streaming this memory leak would become noticeable.
> The fix is to move the check to after the sentinel check, and additionally 
> check there for early termination to avoid closing the buffer twice.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436239=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436239
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 22:10
Start Date: 21/May/20 22:10
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #11783:
URL: https://github.com/apache/beam/pull/11783#issuecomment-632372278


   Thanks!
   Will wait for the tests to pass.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436239)
Time Spent: 50m  (was: 40m)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436238=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436238
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 22:08
Start Date: 21/May/20 22:08
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #11783:
URL: https://github.com/apache/beam/pull/11783#issuecomment-632371565


   LGTM if tests pass.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436238)
Time Spent: 40m  (was: 0.5h)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436236=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436236
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 22:06
Start Date: 21/May/20 22:06
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #11783:
URL: https://github.com/apache/beam/pull/11783#issuecomment-632370590


   R: @tvalentyn 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436236)
Time Spent: 0.5h  (was: 20m)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?focusedWorklogId=436235=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436235
 ]

ASF GitHub Bot logged work on BEAM-10060:
-

Author: ASF GitHub Bot
Created on: 21/May/20 22:05
Start Date: 21/May/20 22:05
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #11783:
URL: https://github.com/apache/beam/pull/11783#issuecomment-632370504


   R: @ibzib 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436235)
Time Spent: 20m  (was: 10m)

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-10060:

Fix Version/s: 2.23.0

> Release new Python container image for Dataflow
> ---
>
> Key: BEAM-10060
> URL: https://issues.apache.org/jira/browse/BEAM-10060
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py-harness
>Reporter: Ankur Goenka
>Priority: P2
> Fix For: 2.23.0
>
>
> Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (BEAM-10060) Release new Python container image for Dataflow

2020-05-21 Thread Ankur Goenka (Jira)
Ankur Goenka created BEAM-10060:
---

 Summary: Release new Python container image for Dataflow
 Key: BEAM-10060
 URL: https://issues.apache.org/jira/browse/BEAM-10060
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow, sdk-py-harness
Reporter: Ankur Goenka


Release beam-master-20200521 for python legacy and fnapi



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9946) Enhance Partition transform to provide partitionfn with SideInputs

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9946?focusedWorklogId=436219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436219
 ]

ASF GitHub Bot logged work on BEAM-9946:


Author: ASF GitHub Bot
Created on: 21/May/20 21:43
Start Date: 21/May/20 21:43
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #11682:
URL: https://github.com/apache/beam/pull/11682#issuecomment-632361896


   Java precommit failed in the last 2 runs. Could you look at the logs? Is it 
related to this change?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436219)
Remaining Estimate: 93h 40m  (was: 93h 50m)
Time Spent: 2h 20m  (was: 2h 10m)

> Enhance Partition transform to provide partitionfn with SideInputs
> --
>
> Key: BEAM-9946
> URL: https://issues.apache.org/jira/browse/BEAM-9946
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Darshan Jani
>Assignee: Darshan Jani
>Priority: P2
>   Original Estimate: 96h
>  Time Spent: 2h 20m
>  Remaining Estimate: 93h 40m
>
> Currently the _Partition_ transform can partition a collection into n collections 
> based only on the _element_ value, which _PartitionFn_ uses to decide which partition a 
> particular element belongs to.
> {code:java}
> public interface PartitionFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions);
> }
> public static <T> Partition<T> of(int numPartitions, PartitionFn<? super T> 
>     partitionFn) {
>   return new Partition<>(new PartitionDoFn<T>(numPartitions, partitionFn));
> }
> {code}
> It would be useful to introduce a new API that additionally provides _sideInputs_ 
> to the partition function. Users would then be able to write logic that uses both the _element_ 
> value and the _sideInputs_ to decide which partition a particular element 
> belongs to.
> Option-1: Proposed new API:
> {code:java}
> public interface PartitionWithSideInputsFn<T> extends Serializable {
>   int partitionFor(T elem, int numPartitions, Context c);
> }
> public static <T> Partition<T> of(int numPartitions, 
>     PartitionWithSideInputsFn<? super T> partitionFn, Requirements requirements) {
>   ...
> }
> {code}
> Users can use either of the two APIs, depending on their partitioning function logic.
> Option-2: Redesign the old API with a builder pattern that can optionally take 
> a _Requirements_ with _sideInputs_, and deprecate the old API.
> {code:java}
> // using sideviews
> Partition.into(numberOfPartitions).via(
> fn(
>   (input,c) ->  {
> // use c.sideInput(view)
> // use input
> // return partitionnumber
>  },requiresSideInputs(view))
> )
> // without using sideviews
> Partition.into(numberOfPartitions).via(
> fn((input,c) ->  {
> // use input
> // return partitionnumber
>  })
> )
> {code}
>  
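
The idea behind both options can be sketched in plain Java, without the Beam
SDK. Everything here is hypothetical and only illustrates the shape of the
proposal: PartitionSketch, SideInputPartitionFn and partition are made-up
names, and the side input is modeled as an ordinary value passed alongside
each element rather than as a Beam view.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed API; not Beam code.
class PartitionSketch {
  // A partition function that sees the element and a side-input value.
  interface SideInputPartitionFn<T, S> {
    int partitionFor(T elem, int numPartitions, S sideInput);
  }

  // Routes every element to the partition chosen by fn.
  static <T, S> List<List<T>> partition(
      List<T> input, int numPartitions, S sideInput, SideInputPartitionFn<T, S> fn) {
    List<List<T>> out = new ArrayList<>();
    for (int i = 0; i < numPartitions; i++) {
      out.add(new ArrayList<>());
    }
    for (T elem : input) {
      int p = fn.partitionFor(elem, numPartitions, sideInput);
      if (p < 0 || p >= numPartitions) {
        throw new IllegalArgumentException("partition out of range: " + p);
      }
      out.get(p).add(elem);
    }
    return out;
  }
}
```

For example, with a threshold of 10 supplied as the side input, the function
(elem, n, threshold) -> elem < threshold ? 0 : 1 sends 1 and 5 to partition 0
and 10 and 20 to partition 1.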



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (BEAM-7304) Twister2 Beam runner

2020-05-21 Thread Pulasthi Wickramasinghe (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113561#comment-17113561
 ] 

Pulasthi Wickramasinghe commented on BEAM-7304:
---

[~bhulette] thanks for the update. Is the option mentioned by [~iemejia], to 
cherry-pick this later for the 2.22.0 release, still viable? I am not 
sure how that would work with the Beam release process, but I wanted to follow up 
since it was mentioned a couple of times.

> Twister2 Beam runner
> 
>
> Key: BEAM-7304
> URL: https://issues.apache.org/jira/browse/BEAM-7304
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-ideas
>Reporter: Pulasthi Wickramasinghe
>Assignee: Pulasthi Wickramasinghe
>Priority: P3
>  Time Spent: 21h 20m
>  Remaining Estimate: 0h
>
> Twister2 is a big data framework which supports both batch and stream 
> processing [1] [2]. The goal is to develop a Beam runner for Twister2. 
> [1] [https://github.com/DSC-SPIDAL/twister2]
> [2] [https://twister2.org/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9603) Support Dynamic Timer in Java SDK over FnApi

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9603?focusedWorklogId=436213=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436213
 ]

ASF GitHub Bot logged work on BEAM-9603:


Author: ASF GitHub Bot
Created on: 21/May/20 21:17
Start Date: 21/May/20 21:17
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #11756:
URL: https://github.com/apache/beam/pull/11756#issuecomment-632350790


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436213)
Time Spent: 6h 10m  (was: 6h)

> Support Dynamic Timer in Java SDK over FnApi
> 
>
> Key: BEAM-9603
> URL: https://issues.apache.org/jira/browse/BEAM-9603
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-harness
>Reporter: Boyuan Zhang
>Assignee: Yichi Zhang
>Priority: P2
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=436210=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436210
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 21/May/20 21:13
Start Date: 21/May/20 21:13
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632349170


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436210)
Remaining Estimate: 130h 50m  (was: 131h)
Time Spent: 37h 10m  (was: 37h)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: P2
>  Labels: gcs
> Fix For: 2.22.0
>
>   Original Estimate: 168h
>  Time Spent: 37h 10m
>  Remaining Estimate: 130h 50m
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The current 
> implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data, rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class that provides basic IO capability and eventually 
> creates the channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create the read and write channels. This is expected to be 
> more flexible because GcsUtil can then pick up new changes easily, e.g. a new channel 
> implementation using a new protocol, without code changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=436208=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436208
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 21/May/20 21:10
Start Date: 21/May/20 21:10
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632347848


   LGTM. Thanks.
   
   Ran the Kafka test a few times and it seems to be working.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436208)
Time Spent: 34h 50m  (was: 34h 40m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: P2
>  Labels: portability
>  Time Spent: 34h 50m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=436206=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436206
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 21/May/20 20:58
Start Date: 21/May/20 20:58
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632342320


   Test coverage comes from existing IOs that enable these features, which we don't 
have enough of in Beam (since it requires portable runners to implement SDF to 
a greater extent than they do), so we supplement it with internal testing at Google.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436206)
Time Spent: 34h 40m  (was: 34.5h)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: P2
>  Labels: portability
>  Time Spent: 34h 40m
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-2939) Fn API SDF support

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-2939?focusedWorklogId=436202=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436202
 ]

ASF GitHub Bot logged work on BEAM-2939:


Author: ASF GitHub Bot
Created on: 21/May/20 20:44
Start Date: 21/May/20 20:44
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11781:
URL: https://github.com/apache/beam/pull/11781#issuecomment-632335607


   Run Java PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436202)
Time Spent: 34.5h  (was: 34h 20m)

> Fn API SDF support
> --
>
> Key: BEAM-2939
> URL: https://issues.apache.org/jira/browse/BEAM-2939
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Henning Rohde
>Assignee: Luke Cwik
>Priority: P2
>  Labels: portability
>  Time Spent: 34.5h
>  Remaining Estimate: 0h
>
> The Fn API should support streaming SDF. Detailed design TBD.
> Once design is ready, expand subtasks similarly to BEAM-2822.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-8889) Make GcsUtil use GoogleCloudStorage

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8889?focusedWorklogId=436201=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436201
 ]

ASF GitHub Bot logged work on BEAM-8889:


Author: ASF GitHub Bot
Created on: 21/May/20 20:44
Start Date: 21/May/20 20:44
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11778:
URL: https://github.com/apache/beam/pull/11778#issuecomment-632335448


   Run Python PreCommit



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436201)
Remaining Estimate: 131h  (was: 131h 10m)
Time Spent: 37h  (was: 36h 50m)

> Make GcsUtil use GoogleCloudStorage
> ---
>
> Key: BEAM-8889
> URL: https://issues.apache.org/jira/browse/BEAM-8889
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp
>Affects Versions: 2.16.0
>Reporter: Esun Kim
>Assignee: VASU NORI
>Priority: P2
>  Labels: gcs
> Fix For: 2.22.0
>
>   Original Estimate: 168h
>  Time Spent: 37h
>  Remaining Estimate: 131h
>
> [GcsUtil|https://github.com/apache/beam/blob/master/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/util/GcsUtil.java]
>  is a primary class for accessing Google Cloud Storage in Apache Beam. The current 
> implementation creates GoogleCloudStorageReadChannel and 
> GoogleCloudStorageWriteChannel directly to read and write GCS data, rather 
> than using 
> [GoogleCloudStorage|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcsio/src/main/java/com/google/cloud/hadoop/gcsio/GoogleCloudStorage.java]
>  which is an abstract class that provides basic IO capability and eventually 
> creates the channel objects. This request is about updating GcsUtil to use 
> GoogleCloudStorage to create the read and write channels. This is expected to be 
> more flexible because GcsUtil can then pick up new changes easily, e.g. a new channel 
> implementation using a new protocol, without code changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (BEAM-3767) A Complex Event Processing (CEP) library/extension for Apache Beam

2020-05-21 Thread Pablo Estrada (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada updated BEAM-3767:

Labels: gsoc gsoc2021  (was: )

> A Complex Event Processing (CEP) library/extension for Apache Beam
> --
>
> Key: BEAM-3767
> URL: https://issues.apache.org/jira/browse/BEAM-3767
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Ismaël Mejía
>Priority: P3
>  Labels: gsoc, gsoc2021
>
> Apache Beam [1] is a unified and portable programming model for data 
> processing jobs. The Beam model [2, 3, 4] has rich mechanisms to process 
> endless streams of events.
> Complex Event Processing [5] lets you match patterns of events in streams to 
> detect important patterns in data and react to them.
> Some example uses of CEP are fraud detection, for example by detecting 
> unusual behavior (patterns of activity) such as network intrusion or suspicious 
> banking transactions. Trend detection is another interesting use 
> case in the context of sensors and IoT.
> The goal of this issue is to implement an efficient pattern matching library 
> inspired by [6] and existing libraries like Apache Flink CEP [7] using the 
> Apache Beam Java SDK and the Beam style guides [8]. Because of the time 
> constraints of GSoC we will probably first try to cover simple patterns of 
> the ‘a followed by b followed by c’ kind, and then, if there is still time, try 
> to cover more advanced ones, e.g. optional, atLeastOne, oneOrMore, etc.
> [1] [https://beam.apache.org/]
>  [2] [https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101]
>  [3] [https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102]
>  [4] 
> [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf]
>  [5] [https://en.wikipedia.org/wiki/Complex_event_processing]
>  [6] [https://people.cs.umass.edu/~yanlei/publications/sase-sigmod08.pdf]
>  [7] 
> [https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/cep.html]
>  [8] [https://beam.apache.org/contribute/ptransform-style-guide/]
>  
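
The simplest pattern named in this issue, 'a followed by b followed by c',
can be sketched as a small state machine in plain Java. This is a toy under
stated assumptions, not a design for the library: SequencePatternSketch and
countMatches are hypothetical names, events are plain strings in arrival
order, unrelated events between steps are skipped (relaxed contiguity), and
matches do not overlap. Keys, windows and event time are ignored entirely.

```java
import java.util.List;

// Hypothetical illustration of sequence matching; not Beam or Flink code.
class SequencePatternSketch {
  // Counts non-overlapping occurrences of the pattern steps, in order,
  // skipping events that do not advance the current step.
  static int countMatches(List<String> events, List<String> pattern) {
    if (pattern.isEmpty()) {
      return 0;
    }
    int state = 0;   // index of the next pattern step to match
    int matches = 0;
    for (String event : events) {
      if (event.equals(pattern.get(state))) {
        state++;
        if (state == pattern.size()) {
          matches++;
          state = 0; // start looking for the next match
        }
      }
    }
    return matches;
  }
}
```

For the events a, x, b, c, a, b, y, c, a and the pattern a, b, c this finds
two matches; the trailing a starts a third match that never completes.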



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-7554) datetime and decimal should be logical types

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7554?focusedWorklogId=436199=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436199
 ]

ASF GitHub Bot logged work on BEAM-7554:


Author: ASF GitHub Bot
Created on: 21/May/20 20:40
Start Date: 21/May/20 20:40
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #11456:
URL: https://github.com/apache/beam/pull/11456#discussion_r428901750



##
File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/logicaltypes/MillisInstant.java
##
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.schemas.logicaltypes;
+
+import org.joda.time.Instant;
+import org.joda.time.ReadableInstant;
+
+/** A timestamp represented as milliseconds since the epoch. */
+public class MillisInstant extends MillisType {
+  public static final String IDENTIFIER = 
"beam:logical_type:millis_instant:v1";

Review comment:
   A timestamp type seems like it's by definition time-zone agnostic. If we 
want a datetime type, that should be a different type.
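
The point in this review comment can be illustrated with java.time from the
JDK. This is a sketch only: the class under review uses Joda-Time, and
InstantSketch and render are hypothetical names. It shows that a value
stored as milliseconds since the epoch is the same instant everywhere, and
that a time zone only matters when the instant is rendered as a local date
and time.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical illustration; not part of the pull request.
class InstantSketch {
  // Renders an epoch-millisecond timestamp in the given time zone.
  static ZonedDateTime render(long epochMillis, String zoneId) {
    return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zoneId));
  }
}
```

Rendering epoch millisecond 0 in "UTC" gives hour 0 and in "Asia/Tokyo"
gives hour 9, yet converting either result back with
toInstant().toEpochMilli() returns 0 again.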





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436199)
Time Spent: 1h 10m  (was: 1h)

> datetime and decimal should be logical types
> 
>
> Key: BEAM-7554
> URL: https://issues.apache.org/jira/browse/BEAM-7554
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql, sdk-java-core
>Reporter: Brian Hulette
>Priority: P2
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently datetime and decimal are implemented as primitive types in the Java 
> SDK. They should be logical types as documented in 
> https://s.apache.org/beam-schemas



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=436198=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436198
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 21/May/20 20:36
Start Date: 21/May/20 20:36
Worklog Time Spent: 10m 
  Work Description: robinyqiu commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632331823


   > are there any tests that use the DATE Type in an aggregation (e.g. MAX)?
   
   No. Thanks for bringing this up. I think it is likely to run into the 
problem.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436198)
Time Spent: 4h 40m  (was: 4.5h)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql-zetasql
>Reporter: Robin Qiu
>Assignee: Robin Qiu
>Priority: P2
>  Labels: zetasql-compliance
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=436197=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436197
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 21/May/20 20:34
Start Date: 21/May/20 20:34
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632330703


   Interesting question. You should probably add a test for JOIN as well, which 
will have a similar class of problems.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436197)
Time Spent: 4.5h  (was: 4h 20m)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql-zetasql
>Reporter: Robin Qiu
>Assignee: Robin Qiu
>Priority: P2
>  Labels: zetasql-compliance
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=436195=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436195
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 21/May/20 20:28
Start Date: 21/May/20 20:28
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit edited a comment on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533


   Something just occurred to me - are there any tests that use the DATE Type 
in an aggregation (e.g. MAX)?
   
   I'd think that would run into the same issue I have in #11456 (processing 
logical types using their representation)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 436195)
Time Spent: 4h 20m  (was: 4h 10m)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql-zetasql
> Reporter: Robin Qiu
> Assignee: Robin Qiu
> Priority: P2
> Labels: zetasql-compliance
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-9641) Support ZetaSQL DATE functions in BeamSQL

2020-05-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9641?focusedWorklogId=436194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-436194
 ]

ASF GitHub Bot logged work on BEAM-9641:


Author: ASF GitHub Bot
Created on: 21/May/20 20:28
Start Date: 21/May/20 20:28
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #11272:
URL: https://github.com/apache/beam/pull/11272#issuecomment-632327533


   Something just occurred to me - are there any tests that use the DATE Type 
in an aggregation (e.g. MAX)?
   
   I'd think that would run into the same issue I have in #11456 




Issue Time Tracking
---

Worklog Id: (was: 436194)
Time Spent: 4h 10m  (was: 4h)

> Support ZetaSQL DATE functions in BeamSQL
> -
>
> Key: BEAM-9641
> URL: https://issues.apache.org/jira/browse/BEAM-9641
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql-zetasql
> Reporter: Robin Qiu
> Assignee: Robin Qiu
> Priority: P2
> Labels: zetasql-compliance
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-10059) Several Dataflow load tests failing

2020-05-21 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-10059:
-
Priority: P1  (was: P2)

> Several Dataflow load tests failing
> ---
>
> Key: BEAM-10059
> URL: https://issues.apache.org/jira/browse/BEAM-10059
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, sdk-java-core, test-failures
> Reporter: Brian Hulette
> Priority: P1
>
> Example Failure: 
> https://builds.apache.org/job/beam_LoadTests_Java_Combine_Dataflow_Streaming/380
> The same issue seems to affect every other Dataflow LoadTest
> {code}
> 05:20:56 Exception in thread "main" java.lang.RuntimeException: Failed to construct instance from factory method TestDataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
> 05:20:56  at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:224)
> 05:20:56  at org.apache.beam.sdk.util.InstanceBuilder.build(InstanceBuilder.java:155)
> 05:20:56  at org.apache.beam.sdk.PipelineRunner.fromOptions(PipelineRunner.java:55)
> 05:20:56  at org.apache.beam.sdk.Pipeline.create(Pipeline.java:149)
> 05:20:56  at org.apache.beam.sdk.loadtests.LoadTest.<init>(LoadTest.java:86)
> 05:20:56  at org.apache.beam.sdk.loadtests.CombineLoadTest.<init>(CombineLoadTest.java:112)
> 05:20:56  at org.apache.beam.sdk.loadtests.CombineLoadTest.main(CombineLoadTest.java:169)
> 05:20:56 Caused by: java.lang.reflect.InvocationTargetException
> 05:20:56  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 05:20:56  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 05:20:56  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 05:20:56  at java.lang.reflect.Method.invoke(Method.java:498)
> 05:20:56  at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod(InstanceBuilder.java:214)
> 05:20:56  ... 6 more
> 05:20:56 Caused by: java.lang.NullPointerException
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:877)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.toString(Joiner.java:452)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.appendTo(Joiner.java:106)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.appendTo(Joiner.java:152)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.join(Joiner.java:195)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.join(Joiner.java:185)
> 05:20:56  at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Joiner.join(Joiner.java:211)
> 05:20:56  at org.apache.beam.runners.dataflow.TestDataflowRunner.fromOptions(TestDataflowRunner.java:74)
> 05:20:56  ... 11 more
> {code}





[jira] [Commented] (BEAM-10059) Several Dataflow load tests failing

2020-05-21 Thread Brian Hulette (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113515#comment-17113515
 ] 

Brian Hulette commented on BEAM-10059:
--

The error is an NPE here: 
https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/TestDataflowRunner.java#L72-L74

{code}
String tempLocation =
    Joiner.on("/")
        .join(dataflowOptions.getTempRoot(), dataflowOptions.getJobName(), "output", "results");
{code}

It seems tempRoot or jobName is null; Guava's Joiner.join throws a NullPointerException on any null argument.
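
A minimal sketch of how the missing option could be reported with a clearer message than a bare NPE from inside Joiner. This is not the fix applied in Beam: the class and method names are hypothetical, and it assumes plain JDK calls (Objects.requireNonNull and String.join) in place of the vendored Guava Joiner.

```java
import java.util.Objects;

// Hypothetical helper, not part of Beam: validates the two options that
// TestDataflowRunner joins into the temp location before building the path.
final class TempLocationSketch {

    /** Builds "<tempRoot>/<jobName>/output/results", naming whichever input is null. */
    static String buildTempLocation(String tempRoot, String jobName) {
        // Fail early with a message that says which option is missing.
        Objects.requireNonNull(tempRoot, "tempRoot must be set");
        Objects.requireNonNull(jobName, "jobName must be set");
        return String.join("/", tempRoot, jobName, "output", "results");
    }

    public static void main(String[] args) {
        // Both values present: the path is built as usual.
        System.out.println(buildTempLocation("gs://example-bucket/tmp", "combine-load-test"));

        // tempRoot missing: the exception message identifies the culprit.
        try {
            buildTempLocation(null, "combine-load-test");
        } catch (NullPointerException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, the load test would still fail, but the log would say which of the two options was unset instead of pointing at Preconditions.checkNotNull.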

> Several Dataflow load tests failing
> ---
>
> Key: BEAM-10059
> URL: https://issues.apache.org/jira/browse/BEAM-10059
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, sdk-java-core, test-failures
> Reporter: Brian Hulette
> Priority: P2
>





[jira] [Updated] (BEAM-10059) Several Dataflow load tests failing

2020-05-21 Thread Brian Hulette (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hulette updated BEAM-10059:
-
Component/s: test-failures

> Several Dataflow load tests failing
> ---
>
> Key: BEAM-10059
> URL: https://issues.apache.org/jira/browse/BEAM-10059
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, sdk-java-core, test-failures
> Reporter: Brian Hulette
> Priority: P2
>




