[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=307689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307689
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:59
Start Date: 06/Sep/19 07:59
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9451: [BEAM-8113] Stage 
files from context classloader
URL: https://github.com/apache/beam/pull/9451#discussion_r321618536
 
 

 ##
 File path: 
runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PipelineResources.java
 ##
 @@ -40,34 +44,42 @@
 public class PipelineResources {
 
   /**
-   * Attempts to detect all the resources the class loader has access to. This 
does not recurse to
-   * class loader parents stopping it from pulling in resources from the 
system class loader.
+   * Detects all URLs that are present in all class loaders in between context 
class loader of
+   * calling thread and class loader of class passed as parameter. It doesn't 
follow parents above
+   * this class loader stopping it from pulling in resources from the system 
class loader.
*
-   * @param classLoader The URLClassLoader to use to detect resources to stage.
-   * @throws IllegalArgumentException If either the class loader is not a 
URLClassLoader or one of
-   * the resources the class loader exposes is not a file resource.
+   * @param cls Class whose class loader stops recursion into parent loaders
+   * @throws IllegalArgumentException no classloader in context hierarchy is 
URLClassloader or if
+   * one of the resources any class loader exposes is not a file resource.
* @return A list of absolute paths to the resources the class loader uses.
*/
-  public static List detectClassPathResourcesToStage(ClassLoader 
classLoader) {
-if (!(classLoader instanceof URLClassLoader)) {
-  String message =
-  String.format(
-  "Unable to use ClassLoader to detect classpath elements. "
-  + "Current ClassLoader is %s, only URLClassLoaders are 
supported.",
-  classLoader);
-  throw new IllegalArgumentException(message);
-}
+  public static List detectClassPathResourcesToStage(Class cls) {
+return detectClassPathResourcesToStage(cls.getClassLoader());
+  }
 
-List files = new ArrayList<>();
-for (URL url : ((URLClassLoader) classLoader).getURLs()) {
-  try {
-files.add(new File(url.toURI()).getAbsolutePath());
-  } catch (IllegalArgumentException | URISyntaxException e) {
-String message = String.format("Unable to convert url (%s) to file.", 
url);
-throw new IllegalArgumentException(message, e);
-  }
+  /**
+   * Detect resources to stage, by using hierarchy between current context 
classloader and the given
+   * one. Always include the given class loader in the process.
+   *
+   * @param loader the loader to use as stopping condition when traversing 
context class loaders
+   * @return A list of absolute paths to the resources the class loader uses.
+   */
+  @VisibleForTesting
+  static List detectClassPathResourcesToStage(final ClassLoader 
loader) {
+Set files = new HashSet<>();
 
 Review comment:
   You are right, this is bound to cause trouble. We need to preserve the class 
loader hierarchy.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 307689)
Time Spent: 7h 20m  (was: 7h 10m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Add also files from 
> {{Thread.currentThread().getContextClassLoader()}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=307679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307679
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 06/Sep/19 07:41
Start Date: 06/Sep/19 07:41
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9418: [BEAM-5428] 
Implement cross-bundle user state caching in the Python SDK
URL: https://github.com/apache/beam/pull/9418#discussion_r321611844
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/worker_pool_main.py
 ##
 @@ -47,21 +47,26 @@
 class BeamFnExternalWorkerPoolServicer(
 beam_fn_api_pb2_grpc.BeamFnExternalWorkerPoolServicer):
 
-  def __init__(self, worker_threads, use_process=False,
-   container_executable=None):
+  def __init__(self, worker_threads,
+   use_process=False,
+   container_executable=None,
+   state_cache_size=0):
 self._worker_threads = worker_threads
 self._use_process = use_process
 self._container_executable = container_executable
+self._state_cache_size = state_cache_size
 self._worker_processes = {}
 
   @classmethod
   def start(cls, worker_threads=1, use_process=False, port=0,
-container_executable=None):
+state_cache_size=0, container_executable=None):
 worker_server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
 worker_address = 'localhost:%s' % worker_server.add_insecure_port(
 '[::]:%s' % port)
-worker_pool = cls(worker_threads, use_process=use_process,
-  container_executable=container_executable)
+worker_pool = cls(worker_threads,
+  use_process=use_process,
+  container_executable=container_executable,
+  state_cache_size=state_cache_size)
 
 Review comment:
   We need this here as well because this class is also used for testing. It 
will be overridden for the non-testing case.
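
The point of this comment can be illustrated with a minimal sketch, which is not the actual Beam code: a classmethod factory forwards `state_cache_size` to the constructor, so a test that starts the pool directly can pick a cache size. `WorkerPoolSketch` is a hypothetical stand-in for the servicer class, and the gRPC server setup is deliberately omitted.

```python
class WorkerPoolSketch(object):
  """Hypothetical stand-in for the worker pool servicer (no gRPC involved)."""

  def __init__(self, worker_threads,
               use_process=False,
               container_executable=None,
               state_cache_size=0):
    self._worker_threads = worker_threads
    self._use_process = use_process
    self._container_executable = container_executable
    self._state_cache_size = state_cache_size

  @classmethod
  def start(cls, worker_threads=1, use_process=False,
            state_cache_size=0, container_executable=None):
    # Forward every option by keyword so that a reordered signature
    # cannot silently swap two values.
    return cls(worker_threads,
               use_process=use_process,
               container_executable=container_executable,
               state_cache_size=state_cache_size)


# A test can start the pool directly with a cache size of its choosing.
pool = WorkerPoolSketch.start(worker_threads=2, state_cache_size=100)
```

Passing the options by keyword keeps the call robust if the parameter order of `start` and `__init__` differs, as it does in the diff above.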
 



Issue Time Tracking
---

Worklog Id: (was: 307679)
Time Spent: 18h  (was: 17h 50m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 18h
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=307745&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307745
 ]

ASF GitHub Bot logged work on BEAM-7945:


Author: ASF GitHub Bot
Created on: 06/Sep/19 10:29
Start Date: 06/Sep/19 10:29
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on pull request #9452: 
[BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r321674314
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java
 ##
 @@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.options;
+
+import com.google.auto.service.AutoService;
+import org.apache.beam.sdk.annotations.Experimental;
+import 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** Options that are used to control configuration of the remote environment. 
*/
+@Experimental
+@Hidden
+public interface RemoteEnvironmentOptions extends PipelineOptions {
+
+  @Description("Local semi-persistent directory")
+  @Default.String("/tmp")
 
 Review comment:
   Currently, this matches the default value used elsewhere, for example in 
`boot.go`:
   - 
https://github.com/apache/beam/blob/d21bbaf4c70986c2dbdbe8f6fce35b2b2cb4843d/sdks/go/container/boot.go#L41
   - 
https://github.com/apache/beam/blob/d21bbaf4c70986c2dbdbe8f6fce35b2b2cb4843d/sdks/python/container/boot.go#L51
   
   So, how about we keep `/tmp` as the default value?
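
As a rough illustration of the behaviour under discussion, here is a minimal sketch, assuming a command-line style option: the directory falls back to `/tmp` unless the caller overrides it. The helper `parse_semi_persist_dir` is hypothetical and is not part of the Beam options API or of `boot.go`.

```python
import argparse


def parse_semi_persist_dir(argv):
  """Returns the semi-persistent directory, defaulting to /tmp.

  Hypothetical helper for illustration only.
  """
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--semi_persist_dir',
      default='/tmp',
      help='Local semi-persistent directory')
  # Ignore unrelated flags so the helper can be called with a full argv.
  known_args, _ = parser.parse_known_args(argv)
  return known_args.semi_persist_dir


default_dir = parse_semi_persist_dir([])
custom_dir = parse_semi_persist_dir(['--semi_persist_dir', '/data/beam'])
```

Keeping `/tmp` as the fallback preserves the existing behaviour, while an explicit value lets a deployment point the harness at a larger disk.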
 



Issue Time Tracking
---

Worklog Id: (was: 307745)
Time Spent: 2h  (was: 1h 50m)

> Allow runner to configure "semi_persist_dir" which is used in the SDK harness
> -
>
> Key: BEAM-7945
> URL: https://issues.apache.org/jira/browse/BEAM-7945
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently "semi_persist_dir" is not configurable. This may become a problem 
> in certain scenarios. For example, the default value of "semi_persist_dir" is 
> "/tmp" 
> ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48])
>  in Python SDK harness. When the environment type is "PROCESS", the disk of 
> "/tmp" may be filled up and unexpected issues will occur in production 
> environment. We should provide a way to configure "semi_persist_dir" in 
> EnvironmentFactory at the runner side. 





[jira] [Work logged] (BEAM-8131) Provide Kubernetes setup with Prometheus

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8131?focusedWorklogId=307779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307779
 ]

ASF GitHub Bot logged work on BEAM-8131:


Author: ASF GitHub Bot
Created on: 06/Sep/19 11:50
Start Date: 06/Sep/19 11:50
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on pull request #9482: [BEAM-8131] 
Provide Kubernetes setup for Prometheus
URL: https://github.com/apache/beam/pull/9482#discussion_r321690505
 
 

 ##
 File path: .test-infra/metrics/README.md
 ##
 @@ -84,23 +95,31 @@ docker-compose build
 
 # Spinup docker-compose related containers.
 docker-compose up
 
 Review comment:
   I agree. Combining these two docker-compose files is a good idea.
   
   What do you think about Kubernetes deployments (`beamgrafana-deploy.yaml` 
and `beamprometheus-deploy.yaml`)? It's a similar situation. Prometheus is an 
independent entity, but Grafana depends on Prometheus. Should we combine them 
too?
 



Issue Time Tracking
---

Worklog Id: (was: 307779)
Time Spent: 1h 40m  (was: 1.5h)

> Provide Kubernetes setup with Prometheus
> 
>
> Key: BEAM-8131
> URL: https://issues.apache.org/jira/browse/BEAM-8131
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307816
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:04
Start Date: 06/Sep/19 13:04
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#issuecomment-528846455
 
 
   R: @jklukas 
   R: @reuvenlax 
   Could you take a look?
 



Issue Time Tracking
---

Worklog Id: (was: 307816)
Time Spent: 0.5h  (was: 20m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=307860&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307860
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:38
Start Date: 06/Sep/19 13:38
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9418: [BEAM-5428] Implement 
cross-bundle user state caching in the Python SDK
URL: https://github.com/apache/beam/pull/9418#issuecomment-528858139
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 307860)
Time Spent: 18h 10m  (was: 18h)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 18h 10m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307815
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:02
Start Date: 06/Sep/19 13:02
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on pull request #9499: 
[BEAM-8100] Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321723002
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/AsJsons.java
 ##
 @@ -56,6 +63,35 @@ private AsJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link AsJsonsWithFailures} transform that catches 
exceptions raised while
+   * writing JSON elements, passing the raised exception instance and the 
input element being
+   * processed through the given {@code exceptionHandler} and emitting the 
result to a failure
+   * collection.
+   *
+   * Example usage:
+   *
+   * {@code
+   * WithFailures.Result<PCollection<String>, KV<MyPojo, Map<String, String>>> result =
+   *     pojos.apply(
+   *         AsJsons.of(MyPojo.class)
+   *             .withFailures(new WithFailures.ExceptionAsMapHandler<MyPojo>() {}));
+   *
+   * PCollection<String> output = result.output(); // valid json elements
+   * PCollection<KV<MyPojo, Map<String, String>>> failures = result.failures();
+   * }
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public  AsJsonsWithFailures withFailures(
 
 Review comment:
   I'm not sure that `withFailures` is a good name in this case, and I couldn't 
find out whether any agreement on naming exists. Any advice is welcome.
 



Issue Time Tracking
---

Worklog Id: (was: 307815)
Time Spent: 20m  (was: 10m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307847&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307847
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:27
Start Date: 06/Sep/19 13:27
Worklog Time Spent: 10m 
  Work Description: aromanenko-dev commented on issue #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#issuecomment-528854072
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 307847)
Time Spent: 40m  (was: 0.5h)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307856&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307856
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:37
Start Date: 06/Sep/19 13:37
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321736061
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/AsJsons.java
 ##
 @@ -56,6 +63,35 @@ private AsJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link AsJsonsWithFailures} transform that catches 
exceptions raised while
+   * writing JSON elements, passing the raised exception instance and the 
input element being
+   * processed through the given {@code exceptionHandler} and emitting the 
result to a failure
+   * collection.
+   *
+   * Example usage:
+   *
+   * {@code
+   * WithFailures.Result<PCollection<String>, KV<MyPojo, Map<String, String>>> result =
+   *     pojos.apply(
+   *         AsJsons.of(MyPojo.class)
+   *             .withFailures(new WithFailures.ExceptionAsMapHandler<MyPojo>() {}));
+   *
+   * PCollection<String> output = result.output(); // valid json elements
+   * PCollection<KV<MyPojo, Map<String, String>>> failures = result.failures();
+   * }
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public  AsJsonsWithFailures withFailures(
 
 Review comment:
   And perhaps these JSON cases are constrained enough that we could provide a 
default exceptionHandler so that the user doesn't need to specify one. Perhaps 
it would catch only the exceptions we expect to be raised 
(`JsonProcessingException` for AsJsons and `IOException` for ParseJsons), 
reraising any other exceptions, and return a KV that contains the input element 
along with a string containing the exception message.
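
The idea in this comment can be sketched in plain Python rather than with the Beam Java API. The sketch assumes that the "expected" exception for parsing is `json.JSONDecodeError`. The function `parse_jsons_with_failures` and its `expected` parameter are hypothetical names used only for illustration.

```python
import json


def parse_jsons_with_failures(elements, expected=(json.JSONDecodeError,)):
  """Parses JSON strings, collecting only the expected failures.

  Returns a pair (output, failures), where failures holds
  (input element, exception message) tuples. Any exception that is not
  listed in `expected` is not caught and propagates to the caller.
  """
  output = []
  failures = []
  for element in elements:
    try:
      output.append(json.loads(element))
    except expected as e:
      # Keep the offending input next to a readable error message.
      failures.append((element, str(e)))
  return output, failures


output, failures = parse_jsons_with_failures(['{"a": 1}', '{bad'])
```

Catching a narrow set of exception types means malformed input lands in the failure collection, while an unexpected error such as a `TypeError` from a non-string input still fails loudly.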
 



Issue Time Tracking
---

Worklog Id: (was: 307856)
Time Spent: 1h  (was: 50m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307855
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:37
Start Date: 06/Sep/19 13:37
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321737681
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/AsJsons.java
 ##
 @@ -56,6 +63,35 @@ private AsJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link AsJsonsWithFailures} transform that catches 
exceptions raised while
+   * writing JSON elements, passing the raised exception instance and the 
input element being
+   * processed through the given {@code exceptionHandler} and emitting the 
result to a failure
+   * collection.
+   *
+   * Example usage:
+   *
+   * {@code
+   * WithFailures.Result<PCollection<String>, KV<MyPojo, Map<String, String>>> result =
+   *     pojos.apply(
+   *         AsJsons.of(MyPojo.class)
+   *             .withFailures(new WithFailures.ExceptionAsMapHandler<MyPojo>() {}));
 
 Review comment:
   Perhaps we should provide specific default exception handlers for AsJsons 
and ParseJsons that are identical to ExceptionAsMapHandler except that they 
catch only the exceptions we expect to be raised (`JsonProcessingException` for 
AsJsons and `IOException` for ParseJsons), reraising any other exceptions.
 



Issue Time Tracking
---

Worklog Id: (was: 307855)
Time Spent: 50m  (was: 40m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307858&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307858
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:37
Start Date: 06/Sep/19 13:37
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321733747
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/ParseJsons.java
 ##
 @@ -55,6 +61,34 @@ private ParseJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link ParseJsonsWithFailures} transform that catches 
exceptions raised while
+   * parsing elements, passing the raised exception instance and the input 
element being processed
+   * through the given {@code exceptionHandler} and emitting the result to a 
failure collection.
+   *
+   * Example usage:
 
 Review comment:
   ```suggestion
  * See {@link WithFailures} documentation for usage patterns of the 
returned {@link
  * WithFailures.Result}.
  *
  * Example usage:
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 307858)
Time Spent: 1h 20m  (was: 1h 10m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307859&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307859
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:37
Start Date: 06/Sep/19 13:37
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321730735
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/AsJsons.java
 ##
 @@ -56,6 +63,35 @@ private AsJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link AsJsonsWithFailures} transform that catches 
exceptions raised while
+   * writing JSON elements, passing the raised exception instance and the 
input element being
+   * processed through the given {@code exceptionHandler} and emitting the 
result to a failure
+   * collection.
+   *
+   * Example usage:
+   *
+   * {@code
+   * WithFailures.Result<PCollection<String>, KV<MyPojo, Map<String, String>>> result =
+   *     pojos.apply(
+   *         AsJsons.of(MyPojo.class)
+   *             .withFailures(new WithFailures.ExceptionAsMapHandler<MyPojo>() {}));
+   *
+   * PCollection<String> output = result.output(); // valid json elements
+   * PCollection<KV<MyPojo, Map<String, String>>> failures = result.failures();
+   * }
+   */
+  @Experimental(Experimental.Kind.WITH_EXCEPTIONS)
+  public  AsJsonsWithFailures withFailures(
 
 Review comment:
   My bias would be towards following the API in `MapElements` as closely as 
possible, which would make this method `exceptionsVia` and would suggest that 
we should also have `exceptionsInto` and `exceptionsVia(ProcessFunction)` to 
allow users to define exception processing inline. I would love to see this 
factored out somehow so it's easier to add exception handling to different 
classes with a stable API and without repeating all this boilerplate, but I 
haven't found a suitable alternative.
 



Issue Time Tracking
---

Worklog Id: (was: 307859)
Time Spent: 1h 20m  (was: 1h 10m)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8100) Add exception handling to Json transforms in Java SDK

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8100?focusedWorklogId=307857=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307857
 ]

ASF GitHub Bot logged work on BEAM-8100:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:37
Start Date: 06/Sep/19 13:37
Worklog Time Spent: 10m 
  Work Description: jklukas commented on pull request #9499: [BEAM-8100] 
Exception handling for AsJsons and ParseJsons
URL: https://github.com/apache/beam/pull/9499#discussion_r321733602
 
 

 ##
 File path: 
sdks/java/extensions/jackson/src/main/java/org/apache/beam/sdk/extensions/jackson/AsJsons.java
 ##
 @@ -56,6 +63,35 @@ private AsJsons(Class outputClass) {
 return newTransform;
   }
 
+  /**
+   * Returns a new {@link AsJsonsWithFailures} transform that catches 
exceptions raised while
+   * writing JSON elements, passing the raised exception instance and the 
input element being
+   * processed through the given {@code exceptionHandler} and emitting the 
result to a failure
+   * collection.
+   *
+   * Example usage:
 
 Review comment:
   ```suggestion
  * See {@link WithFailures} documentation for usage patterns of the 
returned {@link
  * WithFailures.Result}.
  *
  * Example usage:
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 307857)
Time Spent: 1h 10m  (was: 1h)

> Add exception handling to Json transforms in Java SDK
> -
>
> Key: BEAM-8100
> URL: https://issues.apache.org/jira/browse/BEAM-8100
> Project: Beam
>  Issue Type: Improvement
>  Components: extensions-java-json, sdk-java-core
>Reporter: Jeff Klukas
>Assignee: Alexey Romanenko
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Follow-up to https://issues.apache.org/jira/browse/BEAM-5638. We have 
> exception handling for MapElements and FlatMapElements. It should be 
> straightforward to add parallel signatures to AsJsons and ParseJsons for 
> catching parsing errors.





[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=307881=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307881
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 06/Sep/19 14:16
Start Date: 06/Sep/19 14:16
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9471: [BEAM-7305] 
Improve Jet Runner related documentation
URL: https://github.com/apache/beam/pull/9471#discussion_r321749449
 
 

 ##
 File path: website/src/get-started/wordcount-example.md
 ##
 @@ -739,6 +755,12 @@ $ mvn package -Pnemo-runner && java -cp 
target/word-count-beam-bundled-0.1.jar o
  --runner=NemoRunner --inputFile=`pwd`/pom.xml --output=counts
 ```
 
+{:.runner-jet}
+```
+$ mvn package -P jet-runner && java -cp target/word-count-beam-bundled-0.1.jar 
org.apache.beam.examples.DebuggingWordCount \
+ --runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/pom.xml 
--output=counts
 
 Review comment:
   ```suggestion
--runner=JetRunner --jetLocalMode=3 --output=counts
   ```
   It looks like most of the runners are running against the default inputFile 
(the Shakespeare corpus) which demonstrates the PAsserts succeeding in the 
example.
   
   Nemo is running against `pom.xml` which demonstrates failing the PAssert in 
the example.
   
   I suggest following the majority!
 



Issue Time Tracking
---

Worklog Id: (was: 307881)
Time Spent: 11h 10m  (was: 11h)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=307880=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307880
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 06/Sep/19 14:16
Start Date: 06/Sep/19 14:16
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9471: [BEAM-7305] 
Improve Jet Runner related documentation
URL: https://github.com/apache/beam/pull/9471#discussion_r321748337
 
 

 ##
 File path: website/src/get-started/wordcount-example.md
 ##
 @@ -390,6 +390,12 @@ $ mvn package -Pnemo-runner && java -cp 
target/word-count-beam-bundled-0.1.jar o
  --runner=NemoRunner --inputFile=`pwd`/pom.xml --output=counts
 ```
 
+{:.runner-jet}
+```
+$ mvn package -P jet-runner && java -cp target/word-count-beam-bundled-0.1.jar 
org.apache.beam.examples.WordCount \
+ --runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/pom.xml 
--output=counts
 
 Review comment:
   ```suggestion
--runner=JetRunner --jetLocalMode=3 --inputFile=`pwd`/pom.xml 
--output=counts
   ```
   
   Not a big deal -- `$pwd` doesn't work for my shell (unless I set it of 
course), and `` `pwd` `` is consistent with the others.
 



Issue Time Tracking
---

Worklog Id: (was: 307880)
Time Spent: 11h 10m  (was: 11h)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=307882=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307882
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 06/Sep/19 14:16
Start Date: 06/Sep/19 14:16
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on pull request #9471: [BEAM-7305] 
Improve Jet Runner related documentation
URL: https://github.com/apache/beam/pull/9471#discussion_r321749773
 
 

 ##
 File path: website/src/get-started/wordcount-example.md
 ##
 @@ -1088,6 +1120,12 @@ $ mvn package -Pnemo-runner && java -cp 
target/word-count-beam-bundled-0.1.jar o
  --runner=NemoRunner --inputFile=`pwd`/pom.xml --output=counts
 ```
 
+{:.runner-jet}
+```
+$ mvn package -P jet-runner && java -cp target/word-count-beam-bundled-0.1.jar 
org.apache.beam.examples.WindowedWordCount \
+ --runner=JetRunner --jetLocalMode=3 --inputFile=$pwd/pom.xml 
--output=counts
 
 Review comment:
   ```suggestion
--runner=JetRunner --jetLocalMode=3 --inputFile=`pwd`/pom.xml 
--output=counts
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 307882)
Time Spent: 11h 10m  (was: 11h)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8151) Allow the Python SDK to use many many threads

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8151?focusedWorklogId=307845=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307845
 ]

ASF GitHub Bot logged work on BEAM-8151:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:25
Start Date: 06/Sep/19 13:25
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9477: [BEAM-8151, 
BEAM-7848] Up the max number of threads inside the SDK harness to a default of 
10k
URL: https://github.com/apache/beam/pull/9477#issuecomment-528853628
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 307845)
Time Spent: 4h 50m  (was: 4h 40m)

> Allow the Python SDK to use many many threads
> -
>
> Key: BEAM-8151
> URL: https://issues.apache.org/jira/browse/BEAM-8151
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> We need to use a thread pool which shrinks the number of active threads when 
> they are not being used.
>  
> This is to prevent any stuckness issues related to a runner scheduling more 
> work items than there are "work" threads inside the SDK harness.
>  
> By default the control plane should have all "requests" being processed in 
> parallel and the runner is responsible for not overloading the SDK with too 
> much work.
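
The behaviour described above, a pool that grows on demand and releases threads once they sit idle, can be sketched in Java with the JDK's `ThreadPoolExecutor`. This is only an illustration under assumptions: the issue concerns the Python SDK harness, and `ShrinkingPoolSketch`, `newShrinkingPool`, and `poolSizeAfterIdle` are hypothetical names, not Beam code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not taken from the Beam SDK harness.
public class ShrinkingPoolSketch {

  /**
   * Core size 0 plus a SynchronousQueue means every task is either handed to
   * an idle thread or gets a new thread (up to maxThreads), and any thread
   * that stays idle longer than idleMillis terminates.
   */
  public static ThreadPoolExecutor newShrinkingPool(int maxThreads, long idleMillis) {
    return new ThreadPoolExecutor(
        0, maxThreads, idleMillis, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
  }

  /**
   * Runs a burst of trivial tasks, waits waitMillis, and reports how many
   * threads are still alive. The burst must not exceed maxThreads, because
   * this configuration rejects work once all threads are busy.
   */
  public static int poolSizeAfterIdle(
      int maxThreads, int tasks, long idleMillis, long waitMillis) {
    ThreadPoolExecutor pool = newShrinkingPool(maxThreads, idleMillis);
    CountDownLatch done = new CountDownLatch(tasks);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(done::countDown);
      }
      done.await();
      Thread.sleep(waitMillis);
      return pool.getPoolSize();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(
        "threads left after idling: " + poolSizeAfterIdle(10_000, 50, 100, 1_000));
  }
}
```

A high maximum with a short idle timeout lets the pool absorb whatever the runner schedules while costing nothing when no work is pending.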





[jira] [Work logged] (BEAM-8157) Flink state requests return wrong state in timers when encoded key is length-prefixed

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8157?focusedWorklogId=307854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307854
 ]

ASF GitHub Bot logged work on BEAM-8157:


Author: ASF GitHub Bot
Created on: 06/Sep/19 13:34
Start Date: 06/Sep/19 13:34
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9484: [BEAM-8157] Remove 
length prefix from state key for Flink's state backend
URL: https://github.com/apache/beam/pull/9484#discussion_r321737041
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
 ##
 @@ -343,11 +343,11 @@ public void clear(K key, W window) {
 }
 
 private void prepareStateBackend(K key) {
-  // Key for state request is shipped already encoded as ByteString,
-  // this is mostly a wrapping with ByteBuffer. We still follow the
-  // usual key encoding procedure.
-  // final ByteBuffer encodedKey = FlinkKeyUtils.encodeKey(key, 
keyCoder);
-  final ByteBuffer encodedKey = ByteBuffer.wrap(key.toByteArray());
+  // Key for state request is shipped already encoded as ByteString, 
but it is
 
 Review comment:
   Here is why this works for Flink <= 1.8 but definitely needs to be fixed 
moving forward:
   
   Keys for state requests are encoded incorrectly for the Flink Runner's state 
backend because they may contain a length prefix, which is not in line with 
how the key is encoded for data partitioning. This causes state to be written 
to the wrong key group. However, the encoding scheme is consistent, so reading 
from state in `ProcessElement` or `OnTimer` methods works correctly, including 
when interacting with timers. However, there are potential issues with 
checkpoints/savepoints due to this.
   
   In Flink 1.9, the key encoding mismatch is visible because reading state 
from a key not part of the partition _can_ return a `null` value. That is, not 
always, likely due to the state compaction logic. This explains why 
`PortableStateExecutionTest` works correctly but `PortableTimersExecutionTest` 
does not. That said, I think we should merge this fix ASAP for the upcoming 
release. 
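
The mismatch described above can be illustrated with a small plain-Java sketch. Everything in it is an assumption made for illustration: `KeyEncodingSketch` and its methods are hypothetical, the one-byte length prefix is a simplification of a real nested coder encoding, and `keyGroup` is a toy stand-in for Flink's actual key-group assignment.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the Flink runner's FlinkKeyUtils.
public class KeyEncodingSketch {

  /** Key bytes without any prefix, as assumed for data partitioning. */
  public static ByteBuffer encodePlain(String key) {
    return ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
  }

  /** Same key with a one-byte length prefix (keys up to 127 bytes only). */
  public static ByteBuffer encodeLengthPrefixed(String key) {
    byte[] raw = key.getBytes(StandardCharsets.UTF_8);
    if (raw.length > 127) {
      throw new IllegalArgumentException("sketch supports keys up to 127 bytes");
    }
    ByteBuffer buf = ByteBuffer.allocate(raw.length + 1);
    buf.put((byte) raw.length).put(raw);
    buf.flip();
    return buf;
  }

  /** Drops the one-byte prefix so the result matches encodePlain again. */
  public static ByteBuffer stripLengthPrefix(ByteBuffer prefixed) {
    ByteBuffer copy = prefixed.duplicate();
    int length = copy.get(); // reads the prefix and advances past it
    if (length != copy.remaining()) {
      throw new IllegalStateException("length prefix does not match payload");
    }
    return copy.slice();
  }

  /** Toy key-group assignment: hash of the encoded bytes modulo the group count. */
  public static int keyGroup(ByteBuffer encodedKey, int numKeyGroups) {
    return Math.floorMod(encodedKey.hashCode(), numKeyGroups);
  }

  public static void main(String[] args) {
    ByteBuffer plain = encodePlain("user-1");
    ByteBuffer prefixed = encodeLengthPrefixed("user-1");
    System.out.println("same bytes: " + plain.equals(prefixed));
    System.out.println("key group (plain):    " + keyGroup(plain, 128));
    System.out.println("key group (prefixed): " + keyGroup(prefixed, 128));
    System.out.println("same after strip: " + plain.equals(stripLengthPrefix(prefixed)));
  }
}
```

Since the two encodings produce different bytes for the same logical key, anything derived from those bytes can differ too, which is why the prefix has to be removed before the key reaches the state backend.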
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 307854)
Time Spent: 2h 10m  (was: 2h)

> Flink state requests return wrong state in timers when encoded key is 
> length-prefixed
> -
>
> Key: BEAM-8157
> URL: https://issues.apache.org/jira/browse/BEAM-8157
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.13.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Due to multiple changes made in BEAM-7126, the Flink internal key encoding is 
> broken when the key is encoded with a length prefix. The Flink runner 
> requires the internal key to be encoded without a length prefix. 





[jira] [Work logged] (BEAM-8162) NPE error when add flink 1.9 runner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8162?focusedWorklogId=308025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308025
 ]

ASF GitHub Bot logged work on BEAM-8162:


Author: ASF GitHub Bot
Created on: 06/Sep/19 17:19
Start Date: 06/Sep/19 17:19
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on issue #9464: [BEAM-8162] 
Encode keys as NESTED for flink keyselector
URL: https://github.com/apache/beam/pull/9464#issuecomment-528938882
 
 
   Hi @mxm @tweise, I think you are much more familiar with this code, so I am 
just sharing my thoughts and looking forward to your decision :)
 



Issue Time Tracking
---

Worklog Id: (was: 308025)
Time Spent: 50m  (was: 40m)

> NPE error when add flink 1.9 runner
> ---
>
> Key: BEAM-8162
> URL: https://issues.apache.org/jira/browse/BEAM-8162
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When adding the Flink 1.9 runner in https://github.com/apache/beam/pull/9296, we 
> found an NPE when running the `PortableTimersExecutionTest`. 
> The details can be found here: 
> https://github.com/apache/beam/pull/9296#issuecomment-525262607





[jira] [Work logged] (BEAM-8157) Flink state requests return wrong state in timers when encoded key is length-prefixed

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8157?focusedWorklogId=308077=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308077
 ]

ASF GitHub Bot logged work on BEAM-8157:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:20
Start Date: 06/Sep/19 18:20
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #9484: [BEAM-8157] Remove 
length prefix from state key for Flink's state backend
URL: https://github.com/apache/beam/pull/9484#issuecomment-528961315
 
 
   Unfortunately with this change the test pipeline fails:
   ```
   RuntimeError: java.lang.RuntimeException: Failed to remove nested context 
from key: XX
at 
org.apache.beam.runners.flink.translation.wrappers.streaming.FlinkKeyUtils.removeNestedContext(FlinkKeyUtils.java:85)
at 
org.apache.beam.runners.flink.translation.wrappers.streaming.ExecutableStageDoFnOperator$BagUserStateFactory$1.prepareStateBackend(ExecutableStageDoFnOperator.java:370)
   ```
 



Issue Time Tracking
---

Worklog Id: (was: 308077)
Time Spent: 2.5h  (was: 2h 20m)

> Flink state requests return wrong state in timers when encoded key is 
> length-prefixed
> -
>
> Key: BEAM-8157
> URL: https://issues.apache.org/jira/browse/BEAM-8157
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.13.0
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Due to multiple changes made in BEAM-7126, the Flink internal key encoding is 
> broken when the key is encoded with a length prefix. The Flink runner 
> requires the internal key to be encoded without a length prefix. 





[jira] [Work logged] (BEAM-7967) Execute portable Flink application jar

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7967?focusedWorklogId=308093=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308093
 ]

ASF GitHub Bot logged work on BEAM-7967:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:03
Start Date: 06/Sep/19 19:03
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9408:  [BEAM-7967] Execute 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9408#issuecomment-528976497
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 308093)
Time Spent: 3.5h  (was: 3h 20m)

> Execute portable Flink application jar
> --
>
> Key: BEAM-7967
> URL: https://issues.apache.org/jira/browse/BEAM-7967
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> [https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]





[jira] [Work logged] (BEAM-7967) Execute portable Flink application jar

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7967?focusedWorklogId=308092=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308092
 ]

ASF GitHub Bot logged work on BEAM-7967:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:03
Start Date: 06/Sep/19 19:03
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9408:  [BEAM-7967] Execute 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9408#issuecomment-528976408
 
 
   Run PortableJar_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 308092)
Time Spent: 3h 20m  (was: 3h 10m)

> Execute portable Flink application jar
> --
>
> Key: BEAM-7967
> URL: https://issues.apache.org/jira/browse/BEAM-7967
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> [https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]





[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=308121=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308121
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:51
Start Date: 06/Sep/19 19:51
Worklog Time Spent: 10m 
  Work Description: davidcavazos commented on pull request #9503: 
[BEAM-7389] Add code examples for ParDo page
URL: https://github.com/apache/beam/pull/9503
 
 
   Adding code samples for the ParDo page.
   
   R: @rosetn [website]
   R: @aaltay [code/approval]
   Can you take a look at this whenever you have a chance? Thanks!
   
   
   

[jira] [Created] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread Robert Burke (Jira)
Robert Burke created BEAM-8166:
--

 Summary: Support Graceful shutdown of worker harness.
 Key: BEAM-8166
 URL: https://issues.apache.org/jira/browse/BEAM-8166
 Project: Beam
  Issue Type: Improvement
  Components: runner-core, sdk-go
Reporter: Robert Burke


Ideally there should be a clear Shutdown control RPC a runner can send a worker 
harness to trigger an orderly shutdown.

Absent that, errors on the runner side shouldn't manifest as SDK worker harness 
errors. SDKs should log and shut down gracefully on gRPC errors.





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308126=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308126
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:10
Start Date: 06/Sep/19 20:10
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528996585
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 308126)
Time Spent: 6h 20m  (was: 6h 10m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8166?focusedWorklogId=308148=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308148
 ]

ASF GitHub Bot logged work on BEAM-8166:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:46
Start Date: 06/Sep/19 20:46
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #9504: [BEAM-8166] 
Allow in process workers to exit without killing the process.
URL: https://github.com/apache/beam/pull/9504
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 308148)
Time Spent: 50m  (was: 40m)

> Support Graceful shutdown of worker harness.
> 
>
> Key: BEAM-8166
> URL: https://issues.apache.org/jira/browse/BEAM-8166
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-go
>Reporter: Robert Burke
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Ideally there should be a clear Shutdown control RPC a runner can send a 
> worker harness to trigger an orderly shutdown.
> Absent that, errors on the runner side shouldn't manifest as SDK worker 
> harness errors. SDKs should log and shut down gracefully on gRPC errors.





[jira] [Closed] (BEAM-7225) [beam_PostCommit_Py_ValCont ] [test_metrics_fnapi_it] Unable to match metrics for matcher namespace

2019-09-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira closed BEAM-7225.
-
Fix Version/s: Not applicable
   Resolution: Cannot Reproduce

Looks like this bug is obsolete now. Failure hasn't been popping up.

> [beam_PostCommit_Py_ValCont ] [test_metrics_fnapi_it] Unable to match metrics 
> for matcher  namespace
> 
>
> Key: BEAM-7225
> URL: https://issues.apache.org/jira/browse/BEAM-7225
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Ahmet Altay
>Assignee: Alex Amato
>Priority: Major
> Fix For: Not applicable
>
>
> https://builds.apache.org/job/beam_PostCommit_Py_ValCont/3115/console
> 13:19:41 test_metrics_fnapi_it 
> (apache_beam.runners.dataflow.dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest)
>  ... FAIL
> 13:19:41 
> 13:19:41 
> ==
> 13:19:41 FAIL: test_metrics_fnapi_it 
> (apache_beam.runners.dataflow.dataflow_exercise_metrics_pipeline_test.ExerciseMetricsPipelineTest)
> 13:19:41 
> --
> 13:19:41 Traceback (most recent call last):
> 13:19:41   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/runners/dataflow/dataflow_exercise_metrics_pipeline_test.py",
>  line 70, in test_metrics_fnapi_it
> 13:19:41 self.assertFalse(errors, str(errors))
> 13:19:41 AssertionError: Unable to match metrics for matcher  namespace: 
> 'apache_beam.runners.dataflow.dataflow_exercise_metrics_pipeline.UserMetricsDoFn'
>  name: 'total_values' step: 'metrics' attempted: <100> committed: <100>
> 13:19:41 Actual MetricResults:





[jira] [Work logged] (BEAM-8105) Add container publishing instruction to release manual

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8105?focusedWorklogId=308218=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308218
 ]

ASF GitHub Bot logged work on BEAM-8105:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:23
Start Date: 06/Sep/19 23:23
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9510: [WIP][BEAM-8105] 
update release guide with docker images
URL: https://github.com/apache/beam/pull/9510#issuecomment-529044758
 
 
   Though it's WIP, I would like to give you a heads up about PR size. soyrice
 



Issue Time Tracking
---

Worklog Id: (was: 308218)
Time Spent: 20m  (was: 10m)

> Add container publishing instruction to release manual
> --
>
> Key: BEAM-8105
> URL: https://issues.apache.org/jira/browse/BEAM-8105
> Project: Beam
>  Issue Type: Sub-task
>  Components: website
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-7970) Regenerate Go SDK proto files in correct version

2019-09-06 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924656#comment-16924656
 ] 

Daniel Oliveira commented on BEAM-7970:
---

From my PR attempt:
{quote}Closing because it turns out attempting to regen protos with an updated 
version of the proto compiler also requires updating a bunch of the Go SDK's 
dependencies, which is a huge can of worms.

And it's mostly unnecessary anyway. The real way to avoid issues is to use an 
older version of the proto compiler, so the change that really needs to be made 
is updating the documentation to mention which version to use.
{quote}
So I'll be making a PR modifying the message instead and closing this for now.

> Regenerate Go SDK proto files in correct version
> 
>
> Key: BEAM-7970
> URL: https://issues.apache.org/jira/browse/BEAM-7970
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Generated proto files in the Go SDK currently include this bit:
> {{// This is a compile-time assertion to ensure that this generated file}}
> {{// is compatible with the proto package it is being compiled against.}}
> {{// A compilation error at this line likely means your copy of the}}
> {{// proto package needs to be updated.}}
> {{const _ = proto.ProtoPackageIsVersion2 // please upgrade the proto package}}
>  
> This indicates that the protos are being generated as proto v2 for whatever 
> reason. Most likely, as mentioned in this post by someone with a similar 
> issue, this is because the proto generation binary needs to be rebuilt before 
> generating the files again: 
> [https://github.com/golang/protobuf/issues/449#issuecomment-340884839]
> This hasn't caused any errors so far, but might eventually cause errors if we 
> hit version differences between the v2 and v3 protos.





[jira] [Created] (BEAM-8164) Correct document for building the python SDK harness container

2019-09-06 Thread sunjincheng (Jira)
sunjincheng created BEAM-8164:
-

 Summary: Correct document for building the python SDK harness 
container
 Key: BEAM-8164
 URL: https://issues.apache.org/jira/browse/BEAM-8164
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: sunjincheng
Assignee: sunjincheng


In the runner document, it is described that we can use the command `./gradlew 
:sdks:python:container:docker` 
to build the SDK harness container; see 
[https://beam.apache.org/documentation/runners/flink/].

However, the docker config was removed in the latest python3 docker-related 
commit [1], so the command now fails with the following error message.
{code:java}
 > Task :sdks:python:container:docker FAILED

 FAILURE: Build failed with an exception.

 * What went wrong:
 Execution failed for task ':sdks:python:container:docker'.
 > name is a required docker configuration item.{code}

Should we also update the document to use the command `./gradlew 
:sdks:python:container:py2:docker`?

[1] 
[https://github.com/apache/beam/commit/47feeafb21023e2a60ae51737cc4000a2033719c#diff-1bc5883bcfcc9e883ab7df09e4dcddb0L63]





[jira] [Work logged] (BEAM-8162) NPE error when add flink 1.9 runner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8162?focusedWorklogId=308096=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308096
 ]

ASF GitHub Bot logged work on BEAM-8162:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:05
Start Date: 06/Sep/19 19:05
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on issue #9464: [BEAM-8162] 
Encode keys as NESTED for flink keyselector
URL: https://github.com/apache/beam/pull/9464#issuecomment-528977118
 
 
   > > I can see you already do more effort for that PR.
   > 
   > I don't consider a response time of one day to be long. Please keep in 
mind that people may be occupied with other tasks.
   
   Sorry, maybe I didn't express myself clearly. I meant that we are already 
doing our best to fix it. :)
 



Issue Time Tracking
---

Worklog Id: (was: 308096)
Time Spent: 1h  (was: 50m)

> NPE error when add flink 1.9 runner
> ---
>
> Key: BEAM-8162
> URL: https://issues.apache.org/jira/browse/BEAM-8162
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When adding the Flink 1.9 runner in https://github.com/apache/beam/pull/9296, we 
> found an NPE when running the `PortableTimersExecutionTest`. 
> The details can be found here: 
> https://github.com/apache/beam/pull/9296#issuecomment-525262607





[jira] [Comment Edited] (BEAM-3645) Support multi-process execution on the FnApiRunner

2019-09-06 Thread Yifan Mai (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924585#comment-16924585
 ] 

Yifan Mai edited comment on BEAM-3645 at 9/6/19 8:54 PM:
-

While testing this I noticed that the multi-process runner does not handle 
SIGINT gracefully. I added the repro steps to BEAM-8149.


was (Author: myffi...@gmail.com):
While testing this I noticed that the multi-process runner does not handle 
SIGINT gracefully. To reproduce, run wordcount.py using the "Run with 
multiprocessing mode" instructions from the comment above (in Python 3).

Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
pipeline execution (similarly to default direct runner)
Actual: wordcount hangs forever after printing the following once per worker:

{code}
Exception in thread run_worker:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
  File 
"/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
 line 216, in run
'Worker subprocess exited with return code %s' % p.returncode)
RuntimeError: Worker subprocess exited with return code 1
{code}

> Support multi-process execution on the FnApiRunner
> --
>
> Key: BEAM-3645
> URL: https://issues.apache.org/jira/browse/BEAM-3645
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Charles Chen
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.15.0
>
>  Time Spent: 35h 20m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance 
> gain over the previous DirectRunner.  We can do even better in multi-core 
> environments by supporting multi-process execution in the FnApiRunner, to 
> scale past Python GIL limitations.





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308156=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308156
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:54
Start Date: 06/Sep/19 20:54
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #9494: [BEAM-7969] 
Fix doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#discussion_r321906748
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java
 ##
 @@ -400,13 +409,19 @@ void updateProgress() {
  * @param monitoringInfos Usually received from FnApi.
  */
 private void updateMetrics(List monitoringInfos) {
+  List monitoringInfosCopy = new 
ArrayList<>(monitoringInfos);
+
+  List misToFilter =
+  
bundleProcessOperation.findIOPCollectionMonitoringInfos(monitoringInfos);
 
 Review comment:
   Synced offline. I'll try to elaborate more on this.
   
   When we modify the graph for cross-boundary gRPC operations, we give two 
different PCollections the same metadata. As a result, we report metrics correctly 
from the SDK harness. Both of these PCollections generate metrics that show up as 
the ElementCount metric in the UI.
   
   At this point, the shortest way to fix the issue is to use deduping on the 
runner.
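
The deduping idea described in the review comment above can be illustrated with a small sketch. This is not the code from the PR (which is Java, in `BeamFnMapTaskExecutor`); it is a hypothetical Python illustration in which monitoring infos are modeled as plain dicts, and the function name, the `PCOLLECTION` label key, and the PCollection IDs are all assumptions made for the example.

```python
def drop_io_pcollection_metrics(monitoring_infos, io_pcollection_ids):
    """Return the monitoring infos that do not belong to an IO PCollection.

    monitoring_infos: list of dicts with an optional "labels" dict
      (a hypothetical stand-in for the real MonitoringInfo proto).
    io_pcollection_ids: set of PCollection IDs whose metrics would otherwise
      be counted twice (hypothetical IDs in this example).
    """
    return [
        mi for mi in monitoring_infos
        if mi.get("labels", {}).get("PCOLLECTION") not in io_pcollection_ids
    ]


# Two PCollections that share the same metadata both report an element count.
infos = [
    {"urn": "element_count", "labels": {"PCOLLECTION": "pc_grpc_read"}, "value": 100},
    {"urn": "element_count", "labels": {"PCOLLECTION": "pc_user"}, "value": 100},
]

# Filtering out the gRPC-side PCollection leaves a single count for the UI.
filtered = drop_io_pcollection_metrics(infos, {"pc_grpc_read"})
```

Filtering on the runner side keeps the SDK harness reporting unchanged, which matches the "shortest way to fix" reasoning in the comment; whether that is the right long-term fix is a separate question.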
 



Issue Time Tracking
---

Worklog Id: (was: 308156)
Time Spent: 6h 50m  (was: 6h 40m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Updated] (BEAM-8149) [FnApiRunner]multi-process runner does not terminate cleanly upon receiving SIGINT

2019-09-06 Thread Yifan Mai (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Mai updated BEAM-8149:

Description: 
The multi-process runner does not handle SIGINT gracefully. To reproduce, run 
wordcount.py using the "Run with multiprocessing mode" instructions from the 
first comment in BEAM-3645 (in Python 3).

Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
pipeline execution (similarly to default direct runner)
Actual: wordcount hangs forever after printing the following once per worker:

{code}
Exception in thread run_worker:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
  File 
"/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
 line 216, in run
'Worker subprocess exited with return code %s' % p.returncode)
RuntimeError: Worker subprocess exited with return code 1
{code}

  was:
The multi-process runner does not handle SIGINT gracefully. To reproduce, run 
wordcount.py using the "Run with multiprocessing mode" instructions from the 
comment above (in Python 3).

Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
pipeline execution (similarly to default direct runner)
Actual: wordcount hangs forever after printing the following once per worker:

{code}
Exception in thread run_worker:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
  File 
"/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
 line 216, in run
'Worker subprocess exited with return code %s' % p.returncode)
RuntimeError: Worker subprocess exited with return code 1
{code}


> [FnApiRunner]multi-process runner does not terminate cleanly upon receiving 
> SIGINT
> --
>
> Key: BEAM-8149
> URL: https://issues.apache.org/jira/browse/BEAM-8149
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> The multi-process runner does not handle SIGINT gracefully. To reproduce, run 
> wordcount.py using the "Run with multiprocessing mode" instructions from the 
> first comment in BEAM-3645 (in Python 3).
> Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
> pipeline execution (similarly to default direct runner)
> Actual: wordcount hangs forever after printing the following once per worker:
> {code}
> Exception in thread run_worker:
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.6/threading.py", line 864, in run
> self._target(*self._args, **self._kwargs)
>   File 
> "/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
>  line 216, in run
> 'Worker subprocess exited with return code %s' % p.returncode)
> RuntimeError: Worker subprocess exited with return code 1
> {code}
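
For readers unfamiliar with the expected behavior, the following is a minimal, generic sketch of a parent process that shuts down its worker subprocesses when Ctrl-C is pressed. It is not how `local_job_service.py` is structured and is not a proposed fix; the function name `run_workers` and the `grace_seconds` parameter are hypothetical and exist only for illustration.

```python
import subprocess
import sys


def run_workers(commands, grace_seconds=5.0):
    """Start one subprocess per command and wait for all of them.

    If the parent receives SIGINT (KeyboardInterrupt), ask every worker that
    is still running to terminate, wait up to grace_seconds for each one, and
    kill any that remain. Returns the list of worker return codes.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        for p in procs:
            p.wait()
    except KeyboardInterrupt:
        # Ask each live worker to stop instead of leaving it orphaned.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        # Give workers a bounded amount of time, then force-kill stragglers.
        for p in procs:
            try:
                p.wait(timeout=grace_seconds)
            except subprocess.TimeoutExpired:
                p.kill()
                p.wait()
    return [p.returncode for p in procs]


if __name__ == "__main__":
    # Two trivial workers that exit immediately.
    print(run_workers([[sys.executable, "-c", "pass"]] * 2))
```

The bounded wait followed by `kill()` is what prevents the parent from hanging forever if a worker ignores the termination request.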





[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308173=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308173
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:28
Start Date: 06/Sep/19 21:28
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9487: [WIP] 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#issuecomment-529020384
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 308173)
Time Spent: 2h 10m  (was: 2h)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=308185=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308185
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:50
Start Date: 06/Sep/19 21:50
Worklog Time Spent: 10m 
  Work Description: vectorijk commented on issue #9189: [BEAM-5820] vendor 
calcite
URL: https://github.com/apache/beam/pull/9189#issuecomment-516307123
 
 
   @kennknowles @iemejia pass unit test locally
 



Issue Time Tracking
---

Worklog Id: (was: 308185)
Time Spent: 6h  (was: 5h 50m)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8105) Add container publishing instruction to release manual

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8105?focusedWorklogId=308216=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308216
 ]

ASF GitHub Bot logged work on BEAM-8105:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:17
Start Date: 06/Sep/19 23:17
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #9510: 
[WIP][BEAM-8105] update release guide with docker images
URL: https://github.com/apache/beam/pull/9510
 
 
   This PR adds the release procedure for docker images.
   
   Publishing docker images as part of the release is tackled with the 
following three PRs:
   
   Change default image name. (#9487)
   Add staging and publishing scripts for docker images to release procedure. 
(#9506)
   Update release-guide.md. (current PR)
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Created] (BEAM-8165) Change image name

2019-09-06 Thread Hannah Jiang (Jira)
Hannah Jiang created BEAM-8165:
--

 Summary: Change image name
 Key: BEAM-8165
 URL: https://issues.apache.org/jira/browse/BEAM-8165
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-go, sdk-java-harness, sdk-py-harness
Reporter: Hannah Jiang
Assignee: Hannah Jiang


Change name to apachebeam/{lang}{ver}_sdk





[jira] [Updated] (BEAM-8165) Change image name

2019-09-06 Thread Hannah Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Jiang updated BEAM-8165:
---
Status: Open  (was: Triage Needed)

> Change image name
> -
>
> Key: BEAM-8165
> URL: https://issues.apache.org/jira/browse/BEAM-8165
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> Change name to apachebeam/{lang}{ver}_sdk





[jira] [Updated] (BEAM-8149) [FnApiRunner]multi-process runner does not terminate cleanly upon receiving SIGINT

2019-09-06 Thread Yifan Mai (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Mai updated BEAM-8149:

Description: 
The multi-process runner does not handle SIGINT gracefully. To reproduce, run 
wordcount.py using the "Run with multiprocessing mode" instructions from the 
comment above (in Python 3).

Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
pipeline execution (similarly to default direct runner)
Actual: wordcount hangs forever after printing the following once per worker:

{code}
Exception in thread run_worker:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
  File 
"/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
 line 216, in run
'Worker subprocess exited with return code %s' % p.returncode)
RuntimeError: Worker subprocess exited with return code 1
{code}

> [FnApiRunner]multi-process runner does not terminate cleanly upon receiving 
> SIGINT
> --
>
> Key: BEAM-8149
> URL: https://issues.apache.org/jira/browse/BEAM-8149
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.15.0
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> The multi-process runner does not handle SIGINT gracefully. To reproduce, run 
> wordcount.py using the "Run with multiprocessing mode" instructions from the 
> comment above (in Python 3).
> Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
> pipeline execution (similarly to default direct runner)
> Actual: wordcount hangs forever after printing the following once per worker:
> {code}
> Exception in thread run_worker:
> Traceback (most recent call last):
>   File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
> self.run()
>   File "/usr/lib/python3.6/threading.py", line 864, in run
> self._target(*self._args, **self._kwargs)
>   File 
> "/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
>  line 216, in run
> 'Worker subprocess exited with return code %s' % p.returncode)
> RuntimeError: Worker subprocess exited with return code 1
> {code}





[jira] [Commented] (BEAM-3645) Support multi-process execution on the FnApiRunner

2019-09-06 Thread Yifan Mai (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924585#comment-16924585
 ] 

Yifan Mai commented on BEAM-3645:
-

While testing this I noticed that the multi-process runner does not handle 
SIGINT gracefully. To reproduce, run wordcount.py using the "Run with 
multiprocessing mode" instructions from the comment above (in Python 3).

Expected: wordcount terminates gracefully when Ctrl-C is pressed during 
pipeline execution (similarly to default direct runner)
Actual: wordcount hangs forever after printing the following once per worker:

{code}
Exception in thread run_worker:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
  File 
"/usr/local/google/home/yifanmai/venv/wordcount/lib/python3.6/site-packages/apache_beam/runners/portability/local_job_service.py",
 line 216, in run
'Worker subprocess exited with return code %s' % p.returncode)
RuntimeError: Worker subprocess exited with return code 1
{code}

> Support multi-process execution on the FnApiRunner
> --
>
> Key: BEAM-3645
> URL: https://issues.apache.org/jira/browse/BEAM-3645
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Charles Chen
>Assignee: Hannah Jiang
>Priority: Major
> Fix For: 2.15.0
>
>  Time Spent: 35h 20m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/BEAM-3644 gave us a 15x performance 
> gain over the previous DirectRunner.  We can do even better in multi-core 
> environments by supporting multi-process execution in the FnApiRunner, to 
> scale past Python GIL limitations.





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308192=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308192
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:10
Start Date: 06/Sep/19 22:10
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #9494: [BEAM-7969] 
Fix doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 308192)
Time Spent: 7h  (was: 6h 50m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable IO connector for Python

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=308032=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308032
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 06/Sep/19 17:24
Start Date: 06/Sep/19 17:24
Worklog Time Spent: 10m 
  Work Description: sduskis commented on pull request #8457: [BEAM-3342] 
Create a Cloud Bigtable IO connector for Python
URL: https://github.com/apache/beam/pull/8457#discussion_r321834329
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtableio.py
 ##
 @@ -122,22 +126,145 @@ class WriteToBigTable(beam.PTransform):
   A PTransform that write a list of `DirectRow` into the Bigtable Table
 
   """
-  def __init__(self, project_id=None, instance_id=None,
-   table_id=None):
+  def __init__(self, project_id=None, instance_id=None, table_id=None):
 """ The PTransform to access the Bigtable Write connector
 Args:
   project_id(str): GCP Project of to write the Rows
   instance_id(str): GCP Instance to write the Rows
   table_id(str): GCP Table to write the `DirectRows`
 """
 super(WriteToBigTable, self).__init__()
-self.beam_options = {'project_id': project_id,
- 'instance_id': instance_id,
- 'table_id': table_id}
+self._beam_options = {'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id}
 
   def expand(self, pvalue):
-beam_options = self.beam_options
+beam_options = self._beam_options
 return (pvalue
 | beam.ParDo(_BigTableWriteFn(beam_options['project_id'],
   beam_options['instance_id'],
   beam_options['table_id'])))
+
+
+class _BigtableReadFn(beam.DoFn):
+  """ Creates the connector that can read rows for Beam pipeline
+
+  Args:
+project_id(str): GCP Project ID
+instance_id(str): GCP Instance ID
+table_id(str): GCP Table ID
+
+  """
+
+  def __init__(self, project_id, instance_id, table_id, filter_=b''):
+""" Constructor of the Read connector of Bigtable
+
+Args:
+  project_id: [str] GCP Project of to write the Rows
+  instance_id: [str] GCP Instance to write the Rows
+  table_id: [str] GCP Table to write the `DirectRows`
+  filter_: [RowFilter] Filter to apply to columns in a row.
+"""
+super(self.__class__, self).__init__()
+self._initialize({'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id,
+  'filter_': filter_})
+
+  def __getstate__(self):
+return self._beam_options
+
+  def __setstate__(self, options):
+self._initialize(options)
+
+  def _initialize(self, options):
+self._beam_options = options
+self.table = None
+self.sample_row_keys = None
+self.row_count = Metrics.counter(self.__class__.__name__, 'Rows read')
+
+  def start_bundle(self):
+if self.table is None:
+  self.table = Client(project=self._beam_options['project_id'])\
+.instance(self._beam_options['instance_id'])\
+.table(self._beam_options['table_id'])
+
+  def process(self, element, **kwargs):
+for row in self.table.read_rows(start_key=element.start_position,
+end_key=element.end_position,
+filter_=self._beam_options['filter_']):
+  self.row_count.inc()
+  yield row
+
+  def display_data(self):
+return {'projectId': DisplayDataItem(self._beam_options['project_id'],
+ label='Bigtable Project Id'),
+'instanceId': DisplayDataItem(self._beam_options['instance_id'],
+  label='Bigtable Instance Id'),
+'tableId': DisplayDataItem(self._beam_options['table_id'],
+   label='Bigtable Table Id'),
+'filter_': DisplayDataItem(str(self._beam_options['filter_']),
+   label='Bigtable Filter')
+   }
+
+
+class ReadFromBigTable(beam.PTransform):
+  def __init__(self, project_id, instance_id, table_id, filter_=b''):
+""" The PTransform to access the Bigtable Read connector
+
+Args:
+  project_id: [str] GCP Project to read the Rows from
+  instance_id: [str] GCP Instance to read the Rows from
+  table_id: [str] GCP Table to read the Rows from
+  filter_: [RowFilter] Filter to apply to columns in a row.
+"""
+super(ReadFromBigTable, self).__init__()
+self._beam_options = {'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id,
+  'filter_': filter_}
+
+  def 

[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308109
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:20
Start Date: 06/Sep/19 19:20
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9487: [WIP] 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#issuecomment-528982038
 
 
   Run Python Dataflow ValidatesContainer
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308109)
Time Spent: 2h  (was: 1h 50m)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8166?focusedWorklogId=308131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308131
 ]

ASF GitHub Bot logged work on BEAM-8166:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:16
Start Date: 06/Sep/19 20:16
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #9504: [BEAM-8166] Allow in 
process workers to exit without killing the process.
URL: https://github.com/apache/beam/pull/9504#issuecomment-528999334
 
 
   Run Go PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 308131)
Time Spent: 0.5h  (was: 20m)

> Support Graceful shutdown of worker harness.
> 
>
> Key: BEAM-8166
> URL: https://issues.apache.org/jira/browse/BEAM-8166
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-go
>Reporter: Robert Burke
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Ideally there should be a clear Shutdown control RPC a runner can send a 
> worker harness to trigger an orderly shutdown.
> Absent that, errors on the runner side shouldn't manifest as SDK worker 
> harness errors. SDKs should log, and gracefully shutdown from GRPC errors.
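
The "log, and gracefully shutdown" behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Go SDK code from the pull request; `ControlChannelClosed` and `run_worker` are hypothetical names standing in for a gRPC stream failure and the worker loop.

```python
import logging


class ControlChannelClosed(Exception):
    """Hypothetical error standing in for a failed gRPC control stream."""


def run_worker(instructions, handle):
    """Process instructions until the stream ends or the channel fails.

    On a channel error the worker logs and returns to its caller instead of
    terminating the whole process, so an in-process worker can exit cleanly.
    Returns the number of instructions that were handled.
    """
    processed = 0
    try:
        for instruction in instructions:
            handle(instruction)
            processed += 1
    except ControlChannelClosed as err:
        # Runner-side failure: report it, then shut down in an orderly way.
        logging.warning("control channel closed, shutting down worker: %s", err)
    return processed
```

The design choice illustrated here is that the worker function returns rather than calling a process-level exit, leaving the decision about process lifetime to the caller.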





[jira] [Work started] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8167 started by Daniel Oliveira.
-
> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.





[jira] [Updated] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira updated BEAM-8167:
--
Status: Open  (was: Triage Needed)

> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.





[jira] [Work logged] (BEAM-8156) Finish migration to standard Python typing

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8156?focusedWorklogId=308203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308203
 ]

ASF GitHub Bot logged work on BEAM-8156:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:46
Start Date: 06/Sep/19 22:46
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9509: [BEAM-8156] Add 
convert_to_typing_type
URL: https://github.com/apache/beam/pull/9509
 
 
   First step in being able to compartmentalize Beam typing types to
   certain functions, and not expose them externally.
   
   
   

[jira] [Closed] (BEAM-7619) [beam_PostCommit_Py_ValCont] [test_metrics_fnapi_it] KeyError

2019-09-06 Thread Daniel Oliveira (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira closed BEAM-7619.
-
Fix Version/s: Not applicable
   Resolution: Cannot Reproduce

Looks like this specific bug was fixed once Dataflow service releases caught 
up. But the root cause (the blocking bug) is still unresolved, so the issue 
could pop back up in the future.

> [beam_PostCommit_Py_ValCont] [test_metrics_fnapi_it] KeyError
> -
>
> Key: BEAM-7619
> URL: https://issues.apache.org/jira/browse/BEAM-7619
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Valentyn Tymofieiev
>Priority: Major
>  Labels: currently-failing
> Fix For: Not applicable
>
>
> _Use this form to file an issue for test failure:_
>  * [Jenkins 
> Job|[https://builds.apache.org/job/beam_PostCommit_Py_ValCont/3625/consoleFull]]
>  
> Initial investigation:
> 12:08:13 ERROR: test_wordcount_fnapi_it 
> (apache_beam.examples.wordcount_it_test.WordCountIT)
> 12:08:13 
> --
> 12:08:13 Traceback (most recent call last):
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount_it_test.py",
>  line 52, in test_wordcount_fnapi_it
> 12:08:13 self._run_wordcount_it(wordcount.run, experiment='beam_fn_api')
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount_it_test.py",
>  line 84, in _run_wordcount_it
> 12:08:13 run_wordcount(test_pipeline.get_full_options_as_args(**extra_opts))
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/examples/wordcount.py",
>  line 114, in run
> 12:08:13 result = p.run()
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/pipeline.py",
>  line 406, in run
> 12:08:13 self._options).run(False)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/pipeline.py",
>  line 419, in run
> 12:08:13 return self.runner.run_pipeline(self, self._options)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py",
>  line 64, in run_pipeline
> 12:08:13 self.result.wait_until_finish(duration=wait_duration)
> 12:08:13 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Py_ValCont/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py",
>  line 1338, in wait_until_finish
> 12:08:13 (self.state, getattr(self._runner, 'last_error_msg', None)), self)
> 12:08:13 DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, 
> Error:
> 12:08:13 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> Error received from SDK harness for instruction -129: Traceback (most recent 
> call last):
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 157, in _execute
> 12:08:13 response = task()
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 190, in 
> 12:08:13 self._execute(lambda: worker.do_instruction(work), work)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 342, in do_instruction
> 12:08:13 request.instruction_id)
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 368, in process_bundle
> 12:08:13 bundle_processor.process_bundle(instruction_id))
> 12:08:13 File 
> "/usr/local/lib/python2.7/site-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 593, in process_bundle
> 12:08:13 data.ptransform_id].process_encoded(data.data)
> 12:08:13 KeyError: u'\n\x04-107\x12\x04-105'
> 12:08:13 
> 12:08:13 at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> 12:08:13 at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> 12:08:13 at org.apache.beam.sdk.util.MoreFutures.get(MoreFutures.java:57)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.fn.control.RegisterAndProcessBundleOperation.finish(RegisterAndProcessBundleOperation.java:285)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
> 12:08:13 at 
> org.apache.beam.runners.dataflow.worker.fn.control.BeamFnMapTaskExecutor.execute(BeamFnMapTaskExecutor.java:125)
> 12:08:13 at 
> 

[jira] [Work logged] (BEAM-7600) Spark portable runner: reuse SDK harness

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7600?focusedWorklogId=308202&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308202
 ]

ASF GitHub Bot logged work on BEAM-7600:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:45
Start Date: 06/Sep/19 22:45
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9095: [BEAM-7600] borrow SDK 
harness management code into Spark runner
URL: https://github.com/apache/beam/pull/9095#issuecomment-529038195
 
 
   @angoenka I have refactored this quite a bit. Unfortunately I only managed 
to eliminate a small amount of the complexity as most of it seems necessary. 
PTAL
 



Issue Time Tracking
---

Worklog Id: (was: 308202)
Time Spent: 7h  (was: 6h 50m)

> Spark portable runner: reuse SDK harness
> 
>
> Key: BEAM-7600
> URL: https://issues.apache.org/jira/browse/BEAM-7600
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Right now, we're creating a new SDK harness every time an executable stage is 
> run [1], which is expensive. We should be able to re-use code from the Flink 
> runner to re-use the SDK harness [2].
>  
> [1] 
> [https://github.com/apache/beam/blob/c9fb261bc7666788402840bb6ce1b0ce2fd445d1/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkExecutableStageFunction.java#L135]
> [2] 
> [https://github.com/apache/beam/blob/c9fb261bc7666788402840bb6ce1b0ce2fd445d1/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/functions/FlinkDefaultExecutableStageContext.java]
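
The idea of reusing one SDK harness across executable stages, rather than creating a new one each time, can be sketched as a reference-counted cache. This is a hedged Python illustration only; it is not the Flink or Spark runner code, and `HarnessCache`, `acquire`, `release`, and the `factory` callback are hypothetical names.

```python
import threading


class HarnessCache:
    """Hypothetical reference-counted cache of harnesses keyed by job id."""

    def __init__(self, factory):
        self._factory = factory      # callable that creates a harness for a job id
        self._lock = threading.Lock()
        self._entries = {}           # job_id -> [harness, reference count]

    def acquire(self, job_id):
        """Return the shared harness for job_id, creating it on first use."""
        with self._lock:
            entry = self._entries.get(job_id)
            if entry is None:
                entry = [self._factory(job_id), 0]
                self._entries[job_id] = entry
            entry[1] += 1
            return entry[0]

    def release(self, job_id):
        """Drop one reference; return True if the harness was evicted."""
        with self._lock:
            entry = self._entries[job_id]
            entry[1] -= 1
            if entry[1] == 0:
                del self._entries[job_id]
                return True
            return False
```

Each stage would call `acquire` before running and `release` afterwards, so the expensive creation step happens once per job while any stage is still using the harness.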





[jira] [Work logged] (BEAM-8156) Finish migration to standard Python typing

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8156?focusedWorklogId=308208&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308208
 ]

ASF GitHub Bot logged work on BEAM-8156:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:04
Start Date: 06/Sep/19 23:04
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9509: [BEAM-8156] Add 
convert_to_typing_type
URL: https://github.com/apache/beam/pull/9509#issuecomment-529041440
 
 
   R: @robertwb @tvalentyn 
 



Issue Time Tracking
---

Worklog Id: (was: 308208)
Remaining Estimate: 503h 40m  (was: 503h 50m)
Time Spent: 20m  (was: 10m)

> Finish migration to standard Python typing
> --
>
> Key: BEAM-8156
> URL: https://issues.apache.org/jira/browse/BEAM-8156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Major
>   Original Estimate: 504h
>  Time Spent: 20m
>  Remaining Estimate: 503h 40m
>
> We should migrate all Python uses of types to the standard typing module, and 
> make the typehints.* ones aliases of the Python ones. 
>  
> There are three places where we use custom typehints behavior: 
> (1) is_compatible_with
> (2) bind_type_variables/match_type_variables
> (3) trivial type inference. 
>  
> I would propose that each of these be adapted to a (internal) public 
> interface that accepts and returns standard typing types, and internally 
> converts to our (nowhere else exposed) typehints types, performs the logic, 
> and converts back. Each of these in turn can then be updated, as needed and 
> orthogonally, to operate on the typing types natively (possibly via deference 
> to a third-party library). 
>  
> I think coder inference could be easily adapted to use typing types directly, 
> but it may be a fourth place where we do internal conversion first. Another 
> gotcha is special care may need to be taken if we ever need to pickle these 
> types (which IIRC may have issues). 





[jira] [Work logged] (BEAM-8156) Finish migration to standard Python typing

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8156?focusedWorklogId=308214&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308214
 ]

ASF GitHub Bot logged work on BEAM-8156:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:16
Start Date: 06/Sep/19 23:16
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9509: [BEAM-8156] Add 
convert_to_typing_type
URL: https://github.com/apache/beam/pull/9509#discussion_r321938298
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -198,6 +202,10 @@ def convert_to_beam_type(typ):
   arity=-1,
   beam_type=typehints.Tuple),
   _TypeMapEntry(match=_match_is_union, arity=-1, 
beam_type=typehints.Union),
+  _TypeMapEntry(
+  match=_match_issubclass(typing.Iterator),
 
 Review comment:
   Line 170 effectively limits the scope to typing module types. Is that good 
enough?
 



Issue Time Tracking
---

Worklog Id: (was: 308214)
Remaining Estimate: 503h 20m  (was: 503.5h)
Time Spent: 40m  (was: 0.5h)

> Finish migration to standard Python typing
> --
>
> Key: BEAM-8156
> URL: https://issues.apache.org/jira/browse/BEAM-8156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Major
>   Original Estimate: 504h
>  Time Spent: 40m
>  Remaining Estimate: 503h 20m
>
> We should migrate all Python uses of types to the standard typing module, and 
> make the typehints.* ones aliases of the Python ones. 
>  
> There are three places where we use custom typehints behavior: 
> (1) is_compatible_with
> (2) bind_type_variables/match_type_variables
> (3) trivial type inference. 
>  
> I would propose that each of these be adapted to a (internal) public 
> interface that accepts and returns standard typing types, and internally 
> converts to our (nowhere else exposed) typehints types, performs the logic, 
> and converts back. Each of these in turn can then be updated, as needed and 
> orthogonally, to operate on the typing types natively (possibly via deference 
> to a third-party library). 
>  
> I think coder inference could be easily adapted to use typing types directly, 
> but it may be a fourth place where we do internal conversion first. Another 
> gotcha is special care may need to be taken if we ever need to pickle these 
> types (which IIRC may have issues). 





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308130&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308130
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:14
Start Date: 06/Sep/19 20:14
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #9494: [BEAM-7969] 
Fix doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#discussion_r321893709
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java
 ##
 @@ -400,13 +409,19 @@ void updateProgress() {
  * @param monitoringInfos Usually received from FnApi.
  */
 private void updateMetrics(List monitoringInfos) {
+  List monitoringInfosCopy = new 
ArrayList<>(monitoringInfos);
+
+  List misToFilter =
+  
bundleProcessOperation.findIOPCollectionMonitoringInfos(monitoringInfos);
 
 Review comment:
   findIOPCollectionMonitoringInfos
   
   Could we use a better name here? Not sure what this is referring to. I 
assume IO refers to sources/sinks, but I don't think this is the case.
 



Issue Time Tracking
---

Worklog Id: (was: 308130)
Time Spent: 6.5h  (was: 6h 20m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8166?focusedWorklogId=308129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308129
 ]

ASF GitHub Bot logged work on BEAM-8166:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:14
Start Date: 06/Sep/19 20:14
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #9504: [BEAM-8166] Allow in 
process workers to exit without killing the process.
URL: https://github.com/apache/beam/pull/9504#issuecomment-528998600
 
 
   R: @youngoli 
 



Issue Time Tracking
---

Worklog Id: (was: 308129)
Time Spent: 20m  (was: 10m)

> Support Graceful shutdown of worker harness.
> 
>
> Key: BEAM-8166
> URL: https://issues.apache.org/jira/browse/BEAM-8166
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-go
>Reporter: Robert Burke
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Ideally there should be a clear Shutdown control RPC a runner can send a 
> worker harness to trigger an orderly shutdown.
> Absent that, errors on the runner side shouldn't manifest as SDK worker 
> harness errors. SDKs should log, and gracefully shutdown from GRPC errors.





[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308187
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:54
Start Date: 06/Sep/19 21:54
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9487: [WIP] 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#issuecomment-529027241
 
 
   R: @aaltay 
   Can you please review or find someone else to review it? Thanks.
 



Issue Time Tracking
---

Worklog Id: (was: 308187)
Time Spent: 2.5h  (was: 2h 20m)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?focusedWorklogId=308186&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308186
 ]

ASF GitHub Bot logged work on BEAM-8167:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:52
Start Date: 06/Sep/19 21:52
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9505: [BEAM-8167] Replace 
Postcommit_Python_Verify in Grafana dashboards.
URL: https://github.com/apache/beam/pull/9505#issuecomment-529026645
 
 
   Some thoughts on the change:
   
   * Do you have a sample of what the graph will look like?
   * Should we keep python verify on the graph in case we want to see 
historical data? My understanding is that we discontinued python_verify jobs 
and created pythonXX instead. This will serve us well in the future, but we 
lose historical data on the default graph with the current change.
 



Issue Time Tracking
---

Worklog Id: (was: 308186)
Time Spent: 0.5h  (was: 20m)

> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.





[jira] [Work started] (BEAM-8165) Change image name and add images to release process

2019-09-06 Thread Hannah Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8165 started by Hannah Jiang.
--
> Change image name and add images to release process
> ---
>
> Key: BEAM-8165
> URL: https://issues.apache.org/jira/browse/BEAM-8165
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> change name to apachebeam/{lang}{ver}_sdk





[jira] [Updated] (BEAM-8165) Change image name and add images to release process

2019-09-06 Thread Hannah Jiang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hannah Jiang updated BEAM-8165:
---
Summary: Change image name and add images to release process  (was: Change 
image name)

> Change image name and add images to release process
> ---
>
> Key: BEAM-8165
> URL: https://issues.apache.org/jira/browse/BEAM-8165
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go, sdk-java-harness, sdk-py-harness
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>
> change name to apachebeam/{lang}{ver}_sdk





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308125&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308125
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:07
Start Date: 06/Sep/19 20:07
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528996585
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 308125)
Time Spent: 6h 10m  (was: 6h)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308134&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308134
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:22
Start Date: 06/Sep/19 20:22
Worklog Time Spent: 10m 
  Work Description: ajamato commented on pull request #9494: [BEAM-7969] 
Fix doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#discussion_r321895776
 
 

 ##
 File path: 
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/fn/control/BeamFnMapTaskExecutor.java
 ##
 @@ -400,13 +409,19 @@ void updateProgress() {
  * @param monitoringInfos Usually received from FnApi.
  */
 private void updateMetrics(List monitoringInfos) {
+  List monitoringInfosCopy = new 
ArrayList<>(monitoringInfos);
+
+  List misToFilter =
+  
bundleProcessOperation.findIOPCollectionMonitoringInfos(monitoringInfos);
 
 Review comment:
   Is there a reason why we need to collect counters for these at all? Our 
UI doesn't display the grpc steps; they are an implementation detail.
   
   The alternative design I was thinking of here was to try to transform each 
monitoring info and drop the ones that do not have a step in the original 
graph. Though maybe there is no way to detect this? So they could be dropped 
in the transformer, as there is no need to send them to DFE.
   
   If I understand correctly, you are just avoiding a double count now? Though 
I think we don't need to report these at all.
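
The alternative design described in this review comment might look roughly like the sketch below. It is an illustration only: monitoring infos are modelled as plain dicts, and the `PTRANSFORM` label key, the function name, and the set of original step names are all assumptions rather than the worker's actual API.

```python
def drop_infos_without_original_step(monitoring_infos, original_step_names):
    """Keep only monitoring infos whose step exists in the original graph.

    Hypothetical sketch: each info is a dict with an optional "labels" dict,
    and the step name is assumed to live under the "PTRANSFORM" label.
    """
    kept = []
    for info in monitoring_infos:
        step = info.get("labels", {}).get("PTRANSFORM")
        # Infos for runner-inserted steps (e.g. grpc read/write) or with no
        # step label at all are dropped instead of being reported.
        if step in original_step_names:
            kept.append(info)
    return kept
```

With this shape, anything attached to a step that the user never wrote is filtered out before reporting, so there is nothing left to double count.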
 



Issue Time Tracking
---

Worklog Id: (was: 308134)
Time Spent: 6h 40m  (was: 6.5h)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8166?focusedWorklogId=308145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308145
 ]

ASF GitHub Bot logged work on BEAM-8166:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:45
Start Date: 06/Sep/19 20:45
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #9504: [BEAM-8166] Allow in 
process workers to exit without killing the process.
URL: https://github.com/apache/beam/pull/9504#issuecomment-529008251
 
 
   Excellent question, and you're correct: the new mode isn't currently 
available for any public runners. Flink, Spark, and Dataflow all rely 
on containers, which don't run into this.
   
   However, native Go runners might choose to make different calls, and runners 
where the runner half of a worker is linked into the same binary do run into this, 
such as the internal Google runner I work on. It has an in-process mode 
that can take advantage of this as well.
 



Issue Time Tracking
---

Worklog Id: (was: 308145)
Time Spent: 40m  (was: 0.5h)

> Support Graceful shutdown of worker harness.
> 
>
> Key: BEAM-8166
> URL: https://issues.apache.org/jira/browse/BEAM-8166
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, sdk-go
>Reporter: Robert Burke
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ideally there should be a clear Shutdown control RPC a runner can send a 
> worker harness to trigger an orderly shutdown.
> Absent that, errors on the runner side shouldn't manifest as SDK worker 
> harness errors. SDKs should log, and gracefully shutdown from GRPC errors.
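
The behaviour asked for in the description above can be sketched as follows. This is a minimal illustration, not the Go SDK's implementation: `receive` and `handle` are hypothetical callables standing in for the gRPC control stream and the instruction handler, and the `"shutdown"` sentinel stands in for the proposed Shutdown control RPC, which the issue describes only as an ideal.

```python
import logging

logger = logging.getLogger("worker_harness_sketch")


def run_worker(receive, handle):
    """Process instructions until the control channel ends.

    Returns the number of instructions handled, rather than exiting the
    process, so an in-process runner can keep running other workers.
    """
    handled = 0
    while True:
        try:
            instruction = receive()
        except ConnectionError as err:
            # Runner-side failure: log it and stop this worker only.
            logger.warning("control channel error, shutting down worker: %s", err)
            return handled
        if instruction == "shutdown":
            # Orderly shutdown requested by the runner.
            logger.info("shutdown requested, stopping worker")
            return handled
        handle(instruction)
        handled += 1
```

The design point is that the loop returns on both paths instead of terminating the process, which leaves the decision about what to do next to the caller.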





[jira] [Updated] (BEAM-7859) Portable Wordcount on Spark runner does not work in DOCKER execution mode.

2019-09-06 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-7859:
--
Description: 
The error was observed during Beam 2.14.0 release validation, see: 
https://issues.apache.org/jira/browse/BEAM-7224?focusedCommentId=16896831&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16896831

-Looks like master currently fails with a different error, both in Loopback and 
Docker modes.-

[~ibzib] [~altay] [~robertwb] [~angoenka]

  was:
The error was observed during Beam 2.14.0 release validation, see: 
https://issues.apache.org/jira/browse/BEAM-7224?focusedCommentId=16896831&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16896831

Looks like master currently fails with a different error, both in Loopback and 
Docker modes.

[~ibzib] [~altay] [~robertwb] [~angoenka]


> Portable Wordcount on Spark runner does not work in DOCKER execution mode.
> --
>
> Key: BEAM-7859
> URL: https://issues.apache.org/jira/browse/BEAM-7859
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark, sdk-py-harness
>Reporter: Valentyn Tymofieiev
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-spark
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The error was observed during Beam 2.14.0 release validation, see: 
> https://issues.apache.org/jira/browse/BEAM-7224?focusedCommentId=16896831&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16896831
> -Looks like master currently fails with a different error, both in Loopback 
> and Docker modes.-
> [~ibzib] [~altay] [~robertwb] [~angoenka]





[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308177
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:36
Start Date: 06/Sep/19 21:36
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #9506: 
[WIP][BEAM-6185] Release scripts for docker images
URL: https://github.com/apache/beam/pull/9506
 
 
   This PR adds scripts for staging and publishing docker images to docker hub.
   
   
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 

[jira] [Work logged] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?focusedWorklogId=308198&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308198
 ]

ASF GitHub Bot logged work on BEAM-8167:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:18
Start Date: 06/Sep/19 22:18
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #9505: [BEAM-8167] Replace 
Postcommit_Python_Verify in Grafana dashboards.
URL: https://github.com/apache/beam/pull/9505#issuecomment-529032650
 
 
   I'll modify the commit to put Python_Verify back in. Doesn't seem like a big 
deal to leave it for historical data for now. As for a sample:
   
   
![image](https://user-images.githubusercontent.com/1740865/64463554-4a902a80-d0b9-11e9-9138-db16900d3522.png)
   
   Since this is running locally, I don't have a full history like it would 
have on the website.
 



Issue Time Tracking
---

Worklog Id: (was: 308198)
Time Spent: 40m  (was: 0.5h)

> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.





[jira] [Work logged] (BEAM-3342) Create a Cloud Bigtable IO connector for Python

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3342?focusedWorklogId=308029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308029
 ]

ASF GitHub Bot logged work on BEAM-3342:


Author: ASF GitHub Bot
Created on: 06/Sep/19 17:22
Start Date: 06/Sep/19 17:22
Worklog Time Spent: 10m 
  Work Description: sduskis commented on pull request #8457: [BEAM-3342] 
Create a Cloud Bigtable IO connector for Python
URL: https://github.com/apache/beam/pull/8457#discussion_r321833569
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/bigtableio.py
 ##
 @@ -122,22 +126,145 @@ class WriteToBigTable(beam.PTransform):
   A PTransform that write a list of `DirectRow` into the Bigtable Table
 
   """
-  def __init__(self, project_id=None, instance_id=None,
-   table_id=None):
+  def __init__(self, project_id=None, instance_id=None, table_id=None):
 """ The PTransform to access the Bigtable Write connector
 Args:
   project_id(str): GCP Project of to write the Rows
   instance_id(str): GCP Instance to write the Rows
   table_id(str): GCP Table to write the `DirectRows`
 """
 super(WriteToBigTable, self).__init__()
-self.beam_options = {'project_id': project_id,
- 'instance_id': instance_id,
- 'table_id': table_id}
+self._beam_options = {'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id}
 
   def expand(self, pvalue):
-beam_options = self.beam_options
+beam_options = self._beam_options
 return (pvalue
 | beam.ParDo(_BigTableWriteFn(beam_options['project_id'],
   beam_options['instance_id'],
   beam_options['table_id'])))
+
+
+class _BigtableReadFn(beam.DoFn):
+  """ Creates the connector that can read rows for Beam pipeline
+
+  Args:
+project_id(str): GCP Project ID
+instance_id(str): GCP Instance ID
+table_id(str): GCP Table ID
+
+  """
+
+  def __init__(self, project_id, instance_id, table_id, filter_=b''):
+""" Constructor of the Read connector of Bigtable
+
+Args:
+  project_id: [str] GCP Project of to write the Rows
+  instance_id: [str] GCP Instance to write the Rows
+  table_id: [str] GCP Table to write the `DirectRows`
+  filter_: [RowFilter] Filter to apply to columns in a row.
+"""
+super(self.__class__, self).__init__()
+self._initialize({'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id,
+  'filter_': filter_})
+
+  def __getstate__(self):
+return self._beam_options
+
+  def __setstate__(self, options):
+self._initialize(options)
+
+  def _initialize(self, options):
+self._beam_options = options
+self.table = None
+self.sample_row_keys = None
+self.row_count = Metrics.counter(self.__class__.__name__, 'Rows read')
+
+  def start_bundle(self):
+if self.table is None:
+  self.table = Client(project=self._beam_options['project_id'])\
+.instance(self._beam_options['instance_id'])\
+.table(self._beam_options['table_id'])
+
+  def process(self, element, **kwargs):
+for row in self.table.read_rows(start_key=element.start_position,
+end_key=element.end_position,
+filter_=self._beam_options['filter_']):
+  self.row_count.inc()
+  yield row
+
+  def display_data(self):
+return {'projectId': DisplayDataItem(self._beam_options['project_id'],
+ label='Bigtable Project Id'),
+'instanceId': DisplayDataItem(self._beam_options['instance_id'],
+  label='Bigtable Instance Id'),
+'tableId': DisplayDataItem(self._beam_options['table_id'],
+   label='Bigtable Table Id'),
+'filter_': DisplayDataItem(str(self._beam_options['filter_']),
+   label='Bigtable Filter')
+   }
+
+
+class ReadFromBigTable(beam.PTransform):
+  def __init__(self, project_id, instance_id, table_id, filter_=b''):
+""" The PTransform to access the Bigtable Read connector
+
+Args:
+  project_id: [str] GCP Project of to read the Rows
+  instance_id): [str] GCP Instance to read the Rows
+  table_id): [str] GCP Table to read the Rows
+  filter_: [RowFilter] Filter to apply to columns in a row.
+"""
+super(self.__class__, self).__init__()
+self._beam_options = {'project_id': project_id,
+  'instance_id': instance_id,
+  'table_id': table_id,
+  'filter_': filter_}
+
+  def 

[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308062&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308062
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:08
Start Date: 06/Sep/19 18:08
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528957185
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 308062)
Time Spent: 5.5h  (was: 5h 20m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308063&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308063
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:08
Start Date: 06/Sep/19 18:08
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528957185
 
 
   Run Python Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 308063)
Time Spent: 5h 40m  (was: 5.5h)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308067
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 18:09
Start Date: 06/Sep/19 18:09
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528957722
 
 
   @lukecwik @y1chi @angoenka 
 



Issue Time Tracking
---

Worklog Id: (was: 308067)
Time Spent: 5h 50m  (was: 5h 40m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-8166) Support Graceful shutdown of worker harness.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8166?focusedWorklogId=308128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308128
 ]

ASF GitHub Bot logged work on BEAM-8166:


Author: ASF GitHub Bot
Created on: 06/Sep/19 20:13
Start Date: 06/Sep/19 20:13
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #9504: [BEAM-8166] 
Allow in process workers to exit without killing the process.
URL: https://github.com/apache/beam/pull/9504
 
 
   Allow for runners that do not spawn distinct processes for workers. 
InProcess runners can be useful for testing, or to support any future 
native Go runner. When running multiple pipelines simultaneously, it's 
desirable to allow for some measure of independence as well.
   Responding fatally on GRPC connection terminations prevents these behaviours.
   
   Further, improve the error messages to include additional context when there 
are GRPC issues.
   
   
   

[jira] [Work logged] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?focusedWorklogId=308160&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308160
 ]

ASF GitHub Bot logged work on BEAM-8167:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:02
Start Date: 06/Sep/19 21:02
Worklog Time Spent: 10m 
  Work Description: youngoli commented on issue #9505: [BEAM-8167] Replace 
Postcommit_Python_Verify in Grafana dashboards.
URL: https://github.com/apache/beam/pull/9505#issuecomment-529013258
 
 
   R: @Ardagan 
 



Issue Time Tracking
---

Worklog Id: (was: 308160)
Time Spent: 20m  (was: 10m)

> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.





[jira] [Work logged] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?focusedWorklogId=308159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308159
 ]

ASF GitHub Bot logged work on BEAM-8167:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:01
Start Date: 06/Sep/19 21:01
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #9505: [BEAM-8167] 
Replace Postcommit_Python_Verify in Grafana dashboards.
URL: https://github.com/apache/beam/pull/9505
 
 
   Replacing the old postcommit in the "Stability Critical Jobs" graphs
   with the new Python Postcommits which are split based on Python version.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Work logged] (BEAM-7967) Execute portable Flink application jar

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7967?focusedWorklogId=308180&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308180
 ]

ASF GitHub Bot logged work on BEAM-7967:


Author: ASF GitHub Bot
Created on: 06/Sep/19 21:39
Start Date: 06/Sep/19 21:39
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9408:  [BEAM-7967] Execute 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9408#issuecomment-529023081
 
 
   Run PortableJar_Flink PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308180)
Time Spent: 3h 40m  (was: 3.5h)

> Execute portable Flink application jar
> --
>
> Key: BEAM-7967
> URL: https://issues.apache.org/jira/browse/BEAM-7967
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> [https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8151) Allow the Python SDK to use many many threads

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8151?focusedWorklogId=308206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308206
 ]

ASF GitHub Bot logged work on BEAM-8151:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:51
Start Date: 06/Sep/19 22:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9477: [BEAM-8151, 
BEAM-7848] Up the max number of threads inside the SDK harness to a default of 
10k
URL: https://github.com/apache/beam/pull/9477#issuecomment-529039169
 
 
   Run Python PreCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308206)
Time Spent: 5h 10m  (was: 5h)

> Allow the Python SDK to use many many threads
> -
>
> Key: BEAM-8151
> URL: https://issues.apache.org/jira/browse/BEAM-8151
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> We need to use a thread pool which shrinks the number of active threads when 
> they are not being used.
>  
> This is to prevent any stuckness issues related to a runner scheduling more 
> work items than there are "work" threads inside the SDK harness.
>  
> By default the control plane should have all "requests" being processed in 
> parallel and the runner is responsible for not overloading the SDK with too 
> much work.
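
As a rough illustration of the kind of pool described above, here is a minimal Python sketch of a thread pool whose workers exit after sitting idle for a while. It is not the Beam SDK harness implementation; the class name `ShrinkingThreadPool` and the parameters `max_workers` and `idle_timeout` are hypothetical and chosen only for this example.

```python
import collections
import threading


class ShrinkingThreadPool:
    """Hypothetical pool: workers exit after `idle_timeout` seconds without work."""

    def __init__(self, max_workers=10000, idle_timeout=1.0):
        self._cond = threading.Condition()
        self._tasks = collections.deque()
        self._max_workers = max_workers
        self._idle_timeout = idle_timeout
        self._workers = 0  # threads currently alive
        self._idle = 0     # threads currently waiting for a task

    def submit(self, fn, *args):
        with self._cond:
            self._tasks.append((fn, args))
            if self._idle >= len(self._tasks):
                # Enough idle workers exist for every queued task; wake one.
                self._cond.notify()
            elif self._workers < self._max_workers:
                # Otherwise grow the pool so the new task never waits
                # behind a task that is already running.
                self._workers += 1
                threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            with self._cond:
                self._idle += 1
                got_task = self._cond.wait_for(
                    lambda: self._tasks, timeout=self._idle_timeout)
                self._idle -= 1
                if not got_task:
                    # Idle for too long: shrink the pool by one thread.
                    self._workers -= 1
                    return
                fn, args = self._tasks.popleft()
            fn(*args)

    def active_workers(self):
        with self._cond:
            return self._workers
```

Because a new thread is started whenever no idle worker is available, a task that blocks waiting on a later task does not stall the pool until `max_workers` is reached, and the thread count drops back once the work drains.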



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8161) Upgrade to joda time 2.10.3 to get updated TZDB

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8161?focusedWorklogId=308205&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308205
 ]

ASF GitHub Bot logged work on BEAM-8161:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:51
Start Date: 06/Sep/19 22:51
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9495: [BEAM-8161] 
Update joda time to 2.10.3 to get TZDB 2019b
URL: https://github.com/apache/beam/pull/9495
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308205)
Time Spent: 20m  (was: 10m)

> Upgrade to joda time 2.10.3 to get updated TZDB
> ---
>
> Key: BEAM-8161
> URL: https://issues.apache.org/jira/browse/BEAM-8161
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8156) Finish migration to standard Python typing

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8156?focusedWorklogId=308211&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308211
 ]

ASF GitHub Bot logged work on BEAM-8156:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:08
Start Date: 06/Sep/19 23:08
Worklog Time Spent: 10m 
  Work Description: robertwb commented on pull request #9509: [BEAM-8156] 
Add convert_to_typing_type
URL: https://github.com/apache/beam/pull/9509#discussion_r321936389
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -198,6 +202,10 @@ def convert_to_beam_type(typ):
   arity=-1,
   beam_type=typehints.Tuple),
   _TypeMapEntry(match=_match_is_union, arity=-1, 
beam_type=typehints.Union),
+  _TypeMapEntry(
+  match=_match_issubclass(typing.Iterator),
 
 Review comment:
   This will match all kinds of things, including strings and dicts, which is 
not what we want. (We may want to explicitly note that.)
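
To see how broad a plain subclass check can be, here is a small Python sketch using the standard `collections.abc` classes. It does not reproduce Beam's `_match_issubclass` helper; `matching_builtins` is a hypothetical name used only for illustration. It lists which common builtins pass `issubclass` against a given ABC.

```python
import collections.abc as abc


def matching_builtins(abc_type):
    """Return the names of common builtins that are subclasses of abc_type."""
    candidates = (str, bytes, dict, list, tuple, set)
    return [t.__name__ for t in candidates if issubclass(t, abc_type)]


iterable_matches = matching_builtins(abc.Iterable)
iterator_matches = matching_builtins(abc.Iterator)
```

Against `Iterable`, every candidate matches, strings and dicts included, which is the kind of over-matching the comment warns about. Against `Iterator`, none of these builtins match, so the exact ABC used in the check matters.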
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308211)
Remaining Estimate: 503.5h  (was: 503h 40m)
Time Spent: 0.5h  (was: 20m)

> Finish migration to standard Python typing
> --
>
> Key: BEAM-8156
> URL: https://issues.apache.org/jira/browse/BEAM-8156
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Major
>   Original Estimate: 504h
>  Time Spent: 0.5h
>  Remaining Estimate: 503.5h
>
> We should migrate all Python uses of types to the standard typing module, and 
> make the typehints.* ones aliases of the Python ones. 
>  
> There are three places where we use custom typehints behavior: 
> (1) is_compatible_with
> (2) bind_type_variables/match_type_variables
> (3) trivial type inference. 
>  
> I would propose that each of these be adapted to a (internal) public 
> interface that accepts and returns standard typing types, and internally 
> converts to our (nowhere else exposed) typehints types, performs the logic, 
> and converts back. Each of these in turn can then be updated, as needed and 
> orthogonally, to operate on the typing types natively (possibly via deference 
> to a third-party library). 
>  
> I think coder inference could be easily adapted to use typing types directly, 
> but it may be a fourth place where we do internal conversion first. Another 
> gotcha is special care may need to be taken if we ever need to pickle these 
> types (which IIRC may have issues). 
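
The wrapper pattern proposed above could look roughly like the following Python sketch (Python 3.8+ for `typing.get_origin` and `typing.get_args`). Everything here is an assumption made for illustration: the `(origin, args)` tuple stands in for the internal typehints representation, the compatibility rule is a toy that treats generics as covariant, and unions are not handled. Since the check returns a bool, no conversion back to typing types is shown.

```python
import typing


def _to_internal(typ):
    """Convert a typing type to a hypothetical internal (origin, args) pair."""
    origin = typing.get_origin(typ)
    if origin is None:  # plain class such as int, or typing.Any
        return (typ, ())
    return (origin, tuple(_to_internal(arg) for arg in typing.get_args(typ)))


def _internal_is_compatible(sub, base):
    """Toy check on the internal form; generics are treated as covariant."""
    (sub_origin, sub_args), (base_origin, base_args) = sub, base
    if base_origin is typing.Any:
        return True
    if sub_origin is typing.Any:
        return False
    if not issubclass(sub_origin, base_origin):
        return False
    if not base_args:
        return True
    return len(sub_args) == len(base_args) and all(
        _internal_is_compatible(s, b) for s, b in zip(sub_args, base_args))


def is_compatible_with(sub, base):
    """Public entry point: accepts standard typing types only."""
    return _internal_is_compatible(_to_internal(sub), _to_internal(base))
```

Callers only ever see standard typing types, so the internal representation could later be swapped out without changing the public signature.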



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8167?focusedWorklogId=308212&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308212
 ]

ASF GitHub Bot logged work on BEAM-8167:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:08
Start Date: 06/Sep/19 23:08
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on pull request #9505: [BEAM-8167] 
Replace Postcommit_Python_Verify in Grafana dashboards.
URL: https://github.com/apache/beam/pull/9505
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308212)
Time Spent: 50m  (was: 40m)

> Replace old Postcommit_Python_Verify test with new python postcommits in 
> Grafana.
> -
>
> Key: BEAM-8167
> URL: https://issues.apache.org/jira/browse/BEAM-8167
> Project: Beam
>  Issue Type: Bug
>  Components: project-management, testing
>Reporter: Daniel Oliveira
>Assignee: Daniel Oliveira
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The old postcommit was replaced with multiple new postcommits for each 
> supported python version (Python2, Python35, Python36, Python37). These 
> aren't being reflected on the Grafana dashboards yet, so that should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread Mikhail Gryzykhin (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924461#comment-16924461
 ] 

Mikhail Gryzykhin commented on BEAM-7969:
-

Initial PR double counted elements in GRPC PCollections 

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> EOM



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=308048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308048
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 06/Sep/19 17:49
Start Date: 06/Sep/19 17:49
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #9189: [BEAM-5820] vendor 
calcite
URL: https://github.com/apache/beam/pull/9189#issuecomment-528950626
 
 
   Run SQL PostCommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308048)
Time Spent: 5h 50m  (was: 5h 40m)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=308113&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308113
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 06/Sep/19 19:27
Start Date: 06/Sep/19 19:27
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9494: [BEAM-7969] Fix 
doublecount on GRPC PCollections in streaming jobs.
URL: https://github.com/apache/beam/pull/9494#issuecomment-528984343
 
 
   @ajamato 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308113)
Time Spent: 6h  (was: 5h 50m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> EOM



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8167) Replace old Postcommit_Python_Verify test with new python postcommits in Grafana.

2019-09-06 Thread Daniel Oliveira (Jira)
Daniel Oliveira created BEAM-8167:
-

 Summary: Replace old Postcommit_Python_Verify test with new python 
postcommits in Grafana.
 Key: BEAM-8167
 URL: https://issues.apache.org/jira/browse/BEAM-8167
 Project: Beam
  Issue Type: Bug
  Components: project-management, testing
Reporter: Daniel Oliveira
Assignee: Daniel Oliveira


The old postcommit was replaced with multiple new postcommits for each 
supported python version (Python2, Python35, Python36, Python37). These aren't 
being reflected on the Grafana dashboards yet, so that should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8111) SchemaCoder broken on DataflowRunner

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8111?focusedWorklogId=308195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308195
 ]

ASF GitHub Bot logged work on BEAM-8111:


Author: ASF GitHub Bot
Created on: 06/Sep/19 22:14
Start Date: 06/Sep/19 22:14
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on pull request #9454: [BEAM-8111] 
Add ValidatesRunner test to AvroSchemaTest
URL: https://github.com/apache/beam/pull/9454
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308195)
Time Spent: 1h 40m  (was: 1.5h)

> SchemaCoder broken on DataflowRunner
> 
>
> Key: BEAM-8111
> URL: https://issues.apache.org/jira/browse/BEAM-8111
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-core
>Affects Versions: 2.15.0
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/commit/e65c176a9f34e45d408281e1101a2ae54cef0f6c
>  broke SchemaCoder on Dataflow. When translating a schema that uses logical 
> types from a cloud object, Dataflow encounters a runtime error.
> This means any pipelines that use SqlTransform or schema transforms will fail 
> on Dataflow in 2.15.0.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8168) Python GCSFileSystem failing with gzip content encoding

2019-09-06 Thread Daniel Ecer (Jira)
Daniel Ecer created BEAM-8168:
-

 Summary: Python GCSFileSystem failing with gzip content encoding
 Key: BEAM-8168
 URL: https://issues.apache.org/jira/browse/BEAM-8168
 Project: Beam
  Issue Type: Bug
  Components: io-py-gcp
Affects Versions: 2.15.0
Reporter: Daniel Ecer


Google Storage supports gzip content encoding.

 

Apache Beam (Python) works correctly with .gz files that have no content 
encoding.

However, it fails to handle .gz files that have content encoding applied.

e.g. (the following works when run in a Jupyter notebook)

{code:python}
file_url_1 = 'gs://some-bucket/test1.gz'
file_url_2 = 'gs://some-bucket/test2.gz'

!echo 'my content' > /tmp/test

# file 1 without content encoding
!cat /tmp/test | gzip | gsutil cp - "{file_url_1}"

# file 2 with content encoding
!gsutil cp -Z /tmp/test "{file_url_2}"

!gsutil cat "{file_url_1}" | zcat -
# output: my content

!gsutil cat "{file_url_2}" | zcat -
# output: my content

import apache_beam as beam
from apache_beam.io.filesystem import CompressionTypes
from apache_beam.io.filesystems import FileSystems

print(beam.__version__)
# output: 2.15.0

with FileSystems.open(file_url_1, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
    print(fp.read(10))
# output: b'\x1f\x8b\x08\x00\x10\xd6r]\x00\x03'

with FileSystems.open(file_url_1) as fp:
    print(fp.read(10))
# output: b'my content'

with FileSystems.open(file_url_2, compression_type=CompressionTypes.UNCOMPRESSED) as fp:
    print(fp.read(10))
# output: b'my content'
# (here I would expect the gzipped bytes)

with FileSystems.open(file_url_2) as fp:
    print(fp.read(10))
# exception: FailedToDecompressContent: Content purported to be compressed with
# gzip but failed to decompress.
{code}
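
One way to tell whether bytes that come back are still gzip-compressed, or were already decoded somewhere along the way, is to check for the gzip magic number (`\x1f\x8b`, visible in the first output above). The following is a local, standard-library-only sketch for diagnosing the symptom. It is not how Beam or the GCS client handles content encoding, and the helper names `looks_gzipped` and `read_payload` are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(data: bytes) -> bool:
    """Return True if data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def read_payload(data: bytes) -> bytes:
    """Decompress data if it is gzip; otherwise return it unchanged."""
    return gzip.decompress(data) if looks_gzipped(data) else data


raw = b"my content\n"
compressed = gzip.compress(raw)
```

Here `read_payload(compressed)` and `read_payload(raw)` both yield the original bytes, which mirrors the report: the second file's bytes arrive already decoded, so a reader that assumes gzip from the `.gz` suffix fails.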

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (BEAM-8172) Add a label field to FunctionSpec proto

2019-09-06 Thread Chad Dombrova (Jira)
Chad Dombrova created BEAM-8172:
---

 Summary: Add a label field to FunctionSpec proto
 Key: BEAM-8172
 URL: https://issues.apache.org/jira/browse/BEAM-8172
 Project: Beam
  Issue Type: Improvement
  Components: beam-model
Reporter: Chad Dombrova
Assignee: Chad Dombrova


The FunctionSpec payload is opaque outside of the native environment, which can 
make debugging pipeline protos difficult.  It would be very useful if the 
FunctionSpec had an optional human readable "label", for debugging and 
presenting in UIs and error messages.

For example, in python, if the payload is an instance of a class, we could 
attempt to provide a string that represents the dotted path to that class, 
"mymodule.MyClass".  In the case of coders, we could use the label to hold the 
type hint:  "Optional[Iterable[mymodule.MyClass]]".
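
A dotted-path label of the kind suggested above could be derived in Python roughly as follows. This is only a sketch: the function name `class_label` is hypothetical, and how such a label would be attached to the FunctionSpec proto is not shown.

```python
def class_label(obj) -> str:
    """Return 'module.QualifiedName' for a class or for an instance's class."""
    cls = obj if isinstance(obj, type) else type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


class MyClass:
    """Stand-in for a user-defined payload class."""


label = class_label(MyClass())
```

For a class defined in a module named `mymodule`, this yields `mymodule.MyClass`. Using `__qualname__` rather than `__name__` keeps nested classes distinguishable.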




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=308245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308245
 ]

ASF GitHub Bot logged work on BEAM-7945:


Author: ASF GitHub Bot
Created on: 07/Sep/19 00:58
Start Date: 07/Sep/19 00:58
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #9452: [BEAM-7945] 
Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#discussion_r321948469
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/options/RemoteEnvironmentOptions.java
 ##
 @@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.options;
+
+import com.google.auto.service.AutoService;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList;
+
+/** Options that are used to control configuration of the remote environment. */
+@Experimental
+@Hidden
+public interface RemoteEnvironmentOptions extends PipelineOptions {
+
+  @Description("Local semi-persistent directory")
+  @Default.String("/tmp")
 
 Review comment:
   I think the default should be null, so that the environment can pick a 
suitable tmp directory when nothing is specified by the user. 
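
The fallback suggested here can be sketched in a few lines of Python (the option under review is Java, so this only illustrates the idea). The function name `resolve_semi_persist_dir` is hypothetical, and `tempfile.gettempdir()` stands in for whatever the environment would choose.

```python
import tempfile
from typing import Optional


def resolve_semi_persist_dir(configured: Optional[str] = None) -> str:
    """Use the configured directory if given; otherwise let the environment pick."""
    return configured if configured else tempfile.gettempdir()
```

With no default baked into the option, an unset value reaches the environment as "not specified" rather than as a hard-coded `/tmp`.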
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308245)
Time Spent: 2.5h  (was: 2h 20m)

> Allow runner to configure "semi_persist_dir" which is used in the SDK harness
> -
>
> Key: BEAM-7945
> URL: https://issues.apache.org/jira/browse/BEAM-7945
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently "semi_persist_dir" is not configurable. This may become a problem 
> in certain scenarios. For example, the default value of "semi_persist_dir" is 
> "/tmp" 
> ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48])
>  in Python SDK harness. When the environment type is "PROCESS", the disk of 
> "/tmp" may be filled up and unexpected issues will occur in production 
> environment. We should provide a way to configure "semi_persist_dir" in 
> EnvironmentFactory at the runner side. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7990) Add ability to read parquet files into PCollection

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7990?focusedWorklogId=308254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308254
 ]

ASF GitHub Bot logged work on BEAM-7990:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:32
Start Date: 07/Sep/19 01:32
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9361: [BEAM-7990] Add 
ability to read parquet files into PCollection
URL: https://github.com/apache/beam/pull/9361#issuecomment-529059952
 
 
   R: @chamikaramj would you mind reviewing/merging?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308254)
Time Spent: 40m  (was: 0.5h)

> Add ability to read parquet files into PCollection
> -
>
> Key: BEAM-7990
> URL: https://issues.apache.org/jira/browse/BEAM-7990
> Project: Beam
>  Issue Type: New Feature
>  Components: io-py-parquet
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8169) DataCatalogTableProvider should use credentials from PipelineOptions

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8169?focusedWorklogId=308220&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308220
 ]

ASF GitHub Bot logged work on BEAM-8169:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:31
Start Date: 06/Sep/19 23:31
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on pull request #9511: 
[BEAM-8169] DataCatalogTableProvider to use GCP credentials from PipelineOptions
URL: https://github.com/apache/beam/pull/9511
 
 
   This switches the Google Cloud Data Catalog metastore to use the 
credentials specified in the pipeline options.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Work logged] (BEAM-7970) Regenerate Go SDK proto files in correct version

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7970?focusedWorklogId=308221&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308221
 ]

ASF GitHub Bot logged work on BEAM-7970:


Author: ASF GitHub Bot
Created on: 06/Sep/19 23:31
Start Date: 06/Sep/19 23:31
Worklog Time Spent: 10m 
  Work Description: youngoli commented on pull request #9512: [BEAM-7970] 
Adding correct version of Go protobuf compiler.
URL: https://github.com/apache/beam/pull/9512
 
 
   From personal experience, using a version of the Go protobuf compiler that's 
too new will cause errors when generating the protobuf code, since it'll be 
incompatible with the version of the golang/protobuf library used in Beam. So I 
added a note with that warning and the best version to use.
   
   Only worry is that the note will become outdated if we ever update the 
golock file or switch to modules, but I can't think of a more "evergreen" way 
to leave a warning.
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Work logged] (BEAM-3713) Consider moving away from nose to nose2 or pytest.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-3713?focusedWorklogId=308226=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308226
 ]

ASF GitHub Bot logged work on BEAM-3713:


Author: ASF GitHub Bot
Created on: 07/Sep/19 00:08
Start Date: 07/Sep/19 00:08
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #7949: [BEAM-3713] Add pytest 
testing infrastructure
URL: https://github.com/apache/beam/pull/7949#issuecomment-529051732
 
 
   run python 2 postcommit
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 308226)
Time Spent: 4.5h  (was: 4h 20m)

> Consider moving away from nose to nose2 or pytest.
> --
>
> Key: BEAM-3713
> URL: https://issues.apache.org/jira/browse/BEAM-3713
> Project: Beam
>  Issue Type: Test
>  Components: sdk-py-core, testing
>Reporter: Robert Bradshaw
>Assignee: Udi Meiri
>Priority: Minor
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Per https://nose.readthedocs.io/en/latest/, nose is in maintenance mode.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-8171) Python container build fails

2019-09-06 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924678#comment-16924678
 ] 

Kyle Weaver commented on BEAM-8171:
---

Seems to be caused by [https://github.com/h5py/h5py/issues/1294]

> Python container build fails
> 
>
> Key: BEAM-8171
> URL: https://issues.apache.org/jira/browse/BEAM-8171
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-harness
>Reporter: Kyle Weaver
>Priority: Major
>
> > Task :sdks:python:container:py3:docker
> The command '/bin/sh -c pip install -r /tmp/base_image_requirements.txt && 
> python -c "from google.protobuf.internal import api_implementation; assert 
> api_implementation._default_implementation_type == 'cpp'; print ('Verified 
> fast protobuf used.')" && rm -rf /root/.cache/pip' returned a non-zero code: 1
> This also happens for py2.
> Looks like this dependency might be causing the error:
> 16:47:51.893 [INFO] [system.out] Loading library to get version: libhdf5.so
> 16:47:51.893 [INFO] [system.out] error: libhdf5.so: cannot open shared object 
> file: No such file or directory
> 16:47:51.893 [INFO] [system.out] 
> 16:47:51.893 [INFO] [system.out] ERROR: Failed building wheel for h5py





[jira] [Work logged] (BEAM-8151) Allow the Python SDK to use many many threads

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8151?focusedWorklogId=308260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308260
 ]

ASF GitHub Bot logged work on BEAM-8151:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:58
Start Date: 07/Sep/19 01:58
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9477: [BEAM-8151, BEAM-7848] 
Up the max number of threads inside the SDK harness to a default of 10k
URL: https://github.com/apache/beam/pull/9477#issuecomment-529062034
 
 
   Precommit is failing with:
   
   16:09:15 collapsing-thread-pool-executor 2018.6 has requirement 
futures==3.2.0, but you have futures 3.3.0.
   16:09:15 ERROR: InvocationError for command 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Phrase/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/target/.tox-py27-lint/py27-lint/bin/pip
 check (exited with code 1)
   16:09:15 py27-lint run-test-post: commands[0] | 
/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Phrase/src/sdks/python/test-suites/tox/py2/build/srcs/sdks/python/scripts/run_tox_cleanup.sh
   16:09:15 ___ summary 

   16:09:15 ERROR:   py27-lint: commands failed
   
   The reason is a dependency conflict.
   collapsing-thread-pool-executor requires futures == 3.2.0 for python 2.7 
(https://github.com/ftpsolutions/collapsing-thread-pool-executor/blob/master/setup.py#L22)
   
   Beam's setup.py accepts "'futures>=3.2.0,<4.0.0; python_version < "3.0"',"
   
   We can do a few things:
   - Update collapsing-thread-pool-executor to relax its exact pin on the 
futures dependency
   - Update Beam to restrict its futures dependency to 3.2.0
 



Issue Time Tracking
---

Worklog Id: (was: 308260)
Time Spent: 5h 20m  (was: 5h 10m)

> Allow the Python SDK to use many many threads
> -
>
> Key: BEAM-8151
> URL: https://issues.apache.org/jira/browse/BEAM-8151
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core, sdk-py-harness
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> We need to use a thread pool which shrinks the number of active threads when 
> they are not being used.
>  
> This is to prevent any stuckness issues related to a runner scheduling more 
> work items than there are "work" threads inside the SDK harness.
>  
> By default the control plane should have all "requests" being processed in 
> parallel and the runner is responsible for not overloading the SDK with too 
> much work.
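
The kind of pool described above can be sketched as follows. This is only an 
illustration under stated assumptions, not the pool used in the pull request: 
the class name, `idle_timeout` parameter, and `worker_count` helper are all 
hypothetical. Workers are started on demand up to a cap and exit once they have 
been idle for the timeout.

```python
import collections
import threading


class ShrinkingThreadPool:
    """Hypothetical pool: grows on demand, shrinks when workers sit idle."""

    def __init__(self, max_workers, idle_timeout):
        self._cond = threading.Condition()
        self._tasks = collections.deque()
        self._max_workers = max_workers
        self._idle_timeout = idle_timeout
        self._workers = 0  # threads currently alive
        self._idle = 0     # threads currently waiting for a task

    def worker_count(self):
        with self._cond:
            return self._workers

    def submit(self, fn, *args):
        with self._cond:
            self._tasks.append((fn, args))
            if self._idle >= len(self._tasks):
                # Enough idle workers exist to cover every queued task.
                self._cond.notify()
            elif self._workers < self._max_workers:
                # No spare worker: start one rather than make the task wait.
                self._workers += 1
                threading.Thread(target=self._run, daemon=True).start()
            # Otherwise the cap is reached and the task waits in the queue.

    def _run(self):
        while True:
            with self._cond:
                self._idle += 1
                while not self._tasks:
                    notified = self._cond.wait(timeout=self._idle_timeout)
                    if not notified and not self._tasks:
                        # Idle for the full timeout: let this thread exit.
                        self._idle -= 1
                        self._workers -= 1
                        return
                self._idle -= 1
                fn, args = self._tasks.popleft()
            try:
                fn(*args)
            except Exception:
                pass  # a real pool would report the failure to the caller
```

With a large `max_workers`, a burst of requests each gets its own thread, and 
the thread count falls back to zero once the burst is over.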





[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308261
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:59
Start Date: 07/Sep/19 01:59
Worklog Time Spent: 10m 
  Work Description: tweise commented on pull request #9487: 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#discussion_r321950925
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,13 +84,12 @@ def default_docker_image():
   'has Python %d.%d interpreter.' % (
   sys.version_info[0], sys.version_info[1]))
 
-  # Perhaps also test if this was built?
-  image = ('{user}-docker-apache.bintray.io/beam/python'
-   '{version_suffix}:latest'.format(
-   user=os.environ['USER'],
-   version_suffix=version_suffix))
+  image = ('apachebeam/python{version_suffix}_sdk:latest'.format(
 
 Review comment:
   I have mixed feelings about this. When working with master, wouldn't I 
expect the locally built image to be used? What does it mean to refer to the 
hub for something that wasn't released?
   
   Perhaps choose the default depending on whether the current version is a 
dev/snapshot or a release?
   
 



Issue Time Tracking
---

Worklog Id: (was: 308261)
Time Spent: 2h 50m  (was: 2h 40m)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=308252=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308252
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:13
Start Date: 07/Sep/19 01:13
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #9189: [BEAM-5820] vendor 
calcite
URL: https://github.com/apache/beam/pull/9189#issuecomment-529058347
 
 
   Thanks for replacing Calcite in Beam ZetaSQL with vendored Calcite, which 
will unblock us from moving Beam ZetaSQL to a separate module. 
 



Issue Time Tracking
---

Worklog Id: (was: 308252)
Time Spent: 6.5h  (was: 6h 20m)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=308251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308251
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:13
Start Date: 07/Sep/19 01:13
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #9189: [BEAM-5820] vendor 
calcite
URL: https://github.com/apache/beam/pull/9189#issuecomment-529058347
 
 
   Thanks for replacing Calcite in Beam ZetaSQL with vendored Calcite, which 
will unblock us from moving Beam ZetaSQL to a separate module. 
 



Issue Time Tracking
---

Worklog Id: (was: 308251)
Time Spent: 6h 20m  (was: 6h 10m)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=308250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308250
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 07/Sep/19 01:13
Start Date: 07/Sep/19 01:13
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #9189: [BEAM-5820] vendor 
calcite
URL: https://github.com/apache/beam/pull/9189#issuecomment-529058347
 
 
   Thanks for replacing Calcite in Beam ZetaSQL with vendored Calcite, which 
will unblock us from moving Beam ZetaSQL to a separate module. 
 



Issue Time Tracking
---

Worklog Id: (was: 308250)
Time Spent: 6h 10m  (was: 6h)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308266
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 07/Sep/19 02:33
Start Date: 07/Sep/19 02:33
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9487: 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#discussion_r321951929
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,13 +84,12 @@ def default_docker_image():
   'has Python %d.%d interpreter.' % (
   sys.version_info[0], sys.version_info[1]))
 
-  # Perhaps also test if this was built?
-  image = ('{user}-docker-apache.bintray.io/beam/python'
-   '{version_suffix}:latest'.format(
-   user=os.environ['USER'],
-   version_suffix=version_suffix))
+  image = ('apachebeam/python{version_suffix}_sdk:latest'.format(
 
 Review comment:
   To build on Thomas's idea, we could do something similar to what 
DataflowRunner does today 
(https://github.com/apache/beam/blob/9043b3e0bc138e63123d3fa55d4b7280eb96a82e/sdks/python/apache_beam/runners/dataflow/internal/apiclient.py#L946)
   
   - If this is a released version on a release branch (i.e. no 'dev' in the 
version), default to the image in the hub. (The release manager will have to 
build this container as part of the release anyway.)
   - If it is a dev version, the choice is an open question. DataflowRunner's 
solution is to use a fixed image that is occasionally updated as needed, or to 
use an override if a flag is provided. (Here we might simplify and require a 
locally built version. The occasionally built image is convenient but 
error-prone and a hassle.)
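
   The policy in the two bullets above could be sketched roughly like this. It 
is only an illustration of the proposal, not code from the pull request: the 
function signature and the error message are hypothetical, and the image name 
is the one from the diff under review.

```python
def default_docker_image(beam_version, version_suffix):
    """Pick a default SDK container image (hypothetical policy sketch)."""
    if "dev" in beam_version:
        # Dev versions have no released image in the hub, so this sketch
        # requires the caller to build one locally or pass one explicitly.
        raise ValueError(
            "No default image for dev version %s: build the container "
            "locally or specify an image explicitly." % beam_version)
    # Released versions default to the image published in the hub.
    return "apachebeam/python{version_suffix}_sdk:latest".format(
        version_suffix=version_suffix)
```

   Raising for dev versions is just one way to express "require a locally 
built version"; returning a local image name instead would work equally well.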
 



Issue Time Tracking
---

Worklog Id: (was: 308266)
Time Spent: 3h 10m  (was: 3h)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-8173) Filesystems.matchSingleFileSpec throws away the actual failure exception

2019-09-06 Thread Kenneth Knowles (Jira)
Kenneth Knowles created BEAM-8173:
-

 Summary: Filesystems.matchSingleFileSpec throws away the actual 
failure exception
 Key: BEAM-8173
 URL: https://issues.apache.org/jira/browse/BEAM-8173
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Kenneth Knowles


At [1] the result of a non-OK match causes an exception to be thrown. But the 
exception does not include the actual cause of the failure, so it cannot be 
efficiently debugged. It appears that MatchResult is designed so that callers 
invoke metadata() without first checking the status, so that the underlying 
exception can be re-raised, caught, and put in the chain of causes, as it 
should be.

[1] 
https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java#L190
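
The pattern being asked for can be illustrated with a small Python analogue. 
This is not the Beam Java API: `MatchResult`, `MatchError`, and 
`match_single_file_spec` below are hypothetical stand-ins that only show how 
calling `metadata()` directly lets the underlying failure end up in the 
exception chain.

```python
class MatchError(Exception):
    """Hypothetical error raised when a file spec cannot be resolved."""


class MatchResult:
    """Hypothetical stand-in: holds matched metadata or the failure cause."""

    def __init__(self, metadata=None, error=None):
        self._metadata = metadata or []
        self._error = error

    def metadata(self):
        # Re-raise the stored failure instead of hiding it behind a status.
        if self._error is not None:
            raise self._error
        return self._metadata


def match_single_file_spec(spec, result):
    try:
        matches = result.metadata()
    except OSError as exc:
        # Chain the original failure so it appears as the cause.
        raise MatchError("Error matching file spec %s" % spec) from exc
    if len(matches) != 1:
        raise MatchError(
            "Expected exactly one match for %s, got %d" % (spec, len(matches)))
    return matches[0]
```

A caller that catches `MatchError` can then inspect `__cause__` to see the 
original failure.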





[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=308283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308283
 ]

ASF GitHub Bot logged work on BEAM-5878:


Author: ASF GitHub Bot
Created on: 07/Sep/19 03:33
Start Date: 07/Sep/19 03:33
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #9517: Revert 
"[BEAM-5878] support DoFns with Keyword-only arguments"
URL: https://github.com/apache/beam/pull/9517
 
 
   Reverts apache/beam#9237
 



Issue Time Tracking
---

Worklog Id: (was: 308283)
Time Spent: 11h 20m  (was: 11h 10m)

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 11h 20m
>  Remaining Estimate: 0h
>
> Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0 inspect.getargspec() 
> will fail on functions with keyword-only arguments, but a new method 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec]
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once Beam Python 3 tests are in good shape.
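
As a small illustration of the point above, `inspect.getfullargspec()` reports 
keyword-only arguments in separate fields. The `process` function here is a 
made-up example, not a Beam DoFn.

```python
import inspect


def process(element, *, multiplier=2):
    """Example function with one keyword-only argument."""
    return element * multiplier


spec = inspect.getfullargspec(process)
print(spec.args)            # ['element']
print(spec.kwonlyargs)      # ['multiplier']
print(spec.kwonlydefaults)  # {'multiplier': 2}
```

Code that previously read only `args` and `defaults` would need to consult 
`kwonlyargs` and `kwonlydefaults` as well to see the full signature.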





[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=308282=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308282
 ]

ASF GitHub Bot logged work on BEAM-5878:


Author: ASF GitHub Bot
Created on: 07/Sep/19 03:33
Start Date: 07/Sep/19 03:33
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9237: [BEAM-5878] support 
DoFns with Keyword-only arguments
URL: https://github.com/apache/beam/pull/9237#issuecomment-529068152
 
 
   A Beam SDK user on the master branch reported that the monkey-patch caused 
their pipeline to fail with what appears to be an infinite recursion from 
new_save_reduce. 
   
   If line 163 is removed, the problem disappears. The problem persists with 
the monkey patch, even if the body of new_save_reduce is changed to
   ```
   StockPickler.save_reduce(self, func, args, *other_args, **kwargs)
   ```
   @lazylynx I have not investigated that report yet, but I'd suggest reverting 
this change in the meantime until we find a better way to patch, which 
may soon be unnecessary given your outstanding changes to dill.
 



Issue Time Tracking
---

Worklog Id: (was: 308282)
Time Spent: 11h 10m  (was: 11h)

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0 inspect.getargspec() 
> will fail on functions with keyword-only arguments, but a new method 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec]
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once Beam Python 3 tests are in good shape.





[jira] [Work logged] (BEAM-6185) Upgrade Spark to version 2.4.0

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6185?focusedWorklogId=308285=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308285
 ]

ASF GitHub Bot logged work on BEAM-6185:


Author: ASF GitHub Bot
Created on: 07/Sep/19 03:34
Start Date: 07/Sep/19 03:34
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #9487: 
[BEAM-6185]Change default docker images name
URL: https://github.com/apache/beam/pull/9487#discussion_r321953571
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,13 +84,12 @@ def default_docker_image():
   'has Python %d.%d interpreter.' % (
   sys.version_info[0], sys.version_info[1]))
 
-  # Perhaps also test if this was built?
-  image = ('{user}-docker-apache.bintray.io/beam/python'
-   '{version_suffix}:latest'.format(
-   user=os.environ['USER'],
-   version_suffix=version_suffix))
+  image = ('apachebeam/python{version_suffix}_sdk:latest'.format(
 
 Review comment:
   Thanks @aaltay and @tweise for comments.
   
   The `--environment_config` option can be used to pass a customized image 
when using the portable runner. The default image is used when no docker image 
is specified.
   
   My understanding is that, from a user's perspective, users don't need to 
worry about creating docker images locally when they write pipelines, so it's 
OK to pull the image from remote. From a developer's perspective, developers 
working on the SDKs are expected to know how the portable runner works, 
including how docker is used. If they want to test an SDK under development, 
they should build an image locally and use it. The image can have the same 
name as the default one, in which case the local image is used instead of being 
pulled from remote, or it can be passed with the `--environment_config` option 
if it has a different name.
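
   The fallback described above can be sketched with plain argparse. This is 
an illustration only, not the portable runner's actual option handling: the 
`resolve_sdk_image` helper and its `version_suffix` parameter are hypothetical, 
while the flag name and default image name are the ones discussed in this 
thread.

```python
import argparse


def resolve_sdk_image(argv, version_suffix=""):
    """Return the explicitly passed image, or fall back to the default."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--environment_config", default=None)
    # Ignore any other pipeline options that may be present.
    known, _unknown = parser.parse_known_args(argv)
    if known.environment_config:
        return known.environment_config
    return "apachebeam/python{version_suffix}_sdk:latest".format(
        version_suffix=version_suffix)
```

   For example, `resolve_sdk_image(["--environment_config=my/image:dev"])` 
returns the custom image, and `resolve_sdk_image([])` returns the default name.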
   
   I can add more comments here to explain how to pass customized images.
   
   What scenarios am I missing?
 



Issue Time Tracking
---

Worklog Id: (was: 308285)
Time Spent: 3h 20m  (was: 3h 10m)

> Upgrade Spark to version 2.4.0
> --
>
> Key: BEAM-6185
> URL: https://issues.apache.org/jira/browse/BEAM-6185
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-spark
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Major
> Fix For: 2.12.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5878) Support DoFns with Keyword-only arguments in Python 3.

2019-09-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5878?focusedWorklogId=308284=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-308284
 ]

ASF GitHub Bot logged work on BEAM-5878:


Author: ASF GitHub Bot
Created on: 07/Sep/19 03:34
Start Date: 07/Sep/19 03:34
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #9517: Revert "[BEAM-5878] 
support DoFns with Keyword-only arguments"
URL: https://github.com/apache/beam/pull/9517#issuecomment-529068201
 
 
   R: @lazylynx 
 



Issue Time Tracking
---

Worklog Id: (was: 308284)
Time Spent: 11.5h  (was: 11h 20m)

> Support DoFns with Keyword-only arguments in Python 3.
> --
>
> Key: BEAM-5878
> URL: https://issues.apache.org/jira/browse/BEAM-5878
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: yoshiki obata
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> Python 3.0 [adds a possibility|https://www.python.org/dev/peps/pep-3102/] to 
> define functions with keyword-only arguments. 
> Currently Beam does not handle them correctly. [~ruoyu] pointed out [one 
> place|https://github.com/apache/beam/blob/a56ce43109c97c739fa08adca45528c41e3c925c/sdks/python/apache_beam/typehints/decorators.py#L118]
>  in our codebase that we should fix: in Python 3.0 inspect.getargspec() 
> will fail on functions with keyword-only arguments, but a new method 
> [inspect.getfullargspec()|https://docs.python.org/3/library/inspect.html#inspect.getfullargspec]
>  supports them.
> There may be implications for our (best-effort) type-hints machinery.
> We should also add Py3-only unit tests that cover DoFns with keyword-only 
> arguments once Beam Python 3 tests are in good shape.




