[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=297632&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297632
 ]

ASF GitHub Bot logged work on BEAM-6855:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:37
Start Date: 20/Aug/19 05:37
Worklog Time Spent: 10m 
  Work Description: mtalhajamil commented on issue #9140: [BEAM-6855] Side 
inputs are not supported when using the state API
URL: https://github.com/apache/beam/pull/9140#issuecomment-522860960
 
 
   @reuvenlax Per the discussion with Ken on Jira 
https://issues.apache.org/jira/browse/BEAM-6855?focusedCommentId=16889000&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16889000
 and the committers' comments above, it looks like the task of this Jira issue 
was to add these tests.  
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 297632)
Time Spent: 3.5h  (was: 3h 20m)

> Side inputs are not supported when using the state API
> --
>
> Key: BEAM-6855
> URL: https://issues.apache.org/jira/browse/BEAM-6855
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-dataflow, runner-direct
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=297630&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297630
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:34
Start Date: 20/Aug/19 05:34
Worklog Time Spent: 10m 
  Work Description: Ardagan commented on issue #9330: [BEAM-7969] Report 
FnAPI counters as deltas in streaming jobs.
URL: https://github.com/apache/beam/pull/9330#issuecomment-522860447
 
 
   > @Ardagan do you know what the test failures are?
   
   It seems to be a library import failure. It works on my machine. Looking 
for a fix.
 



Issue Time Tracking
---

Worklog Id: (was: 297630)
Time Spent: 2.5h  (was: 2h 20m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> EOM





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297623
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:29
Start Date: 20/Aug/19 05:29
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315511451
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,16 +84,10 @@ def __init__(self):
   @staticmethod
   def default_docker_image():
     if 'USER' in os.environ:
-      if sys.version_info[0] == 2:
-        version_suffix = ''
-      elif sys.version_info[0:2] == (3, 5):
-        version_suffix = '3'
-      else:
-        version_suffix = '3'
-        # TODO(BEAM-7474): Use an image which has correct Python minor version.
-        logging.warning('Make sure that locally built Python SDK docker image '
-                        'has Python %d.%d interpreter. See also: BEAM-7474.' % (
-                            sys.version_info[0], sys.version_info[1]))
+      version_suffix = ''.join([str(i) for i in sys.version_info[0:2]])
 
 Review comment:
   +1 with Ahmet for not skipping these tests. But failing a test 
because the user didn't build the images beforehand does not make sense to me. 
Could we add warnings in the tests if the default images are missing? 
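
   For reference, the one-line replacement in the diff above derives the image 
suffix from the interpreter's major and minor version. A minimal sketch of that 
computation follows; the helper name `version_suffix` and its optional 
`version_info` argument are illustrative assumptions, not part of the pull 
request.

```python
import sys


def version_suffix(version_info=None):
  """Return the major and minor digits of a Python version, joined.

  Hypothetical helper for illustration only; the pull request inlines the
  same expression inside default_docker_image().
  """
  if version_info is None:
    version_info = sys.version_info
  return ''.join(str(i) for i in version_info[0:2])


print(version_suffix((2, 7, 15)))  # 27
print(version_suffix((3, 5, 2)))   # 35
```

   With this scheme every interpreter version maps to its own suffix, so the 
old fallback that used '3' for all Python 3 interpreters is no longer needed.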
 



Issue Time Tracking
---

Worklog Id: (was: 297623)
Time Spent: 3h 50m  (was: 3h 40m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297626
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:29
Start Date: 20/Aug/19 05:29
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315507170
 
 

 ##
 File path: sdks/python/test-suites/portable/py2/build.gradle
 ##
 @@ -29,7 +29,7 @@ addPortableWordCountTasks()
 
 task preCommitPy2() {
   dependsOn ':runners:flink:1.5:job-server-container:docker'
-  dependsOn ':sdks:python:container:docker'
+  dependsOn ':sdks:python:container:buildDocker'
 
 Review comment:
   Could we depend only on ':sdks:python:container:py2:docker' rather than 
building all images? Same below.
 



Issue Time Tracking
---

Worklog Id: (was: 297626)
Time Spent: 3h 50m  (was: 3h 40m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297625
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:29
Start Date: 20/Aug/19 05:29
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315506168
 
 

 ##
 File path: sdks/python/container/build.gradle
 ##
 @@ -60,15 +51,13 @@ golang {
   }
 }
 
-docker {
-  name containerImageName(name: "python")
-  files "./build"
+task buildDocker {
 
 Review comment:
   We talked offline and came up with this approach to build images for all 
Python versions in a single task. `build` is good to me, but I am not sure if 
that is a keyword in the Gradle DSL. Alternatively, we could call it `buildAll`.
 



Issue Time Tracking
---

Worklog Id: (was: 297625)
Time Spent: 3h 50m  (was: 3h 40m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297624
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 05:29
Start Date: 20/Aug/19 05:29
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315507918
 
 

 ##
 File path: sdks/python/container/py35/build.gradle
 ##
 @@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+plugins {
+  id 'base'
+  id 'org.apache.beam.module'
+}
+applyDockerNature()
+
+description = "Apache Beam :: SDKs :: Python :: Container :: Python 35 Container"
+
+configurations {
+  sdkSourceTarball
+  sdkHarnessLauncher
+}
+
+dependencies {
+  sdkSourceTarball project(path: ":sdks:python", configuration: "distTarBall")
+  sdkHarnessLauncher project(path: ":sdks:python:container", configuration: "sdkHarnessLauncher")
+}
+
+task copyDockerfileDependencies(type: Copy) {
 
 Review comment:
   Do we have to write separate `copyDockerfileDependencies` and 
`copyDockerfileDependencies` in py2|35|36|37? Could we try to extract them into 
the parent build.gradle to make future maintenance easier?
 



Issue Time Tracking
---

Worklog Id: (was: 297624)
Time Spent: 3h 50m  (was: 3h 40m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7986) Increase minimum grpcio required version

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7986?focusedWorklogId=297541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297541
 ]

ASF GitHub Bot logged work on BEAM-7986:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:43
Start Date: 20/Aug/19 01:43
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #9356: [BEAM-7986] 
Upgrade grpcio
URL: https://github.com/apache/beam/pull/9356
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 297541)
Time Spent: 1h  (was: 50m)

> Increase minimum grpcio required version
> 
>
> Key: BEAM-7986
> URL: https://issues.apache.org/jira/browse/BEAM-7986
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> According to this question, 1.11.0 is not new enough (1.22.0 reportedly 
> works), and we list the minimum as 1.8.
> https://stackoverflow.com/questions/57479498/beam-channel-object-has-no-attribute-close?noredirect=1#comment101446049_57479498
> Affects DirectRunner Pub/Sub client.
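
The version ordering described in the issue can be checked with a small 
sketch. It compares dotted release strings as tuples of integers, using only 
the version numbers quoted above; the helper names `parse_version` and 
`meets_minimum` are hypothetical, and the parsing assumes plain numeric 
releases (no pre-release or local suffixes).

```python
def parse_version(version):
  """Convert a dotted release string such as '1.11.0' to a tuple of ints."""
  return tuple(int(part) for part in version.split('.'))


def meets_minimum(installed, minimum):
  """Return True if the installed release is at least the minimum."""
  return parse_version(installed) >= parse_version(minimum)


# Version numbers taken from the issue text.
print(meets_minimum('1.11.0', '1.8'))     # True: passes the listed minimum
print(meets_minimum('1.11.0', '1.22.0'))  # False
print('1.11.0' >= '1.8')                  # False: plain strings misorder
```

The last line shows why the comparison is done on integer tuples: as plain 
strings, '1.11.0' sorts before '1.8'.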





[jira] [Work logged] (BEAM-7986) Increase minimum grpcio required version

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7986?focusedWorklogId=297535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297535
 ]

ASF GitHub Bot logged work on BEAM-7986:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:32
Start Date: 20/Aug/19 01:32
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9356: [BEAM-7986] Upgrade 
grpcio
URL: https://github.com/apache/beam/pull/9356#issuecomment-522815851
 
 
   I am not aware of an existing test. We could test it manually, perhaps, but 
maybe that is not needed at all. Based on your comment, protobuf should be a 
non-issue, and tensorflow 1.14 was released much later.
   
   LGTM. Feel free to self merge when ready.
 



Issue Time Tracking
---

Worklog Id: (was: 297535)
Time Spent: 50m  (was: 40m)

> Increase minimum grpcio required version
> 
>
> Key: BEAM-7986
> URL: https://issues.apache.org/jira/browse/BEAM-7986
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> According to this question, 1.11.0 is not new enough (1.22.0 reportedly 
> works), and we list the minimum as 1.8.
> https://stackoverflow.com/questions/57479498/beam-channel-object-has-no-attribute-close?noredirect=1#comment101446049_57479498
> Affects DirectRunner Pub/Sub client.





[jira] [Work logged] (BEAM-7969) Streaming Dataflow worker doesn't report FnAPI metrics.

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7969?focusedWorklogId=297533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297533
 ]

ASF GitHub Bot logged work on BEAM-7969:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:28
Start Date: 20/Aug/19 01:28
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9330: [BEAM-7969] Report 
FnAPI counters as deltas in streaming jobs.
URL: https://github.com/apache/beam/pull/9330#issuecomment-522815031
 
 
   @Ardagan do you know what the test failures are?
 



Issue Time Tracking
---

Worklog Id: (was: 297533)
Time Spent: 2h 20m  (was: 2h 10m)

> Streaming Dataflow worker doesn't report FnAPI metrics.
> ---
>
> Key: BEAM-7969
> URL: https://issues.apache.org/jira/browse/BEAM-7969
> Project: Beam
>  Issue Type: Bug
>  Components: java-fn-execution, runner-dataflow
>Reporter: Mikhail Gryzykhin
>Assignee: Mikhail Gryzykhin
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> EOM





[jira] [Created] (BEAM-8009) Pipeline context manager calls wait_until_finish on streaming pipelines

2019-08-19 Thread Udi Meiri (Jira)
Udi Meiri created BEAM-8009:
---

 Summary: Pipeline context manager calls wait_until_finish on 
streaming pipelines
 Key: BEAM-8009
 URL: https://issues.apache.org/jira/browse/BEAM-8009
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Udi Meiri


Perhaps there should be a warning logged saying: "Waiting until streaming 
pipeline finishes. If this is not what you intended, call run() on the pipeline 
instead."
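
A minimal sketch of the suggested warning, under stated assumptions: 
`SketchPipeline` is a toy stand-in written for this illustration and is not 
Beam's Pipeline class, and its `streaming` flag is a hypothetical substitute 
for however the SDK determines that a pipeline is streaming.

```python
import logging


class SketchPipeline(object):
  """Toy stand-in for a pipeline object; not Beam's Pipeline class."""

  def __init__(self, streaming=False):
    self.streaming = streaming
    self.waited = False

  def run(self):
    # Returns the object to wait on; the toy reuses itself as the result.
    return self

  def wait_until_finish(self):
    self.waited = True

  def __enter__(self):
    return self

  def __exit__(self, exc_type, exc_val, exc_tb):
    if exc_type is not None:
      return False  # Propagate errors raised inside the with-block.
    if self.streaming:
      logging.warning(
          'Waiting until streaming pipeline finishes. If this is not what '
          'you intended, call run() on the pipeline instead.')
    self.run().wait_until_finish()
    return False


with SketchPipeline(streaming=True) as p:
  pass  # Leaving the block logs the warning, then waits.
```

The warning is emitted before the blocking call so that it is visible while 
the process is still waiting.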





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297531&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297531
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:19
Start Date: 20/Aug/19 01:19
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315472687
 
 

 ##
 File path: sdks/python/container/build.gradle
 ##
 @@ -60,15 +51,13 @@ golang {
   }
 }
 
-docker {
-  name containerImageName(name: "python")
-  files "./build"
+task buildDocker {
 
 Review comment:
   This change sounds good. 
   
   We could also rename it to just `build`. It would make sense to call it 
like `./gradlew -p sdks/python/container build`
   
   /cc @yifanzou on gradle related questions.
 



Issue Time Tracking
---

Worklog Id: (was: 297531)
Time Spent: 3h 40m  (was: 3.5h)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297529
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:19
Start Date: 20/Aug/19 01:19
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315472495
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,16 +84,10 @@ def __init__(self):
   @staticmethod
   def default_docker_image():
     if 'USER' in os.environ:
-      if sys.version_info[0] == 2:
-        version_suffix = ''
-      elif sys.version_info[0:2] == (3, 5):
-        version_suffix = '3'
-      else:
-        version_suffix = '3'
-        # TODO(BEAM-7474): Use an image which has correct Python minor version.
-        logging.warning('Make sure that locally built Python SDK docker image '
-                        'has Python %d.%d interpreter. See also: BEAM-7474.' % (
-                            sys.version_info[0], sys.version_info[1]))
+      version_suffix = ''.join([str(i) for i in sys.version_info[0:2]])
 
 Review comment:
   Could you chat with @alanmyrvold or @yifanzou on what would be the best 
testing practices here?
   
   I think we should not skip tests if possible.
 



Issue Time Tracking
---

Worklog Id: (was: 297529)
Time Spent: 3.5h  (was: 3h 20m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297530
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:19
Start Date: 20/Aug/19 01:19
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315472846
 
 

 ##
 File path: sdks/python/setup.py
 ##
 @@ -125,15 +125,15 @@ def get_version():
 'pytz>=2018.3',
 # [BEAM-5628] Beam VCF IO is not supported in Python 3.
 'pyvcf>=0.6.8,<0.7.0; python_version < "3.0"',
-'pyyaml>=3.12,<4.0.0',
+'pyyaml>=3.13,<4.0.0',
 
 Review comment:
   Why do we need this change?
 



Issue Time Tracking
---

Worklog Id: (was: 297530)
Time Spent: 3.5h  (was: 3h 20m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7804) Fix unclear python programming guide document

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7804?focusedWorklogId=297527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297527
 ]

ASF GitHub Bot logged work on BEAM-7804:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:13
Start Date: 20/Aug/19 01:13
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9379: [BEAM-7804] Update 
python sdk transform programming guide
URL: https://github.com/apache/beam/pull/9379#issuecomment-522812095
 
 
   Awesome. LGTM.
   
   @rosetn for documentation review. Please ping me after that review to merge.
 



Issue Time Tracking
---

Worklog Id: (was: 297527)
Time Spent: 0.5h  (was: 20m)

> Fix unclear python programming guide document
> -
>
> Key: BEAM-7804
> URL: https://issues.apache.org/jira/browse/BEAM-7804
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [https://beam.apache.org/documentation/programming-guide/#additional-outputs],
>  section 4.5
> last two code snippets don't provide enough context when switching language 
> from java to python





[jira] [Work logged] (BEAM-7819) PubsubMessage message parsing is lacking non-attribute fields

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7819?focusedWorklogId=297525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297525
 ]

ASF GitHub Bot logged work on BEAM-7819:


Author: ASF GitHub Bot
Created on: 20/Aug/19 01:03
Start Date: 20/Aug/19 01:03
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9232: [BEAM-7819] Python - 
parse PubSub message_id into attributes property
URL: https://github.com/apache/beam/pull/9232#issuecomment-522810369
 
 
   @matt-darwin - Re: isort - Sure, it is fine to exclude it from isort. Thank 
you for the explanation.
 



Issue Time Tracking
---

Worklog Id: (was: 297525)
Time Spent: 7h 50m  (was: 7h 40m)

> PubsubMessage message parsing is lacking non-attribute fields
> -
>
> Key: BEAM-7819
> URL: https://issues.apache.org/jira/browse/BEAM-7819
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-gcp
>Reporter: Ahmet Altay
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> User reported issue: 
> https://lists.apache.org/thread.html/139b0c15abc6471a2e2202d76d915c645a529a23ecc32cd9cfecd315@%3Cuser.beam.apache.org%3E
> """
> Looking at the source code, with my untrained python eyes, I think if the 
> intention is to include the message id and the publish time in the attributes 
> attribute of the PubSubMessage type, then the protobuf mapping is missing 
> something:-
> @staticmethod
> def _from_proto_str(proto_msg):
>   """Construct from serialized form of ``PubsubMessage``.
>   Args:
>     proto_msg: String containing a serialized protobuf of type
>     https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#google.pubsub.v1.PubsubMessage
>   Returns:
>     A new PubsubMessage object.
>   """
>   msg = pubsub.types.pubsub_pb2.PubsubMessage()
>   msg.ParseFromString(proto_msg)
>   # Convert ScalarMapContainer to dict.
>   attributes = dict((key, msg.attributes[key]) for key in msg.attributes)
>   return PubsubMessage(msg.data, attributes)
> The protobuf definition is here:-
> https://cloud.google.com/pubsub/docs/reference/rpc/google.pubsub.v1#google.pubsub.v1.PubsubMessage
> and so it looks as if the message_id and publish_time are not being parsed as 
> they are separate from the attributes. Perhaps the PubsubMessage class needs 
> expanding to include these as attributes, or they would need adding to the 
> dictionary for attributes. This would only need doing for the _from_proto_str 
> as obviously they would not need to be populated when transmitting a message 
> to PubSub.
> My python is not great, I'm assuming the latter option would need to look 
> something like this?
>   attributes = dict((key, msg.attributes[key]) for key in msg.attributes)
>   attributes.update({'message_id': msg.message_id, 'publish_time': msg.publish_time})
>   return PubsubMessage(msg.data, attributes)
> """
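
The second option proposed in the report (adding the two fields to the 
attributes dictionary) can be sketched in isolation as follows. `FakeProto` 
and `SketchMessage` are hypothetical stand-ins for the parsed protobuf and for 
Beam's PubsubMessage, introduced only so the example runs without the Pub/Sub 
client library; the field values are made up.

```python
import collections

# Hypothetical stand-ins; not the real protobuf or Beam classes.
FakeProto = collections.namedtuple(
    'FakeProto', ['data', 'attributes', 'message_id', 'publish_time'])
SketchMessage = collections.namedtuple('SketchMessage', ['data', 'attributes'])


def from_parsed_proto(msg):
  """Build a message whose attributes also carry the id and publish time."""
  # Copy first so the parsed message's own mapping is left untouched.
  attributes = dict(msg.attributes)
  attributes.update({
      'message_id': msg.message_id,
      'publish_time': msg.publish_time,
  })
  return SketchMessage(msg.data, attributes)


parsed = FakeProto(b'payload', {'k': 'v'}, '42', '2019-08-19T00:00:00Z')
print(from_parsed_proto(parsed).attributes)
```

One trade-off of this option is that a user-supplied attribute named 
`message_id` or `publish_time` would be overwritten, which is an argument for 
the other option of adding dedicated fields to the class.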





[jira] [Work logged] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?focusedWorklogId=297514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297514
 ]

ASF GitHub Bot logged work on BEAM-7993:


Author: ASF GitHub Bot
Created on: 20/Aug/19 00:40
Start Date: 20/Aug/19 00:40
Worklog Time Spent: 10m 
  Work Description: ibzib commented on pull request #9380: [BEAM-7993] wait 
longer for docker container startup
URL: https://github.com/apache/beam/pull/9380
 
 
   Previous previous behavior: wait 2 minutes x infinite retries for container 
to start up
   
   Previous behavior: wait 1 minute for docker container to start up, then 
immediately throw an exception, killing the job
   
   New behavior: wait 2 minutes x 5 retries for container to start up before 
failing.
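
   The bounded-retry behavior described above can be sketched as follows. This 
is an illustration in Python, not the code changed by the pull request; the 
function name `wait_for_startup`, the `is_ready` callback, and the injectable 
`clock` and `sleep` parameters are all assumptions made for the example.

```python
import time


def wait_for_startup(is_ready, timeout_secs=120, retries=5, poll_secs=1,
                     clock=time.time, sleep=time.sleep):
  """Poll is_ready() for up to `retries` windows of `timeout_secs` each.

  Returns the 1-based attempt number on success and raises RuntimeError
  once every attempt has timed out.
  """
  for attempt in range(1, retries + 1):
    deadline = clock() + timeout_secs
    while clock() < deadline:
      if is_ready():
        return attempt
      sleep(poll_secs)
  raise RuntimeError(
      'Container did not start after %d attempts of %d seconds each'
      % (retries, timeout_secs))
```

   With the defaults shown (2 minutes, 5 attempts) a container that never 
becomes ready fails the job after roughly 10 minutes, rather than after 
1 minute or never.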
   
   R: @angoenka 
   
   
   

[jira] [Work logged] (BEAM-7804) Fix unclear python programming guide document

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7804?focusedWorklogId=297510=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297510
 ]

ASF GitHub Bot logged work on BEAM-7804:


Author: ASF GitHub Bot
Created on: 20/Aug/19 00:28
Start Date: 20/Aug/19 00:28
Worklog Time Spent: 10m 
  Work Description: y1chi commented on issue #9379: [BEAM-7804] Update 
python sdk transform programming guide.
URL: https://github.com/apache/beam/pull/9379#issuecomment-522804142
 
 
   R: @aaltay 
 



Issue Time Tracking
---

Worklog Id: (was: 297510)
Time Spent: 20m  (was: 10m)

> Fix unclear python programming guide document
> -
>
> Key: BEAM-7804
> URL: https://issues.apache.org/jira/browse/BEAM-7804
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Yichi Zhang
>Assignee: Yichi Zhang
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In section 4.5 of the programming guide
> ([https://beam.apache.org/documentation/programming-guide/#additional-outputs]),
> the last two code snippets don't provide enough context when switching the 
> language from Java to Python.





[jira] [Work logged] (BEAM-7804) Fix unclear python programming guide document

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7804?focusedWorklogId=297509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297509
 ]

ASF GitHub Bot logged work on BEAM-7804:


Author: ASF GitHub Bot
Created on: 20/Aug/19 00:27
Start Date: 20/Aug/19 00:27
Worklog Time Spent: 10m 
  Work Description: y1chi commented on pull request #9379: [BEAM-7804] 
Update python sdk transform programming guide.
URL: https://github.com/apache/beam/pull/9379
 
 
   **Please** add a meaningful description for your change here
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297503=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297503
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 19/Aug/19 23:58
Start Date: 19/Aug/19 23:58
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#issuecomment-522798308
 
 
   Run Python Dataflow ValidatesContainer
 



Issue Time Tracking
---

Worklog Id: (was: 297503)
Time Spent: 3h 20m  (was: 3h 10m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-7790) Make debugging subprocess workers easier

2019-08-19 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver updated BEAM-7790:
--
Description: 
[ ] The output of the SDK workers is currently invisible due to the output and 
logging setup.

[ ] The dockerized version of the Python SDK worker sets up an HTTP server to 
let the user view stack traces for all of the worker's threads [1]. It would be 
useful if this was available for other execution modes as well.

[x] BEAM-7676 Make the above items more usable with multiple subprocesses by 
identifying them with worker ids.

 

[1] 
[https://github.com/apache/beam/blob/9f4ce1c6fc2fb195e218783a6e9ce6104ddb4d1e/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L46-L89]

  was:
1. The output of the SDK workers is currently invisible due to the output and 
logging setup.

2. The dockerized version of the Python SDK worker sets up an HTTP server to 
let the user view stack traces for all of the worker's threads [1]. It would be 
useful if this was available for other execution modes as well.

3. Make the above items more usable with multiple subprocesses by identifying 
them with worker ids.

 

[1] 
[https://github.com/apache/beam/blob/9f4ce1c6fc2fb195e218783a6e9ce6104ddb4d1e/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L46-L89]


> Make debugging subprocess workers easier
> 
>
> Key: BEAM-7790
> URL: https://issues.apache.org/jira/browse/BEAM-7790
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
>
> [ ] The output of the SDK workers is currently invisible due to the output 
> and logging setup.
> [ ] The dockerized version of the Python SDK worker sets up an HTTP server to 
> let the user view stack traces for all of the worker's threads [1]. It would 
> be useful if this was available for other execution modes as well.
> [x] BEAM-7676 Make the above items more usable with multiple subprocesses by 
> identifying them with worker ids.
>  
> [1] 
> [https://github.com/apache/beam/blob/9f4ce1c6fc2fb195e218783a6e9ce6104ddb4d1e/sdks/python/apache_beam/runners/worker/sdk_worker_main.py#L46-L89]
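The second item asks for a way to view the stack traces of all worker threads outside the dockerized mode. As a rough illustration only, the sketch below collects such a dump using the standard library; the function name `format_all_thread_stacks` is hypothetical and is not taken from `sdk_worker_main.py`. How the text would be exposed (HTTP server, log line, signal handler) is left open here.

```python
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text dump of the current stack of every live thread.

    Hypothetical helper for illustration; not the Beam implementation.
    """
    # Map thread ids to names so each section of the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append('--- Thread %s (%s) ---'
                     % (thread_id, names.get(thread_id, 'unknown')))
        lines.extend(
            line.rstrip('\n') for line in traceback.format_stack(frame))
    return '\n'.join(lines)


if __name__ == '__main__':
    print(format_all_thread_stacks())
```

A subprocess worker could call a helper like this from whatever debugging hook it already has, so the dump does not depend on the Docker setup.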





[jira] [Resolved] (BEAM-7676) All SDK workers have worker_id="1"

2019-08-19 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-7676.
---
Fix Version/s: 2.16.0
   Resolution: Fixed

> All SDK workers have worker_id="1"
> --
>
> Key: BEAM-7676
> URL: https://issues.apache.org/jira/browse/BEAM-7676
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> SDK workers are created using multiple job factories, which all give their 
> initial workers id 1 [1]. We could perhaps identify sdk workers also by the 
> factory that created them, for example worker_id=$FACTORY-$WORKER (e.g. 
> worker_id="1-1", "1-2"..."2-1"...)
>  
> [1] 
> [https://github.com/apache/beam/blob/89b08e133be5a2c6bcdbd36242f16ef7ab796902/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/DefaultJobBundleFactory.java#L115]
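The proposed `worker_id=$FACTORY-$WORKER` scheme can be sketched as follows. This is a minimal Python illustration of the naming idea only; the actual change lives in the Java `DefaultJobBundleFactory`, and the class name `WorkerIdGenerator` is an assumption made up for this example.

```python
import itertools


class WorkerIdGenerator(object):
    """Issues ids of the form '<factory>-<worker>' (hypothetical helper)."""

    def __init__(self, factory_id):
        self._factory_id = factory_id
        # Each factory keeps its own counter, starting at 1 as before.
        self._counter = itertools.count(1)

    def next_id(self):
        # Prefixing with the factory id keeps ids unique across factories.
        return '%s-%s' % (self._factory_id, next(self._counter))


if __name__ == '__main__':
    first, second = WorkerIdGenerator(1), WorkerIdGenerator(2)
    print(first.next_id(), first.next_id(), second.next_id())
```

The sketch is not hardened for concurrent use; a real factory would need to decide how ids are issued when several threads request workers at once.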





[jira] [Resolved] (BEAM-7670) Flink portable worker gets stuck if one of the task does not get any data

2019-08-19 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-7670.
---
Fix Version/s: 2.16.0
   Resolution: Fixed

> Flink portable worker gets stuck if one of the task does not get any data
> -
>
> Key: BEAM-7670
> URL: https://issues.apache.org/jira/browse/BEAM-7670
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ankur Goenka
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> When using parallelism > 1 with the Flink portable runner, the job gets stuck if 
> the data is partitioned in such a way that one of the tasks does not get any 
> data.





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297500=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297500
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 19/Aug/19 23:53
Start Date: 19/Aug/19 23:53
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#issuecomment-522797346
 
 
   @aaltay, I made some changes. Can you please take a look when you have 
time? Thanks.
 



Issue Time Tracking
---

Worklog Id: (was: 297500)
Time Spent: 3h  (was: 2h 50m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297489=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297489
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 19/Aug/19 23:23
Start Date: 19/Aug/19 23:23
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #9351: 
[BEAM-7909] support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315451829
 
 

 ##
 File path: sdks/python/apache_beam/runners/portability/portable_runner.py
 ##
 @@ -84,16 +84,10 @@ def __init__(self):
   @staticmethod
   def default_docker_image():
     if 'USER' in os.environ:
-      if sys.version_info[0] == 2:
-        version_suffix = ''
-      elif sys.version_info[0:2] == (3, 5):
-        version_suffix = '3'
-      else:
-        version_suffix = '3'
-        # TODO(BEAM-7474): Use an image which has correct Python minor version.
-        logging.warning('Make sure that locally built Python SDK docker image '
-                        'has Python %d.%d interpreter. See also: BEAM-7474.' % (
-                            sys.version_info[0], sys.version_info[1]))
+      version_suffix = ''.join([str(i) for i in sys.version_info[0:2]])
 
 Review comment:
   The `PortableRunnerTestWithLocalDocker` class in `portable_runner_test.py` 
tests this function. However, to run these tests, users must build the 
Docker images first, which I don't think all users need to do. Do we want to 
skip these tests locally? Is there a good practice I can refer to?
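
One possible way to handle the question above is to skip the Docker-dependent tests when no local image is present. The sketch below is only an illustration under assumptions: the helpers `version_suffix` and `local_image_exists`, the image name, and the test class are all hypothetical and are not part of `portable_runner_test.py`. It assumes Python 3 (for `shutil.which`) and relies on the `docker image inspect` command returning a non-zero exit code for a missing image.

```python
import shutil
import subprocess
import sys
import unittest


def version_suffix(version_info=sys.version_info):
    # Same expression as in the diff: major and minor joined, e.g. (3, 5) -> '35'.
    return ''.join(str(i) for i in version_info[0:2])


def local_image_exists(image):
    """Best-effort check that a Docker image is available locally."""
    if shutil.which('docker') is None:
        return False  # Docker CLI is not installed.
    try:
        result = subprocess.run(
            ['docker', 'image', 'inspect', image],
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Hypothetical image name, used only to make the example self-contained.
HYPOTHETICAL_IMAGE = 'example.invalid/python%s:latest' % version_suffix()


@unittest.skipUnless(local_image_exists(HYPOTHETICAL_IMAGE),
                     'requires a locally built SDK Docker image')
class LocalDockerSketchTest(unittest.TestCase):

    def test_suffix_is_major_minor(self):
        self.assertEqual(version_suffix((3, 5, 2)), '35')
```

With this shape, users who have not built the images see the tests reported as skipped instead of failing, while CI jobs that build the images still run them.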
 



Issue Time Tracking
---

Worklog Id: (was: 297489)
Time Spent: 2h 40m  (was: 2.5h)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297487=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297487
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 19/Aug/19 23:19
Start Date: 19/Aug/19 23:19
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on pull request #9351: 
[BEAM-7909] support customized container for Python
URL: https://github.com/apache/beam/pull/9351#discussion_r315451101
 
 

 ##
 File path: sdks/python/container/base_image_requirements.txt
 ##
 @@ -67,7 +67,7 @@ pandas==0.23.4
 protorpc==0.11.1
 python-gflags==3.0.6
 setuptools<=39.1.0 # requirement for Tensorflow.
-tensorflow==1.11.0
+tensorflow==1.13.1
 
 Review comment:
   I compared these pins with the libraries installed by default on Dataflow 
workers and matched the versions to those on the workers. For libraries whose 
versions fall outside the range in `setup.py`, I updated the version range accordingly.
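
The check described above, whether a pinned version falls inside a declared range, can be sketched as below. The pin `1.13.1` comes from the diff; the range bounds used in the usage lines are made-up values for illustration and are not the ones in Beam's `setup.py`. The helper names are hypothetical, and the parser only handles plain dotted numeric versions.

```python
def parse_version(text):
    # '1.13.1' -> (1, 13, 1); only plain numeric versions are supported.
    return tuple(int(part) for part in text.split('.'))


def pin_in_range(pinned, minimum, upper_bound):
    """Return True if minimum <= pinned < upper_bound."""
    return (parse_version(minimum)
            <= parse_version(pinned)
            < parse_version(upper_bound))


if __name__ == '__main__':
    # Hypothetical ranges, chosen only to show both outcomes.
    print(pin_in_range('1.13.1', '1.11.0', '2.0.0'))
    print(pin_in_range('1.13.1', '1.11.0', '1.12.0'))
```

A pin that fails such a check is the case where the range in `setup.py` would need widening.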
 



Issue Time Tracking
---

Worklog Id: (was: 297487)
Time Spent: 2.5h  (was: 2h 20m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Kyle Weaver (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910806#comment-16910806
 ] 

Kyle Weaver commented on BEAM-7993:
---

It looks like we are only waiting one minute for the Dockerized SDK workers to 
start:

[https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerEnvironmentFactory.java#L157-L160]

We should probably wait longer for the Docker containers to start before giving 
up.
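
The idea of waiting longer before giving up can be illustrated with a small polling loop whose timeout is a parameter. This is a Python sketch of the general pattern only, not the Java code in `DockerEnvironmentFactory`; the function name `wait_until` and its parameters are assumptions for this example.

```python
import time


def wait_until(predicate, timeout_secs, poll_interval_secs=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True on success and False if the deadline passed first.
    """
    deadline = clock() + timeout_secs
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_secs)


if __name__ == '__main__':
    # Hypothetical readiness check that succeeds immediately.
    print(wait_until(lambda: True, timeout_secs=120))
```

Making the timeout an argument means a slow CI machine can be given more time without changing the polling logic itself.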

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.15.0
>
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> 

[jira] [Updated] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Mark Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-7993:
---
Affects Version/s: 2.15.0

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.15.0
>
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22  ... 3 more
> {code}
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5512/consoleFull





[jira] [Assigned] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Kyle Weaver (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver reassigned BEAM-7993:
-

Assignee: Kyle Weaver

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Reporter: Udi Meiri
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.15.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-8008) show error message from expansion service in Java External transform

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8008?focusedWorklogId=297472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297472
 ]

ASF GitHub Bot logged work on BEAM-8008:


Author: ASF GitHub Bot
Created on: 19/Aug/19 22:23
Start Date: 19/Aug/19 22:23
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #9377: [BEAM-8008] show error 
message from expansion service in Java External transform
URL: https://github.com/apache/beam/pull/9377#issuecomment-522776767
 
 
   R: @chamikaramj 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 297472)
Time Spent: 20m  (was: 10m)

> show error message from expansion service in Java External transform
> 
>
> Key: BEAM-8008
> URL: https://issues.apache.org/jira/browse/BEAM-8008
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> show error message from expansion service in Java External transform





[jira] [Work logged] (BEAM-8008) show error message from expansion service in Java External transform

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8008?focusedWorklogId=297471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297471
 ]

ASF GitHub Bot logged work on BEAM-8008:


Author: ASF GitHub Bot
Created on: 19/Aug/19 22:22
Start Date: 19/Aug/19 22:22
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #9377: [BEAM-8008] show 
error message from expansion service in Java External transform
URL: https://github.com/apache/beam/pull/9377
 
 
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Created] (BEAM-8008) show error message from expansion service in Java External transform

2019-08-19 Thread Heejong Lee (Jira)
Heejong Lee created BEAM-8008:
-

 Summary: show error message from expansion service in Java 
External transform
 Key: BEAM-8008
 URL: https://issues.apache.org/jira/browse/BEAM-8008
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Heejong Lee
Assignee: Heejong Lee


show error message from expansion service in Java External transform





[jira] [Assigned] (BEAM-8007) Update Python dependencies page for 2.15.0

2019-08-19 Thread Cyrus Maden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cyrus Maden reassigned BEAM-8007:
-

Assignee: Cyrus Maden

> Update Python dependencies page for 2.15.0
> --
>
> Key: BEAM-8007
> URL: https://issues.apache.org/jira/browse/BEAM-8007
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: Cyrus Maden
>Priority: Minor
>
> Update Python dependencies page for 2.15.0
> [https://beam.apache.org/documentation/sdks/python-dependencies/]





[jira] [Created] (BEAM-8007) Update Python dependencies page for 2.15.0

2019-08-19 Thread Rose Nguyen (Jira)
Rose Nguyen created BEAM-8007:
-

 Summary: Update Python dependencies page for 2.15.0
 Key: BEAM-8007
 URL: https://issues.apache.org/jira/browse/BEAM-8007
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Rose Nguyen


Update Python dependencies page for 2.15.0

[https://beam.apache.org/documentation/sdks/python-dependencies/]





[jira] [Comment Edited] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910749#comment-16910749
 ] 

Mark Liu edited comment on BEAM-7993 at 8/19/19 8:57 PM:
-

The same issue has made 
[beam_PreCommit_Portable_Python_Cron|https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/]
 very flaky recently. It failed only the py35 batch and streaming wordcount 
tests; the py2 tests passed as expected.

The most recent failure: 
https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/1044/


was (Author: markflyhigh):
Same issue made 
[beam_PreCommit_Portable_Python_Cron|https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/]
 very flaky recently. 

A most recent failure: 
https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/1044/

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Reporter: Udi Meiri
>Priority: Major
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> 

[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910749#comment-16910749
 ] 

Mark Liu commented on BEAM-7993:


The same issue has made 
[beam_PreCommit_Portable_Python_Cron|https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/]
 very flaky recently. 

The most recent failure: 
https://builds.apache.org/view/A-D/view/Beam/view/PostCommit/job/beam_PreCommit_Portable_Python_Cron/1044/

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Reporter: Udi Meiri
>Priority: Major
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22  ... 3 more
> {code}
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5512/consoleFull




[jira] [Closed] (BEAM-7940) beam_Release_Python_NightlySnapshot is broken due to directory not exist

2019-08-19 Thread Mark Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-7940.
--

> beam_Release_Python_NightlySnapshot is broken due to directory not exist
> 
>
> Key: BEAM-7940
> URL: https://issues.apache.org/jira/browse/BEAM-7940
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, test-failures
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> :sdks:python:depSnapshot broke beam_Release_Python_NightlySnapshot due to:
> {code}
> sh: 1: cannot create 
> /home/jenkins/jenkins-slave/workspace/beam_Release_Python_NightlySnapshot/src/sdks/python/build/requirements.txt:
>  Directory nonexistent
> {code}
> This is affected by https://github.com/apache/beam/pull/9277. The directory 
> `sdks/python/build` no longer exists when the task writes to 
> sdks/python/build/requirements.txt. We can create an empty file first to fix 
> this problem and, at the same time, clean up the old file if it exists.
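
The idea in the description (make sure the output location exists and start from an empty file) can be sketched in Python as below. This is only an illustration under stated assumptions, not the change that was merged for BEAM-7940: the helper name `reset_output_file` is hypothetical, and the real task is a Gradle/shell step rather than Python.

```python
import os
import tempfile


def reset_output_file(path):
    """Create an empty file at `path`, creating missing parent directories.

    Hypothetical helper for illustration; it is not part of Beam.
    """
    parent = os.path.dirname(path)
    if parent:
        # Avoids the "Directory nonexistent" failure when the build dir is gone.
        os.makedirs(parent, exist_ok=True)
    # Opening with "w" creates the file, or truncates a stale one from an
    # earlier run, so the cleanup happens in the same step.
    with open(path, "w"):
        pass
    return path


if __name__ == "__main__":
    # Use a throwaway workspace instead of a real Jenkins checkout.
    workspace = tempfile.mkdtemp()
    target = os.path.join(workspace, "sdks", "python", "build", "requirements.txt")
    reset_output_file(target)
    print(os.path.exists(target))
```

After this runs, a later step that redirects output into the file no longer depends on the build directory having been created by an earlier task.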





[jira] [Resolved] (BEAM-7940) beam_Release_Python_NightlySnapshot is broken due to directory not exist

2019-08-19 Thread Mark Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-7940.

Resolution: Fixed

> beam_Release_Python_NightlySnapshot is broken due to directory not exist
> 
>
> Key: BEAM-7940
> URL: https://issues.apache.org/jira/browse/BEAM-7940
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, test-failures
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> :sdks:python:depSnapshot broke beam_Release_Python_NightlySnapshot due to:
> {code}
> sh: 1: cannot create 
> /home/jenkins/jenkins-slave/workspace/beam_Release_Python_NightlySnapshot/src/sdks/python/build/requirements.txt:
>  Directory nonexistent
> {code}
> This is affected by https://github.com/apache/beam/pull/9277. The directory 
> `sdks/python/build` no longer exists when the task writes to 
> sdks/python/build/requirements.txt. We can create an empty file first to fix 
> this problem and, at the same time, clean up the old file if it exists.





[jira] [Commented] (BEAM-3763) Add per-transform documentation to the website

2019-08-19 Thread Rose Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910743#comment-16910743
 ] 

Rose Nguyen commented on BEAM-3763:
---

Yes, thank you!

> Add per-transform documentation to the website
> --
>
> Key: BEAM-3763
> URL: https://issues.apache.org/jira/browse/BEAM-3763
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Priority: Minor
>  Labels: easyfix, reference
> Fix For: Not applicable
>
>
> Add structure to the website to incrementally document per-transform 
> definitions and examples. The idea is to incrementally populate this section 
> and clean up stale javadoc entries which have unworkable / outdated examples.
>  
> This task tracks creating the right structure for the website. Each transform 
> cleanup/documentation will come with its own JIRA, to facilitate other 
> members of the community to pick up outstanding work.





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=297404=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297404
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 19/Aug/19 20:37
Start Date: 19/Aug/19 20:37
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9374: [BEAM-5428] Implement 
Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374#issuecomment-522743857
 
 
   Requesting a review from @tweise but also feel free to review @rakeshcusat.
 



Issue Time Tracking
---

Worklog Id: (was: 297404)
Time Spent: 20m  (was: 10m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Commented] (BEAM-3763) Add per-transform documentation to the website

2019-08-19 Thread Pablo Estrada (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910740#comment-16910740
 ] 

Pablo Estrada commented on BEAM-3763:
-

If you already have JIRAs for this, then we can just mark this one as a duplicate. WDYT?

> Add per-transform documentation to the website
> --
>
> Key: BEAM-3763
> URL: https://issues.apache.org/jira/browse/BEAM-3763
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Priority: Minor
>  Labels: easyfix, reference
>
> Add structure to the website to incrementally document per-transform 
> definitions and examples. The idea is to incrementally populate this section 
> and clean up stale javadoc entries which have unworkable / outdated examples.
>  
> This task tracks creating the right structure for the website. Each transform 
> cleanup/documentation will come with its own JIRA, to facilitate other 
> members of the community to pick up outstanding work.





[jira] [Resolved] (BEAM-7700) Java transform catalog

2019-08-19 Thread Rose Nguyen (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rose Nguyen resolved BEAM-7700.
---
Fix Version/s: Not applicable
   Resolution: Fixed

> Java transform catalog
> --
>
> Key: BEAM-7700
> URL: https://issues.apache.org/jira/browse/BEAM-7700
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: Rose Nguyen
>Priority: Minor
> Fix For: Not applicable
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Create catalog of core transforms (Java)
> -Java transforms overview
> -Links to Javadocs
> -Brief description
> -Related transforms
> -Links to programming guide
> -Examples section to integrate Colab notebooks
>  
> See BEAM-7464 for Python.





[jira] [Comment Edited] (BEAM-3763) Add per-transform documentation to the website

2019-08-19 Thread Rose Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910733#comment-16910733
 ] 

Rose Nguyen edited comment on BEAM-3763 at 8/19/19 8:31 PM:


Yes; see BEAM-7700, BEAM-7464, BEAM-7702, BEAM-7703, BEAM-7704, BEAM-7705, 
BEAM-7706

I'm not sure how to link it to this Jira besides adding the issue # in the PRs


was (Author: rtnguyen):
Yes; see BEAM-7700, BEAM-7464, BEAM-7702, BEAM-7703, BEAM-7704, BEAM-7705 

I'm not sure how to link it to this Jira besides adding the issue # in the PRs

> Add per-transform documentation to the website
> --
>
> Key: BEAM-3763
> URL: https://issues.apache.org/jira/browse/BEAM-3763
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Priority: Minor
>  Labels: easyfix, reference
>
> Add structure to the website to incrementally document per-transform 
> definitions and examples. The idea is to incrementally populate this section 
> and clean up stale javadoc entries which have unworkable / outdated examples.
>  
> This task tracks creating the right structure for the website. Each transform 
> cleanup/documentation will come with its own JIRA, to facilitate other 
> members of the community to pick up outstanding work.





[jira] [Comment Edited] (BEAM-3763) Add per-transform documentation to the website

2019-08-19 Thread Rose Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910733#comment-16910733
 ] 

Rose Nguyen edited comment on BEAM-3763 at 8/19/19 8:30 PM:


Yes; see BEAM-7700, BEAM-7464, BEAM-7702, BEAM-7703, BEAM-7704, BEAM-7705 

I'm not sure how to link it to this Jira besides adding the issue # in the PRs


was (Author: rtnguyen):
Yes; see BEAM-7700, BEAM-7464,  BEAM-7702-7705

 

I'm not sure how to link it to this Jira besides adding the issue # in the PRs

> Add per-transform documentation to the website
> --
>
> Key: BEAM-3763
> URL: https://issues.apache.org/jira/browse/BEAM-3763
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Priority: Minor
>  Labels: easyfix, reference
>
> Add structure to the website to incrementally document per-transform 
> definitions and examples. The idea is to incrementally populate this section 
> and clean up stale javadoc entries which have unworkable / outdated examples.
>  
> This task tracks creating the right structure for the website. Each transform 
> cleanup/documentation will come with its own JIRA, to facilitate other 
> members of the community to pick up outstanding work.





[jira] [Commented] (BEAM-3763) Add per-transform documentation to the website

2019-08-19 Thread Rose Nguyen (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910733#comment-16910733
 ] 

Rose Nguyen commented on BEAM-3763:
---

Yes; see BEAM-7700, BEAM-7464,  BEAM-7702-7705

 

I'm not sure how to link it to this Jira besides adding the issue # in the PRs

> Add per-transform documentation to the website
> --
>
> Key: BEAM-3763
> URL: https://issues.apache.org/jira/browse/BEAM-3763
> Project: Beam
>  Issue Type: Task
>  Components: website
>Reporter: Rafael Fernandez
>Priority: Minor
>  Labels: easyfix, reference
>
> Add structure to the website to incrementally document per-transform 
> definitions and examples. The idea is to incrementally populate this section 
> and clean up stale javadoc entries which have unworkable / outdated examples.
>  
> This task tracks creating the right structure for the website. Each transform 
> cleanup/documentation will come with its own JIRA, to facilitate other 
> members of the community to pick up outstanding work.





[jira] [Work logged] (BEAM-8006) add retracting to windowing strategy translation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8006?focusedWorklogId=297384=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297384
 ]

ASF GitHub Bot logged work on BEAM-8006:


Author: ASF GitHub Bot
Created on: 19/Aug/19 20:07
Start Date: 19/Aug/19 20:07
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on issue #9375: [BEAM-8006] Add 
retracting to windowing strategy translation.
URL: https://github.com/apache/beam/pull/9375#issuecomment-522733538
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 297384)
Time Spent: 0.5h  (was: 20m)

> add retracting to windowing strategy translation
> 
>
> Key: BEAM-8006
> URL: https://issues.apache.org/jira/browse/BEAM-8006
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-8005) beam_PostCommit_Python37 timing out

2019-08-19 Thread Udi Meiri (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910712#comment-16910712
 ] 

Udi Meiri commented on BEAM-8005:
-

This also happens on beam_PostCommit_Python36, just less frequently.
https://builds.apache.org/job/beam_PostCommit_Python36/263/console


> beam_PostCommit_Python37 timing out
> ---
>
> Key: BEAM-8005
> URL: https://issues.apache.org/jira/browse/BEAM-8005
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures, testing
>Reporter: Udi Meiri
>Priority: Major
>
> Seems to get stuck in :sdks:python:test-suites:dataflow:py37:postCommitIT
> {code}
> 10:03:30 Build timed out (after 100 minutes). Marking the build as aborted.
> {code}
> https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/consoleFull
> possible culprits listed in changes for these PRs: 
> https://builds.apache.org/job/beam_PostCommit_Python37/173/
> https://builds.apache.org/job/beam_PostCommit_Python37/174/





[jira] [Work logged] (BEAM-8006) add retracting to windowing strategy translation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8006?focusedWorklogId=297364=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297364
 ]

ASF GitHub Bot logged work on BEAM-8006:


Author: ASF GitHub Bot
Created on: 19/Aug/19 19:15
Start Date: 19/Aug/19 19:15
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on pull request #9375: [BEAM-8006] 
Add retracting to windowing strategy translation.
URL: https://github.com/apache/beam/pull/9375
 
 
   
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [x] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   

[jira] [Updated] (BEAM-8006) add retracting to windowing strategy translation

2019-08-19 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-8006:
---
Status: Open  (was: Triage Needed)

> add retracting to windowing strategy translation
> 
>
> Key: BEAM-8006
> URL: https://issues.apache.org/jira/browse/BEAM-8006
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>






[jira] [Created] (BEAM-8006) add retracting to windowing strategy translation

2019-08-19 Thread Rui Wang (Jira)
Rui Wang created BEAM-8006:
--

 Summary: add retracting to windowing strategy translation
 Key: BEAM-8006
 URL: https://issues.apache.org/jira/browse/BEAM-8006
 Project: Beam
  Issue Type: Sub-task
  Components: runner-core
Reporter: Rui Wang
Assignee: Rui Wang








[jira] [Resolved] (BEAM-7965) Add retracting mode to model proto

2019-08-19 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang resolved BEAM-7965.

Fix Version/s: Not applicable
   Resolution: Fixed

> Add retracting mode to model proto
> --
>
> Key: BEAM-7965
> URL: https://issues.apache.org/jira/browse/BEAM-7965
> Project: Beam
>  Issue Type: Sub-task
>  Components: beam-model
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297356=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297356
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 19:04
Start Date: 19/Aug/19 19:04
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #9362: [BEAM-7994] Fixing 
unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#issuecomment-522712507
 
 
   IIRC there should only be one other use of Unsafe and that's an optimization
   in the ioutilx package so a []byte we know should be allocated on the stack
   isn't put on the heap. (Noescape essentially).
   The only other workaround would be to plumb a concrete type all the way to
   that spot instead of an io.Reader.
   
   On Mon, Aug 19, 2019, 11:02 AM Joe Tsai  wrote:
   
   > *@dsnet* commented on this pull request.
   > --
   >
   > In sdks/go/pkg/beam/core/util/reflectx/functions_test.go
   > :
   >
   > > +func TestXxx(t *testing.T) {
   > +  val := reflect.ValueOf(testFunction)
   > +  fi := uintptr(val.Pointer())
   > +  typ := val.Type()
   > +
   > +  callable := LoadFunction(fi, typ)
   > +
   > +  cv := reflect.ValueOf(callable)
   > +  out := cv.Call(nil)
   > +  if len(out) != 1 {
   > +  t.Errorf("got %d return values, wanted 1.", len(out))
   > +  }
   > +  // TODO: check type?
   > +  if out[0].Int() != 42 {
   > +  t.Errorf("got %d, wanted 42", out[0].Int())
   > +  }
   >
   > Yes, please! :)
   >
   > The lack of a unit test on this function meant that it took several days
   > to track down the issue when there was mysterious memory corruption
   > happening in the entire beam job. A single unit test would have immediately
   > identified this as the culprit.
   >
   
 



Issue Time Tracking
---

Worklog Id: (was: 297356)
Time Spent: 3h 20m  (was: 3h 10m)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Commented] (BEAM-7883) PubsubIO (Java) write batch size can exceed request payload limit

2019-08-19 Thread Chamikara Jayalath (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910649#comment-16910649
 ] 

Chamikara Jayalath commented on BEAM-7883:
--

Thanks. The batch sizing done in PubsubIO is only an estimate, so there 
can be cases where the actual request goes over the 10 MB hard limit. Using 
'withMaxBatchBytesSize' is the correct workaround.

> PubsubIO (Java) write batch size can exceed request payload limit
> -
>
> Key: BEAM-7883
> URL: https://issues.apache.org/jira/browse/BEAM-7883
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.13.0
>Reporter: Yurii Atamanchuk
>Priority: Minor
>
> In some (probably rare) cases, a PubsubIO write batch (in Batch mode) can 
> exceed the request payload limit of 10 MB. PubsubIO ensures that the batch 
> size is below the limit (10 MB by default), but PubsubJsonClient then 
> converts message payloads to URL-safe Base64, which can inflate the 
> message size (in my case, for JSON strings, by up to 25-30%). As a result we 
> get a 400 response (with the message 'Request payload size exceeds the 
> limit: 10485760 bytes'), even though the original batch had a correct size.
> The obvious workaround is to reduce the batch size 
> (`PubsubIO.writeMessages().to(...).withMaxBatchBytesSize(... i.e. 5mb ...)`), 
> but it is a bit annoying.
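
A rough worked calculation of the inflation described above, as a minimal Python sketch. It assumes padded Base64 (4 output characters per 3 input bytes) and looks only at the message payload; the JSON envelope, attributes, and any other request overhead are not modelled, so a real safe batch size would be lower. The function names are hypothetical and are not part of PubsubIO.

```python
# 10485760 bytes is the limit quoted in the error message above.
PUBSUB_REQUEST_LIMIT_BYTES = 10 * 1024 * 1024


def base64_encoded_size(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of raw_bytes bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((raw_bytes + 2) // 3)


def max_raw_bytes_under_limit(limit_bytes: int) -> int:
    """Largest raw payload whose padded Base64 encoding fits in limit_bytes."""
    return (limit_bytes // 4) * 3


if __name__ == "__main__":
    raw = 8 * 1024 * 1024  # an 8 MiB batch, below the 10 MiB limit before encoding
    print(base64_encoded_size(raw))
    print(max_raw_bytes_under_limit(PUBSUB_REQUEST_LIMIT_BYTES))
```

Under these assumptions a batch that is well below the limit before encoding can exceed it afterwards, which is consistent with the 400 response reported here. The theoretical Base64 overhead is about 33%, in the same range as the 25-30% observed by the reporter.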





[jira] [Work logged] (BEAM-7980) External environment with containerized worker pool

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7980?focusedWorklogId=297348=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297348
 ]

ASF GitHub Bot logged work on BEAM-7980:


Author: ASF GitHub Bot
Created on: 19/Aug/19 18:26
Start Date: 19/Aug/19 18:26
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #9371: [WIP] [BEAM-7980] 
External environment with containerized worker pool
URL: https://github.com/apache/beam/pull/9371#issuecomment-522698334
 
 
   CC: @rakeshcusat 
 



Issue Time Tracking
---

Worklog Id: (was: 297348)
Time Spent: 20m  (was: 10m)

> External environment with containerized worker pool
> ---
>
> Key: BEAM-7980
> URL: https://issues.apache.org/jira/browse/BEAM-7980
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Augment Beam Python docker image and boot.go so that it can be used to launch 
> BeamFnExternalWorkerPoolServicer.
> [https://docs.google.com/document/d/1z3LNrRtr8kkiFHonZ5JJM_L4NWNBBNcqRc_yAf6G0VI/edit#heading=h.lnhm75dhvhi0]
>  





[jira] [Commented] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Jerome MASSOT (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910622#comment-16910622
 ] 

Jerome MASSOT commented on BEAM-7998:
-

Hi Pablo,

thanks for taking care of this apparent issue.

For confidentiality reasons, I cannot share the entire code. But this is the 
snippet where I detected the behavior:

I have a bucket in GCP and these are the arguments for the runner. The strange 
behavior occurs with both DirectRunner and DataflowRunner.


import apache_beam as beam

# bucket_id and project_id are defined elsewhere in the code that is not shared.
gcs_folder = 'gs://{}/chunks-dashboard/'.format(bucket_id)
argv = [
    '--project={}'.format(project_id),
    '--job_name=chunk-dashboard',
    '--save_main_session',
    '--staging_location=' + gcs_folder + 'staging',
    '--temp_location=' + gcs_folder + 'temp',
    '--max_num_workers=10',
    '--autoscaling_algorithm=THROUGHPUT_BASED',
    '--runner=DirectRunner',
    '--setup_file=./setup.py',
    '--machine_type=n1-standard-4'
]
pipeline_options = beam.options.pipeline_options.PipelineOptions(argv)

 

I use a *.json wildcard as follows:

from apache_beam.io import fileio

# Retrieve the path of the chunk folder in the bucket and wildcard the chunk files.
chunks_folder = 'gs://{}/{}'.format(bucket_id, folder_id)
chunk_files = chunks_folder + '/*.json'

with beam.Pipeline(options=pipeline_options) as p:
    chunk_content = (p
                     | fileio.MatchFiles(chunk_files)
                     | fileio.ReadMatches()
                     )

 

When I run this pipeline on a folder where a single JSON file is stored, the 
pipeline matches this one JSON file twice.

Strange...

Thanks for your help,

and Good luck

Best regards

 

Jerome
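
One way to confirm and work around the behavior described above is to count how often each matched path occurs. The following is a minimal, self-contained Python sketch of that check, not a fix for the underlying issue. The helper names and the example paths are hypothetical; in a real pipeline the list of paths would have to be collected from the match results first.

```python
from collections import Counter
from typing import Dict, Iterable, List


def duplicate_paths(paths: Iterable[str]) -> Dict[str, int]:
    """Return each path that appears more than once, with its occurrence count."""
    counts = Counter(paths)
    return {path: n for path, n in counts.items() if n > 1}


def unique_paths(paths: Iterable[str]) -> List[str]:
    """Drop repeated paths while keeping first-seen order."""
    return list(dict.fromkeys(paths))


if __name__ == "__main__":
    # Hypothetical match output for a folder that holds a single JSON file.
    matched = [
        "gs://my-bucket/chunks/a.json",
        "gs://my-bucket/chunks/a.json",
    ]
    print(duplicate_paths(matched))
    print(unique_paths(matched))
```

If the duplicate count is non-empty for a folder known to hold one file, the repetition comes from the match step rather than from the bucket contents.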

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Assignee: Pablo Estrada
>Priority: Major
>
> Hi team,
> when I use MatchFiles with a wildcard and files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least 2).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> But I am not good enough to say more than that. Sorry.
> Best regards
> Jerome





[jira] [Assigned] (BEAM-6923) OOM errors in jobServer when using GCS artifactDir

2019-08-19 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka reassigned BEAM-6923:
--

Assignee: Ankur Goenka

> OOM errors in jobServer when using GCS artifactDir
> --
>
> Key: BEAM-6923
> URL: https://issues.apache.org/jira/browse/BEAM-6923
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-harness
>Reporter: Lukasz Gajowy
>Assignee: Ankur Goenka
>Priority: Major
> Attachments: Instance counts.png, Paths to GC root.png, 
> Telemetries.png, heapdump size-sorted.png
>
>
> When starting jobServer with artifactDir pointing to a GCS bucket: 
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow 
> -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code}
> and running a Java portable pipeline with the following, portability related 
> pipeline options: 
> {code:java}
> --runner=PortableRunner --jobEndpoint=localhost:8099 
> --defaultEnvironmentType=DOCKER 
> --defaultEnvironmentConfig=gcr.io//java:latest'{code}
>  
> I'm facing a series of OOM errors, like this: 
> {code:java}
> Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: 
> Java heap space
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408)
> at 
> com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
> at 
> com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
> at 
> com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745){code}
>  
> This does not happen when I'm using a local filesystem for the artifact 
> staging location. 
>  





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297344=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297344
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 18:02
Start Date: 19/Aug/19 18:02
Worklog Time Spent: 10m 
  Work Description: dsnet commented on pull request #9362: [BEAM-7994] 
Fixing unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#discussion_r315339519
 
 

 ##
 File path: sdks/go/pkg/beam/core/util/reflectx/functions_test.go
 ##
 @@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package reflectx
+
+import (
+   "reflect"
+   "testing"
+)
+
+func testFunction() int {
+   return 42
+}
+
+func TestXxx(t *testing.T) {
+   val := reflect.ValueOf(testFunction)
+   fi := uintptr(val.Pointer())
+   typ := val.Type()
+
+   callable := LoadFunction(fi, typ)
+
+   cv := reflect.ValueOf(callable)
+   out := cv.Call(nil)
+   if len(out) != 1 {
+   t.Errorf("got %d return values, wanted 1.", len(out))
+   }
+   // TODO: check type?
+   if out[0].Int() != 42 {
+   t.Errorf("got %d, wanted 42", out[0].Int())
+   }
 
 Review comment:
   Yes, please! :)
   
   The lack of a unit test on this function meant that it took several days to 
track down the issue when there was mysterious memory corruption happening in 
the entire beam job. A single unit test would have immediately identified this 
as the culprit.
 



Issue Time Tracking
---

Worklog Id: (was: 297344)
Time Spent: 3h 10m  (was: 3h)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297335=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297335
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315316012
 
 

 ##
 File path: sdks/java/extensions/zetasketch/build.gradle
 ##
 @@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature()
+
+description = "Apache Beam :: SDKs :: Java :: Extensions :: ZetaSketch"
+
+def zetasketch_version = "0.1.0"
+
+dependencies {
+compile library.java.vendored_guava_26_0_jre
+compile project(path: ":sdks:java:core", configuration: "shadow")
+compile "com.google.zetasketch:zetasketch:$zetasketch_version"
+testCompile library.java.junit
+testCompile project(":sdks:java:io:google-cloud-platform")
+testRuntimeOnly project(":runners:direct-java")
 
 Review comment:
   Do we need to depend on direct runner here? If the test is only running with 
dataflow runner, then `:runners:google-cloud-dataflow-java` would be enough.
 



Issue Time Tracking
---

Worklog Id: (was: 297335)
Time Spent: 20h 50m  (was: 20h 40m)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297338=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297338
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315327103
 
 

 ##
 File path: sdks/java/extensions/zetasketch/build.gradle
 ##
 @@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
 
 Review comment:
   If the current project depends on another project, it's recommended to 
declare 'evaluationDependsOn' first, like: 
https://github.com/apache/beam/blob/master/runners/direct-java/build.gradle#L50
 



Issue Time Tracking
---

Worklog Id: (was: 297338)
Time Spent: 21h  (was: 20h 50m)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 21h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297339=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297339
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315333061
 
 

 ##
 File path: 
sdks/java/extensions/zetasketch/src/test/java/org/apache/beam/sdk/extensions/zetasketch/BigQueryHllSketchCompatibilityIT.java
 ##
 @@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.zetasketch;
+
+import com.google.api.services.bigquery.model.TableFieldSchema;
+import com.google.api.services.bigquery.model.TableRow;
+import com.google.api.services.bigquery.model.TableSchema;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.ByteArrayCoder;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
+import org.apache.beam.sdk.io.gcp.bigquery.SchemaAndRecord;
+import org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher;
+import org.apache.beam.sdk.options.ApplicationNameOptions;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipelineOptions;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.PCollection;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Integration tests for HLL++ sketch compatibility between Beam and BigQuery. 
The tests verifies
+ * that HLL++ sketches created in Beam can be processed by BigQuery, and vice 
versa.
+ */
+@RunWith(JUnit4.class)
+public class BigQueryHllSketchCompatibilityIT {
+
+  private static final String DATASET_NAME = "zetasketch_compatibility_test";
+
+  // Table for testReadSketchFromBigQuery()
+  // Schema: only one STRING field named "data".
+  // Content: prepopulated with 4 rows: "Apple", "Orange", "Banana", "Orange"
+  private static final String DATA_TABLE_NAME = "hll_data";
+  private static final String DATA_FIELD_NAME = "data";
+  private static final String QUERY_RESULT_FIELD_NAME = "sketch";
+  private static final Long EXPECTED_COUNT = 3L;
+
+  // Table for testWriteSketchToBigQuery()
+  // Schema: only one BYTES field named "sketch".
+  // Content: will be overridden by the sketch computed by the test pipeline 
each time the test runs
+  private static final String SKETCH_TABLE_NAME = "hll_sketch";
+  private static final String SKETCH_FIELD_NAME = "sketch";
+  private static final List<String> TEST_DATA =
+  Arrays.asList("Apple", "Orange", "Banana", "Orange");
+  // SHA-1 hash of string "[3]", the string representation of a row that has 
only one field 3 in it
+  private static final String EXPECTED_CHECKSUM = 
"f1e31df9806ce94c5bdbbfff9608324930f4d3f1";
+
+  /**
+   * Test that HLL++ sketch computed in BigQuery can be processed by Beam. Hll 
sketch is computed by
+   * {@code HLL_COUNT.INIT} in BigQuery and read into Beam; the test verifies 
that we can run {@link
+   * HllCount.MergePartial} and {@link HllCount.Extract} on the sketch in Beam 
to get the correct
+   * estimated count.
+   */
+  @Test
+  public void testReadSketchFromBigQuery() {
 
 Review comment:
   This test is trying to read from an existing BQ table, right? I'm curious: 
without creating this data table before running the pipeline, how can this test 
work?
 


[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297340
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r31597
 
 

 ##
 File path: 
sdks/java/extensions/zetasketch/src/test/java/org/apache/beam/sdk/extensions/zetasketch/BigQueryHllSketchCompatibilityIT.java
 ##
 @@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.zetasketch;
+
+import com.google.api.services.bigquery.model.TableFieldSchema;
+import com.google.api.services.bigquery.model.TableRow;
+import com.google.api.services.bigquery.model.TableSchema;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.coders.ByteArrayCoder;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
+import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
+import org.apache.beam.sdk.io.gcp.bigquery.SchemaAndRecord;
+import org.apache.beam.sdk.io.gcp.testing.BigqueryMatcher;
+import org.apache.beam.sdk.options.ApplicationNameOptions;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.testing.TestPipelineOptions;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import org.apache.beam.sdk.values.PCollection;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/**
+ * Integration tests for HLL++ sketch compatibility between Beam and BigQuery. The tests verify
+ * that HLL++ sketches created in Beam can be processed by BigQuery, and vice versa.
+ */
+@RunWith(JUnit4.class)
+public class BigQueryHllSketchCompatibilityIT {
+
+  private static final String DATASET_NAME = "zetasketch_compatibility_test";
+
+  // Table for testReadSketchFromBigQuery()
+  // Schema: only one STRING field named "data".
+  // Content: prepopulated with 4 rows: "Apple", "Orange", "Banana", "Orange"
+  private static final String DATA_TABLE_NAME = "hll_data";
+  private static final String DATA_FIELD_NAME = "data";
+  private static final String QUERY_RESULT_FIELD_NAME = "sketch";
+  private static final Long EXPECTED_COUNT = 3L;
+
+  // Table for testWriteSketchToBigQuery()
+  // Schema: only one BYTES field named "sketch".
+  // Content: will be overridden by the sketch computed by the test pipeline each time the test runs
+  private static final String SKETCH_TABLE_NAME = "hll_sketch";
+  private static final String SKETCH_FIELD_NAME = "sketch";
+  private static final List<String> TEST_DATA =
+      Arrays.asList("Apple", "Orange", "Banana", "Orange");
+  // SHA-1 hash of string "[3]", the string representation of a row that has only one field 3 in it
+  private static final String EXPECTED_CHECKSUM = "f1e31df9806ce94c5bdbbfff9608324930f4d3f1";
+
+  /**
+   * Test that HLL++ sketch computed in BigQuery can be processed by Beam. Hll sketch is computed by
+   * {@code HLL_COUNT.INIT} in BigQuery and read into Beam; the test verifies that we can run {@link
+   * HllCount.MergePartial} and {@link HllCount.Extract} on the sketch in Beam to get the correct
+   * estimated count.
+   */
+  @Test
+  public void testReadSketchFromBigQuery() {
+    String tableSpec = String.format("%s.%s", DATASET_NAME, DATA_TABLE_NAME);
+    String query =
+        String.format(
+            "SELECT HLL_COUNT.INIT(%s) AS %s FROM %s",
+            DATA_FIELD_NAME, QUERY_RESULT_FIELD_NAME, tableSpec);
+    SerializableFunction<SchemaAndRecord, byte[]> parseQueryResultToByteArray =
+        (SchemaAndRecord schemaAndRecord) ->
+// BigQuery BYTES type corresponds 

[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297334
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315325651
 
 

 ##
 File path: sdks/java/extensions/zetasketch/build.gradle
 ##
 @@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import groovy.json.JsonOutput
+
+plugins { id 'org.apache.beam.module' }
+applyJavaNature()
+
+description = "Apache Beam :: SDKs :: Java :: Extensions :: ZetaSketch"
+
+def zetasketch_version = "0.1.0"
+
+dependencies {
+  compile library.java.vendored_guava_26_0_jre
+  compile project(path: ":sdks:java:core", configuration: "shadow")
+  compile "com.google.zetasketch:zetasketch:$zetasketch_version"
+  testCompile library.java.junit
+  testCompile project(":sdks:java:io:google-cloud-platform")
+  testRuntimeOnly project(":runners:direct-java")
+  testRuntimeOnly project(":runners:google-cloud-dataflow-java")
+}
+
+/**
+ * Integration tests running on Dataflow with BigQuery.
+ */
+task integrationTest(type: Test) {
+  group = "Verification"
+  def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
+  def gcpTempRoot = project.findProperty('gcpTempRoot') ?: 'gs://temp-storage-for-end-to-end-tests'
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+      "--runner=TestDataflowRunner",
+      "--project=${gcpProject}",
+      "--tempRoot=${gcpTempRoot}",
+  ])
 
 Review comment:
   With the current configuration, the test will run with the default worker 
image, not the HEAD worker code. Could you explain why you chose the default 
worker image rather than HEAD? 
 



Issue Time Tracking
---

Worklog Id: (was: 297334)
Time Spent: 20h 50m  (was: 20h 40m)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297336
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315334920
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/testing/BigqueryMatcher.java
 ##
 @@ -63,32 +63,52 @@
 
   private final String projectId;
   private final String query;
+  private final boolean usingStandardSql;
 
 Review comment:
   Any reason why we need this boolean? IIUC, BigQuery can tell whether the 
current query is legacy or standard SQL based on the query string.
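
For context on the point above: BigQuery documents that a query can declare its dialect with a `#legacySQL` or `#standardSQL` prefix on its first line. The following is a minimal sketch of detecting that prefix. The class and method names are hypothetical, and nothing in this thread establishes that BigqueryMatcher does or should work this way.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Beam or the BigQuery client library.
class SqlDialectPrefixSketch {
  enum Dialect { LEGACY, STANDARD, UNSPECIFIED }

  // Inspects only the first line of the query for a dialect prefix (case-insensitive).
  static Dialect dialectFromPrefix(String query) {
    String firstLine = query.trim().split("\\R", 2)[0].trim().toLowerCase(Locale.ROOT);
    if (firstLine.equals("#standardsql")) {
      return Dialect.STANDARD;
    }
    if (firstLine.equals("#legacysql")) {
      return Dialect.LEGACY;
    }
    return Dialect.UNSPECIFIED;
  }

  public static void main(String[] args) {
    System.out.println(dialectFromPrefix("#standardSQL\nSELECT 1"));
    System.out.println(dialectFromPrefix("SELECT 1"));
  }
}
```

A query without a prefix yields UNSPECIFIED in this sketch, in which case the dialect has to come from somewhere else, for example an explicit flag.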
 



Issue Time Tracking
---

Worklog Id: (was: 297336)
Time Spent: 21h  (was: 20h 50m)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 21h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297337
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:52
Start Date: 19/Aug/19 17:52
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on pull request #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#discussion_r315335308
 
 

 ##
 File path: 
sdks/java/extensions/zetasketch/src/test/java/org/apache/beam/sdk/extensions/zetasketch/HllCountTest.java
 ##
 @@ -0,0 +1,373 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.zetasketch;
+
+import com.google.zetasketch.HyperLogLogPlusPlus;
+import com.google.zetasketch.shaded.com.google.protobuf.ByteString;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import org.apache.beam.sdk.Pipeline.PipelineExecutionException;
+import org.apache.beam.sdk.testing.NeedsRunner;
+import org.apache.beam.sdk.testing.PAssert;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.apache.beam.sdk.transforms.Create;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.rules.ExpectedException;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for {@link HllCount}. */
+@RunWith(JUnit4.class)
 
 Review comment:
   This test suite is not included in any test target, right?
 



Issue Time Tracking
---

Worklog Id: (was: 297337)
Time Spent: 21h  (was: 20h 50m)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 21h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=297333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297333
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:50
Start Date: 19/Aug/19 17:50
Worklog Time Spent: 10m 
  Work Description: y1chi commented on issue #9334: [BEAM-7972] Always use 
Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#issuecomment-522684767
 
 
   LGTM.
   thanks for looking into this.
 



Issue Time Tracking
---

Worklog Id: (was: 297333)
Time Spent: 1h 10m  (was: 1h)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Streaming pipeline gets stuck when using Reshuffle with a windowed PCollection.
> The issue happens because the window function gets deserialized on the Java 
> side, which is not possible, so it defaults to the global window function and 
> results in a window function mismatch later in the code.





[jira] [Resolved] (BEAM-7720) Fix the exception type of InMemoryJobService when job id not found

2019-08-19 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-7720.
-
Fix Version/s: 2.16.0
   Resolution: Fixed

> Fix the exception type of InMemoryJobService when job id not found
> --
>
> Key: BEAM-7720
> URL: https://issues.apache.org/jira/browse/BEAM-7720
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The contract in beam_job_api.proto for `CancelJobRequest`, 
> `GetJobStateRequest`, and `GetJobPipelineRequest` states:
>   
> {noformat}
> // Throws error NOT_FOUND if the jobId is not found{noformat}
>   
> However, `InMemoryJobService` is handling this exception incorrectly by 
> rethrowing `NOT_FOUND` exceptions as `INTERNAL`.
> Neither `JobMessagesRequest` nor `GetJobMetricsRequest` states its contract 
> with respect to exceptions, but they should probably be updated to handle 
> `NOT_FOUND` in the same way.
>  





[jira] [Work logged] (BEAM-7720) Fix the exception type of InMemoryJobService when job id not found

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7720?focusedWorklogId=297332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297332
 ]

ASF GitHub Bot logged work on BEAM-7720:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:48
Start Date: 19/Aug/19 17:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9347: [BEAM-7720] 
Fix the exception type of InMemoryJobService when job id not found
URL: https://github.com/apache/beam/pull/9347
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 297332)
Time Spent: 1h 10m  (was: 1h)

> Fix the exception type of InMemoryJobService when job id not found
> --
>
> Key: BEAM-7720
> URL: https://issues.apache.org/jira/browse/BEAM-7720
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The contract in beam_job_api.proto for `CancelJobRequest`, 
> `GetJobStateRequest`, and `GetJobPipelineRequest` states:
>   
> {noformat}
> // Throws error NOT_FOUND if the jobId is not found{noformat}
>   
> However, `InMemoryJobService` is handling this exception incorrectly by 
> rethrowing `NOT_FOUND` exceptions as `INTERNAL`.
> Neither `JobMessagesRequest` nor `GetJobMetricsRequest` states its contract 
> with respect to exceptions, but they should probably be updated to handle 
> `NOT_FOUND` in the same way.
>  





[jira] [Work logged] (BEAM-7720) Fix the exception type of InMemoryJobService when job id not found

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7720?focusedWorklogId=297331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297331
 ]

ASF GitHub Bot logged work on BEAM-7720:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:48
Start Date: 19/Aug/19 17:48
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9347: [BEAM-7720] 
Fix the exception type of InMemoryJobService when job id not found
URL: https://github.com/apache/beam/pull/9347#discussion_r315333579
 
 

 ##
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/jobsubmission/InMemoryJobService.java
 ##
 @@ -253,6 +253,8 @@ public void getState(
   GetJobStateResponse response = GetJobStateResponse.newBuilder().setState(state).build();
   responseObserver.onNext(response);
   responseObserver.onCompleted();
+} catch (StatusException e) {
 
 Review comment:
   Since they are runtime exceptions, nothing is required to declare that they 
are thrown in their method signatures, and they can propagate up the call stack 
from any arbitrary location until they hit an appropriately scoped catch statement.
   
   Without the change to also catch StatusRuntimeException alongside 
StatusException, they would be caught as part of the Exception block and 
converted to INTERNAL errors.
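
The behavior described in this comment can be sketched in isolation. The example below is a minimal illustration under stated assumptions: `SketchStatusException` and `SketchStatusRuntimeException` are hypothetical stand-ins for gRPC's checked and unchecked status exceptions, and the handlers return a status string instead of calling a response observer. It is not the InMemoryJobService code.

```java
// Hypothetical stand-in for a checked status exception (not the io.grpc class).
class SketchStatusException extends Exception {
  final String code;

  SketchStatusException(String code) {
    super(code);
    this.code = code;
  }
}

// Hypothetical stand-in for an unchecked status exception (not the io.grpc class).
class SketchStatusRuntimeException extends RuntimeException {
  final String code;

  SketchStatusRuntimeException(String code) {
    super(code);
    this.code = code;
  }
}

class CatchOrderSketch {
  interface Call {
    String run() throws SketchStatusException;
  }

  // Only the checked exception keeps its status; an unchecked status exception
  // falls through to the generic Exception block and becomes INTERNAL.
  static String withoutRuntimeCatch(Call call) {
    try {
      return call.run();
    } catch (SketchStatusException e) {
      return e.code;
    } catch (Exception e) {
      return "INTERNAL";
    }
  }

  // Catching the unchecked variant before Exception preserves its status.
  static String withRuntimeCatch(Call call) {
    try {
      return call.run();
    } catch (SketchStatusException e) {
      return e.code;
    } catch (SketchStatusRuntimeException e) {
      return e.code;
    } catch (Exception e) {
      return "INTERNAL";
    }
  }

  public static void main(String[] args) {
    Call notFound = () -> {
      throw new SketchStatusRuntimeException("NOT_FOUND");
    };
    System.out.println(withoutRuntimeCatch(notFound));
    System.out.println(withRuntimeCatch(notFound));
  }
}
```

Because the lambda's signature never declares the unchecked exception, the compiler gives no hint that the generic handler is swallowing a status that should have reached the caller.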
 



Issue Time Tracking
---

Worklog Id: (was: 297331)
Time Spent: 1h  (was: 50m)

> Fix the exception type of InMemoryJobService when job id not found
> --
>
> Key: BEAM-7720
> URL: https://issues.apache.org/jira/browse/BEAM-7720
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The contract in beam_job_api.proto for `CancelJobRequest`, 
> `GetJobStateRequest`, and `GetJobPipelineRequest` states:
>   
> {noformat}
> // Throws error NOT_FOUND if the jobId is not found{noformat}
>   
> However, `InMemoryJobService` is handling this exception incorrectly by 
> rethrowing `NOT_FOUND` exceptions as `INTERNAL`.
> Neither `JobMessagesRequest` nor `GetJobMetricsRequest` states its contract 
> with respect to exceptions, but they should probably be updated to handle 
> `NOT_FOUND` in the same way.
>  





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=297328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297328
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:44
Start Date: 19/Aug/19 17:44
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
support customized container for Python
URL: https://github.com/apache/beam/pull/9351#issuecomment-522682235
 
 
   R: @yifanzou 
 



Issue Time Tracking
---

Worklog Id: (was: 297328)
Time Spent: 2h 20m  (was: 2h 10m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Created] (BEAM-8005) beam_PostCommit_Python37 timing out

2019-08-19 Thread Udi Meiri (Jira)
Udi Meiri created BEAM-8005:
---

 Summary: beam_PostCommit_Python37 timing out
 Key: BEAM-8005
 URL: https://issues.apache.org/jira/browse/BEAM-8005
 Project: Beam
  Issue Type: Bug
  Components: test-failures, testing
Reporter: Udi Meiri


Seems to get stuck in :sdks:python:test-suites:dataflow:py37:postCommitIT

{code}
10:03:30 Build timed out (after 100 minutes). Marking the build as aborted.
{code}
https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/consoleFull

possible culprits listed in changes for these PRs: 
https://builds.apache.org/job/beam_PostCommit_Python37/173/
https://builds.apache.org/job/beam_PostCommit_Python37/174/





[jira] [Work logged] (BEAM-7986) Increase minimum grpcio required version

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7986?focusedWorklogId=297313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297313
 ]

ASF GitHub Bot logged work on BEAM-7986:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:25
Start Date: 19/Aug/19 17:25
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #9356: [BEAM-7986] Upgrade 
grpcio
URL: https://github.com/apache/beam/pull/9356#issuecomment-522675016
 
 
   > Could you check that this will work with protobuf 3.5 and tensorflow 1.14 ?
   
   Do we have a test for that? Also, this PR doesn't change the protobuf 
requirement so it should still work the same.
 



Issue Time Tracking
---

Worklog Id: (was: 297313)
Time Spent: 40m  (was: 0.5h)

> Increase minimum grpcio required version
> 
>
> Key: BEAM-7986
> URL: https://issues.apache.org/jira/browse/BEAM-7986
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> According to this question, 1.11.0 is not new enough (1.22.0 reportedly 
> works), and we list the minimum as 1.8.
> https://stackoverflow.com/questions/57479498/beam-channel-object-has-no-attribute-close?noredirect=1#comment101446049_57479498
> Affects DirectRunner Pub/Sub client.





[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=297301&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297301
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:15
Start Date: 19/Aug/19 17:15
Worklog Time Spent: 10m 
  Work Description: apilloud commented on pull request #9189: [BEAM-5820] 
vendor calcite
URL: https://github.com/apache/beam/pull/9189#discussion_r315319821
 
 

 ##
 File path: sdks/java/extensions/sql/jdbc/build.gradle
 ##
 @@ -53,32 +50,13 @@ processResources {
   ]
 }
 
-shadowJar {
-  manifest {
-    attributes "Main-Class": "org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLine"
 
 Review comment:
   Yes, we need to keep creating the uber jar for the JDBC target.
 



Issue Time Tracking
---

Worklog Id: (was: 297301)
Time Spent: 5h 10m  (was: 5h)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=297299&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297299
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:12
Start Date: 19/Aug/19 17:12
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #9374: [BEAM-5428] 
Implement Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374
 
 
   This adds support for sending cache tokens for state requests from the
   SDK. Cache tokens are managed by the StateRequestHandler implemented by the
   Runner.
   
   The Flink Runner implementation keeps a fixed number of cache tokens scoped by
   the user state id. Currently, this is limited to 100 items.
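
A size-bounded map of the kind described can be sketched as follows. This is an illustration only: the class name, the use of `LinkedHashMap`, and the oldest-first eviction policy are assumptions, since the description above states only the limit of 100 entries and not how the Flink Runner evicts them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a bounded cache keyed by user state id; not the Flink Runner code.
class BoundedTokenCacheSketch {
  // Limit taken from the PR description above.
  static final int MAX_ENTRIES = 100;

  // Returns a map that drops its oldest inserted entry once maxEntries is exceeded.
  static <K, V> Map<K, V> newBoundedCache(final int maxEntries) {
    return new LinkedHashMap<K, V>(16, 0.75f, false) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
      }
    };
  }

  public static void main(String[] args) {
    Map<String, String> tokens = newBoundedCache(MAX_ENTRIES);
    for (int i = 0; i < 150; i++) {
      tokens.put("userState-" + i, "token-" + i);
    }
    // The map never grows past the configured limit.
    System.out.println(tokens.size());
  }
}
```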
   
[jira] [Commented] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Pablo Estrada (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910580#comment-16910580
 ] 

Pablo Estrada commented on BEAM-7998:
-

Woah this is very odd. [~jerome.massot...@gmail.com] can you share your 
pipeline code? At least the part where you use MatchAll?

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Priority: Major
>
> Hi team,
> When I use MatchFiles with a wildcard on files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least twice).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> But I am not good enough to say more than that. Sorry.
> Best regards
> Jerome





[jira] [Assigned] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Pablo Estrada (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pablo Estrada reassigned BEAM-7998:
---

Assignee: Pablo Estrada

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Assignee: Pablo Estrada
>Priority: Major
>
> Hi team,
> When I use MatchFiles with a wildcard on files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least twice).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> But I am not good enough to say more than that. Sorry.
> Best regards
> Jerome





[jira] [Work logged] (BEAM-7013) A new count distinct transform based on BigQuery compatible HyperLogLog++ implementation

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7013?focusedWorklogId=297292=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297292
 ]

ASF GitHub Bot logged work on BEAM-7013:


Author: ASF GitHub Bot
Created on: 19/Aug/19 17:01
Start Date: 19/Aug/19 17:01
Worklog Time Spent: 10m 
  Work Description: boyuanzz commented on issue #9144: [BEAM-7013] 
Integrating ZetaSketch's HLL++ algorithm with Beam
URL: https://github.com/apache/beam/pull/9144#issuecomment-522665540
 
 
   Run Java PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 297292)
Time Spent: 20h 40m  (was: 20.5h)

> A new count distinct transform based on BigQuery compatible HyperLogLog++ 
> implementation
> 
>
> Key: BEAM-7013
> URL: https://issues.apache.org/jira/browse/BEAM-7013
> Project: Beam
>  Issue Type: New Feature
>  Components: extensions-java-sketching, sdk-java-core
>Reporter: Yueyang Qiu
>Assignee: Yueyang Qiu
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5820) Vendor Calcite

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5820?focusedWorklogId=297290=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297290
 ]

ASF GitHub Bot logged work on BEAM-5820:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:59
Start Date: 19/Aug/19 16:59
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9333: [BEAM-5820] release 
vendor calcite
URL: https://github.com/apache/beam/pull/9333#issuecomment-522664796
 
 
   Please take a look at https://s.apache.org/beam-release-vendored-artifacts. 
The first step is finding a release manager by reaching out on the dev@ mailing 
list.
 



Issue Time Tracking
---

Worklog Id: (was: 297290)
Time Spent: 5h  (was: 4h 50m)

> Vendor Calcite
> --
>
> Key: BEAM-5820
> URL: https://issues.apache.org/jira/browse/BEAM-5820
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Kai Jiang
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-6855) Side inputs are not supported when using the state API

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6855?focusedWorklogId=297288=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297288
 ]

ASF GitHub Bot logged work on BEAM-6855:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:54
Start Date: 19/Aug/19 16:54
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #9140: [BEAM-6855] Side 
inputs are not supported when using the state API
URL: https://github.com/apache/beam/pull/9140#issuecomment-522662862
 
 
   Is this meant to fix the bug, or just a prerequisite PR to add a test?
 



Issue Time Tracking
---

Worklog Id: (was: 297288)
Time Spent: 3h 20m  (was: 3h 10m)

> Side inputs are not supported when using the state API
> --
>
> Key: BEAM-6855
> URL: https://issues.apache.org/jira/browse/BEAM-6855
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-dataflow, runner-direct
>Reporter: Reuven Lax
>Assignee: Shehzaad Nakhoda
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7999) BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7999?focusedWorklogId=297287=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297287
 ]

ASF GitHub Bot logged work on BEAM-7999:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:54
Start Date: 19/Aug/19 16:54
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #9369: [BEAM-7999] Fix 
BigQuery timestamp handling for Schema Aware PCollection
URL: https://github.com/apache/beam/pull/9369#issuecomment-522662577
 
 
   This looks good, but we should run the SQL BigQuery tests as well. @apilloud, 
what is the incantation to run all the SQL tests?
 



Issue Time Tracking
---

Worklog Id: (was: 297287)
Time Spent: 40m  (was: 0.5h)

> BigQueryIO.readTableRowsWithSchema() doesn't handle timestamp correctly
> ---
>
> Key: BEAM-7999
> URL: https://issues.apache.org/jira/browse/BEAM-7999
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Alex Van Boxel
>Assignee: Alex Van Boxel
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When using the new readTableRowsWithSchema to copy a table (a simple 
> operation), parsing the timestamp column fails because the code assumes a 
> Double value. BigQuery outputs a string like "2019-08-16 00:12:00.123456 
> UTC", which isn't handled.
> *Reproducible:*
> with this table
> {code:java}
> INSERT `research.alex.in1` (row_id, f_int64, f_timestamp)
> VALUES
> (1, 1, '2019-08-16 00:12:00 UTC'),
> (2, 2, '2019-08-16 00:12:00.123 UTC'),
> (3, 3, '2019-08-16 00:12:00.123456 UTC')
> {code}
> do a copy operation:
> {code:java}
> pipeline
>     .apply(
>         BigQueryIO.readTableRowsWithSchema()
>             .from("research:alex.in1")
>             //.withMethod(BigQueryIO.TypedRead.Method.DIRECT_READ)
>     )
>     .apply(ParDo.of(new Inspect()))
>     .apply(
>         BigQueryIO.writeTableRows()
>             .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>             .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
>             .useBeamSchema()
>             .to("research:alex.out4"));
> {code}
>  
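
The textual timestamp format described above can be handled without assuming a numeric value. The following is a minimal sketch in Python, not the fix from the pull request; `parse_bq_timestamp` is a hypothetical helper named here only for illustration. It assumes the value is either epoch seconds or a string of the form shown in the report, with optional fractional seconds and a literal " UTC" suffix.

```python
from datetime import datetime, timezone


def parse_bq_timestamp(value):
    """Return an aware UTC datetime for a BigQuery-style timestamp value.

    Hypothetical helper for illustration only. Accepts either epoch seconds
    (int, float, or numeric string) or 'YYYY-MM-DD HH:MM:SS[.ffffff] UTC'.
    """
    text = str(value).strip()
    try:
        # Numeric path: the value is seconds since the Unix epoch.
        return datetime.fromtimestamp(float(text), tz=timezone.utc)
    except ValueError:
        pass  # Not a number; fall through to the textual form.

    if not text.endswith(" UTC"):
        raise ValueError("unsupported timestamp: %r" % (value,))
    body = text[: -len(" UTC")]
    # %f accepts one to six fractional digits, so .123 and .123456 both parse.
    fmt = "%Y-%m-%d %H:%M:%S.%f" if "." in body else "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(body, fmt).replace(tzinfo=timezone.utc)


# The three sample values from the INSERT statement in the report.
for sample in ("2019-08-16 00:12:00 UTC",
               "2019-08-16 00:12:00.123 UTC",
               "2019-08-16 00:12:00.123456 UTC"):
    print(parse_bq_timestamp(sample).isoformat())
```

Trying the numeric form first keeps the previous behaviour for values that really are doubles, and only falls back to string parsing when that fails.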





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297256=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297256
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:18
Start Date: 19/Aug/19 16:18
Worklog Time Spent: 10m 
  Work Description: wcn3 commented on pull request #9362: [BEAM-7994] 
Fixing unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#discussion_r315296527
 
 

 ##
 File path: sdks/go/pkg/beam/core/util/reflectx/functions_test.go
 ##
 @@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package reflectx
+
+import (
+   "reflect"
+   "testing"
+)
+
+func testFunction() int {
+   return 42
+}
+
+func TestXxx(t *testing.T) {
+   val := reflect.ValueOf(testFunction)
+   fi := uintptr(val.Pointer())
+   typ := val.Type()
+
+   callable := LoadFunction(fi, typ)
+
+   cv := reflect.ValueOf(callable)
+   out := cv.Call(nil)
+   if len(out) != 1 {
+   t.Errorf("got %d return values, wanted 1.", len(out))
+   }
+   // TODO: check type?
+   if out[0].Int() != 42 {
+   t.Errorf("got %d, wanted 42", out[0].Int())
+   }
 
 Review comment:
   I like this style, since the comma operator helps keep it compact.
   
   Joe made an offline comment that every usage of the unsafe package should be 
accompanied by a unit test. I agree with that, and I'm looking at the remaining 
unsafe usages in the SDK right now. I'll make this tweak in that followup PR.
 



Issue Time Tracking
---

Worklog Id: (was: 297256)
Time Spent: 3h  (was: 2h 50m)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297248=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297248
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:10
Start Date: 19/Aug/19 16:10
Worklog Time Spent: 10m 
  Work Description: lostluck commented on issue #9362: [BEAM-7994] Fixing 
unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#issuecomment-522645991
 
 
   I have no additional comments. The discussion here was excellent, and I'm 
glad we'll be able to transition to go1.13 relatively easily. (Beam's Go Module 
status notwithstanding...)
 



Issue Time Tracking
---

Worklog Id: (was: 297248)
Time Spent: 2h 50m  (was: 2h 40m)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297241=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297241
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 16:08
Start Date: 19/Aug/19 16:08
Worklog Time Spent: 10m 
  Work Description: lostluck commented on pull request #9362: [BEAM-7994] 
Fixing unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#discussion_r315292601
 
 

 ##
 File path: sdks/go/pkg/beam/core/util/reflectx/functions_test.go
 ##
 @@ -0,0 +1,43 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package reflectx
+
+import (
+   "reflect"
+   "testing"
+)
+
+func testFunction() int {
+   return 42
+}
+
+func TestXxx(t *testing.T) {
+   val := reflect.ValueOf(testFunction)
+   fi := uintptr(val.Pointer())
+   typ := val.Type()
+
+   callable := LoadFunction(fi, typ)
+
+   cv := reflect.ValueOf(callable)
+   out := cv.Call(nil)
+   if len(out) != 1 {
+   t.Errorf("got %d return values, wanted 1.", len(out))
+   }
+   // TODO: check type?
+   if out[0].Int() != 42 {
+   t.Errorf("got %d, wanted 42", out[0].Int())
+   }
 
 Review comment:
   It's a bit more verbose but I've liked the following style since it avoids 
comparing then printing the wrong things accidentally.
   
   // Int() returns int64, so convert the expected value to match.
   if got, want := out[0].Int(), int64(testFunction()); got != want {
       t.Errorf("got %d, wanted %d", got, want)
   }
   
 



Issue Time Tracking
---

Worklog Id: (was: 297241)
Time Spent: 2h 40m  (was: 2.5h)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Created] (BEAM-8004) Flink Load tests are flaky

2019-08-19 Thread Lukasz Gajowy (Jira)
Lukasz Gajowy created BEAM-8004:
---

 Summary: Flink Load tests are flaky
 Key: BEAM-8004
 URL: https://issues.apache.org/jira/browse/BEAM-8004
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Lukasz Gajowy


https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_LoadTests_Python_Combine_Flink_Batch/
https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_LoadTests_Python_GBK_Flink_Batch/

The tests mostly fail (they occasionally succeed) due to issues with the 
Dataproc cluster. The error: 


{code:java}
 root: DEBUG: java.net.UnknownHostException: beam-loadtests-python-gbk-flink-batch-68-w-10.c.apache-beam-testing.internal
13:35:25  at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
13:35:25  at java.net.InetAddress.getAllByName(InetAddress.java:1193)
13:35:25  at java.net.InetAddress.getAllByName(InetAddress.java:1127)
13:35:25  at java.net.InetAddress.getByName(InetAddress.java:1077)
13:35:25  at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:167)
13:35:25  at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:133)
13:35:25  at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:89)
13:35:25  at org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:159)
13:35:25  at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:185)
13:35:25  at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:158)
13:35:25  at org.apache.flink.client.RemoteExecutor.start(RemoteExecutor.java:152)
13:35:25  at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:202)
13:35:25  at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187)
13:35:25  at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173)
13:35:25  at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:200)
13:35:25  at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:92)
13:35:25  at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:68)
13:35:25  at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78)
13:35:25  at
{code}
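
The UnknownHostException above means the worker hostname could not be resolved at the time the job was submitted. One way to tell a transient DNS problem from a missing worker is to retry resolution a few times before submitting. The following is a minimal Python sketch under that assumption; `resolve_with_retry` is a hypothetical helper and is not part of the Beam, Flink, or Dataproc tooling.

```python
import socket
import time


def resolve_with_retry(host, attempts=3, delay=1.0,
                       resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve `host`, retrying on DNS failure.

    Returns the sorted list of resolved addresses, or re-raises the last
    socket.gaierror if every attempt fails. `resolver` and `sleep` are
    injectable so the function can be exercised without a network.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            infos = resolver(host, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address string.
            return sorted({info[4][0] for info in infos})
        except socket.gaierror as err:
            last_error = err
            if attempt < attempts:
                sleep(delay)
    raise last_error


if __name__ == "__main__":
    # "localhost" is used here only so the example runs anywhere.
    print(resolve_with_retry("localhost", attempts=2, delay=0.1))
```

If resolution succeeds only after several attempts, the failure is more likely a timing issue during cluster start-up than a permanently missing host.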





[jira] [Updated] (BEAM-8004) Flink Load tests are flaky

2019-08-19 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy updated BEAM-8004:

Status: Open  (was: Triage Needed)

> Flink Load tests are flaky
> --
>
> Key: BEAM-8004
> URL: https://issues.apache.org/jira/browse/BEAM-8004
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Lukasz Gajowy
>Priority: Critical
>
> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_LoadTests_Python_Combine_Flink_Batch/
> https://builds.apache.org/view/A-D/view/Beam/view/All/job/beam_LoadTests_Python_GBK_Flink_Batch/
> The tests mostly fail (they occasionally succeed) due to issues with the 
> Dataproc cluster. The error: 
> {code:java}
>  root: DEBUG: java.net.UnknownHostException: beam-loadtests-python-gbk-flink-batch-68-w-10.c.apache-beam-testing.internal
> 13:35:25  at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
> 13:35:25  at java.net.InetAddress.getAllByName(InetAddress.java:1193)
> 13:35:25  at java.net.InetAddress.getAllByName(InetAddress.java:1127)
> 13:35:25  at java.net.InetAddress.getByName(InetAddress.java:1077)
> 13:35:25  at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:167)
> 13:35:25  at org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils.getRpcUrl(AkkaRpcServiceUtils.java:133)
> 13:35:25  at org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:89)
> 13:35:25  at org.apache.flink.client.program.ClusterClient.<init>(ClusterClient.java:159)
> 13:35:25  at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:185)
> 13:35:25  at org.apache.flink.client.program.rest.RestClusterClient.<init>(RestClusterClient.java:158)
> 13:35:25  at org.apache.flink.client.RemoteExecutor.start(RemoteExecutor.java:152)
> 13:35:25  at org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:202)
> 13:35:25  at org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:187)
> 13:35:25  at org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173)
> 13:35:25  at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator$BatchTranslationContext.execute(FlinkBatchPortablePipelineTranslator.java:200)
> 13:35:25  at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:92)
> 13:35:25  at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:68)
> 13:35:25  at org.apache.beam.runners.fnexecution.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:78)
> 13:35:25  at
> {code}





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297224=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297224
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 15:48
Start Date: 19/Aug/19 15:48
Worklog Time Spent: 10m 
  Work Description: wcn3 commented on issue #9362: [BEAM-7994] Fixing 
unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#issuecomment-522637147
 
 
   This was a problem revealed in a beta test in an upcoming version of the Go
   SDK. I am not sure Beam wants to set up test coverage for beta SDKs. In
   general, Go has good backwards compatibility guarantees so testing version
   compatibility of older SDKs hasn’t been needed.
   
   It is entirely possible to set up the sort of testing you describe, but it
   would be of limited use, particularly with regards to this issue. In this
   case, I believe the Go community process worked as intended: as a precursor
   to the Go team releasing betas, they check it against the Google code
   corpus. Since Beam is used by Google, Beam gets the benefit of this testing
   coverage. This PR came about precisely because of those tests.
   
   
   On Mon, Aug 19, 2019 at 8:38 AM Lukasz Cwik 
   wrote:
   
   > Similar to how we are testing different versions of Python 3.5, 3.6, 3.7
   > with the Beam SDK, is there a way to test different Go versions against the
   > Beam SDK to catch these kinds of things?
   
 



Issue Time Tracking
---

Worklog Id: (was: 297224)
Time Spent: 2.5h  (was: 2h 20m)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Commented] (BEAM-7937) Support Hadoop 3.x on Hadoop File System

2019-08-19 Thread Reenu Saluja (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16910490#comment-16910490
 ] 

Reenu Saluja commented on BEAM-7937:


To add more detail to the problem statement: we are already using 
hadoop-azure version 2.7. With this API we can download data from ADLS 
Gen 2 to shared storage, and the data is then transformed with the Beam API. 

What we are looking for is an API through which Beam code can connect directly to 
ADLS Gen 2. For example, to read data from Kafka there is the KafkaIO 
read() option: p.apply(KafkaIO.read()
Is there a similar function for Hadoop (ADLS Gen 2)?



> Support Hadoop 3.x on Hadoop File System
> 
>
> Key: BEAM-7937
> URL: https://issues.apache.org/jira/browse/BEAM-7937
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-hadoop-file-system
>Reporter: Reenu Saluja
>Priority: Major
>
> I am trying to get a Beam pipeline to read input from Azure Data Lake Storage Gen 
> 2. ADLS Gen 2 supports Hadoop 3.2+. I tried Apache Beam 2.8.1 and later 
> 2.14.0. I am getting the error below:
> Error: Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No 
> FileSystem for scheme "wasbs"





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297220
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 15:38
Start Date: 19/Aug/19 15:38
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9362: [BEAM-7994] Fixing 
unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362#issuecomment-522632690
 
 
   Similar to how we are testing different versions of Python 3.5, 3.6, 3.7 
with the Beam SDK, is there a way to test different Go versions against the 
Beam SDK to catch these kinds of things?
 



Issue Time Tracking
---

Worklog Id: (was: 297220)
Time Spent: 2h 20m  (was: 2h 10m)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Resolved] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread Luke Cwik (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-7994.
-
Fix Version/s: 2.16.0
   Resolution: Fixed

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.





[jira] [Work logged] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?focusedWorklogId=297219=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297219
 ]

ASF GitHub Bot logged work on BEAM-7994:


Author: ASF GitHub Bot
Created on: 19/Aug/19 15:37
Start Date: 19/Aug/19 15:37
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9362: [BEAM-7994] 
Fixing unsafe pointer usage for Go 1.13
URL: https://github.com/apache/beam/pull/9362
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 297219)
Time Spent: 2h 10m  (was: 2h)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7990) Add ability to read parquet files into PCollection

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7990:
---
Status: Open  (was: Triage Needed)

> Add ability to read parquet files into PCollection
> -
>
> Key: BEAM-7990
> URL: https://issues.apache.org/jira/browse/BEAM-7990
> Project: Beam
>  Issue Type: New Feature
>  Components: io-py-parquet
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7991) gradle cleanPython race

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7991:
---
Status: Open  (was: Triage Needed)

> gradle cleanPython race
> ---
>
> Key: BEAM-7991
> URL: https://issues.apache.org/jira/browse/BEAM-7991
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Udi Meiri
>Priority: Minor
>
> Under sdks/python run:
> {code}
> $ ../../gradlew setupVirtualenv
> $ ../../gradlew clean
> {code}
> With high probability, you should get errors about missing modules.
> Running the following instead gives no errors:
> {code}
> $ ../../gradlew setupVirtualenv
> $ ../../gradlew clean --no-parallel
> {code}
> But notice that setup.py is not called in the second example, meaning that 
> some other task has already wiped out the build/ directory and the 
> virtualenvs in it.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7992) Unhandled type_constraint in apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests.test_big_query_write_new_types

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7992:
---
Status: Open  (was: Triage Needed)

> Unhandled type_constraint in 
> apache_beam.io.gcp.bigquery_write_it_test.BigQueryWriteIntegrationTests.test_big_query_write_new_types
> ---
>
> Key: BEAM-7992
> URL: https://issues.apache.org/jira/browse/BEAM-7992
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>
> {code}
> root: DEBUG: Unhandled type_constraint: Union[]
> root: DEBUG: Unhandled type_constraint: Union[]
> root: DEBUG: Unhandled type_constraint: Any
> root: DEBUG: Unhandled type_constraint: Any
> {code}
> https://builds.apache.org/job/beam_PostCommit_Python37_PR/20/testReport/junit/apache_beam.io.gcp.bigquery_write_it_test/BigQueryWriteIntegrationTests/test_big_query_write_new_types/
> These log entries are from opcode.py's _unpack_lists.
> They might be pointing to a bug or missing feature.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7993) portable python precommit is flaky

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7993:
---
Status: Open  (was: Triage Needed)

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Reporter: Udi Meiri
>Priority: Major
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22  ... 3 more
> {code}
> https://builds.apache.org/job/beam_PreCommit_Portable_Python_Commit/5512/consoleFull



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7996) Add support for remaining data types in python RowCoder

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7996:
---
Status: Open  (was: Triage Needed)

> Add support for remaining data types in python RowCoder 
> 
>
> Key: BEAM-7996
> URL: https://issues.apache.org/jira/browse/BEAM-7996
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Brian Hulette
>Priority: Major
>
> In the initial [python RowCoder 
> implementation|https://github.com/apache/beam/pull/9188] we only added 
> support for the data types that already had coders in the Python SDK. We 
> should add support for the remaining data types that are not currently 
> supported:
> * INT8 (ByteCoder in Java)
> * INT16 (BigEndianShortCoder in Java)
> * FLOAT (FloatCoder in Java)
> * BOOLEAN (BooleanCoder in Java)
> * Map (MapCoder in Java)
> We might consider making those coders standard so they can be tested 
> independently from RowCoder in standard_coders.yaml. Or, if we don't do that 
> we should probably add a more robust testing framework for RowCoder itself, 
> because it will be challenging to test all of these types as part of the 
> RowCoder tests in standard_coders.yaml.
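
A minimal sketch of what byte-level encodings for four of the listed types could look like in Python. It assumes the Python side should match big-endian, fixed-width layouts (one signed byte for INT8, two bytes for INT16, four-byte IEEE 754 for FLOAT, a single 0/1 byte for BOOLEAN). The function names are hypothetical and are not part of the Beam SDK; Map is omitted because it depends on the key and value coders.

```python
import struct


def encode_int8(value: int) -> bytes:
    # Assumed layout: one signed byte.
    return struct.pack(">b", value)


def encode_int16(value: int) -> bytes:
    # Assumed layout: two bytes, big-endian, signed.
    return struct.pack(">h", value)


def decode_int16(data: bytes) -> int:
    return struct.unpack(">h", data)[0]


def encode_float(value: float) -> bytes:
    # Assumed layout: four bytes, big-endian IEEE 754 single precision.
    return struct.pack(">f", value)


def encode_boolean(value: bool) -> bytes:
    # Assumed layout: a single byte, 1 for True and 0 for False.
    return b"\x01" if value else b"\x00"
```

Round-tripping values through small helpers like these is one way to test each type independently of RowCoder.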



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7995) IllegalStateException: TimestampCombiner moved element from to earlier time in Python

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7995:
---
Status: Open  (was: Triage Needed)

> IllegalStateException: TimestampCombiner moved element from to earlier time 
> in Python
> -
>
> Key: BEAM-7995
> URL: https://issues.apache.org/jira/browse/BEAM-7995
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Hai Lu
>Assignee: Hai Lu
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I'm looking into a bug I found internally when using the Beam portable API 
> (Python) on our own Samza runner. 
>  
> The pipeline looks something like this:
>  
>     (p
>      | 'read' >> ReadFromKafka(cluster="tracking", topic="PageViewEvent")
>      | 'transform' >> beam.Map(lambda event: process_event(event))
>      | 'window' >> beam.WindowInto(FixedWindows(15))
>      | 'group' >> *beam.CombinePerKey(beam.combiners.CountCombineFn())*
>      ...
>  
> The problem comes from the combiners, which cause the following exception on 
> the Java side:
>  
> Caused by: java.lang.IllegalStateException: TimestampCombiner moved element 
> from 2019-08-15T03:34:*45.000*Z to earlier time 2019-08-15T03:34:*44.999*Z 
> for window [2019-08-15T03:34:30.000Z..2019-08-15T03:34:*45.000*Z)
>     at 
> org.apache.beam.runners.core.WatermarkHold.shift(WatermarkHold.java:117)
>     at 
> org.apache.beam.runners.core.WatermarkHold.addElementHold(WatermarkHold.java:154)
>     at 
> org.apache.beam.runners.core.WatermarkHold.addHolds(WatermarkHold.java:98)
>     at 
> org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:605)
>     at 
> org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:349)
>     at 
> org.apache.beam.runners.core.GroupAlsoByWindowViaWindowSetNewDoFn.processElement(GroupAlsoByWindowViaWindowSetNewDoFn.java:136)
>  
> The exception happens here 
> [https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/WatermarkHold.java#L116]
>  when we check the shifted timestamp to ensure it's not before the original timestamp.
>  
>     if (shifted.isBefore(timestamp)) {
>       throw new IllegalStateException(
>           String.format(
>               "TimestampCombiner moved element from %s to earlier time %s for 
> window %s",
>               BoundedWindow.formatTimestamp(timestamp),
>               BoundedWindow.formatTimestamp(shifted),
>               window));
>     }
>  
> As you can see from the exception, the "shifted" is "XXX 44.999" while the 
> "timestamp" is "XXX 45.000". The "44.999" is coming from 
> [TimestampCombiner.END_OF_WINDOW|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/TimestampCombiner.java#L116]:
>  
>     @Override
>     public Instant merge(BoundedWindow intoWindow, Iterable<? extends Instant> mergingTimestamps) {
>       return intoWindow.maxTimestamp();
>     }
>  
> where intoWindow.maxTimestamp() is:
>  
>   /** Returns the largest timestamp that can be included in this window. */
>   @Override
>   public Instant maxTimestamp() {
>     *// end not inclusive*
>     return *end.minus(1)*;
>   }
>  
> Hence, the "44.*999*". 
>  
> And the "45.000" comes from the Python side, where the combiner emits its 
> results in the pre-GBK operation: 
> [operations.py#PGBKCVOperation#output_key|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/operations.py#L889]
>  
>     if windows is 0:
>       self.output(_globally_windowed_value.with_value((key, value)))
>     else:
>       self.output(WindowedValue((key, value), *windows[0].end*, windows))
>  
> Here, when we generate the windowed value, the timestamp is set to the 
> window's exclusive end (45.000) rather than to the largest timestamp the 
> window can include (44.999).
>  
> Clearly the "end of window" definition is a bit inconsistent across Python 
> and Java. I have yet to try this on other runners, so I'm not sure whether 
> this is only an issue for our Samza runner. I tend to think this is a bug but 
> would like to confirm with you. If this has not been an issue for other 
> runners, what might I have done wrong?
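
The off-by-one described above can be reproduced without Beam. The sketch below is a plain-Python model, not Beam code: `max_timestamp` and `shift_to_end_of_window` are hypothetical helpers that mimic the half-open window convention (end exclusive, one millisecond resolution) and the check quoted from WatermarkHold.

```python
from datetime import datetime, timedelta, timezone

ONE_MILLI = timedelta(milliseconds=1)


def max_timestamp(window_end: datetime) -> datetime:
    # The window end is exclusive, so the largest included
    # timestamp is one millisecond earlier.
    return window_end - ONE_MILLI


def shift_to_end_of_window(timestamp: datetime, window_end: datetime) -> datetime:
    # Models an END_OF_WINDOW combiner followed by the runner's sanity check.
    shifted = max_timestamp(window_end)
    if shifted < timestamp:
        raise ValueError(
            "TimestampCombiner moved element from %s to earlier time %s"
            % (timestamp.isoformat(), shifted.isoformat())
        )
    return shifted


# Window [03:34:30.000, 03:34:45.000) from the report.
window_end = datetime(2019, 8, 15, 3, 34, 45, tzinfo=timezone.utc)
```

In this model, an element stamped with `max_timestamp(window_end)` passes the check, while an element stamped with `window_end` itself raises the error, which matches the 45.000 versus 44.999 mismatch in the stack trace.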



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7994) BEAM SDK has compatibility problems with go1.13

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7994:
---
Status: Open  (was: Triage Needed)

> BEAM SDK has compatibility problems with go1.13
> ---
>
> Key: BEAM-7994
> URL: https://issues.apache.org/jira/browse/BEAM-7994
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Bill Neubauer
>Assignee: Bill Neubauer
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> The Go team identified a problem in the Beam SDK that appears due to runtime 
> changes in Go 1.13, which is upcoming. There is a backwards compatible fix 
> the team recommended.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Work logged] (BEAM-5980) Add load tests for Core Apache Beam operations

2019-08-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5980?focusedWorklogId=297215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297215
 ]

ASF GitHub Bot logged work on BEAM-5980:


Author: ASF GitHub Bot
Created on: 19/Aug/19 15:28
Start Date: 19/Aug/19 15:28
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on pull request #9286: [BEAM-5980] 
Remove redundant combine tests
URL: https://github.com/apache/beam/pull/9286
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 297215)
Time Spent: 3h 20m  (was: 3h 10m)

> Add load tests for Core Apache Beam operations 
> ---
>
> Key: BEAM-5980
> URL: https://issues.apache.org/jira/browse/BEAM-5980
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Lukasz Gajowy
>Assignee: Lukasz Gajowy
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> This involves adding a suite of load tests described in this proposal: 
> [https://s.apache.org/load-test-basic-operations]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16910480#comment-16910480
 ] 

Ismaël Mejía commented on BEAM-7998:


[~pabloem] can you please take a look or assign to someone who can.

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Priority: Major
>
> Hi team,
> when I use MatchFiles with a wildcard on files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least twice).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> I don't know enough to say more than that. Sorry.
> Best regards
> Jerome



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7998:
---
Status: Open  (was: Triage Needed)

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Priority: Major
>
> Hi team,
> when I use MatchFiles with a wildcard on files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least twice).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> I don't know enough to say more than that. Sorry.
> Best regards
> Jerome



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-7998) MatchesFiles or MatchAll seems to return seveval time the same element

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7998:
---
Component/s: (was: beam-model)
 io-py-files

> MatchesFiles or MatchAll seems to return seveval time the same element
> --
>
> Key: BEAM-7998
> URL: https://issues.apache.org/jira/browse/BEAM-7998
> Project: Beam
>  Issue Type: Bug
>  Components: io-py-files
>Affects Versions: 2.14.0
> Environment: GCP for storage, DirectRunner and DataflowRunner both 
> have the problem. PyCharm on Win10 for IDE and dev environment.
>Reporter: Jerome MASSOT
>Priority: Major
>
> Hi team,
> when I use MatchFiles with a wildcard on files located in a GCP bucket, the 
> MatchFiles transform returns the same file several times (at least twice).
> I have tried to follow the stack, and I can see that MatchAll is called 
> twice when I run the pipeline on a debug project where a single element is 
> present in the bucket.
> I don't know enough to say more than that. Sorry.
> Best regards
> Jerome



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (BEAM-8000) Add Delete method to gRPC JobService

2019-08-19 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8000:
---
Status: Open  (was: Triage Needed)

> Add Delete method to gRPC JobService
> 
>
> Key: BEAM-8000
> URL: https://issues.apache.org/jira/browse/BEAM-8000
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model
>Reporter: Chad Dombrova
>Assignee: Chad Dombrova
>Priority: Major
>
> As a user of the InMemoryJobService, I want a method to purge jobs from 
> memory when they are no longer needed, so that the service does not balloon 
> in memory usage over time.
> I was planning to name this Delete.  Also considering the name Purge.
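
As a rough illustration of the request, the sketch below shows an in-memory job store with an explicit delete operation. The class and method names (`InMemoryJobRegistry`, `put`, `get`, `delete`) are hypothetical and chosen for this example only; they are not the Beam JobService API or its gRPC definition.

```python
from typing import Any, Dict, Optional


class InMemoryJobRegistry:
    """Hypothetical in-memory store that keeps jobs until they are deleted."""

    def __init__(self) -> None:
        self._jobs: Dict[str, Any] = {}

    def put(self, job_id: str, job: Any) -> None:
        self._jobs[job_id] = job

    def get(self, job_id: str) -> Optional[Any]:
        return self._jobs.get(job_id)

    def delete(self, job_id: str) -> bool:
        # Returns True if the job existed and was removed, False otherwise,
        # so repeated deletes are harmless.
        return self._jobs.pop(job_id, None) is not None

    def __len__(self) -> int:
        return len(self._jobs)
```

Without something like `delete`, the dictionary only ever grows, which is the memory growth the issue describes.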



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

