[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=243839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243839
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 17/May/19 05:37
Start Date: 17/May/19 05:37
Worklog Time Spent: 10m 
  Work Description: jbartok commented on issue #8592: [BEAM-7305] Improve 
and extend Hazelcast Jet based Java Runner
URL: https://github.com/apache/beam/pull/8592#issuecomment-493326374
 
 
   Hi @mxm. Yes, I was pondering yesterday whether I should build this pull 
request out of multiple change-sets or squash them down to a single one... I 
might not have made the best choice...
   
   The thing is that I'm dumping months of my work into these two change-sets, 
which is why it looks so non-incremental. The actual development has been done 
in https://github.com/hazelcast/hazelcast-jet-beam-runner, where there are 100+ 
commits (debugging proved simpler when working like this, and worth the effort 
of migrating later). Anyway, from now on the pace of development should be 
slower, and I will make it more incremental by issuing more frequent PRs.
   
   As far as reviews are concerned, they have been done to some degree on our 
module by my Hazelcast colleagues. Here we would need somebody who is both 
impartial to Hazelcast and knowledgeable about Jet, which might not be that 
simple to find. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 243839)
Time Spent: 2h 40m  (was: 2.5h)

> Add first version of Hazelcast Jet Runner
> -----------------------------------------
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
> Issue Type: New Feature
> Components: runner-jet
> Reporter: Maximilian Michels
> Assignee: Jozsef Bartok
> Priority: Major
> Fix For: 2.14.0
>
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-7190) enable file system based token authentication for portable runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7190?focusedWorklogId=243815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243815
 ]

ASF GitHub Bot logged work on BEAM-7190:


Author: ASF GitHub Bot
Created on: 17/May/19 03:48
Start Date: 17/May/19 03:48
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #8597: [BEAM-7190] Enable 
file based token auth for samza portable runner
URL: https://github.com/apache/beam/pull/8597#issuecomment-493309252
 
 
   - Can we make the token creation and authentication modular and pluggable, so 
that they can be added to other runners as well by setting a pipeline option?
   - We will also need a secure channel to encrypt the content before we can 
call it truly secure.
 



Issue Time Tracking
---

Worklog Id: (was: 243815)
Time Spent: 20m  (was: 10m)

> enable file system based token authentication for portable runner
> -----------------------------------------------------------------
>
> Key: BEAM-7190
> URL: https://issues.apache.org/jira/browse/BEAM-7190
> Project: Beam
> Issue Type: Task
> Components: runner-samza
> Reporter: Hai Lu
> Assignee: Hai Lu
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> For Samza and potentially other portable runners, there is a need to secure 
> the communication between sdk worker and runner. Currently the SSL/TLS in 
> portability is half done.
> However, after investigation we found that it's sufficient to just 1) use 
> loopback address 2) enforce authentication and that way the communication is 
> both authenticated and secured.
> This ticket intends to track the implementation of the solution above. More 
> details can be found in the following PR.





[jira] [Work logged] (BEAM-7190) enable file system based token authentication for portable runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7190?focusedWorklogId=243797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243797
 ]

ASF GitHub Bot logged work on BEAM-7190:


Author: ASF GitHub Bot
Created on: 17/May/19 02:56
Start Date: 17/May/19 02:56
Worklog Time Spent: 10m 
  Work Description: lhaiesp commented on pull request #8597: [BEAM-7190] 
Enable file based token auth for samza portable runner
URL: https://github.com/apache/beam/pull/8597
 
 
   For Samza and potentially other portable runners that do not use Docker and 
need to run in a multi-tenant environment, there is a need to secure the 
communication between the SDK worker and the runner. Currently the SSL/TLS 
support in portability is half done.
   
   However, after investigation we found that it's sufficient to just
   1. Use the loopback address, so that the traffic is not exposed to the 
external network.
   2. Enforce authentication, so that only valid users can connect to the 
ports.
   
   With the two steps above, it won't be necessary to enable TLS, because the 
data channel is only local and one needs root privilege to eavesdrop on local 
traffic.
   
   A trivial way to do authentication is to share a secret token through the 
file system (e.g. set the file permission to 600, i.e. -rw-------). Next we 
introduce a customized interceptor for both the gRPC client and server to 
provide and verify this token (see GrpcFileTokenAuthProvider.java and 
token_auth_interceptor.py). The server can then deny any connection attempts 
that do not have the right token.
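
The file-based token scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code from GrpcFileTokenAuthProvider.java or token_auth_interceptor.py: the helper names `write_token`, `read_token` and `is_authorized` are hypothetical, the gRPC wiring is left out, and a POSIX file system is assumed so that mode 600 restricts the file to its owner.

```python
import hmac
import os
import secrets


def write_token(path):
    """Create a random token in a new file readable only by its owner.

    Hypothetical helper for illustration; not part of Beam.
    """
    token = secrets.token_hex(32)
    # O_EXCL makes creation fail if the file already exists;
    # mode 0o600 corresponds to -rw-------.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
    return token


def read_token(path):
    """Read the shared token back, as the other process would."""
    with open(path) as f:
        return f.read().strip()


def is_authorized(presented, expected):
    """Compare tokens in constant time to avoid leaking a prefix match."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

A server-side interceptor could call `is_authorized` on the token taken from the request metadata and reject the call when it returns False. `hmac.compare_digest` is used so that the comparison time does not depend on where the two strings differ.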
   
   
   

[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243779
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:25
Start Date: 17/May/19 01:25
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284951313
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+    return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
 
 Review comment:
   
https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection
 gives a good guideline on this topic.
 



Issue Time Tracking
---

Worklog Id: (was: 243779)
Time Spent: 7.5h  (was: 7h 20m)

> TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: niklas Hansson
> Priority: Major
> Time Spent: 7.5h
> Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py", line 53, in test_non_function
>     result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 510, in __ror__
>     result = p.apply(self, pvalueish, label)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py", line 514, in apply
>     transform.type_check_inputs(pvalueish)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py", line 753, in type_check_inputs
>     hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>   File "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py", line 283, in getcallargs_forhints
>     raise TypeCheckError(e)
> apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required positional argument: 'chars'
> {noformat}





[jira] [Updated] (BEAM-7347) beam_Performance failed with benchmark flag config error

2019-05-16 Thread Mark Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-7347:
---
Description: 
[All|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/] 
performance benchmarks are affected.

Error log from [latest beam_PerformanceTests_TextIOIT 
run|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_TextIOIT/2008/console]:

{code}
00:00:24.372 2019-05-17 00:21:24,724 5d6e9583 MainThread beam_integration_benchmark(1/1) ERROR Error during benchmark beam_integration_benchmark
00:00:24.372 Traceback (most recent call last):
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 752, in RunBenchmark
00:00:24.372     DoProvisionPhase(spec, detailed_timer)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 538, in DoProvisionPhase
00:00:24.372     spec.ConstructDpbService()
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 209, in ConstructDpbService
00:00:24.372     self.dpb_service = dpb_service_class(self.config.dpb_service)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/providers/gcp/gcp_dpb_dataflow.py", line 53, in __init__
00:00:24.372     super(GcpDpbDataflow, self).__init__(dpb_service_spec)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/dpb_service.py", line 127, in __init__
00:00:24.372     'The flag dpb_service_zone must be provided, for provisioning.')
00:00:24.372 InvalidFlagConfigurationError: The flag dpb_service_zone must be provided, for provisioning.
{code}

It seems that some change in 
[PerfKitBenchmarker|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker] 
broke our 
[beam_integration_benchmark|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/master/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py].
However, we may be able to apply a quick fix on the Beam side.

  was:
[All|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/] 
performance benchmarks are affected.

Error log from [latest beam_PerformanceTests_TextIOIT 
run|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_TextIOIT/2008/console]:

{code}
00:00:24.372 2019-05-17 00:21:24,724 5d6e9583 MainThread beam_integration_benchmark(1/1) ERROR Error during benchmark beam_integration_benchmark
00:00:24.372 Traceback (most recent call last):
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 752, in RunBenchmark
00:00:24.372     DoProvisionPhase(spec, detailed_timer)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 538, in DoProvisionPhase
00:00:24.372     spec.ConstructDpbService()
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 209, in ConstructDpbService
00:00:24.372     self.dpb_service = dpb_service_class(self.config.dpb_service)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/providers/gcp/gcp_dpb_dataflow.py", line 53, in __init__
00:00:24.372     super(GcpDpbDataflow, self).__init__(dpb_service_spec)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/dpb_service.py", line 127, in __init__
00:00:24.372     'The flag dpb_service_zone must be provided, for provisioning.')
00:00:24.372 InvalidFlagConfigurationError: The flag dpb_service_zone must be provided, for provisioning.
{code}

It seems that some change in 
[PerfKitBenchmarker|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker] 
broke our 
[beam_integration_benchmark|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/master/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py].
However, we may be able to apply a quick fix on our side.


> beam_Performance failed with benchmark flag config error
> --------------------------------------------------------
>
> Key: BEAM-7347
> URL: https://issues.apache.org/jira/browse/BEAM-7347
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Mark Liu
> Priority: Major
>
> [All|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/] 
> performance 

[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243774&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243774
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:10
Start Date: 17/May/19 01:10
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8590: [BEAM-6988] Implement a 
Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#issuecomment-493282684
 
 
   run python postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 243774)
Time Spent: 7h 20m  (was: 7h 10m)

> TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: niklas Hansson
> Priority: Major
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>





[jira] [Updated] (BEAM-7347) beam_Performance failed with benchmark flag config error

2019-05-16 Thread Mark Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu updated BEAM-7347:
---
Description: 
[All|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/] 
performance benchmarks are affected.

Error log from [latest beam_PerformanceTests_TextIOIT 
run|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_TextIOIT/2008/console]:

{code}
00:00:24.372 2019-05-17 00:21:24,724 5d6e9583 MainThread beam_integration_benchmark(1/1) ERROR Error during benchmark beam_integration_benchmark
00:00:24.372 Traceback (most recent call last):
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 752, in RunBenchmark
00:00:24.372     DoProvisionPhase(spec, detailed_timer)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 538, in DoProvisionPhase
00:00:24.372     spec.ConstructDpbService()
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 209, in ConstructDpbService
00:00:24.372     self.dpb_service = dpb_service_class(self.config.dpb_service)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/providers/gcp/gcp_dpb_dataflow.py", line 53, in __init__
00:00:24.372     super(GcpDpbDataflow, self).__init__(dpb_service_spec)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/dpb_service.py", line 127, in __init__
00:00:24.372     'The flag dpb_service_zone must be provided, for provisioning.')
00:00:24.372 InvalidFlagConfigurationError: The flag dpb_service_zone must be provided, for provisioning.
{code}

It seems that some change in 
[PerfKitBenchmarker|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker] 
broke our 
[beam_integration_benchmark|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/master/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py].
However, we may be able to apply a quick fix on our side.

  was:
All performance benchmarks are affected.

Error log from [latest beam_PerformanceTests_TextIOIT 
run|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_TextIOIT/2008/console]:

{code}
00:00:24.372 2019-05-17 00:21:24,724 5d6e9583 MainThread beam_integration_benchmark(1/1) ERROR Error during benchmark beam_integration_benchmark
00:00:24.372 Traceback (most recent call last):
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 752, in RunBenchmark
00:00:24.372     DoProvisionPhase(spec, detailed_timer)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 538, in DoProvisionPhase
00:00:24.372     spec.ConstructDpbService()
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 209, in ConstructDpbService
00:00:24.372     self.dpb_service = dpb_service_class(self.config.dpb_service)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/providers/gcp/gcp_dpb_dataflow.py", line 53, in __init__
00:00:24.372     super(GcpDpbDataflow, self).__init__(dpb_service_spec)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/dpb_service.py", line 127, in __init__
00:00:24.372     'The flag dpb_service_zone must be provided, for provisioning.')
00:00:24.372 InvalidFlagConfigurationError: The flag dpb_service_zone must be provided, for provisioning.
{code}

It seems that some change in 
[PerfKitBenchmarker|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker] 
broke our 
[beam_integration_benchmark|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/master/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py].
However, we may be able to apply a quick fix on our side.


> beam_Performance failed with benchmark flag config error
> --------------------------------------------------------
>
> Key: BEAM-7347
> URL: https://issues.apache.org/jira/browse/BEAM-7347
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Mark Liu
> Priority: Major
>
> [All|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/] 
> performance benchmarks are affected.
> Error log from [latest 

[jira] [Created] (BEAM-7347) beam_Performance failed with benchmark flag config error

2019-05-16 Thread Mark Liu (JIRA)
Mark Liu created BEAM-7347:
--

 Summary: beam_Performance failed with benchmark flag config error
 Key: BEAM-7347
 URL: https://issues.apache.org/jira/browse/BEAM-7347
 Project: Beam
  Issue Type: Bug
  Components: test-failures
Reporter: Mark Liu


All performance benchmarks are affected.

Error log from [latest beam_PerformanceTests_TextIOIT 
run|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_TextIOIT/2008/console]:

{code}
00:00:24.372 2019-05-17 00:21:24,724 5d6e9583 MainThread beam_integration_benchmark(1/1) ERROR Error during benchmark beam_integration_benchmark
00:00:24.372 Traceback (most recent call last):
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 752, in RunBenchmark
00:00:24.372     DoProvisionPhase(spec, detailed_timer)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/pkb.py", line 538, in DoProvisionPhase
00:00:24.372     spec.ConstructDpbService()
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/benchmark_spec.py", line 209, in ConstructDpbService
00:00:24.372     self.dpb_service = dpb_service_class(self.config.dpb_service)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/providers/gcp/gcp_dpb_dataflow.py", line 53, in __init__
00:00:24.372     super(GcpDpbDataflow, self).__init__(dpb_service_spec)
00:00:24.372   File "/home/jenkins/jenkins-slave/workspace/beam_PerformanceTests_TextIOIT/PerfKitBenchmarker/perfkitbenchmarker/dpb_service.py", line 127, in __init__
00:00:24.372     'The flag dpb_service_zone must be provided, for provisioning.')
00:00:24.372 InvalidFlagConfigurationError: The flag dpb_service_zone must be provided, for provisioning.
{code}

It seems that some change in 
[PerfKitBenchmarker|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker] 
broke our 
[beam_integration_benchmark|https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/master/perfkitbenchmarker/linux_benchmarks/beam_integration_benchmark.py].
However, we may be able to apply a quick fix on our side.





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243773&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243773
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:06
Start Date: 17/May/19 01:06
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284948765
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+    return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
 
 Review comment:
   Perhaps it's better to avoid hard coding Python versions; rely on the 
existence of methods/attributes/etc. or behavior instead.
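
One way to follow this suggestion, sketched under assumptions: pick the introspection function by checking whether it exists rather than by comparing `sys.version_info`. The helper name `get_argspec` is hypothetical and is not the code in the pull request. It relies only on the fact that `inspect.getfullargspec` exists on Python 3, while Python 2 offers `inspect.getargspec` instead.

```python
import inspect


def get_argspec(func):
    """Return an argspec using whichever inspect API this interpreter has.

    Hypothetical helper for illustration; not taken from the Beam code base.
    """
    # Feature detection: prefer getfullargspec when it is available.
    # The fallback attribute is only looked up when the first one is missing,
    # so this also works on interpreters that no longer ship getargspec.
    getter = getattr(inspect, "getfullargspec", None) or inspect.getargspec
    return getter(func)


def example(a, b=1, *rest):
    return a, b, rest


if __name__ == "__main__":
    spec = get_argspec(example)
    print(spec.args, spec.varargs, spec.defaults)
```

The same pattern applies to any API that differs between interpreter versions: probe for the attribute once and keep the rest of the code free of version numbers.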
 



Issue Time Tracking
---

Worklog Id: (was: 243773)
Time Spent: 7h 10m  (was: 7h)

> TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-py-core
> Reporter: Robbe
> Assignee: niklas Hansson
> Priority: Major
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243767&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243767
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:04
Start Date: 17/May/19 01:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284926095
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+    return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
 
 Review comment:
   Re: checking version within the code, I don't believe we have a guideline. 
Use your judgement. In this case, I would check if this could hurt performance.
   
   I've merged py2 and py3 union matching into one function and added 
3.5.2-specific support (what we use on Jenkins).
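One way to address the performance concern, sketched here as an assumption rather than as what the PR does: pick the implementation once at import time, so callers pay no per-call version check. The names `_describe_py2`, `_describe_py3` and `describe_callable` are hypothetical stand-ins, not Beam functions.

```python
import inspect
import sys


def _describe_py2(func):
  # Python 2 path: only the legacy argspec API is available there.
  return list(inspect.getargspec(func).args)


def _describe_py3(func):
  # Python 3 path: inspect.signature handles more kinds of callables.
  return list(inspect.signature(func).parameters)


# The version check runs once, when the module is imported.
if sys.version_info.major < 3:
  describe_callable = _describe_py2
else:
  describe_callable = _describe_py3


def example(a, b=1):
  return a + b


print(describe_callable(example))
```

Whether this matters in practice depends on how often the function is called; a single integer comparison per call is usually cheap.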
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 243767)
Time Spent: 6.5h  (was: 6h 20m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243768=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243768
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:04
Start Date: 17/May/19 01:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284926464
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -314,10 +325,40 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
   return callargs
 
 
+def getcallargs_forhints_impl_py3(func, packed_typeargs, typekwargs):
+  try:
+    # TODO(udim): Function signature returned by getfullargspec (in
+    #  packed_typeargs) might differ from the one below. Migrate to use
+    #  inspect.signature in getfullargspec (for Py3).
+    signature = inspect.signature(func)
+  except ValueError as e:
+    logger.warning('Could not get signature for function: %s: %s', func, e)
+    return {}
+  try:
+    bindings = signature.bind(*packed_typeargs, **typekwargs)
+  except TypeError as e:
+    # Might be raised due to too few or too many arguments.
+    raise TypeCheckError(e)
+  bound_args = bindings.arguments
+  missing = []
+  for param in signature.parameters.values():
+    if param.kind == inspect.Parameter.VAR_POSITIONAL:
+      bound_args[param.name] = typehints.Tuple[typehints.Any, ...]
+    elif param.kind == inspect.Parameter.VAR_KEYWORD:
+      bound_args[param.name] = typehints.Dict[typehints.Any, typehints.Any]
+    elif param.name not in bound_args and param.default is not param.empty:
+      # Declare unbound parameters with defaults to be Any.
+      bound_args[param.name] = typehints.Any
+
+  if missing:
 
 Review comment:
   nope, see above
 



Issue Time Tracking
---

Worklog Id: (was: 243768)
Time Spent: 6h 40m  (was: 6.5h)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243766=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243766
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:04
Start Date: 17/May/19 01:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284926389
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -314,10 +325,40 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
   return callargs
 
 
+def getcallargs_forhints_impl_py3(func, packed_typeargs, typekwargs):
+  try:
+    # TODO(udim): Function signature returned by getfullargspec (in
+    #  packed_typeargs) might differ from the one below. Migrate to use
+    #  inspect.signature in getfullargspec (for Py3).
+    signature = inspect.signature(func)
+  except ValueError as e:
+    logger.warning('Could not get signature for function: %s: %s', func, e)
+    return {}
+  try:
+    bindings = signature.bind(*packed_typeargs, **typekwargs)
+  except TypeError as e:
+    # Might be raised due to too few or too many arguments.
+    raise TypeCheckError(e)
+  bound_args = bindings.arguments
+  missing = []
 
 Review comment:
   Good catch! That was a leftover. `signature.bind` should check for missing 
arguments.
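A small illustration of the point about `signature.bind`, independent of Beam: binding too few arguments raises `TypeError` on its own, so a separate `missing` list is not needed. The helper `try_bind` and the function `strip_chars` are made up for this sketch.

```python
import inspect


def strip_chars(value, chars):
  return value.strip(chars)


def try_bind(func, *args, **kwargs):
  """Returns the bound arguments, or the error message if binding fails."""
  signature = inspect.signature(func)
  try:
    return dict(signature.bind(*args, **kwargs).arguments)
  except TypeError as e:
    # Raised for too few or too many arguments.
    return str(e)


print(try_bind(strip_chars, str, str))  # both parameters are bound
print(try_bind(strip_chars, str))       # 'chars' is missing, so an error message
```

Type hints are passed in place of real values here, mirroring how the PR binds hints rather than arguments.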
 



Issue Time Tracking
---

Worklog Id: (was: 243766)
Time Spent: 6h 20m  (was: 6h 10m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243770=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243770
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:04
Start Date: 17/May/19 01:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284945686
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+    return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
+                                         typekwargs)
+  else:
+    return getcallargs_forhints_impl_py3(func, packed_typeargs, typekwargs)
+
+
+def getcallargs_forhints_impl_py2(func, argspec, packed_typeargs, typekwargs):
 
 Review comment:
   Removed it from almost everywhere.
 



Issue Time Tracking
---

Worklog Id: (was: 243770)
Time Spent: 7h  (was: 6h 50m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>


[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243769=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243769
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 17/May/19 01:04
Start Date: 17/May/19 01:04
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284946031
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -144,11 +154,17 @@ def convert_to_beam_type(typ):
   match=_match_issubclass(typing.Tuple),
   arity=-1,
   beam_type=typehints.Tuple),
-  _TypeMapEntry(
-  match=_match_same_type(typing.Union),
-  arity=-1,
-  beam_type=typehints.Union)
   ]
+  if sys.version_info.major >= 3:
+    type_map.append(
+        _TypeMapEntry(
+            match=_match_is_union_py3, arity=-1, beam_type=typehints.Union))
 
 Review comment:
   I made a py2and3 `_match_is_union` function. There are some differences 
between it and #8453.
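For readers unfamiliar with the problem: on recent Python 3 versions `typing.Union[...]` is not a class, so class-based matching does not apply and a dedicated predicate is needed. The sketch below is not the `_match_is_union` from this PR; it is a hypothetical `is_union` that assumes Python 3.8+ (for `typing.get_origin`) and does not cover the Python 2 or 3.5.2 cases the PR handles.

```python
import typing


def is_union(typ):
  # get_origin returns typing.Union for Union[...] and Optional[...] hints,
  # and something else (or None) for other types.
  return typing.get_origin(typ) is typing.Union


print(is_union(typing.Union[int, str]))
print(is_union(typing.Optional[int]))
print(is_union(typing.List[int]))
print(is_union(int))
```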
 



Issue Time Tracking
---

Worklog Id: (was: 243769)
Time Spent: 6h 50m  (was: 6h 40m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>


[jira] [Commented] (BEAM-7135) Spark executable stage: Job bundle factory is not being closed

2019-05-16 Thread Kyle Weaver (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841836#comment-16841836
 ] 

Kyle Weaver commented on BEAM-7135:
---

It also doesn't help that I'm logging every exception that's going to be 
printed out later anyway.

[https://github.com/apache/beam/blob/8821ed8c3f6b5f4d16abf98d17910cc4a9ba8720/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkExecutableStageFunction.java#L144]

> Spark executable stage: Job bundle factory is not being closed
> --
>
> Key: BEAM-7135
> URL: https://issues.apache.org/jira/browse/BEAM-7135
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> JobBundleFactory is being created, but never closed: 
> [https://github.com/apache/beam/blob/a91516cd10d382d1c8a42f3e3b373fbad46369f6/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkExecutableStageFunction.java#L111]





[jira] [Commented] (BEAM-7135) Spark executable stage: Job bundle factory is not being closed

2019-05-16 Thread Kyle Weaver (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841828#comment-16841828
 ] 

Kyle Weaver commented on BEAM-7135:
---

This issue creates a huge number of errors that often end up overwriting 
whatever logs preceded them.

2019-05-16 17:32:19,506 [grpc-default-executor-30] ERROR 
org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ManagedChannelOrphanWrapper
 - *~*~*~ Channel ManagedChannelImpl\{logId=11064, 
target=directaddress:///org.apache.beam.vendor.grpc.v1p13p1.io.grpc.inprocess.InProcessSocketAddress@3d10bf0b}
 was not shutdown properly!!! ~*~*~*

> Spark executable stage: Job bundle factory is not being closed
> --
>
> Key: BEAM-7135
> URL: https://issues.apache.org/jira/browse/BEAM-7135
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>
> JobBundleFactory is being created, but never closed: 
> [https://github.com/apache/beam/blob/a91516cd10d382d1c8a42f3e3b373fbad46369f6/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkExecutableStageFunction.java#L111]





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=243762=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243762
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 17/May/19 00:31
Start Date: 17/May/19 00:31
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8518: 
[BEAM-6908] Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 243762)
Time Spent: 15h  (was: 14h 50m)

> Add Python3 performance benchmarks
> --
>
> Key: BEAM-6908
> URL: https://issues.apache.org/jira/browse/BEAM-6908
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 15h
>  Remaining Estimate: 0h
>
> Similar to 
> [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/],
>  we want to have a Python3 benchmark running on Jenkins to detect performance 
> regression during code adoption.





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=243760=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243760
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 17/May/19 00:30
Start Date: 17/May/19 00:30
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8518: [BEAM-6908] 
Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#issuecomment-493276171
 
 
   @tvalentyn I can add a link to the BigQuery table and dashboard in the Beam 
docs. I also fixed the comment for `beam_prebuilt`.
   
   Synced with @manisha252 offline and got approval for this change.
 



Issue Time Tracking
---

Worklog Id: (was: 243760)
Time Spent: 14h 50m  (was: 14h 40m)

> Add Python3 performance benchmarks
> --
>
> Key: BEAM-6908
> URL: https://issues.apache.org/jira/browse/BEAM-6908
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> Similar to 
> [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/],
>  we want to have a Python3 benchmark running on Jenkins to detect performance 
> regression during code adoption.





[jira] [Commented] (BEAM-7271) Adding StringUtf8Coder to ModelCoder in JavaSDK [REOPENED]

2019-05-16 Thread Heejong Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841827#comment-16841827
 ] 

Heejong Lee commented on BEAM-7271:
---

Yes. I think we still need to backport 
[https://github.com/apache/beam/pull/8575]. The commit related to `adding 
StringUtf8Coder to ModelCoder` on the 2.13.0 branch is incomplete: the branch 
has only BEAM-7008, not BEAM-7260 or the additional fix in BEAM-7271.

> Adding StringUtf8Coder to ModelCoder in JavaSDK [REOPENED]
> --
>
> Key: BEAM-7271
> URL: https://issues.apache.org/jira/browse/BEAM-7271
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Reopened because the previous commit was reverted.





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243747=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243747
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 17/May/19 00:10
Start Date: 17/May/19 00:10
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #8596: [BEAM-7339] 
Make input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 243747)
Time Spent: 1h 10m  (was: 1h)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks
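The requirements above could translate into pipeline arguments roughly as follows. This is a sketch: `--runner`, `--input`, `--num_workers` and `--autoscaling_algorithm` are standard Beam/Dataflow options, but the worker count of 10 and the helper name `wordcount_benchmark_args` are placeholders, not values taken from the issue or from the PR.

```python
def wordcount_benchmark_args(num_workers=10):
  # num_workers=10 is an illustrative placeholder, not a value from the issue.
  return [
      '--runner=TestDataflowRunner',
      '--input=gs://apache-beam-samples/input_small_files/*',
      '--num_workers=%d' % num_workers,
      # NONE disables Dataflow autoscaling so the worker count stays fixed.
      '--autoscaling_algorithm=NONE',
  ]


print(' '.join(wordcount_benchmark_args()))
```

The same argument list would be reused for the py2 and py3 runs, with only the interpreter differing.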





[jira] [Commented] (BEAM-7326) Document that Beam BigQuery IO expects users to pass base64-encoded bytes, and BQ IO serves base64-encoded bytes to the user.

2019-05-16 Thread Robert Burke (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841822#comment-16841822
 ] 

Robert Burke commented on BEAM-7326:


The BigQuery Go package (not Beam's IO) doesn't mention base64 at all. I 
believe that it handles that by itself usually, and treats them as opaque 
blobs. In particular, it's handled by the JSON encoding of the values, which 
automatically base64 encodes bytes.


See [https://godoc.org/cloud.google.com/go/bigquery] and 
[https://godoc.org/encoding/json#Marshal]


In other words, in Go, it's a BigQuery implementation detail that is hidden from 
users, unless they configure things to change it.
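
By contrast, the issue description says the Python and Java connectors leave this step to the user. A minimal Python sketch of that round trip, using only the standard library (the field name `payload` and the helper names are invented for illustration; no BigQuery IO call is shown):

```python
import base64
import json


def encode_bytes_field(raw):
  # JSON has no bytes type, so BYTES values travel as base64 text.
  return base64.b64encode(raw).decode('ascii')


def decode_bytes_field(text):
  return base64.b64decode(text)


original = b'\x00\xffbeam'
row = {'payload': encode_bytes_field(original)}
serialized = json.dumps(row)
restored = decode_bytes_field(json.loads(serialized)['payload'])
print(serialized, restored == original)
```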

> Document that Beam BigQuery IO expects users to pass base64-encoded bytes, 
> and BQ IO serves base64-encoded bytes to the user.
> -
>
> Key: BEAM-7326
> URL: https://issues.apache.org/jira/browse/BEAM-7326
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, io-python-gcp
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> BYTES is one of the Datatypes supported by Google Cloud BigQuery, and Apache 
> Beam BigQuery IO connector.
> Current implementation of BigQuery connector in Java and Python SDKs expects 
> that users base64-encode bytes before passing them to BigQuery IO, see 
> discussion on dev: [1] 
> This needs to be reflected in public documentation, see [2-4]
> cc: [~juta] [~chamikara] [~pabloem] 
> cc: [~lostluck] [~kedin] FYI and to advise whether similar action needs to be 
> done for Go SDK and/or Beam SQL.
> [1] 
> https://lists.apache.org/thread.html/f35c836887014e059527ed1a806e730321e2f9726164a3030575f455@%3Cdev.beam.apache.org%3E
> [2] https://beam.apache.org/documentation/io/built-in/google-bigquery/
> [3] 
> https://beam.apache.org/releases/pydoc/2.12.0/apache_beam.io.gcp.bigquery.html
> [4] 
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.html





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=243746=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243746
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 17/May/19 00:05
Start Date: 17/May/19 00:05
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8518: [BEAM-6908] 
Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#issuecomment-493271799
 
 
   Run Python35 WordCountIT Performance Test
 



Issue Time Tracking
---

Worklog Id: (was: 243746)
Time Spent: 14h 40m  (was: 14.5h)

> Add Python3 performance benchmarks
> --
>
> Key: BEAM-6908
> URL: https://issues.apache.org/jira/browse/BEAM-6908
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> Similar to 
> [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/],
>  we want to have a Python3 benchmark running on Jenkins to detect performance 
> regression during code adoption.





[jira] [Assigned] (BEAM-7342) Extend SyntheticPipeline map steps to be able to be splittable (Beam Python SDK)

2019-05-16 Thread Lara Schmidt (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lara Schmidt reassigned BEAM-7342:
--

Assignee: Lara Schmidt

> Extend SyntheticPipeline map steps to be able to be splittable (Beam Python 
> SDK)
> 
>
> Key: BEAM-7342
> URL: https://issues.apache.org/jira/browse/BEAM-7342
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
> Environment: Beam Python
>Reporter: Lara Schmidt
>Assignee: Lara Schmidt
>Priority: Minor
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Add the ability for map steps to be configured to be splittable. 
> Possible configuration options:
>  - uneven bundle sizes
>  - possible incorrect sizing returned





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243727=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243727
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 23:35
Start Date: 16/May/19 23:35
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493266079
 
 
   run python precommit
 



Issue Time Tracking
---

Worklog Id: (was: 243727)
Time Spent: 11h  (was: 10h 50m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work started] (BEAM-7190) enable file system based token authentication for portable runner

2019-05-16 Thread Hai Lu (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-7190 started by Hai Lu.

> enable file system based token authentication for portable runner
> -
>
> Key: BEAM-7190
> URL: https://issues.apache.org/jira/browse/BEAM-7190
> Project: Beam
>  Issue Type: Task
>  Components: runner-samza
>Reporter: Hai Lu
>Assignee: Hai Lu
>Priority: Major
>
> For Samza and potentially other portable runners, there is a need to secure 
> the communication between the SDK worker and the runner. Currently the SSL/TLS 
> support in portability is half done.
> However, after investigation we found that it's sufficient to just 1) use the 
> loopback address and 2) enforce authentication; that way the communication is 
> both authenticated and secured.
> This ticket intends to track the implementation of the solution above. More 
> details can be found in the following PR.





[jira] [Updated] (BEAM-7346) Add tests for BigQuery connector in Go SDK that will exercise all types supported by BQ.

2019-05-16 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7346:
--
Priority: Minor  (was: Major)

> Add tests for BigQuery connector in Go SDK that will exercise all types 
> supported by BQ.
> 
>
> Key: BEAM-7346
> URL: https://issues.apache.org/jira/browse/BEAM-7346
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-go
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>
> Sample tests in Python and Java SDKs:
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryToTableIT.java
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py
> In particular, we should make sure the BYTES datatype is treated the same way 
> in the Go SDK as in the Java and Python SDKs. Currently, the Java and Python 
> SDKs assume that users pass base64-encoded bytes, but we may decide to revise 
> this behavior, see [1,2]. 
> [1] 
> https://lists.apache.org/thread.html/0c2178cf8e5d9e77c4f233f05a0b87b6011a1daa1a5ae47b41463af5@%3Cdev.beam.apache.org%3E,
>  
> [2] https://issues.apache.org/jira/browse/BEAM-7344





[jira] [Created] (BEAM-7346) Add tests for BigQuery connector in Go SDK that will exercise all types supported by BQ.

2019-05-16 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-7346:
-

 Summary: Add tests for BigQuery connector in Go SDK that will 
exercise all types supported by BQ.
 Key: BEAM-7346
 URL: https://issues.apache.org/jira/browse/BEAM-7346
 Project: Beam
  Issue Type: Bug
  Components: sdk-go
Reporter: Valentyn Tymofieiev


Sample tests in Python and Java SDKs:

https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryToTableIT.java

https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/big_query_query_to_table_it_test.py

In particular, we should make sure the BYTES datatype is treated the same way in 
the Go SDK as in the Java and Python SDKs. Currently, the Java and Python SDKs 
assume that users pass base64-encoded bytes, but we may decide to revise this 
behavior, see [1,2]. 

[1] 
https://lists.apache.org/thread.html/0c2178cf8e5d9e77c4f233f05a0b87b6011a1daa1a5ae47b41463af5@%3Cdev.beam.apache.org%3E,
 
[2] https://issues.apache.org/jira/browse/BEAM-7344






[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243726
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 23:34
Start Date: 16/May/19 23:34
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493266079
 
 
   run python precommit
 



Issue Time Tracking
---

Worklog Id: (was: 243726)
Time Spent: 10h 50m  (was: 10h 40m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6959) Run Go SDK Post Commit tests against the Flink Runner.

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6959?focusedWorklogId=243725=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243725
 ]

ASF GitHub Bot logged work on BEAM-6959:


Author: ASF GitHub Bot
Created on: 16/May/19 23:28
Start Date: 16/May/19 23:28
Worklog Time Spent: 10m 
  Work Description: angoenka commented on pull request #8531: [BEAM-6959] 
Add Flink tests for Go SDK
URL: https://github.com/apache/beam/pull/8531
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 243725)
Time Spent: 1h 50m  (was: 1h 40m)

> Run Go SDK  Post Commit tests against the Flink Runner.
> ---
>
> Key: BEAM-6959
> URL: https://issues.apache.org/jira/browse/BEAM-6959
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, sdk-go, testing
>Reporter: Robert Burke
>Assignee: Kyle Weaver
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> See parent task BEAM-6958





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243722=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243722
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 23:20
Start Date: 16/May/19 23:20
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284925302
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -105,6 +107,7 @@ def foo((a, b)):
 'TypeCheckError',
 ]
 
+logger = logging.getLogger(__name__)
 
 Review comment:
   It uses the module name instead of "root". See 
https://issues.apache.org/jira/browse/BEAM-3523 for details. Rereading that 
JIRA however, I realize that the worker expects us to log to root so I'll 
revert this for now.
 



Issue Time Tracking
---

Worklog Id: (was: 243722)
Time Spent: 6h 10m  (was: 6h)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}
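
The traceback above concerns argument binding for the built-in `str.strip`. As a 
minimal sketch, assuming Python 3.7 or later and only the standard `inspect` 
module, the following shows which parameters `str.strip` reports and that 
`chars` carries a default. This is not Beam's `getcallargs_forhints`, only a way 
to inspect the signature involved.

```python
import inspect

# Introspect the built-in method used in the failing test:
#   beam.Map(str.strip, 'x')
sig = inspect.signature(str.strip)
param_names = list(sig.parameters)

# `chars` has a default value, so binding only the string itself succeeds.
bound = sig.bind("xa")
bound.apply_defaults()
chars_default = bound.arguments["chars"]
```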





[jira] [Commented] (BEAM-7116) Remove KV from Schema transforms

2019-05-16 Thread Brian Hulette (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841798#comment-16841798
 ] 

Brian Hulette commented on BEAM-7116:
-

Ah shoot. My intention was just to make it so that we could look up a Schema for 
KV 
[here|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/transforms/Convert.java#L124].
I didn't realize that would also make KV use SchemaCoder over KvCoder.

> Remove KV from Schema transforms
> 
>
> Key: BEAM-7116
> URL: https://issues.apache.org/jira/browse/BEAM-7116
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Priority: Major
>
> Instead of returning KV objects, we should return a Schema with two fields. 
> The Convert transform should be able to convert these to KV objects.





[jira] [Created] (BEAM-7345) Add support for generics in schema inference

2019-05-16 Thread Brian Hulette (JIRA)
Brian Hulette created BEAM-7345:
---

 Summary: Add support for generics in schema inference
 Key: BEAM-7345
 URL: https://issues.apache.org/jira/browse/BEAM-7345
 Project: Beam
  Issue Type: Sub-task
  Components: sdk-java-core
Reporter: Brian Hulette


Currently schema inference doesn't work for getters that return a parameterized 
type. Fixing this would most likely involve plumbing TypeDescriptor, rather than 
Class, through FieldValueTypeSupplier, FieldValueTypeInformation, 
StaticSchemaInference, etc.





[jira] [Commented] (BEAM-7345) Add support for generics in schema inference

2019-05-16 Thread Brian Hulette (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841797#comment-16841797
 ] 

Brian Hulette commented on BEAM-7345:
-

Some more discussion here: 
https://issues.apache.org/jira/browse/BEAM-7116?focusedCommentId=16841702=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16841702

> Add support for generics in schema inference
> 
>
> Key: BEAM-7345
> URL: https://issues.apache.org/jira/browse/BEAM-7345
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Brian Hulette
>Priority: Major
>
> Currently schema inference doesn't work for getters that return a 
> parameterized type. Fixing this would most likely involve plumbing 
> TypeDescriptor, rather than Class, through FieldValueTypeSupplier, 
> FieldValueTypeInformation, StaticSchemaInference, etc.





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243715=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243715
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 23:00
Start Date: 16/May/19 23:00
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493259550
 
 
   run java precommit
 



Issue Time Tracking
---

Worklog Id: (was: 243715)
Time Spent: 10h 40m  (was: 10.5h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243714=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243714
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 23:00
Start Date: 16/May/19 23:00
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493259550
 
 
   run java precommit
 



Issue Time Tracking
---

Worklog Id: (was: 243714)
Time Spent: 10.5h  (was: 10h 20m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Created] (BEAM-7344) Consider removing the requirement that users need to base64-encode their BYTES before passing them to BQ IO connector.

2019-05-16 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-7344:
-

 Summary: Consider removing the requirement that users need to 
base64-encode their BYTES before passing them to BQ IO connector.
 Key: BEAM-7344
 URL: https://issues.apache.org/jira/browse/BEAM-7344
 Project: Beam
  Issue Type: Bug
  Components: io-java-gcp, io-python-gcp
Reporter: Valentyn Tymofieiev


Currently, when BigQuery IO connector reads or stores BYTES datatype in 
BigQuery, there is an expectation that users base64-encode the bytes before 
passing them to the connector, see [1,2]. 
Having to base64-encode bytes when interacting with Beam may be an extra 
overhead for users that is possible to avoid. Filing this issue to reconsider 
this behavior.

cc: [~chamikara].

[1] 
https://lists.apache.org/thread.html/0c2178cf8e5d9e77c4f233f05a0b87b6011a1daa1a5ae47b41463af5@%3Cdev.beam.apache.org%3E
[2] https://issues.apache.org/jira/browse/BEAM-7326
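
To illustrate the encoding step described above, here is a minimal sketch using 
only Python's standard `base64` module. It shows the round trip a user would 
perform around the connector; the sample value `raw` is made up for 
illustration, and no Beam or BigQuery API is called.

```python
import base64

# Made-up sample payload containing non-text bytes.
raw = b"\x00\xffbeam"

# Encode before handing the value to the connector, per the current expectation.
encoded = base64.b64encode(raw).decode("ascii")

# Decode a value read back from the connector to recover the original bytes.
decoded = base64.b64decode(encoded)
```

Base64 output is roughly a third larger than the input, which is part of the 
overhead this issue asks to reconsider.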





[jira] [Updated] (BEAM-7344) Consider removing the requirement that users need to base64-encode their BYTES before passing them to BQ IO connector.

2019-05-16 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7344:
--
Priority: Minor  (was: Major)

> Consider removing the requirement that users need to base64-encode their 
> BYTES before passing them to BQ IO connector.
> --
>
> Key: BEAM-7344
> URL: https://issues.apache.org/jira/browse/BEAM-7344
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, io-python-gcp
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>
> Currently, when BigQuery IO connector reads or stores BYTES datatype in 
> BigQuery, there is an expectation that users base64-encode the bytes before 
> passing them to the connector, see [1,2]. 
> Having to base64-encode bytes when interacting with Beam may be an extra 
> overhead for users that is possible to avoid. Filing this issue to 
> reconsider this behavior.
> cc: [~chamikara].
> [1] 
> https://lists.apache.org/thread.html/0c2178cf8e5d9e77c4f233f05a0b87b6011a1daa1a5ae47b41463af5@%3Cdev.beam.apache.org%3E
> [2] https://issues.apache.org/jira/browse/BEAM-7326





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243709=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243709
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 22:45
Start Date: 16/May/19 22:45
Worklog Time Spent: 10m 
  Work Description: udim commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284925302
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -105,6 +107,7 @@ def foo((a, b)):
 'TypeCheckError',
 ]
 
+logger = logging.getLogger(__name__)
 
 Review comment:
   It uses the module name instead of "root". See 
https://issues.apache.org/jira/browse/BEAM-3523 for details. Rereading that 
however, I realize that the worker expects us to log to root so I'll revert 
this for now.
 



Issue Time Tracking
---

Worklog Id: (was: 243709)
Time Spent: 6h  (was: 5h 50m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243706
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 22:42
Start Date: 16/May/19 22:42
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #8590: [BEAM-6988] Implement a 
Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#issuecomment-493256130
 
 
   @NikeNano Valentyn was referring to the precommit test's failure: 
https://builds.apache.org/job/beam_PreCommit_Python_Commit/6421/
   
   As for the postcommit test failure I see this line in the console log 
(viewing the full log):
   ```
   test_big_query_legacy_sql 
(apache_beam.io.gcp.big_query_query_to_table_it_test.BigQueryQueryToTableIT) 
... FAIL
   ```
   Right now my guess is that the postcommit failure is a flake, and I'll try 
to run it again once I address the review comments.
 



Issue Time Tracking
---

Worklog Id: (was: 243706)
Time Spent: 5h 50m  (was: 5h 40m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Updated] (BEAM-7326) Document that Beam BigQuery IO expects users to pass base64-encoded bytes, and BQ IO serves base64-encoded bytes to the user.

2019-05-16 Thread Valentyn Tymofieiev (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-7326:
--
Description: 
BYTES is one of the Datatypes supported by Google Cloud BigQuery, and Apache 
Beam BigQuery IO connector.

Current implementation of BigQuery connector in Java and Python SDKs expects 
that users base64-encode bytes before passing them to BigQuery IO, see 
discussion on dev: [1] 

This needs to be reflected in public documentation, see [2-4]

cc: [~juta] [~chamikara] [~pabloem] 

cc: [~lostluck] [~kedin] FYI and to advise whether similar action needs to be 
done for Go SDK and/or Beam SQL.

[1] 
https://lists.apache.org/thread.html/f35c836887014e059527ed1a806e730321e2f9726164a3030575f455@%3Cdev.beam.apache.org%3E
[2] https://beam.apache.org/documentation/io/built-in/google-bigquery/
[3] 
https://beam.apache.org/releases/pydoc/2.12.0/apache_beam.io.gcp.bigquery.html
[4] 
https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.html

  was:
BYTES is one of the Datatypes supported by Google Cloud BigQuery, and Apache 
Beam BigQuery IO connector.

Current implementation of BigQuery connector in Java and Python SDKs expects 
that users base64-encode bytes before passing them to BigQuery IO, see 
discussion on dev: [1] 

This needs to be reflected in public documentation, see [2-4]

cc: [~juta] [~chamikara] [~pabloem] 

cc: [~rebo] [~kedin] FYI and to advise whether similar action needs to be done 
for Go SDK and/or Beam SQL.

[1] 
https://lists.apache.org/thread.html/f35c836887014e059527ed1a806e730321e2f9726164a3030575f455@%3Cdev.beam.apache.org%3E
[2] https://beam.apache.org/documentation/io/built-in/google-bigquery/
[3] 
https://beam.apache.org/releases/pydoc/2.12.0/apache_beam.io.gcp.bigquery.html
[4] 
https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.html


> Document that Beam BigQuery IO expects users to pass base64-encoded bytes, 
> and BQ IO serves base64-encoded bytes to the user.
> -
>
> Key: BEAM-7326
> URL: https://issues.apache.org/jira/browse/BEAM-7326
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp, io-python-gcp
>Reporter: Valentyn Tymofieiev
>Priority: Major
>
> BYTES is one of the Datatypes supported by Google Cloud BigQuery, and Apache 
> Beam BigQuery IO connector.
> Current implementation of BigQuery connector in Java and Python SDKs expects 
> that users base64-encode bytes before passing them to BigQuery IO, see 
> discussion on dev: [1] 
> This needs to be reflected in public documentation, see [2-4]
> cc: [~juta] [~chamikara] [~pabloem] 
> cc: [~lostluck] [~kedin] FYI and to advise whether similar action needs to be 
> done for Go SDK and/or Beam SQL.
> [1] 
> https://lists.apache.org/thread.html/f35c836887014e059527ed1a806e730321e2f9726164a3030575f455@%3Cdev.beam.apache.org%3E
> [2] https://beam.apache.org/documentation/io/built-in/google-bigquery/
> [3] 
> https://beam.apache.org/releases/pydoc/2.12.0/apache_beam.io.gcp.bigquery.html
> [4] 
> https://beam.apache.org/releases/javadoc/2.12.0/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.html





[jira] [Commented] (BEAM-7271) Adding StringUtf8Coder to ModelCoder in JavaSDK [REOPENED]

2019-05-16 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841780#comment-16841780
 ] 

Ankur Goenka commented on BEAM-7271:


Flink PortableValidatesRunner test cases are passing on 2.13.0 
[https://github.com/apache/beam/pull/8579] 

Do we still want to backport this?

> Adding StringUtf8Coder to ModelCoder in JavaSDK [REOPENED]
> --
>
> Key: BEAM-7271
> URL: https://issues.apache.org/jira/browse/BEAM-7271
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Heejong Lee
>Assignee: Heejong Lee
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Reopened because the previous commit was reverted.





[jira] [Work logged] (BEAM-6908) Add Python3 performance benchmarks

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6908?focusedWorklogId=243704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243704
 ]

ASF GitHub Bot logged work on BEAM-6908:


Author: ASF GitHub Bot
Created on: 16/May/19 22:36
Start Date: 16/May/19 22:36
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8518: [BEAM-6908] 
Refactor Python performance test groovy file for easy configuration
URL: https://github.com/apache/beam/pull/8518#issuecomment-493255048
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 243704)
Time Spent: 14.5h  (was: 14h 20m)

> Add Python3 performance benchmarks
> --
>
> Key: BEAM-6908
> URL: https://issues.apache.org/jira/browse/BEAM-6908
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> Similar to 
> [beam_PerformanceTests_Python|https://builds.apache.org/view/A-D/view/Beam/view/PerformanceTests/job/beam_PerformanceTests_Python/],
>  we want to have a Python3 benchmark running on Jenkins to detect performance 
> regression during code adoption.





[jira] [Commented] (BEAM-7116) Remove KV from Schema transforms

2019-05-16 Thread Reuven Lax (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841777#comment-16841777
 ] 

Reuven Lax commented on BEAM-7116:
--

The problem is that Beam special cases KvCoder all over the place, so if we
cause KV to use SchemaCoder we will break large parts of Beam. I think it
will be easier to just remove KV from our interface and let any two-field
schema translate to KV.

However, what you suggested is indeed a problem in Schema type inference -
we don't do a good job with generic classes (someone trying AutoValueSchema
hit this). Do you want to file a JIRA for this issue, as there doesn't
appear to be one?

*From: *Brian Hulette (JIRA) 
*Date: *Thu, May 16, 2019 at 1:38 PM
*To: * 




> Remove KV from Schema transforms
> 
>
> Key: BEAM-7116
> URL: https://issues.apache.org/jira/browse/BEAM-7116
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Priority: Major
>
> Instead of returning KV objects, we should return a Schema with two fields. 
> The Convert transform should be able to convert these to KV objects.





[jira] [Created] (BEAM-7343) Fix Google Cloud Dataflow Runner * tests on 2.13.0

2019-05-16 Thread Ankur Goenka (JIRA)
Ankur Goenka created BEAM-7343:
--

 Summary: Fix Google Cloud Dataflow Runner * tests on 2.13.0
 Key: BEAM-7343
 URL: https://issues.apache.org/jira/browse/BEAM-7343
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Ankur Goenka
Assignee: Ankur Goenka
 Fix For: 2.13.0


One of the failing tests: 
https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_PR/81/





[jira] [Updated] (BEAM-7342) Extend SyntheticPipeline map steps to be able to be splittable (Beam Python SDK)

2019-05-16 Thread Lara Schmidt (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lara Schmidt updated BEAM-7342:
---
Status: Open  (was: Triage Needed)

> Extend SyntheticPipeline map steps to be able to be splittable (Beam Python 
> SDK)
> 
>
> Key: BEAM-7342
> URL: https://issues.apache.org/jira/browse/BEAM-7342
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
> Environment: Beam Python
>Reporter: Lara Schmidt
>Priority: Minor
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> Add the ability for map steps to be configured to be splittable. 
> Possible configuration options:
>  - uneven bundle sizes
>  - possible incorrect sizing returned





[jira] [Created] (BEAM-7342) Extend SyntheticPipeline map steps to be able to be splittable (Beam Python SDK)

2019-05-16 Thread Lara Schmidt (JIRA)
Lara Schmidt created BEAM-7342:
--

 Summary: Extend SyntheticPipeline map steps to be able to be 
splittable (Beam Python SDK)
 Key: BEAM-7342
 URL: https://issues.apache.org/jira/browse/BEAM-7342
 Project: Beam
  Issue Type: New Feature
  Components: testing
 Environment: Beam Python
Reporter: Lara Schmidt


Add the ability for map steps to be configured to be splittable. 
Possible configuration options:

 - uneven bundle sizes

 - possible incorrect sizing returned





[jira] [Resolved] (BEAM-7177) Spark portable runner fails testGlobalCombineWithDefaultsAndTriggers

2019-05-16 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-7177.
---
   Resolution: Duplicate
Fix Version/s: Not applicable

> Spark portable runner fails testGlobalCombineWithDefaultsAndTriggers
> 
>
> Key: BEAM-7177
> URL: https://issues.apache.org/jira/browse/BEAM-7177
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Minor
> Fix For: Not applicable
>
>
> [https://github.com/apache/beam/blob/1892c97aba6fc5d8342341cba8abff51477f5456/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/CombineTest.java#L1185-L1210]
> Expected: a collection containing "2: true"
> but: mismatches were: [was "2: false"]
> Meaning c.pane().isLast() is supposed to be true, but is actually false.





[jira] [Closed] (BEAM-7282) Spark portable runner doesn't support `pre_optimize=all`

2019-05-16 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver closed BEAM-7282.
-
   Resolution: Fixed
Fix Version/s: 2.14.0

> Spark portable runner doesn't support `pre_optimize=all`
> 
>
> Key: BEAM-7282
> URL: https://issues.apache.org/jira/browse/BEAM-7282
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Because we are trying to fuse the already-optimized pipeline.
> Error message: https://gist.github.com/ibzib/c432b45b90f7ddb62eb39e1784b55ba8





[jira] [Created] (BEAM-7341) portable Spark: testGlobalCombineWithDefaultsAndTriggers fails

2019-05-16 Thread Kyle Weaver (JIRA)
Kyle Weaver created BEAM-7341:
-

 Summary: portable Spark: testGlobalCombineWithDefaultsAndTriggers 
fails
 Key: BEAM-7341
 URL: https://issues.apache.org/jira/browse/BEAM-7341
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Reporter: Kyle Weaver
Assignee: Kyle Weaver


PaneInfo for CombineTest.testGlobalCombineWithDefaultsAndTriggers [1] output is 
incorrect.

isLast: expected true, is false
timing: expected UNKNOWN, is EARLY

No idea yet why this is happening, but commenting out the special GBK 
translation for non-merging windows [2] seems to fix it.

[1] 
[https://github.com/apache/beam/blob/8403313ea7d63e49974629136c615e379ea874ce/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/CombineTest.java#L1219-L1242]

[2] 
[https://github.com/apache/beam/blob/e98a3a69295afbfc6984fe92c52125929daf6088/runners/spark/src/main/java/org/apache/beam/runners/spark/translation/SparkBatchPortablePipelineTranslator.java#L165-L170]





[jira] [Updated] (BEAM-563) DoFn Reuse: Update DirectRunner

2019-05-16 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-563:
-
Fix Version/s: (was: Not applicable)
   2.14.0

> DoFn Reuse: Update DirectRunner
> ---
>
> Key: BEAM-563
> URL: https://issues.apache.org/jira/browse/BEAM-563
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Major
> Fix For: 2.14.0
>
>
> https://issues.apache.org/jira/browse/BEAM-562 will add setup and teardown 
> methods to DoFns. Update DirectRunner to add support for these new methods.





[jira] [Updated] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-16 Thread Ahmet Altay (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-562:
-
Fix Version/s: (was: Not applicable)
   2.14.0

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Yifan Mai
>Priority: Major
>  Labels: sdk-consistency
> Fix For: 2.14.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> The Java SDK added setup and teardown methods to DoFns. This makes DoFns 
> reusable and provides performance improvements. The Python SDK should add 
> support for these new DoFn methods.
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#
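
A rough sketch of why setup and teardown hooks make an instance reusable: expensive initialisation happens once and is shared across bundles instead of being repeated per bundle. The classes below are plain Python stand-ins written for illustration; `ReusableFn` and `run_bundles` are hypothetical names and are not the Beam DoFn API or any runner's actual scheduling logic.

```python
class ReusableFn(object):
    """Hypothetical stand-in for a DoFn with setup/teardown hooks."""

    def __init__(self):
        self.client = None
        self.setup_calls = 0

    def setup(self):
        # One-time, potentially expensive initialisation (e.g. a connection).
        self.setup_calls += 1
        self.client = {'connected': True}

    def process(self, element):
        # Per-element work that relies on the state created in setup().
        return [element.upper()]

    def teardown(self):
        # Release whatever setup() acquired.
        self.client = None


def run_bundles(fn, bundles):
    """Toy driver: set up once, reuse `fn` for every bundle, tear down once."""
    fn.setup()
    try:
        outputs = []
        for bundle in bundles:
            for element in bundle:
                outputs.extend(fn.process(element))
        return outputs
    finally:
        fn.teardown()


fn = ReusableFn()
print(run_bundles(fn, [['a', 'b'], ['c']]))
```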





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243674=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243674
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 21:53
Start Date: 16/May/19 21:53
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8596: [BEAM-7339] Make 
input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596#issuecomment-493244963
 
 
   Thank you @yifanzou. I updated the comments and fixed the pylint error 
that caused the PreCommit failure. 
   PTAL.
 



Issue Time Tracking
---

Worklog Id: (was: 243674)
Time Spent: 1h  (was: 50m)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243673=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243673
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 21:52
Start Date: 16/May/19 21:52
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8596: 
[BEAM-7339] Make input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596#discussion_r284912678
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount_it_test.py
 ##
 @@ -39,7 +39,8 @@ class WordCountIT(unittest.TestCase):
   _multiprocess_can_split_ = True
 
   # The default checksum is a SHA-1 hash generated from a sorted list of
-  # lines read from expected output.
+  # lines read from expected output. This value coresponds to the default
 
 Review comment:
   done
 



Issue Time Tracking
---

Worklog Id: (was: 243673)
Time Spent: 50m  (was: 40m)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks
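
The diff quoted in this message describes the default checksum as a SHA-1 hash generated from a sorted list of lines read from the expected output. A minimal sketch of that idea follows; the function name `checksum_of_lines` and the choice to feed each line as UTF-8 without separators are assumptions, so the digest is not guaranteed to match the value used by WordCountIT.

```python
import hashlib


def checksum_of_lines(lines):
    """Return a SHA-1 hex digest over the lines in sorted order.

    Sorting first makes the result independent of the order in which
    output shards are read.
    """
    sha1 = hashlib.sha1()
    for line in sorted(lines):
        sha1.update(line.encode('utf-8'))
    return sha1.hexdigest()


print(checksum_of_lines(['beam: 2', 'apache: 1']))
```

Because the digest depends on the input data, a different input (such as the 1Gb dataset in the requirements above) needs its own expected checksum, which is presumably why the PR makes both configurable.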





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243666=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243666
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 21:35
Start Date: 16/May/19 21:35
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284907859
 
 

 ##
 File path: sdks/python/apache_beam/io/external/generate_sequence_test.py
 ##
 @@ -0,0 +1,64 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""Unit tests for cross-language generate sequence."""
+
+from __future__ import absolute_import
+from __future__ import print_function
+
+import logging
+import os
+import re
+import unittest
+
+from nose.plugins.attrib import attr
+
+from apache_beam.io.external.generate_sequence import GenerateSequence
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+
+@attr('UsesCrossLanguageTransforms')
+class XlangGenerateSequenceTest(unittest.TestCase):
+  def test_generate_sequence(self):
+    test_pipeline = TestPipeline()
+    port = os.environ.get('EXPANSION_PORT')
 
 Review comment:
   We don't need to stage the expansion service jar here since 
`GenerateSequence` doesn't depend on extra dependencies.
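
The quoted test reads the expansion service port from the `EXPANSION_PORT` environment variable. One way such a lookup could be handled so the test is skipped rather than failed when no service is running is sketched below. The helper `expansion_address`, the `localhost` host, and the skip behaviour are assumptions for illustration, not necessarily what the PR does.

```python
import os
import unittest


def expansion_address(environ=None):
    """Return 'localhost:<port>', or None when EXPANSION_PORT is unset."""
    environ = os.environ if environ is None else environ
    port = environ.get('EXPANSION_PORT')
    return 'localhost:%s' % port if port else None


class ExpansionAddressTest(unittest.TestCase):
    def test_address(self):
        address = expansion_address()
        if address is None:
            # No expansion service was started for this run.
            self.skipTest('EXPANSION_PORT is not set')
        self.assertTrue(address.startswith('localhost:'))
```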
 



Issue Time Tracking
---

Worklog Id: (was: 243666)
Time Spent: 10h 20m  (was: 10h 10m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Resolved] (BEAM-6690) Spark Translator - ASSIGN_WINDOWS_TRANSFORM_URN

2019-05-16 Thread Kyle Weaver (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kyle Weaver resolved BEAM-6690.
---
   Resolution: Won't Fix
Fix Version/s: Not applicable

> Spark Translator - ASSIGN_WINDOWS_TRANSFORM_URN
> ---
>
> Key: BEAM-6690
> URL: https://issues.apache.org/jira/browse/BEAM-6690
> Project: Beam
>  Issue Type: Task
>  Components: runner-spark
>Reporter: Ankur Goenka
>Assignee: Kyle Weaver
>Priority: Major
> Fix For: Not applicable
>
>






[jira] [Commented] (BEAM-6690) Spark Translator - ASSIGN_WINDOWS_TRANSFORM_URN

2019-05-16 Thread Kyle Weaver (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841735#comment-16841735
 ] 

Kyle Weaver commented on BEAM-6690:
---

`translateAssignWindows` was removed from the Flink portable runner [1], so 
it's probably safe to say we won't need this for Spark.

[1] [https://github.com/apache/beam/pull/8058]

> Spark Translator - ASSIGN_WINDOWS_TRANSFORM_URN
> ---
>
> Key: BEAM-6690
> URL: https://issues.apache.org/jira/browse/BEAM-6690
> Project: Beam
>  Issue Type: Task
>  Components: runner-spark
>Reporter: Ankur Goenka
>Assignee: Kyle Weaver
>Priority: Major
>






[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243643=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243643
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 21:18
Start Date: 16/May/19 21:18
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493235414
 
 
   run xvr_flink postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 243643)
Time Spent: 10h 10m  (was: 10h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243642=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243642
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 21:18
Start Date: 16/May/19 21:18
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493235414
 
 
   run xvr_flink postcommit
 



Issue Time Tracking
---

Worklog Id: (was: 243642)
Time Spent: 10h  (was: 9h 50m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 10h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Commented] (BEAM-6877) TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode changes

2019-05-16 Thread niklas Hansson (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-6877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841716#comment-16841716
 ] 

niklas Hansson commented on BEAM-6877:
--

Sadly not :(. Should I release it? I plan to work on it on Sunday. 

> TypeHints Py3 Error: Type inference tests fail on Python 3.6 due to bytecode 
> changes
> 
>
> Key: BEAM-6877
> URL: https://issues.apache.org/jira/browse/BEAM-6877
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>
> Type inference doesn't work on Python 3.6 due to [bytecode to wordcode 
> changes|https://docs.python.org/3/whatsnew/3.6.html#cpython-bytecode-changes].
> Type inference always returns Any on Python 3.6, so this is not critical.
> Affected tests are:
>  *transforms.ptransform_test*:
>  - test_combine_properly_pipeline_type_checks_using_decorator
>  - test_mean_globally_pipeline_checking_satisfied
>  - test_mean_globally_runtime_checking_satisfied
>  - test_count_globally_pipeline_type_checking_satisfied
>  - test_count_globally_runtime_type_checking_satisfied
>  - test_pardo_type_inference
>  - test_pipeline_inference
>  - test_inferred_bad_kv_type
> *typehints.trivial_inference_test*:
>  - all tests in TrivialInferenceTest
> *io.gcp.pubsub_test.TestReadFromPubSubOverride*:
> * test_expand_with_other_options
> * test_expand_with_subscription
> * test_expand_with_topic
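
For context on the bytecode-to-wordcode change referenced above: since CPython 3.6 instructions are laid out in 2-byte units, whereas earlier versions used a variable number of bytes per instruction, so code that walks `co_code` with the old layout misreads it. The sketch below only inspects a small function with the standard `dis` module; `add_one` is an arbitrary example function and this is not the Beam type inference code.

```python
import dis


def add_one(x):
    return x + 1


# Collect the decoded instructions and their byte offsets.
instructions = list(dis.get_instructions(add_one))
offsets = [ins.offset for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# On CPython 3.6+ every offset is a multiple of two.
print(all(offset % 2 == 0 for offset in offsets))
```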





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243624=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243624
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 20:52
Start Date: 16/May/19 20:52
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#issuecomment-493224388
 
 
   > looks like test_convert_to_beam_type[s] test are currently failing on 
Py3.5.
   
   How do you see this? I have checked the logs and, as far as I can see, they 
only say: "sdks:python:test-suites:direct:py35:postCommitIT" for Python SDK 
PostCommit Tests on Python 3 in the console output. I am trying to debug why it 
fails.
 



Issue Time Tracking
---

Worklog Id: (was: 243624)
Time Spent: 5h 40m  (was: 5.5h)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}
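
The traceback above comes from mapping type hints onto the arguments of `str.strip`. As a rough illustration of how call arguments can be bound to parameter names on Python 3, here is a sketch built on `inspect.signature`. The helper `bind_call_args` is hypothetical and is not the `getcallargs_forhints` implementation from the PR; the `str.strip` call assumes a Python version on which that builtin exposes a signature (3.7 or later, as far as I know), so it is guarded.

```python
import inspect


def bind_call_args(func, *args, **kwargs):
    """Map positional and keyword arguments onto func's parameter names."""
    signature = inspect.signature(func)
    # bind() raises TypeError if the arguments do not fit the signature.
    bound = signature.bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


def example(element, suffix='x'):
    return element + suffix


print(bind_call_args(example, 'a'))

try:
    # Builtins without an introspectable signature raise ValueError.
    print(bind_call_args(str.strip, 'xa', 'x'))
except (TypeError, ValueError) as error:
    print('could not bind str.strip:', error)
```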





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243623=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243623
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 20:51
Start Date: 16/May/19 20:51
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#issuecomment-493224388
 
 
   > looks like test_convert_to_beam_type[s] test are currently failing on 
Py3.5.
   
   How do you see this? I have checked the logs and, as far as I can see, they 
only say: "sdks:python:test-suites:direct:py35:postCommitIT" for Python SDK 
PostCommit Tests on Python 3 in the console output. 
 



Issue Time Tracking
---

Worklog Id: (was: 243623)
Time Spent: 5.5h  (was: 5h 20m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243616=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243616
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 20:41
Start Date: 16/May/19 20:41
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#issuecomment-493224388
 
 
   > looks like test_convert_to_beam_type[s] test are currently failing on 
Py3.5.
   
   How do you see this? I have checked the logs and, as far as I can see, they 
only say: "sdks:python:test-suites:direct:py35:postCommitIT" for Python SDK 
PostCommit Tests on Python 3 in the console output. 
 



Issue Time Tracking
---

Worklog Id: (was: 243616)
Time Spent: 5h 20m  (was: 5h 10m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Commented] (BEAM-7116) Remove KV from Schema transforms

2019-05-16 Thread Brian Hulette (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841702#comment-16841702
 ] 

Brian Hulette commented on BEAM-7116:
-

[~reuvenlax] - Could we just use the existing SchemaRegistry/SchemaProvider 
architecture to add support for KVs to Convert?

I'm still wrapping my head around all the schema inference code, but it seems 
like if we modify FieldValueTypeSupplier to accept a TypeDescriptor rather than 
just a Class, and plumb that through FieldValueTypeInformation and 
StaticSchemaInference we could add support for generic classes, including KV.

> Remove KV from Schema transforms
> 
>
> Key: BEAM-7116
> URL: https://issues.apache.org/jira/browse/BEAM-7116
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Priority: Major
>
> Instead of returning KV objects, we should return a Schema with two fields. 
> The Convert transform should be able to convert these to KV objects.





[jira] [Issue Comment Deleted] (BEAM-7338) Deprecate PoolableDataSourceProvider from JdbcIO

2019-05-16 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/BEAM-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-7338:
---
Comment: was deleted

(was: Another reason to do so is that it leaks DBCP into the JDBCIO user 
classpath, which prevents users from using older or future versions of the 
library without conflicts.)

> Deprecate PoolableDataSourceProvider from JdbcIO
> 
>
> Key: BEAM-7338
> URL: https://issues.apache.org/jira/browse/BEAM-7338
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> `PoolableDataSourceProvider` was introduced as a facility to create a 
> `PoolableDataSource` from a `ConnectionConfiguration` in JdbcIO.
> However, the default parameters of the current implementation cannot cover all 
> cases, and tuning the pool's parameters is not trivial without exposing too 
> many knobs in the API. Given that we have a generic way to do this via 
> `withDataSourceProviderFn`, we could deprecate and remove this in the future, 
> and probably add its use as an example.





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243611
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 20:08
Start Date: 16/May/19 20:08
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #8596: [BEAM-7339] Make 
input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596#issuecomment-493213893
 
 
   Sorry, I didn't read the description carefully. That answers my motivation 
question. 
 



Issue Time Tracking
---

Worklog Id: (was: 243611)
Time Spent: 40m  (was: 0.5h)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243609
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 20:05
Start Date: 16/May/19 20:05
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on pull request #8596: [BEAM-7339] 
Make input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596#discussion_r284871200
 
 

 ##
 File path: sdks/python/apache_beam/examples/wordcount_it_test.py
 ##
 @@ -39,7 +39,8 @@ class WordCountIT(unittest.TestCase):
   _multiprocess_can_split_ = True
 
   # The default checksum is a SHA-1 hash generated from a sorted list of
-  # lines read from expected output.
+  # lines read from expected output. This value coresponds to the default
 
 Review comment:
   Typo - corresponds
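
The comment in the hunk above describes the default checksum as a SHA-1 hash 
over a sorted list of output lines. A minimal sketch of that idea follows. It 
is an illustration only, not the helper the test actually uses: the function 
name `checksum_of_lines` and the newline separator are assumptions made here.

```python
import hashlib


def checksum_of_lines(lines):
    """Return the SHA-1 hex digest of the lines, hashed in sorted order.

    Hypothetical helper: sorting first makes the digest independent of the
    order in which workers happened to write the output.
    """
    digest = hashlib.sha1()
    for line in sorted(lines):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")  # separator so "ab","c" differs from "a","bc"
    return digest.hexdigest()


shuffled = ["b: 2", "a: 1", "c: 3"]
ordered = ["a: 1", "b: 2", "c: 3"]
same = checksum_of_lines(shuffled) == checksum_of_lines(ordered)
```

Because the digest depends on the exact input, a configurable input needs a 
matching configurable checksum, which is what the PR title asks for.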
 



Issue Time Tracking
---

Worklog Id: (was: 243609)
Time Spent: 0.5h  (was: 20m)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Commented] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841684#comment-16841684
 ] 

Ahmet Altay commented on BEAM-7339:
---

Related to the quota issue, we need a solution to this, otherwise we cannot 
build the large benchmarks that are needed. Either we increase the quota, or 
we agree that the apache-beam-testing project is not suitable for benchmarks 
and find an alternative solution.

The output verification problem could be simplified IMO. Perhaps we can use the 
gcloud tool itself to calculate output hashes.

Thanks for summarizing the issues.

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Commented] (BEAM-7338) Deprecate PoolableDataSourceProvider from JdbcIO

2019-05-16 Thread JIRA


[ 
https://issues.apache.org/jira/browse/BEAM-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841680#comment-16841680
 ] 

Ismaël Mejía commented on BEAM-7338:


Another reason to do so is that it leaks DBCP into the JDBCIO user classpath, 
which prevents users from using older or future versions of the library 
without conflicts.

> Deprecate PoolableDataSourceProvider from JdbcIO
> 
>
> Key: BEAM-7338
> URL: https://issues.apache.org/jira/browse/BEAM-7338
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-jdbc
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> `PoolableDataSourceProvider` was introduced as a facility to create a 
> `PoolableDataSource` from a `ConnectionConfiguration` in JdbcIO.
> However, the default parameters of the current implementation cannot cover all 
> cases, and tuning the pool's parameters is not trivial without exposing too 
> many knobs in the API. Given that we have a generic way to do this via 
> `withDataSourceProviderFn`, we could deprecate and remove this in the future, 
> and probably add its use as an example.





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243604=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243604
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:33
Start Date: 16/May/19 19:33
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284864864
 
 

 ##
 File path: sdks/python/apache_beam/transforms/combiners.py
 ##
 @@ -129,10 +129,13 @@ class PerElement(ptransform.PTransform):
 
 def expand(self, pcoll):
   paired_with_void_type = KV[pcoll.element_type, Any]
-  return (pcoll
-  | ('%s:PairWithVoid' % self.label >> core.Map(lambda x: (x, 
None))
- .with_output_types(paired_with_void_type))
-  | core.CombinePerKey(CountCombineFn()))
+  output_type = KV[pcoll.element_type, int]
 
 Review comment:
   The output type needs to be as narrow as possible in order to avoid the Python 
pickle coder. The test failed because of it. 
 



Issue Time Tracking
---

Worklog Id: (was: 243604)
Time Spent: 9h 50m  (was: 9h 40m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Commented] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread Mark Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841674#comment-16841674
 ] 

Mark Liu commented on BEAM-7339:


Two concerns for >100Gb input:

1. The resources we have for the apache-beam-testing project. We have seen 
postcommit jobs exceed quotas such as CPU and disk, so we should limit the 
number of workers in those performance tests. On the other hand, I don't know 
how long it takes to process 100Gb with a given number of workers.
2. Output verification could be hard. Large output may not fit on the Jenkins 
machine, so we may need a special way to verify output correctness. 
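
One possible way around the second concern, sketched here only as an 
illustration and not as something proposed in this thread, is a checksum that 
can be computed in a single streaming pass without sorting or holding the 
output in memory. The sketch below hashes each line with SHA-1 and adds the 
digests modulo 2^160, so the result does not depend on line order. The name 
`streaming_checksum` is hypothetical, and an additive combination like this 
only guards against accidental differences, not deliberate tampering.

```python
import hashlib

MODULUS = 1 << 160  # SHA-1 digests are 160 bits wide


def streaming_checksum(lines):
    """Return (line_count, hex_checksum) for an iterable of text lines.

    Hypothetical helper: per-line SHA-1 digests are summed modulo 2**160,
    so memory use stays constant and line order does not matter.
    """
    total = 0
    count = 0
    for line in lines:
        line_digest = hashlib.sha1(line.encode("utf-8")).digest()
        total = (total + int.from_bytes(line_digest, "big")) % MODULUS
        count += 1
    return count, "%040x" % total


first = streaming_checksum(iter(["king: 3", "lear: 5", "fool: 2"]))
second = streaming_checksum(iter(["fool: 2", "king: 3", "lear: 5"]))
```

Since the function accepts any iterable, the lines could be fed from output 
shards one at a time rather than from a list.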

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243599
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 19:21
Start Date: 16/May/19 19:21
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284860176
 
 

 ##
 File path: sdks/python/apache_beam/typehints/native_type_compatibility.py
 ##
 @@ -144,11 +154,17 @@ def convert_to_beam_type(typ):
   match=_match_issubclass(typing.Tuple),
   arity=-1,
   beam_type=typehints.Tuple),
-  _TypeMapEntry(
-  match=_match_same_type(typing.Union),
-  arity=-1,
-  beam_type=typehints.Union)
   ]
+  if sys.version_info.major >= 3:
+type_map.append(
+_TypeMapEntry(
+match=_match_is_union_py3, arity=-1, beam_type=typehints.Union))
 
 Review comment:
   There are some updates to the same functions and functionality in 
[BEAM-6985]. 
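
For context on why the hunk above needs a Python 3 specific matcher for 
`typing.Union`: a parameterized union is not a class that can be compared by 
type, so the check has to inspect the type's origin. The body of 
`_match_is_union_py3` is not shown in this message, so the sketch below is an 
assumption of how such a check could look, using `typing.get_origin` (Python 
3.8 and later). The name `is_union` is hypothetical.

```python
import typing


def is_union(typ):
    """Return True if typ is a parameterized typing.Union such as Union[int, str].

    Hypothetical helper, not the PR's _match_is_union_py3. Requires
    Python 3.8+ for typing.get_origin.
    """
    return typing.get_origin(typ) is typing.Union


checks = {
    "union": is_union(typing.Union[int, str]),
    "optional": is_union(typing.Optional[int]),  # Optional[X] is Union[X, None]
    "plain": is_union(int),
    "list": is_union(typing.List[int]),
}
```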
 



Issue Time Tracking
---

Worklog Id: (was: 243599)
Time Spent: 5h 10m  (was: 5h)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243598=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243598
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 19:21
Start Date: 16/May/19 19:21
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284859030
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -314,10 +325,40 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
   return callargs
 
 
+def getcallargs_forhints_impl_py3(func, packed_typeargs, typekwargs):
+  try:
+# TODO(udim): Function signature returned by getfullargspec (in
+#  packed_typeargs) might differ from the one below. Migrate to use
+#  inspect.signature in getfullargspec (for Py3).
+signature = inspect.signature(func)
+  except ValueError as e:
+logger.warning('Could not get signature for function: %s: %s', func, e)
+return {}
+  try:
+bindings = signature.bind(*packed_typeargs, **typekwargs)
+  except TypeError as e:
+# Might be raised due to too few or too many arguments.
+raise TypeCheckError(e)
+  bound_args = bindings.arguments
+  missing = []
+  for param in signature.parameters.values():
+if param.kind == inspect.Parameter.VAR_POSITIONAL:
+  bound_args[param.name] = typehints.Tuple[typehints.Any, ...]
+elif param.kind == inspect.Parameter.VAR_KEYWORD:
+  bound_args[param.name] = typehints.Dict[typehints.Any, typehints.Any]
+elif param.name not in bound_args and param.default is not param.empty:
+  # Declare unbound parameters with defaults to be Any.
+  bound_args[param.name] = typehints.Any
+
+  if missing:
 
 Review comment:
   Will this ever be conditioned to false with the current code? 
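
Some background for the question about `if missing:`. In the quoted part of 
the hunk nothing appends to `missing`, and `inspect.Signature.bind` already 
raises `TypeError` when a required argument is absent. The sketch below shows 
that behaviour in isolation. It is a simplified stand-in, not the PR's 
function: `bind_hints` and `pad` are hypothetical names, and plain strings 
take the place of Beam type hints.

```python
import inspect


def bind_hints(func, *typeargs, **typekwargs):
    """Bind hint placeholders to func's parameters.

    Raises TypeError (from Signature.bind) if a required parameter gets no
    hint or if too many hints are supplied.
    """
    signature = inspect.signature(func)
    bound = signature.bind(*typeargs, **typekwargs)
    arguments = dict(bound.arguments)
    for param in signature.parameters.values():
        # Unbound parameters that have a default are declared as "Any".
        if param.name not in arguments and param.default is not param.empty:
            arguments[param.name] = "Any"
    return arguments


def pad(text, width, fill=" "):
    return text.rjust(width, fill)


hints = bind_hints(pad, "str", "int")

try:
    bind_hints(pad, "str")  # no hint for the required parameter 'width'
    bind_failed = False
except TypeError:
    bind_failed = True
```

Whether the PR's `missing` list can ever be non-empty depends on code outside 
the truncated hunk, so this sketch does not settle the question.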
 



Issue Time Tracking
---

Worklog Id: (was: 243598)
Time Spent: 5h  (was: 4h 50m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=243601=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243601
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 16/May/19 19:23
Start Date: 16/May/19 19:23
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#issuecomment-493199476
 
 
   > Thanks, @NikeNano. #8590 expands on this change. Would you be ok to keep 
the discussion on what needs to happen in that PR? We probably don't need two 
changes.
   
   Sure! 
 



Issue Time Tracking
---

Worklog Id: (was: 243601)
Time Spent: 5.5h  (was: 5h 20m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing. != `. eg:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243597=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243597
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:20
Start Date: 16/May/19 19:20
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284860203
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1625,6 +1645,110 @@ class BeamModulePlugin implements Plugin {
 
 /** 
***/
 
+// Method to create the crossLanguageValidatesRunnerTask.
+// The method takes crossLanguageValidatesRunnerConfiguration as parameter.
+project.ext.createCrossLanguageValidatesRunnerTask = {
 
 Review comment:
   added the postcommit test.
 



Issue Time Tracking
---

Worklog Id: (was: 243597)
Time Spent: 9h 40m  (was: 9.5h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=243594=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243594
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 16/May/19 19:20
Start Date: 16/May/19 19:20
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8592: [BEAM-7305] 
Improve and extend Hazelcast Jet based Java Runner
URL: https://github.com/apache/beam/pull/8592#discussion_r284856263
 
 

 ##
 File path: 
runners/jet-experimental/src/main/java/org/apache/beam/runners/jet/JetTransformTranslators.java
 ##
 @@ -79,7 +76,6 @@
 TRANSLATORS.put(PTransformTranslation.FLATTEN_TRANSFORM_URN, new 
FlattenTranslator());
 TRANSLATORS.put(PTransformTranslation.ASSIGN_WINDOWS_TRANSFORM_URN, new 
WindowTranslator());
 TRANSLATORS.put(PTransformTranslation.IMPULSE_TRANSFORM_URN, new 
ImpulseTranslator());
-TRANSLATORS.put(PTransformTranslation.TEST_STREAM_TRANSFORM_URN, new 
TestStreamTranslator());
 
 Review comment:
   Curious, why did you remove this?
 



Issue Time Tracking
---

Worklog Id: (was: 243594)
Time Spent: 2h 20m  (was: 2h 10m)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=243596=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243596
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 16/May/19 19:20
Start Date: 16/May/19 19:20
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8592: [BEAM-7305] 
Improve and extend Hazelcast Jet based Java Runner
URL: https://github.com/apache/beam/pull/8592#discussion_r284856811
 
 

 ##
 File path: 
runners/jet-experimental/src/main/java/org/apache/beam/runners/jet/Utils.java
 ##
 @@ -246,4 +259,34 @@ static boolean usesStateOrTimers(AppliedPTransform appliedTransform) {
 return WindowedValue.FullWindowedValueCoder.of(
 ListCoder.of(elementCoder.getValueCoder()), 
elementCoder.getWindowCoder());
   }
+
+  /** A wrapper of {@code byte[]} that can be used as a hash-map key. */
+  public static class ByteArrayKey {
+private final byte[] value;
+private int hash;
+
+public ByteArrayKey(@Nonnull byte[] value) {
+  this.value = value;
+}
+
+@Override
+public boolean equals(Object o) {
+  if (this == o) {
+return true;
+  }
+  if (o == null || getClass() != o.getClass()) {
+return false;
+  }
+  ByteArrayKey that = (ByteArrayKey) o;
+  return Arrays.equals(value, that.value);
+}
+
+@Override
+public int hashCode() {
+  if (hash == 0) {
 
 Review comment:
   Make `hash` an `Integer` and check for null here?
 



Issue Time Tracking
---

Worklog Id: (was: 243596)
Time Spent: 2.5h  (was: 2h 20m)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7305) Add first version of Hazelcast Jet Runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7305?focusedWorklogId=243595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243595
 ]

ASF GitHub Bot logged work on BEAM-7305:


Author: ASF GitHub Bot
Created on: 16/May/19 19:20
Start Date: 16/May/19 19:20
Worklog Time Spent: 10m 
  Work Description: mxm commented on pull request #8592: [BEAM-7305] 
Improve and extend Hazelcast Jet based Java Runner
URL: https://github.com/apache/beam/pull/8592#discussion_r284858529
 
 

 ##
 File path: 
runners/jet-experimental/src/main/java/org/apache/beam/runners/jet/JetTransformTranslators.java
 ##
 @@ -79,7 +76,6 @@
 TRANSLATORS.put(PTransformTranslation.FLATTEN_TRANSFORM_URN, new 
FlattenTranslator());
 TRANSLATORS.put(PTransformTranslation.ASSIGN_WINDOWS_TRANSFORM_URN, new 
WindowTranslator());
 TRANSLATORS.put(PTransformTranslation.IMPULSE_TRANSFORM_URN, new 
ImpulseTranslator());
-TRANSLATORS.put(PTransformTranslation.TEST_STREAM_TRANSFORM_URN, new 
TestStreamTranslator());
 
 Review comment:
   Ah, see that you moved it to the test runner.
 



Issue Time Tracking
---

Worklog Id: (was: 243595)
Time Spent: 2.5h  (was: 2h 20m)

> Add first version of Hazelcast Jet Runner
> -
>
> Key: BEAM-7305
> URL: https://issues.apache.org/jira/browse/BEAM-7305
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-jet
>Reporter: Maximilian Michels
>Assignee: Jozsef Bartok
>Priority: Major
> Fix For: 2.14.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-562) DoFn Reuse: Add new methods to DoFn

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-562?focusedWorklogId=243592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243592
 ]

ASF GitHub Bot logged work on BEAM-562:
---

Author: ASF GitHub Bot
Created on: 16/May/19 19:12
Start Date: 16/May/19 19:12
Worklog Time Spent: 10m 
  Work Description: yifanmai commented on issue #7994: [BEAM-562] Add 
DoFn.setup and DoFn.teardown to Python SDK
URL: https://github.com/apache/beam/pull/7994#issuecomment-493195732
 
 
   Thanks @aaltay, @kennknowles and @charlesccychen for your help!
   
   I added https://issues.apache.org/jira/browse/BEAM-7340 to track the issue 
related to metrics in DoFn.teardown, as discussed earlier.
 



Issue Time Tracking
---

Worklog Id: (was: 243592)
Time Spent: 10h 50m  (was: 10h 40m)

> DoFn Reuse: Add new methods to DoFn
> ---
>
> Key: BEAM-562
> URL: https://issues.apache.org/jira/browse/BEAM-562
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Yifan Mai
>Priority: Major
>  Labels: sdk-consistency
> Fix For: Not applicable
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> The Java SDK added setup and teardown methods to DoFns. This makes DoFns 
> reusable and provides performance improvements. The Python SDK should add 
> support for these new DoFn methods:
> Proposal doc: 
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit?ts=5771458f#





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243590=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243590
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 19:10
Start Date: 16/May/19 19:10
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284856682
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
 
 Review comment:
   Is it accepted behaviour to check the Python version within the code? 
 



Issue Time Tracking
---

Worklog Id: (was: 243590)
Time Spent: 4.5h  (was: 4h 20m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243591=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243591
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 19:11
Start Date: 16/May/19 19:11
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284856682
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -269,6 +272,14 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
  for (arg, hint) in zip(argspec.args, typeargs)]
   packed_typeargs += list(typeargs[len(packed_typeargs):])
 
+  if sys.version_info.major < 3:
+return getcallargs_forhints_impl_py2(func, argspec, packed_typeargs,
 
 Review comment:
   Is it accepted behaviour to check the Python version within the code? I don't 
see a problem with it, but I'm asking to learn :)
 



Issue Time Tracking
---

Worklog Id: (was: 243591)
Time Spent: 4h 40m  (was: 4.5h)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Created] (BEAM-7340) DoFn.teardown metrics are lost in Python SDK

2019-05-16 Thread Yifan Mai (JIRA)
Yifan Mai created BEAM-7340:
---

 Summary: DoFn.teardown metrics are lost in Python SDK
 Key: BEAM-7340
 URL: https://issues.apache.org/jira/browse/BEAM-7340
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-harness
Reporter: Yifan Mai


If user code in DoFn.teardown updates custom user metrics, those updates are 
not registered; for example, counter increments are lost.

Context: In 
[FnApiRunner.run_stages|https://github.com/apache/beam/blob/4629e82512ef1606c78cf28a2d66402c3533e23f/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L342-L364],
 DoFn.teardown is called in worker_handler_manager.close_all, but this is 
called outside of the FnApiRunner.run_stage calls, so no metrics / monitoring 
info is retrieved there.
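
The ordering problem described above can be sketched without Beam. This is a toy model only: `ToyFn`, `ToyRunner` and `run` are hypothetical names, and the assumption (taken from the context paragraph) is that metrics are read inside each stage while teardown runs afterwards.

```python
class ToyFn(object):
  """Hypothetical stand-in for a DoFn that counts calls per lifecycle method."""

  def __init__(self):
    self.counters = {}

  def _inc(self, name):
    self.counters[name] = self.counters.get(name, 0) + 1

  def process(self, element):
    self._inc('processed')

  def teardown(self):
    self._inc('teardowns')


class ToyRunner(object):
  """Hypothetical runner that snapshots metrics only at the end of a stage."""

  def __init__(self, fn):
    self.fn = fn
    self.reported = {}

  def run_stage(self, elements):
    for element in elements:
      self.fn.process(element)
    # Metrics are collected here, inside the stage.
    self.reported = dict(self.fn.counters)

  def close_all(self):
    # teardown runs after the last snapshot, so its update is never reported.
    self.fn.teardown()


def run(elements):
  runner = ToyRunner(ToyFn())
  runner.run_stage(elements)
  runner.close_all()
  return runner.reported, runner.fn.counters
```

In this model the `teardowns` counter exists on the function object but is missing from the reported snapshot, which mirrors the symptom in this issue.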





[jira] [Work logged] (BEAM-6988) TypeHints Py3 Error: test_non_function (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6988?focusedWorklogId=243593=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243593
 ]

ASF GitHub Bot logged work on BEAM-6988:


Author: ASF GitHub Bot
Created on: 16/May/19 19:16
Start Date: 16/May/19 19:16
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on pull request #8590: [BEAM-6988] 
Implement a Python 3 version of getcallargs_forhints
URL: https://github.com/apache/beam/pull/8590#discussion_r284858804
 
 

 ##
 File path: sdks/python/apache_beam/typehints/decorators.py
 ##
 @@ -314,10 +325,40 @@ def getcallargs_forhints(func, *typeargs, **typekwargs):
   return callargs
 
 
+def getcallargs_forhints_impl_py3(func, packed_typeargs, typekwargs):
+  try:
+# TODO(udim): Function signature returned by getfullargspec (in
+#  packed_typeargs) might differ from the one below. Migrate to use
+#  inspect.signature in getfullargspec (for Py3).
+signature = inspect.signature(func)
+  except ValueError as e:
+logger.warning('Could not get signature for function: %s: %s', func, e)
+return {}
+  try:
+bindings = signature.bind(*packed_typeargs, **typekwargs)
+  except TypeError as e:
+# Might be raised due to too few or too many arguments.
+raise TypeCheckError(e)
+  bound_args = bindings.arguments
+  missing = []
 
 Review comment:
   Is `missing` ever used other than in the if statement?
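
   As background for the `signature.bind` call in the diff above, here is a small standalone sketch of how `inspect.signature(...).bind` behaves when the arguments fit and when they do not. The names `bind_hints` and `example` are hypothetical, and returning `None` on failure is a simplification (the diff raises `TypeCheckError` instead).

```python
import inspect


def bind_hints(func, *typeargs, **typekwargs):
  """Maps hints onto func's parameters, or returns None if they do not fit."""
  signature = inspect.signature(func)
  try:
    bound = signature.bind(*typeargs, **typekwargs)
  except TypeError:
    # Raised by bind() for too few or too many arguments.
    return None
  # bound.arguments only contains parameters that were explicitly bound.
  return dict(bound.arguments)


def example(a, b, c=None):
  return a, b, c
```

   Here `bind_hints(example, int, str)` yields a mapping for `a` and `b` only; `c` is absent because it was not passed and `bind` does not fill in defaults.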
 



Issue Time Tracking
---

Worklog Id: (was: 243593)
Time Spent: 4h 50m  (was: 4h 40m)

> TypeHints Py3 Error: test_non_function 
> (apache_beam.typehints.typed_pipeline_test.MainInputTest) Fails on Python 3.7+
> -
>
> Key: BEAM-6988
> URL: https://issues.apache.org/jira/browse/BEAM-6988
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> {noformat}
> Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/typed_pipeline_test.py",
>  line 53, in test_non_function
>  result = ['xa', 'bbx', 'xcx'] | beam.Map(str.strip, 'x')
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 510, in _ror_
>  result = p.apply(self, pvalueish, label)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/pipeline.py",
>  line 514, in apply
>  transform.type_check_inputs(pvalueish)
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/transforms/ptransform.py",
>  line 753, in type_check_inputs
>  hints = getcallargs_forhints(argspec_fn, *type_hints[0], **type_hints[1])
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/decorators.py",
>  line 283, in getcallargs_forhints
>  raise TypeCheckError(e)
>  apache_beam.typehints.decorators.TypeCheckError: strip() missing 1 required 
> positional argument: 'chars'{noformat}





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243587=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243587
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:07
Start Date: 16/May/19 19:07
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493194159
 
 
   Run XVR_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 243587)
Time Spent: 9h 20m  (was: 9h 10m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243588=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243588
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:07
Start Date: 16/May/19 19:07
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493194159
 
 
   Run XVR_Flink PostCommit
 



Issue Time Tracking
---

Worklog Id: (was: 243588)
Time Spent: 9.5h  (was: 9h 20m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6985) TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6985?focusedWorklogId=243586=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243586
 ]

ASF GitHub Bot logged work on BEAM-6985:


Author: ASF GitHub Bot
Created on: 16/May/19 19:06
Start Date: 16/May/19 19:06
Worklog Time Spent: 10m 
  Work Description: NikeNano commented on issue #8453: [BEAM-6985] 
TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+ Updates
URL: https://github.com/apache/beam/pull/8453#issuecomment-493194050
 
 
   PTAL @tvalentyn. I have added a test for the ordering to make sure the 
behaviour is the same for Python 2 and Python 3. 
 



Issue Time Tracking
---

Worklog Id: (was: 243586)
Time Spent: 5h 20m  (was: 5h 10m)

> TypeHints Py3 Error: Native type compatibility tests fail on Python 3.7+
> 
>
> Key: BEAM-6985
> URL: https://issues.apache.org/jira/browse/BEAM-6985
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: niklas Hansson
>Priority: Major
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> The following tests are failing:
> * test_convert_nested_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  
> * test_convert_to_beam_types 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
> With similar errors, where `typing. != `. eg:
> {noformat}
>  FAIL: test_convert_to_beam_type 
> (apache_beam.typehints.native_type_compatibility_test.NativeTypeCompatibilityTest)
>  --
>  Traceback (most recent call last):
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PreCommit_Python_Commit/src/sdks/python/test-suites/tox/py37/build/srcs/sdks/python/apache_beam/typehints/native_type_compatibility_test.py",
>  line 79, in test_convert_to_beam_type
>  beam_type, description)
>  AssertionError: typing.Dict[bytes, int] != Dict[bytes, int] : simple dict
> {noformat}
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243585=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243585
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:05
Start Date: 16/May/19 19:05
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284854976
 
 

 ##
 File path: 
runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/ExternalTest.java
 ##
 @@ -86,26 +84,27 @@ public static void tearDown() {
   @Test
   @Category({ValidatesRunner.class, UsesCrossLanguageTransforms.class})
   public void expandSingleTest() {
-PCollection col =
+PCollection col =
 testPipeline
-.apply(Create.of(1, 2, 3))
+.apply(Create.of("1", "2", "3"))
 
 Review comment:
   The test was modified to match the Python external_test as closely as 
possible. Previously it just added 1; now it concatenates `Simple(%s)`.
 



Issue Time Tracking
---

Worklog Id: (was: 243585)
Time Spent: 9h 10m  (was: 9h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Closed] (BEAM-563) DoFn Reuse: Update DirectRunner

2019-05-16 Thread Yifan Mai (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Mai closed BEAM-563.
--
   Resolution: Done
Fix Version/s: Not applicable

This was also done in https://github.com/apache/beam/pull/7994

> DoFn Reuse: Update DirectRunner
> ---
>
> Key: BEAM-563
> URL: https://issues.apache.org/jira/browse/BEAM-563
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Priority: Major
> Fix For: Not applicable
>
>
> https://issues.apache.org/jira/browse/BEAM-562 will add setup and teardown 
> methods to DoFns. Update DirectRunner to add support for these new methods.





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243584=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243584
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:03
Start Date: 16/May/19 19:03
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284854091
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker_main.py
 ##
 @@ -197,6 +198,19 @@ def _get_worker_count(pipeline_options):
   return 12
 
 
+def _load_avro_coder(pipeline_options):
+  experiments = pipeline_options.view_as(DebugOptions).experiments
+
+  experiments = experiments if experiments else []
+
+  for experiment in experiments:
+# There should only be 1 match so returning from the loop
+if re.match(r'xlang_test', experiment):
 
 Review comment:
   same here.
 



Issue Time Tracking
---

Worklog Id: (was: 243584)
Time Spent: 9h  (was: 8h 50m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Commented] (BEAM-7230) Using JdbcIO creates huge amount of connections

2019-05-16 Thread Brachi Packter (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841646#comment-16841646
 ] 

Brachi Packter commented on BEAM-7230:
--

Hi.

I looked into the code. It seems that when using this API:

https://github.com/apache/beam/blob/adb6d0c9f790c9cda363dd5d14f03fb11362f4d1/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L298

you are setting the data source via:

https://github.com/apache/beam/blob/adb6d0c9f790c9cda363dd5d14f03fb11362f4d1/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L361

so it is not static and is created per function...
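
To illustrate why this would matter, here is a toy sketch in Python (not Beam or JDBC code). All class and function names are hypothetical, and it only models the hypothesis above: if every function instance builds its own pool, the total connection limit grows with the number of instances, whereas a shared pool keeps one limit per process.

```python
class ToyPool(object):
  """Hypothetical connection pool with a fixed upper limit."""

  def __init__(self, max_size):
    self.max_size = max_size


class PerInstanceWriter(object):
  """Each writer builds its own pool."""

  def __init__(self, max_size):
    self.pool = ToyPool(max_size)


class SharedPoolWriter(object):
  """All writers in one process reuse a single class-level pool."""
  _pool = None

  def __init__(self, max_size):
    # Not thread-safe; a real implementation would need a lock here.
    if SharedPoolWriter._pool is None:
      SharedPoolWriter._pool = ToyPool(max_size)
    self.pool = SharedPoolWriter._pool


def max_connections(writers):
  # Upper bound: sum of the limits of the distinct pools in use.
  pools = {id(w.pool): w.pool for w in writers}
  return sum(p.max_size for p in pools.values())
```

With ten writers and a limit of 300, the per-instance variant allows up to 3000 connections, while the shared variant stays at 300.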

> Using JdbcIO creates huge amount of connections
> ---
>
> Key: BEAM-7230
> URL: https://issues.apache.org/jira/browse/BEAM-7230
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.11.0
>Reporter: Brachi Packter
>Assignee: Ismaël Mejía
>Priority: Major
> Fix For: 2.13.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I want to write from Dataflow to GCP Cloud SQL. I'm using a connection pool, 
> but I still see a huge number of connections in GCP SQL (4k, while I set the 
> connection pool to 300), and most of them are sleeping.





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243582=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243582
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:02
Start Date: 16/May/19 19:02
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284853783
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker_main.py
 ##
 @@ -136,6 +136,7 @@ def main(unused_argv):
 service_descriptor = endpoints_pb2.ApiServiceDescriptor()
 text_format.Merge(os.environ['CONTROL_API_SERVICE_DESCRIPTOR'],
   service_descriptor)
+_load_avro_coder(sdk_pipeline_options)
 
 Review comment:
   Thanks for pointing this out. Will fix this (this is from an old design, 
when I thought that the Avro coder was only for testing xlang).
 



Issue Time Tracking
---

Worklog Id: (was: 243582)
Time Spent: 8h 40m  (was: 8.5h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following.
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243583=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243583
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 19:02
Start Date: 16/May/19 19:02
Worklog Time Spent: 10m 
  Work Description: ihji commented on pull request #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284853870
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1625,6 +1645,110 @@ class BeamModulePlugin implements Plugin {
 
 /** 
***/
 
+// Method to create the crossLanguageValidatesRunnerTask.
+// The method takes crossLanguageValidatesRunnerConfiguration as parameter.
+project.ext.createCrossLanguageValidatesRunnerTask = {
+  def config = it ? it as CrossLanguageValidatesRunnerConfiguration : new 
CrossLanguageValidatesRunnerConfiguration()
+
+  project.evaluationDependsOn(":sdks:python")
+  project.evaluationDependsOn(":sdks:java:testing:expansion-service")
+  project.evaluationDependsOn(":runners:core-construction-java")
+
+  // Task for launching expansion services
+  def envDir = project.project(":sdks:python").envdir
+  def pythonDir = project.project(":sdks:python").projectDir
+  def javaPort = startingExpansionPortNumber.getAndDecrement()
+  def pythonPort = startingExpansionPortNumber.getAndDecrement()
+  def expansionJar = 
project.project(':sdks:java:testing:expansion-service').buildTestExpansionServiceJar.archivePath
+  def expansionServiceOpts = [
+"group_id": project.name,
+"java_expansion_service_jar": expansionJar,
+"java_port": javaPort,
+"python_virtualenv_dir": envDir,
+"python_expansion_service_module": 
"apache_beam.runners.portability.expansion_service_test",
+"python_port": pythonPort
+  ]
+  def serviceArgs = 
project.project(':sdks:python').mapToArgString(expansionServiceOpts)
+  def setupTask = project.tasks.create(name: config.name+"Setup", type: 
Exec) {
+dependsOn ':sdks:java:container:docker'
+dependsOn ':sdks:python:container:docker'
+dependsOn 
':sdks:java:testing:expansion-service:buildTestExpansionServiceJar'
+dependsOn ":sdks:python:installGcpTest"
+// setup test env
+executable 'sh'
+args '-c', "$pythonDir/scripts/run_expansion_services.sh stop --group_id ${project.name} && $pythonDir/scripts/run_expansion_services.sh start $serviceArgs"
+  }
+
+  def mainTask = project.tasks.create(name: config.name) {
+group = "Verification"
+description = "Validates cross-language capability of runner"
+  }
+
+  def cleanupTask = project.tasks.create(name: config.name+'Cleanup', type: Exec) {
+// teardown test env
+executable 'sh'
+args '-c', "$pythonDir/scripts/run_expansion_services.sh stop --group_id ${project.name}"
+  }
+  setupTask.finalizedBy cleanupTask
+
+  // Task for running testcases in Java SDK
+  def beamJavaTestPipelineOptions = [
+"--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
+"--jobServerDriver=${config.jobServerDriver}",
+"--environmentCacheMillis=1"
+  ]
+  beamJavaTestPipelineOptions.addAll(config.pipelineOpts)
+  if (config.jobServerConfig) {
+beamJavaTestPipelineOptions.add("--jobServerConfig=${config.jobServerConfig}")
+  }
+  ['Java': javaPort, 'Python': pythonPort].each { sdk, port ->
+def javaTask = project.tasks.create(name: config.name+"JavaUsing"+sdk, type: Test) {
+  group = "Verification"
+  description = "Validates runner for cross-language capability of using ${sdk} transforms from Java SDK"
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson(beamJavaTestPipelineOptions)
+  systemProperty "expansionPort", port
+  classpath = config.testClasspathConfiguration
+  testClassesDirs = project.files(project.project(":runners:core-construction-java").sourceSets.test.output.classesDirs)
+  maxParallelForks config.numParallelTests
+  useJUnit(config.testCategories)
+  // increase maxHeapSize as this is directly correlated to direct memory,
+  // see https://issues.apache.org/jira/browse/BEAM-6698
+  maxHeapSize = '4g'
+  dependsOn setupTask
+}
+mainTask.dependsOn javaTask
+cleanupTask.mustRunAfter javaTask
+
+// Task for running testcases in Python SDK
+def testOpts = [
+  "--attr=UsesCrossLanguageTransforms"
+]
+def 

[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243577=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243577
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 18:57
Start Date: 16/May/19 18:57
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493190945
 
 
   run seed job
 



Issue Time Tracking
---

Worklog Id: (was: 243577)
Time Spent: 8h 20m  (was: 8h 10m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following:
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243578
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 18:57
Start Date: 16/May/19 18:57
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493190945
 
 
   run seed job
 



Issue Time Tracking
---

Worklog Id: (was: 243578)
Time Spent: 8.5h  (was: 8h 20m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following:
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243576=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243576
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 18:57
Start Date: 16/May/19 18:57
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493185048
 
 
   run seed job
 



Issue Time Tracking
---

Worklog Id: (was: 243576)
Time Spent: 8h 10m  (was: 8h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following:
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243575=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243575
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 18:55
Start Date: 16/May/19 18:55
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on issue #8596: [BEAM-7339] Make 
input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596#issuecomment-493190447
 
 
   +R: @yifanzou 
 



Issue Time Tracking
---

Worklog Id: (was: 243575)
Time Spent: 20m  (was: 10m)

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Commented] (BEAM-5510) Records including datetime to be saved as DATETIME or TIMESTAMP in BigQuery

2019-05-16 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841636#comment-16841636
 ] 

Ahmet Altay commented on BEAM-5510:
---

cc: [~chamikara] [~pabloem]

> Records including datetime to be saved as DATETIME or TIMESTAMP in BigQuery
> ---
>
> Key: BEAM-5510
> URL: https://issues.apache.org/jira/browse/BEAM-5510
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.6.0
>Reporter: Pascal Gula
>Priority: Major
>
> When trying to write a row to BigQuery that includes a Python datetime 
> object, the marshaling used to save the row in BigQuery fails.
> {code:java}
> File 
> "/home/pascal/Wks/GitHub/PEAT-AI/Albatros/venv/local/lib/python2.7/site-packages/apache_beam/internal/gcp/json_value.py",
>  line 124, in to_json_value
>     raise TypeError('Cannot convert %s to a JSON value.' % repr(obj))
> TypeError: Cannot convert datetime.datetime(2018, 9, 25, 18, 57, 18, 108579) 
> to a JSON value. [while running 'save/WriteToBigQuery']
> {code}
> However, this is something perfectly feasible, as `google-cloud-python` 
> supports it since this issue has been solved: 
> [https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2957]
> thanks to this pull request: 
> [https://github.com/GoogleCloudPlatform/google-cloud-python/pull/3426/files]
> A similar approach could be taken for the `json_value.py` helper.
> Is there any workaround that can be applied to solve this issue? 
>  
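
One possible workaround for the question above, as a minimal sketch and not a confirmed fix: render datetime values as ISO 8601 strings before the row reaches the JSON conversion step. The helper name `datetimes_to_strings` and the sample row are hypothetical, and whether BigQuery accepts the resulting string for a given DATETIME or TIMESTAMP column is an assumption that should be checked against the target schema.

```python
import datetime
import json


def datetimes_to_strings(row):
    """Return a copy of `row` with datetime/date values as ISO 8601 strings.

    Hypothetical helper for illustration; it is not part of apache_beam.
    """
    converted = {}
    for key, value in row.items():
        if isinstance(value, (datetime.datetime, datetime.date)):
            # isoformat() yields e.g. '2018-09-25T18:57:18.108579'
            converted[key] = value.isoformat()
        else:
            converted[key] = value
    return converted


# Sample row using the datetime value from the traceback above.
row = {"id": 1, "created": datetime.datetime(2018, 9, 25, 18, 57, 18, 108579)}
safe_row = datetimes_to_strings(row)

# The converted row now serializes without raising TypeError.
encoded = json.dumps(safe_row)
```

In a pipeline, a function like this could presumably be applied with a `beam.Map(datetimes_to_strings)` step placed just before the BigQuery write.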





[jira] [Work logged] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-7339?focusedWorklogId=243565=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243565
 ]

ASF GitHub Bot logged work on BEAM-7339:


Author: ASF GitHub Bot
Created on: 16/May/19 18:43
Start Date: 16/May/19 18:43
Worklog Time Spent: 10m 
  Work Description: markflyhigh commented on pull request #8596: 
[BEAM-7339] Make input and checksum configurable for Python WordCountIT
URL: https://github.com/apache/beam/pull/8596
 
 
   This is step 1 to support large input for WordCountIT benchmark.
   
   Make `input` and `expect_checksum` configurable from command line for 
WordCountIT.
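
   As a rough illustration of what "configurable from command line" could look like (a sketch only, not the code in this pull request): the two option names come from the description above, while `parse_it_options` and both default values are hypothetical placeholders.

```python
import argparse

# Hypothetical placeholders; the real defaults live in the test itself.
DEFAULT_INPUT = "gs://example-bucket/input/*"
DEFAULT_CHECKSUM = "placeholder-checksum"


def parse_it_options(argv):
    """Split test-specific options from the remaining pipeline arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", default=DEFAULT_INPUT,
                        help="Input file pattern to count words in.")
    parser.add_argument("--expect_checksum", default=DEFAULT_CHECKSUM,
                        help="Checksum the output is verified against.")
    # parse_known_args leaves unrecognized flags (e.g. runner options)
    # untouched so they can be forwarded to the pipeline.
    known, remaining = parser.parse_known_args(argv)
    return known, remaining


known, remaining = parse_it_options([
    "--input=gs://apache-beam-samples/input_small_files/*",
    "--runner=TestDataflowRunner",
])
```

   Falling back to defaults keeps existing invocations working, while a benchmark run can override both values together so that the checksum matches the larger input.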
   
   
   

[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243560=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243560
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 18:40
Start Date: 16/May/19 18:40
Worklog Time Spent: 10m 
  Work Description: ihji commented on issue #8174: [BEAM-6683] add 
createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#issuecomment-493185048
 
 
   run seed job
 



Issue Time Tracking
---

Worklog Id: (was: 243560)
Time Spent: 8h  (was: 7h 50m)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following:
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  





[jira] [Commented] (BEAM-7339) Enable 1Gb input for Python wordcount benchmark

2019-05-16 Thread Ahmet Altay (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841629#comment-16841629
 ] 

Ahmet Altay commented on BEAM-7339:
---

If this is for a benchmark, should we target a larger input (> 100 GB)? Is there 
a reason, such as a limitation, for us to use 1 GB?

> Enable 1Gb input for Python wordcount benchmark
> ---
>
> Key: BEAM-7339
> URL: https://issues.apache.org/jira/browse/BEAM-7339
> Project: Beam
>  Issue Type: Task
>  Components: testing
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
>
> Requirement:
> - Use input from: gs://apache-beam-samples/input_small_files/*
> - Use TestDataflowRunner
> - Limit worker number
> - Disable autoscaling
> - Enable both py2 and py3 benchmarks





[jira] [Work logged] (BEAM-6683) Add an integration test suite for cross-language transforms for Flink runner

2019-05-16 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-6683?focusedWorklogId=243556=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-243556
 ]

ASF GitHub Bot logged work on BEAM-6683:


Author: ASF GitHub Bot
Created on: 16/May/19 18:36
Start Date: 16/May/19 18:36
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on pull request #8174: 
[BEAM-6683] add createCrossLanguageValidatesRunner task
URL: https://github.com/apache/beam/pull/8174#discussion_r284839573
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker_main.py
 ##
 @@ -136,6 +136,7 @@ def main(unused_argv):
 service_descriptor = endpoints_pb2.ApiServiceDescriptor()
 text_format.Merge(os.environ['CONTROL_API_SERVICE_DESCRIPTOR'],
   service_descriptor)
+_load_avro_coder(sdk_pipeline_options)
 
 Review comment:
   Can you explain why we need to import the AvroCoder here (but not the other 
coders)? Can we load coders in a uniform way?
 



Issue Time Tracking
---

Worklog Id: (was: 243556)
Time Spent: 7h 40m  (was: 7.5h)

> Add an integration test suite for cross-language transforms for Flink runner
> 
>
> Key: BEAM-6683
> URL: https://issues.apache.org/jira/browse/BEAM-6683
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Heejong Lee
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> We should add an integration test suite that covers the following:
> (1) Currently available Java IO connectors that do not use UDFs work for 
> Python SDK on Flink runner.
> (2) Currently available Python IO connectors that do not use UDFs work for 
> Java SDK on Flink runner.
> (3) Currently available Java/Python pipelines work in a scalable manner for 
> cross-language pipelines (for example, try 10GB, 100GB input for 
> textio/avroio for Java and Python). 
>  




