[jira] [Comment Edited] (BEAM-7993) portable python precommit is flaky

2019-08-29 Thread Hannah Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919211#comment-16919211
 ] 

Hannah Jiang edited comment on BEAM-7993 at 8/30/19 5:44 AM:
-

I tried running this locally with py2 and py36.

I ran it more than 10 times, each time with py2 and py36 in parallel, and none 
of the runs failed. So I haven't tried running them one at a time, because 
these results don't suggest that a non-parallel run would solve the problem.

I wrote a build.gradle to trigger the Python portable precommit job locally. 
Jenkins uses a job to build the tasks, so my setup might not be exactly the same.

[~markflyhigh], are you familiar with Gradle/Jenkins? I tried to remove the 
parallel part from the Jenkins job, but all precommit jobs share the same job 
builder, so the impact would be very broad. Is it possible for you to remove 
the parallel part for the Python Portable Precommit tasks only?


was (Author: hannahjiang):
I tried running this locally with py2 and py36.

I wrote a build.gradle to trigger the Python portable precommit job locally. 
Jenkins uses a job to build the tasks, so my setup might not be exactly the same.

I ran it more than 10 times, each time with py2 and py36 in parallel, and none 
of the runs failed. So I haven't tried running them one at a time, because 
these results don't suggest that a non-parallel run would solve the problem.

[~markflyhigh], are you familiar with Gradle? I tried to remove the parallel 
part from the Jenkins job, but all precommit jobs share the same job builder, 
so the impact would be very broad. Is it possible for you to remove the 
parallel part for the Python Portable Precommit tasks only?

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, test-failures, testing
> Affects Versions: 2.15.0
> Reporter: Udi Meiri
> Assignee: Mark Liu
> Priority: Major
> Labels: currently-failing
> Fix For: 2.16.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=304055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304055
 ]

ASF GitHub Bot logged work on BEAM-7730:


Author: ASF GitHub Bot
Created on: 30/Aug/19 05:42
Start Date: 30/Aug/19 05:42
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on issue #9296: WIP: 
[BEAM-7730] Introduce Flink 1.9 Runner
URL: https://github.com/apache/beam/pull/9296#issuecomment-526464549
 
 
   Thanks @dmvk, I'd like to check it and will then give you feedback! :)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 304055)
Time Spent: 1h  (was: 50m)

> Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
> -
>
> Key: BEAM-7730
> URL: https://issues.apache.org/jira/browse/BEAM-7730
> Project: Beam
> Issue Type: New Feature
> Components: runner-flink
> Reporter: sunjincheng
> Assignee: David Moravek
> Priority: Major
> Fix For: 2.16.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Apache Flink 1.9 is coming soon, so it would be good to add a Flink 1.9 build 
> target and make the Flink Runner compatible with Flink 1.9.
> I will add a brief summary of the changes after Flink 1.9.0 is released. 
> I would appreciate any suggestions or comments!



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-29 Thread Hannah Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919211#comment-16919211
 ] 

Hannah Jiang commented on BEAM-7993:


I tried running this locally with py2 and py36.

I wrote a build.gradle to trigger the Python portable precommit job locally. 
Jenkins uses a job to build the tasks, so my setup might not be exactly the same.

I ran it more than 10 times, each time with py2 and py36 in parallel, and none 
of the runs failed. So I haven't tried running them one at a time, because 
these results don't suggest that a non-parallel run would solve the problem.

[~markflyhigh], are you familiar with Gradle? I tried to remove the parallel 
part from the Jenkins job, but all precommit jobs share the same job builder, 
so the impact would be very broad. Is it possible for you to remove the 
parallel part for the Python Portable Precommit tasks only?

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core, test-failures, testing
> Affects Versions: 2.15.0
> Reporter: Udi Meiri
> Assignee: Mark Liu
> Priority: Major
> Labels: currently-failing
> Fix For: 2.16.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.<init>(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> 

[jira] [Work logged] (BEAM-8111) SchemaCoder broken on DataflowRunner

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8111?focusedWorklogId=304054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304054
 ]

ASF GitHub Bot logged work on BEAM-8111:


Author: ASF GitHub Bot
Created on: 30/Aug/19 05:37
Start Date: 30/Aug/19 05:37
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on issue #9455: [BEAM-8111] 
Update dataflow container version to revert SchemaCoder change
URL: https://github.com/apache/beam/pull/9455#issuecomment-526463400
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 304054)
Time Spent: 50m  (was: 40m)

> SchemaCoder broken on DataflowRunner
> 
>
> Key: BEAM-8111
> URL: https://issues.apache.org/jira/browse/BEAM-8111
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow, sdk-java-core
> Affects Versions: 2.15.0
> Reporter: Brian Hulette
> Assignee: Brian Hulette
> Priority: Blocker
> Time Spent: 50m
> Remaining Estimate: 0h
>
> https://github.com/apache/beam/commit/e65c176a9f34e45d408281e1101a2ae54cef0f6c
>  broke SchemaCoder on Dataflow. When translating a schema that uses logical 
> types from a cloud object, Dataflow encounters a runtime error.
> This means any pipeline that uses SqlTransform or schema transforms will fail 
> on Dataflow in 2.15.0.





[jira] [Commented] (BEAM-7978) ArithmeticExceptions on getting backlog bytes

2019-08-29 Thread Mateusz (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919201#comment-16919201
 ] 

Mateusz commented on BEAM-7978:
---

[~aromanenko], looks good from my perspective

> ArithmeticExceptions on getting backlog bytes 
> --
>
> Key: BEAM-7978
> URL: https://issues.apache.org/jira/browse/BEAM-7978
> Project: Beam
> Issue Type: Bug
> Components: io-java-kinesis
> Affects Versions: 2.14.0
> Reporter: Mateusz
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hello,
> Beam 2.14.0
>  (and to be more precise 
> [commit|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ec])
>  introduced a change in watermark calculation in Kinesis IO causing below 
> error:
> {code:java}
> exception:  "java.lang.RuntimeException: Unknown kinesis failure, when trying 
> to reach kinesis
>   at 
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:227)
>   at 
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:167)
>   at 
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:155)
>   at 
> org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:158)
>   at 
> org.apache.beam.runners.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:433)
>   at 
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1289)
>   at 
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
>   at 
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 
> 153748963401
>   at org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:229)
>   at 
> org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
>   at 
> org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
>   at org.joda.time.Minutes.minutesBetween(Minutes.java:101)
>   at 
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:169)
>   at 
> org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:210)
>   ... 10 more
> {code}
> We spotted this issue on the Dataflow runner. It's problematic because the 
> inability to get backlog bytes seems to result in constant recreation of 
> KinesisReader.
> The issue happens if the backlog bytes are retrieved before the watermark 
> value is updated from its initial default value. An easy way to reproduce it 
> is to create a pipeline with a Kinesis source for a stream to which no records 
> are being put. While debugging it locally, you can observe that the watermark 
> is set to a value in the past (like: "-290308-12-21T19:59:05.225Z"). After two 
> minutes (the default watermark idle duration threshold is 2 minutes), the 
> watermark is set to the value of 
> [watermarkIdleThreshold|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkPolicyFactory.java#L110],
>  so the next backlog bytes retrieval should be correct. However, as described 
> before, running the pipeline on the Dataflow runner results in KinesisReader 
> being closed just after creation, so the watermark won't be fixed.
> The reason for the issue is the following: the introduced watermark policies 
> rely on 
> [WatermarkParameters|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkParameters.java]
>  which initialises currentWatermark and eventTime to 
> [BoundedWindow.TIMESTAMP_MIN_VALUE|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ecR52].
>  This results in the watermark being set to new Instant(-9223372036854775L) at 
> KinesisReader creation. The calculated [period between the watermark and the 
> current 
> timestamp|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/SimplifiedKinesisClient.java#L169]
>  is bigger than expected 
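
The overflow described in the issue can be illustrated with a minimal sketch that uses only the JDK. It does not call Joda-Time or Beam: Math.toIntExact stands in for the int narrowing that Minutes.minutesBetween performs in the stack trace above, and the class name, method names, and the fixed "current time" are assumptions made for illustration. Only the initial watermark value is taken from the issue text.

```java
import java.util.concurrent.TimeUnit;

public class WatermarkOverflowSketch {
    // Millisecond value of the initial watermark quoted in the issue
    // (new Instant(-9223372036854775L)).
    static final long INITIAL_WATERMARK_MILLIS = -9223372036854775L;

    // Whole minutes between two epoch-millisecond timestamps, kept as a long.
    static long minutesBetween(long startMillis, long endMillis) {
        return TimeUnit.MILLISECONDS.toMinutes(endMillis - startMillis);
    }

    // True when the value can be narrowed to an int without loss.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Hypothetical "current time": a fixed instant in late August 2019.
        long nowMillis = 1_567_000_000_000L;
        long minutes = minutesBetween(INITIAL_WATERMARK_MILLIS, nowMillis);
        System.out.println("minutes=" + minutes + ", fitsInInt=" + fitsInInt(minutes));
        try {
            // Stand-in for the int conversion; throws ArithmeticException on overflow.
            Math.toIntExact(minutes);
        } catch (ArithmeticException e) {
            System.out.println("narrowing to int failed: " + e.getMessage());
        }
    }
}
```

The minute count stays representable as a long but is far larger than Integer.MAX_VALUE, which is consistent with the "Value cannot fit in an int" message in the trace. One possible direction, offered only as a suggestion, is to keep the difference in a long-based duration until the watermark has moved off its initial value.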

[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304046
 ]

ASF GitHub Bot logged work on BEAM-7972:


Author: ASF GitHub Bot
Created on: 30/Aug/19 05:16
Start Date: 30/Aug/19 05:16
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #9334: [BEAM-7972] Always 
use Global window in reshuffle and then apply wind…
URL: https://github.com/apache/beam/pull/9334#issuecomment-526458583
 
 
   Requesting another review pass.
 



Issue Time Tracking
---

Worklog Id: (was: 304046)
Time Spent: 3h 20m  (was: 3h 10m)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Ankur Goenka
> Assignee: Ankur Goenka
> Priority: Blocker
> Fix For: 2.16.0
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> Streaming pipelines get stuck when using Reshuffle with a windowed PCollection.
> The issue happens because the window function gets deserialized on the Java 
> side, which is not possible, so it defaults to the global window function and 
> results in a window function mismatch later in the code.





[jira] [Work logged] (BEAM-8111) SchemaCoder broken on DataflowRunner

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8111?focusedWorklogId=304042&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304042
 ]

ASF GitHub Bot logged work on BEAM-8111:


Author: ASF GitHub Bot
Created on: 30/Aug/19 04:52
Start Date: 30/Aug/19 04:52
Worklog Time Spent: 10m 
  Work Description: TheNeuralBit commented on pull request #9455: 
[BEAM-8111] Update dataflow container version to revert SchemaCoder change
URL: https://github.com/apache/beam/pull/9455
 
 
   R: @aaltay 
   

[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304032&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304032
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 30/Aug/19 04:15
Start Date: 30/Aug/19 04:15
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9451: [BEAM-8113] 
Stage files from context classloader
URL: https://github.com/apache/beam/pull/9451#discussion_r319351193
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkRunner.java
 ##
 @@ -76,8 +80,15 @@ public static FlinkRunner fromOptions(PipelineOptions 
options) {
 }
 
 if (flinkOptions.getFilesToStage() == null) {
-  flinkOptions.setFilesToStage(
-  detectClassPathResourcesToStage(FlinkRunner.class.getClassLoader()));
+  List filesToStage =
+  Stream.concat(
+  Stream.of(FlinkRunner.class.getClassLoader()),
 
 Review comment:
   Please use `ReflectHelpers#findClassLoader`
 



Issue Time Tracking
---

Worklog Id: (was: 304032)
Time Spent: 1h 10m  (was: 1h)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.
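
As a rough illustration of the idea in this issue, the sketch below gathers the two class loaders mentioned above into one de-duplicated list using only the JDK. The class and method names are hypothetical, and the step that would hand each loader to Beam's classpath detection is deliberately left out, since this is not the code from the pull request.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ClassLoaderCandidatesSketch {
    // Returns the distinct, non-null class loaders worth scanning:
    // the loader of the given class plus the current thread's context loader.
    static List<ClassLoader> candidateLoaders(Class<?> anchor) {
        return Stream.of(anchor.getClassLoader(), Thread.currentThread().getContextClassLoader())
            .filter(Objects::nonNull) // bootstrap-loaded classes report a null loader
            .distinct()               // both sources are often the same loader
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (ClassLoader loader : candidateLoaders(ClassLoaderCandidatesSketch.class)) {
            System.out.println(loader);
        }
    }
}
```

De-duplicating matters because in a plain application both calls commonly return the same loader, and scanning it twice would list the same files twice.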





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304026
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 03:58
Start Date: 30/Aug/19 03:58
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #9440: [BEAM-5428] Modify 
cache token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#issuecomment-526446443
 
 
   If possible, please regenerate the Go proto bindings by following 
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/model/PROTOBUF.md
 



Issue Time Tracking
---

Worklog Id: (was: 304026)
Time Spent: 11h  (was: 10h 50m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 11h
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304024&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304024
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 03:56
Start Date: 30/Aug/19 03:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9440: [BEAM-5428] 
Modify cache token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#discussion_r319348413
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -232,9 +232,35 @@ message ProcessBundleRequest {
   // instantiated and executed by the SDK harness.
   string process_bundle_descriptor_reference = 1;
 
+  // A cache token which can be used by an SDK to check for the validity
+  // of cached elements which have a cache token associated.
+  message CacheToken {
+
+// A flag to indicate a cache token is valid for user state.
+message UserState {}
+
+// A flag to indicate a cache token is valid for a side input.
+message SideInput {
+  // The id of a side input.
+  string side_input = 1;
+}
+
+// The scope of a cache token.
+oneof type {
+  UserState user_state = 1;
+  SideInput side_input = 2;
+}
+
+// The cache token identifier which should be globally unique.
+bytes token = 10;
+  }
+
   // (Optional) A list of cache tokens that can be used by an SDK to reuse
   // cached data returned by the State API across multiple bundles.
-  repeated bytes cache_tokens = 2;
+  repeated CacheToken cache_tokens = 3;
+
+  // Old version of cache_tokens field
+  reserved 2;
 
 Review comment:
   Delete it.
 



Issue Time Tracking
---

Worklog Id: (was: 304024)
Time Spent: 10h 50m  (was: 10h 40m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 10h 50m
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304023&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304023
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 03:56
Start Date: 30/Aug/19 03:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9440: [BEAM-5428] 
Modify cache token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#discussion_r319348727
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -595,10 +621,6 @@ message StateResponse {
   // failed.
   string error = 2;
 
-  // (Optional) If this is specified, then the result of this state request
-  // can be cached using the supplied token.
-  bytes cache_token = 3;
 
 Review comment:
   I don't think we need it as part of the request either since the state key 
uniquely describes which ProcessBundleDescriptor is being used as long as the 
runner doesn't reuse instruction ids.
 



Issue Time Tracking
---

Worklog Id: (was: 304023)
Time Spent: 10h 50m  (was: 10h 40m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304025
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 30/Aug/19 03:56
Start Date: 30/Aug/19 03:56
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on pull request #9440: [BEAM-5428] 
Modify cache token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#discussion_r319348379
 
 

 ##
 File path: model/fn-execution/src/main/proto/beam_fn_api.proto
 ##
 @@ -610,6 +632,9 @@ message StateResponse {
 // A response to clearing state.
 StateClearResponse clear = 1002;
   }
+
+  // Cache token which used to be part of the StateResponse
+  reserved 3;
 
 Review comment:
   Delete it.
 



Issue Time Tracking
---

Worklog Id: (was: 304025)
Time Spent: 10h 50m  (was: 10h 40m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=304019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304019
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 30/Aug/19 03:23
Start Date: 30/Aug/19 03:23
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
Python3 docker containers
URL: https://github.com/apache/beam/pull/9351#issuecomment-526440933
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 304019)
Time Spent: 9.5h  (was: 9h 20m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>






[jira] [Comment Edited] (BEAM-7049) Merge multiple input to one BeamUnionRel

2019-08-29 Thread sridhar Reddy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919139#comment-16919139
 ] 

sridhar Reddy edited comment on BEAM-7049 at 8/30/19 3:00 AM:
--

After several attempts to generalize union with multiple operands, I am unable 
to locate a place where this can be done. Any attempt to pull inputs from lower 
operands to upper operands actually limits output. I am not sure if I am 
looking in the right places.

[~amaliujia] Can you please confirm that UnionMergeRule (or a variation of it) is 
the right place to make these changes?

[https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128]

I tried several changes at the above location, but without any luck. 
In fact, any change made there limits the union result. For example, select 1 union 2 
union 3 union 4 results in [1,2,3], skipping 4. 

Simply hardcoding the inputs in BeamUnionRel was also not possible beyond 3 inputs.

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76]

 

Wherever I looked, most of the operations are done on 2 or 3 operands only.

Only in CalciteQueryPlanner can I find everything in one place:

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169]

Is there a place where one can access beamRelNode, or a similar structure 
where all the inputs are present, so that they can be manipulated? CalciteQueryPlanner 
doesn't seem like a good place to do this.


was (Author: sridharg):
After several attempts to generalize union with multiple operands, I am unable 
to locate a place where this can be done. Any attempt to pull inputs from lower 
operands to upper operands actually limits output. I am not sure I am looking 
in the right places.

[~amaliujia] Can you please confirm that UnionMergeRule (or a variation of it) is 
the right place to make these changes?

[https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128]

I tried several changes at the above location, but without any luck. 
In fact, any change made there limits the union result. For example, select 1 union 2 
union 3 union 4 results in [1,2,3], skipping 4. 

Simply hardcoding the inputs in BeamUnionRel was also not possible beyond 3 inputs.

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76]

 

Wherever I looked, most of the operations are done on 2 or 3 operands only.

Only in CalciteQueryPlanner can I find everything in one place:

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169]

Is there a place where you can access beamRelNode, or a similar structure 
where all the inputs are present, so that they can be manipulated? CalciteQueryPlanner 
doesn't seem like a good place to do this.

> Merge multiple input to one BeamUnionRel
> 
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: sridhar Reddy
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c` 
> has to be created as UNION(a, UNION(b, c)), which requires two shuffles. If 
> BeamUnionRel could handle multiple inputs, only one shuffle would be needed.
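
As a rough illustration of the request above, the following sketch collapses UNION(a, UNION(b, c)) into a single n-ary union. It is plain Python, not Beam or Calcite code: `Scan`, `Union`, `is_all`, and `flatten_union` are hypothetical names introduced only for this example. The guard on the ALL flag (a UNION ALL must not absorb a DISTINCT child) is an assumption about which merges preserve semantics, not a statement of what UnionMergeRule does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    """Leaf relation, identified by name (hypothetical stand-in for a table)."""
    name: str


@dataclass(frozen=True)
class Union:
    """N-ary union node; is_all=True models UNION ALL, False models UNION."""
    inputs: tuple
    is_all: bool = False


def flatten_union(node):
    """Recursively pull the inputs of nested unions up into their parent."""
    if isinstance(node, Scan):
        return node
    flat = []
    for child in node.inputs:
        child = flatten_union(child)
        # A DISTINCT parent can absorb any child union. An ALL parent can
        # only absorb ALL children, otherwise duplicates would reappear.
        mergeable = isinstance(child, Union) and not (node.is_all and not child.is_all)
        if mergeable:
            flat.extend(child.inputs)
        else:
            flat.append(child)
    return Union(tuple(flat), node.is_all)


nested = Union((Scan("a"), Union((Scan("b"), Scan("c")))))
merged = flatten_union(nested)
print(len(merged.inputs))
```

With all three inputs on one node, a single grouping step (one shuffle) is enough to deduplicate them, which is the saving the issue describes.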





[jira] [Commented] (BEAM-7049) Merge multiple input to one BeamUnionRel

2019-08-29 Thread sridhar Reddy (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919139#comment-16919139
 ] 

sridhar Reddy commented on BEAM-7049:
-

After several attempts to generalize union with multiple operands, I am unable 
to locate a place where this can be done. Any attempt to pull inputs from lower 
operands to upper operands actually limits output. I am not sure I am looking 
in the right places.

[~amaliujia] Can you please confirm that UnionMergeRule (or a variation of it) is 
the right place to make these changes?

[https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128]

I tried several changes at the above location, but without any luck. 
In fact, any change made there limits the union result. For example, select 1 union 2 
union 3 union 4 results in [1,2,3], skipping 4. 

Simply hardcoding the inputs in BeamUnionRel was also not possible beyond 3 inputs.

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76]

 

Wherever I looked, most of the operations are done on 2 or 3 operands only.

Only in CalciteQueryPlanner can I find everything in one place:

[https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169]

Is there a place where you can access beamRelNode, or a similar structure 
where all the inputs are present, so that they can be manipulated? CalciteQueryPlanner 
doesn't seem like a good place to do this.

> Merge multiple input to one BeamUnionRel
> 
>
> Key: BEAM-7049
> URL: https://issues.apache.org/jira/browse/BEAM-7049
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: sridhar Reddy
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> BeamUnionRel assumes exactly two inputs and rejects more. So `a UNION b UNION c` 
> has to be created as UNION(a, UNION(b, c)), which requires two shuffles. If 
> BeamUnionRel could handle multiple inputs, only one shuffle would be needed.





[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304007
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 02:32
Start Date: 30/Aug/19 02:32
Worklog Time Spent: 10m 
  Work Description: ibzib commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526431817
 
 
   @tweise I created https://issues.apache.org/jira/browse/BEAM-8115 for 
overwriting pipeline options.
   For an example usage, check out the integration test I wrote for the 
subsequent draft PR. 
https://github.com/apache/beam/blob/725116bfdcc7ff364456be016cb93dc7644ffb02/runners/flink/job-server/pipeline_jar_test.sh
 



Issue Time Tracking
---

Worklog Id: (was: 304007)
Time Spent: 1.5h  (was: 1h 20m)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*





[jira] [Created] (BEAM-8115) Overwrite portable Flink application jar pipeline options at runtime

2019-08-29 Thread Kyle Weaver (Jira)
Kyle Weaver created BEAM-8115:
-

 Summary: Overwrite portable Flink application jar pipeline options 
at runtime
 Key: BEAM-8115
 URL: https://issues.apache.org/jira/browse/BEAM-8115
 Project: Beam
  Issue Type: New Feature
  Components: runner-flink
Reporter: Kyle Weaver
Assignee: Kyle Weaver


In the first iteration of portable Flink application jars, all pipeline options 
are set at job creation time and cannot be later modified at runtime. There 
should be a way to pass arguments to the jar to write/overwrite pipeline 
options.
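
One way to picture the requested behaviour is a merge step that starts from the options stored at job creation time and lets arguments passed to the jar override them. The sketch below is only an illustration under assumptions: `merge_pipeline_options`, the `--name=value` argument form, and the option names used in the example are hypothetical and not taken from the Flink runner.

```python
def merge_pipeline_options(baked_options, runtime_args):
    """Return baked_options overridden by --name=value runtime arguments.

    baked_options: dict of options fixed when the jar was created.
    runtime_args: list of strings such as "--parallelism=4" (assumed format).
    """
    merged = dict(baked_options)  # copy so the baked options stay untouched
    for arg in runtime_args:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --name=value, got {arg!r}")
        name, value = arg[2:].split("=", 1)
        merged[name] = value  # write a new option or overwrite an existing one
    return merged


baked = {"job_name": "wordcount", "parallelism": "1"}
effective = merge_pipeline_options(baked, ["--parallelism=4", "--checkpoint_dir=/data/ckpt"])
print(effective)
```

Copying the baked options first keeps the values stored in the jar available as defaults for every run.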





[jira] [Work logged] (BEAM-7966) Write portable Flink application jar

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304006=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304006
 ]

ASF GitHub Bot logged work on BEAM-7966:


Author: ASF GitHub Bot
Created on: 30/Aug/19 02:23
Start Date: 30/Aug/19 02:23
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #9331: [BEAM-7966] Write 
portable Flink application jar
URL: https://github.com/apache/beam/pull/9331#issuecomment-526430047
 
 
   It would be good to capture the pipeline option modification at runtime in a 
follow-up JIRA.
   
   Do you have an example of how this jar-producing runner would be executed from 
the Python side? I'm wondering how to integrate this into an automated build. An 
example command line that also shows how the runner/job-server jar dependency 
is specified would be helpful. 
   
 



Issue Time Tracking
---

Worklog Id: (was: 304006)
Time Spent: 1h 20m  (was: 1h 10m)

> Write portable Flink application jar
> 
>
> Key: BEAM-7966
> URL: https://issues.apache.org/jira/browse/BEAM-7966
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Kyle Weaver
>Assignee: Kyle Weaver
>Priority: Major
>  Labels: portability-flink
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]*





[jira] [Updated] (BEAM-7952) Make the input queue of the input buffer in Python SDK Harness size limited.

2019-08-29 Thread sunjincheng (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng updated BEAM-7952:
--
Fix Version/s: (was: 2.16.0)
   2.17.0

> Make the input queue of the input buffer in Python SDK Harness size limited.
> 
>
> Key: BEAM-7952
> URL: https://issues.apache.org/jira/browse/BEAM-7952
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Priority: Major
> Fix For: 2.17.0
>
>
> In the Python SDK harness, the input queue of the input buffer is neither 
> size limited nor configurable. This may become a problem if the data 
> production rate exceeds the data consumption rate.
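
The effect of a size limit can be sketched with the standard library's `queue.Queue`, whose `put` blocks once `maxsize` items are waiting, so a fast producer is slowed to the pace of the consumer. This is a minimal sketch, not the SDK harness code; `run_bounded_pipeline` and its parameters are hypothetical names.

```python
import queue
import threading


def run_bounded_pipeline(items, max_queue_size):
    """Send items through a size-limited queue.

    Returns the consumed items and the largest queue size observed.
    """
    q = queue.Queue(maxsize=max_queue_size)
    done = object()  # sentinel that marks the end of the input
    consumed = []
    peak = 0

    def producer():
        for item in items:
            q.put(item)  # blocks while the queue already holds maxsize items
        q.put(done)

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        peak = max(peak, q.qsize())
        item = q.get()
        if item is done:
            break
        consumed.append(item)
    worker.join()
    return consumed, peak


consumed, peak = run_bounded_pipeline(list(range(100)), max_queue_size=4)
print(len(consumed), peak <= 4)
```

Making `max_queue_size` a parameter is the configurable part; with `maxsize=0` the queue would be unbounded again, which is the situation the issue describes.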





[jira] [Updated] (BEAM-7950) Remove the Python 3 warning as it has already been supported

2019-08-29 Thread sunjincheng (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng updated BEAM-7950:
--
Fix Version/s: (was: 2.16.0)
   2.17.0

> Remove the Python 3 warning as it has already been supported
> 
>
> Key: BEAM-7950
> URL: https://issues.apache.org/jira/browse/BEAM-7950
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.17.0
>
>
> There are warnings that Python 3 is not fully supported in Beam 
> (beam/sdks/python/setup.py). As mentioned on the mailing list, we should remove the 
> Python 3 warning, since Python 3 is already supported as a result of 
> https://issues.apache.org/jira/browse/BEAM-1251.





[jira] [Updated] (BEAM-7949) Add time-based cache threshold support in the data service of the Python SDK harness

2019-08-29 Thread sunjincheng (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng updated BEAM-7949:
--
Fix Version/s: (was: 2.16.0)
   2.17.0

> Add time-based cache threshold support in the data service of the Python SDK 
> harness
> 
>
> Key: BEAM-7949
> URL: https://issues.apache.org/jira/browse/BEAM-7949
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-harness
>Reporter: sunjincheng
>Priority: Major
> Fix For: 2.17.0
>
>
> Currently only a size-based cache threshold is supported in the data service of 
> the Python SDK harness. It should also support a time-based cache threshold. 
> This is very important, especially for streaming jobs, which are sensitive to 
> delay. 
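
A combined size and time threshold can be sketched as a buffer that flushes when either limit is reached. The code below is an illustration under assumptions, not the data service implementation: `TimeAndSizeBuffer`, `flush_fn`, and the limit parameters are hypothetical names, and the clock is injectable only to make the behaviour easy to check. A real implementation would presumably also need a background timer, because this sketch only checks the age of the buffer when new data arrives.

```python
import time


class TimeAndSizeBuffer:
    """Buffer bytes and flush when a size or an age threshold is reached."""

    def __init__(self, flush_fn, size_limit_bytes, time_limit_s, clock=time.monotonic):
        self._flush_fn = flush_fn
        self._size_limit = size_limit_bytes
        self._time_limit = time_limit_s
        self._clock = clock
        self._chunks = []
        self._size = 0
        self._oldest = None  # arrival time of the oldest buffered chunk

    def add(self, data):
        if self._oldest is None:
            self._oldest = self._clock()
        self._chunks.append(data)
        self._size += len(data)
        too_big = self._size >= self._size_limit
        too_old = self._clock() - self._oldest >= self._time_limit
        if too_big or too_old:
            self.flush()

    def flush(self):
        if self._chunks:
            self._flush_fn(b"".join(self._chunks))
        self._chunks, self._size, self._oldest = [], 0, None


now = [0.0]  # fake clock so the example does not have to sleep
sent = []
buf = TimeAndSizeBuffer(sent.append, size_limit_bytes=10, time_limit_s=1.0, clock=lambda: now[0])
buf.add(b"abc")          # below both thresholds, stays buffered
now[0] = 1.5
buf.add(b"de")           # oldest chunk is 1.5 s old, time threshold fires
buf.add(b"0123456789")   # 10 bytes, size threshold fires
print(sent)
```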





[jira] [Updated] (BEAM-7948) Add time-based cache threshold support in the Java data service

2019-08-29 Thread sunjincheng (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sunjincheng updated BEAM-7948:
--
Fix Version/s: (was: 2.16.0)
   2.17.0

> Add time-based cache threshold support in the Java data service
> ---
>
> Key: BEAM-7948
> URL: https://issues.apache.org/jira/browse/BEAM-7948
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution
>Reporter: sunjincheng
>Priority: Major
> Fix For: 2.17.0
>
>
> Currently only a size-based cache threshold is supported in the data service. It 
> should also support a time-based cache threshold. This is very important, 
> especially for streaming jobs, which are sensitive to delay.





[jira] [Commented] (BEAM-8034) Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems

2019-08-29 Thread sunjincheng (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919079#comment-16919079
 ] 

sunjincheng commented on BEAM-8034:
---

Thanks for reporting this issue and sharing your analysis, [~mxm]! I agree with 
you that this is caused by FLINK-13367.

[~markflyhigh] has already brought up the PROPOSAL for the Beam 2.16 release, and 
the Beam 2.16 release branch cut is scheduled for Sep 11.

I'm not sure whether the Flink 1.8.2 release can catch up with the Beam 2.16 
release (Sep 11) if I bring up the discussion today. Even if this ticket cannot 
be merged for 2.16, we still need to push for the Flink 1.8.2 release as soon as 
possible, so that we can upgrade the flink-runner version to Flink 1.8.2 and 
merge this in the Beam 2.17 release at the latest.

What do you think?

> Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems
> 
>
> Key: BEAM-8034
> URL: https://issues.apache.org/jira/browse/BEAM-8034
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Priority: Major
>
> The Flink Runner should be upgraded to 1.8.1. Users have reported 
> serialization problems related to Avro's Schema:
> {noformat}
> Caused by: java.io.NotSerializableException: org.apache.avro.Schema$Field
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at java.util.ArrayList.writeObject(ArrayList.java:766)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1140)
> at 
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
> at 
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> at 
> org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:576)
> at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:122)
> ... 45 more
> {noformat}





[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303982
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 30/Aug/19 00:46
Start Date: 30/Aug/19 00:46
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
Python3 docker containers
URL: https://github.com/apache/beam/pull/9351#issuecomment-526411707
 
 
   Run Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303982)
Time Spent: 9h 20m  (was: 9h 10m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303963=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303963
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 30/Aug/19 00:06
Start Date: 30/Aug/19 00:06
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
Python3 docker containers
URL: https://github.com/apache/beam/pull/9351#issuecomment-526404789
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303963)
Time Spent: 9h 10m  (was: 9h)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303956=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303956
 ]

ASF GitHub Bot logged work on BEAM-7909:


Author: ASF GitHub Bot
Created on: 29/Aug/19 23:53
Start Date: 29/Aug/19 23:53
Worklog Time Spent: 10m 
  Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] 
Python3 docker containers
URL: https://github.com/apache/beam/pull/9351#issuecomment-526402467
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303956)
Time Spent: 9h  (was: 8h 50m)

> Write integration tests to test customized containers
> -
>
> Key: BEAM-7909
> URL: https://issues.apache.org/jira/browse/BEAM-7909
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Hannah Jiang
>Assignee: Hannah Jiang
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303951
 ]

ASF GitHub Bot logged work on BEAM-7945:


Author: ASF GitHub Bot
Created on: 29/Aug/19 23:34
Start Date: 29/Aug/19 23:34
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] 
Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452#issuecomment-526399094
 
 
   I would appreciate it if you have time to look at the changes @robertwb @mxm :)
 



Issue Time Tracking
---

Worklog Id: (was: 303951)
Time Spent: 20m  (was: 10m)

> Allow runner to configure "semi_persist_dir" which is used in the SDK harness
> -
>
> Key: BEAM-7945
> URL: https://issues.apache.org/jira/browse/BEAM-7945
> Project: Beam
>  Issue Type: Sub-task
>  Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core
>Reporter: sunjincheng
>Assignee: sunjincheng
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently "semi_persist_dir" is not configurable. This may become a problem 
> in certain scenarios. For example, the default value of "semi_persist_dir" is 
> "/tmp" 
> ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48])
>  in the Python SDK harness. When the environment type is "PROCESS", the disk 
> holding "/tmp" may fill up and cause unexpected issues in a production 
> environment. We should provide a way to configure "semi_persist_dir" in 
> EnvironmentFactory on the runner side. 





[jira] [Work logged] (BEAM-7520) DirectRunner timers are not strictly time ordered

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7520?focusedWorklogId=303862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303862
 ]

ASF GitHub Bot logged work on BEAM-7520:


Author: ASF GitHub Bot
Created on: 29/Aug/19 19:16
Start Date: 29/Aug/19 19:16
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9190: [BEAM-7520] Fix timer 
firing order in DirectRunner
URL: https://github.com/apache/beam/pull/9190#issuecomment-526323662
 
 
   Run Dataflow ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 303862)
Time Spent: 9h  (was: 8h 50m)

> DirectRunner timers are not strictly time ordered
> -
>
> Key: BEAM-7520
> URL: https://issues.apache.org/jira/browse/BEAM-7520
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.13.0
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 9h
>  Remaining Estimate: 0h
>
> Let's suppose we have the following situation:
>  - stateful ParDo with two timers - timerA and timerB
>  - timerA is set for window.maxTimestamp() + 1
>  - timerB is set anywhere between  timerB.timestamp
>  - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE
> Then the order of timers is as follows (correct):
>  - timerB
>  - timerA
> But, if timerB sets another timer (say for timerB.timestamp + 1), then the 
> order of timers will be:
>  - timerB (timerB.timestamp)
>  - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE)
>  - timerB (timerB.timestamp + 1)
> Which is not ordered by timestamp. The reason for this is that when the input 
> watermark update is evaluated, WatermarkManager.extractFiredTimers() will 
> produce both timerA and timerB. That would be correct, but when timerB sets 
> another timer, the new timer fires only after timerA, which breaks the ordering.
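
The difference between the two firing orders can be reproduced with a small model. This is only a sketch of the ordering problem, not DirectRunner code and not the fix in the pull request: `fire_batch`, `fire_ordered`, `on_fire`, and the numeric timestamps are hypothetical. `fire_batch` extracts every eligible timer in one step, as described above, while `fire_ordered` re-checks a priority queue after each firing so that newly set timers are placed in timestamp order.

```python
import heapq


def fire_batch(timers, watermark, on_fire):
    """Extract all eligible timers at once, fire them, then repeat."""
    fired = []
    pending = list(timers)
    while True:
        eligible = sorted(t for t in pending if t[0] <= watermark)
        if not eligible:
            return fired
        pending = [t for t in pending if t[0] > watermark]
        for ts, name in eligible:
            fired.append((ts, name))
            pending.extend(on_fire(ts, name))  # new timers wait for the next round


def fire_ordered(timers, watermark, on_fire):
    """Fire one timer at a time, always the one with the smallest timestamp."""
    heap = list(timers)
    heapq.heapify(heap)
    fired = []
    while heap and heap[0][0] <= watermark:
        ts, name = heapq.heappop(heap)
        fired.append((ts, name))
        for new_timer in on_fire(ts, name):
            heapq.heappush(heap, new_timer)
    return fired


def on_fire(ts, name):
    # timerB re-sets itself once, for its own timestamp + 1
    return [(ts + 1, "timerB")] if (name == "timerB" and ts == 10) else []


timers = [(100, "timerA"), (10, "timerB")]
batch_order = fire_batch(timers, watermark=1000, on_fire=on_fire)
strict_order = fire_ordered(timers, watermark=1000, on_fire=on_fire)
print(batch_order)
print(strict_order)
```

In this model the batch variant fires timerB, timerA, and then the re-set timerB, whereas the heap-based variant fires both timerB instances before timerA.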





[jira] [Work logged] (BEAM-6114) SQL join selection should be done in planner, not in expansion to PTransform

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6114?focusedWorklogId=303841=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303841
 ]

ASF GitHub Bot logged work on BEAM-6114:


Author: ASF GitHub Bot
Created on: 29/Aug/19 18:30
Start Date: 29/Aug/19 18:30
Worklog Time Spent: 10m 
  Work Description: rahul8383 commented on issue #9453: [BEAM-6114] Handle 
Unsupported Lookup Joins in BeamSql
URL: https://github.com/apache/beam/pull/9453#issuecomment-526307374
 
 
   R: @amaliujia 
 



Issue Time Tracking
---

Worklog Id: (was: 303841)
Time Spent: 3h 50m  (was: 3h 40m)

> SQL join selection should be done in planner, not in expansion to PTransform
> 
>
> Key: BEAM-6114
> URL: https://issues.apache.org/jira/browse/BEAM-6114
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Kenneth Knowles
>Assignee: Rahul Patwari
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Currently Beam SQL joins all go through a single physical operator which has 
> a single PTransform that does all join algorithms based on properties of its 
> input PCollections as well as the relational algebra.
> A first step is to make the needed information part of the relational 
> algebra, so it can choose a PTransform based on that, and the PTransforms can 
> be simpler.
> A second step is to have separate (physical) relational operators for different 
> join algorithms.





[jira] [Work logged] (BEAM-6114) SQL join selection should be done in planner, not in expansion to PTransform

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-6114?focusedWorklogId=303839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303839
 ]

ASF GitHub Bot logged work on BEAM-6114:


Author: ASF GitHub Bot
Created on: 29/Aug/19 18:25
Start Date: 29/Aug/19 18:25
Worklog Time Spent: 10m 
  Work Description: rahul8383 commented on pull request #9453: [BEAM-6114] 
Handle Unsupported Lookup Joins in BeamSql
URL: https://github.com/apache/beam/pull/9453
 
 
   - Added code in BeamSideInputLookupJoinRel which throws 
UnsupportedOperationException if "OUTER JOINS" are applied when the OUTER side 
is a Seekable Table.
   - Added Unit Tests for BeamSideInputLookupJoinRel
   - Modified and added Javadoc for Join Rels.
   
   
   

[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-29 Thread Hannah Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918758#comment-16918758
 ] 

Hannah Jiang commented on BEAM-7993:


I am not sure if this is the root cause, because it doesn’t answer the 
following two questions.
1. Why doesn’t it happen locally?
2. Why does only py3 fail, but not py2?

I will try running the portable tests without parallelism later today to see 
if the problem is solved. 

> portable python precommit is flaky
> --
>
> Key: BEAM-7993
> URL: https://issues.apache.org/jira/browse/BEAM-7993
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core, test-failures, testing
>Affects Versions: 2.15.0
>Reporter: Udi Meiri
>Assignee: Mark Liu
>Priority: Major
>  Labels: currently-failing
> Fix For: 2.16.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I'm not sure what the root cause is here.
> Example log where 
> :sdks:python:test-suites:portable:py35:portableWordCountBatch failed:
> {code}
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap 
> (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at 
> ExtractOutput[0]) (1/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)
> 11:51:22 [CHAIN MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR 
> org.apache.flink.runtime.operators.BatchTask - Error in task code:  CHAIN 
> MapPartition (MapPartition at 
> [2]write/Write/WriteImpl/DoOnce/{FlatMap(), 
> Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)
> 11:51:22 java.lang.Exception: The user defined 'open()' method caused an 
> exception: java.io.IOException: Received exit code 1 for command 'docker 
> inspect -f {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
> 11:51:22  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
> 11:51:22  at java.lang.Thread.run(Thread.java:748)
> 11:51:22 Caused by: 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException:
>  java.io.IOException: Received exit code 1 for command 'docker inspect -f 
> {{.State.Running}} 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: 
> Error: No such object: 
> 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1
> 11:51:22  at 
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202)
> 11:51:22  at 
> org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203)
> 11:51:22  at 
> org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129)
> 11:51:22  at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
> 11:51:22  at 
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494)
> 11:51:22  ... 

[jira] [Commented] (BEAM-8096) Allow runner to configure "subnetwork"

2019-08-29 Thread Daniel Oliveira (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918769#comment-16918769
 ] 

Daniel Oliveira commented on BEAM-8096:
---

Just an update: I took a cursory look through the bug & PR, but I've been busy 
and probably won't have a chance to give this a real review until tomorrow.

> Allow runner to configure "subnetwork"
> --
>
> Key: BEAM-8096
> URL: https://issues.apache.org/jira/browse/BEAM-8096
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Affects Versions: 2.15.0
>Reporter: Jack Whelpton
>Assignee: Robert Burke
>Priority: Major
>
> When running a Dataflow job, the network can be specified using the --network 
> flag; however, there is no support for doing the same for the subnetwork. 
> This would be the Go equivalent of the following Java code:
> [https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/options/DataflowPipelineWorkerPoolOptions.html#getSubnetwork--|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L151]





[jira] [Commented] (BEAM-7993) portable python precommit is flaky

2019-08-29 Thread Mark Liu (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918727#comment-16918727
 ] 

Mark Liu commented on BEAM-7993:


Looks like the same issue as BEAM-7527. It happens if multiple 'python 
setup.py' invocations run concurrently under the same project. In the portable 
precommit, py2 and py3 tests are triggered at the same time. It's likely that 
part of the tests calls setuptools and the runs unfortunately conflict with 
each other.
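
One general way to keep concurrent invocations from stepping on each other is to serialize them behind a file lock. The following is a minimal sketch of that idea, assuming a Unix host; it is not what the Beam build does, and the exclusive_lock helper and the lock file path are hypothetical.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A caller would wrap the conflicting step, for example the 'python setup.py' invocation, in `with exclusive_lock("/tmp/setup.lock"):` so that only one run executes it at a time. The lock is advisory, so it only helps if every participant takes it.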


[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303774
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:48
Start Date: 29/Aug/19 15:48
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add 
Chicago Taxi Example running on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526247150
 
 
   Run Chicago Taxi on Flink
 



Issue Time Tracking
---

Worklog Id: (was: 303774)
Time Spent: 1h 40m  (was: 1.5h)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303773
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:47
Start Date: 29/Aug/19 15:47
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526246737
 
 
   @mxm I think this can be merged. My intention was to enable the auto-staging 
feature to work on JDK >= 9 (if the user supplied a URLClassLoader as the 
context classloader). Unfortunately, the local Flink runner does not 
use the context classloader (not sure why, still looking into that), but this 
PR seems to me to be valid on its own. If the user creates some context 
classloader before submitting the pipeline, we can do our best to retrieve as 
many dependencies as possible.
 



Issue Time Tracking
---

Worklog Id: (was: 303773)
Time Spent: 1h  (was: 50m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303765
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:42
Start Date: 29/Aug/19 15:42
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526244675
 
 
   Run Portable_Python PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303765)
Time Spent: 50m  (was: 40m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gajowy updated BEAM-8114:

Status: Open  (was: Triage Needed)

> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> {code:java}
> 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
> 16:48:57   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics
> 16:48:57 min_value = min(min_values)
> 16:48:57 ValueError: min() arg is an empty sequence
> {code}
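
The ValueError in the traceback above is what Python's built-in min() raises when it receives an empty sequence and no default. A minimal sketch of guarding against that case follows; the helper name runtime_span and its return convention are hypothetical and do not reflect the actual code in load_test_metrics_utils.py.

```python
def runtime_span(min_values, max_values):
    """Hypothetical helper: return max - min, or None when nothing was reported."""
    # default= (Python 3.4+) avoids ValueError when the sequence is empty.
    min_value = min(min_values, default=None)
    max_value = max(max_values, default=None)
    if min_value is None or max_value is None:
        return None
    return max_value - min_value
```

Whether returning None, skipping publication, or failing loudly is the right reaction to missing metrics is a design choice for the test owner; the sketch only shows how to avoid the unhandled exception.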





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski updated BEAM-8114:
---
Description: 
An exception is being raised when publishing metrics:

 
{code:java}
16:48:57 value = self._prepare_runtime_metrics(runtime_list)
16:48:57   File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
16:48:57 min_value = min(min_values)
16:48:57 ValueError: min() arg is an empty sequence
{code}

  was:
An exception is being raised when publishing metrics:

 
{code:java}
// code 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
16:48:57   File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
16:48:57 min_value = min(min_values)
16:48:57 ValueError: min() arg is an empty sequence
{code}


> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> {code:java}
> 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
> 16:48:57   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics
> 16:48:57 min_value = min(min_values)
> 16:48:57 ValueError: min() arg is an empty sequence
> {code}





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski updated BEAM-8114:
---
Description: 
An exception is being raised when publishing metrics:

 
{code:java}
// code 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
16:48:57   File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
16:48:57 min_value = min(min_values)
16:48:57 ValueError: min() arg is an empty sequence
{code}

  was:
An exception is being raised when publishing metrics:

 
*16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57*   
File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics*16:48:57* min_value = 
min(min_values)*16:48:57* ValueError: min() arg is an empty sequence


> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> {code:java}
> // code 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
> 16:48:57   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics
> 16:48:57 min_value = min(min_values)
> 16:48:57 ValueError: min() arg is an empty sequence
> {code}





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski updated BEAM-8114:
---
Description: 
An exception is being raised when publishing metrics:

 

16:48:57 value = self._prepare_runtime_metrics(runtime_list)
 16:48:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
 16:48:57 min_value = min(min_values)
 16:48:57 ValueError: min() arg is an empty sequence

  was:
An exception is being raised when publishing metrics:

 

```

16:48:57 value = self._prepare_runtime_metrics(runtime_list)
16:48:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
16:48:57 min_value = min(min_values)
16:48:57 ValueError: min() arg is an empty sequence

```


> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
>  16:48:57 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics
>  16:48:57 min_value = min(min_values)
>  16:48:57 ValueError: min() arg is an empty sequence





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski updated BEAM-8114:
---
Description: 
An exception is being raised when publishing metrics:

 

```

16:48:57 value = self._prepare_runtime_metrics(runtime_list)
16:48:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
16:48:57 min_value = min(min_values)
16:48:57 ValueError: min() arg is an empty sequence

```

  was:
An exception is being raised when publishing metrics:
*16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57*   
File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics*16:48:57* min_value = 
min(min_values)*16:48:57* ValueError: min() arg is an empty sequence


> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> ```
> 16:48:57 value = self._prepare_runtime_metrics(runtime_list)
> 16:48:57 File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics
> 16:48:57 min_value = min(min_values)
> 16:48:57 ValueError: min() arg is an empty sequence
> ```





[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamil Wasilewski updated BEAM-8114:
---
Description: 
An exception is being raised when publishing metrics:

 
*16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57*   
File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics*16:48:57* min_value = 
min(min_values)*16:48:57* ValueError: min() arg is an empty sequence

  was:
An exception is being raised when publishing metrics:

 

16:48:57 value = self._prepare_runtime_metrics(runtime_list)
 16:48:57 File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
 16:48:57 min_value = min(min_values)
 16:48:57 ValueError: min() arg is an empty sequence


> Chicago Taxi on Dataflow is failing
> ---
>
> Key: BEAM-8114
> URL: https://issues.apache.org/jira/browse/BEAM-8114
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Kamil Wasilewski
>Priority: Major
>
> An exception is being raised when publishing metrics:
>  
> *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57*  
>  File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
>  line 317, in _prepare_runtime_metrics*16:48:57* min_value = 
> min(min_values)*16:48:57* ValueError: min() arg is an empty sequence





[jira] [Created] (BEAM-8114) Chicago Taxi on Dataflow is failing

2019-08-29 Thread Kamil Wasilewski (Jira)
Kamil Wasilewski created BEAM-8114:
--

 Summary: Chicago Taxi on Dataflow is failing
 Key: BEAM-8114
 URL: https://issues.apache.org/jira/browse/BEAM-8114
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Kamil Wasilewski


An exception is being raised when publishing metrics:
*16:48:57* value = self._prepare_runtime_metrics(runtime_list)
*16:48:57* File 
"/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py",
 line 317, in _prepare_runtime_metrics
*16:48:57* min_value = min(min_values)
*16:48:57* ValueError: min() arg is an empty sequence
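
The traceback shows `min()` being called on an empty sequence. A minimal sketch of one way to guard against that is below; the helper name `min_or_none` is hypothetical and is not taken from `load_test_metrics_utils.py`, and returning `None` for "no data" is an assumption about what the caller would want.

```python
def min_or_none(values):
  """Return the smallest item, or None when there is nothing to compare.

  Hypothetical helper for illustration only. An explicit emptiness check
  is used instead of min(..., default=...) so the same code also runs on
  Python 2, where the `default` keyword is not available.
  """
  values = list(values)
  if not values:
    return None
  return min(values)
```

With a guard like this, the caller can decide whether to skip publishing or to report that no runtime metrics were collected, instead of failing with `ValueError`.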





[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303753
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:06
Start Date: 29/Aug/19 15:06
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add 
Chicago Taxi Example running on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526228045
 
 
   Run Chicago Taxi on Flink
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 303753)
Time Spent: 1.5h  (was: 1h 20m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=303752=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303752
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:04
Start Date: 29/Aug/19 15:04
Worklog Time Spent: 10m 
  Work Description: aaltay commented on pull request #9433: [BEAM-7389] 
Update to use util.ToString transform
URL: https://github.com/apache/beam/pull/9433#discussion_r319121548
 
 

 ##
 File path: 
sdks/python/apache_beam/examples/snippets/transforms/element_wise/to_string_test.py
 ##
 @@ -19,36 +19,81 @@
 from __future__ import absolute_import
 from __future__ import print_function
 
+import sys
 import unittest
 
 import mock
 
-from apache_beam.examples.snippets.transforms.element_wise.to_string import *
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.testing.util import assert_that
 from apache_beam.testing.util import equal_to
 
+from . import to_string
+
+
+def check_plants(actual):
+  # [START plants]
+  plants = [
+  ',Strawberry',
+  '凌,Carrot',
+  ',Eggplant',
+  ',Tomato',
+  '凜,Potato',
+  ]
+  # [END plants]
+  assert_that(actual, equal_to(plants))
+
+
+def check_plant_lists(actual):
+  # [START plant_lists]
+  plant_lists = [
+  "['', 'Strawberry', 'perennial']",
+  "['凌', 'Carrot', 'biennial']",
+  "['', 'Eggplant', 'perennial']",
+  "['', 'Tomato', 'annual']",
+  "['凜', 'Potato', 'perennial']",
+  ]
+  # [END plant_lists]
+
+  # Some unicode characters become escaped with double backslashes.
+  import apache_beam as beam
+
+  def normalize_escaping(elem):
 
 Review comment:
   Could you add a JIRA todo here so that we can remove it after python 2 
deprecation?
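
 A sketch of what such a TODO might look like. The diff excerpt above cuts off before the body of `normalize_escaping`, so the body here is an assumption made for illustration, and `BEAM-XXXX` is a placeholder rather than a real issue key.

```python
import sys


def normalize_escaping(elem):
  # TODO(BEAM-XXXX): Remove this branch after Python 2 deprecation.
  # "BEAM-XXXX" is a placeholder, not an actual JIRA issue.
  if sys.version_info.major == 2:
    # Python 2 only: undo the backslash escaping in the byte string.
    return elem.decode('string-escape')
  # Python 3: strings are already unescaped, so return them unchanged.
  return elem
```

 Keeping the version check and the TODO next to each other makes the workaround easy to find and delete once Python 2 support is dropped.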
 



Issue Time Tracking
---

Worklog Id: (was: 303752)
Time Spent: 53h  (was: 52h 50m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 53h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=303751=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303751
 ]

ASF GitHub Bot logged work on BEAM-7389:


Author: ASF GitHub Bot
Created on: 29/Aug/19 15:02
Start Date: 29/Aug/19 15:02
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #9262: [BEAM-7389] Add code 
examples for Regex page
URL: https://github.com/apache/beam/pull/9262#issuecomment-526226609
 
 
   I see an extra space in the example data:
   `',   Strawberry,   perennial',`
   
   Do you know why? (It is between the strawberry icon and the Strawberry text.)
 



Issue Time Tracking
---

Worklog Id: (was: 303751)
Time Spent: 52h 50m  (was: 52h 40m)

> Colab examples for element-wise transforms (Python)
> ---
>
> Key: BEAM-7389
> URL: https://issues.apache.org/jira/browse/BEAM-7389
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Rose Nguyen
>Assignee: David Cavazos
>Priority: Minor
>  Time Spent: 52h 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303741
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 14:44
Start Date: 29/Aug/19 14:44
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add 
Chicago Taxi Example running on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526218614
 
 
   Run Chicago Taxi on Flink
 



Issue Time Tracking
---

Worklog Id: (was: 303741)
Time Spent: 1h 20m  (was: 1h 10m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303711
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 13:40
Start Date: 29/Aug/19 13:40
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526189897
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 303711)
Time Spent: 1h 10m  (was: 1h)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303712=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303712
 ]

ASF GitHub Bot logged work on BEAM-8003:


Author: ASF GitHub Bot
Created on: 29/Aug/19 13:40
Start Date: 29/Aug/19 13:40
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove 
Perfkit leftovers
URL: https://github.com/apache/beam/pull/9450#issuecomment-526190065
 
 
   Run Java TextIO Performance Test HDFS
 



Issue Time Tracking
---

Worklog Id: (was: 303712)
Time Spent: 0.5h  (was: 20m)

> Remove all mentions of PKB on Confluence / website docs
> ---
>
> Key: BEAM-8003
> URL: https://issues.apache.org/jira/browse/BEAM-8003
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing, website
>Reporter: Lukasz Gajowy
>Assignee: Lukasz Gajowy
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303706
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 29/Aug/19 13:33
Start Date: 29/Aug/19 13:33
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9374: [BEAM-5428] Implement 
Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374#issuecomment-526128312
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303706)
Time Spent: 10h 40m  (was: 10.5h)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-harness
>Reporter: Robert Bradshaw
>Assignee: Rakesh Kumar
>Priority: Major
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303692
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 13:14
Start Date: 29/Aug/19 13:14
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526179284
 
 
   Still debugging this locally. The fix seems not to be doing what I was 
expecting.
 



Issue Time Tracking
---

Worklog Id: (was: 303692)
Time Spent: 40m  (was: 0.5h)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303668=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303668
 ]

ASF GitHub Bot logged work on BEAM-7829:


Author: ASF GitHub Bot
Created on: 29/Aug/19 12:45
Start Date: 29/Aug/19 12:45
Worklog Time Spent: 10m 
  Work Description: kanterov commented on pull request #9247: [BEAM-7829] 
Add schema names when converting with AvroUtils.toAvroSchema
URL: https://github.com/apache/beam/pull/9247
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 303668)
Time Spent: 2h 40m  (was: 2.5h)

> AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
> --
>
> Key: BEAM-7829
> URL: https://issues.apache.org/jira/browse/BEAM-7829
> Project: Beam
>  Issue Type: Test
>  Components: io-java-avro, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ryan Skraba
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> While trying to use an Avro PCollection with the SQL transform, I noticed 
> that a bijective transform cannot be done correctly: PCollection -> 
> SQL -> PCollection -> ParDo -> PCollection. Some of the Avro metadata gets 
> lost, in particular the name of the Avro Schema. This is important because 
> Avro validates that the schema has a name, and if it does not, it fails with 
> a SchemaParseException.
> {quote}
> org.apache.avro.SchemaParseException: Illegal character in: EXPR$1
>     at org.apache.avro.Schema.validateName (Schema.java:1151)
>     at org.apache.avro.Schema.access$200 (Schema.java:81)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:403)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:423)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:415){quote}
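
The `Illegal character in: EXPR$1` message in the trace above is consistent with the naming rule in the Avro specification: a name must start with a letter or underscore and may contain only letters, digits, and underscores after that. A minimal sketch of that rule follows; the function `is_valid_avro_name` is hypothetical and is not part of Avro or Beam.

```python
import re

# Name rule from the Avro specification: [A-Za-z_][A-Za-z0-9_]*
_AVRO_NAME = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def is_valid_avro_name(name):
  """Return True if `name` satisfies the Avro name rule (illustrative only)."""
  return _AVRO_NAME.fullmatch(name) is not None
```

Under this rule `EXPR$1` is rejected because of the `$`, while a name such as `EXPR_1` would be accepted.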





[jira] [Closed] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation

2019-08-29 Thread Gleb Kanterov (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gleb Kanterov closed BEAM-7829.
---
Resolution: Fixed

> AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
> --
>
> Key: BEAM-7829
> URL: https://issues.apache.org/jira/browse/BEAM-7829
> Project: Beam
>  Issue Type: Test
>  Components: io-java-avro, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ryan Skraba
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> While trying to use an Avro PCollection with the SQL transform, I noticed 
> that a bijective transform cannot be done correctly: PCollection -> 
> SQL -> PCollection -> ParDo -> PCollection. Some of the Avro metadata gets 
> lost, in particular the name of the Avro Schema. This is important because 
> Avro validates that the schema has a name, and if it does not, it fails with 
> a SchemaParseException.
> {quote}
> org.apache.avro.SchemaParseException: Illegal character in: EXPR$1
>     at org.apache.avro.Schema.validateName (Schema.java:1151)
>     at org.apache.avro.Schema.access$200 (Schema.java:81)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:403)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:423)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:415){quote}





[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303651=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303651
 ]

ASF GitHub Bot logged work on BEAM-8003:


Author: ASF GitHub Bot
Created on: 29/Aug/19 12:25
Start Date: 29/Aug/19 12:25
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove 
Perfkit leftovers
URL: https://github.com/apache/beam/pull/9450#issuecomment-526161563
 
 
   Run seed job
 



Issue Time Tracking
---

Worklog Id: (was: 303651)
Time Spent: 20m  (was: 10m)

> Remove all mentions of PKB on Confluence / website docs
> ---
>
> Key: BEAM-8003
> URL: https://issues.apache.org/jira/browse/BEAM-8003
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing, website
>Reporter: Lukasz Gajowy
>Assignee: Lukasz Gajowy
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303614
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:47
Start Date: 29/Aug/19 11:47
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create 
Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449#issuecomment-526149697
 
 
   @iemejia  Could you take a look?
 



Issue Time Tracking
---

Worklog Id: (was: 303614)
Time Spent: 40m  (was: 0.5h)

> Create ParDo Python Load Test Jenkins Job [Flink]
> -
>
> Key: BEAM-7660
> URL: https://issues.apache.org/jira/browse/BEAM-7660
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303616=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303616
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:47
Start Date: 29/Aug/19 11:47
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create 
Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449#issuecomment-526149782
 
 
   R: @iemejia Could you take a look?
 



Issue Time Tracking
---

Worklog Id: (was: 303616)
Time Spent: 1h  (was: 50m)

> Create ParDo Python Load Test Jenkins Job [Flink]
> -
>
> Key: BEAM-7660
> URL: https://issues.apache.org/jira/browse/BEAM-7660
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303615
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:47
Start Date: 29/Aug/19 11:47
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create 
Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449#issuecomment-526149697
 
 
   @iemejia  Could you take a look?
 



Issue Time Tracking
---

Worklog Id: (was: 303615)
Time Spent: 50m  (was: 40m)

> Create ParDo Python Load Test Jenkins Job [Flink]
> -
>
> Key: BEAM-7660
> URL: https://issues.apache.org/jira/browse/BEAM-7660
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work stopped] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8113 stopped by Jan Lukavský.
--
> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303612
 ]

ASF GitHub Bot logged work on BEAM-7945:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:45
Start Date: 29/Aug/19 11:45
Worklog Time Spent: 10m 
  Work Description: sunjincheng121 commented on pull request #9452: 
[BEAM-7945] Allow runner to configure semi_persist_dir which is used …
URL: https://github.com/apache/beam/pull/9452
 
 
   Currently "semi_persist_dir" is not configurable, which may become a problem 
in certain scenarios. For example, the default value of "semi_persist_dir" in 
the Python SDK harness is "/tmp" 
(https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48). 
When the environment type is "PROCESS", the disk backing "/tmp" may fill up and 
cause unexpected issues in a production environment. 
   
   This pull request therefore makes semi_persist_dir configurable by adding a 
new PipelineOption (RemoteEnvironmentOptions). The pipeline option is passed to 
the `DefaultJobBundleFactory` and then used in each EnvironmentFactory (docker, 
process, external and embedded).
   
   Details of the discussion can be found in [1].
   
   [1] 
https://lists.apache.org/list.html?d...@beam.apache.org:lte=1M:%5BDISCUSS%5D%20Turn%20%60WindowedValue
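
   The idea of a directory setting that defaults to "/tmp" but can be overridden can be sketched with the standard library alone. This is an illustration, not the change made in the pull request: the function `parse_harness_args` is hypothetical, and the flag spelling `--semi_persist_dir` is assumed from the setting name quoted above.

```python
import argparse


def parse_harness_args(argv):
  """Parse a hypothetical harness command line (illustrative only)."""
  parser = argparse.ArgumentParser()
  parser.add_argument(
      '--semi_persist_dir',
      default='/tmp',  # Default mentioned in the description above.
      help='Local directory for semi-persistent files.')
  # Ignore unrelated flags so the sketch tolerates extra arguments.
  args, _ = parser.parse_known_args(argv)
  return args
```

   For example, `parse_harness_args(['--semi_persist_dir', '/data/beam'])` would select a directory on a larger disk, while an empty argument list keeps the "/tmp" default.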
   
   
   

[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303609
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:42
Start Date: 29/Aug/19 11:42
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526148312
 
 
   R: @mxm 
 



Issue Time Tracking
---

Worklog Id: (was: 303609)
Time Spent: 20m  (was: 10m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303610
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:42
Start Date: 29/Aug/19 11:42
Worklog Time Spent: 10m 
  Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files 
from context classloader
URL: https://github.com/apache/beam/pull/9451#issuecomment-526148361
 
 
   Run Flink ValidatesRunner
 



Issue Time Tracking
---

Worklog Id: (was: 303610)
Time Spent: 0.5h  (was: 20m)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Jan Lukavský
>Assignee: Jan Lukavský
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also add files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303608=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303608
 ]

ASF GitHub Bot logged work on BEAM-8113:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:41
Start Date: 29/Aug/19 11:41
Worklog Time Spent: 10m 
  Work Description: je-ik commented on pull request #9451: [BEAM-8113] 
Stage files from context classloader
URL: https://github.com/apache/beam/pull/9451
 
 
   Fixes [BEAM-8113].
   
   
   
   Thank you for your contribution! Follow this checklist to help us 
incorporate your contribution quickly and easily:
   
- [ ] [**Choose 
reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and 
mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/)
 | --- | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build
 
Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)
 | --- | 

[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303605=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303605
 ]

ASF GitHub Bot logged work on BEAM-8003:


Author: ASF GitHub Bot
Created on: 29/Aug/19 11:31
Start Date: 29/Aug/19 11:31
Worklog Time Spent: 10m 
  Work Description: lgajowy commented on pull request #9450: [BEAM-8003] 
Remove Perfkit from webside documentation
URL: https://github.com/apache/beam/pull/9450
 
 
   Removes all out-of-date mentions of PKB from the documentation. Removes the 
Gradle task that became obsolete when we stopped using Perfkit. Removes the 
word "Perfkit" from the file-based tests description (it no longer makes sense).
   
   @markflyhigh could you take a look?
   
   
   

[jira] [Work started] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs

2019-08-29 Thread Lukasz Gajowy (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8003 started by Lukasz Gajowy.
---
> Remove all mentions of PKB on Confluence / website docs
> ---
>
> Key: BEAM-8003
> URL: https://issues.apache.org/jira/browse/BEAM-8003
> Project: Beam
> Issue Type: Sub-task
> Components: testing, website
> Reporter: Lukasz Gajowy
> Assignee: Lukasz Gajowy
> Priority: Major
>






[jira] [Commented] (BEAM-7978) ArithmeticExceptions on getting backlog bytes

2019-08-29 Thread Alexey Romanenko (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-7978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918516#comment-16918516
 ] 

Alexey Romanenko commented on BEAM-7978:


[~Juraszek] [~chamikara] Could you take a look at this PR: 
https://github.com/apache/beam/pull/9432?

> ArithmeticExceptions on getting backlog bytes 
> --
>
> Key: BEAM-7978
> URL: https://issues.apache.org/jira/browse/BEAM-7978
> Project: Beam
> Issue Type: Bug
> Components: io-java-kinesis
> Affects Versions: 2.14.0
> Reporter: Mateusz
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Hello,
> Beam 2.14.0
>  (and to be more precise 
> [commit|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ec])
>  introduced a change in the watermark calculation in Kinesis IO that causes 
> the error below:
> {code:java}
> exception:  "java.lang.RuntimeException: Unknown kinesis failure, when trying to reach kinesis
>   at org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:227)
>   at org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:167)
>   at org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:155)
>   at org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:158)
>   at org.apache.beam.runners.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:433)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1289)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
>   at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ArithmeticException: Value cannot fit in an int: 153748963401
>   at org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:229)
>   at org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141)
>   at org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72)
>   at org.joda.time.Minutes.minutesBetween(Minutes.java:101)
>   at org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:169)
>   at org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:210)
>   ... 10 more
> {code}
> We spotted this issue on the Dataflow runner. It is problematic because the 
> inability to get backlog bytes seems to result in constant recreation of 
> KinesisReader.
> The issue happens if the backlog bytes are retrieved before the watermark 
> value is updated from its initial default value. An easy way to reproduce it 
> is to create a pipeline with a Kinesis source for a stream where no records 
> are being put. While debugging it locally, you can observe that the watermark 
> is set to a value in the past (like: "-290308-12-21T19:59:05.225Z"). After 
> two minutes (the default watermark idle duration threshold), the watermark is 
> set to the value of 
> [watermarkIdleThreshold|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkPolicyFactory.java#L110],
>  so the next backlog bytes retrieval should be correct. However, as described 
> above, running the pipeline on the Dataflow runner results in KinesisReader 
> being closed just after creation, so the watermark is never corrected.
> The reason for the issue is as follows: the introduced watermark policies 
> rely on 
> [WatermarkParameters|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkParameters.java]
>  which initialises currentWatermark and eventTime to 
> [BoundedWindow.TIMESTAMP_MIN_VALUE|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ecR52].
>  This results in the watermark being set to new Instant(-9223372036854775L) at 
> KinesisReader creation. Calculated [period between the watermark and the 
> current 
> 
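
The overflow described in this report can be reproduced with plain arithmetic. The following is a minimal sketch, not the Kinesis IO code: it uses `long` division instead of Joda-Time so that it stays self-contained, and the class and method names (`BacklogMinutesSketch`, `minutesBetween`, `fitsInInt`) are hypothetical. It only assumes the initial watermark value quoted in the report.

```java
final class BacklogMinutesSketch {
    // Epoch-millisecond value of the initial watermark, as quoted in the report.
    static final long INITIAL_WATERMARK_MILLIS = -9223372036854775L;

    /**
     * Whole minutes between two epoch-millisecond timestamps, kept as a long.
     * Assumes the difference itself does not overflow a long, which holds for
     * the values used here.
     */
    static long minutesBetween(long startMillis, long endMillis) {
        return (endMillis - startMillis) / 60_000L;
    }

    /** True when the value can be narrowed to an int without loss. */
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long minutes = minutesBetween(INITIAL_WATERMARK_MILLIS, System.currentTimeMillis());
        System.out.println("minutes between watermark and now: " + minutes);
        System.out.println("fits in int: " + fitsInInt(minutes));
        try {
            // Narrowing to int fails for the same reason the Joda-Time call in the trace does.
            Math.toIntExact(minutes);
        } catch (ArithmeticException e) {
            System.out.println("narrowing failed: " + e.getMessage());
        }
    }
}
```

With the initial watermark as the start point, the minute count is on the order of 1.5e11, far above `Integer.MAX_VALUE`, which is consistent with the value 153748963401 in the stack trace.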

[jira] [Updated] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Lukavský updated BEAM-8113:
---
Status: Open  (was: Triage Needed)

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: Major
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also stage files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Assigned] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Lukavský reassigned BEAM-8113:
--

Assignee: Jan Lukavský

> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: Major
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also stage files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work started] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8113 started by Jan Lukavský.
--
> FlinkRunner: Stage files from context classloader
> -
>
> Key: BEAM-8113
> URL: https://issues.apache.org/jira/browse/BEAM-8113
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Jan Lukavský
> Assignee: Jan Lukavský
> Priority: Major
>
> Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged 
> by default. Also stage files from 
> {{Thread.currentThread().getContextClassLoader()}}.





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303580
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:35
Start Date: 29/Aug/19 10:35
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9374: [BEAM-5428] Implement 
Runner support for cache tokens
URL: https://github.com/apache/beam/pull/9374#issuecomment-526128312
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303580)
Time Spent: 10.5h  (was: 10h 20m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 10.5h
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303578
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:31
Start Date: 29/Aug/19 10:31
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache 
token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#issuecomment-526127127
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303578)
Time Spent: 10h 10m  (was: 10h)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 10h 10m
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303579=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303579
 ]

ASF GitHub Bot logged work on BEAM-5428:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:31
Start Date: 29/Aug/19 10:31
Worklog Time Spent: 10m 
  Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache 
token Proto design to only include tokens in ProcessBundleRequest
URL: https://github.com/apache/beam/pull/9440#issuecomment-526127127
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303579)
Time Spent: 10h 20m  (was: 10h 10m)

> Implement cross-bundle state caching.
> -
>
> Key: BEAM-5428
> URL: https://issues.apache.org/jira/browse/BEAM-5428
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Reporter: Robert Bradshaw
> Assignee: Rakesh Kumar
> Priority: Major
> Time Spent: 10h 20m
> Remaining Estimate: 0h
>
> Tech spec: 
> [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m]
> Relevant document: 
> [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit]
> Mailing list link: 
> [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E]





[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303569
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:18
Start Date: 29/Aug/19 10:18
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create 
Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449#issuecomment-526122565
 
 
   Run Python Load Tests ParDo Flink Batch
 



Issue Time Tracking
---

Worklog Id: (was: 303569)
Time Spent: 0.5h  (was: 20m)

> Create ParDo Python Load Test Jenkins Job [Flink]
> -
>
> Key: BEAM-7660
> URL: https://issues.apache.org/jira/browse/BEAM-7660
> Project: Beam
> Issue Type: Sub-task
> Components: testing
> Reporter: Kamil Wasilewski
> Assignee: Kamil Wasilewski
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303555
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:07
Start Date: 29/Aug/19 10:07
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create 
Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449#issuecomment-526118890
 
 
   Run Seed Job
 



Issue Time Tracking
---

Worklog Id: (was: 303555)
Time Spent: 20m  (was: 10m)

> Create ParDo Python Load Test Jenkins Job [Flink]
> -
>
> Key: BEAM-7660
> URL: https://issues.apache.org/jira/browse/BEAM-7660
> Project: Beam
> Issue Type: Sub-task
> Components: testing
> Reporter: Kamil Wasilewski
> Assignee: Kamil Wasilewski
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303554
 ]

ASF GitHub Bot logged work on BEAM-7660:


Author: ASF GitHub Bot
Created on: 29/Aug/19 10:06
Start Date: 29/Aug/19 10:06
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on pull request #9449: [BEAM-7660] 
Create Python ParDo load test job on Flink
URL: https://github.com/apache/beam/pull/9449
 
 
   
   

[jira] [Created] (BEAM-8113) FlinkRunner: Stage files from context classloader

2019-08-29 Thread Jira
Jan Lukavský created BEAM-8113:
--

 Summary: FlinkRunner: Stage files from context classloader
 Key: BEAM-8113
 URL: https://issues.apache.org/jira/browse/BEAM-8113
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Reporter: Jan Lukavský


Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged by 
default. Also stage files from {{Thread.currentThread().getContextClassLoader()}}.
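
To illustrate the idea, here is a minimal sketch of merging the resources visible to two classloaders. It is not the FlinkRunner implementation: the class `StagingSketch` and the methods `urlsOf`, `filesToStage`, and `loaderFor` are hypothetical names, and the sketch assumes that resources are discovered by walking `URLClassLoader` instances in each loader's parent chain.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

final class StagingSketch {
    /** Collects the URLs exposed by every URLClassLoader in the loader's parent chain. */
    static Set<String> urlsOf(ClassLoader loader) {
        Set<String> urls = new LinkedHashSet<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            if (cl instanceof URLClassLoader) {
                for (URL url : ((URLClassLoader) cl).getURLs()) {
                    urls.add(url.toString());
                }
            }
        }
        return urls;
    }

    /** Union of both loaders' URLs, runner loader first, duplicates dropped. */
    static Set<String> filesToStage(ClassLoader runnerLoader, ClassLoader contextLoader) {
        Set<String> result = new LinkedHashSet<>(urlsOf(runnerLoader));
        result.addAll(urlsOf(contextLoader));
        return result;
    }

    /** Hypothetical helper for the demo: a parentless URLClassLoader over the given paths. */
    static ClassLoader loaderFor(String... paths) {
        URL[] urls = new URL[paths.length];
        for (int i = 0; i < paths.length; i++) {
            try {
                urls[i] = Paths.get(paths[i]).toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }
        return new URLClassLoader(urls, null);
    }

    public static void main(String[] args) {
        // Stand-ins for FlinkRunner.class.getClassLoader() and the thread context loader.
        ClassLoader runnerLoader = loaderFor("runner.jar", "shared.jar");
        ClassLoader contextLoader = loaderFor("shared.jar", "user-code.jar");
        filesToStage(runnerLoader, contextLoader).forEach(System.out::println);
    }
}
```

In a real pipeline the two arguments would be `FlinkRunner.class.getClassLoader()` and `Thread.currentThread().getContextClassLoader()`. Note that on Java 9 and later the application classloader is not a `URLClassLoader`, so this particular discovery approach would return nothing for it.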





[jira] [Work started] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread Kamil Wasilewski (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8108 started by Kamil Wasilewski.
--
> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
> Issue Type: Test
> Components: testing
> Reporter: Kamil Wasilewski
> Assignee: Kamil Wasilewski
> Priority: Minor
> Time Spent: 1h
> Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303495
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 09:41
Start Date: 29/Aug/19 09:41
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526109750
 
 
   Run Chicago Taxi on Flink
 



Issue Time Tracking
---

Worklog Id: (was: 303495)
Time Spent: 1h  (was: 50m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303472
 ]

ASF GitHub Bot logged work on BEAM-8107:


Author: ASF GitHub Bot
Created on: 29/Aug/19 09:10
Start Date: 29/Aug/19 09:10
Worklog Time Spent: 10m 
  Work Description: mibo commented on issue #9439: [BEAM-8107] Update 
commons-compress to version 1.19
URL: https://github.com/apache/beam/pull/9439#issuecomment-526098353
 
 
   @iemejia Thanks a lot for the info  
 



Issue Time Tracking
---

Worklog Id: (was: 303472)
Time Spent: 1h 40m  (was: 1.5h)

> Update commons-compress to version 1.19
> ---
>
> Key: BEAM-8107
> URL: https://issues.apache.org/jira/browse/BEAM-8107
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Beam's current version of commons-compress (1.18) is subject to a DoS 
> vulnerability so we need to upgrade it [CVE-2019-12402]





[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303473=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303473
 ]

ASF GitHub Bot logged work on BEAM-8107:


Author: ASF GitHub Bot
Created on: 29/Aug/19 09:10
Start Date: 29/Aug/19 09:10
Worklog Time Spent: 10m 
  Work Description: mibo commented on issue #9439: [BEAM-8107] Update 
commons-compress to version 1.19
URL: https://github.com/apache/beam/pull/9439#issuecomment-526098353
 
 
   @iemejia Thanks a lot for the info 
 



Issue Time Tracking
---

Worklog Id: (was: 303473)
Time Spent: 1h 50m  (was: 1h 40m)

> Update commons-compress to version 1.19
> ---
>
> Key: BEAM-8107
> URL: https://issues.apache.org/jira/browse/BEAM-8107
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Beam's current version of commons-compress (1.18) is subject to a DoS 
> vulnerability so we need to upgrade it [CVE-2019-12402]





[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303471
 ]

ASF GitHub Bot logged work on BEAM-8107:


Author: ASF GitHub Bot
Created on: 29/Aug/19 09:05
Start Date: 29/Aug/19 09:05
Worklog Time Spent: 10m 
  Work Description: iemejia commented on issue #9439: [BEAM-8107] Update 
commons-compress to version 1.19
URL: https://github.com/apache/beam/pull/9439#issuecomment-526096729
 
 
   The next release branch will be cut on Sept. 11. It usually takes around two 
weeks after that for the new artifacts to be available.
 



Issue Time Tracking
---

Worklog Id: (was: 303471)
Time Spent: 1.5h  (was: 1h 20m)

> Update commons-compress to version 1.19
> ---
>
> Key: BEAM-8107
> URL: https://issues.apache.org/jira/browse/BEAM-8107
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Beam's current version of commons-compress (1.18) is subject to a DoS 
> vulnerability so we need to upgrade it [CVE-2019-12402]





[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303466
 ]

ASF GitHub Bot logged work on BEAM-7829:


Author: ASF GitHub Bot
Created on: 29/Aug/19 08:54
Start Date: 29/Aug/19 08:54
Worklog Time Spent: 10m 
  Work Description: kanterov commented on issue #9247: [BEAM-7829] Add 
schema names when converting with AvroUtils.toAvroSchema
URL: https://github.com/apache/beam/pull/9247#issuecomment-526092234
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303466)
Time Spent: 2.5h  (was: 2h 20m)

> AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
> --
>
> Key: BEAM-7829
> URL: https://issues.apache.org/jira/browse/BEAM-7829
> Project: Beam
>  Issue Type: Test
>  Components: io-java-avro, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ryan Skraba
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> While trying to use an Avro PCollection with the SQL transform, I noticed that 
> a bijective transform could not be done correctly: PCollection -> 
> SQL -> PCollection -> ParDo -> PCollection. Some of the Avro metadata gets 
> lost, in particular the name of the Avro Schema. This matters because Avro 
> validates that the schema has a name and, if it does not, fails with a 
> SchemaParseException.
> {quote}
> org.apache.avro.SchemaParseException: Illegal character in: EXPR$1
>     at org.apache.avro.Schema.validateName (Schema.java:1151)
>     at org.apache.avro.Schema.access$200 (Schema.java:81)
>     at org.apache.avro.Schema$Field.<init>(Schema.java:403)
>     at org.apache.avro.Schema$Field.<init>(Schema.java:423)
>     at org.apache.avro.Schema$Field.<init>(Schema.java:415){quote}
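
The "Illegal character in: EXPR$1" message above is consistent with the name 
rule in the Avro specification: a name must start with [A-Za-z_] and may 
contain only [A-Za-z0-9_] afterwards. The sketch below checks that rule in 
plain Java. The class AvroNameCheck and the sanitize helper are hypothetical 
and only illustrate one possible way to repair a name; they are not the fix 
applied in the pull request.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of Beam or Avro).
final class AvroNameCheck {

  // Name rule as stated in the Avro specification.
  private static final Pattern VALID = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  // Returns true if the name satisfies the specification's name rule.
  static boolean isValidName(String name) {
    return name != null && VALID.matcher(name).matches();
  }

  // Assumed repair strategy: replace each illegal character with '_' and
  // prefix '_' when the result is empty or starts with a digit.
  static String sanitize(String name) {
    String cleaned = name.replaceAll("[^A-Za-z0-9_]", "_");
    if (cleaned.isEmpty() || Character.isDigit(cleaned.charAt(0))) {
      cleaned = "_" + cleaned;
    }
    return cleaned;
  }

  public static void main(String[] args) {
    System.out.println(isValidName("EXPR$1")); // false: '$' is not allowed
    System.out.println(sanitize("EXPR$1"));    // EXPR_1
  }
}
```

Note that replacing characters can make two distinct names collide, so a real 
conversion would also need to handle duplicates.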





[jira] [Commented] (BEAM-8109) Encoder exception for structure contains Iterable of KV

2019-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918402#comment-16918402
 ] 

Ismaël Mejía commented on BEAM-8109:


Just out of curiosity, does this work on the Direct runner?

> Encoder exception for structure contains Iterable of KV
> ---
>
> Key: BEAM-8109
> URL: https://issues.apache.org/jira/browse/BEAM-8109
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-sorter
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Brachi Packter
>Priority: Critical
>
> When doing a group-by and then a sort, the sort should receive this structure:
> PCollection>>>
> However, for any SecondaryKeyT I use, I get a coder exception (EOF); this 
> happens for both long and string coders.
> This happens on the Dataflow runner.
>  
> Here is the full exception:
> {code}
> java.lang.IllegalStateException: Unable to decode tag list using 
> WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536)
>  
> org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
>  
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307)
>  
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231)
>  
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36)
>  
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028)
>  
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  java.lang.Thread.run(Thread.java:745) Caused by: 
> org.apache.beam.sdk.coders.CoderException: java.io.EOFException 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
> org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124)
>  
> org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60)
>  org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592)
>  
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
>  
> 

[jira] [Updated] (BEAM-8109) Encoder exception for structure contains Iterable of KV

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8109:
---
Status: Open  (was: Triage Needed)

> Encoder exception for structure contains Iterable of KV
> ---
>
> Key: BEAM-8109
> URL: https://issues.apache.org/jira/browse/BEAM-8109
> Project: Beam
>  Issue Type: Bug
>  Components: extensions-java-sorter
>Affects Versions: 2.14.0, 2.15.0
>Reporter: Brachi Packter
>Priority: Critical
>
> When doing a group-by and then a sort, the sort should receive this structure:
> PCollection>>>
> However, for any SecondaryKeyT I use, I get a coder exception (EOF); this 
> happens for both long and string coders.
> This happens on the Dataflow runner.
>  
> Here is the full exception:
> {code}
> java.lang.IllegalStateException: Unable to decode tag list using 
> WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536)
>  
> org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
>  
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307)
>  
> org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231)
>  
> org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36)
>  
> org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
>  
> org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028)
>  
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  java.lang.Thread.run(Thread.java:745) Caused by: 
> org.apache.beam.sdk.coders.CoderException: java.io.EOFException 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) 
> org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
> org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124)
>  
> org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60)
>  org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) 
> org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592)
>  
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
>  
> org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
>  
> org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542)
>  
> 

[jira] [Updated] (BEAM-8104) Remove windowStrategy deserialization exception handling

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8104:
---
Status: Open  (was: Triage Needed)

> Remove windowStrategy deserialization exception handling
> 
>
> Key: BEAM-8104
> URL: https://issues.apache.org/jira/browse/BEAM-8104
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Ankur Goenka
>Priority: Major
>
> Remove the exception handling in JRH at 
> [https://github.com/apache/beam/blob/98e553b371966127a988bab69290b5a8c44d5dad/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupAlsoByWindowParDoFnFactory.java#L106]
> and subsequently fix the Reshuffle transform to use _IdentityWindowFn in 
> [https://github.com/apache/beam/blob/98e553b371966127a988bab69290b5a8c44d5dad/sdks/python/apache_beam/transforms/util.py#L626]
>  





[jira] [Updated] (BEAM-8109) Encoder exception for structure contains Iterable of KV

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8109:
---
Description: 
When doing a group-by and then a sort, the sort should receive this structure:

PCollection>>>

However, for any SecondaryKeyT I use, I get a coder exception (EOF); this 
happens for both long and string coders.

This happens on the Dataflow runner.

 

Here is the full exception:
{code}
java.lang.IllegalStateException: Unable to decode tag list using 
WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542)
 
org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503)
 
org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536)
 
org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
 
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307)
 
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231)
 
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36)
 
org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295)
 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
 
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028)
 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
java.lang.Thread.run(Thread.java:745) Caused by: 
org.apache.beam.sdk.coders.CoderException: java.io.EOFException 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) 
org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) 
org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) 
org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) 
org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124) 
org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60) 
org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) 
org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) 
org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) 
org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592)
 
org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420)
 
org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313)
 
org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542)
 
org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503)
 
org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536)
 
org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60)
 
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307)
 
org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231)
 
org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36)
 

[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303459=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303459
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 08:28
Start Date: 29/Aug/19 08:28
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526082164
 
 
   Google Cloud Flink Runner Chicago Taxi Example
 



Issue Time Tracking
---

Worklog Id: (was: 303459)
Time Spent: 50m  (was: 40m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3

2019-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918399#comment-16918399
 ] 

Ismaël Mejía commented on BEAM-8110:


This looks like an issue on the Arrow side; it may be worth reporting there. At 
least there is a workaround: installing the dependency. Anything to add, 
[~bhulette]?

> Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
> --
>
> Key: BEAM-8110
> URL: https://issues.apache.org/jira/browse/BEAM-8110
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies, sdk-py-core
>Affects Versions: 2.15.0
> Environment: Fedora 30, Singularity container
>Reporter: Robert Lugg
>Priority: Minor
>
> {{When importing apache_beam in python, it fails because it can't find 
> libnsl.so.1.}}
> It is fixed by running 'dnf install libnsl'.  This appears to be a dependency 
> of Apache Arrow.
>  
> {{output:}}
> {{Singularity tfx:~/tfx_image_example_gen> python}}
> {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }}
> {{[GCC 7.3.0] on linux}}
> {{Type "help", "copyright", "credits" or "license" for more information.}}
> {{>>> import apache_beam}}
> {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: 
> UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.}}
> {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}}
> {{Traceback (most recent call last):}}
> {{ File "<stdin>", line 1, in <module>}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", 
> line 98, in <module>}}
> {{ from apache_beam import io}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", 
> line 29, in <module>}}
> {{ from apache_beam.io.parquetio import *}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", 
> line 45, in <module>}}
> {{ import pyarrow as pa}}
> {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", 
> line 49, in <module>}}
> {{ from pyarrow.lib import cpu_count, set_cpu_count}}
> {{ImportError: libnsl.so.1: cannot open shared object file: No such file or 
> directory}}





[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303458=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303458
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 08:28
Start Date: 29/Aug/19 08:28
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526082717
 
 
   Run Chicago Taxi on Flink
 



Issue Time Tracking
---

Worklog Id: (was: 303458)
Time Spent: 40m  (was: 0.5h)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303455=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303455
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 08:26
Start Date: 29/Aug/19 08:26
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526082164
 
 
   Google Cloud Flink Runner Chicago Taxi Example
 



Issue Time Tracking
---

Worklog Id: (was: 303455)
Time Spent: 0.5h  (was: 20m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8110:
---
Component/s: sdk-py-core

> Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
> --
>
> Key: BEAM-8110
> URL: https://issues.apache.org/jira/browse/BEAM-8110
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies, sdk-py-core
>Affects Versions: 2.15.0
> Environment: Fedora 30, Singularity container
>Reporter: Robert Lugg
>Priority: Minor
>
> {{When importing apache_beam in python, it fails because it can't find 
> libnsl.so.1.}}
> It is fixed by running 'dnf install libnsl'.  This appears to be a dependency 
> of Apache Arrow.
>  
> {{output:}}
> {{Singularity tfx:~/tfx_image_example_gen> python}}
> {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }}
> {{[GCC 7.3.0] on linux}}
> {{Type "help", "copyright", "credits" or "license" for more information.}}
> {{>>> import apache_beam}}
> {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: 
> UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.}}
> {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}}
> {{Traceback (most recent call last):}}
> {{ File "<stdin>", line 1, in <module>}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", 
> line 98, in <module>}}
> {{ from apache_beam import io}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", 
> line 29, in <module>}}
> {{ from apache_beam.io.parquetio import *}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", 
> line 45, in <module>}}
> {{ import pyarrow as pa}}
> {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", 
> line 49, in <module>}}
> {{ from pyarrow.lib import cpu_count, set_cpu_count}}
> {{ImportError: libnsl.so.1: cannot open shared object file: No such file or 
> directory}}





[jira] [Updated] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8110:
---
Status: Open  (was: Triage Needed)

> Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
> --
>
> Key: BEAM-8110
> URL: https://issues.apache.org/jira/browse/BEAM-8110
> Project: Beam
>  Issue Type: Bug
>  Components: dependencies
>Affects Versions: 2.15.0
> Environment: Fedora 30, Singularity container
>Reporter: Robert Lugg
>Priority: Minor
>
> {{When importing apache_beam in python, it fails because it can't find 
> libnsl.so.1.}}
> It is fixed by running 'dnf install libnsl'.  This appears to be a dependency 
> of Apache Arrow.
>  
> {{output:}}
> {{Singularity tfx:~/tfx_image_example_gen> python}}
> {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }}
> {{[GCC 7.3.0] on linux}}
> {{Type "help", "copyright", "credits" or "license" for more information.}}
> {{>>> import apache_beam}}
> {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: 
> UserWarning: Some syntactic constructs of Python 3 are not yet fully 
> supported by Apache Beam.}}
> {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}}
> {{Traceback (most recent call last):}}
> {{ File "<stdin>", line 1, in <module>}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", 
> line 98, in <module>}}
> {{ from apache_beam import io}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", 
> line 29, in <module>}}
> {{ from apache_beam.io.parquetio import *}}
> {{ File 
> "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", 
> line 45, in <module>}}
> {{ import pyarrow as pa}}
> {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", 
> line 49, in <module>}}
> {{ from pyarrow.lib import cpu_count, set_cpu_count}}
> {{ImportError: libnsl.so.1: cannot open shared object file: No such file or 
> directory}}
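
The failure quoted above comes from the dynamic loader, not from Beam itself, so it can be checked without importing apache_beam or pyarrow. The following is a minimal sketch, assuming a Linux host; the helper name shared_library_loads is made up for illustration and is not part of Beam or Arrow.

```python
import ctypes


def shared_library_loads(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        # CDLL asks the loader to open the library, as a C extension import would.
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False
    return True


if __name__ == "__main__":
    name = "libnsl.so.1"
    if shared_library_loads(name):
        print(name + " can be loaded")
    else:
        # Per the report above, 'dnf install libnsl' provides it on Fedora 30.
        print(name + " is missing")
```

Running this inside the same container before importing apache_beam shows whether the libnsl package still needs to be installed.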





[jira] [Updated] (BEAM-8111) SchemaCoder broken on DataflowRunner

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8111:
---
Status: Open  (was: Triage Needed)

> SchemaCoder broken on DataflowRunner
> 
>
> Key: BEAM-8111
> URL: https://issues.apache.org/jira/browse/BEAM-8111
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-core
>Affects Versions: 2.15.0
>Reporter: Brian Hulette
>Assignee: Brian Hulette
>Priority: Blocker
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> https://github.com/apache/beam/commit/e65c176a9f34e45d408281e1101a2ae54cef0f6c
>  broke SchemaCoder on Dataflow. When translating a schema that uses logical 
> types from a cloud object, Dataflow encounters a runtime error.
> This means any pipeline that uses SqlTransform or schema transforms will fail 
> on Dataflow in 2.15.0.





[jira] [Updated] (BEAM-8112) Support passing stateBackend through pipeline options in python sdks

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8112:
---
Status: Open  (was: Triage Needed)

> Support passing stateBackend through pipeline options in python sdks
> 
>
> Key: BEAM-8112
> URL: https://issues.apache.org/jira/browse/BEAM-8112
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Catlyn Kong
>Priority: Major
>
> Currently the only way for the Python SDK to instruct Flink to use a 
> StateBackend other than the default (MemoryStateBackend) is to 
> specify state.backend in flink-conf.yaml, which forces every job running on 
> the same Flink cluster to use the same state backend. 
> Ideally we should be able to pass it to the Flink runner through 
> PipelineOptions. Here is the error produced when I pass the flag 
> --state_backend=RocksDBStateBackend:
>  
> {code:java}
> RuntimeError: Pipeline failed in state FAILED: 
> com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot 
> construct instance of `org.apache.flink.runtime.state.StateBackend` (no 
> Creators, like default construct, exist): abstract types either need to be 
> mapped to concrete types, have custom deserializer, or contain additional 
> type information
>  at [Source: (String)""RocksDBStateBackend""; line: 1, column: 1]
> {code}
> Acceptance Criteria:
> Flink StateBackend is configurable via command line options from python.
>  
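
The Jackson error above says an abstract type has to be mapped to a concrete type. One way to picture the requested behaviour is a lookup from a short backend name to a concrete class name. This is only a sketch under assumptions: the --state_backend flag is taken from the report, the function resolve_state_backend and the lookup table are hypothetical, and the fully qualified class names are the ones used by Flink 1.x releases as far as I know. It is not how the Flink runner actually parses its options.

```python
import argparse

# Short name -> concrete Flink class (assumed Flink 1.x package layout).
STATE_BACKENDS = {
    "MemoryStateBackend":
        "org.apache.flink.runtime.state.memory.MemoryStateBackend",
    "FsStateBackend":
        "org.apache.flink.runtime.state.filesystem.FsStateBackend",
    "RocksDBStateBackend":
        "org.apache.flink.contrib.streaming.state.RocksDBStateBackend",
}


def resolve_state_backend(argv):
    """Map a --state_backend flag to a concrete class name (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--state_backend",
        default="MemoryStateBackend",
        choices=sorted(STATE_BACKENDS),
    )
    # Ignore unrelated pipeline flags instead of failing on them.
    args, _unknown = parser.parse_known_args(argv)
    return STATE_BACKENDS[args.state_backend]


if __name__ == "__main__":
    print(resolve_state_backend(["--state_backend=RocksDBStateBackend"]))
```

With a mapping like this, each job could choose its own backend on the command line instead of inheriting the cluster-wide setting from flink-conf.yaml.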





[jira] [Updated] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-8108:
---
Status: Open  (was: Triage Needed)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303449
 ]

ASF GitHub Bot logged work on BEAM-8108:


Author: ASF GitHub Bot
Created on: 29/Aug/19 08:06
Start Date: 29/Aug/19 08:06
Worklog Time Spent: 10m 
  Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run 
Chicago Taxi Example on Flink
URL: https://github.com/apache/beam/pull/9448#issuecomment-526074915
 
 
   Run Seed Job
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 303449)
Time Spent: 20m  (was: 10m)

> Run Chicago Taxi Example on Flink
> -
>
> Key: BEAM-8108
> URL: https://issues.apache.org/jira/browse/BEAM-8108
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Kamil Wasilewski
>Assignee: Kamil Wasilewski
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Updated] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-29 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-7972:
---
Fix Version/s: 2.16.0

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.16.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> A streaming pipeline gets stuck when using Reshuffle with a windowed PCollection.
> This happens because the window function gets deserialized on the Java side, 
> which is not possible, so it defaults to the global window function, resulting 
> in a window function mismatch further down the code.





[jira] [Updated] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection

2019-08-29 Thread Ankur Goenka (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka updated BEAM-7972:
---
Priority: Blocker  (was: Major)

> Portable Python Reshuffle does not work with windowed pcollection
> -
>
> Key: BEAM-7972
> URL: https://issues.apache.org/jira/browse/BEAM-7972
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Blocker
> Fix For: 2.16.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> A streaming pipeline gets stuck when using Reshuffle with a windowed PCollection.
> This happens because the window function gets deserialized on the Java side, 
> which is not possible, so it defaults to the global window function, resulting 
> in a window function mismatch further down the code.





[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303418
 ]

ASF GitHub Bot logged work on BEAM-7829:


Author: ASF GitHub Bot
Created on: 29/Aug/19 07:20
Start Date: 29/Aug/19 07:20
Worklog Time Spent: 10m 
  Work Description: RyanSkraba commented on issue #9247: [BEAM-7829] Add 
schema names when converting with AvroUtils.toAvroSchema
URL: https://github.com/apache/beam/pull/9247#issuecomment-526059811
 
 
   Run Java PreCommit
 



Issue Time Tracking
---

Worklog Id: (was: 303418)
Time Spent: 2h 20m  (was: 2h 10m)

> AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
> --
>
> Key: BEAM-7829
> URL: https://issues.apache.org/jira/browse/BEAM-7829
> Project: Beam
>  Issue Type: Test
>  Components: io-java-avro, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ryan Skraba
>Priority: Minor
> Fix For: 2.16.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> While trying to use an Avro PCollection with the SQL transform, I noticed that 
> you cannot correctly do a bijective transform: PCollection -> 
> SQL -> PCollection -> ParDo -> PCollection. I noticed that 
> some of the Avro metadata gets lost, in particular the name of the Avro 
> Schema. This is important because Avro validates that the schema has a name, 
> and if it does not, it breaks with a SchemaParseException.
> {quote}
> org.apache.avro.SchemaParseException: Illegal character in: EXPR$1
>     at org.apache.avro.Schema.validateName (Schema.java:1151)
>     at org.apache.avro.Schema.access$200 (Schema.java:81)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:403)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:423)
>     at org.apache.avro.Schema$Field.<init> (Schema.java:415){quote}
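
The exception above is raised by Avro's name validation. The Avro specification requires names to start with a letter or underscore and to contain only letters, digits, and underscores, so a SQL-generated name such as EXPR$1 is rejected. The sketch below illustrates that rule in Python. Both function names are hypothetical, and the underscore replacement is only one possible policy, not necessarily what the Beam fix does.

```python
import re

# Name rule from the Avro specification: [A-Za-z_][A-Za-z0-9_]*
_AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name):
    """Return True if the whole string satisfies the Avro name rule."""
    return _AVRO_NAME.fullmatch(name) is not None


def sanitize_avro_name(name):
    """Replace illegal characters with underscores (illustrative policy only)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # A name may not be empty or start with a digit.
    if not cleaned or not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


if __name__ == "__main__":
    for candidate in ["EXPR$1", "user_id", "1st"]:
        print(candidate, is_valid_avro_name(candidate), sanitize_avro_name(candidate))
```

A check like this explains why field names produced by the SQL transform need to be rewritten or named explicitly before they reach Avro.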





[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19

2019-08-29 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303402
 ]

ASF GitHub Bot logged work on BEAM-8107:


Author: ASF GitHub Bot
Created on: 29/Aug/19 06:58
Start Date: 29/Aug/19 06:58
Worklog Time Spent: 10m 
  Work Description: mibo commented on issue #9439: [BEAM-8107] Update 
commons-compress to version 1.19
URL: https://github.com/apache/beam/pull/9439#issuecomment-526052731
 
 
   @pabloem do you already have a date for the next release (which includes 
this PR)?
 



Issue Time Tracking
---

Worklog Id: (was: 303402)
Time Spent: 1h 20m  (was: 1h 10m)

> Update commons-compress to version 1.19
> ---
>
> Key: BEAM-8107
> URL: https://issues.apache.org/jira/browse/BEAM-8107
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, sdk-java-core
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Beam's current version of commons-compress (1.18) is subject to a DoS 
> vulnerability (CVE-2019-12402), so we need to upgrade it.


