[jira] [Comment Edited] (BEAM-7993) portable python precommit is flaky
[ https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919211#comment-16919211 ] Hannah Jiang edited comment on BEAM-7993 at 8/30/19 5:44 AM: - I tried to run at local with py2 and py36. I ran more than 10 times, and all of them parallel running py2 and py36 and no one failed. So I haven't try to run one by one, because it doesn't explain non-parallel run would solve the problem. I wrote a build.gradle to trigger python portable precommit job at local. Jenkins uses a job to build the tasks, so it might not be exactly same. [~markflyhigh], are you familiar with gradle/jenkins? I try to remove parallel part from the Jenkins job, but all precommit job share the same job builder, impact range is very big. Is it possible for you to remove parallel part for Python Portable Precommit tasks only? was (Author: hannahjiang): I tried to run at local with py2 and py36. I wrote a build.gradle to trigger python portable precommit job at local. Jenkins uses a job to build the tasks, so it might not be exactly same. I ran more than 10 times, and all of them parallel running py2 and py36 and no one failed. So I haven't try to run one by one, because it doesn't explain non-parallel run would solve the problem. [~markflyhigh], are you familiar with gradle? I try to remove parallel part from the Jenkins job, but all precommit job share the same job builder, impact range is very big. Is it possible for you to remove parallel part for Python Portable Precommit tasks only? > portable python precommit is flaky > -- > > Key: BEAM-7993 > URL: https://issues.apache.org/jira/browse/BEAM-7993 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures, testing >Affects Versions: 2.15.0 >Reporter: Udi Meiri >Assignee: Mark Liu >Priority: Major > Labels: currently-failing > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > I'm not sure what the root cause is here. > Example log where > :sdks:python:test-suites:portable:py35:portableWordCountBatch failed: > {code} > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (1/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2) > 11:51:22 java.lang.Exception: The user defined 'open()' method caused an > exception: java.io.IOException: Received exit code 1 for command 'docker > inspect -f {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > 11:51:22 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) > 11:51:22 at java.lang.Thread.run(Thread.java:748) > 11:51:22 Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.io.IOException: Received exit code 1 for command 'docker inspect -f > {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) > 11:51:22 at >
[jira] [Work logged] (BEAM-7730) Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9
[ https://issues.apache.org/jira/browse/BEAM-7730?focusedWorklogId=304055=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304055 ] ASF GitHub Bot logged work on BEAM-7730: Author: ASF GitHub Bot Created on: 30/Aug/19 05:42 Start Date: 30/Aug/19 05:42 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on issue #9296: WIP: [BEAM-7730] Introduce Flink 1.9 Runner URL: https://github.com/apache/beam/pull/9296#issuecomment-526464549 Thanks @dmvk, I would like to check it, then feedback you! :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304055) Time Spent: 1h (was: 50m) > Add Flink 1.9 build target and Make FlinkRunner compatible with Flink 1.9 > - > > Key: BEAM-7730 > URL: https://issues.apache.org/jira/browse/BEAM-7730 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: sunjincheng >Assignee: David Moravek >Priority: Major > Fix For: 2.16.0 > > Time Spent: 1h > Remaining Estimate: 0h > > Apache Flink 1.9 will coming and it's better to add Flink 1.9 build target > and make Flink Runner compatible with Flink 1.9. > I will add the brief changes after the Flink 1.9.0 released. > And I appreciate it if you can leave your suggestions or comments! -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7993) portable python precommit is flaky
[ https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919211#comment-16919211 ] Hannah Jiang commented on BEAM-7993: I tried to run at local with py2 and py36. I wrote a build.gradle to trigger python portable precommit job at local. Jenkins uses a job to build the tasks, so it might not be exactly same. I ran more than 10 times, and all of them parallel running py2 and py36 and no one failed. So I haven't try to run one by one, because it doesn't explain non-parallel run would solve the problem. [~markflyhigh], are you familiar with gradle? I try to remove parallel part from the Jenkins job, but all precommit job share the same job builder, impact range is very big. Is it possible for you to remove parallel part for Python Portable Precommit tasks only? > portable python precommit is flaky > -- > > Key: BEAM-7993 > URL: https://issues.apache.org/jira/browse/BEAM-7993 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures, testing >Affects Versions: 2.15.0 >Reporter: Udi Meiri >Assignee: Mark Liu >Priority: Major > Labels: currently-failing > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > I'm not sure what the root cause is here. > Example log where > :sdks:python:test-suites:portable:py35:portableWordCountBatch failed: > {code} > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (1/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2) > 11:51:22 java.lang.Exception: The user defined 'open()' method caused an > exception: java.io.IOException: Received exit code 1 for command 'docker > inspect -f {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > 11:51:22 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) > 11:51:22 at java.lang.Thread.run(Thread.java:748) > 11:51:22 Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.io.IOException: Received exit code 1 for command 'docker inspect -f > {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49) > 11:51:22 at >
[jira] [Work logged] (BEAM-8111) SchemaCoder broken on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8111?focusedWorklogId=304054=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304054 ] ASF GitHub Bot logged work on BEAM-8111: Author: ASF GitHub Bot Created on: 30/Aug/19 05:37 Start Date: 30/Aug/19 05:37 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on issue #9455: [BEAM-8111] Update dataflow container version to revert SchemaCoder change URL: https://github.com/apache/beam/pull/9455#issuecomment-526463400 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304054) Time Spent: 50m (was: 40m) > SchemaCoder broken on DataflowRunner > > > Key: BEAM-8111 > URL: https://issues.apache.org/jira/browse/BEAM-8111 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, sdk-java-core >Affects Versions: 2.15.0 >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Blocker > Time Spent: 50m > Remaining Estimate: 0h > > https://github.com/apache/beam/commit/e65c176a9f34e45d408281e1101a2ae54cef0f6c > broke SchemaCoder on Dataflow. When translating a schema that uses logical > types from a cloud object dataflow encounters a runtime error. > This means any pipelines that use SqlTransform or schema transforms will fail > on Dataflow in 2.15.0 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7978) ArithmeticExceptions on getting backlog bytes
[ https://issues.apache.org/jira/browse/BEAM-7978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919201#comment-16919201 ] Mateusz commented on BEAM-7978: --- [~aromanenko], looks good from my perspective > ArithmeticExceptions on getting backlog bytes > -- > > Key: BEAM-7978 > URL: https://issues.apache.org/jira/browse/BEAM-7978 > Project: Beam > Issue Type: Bug > Components: io-java-kinesis >Affects Versions: 2.14.0 >Reporter: Mateusz >Assignee: Alexey Romanenko >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Hello, > Beam 2.14.0 > (and to be more precise > [commit|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ec]) > introduced a change in watermark calculation in Kinesis IO causing below > error: > {code:java} > exception: "java.lang.RuntimeException: Unknown kinesis failure, when trying > to reach kinesis > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:227) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:167) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:155) > at > org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:158) > at > org.apache.beam.runners.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:433) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1289) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ArithmeticException: Value cannot fit in an int: > 153748963401 > at org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:229) > at > org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141) > at > org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72) > at org.joda.time.Minutes.minutesBetween(Minutes.java:101) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:169) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:210) > ... 10 more > {code} > We spotted this issue on Dataflow runner. It's problematic as inability to > get backlog bytes seems to result in constant recreation of KinesisReader. > The issue happens if the backlog bytes are retrieved before watermark value > is updated from initial default value. Easy way to reproduce it is to create > a pipeline with Kinesis source for a stream where no records are being put. > While debugging it locally, you can observe that the watermark is set to the > value on the past (like: "-290308-12-21T19:59:05.225Z"). After two minutes > (default watermark idle duration threshold is set to 2 minutes) , the > watermark is set to value of > [watermarkIdleThreshold|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkPolicyFactory.java#L110]), > so the next backlog bytes retrieval should be correct. However, as described > before, running the pipeline on Dataflow runner results in KinesisReader > being closed just after creation, so the watermark won't be fixed. > The reason of the issue is following: The introduced watermark policies are > relying on > [WatermarkParameters|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkParameters.java] > which initialises currentWatermark and eventTime to > [BoundedWindow.TIMESTAMP_MIN_VALUE|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ecR52]. > This result in watermark being set to new Instant(-9223372036854775L) at the > KinesisReader creation. Calculated [period between the watermark and the > current > timestamp|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/SimplifiedKinesisClient.java#L169] > is bigger than expected
[jira] [Work logged] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection
[ https://issues.apache.org/jira/browse/BEAM-7972?focusedWorklogId=304046=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304046 ] ASF GitHub Bot logged work on BEAM-7972: Author: ASF GitHub Bot Created on: 30/Aug/19 05:16 Start Date: 30/Aug/19 05:16 Worklog Time Spent: 10m Work Description: angoenka commented on issue #9334: [BEAM-7972] Always use Global window in reshuffle and then apply wind… URL: https://github.com/apache/beam/pull/9334#issuecomment-526458583 Request for another pass for the review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304046) Time Spent: 3h 20m (was: 3h 10m) > Portable Python Reshuffle does not work with windowed pcollection > - > > Key: BEAM-7972 > URL: https://issues.apache.org/jira/browse/BEAM-7972 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Streaming pipeline gets stuck when using Reshuffle with windowed pcollection. > The issue happen because of window function gets deserialized on java side > which is not possible and hence default to global window function and result > into window function mismatch later down the code. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8111) SchemaCoder broken on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8111?focusedWorklogId=304042=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304042 ] ASF GitHub Bot logged work on BEAM-8111: Author: ASF GitHub Bot Created on: 30/Aug/19 04:52 Start Date: 30/Aug/19 04:52 Worklog Time Spent: 10m Work Description: TheNeuralBit commented on pull request #9455: [BEAM-8111] Update dataflow container version to revert SchemaCoder change URL: https://github.com/apache/beam/pull/9455 R: @aaltay Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/) | --- | --- | [![Build
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=304032=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304032 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 30/Aug/19 04:15 Start Date: 30/Aug/19 04:15 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#discussion_r319351193 ## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkRunner.java ## @@ -76,8 +80,15 @@ public static FlinkRunner fromOptions(PipelineOptions options) { } if (flinkOptions.getFilesToStage() == null) { - flinkOptions.setFilesToStage( - detectClassPathResourcesToStage(FlinkRunner.class.getClassLoader())); + List filesToStage = + Stream.concat( + Stream.of(FlinkRunner.class.getClassLoader()), Review comment: Please use `ReflectHelpers#findClassLoader` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304032) Time Spent: 1h 10m (was: 1h) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304026=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304026 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 30/Aug/19 03:58 Start Date: 30/Aug/19 03:58 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#issuecomment-526446443 If possible, please regenerate the Go proto bindings by following https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/model/PROTOBUF.md This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304026) Time Spent: 11h (was: 10h 50m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 11h > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304024=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304024 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 30/Aug/19 03:56 Start Date: 30/Aug/19 03:56 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#discussion_r319348413 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -232,9 +232,35 @@ message ProcessBundleRequest { // instantiated and executed by the SDK harness. string process_bundle_descriptor_reference = 1; + // A cache token which can be used by an SDK to check for the validity + // of cached elements which have a cache token associated. + message CacheToken { + +// A flag to indicate a cache token is valid for user state. +message UserState {} + +// A flag to indicate a cache token is valid for a side input. +message SideInput { + // The id of a side input. + string side_input = 1; +} + +// The scope of a cache token. +oneof type { + UserState user_state = 1; + SideInput side_input = 2; +} + +// The cache token identifier which should be globally unique. +bytes token = 10; + } + // (Optional) A list of cache tokens that can be used by an SDK to reuse // cached data returned by the State API across multiple bundles. - repeated bytes cache_tokens = 2; + repeated CacheToken cache_tokens = 3; + + // Old version of cache_tokens field + reserved 2; Review comment: Delete it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304024) Time Spent: 10h 50m (was: 10h 40m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 50m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304023=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304023 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 30/Aug/19 03:56 Start Date: 30/Aug/19 03:56 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#discussion_r319348727 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -595,10 +621,6 @@ message StateResponse { // failed. string error = 2; - // (Optional) If this is specified, then the result of this state request - // can be cached using the supplied token. - bytes cache_token = 3; Review comment: I don't think we need it as part of the request either since the state key uniquely describes which ProcessBundleDescriptor is being used as long as the runner doesn't reuse instruction ids. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304023) Time Spent: 10h 50m (was: 10h 40m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 50m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=304025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304025 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 30/Aug/19 03:56 Start Date: 30/Aug/19 03:56 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#discussion_r319348379 ## File path: model/fn-execution/src/main/proto/beam_fn_api.proto ## @@ -610,6 +632,9 @@ message StateResponse { // A response to clearing state. StateClearResponse clear = 1002; } + + // Cache token which used to be part of the StateResponse + reserved 3; Review comment: Delete it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304025) Time Spent: 10h 50m (was: 10h 40m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 50m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers
[ https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=304019=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304019 ] ASF GitHub Bot logged work on BEAM-7909: Author: ASF GitHub Bot Created on: 30/Aug/19 03:23 Start Date: 30/Aug/19 03:23 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] Python3 docker containers URL: https://github.com/apache/beam/pull/9351#issuecomment-526440933 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304019) Time Spent: 9.5h (was: 9h 20m) > Write integration tests to test customized containers > - > > Key: BEAM-7909 > URL: https://issues.apache.org/jira/browse/BEAM-7909 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 9.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Comment Edited] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919139#comment-16919139 ] sridhar Reddy edited comment on BEAM-7049 at 8/30/19 3:00 AM: -- After several attempts to generalize union with multiple operands, I am unable to locate a place where this can be done. Any attempt to pull inputs from lower operands to upper operands actually limits output. I am not sure if I am looking in the right places. [~amaliujia] Can you please confirm that UnionMergeRule (or variation of it) is the right place to make these changes? [https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128] I tried to make different changes in the above location but without any luck. In fact, any changes made there limits union result. For ex: select 1 union 2 union 3 union 4 union results in [1,2,3] skipping 4. Just trying to hardcode in BeamUnionRel also was not possible beyond 3 inputs. [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76] Where ever I looked most of the operations are done on 2 or 3 operands only. Only in CaliciteQueryPlanner, I can find everything in one place [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169] Is there a place where one can access - beamRelNode or a similar structure where all the inputs are present- and can be manipulated? CalciteQueryPlanner doesn't seem like a good place to do this. was (Author: sridharg): After several attempts to generalize union with multiple operands, I am unable to locate a place where this can be done. Any attempt to pull inputs from lower operands to upper operands actually limits output. I am not sure I am looking in the right places. [~amaliujia] Can you please confirm that UnionMergeRule (or variation of it) is the right place to make these changes? [https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128] I tried to make different changes in the above location but without any luck. In fact any changes made there limits union result. For ex: select 1 union 2 union 3 union 4 union results in [1,2,3] skipping 4. Just trying to hardcode in BeamUnionRel also was not possible beyond 3 inputs. [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76] Where ever I looked most of the operations are done on 2 or 3 operands only. Only in CaliciteQueryPlanner, I can find everything in one place [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169] Is there a place where you can access - beamRelNode or a similar structure where all the inputs are present- and can be manipulated? CalciteQueryPlanner doesn't seem like a good place to do this. > Merge multiple input to one BeamUnionRel > > > Key: BEAM-7049 > URL: https://issues.apache.org/jira/browse/BEAM-7049 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: sridhar Reddy >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > BeamUnionRel assumes inputs are two and rejects more. So `a UNION b UNION c` > will have to be created as UNION(a, UNION(b, c)) and have two shuffles. If > BeamUnionRel can handle multiple shuffles, we will have only one shuffle -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7049) Merge multiple input to one BeamUnionRel
[ https://issues.apache.org/jira/browse/BEAM-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919139#comment-16919139 ] sridhar Reddy commented on BEAM-7049: - After several attempts to generalize union with multiple operands, I am unable to locate a place where this can be done. Any attempt to pull inputs from lower operands to upper operands actually limits output. I am not sure I am looking in the right places. [~amaliujia] Can you please confirm that UnionMergeRule (or variation of it) is the right place to make these changes? [https://github.com/Qihoo360/Quicksql/blob/7863cff886ad233b5102d6e2f11f8d21f374aa82/analysis/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java#L128] I tried to make different changes in the above location but without any luck. In fact any changes made there limits union result. For ex: select 1 union 2 union 3 union 4 union results in [1,2,3] skipping 4. Just trying to hardcode in BeamUnionRel also was not possible beyond 3 inputs. [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRel.java#L76] Where ever I looked most of the operations are done on 2 or 3 operands only. Only in CaliciteQueryPlanner, I can find everything in one place [https://github.com/apache/beam/blob/9e152b7a99b2e081d224584905a49e14742e0d5d/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/CalciteQueryPlanner.java#L169] Is there a place where you can access - beamRelNode or a similar structure where all the inputs are present- and can be manipulated? CalciteQueryPlanner doesn't seem like a good place to do this. > Merge multiple input to one BeamUnionRel > > > Key: BEAM-7049 > URL: https://issues.apache.org/jira/browse/BEAM-7049 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Rui Wang >Assignee: sridhar Reddy >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > BeamUnionRel assumes inputs are two and rejects more. So `a UNION b UNION c` > will have to be created as UNION(a, UNION(b, c)) and have two shuffles. If > BeamUnionRel can handle multiple shuffles, we will have only one shuffle -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7966) Write portable Flink application jar
[ https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304007 ] ASF GitHub Bot logged work on BEAM-7966: Author: ASF GitHub Bot Created on: 30/Aug/19 02:32 Start Date: 30/Aug/19 02:32 Worklog Time Spent: 10m Work Description: ibzib commented on issue #9331: [BEAM-7966] Write portable Flink application jar URL: https://github.com/apache/beam/pull/9331#issuecomment-526431817 @tweise I created https://issues.apache.org/jira/browse/BEAM-8115 for overwriting pipeline options. For an example usage, check out the integration test I wrote for the subsequent draft PR. https://github.com/apache/beam/blob/725116bfdcc7ff364456be016cb93dc7644ffb02/runners/flink/job-server/pipeline_jar_test.sh This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304007) Time Spent: 1.5h (was: 1h 20m) > Write portable Flink application jar > > > Key: BEAM-7966 > URL: https://issues.apache.org/jira/browse/BEAM-7966 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1.5h > Remaining Estimate: 0h > > *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]* -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (BEAM-8115) Overwrite portable Flink application jar pipeline options at runtime
Kyle Weaver created BEAM-8115: - Summary: Overwrite portable Flink application jar pipeline options at runtime Key: BEAM-8115 URL: https://issues.apache.org/jira/browse/BEAM-8115 Project: Beam Issue Type: New Feature Components: runner-flink Reporter: Kyle Weaver Assignee: Kyle Weaver In the first iteration of portable Flink application jars, all pipeline options are set at job creation time and cannot be later modified at runtime. There should be a way to pass arguments to the jar to write/overwrite pipeline options. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7966) Write portable Flink application jar
[ https://issues.apache.org/jira/browse/BEAM-7966?focusedWorklogId=304006=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-304006 ] ASF GitHub Bot logged work on BEAM-7966: Author: ASF GitHub Bot Created on: 30/Aug/19 02:23 Start Date: 30/Aug/19 02:23 Worklog Time Spent: 10m Work Description: tweise commented on issue #9331: [BEAM-7966] Write portable Flink application jar URL: https://github.com/apache/beam/pull/9331#issuecomment-526430047 It would be good to capture the pipeline option modification at runtime in a follow-up JIRA. Do you have an example how this jar producing runner would be execute from python side? I'm wondering how to integrate this into an automated build. An example command line that also shows how the runner/job-server jar dependency is specified would be helpful. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 304006) Time Spent: 1h 20m (was: 1h 10m) > Write portable Flink application jar > > > Key: BEAM-7966 > URL: https://issues.apache.org/jira/browse/BEAM-7966 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Kyle Weaver >Assignee: Kyle Weaver >Priority: Major > Labels: portability-flink > Time Spent: 1h 20m > Remaining Estimate: 0h > > *[https://docs.google.com/document/d/1kj_9JWxGWOmSGeZ5hbLVDXSTv-zBrx4kQRqOq85RYD4/edit#heading=h.oes73844vmhl]* -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7952) Make the input queue of the input buffer in Python SDK Harness size limited.
[ https://issues.apache.org/jira/browse/BEAM-7952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7952: -- Fix Version/s: (was: 2.16.0) 2.17.0 > Make the input queue of the input buffer in Python SDK Harness size limited. > > > Key: BEAM-7952 > URL: https://issues.apache.org/jira/browse/BEAM-7952 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: sunjincheng >Priority: Major > Fix For: 2.17.0 > > > At Python SDK harness, the input queue size of the input buffer in Python SDK > Harness is not size limited and also not configurable. This may become a > problem if the data production rate is more than the data consumption rate. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7950) Remove the Python 3 warning as it has already been supported
[ https://issues.apache.org/jira/browse/BEAM-7950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7950: -- Fix Version/s: (was: 2.16.0) 2.17.0 > Remove the Python 3 warning as it has already been supported > > > Key: BEAM-7950 > URL: https://issues.apache.org/jira/browse/BEAM-7950 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.17.0 > > > There are warnings that Python 3 is not fully supported in Beam > (beam/sdks/python/setup.py). As mentioned in the ML, we should remove the > Python 3 warning as it has already been supported as an effort of > https://issues.apache.org/jira/browse/BEAM-1251. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7949) Add time-based cache threshold support in the data service of the Python SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7949: -- Fix Version/s: (was: 2.16.0) 2.17.0 > Add time-based cache threshold support in the data service of the Python SDK > harness > > > Key: BEAM-7949 > URL: https://issues.apache.org/jira/browse/BEAM-7949 > Project: Beam > Issue Type: Sub-task > Components: sdk-py-harness >Reporter: sunjincheng >Priority: Major > Fix For: 2.17.0 > > > Currently only size-based cache threshold is supported in the data service of > Python SDK harness. It should also support the time-based cache threshold. > This is very important, especially for streaming jobs which are sensitive to > the delay. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7948) Add time-based cache threshold support in the Java data service
[ https://issues.apache.org/jira/browse/BEAM-7948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sunjincheng updated BEAM-7948: -- Fix Version/s: (was: 2.16.0) 2.17.0 > Add time-based cache threshold support in the Java data service > --- > > Key: BEAM-7948 > URL: https://issues.apache.org/jira/browse/BEAM-7948 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution >Reporter: sunjincheng >Priority: Major > Fix For: 2.17.0 > > > Currently only size-based cache threshold is supported in data service. It > should also support the time-based cache threshold. This is very important, > especially for streaming jobs which are sensitive to the delay. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-8034) Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems
[ https://issues.apache.org/jira/browse/BEAM-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16919079#comment-16919079 ] sunjincheng commented on BEAM-8034: --- Thanks report this issue and share your analysis and I agree with you this caused by FLINK-13367. [~mxm] ! [~markflyhigh] Already bring up the PROPOSAL for the Beam 2.16 release, and Beam 2.16 release branch cut is scheduled on Sep 11. I'm not sure if I bringup the discuss of Flink-1.8.2 release today can catch up with Beam 2.16 release(Sep 11), but I think even if 2.16 can't merge this ticket we also need to push Flink-1.8.2 release as soon as possible, So that we can upgrade flink-runner version to flink-1.8.2 as soon as possible, at least we can merge this in Beam 2.17 release. What do you think? > Upgrade Flink Runner to 1.8.1 and fix Avro Schema serialization problems > > > Key: BEAM-8034 > URL: https://issues.apache.org/jira/browse/BEAM-8034 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Maximilian Michels >Priority: Major > > The Flink Runner should be upgrade to 1.8.1. Users have reported > serialization problems related to Avro's Schema: > {noformat} > Caused by: java.io.NotSerializableException: org.apache.avro.Schema$Field > at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) > at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) > at java.util.ArrayList.writeObject(ArrayList.java:766) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1140) > at > java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496) > at > java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) > at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) > at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) > at > org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:576) > at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:122) > ... 45 more > {noformat} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers
[ https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303982 ] ASF GitHub Bot logged work on BEAM-7909: Author: ASF GitHub Bot Created on: 30/Aug/19 00:46 Start Date: 30/Aug/19 00:46 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] Python3 docker containers URL: https://github.com/apache/beam/pull/9351#issuecomment-526411707 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303982) Time Spent: 9h 20m (was: 9h 10m) > Write integration tests to test customized containers > - > > Key: BEAM-7909 > URL: https://issues.apache.org/jira/browse/BEAM-7909 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 9h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers
[ https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303963=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303963 ] ASF GitHub Bot logged work on BEAM-7909: Author: ASF GitHub Bot Created on: 30/Aug/19 00:06 Start Date: 30/Aug/19 00:06 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] Python3 docker containers URL: https://github.com/apache/beam/pull/9351#issuecomment-526404789 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303963) Time Spent: 9h 10m (was: 9h) > Write integration tests to test customized containers > - > > Key: BEAM-7909 > URL: https://issues.apache.org/jira/browse/BEAM-7909 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 9h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7909) Write integration tests to test customized containers
[ https://issues.apache.org/jira/browse/BEAM-7909?focusedWorklogId=303956=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303956 ] ASF GitHub Bot logged work on BEAM-7909: Author: ASF GitHub Bot Created on: 29/Aug/19 23:53 Start Date: 29/Aug/19 23:53 Worklog Time Spent: 10m Work Description: Hannah-Jiang commented on issue #9351: [BEAM-7909] Python3 docker containers URL: https://github.com/apache/beam/pull/9351#issuecomment-526402467 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303956) Time Spent: 9h (was: 8h 50m) > Write integration tests to test customized containers > - > > Key: BEAM-7909 > URL: https://issues.apache.org/jira/browse/BEAM-7909 > Project: Beam > Issue Type: Sub-task > Components: build-system >Reporter: Hannah Jiang >Assignee: Hannah Jiang >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303951 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 29/Aug/19 23:34 Start Date: 29/Aug/19 23:34 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on issue #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452#issuecomment-526399094 I appreciate if you have time to look up the changes @robertwb @mxm :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303951) Time Spent: 20m (was: 10m) > Allow runner to configure "semi_persist_dir" which is used in the SDK harness > - > > Key: BEAM-7945 > URL: https://issues.apache.org/jira/browse/BEAM-7945 > Project: Beam > Issue Type: Sub-task > Components: java-fn-execution, sdk-go, sdk-java-core, sdk-py-core >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Major > Fix For: 2.16.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Currently "semi_persist_dir" is not configurable. This may become a problem > in certain scenarios. For example, the default value of "semi_persist_dir" is > "/tmp" > ([https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48]) > in Python SDK harness. When the environment type is "PROCESS", the disk of > "/tmp" may be filled up and unexpected issues will occur in production > environment. We should provide a way to configure "semi_persist_dir" in > EnvironmentFactory at the runner side. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7520) DirectRunner timers are not strictly time ordered
[ https://issues.apache.org/jira/browse/BEAM-7520?focusedWorklogId=303862=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303862 ] ASF GitHub Bot logged work on BEAM-7520: Author: ASF GitHub Bot Created on: 29/Aug/19 19:16 Start Date: 29/Aug/19 19:16 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9190: [BEAM-7520] Fix timer firing order in DirectRunner URL: https://github.com/apache/beam/pull/9190#issuecomment-526323662 Run Dataflow ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303862) Time Spent: 9h (was: 8h 50m) > DirectRunner timers are not strictly time ordered > - > > Key: BEAM-7520 > URL: https://issues.apache.org/jira/browse/BEAM-7520 > Project: Beam > Issue Type: Bug > Components: runner-direct >Affects Versions: 2.13.0 >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 9h > Remaining Estimate: 0h > > Let's suppose we have the following situation: > - statful ParDo with two timers - timerA and timerB > - timerA is set for window.maxTimestamp() + 1 > - timerB is set anywhere between timerB.timestamp > - input watermark moves to BoundedWindow.TIMESTAMP_MAX_VALUE > Then the order of timers is as follows (correct): > - timerB > - timerA > But, if timerB sets another timer (say for timerB.timestamp + 1), then the > order of timers will be: > - timerB (timerB.timestamp) > - timerA (BoundedWindow.TIMESTAMP_MAX_VALUE) > - timerB (timerB.timestamp + 1) > Which is not ordered by timestamp. The reason for this is that when the input > watermark update is evaluated, the WatermarkManager,extractFiredTimers() will > produce both timerA and timerB. That would be correct, but when timerB sets > another timer, that breaks this. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-6114) SQL join selection should be done in planner, not in expansion to PTransform
[ https://issues.apache.org/jira/browse/BEAM-6114?focusedWorklogId=303841=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303841 ] ASF GitHub Bot logged work on BEAM-6114: Author: ASF GitHub Bot Created on: 29/Aug/19 18:30 Start Date: 29/Aug/19 18:30 Worklog Time Spent: 10m Work Description: rahul8383 commented on issue #9453: [BEAM-6114] Handle Unsupported Lookup Joins in BeamSql URL: https://github.com/apache/beam/pull/9453#issuecomment-526307374 R: @amaliujia This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303841) Time Spent: 3h 50m (was: 3h 40m) > SQL join selection should be done in planner, not in expansion to PTransform > > > Key: BEAM-6114 > URL: https://issues.apache.org/jira/browse/BEAM-6114 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Rahul Patwari >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > Currently Beam SQL joins all go through a single physical operator which has > a single PTransform that does all join algorithms based on properties of its > input PCollections as well as the relational algebra. > A first step is to make the needed information part of the relational > algebra, so it can choose a PTransform based on that, and the PTransforms can > be simpler. > Second step is to have separate (physical) relational operators for different > join algorithms. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-6114) SQL join selection should be done in planner, not in expansion to PTransform
[ https://issues.apache.org/jira/browse/BEAM-6114?focusedWorklogId=303839=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303839 ] ASF GitHub Bot logged work on BEAM-6114: Author: ASF GitHub Bot Created on: 29/Aug/19 18:25 Start Date: 29/Aug/19 18:25 Worklog Time Spent: 10m Work Description: rahul8383 commented on pull request #9453: [BEAM-6114] Handle Unsupported Lookup Joins in BeamSql URL: https://github.com/apache/beam/pull/9453 - Added code in BeamSideInputLookupJoinRel which throws UnsupportedOperationException if "OUTER JOINS" are applied when the OUTER side is a Seekable Table. - Added Unit Tests for BeamSideInputLookupJoinRel - Modified and added Javadoc for Join Rels. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
[jira] [Commented] (BEAM-7993) portable python precommit is flaky
[ https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918758#comment-16918758 ] Hannah Jiang commented on BEAM-7993: I am not sure if this is the root cause, because it doesn’t answer following two questions. 1. Why it doesn’t happen at local? 2. Why only py3 fails, but not py2? I will try to not parallel run portable tests later today to see if the problem is solved. > portable python precommit is flaky > -- > > Key: BEAM-7993 > URL: https://issues.apache.org/jira/browse/BEAM-7993 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures, testing >Affects Versions: 2.15.0 >Reporter: Udi Meiri >Assignee: Mark Liu >Priority: Major > Labels: currently-failing > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > I'm not sure what the root cause is here. > Example log where > :sdks:python:test-suites:portable:py35:portableWordCountBatch failed: > {code} > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (1/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2) > 11:51:22 java.lang.Exception: The user defined 'open()' method caused an > exception: java.io.IOException: Received exit code 1 for command 'docker > inspect -f {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > 11:51:22 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) > 11:51:22 at java.lang.Thread.run(Thread.java:748) > 11:51:22 Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.io.IOException: Received exit code 1 for command 'docker inspect -f > {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129) > 11:51:22 at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) > 11:51:22 ...
[jira] [Commented] (BEAM-8096) Allow runner to configure "subnetwork"
[ https://issues.apache.org/jira/browse/BEAM-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918769#comment-16918769 ] Daniel Oliveira commented on BEAM-8096: --- Just an update: I took a cursory look through the bug & PR, but I've been busy and probably won't have a chance to give this a real review until tomorrow. > Allow runner to configure "subnetwork" > -- > > Key: BEAM-8096 > URL: https://issues.apache.org/jira/browse/BEAM-8096 > Project: Beam > Issue Type: Improvement > Components: sdk-go >Affects Versions: 2.15.0 >Reporter: Jack Whelpton >Assignee: Robert Burke >Priority: Major > > When running a Dataflow job, the network can be specified using the --network > flag; however, there is no support for doing the same for the subnetwork. > This would be the go equivalent of the following Java code: > [https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/options/DataflowPipelineWorkerPoolOptions.html#getSubnetwork--|https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L151] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7993) portable python precommit is flaky
[ https://issues.apache.org/jira/browse/BEAM-7993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918727#comment-16918727 ] Mark Liu commented on BEAM-7993: Looks like same issue in BEAM-7527. It happens if multiple 'python setup.py' run concurrently under same project. In portable precommit, py2 and py3 tests are triggered at same time. It's likely that part of the test call setuptool and unfortunately conflict with each other. > portable python precommit is flaky > -- > > Key: BEAM-7993 > URL: https://issues.apache.org/jira/browse/BEAM-7993 > Project: Beam > Issue Type: Bug > Components: sdk-py-core, test-failures, testing >Affects Versions: 2.15.0 >Reporter: Udi Meiri >Assignee: Mark Liu >Priority: Major > Labels: currently-failing > Fix For: 2.16.0 > > Time Spent: 40m > Remaining Estimate: 0h > > I'm not sure what the root cause is here. > Example log where > :sdks:python:test-suites:portable:py35:portableWordCountBatch failed: > {code} > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap > (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at [1]read/Read/Split) -> FlatMap (FlatMap at > ExtractOutput[0]) (1/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (2/2) > 11:51:22 [CHAIN MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2)] ERROR > org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN > MapPartition (MapPartition at > [2]write/Write/WriteImpl/DoOnce/{FlatMap(), > Map(decode)}) -> FlatMap (FlatMap at ExtractOutput[0]) (1/2) > 11:51:22 java.lang.Exception: The user defined 'open()' method caused an > exception: java.io.IOException: Received exit code 1 for command 'docker > inspect -f {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:498) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) > 11:51:22 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) > 11:51:22 at java.lang.Thread.run(Thread.java:748) > 11:51:22 Caused by: > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: > java.io.IOException: Received exit code 1 for command 'docker inspect -f > {{.State.Running}} > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1'. stderr: > Error: No such object: > 642c312c335d3881b885873c66917b536e79cff07503fdceaddee5fbeb10bfd1 > 11:51:22 at > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.getUnchecked(LocalCache.java:4966) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:211) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory$SimpleStageBundleFactory.(DefaultJobBundleFactory.java:202) > 11:51:22 at > org.apache.beam.runners.fnexecution.control.DefaultJobBundleFactory.forStage(DefaultJobBundleFactory.java:185) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.FlinkDefaultExecutableStageContext.getStageBundleFactory(FlinkDefaultExecutableStageContext.java:49) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.getStageBundleFactory(ReferenceCountingFlinkExecutableStageContextFactory.java:203) > 11:51:22 at > org.apache.beam.runners.flink.translation.functions.FlinkExecutableStageFunction.open(FlinkExecutableStageFunction.java:129) > 11:51:22 at > org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36) > 11:51:22 at > org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:494) >
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303774 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 15:48 Start Date: 29/Aug/19 15:48 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add Chicago Taxi Example running on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526247150 Run Chicago Taxi on Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303774) Time Spent: 1h 40m (was: 1.5h) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303773 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 15:47 Start Date: 29/Aug/19 15:47 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-526246737 @mxm I think this can be merged. My intention was to enable the auto staging feature to work on JDK >= 9 (if user supplied URLClassLoader in context classloader). The problem is, unfortunately, that local flink runner does not use the context classloader (not sure why, still looking into that), but this PR seems to me to be valid on its own. If the user creates some context classloader before submitting the pipeline, we can do our best to retrieve as many dependencies as possible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303773) Time Spent: 1h (was: 50m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303765 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 15:42 Start Date: 29/Aug/19 15:42 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-526244675 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303765) Time Spent: 50m (was: 40m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lukasz Gajowy updated BEAM-8114: Status: Open (was: Triage Needed) > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > {code:java} > 16:48:57 value = self._prepare_runtime_metrics(runtime_list) > 16:48:57 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics > 16:48:57 min_value = min(min_values) > 16:48:57 ValueError: min() arg is an empty sequence > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski updated BEAM-8114: --- Description: An exception is being raised when publishing metrics: {code:java} 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence {code} was: An exception is being raised when publishing metrics: {code:java} // code 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence {code} > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > {code:java} > 16:48:57 value = self._prepare_runtime_metrics(runtime_list) > 16:48:57 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics > 16:48:57 min_value = min(min_values) > 16:48:57 ValueError: min() arg is an empty sequence > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski updated BEAM-8114: --- Description: An exception is being raised when publishing metrics: {code:java} // code 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence {code} was: An exception is being raised when publishing metrics: *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57* File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics*16:48:57* min_value = min(min_values)*16:48:57* ValueError: min() arg is an empty sequence > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > {code:java} > // code 16:48:57 value = self._prepare_runtime_metrics(runtime_list) > 16:48:57 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics > 16:48:57 min_value = min(min_values) > 16:48:57 ValueError: min() arg is an empty sequence > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski updated BEAM-8114: --- Description: An exception is being raised when publishing metrics: 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence was: An exception is being raised when publishing metrics: ``` 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence ``` > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > 16:48:57 value = self._prepare_runtime_metrics(runtime_list) > 16:48:57 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics > 16:48:57 min_value = min(min_values) > 16:48:57 ValueError: min() arg is an empty sequence -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski updated BEAM-8114: --- Description: An exception is being raised when publishing metrics: ``` 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence ``` was: An exception is being raised when publishing metrics: *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57* File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics*16:48:57* min_value = min(min_values)*16:48:57* ValueError: min() arg is an empty sequence > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > ``` > 16:48:57 value = self._prepare_runtime_metrics(runtime_list) > 16:48:57 File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics > 16:48:57 min_value = min(min_values) > 16:48:57 ValueError: min() arg is an empty sequence > ``` -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8114) Chicago Taxi on Dataflow is failing
[ https://issues.apache.org/jira/browse/BEAM-8114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kamil Wasilewski updated BEAM-8114: --- Description: An exception is being raised when publishing metrics: *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57* File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics*16:48:57* min_value = min(min_values)*16:48:57* ValueError: min() arg is an empty sequence was: An exception is being raised when publishing metrics: 16:48:57 value = self._prepare_runtime_metrics(runtime_list) 16:48:57 File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics 16:48:57 min_value = min(min_values) 16:48:57 ValueError: min() arg is an empty sequence > Chicago Taxi on Dataflow is failing > --- > > Key: BEAM-8114 > URL: https://issues.apache.org/jira/browse/BEAM-8114 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Kamil Wasilewski >Priority: Major > > An exception is being raised when publishing metrics: > > *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57* > File > "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", > line 317, in _prepare_runtime_metrics*16:48:57* min_value = > min(min_values)*16:48:57* ValueError: min() arg is an empty sequence -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (BEAM-8114) Chicago Taxi on Dataflow is failing
Kamil Wasilewski created BEAM-8114: -- Summary: Chicago Taxi on Dataflow is failing Key: BEAM-8114 URL: https://issues.apache.org/jira/browse/BEAM-8114 Project: Beam Issue Type: Bug Components: testing Reporter: Kamil Wasilewski An exception is being raised when publishing metrics: *16:48:57* value = self._prepare_runtime_metrics(runtime_list)*16:48:57* File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Chicago_Taxi_Dataflow/src/sdks/python/apache_beam/testing/load_tests/load_test_metrics_utils.py", line 317, in _prepare_runtime_metrics*16:48:57* min_value = min(min_values)*16:48:57* ValueError: min() arg is an empty sequence -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303753 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 15:06 Start Date: 29/Aug/19 15:06 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add Chicago Taxi Example running on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526228045 Run Chicago Taxi on Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303753) Time Spent: 1.5h (was: 1h 20m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=303752=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303752 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 29/Aug/19 15:04 Start Date: 29/Aug/19 15:04 Worklog Time Spent: 10m Work Description: aaltay commented on pull request #9433: [BEAM-7389] Update to use util.ToString transform URL: https://github.com/apache/beam/pull/9433#discussion_r319121548 ## File path: sdks/python/apache_beam/examples/snippets/transforms/element_wise/to_string_test.py ## @@ -19,36 +19,81 @@ from __future__ import absolute_import from __future__ import print_function +import sys import unittest import mock -from apache_beam.examples.snippets.transforms.element_wise.to_string import * from apache_beam.testing.test_pipeline import TestPipeline from apache_beam.testing.util import assert_that from apache_beam.testing.util import equal_to +from . import to_string + + +def check_plants(actual): + # [START plants] + plants = [ + ',Strawberry', + '凌,Carrot', + ',Eggplant', + ',Tomato', + '凜,Potato', + ] + # [END plants] + assert_that(actual, equal_to(plants)) + + +def check_plant_lists(actual): + # [START plant_lists] + plant_lists = [ + "['', 'Strawberry', 'perennial']", + "['凌', 'Carrot', 'biennial']", + "['', 'Eggplant', 'perennial']", + "['', 'Tomato', 'annual']", + "['凜', 'Potato', 'perennial']", + ] + # [END plant_lists] + + # Some unicode characters become escaped with double backslashes. + import apache_beam as beam + + def normalize_escaping(elem): Review comment: Could you add a JIRA todo here so that we can remove it after python 2 deprecation? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303752) Time Spent: 53h (was: 52h 50m) > Colab examples for element-wise transforms (Python) > --- > > Key: BEAM-7389 > URL: https://issues.apache.org/jira/browse/BEAM-7389 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Rose Nguyen >Assignee: David Cavazos >Priority: Minor > Time Spent: 53h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7389) Colab examples for element-wise transforms (Python)
[ https://issues.apache.org/jira/browse/BEAM-7389?focusedWorklogId=303751=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303751 ] ASF GitHub Bot logged work on BEAM-7389: Author: ASF GitHub Bot Created on: 29/Aug/19 15:02 Start Date: 29/Aug/19 15:02 Worklog Time Spent: 10m Work Description: aaltay commented on issue #9262: [BEAM-7389] Add code examples for Regex page URL: https://github.com/apache/beam/pull/9262#issuecomment-526226609 I see an extra space in the example data: `', Strawberry, perennial',` do you know why? (between the strawberry icon and strawberry text.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303751) Time Spent: 52h 50m (was: 52h 40m) > Colab examples for element-wise transforms (Python) > --- > > Key: BEAM-7389 > URL: https://issues.apache.org/jira/browse/BEAM-7389 > Project: Beam > Issue Type: Improvement > Components: website >Reporter: Rose Nguyen >Assignee: David Cavazos >Priority: Minor > Time Spent: 52h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303741 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 14:44 Start Date: 29/Aug/19 14:44 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Add Chicago Taxi Example running on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526218614 Run Chicago Taxi on Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303741) Time Spent: 1h 20m (was: 1h 10m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303711=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303711 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 13:40 Start Date: 29/Aug/19 13:40 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526189897 Run Seed Job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303711) Time Spent: 1h 10m (was: 1h) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303712=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303712 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 29/Aug/19 13:40 Start Date: 29/Aug/19 13:40 Worklog Time Spent: 10m Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-526190065 Run Java TextIO Performance Test HDFS This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303712) Time Spent: 0.5h (was: 20m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303706=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303706 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 29/Aug/19 13:33 Start Date: 29/Aug/19 13:33 Worklog Time Spent: 10m Work Description: mxm commented on issue #9374: [BEAM-5428] Implement Runner support for cache tokens URL: https://github.com/apache/beam/pull/9374#issuecomment-526128312 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303706) Time Spent: 10h 40m (was: 10.5h) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 40m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303692=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303692 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 13:14 Start Date: 29/Aug/19 13:14 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-526179284 Still debugging this locally. The fix seems not to be doing what I was expecting. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303692) Time Spent: 40m (was: 0.5h) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
[ https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303668=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303668 ] ASF GitHub Bot logged work on BEAM-7829: Author: ASF GitHub Bot Created on: 29/Aug/19 12:45 Start Date: 29/Aug/19 12:45 Worklog Time Spent: 10m Work Description: kanterov commented on pull request #9247: [BEAM-7829] Add schema names when converting with AvroUtils.toAvroSchema URL: https://github.com/apache/beam/pull/9247 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303668) Time Spent: 2h 40m (was: 2.5h) > AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation > -- > > Key: BEAM-7829 > URL: https://issues.apache.org/jira/browse/BEAM-7829 > Project: Beam > Issue Type: Test > Components: io-java-avro, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ryan Skraba >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > While trying to use an Avro PCollection with the SQL transform I notice you > could not do correctly a bijective transform: PCollection -> > SQL -> PCollection -> ParDo -> PCollection I noticed that > some of the Avro metadata gets lost in particular the name of the Avro > Schema. This is important because Avro validates that the schema has a name > and if it does not it breaks with a ParseException. > {quote} > org.apache.avro.SchemaParseException: Illegal character in: EXPR$1 > at org.apache.avro.Schema.validateName (Schema.java:1151) > at org.apache.avro.Schema.access$200 (Schema.java:81) > at org.apache.avro.Schema$Field. (Schema.java:403) > at org.apache.avro.Schema$Field. (Schema.java:423) > at org.apache.avro.Schema$Field. (Schema.java:415){quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Closed] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
[ https://issues.apache.org/jira/browse/BEAM-7829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gleb Kanterov closed BEAM-7829. --- Resolution: Fixed > AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation > -- > > Key: BEAM-7829 > URL: https://issues.apache.org/jira/browse/BEAM-7829 > Project: Beam > Issue Type: Test > Components: io-java-avro, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ryan Skraba >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > While trying to use an Avro PCollection with the SQL transform I notice you > could not do correctly a bijective transform: PCollection -> > SQL -> PCollection -> ParDo -> PCollection I noticed that > some of the Avro metadata gets lost in particular the name of the Avro > Schema. This is important because Avro validates that the schema has a name > and if it does not it breaks with a ParseException. > {quote} > org.apache.avro.SchemaParseException: Illegal character in: EXPR$1 > at org.apache.avro.Schema.validateName (Schema.java:1151) > at org.apache.avro.Schema.access$200 (Schema.java:81) > at org.apache.avro.Schema$Field. (Schema.java:403) > at org.apache.avro.Schema$Field. (Schema.java:423) > at org.apache.avro.Schema$Field. (Schema.java:415){quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303651=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303651 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 29/Aug/19 12:25 Start Date: 29/Aug/19 12:25 Worklog Time Spent: 10m Work Description: lgajowy commented on issue #9450: [BEAM-8003] Remove Perfkit leftovers URL: https://github.com/apache/beam/pull/9450#issuecomment-526161563 Run seed job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303651) Time Spent: 20m (was: 10m) > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303614=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303614 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 11:47 Start Date: 29/Aug/19 11:47 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-526149697 @iemejia Could you take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303614) Time Spent: 40m (was: 0.5h) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303616=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303616 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 11:47 Start Date: 29/Aug/19 11:47 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-526149782 R: @iemejia Could you take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303616) Time Spent: 1h (was: 50m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303615=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303615 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 11:47 Start Date: 29/Aug/19 11:47 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-526149697 @iemejia Could you take a look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303615) Time Spent: 50m (was: 40m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work stopped] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8113 stopped by Jan Lukavský. -- > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7945) Allow runner to configure "semi_persist_dir" which is used in the SDK harness
[ https://issues.apache.org/jira/browse/BEAM-7945?focusedWorklogId=303612=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303612 ] ASF GitHub Bot logged work on BEAM-7945: Author: ASF GitHub Bot Created on: 29/Aug/19 11:45 Start Date: 29/Aug/19 11:45 Worklog Time Spent: 10m Work Description: sunjincheng121 commented on pull request #9452: [BEAM-7945] Allow runner to configure semi_persist_dir which is used … URL: https://github.com/apache/beam/pull/9452 Currently "semi_persist_dir" is not configurable. This may become a problem in certain scenarios. For example, the default value of "semi_persist_dir" is "/tmp" (https://github.com/apache/beam/blob/master/sdks/python/container/boot.go#L48) in Python SDK harness. When the environment type is "PROCESS", the disk of "/tmp" may be filled up and unexpected issues will occur in production environment. So, This pull request makes the semi_persist_dir configurable through adding a new PipelineOption(RemoteEnvironmentOptions).The Pipeline option will be passed to the `DefaultJobBundleFactory` and then be used in each EnvironmentFactory(docker, process, external and embedded). For details of the discussion can be found in [1]. [1] https://lists.apache.org/list.html?d...@beam.apache.org:lte=1M:%5BDISCUSS%5D%20Turn%20%60WindowedValue Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303609 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 11:42 Start Date: 29/Aug/19 11:42 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-526148312 R: @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303609) Time Spent: 20m (was: 10m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303610 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 11:42 Start Date: 29/Aug/19 11:42 Worklog Time Spent: 10m Work Description: je-ik commented on issue #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451#issuecomment-526148361 Run Flink ValidatesRunner This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303610) Time Spent: 0.5h (was: 20m) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?focusedWorklogId=303608=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303608 ] ASF GitHub Bot logged work on BEAM-8113: Author: ASF GitHub Bot Created on: 29/Aug/19 11:41 Start Date: 29/Aug/19 11:41 Worklog Time Spent: 10m Work Description: je-ik commented on pull request #9451: [BEAM-8113] Stage files from context classloader URL: https://github.com/apache/beam/pull/9451 Fixes [BEAM-8113]. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- |
[jira] [Work logged] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?focusedWorklogId=303605=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303605 ] ASF GitHub Bot logged work on BEAM-8003: Author: ASF GitHub Bot Created on: 29/Aug/19 11:31 Start Date: 29/Aug/19 11:31 Worklog Time Spent: 10m Work Description: lgajowy commented on pull request #9450: [BEAM-8003] Remove Perfkit from webside documentation URL: https://github.com/apache/beam/pull/9450 Removes all mentions of PKB in documentation that are out of date. Removes Gradle task that is obsolete since we resigned from using Perfkit. Removes Perfkit word from file-based tests description (makes no sense anymore). @markflyhigh could you take a look? Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build
[jira] [Work started] (BEAM-8003) Remove all mentions of PKB on Confluence / website docs
[ https://issues.apache.org/jira/browse/BEAM-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8003 started by Lukasz Gajowy. --- > Remove all mentions of PKB on Confluence / website docs > --- > > Key: BEAM-8003 > URL: https://issues.apache.org/jira/browse/BEAM-8003 > Project: Beam > Issue Type: Sub-task > Components: testing, website >Reporter: Lukasz Gajowy >Assignee: Lukasz Gajowy >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-7978) ArithmeticExceptions on getting backlog bytes
[ https://issues.apache.org/jira/browse/BEAM-7978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918516#comment-16918516 ] Alexey Romanenko commented on BEAM-7978: [~Juraszek] [~chamikara] Could you take a look on this PR: https://github.com/apache/beam/pull/9432? > ArithmeticExceptions on getting backlog bytes > -- > > Key: BEAM-7978 > URL: https://issues.apache.org/jira/browse/BEAM-7978 > Project: Beam > Issue Type: Bug > Components: io-java-kinesis >Affects Versions: 2.14.0 >Reporter: Mateusz >Assignee: Alexey Romanenko >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Hello, > Beam 2.14.0 > (and to be more precise > [commit|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ec]) > introduced a change in watermark calculation in Kinesis IO causing below > error: > {code:java} > exception: "java.lang.RuntimeException: Unknown kinesis failure, when trying > to reach kinesis > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:227) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:167) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.getBacklogBytes(SimplifiedKinesisClient.java:155) > at > org.apache.beam.sdk.io.kinesis.KinesisReader.getTotalBacklogBytes(KinesisReader.java:158) > at > org.apache.beam.runners.dataflow.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:433) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1289) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149) > at > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.ArithmeticException: Value cannot fit in an int: > 153748963401 > at org.joda.time.field.FieldUtils.safeToInt(FieldUtils.java:229) > at > org.joda.time.field.BaseDurationField.getDifference(BaseDurationField.java:141) > at > org.joda.time.base.BaseSingleFieldPeriod.between(BaseSingleFieldPeriod.java:72) > at org.joda.time.Minutes.minutesBetween(Minutes.java:101) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.lambda$getBacklogBytes$3(SimplifiedKinesisClient.java:169) > at > org.apache.beam.sdk.io.kinesis.SimplifiedKinesisClient.wrapExceptions(SimplifiedKinesisClient.java:210) > ... 10 more > {code} > We spotted this issue on Dataflow runner. It's problematic as inability to > get backlog bytes seems to result in constant recreation of KinesisReader. > The issue happens if the backlog bytes are retrieved before watermark value > is updated from initial default value. Easy way to reproduce it is to create > a pipeline with Kinesis source for a stream where no records are being put. > While debugging it locally, you can observe that the watermark is set to the > value on the past (like: "-290308-12-21T19:59:05.225Z"). After two minutes > (default watermark idle duration threshold is set to 2 minutes) , the > watermark is set to value of > [watermarkIdleThreshold|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkPolicyFactory.java#L110]), > so the next backlog bytes retrieval should be correct. However, as described > before, running the pipeline on Dataflow runner results in KinesisReader > being closed just after creation, so the watermark won't be fixed. > The reason of the issue is following: The introduced watermark policies are > relying on > [WatermarkParameters|https://github.com/apache/beam/blob/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad/sdks/java/io/kinesis/src/main/java/org/apache/beam/sdk/io/kinesis/WatermarkParameters.java] > which initialises currentWatermark and eventTime to > [BoundedWindow.TIMESTAMP_MIN_VALUE|https://github.com/apache/beam/commit/9fe0a0387ad1f894ef02acf32edbc9623a3b13ad#diff-b4964a457006b1555c7042c739b405ecR52]. > This result in watermark being set to new Instant(-9223372036854775L) at the > KinesisReader creation. Calculated [period between the watermark and the > current >
[jira] [Updated] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Lukavský updated BEAM-8113: --- Status: Open (was: Triage Needed) > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Assigned] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jan Lukavský reassigned BEAM-8113: -- Assignee: Jan Lukavský > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work started] (BEAM-8113) FlinkRunner: Stage files from context classloader
[ https://issues.apache.org/jira/browse/BEAM-8113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8113 started by Jan Lukavský. -- > FlinkRunner: Stage files from context classloader > - > > Key: BEAM-8113 > URL: https://issues.apache.org/jira/browse/BEAM-8113 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Jan Lukavský >Assignee: Jan Lukavský >Priority: Major > > Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged > by default. Add also files from > {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303580 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 29/Aug/19 10:35 Start Date: 29/Aug/19 10:35 Worklog Time Spent: 10m Work Description: mxm commented on issue #9374: [BEAM-5428] Implement Runner support for cache tokens URL: https://github.com/apache/beam/pull/9374#issuecomment-526128312 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303580) Time Spent: 10.5h (was: 10h 20m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10.5h > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303578=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303578 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 29/Aug/19 10:31 Start Date: 29/Aug/19 10:31 Worklog Time Spent: 10m Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#issuecomment-526127127 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303578) Time Spent: 10h 10m (was: 10h) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 10m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-5428) Implement cross-bundle state caching.
[ https://issues.apache.org/jira/browse/BEAM-5428?focusedWorklogId=303579=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303579 ] ASF GitHub Bot logged work on BEAM-5428: Author: ASF GitHub Bot Created on: 29/Aug/19 10:31 Start Date: 29/Aug/19 10:31 Worklog Time Spent: 10m Work Description: mxm commented on issue #9440: [BEAM-5428] Modify cache token Proto design to only include tokens in ProcessBundleRequest URL: https://github.com/apache/beam/pull/9440#issuecomment-526127127 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303579) Time Spent: 10h 20m (was: 10h 10m) > Implement cross-bundle state caching. > - > > Key: BEAM-5428 > URL: https://issues.apache.org/jira/browse/BEAM-5428 > Project: Beam > Issue Type: Improvement > Components: sdk-py-harness >Reporter: Robert Bradshaw >Assignee: Rakesh Kumar >Priority: Major > Time Spent: 10h 20m > Remaining Estimate: 0h > > Tech spec: > [https://docs.google.com/document/d/1BOozW0bzBuz4oHJEuZNDOHdzaV5Y56ix58Ozrqm2jFg/edit#heading=h.7ghoih5aig5m] > Relevant document: > [https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit#|https://docs.google.com/document/d/1ltVqIW0XxUXI6grp17TgeyIybk3-nDF8a0-Nqw-s9mY/edit] > Mailing list link: > [https://lists.apache.org/thread.html/caa8d9bc6ca871d13de2c5e6ba07fdc76f85d26497d95d90893aa1f6@%3Cdev.beam.apache.org%3E] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303569=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303569 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 10:18 Start Date: 29/Aug/19 10:18 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-526122565 Run Python Load Tests ParDo Flink Batch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303569) Time Spent: 0.5h (was: 20m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303555=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303555 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 10:07 Start Date: 29/Aug/19 10:07 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449#issuecomment-526118890 Run Seed Job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303555) Time Spent: 20m (was: 10m) > Create ParDo Python Load Test Jenkins Job [Flink] > - > > Key: BEAM-7660 > URL: https://issues.apache.org/jira/browse/BEAM-7660 > Project: Beam > Issue Type: Sub-task > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7660) Create ParDo Python Load Test Jenkins Job [Flink]
[ https://issues.apache.org/jira/browse/BEAM-7660?focusedWorklogId=303554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303554 ] ASF GitHub Bot logged work on BEAM-7660: Author: ASF GitHub Bot Created on: 29/Aug/19 10:06 Start Date: 29/Aug/19 10:06 Worklog Time Spent: 10m Work Description: kamilwu commented on pull request #9449: [BEAM-7660] Create Python ParDo load test job on Flink URL: https://github.com/apache/beam/pull/9449 Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build
[jira] [Created] (BEAM-8113) FlinkRunner: Stage files from context classloader
Jan Lukavský created BEAM-8113: -- Summary: FlinkRunner: Stage files from context classloader Key: BEAM-8113 URL: https://issues.apache.org/jira/browse/BEAM-8113 Project: Beam Issue Type: Improvement Components: runner-flink Reporter: Jan Lukavský Currently, only files from {{FlinkRunner.class.getClassLoader()}} are staged by default. Add also files from {{Thread.currentThread().getContextClassLoader()}}. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work started] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on BEAM-8108 started by Kamil Wasilewski. -- > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303495=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303495 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 09:41 Start Date: 29/Aug/19 09:41 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526109750 Run Chicago Taxi on Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303495) Time Spent: 1h (was: 50m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19
[ https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303472 ] ASF GitHub Bot logged work on BEAM-8107: Author: ASF GitHub Bot Created on: 29/Aug/19 09:10 Start Date: 29/Aug/19 09:10 Worklog Time Spent: 10m Work Description: mibo commented on issue #9439: [BEAM-8107] Update commons-compress to version 1.19 URL: https://github.com/apache/beam/pull/9439#issuecomment-526098353 @iemejia Thanks a lot for the info This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303472) Time Spent: 1h 40m (was: 1.5h) > Update commons-compress to version 1.19 > --- > > Key: BEAM-8107 > URL: https://issues.apache.org/jira/browse/BEAM-8107 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > Beam's current version of commons-compress (1.18) is subject to a DoS > vulnerability so we need to upgrade it [CVE-2019-12402] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19
[ https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303473=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303473 ] ASF GitHub Bot logged work on BEAM-8107: Author: ASF GitHub Bot Created on: 29/Aug/19 09:10 Start Date: 29/Aug/19 09:10 Worklog Time Spent: 10m Work Description: mibo commented on issue #9439: [BEAM-8107] Update commons-compress to version 1.19 URL: https://github.com/apache/beam/pull/9439#issuecomment-526098353 @iemejia Thanks a lot for the info This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303473) Time Spent: 1h 50m (was: 1h 40m) > Update commons-compress to version 1.19 > --- > > Key: BEAM-8107 > URL: https://issues.apache.org/jira/browse/BEAM-8107 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Beam's current version of commons-compress (1.18) is subject to a DoS > vulnerability so we need to upgrade it [CVE-2019-12402] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19
[ https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303471=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303471 ] ASF GitHub Bot logged work on BEAM-8107: Author: ASF GitHub Bot Created on: 29/Aug/19 09:05 Start Date: 29/Aug/19 09:05 Worklog Time Spent: 10m Work Description: iemejia commented on issue #9439: [BEAM-8107] Update commons-compress to version 1.19 URL: https://github.com/apache/beam/pull/9439#issuecomment-526096729 Next release branch will be cut on sept. 11. Usually it takes around two weeks after that for the new artifacts to be available. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303471) Time Spent: 1.5h (was: 1h 20m) > Update commons-compress to version 1.19 > --- > > Key: BEAM-8107 > URL: https://issues.apache.org/jira/browse/BEAM-8107 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > Beam's current version of commons-compress (1.18) is subject to a DoS > vulnerability so we need to upgrade it [CVE-2019-12402] -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
[ https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303466=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303466 ] ASF GitHub Bot logged work on BEAM-7829: Author: ASF GitHub Bot Created on: 29/Aug/19 08:54 Start Date: 29/Aug/19 08:54 Worklog Time Spent: 10m Work Description: kanterov commented on issue #9247: [BEAM-7829] Add schema names when converting with AvroUtils.toAvroSchema URL: https://github.com/apache/beam/pull/9247#issuecomment-526092234 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303466) Time Spent: 2.5h (was: 2h 20m) > AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation > -- > > Key: BEAM-7829 > URL: https://issues.apache.org/jira/browse/BEAM-7829 > Project: Beam > Issue Type: Test > Components: io-java-avro, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ryan Skraba >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > While trying to use an Avro PCollection with the SQL transform I notice you > could not do correctly a bijective transform: PCollection -> > SQL -> PCollection -> ParDo -> PCollection I noticed that > some of the Avro metadata gets lost in particular the name of the Avro > Schema. This is important because Avro validates that the schema has a name > and if it does not it breaks with a ParseException. > {quote} > org.apache.avro.SchemaParseException: Illegal character in: EXPR$1 > at org.apache.avro.Schema.validateName (Schema.java:1151) > at org.apache.avro.Schema.access$200 (Schema.java:81) > at org.apache.avro.Schema$Field. (Schema.java:403) > at org.apache.avro.Schema$Field. (Schema.java:423) > at org.apache.avro.Schema$Field. (Schema.java:415){quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-8109) Encoder exception for structure contains Iterable of KV
[ https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918402#comment-16918402 ] Ismaël Mejía commented on BEAM-8109: Just out of curiosity, does this work on Direct runner? > Encoder exception for structure contains Iterable of KV > --- > > Key: BEAM-8109 > URL: https://issues.apache.org/jira/browse/BEAM-8109 > Project: Beam > Issue Type: Bug > Components: extensions-java-sorter >Affects Versions: 2.14.0, 2.15.0 >Reporter: Brachi Packter >Priority: Critical > > When doing group by and then sort, the sort should get this structure: > PCollection>>> > However, for any SecondaryKeyT that I put I get coder exception:EOF, this > happens for long and for string coders. > Happens in DataFlow runner. > > Here is the full exception: > {code} > java.lang.IllegalStateException: Unable to decode tag list using > WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542) > > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503) > > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536) > > org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60) > > org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307) > > org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231) > > org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36) > > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > java.lang.Thread.run(Thread.java:745) Caused by: > org.apache.beam.sdk.coders.CoderException: java.io.EOFException > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124) > > org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60) > org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592) > > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) > >
[jira] [Updated] (BEAM-8109) Encoder exception for structure contains Iterable of KV
[ https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8109: --- Status: Open (was: Triage Needed) > Encoder exception for structure contains Iterable of KV > --- > > Key: BEAM-8109 > URL: https://issues.apache.org/jira/browse/BEAM-8109 > Project: Beam > Issue Type: Bug > Components: extensions-java-sorter >Affects Versions: 2.14.0, 2.15.0 >Reporter: Brachi Packter >Priority: Critical > > When doing group by and then sort, the sort should get this structure: > PCollection>>> > However, for any SecondaryKeyT that I put I get coder exception:EOF, this > happens for long and for string coders. > Happens in DataFlow runner. > > Here is the full exception: > {code} > java.lang.IllegalStateException: Unable to decode tag list using > WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542) > > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503) > > org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536) > > org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60) > > org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307) > > org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231) > > org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36) > > org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149) > > org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028) > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > java.lang.Thread.run(Thread.java:745) Caused by: > org.apache.beam.sdk.coders.CoderException: java.io.EOFException > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) > org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124) > > org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60) > org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) > org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592) > > org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) > > org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) > > org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542) > >
[jira] [Updated] (BEAM-8104) Remove windowStrategy deserialization exception handling
[ https://issues.apache.org/jira/browse/BEAM-8104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8104: --- Status: Open (was: Triage Needed) > Remove windowStrategy deserialization exception handling > > > Key: BEAM-8104 > URL: https://issues.apache.org/jira/browse/BEAM-8104 > Project: Beam > Issue Type: Bug > Components: runner-dataflow >Reporter: Ankur Goenka >Priority: Major > > Remove the exception handling in JRH at > [https://github.com/apache/beam/blob/98e553b371966127a988bab69290b5a8c44d5dad/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/GroupAlsoByWindowParDoFnFactory.java#L106] > And subsequently fix Reshuffle transform to use _IdentityWindowFn in > [https://github.com/apache/beam/blob/98e553b371966127a988bab69290b5a8c44d5dad/sdks/python/apache_beam/transforms/util.py#L626] > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8109) Encoder exception for structure contains Iterable of KV
[ https://issues.apache.org/jira/browse/BEAM-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8109: --- Description: When doing group by and then sort, the sort should get this structure: PCollection>>> However, for any SecondaryKeyT that I put I get coder exception:EOF, this happens for long and for string coders. Happens in DataFlow runner. Here is the full exception: {code} java.lang.IllegalStateException: Unable to decode tag list using WindowedValue$FullWindowedValueCoder(KvCoder(StringUtf8Coder,IterableCoder(KvCoder(StringUtf8Coder,com.moonactive.data.processor.beam.coders.JsonNodeCoder@20df8330))),IntervalWindow$IntervalWindowCoder) org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:575) org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542) org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503) org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536) org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60) org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307) org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231) org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36) org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77) org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1295) org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149) org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1028) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) java.lang.Thread.run(Thread.java:745) Caused by: org.apache.beam.sdk.coders.CoderException: java.io.EOFException org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:104) org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:90) org.apache.beam.sdk.coders.StringUtf8Coder.decode(StringUtf8Coder.java:37) org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:81) org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:76) org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:124) org.apache.beam.sdk.coders.IterableLikeCoder.decode(IterableLikeCoder.java:60) org.apache.beam.sdk.coders.Coder.decode(Coder.java:159) org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:82) org.apache.beam.sdk.coders.KvCoder.decode(KvCoder.java:36) org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:592) org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:529) org.apache.beam.runners.dataflow.worker.WindmillStateReader.bagPageValues(WindmillStateReader.java:573) org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeBag(WindmillStateReader.java:597) org.apache.beam.runners.dataflow.worker.WindmillStateReader.consumeResponse(WindmillStateReader.java:504) org.apache.beam.runners.dataflow.worker.WindmillStateReader.startBatchAndBlock(WindmillStateReader.java:420) org.apache.beam.runners.dataflow.worker.WindmillStateReader$WrappedFuture.get(WindmillStateReader.java:313) org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.Futures$2.get(Futures.java:542) org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.fetchData(WindmillStateInternals.java:503) org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillBag.read(WindmillStateInternals.java:536) org.apache.beam.runners.dataflow.worker.StreamingSideInputDoFnRunner.startBundle(StreamingSideInputDoFnRunner.java:60) org.apache.beam.runners.dataflow.worker.SimpleParDoFn.reallyStartBundle(SimpleParDoFn.java:307) org.apache.beam.runners.dataflow.worker.SimpleParDoFn.startBundle(SimpleParDoFn.java:231) org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.start(ParDoOperation.java:36)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303459=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303459 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 08:28 Start Date: 29/Aug/19 08:28 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526082164 Google Cloud Flink Runner Chicago Taxi Example This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303459) Time Spent: 50m (was: 40m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
[ https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918399#comment-16918399 ] Ismaël Mejía commented on BEAM-8110: This looks like an issue in the Arrow side, maybe worth to report there. At least it has the workaround of installing the dep. Anything extra to say maybe [~bhulette] > Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3 > -- > > Key: BEAM-8110 > URL: https://issues.apache.org/jira/browse/BEAM-8110 > Project: Beam > Issue Type: Bug > Components: dependencies, sdk-py-core >Affects Versions: 2.15.0 > Environment: Fedora 30, Singularity container >Reporter: Robert Lugg >Priority: Minor > > {{When importing apache_beam in python, it fails because it can't find > libnsl.so.1.}} > It is fixed by running 'dnf install libnsl'. This appears to be a dependency > of Apache Arrow. > > {{output:}} > {{Singularity tfx:~/tfx_image_example_gen> python}} > {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }} > {{[GCC 7.3.0] on linux}} > {{Type "help", "copyright", "credits" or "license" for more information.}} > {{>>> import apache_beam}} > {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: > UserWarning: Some syntactic constructs of Python 3 are not yet fully > supported by Apache Beam.}} > {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}} > {{Traceback (most recent call last):}} > {{ File "", line 1, in }} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", > line 98, in }} > {{ from apache_beam import io}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", > line 29, in }} > {{ from apache_beam.io.parquetio import *}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", > line 45, in }} > {{ import pyarrow as pa}} > {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", > line 49, in }} > {{ from pyarrow.lib import cpu_count, set_cpu_count}} > {{ImportError: libnsl.so.1: cannot open shared object file: No such file or > directory}} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303458=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303458 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 08:28 Start Date: 29/Aug/19 08:28 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526082717 Run Chicago Taxi on Flink This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303458) Time Spent: 40m (was: 0.5h) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303455=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303455 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 08:26 Start Date: 29/Aug/19 08:26 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526082164 Google Cloud Flink Runner Chicago Taxi Example This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303455) Time Spent: 0.5h (was: 20m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
[ https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8110: --- Component/s: sdk-py-core > Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3 > -- > > Key: BEAM-8110 > URL: https://issues.apache.org/jira/browse/BEAM-8110 > Project: Beam > Issue Type: Bug > Components: dependencies, sdk-py-core >Affects Versions: 2.15.0 > Environment: Fedora 30, Singularity container >Reporter: Robert Lugg >Priority: Minor > > {{When importing apache_beam in python, it fails because it can't find > libnsl.so.1.}} > It is fixed by running 'dnf install libnsl'. This appears to be a dependency > of Apache Arrow. > > {{output:}} > {{Singularity tfx:~/tfx_image_example_gen> python}} > {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }} > {{[GCC 7.3.0] on linux}} > {{Type "help", "copyright", "credits" or "license" for more information.}} > {{>>> import apache_beam}} > {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: > UserWarning: Some syntactic constructs of Python 3 are not yet fully > supported by Apache Beam.}} > {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}} > {{Traceback (most recent call last):}} > {{ File "", line 1, in }} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", > line 98, in }} > {{ from apache_beam import io}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", > line 29, in }} > {{ from apache_beam.io.parquetio import *}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", > line 45, in }} > {{ import pyarrow as pa}} > {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", > line 49, in }} > {{ from pyarrow.lib import cpu_count, set_cpu_count}} > {{ImportError: libnsl.so.1: cannot open shared object file: No such file or > directory}} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8110) Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3
[ https://issues.apache.org/jira/browse/BEAM-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8110: --- Status: Open (was: Triage Needed) > Fedora 30: ImportError: libnsl.so.1 when importing apache_beam in Python 3 > -- > > Key: BEAM-8110 > URL: https://issues.apache.org/jira/browse/BEAM-8110 > Project: Beam > Issue Type: Bug > Components: dependencies >Affects Versions: 2.15.0 > Environment: Fedora 30, Singularity container >Reporter: Robert Lugg >Priority: Minor > > {{When importing apache_beam in python, it fails because it can't find > libnsl.so.1.}} > It is fixed by running 'dnf install libnsl'. This appears to be a dependency > of Apache Arrow. > > {{output:}} > {{Singularity tfx:~/tfx_image_example_gen> python}} > {{Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) }} > {{[GCC 7.3.0] on linux}} > {{Type "help", "copyright", "credits" or "license" for more information.}} > {{>>> import apache_beam}} > {{/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py:84: > UserWarning: Some syntactic constructs of Python 3 are not yet fully > supported by Apache Beam.}} > {{ 'Some syntactic constructs of Python 3 are not yet fully supported by '}} > {{Traceback (most recent call last):}} > {{ File "", line 1, in }} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/__init__.py", > line 98, in }} > {{ from apache_beam import io}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/__init__.py", > line 29, in }} > {{ from apache_beam.io.parquetio import *}} > {{ File > "/opt/snps/envs/tfx/lib/python3.6/site-packages/apache_beam/io/parquetio.py", > line 45, in }} > {{ import pyarrow as pa}} > {{ File "/opt/snps/envs/tfx/lib/python3.6/site-packages/pyarrow/__init__.py", > line 49, in }} > {{ from pyarrow.lib import cpu_count, set_cpu_count}} > {{ImportError: libnsl.so.1: cannot open shared object file: No such file or > directory}} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8111) SchemaCoder broken on DataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8111: --- Status: Open (was: Triage Needed) > SchemaCoder broken on DataflowRunner > > > Key: BEAM-8111 > URL: https://issues.apache.org/jira/browse/BEAM-8111 > Project: Beam > Issue Type: Bug > Components: runner-dataflow, sdk-java-core >Affects Versions: 2.15.0 >Reporter: Brian Hulette >Assignee: Brian Hulette >Priority: Blocker > Time Spent: 0.5h > Remaining Estimate: 0h > > https://github.com/apache/beam/commit/e65c176a9f34e45d408281e1101a2ae54cef0f6c > broke SchemaCoder on Dataflow. When translating a schema that uses logical > types from a cloud object dataflow encounters a runtime error. > This means any pipelines that use SqlTransform or schema transforms will fail > on Dataflow in 2.15.0 -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8112) Support passing stateBackend through pipeline options in python sdks
[ https://issues.apache.org/jira/browse/BEAM-8112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8112: --- Status: Open (was: Triage Needed) > Support passing stateBackend through pipeline options in python sdks > > > Key: BEAM-8112 > URL: https://issues.apache.org/jira/browse/BEAM-8112 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Catlyn Kong >Priority: Major > > Currently the only way for python sdks to instruct flink to use a > StateBackend different than the default (MemoryStateBackend) would be to > specify state.backend in flink-conf.yaml, which creates the limitation of > using the same statebackend for every job running on the same flink cluster. > Ideally we should be able to pass it in to flink runner through > PipelineOptions. Here's the error it spits out when I flag > --state_backend=RocksDBStateBackend: > > {code:java} > RuntimeError: Pipeline failed in state FAILED: > com.fasterxml.jackson.databind.exc.InvalidDefinitionException: Cannot > construct instance of `org.apache.flink.runtime.state.StateBackend` (no > Creators, like default construct, exist): abstract types either need to be > mapped to concrete types, have custom deserializer, or contain additional > type information > at [Source: (String)""RocksDBStateBackend""; line: 1, column: 1] > {code} > Acceptance Criteria: > Flink StateBackend is configurable via command line options from python. > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ismaël Mejía updated BEAM-8108: --- Status: Open (was: Triage Needed) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8108) Run Chicago Taxi Example on Flink
[ https://issues.apache.org/jira/browse/BEAM-8108?focusedWorklogId=303449=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303449 ] ASF GitHub Bot logged work on BEAM-8108: Author: ASF GitHub Bot Created on: 29/Aug/19 08:06 Start Date: 29/Aug/19 08:06 Worklog Time Spent: 10m Work Description: kamilwu commented on issue #9448: [BEAM-8108] Run Chicago Taxi Example on Flink URL: https://github.com/apache/beam/pull/9448#issuecomment-526074915 Run Seed Job This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303449) Time Spent: 20m (was: 10m) > Run Chicago Taxi Example on Flink > - > > Key: BEAM-8108 > URL: https://issues.apache.org/jira/browse/BEAM-8108 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Kamil Wasilewski >Assignee: Kamil Wasilewski >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection
[ https://issues.apache.org/jira/browse/BEAM-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-7972: --- Fix Version/s: 2.16.0 > Portable Python Reshuffle does not work with windowed pcollection > - > > Key: BEAM-7972 > URL: https://issues.apache.org/jira/browse/BEAM-7972 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Fix For: 2.16.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Streaming pipeline gets stuck when using Reshuffle with windowed pcollection. > The issue happen because of window function gets deserialized on java side > which is not possible and hence default to global window function and result > into window function mismatch later down the code. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (BEAM-7972) Portable Python Reshuffle does not work with windowed pcollection
[ https://issues.apache.org/jira/browse/BEAM-7972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ankur Goenka updated BEAM-7972: --- Priority: Blocker (was: Major) > Portable Python Reshuffle does not work with windowed pcollection > - > > Key: BEAM-7972 > URL: https://issues.apache.org/jira/browse/BEAM-7972 > Project: Beam > Issue Type: Bug > Components: sdk-py-core >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Blocker > Fix For: 2.16.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Streaming pipeline gets stuck when using Reshuffle with windowed pcollection. > The issue happen because of window function gets deserialized on java side > which is not possible and hence default to global window function and result > into window function mismatch later down the code. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-7829) AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation
[ https://issues.apache.org/jira/browse/BEAM-7829?focusedWorklogId=303418=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303418 ] ASF GitHub Bot logged work on BEAM-7829: Author: ASF GitHub Bot Created on: 29/Aug/19 07:20 Start Date: 29/Aug/19 07:20 Worklog Time Spent: 10m Work Description: RyanSkraba commented on issue #9247: [BEAM-7829] Add schema names when converting with AvroUtils.toAvroSchema URL: https://github.com/apache/beam/pull/9247#issuecomment-526059811 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303418) Time Spent: 2h 20m (was: 2h 10m) > AvroUtils.toAvroSchema should put a Schema name to pass Avro Schema validation > -- > > Key: BEAM-7829 > URL: https://issues.apache.org/jira/browse/BEAM-7829 > Project: Beam > Issue Type: Test > Components: io-java-avro, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ryan Skraba >Priority: Minor > Fix For: 2.16.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > While trying to use an Avro PCollection with the SQL transform I notice you > could not do correctly a bijective transform: PCollection -> > SQL -> PCollection -> ParDo -> PCollection I noticed that > some of the Avro metadata gets lost in particular the name of the Avro > Schema. This is important because Avro validates that the schema has a name > and if it does not it breaks with a ParseException. > {quote} > org.apache.avro.SchemaParseException: Illegal character in: EXPR$1 > at org.apache.avro.Schema.validateName (Schema.java:1151) > at org.apache.avro.Schema.access$200 (Schema.java:81) > at org.apache.avro.Schema$Field. (Schema.java:403) > at org.apache.avro.Schema$Field. (Schema.java:423) > at org.apache.avro.Schema$Field. (Schema.java:415){quote} -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (BEAM-8107) Update commons-compress to version 1.19
[ https://issues.apache.org/jira/browse/BEAM-8107?focusedWorklogId=303402=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-303402 ] ASF GitHub Bot logged work on BEAM-8107: Author: ASF GitHub Bot Created on: 29/Aug/19 06:58 Start Date: 29/Aug/19 06:58 Worklog Time Spent: 10m Work Description: mibo commented on issue #9439: [BEAM-8107] Update commons-compress to version 1.19 URL: https://github.com/apache/beam/pull/9439#issuecomment-526052731 @pabloem do you already have a date for the next release (which includes this PR)? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 303402) Time Spent: 1h 20m (was: 1h 10m) > Update commons-compress to version 1.19 > --- > > Key: BEAM-8107 > URL: https://issues.apache.org/jira/browse/BEAM-8107 > Project: Beam > Issue Type: Improvement > Components: build-system, sdk-java-core >Reporter: Ismaël Mejía >Assignee: Ismaël Mejía >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > Beam's current version of commons-compress (1.18) is subject to a DoS > vulnerability so we need to upgrade it [CVE-2019-12402] -- This message was sent by Atlassian Jira (v8.3.2#803003)