[ https://issues.apache.org/jira/browse/BEAM-12898?focusedWorklogId=746232&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-746232 ]
ASF GitHub Bot logged work on BEAM-12898:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Mar/22 22:57
Start Date: 22/Mar/22 22:57
Worklog Time Spent: 10m
Work Description: ibzib commented on a change in pull request #17046:
URL: https://github.com/apache/beam/pull/17046#discussion_r832690786
##########
File path: .test-infra/dataproc/flink_cluster.sh
##########
@@ -133,7 +134,8 @@ function create_cluster() {
# Docker init action restarts yarn so we need to start yarn session after this restart happens.
# This is why flink init action is invoked last.
- gcloud dataproc clusters create $CLUSTER_NAME --region=global --num-workers=$num_dataproc_workers --initialization-actions $DOCKER_INIT,$BEAM_INIT,$FLINK_INIT --metadata "${metadata}", --image-version=$image_version --zone=$GCLOUD_ZONE --quiet
+ #--initialization-actions $DOCKER_INIT,$BEAM_INIT,$FLINK_INIT Older initialization
Review comment:
I don't think we need any of these comments anymore.
##########
File path: .test-infra/jenkins/job_LoadTests_Combine_Flink_Go.groovy
##########
@@ -133,5 +133,5 @@
CronJobBuilder.cronJob('beam_LoadTests_Go_Combine_Flink_Batch', 'H 8 * * *', thi
influx_hostname: InfluxDBCredentialsHelper.InfluxDBHostUrl,
]
// TODO(BEAM-12898): Re-enable this test once fixed.
- // loadTestJob(delegate, CommonTestProperties.TriggeringContext.POST_COMMIT, 'batch')
+ //loadTestJob(delegate, CommonTestProperties.TriggeringContext.POST_COMMIT, 'batch')
Review comment:
We can re-enable these. It's fine if they're failing as long as we're
not leaking VMs (which I believe was fixed in #15547).
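For illustration only, here is a minimal sketch of how a failing test can still tear down its cluster, using a shell `EXIT` trap. This is not the fix from #15547; the functions `create_cluster`, `run_load_test` and `delete_cluster` below are hypothetical stand-ins that only print messages, where a real script would call gcloud.

```shell
#!/usr/bin/env bash
# Sketch: guarantee teardown even when the load test fails.
# All three helpers are hypothetical placeholders, not Beam's real functions.

create_cluster() { echo "creating cluster $1"; }
delete_cluster() { echo "deleting cluster $1"; }
run_load_test()  { echo "running load test"; return 1; }  # simulate a failing test

run_with_cleanup() {
  local cluster_name="$1"
  (
    # The EXIT trap fires whenever this subshell exits, on success or failure.
    trap 'delete_cluster "$cluster_name"' EXIT
    create_cluster "$cluster_name"
    run_load_test || exit 1   # fail fast; the trap still runs
    echo "load test passed"
  )
}
```

Calling `run_with_cleanup demo-cluster` returns a non-zero status because the simulated test fails, but the delete step still runs. The trap does not cover a process killed with SIGKILL or a Jenkins agent that disappears, so a periodic sweep for stale clusters would still be useful.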
##########
File path: .test-infra/jenkins/CommonTestProperties.groovy
##########
@@ -26,7 +26,7 @@ class CommonTestProperties {
}
static String getFlinkVersion() {
- return "1.13"
+ return "1.12" //Flink Version in dataproc 2.0
Review comment:
I don't think this sets the Flink version used in the load tests. I
think that's set in each load test config, e.g.
https://github.com/apache/beam/blob/d3f320426a115b8c986a817fe2ba87f9fd7f2192/.test-infra/jenkins/job_LoadTests_GBK_Flink_Go.groovy#L201
Looks like it's already set to 1.12, so you can probably revert this change.
Issue Time Tracking
-------------------
Worklog Id: (was: 746232)
Time Spent: 9h 20m (was: 9h 10m)
> Flink Load Tests failure- UncheckedExecutionException - leaking vms
> -------------------------------------------------------------------
>
> Key: BEAM-12898
> URL: https://issues.apache.org/jira/browse/BEAM-12898
> Project: Beam
> Issue Type: Test
> Components: test-failures
> Reporter: Alex Amato
> Assignee: Andoni Guzman
> Priority: P2
> Attachments: 6L8weM2p7mDLMJV.png, BmJoKx8T8pZT2Ls.png
>
> Time Spent: 9h 20m
> Remaining Estimate: 0h
>
> Same failure from different tests:
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_CoGBK_Flink_Batch/277/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_Combine_Flink_Batch/289/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_GBK_Flink_Batch/290/console]
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_ParDo_Flink_Batch/295/console]
> I think these tests may also be responsible for leaking some GCE VMs on
> apache-beam-testing: this morning we discovered several VMs that were not
> torn down.
> The leaked VMs have names matching these patterns:
> beam-loadtests-python*flink*
> beam-loadtests-go*flink*
> e.g.
> beam-loadtests-go-cogbk-flink-batch-277-m
> beam-loadtests-go-gbk-flink-batch-290-w-2
> beam-loadtests-go-pardo-flink-batch-295-m
> beam-loadtests-go-sideinput-flink-batch-269-w-2
> beam-loadtests-python-combine-flink-batch-766-m
> beam-loadtests-python-combine-flink-streaming-368-w-0
> beam-loadtests-python-pardo-flink-batch-716-m
>
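As a rough sketch, the two name patterns quoted above can be applied to a list of VM names to find cleanup candidates. The function names `is_loadtest_flink_vm` and `filter_leaked_candidates` are hypothetical, and the sketch only matches names: it does not check whether a VM belongs to a build that is still running.

```shell
#!/usr/bin/env bash
# Sketch: pick out VM names that match the load-test patterns from the report.

# Succeeds if the given name matches one of the two reported patterns.
is_loadtest_flink_vm() {
  case "$1" in
    beam-loadtests-python*flink*|beam-loadtests-go*flink*) return 0 ;;
    *) return 1 ;;
  esac
}

# Reads VM names from stdin, one per line, and prints only the matching ones.
filter_leaked_candidates() {
  local name
  while IFS= read -r name; do
    if is_loadtest_flink_vm "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

The input could come from something like `gcloud compute instances list --format='value(name)'`, assuming gcloud is already configured for the affected project.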
> It seems like these tests are spinning up a Dataproc cluster. The GCE metadata
> on the VMs contains many Dataproc-related entries (attached). Likely the tests
> are crashing before they run their code to clean up and shut down the Dataproc
> cluster.
> Logs
> ----
> [https://ci-beam.apache.org/job/beam_LoadTests_Go_Combine_Flink_Batch/lastBuild/console]
> 01:43:59 2021/09/14 08:43:59 (): org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59 2021/09/14 08:43:59 (): org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59     at org.apache.beam.runners.fnexecution.wire.WireCoders.instantiateRunnerWireCoder(WireCoders.java:94)
> 01:43:59     at org.apache.beam.runners.fnexecution.wire.WireCoders.instantiateRunnerWireCoder(WireCoders.java:75)
> 01:43:59     at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translateExecutableStage(FlinkBatchPortablePipelineTranslator.java:311)
> 01:43:59     at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translate(FlinkBatchPortablePipelineTranslator.java:272)
> 01:43:59     at org.apache.beam.runners.flink.FlinkBatchPortablePipelineTranslator.translate(FlinkBatchPortablePipelineTranslator.java:118)
> 01:43:59     at org.apache.beam.runners.flink.FlinkPipelineRunner.runPipelineWithTranslator(FlinkPipelineRunner.java:115)
> 01:43:59     at org.apache.beam.runners.flink.FlinkPipelineRunner.run(FlinkPipelineRunner.java:85)
> 01:43:59     at org.apache.beam.runners.jobsubmission.JobInvocation.runPipeline(JobInvocation.java:86)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
> 01:43:59     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 01:43:59     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> 01:43:59     at java.lang.Thread.run(Thread.java:748)
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59     ... 18 more
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59     ... 30 more
> 01:43:59 Caused by: org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2050)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.get(LocalCache.java:3952)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3974)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4958)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents.getCoder(RehydratedComponents.java:168)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:158)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59     ... 42 more
> 01:43:59 Caused by: java.lang.IllegalArgumentException: Encountered unsupported logical type URN: int
> 01:43:59     at org.apache.beam.sdk.schemas.SchemaTranslation.fieldTypeFromProtoWithoutNullable(SchemaTranslation.java:328)
> 01:43:59     at org.apache.beam.sdk.schemas.SchemaTranslation.fieldTypeFromProto(SchemaTranslation.java:244)
> 01:43:59     at org.apache.beam.sdk.schemas.SchemaTranslation.fieldFromProto(SchemaTranslation.java:238)
> 01:43:59     at org.apache.beam.sdk.schemas.SchemaTranslation.schemaFromProto(SchemaTranslation.java:212)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslators$8.fromComponents(CoderTranslators.java:169)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslators$8.fromComponents(CoderTranslators.java:151)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromKnownCoder(CoderTranslation.java:170)
> 01:43:59     at org.apache.beam.runners.core.construction.CoderTranslation.fromProto(CoderTranslation.java:145)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:87)
> 01:43:59     at org.apache.beam.runners.core.construction.RehydratedComponents$2.load(RehydratedComponents.java:82)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3528)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2277)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2154)
> 01:43:59     at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2044)
> 01:43:59     ... 54 more