Rui Fan created FLINK-36129:
-------------------------------
Summary: Autoscaler is compatible with Flink 1.20
Key: FLINK-36129
URL: https://issues.apache.org/jira/browse/FLINK-36129
Project: Flink
Issue Type: Improvement
Components: Autoscaler
Reporter: Rui Fan
Assignee: Rui Fan
AggregatedMetric added the skew field in Flink 1.20, it caused the
AggregatedMetric of old version cannot parse the AggregatedMetric of 1.20.
Following is the root exception:
{code:java}
Caused by:
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
Unrecognized field "skew" (class
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetric), not
marked as ignorable (5 known properties: "min", "id", "max", "avg", "sum"])
at [Source: UNKNOWN; byte offset: #UNKNOWN] (through reference chain:
java.util.ArrayList[0]->org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetric["skew"]){code}
Following is the detailed log:
{code:java}
2024-08-22 13:25:44,262 o.a.f.a.s.StandaloneAutoscalerExecutor [ERROR] [] Error
while fetch job list.
java.util.concurrent.TimeoutException
at
java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
at
org.apache.flink.autoscaler.standalone.flinkcluster.FlinkClusterJobListFetcher.fetch(FlinkClusterJobListFetcher.java:65)
at
org.apache.flink.autoscaler.standalone.StandaloneAutoscalerExecutor.scaling(StandaloneAutoscalerExecutor.java:126)
at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at
java.base/java.util.concurrent.FutureTask.runAndReset$$$capture(FutureTask.java:305)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java)
at
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:842)
2024-08-22 13:25:44,260 o.a.f.a.JobAutoScalerImpl [ERROR]
[8bae72ec15d2993f66803c5f20b5654d] Error while scaling job
java.util.concurrent.ExecutionException:
org.apache.flink.util.concurrent.FutureUtils$RetryException: Could not complete
the operation. Number of retries has been exhausted.
at
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
at
org.apache.flink.autoscaler.RestApiMetricsCollector.queryAggregatedVertexMetrics(RestApiMetricsCollector.java:109)
at
org.apache.flink.autoscaler.RestApiMetricsCollector.lambda$queryAllAggregatedMetrics$0(RestApiMetricsCollector.java:80)
at
java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180)
at
java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
at
java.base/java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1850)
at
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at
java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
at
java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
at
java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at
java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
at
org.apache.flink.autoscaler.RestApiMetricsCollector.queryAllAggregatedMetrics(RestApiMetricsCollector.java:77)
at
org.apache.flink.autoscaler.ScalingMetricCollector.updateMetrics(ScalingMetricCollector.java:134)
at
org.apache.flink.autoscaler.JobAutoScalerImpl.runScalingLogic(JobAutoScalerImpl.java:178)
at
org.apache.flink.autoscaler.JobAutoScalerImpl.scale(JobAutoScalerImpl.java:103)
at
org.apache.flink.autoscaler.standalone.StandaloneAutoscalerExecutor.scalingSingleJob(StandaloneAutoscalerExecutor.java:185)
at
org.apache.flink.autoscaler.standalone.StandaloneAutoscalerExecutor.lambda$scaling$0(StandaloneAutoscalerExecutor.java:143)
at
java.base/java.util.concurrent.CompletableFuture$AsyncRun.run$$$capture(CompletableFuture.java:1804)
at
java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java)
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:842)
Caused by: org.apache.flink.util.concurrent.FutureUtils$RetryException: Could
not complete the operation. Number of retries has been exhausted.
at
org.apache.flink.util.concurrent.FutureUtils.lambda$retryOperationWithDelay$6(FutureUtils.java:294)
at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
at
java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
at
java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:614)
at
java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1163)
at
java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
... 3 more
Caused by: java.util.concurrent.CompletionException:
org.apache.flink.runtime.rest.util.RestClientException: Response was neither of
the expected type([simple type, class
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetricsResponseBody])
nor an error.
at
java.base/java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:368)
at
java.base/java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:377)
at
java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1152)
... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: Response was
neither of the expected type([simple type, class
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetricsResponseBody])
nor an error.
at
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:664)
at
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$6(RestClient.java:628)
at
java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1150)
... 4 more
Caused by:
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
Unrecognized field "skew" (class
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetric), not
marked as ignorable (5 known properties: "min", "id", "max", "avg", "sum"])
at [Source: UNKNOWN; byte offset: #UNKNOWN] (through reference chain:
java.util.ArrayList[0]->org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetric["skew"])
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException.from(UnrecognizedPropertyException.java:61)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.DeserializationContext.handleUnknownProperty(DeserializationContext.java:1132)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.StdDeserializer.handleUnknownProperty(StdDeserializer.java:2202)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownProperty(BeanDeserializerBase.java:1705)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.handleUnknownVanilla(BeanDeserializerBase.java:1683)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:284)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:463)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:352)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:185)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer._deserializeFromArray(CollectionDeserializer.java:359)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:244)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.deserialize(CollectionDeserializer.java:28)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4706)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2904)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.core.JsonParser.readValueAs(JsonParser.java:2337)
at
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetricsResponseBody$Deserializer.deserialize(AggregatedMetricsResponseBody.java:104)
at
org.apache.flink.runtime.rest.messages.job.metrics.AggregatedMetricsResponseBody$Deserializer.deserialize(AggregatedMetricsResponseBody.java:90)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:323)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:4706)
at
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2948)
at
org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:646)
... 6 more {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)