Rui Fan created FLINK-34655:
-------------------------------
Summary: Autoscaler doesn't work for flink 1.15
Key: FLINK-34655
URL: https://issues.apache.org/jira/browse/FLINK-34655
Project: Flink
Issue Type: Bug
Components: Autoscaler
Reporter: Rui Fan
Assignee: Rui Fan
Fix For: 1.8.0
flink-kubernetes-operator is committed to supporting the latest 4 Flink minor
versions, and the autoscaler is part of flink-kubernetes-operator. Currently, the
latest 4 Flink minor versions are 1.15, 1.16, 1.17 and 1.18.
However, the autoscaler doesn't work with Flink 1.15.
h2. Root cause:
* FLINK-28310 added some properties to IOMetricsInfo in Flink 1.16
* IOMetricsInfo is part of JobDetailsInfo
* JobDetailsInfo is required by the autoscaler [1]
* Flink's RestClient doesn't allow any property to be missing when
deserializing the JSON
This means that a RestClient from any version after 1.15 cannot fetch
JobDetailsInfo for 1.15 jobs.
h2. How to fix it properly?
The Flink side should support ignoring unknown properties.
FLINK-33268 already does this. But when I tried running the autoscaler with a
Flink 1.15 job, it still didn't work, because the properties added to
IOMetricsInfo are of primitive types, so a response that omits them is still
rejected.
DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES should be disabled as well.
(Not sure whether this should be a separate FLIP or whether it can be part of
FLIP-401 [2].)
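
To make the failure mode above concrete, here is a minimal, dependency-free sketch. It is a model only, not Flink or Jackson code: the class MissingPrimitiveDemo, the IoMetrics type, the failOnNullForPrimitives flag and the two field names are all hypothetical and chosen for illustration. It assumes the client expects a long field that an older server never sends.

```java
import java.util.Map;

// Dependency-free model of the failure mode. This is NOT Flink or Jackson code.
final class MissingPrimitiveDemo {

    // Hypothetical stand-in for IOMetricsInfo with one old and one newer long field.
    static final class IoMetrics {
        final long readBytes;
        final long accumulatedBusyTime;

        IoMetrics(long readBytes, long accumulatedBusyTime) {
            this.readBytes = readBytes;
            this.accumulatedBusyTime = accumulatedBusyTime;
        }
    }

    // Reads a long from an already-parsed JSON object. A missing field is an
    // error in strict mode and defaults to 0 otherwise.
    static long readLong(Map<String, Long> json, String field, boolean failOnNullForPrimitives) {
        Long value = json.get(field);
        if (value != null) {
            return value;
        }
        if (failOnNullForPrimitives) {
            throw new IllegalStateException("Missing primitive field: " + field);
        }
        return 0L;
    }

    static IoMetrics parse(Map<String, Long> json, boolean failOnNullForPrimitives) {
        return new IoMetrics(
                readLong(json, "read-bytes", failOnNullForPrimitives),
                readLong(json, "accumulated-busy-time", failOnNullForPrimitives));
    }

    public static void main(String[] args) {
        // A 1.15-style response that predates the newer field (illustrative data).
        Map<String, Long> oldResponse = Map.of("read-bytes", 42L);

        try {
            parse(oldResponse, true);
        } catch (IllegalStateException e) {
            System.out.println("strict: " + e.getMessage());
        }

        IoMetrics lenient = parse(oldResponse, false);
        System.out.println("lenient: busy time = " + lenient.accumulatedBusyTime);
    }
}
```

In this model, tolerating unknown properties alone would not help, because the problem is a field the client expects but the older server omits. Only relaxing the missing-primitive check lets the old response parse.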
h2. How to fix it in the short term?
1. Copy the latest RestMapperUtils and RestClient from the master branch (which
includes FLINK-33268) into the flink-autoscaler module. (The copied classes will
be loaded first.)
2. Disable DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES in
RestMapperUtils#flexibleObjectMapper in the copied class.
With these 2 steps, Flink 1.15 works well with the autoscaler. (I tested it
locally.)
Once DeserializationFeature.FAIL_ON_NULL_FOR_PRIMITIVES is disabled in
RestMapperUtils#flexibleObjectMapper on the Flink side and the corresponding
code is released, flink-kubernetes-operator can remove these 2 copied
classes.
[1]
https://github.com/apache/flink-kubernetes-operator/blob/ede1a610b3375d31a2e82287eec67ace70c4c8df/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/ScalingMetricCollector.java#L109
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance