Guowei Ma created FLINK-15032:
---------------------------------
Summary: Remove the eagerly serialization from
`RemoteRpcInvocation`
Key: FLINK-15032
URL: https://issues.apache.org/jira/browse/FLINK-15032
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Reporter: Guowei Ma
Currently, the constructor of `RemoteRpcInvocation` serializes the
`parameterTypes` and `arg` of an RPC call. This could lead to two problems:
# Consider a job that has 1k parallelism and has a 1m union list state. When
deploying the 1k tasks, the eager serialization would use 1G memory
instantly(Some time the serialization amplifies the memory usage). However, the
serialized object is only used when the Akka sends the message. So we could
reduce the memory pressure if we only serialize the object when the message
would be sent by the Akka.
# Furthermore, Akka would serialize the message at last and all the XXXGateway
related class could be visible by the RPC level. Because of that, I think the
serialization in the constructor of `RemoteRpcInvocation` could be avoided. I
also do a simple test and find this could reduce the time cost of the RPC call.
The 1k number of RPC calls with 1m `String` message: The current version
costs around 2700ms; the Nonserialization version cost about 37ms.
In summary, this Jira proposes to remove the serialization at the constructor
of `RemoteRpcInvocation`.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)