[
https://issues.apache.org/jira/browse/FLINK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986935#comment-16986935
]
Guowei Ma commented on FLINK-15032:
-----------------------------------
Hi [~SleePy]
>>>Do you mean we could postpone the serialization till
>>>{{RemoteRpcInvocation#writeObject}}?
Yes.
>>>Could you explain it a bit more? Do you mean there is no visible issue of
>>>postponing the serialization? Or we don't need serialization anymore?
What I mean is that there is no visible issue of postponing the serialization.
>>>Do we still need serialization in {{writeObject}} after removing
>>>serialization in constructor?
The serialization is still needed.
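
To make the idea concrete, here is a rough sketch of what postponing the serialization could look like. It is not the actual Flink code: `LazyInvocation`, `toBytes` and `fromBytes` are hypothetical names used only for illustration, and plain Java serialization stands in for whatever Akka uses on the wire. The constructor only keeps references; the fields are written out in `writeObject`, which runs only when the message is really serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for RemoteRpcInvocation; an assumption, not Flink's class. */
final class LazyInvocation implements Serializable {
    private static final long serialVersionUID = 1L;

    // Plain references: nothing is serialized when the object is constructed.
    private transient String methodName;
    private transient Class<?>[] parameterTypes;
    private transient Object[] args;

    LazyInvocation(String methodName, Class<?>[] parameterTypes, Object[] args) {
        this.methodName = methodName;
        this.parameterTypes = parameterTypes;
        this.args = args;
    }

    String getMethodName() { return methodName; }
    Class<?>[] getParameterTypes() { return parameterTypes; }
    Object[] getArgs() { return args; }

    // Invoked by Java serialization only when the message is actually written,
    // so the serialization cost is paid here and not in the constructor.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeUTF(methodName);
        out.writeObject(parameterTypes);
        out.writeObject(args);
    }

    // Mirror of writeObject: restores the fields in the same order.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        methodName = in.readUTF();
        parameterTypes = (Class<?>[]) in.readObject();
        args = (Object[]) in.readObject();
    }

    /** Helper for illustration: serializes an object to a byte array. */
    static byte[] toBytes(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }

    /** Helper for illustration: reads an object back from a byte array. */
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deserialization failed", e);
        }
    }
}
```

With this shape, creating 1k invocations only holds references to the arguments, and the bytes are produced one message at a time when each message is sent. The serialization itself still happens, just later.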
> Remove the eager serialization from `RemoteRpcInvocation`
> ----------------------------------------------------------
>
> Key: FLINK-15032
> URL: https://issues.apache.org/jira/browse/FLINK-15032
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Guowei Ma
> Priority: Major
>
> Currently, the constructor of `RemoteRpcInvocation` serializes the
> `parameterTypes` and `arg` of an RPC call. This can cause a problem.
> Consider a job with a parallelism of 1k and a 1 MB union list state. When
> deploying the 1k tasks, the eager serialization would use about 1 GB of
> memory at once (and sometimes serialization amplifies the memory usage
> further). However, the serialized object is only used when Akka sends the
> message, so we could reduce the memory pressure by serializing the object
> only when Akka is about to send the message.
> Akka serializes the message in the end anyway, and all the XXXGateway-related
> classes are visible at the RPC level. Because of that, I think the eager
> serialization in the constructor of `RemoteRpcInvocation` can be avoided. I
> also ran a simple test and found that this reduces the time cost of RPC
> calls. For 1k RPC calls with a 1 MB `String` message, the current version
> takes around 2700 ms, while the version without eager serialization takes
> about 37 ms.
>
> In summary, this Jira proposes removing the eager serialization in the
> constructor of `RemoteRpcInvocation`.