[
https://issues.apache.org/jira/browse/FLINK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991225#comment-16991225
]
Biao Liu commented on FLINK-15032:
----------------------------------
[~trohrmann], thanks for explanation! I think you have raised a critical
question.
Besides the message size checking, the serialization checking might be more
sticky. We might avoid the message size checking somehow, see the discussion
under FLINK-4399. But I don't think we could do the same thing on serialization
checking. It seems that this synchronous pre-checking can not be avoided,
because some rpc invocations are "fire and forget". It's hard to tell invoker
that there is something wrong with the rpc message without a synchronous
pre-checking.
Here are some rough ideas of mine. I think maybe we could do this optimization
on the invocation without user-defined fields. I believe we could guarantee
that the message is small and serializable in some scenarios.
> Remove the eager serialization from `RemoteRpcInvocation`
> ----------------------------------------------------------
>
> Key: FLINK-15032
> URL: https://issues.apache.org/jira/browse/FLINK-15032
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Guowei Ma
> Priority: Major
>
> Currently, the constructor of `RemoteRpcInvocation` serializes the
> `parameterTypes` and `arg` of an RPC call. This could lead to a problem:
> Consider a job that has 1k parallelism and has a 1m union list state. When
> deploying the 1k tasks, the eager serialization would use 1G memory
> instantly(Some time the serialization amplifies the memory usage). However,
> the serialized object is only used when the Akka sends the message. So we
> could reduce the memory pressure if we only serialize the object when the
> message would be sent by the Akka.
> Akka would serialize the message at last and all the XXXGateway related class
> could be visible by the RPC level. Because of that, I think the eager
> serialization in the constructor of `RemoteRpcInvocation` could be avoided. I
> also do a simple test and find this could reduce the time cost of the RPC
> call. The 1k number of RPC calls with 1m `String` message: The current
> version costs around 2700ms; the Nonserialization version cost about 37ms.
>
> In summary, this Jira proposes to remove the eager serialization at the
> constructor of `RemoteRpcInvocation`.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)