[ 
https://issues.apache.org/jira/browse/FLINK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991225#comment-16991225
 ] 

Biao Liu commented on FLINK-15032:
----------------------------------

[~trohrmann], thanks for explanation! I think you have raised a critical 
question.

Besides the message size checking, the serialization checking might be more 
sticky. We might avoid the message size checking somehow, see the discussion 
under FLINK-4399. But I don't think we could do the same thing on serialization 
checking. It seems that this synchronous pre-checking can not be avoided, 
because some rpc invocations are "fire and forget". It's hard to tell invoker 
that there is something wrong with the rpc message without a synchronous 
pre-checking.

Here are some rough ideas of mine. I think maybe we could do this optimization 
on the invocation without user-defined fields. I believe we could guarantee 
that the message is small and serializable in some scenarios.

> Remove the eager serialization from `RemoteRpcInvocation` 
> ----------------------------------------------------------
>
>                 Key: FLINK-15032
>                 URL: https://issues.apache.org/jira/browse/FLINK-15032
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Guowei Ma
>            Priority: Major
>
>     Currently, the constructor of `RemoteRpcInvocation` serializes the 
> `parameterTypes` and `arg` of an RPC call. This could lead to a problem:
> Consider a job that has 1k parallelism and has a 1m union list state. When 
> deploying the 1k tasks, the eager serialization would use 1G memory 
> instantly(Some time the serialization amplifies the memory usage). However, 
> the serialized object is only used when the Akka sends the message.  So we 
> could reduce the memory pressure if we only serialize the object when the 
> message would be sent by the Akka.
> Akka would serialize the message at last and all the XXXGateway related class 
> could be visible by the RPC level. Because of that, I think the eager 
> serialization in the constructor of `RemoteRpcInvocation` could be avoided. I 
> also do a simple test and find this could reduce the time cost of the RPC 
> call. The 1k number of  RPC calls with 1m `String` message:  The current 
> version costs around 2700ms; the Nonserialization version cost about 37ms.
>  
> In summary, this Jira proposes to remove the eager serialization at the 
> constructor of `RemoteRpcInvocation`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to