[ 
https://issues.apache.org/jira/browse/FLINK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guowei Ma updated FLINK-15032:
------------------------------
    Summary: Remove the eager serialization from `RemoteRpcInvocation`   (was: 
Remove the eagerly serialization from `RemoteRpcInvocation` )

> Remove the eager serialization from `RemoteRpcInvocation` 
> ----------------------------------------------------------
>
>                 Key: FLINK-15032
>                 URL: https://issues.apache.org/jira/browse/FLINK-15032
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Guowei Ma
>            Priority: Major
>
>     Currently, the constructor of `RemoteRpcInvocation` serializes the 
> `parameterTypes` and `arg` of an RPC call. This could lead to a problem:
> Consider a job that has 1k parallelism and has a 1m union list state. When 
> deploying the 1k tasks, the eager serialization would use 1G memory 
> instantly(Some time the serialization amplifies the memory usage). However, 
> the serialized object is only used when the Akka sends the message.  So we 
> could reduce the memory pressure if we only serialize the object when the 
> message would be sent by the Akka.
> Akka would serialize the message at last and all the XXXGateway related class 
> could be visible by the RPC level. Because of that, I think the eager 
> serialization in the constructor of `RemoteRpcInvocation` could be avoided. I 
> also do a simple test and find this could reduce the time cost of the RPC 
> call. The 1k number of  RPC calls with 1m `String` message:  The current 
> version costs around 2700ms; the Nonserialization version cost about 37ms.
>  
> In summary, this Jira proposes to remove the serialization at the constructor 
> of `RemoteRpcInvocation`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to