Alexey Serbin created KUDU-3036:
-----------------------------------

             Summary: RPC size multiplication for DDL operations might hit 
maximum RPC size limit
                 Key: KUDU-3036
                 URL: https://issues.apache.org/jira/browse/KUDU-3036
             Project: Kudu
          Issue Type: Improvement
          Components: master, rpc
            Reporter: Alexey Serbin


When a table uses a multi-tier partitioning scheme with a large number of 
partitions, an {{AlterTable}} request that affects many 
partitions/tablets turns into a much larger {{UpdateConsensus}} RPC when the 
leader master pushes the corresponding update on the system tablet to follower 
masters.

I did some testing for this use case.  With an {{AlterTable}} RPC adding new 
range partitions, I observed the following:
* With range x 2 hash partitions, the incoming {{AlterTable}} RPC request 
is 37070 bytes and the corresponding {{UpdateConsensus}} RPC is 
274278 bytes (~ 7x multiplication factor).
* With range x 10 hash partitions, the incoming {{AlterTable}} RPC request 
is 37070 bytes and the corresponding {{UpdateConsensus}} RPC that the 
leader master pushes to followers for the system tablet update is 1365438 
bytes (~ 37x multiplication factor).
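
For reference, the multiplication factors can be re-derived from the raw sizes. The snippet below is only an illustrative sketch: the byte counts come from the measurements above, while the function and variable names are made up for this example and are not part of Kudu.

```python
# Sizes in bytes, copied from the measurements above.
ALTER_TABLE_REQUEST_BYTES = 37070
UPDATE_CONSENSUS_BYTES = {
    2: 274278,    # range x 2 hash partitions
    10: 1365438,  # range x 10 hash partitions
}


def amplification(request_bytes, replicated_bytes):
    """Ratio of the replicated UpdateConsensus size to the incoming request size."""
    return replicated_bytes / request_bytes


# Hypothetical helper mapping: number of hash partitions -> amplification factor.
factors = {
    hash_partitions: amplification(ALTER_TABLE_REQUEST_BYTES, size)
    for hash_partitions, size in UPDATE_CONSENSUS_BYTES.items()
}

for hash_partitions, factor in factors.items():
    print(f"range x {hash_partitions} hash partitions: {factor:.1f}x")
```

With only two data points this says nothing definitive about how the factor scales, but it does show the factor growing with the number of hash partitions while the request size stays the same.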

With that, it's easy to hit the limit on the maximum RPC size (controlled via 
the {{\-\-rpc_max_message_size}} flag) in larger Kudu clusters.  If 
that happens, Kudu masters enter a continuous leader re-election cycle because 
follower masters don't receive any Raft heartbeats from their leader: the 
heartbeats are rejected at the lower RPC layer due to the maximum RPC size 
limit.
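
To get a rough feel for how close a deployment might be to the limit, here is a back-of-the-envelope sketch. It rests on two assumptions that are not established above: that {{\-\-rpc_max_message_size}} is set to 50 MiB (check the actual value in your deployment), and that the observed amplification stays constant as the {{AlterTable}} request grows. The function name is hypothetical.

```python
def max_safe_request_bytes(max_message_bytes, request_bytes, replicated_bytes):
    """Largest request size (in bytes) whose amplified form still fits under
    the limit, assuming the observed replicated/request ratio stays constant.

    Integer arithmetic is used so the result is exact.
    """
    return max_message_bytes * request_bytes // replicated_bytes


# ASSUMPTION: 50 MiB limit; substitute the value of --rpc_max_message_size
# actually used by the masters.
ASSUMED_MAX_MESSAGE_BYTES = 50 * 1024 * 1024

# Observed sizes for the range x 10 hash partitions case described above.
OBSERVED_REQUEST_BYTES = 37070
OBSERVED_REPLICATED_BYTES = 1365438

limit = max_safe_request_bytes(
    ASSUMED_MAX_MESSAGE_BYTES, OBSERVED_REQUEST_BYTES, OBSERVED_REPLICATED_BYTES
)
print(f"AlterTable requests above ~{limit} bytes could exceed the RPC size limit")
```

Under these assumptions, an {{AlterTable}} request well below the limit itself could still produce an {{UpdateConsensus}} RPC that is rejected.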



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
