Alexey Serbin created KUDU-3036:
-----------------------------------
Summary: RPC size multiplication for DDL operations might hit
maximum RPC size limit
Key: KUDU-3036
URL: https://issues.apache.org/jira/browse/KUDU-3036
Project: Kudu
Issue Type: Improvement
Components: master, rpc
Reporter: Alexey Serbin
When a table uses multi-tier partitioning scheme, with large number of
partitions created, an {{AlterTable}} request that affects many
partitions/tablets turns into a much larger {{UpdateConsensus}} RPC when leader
master pushes the corresponding update on the system tablet to follower masters.
I did some testing for this use case. With {{AlterTable}} RPC adding new range
partitions, I observed the following:
* With range x 2 hash partitions, with the incoming {{AlterTable}} RPC request
size is 37070 bytes, the size for the corresponding {{UpdateConsensus}} is
274278 bytes (~ 7x multiplication factor).
* With range x 10 hash partitions, with the incoming {{AlterTable}} RPC request
size is 37070 bytes, the size for the corresponding {{UpdateConsensus}} when
leader master pushes the updates on the system tablet to followers is 1365438
bytes (~ 36x multiplication factor).
With that, it's easy to hit the limit on the maximum PRC size (controlled via
the {{\-\-rpc_max_message_size}} flag) in case of larger Kudu clusters. If
that happens, Kudu masters start continuous leader re-election cycle since
follower masters don't receive any Raft heartbeats from their leader: the
heartbeats are rejected at the lower RPC layer due to the maximum RPC size
limit.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)