Alexey Serbin created KUDU-3016:
-----------------------------------
Summary: Catalog manager: don't lump together all transactions per
tablet report
Key: KUDU-3016
URL: https://issues.apache.org/jira/browse/KUDU-3016
Project: Kudu
Issue Type: Improvement
Components: master
Reporter: Alexey Serbin
With the current structure of the system tablet, whose rows store metadata
on tablets, the catalog manager can generate a very large write operation on the
system tablet when processing a full tablet report sent by a tablet server. At
some point (depending on the {{--rpc_max_message_size}} setting), a tablet
report received from a tablet server is accepted, but the corresponding Raft
update can be almost twice as large and can therefore exceed that limit. If that
happens, the Kudu cluster becomes almost non-functional because of a
self-perpetuating
accepted-huge-tablet-report-but-cannot-push-Raft-update-to-follower-masters
pattern.
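To make the size arithmetic concrete, here is a minimal sketch. The 50 MiB limit and the 1.9x amplification factor are illustrative assumptions (the description only says "almost two times larger"), and {{raft_update_fits}} is a hypothetical helper, not a Kudu function.

```python
MIB = 1024 * 1024


def raft_update_fits(report_bytes: int,
                     max_message_bytes: int,
                     amplification: float = 1.9) -> bool:
    """Return True if the Raft update derived from a tablet report of
    report_bytes would still fit under the RPC message size limit.

    amplification is an assumed ratio of Raft update size to report size.
    """
    return report_bytes * amplification <= max_message_bytes


limit = 50 * MIB  # assumed value of the RPC message size limit

# A 20 MiB report is accepted and its ~38 MiB Raft update also fits.
print(raft_update_fits(20 * MIB, limit))  # True

# A 30 MiB report is accepted (30 MiB < 50 MiB), but its ~57 MiB Raft
# update is too large to push to the follower masters.
print(raft_update_fits(30 * MIB, limit))  # False
```

Under these assumptions, any report larger than roughly half the limit is accepted by the master but cannot be replicated.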
The catalog manager should not lump the updates for all tablets reported by
one tablet server into a single write operation:
https://github.com/apache/kudu/blob/3175c35c7d721aef0c4c6b358cc3b422089c1ba7/src/kudu/master/catalog_manager.cc#L4268-L4274
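One possible direction, sketched here in Python rather than in the catalog manager's C++, is to split the per-tablet updates from a single report into several smaller writes, each bounded by a byte budget. The names {{TabletUpdate}}, {{batch_updates}} and {{max_batch_bytes}} are hypothetical, and the sizes stand in for serialized update sizes; this is an illustration of the idea, not the actual fix.

```python
from typing import Iterable, Iterator, List, Tuple

# (tablet_id, serialized_update_size_in_bytes); a hypothetical stand-in
# for one tablet's metadata update destined for the system tablet.
TabletUpdate = Tuple[str, int]


def batch_updates(updates: Iterable[TabletUpdate],
                  max_batch_bytes: int) -> Iterator[List[TabletUpdate]]:
    """Greedily group updates so that each batch stays within max_batch_bytes.

    Each yielded batch would become its own write operation instead of one
    write covering every tablet in the report.
    """
    batch: List[TabletUpdate] = []
    batch_bytes = 0
    for tablet_id, size in updates:
        if size > max_batch_bytes:
            # A single update over budget cannot be batched at all.
            raise ValueError(f"update for {tablet_id} exceeds the batch budget")
        if batch and batch_bytes + size > max_batch_bytes:
            # Adding this update would overflow the budget: flush first.
            yield batch
            batch, batch_bytes = [], 0
        batch.append((tablet_id, size))
        batch_bytes += size
    if batch:
        yield batch


updates = [("t1", 40), ("t2", 30), ("t3", 50), ("t4", 10)]
for b in batch_updates(updates, max_batch_bytes=80):
    print(b)
```

With a budget of 80, the four updates above are written as two batches of two tablets each. In practice the budget would have to be chosen with the Raft amplification in mind, so that each resulting update stays under {{--rpc_max_message_size}}.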