[
https://issues.apache.org/jira/browse/KUDU-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-3016:
--------------------------------
Affects Version/s: 1.6.0
1.7.0
1.8.0
1.7.1
1.9.0
1.10.0
1.10.1
1.11.0
1.11.1
Labels: Availability scalability (was: scalability)
> Catalog manager: don't lump together all updates from one tablet report
> -----------------------------------------------------------------------
>
> Key: KUDU-3016
> URL: https://issues.apache.org/jira/browse/KUDU-3016
> Project: Kudu
> Issue Type: Improvement
> Components: master
> Affects Versions: 1.6.0, 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.10.0, 1.10.1,
> 1.11.0, 1.11.1
> Reporter: Alexey Serbin
> Assignee: Alexey Serbin
> Priority: Major
> Labels: Availability, scalability
>
> With the current structure of the system tablet rows that store tablet
> metadata, the catalog manager can create a very large write operation on the
> system tablet when processing a full tablet report sent by a tablet server.
> Past a certain report size (which depends on the {{--rpc_max_message_size}}
> setting), a tablet report received from a tablet server is accepted, but
> the corresponding Raft update for the system tablet is not, because it can be
> almost twice as large. If that happens, the Kudu cluster becomes almost
> non-functional because of a self-perpetuating
> accepted-huge-tablet-report-but-cannot-push-Raft-update-to-follower-masters
> pattern.
> The catalog manager should not lump together updates on all tablets received
> from one tablet server:
> https://github.com/apache/kudu/blob/3175c35c7d721aef0c4c6b358cc3b422089c1ba7/src/kudu/master/catalog_manager.cc#L4268-L4274
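
The idea of not lumping all updates together can be illustrated with a minimal sketch. This is not Kudu code: the function name `split_into_batches`, the `max_batch_bytes` parameter, and the representation of each per-tablet update as an already-serialized byte string are all assumptions made for illustration. The sketch greedily groups updates so that each resulting write stays within a size budget, which would presumably be set well below {{--rpc_max_message_size}}.

```python
from typing import Iterable, Iterator, List


def split_into_batches(
    updates: Iterable[bytes], max_batch_bytes: int
) -> Iterator[List[bytes]]:
    """Greedily group serialized per-tablet updates into size-bounded batches.

    Hypothetical helper, not part of Kudu: each yielded batch stands in for
    one write to the system tablet, and its total payload size never exceeds
    max_batch_bytes.
    """
    batch: List[bytes] = []
    batch_bytes = 0
    for update in updates:
        size = len(update)
        if size > max_batch_bytes:
            # A single update that cannot fit in any batch is reported
            # instead of being silently dropped or sent oversized.
            raise ValueError(
                f"single update of {size} bytes exceeds limit {max_batch_bytes}"
            )
        if batch and batch_bytes + size > max_batch_bytes:
            # Adding this update would overflow the budget: flush first.
            yield batch
            batch, batch_bytes = [], 0
        batch.append(update)
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    # Three 40-byte updates with a 100-byte budget per write.
    report = [b"a" * 40, b"b" * 40, b"c" * 40]
    for i, group in enumerate(split_into_batches(report, 100)):
        print(i, len(group), sum(len(u) for u in group))
```

With a scheme like this, a large tablet report turns into several smaller system tablet writes rather than one write that may exceed the RPC size limit. The sketch ignores the serialization overhead of the enclosing message, so a real budget would need some headroom.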