[
https://issues.apache.org/jira/browse/KUDU-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377424#comment-16377424
]
Todd Lipcon commented on KUDU-2278:
-----------------------------------
The difference is that a random insert workload populates just one MemRowSet
(MRS) per tablet, so the MRS can grow large before it has to flush. A random
update/upsert workload, however, spreads its writes over hundreds of
DeltaMemStores (DMS, one per rowset), so flushing often has to start while
individual DMS are still quite small.
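
To make the arithmetic concrete, here is a rough sketch in Python. It is not Kudu code: the function name, the per-tablet memory budget, and the rowset count are all hypothetical, and it assumes writes are spread evenly over the in-memory stores.

```python
def avg_store_size_at_flush(memory_budget_bytes: int, num_stores: int) -> float:
    """Average size of each in-memory store once their combined size
    reaches the budget, assuming writes are spread evenly across stores."""
    if num_stores <= 0:
        raise ValueError("num_stores must be positive")
    return memory_budget_bytes / num_stores


MIB = 1024 * 1024
BUDGET = 512 * MIB  # hypothetical memory available to one tablet before flushing

# Random inserts: a single MRS absorbs every write.
mrs_size = avg_store_size_at_flush(BUDGET, num_stores=1)

# Random updates: writes land in one DMS per rowset (256 rowsets assumed here).
dms_size = avg_store_size_at_flush(BUDGET, num_stores=256)

print(f"MRS at flush: {mrs_size / MIB:.0f} MiB")
print(f"each DMS at flush: {dms_size / MIB:.0f} MiB")
```

Under these assumed numbers the single MRS flushes at 512 MiB, while each DMS holds only about 2 MiB when flushing has to begin.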
> Improve IO for writing deltas
> -----------------------------
>
> Key: KUDU-2278
> URL: https://issues.apache.org/jira/browse/KUDU-2278
> Project: Kudu
> Issue Type: Improvement
> Components: cfile, tablet
> Reporter: Andrew Wong
> Priority: Major
>
> Today, writing new deltas entails rewriting entire tablet metadata files in
> order to track the newly-created block ids. Even if a delta is on the order
> of kilobytes, the tablet metadata files can be on the order of megabytes, so
> the IO cost of a flush is quite high relative to the small amount of delta
> data actually written.
> This could be improved by batching such delta flushes, or by revamping tablet
> metadata entirely to batch any operations that require metadata updates.
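
The cost described in the issue can be estimated with a back-of-the-envelope sketch like the one below. It is illustrative only: the function name and the sizes are hypothetical, and it assumes the metadata file is rewritten exactly once per batch and keeps the same size throughout.

```python
def write_amplification(delta_bytes: int, metadata_bytes: int,
                        deltas_per_batch: int = 1) -> float:
    """Total bytes written divided by delta bytes written, when one full
    metadata rewrite is shared by `deltas_per_batch` delta flushes."""
    if delta_bytes <= 0 or deltas_per_batch <= 0:
        raise ValueError("delta_bytes and deltas_per_batch must be positive")
    payload = delta_bytes * deltas_per_batch
    return (payload + metadata_bytes) / payload


KIB = 1024
MIB = 1024 * KIB

# Hypothetical sizes: a 4 KiB delta and a 4 MiB tablet metadata file.
unbatched = write_amplification(4 * KIB, 4 * MIB)
batched = write_amplification(4 * KIB, 4 * MIB, deltas_per_batch=16)

print(f"one metadata rewrite per delta: {unbatched:.0f}x")
print(f"one metadata rewrite per 16 deltas: {batched:.0f}x")
```

With these assumed sizes, flushing each delta on its own writes about 1025 bytes for every byte of delta data, while batching 16 delta flushes per metadata rewrite brings that down to about 65.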