[ https://issues.apache.org/jira/browse/KUDU-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke reassigned KUDU-2785:
---------------------------------
Assignee: Grant Henke
> Support more parallel scanners in the backup job
> ------------------------------------------------
>
> Key: KUDU-2785
> URL: https://issues.apache.org/jira/browse/KUDU-2785
> Project: Kudu
> Issue Type: Improvement
> Affects Versions: 1.9.0
> Reporter: Grant Henke
> Assignee: Grant Henke
> Priority: Major
> Labels: backup
>
> Currently the KuduBackup job uses one scanner, and therefore one Spark task,
> per Kudu partition. Once KUDU-2670 is complete, we should consider and test
> using more than one scanner per partition, sized by a configurable target
> data size for each scanner. That should make backup jobs faster and more
> reliable and predictable regardless of partition count.
> It may, however, make restoring more difficult because it could cause
> compactions, so restore-side testing and improvements may also be required.
> The estimation of key range sizes may also need to be improved,
> so this should be well tested.
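
To make the sizing idea concrete, here is a minimal sketch of how a target data size per scanner could translate into a scanner count per partition. It is plain arithmetic, not Kudu or KuduBackup code: the function names, the partition names, and the 128 MiB target are all hypothetical, and the partition sizes stand in for whatever key range size estimates the job would actually obtain.

```python
MIB = 1024 * 1024


def scanners_per_partition(estimated_bytes, target_bytes):
    """Return the scanner count for one partition (hypothetical helper).

    estimated_bytes: estimated size of the partition's data.
    target_bytes: desired amount of data handled by each scanner.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")
    if estimated_bytes < 0:
        raise ValueError("estimated_bytes must not be negative")
    # Integer ceiling division; always keep at least one scanner so that
    # empty or very small partitions are still scanned.
    return max(1, -(-estimated_bytes // target_bytes))


def plan_scanners(partition_sizes, target_bytes):
    """Map each partition name to its scanner count (hypothetical helper)."""
    return {
        name: scanners_per_partition(size, target_bytes)
        for name, size in partition_sizes.items()
    }


if __name__ == "__main__":
    # Made-up size estimates for three partitions of very different sizes.
    sizes = {"p0": 64 * MIB, "p1": 1024 * MIB, "p2": 0}
    for name, count in plan_scanners(sizes, 128 * MIB).items():
        print(name, count)
```

With a scheme like this, the number of Spark tasks would track the amount of data rather than the partition count, which is why the accuracy of the size estimates matters: an estimate that is too low leaves a scanner with more data than the target.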
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)