[jira] [Resolved] (KUDU-2785) Support more parallel scanners in the backup job

Grant Henke (JIRA) Thu, 06 Jun 2019 17:04:57 -0700


     [ https://issues.apache.org/jira/browse/KUDU-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Grant Henke resolved KUDU-2785.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

Resolved via 
[https://github.com/apache/kudu/commit/5c87afd4f2160344d651948e90234ee512adcfd8]

> Support more parallel scanners in the backup job
> ------------------------------------------------
>
>                 Key: KUDU-2785
>                 URL: https://issues.apache.org/jira/browse/KUDU-2785
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Assignee: Grant Henke
>            Priority: Major
>              Labels: backup
>             Fix For: 1.10.0
>
>
> Currently, the KuduBackup job uses one scanner, and therefore one Spark task,
> per Kudu partition. Once KUDU-2670 is complete, we should consider and test
> using more than one scanner per partition, configured by a target data size
> for each scanner. That should make backup jobs faster and more reliable and
> predictable, regardless of partition count.
> However, it may make restoring more difficult because it could cause
> compactions, so restore-side testing and improvements may also be required.
> The estimation of key range sizes may also need improvement, so this should
> be well tested.
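
As a rough illustration of the idea in the description above, the sketch below derives a scanner count for each partition from a target data size per scanner. It is not the KuduBackup implementation: the function name, the parameter names, and the sample sizes are all hypothetical, and the per-partition sizes are assumed to come from some estimate of key range sizes.

```python
def scanners_per_partition(partition_bytes, target_bytes_per_scanner):
    """Return a scanner count for each partition (hypothetical helper).

    partition_bytes: estimated size in bytes of each partition (assumed input).
    target_bytes_per_scanner: desired amount of data per scanner (assumed knob).
    """
    if target_bytes_per_scanner <= 0:
        raise ValueError("target_bytes_per_scanner must be positive")
    counts = []
    for size in partition_bytes:
        # Integer ceiling division: enough scanners to cover the estimate.
        needed = (size + target_bytes_per_scanner - 1) // target_bytes_per_scanner
        # Keep at least one scanner so an empty partition is still scanned.
        counts.append(max(1, needed))
    return counts


# Illustrative sizes only: three partitions of very different sizes.
plan = scanners_per_partition([0, 1000, 1001], 500)
```

With a fixed target size, the number of Spark tasks follows the data volume rather than the partition count, which is the predictability the description is after. The accuracy of the plan depends entirely on the quality of the size estimates.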



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
