[ 
https://issues.apache.org/jira/browse/KUDU-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Berkeley reassigned KUDU-2786:
-----------------------------------

    Assignee: Will Berkeley

> Parallelize tables for backup and restore 
> ------------------------------------------
>
>                 Key: KUDU-2786
>                 URL: https://issues.apache.org/jira/browse/KUDU-2786
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> Currently, the backup and restore jobs process tables serially. This 
> ensures that resources aren't overallocated up front, but it can be slow 
> when there are many small tables. Instead, we could run the Spark jobs for 
> each table in parallel. 
> It should be straightforward to use Scala futures to run multiple jobs in 
> parallel and check their status. We could add a configuration option to cap 
> the maximum number of tables processed at the same time, though that may not 
> be needed. 
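
A minimal sketch of the futures-based approach described above, not the actual Kudu implementation. The names `ParallelTables`, `runAll`, `maxParallel`, and the `job` function are hypothetical; `job` stands in for whatever launches the per-table Spark backup or restore job. A fixed-size thread pool provides the cap on concurrently processed tables, and each table's outcome is collected as a `Try` so one failure does not hide the others.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.Try

object ParallelTables {
  // Runs `job` once per table with at most `maxParallel` tables in flight.
  // `job` is a hypothetical placeholder for the per-table Spark job.
  def runAll(tables: Seq[String], maxParallel: Int)(
      job: String => String): Map[String, Try[String]] = {
    // The pool size is the concurrency cap; extra tables queue until a thread frees up.
    val pool = Executors.newFixedThreadPool(maxParallel)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Submit every table up front; toList forces strict evaluation.
      val futures = tables.toList.map(t => t -> Future(job(t)))
      // Wait for each future and record success or failure per table.
      futures.map { case (t, f) => t -> Try(Await.result(f, Duration.Inf)) }.toMap
    } finally {
      pool.shutdown()
    }
  }
}
```

Calling `ParallelTables.runAll(tableNames, 4)(backupOne)` would process up to four tables at a time, where `backupOne` is again an assumed function. Setting `maxParallel` to 1 reproduces the current serial behavior.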



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
