[jira] [Resolved] (KUDU-2786) Parallelize tables for backup and restore

Grant Henke (JIRA) Thu, 06 Jun 2019 07:37:39 -0700


     [ https://issues.apache.org/jira/browse/KUDU-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Grant Henke resolved KUDU-2786.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.10.0

Resolved via [https://github.com/apache/kudu/commit/84086fe6a]

> Parallelize tables for backup and restore 
> ------------------------------------------
>
>                 Key: KUDU-2786
>                 URL: https://issues.apache.org/jira/browse/KUDU-2786
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>             Fix For: 1.10.0
>
>
> Currently, the backup and restore jobs process tables serially. This works
> well to ensure resources aren't over-allocated up front, but it can be slow
> when there are many small tables. Instead, we could parallelize the Spark
> jobs for each table.
> It should be straightforward to use Scala futures to run multiple jobs in
> parallel and check their status. We could add a configuration to cap the
> maximum number of tables run at the same time, though that may not be
> necessary.
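
For illustration only, the idea in the description (one Scala future per table, with an optional cap on how many run at once) could be sketched roughly as below. This is not the code from the resolving commit. The names `ParallelTables`, `processTable`, `runAll` and `maxParallel` are hypothetical, and `processTable` is a stand-in for the real per-table Spark job. The cap is assumed to be enforced by a fixed-size thread pool.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.{Failure, Success, Try}

object ParallelTables {
  // Hypothetical stand-in for the per-table Spark backup or restore job.
  def processTable(table: String): String = {
    if (table.isEmpty) throw new IllegalArgumentException("empty table name")
    s"done:$table"
  }

  // Runs `job` once per table, with at most `maxParallel` tables in flight,
  // and returns the outcome (success or failure) for every table.
  def runAll[A](tables: Seq[String], maxParallel: Int)(job: String => A): Map[String, Try[A]] = {
    // A fixed-size pool is what caps the number of concurrently running jobs.
    val pool = Executors.newFixedThreadPool(maxParallel)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      // Start one future per table. Failures are captured as values so that
      // one failed table does not hide the status of the others.
      val futures = tables.map { t => t -> Future(Try(job(t))) }
      // Block until every job has finished, then collect the per-table status.
      futures.map { case (t, f) => t -> Await.result(f, Duration.Inf) }.toMap
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    val results = runAll(Seq("table_a", "table_b", ""), maxParallel = 2)(processTable)
    results.foreach {
      case (t, Success(v)) => println(s"'$t' -> $v")
      case (t, Failure(e)) => println(s"'$t' failed: ${e.getMessage}")
    }
  }
}
```

Setting `maxParallel` to 1 in this sketch gives back the serial behaviour, so a cap of this kind would also serve as a simple fallback if running tables in parallel over-allocates resources.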



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
