[ https://issues.apache.org/jira/browse/KUDU-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Will Berkeley reassigned KUDU-2786:
-----------------------------------

    Assignee: Will Berkeley

> Parallelize tables for backup and restore
> ------------------------------------------
>
>                 Key: KUDU-2786
>                 URL: https://issues.apache.org/jira/browse/KUDU-2786
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.9.0
>            Reporter: Grant Henke
>            Assignee: Will Berkeley
>            Priority: Major
>              Labels: backup
>
> Currently the backup and restore jobs process tables serially. This works
> well to ensure resources aren't over allocated upfront, but could be less
> performant for cases where there are many small tables. Instead we could
> parallelize the Spark jobs for each table.
> It should be straightforward to use Scala futures to run multiple jobs in
> parallel and check their status. We could add a configuration to cap the
> maximum number of tables run at the same time, though maybe that isn't really
> needed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
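
For illustration only, the approach described in the issue (one Scala future per table, with an optional cap on how many run at once) could be sketched roughly as follows. This is not the Kudu implementation: `ParallelTables`, `runAll`, `processTable`, and `maxParallel` are hypothetical names, and `processTable` merely stands in for whatever launches the per-table Spark job. The cap is enforced here by running the futures on a fixed-size thread pool, which is one possible choice among several.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.{Success, Try}

object ParallelTables {
  // Hypothetical stand-in for the per-table Spark backup or restore job.
  def processTable(table: String): String = {
    if (table.isEmpty) throw new IllegalArgumentException("empty table name")
    s"done: $table"
  }

  // Runs `job` once per table with at most `maxParallel` jobs in flight.
  // Returns every table paired with the outcome of its job, in input order.
  def runAll(tables: Seq[String], maxParallel: Int)(
      job: String => String): Seq[(String, Try[String])] = {
    // The fixed-size pool is what caps the number of concurrent tables.
    val pool = Executors.newFixedThreadPool(maxParallel)
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(pool)
    try {
      // Submit everything up front; toVector forces eager submission.
      val futures = tables.toVector.map(t => t -> Future(job(t)))
      // Wait for each future and keep its result as a Try, so one failed
      // table does not hide the status of the others.
      futures.map { case (t, f) =>
        t -> Await.ready(f, Duration.Inf).value.get
      }
    } finally {
      pool.shutdown()
    }
  }
}
```

A caller might then check the status of each table like this:

```scala
val results = ParallelTables.runAll(Seq("a", "b", ""), maxParallel = 2)(
  ParallelTables.processTable)
val failed = results.collect { case (t, r) if r.isFailure => t }
```

Dropping the cap would amount to sizing the pool to the number of tables; whether that is safe presumably depends on how much cluster capacity each table's job requests.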