Grant Henke created KUDU-2786:
---------------------------------
Summary: Parallelize tables for backup and restore
Key: KUDU-2786
URL: https://issues.apache.org/jira/browse/KUDU-2786
Project: Kudu
Issue Type: Improvement
Affects Versions: 1.9.0
Reporter: Grant Henke
Currently the backup and restore jobs process tables serially. This avoids
over-allocating resources up front, but it can be slow when there are many
small tables. Instead, we could run the Spark jobs for each table in
parallel.
It should be straightforward to use Scala futures to run multiple jobs in
parallel and check their status. We could add a configuration option to cap
the maximum number of tables processed at the same time, though that may not
be needed.
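
A rough sketch of the idea, not the actual backup/restore code: the names
ParallelTables, processTable, runAll and maxParallel are all hypothetical,
and processTable is only a placeholder for the real per-table Spark job. It
assumes the cap is enforced by running the futures on a fixed-size thread
pool, and that each table's result is collected so one failure does not hide
the others.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration
import scala.util.Try

object ParallelTables {
  // Hypothetical stand-in for the per-table Spark backup or restore job.
  def processTable(table: String): Unit = {
    if (table.isEmpty) throw new IllegalArgumentException("empty table name")
    Thread.sleep(50) // placeholder for the real work
  }

  // Runs `job` once per table with at most `maxParallel` tables in flight.
  // Returns the outcome for each table so the caller can check its status.
  def runAll(tables: Seq[String], maxParallel: Int)(
      job: String => Unit): Map[String, Try[Unit]] = {
    // The fixed pool size is what caps the number of concurrent tables.
    val pool = Executors.newFixedThreadPool(maxParallel)
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutorService(pool)
    try {
      // Wrap each job in Try so a failed table does not fail the whole batch.
      val futures = tables.map(t => Future(t -> Try(job(t))))
      Await.result(Future.sequence(futures), Duration.Inf).toMap
    } finally {
      pool.shutdown()
    }
  }
}
```

A caller could then report failures per table, for example with
ParallelTables.runAll(tables, 4)(ParallelTables.processTable) followed by a
filter on isFailure. Setting maxParallel to 1 would give back the current
serial behavior.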