Gabor Kaszab created IMPALA-12251:
-------------------------------------
Summary: Table migration to run on multiple partitions in parallel
Key: IMPALA-12251
URL: https://issues.apache.org/jira/browse/IMPALA-12251
Project: IMPALA
Issue Type: New Feature
Components: Frontend
Reporter: Gabor Kaszab
https://issues.apache.org/jira/browse/IMPALA-11013 introduces table migration
from legacy Hive tables to Iceberg tables. The parallelization in that patch is
based on files within a partition. If a table has many partitions with only a
few files in each, this approach does not perform well.
As an improvement, we can also implement parallelization across partitions and
then decide which one to use based on the ratio of the number of partitions to
the average number of files per partition.
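
The ratio-based choice could look roughly like the sketch below. It is only an
illustration, not Impala code: the function name choose_parallelism, the
"partition"/"file" return values, and the default threshold of 1.0 are all
assumptions made for the example, and a real threshold would have to come from
measurements.

```python
from statistics import mean
from typing import Sequence


def choose_parallelism(files_per_partition: Sequence[int],
                       ratio_threshold: float = 1.0) -> str:
    """Pick a parallelization strategy for a table migration.

    files_per_partition holds one file count per partition.
    ratio_threshold is a hypothetical tuning knob, not a value from the issue.
    Returns "partition" or "file".
    """
    num_partitions = len(files_per_partition)
    if num_partitions == 0:
        # Nothing to spread across partitions; keep the existing behaviour.
        return "file"

    avg_files = mean(files_per_partition)
    if avg_files == 0:
        # No files at all, so file-level parallelism has nothing to work on.
        return "partition"

    # ratio = (# partitions) / (avg # of files in a partition)
    ratio = num_partitions / avg_files
    return "partition" if ratio > ratio_threshold else "file"
```

With these assumptions, a table with 1000 partitions of 2 files each has a
ratio of 500 and would be migrated partition-parallel, while a table with 2
partitions of 500 and 700 files has a ratio well below 1 and would keep the
file-based parallelization.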