Csaba Ringhofer created IMPALA-12476:
----------------------------------------
Summary: Single thread permission check can bottleneck table loadin
Key: IMPALA-12476
URL: https://issues.apache.org/jira/browse/IMPALA-12476
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Reporter: Csaba Ringhofer
Partitioned tables use multiple threads to list files in different partitions,
but access permission checks are done before this on a single thread.
IMPALA-7320 optimized this for full table loads (more exactly if a high
percentage of partitions have changed), but in some cases we still do namenode
RPCs on a single thread to get access level:
1. as mentioned above, if only a small subset of partitions are changed
2. if the path has ACL (access control list), then after getting file status an
extra getAclStatus RPC is done, leading to partition_count number of RPCs on a
single thread if ACL is enabled for all partitions
3. if there is some error when doing the optimized path
1. is especially problematic for metastore event processing, as partition
events will often change only a subset of partitions. Even if all partitions
are changed, the catalogd may not process them in one batch (see IMPALA-12463),
leading to choosing the unoptimized path for several smaller batches
Besides the optimization in IMPALA-7320, there is no good reason for doing
access level check on a single thread, so both case 1 and 2 good be made faster
by moving to the multithreaded stage of table loading.
Note it is also a question whether all these access permission checks are
really needed, see IMPALA-12472.
An anomaly caused by doing these on a single thread is that the affect of flag
max_hdfs_partitions_parallel_load can be ambiguous - while it can significantly
speed up loading tables with multiple partitions, if the namenode (or the
thread that communicates with namenode) is contended, then parallel loads will
get an unfair share of the limited resources, meaning the tables where large
amount of work is done on single thread can actually get slower.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)