Zhou Fang created HIVE-25277:
--------------------------------
Summary: Slow Hive partition deletion for Cloud object stores with
expensive ListFiles
Key: HIVE-25277
URL: https://issues.apache.org/jira/browse/HIVE-25277
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Affects Versions: All Versions
Reporter: Zhou Fang
Assignee: Zhou Fang
Deleting a Hive partition is slow when the warehouse is a cloud object store
for which ListFiles is expensive. One root cause is that the recursive parent
directory deletion is very inefficient: it makes many duplicated calls to
isEmpty, each of which ends up calling ListFiles. This fix sorts the parent
directories to delete by path length and always processes the longest one
first (e.g., a/b/c is always processed before a/b). As a result, each parent
path only needs to be checked once.
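
The ordering idea can be sketched roughly as follows. This is only an illustration, not the actual patch: ParentDirCleanup, FakeStore, and deleteEmptyParents are hypothetical names, and the in-memory store is an assumed stand-in for a real object store that counts how often the expensive listing would run.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Hive code base.
final class ParentDirCleanup {

    /** Assumed in-memory stand-in for an object store. */
    static final class FakeStore {
        final Set<String> paths = new HashSet<>();
        int listCalls = 0; // how many times the "expensive" listing ran

        /** A directory is empty when no stored path lies underneath it. */
        boolean isEmpty(String dir) {
            listCalls++; // models one ListFiles call
            String prefix = dir + "/";
            for (String p : paths) {
                if (p.startsWith(prefix)) {
                    return false;
                }
            }
            return true;
        }

        void delete(String dir) {
            paths.remove(dir);
        }
    }

    /**
     * Deduplicates the candidate parents, sorts them longest path first,
     * and checks each one exactly once. A child path is always longer than
     * its parent, so a/b/c is handled before a/b.
     */
    static List<String> deleteEmptyParents(FakeStore store, Collection<String> candidates) {
        List<String> sorted = new ArrayList<>(new TreeSet<>(candidates));
        sorted.sort((x, y) -> Integer.compare(y.length(), x.length()));

        List<String> deleted = new ArrayList<>();
        for (String dir : sorted) {
            if (store.isEmpty(dir)) {
                store.delete(dir);
                deleted.add(dir);
            }
        }
        return deleted;
    }
}
```

With a store holding a, a/b, a/b/c, a/b/d and a/b/d/p3, passing the candidates a/b/c, a/b, a (with repeats) should remove only a/b/c and list each distinct parent once, however many dropped partitions share those parents.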
--
This message was sent by Atlassian Jira
(v8.3.4#803005)