Zhou Fang created HIVE-25277:
--------------------------------

             Summary: Slow Hive partition deletion for Cloud object stores with expensive ListFiles
                 Key: HIVE-25277
                 URL: https://issues.apache.org/jira/browse/HIVE-25277
             Project: Hive
          Issue Type: Improvement
          Components: Standalone Metastore
    Affects Versions: All Versions
            Reporter: Zhou Fang
            Assignee: Zhou Fang


Deleting a Hive partition is slow when the warehouse is a cloud object store 
for which ListFiles is expensive. One root cause is that the recursive 
parent-directory deletion is very inefficient: it makes many duplicate calls 
to isEmpty, each of which ends in a ListFiles call. This fix sorts the parent 
directories to delete by path length and always processes the longest one 
first (e.g., a/b/c is always handled before a/b). As a result, each parent 
path needs to be checked only once.
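
The ordering idea described above can be illustrated with a small, 
self-contained sketch. This is not the Hive patch itself: CountingStore, 
is_empty, delete, and delete_empty_parents are hypothetical names, and the 
in-memory store only stands in for an object store so that the number of 
listing calls can be counted. It assumes the partition directories have 
already been removed and only their parents remain to be cleaned up.

```python
from pathlib import PurePosixPath


class CountingStore:
    """Hypothetical in-memory stand-in for an object store (not a Hive API)."""

    def __init__(self, paths):
        self.paths = {PurePosixPath(p) for p in paths}
        self.list_calls = 0

    def is_empty(self, path):
        # Models the expensive ListFiles call that sits behind isEmpty.
        self.list_calls += 1
        return not any(p.parent == path for p in self.paths)

    def delete(self, path):
        self.paths.discard(path)


def delete_empty_parents(store, deleted_partitions, root):
    """Remove empty ancestors of deleted partitions, checking each one once."""
    root = PurePosixPath(root)
    # Collect every distinct ancestor strictly below the root directory.
    candidates = set()
    for part in deleted_partitions:
        for parent in PurePosixPath(part).parents:
            if parent == root or root not in parent.parents:
                break
            candidates.add(parent)
    # Deepest paths first, so a/b/c is always handled before a/b.
    removed = []
    for path in sorted(candidates, key=lambda p: len(p.parts), reverse=True):
        if store.is_empty(path):
            store.delete(path)
            removed.append(str(path))
    return removed
```

Because the candidates are deduplicated and visited deepest first, a parent 
is inspected only after all of its candidate children have been handled, so 
one emptiness check per directory is enough regardless of how many 
partitions shared that parent.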



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
