[ 
https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217081#comment-17217081
 ] 

Szehon Ho commented on HIVE-24263:
----------------------------------

Ah I did not see this, will take a look, thanks for the pointer

> Create an HMS endpoint to list partition locations
> --------------------------------------------------
>
>                 Key: HIVE-24263
>                 URL: https://issues.apache.org/jira/browse/HIVE-24263
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-24263.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In our company, we have a use-case to get quickly a list of partition 
> locations.  Currently it is done via listPartitions, which is a very heavy 
> operation in terms of memory and performance.
> This JIRA proposes an API: Map<String, String> listPartitionLocations(String 
> db, String table, short max) that returns a map of partition names to 
> locations.
> For example, we have an integration from output of a Hive pipeline to Spark 
> jobs that consume directly from HDFS.  The Spark job scheduler needs to know 
> the partition paths that are available for consumption (the partition name is 
> not sufficient as it's input is HDFS path), and so we have to do heavy 
> listPartitions() for this.
> Another use-case is for a HDFS data removal tool that does a nightly crawl to 
> see if there are associated hive partitions mapped to a given partition path. 
>  The nightly crawling job could be much less resource-intensive if we had a 
> listPartitionLocations().
> As there is already an internal method in the ObjectStore for this done for 
> dropPartitions, it is only a matter of exposing this API to 
> HiveMetaStoreClient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to