[
https://issues.apache.org/jira/browse/HIVE-17956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234468#comment-16234468
]
Mithun Radhakrishnan commented on HIVE-17956:
---------------------------------------------
Hello, [~mkwhitacre].
HIVE-17466 might be of interest to you. This feature adds metastore calls that
return unique values for specified partition keys. In the
{{PartitionValuesRequest}}, one can specify the required partition keys (e.g.
{{dt}}), a filter (e.g. {{dt > "20170101" && dt < "20171231"}}), a sort order
(ascending/descending), and a limit (e.g. top 10).
The implementation sorts and filters on the metastore (in fact, it's pushed
down to the database). It doesn't need to construct complete {{Partition}}
objects, so it is much more memory-friendly and efficient than retrieving all
of the partitions and sorting them on the client.
It's in {{master}} and {{branch-2}}. I have yet to port this to {{branch-2.2}}.
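To make the request semantics concrete, here is a minimal plain-Java sketch of what such a call computes: distinct values of one partition key, filtered, sorted, and limited. It only models the behaviour in memory. The class and method names ({{PartitionValuesSketch}}, {{partitionValues}}) are hypothetical and are not part of the Hive API; in Hive itself the equivalent work is done by the metastore and its backing database.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only: models the distinct/filter/sort/limit
// semantics in memory. It does not call the Hive metastore.
public class PartitionValuesSketch {

    // Returns up to 'limit' distinct partition-key values that pass 'filter',
    // ordered ascending or descending.
    public static List<String> partitionValues(List<String> allValues,
                                               Predicate<String> filter,
                                               boolean ascending,
                                               int limit) {
        Comparator<String> order =
                ascending ? Comparator.naturalOrder() : Comparator.reverseOrder();
        return allValues.stream()
                .distinct()
                .filter(filter)
                .sorted(order)
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up values of a 'dt' partition key, including a duplicate.
        List<String> dt = Arrays.asList(
                "20170301", "20161231", "20170301", "20171115", "20180101");

        // Equivalent of: dt > "20170101" && dt < "20171231"
        Predicate<String> in2017 =
                v -> v.compareTo("20170101") > 0 && v.compareTo("20171231") < 0;

        // "Latest" partition in the range: descending order, limit 1.
        System.out.println(partitionValues(dt, in2017, false, 1)); // prints [20171115]
    }
}
```

Asking for the "latest" partition then amounts to a descending sort with a limit of 1, which is why only a single value (rather than every {{Partition}} object) needs to reach the client.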
HIVE-17467 puts an {{HCatClient}} wrapper around this raw API that's easier to
use. This was developed for use in Oozie (for discovery of data dependencies),
and a couple of other projects. The unit test in that patch shows how to
use it. This has yet to be reviewed or checked in, I'm afraid.
Would this suit your requirement?
> Retrieve "latest" partition from Hive Metastore
> -----------------------------------------------
>
> Key: HIVE-17956
> URL: https://issues.apache.org/jira/browse/HIVE-17956
> Project: Hive
> Issue Type: New Feature
> Components: Metastore
> Reporter: Micah Whitacre
>
> We are trying to utilize the Hive Metastore for our processing needs,
> specifically focusing on consuming through the HCatalog APIs. One use case
> we have is that we want to consume the "latest" partition. In researching
> there are a number of posts[1][2] that talk about using queries through Hive
> Server2 to find that information. It would be more ideal if this was a first
> class API offered from the Hive Metastore without requiring a query to be
> executed.
> The other option would be to retrieve all of the partitions and sort client
> side. There is a concern about the efficiency and memory requirements of
> this especially without the "iterator" concept implemented from HIVE-7195.
> [1] -
> https://community.hortonworks.com/questions/85330/how-to-optimize-hive-access-to-the-latest-partitio.html
> [2] -
> https://stackoverflow.com/questions/36095790/how-to-find-the-most-recent-partition-in-hive-table