[
https://issues.apache.org/jira/browse/HIVE-24741?focusedWorklogId=547843&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547843
]
ASF GitHub Bot logged work on HIVE-24741:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 04/Feb/21 20:17
Start Date: 04/Feb/21 20:17
Worklog Time Spent: 10m
Work Description: vihangk1 opened a new pull request #1948:
URL: https://github.com/apache/hive/pull/1948
### What changes were proposed in this pull request?
This PR improves the performance of get_partitions_ps_with_auth when it
requests all the partitions of a table. get_partitions_ps_with_auth does not
use directSQL, so when all partitions are requested we can redirect the call
to a faster partitions API that supports directSQL. When filters are provided,
the API continues to use the existing DataNucleus path, so that behaviour is
unchanged.
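A rough sketch of the dispatch described above, for illustration only. The
class, interface and method names (PartitionFetchSketch, PartitionStore,
listAllViaDirectSql, listByPartialSpecViaOrm) are hypothetical and are not the
metastore's actual code. It also assumes that a null, empty, or all-empty-string
partial spec means "no filter", i.e. all partitions are requested.

```java
import java.util.List;

/** Hypothetical sketch; names are illustrative, not the Hive metastore API. */
final class PartitionFetchSketch {

    /** Stand-in for the two code paths: a directSQL-backed one and the ORM one. */
    interface PartitionStore {
        List<String> listAllViaDirectSql(String table);
        List<String> listByPartialSpecViaOrm(String table, List<String> partVals);
    }

    /**
     * Assumption: a partial spec constrains nothing when it is null, empty,
     * or contains only null/empty values.
     */
    static boolean requestsAllPartitions(List<String> partVals) {
        if (partVals == null || partVals.isEmpty()) {
            return true;
        }
        for (String value : partVals) {
            if (value != null && !value.isEmpty()) {
                return false; // at least one partition key is filtered
            }
        }
        return true;
    }

    /** Redirect to the faster path only when no filter is present. */
    static List<String> getPartitionsPs(PartitionStore store, String table,
                                        List<String> partVals) {
        if (requestsAllPartitions(partVals)) {
            return store.listAllViaDirectSql(table);
        }
        // Filtered requests keep using the existing path unchanged.
        return store.listByPartialSpecViaOrm(table, partVals);
    }
}
```

The point of keeping the check separate from the dispatch is that filtered
requests are untouched: only the "all partitions" case changes its code path.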
### Why are the changes needed?
Performance improvement for a HMS API.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Modified an existing unit test to verify the new behaviour. Also tested on a
real-world workload, where the performance improvement was substantial.
Issue Time Tracking
-------------------
Worklog Id: (was: 547843)
Remaining Estimate: 0h
Time Spent: 10m
> get_partitions_ps_with_auth performance can be improved when requesting all
> the partitions
> ------------------------------------------------------------------------------------------
>
> Key: HIVE-24741
> URL: https://issues.apache.org/jira/browse/HIVE-24741
> Project: Hive
> Issue Type: Improvement
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {{get_partitions_ps_with_auth}} API does not support DirectSQL. I have seen
> some large production use cases where this API (specifically from Spark
> applications) is used heavily to request all the partitions of a table.
> The performance of this API when requesting all the partitions of the table
> can be significantly improved (~4 times in a large real-world workload)
> if we forward this API call to a directSQL-enabled API.