[
https://issues.apache.org/jira/browse/HIVE-28145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956846#comment-17956846
]
Denys Kuzmenko commented on HIVE-28145:
---------------------------------------
[~VenuReddy], [~dengzh] is this a regression? Which HMS client was used:
SessionHiveMetaStoreClient?
> getPartitionsByNames API returns partition objects with empty values in many
> fields when it is executed concurrently with dropPartition API
> --------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-28145
> URL: https://issues.apache.org/jira/browse/HIVE-28145
> Project: Hive
> Issue Type: Bug
> Reporter: Venugopal Reddy K
> Priority: Major
> Labels: hive-4.1.0-must
>
> *Description:*
> The getPartitionsByNames API returns partition objects with empty values in
> many fields when it is executed concurrently with the dropPartition API.
> The org.apache.hadoop.hive.metastore.MetaStoreDirectSql#getPartitionsViaPartNames
> method issues multiple queries to the backend database to populate the
> various fields of the partition object. First it queries for the part ids by
> partition name, then joins the PARTITIONS, SDS, and SERDES tables for those
> part ids and creates the partition objects. Then it runs another query
> against the PARTITION_KEY_VALS table to get the partition values for those
> part ids and populates them in the already created partition objects.
> So if a partition is deleted just before the PARTITION_KEY_VALS query, the
> returned partition object can have empty values. The same can happen for
> other fields of the partition object that need their own queries (for
> example partition params, storage descriptor params, serde params, sort
> cols, bucket cols, and skewed cols).
> *Note: The issue can be observed with both directsql and JDO-based queries.
> All APIs that issue multiple queries to the backend database within a
> transaction need to be checked.*
> *Root Cause:*
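> The transaction is opened with the default isolation level, which in
> DataNucleus is read-committed. Under read-committed, each query sees the data
> committed at the time that query runs, so a drop committed between two
> queries of the same transaction is visible to the later one.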
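
To make the interleaving concrete, here is a minimal in-memory sketch of the two-step read described in the issue. It does not use Hive or a database: the two maps are hypothetical stand-ins for the PARTITIONS and PARTITION_KEY_VALS tables, and the `betweenQueries` hook is an assumption added only to force the drop to land between the two reads, which is what a concurrent commit can do under read-committed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReadCommittedSketch {
    // Hypothetical stand-ins for the PARTITIONS and PARTITION_KEY_VALS tables.
    public static final Map<Long, String> PARTITIONS = new ConcurrentHashMap<>();
    public static final Map<Long, List<String>> PARTITION_KEY_VALS = new ConcurrentHashMap<>();

    public static final class Part {
        public final long id;
        public final String name;
        public List<String> values = new ArrayList<>();
        Part(long id, String name) { this.id = id; this.name = name; }
    }

    // Simulates a committed dropPartition: the rows vanish from both tables.
    public static void dropPartition(long id) {
        PARTITION_KEY_VALS.remove(id);
        PARTITIONS.remove(id);
    }

    // Two-step read; betweenQueries runs after query 1 and before query 2.
    public static List<Part> getPartitions(List<Long> ids, Runnable betweenQueries) {
        List<Part> parts = new ArrayList<>();
        for (Long id : ids) {               // query 1: build the partition objects
            String name = PARTITIONS.get(id);
            if (name != null) parts.add(new Part(id, name));
        }
        betweenQueries.run();               // a concurrent commit lands here
        for (Part p : parts) {              // query 2: fill in the partition values
            List<String> vals = PARTITION_KEY_VALS.get(p.id);
            if (vals != null) p.values = vals;
        }
        return parts;
    }

    public static void main(String[] args) {
        PARTITIONS.put(1L, "ds=2024-01-01");
        PARTITION_KEY_VALS.put(1L, List.of("2024-01-01"));
        PARTITIONS.put(2L, "ds=2024-01-02");
        PARTITION_KEY_VALS.put(2L, List.of("2024-01-02"));

        // Partition 2 is dropped after query 1 has already created its object.
        List<Part> result = getPartitions(List.of(1L, 2L), () -> dropPartition(2L));
        for (Part p : result) {
            System.out.println(p.name + " -> " + p.values);
        }
        // prints:
        // ds=2024-01-01 -> [2024-01-01]
        // ds=2024-01-02 -> []
    }
}
```

The second partition comes back as an object with an empty value list, which matches the symptom in the report. A snapshot-style isolation level, or a single joined query, would avoid this particular interleaving; which fix suits the metastore is a separate question.
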
> *Steps to reproduce:*
> # Create a partitioned table and add 500 to 1000 dynamic partitions (a
> dummy partition param, sd param, and serde param can be added).
> # Create a thread pool of size 2 and submit 2 tasks: one task that calls
> getPartitionsByNames and another that calls dropPartition in a loop.
> # Verify the fields in the partition objects returned from
> getPartitionsByNames().
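
The reproduction steps could be wired up roughly as follows. This is a sketch of the harness only, not the reporter's test: `ConcurrentDropRepro`, `run`, and its parameters are hypothetical names, and the reader and dropper are passed in as plain functions so the block compiles without Hive on the classpath. Against a real metastore, the supplier would presumably wrap `getPartitionsByNames` and map each returned partition to its values, and the runnable would call `dropPartition` for the next partition.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ConcurrentDropRepro {
    /**
     * Runs a reader task and a dropper task on a pool of size 2 and returns
     * how many returned partitions had a null or empty value list.
     */
    public static int run(Supplier<List<List<String>>> getPartitionValues,
                          Runnable dropNextPartition,
                          int iterations) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger emptyCount = new AtomicInteger();
        try {
            Future<?> reader = pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    for (List<String> vals : getPartitionValues.get()) {
                        if (vals == null || vals.isEmpty()) {
                            emptyCount.incrementAndGet();
                        }
                    }
                }
            });
            Future<?> dropper = pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    dropNextPartition.run();
                }
            });
            reader.get();   // rethrows any failure from the reader task
            dropper.get();  // rethrows any failure from the dropper task
        } finally {
            pool.shutdown();
        }
        return emptyCount.get();
    }

    public static void main(String[] args) throws Exception {
        // Stub wiring: every read returns one good partition and one with
        // empty values, and the dropper does nothing.
        Supplier<List<List<String>>> read =
            () -> List.of(List.of("2024-01-01"), List.<String>of());
        int empty = run(read, () -> { }, 3);
        System.out.println("partitions with empty values: " + empty);
        // prints: partitions with empty values: 3
    }
}
```

With real calls plugged in, a non-zero count would indicate the behaviour described in the issue. Because the race is timing dependent, a zero count on one run does not show the problem is absent.
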
--
This message was sent by Atlassian Jira
(v8.20.10#820010)