[ https://issues.apache.org/jira/browse/HIVE-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sai Hemanth Gantasala reassigned HIVE-26893:
--------------------------------------------
Assignee: Sai Hemanth Gantasala
> Extend batch partition APIs to ignore partition schemas
> -------------------------------------------------------
>
> Key: HIVE-26893
> URL: https://issues.apache.org/jira/browse/HIVE-26893
> Project: Hive
> Issue Type: New Feature
> Components: Metastore
> Reporter: Quanlong Huang
> Assignee: Sai Hemanth Gantasala
> Priority: Major
>
> There are several HMS APIs that return a list of partitions, e.g.
> get_partitions_ps(), get_partitions_by_names(), add_partitions_req() with
> needResult=true, etc. Each partition instance will have a unique list of
> FieldSchemas as the partition schema:
> {code:java}
> org.apache.hadoop.hive.metastore.api.Partition
> -> org.apache.hadoop.hive.metastore.api.StorageDescriptor
> -> cols: list<org.apache.hadoop.hive.metastore.api.FieldSchema> {code}
> This can take up a large amount of memory for wide tables (e.g. with 2k
> cols). See the heap histogram in IMPALA-11812 as an example.
> Some engines, such as Impala, don't actually use or respect the partition-level
> schema, so transmitting it wastes network and serde resources. It would be nice
> if these APIs provided an optional boolean flag for ignoring partition
> schemas, so that HMS clients (e.g. Impala) don't need to clear them afterwards
> to save memory.
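
For illustration only, the client-side clearing mentioned above might look roughly like the sketch below. The classes FieldSchema, StorageDescriptor and Partition here are simplified, hypothetical stand-ins for the Thrift-generated HMS classes (only the fields relevant to this issue are modelled), and fakeFetch() is an assumed placeholder for a batch call such as get_partitions_by_names(). It is not the HMS client API and not the proposed flag.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSchemaSketch {
    // Hypothetical stand-ins for the Thrift-generated HMS classes.
    static class FieldSchema {
        final String name;
        final String type;
        FieldSchema(String name, String type) { this.name = name; this.type = type; }
    }

    static class StorageDescriptor {
        List<FieldSchema> cols = new ArrayList<>();
    }

    static class Partition {
        final List<String> values = new ArrayList<>();
        StorageDescriptor sd = new StorageDescriptor();
    }

    // Assumed placeholder for a batch partition fetch: every partition
    // comes back with its own, separately allocated column list.
    static List<Partition> fakeFetch(int numParts, int numCols) {
        List<Partition> parts = new ArrayList<>();
        for (int p = 0; p < numParts; p++) {
            Partition part = new Partition();
            part.values.add("p" + p);
            for (int c = 0; c < numCols; c++) {
                part.sd.cols.add(new FieldSchema("col" + c, "string"));
            }
            parts.add(part);
        }
        return parts;
    }

    // What a client that ignores partition-level schemas has to do today:
    // drop each partition's column list after the response was already
    // serialized, transmitted and deserialized.
    static int clearPartitionSchemas(List<Partition> parts) {
        int dropped = 0;
        for (Partition p : parts) {
            if (p.sd != null && p.sd.cols != null) {
                dropped += p.sd.cols.size();
                p.sd.cols = null;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        List<Partition> parts = fakeFetch(3, 2000);
        // 3 partitions x 2000 columns = 6000 FieldSchema objects dropped.
        System.out.println("dropped FieldSchema objects: " + clearPartitionSchemas(parts));
    }
}
```

With a server-side flag as requested in this issue, the column lists would presumably never be populated or sent, so a pass like clearPartitionSchemas() would become unnecessary.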
--
This message was sent by Atlassian Jira
(v8.20.10#820010)