[
https://issues.apache.org/jira/browse/HIVE-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12903630#action_12903630
]
Joydeep Sen Sarma commented on HIVE-1602:
-----------------------------------------
yikes. how is this queried afterwards?
the user can already do this by applying the transformation namit listed to the
partitioning column in the select clause. the user can do a one-time analysis of
the data (the size distribution over the different partitioning column values)
and then generate the clumping logic manually.
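
a rough sketch of what that one-time analysis plus manual clumping could look
like, in Python. everything here is an assumption for illustration: the function
names, the byte threshold, the 'other' label and the sample sizes are
hypothetical and come neither from this issue nor from Hive. it only shows the
idea of keeping large values as their own partition and folding small ones into
a single catch-all, then emitting a CASE expression to paste into the select
clause.

```python
def clump_small_values(sizes, min_bytes, other_label="other"):
    """Map each partition value to itself if it is large enough,
    otherwise to a shared catch-all label (hypothetical helper)."""
    return {
        value: (value if size >= min_bytes else other_label)
        for value, size in sizes.items()
    }


def case_expression(column, mapping, other_label="other"):
    """Build a SQL CASE expression that applies the clumping to `column`.

    Assumes the partition values are plain strings without quotes;
    anything else is rejected rather than escaped.
    """
    keep = sorted(v for v, label in mapping.items() if label == v)
    for v in keep:
        if "'" in v or "\\" in v:
            raise ValueError(f"unsupported character in value: {v!r}")
    if not keep:
        # Every value was small, so everything lands in the catch-all.
        return f"'{other_label}'"
    quoted = ", ".join(f"'{v}'" for v in keep)
    return (
        f"CASE WHEN {column} IN ({quoted}) "
        f"THEN {column} ELSE '{other_label}' END"
    )


# Made-up size distribution (bytes per partition value) from the analysis step.
sizes = {"US": 500, "CA": 5, "MX": 3, "IN": 400}
mapping = clump_small_values(sizes, min_bytes=100)
print(case_expression("country", mapping))
```

the printed expression would replace the raw partitioning column in the select
clause of the dynamic partition insert. queries that filter on the original
column value would then need the same mapping to find the right partition.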
because this does not result in queryable data sets - it doesn't seem
useful/reusable to me.
> List Partitioning
> -----------------
>
> Key: HIVE-1602
> URL: https://issues.apache.org/jira/browse/HIVE-1602
> Project: Hadoop Hive
> Issue Type: New Feature
> Affects Versions: 0.7.0
> Reporter: Ning Zhang
>
> Dynamic partition inserts create partitions based on the dynamic partition
> column values. Currently, one partition is created for each distinct DP column
> value. This can produce skew in the created dynamic partitions: some
> partitions are large, but there can also be a large number of small
> partitions. This burdens both HDFS and the metastore. A list partitioning
> scheme that aggregates a number of small partitions into one big one is
> preferable for skewed partitions.