[ https://issues.apache.org/jira/browse/HIVE-7654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107096#comment-14107096 ]
Ashutosh Chauhan commented on HIVE-7654:
----------------------------------------

[~szehon], the API returns the number of requested partitions for which statistics were found in the metastore. If a user of the API is not interested in extrapolated stats, she can compare the number of partitions requested with the number of partitions returned and, if the two numbers are not equal, choose to ignore the result of this API. An API for unaggregated stats already exists, which callers can use if they are not interested in this extrapolation.

> A method to extrapolate columnStats for partitions of a table
> -------------------------------------------------------------
>
>                 Key: HIVE-7654
>                 URL: https://issues.apache.org/jira/browse/HIVE-7654
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: pengcheng xiong
>            Assignee: pengcheng xiong
>            Priority: Minor
>         Attachments: Extrapolate the Column Status.docx, HIVE-7654.0.patch,
> HIVE-7654.1.patch, HIVE-7654.4.patch, HIVE-7654.6.patch, HIVE-7654.7.patch,
> HIVE-7654.8.patch
>
>
> In a PARTITIONED table, there are many partitions. For example,
> create table if not exists loc_orc (
>   state string,
>   locid int,
>   zip bigint
> ) partitioned by(year string) stored as orc;
> We assume there are 4 partitions: partition(year='2000'),
> partition(year='2001'), partition(year='2002') and partition(year='2003').
> We can use the following command to compute statistics for columns
> state,locid of partition(year='2001'):
> analyze table loc_orc partition(year='2001') compute statistics for columns state,locid;
> We need to know the “aggregated” column statistics for the whole table loc_orc.
> However, we may not have the column statistics for some partitions, e.g.,
> partition(year='2002'), and we may also not have the column statistics for some
> columns, e.g., zip bigint for partition(year='2001').
> We propose a method to extrapolate the missing column statistics for the
> partitions.
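The check described in the comment (compare the number of partitions requested with the number returned, and ignore the aggregate when they differ) could be sketched roughly as below. This is only an illustration, not the Hive metastore client: `AggrResult`, `parts_found`, `col_stats` and `stats_if_complete` are hypothetical names chosen for the example, and the statistics payload is a made-up placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AggrResult:
    """Hypothetical stand-in for an aggregated-stats response."""
    col_stats: Dict[str, dict] = field(default_factory=dict)
    parts_found: int = 0  # partitions for which stats were found in the metastore


def stats_if_complete(requested_parts: List[str],
                      result: AggrResult) -> Optional[Dict[str, dict]]:
    """Return the aggregate only when every requested partition had stats.

    When fewer partitions were found than requested, the aggregate may have
    been extrapolated, so a caller who does not want that gets None instead.
    """
    if result.parts_found != len(requested_parts):
        return None
    return result.col_stats


# Illustrative data only: four requested partitions, as in the issue description.
parts = ["year=2000", "year=2001", "year=2002", "year=2003"]
partial = AggrResult(col_stats={"locid": {"num_nulls": 0}}, parts_found=3)
full = AggrResult(col_stats={"locid": {"num_nulls": 0}}, parts_found=4)
```

With this sketch, `stats_if_complete(parts, partial)` yields `None` because only 3 of the 4 partitions had stats, while `stats_if_complete(parts, full)` passes the aggregate through. A caller who gets `None` would fall back to the unaggregated stats API mentioned in the comment.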