Deepesh Khandelwal created HIVE-8062:
----------------------------------------

             Summary: Stats collection for columns fails on a partitioned table with null values in partitioning column
                 Key: HIVE-8062
                 URL: https://issues.apache.org/jira/browse/HIVE-8062
             Project: Hive
          Issue Type: Bug
          Components: Statistics
    Affects Versions: 0.14.0
            Reporter: Deepesh Khandelwal


Steps to reproduce:
1. Create a data file abc.txt with the following contents:
{noformat}
a,1
b,
{noformat}
2. Use the Hive CLI to create and load the partitioned table:
{noformat}
hive> create table abc(a string, b int);
OK
Time taken: 0.272 seconds
hive> load data local inpath 'abc.txt' into table abc;
Loading data to table default.abc
Table default.abc stats: [numFiles=1, numRows=0, totalSize=7, rawDataSize=0]
OK
Time taken: 0.463 seconds
hive> create table abc1(a string) partitioned by (b int);
OK
Time taken: 0.098 seconds
hive> set hive.exec.dynamic.partition.mode=nonstrict;
hive> insert overwrite table abc1 partition (b) select a, b from abc;
Query ID = hrt_qa_20140911210909_1200fae7-1e18-4e0d-b74f-040453c27cff
Total jobs = 1
Launching Job 1 out of 1
Status: Running (application id: Executing on YARN cluster with App id application_1410457588978_0063)
Map 1: -/-	Reducer 2: 0/1
Map 1: 0/1	Reducer 2: 0/1
Map 1: 0(+1)/1	Reducer 2: 0/1
Map 1: 1/1	Reducer 2: 0(+1)/1
Map 1: 1/1	Reducer 2: 0/1
Map 1: 1/1	Reducer 2: 1/1
Status: Finished successfully
Loading data to table default.abc1 partition (b=null)
	Loading partition {b=__HIVE_DEFAULT_PARTITION__}
Partition default.abc1{b=__HIVE_DEFAULT_PARTITION__} stats: [numFiles=1, numRows=2, totalSize=7, rawDataSize=5]
OK
Time taken: 7.49 seconds
{noformat}
3. Now run the analyze statistics command for columns:
{noformat}
hive> analyze table abc1 partition (b) compute statistics for columns;
Query ID = hrt_qa_20140911211010_440bdb4a-6a0d-496b-9d2e-5fc84db3d0ee
Total jobs = 1
Launching Job 1 out of 1
Status: Running (application id: Executing on YARN cluster with App id application_1410457588978_0063)
Map 1: 0(+1)/1	Reducer 2: 0/1
Map 1: 1/1	Reducer 2: 0(+1)/1
Map 1: 1/1	Reducer 2: 1/1
Status: Finished successfully
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask
{noformat}

The analyze statistics command for columns fails.
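
For convenience, the reproduction steps can be collected into a script. This is only a minimal sketch and not part of the original report: the file name repro.hql and the temporary working directory are hypothetical choices, and the hive invocation is left commented out because it assumes a Hive client on the PATH. The statements themselves are copied from the steps in this report.

```shell
#!/bin/sh
# Sketch: stage the data file and the HiveQL statements from this report.
set -eu

# Hypothetical scratch directory; any writable directory would do.
workdir=$(mktemp -d)

# Step 1: the two-line data file. The second row has no value after the comma.
printf 'a,1\nb,\n' > "$workdir/abc.txt"

# Steps 2 and 3: the statements from the report, in order.
cat > "$workdir/repro.hql" <<'EOF'
create table abc(a string, b int);
load data local inpath 'abc.txt' into table abc;
create table abc1(a string) partitioned by (b int);
set hive.exec.dynamic.partition.mode=nonstrict;
insert overwrite table abc1 partition (b) select a, b from abc;
analyze table abc1 partition (b) compute statistics for columns;
EOF

# Sanity check: the file size should match totalSize=7 in the load output.
size=$(wc -c < "$workdir/abc.txt" | tr -d ' ')
echo "abc.txt is $size bytes"

# Assumption: a Hive client is installed. Run from $workdir so that the
# relative path 'abc.txt' in the load statement resolves.
# (cd "$workdir" && hive -f repro.hql)
```

On an affected build, the last statement in repro.hql is expected to end with the ColumnStatsTask error shown in step 3.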