[ https://issues.apache.org/jira/browse/HIVE-14143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361156#comment-15361156 ]
Nemon Lou commented on HIVE-14143:
----------------------------------

[~pxiong]
The "ids" passed in is just "sizeOfColumnsInTableScan" in many places. So "ids.size() != *sizeOfColumnsInTableScan" will always be false.
{code}
ColumnProjectionUtils.appendReadColumns(
    jobConf, ts.getNeededColumnIDs(), ts.getNeededColumns());
{code}
In the case of count(1) or stats gathering, "sizeOfColumnsInTableScan" is zero. We need to find a way to distinguish these two cases.
For count(1), READ_ALL_COLUMNS should be set to false.
For stats gathering on an RCFile table, READ_ALL_COLUMNS should be set to true so that all columns are read and rawDataSize can be calculated.

> RawDataSize of RCFile is zero after analyze
> --------------------------------------------
>
>                 Key: HIVE-14143
>                 URL: https://issues.apache.org/jira/browse/HIVE-14143
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Nemon Lou
>            Assignee: Abhishek
>            Priority: Minor
>         Attachments: HIVE-14143.1.patch, HIVE-14143.patch
>
>
> After running the following analyze command, rawDataSize becomes zero for
> RCFile tables.
> {noformat}
> analyze table RCFILE_TABLE compute statistics ;
> {noformat}
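
To illustrate the ambiguity raised in the comment, here is a minimal standalone Java sketch. It does not use Hive classes: `ReadColumnsSketch`, `ScanPurpose`, `readAllBySizeOnly`, and `readAllWithPurpose` are hypothetical names invented for this example, and the rule applied to a non-empty id list is an assumption. It only shows that a size comparison gives the same answer for count(1) and for stats gathering, while an explicit purpose flag could tell them apart.

```java
import java.util.Collections;
import java.util.List;

public class ReadColumnsSketch {

    /** Hypothetical marker for why the table scan runs; not a Hive type. */
    enum ScanPurpose { QUERY, STATS_GATHER }

    /** Size-only check, mirroring "ids.size() != sizeOfColumnsInTableScan". */
    static boolean readAllBySizeOnly(List<Integer> ids, int sizeOfColumnsInTableScan) {
        return ids.size() != sizeOfColumnsInTableScan;
    }

    /**
     * Assumed alternative: when no column ids are needed, read all columns
     * only if the scan exists to gather statistics. For a non-empty id list
     * this sketch assumes only the listed columns are read.
     */
    static boolean readAllWithPurpose(List<Integer> ids, ScanPurpose purpose) {
        if (ids.isEmpty()) {
            return purpose == ScanPurpose.STATS_GATHER;
        }
        return false;
    }

    public static void main(String[] args) {
        // Both count(1) and stats gathering arrive with no needed column ids.
        List<Integer> noIds = Collections.emptyList();

        // The size-only check cannot separate the two cases.
        System.out.println(readAllBySizeOnly(noIds, 0)); // false

        // With an explicit purpose the two cases diverge.
        System.out.println(readAllWithPurpose(noIds, ScanPurpose.QUERY));        // false
        System.out.println(readAllWithPurpose(noIds, ScanPurpose.STATS_GATHER)); // true
    }
}
```

How such a purpose flag would actually reach the code that sets READ_ALL_COLUMNS is left open here; that is the part the patch still has to solve.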