[ 
https://issues.apache.org/jira/browse/HIVE-14143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15361156#comment-15361156
 ] 

Nemon Lou commented on HIVE-14143:
----------------------------------

[~pxiong] The "ids" passed in is just "sizeOfColumnsInTableScan" in many 
places, so "ids.size() != *sizeOfColumnsInTableScan" will always be false.
{code}
 ColumnProjectionUtils.appendReadColumns(
                  jobConf, ts.getNeededColumnIDs(), ts.getNeededColumns());
{code}
In the case of count(1) or stats gathering, "sizeOfColumnsInTableScan" is zero. 
We need to find a way to distinguish these two cases.
For count(1), READ_ALL_COLUMNS should be set to false.
For stats gathering on an RCFile table, READ_ALL_COLUMNS should be set to true 
in order to read all columns and then calculate rawDataSize.
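
A minimal sketch of the distinction, not the actual patch: it uses a plain Map in place of the JobConf, and the class name, the "gatherStats" flag, and both configuration keys are hypothetical placeholders rather than Hive's real names. It only shows that an empty column list alone cannot decide the setting, so a second signal is needed.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical stand-in for the projection setup; not Hive's ColumnProjectionUtils.
class ReadColumnsSketch {
    // Placeholder keys for illustration only, not Hive's real configuration names.
    static final String READ_ALL_COLUMNS = "sketch.read.all.columns";
    static final String READ_COLUMN_IDS = "sketch.read.column.ids";

    // gatherStats is an assumed extra signal supplied by the caller.
    static void configure(Map<String, String> conf, List<Integer> neededIds, boolean gatherStats) {
        if (neededIds.isEmpty() && gatherStats) {
            // Stats gathering: every column must be read to compute rawDataSize.
            conf.put(READ_ALL_COLUMNS, "true");
            conf.remove(READ_COLUMN_IDS);
            return;
        }
        // count(1) (empty list) or a normal projection: read only the listed columns.
        conf.put(READ_ALL_COLUMNS, "false");
        conf.put(READ_COLUMN_IDS,
                neededIds.stream().map(String::valueOf).collect(Collectors.joining(",")));
    }

    public static void main(String[] args) {
        Map<String, String> countConf = new HashMap<>();
        configure(countConf, Collections.emptyList(), false);  // count(1)
        Map<String, String> statsConf = new HashMap<>();
        configure(statsConf, Collections.emptyList(), true);   // analyze ... compute statistics
        Map<String, String> projConf = new HashMap<>();
        configure(projConf, Arrays.asList(0, 2), false);       // ordinary projection
        System.out.println(countConf);
        System.out.println(statsConf);
        System.out.println(projConf);
    }
}
```

Both the count(1) call and the stats call pass an empty list; only the extra flag separates them, which is the piece the current check is missing.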



> RawDataSize of RCFile is zero after analyze 
> --------------------------------------------
>
>                 Key: HIVE-14143
>                 URL: https://issues.apache.org/jira/browse/HIVE-14143
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Nemon Lou
>            Assignee: Abhishek
>            Priority: Minor
>         Attachments: HIVE-14143.1.patch, HIVE-14143.patch
>
>
> After running the following analyze command, rawDataSize becomes zero for 
> RCFile tables.
> {noformat}
>  analyze table RCFILE_TABLE compute statistics ;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
