[jira] [Commented] (HIVE-5272) Column statistics on a invalid column name results in IndexOutOfBoundsException

Shreepadma Venugopalan (JIRA) Wed, 11 Sep 2013 12:36:25 -0700

    [ https://issues.apache.org/jira/browse/HIVE-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764658#comment-13764658 ]


Shreepadma Venugopalan commented on HIVE-5272:
----------------------------------------------

Thanks, Prasanth. The code in question incorrectly assumes that the 
validation done later by the SemanticAnalyzer is sufficient to raise an invalid 
column error. However, it looks like the IndexOutOfBoundsException occurs 
before that validation runs. I think we can either fix the if condition in 
getTableColumnType() or, alternatively, perform the validation early. One of 
the reasons for deferring the validation was to piggyback on the existing 
logic later during semantic analysis and avoid duplicating work. But the patch 
you have put together looks simple enough. 
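
For illustration, the "validate early" option could look roughly like the 
sketch below. It is not the attached patch and it does not use Hive's classes: 
EarlyColumnValidation, resolveTypes() and the name-to-type map are 
hypothetical stand-ins, and IllegalArgumentException stands in for whatever 
error the analyzer would actually raise for an invalid column reference.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class EarlyColumnValidation {

    // Hypothetical helper: resolves each requested column name to its type and
    // fails fast on the first name that is not a column of the table.
    static List<String> resolveTypes(List<String> colNames,
                                     Map<String, String> tableColTypes) {
        // Index the table columns by lower-cased name for case-insensitive lookup.
        Map<String, String> byLowerName = new HashMap<>();
        for (Map.Entry<String, String> e : tableColTypes.entrySet()) {
            byLowerName.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }

        List<String> colTypes = new ArrayList<>(colNames.size());
        for (String colName : colNames) {
            String type = byLowerName.get(colName.toLowerCase(Locale.ROOT));
            if (type == null) {
                // Stand-in for an invalid column reference error.
                throw new IllegalArgumentException(
                    "Invalid column reference: " + colName);
            }
            // Appending keeps colTypes aligned with colNames without add(index, ...).
            colTypes.add(type);
        }
        return colTypes;
    }
}
```

Because every name either resolves or throws, the position of the invalid 
name in the list no longer matters.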
                
> Column statistics on a invalid column name results in 
> IndexOutOfBoundsException
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-5272
>                 URL: https://issues.apache.org/jira/browse/HIVE-5272
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: statistics
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5272.txt
>
>
> When an invalid column name is specified for column statistics, an 
> IndexOutOfBoundsException is thrown. 
> {code}hive> analyze table customer_staging compute statistics for columns 
> c_first_name, invalid_name, c_customer_sk;
> FAILED: IndexOutOfBoundsException Index: 2, Size: 1{code}
> If the invalid column name appears first or last, then 
> INVALID_COLUMN_REFERENCE is thrown at the query planning stage. But if the 
> invalid column name appears somewhere in the middle of the column list, then 
> an IndexOutOfBoundsException is thrown at the semantic analysis step. The 
> problem is with the getTableColumnType() and getPartitionColumnType() 
> methods. The following segment 
> {code}    for (int i=0; i <numCols; i++) {
>       colName = colNames.get(i);
>       for (FieldSchema col: cols) {
>         if (colName.equalsIgnoreCase(col.getName())) {
>           colTypes.add(i, new String(col.getType()));
>         }
>       }
>     }{code}
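> is the reason for it. If the invalid column name appears in the middle of 
> the column list, no equalsIgnoreCase() match is found for it, so nothing is 
> added to colTypes at that index, yet i is still incremented. The next valid 
> column then calls colTypes.add(i, ...) with an index greater than the current 
> size of the list, which results in the exception. 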
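
The failure described in the quoted report can be reproduced outside Hive 
with a minimal sketch. This is not Hive code: ColumnTypeIndexDemo and 
lookupTypesBuggy() are hypothetical names, and {name, type} string pairs stand 
in for FieldSchema. Only the loop structure follows the quoted segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ColumnTypeIndexDemo {

    // Mirrors the quoted loop. Each entry of tableCols is a {name, type} pair,
    // a hypothetical stand-in for FieldSchema.
    static List<String> lookupTypesBuggy(List<String> colNames,
                                         List<String[]> tableCols) {
        List<String> colTypes = new ArrayList<>();
        for (int i = 0; i < colNames.size(); i++) {
            String colName = colNames.get(i);
            for (String[] col : tableCols) {
                if (colName.equalsIgnoreCase(col[0])) {
                    // List.add(index, e) requires index <= size(). A name that
                    // matched nothing leaves a gap, so a later add can fail.
                    colTypes.add(i, col[1]);
                }
            }
        }
        return colTypes;
    }

    public static void main(String[] args) {
        List<String[]> tableCols = Arrays.asList(
            new String[] {"c_first_name", "string"},
            new String[] {"c_customer_sk", "int"});
        List<String> requested =
            Arrays.asList("c_first_name", "invalid_name", "c_customer_sk");
        try {
            lookupTypesBuggy(requested, tableCols);
        } catch (IndexOutOfBoundsException e) {
            // Reached when the third name is inserted at index 2 into a list of size 1.
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

With the three names from the report, the first name fills index 0, the 
invalid name adds nothing, and the third name is then inserted at index 2 
while the list holds a single element.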

