[jira] [Commented] (HIVE-14154) Select from ORC table stored in VectorizedOrcInputFormat fails

Gopal V (JIRA) Fri, 01 Jul 2016 19:15:02 -0700

    [ https://issues.apache.org/jira/browse/HIVE-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15359927#comment-15359927 ]


Gopal V commented on HIVE-14154:
--------------------------------

This is a usability bug: VectorizedOrcInputFormat is not a complete, 
general-purpose input format.

A possible fix would be to disallow creating tables with it as the input 
format.

Creating tables "STORED AS ORC" should never set that input format; the planner, 
however, will inject it if the query can be vectorized.
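
As a rough illustration of the "disallow" idea, here is a minimal sketch of a 
check that could run when a table's input format is declared. It is not Hive 
code: the class InputFormatGuard and its methods are hypothetical names made up 
for this example, and where such a check would actually be wired into Hive's 
DDL handling is left open. Only the two input format class names come from Hive.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical guard, NOT part of Hive: rejects input formats that are only
// meant to be injected by the planner, never declared on a table.
public final class InputFormatGuard {

    // Class name taken from the issue report; the deny-list itself is an assumption.
    private static final Set<String> PLANNER_ONLY_FORMATS = Collections.singleton(
            "org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat");

    private InputFormatGuard() {
    }

    // Returns true if the class name may be used as a table-level input format.
    public static boolean isAllowedForTable(String inputFormatClassName) {
        return !PLANNER_ONLY_FORMATS.contains(inputFormatClassName);
    }

    // Throws if the class name is on the planner-only deny-list.
    public static void validate(String inputFormatClassName) {
        if (!isAllowedForTable(inputFormatClassName)) {
            throw new IllegalArgumentException(
                    inputFormatClassName + " cannot be used as a table input format;"
                    + " it is only valid when injected by the planner");
        }
    }

    public static void main(String[] args) {
        String plain = "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat";
        String vectorized = "org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat";
        System.out.println(plain + " allowed: " + isAllowedForTable(plain));
        System.out.println(vectorized + " allowed: " + isAllowedForTable(vectorized));
    }
}
```

A deny-list keeps the sketch small; a real fix might instead mark such formats 
in some other way, which this example does not attempt to model.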

> Select from ORC table stored in VectorizedOrcInputFormat fails
> --------------------------------------------------------------
>
>                 Key: HIVE-14154
>                 URL: https://issues.apache.org/jira/browse/HIVE-14154
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Zhu Li
>
> TABLE INFORMATION:
> # col_name                    data_type               comment
> c_1m_l                bigint                                      
> c_10_l                bigint                                      
> c_100_l               bigint                                      
> c_1k_l                bigint                                      
> c_10k_l               bigint                                      
> c_100k_l              bigint                                      
> c_1k_r                bigint                                      
> c_1m_r                bigint                                      
> s_5000_r              char(40)                                    
> s_1m_r                char(40)                                    
>                
> # Detailed Table Information           
> Database:             default                  
> Owner:                hdfs                     
> CreateTime:           Fri Jul 01 17:32:23 PDT 2016     
> LastAccessTime:       UNKNOWN                  
> Protect Mode:         None                     
> Retention:            0                        
> Location:             hdfs://mynamenode:8020/apps/hive/warehouse/t_1m_2_orc_vec
> Table Type:           MANAGED_TABLE            
> Table Parameters:              
>       COLUMN_STATS_ACCURATE   true                
>       numFiles                8                   
>       numRows                 10000000            
>       rawDataSize             3120000000          
>       totalSize               105254801           
>       transient_lastDdlTime   1467419689   
> Storage Information            
> SerDe Library:      org.apache.hadoop.hive.ql.io.orc.OrcSerde  
> InputFormat:        org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat
> OutputFormat:      org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat    
> Compressed:           No                       
> Num Buckets:          -1                       
> Bucket Columns:     []                         
> Sort Columns:         []                       
> Storage Desc Params:           
>       serialization.format    1 
> QUERY: select * from t_1m_2_orc_vec limit 10;
> ERROR:
> Failed with exception java.io.IOException:java.lang.RuntimeException: java.lang.RuntimeException: Failed to load plan: null: java.lang.NullPointerException
> Additional information:
> The input format is set to 
> org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.
> When I try to read from this table with the HCatalog API, I get the following 
> error:
> Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Failed to load plan: null: java.lang.NullPointerException
>       at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:76)
>       at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:156)
>       at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createBaseRecordReader(HCatRecordReader.java:116)
>       at org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:91)
>       at OrcConnector.main(OrcConnector.java:254)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: java.lang.RuntimeException: Failed to load plan: null: java.lang.NullPointerException
>       at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:461)
>       at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:300)
>       at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx.init(VectorizedRowBatchCtx.java:171)
>       at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:74)
>       ... 9 more
> Caused by: java.lang.NullPointerException
>       at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
>       ... 12 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
