[ 
https://issues.apache.org/jira/browse/HIVE-1488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892426#action_12892426
 ] 

Ning Zhang commented on HIVE-1488:
----------------------------------

The use of MultiFileInputFormat (from Hadoop 0.19) was introduced in HIVE-1121
to implement CombineFileInputFormat-like functionality on pre-0.20 Hadoop
versions. However, MultiFileInputFormat does not support pooling (creating
separate splits for files in different directories). As a result,
CombineHiveInputFormat cannot be used to query multiple partitions, because
with CombineFileInputFormat each partition has to be in a different pool.
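
To make "pooling" concrete, here is a minimal sketch that groups file paths by
their parent directory, so that files from different partition directories
never end up in the same group. It uses plain strings and java.util only; the
class and method names are hypothetical and this is not the Hadoop or Hive
implementation, just an illustration of the grouping that
MultiFileInputFormat lacks. It assumes absolute, '/'-separated paths.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PoolingSketch {
    // Groups files by parent directory; each group stands in for one "pool".
    // Assumption: paths are absolute and use '/' as the separator.
    static Map<String, List<String>> poolByDirectory(List<String> files) {
        Map<String, List<String>> pools = new LinkedHashMap<>();
        for (String file : files) {
            int slash = file.lastIndexOf('/');
            String dir = slash > 0 ? file.substring(0, slash) : "/";
            pools.computeIfAbsent(dir, k -> new ArrayList<>()).add(file);
        }
        return pools;
    }

    public static void main(String[] args) {
        List<String> files = List.of(
            "/warehouse/t/ds=1/part-0",
            "/warehouse/t/ds=1/part-1",
            "/warehouse/t/ds=2/part-0");
        // Files under ds=1 and ds=2 land in separate groups.
        poolByDirectory(files).forEach(
            (dir, group) -> System.out.println(dir + " -> " + group));
    }
}
```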

Given this limitation, and since there is no easy way to fix it in Hive, I
think we should disable CombineHiveInputFormat on pre-0.20 Hadoop in strict
mode and give a warning to users in nonstrict mode.
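
A rough sketch of that check, to show the intended behavior rather than an
actual patch. All names here (CombineInputFormatGuard, Decision, decide) are
hypothetical, the Hadoop version is passed in as a plain "major.minor[.patch]"
string, and the strict flag is assumed to come from the Hive strict/nonstrict
setting. How "disable" is surfaced (error vs. fallback to another input
format) is left open.

```java
public class CombineInputFormatGuard {
    // Hypothetical outcome of the check; not an existing Hive type.
    enum Decision { ALLOW, WARN, REJECT }

    // True for versions such as "0.19.2"; assumes a "major.minor[...]" string.
    static boolean isPre020(String hadoopVersion) {
        String[] parts = hadoopVersion.split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major == 0 && minor < 20;
    }

    // Pre-0.20: reject in strict mode, warn in nonstrict mode. Otherwise allow.
    static Decision decide(String hadoopVersion, boolean strictMode) {
        if (!isPre020(hadoopVersion)) {
            return Decision.ALLOW;
        }
        return strictMode ? Decision.REJECT : Decision.WARN;
    }

    public static void main(String[] args) {
        System.out.println(decide("0.19.2", true));   // REJECT
        System.out.println(decide("0.19.2", false));  // WARN
        System.out.println(decide("0.20.1", true));   // ALLOW
    }
}
```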

For unit testing, we can exclude combine2.q (which tests CombineHiveInputFormat
across partitions) when running against Hadoop 0.19.

Any thoughts?

> CombineHiveInputFormat for hadoop-19 is broken
> ----------------------------------------------
>
>                 Key: HIVE-1488
>                 URL: https://issues.apache.org/jira/browse/HIVE-1488
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Joydeep Sen Sarma
>            Assignee: Ning Zhang
>
> I don't know if anyone is using it. After making some recent testing related 
> changes in HIVE-1408, combine[12].q are no longer working when testing 
> against 19. I have seen them fail earlier as well and not investigated. 
> Looking at the code, it seems pretty hokey:
> getInputPathsShim():
>       Path[] newPaths = new Path[paths.length];
>       // remove file:
>       for (int pos = 0; pos < paths.length; pos++) {
>         newPaths[pos] = new Path(paths[pos].toString().substring(5));
>       }
> Since we are no longer using the 'file:' namespace for the test warehouse,
> this is broken. But this would be broken against any hdfs instance, it would
> seem(?).
> Also not clear what we are trying to do here.
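
On the quoted getInputPathsShim() code: substring(5) only yields a sensible
result when the string starts with exactly "file:". A minimal sketch of a
scheme-independent alternative is below, using java.net.URI on plain strings
so that it runs without Hadoop. The class and method names are hypothetical,
and whether dropping the scheme is the right thing for the shim to do at all
is a separate question. (Hadoop's Path also exposes toUri(), so the same idea
should carry over; that mapping is an assumption, not a tested patch.)

```java
import java.net.URI;

public class PathSchemeSketch {
    // Returns only the path component, regardless of scheme or authority.
    // Assumption: the input is a syntactically valid URI (e.g. no raw spaces).
    static String stripScheme(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        String local = "file:/tmp/warehouse/t1";
        String hdfs = "hdfs://namenode:8020/user/hive/warehouse/t1";
        String bare = "/tmp/warehouse/t1";

        System.out.println(stripScheme(local)); // /tmp/warehouse/t1
        System.out.println(stripScheme(hdfs));  // /user/hive/warehouse/t1
        System.out.println(stripScheme(bare));  // /tmp/warehouse/t1

        // For comparison, the fixed-offset approach from the quoted code:
        System.out.println(hdfs.substring(5));  // //namenode:8020/user/hive/warehouse/t1
        System.out.println(bare.substring(5));  // warehouse/t1
    }
}
```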

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
