On Jun 3, 2010, at 12:36 PM, Sanjit Jhala wrote:

> I'm wondering why the Split class needs to extend FileSplit and also why the
> InputFormat needs to call FileInputFormat.getInputPaths(job) in getSplits. Is
> this because of legacy code that needs to be cleaned up or does it get used
> somewhere?

Both of these are due to legacy code needing cleanup (HIVE-1133). Currently, some of the InputFormat logic is based on physical paths where it should instead be based on logical data sources such as partitions. This is also why we are currently forced to create an empty directory in the file system corresponding to the name of each non-native table (HIVE-1222).

JVS
