Background: I'm trying to trace the details of how Hive creates multi-file 
splits. My understanding is that MapReduce's CombineFileInputFormat does 
the main work of combining files and, specifically, that if no overrides are 
set, the target split size defaults to dfs.block.size.

However, I cannot see how the value of dfs.block.size finds its way into 
CombineFileInputFormat. I'm probably missing something obvious, but I'd 
appreciate someone pointing it out!

thanks,
Mike.
