Hi David,

What file format and compression type are you using?
Mathieu

On 25 Jan 2013, at 07:16, David Morel <dmore...@gmail.com> wrote:

> Hello,
>
> I have seen many posts on various sites and mailing lists, but didn't find
> a firm answer anywhere: is it possible, yes or no, to force a smaller split
> size than a block on the mappers, from the client side? I'm not after
> pointers to the docs (unless you're very, very sure :-) but after real-life
> experience along the lines of 'yes, it works this way, I've done it like
> this...'
>
> All the parameters that I could find (especially specifying a max input
> split size) seem to have no effect, and the files that I have are so
> heavily compressed that they completely saturate the mappers' memory when
> processed.
>
> A solution I could imagine for this specific issue is reducing the block
> size, but for now I simply went with disabling in-file compression for
> those files. Changing the block size on a per-file basis is something I'd
> like to avoid if at all possible.
>
> All the Hive settings that we tried only got me as far as raising the
> number of mappers from 5 to 6 (yay!), where I would have needed at least
> ten times more.
>
> Thanks!
>
> D.Morel
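For reference, the "max input split size" parameters usually meant in this
context are the FileInputFormat split-size properties. A minimal sketch of
how they are typically set from the Hive client side, assuming a Hadoop 1.x /
early-2013 Hive setup; the 32 MB value below is only an example, not a
recommendation:

    -- Hive CLI session; old-API property names
    set mapred.max.split.size=33554432;
    set mapred.min.split.size=1;
    -- equivalent new-API (mapreduce.*) property names
    set mapreduce.input.fileinputformat.split.maxsize=33554432;
    set mapreduce.input.fileinputformat.split.minsize=1;

Whether these actually yield sub-block splits depends on the input format in
use (hive.input.format, e.g. HiveInputFormat vs. CombineHiveInputFormat) and,
crucially, on whether the compression codec is splittable: a gzip-compressed
file is never splittable and always becomes a single split regardless of
these settings, whereas bzip2 and block-compressed SequenceFiles can be
split. Hence the question above about file format and compression type.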