Hello guys, I am using hadoop-0.20.2-cdh3u0 with MultipleOutputs to split the HFiles produced by my MR job, so that each file fits into a single region of the table I am going to bulk load them into.
Therefore I have one output per region, which gives me 280 different outputs in total. I just noticed that with this many outputs the job runs a lot slower than it does with a single output. Does anyone know what is going wrong? Has anyone seen the same thing? Thank you! Panagiotis
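
P.S. To make the setup concrete, here is a simplified sketch of the routing idea, not the actual job code. It only shows how a row key could be mapped to one output per region. The class name RegionRouter, the split keys and the "region<N>" output names are all hypothetical, and it compares keys as Java Strings, whereas HBase orders row keys as raw bytes, so treat it as an illustration under those assumptions.

```java
import java.util.Arrays;

// Hypothetical helper: maps a row key to the index of the region it falls in.
class RegionRouter {

    // sortedSplitKeys holds the start keys of regions 1..N in ascending order
    // (region 0 starts at the empty key, so it has no entry here).
    static int regionIndex(String[] sortedSplitKeys, String rowKey) {
        int pos = Arrays.binarySearch(sortedSplitKeys, rowKey);
        // Exact hit: rowKey equals the start key of region pos + 1.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the
        // insertion point is exactly the region index.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    // Hypothetical naming scheme for the per-region output.
    static String outputName(int regionIndex) {
        return "region" + regionIndex;
    }

    public static void main(String[] args) {
        String[] splits = {"g", "n", "t"}; // assumed split keys, 4 regions
        for (String row : new String[] {"apple", "g", "melon", "zebra"}) {
            System.out.println(row + " -> " + outputName(regionIndex(splits, row)));
        }
    }
}
```

In the real job the name returned by outputName would be the one passed to MultipleOutputs for each record, with 280 regions instead of the 4 assumed here.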