Hi All,

I have a question about MapReduce. Suppose I have a set of small files (say
100), each typically 8-15 MB, that need to be processed in a single job. Each
file gets 1 map task, so 100 map tasks will be started for the 100 files. My
question is about the number of reducers and total order partitioning. If I
use 1 reducer, I get a total order because the job produces a single output
file. But if there is more than 1 reducer, my questions are:

1- How many reducers should be used in this scenario to get the best
performance?
2- Is it a good approach to set the number of reducers equal to the number of
input files, i.e. 100 reducers for 100 input files?
3- If 100 reducers are used, how can a global sort order (i.e. total
ordering) be achieved across their outputs?
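
To make question 3 concrete: as far as I understand, the usual way to keep a
global order with several reducers is range partitioning, where reducer i
receives only keys that sort before every key sent to reducer i+1 (Hadoop
ships TotalOrderPartitioner and InputSampler for this). Below is a minimal
sketch of that idea in plain Python, not Hadoop code. The function names
(choose_split_points, partition, total_order_sort), the sample size, and the
integer keys are all made up for illustration.

```python
import bisect
import random


def choose_split_points(sample, num_reducers):
    """Pick num_reducers - 1 cut points from a sample of the keys."""
    ordered = sorted(sample)
    if not ordered:
        return []
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def partition(key, split_points):
    """Return the index of the reducer whose key range contains key."""
    return bisect.bisect_right(split_points, key)


def total_order_sort(keys, num_reducers, sample_size=1000, seed=0):
    """Simulate a job: sample keys, range-partition, sort each partition."""
    rng = random.Random(seed)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    split_points = choose_split_points(sample, num_reducers)

    # One bucket per (hypothetical) reducer.
    buckets = [[] for _ in range(num_reducers)]
    for key in keys:
        buckets[partition(key, split_points)].append(key)

    # Each reducer sorts only its own key range.
    return [sorted(bucket) for bucket in buckets]


if __name__ == "__main__":
    rng = random.Random(42)
    keys = [rng.randrange(1_000_000) for _ in range(10_000)]
    outputs = total_order_sort(keys, num_reducers=100)
    # Reading the output files in reducer order gives the global order.
    merged = [k for part in outputs for k in part]
    print(merged == sorted(keys))
```

If this is right, the 100 output files only need to be read in reducer order,
with no final merge step. How evenly the work is spread depends on how well
the sample reflects the real key distribution.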

Kindly share your thoughts.
Thanks
-- 
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>
