Hi all,

I have a question about MapReduce. Suppose I have a set of small files (say 100), each typically 8-15 MB in size, that need to be processed in a single job. Each file gets 1 map task, so 100 map tasks will be launched for the 100 files. My question is about the number of reducers and total order partitioning. If I use 1 reducer, I get total ordering, since the job generates a single output file. But if there is more than 1 reducer, my questions are:
1- How many reducers should be used in this scenario to get the best performance?
2- If I set the number of reducers equal to the number of input files (here, 100 reducers for 100 input files), is that a good approach?
3- If 100 reducers are used, how can I achieve a global sort order, i.e. total ordering, across all output files?

Kindly share your thoughts.

Thanks

--
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>