Re: Reducer MapFileOutputFormat

2012-07-27 Thread Harsh J
Hi Bertrand,
I believe he is talking about MapFile's index files, explained here: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/MapFile.html
On Fri, Jul 27, 2012 at 11:24 AM, Bertrand Dechoux wrote:
> Your use of 'index' is indeed not clear. Are you talking about Hive or

Re: Reducer MapFileOutputFormat

2012-07-27 Thread Harsh J
Hey Mike,
Inline.
On Tue, Jul 24, 2012 at 1:39 AM, Mike S wrote:
> If I set my reducer output to map file output format and the job would
> say have 100 reducers, will the output generate 100 different index
> file (one for each reducer) or one index file for all the reducers
> (basically one in

Re: Reducer MapFileOutputFormat

2012-07-26 Thread Bertrand Dechoux
Your use of 'index' is indeed not clear. Are you talking about Hive or HBase? I can confirm that you will have one result file per reducer. Of course, for efficiency reasons, you need to limit the number of files. But if you are using multiple reducers it should mean that one reducer isn't fast en

Re: Reducer MapFileOutputFormat

2012-07-26 Thread syed kather
Mike, can you please give more details? The context is not clear. Can you share your use case, if possible?
On Jul 24, 2012 1:40 AM, "Mike S" wrote:
> If I set my reducer output to map file output format and the job would
> say have 100 reducers, will the output generate 100 different index
> file (on

Reducer MapFileOutputFormat

2012-07-23 Thread Mike S
If I set my reducer output to the map file output format and the job has, say, 100 reducers, will the output generate 100 different index files (one for each reducer) or one index file for all the reducers (basically one index file per job)? If it is one index file per reducer, can I rely on HDFS ap