Re: is it possible to concatenate output files under many reducers?

2011-05-12 Thread Jun Young Kim

Yes, setting the number of reducers is a general way to control the number of output files.

However, what if you need to control the number of output files dynamically?

If the output file name is 'A', 5 output files are needed.
If the output file name is 'B', 10 output files are needed.

Is that possible in Hadoop?

Junyoung Kim (juneng...@gmail.com)
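
For the per-name counts asked about above, one possible approach is a custom partitioner that reserves a block of reducers for each output name. The sketch below is plain Java with no Hadoop dependency; NamedRangePartitioner, reserve, and partitionFor are hypothetical names, and nothing in the replies confirms this approach. It assumes the output name is known when the key is partitioned and that each reducer writes a named output (for example via Hadoop's MultipleOutputs) only for the keys it receives. In a real job the same arithmetic would go inside Partitioner.getPartition, and the job would run with the total number of reserved reducers (15 here).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: maps (output name, key) to a reducer index. */
final class NamedRangePartitioner {
    // output name -> {first partition, number of partitions}
    private final Map<String, int[]> ranges = new LinkedHashMap<>();
    private int total = 0;

    /** Reserve `count` consecutive partitions for `name`. */
    NamedRangePartitioner reserve(String name, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        if (ranges.containsKey(name)) {
            throw new IllegalArgumentException("duplicate name: " + name);
        }
        ranges.put(name, new int[] {total, count});
        total += count;
        return this;
    }

    /** Total number of reducers the job would need. */
    int totalPartitions() {
        return total;
    }

    /** Pick a partition inside the block reserved for `name`, spread by key hash. */
    int partitionFor(String name, Object key) {
        int[] range = ranges.get(name);
        if (range == null) {
            throw new IllegalArgumentException("unknown output name: " + name);
        }
        int offset = (key.hashCode() & Integer.MAX_VALUE) % range[1];
        return range[0] + offset;
    }

    public static void main(String[] args) {
        NamedRangePartitioner p =
            new NamedRangePartitioner().reserve("A", 5).reserve("B", 10);
        System.out.println("reducers needed: " + p.totalPartitions());
        System.out.println("A/key1 -> reducer " + p.partitionFor("A", "key1"));
        System.out.println("B/key1 -> reducer " + p.partitionFor("B", "key1"));
    }
}
```

With this layout, keys for 'A' can only reach reducers 0 to 4 and keys for 'B' only reducers 5 to 14, so each name ends up with at most its reserved number of files.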


Re: is it possible to concatenate output files under many reducers?

2011-05-12 Thread Joey Echeverria
You can control the number of reducers by calling
job.setNumReduceTasks() before you launch it.

-Joey
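
As a rough illustration of why the reducer count fixes the file count: each reduce task writes one output file, and by default a key is sent to the reducer chosen by its hash modulo the number of reduce tasks. The sketch is plain Java without Hadoop on the classpath; ReducerCountDemo and its methods are made-up names, the "output-r-" prefix is borrowed from the file names in this thread, and the hash arithmetic mirrors what Hadoop's default HashPartitioner does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class ReducerCountDemo {
    /** Non-negative key hash modulo the reducer count, as in the default hash partitioner. */
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    /** File names one would expect for a given reducer count, one per reduce task. */
    static List<String> expectedOutputFiles(String baseName, int numReduceTasks) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            names.add(String.format(Locale.ROOT, "%s-r-%05d", baseName, i));
        }
        return names;
    }

    public static void main(String[] args) {
        // In a real driver, this is the value passed to job.setNumReduceTasks().
        int numReduceTasks = 10;
        List<String> files = expectedOutputFiles("output", numReduceTasks);
        System.out.println(files.size() + " files: " + files.get(0)
            + " .. " + files.get(files.size() - 1));
        System.out.println("key 'hadoop' -> reducer "
            + partition("hadoop", numReduceTasks));
    }
}
```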

-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434


is it possible to concatenate output files under many reducers?

2011-05-11 Thread Jun Young Kim

Hi all,

I have 60 reducers, each of which generates an output file with the same base name,

from output-r-00001 to output-r-00059.

In this situation, I want to control the number of output files.

For example, is it possible to concatenate all the output files into 10,

from output-r-00001 to output-r-00010?

Thanks

--
Junyoung Kim (juneng...@gmail.com)



Re: is it possible to concatenate output files under many reducers?

2011-05-11 Thread Harsh J
Short, blind answer: You could run 10 reducers.

Otherwise, you'll have to run another job in which each mapper picks up
a few files and merges them into one. But having 60 files shouldn't
really be a problem if they are sufficiently large (at least 80% of a
block size, perhaps; you can tune the number of reducers to achieve this).
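
A minimal local sketch of the merge step described above, assuming the 60 part files are grouped in name order and each group is concatenated into one file. It uses plain java.nio.file on a local temporary directory rather than HDFS or a real MapReduce job; PartFileMerger, group, and concatenate are hypothetical names, and a cluster version would need to do the equivalent reads and writes through the Hadoop file system API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class PartFileMerger {
    /** Split `files` into `groups` contiguous groups whose sizes differ by at most one. */
    static <T> List<List<T>> group(List<T> files, int groups) {
        if (groups <= 0) {
            throw new IllegalArgumentException("groups must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        int base = files.size() / groups;
        int extra = files.size() % groups;
        int start = 0;
        for (int g = 0; g < groups; g++) {
            int size = base + (g < extra ? 1 : 0);
            result.add(new ArrayList<>(files.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    /** Write the bytes of each input, in order, to `target`. */
    static void concatenate(List<Path> inputs, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path in : inputs) {
                Files.copy(in, out);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Create 60 small stand-in part files in a temporary directory.
        Path dir = Files.createTempDirectory("merge-demo");
        List<Path> parts = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            Path p = dir.resolve(String.format(Locale.ROOT, "output-r-%05d", i));
            Files.write(p, ("record " + i + "\n").getBytes(StandardCharsets.UTF_8));
            parts.add(p);
        }
        // Merge them into 10 files of 6 parts each.
        List<List<Path>> groups = group(parts, 10);
        for (int g = 0; g < groups.size(); g++) {
            Path target = dir.resolve(String.format(Locale.ROOT, "merged-%05d", g));
            concatenate(groups.get(g), target);
        }
        System.out.println("merged " + parts.size() + " files into "
            + groups.size() + " under " + dir);
    }
}
```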

-- 
Harsh J