Actually, I want the output to be usable by other modules. So do those modules have to read the output from HDFS files? Or should I integrate them into MapReduce? Are there other ways?
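
For reference, reading the job output back from HDFS in another module looks roughly like the sketch below. It assumes the job wrote plain text (TextOutputFormat) under a hypothetical output directory /user/liu/output; the path and the class name OutputReader are made up for the example.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OutputReader {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Each reducer writes one part-r-NNNNN file under the job's output directory.
        FileStatus[] parts = fs.globStatus(new Path("/user/liu/output/part-r-*"));
        if (parts == null) {
            return;  // hypothetical path, nothing to read
        }
        for (FileStatus part : parts) {
            BufferedReader reader =
                new BufferedReader(new InputStreamReader(fs.open(part.getPath())));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    // TextOutputFormat separates key and value with a tab.
                    System.out.println(line);
                }
            } finally {
                reader.close();
            }
        }
    }
}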

--------------------------------------------------
From: "Jeff Zhang" <[email protected]>
Sent: Friday, November 27, 2009 10:00 PM
To: <[email protected]>
Subject: Re: Store mapreduce output into my own data structures

Hi Liu,

Why do you want to store the output in memory? You cannot use the output
outside of the reducer.
Actually, the output of the reducer starts out in memory; the OutputFormat
then writes that data to the file system or to another data store.
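
For illustration (this is not code from the thread), a custom OutputFormat for the new org.apache.hadoop.mapreduce API could look like the sketch below. Its RecordWriter puts the pairs into a static in-memory map just to show where the data ends up; the class and field names are made up, and the important caveat is that this map lives inside each reduce task's JVM, not in the driver program, so the driver still cannot read it. To make the output usable elsewhere, write() would have to send the data to HDFS or to an external store instead.

import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.OutputCommitter;
import org.apache.hadoop.mapreduce.OutputFormat;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class InMemoryOutputFormat<K, V> extends OutputFormat<K, V> {

    // Lives only inside the task's JVM; a real implementation would write to
    // an external store (HBase, a database, ...) here instead.
    private static final Map<Object, Object> STORE =
        new ConcurrentHashMap<Object, Object>();

    @Override
    public RecordWriter<K, V> getRecordWriter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        return new RecordWriter<K, V>() {
            @Override
            public void write(K key, V value) {
                STORE.put(key, value);  // replace with a call to your data store
            }
            @Override
            public void close(TaskAttemptContext ctx) {
                // flush and close connections to the external store here
            }
        };
    }

    @Override
    public void checkOutputSpecs(JobContext context) {
        // nothing to validate in this sketch
    }

    @Override
    public OutputCommitter getOutputCommitter(TaskAttemptContext context)
            throws IOException, InterruptedException {
        // reuse the no-op committer that NullOutputFormat provides
        return new NullOutputFormat<K, V>().getOutputCommitter(context);
    }
}

With a class like this, FileOutputFormat.setOutputPath() is no longer needed; the job would simply call job.setOutputFormatClass(InMemoryOutputFormat.class).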


Jeff Zhang



2009/11/27 Liu Xianglong <[email protected]>

Hi, everyone. Has anyone used MapReduce to store the reduce output in
memory? I mean, currently the job's output path is set and the reduce
outputs are stored in files under that path (see the comments in the
following code):
    job.setOutputFormatClass(MyOutputFormat.class);
    // Can I implement my own OutputFormat to store these output key-value
    // pairs in my data structures, or are there other ways to do it?
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Result.class);
    FileOutputFormat.setOutputPath(job, outputDir);

Is there any way to store them in variables or data structures? If so, how can I implement my OutputFormat? Any suggestions and code are welcome.

Another question: is there a way to set the number of map tasks? There seems
to be no API for this in Hadoop's new Job API, and I am not sure how to set
this number.
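
For illustration only (not an answer from the thread): in the new API the number of map tasks is derived from the number of input splits produced by the InputFormat, so it is usually influenced through the split size rather than set directly. A minimal sketch, with a hypothetical class and job name:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SplitSizeExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "split-size-example");  // hypothetical job name

        // The number of map tasks equals the number of input splits, so it is
        // steered through split sizes rather than set directly: smaller splits
        // mean more map tasks, larger splits mean fewer.
        FileInputFormat.setMinInputSplitSize(job, 64L * 1024 * 1024);   // 64 MB
        FileInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024);  // 128 MB
    }
}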

Thanks!

Best Wishes!
_____________________________________________________________

刘祥龙  Liu Xianglong

