Sorry, but I still don't get it.
I have only one reducer, so I produce one output file. In that reducer, I
have a standard line
output.collect(key, values.next());
Each values.next() holds the contents of one file, and I would like to write
all of these into one zip output.
If I do as suggested and create
ZipOutputStream zos = new ZipOutputStream(fs.create(new Path("Output.zip")));
how do I use this zos in place of output?
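My best guess is that the zip stream replaces output.collect() entirely: each
file becomes one zip entry and nothing goes to the collector. Here is a sketch
of that reading, written against plain java.util.zip so it runs without a
cluster. The class name ZipReducerSketch, the helper names and the sample file
names are my own placeholders, and the Map stands in for the reducer's values.
In a real reducer the sink would presumably be the stream that fs.create()
returns. Is that the idea?

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical stand-in for the reducer body; not Jason's original code.
public class ZipReducerSketch {

    // Writes each (name, bytes) pair as one entry of a single zip archive.
    // Assumption: in Hadoop, `sink` would be the stream from fs.create(...).
    public static void writeZip(Map<String, byte[]> files, OutputStream sink)
            throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(sink)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zos.putNextEntry(new ZipEntry(file.getKey())); // start entry
                zos.write(file.getValue());                    // file contents
                zos.closeEntry();                              // finish entry
            }
        } // closing zos writes the zip directory and closes the sink
    }

    // Reads the archive back and returns the entry names in stored order.
    public static List<String> listEntries(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            for (ZipEntry e = zis.getNextEntry(); e != null; e = zis.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Sample data standing in for the values seen by the reducer.
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("a.bin", new byte[] {1, 2, 3});
        files.put("b.txt", "hello".getBytes(StandardCharsets.UTF_8));

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeZip(files, sink);
        System.out.println(listEntries(sink.toByteArray())); // prints [a.bin, b.txt]
    }
}
```

If that is right, then the regular output file of the job would simply stay
empty, and Output.zip would be the only real result.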
Thank you,
Mark
On Fri, Jul 24, 2009 at 9:02 AM, Jason Venner <[email protected]> wrote:
> I used to write zip files in my reducer. It was very fast, and pulling
> the files out of HDFS was also very fast.
>
> In part this is because each reducer might need to write 26k individual
> files; by writing them as a zip file, there was only 1 HDFS file.
> The job ran about 15x faster that way.
>
> I don't have the code handy any more, but it was something on the order of
> ZipOutputStream zos = new ZipOutputStream(fs.create(new Path("Output.zip")));
> where fs is a FileSystem object.
>
> On Thu, Jul 23, 2009 at 8:48 PM, Mark Kerzner <[email protected]>
> wrote:
>
> > Thank you, MultipleOutputFormat is sufficient.
> > Mark
> >
> > On Thu, Jul 23, 2009 at 12:24 AM, Amogh Vasekar <[email protected]>
> > wrote:
> >
> > > Does MultipleOutputFormat suffice?
> > >
> > > Cheers!
> > > Amogh
> > >
> > > -----Original Message-----
> > > From: Mark Kerzner [mailto:[email protected]]
> > > Sent: Thursday, July 23, 2009 6:24 AM
> > > To: [email protected]
> > > Subject: Output of a Reducer as a zip file?
> > >
> > > Hi,
> > > my output consists of a number of binary files, corresponding text
> > > files, and one descriptor file. Is there a way for my reducer to
> > > produce a zip of all binary files, another zip of all text ones, and a
> > > separate text descriptor? If not, how close to this can I get? For
> > > example, I could encode the binary and the text into one text line of
> > > an output file, but then I would need some additional processing.
> > >
> > > Thank you,
> > > Mark
> > >
> >
>
>
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals
>