In Hadoop 0.18 and beyond, the key and value do not have to implement
Writable.
As a general rule, the key and value objects passed to the map task will be
the same object instances on each call, with fresh contents loaded by the
record reader. The output.collect method serializes the value during the
call (unless you are using ChainMapper from 0.19+), so you are free to
reset the contents of the key and value objects passed to output.collect
after the call returns.

It is common practice to keep class fields holding instances of the output
key and value types, which are reused for transformations instead of
allocating a new key or value instance on each call to map or reduce.
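A minimal self-contained sketch of why that reuse is safe, given the
eager-serialization behavior described above. `MutableText` and `Collector`
here are hypothetical stand-ins (not Hadoop classes) for a mutable
Writable-like value and an OutputCollector that copies during collect:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a mutable, Writable-like value type.
class MutableText {
    private String s = "";
    void set(String v) { s = v; }
    @Override public String toString() { return s; }
}

// Stand-in for OutputCollector: like output.collect, it "serializes"
// (here: copies to an immutable String) during the call, so the caller
// may safely overwrite the passed-in object afterwards.
class Collector {
    final List<String> collected = new ArrayList<>();
    void collect(MutableText value) {
        collected.add(value.toString()); // copy taken eagerly, inside the call
    }
}

public class ReuseDemo {
    public static void main(String[] args) {
        Collector out = new Collector();
        MutableText val = new MutableText(); // single reusable field-style instance
        for (String word : new String[] {"a", "b", "c"}) {
            val.set(word);    // overwrite previous contents, no new allocation
            out.collect(val); // safe: the copy was made inside collect
        }
        System.out.println(out.collected); // prints [a, b, c]
    }
}
```

If the collector instead stored the reference, every record would end up
pointing at the same final contents, which is exactly why this pattern only
works when collect serializes (or copies) during the call.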

On Tue, Jul 28, 2009 at 11:29 AM, Devajyoti Sarkar <[email protected]> wrote:

> Thanks.
>
> Dev
>
> On Wed, Jul 29, 2009 at 2:27 AM, Todd Lipcon <[email protected]> wrote:
>
> > On Tue, Jul 28, 2009 at 11:24 AM, Devajyoti Sarkar <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > In the Hadoop documentation it says that all key-value classes need
> > > to implement Writable to allow serialization and de-serialization of
> > > outputs between mappers and reducers. Is this also necessary for
> > > key/value pairs sent between the RecordReader and the Mapper (as well
> > > as the Reducer and the RecordWriter)? I assume that in each of these
> > > two cases, classes are instantiated in the same VM. So is it safe to
> > > assume that key/value pairs are sent by reference instead of
> > > serialization/deserialization? If so, my specific application may get
> > > a performance boost. Please do let me know if this is so.
> > >
> >
> > Yes, this is correct. The values that come out of RecordReaders and go
> > into RecordWriters do not need to implement Writable.
> >
> > -Todd
> >
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
