Jimmy Lin wrote:
Hi everyone,

I'm wondering if it's possible to lazily deserialize a Writable.  That is,
when my custom Writable is handed a DataInput from readFields, can I
simply hang on to the reference and read from it later?  This would be
useful if the Writable is a complex data structure that may be expensive
to deserialize, so I'd only want to do it on-demand.  Or does the runtime
mutate the underlying stream, leaving the Writable with a reference to
something completely different later?

I'm wondering about both present behavior, and the implicit contract
provided by the Hadoop API.

The implicit contract is that, within readFields(), you consume all the bytes that you'll ever consume from this DataInput. The same DataInput is then passed to other Writables so that they can read their fields. If you don't advance the DataInput far enough to consume all the bytes belonging to your Writable, the next record won't be read in properly, and things will start crashing.
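
Deferring the expensive part is still possible within that contract: copy the raw bytes out of the DataInput in readFields(), and decode them only when the value is first requested. The following is a minimal sketch, not a definitive implementation. LazyListWritable, its set()/get() methods and the length-prefixed layout are hypothetical names and choices made for illustration; only the java.io calls are real. To stay runnable without Hadoop on the classpath the class does not declare "implements org.apache.hadoop.io.Writable", but its write() and readFields() methods have the shape that interface expects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class. In a real job it would declare
// "implements org.apache.hadoop.io.Writable".
class LazyListWritable {
    private byte[] raw = new byte[0];  // serialized payload, copied eagerly
    private List<String> decoded;      // built on first access only

    // Serialize the values into the raw buffer (assumed layout: count, then UTF strings).
    public void set(List<String> values) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(values.size());
        for (String v : values) {
            out.writeUTF(v);
        }
        out.flush();
        raw = buf.toByteArray();
        decoded = null;
    }

    // Write a length prefix so the reader knows how many bytes belong to this record.
    public void write(DataOutput out) throws IOException {
        out.writeInt(raw.length);
        out.write(raw);
    }

    // Consume every byte of this record now; do NOT keep a reference to 'in'.
    public void readFields(DataInput in) throws IOException {
        int len = in.readInt();
        raw = new byte[len];
        in.readFully(raw);
        decoded = null;  // discard state left over from a previous record
    }

    // The expensive decoding happens here, on demand, from the private copy.
    public List<String> get() throws IOException {
        if (decoded == null) {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
            int n = in.readInt();
            List<String> result = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                result.add(in.readUTF());
            }
            decoded = result;
        }
        return decoded;
    }
}
```

The cost of this approach is one byte-array copy per record. In exchange the stream is left positioned at the start of the next record, and the decoding work is skipped entirely for records whose contents are never inspected.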


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
