I agree that VectorWritable should handle construction of all vector types
and it should understand how to do that.

BUT... there is one possible role for sub-classes of VectorWritable:
avoiding the cast that is otherwise necessary on the object that
VectorWritable produces.  A MumbleVectorWritable would delegate all
reading to VectorWritable but would cast the result to a MumbleVector
before returning it.  That cast would fail if the objects being read don't
sub-class MumbleVector, and user code would not need a cast of its own.
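
Roughly what I have in mind, as a self-contained sketch.  None of these
are the real Mahout classes: Vector, DenseVector, MumbleVector and
VectorWritable below are hypothetical stand-ins, and the one-byte type tag
is a made-up wire format just so the example runs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-ins, NOT the real Mahout types.
interface Vector { int size(); }

class DenseVector implements Vector {
    final double[] values;
    DenseVector(double[] values) { this.values = values; }
    public int size() { return values.length; }
}

// "Mumble" is just the placeholder name used in this message.
class MumbleVector extends DenseVector {
    MumbleVector(double[] values) { super(values); }
}

// Stand-in base class: it alone knows how to construct every vector type.
class VectorWritable {
    static final byte DENSE = 0;   // made-up type tags
    static final byte MUMBLE = 1;
    private Vector vector;

    public void readFields(DataInput in) throws IOException {
        byte type = in.readByte();
        int n = in.readInt();
        double[] values = new double[n];
        for (int i = 0; i < n; i++) values[i] = in.readDouble();
        vector = (type == MUMBLE) ? new MumbleVector(values) : new DenseVector(values);
    }

    public Vector get() { return vector; }
}

// Adds no serialized state, so files stay readable by plain VectorWritable.
// Reading is inherited unchanged; only the return type is narrowed.
class MumbleVectorWritable extends VectorWritable {
    @Override
    public MumbleVector get() {
        // Throws ClassCastException if the record was not a MumbleVector.
        return (MumbleVector) super.get();
    }
}

final class MumbleDemo {
    // Writes one record in the made-up format and reads it back.
    static MumbleVectorWritable roundTrip(byte type, double... values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(type);
            out.writeInt(values.length);
            for (double v : values) out.writeDouble(v);
            MumbleVectorWritable w = new MumbleVectorWritable();
            w.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return w;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this, user code can write
MumbleVector v = MumbleDemo.roundTrip(VectorWritable.MUMBLE, 1.0, 2.0).get();
with no cast, and a record tagged DENSE fails inside get() instead.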

That isn't a big deal, though, and I would be +epsilon for marking the
class final.  It might be more maintainable in the long run, since anybody
who hasn't heard this discussion and tried to sub-class VectorWritable
would be stopped by the compiler and would almost have to look at the
comment.

On Mon, Sep 13, 2010 at 8:36 AM, Sean Owen <[email protected]> wrote:

> No, and that's the issue, really. A file of MultiLableVectorWritable
> cannot be read by VectorWritable since the latter does not expect that
> extra data. It's not quite a Hadoop issue, but simply that the OO
> world's object representation in memory doesn't exactly translate to
> serializing to a stream neatly.
>
> Yes I would mark VectorWritable final.
>
