On 9/13/10 8:36 AM, Sean Owen wrote:
Well, that goes down another interesting road. I think we have all
enjoyed the idea of keeping "decoration" out of the core Vector
implementation. Vector and its subclasses represent only different
ways of storing elements and values.
Notions like name (and ideally, labels) are farmed out to a decorator
like NamedVector.
This is all wonderful in the object-oriented world. The language and
object layout in memory are most happy for you to treat a NamedVector
as just a Vector; the extra data in memory is irrelevant.
It gets tricky when trying to write Writables for all this, since when
reading a sequence of objects from a stream you can't somehow know a
priori that there's more data in there than you expect and what to
ignore, and how. You don't have a parallel hierarchy of Writables --
it doesn't work that way. Instead you need one factory
(VectorWritable) that has knowledge of the serialized form of all
these things.
(Well, initially we did just serialize, with each Vector, the name of
its corresponding Writable. That is a tidy solution indeed, but it is
a lot of overhead per vector. So that went away.)
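
To make the single-factory idea above concrete, here is a minimal sketch of
one codec that writes a header byte of flags and rebuilds the right decorator
on read. Every type in it (Vec, DenseVec, NamedVec, VecCodec, FLAG_NAMED) is a
hypothetical stand-in invented for illustration; it is not Mahout's actual
VectorWritable or its wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// All types here are hypothetical stand-ins, not Mahout's real classes.
interface Vec {
  int size();
  double get(int index);
}

// Plain storage: knows nothing about names.
final class DenseVec implements Vec {
  private final double[] values;
  DenseVec(double... values) { this.values = values.clone(); }
  public int size() { return values.length; }
  public double get(int index) { return values[index]; }
}

// Decorator: adds a name and delegates everything else.
final class NamedVec implements Vec {
  final Vec delegate;
  final String name;
  NamedVec(Vec delegate, String name) { this.delegate = delegate; this.name = name; }
  public int size() { return delegate.size(); }
  public double get(int index) { return delegate.get(index); }
}

// The single factory: it alone knows the serialized form of every variant.
final class VecCodec {
  // Assumed flag layout; a "weighted" decorator would claim another bit.
  static final int FLAG_NAMED = 0x01;

  static void write(Vec v, DataOutput out) throws IOException {
    boolean named = v instanceof NamedVec;
    out.writeByte(named ? FLAG_NAMED : 0);      // header tells the reader what follows
    out.writeInt(v.size());
    for (int i = 0; i < v.size(); i++) {
      out.writeDouble(v.get(i));
    }
    if (named) {
      out.writeUTF(((NamedVec) v).name);        // extra data only when flagged
    }
  }

  static Vec read(DataInput in) throws IOException {
    int flags = in.readByte();
    double[] values = new double[in.readInt()];
    for (int i = 0; i < values.length; i++) {
      values[i] = in.readDouble();
    }
    Vec v = new DenseVec(values);
    if ((flags & FLAG_NAMED) != 0) {
      v = new NamedVec(v, in.readUTF());        // re-wrap in the decorator
    }
    return v;
  }

  // Convenience wrappers for in-memory round trips.
  static byte[] toBytes(Vec v) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      write(v, new DataOutputStream(buf));
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static Vec fromBytes(byte[] bytes) {
    try {
      return read(new DataInputStream(new ByteArrayInputStream(bytes)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The cost is one header byte per vector instead of a class name, at the price
of the codec needing to know about every decorator.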
Back to Robin's point:
If there is a need for such a thing as a "weighted vector", then I
suggest that instead of injecting a field in Vector, it become another
decorator class. Likewise, labels should really be handled this way.
Yes, then VectorWritable needs another header bit for "weighted" and
needs to reconstruct the vector appropriately. It starts to get messy,
but works.
My original question was: do we need a "weighted vector" entity? Or is
this only used in a context where one needs to serialize "a vector,
and a weight too"? In the latter case, fine, easy: it should simply
compose rather than extend VectorWritable, IMHO.
That's just what WeightedVectorWritable did prior to r996363 yesterday.
Should we back out that commit?
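
For reference, the general shape of "compose rather than extend" is roughly
the following. This is only a sketch with hypothetical stand-in classes
(SimpleVectorWritable, WeightedVectorRecord) and an assumed field order; it is
not the pre-r996363 code and does not use the real Hadoop Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a vector serializer; not Mahout's VectorWritable.
final class SimpleVectorWritable {
  double[] values = new double[0];

  void write(DataOutput out) throws IOException {
    out.writeInt(values.length);
    for (double v : values) {
      out.writeDouble(v);
    }
  }

  void readFields(DataInput in) throws IOException {
    values = new double[in.readInt()];
    for (int i = 0; i < values.length; i++) {
      values[i] = in.readDouble();
    }
  }
}

// Holds a weight plus a vector serializer. It does not extend the vector
// serializer, so the vector format never needs to know weights exist.
final class WeightedVectorRecord {
  double weight;
  final SimpleVectorWritable vector = new SimpleVectorWritable();

  void write(DataOutput out) throws IOException {
    out.writeDouble(weight);   // our own field first (assumed order)
    vector.write(out);         // then delegate the vector part
  }

  void readFields(DataInput in) throws IOException {
    weight = in.readDouble();
    vector.readFields(in);
  }

  // Convenience wrappers for in-memory round trips.
  byte[] toBytes() {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      write(new DataOutputStream(buf));
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static WeightedVectorRecord fromBytes(byte[] bytes) {
    try {
      WeightedVectorRecord record = new WeightedVectorRecord();
      record.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
      return record;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this shape no header bit is needed for the weight, since only code that
explicitly asks for a weighted record ever reads one.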