On Tue, Jan 5, 2010 at 11:46 AM, Drew Farris <drew.far...@gmail.com> wrote:

>
> Have you seen any cases where a class hierarchy of Writables is
> established to do something like that? E.g., the mapreduce jobs are
> written to use VectorWritable, but subclasses (e.g.,
> SparseVectorWritable) are available for specific needs?
>
>
Bah, never mind -- this is precisely what Mahout does today without
separating the Vector and Writable portions into two separate classes.
Serious brain lapse that one.

Of course, this would probably be a very straightforward approach to
implement: simply separate out the Writable portions of each Vector
implementation into its own class. The Writable implementation to use would
be specified at runtime, and this would also determine which underlying
Vector implementation is used. The price we pay for separating the Writable
stuff from the Vectors is one extra class that implements Writable for each
Vector implementation. Since the Writable (and thus the implementation) to
use is specified at runtime via options, there's no need for an ugly switch
statement anywhere.
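
A minimal sketch of that idea, under loud assumptions: nothing here is
Mahout's or Hadoop's actual code. The Writable interface is a local stand-in
with the same two methods as Hadoop's, and Vector, DenseVector, SparseVector,
the two *VectorWritable subclasses, and VectorWritables.newInstance are all
hypothetical names made up for illustration. Jobs would be written against
the VectorWritable base type, and the class name would come from a job
option.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

// Local stand-in for Hadoop's Writable so the sketch compiles on its own.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical minimal vector API; the real Vector interface is far larger.
interface Vector {
    int size();
    double get(int index);
    void set(int index, double value);
}

class DenseVector implements Vector {
    private final double[] values;
    DenseVector(int size) { values = new double[size]; }
    public int size() { return values.length; }
    public double get(int i) { return values[i]; }
    public void set(int i, double v) { values[i] = v; }
}

class SparseVector implements Vector {
    private final int size;
    private final Map<Integer, Double> entries = new TreeMap<>();
    SparseVector(int size) { this.size = size; }
    public int size() { return size; }
    public double get(int i) { return entries.getOrDefault(i, 0.0); }
    public void set(int i, double v) {
        if (v == 0.0) entries.remove(i); else entries.put(i, v);
    }
}

// The type jobs are written against; holds no serialization logic itself.
abstract class VectorWritable implements Writable {
    protected Vector vector;
    Vector get() { return vector; }
    void set(Vector v) { vector = v; }
}

// Writes every element; reading produces a DenseVector.
class DenseVectorWritable extends VectorWritable {
    public void write(DataOutput out) throws IOException {
        out.writeInt(vector.size());
        for (int i = 0; i < vector.size(); i++) out.writeDouble(vector.get(i));
    }
    public void readFields(DataInput in) throws IOException {
        Vector v = new DenseVector(in.readInt());
        for (int i = 0; i < v.size(); i++) v.set(i, in.readDouble());
        vector = v;
    }
}

// Writes only non-zero (index, value) pairs; reading produces a SparseVector.
class SparseVectorWritable extends VectorWritable {
    public void write(DataOutput out) throws IOException {
        int nonZero = 0;
        for (int i = 0; i < vector.size(); i++) if (vector.get(i) != 0.0) nonZero++;
        out.writeInt(vector.size());
        out.writeInt(nonZero);
        for (int i = 0; i < vector.size(); i++) {
            if (vector.get(i) != 0.0) { out.writeInt(i); out.writeDouble(vector.get(i)); }
        }
    }
    public void readFields(DataInput in) throws IOException {
        Vector v = new SparseVector(in.readInt());
        int nonZero = in.readInt();
        for (int k = 0; k < nonZero; k++) v.set(in.readInt(), in.readDouble());
        vector = v;
    }
}

final class VectorWritables {
    private VectorWritables() {}

    // The class name would come from a runtime option, so no switch is needed.
    static VectorWritable newInstance(String className) throws ReflectiveOperationException {
        return Class.forName(className)
                .asSubclass(VectorWritable.class)
                .getDeclaredConstructor()
                .newInstance();
    }
}
```

With this layout, the choice of SparseVectorWritable versus
DenseVectorWritable is a single string passed to
VectorWritables.newInstance, and whichever class is chosen decides which
Vector implementation comes back out of readFields.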

Theoretically one could even decouple the writable (serialization style)
from the (in-memory) implementation, but I don't know if there is any need
for that whatsoever.

Drew
