On Tue, Jan 5, 2010 at 11:46 AM, Drew Farris <drew.far...@gmail.com> wrote:
> Have you seen any cases where a class hierarchy of Writables is
> established to do something like that? E.g. the mapreduce jobs are
> written to use VectorWritable, but subclasses (e.g.
> SparseVectorWritable) are available for specific needs?

Bah, never mind -- this is precisely what Mahout does today, just without
separating the Vector and Writable portions into two separate classes.
Serious brain lapse, that one.

Of course, this would probably be a very straightforward approach to
implement: simply separate out the Writable portion of each Vector
implementation into its own class. The Writable implementation to use
would be specified at runtime, and this would also determine which
underlying Vector implementation is used.

The price we pay for separating the Writable stuff from the Vectors is one
extra class that implements Writable for each Vector implementation. Since
the Writable (and thus the Vector implementation) to use is specified at
runtime via options, there's no need for an ugly switch statement
anywhere.

Theoretically one could even decouple the Writable (the serialization
style) from the in-memory implementation, but I don't know if there is any
need for that whatsoever.

Drew
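
For illustration, here is a minimal Java sketch of the split described above. It is not Mahout's actual code. The Writable interface is a local stand-in for Hadoop's org.apache.hadoop.io.Writable so that the example compiles on its own, and Vector, DenseVector, SparseVector, the two VectorWritable subclasses, newWritable and roundTrip are all hypothetical names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

final class WritableSketch {

  /** Local stand-in for Hadoop's Writable (assumption: same two methods). */
  interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  /** Hypothetical in-memory vector API that carries no serialization code. */
  interface Vector {
    int size();
    double get(int index);
  }

  static final class DenseVector implements Vector {
    private final double[] values;
    DenseVector(double[] values) { this.values = values.clone(); }
    public int size() { return values.length; }
    public double get(int index) { return values[index]; }
  }

  static final class SparseVector implements Vector {
    private final int size;
    private final Map<Integer, Double> nonZeros = new TreeMap<>();
    SparseVector(int size) { this.size = size; }
    void set(int index, double value) { nonZeros.put(index, value); }
    public int size() { return size; }
    public double get(int index) { return nonZeros.getOrDefault(index, 0.0); }
  }

  /** Jobs are written against this type; each subclass fixes a wire format
   *  and the Vector implementation that readFields produces. */
  abstract static class VectorWritable implements Writable {
    protected Vector vector;
    Vector get() { return vector; }
    void set(Vector vector) { this.vector = vector; }
  }

  static final class DenseVectorWritable extends VectorWritable {
    public void write(DataOutput out) throws IOException {
      out.writeInt(vector.size());
      for (int i = 0; i < vector.size(); i++) out.writeDouble(vector.get(i));
    }
    public void readFields(DataInput in) throws IOException {
      double[] values = new double[in.readInt()];
      for (int i = 0; i < values.length; i++) values[i] = in.readDouble();
      vector = new DenseVector(values);
    }
  }

  static final class SparseVectorWritable extends VectorWritable {
    public void write(DataOutput out) throws IOException {
      // Count the non-zero entries first so readFields knows how many pairs follow.
      int nonZero = 0;
      for (int i = 0; i < vector.size(); i++) if (vector.get(i) != 0.0) nonZero++;
      out.writeInt(vector.size());
      out.writeInt(nonZero);
      for (int i = 0; i < vector.size(); i++) {
        if (vector.get(i) != 0.0) { out.writeInt(i); out.writeDouble(vector.get(i)); }
      }
    }
    public void readFields(DataInput in) throws IOException {
      SparseVector v = new SparseVector(in.readInt());
      int nonZero = in.readInt();
      for (int k = 0; k < nonZero; k++) {
        int index = in.readInt();
        v.set(index, in.readDouble());
      }
      vector = v;
    }
  }

  /** Instantiates the Writable named by a runtime option; no switch statement. */
  static VectorWritable newWritable(String className) throws ReflectiveOperationException {
    return Class.forName(className).asSubclass(VectorWritable.class)
        .getDeclaredConstructor().newInstance();
  }

  /** Serializes a vector with the named Writable and reads it back with the same one. */
  static Vector roundTrip(String className, Vector input) throws Exception {
    VectorWritable writer = newWritable(className);
    writer.set(input);
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    writer.write(new DataOutputStream(bytes));
    VectorWritable reader = newWritable(className);
    reader.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
    return reader.get();
  }

  public static void main(String[] args) throws Exception {
    // The class name stands in for whatever option the job would read at runtime.
    String option = args.length > 0 ? args[0] : SparseVectorWritable.class.getName();
    Vector out = roundTrip(option, new DenseVector(new double[] {0.0, 2.5, 0.0, 4.0}));
    System.out.println(out.getClass().getSimpleName() + " of size " + out.size());
  }
}
```

Because SparseVectorWritable.write only goes through the Vector interface, a dense in-memory vector can be written in the sparse format, which is the kind of decoupling mentioned in the last paragraph. In a real job the class name would presumably come from a job option rather than a command-line argument.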