I think the key, as Ted says, is to wrap the vector in a VectorWritable at
every place where you want to emit a writable form of it.

In Scala terms, there are certainly two implicit conversions (a, ahem,
bijection in fact) between Vector and VectorWritable, via the get/set
encapsulation of the latter around the former.
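
A minimal sketch of that conversion pair, with tiny stand-ins for the Mahout
classes so it is self-contained (with Mahout on the classpath you would import
org.apache.mahout.math.Vector and org.apache.mahout.math.VectorWritable
instead):

```scala
// Stand-ins mimicking Mahout's get/set encapsulation; the real classes
// live in org.apache.mahout.math.
trait Vector { def size: Int }
case class DenseVector(values: Array[Double]) extends Vector {
  def size: Int = values.length
}
class VectorWritable(private var v: Vector) {
  def get: Vector = v
  def set(newV: Vector): Unit = { v = newV }
}

object VectorConversions {
  // Wrapping and unwrapping are inverses of each other, which is what
  // makes the pair a bijection rather than a lossy conversion.
  implicit def vectorToWritable(v: Vector): VectorWritable =
    new VectorWritable(v)
  implicit def writableToVector(vw: VectorWritable): Vector =
    vw.get
}
```

With these in scope, a Vector can be handed to anything expecting a
VectorWritable (and back) without explicit wrapping at each call site.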

On Saturday, January 12, 2013, Ted Dunning wrote:

> This might be more appropriate on the Mahout list.  I have copied that list
> in order to gain the largest audience for the answers.
>
> It is an absolute requirement in Mahout to have multiple vector
> implementations.  It is also a requirement that the math library not depend
> on Hadoop.
>
> A third absolute requirement in Mahout is that very simple Java programming
> suffice for working with Vectors of many types as well as Matrix values.
>
> In order to meet these requirements and allow the simplest form of
> map-reduce programming, we implemented a class VectorWritable which will
> wrap any kind of vector as a writable object.  You can retrieve the
> underlying vector from the VectorWritable and there is some discussion about
> making VW implement the Vector interface as well.
>
> If your code returns a VectorWritable, then Hadoop should be able to
> serialize it trivially.
>
> If your code returns a Vector, however, it will not natively be
> serializable.  It should be possible to inject a single registration into
> Kryo, however, that will understand how to serialize Vectors using the
> VectorWritable infrastructure.
>
>
> On Sat, Jan 12, 2013 at 11:49 PM, Koert Kuipers <[email protected]>
> wrote:
>
> > i would like to have some mahout vectors flow through a scalding job. i
> > thought at first that this should be easy since the mahout vector is a
> > writable, so if i put it in the tuple all will be fine. but then i
> > realized mahout did this thing where they split up the vector in a whole
> > bunch of classes and interfaces: they have the Vector interface,
> > implementations such as DenseVector and SequentialAccessSparseVector, and
> > then the class VectorWritable which takes a Vector and turns it into a
> > Writable. argh. so now if i have for example a DenseVector then i think
> > it will not get serialized as a Writable and then kryo will attempt to
> > serialize it instead, which is not what i want. any ideas for an elegant
> > solution (i wish a simple scala implicit conversion would do the trick!).
> > should i add a custom hadoop Serializer to catch these (seems ugly)?
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "cascading-user" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/cascading-user?hl=en.
> >
>
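
Ted's Kryo suggestion above would boil down to a serializer whose write/read
methods delegate the byte work to VectorWritable's Writable methods. A
self-contained sketch of that delegation, again with stand-ins for the Mahout
classes (the wire format here — a length prefix plus raw doubles — is
illustrative, not Mahout's actual one):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream,
  DataInput, DataInputStream, DataOutput, DataOutputStream}

// Stand-ins for the Mahout classes; real code would use
// org.apache.mahout.math.{Vector, VectorWritable}.
trait Vector { def values: Array[Double] }
case class DenseVector(values: Array[Double]) extends Vector

class VectorWritable(private var v: Vector = null) {
  def get: Vector = v
  // Writable-style serialization: length prefix, then the entries.
  def write(out: DataOutput): Unit = {
    out.writeInt(v.values.length)
    v.values.foreach(out.writeDouble)
  }
  def readFields(in: DataInput): Unit =
    v = DenseVector(Array.fill(in.readInt())(in.readDouble()))
}

// A Kryo Serializer[Vector] would do exactly this in write/read:
// wrap the Vector on the way out, unwrap it on the way back in.
object VectorRoundTrip {
  def toBytes(v: Vector): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    new VectorWritable(v).write(new DataOutputStream(bos))
    bos.toByteArray
  }
  def fromBytes(bytes: Array[Byte]): Vector = {
    val vw = new VectorWritable()
    vw.readFields(new DataInputStream(new ByteArrayInputStream(bytes)))
    vw.get
  }
}
```

In the real job that logic would sit inside a
com.esotericsoftware.kryo.Serializer subclass registered once as a default
serializer for the Vector interface, so every Vector implementation flows
through the VectorWritable infrastructure.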


-- 

  -jake
