I think the key is, as Ted says, to wrap the vector in a VectorWritable at every place where you want to emit a writable form of it.
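As a minimal, untested sketch of what that wrapping might look like in a Scalding job (the job name, paths, and field names here are hypothetical, not from Koert's actual code):

    import com.twitter.scalding._
    import org.apache.mahout.math.{Vector, VectorWritable}

    // Hypothetical job: assume the 'vec field already holds a Mahout
    // Vector (DenseVector, SequentialAccessSparseVector, ...). Wrapping
    // it in a VectorWritable at the emit point keeps serialization on
    // Hadoop's Writable path instead of falling through to Kryo.
    class WrapVectorsJob(args: Args) extends Job(args) {
      SequenceFile(args("input"), ('id, 'vec))
        .map('vec -> 'vec) { v: Vector => new VectorWritable(v) }
        .write(SequenceFile(args("output"), ('id, 'vec)))
    }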
In scala terms, there are certainly two implicit conversions (a, ahem, bijection in fact) between Vector and VectorWritable, via the get/set encapsulation of the latter around the former.

On Saturday, January 12, 2013, Ted Dunning wrote:

> This might be more appropriate on the Mahout list. I have copied that list
> in order to gain the largest audience for the answers.
>
> It is an absolute requirement in Mahout to have multiple vector
> implementations. It is also a requirement that the math library not depend
> on Hadoop.
>
> A third absolute requirement in Mahout is that very simple Java programming
> suffice for working with Vectors of many types as well as Matrix values.
>
> In order to meet these requirements and allow the simplest form of
> map-reduce programming, we implemented a class VectorWritable which will
> wrap any kind of vector as a writable object. You can retrieve the
> underlying vector from the VectorWritable, and there is some discussion
> about making VW implement the Vector interface as well.
>
> If your code returns a VectorWritable, then Hadoop should be able to
> serialize it trivially.
>
> If your code returns a Vector, however, it will not natively be
> serializable. It should be possible to inject a single registration into
> Kryo, however, that will understand how to serialize Vectors using the
> VectorWritable infrastructure.
>
> On Sat, Jan 12, 2013 at 11:49 PM, Koert Kuipers <[email protected]> wrote:
>
> > i would like to have some mahout vectors flow through a scalding job. i
> > thought at first that this should be easy, since the mahout vector is a
> > writable, so if i put it in the tuple all will be fine. but then i
> > realized mahout did this thing where they split up the vector into a
> > whole bunch of classes and interfaces: they have the Vector interface,
> > implementations such as DenseVector and SequentialAccessSparseVector,
> > and then the class VectorWritable which takes a Vector and turns it
> > into a Writable. argh. so now if i have, for example, a DenseVector,
> > then i think it will not get serialized as a Writable, and then kryo
> > will attempt to serialize it instead, which is not what i want. any
> > ideas for an elegant solution? (i wish a simple scala implicit
> > conversion would do the trick!) should i add a custom hadoop Serializer
> > to catch these (seems ugly)?

--
-jake
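As a concrete sketch of the Vector/VectorWritable bijection Jake describes above (the object name is made up for illustration; Mahout itself does not ship these implicits):

    import scala.language.implicitConversions
    import org.apache.mahout.math.{Vector, VectorWritable}

    // Hypothetical helper: the two conversions, built on
    // VectorWritable's wrapping constructor and its get().
    object VectorImplicits {
      // Wrap on the way out: Vector => VectorWritable (the "set" direction).
      implicit def vectorToWritable(v: Vector): VectorWritable =
        new VectorWritable(v)

      // Unwrap on the way in: VectorWritable => Vector (the "get" direction).
      implicit def writableToVector(vw: VectorWritable): Vector =
        vw.get()
    }

With those two in scope, a Vector can be handed to anything expecting the Writable form and recovered on the other side without explicit wrapping, which is roughly the "simple scala implicit conversion" Koert was hoping for.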
