There's been a significant problem in the past with bits of code introducing their own ideas about serialization, to the point that about 5 approaches were in use at once. It's unlikely that 5 different approaches are each optimal in their own way, and they weren't. So I've tried to remove several mechanisms (JSON and XStream come to mind) and replace them with a more consistent and hopefully more appropriate approach.
I'd use Writable by default. Whatever you write that needs to be serialized, chances are it will need to go into a SequenceFile at some point. The downside is that you have to code your serialization manually. The upside is that you can make it very efficient as a result. I have never found writing Writable code onerous.

Serializable is hanging around in the non-distributed recommender code, mostly for legacy third-party integration reasons. It has to be there for use in a servlet / J2EE environment. It's not used *internally* for serialization; classes are marked Serializable for the convenience of callers.

Avro is JSON-based, and that just seems far too verbose for these purposes. Nothing we serialize would be consumed directly by a browser.

So I'd say implement WritableComparable, yes.

On Sat, May 28, 2011 at 11:50 PM, Dhruv Kumar <[email protected]> wrote:
> Has anyone looked at using Avro for serialization in Mahout?
>
> This is in reference to an earlier email thread where I had inquired about
> the serialization frameworks which I could use for Mahout-627.
>
> Following the discussion, my plan was to just write a custom class
> implementing the WritableComparable interface however I wanted to see what
> the community thinks of using Avro in Mahout.
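For what it's worth, a custom class of the kind discussed in this thread might look roughly like the sketch below. The class name IndexedWeight, its two fields, and the toBytes/fromBytes helpers are made up for illustration and are not from Mahout. To keep the sketch compilable without Hadoop on the classpath, it only mirrors the write(DataOutput) and readFields(DataInput) methods of Hadoop's Writable; in real code the class would declare that it implements org.apache.hadoop.io.WritableComparable<IndexedWeight>.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical value type; the name and fields are illustrative only.
// With Hadoop available this would instead declare
// "implements org.apache.hadoop.io.WritableComparable<IndexedWeight>".
final class IndexedWeight implements Comparable<IndexedWeight> {
  private int index;
  private double weight;

  // No-arg constructor: Hadoop instantiates Writables reflectively
  // and then fills them in via readFields().
  IndexedWeight() {}

  IndexedWeight(int index, double weight) {
    this.index = index;
    this.weight = weight;
  }

  int getIndex() { return index; }
  double getWeight() { return weight; }

  // Same signature as Writable.write(DataOutput): fixed 4 + 8 byte layout.
  public void write(DataOutput out) throws IOException {
    out.writeInt(index);
    out.writeDouble(weight);
  }

  // Same signature as Writable.readFields(DataInput): must read fields
  // in exactly the order write() emitted them.
  public void readFields(DataInput in) throws IOException {
    index = in.readInt();
    weight = in.readDouble();
  }

  // Ordering by index, then weight; this is what makes the type usable as a key.
  @Override
  public int compareTo(IndexedWeight other) {
    int byIndex = Integer.compare(index, other.index);
    return byIndex != 0 ? byIndex : Double.compare(weight, other.weight);
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof IndexedWeight)) {
      return false;
    }
    IndexedWeight other = (IndexedWeight) o;
    return index == other.index && Double.compare(weight, other.weight) == 0;
  }

  @Override
  public int hashCode() {
    return 31 * index + Double.hashCode(weight);
  }

  // Illustrative helper: serialize to a byte array using write().
  static byte[] toBytes(IndexedWeight value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      value.write(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  // Illustrative helper: rebuild an instance from bytes using readFields().
  static IndexedWeight fromBytes(byte[] bytes) {
    IndexedWeight value = new IndexedWeight();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      value.readFields(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return value;
  }
}
```

The manual field-by-field encoding is the cost mentioned above, but it also means each record is exactly one int plus one double on the wire, with no field names or type tags.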
