There's been a significant problem in the past with bits of code each
bringing their own ideas about serialization, to the point that about 5
approaches were in use at once. It's unlikely that 5 different approaches
are each optimal in their own way, and they weren't. So I've tried to remove
several mechanisms (JSON and XStream come to mind) and replace them with a
more consistent and hopefully more appropriate approach.

I'd use Writable by default. Whatever you write to be serialized, chances
are it will need to go into a SequenceFile at some point. The downside is
that you have to code your serialization manually. The upside is that you
can make it very efficient as a result. I have never found writing Writable
code onerous.
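
As a rough sketch of what that manual coding involves: the class below writes
its fields in a fixed order and reads them back in the same order. The
ItemPrefWritable class and its fields are hypothetical, and the small Writable
interface is a local stand-in with the same two methods as
org.apache.hadoop.io.Writable, declared here only so the example compiles
without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in mirroring the two methods of org.apache.hadoop.io.Writable.
// In real code you would implement the Hadoop interface instead.
interface Writable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical value type: an item ID with a preference score.
class ItemPrefWritable implements Writable {
  private long itemID;
  private float score;

  // No-arg constructor so an empty instance can be created and then filled
  // in by readFields().
  ItemPrefWritable() { }

  ItemPrefWritable(long itemID, float score) {
    this.itemID = itemID;
    this.score = score;
  }

  long getItemID() { return itemID; }
  float getScore() { return score; }

  @Override
  public void write(DataOutput out) throws IOException {
    // 8 bytes + 4 bytes, no field names or type tags on the wire.
    out.writeLong(itemID);
    out.writeFloat(score);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    // Must read fields in exactly the order write() emitted them.
    itemID = in.readLong();
    score = in.readFloat();
  }

  // Helper for the example: serialize any Writable to a byte array.
  static byte[] toBytes(Writable w) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      w.write(new DataOutputStream(buffer));
      return buffer.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Helper for the example: rebuild an instance from a byte array.
  static ItemPrefWritable fromBytes(byte[] bytes) {
    try {
      ItemPrefWritable result = new ItemPrefWritable();
      result.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
      return result;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The serialized form here is just the 12 bytes of the two fields, which is the
kind of compactness you get in exchange for writing the code by hand.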

Serializable is hanging around in the non-distributed recommender code
mostly for legacy third-party integration reasons: the classes have to be
Serializable to be used in a servlet / J2EE environment. It's not used
*internally* for serialization; the classes are just marked as such for the
convenience of callers.


Avro defines its schemas in JSON, and that just seems far too verbose for these purposes.
Nothing we serialize would be consumed directly by a browser.


So I'd say implement WritableComparable, yes.
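
A minimal sketch of what such a key class might look like follows. The PairKey
class and its two int fields are hypothetical, not taken from Mahout-627, and
the WritableComparable interface is a local stand-in so the example compiles
without Hadoop (Hadoop's org.apache.hadoop.io.WritableComparable extends both
Writable and Comparable).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in for org.apache.hadoop.io.WritableComparable: the two
// Writable methods plus Comparable's compareTo().
interface WritableComparable<T> extends Comparable<T> {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical key made of two ints, ordered by row and then by column.
class PairKey implements WritableComparable<PairKey> {
  private int row;
  private int col;

  PairKey() { }  // needed so an empty instance can be filled by readFields()

  PairKey(int row, int col) {
    this.row = row;
    this.col = col;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(row);
    out.writeInt(col);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    row = in.readInt();
    col = in.readInt();
  }

  @Override
  public int compareTo(PairKey other) {
    int byRow = Integer.compare(row, other.row);
    return byRow != 0 ? byRow : Integer.compare(col, other.col);
  }

  // Keys are grouped and partitioned by equality and hash code, so keep
  // these consistent with compareTo().
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PairKey)) {
      return false;
    }
    PairKey other = (PairKey) o;
    return row == other.row && col == other.col;
  }

  @Override
  public int hashCode() {
    return 31 * row + col;
  }

  // Helper for the example: write a key to bytes and read it back.
  static PairKey roundTrip(PairKey key) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      key.write(new DataOutputStream(buffer));
      PairKey copy = new PairKey();
      copy.readFields(new DataInputStream(
          new ByteArrayInputStream(buffer.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```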



On Sat, May 28, 2011 at 11:50 PM, Dhruv Kumar <[email protected]> wrote:

> Has anyone looked at using Avro for serialization in Mahout?
>
> This is in reference to an earlier email thread where I had inquired about
> the serialization frameworks which I could use for Mahout-627.
>
> Following the discussion, my plan was to just write a custom class
> implementing the WritableComparable interface; however, I wanted to see what
> the community thinks of using Avro in Mahout.
>
