Dirichlet supports JSON but uses Writable internally, as do the rest of the clustering algorithms.

On 1/17/11 8:50 AM, Sean Owen wrote:
The idea behind MAHOUT-510 was to try to standardize serialization
mechanisms as much as reasonable. It would be counterproductive to remove
one and add another, I think. There was some support for using Avro for text
serialization instead of JSON, even though that has the same issue -- so I
think Avro is somehow considered a better idea than JSON.

I contend that it'd be nice to just deal in serializing stuff to files via
Writable; the use cases for serializing to strings, or passing in memory,
seem deprecated, and those are the last use cases for JSON/Avro at the
present time.

I still suggest step 1 is to try to change Dirichlet to make a few
classes Writable and pass the model that way. Then we'd be done.
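
For illustration only, here is a rough sketch of what "making a class Writable" could look like. It is not the actual Dirichlet code: WritableLike is a hypothetical stand-in with the same two-method shape as Hadoop's org.apache.hadoop.io.Writable (so the sketch compiles without Hadoop on the classpath), and ModelSketch with its fields is an invented example model, not a Mahout class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in mirroring the two methods of Hadoop's Writable.
interface WritableLike {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Hypothetical model: a mean vector, a standard deviation and a point count.
class ModelSketch implements WritableLike {
  double[] mean = new double[0];
  double stdDev;
  int count;

  // Write the fields in a fixed order, length-prefixing the array.
  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(mean.length);
    for (double m : mean) {
      out.writeDouble(m);
    }
    out.writeDouble(stdDev);
    out.writeInt(count);
  }

  // Read the fields back in exactly the order write() produced them.
  @Override
  public void readFields(DataInput in) throws IOException {
    int n = in.readInt();
    mean = new double[n];
    for (int i = 0; i < n; i++) {
      mean[i] = in.readDouble();
    }
    stdDev = in.readDouble();
    count = in.readInt();
  }

  // Helper: serialize any WritableLike to a byte array.
  static byte[] toBytes(WritableLike w) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      w.write(out);
    }
    return buffer.toByteArray();
  }

  // Helper: rebuild a ModelSketch from bytes produced by toBytes().
  static ModelSketch fromBytes(byte[] bytes) throws IOException {
    ModelSketch model = new ModelSketch();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      model.readFields(in);
    }
    return model;
  }
}
```

With Hadoop available, the same write/readFields pair would presumably implement the real Writable interface instead of the stand-in, and the model could then be passed through the usual Hadoop file formats rather than through a JSON string.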

Then, hey, think about reasons Avro is needed.

On Mon, Jan 17, 2011 at 3:37 PM, Robin Anil <[email protected]> wrote:

Protobufs are a good choice :)

