The bit I wasn't able to take out easily was serializing ModelDistribution to
a JSON string and then passing that via the Configuration object.

Everywhere else, JSON was just another output option.

Do you think the way forward is to leave it, or use Writable and write the
model distribution to a file, or something else?
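
To make the Writable option concrete, here is roughly what I have in mind.
It is only a sketch: ModelParams and its fields are made up for illustration
and are not the real ModelDistribution, and it uses plain java.io with the
same write/readFields shape as Hadoop's Writable so that it runs without
Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for a model class; NOT the real ModelDistribution.
class ModelParams {
    double[] weights = new double[0];
    int numClusters;

    // No-arg constructor so a reader can create an empty instance to fill in.
    ModelParams() {}

    ModelParams(double[] weights, int numClusters) {
        this.weights = weights.clone();
        this.numClusters = numClusters;
    }

    // Same shape as Writable.write(DataOutput): emit fields in a fixed order.
    void write(DataOutput out) throws IOException {
        out.writeInt(numClusters);
        out.writeInt(weights.length);
        for (double w : weights) {
            out.writeDouble(w);
        }
    }

    // Same shape as Writable.readFields(DataInput): read fields back in that order.
    void readFields(DataInput in) throws IOException {
        numClusters = in.readInt();
        weights = new double[in.readInt()];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = in.readDouble();
        }
    }
}

class ModelRoundTrip {
    // Serialize to bytes; in a job these bytes would be written to a file instead.
    static byte[] toBytes(ModelParams m) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            m.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild a model from bytes produced by toBytes.
    static ModelParams fromBytes(byte[] bytes) {
        try {
            ModelParams m = new ModelParams();
            m.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return m;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ModelParams original = new ModelParams(new double[] {0.25, 0.75}, 2);
        ModelParams copy = fromBytes(toBytes(original));
        System.out.println(copy.numClusters + " " + Arrays.toString(copy.weights));
    }
}
```

With something like this, the job would presumably write the model to a file
and put only the file's path into the Configuration, rather than the whole
JSON string.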

On Mon, Jan 17, 2011 at 4:49 PM, Jeff Eastman <[email protected]> wrote:

> Dirichlet supports JSON but uses Writable internally as do the rest of the
> clustering algorithms.
>
>
> On 1/17/11 8:50 AM, Sean Owen wrote:
>
>> The idea behind MAHOUT-510 was to try to standardize serialization
>> mechanisms as much as reasonable. It would be counterproductive to remove
>> one and add another, I think. There was some support for using Avro for text
>> serialization instead of JSON, even though that has the same issue -- so I
>> think Avro is somehow considered a better idea than JSON.
>>
>> I contend that it'd be nice to just deal in serializing stuff to files via
>> Writable; the use cases for serializing to strings, or passing in memory,
>> seem deprecated, which is the last use case for JSON/Avro at the present
>> time.
>>
>> I still suggest step 1 is to try to change dirichlet to make a few
>> classes Writable and pass the model that way. Then we'd be done.
>>
>> Then, hey, think about reasons Avro is needed.
>>
>> On Mon, Jan 17, 2011 at 3:37 PM, Robin Anil <[email protected]> wrote:
>>
>>> Protobufs are a good choice :)
>>>
>>>
>
