TrainNewsGroups in the examples module does this.

ModelSerializer supports JSON serialization as well as binary.  The JSON
form breaks down for larger models because GSON parses recursively rather
than iteratively.
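
To illustrate why a binary form holds up better for big models, here is a minimal sketch in plain Java. It is not ModelSerializer's actual on-disk format; the class name WeightCodec and its methods are hypothetical, made up for this example. It writes a weight vector as a length followed by raw doubles and reads it back with a flat loop, so stack depth does not grow with model size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: a flat binary encoding of a weight vector.
public class WeightCodec {

    // Writes the vector length as a 4-byte int, then each weight as an 8-byte double.
    public static byte[] writeBinary(double[] weights) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(weights.length);
            for (double w : weights) {
                out.writeDouble(w);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the vector back with a simple loop; no recursion is involved.
    public static double[] readBinary(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            double[] weights = new double[in.readInt()];
            for (int i = 0; i < weights.length; i++) {
                weights[i] = in.readDouble();
            }
            return weights;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a large model; the values are arbitrary.
        double[] weights = new double[100_000];
        for (int i = 0; i < weights.length; i++) {
            weights[i] = Math.sin(i);
        }
        byte[] encoded = writeBinary(weights);
        double[] decoded = readBinary(encoded);
        System.out.println(encoded.length + " bytes, round trip ok: "
                + Arrays.equals(weights, decoded));
    }
}
```

The encoded size is always 4 + 8 * n bytes for n weights, and the doubles come back bit-for-bit identical, which a decimal text form does not guarantee unless it is written with full precision.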

Trunk should have everything you need.  (I will verify that in a moment.)

On Thu, Jan 27, 2011 at 12:38 AM, Claudia Grieco <[email protected]> wrote:

> Are there any examples of how to use ModelSerializer?
> Can I use it without fixing Mahout from trunk?
> I see that ModelSerializer uses JSON too, doesn't it?
>
> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
> Sent: Wednesday, January 26, 2011 19:38
> To: [email protected]
> Subject: Re: problems saving and loading SGD classifications
>
> This is a known problem that should be fixed in trunk.
>
> While you are at it, the LogisticModelParameters approach may not be as
> useful as the ModelSerializer approach.
>
> Here is a comparison of pros and cons:
>
> LogisticModelParameters
>
> + incorporates lots of CSV parsing info
> + serializes the whole lot including model and data representation
> + somewhat simpler to use
> + matches chapter 13 of MiA examples
> -- uses JSON to serialize model
> - pretty much assumes CSV input by implication
> - has a bug in many recent versions
>
> ModelSerializer
>
> ++ allows binary serialization
> + makes no assumptions about how feature vectors are encoded
> - requires that you make your own arrangements for vector encoding
>
> The lack of binary serialization is (for me) a real show-stopper for LMP
> for big models.  Almost as important is the issue of vector encoding,
> since real Mahout applications tend to have large, sparse, text-like input
> variables.
>
>
>
> On Wed, Jan 26, 2011 at 9:44 AM, Claudia Grieco <[email protected]> wrote:
>
> > What do you think can be the problem?
> >
>
>
