Here is the issue that describes the problem and fix: https://issues.apache.org/jira/browse/MAHOUT-556
On Wed, Jan 26, 2011 at 10:38 AM, Ted Dunning <[email protected]> wrote:

> This is a known problem that should be fixed in trunk.
>
> While you are at it, the LogisticModelParameters approach may not be as
> useful as the ModelSerializer approach.
>
> Here is a comparison of pros and cons:
>
> LogisticModelParameters
>
> + incorporates lots of CSV parsing info
> + serializes the whole lot including model and data representation
> + somewhat simpler to use
> + matches chapter 13 of MiA examples
> -- uses json to serialize model
> - pretty much assumes CSV input by implication
> - has a bug in many recent versions
>
> ModelSerializer
>
> ++ allows binary serialization
> + makes no assumptions about how feature vectors are encoded
> - requires that you make your own arrangements for vector encoding
>
> The bit about binary serialization is (for me) a real show-stopper for LMP
> for big models. Almost as important is the issue about vector encoding
> since real Mahout applications tend to have large sparse text-like input
> variables.
>
> On Wed, Jan 26, 2011 at 9:44 AM, Claudia Grieco <[email protected]> wrote:
>
>> What do you think can be the problem?
