I think that may have to do with the very large update that has been sitting
on my machine because I forgot to commit the change.

On Wed, Dec 29, 2010 at 5:05 PM, Chris Schilling <[email protected]> wrote:

> Hey Ted,
>
> Sorry for the noise.  I am looking around in the
> o.a.m.classifier.sgd.ModelSerializer and I only see methods for writeJson...
>
>
> On Dec 29, 2010, at 4:01 PM, Ted Dunning wrote:
>
> > Yes.
> >
> > That is evil.  The problem is that GSON recurses on lists and that makes
> > memory use crazy bad.
> >
> > Try serializing as binary.  I committed a change to allow that a few weeks
> > ago that added a method to ModelSerializer.  The SGD models are also all
> > Writables now, which should make rolling your own serialization very easy.
> >
> >
> > On Wed, Dec 29, 2010 at 3:59 PM, Chris Schilling
> > <[email protected]> wrote:
> >
> >> Hi again,
> >>
> >> I notice that if I try to write the model for the 20 NG example, I am
> >> running out of memory.  I am running on a small EC2 instance, so I run
> >> the JVM with -Xmx1400m.
> >>
> >> So, I can train and dissect the model just fine.  However, when I try to
> >> write the weights:
> >> ModelSerializer.writeJson("/tmp/sgd_adaptive.model", learningAlgorithm);
> >>
> >> My feature vector size is 10000.
> >>
> >> I get an OOM exception:
> >>
> >> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:221)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:210)
> >>       at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
> >>       at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
> >>       at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
> >>       at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:445)
> >>       at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:431)
> >>       at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
> >>       at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
> >>       at com.google.gson.JsonSerializationVisitor.getJsonElementForChild(JsonSerializationVisitor.java:117)
> >>       at com.google.gson.JsonSerializationVisitor.addAsChildOfObject(JsonSerializationVisitor.java:95)
> >>       at com.google.gson.JsonSerializationVisitor.visitObjectField(JsonSerializationVisitor.java:90)
> >>       at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:147)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
> >>       at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
> >>       at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:40)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:333)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:287)
> >>       at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
> >>       at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:375)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:339)
> >>       at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
> >>       at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:189)
> >>       at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:153)
> >>       at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
> >>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
> >>
> >> Does this make sense?  It seems like too much memory just to serialize.
> >>
> >> Thanks
> >> Chris
>
>
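
For readers of the archive, the "roll your own serialization" idea from the
thread can be sketched roughly as below. This is only an illustration under
stated assumptions: DenseWeights is a hypothetical stand-in for a model's
weight matrix, not a Mahout class, and the two methods merely mirror the
write(DataOutput) / readFields(DataInput) shape of Hadoop's Writable
interface using plain java.io. The binary method that was added to
ModelSerializer is not named in the thread, so it is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical weight matrix used only for this sketch; not part of Mahout. */
final class DenseWeights {
    private int rows;
    private int cols;
    private double[] values; // row-major storage

    DenseWeights(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.values = new double[rows * cols];
    }

    double get(int r, int c) { return values[r * cols + c]; }

    void set(int r, int c, double v) { values[r * cols + c] = v; }

    /** Same shape as Writable.write: stream the dimensions, then each value. */
    void write(DataOutput out) throws IOException {
        out.writeInt(rows);
        out.writeInt(cols);
        for (double v : values) {
            out.writeDouble(v); // one value at a time, no intermediate object tree
        }
    }

    /** Same shape as Writable.readFields: overwrite this object's state. */
    void readFields(DataInput in) throws IOException {
        rows = in.readInt();
        cols = in.readInt();
        values = new double[rows * cols];
        for (int i = 0; i < values.length; i++) {
            values[i] = in.readDouble();
        }
    }

    /** Convenience helper: serialize to an in-memory byte array. */
    static byte[] toBytes(DenseWeights w) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Convenience helper: rebuild a matrix from bytes produced by toBytes. */
    static DenseWeights fromBytes(byte[] bytes) {
        DenseWeights w = new DenseWeights(0, 0);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            w.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w;
    }
}
```

To write to disk instead of memory, the same write method can be handed a
DataOutputStream wrapped around a FileOutputStream. Because each value goes
straight to the stream, the extra memory needed during serialization stays
small compared with the size of the model itself.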
