Hi again,
I noticed that when I try to write the model for the 20 newsgroups (20 NG)
example, I run out of memory. I am running on a small EC2 instance, so I start
the JVM with -Xmx1400m.
So, I can train and dissect the model just fine. However, when I try to write
the weights:
ModelSerializer.writeJson("/tmp/sgd_adaptive.model", learningAlgorithm);
My feature vector size is 10000.
I get an OOM exception:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:221)
    at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:210)
    at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
    at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
    at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
    at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:445)
    at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:431)
    at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
    at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
    at com.google.gson.JsonSerializationVisitor.getJsonElementForChild(JsonSerializationVisitor.java:117)
    at com.google.gson.JsonSerializationVisitor.addAsChildOfObject(JsonSerializationVisitor.java:95)
    at com.google.gson.JsonSerializationVisitor.visitObjectField(JsonSerializationVisitor.java:90)
    at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:147)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
    at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
    at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:40)
    at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:333)
    at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:287)
    at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
    at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
    at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:375)
    at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:339)
    at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
    at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
    at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:189)
    at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:153)
    at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
    at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
Does this make sense? It seems like too much memory just to serialize a model.
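
For a rough sense of scale, here is a back-of-the-envelope estimate in plain Java (no Mahout dependency). Only the 20 categories and the 10000 features come from my setup. The pool size, the fold count, and the bytes per JSON node are assumptions I have not verified in the source. Whether Gson really builds the whole element tree in memory before writing is also a guess based on the stack trace.

```java
public class SerializationFootprint {

    // Total number of weights across every model in the learner pool.
    // One logistic regression model is assumed to hold a
    // (categories - 1) x features matrix of doubles.
    public static long weightCount(int categories, int features, int poolSize, int folds) {
        long perModel = (long) (categories - 1) * features;
        return perModel * folds * poolSize;
    }

    // Size of the weights as raw doubles on the heap.
    public static long rawBytes(long weights) {
        return weights * Double.BYTES;
    }

    // Size of an in-memory JSON tree if every weight becomes one node object.
    public static long jsonTreeBytes(long weights, int bytesPerNode) {
        return weights * bytesPerNode;
    }

    public static void main(String[] args) {
        int categories = 20;    // 20 newsgroups
        int features = 10000;   // feature vector size from this post
        int poolSize = 20;      // ASSUMPTION: number of learners kept by the adaptive trainer
        int folds = 5;          // ASSUMPTION: models per cross-fold learner
        int bytesPerNode = 64;  // ASSUMPTION: heap cost of one JSON number node

        long weights = weightCount(categories, features, poolSize, folds);
        System.out.println("weights:         " + weights);
        System.out.println("raw bytes:       " + rawBytes(weights));
        System.out.println("JSON tree bytes: " + jsonTreeBytes(weights, bytesPerNode));
    }
}
```

If those assumptions are anywhere near right, that is about 19 million weights, or roughly 152 MB as raw doubles, and a per-weight JSON node would multiply that several times over. That would get uncomfortably close to my 1400 MB heap, but I would like someone who knows the serializer to confirm.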
Thanks
Chris