[ https://issues.apache.org/jira/browse/MAHOUT-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964776#action_12964776 ]
Ted Dunning commented on MAHOUT-556:
------------------------------------
Yes. I think that this code:
{code}
GsonBuilder gb = new GsonBuilder();
gb.registerTypeAdapter(Matrix.class, new MatrixTypeAdapter());
Gson gson = gb.setPrettyPrinting().create();
{code}
in LogisticModelParameters.saveTo() at about line 126 should actually be more
like this:
{code}
Gson gson = ModelSerializer.gson();
{code}
The serializer that results has lots of adapters installed already.
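Why pre-installed adapters matter can be illustrated without Gson at all. The following is a minimal sketch in plain Java, not Mahout or Gson code: Vec, DenseVec and CreatorRegistry are hypothetical stand-ins for the Vector interface, a concrete vector class, and a serializer that knows how to build instances. A registry with nothing installed falls back to reflection, which cannot construct an interface; a registry with a creator installed succeeds. In Gson the corresponding hook is the InstanceCreator that the error message in the quoted report asks for.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for an interface type such as org.apache.mahout.math.Vector.
interface Vec {
    int size();
}

// Hypothetical concrete implementation with a no-args constructor.
class DenseVec implements Vec {
    private final double[] values;

    DenseVec() {
        this.values = new double[0];
    }

    public int size() {
        return values.length;
    }
}

// Hypothetical registry playing the role of a serializer with creators installed.
class CreatorRegistry {
    private final Map<Class<?>, Supplier<?>> creators = new HashMap<>();

    <T> CreatorRegistry register(Class<T> type, Supplier<? extends T> creator) {
        creators.put(type, creator);
        return this;
    }

    <T> T newInstance(Class<T> type) {
        Supplier<?> creator = creators.get(type);
        if (creator != null) {
            return type.cast(creator.get());
        }
        try {
            // Fallback: reflection. Interfaces declare no constructors, so this throws.
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "No-args constructor for " + type.getName() + " does not exist", e);
        }
    }
}

class InstanceCreatorSketch {
    public static void main(String[] args) {
        // Ad hoc registry: nothing installed, so the interface cannot be built.
        CreatorRegistry bare = new CreatorRegistry();
        try {
            bare.newInstance(Vec.class);
        } catch (IllegalStateException e) {
            System.out.println("bare registry failed: " + e.getMessage());
        }

        // Shared, preconfigured registry: the creator for Vec is already installed.
        CreatorRegistry configured = new CreatorRegistry().register(Vec.class, DenseVec::new);
        Vec v = configured.newInstance(Vec.class);
        System.out.println("configured registry built " + v.getClass().getSimpleName());
    }
}
```

The point of routing every call site through one preconfigured factory is that saving and loading then agree on how each type is handled.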
> In the trainlogistic example the JSON model file which is created is missing
> commas and making it unusable with runLogistic.
> ----------------------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-556
> URL: https://issues.apache.org/jira/browse/MAHOUT-556
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Affects Versions: 0.5
> Environment: Ubuntu 10.10, Hadoop-0.20.2
> Reporter: Rohan Anil
> Priority: Minor
>
> This bug concerns the model file that is created when you run trainlogistic.
> The JSON model file is written using the toJson function, as illustrated
> below.
> --------------------------------
> In LogisticModelParameters.java, function:
> void saveTo(Writer out)
> {
>   ...
>   String savedForm = gson.toJson(this);
>   ...
> }
> --------------------------------
> But this call does not work as expected: String savedForm = gson.toJson(this);
> In my experiment, which uses a different dataset,
> I get the following model file:
> {"targetVariable":"customer","typeMap":{"feature2":"n","feature3":"n",
> "feature1":"n"},"numFeatures":334,"useBias":true,"maxTargetCategories":
> 2,"targetCategories":["0","1"],"lambda":1.0E-4,"learningRate":0.001,"lr":{
> "mu0":0.001,"decayFactor":0.999,"stepOffset":10,"forgettingExponent":
> -0.5,"perTermAnnealingOffset":20,"beta":{"rows":1,"cols":334,"data":[[
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,6.741887291022263E-4,0.0,0.0,-53.6076187622054,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.031178185395536E-5,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04383410529689268,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
> 0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]]},
> "numCategories":2,"step":260951,"updateSteps":{}"updateCounts":{}
> "lambda":1.0E-4,"prior":{}"sealed":true,"gradient":{}}}
> Note the last part:
> "numCategories":2,"step":260951,"updateSteps":{}"updateCounts":{}
> "lambda":1.0E-4,"prior":{}"sealed":true,"gradient":{}}}
> Commas are missing after updateSteps, after updateCounts, and before sealed.
> Investigating further:
> These fields come from AbstractOnlineLogisticRegression.java, and they are
> not initialized, hence the wrong output from the toJson function.
> This is a bug in the gson.toJson function. I am using gson-1.3, and
> upgrading to 1.4 by modifying core/pom.xml fixes the output. But runLogistic
> then complains:
> 10/11/29 03:29:43 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found
> in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> Exception in thread "main" java.lang.RuntimeException: No-args constructor
> for interface org.apache.mahout.math.Vector does not exist. Register an
> InstanceCreator with Gson for this type to fix this problem.
> at com.google.gson.MappedObjectConstructor.constructWithNoArgConstructor(MappedObjectConstructor.java:64)
> at com.google.gson.MappedObjectConstructor.construct(MappedObjectConstructor.java:53)
> at com.google.gson.JsonObjectDeserializationVisitor.constructTarget(JsonObjectDeserializationVisitor.java:41)
> at com.google.gson.JsonDeserializationVisitor.getTarget(JsonDeserializationVisitor.java:56)
> at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:101)
> at com.google.gson.JsonDeserializationVisitor.visitChild(JsonDeserializationVisitor.java:107)
> at com.google.gson.JsonDeserializationVisitor.visitChildAsObject(JsonDeserializationVisitor.java:95)
> at com.google.gson.JsonObjectDeserializationVisitor.visitObjectField(JsonObjectDeserializationVisitor.java:62)
> at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:156)
> at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:123)
> at com.google.gson.JsonDeserializationVisitor.visitChild(JsonDeserializationVisitor.java:107)
> at com.google.gson.JsonDeserializationVisitor.visitChildAsObject(JsonDeserializationVisitor.java:95)
> at com.google.gson.JsonObjectDeserializationVisitor.visitObjectField(JsonObjectDeserializationVisitor.java:62)
> at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:156)
> at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:123)
> at com.google.gson.JsonDeserializationContextDefault.fromJsonObject(JsonDeserializationContextDefault.java:73)
> at com.google.gson.JsonDeserializationContextDefault.deserialize(JsonDeserializationContextDefault.java:51)
> at com.google.gson.Gson.fromJson(Gson.java:495)
> at com.google.gson.Gson.fromJson(Gson.java:444)
> at com.google.gson.Gson.fromJson(Gson.java:419)
> at org.apache.mahout.classifier.sgd.LogisticModelParameters.loadFrom(LogisticModelParameters.java:142)
> at org.apache.mahout.classifier.sgd.LogisticModelParameters.loadFrom(LogisticModelParameters.java:155)
> at org.apache.mahout.classifier.sgd.RunLogistic.main(RunLogistic.java:56)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:182)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:616)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> I haven't had time to investigate this yet. I will post more results
> tomorrow.
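The symptom in the quoted report, an object or array closing directly in front of the next key with no comma in between, can be checked mechanically before a model file is passed to runLogistic. The sketch below is one possible check in plain Java and is not part of Mahout; the class name MissingCommaCheck and its method are hypothetical. It tracks string literals so that a brace inside a quoted value is not reported, and it only detects this one symptom rather than validating JSON in general.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout or Gson.
class MissingCommaCheck {

    /**
     * Returns the offset of every '}' or ']' outside a string literal that is
     * followed (ignoring whitespace) by a '"', i.e. by the start of a new key
     * or string value with no comma in between.
     */
    static List<Integer> findMissingCommas(String json) {
        List<Integer> offsets = new ArrayList<>();
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '}' || c == ']') {
                int j = i + 1;
                while (j < json.length() && Character.isWhitespace(json.charAt(j))) {
                    j++;
                }
                if (j < json.length() && json.charAt(j) == '"') {
                    offsets.add(i);
                }
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Tail of the model file from the report, wrapped in one outer object.
        String broken = "{\"numCategories\":2,\"step\":260951,\"updateSteps\":{}"
            + "\"updateCounts\":{}\"lambda\":1.0E-4,\"prior\":{}"
            + "\"sealed\":true,\"gradient\":{}}";
        System.out.println("suspicious offsets: " + findMissingCommas(broken));
    }
}
```

On the fragment from the report this flags the three places where a comma is missing: after updateSteps, after updateCounts and after prior.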