[ 
https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833806#action_12833806
 ] 

Drew Farris commented on MAHOUT-291:
------------------------------------

Thanks very much Robin for posting a patch to review. Things like 
KMeansClusterer.log.debug -> log.debug look great herehere, and I'm ok with the 
whitespace oriented changes for the most part, bu there are some cases where 
auto-code formatting is really making a hash of things:

e.g:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] 
sd=[" + sdx + ", " + sdy + ']');
{code}

gets transformed to:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] 
sd=[" + sdx + ", " + sdy
                        + ']');
{code}

which despite the 120 line length rule seems a little too strict IMHO. 

Also, a nicely formatted OptionBuilder is turned into something nasty and 
unreadable.

{code}
-    Option clustersOpt = obuilder
-        .withLongName("clusters")
-        .withRequired(true)
-        
.withArgument(abuilder.withName("clusters").withMinimum(1).withMaximum(1).create())
-        .withDescription(
-          "The input centroids, as Vectors.  Must be a SequenceFile of 
Writable, Cluster/Canopy.  "
-              + "If k is also specified, then a random set of vectors will be 
selected and written out to this path first")
+    Option clustersOpt = 
obuilder.withLongName("clusters").withRequired(true).withArgument(
+      
abuilder.withName("clusters").withMinimum(1).withMaximum(1).create()).withDescription(
+      "The input centroids, as Vectors.  Must be a SequenceFile of Writable, 
Cluster/Canopy.  "
+          + "If k is also specified, then a random set of vectors will be 
selected and"
+          + "written out to this path first")
         .withShortName("c").create();
{code}

And things like the following, but honestly which of these is the greater sin? 
(From LDAInference)

{code}
-    double t = f
-               * (-1 / 12.0 + f
-                              * (1 / 120.0 + f
-                                             * (-1 / 252.0 + f
-                                                             * (1 / 240.0 + f
-                                                                            * 
(-1 / 132.0 + f
-                                                                               
             * (691 / 32760.0 + f
-                                                                               
                                * (-1 / 12.0 + f * 3617.0 / 8160.0)))))));
+    double t = f * (-1 / 12.0 + f * (1 / 120.0 + f * (-1 / 252.0 
+        + f * (1 / 240.0 + f * (-1 / 132.0 + f * (691 / 32760.0 + f * (-1 / 
12.0 + f * 3617.0 / 8160.0)))))));
{code}

What's the best way to proceed from here given this?

> Mahout Code Cleanup
> -------------------
>
>                 Key: MAHOUT-291
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-291
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering, Collaborative Filtering, 
> Frequent Itemset/Association Rule Mining, Genetic Algorithms, Math, Utils, 
> Website
>    Affects Versions: 0.3
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>            Priority: Minor
>             Fix For: 0.3
>
>         Attachments: MAHOUT-291.patch
>
>
> Code Cleanup
> Organize imports
> Remove space in blank lines
> make local variables final

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to