[ 
https://issues.apache.org/jira/browse/MAHOUT-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen reassigned MAHOUT-716:
--------------------------------

    Assignee: Ted Dunning

I'll let Ted comment on the substance, but here is some more small-scale 
feedback. This patch definitely needs a fair bit more scrubbing, and I will 
post my version of it.

- java.util.Vector is so 2000. Definitely want to use List/ArrayList
- You definitely don't want to build strings with new String() and 
concatenation; try StringBuilder
- equals() must take Object as a param or it is not actually implemented. If 
you use @Override, the compiler will flag this sort of error.
- You absolutely must implement hashCode() when implementing equals()
- Use Java 5 foreach where you can
- Avoid non-private fields; use protected getters if you want
- 0 is an int, 0.0 is a double, and while Java silently converts, I personally 
prefer being explicit about intent by using 0.0 where a double value is meant
- There are a load of unused imports, including weird ones like the 
JavacCompiler
- Don't catch Exception
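
To make a few of these points concrete, here is a minimal sketch. It is not 
taken from the patch: WeightedFeature, FeatureDemo and describe() are 
hypothetical names made up for illustration. It shows equals(Object) with 
@Override, a matching hashCode(), private fields behind getters, 
List/ArrayList instead of Vector, foreach, StringBuilder, and 0.0 for a double.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class for illustration only; not from MAHOUT-716.patch.
final class WeightedFeature {
  // Private fields, exposed through getters.
  private final String name;
  private final double weight;

  WeightedFeature(String name, double weight) {
    this.name = name;
    this.weight = weight;
  }

  String getName() {
    return name;
  }

  double getWeight() {
    return weight;
  }

  // The parameter is Object, so this really overrides Object.equals().
  // With @Override, the compiler rejects an equals(WeightedFeature) overload.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof WeightedFeature)) {
      return false;
    }
    WeightedFeature other = (WeightedFeature) o;
    return name.equals(other.name)
        && Double.compare(weight, other.weight) == 0;
  }

  // Equal objects must have equal hash codes, so hash the same fields.
  @Override
  public int hashCode() {
    return 31 * name.hashCode() + Double.valueOf(weight).hashCode();
  }
}

final class FeatureDemo {
  // Builds a summary with StringBuilder and a foreach loop.
  static String describe(List<WeightedFeature> features) {
    StringBuilder sb = new StringBuilder();
    double total = 0.0; // 0.0, not 0: a double is meant
    for (WeightedFeature f : features) {
      sb.append(f.getName()).append('=').append(f.getWeight()).append(' ');
      total += f.getWeight();
    }
    return sb.append("total=").append(total).toString();
  }

  public static void main(String[] args) {
    // Declare the variable as List; ArrayList is only the implementation.
    List<WeightedFeature> features = new ArrayList<WeightedFeature>();
    features.add(new WeightedFeature("a", 0.5));
    features.add(new WeightedFeature("b", 1.5));
    System.out.println(describe(features)); // a=0.5 b=1.5 total=2.0
  }
}
```

Without the Object parameter, collections such as HashSet or List.contains() 
would silently fall back to identity comparison, which is why the @Override 
annotation is worth the extra line.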

I strongly encourage anyone to just go get the free version of IntelliJ. Turn 
on every one of its code inspection settings. Every source file will have 
something like 100 things flagged. Slowly turn off the rules you don't like or 
that don't apply. You will be left with a rule set that flags all of this stuff 
instantly as you look at a file. It's like night and day to have all of this 
literally jump out at you and be fixable with one click.

(I am happy to share my personal ruleset which is pretty standard)

> Implement Boosting
> ------------------
>
>                 Key: MAHOUT-716
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-716
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: Hector Yee
>            Assignee: Ted Dunning
>            Priority: Minor
>              Labels: features
>             Fix For: 0.6
>
>         Attachments: MAHOUT-716.patch, MAHOUT-716.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Implement boosting (grad boost variant) with l1-regularization and induction.
> The gradient part is scalable and parallel and the induction part allows 
> stochastic hypothesis generation for speed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
