[
https://issues.apache.org/jira/browse/MAHOUT-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen reassigned MAHOUT-716:
--------------------------------
Assignee: Ted Dunning
I'll let Ted comment on the substance, but here is some smaller-scale feedback.
This patch definitely needs a fair bit more scrubbing, and I will post my version of it.
- java.util.Vector is so 2000. Definitely want to use List/ArrayList
- You definitely don't want to build strings by new String() and concatenation
-- try StringBuilder
- equals() must take Object as a param; otherwise it only overloads and never
overrides Object.equals(). If you use @Override, the compiler will flag this
sort of error.
- You absolutely must implement hashCode() when implementing equals()
- Use Java 5 foreach where you can
- Avoid non-private fields; use protected getters if you want
- 0 is an int, 0.0 is a double, and while Java silently converts I personally
prefer being explicit about intent by using 0.0 where a double value is meant
- There are a load of unused imports, including weird ones like the
JavacCompiler
- Don't catch Exception
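
To make several of the points above concrete, here is a minimal sketch. The
class WeightedFeature and its methods are hypothetical names made up for
illustration; nothing here is taken from the MAHOUT-716 patch. It shows
equals(Object) with @Override, a matching hashCode(), private fields with
getters, List instead of Vector, foreach, StringBuilder, explicit 0.0, and
catching a specific exception rather than Exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of the patch under review.
final class WeightedFeature {
  // Private fields, exposed only through getters.
  private final String name;
  private final double weight;

  WeightedFeature(String name, double weight) {
    this.name = name;
    this.weight = weight;
  }

  String getName() { return name; }
  double getWeight() { return weight; }

  // Takes Object, so it really overrides Object.equals().
  // With @Override, a signature like equals(WeightedFeature) fails to compile.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof WeightedFeature)) {
      return false;
    }
    WeightedFeature that = (WeightedFeature) other;
    return name.equals(that.name) && Double.compare(weight, that.weight) == 0;
  }

  // Must agree with equals(): equal objects produce equal hash codes.
  @Override
  public int hashCode() {
    return 31 * name.hashCode() + Double.valueOf(weight).hashCode();
  }

  // List rather than java.util.Vector, foreach rather than an index loop,
  // StringBuilder rather than repeated String concatenation.
  static String describe(List<WeightedFeature> features) {
    StringBuilder sb = new StringBuilder();
    double total = 0.0; // 0.0, not 0, since a double is meant
    for (WeightedFeature f : features) {
      sb.append(f.getName()).append('=').append(f.getWeight()).append(' ');
      total += f.getWeight();
    }
    return sb.append("total=").append(total).toString();
  }

  // Catch the one exception you expect instead of Exception.
  // Falling back to 0.0 is an arbitrary choice for this sketch.
  static double parseWeight(String text) {
    try {
      return Double.parseDouble(text.trim());
    } catch (NumberFormatException e) {
      return 0.0;
    }
  }

  public static void main(String[] args) {
    List<WeightedFeature> features = new ArrayList<WeightedFeature>();
    features.add(new WeightedFeature("a", 1.0));
    features.add(new WeightedFeature("b", parseWeight(" 2.5 ")));
    System.out.println(describe(features)); // prints "a=1.0 b=2.5 total=3.5"
    System.out.println(
        new WeightedFeature("a", 1.0).equals(features.get(0))); // prints "true"
  }
}
```

Because parseWeight() catches only NumberFormatException, a null argument still
fails loudly with a NullPointerException instead of being silently swallowed,
which is the reason for not catching Exception.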
I strongly encourage anyone to just go get the free version of IntelliJ. Turn
on every one of its code inspection settings. Every source file will have like
100 things flagged. Slowly turn off the rules you don't like or that don't
apply. You will be left with a rule set that flags all of this stuff instantly
as you look at a file. It's just like night and day to have all this stuff
literally jump out at you and be fixable with one click.
(I am happy to share my personal ruleset which is pretty standard)
> Implement Boosting
> ------------------
>
> Key: MAHOUT-716
> URL: https://issues.apache.org/jira/browse/MAHOUT-716
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Affects Versions: 0.6
> Reporter: Hector Yee
> Assignee: Ted Dunning
> Priority: Minor
> Labels: features
> Fix For: 0.6
>
> Attachments: MAHOUT-716.patch, MAHOUT-716.patch
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Implement boosting (grad boost variant) with l1-regularization and induction.
> The gradient part is scalable and parallel and the induction part allows
> stochastic hypothesis generation for speed.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira