We have a couple of JIRAs that relate here. MAHOUT-930 is to factor the classification (-cl) steps out of all of the driver classes and into a separate job, to remove duplicated code; MAHOUT-931 is to add a pluggable outlier removal capability to that job; and MAHOUT-933 is aimed at factoring the iteration mechanics out of each driver class and into the ClusterIterator, which uses a ClusterClassifier that is itself an OnlineLearner. This will hopefully allow semi-supervised classifier applications to be constructed by feeding cluster-derived models into the classification process. Still kind of fuzzy at this point, but promising too.
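
To make the MAHOUT-933 shape a bit more concrete, here is a rough sketch of an iterator that drives a classifier which is also an online learner. Everything in it is an assumption for illustration: the names ending in Sketch are hypothetical and are not the actual Mahout classes or signatures, plain double[] arrays stand in for vectors, and a k-means-style mean update is assumed only to keep the example small.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an online learner: trained one example at a time. */
interface OnlineLearnerSketch {
    void train(int actual, double[] instance);
    void close();
}

/** Hypothetical classifier backed by cluster centers; training updates those centers. */
final class ClusterClassifierSketch implements OnlineLearnerSketch {
    private final double[][] centers;
    private final double[][] sums;
    private final int[] counts;

    ClusterClassifierSketch(double[][] initialCenters) {
        centers = new double[initialCenters.length][];
        for (int k = 0; k < initialCenters.length; k++) {
            centers[k] = initialCenters[k].clone();
        }
        sums = new double[centers.length][centers[0].length];
        counts = new int[centers.length];
    }

    /** Returns the index of the nearest center (squared Euclidean distance). */
    int classify(double[] instance) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centers.length; k++) {
            double dist = 0.0;
            for (int j = 0; j < instance.length; j++) {
                double diff = instance[j] - centers[k][j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    /** Accumulates one observation for the cluster it was assigned to. */
    @Override
    public void train(int actual, double[] instance) {
        counts[actual]++;
        for (int j = 0; j < instance.length; j++) {
            sums[actual][j] += instance[j];
        }
    }

    /** Ends one pass: moves each non-empty center to the mean of its observations. */
    @Override
    public void close() {
        for (int k = 0; k < centers.length; k++) {
            if (counts[k] > 0) {
                for (int j = 0; j < centers[k].length; j++) {
                    centers[k][j] = sums[k][j] / counts[k];
                }
            }
            Arrays.fill(sums[k], 0.0);
            counts[k] = 0;
        }
    }

    double[][] centers() {
        return centers;
    }
}

/** Hypothetical iterator: owns the loop that each driver would otherwise repeat. */
final class ClusterIteratorSketch {
    static ClusterClassifierSketch iterate(double[][] data,
                                           ClusterClassifierSketch classifier,
                                           int numIterations) {
        for (int i = 0; i < numIterations; i++) {
            for (double[] point : data) {
                // Classify against the current model, then train on that assignment.
                classifier.train(classifier.classify(point), point);
            }
            classifier.close();
        }
        return classifier;
    }
}
```

The point of the split is that the loop lives in one place and the model only has to know how to classify and how to be trained, so a model produced this way could in principle be handed straight to a classification step.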

On 2/11/12 2:29 PM, Frank Scholten wrote:
...
What kind of clustering refactoring do you mean here? I did some work on creating bean configurations in the past (MAHOUT-612). I underestimated the amount of work required to do the entire refactoring. If this can be contributed and committed on a per-job basis I would like to help out.
...

