[
https://issues.apache.org/jira/browse/MAHOUT-122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Deneche A. Hakim updated MAHOUT-122:
------------------------------------
Attachment: 3w_patch.diff
*3rd Week Patch*
work in progress...
*Changes*
* ForestBuilder becomes an object that uses a TreeBuilder object
* RandomForest represents a... guess what! It has methods to classify single
instances or a batch of data. It also contains methods to compute the total and
mean number of nodes and the mean max depth of the trees
* Added more PredictionCallback implementations!
** MeanTreeCollector computes the mean classification error among all the trees
of the forest
** MultiCallback allows many callbacks to be passed to the same classification
method
* BreimanExample is a running example similar to the testing procedures used in
Breiman's paper about Random Forests
* MemoryUsage is a running app used to collect the stats about memory usage
* DataSplit, a temporary app, allows splitting the KDD dataset (1%, 10%, 25%,
50%)
* TreeBuilder is an abstract class that builds a Decision Tree given a Data
instance
* DefaultTreeBuilder is an implementation of TreeBuilder based on Andrew W.
Moore's Decision Trees tutorial
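To make the callback design above easier to picture, here is a minimal sketch of how RandomForest, PredictionCallback, MultiCallback and MeanTreeCollector could fit together. Only the class names come from the list above; every method signature, the Tree interface, the int-coded labels and the demo class are assumptions made for illustration and are not taken from 3w_patch.diff.

```java
import java.util.Arrays;
import java.util.List;

/** Assumed signature: receives one tree's prediction for one instance. */
interface PredictionCallback {
  void prediction(int treeId, int instanceId, int predicted);
}

/** Forwards every prediction to several callbacks, so one classify call can feed them all. */
class MultiCallback implements PredictionCallback {
  private final List<PredictionCallback> callbacks;

  MultiCallback(PredictionCallback... callbacks) {
    this.callbacks = Arrays.asList(callbacks);
  }

  @Override
  public void prediction(int treeId, int instanceId, int predicted) {
    for (PredictionCallback callback : callbacks) {
      callback.prediction(treeId, instanceId, predicted);
    }
  }
}

/** Computes the mean, over all trees, of each tree's own classification error. */
class MeanTreeCollector implements PredictionCallback {
  private final int[] labels;  // true label of each instance (assumed int-coded)
  private final int[] errors;  // misclassified instances, per tree
  private final int[] counts;  // classified instances, per tree

  MeanTreeCollector(int[] labels, int nbTrees) {
    this.labels = labels;
    this.errors = new int[nbTrees];
    this.counts = new int[nbTrees];
  }

  @Override
  public void prediction(int treeId, int instanceId, int predicted) {
    counts[treeId]++;
    if (predicted != labels[instanceId]) {
      errors[treeId]++;
    }
  }

  double meanTreeError() {
    double sum = 0.0;
    int used = 0;
    for (int t = 0; t < counts.length; t++) {
      if (counts[t] > 0) {
        sum += (double) errors[t] / counts[t];
        used++;
      }
    }
    return used == 0 ? Double.NaN : sum / used;
  }
}

/** Hypothetical stand-in for a decision tree. */
interface Tree {
  int classify(double[] instance);
}

class RandomForest {
  private final List<Tree> trees;

  RandomForest(List<Tree> trees) {
    this.trees = trees;
  }

  /** Majority vote over the trees; each tree's vote is also reported to the callback. */
  int classify(int instanceId, double[] instance, int nbClasses, PredictionCallback callback) {
    int[] votes = new int[nbClasses];
    for (int t = 0; t < trees.size(); t++) {
      int predicted = trees.get(t).classify(instance);
      votes[predicted]++;
      if (callback != null) {
        callback.prediction(t, instanceId, predicted);
      }
    }
    int best = 0;
    for (int c = 1; c < nbClasses; c++) {
      if (votes[c] > votes[best]) {
        best = c;  // ties go to the lowest class index
      }
    }
    return best;
  }
}

class ForestSketchDemo {
  public static void main(String[] args) {
    Tree stump = x -> x[0] > 0.5 ? 1 : 0;  // toy one-split "tree"
    RandomForest forest = new RandomForest(Arrays.asList(stump, x -> 1, stump));
    MeanTreeCollector collector = new MeanTreeCollector(new int[] {0, 1}, 3);
    forest.classify(0, new double[] {0.2}, 2, collector);
    forest.classify(1, new double[] {0.8}, 2, collector);
    System.out.println("mean tree error: " + collector.meanTreeError());
  }
}
```

Keeping the statistics in callbacks means the forest's classification loop does not need to change when a new metric is added, which is presumably the point of MultiCallback.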
*What's next*
* some more memory usage tests
* I think it's time to start on the map-reduce implementation; the results of
the memory usage tests should help us decide which implementation to pursue
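For the memory usage tests mentioned above, a rough heap measurement can be taken with the standard java.lang.Runtime API. The sketch below is not the MemoryUsage app from the patch; HeapProbe and its methods are hypothetical names, and the numbers it returns are only approximate because Runtime.gc() is just a hint to the JVM.

```java
/** Hypothetical helper for rough heap measurements; not part of the patch. */
final class HeapProbe {
  private HeapProbe() {
  }

  /** Approximate number of heap bytes currently in use. */
  static long usedHeapBytes() {
    Runtime runtime = Runtime.getRuntime();
    runtime.gc();  // only a suggestion; the JVM may ignore it
    return runtime.totalMemory() - runtime.freeMemory();
  }

  /**
   * Approximate change in used heap caused by running the task.
   * Only objects still reachable after the task returns are counted,
   * and the result can be noisy or even negative.
   */
  static long measure(Runnable task) {
    long before = usedHeapBytes();
    task.run();
    return usedHeapBytes() - before;
  }
}
```

A caller would typically keep a reference to whatever the task builds (for example the loaded data or the trained forest), otherwise the garbage collector may reclaim it before the second reading is taken.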
> Random Forests Reference Implementation
> ---------------------------------------
>
> Key: MAHOUT-122
> URL: https://issues.apache.org/jira/browse/MAHOUT-122
> Project: Mahout
> Issue Type: Task
> Components: Classification
> Affects Versions: 0.2
> Reporter: Deneche A. Hakim
> Attachments: 2w_patch.diff, 3w_patch.diff, RF reference.patch
>
> Original Estimate: 25h
> Remaining Estimate: 25h
>
> This is the first step of my GSOC project. Implement a simple, easy to
> understand, reference implementation of Random Forests (Building and
> Classification). The only requirement here is that "it works".