Phil Sherrod wrote:
> I'm doing research comparing boosted decision trees to neural
> networks for various types of predictive analyses. A boosted
> decision tree is an ensemble tree created as a series of small
> trees that form an additive model. I'm using the TreeBoost method
> of boosting to generate the tree series. TreeBoost uses stochastic
> gradient boosting to improve the predictive accuracy of decision
> tree models (see http://www.dtreg.com/treeboost.htm).
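For readers who have not met the idea before: the "series of small trees forming an additive model" can be sketched in a few lines of R. This is only a toy illustration of stochastic gradient boosting for squared-error regression, not TreeBoost itself; the boost.trees function and its parameter names are made up for the example, and it assumes the rpart package (listed below) is installed.

  library(rpart)

  ## Toy stochastic gradient boosting: each small tree is fit to the
  ## residuals of the current additive model, using a random half of
  ## the data at every step (the "stochastic" part).
  boost.trees <- function(x, y, n.trees = 100, shrinkage = 0.1, depth = 2) {
    fit <- rep(mean(y), length(y))            # start from the constant model
    trees <- vector("list", n.trees)
    for (m in 1:n.trees) {
      d <- data.frame(x, resid = y - fit)     # residuals = negative gradient of squared error
      idx <- sample(nrow(d), floor(nrow(d) / 2))
      trees[[m]] <- rpart(resid ~ ., data = d[idx, ],
                          control = rpart.control(maxdepth = depth))
      fit <- fit + shrinkage * predict(trees[[m]], d)   # shrunken additive update
    }
    list(trees = trees, intercept = mean(y), shrinkage = shrinkage)
  }

  ## Example: predict ozone concentration from the built-in airquality data
  aq <- na.omit(airquality)
  model <- boost.trees(aq[, c("Solar.R", "Wind", "Temp")], aq$Ozone)

The shrinkage factor and the depth limit on each tree are what keep the individual trees "small" and the overall model an ensemble rather than one big tree.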
I think Phil exceeded the reasonable limits of Usenet advertising, so let me provide a list of cost-free classification tree utilities. I'm tagging with (*) those that support boasting, bragging (pun intended) or some other approach to reducing the model variance.

If you're interested in perturbation approaches (boosting, bagging, arcing), I recommend looking at Random Forests, the recent approach by L. Breiman, based on a voting ensemble of many small classification/regression trees:
   http://www.stat.berkeley.edu/users/breiman/RandomForests/
Keep in mind that a single classification tree represents a single interaction. If you have a mostly linear phenomenon, as in many real-life problems, a classification tree will represent it as one humongous interaction, which is not particularly clear or sophisticated. (A short R sketch contrasting a single tree with a forest is appended at the end of this message.)

A few free and user-friendly machine learning toolkits:
   http://www.cs.waikato.ac.nz/ml/weka/ (in Java)
   http://magix.fri.uni-lj.si/orange/default.asp (in Python)
   http://www.sgi.com/tech/mlc/ (in C++)

The R project is a tremendously powerful framework for computational statistics, but it might not be the easiest thing for a beginner:
   http://www.r-project.org/
There are some contributed libraries focusing on trees:
   http://cran.at.r-project.org/src/contrib/Descriptions/tree.html
   http://cran.at.r-project.org/src/contrib/Descriptions/randomForest.html
   http://cran.at.r-project.org/src/contrib/Descriptions/mvpart.html
   http://cran.at.r-project.org/src/contrib/Descriptions/rpart.html

KDnuggets is an index site for mostly commercial data mining software:
   http://www.kdnuggets.com/software/classification-tree-rules.html
There is another index of software at:
   http://www.mlnet.org/cgi-bin/mlnetois.pl/?File=software.html

RuleQuest has some simple and quick but good tools, with demonstration versions that work on small datasets (200 instances for regression, 400 instances for classification):
   http://www.rulequest.com/
   http://www.cse.unsw.edu.au/~quinlan/ (an older, free version of C4.5)

W.-Y. Loh has been involved with several tree-based algorithms:
   http://www.stat.nus.edu.sg/~kinyee/lotus.html (logistic regression trees)
   http://www.stat.wisc.edu/~loh/guide.html (regression)
   http://www.stat.wisc.edu/~loh/quest.html (classification)
   http://www.stat.wisc.edu/~loh/cruise.html (classification)

Jerome H. Friedman offers some of his tree-based software online (e.g., MART):
   http://www-stat.stanford.edu/~jhf/#software

L. Torgo has a pretty good regression tree learner:
   http://www.liacc.up.pt/~ltorgo/RT/

M. Robnik-Sikonja specializes in cost-sensitive modelling:
   http://lkm.fri.uni-lj.si/rmarko/software/index.htm

C. Borgelt has an extensive library of learning routines, including trees:
   http://fuzzy.cs.uni-magdeburg.de/~borgelt/software.html

The SMILES system:
   http://www.dsic.upv.es/~flip/smiles/

The ADtree system:
   http://www.grappa.univ-lille3.fr/grappa/en_index.php3?info=software

--
mag. Aleks Jakulin   http://www.ailab.si/aleks/
Artificial Intelligence Laboratory,
Faculty of Computer and Information Science, University of Ljubljana.
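P.S. To make the point about single trees versus ensembles concrete, here is a small R sketch using the rpart and randomForest packages listed above. It is only an illustration on R's built-in iris data; the dataset and the parameter choices are arbitrary, not a recommendation.

  library(rpart)
  library(randomForest)

  ## One classification tree: a single (possibly large) interaction structure
  single.tree <- rpart(Species ~ ., data = iris, method = "class")
  print(single.tree)

  ## Random forest: a voting ensemble of many trees grown on bootstrap samples
  rf <- randomForest(Species ~ ., data = iris, ntree = 500)
  print(rf)   # reports the out-of-bag estimate of the error rate

The out-of-bag error printed for the forest gives a rough idea of how the variance reduction from averaging many perturbed trees pays off compared with the single tree.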
