Dear Ey-Chih,

What are your use cases for a better random forest?

On 27 March 2013 11:59, Yutaka Mandai <[email protected]> wrote:
> My understanding is that the current Random Forest has a certain level of
> improvement for running on a Hadoop cluster, from a data-splitting alignment
> perspective, for better-balanced CPU utilization.
> Regards,
> Y.Mandai
>
> Sent from my iPhone
>
> On 2013/03/25, at 14:48, Ted Dunning <[email protected]> wrote:
>
>> I think that there are some others who could say more.
>>
>> On Mon, Mar 25, 2013 at 6:01 AM, Ey-Chih chow <[email protected]> wrote:
>>
>>> On Mar 24, 2013, at 1:00 AM, Ted Dunning wrote:
>>>
>>>> - random forest, sequential and parallel implementations, new versions
>>>> are being developed, the current version may or may not be useful to you.
>>>>
>>> Can you elaborate on the usefulness of the current version and the
>>> features of the new versions? Thanks.
>>>
>>> Ey-Chih Chow
>>>
>>>
>>> On Mar 24, 2013, at 1:00 AM, Ted Dunning wrote:
>>>
>>>> You are correct to suspect that this page is substantially out of date.
>>>>
>>>> Currently, Mahout has the following classifiers:
>>>>
>>>> - stochastic gradient descent for logistic regression (SGD) with L_1 or
>>>> L_2 regularization, sequential version only. These classifiers can be
>>>> easily extended with other gradients and regularizers which should make
>>>> linear SVMs easy to implement.
>>>>
>>>> - naive bayes, sequential and parallel implementations
>>>>
>>>> - random forest, sequential and parallel implementations, new versions
>>>> are being developed, the current version may or may not be useful to you.
>>>>
>>>> There are a variety of other classifiers which are in various states of
>>>> utility.
>>>>
>>>> On Mar 24, 2013, at 4:07 AM, Chidananda Sridhar wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am doing a class project on classification and want to use Mahout. I
>>>>> was searching for the classification algorithms already implemented in
>>>>> Mahout and came to this page:
>>>>> https://cwiki.apache.org/confluence/display/MAHOUT/Algorithms
>>>>>
>>>>> The webpage says that Online Passive Aggressive
>>>>> <https://cwiki.apache.org/confluence/display/MAHOUT/Online+Passive+Aggressive>
>>>>> is integrated and the rest of the classification algorithms are open or
>>>>> awaiting commit. Does the webpage have the latest information, or is it
>>>>> yet to be updated? Is "Online Passive Aggressive" the only algorithm I
>>>>> can use for now? On the other hand, I see that most of the clustering
>>>>> algorithms have been integrated.
>>>>>
>>>>> Thanks,
>>>>> Chidananda
>>>>
>>>
>>>

--
Dr Andy Twigg
Junior Research Fellow, St Johns College, Oxford
Room 351, Department of Computer Science
http://www.cs.ox.ac.uk/people/andy.twigg/
[email protected] | +447799647538
