Hi Ted,

Thanks for the answer. I am having some difficulty understanding why running random forest on top of Hadoop "does not produce arbitrary scalability". Could you elaborate? Also, are you aware of any work that involved developing random forest using map-reduce?

Thanks!
Lingxiang
________________________________
From: Ted Dunning <[email protected]>
To: [email protected]; Lingxiang Cheng <[email protected]>
Sent: Sunday, December 25, 2011 4:13 PM
Subject: Re: Mahout classifier on Hadoop

Random forest works as a map-reduce program, but that does not produce arbitrary scalability.

The Naive Bayes classifier is relatively natural as a map-reduce program and has a map-reduce version.

The linear classifiers like linear regression do not have map-reduce versions (yet) since there is some difficulty in getting these to work well.

On Sun, Dec 25, 2011 at 5:59 AM, Lingxiang Cheng <[email protected]> wrote:

> Hi,
>
> I am a newbie to Mahout. When I was reading the book "Mahout in
> Action", I found chapters talking about how clustering naturally fits into
> the Map/Reduce framework, but I did not see the same claim for classifiers.
> Does it involve a lot of work to make classifiers like random forest work
> with Hadoop?
>
> Thanks!
> Lingxiang Cheng
