Gary,

Yes, we have continued where you left off. All of those kernels have been
parallelized/simulated/analyzed, and they are now being optimized for
'many-core'. Hopefully, we will be able to publish soon :-).
Pradeep

-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 03, 2008 5:13 PM
To: [email protected]; 'Gary Bradski'
Cc: 'Andrew Y. Ng'; Dubey, Pradeep; 'Jimmy Lin'
Subject: Re: MapReduce, machine learning, and introductions

Random forests are very cool and very odd little beasts. +n!

On 4/3/08 5:04 PM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:

> Hi Gary,
>
> Thanks for your suggestion on Random Forests. I've cc'd this thread to the
> Mahout dev list just in case you would like to continue it there. We have
> received a lot of interest from students in conjunction with the Google
> Summer of Code project and others looking to contribute to our mission. We
> are not restricted at all to the 10 original NIPS algorithms; they were just
> a natural starting point and a way to "prime the pump". Perhaps some more
> information on your experiences using it on real manufacturing data would
> motivate an implementation.
>
> Jeff
>
> _____
>
> From: Gary Bradski [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 03, 2008 4:46 PM
> To: Jeff Eastman
> Cc: Andrew Y. Ng; Dubey, Pradeep; Jimmy Lin
> Subject: Re: MapReduce, machine learning, and introductions
>
> One of the things I'd like to see parallelized is Random Forests. Though
> there is no "best" algorithm for classification, when I ran it on Intel
> manufacturing data sets it almost always beat boosting, SVM, and MART.
> Zisserman claimed it worked best on keypoint recognition in vision, and
> his version was the simplest one I've heard of.
>
> This is one of those "brain dead" parallelizations -- just parcel out the
> learning of trees on randomly selected subsets of the data. In learning,
> each tree randomly selects from a subset of the features at each node.
>
> It has nice techniques for doing feature selection as well.
>
> Gary
>
> On Thu, Apr 3, 2008 at 4:27 PM, Jeff Eastman <[EMAIL PROTECTED]> wrote:
>
> Well, it has been a couple of years. Thanks for the response and
> retransmission. Good luck in your current endeavors.
>
> Jeff
>
> _____
>
> From: Gary Bradski [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 03, 2008 4:23 PM
> To: Andrew Y. Ng; Dubey, Pradeep
> Cc: Jeff Eastman; Jimmy Lin
> Subject: Re: MapReduce, machine learning, and introductions
>
> Re: Parallel machine learning project Mahout http://lucene.apache.org/mahout
>
> When I was at Intel, I began carving out a parallel machine learning niche
> since it was something interesting that Intel would also be interested in.
> But that was two companies ago for me, and I haven't touched it since. I'm
> now focused on sensor-guided manipulation and on revamping the computer
> vision library I started, OpenCV.
>
> About all I can do is send the last known working version of the code that I
> had. I've CC'd Pradeep Dubey, an Intel Fellow with whom I worked on some
> of the parallel machine learning issues; his team also studied that code. I
> don't know what has happened since, but parallel machine learning might
> still be one of his active areas, and maybe there's some synergy there.
>
> Gary
>
> On Thu, Apr 3, 2008 at 3:38 PM, Andrew Y. Ng <[EMAIL PROTECTED]> wrote:
>
> Hi Jeff,
>
> I'd been hearing increasing amounts of buzz about Mahout and am excited
> about it, but unfortunately am no longer working in this space.
> Gary Bradski, CC'd above, would be a great person to talk to about
> MapReduce and machine learning, though!
>
> Andrew
>
> On Thu, 3 Apr 2008, Jeff Eastman wrote:
>
>> Hi Andrew,
>>
>> I'm a committer on the new Mahout project. As Jimmy indicated, we are
>> setting out to implement versions of the NIPS paper algorithms on top of
>> Hadoop. So far, we have committed versions of only k-means and canopy, but
>> we have a number of other algorithms in various stages of implementation.
>> I don't have any immediate questions, but I live in Los Altos, so it would
>> be convenient to visit if you or your colleagues do have questions about
>> Mahout.
>>
>> In any case, I thought it would be nice to introduce myself.
>>
>> Jeff
>>
>> http://lucene.apache.org/mahout
>>
>> Jeff Eastman, Ph.D.
>> Windward Solutions Inc.
>> +1.415.298.0023
>> http://windwardsolutions.com
>> http://jeffeastman.blogspot.com
>>
>>> -----Original Message-----
>>> From: Jimmy Lin [mailto:[EMAIL PROTECTED]
>>> Sent: Saturday, March 29, 2008 8:37 PM
>>> To: [EMAIL PROTECTED]
>>> Cc: Jeff Eastman
>>> Subject: MapReduce, machine learning, and introductions
>>>
>>> Hi Andrew,
>>>
>>> How are things going? Haven't seen you in a while... hope everything
>>> is going well at Stanford.
>>>
>>> I was recently in the bay area attending the Yahoo Hadoop summit---
>>> I've been using MapReduce in teaching and research recently (stat MT,
>>> IR, etc.), so I was there talking about that.
>>>
>>> Are you aware of the Apache Mahout project? They are putting together
>>> an open-source MR toolkit for machine-learning-ish things; one of the
>>> things they're working on is implementing the various algorithms in
>>> your NIPS paper. Jeff Eastman is involved in the project, cc'ed
>>> here. I thought I'd put the two of you in touch...
>>>
>>> Best,
>>> Jimmy
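[Editor's note: the "brain dead" parallelization Gary describes in the thread above -- train each tree independently on a randomly selected subset of the data, considering only a random subset of the features at each node -- can be sketched as follows. This is a minimal illustrative sketch, not Mahout's implementation; all function and variable names are invented for illustration, and threads stand in for what would be separate MapReduce map tasks in a real deployment.]

```python
import math
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_node(X, y, n_feats, rng, depth=0):
    """Grow one tree node; at each split, consider only a random
    subset of n_feats features (the random-forest twist)."""
    if len(set(y)) == 1 or depth >= 8:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    feats = rng.sample(range(len(X[0])), n_feats)
    best = None
    for f in feats:
        for t in {row[f] for row in X}:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t = best
    L = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    R = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    return (f, t,
            build_node([r for r, _ in L], [l for _, l in L], n_feats, rng, depth + 1),
            build_node([r for r, _ in R], [l for _, l in R], n_feats, rng, depth + 1))


def train_tree(args):
    """Train one tree on a bootstrap sample (a randomly selected
    subset of the data) -- one independent unit of parallel work."""
    X, y, n_feats, seed = args
    rng = random.Random(seed)
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return build_node([X[i] for i in idx], [y[i] for i in idx], n_feats, rng)


def train_forest(X, y, n_trees=15, workers=4):
    """Trees share nothing, so training is embarrassingly parallel:
    each job here could equally be one map task in a MapReduce job."""
    n_feats = max(1, int(math.sqrt(len(X[0]))))
    jobs = [(X, y, n_feats, seed) for seed in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(train_tree, jobs))


def predict(node, x):
    """Walk one tree; internal nodes are (feature, threshold, L, R)."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node


def forest_predict(trees, x):
    """Majority vote over the ensemble."""
    return Counter(predict(t, x) for t in trees).most_common(1)[0][0]
```

[Threads are used only to keep the sketch self-contained; since each `train_tree` call touches no shared state, swapping in processes or Hadoop map tasks changes nothing in the logic.]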
