Gary,

Yes, we have continued where you left off. All of those kernels have been
parallelized/simulated/analyzed, and they are now being optimized for
'many-core'. Hopefully, we will be able to publish soon :-).
Pradeep

-----Original Message-----
From: Ted Dunning [mailto:[EMAIL PROTECTED]
Sent: Thursday, April 03, 2008 5:13 PM
To: [email protected]; 'Gary Bradski'
Cc: 'Andrew Y. Ng'; Dubey, Pradeep; 'Jimmy Lin'
Subject: Re: MapReduce, machine learning, and introductions

Random forests are very cool and very odd little beasts. +n!

On 4/3/08 5:04 PM, "Jeff Eastman" <[EMAIL PROTECTED]> wrote:

> Hi Gary,
>
> Thanks for your suggestion on Random Forests. I've cc'd this thread to the
> Mahout dev list just in case you would like to continue it there. We have
> received a lot of interest from students in conjunction with the Google
> Summer of Code project and others looking to contribute to our mission. We
> are not restricted at all to the 10 original NIPS algorithms; they were just
> a natural starting point and a way to "prime the pump". Perhaps some more
> information on your experiences using it on real manufacturing data would
> motivate an implementation.
>
> Jeff
>
> _____
>
> From: Gary Bradski [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 03, 2008 4:46 PM
> To: Jeff Eastman
> Cc: Andrew Y. Ng; Dubey, Pradeep; Jimmy Lin
> Subject: Re: MapReduce, machine learning, and introductions
>
> One of the things I'd like to see parallelized is Random Forests. Though
> there is no "best" algorithm for classification, when I ran it on Intel
> manufacturing data sets it almost always beat boosting, SVM, and MART.
> Zisserman claimed it worked best on keypoint recognition in vision, and
> his version was the simplest one I've heard of.
>
> This is one of those "brain dead" parallelizations -- just parcel out the
> learning of trees on randomly selected subsets of the data. In learning,
> each tree randomly selects from a subset of the features at each node.
>
> It has nice techniques for doing feature selection as well.
>
> Gary
>
> On Thu, Apr 3, 2008 at 4:27 PM, Jeff Eastman <[EMAIL PROTECTED]> wrote:
>
> Well, it has been a couple of years. Thanks for the response and
> retransmission. Good luck in your current endeavors.
>
> Jeff
>
> _____
>
> From: Gary Bradski [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 03, 2008 4:23 PM
> To: Andrew Y. Ng; Dubey, Pradeep
> Cc: Jeff Eastman; Jimmy Lin
> Subject: Re: MapReduce, machine learning, and introductions
>
> Re: Parallel machine learning project Mahout http://lucene.apache.org/mahout
>
> When I was at Intel, I began carving out a parallel machine learning niche
> since it was something interesting that Intel would also be interested in.
> But that was two companies ago for me, and I haven't touched it since. I'm
> now focused on sensor-guided manipulation and on revamping the computer
> vision library I started, OpenCV.
>
> About all I can do is send the last known working version of the code that I
> had. I've CC'd Pradeep Dubey, an Intel Fellow with whom I worked on some
> of the parallel machine learning issues; his team also studied that code. I
> don't know what has happened since, but parallel machine learning might
> still be one of his active areas, and maybe there's some synergy there.
>
> Gary
>
> On Thu, Apr 3, 2008 at 3:38 PM, Andrew Y. Ng <[EMAIL PROTECTED]> wrote:
>
> Hi Jeff,
>
> I'd been hearing increasing amounts of buzz about Mahout and am excited
> about it, but unfortunately am no longer working in this space.
> Gary Bradski, CC'd above, would be a great person to talk to about
> MapReduce and machine learning, though!
>
> Andrew
>
> On Thu, 3 Apr 2008, Jeff Eastman wrote:
>
>> Hi Andrew,
>>
>> I'm a committer on the new Mahout project. As Jimmy indicated, we are
>> setting out to implement versions of the NIPS paper algorithms on top of
>> Hadoop. So far, we have committed versions of only k-means and canopy, but
>> we have a number of other algorithms in various stages of implementation.
>> I don't have any immediate questions, but I live in Los Altos, so it would
>> be convenient to visit if you or your colleagues do have questions about
>> Mahout.
>>
>> In any case, I thought it would be nice to introduce myself.
>>
>> Jeff
>>
>> http://lucene.apache.org/mahout
>>
>> Jeff Eastman, Ph.D.
>> Windward Solutions Inc.
>> +1.415.298.0023
>> http://windwardsolutions.com
>> http://jeffeastman.blogspot.com
>>
>>> -----Original Message-----
>>> From: Jimmy Lin [mailto:[EMAIL PROTECTED]
>>> Sent: Saturday, March 29, 2008 8:37 PM
>>> To: [EMAIL PROTECTED]
>>> Cc: Jeff Eastman
>>> Subject: MapReduce, machine learning, and introductions
>>>
>>> Hi Andrew,
>>>
>>> How are things going? Haven't seen you in a while... hope everything
>>> is going well at Stanford.
>>>
>>> I was recently in the bay area attending the Yahoo Hadoop summit---
>>> I've been using MapReduce in teaching and research recently (stat MT,
>>> IR, etc.), so I was there talking about that.
>>>
>>> Are you aware of the Apache Mahout project? They are putting together
>>> an open-source MR toolkit for machine-learning-ish things; one of the
>>> things they're working on is implementing the various algorithms in
>>> your NIPS paper. Jeff Eastman is involved in the project, cc'ed
>>> here. I thought I'd put the two of you in touch...
>>>
>>> Best,
>>> Jimmy
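[Editor's note: the "brain dead" parallelization Gary describes in the thread above -- train each tree independently on a randomly selected subset of the data, considering only a random subset of the features at each node -- can be sketched as follows. This is a minimal illustrative sketch, not Mahout's implementation; all function and variable names are invented for illustration, and threads stand in for what would be separate MapReduce map tasks in a real deployment.]

```python
import math
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_node(X, y, n_feats, rng, depth=0):
    """Grow one tree node; at each split, consider only a random
    subset of n_feats features (the random-forest twist)."""
    if len(set(y)) == 1 or depth >= 8:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    feats = rng.sample(range(len(X[0])), n_feats)
    best = None
    for f in feats:
        for t in {row[f] for row in X}:
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return Counter(y).most_common(1)[0][0]
    _, f, t = best
    L = [(row, lab) for row, lab in zip(X, y) if row[f] <= t]
    R = [(row, lab) for row, lab in zip(X, y) if row[f] > t]
    return (f, t,
            build_node([r for r, _ in L], [l for _, l in L], n_feats, rng, depth + 1),
            build_node([r for r, _ in R], [l for _, l in R], n_feats, rng, depth + 1))


def train_tree(args):
    """Train one tree on a bootstrap sample (a randomly selected
    subset of the data) -- one independent unit of parallel work."""
    X, y, n_feats, seed = args
    rng = random.Random(seed)
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return build_node([X[i] for i in idx], [y[i] for i in idx], n_feats, rng)


def train_forest(X, y, n_trees=15, workers=4):
    """Trees share nothing, so training is embarrassingly parallel:
    each job here could equally be one map task in a MapReduce job."""
    n_feats = max(1, int(math.sqrt(len(X[0]))))
    jobs = [(X, y, n_feats, seed) for seed in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(train_tree, jobs))


def predict(node, x):
    """Walk one tree; internal nodes are (feature, threshold, L, R)."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node


def forest_predict(trees, x):
    """Majority vote over the ensemble."""
    return Counter(predict(t, x) for t in trees).most_common(1)[0][0]
```

[Threads are used only to keep the sketch self-contained; since each `train_tree` call touches no shared state, swapping in processes or Hadoop map tasks changes nothing in the logic.]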
