Surprisingly enough, even the large-scale SVD codes in Mahout are nearly I/O bound. That is the issue with sparse data: you really can process it nearly as fast as it comes in (with a few exceptional steps).
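To make the I/O-bound claim concrete, here is a back-of-the-envelope sketch. All numbers (bytes per stored nonzero, disk bandwidth, per-core flop rate) are illustrative assumptions, not measurements:

```python
# Why sparse matrix ops tend to be I/O bound: each stored nonzero costs
# roughly one multiply-add (2 flops) in a sparse matrix-vector product, but
# must first be read from disk as an (index, value) pair.
# ALL figures below are assumed, round 2012-era numbers for illustration.

BYTES_PER_NONZERO = 12        # assumed: int64 index + float32 value, no compression
FLOPS_PER_NONZERO = 2         # one multiply + one add per nonzero

disk_bandwidth = 100e6        # bytes/s -- assumed single-disk streaming rate
cpu_flops = 10e9              # flops/s -- assumed one core with SIMD

nnz_per_second_io = disk_bandwidth / BYTES_PER_NONZERO
nnz_per_second_cpu = cpu_flops / FLOPS_PER_NONZERO

print(f"nonzeros/s limited by disk: {nnz_per_second_io:.2e}")
print(f"nonzeros/s limited by CPU:  {nnz_per_second_cpu:.2e}")
print(f"CPU outpaces disk by {nnz_per_second_cpu / nnz_per_second_io:.0f}x")
```

Under these assumptions the arithmetic finishes hundreds of times faster than the disk can feed it, which is why throwing more compute (GPU or otherwise) at sparse workloads buys little unless the input never has to be read.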
On Fri, Jul 13, 2012 at 11:10 AM, mohsen jadidi <[email protected]> wrote:

> Yes, you are right. It's not something general, but I like the idea. It
> just proves that in some cases we can achieve higher speed.
>
> By the way, I was wondering which one is most beneficial for machine
> learning stuff, for something like matrix factorization such as
> eigendecomposition, or some clustering stuff. I am new to both and I want
> to choose one to focus on.
>
> On Tue, Jul 10, 2012 at 4:35 PM, Ted Dunning <[email protected]> wrote:
>
> > Note that on page 6 they explicitly say that if they had to actually
> > read their input, this wouldn't help. Since they *generate* their input
> > inside the GPU, they get a speedup. Without that aspect, they wouldn't
> > get any gain.
> >
> > This is a wildly non-typical case and is a great example of the kind of
> > program that I mentioned before that has an enormous ratio of compute /
> > input size.
> >
> > On Tue, Jul 10, 2012 at 3:51 AM, mohsen jadidi <[email protected]> wrote:
> >
> > > To add a note: this paper demonstrated that a version of Hadoop
> > > MapReduce, when "ported" to a small 4-node GPU cluster, could
> > > outperform a regular 62-node Hadoop CPU cluster and achieved a 508x
> > > speed-up per cluster node when performing Black-Scholes option
> > > pricing. It should be noted that the Black-Scholes algorithm is an
> > > analytical algorithm. The GPU cluster configuration comprised 5
> > > nodes, each comprising a quad-core CPU with two 9800 GX2 GPUs, each
> > > with 128 core processors, connected to a Gigabit Ethernet router, and
> > > one control node also connected to the Gigabit Ethernet router.
> > >
> > > On Tue, Jul 10, 2012 at 12:48 PM, mohsen jadidi <[email protected]> wrote:
> > >
> > > > Sorry, but I don't agree with you. We can benefit from the GPU to
> > > > speed up the Hadoop MapReduce computation. Look at this paper --
> > > > I just found it:
> > > >
> > > > http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5289201&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5289201
> > > >
> > > > On Mon, Jul 9, 2012 at 7:13 PM, Sean Owen <[email protected]> wrote:
> > > >
> > > >> Hadoop and CUDA are quite at odds -- Hadoop is all about splitting
> > > >> up a problem across quite remote machines, while CUDA/GPU
> > > >> approaches rely on putting all computation together, not only on
> > > >> one machine but within one graphics card.
> > > >>
> > > >> It doesn't make sense to combine them. Either you want to
> > > >> distribute a lot or you don't.
> > > >>
> > > >> As has been said above, it is all quite possible to implement if
> > > >> you want to. Nothing like this exists in Mahout. There is not even
> > > >> native code in this project.
> > > >>
> > > >> On Mon, Jul 9, 2012 at 6:07 PM, mohsen jadidi <[email protected]> wrote:
> > > >>
> > > >> > Yes, it makes sense, but I am more interested in getting faster
> > > >> > computation by combining Mahout and GPU capabilities. I just
> > > >> > wanted to know whether people involved in Mahout have thought
> > > >> > about it, or whether it is at all possible -- for example,
> > > >> > speeding up the Map and Reduce phases by parallelising
> > > >> > computations on the nodes. Of course, I am not aware of the
> > > >> > communication cost.
>
> --
> Mohsen Jadidi
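The Black-Scholes case discussed above can be made concrete. Pricing one European call with the standard closed-form solution consumes only five input numbers but a pile of transcendental math, which is exactly the "enormous ratio of compute / input size" regime, and in the cited paper the inputs are even generated on the GPU so nothing is read at all. A minimal sketch (the example parameter values are my own, not from the paper):

```python
# Black-Scholes closed-form price for a European call option.
# Five floats in, one float out, no data shuffle between records: a
# map-only workload that suits a GPU far better than it exercises Hadoop.
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, T, r, sigma):
    """Spot S, strike K, maturity T (years), rate r, volatility sigma."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# Illustrative (assumed) parameters: at-the-money, one year, 20% vol.
price = black_scholes_call(S=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2)
print(f"{price:.2f}")  # -> 10.45
```

Each option prices independently, so a GPU can evaluate millions of these per second; the only way Hadoop's strengths would matter is if the parameter sets had to be read from distributed storage, which is precisely the part the benchmark avoided.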
