Re: mahout job and hadoop

Dmitriy Lyubimov Wed, 16 May 2012 11:39:05 -0700

On Wed, May 16, 2012 at 3:00 AM, Chandra Mohan, Ananda Vel Murugan
<[email protected]> wrote:
> *       What is the difference between running a mahout job locally and
> in Hadoop?


Mostly the difference is whether the algorithm supports running on
MapReduce or not. Usually it is one way or the other (although
MapReduce-based solutions can be run using Hadoop local mode in some
cases, not all, and technically that would still be "running in Hadoop").

>
>
>
> *       I wrote a simple mahout job to do K-means clustering using my
> data. I packaged it as jar and tried running it. It worked and did the
> clustering in a Hadoop single node cluster. I am planning to move this
> job to a multi node cluster.  Should I execute mahout command from job
> tracker node only? Or can I execute it from any node in cluster and be
> assured that it uses all the nodes in the cluster. How mahout works in a
> multi node cluster?
>


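You can execute the command line (it's called the "driver" in Hadoop's lingo)
from any node that has network connectivity to the MapReduce cluster
(i.e. you don't have to choose any particular node or even be within
the cluster), but you should do it only once. The cluster distributes the
work across its nodes, so launching the driver on several nodes would
just submit the same job several times.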
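
Since the only requirement on the driver machine is network connectivity to
the cluster, a quick pre-check from that machine can save a confusing failure
later. Below is a minimal sketch in plain Java (no Hadoop dependency) that
tests whether a TCP connection to a given host and port can be opened. The
class name ClusterReachability and the method canReach are hypothetical names
made up for this illustration; the host and port you pass in would be
whatever your client-side Hadoop configuration points at (in Hadoop 1.x
these are normally the fs.default.name and mapred.job.tracker settings). A
successful check only shows that the port accepts connections, not that job
submission will work.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Hadoop or Mahout.
final class ClusterReachability {

    // Returns true if a TCP connection to host:port can be opened
    // within timeoutMs milliseconds, false otherwise.
    static boolean canReach(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Unknown host, refused connection, timeout, or invalid port.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: ClusterReachability <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        String status = canReach(host, port, 3000) ? "reachable" : "NOT reachable";
        System.out.println(host + ":" + port + " is " + status);
    }
}
```

You would run it on the intended driver machine with the JobTracker (or
NameNode) host and port as arguments, and launch the Mahout driver from
there once the check passes.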

-d
