Re: kMeans Implementation

Gmail Tue, 04 Mar 2014 03:03:24 -0800

I used the kMeansDriver class in the clustering.kmeans package.

Yes, I know that the use of MapReduce is mandatory, but I think that there exists an easier implementation, and especially a more MapReduce-oriented one.


Anyway, I thought it was a choice driven by performance.

Thank you.


On 03/04/2014 11:48 AM, Sean Owen wrote:

Although I don't know exactly what you're referring to, in general,
nothing about Map/Reduce means you always use a reducer. There are
plenty of tasks that are much more appropriate as a map-only or
reduce-only job. So this assertion doesn't fly to start with. But if
you see two jobs that might be merged into one, that could be a useful
suggestion.

On Tue, Mar 4, 2014 at 10:43 AM, Gmail <[email protected]> wrote:

Hello,
I was studying the Mahout libraries and I found something strange in your
kMeans implementation.

I was looking inside it and I noticed that kMeans only uses map
functions, omitting the reducers. Why did you make this choice?
It does not use the MapReduce programming model, even though Mahout's core is
declared to be Hadoop.
Is this choice driven by performance concerns?

Best regards
Manuel Sequino
