On Wed, Apr 6, 2011 at 9:03 AM, Dhruv <[email protected]> wrote:

> Just a few quick questions:
>
> 1. Is it better to start writing a driver first, followed by the mapper,
> combiner and reducer or, write the driver at the end?


This is a matter of personal style.

I tend to write the mapper, combiner and reducer first, along with unit tests
for them.  That helps me avoid debugging large-scale programs.
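
As a rough sketch of what I mean by testable map and reduce code: keep the
per-record logic in plain methods that do not touch Hadoop types, and have the
real Mapper and Reducer classes delegate to them.  The class name
WordCountLogic and the word-count task are made up for illustration; they are
not from Mahout.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical example: map/reduce logic kept free of Hadoop types so it can
// be unit tested without a cluster.
final class WordCountLogic {
  private WordCountLogic() {}

  // Map step: emit (token, 1) for every whitespace-separated token.
  static List<Map.Entry<String, Integer>> map(String line) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    for (String token : line.trim().split("\\s+")) {
      if (!token.isEmpty()) {
        out.add(new SimpleEntry<>(token, 1));
      }
    }
    return out;
  }

  // Reduce step: sum the counts for one key.  Because addition is
  // associative and commutative, the same method can serve as a combiner.
  static int reduce(Iterable<Integer> counts) {
    int sum = 0;
    for (int c : counts) {
      sum += c;
    }
    return sum;
  }
}
```

A unit test then only has to call map() and reduce() with small literal
inputs, which is much easier than chasing the same bug on a cluster.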

Next comes the driver which is usually trivial.  It helps to make sure that
the driver can be easily called programmatically
for testing.
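
One way to keep a driver callable from tests, sketched here without any Hadoop
dependencies: put the argument handling and job setup in a run() method that
returns a status code, and keep main() as a thin wrapper around it.  The name
ExampleDriver and the two-argument usage are assumptions for illustration.  In
real Hadoop code the Tool interface with ToolRunner gives you roughly the same
shape.

```java
// Hypothetical driver sketch: run() holds the logic and returns a status
// code, so a test can call it directly without the JVM exiting.
final class ExampleDriver {
  private ExampleDriver() {}

  static int run(String[] args) {
    if (args.length != 2) {
      System.err.println("usage: ExampleDriver <input> <output>");
      return 2;
    }
    String input = args[0];
    String output = args[1];
    // A real driver would configure and submit the job for these paths here;
    // that part is omitted in this sketch.
    return 0;
  }

  // Thin wrapper: the only place that calls System.exit.
  public static void main(String[] args) {
    System.exit(run(args));
  }
}
```

A test can then call ExampleDriver.run(new String[] {"in", "out"}) and check
the return value.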

Then I test with a trivial input running in local mode.


> Is it just a matter of
> personal style of a top down vs bottom up development or do these
> approaches
> have any tangible pros and cons in the context of iterative Map Reduce
> programming?
>

The biggest issue that I see is that debugging a parallel program can be
quite difficult.

> 2. Is there a preferred IDE (Eclipse, IntelliJ or...)? I've been using
> Eclipse so far.
>

We go to some lengths to make that a personal choice as well.  Committers
here use both of the major options (Eclipse and IntelliJ).  I don't know of
anybody using NetBeans.  There is bound to be somebody who still likes to
type everything.  My own preference is IntelliJ.


> 3. Is there a place where I can find guidelines for code formatting
> specific
> to Mahout? Things like indentation, class names, comments etc.
>

We follow the Lucene standard, which is the Sun standard with 2-space
indentation.

This page might help, especially near the bottom:
https://cwiki.apache.org/MAHOUT/how-to-contribute.html

Note to others, this wiki page has lots more information than the
how-to-contribute page that is linked from our main site.
