On Wed, Apr 6, 2011 at 9:03 AM, Dhruv <[email protected]> wrote:

> Just a few quick questions:
>
> 1. Is it better to start writing a driver first, followed by the mapper,
> combiner and reducer or, write the driver at the end?
This is a matter of personal style.

I tend to write the mapper, combiner and reducer first, along with unit tests for them. That helps me avoid debugging large-scale programs. Next comes the driver, which is usually trivial. It helps to make sure that the driver can easily be called programmatically for testing. Then I test with a trivial input running in local mode.

> Is it just a matter of personal style of a top down vs bottom up
> development or do these approaches have any tangible pros and cons
> in the context of iterative Map Reduce programming?

The biggest issue that I see is that debugging a parallel program can be quite difficult.

> 2. Is there a preferred IDE (Eclipse, IntelliJ or...)? I've been using
> Eclipse so far.

We go to some lengths to make that a personal choice as well. Committers here use both of the major options (Eclipse and IntelliJ). I don't know of anybody using NetBeans. There is bound to be somebody who still likes to type everything. My own preference is IntelliJ.

> 3. Is there a place where I can find guidelines for code formatting
> specific to Mahout? Things like indentation, class names, comments etc.

Lucene standard, which is Sun standard with 2-space indentation.

This page might help, especially near the bottom:

https://cwiki.apache.org/MAHOUT/how-to-contribute.html

Note to others: this wiki page has lots more information than the how-to-contribute page that is linked from our main site.
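
To make the answer to question 1 a bit more concrete, here is a rough sketch of the bottom-up workflow: map and reduce logic written as small functions that can be unit tested on their own, plus a trivial driver that can be called from code. Everything in it is an assumption for illustration. The word-count job, the class name WordCountSketch, and the methods map, reduce and run are hypothetical, and it uses plain Java collections as a stand-in for the Hadoop Mapper and Reducer classes, so it is not Mahout code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical word-count job in plain Java (no Hadoop dependency).
final class WordCountSketch {

  // "Mapper" logic: turn one input line into (word, 1) pairs.
  static List<Map.Entry<String, Integer>> map(String line) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    for (String token : line.trim().split("\\s+")) {
      if (!token.isEmpty()) {
        out.add(Map.entry(token.toLowerCase(Locale.ROOT), 1));
      }
    }
    return out;
  }

  // "Reducer" logic: sum the counts for one key. Summation is associative
  // and commutative, so the same function could also serve as a combiner.
  static int reduce(String key, List<Integer> values) {
    int sum = 0;
    for (int v : values) {
      sum += v;
    }
    return sum;
  }

  // "Driver": wires map -> group by key -> reduce. It takes its input as an
  // argument and returns the result, so a test can call it directly.
  static Map<String, Integer> run(List<String> lines) {
    Map<String, List<Integer>> grouped = new TreeMap<>();
    for (String line : lines) {
      for (Map.Entry<String, Integer> kv : map(line)) {
        grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
      }
    }
    Map<String, Integer> result = new TreeMap<>();
    for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
      result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
    }
    return result;
  }

  // Trivial input, run in a single JVM, analogous to a local-mode smoke test.
  public static void main(String[] args) {
    System.out.println(run(List.of("a b a", "b c")));
  }
}
```

The point of the layout is that map and reduce can each be checked with a one-line test before anything runs in parallel, and run can be exercised from a test without going through a command line.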
