Pradeep Fernando wrote:
Thanks Enis & Steve,

 for your valuable guidelines. As I understand it, since Hadoop is an
implementation of map-reduce, it is aimed at working in a clustered
environment. So, having only a desktop and an Internet connection, I have
doubts whether I can successfully contribute to the project.


You can have some fun on a single machine; the algorithms are the same, just some of the problems involved in running jobs and managing machines are different.


Although this sounds good, there are clouds like Amazon EC2 for setting up a
cluster. Do you devs make use of that sort of infrastructure in development
testing? I don't know whether this is the right question or whether it is
relevant at all. Please bear with me if not.


Tom White does.
EC2 has some issues:
 * your test runs accrue debt, especially if you are setting up and tearing down machines regularly
 * the network is insecure
Hadoop could do with some improvement over network security, though there's no reason why AWS couldn't offer virtual VPNs.


Have you worked on Apache projects before?


Yes, I have contributed to the Apache Axis2 project, so I'm pretty
familiar with Java, Ant, Maven, JUnit, etc.



OK, if you worked on Axis2 then you'll know the basics, though be advised that the Hadoop commit process is much more rigorous.

I'd recommend you start playing with MR algorithms on any data you have to hand; if you want some interesting datasets, ask on the user mailing list and you will get some pointers. Go with a current release, not SVN_HEAD, if you want stability in your life. Only if/when you want to make changes to the code should you go with SVN_HEAD.
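
To get a feel for the shape of an MR algorithm before touching a cluster, here is a minimal single-machine sketch of word count. It is plain Python, not the Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and the in-memory shuffle only stands in for what the framework does between the map and reduce steps.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key (an in-memory stand-in for the
    framework's sort/shuffle between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all the counts for one word into a total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

The map and reduce functions keep the same split as in a real job; only the plumbing around them is simplified here.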

-steve
