Re: Contributing to hadoop

Steve Loughran Thu, 26 Feb 2009 03:54:45 -0800

Pradeep Fernando wrote:

Thanks Enis & Steve, for your valuable guidelines. As I understand it, since
Hadoop is an implementation of map-reduce, it is aimed at working in a
clustered environment. Having only a desktop and an Internet connection, I
have doubts about whether I can successfully contribute to the project.

You can have some fun on a single machine; the algorithms are the same, just some of the problems involved in running jobs and managing machines are different.


That sounds good, although there are clouds like Amazon EC2 for setting up a
cluster. Do you devs make use of that sort of infrastructure in development
and testing? I don't know whether this is the right question, or whether it is
relevant at all. Please bear with me if it isn't.



Tom White does.
EC2 has some issues:

 * your test runs accrue debt, especially if you are setting up and tearing down machines regularly

 * the network is insecure

Hadoop could do with some improvement in network security, though there's no reason why AWS couldn't offer virtual VPNs.


Have you worked on Apache projects before?


Yes, I have contributed to the Apache Axis2 project, so I'm pretty
familiar with Java, Ant, Maven, JUnit, etc.

OK, if you worked on Axis2 then you'll know the basics, though be advised that the Hadoop commit process is much more rigorous.

I'd recommend you start playing with MR algorithms on any data you have to hand; if you want some interesting datasets, then ask on the user mailing list and you will get some pointers. Go with a current release, and not SVN head, if you want stability in your life. Only if/when you want to make changes to the code should you go with SVN head.
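
For anyone wanting to try the "same algorithms on a single machine" idea before touching a cluster, here is a minimal sketch of the map, shuffle and reduce steps as a word count. It is plain Python and deliberately does not use the Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample data are made up for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word, as a mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key; on a cluster the framework does this step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each word, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence inside one process."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input; substitute any text you have to hand.
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

On a real cluster the grouping step is handled by the framework and spread across machines, which is where the job-running and machine-management problems come in; the map and reduce logic itself stays the same.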


-steve
