+1

I'm one of those new people :)

On Thu, Apr 18, 2013 at 1:32 PM, Noelle Jakusz (c) <njak...@vmware.com> wrote:

> +1
>
> There are quite a few new people, so maybe start a collaborative group
> where you can collect notes and steps (videos and articles). I know I would
> have some for you that I have created as I have gotten started... it would
> be a great idea to post them after some collaboration and review.
>
> Thanks Chris for the detailed reply...
>
> -----Original Message-----
> From: Chris Nauroth [mailto:cnaur...@hortonworks.com]
> Sent: Thursday, April 18, 2013 1:14 PM
> To: common-dev@hadoop.apache.org
> Subject: Re: How to understand Hadoop source code ?
>
> Is there a specific bug fix or feature that you are trying to contribute?
> Specific questions like "how can I help with jira X?" or "what is the
> main entry point when I run the hdfs command?" or "where does the namenode
> serialize metadata to disk" or "where does the secondary namenode execute a
> checkpoint" can help focus the conversation.
>
> AFAIK, we don't have a general code walkthrough document focused on
> onboarding new engineers. This could be a valuable contribution if you
> want to gather notes while you learn. I think this always works best if
> it's driven by a new engineer with review by an expert. (If the experts
> write it, then they might accidentally skip something non-obvious that
> they've already internalized.)
>
> Since that document doesn't exist yet, the other option is to do some
> reading of the code, ideally while trying to fix a specific bug that has
> been filed in jira. Like you said, it's a relatively large codebase, so
> it's impractical to read the whole thing top-to-bottom. Instead, it's
> important to look for high-level clues that steer you towards the right
> files. I've found that the Maven module structure and the Java package
> names are usually descriptive enough to steer me in the right direction.
> If you focus on getting familiar with those, you'll basically build a
> btree inside your brain that helps you index into the right part of the
> codebase and answer your own questions rapidly. Several examples:
>
> "Where is the main entry point for the datanode daemon?": module
> hadoop-hdfs, package org.apache.hadoop.hdfs.server.datanode
>
> "What is the algorithm for rebalancing an unbalanced cluster?": module
> hadoop-hdfs, package org.apache.hadoop.hdfs.server.balancer
>
> "How does YARN launch a new container process?": module
> hadoop-yarn-server-nodemanager, package
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher
>
> "Multiple daemons publish JMX metrics as a common concern. Where is that
> implemented?": module hadoop-common, package org.apache.hadoop.metrics2
>
> I hope this is helpful to get the process started for you. We're always
> here to help if you have specific follow-up questions.
>
> Thanks,
> --Chris
>
>
> On Wed, Apr 17, 2013 at 10:33 PM, Prabakaran Krishnan <
> prabakaran_j...@yahoo.in> wrote:
>
> > Could you please help me understand map reduce in Hadoop?
> >
> > ________________________________
> > From: Mohammad Mustaqeem <3m.mustaq...@gmail.com>
> > To: common-dev <common-dev@hadoop.apache.org>
> > Sent: Thursday, 18 April 2013 10:44 AM
> > Subject: Re: How to understand Hadoop source code ?
> >
> >
> > I am interested in HDFS. Please guide me.
> >
> >
> > On Thu, Apr 18, 2013 at 3:36 AM, Arun C Murthy <a...@hortonworks.com>
> > wrote:
> >
> > > Please don't cross post.
> > >
> > > What parts of Hadoop are you interested in? HDFS? YARN? MapReduce?
> > >
> > > Arun
> > >
> > > On Apr 17, 2013, at 2:50 PM, Mohammad Mustaqeem wrote:
> > >
> > > > Hello everyone,
> > > > I am new to this group. Since the source code of Hadoop is very
> > > > big, I am not able to understand it entirely.
> > > > Is there any document that describes the code?
> > > > Is there any way to understand the functionality of each class and
> > > > its methods?
> > > >
> > > > --
> > > > *With regards ---*
> > > > *Mohammad Mustaqeem*,
> > > > M.Tech (CSE)
> > > > MNNIT Allahabad
> > >
> > > --
> > > Arun C. Murthy
> > > Hortonworks Inc.
> > > http://hortonworks.com/
> > >
> >
> >
> > --
> > *With regards ---*
> > *Mohammad Mustaqeem*,
> > M.Tech (CSE)
> > MNNIT Allahabad
> > 9026604270
> >
>