Finally, I started it successfully, with the NameNode and one DataNode both running on localhost.
My configuration steps were:

1. Extract the code from the tar.gz; I got version hadoop-0.12.3.
2. In Eclipse, create a new project from the ant file "build.xml" in the source code folder (New -> Java Project from Existing Ant Buildfile).
3. Try to compile. (You may have to configure the Java compiler version in the project properties or the Eclipse preferences; I just enabled Java 6.0 on my Ubuntu 7.04.)
4. If that goes well, find NameNode.java, configure it as a Java Application, and try to run it.
5. If there are log4j exceptions like "can not find log appender", it might be a "conf" problem. I fixed this by adding the "Hadoop/conf" folder as a source folder. In Eclipse that is easy: find the conf folder in the package explorer tree view, then right-click -> Build Path -> "Use as Source Folder".
6. Rebuild and try to run again. Now there may be an exception like "NameNode has not been formatted".
7. Add the "-format" argument to the application once; it will format the namenode. Then remove this argument again.
8. Then I made some other configuration changes: export "HADOOP_HOME" in hadoop-env.sh; making it point to the source code path is fine. Configure hadoop-site.xml just as the Hadoop wiki says: host, ports, and paths such as dfs.name.dir. Here I just gave it the path that the format step generated, something like "*/workspace/Hadoop/filesystem/name". (A sample hadoop-site.xml follows this list.)
9. Rebuild and retry. Now you are bound to hit the "webapps not found in classpath" error that I referred to in my last post. Just copying webapps to the Hadoop/bin folder won't do; that only causes another strange exception.
10. After tracing some code, I found that while creating the HTTP server it does find webapps in /src/webapps; yes, it's there, but it doesn't work. I copied "Hadoop/src/webapps" to "Hadoop/src/java/webapps", refreshed the tree view in Eclipse, found the webapps folder under java/, and chose right-click -> Build Path -> Include. Now the webapps folder gets copied to the build output folder we configured, Hadoop/bin or Hadoop/build; I chose the first as the default.
11. Try again: the NameNode starts. Cheers!
12. Configure DataNode.java as a Java Application and run it: it starts as well. Cheers again!
13. Then I set some breakpoints in the source files and wrote some code that calls the FSShell from another computer (a sketch of that client follows this list). Wonderful: the breakpoints fired on the server side.
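By the way, for step 8, here is roughly what my hadoop-site.xml ends up as. Treat it as a sketch: the port and filesystem paths are placeholders for my local choices, and I am using the 0.12-era convention where fs.default.name takes a plain host:port with no scheme:

    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <!-- host:port of the NameNode; localhost because everything runs on one box -->
        <value>localhost:9000</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <!-- the directory that the -format step created (placeholder path) -->
        <value>/home/me/workspace/Hadoop/filesystem/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <!-- where the DataNode keeps its blocks (placeholder path) -->
        <value>/home/me/workspace/Hadoop/filesystem/data</value>
      </property>
    </configuration>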
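And for step 13, the client I ran from the other computer was essentially just the following. This is a minimal sketch against the old 0.12-era FileSystem API (listPaths was the listing call back then, if I remember right); "namenode-host" is a placeholder for the machine where the NameNode runs under Eclipse:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DebugClient {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode started from Eclipse.
        // "namenode-host:9000" is a placeholder; use your own host and port.
        conf.set("fs.default.name", "namenode-host:9000");

        FileSystem fs = FileSystem.get(conf);

        // Any call that goes through the RPC layer will do; listing "/"
        // is enough to make breakpoints fire in the NameNode process.
        Path[] paths = fs.listPaths(new Path("/"));
        if (paths != null) {
          for (Path p : paths) {
            System.out.println(p);
          }
        }
      }
    }

Every request this client makes travels over RPC to the NameNode process sitting in the Eclipse debugger, which is why the breakpoints trigger on the server side.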
-------------------------------------------------------------------------------------------------------------------------
After all that, I still have two questions:
1. If I want to start the JobTracker, etc., do I just do the same as for the DataNode?
2. How can I start a cluster with several DataNodes? Do I need some scripts?

And thanks for your replies, guys; they really helped me a lot. Thanks.

KrzyCube


KrzyCube wrote:
>
> I take the steps below:
>
> 1. Create a new project from the existing ant file "build.xml".
> 2. Try to compile the project; it goes well.
> 3. Find NameNode.java and configure it as a Java App to run.
> 4. It tells me that the NameNode is not formatted, so I do that with the -format argument.
> 5. Then, exceptions like "webapps" not found in classpath.
> 6. So I try to configure the src/webapps folder via Build -> Use as Source Folder.
> 7. Build the project again, but I can't find the webapps output in the build_output_path.
> 8. Then I just copy "webapps" to the bin/ path, as my build output path is Hadoop/bin.
> 9. Then exceptions like these:
> ----------------------------------------------------------------------------------------------------------------------
> 07/06/22 12:42:22 INFO dfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
> 07/06/22 12:42:22 INFO dfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
> 07/06/22 12:42:22 INFO util.Credential: Checking Resource aliases
> 07/06/22 12:42:22 INFO http.HttpServer: Version Jetty/5.1.4
> 07/06/22 12:42:22 INFO util.Container: Started HttpContext[/static,/static]
> 07/06/22 12:42:23 INFO util.Container: Started [EMAIL PROTECTED]
> 07/06/22 12:42:23 INFO http.SocketListener: Started SocketListener on 0.0.0.0:50070
> 07/06/22 12:42:23 ERROR dfs.NameNode: java.io.IOException: Problem starting http server
>     at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:211)
>     at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:274)
>     at org.apache.hadoop.dfs.NameNode.init(NameNode.java:178)
>     at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:195)
>     at org.apache.hadoop.dfs.NameNode.main(NameNode.java:728)
> Caused by: org.mortbay.util.MultiException[java.lang.ClassNotFoundException: org.apache.hadoop.dfs.dfshealth_jsp, java.lang.ClassNotFoundException: org.apache.hadoop.dfs.nn_005fbrowsedfscontent_jsp]
>     at org.mortbay.http.HttpServer.doStart(HttpServer.java:731)
>     at org.mortbay.util.Container.start(Container.java:72)
>     at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:188)
>     ... 4 more
> -------------------------------------------------------------------------------------------------------------------------
> I tried configuring here and there, and tried and tried, but this exception is still there.
> What might be behind this exception?
>
> Thanks a lot
> KrzyCube
>
>
> Konstantin Shvachko wrote:
>>
>> I run an entire one-node cluster in Eclipse by just executing main() (run or debug menus) for each component.
>> You need to configure Eclipse correctly in order to do that. Can you compile the whole thing under Eclipse?
>> NameNode example:
>> = Open NameNode.java in the editor.
>> = Run / Run
>> = New Java Application -> will create an entry under "Java Application" named NameNode
>> = Select NameNode, go to the Arguments tab and enter the following under "VM Arguments":
>>     -Dhadoop.log.dir=./logs
>>     -Xmx500m
>>     -ea
>>   The first one is required and can point to your log directory; the other two are optional.
>> = Go to the "Classpath" tab and add the "hadoop/build" path under "User entries" via
>>   Advanced / New Folder / select "hadoop/build".
>> That should be it, if the default classpath is configured correctly, and if I am not forgetting anything.
>> Let me know if that helped; I'll send you screenshots of my configuration if not.
>>
>> --Konstantin
>>
>>
>> Mahajan, Neeraj wrote:
>>
>>> There are two separate issues you are asking about here:
>>> 1. How to modify/add to Hadoop code and execute the changes -
>>> Eclipse is just an IDE; it doesn't matter whether you use Eclipse or
>>> some other editor.
>>> I have been using Eclipse. What I do is modify the code using Eclipse
>>> and then run "ant jar" in the root folder of Hadoop (you could also
>>> configure this to work directly from Eclipse). This regenerates the
>>> jars and puts them in the build/ folder. Now you can either copy these
>>> jars into the Hadoop root folder (removing "dev" in their name) so that
>>> they replace the original jars, or modify the scripts in bin/ to point
>>> to the newly generated jars.
>>>
>>> 2. How to debug using an IDE -
>>> This page gives a high-level intro to debugging Hadoop:
>>> http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms
>>> As I see it, there are two ways you can debug Hadoop programs: run
>>> Hadoop in local mode and debug in-process in the IDE, or run Hadoop in
>>> distributed mode and remote-debug using the IDE.
>>>
>>> The first way is easy. At the end of the bin/hadoop script there is an
>>> exec command; put an echo command there instead and run your program.
>>> You can see what parameters the script passes while starting Hadoop.
>>> Use these same parameters in the IDE and you can debug Hadoop. Remember
>>> to change the conf files so that Hadoop runs in local mode. To be more
>>> specific, you will have to set the program arguments and VM arguments,
>>> and add an entry in the classpath pointing to the conf folder.
>>>
>>> The second method is complicated. You will have to modify the scripts
>>> and put in some extra params like "-Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=<port>" for
>>> the java command. Specify the <port> of your choice in it. On a server
>>> where you are running both the namenode and jobnode there will be a
>>> conflict, as the same port would be specified, so you will have to do
>>> some intelligent scripting to take care of this. Once the java
>>> processes start, you can attach the Eclipse debugger to that machine's
>>> <port> and set breakpoints. Up to this point you can debug everything
>>> before the map-reduce tasks. Map-reduce tasks run in separate
>>> processes; for debugging them you will have to figure things out
>>> yourself.
>>>
>>> The best way is to debug using the first approach (as the above link
>>> says). I think with that approach you can fix any map-reduce related
>>> problems, and for other, purely distributed kinds of problems you can
>>> follow the second approach.
>>>
>>> ~ Neeraj
>>>
>>> -----Original Message-----
>>> From: KrzyCube [mailto:[EMAIL PROTECTED]
>>> Sent: Thursday, June 21, 2007 2:08 AM
>>> To: hadoop-user@lucene.apache.org
>>> Subject: How to Start Hadoop Cluster from source code in Eclipse
>>>
>>>
>>> Hi, all:
>>>
>>> I am using Eclipse to view the Hadoop source code, and I want to trace
>>> it to see how it works. I wrote a little code to call the FSClient, and
>>> when I call into the RPC object, I can't step any deeper.
>>>
>>> So I just want to start the cluster from the source code, which I am
>>> holding in Eclipse now.
>>> I browsed the start-*.sh scripts and found that they start several
>>> processes, such as namenode, datanode, secondarynamenode; I just don't
>>> know how to work it out.
>>>
>>> Or is there any way to attach to a running process, just as with gdb
>>> when debugging C code?
>>>
>>> Has anybody ever used Eclipse to debug this source code? Please give
>>> some tips.
>>>
>>>
>>>
>>> Thanks.
>>>
>>>
>>> KrzyCube
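P.S. For Neeraj's second approach, the script change is small: in bin/hadoop, add the debug options he quoted to the final java command. Roughly like this, where port 8000 is just my arbitrary pick and "..." stands for the classpath and options the script already assembles:

    java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000 ... org.apache.hadoop.dfs.NameNode

Then attach Eclipse with a "Remote Java Application" debug configuration pointing at that host and port.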