Finally, I started it successfully, with the NameNode and one DataNode both running on localhost.
My configuration steps were:

1. Extract the code from the tar.gz; I got version hadoop-0.12.3.
2. In Eclipse, create a new project from the ant file "build.xml" in the source code folder (New -> Java Project from Existing Ant Buildfile).
3. Try to compile. (You may have to configure the Java compiler version in the project properties or the Eclipse preferences; I just enabled Java 6.0 on my Ubuntu 7.04.)
4. If that goes well, find NameNode.java, configure it as a Java Application, and try to run it.
5. If there are log4j exceptions like "can not find log appender", it might be a "conf" problem. I fixed this by adding the "Hadoop/conf" folder as a source folder. In Eclipse that is easy: find the conf folder in the package explorer tree view, then right-click -> Build Path -> "Use as Source Folder".
6. Rebuild and try to run again. Now there may be an exception like "NameNode has not been formatted".
7. Add the "-format" argument to the application once; it will format the namenode. Then remove this argument again.
8. Then I made some other configuration changes: export "HADOOP_HOME" in hadoop-env.sh; making it point to the source code path is fine. Configure hadoop-site.xml just as the Hadoop wiki says: host, ports, and paths such as dfs.name.dir. Here I just gave it the path that the format step generated, something like "*/workspace/Hadoop/filesystem/name". (A sample hadoop-site.xml follows this list.)
9. Rebuild and retry. Now you are bound to hit the "webapps not found in classpath" error that I referred to in my last post. Just copying webapps to the Hadoop/bin folder won't do; that only causes another strange exception.
10. After tracing some code, I found that while creating the HTTP server it does find webapps in /src/webapps; yes, it's there, but it doesn't work. I copied "Hadoop/src/webapps" to "Hadoop/src/java/webapps", refreshed the tree view in Eclipse, found the webapps folder under java/, and chose right-click -> Build Path -> Include. Now the webapps folder gets copied to the build output folder we configured, Hadoop/bin or Hadoop/build; I chose the first as the default.
11. Try again: the NameNode starts. Cheers!
12. Configure DataNode.java as a Java Application and run it: it starts as well. Cheers again!
13. Then I set some breakpoints in the source files and wrote some code that calls the FSShell from another computer (a sketch of that client follows this list). Wonderful: the breakpoints fired on the server side.
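By the way, for step 8, here is roughly what my hadoop-site.xml ends up as. Treat it as a sketch: the port and filesystem paths are placeholders for my local choices, and I am using the 0.12-era convention where fs.default.name takes a plain host:port with no scheme:

    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <!-- host:port of the NameNode; localhost because everything runs on one box -->
        <value>localhost:9000</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <!-- the directory that the -format step created (placeholder path) -->
        <value>/home/me/workspace/Hadoop/filesystem/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <!-- where the DataNode keeps its blocks (placeholder path) -->
        <value>/home/me/workspace/Hadoop/filesystem/data</value>
      </property>
    </configuration>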
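And for step 13, the client I ran from the other computer was essentially just the following. This is a minimal sketch against the old 0.12-era FileSystem API (listPaths was the listing call back then, if I remember right); "namenode-host" is a placeholder for the machine where the NameNode runs under Eclipse:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class DebugClient {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode started from Eclipse.
        // "namenode-host:9000" is a placeholder; use your own host and port.
        conf.set("fs.default.name", "namenode-host:9000");

        FileSystem fs = FileSystem.get(conf);

        // Any call that goes through the RPC layer will do; listing "/"
        // is enough to make breakpoints fire in the NameNode process.
        Path[] paths = fs.listPaths(new Path("/"));
        if (paths != null) {
          for (Path p : paths) {
            System.out.println(p);
          }
        }
      }
    }

Every request this client makes travels over RPC to the NameNode process sitting in the Eclipse debugger, which is why the breakpoints trigger on the server side.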
-------------------------------------------------------------------------------------------------------------------------
After all that, I still have two questions:
1. If I want to start the JobTracker, etc., do I just do the same as for the DataNode?
2. How can I start a cluster with several DataNodes? Do I need some scripts?

And thanks for your replies, guys; they really helped me a lot. Thanks.

KrzyCube


KrzyCube wrote:
>
> I take the steps below:
>
> 1. Create a new project from the existing ant file "build.xml".
> 2. Try to compile the project; it goes well.
> 3. Find NameNode.java and configure it as a Java App to run.
> 4. It tells me that the NameNode is not formatted, so I do that with the -format argument.
> 5. Then, exceptions like "webapps" not found in classpath.
> 6. So I try to configure the src/webapps folder via Build -> Use as Source Folder.
> 7. Build the project again, but I can't find the webapps output in the build_output_path.
> 8. Then I just copy "webapps" to the bin/ path, as my build output path is Hadoop/bin.
> 9. Then exceptions like these:
> ----------------------------------------------------------------------------------------------------------------------
> 07/06/22 12:42:22 INFO dfs.StateChange: STATE* Network topology has 0 racks and 0 datanodes
> 07/06/22 12:42:22 INFO dfs.StateChange: STATE* UnderReplicatedBlocks has 0 blocks
> 07/06/22 12:42:22 INFO util.Credential: Checking Resource aliases
> 07/06/22 12:42:22 INFO http.HttpServer: Version Jetty/5.1.4
> 07/06/22 12:42:22 INFO util.Container: Started HttpContext[/static,/static]
> 07/06/22 12:42:23 INFO util.Container: Started [EMAIL PROTECTED]
> 07/06/22 12:42:23 INFO http.SocketListener: Started SocketListener on 0.0.0.0:50070
> 07/06/22 12:42:23 ERROR dfs.NameNode: java.io.IOException: Problem starting http server
>     at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:211)
>     at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:274)
>     at org.apache.hadoop.dfs.NameNode.init(NameNode.java:178)
>     at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:195)
>     at org.apache.hadoop.dfs.NameNode.main(NameNode.java:728)
> Caused by: org.mortbay.util.MultiException[java.lang.ClassNotFoundException: org.apache.hadoop.dfs.dfshealth_jsp, java.lang.ClassNotFoundException: org.apache.hadoop.dfs.nn_005fbrowsedfscontent_jsp]
>     at org.mortbay.http.HttpServer.doStart(HttpServer.java:731)
>     at org.mortbay.util.Container.start(Container.java:72)
>     at org.apache.hadoop.mapred.StatusHttpServer.start(StatusHttpServer.java:188)
>     ... 4 more
> -------------------------------------------------------------------------------------------------------------------------
> I tried configuring here and there, and tried and tried, but this exception is still there.
> What might be behind this exception?
>
> Thanks a lot
> KrzyCube
>
>
> Konstantin Shvachko wrote:
>>
>> I run an entire one-node cluster in Eclipse by just executing main() (run or debug menus) for each component.
>> You need to configure Eclipse correctly in order to do that. Can you compile the whole thing under Eclipse?
>> NameNode example:
>> = Open NameNode.java in the editor.
>> = Run / Run
>> = New Java Application -> will create an entry under "Java Application" named NameNode
>> = Select NameNode, go to the Arguments tab and enter the following under "VM Arguments":
>>     -Dhadoop.log.dir=./logs
>>     -Xmx500m
>>     -ea
>>   The first one is required and can point to your log directory; the other two are optional.
>> = Go to the "Classpath" tab and add the "hadoop/build" path under "User entries" via
>>   Advanced / New Folder / select "hadoop/build".
>> That should be it, if the default classpath is configured correctly, and if I am not forgetting anything.
>> Let me know if that helped; I'll send you screenshots of my configuration if not.
>>
>> --Konstantin
>>
>>
>> Mahajan, Neeraj wrote:
>>
>>> There are two separate issues you are asking about here:
>>> 1. How to modify/add to Hadoop code and execute the changes -
>>> Eclipse is just an IDE; it doesn't matter whether you use Eclipse or
>>> some other editor.
>>> I have been using Eclipse. What I do is modify the code using Eclipse
>>> and then run "ant jar" in the root folder of Hadoop (you could also
>>> configure this to work directly from Eclipse). This regenerates the
>>> jars and puts them in the build/ folder. Now you can either copy these
>>> jars into the Hadoop root folder (removing "dev" in their name) so that
>>> they replace the original jars, or modify the scripts in bin/ to point
>>> to the newly generated jars.
>>>
>>> 2. How to debug using an IDE -
>>> This page gives a high-level intro to debugging Hadoop:
>>> http://wiki.apache.org/lucene-hadoop/HowToDebugMapReducePrograms
>>> As I see it, there are two ways you can debug Hadoop programs: run
>>> Hadoop in local mode and debug in-process in the IDE, or run Hadoop in
>>> distributed mode and remote-debug using the IDE.
>>>
>>> The first way is easy. At the end of the bin/hadoop script there is an
>>> exec command; put an echo command there instead and run your program.
>>> You can see what parameters the script passes while starting Hadoop.
>>> Use these same parameters in the IDE and you can debug Hadoop. Remember
>>> to change the conf files so that Hadoop runs in local mode. To be more
>>> specific, you will have to set the program arguments and VM arguments,
>>> and add an entry in the classpath pointing to the conf folder.
>>>
>>> The second method is complicated. You will have to modify the scripts
>>> and put in some extra params like "-Xdebug
>>> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=<port>" for
>>> the java command. Specify the <port> of your choice in it. On a server
>>> where you are running both the namenode and jobnode there will be a
>>> conflict, as the same port would be specified, so you will have to do
>>> some intelligent scripting to take care of this. Once the java
>>> processes start, you can attach the Eclipse debugger to that machine's
>>> <port> and set breakpoints. Up to this point you can debug everything
>>> before the map-reduce tasks. Map-reduce tasks run in separate
>>> processes; for debugging them you will have to figure things out
>>> yourself.
>>>
>>> The best way is to debug using the first approach (as the above link
>>> says). I think with that approach you can fix any map-reduce related
>>> problems, and for other, purely distributed kinds of problems you can
>>> follow the second approach.
>>>
>>> ~ Neeraj
>>>
>>> -----Original Message-----
>>> From: KrzyCube [mailto:[EMAIL PROTECTED]
>>> Sent: Thursday, June 21, 2007 2:08 AM
>>> To: hadoop-user@lucene.apache.org
>>> Subject: How to Start Hadoop Cluster from source code in Eclipse
>>>
>>>
>>> Hi, all:
>>>
>>> I am using Eclipse to view the Hadoop source code, and I want to trace
>>> it to see how it works. I wrote a little code to call the FSClient, and
>>> when I call into the RPC object, I can't step any deeper.
>>>
>>> So I just want to start the cluster from the source code, which I am
>>> holding in Eclipse now.
>>> I browsed the start-*.sh scripts and found that they start several
>>> processes, such as namenode, datanode, secondarynamenode; I just don't
>>> know how to work it out.
>>>
>>> Or is there any way to attach to a running process, just as with gdb
>>> when debugging C code?
>>>
>>> Has anybody ever used Eclipse to debug this source code? Please give
>>> some tips.
>>>
>>>
>>>
>>> Thanks.
>>>
>>>
>>> KrzyCube
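P.S. For Neeraj's second approach, the script change is small: in bin/hadoop, add the debug options he quoted to the final java command. Roughly like this, where port 8000 is just my arbitrary pick and "..." stands for the classpath and options the script already assembles:

    java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8000 ... org.apache.hadoop.dfs.NameNode

Then attach Eclipse with a "Remote Java Application" debug configuration pointing at that host and port.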