Hi Smith,

Step debugging works in Hadoop just as it does with other Java applications. Set the JDWP agent options before starting Hadoop:

export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"

'suspend=y' makes the JVM pause at startup until a remote debugger attaches (here, on port 8000).
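For what it's worth, here is a rough sketch of how the whole session might look on a single-node setup. The helper name jdwp_opts, the jar name and the main class are placeholders I made up for illustration, and jdb is just one debugger that can attach; an IDE's remote-debug configuration pointed at the same port should work as well.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): builds the JDWP agent string
# from the options discussed above. Port and suspend mode are parameters.
jdwp_opts() {
  port="${1:-8000}"     # debugger port, 8000 as in the example above
  suspend="${2:-y}"     # y = wait for the debugger before running
  printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Export it so the JVM started by bin/hadoop from this shell picks it up.
HADOOP_OPTS="$(jdwp_opts 8000 y)"
export HADOOP_OPTS
echo "HADOOP_OPTS=$HADOOP_OPTS"

# Next steps, left as comments because they need a Hadoop install
# (jar and class names below are placeholders):
#   bin/hadoop jar my-job.jar MyJobMainClass input output
# With suspend=y the JVM should print a line like
#   Listening for transport dt_socket at address: 8000
# and wait. From a second terminal, attach with the JDK's jdb:
#   jdb -attach 8000
```

One caveat, as far as I understand it: HADOOP_OPTS only reaches the JVM that bin/hadoop itself starts. In standalone (local) mode the map and reduce code runs inside that same JVM, so breakpoints there should be hit; on a real cluster the tasks run in separate child JVMs and would need their own debug options.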

Thanks and Regards
Piyush Garg

On Sunday 15 August 2010 10:39 AM, smith jack wrote:
> that means you can only trace by log,
> and not possible to debug hadoop using step debug, haha
> distributed system always introduce extra complexity and confusing issues.
>
> 2010/8/15 Piyush Garg <[email protected]>:
>
>> Hi Rita,
>>
>> You can put log4j logger debug statements in the code. log4j library is
>> part of hadoop framework and there is already a log4j.properties file in
>> hadoop conf directory and all the output logs are saved in hadoop logs
>> directory.
>>
>> Thanks and Regards
>> Piyush Garg
>>
>> On Sunday 15 August 2010 10:20 AM, Rita Liu wrote:
>>
>>> Thank you very much, Piyush! :) May I know more about how to use "traces"?
>>>
>>> And -- yes, please teach me if possible, experts! :)
>>>
>>> Thanks a lot,
>>> -Rita :))
>>>
>>> On Sat, Aug 14, 2010 at 9:42 PM, Piyush Garg <[email protected]> wrote:
>>>
>>>> Hi Rita,
>>>>
>>>> I have just started to learn hadoop as well, I know there is a long way
>>>> to go.
>>>> I found some useful links which I am sharing with you.
>>>>
>>>> Hadoop Tutorial - YDN
>>>> <http://developer.yahoo.com/hadoop/tutorial/index.html> excellent
>>>> beginners tutorial and well organized.
>>>> Running Hadoop On Ubuntu Linux (Single-Node Cluster) - Michael G. Noll
>>>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29>
>>>> Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)
>>>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29>
>>>> The tutorial on the hadoop wiki
>>>> <http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html> is
>>>> too much for a beginner.
>>>>
>>>> Debugger:
>>>> I do not think you can easily do debugging using remote debugger. This
>>>> is natural since hadoop is not sequential programming, it would be very
>>>> difficult to debug its apps.
>>>> The only way to debug is to use traces.
>>>>
>>>> I think you can learn how to setup multi-node cluster, but for practice
>>>> session you can use single node setup.
>>>>
>>>> Lets see what the experts say.
>>>>
>>>> Thanks and Regards
>>>> Piyush Garg
>>>>
>>>> On Sunday 15 August 2010 09:07 AM, Rita Liu wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I am a total beginner, but I am very interested in hadoop. I've already
>>>>> downloaded hadoop 0.19.2 and run on Ubuntu in single-node mode. Now I want
>>>>> to do two things:
>>>>>
>>>>> 1. Explore how hadoop works internally with one of the example applications
>>>>>    hadoop provides
>>>>> 2. Write an application on my own
>>>>>
>>>>> Those two things bring me following questions:
>>>>>
>>>>> a. debugger?
>>>>> I am stuck since I don't know how to "explore" hadoop. I used to trace
>>>>> through the code using a debugger, but in this case, I don't know if there
>>>>> is a good debugger to use; or -- maybe a debugger is not necessary for
>>>>> hadoop? If not, then how do you trace through the code to either debug or
>>>>> just gain an understanding about the system? May I know what you,
>>>>> experienced experts, do? :)
>>>>>
>>>>> b. Where to run hadoop?
>>>>> Also -- may I know where you run your hadoop? Do you run on linux, or on VM
>>>>> -- in particular, Cloudera? I heard that Cloudera is good for writing
>>>>> mapreduce applications with hadoop itself as a blackbox; is it true? If my
>>>>> ultimate goal is to understand how hadoop works internally, would it be
>>>>> better if I directly run it on linux?
>>>>>
>>>>> c. Single-node or multi-node?
>>>>> In the beginning (just like my case :p) would it be better to use
>>>>> single-node or multi-node? If the latter is true, should I obtain more
>>>>> machines, or should I use more virtual machines to create more nodes?
>>>>>
>>>>> As a newbie, I am sorry for all those basic (and silly, I know :$)
>>>>> questions. If possible, please help me out? Any suggestion or advice will be
>>>>> greatly appreciated. Thank you very much!
>>>>>
>>>>> Best,
>>>>> Rita :)
>>>>>
>>>>> P.S. If my questions are not suitable for this mailing-list, please let me
>>>>> apologize, and then, could you please direct me to other mailing-lists?
>>>>> Sorry, and thanks a lot! :)
