Re: Hadoop basics

Rita Liu Sat, 14 Aug 2010 22:40:39 -0700

Hi Harsh and Piyush! Thank you very much. So it seems like it would be best
if I use log4j to trace, and debugging with a debugger is still possible if
I set "mapred.job.tracker" to be "local" and "fs.default.name" to be
"local", in hadoop-site.xml. Plus: in hadoop-env.sh, I should specify
HADOOP_OPTS to be:


"-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000" (why
8000? also, what does "-agentlib:jdwp=transport=dt_socket" mean?)

... in order to use a debugger. Is my understanding correct? :)
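
For reference, the option string above can be read piece by piece. This is only a sketch, assuming the line goes into conf/hadoop-env.sh; the JDWP_OPTS helper variable is made up for illustration, and 8000 is simply a free port picked by convention, not something Hadoop requires.

```shell
# Sketch for conf/hadoop-env.sh. JDWP_OPTS is a hypothetical helper variable.
JDWP_OPTS="transport=dt_socket"        # debugger and JVM talk over a TCP socket
JDWP_OPTS="$JDWP_OPTS,server=y"        # the JVM listens; the debugger connects to it
JDWP_OPTS="$JDWP_OPTS,suspend=y"       # the JVM waits at startup until a debugger attaches
JDWP_OPTS="$JDWP_OPTS,address=8000"    # port to listen on (any free port would do)

# -agentlib:jdwp loads the JVM's Java Debug Wire Protocol agent with those options.
export HADOOP_OPTS="-agentlib:jdwp=$JDWP_OPTS"
echo "$HADOOP_OPTS"
```

With the JVM started this way, a debugger such as jdb (`jdb -attach 8000`) or an IDE's remote-debug configuration pointed at port 8000 should be able to attach.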

If so -- then which debugger do you use? May I know? Thanks a lot! I am also
going to try log4j now!
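
A possible companion to log4j statements in the code is raising the log level at launch. The sketch below assumes that the bin/hadoop launcher reads HADOOP_ROOT_LOGGER and hands it to log4j as hadoop.root.logger, and that conf/log4j.properties defines an appender named "console"; both are assumptions to check against your own release.

```shell
# Assumption: bin/hadoop turns this variable into -Dhadoop.root.logger=...
# Format is "<level>,<appender>"; "console" is assumed to be an appender
# defined in conf/log4j.properties.
export HADOOP_ROOT_LOGGER="DEBUG,console"
echo "$HADOOP_ROOT_LOGGER"
```

If the variable is honored, DEBUG-level messages should then show up on the terminal for commands started from the same shell.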

Many thanks,
-Rita :))

On Sat, Aug 14, 2010 at 10:22 PM, Piyush Garg <[email protected]> wrote:

> Hi Smith,
>
> Step debugging also works in hadoop, as with other java applications:
>
> export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
>
> 'suspend=y' makes the jvm wait at startup until the remote debugger is
> attached.
>
> Thanks and Regards
> Piyush Garg
>
>
> On Sunday 15 August 2010 10:39 AM, smith jack wrote:
> > So that means you can only trace with logs, and it is not possible to
> > step-debug hadoop, haha.
> > Distributed systems always introduce extra complexity and confusing
> > issues.
> >
> > 2010/8/15 Piyush Garg <[email protected]>:
> >
> >> Hi Rita,
> >>
> >> You can put log4j debug statements in the code. The log4j library is
> >> part of the hadoop framework, there is already a log4j.properties file
> >> in the hadoop conf directory, and all the output logs are saved in the
> >> hadoop logs directory.
> >>
> >> Thanks and Regards
> >> Piyush Garg
> >>
> >>
> >> On Sunday 15 August 2010 10:20 AM, Rita Liu wrote:
> >>
> >>> Thank you very much, Piyush! :) May I know more about how to use
> >>> "traces"?
> >>>
> >>> And -- yes, please teach me if possible, experts! :)
> >>>
> >>> Thanks a lot,
> >>> -Rita :))
> >>>
> >>> On Sat, Aug 14, 2010 at 9:42 PM, Piyush Garg <[email protected]> wrote:
> >>>
> >>>> Hi Rita,
> >>>>
> >>>> I have just started to learn hadoop as well, and I know there is a
> >>>> long way to go.
> >>>> I found some useful links, which I am sharing with you.
> >>>>
> >>>> Hadoop Tutorial - YDN
> >>>> <http://developer.yahoo.com/hadoop/tutorial/index.html>
> >>>> An excellent, well-organized beginner's tutorial.
> >>>>
> >>>> Running Hadoop On Ubuntu Linux (Single-Node Cluster) - Michael G. Noll
> >>>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29>
> >>>>
> >>>> Running Hadoop On Ubuntu Linux (Multi-Node Cluster)
> >>>> <http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29>
> >>>>
> >>>> The tutorial on the hadoop wiki
> >>>> <http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html>
> >>>> is too much for a beginner.
> >>>>
> >>>> Debugger:
> >>>> I do not think you can easily debug with a remote debugger. That is
> >>>> natural: hadoop programs are not sequential, so it would be very
> >>>> difficult to debug its apps.
> >>>> I think the only way to debug is to use traces.
> >>>>
> >>>> I think you can learn how to set up a multi-node cluster, but for
> >>>> practice sessions you can use a single-node setup.
> >>>>
> >>>> Let's see what the experts say.
> >>>>
> >>>> Thanks and Regards
> >>>> Piyush Garg
> >>>>
> >>>>
> >>>> On Sunday 15 August 2010 09:07 AM, Rita Liu wrote:
> >>>>
> >>>>> Hi!
> >>>>>
> >>>>> I am a total beginner, but I am very interested in hadoop. I've
> >>>>> already downloaded hadoop 0.19.2 and run it on Ubuntu in single-node
> >>>>> mode. Now I want to do two things:
> >>>>>
> >>>>> 1. Explore how hadoop works internally with one of the example
> >>>>>    applications hadoop provides
> >>>>> 2. Write an application on my own
> >>>>>
> >>>>> Those two things bring me the following questions:
> >>>>>
> >>>>> a. debugger?
> >>>>> I am stuck since I don't know how to "explore" hadoop. I used to
> >>>>> trace through the code using a debugger, but in this case, I don't
> >>>>> know if there is a good debugger to use; or -- maybe a debugger is
> >>>>> not necessary for hadoop? If not, then how do you trace through the
> >>>>> code to either debug or just gain an understanding about the system?
> >>>>> May I know what you, experienced experts, do? :)
> >>>>>
> >>>>> b. Where to run hadoop?
> >>>>> Also -- may I know where you run your hadoop? Do you run on linux, or
> >>>>> on a VM -- in particular, Cloudera? I heard that Cloudera is good for
> >>>>> writing mapreduce applications with hadoop itself as a blackbox; is
> >>>>> it true? If my ultimate goal is to understand how hadoop works
> >>>>> internally, would it be better if I directly run it on linux?
> >>>>>
> >>>>> c. Single-node or multi-node?
> >>>>> In the beginning (just like my case :p) would it be better to use
> >>>>> single-node or multi-node? If the latter is true, should I obtain
> >>>>> more machines, or should I use more virtual machines to create more
> >>>>> nodes?
> >>>>>
> >>>>> As a newbie, I am sorry for all those basic (and silly, I know :$)
> >>>>> questions. If possible, please help me out? Any suggestion or advice
> >>>>> will be greatly appreciated. Thank you very much!
> >>>>>
> >>>>> Best,
> >>>>> Rita :)
> >>>>>
> >>>>> P.S. If my questions are not suitable for this mailing-list, please
> >>>>> let me apologize, and then, could you please direct me to other
> >>>>> mailing-lists? Sorry, and thanks a lot! :)
> >>>>>
> >>>>
> >>>
> >>
>
