Nigel Daley wrote:
On May 9, 2008, at 2:03 AM, Steve Loughran wrote:
Owen O'Malley wrote:
On May 8, 2008, at 10:21 AM, Doug Cutting wrote:
I'd go with something closer to an accidental policy. My suspicion
is that the logging framework didn't print nested exceptions well. Owen
is the father of stringifyException and may have more insight.
Yeah, it was my fault. Log4j was misconfigured, so we didn't get
the exception traces out with the messages. I didn't realize it was a
misconfiguration until much later, by which point it had become standard
practice within Hadoop. *sigh* I've fixed a couple of them, but there
are a lot more.
OK, that means when I encounter them I can change them.
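As a side-by-side sketch (using commons-logging and Hadoop's existing
StringUtils), the difference between the accidental policy and the fix
looks like this:

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.util.StringUtils;

public class LoggingStyles {
  private static final Log LOG = LogFactory.getLog(LoggingStyles.class);

  void demo(Exception e) {
    // accidental-policy style: the exception is flattened to a String
    // up front, so the logger only ever sees message text
    LOG.warn("Operation failed: " + StringUtils.stringifyException(e));

    // preferred style: hand the raw Throwable to the logger, which can
    // then render the full nested trace (or emit it machine-readable)
    LOG.warn("Operation failed", e);
  }
}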
Somewhere on my todo list is much better JUnit reporting, including logs
from multiple machines and stack traces in machine-readable
form...getting the exceptions out raw is one of the requirements for
this to work:
http://wiki.apache.org/ant/Proposals/EnhancedTestReports
-steve
Hey Steve,
Hope Hudson is on your list of CI servers to test wrt the Ant JUnit
report changes :-)
I hope so too. One of the big problems is backwards compatibility...the
original JUnit report saves a summary in the attributes of the root
node, so you can't stream the results out; you have to buffer them, and
then, if the JVM crashes, you get a zero-byte file.
I'm not sure what the ideal solution would be here. What I may do is
generate the new format alongside the old, or add a backup format that
can be used to post-mortem a JVM crash when it happens.
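To make that concrete, here's a rough sketch of what a crash-tolerant
formatter could do: write each testcase element as it finishes and put
the counts in a trailing element instead of root attributes, so a
partial file still records every completed test. The names are
illustrative, not the Ant formatter's real API, and XML escaping is
elided:

import java.io.IOException;
import java.io.Writer;

public class StreamingReportWriter {
  private final Writer out;
  private int tests, failures;

  public StreamingReportWriter(Writer out) throws IOException {
    this.out = out;
    out.write("<testsuite>\n");
    out.flush();                 // header hits the disk immediately
  }

  public void testFinished(String name, long millis) throws IOException {
    tests++;
    out.write("  <testcase name='" + name + "' time='" + millis + "'/>\n");
    out.flush();                 // each result survives a JVM crash
  }

  public void testFailed(String name, Throwable t) throws IOException {
    tests++;
    failures++;
    out.write("  <testcase name='" + name + "'><failure>"
        + t + "</failure></testcase>\n");
    out.flush();
  }

  // only reached on a clean run; a missing summary flags a crashed JVM
  public void close() throws IOException {
    out.write("  <summary tests='" + tests
        + "' failures='" + failures + "'/>\n</testsuite>\n");
    out.close();
  }
}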
On another note, have you used SmartFrog to config/deploy Hadoop? If
so, I'm interested in your experiences.
I'm working on it in the slice of my time I get to spend on interesting
stuff; you can track the status here:
http://jira.smartfrog.org/jira/browse/SFOS-780
-I have the ability for Hadoop to get its state from our configuration
files, rather than the existing XML files
-I can submit work to an existing cluster
-we can poll a cluster for being in a working condition (job tracker
live, etc.) and block actions until that state is reached; see the
sketch after this list
-I'm just bringing up the namenode and datanodes
-I'm also adding components to do HDFS maintenance: formatting,
balancing, etc.
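The polling mentioned above boils down to something like this (a
hypothetical helper, not SmartFrog's actual API; it leans on
JobClient.getClusterStatus() as a cheap liveness RPC):

import java.io.IOException;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class ClusterProbe {
  // block until the JobTracker answers an RPC, or give up after timeoutMs
  public static void waitForJobTracker(JobConf conf, long timeoutMs)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      try {
        JobClient client = new JobClient(conf); // connects to the JobTracker
        client.getClusterStatus();              // throws if the tracker is down
        client.close();
        return;                                 // tracker answered: cluster is up
      } catch (IOException e) {
        if (System.currentTimeMillis() > deadline) {
          throw e;                              // out of patience; propagate
        }
        Thread.sleep(1000);                     // back off and retry
      }
    }
  }
}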
What is nice so far is that it works with our testing components, so
I can run a test that brings up a cluster with some parameters (such as
JVM, replication options) and try something, then tear that down and try
a different set. That should be good for probing interesting values,
like what happens with different replication options, JVM tuning, etc.
And I can get the logs back into one place. You don't necessarily want
that in production (one logger = one point of failure), but it's good in
small test runs.
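If you want a feel for that bring-up/try-something/tear-down cycle
without SmartFrog in the picture, Hadoop's own MiniDFSCluster (from the
test source tree) does the same dance in-process; here replication is
the parameter under test:

import junit.framework.TestCase;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.MiniDFSCluster;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TestReplicationOptions extends TestCase {
  public void testVariousReplicationFactors() throws Exception {
    for (int replication : new int[] {1, 2, 3}) {
      Configuration conf = new Configuration();
      conf.setInt("dfs.replication", replication);
      // 3 datanodes, freshly formatted, default rack layout
      MiniDFSCluster cluster = new MiniDFSCluster(conf, 3, true, null);
      try {
        FileSystem fs = cluster.getFileSystem();
        Path p = new Path("/probe-" + replication);
        fs.create(p).close();                    // try something
        assertEquals(replication,
            fs.getFileStatus(p).getReplication());
      } finally {
        cluster.shutdown();                      // always tear down
      }
    }
  }
}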
By the end of the month I should have something that others can play
with; we're going to talk about it at the UK Hadoop users meeting in
August. I'm also taking notes of where I've had to do ugly things, where
some changes to Hadoop core would make things a lot better. So far:
-make it easier to get configuration information from some form of
external factory
-have the services (namenode, datanode, etc.) all implement a lifecycle
interface, with a base class that provides stable, thread-safe
startup/ping/shutdown; see the sketch after this list
-make package-scoped stuff in NameNode/DataNode private, maybe with
protected accessors
-find where exceptions are being stringified before nesting/logging
and retain them in their raw form
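For the lifecycle point, what I have in mind is roughly the following
base class; the names (Service, ping, etc.) are illustrative, not an
existing Hadoop API:

import java.io.IOException;

public abstract class Service {
  public enum State { CREATED, STARTED, FAILED, TERMINATED }

  private volatile State state = State.CREATED;

  // bring the service up; a no-op if it is already started
  public synchronized void start() throws IOException {
    if (state != State.STARTED) {
      innerStart();
      state = State.STARTED;
    }
  }

  // health check: throws if the service is not live
  public void ping() throws IOException {
    if (state != State.STARTED) {
      throw new IOException("Service is not started: " + state);
    }
    innerPing();
  }

  // shut down; safe to call repeatedly and from any state
  public synchronized void terminate() {
    if (state != State.TERMINATED) {
      innerTerminate();
      state = State.TERMINATED;
    }
  }

  public State getState() { return state; }

  protected abstract void innerStart() throws IOException;
  protected abstract void innerPing() throws IOException;
  protected void innerTerminate() {}
}

A NameNode or DataNode subclass would then put its existing startup and
shutdown code behind innerStart()/innerTerminate(), and anything
managing it (SmartFrog included) gets the same stable contract.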
These changes aren't SmartFrog-specific; they're just the things you
need to do to manage Hadoop better from inside the JVM.
Currently my code is here:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/
with subclassed things in the Apache packages to get at package-scoped
content:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/components/hadoop/src/org/apache/hadoop/dfs/
-steve