Unfortunately, recompiling may not be very feasible here; I'll have to see if there's a way. Are my assumptions about how the yarn.nodemanager.health-checker.script.opts option should work correct? It took quite a bit of trial and error to get the script working initially (who knew that even a run reporting an unhealthy node had to exit with status 0...), and I'm worried I may be starting off on the wrong foot here.
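
One way to see which arguments actually reach the script without recompiling Hadoop is to point yarn.nodemanager.health-checker.script.path at a thin wrapper that records its argv and then runs the real checker. The following is only a minimal sketch under stated assumptions: the wrapper itself, the log location, and the names format_record and run are hypothetical, and /usr/bin/health_checker is simply the path from the config quoted in this thread.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: record the arguments received, then run the real checker."""
import os
import subprocess
import sys
import tempfile
import time

# Assumption: path of the real checker, taken from the config quoted in this thread.
REAL_CHECKER = "/usr/bin/health_checker"
# Assumption: any file the NodeManager user can write to would do.
ARG_LOG = os.path.join(tempfile.gettempdir(), "health_checker_args.log")


def format_record(args):
    """Return one log line showing exactly how the arguments were split."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s argc=%d argv=%r\n" % (stamp, len(args), args)


def run(args, checker=REAL_CHECKER, log_path=ARG_LOG):
    """Append the arguments to the log, run the checker, and return its stdout."""
    with open(log_path, "a") as log:
        log.write(format_record(args))
    try:
        result = subprocess.run(
            [checker] + list(args),
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        return result.stdout
    except OSError as exc:
        # Surface the wrapper's own failure as an ERROR line on stdout.
        return "ERROR wrapper could not run %s: %s\n" % (checker, exc)


if __name__ == "__main__":
    # Relay the checker's output unchanged; falling off the end exits with 0
    # whatever the checker itself reported.
    sys.stdout.write(run(sys.argv[1:]))
```

After one health-check interval, the log would show whether "-d Network" arrived as two arguments (argv=['-d', 'Network']), as a single argument (argv=['-d Network']), or not at all (argv=[]), which should narrow down where the option is being lost.
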

On Mon, Mar 25, 2013 at 3:27 PM, Arun C Murthy <[email protected]> wrote:

> Hmm... an easy way to debug this would be to add a LOG statement in
> NodeHealthScriptRunner.init to print out the args you are getting from
> the config.
>
> Is there a chance you can try this, recompile & re-run?
>
> thanks,
> Arun
>
> On Mar 25, 2013, at 11:14 AM, Tucker wrote:
>
> Does anyone have a working example of a node manager health checker
> script using "yarn.nodemanager.health-checker.script.opts"? I wrote a
> health checker that works fine, but one of the items being checked is a
> little too sensitive. I wrote it to be able to load and unload modules
> by passing various flags. Unfortunately, adding these flags to my config
> doesn't seem to have had any effect, and we've had to disable the health
> check entirely.
>
> For reference:
>
> $ health_checker -h
> Usage: health_checker [options]
>         --default-disabled          Default all checks disabled.
>     -e, --enable-checks CHECKS      Command separated list of checks to enable.
>     -d, --disable-checks CHECKS     Command separated list of checks to disable.
>     -l, --list                      List available checks.
>
> Settings used:
>
> <property>
>   <name>yarn.nodemanager.health-checker.script.path</name>
>   <value>/usr/bin/health_checker</value>
> </property>
> ...
> <property>
>   <name>yarn.nodemanager.health-checker.script.opts</name>
>   <value>-d Network</value>
> </property>
>
> If the flag were actually being passed, I would expect the script to
> report healthy. This is what I see on a command line:
>
> # health_checker
> ERROR(s): ["Errors found on interface eth2."]
> # health_checker -d Network
> Healthy
> # echo $?
> 0
>
> Unfortunately, even with opts set, I continue to get the interface
> errors warning after cluster start and beyond the run interval. I assume
> I'm missing something, but I can't seem to find any good docs on the
> matter.
>
> --
>
> --tucker
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/

--
--tucker
