Derek, I like it although I have some ideas.

Some thoughts on usage:

It might be a bit more useful on error if you output a reason for not starting, e.g. if you don't have all your options, which one(s) are missing or malformed.

Taking the usage to the extreme... Die on no logfile is not very friendly; it would seem reasonable to me to monitor all running services. I can see a use for an include 'all' option while also adding an exclude option, e.g.

    ./smfalert.pl -m send -r root -i "all" -e "gss:default"

as trying to monitor everything at the moment is a bit unwieldy:

    ./smfalert.pl -m send -r root -i "`svcs | awk '/gss|xfs|stfsloader|smserver|rstat|rusers/{next} /svc:/{print $3}' ORS=' '`"

Some thoughts on implementation:

I think there may be an alternate method to achieve the broader goal of service monitoring given the current capabilities of SMF. I would still like to see an include/exclude method in any case.

Right now smfalert is a bit on the heavy side when run against all services with logfiles. (I created a project and launched smfalert in the new project to allow for easier tracking.)

    projadd smfalert
    newtask -p smfalert ./smfalert.pl -m send -r root -i "`svcs -a | awk '/gss|xfs|stfsloader|smserver|rstat|rusers/{next} /svc:/{print $3}' ORS=' '`"
    prstat -J

    PROJID    NPROC  SIZE   RSS MEMORY      TIME  CPU PROJECT
         1       19  190M   94M   0.6%   5:53:05 0.0% user.root
       100      258  700M  434M   2.4%   0:00:00 0.0% smfalert
         0       57  268M  196M   1.1%  29:10:32 0.0% system

Or running in both the global and a non-global zone:

    prstat -J

    PROJID    NPROC  SIZE   RSS MEMORY      TIME  CPU PROJECT
         1       21  199M   99M   0.6%   5:53:06 0.0% user.root
         0       57  268M  196M   1.1%  29:10:33 0.0% system
       100      484 1314M  815M   4.4%   0:00:00 0.0% smfalert

What about taking the output of svcs -a and stashing the STATE and STIME columns? You could then, with one process per zone, parse the output from svcs -a and report on changes in status, including the logfile output as needed and available.
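
To make the snapshot idea a bit more concrete, here is a rough sketch in Python rather than Perl. It assumes the three-column STATE / STIME / FMRI layout of svcs -a, and it operates on text you have already captured, so it does not call svcs itself. The names parse_svcs and changes, the exclude regex (standing in for the proposed exclude option), and the sample rows are all made up for illustration; this is not how smfalert works today.

```python
import re


def parse_svcs(output, exclude=None):
    """Map FMRI -> (STATE, STIME) from `svcs -a`-style text.

    `exclude` is an optional regex; matching FMRIs are skipped
    (a hypothetical stand-in for the proposed exclude option).
    """
    skip = re.compile(exclude) if exclude else None
    snapshot = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows are assumed to have exactly three columns;
        # skip the header row and anything else.
        if len(fields) != 3 or fields[0] == "STATE":
            continue
        state, stime, fmri = fields
        if skip and skip.search(fmri):
            continue
        snapshot[fmri] = (state, stime)
    return snapshot


def changes(previous, current):
    """List (fmri, old, new) for every service whose (STATE, STIME) differs.

    `old` is None for a service that was not in the previous snapshot.
    """
    result = []
    for fmri, new in sorted(current.items()):
        old = previous.get(fmri)
        if old != new:
            result.append((fmri, old, new))
    return result


# Illustrative sample data, not real output from any host.
before = parse_svcs(
    "STATE          STIME    FMRI\n"
    "online         May_31   svc:/system/system-log:default\n"
    "online         May_31   svc:/network/rpc/gss:default\n",
    exclude="gss",
)
after = parse_svcs(
    "STATE          STIME    FMRI\n"
    "online         17:37:55 svc:/system/system-log:default\n",
    exclude="gss",
)
for fmri, old, new in changes(before, after):
    print(fmri, old, "->", new)
```

One process per zone could keep the previous snapshot in memory and only look at a logfile for the services that show up in the change list.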

You could then have messages that contain:

    Hostname: foo
    Instance: system-log
    Previous State(Time): Online(May_31)
    Current State(Time): Online(17:37:55)

Although a better data source would be to use svcprop, so you don't have to worry about the STIME representation changing over time.

First Run:

    svcprop -p restarter/state -p restarter/state_timestamp -p restarter/logfile system-log
    1160256235.416468000 online

Subsequent (as you have cached the logfile if there is one):

    svcprop -p restarter/state -p restarter/state_timestamp system-log
    1160256235.416468000 online

Shawn

On Oct 7, 2006, at 11:09 AM, Derek Crudgington wrote:

>> * Derek Crudgington <dacrud at gmail.com> [2006-10-04 08:34]:
>>> If anyone is interested, I have written a Perl daemon that runs in the
>>> background and monitors SMF services to e-mail you when something
>>> happens. Check http://hell.jedicoder.net/?p=83 for more info.
>>
>> Cool. I had a brief look; you can simplify
>>
>>   $grep = `svcs -l $smf | grep logfile`;
>>
>> to
>>
>>   $grep = `svcprop -p restarter/logfile $smf`;
>>
>> (and probably change the variable name...). We should be able to get
>> a proper API in place for transitions, so you needn't be forced to log
>> scrape to see stops and starts.
>>
>> - Stephen
>>
>> -
>> Stephen Hahn, PhD  Solaris Kernel Development, Sun Microsystems
>> stephen.hahn at sun.com  http://blogs.sun.com/sch/
>> _______________________________________________
>> smf-discuss mailing list
>> smf-discuss at opensolaris.org
>
> Thanks for the tip!
>
> This message posted from opensolaris.org

--
Shawn Ferry  shawn.ferry at sun.com
Senior Principal Systems Engineer
Sun Managed Operations Delivery
703.579.1948
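
P.S. A rough sketch of the svcprop-based comparison, again in Python rather than Perl. It assumes you have already captured the restarter/state_timestamp and restarter/state values as strings (it does not run svcprop), and that the timestamp is seconds since the epoch with a fractional part, as in the sample above. The function names and the second sample pair are hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(raw):
    """Turn a 'seconds.nanoseconds' string into a UTC datetime (fraction dropped)."""
    seconds = int(raw.split(".")[0])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def describe(pair):
    """Format a (state_timestamp, state) pair as 'state(time)'."""
    raw, state = pair
    return f"{state}({parse_timestamp(raw):%Y-%m-%d %H:%M:%S} UTC)"


def transition_message(host, instance, cached, current):
    """Return report text if the (timestamp, state) pair changed, else None."""
    if cached == current:
        return None
    return "\n".join([
        f"Hostname: {host}",
        f"Instance: {instance}",
        f"Previous State(Time): {describe(cached)}",
        f"Current State(Time): {describe(current)}",
    ])


# The first pair is the sample from this message; the second is invented.
cached = ("1160256235.416468000", "online")
current = ("1160259999.000000000", "maintenance")
print(transition_message("foo", "system-log", cached, current))
```

Because the timestamp is a plain number, comparing the cached pair with the current one does not depend on how svcs chooses to display STIME.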