On Tue, Aug 02, 2011 at 07:34:46PM +0100, David Nutter wrote: > On Tue, Jul 26, 2011 at 11:27:19AM +0530, trm asn wrote: > > *snip* > > > 'ProbeRecord', 'get_hostAddress') ; that error I have resolved by editing > > "NPRecords.pm" & changed the parameter "hostAddress" to "HOSTADDRESS" . > > > > now my "scout config push" are getting success, but unfortunately still the > > monitoring status is "PENDING, Awaiting Update" . > > > > > > -bash-3.2$ rhn-runprobe --probe=261 --log=all=4 > > Can't locate object method "hostname" via package > > "NOCpulse::Probe::Config::ProbeRecord" at > > blib/lib/Class/MethodMaker/Engine.pm (autosplit into > > blib/lib/auto/Class/MethodMaker/Engine/new.al) line 941. > > > > Any idea... > > Hi, >
*snip* After a bit more digging I've narrowed down the problem area. The method NOCpulse::Probe::Shell::Unix::connect seems to be failing when running under the scheduler but not when running under rhn-runprobe. I jacked up logging for NOCpulse::Probe::Shell::Unix and this is what I see in kernel-err.log when monitoring is running (sorry for the mess, there seem to be two different logging frameworks in use which causes weird things to happen like multiple timestamps): 2011-08-04 12:31:44 2011-08-04 12:31:43 NOCpulse::Probe::Shell::Unix::connect Execute '/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse- identity -o StrictHostKeyChecking=no -o BatchMode=yes 212.219.57.230 /bin/sh -s' 2011-08-04 12:31:44 2011-08-04 12:31:43 NOCpulse::Probe::Shell::Unix::connect Opened pipe to 21109 2011-08-04 12:31:44 2011-08-04 12:31:43 (eval) Sending >>>echo `uname -s`#`uname -r`#$$ 2011-08-04 12:31:44 echo NOCPULSE-1312457503-STATUS $? 2011-08-04 12:31:44 <<< 2011-08-04 12:31:44 2011-08-04 12:31:43 (eval) Done 2011-08-04 12:31:44 2011-08-04 12:31:43 NOCpulse::Probe::Shell::Unix::read_result stdout: >>><<< 2011-08-04 12:31:44 2011-08-04 12:31:43 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<< 2011-08-04 12:31:44 2011-08-04 12:31:43 NOCpulse::Probe::Shell::Unix::read_result status: >>><<< 2011-08-04 12:31:44 2011-08-04 12:31:44 NOCpulse::Probe::Shell::Unix::connect Opened pipe to 21111 2011-08-04 12:31:44 2011-08-04 12:31:44 (eval) Sending >>>echo `uname -s`#`uname -r`#$$ 2011-08-04 12:31:44 echo NOCPULSE-1312457504-STATUS $? 2011-08-04 12:31:44 <<< 2011-08-04 12:31:44 2011-08-04 12:31:44 (eval) Done 2011-08-04 12:31:44 2011-08-04 12:31:44 NOCpulse::Probe::Shell::Unix::read_result stdout: >>><<< 2011-08-04 12:31:44 2011-08-04 12:31:44 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<< 2011-08-04 12:31:44 2011-08-04 12:31:44 NOCpulse::Probe::Shell::Unix::read_result status: >>><<< 2011-08-04 12:31:44 Linux#2.6.18-238.12.1.el5xen#2834 2011-08-04 12:31:44 NOCPULSE-1312457503-STATUS 0 However, when run under rhn-runprobe --log=all=4 I get the expected output of: 2011-08-04 12:36:40 NOCpulse::Probe::Shell::Unix::connect Execute '/usr/bin/ssh -l nocpulse -p 4545 -i /var/lib/nocpulse/.ssh/nocpulse-identity -o StrictHostKeyChecking=no -o BatchMode=yes 212.219.57.233 /bin/sh -s' 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::connect Opened pipe to 22504 2011-08-04 12:36:41 (eval) Sending >>>echo `uname -s`#`uname -r`#$$ echo NOCPULSE-1312457801-STATUS $? <<< 2011-08-04 12:36:41 (eval) Done 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::read_result Read 32 bytes from 6: >>>Linux#2.6.18-238.12.1.el5#16885 <<< 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::read_result Read 29 bytes from 6: >>>NOCPULSE-1312457801-STATUS 0 <<< 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::read_result stdout: >>>Linux#2.6.18-238.12.1.el5#16885 <<< 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::read_result stderr: >>><<< 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::read_result status: >>>0<<< 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::connect OS Linux 2.6.18-238.12.1.el5, shell pid 16885 2011-08-04 12:36:41 NOCpulse::Probe::Shell::Unix::connect OK ... What seems to be happening on the scheduler is the command's STDOUT is getting written to the logfile and not read by Unix::read_result. In the first trace you can see this in the last two log lines: 2011-08-04 12:31:44 Linux#2.6.18-238.12.1.el5xen#2834 2011-08-04 12:31:44 NOCPULSE-1312457503-STATUS 0 So, somewhere STDOUT is being redirected in such a way that it's interfering with the STDOUT of all the children. I'll see if I can find out why. Incidentally, the files in the TSDBLocalQueue I mentioned earlier are a red herring - they seem to contain the output from when the probe last worked properly but with the current time, for some reason. Nevertheless, they do actually get loaded into the DB (verified with SQLplus) so I think it's just the problems with output streams that are causing monitoring not to work. Regards, -- David Nutter Tel: +44 (0)131 650 4888 BioSS, JCMB, King's Buildings, Mayfield Rd, EH9 3JZ. Scotland, UK Biomathematics and Statistics Scotland (BioSS) is formally part of The James Hutton Institute (JHI), a registered Scottish charity No. SC041796 and a company limited by guarantee No. SC374831 _______________________________________________ Spacewalk-list mailing list [email protected] https://www.redhat.com/mailman/listinfo/spacewalk-list
