Hi, I don't know if this happens for you too, but I ran into a problem in the reactionner when it processes a Notification (with the latest git code).
The reactionner worker that handles the notification crashes with a traceback:

[0][scheduler-central]Stats : Workers:1 (Queued:0 Processing:0 ReturnWait:0)
[1][scheduler-CMP]Stats : Workers:1 (Queued:0 Processing:0 ReturnWait:0)
Wait ratio: 1.0
Notification instance has no attribute 'timeout'
Ask actions to 1 got 1
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "./shinken/worker.py", line 207, in work
    self.manage_finished_checks()
  File "./shinken/worker.py", line 146, in manage_finished_checks
    action.check_finished(self.max_plugins_output_length)
  File "./shinken/action.py", line 178, in check_finished
    self.check_finished_unix(max_plugins_output_length)
  File "./shinken/action.py", line 191, in check_finished_unix
    if (now - self.check_time) > self.timeout:
AttributeError: Notification instance has no attribute 'timeout'
We ask us for a ping
========================
[reactionner-central] Warning : the worker 0 goes down unexpectly!
[0][scheduler-central]Stats : Workers:0 (Queued:0 Processing:1 ReturnWait:0)
[1][scheduler-CMP]Stats : Workers:0 (Queued:0 Processing:1 ReturnWait:0)
Wait ratio: 1.0
[reactionner-central] Allocating new Worker : 1

After debugging, I found that the Notification is correctly created and sent scheduler-side (in the get_checks method), but the reactionner receives this Notification without the 'timeout' attribute (after the Pyro remote call to get_checks)!
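For what it's worth, this symptom would be consistent with a class whose pickling is restricted to a declared list of properties, since objects sent over a remote call get serialized (with pickle, as far as I know, in the case of Pyro). Here is a minimal sketch of that mechanism. It assumes, without having verified it in the Shinken code, that __getstate__ only copies the names listed in 'properties'; FakeNotification and its fields are made up for the illustration.

```python
import pickle


class FakeNotification(object):
    # Hypothetical stand-in for Notification; only these names get serialized.
    properties = {'command': None, 'check_time': 0}

    def __init__(self):
        self.command = 'notify-by-email'
        self.check_time = 0
        self.timeout = 5  # set on the instance, but NOT listed in properties

    def __getstate__(self):
        # Assumption: the pickled state is built from the properties list only.
        return dict((name, getattr(self, name)) for name in self.properties)

    def __setstate__(self, state):
        self.__dict__.update(state)


original = FakeNotification()
# A pickle round trip stands in for the remote call here.
received = pickle.loads(pickle.dumps(original))

print(hasattr(original, 'timeout'))   # True
print(hasattr(received, 'timeout'))   # False
```

If the real class behaves like this, any attribute missing from 'properties' would be lost on every transfer, which would match the AttributeError above.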
Here is a small patch that worked for me (adding the 'timeout' attribute to the 'properties' list defined in the Notification class). I don't know if it's the right way to fix the problem:

notification.py
93c93
<
---
> 'timeout' : StringProp(default=5),

And a little worker.py patch to add exception catching:

146,147c150,157
< action.check_finished(self.max_plugins_output_length)
< wait_time = min(wait_time, action.wait_time)
---
> try:
>     action.check_finished(self.max_plugins_output_length)
>     wait_time = min(wait_time, action.wait_time)
> except Exception, exp:
>     print "[%d]Error!!! %s, exiting." % (self.id, exp)
>     sys.exit(2)

But after fixing this first problem, another bug occurred (again in the reactionner, when the worker returns its result to the reactionner). Traceback:

Traceback (most recent call last):
  File "/usr/local/shinken/bin/shinken-reactionner", line 5, in <module>
    pkg_resources.run_script('Shinken==0.4', 'shinken-reactionner')
  File "/usr/lib/python2.6/dist-packages/pkg_resources.py", line 467, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.6/dist-packages/pkg_resources.py", line 1200, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/local/lib/python2.6/dist-packages/Shinken-0.4-py2.6.egg/EGG-INFO/scripts/shinken-reactionner", line 158, in <module>
    p.main()
  File "/usr/local/lib/python2.6/dist-packages/Shinken-0.4-py2.6.egg/shinken/satellite.py", line 708, in main
    self.manage_action_return(self.returns_queue.pop())
  File "/usr/local/lib/python2.6/dist-packages/Shinken-0.4-py2.6.egg/shinken/satellite.py", line 309, in manage_action_return
    sched_id = action.sched_id
AttributeError: Notification instance has no attribute 'sched_id'

I found where the problem is, but it's very strange and I didn't manage to solve it. When the reactionner gets a Notification from the scheduler, it adds the sched_id attribute to it and puts it in its 'self.s' Queue (multiprocessing.Queue), ok.
But when the worker dequeues this Notification, the sched_id attribute has disappeared! I tried to dequeue the Notification just after it had been queued by the reactionner, and the attribute really had disappeared!

Do you have any idea? Some race condition?

I'm running Python 2.6.6 (Debian Squeeze).

Laurent

_______________________________________________
Shinken-devel mailing list
Shinken-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/shinken-devel