Your message dated Sat, 26 Nov 2016 18:38:14 +0000
with message-id <[email protected]>
and subject line Bug#845765: Removed package(s) from unstable
has caused the Debian Bug report #725268,
regarding nagios3: nagios.log - misleading errors about check results queue
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)


-- 
725268: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=725268
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: nagios3
Version: 3.4.1-5+b1
Severity: minor
Tags: upstream

Dear Maintainer,
this is a reincarnation of bug #522538, which was closed as
unreproducible some time ago.

I'm affected by this bug too, but fortunately I have got further in
observing the problem. It appeared while I was preparing a new
backup monitoring virtual host based on Debian Wheezy amd64. I had got
to the point where the new server had nearly the same configuration
as the production server. The difference I noticed between the
two servers' nagios.log files is that the new node regularly logs:

[1380716982] Error: Unable to rename file '/var/lib/nagios3/spool/checkresults/checkP5jroS' to '/var/lib/nagios3/spool/checkresults/c84vzxe': No such file or directory
[1380716982] Warning: Unable to move file '/var/lib/nagios3/spool/checkresults/checkP5jroS' to check results queue.

I am certain the Nagios configurations on both servers are the
same because I'm using Unison to synchronize them.
I installed Systemtap to see the problem at the syscall level.
I don't know how Nagios processes check results, but I adapted an
example Systemtap script to monitor the open, rename and unlink syscalls.

11978246 6312 (nagios3) open /var/lib/nagios3/spool/checkresults/checkP5jroS returned 8
11986098 26537 (nagios3) rename( /var/lib/nagios3/spool/checkresults/checkP5jroS -> /var/lib/nagios3/spool/checkresults/cMlI3we ) returned 0
11988931 26532 (nagios3) rename( /var/lib/nagios3/spool/checkresults/checkP5jroS -> /var/lib/nagios3/spool/checkresults/c84vzxe ) returned -2
11989054 26532 (nagios3) unlink /var/lib/nagios3/spool/checkresults/checkP5jroS returned -2

A nagios process (pid 26537) renamed the result file, and another nagios
process (pid 26532) tried to rename the same file shortly afterwards
(about 3 ms later, going by the trace timestamps).
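
The -2 returned to pid 26532 is -ENOENT, the same "No such file or
directory" that ends up in nagios.log. As a minimal sketch (this is not
Nagios code; the function name second_rename_errno and the /tmp path are
made up for illustration), renaming the same source name twice gives
that error even within a single process:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a temp file, rename it away, then try to
 * rename the same source name a second time.
 * Returns the errno of the second rename(), 0 if it unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int second_rename_errno(void)
{
    char src[] = "/tmp/checkXXXXXX";   /* illustrative path, not the Nagios spool */
    char dst1[64], dst2[64];
    int err = 0;

    int fd = mkstemp(src);
    if (fd < 0)
        return -1;
    close(fd);

    snprintf(dst1, sizeof dst1, "%s.first", src);
    snprintf(dst2, sizeof dst2, "%s.second", src);

    /* First rename: corresponds to pid 26537 in the trace. */
    if (rename(src, dst1) != 0) {
        unlink(src);
        return -1;
    }
    /* After a successful rename the source name is gone. */
    assert(access(src, F_OK) != 0);

    /* Second rename of the same source: corresponds to pid 26532. */
    if (rename(src, dst2) != 0)
        err = errno;

    unlink(dst1);   /* clean up the file that was actually moved */
    return err;
}
```

On Linux the returned value should be ENOENT (2), matching the trace.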

This did not explain why the two Nagios boxes behave
differently. I started comparing the installed packages and found that
smbclient was missing on the new server. I had installed the software
without recommended packages to keep the number of installed packages
small. I'm monitoring a Samba share, so I installed smbclient on
the new Nagios server too. The errors in nagios.log disappeared. :)

I set up another Nagios server on my desktop (Debian Sid with
nagios3 3.4.1-5+b1) and simplified check_disk_smb until I
understood the problem. Pieces of the configuration and the scripts are
attached so you can reproduce the problem. In short:

        A Perl check running in the embedded Perl interpreter can do
        a fork() syscall, but if the child process fails to exec() an
        external binary and leaves the Perl interpreter through exit(),
        then the cleanup phase that calls move_check_result_to_queue()
        (base/checks.c) runs in two places: in the parent process
        and also in the child process. This is a bug in
        base/checks.c and should be fixed upstream. It would probably be
        sufficient to test whether the pid of the running process has
        not changed (I'm the parent) and to call
        move_check_result_to_queue() only in the parent process.
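
The pid test suggested above could look roughly like the following
sketch. It is not a patch against base/checks.c: queue_check_result()
is only a stand-in for move_check_result_to_queue(), whose real
signature I have not reproduced here, and cleanup_if_owner(),
child_ran_cleanup() and the path "checkXXXXXX" are made-up names for
illustration. The idea is to remember the pid before the check runs and
skip the cleanup in any process whose pid differs:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Counts how often the cleanup ran in this process (illustration only). */
int cleanup_calls = 0;

/* Stand-in for move_check_result_to_queue(); the real function differs. */
void queue_check_result(const char *path)
{
    cleanup_calls++;
    printf("pid %ld queues %s\n", (long)getpid(), path);
}

/* Run the cleanup only in the process that started the check.
 * Returns 1 if the cleanup ran, 0 if it was skipped. */
int cleanup_if_owner(pid_t owner_pid, const char *path)
{
    assert(owner_pid > 0);
    if (getpid() != owner_pid)
        return 0;               /* forked child: leave the file alone */
    queue_check_result(path);
    return 1;
}

/* Fork a child that falls through into the cleanup path, as happens
 * when exec() fails under the embedded Perl interpreter.
 * Returns 1 if the child ran the cleanup, 0 if it skipped it,
 * -1 on error. */
int child_ran_cleanup(void)
{
    pid_t owner = getpid();
    int status = 0;

    fflush(stdout);             /* do not duplicate buffered output */
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(cleanup_if_owner(owner, "checkXXXXXX"));

    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the guard in place only the parent should touch the result file,
so the second rename() seen in the trace would not be attempted.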

Thanks for your time on packaging Nagios!
Best Regards
-- 
Zito


-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.10-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=cs_CZ.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages nagios3 depends on:
ii  nagios3-cgi   3.4.1-5+b1
ii  nagios3-core  3.4.1-5+b1

nagios3 recommends no packages.

Versions of packages nagios3 suggests:
ii  nagios-nrpe-plugin  2.13-3

-- no debconf information
#! /usr/bin/perl -w
# nagios: +epn
use strict;


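# A piped open with "-|" forks: the parent gets the child's pid, the child gets 0.
my $pid = open my $pipe, "-|";
if (defined($pid)) {
	if ($pid) {
		# parent: nothing to do here, fall through to wait
	} else {
		# child: exit right away, standing in for a failed exec()
		exit(1);
	}
}
else {
	# fork failed
	exit(3);
}
wait;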

print "OK child has pid $pid\n";
exit(0);
define command {
        command_name    nagtest_check
        command_line    /usr/local/bin/check_nagtest
}

define host {
  host_name                     nagtest
  alias                         nagtest
  check_command                 return-ok
  max_check_attempts            1
  notifications_enabled         0
}

define service {
  host_name                     nagtest
  service_description           nagtest_service
  check_command                 nagtest_check
  max_check_attempts            1
  check_interval                1
  retry_interval                1
  normal_check_interval         1
  notifications_enabled         0
}
#! /usr/bin/env stap

global start

function timestamp:long() { return gettimeofday_us() - start }

function proc:string() { return sprintf("%d (%s)", pid(), execname()) }

function filename_filter:long(filename) {
    return substr(filename, 0, 36) == "/var/lib/nagios3/spool/checkresults/"
}

probe begin { start = gettimeofday_us() }

probe syscall.open.return {
  filename = user_string($filename)
  if ( filename_filter(filename) ) {
      printf("%d %s open %s returned %d\n", timestamp(), proc(), filename, $return)
  }
}

probe syscall.unlink.return {
  pathname = user_string($pathname)
  if ( filename_filter(pathname) ) {
      printf("%d %s unlink %s returned %d\n", timestamp(), proc(), pathname, $return)
  }
}

probe syscall.rename.return {
  oldname = user_string($oldname)
  newname = user_string($newname)
  if ( filename_filter(oldname) || filename_filter(newname) ) {
      printf("%d %s rename %s -> %s returned %d\n", timestamp(), proc(), oldname, newname, $return)
  }
}

probe kprocess.exec {
  printf("%d %s exec %s\n", timestamp(), proc(), filename)
}

--- End Message ---
--- Begin Message ---
Version: 3.5.1.dfsg-2.2+rm

Dear submitter,

as the package nagios3 has just been removed from the Debian archive
unstable we hereby close the associated bug reports.  We are sorry
that we couldn't deal with your issue properly.

For details on the removal, please see https://bugs.debian.org/845765

The version of this package that was in Debian prior to this removal
can still be found using http://snapshot.debian.org/.

This message was generated automatically; if you believe that there is
a problem with it please contact the archive administrators by mailing
[email protected].

Debian distribution maintenance software
pp.
Scott Kitterman (the ftpmaster behind the curtain)

--- End Message ---
