Hi All,

I'm running snmpd on my CentOS 5 systems for monitoring.
Every 3 seconds the prErrorFlags of the configured procs are checked.
Occasionally a check falsely reports an error, which is gone again at the next check.

While investigating this we didn't notice any strange behavior on the machines, and the specific process had not restarted.

I enabled debugging and noticed that whenever such a false positive occurs, the list of running processes is much shorter than usual. The debugging output also showed which function handles the process counting, so I reviewed that piece of code.

Since it reads from /proc/*/status, I wrote a bash script that does exactly the same, but the problem never occurred with it.

I then copied the sh_count_procs function out of the snmpd code into a separate program that does only the proc check, and the problem occurred again.

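Every time the specific proc isn't found, the debug output "Could not fgets for /proc/<pid>/status" is printed as the last line. In the code you can see that the if block handling this case uses a break rather than a continue, so the scan of /proc stops at that entry and the processes listed after it are never compared.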

Is there a reason why there is a break instead of a continue?

I attached my test program, in which the break is already replaced by a continue.
The function is copied from the latest stable release (snmp-5-5), from the file agent/mibgroup/ucd-snmp/proc.c

Is this a bug, and is changing the break into a continue a correct fix?

Greetings,
Johan Huysmans


#include <stdio.h>
#include <string.h>   /* strlen, strncmp, strstr */
#include <ctype.h>    /* isspace, isgraph */
#include <dirent.h>

char           *
skip_white(char *ptr)
{
    if (ptr == NULL)
        return (NULL);
    while (*ptr != 0 && isspace((unsigned char)*ptr))
        ptr++;
    if (*ptr == 0 || *ptr == '#')
        return (NULL);
    return (ptr);
}

char           *
skip_not_white(char *ptr)
{
    if (ptr == NULL)
        return (NULL);
    while (*ptr != 0 && !isspace((unsigned char)*ptr))
        ptr++;
    if (*ptr == 0 || *ptr == '#')
        return (NULL);
    return (ptr);
}

char           *
skip_token(char *ptr)
{
    ptr = skip_white(ptr);
    ptr = skip_not_white(ptr);
    ptr = skip_white(ptr);
    return (ptr);
}


int
sh_count_procs(char *procname)
{
    DIR *dir;
    char cmdline[512], *tmpc;
    char state[64];
    struct dirent *ent;
    int len,plen=strlen(procname),total = 0;
    FILE *status;

    if ((dir = opendir("/proc")) == NULL) return -1;
    while (NULL != (ent = readdir(dir))) {
      if(!(ent->d_name[0] >= '0' && ent->d_name[0] <= '9')) continue;
      /* read /proc/XX/status */
      sprintf(cmdline,"/proc/%s/status",ent->d_name);
      if ((status = fopen(cmdline, "r")) == NULL) {
          // DEBUG
          printf("Could not fopen for %s\n", cmdline);
          continue;
      }
      if (fgets(cmdline, sizeof(cmdline), status) == NULL) {
          // DEBUG
          printf("Could not fgets for %s\n", cmdline);
          fclose(status);
//          break;
          continue;
      }
      /* Grab the state of the process as well
       * (so we can ignore zombie processes)
       * XXX: Assumes the second line is the status
       */
      if (fgets(state, sizeof(state), status) == NULL) {
          state[0]='\0';
      }
      fclose(status);
      cmdline[sizeof(cmdline)-1] = '\0';
      state[sizeof(state)-1] = '\0';
      /* XXX: assumes Name: is first */
      if (strncmp("Name:",cmdline, 5) != 0)
          break;
      tmpc = skip_token(cmdline);
      if (!tmpc)
          break;
      /* terminate the name at the first non-printable character */
      for (len=0;; len++) {
          if (tmpc[len] && isgraph((unsigned char)tmpc[len])) continue;
          tmpc[len]='\0';
          break;
      }
      // DEBUGMSGTL(("proc","Comparing wanted %s against %s\n",
      //            procname, tmpc));
      // DEBUG
      printf("Comparing wanted %s against %s\n",procname, tmpc);
      if(len==plen && !strncmp(tmpc,procname,plen)) {
          /* Do not count zombie process as they are not running processes */
          if ( strstr(state, "zombie") == NULL ) {
              total++;
              // DEBUGMSGTL(("proc", " Matched.  total count now=%d\n", total));
          } else {
              // DEBUGMSGTL(("proc", " Skipping zombie process.\n"));
          }
      }
    }
    closedir(dir);
    return total;
}

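int main(void) {
  int result = sh_count_procs("sshd");
  /* Print the count: an exit status cannot represent -1 or counts above 255 */
  printf("Found %d matching processes\n", result);
  return result < 0 ? 1 : 0;
}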
_______________________________________________
Net-snmp-coders mailing list
Net-snmp-coders@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
