Hello:

I have a number of high-CPU processes that run on 24-core boxes, configured for example as:

check process emr-enc01-01 with pidfile /var/run/tada_liveenc_emr-enc01-01.pid
  start program = "/usr/local/tada/launch.sh -c emr-enc01-01"
  stop program = "/bin/bash -c 'kill -s SIGTERM `/bin/cat /var/run/tada_liveenc_emr-enc01-01.pid`'"
  if totalmem > 80% then alert
  if totalmem > 90% then restart
  if totalcpu < 10% for 10 cycles then alert

These processes create pidfiles, and the PIDs in them match what top shows:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1710 root      20   0 3064m 1.2g 7808 S  578 15.8  47:31.53 tada_liveenc
 1866 root      20   0 2954m 1.3g 7804 S  545 16.7  45:18.52 tada_liveenc

However, monit reports a completely different total CPU usage for the same processes:

Process 'emr-enc01-01'
  status                            Running
  monitoring status                 Monitored
  pid                               1710
  parent pid                        1
  uptime                            8m 
  children                          0
  memory kilobytes                  1372300
  memory kilobytes total            1372300
  memory percent                    16.7%
  memory percent total              16.7%
  cpu percent                       4.1%
  cpu percent total                 4.1%
  data collected                    Thu, 05 Jan 2012 00:05:49

Process 'emr-enc01-02'
  status                            Running
  monitoring status                 Monitored
  pid                               1866
  parent pid                        1
  uptime                            8m 
  children                          0
  memory kilobytes                  1362240
  memory kilobytes total            1362240
  memory percent                    16.6%
  memory percent total              16.6%
  cpu percent                       4.1%
  cpu percent total                 4.1%
  data collected                    Thu, 05 Jan 2012 00:05:49
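
As a sanity check on the figures above, here is a minimal sketch that rescales top's %CPU values to a share of all 24 cores. It assumes top is in its default mode, where 100% means one full core, and it assumes (not confirmed) that monit reports CPU as a percentage of total capacity. The helper name `normalize_cpu` is hypothetical and only for illustration.

```python
# Hypothetical helper: rescale top's per-core %CPU to a share of all cores.
CORES = 24  # from the post: 24-core boxes


def normalize_cpu(top_percent: float, cores: int = CORES) -> float:
    """Convert top's %CPU (100% = one full core) to percent of total capacity."""
    return top_percent / cores


# PID -> %CPU exactly as shown in the top output above
top_readings = {1710: 578.0, 1866: 545.0}

normalized = {pid: round(normalize_cpu(pct), 1) for pid, pct in top_readings.items()}

for pid, pct in normalized.items():
    print(f"pid {pid}: {pct}% of total CPU capacity")
```

Even under that assumption the rescaled values come out near 24.1% and 22.7%, so dividing by the core count alone does not account for the 4.1% that monit shows.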

Any thoughts on why this might be happening?  The hosts run Ubuntu Natty.  The 
master processes each spawn about 150 threads (not forks).

FYI:

662 root@enc01[tada]: $ uname -m
x86_64

663 root@enc01[tada]: $ file `which monit`
/usr/local/bin/monit: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.0, not stripped

664 root@enc01[tada]: $ monit -V
This is Monit version 5.3.2
Copyright (C) 2000-2011 Tildeslash Ltd. All Rights Reserved.

Thanks in advance,
-Tom