Package: procps
Version: 1:3.2.8-2tls
Severity: normal
Tags: patch

Most in-kernel counters nowadays are 64 bits wide, even on 32-bit
arches.  But procps uses unsigned long internally, and strtoul() to
convert from text to number.  In case of overflow, strtoul() returns
ULONG_MAX (all bits set, i.e. -1), and the difference between two
such values is always 0.  So with large in-kernel counters vmstat
stops displaying statistics properly, leading to false system
troubleshooting analysis.
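
To make the failure concrete, here is a minimal sketch in C.  It is
not code from procps: the helper names are hypothetical, and a 32-bit
unsigned long is emulated with uint32_t so that the behaviour is the
same on a 64-bit host.  The clamp to UINT32_MAX stands in for what
strtoul() does on overflow (return ULONG_MAX, set errno to ERANGE).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: behaves like strtoul() would on a platform
 * where unsigned long is 32 bits.  Values that do not fit are
 * clamped to UINT32_MAX and errno is set to ERANGE. */
uint32_t parse_like_strtoul32(const char *text)
{
    assert(text != NULL);
    unsigned long long v = strtoull(text, NULL, 10);
    if (v > UINT32_MAX) {
        errno = ERANGE;
        return UINT32_MAX;
    }
    return (uint32_t)v;
}

/* Difference between two textual samples of the same counter,
 * the way a vmstat-like tool would derive a per-interval value. */
uint32_t saturated_delta(const char *prev, const char *cur)
{
    return parse_like_strtoul32(cur) - parse_like_strtoul32(prev);
}
```

With both samples above 2**32, for example
saturated_delta("5000000000", "5000001000"), both parse to the same
clamped value and the reported difference is 0, although the real
counter advanced by 1000.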

The attached patch provides a very simple fix for this.  It does
not solve the root problem, but makes it disappear in almost all
cases: it only shows up when the actual counter overflows, in which
case the difference between "this" and "previous" will be around
2**32, only once, and such a difference is easy to understand.  So
it stops being an issue in practice.  The real fix involves major
code changes.

Thanks.

-- System Information:
Debian Release: 5.0.3
  APT prefers stable
  APT policy: (990, 'stable'), (60, 'testing'), (50, 'unstable'), (1, 'experimental')
Architecture: i386 (x86_64)

Kernel: Linux 2.6.32-rc7-amd64 (SMP w/2 CPU cores)
Locale: LANG=ru_RU.UTF-8, LC_CTYPE=ru_RU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages procps depends on:
ii  initscripts               2.86.ds1-61    Scripts for initializing and shutt
ii  libc6                     2.7-18         GNU C Library: Shared libraries
ii  libncurses5               5.7+20081213-1 shared libraries for terminal hand
ii  lsb-base                  3.2-20         Linux Standard Base 3.2 init scrip

Versions of packages procps recommends:
ii  psmisc                        22.6-1     Utilities that use the proc filesy

procps suggests no packages.

-- no debconf information

After some uptime, vmstat starts displaying zeros in various
stats columns instead of real numbers.  This is because most
kernel counters are now 64 bits wide, but in procps they are
32 bits on i386 (and other 32-bit arches).  To convert values
read from files in /proc, procps uses strtoul(), which returns
ULONG_MAX (all bits set, i.e. -1) in case of overflow.  The
difference between two such values is always 0, so the
statistics become useless.

The real fix to this and other similar problems is to always
use 64-bit counters in procps.  But that requires a lot more
changes all over the place.

But a much simpler fix is possible too: changing strtoul() to
strtoull(), which returns an unsigned long long (at least 64
bits).  Hopefully that one will not overflow.  We convert the
result into our native unsigned long, truncating the most
significant part if necessary.  This way we still have the
correct least significant part, which is enough for comparison
with the previous value of the same counter, and subtraction
gives a good result.
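
The truncation argument can be checked with a small sketch in C.
The helper names are hypothetical and not part of procps, and
uint32_t again stands in for a 32-bit unsigned long.  It assumes the
difference is taken in that same unsigned type; how vmstat itself
subtracts the samples is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse with strtoull() and keep only the low
 * 32 bits, mirroring the (unsigned long) cast on i386. */
uint32_t parse_truncated32(const char *text)
{
    assert(text != NULL);
    return (uint32_t)strtoull(text, NULL, 10);
}

/* Unsigned subtraction is modulo 2**32, so the low bits alone give
 * the true difference as long as that difference is below 2**32. */
uint32_t truncated_delta(const char *prev, const char *cur)
{
    return parse_truncated32(cur) - parse_truncated32(prev);
}
```

For instance, truncated_delta("5000000000", "5000001000") yields
1000 even though neither sample fits in 32 bits.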

Signed-off-by: Michael Tokarev <m...@tls.msk.ru>

--- procps-3.2.8/proc/sysinfo.c.orig    2008-03-24 07:33:43.000000000 +0300
+++ procps-3.2.8/proc/sysinfo.c 2009-11-28 11:53:45.816811421 +0300
@@ -606,3 +606,3 @@ void meminfo(void){
     if(!found) goto nextline;
-    *(found->slot) = strtoul(head,&tail,10);
+    *(found->slot) = (unsigned long)strtoull(head,&tail,10);
 nextline:
