Jason,
You're absolutely right. When I wrote the disk metrics (disk_total,
disk_free, part_max_used), I did not consider AFS filesystems. Also,
the method of computing the total size differs a bit from df's. As
you saw from the code, it does not count filesystems twice when they
have the same device name, which df will do (when you ask it for
summary info). I put that logic in there to handle the case where
multiple automounted volumes were mounted from the same device.
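To illustrate the idea of counting each device name only once, here is a
minimal sketch. It is not the code from linux.c: the struct names, the
fixed-size table, and the functions seen_before() and total_size() are all
hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SEEN     64   /* hypothetical cap on remembered devices */
#define DEV_NAME_LEN 128  /* hypothetical maximum device-name length */

/* Table of device names that have already been counted. */
struct seen_devices {
    char   names[MAX_SEEN][DEV_NAME_LEN];
    size_t count;
};

/* One mount entry, reduced to the fields this sketch needs. */
struct mount_entry {
    const char   *device;       /* first field of a /proc/mounts line */
    const char   *mount_point;  /* second field */
    unsigned long kblocks;      /* size in 1K blocks */
};

/* Return 1 if `device` was recorded earlier; otherwise record it and
 * return 0. If the table is full the name is not recorded, so later
 * duplicates of it would be counted again. */
static int seen_before(struct seen_devices *seen, const char *device)
{
    for (size_t i = 0; i < seen->count; i++) {
        if (strcmp(seen->names[i], device) == 0)
            return 1;
    }
    if (seen->count < MAX_SEEN) {
        strncpy(seen->names[seen->count], device, DEV_NAME_LEN - 1);
        seen->names[seen->count][DEV_NAME_LEN - 1] = '\0';
        seen->count++;
    }
    return 0;
}

/* Sum the sizes of all entries, counting each device name once. */
static unsigned long total_size(const struct mount_entry *m, size_t n)
{
    struct seen_devices seen = { .count = 0 };
    unsigned long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (!seen_before(&seen, m[i].device))
            total += m[i].kblocks;
    }
    return total;
}
```

Because the key is the device name and not the mount point, two entries
such as "rootfs /" and "/dev/root /" would both be added to the total in
this sketch, while two automounted volumes on the same device would be
counted once.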
You have found several special cases that I did not consider. The
"tmpfs" types (with device name "none") should probably not be counted,
as you guessed. I'm not sure what rootfs is, but I have read that it is
a RAM-backed filesystem, and therefore it shouldn't be counted either.
Note that I parse the /proc/mounts file, which unfortunately has no flag
for "memory-backed". Perhaps the statfs() call can tell us.
Either you or I can add the necessary logic for tmpfs and rootfs. I'm a
bit busy now, but I could get to it in the near future.
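A rough sketch of what such a check could look like, under stated
assumptions: the function names are hypothetical, the type strings are
the ones seen in /proc/mounts above, and the magic numbers are the
TMPFS_MAGIC and RAMFS_MAGIC values published in <linux/magic.h> (copied
here so the sketch does not depend on that header). Whether statfs()
reports one of these values for rootfs on a given kernel is an
assumption that would need testing.

```c
#include <string.h>

/* Values of TMPFS_MAGIC and RAMFS_MAGIC from <linux/magic.h>,
 * redefined under sketch-local names. */
#define SKETCH_TMPFS_MAGIC 0x01021994UL
#define SKETCH_RAMFS_MAGIC 0x858458f6UL

/* Check based on the type column of /proc/mounts.
 * Returns 1 for the types discussed in this thread, else 0. */
static int is_memory_backed_type(const char *type)
{
    return strcmp(type, "tmpfs") == 0
        || strcmp(type, "rootfs") == 0;
}

/* Check based on the f_type field that statfs() fills in.
 * The caller is assumed to pass buf.f_type cast to unsigned long. */
static int is_memory_backed_magic(unsigned long f_type)
{
    return f_type == SKETCH_TMPFS_MAGIC
        || f_type == SKETCH_RAMFS_MAGIC;
}
```

The string check needs nothing beyond what /proc/mounts already
provides; the magic-number check would only be useful if the type
column turned out to be unreliable.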
Nice observations about the code,
Federico
On Monday, March 3, 2003, at 01:48 PM, Jason A. Smith wrote:
We have AFS mounted on our Linux compute cluster and this is screwing
up the disk_free, disk_total and part_max_used metrics in gmond. With
afs mounted, I have the following entry in /proc/mounts:
AFS /afs afs rw 0 0
df also shows this for afs:
AFS 9000000 0 9000000 0% /afs
I am sure these numbers are meaningless for afs but the statfs syscall
in get_fs_usage must be adding this 9GB value to the disk metrics
because our numbers are off by almost exactly that amount. I have
attached a patch which adds afs to the list of remote filesystem types
to skip in addition to nfs, autofs and smbfs.
I said "almost" above because, even with my patch, gmond's value is a
lot closer to what I add up with df but is still a little bit off, and
I am not sure why. I noticed that the device_space function skips filesystems
that it has seen before, but it seems to go by the device name, not the
mount point. On my desktop (RedHat 2.4.18-24.7.x kernel), I seem to
have the root filesystem listed twice with different device names:
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
Is this being counted twice?
~Jason
PS. Should devices with name equal to none also be skipped, like the
Linux kernel's shared memory fs? From /proc/mounts I have:
none /dev/shm tmpfs rw 0 0
And this filesystem does report some space that might be getting added
up with all the rest:
none 256816 0 256816 0% /dev/shm
--
/------------------------------------------------------------------\
| Jason A. Smith Email: [EMAIL PROTECTED] |
| Atlas Computing Facility, Bldg. 510M Phone: (631)344-4226 |
| Brookhaven National Lab, P.O. Box 5000 Fax: (631)344-7616 |
| Upton, NY 11973-5000 |
\------------------------------------------------------------------/
diff -uNr ganglia-monitor-core-2.5.2-dist/gmond/machines/linux.c ganglia-monitor-core-2.5.2/gmond/machines/linux.c
--- ganglia-monitor-core-2.5.2-dist/gmond/machines/linux.c Tue Jan 7 12:05:38 2003
+++ ganglia-monitor-core-2.5.2/gmond/machines/linux.c Mon Mar 3 16:41:49 2003
@@ -919,7 +919,8 @@
or if (it is of type smbfs and its Fs_name starts with `//'). */
return ((strchr(device,':') != 0)
|| (!strcmp(type, "smbfs") && device[0]=='/' && device[1]=='/')
- || (!strcmp(type, "autofs")));
+ || (!strcmp(type, "autofs"))
+ || (!strcmp(type, "afs")));
}
/* --------------------------------------------------------------------------- */
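For readers without the source tree at hand, the patched test can be
sketched as a standalone function. The hunk does not show the name of
the enclosing function, so is_remote_fs() is a hypothetical name; the
body follows the patched lines above.

```c
#include <string.h>

/* Return nonzero if the mount looks remote and should be skipped:
 *  - the device name contains ':' (e.g. an NFS "host:/path" name),
 *  - the type is smbfs and the device name starts with "//",
 *  - the type is autofs, or
 *  - the type is afs (the case added by the patch). */
static int is_remote_fs(const char *device, const char *type)
{
    return ((strchr(device, ':') != NULL)
        || (!strcmp(type, "smbfs") && device[0] == '/' && device[1] == '/')
        || (!strcmp(type, "autofs"))
        || (!strcmp(type, "afs")));
}
```

With this, the "AFS /afs afs rw 0 0" entry from /proc/mounts is
skipped, while "none /dev/shm tmpfs" and "rootfs / rootfs" are still
treated as local.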
Federico
Rocks Cluster Group, SDSC, San Diego
GPG Fingerprint: 3C5E 47E7 BDF8 C14E ED92 92BB BA86 B2E6 0390 8845