[osol-discuss] question concerning kstat's biostats, ufs_inode_cache, hsfs_hsnode_cache

2006-04-28 Thread Thomas Maier-Komor
Hi,

I am once again looking at a kstat output and trying to understand what some of 
these fields might mean and what their unit might be. Unfortunately the units 
aren't documented anywhere, are they?

biostats is probably the statistics for the ddi I/O buffers of Solaris that are 
accessible via bioinit(9f). So lookup and cache hits and misses are counted 
here. Unit is probably each. 

Is there a field hidden somewhere that tells us how big the I/O buffers are? I 
can only see biostats' new_buffer_requests, but nothing that hints at how 
big the buffer currently is.

Concerning ufs_inode_cache and hsfs_hsnode_cache, I'd like to know what unit 
buf_inuse has. Is it kB or pages or something else?

TIA,
Tom
BTW: is the result of sysconf(_SC_CPUID_MAX) the maximum id a processor can have 
or the maximum id no processor will ever have?
 
 
This message posted from opensolaris.org
___
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org


Re: [osol-discuss] question concerning kstat's biostats, ufs_inode_cache, hsfs_hsnode_cache

2006-04-28 Thread Daniel Rock

Thomas Maier-Komor wrote:

> Hi,
>
> I am once again looking at a kstat output and trying to understand what
> some of these fields might mean and what their unit might be.
> Unfortunately the units aren't documented anywhere, are they?

Use the source, Luke!



> biostats is probably the statistics for the ddi I/O buffers of Solaris
> that are accessible via bioinit(9f). So lookup and cache hits and misses
> are counted here. Unit is probably each.

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/os/bio.c#biostats

Later in the same file you can look up when the statistics will be 
incremented. But I don't think biostats are of much interest any more.




> Concerning ufs_inode_cache and hsfs_hsnode_cache, I'd like to know what
> unit buf_inuse has. Is it kB or pages or something else?

Unit is number of elements, as with all kstat entries of class kmem_cache.



> BTW: is the result of sysconf(_SC_CPUID_MAX) the maximum id a processor can
> have or the maximum id no processor will ever have?


Well, sysconf(_SC_CPUID_MAX) finally ends up in the kernel:

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/syscall/sysconfig.c#165

max_cpuid is initialized to a default value of (NCPU - 1) - some 
architectures may re-set max_cpuid:


http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/os/cpu.c#max_cpuid


So _SC_CPUID_MAX returns the maximum possible value. If an architecture 
supports cpuids from 0..31 (total 32 cpus) then sysconf(_SC_CPUID_MAX) will 
return 31. So you should iterate over


for(cpuid = 0; cpuid <= cpuid_max; ++cpuid)
  ...

But for my performance gathering tool I wrote a few years ago I didn't 
bother to get _SC_CPUID_MAX at all. I just iterated over all kstat entries 
while searching for the right kstat modules:


  ncpu = 0;
  for(ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next)
  {
    cpu_stat_t *cp;

    /* skip everything that is not a raw kstat of module "cpu_stat" */
    if((ksp->ks_type != KSTAT_TYPE_RAW) ||
       (strncmp(ksp->ks_module, "cpu_stat", 8)))
      continue;
    if(kstat_read(kc, ksp, NULL) == -1)
      continue;
    ++ncpu;
    [...]
  }

It really isn't that inefficient. Even on large machines with lots of RAM, 
disks and CPUs the cumulative running time was ~60 minutes over a period 
of 200 days. Every 60 seconds the program fetched performance counters from 
disks, network, memory and cpu. Only on Solaris 2.6 did accessing the kstat 
system_misc module block for a few seconds on machines with large memory.
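
To put those figures in perspective: one sample every 60 seconds over 200 
days is 288000 samples, so ~60 minutes of cumulative running time works out 
to about 12.5 ms per sampling pass. A small sketch of that arithmetic 
follows. The function name is made up for illustration, and it assumes the 
60 minutes are the total run time of the collector.

```c
/*
 * Average cost of one sampling pass, in milliseconds.
 * total_minutes: cumulative running time of the collector
 * days:          length of the observation period
 * interval_s:    seconds between two samples
 */
static double ms_per_sample(double total_minutes, double days, double interval_s)
{
    double samples = days * 86400.0 / interval_s;   /* number of passes */

    return total_minutes * 60.0 * 1000.0 / samples;
}
```

Calling ms_per_sample(60.0, 200.0, 60.0) reproduces the estimate above.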




Daniel


Re: [osol-discuss] question concerning kstat's biostats, ufs_inode_cache, hsfs_hsnode_cache

2006-04-28 Thread Thomas Maier-Komor
Daniel Rock wrote:
> Thomas Maier-Komor wrote:
>> Hi,
>>
>> I am once again looking at a kstat output and trying to understand what
>> some of these fields might mean and what their unit might be.
>> Unfortunately the units aren't documented anywhere, are they?
>
> Use the source, Luke!
 

;-)

Asking and getting a good answer sometimes has the advantage that one
can avoid making the same mistake someone else already made and be much
faster at the same time...

>> biostats is probably the statistics for the ddi I/O buffers of Solaris
>> that are accessible via bioinit(9f). So lookup and cache hits and misses
>> are counted here. Unit is probably each.
>
> http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/os/bio.c#biostats
>
> Later in the same file you can look up when the statistics will be
> incremented. But I don't think biostats are of much interest any more.
 

Great pointer! BTW: your comment concerning biostats is unfortunately
not included in the source - so I'm really glad that I asked, because I
was considering using it ;-)

 
>> Concerning ufs_inode_cache and hsfs_hsnode_cache, I'd like to know what
>> unit buf_inuse has. Is it kB or pages or something else?
>
> Unit is number of elements, as with all kstat entries of class
> kmem_cache.
>
>> BTW: is the result of sysconf(_SC_CPUID_MAX) the maximum id a processor can
>> have or the maximum id no processor will ever have?
>
> Well, sysconf(_SC_CPUID_MAX) finally ends up in the kernel:
>
> http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/syscall/sysconfig.c#165
>
> max_cpuid is initialized to a default value of (NCPU - 1) - some
> architectures may re-set max_cpuid:
>
> http://cvs.opensolaris.org/source/xref/on/usr/src/uts/common/os/cpu.c#max_cpuid
>
> So _SC_CPUID_MAX returns the maximum possible value. If an architecture
> supports cpuids from 0..31 (total 32 cpus) then sysconf(_SC_CPUID_MAX)
> will return 31. So you should iterate over
 

Thanks! I thought it must be like this, but wasn't sure. Having a
reference where one can look is always good. Although the sources of
Solaris are available and browsing them is interesting, this isn't
really enough to know them and be sure one interprets a piece of code
correctly and doesn't miss anything in another part. Concerning the
board utilities the sources are great, because they mostly concern a
couple of files over which you quickly get an overview and good
understanding. But the kernel takes more time than I can spend, as I
really get paid for doing something else...

> for(cpuid = 0; cpuid <= cpuid_max; ++cpuid)
>   ...
>
> But for my performance gathering tool I wrote a few years ago I didn't
> bother to get _SC_CPUID_MAX at all. I just iterated over all kstat
> entries while searching for the right kstat modules:
>
>   ncpu = 0;
>   for(ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next)
>   {
>     cpu_stat_t *cp;
>
>     if((ksp->ks_type != KSTAT_TYPE_RAW) ||
>        (strncmp(ksp->ks_module, "cpu_stat", 8)))
>       continue;
>     if(kstat_read(kc, ksp, NULL) == -1)
>       continue;
>     ++ncpu;
>     [...]
>   }
 

I also iterate only once and search for the modules/classes of interest.
Why do you use strncmp? Are ks_module and Co not guaranteed to be null
terminated or did you mix it up with ks_name which includes the trailing id?

I need _SC_CPUID_MAX, because I only want to allocate memory once for
all CPUs. It doesn't really matter if I ever use this memory, because
the amount is so low that it is at most two pages altogether. OTOH
this approach makes the source code a little bit smaller and faster to
implement.
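
A minimal sketch of that allocate-once approach, under some assumptions: 
struct cpu_sample and both function names are hypothetical, and the 
fallback to _SC_NPROCESSORS_CONF is only there so the code also builds on 
systems without _SC_CPUID_MAX. The fallback can underestimate when CPU ids 
are sparse, so treat it as a placeholder.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-CPU record; the fields are placeholders. */
struct cpu_sample {
    unsigned long user, kernel, idle;
};

/* Highest CPU id that may show up, so valid ids are 0..return value. */
static long highest_cpu_id(void)
{
#ifdef _SC_CPUID_MAX
    long id = sysconf(_SC_CPUID_MAX);
    if (id >= 0)
        return id;
#endif
    /* Assumption: without _SC_CPUID_MAX, guess from the configured count. */
    long n = sysconf(_SC_NPROCESSORS_CONF);
    return n > 0 ? n - 1 : 0;
}

/* One zeroed slot per possible id; the caller indexes by cpuid and frees. */
static struct cpu_sample *alloc_cpu_table(long max_id)
{
    return calloc((size_t)max_id + 1, sizeof(struct cpu_sample));
}
```

The table has max_id + 1 slots because the returned value is itself a valid 
id, so slots for ids that never appear simply stay zero.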

> It really isn't that inefficient. Even on large machines with lots of RAM,
> disks and CPUs the cumulative running time was ~60 minutes over
> a period of 200 days. Every 60 seconds the program fetched performance
> counters from disks, network, memory and cpu. Only on Solaris 2.6
> did accessing the kstat system_misc module block for a few seconds on
> machines with large memory.
 

Interesting experience - thanks for sharing it. I'd like to know if the
behavior concerning system_misc has improved with later Solaris releases...

 
> Daniel
 

Cheers,
Tom


Re: [osol-discuss] question concerning kstat's biostats, ufs_inode_cache, hsfs_hsnode_cache

2006-04-28 Thread Daniel Rock

Thomas Maier-Komor wrote:

> I also iterate only once and search for the modules/classes of interest.
> Why do you use strncmp? Are ks_module and Co not guaranteed to be null
> terminated or did you mix it up with ks_name which includes the trailing id?


Maybe just a leftover from an earlier implementation which iterated over 
ksp->ks_name. strcmp() should be sufficient - the strings are null terminated.
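
The difference between the two comparisons can be sketched as follows. The 
helper names are made up for illustration; the only assumption taken from 
the thread is that ks_module is a plain module name while ks_name carries a 
trailing id, as in "cpu_stat0".

```c
#include <stdbool.h>
#include <string.h>

/* Exact match: fits ks_module, which has no instance suffix. */
static bool module_is(const char *ks_module, const char *want)
{
    return strcmp(ks_module, want) == 0;
}

/* Prefix match: fits ks_name values such as "cpu_stat0". */
static bool name_has_prefix(const char *ks_name, const char *prefix)
{
    return strncmp(ks_name, prefix, strlen(prefix)) == 0;
}
```

With these, module_is("cpu_stat", "cpu_stat") holds, while a name such as 
"cpu_stat0" only matches through name_has_prefix().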




> Interesting experience - thanks for sharing it. I'd like to know if the
> behavior concerning system_misc has improved with later Solaris releases...


Well, in Solaris 2.6 it took 2-3 seconds to read system_misc on a machine 
with 32GB RAM and ~600 MHz CPUs. I skipped Solaris 7 and in 8 the problem 
went away - so I don't know if it has been fixed in 7 or 8. The statistics 
in system_misc were sometimes even unusable (pp_kernel had ridiculously high 
values). Therefore I added some comments in the source code:


/*
 * Accessing system_pages takes an extremely long time (1s with 32 GB RAM).
 * Isn't there something better?
 */
  if((ksp = kstat_lookup(kc, "unix", -1, "system_pages")) == NULL)
    return;
  /* kstat_read() returns -1 on failure */
  if(kstat_read(kc, ksp, NULL) == -1)
    return;
  pp->kmem = data_lookup(kc, ksp, "pp_kernel") * pagesize;

/* Bug in older Solaris versions */
  if(pp->kmem > totalmem)
    pp->kmem = 0;



Some other things you should consider (I fell into these traps while writing 
my little program):


. Although the network drivers do have 64 bit counters in addition to
  the 32 bit ones (obytes64, opackets64, rbytes64, ipackets64) - at
  least for the bge driver these counters are still 32 bit values, so
  they wrap at 2^32. So don't rely on these values really being 64 bits
  wide.
. If you do DR (dynamic reconfiguration) operations or add/remove disks
  you have to kstat_close() and kstat_open() again - otherwise you won't
  notice the changed configuration.
. Also DR: Remember that on large machines if you detach a system board
  some cpu_stat%d entries might vanish.
. If you count disk usage remember that SDS/SVM also registers its
  devices with kstat. In my program I calculated a disk I/O summary over
  all disks and I wondered why on some machines these values were doubled
  or 2.5x:
  They were set up with SDS while other machines had VxVM (no kstat)
  installed.
. Same goes for disk/partition kstat classes: don't count disk I/O twice.
  Basically the relevant portion looks like:

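  for(ksp = kc->kc_chain; ksp != NULL; ksp = ksp->ks_next)
  {
    if(ksp->ks_type != KSTAT_TYPE_IO)
      continue;
    /* skip SDS/SVM metadevices: their I/O is already counted on the disks */
    if(!strcmp(ksp->ks_module, "md"))
      continue;
    /* skip partitions: their I/O is already counted on the whole disk */
    if(!strcmp(ksp->ks_class, "partition"))
      continue;
    /* don't add stale data if the read fails */
    if(kstat_read(kc, ksp, &kio) == -1)
      continue;
    pp->rops += kio.reads;
    pp->wops += kio.writes;
    pp->rbytes += kio.nread;
    pp->wbytes += kio.nwritten;
  }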
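
Regarding the first trap above (counters that wrap at 2^32), one common way 
to cope is to keep your own 64 bit total and add the difference between 
consecutive samples, computed in 32 bit unsigned arithmetic. This is only a 
sketch under assumptions: the struct and function names are hypothetical, 
and it presumes the counter wraps at most once between two samples.

```c
#include <stdint.h>

/* Difference between two samples of a counter that wraps at 2^32. */
static uint64_t delta32(uint32_t prev, uint32_t cur)
{
    /* Unsigned subtraction is modulo 2^32, so one wrap cancels out. */
    return (uint32_t)(cur - prev);
}

/* Hypothetical accumulator that widens a wrapping counter to 64 bits. */
struct counter64 {
    uint32_t last;    /* previous raw sample */
    uint64_t total;   /* sum of all deltas seen so far */
    int      primed;  /* 0 until the first sample has been stored */
};

static void counter64_update(struct counter64 *c, uint32_t raw)
{
    if (c->primed)
        c->total += delta32(c->last, raw);
    c->last = raw;
    c->primed = 1;
}
```

The first sample only primes the accumulator; every later call adds the 
traffic since the previous sample, even across a wrap.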





Daniel