One of the more difficult performance monitoring problems that I have come across is determining the impact of multiple workloads running on a server. Consider a server that has about 1000 long-running database processes (anywhere from many minutes to many months) mixed with batch jobs written in Bourne shell. Largely because of the batch jobs, it is not uncommon for sar to report hundreds of forks and execs per second.
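To make the measurement problem concrete, here is a toy model of such a workload in Python. It is only a sketch under stated assumptions: it is not Solaris code, the project names and numbers are made up, and the "sampler" is a generic snapshot tool that can only see processes alive at the instant it looks. It compares that sampler against exact per-project totals.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Proc:
    project: str     # hypothetical project name
    start_ms: int    # start time, milliseconds
    end_ms: int      # exit time, milliseconds
    cpu_frac: float  # fraction of one CPU used while alive


def cpu_used(p, t_ms):
    """CPU-milliseconds consumed by process p up to time t_ms."""
    alive = max(0, min(t_ms, p.end_ms) - p.start_ms)
    return alive * p.cpu_frac


def exact_totals(procs, horizon_ms):
    """Exact accounting: every process is charged, however short-lived."""
    totals = defaultdict(float)
    for p in procs:
        totals[p.project] += cpu_used(p, horizon_ms)
    return dict(totals)


def sampled_totals(procs, horizon_ms, interval_ms):
    """Snapshot model (an assumption, not any real tool's algorithm):
    at each sample only live processes are visible, and each one is
    credited with the CPU it used since it was last seen."""
    totals = defaultdict(float)
    last_seen = {}  # process index -> CPU-ms observed at its last sample
    for k in range(1, horizon_ms // interval_ms + 1):
        t = k * interval_ms
        for i, p in enumerate(procs):
            if p.start_ms <= t < p.end_ms:
                now = cpu_used(p, t)
                totals[p.project] += now - last_seen.get(i, 0.0)
                last_seen[i] = now
    return dict(totals)


def build_workload():
    """One long-running 'database' process at 50% of a CPU, plus a
    'batch' process every 100 ms that burns a full CPU for 50 ms."""
    procs = [Proc("database", 0, 10**9, 0.5)]
    for k in range(600):  # 60 seconds of batch activity
        start = k * 100
        procs.append(Proc("batch", start, start + 50, 1.0))
    return procs


if __name__ == "__main__":
    workload = build_workload()
    print("exact  :", exact_totals(workload, 60_000))
    print("sampled:", sampled_totals(workload, 60_000, 5_000))
```

With these made-up numbers both projects really consume 30 CPU-seconds over the minute, but the 5-second sampler credits the batch project with nothing, because every batch process has exited before it could accumulate visible CPU time at a sample point.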
There is something of a knee-jerk reaction to move the batch jobs off of the database server. However, quantifying how much of an impact this would have is hard to do. Using "prstat -a" or "prstat -J" does not seem to give a very accurate picture. My guess is that prstat misses most of the very short-lived processes, since they can start and exit between its samples.

The best solution that I have come up with is to write extended accounting (exacct) task records every few minutes, then process the exacct file afterwards. Writing the code to write exacct records periodically and make sense of them later is far from trivial. It is also impractical for multiple consumers (monitoring frameworks, administrators, etc.) to use this approach on the same machine at the same time, because the exacct records need to be written and this is presumably too expensive an operation to do very often.

It seems as though it should be possible for the kernel to maintain per-user, per-project, and per-zone statistics. Perhaps collecting them all the time is not desirable, but updating the three sets of statistics on each context switch seems as though it would be lighter weight than writing accounting records and then post-processing them. A side effect of having this data available would be that tools like prstat could report accurate data. Other tools could likely get this data through kstat or a similar interface.

Maybe this already exists but is not exposed. If it looks like a reasonable thing to do, is there a good place to start looking to add such functionality? My guess is that it would be hidden somewhere in /usr/src/uts/common/disp, but an admittedly short perusal of that directory hasn't turned up anything obvious.

Mike
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org