I worked in MQ performance on z/OS. MQ uses the same logger as DB2, and
similar ways of accessing data on page sets.
I see there are different sorts of workload.
Mainly sequential write - for example, writing to the logs.
- These write n * 4K pages, where I think n is up to 32. They might
write 20 pages, then possibly rewrite the last page written, then
write 10 more pages.
- If there is rollback activity, then it looks in its buffers for the
page; if it is not there, it reads from disk. It may be in the disk cache;
if not, it has to come from the disk itself. These tend to be sequential
reads, and so can exploit disk read-ahead at the hardware level.
Mainly write - randomish
- This would be individual 4K pages - though it might write multiple
pages at a time.
- Every update would cause a disk write.
- A read may be satisfied by data in the buffer pools, and so not need a
disk read.
Mixtures - these tend to be unsophisticated programs with no cached data.
___________________
To make it more complex, your disks may be duplexed (mirroring within the
DASD controller), and with zHyperWrite z/OS does two writes.
________________________________
Out of the above, most of the IO is write.
Here is my first pass at a benchmark (a small C program):
Open data set
Do I = 1 to large N
  start timer
  write n * 4KB record
  flush
  stop timer
  save (endtime - starttime) in an array
End
Close data set
Find the average write time and standard deviation, and plot the times in
a chart (make sure there are no spikes).
Repeat with different "block sizes".
Plot block size vs data rate.
________________________________
Reading is complex.
If the data is in the DASD cache it will be faster, if it has to read from
disk it will be slower.
So you might try writing n * 4KB blocks.
Then have a program which reads the n * 4KB blocks and times each read.
If you do the write and read close together, the data may be in the DASD
cache.
If you write today, and read next week, you may get the reads from disk.
__________________
If your reads are sequential the hardware will prefetch the data.
If you want a random read measurement, I think the hardware handles 64KB
chunks of data, so you need to allow for this. Of course, once you've run
the read test you have to wait another week before you rerun it.
(If you can flush the cache you do not have to wait a week)
________________
Having done all that, use the SMF 42.6 records to get the I/O stats at a
data set level, to see where the delays were... e.g. in the channel, or
reading from disk.
See Understanding SMF 42.6 data set statistics
<https://colinpaice.blog/tag/smf-42/>
Please contact me offline if you would like to discuss it more.
Colin
__________________________
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN