The whole question of latency could fill several lectures, if we were still in college, and a good deal of research besides.
The CPU Measurement Facility mentioned before offers us a 'peek through the keyhole' view of what is going on behind the scenes that, until now, we had no way of seeing.

As I see it today, there are four layers of latency. The first is instruction fetch. The next is address resolution, followed by data fetch, followed lastly by data store. The first and third of these can be greatly affected by where the data is coming from, be it level 1 cache or level 4 cache. Address resolution takes a hit when a LOAD is followed by another LOAD that uses the first one's result as a base or index register. Data store is almost always into level 1 cache, but even this can cause delays because the cache directory must be updated for all associated processors.

Hardware Instrumentation Services can tell us the number of cycles used to execute a number of instructions. It also gives us the number of writes into the cache directory and the penalty cycles spent waiting for the data to harden. What is not clear to me is how those two interrelate. Does a cache penalty cycle also count as an instruction cycle? In addition, does an MVCL with a padding byte of x'B0' bypass all the cache levels, or just level 1? And does bypassing the cache, while not flushing it, really save time, given that the data now has to come from farther out in memory?

All of these are questions that will be answered as we go along. I think the best we can say today is that we will get hints and glimmers of what is happening.

Chris Blaicher
Principal Software Engineer, Software Development
Syncsort Incorporated
50 Tice Boulevard, Woodcliff Lake, NJ 07677
P: 201-930-8260 | M: 512-627-3803
E: [email protected]

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf Of Anne & Lynn Wheeler
Sent: Tuesday, February 04, 2014 12:22 PM
To: [email protected]
Subject: Re: CPU time

[email protected] (Tom Marchant) writes:
> John Eels had a SHARE presentation a couple of years ago where he
> described the cost of going to memory.
> See page 88 of this:
> https://share.confex.com/share/119/webprogramschedule/Handout/Session11718/SHARE 119 Session 11718 Presentation.pdf
>
> I don't remember what processor this information referred to. Bottom
> line is that when the data comes from L1 cache, it is available during
> the same machine cycle. If it has to come from main storage, it takes
> about 850 machine cycles.

For a decade or so, the latency cost of a cache miss to memory, counted in processor cycles, has been similar to the count of 360 processor cycles for a 360 disk I/O ... aka memory has become the new disk. That is the motivation for things like hyperthreading (multiple overlapped i-streams), simulating a multiprocessor ... aka the hardware equivalent of multitasking, allowing execution to overlap with things that are waiting. It is also behind out-of-order execution (skipping past an instruction stalled on a cache miss). The introduction of out-of-order execution in z196 is claimed to be a major factor in the increase in processor throughput between z10 and z196 (something that dates back a couple of decades on some other platforms).

Also, processor cycle time has been improving faster than memory latency ... which harkens back to my theme in the 70s and early 80s that processors were getting faster, much faster, than disks were. At one point in the early 80s, I was saying that relative system disk speed had declined by a factor of ten over a period of 15 years (processor & memory got 40-50 times faster, disks got 3-5 times faster). Disk division executives assigned their performance group to refute my statements ... but after a couple of weeks they came back and effectively said that I had slightly understated the problem. The analysis was respun and turned into a SHARE presentation on optimizing disk configurations for system throughput.
--
virtualization experience starting Jan1968, online at home since Mar1970

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
