The whole question of latency could fill several lectures, if we were still 
in college, and a good deal of research besides.

The CPU Measurement Facility mentioned before offers us a 'peek through the 
keyhole' at what is going on behind the scenes, things that until now we had 
no way of seeing.

As I see it today there are four layers of latency.  The first is instruction 
fetch.  The next is address resolution, followed by data fetch, and lastly 
data store.  The first and third of these can be greatly affected by where 
the data is coming from, be it level 1 cache or level 4 cache.  Address 
resolution stalls when a LOAD is followed by a second LOAD that uses the 
first's result as a base or index register.  Data store is almost always into 
level 1 cache, but even that can cause delays, because the cache directory 
must be updated for all associated processors.

Hardware Instrumentation Services can tell us the number of cycles used to 
execute a number of instructions.  It also gives us the number of writes into 
the cache directory and the penalty cycles spent waiting for the data to 
harden.  What is not clear to me is how those two inter-relate.  Does a cache 
penalty cycle also count as an instruction cycle?

In addition, does an MVCL with a padding byte of x'B0' bypass all levels of 
cache, or just level 1?  And does bypassing the cache (while not flushing it) 
really save time, given that the store now has to go farther out in the 
memory hierarchy?

These are all questions that will be answered as we go along.  I think the 
best we can say today is that we will get hints and glimmers of what is 
happening.

Chris Blaicher
Principal Software Engineer, Software Development
Syncsort Incorporated
50 Tice Boulevard, Woodcliff Lake, NJ 07677
P: 201-930-8260  |  M: 512-627-3803
E: [email protected]


-----Original Message-----
From: IBM Mainframe Discussion List [mailto:[email protected]] On Behalf 
Of Anne & Lynn Wheeler
Sent: Tuesday, February 04, 2014 12:22 PM
To: [email protected]
Subject: Re: CPU time

[email protected] (Tom Marchant) writes:
> John Eels had a SHARE presentation a couple of years ago where he
> described the cost of going to memory.  See page 88 of this:
> https://share.confex.com/share/119/webprogramschedule/Handout/Session11718/SHARE 119 Session 11718 Presentation.pdf
>
> I don't remember what processor this information referred to.  Bottom
> line is that when the data comes from L1 cache, it is available during
> the same machine cycle.  If it has to come from main storage, it takes
> about 850 machine cycles.

for a decade or so the latency cost for a cache miss to memory counted in 
processor cycles is similar to the count of 360 processor cycles for a 360 disk 
i/o ... aka memory has become the new disk.

that is motivation for things like hyperthreading (multiple overlapped
i-streams) ... simulating multiprocessor ... aka the hardware equivalent of 
multitasking to allow overlapping execution with things waiting. it is also 
behind out-of-order execution (skipping past instructions stalled on cache 
miss). Introduction of out-of-order execution for z196 is claimed to be a 
major factor in the increase in processor throughput between z10 and z196 
(something that dates back a couple of decades on some other platforms).

also, processor cycle time has been getting faster than memory latency ... 
which harkens back to my theme in the 70s and early 80s ... that processor was 
getting faster, much faster than disks were getting faster. At one point in the 
early 80s, I was saying that relative system disk speed had declined by a 
factor of ten times over a period of 15 years (processor&memory got 40-50 times, 
disks got 3-5 times).  Disk division executives assigned their performance 
group to refute my statements ... but after a couple weeks they came back and 
effectively said that I had slightly understated the problem ... the analysis 
is respun and turns into a SHARE presentation on optimizing disk configurations 
for system throughput.

--
virtualization experience starting Jan1968, online at home since Mar1970

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions, send email to 
[email protected] with the message: INFO IBM-MAIN


