Re: Optimization, CPU time, and related issues

Anne & Lynn Wheeler Wed, 19 Feb 2014 17:02:18 -0800

re:
http://www.garlic.com/~lynn/2014c.html#62 Optimization, CPU time, and related issues


aka the internal operation of the machine ... and the execution elements
actually being managed ... are becoming less & less directly related to
the external instruction architecture.

for instance, RISC architectures have had significant performance
advantage over i86 (having pioneered superscalar, out-of-order
execution, branch prediction, speculative execution, etc) ... however
for the last several generations of server chips ... i86 has gone to
a hardware layer that translates i86 instructions into RISC micro-ops for
execution ... which has largely mitigated the difference in throughput
between RISC and i86. the more sophisticated compilers will include some
level of model of the internal execution characteristics as part of code
generation.

another feature common in i86 has been hyperthreading ... in the 70s, I
got sucked into a proposal to do hyperthreading for the 370/195 (that
never shipped) ... basically feeding the execution units from two
separate (simulated multiprocessor) i-streams. The 370/195 was
out-of-order, superscalar, and pipelined ... but conditional branches
stalled the processing (no branch prediction or speculative
execution). Peak 370/195 throughput was around 10mips ... but reaching
it tended to require very careful coding ... most codes with conditional
branching only ran around 5mips. the idea was that with two i-streams,
each running around 5mips throughput (because of conditional branch
processing stalling the machine) ... the machine would achieve 10mips
aggregate throughput.
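
a toy sketch of the two i-stream argument above ... this is not a model
of the real 370/195; the single issue slot, the switch-on-stall policy,
and the run_length/stall_cycles values are all assumptions picked so
that one stream issues in about half the cycles (roughly the 5mips
versus 10mips peak case):

```python
def simulate(num_streams, cycles, run_length=4, stall_cycles=4):
    """Return the fraction of cycles in which an instruction was issued.

    Hypothetical model: one issue slot per cycle; each i-stream issues
    run_length instructions, then hits a conditional branch and cannot
    issue for stall_cycles cycles. The slot stays with the current
    stream until it stalls, then moves to the next ready stream.
    """
    stall_left = [0] * num_streams    # cycles until each stream may issue
    since_branch = [0] * num_streams  # instructions since the last branch
    issued = 0
    current = 0
    for _ in range(cycles):
        # Look for a ready stream, starting with the one that ran last.
        chosen = None
        for offset in range(num_streams):
            s = (current + offset) % num_streams
            if stall_left[s] == 0:
                chosen = s
                break
        # Streams that are already stalled move one cycle closer to resuming.
        for s in range(num_streams):
            if stall_left[s] > 0:
                stall_left[s] -= 1
        if chosen is not None:
            issued += 1
            since_branch[chosen] += 1
            if since_branch[chosen] == run_length:
                # Conditional branch: this stream stalls (no prediction).
                since_branch[chosen] = 0
                stall_left[chosen] = stall_cycles
            current = chosen
    return issued / cycles


PEAK_MIPS = 10  # approximate peak figure quoted in the post

for n in (1, 2):
    print(n, "i-stream(s):", simulate(n, 8000) * PEAK_MIPS, "mips")
```

with these made-up parameters one stream keeps the issue slot busy half
the time and two streams fill every cycle ... real codes would not
interleave this neatly, so the second figure is an upper bound, not a
prediction.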

360/91, 360/195, 370/195 discussed here
http://www.quadibloc.com/comp/pan05.htm

the above talks about cycle time of 91, 95, & 195 ... basically the same
750ns memory used in the 65 & 75. Originally the 360/60 and 360/70 were
going to have 1us memory ... but it was upgraded to 750ns ... and the
model numbers changed.

65/(67) & 75 did double word fetch at a time ... for the i-stream it
kept the full 8-bytes around ... so it didn't require a separate memory
fetch for every instruction. the timing values for the machines include
instruction execution and other data fetch/store memory times plus a
prorated amount for instruction fetch (assuming execution normally
proceeds sequentially) ... aka a 2byte instruction includes 1/4 of
750ns instruction fetch, a 4byte instruction includes 1/2 of 750ns
instruction fetch, a 6byte instruction includes 3/4 of 750ns
instruction fetch.
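
the prorating above works out as follows ... a minimal sketch, where the
function name and defaults are mine (only the 8-byte double word fetch
and the 750ns memory cycle come from the text):

```python
def prorated_fetch_ns(instr_bytes, fetch_bytes=8, mem_cycle_ns=750):
    """Share of one double word instruction fetch charged to an instruction.

    Assumes sequential execution, so each 8-byte fetch is fully used and
    an instruction is charged in proportion to its length.
    """
    return instr_bytes / fetch_bytes * mem_cycle_ns


for length in (2, 4, 6):
    print(f"{length}byte instruction: {prorated_fetch_ns(length)}ns")
```

that gives 187.5ns, 375ns, and 562.5ns for 2, 4, and 6 byte
instructions ... a taken branch breaks the sequential assumption, since
the rest of the fetched double word goes unused.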

the hyperthreading gimmick had been proposed in the ACS-360 effort
http://people.cs.clemson.edu/~mark/acs_end.html

see "Sidebar: Multithreading" in above ... which is followed by another
sidebar about acs-360 features that finally show up 20yrs later in
es/9000.  Earlier in the article Amdahl talks about IBM executives
shutting down the effort because it would advance the computing
state-of-the-art too fast and they would lose control of the market.

-- 
virtualization experience starting Jan1968, online at home since Mar1970
