From: "Tony Harminc" <[email protected]>
Sent: Monday, 18 February 2013 12:05 PM


On 17 February 2013 18:45, robin <[email protected]> wrote:
From: Tony Harminc

One might better think of the mix of instructions encountered in a real-world
instruction stream. ED and EDMK form a minuscule fraction of all
instructions executed, and even in a commercial environment, the
packed decimal instructions (including CVB, CVD, PACK, and UNPK) form
a very small portion. Indeed only the RR and RX instructions (and
their modern counterparts) show up on any sort of ordinary graph of
instruction use, and all others can go in the "other" bucket.

Some think that instruction frequency counts are the measure
of the usefulness of instructions.

But first, you have overlooked the CISC-style instructions of the S/360.
I did not restrict that classification to just ED, EDMK, etc.,
but extended it to RR and RX instructions like shift, multiply, and divide,
the floating-point instructions that involve pre- and post-normalization
(in particular, though not only, multiply and divide),
the decimal instructions, and most of the character instructions.

Defining RISC and CISC is a bit of a mug's game.

You need to take a look at a peer-reviewed textbook.

There are many people
with very strong opinions on whether a particular instruction or style
of instruction immediately puts an architecture into one or the other
category, but equally, those people's opinions differ greatly on any
particular feature. It is amusing, and perhaps even instructive, to
read the Wikipedia article with an eye to the multiple edits and text
streams as people with conflicting opinions add their view as a
"However,..." to the previous one.

The Wikipedia entry for RISC is noted for its paucity of references to
authoritative sources; the reviewer has asked that they be supplied.

Simple instruction counts tell you nothing. Take the TRT instruction,
for example.  To search for a single character in, say, a string of length
256 requires one TRT instruction. To do the same without TRT requires a loop
of several instructions.  Multiply that by the number of iterations (up to
256), and the loop could require the execution of up to 1024 instructions.
Given the choice, which would you use?
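As a sketch of what the comparison is about (a model, not the full instruction semantics: the real TRT also sets general registers 1 and 2 and the condition code), the table-driven scan TRT performs can be pictured like this; the 256-entry function table and the stop-on-nonzero scan are the essence of it:

```python
def trt(data: bytes, table: bytes):
    """Model of the S/360 TRT (Translate and Test) scan:
    index each byte of data into a 256-entry function table
    and stop at the first byte whose table entry is nonzero,
    reporting its position and function byte."""
    assert len(table) == 256
    for i, b in enumerate(data):
        if table[b] != 0:
            return i, table[b]   # hit: position and function byte
    return None                  # no hit (condition code 0)

# Build a table that flags only the comma character.
table = bytes(1 if chr(c) == ',' else 0 for c in range(256))
print(trt(b"HELLO,WORLD", table))   # -> (5, 1)
```

The point of the argument above is that the Python `for` loop here corresponds to the multi-instruction search loop, while the hardware does the whole scan as one instruction.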

Of course I usually use the TR, because it's easy to write and
understand. Most compilers do not generate any TRs at all.

I understand that mainframe PL/I does, and has done for a long time.

Usually the
compiler-generated code outperforms mine, unless I am writing with
close attention to performance.

The TR instruction for this task runs in a fraction (perhaps 1/10th) of the
time of the loop.
Given a list of characters to search for, TR romps home in anything from
1/20-th to 1/100-th of the time of the loop method.

I'd like to see those numbers. I think you will find that TR, and many
other complex and rare instructions, are interpreted by millicode
streams of RR, RX, and similar instructions on a modern zArch machine.

This discussion has been about the S/360 and to some extent of the S/370.
For z/Arch you can check that out with IBM.

For an IBM 360 look-alike, TRT takes 4.8 + 1.36N µs
(where N is the number of bytes searched).
If we use a loop testing for one character, the instruction times are:

    CLI        2.0 µs
    LA         1.3 µs
    BE         2.2 µs
    BCT/BCTR   1.6 µs

Loop time = 7.1 µs per byte, plus loop set-up time,
or about 5 times slower than TRT.
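Taking the figures quoted above at face value (an assumption; they are for one particular 360 look-alike), the crossover arithmetic is simple:

```python
# Quoted timings: TRT scan = 4.8 + 1.36*N microseconds for N bytes;
# single-character test loop ~= 7.1 microseconds per byte.
def scan_us(n):
    return 4.8 + 1.36 * n

def loop_us(n):
    return 7.1 * n        # ignoring loop set-up time

for n in (16, 64, 256):
    print(n, round(loop_us(n) / scan_us(n), 2))
```

For N = 256 the ratio works out to roughly 5, which is where the "about 5 times slower" figure comes from; the fixed 4.8 µs set-up means the advantage shrinks somewhat for short strings.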

The loop times increase as soon as multiple characters are searched for.

Thus, to make a reasonable comparison of instruction frequencies,
one must use a weighted comparison, counting each TRT instruction as,
say, 10 to 100, or even 1000, ordinary instructions, depending on what
it is searching for.
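One way to read that weighting argument, with entirely made-up counts and weights chosen only to illustrate the shape of the calculation:

```python
# Hypothetical raw opcode counts from a trace, and per-instruction
# weights reflecting roughly how many simple instructions each would
# expand into if replaced by a loop (the 10x-1000x range above).
counts  = {"L": 500_000, "A": 300_000, "BC": 400_000, "TRT": 2_000}
weights = {"L": 1, "A": 1, "BC": 1, "TRT": 100}

weighted = {op: n * weights[op] for op, n in counts.items()}
total = sum(weighted.values())
for op, w in weighted.items():
    print(op, f"{100 * w / total:.1f}%")
```

On raw counts TRT is well under 1% of the mix and vanishes into the "other" bucket; weighted by the work it replaces, it accounts for a double-digit share, which is the substance of the disagreement here.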

Perhaps one should look at the frequency of the basic (RISCy)
instructions and those used by the millicode to implement the likes of
TR and EDMK.

Similarly for the other CISC-type instructions, like decimal, which you
consign to the "other" bucket.

Decimal instructions are frequent enough in commercial work that they
are probably implemented by a combination of millicode and hardware
assists. They are still outweighed by the RISCy ones.

Certainly S/360 and S/370 are CISC machines from a time long predating
the terms RISC and CISC. But for practical purposes (such as compiler
writing or performance analysis) rather than academic taxonomy, they
may as well be RISC.

They are very definitely nothing like RISC.

Hmmm... Many aspects of them are very much like RISC. Of course no one
is suggesting that the S/360 and S/370 architectures are purely RISC.
But then neither is the IBM Power purely RISC, though almost everyone
calls it a RISC architecture.

RISC machines attempt to execute each instruction in one machine cycle.
That tends to restrict the range of instructions that can be implemented
on such a machine.  Such an instruction set would include integer operations
for load, add, subtract, compare, store, shift by one place (though more
places could be handled in a single machine cycle), and branch.  However,
the instruction set wouldn't include such luxuries as multiply, divide, etc.

I think you will find it difficult to find a widely (let alone
universally) accepted definition of RISC.

As I said, you need to take a look at a textbook.

Certainly some features, if
encountered, find easy agreement that "this is not a RISC feature".
But in my opinion S/360 and S/370 and their successors, even with
1000+ instructions, have many of the RISC attributes,

Actually they don't.  I have already explicitly pointed these out to you.
Here they are again:
ED, EDMK, TR, TRT, PACK, UNPK, CVD, CVB,
SLL, SRL, M, D, AE, SE, ME, DE, AP, SP, ZAP, MP, DP, CP,
CLC, MVC, MVO, MVN, MVZ, MVCL, CLCL, and their derivatives.

and those
attributes are very widely used, particularly by compiler generated
code.

Compilers tend to be reluctant to use the best instructions,
or even to apply the best optimisations, so they are not good
benchmarks to cite for model instruction usage.
