On Wed, 9 Dec 2020 at 10:53, Charles Mills <[email protected]> wrote:
>
> I have been thinking about this. It is a daunting project. I once set out to
> develop a simple list of opcodes with their required minimum hardware level.
> I wanted to be able to answer questions of the form "management wants this
> product to be able to run on a z9. Can I use AHI?" In fact I think I asked
> for help on this list, and promised to share any results. I abandoned the
> effort! I decided it was easier just to code AHI and assemble with ZS-5 or
> whatever and see if I got an error. That list is obviously just one
> component of your product.

There has been a list at http://www.tachyonsoft.com/inst390o.htm for
many years. I'm not at all sure it's maintained these days, but I use
it frequently as a quick reference.

On the more general matter of analysing a load module/PO/UNIX file and
classifying the opcodes found therein, I think this is pretty much an
AI project. Surely it engages the Halting Problem for the general
case. Given sufficient non-code (data) space within the module, it is
very likely that some of the data will be interpreted as instructions,
and notably many of the instructions having their first byte in the
range of EBCDIC letters are from the newer architecture levels.
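
To illustrate the data-as-code problem, here is a minimal Python sketch of a
naive linear sweep. It assumes only the architectural rule that the top two
bits of the first opcode byte give the instruction length (00 = 2 bytes,
01 or 10 = 4 bytes, 11 = 6 bytes). The function names and the sample message
text are made up for the example, and cp037 is just one of Python's built-in
EBCDIC codecs.

```python
def instruction_length(first_byte: int) -> int:
    """Length in bytes implied by the top two bits of the first opcode byte."""
    top = first_byte >> 6
    if top == 0:
        return 2
    if top == 3:
        return 6
    return 4


def linear_sweep(blob: bytes):
    """Naively carve blob into (offset, raw bytes) pieces as if it were all code."""
    pieces = []
    pos = 0
    while pos < len(blob):
        n = instruction_length(blob[pos])
        pieces.append((pos, blob[pos:pos + n]))
        pos += n
    return pieces


# Hypothetical message text sitting in the module as data.
text = "INVALID PARM".encode("cp037")
for offset, raw in linear_sweep(text):
    print(f"+{offset:02X} {raw.hex().upper()}")
```

Upper-case EBCDIC letters are X'C1' to X'E9', so every one of them starts
what looks like a 6-byte instruction, and a sweep like this cannot tell
such text apart from real code without more context.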

There are some other problems. In a few cases the behaviour of
existing instructions has been extended at a later architectural level.
Notably the Long Displacement Facility added use of a previously
unused byte in around 50 existing instructions to go from a 12-bit
unsigned to a 20-bit signed displacement. This isn't insurmountable;
finding just one such instruction with a non-zero DH field means that
that facility is required. And this facility is pretty old now (2003),
so unlikely to not be required.
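
The DH check could be sketched like this in Python. It assumes the caller
has already established that the six bytes are one of the affected
long-displacement-format instructions (the sketch does not look at the
opcode), and that the layout is opcode, R1/X2, B2 plus the low 12
displacement bits (DL), then the high 8 bits (DH), then the opcode
extension. The function names are made up, and LG (E3..04) is used only as
a sample.

```python
def signed_displacement(instr: bytes) -> int:
    """Signed 20-bit displacement of a 6-byte long-displacement-format instruction."""
    if len(instr) != 6:
        raise ValueError("expected a 6-byte instruction")
    dl = ((instr[2] & 0x0F) << 8) | instr[3]  # low 12 bits (DL)
    dh = instr[4]                             # high 8 bits (DH)
    disp = (dh << 12) | dl
    if disp & 0x80000:                        # sign bit of the 20-bit value
        disp -= 0x100000
    return disp


def needs_long_displacement(instr: bytes) -> bool:
    """A non-zero DH byte means the facility is required."""
    return instr[4] != 0


old_style = bytes.fromhex("E31230000004")  # LG 1,0(2,3): DH is zero
new_style = bytes.fromhex("E3123FF8FF04")  # LG 1,-8(2,3): DH is X'FF'
print(needs_long_displacement(old_style), signed_displacement(old_style))
print(needs_long_displacement(new_style), signed_displacement(new_style))
```

A zero DH proves nothing either way, of course; only a non-zero one is
evidence.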

The Interlocked-Access Facility 1 (2010) and 2 (2012) each turned some
existing instructions into interlocked-update ones. This included old
standbys like NI, OI, and XI. There is no way to tell from the
instruction itself if the program is relying on the new behaviour.

The ETF2-Enhancement Facility (2005) added meaning to a bit in the
existing TRxx instructions (TROO, TROT, TRTO, and TRTT).

And there are very CISC facilities like the Message-Security-Assists
that are more like library calls that support many subfunctions than
like even the previous most complex instructions. A program can issue
a query to find out if a function is supported, or just assume that it
is, so analysis of such code would require data flow analysis or
finding a hard-coded function that was introduced at a particular
level.

And then there is the EXecute instruction (actually two of them now).
Typically this is used to put the length into an MVC or CLC or the
like. Occasionally it has been used to put the SVC number into an SVC
or the mask into a TM. But given that part of the opcode in some
instruction formats is in the second byte, it's quite possible to
change the target instruction itself. So e.g. if you EX a STCK (B205)
with the low-order byte of the execute register containing X'40', you
will instead execute a SQER (Square Root short HFP) instruction (B245).
This would be fine
material for an obfuscated programming contest, and is of course
unlikely to be found in real-life code. But who knows - maybe code
obfuscators (human or software) have used it to "protect" their code.
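
The opcode change is easy to model. A minimal Python sketch, assuming only
the documented EX behaviour that the low-order byte of the register is ORed
into the second byte of the target instruction before it is executed (the
function name and the sample operand bytes are made up for the example):

```python
def ex_effective_target(target: bytes, r1_value: int) -> bytes:
    """Return the instruction bytes that EX would actually execute.

    The low-order 8 bits of the register value are ORed into the second
    byte of the target. Pass r1_value=0 to model "no modification".
    """
    modified = bytearray(target)
    modified[1] |= r1_value & 0xFF
    return bytes(modified)


stck = bytes.fromhex("B2051000")  # STCK with a sample operand field
print(ex_effective_target(stck, 0x40).hex().upper())
```

With X'40' in the register, B205 becomes B245, so a static scan that only
looks at the target in storage sees the wrong opcode.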

Tony H.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO IBM-MAIN
