The S/370 PoOps mentions the vector facility and says "Vector operations are 
described in the publication IBM System/370 Vector Operations, SA22-7125."

You can download it from 
http://bitsavers.org/pdf/ibm/370/vectorFacility/SA22-7125-3_Vector_Operations_Aug88.pdf


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Assembler List [[email protected]] on behalf 
of Dan Greiner [[email protected]]
Sent: Friday, June 5, 2020 2:56 PM
To: [email protected]
Subject: Re: Does the z architecture have something like the SIMD instructions

Although it does actually access multiple data items, PERFORM LOCK OPERATION 
(PLO) really doesn't qualify as a SIMD instruction (see my PLO screed below).

Seymour's reference to the Wikipedia page 
(https://en.wikipedia.org/wiki/SIMD) is about as adequate a definition as any 
I've seen. As I recall, IBM's 
original implementation of vector instructions appeared as an optional 
extension to ESA/390, but these were never part of the standard architecture 
defined in the PoO.

With the advent of the z13 (2015), IBM added vector instructions to the general 
architecture, and added Chapters 21-24 to the PoO. There are 32 vector 
registers, each having 128 bits ... but the leftmost 64 bits of VRs 0-15 are 
the same as floating-point registers 0-15. This is not to say that VRs are 
necessarily 
floating-point entities; they can be binary integers, strings, or floating 
point.
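
To make the register layout concrete, here is a rough Python model of one 
128-bit VR. It is only a sketch: the function names are made up for 
illustration, the floating-point view assumes IEEE binary (BFP) long format, 
and nothing here reflects how the hardware is actually programmed. It shows 
the same 16 bytes read as a floating-point value in the leftmost 64 bits, as 
four 32-bit integers, and an element-wise add across four lanes.

```python
import struct

# Toy model: one 128-bit vector register held as 16 big-endian bytes.
# All names here are illustrative; they are not architecture terms.

def fpr_view(reg: bytes) -> float:
    """Leftmost 64 bits (bytes 0-7) read as a long BFP value (assumed format)."""
    return struct.unpack(">d", reg[:8])[0]

def as_words(reg: bytes) -> tuple:
    """The same 128 bits reinterpreted as four unsigned 32-bit integers."""
    return struct.unpack(">4I", reg)

def add_words(a: bytes, b: bytes) -> bytes:
    """Element-wise add of four 32-bit lanes, each wrapping modulo 2**32."""
    x = struct.unpack(">4I", a)
    y = struct.unpack(">4I", b)
    return struct.pack(">4I", *((p + q) & 0xFFFFFFFF for p, q in zip(x, y)))

# A register holding two 64-bit floating-point elements.
vr = struct.pack(">2d", 1.5, -2.0)
```

With `vr` as defined, `fpr_view(vr)` gives back 1.5, while `as_words(vr)` 
shows the identical bits as four integers.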

With the introduction of the z14 (2017), IBM added (a) new instructions that 
enhanced the existing VR facility, and (b) a vector packed-decimal facility 
(the latter being a benefit to COBOL and other packed players). With the 
introduction of the z15 (2019), IBM added a second enhancement to the VR 
facility. There are now around 190 separate vector instructions — with a 
mind-boggling array of extended mnemonics. If you haven't bothered to download 
a PoO in the last few years, it's worth it (but if you choose to print it, have 
two reams of paper handy). Check out SA22-7832-15 for the latest version.

Regarding PLO, this provides the means by which multiple, discontiguous storage 
locations can appear to be updated atomically without having to bother 
acquiring a lock. However, in order for PLO to operate properly, EVERY program 
that inspects or modifies those storage locations also has to do it with PLO. 
This is because the firmware for PLO gets its own lock in HSA, and serializes 
other CPUs' attempts to use PLO with that lock. If programs on other CPUs 
examine the data without using PLO, the updates do not necessarily appear to 
be atomic. And, if 
some programs use PLO and others try to perform updates with classic 
compare-and-swap logic, really BAD things happen (as certain z/OS developers 
have discovered more than once). If nobody was actually using PLO, I would have 
quietly proposed removing it from the architecture, but (alas) there are some 
OS components that have actually managed to use it properly.
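
The serialization rule above can be sketched in Python. This is a loose 
analogy, not PLO itself: the real instruction has function codes and a 
parameter list that are not modeled here, and `_plo_lock`, 
`plo_style_update` and `plo_style_read` are hypothetical names. The single 
hidden lock stands in for the one the firmware takes; the update looks atomic 
only to callers that go through the same functions.

```python
import threading

# Stand-in for the lock the PLO firmware acquires (hypothetical name).
_plo_lock = threading.Lock()

def plo_style_update(store: dict, expected: dict, updates: dict) -> bool:
    """Compare-and-update several locations under the one shared lock.

    If every store[k] equals expected[k], apply all of updates and return
    True; otherwise change nothing and return False.
    """
    with _plo_lock:
        if any(store[k] != v for k, v in expected.items()):
            return False
        store.update(updates)
        return True

def plo_style_read(store: dict, keys) -> tuple:
    """Read several locations consistently by taking the same lock."""
    with _plo_lock:
        return tuple(store[k] for k in keys)

# A reader that touches `store` directly, or an updater that uses some other
# mechanism (the analogue of plain compare-and-swap), never takes _plo_lock
# and so can observe or create a half-applied update.
```

For example, with `store = {"head": 1, "count": 10}`, calling 
`plo_style_update(store, {"head": 1}, {"head": 2, "count": 11})` changes both 
fields together, and a second call still expecting `head == 1` fails without 
touching anything.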

For a far more flexible (and higher performance) means of atomic updates of 
multiple storage locations, check out the transactional-execution facility 
introduced with the zEC12 (2012).
