note that test&set was on both 360/67 and 360/65 machines and was atomic.
I've commented before that charlie invented compare&swap (the name chosen because CAS are his initials) while doing fine-grain multiprocessor locking for CP67 (the 360/67 precursor to vm370) at the science center.
http://manana.garlic.com/~lynn/subtopic.html#545tech
and
http://manana.garlic.com/~lynn/subtopic.html#smp

we then attempted to get it added to 370 architecture. Initially we were rebuffed because the POK favorite son operating system people said that test&set was more than adequate for multiprocessor support (serializing critical code sections). The 370 architecture owners said that getting it justified would require additional uses, not just multiprocessor serialization. Thus were invented the multiprogramming/multithreading examples (used whether or not running on a multiprocessor machine) that are still shown in the principles of operation.

The problem in a multithreaded application is that it is enabled for interrupts and can lose control while inside a locked/critical section. Compare&Swap does the atomic operation directly, without needing to lock a critical section. This was especially leveraged by large multiprogramming/multithreading DBMS to avoid making kernel calls for lots of serialization ... and by the 80s lots of other platforms (especially those supporting high-throughput DBMS) were including compare&swap (or instructions with similar semantics).

I first saw transactional memory on 801/risc in the late 70s. They demonstrated that they could do transactional-type operations on applications that weren't originally coded for transactions. The 801/risc ROMP (research/office products) started out going to be used for a displaywriter followon. When the displaywriter followon was canceled, they looked around and decided to retarget it to the workstation market. They hired the company that had done the UNIX port to IBM/PC for PC/IX to do one for romp. This was eventually released as PC/RT and AIX.
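
Going back to the test&set versus compare&swap point above, here is a minimal sketch of the difference, assuming C11 <stdatomic.h> as a stand-in (it is not the 370 instructions themselves). The names lock_word, locked_total, locked_add and cas_add are hypothetical, made up for illustration. The first routine serializes a critical section with a test&set style lock; the second does the update directly with a compare&swap retry loop, so there is no lock for an interrupted thread to be left holding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lock word and shared value, for illustration only. */
static atomic_flag lock_word = ATOMIC_FLAG_INIT;
static long locked_total = 0;

/* test&set style: spin until the flag is obtained, update inside the
   critical section, then release. A thread that loses control between
   obtaining and releasing the flag stalls everybody else spinning. */
long locked_add(long delta)
{
    while (atomic_flag_test_and_set(&lock_word)) {
        /* spin: some other thread holds the lock */
    }
    locked_total += delta;
    long result = locked_total;
    atomic_flag_clear(&lock_word);
    return result;
}

/* compare&swap style: no lock at all. Fetch the old value, compute the
   new one, and store it only if the target still holds the old value;
   otherwise retry with the refreshed value. */
long cas_add(_Atomic long *target, long delta)
{
    assert(target != NULL);
    long old = atomic_load(target);
    /* On failure the call reloads the current value into old, so the
       next attempt recomputes old + delta from fresh data. */
    while (!atomic_compare_exchange_weak(target, &old, old + delta)) {
        /* retry */
    }
    return old + delta;
}
```

A thread interrupted in the middle of cas_add only causes its own retry; nobody else waits on it, which is the property the multithreaded examples rely on.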

The followon to ROMP was RIOS (rs/6000) and they used the transactional memory to implement JFS ... journalling the UNIX filesystem metadata changes ... with a claim that it was more efficient than directly implementing journalling calls in the filesystem. However, Palo Alto then did a portable JFS that used explicit journalling calls ... and demonstrated on RS/6000 that it was much faster than the transactional memory implementation.
http://manana.garlic.com/~lynn/subtopic.html#801

Note that s/370 had very strong (multiprocessor) memory consistency, which cost a huge amount in performance. Two-processor multiprocessor machines slowed each processor's clock cycle by 10% to accommodate cross-cache protocol chatter ... and this overhead went up non-linearly. Later IBM mainframes were running the cache machine cycle at a much higher rate than the processor machine cycle.

In the late 80s, I was asked to participate in the standardization (started by LLNL) of what quickly became the fibre-channel standard (on which they eventually built the heavy-weight FICON protocol that drastically reduces the native throughput)
http://www.garlic.com/~lynn/submisc.html#ficon

I was also asked to participate in the standardization of scalable coherent interface (started by people at SLAC ... a large VM370 mainframe installation at the time and host of the monthly IBM BAYBUNCH user group meetings). SCI was defined for both I/O operations and multiprocessor shared memory operation. The standard SCI memory consistency defined a 64-port memory bus ... with relaxed memory consistency (compared to IBM mainframe) that allowed for much larger mainframe configurations. Sequent, Data General, Silicon Graphics, and Convex (at least) built multiprocessor products. Sequent & Data General took a standard four-processor i486 board with shared cache and built an interface to SCI ... allowing 64 four-processor boards in a configuration (256-way shared memory multiprocessor).
Convex took a standard two-processor HP/SNAKE (risc) board with shared cache and built an interface to SCI ... allowing 64 two-processor boards in a configuration. As an aside, much later IBM buys Sequent and shuts it down.

Note both FCS and SCI started out with fiber that supported concurrent transfers in both directions. SCI
https://en.wikipedia.org/wiki/Scalable_Coherent_Interface
is part of what evolves into infiniband
https://en.wikipedia.org/wiki/InfiniBand

other trivia ... in the mid-70s I was involved in a project that defined a 16-way shared memory multiprocessor. Lots of people thought it was really fantastic ... and we got some of the 3033 processor engineers to work on it in their spare time (a lot more interesting than mapping 168 logic to 20% faster chips). Then somebody tells the head of POK that it could be decades before the POK favorite son operating system could effectively support 16-way (it was 2000 before 16-way shipped) and we got invited to never visit POK again (and the 3033 processor engineers were instructed to stop being distracted).

-- 
virtualization experience starting Jan1968, online at home since Mar1970