The classic example is a rewrite of some I/O routines in MULTICS. The original was in Assembly Language for MULTICS (ALM); the new version was in PL/I and ran faster. Why? Because the new algorithm was more efficient and that trumped any inefficiencies in the code generation.
--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3
עַם יִשְׂרָאֵל חַי נֵ֣צַח יִשְׂרָאֵ֔ל לֹ֥א יְשַׁקֵּ֖ר

________________________________________
From: IBM Mainframe Assembler List <ASSEMBLER-LIST@LISTSERV.UGA.EDU> on behalf of Colin Paice <00001dc19479b371-dmarc-requ...@listserv.uga.edu>
Sent: Saturday, August 23, 2025 11:34 AM
To: ASSEMBLER-LIST@LISTSERV.UGA.EDU <ASSEMBLER-LIST@LISTSERV.UGA.EDU>
Subject: Execute-Type Instructions

Few people should be looking at the instruction level that is being discussed. You are more likely to get performance benefits from higher up the stack.

For example, our code used an IMS bridge exit. Like all good programmers, we allocated storage for our usage, did the work, and then freed the storage. This showed up as a hot spot 1) because of the enqueue on the storage request and 2) because of the (small) cost of the storage requests themselves. We fixed this by passing in a block of storage for the routine to use, so we allocated it once and used it millions of times.

We had a global block with a pointer to the next free trace slot. The code to update this pointer involved compare and swap. With 8 CPUs there was a lot of contention. We gave each TCB its own trace area and this contention just disappeared.

One customer did the same DB2 SQL query in every CICS program, just to check a system-wide flag. It was pointed out that this request was done 10,000 times a second. They changed this to checking a bit in a global block, and deferred the need to upgrade their CPUs.

So yes, look at the instruction level ... but do not forget what you are trying to do. Remember there used to be discussions on *How many angels can dance on the head of a pin?*

Colin
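
Two of the fixes Colin describes (the caller-supplied work area and the per-TCB trace area) can be sketched in portable C. This is only an illustration, not the original exit code: the names process_message, trace_shared and trace_local and the sizes WORK_SIZE and TRACE_SLOTS are made up for the example, C11 atomics stand in for compare and swap, and a C thread-local variable stands in for a per-TCB area.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define WORK_SIZE   256   /* hypothetical work-area size */
#define TRACE_SLOTS 1024  /* hypothetical trace-table size */

/* 1. Caller-supplied work area: the caller obtains the storage once and
 *    passes it in on every call, so the routine itself never gets or
 *    frees storage. */
unsigned process_message(char *work, const char *msg, size_t len)
{
    assert(work != NULL && len <= WORK_SIZE);
    memcpy(work, msg, len);              /* stand-in for the real work */
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += (unsigned char)work[i];
    return sum;
}

/* 2a. Shared trace table: every thread competes for one "next slot"
 *     index, claimed with compare and swap. */
static unsigned shared_trace[TRACE_SLOTS];
static atomic_uint shared_next;

void trace_shared(unsigned value)
{
    unsigned old = atomic_load(&shared_next);
    /* Retry until our compare and swap wins; on failure 'old' is
     * refreshed with the current index. With many CPUs this loop is
     * where the contention shows up. */
    while (!atomic_compare_exchange_weak(&shared_next, &old,
                                         (old + 1) % TRACE_SLOTS))
        ;
    shared_trace[old] = value;
}

/* 2b. Per-thread trace table: each thread owns its table and index,
 *     so no atomic update is needed at all. */
static _Thread_local unsigned local_trace[TRACE_SLOTS];
static _Thread_local unsigned local_next;

void trace_local(unsigned value)
{
    local_trace[local_next] = value;
    local_next = (local_next + 1) % TRACE_SLOTS;
}
```

A caller would declare one work area (for example char work[WORK_SIZE];) outside its loop and pass it to process_message on each iteration. The per-thread variant trades some storage and a merged view of the trace for the removal of the shared update, which is the trade-off described in the message above.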