On 14 June 2018 at 15:40, Charles Mills <[email protected]> wrote:
> Not the answer to your question but I don't think "TRT performs badly."
>
> It is just that people sometimes assume that because it is a single
> instruction in the Pop it must execute roughly as fast as many simple
> instructions.
>
> I think it is fast for what it does. Picture "writing TRT in assembler" --
> implementing it as a subroutine as if the opcode magically vanished.
>
> Now picture a version of that only somewhat faster because millicode has
> some special tricks up its sleeve -- that's TRT.

Yup. A byte-at-a-time subroutine.

Ignoring for the moment what Dan says about offloading TRT to special hardware, there is still a small but fundamental problem with TRT and TR, as architected in the early 1960s. That is that access failures *for the table* (operand 2) are dependent on the actual data in operand 1. If your table straddles a page boundary, it is entirely ordinary for your access to those two pages to be different, whether due to key protection, page protection, or just that one is paged in and the other isn't.

TRT and TR are defined so as to program check only if the data requires it, so if no data byte causes a given function byte to be referenced, there will be no program checks related to the address of that function byte.

A similar problem arises with the data (operand 1) of TRT: program checks are recognized only if strictly necessary. If the instruction ends early because of the value of the function byte addressed by a data byte, then there's no program check even if unexamined parts of operand 1 are inaccessible. This is a more common situation not unique to TRT (presumably SRST and friends would also have it), but still it must surely be checked for, essentially one byte at a time.

The newer variants TRTE and the many TR variants like TROO, TROT, etc. are defined so as to allow the program check even if the data doesn't require looking at one or the other page.

Well obviously don't put your table across a page boundary, and presumably the hardware/millicode will detect this and be able to run faster. But is this a reasonable requirement for an application program to have to pay attention to? It's very easy to have your table in a good place, insert some data above it that pushes it over a boundary, and suddenly things slow down. Maybe on modern machines all this state can be cached, but if the data is different each time as it would be in a list of words, then I can't see a general way of speeding up the checking.

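To make the data dependence concrete, here is a rough Python model of the behaviour I'm describing. It is only a sketch, not how the hardware or millicode does it: trt_model, AccessException, the 4096-byte PAGE_SIZE and the inaccessible_pages set are all made up for illustration, and the sample data is ASCII rather than EBCDIC just for convenience.

```python
PAGE_SIZE = 4096  # assumed page size for this model


class AccessException(Exception):
    """Stand-in for an access-related program check (model only)."""


def trt_model(operand1, table, table_addr, inaccessible_pages=frozenset()):
    """Scan operand1 one byte at a time, in the style of TRT.

    Returns (index, function_byte) for the first argument byte whose
    function byte is nonzero, or None if every function byte is zero.
    Raises AccessException only if a function byte that the data actually
    selects lies on a page listed in inaccessible_pages.
    """
    for index, argument in enumerate(operand1):
        function_addr = table_addr + argument
        if (function_addr // PAGE_SIZE) in inaccessible_pages:
            raise AccessException(f"function byte at {function_addr:#x}")
        function_byte = table[argument]
        if function_byte != 0:
            # Early exit: the remaining argument bytes are never examined.
            return index, function_byte
    return None


# A 256-byte table that stops the scan on a blank (0x20 in ASCII).
table = bytearray(256)
table[0x20] = 1

# Place the table so entries 0x00-0x7F sit on page 0 and 0x80-0xFF on page 1,
# then pretend page 1 is inaccessible.
table_addr = PAGE_SIZE - 128
bad_pages = frozenset({1})

# Every argument byte is below 0x80, so page 1 is never touched.
print(trt_model(b"hello world", table, table_addr, bad_pages))

# The second byte, 0xC1, selects a function byte on page 1.
try:
    trt_model(bytes([0x68, 0xC1, 0x20]), table, table_addr, bad_pages)
except AccessException as exc:
    print("program check:", exc)
```

Same table, same address, same page state: whether you get the exception depends entirely on the bytes in operand 1, which is why I don't see how the check can be hoisted out of the loop in general.
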
Maybe Dan knows the answer...

Tony H.
