Art Celestini wrote:
It seems that the TRE instruction has been in z/Arch for at least a few
years. If anyone is inclined to try this, it would be interesting to see
how it fares against Ed Jaffe's code:
         XR    R1,R1              Clear for insert
         L     R15,Length         Load string length
Loop     IC    R1,Input-1(R15)    Get input byte
         IC    R0,XlatTab(R1)     Get translated character ...
         STC   R0,Output-1(R15)   ... and store it in output
         BCT   R15,Loop           Decrement length & loop until done
I believe the OP said that the data to be translated had to first be moved
from one buffer to another. The above does that, but a move of some type
needs to be added to Ed's code to make it a true comparison.
Some years ago, on our z800 processor, we measured the performance of
(in-place) TR against a software-coded loop. We found that the loop was
faster than TR for strings shorter than nine bytes. When
we spoke to IBM about this, we learned that TR had been partially moved
into millicode for the z900/z800. It ran slower for short strings
because of the millicode start/stop (aka "subroutine linkage") costs.
For strings longer than nine bytes, TR was faster because it had access
to a hardware facility that could translate two bytes per cycle. The
code fragments we compared were:
|CASE1    DC    0H
|         LA    R2,9
|         LA    R3,DATA
|         XR    R4,R4
|CASE1L1  DS    0H
|         IC    R4,0(,R3)
|         IC    R4,EBCDIC(R4)
|         STC   R4,0(,R3)
|         AHI   R3,1
|         JCT   R2,CASE1L1
|CASE1L   EQU   *-CASE1
|CASE2    DC    0H
|         TR    DATA(9),EBCDIC
|CASE2L   EQU   *-CASE2
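As Art notes, a two-buffer comparison needs a move ahead of the TR. A
minimal sketch of what that might look like, assuming hypothetical
non-overlapping 9-byte fields INPUT and OUTPUT (these names are for
illustration only and this variant was not part of what we measured):
|         MVC   OUTPUT(9),INPUT      Copy the 9 bytes to the target
|         TR    OUTPUT(9),EBCDIC     Translate the copy in place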
We later "unrolled" the loop, interleaving the use of three different
registers, and found it was now faster than TR for strings of 24 bytes
or fewer!
|Stride   EQU   3
|CASE1    DC    0H
|         LA    R0,9/Stride
|         LA    R3,DATA
|         XR    R4,R4
|         XR    R5,R5
|         XR    R6,R6
|CASE1L1  DS    0H
|         IC    R4,0(,R3)
|         IC    R5,1(,R3)
|         IC    R6,2(,R3)
|         IC    R4,EBCDIC(R4)
|         IC    R5,EBCDIC(R5)
|         IC    R6,EBCDIC(R6)
|         STC   R4,0(,R3)
|         STC   R5,1(,R3)
|         STC   R6,2(,R3)
|         AHI   R3,Stride
|         JCT   R0,CASE1L1
|CASE1L   EQU   *-CASE1
The results of the above experiments suggest that your loop has an
excellent chance of being faster than *any* sequence involving TR or
TRE, for strings shorter than some number of bytes 'n', on any given
hardware generation supporting z/Architecture.
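One way to act on that is to pick the path by length at run time. The
following is only a sketch, not something we measured. It assumes an
in-place translate of DATA with a length of 1 to 256 in a fullword
named Length, and the labels CrossN, ShortStr, ShortLp, Done and TRExec
are made up for illustration. The CrossN value of 8 is a placeholder
taken from the z800 figure for the simple loop; the real crossover has
to be measured on the machine in question.
|CrossN   EQU   8                    Assumed crossover, machine dependent
|         L     R15,Length           Load string length (1 to 256)
|         CHI   R15,CrossN           Short enough for the coded loop?
|         JNH   ShortStr             Yes, skip the TR path
|         BCTR  R15,0                Machine length is length minus 1
|         EX    R15,TRExec           Execute TR with the actual length
|         J     Done
|ShortStr DS    0H
|         XR    R1,R1                Clear for insert
|ShortLp  IC    R1,DATA-1(R15)       Get next byte, back to front
|         IC    R1,EBCDIC(R1)        Get translated character ...
|         STC   R1,DATA-1(R15)       ... and store it back in place
|         JCT   R15,ShortLp          Decrement length & loop until done
|Done     DS    0H
|*        TRExec is only ever executed; keep it out of the mainline
|TRExec   TR    DATA(0),EBCDIC       Length supplied by EX
The compare and branch add a few cycles of their own to every call, so
the useful value of CrossN may come out lower than the raw loop-vs-TR
crossover would suggest.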
--
Edward E Jaffe
Phoenix Software International, Inc
5200 W Century Blvd, Suite 800
Los Angeles, CA 90045
310-338-0400 x318
[EMAIL PROTECTED]
http://www.phoenixsoftware.com/