Fred, I tested it - and it does work
I need 27 instructions (maybe more, because of the two JO after the TROT; see the end of this post). But I am sure this code is faster than what was used before. The simple ratio, 256 input bytes processed by 27 instructions to create 192 output bytes, should speed up the whole process by some factor (even if the TROT slows things down). I have the complete code (with translate tables) here and will send it to whoever wants it.

         MVC   INTER,TR1          prepare for rearrange to 14 23
         TR    INTER,INPUT
         MVC   INTER2,TR2         prepare for spread
         TR    INTER2,INTER-1     spread to 1.4
         MVC   RESULT,INTER2
         NC    RESULT,TR31+2      leave only 1..
         NC    INTER2,TR31        leave only ..4
         TR    INTER2,TR32        make it low order bits
         OC    RESULT,INTER2
* now we have first and last byte of a triplet ready
* now we take 64 bytes and make them 128
         LA    R1,TR4
         LA    R14,INTER
         LA    R15,64             sending length
         LA    R2,INTER+128
*        TROT  R14,R2,1           no stopping
         DC    X'B992',X'10',AL.4(R14,R2)  HELP HASM
         JO    *-4
* rearrange 128 to 192
*
         MVC   INTER2,TR6
         TR    INTER2,INTER-1
*
         OC    RESULT,INTER2
*
* now we take these 64 bytes and make them 128
         LA    R1,TR5
         LA    R14,INTER
         LA    R15,64             sending length
         LA    R2,INTER+192
*        TROT  R14,R2,1           hardware knows it but HASM doesn't
         DC    X'B992',X'10',AL.4(R14,R2)  HELP HASM
         JO    *-4
* rearrange 128 to 192
*
         MVC   INTER2,TR6-1
         TR    INTER2,INTER-1
*
         OC    RESULT,INTER2

-- 
Martin

Pi_cap_CPU - all you ever need around MWLC/SCRT/CMT in z/VSE
more at http://www.picapcpu.de
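
For readers who do not use TR every day, here is a rough Python model of the two tricks the listing relies on. It is only a sketch under assumptions: the storage layout, the table contents and the helper names (`mvc`, `tr`, `trot_no_stop`) are all made up for illustration, since the real translate tables are not part of this post. The first part shows how an MVC of an index table followed by TR with the data as the "table" rearranges bytes; the second shows how TROT with the test-character stop bypassed turns n bytes into 2n bytes.

```python
def mvc(storage: bytearray, dest: int, src: int, length: int) -> None:
    """MVC for non-overlapping fields: copy length bytes from src to dest."""
    storage[dest:dest + length] = storage[src:src + length]


def tr(storage: bytearray, dest: int, length: int, table: int) -> None:
    """TR: replace each byte b of the first operand with storage[table + b]."""
    for i in range(dest, dest + length):
        storage[i] = storage[table + storage[i]]


def trot_no_stop(storage: bytearray, dest: int, src: int,
                 length: int, table: int) -> None:
    """TROT without the test-character stop: each source byte b selects
    the two-byte entry at table + 2*b, so length bytes become 2*length."""
    for i in range(length):
        b = storage[src + i]
        entry = storage[table + 2 * b:table + 2 * b + 2]
        storage[dest + 2 * i:dest + 2 * i + 2] = entry


# Hypothetical storage layout and table contents, for illustration only.
INPUT, INTER, TR1, WIDE, TABLE2 = 0, 16, 32, 48, 64
storage = bytearray(64 + 512)
storage[INPUT:INPUT + 8] = b"ABCDEFGH"
# Assumed index table: reorder each group of four bytes 1 2 3 4 into 1 4 2 3.
storage[TR1:TR1 + 8] = bytes([0, 3, 1, 2, 4, 7, 5, 6])

mvc(storage, INTER, TR1, 8)    # MVC   INTER,TR1
tr(storage, INTER, 8, INPUT)   # TR    INTER,INPUT
gathered = bytes(storage[INTER:INTER + 8])

# Assumed one-to-two table: byte b maps to (high nibble, low nibble).
for b in range(256):
    storage[TABLE2 + 2 * b] = b >> 4
    storage[TABLE2 + 2 * b + 1] = b & 0x0F
trot_no_stop(storage, WIDE, INTER, 4, TABLE2)
widened = bytes(storage[WIDE:WIDE + 8])
```

In this model `gathered` holds the input bytes in the order given by the index table, which is the effect of the MVC/TR pairs in the listing: the field is preloaded with positions and TR fetches the data byte at each position. `widened` is twice as long as its four-byte source, which is the "64 bytes to 128" step done by TROT.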
