One minor thing to remember is that if you are dealing with lengths that are not a multiple of 256 bytes, the MVC loop method has to drop through to an executed move for the remaining bytes. This means that you need addressability to the target of the execute instruction and, depending on the environment where your code is executing, that might be a non-trivial or undesirable exercise.
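To make the arithmetic behind that remainder step concrete, here is a minimal sketch in C rather than assembler. It only models the technique: `mvc_model` and `mvc_loop_copy` are hypothetical names invented for illustration, `memcpy` stands in for the MVC instruction, and the operands are assumed not to overlap. The point it shows is that the final executed move needs a length code of "remaining bytes minus one", and is skipped entirely when the length is an exact multiple of 256.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Models a single MVC: a length code L (0..255) moves L + 1 bytes.
   Assumption: operands do not overlap, so memcpy is a fair stand-in. */
static void mvc_model(unsigned char *dst, const unsigned char *src,
                      unsigned length_code)
{
    assert(length_code <= 255u);
    memcpy(dst, src, (size_t)length_code + 1u);
}

/* Hypothetical helper: copies len bytes the way an MVC loop would and
   returns how many MVC-equivalent moves were issued. */
static size_t mvc_loop_copy(unsigned char *dst, const unsigned char *src,
                            size_t len)
{
    size_t moves  = 0;
    size_t blocks = len >> 8;     /* number of full 256-byte blocks */
    size_t rest   = len & 0xFFu;  /* 0..255 bytes left over */

    while (blocks-- > 0) {
        mvc_model(dst, src, 255u);  /* the fixed 256-byte MVC in the loop */
        dst += 256;
        src += 256;
        moves++;
    }

    if (rest != 0) {
        /* The "executed move": length code = remaining bytes - 1.
           In assembler this is where EX (or a final MVCL) comes in. */
        mvc_model(dst, src, (unsigned)(rest - 1u));
        moves++;
    }
    return moves;
}
```

For a 600-byte move this issues two full 256-byte moves plus one 88-byte remainder move. The 65536-byte case in the benchmark quoted below is exactly 256 full blocks, so it never takes the remainder path.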
Dropping through to an actual "MVCL" to move those last remaining bytes might be a way to avoid the "EX Rx,MOVE_LAST_BIT" :-)

Rob Scott
Lead Developer
Rocket Software
275 Grove Street * Newton, MA 02466-2272 * USA
Tel: +1.617.614.2305
Email: rsc...@rs.com
Web: www.rocketsoftware.com

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Charles Mills
Sent: 31 May 2011 23:24
To: IBM-MAIN@bama.ua.edu
Subject: Re: What is the current feeling for MVC loop vs. MVCL?

Wow! Man do I stand corrected! Sounds like an MVC loop is well worth the bother in anything where performance *truly* matters.

Charles

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@bama.ua.edu] On Behalf Of Andy Coburn
Sent: Tuesday, May 31, 2011 3:10 PM
To: IBM-MAIN@bama.ua.edu
Subject: Re: What is the current feeling for MVC loop vs. MVCL?

The following is a snip from a program which I just ran on our Z10, whose characteristics are shown last below. The program first moved 65536 bytes from one location to another using MVCL and did this 1 million times. You'll see it took ~56 seconds. Then the same number of bytes were moved using 1 million MVC loops. This took ~8 seconds. Finally, the same number of bytes were moved 1 million times using MVCLE.

I have run this program on every CEC that I have had available to me and the results are always relatively the same, although the actual numbers change.

MVCL has some extraordinarily useful functions: truncation, padding, and returning addresses and lengths after the MVCL. Each of these would have to be done manually if an MVC loop were used, and those instructions would have to be added into the total time for the MVC loops.

And, yes, I wrote an MVCL macro. But once one tries to support the case where bits 8 through 31 of R1+1 and R2+1 are not equal, the macro gets very big and very awkward. And the macro would have to return the 4 registers with contents the same as MVCL would.
1 MILLION MVCL INSTRUCTIONS IN SECONDS     55.961093
1 MILLION MVC LOOPS IN SECONDS              8.389260
1 MILLION MVCLE INSTRUCTIONS IN SECONDS   115.548643

OPERATING SYSTEM= HBB7770SP7.1.2 HBB7770
CPU ID FROM STIDP: TYPE=2098 VERSION=00 SERIAL=00XXXX
EXECUTION ENVIRONMENT: LPAR=YES VM=NO
CPU ID FROM STSI: MANUF=IBM TYPE=2098 MODEL=R03 SERIAL=00000000000XXXXX
VERSION CODE FROM CONVERSION TABLE=
ROLLING 4-HOUR AVERAGE UTILIZATION IS 23 MSUS
LPAR XXXX IS UNCAPPED IN A 89 MSU CEC
ADJ FACTOR 1946 ACCUM WEIGHT 6385225 TIMES ACCUM 23219
LPAR_NAME=XXXX LPAR_ID=0003 LPAR_SIZE=89 CEC_SIZE=89 ARCH=Z/ARCHITECTURE

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@bama.ua.edu with the message: GET IBM-MAIN INFO
Search the archives at http://bama.ua.edu/archives/ibm-main.html