> 
> What is the effect of the conditional branch and the EX on the pipeline? Are 
> the performance tradeoffs the same on all supported processors? Also, tuning 
> code for a current processor may slow it down on a new one.
> 
> 
> --
> Shmuel (Seymour J.) Metz
> http://mason.gmu.edu/~smetz3

 In *very* casual tests that we and some customers ran, we found
 that this general scenario seems to be a good approach for
 moving a constant number of bytes:

     sizes less than 1024:
       generate up to 4 MVCs in a row
   
     sizes greater than or equal to 1024:
       if MVCLE is allowed (there is a compiler option for this)
       then use MVCLE
     
       otherwise:
         generate a loop of MVCs, updating the source/target
         addresses and lengths as needed (you don't need an EX
         for this).  Basically, divide the length by 256 and
         loop that many times, moving 256 bytes per iteration;
         then take the length modulo 256 and move those
         remaining bytes (since the length is constant, the
         division and mod operations yield constants).
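
 To make the arithmetic concrete, here is a rough sketch in C of the
 selection logic above. It is only a model, not what any compiler
 actually emits: the names (move_plan, plan_move, move_by_blocks) are
 made up for illustration, and each memcpy of at most 256 bytes
 stands in for a single MVC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names: a model of the strategy, not real compiler internals. */
enum move_strategy { INLINE_MVCS, USE_MVCLE, MVC_LOOP };

struct move_plan {
    enum move_strategy strategy;
    size_t full_blocks;  /* number of full 256-byte moves */
    size_t remainder;    /* bytes left for a final short move, 0 if none */
};

/* Pick a strategy for a constant length; mvcle_allowed models the
   compiler option mentioned above. */
static struct move_plan plan_move(size_t len, int mvcle_allowed)
{
    struct move_plan p;

    p.full_blocks = len / 256;  /* constant-folded when len is constant */
    p.remainder   = len % 256;

    if (len < 1024)
        p.strategy = INLINE_MVCS;  /* at most 3 full moves + 1 short one */
    else if (mvcle_allowed)
        p.strategy = USE_MVCLE;
    else
        p.strategy = MVC_LOOP;
    return p;
}

/* Emulate the MVC loop: one 256-byte move per iteration, then the tail. */
static void move_by_blocks(unsigned char *dst, const unsigned char *src,
                           size_t len)
{
    size_t count = len / 256;
    size_t rest  = len % 256;

    while (count-- > 0) {
        memcpy(dst, src, 256);  /* stands in for one 256-byte MVC */
        dst += 256;
        src += 256;
    }
    if (rest != 0)
        memcpy(dst, src, rest); /* final MVC with a constant short length */
}
```

 For example, a length of 1000 comes out as 3 full moves plus a
 232-byte move, all inline; a length of 1300 without MVCLE comes out
 as a loop of 5 iterations followed by a 20-byte move.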

 That seems to be a good balance between code size and speed.
 And the loop is small enough that it probably fits in the
 machine's instruction cache, so hopefully the branch back
 (a BCTR back to the MVC) isn't that painful.

 Just some thoughts...

        - Dave R. -

--
[email protected]                        Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com
