>
> What is the effect of the conditional branch and the EX on the pipeline? Are
> the performance tradeoffs the same on all supported processors? Also, tuning
> code for a current processor may slow it down on a new one.
>
>
> --
> Shmuel (Seymour J.) Metz
> http://mason.gmu.edu/~smetz3
In *very* casual tests that we and some customers ran, we found
that this general scenario seems to be a good approach for
moving bytes with a constant length:
  sizes less than 1024:
     generate up to 4 MVCs in a row

  sizes greater than or equal to 1024:
     if MVCLE is allowed (there is a compiler option for this)
       then use MVCLE
     otherwise:
       generate a loop of MVCs updating the src/target
       address and lengths as needed (you don't need an EX
       for this.)  Basically divide the length by 256
       and loop moving 256 bytes at a time by that count;
       then get the modulus of the length by 256 and
       move those remaining bytes (since the length is constant,
       the division and mod operations provide constants.)
That seems to be a good balance between code size and speed.
And the loop is small enough that it probably fits in the
machine's instruction cache, so hopefully the branch back
(a BCTR back to the MVC) isn't that painful.
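
To make the selection above concrete, here is a rough C sketch of the
decision and of the divide/modulus arithmetic. It is only a model, not
the compiler's actual code generator: the names (move_plan, plan_move,
move_by_chunks, mvcle_allowed) are made up for illustration, and each
memcpy() simply stands in for one MVC of up to 256 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names; a model of the strategy, not real compiler code. */
enum move_kind { MOVE_INLINE_MVC, MOVE_MVCLE, MOVE_MVC_LOOP };

struct move_plan {
    enum move_kind kind;
    size_t full_chunks;  /* number of full 256-byte moves */
    size_t remainder;    /* bytes in the final, shorter move (0 if none) */
};

/* Pick a strategy for a move whose length is known at compile time. */
static struct move_plan plan_move(size_t len, int mvcle_allowed)
{
    struct move_plan p;

    /* The length is constant, so these fold to constants. */
    p.full_chunks = len / 256;
    p.remainder   = len % 256;

    if (len < 1024)
        p.kind = MOVE_INLINE_MVC;   /* at most 4 MVCs in a row */
    else if (mvcle_allowed)
        p.kind = MOVE_MVCLE;
    else
        p.kind = MOVE_MVC_LOOP;
    return p;
}

/* Model of the MVC loop: count down, advancing both addresses by 256,
 * then move whatever is left over with one shorter move. */
static void move_by_chunks(unsigned char *dst, const unsigned char *src,
                           struct move_plan p)
{
    size_t count;

    for (count = p.full_chunks; count > 0; count--) {
        memcpy(dst, src, 256);      /* stands in for one 256-byte MVC */
        dst += 256;
        src += 256;
    }
    if (p.remainder > 0)
        memcpy(dst, src, p.remainder);  /* final short MVC */
}
```

For a 1300-byte move without MVCLE, for example, plan_move() gives a
loop count of 5 and a remainder of 20.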
Just some thoughts...
- Dave R. -
--
[email protected] Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com