On Fri, Jan 13, 2012 at 4:55 AM, Gerhard Postpischil <[email protected]> wrote:
> On 1/12/2012 2:23 PM, John Gilmore wrote:
>>
>> No one has ever claimed that the timing differences here are large,
>> significant ones; and the continuing preoccupation here with
>> suboptimizing of this sort is, I think, evidence of a pervasive
>> malaise, a retreat into the familiar that precludes consideration of
>> more, much more, important design issues.
>
>
> In general I tend to agree with this, but I've worked or
> consulted at installations that either had problems completing
> overnight jobs in their assigned batch window, or just
> processing large amounts of data.
>
> While I haven't tried this on very current machines, on older
> ones EX added 40 to 50% to the instruction time (EX overhead on
> some Amdahl machines was greater); 4 MVCs of 256 bytes were
> about the same as a 1K MVCL; and 5 CLI/BE were about the same as
> one TRT/B *+4(R2). In each case it paid to identify the most
> frequently executed code and look for improvements.

That makes sense. It sounds like even if you can afford to MVC the
entire buffer (because you know there is room in the destination and
you're not near the edge of the source) then it might make sense to EX
MVC if you know the actual size and it's less than half on average.

For short EX MVCs the burden of getting stuff in the right registers
makes MVCL less interesting.

My preoccupation with this is mostly on Friday ;-)  And I guess I
should not write real code on Friday 13th anyway...
The EX CLC is in fact in a loop scanning a linked list for the right
entry among 100-200 elements. My big savings came from moving the TRT
etc. out of the loop. I was tempted to also take the decision between
CLC and EX CLC out of the loop, but didn't for ease of maintenance.
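
In C terms the shape of that loop is roughly the sketch below. This is
only an analogue, not the actual code: the entry layout, the names
key_length and find_entry, and the idea that the TRT was finding the
length of a blank-padded key are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list entry; the real control block layout is not shown here. */
struct entry {
    struct entry *next;
    char name[8];               /* blank-padded key, assumed 8 bytes */
};

/* Length of the key up to the first blank. Stands in for the TRT
   (assumption: the TRT was locating the end of the key). */
static size_t key_length(const char *key, size_t max)
{
    const char *blank = memchr(key, ' ', max);
    return blank ? (size_t)(blank - key) : max;
}

/* Scan the list for the first entry whose name starts with the key. */
static struct entry *find_entry(struct entry *head, const char *key, size_t max)
{
    size_t len = key_length(key, max);      /* computed once, outside the loop */

    for (struct entry *e = head; e != NULL; e = e->next) {
        /* Variable-length compare: plays the role of the EX CLC. */
        if (memcmp(e->name, key, len) == 0)
            return e;
    }
    return NULL;
}
```

With 100-200 entries the length scan runs once instead of once per
entry, while the variable-length compare stays inside the loop.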

Rob
