http://d.puremagic.com/issues/show_bug.cgi?id=6660



--- Comment #5 from Don <clugd...@yahoo.com.au> 2011-09-27 03:57:59 PDT ---
This is really incredible. I've removed all of the D code, and I can still
reproduce the behaviour. If you uncomment the jz line, it won't happen.
The 'int 3' line is just a breakpoint, to prove that the branch is never taken.

void main()
{ 
    int ctr; // also works with __gshared int ctr;
    asm {
        mov EAX, 2;
        cpuid;
        and EAX, 0xFF;
        mov ctr, EAX;
//        jz was_zero;
Lxx:
        dec int ptr ctr;
        jnz Lxx;
        jmp done;
was_zero: 
        int 3;
done:   ;        
    }
}

Wild speculation: there's a bug in CPUID 2: it's not clearing the loopback
buffer. The loop is executed as if 'ctr' were still zero. This means that it
loops 2^^32 times. This is long enough that Windows does a task switch.
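
To illustrate where that iteration count comes from, here is a rough sketch
(not part of the test case) that models the dec/jnz loop with an 8-bit counter
so it finishes instantly. The helper name 'iterations' is made up for this
example. Starting from zero, the counter wraps around and the loop runs
2^^8 times; the same reasoning with a 32-bit counter gives 2^^32.

import std.stdio;

// Models "Lxx: dec ctr; jnz Lxx;" with an 8-bit counter (an assumption made
// only to keep the run short) and returns how many times the body executed.
uint iterations(ubyte start)
{
    ubyte ctr = start;
    uint n = 0;
    do
    {
        --ctr;  // wraps from 0 to 255, just as dec does on a zero value
        ++n;
    } while (ctr != 0);
    return n;
}

void main()
{
    writeln(iterations(3));  // prints 3
    writeln(iterations(0));  // prints 256, i.e. 2^^8
    writeln(1UL << 32);      // prints 4294967296, the 32-bit equivalent
}
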
In Core 2, the loopback buffer was between the predecoders and the decoders, but
on Core i7, they moved it after the decoders.
I tried to confirm this by extending the size of the loop, by padding it with
nops. When the loop is 63 bytes of code (56 nops), it fails. Once I add a 57th
nop, it stops failing.
These aren't the numbers I expected: the loopback buffer is 256 bytes on the
Core i7. However, I have a Core i3; perhaps it's different, or it may be a
decoding bug. Regardless, this looks very much like a CPU erratum.
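
For anyone who wants to vary the padding without editing the asm by hand, here
is a minimal sketch of one way to generate it with a string mixin. The names
'nopPadding' and 'paddedLoop' are invented for this example, and the counter is
a fixed stand-in value rather than the CPUID result, so this only shows the
padding mechanism and is not expected to reproduce the hang by itself.

version (D_InlineAsm_X86)    version = HaveAsm;
version (D_InlineAsm_X86_64) version = HaveAsm;

// Hypothetical helper: returns n "nop; " statements as inline-asm source text.
string nopPadding(int n)
{
    string s;
    foreach (i; 0 .. n)
        s ~= "nop; ";
    return s;
}

// Runs a dec/jnz loop whose body is padded with 'pad' nops and returns the
// final counter value (0 once the loop has run to completion).
int paddedLoop(int pad)()
{
    int ctr = 200;  // stand-in start value, not taken from CPUID
    version (HaveAsm)
    {
        mixin("asm { Lxx: " ~ nopPadding(pad)
            ~ "dec int ptr ctr; jnz Lxx; }");
    }
    else
    {
        ctr = 0;    // no inline asm available: nothing to demonstrate
    }
    return ctr;
}

void main()
{
    assert(paddedLoop!56() == 0);  // 56 nops, the failing size reported above
    assert(paddedLoop!57() == 0);  // 57 nops, the size that stopped failing
}
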


My guess is that it is affecting the loop predictor, which isn't the same thing
as the branch predictor.

