On Fri, Dec 11, 2009 at 9:34 PM, Don <[email protected]> wrote:
> Walter Bright wrote:
>>
>> Don wrote:
>>>
>>> Yeah. Actually the CPU problem is an accepts-invalid bug. It worked on my
>>> Pentium M, but it shouldn't have.
>>> The problem is what DMD does to the "uninitialized assignments".
>>>
>>> float x;
>>>
>>> gets changed into
>>>
>>> float x = double.snan;
>>>
>>> and is implemented with
>>> fld float.snan; fstp x;
>>>
>>> The FLD is triggering the snan. They should be changed into mov EAX,
>>> reinterpret_cast<int>(float.snan); mov x, EAX;
>>
>> Sounds like a good idea.
>>
>>> There's another reason for doing this. On Pentium 4, x87 NaNs are
>>> incredibly slow. More than 250 cycles!!! On AMD and on Pentium 4 SSE2, they
>>> are the same as any other value (about 0.5 cycles). Yet another reason to
>>> hate the P4. But still, this is such a horrific performance killer that we
>>> ought to avoid it.
>>
>> I had no idea that was the case!
>
> I only just discovered it. Every documentation I've seen just said "These
> [cycle count] values are for normal operands. NaNs, infinities, and
> denormals may increase cycle counts considerably." I found a blog of someone
> who'd actually measured it.

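For anyone following along, here is a rough C sketch of the "mov EAX, bits; mov x, EAX" idea quoted above: the NaN is written into the float's storage as an integer bit pattern, so the initialization itself needs no floating-point load. The names FLOAT_SNAN_BITS, init_snan, float_bits and is_signalling_nan_bits are made up for illustration, and 0x7FA00000 is just an assumed single-precision signalling NaN (x86 convention, quiet bit clear), not necessarily the value DMD uses for float.snan. Whether the compiler really emits plain integer moves for the memcpy calls is up to the compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed bit pattern for a single-precision signalling NaN on x86:
   exponent all ones, quiet bit (bit 22) clear, nonzero payload.
   Illustrative only; not taken from DMD. */
#define FLOAT_SNAN_BITS UINT32_C(0x7FA00000)

/* Hypothetical helper: initialize *x by copying the raw bit pattern
   into its storage, instead of loading a float constant. */
static void init_snan(float *x)
{
    const uint32_t bits = FLOAT_SNAN_BITS;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(x, &bits, sizeof bits);
}

/* Read the raw bits back without treating the value as a float. */
static uint32_t float_bits(const float *x)
{
    uint32_t bits;
    memcpy(&bits, x, sizeof bits);
    return bits;
}

/* Classify a bit pattern as a signalling NaN under the x86 convention:
   exponent all ones, quiet bit clear, mantissa not zero. */
static int is_signalling_nan_bits(uint32_t bits)
{
    const uint32_t exponent = bits & UINT32_C(0x7F800000);
    const uint32_t mantissa = bits & UINT32_C(0x007FFFFF);
    const uint32_t quiet    = bits & UINT32_C(0x00400000);
    return exponent == UINT32_C(0x7F800000) && mantissa != 0 && quiet == 0;
}
```

Calling init_snan(&x) on a local float x and then checking is_signalling_nan_bits(float_bits(&x)) should confirm the pattern survived, since nothing in that path treats the value as a floating-point operand.
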
I experienced it in a fluid sim I was working on in grad school. NaNs were creeping in and performance was terrible. I thought it was two problems till I got rid of the NaNs and suddenly performance was ok too.

--bb
