> Have you reported the bug to Microsoft?

Not yet, but I will after putting together a reasonable translation unit.

After my last mail I realized that setting the cl.exe affinity to one core is no guarantee that it runs a single thread, and indeed it did not. So I rebooted after setting the BIOS to 1 core with hyperthreading disabled. Task Manager and the environment variables show that my system has one core and not 4. But cl.exe still launches 4 worker threads on the single core for a single file, and again fails in 10% of cases.

Anyway, the point is that it has all the signs of being a multi-threading bug, seemingly without any way of turning off threading in the compiler. As far as I could see, cl.exe in the previous version, VS 2010, does not use multiple worker threads, and so older versions work fine.

> > The error happened in roughly 10% of 100 consecutive
> > compilations of the file. Happens with -O1 and -O2.
> > But no error when optimization is off. Although I
> > have not gone so far as having a minimal workaround, two
> > possible solutions are to have optimization off for select
> > functions in a few files using #pragma or just turn off
> > optimizations for three directories where such errors
> > happened -
> >
> > src/mat/impls/sbaij/seq/sbstream
> > src/mat/impls/sbaij/seq
> > src/mat/impls/baij/seq
>
> These directories have some of the kernels that stand the most chance of
> benefiting from optimization. If you break the files up, do the errors
> go away?

I agree with the need for optimization, especially for people using complex scalars. But I have not tried splitting the files. Even if the errors do go away, they might only be reduced statistically, which does not lead to a reliable solution. For Microsoft, it might be easier to fix the compiler bug using the existing file with more functions, which fails 10% of the time, than a file with fewer functions that fails very rarely and makes the bug(s) harder to reproduce.

Chetan
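
P.S. For reference, the per-function #pragma option mentioned above could look roughly like the sketch below. It assumes MSVC's `#pragma optimize`, guarded so other compilers ignore it. `dot_kernel` is a hypothetical stand-in, not one of the actual functions in the affected directories, and I have not verified that this avoids the failure.

```c
#include <stddef.h>

/* Hypothetical stand-in for one of the affected kernels; not PETSc code. */
#if defined(_MSC_VER)
#pragma optimize("", off)   /* MSVC: disable optimizations for the functions defined below */
#endif
double dot_kernel(const double *x, const double *y, size_t n)
{
  double sum = 0.0;
  size_t i;
  for (i = 0; i < n; i++) sum += x[i] * y[i];
  return sum;
}
#if defined(_MSC_VER)
#pragma optimize("", on)    /* MSVC: restore the optimization settings from the command line */
#endif
```

The pragma has to sit outside the function body, and only the functions between the off/on pair lose optimization, so the rest of the file still builds with -O1 or -O2.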
