On Wed, May 21, 2008 at 11:21:27AM +0100, Ed Brambley wrote:
> As I understand it (which is not necessarily correct), your code is slightly
> incorrect, since variables are by default shared between parallel sections.
> Therefore, the "int i" is shared between threads, and hence the erratic
> results if both loops execute at the same time. To fix it, you could try
> changing the first #pragma to read
>
>   #pragma omp parallel sections private(i)
>
> Or, alternatively, define i in the for loops as, for example
>
>   for (int i = 0; i < num_steps*10; i++) {

Yep, or just in the block inside of #pragma omp sections or #pragma omp
section.  In fact, it would be good to define factorial there too, or add
private(factorial) - while this one is not necessary for correctness, it
would still improve readability and could be a tiny bit faster.

> So why does this work with icc or gcc 4.4? My guess would be it's because
> OpenMP doesn't guarantee that variables are in sync between threads unless it
> hits a flush directive (either explicit or implicit), and so it would seem
> with icc or gcc 4.4 the variable i is out of sync (probably because it's held
> in a register, which is probably a good idea).

When multiple threads modify the same shared variable you are really in
undefined behavior territory, where the results will depend on what kind of
loop optimizations are performed etc. - if each iteration updates the shared
variable then it is of course much more likely to see "unexpected" results
than if the var is just written at the end of the loop (which is possible,
both because the loops don't call any function and i's address isn't taken).

	Jakub
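
For reference, a minimal sketch of the two fixes discussed above. The
original program is not reproduced in this message, so the function names
(sum_with_private_clause, sum_with_block_locals), the accumulators and the
loop bodies are made up for illustration; only private(i), the block-local
declarations and num_steps come from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Variant 1: i is declared outside the construct, so it would be shared
   by default; the private(i) clause gives every thread its own copy. */
void sum_with_private_clause(int num_steps, long *sum_a, long *sum_b)
{
    int i;
    long a = 0, b = 0;   /* shared, but each is written by one section only */

    assert(num_steps >= 0);

    #pragma omp parallel sections private(i)
    {
        #pragma omp section
        {
            for (i = 0; i < num_steps; i++)
                a += i;
        }
        #pragma omp section
        {
            for (i = 0; i < num_steps * 10; i++)
                b += i;
        }
    }
    *sum_a = a;
    *sum_b = b;
}

/* Variant 2: everything a section needs is declared inside its block (or
   in the for statement itself), so it is private without any clause. */
void sum_with_block_locals(int num_steps, long *sum_a, long *sum_b)
{
    long a = 0, b = 0;

    assert(num_steps >= 0);

    #pragma omp parallel sections
    {
        #pragma omp section
        {
            long acc = 0;                 /* local accumulator */
            for (int i = 0; i < num_steps; i++)
                acc += i;
            a = acc;                      /* single write to the shared var */
        }
        #pragma omp section
        {
            long acc = 0;
            for (int i = 0; i < num_steps * 10; i++)
                acc += i;
            b = acc;
        }
    }
    *sum_a = a;
    *sum_b = b;
}
```

Built with something like gcc -std=c99 -fopenmp, both functions should give
the same sums on every run; without -fopenmp the pragmas are ignored and the
loops simply run one after the other.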