[Bug c++/108737] New: Apparent miscompile of infinite loop on gcc trunk in cddce2 pass

2023-02-08 Thread njs at pobox dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108737

Bug ID: 108737
   Summary: Apparent miscompile of infinite loop on gcc trunk in
cddce2 pass
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: njs at pobox dot com
  Target Milestone: ---

There's a meme going around about strange compilation outputs caused
infinite-loop-related UB, and while discussing it a friend stumbled on this
weird behavior:

https://godbolt.org/z/qfj1jabMW

C++ code is:

extern int foo();

void blah() {
int x = foo();
while (1) {
if(x) foo();
}
}


When compiled with whatever "gcc trunk" means on compiler explorer, and with
-O3 (but not -O2), as C++ (but not C), then gcc reduces this to a single call
to `foo()` followed by an empty infinite loop:

blah():
sub rsp, 8
callfoo()
.L2:
jmp .L2

As far as I can tell, there's no UB here -- 'foo' being extern means it might
do anything, and infinite loops with side-effects are well-defined in C++. So
this seems like a straight-up optimizer bug?

Poking through the pass analysis in CE, it looks like the output of "unrolljam
(tree)" is still correct, but then the output of the next pass "cddce2 (tree)"
has deleted everything. Somehow it concludes that the hoisted `if` can be
eliminated? idgi.

[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2018-12-13 Thread njs at pobox dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

--- Comment #13 from Nathaniel J. Smith  ---
Unfortunately, AFAICT, the omp_pause_resource APIs don't actually solve the
problem.

They're fine and useful if you have a single piece of code that wants to use
both omp and fork(). But, this was never a *huge* issue, because if you knew
you were using omp then you also knew that fork() wasn't going to work, and
could use some workaround.

The really nasty case is when the code using omp and the code using fork() are
in entirely different pieces of code, that don't know about each other (e.g.,
different shared libraries). That's the motivating use case for this patch. I
don't see how the omp_pause_resource APIs help with this. The best you could do
is to set up a pre-fork hook to call omp_pause_resource_all, but that would be
equivalent to my first patch that got rejected for breaking standards-compliant
programs.

(In practice, the effect of this issue has been that the whole scientific
python ecosystem simply avoids omp where-ever possible. That's why no one's
been nagging about this patch. It still seems like a shame to me that all this
work goes into the omp runtime and then it gets ruled out for so many users for
such a trivial thing, but so it goes I guess.)

[Bug target/55307] libgcc's __cpu_indicator_init does not check for avx correctly

2018-05-03 Thread njs at pobox dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55307

Nathaniel J. Smith  changed:

   What|Removed |Added

 CC||njs at pobox dot com

--- Comment #5 from Nathaniel J. Smith  ---
This was fixed in #85100.

[Bug target/85100] __builtin_cpu_supports avx does not verify OS supports it

2018-03-28 Thread njs at pobox dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85100

--- Comment #5 from Nathaniel J. Smith  ---
Julian, are you able to test the patch? I don't have a reproduction setup
currently...

[Bug target/85100] __builtin_cpu_supports avx does not verify OS supports it

2018-03-28 Thread njs at pobox dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85100

--- Comment #3 from Nathaniel J. Smith  ---
We're using it exactly like the docs recommend. What on earth is
__builtin_cpu_supports for, if not to tell you whether you can use a given
feature?

If this is by design, than at the very least the docs need to make clear that a
return value of 1 does not mean you can actually use the feature. Also, it
would be nice if gcc provided some function that *did* answer the question of
what CPU features you can use, since AFAICT this is what every current user of
__builtin_cpu_supports actually wants.

[Bug target/85100] __builtin_cpu_supports avx does not verify OS supports it

2018-03-27 Thread njs at pobox dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85100

Nathaniel J. Smith  changed:

   What|Removed |Added

 CC||njs at pobox dot com

--- Comment #1 from Nathaniel J. Smith  ---
For context here: NumPy currently uses __builtin_cpu_supports("avx") to decide
whether it can use AVX-accelerated numerical kernels. We've been getting
regular bug reports from users where this __builtin_cpu_supports("avx")
returned true, but then NumPy crashes with a SIGILL when it tries to use AVX.
(It seems to be related to some kind of relatively common virtualization
configurations.)

Examples:

https://github.com/numpy/numpy/issues/10787
https://github.com/numpy/numpy/issues/9532
https://github.com/numpy/numpy/issues/10330
https://github.com/numpy/numpy/issues/9534

Now that Julian finally figured it out, I guess we'll work around it by calling
xgetbv ourselves:

https://github.com/numpy/numpy/pull/10814

but it really seems like it would be better if __builtin_cpu_supports would
check this itself.

[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-04-05 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

--- Comment #8 from Nathaniel J. Smith  ---
(In reply to larsmans from comment #7)
> Phase 1? (Not familiar with the GCC dev cycle.)

Sorry, meant "stage 1". GCC trunk is (IIUC) currently in RC-bug-fixes-only
pre-release freeze mode.

http://gcc.gnu.org/develop.html


[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-04-05 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

--- Comment #6 from Nathaniel J. Smith  ---
(In reply to larsmans from comment #4)
> Nathaniel, could you apply the cosmetic changes suggested at
> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00860.html? I'd hate to see
> this patch go to waste.

If you look at that thread then you'll see I did resend the patch with those
fixes -- I've just attached the updated patch to this bug report as well,
thanks for the catch.

My guess is that no-one will pay much attention to this until gcc re-enters
phase 1 in any case.


[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-04-05 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

Nathaniel J. Smith  changed:

   What|Removed |Added

  Attachment #32019|0   |1
is obsolete||

--- Comment #5 from Nathaniel J. Smith  ---
Created attachment 32548
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32548&action=edit
patch to make openmp -> quiesce -> fork -> openmp work (updated)

Updated based on feedback from Richard Henderson


[Bug libgomp/60035] [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-02-12 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

--- Comment #2 from Nathaniel J. Smith  ---
Good point -- sent.

http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00813.html


[Bug libgomp/60035] New: [PATCH] make it possible to use OMP on both sides of a fork (without violating standard)

2014-02-02 Thread njs at pobox dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60035

Bug ID: 60035
   Summary: [PATCH] make it possible to use OMP on both sides of a
fork (without violating standard)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: libgomp
  Assignee: unassigned at gcc dot gnu.org
  Reporter: njs at pobox dot com
CC: jakub at gcc dot gnu.org

Created attachment 32019
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32019&action=edit
patch to make openmp -> quiesce -> fork -> openmp work

This is a re-open of #52303 and #58378, with more arguments, and a proposed
patch that fixes the problem without violating the openmp standard.

Background: Almost all scientific/numerical code delegates linear algebra
operations to some optimized BLAS library. Currently, the main contenders for
this library are:
1) ATLAS: free software, but uses extensive build-time configuration, which
means it must be re-compiled from source by every user to achieve competitive
performance.
2) MKL: proprietary, but technically excellent.
3) OpenBLAS: free software, but uses OpenMP for threading, which means that any
program which does linear algebra and also expects fork() to work is screwed
[1], at least when using GCC.

This means that for projects like numpy, which are used in a very large range
of downstream products, we are pretty much screwed too. Many of our users use
fork(), for various good reasons that I can elaborate if desired, so we can't
just recommend OpenBLAS in general -- ATLAS or MKL are superior for . But
recompiling ATLAS is difficult, so we can't recommend that as a general
solution, or provide it in pre-compiled downloads. So what we end up doing is
shipping slow, unoptimized BLAS, while all the major "scientific python"
distros ship MKL; and we also get constantly pressured by users to either ship
binaries with MKL or with OpenBLAS built with icc; and we field a new bug
report every week or two from people who use OpenBLAS without realizing it and
are experiencing mysterious hangs. (Or sometimes other projects get caught in
the crossfire, e.g. [2] which is someone trying to figure out why their web-app
can't generate plot graphics when using the celery job queue manager.)
Meanwhile people are waiting with bated breath for clang to get an openmp
implementation so that they can shift their whole stack over there, solely
because of this one bug.

Basically the current situation is causing ongoing pain for a wide range of
people and makes free software uncompetitive with proprietary software for
scientific code using Python in general. But it doesn't have to be this way! In
actual practice on real implementations -- regardless of what POSIX says --
it's perfectly safe to use arbitrary POSIX APIs after fork, so long as all
threads are in a known, quiescent state when the fork occurs.

The attached patch has essentially no impact on compliant OpenMP-using
programs; in particular, and unlike the patch in #58378, it has no affect on
the behavior of the parent process, and in the child process it does nothing
that violates POSIX unless the user has violated POSIX first. But it makes it
safe in practice to use OpenMP encapsulated within a serial library API,
without mysterious breakage depending on far away parts of the program, and in
particular should fix the OpenBLAS issue.

Test case included in patch is by Olivier Grisel, from #58378. Patch is against
current gcc svn trunk (r206297).

[1] https://github.com/xianyi/OpenBLAS/issues/294
[2] https://github.com/celery/celery/issues/1842