[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #20 from edwintorok at gmail dot com 2010-01-27 12:35 ---
Thanks to Jakub for the hints.

This is not a bug in libstdc++/gcc. The problem is that fork() is called when the process already has threads (spawned by OpenMP through libstdc++ parallel mode), and after such a fork() only a limited set of functions may be called before exec().

std::find is called both before and after fork(). This is fine in a default build, but in a parallel mode build the first std::find spawns threads, which ClamAV doesn't expect.

I will just make it a #error if ClamAV is built in libstdc++ parallel mode.

--
edwintorok at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |INVALID

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
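A minimal sketch of the two points in this comment, not ClamAV's actual code. It assumes the standard libstdc++ macro _GLIBCXX_PARALLEL (which enables parallel mode) is what the #error tests for. The helper names contains and fork_and_collect are hypothetical. The child branch calls only _exit(), one of the async-signal-safe functions that remain usable between fork() and exec().

```cpp
#include <algorithm>
#include <vector>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Refuse to build when libstdc++ parallel mode is enabled, because a
// parallel std::find may spawn OpenMP threads before the later fork().
#if defined(_GLIBCXX_PARALLEL)
#error "This program calls fork() and must not be built in libstdc++ parallel mode"
#endif

// Hypothetical helper: in a default (serial) build this std::find
// never creates threads.
bool contains(const std::vector<int>& values, int needle) {
    return std::find(values.begin(), values.end(), needle) != values.end();
}

// Hypothetical helper: fork a child that does nothing but _exit(code),
// then return the exit status seen by the parent (-1 on any failure).
int fork_and_collect(int code) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;              // fork failed
    }
    if (pid == 0) {
        _exit(code);            // child: async-signal-safe call only
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;              // wait failed
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling contains() before fork_and_collect() mirrors the "std::find before fork()" ordering described above. With the guard in place, a build with -D_GLIBCXX_PARALLEL stops at compile time instead of hanging at run time.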
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #16 from singler at kit dot edu 2010-01-15 14:29 ---
First, let's remove the superfluous #pragma omp single in two occurrences, to make things simpler (see attached patch for trunk).

The problem still persists, the program deadlocks. When dropping in some prints (see attached patch), the log ends like this:

find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished
thread 1 of 2 starts
thread 1 finished
successful join
find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished

Analysis: Thread 1 never starts (or at least does not reach the first printf). In general, for more threads, only thread 0 starts. This obviously leads to the deadlock.

So at first sight, I would blame it on the OpenMP implementation. Maybe there is still some interference with pthreads. Any other explanations?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
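The kind of instrumentation described here can be sketched as follows. This is an illustrative stand-in, not the attached patch: the function name run_instrumented_region is hypothetical and the messages only imitate the log above. Without -fopenmp the pragma is ignored and the region runs once on a single thread.

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: print a line when each thread of a parallel region
// starts and finishes, and return how many threads actually started.
int run_instrumented_region(int requested) {
    int started = 0;
    std::printf("going parallel, requesting %d threads\n", requested);

    #pragma omp parallel num_threads(requested) reduction(+:started)
    {
#ifdef _OPENMP
        int id = omp_get_thread_num();
        int total = omp_get_num_threads();
#else
        int id = 0;      // serial fallback when OpenMP is not enabled
        int total = 1;
#endif
        std::printf("thread %d of %d starts\n", id, total);
        started += 1;    // each thread counts itself; summed by the reduction
        std::printf("thread %d finished\n", id);
    }

    // Reached only after every thread has passed the implicit barrier
    // at the end of the parallel region.
    std::printf("successful join\n");
    return started;
}
```

If one of the requested threads never enters the region, the implicit barrier is never released and "successful join" is never printed, which matches the truncated log.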
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #17 from singler at kit dot edu 2010-01-15 14:30 ---
Created an attachment (id=19616)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19616&action=view)
Removes superfluous pragma omp single twice

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #18 from singler at kit dot edu 2010-01-15 14:30 ---
Created an attachment (id=19617)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19617&action=view)
Add printf debug statements.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #19 from paolo dot carlini at oracle dot com 2010-01-15 14:35 ---
Let's add Jakub in CC, he knows the implementation very well. If needed, please also keep in touch privately.

--
paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at redhat dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #14 from singler at kit dot edu 2010-01-13 13:53 ---
(In reply to comment #13)
> This code is compiled with -fno-exceptions, could that be a problem?

No, that should rather help.

Still, it is very difficult to debug this. Is there at least a way to access clamd's stdout and/or stderr?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #15 from edwintorok at gmail dot com 2010-01-13 20:39 ---
(In reply to comment #14)
> (In reply to comment #13)
> > This code is compiled with -fno-exceptions, could that be a problem?
>
> No, that should rather help.
>
> Still, it is very difficult to debug this. Is there at least a way to access
> clamd's stdout and/or stderr?

The usual way to debug clamd is by setting 'Foreground yes' in clamd.conf, however the bug doesn't reproduce then.

You can however still get stderr/stdout by applying the patch below, and starting clamd like this:
$ clamd/clamd -c etc/clamd.conf >stdout.log 2>stderr.log

or even without redirection:
$ clamd/clamd -c etc/clamd.conf

diff --git a/shared/misc.c b/shared/misc.c
index 080d4ec..656dda5 100644
--- a/shared/misc.c
+++ b/shared/misc.c
@@ -247,7 +247,7 @@ int daemonize(void)
     int fds[3], i;
     pid_t pid;
 
-
+#if 0
     fds[0] = open("/dev/null", O_RDONLY);
     fds[1] = open("/dev/null", O_WRONLY);
     fds[2] = open("/dev/null", O_WRONLY);
@@ -272,7 +272,7 @@ int daemonize(void)
     for(i = 0; i <= 2; i++)
         if(fds[i] > 2)
             close(fds[i]);
-
+#endif
     pid = fork();
 
     if(pid == -1)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #5 from paolo dot carlini at oracle dot com 2010-01-12 11:54 ---
Thanks. If you could do your best to figure out something small and self-contained it would be great, otherwise we would have nothing to add to the testsuite anyway.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #6 from singler at kit dot edu 2010-01-12 12:36 ---
Can I get this thing to run without actually installing it into the system?

5. clamd/clamd -c etc/clamd.conf
LibClamAV Error: cl_load(): Can't get status of /usr/local/share/clamav
ERROR: Can't get file status

Please enter the GCC version into the Reported against field.

What happens for OMP_NUM_THREADS=1?

I will look thoroughly into the find implementation in the meantime.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #7 from edwintorok at gmail dot com 2010-01-12 12:41 ---
(In reply to comment #6)
> Can I get this thing to run without actually installing it into the system?
>
> 5. clamd/clamd -c etc/clamd.conf
> LibClamAV Error: cl_load(): Can't get status of /usr/local/share/clamav
> ERROR: Can't get file status

Yes, you can specify the path. A minimal example (you can use any path instead of /tmp):

$ mkdir /tmp/testdb
$ touch /tmp/testdb/foo.pdb
$ cat >etc/clamd.conf <<EOF
DatabaseDirectory /tmp/testdb
LocalSocket /tmp/clamd.socket
EOF
$ clamd/clamd -c etc/clamd.conf

Same for clamdscan (-c etc/clamd.conf)

> Please enter the GCC version into the Reported against field.

Done.

> What happens for OMP_NUM_THREADS=1?

Will test now.

> I will look thoroughly into the find implementation in the meantime.

Ok.

--
edwintorok at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |4.4.2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #8 from edwintorok at gmail dot com 2010-01-12 12:51 ---
(In reply to comment #7)
> > What happens for OMP_NUM_THREADS=1?
>
> Will test now.

It doesn't hang with OMP_NUM_THREADS=1.
It does hang with OMP_NUM_THREADS=2, or with OMP_NUM_THREADS unset.

> > Please enter the GCC version into the Reported against field.

I reproduced the issue with gcc version 4.3.2 (Debian 4.3.2-1.1) too.

BTW you can also find my build on gcc14 in the compiler farm at /home/edwin/clam/git_test/clamav-devel (should be world readable).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #9 from edwintorok at gmail dot com 2010-01-12 13:35 ---
Could this bug be related to this one?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36242#c4

Clamd creates threads using pthread_create, and std::find is called from those threads. There are also threads that only poll/dispatch and never use the STL (hence never use OpenMP).

However, the gcc manual doesn't mention any incompatibility between pthread_create and OpenMP (or libstdc++ parallel mode).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
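The threading pattern described in this comment can be sketched roughly as below. This is an illustration only, not clamd code: SearchJob, search_worker and count_hits are hypothetical names. In a parallel mode build, each std::find call made from a worker could itself start an OpenMP team nested inside the pthread. Older toolchains may need -pthread when linking.

```cpp
#include <algorithm>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical job description handed to each worker thread.
struct SearchJob {
    const std::vector<int>* data;  // shared, read-only input
    int needle;                    // value this worker looks for
    bool found;                    // result written by the worker
};

// Worker entry point: runs std::find from inside a pthread.
extern "C" void* search_worker(void* arg) {
    SearchJob* job = static_cast<SearchJob*>(arg);
    job->found = std::find(job->data->begin(), job->data->end(),
                           job->needle) != job->data->end();
    return nullptr;
}

// Start one pthread per needle, join them all, and count the hits.
int count_hits(const std::vector<int>& data, const std::vector<int>& needles) {
    std::vector<SearchJob> jobs;
    for (int needle : needles) {
        jobs.push_back(SearchJob{&data, needle, false});
    }

    // jobs is fully built before any thread starts, so the pointers
    // passed to pthread_create stay valid.
    std::vector<pthread_t> threads(jobs.size());
    std::vector<bool> started(jobs.size(), false);
    for (std::size_t i = 0; i < jobs.size(); ++i) {
        started[i] =
            pthread_create(&threads[i], nullptr, search_worker, &jobs[i]) == 0;
    }

    int hits = 0;
    for (std::size_t i = 0; i < jobs.size(); ++i) {
        if (!started[i]) {
            continue;  // thread creation failed; skip this job
        }
        pthread_join(threads[i], nullptr);
        if (jobs[i].found) {
            ++hits;
        }
    }
    return hits;
}
```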
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #10 from singler at kit dot edu 2010-01-12 14:35 ---
Can reproduce deadlock now.

--
singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |singler at kit dot edu
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-01-12 14:35:01
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #11 from singler at kit dot edu 2010-01-12 14:35 ---
(In reply to comment #9)
> Could this bug be related to this one:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36242#c4

That bug is invalid for GCC 4.4.

> Clamd creates threads using pthread_create, std::find is called from those
> threads. There are also threads that only poll/dispatch, and never use the STL
> (hence never uses openmp).
>
> However the gcc manual doesn't mention incompatibility between pthread_create
> and openmp (or libstdc++ parallel mode).

It should work nevertheless.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #12 from singler at kit dot edu 2010-01-12 17:42 ---
Thread 1 waits for its colleagues, but where have they gone? Is it possible that an exception is thrown inside find (by means of the value type or the predicate)?

I don't fully trust gdb in this case, but it shows that an iterator range of (NULL, NULL) had to be searched.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #13 from edwintorok at gmail dot com 2010-01-12 17:54 ---
(In reply to comment #12)
> Thread 1 waits for its colleagues, but where are they gone? Is it possible
> that an exception is thrown inside find (by means of the value type or the
> predicate)?
> I don't fully trust gdb in this case, but it shows that an iterator range of
> (NULL, NULL) had to be searched.

This code is compiled with -fno-exceptions, could that be a problem?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #4 from edwintorok at gmail dot com 2010-01-12 07:28 ---
(In reply to comment #3)
> Johannes is looking into it,

Thanks.

> certainly reproducing the problem will not be a trivial task, I'm afraid...

If the steps I listed in the bug report don't work for you, just let me know for which step you need more info. You can also ping me on #gcc (oftc.net), or #clamav (freenode.net).

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #1 from edwintorok at gmail dot com 2010-01-05 18:09 ---
(In reply to comment #0)
> $ make -j4

This should have been:
make CCLD=g++ -j4

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624
[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier
--- Comment #2 from paolo dot carlini at oracle dot com 2010-01-05 19:22 ---
The best we can do is asking the attention of Johannes...

--
paolo dot carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |singler at ira dot uka dot
                   |                            |de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624