[Bug libstdc++/47433] libstdc++ parallel mode calls std::swap explicitely

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47433

--- Comment #5 from Johannes Singler <singler at kit dot edu> 2011-01-24 13:19:22 UTC ---
What are you proposing for a fix?  Omitting std::?  Using std::iter_swap where
appropriate, like stl_algo.h mostly does?  The latter would be more consistent.

Johannes


[Bug libstdc++/47433] libstdc++ parallel mode calls std::swap explicitely

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47433

--- Comment #6 from Johannes Singler <singler at kit dot edu> 2011-01-24 13:23:26 UTC ---
Taking __key as value type in (some variants of __delete_min_insert) makes
sense, since it is also used as a buffer for storing the current loser.  Having
a local variable that is initialized with the const ref would have the same
effect, would it not?
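
As a minimal sketch of that equivalence: both variants below end up with a private, writable copy of the key that can serve as the buffer for the current loser.  The function names and the flat array standing in for the path through the loser tree are hypothetical and much simpler than the real __delete_min_insert.

```cpp
#include <cassert>
#include <utility>  // std::swap (used as fallback for the unqualified call)

// Hypothetical, simplified replay along one loser-tree path.  'path' holds
// the losers stored on the way to the root; the smaller element keeps
// travelling upwards, the larger one stays behind in the node.

// Variant 1: the key is taken by value, so the parameter itself is the buffer.
template <typename T>
T replay_by_value(T key, T* path, int length) {
  assert(length >= 0);
  using std::swap;
  for (int i = 0; i < length; ++i)
    if (path[i] < key)
      swap(path[i], key);  // stored loser wins, old key becomes the loser
  return key;              // overall winner
}

// Variant 2: the key is taken by const reference and copied into a local
// variable, which then plays exactly the same role as the by-value parameter.
template <typename T>
T replay_by_const_ref(const T& key_in, T* path, int length) {
  assert(length >= 0);
  T key = key_in;  // local buffer initialized from the const ref
  using std::swap;
  for (int i = 0; i < length; ++i)
    if (path[i] < key)
      swap(path[i], key);
  return key;
}
```

Either way exactly one copy of the key is made per call, so the choice is mostly a matter of interface style.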

Johannes


[Bug libstdc++/47433] libstdc++ parallel mode calls std::swap explicitely

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47433

--- Comment #9 from Johannes Singler <singler at kit dot edu> 2011-01-24 13:55:33 UTC ---
I have made the attached minimal patch.  
Use std::iter_swap where possible, use swap for _Tp, and leave std::swap for
built-in types.  I will test and then submit the patch.
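
To illustrate the three cases, here is a small sketch.  It is not the patch itself; the namespace demo, the type Tracked and the helper function names are made up for the example.  It shows why the qualified call matters: a user-provided swap is only found through argument-dependent lookup when the call is unqualified or goes through std::iter_swap.

```cpp
#include <algorithm>  // std::iter_swap
#include <cassert>
#include <utility>    // std::swap
#include <vector>

namespace demo {  // hypothetical user namespace, not part of libstdc++
struct Tracked {
  int value;
  static int custom_swaps;  // counts calls to the user-provided swap
};
int Tracked::custom_swaps = 0;

// User-provided swap, found only through argument-dependent lookup.
void swap(Tracked& a, Tracked& b) {
  ++Tracked::custom_swaps;
  std::swap(a.value, b.value);  // built-in type: std::swap is fine here
}
}  // namespace demo

// Generic code on a template value type: unqualified call, with std::swap
// made visible as the fallback for types without their own swap.
template <typename T>
void swap_values(T& a, T& b) {
  using std::swap;
  swap(a, b);
}

// Generic code on iterators: std::iter_swap swaps the pointed-to elements.
template <typename Iterator>
void swap_pointees(Iterator a, Iterator b) {
  std::iter_swap(a, b);
}

// The problematic form: the explicit qualification bypasses demo::swap.
template <typename T>
void swap_qualified(T& a, T& b) {
  std::swap(a, b);
}
```

With demo::Tracked arguments, swap_values and swap_pointees end up in demo::swap, whereas swap_qualified silently uses the generic std::swap.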

Johannes


[Bug libstdc++/47433] libstdc++ parallel mode calls std::swap explicitely

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47433

--- Comment #10 from Johannes Singler <singler at kit dot edu> 2011-01-24 13:57:16 UTC ---
Created attachment 23098
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=23098
Minimal patch to avoid std::swap on template types.


[Bug libstdc++/47437] New: libstdc++ parallel mode: multiway_merge does not compile

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47437

           Summary: libstdc++ parallel mode: multiway_merge does not compile
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: blocker
          Priority: P3
         Component: libstdc++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: sing...@kit.edu
                CC: sing...@kit.edu, paolo.carl...@oracle.com,
                    manuel.holtgr...@fu-berlin.de
        Depends on: 47433


Mainline currently fails to compile the parallel mode multiway merge facilities
because of a spurious mutable reference.  This is a serious regression.

The fix is easy, though, and is currently undergoing testing.
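
For readers unfamiliar with the rule: C++ does not allow the mutable specifier on a data member of reference type, and it would add nothing anyway, since a reference member already permits modifying the referenced object from a const member function.  The following sketch shows the accepted form; the struct and member names are invented for illustration and are not the ones in the parallel mode headers.

```cpp
#include <cassert>

// Hypothetical comparator with a non-const call operator.
struct CountingLess {
  int calls;
  CountingLess() : calls(0) {}
  bool operator()(int a, int b) {  // deliberately non-const
    ++calls;
    return a < b;
  }
};

// Hypothetical stand-in for a helper class that keeps a reference to the
// user's comparator.
template <typename Compare>
struct MergeHelperSketch {
  // mutable Compare& comp;  // ill-formed: 'mutable' cannot be applied to a
  //                         // reference member
  Compare& comp;             // plain reference member: this is all that is needed

  explicit MergeHelperSketch(Compare& c) : comp(c) {}

  // A const member function may still call through the reference, because
  // constness of the helper does not propagate to the referenced object.
  bool less(int a, int b) const { return comp(a, b); }
};
```

Uncommenting the mutable line should make any conforming compiler reject the class.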


[Bug libstdc++/47437] libstdc++ parallel mode: multiway_merge does not compile

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47437

--- Comment #1 from Johannes Singler <singler at kit dot edu> 2011-01-24 14:48:42 UTC ---
Created attachment 23101
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=23101
Remove mutable qualifier from reference member.


[Bug libstdc++/47437] libstdc++ parallel mode: multiway_merge does not compile

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47437

Johannes Singler singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.01.24 14:49:38
         AssignedTo|unassigned at gcc dot       |singler at kit dot edu
                   |gnu.org                     |
     Ever Confirmed|0                           |1


[Bug libstdc++/47437] [4.6 Regression] libstdc++ parallel mode: multiway_merge does not compile

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47437

Johannes Singler singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from Johannes Singler <singler at kit dot edu> 2011-01-24 16:52:37 UTC ---
Fixed by above commit.


[Bug libstdc++/47433] libstdc++ parallel mode calls std::swap explicitely

2011-01-24 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47433

Johannes Singler singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #13 from Johannes Singler <singler at kit dot edu> 2011-01-24 17:10:05 UTC ---
Fixed in mainline.


[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-11-15 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706

--- Comment #26 from Johannes Singler <singler at kit dot edu> 2010-11-15 08:53:12 UTC ---
(In reply to comment #25)
> You might have misread what I wrote.  I did not mention 35 tests; I mentioned
> that a test became slower by 35%.  The total number of different tests was 4
> (and each was invoked multiple times per spincount setting, indeed).  One out
> of four stayed 35% slower until I increased GOMP_SPINCOUNT to 20.

Sorry, I got that wrong.  

> This makes some sense, but the job of an optimizing compiler and runtime
> libraries is to deliver the best performance they can even with somewhat
> non-optimal source code.

I agree with that in principle.  But please keep in mind that, as things
stand, the very simple testcase posted takes a serious performance hit.  And
repeated parallel loops like the one in the test program certainly appear very
often in real applications.
BTW:  How does the testcase react to this change on your machine?

> There are plenty of real-world cases where spending time on application
> redesign for speed is unreasonable or can only be completed at a later time -
> yet it is desirable to squeeze a little bit of extra performance out of the
> existing code.  There are also cases where more efficient parallelization -
> implemented at a higher level to avoid frequent switches between parallel and
> sequential execution - makes the application harder to use.  To me, one of the
> very reasons to use OpenMP was to avoid/postpone that redesign and the
> user-visible complication for now.  If I went for a more efficient
> higher-level solution, I would not need OpenMP in the first place.

OpenMP should not be regarded as only good for loop parallelization.  With
the new task construct, it is a fully-fledged parallelization substrate.

> > So I would suggest a threshold of 10 for now.
>
> My suggestion is 25.

Well, that's already much better than staying with 20,000,000, so I agree.

> > IMHO, something should really happen to this problem before the 4.6 release.
>
> Agreed.  It'd be best to have a code fix, though.

IMHO, there is no obvious way to fix this in principle.  There will always be a
compromise between busy waiting and giving back control to the OS.

Jakub, what do you plan to do about this problem?


[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-11-12 Thread singler at kit dot edu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706

--- Comment #24 from Johannes Singler <singler at kit dot edu> 2010-11-12 08:15:56 UTC ---
If only one out of 35 tests becomes slower, I would rather blame it on this one
(probably badly parallelized) application, not on the OpenMP runtime system.  So
I would suggest a threshold of 10 for now.  IMHO, something should really be
done about this problem before the 4.6 release.


[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-08-30 Thread singler at kit dot edu


--- Comment #20 from singler at kit dot edu  2010-08-30 08:41 ---
Maybe we could agree on a compromise for a start.  Alexander, what are the
corresponding results for GOMP_SPINCOUNT=10?


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |singler at kit dot edu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706



[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-08-13 Thread singler at kit dot edu


--- Comment #16 from singler at kit dot edu  2010-08-13 15:48 ---
I would really like to see this bug tackled.  It has been confirmed two more
times. 

Fixing it is easily done by lowering the spin count as proposed.  Otherwise,
please show cases where a low spin count hurts performance.

In general, for a tuning parameter, a reasonably good-natured value should be
preferred over a value that gives the best results in one case, but very bad
ones in another.


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-08-13 15:48:18
               date|                            |
   Target Milestone|4.4.5                       |4.5.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706



[Bug bootstrap/44439] New: Configure states wrong required versions for GMP, MPFR, and MPC

2010-06-07 Thread singler at kit dot edu
In case of wrong prerequisite versions, configure says

Building GCC requires GMP 4.2+, MPFR 2.3.1+ and MPC 0.8.0+

but the prerequisites page http://gcc.gnu.org/install/prerequisites.html says

GNU Multiple Precision Library (GMP) version 4.3.2 (or later)
MPFR Library version 2.4.2 (or later)
MPC Library version 0.8.1 (or later)

which is apparently more correct.  This could cause some installation
headaches.


-- 
           Summary: Configure states wrong required versions for GMP, MPFR,
                    and MPC
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: bootstrap
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: singler at kit dot edu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44439



[Bug libstdc++/44417] make check-target-libstdc++-v3 fails due to undefined ptrdiff_t

2010-06-07 Thread singler at kit dot edu


--- Comment #11 from singler at kit dot edu  2010-06-07 09:35 ---
Obviously, I'm not the only one having this problem: Jason has patched
libstdc++-v3/testsuite/util/testsuite_abi.h in the meantime.


r160313 | jason | 2010-06-05 15:13:46 +0200 (Sat, 05 Jun 2010) | 1 line

* testsuite/util/testsuite_abi.h: Work around glibc BZ 9694.

+// Include stddef now to work around glibc BZ 9694
+#include <stddef.h>

related to

http://sourceware.org/bugzilla/show_bug.cgi?id=9694

This solves the problem in testsuite_abi.h, but then the next test fails,
namely testsuite_allocator.h.

I have tested all that using a different user ID, with a completely clean
environment and new checkout, but the problem persists.


-- 

singler at kit dot edu changed:

   What|Removed |Added

 CC||jason at redhat dot com
 Status|RESOLVED|UNCONFIRMED
 Resolution|WORKSFORME  |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44417



[Bug libstdc++/44417] New: make check-target-libstdc++-v3 fails due to undefined ptrdiff_t

2010-06-04 Thread singler at kit dot edu
make check-target-libstdc++-v3 fails because ptrdiff_t is undefined.
std::ptrdiff_t works.
Maybe this bug is related to the Linux system it is run on.  I am running
openSUSE 11.1.

configure --enable-languages=c,c++ --program-suffix=-rep
--prefix=$HOME/gcc/install_trunk_1


In file included from
/home/singler/gcc/trunk_1/libstdc++-v3/testsuite/util/testsuite_abi.h:27:0,
 from
/home/singler/gcc/trunk_1/libstdc++-v3/testsuite/util/testsuite_abi.cc:23:
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:371:5: error:
'ptrdiff_t' does not name a type
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:447:23: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:459:18: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:469:26: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:495:18: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:501:26: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:540:18: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:546:26: error:
'ptrdiff_t' has not been declared
/home/singler/gcc/trunk_1/libstdc++-v3/libsupc++/cxxabi.h:566:4: error:
'ptrdiff_t' has not been declared
compiler exited with status 1

Any ideas and/or a workaround?


-- 
           Summary: make check-target-libstdc++-v3 fails due to undefined
                    ptrdiff_t
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: singler at kit dot edu
 GCC build triplet: x86_64-unknown-linux-gnu
  GCC host triplet: x86_64-unknown-linux-gnu
GCC target triplet: x86_64-unknown-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44417



[Bug libstdc++/44417] make check-target-libstdc++-v3 fails due to undefined ptrdiff_t

2010-06-04 Thread singler at kit dot edu


--- Comment #2 from singler at kit dot edu  2010-06-04 14:16 ---
I had cleaned the builddir already.
Adding 

#include <cstddef>

solves the problem. 
The crucial file seems to be

lib/gcc/x86_64-unknown-linux-gnu/4.6.0/include/stddef.h

Only if it is (indirectly) included is ptrdiff_t defined in the global scope.
Maybe on other systems, ptrdiff_t is also declared somewhere else, so the
problem does not appear.
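
A portable way to stay clear of this, sketched below with a made-up helper function, is to include <cstddef> and spell the type as std::ptrdiff_t.  The C++ header guarantees the name in namespace std, whereas the unqualified global name is only guaranteed by <stddef.h>.

```cpp
#include <cassert>
#include <cstddef>  // guarantees std::ptrdiff_t; ::ptrdiff_t may or may not
                    // be declared as a side effect

// Hypothetical helper: number of elements between two pointers into the same
// array, using the qualified type name so that it does not depend on what
// other headers happen to pull into the global namespace.
inline std::ptrdiff_t element_distance(const int* first, const int* last) {
  assert(first != 0 && last != 0);
  return last - first;
}
```

Code that has to keep the unqualified name, like cxxabi.h in the error messages above, needs <stddef.h> to be included first instead.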


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44417



[Bug middle-end/44416] [4.6 regression] Failed to build 447.dealII in SPEC CPU 2006

2010-06-04 Thread singler at kit dot edu


--- Comment #3 from singler at kit dot edu  2010-06-04 14:19 ---
Bug 44417 is very likely to have the same cause, but here, we can reproduce it
more easily, using the testsuite.

*** This bug has been marked as a duplicate of 44417 ***


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |DUPLICATE


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44416



[Bug libstdc++/44417] make check-target-libstdc++-v3 fails due to undefined ptrdiff_t

2010-06-04 Thread singler at kit dot edu


--- Comment #3 from singler at kit dot edu  2010-06-04 14:19 ---
*** Bug 44416 has been marked as a duplicate of this bug. ***


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl dot tools at gmail dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44417



[Bug libgomp/43706] scheduling two threads on one core leads to starvation

2010-04-23 Thread singler at kit dot edu


--- Comment #13 from singler at kit dot edu  2010-04-23 14:17 ---
The default spin count is not 2,000,000 cycles, but as much as 20,000,000.  As
commented in libgomp/env.c, this is supposed to correspond to 200ms.  The
timings we see here are even larger, but the number of cycles is just a rough
estimate.
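
For orientation, the arithmetic behind these numbers can be written down as follows.  This is only a back-of-the-envelope sketch: the 10 ns per spin is derived from the 20,000,000 / 200ms figure above, and the real time per iteration depends on the machine.

```cpp
#include <cassert>

// Assumption taken from the figures above: 20,000,000 spins are meant to
// take about 200 ms, i.e. 200,000,000 ns / 20,000,000 = 10 ns per spin.
const long long kAssumedNanosPerSpin = 200000000LL / 20000000LL;

// Rough spinning time in microseconds for a given spin count, under the
// assumption above.
inline long long spin_micros(long long spincount) {
  assert(spincount >= 0);
  return spincount * kAssumedNanosPerSpin / 1000;
}
```

Under that assumption, a spin count of 100,000 corresponds to roughly 1 ms and 10,000 to roughly 0.1 ms.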

Throttling the spincount when libgomp is aware of too many threads is a good
idea, but it is just a heuristic.  If there are other processes, the cores
might be loaded anyway, and libgomp has little chance of figuring that out.
This gets even more difficult when multiple programs use libgomp at the same
time.  So I would like the non-throttling value to be chosen more
conservatively, better balancing worst case behavior in difficult situations
and best case behavior on an unloaded machine.

There are algorithms in libstdc++ parallel mode that show speedups for less
than 1ms of sequential running time (when taking threads from the pool), so
users will accept a parallelization overhead for such small computing times.
However, if they are then hit by a 200ms penalty, this results in catastrophic
slowdowns.  Calling such short-lived parallel regions several times makes this
very noticeable, although it need not be.  So IMHO, by default, the spinning
should take about as long as rescheduling a thread takes (one that has already
been migrated to another core), thereby making things at most twice as bad as
in the best case.
From my experience, this is a matter of a few milliseconds, so I propose to
lower the default spincount to something like 10,000, at most 100,000.  I think
that spinning for even longer than a usual time slice, as is done now, is
questionable anyway.
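
The "at most twice as bad" argument can be made concrete with a simple cost model.  This is a sketch under simplifying assumptions, not a description of what libgomp does: time spent spinning counts fully as overhead, blocking costs one fixed rescheduling delay, and all names and numbers are illustrative.

```cpp
#include <algorithm>  // std::min
#include <cassert>

// All times are in microseconds and purely illustrative.

// Overhead of the policy "spin for at most spin_limit, then block":
// either the work arrives while spinning, or the spinning was in vain and
// the rescheduling cost has to be paid on top.
inline long long spin_then_block_cost(long long wait, long long spin_limit,
                                      long long reschedule_cost) {
  assert(wait >= 0 && spin_limit >= 0 && reschedule_cost >= 0);
  if (wait <= spin_limit)
    return wait;
  return spin_limit + reschedule_cost;
}

// Overhead of the best possible decision made with hindsight: spin the
// whole time if the wait is short, block immediately otherwise.
inline long long best_possible_cost(long long wait,
                                    long long reschedule_cost) {
  return std::min(wait, reschedule_cost);
}
```

With spin_limit equal to reschedule_cost, the first function never returns more than twice the second, whatever the wait turns out to be.  With a spin limit far above the rescheduling cost, a wait just beyond the limit costs a large multiple of the optimum.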

Are nested threads taken into account when deciding on whether to throttle or
not?


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706



[Bug middle-end/39154] Miscompilation of VLAs in nested parallel regions

2010-04-20 Thread singler at kit dot edu


--- Comment #3 from singler at kit dot edu  2010-04-20 14:04 ---
Can this old bug be closed?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39154



[Bug libstdc++/33485] parallel v3: do not use __builtin_alloca, use VLA

2010-02-09 Thread singler at kit dot edu


--- Comment #17 from singler at kit dot edu  2010-02-09 10:49 ---
The actual problem has vanished, but maybe it would still be nice to use VLA in
the appropriate places.

We can close the bug as fixed/invalid, or reprioritize it as an enhancement and
leave it open.  Either is fine with me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33485



[Bug libstdc++/42712] search_n/iterator.cc times out in parallel-mode

2010-01-18 Thread singler at kit dot edu


--- Comment #3 from singler at kit dot edu  2010-01-18 09:08 ---
Paolo, you were right, it was just the fallback switch missing for this case. 
And since this specific test issues many thousands of calls with very small
input, the overhead was very noticeable.  Patch upcoming...
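
The kind of fallback switch meant here can be sketched as follows.  The threshold, the enum and the function name are hypothetical, and the "parallel" branch is only a placeholder that calls the sequential algorithm, so the example shows the dispatch and not the parallel algorithm itself.

```cpp
#include <algorithm>  // std::search_n
#include <cassert>
#include <cstddef>    // std::size_t
#include <iterator>   // std::distance
#include <vector>

enum class Path { Sequential, Parallel };

// Hypothetical threshold below which going parallel is assumed not to pay off.
const std::size_t kMinParallelSize = 1000;

template <typename Iterator, typename T>
Iterator search_n_with_fallback(Iterator first, Iterator last, int count,
                                const T& value, Path& taken) {
  const std::size_t n = static_cast<std::size_t>(std::distance(first, last));
  if (n < kMinParallelSize) {
    // Fallback switch: tiny inputs go straight to the sequential algorithm
    // and never pay the cost of starting a parallel region.
    taken = Path::Sequential;
    return std::search_n(first, last, count, value);
  }
  taken = Path::Parallel;
  // Placeholder: a real implementation would split the range among threads
  // here; this sketch simply reuses the sequential algorithm.
  return std::search_n(first, last, count, value);
}
```

Without the first branch, every one of the many thousands of small calls in such a test would pay the full parallelization overhead.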


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42712



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-15 Thread singler at kit dot edu


--- Comment #16 from singler at kit dot edu  2010-01-15 14:29 ---
First, let's remove the superfluous #pragma omp single in two occurrences, to
make things simpler (see attached patch for trunk).
The problem still persists: the program deadlocks.

When dropping in some prints (see attached patch), the log ends like this:

find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished
thread 1 of 2 starts
thread 1 finished
successful join
find going parallel, requesting 2 thread
thread 0 of 2 starts
thread 0 finished

Analysis: Thread 1 never starts (or at least does not reach the first printf).
In general, for more threads, only thread 0 starts.  This obviously leads to
the deadlock.

So at first sight, I would blame it on the OpenMP implementation.  Maybe there
is some interference with pthreads after all.  Any other explanations?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-15 Thread singler at kit dot edu


--- Comment #17 from singler at kit dot edu  2010-01-15 14:30 ---
Created an attachment (id=19616)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19616&action=view)
Removes superfluous pragma omp single twice


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-15 Thread singler at kit dot edu


--- Comment #18 from singler at kit dot edu  2010-01-15 14:30 ---
Created an attachment (id=19617)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=19617&action=view)
Add printf debug statements.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-13 Thread singler at kit dot edu


--- Comment #14 from singler at kit dot edu  2010-01-13 13:53 ---
(In reply to comment #13)

> This code is compiled with -fno-exceptions, could that be a problem?

No, that should rather help.

Still, it is very difficult to debug this.  Is there at least a way to access
clamd's stdout and/or stderr?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-12 Thread singler at kit dot edu


--- Comment #6 from singler at kit dot edu  2010-01-12 12:36 ---
Can I get this thing to run without actually installing it into the system?

5. clamd/clamd -c etc/clamd.conf
LibClamAV Error: cl_load(): Can't get status of /usr/local/share/clamav
ERROR: Can't get file status

Please enter the GCC version into the Reported against field.
What happens for OMP_NUM_THREADS=1?

I will look thoroughly into the find implementation in the meantime.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-12 Thread singler at kit dot edu


--- Comment #10 from singler at kit dot edu  2010-01-12 14:35 ---
Can reproduce deadlock now.


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |singler at kit dot edu
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-01-12 14:35:01
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-12 Thread singler at kit dot edu


--- Comment #11 from singler at kit dot edu  2010-01-12 14:35 ---
(In reply to comment #9)
> Could this bug be related to this one:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36242#c4

This bug is invalid for GCC 4.4.

> Clamd creates threads using pthread_create, std::find is called from those
> threads. There are also threads that only poll/dispatch, and never use the STL
> (hence never uses openmp). However the gcc manual doesn't mention
> incompatibility between pthread_create and openmp (or libstdc++ parallel
> mode).

It should work nevertheless.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42624] libstdc++ parallel mode deadlocks in barrier

2010-01-12 Thread singler at kit dot edu


--- Comment #12 from singler at kit dot edu  2010-01-12 17:42 ---
Thread 1 waits for its colleagues, but where have they gone?  Is it possible
that an exception is thrown inside find (by the value type or the predicate)?
I don't fully trust gdb in this case, but it shows that an iterator range of
(NULL, NULL) had to be searched.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42624



[Bug libstdc++/42712] search_n/iterator.cc times out in parallel-mode

2010-01-12 Thread singler at kit dot edu


--- Comment #1 from singler at kit dot edu  2010-01-12 17:43 ---
Maybe rather an endless loop.


-- 

singler at kit dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |singler at kit dot edu
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2010-01-12 17:43:17
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42712