[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277

--- Comment #23 from Thomas Rodgers  ---
(In reply to Thiago Macieira from comment #21)
> I understand that. I don't think it's a reason to repeat the policy, though.
> 
> Anyway, I don't have any new arguments than when we discussed this two years
> ago, so I won't pursue this matter further.

(Un)fortunately, the entire implementation detail in the headers will change,
most likely with GCC14 (ideally moving most of the implementation into the
.so). As Jonathan points out, this hardly the only experimental feature area
within libstdc++ that has undergone such changes.

[Bug libstdc++/99277] C++2a synchronisation is inefficient in GCC 11

2023-05-08 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99277

--- Comment #16 from Thomas Rodgers  ---
(In reply to Thiago Macieira from comment #15)
> > >  5) std::barrier implementation also uses a type that futex(2) can't 
> > > handle
> 
> > barrier still uses a 1-byte enum for the atomic waits.
> 
> That can only now be fixed for libstdc++.so.7, then.

The original implementation came from Olvier Giroux and is part of libc++. The
libc++ implementation also does not use a type that futex or ulock_wait/wake
(uint64_t) can handle. I have discussed this in the past with Olivier, the
choice of char was deliberate on his part. The implementation has been tested
on a number of platforms (including time on ORNL's Summit). The following
comment, preserved from libc++ should be considered carefully before any change
here -

" 2. A great deal of attention has been paid to avoid cache line thrashing
by flattening the tree structure into cache-line sized arrays, that
are indexed in an efficient way."

It is my opinion that the bar for making a change here is high. I would need to
see benchmark numbers that illustrate the performance differences under various
contention scenarios vs impact on caches by being able to fit the entire tree
in a single cache line using char, vs four or eight cache lines using the type
favored by futex or ulock_wait/wake.

[Bug libstdc++/103934] std::atomic_flag: multiple C++20 functions missing

2023-03-09 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103934

Thomas Rodgers  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Thomas Rodgers  ---
Fixed on master, backported to releases/gcc-12 and releases/gcc-12

[Bug libstdc++/88322] Implement C++20 library features.

2023-03-09 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88322
Bug 88322 depends on bug 103934, which changed state.

Bug 103934 Summary: std::atomic_flag: multiple C++20 functions missing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103934

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug libstdc++/108974] std::barrier except completion function which is not manifestly noexcept

2023-03-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108974

Thomas Rodgers  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rodgertq at gcc dot 
gnu.org

[Bug libstdc++/108974] std::barrier except completion function which is not manifestly noexcept

2023-03-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108974

--- Comment #4 from Thomas Rodgers  ---
(In reply to Jiang An from comment #3)
> > is_nothrow_invocable_v shall be true.
> 
> If implementation divergence is not intendedly permitted, I don't think it
> makes much sense to introduce UB in this way.
> 
> I guess we should either turn it into a mandating requirement:
> > Instantiation of barrier is ill-formed
> > if is_nothrow_invocable_v is not true.
> 
> Or relax the preconditions:
> > If any invocation to the completion function throws an exception,
> > the behavior is undefined.
> 
> I've mailed to LWG Chair for this...

LWG Chair has already weighed in on this thread.

libstdc++ and libc++ have essentially the same implementation of std::barrier
(libstdc++'s implementation is derived from libc++'s). So that they behave
consistently and MSVC is different is not surprising.

I don't have a strong opinion on whether this should be UB ore mandated.
Raising an LWG issue seems reasonable.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #25 from Thomas Rodgers  ---
(In reply to Mkkt Bkkt from comment #24)
> (In reply to Thomas Rodgers from comment #22)
> > Your example of '100+ core' systems especially on NUMA is certainly a valid
> > one. I would ask, at what point do those collisions and the resulting cache
> > invalidation traffic swamp the cost of just making the syscall? I do plan to
> > put these tests together, because there is another algorithm that I am
> > exploring, that I believe will reduce the likelihood of spurious wakeups,
> > and achieves the same result as this particular approach, without generating
> > the same invalidation traffic. At this point, I don't anticipate doing that
> > work until after GCC13 stage1 closes.
> 
> I try to explain: 
> 
> syscall overhead is some constant commonly like 10-30ns (futex syscall can
> be more expensive like 100ns in your example)
> 
> But numbers of cores are grow, arm also makes more popular (fetch_add/sub
> have more cost on it compares to x86)
> And people already faced with situation where fetch_add have a bigger cost
> than syscall overhead:
> 
> https://pkolaczk.github.io/server-slower-than-a-laptop/
> https://travisdowns.github.io/blog/2020/07/06/concurrency-costs.html
> 
> I don't think we will faced with problem like in these links in
> atomic::wait/notify in real code, but I'm pretty sure in some cases it can
> be more expansive than syscall part of atomic::wait/notify
> 
> Of course better to prove it, maybe someday I will do it :(

So to your previous comment, I don't the discussion is at all pointless. i plan
to raise some of these issues at the next SG1 meeting in November. Sure, that
doesn't help *you* or any developer with your specific intent until C++26, and
maybe Boost's implementation is a better choice, I also get how unsatisfying of
an aswer that is.

I'm well aware of the potential scalability problems, and I have a longer term
plan to get concrete data on how different implementation choices impact
scalability. The barrier implementation (which is the same algorithm as in
libc++), for example spreads this traffic over 64 individual atomic_refs, for
this very reason, and that implementation has been shown to scale quite well on
ORNL's Summit. But not all users of libstdc++ have those sorts of problems
either.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #22 from Thomas Rodgers  ---
(In reply to Mkkt Bkkt from comment #20)
> My main concern with this optimization it's not zero-overhead.
> 
> It's not necessary when we expect we have some waiters, in that case it just
> additional synchronization and contention in waiter pool (that have small
> fixed size, just imagine system with 100+ cores, if we have > 16 waiting
> threads some of them make fetch_add/sub on the same atomic, that can be
> expensive, especially on numa)
> 
> And at the same time, I don't understand when I need to notify and cannot
> know notification not needed.
> I don't understand when it useful.

You are correct, it is not zero overhead. It also isn't clear what those
overheads are, either. As I noted in comment #21, there is no test over a
variety of workloads to inform this discussion, either.

Your example of '100+ core' systems especially on NUMA is certainly a valid
one. I would ask, at what point do those collisions and the resulting cache
invalidation traffic swamp the cost of just making the syscall? I do plan to
put these tests together, because there is another algorithm that I am
exploring, that I believe will reduce the likelihood of spurious wakeups, and
achieves the same result as this particular approach, without generating the
same invalidation traffic. At this point, I don't anticipate doing that work
until after GCC13 stage1 closes.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #21 from Thomas Rodgers  ---
(In reply to Mkkt Bkkt from comment #16)
> > it with sarcasm
> 
> I started with sarcasm because you restart this thread with some doubtful
> benchmarks without code for them.
> 
> I think it's very disrespectfully.

I wasn't sarcastic in what I posted. As I noted, this question has come up
before in different contexts, Bugzilla is a useful historical archive, so
updating this issue with my reasoning and a bit of data was primarily a capture
task.

So, let's try this again.

I did not post the original source because it required hacking on the libstdc++
headers. I have posted a version that does not require that, the results are
identical.

In this test, it is the example Jonathan cited in #14; incrementing an atomic
int and calling notify.

This isn't about semaphore or any other synchronization primitive. Those types
are free to make different choices that may be more appropriate to the
constrained usage of the type just like Lewis' lightweight manual reset event
does (I will also note that Lewis has reviewed this implementation, and has
written a paper to be discussed at the Fall meeting, p2616). 

There are, as Jonathan has pointed out, use cases where notify can and will be
called without a notifier having any way to determine it will wake a waiter.
One example I, as the person that is going to have to implement C++26 executors
care about is a wait free work-stealing deque, it certainly doesn't require
anything more than spinning for work on an empty queue to be algorithmically
correct, but after spinning on an empty queue, making the rounds trying to
steal work from other deques, maybe spinning a bit more, just to be sure, the
de-queuing thread which hasn't been able to acquire more work is probably going
to want to enter a wait until such time as it knows it can do productive work.
Another thread pushing work into that queue won't be able to determine if the
dequeue-ing thread is spinning for new work, work stealing, or has entered a
wait, but atomic::notify() does know that, and can avoid penalizing the
thread submitting work with a syscall, if there is no thread in a wait on the
other end of the deque, which is the expected case for this algorithm.

p1135 was the paper that added atomic wait/notify. One of the co-authors of
that paper wrote the libc++ implementation. That implementation, as with
libstdc++, is not simply a wrapper over the underlying platform's wait
primitive. It has two 'backends', an exponentially timed backoff and ulock
wait/wake. libstdc++ currently has a Futex and condvar backend. Both
implementations make the choice of having a short-term spinning strategy and
long term waiting strategy (spinning, futex/ulock, condvar).

I have confirmed with the libc++ implementation's author, (who also chairs the
C++ committee's Concurrency and Parallelism study group), that it was never the
intention of p1135 or the subsequently standardized language in C++20 to imply
that wait/notify were direct portable analogs to the platform's waiting
primitives. There are users, such as yourself that want exactly that, there are
other users (like in my prior industry) where busy waiting is the desired
strategy, and in between those two choices are people who want it to work as
advertised in the standard, and to do so 'efficiently'. Both libc++ and
libstdc++ take a balanced approach somewhere between always going to OS and
always spinning.

There is an open question here that your original issue raises -

* At what point do collisions on the waiter pool, with the cache invalidation
traffic and spurious wakeups that result, swamp the gain of doing this empty
waiter check on notify?

I also, made a comment about the 'size of the waiter pool not withstanding'. I
chose a smaller size than libc++ chose, in part because Jonathan and I did not
want to make any sort of ABI commitment until this had been in a few GCC
releases. This implementation is header only at present, and still considered
experimental. libc++ committed to an ABI early.

In the sizing of the libc++ waiter pool there is the comment that 'there is no
magic in this value'. 

Not only is there no magic, there is no test of any sort that I have done, or
that has been done on libc++ to determine what effect different size pools have
under different load scenarios. So, all of it is a guess at this point. I will
likely match libc++ when I do move this into the .so.

Finally, in the most recent mailing there is p2643 which proposes additional
changes to atomic waiting. One proposal is to add a 'hinted' wait that can
allow the caller to steer the choices atomic wait/notify uses. I have conferred
with the other authors of the paper and this latter option is not without
controversy, and likely some sharp edges for the user, but I plan to raise the
discussion at the Fall WG21 meeting to see what the other members of SG1 think.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #17 from Thomas Rodgers  ---
Created attachment 53638
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53638=edit
benchmark main.cc

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #11 from Thomas Rodgers  ---
(In reply to Mkkt Bkkt from comment #9)
> Why do you think you smarter than msvc stl, libc++, boost::atomic developers?
> 
> Maybe it's about your "I"?

I should ignore this (see jwakely's response), but -

I very much don't think I am smarter than the person (ogir...@apple.com) who
implemented libc++'s implementation. There are minor difference between
libstc++'s details and libc++'s but in this regard,in particular, they behave
the same.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #6 from Thomas Rodgers  ---
(In reply to Mkkt Bkkt from comment #5)
> Single reason why I want to use atomic::wait/notify is cross platform api
> for futex, wait/wake on address, ulock, etc
> Not because I need YOU decide instead of me how many spins, or other
> optimization I need.
> 
> boost::atomic already do this.
> 
> Why do you cannot make same?

(In reply to Mkkt Bkkt from comment #4)
> Nice, thanks for benchmarks without code.
> 
> Also, why do I need call notify when don't have wait? 
> 
> Common usage from my point of view looks like:
> 
> atomic.wait(value, mo):
> 
> while (atomic.load(mo) == value) {
>   futex_wait();
> }
> 
> Notify want to looks like:
> 
> atomic.store(1, mo)
> atomic.notify_one();
> 
> See:
> 
> https://github.com/lewissbaker/cppcoro/blob/master/lib/
> lightweight_manual_reset_event.cpp
> 

I have every confidence that Lewis knows how to bring a paper for a
'lightweight manual reset event' to SG1, I suspect it will be well received
when he does.

> https://github.com/YACLib/YACLib/blob/main/src/util/atomic_event.cpp
> 
> and others
> 
> So your optimization is pretty useless.
> 
> Also if for some reason possible few notification, or notify before wait,
> then possible to have code can looks like:
> 
> if (atomic.exchange(non_wait_value, mo) == wait_value) {
>   atomic.notify_one();
> }

I

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-09-19 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

--- Comment #3 from Thomas Rodgers  ---
Since this latter point has come up before, I want to additionally note that
the optimization to use an atomic count of waiters per-waiter pool bucket means
that a call to notify_one/notify_all is roughly 25x faster based on my testing
than naively issuing a syscall to FUTEX_WAKE when there is no possibility of
the wake being issued to a waiter.

2022-09-19T20:34:28-07:00
Running ./benchmark
Run on (20 X 4800 MHz CPU s)
CPU Caches:
  L1 Data 48 KiB (x10)
  L1 Instruction 32 KiB (x10)
  L2 Unified 1280 KiB (x10)
  L3 Unified 24576 KiB (x1)
Load Average: 0.69, 0.61, 1.30
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may
be noisy and will incur extra overhead.
--
BenchmarkTime CPU   Iterations
--
BM_empty_notify_checked   3.79 ns 3.79 ns179929051
BM_empty_notify_syscall   94.1 ns 93.9 ns  7477997

For types that can use a FUTEX directly (e.g. int) there is no place to put
that extra atomic to perform this check, so we can either have the type that is
directly usable by the underlying platform be significantly more expensive to
call, or we can use the waiter count in the waiter_pool.

[Bug libstdc++/106772] atomic::wait shouldn't touch waiter pool if used platform wait

2022-08-29 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106772

Thomas Rodgers  changed:

   What|Removed |Added

 Resolution|--- |INVALID
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Thomas Rodgers  ---
This is not a bug. The size of the waiter pool, not withstanding, the point is
you want to determine if a notify is likely to notify anything, to avoid the
very expensive (by comparison) syscall to wake the futex.

[Bug libstdc++/97936] [11/12 Regression] 30_threads/latch/3.cc hangs

2022-07-05 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97936

Thomas Rodgers  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #20 from Thomas Rodgers  ---
Solaris uses the non-futex wait/notify path. There has been a recent PR opened
indicating a likely algorithmic issue with the non-futex implementation.

See - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106183

I am going to re-open this issue while I investigate the new report.

[Bug libstdc++/106183] std::atomic::wait might fail to be unblocked by notify_one/all on platforms without platform_wait()

2022-07-05 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106183

Thomas Rodgers  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2022-07-06
 Status|UNCONFIRMED |ASSIGNED

[Bug libstdc++/102994] std::atomic::wait is not marked const

2022-05-10 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

Thomas Rodgers  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #18 from Thomas Rodgers  ---
Committed to GCC12 and backported to GCC11

[Bug libstdc++/100806] deadlock in std::counting_semaphore

2022-02-14 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100806

Thomas Rodgers  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Thomas Rodgers  ---
I believe this has been fully resolved.

[Bug libstdc++/102994] std::atomic::wait is not marked const

2022-02-11 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

--- Comment #14 from Thomas Rodgers  ---
Created attachment 52420
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52420=edit
Make notify_one/notify_all non-const

I submitted this patch to the list.

[Bug tree-optimization/102586] [12 Regression] ICE in clear_padding_type, at gimple-fold.c:4798 since r12-3433-ga25e0b5e6ac8a77a

2022-02-10 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102586

--- Comment #20 from Thomas Rodgers  ---
(In reply to Jakub Jelinek from comment #17)
...

> I don't remember the std::bit_cast case right now, OpenMP atomics are about

Not sure if this is what you are talking about (frankly most of this is well
above my grokability) but, for std::bit_cast, the standard says -

"Constraints:
—(1.1) sizeof(To) == sizeof(From) is true;
—(1.2) is_trivially_copyable_v is true; and
—(1.3) is_trivially_copyable_v is true"

So same as for 

[Bug libstdc++/104442] atomic::wait incorrectly loops in case of spurious notification when __waiter is shared

2022-02-10 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104442

Thomas Rodgers  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Thomas Rodgers  ---
(In reply to Marc Poulhiès from comment #7)
> Thanks !

Since the patch committed is effectively the same as your proposed patch, I am
going to assume this resolves the issue on vxworks, but I have no way to
independently confirm that.

[Bug libstdc++/104442] atomic::wait incorrectly loops in case of spurious notification when __waiter is shared

2022-02-09 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104442

--- Comment #4 from Thomas Rodgers  ---
(In reply to Jonathan Wakely from comment #3)
> Tom submitted
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590050.html already,
> which is slightly different.

The primary difference was changing the memory order of the load in _M_do_spin.
I may revert that part of the change and submit it separately, as I've noticed
one other case where I'd like to change it from RELAXED to SEQ_CST.

[Bug libstdc++/104442] atomic::wait incorrectly loops in case of spurious notification when __waiter is shared

2022-02-08 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104442

Thomas Rodgers  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2022-02-09
 CC||rodgertq at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Thomas Rodgers  ---
I took a look at whether or not it made sense to do a different refactoring
with this version of the implementation and I don't believe that it does (the
implementation detail here is likely to change substantially when we commit to
an ABI stable version) and indeed, the predicate version does exactly what this
patch proposes.

[Bug libstdc++/102994] std::atomic::wait is not marked const

2022-02-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

Thomas Rodgers  changed:

   What|Removed |Added

 Status|REOPENED|SUSPENDED

--- Comment #11 from Thomas Rodgers  ---
(In reply to Jonathan Wakely from comment #10)
> N.B. [member.functions] in the standard says 
> 
> "For a non-virtual member function described in the C++ standard library, an
> implementation may declare a different set of member function signatures,
> provided that any call to the member function that would select an overload
> from the set of declarations described in this document behaves as if that
> overload were selected."
> 
> In general, being declared with a different signature is permitted.
> 
> Do you have an example where a call to std::atomic::notify_one() that
> should be valid according to the standard either fails to compile or
> misbehaves, as a result of being const qualified?

Pending the outcome of whether there is an LWG issue with the wording, and
given this, I am going to mark this issue SUSPENDED.

[Bug libstdc++/102994] std::atomic::wait is not marked const

2022-02-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102994

--- Comment #7 from Thomas Rodgers  ---
(In reply to Óscar Fuentes from comment #6)
> (In reply to Jonathan Wakely from comment #5)
> > (In reply to Óscar Fuentes from comment #4)
> > > The fix is wrong. It changes atomic_notify_one and atomic_notify_all 
> > > instead
> > > of atomic<>::wait.
> > 
> > It changed both.
> > 
> > > So right now atomic<>::wait remains unfixed
> > 
> > Are you sure?
> 
> Sigh. Sorry. It would be nice if the commit message mentioned the change to
> atomic_notify_* and its motivation, though.
>  
> > > and atomic_notify_(one|all) arg
> > > is wrongly marked as const.
> > 
> > This will be the subject of a library issue, potentially fixing the
> > standard. The notify functions should be const too.
> 
> So IIUC you are applying modifications to libstdc++ that deviate from the
> published standard expecting that the committee will accept those changes.
> As a user, this is troublesome, because right now I need to special-case gcc
> version >11.2 and maybe version  not accepted and is reverted.

There is an ongoing discussion between myself and the SG1,LWG, and LEWG chairs
(two of which were authors of p1135 which proposes atomic wait/notify) as to
whether there is a wording issue with the standard.

None of the three major standard library implementations require (as a matter
of implementation detail) notify_one/notify_all to be non-const, and indeed the
early wording of p1135 had them marked const. Between r2 and r3 of p1135 this
was changed, it'cites the minutes of an LEWG discussion as part of the change
rationale, but the minutes of that discussion do not give the motivation for
the change.

One argument is that you would typically wait in a const context and notify in
a non-const context, but by that rationale, the constness of atomic_ref::notify
is somewhat weird.

[Bug other/89863] [meta-bug] Issues in gcc that other static analyzers (cppcheck, clang-static-analyzer, PVS-studio) find that gcc misses

2022-01-15 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
Bug 89863 depends on bug 102074, which changed state.

Bug 102074 Summary: include/bits/atomic_timed_wait.h:215: possible missing 
return ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102074

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug libstdc++/102074] include/bits/atomic_timed_wait.h:215: possible missing return ?

2022-01-15 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102074

Thomas Rodgers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Thomas Rodgers  ---
Fixed on master, see -
https://gcc.gnu.org/g:763eb1f19239ebb19c0f87590a4f02300c02c52b

[Bug libstdc++/102377] FAIL: 29_atomics/atomic_flag/cons/56012.cc with -std=gnu++20

2021-09-16 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102377

--- Comment #2 from Thomas Rodgers  ---
(In reply to Jonathan Wakely from comment #1)
> The reason we don't see it in every test is that this one uses:
> 
> // { dg-options "-Wsystem-headers -Wnarrowing" }
> 
> But I think the warning is pointing out a real issue. Since the interference
> sizes vary with -mtune options, we shouldn't use them in , since
> that's defining a public ABI (or will be, once our C++20 support is
> non-experimental).

Agreed.

[Bug target/101761] Random hang with 29_atomics/atomic_ref/wait_notify.cc

2021-09-15 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101761

--- Comment #11 from Thomas Rodgers  ---
(In reply to Jonathan Wakely from comment #10)
> http://eel.is/c++draft/atomics#ref.generic.general-3.sentence-2
> 
> "While any atomic_­ref instances exist that reference the *ptr object, all
> accesses to that object shall exclusively occur through those atomic_­ref
> instances."

Yes. I will submit a patch for this test shortly.

Having said that, the atomic integral tests also spuriously deadlock, they
don't have this UB issue.

[Bug target/101761] Random hang with 29_atomics/atomic_ref/wait_notify.cc

2021-09-15 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101761

--- Comment #5 from Thomas Rodgers  ---
(In reply to Florian Weimer from comment #4)
> I think it's a test bug:
> 
>   std::atomic_ref a{ aa };
> 
>   std::thread t([&]
> {
> a.store(bb);
> a.notify_one();
> });
>   a.wait(aa);
> 
> Due to the use of std::atomic_ref, store() overwrites aa with the value of
> bb. If the notify_one() call completes before the wait() call, wait() blocks
> because aa == bb (due to the previous store()), and the wakeup never happens
> because wakes are not queued.

This was my initial suspicion. It applies to all of the
29_atomics/**/wait_notify.cc tests. The libc++ tests for wait/notify have the
same problem (see
https://github.com/llvm/llvm-project/blob/main/libcxx/test/std/atomics/atomics.types.operations/atomics.types.operations.wait/atomic_wait.pass.cpp)

I have, as part of this investigation, extensively re-reviewed the
implementation detail in bits/atomic_wait.h and I think there is a race
condition lurking in the implementation which also needs to be address.
Specifically tracking waiters to try to void making the syscall for
FUTEX_WAKE_PRIVATE if there are no current waiters. In order for this to work
the increment of the waiter count would have to be atomic with syscall. I
believe that libc++ also has this same issue.

[Bug target/101761] Random hang with 29_atomics/atomic_ref/wait_notify.cc

2021-09-07 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101761

--- Comment #3 from Thomas Rodgers  ---
This appears to be the test case itself.

[Bug libstdc++/101274] [11/12 Regression] std::execution::seq has incorrect behaviour under GCC 11.1.0

2021-07-02 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101274

--- Comment #6 from Thomas Rodgers  ---
It does raise an issue which I'm going to follow up on separately on the SG1
(concurrency and parallelism study group) mailing list.

While it is indeed the case that standard says you can't count on deterministic
sequencing for std::reduce(), I can't find any wording that directly supports
the wording on cppreference.com std::execution::sequenced_policy, though that
wording is consistent with my recollection of SG1's discussions regarding
execution policies.

There is wording that says 'Unless otherwise specified, the semantics of
ExecutionPolicy algorithm overloads are identical to their overloads without."

The wording for std::for_each() for the non-execution policy overload the
standard says -

Effects: Applies f to the result of dereferencing every iterator in the range
[first, last), starting from first and proceeding to last - 1.

And for one that takes an execution policy -

Effects: Applies f to the result of dereferencing every iterator in the range
[first, last).

So it omits the ordering constraint, which makes sense, but I wonder if we
shouldn't be more explicit in the documentation of the policies themselves.

[Bug libstdc++/101274] [11/12 Regression] std::execution::seq has incorrect behaviour under GCC 11.1.0

2021-07-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101274

--- Comment #4 from Thomas Rodgers  ---
I did some more reading of
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/n4878.pdf and it would
seem that you are not even guaranteed a deterministic ordering of the
application of binary_op on the non-execution policy accepting overloads of
std::reduce().

See [reduce] the note at bullet 9 -

"[Note 1 : The difference between reduce and accumulate is that reduce applies
binary_op in an unspecified order, which yields a nondeterministic result for
non-associative or non-commutative binary_op such as floating-point addition. —
end note]"

[Bug libstdc++/101274] [11/12 Regression] std::execution::seq has incorrect behaviour under GCC 11.1.0

2021-07-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101274

Thomas Rodgers  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Thomas Rodgers  ---
The standard only requires that the the parallel algorithm's execution, when
invoked with std::sequenced_policy, be invoked on the calling thread, and (as
with the other execution policies) allows for the invocations of element access
functions to be indeterminately sequenced on the execution agent.

This is also spelled at -

https://en.cppreference.com/w/cpp/algorithm/execution_policy_tag_t

"1) The execution policy type used as a unique type to disambiguate parallel
algorithm overloading and require that a parallel algorithm's execution may not
be parallelized. The invocations of element access functions in parallel
algorithms invoked with this policy (usually specified as std::execution::seq)
are indeterminately sequenced in the calling thread."

[Bug libstdc++/101228] tbb/task.h is Deprecated in newer TBB.

2021-06-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101228

Thomas Rodgers  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rodgertq at gcc dot 
gnu.org
 Status|NEW |ASSIGNED

--- Comment #9 from Thomas Rodgers  ---
Yes (I already reviewed it when it arrived upstream). It is probably also safe
to backport to GCC-11 since there's no ABI or ABI stability concerns. I will
have a go at applying *just* this patch (I don't want to commit to trying a
full rebase against upstream at this point).

[Bug libstdc++/100806] deadlock in std::counting_semaphore

2021-06-16 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100806

--- Comment #2 from Thomas Rodgers  ---
I have confirmed that this is new issue, not related to PR100334.

When there are waiting threads and the semaphore attempts to release one of
the waiting threads, the FUTEX_WAKE syscall's return indicates that 1 thread is
woken, as expected, but it is unclear to me why there is no forward progress at
that point.

I have replaced the algorithm with a simplified version (similar to what is in
libc++) and observe the same result. Further investigation is required. I have
submitted an interim patch that forces wake all which appears to address the
issue.

[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait

2021-06-08 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

Thomas Rodgers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Thomas Rodgers  ---
Fixed in master, backported to releases/gcc-11

[Bug libstdc++/100889] Wrong param type for std::atomic_ref<_Tp*>::wait

2021-06-04 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100889

Thomas Rodgers  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug libstdc++/100806] deadlock in std::counting_semaphore

2021-05-28 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100806

Thomas Rodgers  changed:

   What|Removed |Added

   Last reconfirmed||2021-05-28
   Assignee|unassigned at gcc dot gnu.org  |rodgertq at gcc dot 
gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #1 from Thomas Rodgers  ---
This is almost certainly a duplicate of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Which is fixed on master with a backport to releases/gcc-11 but was broken in
GCC 11.1

Will confirm, thanks!

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-17 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #9 from Thomas Rodgers  ---
Fixed on master, backported to releases/gcc-11

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

   Keywords||patch

--- Comment #8 from Thomas Rodgers  ---
Candidate patch submitted to mailing list -

https://gcc.gnu.org/pipermail/libstdc++/2021-May/052464.html

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-03 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

  Attachment #50728|0   |1
is obsolete||

--- Comment #7 from Thomas Rodgers  ---
Created attachment 50740
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50740=edit
Fix wrong thread getting notification, take 2

This should fix the issue, submitting patch to mailing list.

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

--- Comment #6 from Thomas Rodgers  ---
Created attachment 50728
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50728=edit
Fix wrong thread getting notification

I am testing this patch

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-05-01 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

Thomas Rodgers  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2021-05-01
 Status|UNCONFIRMED |ASSIGNED

[Bug libstdc++/100334] atomic::notify_one() sometimes wakes wrong thread

2021-04-29 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100334

--- Comment #4 from Thomas Rodgers  ---
This analysis is likely correct, except for -

"- protect from spurious wakeups in __waiter_pool::_M_do_wait by rechecking if
the value has changed from old, if not then wait again"

An earlier version of this code did this, but was subject to ABA problems, and 
the standard doesn't require that we do this -

"(23.3) — Blocks until it is unblocked by an atomic notifying operation or is
unblocked spuriously."

[Bug libstdc++/100164] [11/12 Regression] semaphore_impl not declared on AIX

2021-04-21 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

--- Comment #16 from Thomas Rodgers  ---
The _M_try_acquire() change should be on master and gcc-11 now.

[Bug libstdc++/100164] [11/12 Regression] semaphore_impl not declared on AIX

2021-04-21 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

--- Comment #15 from Thomas Rodgers  ---
>From the most recent patch -

+_GLIBCXX_ALWAYS_INLINE bool
+_M_try_acquire() noexcept
+{
+  for (;;)
+   {
+ auto __err = sem_trywait(&_M_semaphore);
+ if (__err && (errno == EINTR))
+   continue;
+ else if (__err && (errno == EAGAIN))
+   return false;
+ else if (__err)
+   std::terminate();
+ else
+   break;
+   }
+  return true;
+}
+

You are correct it was never exercised. I saw then when I forced it in the test
case and then added the above. I don't understand why you are seeing this error
if you've applied the 0001-libstdc-Work-around-for-PR100164 patch. I don't see
it locally.

[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX

2021-04-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

Thomas Rodgers  changed:

   What|Removed |Added

  Attachment #50643|0   |1
is obsolete||
  Attachment #50645|0   |1
is obsolete||

--- Comment #10 from Thomas Rodgers  ---
Created attachment 50646
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50646=edit
Work around broken macro name

This patch works around the borked macro name that David pointed out on the
mailing list. I left in the commented out _GLIBCXX_HAVE_POSIX_SEMAPHORE checks
and have temporarily replaced them with _GLIBCXX__GLIBCXX_HAVE_POSIX_SEMAPHORE,
and forced the posix semaphore test to always run if that macro is defined.

I am not sufficiently versed in the arcane ways in which config.h is
transformed to c++config.h but the borked macro transformation ideally should
be fixed and the commented out checks restored.

[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX

2021-04-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

--- Comment #7 from Thomas Rodgers  ---
Created attachment 50645
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50645=edit
Disable  on unsupported targets

Let's try this with the right patch attached this time

[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX

2021-04-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

--- Comment #4 from Thomas Rodgers  ---
It would appear that I cannot log into either of the AIX machines in the
compile farm.

[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX

2021-04-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

--- Comment #3 from Thomas Rodgers  ---
Created attachment 50643
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50643=edit
Disable  on unsupported targets

This patch is probably not the most elegant way to do this, it probably should
be a dg-effective-target check and it disables  on platforms where
there is a suitable Posix implementation, but this feature is experimental for
GCC11.

[Bug libstdc++/100164] [11 Regression] semaphore_impl not declared on AIX

2021-04-20 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100164

Thomas Rodgers  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rodgertq at gcc dot 
gnu.org

--- Comment #2 from Thomas Rodgers  ---
Investigating

[Bug libstdc++/97936] [11 Regression] 30_threads/latch/3.cc hangs

2021-04-09 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97936

--- Comment #16 from Thomas Rodgers  ---
I believe it is addressed in the most recent patch I have submitted for the
atomic wait/notify, barriers, latches, semaphores functionality.

[Bug libstdc++/98034] std::atomic_signed_lock_free and std::atomic_unsigned_lock_free not defined

2020-11-30 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98034

Thomas Rodgers  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2020-11-30

[Bug libstdc++/97719] New: "Implement C++20 features for " changed behavior of istreambuf_iterator

2020-11-04 Thread rodgertq at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97719

Bug ID: 97719
   Summary: "Implement C++20 features for " changed
behavior of istreambuf_iterator
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rodgertq at gcc dot gnu.org
  Target Milestone: ---

from mailing list -

On 07/10/2020 18:55, Thomas Rodgers wrote:
From: Thomas Rodgers 
New ctors and ::view() accessor for -
  * basic_stingbuf
  * basic_istringstream
  * basic_ostringstream
  * basic_stringstreamm
New ::get_allocator() accessor for basic_stringbuf.
I found that this

"libstdc++: Implement C++20 features for " changed the behavior of

$ cat test.cc
#include 
#include 
#include 
int main() {
 std::stringstream s("a");
 std::istreambuf_iterator i(s);
 if (i != std::istreambuf_iterator()) std::cout << *i << '\n';
}
$ g++ -std=c++20 test.cc
$ ./a.out

from printing "a" to printing nothing.  (The `i != ...` comparison appears to
change i from pointing at "a" to pointing to null, and returns false.)

I ran into this when building LibreOffice, and I hope test.cc is a faithfully
minimized reproducer.  However, I know little about std::istreambuf_iterator,
so it may well be that the code isn't even valid.