Re: _BitInt vs. _Atomic

2023-08-02 Thread Michael Matz via Gcc-patches
Hello,

On Tue, 1 Aug 2023, Joseph Myers wrote:

> > Only because cmpxchg is defined in terms of memcpy/memcmp.  If it were 
> > defined in terms of the == operator (obviously applied recursively 
> > member-wise for structs) and simple-assignment that wouldn't be a problem.  
> 
> It also wouldn't work for floating point, where I think clearly the atomic 
> operations should consider positive and negative zero as different, and 
> should consider different DFP quantum exponents for the same real number 
> as different - but should also consider the same NaN (same payload, same 
> choice of quiet / signaling) as being the same.

That is all true.  But the current wording can't work either.  It happily 
requires copying around memory between types of different representations 
and sizes, it makes padding observable behaviour, and due to that makes 
basic algebraic guarantees not be observed (after two values are checked 
equal with the predicates of the algrabra, they are not then in fact equal 
with predicates of the same algebra).


Ciao,
Michael.


Re: _BitInt vs. _Atomic

2023-08-01 Thread Martin Uecker
Am Dienstag, dem 01.08.2023 um 15:54 + schrieb Michael Matz:
> Hello,
> 
> On Mon, 31 Jul 2023, Martin Uecker wrote:
> 
> > >  Say you have a loop like so:
> > > 
> > > _Atomic T obj;
> > > ...
> > > T expected1, expected2, newval;
> > > newval = ...;
> > > expected1 = ...;
> > > do {
> > >   expected2 = expected1;
> > >   if (atomic_compare_exchange_weak(, , newval);
> > > break;
> > >   expected1 = expected2;
> > > } while (1);
> > > 
> > > As written this looks of course stupid, and you may say "don't do that", 
> > > but internally the copies might result from temporaries (compiler 
> > > generated or wrapper function arguments, or suchlike). 
> > >  Now, while 
> > > expected2 will contain the copied padding bits after the cmpxchg the 
> > > copies to and from expected1 will possibly destroy them.  Either way I 
> > > don't see why the above loop should be out-of-spec, so I can write it and 
> > > expect it to proceed eventually (certainly when the _strong variant is 
> > > used).  Any argument that would declare the above loop out-of-spec I 
> > > would 
> > > consider a defect in the spec.
> > 
> > It is "out-of-spec" for C in the sense that it can not be
> > expected work with the semantics as specified in the C standard.
> 
> (I call that a defect.  See below)

This was extensively discussed in WG14 (before my time). In fact,
there was a defect report about the previous version defined in
terms of values and the wording was changed to memcmp / memcpy
operating on padding bytes (also to align with C++ at that time):

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2059.htm#dr_431
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1906.htm


> > In practice, what the semantics specified using memcpy/memcmp
> > allow one to do is to also apply atomic operations on non-atomic 
> > types.  This is not guaranteed to work by the C standard, but
> > in practice  people often have to do this.  For example, nobody
> > is going to copy a 256 GB numerical array with non-atomic types
> > into another data structure with atomic versions of the same
> > type just so that you can apply atomic operations on it.
> > So one simply does an unsafe cast and hopes the compiler does
> > not break this.
> > 
> > If the non-atomic struct now has non-zero values in the padding, 
> > and the compiler would clear those automatically for "expected", 
> > you would create the problem of an infinite loop (this time 
> > for real).
> 
> Only because cmpxchg is defined in terms of memcpy/memcmp. 

Yes, but this is intentional.

>  If it were 
> defined in terms of the == operator (obviously applied recursively 
> member-wise for structs) and simple-assignment that wouldn't be a problem. 

C has no == operator or any concept of struct equality. 

It would also cause implementation overhead and I guess
could cause severe performance issues when there several
padding bytes distributed over an object and you need to
jump over those when doing copying or doing comparisons.
(how do vectorization?)

> In addition that would get rid of all discussion of what happens or 
> doesn't happen with padding.  Introducing reliance on padding bits (which 
> IMHO goes against some fundamental ideas of the C standard) has 
> far-reaching consequences, see below. 

Working with representation bytes of objects is a rather 
fundamental property of C. That you can do this using
character pointers or that you can copy objects with
memcpy and that the result are compared with memcmp is
something I expect to work in C. 

>  The current definition of the 
> atomic_cmpxchg is also inconsistent with the rest of the standard:
> 
> We have:
> 
>   ... (C is non-atomic variant of A) ...
>   _Bool atomic_compare_exchange_strong(volatile A *object,
>C *expected, C desired);
>   ... (is equivalent to atomic variant of:) 
>   if (memcmp(object, expected, sizeof (*object)) == 0)
> { memcpy(object, , sizeof (*object)); return true; }
>   else
> { memcpy(expected, object, sizeof (*object)); return false; }
> 
> But we also have:
> 
>   The size, representation, and alignment of an atomic type need not be 
>   the same as those of the corresponding unqualified type.
> 
>   (with later text only suggesting that at least for atomic integer 
>   types these please be the same.  But here we aren't talking about
>   integer types even.)

Reading the old meeting minutes, it seems WG14 considered
the case that an atomic type could have a content part and
possibly a lock and you would compare only the content
part (with padding) and not the lock. But I agree, the
wording should be improved.


> 
> So, already the 'memcmp(object, expected, sizeof (*object)' may be 
> undefined.  sizeof(*object) need not be the same as sizeof(*expected).
> In particular the memcpy in the else branch might clobber memory outside 
> *expected.
> 
> That alone should be sufficient to show that defining this all in terms of 
> memcpy/memcmp is 

Re: _BitInt vs. _Atomic

2023-08-01 Thread Joseph Myers
On Tue, 1 Aug 2023, Michael Matz via Gcc-patches wrote:

> Only because cmpxchg is defined in terms of memcpy/memcmp.  If it were 
> defined in terms of the == operator (obviously applied recursively 
> member-wise for structs) and simple-assignment that wouldn't be a problem.  

It also wouldn't work for floating point, where I think clearly the atomic 
operations should consider positive and negative zero as different, and 
should consider different DFP quantum exponents for the same real number 
as different - but should also consider the same NaN (same payload, same 
choice of quiet / signaling) as being the same.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: _BitInt vs. _Atomic

2023-08-01 Thread Michael Matz via Gcc-patches
Hello,

On Mon, 31 Jul 2023, Martin Uecker wrote:

> >  Say you have a loop like so:
> > 
> > _Atomic T obj;
> > ...
> > T expected1, expected2, newval;
> > newval = ...;
> > expected1 = ...;
> > do {
> >   expected2 = expected1;
> >   if (atomic_compare_exchange_weak(, , newval);
> > break;
> >   expected1 = expected2;
> > } while (1);
> > 
> > As written this looks of course stupid, and you may say "don't do that", 
> > but internally the copies might result from temporaries (compiler 
> > generated or wrapper function arguments, or suchlike). 
> >  Now, while 
> > expected2 will contain the copied padding bits after the cmpxchg the 
> > copies to and from expected1 will possibly destroy them.  Either way I 
> > don't see why the above loop should be out-of-spec, so I can write it and 
> > expect it to proceed eventually (certainly when the _strong variant is 
> > used).  Any argument that would declare the above loop out-of-spec I would 
> > consider a defect in the spec.
> 
> It is "out-of-spec" for C in the sense that it can not be
> expected work with the semantics as specified in the C standard.

(I call that a defect.  See below)

> In practice, what the semantics specified using memcpy/memcmp
> allow one to do is to also apply atomic operations on non-atomic 
> types.  This is not guaranteed to work by the C standard, but
> in practice  people often have to do this.  For example, nobody
> is going to copy a 256 GB numerical array with non-atomic types
> into another data structure with atomic versions of the same
> type just so that you can apply atomic operations on it.
> So one simply does an unsafe cast and hopes the compiler does
> not break this.
> 
> If the non-atomic struct now has non-zero values in the padding, 
> and the compiler would clear those automatically for "expected", 
> you would create the problem of an infinite loop (this time 
> for real).

Only because cmpxchg is defined in terms of memcpy/memcmp.  If it were 
defined in terms of the == operator (obviously applied recursively 
member-wise for structs) and simple-assignment that wouldn't be a problem.  
In addition that would get rid of all discussion of what happens or 
doesn't happen with padding.  Introducing reliance on padding bits (which 
IMHO goes against some fundamental ideas of the C standard) has 
far-reaching consequences, see below.  The current definition of the 
atomic_cmpxchg is also inconsistent with the rest of the standard:

We have:

  ... (C is non-atomic variant of A) ...
  _Bool atomic_compare_exchange_strong(volatile A *object,
   C *expected, C desired);
  ... (is equivalent to atomic variant of:) 
  if (memcmp(object, expected, sizeof (*object)) == 0)
{ memcpy(object, , sizeof (*object)); return true; }
  else
{ memcpy(expected, object, sizeof (*object)); return false; }

But we also have:

  The size, representation, and alignment of an atomic type need not be 
  the same as those of the corresponding unqualified type.

  (with later text only suggesting that at least for atomic integer 
  types these please be the same.  But here we aren't talking about
  integer types even.)

So, already the 'memcmp(object, expected, sizeof (*object)' may be 
undefined.  sizeof(*object) need not be the same as sizeof(*expected).
In particular the memcpy in the else branch might clobber memory outside 
*expected.

That alone should be sufficient to show that defining this all in terms of 
memcpy/memcmp is a bad idea.  But it also has other 
consequences: you can't copy (simple-assign) or compare (== operator) 
atomic values anymore reliably and expect the atomic_cmpxchg to work.  My 
example from earlier shows that you can't copy them, a similar one can be 
constructed for breaking ==.

But it goes further: you can also construct an example that shows an 
internal inconsistency just with using atomic_cmpxchg (of course, assume 
all this to run without concurrent accesses to the respective objects):

  _Atomic T obj;
  ...
  T expected, newval;
  expected = ...;
  newval = expected + 1; // just to make it different
  atomic_store (, expected);
  if (atomic_cmpxchg_strong (, , newval)) {
/* Now we have: obj == newval.
   Do we also have memcmp(,)==0? */
if (!atomic_cmpxchg_strong (, , expected)) {
  /* No, we can't rely on that!  */
  error("what's going on?");
}
  } else {
/* May happen, padding of expected may not be the same
   as in obj, even after atomic_store.  */
error("WTH? a compare after a store doesn't even work?");
  }

So, even though cmpxchg is defined in terms of memcpy/memcmp, we still 
can't rely on anything after it succeeded (or failed).  Simply because the 
by-value passing of the 'desired' argument will have unknown padding 
(within the implementation of cmpxchg) that isn't necessarily the same as 
the newval object.

Now, about your suggestion of clearing or ignoring the padding bits at 
specific 

Re: _BitInt vs. _Atomic

2023-07-31 Thread Martin Uecker
Am Montag, dem 31.07.2023 um 14:33 + schrieb Michael Matz:
> Hello,
> 
> On Fri, 28 Jul 2023, Martin Uecker wrote:
> 
> > > > Sorry, somehow I must be missing something here.
> > > > 
> > > > If you add something you would create a new value and this may (in
> > > > an object) have random new padding.  But the "expected" value should
> > > > be updated by a failed atomic_compare_exchange cycle and then have
> > > > same padding as the value stored in the atomic. So the next cycle
> > > > should succeed.  The user would not change the representation of
> > > > the "expected" value but create a new value for another object
> > > > by adding something.
> > > 
> > > You're right that it would pass the expected value not something after an
> > > operation on it usually.  But still, expected type will be something like
> > > _BitInt(37) or _BitInt(195) and so neither the atomic_load nor what
> > > atomic_compare_exchange copies back on failure is guaranteed to have the
> > > padding bits preserved.
> > 
> > For atomic_load in C a value is returned. A value does not care about
> > padding and when stored into a new object can produce new and different
> > padding.  
> > 
> > But for atomic_compare_exchange the memory content is copied into 
> > an object passed by pointer, so here the C standard requires to
> > that the padding is preserved. It explicitely states that the effect
> > is like:
> > 
> > if (memcmp(object, expected, sizeof(*object)) == 0)
> >   memcpy(object, , sizeof(*object));
> > else
> >   memcpy(expected, object, sizeof(*object));
> > 
> > > It is true that if it is larger than 16 bytes the libatomic 
> > > atomic_compare_exchange will memcpy the value back which copies the 
> > > padding bits, but is there a guarantee that the user code doesn't 
> > > actually copy that value further into some other variable?  
> > 
> > I do not think it would be surprising for C user when
> > the next atomic_compare_exchange fails in this case.
> 
> But that is a problem (the same one the cited C++ paper tries to resolve, 
> IIUC). 

I do not quite understand the paper. I can't see
how the example 3 in

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0528r0.html

could loop indefinitely with the memcpy / memcmp semantics,
but somehow the authors seem to claim this. 


>  Say you have a loop like so:
> 
> _Atomic T obj;
> ...
> T expected1, expected2, newval;
> newval = ...;
> expected1 = ...;
> do {
>   expected2 = expected1;
>   if (atomic_compare_exchange_weak(, , newval);
> break;
>   expected1 = expected2;
> } while (1);
> 
> As written this looks of course stupid, and you may say "don't do that", 
> but internally the copies might result from temporaries (compiler 
> generated or wrapper function arguments, or suchlike). 
>  Now, while 
> expected2 will contain the copied padding bits after the cmpxchg the 
> copies to and from expected1 will possibly destroy them.  Either way I 
> don't see why the above loop should be out-of-spec, so I can write it and 
> expect it to proceed eventually (certainly when the _strong variant is 
> used).  Any argument that would declare the above loop out-of-spec I would 
> consider a defect in the spec.

It is "out-of-spec" for C in the sense that it can not be
expected work with the semantics as specified in the C standard.

But I agree with you that it would be better if it just worked.

A compiler could, for example, always clear the padding when
initializing or storing atomic values.  It might also clear
the padding of the initial "expected", when it is
initialized or stored to.

But it should not clear / ignore the padding when copying 
to "expected" using atomic_compare_exhange or when comparing
to the memory content. See below why I think this would not
be helpful.

> 
> It's never a good idea to introduce reliance on padding bits.  Exactly 
> because you can trivially destroy them with simple value copies.
> 
> > > Anyway, for smaller or equal to 16 (or 8) bytes if 
> > > atomic_compare_exchange is emitted inline I don't see what would 
> > > preserve the bits.
> > 
> > This then seems to be incorrect for C.
> 
> Or the spec is.

In practice, what the semantics specified using memcpy/memcmp
allow one to do is to also apply atomic operations on non-atomic 
types.  This is not guaranteed to work by the C standard, but
in practice  people often have to do this.  For example, nobody
is going to copy a 256 GB numerical array with non-atomic types
into another data structure with atomic versions of the same
type just so that you can apply atomic operations on it. So
one simply does an unsafe cast and hopes the compiler does not
break this.

If the non-atomic struct now has non-zero values in the padding, 
and the compiler would clear those automatically for "expected", 
you would create the problem of an infinite loop (this time 
for real).


Martin









Re: _BitInt vs. _Atomic

2023-07-31 Thread Michael Matz via Gcc-patches
Hello,

On Fri, 28 Jul 2023, Martin Uecker wrote:

> > > Sorry, somehow I must be missing something here.
> > > 
> > > If you add something you would create a new value and this may (in
> > > an object) have random new padding.  But the "expected" value should
> > > be updated by a failed atomic_compare_exchange cycle and then have
> > > same padding as the value stored in the atomic. So the next cycle
> > > should succeed.  The user would not change the representation of
> > > the "expected" value but create a new value for another object
> > > by adding something.
> > 
> > You're right that it would pass the expected value not something after an
> > operation on it usually.  But still, expected type will be something like
> > _BitInt(37) or _BitInt(195) and so neither the atomic_load nor what
> > atomic_compare_exchange copies back on failure is guaranteed to have the
> > padding bits preserved.
> 
> For atomic_load in C a value is returned. A value does not care about
> padding and when stored into a new object can produce new and different
> padding.  
> 
> But for atomic_compare_exchange the memory content is copied into 
> an object passed by pointer, so here the C standard requires to
> that the padding is preserved. It explicitely states that the effect
> is like:
> 
> if (memcmp(object, expected, sizeof(*object)) == 0)
>   memcpy(object, , sizeof(*object));
> else
>   memcpy(expected, object, sizeof(*object));
> 
> > It is true that if it is larger than 16 bytes the libatomic 
> > atomic_compare_exchange will memcpy the value back which copies the 
> > padding bits, but is there a guarantee that the user code doesn't 
> > actually copy that value further into some other variable?  
> 
> I do not think it would be surprising for C user when
> the next atomic_compare_exchange fails in this case.

But that is a problem (the same one the cited C++ paper tries to resolve, 
IIUC).  Say you have a loop like so:

_Atomic T obj;
...
T expected1, expected2, newval;
newval = ...;
expected1 = ...;
do {
  expected2 = expected1;
  if (atomic_compare_exchange_weak(, , newval);
break;
  expected1 = expected2;
} while (1);

As written this looks of course stupid, and you may say "don't do that", 
but internally the copies might result from temporaries (compiler 
generated or wrapper function arguments, or suchlike).  Now, while 
expected2 will contain the copied padding bits after the cmpxchg the 
copies to and from expected1 will possibly destroy them.  Either way I 
don't see why the above loop should be out-of-spec, so I can write it and 
expect it to proceed eventually (certainly when the _strong variant is 
used).  Any argument that would declare the above loop out-of-spec I would 
consider a defect in the spec.

It's never a good idea to introduce reliance on padding bits.  Exactly 
because you can trivially destroy them with simple value copies.

> > Anyway, for smaller or equal to 16 (or 8) bytes if 
> > atomic_compare_exchange is emitted inline I don't see what would 
> > preserve the bits.
> 
> This then seems to be incorrect for C.

Or the spec is.


Ciao,
Michael.


Re: _BitInt vs. _Atomic

2023-07-28 Thread Joseph Myers
On Fri, 28 Jul 2023, Jakub Jelinek via Gcc-patches wrote:

> The C++ way of dealing with this is using __builtin_clear_padding,
> done on atomic stores/updates of the atomic memory (padding is cleared
> if any on the value to be stored, or on the expected and desired values).
> 
> I don't know enough about the C atomic requirements whether that is feasible
> for it as well, or whether it is possible to make the padding bits partially
> or fully set somehow non-atomically without invoking UB and then make it
> never match.

If padding bits not being reliably preserved causes problems for the 
compare-exchange loops in C in practice, then it would seem reasonable to 
use __builtin_clear_padding internally as part of implementing those cases 
of atomic compound assignment.

> And another issue is that while __atomic_load, __atomic_store,
> __atomic_exchange and __atomic_compare_exchange work on arbitrary _BitInt
> sizes, others like __atomic_fetch_add only support _BitInt or other integral
> types which have size of 1, 2, 4, 8 or 16 bytes, others emit an error
> in c-family/c-common.cc (sync_resolve_size).  So, either
> resolve_overloaded_builtin should for the case when pointer is pointer to
> _BitInt which doesn't have 1, 2, 4, 8 or 16 bytes size lower those into
> a loop using __atomic_compare_exchange (or perhaps also if there is
> padding), or  should do that.

The  interfaces definitely need to work with _BitInt.  My 
guess is that doing this with the built-in expansion would be more robust 
than putting more complicated definitions in the header that choose which 
built-in functions to use depending on properties of the type (and keeping 
the built-in functions limited to certain widths), but I don't know.

Note also that those  operations have no undefined behavior 
on signed integer overflow.

If any ABIs require sign / zero extension of _BitInt values in memory, 
care would also be needed in the case of (size of 1, 2, 4, 8 or 16 bytes, 
but also has high bits required to be sign / zero extended) to ensure that 
the operations are implemented so as to leave the high bits with the 
expected values in case of overflow, which wouldn't result from simply 
using the underlying operation for a type with the full precision of its 
memory size.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: _BitInt vs. _Atomic

2023-07-28 Thread Martin Uecker
Am Freitag, dem 28.07.2023 um 17:10 +0200 schrieb Jakub Jelinek:
> On Fri, Jul 28, 2023 at 04:53:30PM +0200, Martin Uecker wrote:
> > > The thing is that user doesn't have much control over those
> > > padding bits, so whether _Atomic operations on long double (when it is 80
> > > bit and stores from hw actually store 10 bytes rather than 12 or 16), or
> > > _BitInt(37) or _BitInt(195) or struct S { char a; int b; }; then depend
> > > purely on luck.  If the expected value is based on atomic_load on the
> > > atomic_compare_exchange location or whatever atomic_compare_exchange gave
> > > back, if in the loop one e.g. adds something to it, then again it might 
> > > get
> > > different padding bits from what is originally in memory, so it isn't true
> > > that it will always succeed at least in the second loop iteration.
> > 
> > Sorry, somehow I must be missing something here.
> > 
> > If you add something you would create a new value and this may (in
> > an object) have random new padding.  But the "expected" value should
> > be updated by a failed atomic_compare_exchange cycle and then have
> > same padding as the value stored in the atomic. So the next cycle
> > should succeed.  The user would not change the representation of
> > the "expected" value but create a new value for another object
> > by adding something.
> 
> You're right that it would pass the expected value not something after an
> operation on it usually.  But still, expected type will be something like
> _BitInt(37) or _BitInt(195) and so neither the atomic_load nor what
> atomic_compare_exchange copies back on failure is guaranteed to have the
> padding bits preserved.

For atomic_load in C a value is returned. A value does not care about
padding and when stored into a new object can produce new and different
padding.  

But for atomic_compare_exchange the memory content is copied into 
an object passed by pointer, so here the C standard requires to
that the padding is preserved. It explicitely states that the effect
is like:

if (memcmp(object, expected, sizeof(*object)) == 0)
  memcpy(object, , sizeof(*object));
else
  memcpy(expected, object, sizeof(*object));


> It is true that if it is larger than 16 bytes the libatomic
> atomic_compare_exchange will memcpy the value back which copies the padding
> bits, but is there a guarantee that the user code doesn't actually copy that
> value further into some other variable?  

I do not think it would be surprising for C user when
the next atomic_compare_exchange fails in this case.

> Anyway, for smaller or equal
> to 16 (or 8) bytes if atomic_compare_exchange is emitted inline I don't see
> what would preserve the bits.

This then seems to be incorrect for C.

Martin



Re: _BitInt vs. _Atomic

2023-07-28 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 28, 2023 at 04:53:30PM +0200, Martin Uecker wrote:
> > The thing is that user doesn't have much control over those
> > padding bits, so whether _Atomic operations on long double (when it is 80
> > bit and stores from hw actually store 10 bytes rather than 12 or 16), or
> > _BitInt(37) or _BitInt(195) or struct S { char a; int b; }; then depend
> > purely on luck.  If the expected value is based on atomic_load on the
> > atomic_compare_exchange location or whatever atomic_compare_exchange gave
> > back, if in the loop one e.g. adds something to it, then again it might get
> > different padding bits from what is originally in memory, so it isn't true
> > that it will always succeed at least in the second loop iteration.
> 
> Sorry, somehow I must be missing something here.
> 
> If you add something you would create a new value and this may (in
> an object) have random new padding.  But the "expected" value should
> be updated by a failed atomic_compare_exchange cycle and then have
> same padding as the value stored in the atomic. So the next cycle
> should succeed.  The user would not change the representation of
> the "expected" value but create a new value for another object
> by adding something.

You're right that it would pass the expected value not something after an
operation on it usually.  But still, expected type will be something like
_BitInt(37) or _BitInt(195) and so neither the atomic_load nor what
atomic_compare_exchange copies back on failure is guaranteed to have the
padding bits preserved.
It is true that if it is larger than 16 bytes the libatomic
atomic_compare_exchange will memcpy the value back which copies the padding
bits, but is there a guarantee that the user code doesn't actually copy that
value further into some other variable?  Anyway, for smaller or equal
to 16 (or 8) bytes if atomic_compare_exchange is emitted inline I don't see
what would preserve the bits.

Jakub



Re: _BitInt vs. _Atomic

2023-07-28 Thread Martin Uecker
Am Freitag, dem 28.07.2023 um 16:26 +0200 schrieb Jakub Jelinek:
> On Fri, Jul 28, 2023 at 04:03:39PM +0200, Martin Uecker wrote:
> > > On Thu, Jul 27, 2023 at 07:06:03PM +, Joseph Myers wrote:
> > > > I think there should be tests for _Atomic _BitInt types.  Hopefully 
> > > > atomic 
> > > > compound assignment just works via the logic for compare-and-exchange 
> > > > loops, but does e.g. atomic_fetch_add work with _Atomic _BitInt types?
> > > 
> > > So, there are 2 issues.
> > > 
> > > One is something I haven't seen being handled for C at all so far, but
> > > handled for C++ - padding bits.
> > > 
> > > Already e.g. x86 long double has some padding bits - 16 bits on ia32,
> > > 48 bits on x86_64, when one does
> > >   _Atomic long double l;
> > > ...
> > >   l += 2.0;
> > > it will sometimes work and sometimes hang forever.
> > > Similarly atomic_compare_exchange with structs which contain padding
> > > (unions with padding bits are lost case, there is nothing that can be
> > > reliably done for that, because we don't know at runtime what is the 
> > > active
> > > union member if any).  And _BitInt if it doesn't use all bits in
> > > all containing limbs has padding as well (and psABI doesn't say it is sign
> > > or zero extended).
> > 
> > What is the problem here?  In C, atomic_compare_exchange is defined in terms
> > of the memory content which includes padding.  So it may fail spuriously
> > due to padding differences (but it may fail anyway for arbitrary reasons
> > even without padding differences), but then should work in the second 
> > iterations.
> 
> See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r3.html
> for background.  

Thanks. I have seen this at that time, but it seems to refer to 
C++ specific problems. At least at that time, I concluded (maybe
incorrectly) that this is not a serious problem for how things 
work in C.


> The thing is that user doesn't have much control over those
> padding bits, so whether _Atomic operations on long double (when it is 80
> bit and stores from hw actually store 10 bytes rather than 12 or 16), or
> _BitInt(37) or _BitInt(195) or struct S { char a; int b; }; then depend
> purely on luck.  If the expected value is based on atomic_load on the
> atomic_compare_exchange location or whatever atomic_compare_exchange gave
> back, if in the loop one e.g. adds something to it, then again it might get
> different padding bits from what is originally in memory, so it isn't true
> that it will always succeed at least in the second loop iteration.

Sorry, somehow I must be missing something here.

If you add something you would create a new value and this may (in
an object) have random new padding.  But the "expected" value should
be updated by a failed atomic_compare_exchange cycle and then have
same padding as the value stored in the atomic. So the next cycle
should succeed.  The user would not change the representation of
the "expected" value but create a new value for another object
by adding something.


Martin






Re: _BitInt vs. _Atomic

2023-07-28 Thread Jakub Jelinek via Gcc-patches
On Fri, Jul 28, 2023 at 04:03:39PM +0200, Martin Uecker wrote:
> > On Thu, Jul 27, 2023 at 07:06:03PM +, Joseph Myers wrote:
> > > I think there should be tests for _Atomic _BitInt types.  Hopefully 
> > > atomic 
> > > compound assignment just works via the logic for compare-and-exchange 
> > > loops, but does e.g. atomic_fetch_add work with _Atomic _BitInt types?
> > 
> > So, there are 2 issues.
> > 
> > One is something I haven't seen being handled for C at all so far, but
> > handled for C++ - padding bits.
> > 
> > Already e.g. x86 long double has some padding bits - 16 bits on ia32,
> > 48 bits on x86_64, when one does
> >   _Atomic long double l;
> > ...
> >   l += 2.0;
> > it will sometimes work and sometimes hang forever.
> > Similarly atomic_compare_exchange with structs which contain padding
> > (unions with padding bits are lost case, there is nothing that can be
> > reliably done for that, because we don't know at runtime what is the active
> > union member if any).  And _BitInt if it doesn't use all bits in
> > all containing limbs has padding as well (and psABI doesn't say it is sign
> > or zero extended).
> 
> What is the problem here?  In C, atomic_compare_exchange is defined in terms
> of the memory content which includes padding.  So it may fail spuriously
> due to padding differences (but it may fail anyway for arbitrary reasons
> even without padding differences), but then should work in the second 
> iterations.

See https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r3.html
for background.  The thing is that user doesn't have much control over those
padding bits, so whether _Atomic operations on long double (when it is 80
bit and stores from hw actually store 10 bytes rather than 12 or 16), or
_BitInt(37) or _BitInt(195) or struct S { char a; int b; }; then depend
purely on luck.  If the expected value is based on atomic_load on the
atomic_compare_exchange location or whatever atomic_compare_exchange gave
back, if in the loop one e.g. adds something to it, then again it might get
different padding bits from what is originally in memory, so it isn't true
that it will always succeed at least in the second loop iteration.

Jakub



Re: _BitInt vs. _Atomic

2023-07-28 Thread Martin Uecker
Am Freitag, dem 28.07.2023 um 16:03 +0200 schrieb Martin Uecker:
> 
> > On Thu, Jul 27, 2023 at 07:06:03PM +, Joseph Myers wrote:
> > > I think there should be tests for _Atomic _BitInt types.  Hopefully 
> > > atomic 
> > > compound assignment just works via the logic for compare-and-exchange 
> > > loops, but does e.g. atomic_fetch_add work with _Atomic _BitInt types?
> > 
> > So, there are 2 issues.
> > 
> > One is something I haven't seen being handled for C at all so far, but
> > handled for C++ - padding bits.
> > 
> > Already e.g. x86 long double has some padding bits - 16 bits on ia32,
> > 48 bits on x86_64, when one does
> >   _Atomic long double l;
> > ...
> >   l += 2.0;
> > it will sometimes work and sometimes hang forever.
> > Similarly atomic_compare_exchange with structs which contain padding
> > (unions with padding bits are lost case, there is nothing that can be
> > reliably done for that, because we don't know at runtime what is the active
> > union member if any).  And _BitInt if it doesn't use all bits in
> > all containing limbs has padding as well (and psABI doesn't say it is sign
> > or zero extended).
> 
> What is the problem here?  In C, atomic_compare_exchange is defined in terms
> of the memory content which includes padding.  So it may fail spuriously
> due to padding differences (but it may fail anyway for arbitrary reasons
> even without padding differences), but then should work in the second 
> iterations.

(only the weak version can fail spuriously, but the strong one can still
fail if there are differences in the padding)





Re: _BitInt vs. _Atomic

2023-07-28 Thread Martin Uecker



> On Thu, Jul 27, 2023 at 07:06:03PM +, Joseph Myers wrote:
> > I think there should be tests for _Atomic _BitInt types.  Hopefully atomic 
> > compound assignment just works via the logic for compare-and-exchange 
> > loops, but does e.g. atomic_fetch_add work with _Atomic _BitInt types?
> 
> So, there are 2 issues.
> 
> One is something I haven't seen being handled for C at all so far, but
> handled for C++ - padding bits.
> 
> Already e.g. x86 long double has some padding bits - 16 bits on ia32,
> 48 bits on x86_64, when one does
>   _Atomic long double l;
> ...
>   l += 2.0;
> it will sometimes work and sometimes hang forever.
> Similarly atomic_compare_exchange with structs which contain padding
> (unions with padding bits are lost case, there is nothing that can be
> reliably done for that, because we don't know at runtime what is the active
> union member if any).  And _BitInt if it doesn't use all bits in
> all containing limbs has padding as well (and psABI doesn't say it is sign
> or zero extended).

What is the problem here?  In C, atomic_compare_exchange is defined in terms
of the memory content which includes padding.  So it may fail spuriously
due to padding differences (but it may fail anyway for arbitrary reasons
even without padding differences), but then should work in the second 
iterations.

Martin





Re: _BitInt vs. _Atomic

2023-07-28 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 27, 2023 at 07:06:03PM +, Joseph Myers wrote:
> I think there should be tests for _Atomic _BitInt types.  Hopefully atomic 
> compound assignment just works via the logic for compare-and-exchange 
> loops, but does e.g. atomic_fetch_add work with _Atomic _BitInt types?

So, there are 2 issues.

One is something I haven't seen being handled for C at all so far, but
handled for C++ - padding bits.

Already e.g. x86 long double has some padding bits - 16 bits on ia32,
48 bits on x86_64, when one does
  _Atomic long double l;
...
  l += 2.0;
it will sometimes work and sometimes hang forever.
Similarly atomic_compare_exchange with structs which contain padding
(unions with padding bits are lost case, there is nothing that can be
reliably done for that, because we don't know at runtime what is the active
union member if any).  And _BitInt if it doesn't use all bits in
all containing limbs has padding as well (and psABI doesn't say it is sign
or zero extended).

The C++ way of dealing with this is using __builtin_clear_padding,
done on atomic stores/updates of the atomic memory (padding is cleared
if any on the value to be stored, or on the expected and desired values).

I don't know enough about the C atomic requirements whether that is feasible
for it as well, or whether it is possible to make the padding bits partially
or fully set somehow non-atomically without invoking UB and then make it
never match.

If one ignores this or deals with it, then

_Atomic _BitInt(15) a;
_Atomic(_BitInt(15)) b;
_Atomic _BitInt(115) c;
_Atomic _BitInt(192) d;
_Atomic _BitInt(575) e;
_BitInt(575) f;

int
main ()
{
  a += 1wb;
  b -= 2wb;
  c += 3wb;
  d += 4wb;
  e -= 5wb;
//  f = __atomic_fetch_add (, 
54342985743985743985743895743834298574985734895743895734895wb, 
__ATOMIC_SEQ_CST);
}

compiles fine with the patch set.

And another issue is that while __atomic_load, __atomic_store,
__atomic_exchange and __atomic_compare_exchange work on arbitrary _BitInt
sizes, others like __atomic_fetch_add only support _BitInt or other integral
types which have size of 1, 2, 4, 8 or 16 bytes, others emit an error
in c-family/c-common.cc (sync_resolve_size).  So, either
resolve_overloaded_builtin should for the case when pointer is pointer to
_BitInt which doesn't have 1, 2, 4, 8 or 16 bytes size lower those into
a loop using __atomic_compare_exchange (or perhaps also if there is
padding), or  should do that.

Thoughts on that?

Jakub