Re: [PATCH] libstdc++: Remove UB in _Arg_value union alteranatives assigment

2026-03-04 Thread Jonathan Wakely
On Wed, 4 Mar 2026 at 16:48, Tomasz Kaminski  wrote:
>
>
>
> On Wed, Mar 4, 2026 at 4:11 PM Jonathan Wakely  wrote:
>>
>> On Wed, 25 Feb 2026 at 14:10 +0100, Tomasz Kamiński wrote:
>> >The _Arg_value::_M_set method, initialized the union member, by
>> >assining to reference to that member produced by _M_get(*this).
>> >However, per language rules, such assigment has undefined behavior,
>> >if alternative was not already active.
>>
>>
>> So doing 'u.mem = val' changes the active member, but returning a
>> reference to u.mem from the function and then assigning to that
>> reference is undefined, because we can't do 'return u.mem' if it's not
>> active? Interesting.
>
> Yes, the rules are syntax based. From what I recall, allowing any reference
> would lead into an arbitrary function accepting int* p, being able to switch
> union lifetime by assigning to *p. I think the exact need here is for the 
> compiler
> to be able to see that this is an assignment for a union member.,

Yes, that makes sense. IIRC the C99 rules for type punning with unions
have some similar restrictions.

>
>>
>>
>> Ah yes, [class.union.general] p5. Could you add that standard
>> reference to the commit message, maybe "per the language rules in
>> [class.union.general], such an assignment has ..."
>
> Will do, however I do not think that p5 is relevant here. Assigning to object
> not withing lifetime is UB, and p5 does not make it non UB.

Ack.

>>
>>
>> >To address above, we modify _M_set to use placement new for the class
>> >types, and invoke _S_access with two arguments for all other types.
>> >The _S_access (rename of _S_get) is modified to assign the value of
>> >the second parameter (if provided) to the union memnber. Such direct
>> >assigments are treated specially in the language, and will start
>
> I will add the reference to  [class.union.general] p5, here, as that were we
> get special treatment.

OK, thanks.

>>
>> >lifetime of union alternative of implicit-lifetime type.
>> >
>> >libstdc++-v3/ChangeLog:
>> >
>> >   * include/std/format (_Arg_value::_M_get): Rename to...
>> >   (_Arg_value::_M_access): Modified to accept optional
>> >   second parameter that is assigned to value.
>> >   (_Arg_value::_M_get): Handle rename.
>> >   (_Arg_value::_M_set): Use construct_at for basic_string_view,
>> >   and two-argument _S_access for other types.
>> >
>> >Signed-off-by: Tomasz Kamiński 
>> >Signed-off-by: Ivan Lazaric 
>> >Co-authored-by: Ivan Lazaric 
>> >---
>> >This was detected by constant evaluator, and while it seem to not
>> >be harmfull at runtime, we will need to address it anyway.
>> >
>> >Testing on x86_64-linux. OK for trunk when all test passes?
>> >
>> > libstdc++-v3/include/std/format | 61 ++---
>> > 1 file changed, 33 insertions(+), 28 deletions(-)
>> >
>> >diff --git a/libstdc++-v3/include/std/format 
>> >b/libstdc++-v3/include/std/format
>> >index 245ea964675..4f0b0f377c6 100644
>> >--- a/libstdc++-v3/include/std/format
>> >+++ b/libstdc++-v3/include/std/format
>> >@@ -4234,70 +4234,73 @@ namespace __format
>> >   { _S_get<_Tp>() = __val; }
>> > #endif
>> >
>> >-  template
>> >+  // Returns reference to the _Arg_value member with the type _Tp.
>> >+  // Value of second argument (if provided), is assigned to that 
>> >member.
>> >+  template
>> >   [[__gnu__::__always_inline__]]
>> >   static auto&
>> >-  _S_get(_Self& __u) noexcept
>> >+  _S_access(_Self& __u, _Value... __value) noexcept
>> >   {
>> >+static_assert(sizeof...(_Value) <= 1);
>> > if constexpr (is_same_v<_Tp, bool>)
>> >-  return __u._M_bool;
>> >+  return (__u._M_bool = ... = __value);
>>
>> Oh that's sneaky. Clever, but sneaky.
>
> Took the idea from Ivan patch.
>>
>>
>> > else if constexpr (is_same_v<_Tp, _CharT>)
>> >-  return __u._M_c;
>> >+  return (__u._M_c = ... = __value);
>> > else if constexpr (is_same_v<_Tp, int>)
>> >-  return __u._M_i;
>> >+  return (__u._M_i = ... = __value);
>> > else if constexpr (is_same_v<_Tp, unsigned>)
>> >-  return __u._M_u;
>> >+  return (__u._M_u = ... = __value);
>> > else if constexpr (is_same_v<_Tp, long long>)
>> >-  return __u._M_ll;
>> >+  return (__u._M_ll = ... = __value);
>> > else if constexpr (is_same_v<_Tp, unsigned long long>)
>> >-  return __u._M_ull;
>> >+  return (__u._M_ull = ... = __value);
>> > else if constexpr (is_same_v<_Tp, float>)
>> >-  return __u._M_flt;
>> >+  return (__u._M_flt = ... = __value);
>> > else if constexpr (is_same_v<_Tp, double>)
>> >-  return __u._M_dbl;
>> >+  return (__u._M_dbl = ... = __value);
>> > #ifndef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
>> > else if constexpr (is_same_v<_Tp, long double>)
>> >-  return __u._M_ldbl;
>> >+  return (__u._M_ldbl = ... =

Re: [PATCH] libstdc++: Remove UB in _Arg_value union alteranatives assigment

2026-03-04 Thread Tomasz Kaminski
On Wed, Mar 4, 2026 at 4:11 PM Jonathan Wakely  wrote:

> On Wed, 25 Feb 2026 at 14:10 +0100, Tomasz Kamiński wrote:
> >The _Arg_value::_M_set method, initialized the union member, by
> >assining to reference to that member produced by _M_get(*this).
> >However, per language rules, such assigment has undefined behavior,
> >if alternative was not already active.
>
>
> So doing 'u.mem = val' changes the active member, but returning a
> reference to u.mem from the function and then assigning to that
> reference is undefined, because we can't do 'return u.mem' if it's not
> active? Interesting.
>
Yes, the rules are syntax based. From what I recall, allowing any reference
would lead into an arbitrary function accepting int* p, being able to switch
union lifetime by assigning to *p. I think the exact need here is for the
compiler
to be able to see that this is an assignment for a union member.,


>
> Ah yes, [class.union.general] p5. Could you add that standard
> reference to the commit message, maybe "per the language rules in
> [class.union.general], such an assignment has ..."
>
Will do, however I do not think that p5 is relevant here. Assigning to
object
not withing lifetime is UB, and p5 does not make it non UB.

>
> >To address above, we modify _M_set to use placement new for the class
> >types, and invoke _S_access with two arguments for all other types.
> >The _S_access (rename of _S_get) is modified to assign the value of
> >the second parameter (if provided) to the union memnber. Such direct
> >assigments are treated specially in the language, and will start
>
I will add the reference to  [class.union.general] p5, here, as that were we
get special treatment.

> >lifetime of union alternative of implicit-lifetime type.
> >
> >libstdc++-v3/ChangeLog:
> >
> >   * include/std/format (_Arg_value::_M_get): Rename to...
> >   (_Arg_value::_M_access): Modified to accept optional
> >   second parameter that is assigned to value.
> >   (_Arg_value::_M_get): Handle rename.
> >   (_Arg_value::_M_set): Use construct_at for basic_string_view,
> >   and two-argument _S_access for other types.
> >
> >Signed-off-by: Tomasz Kamiński 
> >Signed-off-by: Ivan Lazaric 
> >Co-authored-by: Ivan Lazaric 
> >---
> >This was detected by constant evaluator, and while it seem to not
> >be harmfull at runtime, we will need to address it anyway.
> >
> >Testing on x86_64-linux. OK for trunk when all test passes?
> >
> > libstdc++-v3/include/std/format | 61 ++---
> > 1 file changed, 33 insertions(+), 28 deletions(-)
> >
> >diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> >index 245ea964675..4f0b0f377c6 100644
> >--- a/libstdc++-v3/include/std/format
> >+++ b/libstdc++-v3/include/std/format
> >@@ -4234,70 +4234,73 @@ namespace __format
> >   { _S_get<_Tp>() = __val; }
> > #endif
> >
> >-  template
> >+  // Returns reference to the _Arg_value member with the type _Tp.
> >+  // Value of second argument (if provided), is assigned to that
> member.
> >+  template
> >   [[__gnu__::__always_inline__]]
> >   static auto&
> >-  _S_get(_Self& __u) noexcept
> >+  _S_access(_Self& __u, _Value... __value) noexcept
> >   {
> >+static_assert(sizeof...(_Value) <= 1);
> > if constexpr (is_same_v<_Tp, bool>)
> >-  return __u._M_bool;
> >+  return (__u._M_bool = ... = __value);
>
> Oh that's sneaky. Clever, but sneaky.
>
Took the idea from Ivan patch.

>
> > else if constexpr (is_same_v<_Tp, _CharT>)
> >-  return __u._M_c;
> >+  return (__u._M_c = ... = __value);
> > else if constexpr (is_same_v<_Tp, int>)
> >-  return __u._M_i;
> >+  return (__u._M_i = ... = __value);
> > else if constexpr (is_same_v<_Tp, unsigned>)
> >-  return __u._M_u;
> >+  return (__u._M_u = ... = __value);
> > else if constexpr (is_same_v<_Tp, long long>)
> >-  return __u._M_ll;
> >+  return (__u._M_ll = ... = __value);
> > else if constexpr (is_same_v<_Tp, unsigned long long>)
> >-  return __u._M_ull;
> >+  return (__u._M_ull = ... = __value);
> > else if constexpr (is_same_v<_Tp, float>)
> >-  return __u._M_flt;
> >+  return (__u._M_flt = ... = __value);
> > else if constexpr (is_same_v<_Tp, double>)
> >-  return __u._M_dbl;
> >+  return (__u._M_dbl = ... = __value);
> > #ifndef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
> > else if constexpr (is_same_v<_Tp, long double>)
> >-  return __u._M_ldbl;
> >+  return (__u._M_ldbl = ... = __value);
> > #else
> > else if constexpr (is_same_v<_Tp, __ibm128>)
> >-  return __u._M_ibm128;
> >+  return (__u._M_ibm128 = ... = __value);
> > else if constexpr (is_same_v<_Tp, __ieee128>)
> >-  return __u._M_ieee128;
> >+  return (__u._M_ieee128 = ... = __value);

Re: [PATCH] libstdc++: Remove UB in _Arg_value union alteranatives assigment

2026-03-04 Thread Jonathan Wakely

On Wed, 25 Feb 2026 at 14:10 +0100, Tomasz Kamiński wrote:

The _Arg_value::_M_set method, initialized the union member, by
assining to reference to that member produced by _M_get(*this).
However, per language rules, such assigment has undefined behavior,
if alternative was not already active.



So doing 'u.mem = val' changes the active member, but returning a
reference to u.mem from the function and then assigning to that
reference is undefined, because we can't do 'return u.mem' if it's not
active? Interesting.

Ah yes, [class.union.general] p5. Could you add that standard
reference to the commit message, maybe "per the language rules in
[class.union.general], such an assignment has ..."



To address above, we modify _M_set to use placement new for the class
types, and invoke _S_access with two arguments for all other types.
The _S_access (rename of _S_get) is modified to assign the value of
the second parameter (if provided) to the union memnber. Such direct
assigments are treated specially in the language, and will start
lifetime of union alternative of implicit-lifetime type.

libstdc++-v3/ChangeLog:

* include/std/format (_Arg_value::_M_get): Rename to...
(_Arg_value::_M_access): Modified to accept optional
second parameter that is assigned to value.
(_Arg_value::_M_get): Handle rename.
(_Arg_value::_M_set): Use construct_at for basic_string_view,
and two-argument _S_access for other types.

Signed-off-by: Tomasz Kamiński 
Signed-off-by: Ivan Lazaric 
Co-authored-by: Ivan Lazaric 
---
This was detected by constant evaluator, and while it seem to not
be harmfull at runtime, we will need to address it anyway.

Testing on x86_64-linux. OK for trunk when all test passes?

libstdc++-v3/include/std/format | 61 ++---
1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 245ea964675..4f0b0f377c6 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -4234,70 +4234,73 @@ namespace __format
{ _S_get<_Tp>() = __val; }
#endif

-  template
+  // Returns reference to the _Arg_value member with the type _Tp.
+  // Value of second argument (if provided), is assigned to that member.
+  template
[[__gnu__::__always_inline__]]
static auto&
-   _S_get(_Self& __u) noexcept
+   _S_access(_Self& __u, _Value... __value) noexcept
{
+ static_assert(sizeof...(_Value) <= 1);
  if constexpr (is_same_v<_Tp, bool>)
-   return __u._M_bool;
+   return (__u._M_bool = ... = __value);


Oh that's sneaky. Clever, but sneaky.


  else if constexpr (is_same_v<_Tp, _CharT>)
-   return __u._M_c;
+   return (__u._M_c = ... = __value);
  else if constexpr (is_same_v<_Tp, int>)
-   return __u._M_i;
+   return (__u._M_i = ... = __value);
  else if constexpr (is_same_v<_Tp, unsigned>)
-   return __u._M_u;
+   return (__u._M_u = ... = __value);
  else if constexpr (is_same_v<_Tp, long long>)
-   return __u._M_ll;
+   return (__u._M_ll = ... = __value);
  else if constexpr (is_same_v<_Tp, unsigned long long>)
-   return __u._M_ull;
+   return (__u._M_ull = ... = __value);
  else if constexpr (is_same_v<_Tp, float>)
-   return __u._M_flt;
+   return (__u._M_flt = ... = __value);
  else if constexpr (is_same_v<_Tp, double>)
-   return __u._M_dbl;
+   return (__u._M_dbl = ... = __value);
#ifndef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
  else if constexpr (is_same_v<_Tp, long double>)
-   return __u._M_ldbl;
+   return (__u._M_ldbl = ... = __value);
#else
  else if constexpr (is_same_v<_Tp, __ibm128>)
-   return __u._M_ibm128;
+   return (__u._M_ibm128 = ... = __value);
  else if constexpr (is_same_v<_Tp, __ieee128>)
-   return __u._M_ieee128;
+   return (__u._M_ieee128 = ... = __value);
#endif
#ifdef __SIZEOF_FLOAT128__
  else if constexpr (is_same_v<_Tp, __float128>)
-   return __u._M_float128;
+   return (__u._M_float128 = ... = __value);
#endif
  else if constexpr (is_same_v<_Tp, const _CharT*>)
-   return __u._M_str;
+   return (__u._M_str = ... = __value);
  else if constexpr (is_same_v<_Tp, basic_string_view<_CharT>>)
-   return __u._M_sv;
+   return (__u._M_sv = ... = __value);
  else if constexpr (is_same_v<_Tp, const void*>)
-   return __u._M_ptr;
+   return (__u._M_ptr = ... = __value);
#ifdef __SIZEOF_INT128__
  else if constexpr (is_same_v<_Tp, __int128>)
-   return __u._M_i128;
+   return (__u._M_i128 = ... = __value);
  else if constexpr (is_same_v<_Tp, unsigned __int128>)
-   return __u._M_u128;
+

Re: [PATCH] libstdc++: Remove UB in _Arg_value union alteranatives assigment

2026-02-25 Thread Tomasz Kaminski
On Wed, Feb 25, 2026 at 2:13 PM Tomasz Kamiński  wrote:

> The _Arg_value::_M_set method, initialized the union member, by
> assining to reference to that member produced by _M_get(*this).
> However, per language rules, such assigment has undefined behavior,
> if alternative was not already active.
>
> To address above, we modify _M_set to use placement new for the class
> types, and invoke _S_access with two arguments for all other types.
> The _S_access (rename of _S_get) is modified to assign the value of
> the second parameter (if provided) to the union memnber. Such direct
> assigments are treated specially in the language, and will start
> lifetime of union alternative of implicit-lifetime type.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/format (_Arg_value::_M_get): Rename to...
> (_Arg_value::_M_access): Modified to accept optional
> second parameter that is assigned to value.
> (_Arg_value::_M_get): Handle rename.
> (_Arg_value::_M_set): Use construct_at for basic_string_view,
> and two-argument _S_access for other types.
>
> Signed-off-by: Tomasz Kamiński 
> Signed-off-by: Ivan Lazaric 
> Co-authored-by: Ivan Lazaric 
> ---
> This was detected by constant evaluator, and while it seem to not
> be harmfull at runtime, we will need to address it anyway.
>
> Testing on x86_64-linux. OK for trunk when all test passes?
>
All test passed.

>
>  libstdc++-v3/include/std/format | 61 ++---
>  1 file changed, 33 insertions(+), 28 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> index 245ea964675..4f0b0f377c6 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -4234,70 +4234,73 @@ namespace __format
> { _S_get<_Tp>() = __val; }
>  #endif
>
> -  template
> +  // Returns reference to the _Arg_value member with the type _Tp.
> +  // Value of second argument (if provided), is assigned to that
> member.
> +  template
> [[__gnu__::__always_inline__]]
> static auto&
> -   _S_get(_Self& __u) noexcept
> +   _S_access(_Self& __u, _Value... __value) noexcept
> {
> + static_assert(sizeof...(_Value) <= 1);
>   if constexpr (is_same_v<_Tp, bool>)
> -   return __u._M_bool;
> +   return (__u._M_bool = ... = __value);
>   else if constexpr (is_same_v<_Tp, _CharT>)
> -   return __u._M_c;
> +   return (__u._M_c = ... = __value);
>   else if constexpr (is_same_v<_Tp, int>)
> -   return __u._M_i;
> +   return (__u._M_i = ... = __value);
>   else if constexpr (is_same_v<_Tp, unsigned>)
> -   return __u._M_u;
> +   return (__u._M_u = ... = __value);
>   else if constexpr (is_same_v<_Tp, long long>)
> -   return __u._M_ll;
> +   return (__u._M_ll = ... = __value);
>   else if constexpr (is_same_v<_Tp, unsigned long long>)
> -   return __u._M_ull;
> +   return (__u._M_ull = ... = __value);
>   else if constexpr (is_same_v<_Tp, float>)
> -   return __u._M_flt;
> +   return (__u._M_flt = ... = __value);
>   else if constexpr (is_same_v<_Tp, double>)
> -   return __u._M_dbl;
> +   return (__u._M_dbl = ... = __value);
>  #ifndef _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
>   else if constexpr (is_same_v<_Tp, long double>)
> -   return __u._M_ldbl;
> +   return (__u._M_ldbl = ... = __value);
>  #else
>   else if constexpr (is_same_v<_Tp, __ibm128>)
> -   return __u._M_ibm128;
> +   return (__u._M_ibm128 = ... = __value);
>   else if constexpr (is_same_v<_Tp, __ieee128>)
> -   return __u._M_ieee128;
> +   return (__u._M_ieee128 = ... = __value);
>  #endif
>  #ifdef __SIZEOF_FLOAT128__
>   else if constexpr (is_same_v<_Tp, __float128>)
> -   return __u._M_float128;
> +   return (__u._M_float128 = ... = __value);
>  #endif
>   else if constexpr (is_same_v<_Tp, const _CharT*>)
> -   return __u._M_str;
> +   return (__u._M_str = ... = __value);
>   else if constexpr (is_same_v<_Tp, basic_string_view<_CharT>>)
> -   return __u._M_sv;
> +   return (__u._M_sv = ... = __value);
>   else if constexpr (is_same_v<_Tp, const void*>)
> -   return __u._M_ptr;
> +   return (__u._M_ptr = ... = __value);
>  #ifdef __SIZEOF_INT128__
>   else if constexpr (is_same_v<_Tp, __int128>)
> -   return __u._M_i128;
> +   return (__u._M_i128 = ... = __value);
>   else if constexpr (is_same_v<_Tp, unsigned __int128>)
> -   return __u._M_u128;
> +   return (__u._M_u128 = ... = __value);
>  #endif
>  #ifdef __BFLT16_DIG__
>   else if constexpr (is_same_v<_Tp, __bflt16_t>)
> -   return __u._M_bf16;
> +   return (__u._M_bf16 =