[llvm-branch-commits] [llvm] release/19.x: [Windows SEH] Fix crash on empty seh block (#107031) (PR #107466)

2024-09-10 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 approved this pull request.

This should be low risk. If the condition holds, the code would previously
dereference an invalid iterator, and either crash immediately thanks to
assertions or use whatever junk is in memory. Now it treats that case the same
as an immediate terminator, i.e. a block with no non-terminator instructions.
So it turns definitely wrong, likely crashing, behaviour into something that's
believed correct, and it affects nothing other than the case where the
iterator was invalid. The only risk I can personally see is that, if the new
behaviour is incorrect, you've gone from likely crashing to definitely
miscompiling, but this seems unlikely given the intent of the surrounding code,
and it would only affect Windows SEH data in this corner case. The other
argument against it would be that it's been broken since the release of LLVM 17
rather than being a regression in LLVM 19, though of course whether you hit it
for real-world code depends on the exact optimisations performed, and it could
well be that those prior versions didn't (I don't know/recall if the reporter
tried compiling the original code in question with 17 or 18).
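
The shape of the fix described above can be illustrated with a small,
self-contained sketch. This is not the LLVM code from the patch: `Inst`,
`Block` and `leadingNonTerminator` are hypothetical stand-ins, and a
`std::vector` plays the role of the instruction list. It only shows an
end-iterator check that folds the empty case into the "immediate terminator"
path instead of dereferencing an invalid iterator.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for an instruction and a basic block.
struct Inst {
  std::string Name;
  bool IsTerminator;
};
using Block = std::vector<Inst>;

// Returns the name of the leading non-terminator instruction, or nullopt when
// the block is empty or starts with a terminator.
std::optional<std::string> leadingNonTerminator(const Block &B) {
  auto It = B.begin();
  // For an empty block begin() == end(), so dereferencing It would be
  // undefined behaviour. Checking first treats that case exactly like a
  // block whose first instruction is a terminator.
  if (It == B.end() || It->IsTerminator)
    return std::nullopt;
  return It->Name;
}
```

With this guard, an empty block and a block containing only a terminator give
the same answer, which is the behaviour change the comment describes.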

https://github.com/llvm/llvm-project/pull/107466
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [clang] Make LazyOffsetPtr more portable (#112927) (PR #113052)

2024-10-28 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

As the author of the patch, that seems sensible to me, and I'm not aware
of any regressions from it in main.

https://github.com/llvm/llvm-project/pull/113052


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -361,6 +414,16 @@ class DataLayout {
     return PTy && isNonIntegralPointerType(PTy);
   }
 
+  bool shouldAvoidPtrToInt(Type *Ty) const {
+    auto *PTy = dyn_cast<PointerType>(Ty);
+    return PTy && shouldAvoidPtrToInt(PTy->getPointerAddressSpace());

jrtc27 wrote:

It seems odd to ask about ptrtoint for something where you don't know it's a 
pointer already, but I guess this is to match isNonIntegralPointerType which 
seems to have a decent number of uses.
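
For readers unfamiliar with the pattern being discussed, here is a minimal
standalone sketch of a type-level predicate that delegates to an
address-space-level one and answers "no" for non-pointers. `Type` and `Layout`
below are hypothetical mock-ups, not the LLVM classes; only the overload
structure mirrors the quoted hunk.

```cpp
#include <optional>
#include <set>
#include <utility>

// Hypothetical mock type: holds an address space only if it is a pointer.
struct Type {
  std::optional<unsigned> PointerAddrSpace;
};

// Hypothetical mock of a data layout that records which address spaces
// should avoid ptrtoint.
class Layout {
  std::set<unsigned> AvoidSpaces;

public:
  explicit Layout(std::set<unsigned> Spaces) : AvoidSpaces(std::move(Spaces)) {}

  // Address-space-level query.
  bool shouldAvoidPtrToInt(unsigned AddrSpace) const {
    return AvoidSpaces.count(AddrSpace) != 0;
  }

  // Type-level convenience query: false for anything that is not a pointer,
  // otherwise defers to the address-space overload.
  bool shouldAvoidPtrToInt(const Type &Ty) const {
    return Ty.PointerAddrSpace && shouldAvoidPtrToInt(*Ty.PointerAddrSpace);
  }
};
```

The type-level overload lets callers skip a separate "is this a pointer?"
check, at the cost of accepting questions that only make sense for pointers.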

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used for with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation

jrtc27 wrote:

Is non-integral the right term for something that is _more than_ just an 
integer?

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used for with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^
+
+Pointers are not represented as an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset or CHERI capabilities that contain bounds, permissions and an
+out-of-band validity bit. In general, these pointers cannot be re-created
+from just an integer value.
+
+In most cases pointers with a non-integral representation behave exactly the
+same as an integral pointer, the only difference is that it is not possible to
+create a pointer just from an address.
+
+"Non-integral" pointers also impose restrictions on the optimizer, but in
+general these are less restrictive than for "unstable" pointers. The main
+difference compared to integral pointers is that ``inttoptr`` instructions
+should not be inserted by passes as they may not be able to create a valid
+pointer. This property also means that ``inttoptr(ptrtoint(x))`` cannot be
+folded to ``x`` as the ``ptrtoint`` operation may destroy the necessary metadata
+to reconstruct the pointer.
+Additiona
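
To make the `inttoptr(ptrtoint(x))` restriction in the quoted text concrete,
here is a toy model of a pointer that carries metadata alongside its address.
It is only a sketch under stated assumptions: `Capability`, `ptrToInt`,
`intToPtr` and `canAccess` are hypothetical names, and the layout is not that
of CHERI or AMDGPU. It shows why the round trip cannot be folded back to the
original pointer.

```cpp
#include <cstdint>

// Hypothetical fat pointer: an address plus bounds and a validity bit.
struct Capability {
  std::uint64_t Address;
  std::uint64_t Base;
  std::uint64_t Length;
  bool Valid;
};

// Models ptrtoint: only the address survives, the metadata is dropped.
std::uint64_t ptrToInt(const Capability &C) { return C.Address; }

// Models inttoptr: there is no metadata to recover, so the result is invalid.
Capability intToPtr(std::uint64_t Address) {
  return Capability{Address, 0, 0, false};
}

// A pointer is usable only if it is valid and its address is within bounds.
bool canAccess(const Capability &C) {
  return C.Valid && C.Address >= C.Base && C.Address - C.Base < C.Length;
}
```

In this model the round-tripped value has the same address as the original but
can no longer be used for access, so replacing it with `x` would change
behaviour.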

[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used for with copying garbage collection where the garbage collector could

jrtc27 wrote:

```suggestion
type used with copying garbage collection where the garbage collector could
```

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -419,9 +420,24 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
 
   // Address space. Optional, defaults to 0.
   unsigned AddrSpace = 0;
-  if (!Components[0].empty())
-    if (Error Err = parseAddrSpace(Components[0], AddrSpace))
+  bool UnstableRepr = false;
+  bool NonIntegralRepr = false;
+  StringRef AddrSpaceStr = Components[0].drop_while([&](char C) {
+    if (C == 'n') {
+      NonIntegralRepr = true;
+      return true;
+    } else if (C == 'u') {
+      UnstableRepr = true;
+      return true;
+    }
+    return false;
+  });
+  if (!AddrSpaceStr.empty()) {
+    if (Error Err = parseAddrSpace(AddrSpaceStr, AddrSpace))
       return Err;
+  }
+  if (AddrSpace == 0 && (NonIntegralRepr || UnstableRepr))
+    return createStringError("address space 0 cannot be non-integral");

jrtc27 wrote:

The check is for non-integral or unstable, but this only mentions the former
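
As an illustration of the mismatch, the following standalone sketch strips the
flag prefix much like the quoted hunk does and uses a diagnostic that names
both properties. It uses only the standard library rather than LLVM's
`StringRef` and `Error`; `PointerFlags`, `splitPointerFlags`,
`addressSpaceZeroError` and the message wording are hypothetical, offered as
one possible phrasing rather than what the patch should adopt.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result of stripping the leading flag characters.
struct PointerFlags {
  bool NonIntegral = false;
  bool Unstable = false;
  std::string_view Rest; // what remains, i.e. the address-space digits
};

// Consumes leading 'n' / 'u' characters and records which flags were seen.
PointerFlags splitPointerFlags(std::string_view Spec) {
  PointerFlags F;
  std::size_t I = 0;
  for (; I < Spec.size(); ++I) {
    if (Spec[I] == 'n')
      F.NonIntegral = true;
    else if (Spec[I] == 'u')
      F.Unstable = true;
    else
      break;
  }
  F.Rest = Spec.substr(I);
  return F;
}

// Returns an empty string when the combination is acceptable, otherwise a
// diagnostic that covers both properties the condition tests.
std::string addressSpaceZeroError(const PointerFlags &F, unsigned AddrSpace) {
  if (AddrSpace == 0 && (F.NonIntegral || F.Unstable))
    return "address space 0 cannot be non-integral or unstable";
  return "";
}
```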

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the

jrtc27 wrote:

```suggestion
The exact implications of these properties are target-specific, but the
```

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -342,14 +346,63 @@ class DataLayout {
   SmallVector getNonIntegralAddressSpaces() const {

jrtc27 wrote:

This name seems stale given it's including unstable pointers

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -3082,16 +3129,21 @@ as follows:
 ``A``
     Specifies the address space of objects created by '``alloca``'.
     Defaults to the default address space of 0.
-``p[n]:<size>:<abi>[:<pref>][:<idx>]``
+``p[][]:<size>:<abi>[:<pref>][:<idx>]``
     This specifies the *size* of a pointer and its ``<abi>`` and
     ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
     and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
     index that used for address calculation, which must be less than or equal
     to the pointer size. If not
     specified, the default index size is equal to the pointer size. All sizes
-    are in bits. The address space, ``n``, is optional, and if not specified,
-    denotes the default address space 0. The value of ``n`` must be
-    in the range [1,2^24).
+    are in bits. The , is optional, and if not specified,
+    denotes the default address space 0. The value of  must
+    be in the range [1,2^24).
+    The optional are used to specify properties of pointers in this

jrtc27 wrote:

Is it legal to have the same flag appear multiple times?
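
If repeated flags were to be rejected, a stricter parse could look roughly
like the sketch below. This is a hypothetical standalone illustration, not
code from the PR: `Flags` and `parseFlagsStrict` are made-up names, and it
assumes the only flag characters are `n` and `u`.

```cpp
#include <optional>
#include <string_view>

// Hypothetical flag set for a pointer specification.
struct Flags {
  bool NonIntegral = false;
  bool Unstable = false;
};

// Parses a flag prefix consisting only of 'n' and 'u'. Returns nullopt when a
// character is unknown or when the same flag appears more than once.
std::optional<Flags> parseFlagsStrict(std::string_view Prefix) {
  Flags F;
  for (char C : Prefix) {
    bool *Slot = C == 'n'   ? &F.NonIntegral
                 : C == 'u' ? &F.Unstable
                            : nullptr;
    if (!Slot || *Slot)
      return std::nullopt; // unknown flag, or flag already set
    *Slot = true;
  }
  return F;
}
```

Under this scheme `nu` and `un` are accepted while `nn` is rejected; whether
that is the desired rule is exactly the open question.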

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* 
bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a 
pointer
+type used for with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^
+
+Pointers are not represented as an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset or CHERI capabilities that contain bounds, permissions and an
+out-of-band validity bit. In general, these pointers cannot be re-created
+from just an integer value.
+
+In most cases pointers with a non-integral representation behave exactly the
+same as an integral pointer, the only difference is that it is not possible to
+create a pointer just from an address.
+
+"Non-integral" pointers also impose restrictions on the optimizer, but in
+general these are less restrictive than for "unstable" pointers. The main
+difference compared to integral pointers is that ``inttoptr`` instructions
+should not be inserted by passes as they may not be able to create a valid
+pointer. This property also means that ``inttoptr(ptrtoint(x))`` cannot be
+folded to ``x`` as the ``ptrtoint`` operation may destroy the necessary 
metadata
+to reconstruct the pointer.
+Additiona

[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--
+Non-Integral and Unstable Pointer Types
+---
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* 
bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+These exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a 
pointer
+type used for with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers are not represented as an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset or CHERI capabilities that contain bounds, permissions and an
+out-of-band validity bit. In general, these pointers cannot be re-created
+from just an integer value.
+
+In most cases pointers with a non-integral representation behave exactly the
+same as integral pointers; the only difference is that it is not possible to
+create a pointer just from an address.
+
+"Non-integral" pointers also impose restrictions on the optimizer, but in
+general these are less restrictive than for "unstable" pointers. The main
+difference compared to integral pointers is that ``inttoptr`` instructions
+should not be inserted by passes as they may not be able to create a valid
+pointer. This property also means that ``inttoptr(ptrtoint(x))`` cannot be
+folded to ``x`` as the ``ptrtoint`` operation may destroy the necessary metadata
+to reconstruct the pointer.
+Additiona

[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--------------------------
+Non-Integral and Unstable Pointer Types
+---------------------------------------
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+The exact implications of these properties are target-specific, but the
+following IR semantics and restrictions on optimization passes apply:
+
+Unstable pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used for copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers are not represented as an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset or CHERI capabilities that contain bounds, permissions and an
+out-of-band validity bit. In general, these pointers cannot be re-created
+from just an integer value.

jrtc27 wrote:

At least with CHERI one can turn an integer into a pointer, it's just not a 
valid pointer (i.e. things like `#define SIG_IGN ((__sighandler_t *)1)` work, 
the pointer just can't be used as anything other than a sentinel to pass around 
or compare against). Is that something to discuss here (/ is it also true for 
AMDGPU's buffer descriptors)?
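
The sentinel pattern described above can be sketched in C++ as follows. This is only an illustration and not CHERI-specific code: `Handler`, `IgnoreSentinel`, `realHandler` and `isIgnored` are hypothetical names, and the integer-to-function-pointer cast is only conditionally supported by the C++ standard, although mainstream compilers accept it. The point is that the pointer built from the integer `1` is only ever passed around and compared, never called or dereferenced.

```cpp
#include <cstdint>

// Hypothetical handler type, standing in for __sighandler_t.
using Handler = void(int);

// A pointer manufactured from a plain integer. It holds an address-like
// value but is not a usable pointer; it only serves as a sentinel.
static Handler *const IgnoreSentinel =
    reinterpret_cast<Handler *>(static_cast<std::uintptr_t>(1));

// An ordinary handler, for comparison.
void realHandler(int) {}

// Only compares against the sentinel; never calls through it.
bool isIgnored(Handler *H) { return H == IgnoreSentinel; }
```

On a target with non-integral pointers, such a value would presumably still compare equal to itself, while any attempt to call or dereference it would be invalid.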

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -342,14 +346,63 @@ class DataLayout {
   SmallVector<unsigned, 8> getNonIntegralAddressSpaces() const {
     SmallVector<unsigned, 8> AddrSpaces;
     for (const PointerSpec &PS : PointerSpecs) {
-      if (PS.IsNonIntegral)
+      if (PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation)
         AddrSpaces.push_back(PS.AddrSpace);
     }
     return AddrSpaces;
   }
 
+  /// Returns whether this address space is "non-integral" and "unstable".
+  /// This means that passes should not introduce inttoptr or ptrtoint
+  /// instructions operating on pointers of this address space.
+  /// TODO: remove this function after migrating to finer-grained properties.
   bool isNonIntegralAddressSpace(unsigned AddrSpace) const {
-    return getPointerSpec(AddrSpace).IsNonIntegral;
+    const PointerSpec &PS = getPointerSpec(AddrSpace);
+    return PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation;
+  }
+
+  /// Returns whether this address space has an "unstable" pointer
+  /// representation. The bitwise pattern of such pointers is allowed to change
+  /// in a target-specific way. For example, this could be used for copying
+  /// garbage collection where the garbage collector could update the pointer
+  /// value as part of the collection sweep.
+  bool hasUnstableRepresentation(unsigned AddrSpace) const {
+    return getPointerSpec(AddrSpace).HasUnstableRepresentation;
+  }
+
+  /// Returns whether this address space has a non-integral pointer
+  /// representation, i.e. the pointer is not just an integer address but some
+  /// other bitwise representation. Examples include AMDGPU buffer descriptors
+  /// with a 128-bit fat pointer and a 32-bit offset or CHERI capabilities that
+  /// contain bounds, permissions and an out-of-band validity bit. In general,
+  /// these pointers cannot be re-created from just an integer value.
+  bool hasNonIntegralRepresentation(unsigned AddrSpace) const {
+    return getPointerSpec(AddrSpace).HasNonIntegralRepresentation;
+  }
+
+  /// Returns whether passes should avoid introducing `inttoptr` instructions
+  /// for this address space.
+  ///
+  /// This is currently the case for "non-integral" pointer representations
+  /// (hasNonIntegralRepresentation()) since such pointers generally require
+  /// additional metadata beyond just an address.
+  /// New `inttoptr` instructions should also be avoided for "unstable" bitwise
+  /// representations (hasUnstableRepresentation()) unless the pass knows it is
+  /// within a critical section that retains the current representation.
+  bool shouldAvoidIntToPtr(unsigned AddrSpace) const {
+    const PointerSpec &PS = getPointerSpec(AddrSpace);
+    return PS.HasNonIntegralRepresentation || PS.HasUnstableRepresentation;

jrtc27 wrote:

Use the helpers? This is the only one that doesn't (other than the deprecated 
isNonIntegralAddressSpace).
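
A sketch of what using the helpers could look like, assuming a stripped-down stand-in for `DataLayout`. The class `MiniDataLayout`, its `setSpec` method and the `std::map` storage are hypothetical and exist only to make the example self-contained; the helper and field names are taken from the quoted patch.

```cpp
#include <map>

// Reduced model of the per-address-space flags from the quoted patch.
struct PointerSpec {
  bool HasNonIntegralRepresentation = false;
  bool HasUnstableRepresentation = false;
};

// Hypothetical stand-in for llvm::DataLayout, not the real class.
class MiniDataLayout {
  std::map<unsigned, PointerSpec> Specs;
  PointerSpec Default; // used for address spaces without an explicit spec

  const PointerSpec &getPointerSpec(unsigned AddrSpace) const {
    auto It = Specs.find(AddrSpace);
    return It == Specs.end() ? Default : It->second;
  }

public:
  void setSpec(unsigned AddrSpace, bool NonIntegral, bool Unstable) {
    Specs[AddrSpace] = PointerSpec{NonIntegral, Unstable};
  }

  bool hasUnstableRepresentation(unsigned AddrSpace) const {
    return getPointerSpec(AddrSpace).HasUnstableRepresentation;
  }

  bool hasNonIntegralRepresentation(unsigned AddrSpace) const {
    return getPointerSpec(AddrSpace).HasNonIntegralRepresentation;
  }

  // Expressed via the two helpers rather than by reading the fields again.
  bool shouldAvoidIntToPtr(unsigned AddrSpace) const {
    return hasNonIntegralRepresentation(AddrSpace) ||
           hasUnstableRepresentation(AddrSpace);
  }
};
```

Going through the helpers keeps the definition of each property in one place, so a later change to how a property is stored would not need to touch `shouldAvoidIntToPtr`.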

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits


@@ -649,48 +649,95 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--------------------------
+Non-Integral and Unstable Pointer Types
+---------------------------------------
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "non-integral" or "unstable" (or both "non-integral" and "unstable")
+via the :ref:`datalayout string`.
+
+The exact implications of these properties are target-specific, but the
+following IR semantics and restrictions on optimization passes apply:
+
+Unstable pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used for copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers are not represented as an address, but may instead include

jrtc27 wrote:

```suggestion
Pointers are not represented as just an address, but may instead include
```

https://github.com/llvm/llvm-project/pull/105735


[llvm-branch-commits] [TableGen] Remove a pointless check for iPTRAny (PR #113732)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/113732

We've already called EnforceInteger on Types[0], and iPTRAny isn't
regarded as an integer type (note that TableGen special-cases iPTR here
to include that, though), so we cannot possibly still have an iPTRAny by
this point. Delete the check, and let getFixedSizeInBits catch it along
with all the other overloaded types if that ever becomes false. Also
document why we have this check whilst here.





[llvm-branch-commits] [TableGen] Remove a pointless check for iPTRAny (PR #113732)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/113732




[llvm-branch-commits] [CodeGen] Rename MVT::iPTRAny to MVT::pAny (PR #113733)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/113733




[llvm-branch-commits] [CodeGen] Rename MVT::iPTRAny to MVT::pAny (PR #113733)

2024-10-25 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/113733

Whilst in upstream LLVM iPTRAny is only ever an integer, essentially an
alias for iPTR, this is not true in CHERI LLVM, where it gets used to
mean "iPTR or cPTR", i.e. either an integer address or a capability
(with cPTR and cN being the capability equivalents of iPTR and iN).
Moreover, iPTRAny is already not itself regarded as an integer (calling
isInteger() will give false), so the "i" prefix is misleading, and it
stands out as different from all the other xAny that have a single
letter prefix denoting their type.

Thus, rename it to pAny, reflecting that it is an overloaded pointer
type, which could end up being specialised to an integer type, but does
not have to be.

This has been verified to have no effect on the generated files for LLVM
itself or any in-tree target beyond the replacement of the identifier
iPTRAny with pAny in GenVT.inc.
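
The naming point can be illustrated with a toy model. This is only a sketch:
`ToyVT`, `isOverloaded` and `isInteger` below are hypothetical stand-ins and
are not LLVM's actual MVT definitions. It shows an overloaded pointer type that
is not itself classified as an integer, which is why a "p" prefix describes it
better than an "i" prefix.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a handful of value types.
// This is NOT LLVM's real MVT enum.
enum class ToyVT { i32, i64, iAny, fAny, vAny, pAny };

// Overloaded ("Any") types are placeholders that are specialised later.
constexpr bool isOverloaded(ToyVT vt) {
  return vt == ToyVT::iAny || vt == ToyVT::fAny || vt == ToyVT::vAny ||
         vt == ToyVT::pAny;
}

// In this toy model only concrete integer types count as integers, so the
// overloaded pointer placeholder reports false here.
constexpr bool isInteger(ToyVT vt) {
  return vt == ToyVT::i32 || vt == ToyVT::i64;
}
```

In this model `isInteger(ToyVT::pAny)` is false while
`isOverloaded(ToyVT::pAny)` is true, mirroring the argument above.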





[llvm-branch-commits] [RISCV] Replace @plt/@gotpcrel in data directives with %plt %gotpcrel (PR #132569)

2025-03-22 Thread Jessica Clarke via llvm-branch-commits


@@ -18,6 +18,6 @@
 .globl _start
 _start:
 .data
-  .word foo@PLT - .
-  .word foo@PLT - . + 1
-  .word foo@PLT - . - 1
+  .word %plt(foo - .)

jrtc27 wrote:

We have %(got_)pcrel_hi and now %gotpcrel, what's so different about this one?

https://github.com/llvm/llvm-project/pull/132569


[llvm-branch-commits] [RISCV] Replace @plt/@gotpcrel in data directives with %pltpcrel %gotpcrel (PR #132569)

2025-03-25 Thread Jessica Clarke via llvm-branch-commits


@@ -18,6 +18,6 @@
 .globl _start
 _start:
 .data
-  .word foo@PLT - .
-  .word foo@PLT - . + 1
-  .word foo@PLT - . - 1
+  .word %plt(foo - .)

jrtc27 wrote:

I've not looked at the implementation in detail, but thank you for taking the
time to do so. I know from experience it can be quite painful getting the MC
code to do things that weren't previously expected of it.

https://github.com/llvm/llvm-project/pull/132569


[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

The canonical PLT is for
```
int main(void) {
  return compare(&ifp);
}
```
in code models where the address is computed inline (absolute or PC-relative) 
rather than as an indirect load (whether from a global or a GOT entry).

I like to think of canonical PLTs as the function version of copy relocations.
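
The property at stake is function pointer equality, which can be sketched as
follows. This is a minimal illustration under assumptions, not the linker's
behaviour itself: `ifp` here is an ordinary function standing in for the IFUNC
symbol, and `addressSeenByLibrary` is a hypothetical helper representing code
that obtains the address some other way (e.g. via a GOT load).

```cpp
#include <cassert>

// Hypothetical stand-in for the IFUNC symbol `ifp`; a plain function here.
int ifp() { return 42; }

using Fn = int (*)();

// Address as obtained "elsewhere", e.g. by code that loads it indirectly.
Fn addressSeenByLibrary() { return &ifp; }

// Mirrors the `compare(&ifp)` call above: the address computed inline by the
// caller must equal the address every other part of the program observes.
bool compare(Fn fromCaller) { return fromCaller == addressSeenByLibrary(); }
```

When the caller computes the address inline, the linker has to pick one
address that everybody agrees on, and the canonical PLT entry plays that role.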

https://github.com/llvm/llvm-project/pull/133531


[llvm-branch-commits] [RISCV] Replace @plt/@gotpcrel in data directives with %plt %gotpcrel (PR #132569)

2025-03-22 Thread Jessica Clarke via llvm-branch-commits


@@ -18,6 +18,6 @@
 .globl _start
 _start:
 .data
-  .word foo@PLT - .
-  .word foo@PLT - . + 1
-  .word foo@PLT - . - 1
+  .word %plt(foo - .)

jrtc27 wrote:

Yeah, I know, but it's pretty weird and confusing syntax. It's not really 
written that way because it makes sense, it's just written that way because it 
aligns with how implementations think about it.

Perhaps %pltpcrel, to mirror %gotpcrel, would be the right thing to do here 
that sidesteps the issue?

https://github.com/llvm/llvm-project/pull/132569


[llvm-branch-commits] [RISCV] Replace @plt/@gotpcrel in data directives with %plt %gotpcrel (PR #132569)

2025-03-22 Thread Jessica Clarke via llvm-branch-commits


@@ -18,6 +18,6 @@
 .globl _start
 _start:
 .data
-  .word foo@PLT - .
-  .word foo@PLT - . + 1
-  .word foo@PLT - . - 1
+  .word %plt(foo - .)

jrtc27 wrote:

Would `%plt(foo) - .` not be the saner syntax? PLT of an offset is a bit 
nonsensical...

https://github.com/llvm/llvm-project/pull/132569


[llvm-branch-commits] [RISCV] Replace @plt/@gotpcrel in data directives with %plt %gotpcrel (PR #132569)

2025-03-23 Thread Jessica Clarke via llvm-branch-commits


@@ -18,6 +18,6 @@
 .globl _start
 _start:
 .data
-  .word foo@PLT - .
-  .word foo@PLT - . + 1
-  .word foo@PLT - . - 1
+  .word %plt(foo - .)

jrtc27 wrote:

Well my overarching point would be that user-facing syntax should not be 
beholden to arbitrary historic implementation choices. If it's truly impossible 
to make it work then that's one thing, but I doubt that to be the case.

https://github.com/llvm/llvm-project/pull/132569


[llvm-branch-commits] [llvm] [IR] Introduce the `ptrtoaddr` instruction (PR #139357)

2025-06-20 Thread Jessica Clarke via llvm-branch-commits


@@ -4274,6 +4274,7 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
   case lltok::kw_bitcast:
   case lltok::kw_addrspacecast:
   case lltok::kw_inttoptr:
+  // ptrtoaddr not supported in constant exprs (yet?).

jrtc27 wrote:

(i.e. it's a TODO if not implemented here, not a question of whether it should 
be supported, IMO)

https://github.com/llvm/llvm-project/pull/139357


[llvm-branch-commits] [llvm] [IR] Introduce the `ptrtoaddr` instruction (PR #139357)

2025-06-20 Thread Jessica Clarke via llvm-branch-commits


@@ -4274,6 +4274,7 @@ bool LLParser::parseValID(ValID &ID, PerFunctionState *PFS, Type *ExpectedTy) {
   case lltok::kw_bitcast:
   case lltok::kw_addrspacecast:
   case lltok::kw_inttoptr:
+  // ptrtoaddr not supported in constant exprs (yet?).

jrtc27 wrote:

That's something that is needed; we support that on CHERI.

https://github.com/llvm/llvm-project/pull/139357


[llvm-branch-commits] [IRTranslator] Handle ptrtoaddr (PR #139601)

2025-06-09 Thread Jessica Clarke via llvm-branch-commits


@@ -1583,6 +1583,26 @@ bool IRTranslator::translateCast(unsigned Opcode, const User &U,
   return true;
 }
 
+bool IRTranslator::translatePtrToAddr(const User &U,
+  MachineIRBuilder &MIRBuilder) {
+  if (containsBF16Type(U))
+return false;
+
+  uint32_t Flags = 0;
+  if (const Instruction *I = dyn_cast<Instruction>(&U))
+Flags = MachineInstr::copyFlagsFromInstruction(*I);
+
+  Register Op = getOrCreateVReg(*U.getOperand(0));
+  Type *PtrTy = U.getOperand(0)->getType();
+  LLT AddrTy = getLLTForType(*DL->getIndexType(PtrTy), *DL);
+  auto IntPtrTy = getLLTForType(*DL->getIntPtrType(PtrTy), *DL);
+  auto PtrToInt = MIRBuilder.buildPtrToInt(IntPtrTy, Op);

jrtc27 wrote:

We'd need a G_PTRTOADDR for CHERI, given that we can't do ptrtoint as it's
defined upstream.

https://github.com/llvm/llvm-project/pull/139601


[llvm-branch-commits] [llvm] [DataLayout][LangRef] Split non-integral and unstable pointer properties (PR #105735)

2025-07-22 Thread Jessica Clarke via llvm-branch-commits


@@ -650,48 +650,136 @@ literal types are uniqued in recent versions of LLVM.
 
 .. _nointptrtype:
 
-Non-Integral Pointer Type
--------------------------
+Non-Integral and Unstable Pointer Types
+---------------------------------------
 
-Note: non-integral pointer types are a work in progress, and they should be
-considered experimental at this time.
+Note: non-integral/unstable pointer types are a work in progress, and they
+should be considered experimental at this time.
 
 LLVM IR optionally allows the frontend to denote pointers in certain address
-spaces as "non-integral" via the :ref:`datalayout string`.
-Non-integral pointer types represent pointers that have an *unspecified* bitwise
-representation; that is, the integral representation may be target dependent or
-unstable (not backed by a fixed integer).
+spaces as "unstable", "non-integral", or "non-integral with external state"
+(or combinations of these) via the :ref:`datalayout string`.
+
+The exact implications of these properties are target-specific, but the
+following IR semantics and restrictions to optimization passes apply:
+
+Unstable pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers in this address space have an *unspecified* bitwise representation
+(i.e. not backed by a fixed integer). The bitwise pattern of such pointers is
+allowed to change in a target-specific way. For example, this could be a pointer
+type used with copying garbage collection where the garbage collector could
+update the pointer at any time in the collection sweep.
 
 ``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
 integral (i.e. normal) pointers in that they convert integers to and from
-corresponding pointer types, but there are additional implications to be
-aware of.  Because the bit-representation of a non-integral pointer may
-not be stable, two identical casts of the same operand may or may not
+corresponding pointer types, but there are additional implications to be aware
+of.
+
+For "unstable" pointer representations, the bit-representation of the pointer
+may not be stable, so two identical casts of the same operand may or may not
 return the same value.  Said differently, the conversion to or from the
-non-integral type depends on environmental state in an implementation
+"unstable" pointer type depends on environmental state in an implementation
 defined manner.
-
 If the frontend wishes to observe a *particular* value following a cast, the
 generated IR must fence with the underlying environment in an implementation
 defined manner. (In practice, this tends to require ``noinline`` routines for
 such operations.)
 
 From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
-non-integral types are analogous to ones on integral types with one
+"unstable" pointer types are analogous to ones on integral types with one
 key exception: the optimizer may not, in general, insert new dynamic
 occurrences of such casts.  If a new cast is inserted, the optimizer would
 need to either ensure that a) all possible values are valid, or b)
 appropriate fencing is inserted.  Since the appropriate fencing is
 implementation defined, the optimizer can't do the latter.  The former is
 challenging as many commonly expected properties, such as
-``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
+``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for "unstable" pointer types.
 Similar restrictions apply to intrinsics that might examine the pointer bits,
 such as :ref:`llvm.ptrmask`.
 
-The alignment information provided by the frontend for a non-integral pointer
+The alignment information provided by the frontend for an "unstable" pointer
 (typically using attributes or metadata) must be valid for every possible
 representation of the pointer.
 
+Non-integral pointer representation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Pointers are not represented as just an address, but may instead include
+additional metadata such as bounds information or a temporal identifier.
+Examples include AMDGPU buffer descriptors with a 128-bit fat pointer and a
+32-bit offset, or CHERI capabilities that contain bounds, permissions and a
+type field (as well as an out-of-band validity bit, see next section).
+
+In most cases pointers with a non-integral representation behave exactly the
+same as an integral pointer, the only difference is that it is not possible to
+create a pointer just from an address unless all the metadata bits were
+also recreated correctly.
+
+"Non-integral" pointers also impose restrictions on transformation passes, but
+in general these are less restrictive than for "unstable" pointers. The main
+difference compared to integral pointers is that the address width of a
+non-integral pointer is not equal to the bitwise representation, so extracting
+the address needs to truncate to the index width of the pointer.
+
+Note: Currently all supported targets require that truncating the `

[llvm-branch-commits] [NFC][ELF] Replace DynamicReloc::Kind with the equivalent bool in APIs (PR #150813)

2025-07-27 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> Thanks for cleaning up my ugly https://reviews.llvm.org/D100490 in the 
> previous commits :)
> 
> I think having names for the boolean parameters makes the code calling this 
> function easier to read than a magic true/false. So I'd have a slight 
> preference for keeping an enum parameter but I'll leave that decision to 
> @MaskRay.

There's always /*isAgainstSymbol=*/ if you want to name it, I suppose. The 
existence of the enum just encourages people to do things like MIPS and Morello 
have both done in the past, so I want to make sure that everything ends up 
being done via RelExpr instead.
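
For illustration, the comment style in question looks roughly like this. It is
a sketch only: `ToyReloc` and `makeReloc` are hypothetical and much simpler
than lld's real DynamicReloc API.

```cpp
#include <cstdint>

// Hypothetical, simplified record; not lld's real DynamicReloc.
struct ToyReloc {
  uint32_t type;
  bool isAgainstSymbol;
  int64_t addend;
};

// Hypothetical helper that takes a plain bool rather than an enum.
ToyReloc makeReloc(uint32_t type, bool isAgainstSymbol, int64_t addend) {
  return ToyReloc{type, isAgainstSymbol, addend};
}

ToyReloc example() {
  // The inline comments name each argument at the call site, so the bare
  // `true` is no longer a magic value to the reader.
  return makeReloc(/*type=*/7, /*isAgainstSymbol=*/true, /*addend=*/0);
}
```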

https://github.com/llvm/llvm-project/pull/150813


[llvm-branch-commits] [NFC][ELF] Replace DynamicReloc::Kind with the equivalent bool in APIs (PR #150813)

2025-07-27 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> Is this the last change in the patch series? The use of booleans is a good 
> move to prevent the complexity of MIPS-style dynamic relocations (which I 
> haven’t fully analyzed). Thanks for tidying this up!

Yes, though #150729 and #150730 are based on this patch series (as a 
logically-separate series that conflicts due to the refactoring so I linearised 
that way).

https://github.com/llvm/llvm-project/pull/150813


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 edited 
https://github.com/llvm/llvm-project/pull/150798


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150798




[llvm-branch-commits] [NFCI][ELF] Introduce explicit Computed state for DynamicReloc (PR #150799)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150799

Currently we set the kind to AddendOnly in computeRaw() in order to
catch cases where we're not treating the DynamicReloc as computed.
Specifically, computeAddend() will then assert that sym is nullptr, so it
can catch any subsequent calls for relocations that have sym set.
However, if the DynamicReloc was already AddendOnly (or
MipsMultiGotPage), we will silently allow this, which does work
correctly, but is not the intended use. We also cannot catch cases where
needsDynSymIndex() is called after this point, which would give a
misleading value if the kind were previously against a symbol.

By introducing a new (internal) Computed kind we can be explicit and add
more rigorous assertions, rather than abusing AddendOnly.
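
A rough sketch of the idea, under stated assumptions: `ToyDynamicReloc` and
`ToySymbol` are hypothetical, heavily simplified models rather than lld's
classes, and the accessor names other than computeRaw(), computeAddend() and
needsDynSymIndex() are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, much-simplified model of the idea; not lld's DynamicReloc.
struct ToySymbol {
  uint64_t va;
};

class ToyDynamicReloc {
public:
  enum Kind { AddendOnly, AgainstSymbol, Computed };

  ToyDynamicReloc(Kind kind, const ToySymbol *sym, int64_t addend)
      : kind(kind), sym(sym), addend(addend) {}

  // Asking this after computeRaw() would give a misleading answer, so the
  // explicit Computed kind lets us assert instead of silently returning.
  bool needsDynSymIndex() const {
    assert(kind != Computed && "queried after computeRaw()");
    return kind == AgainstSymbol;
  }

  int64_t computeAddend() const {
    assert(kind != Computed && "addend was already computed");
    if (kind == AddendOnly)
      assert(sym == nullptr && "AddendOnly must not carry a symbol");
    else
      assert(sym != nullptr && "AgainstSymbol needs a symbol");
    return addend;
  }

  // Freezes the relocation: stores the final addend and marks it Computed.
  void computeRaw() {
    addend = computeAddend();
    sym = nullptr;
    kind = Computed;
  }

  bool isComputed() const { return kind == Computed; }
  int64_t rawAddend() const { return addend; }

private:
  Kind kind;
  const ToySymbol *sym;
  int64_t addend;
};
```

With a dedicated Computed kind, a second computeRaw() or a late
needsDynSymIndex() trips an assertion even when the relocation started out as
AddendOnly, which is the case the old scheme let through.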





[llvm-branch-commits] [NFCI][ELF] Merge AddendOnly and AddendOnlyWithTargetVA (PR #150797)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150797




[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150729

>From 32400cb0d5c16e16b6d0d259955ba060f561fefe Mon Sep 17 00:00:00 2001
From: Jessica Clarke 
Date: Sat, 26 Jul 2025 02:12:18 +0100
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5
---
 lld/ELF/SyntheticSections.cpp | 16 
 lld/ELF/SyntheticSections.h   |  9 +
 lld/test/ELF/mips-mgot.s  |  2 +-
 lld/test/ELF/mips-tls-64.s|  2 +-
 lld/test/ELF/mips-tls.s   |  2 +-
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index efec41a737b62..0bb00c6d2bcff 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -1065,9 +1065,8 @@ void MipsGotSection::build() {
   // for the TP-relative offset as we don't know how much other data will
   // be allocated before us in the static TLS block.
   if (s->isPreemptible || ctx.arg.shared)
-ctx.mainPart->relaDyn->addReloc(
-{ctx.target->tlsGotRel, this, offset,
- DynamicReloc::AgainstSymbolWithTargetVA, *s, 0, R_ABS});
+ctx.mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(
+ctx.target->tlsGotRel, *this, offset, *s, ctx.target->symbolicRel);
 }
 for (std::pair &p : got.dynTlsSymbols) {
   Symbol *s = p.first;
@@ -1160,6 +1159,7 @@ void MipsGotSection::writeTo(uint8_t *buf) {
   // if we had to do this.
   writeUint(ctx, buf + ctx.arg.wordsize,
 (uint64_t)1 << (ctx.arg.wordsize * 8 - 1));
+  ctx.target->relocateAlloc(*this, buf);
   for (const FileGot &g : gots) {
 auto write = [&](size_t i, const Symbol *s, int64_t a) {
   uint64_t va = a;
@@ -1189,9 +1189,10 @@ void MipsGotSection::writeTo(uint8_t *buf) {
 write(p.second, p.first, 0);
 for (const std::pair &p : g.relocs)
   write(p.second, p.first, 0);
-for (const std::pair &p : g.tls)
-  write(p.second, p.first,
-p.first->isPreemptible || ctx.arg.shared ? 0 : -0x7000);
+for (const std::pair &p : g.tls) {
+  if (!p.first->isPreemptible && !ctx.arg.shared)
+write(p.second, p.first, -0x7000);
+}
 for (const std::pair &p : g.dynTlsSymbols) {
   if (p.first == nullptr && !ctx.arg.shared)
 write(p.second, nullptr, 1);
@@ -1653,8 +1654,7 @@ int64_t DynamicReloc::computeAddend(Ctx &ctx) const {
   case AgainstSymbol:
 assert(sym != nullptr);
 return addend;
-  case AddendOnlyWithTargetVA:
-  case AgainstSymbolWithTargetVA: {
+  case AddendOnlyWithTargetVA: {
 uint64_t ca = inputSec->getRelocTargetVA(
 ctx, Relocation{expr, type, 0, addend, sym}, getOffset());
 return ctx.arg.is64 ? ca : SignExtend64<32>(ca);
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 5f01513630597..7612915b5b1dc 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -429,11 +429,6 @@ class DynamicReloc {
 /// The resulting dynamic relocation references symbol #sym from the dynamic
 /// symbol table and uses #addend as the value of computeAddend(ctx).
 AgainstSymbol,
-/// The resulting dynamic relocation references symbol #sym from the dynamic
-/// symbol table and uses InputSection::getRelocTargetVA() + #addend for the
-/// final addend. It can be used for relocations that write the symbol VA as
-// the addend (e.g. R_MIPS_TLS_TPREL64) but still reference the symbol.
-AgainstSymbolWithTargetVA,
 /// This is used by the MIPS multi-GOT implementation. It relocates
 /// addresses of 64kb pages that lie inside the output section.
 MipsMultiGotPage,
@@ -460,9 +455,7 @@ class DynamicReloc {
 
   uint64_t getOffset() const;
   uint32_t getSymIndex(SymbolTableBaseSection *symTab) const;
-  bool needsDynSymIndex() const {
-return kind == AgainstSymbol || kind == AgainstSymbolWithTargetVA;
-  }
+  bool needsDynSymIndex() const { return kind == AgainstSymbol; }
 
   /// Computes the addend of the dynamic relocation. Note that this is not the
   /// same as the #addend member variable as it may also include the symbol
diff --git a/lld/test/ELF/mips-mgot.s b/lld/test/ELF/mips-mgot.s
index 6978b5d9623b4..67bd5e6619f12 100644
--- a/lld/test/ELF/mips-mgot.s
+++ b/lld/test/ELF/mips-mgot.s
@@ -23,7 +23,7 @@
 
 # CHECK:  Contents of section .got:
 # CHECK-NEXT:  7  8000 [[FOO0]] [[FOO2]]
-# CHECK-NEXT:  70010  0004 0001 0002
+# CHECK-NEXT:  70010   0001 0002
 # CHECK-NEXT:  70020 0003 0004 0005 0006
 # CHECK-NEXT:  70030    
 # CHECK-NEXT:  70040   
diff --git a/lld/test/ELF/mips-tls-64.s b/lld/test/ELF/mips-tls-64.s
index 3976b50274be4..8a00b93c77e2f 100644
--- a/ll

[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 edited 
https://github.com/llvm/llvm-project/pull/150729


[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150730




[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150798

The former is just a special case of the latter, ignoring the expr and
always just using the addend. If we use R_ADDEND as expr (which
previously had no effect, and so was misleadingly R_ABS not R_ADDEND in
all but one use) then we don't need to maintain this as a separate case.

This just leaves MipsMultiGotPage as a special case; the only difference
between the other two Kind values is what needsDynSymIndex returns.





[llvm-branch-commits] [NFCI][ELF] Merge AddendOnly and AddendOnlyWithTargetVA (PR #150797)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150797

The former is just a special case of the latter, ignoring the expr and
always just using the addend, allowing (and enforcing) that sym is null.
If we just use dummySym then we don't need to maintain this as a
separate case, since R_ADDEND will return the addend unmodified for the
call to getRelocTargetVA.
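
To make the argument concrete, here is a small sketch of why the separate case
is redundant. Everything in it is a hypothetical simplification (`ToyExpr`,
`ToySymbol`, `toyDummySym`, `toyRelocTargetVA`); it only models the two
expression kinds relevant here and is not lld's getRelocTargetVA.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for RelExpr and Symbol.
enum ToyExpr { TOY_R_ADDEND, TOY_R_ABS };

struct ToySymbol {
  uint64_t va;
};

// Hypothetical counterpart of dummySym: a placeholder whose value is never
// consulted when the expression is TOY_R_ADDEND.
static const ToySymbol toyDummySym{0};

// Toy version of the target-VA computation for the two expressions modelled.
uint64_t toyRelocTargetVA(ToyExpr expr, const ToySymbol &sym, int64_t addend) {
  switch (expr) {
  case TOY_R_ADDEND:
    return addend; // symbol is ignored, addend passes through unmodified
  case TOY_R_ABS:
    return sym.va + addend; // symbol address plus addend
  }
  return 0; // unreachable for valid ToyExpr values
}
```

Since the addend-only expression ignores the symbol entirely, passing a dummy
symbol gives the same result as the dedicated AddendOnly path did.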





[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150795)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150795

The former is just a special case of the latter, ignoring the expr and
always just using the addend. If we use R_ADDEND as expr (which
previously had no effect, and so was misleadingly R_ABS not R_ADDEND in
all but one use) then we don't need to maintain this as a separate case.

This just leaves MipsMultiGotPage as a special case; the only difference
between the other two Kind values is what needsDynSymIndex returns.





[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150795)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 closed 
https://github.com/llvm/llvm-project/pull/150795


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150795)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

#150798. Screwed up spr...

https://github.com/llvm/llvm-project/pull/150795


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> When DynamicReloc::Kind was introduced, I was concerned of the many Kinds, 
> but that was still better than the previous state. Thanks for the 
> simplification.

Yeah I think Alex was a bit confused (and also there were some bugs that he 
faithfully replicated and since fixed, or that I'm fixing in the special case 
of #150729). I've been delving into how all this works to clean up some horrors 
in CHERI LLD (and many more in Morello LLD, where there is actually a 
legitimate use case for AgainstSymbol(WithTargetVA) with something other than 
R_ADDEND, due to ABI weirdness), and as part of that discovered the confusing 
and overly-complex nature of all this. With the exception of the MIPS GOT page 
(and Computed), the end state of this stack (see #150796, #150799, #150797) is 
everything consistently gets funnelled through getRelocTargetVA. I might also 
tackle that at some point, though I've already spent too much time on MIPS the 
past few days, despite not even caring about it any more downstream in CHERI 
LLVM...

https://github.com/llvm/llvm-project/pull/150798


[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> > The current code is quite crusty and divergent from non-MIPS in API use, 
> > but fixing it up like this is quite high-risk, especially given how weird 
> > the MIPS GOT is when it comes to the required initial memory state. Is 
> > anyone using LLD for MIPS these days who can test this in a wider context? 
> > I've tried to be very careful but I would not be surprised if I've made 
> > mistakes; our test coverage isn't great.
> 
> It's indeed very difficult to find folks still concerned with MIPS... I 
> believe only one company is still actively contributing to the MIPS backend 
> in LLVM... @wzssyqa
> 
> Besides the ClangBuiltLinux maintainer and the OpenBSD maintainer continue to 
> support MIPS.

This one I don't hugely care about getting in soon as it's (a) meant to be NFC
and (b) limited to MipsGotSection. It makes the code clearer and more like the
normal GOT, but in theory what's there is fine, and the wider LLD internals
aren't affected by it.

https://github.com/llvm/llvm-project/pull/150730


[llvm-branch-commits] [NFCI][ELF][Mips] Replace MipsMultiGotPage with new RE_MIPS_OSEC_LOCAL_PAGE (PR #150810)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150810

Instead of having a special DynamicReloc::Kind, we can just use a new
RelExpr for the calculation needed. The only odd thing we do that allows
this is to keep a representative symbol for the OutputSection in
question (the first we see for it) around to use in this relocation for
the addend calculation.

This reduces DynamicReloc to just AddendOnly vs AgainstSymbol, plus the
internal Computed.
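
As a rough sketch of the calculation being moved into a RelExpr, assuming
64 KiB pages and the rounding shown below (both are assumptions of this
example, as are all the `Toy*`/`toy*` names): the representative symbol only
serves to locate the output section, whose page address is then combined with
the addend.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins; not lld's OutputSection or Symbol.
struct ToyOutputSection {
  uint64_t addr;
};

struct ToySymbol {
  const ToyOutputSection *osec; // output section the symbol lives in
};

// Assumed rounding of an address to its 64 KiB page for this sketch.
constexpr uint64_t toyPageAddr(uint64_t addr) {
  return (addr + 0x8000) & ~uint64_t(0xffff);
}

// The representative symbol is used purely to find the output section; the
// result is that section's page address plus the addend.
uint64_t toyOsecLocalPage(const ToySymbol &representative, int64_t addend) {
  return toyPageAddr(representative.osec->addr) + addend;
}
```

Any symbol in the section would do as the representative, which is why
keeping the first one seen is sufficient.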





[llvm-branch-commits] [NFC][ELF] Don't duplicate DynamicReloc constructor (PR #150811)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150811

This second constructor is just a shorthand for an AddendOnly relocation
against dummySym with R_ADDEND, so write it as such.





[llvm-branch-commits] [NFCI][ELF] Store DynamicReloc Kind as two bools (PR #150812)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150812

Aside from Computed, Kind is now just AddendOnly and AgainstSymbol, so
it's really just a bool reflecting whether the resulting ELF relocation
should reference the symbol or not. Refactor DynamicReloc's storage to
reflect this, splitting Computed out into its own orthogonal isFinal
bool. As part of this, rename computeRaw to finalize to reflect that
it's side-effecting.

This also allows needsDynSymIndex() to work even after finalize(), so
drop the existing assertion.

A future commit will refactor the DynamicReloc API to take isAgainstSymbol
directly now the enum serves little purpose, as a more invasive,
mechanical change. For this commit we keep DynamicReloc::Kind as the
external API.
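
The storage change can be sketched like this. It is a hypothetical, simplified
model (`ToyDynamicReloc` and its `finalized()` accessor are invented for the
example), intended only to show the two orthogonal flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, much-simplified model; not lld's DynamicReloc.
class ToyDynamicReloc {
public:
  ToyDynamicReloc(bool isAgainstSymbol, int64_t addend)
      : addend(addend), isAgainstSymbol(isAgainstSymbol), isFinal(false) {}

  // Independent of isFinal, so this stays meaningful after finalize().
  bool needsDynSymIndex() const { return isAgainstSymbol; }

  // Side-effecting: after this call the stored addend is the final value.
  void finalize() {
    assert(!isFinal && "finalize() must only run once");
    // A real implementation would compute the final addend here.
    isFinal = true;
  }

  bool finalized() const { return isFinal; }
  int64_t rawAddend() const { return addend; }

private:
  int64_t addend;
  bool isAgainstSymbol; // should the ELF relocation reference the symbol?
  bool isFinal;         // has finalize() been called?
};
```

Because the "against symbol" bit is no longer overwritten when the relocation
is finalized, querying it afterwards gives the right answer, which is what
makes the old assertion unnecessary.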





[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150729

>From 32400cb0d5c16e16b6d0d259955ba060f561fefe Mon Sep 17 00:00:00 2001
From: Jessica Clarke 
Date: Sat, 26 Jul 2025 02:12:18 +0100
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5
---
 lld/ELF/SyntheticSections.cpp | 16 
 lld/ELF/SyntheticSections.h   |  9 +
 lld/test/ELF/mips-mgot.s  |  2 +-
 lld/test/ELF/mips-tls-64.s|  2 +-
 lld/test/ELF/mips-tls.s   |  2 +-
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index efec41a737b62..0bb00c6d2bcff 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -1065,9 +1065,8 @@ void MipsGotSection::build() {
       // for the TP-relative offset as we don't know how much other data will
       // be allocated before us in the static TLS block.
       if (s->isPreemptible || ctx.arg.shared)
-        ctx.mainPart->relaDyn->addReloc(
-            {ctx.target->tlsGotRel, this, offset,
-             DynamicReloc::AgainstSymbolWithTargetVA, *s, 0, R_ABS});
+        ctx.mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(
+            ctx.target->tlsGotRel, *this, offset, *s, ctx.target->symbolicRel);
     }
     for (std::pair<Symbol *, size_t> &p : got.dynTlsSymbols) {
       Symbol *s = p.first;
@@ -1160,6 +1159,7 @@ void MipsGotSection::writeTo(uint8_t *buf) {
   // if we had to do this.
   writeUint(ctx, buf + ctx.arg.wordsize,
             (uint64_t)1 << (ctx.arg.wordsize * 8 - 1));
+  ctx.target->relocateAlloc(*this, buf);
   for (const FileGot &g : gots) {
     auto write = [&](size_t i, const Symbol *s, int64_t a) {
       uint64_t va = a;
@@ -1189,9 +1189,10 @@ void MipsGotSection::writeTo(uint8_t *buf) {
         write(p.second, p.first, 0);
     for (const std::pair<Symbol *, size_t> &p : g.relocs)
       write(p.second, p.first, 0);
-    for (const std::pair<Symbol *, size_t> &p : g.tls)
-      write(p.second, p.first,
-            p.first->isPreemptible || ctx.arg.shared ? 0 : -0x7000);
+    for (const std::pair<Symbol *, size_t> &p : g.tls) {
+      if (!p.first->isPreemptible && !ctx.arg.shared)
+        write(p.second, p.first, -0x7000);
+    }
     for (const std::pair<Symbol *, size_t> &p : g.dynTlsSymbols) {
       if (p.first == nullptr && !ctx.arg.shared)
         write(p.second, nullptr, 1);
@@ -1653,8 +1654,7 @@ int64_t DynamicReloc::computeAddend(Ctx &ctx) const {
   case AgainstSymbol:
     assert(sym != nullptr);
     return addend;
-  case AddendOnlyWithTargetVA:
-  case AgainstSymbolWithTargetVA: {
+  case AddendOnlyWithTargetVA: {
     uint64_t ca = inputSec->getRelocTargetVA(
         ctx, Relocation{expr, type, 0, addend, sym}, getOffset());
     return ctx.arg.is64 ? ca : SignExtend64<32>(ca);
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 5f01513630597..7612915b5b1dc 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -429,11 +429,6 @@ class DynamicReloc {
     /// The resulting dynamic relocation references symbol #sym from the dynamic
     /// symbol table and uses #addend as the value of computeAddend(ctx).
     AgainstSymbol,
-    /// The resulting dynamic relocation references symbol #sym from the dynamic
-    /// symbol table and uses InputSection::getRelocTargetVA() + #addend for the
-    /// final addend. It can be used for relocations that write the symbol VA as
-    // the addend (e.g. R_MIPS_TLS_TPREL64) but still reference the symbol.
-    AgainstSymbolWithTargetVA,
     /// This is used by the MIPS multi-GOT implementation. It relocates
     /// addresses of 64kb pages that lie inside the output section.
     MipsMultiGotPage,
@@ -460,9 +455,7 @@ class DynamicReloc {
 
   uint64_t getOffset() const;
   uint32_t getSymIndex(SymbolTableBaseSection *symTab) const;
-  bool needsDynSymIndex() const {
-    return kind == AgainstSymbol || kind == AgainstSymbolWithTargetVA;
-  }
+  bool needsDynSymIndex() const { return kind == AgainstSymbol; }
 
   /// Computes the addend of the dynamic relocation. Note that this is not the
   /// same as the #addend member variable as it may also include the symbol
diff --git a/lld/test/ELF/mips-mgot.s b/lld/test/ELF/mips-mgot.s
index 6978b5d9623b4..67bd5e6619f12 100644
--- a/lld/test/ELF/mips-mgot.s
+++ b/lld/test/ELF/mips-mgot.s
@@ -23,7 +23,7 @@
 
 # CHECK:  Contents of section .got:
 # CHECK-NEXT:  7  8000 [[FOO0]] [[FOO2]]
-# CHECK-NEXT:  70010  0004 0001 0002
+# CHECK-NEXT:  70010   0001 0002
 # CHECK-NEXT:  70020 0003 0004 0005 0006
 # CHECK-NEXT:  70030    
 # CHECK-NEXT:  70040   
diff --git a/lld/test/ELF/mips-tls-64.s b/lld/test/ELF/mips-tls-64.s
index 3976b50274be4..8a00b93c77e2f 100644
--- a/ll

[llvm-branch-commits] [NFC][ELF] Replace DynamicReloc::Kind with the equivalent bool in APIs (PR #150813)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150813

DynamicReloc::AgainstSymbol is now true and DynamicReloc::AddendOnly is
now false; uses of the constants were replaced mechanically.





[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 edited 
https://github.com/llvm/llvm-project/pull/150730


[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150730




[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150730




[llvm-branch-commits] [NFCI][ELF] Introduce explicit Computed state for DynamicReloc (PR #150799)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

NB: This is a transient state; https://github.com/llvm/llvm-project/pull/150812 
removes this member. This makes the history a bit odd, but I think including it 
in the history makes it clearer what's going on.

https://github.com/llvm/llvm-project/pull/150799


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-26 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> > When DynamicReloc::Kind was introduced, I was concerned of the many Kinds, 
> > but that was still better than the previous state. Thanks for the 
> > simplification.
> 
> Yeah I think Alex was a bit confused (and also there were some bugs that he 
> faithfully replicated and since fixed, or that I'm fixing in the special case 
> of #150729). I've been delving into how all this works to clean up some 
> horrors in CHERI LLD (and many more in Morello LLD, where there is actually a 
> legitimate use case for AgainstSymbol(WithTargetVA) with something other than 
> R_ADDEND, due to ABI weirdness), and as part of that discovered the confusing 
> and overly-complex nature of all this. With the exception of the MIPS GOT 
> page (and Computed), the end state of this stack (see #150796, #150799, 
> #150797) is everything consistently gets funnelled through getRelocTargetVA. 
> I might also tackle that at some point, though I've already spent too much 
> time on MIPS the past few days, despite not even caring about it any more 
> downstream in CHERI LLVM...

Eh, I decided to do it in https://github.com/llvm/llvm-project/pull/150810 
because then I could do https://github.com/llvm/llvm-project/pull/150813, which 
is the true sensible (IMO) end state.

https://github.com/llvm/llvm-project/pull/150798


[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-25 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

The current code is quite crusty and divergent from non-MIPS in API use, but 
fixing it up like this is quite high-risk, especially given how weird the MIPS 
GOT is when it comes to the required initial memory state. Is anyone using LLD 
for MIPS these days who can test this in a wider context? I've tried to be very 
careful but I would not be surprised if I've made mistakes; our test coverage 
isn't great.

https://github.com/llvm/llvm-project/pull/150730


[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-25 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 created 
https://github.com/llvm/llvm-project/pull/150730

Splitting the VA / addend calculations between build and writeTo means
having to keep them in sync and duplicating some of the logic. For
everything except "page address" relocations, move all such calculations
into build, mirroring how the normal non-MIPS code in Relocations.cpp
ensures the addend and initial memory contents are set.





[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-25 Thread Jessica Clarke via llvm-branch-commits


@@ -1055,74 +1059,122 @@ void MipsGotSection::build() {
     ctx.symAux.back().gotIdx = p.second;
   }
 
-  // Create dynamic relocations.
+  // Create relocations.
+  //
+  // NB: GOT 'page address' entries have their VAs handled in writeTo as they
+  // reference an OutputSection not a Symbol.
   for (FileGot &got : gots) {
-    // Create dynamic relocations for TLS entries.
+    static Undefined dummy(ctx.internalFile, "", STB_LOCAL, 0, 0);

jrtc27 wrote:

NB: There are now three instances of this (one non-static, which I guess is 
more correct with the ctx-ification of LLD...). Perhaps ctx should grow this as 
a member that's constructed at the same time as internalFile?

https://github.com/llvm/llvm-project/pull/150730


[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)

2025-08-03 Thread Jessica Clarke via llvm-branch-commits


@@ -2139,17 +2139,44 @@ void ThunkCreator::mergeThunks(ArrayRef<OutputSection *> outputSections) {
   });
 }
 
-static int64_t getPCBias(Ctx &ctx, RelType type) {
-  if (ctx.arg.emachine != EM_ARM)
-    return 0;
-  switch (type) {
-  case R_ARM_THM_JUMP19:
-  case R_ARM_THM_JUMP24:
-  case R_ARM_THM_CALL:
-    return 4;
-  default:
-    return 8;
+constexpr uint32_t HEXAGON_MASK_END_PACKET = 3 << 14;
+constexpr uint32_t HEXAGON_END_OF_PACKET = 3 << 14;
+constexpr uint32_t HEXAGON_END_OF_DUPLEX = 0 << 14;
+
+// Return the distance between the packet start and the instruction in the
+// relocation.
+static int getHexagonPacketOffset(const InputSection &isec,
+                                  const Relocation &rel) {
+  const ArrayRef<uint8_t> data = isec.content();
+
+  // Search back as many as 3 instructions.
+  for (unsigned i = 0;; i++) {
+    if (i == 3 || rel.offset < (i + 1) * 4)
+      return i * 4;
+    uint32_t instWord = 0;
+    const ArrayRef<uint8_t> instWordContents =
+        data.drop_front(rel.offset - (i + 1) * 4);
+    memcpy(&instWord, instWordContents.data(), sizeof(instWord));

jrtc27 wrote:

This assumes native endianness; use read32
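
To illustrate the concern, here is a standalone sketch (not lld code;
`readWordNative` and `readWordLE` are hypothetical helper names). A memcpy
into a uint32_t yields a value that depends on the host's byte order, while
assembling the value from individual bytes does not.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-dependent: interprets the four bytes in the host's own byte order.
inline uint32_t readWordNative(const uint8_t *p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Host-independent: always treats p[0] as the least significant byte.
inline uint32_t readWordLE(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}
```

On a little-endian host the two helpers agree, which is why this kind of
problem tends to show up only when the linker itself runs on a big-endian
machine.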

https://github.com/llvm/llvm-project/pull/149723


[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)

2025-08-03 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

/cherry-pick b03d1e1e2e8e4b0b4b9e035b7ad9fb86dccefb93 
de15d365743e16848a9d15fc32ae6ab98d399ec2 
723b40a8d92f76fc913ef21061fc3d74e8c47441

https://github.com/llvm/llvm-project/pull/149723


[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)

2025-08-03 Thread Jessica Clarke via llvm-branch-commits


@@ -2139,17 +2139,44 @@ void ThunkCreator::mergeThunks(ArrayRef<OutputSection *> outputSections) {
   });
 }
 
-static int64_t getPCBias(Ctx &ctx, RelType type) {
-  if (ctx.arg.emachine != EM_ARM)
-    return 0;
-  switch (type) {
-  case R_ARM_THM_JUMP19:
-  case R_ARM_THM_JUMP24:
-  case R_ARM_THM_CALL:
-    return 4;
-  default:
-    return 8;
+constexpr uint32_t HEXAGON_MASK_END_PACKET = 3 << 14;
+constexpr uint32_t HEXAGON_END_OF_PACKET = 3 << 14;
+constexpr uint32_t HEXAGON_END_OF_DUPLEX = 0 << 14;
+
+// Return the distance between the packet start and the instruction in the
+// relocation.
+static int getHexagonPacketOffset(const InputSection &isec,
+                                  const Relocation &rel) {
+  const ArrayRef<uint8_t> data = isec.content();
+
+  // Search back as many as 3 instructions.
+  for (unsigned i = 0;; i++) {
+    if (i == 3 || rel.offset < (i + 1) * 4)
+      return i * 4;
+    uint32_t instWord = 0;
+    const ArrayRef<uint8_t> instWordContents =
+        data.drop_front(rel.offset - (i + 1) * 4);

jrtc27 wrote:

Why are you drop_front'ing every time rather than just offsetting from 
data.data()?
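
One way to read this suggestion, sketched standalone without LLVM types:
compute the address directly from the base pointer instead of building a new
view on every iteration. All names here are hypothetical, and the rule for
stopping the backwards search (a preceding word whose parse bits mark
end-of-packet or end-of-duplex) is an assumption based on the constants in
the quoted hunk, not something the hunk itself shows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t kMaskEndPacket = 3u << 14; // parse-bit mask, as in the hunk
constexpr uint32_t kEndOfPacket = 3u << 14;
constexpr uint32_t kEndOfDuplex = 0u << 14;

// Byte-order independent little-endian read (hypothetical helper).
inline uint32_t readWordLE(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

// Returns the byte distance between the start of the packet and the word at
// `offset`, looking back at most three 4-byte words from `base + offset`.
inline unsigned packetOffset(const uint8_t *base, size_t offset) {
  for (unsigned i = 0;; ++i) {
    if (i == 3 || offset < (i + 1) * 4)
      return i * 4;
    // Plain pointer arithmetic from the base; no intermediate view object.
    uint32_t word = readWordLE(base + offset - (i + 1) * 4);
    uint32_t parse = word & kMaskEndPacket;
    if (parse == kEndOfPacket || parse == kEndOfDuplex)
      return i * 4;
  }
}
```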

https://github.com/llvm/llvm-project/pull/149723


[llvm-branch-commits] [lld] release/21.x: [lld] Add thunks for hexagon (#111217) (PR #149723)

2025-08-03 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> ```
> /builddir/build/BUILD/llvm-21.1.0_rc2-build/llvm-project-21.1.0-rc2.src/lld/test/ELF/hexagon-thunks-packets.s:20:17: error: CHECK-NONPIC: expected string not found in input
> # CHECK-NONPIC: jump 0x1020110  }
> ^
> <stdin>:7:50: note: scanning from here
>  200b4: 01 40 10 00 00104001 { immext(#0x140)
>  ^
> <stdin>:8:30: note: possible intended match here
>  200b8: 58 c0 00 58 5800c058 jump 0x1020120  }
>  ^
> ```
> 
> Hmm, if it were an endianness issue I wonder if this patch below might 
> address it? How convenient is it to try this?
> 
> ```
> diff --git a/lld/ELF/Thunks.cpp b/lld/ELF/Thunks.cpp
> index 65d0f094c43c..c8d0c91fd8b6 100644
> --- a/lld/ELF/Thunks.cpp
> +++ b/lld/ELF/Thunks.cpp
> @@ -1546,16 +1546,16 @@ void HexagonThunk::writeTo(uint8_t *buf) {
>uint64_t p = getThunkTargetSym()->getVA(ctx);
>  
>if (ctx.arg.isPic) {
> -write32(ctx, buf + 0, 0x4000); // {  immext(#0)
> +write32le(buf + 0, 0x4000); // {  immext(#0)
>  ctx.target->relocateNoSym(buf, R_HEX_B32_PCREL_X, s - p);
> -write32(ctx, buf + 4, 0x6a49c00e); //r14 = add(pc,##0) }
> +write32le(buf + 4, 0x6a49c00e); //r14 = add(pc,##0) }
>  ctx.target->relocateNoSym(buf + 4, R_HEX_6_PCREL_X, s - p);
>  
> -write32(ctx, buf + 8, 0x528ec000); // {  jumpr r14 }
> +write32le(buf + 8, 0x528ec000); // {  jumpr r14 }
>} else {
> -write32(ctx, buf + 0, 0x4000); //  { immext
> +write32le(buf + 0, 0x4000); //  { immext
>  ctx.target->relocateNoSym(buf, R_HEX_B32_PCREL_X, s - p);
> -write32(ctx, buf + 4, 0x5800c000); //jump <> }
> +write32le(buf + 4, 0x5800c000); //jump <> }
>  ctx.target->relocateNoSym(buf + 4, R_HEX_B22_PCREL_X, s - p);
>}
> ```

That should make no difference. read32 etc use config->endianness, which is the 
endianness of the ELF file. If Hexagon is always little-endian then it is 
completely correct to use read32le everywhere instead, but that is an 
optimisation to make that says "assume the file is little-endian rather than 
dynamically checking which endianness it is".
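
The distinction can be sketched as follows. This is standalone and
hypothetical: `Endian`, `read32Dyn`, `read32LE` and `read32BE` are
illustrative names, not lld's functions, and the file's byte order is passed
explicitly rather than taken from a configuration object.

```cpp
#include <cassert>
#include <cstdint>

enum class Endian { Little, Big };

// Fixed little-endian read: assumes the file is little-endian.
inline uint32_t read32LE(const uint8_t *p) {
  return uint32_t(p[0]) | uint32_t(p[1]) << 8 | uint32_t(p[2]) << 16 |
         uint32_t(p[3]) << 24;
}

// Fixed big-endian read, needed only for the dynamic variant below.
inline uint32_t read32BE(const uint8_t *p) {
  return uint32_t(p[0]) << 24 | uint32_t(p[1]) << 16 | uint32_t(p[2]) << 8 |
         uint32_t(p[3]);
}

// Dynamic read: checks the file's byte order on every call.
inline uint32_t read32Dyn(Endian fileEndian, const uint8_t *p) {
  return fileEndian == Endian::Little ? read32LE(p) : read32BE(p);
}
```

For a little-endian file the dynamic and fixed variants return the same
value regardless of the host, so switching between them would not be
expected to change the test output.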

https://github.com/llvm/llvm-project/pull/149723


[llvm-branch-commits] [NFCI][ELF] Merge AgainstSymbol and AgainstSymbolWithTargetVA (PR #150798)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150798




[llvm-branch-commits] [NFCI][ELF][Mips] Replace MipsMultiGotPage with new RE_MIPS_OSEC_LOCAL_PAGE (PR #150810)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150810




[llvm-branch-commits] [NFCI][ELF][Mips] Replace MipsMultiGotPage with new RE_MIPS_OSEC_LOCAL_PAGE (PR #150810)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150810




[llvm-branch-commits] [lld] [NFC][ELF] Don't duplicate DynamicReloc constructor (PR #150811)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150811




[llvm-branch-commits] [NFC][ELF] Don't duplicate DynamicReloc constructor (PR #150811)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150811




[llvm-branch-commits] [NFCI][ELF] Store DynamicReloc Kind as two bools (PR #150812)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150812




[llvm-branch-commits] [NFC][ELF] Replace DynamicReloc::Kind with the equivalent bool in APIs (PR #150813)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150813




[llvm-branch-commits] [NFCI][ELF] Store DynamicReloc Kind as two bools (PR #150812)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150812




[llvm-branch-commits] [NFC][ELF] Replace DynamicReloc::Kind with the equivalent bool in APIs (PR #150813)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150813




[llvm-branch-commits] [lld] [NFCI][ELF] Introduce explicit Computed state for DynamicReloc (PR #150799)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150799

>From 1308e1aad30d7089f658832150854b1362c63f45 Mon Sep 17 00:00:00 2001
From: Jessica Clarke 
Date: Sat, 26 Jul 2025 22:05:06 +0100
Subject: [PATCH] [𝘀𝗽𝗿] changes to main this commit is based on
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5

[skip ci]
---
 lld/ELF/Config.h| 2 ++
 lld/ELF/Driver.cpp  | 1 +
 lld/ELF/Relocations.cpp | 3 +--
 lld/ELF/Target.cpp  | 3 +--
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index d9639b06ca4bf..958e5caaf0dfa 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -701,6 +701,8 @@ struct Ctx : CommonLinkerContext {
   std::unique_ptr<llvm::TarWriter> tar;
   // InputFile for linker created symbols with no source location.
   InputFile *internalFile = nullptr;
+  // Dummy Undefined for relocations without a symbol.
+  Undefined *dummySym = nullptr;
   // True if symbols can be exported (isExported) or preemptible.
   bool hasDynsym = false;
   // True if SHT_LLVM_SYMPART is used.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 21d228eda6470..4dcf577ebcb16 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3138,6 +3138,7 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
     ctx.symtab->insert(arg->getValue())->traced = true;
 
   ctx.internalFile = createInternalFile(ctx, "<internal>");
+  ctx.dummySym = make<Undefined>(ctx.internalFile, "", STB_LOCAL, 0, 0);
 
   // Handle -u/--undefined before input files. If both a.a and b.so define foo,
   // -u foo a.a b.so will extract a.a.
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index bd22fe2f1aa25..e847e85b060fe 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1948,13 +1948,12 @@ void elf::postScanRelocations(Ctx &ctx) {
 
   GotSection *got = ctx.in.got.get();
   if (ctx.needsTlsLd.load(std::memory_order_relaxed) && got->addTlsIndex()) {
-    static Undefined dummy(ctx.internalFile, "", STB_LOCAL, 0, 0);
     if (ctx.arg.shared)
       ctx.mainPart->relaDyn->addReloc(
           {ctx.target->tlsModuleIndexRel, got, got->getTlsIndexOff()});
     else
       got->addConstant({R_ADDEND, ctx.target->symbolicRel,
-                        got->getTlsIndexOff(), 1, &dummy});
+                        got->getTlsIndexOff(), 1, ctx.dummySym});
   }
 
   assert(ctx.symAux.size() == 1);
diff --git a/lld/ELF/Target.cpp b/lld/ELF/Target.cpp
index ad7d57d30668d..4946484074d05 100644
--- a/lld/ELF/Target.cpp
+++ b/lld/ELF/Target.cpp
@@ -105,10 +105,9 @@ ErrorPlace elf::getErrorPlace(Ctx &ctx, const uint8_t *loc) {
     if (isecLoc <= loc && loc < isecLoc + isec->getSize()) {
       std::string objLoc = isec->getLocation(loc - isecLoc);
       // Return object file location and source file location.
-      Undefined dummy(ctx.internalFile, "", STB_LOCAL, 0, 0);
       ELFSyncStream msg(ctx, DiagLevel::None);
       if (isec->file)
-        msg << isec->getSrcMsg(dummy, loc - isecLoc);
+        msg << isec->getSrcMsg(*ctx.dummySym, loc - isecLoc);
       return {isec, objLoc + ": ", std::string(msg.str())};
     }
   }



[llvm-branch-commits] [NFCI][ELF] Merge AddendOnly and AddendOnlyWithTargetVA (PR #150797)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150797




[llvm-branch-commits] [NFCI][ELF] Merge AddendOnly and AddendOnlyWithTargetVA (PR #150797)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150797




[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150729

>From 32400cb0d5c16e16b6d0d259955ba060f561fefe Mon Sep 17 00:00:00 2001
From: Jessica Clarke 
Date: Sat, 26 Jul 2025 02:12:18 +0100
Subject: [PATCH] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20initia?=
 =?UTF-8?q?l=20version?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5
---
 lld/ELF/SyntheticSections.cpp | 16 
 lld/ELF/SyntheticSections.h   |  9 +
 lld/test/ELF/mips-mgot.s  |  2 +-
 lld/test/ELF/mips-tls-64.s|  2 +-
 lld/test/ELF/mips-tls.s   |  2 +-
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index efec41a737b62..0bb00c6d2bcff 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -1065,9 +1065,8 @@ void MipsGotSection::build() {
   // for the TP-relative offset as we don't know how much other data will
   // be allocated before us in the static TLS block.
   if (s->isPreemptible || ctx.arg.shared)
-ctx.mainPart->relaDyn->addReloc(
-{ctx.target->tlsGotRel, this, offset,
- DynamicReloc::AgainstSymbolWithTargetVA, *s, 0, R_ABS});
+ctx.mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(
+ctx.target->tlsGotRel, *this, offset, *s, ctx.target->symbolicRel);
 }
 for (std::pair &p : got.dynTlsSymbols) {
   Symbol *s = p.first;
@@ -1160,6 +1159,7 @@ void MipsGotSection::writeTo(uint8_t *buf) {
   // if we had to do this.
   writeUint(ctx, buf + ctx.arg.wordsize,
 (uint64_t)1 << (ctx.arg.wordsize * 8 - 1));
+  ctx.target->relocateAlloc(*this, buf);
   for (const FileGot &g : gots) {
 auto write = [&](size_t i, const Symbol *s, int64_t a) {
   uint64_t va = a;
@@ -1189,9 +1189,10 @@ void MipsGotSection::writeTo(uint8_t *buf) {
 write(p.second, p.first, 0);
 for (const std::pair &p : g.relocs)
   write(p.second, p.first, 0);
-for (const std::pair &p : g.tls)
-  write(p.second, p.first,
-p.first->isPreemptible || ctx.arg.shared ? 0 : -0x7000);
+for (const std::pair &p : g.tls) {
+  if (!p.first->isPreemptible && !ctx.arg.shared)
+write(p.second, p.first, -0x7000);
+}
 for (const std::pair &p : g.dynTlsSymbols) {
   if (p.first == nullptr && !ctx.arg.shared)
 write(p.second, nullptr, 1);
@@ -1653,8 +1654,7 @@ int64_t DynamicReloc::computeAddend(Ctx &ctx) const {
   case AgainstSymbol:
 assert(sym != nullptr);
 return addend;
-  case AddendOnlyWithTargetVA:
-  case AgainstSymbolWithTargetVA: {
+  case AddendOnlyWithTargetVA: {
 uint64_t ca = inputSec->getRelocTargetVA(
 ctx, Relocation{expr, type, 0, addend, sym}, getOffset());
 return ctx.arg.is64 ? ca : SignExtend64<32>(ca);
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 5f01513630597..7612915b5b1dc 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -429,11 +429,6 @@ class DynamicReloc {
     /// The resulting dynamic relocation references symbol #sym from the dynamic
     /// symbol table and uses #addend as the value of computeAddend(ctx).
     AgainstSymbol,
-    /// The resulting dynamic relocation references symbol #sym from the dynamic
-    /// symbol table and uses InputSection::getRelocTargetVA() + #addend for the
-    /// final addend. It can be used for relocations that write the symbol VA as
-    // the addend (e.g. R_MIPS_TLS_TPREL64) but still reference the symbol.
-    AgainstSymbolWithTargetVA,
     /// This is used by the MIPS multi-GOT implementation. It relocates
     /// addresses of 64kb pages that lie inside the output section.
     MipsMultiGotPage,
@@ -460,9 +455,7 @@ class DynamicReloc {
 
   uint64_t getOffset() const;
   uint32_t getSymIndex(SymbolTableBaseSection *symTab) const;
-  bool needsDynSymIndex() const {
-    return kind == AgainstSymbol || kind == AgainstSymbolWithTargetVA;
-  }
+  bool needsDynSymIndex() const { return kind == AgainstSymbol; }
 
   /// Computes the addend of the dynamic relocation. Note that this is not the
   /// same as the #addend member variable as it may also include the symbol
diff --git a/lld/test/ELF/mips-mgot.s b/lld/test/ELF/mips-mgot.s
index 6978b5d9623b4..67bd5e6619f12 100644
--- a/lld/test/ELF/mips-mgot.s
+++ b/lld/test/ELF/mips-mgot.s
@@ -23,7 +23,7 @@
 
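 # CHECK:      Contents of section .got:
 # CHECK-NEXT:  70000 00000000 80000000 [[FOO0]] [[FOO2]]
-# CHECK-NEXT:  70010 00000000 00000004 00000001 00000002
+# CHECK-NEXT:  70010 00000000 00000000 00000001 00000002
 # CHECK-NEXT:  70020 00000003 00000004 00000005 00000006
 # CHECK-NEXT:  70030 00000000 00000000 00000000 00000000
 # CHECK-NEXT:  70040 00000000 00000000 00000000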
diff --git a/lld/test/ELF/mips-tls-64.s b/lld/test/ELF/mips-tls-64.s
index 3976b50274be4..8a00b93c77e2f 100644
--- a/ll

[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150729

>From 32400cb0d5c16e16b6d0d259955ba060f561fefe Mon Sep 17 00:00:00 2001
From: Jessica Clarke 
Date: Sat, 26 Jul 2025 02:12:18 +0100
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.5
---
 lld/ELF/SyntheticSections.cpp | 16 ++++++++--------
 lld/ELF/SyntheticSections.h   |  9 +--------
 lld/test/ELF/mips-mgot.s      |  2 +-
 lld/test/ELF/mips-tls-64.s    |  2 +-
 lld/test/ELF/mips-tls.s       |  2 +-
 5 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/lld/ELF/SyntheticSections.cpp b/lld/ELF/SyntheticSections.cpp
index efec41a737b62..0bb00c6d2bcff 100644
--- a/lld/ELF/SyntheticSections.cpp
+++ b/lld/ELF/SyntheticSections.cpp
@@ -1065,9 +1065,8 @@ void MipsGotSection::build() {
       // for the TP-relative offset as we don't know how much other data will
       // be allocated before us in the static TLS block.
       if (s->isPreemptible || ctx.arg.shared)
-        ctx.mainPart->relaDyn->addReloc(
-            {ctx.target->tlsGotRel, this, offset,
-             DynamicReloc::AgainstSymbolWithTargetVA, *s, 0, R_ABS});
+        ctx.mainPart->relaDyn->addAddendOnlyRelocIfNonPreemptible(
+            ctx.target->tlsGotRel, *this, offset, *s, ctx.target->symbolicRel);
     }
     for (std::pair &p : got.dynTlsSymbols) {
       Symbol *s = p.first;
@@ -1160,6 +1159,7 @@ void MipsGotSection::writeTo(uint8_t *buf) {
   // if we had to do this.
   writeUint(ctx, buf + ctx.arg.wordsize,
             (uint64_t)1 << (ctx.arg.wordsize * 8 - 1));
+  ctx.target->relocateAlloc(*this, buf);
   for (const FileGot &g : gots) {
     auto write = [&](size_t i, const Symbol *s, int64_t a) {
       uint64_t va = a;
@@ -1189,9 +1189,10 @@ void MipsGotSection::writeTo(uint8_t *buf) {
         write(p.second, p.first, 0);
     for (const std::pair &p : g.relocs)
       write(p.second, p.first, 0);
-    for (const std::pair &p : g.tls)
-      write(p.second, p.first,
-            p.first->isPreemptible || ctx.arg.shared ? 0 : -0x7000);
+    for (const std::pair &p : g.tls) {
+      if (!p.first->isPreemptible && !ctx.arg.shared)
+        write(p.second, p.first, -0x7000);
+    }
     for (const std::pair &p : g.dynTlsSymbols) {
       if (p.first == nullptr && !ctx.arg.shared)
         write(p.second, nullptr, 1);
@@ -1653,8 +1654,7 @@ int64_t DynamicReloc::computeAddend(Ctx &ctx) const {
   case AgainstSymbol:
     assert(sym != nullptr);
     return addend;
-  case AddendOnlyWithTargetVA:
-  case AgainstSymbolWithTargetVA: {
+  case AddendOnlyWithTargetVA: {
     uint64_t ca = inputSec->getRelocTargetVA(
         ctx, Relocation{expr, type, 0, addend, sym}, getOffset());
     return ctx.arg.is64 ? ca : SignExtend64<32>(ca);
diff --git a/lld/ELF/SyntheticSections.h b/lld/ELF/SyntheticSections.h
index 5f01513630597..7612915b5b1dc 100644
--- a/lld/ELF/SyntheticSections.h
+++ b/lld/ELF/SyntheticSections.h
@@ -429,11 +429,6 @@ class DynamicReloc {
     /// The resulting dynamic relocation references symbol #sym from the dynamic
     /// symbol table and uses #addend as the value of computeAddend(ctx).
     AgainstSymbol,
-    /// The resulting dynamic relocation references symbol #sym from the dynamic
-    /// symbol table and uses InputSection::getRelocTargetVA() + #addend for the
-    /// final addend. It can be used for relocations that write the symbol VA as
-    // the addend (e.g. R_MIPS_TLS_TPREL64) but still reference the symbol.
-    AgainstSymbolWithTargetVA,
     /// This is used by the MIPS multi-GOT implementation. It relocates
     /// addresses of 64kb pages that lie inside the output section.
     MipsMultiGotPage,
@@ -460,9 +455,7 @@ class DynamicReloc {
 
   uint64_t getOffset() const;
   uint32_t getSymIndex(SymbolTableBaseSection *symTab) const;
-  bool needsDynSymIndex() const {
-    return kind == AgainstSymbol || kind == AgainstSymbolWithTargetVA;
-  }
+  bool needsDynSymIndex() const { return kind == AgainstSymbol; }
 
   /// Computes the addend of the dynamic relocation. Note that this is not the
   /// same as the #addend member variable as it may also include the symbol
diff --git a/lld/test/ELF/mips-mgot.s b/lld/test/ELF/mips-mgot.s
index 6978b5d9623b4..67bd5e6619f12 100644
--- a/lld/test/ELF/mips-mgot.s
+++ b/lld/test/ELF/mips-mgot.s
@@ -23,7 +23,7 @@
 
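 # CHECK:      Contents of section .got:
 # CHECK-NEXT:  70000 00000000 80000000 [[FOO0]] [[FOO2]]
-# CHECK-NEXT:  70010 00000000 00000004 00000001 00000002
+# CHECK-NEXT:  70010 00000000 00000000 00000001 00000002
 # CHECK-NEXT:  70020 00000003 00000004 00000005 00000006
 # CHECK-NEXT:  70030 00000000 00000000 00000000 00000000
 # CHECK-NEXT:  70040 00000000 00000000 00000000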
diff --git a/lld/test/ELF/mips-tls-64.s b/lld/test/ELF/mips-tls-64.s
index 3976b50274be4..8a00b93c77e2f 100644
--- a/ll

[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-30 Thread Jessica Clarke via llvm-branch-commits

https://github.com/jrtc27 updated 
https://github.com/llvm/llvm-project/pull/150730




[llvm-branch-commits] [NFCI][ELF][Mips] Refactor MipsGotSection to avoid explicit writes (PR #150730)

2025-07-27 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> > It's indeed very difficult to find folks still concerned with MIPS... I 
> > believe only one company is still actively contributing to the MIPS backend 
> > in LLVM... @wzssyqa
> > Besides, the ClangBuiltLinux maintainer and the OpenBSD maintainer continue 
> > to support MIPS.
> 
> It's a catch-22 when the linker has never been feature complete / bug free 
> enough to be usable. For OpenBSD it's the only arch where we currently don't 
> use it as the linker, due to issues / bugs.

Hm, that surprises me. We got it to a good enough point to be usable as the 
system linker for FreeBSD, before MIPS was dropped as a supported architecture.

https://github.com/llvm/llvm-project/pull/150730


[llvm-branch-commits] [NFCI][ELF] Merge AddendOnly and AddendOnlyWithTargetVA (PR #150797)

2025-07-27 Thread Jessica Clarke via llvm-branch-commits


@@ -422,13 +422,10 @@ class DynamicReloc {
     /// The resulting dynamic relocation has already had its addend computed.
     /// Calling computeAddend() is an error. Only for internal use.
     Computed,
-    /// The resulting dynamic relocation does not reference a symbol (#sym must
-    /// be nullptr) and uses #addend as the result of computeAddend(ctx).
-    AddendOnly,
     /// The resulting dynamic relocation will not reference a symbol: #sym is
     /// only used to compute the addend with InputSection::getRelocTargetVA().
     /// Useful for various relative and TLS relocations (e.g. R_X86_64_TPOFF64).
-    AddendOnlyWithTargetVA,
+    AddendOnly,

jrtc27 wrote:

It can't; that's what dummySym is for. Note that getRelocTargetVA takes a 
reference, not a pointer, so we must have a symbol even if using R_ADDEND.
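
A minimal sketch of that constraint, assuming simplified stand-in types: the 
`Symbol`, `Expr` and `relocTargetVA` below are hypothetical and are not lld's 
real declarations. It only illustrates why a by-reference parameter forces 
addend-only callers to pass a placeholder symbol.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; these are not lld's real declarations.
struct Symbol {
  uint64_t va; // address the symbol resolves to
};

// Loosely modelled on the R_ADDEND / R_ABS expression kinds.
enum class Expr { Addend, Abs };

// Placeholder passed when a relocation does not refer to any real symbol.
static Symbol dummySym = {0};

// The symbol is taken by reference, so there is no way to pass "no symbol";
// addend-only callers have to hand in the placeholder instead.
static uint64_t relocTargetVA(Expr expr, int64_t addend, const Symbol &sym) {
  if (expr == Expr::Abs)
    return sym.va + static_cast<uint64_t>(addend);
  return static_cast<uint64_t>(addend); // Expr::Addend ignores the symbol
}
```

With this shape, an addend-only relocation calls 
`relocTargetVA(Expr::Addend, addend, dummySym)` and the placeholder is never 
read.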

https://github.com/llvm/llvm-project/pull/150797


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)

2025-06-04 Thread Jessica Clarke via llvm-branch-commits


@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc &DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // This is already ensured by an assert in SelectionDAG::getNode(). Several
+  // combines here depend on this assumption.
+  assert(PtrVT == IntVT &&
+         "PTRADD with different operand types is not supported");
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  // fold (ptradd 0, x) -> x
+  if (isNullConstant(N0))
jrtc27 wrote:

Only if they're the same type. This isn't valid for CHERI, where the LHS is a 
capability and the RHS is an integer. Nor is it valid for architectures where 
address size != index size.
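
A minimal sketch of the guard this implies, assuming a toy value model: `Ty`, 
`Value` and `foldPtrAddZeroLHS` are hypothetical names and not SelectionDAG 
APIs. The fold returns the offset operand only when both operands share a 
type, and declines otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical value model: a type tag plus an optional constant payload.
// Cap128 stands in for a CHERI capability type, I64 for a plain integer.
enum class Ty { I64, Cap128 };

struct Value {
  Ty type;
  bool isConst;
  uint64_t bits;
};

// fold (ptradd 0, x) -> x, but only when the pointer and offset operands have
// the same type. Returns the replacement value, or nullptr for "no fold".
static const Value *foldPtrAddZeroLHS(const Value &ptr, const Value &off) {
  bool lhsIsNull = ptr.isConst && ptr.bits == 0;
  if (lhsIsNull && ptr.type == off.type)
    return &off;
  // Replacing a capability-typed result with an integer-typed operand would
  // change the type of the node, so leave it alone.
  return nullptr;
}
```

In this model a null `I64` pointer with an `I64` offset folds to the offset, 
while a null `Cap128` pointer with an `I64` offset is left untouched.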

https://github.com/llvm/llvm-project/pull/142739
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)

2025-06-04 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

> isNullConstant(X), since there are address spaces where 0 is a perfectly
> normal value that shouldn't be treated specially,

I don't know whether it's important for CHERI to have this, or whether the 
IR-level optimisations make it less necessary. But `NULL + int` is how we 
represent an integer as a pointer, so `NULL + x + y` can legitimately turn up, 
and we want to be able to fold x and y together as plain integer arithmetic, 
converting to a capability only at the very end when needed.
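
A rough sketch of the reassociation meant here, under a toy model: 
`NullDerived`, `stepwise` and `folded` are hypothetical and do not correspond 
to anything in the DAG combiner. It compares forming the pointer after every 
offset with summing the offsets as integers first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: an integer carried as a NULL-derived pointer, plus a
// count of the pointer-add operations used to build it.
struct NullDerived {
  uint64_t addr;
  unsigned ptrAdds;
};

// (ptradd (ptradd NULL, x), y): two pointer adds, one per offset.
static NullDerived stepwise(uint64_t x, uint64_t y) {
  NullDerived p = {0, 0};
  p.addr += x;
  ++p.ptrAdds;
  p.addr += y;
  ++p.ptrAdds;
  return p;
}

// (ptradd NULL, (add x, y)): x and y are combined as plain integers and the
// pointer is only formed once, at the end.
static NullDerived folded(uint64_t x, uint64_t y) {
  NullDerived p = {0, 0};
  p.addr += x + y;
  ++p.ptrAdds;
  return p;
}
```

Both forms produce the same address; the folded one needs a single pointer 
add, which is the only step that would have to produce a capability.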

https://github.com/llvm/llvm-project/pull/142739


[llvm-branch-commits] [lld] [ELF][Mips] Fix addend for preemptible static TLS (PR #150729)

2025-08-14 Thread Jessica Clarke via llvm-branch-commits

jrtc27 wrote:

Ping

https://github.com/llvm/llvm-project/pull/150729