================
@@ -15454,33 +15471,43 @@ The first index determines which element/field of ``basetype`` is selected,
 computes the pointer to access this element/field assuming ``source`` points
 to the start of ``basetype``. This pointer becomes the new ``source``, the
 current type the new
-``basetype``, and the next indices is consumed until a scalar type is
+``basetype``, and the next index is consumed until a scalar type is
 reached or all indices are consumed.
-All indices must be consumed, and it is illegal to index into a scalar type.
-Meaning the maximum number of indices depends on the depth of the basetype.
-
-Because this instruction performs a logical addressing, all indices are
-assumed to be inbounds. This means it is not possible to access the next
-element in the logical layout by overflowing:
-
-- If the indexed type is a struct with N fields, the index must be an
-  immediate/constant value in the range ``[0; N[``.
-- If indexing into an array or vector, the index can be a variable, but
-  is assumed to be inbounds with regards to the current basetype logical layout.
-- If the traversed type is an array or vector of N elements with ``N > 0``,
-  the index is assumed to belong to ``[0; N[``.
-- If the traversed type is an array of size ``0``, the array size is assumed
-  to be known at runtime, and the instruction assumes the index is always
-  inbounds.
-
-In all cases **except** when the accessed type is a 0-sized array, indexing
-out of bounds yields `poison`. When the index value is unknown, optimizations
-can use the type bounds to determine the range of values the index can have.
+All indices must be consumed, and it is illegal to index into a scalar type,
+meaning the maximum number of indices depends on the depth of the basetype.
+
+If the indexed type is a struct with N fields, the index must be an
+integer constant in the range ``[0, N[``.
+
+If the constraints implied by a flag bit are violated, the result is ``poison``.
 If the source pointer is poison, the instruction returns poison.
 The resulting pointer belongs to the same address space as ``source``.
 This instruction does not dereference the pointer.
+Defined flag bits:
+""""""""""""""""""
+
+``inbounds``
+   Bit 0 (``1 << 0``) - specifies that this index is within the bounds of the type
+   being indexed at that level. In particular, when indexing an array or vector
+   ``[ N x T ]``, implies that the index is in the range ``[0, N[``. As an exception,
+   if ``N`` is 0, the bound is treated as an unknown, dynamic value, but the flag
+   still implies that the index is inside that runtime bound.
+   Structure accesses are always ``inbounds`` and must be marked as
+   such. A structured GEP is said to be inbounds if all of its indices are inbounds.
+
+``nneg``
+   Bit 1 (``1 << 1``) - specifies that the value of this index is non-negative.
+   This is not necessarily implied by ``inbounds``, as an object may have more
+   fields than the maximal signed value for the index type.
----------------
Flakebi wrote:

> ```llvm
> %p0 = sgep inbounds nneg [8 x i32] %p, i32 4
> %p1 = sgep inbounds [8 x i32] %p0, i32 %x
> ```

Hm, is this supposed to be valid?
My understanding of the current sgep definition is that only `0 <= %x < 8` would be inbounds.

Your example would definitely be valid for gep, however currently, sgep is _different_ from gep as it does not take the first index that gep takes.
So, `%p0` points to an int element in the array and you cannot further index into that.
The following, just combining the two sgeps, which in itself should always be valid, shows obviously invalid code:
```llvm
%p1 = sgep inbounds [8 x i32] %p, i32 4, i32 %x
```

So, if we wanted to allow things like negative indices, we would need to change the definition of sgep and add the first index that gep has, or “emulate” similar behavior through `[0 x i32]` or so.

https://github.com/llvm/llvm-project/pull/200093
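
For illustration only (not part of the patch or of the review): a minimal Python sketch of the index-consumption rule as the comment above reads it. The `Scalar`, `Array`, and `Struct` classes and the `walk_indices` helper are hypothetical names invented for this sketch. Indices are modelled as concrete integers, so a variable such as `%x` has to be given a value, and an out-of-bounds index raises an error here, whereas the quoted text says the instruction itself would yield `poison`.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Scalar:
    name: str  # e.g. "i32"


@dataclass(frozen=True)
class Array:
    count: int        # 0 models an array whose bound is only known at runtime
    element: "Type"


@dataclass(frozen=True)
class Struct:
    fields: Tuple["Type", ...]


Type = Union[Scalar, Array, Struct]


def walk_indices(basetype: Type, indices: Sequence[int]) -> Type:
    """Consume one index per nesting level and return the type reached.

    Assumption: every index carries the ``inbounds`` flag, so each one is
    checked against the bound of the level it indexes.
    """
    current = basetype
    for idx in indices:
        if isinstance(current, Scalar):
            # There is no leading "pointer offset" index as in gep, so a
            # scalar level cannot be indexed any further.
            raise ValueError(f"cannot index into scalar type {current.name}")
        if isinstance(current, Struct):
            if not 0 <= idx < len(current.fields):
                raise ValueError(f"struct index {idx} out of range")
            current = current.fields[idx]
        else:
            # count == 0: the bound is dynamic, so this model cannot check it.
            if current.count > 0 and not 0 <= idx < current.count:
                raise ValueError(f"array index {idx} out of bounds")
            current = current.element
    return current


i32 = Scalar("i32")
arr = Array(8, i32)

# Models `sgep inbounds [8 x i32] %p, i32 4`: one index, reaches the i32.
assert walk_indices(arr, [4]) == i32

# Models the combined form `sgep inbounds [8 x i32] %p, i32 4, i32 %x`
# with %x = 2: the second index has nothing left to index into.
try:
    walk_indices(arr, [4, 2])
except ValueError as err:
    print("rejected:", err)
```

In this model the only way to make a second index meaningful is to change the base type, which matches the two options named in the comment: add a gep-style first index, or go through something like `[0 x i32]`.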
