On Sat, Aug 16, 2025 at 11:56:43AM +0000, Brother Bill via Digitalmars-d-learn 
wrote:
> It is obvious that reading or writing to invalid memory can result in
> "undefined behavior".
> But is merely pointing to invalid memory "harmful"?
> 
> The documentation states that going one past the last element of a
> slice is acceptable.

Where does it say this?  This is wrong.


> But is it also safe to go 10, 100 or 1000 items past the last element
> of a slice?

Of course not.

//

It all depends on the interpretation you're using.

Technically speaking, a pointer is just a memory address: an unsigned
integer.  There's nothing inherently "harmful" about an integer.

The problem arises when you interpret it as a memory address.  Once you
interpret it as an address, you're likely to pass it around to code that
expects to be able to read or write to memory at that address.  And
that's where the problem arises.  There are expectations placed upon an
unsigned integer that's to be interpreted as an address, such as that
you can read memory from that address.  The set of integers that are
valid addresses is a subset of the set of all (representable) integers.
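
To make the "just an integer" point concrete, here is a minimal sketch in
D.  The helper name asInteger is made up for illustration, and the
printed addresses will differ from run to run.  Nothing here reads or
writes memory at the far-away address; it only does arithmetic on the
integer.

```d
import std.stdio : writefln;

/// Hypothetical helper: view a pointer as the plain integer it holds.
size_t asInteger(const(void)* p)
{
    return cast(size_t) p;
}

void main()
{
    int[4] data = [10, 20, 30, 40];
    int* p = &data[0];

    // Ordinary integer arithmetic; no memory is touched by these lines.
    size_t addr = asInteger(p);
    size_t far = addr + 1000 * int.sizeof;
    writefln("first element:       0x%x", addr);
    writefln("1000 elements later: 0x%x", far);

    assert(far - addr == 4000);
    assert(*p == 10);   // p really does point at an int, so this read is fine

    // The harmful step would be treating `far` as an address again:
    // int* q = cast(int*) far;   // compiles in @system code
    // int v = *q;                // undefined behavior
}
```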

It's not just about pointing to "invalid memory" either; it's also about
not breaking the expectations of the type system.  When handed a pointer
to a string, for example, the expectation is that when you read memory
at that address, you will find a valid sequence of values that
represents a string.  If you treat a random unsigned integer as a
pointer to a string, you may end up reading a sequence of values that
*aren't* a string, thereby obtaining invalid data.  Or worse, if you
write to that address, then somebody else (i.e. some other code) that
put the previous data there may try to read it later, expecting valid
data of the previous type, and get instead something that's no longer a
valid value of that type.  The set of memory addresses containing data
of the correct type is narrower than the set of valid addresses
(addresses assigned to you by the OS), and the set of valid addresses is
narrower than the set of all addresses.  Most addresses were never
assigned to your program, and if you try to access one of them the OS
will step in and terminate your program.
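
A small sketch of the "valid address, wrong type" case, assuming the
usual IEEE 754 single-precision float.  The helper names readAsUint and
writeAsUint are invented for this example.  The address is perfectly
valid, but the data found there is not what the mistyped pointer claims
it is, and a write through that pointer changes what the owner of the
float sees afterwards.

```d
import std.math : isNaN;

/// Hypothetical helper: read a float's memory as if it held a uint.
uint readAsUint(float* fp)
{
    return *cast(uint*) fp;
}

/// Hypothetical helper: write a uint into memory that holds a float.
void writeAsUint(float* fp, uint bits)
{
    *cast(uint*) fp = bits;
}

void main()
{
    float f = 1.0f;   // the owner stores a float here

    // Valid address, wrong type: we get the bit pattern, not the number 1.
    assert(readAsUint(&f) == 0x3F80_0000);

    // Writing through the mistyped pointer clobbers the owner's value.
    writeAsUint(&f, 0x7FC0_0000);   // bit pattern of a quiet NaN
    assert(isNaN(f));               // the 1.0 the owner stored is gone
}
```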

//

Now in theory you can allow arbitrary values in your pointers, and only
check for validity when you actually dereference them, analogous to how,
given a street address handed to you on a piece of paper, you'd check
whether that address actually exists before actually heading out there.
In practice, though, this doesn't work, because it means every
pointer dereference your program makes would have to run through some
global registry of valid addresses and check whether data currently
stored there is of the correct type.  This would be extremely slow and
the simplest of operations would take forever to run. (Not to mention
the issues of keeping said global registry up-to-date as the program
runs and modifies its data.)
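
For illustration only, here is roughly what such a registry could look
like.  This is a toy sketch, not how any real runtime works: Registry,
track, untrack and tryRead are all hypothetical names, and it only
tracks addresses you register by hand.  Even so, every read costs a
hash lookup plus a type comparison, and the table has to be updated
every time an object comes or goes.

```d
/// Toy "global registry" of live addresses and the type stored at each.
struct Registry
{
    private TypeInfo[size_t] live;   // address -> type of the data there

    void track(T)(T* p)   { live[cast(size_t) p] = typeid(T); }
    void untrack(T)(T* p) { live.remove(cast(size_t) p); }

    /// Checked dereference: consult the registry before touching memory.
    bool tryRead(T)(size_t addr, out T value)
    {
        auto entry = addr in live;
        if (entry is null || *entry != typeid(T))
            return false;            // unknown address or wrong type
        value = *cast(T*) addr;
        return true;
    }
}

void main()
{
    Registry reg;
    int x = 42;
    reg.track(&x);

    int got;
    assert(reg.tryRead!int(cast(size_t) &x, got) && got == 42);

    double d;
    assert(!reg.tryRead!double(cast(size_t) &x, d));        // wrong type
    assert(!reg.tryRead!int(cast(size_t) &x + 4000, got));  // never tracked

    reg.untrack(&x);
    assert(!reg.tryRead!int(cast(size_t) &x, got));         // no longer live
}
```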

To eliminate this onerous overhead every time you dereference a pointer,
programming languages make the simplification that *all* pointers must
always contain a valid address of the correct type (or a special null
value that indicates that there is no address at all).  The idea is
that before even assigning a given integer value to a pointer, you'd
ensure that it was a valid address to begin with, so that by the time
you try to dereference the pointer, you can be confident that it's a
valid address and simply dereference it without further verification.
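
In code, that convention means the only check you typically see is the
null check.  A minimal sketch (valueOr is a made-up helper, not a
library function):

```d
/// Hypothetical helper: the only check made is against null.
int valueOr(const(int)* p, int fallback)
{
    if (p is null)
        return fallback;   // "no address at all"
    return *p;             // a non-null int* is assumed to point at a real int
}

void main()
{
    int x = 7;
    assert(valueOr(&x, -1) == 7);
    assert(valueOr(null, -1) == -1);
}
```

Note that valueOr has no way to tell a good non-null pointer from a bad
one; it relies entirely on the caller having kept up its end of the
bargain.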

This is essentially the whole point of a type system -- to ensure that a
given piece of data is a valid representation of its intended type, so
that you can safely manipulate it.

Doing things like assigning non-pointer values to a pointer breaks the
guarantees that the type system gives you, because the assumptions made
by all those places in your code that dereference this pointer are now
invalid, and all bets are off as to what happens when you run that code.
This is why it's invalid to point to "invalid" memory.  The act of
pointing itself is "harmless" -- since it's just some integer address --
but the harm comes from the broken assumptions of the rest of the code
that assumes that the address contained in the pointer is a valid
address containing data of the expected type.


T

-- 
A mathematician learns more and more about less and less, until he knows 
everything about nothing; whereas a philosopher learns less and less about more 
and more, until he knows nothing about everything.