Re: How is this code invalid?

2022-12-17 Thread j via Digitalmars-d-learn
On Saturday, 17 December 2022 at 00:23:32 UTC, thebluepandabear 
wrote:
I am reading the fantastic book about D by Ali Çehreli, and he 
gives the following example when he talks about variadic 
functions:


```D
int[] numbersForLaterUse;

void foo(int[] numbers...) {
   numbersForLaterUse = numbers;
}

struct S {
  string[] namesForLaterUse;

  void foo(string[] names...) {
    namesForLaterUse = names;
  }
}
```

He says that the code above is a bug because:

"Both the free-standing function foo() and the member function 
S.foo() are in error because they store slices to 
automatically-generated temporary arrays that live on the program 
stack. Those arrays are valid only during the execution of the 
variadic functions."

The thing is, when I run the code I get absolutely no error, so 
how is this exactly a 'bug' if the code runs properly? That's 
what I am confused about. What is the D compiler doing behind 
the scenes?



Ali is right. It is a bug. Do not be like the man of the moon who 
copies brothers like a monkey then learning all that is bad from 
his brothers he convinces himself it is ok. Be your own man and 
determine if you have what it takes or not. We are here for your 
support. I am sure there is a method in object that can help you 
with this bug because I have used it before, but do not be a 
Lemming on the wall. You will fall. Then you will not be able to 
help the other behind you from falling too.


Re: How is this code invalid?

2022-12-17 Thread H. S. Teoh via Digitalmars-d-learn
On Sat, Dec 17, 2022 at 02:36:10AM +, thebluepandabear via 
Digitalmars-d-learn wrote:
[...]
> Thanks, I've tried to mark it with `@safe` and it did give me a
> warning.
> 
> I was also wondering, why is this code valid?
> 
> ```D
> int[] numbersForLaterUse;
> 
> @safe void foo(int[] numbers) {
>   numbersForLaterUse = numbers;
> }
> ```

This code is safe provided the arguments are not allocated on the stack,
which is usually the case because you can no longer call it with:

	foo(1, 2, 3, 4);

but you have to write:

	foo([ 1, 2, 3, 4 ]);

The [] here will allocate a new array on the heap, so the array elements
will not go out of scope when the caller returns. (They will be
collected by the GC after all references to them have gone out of scope.
This is one of the advantages of using a GC: it saves you from having to
worry about complicated lifetimes in such cases.)

You may still run into trouble, though, if you do this:

	int[3] data = [ 1, 2, 3 ]; // N.B.: stack-allocated
	foo(data[]);               // uh oh

To guard against this, use @safe and -dip1000, which will cause the
compiler to detect this dangerous usage and generate an error.
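
If the data really does start out on the stack, one way around the
problem is to hand foo() a heap copy instead. This is only a minimal
sketch: the `caller` function is a made-up name for illustration, and
`.dup` is the built-in array property that copies the elements into a
new GC-allocated array.

```d
int[] numbersForLaterUse;

@safe void foo(int[] numbers) {
    numbersForLaterUse = numbers;
}

// Hypothetical caller, not part of the original example.
void caller() {
    int[3] data = [1, 2, 3];   // stack-allocated
    foo(data.dup);             // pass a GC-allocated copy, not a stack slice
}

void main() {
    caller();
    // The stored slice refers to heap memory, so it is still valid here.
    assert(numbersForLaterUse == [1, 2, 3]);
}
```

The cost is an extra allocation and copy per call, which is usually an
acceptable price for not having to reason about the caller's stack frame.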


T

-- 
Answer: Because it breaks the logical sequence of discussion. / Question: Why 
is top posting bad?


Re: How is this code invalid?

2022-12-16 Thread Ali Çehreli via Digitalmars-d-learn

On 12/16/22 18:20, H. S. Teoh wrote:

> scratch space for computations, called the runtime
> stack.

I called it "function call stack" where I gave a very simplistic view of 
it here:


  https://www.youtube.com/watch?v=NWIU5wn1F1I&t=236s

> (2) Use @safe when possible so that the compiler will tell you when
> you're doing something wrong and potentially dangerous.

Unfortunately, @safe is not as prominent in the book as it should be. 
Part of the reason, I think, is that its implementation is not complete, 
especially in how it changes with -dip1000.


Ali



Re: How is this code invalid?

2022-12-16 Thread thebluepandabear via Digitalmars-d-learn


Thanks, I've tried to mark it with `@safe` and it did give me a 
warning.


I was also wondering, why is this code valid?

```D
int[] numbersForLaterUse;

@safe void foo(int[] numbers) {
numbersForLaterUse = numbers;
}
```



Re: How is this code invalid?

2022-12-16 Thread H. S. Teoh via Digitalmars-d-learn
On Fri, Dec 16, 2022 at 05:39:08PM -0800, H. S. Teoh via Digitalmars-d-learn 
wrote:
[...]
> If you really want to see what could possibly have gone wrong, try
> this version of the code:
[...]
> The results will likely differ depending on your OS and specific
> environment; but on my Linux machine, it outputs a bunch of garbage
> (instead of the expected numbers and "hello" "world!" strings) and
> crashes.
[...]

In case you're wondering, here's a brief explanation of why the above
code triggers a problem:

When your program is running, the CPU has a LIFO (last-in, first-out)
structure that it uses as scratch space for computations, called the runtime
stack.  Function arguments are typically passed by having the calling
function push the values on the stack, and having the called function
retrieve these values from the stack. In addition to function arguments,
the CPU also stores various other information on the stack, such as the
return address to jump to once the called function returns, and
potentially other stuff, depending on the specific OS and CPU.
Furthermore, the called function itself also reserves some space on the
stack for storing local variables.  Together, this information is called
a "stack frame".

When you call badCodeBad(), the arguments [ 1, 2, 3, 4, 5 ] are
allocated on the stack and passed to foo().  foo() then stores a slice
to these arguments,  i.e., a slice of the stack locations that currently
contain [ 1, 2, 3, 4, 5 ].  Then foo() returns to badCodeBad(), and
badCodeBad() returns to main.  The stack frame that contains the [ 1, 2,
3, 4, 5 ] is now no longer in scope.  However, it may not necessarily
have been overwritten with new data yet.

Then main() calls whatwentwrong(). This involves creating a new stack
frame for whatwentwrong(), pushing the return address on the stack, and
so on.  At this point, whatwentwrong()'s stack frame overwrites the
original stack frame where badCodeBad() stored the [ 1, 2, 3, 4, 5 ].
The array elements are now overwritten with other data that aren't
supposed to be interpreted as integers.  That's why when whatwentwrong()
tries to print the contents of numbersForLaterUse, which now points to an
area on the stack that has just been overwritten by whatwentwrong()'s
stack frame, you get garbage output.
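
The effect can be imitated with fully defined behavior in a toy model.
In the sketch below a fixed buffer stands in for the real call stack;
`fakeStack`, `top`, `pushFrame` and `popFrame` are all made-up names for
illustration, and real stack frames hold much more than the argument
values shown here.

```d
// Toy model only: a fixed buffer stands in for the real call stack.
int[8] fakeStack;
size_t top;          // index of the next free slot

// "Call" a function: copy its data onto the fake stack and
// return a slice of the slots it occupies.
int[] pushFrame(const int[] values) {
    int[] frame = fakeStack[top .. top + values.length];
    frame[] = values[];
    top += values.length;
    return frame;
}

// "Return" from a function: the slots are released but not cleared.
void popFrame(size_t n) {
    top -= n;
}

void main() {
    int[] saved = pushFrame([1, 2, 3]);   // like foo() storing its slice
    popFrame(3);                          // the frame goes out of scope
    assert(saved == [1, 2, 3]);           // stale data still looks fine...

    pushFrame([7, 8, 9]);                 // the next call reuses the slots
    assert(saved == [7, 8, 9]);           // ...until it gets overwritten
}
```

The saved slice never changes; only the memory underneath it does, which
is why the original program can appear to work right up until another
call reuses that part of the stack.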

A similar thing happens when you call alsoReallyBad(). It allocates the
string array [ "hello", "world!" ] on the stack, and S.foo() wrongly
stores a slice to that location on the stack.  When alsoReallyBad()
returns, the stack frame that contains this array goes out of scope
(though not necessarily overwritten just yet).  When main() then calls
whatelsewentwrong(), that involves passing the instance of S as
argument, and also creating a new stack frame for whatelsewentwrong(). All
of this new data overwrites the original stack frame, stomping all over
the [ "hello", "world!" ] array and overwriting it with stuff that isn't
supposed to be interpreted as a string array.

When whatelsewentwrong() then tries to print the contents of
s.namesForLaterUse, the slice points to a location on the stack that
no longer contains the string array; writeln still
tries to interpret this as a string array, which results in garbage
being printed.  Since a string is also an array, consisting of a pointer
and a length, interpreting random data as a string causes writeln to
read a random amount of data from a random location in memory. On my
system, it just so happens that part of that range of memory locations is outside
the range mapped by the OS to the program; this causes an invalid memory
access that made the OS forcefully terminate the program.
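
To make the "pointer and a length" point concrete, here is a small
sketch that inspects a valid string. It only illustrates the layout;
it does not reproduce the crash.

```d
import std.stdio;

void main() {
    string s = "hello";

    // A slice is just two machine words: a length and a pointer.
    static assert(string.sizeof == size_t.sizeof + (void*).sizeof);

    writeln(s.length);   // how many chars the slice claims to cover: 5
    writeln(s.ptr);      // the address writeln would start reading from
}
```

If those two words are overwritten with arbitrary stack data, both the
address and the length become meaningless, and writeln has no way of
telling.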

//

The underlying cause of these problems is exactly what Ali said in his
book: foo() and S.foo() tried to store a slice to a stack location past
its lifetime.  Once the stack frame went out of scope, all bets are off
as to what the slice now points to.  It could have been overwritten by
other data that can no longer be interpreted as an int[] or string[].
In this case, it caused the program to print random garbage and crash.
In more complicated scenarios, such a bug in the code can become a hole
for a hacker to exploit.

Consider, for example, if the code tried to do some arithmetic on the
int[] that it saved as numbersForLaterUse. Since the location that used
to contain the int[] now contains a function stack frame, part of it
could potentially contain a return address to main(). The hacker could
exploit this by manipulating the program's input such that the
arithmetic on the int[] overwrites this return address to point to
something else, such as an OS call to format your hard drive.  Then when
the function finishes what it's doing and tries to return, instead of
returning to main() it jumps to the function that formats your hard
drive.

The takeaway from all this is:

(1) It's Bad(tm) to store a slice to a stack location past its lifetime.

(2) Use @safe when possible so that the compiler will tell you when
you're doing something wrong and potentially dangerous.

Re: How is this code invalid?

2022-12-16 Thread H. S. Teoh via Digitalmars-d-learn
On Sat, Dec 17, 2022 at 12:23:32AM +, thebluepandabear via 
Digitalmars-d-learn wrote:
[...]
> ```D
> int[] numbersForLaterUse;
> 
> void foo(int[] numbers...) {
>    numbersForLaterUse = numbers;
> }
> 
> struct S {
>   string[] namesForLaterUse;
> 
>   void foo(string[] names...) {
>     namesForLaterUse = names;
>   }
> }
> ```
[...]
> The thing is, when I run the code I get absolutely no error, so how is
> this exactly a 'bug' if the code runs properly? That's what I am
> confused about.  What is the D compiler doing behind the scenes?

Try labelling the above functions with @safe and see what the compiler
says.

If you really want to see what could possibly have gone wrong, try this
version of the code:

--snip---
int[] numbersForLaterUse;

void foo(int[] numbers...) {
   numbersForLaterUse = numbers;
}

struct S {
  string[] namesForLaterUse;

  void foo(string[] names...) {
    namesForLaterUse = names;
  }
}

void whatwentwrong() {
  import std.stdio;
  writeln(numbersForLaterUse);
}

void whatelsewentwrong(S s) {
  import std.stdio;
  writeln(s.namesForLaterUse);
}

void badCodeBad() {
  foo(1, 2, 3, 4, 5);
}

S alsoReallyBad() {
  S s;
  s.foo("hello", "world!");
  return s;
}

void main() {
  badCodeBad();
  whatwentwrong();

  auto s = alsoReallyBad();
  whatelsewentwrong(s);
}
--snip---

The results will likely differ depending on your OS and specific
environment; but on my Linux machine, it outputs a bunch of garbage
(instead of the expected numbers and "hello" "world!" strings) and
crashes.


T

-- 
If you want to solve a problem, you need to address its root cause, not just 
its symptoms. Otherwise it's like treating cancer with Tylenol...


Re: How is this code invalid?

2022-12-16 Thread ag0aep6g via Digitalmars-d-learn
On Saturday, 17 December 2022 at 00:23:32 UTC, thebluepandabear 
wrote:

```D
int[] numbersForLaterUse;

void foo(int[] numbers...) {
   numbersForLaterUse = numbers;
}

struct S {
  string[] namesForLaterUse;

  void foo(string[] names...) {
    namesForLaterUse = names;
  }
}
```

[...]
The thing is, when I run the code I get absolutely no error, so 
how is this exactly a 'bug' if the code runs properly? That's 
what I am confused about. What is the D compiler doing behind 
the scenes?


You're witnessing the wonders of undefined behavior. Invalid code 
can still produce the results you're hoping for, or it can 
produce garbage results, or it can crash, or it can do something 
else entirely. And just because running it once does one thing, 
does not mean that the next run will do the same.


For your particular code, here is an example where 
`numbersForLaterUse` ends up not being what we pass in:


```d
int[] numbersForLaterUse;

void foo(int[] numbers...) {
    numbersForLaterUse = numbers; /* No! Don't! Bad programmer! Bad! */
}

void bar()
{
    int[3] n = [1, 2, 3];
    foo(n);
}

void main()
{
    bar();
    import std.stdio;
    writeln(numbersForLaterUse); /* prints garbage */
}
```

But again nothing at all is actually guaranteed about what that 
program does. It exhibits undefined behavior. So it could just as 
well print "[1, 2, 3]", making you think that everything is fine.
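
For comparison, here is a minimal sketch of one way to make `foo` well 
defined: copy the elements with `.dup` before storing them. This is just 
one possible fix, not the only one.

```d
int[] numbersForLaterUse;

void foo(int[] numbers...) {
    // Copy the elements into a fresh GC-allocated array; the copy
    // outlives the temporary array that `numbers` refers to.
    numbersForLaterUse = numbers.dup;
}

void bar()
{
    int[3] n = [1, 2, 3];
    foo(n);
}

void main()
{
    bar();
    import std.stdio;
    writeln(numbersForLaterUse); /* prints [1, 2, 3] */
}
```

Now the stored slice points at heap memory that the GC keeps alive for 
as long as `numbersForLaterUse` references it.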


How is this code invalid?

2022-12-16 Thread thebluepandabear via Digitalmars-d-learn
I am reading the fantastic book about D by Ali Çehreli, and he 
gives the following example when he talks about variadic 
functions:


```D
int[] numbersForLaterUse;

void foo(int[] numbers...) {
   numbersForLaterUse = numbers;
}

struct S {
  string[] namesForLaterUse;

  void foo(string[] names...) {
    namesForLaterUse = names;
  }
}
```

He says that the code above is a bug because:

"Both the free-standing function foo() and the member function 
S.foo() are in error because they store slices to 
automatically-generated temporary arrays that live on the program 
stack. Those arrays are valid only during the execution of the 
variadic functions."

The thing is, when I run the code I get absolutely no error, so 
how is this exactly a 'bug' if the code runs properly? That's 
what I am confused about. What is the D compiler doing behind the 
scenes?