Re: miscellaneous array questions...

2020-07-21 Thread Adam D. Ruppe via Digitalmars-d-learn

On Tuesday, 21 July 2020 at 19:20:28 UTC, Simen Kjærås wrote:
Walter gives some justification in the post immediately 
following:


Welp, that proves my memory wrong!


Re: miscellaneous array questions...

2020-07-21 Thread Simen Kjærås via Digitalmars-d-learn
On Tuesday, 21 July 2020 at 13:42:15 UTC, Steven Schveighoffer 
wrote:

On 7/21/20 8:34 AM, Adam D. Ruppe wrote:

The others aren't wrong about stack size limits playing some 
role, but the primary reason is that it is a weird hack for 
@safe, believe it or not.

...
I don't recall exactly when this was discussed but it came up 
in the earlier days of @safe, I'm pretty sure it worked before 
then.


I think this was discussed, but was not the reason for the 
limitation. The limitation exists even in D1, which is before 
@safe: https://digitalmars.com/d/1.0/arrays.html#static-arrays


I have stressed before that any access of a pointer to a large 
object in @safe code should also check that the base of the 
object is not within the null page (this is not currently 
done). This is the only way to ensure safety.


It seems the limitation was introduced in DMD 0.123, in May 2005:
https://forum.dlang.org/post/d61jpa$1m0l$1...@digitaldaemon.com
Walter gives some justification in the post immediately following:

1) Gigantic static arrays are often either the result of a typo or
are a newbie mistake.
2) Such require a lot of memory for the compiler to handle. Before
the OS officially runs out of memory, it goes to greater and greater
lengths to scavenge memory for the compiler, often bringing the
computer to its knees in desperation.
3) D needs to be a portable language, and by capping the array size
a program is more likely to be portable.
4) Giant arrays are reflected in a corresponding giant size for the
exe file.
5) There simply isn't a need I can think of for such arrays. There
shouldn't be a problem with allocating them dynamically.


I admit I thought it was an old optlink limitation, but it seems 
it's basically arbitrary.


--
  Simen


Re: miscellaneous array questions...

2020-07-21 Thread Johan via Digitalmars-d-learn

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:

1) The D Language Reference says:

"There are four kinds of arrays..." with the first example being
"type* Pointers to data" and "int* p;" etc.

At the risk of sounding overly nitpicky, isn't a pointer to an 
integer simply a pointer to an integer?  How does that pertain 
to an array?


I agree. "type*" being an array makes no sense from a D language 
point of view.


2) "The total size of a static array cannot exceed 16Mb" What 
limits this? And with modern systems of 16GB and 32GB, isn't 
16Mb excessively small?   (an aside: shouldn't that be 16MB in 
the reference instead of 16Mb? that is, Doesn't b = bits and B 
= bytes)


This doesn't make sense either. Where did you find this in the 
documentation? It should be removed, as a larger static array is 
easily shown to work 
(`ubyte[170_000_000] s; void main(){s[160_000_000] = 1;}`).


3) Lastly, in the following code snippet, are arrayA and arrayB 
both allocated on the stack? And how do their scopes and/or 
lifetimes differ?


---
// module1
int[100] arrayA;
void foo() // changed from main to foo for clarity
{
    int[100] arrayB;
    // ...
}
---


"The stack" is not a D language thing, a better way of looking at 
it is that local storage is implemented by all D compilers by 
using the "stack" (on x86).


arrayA is not allocated on the stack; its lifetime is the whole 
duration of the program, with one array per thread.
arrayB is indeed allocated on the stack (local storage); its 
lifetime is only from the start to the end of foo(), with one array 
per call to foo (!).


Because arrayB is on the stack, you are limited by the stack size, 
which is set by the OS (but can be overridden). The array competes 
with everything else that is put on the stack, such as function 
call return addresses and temporary values, neither of which you as 
the coder can see. The maximum size of arrayB you can get away with 
depends heavily on the rest of your program (and on the stack size 
allocated by the OS, which is somewhere in the 4MB, 8MB, 16MB 
range), so it is best to avoid putting large arrays on the stack 
altogether.
arrayA is allocated together with other global/TLS variables in a 
section for which I don't think there really is a size limit.
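
The lifetime difference described above can be seen in a minimal 
sketch. The recursive body of foo and the counter stored in 
arrayA[0] are made up for illustration and are not part of the 
original snippet:

```d
int[100] arrayA; // module scope: lives for the whole program (per thread)

// Hypothetical recursive version of foo, used only to show that every
// call gets its own arrayB while arrayA persists between calls.
int foo(int depth)
{
    int[100] arrayB;          // fresh and zero-initialised on every call
    assert(arrayB[0] == 0);   // nothing left over from an outer call
    arrayB[0] = depth;
    ++arrayA[0];              // counts calls; survives after foo returns
    return depth == 0 ? 0 : arrayB[0] + foo(depth - 1);
}

void main()
{
    assert(foo(3) == 6);      // 3 + 2 + 1 + 0
    assert(arrayA[0] == 4);   // four calls were made in total
}
```

Each level of recursion adds another 400-byte arrayB to the stack, 
which is why deep recursion combined with large local arrays runs 
out of stack quickly.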


-Johan


Re: miscellaneous array questions...

2020-07-21 Thread IGotD- via Digitalmars-d-learn

On Tuesday, 21 July 2020 at 13:23:32 UTC, Adam D. Ruppe wrote:


But the array isn't initialized in the justification scenario. 
It is accessed through a null pointer and the type system 
thinks it is fine because it is still inside the static limit.


At run time, the cpu just sees access to memory address 0 + x, 
and if x is sufficiently large, it can bypass those guard pages.


I'm not that convinced. This totally depends on what the virtual 
memory layout of the process looks like. Some operating systems 
might have a gap between 0 - 16MB but others don't. This can also 
change between versions of the OS, and it becomes even more 
uncertain as address space randomization becomes popular. Safety 
based on assumptions isn't really worth it.


I don't personally care about the 16MB limit, as I don't expect to 
hit it in the foreseeable future, but the motivation for it is kind 
of vague.


Re: miscellaneous array questions...

2020-07-21 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/21/20 8:34 AM, Adam D. Ruppe wrote:

The others aren't wrong about stack size limits playing some role, but 
the primary reason is that it is a weird hack for @safe, believe it or not.

...
I don't recall exactly when this was discussed but it came up in the 
earlier days of @safe, I'm pretty sure it worked before then.


I think this was discussed, but was not the reason for the limitation. 
The limitation exists even in D1, which is before @safe: 
https://digitalmars.com/d/1.0/arrays.html#static-arrays


I have stressed before that any access of a pointer to a large object in 
@safe code should also check that the base of the object is not within 
the null page (this is not currently done). This is the only way to 
ensure safety.


-Steve


Re: miscellaneous array questions...

2020-07-21 Thread Adam D. Ruppe via Digitalmars-d-learn

On Tuesday, 21 July 2020 at 13:16:44 UTC, IGotD- wrote:
Either the array will hit that page during initialization or 
something else during the execution.


But the array isn't initialized in the justification scenario. It 
is accessed through a null pointer and the type system thinks it 
is fine because it is still inside the static limit.


At run time, the cpu just sees access to memory address 0 + x, 
and if x is sufficiently large, it can bypass those guard pages.


Re: miscellaneous array questions...

2020-07-21 Thread IGotD- via Digitalmars-d-learn

On Tuesday, 21 July 2020 at 12:34:14 UTC, Adam D. Ruppe wrote:


With the null `a`, the offset to the static array is just 0 + 
whatever and the @safe mechanism can't trace that.


So the arbitrary limit was put in place to make it more likely 
that such a situation will hit a protected page and segfault 
instead of carrying on. (most low addresses are not actually 
allocated by the OS... though there's no reason why they 
couldn't, it just usually doesn't, so that 16 MB limit makes 
the odds of something like this actually happening a lot lower)


I don't recall exactly when this was discussed but it came up 
in the earlier days of @safe, I'm pretty sure it worked before 
then.


If that's the case I would consider this 16MB limit unnecessary. 
Most operating systems put a guard page at the very bottom of the 
stack (which is usually 1MB - 8MB; the default for the main thread 
is typically 1MB on Windows and 8MB on Linux). Either the array 
will hit that page during initialization or something else will 
during the execution.


Let's say someone puts a 15MB array on the stack: then we will get 
a page fault for sure, and this artificial limit is there for 
nothing. With 64 bits or more and some future crazy operating 
system, large stack sizes like 256MB might be supported. 
This is a little like a 640kB limit.


Re: miscellaneous array questions...

2020-07-21 Thread Adam D. Ruppe via Digitalmars-d-learn

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:

How does that pertain to an array?


C arrays work as pointers to the first element and D can use that 
style too.
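
A minimal sketch of that C style in D. The helper name 
sumViaPointer is made up for illustration:

```d
// Sums n ints starting at p, indexing the raw pointer like an array.
int sumViaPointer(const(int)* p, size_t n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += p[i];        // same as *(p + i); no bounds checking
    return total;
}

void main()
{
    int[100] arrayA;
    foreach (i, ref e; arrayA)
        e = cast(int) i;      // fill with 0, 1, ..., 99

    int* p = arrayA.ptr;      // pointer to the first element
    assert(*(p + 99) == p[99]);
    assert(sumViaPointer(p, arrayA.length) == 4950);

    // Slicing the pointer gives a dynamic array view that knows its length.
    int[] view = p[0 .. arrayA.length];
    assert(view.length == 100);
}
```

The pointer itself carries no length, so indexing through it is 
unchecked; that is the main practical difference from the other 
array kinds.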


2) "The total size of a static array cannot exceed 16Mb" What 
limits this?


The others aren't wrong about stack size limits playing some 
role, but the primary reason is that it is a weird hack for 
@safe, believe it or not.


The idea is:

---
class A {
ubyte[4_000_000_000] whole_system;
}

@safe void lol() {
A a;
a.whole_system[any_address] = whatever;
}
---


With the null `a`, the offset to the static array is just 0 + 
whatever and the @safe mechanism can't trace that.


So the arbitrary limit was put in place to make it more likely 
that such a situation will hit a protected page and segfault 
instead of carrying on. (most low addresses are not actually 
allocated by the OS... though there's no reason why they 
couldn't, it just usually doesn't, so that 16 MB limit makes the 
odds of something like this actually happening a lot lower)
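
The address arithmetic can be sketched without dereferencing 
anything. The small class and the helper fitsBelow are hypothetical, 
and the 16MB figure is just the limit from the question, not a 
guarantee about any OS's unmapped region:

```d
// Hypothetical class for illustration; 1024 is an arbitrary small size.
class A
{
    ubyte[1024] buf;
}

// True when a whole instance of C fits within `limit` bytes, so a null
// reference plus any in-bounds field offset stays below that address.
bool fitsBelow(C)(size_t limit)
{
    return __traits(classInstanceSize, C) <= limit;
}

void main()
{
    enum size_t limit = 16 * 1024 * 1024; // the 16MB figure from the thread

    // Offset of buf from the object's base (hidden fields come first).
    enum size_t start = A.buf.offsetof;
    static assert(start + typeof(A.buf).sizeof
                  <= __traits(classInstanceSize, A));

    // With a null reference, &a.buf[i] would be 0 + start + i.
    assert(fitsBelow!A(limit));
}
```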


I don't recall exactly when this was discussed but it came up in 
the earlier days of @safe, I'm pretty sure it worked before then.


Re: miscellaneous array questions...

2020-07-21 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/21/20 7:10 AM, IGotD- wrote:

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:


2) "The total size of a static array cannot exceed 16Mb" What limits 
this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively 
small?   (an aside: shouldn't that be 16MB in the reference instead of 
16Mb? that is, Doesn't b = bits and B = bytes)




I didn't know this but it makes sense and I guess this is a constraint 
of the D language itself. In practice 16MB should be well enough for 
most cases. I'm not sure where 16MB is taken from, if there is any OS 
out there that has this limitation or if it was just taken as an 
adequate limit.


I believe it stems from a limitation in the way the stacks are 
allocated? Or maybe a limitation in DMC, the basis for DMD.


Also, you CAN actually have larger arrays, they just cannot be put on 
the stack (which most static arrays are):


struct S
{
    ubyte[17_000_000] big;
}

void main()
{
    auto onHeap = new S; // ok: allocated on the heap
    S onStack;           // crash (signal 11 on run.dlang.io)
}

This may not work if `big` had a static initializer, I'm not sure.

-Steve


Re: miscellaneous array questions...

2020-07-21 Thread wjoe via Digitalmars-d-learn

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:
2) "The total size of a static array cannot exceed 16Mb" What 
limits this? And with modern systems of 16GB and 32GB, isn't 
16Mb excessively small?   (an aside: shouldn't that be 16MB in 
the reference instead of 16Mb? that is, Doesn't b = bits and B 
= bytes)


Static arrays are passed by value.

(Also I think you're right about Mb vs MB except it should be 
MiB. 1MB = 1000^2 bytes (decimal) and 1MiB = 1024^2 bytes (binary).
Note that MB is defined as 1024^2 in JEDEC 100B.01 but, IMO, the 
ISO/IEC standard is superior because it's unambiguous and JEDEC 
only defines units up to GB (inclusive))
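
A small sketch of both points. The function names byValue and byRef 
are made up for illustration:

```d
// Receives its own copy of the four ints; the caller's array is untouched.
void byValue(int[4] a)
{
    a[0] = 99;
}

// Receives a reference to the caller's array and modifies it in place.
void byRef(ref int[4] a)
{
    a[0] = 99;
}

void main()
{
    int[4] x = [1, 2, 3, 4];

    byValue(x);
    assert(x[0] == 1);   // the copy was modified, not x

    byRef(x);
    assert(x[0] == 99);  // x itself was modified

    // Decimal vs binary megabytes for the 16 "Mb" in the reference.
    enum MB = 1000 * 1000;
    enum MiB = 1024 * 1024;
    static assert(16 * MiB - 16 * MB == 777_216);
}
```

Passing a 16MiB static array by value would mean copying all of it 
onto the callee's stack, which is one practical reason to pass large 
ones by ref or as a slice.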


Re: miscellaneous array questions...

2020-07-21 Thread IGotD- via Digitalmars-d-learn

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:


2) "The total size of a static array cannot exceed 16Mb" What 
limits this? And with modern systems of 16GB and 32GB, isn't 
16Mb excessively small?   (an aside: shouldn't that be 16MB in 
the reference instead of 16Mb? that is, Doesn't b = bits and B 
= bytes)




I didn't know this but it makes sense and I guess this is a 
constraint of the D language itself. In practice 16MB should be 
well enough for most cases. I'm not sure where 16MB is taken 
from, if there is any OS out there that has this limitation or if 
it was just taken as an adequate limit.


Let's say you have a program with 4 threads, then suddenly the 
TLS area is 4 * 16 MB = 64MB. This size rapidly increases with 
number of threads and TLS area size. Let's say TLS area of 128MB 
and 8 threads, which gives you a memory consumption of 1GB. 
That's how quickly it starts to consume memory if you don't limit 
the TLS variables.


If you want global variables like in good old C/C++, then use 
__gshared. Of course you then have to take care of any concurrent 
accesses from several threads.
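
A minimal sketch of a __gshared array updated from several threads. 
The names counters, counterLock and runWorkers are made up for 
illustration, and a plain mutex is only one of several ways to 
guard the access:

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

__gshared int[100] counters;   // one instance for the whole process
__gshared Mutex counterLock;   // guards counters

shared static this()
{
    counterLock = new Mutex;   // runs once, before main
}

// Starts `threads` workers that each increment counters[0] `perThread`
// times, then returns the final value.
int runWorkers(int threads, int perThread)
{
    counters[0] = 0;
    Thread[] workers;
    foreach (n; 0 .. threads)
    {
        workers ~= new Thread({
            foreach (i; 0 .. perThread)
            {
                counterLock.lock();
                scope (exit) counterLock.unlock();
                ++counters[0];
            }
        });
    }
    foreach (w; workers) w.start();
    foreach (w; workers) w.join();
    return counters[0];
}

void main()
{
    assert(runWorkers(4, 1000) == 4000);
}
```

Without the lock the increments could interleave and the final 
count might come out lower.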




Re: miscellaneous array questions...

2020-07-21 Thread Ali Çehreli via Digitalmars-d-learn

On 7/20/20 8:16 PM, a...@a.com wrote:

>> 3) Lastly, In the following code snippet, is arrayA and arrayB both
>> allocated on the stack?

arrayA is allocated on thread-local storage and lives as long as the 
program is active. I guess a final interaction with it can be in a 
'static ~this()' or a 'shared static ~this()' block.


Note that this is different from e.g. C++: In that language, arrayA 
would be a "global" variable and there would be a single instance of it. 
In D, there will be as many arrayA variables as there are active 
threads. (One thread's modification to its own arrayA is not seen by 
other threads.)
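
A minimal sketch of that per-thread behaviour. The helper name 
firstElementInNewThread is made up for illustration:

```d
import core.thread : Thread;

int[100] arrayA; // module-level, so thread-local by default in D

// Starts a new thread, records what it sees in its own arrayA[0],
// lets it write to its copy, and returns the recorded value.
int firstElementInNewThread()
{
    int seen = -1;
    auto t = new Thread({
        seen = arrayA[0];  // the new thread's copy, freshly initialised
        arrayA[0] = 42;    // changes only that thread's copy
    });
    t.start();
    t.join();
    return seen;
}

void main()
{
    arrayA[0] = 1;                          // main thread's copy
    assert(firstElementInNewThread() == 0); // the other thread saw its own zeros
    assert(arrayA[0] == 1);                 // main's copy is unaffected by the 42
}
```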


arrayB is allocated on the stack and lives as long as the scope that it 
is defined inside. That scope is main's body in your code.


> And how does their scopes and/or lifetimes
>> differ?
>>
>>  module1 =
>> int[100] arrayA;
>> void main()
>> {
>> int[100] arrayB;
>> // ...
>> }
>>  module1 =

Ali



Re: miscellaneous array questions...

2020-07-20 Thread a--- via Digitalmars-d-learn

On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:

1) The D Language Reference says:

"There are four kinds of arrays..." with the first example being
"type* Pointers to data" and "int* p;" etc.

At the risk of sounding overly nitpicky, isn't a pointer to an 
integer simply a pointer to an integer?  How does that pertain 
to an array?



2) "The total size of a static array cannot exceed 16Mb" What 
limits this? And with modern systems of 16GB and 32GB, isn't 
16Mb excessively small?   (an aside: shouldn't that be 16MB in 
the reference instead of 16Mb? that is, Doesn't b = bits and B 
= bytes)



3) Lastly, in the following code snippet, are arrayA and arrayB 
both allocated on the stack? And how do their scopes and/or 
lifetimes differ?


---
// module1
int[100] arrayA;
void main()
{
    int[100] arrayB;
    // ...
}
---


1) Pointers can be used as arrays with the [] operator: after 
int* p = arrayA.ptr; the expressions *(p + 99) and p[99] access 
the same element, so assert(*(p + 99) == p[99]); holds.
http://ddili.org/ders/d.en/pointers.html ("Using pointers with 
the array indexing operator []")
2) I've encountered this problem too, it's arbitrary AFAIK but it 
can be circumvented with dynamic arrays.
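
For point 2, a minimal sketch of the dynamic array route. The 
helper name makeBuffer and the 32MiB size are made up for 
illustration:

```d
// Allocates n zero-initialised bytes on the GC heap instead of in a
// static array, so the static array size limit does not apply.
ubyte[] makeBuffer(size_t n)
{
    return new ubyte[](n);
}

void main()
{
    enum size_t size = 32 * 1024 * 1024; // larger than the 16MB limit
    auto big = makeBuffer(size);

    assert(big.length == size);
    big[$ - 1] = 7;                      // the last element is usable
    assert(big[$ - 1] == 7);
}
```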


miscellaneous array questions...

2020-07-20 Thread WhatMeWorry via Digitalmars-d-learn

1) The D Language Reference says:

"There are four kinds of arrays..." with the first example being
"type* Pointers to data" and "int* p;" etc.

At the risk of sounding overly nitpicky, isn't a pointer to an 
integer simply a pointer to an integer?  How does that pertain to 
an array?



2) "The total size of a static array cannot exceed 16Mb" What 
limits this? And with modern systems of 16GB and 32GB, isn't 16Mb 
excessively small?   (an aside: shouldn't that be 16MB in the 
reference instead of 16Mb? that is, Doesn't b = bits and B = 
bytes)



3) Lastly, in the following code snippet, are arrayA and arrayB 
both allocated on the stack? And how do their scopes and/or 
lifetimes differ?


---
// module1
int[100] arrayA;
void main()
{
    int[100] arrayB;
    // ...
}
---