Re: miscellaneous array questions...
On Tuesday, 21 July 2020 at 19:20:28 UTC, Simen Kjærås wrote:
> Walter gives some justification in the post immediately following:

Whelp, that proves my memory wrong!
Re: miscellaneous array questions...
On Tuesday, 21 July 2020 at 13:42:15 UTC, Steven Schveighoffer wrote: On 7/21/20 8:34 AM, Adam D. Ruppe wrote: The others aren't wrong about stack size limits playing some role, but the primary reason is that it is a weird hack for @safe, believe it or not. ... I don't recall exactly when this was discussed but it came up in the earlier days of @safe, I'm pretty sure it worked before then. I think this was discussed, but was not the reason for the limitation. The limitation exists even in D1, which is before @safe: https://digitalmars.com/d/1.0/arrays.html#static-arrays I have stressed before that any access of a pointer to a large object in @safe code should also check that the base of the object is not within the null page (this is not currently done). This is the only way to ensure safety. It seems the limitation was introduced in DMD 0.123, in May 2005: https://forum.dlang.org/post/d61jpa$1m0l$1...@digitaldaemon.com Walter gives some justification in the post immediately following: 1) Gigantic static arrays are often either the result of a typo or are a newbie mistake. 2) Such require a lot of memory for the compiler to handle. Before the OS officially runs out of memory, it goes to greater and greater lengths to scavenge memory for the compiler, often bringing the computer to its knees in desperation. 3) D needs to be a portable language, and by capping the array size a program is more likely to be portable. 4) Giant arrays are reflected in a corresponding giant size for the exe file. 5) There simply isn't a need I can think of for such arrays. There shouldn't be a problem with allocating them dynamically. I admit I thought it was an old optlink limitation, but it seems it's basically arbitrary. -- Simen
Re: miscellaneous array questions...
On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote: 1) The D Language Reference says: "There are four kinds of arrays..." with the first example being "type* Pointers to data" and "int* p; etc. At the risk of sounding overly nitpicky, isn't a pointer to an integer simply a pointer to an integer? How does that pertain to an array? I agree. "type*" being an array makes no sense from a D language point of view. 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes) This doesn't make sense either. Where did you find this in the documentation? It should be removed, as it is easily proven to work (`ubyte[170_000_000] s; void main(){s[160_000_000] = 1;}`). 3) Lastly, In the following code snippet, is arrayA and arrayB both allocated on the stack? And how does their scopes and/or lifetimes differ? module1 = int[100] arrayA; void foo() // changed from main to foo for clarity { int[100] arrayB; // ... } module1 = "The stack" is not a D language thing, a better way of looking at it is that local storage is implemented by all D compilers by using the "stack" (on x86). arrayA is not allocated on the stack, lifetime is whole duration of program, one array per thread. arrayB is indeed allocated on the stack (local storage), lifetime is only from start to end of foo(), one array per call to foo (!). Because arrayB is on the stack, you are limited by stack size which is set by the OS (but can be overridden). The array would be competing with all other things that are put on the stack, such as function call return addresses and temporary values, both of which you as coder cannot see. 
The maximum size of arrayB you can get away with depends heavily on the rest of your program (and on the stack size allocated by the OS, which is somewhere in the 4MB, 8MB, 16MB range), so it is best to avoid putting large arrays on the stack altogether. arrayA is allocated together with other global/TLS variables in a section for which I don't think there really is a size limit. -Johan
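A minimal sketch of the lifetime difference Johan describes, assuming a D2 compiler. The helper name `touch` is made up for illustration; it only shows that the local array starts fresh on every call while the module-level one keeps its state.

```d
int[100] arrayA;            // module level: lives as long as the thread/program

// Hypothetical helper: bumps the first element of both arrays.
int touch()
{
    int[100] arrayB;        // local storage: re-created (zero-initialized) per call
    arrayA[0] += 1;
    arrayB[0] += 1;
    return arrayB[0];
}

void main()
{
    assert(touch() == 1);
    assert(touch() == 1);   // arrayB started from zero again
    assert(arrayA[0] == 2); // arrayA remembered the first call
}
```

The same reasoning is why a very large `arrayB` competes with everything else on the stack, while `arrayA` does not.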
Re: miscellaneous array questions...
On Tuesday, 21 July 2020 at 13:23:32 UTC, Adam D. Ruppe wrote:
> But the array isn't initialized in the justification scenario. It is accessed through a null pointer and the type system thinks it is fine because it is still inside the static limit. At run time, the cpu just sees access to memory address 0 + x, and if x is sufficient large, it can bypass those guard pages.

I'm not that convinced. This depends entirely on what the virtual memory layout of the process looks like. Some operating systems might have a gap between 0 and 16MB, but others don't. This can also change between versions of an OS, and it becomes even more uncertain as address space randomization becomes popular. Safety based on assumptions isn't really worth it. I don't personally care about the 16MB limit, as I don't expect to run into it in the foreseeable future, but the motivation for it is kind of vague.
Re: miscellaneous array questions...
On 7/21/20 8:34 AM, Adam D. Ruppe wrote: The others aren't wrong about stack size limits playing some role, but the primary reason is that it is a weird hack for @safe, believe it or not. ... I don't recall exactly when this was discussed but it came up in the earlier days of @safe, I'm pretty sure it worked before then. I think this was discussed, but was not the reason for the limitation. The limitation exists even in D1, which is before @safe: https://digitalmars.com/d/1.0/arrays.html#static-arrays I have stressed before that any access of a pointer to a large object in @safe code should also check that the base of the object is not within the null page (this is not currently done). This is the only way to ensure safety. -Steve
Re: miscellaneous array questions...
On Tuesday, 21 July 2020 at 13:16:44 UTC, IGotD- wrote:
> Either the array will hit that page during initialization or something else during the execution.

But the array isn't initialized in the justification scenario. It is accessed through a null pointer and the type system thinks it is fine because it is still inside the static limit. At run time, the CPU just sees an access to memory address 0 + x, and if x is sufficiently large, it can bypass those guard pages.
Re: miscellaneous array questions...
On Tuesday, 21 July 2020 at 12:34:14 UTC, Adam D. Ruppe wrote: With the null `a`, the offset to the static array is just 0 + whatever and the @safe mechanism can't trace that. So the arbitrary limit was put in place to make it more likely that such a situation will hit a protected page and segfault instead of carrying on. (most low addresses are not actually allocated by the OS... though there's no reason why they couldn't, it just usually doesn't, so that 16 MB limit makes the odds of something like this actually happening a lot lower) I don't recall exactly when this was discussed but it came up in the earlier days of @safe, I'm pretty sure it worked before then. If that's the case I would consider this 16MB limit unnecessary. Most operating systems put a guard page at the very bottom of the stack (which is usually 1MB - 4MB, usually 1MB on Linux). Either the array will hit that page during initialization or something else during the execution. Let's say someone puts a 15MB array on the stack, then we will have a page fault instead for sure and this artificial limit there for nothing. With 64-bits or more and some future crazy operating system, it might support large stack sizes like 256MB. This is a little like a 640kB limit.
Re: miscellaneous array questions...
On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:
> How does that pertain to an array?

C arrays work as pointers to the first element and D can use that style too.

> 2) "The total size of a static array cannot exceed 16Mb" What limits this?

The others aren't wrong about stack size limits playing some role, but the primary reason is that it is a weird hack for @safe, believe it or not. The idea is:

---
class A {
    ubyte[4_000_000_000] whole_system;
}

@safe void lol() {
    A a;
    a.whole_system[any_address] = whatever;
}
---

With the null `a`, the offset to the static array is just 0 + whatever and the @safe mechanism can't trace that. So the arbitrary limit was put in place to make it more likely that such a situation will hit a protected page and segfault instead of carrying on. (most low addresses are not actually allocated by the OS... though there's no reason why they couldn't, it just usually doesn't, so that 16 MB limit makes the odds of something like this actually happening a lot lower)

I don't recall exactly when this was discussed but it came up in the earlier days of @safe, I'm pretty sure it worked before then.
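To make the "0 + whatever" arithmetic concrete, here is a small sketch. It uses a struct instead of the class from the post, and the names `Big`, `header`, and `addressViaNull` are invented for illustration. It only computes the address a null-based access would touch; it never dereferences anything.

```d
struct Big
{
    size_t header;          // hypothetical leading field, so the array has a non-zero offset
    ubyte[1024] payload;    // small stand-in for the huge array in the post
}

// Address that `p.payload[i]` would touch if `p` were a null Big*:
// 0 (the null base) + field offset + index.
size_t addressViaNull(size_t i)
{
    return Big.payload.offsetof + i;
}

void main()
{
    assert(addressViaNull(0) == size_t.sizeof);
    assert(addressViaNull(100) == size_t.sizeof + 100);
}
```

With a small array the result always stays inside the first few pages. Once the array is allowed to be gigabytes long, the index alone can reach any address, which is the situation the size cap tries to make unlikely.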
Re: miscellaneous array questions...
On 7/21/20 7:10 AM, IGotD- wrote:
> On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:
>> 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes)
>
> I didn't know this but it makes sense and I guess this is a constraint of the D language itself. In practice 16MB should be well enough for most cases. I'm not sure where 16MB is taken from, if there is any OS out there that has this limitation or if it was just taken as an adequate limit.

I believe it stems from a limitation in the way the stacks are allocated? Or maybe a limitation in DMC, the basis for DMD.

Also, you CAN actually have larger arrays, they just cannot be put on the stack (which most static arrays are):

    struct S
    {
        ubyte[17_000_000] big;
    }

    void main()
    {
        auto p = new S; // ok: allocated on the heap
        S s; // crash (signal 11 on run.dlang.io)
    }

This may not work if `big` had a static initializer, I'm not sure.

-Steve
Re: miscellaneous array questions...
On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:
> 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes)

Static arrays are passed by value. (Also, I think you're right about Mb vs MB, except it should be MiB: 1MB = 1000^2 bytes (decimal) and 1MiB = 1024^2 bytes (binary). Note that MB is defined as 1024^2 in JEDEC 100B.01 but, IMO, the ISO standard is superior because it's unambiguous, and JEDEC only defines units up to GB (inclusive).)
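The unit arithmetic can be checked at compile time. This is just a sketch; the constant names `MB` and `MiB` are my own, not something from the language reference.

```d
enum ulong MB  = 1000UL ^^ 2;   // decimal megabyte
enum ulong MiB = 1024UL ^^ 2;   // binary mebibyte

static assert(16 * MB  == 16_000_000);
static assert(16 * MiB == 16_777_216);
static assert(16 * MiB - 16 * MB == 777_216);   // the two readings differ by about 0.78 MB

void main() {}
```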
Re: miscellaneous array questions...
On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote:
> 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes)

I didn't know this but it makes sense and I guess this is a constraint of the D language itself. In practice 16MB should be well enough for most cases. I'm not sure where 16MB is taken from, whether there is any OS out there that has this limitation or whether it was just taken as an adequate limit.

Let's say you have a program with 4 threads; then suddenly the TLS area is 4 * 16 MB = 64MB. This size rapidly increases with the number of threads and the TLS area size. Let's say a TLS area of 128MB and 8 threads, which gives you a memory consumption of 1GB. That's how quickly it starts to consume memory if you don't limit the TLS variables.

If you want global variables like in good old C/C++, then use __gshared. Of course, you then have to take care of any concurrent accesses from several threads.
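A rough sketch of the TLS versus `__gshared` difference, assuming D2 and `core.thread`. The variable and function names are invented for the example. The second thread is joined before the values are read, so no extra synchronization is needed here; real code sharing a `__gshared` variable across running threads would need it.

```d
import core.thread : Thread;

int tlsCounter;               // thread-local: one instance per thread
__gshared int sharedCounter;  // one instance per process, no synchronization implied

// Hypothetical helper: lets another thread write to both variables and
// reports what the calling thread sees afterwards.
int[2] bumpInOtherThread()
{
    tlsCounter = 0;
    sharedCounter = 0;
    auto t = new Thread({ tlsCounter = 5; sharedCounter = 5; });
    t.start();
    t.join();
    return [tlsCounter, sharedCounter];
}

void main()
{
    // The other thread changed only its own tlsCounter, but the one shared sharedCounter.
    assert(bumpInOtherThread() == [0, 5]);
}
```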
Re: miscellaneous array questions...
On 7/20/20 8:16 PM, a...@a.com wrote:

>> 3) Lastly, In the following code snippet, is arrayA and arrayB both
>> allocated on the stack?

arrayA is allocated on thread-local storage and lives as long as the program is active. I guess a final interaction with it can be in a 'static ~this()' or a 'shared static ~this()' block.

Note that this is different from e.g. C++: In that language, arrayA would be a "global" variable and there would be a single instance of it. In D, there will be as many arrayA variables as there are active threads. (One thread's modification to its own arrayA is not seen by other threads.)

arrayB is allocated on the stack and lives as long as the scope that it is defined inside. That scope is main's body in your code.

> And how does their scopes and/or lifetimes
>> differ?
>>
>> module1 =
>> int[100] arrayA;
>> void main()
>> {
>>     int[100] arrayB;
>>     // ...
>> }
>> module1 =

Ali
Re: miscellaneous array questions...
On Monday, 20 July 2020 at 22:05:35 UTC, WhatMeWorry wrote: 1) The D Language Reference says: "There are four kinds of arrays..." with the first example being "type* Pointers to data" and "int* p; etc. At the risk of sounding overly nitpicky, isn't a pointer to an integer simply a pointer to an integer? How does that pertain to an array? 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes) 3) Lastly, In the following code snippet, is arrayA and arrayB both allocated on the stack? And how does their scopes and/or lifetimes differ? module1 = int[100] arrayA; void main() { int[100] arrayB; // ... } module1 = 1) Pointers can be used as arrays with the [] operator, int* p = arrayA.ptr; assert(*(p + 99) == p[99]); should access the same element. http://ddili.org/ders/d.en/pointers.html ("Using pointers with the array indexing operator []") 2) I've encountered this problem too, it's arbitrary AFAIK but it can be circumvented with dynamic arrays.
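Both points from this reply can be tried in a few lines. This is a sketch assuming D2; the 17_000_000 size is borrowed from elsewhere in the thread as an example of something above the documented limit.

```d
int[100] arrayA;

void main()
{
    // 1) A pointer to the first element can be indexed like an array.
    arrayA[99] = 7;
    int* p = arrayA.ptr;
    assert(*(p + 99) == p[99]);
    assert(p[99] == 7);

    // 2) A dynamic array is allocated on the GC heap, so its size is not
    //    constrained by the stack or by the static array limit.
    auto big = new ubyte[](17_000_000);
    big[$ - 1] = 1;
    assert(big.length == 17_000_000);
}
```

Note that a bare pointer carries no length, so `p[100]` would not be bounds-checked the way `arrayA[100]` is.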
miscellaneous array questions...
1) The D Language Reference says: "There are four kinds of arrays..." with the first example being "type* Pointers to data" and "int* p; etc. At the risk of sounding overly nitpicky, isn't a pointer to an integer simply a pointer to an integer? How does that pertain to an array? 2) "The total size of a static array cannot exceed 16Mb" What limits this? And with modern systems of 16GB and 32GB, isn't 16Mb excessively small? (an aside: shouldn't that be 16MB in the reference instead of 16Mb? that is, Doesn't b = bits and B = bytes) 3) Lastly, In the following code snippet, is arrayA and arrayB both allocated on the stack? And how does their scopes and/or lifetimes differ? module1 = int[100] arrayA; void main() { int[100] arrayB; // ... } module1 =
Re: better than union and array questions
bearophile wrote: Saaa: Is there a better way to support arrays of any type? Currently all the code working with these Structs are templated with loads of static ifs in them. You have to ask a more precise question if you want an answer. Also, is it possible to add a .deepdup property to all arrays? D devs don't read posts here, so you have to ask ask in the main newsgroup. I have asked for that more than a year ago, and I was ignored, as usual. You always seem to be so negative! In one recent DMD release, half the changes were bugfixes requested by you... If everyone involved in compiler development spent 100% of their time on bearophile requests, you still might not get everything you want g. Actually you have at least 20% of my time. Stop complaining, and start prioritizing... Will a[]=b.dup; copy b twice? When you have questions like this it's good to take a look at the produced asm. The dup allocates a new array and then copies data on it. The a[]=b[]; copies b on a. int[] array; array.length = 100; array.length = 0; //no other arrays pointing/slicing to this array This way I can be sure for the following 100 element concatenations the array won't be copied. Or isn't this implicitly part of the D spec? Are you talking about appends or concatenations? Concatenations produce memory allocations. But you probably mean 100 appends. Those 100 appends will not produce allocations or copies. But generally array appends are slow anyway in D, so where you need to do them quickly it's much better to use an ArrayBuilder like the one in my dlibs, of a similar one a bit less efficient in Phobos of D2. Bye, bearophile
Re: better than union and array questions
Don wrote: You always seem to be so negative! In one recent DMD release, half the changes were bugfixes requested by you... If everyone involved in compiler development spent 100% of their time on bearophile requests, you still might not get everything you want g. Actually you have at least 20% of my time. Stop complaining, and start prioritizing... Say bearophile, do you keep a list of suggestions on your website? :)
Re: better than union and array questions
Saaa:
> Is there a better way to support arrays of any type? Currently all the code working with these Structs are templated with loads of static ifs in them.

You have to ask a more precise question if you want an answer.

> Also, is it possible to add a .deepdup property to all arrays?

D devs don't read posts here, so you have to ask in the main newsgroup. I asked for that more than a year ago, and I was ignored, as usual.

> Will a[]=b.dup; copy b twice?

When you have questions like this it's good to take a look at the produced asm. The dup allocates a new array and then copies the data into it. The a[]=b[]; copies the contents of b into a.

> int[] array;
> array.length = 100;
> array.length = 0; //no other arrays pointing/slicing to this array
> This way I can be sure for the following 100 element concatenations the array won't be copied. Or isn't this implicitly part of the D spec?

Are you talking about appends or concatenations? Concatenations produce memory allocations. But you probably mean 100 appends. Those 100 appends will not produce allocations or copies. But generally array appends are slow anyway in D, so where you need to do them quickly it's much better to use an ArrayBuilder like the one in my dlibs, or a similar one, a bit less efficient, in Phobos of D2.

Bye,
bearophile
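For readers on D2 rather than D1, the points above can be sketched as follows. Two assumptions to flag: `reserve` is used here as the D2 way to pre-allocate (the length = 100 / length = 0 trick in the quoted post relies on D1 behaviour), and `std.array.appender` stands in for the Phobos builder mentioned in the post.

```d
import std.array : appender;

void main()
{
    int[] b = [1, 2, 3];
    int[] a = new int[](3);

    a[] = b[];                 // copies b's elements into a's existing storage
    assert(a == b && a.ptr !is b.ptr);

    a[] = b.dup;               // dup allocates a temporary copy first,
    assert(a == [1, 2, 3]);    // then its elements are copied into a

    int[] arr;
    arr.reserve(100);          // D2: request room for 100 elements up front
    assert(arr.capacity >= 100);
    foreach (i; 0 .. 100)
        arr ~= i;
    assert(arr.length == 100);

    auto app = appender!(int[])();   // builder-style appending
    foreach (i; 0 .. 100)
        app.put(i);
    assert(app.data.length == 100);
}
```

So `a[] = b.dup` does touch the data twice (once for the dup, once for the element copy), while `a[] = b[]` copies it once.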
Re: better than union and array questions
bearophile Wrote: Saaa: Is there a better way to support arrays of any type? Currently all the code working with these Structs are templated with loads of static ifs in them. You have to ask a more precise question if you want an answer. Maybe a more general one :) I use a tagged union scheme to encapsulate different types(arrays) as one type. Accessing the arrays thus always need a type-check, plus I need multiple get/setArray functions (or one template with loads of static ifs). Is there maybe a general scheme which results in the same type support without the all the separate type handling hassle? Also, is it possible to add a .deepdup property to all arrays? D devs don't read posts here, so you have to ask ask in the main newsgroup. I have asked for that more than a year ago, and I was ignored, as usual. Two might not be a front, at least they make a line. Will a[]=b.dup; copy b twice? When you have questions like this it's good to take a look at the produced asm. The dup allocates a new array and then copies data on it. The a[]=b[]; copies b on a. By 'copies b on a' you mean only the length and pointer, right. How do you produce asm? Not that I can read it but it would be a nice way to start learning it a bit. int[] array; array.length = 100; array.length = 0; //no other arrays pointing/slicing to this array This way I can be sure for the following 100 element concatenations the array won't be copied. Or isn't this implicitly part of the D spec? Are you talking about appends or concatenations? Concatenations produce memory allocations. But you probably mean 100 appends. Those 100 appends will not produce allocations or copies. But generally array appends are slow anyway in D, so where you need to do them quickly it's much better to use an ArrayBuilder like the one in my dlibs, of a similar one a bit less efficient in Phobos of D2. Dlibs license doesn't like my commercial project :) (also, deprecated) ByeBye and thanks, Saaa
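One possible answer to the "general scheme" question, for what it's worth: the thread is about D1 Phobos, but on a recent D2 compiler `std.sumtype` gives a checked tagged union without hand-written type tags. This is a sketch under that assumption; `AnyArray` and `rowCount` are names made up here, and only two of the element types from the original struct are shown.

```d
import std.sumtype : SumType, match;

// Hypothetical alias: either an int[][] or a float[][].
alias AnyArray = SumType!(int[][], float[][]);

// One handler per stored type; the compiler rejects the call if a case is missing.
size_t rowCount(AnyArray a)
{
    return a.match!(
        (int[][] rows)   => rows.length,
        (float[][] rows) => rows.length
    );
}

void main()
{
    assert(rowCount(AnyArray([[1, 2], [3]])) == 2);
    assert(rowCount(AnyArray([[1.5f]])) == 1);
}
```

The type check still happens on every access, but it is written once inside `match` instead of being spread over static ifs.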
better than union and array questions
(D1 Phobos) I use the struct below like: Struct[][char[]] _struct; Is there a better way to support arrays of any type? Currently all the code working with these Structs are templated with loads of static ifs in them. Also, is it possible to add a .deepdup property to all arrays? One last question: Will a[]=b.dup; copy b twice? struct Struct { enum TYPE{ UNKNOWN, BOOL, BYTE,INT, FLOAT}; // I excluded boolean and byte to shorten this message TYPE type = TYPE.UNKNOWN; union { float floatMin; int intMin; } union { float floatMax; int intMax; } union { int[][] intArray; float[][] floatArray; } }
Re: better than union and array questions
(D1 Phobos) I use the struct below like: Struct[][char[]] _struct; Is there a better way to support arrays of any type? Currently all the code working with these Structs are templated with loads of static ifs in them. Also, is it possible to add a .deepdup property to all arrays? One last question: Will a[]=b.dup; copy b twice? One more :) int[] array; array.length = 100; array.length = 0; //no other arrays pointing/slicing to this array This way I can be sure for the following 100 element concatenations the array won't be copied. Or isn't this implicitly part of the D spec?
array questions
Hello again

is it possible to make a dynamic array less dynamic?

int[][] array;
array[0].length = 10; //has to be set at runtime
writefln(array[1].length); // writes also 10

Because I now have to loop through the whole array to check for correct size.

also, can this be done?

int size;
size = 10; //runtime
void function( int[size][] array){}

thank you
Re: array questions
Sun, 11 Jan 2009 17:17:54 -0500, yes wrote:
> Hello again
> is it possible to make a dynamic array less dynamic?
> int[][] array;
> array[0].length = 10; //has to be set at runtime

Um, if that's your code, everything should crash at this point (or throw in debug mode): array is null, that is, empty, so there is no array[0]. You should set array.length = 10; first. Then you'll get an array of 10 empty arrays of int, and will be able to set their lengths separately:

array[0].length = 10; // OK
assert(array[1].length == 0); // OK

If you want to set up quickly you can write

int[][] array = new int[][](10, 10);

This will give you a sort of square matrix, but internally it will be an array of arrays nevertheless.

> also, can this be done?
> int size;
> size = 10; //runtime
> void function( int[size][] array){}

No, static arrays are static, i.e. compile time.
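The steps in this reply, put together as one runnable sketch (D2 syntax; the variable name `square` is mine):

```d
void main()
{
    int[][] array;
    assert(array.length == 0);          // empty: there is no array[0] yet

    array.length = 10;                  // now 10 empty inner arrays
    array[0].length = 10;               // only the first inner array grows
    assert(array[0].length == 10);
    assert(array[1].length == 0);

    auto square = new int[][](10, 10);  // 10 inner arrays of length 10 each
    foreach (row; square)
        assert(row.length == 10);

    square[3].length = 2;               // still an array of arrays: rows can diverge later
    assert(square[3].length == 2);
}
```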
Re: array questions
Hello again is it possible to make a dynamic array less dynamic? int[][] array; array[0].length = 10; //has to be set at runtime Um, if that's your code, everything should crash at this point (or throw in debug mode): array is null, that is, empty, so there is no array[0]. You should array.length = 10; first. Then you'll get an array of 10 well, at least to 2. my fault. empty arrays of int, and will be able to set their lengths separately: array[0].length = 10; // OK assert(array[1].length == 0); // OK If you want to set up quickly you can write int[][] array = new int[][](10, 10); thank you, that will make all the element to be at least a certain size. This will give you a sort of square matrix, bit internally it will be an array of arrays nevertheless. also, can this be done? int size; size = 10; //runtime void function( int[size][] array){} No, static arrays are static, i.e. compile time. I meant it to be an dynamic array argument, but that the function wouldn't need to check the size of all the elements itself.
Re: array questions
yes wrote: [snip] also, can this be done? int size; size = 10; //runtime void function( int[size][] array){} No, static arrays are static, i.e. compile time. I meant it to be an dynamic array argument, but that the function wouldn't need to check the size of all the elements itself. Dynamic arrays do not have a fixed size, thus code needs to check the number of elements when it uses them. Static arrays have a fixed size, but that size MUST be fixed at compile time. You can convert static arrays into dynamic arrays, but not the other way around (at least, not without copying the contents). When you compile with the -release flag, it disables range checking. You can also avoid it by going via pointers, but that's just asking for trouble. -- Daniel
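A small sketch of the two conversion directions mentioned here, assuming D2; the variable names are invented. Slicing a static array gives a dynamic view of the same memory, while going the other way copies the elements.

```d
void main()
{
    int[4] fixed = [1, 2, 3, 4];
    int[] view = fixed[];          // static -> dynamic: a slice, no copy
    view[0] = 99;
    assert(fixed[0] == 99);        // same memory

    int[] dyn = [5, 6, 7, 8];
    int[4] copy = dyn[0 .. 4];     // dynamic -> static: the four elements are copied
    dyn[0] = 0;
    assert(copy[0] == 5);          // the copy is independent
}
```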
Re: array questions
yes:
> is it possible to make a dynamic array less dynamic?
> int[][] array;
> array[0].length = 10; //has to be set at runtime
> writefln(array[1].length); // writes also 10
> Because I now have to loop through the whole array to check for correct size.

If you are using normal D dynamic arrays you have to loop through the whole array to check for correct size. Otherwise you have to create a new and different data structure, an array that is guaranteed to be rectangular. Such data structures have probably already been written by someone else (the downside is that DMD may handle them less efficiently. The upside is that the allocated memory is probably contiguous, which leads to better cache locality, less wasted memory, and the ability to quickly reshape the matrix on the fly).

Built-in dynamic arrays are just one of the many possible kinds of arrays a programmer may need. I think the current D design is good enough: a very common and flexible case is built-in, and you can create the other data structures yourself (or you can import them from a lib like Tango).

> also, can this be done?
> int size;
> size = 10; //runtime
> void function( int[size][] array){}

The size of a dynamic array is known only at runtime, so the D type system cannot know it at compile time and cannot perform that check at compile time. What you ask could be done in special situations (when the compiler can infer the size at compile time), but that needs a more powerful type system.

Bye,
bearophile
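A minimal sketch of the "guaranteed rectangular" data structure suggested above, in D2. The type name `Matrix` and its layout are my own choices, not an existing library type: one contiguous block plus the two dimensions, so a function taking a `Matrix` never has to check each row's length.

```d
struct Matrix
{
    private int[] data;     // one contiguous block of rows * cols elements
    size_t rows, cols;

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new int[](rows * cols);
    }

    // Row-major indexing: m[r, c]
    ref int opIndex(size_t r, size_t c)
    {
        assert(r < rows && c < cols, "index out of range");
        return data[r * cols + c];
    }
}

void main()
{
    auto m = Matrix(3, 4);
    m[1, 2] = 5;
    assert(m[1, 2] == 5);
    assert(m.rows == 3 && m.cols == 4);
}
```

The dimensions are still run-time values, as the post says they must be; the structure just moves the size check from every caller into one constructor.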