On Thursday, July 14, 2016 07:07:52 Miguel L via Digitalmars-d-learn wrote: > Ok, i have read about Appender and assumeSafeAppend(), but i am > still a bit confused. > What i have understood is: dynamic arrays are (almost) always > reallocating when appending to them except assumeSafeAppend() is > used, or when wrapping them with Appender. So, if i'm sure there > are no slices referencing my array i can use assumeSafeAppend(). > But is this permanent? I mean if I declare: > array A[] x; > x.assumeSafeAppend(); > > Does that last forever so I can append without reallocating after > emptying array x, or should I call assumeSafeAppend() every time > I adjust x.length or append to x? > > Maybe I should give up trying to use dynamic arrays and use fixed > length arrays instead.
If you haven't read it yet, I'd suggest reading http://dlang.org/d-array-article.html A dynamic array is basically struct DynamicArray(T) { size_t length; T* ptr; } So, it does not own or manage its own memory. It can refer to any memory, but it's the GC that manages determining whether appending would need to reallocate or not. And if it does reallocate, then the memory is going to be GC-allocated regardless of whether it was GC allocated originally or malloc-ed memory or a slice of a static array or whatever. When the GC allocates a block of memory for a dynamic array, it keeps track of the farthest point in that block of memory that any dynamic array refers to. If the last element in a dynamic array is the last element in that memory block, and there is additional capacity beyond that within the memory block, then appending will not result in a reallocation. Rather, it will expand that dynamic array into the free space in the memory block. But if that memory block is fully used or if the last element in the dynamic array is not the last used element in the memory block (or if the dynamic array refers to memory that was not allocated by the GC), then there is no room for that array to be expanded, and a reallocation will occur, at which point that dynamic array _will_ point to the last used element in its new memory block. So, if you doing something like int[] arr; arr ~= 10; arr ~= 9; arr ~= 42; arr ~= 17; arr ~= 42; arr ~= 99; arr ~= 0; arr ~= 100; it could be that only one of those append operations actually result in an allocation (the first one), or it could be that multiple do. How many of them require a reallocation depends on what size the underlying buffer is, and that's implementation dependent. If you want to know how many elements can be appended without reallocating, then use capacity. For instance, on my machine, this code import std.stdio; void main() { int[] arr; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 10; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 9; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 42; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 17; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 42; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 99; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 0; writefln("len: %s, cap: %s", arr.length, arr.capacity); arr ~= 100; writefln("len: %s, cap: %s", arr.length, arr.capacity); } prints len: 0, cap: 0 len: 1, cap: 3 len: 2, cap: 3 len: 3, cap: 3 len: 4, cap: 7 len: 5, cap: 7 len: 6, cap: 7 len: 7, cap: 7 len: 8, cap: 15 So, you can see that a reallocation occurred on the 1st, 4th, and 8th append operations, and that won't reallocate again until the 16th append operation (since it can grow to a length of 15 before a reallocation would be required). The capacity grows in a manner similar to that of std::vector in C++ or ArrayList in Java. And it's reasonably efficient in terms of memory allocations (amortized O(1)). The reason that Appender often gets used is that checking the capacity of the dynamic array is not cheap (since that's kept track of by the GC and not the dynamic array itself), and that has to be checked every time that you append. Appender is able to play some games to make that more efficient under the assumption that what you're doing is simply appending to build an array after which you would take it out of the Appender and not use Appender anymore. But Appender does not change the allocation scheme. It just makes the checks more efficient. So, use Appender when you're first building an array, but don't keep using it after that. What gets more entertaining in terms of capacity and reallocations is when you start slicing the array or changing its length. For instance, if we add arr.length = arr.length - 1; writefln("len: %s, cap: %s", arr.length, arr.capacity); to the end of the previous example, then you get len: 0, cap: 0 len: 1, cap: 3 len: 2, cap: 3 len: 3, cap: 3 len: 4, cap: 7 len: 5, cap: 7 len: 6, cap: 7 len: 7, cap: 7 len: 8, cap: 15 len: 7, cap: 0 Notice that the capacity is now 0. That's because the dynamic array no longer refers to the last, used element in the underlying memory buffer. Similarly, if you instead did import std.stdio; void main() { int[] arr; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 10; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 9; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 42; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 17; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 42; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 99; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 0; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 100; writefln("arr len: %s, cap: %s\n", arr.length, arr.capacity); auto arr2 = arr; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); writefln("arr2 len: %s, cap: %s\n", arr2.length, arr2.capacity); auto arr3 = arr[0 .. $ - 1]; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); writefln("arr2 len: %s, cap: %s", arr2.length, arr2.capacity); writefln("arr3 len: %s, cap: %s\n", arr3.length, arr3.capacity); arr2 ~= 77; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); writefln("arr2 len: %s, cap: %s", arr2.length, arr2.capacity); writefln("arr3 len: %s, cap: %s", arr3.length, arr3.capacity); } it prints out arr len: 0, cap: 0 arr len: 1, cap: 3 arr len: 2, cap: 3 arr len: 3, cap: 3 arr len: 4, cap: 7 arr len: 5, cap: 7 arr len: 6, cap: 7 arr len: 7, cap: 7 arr len: 8, cap: 15 arr len: 8, cap: 15 arr2 len: 8, cap: 15 arr len: 8, cap: 15 arr2 len: 8, cap: 15 arr3 len: 7, cap: 0 arr len: 8, cap: 0 arr2 len: 9, cap: 15 arr3 len: 7, cap: 0 Notice that only whichever dynamic arrays refer to the last element within that memory block has a non-zero capacity. So, appending one of those won't result in a reallocation so long as the memory block isn't full, but appending to any of the others _will_ result in a reallocation, and if you append to a dynamic without reallocating, any other dynamic arrays which are slices of the same memory block and which had a non-zero capacity will no longer have a non-zero capacity, because they no longer refer to the last, used element in the underlying memory block. So, if what you're doing is passing around dynamic arrays which are slices of one another left and right and appending to them will-nilly, then you're going to get a lot of reallocations, but if you're just appending to dynamic arrays which refer to the last element in their memory block, and you don't append to any of the other dynamic arrays which are slices of that same block, then you'll get only occasional reallocations. But it sounds like what you're trying to do is something like auto arr = [99, 45, 33, 22, 19, 46]; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr.length -= 2; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 99; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); which on my machine results in arr len: 6, cap: 7 arr len: 4, cap: 0 arr len: 5, cap: 7 It reallocated 99 was appended, because arr didn't refer to the last used element in the memory block. To fix that, you use assumeSafeAppend. e.g. auto arr = [99, 45, 33, 22, 19, 46]; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr.length -= 2; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr.assumeSafeAppend(); writefln("arr len: %s, cap: %s", arr.length, arr.capacity); arr ~= 99; writefln("arr len: %s, cap: %s", arr.length, arr.capacity); which prints arr len: 6, cap: 7 arr len: 4, cap: 0 arr len: 4, cap: 7 arr len: 5, cap: 7 on my machine. It didn't reallocate. Instead, assumeSafeAppend told the GC that the last element that arr refered to within its memory block was the last used element in the memory block. So, suddenly, all of the space after arr was available, and its capacity was non-zero. So, if you're going to be doing a bunch of adjustments to length to remove elements, then you can use assumeSafeAppend to then make it so that the GC understands that the elements after that array are non longer used and that appending can grow the array into that space, which will significantly reduce the number of reallocations in the case where you keep removing elements from the end of the array. _However_, the very large caveat is that if you do this, you cannot have any other dynamic arrays which refer to any elements past the end of arr, because if such dynamic arrays do exist, then suddenly, their values are gonig to be stomped on by the append operations to arr (and it might even be that those elements had their destructors called when assumesafeAppend was called - I don't know; if so it's that much worse if you tell it that it's safe to append when it actually isn't). So, whether you should be using Appender or assumeSafeAppend or neither depends entirely on what you're doing. However, in general, simply appending to dynamic arrays does not result in many reallocations (just like it doesn't result in a lot of realloctions for std::vector or ArrayList). When reallocations become a problem is when you start slicing a dynamic array so that you have other dynamic arrays which refer to the same memory, and you append to those dynamic arrays, or when you reduce the length of an array and then append to it, because in both of those cases, you're appending to dynamic arrays which do not refer to the last element in their underlying memory block. Hopefully, that makes things at least somewhat clearer. - Jonathan M Davis