On Fri, Apr 01, 2022 at 01:15:01PM -0700, Ali Çehreli via Digitalmars-d-learn wrote: > On 4/1/22 10:46, Steven Schveighoffer wrote: > > >> But if it considers slices and their lengths specially, then it > >> would see that the mid section is not referenced any more. > > > > It's not a consideration for the GC, but possibly the compiler. The > > GC has no visibility into slices that hold references to its memory > > block. It only has knowledge of pointers to its memory block (and > > then only indirectly via scanning). > > Yes but if it recognized slices in addition to pointers, it could > consider their lengths to reuse the middle section, no? I understand > the role of the compiler but as GC is part of the D runtime, it feels > like it could own the concept of slices. [...]
The GC trades in blocks of memory. Being conservative, it makes no assumptions about the semantics of pointers that it encounters when it scans for references, other than the fact that it's referencing an allocated block of memory. It doesn't know what a slice is -- which, in D, is stored as a struct containing a pointer and a length: the GC makes no assumptions about the meaning of the length value that just happens to occur adjacently to the pointer. It could be part of a slice, or it could be a completely unrelated variable that has no relation to the pointer; such values are disregarded. In fact, unless precise scanning is enabled, the GC doesn't even know which values are pointers and which are integers; it just conservatively assumes that every integer-like value is a potential pointer. Only if they don't look like a valid address, or point to something outside the GC heap, or to an unallocated block, then the GC will ignore them. This is why in 32-bit programs these "false pointers" can sometimes cause the GC not to collect blocks that are actually already unreferenced (you happen to have an integer variable whose value coincidentally looks like a pointer into the block). So as long as one reference to the allocated block still exists, the GC will not collect it. It cannot tell whether the program might have stored an index into an array inside this block in an integer variable that will later be added to the remaining reference; the GC would have no way of telling which integer variable this might be, so it must conservatively assume that as long as a reference to the block still exists, the entire block is still alive (other parts of the block may still be accessed by adding an index value, etc.). If you really only intend on keeping a small chunk of an existing large allocated block, one solution is to .dup or .idup the slice you want to keep, and null all other references to the old block (or let them go out of scope). // Of course, one could argue that a more precise scanning could help the GC minimize blocks whose sole remaining reference is a slice that covers only a small subset of the entire block. This would involve significant complications to the scanning algorithm, though, which may have performance implications. E.g., what constitutes a slice? Only built-in slices? So pointer + separately-kept index is not allowed? Then we have a potential buffer overrun bug in code that's actually not buggy, if some code holds a pointer to the block with some separate index value, then the GC comes along and thinks it can free the part of the block that the index indirectly refers to. I'm sure someone can come up with a way to solve this problem, but the question is whether the payoff is worth the likely performance hit to GC collection times. T -- Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn