On Thu, Mar 28, 2024 at 11:49:19PM +0000, Liam McGillivray via 
Digitalmars-d-learn wrote:
> On Thursday, 28 March 2024 at 04:46:27 UTC, H. S. Teoh wrote:
> > The whole point of a GC is that you leave everything up to it to
> > clean up.  If you want to manage your own memory, don't use the GC.
> > D does not force you to use it; you can import core.stdc.stdlib and
> > use malloc/free to your heart's content.
> > 
> > Unpredictable order of collection is an inherent property of GCs.
> > It's not going away.  If you don't like it, use malloc/free instead.
> > (Or write your own memory management scheme.)
> 
> I disagree with this attitude on how the GC should work. Having to
> jump immediately from leaving everything behind for the GC to fully
> manual memory allocation whenever the GC becomes a problem is a
> problem, which gives legitimacy to the common complaint of D being
> "garbage-collected". It would be much better if the garbage collector
> could be there as a backup for when it's needed, while allowing the
> programmer to write code for object destruction when they want to
> optimize.

Take a look at the docs for core.memory.GC.  There *is* a method GC.free
that you can use to manually deallocate GC-allocated memory if you so
wish.  Keep in mind, though, that manually managing memory in this way
invites memory-related errors. That's not something I recommend unless
you're adamant about doing everything the manual way.
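
Just to illustrate what that looks like (a minimal sketch; the buffer
and its size are made up):

```d
import core.memory : GC;

void main()
{
    // GC-allocated, as usual.
    int[] buf = new int[1024];

    // ... use buf ...

    // Manually return the block to the GC heap.  From this point on,
    // any remaining reference to buf (or a slice of it) is dangling --
    // exactly the kind of error manual management invites.
    GC.free(buf.ptr);
    buf = null;
}
```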


> > > Anyway, I suppose I'll have to experiment with either manually
> > > destroying every object at the end of every unittest, or just
> > > leaving more to the GC. Maybe I'll make a separate `die` function
> > > for the units, if you think it's a good idea.
> > 
> > I think you're approaching this from a totally wrong angle. (Which I
> > sympathize with, having come from a C/C++ background myself.)  The whole
> > point of having a GC is that you *don't* worry about when an object is
> > collected.  You just allocate whatever you need, and let the GC worry
> > about cleaning up after you. The more you let the GC do its job, the
> > better it will be.
> 
> Now you're giving me conflicting advice. I was told that my current
> destructor functions aren't acceptable with the garbage collector, and you
> specifically tell me to leave things to the GC. But then I suggest that I
> "leave more to the GC" and move everything from the Unit destructor to a
> specialized `die` function that can be called instead of `destroy` whenever
> they must be removed from the game, which as far as I can see is the only
> way to achieve the desired game functionality while following your and
> Steve's advice and not having dangling references. But in response to that,
> you tell me "I think you're approaching this from the wrong angle". And then
> right after that, you *again* tell me to "just let the GC worry about
> cleaning up after you"? Even if I didn't call `destroy` at all during my
> program, as far as I can see, I would still need the `die` function
> mentioned to remove a unit on death.

I think you're conflating two separate concepts, and it would help to
distinguish between them.  There's the lifetime of a heap-allocated
object, which is how long the chunk of memory backing the object
remains allocated.  It begins when you allocate the object with `new`,
and ends when the GC finds that it's no longer referenced and collects
it.

There's a different lifetime that you appear to be talking about: the
logical lifetime of an in-game object (not to be confused with an
"object" in the OO sense, though the two may overlap).  The (game)
object comes into existence in the simulated game world at a certain
point in game time, and exists until something in the game simulation
decides that it should no longer exist (it got destroyed, replaced with
another object, whatever).  At that point it should be removed from the
game simulation, and that's probably also what you have in mind when
you mentioned your "die" function.

And here's the important point: the two *do not need to coincide*.
Here's a concrete example of what I mean. Suppose in your game there's
some in-game mechanic that's creating N objects every M turns, and
another mechanic that's destroying some of these objects every L turns.
If you map each of these creations/destructions to a memory
allocation/deallocation, you're looking at a *lot* of allocations and
deallocations throughout the course of your game.  Memory allocations
and deallocations can be costly; this can become a problem if you're
talking about a large number of objects, or if they're being
created/destroyed very rapidly (e.g., they are fragments flying out
from explosions).

Since most of these objects are identical in type, one way of
optimizing the code is to preallocate them: before starting your main
loop, allocate an array of, say, 100 objects -- or 1000, or 10000,
however many you anticipate you'll need. These objects aren't actually
in the game world yet; you're merely reserving the memory for them
beforehand. Mark each of them with a "live"-ness flag that indicates
whether or not it's actually in the game.  Then during your main loop,
whenever you need to create a new object of that type, don't allocate
memory for it; just find a non-live object in this array, set its
fields to the right values, and mark it "live".  Now it's an object in
the game.  When the object is destroyed in-game, don't deallocate it;
instead, just set its "live" flag back to false.  Now you can blast
through hundreds and thousands of these objects without incurring the
cost of allocating and deallocating them every time.  You also save on
GC cost (there's nothing for the GC to collect, so it doesn't need to
run at all).
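
To make this concrete, here's a minimal sketch of such a pool (the
`Particle` type and the pool size are made up for illustration):

```d
struct Particle
{
    float x, y, dx, dy;
    bool live;          // is this slot currently in the game world?
}

// Reserved up front: one static array, zero per-object allocations.
Particle[1000] pool;

// "Create" a particle: reuse a dead slot instead of allocating.
Particle* spawn(float x, float y, float dx, float dy)
{
    foreach (ref p; pool)
    {
        if (!p.live)
        {
            p = Particle(x, y, dx, dy, true);
            return &p;
        }
    }
    return null;        // pool exhausted: grow it or drop the particle
}

// "Destroy" a particle: no deallocation, just clear the flag.
void kill(ref Particle p)
{
    p.live = false;
}
```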

Don't get confused by the "object" terminology; an in-game object is not
necessarily the same thing as a class object in your program. In fact,
sometimes it's advantageous to treat them as separate things.


[...]
> > As far as performance is concerned, a GC actually has higher
> > throughput than manually freeing objects, because in a fragmented
> > heap situation, freeing objects immediately when they go out of use
> > incurs a lot of random access RAM roundtrip costs, whereas a GC that
> > scans memory for references can amortize some of this cost to a
> > single period of time.
> 
> By "manually freeing objects", do you mean through `destroy`? If so
> that's actually quite disappointing, as D is often described as a
> "systems programming language", and I thought it would be fun to do
> these optimizations of object destruction, even if I have the garbage
> collector as a backup for anything missed. Or did you mean with
> `malloc` and `free`?

By "manually freeing objects" I mean what you typically do when you're
using malloc/free.

Note that in D, you *can* actually "manually manage" some objects this
way by calling GC.free.  I don't recommend this, though. It's not how
the GC is intended to be used, and it can lead to memory safety
problems.  My advice remains the same: just let the GC do its job.
Don't "optimize" prematurely.  Use a profiler to measure your program
and identify its real bottlenecks first; otherwise you risk embarking
on complicated optimizations that turn out to be completely
unnecessary.


[...]
> Well, I suppose that's fine for when the GC problem is specifically
> over slowness. I'm quite new to D, so I don't really know what it
> means to "preallocate before your main loop". Is this a combination of
> using `static this` constructors and `malloc`? I haven't used `malloc`
> yet. I have tried making static constructors, but every time I've
> tried them, they caused an error to happen immediately after the
> program is launched.

I gave an example of preallocation above.  It's not specific to GC or
malloc/free; it's a general principle of reducing the number of
allocations inside your main loop. If you can reserve beforehand the
memory you know you'll need later, it will generally perform better than
if you keep allocating on the fly.  Allocation and deallocation (or GC
collection) always come with a cost, so you want to avoid them inside a
hot loop, like your main game loop (assuming it runs every frame --
otherwise it's a non-issue and you shouldn't worry about it).
Allocating N objects beforehand and then gradually using them as needed
is always better than starting with 0 objects and allocating each one
individually inside your hot loop.
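
For GC-allocated dynamic arrays specifically, the built-in `reserve`
lets you do this without leaving the GC at all.  A sketch (the `Unit`
type and the numbers are illustrative):

```d
struct Unit { int hp; }

void main()
{
    Unit[] units;
    units.reserve(10_000);   // one up-front allocation

    foreach (i; 0 .. 10_000)
        units ~= Unit(100);  // appends fill the reserved block;
                             // no reallocations inside the loop
}
```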


[...]
> I suppose I can turn the `Tile` object into a struct, which I suppose
> will mean replacing all its references (outside the map's `Tile[][]`
> grid`) with pointers. I have thought about this before, since tiles
> are fundamentally associated with one particular map, but I chose
> objects mostly so I can easily pass around references to them.

The correct design will depend on how you're using them, so I can't give
you a specific recommendation here.

If you're conscious of performance, however, I'd say avoid references
where you can.  Since maps presumably exist for as long as the game is
going on, why bother with references at all?  Just use a struct that
stores the coordinates of the tile, and look the tile up in the map.
Or, if you need to distinguish between tiles belonging to multiple
simultaneous maps, store a reference to the parent map along with the
coordinates, and you'll be able to find the right Tile easily.  This
way your maps can just store an array of Tile structs (single
allocation) instead of an array of Tile objects (M*N allocations for an
M×N map).
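
As a sketch of what I mean (your actual Map/Tile layout will of course
differ):

```d
struct Tile
{
    int terrain;    // ... per-tile data ...
}

class Map
{
    int width, height;
    Tile[] tiles;   // M*N tiles in a single allocation

    this(int w, int h)
    {
        width = w;
        height = h;
        tiles = new Tile[w * h];
    }

    // Row-major lookup by coordinates.
    ref Tile opIndex(int x, int y) { return tiles[y * width + x]; }
}

// Instead of a Tile reference, pass around the map plus coordinates:
struct TileRef
{
    Map map;
    int x, y;

    ref Tile get() { return map[x, y]; }
}
```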


> I've already been using structs for the stuff with a short lifespan.

Good.


[...]
> I want to ask about `@nogc`. Does it simply place restrictions on what
> I can do? Or does it change the meaning of certain lines? For example,
> does it mean that I can still create objects, but they will just keep
> piling up without being cleaned up?

@nogc does not change runtime behaviour.  It's a static constraint that
prevents your code from doing anything that might trigger a GC
allocation: anything that would allocate, such as using `new` or
appending to an array with `~=`, becomes a compile error.  I don't
recommend using it (a pretty substantial chunk of the language and
stdlib will become unavailable to you), but if you absolutely, totally,
100% want to abstain from using the GC, @nogc is your ticket to
ensuring that you never accidentally do.  The compiler will enforce it
at compile time.
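
For example (the commented-out lines are the ones the compiler would
reject):

```d
@nogc void update(int[] data)
{
    foreach (ref d; data)
        d += 1;               // fine: no allocation

    // Either of these would be a compile error under @nogc:
    //auto o = new Object();  // `new` allocates on the GC heap
    //data ~= 42;             // appending may trigger a GC allocation
}
```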


T

-- 
Music critic: "That's an imitation fugue!"
