@araq:

> and after the thread dies, everything is freed.

That's really the problem. I am using `threadpool`, which creates 48 threads when the program starts. Limiting the number of threads seems to cause some to die after use, but I don't understand what that code is doing. Are you suggesting that I create a new thread for each task as needed, and destroy each thread after its task finishes? My question then -- which I keep trying to get an answer to -- is how do I diagnose and avoid memory fragmentation? When "everything is freed", is everything coalesced? If not, then my problem is not solved.

@Jehan:

> My problem here is that I have only a pretty vague idea of what you're doing,
> so I'm flying blind and am trying to guess what may help you.

After I upload my gigabyte of test data to the cloud, I'll try to make my code available. It's not huge, and it's a rewrite of a mostly-C version (plus Python multiprocessing).

1. I scan the input to generate sets of DNA strings.
2. I reference those sets in a `CStringArray` so that they do not need to be copied for each thread.
3. For each set, I call a thread-proc.
4. The result of each thread-proc is a DNA sequence, which is immediately appended to the output file.

In (3), I allocate some 2- or 3-level nested arrays/seqs of objects. I have no idea whether each object in an array gets its own memory fragment. I also allocate 3 or 4 relatively large arrays repeatedly (in gradually decreasing sizes, which was the pathological case). The allocation and freeing of many small objects seems not to interact well with the large ones. Memory grows steadily, and with 48 threads I can exhaust 96GB of RAM pretty quickly, which then leads to swapping.

I've already solved the worst offenders by using and resizing the same big string for the inner loop of each thread. I still create a new string for each task, but just once per task. The thread persists in the `threadpool`, but the memory currently does not, since I'm not confident in threadvar persistence.
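For concreteness, the buffer-reuse pattern I mean is roughly this (a sketch, not my actual code; `processSet` and the capacity are placeholders):

```nim
import threadpool

proc processSet(strs: cstringArray; n: int): string =
  # One working buffer per task, created once and resized in the
  # inner loop instead of allocating a fresh string per iteration.
  var buf = newStringOfCap(1 shl 20)   # capacity is a guess
  for i in 0 ..< n:
    buf.setLen(0)                      # reuse the same allocation
    # ... fill buf from strs[i], then fold it into result ...
    result.add buf

# Caller side: spawn one task per set and append each result to the
# output file as it completes, e.g.
#   let fv = spawn processSet(arr, n)
#   outFile.write(^fv)
```

The open question is whether the allocator actually coalesces the freed per-task strings, or whether they fragment the heap anyway.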
For (1), I was using `Regex.findall()`, which the memprofiler seemed to claim was a significant source of memory use. I've rewritten it as a simple string iterator that reuses the underlying string. The memory now "leaks" more slowly, but it still leaks. I got sick this weekend, though, so I cannot show you an example right away.
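The replacement iterator is along these lines (a sketch under my assumptions about the matching, not the exact code): instead of `findAll` building a seq of freshly allocated match strings, it walks the input once and yields offsets, so the caller can slice into a reused buffer:

```nim
iterator dnaRuns(s: string): (int, int) =
  ## Yields (start, length) of maximal runs of ACGT characters.
  ## Allocates nothing; the caller copies into a reused buffer if needed.
  var i = 0
  while i < s.len:
    while i < s.len and s[i] notin {'A', 'C', 'G', 'T'}: inc i
    let start = i
    while i < s.len and s[i] in {'A', 'C', 'G', 'T'}: inc i
    if i > start:
      yield (start, i - start)
```

This keeps the scan itself allocation-free, which is why I expected the remaining leak to come from somewhere else.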
