On Fri, Jan 15, 2021 at 09:04:13PM +0000, welkam via Digitalmars-d-learn wrote:
> On Friday, 15 January 2021 at 07:35:00 UTC, H. S. Teoh wrote:
> > (1) Refactored one function called from an inner loop to reuse a
> > buffer instead of allocating a new one each time, thus eliminating a
> > large amount of garbage from small allocations;
> > <...>
> > The result was about 40-50% reduction in runtime, which is close to
> > about a 2x speedup.
>
> I think this message needs to be signal boosted. Most of the time GC
> is not the problem. The problem is sloppy memory usage. If you
> allocate a lot of temporary objects your performance will suffer even
> if you use malloc and free.
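
For illustration, here is a minimal sketch of the buffer-reuse pattern
described in (1). It is not the code from the program being discussed:
`consume` and `run` are hypothetical names, and using `Appender` is only
one convenient way (an assumption on my part) to keep a buffer alive
across iterations.

```d
import std.array : appender;
import std.format : formattedWrite;

// Hypothetical stand-in for the real per-item work done in the inner loop.
size_t consume(const(char)[] text)
{
    return text.length;
}

// Formats n items through one reused buffer; returns the total length seen.
size_t run(int n)
{
    size_t total;

    // Created once, outside the loop; its memory is reused on every pass.
    auto buf = appender!(char[])();

    foreach (i; 0 .. n)
    {
        buf.clear();                       // drop the contents, keep the memory
        buf.formattedWrite("item %d", i);  // append into the same buffer
        total += consume(buf.data);        // view is valid only until the next clear()
    }
    return total;
}

void main()
{
    assert(run(1000) == 7890);
}
```

The naive version would build a fresh string on every iteration and
leave a thousand small allocations behind. The catch with the reused
buffer is that `buf.data` must not be stored past the next `clear()`.
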

As the joke goes, "you can write assembly code in any language". :-D  If
you code in a sloppy way, it doesn't matter what language you write in,
your program will still suck.  No amount of compiler magic will be able
to help you.  The solution is not to blame this or that, it's to learn
how to use what the language offers you effectively.

> If you write code that tries to use stack allocation as much as
> possible, doesn't copy data around, reuses buffers then it will be
> faster than manual memory management that doesn't do that. And thats
> with a "slow" GC.

And with D, it's actually easy to do this, because D gives you tools
like slices and by-value structs.

Having slices backed by the GC is actually a very powerful combination
that people seem to overlook: it means you can freely refer to data by
slicing the buffer.  Strings being slices, as opposed to
null-terminated, is a big part of this.

In C, you cannot assume anything about how the memory of a buffer is
managed (unless you allocated it yourself); as a result, in typical C
code strcpy and strdup calls are everywhere.  Want a substring?  You
can't null-terminate the parent string without affecting code that still
depends on it; solution?  strdup.  Want to store a string in some
persistent data structure?  You can't be sure the pointer will still be
valid (or that the contents pointed to won't change); solution?  strdup,
or strcpy.  Want to parse a string into words?  Either you modify it
in-place (e.g. strtok), invalidating any other references to it, or you
have to allocate a new copy of every segment.  GC or no GC, this will
not lead to a good place, performance-wise.

I could not have written fastcsv if I had to work under the constraints
of C's null-terminated strings under manual memory management.  Well, I
*could*, but it would have taken 10x the amount of effort, and the API
would be 5x uglier due to the memory management paraphernalia required
to do this correctly in C.
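
To make the slicing point concrete, here is a small sketch of what
"want a substring" and "parse a string into words" look like with
slices. This is not fastcsv code; `words` and `isSliceOf` are
hypothetical helpers written only for this example.

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.stdio : writeln;

// Hypothetical helper: lazily split a line into words.
// Nothing is copied; each word is a slice of `line`.
auto words(string line)
{
    return line.splitter(' ');
}

// True if `part` lies entirely within the memory of `whole`.
bool isSliceOf(string part, string whole)
{
    return part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length;
}

void main()
{
    string line = "alpha beta gamma";

    // A substring is just pointer + length: no strdup, no terminator to write.
    string first = line[0 .. 5];
    assert(first == "alpha" && first.ptr is line.ptr);

    // Words are produced on demand, one slice at a time.
    assert(words(line).equal(["alpha", "beta", "gamma"]));

    foreach (word; words(line))
    {
        assert(word.isSliceOf(line));  // still pointing into the original
        writeln(word);
    }
}
```

Each word can be stored as-is in a longer-lived data structure: when the
underlying buffer is GC-allocated, it stays alive for as long as any
slice still refers to it, and since `string` is immutable nobody can
change the contents underneath you.
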
And supporting lazy range-based iteration would require a whole new set
of APIs in C just for that purpose.

In D, I can simply take slices of the input -- eliminating a whole bunch
of copying.  And backed by the GC -- so the code doesn't have to be
cluttered with memory management paraphernalia, but can have a simple,
easy-to-use API compatible across a large range of use cases.  Lazy
iteration comes "for free", no need to introduce an entire new API.
It's a win-win.

All that's really needed is for people to be willing to drop their
C/C++/Java coding habits, and write D the way it's meant to be written:
with preference for stack-allocated structs and by-value semantics,
using class objects only for more persistent data.  Use slices for
maximum buffer reuse, avoid needless copying.  Use compile-time
introspection to generate code statically where possible instead of
needlessly recomputing stuff at runtime.

Don't fear the GC; embrace it and use it to your advantage.  If it
becomes a bottleneck, refactor that part of the code.  No need to
rewrite the entire project the painful way; most of the time GC
performance issues are localised and have relatively simple fixes.


T

-- 
Once the bikeshed is up for painting, the rainbow won't suffice. -- Andrei Alexandrescu