Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Sunday, 10 December 2023 at 18:16:05 UTC, Nick Treleaven wrote: You can call `alloca` as a default argument to a function. The memory will be allocated on the caller's stack before calling the function: https://github.com/ntrel/stuff/blob/master/util.d#L113C1-L131C2 I've just tested and it seems it works as a constructor default argument too. Clever!
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Sunday, 10 December 2023 at 16:08:45 UTC, Bastiaan Veelo wrote: On Sunday, 10 December 2023 at 15:31:55 UTC, Richard (Rikki) Andrew Cattermole wrote: It will be interesting to hear how dcompute will fare in your situation, due to it being D code it should be an incremental improvement once you're ready to move to D fully.

Yes, dcompute could mean another leap forward. There are so many great things to look forward to. -- Bastiaan.

Always happy to help if you're interested in looking into using dcompute. I can't remember if we've talked about it before, but to use it on that 20 logical core box you'd need OpenCL 2.x (explicitly the 2.x version series, or a 3.x implementation that supports SPIR-V). If the box has GPUs attached, you'd need CUDA (any version should do) for NVIDIA GPUs, or OpenCL 2.x (as above) for any other GPUs.

With regard to the stack corruption, there is https://github.com/ldc-developers/ldc/blob/master/gen/abi/x86.cpp#L260 which has been there for some time. It would be fairly simple to issue a diagnostic there when both a `byval` and an alignment are specified (although getting the source location from there might be a bit tricky). Or you could grep the output of `--output-ll`, as noted by Johan https://github.com/ldc-developers/ldc/issues/4265#issuecomment-1376424944 although this will be with that `workaroundIssue1356` applied.
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Wednesday, 6 December 2023 at 16:28:08 UTC, Mike Parker wrote: One way to do that in D is to use `alloca`, but that's an issue because the memory it allocates has to be used in the same function that calls the `alloca`. So you can't, e.g., use `alloca` to alloc memory in a constructor, and that prevents using it in a custom array implementation. You can call `alloca` as a default argument to a function. The memory will be allocated on the caller's stack before calling the function: https://github.com/ntrel/stuff/blob/master/util.d#L113C1-L131C2 I've just tested and it seems it works as a constructor default argument too.
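For readers who don't want to follow the link, here is a minimal sketch of the idea. It is not the code from the linked `util.d`; `stackInts` is a hypothetical helper, and the element count is a template argument (an assumption made for this sketch) so that the default argument does not depend on another run-time parameter.

```d
import core.stdc.stdlib : alloca;

// Hypothetical helper, for illustration only. The element count is a
// template argument so the default argument needs no run-time parameter.
int[] stackInts(size_t n)(void* mem = alloca(n * int.sizeof))
{
    auto arr = (cast(int*) mem)[0 .. n];
    arr[] = 0; // alloca does not initialise the memory
    return arr;
}

void main()
{
    // The default argument is evaluated at the call site, so the buffer
    // lives in main's stack frame and stays valid until main returns.
    int[] a = stackInts!4();
    a[2] = 7;
    assert(a.length == 4);
    assert(a[0] == 0 && a[2] == 7);
}
```

The returned slice must not outlive the calling function, since the memory is released when the caller's frame is popped.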
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Sunday, 10 December 2023 at 17:11:04 UTC, Siarhei Siamashka wrote: On Sunday, 10 December 2023 at 15:08:05 UTC, Bastiaan Veelo wrote: The compiler can check if `scope` delegates escape a function, but it only does this in `@safe` code --- and our code is far from being `@safe`. So it was a bit of a puzzle to find out which arguments needed to be `scope` and which arguments couldn't be `scope`.

This reminded me of https://forum.dlang.org/thread/myiqlzkghnnyykbyk...@forum.dlang.org LDC has a special GC2Stack IR optimization pass, which is a lifesaver in many cases like this.

Interesting.

Are there some known blocker bugs, which prevent a safe usage of LDC in production?

This one: https://github.com/ldc-developers/ldc/issues/4265 Mike has summarized it:

LDC unfortunately had an issue that caused stack corruption on 32-bit Windows. They'd hit it in one case and were able to work around it, but he couldn't be sure they wouldn't hit it somewhere else. He wasn't willing to risk unreliable computations. He said that LDC could do the right thing, but his understanding from talking to Martin was that implementing it would have a large time cost. Since Win32 is going to eventually go away, he wasn't very keen on paying that cost. They'd spoken at DConf about the possibility of LDC raising compilation errors when stack corruption could occur so that they could then work around those cases, but he hadn't followed up with Martin about it.

-- Bastiaan.
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Sunday, 10 December 2023 at 15:08:05 UTC, Bastiaan Veelo wrote: 1) Missing `scope` storage class specifiers on `delegate` function arguments. This can be chalked up as a beginner error, but also one that is easy to miss. If you didn't know: without `scope` the compiler cannot be sure that the delegate is not stored in some variable that has a longer lifetime than the stack frame of the (nested) function pointed to by the delegate. Therefore, a dynamic closure is created, which means that the stack is copied to new GC-allocated memory. In the majority of our cases, delegate arguments are simple callbacks that are only stored on the stack, but a select number of delegates in the GUI are stored for longer. The compiler can check if `scope` delegates escape a function, but it only does this in `@safe` code --- and our code is far from being `@safe`. So it was a bit of a puzzle to find out which arguments needed to be `scope` and which arguments couldn't be `scope`.

This reminded me of https://forum.dlang.org/thread/myiqlzkghnnyykbyk...@forum.dlang.org LDC has a special GC2Stack IR optimization pass, which is a lifesaver in many cases like this.

So now all cores are finally under full load, which is a magnificent sight! Speed of DMD `release-nobounds` is on par with our Pascal version, if not slightly faster. We are looking forward to being able to safely use LDC, because tests show that it has the potential to at least double the performance.

Are there some known blocker bugs, which prevent a safe usage of LDC in production?
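To make the quoted point about `scope` concrete, here is a minimal sketch. The names `apply` and `caller` are hypothetical and not taken from SARC's code. It uses `@nogc` only as a convenient way to let the compiler confirm that no closure is allocated.

```d
// `scope` promises that `dg` is not stored beyond this call.
int apply(scope int delegate(int) @nogc dg, int x) @nogc
{
    return dg(x);
}

int caller() @nogc
{
    int offset = 10;
    // The lambda captures `offset` from this stack frame. Because the
    // parameter is `scope`, no GC closure is needed, which is why this
    // compiles inside a @nogc function.
    return apply((int v) => v + offset, 1);
}

void main()
{
    assert(caller() == 11);
}
```

If `scope` is removed from the parameter, the compiler should reject `caller`, because the frame holding `offset` would then have to be copied to GC memory.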
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Sunday, 10 December 2023 at 15:31:55 UTC, Richard (Rikki) Andrew Cattermole wrote: It will be interesting to hear how dcompute will fare in your situation, due to it being D code it should be an incremental improvement once you're ready to move to D fully. Yes, dcompute could mean another leap forward. There are so many great things to look forward to. -- Bastiaan.
Re: D Language Foundation October 2023 Quarterly Meeting Summary
That is awesome to hear! If the move towards LDC has the potential to halve your run time, that is quite a significant improvement for your customers. It will be interesting to hear how dcompute will fare in your situation, due to it being D code it should be an incremental improvement once you're ready to move to D fully. Based upon the estimates here already, it seems like acquiring an LDC developer in house might be well worth it.
Re: D Language Foundation October 2023 Quarterly Meeting Summary
On Wednesday, 6 December 2023 at 16:28:08 UTC, Mike Parker wrote: Bastiaan reported that SARC had been testing their D codebase (transpiled from Pascal---[see Bastiaan's DConf 2019 talk](https://youtu.be/HvunD0ZJqiA)). They'd found the multithreaded performance worse than the Pascal version. He said that execution time increased with more threads and that it didn't matter how many threads you throw at it. It's the latter problem he was focused on at the moment.

I have an update on this issue. But first let me clarify how grave this situation is (was!) for us. There are certain tasks that we, and our customers, need to perform that involve a 20 logical core computer crunching numbers for a week. This is painful, but it also means that a doubling of that time is completely unacceptable, let alone a 20-fold increase. It is the difference between being in business and being out of business.

Aside from the allocation issue, there are several other properties that our array implementation needs to replicate from Extended Pascal: being able to have non-0 starting indices, having value semantics, having array limits that can be known at compile time or at run time, and function arguments that must work on arrays of any limits, also for multi-dimensional arrays. So while trying to solve one aspect, care had to be taken not to break any of the other aspects.

It turned out that thread contention had more than one cause, which made this an extra frustrating problem: each time we thought we had found the culprit, fixing it did not have the effect that we expected. These were the three major reasons we were seeing large thread contention, in no particular order:

1) Missing `scope` storage class specifiers on `delegate` function arguments. This can be chalked up as a beginner error, but also one that is easy to miss. If you didn't know: without `scope` the compiler cannot be sure that the delegate is not stored in some variable that has a longer lifetime than the stack frame of the (nested) function pointed to by the delegate. Therefore, a dynamic closure is created, which means that the stack is copied to new GC-allocated memory. In the majority of our cases, delegate arguments are simple callbacks that are only stored on the stack, but a select number of delegates in the GUI are stored for longer. The compiler can check if `scope` delegates escape a function, but it only does this in `@safe` code --- and our code is far from being `@safe`. So it was a bit of a puzzle to find out which arguments needed to be `scope` and which arguments couldn't be `scope`.

2) Allocating heap memory in the array implementation, as discussed in the meeting. We followed Walter's advice and now use `alloca`. Not directly, but using string mixins and static member functions that generate the appropriate code.

3) Stale calls to `GC.addRange` and `GC.removeRange`. These were left over from an experiment where we tried to circumvent the garbage collector. Without knowing these were still in there, we were puzzled because we even saw contention in code that was marked `@nogc`. It makes sense now, because even though `addRange` doesn't allocate, it does need the global GC lock to register the range safely. Because the stack is already scanned by default, these calls were now superfluous and could be removed.

So now all cores are finally under full load, which is a magnificent sight! Speed of DMD `release-nobounds` is on par with our Pascal version, if not slightly faster. We are looking forward to being able to safely use LDC, because tests show that it has the potential to at least double the performance.

A big sigh of relief from us as we have solved the biggest hurdle (hopefully!) on our way to full adoption of D.

-- Bastiaan.
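As an illustration of point 3, here is a minimal sketch of where `GC.addRange` is needed and where it is not. This is not SARC's code; the variable names are made up for the example. The range calls are legal in `@nogc` code because they do not allocate, which is how they can hide in supposedly GC-free paths.

```d
import core.memory : GC;
import core.stdc.stdlib : calloc, free;

void main()
{
    enum n = 4;

    // C-heap memory is not scanned by the GC, so a block that stores GC
    // pointers has to be registered. addRange itself does not allocate.
    auto slots = cast(int**) calloc(n, (int*).sizeof);
    assert(slots !is null);
    GC.addRange(slots, n * (int*).sizeof);
    scope (exit)
    {
        GC.removeRange(slots);
        free(slots);
    }

    slots[0] = new int;
    *slots[0] = 42;
    GC.collect();
    assert(*slots[0] == 42);

    // A stack array is scanned by default, so registering it would only
    // add lock traffic.
    int*[n] onStack;
    onStack[0] = new int;
    *onStack[0] = 7;
    GC.collect();
    assert(*onStack[0] == 7);
}
```

In a multithreaded program, each such pair of calls in a hot path is a point where threads can queue up on the GC lock, even though nothing is allocated.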