Re: Tracing/Profiling D Applications
On 5/29/22 13:47, Christian Köstlin wrote:

> Our discussion with using TLS for the
> collectors proposed to not need any lock on the add method for
> collector, because its thread local and with that thread safe?

It would be great that way but then the client changed the requirements on us:

On 5/26/22 12:54, Christian Köstlin wrote:

> I want to be able to dump tracings even while the program is still running.

:p

If the collected data were TLS, then the dumping thread should be able to ask each thread to provide data collected so far. That either requires synchronization, which I did, which necessitated shared Collector objects; or messaging, which would require each thread checking their Concurrency message box.

I don't know...

A third option came to me: Each thread periodically puts their data to a common location and dumper dumps whatever is aggregated up to that point. This would reduce contention.

Ali
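A minimal sketch of that third option, under assumptions: the names `localBuffer`, `aggregated`, `aggLock`, `flushThreshold`, `add`, `flush` and `drain` are made up for illustration and are not from the draft in this thread. Each thread appends to a thread-local buffer without locking and takes a mutex only when it hands a batch over to the common location. The dumper can call `drain` at any time and gets whatever has been flushed so far.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;
import std.stdio : writeln;

// All names here are hypothetical.
enum flushThreshold = 1024;

__gshared long[] aggregated;  // the common location, guarded by aggLock
__gshared Mutex aggLock;

long[] localBuffer;           // module-level variables are thread-local in D

shared static this() {
  aggLock = new Mutex;
}

// Hands the current thread's batch over to the common location.
void flush() {
  if (localBuffer.length == 0) {
    return;
  }
  synchronized (aggLock) {
    aggregated ~= localBuffer;  // copies the elements
  }
  localBuffer.length = 0;
  localBuffer.assumeSafeAppend();
}

// Lock-free on the hot path; locks only once per flushThreshold items.
void add(long value) {
  localBuffer ~= value;
  if (localBuffer.length >= flushThreshold) {
    flush();
  }
}

// Called by the dumper: takes whatever has been flushed so far.
long[] drain() {
  synchronized (aggLock) {
    auto result = aggregated;
    aggregated = null;
    return result;
  }
}

void main() {
  Thread[] workers;
  foreach (t; 0 .. 4) {
    workers ~= new Thread({
      foreach (i; 0 .. 10_000) {
        add(i);
      }
      flush();  // hand over the remainder before the thread ends
    }).start();
  }
  foreach (w; workers) {
    w.join();
  }
  const total = drain();
  assert(total.length == 40_000);
  writeln("drained ", total.length, " items");
}
```

The trade-off is that a dump taken while the program is running misses up to `flushThreshold - 1` items per thread, namely those still sitting in thread-local buffers.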
Re: Tracing/Profiling D Applications
On 5/27/22 06:55, Christian Köstlin wrote:

> I wonder how I can synchronize the "dumping" and the
> collection of the threads. Would be cool to have an efficient lockless
> implementation of appender ...

That turned out to be nontrivial.

The following is a draft I played with. Collector collects and Dumper dumps. They use a SpinLock, an unpublished feature of core.internal, for locking. The implementation of spinlock (e.g. at /usr/include/dlang/dmd/core/internal/spinlock.d) has a reference to "test and test-and-set (TTAS)":

  https://en.wikipedia.org/wiki/Test_and_test-and-set

I learned about TTAS from Rikki Cattermole yesterday at TeaConf. :)

The code is attached and works on my system.

Ali

import std;
import std.datetime.stopwatch;
import core.thread;
import core.atomic;
import core.internal.spinlock;

enum workerCount = 8;
enum threadRunTime = 4.seconds;
enum mainRunTime = threadRunTime + 1.seconds;

// Locks the given SpinLock on construction and unlocks it on destruction.
shared struct ScopeLock {
  @disable this(this);
  @disable void opAssign(ref const(typeof(this)));

  SpinLock * lock;

  this(shared(SpinLock) * lock) {
    this.lock = lock;
    lock.lock();
  }

  ~this() {
    lock.unlock();
  }
}

struct Collector {
  long[] data;

  shared(SpinLock) lock;

  auto scopeLock() shared {
    return ScopeLock(&lock);
  }

  // Adds a data point to this collector.
  void add(long i) shared {
    auto sl = scopeLock();

    // Some crazy way of adding data points. Real code should
    // make more sense.
    data ~= i;
  }

  // Adds the data of this collector to the specified
  // array. Again, real code should use a more sophisticated
  // method.
  void aggregate(ref long[] where) shared {
    auto sl = scopeLock();

    where ~= data.sum;
    data.length = 0;
    (cast(long[])data).assumeSafeAppend();
  }
}

// A variable to help us trust the code. We will print this at
// the end of main.
long allThatHasBeenDumped = 0;
// Used only for validating the code.
shared long allCollectedByThreads;

synchronized class Dumper {
private:
  shared(Collector)*[] collectors;

  void register(shared(Collector) * collector) shared {
    writeln("registering ", collector);
    collectors ~= collector;
  }

  // Dumps current results.
  void dump(File output) shared {
    long[] data;

    foreach (collector; collectors) {
      collector.aggregate(data);
    }

    const allData = data.sum;

    if (allData != 0) {
      stdout.writefln!"Just collected:%-(\n  %,s%)"(data);
      allThatHasBeenDumped += allData;
    }
  }
}

// The single Dumper, shared by all threads.
shared(Dumper) dumper;

shared static this() {
  writeln("Making a Dumper");
  dumper = new Dumper();
}

// One Collector per thread; the pointer itself is thread-local.
shared(Collector) * collector;

static this() {
  writeln("Making a Collector");
  collector = new shared(Collector)();
  dumper.register(cast(shared)collector);
}

// Main thread function
void doWork() {
  try {
    doWorkImpl();

  } catch (Throwable exc) {
    stderr.writeln("Caught Throwable: ", exc.msg);
  }
}

// The implementation of each thread.
void doWorkImpl() {
  auto sw = StopWatch();
  sw.start();

  long i = 0;
  while (sw.peek < threadRunTime) {
    (cast(shared)collector).add(i);
    ++i;
  }

  --i;
  auto total = i * (i + 1) / 2;
  writefln("Thread collected %,s items equaling %,s with %s",
           i, total, collector);

  atomicOp!"+="(allCollectedByThreads, total);
}

void main() {
  writeln("main started");
  iota(workerCount).each!(_ => spawn(&doWork));

  auto sw = StopWatch();
  sw.start();

  while (sw.peek < mainRunTime) {
    dumper.dump(stdout);
    Thread.sleep(100.msecs);
  }

  // One final collection (and dump):
  dumper.dump(stdout);

  assert(allThatHasBeenDumped == allCollectedByThreads);
}
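For readers who have not met TTAS before: a waiting thread first spins on a plain read of the flag (the "test") and attempts the more expensive atomic swap (the "test-and-set") only once the flag looks free. The sketch below illustrates that idea with core.atomic. `TtasLock`, `guard` and `counter` are made-up names, and this is not druntime's SpinLock, which may differ in details.

```d
import core.atomic : atomicLoad, atomicStore, cas;
import core.thread : Thread;
import std.stdio : writeln;

// Hypothetical illustration of test and test-and-set; not druntime's code.
struct TtasLock {
  private bool locked;

  void lock() shared {
    for (;;) {
      // "Test": spin on a read until the lock looks free.
      while (atomicLoad(locked)) {
      }
      // "Test-and-set": try to take it; another thread may win the race.
      if (cas(&locked, false, true)) {
        return;
      }
    }
  }

  void unlock() shared {
    atomicStore(locked, false);
  }
}

shared TtasLock guard;
__gshared long counter;  // protected by guard

void main() {
  Thread[] workers;
  foreach (t; 0 .. 4) {
    workers ~= new Thread({
      foreach (i; 0 .. 100_000) {
        guard.lock();
        scope (exit) guard.unlock();
        ++counter;
      }
    }).start();
  }
  foreach (w; workers) {
    w.join();
  }
  assert(counter == 400_000);
  writeln("counter: ", counter);
}
```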
Re: Tracing/Profiling D Applications
On 2022-05-29 20:52, Ali Çehreli wrote:

> On 5/27/22 06:55, Christian Köstlin wrote:
>
> > I wonder how I can synchronize the "dumping" and the
> > collection of the threads. Would be cool to have an efficient lockless
> > implementation of appender ...
>
> That turned out to be nontrivial.
>
> The following is a draft I played with. Collector collects and Dumper dumps. They use a SpinLock, an unpublished feature of core.internal, for locking.
>
> [...]

Hi Ali,

thanks a lot for that, I will first have to digest it. Just one first question: Our discussion about using TLS for the collectors proposed not needing any lock on the add method of the collector, because it's thread-local and with that thread-safe?

Kind regards,
Christian
Re: Tracing/Profiling D Applications
On 2022-05-29 20:52, Ali Çehreli wrote:

> [...]
>
> // Main thread function
> void doWork() {
>   try {
>     doWorkImpl();
>
>   } catch (Throwable exc) {
>     stderr.writeln("Caught Throwable: ", exc.msg);
>   }
> }
>
> [...]

According to https://www.schveiguy.com/blog/2022/05/comparing-exceptions-and-errors-in-d/ it's bad to catch Errors ... so doWork should catch only Exception? Or is this a special case to just log the error per thread and be done with it? Still, if not everything is cleaned up correctly, it might be better to crash directly ...

Kind regards,
Christian
Re: Tracing/Profiling D Applications
On 5/29/22 13:53, Christian Köstlin wrote:

> According to
> https://www.schveiguy.com/blog/2022/05/comparing-exceptions-and-errors-in-d/
> its bad to catch Errors ...

Correct in the sense that the program should not continue after catching Error.

> so dowork should catch only Exception?

It should catch Error as well. Otherwise you will have no idea why a thread disappeared. You may agree with me here:

  https://youtu.be/dRORNQIB2wA?t=1950

I caught Throwable in the code just because it's a quick little experiment.

> it might be better
> to crash directly ...

Sounds good but only after leaving the reason behind somehow.

Ali
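One way to reconcile both points, sketched under assumptions: log an Exception and let the thread end normally, but for an Error leave the reason behind and then abort instead of continuing. `runGuarded` and `failing` are hypothetical names, not part of the draft above.

```d
import core.stdc.stdlib : abort;
import std.stdio : stderr;

// Hypothetical wrapper for a thread's entry point.
void runGuarded(void function() work) {
  try {
    work();

  } catch (Exception exc) {
    // Recoverable: report it and let this thread end.
    stderr.writeln("Thread failed with Exception: ", exc.msg);

  } catch (Error err) {
    // Not recoverable: leave the reason behind, then stop the program.
    stderr.writeln("Thread failed with Error: ", err.msg);
    abort();
  }
}

void failing() {
  throw new Exception("something went wrong");
}

void main() {
  runGuarded(&failing);
  stderr.writeln("still running after a recoverable failure");
}
```

With std.concurrency, a thread function could then be started as `spawn(() => runGuarded(&doWorkImpl))` or a similar wrapper; that call shape is a suggestion, not something tested against the draft.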
Re: Compiler switch for integer comparison/promotion to catch a simple error
On Sunday, 29 May 2022 at 01:35:23 UTC, frame wrote:

> Is there a compiler switch to catch this kind of error?
>
> ```d
> ulong v = 1;
> writeln(v > -1);
> ```
>
> IMHO the compiler should issue a warning if it sees a logic comparison between signed and unsigned / different integer sizes. There is a 50% chance that an implicit conversion was not intended.

I don't know of one, but if there isn't, I think we should at least have a flag like the one GCC has (-Wextra).

Searching around, I found this in the following topic:

https://forum.dlang.org/post/hhpacodmcibejatqz...@forum.dlang.org

"Good luck adding a warning into DMD. After years there still isn't a warning for unsigned/signed comparisons."

This is from 2017, so let's wait for someone more experienced to weigh in.

Matheus.
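To make the pitfall concrete: in `v > -1` the `-1` is implicitly converted to `ulong`, where it becomes `ulong.max`, so the comparison is false. Until the compiler warns about it, one workaround is a small helper that checks the sign first. `signedAwareGreater` below is a hypothetical name used only for this sketch.

```d
import std.stdio : writeln;

// Hypothetical helper: compares an unsigned value with a signed one
// without letting a negative right-hand side wrap around.
bool signedAwareGreater(ulong a, long b) pure nothrow @nogc @safe {
  return b < 0 || a > cast(ulong) b;
}

void main() {
  ulong v = 1;

  writeln(v > -1);                     // false: -1 becomes ulong.max
  writeln(signedAwareGreater(v, -1));  // true
  writeln(signedAwareGreater(v, 2));   // false
}
```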