Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 10:23:17 UTC, data pulverizer wrote: D: ~ 1.5 seconds This is going to sound absurd but can we do even better? If none of the optimizations we have so far is using simd maybe we can get even better performance by using it. I think I need to go and read a simd primer.
Re: Thread to watch keyboard during main's infinite loop
On Thursday, 7 May 2020 at 01:33:12 UTC, Daren Scot Wilson wrote:
> import core.thread: sleep;

It should be

    import core.thread : Thread;
    Thread.sleep(1.secs); // or whatever

sleep is a static method on the Thread class.
Re: Thread to watch keyboard during main's infinite loop
On Thursday, 7 May 2020 at 01:02:57 UTC, ag0aep6g wrote:

Thank you, this is 110% helpful. Actually, I'd like to return the excess 10%. My dmd compiler does not like:

    import core.thread: sleep;

so I put the code back the way I had, just to get on with work.

> Use `shared` so that all threads use the same variables:
>
>     shared bool running=true;
>     shared char command = '?';

"shared" did the job. I had read about "thread local" and "shared" in D before, but did not comprehend. Now I do :)

> This sequence of events is entirely possible:
>
> 1) main: cmd = command
> 2) cmdwatcher: command = c
> 3) main: command = ' '
>
> It won't happen often, but if it does, your input has no effect.

For this tool, lost key hits are not a problem. At least, that's what I say for now. I may be back next week for help with that. For now, the trousered ape running the software will just have to tap the key again. (Or the key + Enter.)
Re: Thread to watch keyboard during main's infinite loop
On 07.05.20 02:13, Daren Scot Wilson wrote:
> import std.stdio;
> import core.stdc.stdio; // for getchar(). There's nothing similar in D std libs?
> import std.concurrency;
> import core.thread; // just for sleep()

Instead of the comment you can write:

    import core.thread: sleep;

> bool running=true;
> char command = '?';

These variables are thread-local by default. That means independent `running` and `command` variables are created for every thread. If you make changes in one thread, they won't be visible in another thread.

Use `shared` so that all threads use the same variables:

    shared bool running=true;
    shared char command = '?';

> void cmdwatcher()
> {
>     writeln("Key Watcher");
>     while (running) {
>         char c = cast(char)getchar();
>         if (c>=' ') {
>             command = c;
>             writefln(" key %c %04X", c, c);
>         }
>     }
> }
>
> void main()
> {
>     writeln("Start main");
>     spawn(&cmdwatcher);
>     while (running) {
>         writeln("Repetitive work");
>         Thread.sleep( dur!("msecs")( 900 ) );
>         char cmd = command; // local copy can't change during rest of this loop

For values that don't change, we've got `immutable`:

    immutable char cmd = command;

>         command = ' ';

Note that even when using `shared` you still have to think hard to avoid race conditions.

This sequence of events is entirely possible:

1) main: cmd = command
2) cmdwatcher: command = c
3) main: command = ' '

It won't happen often, but if it does, your input has no effect.
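One way to sidestep the lost-keystroke sequence described above is to drop the shared `command` variable and pass each key to the main thread as a message. What follows is only a minimal sketch under stated assumptions, not the poster's code: `keySource` and `runLoop` are hypothetical names, and `keySource` sends a fixed script of keys where a real watcher would loop on `getchar()`.

```d
import core.time : msecs;
import std.concurrency : ownerTid, receiveTimeout, send, spawn;
import std.stdio : writeln;

// Hypothetical stand-in for the keyboard thread: a real watcher would
// loop on getchar() and send each key instead of this fixed script.
void keySource()
{
    foreach (char c; "AQ")
        send(ownerTid, c);
}

// Main loop: waits up to 200 ms for a key, then acts on it.
// Every key arrives as its own message, so none is overwritten.
string runLoop()
{
    string handled;
    bool running = true;
    spawn(&keySource);
    while (running)
    {
        char cmd = ' ';
        receiveTimeout(200.msecs, (char c) { cmd = c; });
        switch (cmd)
        {
        case 'A':
            handled ~= 'A';
            break;
        case 'Q':
            handled ~= 'Q';
            running = false;
            break;
        default:
            break;
        }
    }
    return handled;
}

void main()
{
    writeln("handled keys: ", runLoop());
}
```

The timeout plays the role of the original `Thread.sleep`: the loop still wakes up periodically to do its repetitive work, but a key that arrives in between stays queued in the mailbox instead of being reset to ' '.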
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 23:10:05 UTC, data pulverizer wrote:
> The -O3 -O5 optimization on the ldc compiler is instrumental in bringing the times down; going with -O2 based optimization even with the other flags gives us ~ 13 seconds for the 10,000 dataset rather than the very nice 1.5 seconds.

What is the difference between the -O2 and -O3 ldc2 compiler optimizations?
Thread to watch keyboard during main's infinite loop
I'm writing a simple command line tool to send data by UDP once per second forever, to test some software on another machine. Not actually forever, of course, but until ^C or I hit 'Q'. I want to tap keys to make other things happen, like change the data or rate of sending.

Not sure of the best way to do this. Thought I'd try a thread whose job is just to loop, calling readln() or getch() or something similar, and setting a global variable according to what key was tapped. The loop in the main thread can then break, or do some other action, according to the value of that variable.

Code I have written is below, stripped to just the stuff relevant to key watching. It doesn't work. When it runs, it prints "Repetitive work" over and over, but I never see anything appear in "cmd [ ]". When I tap 'A' I expect to see "cmd [A]" and a line of several 'A'. This does not happen. But the writefln in cmdwatcher() does show whatever I typed just fine.

How to fix this, or redesign the whole thing to work?

Note that I'm coming from science and fine art. My know-how in computer science is uneven, and I probably have gaps in my knowledge about threads.

    import std.stdio;
    import core.stdc.stdio; // for getchar(). There's nothing similar in D std libs?
    import std.concurrency;
    import core.thread; // just for sleep()

    bool running=true;
    char command = '?';

    void cmdwatcher()
    {
        writeln("Key Watcher");
        while (running) {
            char c = cast(char)getchar();
            if (c>=' ') {
                command = c;
                writefln(" key %c %04X", c, c);
            }
        }
    }

    void main()
    {
        writeln("Start main");
        spawn(&cmdwatcher);
        while (running) {
            writeln("Repetitive work");
            Thread.sleep( dur!("msecs")( 900 ) );
            char cmd = command; // local copy can't change during rest of this loop
            command = ' ';
            writefln("cmd [%c] running %d", cmd, running);
            switch (cmd) {
                case 'A':
                    writeln("A A A A A");
                    break;
                case 'Q':
                    writeln("Quitting");
                    running=false;
                    break;
                default:
                    break;
            }
        }
    }
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 17:31:39 UTC, Jacob Carlborg wrote:
> On 2020-05-06 12:23, data pulverizer wrote:
>> Yes, I'll do a blog or something on GitHub and link it.
>
> It would be nice if you could get it published on the Dlang blog [1]. One usually gets paid for that. Contact Mike Parker.
>
> [1] https://blog.dlang.org

I'm definitely open to publishing it on the dlang blog; getting paid would be nice. I've just done a full reconciliation of the output from D and Chapel with Julia's output, and they're all the same. In the calculation I used 32-bit floats to minimise memory consumption. I was also working with the 10,000 MNIST image data (t10k-images-idx3-ubyte.gz) http://yann.lecun.com/exdb/mnist/ rather than randomly generated data.

The -O3 -O5 optimization on the ldc compiler is instrumental in bringing the times down; going with -O2 based optimization even with the other flags gives us ~ 13 seconds for the 10,000 dataset rather than the very nice 1.5 seconds.

As an idea of how kernel matrix computations scale: the file "train-images-idx3-ubyte.gz" contains 60,000 images, and Julia performs a kernel matrix calculation in 1340 seconds while D performs it in 163 seconds. That is not really in line with the first time; since 6 times the images means 36 times the matrix entries, I'd expect around 1.5*36 = 54 seconds. Chapel performs it in 357 seconds, approximately in line with the original. The new kernel matrix consumes about 14 GB of memory, which is why I chose to use 32-bit floats: to give me an opportunity to do the kernel matrix calculation on my laptop, which currently has 31GB of RAM.
Re: Get months / years between two dates.
On Wednesday, 6 May 2020 at 19:58:59 UTC, Adam D. Ruppe wrote:
> On Wednesday, 6 May 2020 at 19:51:01 UTC, bauss wrote:
>> How will I get the months or years between the two dates?
>
> What's the length of a month or a year? That's the tricky part - they have variable lengths. So a difference of one month is not super precise. You could probably just do days / 365 or days / 30 to get a reasonable approximation but that's why the library doesn't give you an easy pre-made answer.

I actually found something. I can't rely on "approximation" unfortunately. DateTime apparently has a "diffMonths" function that can be used.
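As a rough illustration of the `diffMonths` approach mentioned above: `diffMonths` compares only the year and month fields, so a count of completed months needs a small correction for the day and time. The sketch below is one possible way to do that; the helper name `fullMonthsBetween` is made up for this example, and it assumes `start` is not later than `end`. End-of-month cases (say, 31 January to 28 February) are a policy decision it does not try to settle.

```d
import std.datetime.date : DateTime;
import std.stdio : writeln;

// Whole months elapsed from `start` to `end` (assumes start <= end).
int fullMonthsBetween(DateTime start, DateTime end)
{
    // diffMonths looks only at the year and month fields.
    int months = end.diffMonths(start);
    // Drop the last month if `end` has not yet reached start's
    // day-of-month and time-of-day within it.
    if (months > 0 && (end.day < start.day ||
        (end.day == start.day && end.timeOfDay < start.timeOfDay)))
        --months;
    return months;
}

void main()
{
    auto since = DateTime(2019, 3, 15);
    auto now = DateTime(2020, 5, 6);
    immutable months = fullMonthsBetween(since, now);
    writeln(months / 12, " years, ", months % 12, " months");
}
```

Years then fall out of the month count by integer division, which avoids the days / 365 approximation entirely.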
Re: Get months / years between two dates.
On Wednesday, 6 May 2020 at 19:51:01 UTC, bauss wrote: How will I get the months or years between the two dates? What's the length of a month or a year? That's the tricky part - they have variable lengths. So a difference of one month is not super precise. You could probably just do days / 365 or days / 30 to get a reasonable approximation but that's why the library doesn't give you an easy pre-made answer.
Get months / years between two dates.
How do you exactly do that? Like if I have two dates as std.datetime.DateTime How will I get the months or years between the two dates? I was surprised to learn that Duration does not support them and only has weeks, days etc. but not months or years. I can't seem to find any standard way of doing it. Will I have to calculate that myself or? I feel like that's a big thing missing if it's not there already. My use-case is simple. I need to calculate years/months since a specific date has happened.
Re: Beginner's Comparison Benchmark
On Tuesday, 5 May 2020 at 20:07:54 UTC, RegeleIONESCU wrote: [...] Python should be ruled out, this is not its war :) I have done benchmarks against NumPy if you are interested: https://github.com/tastyminerals/mir_benchmarks
Re: Error running concurrent process and storing results in array
On 5/6/20 2:29 PM, drug wrote:
> 06.05.2020 16:57, Steven Schveighoffer wrote:
>>> ```
>>> foreach(i; 0..n) // instead of for(long i = 0; i < n;)
>>> ```
>>> I guess that `proc` delegate can't capture `i` var of `foreach` loop so the range violation doesn't happen.
>>
>> foreach over a range of integers is lowered to an equivalent for loop, so that was not the problem.
>
> I was surprised, but the `foreach` version does not have the range violation, so there is a difference between `foreach` and `for` loops. I did not try DerivedThread at all, only suggested it to avoid variable capture. I just replaced `for` with `foreach` and the range violation was gone. Probably this is an implementation detail.

Ah yes, because foreach(i; 0 .. n) actually uses a hidden variable to iterate, and assigns it to i each time through the loop. It used to just use i for iteration, but then you could play tricks by adjusting i.

So the equivalent for loop would be:

    for(int _i = 0; _i < n; ++_i)
    {
        auto i = _i; // this won't be executed after _i is out of range
        ... // foreach body
    }

So the problem would not be a range error, but just random i's coming through to the various threads ;)

Very interesting!

-Steve
Re: Error running concurrent process and storing results in array
06.05.2020 13:23, data pulverizer wrote:
> On Wednesday, 6 May 2020 at 08:28:41 UTC, drug wrote:
>> What is current D time? ...
>
> Current Times:
> D: ~ 1.5 seconds
> Chapel: ~ 9 seconds
> Julia: ~ 35 seconds

Oh, I'm impressed. I thought that the D time had been decreased by 1.5 seconds, but it is 1.5 seconds!

>> That would be really nice if you make the resume of your research.
>
> Yes, I'll do a blog or something on GitHub and link it. Thanks for all your help.

You're welcome! Helping others helps me too.
Re: Error running concurrent process and storing results in array
06.05.2020 16:57, Steven Schveighoffer wrote:
>> ```
>> foreach(i; 0..n) // instead of for(long i = 0; i < n;)
>> ```
>> I guess that `proc` delegate can't capture `i` var of `foreach` loop so the range violation doesn't happen.
>
> foreach over a range of integers is lowered to an equivalent for loop, so that was not the problem.

I was surprised, but the `foreach` version does not have the range violation, so there is a difference between `foreach` and `for` loops. I did not try DerivedThread at all, only suggested it to avoid variable capture. I just replaced `for` with `foreach` and the range violation was gone. Probably this is an implementation detail.
Re: How to port C++ std::is_reference to D ?
On Wednesday, 6 May 2020 at 09:07:22 UTC, wjoe wrote: Hello, I'm choking on a piece of C++ I have no idea about how to translate to D. std::is_reference In general, you can't. In D, `ref` is not part of the type, it's a "storage class", and as such it is a property that a function parameter can have alongside its type. I.e. in C++, it makes sense to ask: "Is that parameter's type a reference type?" But in D it doesn't; you could ask: "Is the parameter given by reference?" ("Does the parameter have the storage class `ref` [or `out` to be complete]?") C++'s decision to make references part of the type has some advantages, but D didn't do it because of many disadvantages.
Re: Error running concurrent process and storing results in array
On 2020-05-06 12:23, data pulverizer wrote: Yes, I'll do a blog or something on GitHub and link it. It would be nice if you could get it published on the Dlang blog [1]. One usually get paid for that. Contact Mike Parker. [1] https://blog.dlang.org -- /Jacob Carlborg
Re: Beginner's Comparison Benchmark
On Wed, May 06, 2020 at 09:59:48AM +, welkam via Digitalmars-d-learn wrote:
> On Tuesday, 5 May 2020 at 20:29:13 UTC, Steven Schveighoffer wrote:
> > the optimizer recognizes what you are doing and changes your code
> > to:
> >
> > writeln(1_000_000_001);
> >
> Oh yes a classic constant folding. The other thing to worry about is
> dead code elimination. Walter has a nice story where he sent his
> compiler for benchmarking and the compiler figured out that the
> result of the calculation in benchmark is not used so it deleted the
> whole benchmark.

I remember one time I was doing some benchmarks between different compilers, and LDC consistently beat them all -- which is not surprising, but what was surprising was that running times were suspiciously short. Curious to learn what magic code transformation LDC applied to make it run so incredibly fast, I took a look at the generated assembly.

Turns out, because I was calling the function being benchmarked with constant arguments, LDC decided to execute the entire danged thing at compile-time and substitute the entire function call with a single instruction that loaded its return value(!).

Another classic guffaw was when the function return value was simply discarded: LDC figured out that the function had no side-effects and its return value was not being used, so it deleted the function call, leaving the benchmark with the equivalent of:

    void main() {}

which, needless to say, beat all other benchmarks hands down. :-D

Lessons learned:

(1) Always use external input to your benchmark (e.g., load from a file, so that an overly aggressive optimizer won't decide to execute the entire program at compile-time);

(2) Always make use of the return value somehow, even if it's just to print 0 to stdout, or pipe the whole thing to /dev/null, so that the overly aggressive optimizer won't decide that since your program has no effect on the outside world, it should just consist of a single ret instruction. :-D

T

-- 
This is not a sentence.
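The two lessons above can be sketched in a few lines of D. This is only an illustrative skeleton under stated assumptions: `sumTo` is a made-up workload, and the loop bound comes from the command line (with an arbitrary fallback) purely so that the optimizer cannot know it at compile time.

```d
import std.conv : to;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical workload standing in for the code under test.
ulong sumTo(ulong n)
{
    ulong total;
    foreach (i; 0 .. n)
        total += i;
    return total;
}

void main(string[] args)
{
    // Lesson 1: take the input from outside the program, so the call
    // cannot be evaluated at compile time.
    immutable n = args.length > 1 ? args[1].to!ulong : 1_000_000;

    auto sw = StopWatch(AutoStart.yes);
    immutable result = sumTo(n);
    sw.stop();

    // Lesson 2: use the result, so the call is not dead code.
    writeln("result: ", result, ", elapsed: ", sw.peek);
}
```

Even with both precautions, an optimizer may still rewrite a loop this simple into a closed-form expression, so it is worth checking the generated assembly when a timing looks too good.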
Re: How to port C++ std::is_reference to D ?
On Wednesday, 6 May 2020 at 09:40:47 UTC, wjoe wrote:
> yes, I did read the spec. I read the language spec on traits as well as std.traits docs as well as searching the internet for a solution since day before yesterday. But I couldn't bring it together because
>
>     } else static if (__traits(isRef, T)) {
>
> compiles, but e.g.
>
>     assert (modifier!(ref int) == "[out] ");
>
> doesn't. Anyways, thanks for your reply.

D doesn't have reference *types*, it only has reference *parameters*. Here's an example:

    void fun(ref int r, int v)
    {
        static assert(is(typeof(r) == int)); // note: not `ref int`
        static assert(is(typeof(r) == typeof(v))); // `ref` makes no difference to type
        static assert(__traits(isRef, r)); // note: not `__traits(isRef, typeof(r))`
        static assert(!__traits(isRef, v));
    }
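Since `ref` belongs to the parameter rather than to the type, one way to approximate the C++ `modifier` helper is to ask about a specific parameter of a specific function. The sketch below is an assumption-laden illustration, not a port of the original library: it uses `ParameterStorageClassTuple` and `Parameters` from std.traits, the `modifier(alias fun, size_t idx)` signature is invented for this example, and `sample` is a hypothetical function. Checking `ref`/`out` before `const` is a design choice of the sketch.

```d
import std.traits : ParameterStorageClass, ParameterStorageClassTuple,
    Parameters;

// Yields "[out] " for ref/out parameters, "[in] " for const ones,
// and "" otherwise. Works per parameter, not per type.
template modifier(alias fun, size_t idx)
{
    alias STC = ParameterStorageClass;
    alias Params = Parameters!fun;
    alias P = Params[idx];
    alias stcs = ParameterStorageClassTuple!fun;
    enum stc = stcs[idx];

    static if ((stc & STC.ref_) != 0 || (stc & STC.out_) != 0)
        enum modifier = "[out] ";
    else static if (is(P == const))
        enum modifier = "[in] ";
    else
        enum modifier = "";
}

// Hypothetical function used only to exercise the template.
void sample(const int a, ref int b, out int c, int d)
{
    c = a + b + d;
}

static assert(modifier!(sample, 0) == "[in] ");
static assert(modifier!(sample, 1) == "[out] ");
static assert(modifier!(sample, 2) == "[out] ");
static assert(modifier!(sample, 3) == "");

void main() {}
```

Because everything is resolved with `static if` and `enum`, the strings are available at compile time, much like the `constexpr` originals.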
Re: Error running concurrent process and storing results in array
On 5/6/20 2:49 AM, drug wrote:
> 06.05.2020 09:24, data pulverizer wrote:
>> On Wednesday, 6 May 2020 at 05:44:47 UTC, drug wrote:
>>> proc is already a delegate, so `&proc` is a pointer to the delegate, just pass a `proc` itself
>>
>> Thanks done that but getting a range violation on z which was not there before.
>>
>> ```
>> core.exception.RangeError@onlineapp.d(3): Range violation
>> ??:? _d_arrayboundsp [0x55de2d83a6b5]
>> onlineapp.d:3 void onlineapp.process(double, double, long, shared(double[])) [0x55de2d8234fd]
>> onlineapp.d:16 void onlineapp.main().__lambda1() [0x55de2d823658]
>> ??:? void core.thread.osthread.Thread.run() [0x55de2d83bdf9]
>> ??:? thread_entryPoint [0x55de2d85303d]
>> ??:? [0x7fc1d6088668]
>> ```
>
> confirmed. I think that's because the `proc` delegate captures the `i` variable of the `for` loop. I managed to get rid of the range violation by using `foreach`:
> ```
> foreach(i; 0..n) // instead of for(long i = 0; i < n;)
> ```
> I guess that `proc` delegate can't capture `i` var of `foreach` loop so the range violation doesn't happen.

foreach over a range of integers is lowered to an equivalent for loop, so that was not the problem.

Indeed, D does not capture individual for loop contexts, only the context of the entire function.

> you use `proc` delegate to pass arguments to `process` function. I would recommend for this purpose to derive a class from class Thread. Then you can pass the arguments in ctor of the derived class like:
> ```
> foreach(long i; 0..n)
>     new DerivedThread(cast(double)(i), cast(double)(i + 1), i, z).start();
> thread_joinAll();
> ```

This is why it works, because you are capturing the value manually while in the loop itself.

Another way to do this is to create a new capture context:

    foreach(long i; 0 .. n)
    {
        auto proc = (val => {
            process(cast(double)(val), cast(double)(val + 1), val, z);
        })(i);
        ...
    }

-Steve
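The "new capture context" idea above can also be written with a named helper instead of an immediately-invoked lambda. The following is a small self-contained sketch; `captureValue` and `makeDelegates` are hypothetical names chosen for this example, and plain `int delegate()` values stand in for the thread bodies.

```d
import std.stdio : writeln;

// Each call gets its own frame, so the returned delegate owns a
// private copy of `val` rather than sharing the caller's loop variable.
int delegate() captureValue(int val)
{
    return () => val;
}

int delegate()[] makeDelegates(int n)
{
    int delegate()[] dgs;
    foreach (i; 0 .. n)
        dgs ~= captureValue(i); // snapshot of i at this iteration
    return dgs;
}

void main()
{
    foreach (dg; makeDelegates(3))
        writeln(dg());
}
```

Passing the loop variable as a function argument is what creates the per-iteration context; the delegate then closes over the parameter rather than over the enclosing function's single frame.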
Re: Retrieve the return type of the current function
On Wednesday, 6 May 2020 at 08:04:16 UTC, Jacob Carlborg wrote:
> On 2020-05-05 19:11, learner wrote:
>> On Tuesday, 5 May 2020 at 16:41:06 UTC, Adam D. Ruppe wrote:
>>> typeof(return)
>>
>> Thank you, that was indeed easy! Is it possible to retrieve also the caller return type? Something like:
>
> Yes, kind of:
>
>     void foo(string caller = __FUNCTION__)()
>     {
>         import std.traits : ReturnType;
>         alias R = ReturnType!(mixin(caller));
>         static assert(is(R == int));
>     }
>
>     int bar()
>     {
>         foo();
>         return 0;
>     }

Awesome and beautiful, thank you!
Re: std.uni, std.ascii, std.encoding, std.utf ugh!
On Wednesday, 6 May 2020 at 10:57:59 UTC, learner wrote:
> On Tuesday, 5 May 2020 at 19:24:41 UTC, WebFreak001 wrote:
>> On Tuesday, 5 May 2020 at 18:41:50 UTC, learner wrote:
>>> Good morning,
>>>
>>> Trying to do this:
>>>
>>> ```
>>> bool foo(string s) nothrow { return s.all!isDigit; }
>>> ```
>>>
>>> I realised that the conversion from char to dchar could throw. I need to validate and operate over ascii strings and utf8 strings, possibly in separate functions, what's the best way to transition between:
>>>
>>> ```
>>> immutable(ubyte)[] -> validate utf8 -> string -> nothrow usage -> isDigit etc
>>> immutable(ubyte)[] -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc
>>> string -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc
>>> ```
>>>
>>> Thank you
>
> Thank you WebFreak,
>
>> if you want nothrow operations on the sequence of characters (bytes) of the strings, use `str.representation` to get `immutable(ubyte)[]` and work on that. This is useful for example for doing indexOf (countUntil), startsWith, endsWith, etc. Make sure at least one of your inputs is validated though to avoid potentially handling or cutting off unfinished code points. I think this is the best way to go if you want to do simple things.
>
> What I really want is a way to validate an immutable(ubyte)[] sequence for UTF8 or ASCII, and from that point forward, apply functions like isDigit in nothrow functions.
>
>> If your algorithm is sufficiently complex that you would like to still decode but not crash, you can also manually call .decode with UseReplacementDchar.yes to make it emit \uFFFD for invalid characters.
>
> I will simply reject invalid UTF8 input, that's coming from I/O
>
>> To get the best of both worlds, use `.byUTF!dchar` which gives you an input range to iterate over and defaults to using replacement dchar. You can then call the various algorithm & array functions on it.
>
> Can you explain better?
>
>> Unless you are working with different encodings than UTF-8 (like doing file or network operations) you shouldn't be needing std.encoding.
>
> I'm expecting UTF8 and ASCII encoding from I/O
>
> Thank you!

Using .representation would be like assuming UTF-8, and .byUTF!dchar will still test and replace invalid characters.

If you want to check whether a string is UTF-8 beforehand, use `std.utf : validate` - it will throw a UTFException in case of malformed UTF-8. However, this will not magically make your algorithms nothrow: the compiler still sees decoding code that may throw, even though it won't actually throw on validated input. If you want to give the nothrow attribute to your functions, you will need to work with .representation or .byUTF!dchar
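The validate-once, then-work-on-bytes flow described above might look roughly like this. It is a minimal sketch rather than a recommended API: `toValidatedString` and `allDigits` are hypothetical names, and the digit check is written as a plain loop over `.representation` so that no UTF decoding (and therefore no exception) is involved.

```d
import std.ascii : isDigit;
import std.stdio : writeln;
import std.string : representation;
import std.utf : validate;

// Boundary function for bytes coming from I/O. std.utf.validate
// throws a UTFException if the bytes are not well-formed UTF-8.
string toValidatedString(immutable(ubyte)[] raw)
{
    auto s = cast(string) raw;
    validate(s);
    return s;
}

// Operates on the raw code units, so nothing here can throw.
// Returns true for the empty string, like `all` would.
bool allDigits(string s) nothrow @nogc
{
    foreach (ubyte b; s.representation)
        if (!isDigit(b))
            return false;
    return true;
}

void main()
{
    immutable(ubyte)[] raw = [0x34, 0x32]; // the bytes of "42"
    auto s = toValidatedString(raw);
    writeln(s, " is all digits: ", allDigits(s));
}
```

Byte-wise checks like this are safe for ASCII predicates because every byte of a multi-byte UTF-8 sequence is 0x80 or above and so can never be mistaken for an ASCII digit.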
Re: Beginner's Comparison Benchmark
On Tuesday, 5 May 2020 at 20:29:13 UTC, Steven Schveighoffer wrote: the optimizer recognizes what you are doing and changes your code to: writeln(1_000_000_001); Oh yes a classic constant folding. The other thing to worry about is dead code elimination. Walter has a nice story where he sent his compiler for benchmarking and the compiler figured out that the the result of the calculation in benchmark is not used so it deleted the whole benchmark.
Re: std.uni, std.ascii, std.encoding, std.utf ugh!
On Tuesday, 5 May 2020 at 19:24:41 UTC, WebFreak001 wrote: On Tuesday, 5 May 2020 at 18:41:50 UTC, learner wrote: Good morning, Trying to do this: ``` bool foo(string s) nothrow { return s.all!isDigit; } ``` I realised that the conversion from char to dchar could throw. I need to validate and operate over ascii strings and utf8 strings, possibly in separate functions, what's the best way to transition between: ``` immutable(ubyte)[] -> validate utf8 -> string -> nothrow usage -> isDigit etc immutable(ubyte)[] -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc string -> validate ascii -> AsciiString? -> nothrow usage -> isDigit etc ``` Thank you Thank you WebFreak, if you want nothrow operations on the sequence of characters (bytes) of the strings, use `str.representation` to get `immutable(ubyte)[]` and work on that. This is useful for example for doing indexOf (countUntil), startsWith, endsWith, etc. Make sure at least one of your inputs is validated though to avoid potentially handling or cutting off unfinished code points. I think this is the best way to go if you want to do simple things. What I really want is a way to validate an immutable(ubyte)[] sequence for UFT8 or ASCII, and from that point forward, apply functions like isDigit in nothrow functions. If your algorithm is sufficiently complex that you would like to still decode but not crash, you can also manually call .decode with UseReplacementDchar.yes to make it emit \uFFFD for invalid characters. I will simply reject invalid UTF8 input, that's coming from I/O To get the best of both worlds, use `.byUTF!dchar` which gives you an input range to iterate over and defaults to using replacement dchar. You can then call the various algorithm & array functions on it. Can you explain better? Unless you are working with different encodings than UTF-8 (like doing file or network operations) you shouldn't be needing std.encoding. I'm expecting UTF8 and ASCII encoding from I/O Thank you!
Re: How to port C++ std::is_reference to D ?
06.05.2020 12:07, wjoe wrote:
> Hello,
>
> I'm choking on a piece of C++ I have no idea about how to translate to D.
>
>     template<typename T, typename std::enable_if< std::is_const<T>::value == true, void>::type* = nullptr>
>     constexpr const char *modifier() const { return "[in] "; }
>
>     template<typename T, typename std::enable_if< std::is_reference<T>::value == true, void>::type* = nullptr>
>     constexpr const char *modifier() const { return "[out] "; }
>
> my attempt at it is like this:
>
>     template modifier(T)
>     {
>         static if (is (T==const))
>         {
>             const char* modifier = "[in] ";
>         }
>         else static if (/* T is a reference ?*/)
>         { // [*]
>             const char* modifier = "[out] ";
>         }
>     }
>
> but even if I could e.g. say something like is(T == ref R, R), auto a = modifier!(ref T); wouldn't work.

did you try https://dlang.org/spec/traits.html#isRef?
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 08:28:41 UTC, drug wrote: What is current D time? ... Current Times: D: ~ 1.5 seconds Chapel: ~ 9 seconds Julia: ~ 35 seconds That would be really nice if you make the resume of your research. Yes, I'll do a blog or something on GitHub and link it. Thanks for all your help.
How to port C++ std::is_reference to D ?
Hello,

I'm choking on a piece of C++ I have no idea about how to translate to D.

    template<typename T, typename std::enable_if< std::is_const<T>::value == true, void>::type* = nullptr>
    constexpr const char *modifier() const { return "[in] "; }

    template<typename T, typename std::enable_if< std::is_reference<T>::value == true, void>::type* = nullptr>
    constexpr const char *modifier() const { return "[out] "; }

my attempt at it is like this:

    template modifier(T)
    {
        static if (is (T==const))
        {
            const char* modifier = "[in] ";
        }
        else static if (/* T is a reference ?*/)
        { // [*]
            const char* modifier = "[out] ";
        }
    }

but even if I could e.g. say something like is(T == ref R, R), auto a = modifier!(ref T); wouldn't work.
Re: How to port C++ std::is_reference to D ?
On Wednesday, 6 May 2020 at 09:19:10 UTC, drug wrote:
> 06.05.2020 12:07, wjoe wrote:
>> Hello,
>>
>> I'm choking on a piece of C++ I have no idea about how to translate to D.
>>
>>     template<typename T, typename std::enable_if< std::is_const<T>::value == true, void>::type* = nullptr>
>>     constexpr const char *modifier() const { return "[in] "; }
>>
>>     template<typename T, typename std::enable_if< std::is_reference<T>::value == true, void>::type* = nullptr>
>>     constexpr const char *modifier() const { return "[out] "; }
>>
>> my attempt at it is like this:
>>
>>     template modifier(T)
>>     {
>>         static if (is (T==const))
>>         {
>>             const char* modifier = "[in] ";
>>         }
>>         else static if (/* T is a reference ?*/)
>>         { // [*]
>>             const char* modifier = "[out] ";
>>         }
>>     }
>>
>> but even if I could e.g. say something like is(T == ref R, R), auto a = modifier!(ref T); wouldn't work.
>
> did you try https://dlang.org/spec/traits.html#isRef?

yes, I did read the spec. I read the language spec on traits as well as std.traits docs as well as searching the internet for a solution since day before yesterday. But I couldn't bring it together because

    } else static if (__traits(isRef, T)) {

compiles, but e.g.

    assert (modifier!(ref int) == "[out] ");

doesn't. Anyways, thanks for your reply.
Re: Error running concurrent process and storing results in array
06.05.2020 11:18, data pulverizer пишет: CPU usage now revs up almost has time to touch 100% before the process is finished! Interestingly using `--boundscheck=off` without `--ffast-math` gives a timing of around 4 seconds and, whereas using `--ffast-math` without `--boundscheck=off` made no difference, having both gives us the 1.5 seconds. As Jacob Carlborg suggested I tried adding `-mcpu=native -flto=full -defaultlib=phobos2-ldc-lto,druntime-ldc-lto` but I didn't see any difference. Current Julia time is still around 35 seconds even when using @inbounds @simd, and running julia -O3 --check-bounds=no but I'll probably need to run the code by the Julia community to see whether it can be further optimized but it's pretty interesting to see D so far in front. Interesting when I attempt to switch off the garbage collector in Julia, the process gets killed because my computer runs out of memory (I have over 26 GB of memory free) whereas in D the memory I'm using barely registers (max 300MB) - it uses even less than Chapel (max 500MB) - which doesn't use much at all. It's exactly the same computation, D and Julia's timing were similar before the GC optimization and compiler flag magic in D. What is current D time? That would be really nice if you make the resume of your research.
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 07:57:46 UTC, WebFreak001 wrote: On Wednesday, 6 May 2020 at 07:42:44 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 07:27:19 UTC, data pulverizer wrote: Just tried removing the boundscheck and got 1.5 seconds in D! Cool! But before getting too excited I would recommend you to also run tests if the resulting data is even still correct before you keep this in if you haven't done this already! Yes, I've been outputting portions of the result which is a 10_000 x 10_000 matrix but it's definitely a good idea to do a full reconciliation of the outputs from all the languages. If you feel like it, I would recommend you to write up some small blog article what you learned about how to improve performance of hot code like this. Maybe simply write a post on reddit or make a full blog or something. I'll probably do a blog on GitHub and it can be linked it on reddit. Ultimately: all the smart suggestions in here should probably be aggregated. More benchmarks and more blog articles always help the discoverability then. Definitely, Julia has a very nice performance optimization section that makes things easy to start with https://docs.julialang.org/en/v1/manual/performance-tips/index.html, it helps alot to start getting your code speedy before you ask for help from the community.
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 07:47:59 UTC, drug wrote: 06.05.2020 10:42, data pulverizer пишет: On Wednesday, 6 May 2020 at 07:27:19 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 06:54:07 UTC, drug wrote: Thing are really interesting. So there is a space to improve performance in 2.5 times :-) Yes, `array` is smart enough and if you call it on another array it is no op. What means `--fast` in Chapel? Do you try `--fast-math` in ldc? Don't know if 05 use this flag I tried `--fast-math` in ldc but it didn't make any difference the documentation of `--fast` in Chapel says "Disable checks; optimize/specialize". Just tried removing the boundscheck and got 1.5 seconds in D! Congrats! it looks like the thriller! What about cpu usage? the same 40%? CPU usage now revs up almost has time to touch 100% before the process is finished! Interestingly using `--boundscheck=off` without `--ffast-math` gives a timing of around 4 seconds and, whereas using `--ffast-math` without `--boundscheck=off` made no difference, having both gives us the 1.5 seconds. As Jacob Carlborg suggested I tried adding `-mcpu=native -flto=full -defaultlib=phobos2-ldc-lto,druntime-ldc-lto` but I didn't see any difference. Current Julia time is still around 35 seconds even when using @inbounds @simd, and running julia -O3 --check-bounds=no but I'll probably need to run the code by the Julia community to see whether it can be further optimized but it's pretty interesting to see D so far in front. Interesting when I attempt to switch off the garbage collector in Julia, the process gets killed because my computer runs out of memory (I have over 26 GB of memory free) whereas in D the memory I'm using barely registers (max 300MB) - it uses even less than Chapel (max 500MB) - which doesn't use much at all. It's exactly the same computation, D and Julia's timing were similar before the GC optimization and compiler flag magic in D.
Re: Retrieve the return type of the current function
On 2020-05-05 19:11, learner wrote:
> On Tuesday, 5 May 2020 at 16:41:06 UTC, Adam D. Ruppe wrote:
>> typeof(return)
>
> Thank you, that was indeed easy! Is it possible to retrieve also the caller return type? Something like:

Yes, kind of:

    void foo(string caller = __FUNCTION__)()
    {
        import std.traits : ReturnType;
        alias R = ReturnType!(mixin(caller));
        static assert(is(R == int));
    }

    int bar()
    {
        foo();
        return 0;
    }

-- 
/Jacob Carlborg
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 07:42:44 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 07:27:19 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 06:54:07 UTC, drug wrote: Thing are really interesting. So there is a space to improve performance in 2.5 times :-) Yes, `array` is smart enough and if you call it on another array it is no op. What means `--fast` in Chapel? Do you try `--fast-math` in ldc? Don't know if 05 use this flag I tried `--fast-math` in ldc but it didn't make any difference the documentation of `--fast` in Chapel says "Disable checks; optimize/specialize". Just tried removing the boundscheck and got 1.5 seconds in D! Cool! But before getting too excited I would recommend you to also run tests if the resulting data is even still correct before you keep this in if you haven't done this already! If you feel like it, I would recommend you to write up some small blog article what you learned about how to improve performance of hot code like this. Maybe simply write a post on reddit or make a full blog or something. Ultimately: all the smart suggestions in here should probably be aggregated. More benchmarks and more blog articles always help the discoverability then.
Re: Error running concurrent process and storing results in array
On 2020-05-06 06:04, Mathias LANG wrote: In general, if you want to parallelize something, you should aim to have as many threads as you have cores. That should be _logical_ cores. If the CPU supports hyper threading it can run two threads per core. -- /Jacob Carlborg
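If the default pool size is in doubt, the worker count can be set explicitly. This is a minimal sketch using `TaskPool` and `totalCPUs` from `std.parallelism`; `squaresInParallel` is a hypothetical stand-in for the real computation.

```d
import std.parallelism : TaskPool, totalCPUs;
import std.range : iota;

// Fill z[i] = i * i using a private pool with `workers` worker threads.
double[] squaresInParallel(size_t n, size_t workers)
{
    auto z = new double[](n);
    auto pool = new TaskPool(workers);
    scope (exit) pool.finish(true); // wait for the workers to terminate

    foreach (i; pool.parallel(iota(n)))
        z[i] = cast(double) i * i;
    return z;
}

void main()
{
    // totalCPUs is the core count std.parallelism detected. The thread
    // calling parallel() also does work, so one fewer worker is requested.
    immutable workers = totalCPUs > 1 ? totalCPUs - 1 : 1;
    auto z = squaresInParallel(1000, workers);
    assert(z[10] == 100);
}
```

Passing a different number to `TaskPool` is a quick way to test whether the thread count is what limits CPU usage.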
Re: Error running concurrent process and storing results in array
06.05.2020 10:42, data pulverizer wrote: On Wednesday, 6 May 2020 at 07:27:19 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 06:54:07 UTC, drug wrote: Things are really interesting. So there is room to improve performance by 2.5 times :-) Yes, `array` is smart enough, and if you call it on another array it is a no-op. What does `--fast` mean in Chapel? Did you try `--fast-math` in ldc? I don't know if `-O5` uses this flag. I tried `--fast-math` in ldc but it didn't make any difference. The documentation of `--fast` in Chapel says "Disable checks; optimize/specialize". Just tried removing the boundscheck and got 1.5 seconds in D! Congrats! It looks like a thriller! What about CPU usage? The same 40%?
Re: Error running concurrent process and storing results in array
On 2020-05-06 08:54, drug wrote: Did you try `--fast-math` in ldc? I don't know if `-O5` uses this flag. Try the following flags as well: `-mcpu=native -flto=full -defaultlib=phobos2-ldc-lto,druntime-ldc-lto` -- /Jacob Carlborg
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 07:27:19 UTC, data pulverizer wrote: On Wednesday, 6 May 2020 at 06:54:07 UTC, drug wrote: Things are really interesting. So there is room to improve performance by 2.5 times :-) Yes, `array` is smart enough, and if you call it on another array it is a no-op. What does `--fast` mean in Chapel? Did you try `--fast-math` in ldc? I don't know if `-O5` uses this flag. I tried `--fast-math` in ldc but it didn't make any difference. The documentation of `--fast` in Chapel says "Disable checks; optimize/specialize". Just tried removing the boundscheck and got 1.5 seconds in D!
Re: Error running concurrent process and storing results in array
On 2020-05-06 05:25, data pulverizer wrote: I have been using std.parallelism and that has worked quite nicely but it is not fully utilising all the cpu resources in my computation If you happen to be using macOS, I know that when std.parallelism checks how many cores the computer has, it checks physical cores instead of logical cores. That could be a reason, if you're running macOS. -- /Jacob Carlborg
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 06:49:13 UTC, drug wrote: ... Then you can pass the arguments in the ctor of the derived class like:
```
foreach(long i; 0..n)
    new DerivedThread(cast(double)(i), cast(double)(i + 1), i, z).start();
thread_joinAll();
```
Untested example of a derived thread:
```
class DerivedThread : Thread
{
    this(double x, double y, long i, shared(double[]) z)
    {
        super(&run); // register run() as the thread's entry point
        this.x = x;
        this.y = y;
        this.i = i;
        this.z = z;
    }
private:
    void run()
    {
        process(x, y, i, z);
    }
    double x, y;
    long i;
    shared(double[]) z;
}
```
Thanks. Now working.
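For reference, a self-contained version of the derived-thread approach might look like the sketch below. The body of `process` is a made-up placeholder (the thread does not show the real one), `computeAll` is a hypothetical wrapper, and `super(&run)` follows the pattern from the `core.thread` documentation.

```d
import core.thread : Thread, thread_joinAll;

// Placeholder for the real computation, which is not shown in the thread.
void process(double x, double y, long i, shared(double[]) z)
{
    z[cast(size_t) i] = x * y;
}

class DerivedThread : Thread
{
    this(double x, double y, long i, shared(double[]) z)
    {
        super(&run); // run() becomes the entry point once start() is called
        this.x = x;
        this.y = y;
        this.i = i;
        this.z = z;
    }

private:
    void run()
    {
        process(x, y, i, z);
    }

    double x, y;
    long i;
    shared(double[]) z;
}

// Start one thread per index and wait for all of them to finish.
shared(double[]) computeAll(long n)
{
    auto z = cast(shared(double[])) new double[](cast(size_t) n);
    foreach (long i; 0 .. n)
        new DerivedThread(cast(double) i, cast(double)(i + 1), i, z).start();
    thread_joinAll();
    return z;
}

void main()
{
    auto z = computeAll(4);
    assert(z[3] == 12.0); // 3.0 * 4.0
}
```

Each thread stores its own copies of `x`, `y` and `i`, so nothing depends on the loop variable after the constructor returns. One OS thread per index is only reasonable for small `n`; for larger loops a task pool is usually the better fit.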
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 06:54:07 UTC, drug wrote: Things are really interesting. So there is room to improve performance by 2.5 times :-) Yes, `array` is smart enough, and if you call it on another array it is a no-op. What does `--fast` mean in Chapel? Did you try `--fast-math` in ldc? I don't know if `-O5` uses this flag. I tried `--fast-math` in ldc but it didn't make any difference. The documentation of `--fast` in Chapel says "Disable checks; optimize/specialize".
Re: Error running concurrent process and storing results in array
06.05.2020 09:43, data pulverizer wrote: On Wednesday, 6 May 2020 at 05:50:23 UTC, drug wrote: General advice - try to avoid using `array` and `new` in hot code. Memory allocation is slow in general, unless you use carefully crafted custom memory allocators. That can easily be the reason for the 40% CPU usage, because the cores are waiting for the memory subsystem. I changed the Matrix object from class to struct and the timing went from about 19 seconds with ldc2 and the `-O5` flag to 13.69 seconds, but CPU usage is still at ~40%, still using `taskPool.parallel(iota(n))`. The `.array` method is my own method on the Matrix object that just returns the internal data array, so it shouldn't copy. Julia is now at about 34 seconds (D was at about 30 seconds while just using dmd with no optimizations). To make things more interesting, I also did an implementation in Chapel, which is now at around 9 seconds with the `--fast` flag. Things are really interesting. So there is room to improve performance by 2.5 times :-) Yes, `array` is smart enough, and if you call it on another array it is a no-op. What does `--fast` mean in Chapel? Did you try `--fast-math` in ldc? I don't know if `-O5` uses this flag.
Re: Error running concurrent process and storing results in array
06.05.2020 09:24, data pulverizer wrote: On Wednesday, 6 May 2020 at 05:44:47 UTC, drug wrote: `proc` is already a delegate, so `&proc` is a pointer to the delegate; just pass `proc` itself. Thanks, done that, but I'm getting a range violation on z which was not there before.
```
core.exception.RangeError@onlineapp.d(3): Range violation
??:? _d_arrayboundsp [0x55de2d83a6b5]
onlineapp.d:3 void onlineapp.process(double, double, long, shared(double[])) [0x55de2d8234fd]
onlineapp.d:16 void onlineapp.main().__lambda1() [0x55de2d823658]
??:? void core.thread.osthread.Thread.run() [0x55de2d83bdf9]
??:? thread_entryPoint [0x55de2d85303d]
??:? [0x7fc1d6088668]
```
Confirmed. I think that's because the `proc` delegate captures the `i` variable of the `for` loop. I managed to get rid of the range violation by using `foreach`:
```
foreach(i; 0..n) // instead of for(long i = 0; i < n;)
```
I guess that the `proc` delegate can't capture the `i` variable of the `foreach` loop, so the range violation doesn't happen. You use the `proc` delegate to pass arguments to the `process` function. For this purpose I would recommend deriving a class from class Thread. Then you can pass the arguments in the ctor of the derived class like:
```
foreach(long i; 0..n)
    new DerivedThread(cast(double)(i), cast(double)(i + 1), i, z).start();
thread_joinAll();
```
Untested example of a derived thread:
```
class DerivedThread : Thread
{
    this(double x, double y, long i, shared(double[]) z)
    {
        super(&run); // register run() as the thread's entry point
        this.x = x;
        this.y = y;
        this.i = i;
        this.z = z;
    }
private:
    void run()
    {
        process(x, y, i, z);
    }
    double x, y;
    long i;
    shared(double[]) z;
}
```
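The capture behaviour can be reproduced without any threads. In the sketch below (all function names are invented for illustration), the `for` loop declares a single `i`, so every delegate created in the loop refers to that one variable; a thread that starts after the loop has advanced can therefore read an index equal to `n`, which is consistent with the range violation above. Creating the delegate inside a helper function gives each delegate its own copy.

```d
import std.stdio : writeln;

// Each call has its own frame, so the returned delegate owns a private i.
long delegate() makeGetter(long i)
{
    return () => i;
}

// All delegates share the single loop variable declared in the for-init.
long[] sharedCapture(long n)
{
    long delegate()[] dgs;
    for (long i = 0; i < n; ++i)
        dgs ~= () => i;

    long[] seen;
    foreach (dg; dgs)
        seen ~= dg(); // called after the loop, when i == n
    return seen;
}

// Same loop, but each delegate is built by the helper and keeps its value.
long[] privateCapture(long n)
{
    long delegate()[] dgs;
    for (long i = 0; i < n; ++i)
        dgs ~= makeGetter(i);

    long[] seen;
    foreach (dg; dgs)
        seen ~= dg();
    return seen;
}

void main()
{
    writeln(sharedCapture(3));  // every entry is the final value of i
    writeln(privateCapture(3)); // one distinct entry per iteration
}
```

Passing the values through a constructor, as in the derived-thread suggestion, avoids the problem for the same reason: the values are copied when the thread object is created.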
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 05:50:23 UTC, drug wrote: General advice - try to avoid using `array` and `new` in hot code. Memory allocation is slow in general, unless you use carefully crafted custom memory allocators. That can easily be the reason for the 40% CPU usage, because the cores are waiting for the memory subsystem. I changed the Matrix object from class to struct and the timing went from about 19 seconds with ldc2 and the `-O5` flag to 13.69 seconds, but CPU usage is still at ~40%, still using `taskPool.parallel(iota(n))`. The `.array` method is my own method on the Matrix object that just returns the internal data array, so it shouldn't copy. Julia is now at about 34 seconds (D was at about 30 seconds while just using dmd with no optimizations). To make things more interesting, I also did an implementation in Chapel, which is now at around 9 seconds with the `--fast` flag.
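To make the "no allocation in hot code" advice concrete, here is a rough sketch. The real `Matrix` type is not shown in the thread, so this `Matrix` struct, its row-major layout and the `rowSums` kernel are all assumptions; only `taskPool.parallel(iota(n))` is taken from the post.

```d
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical minimal matrix: a value type wrapping one flat buffer.
struct Matrix
{
    size_t rows, cols;
    double[] data; // row-major storage, allocated once in the constructor

    this(size_t rows, size_t cols)
    {
        this.rows = rows;
        this.cols = cols;
        data = new double[](rows * cols);
        data[] = 0;
    }

    // Hands out a slice of the internal storage: no allocation, no copy.
    inout(double)[] array() inout
    {
        return data;
    }

    ref double opIndex(size_t r, size_t c)
    {
        return data[r * cols + c];
    }
}

// Sum each row in parallel; the only allocation happens before the loop.
double[] rowSums(Matrix m)
{
    auto sums = new double[](m.rows);
    foreach (r; taskPool.parallel(iota(m.rows)))
    {
        auto row = m.array()[r * m.cols .. (r + 1) * m.cols]; // a view
        double s = 0;
        foreach (x; row)
            s += x;
        sums[r] = s;
    }
    return sums;
}

void main()
{
    auto m = Matrix(2, 3);
    m[0, 0] = 1;
    m[0, 2] = 2;
    m[1, 1] = 5;
    assert(rowSums(m) == [3.0, 5.0]);
}
```

Because the struct only holds two sizes and a slice, passing it by value copies a few words and never the matrix data.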
Re: Error running concurrent process and storing results in array
On Wednesday, 6 May 2020 at 05:44:47 UTC, drug wrote: `proc` is already a delegate, so `&proc` is a pointer to the delegate; just pass `proc` itself. Thanks, done that, but I'm getting a range violation on z which was not there before.
```
core.exception.RangeError@onlineapp.d(3): Range violation
??:? _d_arrayboundsp [0x55de2d83a6b5]
onlineapp.d:3 void onlineapp.process(double, double, long, shared(double[])) [0x55de2d8234fd]
onlineapp.d:16 void onlineapp.main().__lambda1() [0x55de2d823658]
??:? void core.thread.osthread.Thread.run() [0x55de2d83bdf9]
??:? thread_entryPoint [0x55de2d85303d]
??:? [0x7fc1d6088668]
```