Re: gdc or ldc for faster programs?
On Thursday, 27 January 2022 at 16:46:59 UTC, Ali Çehreli wrote: What I know is that weak symbols can be overridden by strong symbols during linking. Which means, if a function body is inlined which also has a weak symbol, some part of the program may be using the inlined definition and some other parts may be using the overridden definition. Thanks to separate compilation, they need not match hence the violation of the one-definition rule (ODR). But the language requires ODR, so we can emit templates as weak_odr, telling the optimizer and linker that the symbols should be merged _and_ that ODR can be assumed to hold (i.e. inlining is OK). The onus of honouring ODR is on the user - not the compiler - because we allow the user to do separate compilation. Some more detailed explanation and example: https://stackoverflow.com/questions/44335046/how-does-the-linker-handle-identical-template-instantiations-across-translation/44346057 -Johan
Re: How to profile compile times of a source code?
On Sunday, 31 January 2021 at 12:16:50 UTC, Imperatorn wrote: On Saturday, 30 January 2021 at 23:34:50 UTC, Stefan Koch wrote: On Saturday, 30 January 2021 at 22:47:39 UTC, Ahmet Sait wrote: [...] I have a way of getting the profile data you are after. Get the dmd_tracing_20942 branch from https://github.com/UplinkCoder/dmd Compile that version of dmd. This special version of dmd will generate a trace file which can be read with the included printTraceHeader tool. Interesting. Is this something that we could get into dmd with a switch? Try LDC 1.25 (now in beta testing) with --ftime-trace. Clang has the same option, so you can read more about it online in that context. Be sure to check out the related commandline flags. I recommend the Tracy UI to look at traces, because it is by far the fastest viewer of large traces. https://github.com/wolfpld/tracy -Johan
Re: LDC relocation flags
On Friday, 18 December 2020 at 13:00:45 UTC, Severin Teona wrote: Hi guys! Do you know how I can compile D code using LDC with the following gcc flags? * -msingle-pic-base * -mpic-register=r9 * -mno-pic-data-is-text-relative. As far as I know, there are no equivalents in D for these. Is it ok to use the -Xcc flag? Thank you! One direction to look into is what flags to use with Clang instead of GCC. The -Xcc flag is not useful because that will only help with linker configuration. At least the first two flags that you mention are related to parts of codegen that I think are not done by the linker. -Johan
Re: toStringz lifetime
On Sunday, 25 October 2020 at 10:03:44 UTC, Ali Çehreli wrote: Is it really safe? Imagine a multi-threaded environment where another D function is executed that triggers a GC collection right before the printf. Does the GC see that local variable 'name' that is on the C side? What I don't know is whether the GC is aware only of the stack frames of D functions or the entire thread, which would include the C caller's 'name'. Small note: besides the stack, it is crucial that the GC is aware of the CPU register values. -Johan
Re: Order of static this() execution?
On Sunday, 23 February 2020 at 11:55:11 UTC, drathier wrote: On Sunday, 23 February 2020 at 11:41:25 UTC, Johan Engelen wrote: On Sunday, 23 February 2020 at 09:59:45 UTC, drathier wrote: I'm having some trouble with the order in which `static this()` runs. This is the order defined in the source file, numbered for convenience: To avoid confusion: you have all `static this()` in a single source file? Or across several source files? -Johan They're all in a single source file. The `[template]` prints are inside templates, It's not clear from the language specification, but in this case with templates, I am not surprised that the order of execution is not the same as in the source file. Probably it does fit with the order in the source file if you take into account where the template is instantiated in that file (but you shouldn't depend on that). I strongly recommend not to depend on the order of multiple static this execution, by either rewriting things or by including some logic to make sure things are called in the right order (checking that something has already run, and running it if not, or cancel running it if it already ran) -Johan
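The "check that something has already run, and run it if not" advice can be sketched roughly as follows. This is only an illustration: `table`, `tableReady`, `ensureTable` and `tableSum` are made-up names, not anything from the original code in the thread.

```d
// Hypothetical module-level state that several initializers depend on.
private int[] table;
private bool tableReady; // thread-local, just like `table`

// Idempotent initializer: safe to call from any `static this()` or at first use.
void ensureTable()
{
    if (tableReady)
        return; // already ran, nothing to do
    table = [1, 2, 3];
    tableReady = true;
}

int tableSum()
{
    ensureTable(); // works whether or not a static constructor ran before us
    int sum = 0;
    foreach (v; table)
        sum += v;
    return sum;
}

static this()
{
    ensureTable(); // order relative to other static constructors no longer matters
}
```

Every consumer calls `ensureTable()` itself, so nothing depends on which `static this()` happens to run first.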
Re: Order of static this() execution?
On Sunday, 23 February 2020 at 09:59:45 UTC, drathier wrote: I'm having some trouble with the order in which `static this()` runs. This is the order defined in the source file, numbered for convenience: To avoid confusion: you have all `static this()` in a single source file? Or across several source files? -Johan
Re: Default values in derived class
On Saturday, 28 December 2019 at 20:47:38 UTC, Mike Parker wrote: On Saturday, 28 December 2019 at 20:22:51 UTC, JN wrote: import std.stdio; class Base { bool b = true; } class Derived : Base { bool b = false; } void main() { // 1 Base b = new Derived(); writeln(b.b); // true // 2 Derived d = new Derived(); writeln(d.b); // false } Expected behavior or bug? 1) seems like a bug to me. Expected. Member variables do not override base class variables. b is declared as Base, so it knows nothing about Derived’s member variable even though you instantiated it with an instance of Derived. There’s no vtable for variables. If you want it to print false, then you either have to cast b to Derived or provide a getter function in Base that Derived can override. What Mike is saying is that `Base` has one `b` member variable, but `Derived` has two (!). ``` writeln(d.b); // false writeln(d.Base.b); // true (the `b` member inherited from Base) ``` -Johan
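For completeness, a small sketch of the getter approach Mike mentions. The method name `flag` is invented for this example; only member functions are dispatched through the vtable, fields are not.

```d
class Base
{
    bool b = true;
    // Virtual getter: the call is dispatched through the vtable.
    bool flag() { return b; } // reads Base.b
}

class Derived : Base
{
    bool b = false; // a second, separate field that hides Base.b
    override bool flag() { return b; } // reads Derived.b
}
```

With `Base x = new Derived();`, reading `x.b` still gives the `Base` field, while `x.flag()` calls the override and gives the `Derived` field.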
Re: static assert(version(x)) ?
On Tuesday, 26 November 2019 at 12:53:02 UTC, Jonathan M Davis wrote: On Tuesday, November 26, 2019 4:29:18 AM MST S.G via Digitalmars-d-learn wrote: On Tuesday, 26 November 2019 at 10:24:00 UTC, Robert M. Münch wrote: How can I write something like this to check if any of a set of specific versions is used? static assert(!(version(a) | version(b) | version(c)): The problem is that I can use version(a) like a test, and the symbol a is not accessible from assert (different, non-accessible namespace). BTW D language designers are against boolean eval of version. It's not a technical restriction, it's just that they don't want this to work. ... static if can be used instead of version blocks to get boolean conditions, and local version identifiers can be defined which combine some set of version identifiers, but such practices are discouraged for D programmers in general, and they're basically forbidden in official source code. The only case I'm aware of where anything like that is used in druntime or Phobos is for darwin stuff, since darwin isn't a predefined identifier. `xversion` is a simple and effective and useful tool, used in dmd source: https://github.com/dlang/dmd/blob/53b533dc7fc5da604e7ebf457734766b4e96d900/src/dmd/globals.d#L21-L35 ``` template xversion(string s) { enum xversion = mixin(`{ version (` ~ s ~ `) return true; else return false; }`)(); } enum version_a = xversion!`a`; ``` -Johan
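Applied to the original question, a sketch could look like this. `a`, `b` and `c` are the placeholder version identifiers from the question, and `anySet` is a name made up here.

```d
// Same helper as in the dmd source linked above.
template xversion(string s)
{
    enum xversion = mixin(`{ version (` ~ s ~ `) return true; else return false; }`)();
}

// True if at least one of the three versions was requested (-version=a, ...).
enum anySet = xversion!`a` || xversion!`b` || xversion!`c`;

// Refuse to build when any of them is set.
static assert(!anySet, "versions a, b and c are not supported");
```

Because `xversion!"..."` is an ordinary compile-time `bool`, it combines with `||`, `&&` and `!` inside `static assert` or `static if`.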
Re: Final necessary for inline optimization?
Just a brief answer. On Saturday, 19 October 2019 at 15:58:08 UTC, IGotD- wrote: Which one is it, LDC recognizes TestClass isn't derived or is sure that the class (c) isn't derived in particular? It is the latter: the optimizer is able to prove that object c has a vtable that is known exactly. Then indexing into that vtable gives a definite function that can then be inlined. Some more info: The (suboptimal) trick that LDC employs is that after the call to `new`, LDC again sets the vtable of the object. It is superfluous, because it is already done by `new`, but `new` is opaque to the optimizer whereas the extra vtable write is not. So after the object creation, the optimizer knows exactly what vtable is used for that object. Now unfortunately, that only works for the first virtual call after `new`. Any opaque function that is called with `c` as parameter, e.g. calling a virtual function of that class, will destroy knowledge about what vtable is stored in `c`. Per D language spec, the virtual function cannot overwrite the vtable pointer, but the optimizer does not know that, so it assumes it might be overwritten and it no longer knows the contents of the vtable pointer. You can see this happening here: https://d.godbolt.org/z/8ERNhg -Johan
Re: How does D distnguish managed pointers from raw pointers?
On Thursday, 3 October 2019 at 14:21:37 UTC, Andrea Fontana wrote: In D arrays are fat pointer instead: int[10] my_array; my_array is actually a pair ptr+length. ``` int[10] my_static_array; int[] my_dynamic_array; ``` my_static_array will not be a fat pointer. Length is known at compile time. Address is known at link/load time so it's also not a pointer but just a normal variable (& will give you a pointer to the array data). my_dynamic_array will be a pair for ptr+length. -Johan
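The difference shows up directly in `.sizeof`. A small sketch; `sliceAliasesStorage` is a helper name made up for the illustration.

```d
int[10] myStaticArray; // ten ints stored in place
int[] myDynamicArray;  // slice: length + pointer

// The static array is just its elements.
static assert(myStaticArray.sizeof == 10 * int.sizeof);
// The slice is the fat pointer: a length and a pointer.
static assert(myDynamicArray.sizeof == size_t.sizeof + (int*).sizeof);

// Slicing the static array yields a fat pointer to the same storage.
bool sliceAliasesStorage()
{
    int[] view = myStaticArray[];
    return view.ptr is myStaticArray.ptr && view.length == 10;
}
```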
Re: Deprecation message sources
On Tuesday, 17 September 2019 at 20:16:12 UTC, Anonymouse wrote: On Tuesday, 17 September 2019 at 19:31:53 UTC, Steven Schveighoffer wrote: I'd hate to say the answer is to special case Nullable for so many functions, but what other alternative is there? -Steve Nullable isn't alone, std.json.JSONType causes a literal wall of text of deprecation warnings. import std.stdio; import std.json; void main() { writeln(JSONValue.init.type); } https://run.dlang.io/is/J0UDay Wow. How come this is not caught by the CI testing? -Johan
Re: C++ vs D: Default param values and struct to array casting
On Friday, 6 September 2019 at 09:14:31 UTC, Andrew Edwards wrote: C++ allows for the following: struct Demo { float a, b, c, d; Demo() { a = b = c = d = 0.0f; } Demo(float _a, float _b, float _c, float _d) { a = _a; b = _b; c = _c; d = _d; } float operator[] (size_t i) const { return (&a)[i]; } //[3] "(&a)[i]" is undefined behavior in C++. You cannot index into struct members like that. Of course it may work in certain cases, but UB is UB. Don't do it! I found a more detailed explanation for you: https://stackoverflow.com/questions/40590216/is-it-legal-to-index-into-a-struct -Johan
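One way to get indexed access on the D side without pointer tricks is to make the array the actual storage and expose the names as accessors. This is only a sketch of that idea, not what the original code did; the field name `v` is invented.

```d
struct Demo
{
    float[4] v = 0.0f; // all four components default to zero

    this(float a, float b, float c, float d)
    {
        v = [a, b, c, d];
    }

    // Named read access forwards to the array elements.
    float a() const { return v[0]; }
    float b() const { return v[1]; }
    float c() const { return v[2]; }
    float d() const { return v[3]; }

    // Bounds-checked indexing, no pointer arithmetic across members.
    float opIndex(size_t i) const { return v[i]; }
}
```

Indexing then reads from a real array, so an out-of-range index is caught by the usual bounds check.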
Re: LDC won't find ld linker -why?
On Tuesday, 9 July 2019 at 15:25:17 UTC, Dukc wrote: I just downloaded ldc 1.15.0 for Linux from GH releases. Testing it, it will make the object file out of a hello world application, but then complain: ``` collect2: fatal error: cannot find ‘ld’ compilation terminated. ``` Run LDC with "-v" and check what the linker command is (somewhere at the bottom, LDC invokes `gcc` for linking). You might see something like `-Wl,-fuse-ld=...` in that gcc command line. Perhaps that ld.gold / ld.bfd is not installed and thus GCC complains that it cannot find it. See e.g. https://askubuntu.com/questions/798453/collect2-fatal-error-cannot-find-ld-compilation-terminated -Johan
Re: Can D optimize?
On Sunday, 9 June 2019 at 05:24:56 UTC, Amex wrote: Can dmd or ldc optimize the following cases: foo(int x) { if (x > 10 && x < 100) bar1; else bar2; } ... for(int i = 23; i < 55; i++) foo(i); // equivalent to calling bar1(i) clearly i is within the range of the if in foo and so the checks are unnecessary. This is a simple case of inlining for the optimizer, so works out well with LDC and GDC: https://d.godbolt.org/z/pLy8Yy -Johan
Re: Explicitly avoid GC of objects?
On Tuesday, 21 May 2019 at 13:23:54 UTC, Benjamin Schaaf wrote: On Tuesday, 21 May 2019 at 11:54:08 UTC, Robert M. Münch wrote: Is there a trick to accomplish 2 when objects are created from different scopes which need to be kept? So, I have one function creating the objects and one using them. How can I keep things on the stack between these two functions? How is 3 done? Is this only useful for static variables? I'll try to describe rules 2 and 3 as simply as possible: As long as you can access the pointer to gc allocated memory in D it will not be freed. If you don't actually access the pointer, the compiler may optimize-out storage of that pointer and the garbage collector will then not see it. So this statement should read: "As long as _the GC_ can see the pointer to gc allocated memory in D it will not be freed". The GC is looking at registers, stack, static memory, GC allocated memory, ... So whether that pointer lives on the stack: int* foo() { return new int; } void bar() { int* a = foo(); c_fn(a); } This is actually not GC safe [*]. The local storage `a` is optimized out, and the parameter to `c_fn` is passed in a register that is not guaranteed to not be overwritten by `c_fn`. If after the overwrite, the GC is invoked (e.g. from another thread, or from deeper in the call tree of `c_fn`) then the memory may be freed. LDC, DMD, GDC, all 3 perform that optimization. So more care is needed here! https://d.godbolt.org/z/DumVNF In static or thread local memory: int* a; __gshared int* b; void bar() { a = new int; c_fn(a); b = a; c_fn(b); } Also here, whole-program analysis/optimization may discover that `a` and `b` are really never used by anyone. Again, optimizing-out those storage spaces will make this code GC unsafe [*]. In this case, I think LTO and LTO visibility of the c_fn implementation would be needed to do that optimization and thus probably will be safe, for now (!). -Johan [*] The unsafety is a little tricky. If `c_fn` stores the pointer in a place where the GC can see it (registers, D static memory, ...) all is good. But if `c_fn` stores it in, say, a variable on the C side, removes it from the register, and _then_ GC is invoked, that's when trouble may happen.
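One way to take the guesswork out of this is to register the allocation with the GC explicitly for as long as C code may hold the pointer, using `GC.addRoot` / `GC.removeRoot` from `core.memory`. A minimal sketch; the `c_fn` below is a D stand-in for the real C function, and `callWithPinnedPointer` is a made-up name.

```d
import core.memory : GC;

// Stand-in for the C function from the post. Hypothetical: the real one would
// only be declared here and implemented on the C side, out of the GC's sight.
extern (C) void c_fn(int* p) nothrow @nogc
{
    *p += 1;
}

int callWithPinnedPointer()
{
    int* a = new int;
    *a = 41;

    GC.addRoot(a);                 // register the allocation as a GC root
    scope (exit) GC.removeRoot(a); // unregister once C is done with it

    c_fn(a);
    GC.collect();                  // the rooted allocation survives this
    return *a;
}
```

The root keeps the memory alive regardless of what the optimizer does with the local `a` or which registers `c_fn` clobbers.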
Re: LDC2 and classic profiling
On Saturday, 11 May 2019 at 11:34:35 UTC, Denis Feklushkin wrote: On Saturday, 11 May 2019 at 09:12:24 UTC, Johan Engelen wrote: Those calls are to templated functions I presume? No Then I don't understand how you'd see instrumentation on functions that you did not compile with -fprofile-instr-generate (indirect calls to such functions may be recorded); the std lib is not compiled with profiling instrumentation, and so you shouldn't see any internal instrumentation of it. Unless those functions are instantiated in your object file (e.g. templates or explicitly inlined functions). Also I changed flags to "dflags-ldc": ["-fprofile-instr-generate", "-O0"] - second flag disables optimisation (I assumed that optimizations magically completely remove calls to my functions. But this is probably not the case.) No, indeed, -O0 doesn't (shouldn't!) matter. Ok. It is strange that you don't see calls to your functions. Just to verify, could you compile a simple program manually (without dub) and verify that you see calls to your own functions? Tried, and it works! Lambdas should also be instrumented, so please test that. Works on simple program too. Excellent. I think dub -v will output the exact commands that dub is executing. Looks like some parts are not compiled with the compile flag, and some other parts are? By the way, if you are on linux, then XRay should work like with clang ( -fxray-instrument ) Tried it, and xray also does not returns any info about my own functions... You tried with DUB or manually? Note that XRay has a (configurable) threshold for not instrumenting very small functions. See for example this test: https://github.com/ldc-developers/ldc/blob/master/tests/instrument/xray_simple_execution.d -Johan
Re: LDC2 and classic profiling
On Saturday, 11 May 2019 at 06:59:52 UTC, Denis Feklushkin wrote: On Saturday, 11 May 2019 at 05:46:29 UTC, Denis Feklushkin wrote: All another calls is made inside of this lambda - maybe lambdas is not traced by profiler? Tried to remove lambda with same result. Command: llvm-profdata show -all-functions -topn=10 default.profdata returns huge amount of std*, core*, vibe* calls - it is all used in my code. But here is no one my own function (except "main"). Those calls are to templated functions I presume? (they are instantiated in your program and hence instrumented) Also I changed flags to "dflags-ldc": ["-fprofile-instr-generate", "-O0"] - second flag disables optimisation (I assumed that optimizations magically completely remove calls to my functions. But this is probably not the case.) No, indeed, -O0 doesn't (shouldn't!) matter. It is strange that you don't see calls to your functions. Just to verify, could you compile a simple program manually (without dub) and verify that you see calls to your own functions? Lambdas should also be instrumented, so please test that. By the way, if you are on linux, then XRay should work like with clang ( -fxray-instrument ) -Johan
Re: LDC2 and classic profiling
On Friday, 10 May 2019 at 14:00:30 UTC, Denis Feklushkin wrote: Build with dub some package. Profiling are enabled by dub.json: "dflags-ldc": ["-fprofile-instr-generate", "-finstrument-functions", "-cov"], Resulting default.profraw (and generated default.profdata) contains only calls to external libraries, but not my internal functions calls. Why? You only need `-fprofile-instr-generate` for generating default.profraw. contains only calls to external libraries That's impossible, because those are exactly _not_ profiled. This may help: https://forum.dlang.org/post/voknxddblrbuywcyf...@forum.dlang.org -Johan
Re: D threading and shared variables
On Sunday, 7 April 2019 at 14:08:07 UTC, Archie Allison wrote: This generally works OK when tied to a Console but when link options are changed to be SUBSYSTEM:WINDOWS and ENTRY:mainCRTStartup it rarely does. Manually setting the entry point sounds problematic if no other precautions are taken. Are you sure that druntime is initialized? See [1]. - Johan [1] https://wiki.dlang.org/D_for_Win32
Re: Block statements and memory management
On Saturday, 16 March 2019 at 03:47:43 UTC, Murilo wrote: Does anyone know if when I create a variable inside a scope as in {int a = 10;} it disappears completely from the memory when the scope finishes? Or does it remain in some part of the memory? I am thinking of using scopes to make optimized programs that consume less memory. Others have made good points in this thread, but what is missing is that indeed scopes _can_ be used beneficially to reduce memory footprint. I recommend playing with this code on d.godbolt.org: ``` void func(ref int[10] a); // important detail: pointer void foo() { { int[10] a; func(a); } { int[10] b; func(b); } } ``` Because the variable is passed by reference (pointer), the optimizer cannot merge the storage space of `a` and `b` _unless_ scope information is taken into account. Without taking scope into account, the first `func` call could store the pointer to `a` somewhere for later use in the second `func` call for example. However, because of scope, using `a` after its scope has ended is UB, and thus variables `a` and `b` can share the same storage. GDC uses scope information for variable lifetime optimization, but LDC and DMD both do not. For anyone interested in working on compilers: adding variable scope lifetime to LDC (not impossibly hard) would be a nice project and be very valuable. -Johan
Re: Returning reference: why this works?
On Wednesday, 13 March 2019 at 20:57:13 UTC, Denis Feklushkin wrote: import std.stdio; struct S { int x; } ref S func1(ref S i) // i is reference { return i; } ref S func2(S i) // i is not reference { return func1(i); // Works! Possibility to return reference to local object i? Indeed, you're invoking UB here. With compiler flag `-dip25` that code no longer compiles. -Johan
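A sketch of how the two functions could be written so that the intent is explicit. Note that `func2` is changed here to return by value, which is my assumption about what was wanted, not something stated in the thread.

```d
struct S
{
    int x;
}

// `return ref` states that the returned reference may be derived from `i`,
// so the compiler can check callers.
ref S func1(return ref S i)
{
    return i;
}

// A by-value parameter dies at the end of the function, so hand back a copy.
S func2(S i)
{
    return func1(i); // copied out before `i` goes away
}
```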
How to use core.atomic.cas with (function) pointers?
The following code compiles: ``` alias T = shared(int)*; shared T a; shared T b; shared T c; void foo() { import core.atomic: cas; cas(&a, b, c); } ``` The type of T has to be a pointer to a shared int (you get a template match error for `cas` if `T = int*`), which is annoying but I kind-of understand. However, change b to null (`cas(&a, null, c);`) and things no longer work: "Error: template `core.atomic.cas` cannot deduce function from argument types" I have not succeeded to make things work with function pointers (neither with nor without `null` as second argument). What am I doing wrong if `alias T = void function();` ? Thanks, Johan
Re: Compiling to 68K processor (Maybe GDC?)
On Monday, 21 January 2019 at 17:08:23 UTC, Johan Engelen wrote: For LDC, dereferencing `null` invokes Undefined Behavior [1]. For completeness, you can tell LDC that dereferencing `null` is _not_ UB in a particular function by specifying `@llvmAttr("null-pointer-is-valid", "true")`: https://d.godbolt.org/z/1FQCRf -Johan
Re: Compiling to 68K processor (Maybe GDC?)
On Saturday, 19 January 2019 at 17:45:41 UTC, Patrick Schluter wrote: Afaict NULL pointer dereferencing must fault for D to be "usable". At least all code is written with that assumption. Dereferencing `null` in D is implementation defined (https://dlang.org/spec/arrays.html#pointers). For LDC, dereferencing `null` invokes Undefined Behavior [1]. However, the compiler does try to be a little friendly towards the programmer. UB includes just ignoring the dereference, but if you are blatantly dereferencing `null` with optimization enabled, the compiler generates a `ud2` instruction for you: https://d.godbolt.org/z/5VLjFt -Johan [1] Now I am not quite sure yet whether Undefined Behavior is part of the set of behaviors allowed to choose from for Implementation Defined behavior. ;-)
Re: LDC2 with -fxray-instrument
On Monday, 21 January 2019 at 14:36:15 UTC, Márcio Martins wrote: On Thursday, 17 January 2019 at 00:11:10 UTC, Johan Engelen wrote: OK, got it :-) LLVM 7 changed things a little, so it's broken with LDC 1.13 [*]. For now, you can use LDC 1.12 (LLVM 6). You also have to add `-L-lstdc++` as compiler flag. So with LDC 1.12, Linux, and this commandline things work for me: ``` ldc2 -fxray-instrument -fxray-instruction-threshold=1 xraytest.d -L-lstdc++ XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1" xraytest ``` Happy testing, Johan [*] XRay was split into several libraries, and we only copy+link with one of them. You can make things work by downloading the libs and linking with them. Great! Does it mean it will be fixed in next LDC release? :) Yes: https://github.com/ldc-developers/ldc/pull/2965 I also played a bit with -finstrument-functions and it works fine for tools like uftrace... However, if you define your own __cyg_profile_func_* functions, it won't work (or be useful), because there is no way to disable instrumenting on these __cyg_profile_func_* functions, so you can imagine what happens. I tried @llvmAttr("no_instrument_function"), supposedly the GCC compatible way to disable instrumentation on these, but it doesn't make a difference. You need: pragma(LDC_profile_instr, false), e.g.: ``` pragma(LDC_profile_instr, false) void fun () { ... } ``` There are a few LLVM settings that will make this useful: - https://reviews.llvm.org/D40276 - https://reviews.llvm.org/D39331 (which enables instrumenting after inlining, which is more useful in my use case) - https://reviews.llvm.org/D37622 (it has not been merged yet, but necessary, to be able to conveniently call library functions from within the hooks) These are not too hard to implement in LDC. Whoever is interested in working on it, here are some pointers in addition to the LLVM links: https://github.com/ldc-developers/ldc/issues/2466 https://github.com/ldc-developers/ldc/pull/1845/files -Johan
Re: LDC2 with -fxray-instrument
On Wednesday, 16 January 2019 at 23:29:45 UTC, Johan Engelen wrote: On Wednesday, 16 January 2019 at 22:10:14 UTC, Johan Engelen wrote: On Wednesday, 16 January 2019 at 17:36:31 UTC, Márcio Martins wrote: On Tuesday, 15 January 2019 at 22:51:15 UTC, Johan Engelen wrote: What platform are you on? Linux x64 OK, so that should work. What is your testcase? Try with `-fxray-instruction-threshold=1` to also instrument small functions. I'm trying now and am having the same problem. I'm trying to figure out how to fix LDC so that it works. work in progress! -Johan OK, got it :-) LLVM 7 changed things a little, so it's broken with LDC 1.13 [*]. For now, you can use LDC 1.12 (LLVM 6). You also have to add `-L-lstdc++` as compiler flag. So with LDC 1.12, Linux, and this commandline things work for me: ``` ldc2 -fxray-instrument -fxray-instruction-threshold=1 xraytest.d -L-lstdc++ XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1" xraytest ``` Happy testing, Johan [*] XRay was split into several libraries, and we only copy+link with one of them. You can make things work by downloading the libs and linking with them.
Re: LDC2 with -fxray-instrument
On Wednesday, 16 January 2019 at 22:10:14 UTC, Johan Engelen wrote: On Wednesday, 16 January 2019 at 17:36:31 UTC, Márcio Martins wrote: On Tuesday, 15 January 2019 at 22:51:15 UTC, Johan Engelen wrote: What platform are you on? Linux x64 OK, so that should work. What is your testcase? Try with `-fxray-instruction-threshold=1` to also instrument small functions. I'm trying now and am having the same problem. I'm trying to figure out how to fix LDC so that it works. work in progress! -Johan
Re: LDC2 with -fxray-instrument
On Wednesday, 16 January 2019 at 17:36:31 UTC, Márcio Martins wrote: On Tuesday, 15 January 2019 at 22:51:15 UTC, Johan Engelen wrote: What platform are you on? Linux x64 OK, so that should work. What is your testcase? Try with `-fxray-instruction-threshold=1` to also instrument small functions. Have you gotten things to work with C++ code? -Johan
Re: LDC2 with -fxray-instrument
On Tuesday, 15 January 2019 at 14:32:03 UTC, Márcio Martins wrote: Does anyone have any idea how to build with LDC and -fxray-instrument? I am running LDC 1.13 on Linux (x64). The XRay section is in the binary, and the compiler-rt is linked in, but when I run the binary with XRAY_OPTIONS="patch_premain=true verbosity=1" in the environment, I get nothing. No XRay logging or terminal output. Great that you are trying this! I added it to LDC (trivial), but I have not tested it because I develop on macOS (bad excuse). Back then at least, XRay did not function on macOS. What platform are you on? -Johan
Re: one path skips constructor
On Sunday, 13 January 2019 at 16:29:27 UTC, Kagamin wrote: --- struct A { int a; this(int) { if(__ctfe)this(0,0); //Error: one path skips constructor else a=0; } this(int,int){ a=1; } } --- Is this supposed to not compile? Yes. See spec 14.14.8.1: https://dlang.org/spec/struct.html#struct-constructor -Johan
Re: Creating fixed array on stack
On Friday, 11 January 2019 at 15:23:08 UTC, Dgame wrote: On Friday, 11 January 2019 at 14:46:36 UTC, Andrey wrote: Hi, In C++ you can create a fixed array on stack: int count = getCount(); int myarray[count]; Small correction: this is valid in C, but not in C++. In D the "count" is part of type and must be known at CT but in example it is RT. How to do such thing in D? Without using of heap. You could try alloca: Indeed, but make sure you read up on the caveats of using alloca. I believe all use cases of VLA/alloca have better alternative implementation options. -Johan
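For reference, a minimal sketch of the `alloca` route, with one of the caveats made explicit. The function name `sumOfIndices` and the 1024-element cap are arbitrary choices for this example.

```d
import core.stdc.stdlib : alloca;

// Sums 0 .. count-1 using scratch space on the stack instead of the heap.
int sumOfIndices(size_t count)
{
    // Caveat: alloca does no overflow check. A large `count` overflows the
    // stack, so real code should cap it or fall back to the heap.
    assert(count > 0 && count <= 1024, "keep stack allocations small");

    // The memory is released automatically when this function returns,
    // so the slice must never escape.
    int* raw = cast(int*) alloca(count * int.sizeof);
    int[] myarray = raw[0 .. count]; // slice gives bounds-checked access

    foreach (i, ref v; myarray)
        v = cast(int) i;

    int sum = 0;
    foreach (v; myarray)
        sum += v;
    return sum;
}
```

A fixed-size `int[N]` buffer with a heap fallback for large counts is often the simpler alternative.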
Re: Segfault when adding a static destructor in druntime/src/rt/sections_elf_shared.d
On Tuesday, 8 January 2019 at 12:54:11 UTC, RazvanN wrote: Hi all, I am working on issue 14650 [1] Great! (I am _extremely_ surprised that dtors are not called for globals.) and I would like to implement a solution where static destructors are destroying global variables. However, I have the following problem in druntime/src/rt/sections_elf_shared: struct ThreadDSO { DSO* _pdso; static if (_pdso.sizeof == 8) uint _refCnt, _addCnt; else static if (_pdso.sizeof == 4) ushort _refCnt, _addCnt; else static assert(0, "unimplemented"); void[] _tlsRange; alias _pdso this; // update the _tlsRange for the executing thread void updateTLSRange() nothrow @nogc { _tlsRange = _pdso.tlsRange(); } } Array!(ThreadDSO) _loadedDSOs; For this code, I would have to create the following static destructor: static ~this() { _loadedDSOs.__dtor(); } Because Array defines a destructor which sets its length to 0. However this code leads to segfault when compiling any program with the runtime (betterC code compiles successfully). In my attempt to debug it, I dropped my patch and added the above mentioned static destructor manually in druntime which led to the same effect. Interestingly, _loadedDSOs.__dtor runs successfully, the segmentation fault comes from somewhere higher in the call path (outside of the _d_run_main function (in rt/dmain2.d)). I'm thinking that the static destructor somehow screws up the object which is later referenced after the main program finished executing. Does someone well versed in druntime have any ideas what's happening? This is my guess: Have a look at `_d_dso_registry` and its description. The function is also called upon shutdown and it accesses `_loadedDSOs`. As part of shutdown, `_d_dso_registry` calls `runModuleDestructors` (which will run your compiler-inserted static dtor), but _after_ that `_d_dso_registry` accesses `_loadedDSOs`. I don't know exactly why the segfault happens, but the code assumes in several places that `_loadedDSOs` is non-empty. For example, `popBack` is called and `popBack` is invalid for length=0 (it will set the length to `size_t.max` !). I think the solution is to not have `_loadedDSOs` be of type `Array!T` but of a special type that explicitly has no dtor (i.e. the "dtor" should only be called explicitly such that the data needed for shutdown survives `runModuleDestructors`). This probably applies to more of these druntime low-level arrays and other data structures. -Johan [1] The dtor of Array calls reset, and reset has a bug in rt.util.container.Array. Note the invariant: `assert(!_ptr == !_length);`, which triggers when `_length` is set to 0, but `_ptr` is not set to `null`. !!!
Re: Bitwise rotate of integral
On Monday, 7 January 2019 at 14:39:07 UTC, Per Nordlöw wrote: What's the preferred way of doing bitwise rotate of an integral value in D? Are there intrinsics for bitwise rotation available in LDC? LDC does not expose this intrinsic currently, but you can use LLVM's fshl: https://llvm.org/docs/LangRef.html#llvm-fshl-intrinsic It's a fairly new intrinsic, so won't work with older LLVM versions. https://d.godbolt.org/z/SBnJFJ As noted by others, the optimizer is strong enough to recognize what you are doing (without intrinsic) and use ror/rol if it deems it advantageous to do so. -Johan
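Independent of compiler intrinsics, druntime's `core.bitop` has `rol` and `ror` for unsigned integral types, and the classic shift-and-or form is what the optimizer pattern-matches. A short sketch; `rotl32` is a name made up here.

```d
import core.bitop : rol, ror;

// Plain shift-and-or rotate. `n` must be in 1 .. 31 so that neither
// shift is by the full width of the type.
uint rotl32(uint x, uint n)
{
    return (x << n) | (x >> (32 - n));
}

// Same operation via druntime.
uint rotl32Lib(uint x, uint n)
{
    return rol(x, n);
}

uint rotr32Lib(uint x, uint n)
{
    return ror(x, n);
}
```

Checking the generated assembly on d.godbolt.org is the easiest way to see whether a given compiler turns either form into a single rotate instruction.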
Re: cas and interfaces
On Thursday, 27 December 2018 at 12:07:48 UTC, Rene Zwanenburg wrote: On Tuesday, 25 December 2018 at 22:07:07 UTC, Johannes Loher wrote: Thanks a lot for the info, that clarifies things a bit. But it still leaves the question, why it works correctly when inheriting from an abstract class instead of implementing an interface... Any idea about why that? Unlike interfaces, base class references don't need adjustment. Yeah. You shouldn't need to know these details but if you are interested, the details are here: https://dlang.org/spec/abi.html#classes (it's meant for tech reference, not for explanation. If you need more explanation, go search for vtables, multiple inheritance, etc.). -Johan
Re: Mysteries of the Underscore
On Monday, 24 December 2018 at 11:18:44 UTC, Ron Tarrant wrote: I found a mention that in the definition of a delegate, a function parameter and its type could be replaced by an underscore: myTestRig.addOnDestroy(delegate void(Widget w) { quitApp(); } ); became: myTestRig.addOnDestroy(delegate void(_) { quitApp(); } ); I was trying to find some further documentation on this, but I'm coming up empty. Questions: 1) What is this called (substituting an underscore in this manner)? 2) Where can I learn more about it? The underscore is just an identifier, nothing special; it could be any valid identifier like "ldkhfksdkdsg". ``` void ggg(); void takedelegate(void delegate(int) dlg); void foo() { takedelegate( delegate void(asdadasdeg) { ggg(); } ); } ``` The type of the argument is deduced from the function the delegate is passed to. -Johan
Re: cas and interfaces
On Sunday, 23 December 2018 at 14:07:04 UTC, Johannes Loher wrote: I recently played around with atomic operations. While doing so, I noticed a problem with the interaction of interfaces and cas. Consider the following program:
```
import core.atomic;
import std.stdio;

interface TestInterface { }

class TestClass : TestInterface { }

void main() {
    shared TestInterface testInterface = new shared TestClass;
    cas(&testInterface, testInterface, new shared TestClass).writeln;
    writeln(typeid(testInterface));
}
```
(https://run.dlang.io/is/9P7PAb) The types of the 2nd and 3rd arguments of `cas` do not have to be the same, and aren't in your case. I think what's happening is that you are overwriting `testInterface` with a pointer to a TestClass which is not a valid TestInterface pointer. And then the program does something invalid because, well, you enter UB land. Fixed by:
```
cas(&testInterface, testInterface, cast(shared(TestInterface)) new shared TestClass).writeln;
```
Note the cast! Whether this is a bug in `cas` or not, I don't know. The `cas` template checks whether the 3rd argument can be assigned to the 1st argument (`*here = writeThis;`), which indeed compiles _with_ an automatic conversion. But then the implementation of `cas` does not do any automatic conversions (it casts to `void*`). Hence the problem you are seeing. -Johan
Re: LDC2 win64 calling convention
On Thursday, 29 November 2018 at 15:10:41 UTC, realhet wrote: In conclusion: Maybe LDC2 generates a lot of extra code, but I always make longer asm routines, so it's not a problem for me at all while it helps me a lot. An extra note: I recommend you look into using `ldc.llvmasm.__asm` to write inline assembly. Some advantages: no worrying about calling conventions (portability) and you'll have more instructions available. If you care about performance, usually you should _not_ write assembly, but for the 1% of other cases: the compiler also understands your asm much better if you use __asm. LDC's __asm syntax is very similar (if not the same) to what GDC uses for inline assembly. -Johan
Re: Why does nobody seem to think that `null` is a serious problem in D?
On Wednesday, 21 November 2018 at 03:05:07 UTC, Neia Neutuladh wrote: Virtual function calls have to do a dereference to figure out which potentially overrided function to call. "have to do a dereference" in terms of "dereference" as language semantic: yes. "have to do a dereference" in terms of "dereference" as reading from memory: no. If you have proof of the runtime type of an object, then you can use that information to have the CPU call the overrided function directly without reading from memory. -Johan
Re: Why does nobody seem to think that `null` is a serious problem in D?
On Wednesday, 21 November 2018 at 07:47:14 UTC, Jonathan M Davis wrote: IMHO, requiring something in the spec like "it must segfault when dereferencing null" as has been suggested before is probably not a good idea and is really getting too specific (especially considering that some folks have argued that not all architectures segfault like x86 does), but ultimately, the question needs to be discussed with Walter. I did briefly discuss it with him at this last dconf, but I don't recall exactly what he had to say about the ldc optimization stuff. I _think_ that he was hoping that there was a way to tell the optimizer to just not do that kind of optimization, but I don't remember for sure. The issue is not specific to LDC at all. DMD also does optimizations that assume that dereferencing [*] null is UB. The example I gave is dead-code-elimination of a dead read of a member variable inside a class method, which can only be done either if the spec says that `a.foo()` is UB when `a` is null, or if `this.a` is UB when `this` is null. [*] I notice you also use "dereference" for an execution machine [**] reading from a memory address, instead of the language doing a dereference (which may not necessarily mean a read from memory). [**] intentional weird name for the CPU? Yes. We also have D code running as webassembly... -Johan
Re: Why does nobody seem to think that `null` is a serious problem in D?
On Wednesday, 21 November 2018 at 09:31:41 UTC, Patrick Schluter wrote: On Tuesday, 20 November 2018 at 23:14:27 UTC, Johan Engelen wrote: On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven Schveighoffer wrote: On 11/20/18 1:04 PM, Johan Engelen wrote: D does not make dereferencing on class objects explicit, which makes it harder to see where the dereference is happening. Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at. But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter. It matters a lot. A virtual function is a pointer that is in the instance, so there is a derefernce of the this pointer to get the address of the function. For a final function, the address of the function is known at compile time and no dereferencing is necessary. That is a thing that a lot of people do not get, a member function and a plain function are basically the same thing. What distinguishes them, is their mangled name. You can call a non virtual member function from an assembly source if you know the symbol name. UFCS uses this fact, that member function and plain function are indistinguishable in a object code point of view, to fake member functions. This and the rest of your email is exactly the kind of thinking that I oppose where language semantics and compiler implementation are being mixed. I don't think it's possible to write an optimizing compiler where that way of reasoning works. So D doesn't do that, and we have to treat language semantics separate from implementation details. 
(virtual functions don't have to be implemented using vtables, local variables don't have to be on a stack, "a+b" does not need to result in a CPU add instruction, "foo()" does not need to result in a CPU procedure call instruction, etc, etc, etc. D is not a portable assembly language.) -Johan
Re: Why does nobody seem to think that `null` is a serious problem in D?
On Tuesday, 20 November 2018 at 19:11:46 UTC, Steven Schveighoffer wrote: On 11/20/18 1:04 PM, Johan Engelen wrote: D does not make dereferencing on class objects explicit, which makes it harder to see where the dereference is happening. Again, the terms are confusing. You just said the dereference happens at a.foo(), right? I would consider the dereference to happen when the object's data is used. i.e. when you read or write what the pointer points at. But `a.foo()` is already using the object's data: it is accessing a function of the object and calling it. Whether it is a virtual function, or a final function, that shouldn't matter. There are different ways of implementing class function calls, but here often people seem to pin things down to one specific way. I feel I stand alone in the D community in treating the language in this abstract sense (like C and C++ do, other languages I don't know). It's similar to that people think that local variables and the function return address are put on a stack; even though that is just an implementation detail that is free to be changed (and does often change: local variables are regularly _not_ stored on the stack [*]). Optimization isn't allowed to change behavior of a program, yet already simple dead-code-elimination would when null dereference is not treated as UB or when it is not guarded by a null check. Here is an example of code that also does what you call a "dereference" (read object data member):
```
class A {
    int i;
    final void foo() {
        int a = i; // no crash with -O
    }
}

void main() {
    A a;
    a.foo(); // dereference happens
}
```
When you don't call `a.foo()` a dereference, you basically say that `this` is allowed to be `null` inside a class member function. (and then it'd have to be normal to do `if (this) ...` inside class member functions...) These discussions are hard to do on a mailinglist, so I'll stop here. Until next time at DConf, I suppose...
;-) -Johan [*] intentionally didn't say where those local variables _are_ stored, so that people can solve that little puzzle for themselves ;-)
Re: Why does nobody seem to think that `null` is a serious problem in D?
On Tuesday, 20 November 2018 at 03:38:14 UTC, Jonathan M Davis wrote: For @safe to function properly, dereferencing null _must_ be guaranteed to be memory safe, and for dmd it is, since it will always segfault. Unfortunately, as I understand it, it is currently possible with ldc's optimizer to run into trouble, since it'll do things like see that something must be null and therefore assume that it must never be dereferenced, since it would clearly be wrong to dereference it. And then when the code hits a point where it _does_ try to dereference it, you get undefined behavior. It's something that needs to be fixed in ldc, but based on discussions I had with Johan at dconf this year about the issue, I suspect that the spec is going to have to be updated to be very clear on how dereferencing null has to be handled before the ldc guys do anything about it. As long as the optimizer doesn't get involved everything is fine, but as great as optimizers can be at making code faster, they aren't really written with stuff like @safe in mind. One big problem is the way people talk and write about this issue. There is a difference between "dereferencing" in the language, and reading from a memory address by the CPU. Confusing language semantics with what the CPU is doing happens often in the D community and is not helping these debates. D is proclaiming that dereferencing `null` must segfault but that is not implemented by any of the compilers. It would require inserting null checks upon every dereference. (This may not be as slow as you may think, but it would probably not make code run faster.) An example:
```
class A {
    int i;
    final void foo() {
        import std.stdio;
        writeln(__LINE__);
        // i = 5;
    }
}

void main() {
    A a;
    a.foo();
}
```
In this case, the actual null dereference happens on the last line of main. The program runs fine however since dlang 2.077. Now when `foo` is modified such that it writes to member field `i`, the program does segfault (writes to address 0).
D does not make dereferencing on class objects explicit, which makes it harder to see where the dereference is happening. So, I think all compiler implementations are not spec compliant on this point. I think most people believe that compliance is too costly for the kind of software one wants to write in D; the issue is similar to array bounds checking that people explicitly disable or work around. For compliance we would need to change the compiler to emit null checks on all @safe dereferences (the opposite direction was chosen in 2.077). It'd be interesting to do the experiment. -Johan
Re: @safe - why does this compile?
On Friday, 13 July 2018 at 14:51:17 UTC, ketmar wrote: yeah. in simple words: safe code is *predictable*, but not "segfault-less". segfaults (null dereferences) in safe code are allowed, 'cause they have completely predictable behavior (instant program termination). @safe doesn't free you from doing your null checks, it protects you from so-called "undefined behavior" (aka "unpredictable execution results"). so when we are talking about "memory safety", it doesn't mean that your code cannot segfault, it means that your code won't corrupt random memory due to misbehaving. This is not true when using LDC (and I'd expect the same for GDC). With LDC, dereferencing `null` is undefined behavior regardless of whether you are in an @safe context or not. - Johan
Re: GDC on Travis-CI
On Saturday, 2 June 2018 at 10:49:30 UTC, rjframe wrote: There is documentation for older Phobos versions online, but I don't remember the link and haven't found it by searching. https://docarchives.dlang.io/
Re: How are switches optimized
On Friday, 1 June 2018 at 21:18:25 UTC, IntegratedDimensions wrote: What is the best optimizations that a compiler does to switches in general and in the D compilers? The best possible depends a lot on the specific case at hand. Best possible is to fully elide the switch, which does happen. You can use d.godbolt.org to investigate what happens for different pieces of code. Sometimes a jumptable is used, sometimes an if-chain. LLVM's (LDC) and GCC's (GDC) optimizers are strong and the optimized code will often do extra calculations before indexing in the table or before doing the comparisons in the if-chain. Different compilers will make different optimizations. https://godbolt.org/g/pHptff https://godbolt.org/g/AwZ69o A switch can be seen as if statements, or safer, nested if elses. but surely the cost per case does not grow with depth in the switch? If one has a switch of N case then the last cost surely does not cost N times the cost of the first, approximately? Depends on the code, but it's not O(N). This is the cost when implementing a switch as nested ifs. Not true. Nested if's are optimized as well. Sometimes switch is faster, sometimes if-chain, sometimes it's the same. Tables can be used to give O(1) cost, are these used in D's compilers? Yes (LDC and GDC). How are they generally implemented? Hash tables? If the switch is on an enum of small values is it optimized for a simple calculating offset table? Table stored in the instruction stream. Simple offset table with calculation on the index value (I guess you could say it is a simplified hash table). Note that with performance, the rest of the program and the execution flow also matters... cheers, Johan
Re: ldc2 and dmd
On Tuesday, 22 May 2018 at 16:17:48 UTC, Russel Winder wrote: Hi, I have a shared object (of DInotify) compiled with ldc2. I have a program (me-tv) which seems to work when compiled with ldc2. If I compile the program (me-tv) with dmd then it throws a SIGSEGV seemingly in _D3std4file15DirIteratorImpl5frontMFNdNfZSQBoQBn8DirEntry in DInotify. Is this what I should expect? LDC and DMD are ABI incompatible. Also, different compiler versions of the same vendor are ABI incompatible (sometimes they are compatible). Things may or may not work when different pieces of the program are compiled with a different compiler (which is what you are doing here). - Johan
Re: Extra .tupleof field in structs with disabled postblit blocks non-GC-allocation trait
On Thursday, 10 May 2018 at 19:14:39 UTC, Meta wrote: So it looks like disabling a struct's postblit actually counts as having a __postblit and __xpostblit function (don't ask me why), in addition to a construction and opAssign... no idea why, and maybe this is a bug, but I bet there's a good reason for it. https://issues.dlang.org/show_bug.cgi?id=18628 -Johan
Re: LDC phobos2-ldc.lib(json.obj) : fatal error LNK1112: module machine type 'x64' conflicts with target machine type 'x86'
On Thursday, 3 May 2018 at 23:47:40 UTC, IntegratedDimensions wrote: trying to compile a simple program in x86. Compiles fine in dmd and ldcx64. Seems like ldc is using the wrong lib for some reason? phobos2-ldc.lib(json.obj) : fatal error LNK1112: module machine type 'x64' conflicts with target machine type 'x86' To help, we need more details, such as what code are you compiling, what version of LDC, and what commandline. - Johan
Re: Link-time optimisation (LTO)
On Friday, 30 March 2018 at 10:23:15 UTC, Cecil Ward wrote: Say that I use say GDC or LDC. I want to declare a routine as public in one compilation unit (.d src file) and be able to access it from other compilation units. Do I simply declare the routine with the word keyword public before the usual declaration? Or maybe that is the default, like not using the keyword static with function declarations in C? Global functions in a module have default "public" visibility indeed. https://dlang.org/spec/attribute.html#visibility_attributes My principal question: If I successfully do this, with GCC or LDC, will I be able to get the code for the externally defined short routine expanded inline and fully integrated into the generated code that corresponds to the calling source code? (So no ‘call’ instruction is even found.) What you want is "cross-module inlining". As far as I know, DMD and GDC will do cross-module inlining (if inlining is profitable). LDC does not, unless `-enable-cross-module-inlining` is enabled (which is aggressive and may result in linking errors in specific cases, https://github.com/ldc-developers/ldc/pull/1737). LDC with LTO enabled will definitely give you cross-module inlining (also for private functions). You can force inlining with `pragma(inline, true)`, but it is usually better to leave that decision up to the compiler. https://dlang.org/spec/pragma.html#inline -Johan
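As a small sketch of the `pragma(inline, true)` option mentioned above: the pragma is part of the language, while `square` and `sumOfSquares` are hypothetical names for illustration. In a real project `square` would live in a different module than its caller; both are kept in one file here so the example stays self-contained.

```d
// Ask the compiler to always inline this function at its call sites.
pragma(inline, true)
int square(int x)
{
    return x * x;
}

/// Sums the squares of all values; calls to `square` should be expanded inline.
int sumOfSquares(const(int)[] values)
{
    int total;
    foreach (v; values)
        total += square(v);
    return total;
}

void main()
{
    assert(sumOfSquares([1, 2, 3]) == 14);
}
```

As said above, it is usually better to leave the decision to the compiler and reach for the pragma only after measuring.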
Re: Building application with LDC and -flto=thin fails in link stage
On Wednesday, 28 March 2018 at 16:42:23 UTC, Johan Engelen wrote: On Tuesday, 27 March 2018 at 22:10:33 UTC, Per Nordlöw wrote: On Tuesday, 27 March 2018 at 22:00:42 UTC, Johan Engelen wrote: Indeed. Please try to manually link first (without dub) by modifying the command on which dub errors: ``` ldmd2 -flto=thin -of.dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery .dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery.o ../../.dub/packages/gmp-d-master/gmp-d/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-B287F67CE5FF6145BC229790CFB09607/libgmp-d.a phobos-next/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F0F2FDB01B8401C04D657BCC145D46A5/libknet_phobos-next.a -L--no-as-needed -L-lzstd -L-lgmp -L-lc -L-lreadline -L-lz -L-lbz2 ``` -Johan Yes, that works! I'm no dub expert and I don't know how to pass flags to the _compiler_ for the link step. Would be good to figure that out with the dub folks. For a stopgap solution: I think what you are doing is passing `--compiler=/ldmd2` (note: LDMD) to dub when building, and you get separate compilation+linking. If you'd use `--compiler=/ldc2` (note: LDC), then you would not get separate compilation+linking [1] and things work with just setting `dflags`. I've asked about the problem here: https://github.com/dlang/dub/issues/1431 cheers, Johan [1] https://github.com/dlang/dub/issues/809
Re: Building application with LDC and -flto=thin fails in link stage
On Thursday, 29 March 2018 at 08:44:21 UTC, Jacob Carlborg wrote: Please read the reply :), although it could be a bit more clear. I'll spell it out for you. Both `dflags` and `lflags` are being used already. With separate compilation and linking, there seems to be no way to pass flags to the compiler during the linking step: dflags is not used and lflags is prefixed with `-L`. -Johan
Re: Building application with LDC and -flto=thin fails in link stage
On Wednesday, 28 March 2018 at 17:03:07 UTC, Seb wrote: dub supports dflags and lflags in the config file. lflags are the linker commands. Please read the thread. `lflags` is for passing flags to the _linker_ (i.e. those flags are prefixed with -L when passed to the _compiler_) Here, what's needed is passing flags to the _compiler_ when it is invoked to perform the link step in the build. -Johan
Re: Building application with LDC and -flto=thin fails in link stage
On Tuesday, 27 March 2018 at 22:10:33 UTC, Per Nordlöw wrote: On Tuesday, 27 March 2018 at 22:00:42 UTC, Johan Engelen wrote: Indeed. Please try to manually link first (without dub) by modifying the command on which dub errors: ``` ldmd2 -flto=thin -of.dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery .dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery.o ../../.dub/packages/gmp-d-master/gmp-d/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-B287F67CE5FF6145BC229790CFB09607/libgmp-d.a phobos-next/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F0F2FDB01B8401C04D657BCC145D46A5/libknet_phobos-next.a -L--no-as-needed -L-lzstd -L-lgmp -L-lc -L-lreadline -L-lz -L-lbz2 ``` -Johan Yes, that works! Good :-) I'm no dub expert and I don't know how to pass flags to the _compiler_ for the link step. Would be good to figure that out with the dub folks. There will be similar problems with using ASan (and fuzzer aswell): `-fsanitize=address` must also be passed to the D compiler (not the linker) during linking such that the correct asan library is linked into the executable. - Johan
Re: Building application with LDC and -flto=thin fails in link stage
On Tuesday, 27 March 2018 at 13:28:08 UTC, kinke wrote: On Monday, 26 March 2018 at 23:32:59 UTC, Nordlöw wrote: forwarded as `-L-flto=thin` but still errors as Which is wrong, it's not a ld command-line option (i.e., the `-L` prefix is wrong). Indeed. Please try to manually link first (without dub) by modifying the command on which dub errors: ``` ldmd2 -flto=thin -of.dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery .dub/build/application-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F2904BE3C4DA237C077E5C2B0B23442E/knetquery.o ../../.dub/packages/gmp-d-master/gmp-d/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-B287F67CE5FF6145BC229790CFB09607/libgmp-d.a phobos-next/.dub/build/library-release-nobounds-lto-linux.posix-x86_64-ldc_2078-F0F2FDB01B8401C04D657BCC145D46A5/libknet_phobos-next.a -L--no-as-needed -L-lzstd -L-lgmp -L-lc -L-lreadline -L-lz -L-lbz2 ``` -Johan
Re: Building application with LDC and -flto=thin fails in link stage
On Monday, 26 March 2018 at 22:07:49 UTC, Nordlöw wrote: When I try build my application using LDC and -flto=thin it fails in the final linking You must also pass `-flto=thin` during linking (a special plugin is needed for LTO, and LDC will only pass the plugin to the linker when `-flto=` is specified). I couldn't see `-flto=thin` in your link command, so I suspect that will fix it. - Johan
Re: Does the compiler inline the predicate functions to std.algorithm.sort?
On Monday, 19 March 2018 at 12:45:58 UTC, tipdbmp wrote: the LLVM IR obtained with -output-ll might be easier to read than assembly.) I only seem to get assembly on d.godbolt.org, even with the -output-ll option. On d.godbolt.org, you can get LLVM IR with a trick: use `-output-s=false -output-ll`. -Johan
Re: LDC / BetterC / _d_run_main
On Saturday, 10 March 2018 at 07:54:33 UTC, Mike Franklin wrote: On Saturday, 10 March 2018 at 02:25:38 UTC, Richard wrote: Hi, I've been trying to see if I can get an mbed project to work with Dlang basically compiling D code for use on a Cortex-M Proccessor You might be interested in the following, if you're not already aware: * https://github.com/JinShil/stm32f42_discovery_demo * https://bitbucket.org/timosi/minlibd There is also: https://github.com/kubo39/stm32f407discovery and its submodules. The STM32 demo only supports GDC right now, but I'll be updating it to support LDC when 2.079.0 lands there. Awesome. -Johan
Re: Speed of math function atan: comparison D and C++
On Monday, 5 March 2018 at 06:01:27 UTC, J-S Caux wrote: On Monday, 5 March 2018 at 05:40:09 UTC, rikki cattermole wrote: On 05/03/2018 6:35 PM, J-S Caux wrote: I'm considering shifting a large existing C++ codebase into D (it's a scientific code making much use of functions like atan, log etc). I've compared the raw speed of atan between C++ (Apple LLVM version 7.3.0 (clang-703.0.29)) and D (dmd v2.079.0, also ldc2 1.7.0) by doing long loops of such functions. I can't get the D to run faster than about half the speed of C++. double x = 0.0; for (int a = 0; a < 10; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a))); for C++ and double x = 0.0; for (int a = 0; a < 1_000_000_000; ++a) x += atan(1.0/(1.0 + sqrt(1.0 + a))); for D. C++ exec takes 40 seconds, D exec takes 68 seconds. The performance problem with this code is that LDC does not yet do cross-module inlining by default. GDC does. If you pass `-enable-cross-module-inlining` to LDC, things should be faster. In particular, std.math.sqrt is not inlined although it is profitable to do so (it becomes one machine instruction). Things become worse when using core.stdc.math.sqrt: because no implementation source is available, no inlining is possible. Another problem is that std.math.atan(double) just calls std.math.atan(real). Calculations are more expensive on platforms where real is 80 bits (i.e. x86), and that's not solvable with a compile flag. What it takes is someone to write the double and float versions of atan (and other math functions), but it requires someone with the right knowledge to do it. Your tests (and reporting about them) are much appreciated. Please do file bug reports for these things. Perhaps you can take a stab at implementing double-versions of the functions you need? cheers, Johan
Re: how to install latest lcd2 for armbain
On Friday, 2 March 2018 at 08:25:51 UTC, dangbinghoo wrote: So, does anyone know how to install latest ldc2 from arm-debain? You can use the dlang install.sh script to download LDC and put it in your home dir: https://dlang.org/install.html -Johan
Re: Disk space used and free size of a Network share folder in Windows
On Wednesday, 14 February 2018 at 12:22:09 UTC, Vino wrote: Hi All, Request your help on how to get the disk space used and free size of a Network share folder in Windows, tried with getSize but it return 0; See: https://github.com/ldc-developers/ldc/blob/f5b05878de6df2ea4a77c37128ad2eae0266b690/driver/cache_pruning.d#L47-L71 and https://issues.dlang.org/show_bug.cgi?id=16487 cheers, Johan
Re: D generates large assembly for simple function
On 01/27/2018 11:42 AM, Matt wrote: Godbolt link: https://godbolt.org/g/t5S976 Careful with these comparisons guys. Know what you are looking at. Rust does not eliminate setting the framepointer register, and so it looks "bad" [1]. Clang also sets the framepointer for macOS ABI regardless of optimization level. https://godbolt.org/g/eeo81n [1] See https://github.com/rust-lang/rust/pull/47152 -Johan
Re: D generates large assembly for simple function
On Saturday, 27 January 2018 at 19:45:35 UTC, Stefan Koch wrote: ah ... -betterC is only for dmd. `-betterC` works from LDC 1.1.0. - Johan
Re: __gshared as part of alias
On Wednesday, 17 January 2018 at 22:56:09 UTC, kinke wrote: On Wednesday, 17 January 2018 at 22:01:57 UTC, Johan Engelen wrote: ``` struct GSharedVariable(AddrSpace as, T) { static __gshared T val; alias val this; } alias Global(T) = GSharedVariable!(AddrSpace.Global, T); Global!float bar1; // __gshared ``` Only 1 value per T though. ;) Ah, haha indeed, I meant without the "static", but __gshared is always "static". ``` alias Global(T) = shared Variable!(AddrSpace.Global, T); ``` seems to work. https://godbolt.org/g/iDbRX7 -Johan
Re: __gshared as part of alias
On Friday, 12 January 2018 at 04:25:25 UTC, Nicholas Wilson wrote: Is there a way to make __gshared part of an alias? Hi Nick, how about this? ``` struct GSharedVariable(AddrSpace as, T) { static __gshared T val; alias val this; } alias Global(T) = GSharedVariable!(AddrSpace.Global, T); Global!float bar1; // __gshared ``` -Johan
Re: Efficient way to pass struct as parameter
On Tuesday, 2 January 2018 at 18:21:13 UTC, Tim Hsu wrote: I am creating Vector3 structure. I use struct to avoid GC. However, struct will be copied when passed as parameter to function struct Ray { Vector3f origin; Vector3f dir; @nogc @system this(Vector3f *origin, Vector3f *dir) { this.origin = *origin; this.dir = *dir; } } How can I pass struct more efficiently? Pass the Vector3f by value. There is not one best solution here: it depends on what you are doing with the struct, and how large the struct is. It depends on whether the function will be inlined. It depends on the CPU. And probably 10 other things. Vector3f is a small struct (I'm guessing it's 3 floats?), pass it by value and it will be passed in registers. This "copy" costs nothing on x86, the CPU will have to load the floats from memory and store them in a register anyway, before it can write it to the target Vector3f, regardless of how you pass the Vector3f. You can play with some code here: https://godbolt.org/g/w56jmA Passing by pointer (ref is the same) has large downsides and is certainly not always fastest. For small structs and if copying is not semantically wrong, just pass by value. More important: measure what bottlenecks your program has and optimize there. - Johan
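A sketch of the by-value version of the constructor from the question. It assumes `Vector3f` is simply three floats, which is a guess (as noted above); the field names `x`, `y`, `z` are made up for the example.

```d
// Assumption: Vector3f is a plain struct of three floats.
struct Vector3f
{
    float x, y, z;
}

struct Ray
{
    Vector3f origin;
    Vector3f dir;

    // Take the small structs by value instead of by pointer:
    // no GC allocation, no null pointers to worry about, and @safe.
    this(Vector3f origin, Vector3f dir) @nogc nothrow pure @safe
    {
        this.origin = origin;
        this.dir = dir;
    }
}

void main() @safe
{
    auto ray = Ray(Vector3f(0, 0, 0), Vector3f(1, 0, 0));
    assert(ray.origin.x == 0);
    assert(ray.dir.x == 1);
}
```

Whether this is faster than passing pointers in a particular program is something only a measurement can tell.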
Re: Finding unsafe line of code
On Friday, 29 December 2017 at 10:23:24 UTC, codephantom wrote: On Friday, 29 December 2017 at 09:38:50 UTC, Vino wrote: Let me re-frame the question with an example: under D's @safe rules, the line of code below is considered unsafe (pointer arithmetic), ... int[10] a; int* p = &a[0]; for (size_t i=0; i <= 10; i++) p[i] = ...; From, Vino.B Is this what you're looking for? https://dlang.org/spec/function.html#safe-functions Just annotate your functions with @safe (as @system is the default). Or if that's not possible, you can add runtime checks with ASan: http://johanengelen.github.io/ldc/2017/12/25/LDC-and-AddressSanitizer.html -Johan
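To illustrate the `@safe` suggestion, here is a minimal sketch that replaces the pointer indexing from the question with a slice. The function name `fillSafe` is hypothetical. Indexing through a raw pointer is rejected by the compiler in `@safe` code, while slice indexing is allowed and bounds-checked (unless bounds checks are disabled).

```d
/// Writes each element's index into the slice; no pointers involved.
void fillSafe(int[] values) @safe
{
    foreach (i, ref element; values)
        element = cast(int) i;
}

void main() @safe
{
    auto a = new int[10];
    fillSafe(a);
    assert(a[0] == 0);
    assert(a[9] == 9);

    // The pointer version from the question does not compile in @safe code:
    //   int* p = &a[0];
    //   p[10] = 1;   // error: indexing a pointer is not allowed in @safe
}
```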
Re: Does LDC support profiling at all?
On Sunday, 24 December 2017 at 02:48:32 UTC, Chris Katko wrote: It would probably be really helpful to get a clear Wiki guide for this information LDC. I'll write it myself if necessary once I try your recommendations and test them out. This would help us out a lot, thanks. -Johan
Re: Does LDC support profiling at all?
On Friday, 22 December 2017 at 09:52:26 UTC, Chris Katko wrote: DMD can use -profile and -profile=gc. But I tried for HOURS to find the equivalent for LDC and came up with only profile-guided optimization--which I don't believe I want. Yet, if we can get PGO... where's the PROFILE itself it's using to make those decisions! :) Fine grained PGO profiling: -fprofile-instr-generate http://johanengelen.github.io/ldc/2016/07/15/Profile-Guided-Optimization-with-LDC.html Function entry/exit profiling: -finstrument-functions https://github.com/ldc-developers/ldc/issues/1839 https://www.youtube.com/watch?v=LNav5qvyK7I I suspect it is not too much effort to add DMD's -profile and -profile=gc to LDC, but noone has done it yet. Another thing that is relatively easy to add to LDC: https://llvm.org/docs/XRay.html -Johan
Re: Is there a way to get a function name within a function?
On Wednesday, 20 December 2017 at 23:28:46 UTC, jicman wrote: Greetings! Imagine, //start int getMe(int i) { writefln(__LINE__); writefln(__FUNCTION_NAME__); So close! Use "__FUNCTION__" or "__PRETTY_FUNCTION__". https://dlang.org/spec/traits.html#specialkeywords -Johan
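A short sketch of both special keywords. The exact strings depend on the module name (and so on the file name), which is why the checks below only look at the end of the name. The function names are made up for the example.

```d
import std.algorithm.searching : canFind, endsWith;

/// Returns the fully qualified name of this function, e.g. "mymodule.whoAmI".
string whoAmI()
{
    return __FUNCTION__;
}

/// Like whoAmI, but the string also includes the return type and parameters.
string whoAmIPretty()
{
    return __PRETTY_FUNCTION__;
}

/// As a default argument, __FUNCTION__ is evaluated at the call site,
/// so this returns the name of the *calling* function.
string callerName(string caller = __FUNCTION__)
{
    return caller;
}

void main()
{
    assert(whoAmI().endsWith(".whoAmI"));
    assert(whoAmIPretty().canFind("whoAmI"));
    assert(callerName().endsWith(".main"));
}
```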
Re: std.range.interfaces : InputRange moveFront
On Friday, 1 December 2017 at 18:55:53 UTC, Steven Schveighoffer wrote: Once you popFront a byLine range, the element that was at front is now possibly invalid (the buffer may be reused). So in order to return the line from popFront, you have to store it somewhere. This means allocating another buffer to hold the line you just returned. So the costs of doing this aren't just that you might do work and just throw it away, it's that you have this extra caching problem you didn't have before. Cool, thanks. Can we add points like this to the documentation? (if not, user frustration and forum threads will keep coming about these things... ;-) -Johan
Re: std.range.interfaces : InputRange moveFront
On Friday, 1 December 2017 at 18:33:09 UTC, Ali Çehreli wrote: On 12/01/2017 07:21 AM, Steven Schveighoffer wrote: > On 12/1/17 4:29 AM, Johan Engelen wrote: >> (Also, I would expect "popFront" to return the element popped, but it >> doesn't, OK... > > pop removes the front element, but if getting the front element is > expensive (say if it's a map with a complex lambda function), you don't > want to execute that just so you can return it to someone who doesn't > care. This is why front and popFront are separate. Yet, we're told that compilers are pretty good at eliminating that unused copy especially for function templates where all code is visible. I assume that Steven means "copying the front element" when he wrote "getting the front element"? There is no need for a copy, because the element will be removed from the range, so we can move (whose cost only depends on the size of the element, internal pointers being disallowed by the language). If it is expensive to actually get _to_ the front/back element (i.e. find its memory location), then having to do the operation twice is a disadvantage. Ali: the compiler can only elide copying/moving of an unused return value when inlining the function. (the duty of making the return value move/copy is on the callee, not the caller) Note that because front/back() and popFront/Back() are separate, a copy *is* needed when one wants to "pop an element off". Thus moveFront/Back() and popFront/Back() should be used. OK. The fact that "pop" does something different from other programming languages is something important to remember when teaching people about D. And I think should be made clear in the documentation; let's add an example of how one is supposed to use all this in an efficient manner? Back on topic: let's change the documentation of moveFront such that it is clear that it does _not_ reduce the number of elements in the range? 
So, even though exception safety is not a common topic of D community, the real reason for why popFront() does not return the element is for strong exception safety guarantee. Interesting point. Argh why do we allow the user to throw in move? Regardless, separating front() from popFront() is preferable due to cohesion: fewer responsibilities per function, especially such low level ones. This doesn't make much sense ;-) popFrontN has more responsibility, and one gains better performance than simply calling popFront N times. It's similar here. -Johan
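As a rough sketch of the usage discussed above, a "pop" in the sense of other languages can be written as `moveFront` followed by `popFront`. The helper name `takeFront` is made up for this illustration and is not part of Phobos:

```d
import std.range : empty, front, moveFront, popFront;

/// Hypothetical helper (not in Phobos): moves the front element out of
/// the range, then advances the range by one element.
auto takeFront(R)(ref R r)
{
    assert(!r.empty, "cannot take from an empty range");
    auto e = r.moveFront(); // move (or copy, for trivial types) the element out
    r.popFront();           // only this call actually shortens the range
    return e;
}

void main()
{
    auto a = [1, 2, 3];
    auto x = takeFront(a);
    assert(x == 1);
    assert(a == [2, 3]);
}
```

Note that `moveFront` alone never changes the length of the range; the explicit `popFront` is what removes the element.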
Re: std.range.interfaces : InputRange moveFront
On Friday, 1 December 2017 at 09:11:40 UTC, Johan Engelen wrote: I tested it and it works like you wrote, but the behavior is different for an array of integers...: Hmm, I guess I misread what Ali meant. But the documentation is wrong/very confusing for moveFront: It says "moveFront -- Removes the front element of a range." and "Moves the front of r out and returns it." With "to move _out_", I would expect that the range is advanced/shortened/..., but it is not. (Also, I would expect "popFront" to return the element popped, but it doesn't, OK... So which function name is given to the behavior of "pop" of other languages?) -Johan
Re: std.range.interfaces : InputRange moveFront
On Thursday, 30 November 2017 at 06:36:12 UTC, Ali Çehreli wrote:

import std.range;

struct S {
    int i;
    bool is_a_copy = false;
    this(this) {
        is_a_copy = true;
    }
}

void main() {
    auto r = [S(1)];

    auto a = r.front;
    assert(a.is_a_copy);       // yes, a is a copy
    assert(a.i == 1);          // as expected, 1
    assert(r.front.i == 1);    // front is still 1

    auto b = r.moveFront();
    assert(!b.is_a_copy);      // no, b is not a copy
    assert(b.i == 1);          // state is transferred
    assert(r.front.i == 0);    // front is int.init
}

I tested it and it works like you wrote, but the behavior is different for an array of integers...:

auto a = [ 1,2,3 ];
writeln(a.front);  // 1
auto b = a.moveFront();
writeln(b);        // 1
writeln(a.length); // still 3
writeln(a.front);  // still 1

-Johan
Re: Fast removal of character
On Wednesday, 11 October 2017 at 22:45:14 UTC, Jonathan M Davis wrote: On Wednesday, October 11, 2017 22:22:43 Johan Engelen via Digitalmars-d-learn wrote: std.string.removechars is now deprecated. https://dlang.org/changelog/2.075.0.html#pattern-deprecate What is now the most efficient way to remove characters from a string, if only one type of character needs to be removed?
```
// old
auto old(string s) {
    return s.removechars(",").to!int;
}

// new?
auto newnew(string s) {
    return s.filter!(a => a != ',').to!int;
}
```
Well, in general, I'd guess that the fastest way to remove all instances of a character from a string would be std.array.replace with the replacement being the empty string,

Is that optimized for empty replacement?

but if you're feeding it to std.conv.to rather than really using the resultant string, then filter probably is faster, because it won't allocate. Really though, you'd have to test for your use case and see how fast a given solution is.

Yeah :( I am disappointed to see functions being deprecated without extensive documentation of how to rewrite them for the different usages of the deprecated function. It makes me feel that no deep thought went into removing them (perhaps there was, I can't tell). One has to go and browse through the _release notes_ of the different versions to find any documentation on how to rewrite them. It would have been much better to add it (as well) to the documentation of the deprecated function. I have the same problem for std.string.squeeze. The release notes only say how to rewrite the `squeeze()` case, but not the `squeeze("_")` use case. I guess `uniq!("a=='_' && a == b")` ? Great improvement?

- Johan
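For comparison, the two approaches mentioned in this exchange can be put side by side. This is only a sketch; the function names are invented here, and which variant is faster would still have to be measured for the actual use case:

```d
import std.algorithm : filter;
import std.array : replace;
import std.conv : to;

// Eager variant: builds a new string without commas, then parses it.
int parseViaReplace(string s)
{
    return s.replace(",", "").to!int;
}

// Lazy variant: parses directly from the filtered range,
// without building an intermediate string.
int parseViaFilter(string s)
{
    return s.filter!(a => a != ',').to!int;
}

void main()
{
    assert(parseViaReplace("1,234,567") == 1_234_567);
    assert(parseViaFilter("1,234,567") == 1_234_567);
}
```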
Fast removal of character
std.string.removechars is now deprecated. https://dlang.org/changelog/2.075.0.html#pattern-deprecate

What is now the most efficient way to remove characters from a string, if only one type of character needs to be removed?
```
// old
auto old(string s) {
    return s.removechars(",").to!int;
}

// new?
auto newnew(string s) {
    return s.filter!(a => a != ',').to!int;
}
```
cheers, Johan
Re: Is there further documentation of core.atomic.MemoryOrder?
On Wednesday, 13 September 2017 at 14:40:55 UTC, Nathan S. wrote: Is there a formal description of "hoist-load", "hoist-store", "sink-load", and "sink-store" as used in core.atomic.MemoryOrder (https://dlang.org/library/core/atomic/memory_order.html)?

You can read this: https://llvm.org/docs/Atomics.html#atomic-orderings

And use this:
```
static if (ms == MemoryOrder.acq) {
    enum _ordering = AtomicOrdering.Acquire;
} else static if (ms == MemoryOrder.rel) {
    enum _ordering = AtomicOrdering.Release;
} else static if (ms == MemoryOrder.seq) {
    enum _ordering = AtomicOrdering.SequentiallyConsistent;
} else static if (ms == MemoryOrder.raw) {
    enum _ordering = AtomicOrdering.Monotonic;
}
```
-Johan
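To make the mapping above a bit more concrete, here is a minimal sketch of how the `MemoryOrder` values are passed to `atomicStore` and `atomicLoad` from core.atomic. The variable names are made up, and the example is single-threaded, so it only shows the API shape and not an actual synchronization scenario:

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;

shared int payload;  // hypothetical data to publish
shared bool ready;   // hypothetical flag guarding the data

void main()
{
    // Writer side: plain (raw/monotonic) store of the data,
    // then a release store of the flag.
    atomicStore!(MemoryOrder.raw)(payload, 42);
    atomicStore!(MemoryOrder.rel)(ready, true);

    // Reader side: an acquire load of the flag pairs with the release store.
    if (atomicLoad!(MemoryOrder.acq)(ready))
        assert(atomicLoad!(MemoryOrder.raw)(payload) == 42);
}
```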
Re: Access Violation when passing the result of a C function directly to a D function?
On Friday, 15 September 2017 at 04:01:13 UTC, Timothy Foster wrote: I've been calling it like so: ErrorFMOD(FMOD_System_Create(), "Error Creating System: "); Making the calls without my helper function doesn't cause an Access Violation. Calling it like this is the only thing that seems to fix it: auto result = FMOD_System_Create(); ErrorFMOD(result, "Error Creating System: "); Is this a known issue, or am I required to save the result of a C function to variable before passing it into another function or? This is very strange and you are certainly not required to save the result in a temp variable first. Do you have a small but full testcase that we can look at? (did you try with another compiler, LDC or GDC?) (note that debug information may be off, so the crash may happen in a different location from where it is reported to happen) -Johan
Re: performance cost of sample conversion
On Thursday, 7 September 2017 at 05:45:58 UTC, Ali Çehreli wrote: You have to measure. Indeed. Here's a start: The program has way too many things pre-defined, and the semantics are such that workWithDoubles can be completely eliminated... So you are not measuring what you want to be measuring. Make stuff depend on argc, and print the result of calculations or do something else such that the calculation must be performed. When measuring without LTO, probably attaching @weak onto the workWith* functions will work too. (pragma(inline, false) does not prevent reasoning about the function) -Johan
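A minimal sketch of the advice above, assuming a stand-in workload (this `workWithDoubles` is a placeholder, not the function from the original program): the input size depends on the number of command-line arguments and the result is printed, so the optimizer can neither precompute nor discard the calculation.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Placeholder workload standing in for the function being measured.
double workWithDoubles(size_t n)
{
    double acc = 0;
    foreach (i; 0 .. n)
        acc += i * 0.5;
    return acc;
}

void main(string[] args)
{
    // args.length is only known at run time, so n cannot be constant-folded.
    immutable n = 1_000_000 * args.length;

    auto sw = StopWatch(AutoStart.yes);
    immutable result = workWithDoubles(n);
    sw.stop();

    // Printing the result keeps the computation from being removed as dead code.
    writeln("result: ", result, ", time: ", sw.peek);
}
```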
Re: One path skips constructor - is this a bug?
On Thursday, 7 September 2017 at 16:08:53 UTC, Piotr Mitana wrote: main.d(17): Error: one path skips constructor main.d(15): Error: return without calling constructor http://www.digitalmars.com/d/archives/digitalmars/D/learn/Throwing_exception_in_constructor_28995.html
Re: SIMD under LDC
On Wednesday, 6 September 2017 at 20:43:01 UTC, Igor wrote: I opened a feature request on github. I also tried using the gccbuiltins but I got this error: LLVM ERROR: Cannot select: 0x2199c96fd70: v16i8 = X86ISD::PSHUFB 0x2199c74e9a8, 0x2199c74d6c0

That's because SSSE3 instructions are not enabled by default, so the compiler isn't allowed to generate the PSHUFB instruction. Some options you have:
1. Set a CPU that has SSSE3, e.g. compile with `-mcpu=native`.
2. Enable SSSE3: compile with `-mattr=+ssse3`.
3. Perhaps best for your case: enable SSSE3 for that function only, by importing the ldc.attributes module and using the @target("ssse3") UDA on that function.

-Johan
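Option 3 could look roughly like the sketch below. It is LDC-only (hence the `version (LDC)` guard), the wrapper name `shuffleBytes` is invented for this example, and the `byte16` parameter types are an assumption about how the builtin is declared in ldc.gccbuiltins_x86; check the declaration shipped with your LDC version.

```d
version (LDC)
{
    import core.simd : byte16;
    import ldc.attributes : target;
    import ldc.gccbuiltins_x86 : __builtin_ia32_pshufb128;

    // SSSE3 is enabled for this function only, so PSHUFB may be
    // selected here without changing the flags for the whole program.
    @target("ssse3")
    byte16 shuffleBytes(byte16 v, byte16 mask)
    {
        return __builtin_ia32_pshufb128(v, mask);
    }
}

void main()
{
    // Nothing to run here; the point is that shuffleBytes compiles
    // without -mattr=+ssse3 on the command line.
}
```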
Re: SIMD under LDC
On Monday, 4 September 2017 at 20:39:11 UTC, Igor wrote: I found that I can't use __simd function from core.simd under LDC and that it has ldc.simd but I couldn't find how to implement equivalent to this with it: ubyte16* masks = ...; foreach (ref c; pixels) { c = __simd(XMM.PSHUFB, c, *masks); } I see it has shufflevector function but it only accepts constant masks and I am using a variable one. Is this possible under LDC? You can use the module ldc.gccbuiltins_x86.di, __builtin_ia32_pshufb128 and __builtin_ia32_pshufb256. (also see https://gcc.gnu.org/onlinedocs/gcc-4.4.5/gcc/X86-Built_002din-Functions.html) Please file a feature request about shufflevector with variable mask in our (LDC) issue tracker on Github; with some code that you'd expect to work. Thanks. - Johan
Re: How to fix wrong deprecation message - dmd-2.075.1
On Wednesday, 16 August 2017 at 16:54:04 UTC, Pham wrote: On Wednesday, 16 August 2017 at 13:55:31 UTC, Steven Schveighoffer wrote: On 8/16/17 9:12 AM, Daniel Kozak via Digitalmars-d-learn wrote: It should not be printed? AFAIK std.utf.toUTF16 is not deprecated: http://dlang.org/phobos/std_utf.html#toUTF16 OK, this one is: https://github.com/dlang/phobos/blob/v2.075.1/std/utf.d#L2760 (but this one is not in the docs), but this one should not be deprecated: https://github.com/dlang/phobos/blob/v2.075.1/std/utf.d#L2777 Hm.. that's a bug in the compiler. Only one is marked, but both are treated as deprecated. Issue 17757 is created

I ran into this too the other day, and found that the issue was already filed: https://issues.dlang.org/show_bug.cgi?id=17193 - Johan
Re: D outperformed by C++, what am I doing wrong?
On Sunday, 13 August 2017 at 09:15:48 UTC, amfvcg wrote: Change the parameter for this array size to be taken from stdin and I assume that these optimizations will go away.

This is paramount for all of the testing, examining, and comparisons that are discussed in this thread. Full information is given to the compiler, and you are basically testing the constant folding power of the compilers (not unimportant). No runtime calculation is needed for the sum. Your program could be optimized to the following code:
```
void main()
{
    MonoTime beg = MonoTime.currTime;
    MonoTime end = MonoTime.currTime;
    writeln(end-beg);
    writeln(5000);
}
```
So actually you should be more surprised that the reported time is not near zero (just the time between two `MonoTime.currTime` calls)! Instead of `iota(1,100)`, you should initialize the array with random numbers with a randomization seed given by the user (e.g. commandline argument or stdin). Then, the program will actually have to do the runtime calculations that I assume you are expecting it to perform.

- Johan
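A sketch of the suggested setup, with an invented helper `randomInput` and an arbitrary fallback seed: the input array is filled from a random number generator whose seed comes from the command line, so its contents are unknown at compile time.

```d
import std.algorithm : sum;
import std.conv : to;
import std.random : Random, uniform;
import std.stdio : writeln;

// Hypothetical helper: n pseudo-random values in [0, 100) for a given seed.
int[] randomInput(size_t n, uint seed)
{
    auto rng = Random(seed);
    auto data = new int[n];
    foreach (ref x; data)
        x = uniform(0, 100, rng);
    return data;
}

void main(string[] args)
{
    // Seed from the first command-line argument; 42 is just a fallback.
    immutable uint seed = args.length > 1 ? args[1].to!uint : 42;
    auto data = randomInput(1_000_000, seed);

    // Print the result so the summation cannot be optimized away.
    writeln(data.sum);
}
```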
Re: D outperformed by C++, what am I doing wrong?
On Sunday, 13 August 2017 at 09:08:14 UTC, Petar Kirov [ZombineDev] wrote: This instantiation: sum_subranges(std.range.iota!(int, int).iota(int, int).Result, uint) of the following function: auto sum_subranges(T)(T input, uint range) { import std.range : chunks, ElementType, array; import std.algorithm : map; return input.chunks(range).map!(sum); } gets optimized with LDC to: [snip] I.e. the compiler turned a O(n) algorithm to O(1), which is quite neat. It is also quite surprising to me that it looks like even dmd managed to do a similar optimization: [snip] Execution of sum_subranges is already O(1), because the calculation of the sum is delayed: the return type of the function is not `uint`, it is `MapResult!(sum, )` which does a lazy evaluation of the sum. - Johan
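The laziness described above can be observed with a small experiment. This is only an illustrative sketch (the counter and the lambda are made up for it): the mapping function is not called when the range is constructed, only when an element is actually requested.

```d
import std.algorithm : map, sum;
import std.range : chunks, iota;

void main()
{
    int calls = 0;

    // Building the range performs no summation at all.
    auto lazySums = iota(1, 101).chunks(10).map!((c) { ++calls; return c.sum; });
    assert(calls == 0);

    // The first chunk (1 .. 10) is summed only when front is evaluated.
    auto first = lazySums.front;
    assert(first == 55);
    assert(calls == 1);
}
```

So timing only the call that builds such a range measures the construction of the range object, not the summation.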
Re: Create class on stack
On Monday, 7 August 2017 at 13:40:18 UTC, Moritz Maxeiner wrote: Thanks, I wasn't aware of this. I tried fooling around scope classes and DIP1000 for a bit and was surprised that this is allowed:

Thanks for the test case :-) It was fun to see that ASan can catch this bug too. Because writing the blog post about ASan will take quite some time still, I've pasted the demonstration below (there is a big big big caveat that will need more work from LDC's side, but you'll have to wait until the blog article).

Simplified your code for the demonstration:
```
class A {
    int i;
}

void inc(A a) @safe {
    a.i += 1; // Line 6
}

auto makeA() @safe { // Line 9
    import std.algorithm : move;
    scope a = new A();
    return move(a);
}

void main() @safe {
    auto a = makeA();
    a.inc(); // Line 17
}
```
```
ldc2 -fsanitize=address -disable-fp-elim scopeclass.d -g -O1 -dip1000
ASAN_OPTIONS=detect_stack_use_after_return=1 ./scopeclass 2>&1 | ddemangle
=
==11446==ERROR: AddressSanitizer: stack-use-after-return on address 0x000104929050 at pc 0x0001007a9837 bp 0x7fff5f457510 sp 0x7fff5f457508
READ of size 4 at 0x000104929050 thread T0
    #0 0x1007a9836 in @safe void scopeclass.inc(scopeclass.A) scopeclass.d:6
    #1 0x1007a9a20 in _Dmain scopeclass.d:17
    #2 0x1008e40ce in _D2rt6dmain211_d_run_mainUiPPaPUAAaZiZ6runAllMFZ9__lambda1MFZv (scopeclass:x86_64+0x10013c0ce)
    #3 0x7fff9729b5ac in start (libdyld.dylib:x86_64+0x35ac)

Address 0x000104929050 is located in stack of thread T0 at offset 80 in frame
    #0 0x1007a984f in pure nothrow @nogc @safe scopeclass.A scopeclass.makeA() scopeclass.d:9
```
Re: Express "Class argument may not be null" ?
On Tuesday, 8 August 2017 at 19:38:19 UTC, Steven Schveighoffer wrote: Note that C++ also can do this, so I'm not sure the & is accomplishing the correct goal: void foo(Klass&); int main() { Klass *k = NULL; foo(*k); } In C++, it is clear that the _caller_ is doing the dereferencing, and the dereference is also explicit. However, the in contract does actually enforce the requirement. And adds null pointer checks even when clearly not needed. - Johan
Re: Express "Class argument may not be null" ?
On Tuesday, 8 August 2017 at 18:57:48 UTC, Steven Schveighoffer wrote: On 8/8/17 2:34 PM, Johan Engelen wrote: Hi all, How would you express the function interface intent that a reference to a class may not be null? For a function "void foo(Klass)", calling "foo(null)" is valid. How do I express that that is invalid? (let's leave erroring with a compile error aside for now) There isn't a way to do this in the type itself. One can always create a null class instance via: MyObj obj; There is no way to disallow this somehow in the definition of MyObj. With structs, you can @disable this(), and it's still possible but harder to do so. Ok thanks, so this could be a reason for not being allowed to express the non-null-ness. (I still haven't found peace with the absence of an explicit * for classes) I would say, however, that if you wanted to express the *intent*, even without a compile-time error, you could use a contract: void foo(Klass k) in {assert(k !is null);}; Thanks. I regret leaving compile-time errors out, because in that case adding it to the function documentation would suffice. (Btw: "Error: function foo in and out contracts require function body". But who uses .di files anyway. ;-) Cheers, Johan
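Written out with a function body (which the error message quoted above requires), the contract approach might look like the following sketch. `Klass` and its `value` field are placeholders; the `do` keyword is assumed to be available, and older compilers spell it `body`.

```d
class Klass
{
    int value;
}

// The in-contract documents and checks the intent "k may not be null".
// It is checked at run time in non-release builds only.
void foo(Klass k)
in
{
    assert(k !is null, "foo: k must not be null");
}
do
{
    k.value += 1;
}

void main()
{
    auto k = new Klass;
    foo(k);
    assert(k.value == 1);

    // foo(null); // would trip the in-contract in a non-release build
}
```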
Express "Class argument may not be null" ?
Hi all, How would you express the function interface intent that a reference to a class may not be null? For a function "void foo(Klass)", calling "foo(null)" is valid. How do I express that that is invalid? (let's leave erroring with a compile error aside for now) Something equivalent to C++'s pass by reference: "void foo(Klass&)". (note: I mean D classes, for structs "ref" works) Thanks, Johan
Re: Question on SSE intrinsics
On Saturday, 29 July 2017 at 16:01:07 UTC, piotrekg2 wrote: Hi, I'm trying to port some of my c++ code which uses sse2 instructions into D. The code calls the following intrinsics: - _mm256_loadu_si256 - _mm256_movemask_epi8 Do they have any equivalent intrinsics in D? Yes, with LDC (probably GDC too). But unfortunately we don't have the "_mm256" functions (yet?), instead we have GCC's "__builtin_ia32..." functions. The first one you mention I think is just an unaligned load? That can be done with the template `loadUnaligned` from module ldc.simd. The second one has a synonym, "__builtin_ia32_pmovmskb256". -Johan
Re: Is align(16) respected for globals?
On Sunday, 23 July 2017 at 08:43:33 UTC, Guillaume Piolat wrote: I rely a lot on such constants for SSE: align(16) static immutable short[8] A = [ 1, 1, 1, 1, 3, 3, 3, 3 ]; Does such alignment actually work on all OS, at all times? Word on the street says align() doesn't work with globals. Should work with LDC (part of our testsuite).
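One way to check this on a particular compiler and platform is to inspect the address at run time. The sketch below only reports what it observes for the constant from the question; it makes no claim about which compilers pass.

```d
import std.stdio : writeln;

align(16) static immutable short[8] A = [1, 1, 1, 1, 3, 3, 3, 3];

void main()
{
    // Prints true when the address of the global is a multiple of 16.
    writeln((cast(size_t) A.ptr) % 16 == 0);
}
```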
Re: Zero-cost version-dependent function call at -O0.
On Sunday, 25 June 2017 at 23:02:28 UTC, Adam D. Ruppe wrote: On Sunday, 25 June 2017 at 22:53:07 UTC, Johan Engelen wrote: I meant semantically no call. In the existing language, I think version (or static if) at the usage and definition points both is as good as you're going to get. At the usage _and_ at definition point, indeed, that's a very good idea. - Johan
Re: Zero-cost version-dependent function call at -O0.
On Sunday, 25 June 2017 at 23:02:28 UTC, Adam D. Ruppe wrote: That'd be kinda tricky because the arguments would still be liable to be evaluated...

Well.. I guess someone might argue that's a mis-feature of my preprocessor example: "foo(i++)" may not do what you want. (So the C code would have to do "(void)param" for all params, to ensure evaluation and avoid compiler warning? =))

So I think I got things to work with inline IR! ^_^ https://godbolt.org/g/HVGTbx
```
version(none) {
    void foo(int a, int b, int c) { /*...*/ };
} else {
    pragma(LDC_inline_ir) R __ir(string s, R, P...)(P);
    alias foo = __ir!(`ret i32 0`, int, int, int, int);
}

void bar() {
    int a;
    foo(a++,2,3);
}
```
-Johan
Re: Zero-cost version-dependent function call at -O0.
On Sunday, 25 June 2017 at 22:23:44 UTC, Moritz Maxeiner wrote: The solution obviously does *not* work if you change the premise of your question after the fact by artificially injecting instructions into all function bodies

I meant semantically no call. I am asking for a little more imagination, such that I don't have to specify all obvious details. For example, the always inline solution also doesn't work well when `foo` takes parameters. Regardless, perhaps in the meanwhile you've come up with another solution? I am now thinking about introducing a noop intrinsic... (read what `-finstrument-functions` does). :-)
Re: Zero-cost version-dependent function call at -O0.
On Sunday, 25 June 2017 at 16:31:52 UTC, Moritz Maxeiner wrote: On Sunday, 25 June 2017 at 15:58:48 UTC, Johan Engelen wrote: [...] If version(X) is not defined, there should be no call and no extra code at -O0. [...] In C, you could do something like:
```
#if X
void foo() {..}
#else
#define foo()
#endif
```
How would you do this in D? By requiring the compiler to inline the empty foo:

This won't work. Semantically, there is still a call and e.g. profiling will see it: https://godbolt.org/g/AUCeuu

-Johan
Re: Zero-cost version-dependent function call at -O0.
On Sunday, 25 June 2017 at 16:29:20 UTC, Anonymouse wrote: Am I missing something, or can't you just version both the function and the function call? version(X) void foo() { /* ... */ } void main() { version(X) { foo(); } } I am hoping for something where "foo()" would just work. "version(X) foo();" isn't bad, but it adds a lot of noise. -Johan
Zero-cost version-dependent function call at -O0.
How would you solve this problem: do an optional function call depending on some version(X). If version(X) is not defined, there should be no call and no extra code at -O0.
```
{
    ...
    foo(); // either compiles to a function call, or to _nothing_.
    ...
}
```
In C, you could do something like:
```
#if X
void foo() {..}
#else
#define foo()
#endif
```
How would you do this in D? I can think of `mixin(foo())` but there is probably a nicer way that preserves normal function calling syntax.

Cheers, Johan
Re: Solution to "statement is not reachable" depending on template variables?
On Sunday, 18 June 2017 at 09:56:50 UTC, Steven Schveighoffer wrote: On Sunday, 18 June 2017 at 09:28:57 UTC, Johan Engelen wrote: Reviving this thread to see whether anything has changed on the topic. If Timon gets static for each into the language, it can look a little better. Can you help me understand what you mean? How will it improve things? (static foreach would disable the "statement not reachable" analysis?) -Johan
Re: Solution to "statement is not reachable" depending on template variables?
Reviving this thread to see whether anything has changed on the topic. I now have this monster:
```
struct FMT {
    // has immutable members. FMT cannot be assigned to.
}

FMT monsterThatCompilerAccepts(T)(){
    alias TP = Tuple!(__traits(getAttributes, T));
    foreach(i, att; TP){
        static if( ... ) {
            return FMT( ... );
        }
        // Make sure we return a default value in the last iteration.
        // Prevents "statement not reachable" warning when placed here instead of outside the foreach.
        else static if (i + 1 == TP.length) {
            return FMT( ... );
        }
    }
    static if (TP.length == 0) {
        return FMT( ... );
    }
}

FMT codeThatIsUnderstandableButNotAccepted(T)(){
    alias TP = Tuple!(__traits(getAttributes, T));
    foreach(i, att; TP){
        static if( ... ) {
            return FMT( ... );
        }
    }
    return FMT( ... );
}
```
Thanks, Johan
Re: Is D slow?
On Saturday, 10 June 2017 at 11:43:06 UTC, Johan Engelen wrote: On Friday, 9 June 2017 at 16:21:22 UTC, Honey wrote: What seems particularly strange to me is that -boundscheck=off leads to a performance decrease. Strange indeed. `-release` should be synonymous with `-release -boundscheck=off`. Nope it's not. http://www.digitalmars.com/d/archives/digitalmars/D/What_s_the_deal_with_-boundscheck_260237.html
Re: Is D slow?
On Friday, 9 June 2017 at 16:21:22 UTC, Honey wrote: What seems particularly strange to me is that -boundscheck=off leads to a performance decrease. Strange indeed. `-release` should be synonymous with `-release -boundscheck=off`. Investigating... - Johan