Re: Want a function that determines a double or float given its 80-bit IEEE 754 SANE (big endian) representation
On Tuesday, 22 August 2023 at 22:38:23 UTC, dan wrote: Hi, I'm parsing some files, each containing (among other things) 10 bytes said to represent an IEEE 754 extended floating point number, in SANE (Standard Apple Numerical Environment) form, as SANE existed in the early 1990s (so, big endian). Note that the number actually stored will probably be a positive even integer less than 100,000, so a better format would have been to store a two-byte ushort rather than a 10-byte float. However the spec chose to have an encoded float there. I would like to have a function of the form public bool ubytes_to_double( ubytes[10] u, out double d ) { /* stuff */ } which would set d to the value encoded provided that the value is a number and is sane, and otherwise just return false. So my plan is just to do this: examine the first 2 bytes to check the sign and see how big the number is, and if it is reasonable, convert the remaining 8 bytes to a fractional part, perhaps ignoring the last 2 or 3 as not being significant. But --- it seems like this kind of task may be something that d already does, maybe with some constructor of a double or something. Thanks in advance for any suggestions. dan

On 32-bit x86 an endianness swap and a pointer cast to `real` should be enough (it seems to be the same 80-bit format, but i could be wrong). Elsewhere `real` is not guaranteed to be 80 bits wide (on some targets it is the same as `double`), so the portable route is to isolate the sign, mantissa and exponent as integers, cast each to `double`, and recalculate the floating point number as `sign*mantissa*(2^^exponent)` after removing the exponent bias, according to Wikipedia. Since the files mostly contain integers, precision loss probably won't be a problem.
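The "isolate sign, mantissa and exponent" route can be sketched as below. This is only a sketch under stated assumptions: the function name `ubytesToDouble` is made up here, the layout assumed is the usual 80-bit extended one (1 sign bit, 15-bit exponent with bias 16383, 64-bit mantissa with an explicit integer bit), and anything that does not fit a finite `double` is simply rejected.

```d
import std.math : ldexp;

/// Sketch: decodes a big-endian 80-bit extended value into a double.
/// Returns false for infinities, NaNs and values too large for a double.
bool ubytesToDouble(const ubyte[10] u, out double d)
{
    const bool negative = (u[0] & 0x80) != 0;
    const int exponent = ((u[0] & 0x7F) << 8) | u[1]; // 15-bit biased exponent

    ulong mantissa = 0; // 64 bits, including the explicit integer bit
    foreach (b; u[2 .. 10])
        mantissa = (mantissa << 8) | b;

    if (exponent == 0x7FFF) // infinity or NaN
        return false;

    // value = mantissa * 2^(exponent - bias - 63), with bias = 16383
    const double magnitude = ldexp(cast(double) mantissa, exponent - 16383 - 63);
    if (magnitude == double.infinity)
        return false;

    d = negative ? -magnitude : magnitude;
    return true;
}
```

Values below the `double` range (including extended denormals) come out as zero here, which should not matter for small positive integers.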
Re: Is this code correct?
On Saturday, 1 April 2023 at 15:32:27 UTC, Dennis wrote: On Friday, 31 March 2023 at 13:11:58 UTC, z wrote: I've tried to search before but was only able to find articles for 3D triangles, and documentation for OpenGL, which i don't use. The first function you posted takes a 3D triangle as input, so I assumed you're working in 3D. What are you working on? Determines if a triangle is visible. You haven't defined what 'visible' means for a geometric triangle. explained to the best of my ability : https://d.godbolt.org/z/4a8zPGsGo the "angle bias" i was trying to explain about is present, when `rot = 0` the `mask` is supposed to only have three `true` values.(that of the "front" triangle.) You haven't defined what 'visible' means for a geometric triangle. Problem is, all i have is an assembly dump and the data it interacts with, both of which are very(very) old.
Re: Writing a dejargoniser - producing read ke analysis output in English that explains GDC / LDC asm code’s parameters and clobbers
On Wednesday, 5 April 2023 at 15:19:55 UTC, Cecil Ward wrote: How much code do you thing I would need to write for this? I’m still thinking about its feasibility. I don’t want to invent the wheel and write a custom parser by hand, so’d rather steal the code using sim eg called ‘a library’. :-) The idea would be that the user could run this to sanity-check her understanding of the sometimes arcane GDC asm code outputs/inputs/clobbers syntax, and see what her asm code’s constraints are actually going to do rather than what she thinks it’s going to do. Clearly I can’t readily start parsing the asm body itself, I would just inspect the meta info. (If that’s the right word?) I would have a huge problem with LDC’s alternative syntax unless I could think of some way to pre-transform and munge it into GDC format. I do wish LDC (and DMD) would now converge on the GDC asm syntax. Do you think that’s reasonably doable? Maybe try a translator approach? A GDC/LDC to the other(or a custom format, maybe even NASM with comments explaining what the inline assembly metadata says) translator(assuming x86 here, but at a glance it seems ARM is affected by similar MASMvsAT and compiler inline assembly nightmare fuel) would probably help, especially considering you appear to prefer GDC's inline asm to LLVM's. As for the amount of effort, it will possibly require a few structs to represent an intermediary format(maybe) and pattern matching routines to convert to that format. On a more personal note, i was faced with the same dissatisfaction you appear to be having and found great success with NASM, it was easier to write code for it and also helps get one familiarized with usage of D/dub in cross language projects. It's also compatible with DMD by virtue of being compiler independent. Using the D-integrated `asm` statement may work but it certainly lacks support of too many instructions and registers for me to recommend using it.(unless this changed.)
Re: Is this code correct?
On Thursday, 30 March 2023 at 12:59:31 UTC, Dennis wrote: On Thursday, 30 March 2023 at 10:29:25 UTC, z wrote: Is this code correct or logically sound? You need to be exact on what 'correct' is. The comment above `triangleFacesCamera` says: Indicates wether a triangle faces an imaginary view point. There's no view point / camera position defined anywhere in your code. Instead, all it's doing is comparing the angle of one triangle side (A->B) with another (B->C) in the XZ plane. This suggests you want to know the triangle winding order: clockwise or counter clockwise. If you search for "triangle winding order" you'll find simple and correct ways to do that. Your code needlessly computes angles, only considers the XZ plane, and doesn't compare the angles correctly. r2 -= r1; return r2 > 0; You need to wrap (r2 - r1) into [-τ/2, τ/2] range, then you can compare with 0.

Yes, the viewpoint/cam should be [0,0,0] for now. The old algorithm completely discards depth information after scaling triangle point data depending on depth distance from 0, hence why i ignored it. I've tried to search before but was only able to find articles for 3D triangles, and documentation for OpenGL, which i don't use. My description of the old algorithm was also badly written (it is confusing!) so here it is rewritten to be more understandable :
```D
enum {x, y}//leftright/side, updown/height
alias tri = byte[2];

/** Determines if a triangle is visible. */
extern bool oldfunction(tri tA, tri tB, tri tC)
{
    short r0, r1;//r, "result"
    r0 = cast(short) ((tB[y]-tA[y]) * (tC[x]-tB[x]));
    r1 = cast(short) ((tC[y]-tA[y]) * (tB[x]-tA[x]));
    r1 -= r0;
    return r1 > 0;//"visible" means result is neither negative nor zero
}
```
(Replacing my code with this but in floating point "works", but there is an apparent "angle bias" and other problems that are difficult to pinpoint or describe.) Does this match anything known?
It doesn't look like cross product to me because the subtractions occur earlier than the multiplications. By correct i mean that it can be used exactly like the old algorithm. (to ignore subshapes from a 3D model) Thanks again
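For comparison, here is a minimal sketch of the usual winding-order test based on the 2D cross product (twice the signed area). The names `signedArea2` and `isCounterClockwise` are made up for this sketch, and it is not claimed to be what the old assembly does: the old routine multiplies by `tC[x]-tB[x]` where this formula uses `c[0] - a[0]`, so the two are not term-for-term identical.

```d
/// Twice the signed area of the 2D triangle (a, b, c).
/// Positive when a -> b -> c turns counter-clockwise with the y axis pointing up.
int signedArea2(const byte[2] a, const byte[2] b, const byte[2] c)
{
    // byte operands are promoted to int, so the products cannot overflow
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]);
}

/// Hypothetical visibility test: only one winding order counts as "visible".
bool isCounterClockwise(const byte[2] a, const byte[2] b, const byte[2] c)
{
    return signedArea2(a, b, c) > 0;
}
```

Whether "visible" corresponds to clockwise or counter-clockwise depends on the model's conventions and on the direction of the y axis, so the comparison may need to be flipped.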
Is this code correct?
Is this code correct or logically sound?
```D
import std;

enum {//side/depth/height and side/height
    x,//0
    y,//1
    z //2
}

/** Indicates whether a triangle faces an imaginary view point. */
bool triangleFacesCamera(float[3] tA, float[3] tB, float[3] tC)
{
    float r1, r2;
    r1 = atan2(tB[x] - tA[x], tB[z] - tA[z]);//tried swapping parameters without success
    r2 = atan2(tC[x] - tB[x], tC[z] - tB[z]);//also tried with tA as subtraction 2nd operand, with same/similar results.
    //trying with both is still broken, but appears to change the breakage pattern.
    r2 -= r1;
    return r2 > 0;
}
```
For context, it is trying to reproduce an old algorithm that does the same thing :
```D
//in D pseudo code
extern short sus(byte A, byte B){
    r0 = sus(tB[y] - tA[y], tB[x] - tA[x]);
    r1 = sus(tC[y] - tA[y], tC[x] - tA[x]);
    r1 -= r0;
    return r1 > 0;
}
```
`sus` is named so because it makes use of `atan` and a premade table that *looks like* `tan(a*(a*5.09...))` but is impossible for me to reproduce exactly. (On a graph it looks like tan but crammed to loop only four/sixteen/sixtyfour times at τ/4,π,τ.) Trying to parse both `atan2` and `sus` with `(sin(angle), cos(angle))` as arguments[1] shows that `atan2` outputs it as an angle in `-π~π` angle representation, but `sus`'s output differs in that it looks like `sin(x*2)`. I'd add images to better illustrate it but do not know how... Big thanks

[1] https://run.dlang.io/is/xR8TcE

ps: is it normal that posting sometimes has a very long delay? When it happens it is as if the post has gone nowhere, but then it suddenly appears much later, which can cause confusion and duplicate posting, oof.
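One thing that can be checked independently of the old table: the difference of two `atan2` results lies anywhere in (-2π, 2π), so comparing it directly against zero gives the wrong sign whenever the difference crosses ±π. A minimal sketch of wrapping the difference first, assuming a helper name `wrapAngle` that is not part of the original code:

```d
import std.math : PI, fmod;

/// Wraps an angle in radians into the range [-PI, PI).
double wrapAngle(double angle)
{
    double shifted = fmod(angle + PI, 2 * PI); // now in (-2*PI, 2*PI)
    if (shifted < 0)
        shifted += 2 * PI;                     // now in [0, 2*PI)
    return shifted - PI;
}
```

With this, `wrapAngle(r2 - r1) > 0` only depends on the turn direction between the two sides, not on where the angles happen to sit relative to the `atan2` branch cut.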
Re: Why are globals set to tls by default? and why is fast code ugly by default?
On Sunday, 26 March 2023 at 18:07:03 UTC, ryuukk_ wrote: ``shared`` is even more ugly since everything must be shared afterwards

The limitations of `shared` can be bypassed with a "function" that removes type qualifiers, along the lines of `return *cast(Unqual!T*) &value;` with `value` standing for the function's parameter (example only, doesn't work as is for arrays). This way `shared` symbols can be used with functions that *cannot* be made compatible with both shared and non-shared data. (The exact reasons why they cannot often seem obscure: it may be that a called function *would* support `shared` parameters but passes the parameter on to a function that does not; ime this is painfully apparent with `struct` member functions.) Naturally there's a risk of concurrency issues, because this isn't the way of doing things recommended by the language documentation, but D's library has tools that cover these problems (e.g. `core.atomic` and `core.sync`).
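A minimal sketch of such a qualifier-removing helper follows. The names `assumeUnshared`, `Counter` and `bumpCounter` are hypothetical, and the sketch assumes the caller provides the synchronization itself (here an unnamed `synchronized` block); nothing in the cast makes the access thread-safe.

```d
import core.atomic : atomicLoad;
import std.traits : Unqual;

/// Hypothetical helper: views a shared lvalue as its unqualified type.
/// The caller is responsible for synchronization.
ref Unqual!T assumeUnshared(T)(ref shared T value) @system
{
    return *cast(Unqual!T*) &value;
}

struct Counter
{
    int n;
    void bump() { ++n; } // not callable on a shared Counter
}

shared Counter counter;

void bumpCounter()
{
    synchronized // unnamed critical section guarding the unshared access
    {
        assumeUnshared(counter).bump();
    }
}

int readCounter()
{
    return atomicLoad(counter.n);
}
```

Because the helper returns by `ref`, the result stays an lvalue, so it can also be passed to functions that take `ref` parameters.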
Re: Drawing a line code
On Sunday, 6 November 2022 at 20:07:47 UTC, z wrote: whenever the counter is above `1` I meant above or equal(`>=`), woops
Re: Drawing a line code
On Sunday, 6 November 2022 at 16:48:24 UTC, Joel wrote: I want my code fixed up so that works from any two points.

You can add a condition that prevents writing outside of the image/framebuffer/whatever memory, so it won't do any out-of-bounds write. Another valid algorithm could be testing all pixels for distance from the line (i believe that's what 2D vector rendering software does?). The use of `incrE` and `incrNE` can be replaced with an approach where you increment a counter by `dx/dy` (assuming `dy > dx` here) every iteration (after you draw the dot, i believe), and whenever the counter is above `1` you add `sx` or `sy` to the `x` or `y` value, then subtract `1` from the counter. (As the wording implies, you will probably need to branch depending on whether `dx > dy` or `dy > dx`.)
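The counter approach can be sketched as below, under a few assumptions: `drawLine` and the `plot` callback are names invented for this sketch, and the counter is kept as an integer scaled by the longer axis (add the shorter delta, compare against the longer one) instead of adding the fraction `dx/dy`, which avoids floating point rounding while being the same idea.

```d
import std.math : abs;

/// Sketch: calls plot(x, y) for every pixel on the line from (x0, y0) to (x1, y1).
void drawLine(int x0, int y0, int x1, int y1,
              scope void delegate(int x, int y) plot)
{
    const dx = abs(x1 - x0), dy = abs(y1 - y0);
    const sx = x0 < x1 ? 1 : -1;
    const sy = y0 < y1 ? 1 : -1;
    const major = dx >= dy ? dx : dy; // number of steps along the longer axis
    const minor = dx >= dy ? dy : dx;

    int x = x0, y = y0, counter = 0;
    foreach (_; 0 .. major + 1)
    {
        plot(x, y);
        counter += minor;               // scaled form of "counter += minor/major"
        const bool stepMinor = counter >= major;
        if (stepMinor)
            counter -= major;           // scaled form of "counter -= 1"
        if (dx >= dy)
        {
            x += sx;
            if (stepMinor) y += sy;
        }
        else
        {
            y += sy;
            if (stepMinor) x += sx;
        }
    }
}
```

The bounds check mentioned above would go inside the `plot` callback, so that the stepping logic stays independent of the framebuffer size.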
Re: Hipreme's #4 Tip of the day - Don't use package.d
On Friday, 4 November 2022 at 10:57:12 UTC, Hipreme wrote: ... What do we use instead? I won't lie about the fact package.d forced me to workaround elusive "bugs" in my usage(1) but what is the alternative if we don't want to work around it? (1)(ime : had cases of package.d requiring compiler specific pragmas for LDC, and dub can't find the package's `source` files at all if it's a multi file subpackage intended to be imported only, i never got it working with `package.d`, only a single source file setup `*packagename*.d` would work...)
Re: Hipreme's #1 Tip of the day
On Wednesday, 19 October 2022 at 23:28:46 UTC, Hipreme wrote: Hey guys, I'm going to start making a tip of the day (although I'm pretty sure I won't be able to give every day a tip), but those things are really interesting to newcomers to know and may be obvious to some of the old schoolers there. Always public import a type that the user (including you) is expected to interact with: Imagine we have a module A that define a tree
```d
module a;
struct Tree(T){}
```
If you create a module b containing a function that returns a tree:
```d
module b;
import a;
Tree!string getDirectoryTree(string entry){return null;}
```
This is virtually unusable! One must `public import a: Tree;` This will make your API a lot easier to interact with, keep in mind to always public import some type that is used from another dependency like this, but try to not overdo it.

Will add an important tip for when handling variables of integer type smaller than `int` :
```D
ushort a, b, c;
c = a + b;//expression a + b will be promoted to int or compiler will emit a deprecation warning
c = cast(ushort) (a + b);//works fine
c = cast(typeof(c)) (a + b);//alternate form
c = cast(ushort) a + b;//bad, only "a" will be casted so you get promoted to int/warned anyway
c += a;//assigning operators' operands won't be promoted, with same types at least
auto d = a + b;//will probably be of type int
```
note: i didn't 101% check but the above holds true in my usage at least.
Re: vectorization of a simple loop -- not in DMD?
On Monday, 11 July 2022 at 18:15:16 UTC, Ivan Kazmenko wrote: Hi. I'm looking at the compiler output of DMD (-O -release), LDC (-O -release), and GDC (-O3) for a simple array operation: ``` void add1 (int [] a) { foreach (i; 0..a.length) a[i] += 1; } ``` Here are the outputs: https://godbolt.org/z/GcznbjEaf From what I gather at the view linked above, DMD does not use XMM registers for speedup, and does not unroll the loop either. Switching between 32bit and 64bit doesn't help either. However, I recall in the past it was capable of at least some of these optimizations. So, how do I enable them for such a function? Ivan Kazmenko. No, not in DMD. DMD generates what looks like 32 bit code adapted to x86_64. LDC may optimize this kind of loop with a tri-way branch depending on how many array elements remain. but it can both generate very good loop code(particularly when AVX-512 is available and the struct/data arrangement in memory is unfavorable for SIMD) and very questionable code. You may be losing performance for obscure reasons that look like gnomes decided to steal your precious cpu cycles and when that happens there is no way to fix it other than manually going in with a disassembler/debugger, changing defect optimizations in hot code paths to something faster then save back to executable file.(yikes, i know.)
Re: How to debug thread code
On Sunday, 10 July 2022 at 21:27:08 UTC, Hipreme wrote: I'm stuck in a racing condition right now and I'm unable to run a debugger on the code. Usually I was using Visual Studio 2019 for debugging my code, but it shows the following message: "Your app has entered a break state, but there is no code to show because all threads were executing external code (typically system or framework code)." I can only access the main thread code which stops on my semaphore Maybe it will work with an external debugger like x96dbg? With debug symbols(i'd recommend LDC build too) it should be able to show threads and where in the code it was stuck in(i'm assuming each thread will be stuck in a wait loop) It's also possible the error message is just the truth, but it seems weird that it can't at least display the call stack in that case.
Re: a struct as an multidimensional array index
On Saturday, 11 June 2022 at 15:01:05 UTC, Ali Çehreli wrote: On 6/11/22 00:09, z wrote: > I rechecked and it should be `X Y Z` for static array, but `Z Y X` for > indexing/dynamic array creating with `new` How so? i meant with the syntax in (1), the spec's documentation appears to say they are equivalent in result with `new *type*[X][Y]` form. (1) https://dlang.org/spec/expression#new_multidimensional (3. multiple argument form)
Re: a struct as an multidimensional array index
On Saturday, 11 June 2022 at 03:56:32 UTC, Chris Katko wrote: On Friday, 10 June 2022 at 17:26:48 UTC, Ali Çehreli wrote: On 6/10/22 08:13, z wrote: > arrays of arrays has different order for declaration and addressing, > and declaring array of arrays has different order depending on how you > declare it and wether it's static or dynamic array, *oof*) > > To give you an idea of the situation : > ```D > int[3][1] a;//one array of 3 int > writeln(a[0][2]);//first "column", third "row" > ``` I've written about this multiple times in the past but D's way is consistent for me. That must be because I always found C's syntax to be very illogical on this. To me, C's problem starts with putting the variable name in the middle: // C code: int a[1][3]; // Why? So, first, D moves the variable to its consistent place: after the type: int i; int[N] arr; Both of those are in the form of "type and then name". Good... And then, here is the consistency with arrays: "type and then square brackets". int[] dynamicArray; int[N] staticArray; So, here is where you and I differ: int[3][1] arr; // Ali likes int[1][3] arr; // z wants I like it because it is consistently "type and then square brackets". (It so happens that the type of each element is int[N] in this case.) If it were the other way, than array syntax would be inconsistent with itself. :) Or, we would have to accept that it is inside-out like in C. But of course I understand how it is seen as consistent from C's point of view. :) And this is consistent with static vs dynamic as well because again it's "type and then square brackets": int[1][] a; // A dynamic array of int[1] int[][3] b; // A static array of 3 int[]s Ali This is an interesting discussion. I had noticed multi-dim arrays seemed backwards but I assumed I was doing something wrong and had other thing to worry about. I had no idea it was DIFFERENT for static vs dynamic arrays? That's horrifying! Also you reminded me of a possible D bug that I ran into. 
I had classes that had circular dependencies. One had to know about the other, and vice-versa. And I had derived classes. But somehow, they would explode. I would send one reference to the others constructor to 'link' them together, but the reference would be NULL. But if I accessed the exact same variable through a global reference, it worked fine. I tried ripping the affected code into a new file but the bug wasn't replicated. Even if I matched the compiler/linker options. It was super frustrating.

I rechecked and it should be `X Y Z` for static arrays, but `Z Y X` for indexing and for dynamic array creation with `new` (e.g. `float[][][] arr = new float[][][](third_dimension,second_dimension,first_dimension);`). The dilemma is that, taken alone, the orders are sound byproducts of the language rules; it's when they are put in relation to each other that it can become weird. The bug could also be one of those implementation-specific bugs that are seemingly impossible to reproduce minimally because they require unknown, very specific conditions to occur. Self and inter referencing appears unstable whenever it is not in the module/global scope.
Re: a struct as an multidimensional array index
On Friday, 10 June 2022 at 08:08:45 UTC, Chris Katko wrote: Is it somehow possible to use a struct as a [multidimensional] array index: D struct indexedPair { size_t x, y; } bool isMapPassable[100][100]; auto p = indexedPair(50, 50); if(isMapPassable[p]) return true; Probably not, but I'm curious.

AFAIK no. I admit it's an area D could improve on; it creates a lot of confusion because of the ordering and the lack of an integrated solution. (Arrays of arrays have a different order for declaration and addressing, and declaring an array of arrays has a different order depending on how you declare it and whether it's a static or dynamic array, *oof*.) To give you an idea of the situation :
```D
int[3][1] a;//one array of 3 int
writeln(a[0][2]);//first "column", third "row"
```
One thing you could do however is make the array accept a multidimensional argument through operator overloading (`opIndex`) if it is the only array in a struct, but that stops being viable when you have multiple arrays that would benefit from it. To summarize, there does not appear to be an easy solution that has no drawbacks. I'd recommend saving yourself the trouble of arrays of arrays (of arrays?) and using a single array of length x*y with a function to index into it: `x + (xlength*y)`, or `(x + (xlength*y)) + ((xlength*ylength)*z)` if a third dimension is desirable.
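Both suggestions (a single flat array and `opIndex`) can be combined in a small wrapper. The following is a sketch only; `Grid` and `IndexedPair` are names chosen for this example, and the layout assumed is row-major with `x + width * y`.

```d
struct IndexedPair
{
    size_t x, y;
}

/// Sketch: a 2D grid stored in one flat array of length width * height.
struct Grid(T)
{
    size_t width, height;
    T[] cells;

    this(size_t width, size_t height)
    {
        this.width = width;
        this.height = height;
        cells = new T[width * height];
    }

    /// grid[x, y]
    ref T opIndex(size_t x, size_t y)
    {
        assert(x < width && y < height, "index out of bounds");
        return cells[x + width * y];
    }

    /// grid[pair]
    ref T opIndex(IndexedPair p)
    {
        return this[p.x, p.y];
    }
}
```

Usage then looks like `auto map = Grid!bool(100, 100); map[IndexedPair(50, 50)] = true;`, and since `opIndex` returns by `ref`, no separate `opIndexAssign` is needed for simple element types.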
Graphing a D function : possible?
Is there a quick way of obtaining the graph of D functions like these? ```d T f(T) if (isScalarType!T){} ``` or ```D T[2] f(T, T)if (isScalarType!T){} ``` I know that there are graphing calculators already, but these don't support low level black magic like int <-> float conversions and i'm lost because there is no way to know if the code i write is correct without a graph or trial and error, hence the question. Many thanks
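One low-tech option is to sample the function and hand the points to any external plotting tool. A minimal sketch, assuming a helper name `samplePoints` that is not an existing library function; it only covers the one-argument case and converts the result to `double`.

```d
/// Sketch: evaluates f at `count` evenly spaced points in [from, to]
/// and returns the (x, f(x)) pairs.
double[2][] samplePoints(alias f)(double from, double to, size_t count)
{
    assert(count >= 2, "need at least two samples");
    auto points = new double[2][](count);
    foreach (i, ref p; points)
    {
        const double x = from + (to - from) * i / (count - 1);
        p[0] = x;
        p[1] = f(x); // f may do any bit-level tricks internally
    }
    return points;
}
```

The pairs can then be written out with `writefln("%s,%s", p[0], p[1])` from `std.stdio` and loaded as CSV into a spreadsheet or a plotting program.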
Union member positions?
Is it possible to set a "position" on a union member? Or is there a language-integrated equivalent? For example, to get access to each byte in an unsigned integer while still supporting the original type.
```D
///a single uint that would be accessed as two ushort, or four separate ubyte
union UnionExample{
    uint EAX;
    //upper
    ushort EAHX;
    ubyte EAHH;
    ubyte EAHL;
    //lower
    ushort EALX;
    ubyte EALH;
    ubyte EALL;
}
```
Thanks.
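One way to get this effect, sketched under assumptions: every direct member of a union starts at offset 0, but anonymous structs inside the union lay their own fields out one after another, which gives each field a fixed position. The names below (`Register`, `low`, `high`, `b0`...) are invented for the sketch, and which field holds the "upper" part depends on the target's endianness.

```d
/// Sketch: one 32-bit value viewed as halves or as individual bytes.
union Register
{
    uint full;                        // the whole 32-bit value
    struct { ushort low, high; }      // two 16-bit halves, lowest address first
    struct { ubyte b0, b1, b2, b3; }  // four bytes, lowest address first
    ubyte[4] bytes;                   // same bytes, indexable at run time
}

static assert(Register.sizeof == 4);
```

On a little-endian target such as x86, `low` and `b0` hold the least significant part of `full`; on a big-endian target the order is reversed, so code relying on this would need a `version (LittleEndian)` / `version (BigEndian)` split.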
Re: Download a file into array (using std.net.curl.download)
On Wednesday, 7 July 2021 at 10:27:47 UTC, notna wrote: On Windows:
```
::> dmd curl_get.d
::> .\curl_get.exe
object.Error@(0): Access Violation
0x0283CA66
0x0041DE8D
0x004023A2
0x00402308
0x00414D33
0x00414CAD
0x00414B48
0x0040D41F
0x00402363
0x74B96359 in BaseThreadInitThunk
0x773887A4 in RtlGetAppContainerNamedObjectPath
0x77388774 in RtlGetAppContainerNamedObjectPath
```
Nice and helpful Error messages is on the top of our desires list, right?

On 64 bits you don't even get a stack trace or description. Sad, i know. If you want better, i would recommend compiling with `-g` and hooking up a debugger, then just let it run and it should break on the access violation exception (code 0xC0000005).
Re: Parallel For
On Tuesday, 15 June 2021 at 06:39:24 UTC, seany wrote: ... This is the best I could do: https://run.dlang.io/is/dm8LBP For some reason, LDC refuses to vectorize or even just unroll the nonparallel version, and more than one `parallel` corrupts the results. But judging by the results you expected and what you described, you could maybe replace it by a ton of `c[] = a[] *operand* b[]` operations? Unless you use conditionals after or do something else that confuses the compiler, it will maybe use SSE/AVX instructions, and at worst use basic loop unrolling.
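For reference, the array-operation form mentioned above looks like this. A minimal sketch: `multiplyInto` is a name made up here, and whether the compiler actually emits SSE/AVX instructions for it depends on the compiler and flags.

```d
/// Sketch: element-wise c = a * b using D's built-in array operations.
void multiplyInto(float[] c, const float[] a, const float[] b)
{
    // array operations require all slices to have the same length
    assert(a.length == c.length && b.length == c.length);
    c[] = a[] * b[];
}
```

The same syntax works with other operators and with scalars, e.g. `c[] = a[] + 1;` or `c[] += b[];`.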
Re: ugly and/or useless features in the language.
On Saturday, 15 May 2021 at 14:31:08 UTC, Alain De Vos wrote: Which parts in dlang don't you use and why ? Auto return types i find dangerous to use. I found `auto` and `ref`(yes just `ref`) return types very useful for bypassing the type system, eg.: ```D ///function requires lvalue and the return value can be of any type while still being the same data. ref mutateSomething(A)(ref A something) { static if (isArray!A){/*...*/} else {/*...*/} return something; } ``` Other than that, i've found tuples limited in usage compared to just using a struct or static arrays.(the main problem is the inability to use runtime indexing even if the types match.) The rest, i don't use from lack of interest or awareness it even exists.
Re: What's a good approach to DRY with the block code of a case-statement?
I'd recommend using templates with alias parameters, but you mentioned that you need to retain function context (for gotos, continue, break, etc...). One thing you could do is mix the ugly mixin solution with the good one:
```D
//inside the function, so that you can access locals
pragma(inline, true)
string getDRYstr(T, parameters...)()
{
    assert(__ctfe);//only meant to run at compile time (`static assert(__ctfe)` would not compile)
    string mixedinstr;
    mixedinstr ~= /*code*/;
    static if(/**/){mixedinstr ~= /*code*/;}
    //...
    mixedinstr ~= "break;";
    return mixedinstr;
}
```
There is no doubt that it is an ugly solution, but the time saved by not copy-pasting code could be worth it.
Re: AliasSeq different from just using the symbol name(s)?
On Thursday, 15 April 2021 at 19:53:57 UTC, Paul Backus wrote: They're not *exactly* the same. When you write auto seq = AliasSeq!(a, b, c); ...you are declaring a sequence of three *new* array variables [1] and initializing them with copies of the original arrays. It's as though you'd written: auto seq_a = a; auto seq_b = b; auto seq_c = c; alias seq = AliasSeq!(a, b, c); If you want to refer directly to the original variables, you need to create your sequence with `alias` instead of `auto`: alias seq = AliasSeq!(a, b, c); [1] https://dlang.org/articles/ctarguments.html#type-seq-instantiation Ah thank you so much! i changed `auto` to `alias` and it worked perfectly.
Re: AliasSeq different from just using the symbol name(s)?
On Thursday, 15 April 2021 at 19:38:04 UTC, z wrote: ```D int[] a,b,c,d,e; void templatef(args...){/*...*/} //... auto seq = AliasSeq!(b,c,d); templatef!(a,seq,e); templatef!(a,b,c,d,e); //am i being mistaken for thinking these two template calls should be equivalent in behavior? ``` woops, meant `void templatef(args...)(){}`
Re: AliasSeq different from just using the symbol name(s)?
On Thursday, 15 April 2021 at 18:58:40 UTC, Paul Backus wrote: Without an example that shows the actual problem you encountered, it will be almost impossible for anyone to help you figure out what is causing it. Since you were not able to trigger it, it seems likely that the problem is related to something other than the AliasSeq which you have left out of the example. I understand that it won't be possible to pinpoint the cause without a reduced test case, but : ```D int[] a,b,c,d,e; void templatef(args...){/*...*/} //... auto seq = AliasSeq!(b,c,d); templatef!(a,seq,e); templatef!(a,b,c,d,e); //am i being mistaken for thinking these two template calls should be equivalent in behavior? ``` And if not, does it mean that the problem i encountered is a possible bug?
AliasSeq different from just using the symbol name(s)?
I've tried to group together a bundle of alias template parameters with AliasSeq, but while without it works just fine, when the verbose parameters are grouped with multiple AliasSeqs, the lengths of the array parameters passed through AliasSeq are 0(inside the templated function, before the call it's still OK) and a range violation/exception occurs. This is weird because the templated function does not change the length of its array parameters, and printing the parameter's string name to stdout at runtime shows that they are supposedly the same(in symbol name at least), but somehow it isn't the same? To see what i mean : https://run.dlang.io/is/VXDRL4 (i could not manage to trigger it here however.) Big thanks
Re: dub commands do not run correctly.
On Saturday, 10 April 2021 at 13:15:19 UTC, Alain De Vos wrote: dub fetch lint Getting a release version failed: (1): Error: Got JSON of type null_, expected object. Retry with ~master... (1): Error: Got JSON of type null_, expected object. He meant `dub lint`, with the working directory in the package's root folder(where dub.sdl/dub.json is present) It should then automatically fetch and compile dscanner and execute it on the package.
Re: Extern/scope issue
On Saturday, 3 April 2021 at 10:17:14 UTC, DLearner wrote: Does this mean D has no equivalent of C globals? What is the D way of doing this?

With __gshared. If the global is defined from within another language, apparently you'd have to do [extern(C) extern __gshared *name*](https://dlang.org/spec/interfaceToC.html#c-globals) It seems that the whole extern keyword can be confusing with variables:
```d
//L is the language name
extern(L) returnType functionName(parameters); // function implemented in another language or out of this module.
extern(L) returnType functionName(parameters) {/*...*/}//extern only changes the name mangling and the calling rules in the resulting assembly code.(with D it does not change anything?)
extern(L) variableType variableName; //what you did, declares a normal variable, except that the name mangling rule is that of the language you specified.
extern(L) extern otherQualifiersIfAny variableType variableName; //appears to be a variable declared outside of the module, so at link time a .obj file will have to declare a variable with this symbol name or else the linker will error out.
```
It seems that case 4 is what you desired but i do not know if with this module hierarchy it can/will work with dub.(it should.) With the code as is you should be able to access both variables from main with `testmod.xvar` and simply `xvar`.(when name conflicts like this occur the most local is used by default, otherwise use the full name which should be `testmain.xvar` in this case.)
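The difference between a default (thread-local) module variable and a `__gshared` one can be seen with a second thread. A minimal sketch; the variable and function names are invented for the example, and the unsynchronized write is only safe here because the thread is joined before the value is read.

```d
import core.thread : Thread;

int tlsCounter;               // default: one copy per thread
__gshared int sharedCounter;  // one copy for the whole process, like a C global

void bumpBoth()
{
    ++tlsCounter;     // touches the calling thread's own copy
    ++sharedCounter;  // touches the single process-wide variable
}

/// Runs bumpBoth in another thread and waits for it to finish.
void bumpFromOtherThread()
{
    auto worker = new Thread(&bumpBoth);
    worker.start();
    worker.join();
}
```

After `bumpFromOtherThread()` returns, the calling thread still sees its own `tlsCounter` unchanged, while `sharedCounter` has been incremented.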
Re: Casting lvalues
On Friday, 2 April 2021 at 12:47:35 UTC, z wrote: ```d T processArray(T)(ref T[] p){/*...*/} //function calls .reserve ``` i meant `void` for the function return type.
Casting lvalues
```d shared TT[] a; T processArray(T)(ref T[] p){/*...*/} //function calls .reserve on its parameter a.processArray; ``` Template *name* cannot deduce function from argument types !()(shared(T[]))... Even if the function is changed to only accept shared parameters, .reserve does not appear to support shared so the function is impossible to use without somehow changing its type or using __gshared. Other than using pointers and casting(```processArray(*(cast(Unqual!TT*)))```, not ideal), is there a better way to transform an lvalue's type without it transforming into an rvalue?(direct casting seems to create an rvalue because compilation fails.) Thanks
Re: Using YMM registers causes an undefined label error
On Tuesday, 9 March 2021 at 20:33:01 UTC, z wrote: On Tuesday, 9 March 2021 at 20:23:48 UTC, z wrote: On Friday, 5 March 2021 at 12:57:43 UTC, z wrote: ... Then it seems the only way to get AVX-compatible inline assembly(ldc.llvmasm excluded) is to use an external assembler. For example : ... But i'm not really sure how to integrate that into a dub project, it seems «lflags "filename.obj"» and preGenerateCommands/preBuildCommands would work but i haven't tested that.(«dflags "filename.obj"» doesn't work for sure) In dub.sdl : lflags "source/asmfunctions.obj" preGenerateCommands " cd source && *command or bat/sh file that builds the asm object file(s)*" It works, but if the package is being imported by another then it will fail because the way lflags work mean that the linker will try to find source/asmfunctions.obj from the working directory of the importer. This is circumventable with relative paths(if possible). lflags "../importedpackagesname/source/asmfunctions.obj"
Re: Using YMM registers causes an undefined label error
On Tuesday, 9 March 2021 at 20:23:48 UTC, z wrote: On Friday, 5 March 2021 at 12:57:43 UTC, z wrote: ... Then it seems the only way to get AVX-compatible inline assembly(ldc.llvmasm excluded) is to use an external assembler. For example : ... But i'm not really sure how to integrate that into a dub project, it seems «lflags "filename.obj"» and preGenerateCommands/preBuildCommands would work but i haven't tested that.(«dflags "filename.obj"» doesn't work for sure)
Re: Using YMM registers causes an undefined label error
On Friday, 5 March 2021 at 12:57:43 UTC, z wrote: ...

Then it seems the only way to get AVX-compatible inline assembly (ldc.llvmasm excluded) is to use an external assembler. For example:

    import std.stdio;
    extern(C) void vxorps_d(ubyte[32]*);
    void main() {
        ubyte[32] a = 2;
        writefln!"Contents of a before : %( %s %)"(a);
        vxorps_d(&a); //pass the address of the buffer, it ends up in RCX on win64
        writefln!"Contents of a after : %( %s %)"(a);
    }

    BITS 64
    global vxorps_d
    section .text2
    vxorps_d:
        vmovups ymm0, [rcx];
        mov rdx, zerofilled
        vbroadcastss ymm1, [rdx]
        vxorps ymm0, ymm0, ymm1
        vmovups [rcx], ymm0
        ret
    zerofilled: db 0xFF,0xFF,0xFF,0xFF

    nasm -g -f win64 asmfile.asm
    dmd vxorpstest.d asmfile.obj -m64
    ldc vxorpstest.d asmfile.obj -m64
    vxorpstest.exe

    Contents of a before : 2 2 2... (0x02/0b_0010)
    Contents of a after : 253 253 253...(0xFD/0b_1101)
Re: Optimizing for SIMD: best practices?(i.e. what features are allowed?)
On Friday, 26 February 2021 at 03:57:12 UTC, tsbockman wrote: static foreach(size_t i; 0 .. 3/+typeof(a).length+/){ distance += a[i].abs;//abs required by the caller (a * a) above is always positive for real numbers. You don't need the call to abs unless you're trying to guarantee that even nan values will have a clear sign bit. I do not know why but the caller's performance nosedives whenever there is no .abs at this particular line.(there's a 3x difference, no joke.) Same for assignment instead of addition, but with a 2x difference instead.
Re: Optimizing for SIMD: best practices?(i.e. what features are allowed?)
On Thursday, 25 February 2021 at 14:28:40 UTC, Guillaume Piolat wrote: On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote: How does one optimize code to make full use of the CPU's SIMD capabilities? Is there any way to guarantee that "packed" versions of SIMD instructions will be used?(e.g. vmulps, vsqrtps, etc...) https://code.dlang.org/packages/intel-intrinsics I'd try to use it but the platform i'm building on requires AVX to get the most performance.
Re: Optimizing for SIMD: best practices?(i.e. what features are allowed?)
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote: ...

It seems that using static foreach with pointer parameters exclusively is the best way to "guide" LDC into optimizing code. (Using the arr1[] += arr2[] syntax resulted in worse performance for me.) However, AVX512 support seems limited to being able to use the 16 other YMM registers, rather than using the same code template changed to use ZMM registers and doubled offsets to take advantage of the new size. Compiled with «-g -enable-unsafe-fp-math -enable-no-infs-fp-math -ffast-math -O -release -mcpu=skylake» :

    import std.math : sqrt;
    alias simdf = float[8];
    enum simdlength = simdf.length; //not shown in the original post, assumed to be the vector length
    __gshared simdf init = [0f,0f,0f,0f,0f,0f,0f,0f];
    extern(C)//with extern(D)(the default), the assembly output uses one register for two pointers.
    void vEUCLIDpsptr_void(simdf* a0, simdf* a1, simdf* a2, simdf* a3, simdf* b1, simdf* b2, simdf* b3) {
        simdf amm0 = init;//returned simdf
        simdf amm1 = *a1;
        simdf amm2 = *a2;
        simdf amm3 = *a3;
        static foreach(size_t i; 0..simdlength) {
            //Needs to be interleaved like this, otherwise LDC generates worse code.
            amm1[i] -= (*b1)[i]; amm1[i] *= amm1[i];
            amm2[i] -= (*b2)[i]; amm2[i] *= amm2[i];
            amm3[i] -= (*b3)[i]; amm3[i] *= amm3[i];
            amm0[i] += amm1[i]; amm0[i] += amm2[i]; amm0[i] += amm3[i];
            amm0[i] = sqrt(amm0[i]);
        }
        *a0 = amm0;
        return;
    }

    mov r10,qword ptr ss:[rsp+38]
    mov r11,qword ptr ss:[rsp+30]
    mov rax,qword ptr ss:[rsp+28]
    vmovups ymm0,yword ptr ds:[rdx]
    vmovups ymm1,yword ptr ds:[r8]
    vsubps ymm0,ymm0,yword ptr ds:[rax]
    vmovups ymm2,yword ptr ds:[r9]
    vfmadd213ps ymm0,ymm0,yword ptr ds:[<_D12euclideandst4initG8f>]
    vsubps ymm1,ymm1,yword ptr ds:[r11]
    vfmadd213ps ymm1,ymm1,ymm0
    vsubps ymm0,ymm2,yword ptr ds:[r10]
    vfmadd213ps ymm0,ymm0,ymm1
    vsqrtps ymm0,ymm0
    vmovups yword ptr ds:[rcx],ymm0
    vzeroupper
    ret

The speed difference is near 400% for the same number of distances compared with the single-distance function example.
However, the assembly generated isn't the fastest. For example, removing vzeroupper and using the unused and known-zeroed YMM15 register as a zero-filled register operand for the first vfmadd213ps instruction improves performance by 10% (70 vs 78ms for 256 million distances...). The function can then be improved further to use pointer offsets and more registers; this is more efficient and results in a 500%~ improvement :

    extern(C) void vEUCLIDpsptr_void_40(simdf* a0, simdf* a1, simdf* a2, simdf* a3, simdf* b1, simdf* b2, simdf* b3) {
        simdf amm0 = init; simdf amm1 = *a1; simdf amm2 = *a2; simdf amm3 = *a3;
        simdf emm0 = init; simdf emm1 = amm1; simdf emm2 = amm2;//mirror AMM for positions
        simdf emm3 = amm3;
        simdf imm0 = init; simdf imm1 = emm1; simdf imm2 = emm2; simdf imm3 = emm3;
        simdf omm0 = init; simdf omm1 = emm1; simdf omm2 = emm2; simdf omm3 = emm3;
        simdf umm0 = init; simdf umm1 = omm1; simdf umm2 = omm2; simdf umm3 = omm3;
        //cascading assignment may not be the fastest way, especially compared to just loading from the pointer!
        static foreach(size_t i; 0..simdlength) {
            amm1[i] -= (b1[0])[i]; amm1[i] *= amm1[i]; amm0[i] += amm1[i];
            amm2[i] -= (b2[0])[i]; amm2[i] *= amm2[i]; amm0[i] += amm2[i];
            amm3[i] -= (b3[0])[i]; amm3[i] *= amm3[i]; amm0[i] += amm3[i];
            amm0[i] = sqrt(amm0[i]);
            //template
            emm1[i] -= (b1[1])[i]; emm1[i] *= emm1[i]; emm0[i] += emm1[i];
            emm2[i] -= (b2[1])[i]; emm2[i] *= emm2[i]; emm0[i] += emm2[i];
            emm3[i] -= (b3[1])[i]; emm3[i] *= emm3[i]; emm0[i] += emm3[i];
            emm0[i] = sqrt(emm0[i]);
            //
            imm1[i] -= (b1[2])[i]; imm1[i] *= imm1[i]; imm0[i] += imm1[i];
            imm2[i] -= (b2[2])[i]; imm2[i] *= imm2[i]; imm0[i] += imm2[i];
            imm3[i] -= (b3[2])[i]; imm3[i] *= imm3[i]; imm0[i] += imm3[i];
            imm0[i] = sqrt(imm0[i]);
            //
            omm1[i] -= (b1[3])[i]; omm1[i] *= omm1[i]; omm0[i] += omm1[i];
            omm2[i] -= (b2[3])[i]; omm2[i] *= omm2[i]; omm0[i] += omm2[i];
            omm3[i] -= (b3[3])[i]; omm3[i] *= omm3[i]; omm0[i] += omm3[i];
            omm0[i] = sqrt(omm0[i]);
            //
Re: dmd -> ldmd2: /usr/bin/ld.gold: error: .o: multiple definition of 'bool ldc.attributes...
On Saturday, 6 March 2021 at 22:14:26 UTC, kdevel wrote: After replacing dmd with ldmd2 (LDC 1.25.1) I get tons of link errors all of the form mentioned in the subject. Any idea what can be done about it? (With a handcrafted single compile/link statement using ldc2 everything compiles but ideally I want to reuse my Makefile). I think i had a similar error, can you try adding version(LDC) pragma(LDC_no_moduleinfo) to the affected modules? At the line just after the module declaration, particularly in all package.d files and the file that contains the main function. However, your error seems to be with the files inside LDC... I'm not sure if this will solve it.
Re: Using YMM registers causes an undefined label error
On Friday, 5 March 2021 at 16:10:02 UTC, Rumbu wrote: First of all, in 64 bit ABI, parameters are not passed on stack, therefore a[RBP] is a nonsense. void complement32(simdbytes* a, simdbytes* b) a is in RCX, b is in RDX on Windows a is in RDI, b is in RSI on Linux

I'm confused. With your help I've been able to find the function calling convention, but in LDC-generated code I sometimes see the layout reversed (the function I was looking at takes 7 arguments, all pointers; the first argument is on the stack, the seventh and last is in RCX), and the offsets don't seem to make sense either (first argument at ss:[rsp+38], second at ss:[rsp+30], and third at ss:[rsp+28]).

Secondly, there is no such thing as movaps YMMX, [RAX], but vmovaps YMM3, [RAX] Same for vxorps, but there are 3 operands, not 2.

You're absolutely right, but apparently it only accepts the two-operand version from SSE. Other AVX/AVX2/AVX512 instructions that have the «v» prefix aren't recognized either ("Error: unknown opcode vmovaps"). Is AVX(2) with YMM registers supported for «asm{}» statements?
Using YMM registers causes an undefined label error
XMM registers work, but as soon as they are changed into YMM, DMD outputs "bad type/size of operands %s" and LDC outputs a "label YMM0 is undefined" error. Are they not supported? To illustrate : https://run.dlang.io/is/IqDHlK

By the way, how can I use instructions that are not listed in [1]? (vfmaddxxxps for example) And how are function parameters accessed if they are not on the stack? (Looking at my own code in a debugger, I see that the majority of pointer parameters are already in registers rather than on the stack.) I need those so that I can write a better answer for [2]. Big thanks

[1] https://dlang.org/spec/iasm.html#supported_opcodes
[2] https://forum.dlang.org/thread/qyybpvwvbfkhlvulv...@forum.dlang.org
Optimizing for SIMD: best practices?(i.e. what features are allowed?)
How does one optimize code to make full use of the CPU's SIMD capabilities? Is there any way to guarantee that "packed" versions of SIMD instructions will be used? (e.g. vmulps, vsqrtps, etc...) To give some context, this is a sample of one of the functions that could benefit from better SIMD usage :

    import std.math : abs, sqrt;
    float euclideanDistanceFixedSizeArray(float[3] a, float[3] b) {
        float distance = 0; //floats default to NaN in D, so start from zero
        a[] -= b[];
        a[] *= a[];
        static foreach(size_t i; 0 .. 3/+typeof(a).length+/){
            distance += a[i].abs;//abs required by the caller
        }
        return sqrt(distance);
    }

    vmovsd xmm0,qword ptr ds:[rdx]
    vmovss xmm1,dword ptr ds:[rdx+8]
    vmovsd xmm2,qword ptr ds:[rcx+4]
    vsubps xmm0,xmm0,xmm2
    vsubss xmm1,xmm1,dword ptr ds:[rcx+C]
    vmulps xmm0,xmm0,xmm0
    vmulss xmm1,xmm1,xmm1
    vbroadcastss xmm2,dword ptr ds:[<__real@7fff>]
    vandps xmm0,xmm0,xmm2
    vpermilps xmm3,xmm0,F5
    vaddss xmm0,xmm0,xmm3
    vandps xmm1,xmm1,xmm2
    vaddss xmm0,xmm0,xmm1
    vsqrtss xmm0,xmm0,xmm0
    vmovaps xmm6,xmmword ptr ss:[rsp+20]
    add rsp,38
    ret

I've tried to experiment with dynamic arrays of float[3] but the output assembly seemed to be worse.[1] (In short, it's calling internal D functions which use "vxxxss" instructions while performing many moves.) Big thanks

[1] https://run.dlang.io/is/F3Xye3
Struct delegate access corruption
So i've upgraded one of my structs to use the more flexible delegates instead of function pointers but when the member function tries to access the struct's members, the contents are random and the program fails. i've isolated the problem by adding writefln calls before calling the delegate and inside the delegate(the functions are defined in the struct as member functions, the delegate itself is set in the constructor) : In the code that uses the delegate : writefln!"test %s"(a, ); T b = a.d();//the delegate While in the most used delegate : writefln!"test2 %s %s"(this, ); The contents and pointers don't match(they're random, full of 0, -nan, -inf and other invalid values), are they(delegates) supposed to be used like this? Big thanks
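The post does not show the struct itself, so the following is only a guess at one common cause: a delegate created in a struct constructor stores a pointer to that particular instance, and the pointer goes stale once the struct is copied or moved. A sketch of one way around it, assuming the callbacks can be written as static functions that receive the instance explicitly (`Counter`, `op`, `increment` and `step` are made-up names):

```d
struct Counter
{
    int value;

    // A function pointer that receives the instance explicitly,
    // instead of a delegate that remembers the address of one particular copy.
    int function(ref Counter) op;

    static int increment(ref Counter c)
    {
        return ++c.value;
    }

    // Always passes the current instance, so copies of the struct stay independent.
    int step()
    {
        return op(this);
    }
}
```

With this layout, `auto c = Counter(5, &Counter.increment); auto d = c;` gives two independent counters, and `d.step()` modifies only `d`. If a real delegate is needed, allocating the struct on the heap and never copying it would be another option.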
Re: Fastest way to "ignore" elements from array without removal
On Tuesday, 16 February 2021 at 06:03:50 UTC, H. S. Teoh wrote: It depends on what your goal is. Do you want to permanently remove the items from the array? Or only skip over some items while iterating over it? For the latter, see std.algorithm.iteration.filter.

The array itself is read only, so it'll have to be an array of pointers/indexes.

For the former, you can use the read-head/write-head algorithm: keep two indices as you iterate over the array, say i and j: i is for reading (incremented every iteration) and j is for writing (not incremented if array[i] is to be deleted). Each iteration, if j < i, copy array[i] to array[j]. At the end of the loop, assign the value of j to the length of the array. Example:

    int[] array = ...;
    size_t i=0, j=0;
    while (i < array.length) {
        doSomething(array[i]);
        if (!shouldDelete(array[i])) j++;
        if (j < i) array[j] = array[i];
        i++;
    }
    array.length = j;

Basically, the loop moves elements up from the back of the array on top of elements to be deleted. This is done in tandem with processing each element, so it requires only traversing array elements once, and copies array elements at most once for the entire loop. Array elements are also read / copied sequentially, to maximize CPU cache-friendliness. T

This is most likely ideal for what I'm trying to do. (Resizes/removes will probably have to propagate to other arrays.) The only problem is that it does not work with the first element, but I could always just handle the special case on my own.[1] I'll probably use .filter or an equivalent for an initial first pass and this algorithm for the rest, thank you both!

[1] https://run.dlang.io/is/f9p29A (the first element is still there, and the last element is missing. Both occur if the first element didn't pass the check.)
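The first-element problem reported above looks like it comes from the order of operations in the quoted loop: j is advanced before the copy, so a kept element is written one slot too far. Below is a sketch of the same read-head/write-head idea with the copy done first. `compact` is a made-up name, and the per-element `doSomething` call from the quote is left out.

```d
// Removes, in place, every element for which shouldDelete returns true.
void compact(alias shouldDelete, T)(ref T[] array)
{
    size_t j = 0;                   // write head
    foreach (i; 0 .. array.length)  // i is the read head
    {
        if (shouldDelete(array[i]))
            continue;               // dropped: the write head stays where it is
        if (j < i)
            array[j] = array[i];    // copy the kept element first...
        j++;                        // ...then advance the write head
    }
    array.length = j;               // shrinking a slice does not reallocate
}
```

For example, `int[] a = [-1, 2, -3, 4]; compact!(x => x < 0)(a);` should leave `a` as `[2, 4]`, including the case where the very first element is the one being dropped.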
Re: Fastest way to "ignore" elements from array without removal
On Tuesday, 16 February 2021 at 04:43:33 UTC, Paul Backus wrote: On Tuesday, 16 February 2021 at 04:20:06 UTC, z wrote: What would be the overall best manner(in ease of implementation and speed) to arbitrarily remove an item in the middle of an array while iterating through it? http://phobos.dpldocs.info/std.algorithm.iteration.filter.html Does filter support multiple arguments for the predicate?(i.e. using a function that has a "bool function(T1 a, T2 b)" prototype) If not could still implement the function inside the loop but that would be unwieldy. And does it create copies every call? this is important because if i end up using .filter it will be called a 6 to 8 digit number of times.
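On the two-argument question: as far as I know, filter only ever passes one element to its predicate, but the predicate can be a lambda that closes over the extra value. A small sketch, where `closeEnough` and `keepBelow` are hypothetical names standing in for the real predicate and its caller:

```d
import std.algorithm.iteration : filter;
import std.array : array;

// Hypothetical two-argument predicate of the "bool function(T1 a, T2 b)" kind.
bool closeEnough(int candidate, int limit)
{
    return candidate < limit;
}

int[] keepBelow(int[] items, int limit)
{
    // The lambda captures `limit`, so filter still sees a one-argument predicate.
    // filter itself is lazy and does not copy `items`; only .array allocates the result.
    return items.filter!(x => closeEnough(x, limit)).array;
}
```

If the result is only iterated once, dropping the `.array` call and looping over the filtered range directly avoids the allocation altogether.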
Fastest way to "ignore" elements from array without removal
What would be the overall best manner (in ease of implementation and speed) to arbitrarily remove an item in the middle of an array while iterating through it? So far I've thought about simply using D's standard [0..x] ~ [x+1..$] with an array of indexes, but wouldn't that just cause slow reallocations? According to Wikipedia the performance would be suboptimal.[1] I've also thought about using a pointer array and just assigning a null pointer when the item doesn't need to be iterated on, but I'm not sure which method will result in the fastest execution times for the general case where over half of the items will be removed from the index/pointer array. Big thanks

[1] - https://en.wikipedia.org/wiki/Dynamic_array#Performance
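For the index-array variant, one option that avoids a concatenation per removed item is `std.algorithm.mutation.remove` with a predicate, which compacts the array in a single pass. A sketch, assuming the indexes live in a `size_t[]` (`pruneIndexes` is a made-up name):

```d
import std.algorithm.mutation : remove;

// Drops every index for which `keep` returns false, in one pass.
size_t[] pruneIndexes(alias keep)(size_t[] indexes)
{
    // remove!pred shifts the surviving elements forward in place
    // and returns the shortened slice; nothing is concatenated.
    return indexes.remove!(i => !keep(i));
}
```

The returned slice has to be assigned back (`idx = pruneIndexes!(...)(idx);`), since remove cannot change the length of the caller's slice by itself.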
Re: A variation of issue 11977?
On Thursday, 21 January 2021 at 14:11:15 UTC, kdevel wrote: Creating a scope works around the issue. Another way to work around the issue is to use «asm {jmp label;}» (replace jmp by whatever equivalent is there on the target architecture.) Yes it's ugly, but it bypasses the arbitrary limitation perfectly.
Re: writeln and write at CTFE
On Wednesday, 13 January 2021 at 11:50:26 UTC, Andrey wrote: Function "ctfeWriteln" doesn't exist. pragma(msg, ...) is used only for CT values. Today is 2021. Dlang still doesn't have ctfe write functions?

Yes. (afaik) It has shot me in the foot once, to the point that I abandoned the idea of ever accomplishing what I wanted to do at compile time and instead just did it at module construction time, and I must admit that it was immediately easier to debug.
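What does work today is printing the result of a CTFE call once it has become a compile-time value. A minimal sketch (`buildTable` and `table` are made-up names); it does not help with tracing intermediate state inside the function, which is the limitation discussed above:

```d
import std.conv : to;

// Any function that CTFE can evaluate.
string buildTable(int n)
{
    string s;
    foreach (i; 0 .. n)
        s ~= i.to!string ~ (i + 1 < n ? "," : "");
    return s;
}

enum table = buildTable(4);      // initialising an enum forces compile-time evaluation
pragma(msg, "table = ", table);  // printed by the compiler during the build
static assert(table == "0,1,2,3");
```

For conditions rather than values, `static assert` with a message argument can serve as a crude checkpoint in the same way.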
Re: Undefined reference error at linktime with unittests
On Thursday, 10 December 2020 at 14:51:51 UTC, ag0aep6g wrote: ... Thank you for the explanation on mangles. The problem was caused by an «unittest{ void main() }» declaration in an import's source file, and for some reason it had both passed compilation and not resulted in the usual "undefined symbol : mainCRTStartup" error when these problems occur. Hence the confusion.
Undefined reference error at linktime with unittests
When compiling with unit tests (via «dub test», or adding «dflags "-unittest"»), I'm getting this error at link time : lld-link: error: undefined symbol: _D5packagename9subpackage9__mixin119type8toStringMFZAya The same occurs with OPTLINK. Curiously, looking at the offending .lib file with a hexadecimal editor reveals something odd: _D5packagename9subpackage9__mixin109type8toStringMFZAya (the mangled name in the .lib is mixin109, but the linker is complaining that it cannot find a "mixin119" version of the symbol.) Is there something I am doing wrong? I couldn't find documentation on what the digits mean in mangled function names.
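On the digits: in D's name mangling each identifier is preceded by its length in decimal, so the tail of the symbol would read as `__mixin11` (9 characters) followed by the length prefix of the next identifier, rather than as "mixin119". The sketch below splits such a name by hand. `splitMangledNames` is a made-up helper, the symbol in the usage note is invented because the ones in the post are anonymised, and back references used by newer compilers are not handled.

```d
import std.ascii : isDigit;

// Splits the length-prefixed identifiers that follow the "_D" prefix.
// Stops at the first character that is not a length digit (e.g. the type signature).
string[] splitMangledNames(string mangled)
{
    string[] parts;
    size_t pos = 0;
    if (mangled.length >= 2 && mangled[0 .. 2] == "_D")
        pos = 2;
    while (pos < mangled.length && isDigit(mangled[pos]))
    {
        // Read the decimal length prefix.
        size_t len = 0;
        while (pos < mangled.length && isDigit(mangled[pos]))
            len = len * 10 + (mangled[pos++] - '0');
        if (pos + len > mangled.length)
            break; // malformed or truncated input
        // Take exactly `len` characters as the identifier.
        parts ~= mangled[pos .. pos + len];
        pos += len;
    }
    return parts;
}
```

Applied to the invented symbol `_D3pkg9__mixin114type8toStringMFZAya` this gives `pkg`, `__mixin11`, `type`, `toString`. Read that way, the two builds disagree on the mixin counter (`__mixin10` in the library, `__mixin11` expected by the test build).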
Re: Local libraries/packages with dub: How?
On Tuesday, 1 December 2020 at 07:39:31 UTC, rikki cattermole wrote: That isn't right. Thank you, this was the problem apparently. Dub ignored the malformed dependency declaration instead of displaying a warning or an error.(this is apparently a bug.[0][1]) [0] https://github.com/dlang/dub/issues/614 [1] https://github.com/dlang/dub/issues/1382
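For reference, a sketch of what the recipes from this thread might look like with well-formed dependency declarations, assuming the malformed part was the single `dependencies` line (the SDL format uses one `dependency` directive per package, each with a version or path attribute). The package names are the placeholders used in the thread.

```sdl
// root dub.sdl
name "fldr"
targetType "none"
dependency "fldr:spkg1" version="*"
dependency "fldr:spkg2" version="*"
dependency "fldr:spkg3" version="*"
subPackage "./spkg1/"
subPackage "./spkg2/"
subPackage "./spkg3/"

// spkg3/dub.sdl
name "spkg3"
dependency "fldr:spkg2" version="*"
```

The two recipes are shown in one block only for brevity; they belong in separate files.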
Re: Local libraries/packages with dub: How?
On Tuesday, 1 December 2020 at 04:50:03 UTC, rikki cattermole wrote: ...

What puzzles me is that the dependencies are indeed declared, but "dub describe" refuses to recognize the dependencies and "dub build" fails. "dub list" does recognize the root folder, but trying to get a subpackage to import another fails because the dependency isn't imported. (dmd only sees the "source" and phobos/runtime import paths.) The root dub.sdl roughly contains this :

    name "fldr"
    dependencies "fldr:spkg1" "fldr:spkg2" "fldr:spkg3" // tried with and without "fldr:"
    subPackage "./spkg1/"
    subPackage "./spkg2/"
    subPackage "./spkg3/"
    targetType "none"

While for example, "spkg3"'s dub.sdl contains this :

    name "spkg3"
    dependencies "fldr:spkg2"

And its source/*.d file contains this :

    import std.stdio, fldr.spkg2; //tried with and without "fldr."
    void main() {writeln(«function from spkg2»)}
Local libraries/packages with dub: How?
How does one set up a dub package so that it contains multiple sublibraries that may or may not depend on these same libraries? (but never co-dependent) So far I've tried using add-local and add-path with subpackage declarations in the root folder's dub.sdl, but to no avail. (dub does not complain but silently doesn't add the dependencies, as seen with -v; dmd just errors out with the standard file-not-found error "module y is in file y.d which cannot be read", with the path of my libraries absent from the import paths.) Alternatively, is there an equivalent of -version= for DUB's build command? (For when a package author didn't add a config that incorporates the desired versions.) Big thanks, and my apologies if this is the wrong place for DUB discussion.
Re: Alias overload of function
On Sunday, 9 June 2019 at 10:22:36 UTC, Andrey wrote: Hello, I have got 2 functions:

    void myFunc(string name, int a)(wstring value) {}
    void myFunc(string name, int a)() {}

I want to make an alias (for example for second function without argument):

    alias myAlias(int a) = myFunc!("Name", a);

but compiler says: ... matches more than one template declaration So how solve this problem?

Use __traits(getOverloads) to apply to all of them in a static foreach.
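A different route from the `__traits(getOverloads)` suggestion, sketched here as an alternative: replace the alias with a thin forwarding template, so that the overload is chosen at the call, where the arguments are known. The return values are added only so the two overloads can be told apart; the originals return void.

```d
// Stand-ins for the two overloads from the question.
string myFunc(string name, int a)(wstring value) { return name ~ ":with-arg"; }
string myFunc(string name, int a)()              { return name ~ ":no-arg"; }

// Forwarding wrapper used in place of the alias: "Name" is fixed here,
// `a` stays a template parameter and the call arguments are passed through.
auto myAlias(int a, Args...)(Args args)
{
    return myFunc!("Name", a)(args);
}
```

`myAlias!1()` then reaches the overload without parameters and `myAlias!1("x"w)` the one taking a wstring.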