Re: Range handling difficulties
On Wed, Apr 24, 2024 at 08:08:06AM +0000, Menjanahary R. R. via Digitalmars-d-learn wrote:
> I tried to solve Project Euler [problem #2](https://projecteuler.net/problem=2) using [Recurrence/recurrence](https://dlang.org/library/std/range/recurrence.html).
>
> Assuming `genEvenFibonacci` is the appropriate function in explicit form, I got what I need like so:
>
> ```
> auto evenfib = recurrence!genEvenFibonacci(2uL, 8uL);
>
> writeln;
> evenfib.take(11).sum.writeln;
> ```
>
> But that's like cheating, because there is no prior knowledge of `11`.
>
> I just got it manually by peeking at the sequence `[2, 8, 34, 144, 610, 2584, 10946, 46368, 196418, 832040, 3524578, 14930352]`.
>
> `14930352` must be filtered out because it is beyond the limit set!
>
> How do I fix that properly and programmatically, using all the standard library capabilities?
>
> I'm thinking of ranges and/or std.algorithm.

evenfib.until!(n => n > 4_000_000).sum.writeln;

T

-- 
The trouble with TCP jokes is that it's like hearing the same joke over and over.
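Put together as a runnable program, the suggestion looks like this. Note that `genEvenFibonacci` was never shown in the thread; this sketch assumes it encoded the even-Fibonacci recurrence E(n) = 4*E(n-1) + E(n-2), which matches the quoted sequence.

```d
import std.algorithm : sum, until;
import std.range : recurrence;
import std.stdio : writeln;

void main()
{
    // Even Fibonacci numbers: 2, 8, 34, 144, ... where each term is
    // 4*(previous term) + (the term before that).
    auto evenfib = recurrence!"4*a[n-1] + a[n-2]"(2UL, 8UL);

    // until with a predicate stops *before* the first element that
    // satisfies it, so 14930352 (> 4_000_000) is excluded from the sum.
    evenfib.until!(n => n > 4_000_000).sum.writeln;  // 4613732
}
```

No prior knowledge of the term count is needed; `until` derives the cutoff from the limit itself.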
Re: Unittests pass, and then an invalid memory operation happens after?
On Wed, Apr 03, 2024 at 09:57:00PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
> On Friday, 29 March 2024 at 01:18:22 UTC, H. S. Teoh wrote:
> > Take a look at the docs for core.memory.GC. There *is* a method GC.free that you can use to manually deallocate GC-allocated memory if you so wish. Keep in mind, though, that manually managing memory in this way invites memory-related errors. That's not something I recommend unless you're adamant about doing everything the manual way.
>
> Was this function removed from the library? I don't see it in [the document](https://dlang.org/phobos/core_memory.html).

https://dlang.org/phobos/core_memory.html#.GC.free

> How is `GC.free` different from `destroy`?

GC.free is low-level. It does not invoke dtors.

[...]
> When you mention a "flag" to indicate whether they are "live", do you mean like a boolean member variable for the `Unit` object? Like `bool alive;`?

Yes.

> > My advice remains the same: just let the GC do its job. Don't "optimize" prematurely. Use a profiler to test your program and identify its real bottlenecks before embarking on these often needlessly complicated premature optimizations that may turn out to be completely unnecessary.
>
> Alright. I suppose that some of the optimization decisions I have made so far may have resulted in less readable code for little performance benefit. Now I'm trying to worry less about optimization. Everything has been very fast so far.
>
> I haven't used a profiler yet, but I may like to try it.

Never make any optimization decisions without a profiler. I learned the hard way that more often than not, I'm wrong about where my program's bottleneck is, and that I spend far too much time and effort "optimizing" things that don't need to be optimized, while totally missing optimizations where it really matters. Life is too short to be wasted on optimizing things that don't really matter.
When it comes to optimizations, always profile, profile, profile!

[...]
> It's unlikely that I will have multiple maps running simultaneously, unless I do the AI thing mentioned above. I've had a dilemma of passing around references to the tile object vs passing around the coordinates, as is mentioned in an earlier thread that I started. In what way do references slow down performance? Would passing around a pair of coordinates to functions be better?

It's not references themselves that slow things down; it's the likelihood that using reference types when you don't need to can lead to excessive GC allocations, which in turn causes longer GC pauses. Well, excessive dereferencing can also reduce cache coherence, but if you're already at the level where this actually makes a difference, you don't need my advice anymore. :-D

Generally, if a piece of data is transient and not expected to last very long (e.g., past the current frame), it probably should be a struct rather than a class. There are exceptions, of course, but generally that's how I'd decide whether something should be a by-value type or a by-reference type.

T

-- 
"The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."
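A minimal sketch of that rule of thumb (all names here are hypothetical, not from the poster's game): transient per-frame data as a by-value struct, long-lived state as a by-reference class.

```d
// Transient, per-frame data: a struct, passed by value, no GC allocation.
struct Hit
{
    int tileX, tileY;
    int damage;
}

// Long-lived game state: a class, allocated once and referenced everywhere.
class GameMap
{
    int width, height;
    this(int w, int h) { width = w; height = h; }
}

Hit computeHit(int x, int y, int dmg)
{
    return Hit(x, y, dmg);  // lives on the stack; the GC never sees it
}

void main()
{
    auto map = new GameMap(64, 64);  // one GC allocation, lives a long time
    auto h = computeHit(3, 4, 10);   // no allocation at all
    assert(h.damage == 10 && map.width == 64);
}
```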
Re: Inconsistent chain (implicitly converts to int)
On Fri, Apr 05, 2024 at 03:18:09PM +0000, Salih Dincer via Digitalmars-d-learn wrote:
> Hi everyone,
>
> Technically r1 and r2 are different types of range. Isn't it inconsistent to chain both? If not, why is the char type converted to int?
[...]

It's not inconsistent if there exists a common type that both ranges' element types implicitly convert to. The real problem is the implicit conversion of char to int, which I have been against for a long time. Walter, however, disagrees.

T

-- 
What's worse than raining cats and dogs? Hailing taxis!
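A small illustration of the common-type rule being described (r1 and r2 here are stand-ins, since the original snippet wasn't quoted): chaining an int range with a char range yields a range of int, because char implicitly converts to int.

```d
import std.range : chain;

void main()
{
    int[]  r1 = [1, 2, 3];
    char[] r2 = ['a', 'b'];

    // The common element type of the chained range is int, so 'a'
    // comes out as 97 -- this is the char-to-int conversion at work.
    auto r = chain(r1, r2);
    static assert(is(typeof(r.front) == int));
    assert(r.front == 1);
}
```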
Re: Unittests pass, and then an invalid memory operation happens after?
On Thu, Mar 28, 2024 at 11:49:19PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
> On Thursday, 28 March 2024 at 04:46:27 UTC, H. S. Teoh wrote:
> > The whole point of a GC is that you leave everything up to it to clean up. If you want to manage your own memory, don't use the GC. D does not force you to use it; you can import core.stdc.stdlib and use malloc/free to your heart's content.
> >
> > Unpredictable order of collection is an inherent property of GCs. It's not going away. If you don't like it, use malloc/free instead. (Or write your own memory management scheme.)
>
> I disagree with this attitude on how the GC should work. Having to jump immediately from leaving everything behind for the GC to fully manual memory allocation whenever the GC becomes a problem is a problem, which gives legitimacy to the common complaint of D being "garbage-collected". It would be much better if the garbage collector could be there as a backup for when it's needed, while allowing the programmer to write code for object destruction when they want to optimize.

Take a look at the docs for core.memory.GC. There *is* a method GC.free that you can use to manually deallocate GC-allocated memory if you so wish. Keep in mind, though, that manually managing memory in this way invites memory-related errors. That's not something I recommend unless you're adamant about doing everything the manual way.

> > > Anyway, I suppose I'll have to experiment with either manually destroying every object at the end of every unittest, or just leaving more to the GC. Maybe I'll make a separate `die` function for the units, if you think it's a good idea.
> >
> > I think you're approaching this from a totally wrong angle. (Which I sympathize with, having come from a C/C++ background myself.) The whole point of having a GC is that you *don't* worry about when an object is collected.
> > You just allocate whatever you need, and let the GC worry about cleaning up after you. The more you let the GC do its job, the better it will be.
>
> Now you're giving me conflicting advice. I was told that my current destructor functions aren't acceptable with the garbage collector, and you specifically tell me to leave things to the GC. But then I suggest that I "leave more to the GC" and move everything from the Unit destructor to a specialized `die` function that can be called instead of `destroy` whenever they must be removed from the game, which as far as I can see is the only way to achieve the desired game functionality while following your and Steve's advice and not having dangling references. But in response to that, you tell me "I think you're approaching this from the wrong angle". And then right after that, you *again* tell me to "just let the GC worry about cleaning up after you"? Even if I didn't call `destroy` at all during my program, as far as I can see, I would still need the `die` function mentioned to remove a unit on death.

I think you're conflating two separate concepts, and it would help to distinguish between them.

There's the lifetime of a memory-allocated object, which is how long an object remains in the part of the heap that's allocated to it. It begins when you allocate the object with `new`, and ends when the GC finds that it's no longer referenced and collects it.

There's a different lifetime that you appear to be talking about: the logical lifetime of an in-game object (not to be confused with an "object" in the OO sense, though the two may overlap). The (game) object gets created (comes into existence in the simulated game world) at a certain point in game time, until something in the game simulation decides that it should no longer exist (it got destroyed, replaced with another object, whatever).
At that point, it should be removed from the game simulation, and that's probably also what you have in mind when you mentioned your "die" function. And here's the important point: the two *do not need to coincide*.

Here's a concrete example of what I mean. Suppose in your game there's some in-game mechanic that's creating N objects per M turns, and another mechanic that's destroying some of these objects every L turns. If you map these creations/destructions to the object lifetime, you're looking at a *lot* of memory allocations and deallocations throughout the course of your game. Memory allocations and deallocations can be costly; this can become a problem if you're talking about a large number of objects, or if they're being created/destroyed very rapidly (e.g., they are fragments flying out from explosions).

Since most of these objects are identical in type, one way of optimizing the code is to preallocate them: before starting your main loop, you allocate an array of, say, 100 objects. Or 1000, or 10000, however many you anticipate you'll need. These objects aren't actually in the game world yet; you're just holding them in reserve.
Re: Difference between chunks(stdin, 1) and stdin.rawRead?
On Thu, Mar 28, 2024 at 10:10:43PM +0000, jms via Digitalmars-d-learn wrote:
> On Thursday, 28 March 2024 at 02:30:11 UTC, jms wrote:
[...]
> I think I figured it out and the difference is probably in the mode. This documentation https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fread?view=msvc-170 mentions that "If the given stream is opened in text mode, Windows-style newlines are converted into Unix-style newlines. That is, carriage return-line feed (CRLF) pairs are replaced by single line feed (LF) characters."
>
> And rawRead's documentation mentions that "rawRead always reads in binary mode on Windows.", which I guess should have given me a clue. chunks must be using text mode.

It's not so much that chunks is using text mode, but that you opened the file in text mode. On Windows, if you don't want CRLF translation you need to open your file with File(filename, "rb"), not just File(filename, "r"), because the latter defaults to text mode.

T

-- 
There's light at the end of the tunnel. It's the oncoming train.
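A self-contained sketch of the binary-mode point (the file name is hypothetical; on POSIX systems "r" and "rb" behave identically, the distinction only matters on Windows):

```d
import std.file : remove, write;
import std.stdio : File;

void main()
{
    // On Windows, "rb" suppresses the CRLF -> LF translation that
    // text mode ("r") would perform on reads.
    write("example.txt", "line1\r\nline2\r\n");

    auto f = File("example.txt", "rb");
    ubyte[64] buf;
    auto got = f.rawRead(buf[]);
    assert(got.length == 14);  // every byte, carriage returns included
    f.close();
    remove("example.txt");
}
```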
Re: Opinions on iterating a struct to absorb the decoding of a CSV?
On Thu, Mar 28, 2024 at 05:23:39PM +0000, Andy Valencia via Digitalmars-d-learn wrote:
[...]
>     auto t = T();
>     foreach (i, ref val; t.tupleof) {
>         static if (is(typeof(val) == int)) {
>             val = this.get_int();
>         } else {
>             val = this.get_str();
>         }
>     }
>     return t;
>
> So you cue off the type of the struct field, and decode the next CSV field, and put the value into the new struct.
>
> Is there a cleaner way to do this? This _does_ work, and gives me very compact code.

This is pretty clean, and is a good example of DbI. I use the same method in my fastcsv experimental module to transcribe CSV to an array of structs:

https://github.com/quickfur/fastcsv

T

-- 
Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
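A self-contained version of the same pattern, with the poster's get_int/get_str replaced by a simple comma splitter (the Row struct and helper names are hypothetical):

```d
import std.algorithm : splitter;
import std.conv : to;

struct Row { int id; string name; int score; }

Row parseRow(string line)
{
    Row r;
    auto fields = line.splitter(',');
    // Walk the struct's fields; each member's type decides how the
    // next CSV field is decoded.
    foreach (i, ref val; r.tupleof)
    {
        static if (is(typeof(val) == int))
            val = fields.front.to!int;
        else
            val = fields.front;
        fields.popFront();
    }
    return r;
}

void main()
{
    assert(parseRow("7,alice,42") == Row(7, "alice", 42));
}
```

Adding a field to `Row` automatically extends the parser, which is the "design by introspection" payoff.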
Re: Unittests pass, and then an invalid memory operation happens after?
On Thu, Mar 28, 2024 at 03:56:10AM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
[...]
> I may be now starting to see why the use of a garbage collector is such a point of contention for D. Not being able to predict how the garbage collection process will happen seems like a major problem.

If you want it to be predictable, simply:

    import core.memory;
    GC.disable();
    ... // insert your code here
    if (timeToCleanup()) {
        GC.collect(); // now you know exactly when this happens
    }

Of course, you'll have to know exactly how timeToCleanup() should decide when it's time to collect. Simple possibilities are once every N units of time, once every N iterations of some main loop, etc. Or use a profiler to decide.

> > As mentioned, GCs do not work this way -- you do not need to worry about cascading removal of anything.
>
> Wanting to avoid the GC pauses that I hear about, I was trying to optimize object deletion so that the GC doesn't have to look for every object individually. It sounds like what I'm hearing is that I should just leave everything to the GC. While I can do this without really hurting the performance of my program (for now), I don't like this.

The whole point of a GC is that you leave everything up to it to clean up. If you want to manage your own memory, don't use the GC. D does not force you to use it; you can import core.stdc.stdlib and use malloc/free to your heart's content.

> I hope that solving the unpredictable destruction pattern is a priority for the developers of the language. This problem in my program wouldn't be happening if either *all* of the objects had their destructors called or *none* of them did.

Unpredictable order of collection is an inherent property of GCs. It's not going away. If you don't like it, use malloc/free instead. (Or write your own memory management scheme.)
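A runnable sketch of that schedule, using a turn counter as the hypothetical timeToCleanup() condition (the trigger is the part you would tune for your own program):

```d
import core.memory : GC;

void main()
{
    GC.disable();  // no automatic collections from here on

    enum turnsPerCollect = 100;  // hypothetical schedule
    foreach (turn; 0 .. 1000)
    {
        // ... per-turn game logic that may allocate ...
        auto scratch = new int[](16);

        // Collect on our own schedule: every 100 turns.
        if (turn % turnsPerCollect == turnsPerCollect - 1)
            GC.collect();  // now you know exactly when this happens
    }

    GC.enable();
}
```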
> Anyway, I suppose I'll have to experiment with either manually destroying every object at the end of every unittest, or just leaving more to the GC. Maybe I'll make a separate `die` function for the units, if you think it's a good idea.
[...]

I think you're approaching this from a totally wrong angle. (Which I sympathize with, having come from a C/C++ background myself.) The whole point of having a GC is that you *don't* worry about when an object is collected. You just allocate whatever you need, and let the GC worry about cleaning up after you. The more you let the GC do its job, the better it will be.

Now of course there are situations where you need deterministic destruction, such as freeing up system resources as soon as they're no longer needed (file descriptors, OS shared memory segment allocations, etc.). For these you would manage the memory manually (e.g. with a struct that implements reference counting or whatever is appropriate).

As far as performance is concerned, a GC actually has higher throughput than manually freeing objects, because in a fragmented heap situation, freeing objects immediately when they go out of use incurs a lot of random-access RAM roundtrip costs, whereas a GC that scans memory for references can amortize some of this cost to a single period of time.

Now somebody coming from C/C++ would immediately cringe at the thought that a major GC collection might strike at the least opportune time. For that, I'd say:

(1) Don't fret about it until it actually becomes a problem. I.e., your program is slow and/or has bad response times, and the profiler is pointing to GC collections as the cause. Then you optimize appropriately with the usual practices for GC optimization: preallocate before your main loop, avoid frequent allocations of small objects (prefer to use structs rather than classes), reuse previous allocations instead of allocating new memory when you know that an existing object is no longer used.
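The preallocation advice in (1) can be sketched as a fixed pool with a "live" flag, so that in-game "creation" and "destruction" stop being memory allocation events entirely (all names here are hypothetical):

```d
struct Fragment
{
    float x, y;
    bool alive;  // in the game world, or waiting in the pool?
}

struct FragmentPool
{
    Fragment[] pool;

    // One allocation up front, before the main loop.
    this(size_t capacity) { pool = new Fragment[](capacity); }

    // "Create": revive a dead slot; no GC traffic per spawn.
    Fragment* spawn(float x, float y)
    {
        foreach (ref f; pool)
            if (!f.alive) { f = Fragment(x, y, true); return &f; }
        return null;  // pool exhausted
    }

    // "Destroy": just mark it dead; the memory stays in the pool for reuse.
    void kill(Fragment* f) { f.alive = false; }
}

void main()
{
    auto pool = FragmentPool(100);
    auto f = pool.spawn(1.0f, 2.0f);
    assert(f !is null && f.alive);
    pool.kill(f);       // logical death...
    assert(!f.alive);   // ...with no deallocation involved
}
```

Here the game-object lifetime and the memory lifetime are fully decoupled, which is exactly the distinction drawn earlier in the thread.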
In D, you can also selectively allocate certain troublesome objects with malloc/free instead (mixing both types of allocations is perfectly fine in D; we are not Java, where you're forced to use the GC no matter what).

(2) Use D's GC control mechanisms to exercise some control over when collections happen. By default, collections ONLY ever get triggered if you try to allocate something and the heap has run out of memory. Ergo, if you don't allocate anything, GC collections are guaranteed not to happen. Use GC.disable and GC.collect to control when collections happen. In one of my projects, I got a 40% performance boost by using GC.disable and using my own schedule of GC.collect, because the profiler revealed that collections were happening too frequently. The exact details of what to do will depend on your project, of course, but my point is, there are plenty of tools at your disposal to exercise some degree of control.

Or if (1) and (2) are not enough for your particular case, you can always resort to the nuclear option: slap @nogc on main() and use malloc/free exclusively.
Re: Unittests pass, and then an invalid memory operation happens after?
On Wed, Mar 27, 2024 at 09:43:48PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
[...]
> ```
> ~this() {
>     this.alive = false;
>     if (this.map !is null) this.map.removeUnit(this);
>     if (this.faction !is null) this.faction.removeUnit(this);
>     if (this.currentTile !is null) this.currentTile.occupant = null;
> }
> ```
[...]

What are the definitions of this.map, this.faction, and this.currentTile? If any of them are class objects, this would be the cause of your problem. Basically, when the dtor runs, there is no guarantee that any referenced class objects haven't already been collected by the GC. So if you try to access them, it will crash with an invalid memory access.

In general, it's a bad idea to do anything that relies on the dtor being run in a particular order, because the GC can collect dead objects in any order. It's also illegal to perform any GC memory-related operations inside a dtor (like allocating or freeing memory) because the GC is not reentrant.

If you need deterministic cleanup of your objects, you should do it before the last reference to the object is deleted.

T

-- 
He who does not appreciate the beauty of language is not worthy to bemoan its flaws.
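One way to restructure the quoted dtor along these lines: move the cleanup into an ordinary method that is called explicitly while every reference is still guaranteed valid, and keep the dtor free of any access to other GC-managed objects. Map/Faction/Tile below are minimal stand-ins for the poster's actual types:

```d
class Map     { void removeUnit(Unit u) { /* ... */ } }
class Faction { void removeUnit(Unit u) { /* ... */ } }
class Tile    { Unit occupant; }

class Unit
{
    bool alive = true;
    Map map;
    Faction faction;
    Tile currentTile;

    // Deterministic cleanup: call this when the unit leaves the game,
    // while map/faction/currentTile are guaranteed to still be live.
    void die()
    {
        alive = false;
        if (map !is null) map.removeUnit(this);
        if (faction !is null) faction.removeUnit(this);
        if (currentTile !is null) currentTile.occupant = null;
        map = null; faction = null; currentTile = null;
    }

    // No dtor: never touch referenced class objects during collection.
}

void main()
{
    auto u = new Unit;
    u.currentTile = new Tile;
    u.currentTile.occupant = u;
    auto t = u.currentTile;  // keep a reference to check afterwards
    u.die();
    assert(!u.alive && t.occupant is null);
}
```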
Re: Challenge: Make a data type for holding one of 8 directions allowing increment and overflow
On Sat, Mar 16, 2024 at 09:16:51PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
> On Friday, 15 March 2024 at 00:21:42 UTC, H. S. Teoh wrote:
[...]
> > When dealing with units of data smaller than a byte, you generally need to do it manually, because memory is not addressable by individual bits, making it difficult to implement things like slicing an array of bool.
[...]
> I'm curious as to what "manual implementation" would mean, since clearly making my own struct with `bool[3]` doesn't count. Does D have features for precise memory manipulation?

Manual implementation as in you would deal with the machine representation in terms of bytes, or more likely, uints (on modern CPUs, even though bytes are individually addressable, the hardware actually works in terms of a larger unit, typically a 4-byte 32-bit unit or an 8-byte 64-bit unit), using bitwise operators to manipulate the bits the way you want to.

> Anyway, I'm surprised that D has a special operator `&=` for doing bit manipulation on integers, especially given that the steps to convert an int into a bool array are more complicated. I would imagine the former would be a rather niche thing.

You should understand that bitwise operators are directly implemented in hardware, and thus operators like &, |, ^, <<, >>, ~, etc., typically map directly to individual CPU instructions. As such, they are very fast, and preferred when you're doing bit-level manipulations. At this level, you typically do not work with individual bits per se, but with machine words (typically 32-bit or 64-bit units). Bitwise operators operate on all 32 or 64 bits at once, so performance-aware code typically manipulates all these bits simultaneously rather than individually. Of course, using suitable bit-masking you *can* address individual bits, but the hardware instructions themselves typically work with all 32/64 bits at once.

Here's a simple example. Suppose you have 3 bits you want to store.
Since the machine doesn't have a 3-bit built-in type, you typically just use the next larger available size: either a ubyte (8 bits) if you want compact storage, or, if compactness isn't a big issue, just a uint (32 bits; you just ignore the other 29 bits that you don't need). So you'd declare the storage something like this:

    uint myBits;

Bits are usually indexed from 0, so bit 0 is the first position, bit 1 is the second position, and so on. So to set the first bit to 1, you'd do:

    myBits |= 0b001;

Note that at the machine level, this operator works on all 32 bits at the same time. Most of the bits remain unchanged, though, because bitwise OR does not change the original value if the operand is 0. So the overall effect is that the first bit is set.

To set the first bit to 0, there isn't a direct operator that does that, but you can take advantage of the behaviour of bitwise AND, in which any bit which is 0 in the operand will get cleared, while everything else remains unchanged. So you'd do this:

    myBits &= 0b110;

Now, since we don't really care about the other 29 bits, we could write this as follows instead, to make our intent clearer:

    myBits &= ~0b001;

The ~ operator flips all the bits, so this is equivalent to writing:

    myBits &= 0b1111_1111_1111_1111_1111_1111_1111_1110;

Writing it with ~ also has the advantage that should we later decide to add another bit to our "bit array", we don't have to update the code; whereas if we'd used `myBits &= 0b110;` then we'd need to change it to `myBits &= 0b1110;`, otherwise our new 4th bit may get unexpectedly cleared when we only wanted to clear the first bit.

Now, what if we wanted to set both the 1st and 3rd bits?
In a hypothetical bit array implementation, we'd do the equivalent of:

    bool[3] myBits;
    myBits[0] = 1;
    myBits[2] = 1;

However, in our uint approach, we can cut the number of operations by half, because the CPU is already operating on the entire 32 bits of the uint at once -- so there's no need to have two instructions to set two individual bits when we could just do it all in one:

    myBits |= 0b101; // look, ma! both bits set at once!

Similarly, to clear the 1st and 3rd bits simultaneously, we simply write:

    myBits &= ~0b101; // clear both bits in 1 instruction!

Of course, when we only have 3 bits to work with, the savings aren't that significant. However, if you have a larger bit array, say an array of 32 bits, this can speed your code up by 32x, because you're taking advantage of the fact that the hardware is already operating on all 32 bits at the same time. On 64-bit CPUs, you can speed it up by 64x, because the CPU operates on all 64 bits simultaneously, so you can manipulate an entire array of 64 bits in a single instruction, which is 64x faster than looping over an array of bool with 64 iterations.

T

-- 
Without outlines, life would be pointless.
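The operations described above, collected into a runnable snippet:

```d
void main()
{
    uint myBits;

    myBits |= 0b101;      // set bits 0 and 2 in a single operation
    assert(myBits == 0b101);

    myBits &= ~0b001;     // clear bit 0, leave everything else alone
    assert(myBits == 0b100);

    myBits &= ~0b101;     // clear bits 0 and 2 at once
    assert(myBits == 0);
}
```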
Re: Challenge: Make a data type for holding one of 8 directions allowing increment and overflow
On Thu, Mar 14, 2024 at 11:39:33PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
[...]
> I tried to rework the functions to use bitwise operations, but it was difficult to figure out the correct logic. I decided that it's not worth the hassle, so I just changed the value storage from `bool[3]` to `ubyte`.
[...]

Just wanted to note that while in theory bool[3] could be optimized by the compiler for compact storage, what you're most likely to get is 3 bytes, one for each bool, or perhaps even 3 ints (12 bytes). When dealing with units of data smaller than a byte, you generally need to do it manually, because memory is not addressable by individual bits, making it difficult to implement things like slicing an array of bool. So the compiler is most likely to simplify things by making it an array of bytes rather than emit complex bit-manipulation code to make up for the lack of bit-addressability in the underlying hardware.

Using bit operators like others have pointed out in this thread is probably the best way to implement what you want.

T

-- 
LINUX = Lousy Interface for Nefarious Unix Xenophobes.
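In D specifically, the layout is easy to check: a static array of bool uses one byte per element, with no bit-packing, while a ubyte holds all three flags in a single byte.

```d
void main()
{
    bool[3] flags;
    // One byte per bool: no bit-packing is performed.
    static assert(flags.sizeof == 3);

    // Packing the same 3 flags manually into a single byte:
    ubyte packed;
    static assert(packed.sizeof == 1);
}
```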
Re: varargs when they're not all the same type?
On Thu, Mar 14, 2024 at 08:58:21PM +0000, Andy Valencia via Digitalmars-d-learn wrote:
> On Thursday, 14 March 2024 at 18:05:59 UTC, H. S. Teoh wrote:
> > ...
> > The best way to do multi-type varargs in D is to use templates:
> >
> >     import std;
> >     void myFunc(Args...)(Args args) {
>
> Thank you. The first parenthetical list is of types, is it not? I can't find anywhere which says what "type" is inferred for "Args..."? (gdb pretends like "arg" is not a known symbol.) Is it basically a tuple of the suitable type?
[...]

The first set of parentheses specifies compile-time arguments. The specification `Args...` means "zero or more types". So it could be any list of types, which naturally would be chosen according to the arguments given. For example, to pass an int and a float, you'd do something like:

    myFunc!(int, float)(123, 3.14159f);

and to pass a string, two ints, and a char, you'd write:

    myFunc!(string, int, int, char)("abc", 123, 456, 'z');

Having to specify types manually, of course, is a lot of unnecessary typing, since the compiler already knows what the types are based on what you write in the second pair of parentheses. For this reason, typical D code will omit the first pair of parentheses (the `!(...)`, that is, the compile-time arguments) and just let the compiler infer the types automatically:

    myFunc(123, 3.14159f);        // compiler figures out Args = (int, float)
    myFunc("abc", 123, 456, 'z'); // compiler figures out Args = (string, int, int, char)

T

-- 
A program should be written to model the concepts of the task it performs rather than the physical world or a process, because this maximizes the potential for it to be applied to tasks that are conceptually similar and, more important, to tasks that have not yet been conceived. -- Michael B. Allen
Re: varargs when they're not all the same type?
On Thu, Mar 14, 2024 at 05:57:21PM +0000, Andy Valencia via Digitalmars-d-learn wrote:
> Can somebody give me a starting point for understanding variadic functions? I know that we can declare them
>
>     int[] args...
>
> and pick through whatever the caller provided. But if the caller wants to pass two ints and a _string_? That declaration won't permit it.
>
> I've looked into the formatter, and also the varargs implementation. But it's a bit of a trip through a funhouse full of mirrors. Can somebody describe the basic language approach to non-uniform varargs, and then I can take it the rest of the way reading the library.
[...]

The best way to do multi-type varargs in D is to use templates:

    import std;
    void myFunc(Args...)(Args args) {
        foreach (i, arg; args) {
            // .stringof turns the type into a printable string
            writefln("parameter %d is a %s with value %s",
                     i, typeof(arg).stringof, arg);
        }
    }
    void main() {
        myFunc(123, 3.14159, "blah blah", [ 1, 2, 3 ], new Object());
    }

D also supports C-style varargs (without templates), but they're not recommended because they're not type-safe. You can find the description in the language docs.

T

-- 
"Maybe" is a strange word. When mom or dad says it it means "yes", but when my big brothers say it it means "no"! -- PJ jr.
Re: Hidden members of Class objects
On Wed, Mar 06, 2024 at 11:39:13PM +0000, Carl Sturtivant via Digitalmars-d-learn wrote:
> I notice that a class with no data members has a size of two words (at 64 bits). Presumably there's a pointer to a table of virtual functions, and one more. Is the Vtable first?
[...]
> What is actually in these objects using that space?

In D, there's a pointer to the vtable and another pointer to a Monitor object (used for synchronized methods). There was talk about getting rid of the Monitor field years ago, but nothing has happened yet.

T

-- 
MAS = Mana Ada Sistem?
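The two hidden words are easy to observe directly:

```d
// An empty class instance still occupies two pointers' worth of space:
// the vtable pointer plus the monitor pointer.
class Empty { }

void main()
{
    static assert(__traits(classInstanceSize, Empty) == 2 * size_t.sizeof);
}
```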
Re: Array types and index types
On Wed, Feb 28, 2024 at 03:00:55AM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
> In D, it appears that dynamic arrays (at least by default) use a ulong as their key type. They are declared like this:
>
>     string[] dynamicArray;
>
> I imagine that using a 64-bit value as the key would be slower than using 32 bits or 16 bits,

Wrong. The machine uses 64 bits internally anyway. Well, 48 on i386. But the point is that there is no speed difference. Also, on 32-bit architectures size_t is aliased to uint, which is 32 bits.

[...]
> So I have some questions:
>
> Is there a way to declare a dynamic array with a uint, ushort, or ubyte key?

No.

> If there was, would it really be faster?

No.

> Is an associative array with a ushort key faster than a dynamic array with a ulong key?

No.

T

-- 
Error: Keyboard not attached. Press F1 to continue. -- Yoon Ha Lee, CONLANG
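To make the "ulong key" observation precise: the index and length type of a dynamic array is size_t, which is merely an alias that follows the target's pointer width.

```d
void main()
{
    string[] dynamicArray = ["a", "b", "c"];

    // size_t aliases ulong on 64-bit targets and uint on 32-bit
    // targets; it is not a choice the programmer makes per array.
    static assert(is(typeof(dynamicArray.length) == size_t));

    foreach (size_t i, s; dynamicArray)
        assert(dynamicArray[i] == s);
}
```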
Re: what was the problem with the old post blit operator already ?
On Thu, Feb 15, 2024 at 02:17:15AM +0000, Basile B. via Digitalmars-d-learn wrote:
> From what I remember, it was that there was no reference to the source. Things got blitted and you had to fix the copy, already blitted. Was that the only issue?

I don't quite remember all of the reasons now. But yeah, one of the problems with postblit was that you don't have access to the original copy. That precludes some applications where you need to look up data from the original or update the original. And if you have immutable fields, they've already been blitted and you can't fix them anymore, not without casting away immutable and putting yourself in UB zone.

There may have been other issues with postblit; I don't quite remember now.

T

-- 
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth
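The contrast can be shown with D's copy constructor, which replaced the postblit: where `this(this)` only sees the already-blitted copy, the copy constructor receives the source explicitly and can read it while fixing up the copy.

```d
struct S
{
    int[] data;

    // Copy constructor (postblit's replacement): `rhs` is the source
    // object, available for inspection -- the very thing postblit lacked.
    this(ref return scope S rhs)
    {
        data = rhs.data.dup;
    }
}

void main()
{
    S a;
    a.data = [1, 2, 3];
    auto b = a;              // invokes the copy constructor
    b.data[0] = 99;
    assert(a.data[0] == 1);  // the copy did not share a's array
}
```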
Re: length's type.
On Tue, Feb 13, 2024 at 06:36:22PM +0000, Nick Treleaven via Digitalmars-d-learn wrote:
> On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:
[...]
> > Honestly, I think this issue is blown completely out of proportion. The length of stuff in any language needs to be some type. D decided on an unsigned type. You just learn that and adapt your code accordingly, end of story. Issues like these can always be argued both ways, and the amount of energy spent in these debates far outweighs the trivial workarounds in code, of which there are many (use std.conv.to for bounds checks, just outright cast it if you know what you're doing (or are just foolhardy), use CheckedInt, etc.). And the cost of any change to the type now also far, far outweighs any meager benefits it may have brought. It's just not worth it, IMNSHO.
>
> I don't want the type of .length to change; that indeed would be too disruptive. What I want is proper diagnostics, like any well-regarded C compiler gives, when I mix or implicitly convert unsigned and signed types.

I agree, mixing signed/unsigned types in the same expression ought to require a cast, and error out otherwise. Allowing them to be freely mixed, or worse, implicitly converted to each other, is just too error-prone.

> Due to D's generic abilities, it's easier to make wrong assumptions about whether some integer is signed or unsigned. But even without that, C compilers accepted that this is a task for the compiler to diagnose rather than humans, because it is too bug-prone for humans.

Indeed.

T

-- 
You only live once.
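The classic trap such a diagnostic would catch: in a mixed comparison, the signed operand is implicitly converted to size_t, and D currently accepts this silently.

```d
import std.conv : to;

void main()
{
    int[] a = [1, 2, 3];
    int i = -1;

    // i is implicitly converted to size_t for the comparison, becoming
    // size_t.max -- so "-1 < 3" evaluates to false here, silently.
    assert(!(i < a.length));

    // Converting the length instead gives the arithmetic you meant
    // (to!int throws if the length wouldn't fit):
    assert(i < a.length.to!int);
}
```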
Re: length's type.
On Mon, Feb 12, 2024 at 07:34:36PM +0000, bachmeier via Digitalmars-d-learn wrote:
> On Monday, 12 February 2024 at 18:22:46 UTC, H. S. Teoh wrote:
>
> > Honestly, I think this issue is blown completely out of proportion.
>
> Only for people that don't have to deal with the problems it causes.

I've run into size_t vs int issues many times. About half the time it exposed fallacious assumptions on my part about value types. The other half of the time a simple cast or std.conv.to invocation solved the problem.

My guess is that the most common uses of .length in your typical D code are (1) passing it to code that expects a length for various reasons, and (2) in loop conditions to avoid overrunning a buffer or overshooting some range. (1) is a non-problem, and 90% of (2) is solved by using constructs like foreach() and/or ranges instead of overly-clever arithmetic involving length, which is almost always wrong or unnecessary. If you need to do subtraction with lengths, that's a big red flag that you're approaching your problem from the wrong POV. About the only time you need to do arithmetic with lengths is in low-level code like allocators or array copying, for which you really should be using higher-level constructs instead.

> > D decided on an unsigned type. You just learn that and adapt your code accordingly, end of story. Issues like these can always be argued both ways, and the amount of energy spent in these debates far outweighs the trivial workarounds in code, of which there are many (use std.conv.to for bounds checks, just outright cast it if you know what you're doing (or are just foolhardy), use CheckedInt, etc.).
>
> A terrible language is one that makes you expend your energy thinking about workarounds rather than solving your problems. The default should be code that works. The workarounds should be for cases where you want to do something extremely unusual, like subtracting from an unsigned type and having it wrap around.
Yes, if I had my way, implicit conversions to/from unsigned types would be a compile error. As would comparisons between signed/unsigned values. But regardless, IMNSHO any programmer worth his wages ought to learn what an unsigned type is and how it works. A person should not be writing code if he can't even be bothered to learn how the machine he's programming actually works. To quote Knuth: People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth One of the reasons Walter settled on size_t being unsigned is that this reflects how the hardware actually works. Computer arithmetic is NOT highschool arithmetic; you do not have infinite width nor infinite precision, and you're working with binary, not decimal. This has consequences, and having the language pretend the distinction doesn't exist does not solve any problems. If an architectural astronaut works at such a high level of abstraction that he doesn't even understand basic things about the hardware, like how uint or ulong work and how to use them correctly, maybe he should be promoted to a managerial role instead of writing code. T -- You are only young once, but you can stay immature indefinitely. -- azephrahel
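The signed/unsigned comparison mentioned above really does compile silently today; a tiny illustration (the array is made up):

```d
void main()
{
    int[] arr = [1, 2, 3];
    int i = -1;

    // i is implicitly converted to size_t for the comparison,
    // becoming size_t.max, so this is false even though -1 < 3
    // in highschool arithmetic. The compiler accepts it silently.
    assert(!(i < arr.length));
}
```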
Re: length's type.
On Mon, Feb 12, 2024 at 05:26:25PM +, Nick Treleaven via Digitalmars-d-learn wrote: > On Friday, 9 February 2024 at 15:19:32 UTC, bachmeier wrote: > > It's been discussed many, many times. The behavior is not going to > > change - there won't even be a compiler warning. (You'll have to > > check with the leadership for their reasons.) > > Was (part of) the reason because it would disrupt existing code? If > that was the blocker then editions are the solution. Honestly, I think this issue is blown completely out of proportion. The length of stuff in any language needs to be some type. D decided on an unsigned type. You just learn that and adapt your code accordingly, end of story. Issues like these can always be argued both ways, and the amount of energy spent in these debates far outweighs the trivial workarounds in code, of which there are many (use std.conv.to for bounds checks, just outright cast it if you know what you're doing (or are just foolhardy), use CheckedInt, etc.). And the cost of any change to the type now also far, far outweighs any meager benefits it may have brought. It's just not worth it, IMNSHO. T -- Verbing weirds language. -- Calvin (& Hobbes)
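The first two workarounds listed are one-liners; a quick sketch (std.conv.to throws on a lossy conversion, the cast is the "I know what I'm doing" route):

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

void main()
{
    size_t len = 42;
    int n = len.to!int;     // bounds-checked: throws if it doesn't fit
    assert(n == 42);

    size_t huge = size_t.max;
    assertThrown!ConvOverflowException(huge.to!int);

    auto m = cast(int) huge;  // unchecked: silently truncates/wraps
    assert(m == -1);
}
```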
Re: The difference between the dates in years
On Sat, Feb 10, 2024 at 03:53:09PM +, Alexander Zhirov via Digitalmars-d-learn wrote: > Is it possible to calculate the difference between dates in years > using regular means? Something like that > > > ``` > writeln(Date(1999, 3, 1).diffMonths(Date(1999, 1, 1))); > ``` > > At the same time, keep in mind that the month and day matter, because > the difference between the year, taking into account the month that > has not come, will be less. > > My abilities are not yet enough to figure it out more elegantly. IIRC you can just subtract two DateTimes to get a Duration that you can then convert into whatever units you want. The only thing is, in this case conversion to months may not work, because months don't have a fixed duration (they can vary from 28 to 31 days), so there is no "correct" way of computing it; you need to program it yourself according to the exact calculation you want. T -- PNP = Plug 'N' Pray
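A sketch of both points: Date subtraction yields a Duration directly, while a years-with-month-and-day-taken-into-account difference has to be spelled out by hand (diffYears below is a hypothetical helper, not a Phobos function):

```d
import std.datetime.date : Date;
import std.stdio : writeln;

// Hypothetical helper: full years elapsed between two dates, counting
// a year only once the month and day have actually passed.
int diffYears(Date later, Date earlier)
{
    int years = later.year - earlier.year;
    if (later.month < earlier.month ||
        (later.month == earlier.month && later.day < earlier.day))
        years--;    // the anniversary hasn't come yet this year
    return years;
}

void main()
{
    writeln(diffYears(Date(2024, 2, 10), Date(2000, 3, 1)));  // 23

    // Subtracting two Dates gives a Duration:
    auto dur = Date(1999, 3, 1) - Date(1999, 1, 1);
    writeln(dur.total!"days");  // 59
}
```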
Re: std.uni CodepointSet toString
On Thu, Feb 08, 2024 at 06:22:29PM +, Carl Sturtivant via Digitalmars-d-learn wrote: > On Wednesday, 7 February 2024 at 17:11:30 UTC, H. S. Teoh wrote: > > Do we know why the compiler isn't getting it right? Shouldn't we be > > fixing it instead of just turning off elision completely? > > This matter seems to have been an issue for some time. > https://forum.dlang.org/post/l5e5hm$1177$1...@digitalmars.com 11 years and we still haven't fixed all the problems?! That's ... wow. I've recently run into the same problem myself and had to use -allinst in order to compile my project. Maybe I should dustmite it and submit a report. But given it's been 11 years, I'm not sure if this is worth my time. T -- "No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
Re: std.uni CodepointSet toString
On Thu, Feb 08, 2024 at 05:44:59AM +1300, Richard (Rikki) Andrew Cattermole via Digitalmars-d-learn wrote: > On 08/02/2024 5:36 AM, Carl Sturtivant wrote: [...] > > ``` > > $ dmd --help | grep allinst > > -allinst generate code for all template instantiations > > ``` > > Unclear exactly how -allinst does this, given type parameters, and > > it will affect all of the many templates I use in source with > > CodepointSet. > > > > Can you shed any light? > > Basically the compiler will by default try to elide templates it > thinks isn't used. > > However it doesn't always get this right, which this flag overrides by > turning it off. Do we know why the compiler isn't getting it right? Shouldn't we be fixing it instead of just turning off elision completely? T -- Let's call it an accidental feature. -- Larry Wall
Re: trouble with associative Arrays
On Sat, Jan 20, 2024 at 02:33:24PM +, atzensepp via Digitalmars-d-learn wrote: > Hello, > > I am new with D and want to convert a c program for a csv file manipulation > with exhaustive dynamic memory mechanics to D . > > When reading a CSV-file line by line I would like to create an associative > array to get the row values by the value in the second column. > Although I save the rows in an array (to get different pointers to the > values) the program below always gives the last row. [...] Because .byLine reuses its line buffer. You want .byLineCopy instead. T -- Everybody talks about it, but nobody does anything about it! -- Mark Twain
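A minimal sketch of the difference (the file name and CSV layout are made up for illustration; the sample input is written out first so the example is self-contained):

```d
import std.array : split;
import std.file : write;
import std.stdio : File;

void main()
{
    // Tiny sample input, keyed by the second column.
    write("data.csv", "1,foo,a\n2,bar,b\n");

    string[][string] rowsByKey;
    auto f = File("data.csv");

    // Buggy: .byLine reuses one internal buffer, so every slice kept
    // from it ends up aliasing the most recently read line.
    // foreach (line; f.byLine) { ... }

    // Correct: .byLineCopy allocates a fresh string per line.
    foreach (line; f.byLineCopy)
    {
        auto cols = line.split(",");
        if (cols.length > 1)
            rowsByKey[cols[1]] = cols;
    }

    assert(rowsByKey["foo"] == ["1", "foo", "a"]);
    assert(rowsByKey["bar"] == ["2", "bar", "b"]);
}
```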
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Sat, Jan 20, 2024 at 01:35:44AM +0100, Daniel Kozak via Digitalmars-d-learn wrote: [...] >> Try addressing the points I wrote above and see if it makes a >> difference. > >I have tried it (all of it) even before you wrote it here, because >I have completely the same ideas, but to be fair it has almost zero >effect on speed. >There is my version (It still use OOP, but I have try it wit >Printer and Counter to be structs and it has no effect at >all) [2]https://paste.ofcode.org/38vKWLS8DHRazpv6MTidRJY >The only difference in speed in the end is caused by hash >implementation of dlang associative arrays and rust HashMap, >actually if you modify rust to not used ahash it has almost same >speed as D [...] I'm confused by the chained hashing of the digits. Why is that necessary? I would have thought it'd be faster to hash the entire key instead of computing the hash of each digit and chaining them together. I looked up Rust's ahash algorithm. Apparently they leverage the CPU's hardware AES instruction to compute a collision-resistant hash very quickly. Somebody should file a bug on druntime to implement this where the hardware supports it, instead of the current hashOf. For relatively small keys this would be a big performance boost. T -- Valentine's Day: an occasion for florists to reach into the wallets of nominal lovers in dire need of being reminded to profess their hypothetical love for their long-forgotten.
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Fri, Jan 19, 2024 at 01:40:39PM +, Renato via Digitalmars-d-learn wrote: > On Friday, 19 January 2024 at 10:15:57 UTC, evilrat wrote: [...] > > Additionally if you comparing D by measuring DMD performance - > > don't. It is valuable in developing for fast iterations, but it > > lacks many modern optimization techniques, for that we have LDC and > > GDC. > > I tried with DMD again, and yeah, it's much slower. For anything where performance is even remotely important, I wouldn't even consider DMD. It's a well-known fact that it produces suboptimal executables. Its only real redeeming factor is its fast turnaround time. If fast turnaround is not important, I would always use LDC or GDC instead. > Here's the [current implementation in > D](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/dlang-key-hash-incremental/src/d/src/dencoder.d), > and the roughly [equivalent Rust > implementation](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/dlang-key-hash-incremental/src/rust/phone_encoder/src/main.rs). Taking a look at this code: One of the most important things I found is that every call to printTranslations allocates a new array (`auto keyValue = new ubyte[...];`). Given the large number of recursions involved in this algorithm, this will add up to quite a lot. If I were optimizing this code, I'd look into ways of reducing, if not eliminating, this allocation. Observe that this allocation is needed each time printTranslations recurses, so instead of making separate allocations, you could put it on a stack. Either with alloca, or with my appendPath() trick in my version of the code: preallocate a reasonably large buffer and take slices of it each time you need a new keyValue array. Secondly, your hash function looks suspicious. Why are you chaining your hash per digit? That's a lot of hash computations. Shouldn't you just hash the entire key each time? 
That would eliminate the need to store a custom hash in your key; you could just look up the entire key at once. Next, what stood out is ISolutionHandler. If I were to write this, I wouldn't use OO for this at all, and especially not interfaces, because they involve a double indirection. I'd just return a delegate instead (single indirection, no object lookup). This is a relatively small matter, but when it's being used inside a hot inner loop, it could be important. Then a minor point: I wouldn't use Array in printTranslations. It's overkill for what you need to do; a built-in array would work just fine. Take a look at the implementation of Array and you'll see lots of function calls and locks and GC root-adding and all that stuff. Most of it doesn't apply here, of course, and is compiled out. Nevertheless, it uses wrapped integer operations and range checks, etc. Again, these are all minor issues, but in a hot inner loop they do add up. Built-in arrays let you literally just bump the pointer when adding an element. Just a couple of instructions as opposed to several function calls. Important difference when you're on the hot path. Now, as I mentioned earlier w.r.t. my own code, appending to built-in arrays comes with a cost. So here's where you'd optimize by creating your own buffer and custom push/pop operations. Something like appendPath() in my version of the code would do the job. Finally, a very small point: in loadDictionary, you do an AA lookup with `n in result`, and then if that returns null, you create a new entry. This does two AA lookups: one unsuccessful, and a second to insert the missing key. You could use the .require operation with a delegate instead of `in` followed by `if (... is null)`, which only requires a single lookup. Probably not an important point, but for a large dictionary this might make a difference. 
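The .require point can be sketched in a few lines (the AA type and key here are illustrative, not taken from the actual dencoder code):

```d
void main()
{
    string[][string] wordsByDigits;

    // Two AA lookups: a failed `in` probe, then an insertion.
    // if (auto p = "4824" in wordsByDigits) *p ~= "fort";
    // else wordsByDigits["4824"] = ["fort"];

    // One lookup: require returns a ref to the value, inserting the
    // default (lazily evaluated) first if the key was missing.
    wordsByDigits.require("4824", null) ~= "fort";
    wordsByDigits.require("4824", null) ~= "Torf";

    assert(wordsByDigits["4824"] == ["fort", "Torf"]);
}
```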
> The only "significant" difference is that in Rust, an enum > `WordOrDigit` is used to represent currently known "words"... I [did > try using that in > D](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/dlang-int128-word-and-digit/src/d/src/dencoder.d), > but it made the algorithm slower. > > If you see anything in D that's not as efficient as it should be, or > somehow "inferior" to what the Rust version is doing , please let me > know. Couldn't tell you, I don't know Rust. :-D > Notice that almost all of the time is spent in the for-loop inside > `printTranslations` (which is misnamed as it doesn't necessarily > "print" anything, like it did earlier) - the rest of the code almost > doesn't matter. [...] Of course, that's where your hot path is. And that loop makes recursive calls to printTranslations, so the entire body of the function could use some optimization. ;-) Try addressing the points I wrote above and see if it makes a difference. T -- The two rules of success: 1. Don't tell everything you know. -- YHL
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Thu, Jan 18, 2024 at 04:23:16PM +, Renato via Digitalmars-d-learn wrote: [...]
> Ok, last time I'm running this for someone else :D
>
> ```
> Proc,Run,Memory(bytes),Time(ms)
> ===> ./rust
> ./rust,23920640,30
> ./rust,24018944,147
> ./rust,24068096,592
> ./rust,24150016,1187
> ./rust,7766016,4972
> ./rust,8011776,46101
> ===> src/d/dencoder
> src/d/dencoder,44154880,42
> src/d/dencoder,51347456,87
> src/d/dencoder,51380224,273
> src/d/dencoder,51462144,441
> src/d/dencoder,18644992,4414
> src/d/dencoder,18710528,43548
> ```

OK, this piqued my interest enough that I decided to install rust using rustup instead of my distro's package manager. Here are the numbers I got for my machine:

```
===> ./rust
./rust,22896640,35
./rust,22896640,137
./rust,22384640,542
./rust,22896640,1034
./rust,8785920,2489
./rust,8785920,12157
===> src/d/dencoder
src/d/dencoder,1066799104,36
src/d/dencoder,1066799104,72
src/d/dencoder,1066799104,198
src/d/dencoder,1066799104,344
src/d/dencoder,1035292672,2372
src/d/dencoder,1035292672,13867
```

Looks like we lost out to Rust for larger inputs. :-D Probably due to environmental factors (and the fact that std.stdio is slow). I re-ran it again and got this:

```
===> ./rust
./rust,22896640,30
./rust,22896640,131
./rust,22896640,511
./rust,22896640,983
./rust,8785920,3102
./rust,8785920,9912
===> src/d/dencoder
src/d/dencoder,1066799104,36
src/d/dencoder,1066799104,71
src/d/dencoder,1066799104,197
src/d/dencoder,1066799104,355
src/d/dencoder,1035292672,3441
src/d/dencoder,1035292672,9471
```

Notice the significant discrepancy between the two runs; this seems to show that the benchmark is only accurate up to about ±1.5 seconds. Anyway, oddly enough, Java seems to beat Rust on larger inputs. Maybe my Java compiler has a better JIT implementation? :-P
> Congratulations on beating Rust :D but remember: you're using a much
> more efficient algorithm! 
> I must conclude that the Rust translation of the Trie algorithm would
> be much faster still, unfortunately (you may have noticed that I am
> on D's side here!).

At this point, it's not really about the difference between languages anymore; it's about the programmer's skill at optimizing his code. Traditionally Java is thought to be the slowest, because it runs in a VM and generally tends to use more heap allocations. In recent times, however, JIT and advanced GC implementations have significantly levelled that out, so you're probably not going to see the difference unless you hand-tweak your code down to the bare metal. Surprisingly, at least on my machine, Lisp actually performed the worst. I'd have thought it would at least beat Java, but I was quite wrong. :-D Perhaps the Lisp implementation I'm using is suboptimal, I don't know. Or perhaps modern JVMs have really overtaken Lisp. Now I'm really curious how a Rust version of the trie algorithm would perform. Unfortunately I don't know Rust so I wouldn't be able to write it myself. (Hint, hint, nudge, nudge ;-)). As far as the performance of my D version is concerned, I still haven't squeezed out all the performance I could yet. Going into this, my intention was to take the lazy way of optimizing only what the profiler points out to me, with the slight ulterior motive of proving that a relatively small amount of targeted optimizations can go a long way at making the GC a non-problem in your typical D code. ;-) I haven't pulled out all the optimization guns at my disposal yet. If I were to go the next step, I'd split up the impl() function so that I get a better profile of where it's spending most of its time, and then optimize that. My current suspicion is that the traversal of the trie could be improved by caching intermediate results to eliminate a good proportion of recursive calls in impl(). Also, the `print` mode of operation is quite slow, probably because writefln() allocates. 
(It allocates less than if I had used .format like I did before, but it nevertheless still allocates.) To alleviate this cost, I'd allocate an output buffer and write to that, flushing only once it filled up. Another thing I could do is to use std.parallelism.parallel to run searches on batches of phone numbers in parallel. This is kinda cheating, though, since it's the same algorithm with the same cost, we're just putting more CPU cores to work. :-P But in D this is quite easy to do, often as easy as simply adding .parallel to your outer foreach loop. In this particular case it will need some additional refactoring due to the fact that the input is being read line by line. But it's relatively easy to load the input into a buffer by chunks instead, and just run the searches on all the numbers found in the buffer in parallel. On Thu, Jan 18, 2024 at 04:25:45PM +, Renato via Digitalmars-d-learn wrote: [...] > BTW here's you main function so it can run on the benchmark: [...] Thanks, I've adapted my code accordingly and pushed to my github repo. T -- This is a tpyo.
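The .parallel idea above really can be as simple as tacking it onto the outer foreach; a minimal sketch (the numbers and the per-number "search" are stand-ins, and note that shared state then needs atomic updates):

```d
import core.atomic : atomicOp;
import std.parallelism : parallel;
import std.stdio : writeln;

void main()
{
    // Stand-in for a batch of phone numbers already read into memory.
    auto numbers = ["4824", "562", "107", "78", "35"];

    shared size_t total;

    // Each iteration runs on a worker thread from the default task
    // pool, so the shared counter must be updated atomically.
    foreach (number; numbers.parallel)
    {
        auto matches = number.length;   // stand-in for the real search
        atomicOp!"+="(total, matches);
    }

    assert(total == 14);
    writeln(total);
}
```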
Re: Datetime format?
On Thu, Jan 18, 2024 at 11:58:32PM +, zoujiaqing via Digitalmars-d-learn wrote: > On Thursday, 18 January 2024 at 23:43:13 UTC, Jonathan M Davis wrote: > > On Thursday, January 18, 2024 4:26:42 PM MST zoujiaqing via > > Digitalmars-d-learn wrote:
> > > ```D
> > > import std.datetime : Clock, format;
> > > import std.stdio : writeln;
> > >
> > > void main()
> > > {
> > >     auto currentTime = Clock.currTime;
> > >     auto formattedTime = currentTime.format("%Y-%m-%d %H:%M:%S");
> > >     writeln("Formatted Time: ", formattedTime);
> > > }
> > > ```
[...] > So shame! The standard library doesn't have date formatting. [...] It's easy to write your own:

```d
import std;

void main()
{
    auto curTime = Clock.currTime;
    auto dt = cast(DateTime) curTime;
    auto fmtTime = format("%04d-%02d-%02d %02d:%02d:%02d",
            dt.year, dt.month, dt.day,
            dt.hour, dt.minute, dt.second);
    writeln(fmtTime);
}
```

Output: 2024-01-18 16:21:51

You have maximum flexibility to format it however you like. T -- Computers aren't intelligent; they only think they are.
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Wed, Jan 17, 2024 at 07:57:02AM -0800, H. S. Teoh via Digitalmars-d-learn wrote: [...] > I'll push the code to github. [...] Here: https://github.com/quickfur/prechelt/blob/master/encode_phone.d T -- Why do conspiracy theories always come from the same people??
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Wed, Jan 17, 2024 at 07:19:39AM +, Renato via Digitalmars-d-learn wrote: [...] > But pls run the benchmarks yourself as I am not going to keep running > it for you, and would be nice if you posted your solution on a Gist > for example, pasting lots of code in the forum makes it difficult to > follow. I can't. I spent half an hour trying to get ./benchmark.sh to run, but no matter what it could not compile benchmark_runner. It complains that my rustc is too old and some dependencies do not support it. I tried running the suggested cargo update command to pin the versions but none of them worked. Since I'm not a Rust user, I'm not feeling particularly motivated right now to spend any more time on this. Upgrading my rustc isn't really an option because that's the version currently in my distro and I really don't feel like spending more time to install a custom version of rustc just for this benchmark. T -- Today's society is one of specialization: as you grow, you learn more and more about less and less. Eventually, you know everything about nothing.
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Wed, Jan 17, 2024 at 07:19:39AM +, Renato via Digitalmars-d-learn wrote: > On Tuesday, 16 January 2024 at 22:13:55 UTC, H. S. Teoh wrote: > > used for the recursive calls. Getting rid of the .format ought to > > speed it up a bit. Will try that now... > > > > That will make no difference for the `count` option which is where > your solution was very slow. Of course it will. Passing the data directly to the callback that bumps a counter is faster than allocating a new string, formatting the data, and then passing it to the callback that bumps a counter. It may not look like much, but avoiding unnecessary GC allocations means the GC will have less work to do later when a collection is run, thus you save time over the long term. > To run the slow test manually use the `words_quarter.txt` dictionary > (the phone numbers file doesn't matter much - it's all in the > dictionary). > > But pls run the benchmarks yourself as I am not going to keep running > it for you, and would be nice if you posted your solution on a Gist > for example, pasting lots of code in the forum makes it difficult to > follow. I'll push the code to github. T -- "No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Tue, Jan 16, 2024 at 10:15:04PM +, Siarhei Siamashka via Digitalmars-d-learn wrote: > On Tuesday, 16 January 2024 at 21:15:19 UTC, Renato wrote: [...] > > ... what I am really curious about is what the code I wrote is doing > > wrong that causes it to run 4x slower than Rust despite doing "the > > same thing"... > > It's a GC allocations fest. Indeed. I have just completed 2 rounds of optimizations of my version of the code, and both times the profiler also showed the problem to be excessive allocations in the inner loop. So, I did the following optimizations: 1) Get rid of .format in the inner loop. Not only does .format cause a lot of allocations, it is also a known performance hog. So instead of constructing the output string in the search function, I changed it to take a delegate instead, and the delegate either counts the result or prints it directly (bypassing the construction of an intermediate string). This improved performance quite a bit for the count-only runs, but also wins some performance even when output is generated. Overall, however, this optimization only gave me some minor savings. 2) Changed the `path` parameter from string[] to string, since I didn't really need it to be an array of strings anyway. This in itself only improved performance marginally, barely noticeable, but it led to (3), which gave a huge performance boost. 3) Basically, in the earlier version of the code, the `path` parameter was appended to every time I recursed, and furthermore the same initial segment gets appended to many times with different trailers as the algorithm walks the trie. As a result, this triggers a lot of array reallocations to store the new strings. Most of these allocations are unnecessary, because we already know that the initial segment of the string will stay constant, only the tail end changes. Furthermore, we only ever have a single instance of .path at any point in time in the algorithm. 
So we could use a single buffer to hold all of these instances of .path, and simply return slices to it as we go along, overwriting the tail end each time we need to append something. This significantly cut down on the number of allocations, and along with (1) and (2), performance improved by about 3x (!). It didn't completely remove all allocations, but I'm reasonably happy with the performance now that I probably won't try to optimize it more unless it's still losing out to another language. ;-) (I'm especially curious to see if this beats the Rust version. :-P) Optimized version of the code:

---snip
```d
/**
 * Encoding phone numbers according to a dictionary.
 */
import std;

/**
 * Table of digit mappings.
 */
static immutable ubyte[dchar] digitOf;
shared static this()
{
    digitOf = [
        'E': 0,
        'J': 1, 'N': 1, 'Q': 1,
        'R': 2, 'W': 2, 'X': 2,
        'D': 3, 'S': 3, 'Y': 3,
        'F': 4, 'T': 4,
        'A': 5, 'M': 5,
        'C': 6, 'I': 6, 'V': 6,
        'B': 7, 'K': 7, 'U': 7,
        'L': 8, 'O': 8, 'P': 8,
        'G': 9, 'H': 9, 'Z': 9,
    ];
}

/**
 * Trie for storing dictionary words according to the phone number mapping.
 */
class Trie
{
    Trie[10] edges;
    string[] words;

    private void insert(string word, string suffix)
    {
        const(ubyte)* dig;
        while (!suffix.empty &&
               (dig = std.ascii.toUpper(suffix[0]) in digitOf) is null)
        {
            suffix = suffix[1 .. $];
        }
        if (suffix.empty)
        {
            words ~= word;
            return;
        }
        auto node = new Trie;
        auto idx = *dig;
        if (edges[idx] is null)
        {
            edges[idx] = new Trie;
        }
        edges[idx].insert(word, suffix[1 .. $]);
    }

    /**
     * Insert a word into the Trie.
     *
     * Characters that don't map to any digit are ignored in building the
     * Trie. However, the original form of the word will be retained as-is
     * in the leaf node.
     */
    void insert(string word)
    {
        insert(word, word[]);
    }

    /**
     * Iterate over all words stored in this Trie.
     */
    void foreachEntry(void delegate(string path, string word) cb)
    {
        void impl(Trie node, string path = "")
        {
            if (node is null) return;
            foreach (word; node.words)
            {
                cb(path, word);
            }
            foreach (i, child; node.edges)
            {
                impl(child, path ~ cast(char)('0' + i));
            }
        }
        impl(this);
    }
}

/**
 * Loads the given dictionary into a Trie.
 */
Trie loadDictionary(R)(R lines)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    Trie result = new Trie;
    foreach (line; lines)
    {
        result.insert(line.idup);
    }
    return result;
}

///
unittest
{
    auto dict = loadDictionary(q"ENDDICT
an
blau
Bo"
Boot
bo"s
da
Fee
fern
Fest
fort
je
jemand
mir
Mix
Mixer
```
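The single-buffer trick described above (optimization 3) can be sketched roughly like this; the helper's name and shape follow the appendPath() idea but are a guess at the real code, not a copy of it:

```d
// Preallocated scratch buffer; every in-flight path is a slice of it.
char[] buf;

// Append word after path (which must be the current slice of buf) and
// return the lengthened slice, overwriting whatever tail was there.
// No allocation happens here, just a copy into the existing buffer.
const(char)[] appendPath(const(char)[] path, const(char)[] word)
{
    auto end = path.length;
    buf[end .. end + word.length] = word[];
    return buf[0 .. end + word.length];
}

void main()
{
    buf = new char[4096];

    auto p = appendPath(buf[0 .. 0], "fort");
    p = appendPath(p, " Torf");
    assert(p == "fort Torf");

    // Backtracking is free: reuse the shorter slice, and the next
    // append simply overwrites the old tail.
    auto q = appendPath(buf[0 .. 4], " je");
    assert(q == "fort je");
}
```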
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Tue, Jan 16, 2024 at 09:15:19PM +, Renato via Digitalmars-d-learn wrote: > On Tuesday, 16 January 2024 at 20:34:48 UTC, H. S. Teoh wrote: > > On Tue, Jan 16, 2024 at 12:28:49PM -0800, H. S. Teoh via > > Digitalmars-d-learn wrote: [...] > > > Anyway, I've fixed the problem, now my program produces the exact > > > same output as Renato's repo. Code is posted below. > > [...] > > Great, I ran the benchmarks for you :) > I had to change how you accept arguments, even though you did "the > right thing" using `getopt`, the solutions should just take a `count` > or `print` argument first... Oops, haha :-P > Anyway, here's your result:
>
> ```
> ===> ./rust
> ./rust,24133632,25
> ./rust,24739840,130
> ./rust,24477696,536
> ./rust,25247744,1064
> ./rust,8175616,6148
> ./rust,8306688,8315
> ===> src/d/dencoder
> src/d/dencoder,46055424,43
> src/d/dencoder,96337920,146
> src/d/dencoder,102350848,542
> src/d/dencoder,102268928,1032
> src/d/dencoder,40206336,99936
> ^C
> ```
>
> It took too long with the `count` option, so I had to abort before the last run ended... there's probably some bug there, otherwise the Trie runs very fast, as I had expected. [...] Do you have the problematic data file handy? I'd like to look into any potential bugs. Also, the profiler revealed that a lot of time was spent in the GC and in small allocations. The cause is in all likelihood the .format() call for each found match, and array append being used for the recursive calls. Getting rid of the .format ought to speed it up a bit. Will try that now... T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Tue, Jan 16, 2024 at 12:28:49PM -0800, H. S. Teoh via Digitalmars-d-learn wrote: [...] > Anyway, I've fixed the problem, now my program produces the exact same > output as Renato's repo. Code is posted below. [...] Oops, forgot to actually paste the code. Here it is:

snip
```d
/**
 * Encoding phone numbers according to a dictionary.
 */
import std;

/**
 * Table of digit mappings.
 */
static immutable ubyte[dchar] digitOf;
shared static this()
{
    digitOf = [
        'E': 0,
        'J': 1, 'N': 1, 'Q': 1,
        'R': 2, 'W': 2, 'X': 2,
        'D': 3, 'S': 3, 'Y': 3,
        'F': 4, 'T': 4,
        'A': 5, 'M': 5,
        'C': 6, 'I': 6, 'V': 6,
        'B': 7, 'K': 7, 'U': 7,
        'L': 8, 'O': 8, 'P': 8,
        'G': 9, 'H': 9, 'Z': 9,
    ];
}

/**
 * Trie for storing dictionary words according to the phone number mapping.
 */
class Trie
{
    Trie[10] edges;
    string[] words;

    private void insert(string word, string suffix)
    {
        const(ubyte)* dig;
        while (!suffix.empty &&
               (dig = std.ascii.toUpper(suffix[0]) in digitOf) is null)
        {
            suffix = suffix[1 .. $];
        }
        if (suffix.empty)
        {
            words ~= word;
            return;
        }
        auto node = new Trie;
        auto idx = *dig;
        if (edges[idx] is null)
        {
            edges[idx] = new Trie;
        }
        edges[idx].insert(word, suffix[1 .. $]);
    }

    /**
     * Insert a word into the Trie.
     *
     * Characters that don't map to any digit are ignored in building the
     * Trie. However, the original form of the word will be retained as-is
     * in the leaf node.
     */
    void insert(string word)
    {
        insert(word, word[]);
    }

    /**
     * Iterate over all words stored in this Trie.
     */
    void foreachEntry(void delegate(string path, string word) cb)
    {
        void impl(Trie node, string path = "")
        {
            if (node is null) return;
            foreach (word; node.words)
            {
                cb(path, word);
            }
            foreach (i, child; node.edges)
            {
                impl(child, path ~ cast(char)('0' + i));
            }
        }
        impl(this);
    }
}

/**
 * Loads the given dictionary into a Trie.
 */
Trie loadDictionary(R)(R lines)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    Trie result = new Trie;
    foreach (line; lines)
    {
        result.insert(line.idup);
    }
    return result;
}

///
unittest
{
    auto dict = loadDictionary(q"ENDDICT
an
blau
Bo"
Boot
bo"s
da
Fee
fern
Fest
fort
je
jemand
mir
Mix
Mixer
Name
neu
o"d
Ort
so
Tor
Torf
Wasser
ENDDICT".splitLines);

    auto app = appender!(string[]);
    dict.foreachEntry((path, word) { app ~= format("%s: %s", path, word); });
    assert(app.data == [
        "10: je", "105513: jemand", "107: neu", "1550: Name",
        "253302: Wasser", "35: da", "38: so", "400: Fee",
        "4021: fern", "4034: Fest", "482: Tor", "4824: fort",
        "4824: Torf", "51: an", "562: mir", "562: Mix",
        "56202: Mixer", "78: Bo\"", "783: bo\"s", "7857: blau",
        "7884: Boot", "824: Ort", "83: o\"d"
    ]);
}

/**
 * Find all encodings of the given phoneNumber according to the given
 * dictionary, and write each encoding to the given sink.
 */
void findMatches(W)(Trie dict, const(char)[] phoneNumber, W sink)
    if (isOutputRange!(W, string))
{
    bool impl(Trie node, const(char)[] suffix, string[] path, bool allowDigit)
    {
        if (node is null) return false;

        // Ignore non-digit characters in phone number
        while (!suffix.empty && (suffix[0] < '0' || suffix[0] > '9'))
            suffix = suffix[1 .. $];

        if (suffix.empty)
        {
            // Found a match, print result
            foreach (word; node.words)
            {
                put(sink, format("%s: %-(%s %)", phoneNumber,
                                 path.chain(only(word))));
            }
            return !node.words.empty;
        }

        bool ret;
        foreach (word; node.words)
        {
            // Found a matching word, try to match the rest of the phone
            // number.
            ret = true;
            if (impl(dict, suffix, path ~ word, true))
                allowDigit = false;
        }
        if (impl(node.edges[suffix[0] - '0'], suffix[1 .. $], path, false))
        {
            allowDigit = false
```
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Tue, Jan 16, 2024 at 06:54:56PM +0000, Renato via Digitalmars-d-learn wrote:
> On Tuesday, 16 January 2024 at 16:56:04 UTC, Siarhei Siamashka wrote:
[...]
> > You are not allowed to emit "1" as the first token in the output as
> > long as there are any dictionary word matches at that position. The
> > relevant paragraph from the problem statement:

Ohhh now I get it. Initially I misunderstood that as saying that if the
rest of the phone number has at least one match, then a digit is not
allowed. Now I see that what it's actually saying is that even if some
random dictionary word matches at that position, even if it does not
lead to any full matches, then a digit is excluded.

[...]
> > I also spent a bit of time trying to figure out this nuance when
> > implementing my solution. It doesn't make much sense visually (no
> > back-to-back digits in the output either way), but that's how it is.
>
> Exactly, this is one of the things that make this problem a bit
> annoying to solve :)

It's a strange requirement, for sure, but I don't think it's annoying.
It makes the problem more Interesting(tm). ;-)

Anyway, I've fixed the problem, now my program produces the exact same
output as Renato's repo. Code is posted below.

Interestingly enough, the running time has now halved to about 0.9
seconds for 1 million phone numbers. I guess that's caused by the more
stringent requirement excluding many more matching possibilities,
effectively pruning away large parts of the search tree.

> @"H. S. Teoh" you implemented the solution as a Trie!! Nice, that's
> also what I did when I "participated" in the study. Here's [my Trie
> solution in
> Java](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/fastest-implementations-print-or-count/src/java/Main.java).
>
> These are basically the two common approaches to the problem: a Trie
> or a numeric-based table. According to the study, people who use
> scripting languages almost always go with the numeric approach, while
> people coming from lower level languages tend to use a data structure
> like Trie (if they don't know Trie, they come up with something
> similar which is fascinating), which is harder to implement but more
> efficient in general.

Interesting. I guess my C/C++ background is showing. ;-)

I'm not sure what exactly motivated me to go this route; I guess it was
just my default preference of choosing the path of least work as far as
the algorithm is concerned: I chose the algorithm that did the least
amount of work needed to produce the right answer. Scanning through
sections of the dictionary to find a match was therefore excluded; so my
first thought was an AA. But then how much of the initial prefix should
be used to look up the AA? Since it can't be known beforehand, I'd have
to gradually lengthen the prefix to search for, which does a lot of
repetitive work (we keep looking up the first few digits repeatedly each
time we search for a longer prefix). Plus, multiple consecutive AA
lookups is not cache-friendly.

So my next thought was, what should I do such that I don't have to look
at the initial digits anymore once I've already processed them? This
line of thought naturally led to a trie structure.

Once I arrived at a trie structure, the next question was how exactly
dictionary entries would be stored in it. Again, in the vein of doing
the least amount of work I could get away with, I thought, if I stored
words in the trie directly, with each edge encoding a letter, then
during the search I'd have to repeatedly convert letters to the
corresponding phone number digit and vice versa. So why not do this
conversion beforehand, and store only phone digits in the trie? This
also had the additional benefit of letting me effectively search
multiple letters simultaneously, since multiple letters map to the same
digit, so scanning a digit is equivalent to searching multiple letters
at the same time.

The output, of course, required the original form of the words -- so the
obvious solution was to attach the original words as a list of words
attached to the trie node representing the end of that word.

Once this was all decided, the only remaining question was the search
algorithm. This turned out to take the most time in solving this
problem: due to the recursive nature of the search, I had to grapple
with where and how to make the recursive calls, and how to propagate
return values correctly. The initial implementation only found word
matches, and did not allow the single digits. Happily, the recursive
algorithm turned out to have enough structure to encode the single-digit
requirements as well, although it took a bit of trial and error to find
the correct implementation.

> Can I ask you why didn't you use the [D stdlib
> Trie](https://dlang.org/phobos/std_uni.html#codepointTrie)? Not sure
> that would've worked, but did you consider that?

Haha, I didn't even think of that. :-D  I wouldn't have wanted to use it
anyway, because it was optimized for
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
P.S. Compiling my program with `ldc -O2`, it runs so fast that I couldn't measure any meaningful running time that's greater than startup overhead. So I wrote a helper program to generate random phone numbers up to 50 characters long, and found that it could encode 1 million phone numbers in 2.2 seconds (using the 75,000 entry dictionary from your repository). Counting vs. printing the results made no significant difference to this. T -- People tell me that I'm skeptical, but I don't believe them.
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Tue, Jan 16, 2024 at 07:50:35AM -0800, H. S. Teoh via Digitalmars-d-learn wrote:
[...]
> Unfortunately there seems to be some discrepancy between the output I
> got and the prescribed output in your repository. For example, in your
> output the number 1556/0 does not have an encoding, but isn't "1 Mai 0"
> a valid encoding according to your dictionary and the original problem
> description?
[...]

Also, found a bug in my program that misses some solutions when the
phone number has trailing non-digits. Here's the updated code. It still
finds extra encodings compared to the output in your repo, though. Maybe
I misunderstood part of the requirements?

--snip--
/**
 * Encoding phone numbers according to a dictionary.
 */
import std;

/**
 * Table of digit mappings.
 */
static immutable ubyte[dchar] digitOf;
shared static this()
{
    digitOf = [
        'E': 0,
        'J': 1, 'N': 1, 'Q': 1,
        'R': 2, 'W': 2, 'X': 2,
        'D': 3, 'S': 3, 'Y': 3,
        'F': 4, 'T': 4,
        'A': 5, 'M': 5,
        'C': 6, 'I': 6, 'V': 6,
        'B': 7, 'K': 7, 'U': 7,
        'L': 8, 'O': 8, 'P': 8,
        'G': 9, 'H': 9, 'Z': 9,
    ];
}

/**
 * Trie for storing dictionary words according to the phone number mapping.
 */
class Trie
{
    Trie[10] edges;
    string[] words;

    private void insert(string word, string suffix)
    {
        const(ubyte)* dig;
        while (!suffix.empty &&
               (dig = std.ascii.toUpper(suffix[0]) in digitOf) is null)
        {
            suffix = suffix[1 .. $];
        }
        if (suffix.empty)
        {
            words ~= word;
            return;
        }
        auto idx = *dig;
        if (edges[idx] is null)
        {
            edges[idx] = new Trie;
        }
        edges[idx].insert(word, suffix[1 .. $]);
    }

    /**
     * Insert a word into the Trie.
     *
     * Characters that don't map to any digit are ignored in building the
     * Trie. However, the original form of the word will be retained as-is
     * in the leaf node.
     */
    void insert(string word)
    {
        insert(word, word[]);
    }

    /**
     * Iterate over all words stored in this Trie.
     */
    void foreachEntry(void delegate(string path, string word) cb)
    {
        void impl(Trie node, string path = "")
        {
            if (node is null) return;
            foreach (word; node.words)
            {
                cb(path, word);
            }
            foreach (i, child; node.edges)
            {
                impl(child, path ~ cast(char)('0' + i));
            }
        }
        impl(this);
    }
}

/**
 * Loads the given dictionary into a Trie.
 */
Trie loadDictionary(R)(R lines)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    Trie result = new Trie;
    foreach (line; lines)
    {
        result.insert(line.idup);
    }
    return result;
}

///
unittest
{
    auto dict = loadDictionary(q"ENDDICT
an
blau
Bo"
Boot
bo"s
da
Fee
fern
Fest
fort
je
jemand
mir
Mix
Mixer
Name
neu
o"d
Ort
so
Tor
Torf
Wasser
ENDDICT".splitLines);

    auto app = appender!(string[]);
    dict.foreachEntry((path, word) { app ~= format("%s: %s", path, word); });
    assert(app.data == [
        "10: je",
        "105513: jemand",
        "107: neu",
        "1550: Name",
        "253302: Wasser",
        "35: da",
        "38: so",
        "400: Fee",
        "4021: fern",
        "4034: Fest",
        "482: Tor",
        "4824: fort",
        "4824: Torf",
        "51: an",
        "562: mir",
        "562: Mix",
        "56202: Mixer",
        "78: Bo\"",
        "783: bo\"s",
        "7857: blau",
        "7884: Boot",
        "824: Ort",
        "83: o\"d"
    ]);
}

/**
 * Find all encodings of the given phoneNumber according to the given
 * dictionary, and write each encoding to the given sink.
 */
void findMatches(W)(Trie dict, const(char)[] phoneNumber, W sink)
    if (isOutputRange!(W, string))
{
    bool impl(Trie node, const(char)[] suffix, string[] path, bool allowDigit)
    {
        if (node is null) return false;

        // Ignore non-digit characters in phone number
        while (!suffix.empty && (suffix[0] < '0' || suffix[0] > '9'))
            suffix = suffix[1 .. $];

        if (suffix.empty)
        {
            // Found a match, print result
            foreach (word; node.words)
            {
                put(sink, format("%s: %-(%s %)", phoneNumber,
                                 path.chain(only(word))));
            }
            return !node.words.empty;
        }
Re: Help optimize D solution to phone encoding problem: extremely slow performace.
On Mon, Jan 15, 2024 at 08:10:55PM +0000, Renato via Digitalmars-d-learn wrote:
> On Monday, 15 January 2024 at 01:10:14 UTC, Sergey wrote:
> > On Sunday, 14 January 2024 at 17:11:27 UTC, Renato wrote:
> > > If anyone can find any flaw in my methodology or optmise my code so
> > > that it can still get a couple of times faster, approaching Rust's
> > > performance, I would greatly appreciate that! But for now, my
> > > understanding is that the most promising way to get there would be
> > > to write D in `betterC` style?!
> >
> > I've added port from Rust in the PR comment. Can you please check
> > this solution? Most probably it need to be optimized with profiler.
> > Just interesting how close-enough port will work.
>
> As discussed on GitHub, the line-by-line port of the Rust code is 5x
> slower than [my latest solution using
> int128](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/0cbfd41a072718bfb0c0d0af8bb7266471e7e94c/src/d/src/dencoder.d),
> which is itself 3 to 4x slower than the Rust implementation (at around
> the same order of magnitude as algorithm-equivalent Java and Common
> Lisp implementations, D is perhaps 15% faster).
>
> I did the best I could to make D run faster, but we hit a limit that's
> a bit hard to get past now. Happy to be given suggestions (see
> profiling information in previous messages), but I've run out of ideas
> myself.

This problem piqued my interest, so yesterday and today I worked on it
and came up with my own solution (I did not look at existing solutions
in order to prevent bias). I have not profiled it or anything, but the
runtime seems quite promising. Here it is:

--snip--
/**
 * Encoding phone numbers according to a dictionary.
 */
import std;

/**
 * Table of digit mappings.
 */
static immutable ubyte[dchar] digitOf;
shared static this()
{
    digitOf = [
        'E': 0,
        'J': 1, 'N': 1, 'Q': 1,
        'R': 2, 'W': 2, 'X': 2,
        'D': 3, 'S': 3, 'Y': 3,
        'F': 4, 'T': 4,
        'A': 5, 'M': 5,
        'C': 6, 'I': 6, 'V': 6,
        'B': 7, 'K': 7, 'U': 7,
        'L': 8, 'O': 8, 'P': 8,
        'G': 9, 'H': 9, 'Z': 9,
    ];
}

/**
 * Trie for storing dictionary words according to the phone number mapping.
 */
class Trie
{
    Trie[10] edges;
    string[] words;

    private void insert(string word, string suffix)
    {
        const(ubyte)* dig;
        while (!suffix.empty &&
               (dig = std.ascii.toUpper(suffix[0]) in digitOf) is null)
        {
            suffix = suffix[1 .. $];
        }
        if (suffix.empty)
        {
            words ~= word;
            return;
        }
        auto idx = *dig;
        if (edges[idx] is null)
        {
            edges[idx] = new Trie;
        }
        edges[idx].insert(word, suffix[1 .. $]);
    }

    /**
     * Insert a word into the Trie.
     *
     * Characters that don't map to any digit are ignored in building the
     * Trie. However, the original form of the word will be retained as-is
     * in the leaf node.
     */
    void insert(string word)
    {
        insert(word, word[]);
    }

    /**
     * Iterate over all words stored in this Trie.
     */
    void foreachEntry(void delegate(string path, string word) cb)
    {
        void impl(Trie node, string path = "")
        {
            if (node is null) return;
            foreach (word; node.words)
            {
                cb(path, word);
            }
            foreach (i, child; node.edges)
            {
                impl(child, path ~ cast(char)('0' + i));
            }
        }
        impl(this);
    }
}

/**
 * Loads the given dictionary into a Trie.
 */
Trie loadDictionary(R)(R lines)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    Trie result = new Trie;
    foreach (line; lines)
    {
        result.insert(line.idup);
    }
    return result;
}

///
unittest
{
    auto dict = loadDictionary(q"ENDDICT
an
blau
Bo"
Boot
bo"s
da
Fee
fern
Fest
fort
je
jemand
mir
Mix
Mixer
Name
neu
o"d
Ort
so
Tor
Torf
Wasser
ENDDICT".splitLines);

    auto app = appender!(string[]);
    dict.foreachEntry((path, word) { app ~= format("%s: %s", path, word); });
    assert(app.data == [
        "10: je",
        "105513: jemand",
        "107: neu",
        "1550: Name",
        "253302: Wasser",
        "35: da",
        "38: so",
        "400: Fee",
        "4021: fern",
        "4034: Fest",
        "482: Tor",
        "4824: fort",
        "4824: Torf",
        "51: an",
        "562: mir",
        "562: Mix",
        "56202: Mixer",
        "78: Bo\"",
        "783: bo\"s",
        "7857: blau",
        "7884: Boot",
        "824: Ort",
        "83: o\"d"
    ]);
}

/**
 * Find all encodings of the given phoneNumber according to the given
 * dictionary, and write each encoding to the given sink.
 */
void findMatches(W)(Trie dict, const(char)[] phoneNumber, W sink)
Re: `static` function ... cannot access variable in frame of ...
On Mon, Jan 15, 2024 at 06:16:44PM +0000, Bastiaan Veelo via Digitalmars-d-learn wrote:
> Hey people, I can use some help understanding why the last line
> produces a compile error.
>
> ```d
> import std.stdio;
>
> struct S
> {
>     static void foo(alias len)()
[...]

The trouble is with the `static` here. A context pointer is necessary in
order to have access to the context of main() from the body of this
function; but `static` precludes this possibility.

T

-- 
It is of the new things that men tire --- of fashions and proposals and
improvements and change. It is the old things that startle and
intoxicate. It is the old things that are young. -- G.K. Chesterton
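Since the original example got cut off above, here is a minimal sketch (names invented) of the difference: an alias to a compile-time constant needs no context pointer, while an alias to a local variable does, which is what `static` forbids. A nested function template, which does carry the enclosing frame, is one workaround.

```d
import std.stdio;

struct S
{
    // static member: has no context pointer to any enclosing frame
    static void foo(alias len)()
    {
        writeln(len);
    }
}

void main()
{
    enum lenCT = 3;          // manifest constant: no frame access needed
    S.foo!lenCT();           // ok

    int lenRT = 4;           // lives in main()'s stack frame
    //S.foo!lenRT();         // error: cannot access variable in frame of main

    // Workaround: a nested function template carries main()'s context,
    // so it may bind an alias to a local variable.
    void baz(alias len)()
    {
        writeln(len);
    }
    baz!lenRT();             // ok
}
```

The same effect can also be had by simply passing the value as an ordinary runtime parameter when compile-time binding isn't actually required.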
Re: Doubt about Struct and members
On Mon, Jan 08, 2024 at 05:28:50PM +0000, matheus via Digitalmars-d-learn wrote:
> Hi,
>
> I was doing some tests and this code:
>
> import std;
>
> struct S{
>     string[] s = ["ABC"];
>     int i = 123;
> }
[...]

It's not recommended to use initializers to initialize mutable
array-valued members, because it probably does not do what you think it
does. What the above code does is to store the array ["ABC"] somewhere
in the program's pre-initialized data segment and set s to point to that
by default. It does NOT allocate a new array literal every time you
create a new instance of S; every instance of S will *share* the same
array value unless you reassign it. As such, altering the contents of
the array may cause the new contents to show up in other instances of S.

This behaviour is generally harmless if your array is immutable. In
fact, it saves space in your executable by reusing the same data for
multiple instances of s. It also avoids repeated GC allocations at
runtime. However, if you're banking on each instance of S getting its
own copy of the array, you're in for a surprise. In this case, what you
want is to use a ctor to initialize it rather than the above
initializer.

T

-- 
Right now I'm having amnesia and deja vu at the same time. I think I've
forgotten this before.
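A hedged sketch of the difference (the `create` factory is an invented name, not part of the original code): default-initialized instances all point at the same backing array, while explicitly constructing each instance allocates a fresh array per call.

```d
import std.stdio;

struct S
{
    string[] s = ["ABC"];   // stored once, shared by every default-init instance
    int i = 123;

    // One way to give each instance its own array: build it explicitly.
    static S create()
    {
        return S(["ABC"], 123);   // new array literal allocated per call
    }
}

void main()
{
    S a, b;
    assert(a.s.ptr == b.s.ptr);   // same backing store, from S.init

    auto c = S.create();
    auto d = S.create();
    assert(c.s.ptr != d.s.ptr);   // separate allocations
    c.s[0] = "XYZ";
    assert(d.s[0] == "ABC");      // d is unaffected
    writeln(c.s, " ", d.s);
}
```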
Re: Trying to understand map being a template
On Fri, Jan 05, 2024 at 08:41:53PM +, Noé Falzon via Digitalmars-d-learn wrote: > On the subject of `map` taking the function as template parameter, I > was surprised to see it could still be used with functions determined > at runtime, even closures, etc. I am trying to understand the > mechanism behind it. That's simple, if the argument is a runtime function, it is treated as a function pointer (or delegate). [...] > In fact, how can the template be instantiated at all in the following > example, where no functions can possibly be known at compile time: > > ``` > auto do_random_map(int delegate(int)[] funcs, int[] values) > { > auto func = funcs.choice; > return values.map!func; > } > ``` [...] The argument is taken to be a delegate to be bound at runtime. In the instantiation a shim is inserted to pass along the delegate from the caller's context. T -- Creativity is not an excuse for sloppiness.
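A small sketch of what that looks like in practice: `map` is instantiated with the local symbol `func`, whose *type* is fixed at compile time, while the delegate value it holds is only bound at runtime through the implicit context.

```d
import std.algorithm : equal, map;
import std.stdio;

void main()
{
    // Two different delegates sharing one type.
    int delegate(int)[] funcs = [x => x + 1, x => x * 10];

    foreach (func; funcs)
    {
        // One instantiation of map per symbol, but the delegate's value
        // varies at runtime from iteration to iteration.
        auto r = [1, 2, 3].map!func;
        writeln(r);
    }

    auto f0 = funcs[0];
    assert([1, 2, 3].map!f0.equal([2, 3, 4]));
}
```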
Re: Pick a class at random
On Wed, Jan 03, 2024 at 04:50:57PM +0000, axricard via Digitalmars-d-learn wrote:
> I have an interface that is implemented by many classes, and I want to
> pick one of these implementations at random. There are two more
> constraints: first, the distribution is not uniform; all classes can
> define the chance they have to be picked (this is reflected by the
> function 'weight()' below). And all classes are not always available;
> this depends on some runtime information.

I would tag each implementation with a compile-time enum and use
compile-time introspection with CRTP[1] to auto-generate the code for
choosing a class according to the desired distribution.

[1] https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern

Something like this:

--SNIP--
import std.stdio;

interface MyIntf
{
    void work();
}

struct ImplemInfo
{
    int weight;
    MyIntf function() instantiate;
}

ImplemInfo[] implems;   // list of implementations
int totalWeight;

MyIntf chooseImplem()
{
    import std.random;
    auto pick = uniform(0, totalWeight);
    auto slice = implems[];
    assert(slice.length > 0);
    while (slice[0].weight <= pick)
    {
        pick -= slice[0].weight;
        slice = slice[1 .. $];
    }
    return slice[0].instantiate();
}

// Base class that uses CRTP to auto-register implementations in
// .implems without needing too much boilerplate in every subclass.
class Base(C) : MyIntf
{
    // Derived class must define a .weight member readable at
    // compile-time.
    static assert(is(typeof(C.weight) : int),
                  "Derived class must define .weight");

    static this()
    {
        implems ~= ImplemInfo(C.weight, () { return cast(MyIntf) new C; });
        totalWeight += C.weight;
    }

    // Derived classes must implement this
    abstract void work();
}

// These classes can be anywhere
class Implem1 : Base!Implem1
{
    enum weight = 1;
    override void work() { writeln(typeof(this).stringof); }
}

class Implem2 : Base!Implem2
{
    enum weight = 2;
    override void work() { writeln(typeof(this).stringof); }
}

class Implem3 : Base!Implem3
{
    enum weight = 3;
    override void work() { writeln(typeof(this).stringof); }
}

void main()
{
    // pipe output of program to `sort | uniq -c` to verify that the
    // required distribution is generated correctly.
    foreach (_; 0 .. 100)
    {
        auto impl = chooseImplem();
        impl.work();
    }
}
--SNIP--

T
Re: D is nice whats really wrong with gc??
On Fri, Dec 22, 2023 at 09:40:03PM +0000, bomat via Digitalmars-d-learn wrote:
> On Friday, 22 December 2023 at 16:51:11 UTC, bachmeier wrote:
> > Given how fast computers are today, the folks that focus on memory
> > and optimizing for performance might want to apply for jobs as
> > flooring inspectors, because they're often solving problems from the
> > 1990s.
>
> *Generally* speaking, I disagree. Think of the case of GTA V where
> several *minutes* of loading time were burned just because they
> botched the implementation of a JSON parser.

IMNSHO, if I had very large data files to load, I wouldn't use JSON.
Precompile the data into a more compact binary form that's already ready
to use, and just mmap() it at runtime.

> Of course, this was unrelated to memory management. But it goes to
> show that today's hardware being super fast doesn't absolve you from
> knowing what you're doing... or at least question your implementation
> once you notice that it's slow.

My favorite example in this area is the poor selection of algorithms, a
very common mistake being choosing an O(n²) algorithm because it's
easier to implement than the equivalent O(n) algorithm, and not very
noticeable on small inputs. But on large inputs it slows to an unusable
crawl. "But I wrote it in C, why isn't it fast?!" Because O(n²) is
O(n²), and that's independent of language. Given large enough input, an
O(n) Java program will beat the heck out of an O(n²) C program.

> But that is true for any language, obviously.
>
> I think there is a big danger of people programming in C/C++ and
> thinking that it *must* be performing well just because it's C/C++.
> The C++ codebase I have to maintain in my day job is a really bad
> example for that as well.

"Elegant or ugly code as well as fine or rude sentences have something
in common: they don't depend on the language." -- Luca De Vitis

:-)

> > I say this as I'm in the midst of porting C code to D. The biggest
> > change by far is deleting line after line of manual memory
> > management. Changing anything in that codebase would be miserable.
>
> I actually hate C with a passion.

Me too. :-D

> I have to be fair though: What you describe doesn't sound like a
> problem of the codebase being C, but the codebase being crap. :)

Yeah, I've seen my fair share of crap C and C++ codebases. C code that
makes you do a double take and stare real hard at the screen to
ascertain whether it's actually C and not some jokelang or exolang
purposely designed to be unreadable/unmaintainable. (Or maybe it would
qualify as an IOCCC entry. :-D) And C++ code that looks like ... I dunno
what. When business logic is being executed inside of a dtor, you *know*
that your codebase has Problems(tm), real big ones at that.

> If you have to delete "line after line" of manual memory management, I
> assume you're dealing with micro-allocations on the heap - which are
> performance poison in any language.

Depends on what you're dealing with. Some micro-allocations are totally
avoidable, but if you're manipulating a complex object graph composed of
nodes of diverse types, it's hard to avoid. At least, not without
uglifying your APIs significantly and introducing long-term
maintainability issues.

One of my favorite GC "lightbulb" moments is when I realized that having
a GC allowed me to simplify my internal APIs significantly, resulting in
much cleaner code that's easy to debug and easy to maintain. Whereas the
equivalent bit of code in the original C++ codebase would have required
disproportionate amounts of effort just to navigate the complex
allocation requirements.

These days my motto is: use the GC by default; when it becomes a
problem, use a more manual memory management scheme, but *only where the
bottleneck is* (as proven by an actual profiler, not where you "know"
(i.e., imagine) it is). A lot of C/C++ folk (and I speak from my own
experience as one of them) spend far too much time and energy optimizing
things that don't need to be optimized, because they are nowhere near
the bottleneck, resulting in lots of sunk cost and added maintenance
burden with no meaningful benefit.

[...]
> Of course, this directly leads to the favorite argument of C
> defenders, which I absolutely hate: "Why, it's not a problem if you're
> doing it *right*."
>
> By this logic, you have to do all these terrible mistakes while
> learning your terrible language, and then you'll be a good programmer
> and can actually be trusted with writing production software - after
> like, what, 20 years of shooting yourself in the foot and learning
> everything the hard way? :) And even then, the slightest slipup will
> give you dramatic vulnerabilities. Such a great concept.

Year after year I see reports of security vulnerabilities, the most
common of which are buffer overflows, use-after-free, and double-free.
All of which are caused directly by using a language that forces you to
manage memory manually. If C were only 10
Re: D is nice whats really wrong with gc??
On Fri, Dec 22, 2023 at 07:22:15PM +, Dmitry Ponyatov via Digitalmars-d-learn wrote: > > It's called GC phobia, a knee-jerk reaction malady common among > > C/C++ programmers > > I'd like to use D in hard realtime apps (gaming can be thought as one > of them, but I mostly mean realtime dynamic multimedia and digital > signal processing). For digital signal processing, couldn't you just preallocate beforehand? Even if we had a top-of-the-line incremental GC I wouldn't want to allocate wantonly in my realtime code. I'd preallocate whatever I can, and use region allocators for the rest. > So, GC in such applications commonly supposed unacceptable. In > contrast, I can find some PhD theses speaking about realtime GC, > prioritized message passing and maybe RDMA-based clustering. I'm always skeptical of general claims like this. Until you actually profile and identify the real hotspots, it's just speculation. > Unfortunately, I have no hope that D lang is popular enough that > somebody in the topic can rewrite its runtime and gc to be usable in > more or less hard RT apps. Popularity has nothing to do with it. The primary showstopper here is the lack of write barriers (and Walter's reluctance to change this). If we had write barriers a lot more GC options would open up. T -- What is Matter, what is Mind? Never Mind, it doesn't Matter.
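For the DSP case, the preallocation pattern amounts to something like this sketch (`process` and the sizes are invented names for illustration, not a realtime framework):

```d
import std.stdio;

enum frameLen = 256;

// Placeholder DSP stage: works in place, allocates nothing.
void process(double[] frame)
{
    foreach (ref x; frame)
        x *= 0.5;
}

void main()
{
    // Allocate once, up front, outside the realtime loop.
    auto buf = new double[](frameLen);

    foreach (i; 0 .. 1000)   // steady-state loop is allocation-free
    {
        auto frame = buf[0 .. frameLen];   // slicing never allocates
        frame[] = 1.0;
        process(frame);
        assert(frame[0] == 0.5);
    }
    writeln("done");
}
```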
Re: D is nice whats really wrong with gc??
On Mon, Dec 18, 2023 at 04:44:11PM +, Bkoie via Digitalmars-d-learn wrote: [...] > but what is with these ppl and the gc? [...] It's called GC phobia, a knee-jerk reaction malady common among C/C++ programmers (I'm one of them, though I got cured of GC phobia thanks to D :-P). 95% of the time the GC helps far more than it hurts. And the 5% of the time when it hurts, there are plenty of options for avoiding it in D. It's not shoved down your throat like in Java, there's no need to get all worked up about it. T -- Computerese Irregular Verb Conjugation: I have preferences. You have biases. He/She has prejudices. -- Gene Wirchenko
Re: Is it possible to set/override the name of the source file when piping it into DMD via stdin?
On Wed, Dec 13, 2023 at 11:58:42AM -0800, H. S. Teoh via Digitalmars-d-learn wrote: [...] > Add a module declaration to your source file. For example: > > echo 'module abc; import std; void main(){writefln(__MODULE__);}' | dmd > -run - > > Output: > abc > > `__stdin` is used as a placeholder when no module declaration is > present, and dmd doesn't know the filename (which is what it would > normally have used for the module name in this case). [...] Hmm, apparently the module declaration doesn't change the placeholder filename. Using `#line 1 abc.d` does the trick, as Adam suggests. T -- People tell me I'm stubborn, but I refuse to accept it!
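The `#line` trick Adam suggested looks like the following when spelled out (note that the D grammar wants the filename as a quoted string); both compiler diagnostics and `__FILE__` then report the given name, whether the source came from a file or from stdin:

```d
#line 1 "abc.d"
import std.stdio;

void main()
{
    // Diagnostics and __FILE__ now report "abc.d" regardless of the
    // actual source the compiler read.
    writeln(__FILE__);
}
```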
Re: Is it possible to set/override the name of the source file when piping it into DMD via stdin?
On Wed, Dec 13, 2023 at 07:37:09PM +, Siarhei Siamashka via Digitalmars-d-learn wrote: > Example: > > ```D > import std; > void main() { > deliberate syntax error here > } > ``` > > ```bash > $ cat example.d | dmd -run - > __stdin.d(3): Error: found `error` when expecting `;` or `=`, did you mean > `deliberate syntax = here`? > __stdin.d(3): Error: found `}` when expecting `;` or `=`, did you mean > `error here = End of File`? > ``` > > Now I'm curious. Is it possible to somehow communicate the real source > file name to `dmd`, so that it shows up in the error log instead of > "__stdin.d"? Add a module declaration to your source file. For example: echo 'module abc; import std; void main(){writefln(__MODULE__);}' | dmd -run - Output: abc `__stdin` is used as a placeholder when no module declaration is present, and dmd doesn't know the filename (which is what it would normally have used for the module name in this case). T -- Век живи - век учись. А дураком помрёшь.
Re: union default initialization values
On Wed, Dec 06, 2023 at 04:24:51AM +0900, confuzzled via Digitalmars-d-learn wrote:
[...]
> import std.stdio;
> void main()
> {
>     F fp;
>     fp.lo.writeln; // Why is this not zero? How is this value derived?
>     fp.hi.writeln; // expected
>     fp.x.writeln;  // expected
>
>     fp.x = 19716939937510315926535.148979323846264338327950288458209749445923078164062862089986280348253421170679;
>     fp.lo.writeln;
>     fp.hi.writeln;
>     fp.x.writefln!"%20.98f"; // Also, why is precision completely lost
>                              // after 16 digits (18 if I change the
>                              // type of x to real)?
> }
>
> Sorry if this seem like noise but I genuinely do not understand. What
> changes would I need to make to retain the precision of the value
> provided in the assignment above?
[...]

A `double` type is stored as an IEEE double-precision floating-point
number, which is a 64-bit value containing 1 sign bit, 11 exponent bits,
and 53 mantissa bits (52 stored, 1 implied). A mantissa of 53 bits can
store up to 2^53 distinct values, which corresponds with log_10(2^53) ≈
15.95 decimal digits. So around 15-16 decimal digits. (The exponent bits
only affect the position of the decimal point, not the precision of the
value, so they are not relevant here.) In D, you can use the .dig
property to find out approximately how many digits of precision a format
has (e.g., `writeln(double.dig);` or `writeln(real.dig);`).

The number you have above is WAY beyond the storage capacity of the
double-precision floating-point format or the 80-bit extended-precision
format of `real`. If you need that level of precision, you probably want
to use an arbitrary-precision floating-point library like libgmp instead
of the built-in `double` or `real`. (Keep in mind that the performance
will be significantly slower, because the hardware only works with IEEE
64-bit / x87 80-bit extended-precision numbers. Anything beyond that has
to be implemented in software, and will incur memory management costs as
well, since the storage size of the number will not be fixed.)
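The limits described above can be checked directly; a small sketch:

```d
import std.stdio;

void main()
{
    // Decimal digits of precision of each format:
    writeln(double.dig);   // 15
    writeln(real.dig);     // 18 with 80-bit x87 reals; platform-dependent

    // 2^53 = 9007199254740992 is where exact integer representation ends
    // for double: adding 1 just rounds back down to the same value.
    double limit = 9007199254740992.0;   // 2^53
    assert(limit + 1.0 == limit);        // 2^53 + 1 is not representable
    assert(limit - 1.0 != limit);        // 2^53 - 1 still is
}
```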
Also, if you don't understand how floating-point in computers work, I highly recommend reading this: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html It's a bit long, but well worth the time to read to understand why floating-point behaves the way it does. T -- It is of the new things that men tire --- of fashions and proposals and improvements and change. It is the old things that startle and intoxicate. It is the old things that are young. -- G.K. Chesterton
Re: anonymous structs within structs
On Mon, Dec 04, 2023 at 11:46:45PM +0000, DLearner via Digitalmars-d-learn wrote:
[...]
> Basically, B corresponds to the whole record (and only a whole record
> can be read). But the task only requires Var1 and Var2, the last two
> fields on the record. By putting all the irrelevant fields into A, and
> defining B as above, the program remains unpolluted with data it does
> not need.
[...]

Sounds like what you need is something like this:

    struct Record
    {
        struct UnimportantStuff
        {
            ...
        }
        UnimportantStuff unimportant;

        struct ImportantStuff
        {
            ...
        }
        ImportantStuff important;
    }

    ImportantStuff readData()
    {
        Record rec = readWholeRecord(...); // read entire record
        return rec.important;              // discard unimportant stuff
    }

    int main()
    {
        ...
        ImportantStuff data = readData();  // only important stuff returned
        processData(data);
        ...
    }

T

-- 
Let X be the set not defined by this sentence...
Re: D: Declaring empty pointer variables that return address inside function calls?
On Thu, Nov 23, 2023 at 07:22:22PM +0000, BoQsc via Digitalmars-d-learn wrote:
> Is it possible to declare empty pointer variable inside function calls
> and pass its address to the function?
>
> These are sometimes required while using Win32 - Windows Operating
> System API.
>
> * Empty pointer variables are used by functions to return information
>   after the function is done.
>
> My own horrible **suggestion** of empty pointer declaration inside
> function call:
> `someFunction(uint & passingEmptyVariableForWrite);`
>
> What it would do:
> * A new variable is declared inside function call.
> * Address of that variable is passed to the function.
> * After function is done, you can refer to it for returned value.

What's wrong with:

    uint result;
    someFunction(&result);
    // use result

?

T

-- 
One Word to write them all, One Access to find them, One Excel to count
them all, And thus to Windows bind them. -- Mike Champion
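A runnable version of that pattern, with a stand-in `someFunction` (an invented placeholder for the Win32 calls that fill in a caller-owned variable through a pointer):

```d
import std.stdio;

// Stand-in for an API call that returns data through a pointer.
void someFunction(uint* outValue)
{
    *outValue = 42;
}

void main()
{
    uint result;              // ordinary local variable; no setup needed
    someFunction(&result);    // pass its address
    assert(result == 42);     // the function filled it in
    writeln(result);

    // D also has `out` parameters, which make the intent explicit and
    // avoid the explicit address-of at the call site:
    static void someFunction2(out uint v) { v = 42; }
    uint result2;
    someFunction2(result2);
    assert(result2 == 42);
}
```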
Re: Keyword "package" prevents from importing a package module "package.d"
On Fri, Nov 03, 2023 at 12:19:48AM +, Andrey Zherikov via Digitalmars-d-learn wrote: > On Thursday, 2 November 2023 at 19:43:01 UTC, Adam D Ruppe wrote: > > On Thursday, 2 November 2023 at 19:30:58 UTC, Jonathan M Davis wrote: > > > The entire reason that it was added to the language was to be able > > > to split up existing modules without breaking code. And it does that > > > well. > > > > No, it doesn't do that well at all. In fact, it does that so extremely > > poorly that (as you might recall) there were a very large number of > > support requests shortly after Phobos started using it about broken > > builds, since it would keep the old file and the new file when you > > updated and this stupid, idiotic design can't handle that situation. > > > > This only subsided because enough time has passed that nobody tries > > using it to break up existing modules anymore. > > > > It is just a *terrible* design that never should have passed review. It > > is randomly inconsistent with the rest of the language and this > > manifests as several bugs. > > > > (including but not limited to: > > > > https://issues.dlang.org/show_bug.cgi?id=14687 doesn't work with .di > > https://issues.dlang.org/show_bug.cgi?id=17699 breaks if you try to use > > it for its intended purpose > > https://issues.dlang.org/show_bug.cgi?id=20563 error messages hit random > > problems > > all-at-once vs separate compilation of package > > leads to inconsistent reflection results > > > > im sure the list went on if i spent a few more minutes looking for my > > archives) > > > > > > > package.d is indeed completely unnecessary for creating a module > > > that publicly imports other modules in order to be able to import a > > > single module and get several modules. > > > > Yeah, it is a terrible feature that is poorly designed, hackily > > implemented, and serves no legitimate use case. 
> Is there any guide how one can refactor single-module package into multi-module package with distinction between public and private modules?

Supposedly you can do this:

    /* Original: */

    // pkg/mymodule.d
    module mymodule;
    ... // code here

    // main.d
    import mymodule;
    void main() { ... }

    /* Split */

    // pkg/mymodule/pub_submod.d
    module mymodule.pub_submod;
    ... // code here

    // pkg/mymodule/priv_submod.d
    module mymodule.priv_submod;
    ... // code here

    // pkg/mymodule/package.d
    module mymodule;
    public import mymodule.pub_submod;

    // main.d
    import mymodule;
    void main() { ... }

Barring the issues listed above, of course. T -- "The number you have dialed is imaginary. Please rotate your phone 90 degrees and try again."
Re: is the array literal in a loop stack or heap allocated?
On Wed, Oct 11, 2023 at 02:54:53AM +, mw via Digitalmars-d-learn wrote:
> Hi,
>
> I want to confirm: in the following loop, is the array literal `a` vs. `b` stack or heap allocated? and how many times?
>
> void main() {
>     int[2] a;

This is stack-allocated. Once per call to the function.

>     int[] b;

This is an empty slice. It can refer to either stack or heap memory, depending on what's assigned to it.

>     int i;
>     while (++i <= 100) {
>         a = [i, i+1];  // array literal `a` is overwritten in-place once per loop.
>         b = [i, i+1];
[...]

A new array consisting of 2 elements is allocated, once per loop, and assigned to b each time. Any arrays from previous iterations will be collected by the GC eventually. T -- They pretend to pay us, and we pretend to work. -- Russian saying
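To make the distinction observable, here's a minimal sketch (the loop is shortened to three iterations, and the asserts check only values, since where the compiler actually places things is an implementation detail):

```d
void main()
{
    int[2] a;   // fixed-size: storage is part of this stack frame
    int[] b;    // slice: refers to nothing until something is assigned

    assert(b.ptr is null);   // no allocation has happened yet

    foreach (i; 1 .. 4)
    {
        a = [i, i + 1];   // overwrites the same stack storage each time
        b = [i, i + 1];   // allocates a fresh GC array each iteration
    }

    assert(a == [3, 4]);
    assert(b == [3, 4]);
    assert(b.ptr !is null);  // now points into the GC heap
}
```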
Re: array setting : Whats going in here?
On Sat, Oct 07, 2023 at 12:00:48AM +, claptrap via Digitalmars-d-learn wrote:
>
> char[] foo;
> foo.length = 4;
> foo[] = 'a'; // ok sets all elements
> foo[] = "a"; // range error at runtime?
> foo[] = "ab"; // range error at runtime?
>
> So I meant to init with a char literal but accidentally used double quotes. Should that even compile? Shouldn't the compiler at least complain when trying to init with "ab"?

If you want initialization, don't slice the target array. For example:

    char[] foo = "a".dup;

Or:

    char[] foo;
    ...
    foo = "a".dup;

(The `.dup` is needed because a string literal is immutable.) When you write `foo[]` you're taking a slice of the array, and in that case if the lengths of both sides of the assignment don't match, you'll get a runtime error. T -- Always remember that you are unique. Just like everybody else. -- despair.com
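Here's a runnable sketch of the three cases; the mismatched slice copy is left commented out because it aborts the program at runtime:

```d
void main()
{
    char[] foo = new char[4];

    foo[] = 'a';             // element assignment: fills all 4 slots
    assert(foo == "aaaa");

    foo[] = "abcd";          // slice assignment: lengths match (4 == 4), ok
    assert(foo == "abcd");

    // foo[] = "ab";         // lengths differ (4 vs 2): aborts at runtime
}
```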
Re: Setting struct as default parameter of a function using struct literal?
On Mon, Sep 11, 2023 at 10:39:00PM +, Salih Dincer via Digitalmars-d-learn wrote:
> On Monday, 11 September 2023 at 22:13:25 UTC, H. S. Teoh wrote:
>>
>> Because sometimes I want a specific type.
>
> it's possible...
>
> ```d
> alias ST = Options;
> void specificType(ST option = ST())
[...]

This is missing the point. The point is that I don't want to have to type `Options` or `ST` twice. Since the type of the parameter is already known, the compiler does not need me to repeat the type name. It already knows enough to figure it out on its own. "Don't Repeat Yourself" (DRY). T -- "Holy war is an oxymoron." -- Lazarus Long
Re: Setting struct as default parameter of a function using struct literal?
On Mon, Sep 11, 2023 at 10:05:11PM +, Salih Dincer via Digitalmars-d-learn wrote: > On Monday, 11 September 2023 at 20:17:09 UTC, H. S. Teoh wrote: > > > > Someone should seriously come up with a way of eliminating the > > repeated type name in default parameters. > > Why not allow it to be flexible enough by using a template parameter? Because sometimes I want a specific type. T -- What did the alien say to Schubert? "Take me to your lieder."
Re: Setting struct as default parameter of a function using struct literal?
On Mon, Sep 11, 2023 at 07:59:37PM +, ryuukk_ via Digitalmars-d-learn wrote: [...]
> Recent version of D added named arguments so you can do something like:
>
> ```D
> void someFunction(Options option = Options(silenceErrors: false))
> ```
>
> I don't like the useless repeating "option option option", but that's D for you

Someone should seriously come up with a way of eliminating the repeated type name in default parameters. It's a constant fly in my otherwise tasty soup of D. Every time I have to type that I think about how nice it would be if we could just write

    void someFunction(Options option = .init) {...}

and be done with it. Or else:

    void someFunction(auto options = Options.init) {}

though this is not as good because the `auto` may make it hard to parse function declarations. T -- Life would be easier if I had the source code. -- YHL
Re: malloc error when trying to assign the returned pointer to a struct field
On Sat, Sep 09, 2023 at 09:21:32AM +, rempas via Digitalmars-d-learn wrote: > On Saturday, 9 September 2023 at 08:54:14 UTC, Brad Roberts wrote: > > I'm pretty sure this is your problem. You're allocating size bytes > > which is only going to work where sizeof(T) == 1. Changing to > > malloc(size * sizeof(T)) is likely going to work better. > > Oh man That was it! I had forget about that! Funny enough, the > reallocation tests I do letter when expanding the vector do include > that but I had forgot to place it in the new (because I had the an old > one and it included this) constructor I had made that only allocates > memory! > > Now, if only one could expect how and why "libc" knows that and > doesn't just care to give me the memory I asked it for? Or it could be > than D does something additional without telling us? Which can explain > when this memory is only present when I assign the value to the > "this._ptr` field! libc doesn't know what you intended. All it knows is that you asked it for 20 bytes (even though you actually needed 40), then later on its internal structures are corrupted (because you thought you got 40 bytes; storing data past the 20 bytes overwrote some of malloc's internal data -- this is the buffer overrun / buffer overflow I referred to). So it aborts the program instead of continuing to run in a compromised state. T -- There are four kinds of lies: lies, damn lies, and statistics.
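Brad's fix can be sketched as follows — a hypothetical minimal `Vec`, not rempas' actual code, just to show where the `* T.sizeof` belongs (copy handling is omitted for brevity; a real container would need it):

```d
import core.stdc.stdlib : malloc, free;

struct Vec(T)
{
    private T* _ptr;
    private size_t _len;

    this(size_t size)
    {
        // Allocate room for `size` *elements*, not `size` bytes:
        // forgetting the `* T.sizeof` under-allocates for any T wider
        // than one byte, and writes past the block corrupt malloc's
        // internal bookkeeping -- the abort shows up much later.
        _ptr = cast(T*) malloc(size * T.sizeof);
        _len = size;
    }

    ~this() { free(_ptr); }

    ref T opIndex(size_t i) { return _ptr[i]; }
    size_t length() const { return _len; }
}

void main()
{
    auto v = Vec!long(5);   // 5 * 8 = 40 bytes, not 5
    foreach (i; 0 .. v.length)
        v[i] = i * 10;
    assert(v[4] == 40);
}
```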
Re: malloc error when trying to assign the returned pointer to a struct field
On Fri, Sep 08, 2023 at 06:59:21PM +, rempas via Digitalmars-d-learn wrote: > On Friday, 8 September 2023 at 16:02:36 UTC, Basile B. wrote: > > > > Could this be a problem of copy construction ? > > I don't think so. The assertion seems to be violated when `malloc` is used. > And when I assert the result in the `_ptr` field. Really weird... The error message looks to me like a corruption of the malloc heap. These kinds of bugs are very hard to trace, because they may go undetected and only show up in specific circumstances, so small perturbations of completely unrelated code may make the bug appear or disappear -- just because the bug doesn't show up when you disable some code does not prove that that's where the problem is; it could be that corruption is still happening, it just so happens that it goes unnoticed when the behaviour of the code changes slightly. My guess is that you have a double-free somewhere, or there's a buffer overrun. Or maybe some bad interaction with the GC, e.g. if you tried to free a pointer from the GC heap. (Note that this may not immediately show up; free() could've assumed that everything was OK when it has in fact messed up its internal data structures; the problem would only show up later on in code that's actually unrelated to the real problem.) If I were in your shoes I'd use Valgrind / Memcheck to try to find the real cause of the problem. Chances are, it may have nothing to do with the bit of code you quoted at all. You could try to insert extra malloc/free's in various places around the code (in places along the code path, but unrelated to the problematic code) to see if that changes the behaviour of the bug. If it does, your corruption is likely somewhere other than the _ptr code you showed. T -- If the comments and the code disagree, it's likely that *both* are wrong. -- Christopher
Re: Keeping data from memory mapped files
On Fri, Sep 01, 2023 at 03:53:42AM +, Alexibu via Digitalmars-d-learn wrote: > Why do I need to copy data out of memory mapped files to avoid seg faults. > This defeats the purpose of memory mapped files. > Shouldn't the GC be able to manage it if I keep a pointer into it. The GC does not manage memory-mapped files. That's the job of the OS. > I am closing them because the OS has a limit in how many it can open, > either way the memory is still there isn't it ? No, once you close it, the OS will remove the mapping. So when you try to access that address, it will segfault. This has nothing to do with the GC, the page tables that map the memory addresses to the file are managed by the OS. By closing it you're basically telling the OS "I don't need the mapping anymore", so it removes the mapping from your page tables and that address no longer exists in your process' address space. So the next time you try to access it, you will get a segfault. T -- Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be algorithms.
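A sketch using Phobos' std.mmfile (the file name is made up for the example): the mapping is only valid while the MmFile object is alive, so anything you want to keep past that point must be copied out, e.g. with `.dup`.

```d
import std.file : write, remove;
import std.mmfile : MmFile;

void main()
{
    write("demo.dat", "hello");          // scratch file for the example
    scope (exit) remove("demo.dat");

    ubyte[] copied;
    {
        scope mmf = new MmFile("demo.dat");
        auto view = cast(ubyte[]) mmf[];  // window into the OS mapping
        assert(view[0] == 'h');

        copied = view.dup;   // copy out while the mapping still exists
    }   // mmf is destroyed here: the pages are unmapped,
        // and touching `view` past this point would segfault

    assert(copied == cast(ubyte[]) "hello");  // the copy survives
}
```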
Re: Unicode in strings
On Thu, Jul 27, 2023 at 10:15:47PM +, Cecil Ward via Digitalmars-d-learn wrote: > How do I get a wstring or dstring with a code point of 0xA0 in it ? > That’s a type of space, is it? I keep getting a message from the LDC > compiler something like "Outside Unicode code space" in my unittests > when this is the first character in a wstring. I’ve tried all sorts of > escape sequences but I must simply be misunderstanding the docs. I > could always copy-paste a real live one into a double quoted string > and be done with it, I suppose. D strings are assumed to be encoded in UTF-8 / UTF-16 / UTF-32. So if you wrote something like `\xA0` in your string will likely generate an invalid encoding. Try instead `\u00A0`. T -- Ph.D. = Permanent head Damage
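A minimal demonstration:

```d
void main()
{
    // "\xA0" alone would be an invalid UTF-8 sequence;
    // "\u00A0" is the NO-BREAK SPACE code point and works in any width.
    wstring w = "\u00A0abc"w;
    dstring d = "\u00A0abc"d;

    assert(w[0] == '\u00A0');
    assert(d[0] == 0x00A0);
    assert(w.length == 4);   // one wchar for U+00A0, three for "abc"
}
```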
Re: Pre-expanding alloc cell(s) / reserving space for an associative array
On Mon, Jul 10, 2023 at 09:30:57AM +, IchorDev via Digitalmars-d-learn wrote: [...] > From the spec it sounds as though (but good luck testing for sure) > that if you have (for example) 6 big dummy key-value pairs in the AA > to begin with, then if you use `.clear` it "Removes all remaining keys > and values from [the] associative array. The array is not rehashed > after removal, __to allow for the existing storage to be reused.__" [...] This is not an accurate understanding of what actually happens. The AA implementation consists of a primary hashtable (an array), each slot of which points to a list of buckets. Clearing the AA does not discard the hashtable, but does dispose of the buckets, so adding new keys afterwards will allocate new buckets. So the buckets used by the dummy key-value pairs do not get reused without a reallocation. T -- This is a tpyo.
Re: Dynamic array of strings and appending a zero length array
On Sat, Jul 08, 2023 at 05:15:26PM +, Cecil Ward via Digitalmars-d-learn wrote: > I have a dynamic array of dstrings and I’m spending dstrings to it. At > one point I need to append a zero-length string just to increase the > length of the array by one but I can’t have a slot containing garbage. > I thought about ++arr.length - would that work, while giving me valid > contents to the final slot ? Unlike C/C++, the D runtime always ensures that things are initialized unless you explicitly tell it not to (via void-initialization). So ++arr.length will work; the new element will be initialized to dstring.init (which is the empty string). T -- If Java had true garbage collection, most programs would delete themselves upon execution. -- Robert Sewell
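A quick check of that behaviour:

```d
void main()
{
    dstring[] arr = ["alpha"d, "beta"d];

    ++arr.length;   // new slot is initialized to dstring.init, not garbage

    assert(arr.length == 3);
    assert(arr[2] == ""d);     // compares equal to the empty string
    assert(arr[2] is null);    // dstring.init is the null/empty string
}
```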
Re: Bug in usage of associative array: dynamic array with string as a key
On Fri, Jun 30, 2023 at 07:05:23PM +, Cecil Ward via Digitalmars-d-learn wrote: [...] It would help if you could post the complete code that reproduces the problem. Or, if you do not wish to reveal your code, reduce it to a minimal case that still exhibits the same problem, so that we can see it for ourselves. The snippets you provided do not provide enough information to identify the problem. T -- What's the difference between a 4D tube and an overweight Dutchman? One is a hollow spherinder, and the other is a spherical Hollander.
Re: Debugging by old fashioned trace log printfs / writefln
On Fri, Jun 30, 2023 at 03:43:14PM +, Cecil Ward via Digitalmars-d-learn wrote: [...]
> Since I can pass my main function some compile-time-defined input, the whole program should be capable of being executed with CTFE, no? So in that case pragma( msg ) should suffice for a test situation? Would pragma(message) have the advantage over writefln that I don’t have to pervert the function attributes like nogc nothrow pure ?

Just use the `debug` statement:

    auto pureFunc(Args args) pure
    {
        ...
        debug writefln("debug info: %s", ...);
        ...
    }

Compile with `-debug` to enable the writefln during development. When not compiling with `-debug`, the writefln will not be compiled and the function will actually be pure.

The problem with pragma(msg) is that it happens very early in the compilation process; some things may not be available to it, such as the value of variables in CTFE. This may limit its usefulness in some situations. For more details on this, see: https://wiki.dlang.org/Compile-time_vs._compile-time T -- He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
Re: is Dlang support Uniform initialization like c++
On Fri, Jun 30, 2023 at 03:18:41PM +, lili via Digitalmars-d-learn wrote:
> struct Point {
>     int x;
>     int y;
>     this(int x, int y) { this.x = x; this.y = y; }
> }
>
> void addPoint(Point a, Point b) {
>     ...
> }
>
> How to write this: addPoint({4,5}, {4,6})

    addPoint(Point(4,5), Point(4,6));

T -- "No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
Re: static if - unexpected results
On Friday, 23 June 2023 at 15:22:36 UTC, DLearner wrote:
> On Friday, 23 June 2023 at 14:31:45 UTC, FeepingCreature wrote:
>> On Friday, 23 June 2023 at 14:22:24 UTC, DLearner wrote:
>> [...]
>>
>> ```
>> static assert(__traits(isPOD, int)); // ok.
>> static assert(__traits(isPOD, byte)); // ok.
>> ```
>>
>> It's a bug in either the spec or the compiler.
>
> I am using
> ```
> DMD64 D Compiler v2.103.0-dirty
> ```
> under
> ```
> Windows [Version 10.0.19045.3086]
> ```
> Do I need to report this anywhere?

Tested your original code on latest dmd git master, here's the output:

```
char1 is a char
int1 is a struct
foovar1 is a struct
byte1 is a struct
```

Looks like there isn't a problem? Or at least, it's now fixed in git master. Which exact version of dmd are you using? Did you download from dlang.org or did you build your own? --T
Re: A couple of questions about arrays and slices
On Wed, Jun 21, 2023 at 02:09:26AM +, Cecil Ward via Digitalmars-d-learn wrote:
> First is an easy one:
>
> 1.) I have a large array and a sub-slice which I want to set up to be pointing into a sub-range of it. What do I write if I know the start and end indices ? Concerned about an off-by-one error, I have start_index and past_end_index (exclusive).

    array[start_idx .. one_past_end_idx]

> 2.) I have a dynamic array and I wish to preinitialise its alloc cell to be a certain large size so that I don’t need to reallocate often initially. I tell myself that I can set the .length property. Is that true?

You can use `array.reserve(preinitSize);`.

> 2a.) And what happens when the cell is extended, is the remainder zero-filled or remaining full of garbage, or is the size of the alloc cell something separate from the dynamic array’s knowledge of the number of valid elements in it ?

The size of the allocated cell is managed by druntime. On the user code side, all you know is the slice (start pointer + length). The allocated region outside the current array length is not initialized. Assigning to array.length initializes the area that the array occupies after the length has been extended. T -- Ph.D. = Permanent head Damage
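A sketch tying the answers together (`1000` and the indices are arbitrary):

```d
void main()
{
    int[] data;
    data.reserve(1000);              // pre-allocate; length stays 0
    assert(data.length == 0);
    assert(data.capacity >= 1000);   // room for appends without reallocating

    foreach (i; 0 .. 1000)
        data ~= i;                   // stays within the reserved capacity

    assert(data.length == 1000);

    // Slicing a sub-range [start, one_past_end):
    auto sub = data[10 .. 20];
    assert(sub.length == 10);
    assert(sub[0] == 10);
}
```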
Re: How does D’s ‘import’ work?
On Sun, Jun 18, 2023 at 03:51:14PM -0600, Jonathan M Davis via Digitalmars-d-learn wrote: > On Sunday, June 18, 2023 2:24:10 PM MDT Cecil Ward via Digitalmars-d-learn > wrote: > > I wasn’t intending to use DMD, rather ldc if possible or GDC > > because of their excellent optimisation, in which DMD seems > > lacking, is that fair? (Have only briefly looked at dmd+x86 and > > haven’t given DMD’s back end a fair trial.) My experience with D for the past decade or so has consistently shown that executables produced by LDC or GDC generally run about 40% faster than those produced by DMD. Especially with CPU-intensive computations. This is just the hard fact. Of course, for some applications like shell-script replacements (which, incidentally, D is really good at -- once your script passes the level of complexity beyond which writing a shell script just becomes unmanageable), the difference doesn't really matter, and I'd use DMD just for faster compile times. The one thing the DMD backend is really good at, is compiling stuff *really* fast. LDC has been catching up in this department, but currently DMD still wins the fast compilation time race, by quite a lot. So it's very useful for fast turnaround when you're coding. But for release builds, LDC and GDC are your ticket. > In general, dmd is fantastic for its fast compilation speed. So, it > works really well for developing whatever software you're working on > (whereas ldc and gdc are typically going to be slower at compiling). > And depending on what you're doing, the code is plenty fast. However, > if you want to maximize the efficiency of your code, then you > definitely want to be building the binaries that you actually use or > release with ldc or gdc. [...] Yeah, LDC/GDC are really good at producing optimized executables, but they do take a long time to do it. (Probably 'cos it's a hard problem!) So for development -- DMD. For final release build -- GDC/LDC. T -- If it tastes good, it's probably bad for you.
Re: ldc link error on new machine: undefined reference to `_D6object9Throwable7messageMxFNbNfZAxa'
On Thu, Jun 15, 2023 at 12:49:30AM +, mw via Digitalmars-d-learn wrote: > Hi, > > I switched to a different machine to build my project, suddenly I got > lots of link errors. (It builds fine on the old machine, and my > software version are the same on both machines LDC - the LLVM D > compiler (1.32.2)) Recently encountered a similar problem, ultimately the cause was that my library paths turned out to be wrongly set, so it was picking up the wrong version of the precompiled libraries. Probably you could check whether the library paths in ldc2.conf are set correctly, and also double-check whether the libraries at those paths are actually the correct ones for your compiler version (you may have installed the wrong libraries in the right paths). Mixing up libraries from different LDC releases tend to show up as link errors of this kind. T -- The computer is only a tool. Unfortunately, so is the user. -- Armaphine, K5
Re: byte and short data types use cases
On Sat, Jun 10, 2023 at 09:58:12PM +, Cecil Ward via Digitalmars-d-learn wrote: > On Friday, 9 June 2023 at 15:07:54 UTC, Murloc wrote: [...] > > So you can optimize memory usage by using arrays of things smaller > > than `int` if these are enough for your purposes, but what about > > using these instead of single variables, for example as an iterator > > in a loop, if range of such a data type is enough for me? Is there > > any advantages on doing that? > > A couple of other important use-cases came to me. The first one is > unicode which has three main representations, utf-8 which is a stream > of bytes each character can be several bytes, utf-16 where a character > can be one or rarely two 16-bit words, and utf32 - a stream of 32-bit > words, one per character. The simplicity of the latter is a huge deal > in speed efficiency, but utf32 takes up almost four times as memory as > utf-8 for western european languages like english or french. The > four-to-one ratio means that the processor has to pull in four times > the amount of memory so that’s a slowdown, but on the other hand it is > processing the same amount of characters whichever way you look at it, > and in utf8 the cpu is having to parse more bytes than characters > unless the text is entirely ASCII-like. [...] On contemporary machines, the CPU is so fast that memory access is a much bigger bottleneck than processing speed. So unless an operation is being run hundreds of thousands of times, you're not likely to notice the difference. OTOH, accessing memory is slow (that's why the memory cache hierarchy exists). So utf8 is actually advantageous here: it fits in a smaller space, so it's faster to fetch from memory; more of it can fit in the CPU cache, so less DRAM roundtrips are needed. Which is faster. Yes you need extra processing because of the variable-width encoding, but it happens mostly inside the CPU, which is fast enough that it generally outstrips the memory roundtrip overhead. 
So unless you're doing something *really* complex with the utf8 data, it's an overall win in terms of performance. The CPU gets to do what it's good at -- running complex code -- and the memory cache gets to do what it's good at: minimizing the amount of slow DRAM roundtrips. T -- It said to install Windows 2000 or better, so I installed Linux instead.
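The size difference is easy to measure (the sample text is arbitrary):

```d
void main()
{
    string  s8  = "héllo wörld";    // UTF-8: é and ö take 2 bytes each
    dstring s32 = "héllo wörld"d;   // UTF-32: 4 bytes per code point

    assert(s32.length == 11);                 // 11 code points
    assert(s8.length == 13);                  // 13 bytes in UTF-8
    assert(s32.length * dchar.sizeof == 44);  // 44 bytes in UTF-32
}
```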
Re: byte and short data types use cases
On Fri, Jun 09, 2023 at 11:24:38AM +, Murloc via Digitalmars-d-learn wrote: [...] > Which raised another question: since objects of types smaller than > `int` are promoted to `int` to use integer arithmetic on them anyway, > is there any point in using anything of integer type less than `int` > other than to limit the range of values that can be assigned to a > variable at compile time? Not just at compile time, at runtime they will also be fixed to that width (mapped to a hardware register of that size) and will not be able to contain a larger value. [...] > People say that there is no advantage for using `byte`/`short` type > for integer objects over an int for a single variable, however, as > they say, this is not true for arrays, where you can save some memory > space by using `byte`/`short` instead of `int`. That's correct. > But isn't any further manipulations with these array objects will > produce results of type `int` anyway? Don't you have to cast these > objects over and over again after manipulating them to write them back > into that array or for some other manipulations with these smaller > types objects? Yes you will have to cast them back. Casting often translates to a no-op or just a single instruction in the machine code; you just write part of a 32-bit register back to memory instead of the whole thing, and this automatically truncates the value to the narrow int. The general advice is, perform computations with int or wider, then truncate when writing back to storage for storage efficiency. So generally you wouldn't cast the value to short/byte until the very end when you're about to store the final result back to the array. At that point you'd probably also want to do a range check to catch any potential overflows. > Some people say that these promoting and casting operations in summary > may have an even slower overall effect than simply using int, so I'm > kind of confused about the use cases of these data types... 
(I think > that my misunderstanding comes from not knowing how things happen at a > slightly lower level of abstractions, like which operations require > memory allocation, which do not, etc. Maybe some resource > recommendations on that?) Thanks! I highly recommend taking an introductory course to assembly language, or finding a book / online tutorial on the subject. Understanding how the machine actually works under the hood will help answer a lot of these questions, even if you'll never actually write a single line of assembly code. But in a nutshell: integer data types do not allocate, unless you explicitly ask for it (e.g. `int* p = new int;` -- but you almost never want to do this). They are held in machine registers or stored on the runtime stack, and always occupy a fixed size, so almost no memory management is needed for them. (Which is also why they're preferred when you don't need anything more fancy, because they're also super-fast.) Promoting an int takes at most 1 machine instruction, or, in the case of unsigned values, sometimes zero instructions. Casting back to a narrow int is often a no-op (the subsequent code just ignores the upper bits). The performance difference is negligible, unless you're doing expensive things like range checking after every operation (generally you don't need to anyway, usually it's sufficient to check range at the end of a computation, not at every intermediate step -- unless you have reason to believe that an intermediate step is liable to overflow or wrap around). T -- People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
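The compute-wide/store-narrow pattern described above, sketched with an arbitrary `short` array:

```d
void main()
{
    short[] samples = [1000, 2000, 3000];   // 2 bytes per element in storage

    foreach (ref s; samples)
    {
        int widened = s * 2;          // arithmetic promotes to int
        assert(widened <= short.max); // range check before narrowing
        s = cast(short) widened;      // explicit truncation back to storage
    }

    assert(samples == [2000, 4000, 6000]);
}
```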
Re: How does D’s ‘import’ work?
On Wed, May 31, 2023 at 06:43:52PM +, Cecil Ward via Digitalmars-d-learn wrote: > Is there an explanation of how D’s ‘import’ works somewhere? I’m > trying to understand the comparison with the inclusion of .h files, > similarities if any and differences with the process. Unlike C's #include, `import` does NOT paste the contents of the imported file into the context of `import`, like #include would do. Instead, it causes the compiler to load and parse the imported file, placing the parsed symbols into a separate symbol table dedicated for that module (in D, a file == a module). These symbols are then pulled into the local symbol table so that they become available to code containing the import declaration. (There's a variation, `static import`, that does the same thing except the last step of pulling symbols into the local symbol table. So the symbols will not "pollute" the current namespace, but are still accessible via their fully-qualified name (FQN), i.e., by the form `pkg.mod.mysymbol`, for a symbol `mysymbol` defined in the module `pkg.mod`, which in turn is a module under the package `pkg`.) For more information: https://tour.dlang.org/tour/en/basics/imports-and-modules https://dlang.org/spec/module.html T -- People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise the programs they write will be pretty weird. -- D. Knuth
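A quick illustration of the `import` vs `static import` difference, using two Phobos modules:

```d
static import std.algorithm;   // symbols reachable only via their FQN
import std.conv;               // `to` is pulled into the local symbol table

void main()
{
    assert(to!int("42") == 42);            // plain import: unqualified name works
    assert(std.algorithm.max(3, 7) == 7);  // static import: FQN required
    // `max(3, 7)` alone would not compile here, because std.algorithm's
    // symbols were never pulled into the local symbol table.
}
```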
Re: How get struct value by member name string ?
On Tue, May 30, 2023 at 01:24:46AM +, John Xu via Digitalmars-d-learn wrote:
> On Monday, 29 May 2023 at 11:21:11 UTC, Adam D Ruppe wrote:
>> On Monday, 29 May 2023 at 09:35:11 UTC, John Xu wrote:
>>> Error: variable `column` cannot be read at compile time
>>
>> you should generally getMember on a variable
>>
>> T t;
>> __traits(getMember, t, "name")
>>
>> like that, that's as if you wrote t.name
>
> It seems I can't use variable as member name:
>
> struct T {int a; string name;}
> T t;
> string s = "name";
> writeln(__traits(getMember, t, s));
>
> Above code fails to compile. Any help?

Short answer: `s` must be known at compile-time. Or more precisely, known at the time of template expansion. In this case, use `enum`:

    enum s = "name";

Long answer: https://wiki.dlang.org/Compile-time_vs._compile-time T -- Which is worse: ignorance or apathy? Who knows? Who cares? -- Erich Schubert
Re: Concepts like c++20 with specialized overload resolution.
On Sat, May 27, 2023 at 05:49:27PM +, vushu via Digitalmars-d-learn wrote:
> On Saturday, 27 May 2023 at 16:38:43 UTC, Steven Schveighoffer wrote: [...]
>> void make_lava(T)(ref T lava) if (hasMagma!T) {
>>     lava.magma();
>> }
>>
>> void make_lava(T)(ref T lava_thing) if (!hasMagma!T) {
>>     lava_thing.try_making_lava();
>> }
[...]
> I see, thanks for the example :), I think this is probably the closest equivalent in dlang.

You can also use static if inside the function, which will give you an if-then-else structure:

    void make_lava(T)(ref T lava)
    {
        static if (hasMagma!T)
        {
            lava.magma();
        }
        else
        {
            lava.try_making_lava();
        }
    }

T -- Written on the window of a clothing store: No shirt, no shoes, no service.
Re: Proper way to handle "alias this" deprecation for classes
On Wed, May 10, 2023 at 10:57:13PM +, Chris Piker via Digitalmars-d-learn wrote:
> On Wednesday, 10 May 2023 at 20:25:48 UTC, H. S. Teoh wrote:
>> On Wed, May 10, 2023 at 07:56:10PM +, Chris Piker via Digitalmars-d-learn wrote: [...]
>> I also suffer from left/right confusion, and always have to pause to think about which is the right(!) word before uttering it.
>
> Oh, I thought I was the only one with that difficulty. Glad to hear I'm not alone. :-)

:-)

> I have a tendency to think of things by their purpose when programming but not by their location on the line or page. So terms such as "writable" versus "ephemeral" or "addressable" versus "temporary" (or "register"), make so much more sense to me.

Yeah, TBH I was never a fan of the lvalue/rvalue terminology. In a hypothetical language where the arguments to an assignment operator are reversed, the terminology would become needlessly confusing. E.g., if there were an operator `X => Y;` that means "assign the value of X to Y", then the roles of lvalues/rvalues would be reversed.

> Back on the ref issue for a moment... I'd imagine that asking the compiler to delay creating a writable variable until it finds out that a storage location is actually needed by subsequent statements, is a tall order. So D chose to introduce programmers to lvalues and rvalues head-on, instead of creating a leaky abstraction.

It depends on how you look at it. The very concept of a variable in memory is actually already an abstraction. Modern compilers may enregister variables or even completely elide them. Assignments may be reordered, and the CPU may execute things out-of-order (as long as semantics are preserved). Intermediate values may not get stored at all, but get folded into the larger computation and perhaps merged with some other operation, with the resulting compound operation mapped to a single CPU instruction, etc. So in that sense the compiler is quite capable of figuring out what to do...
But what it can't do is read the programmer's mind to deduce the intent of his code. Exact semantics must be somehow conveyed to the compiler, and sad to say humans aren't very good at being exact. Often we *think* we know exactly what the computation is, but in reality we gloss over low-level details that will make a big difference in the outcome of the computation in the corner cases. The whole rvalue/lvalue business is really more a way of conveying to the compiler what exactly must happen, rather than directly corresponding to any actual feature in the underlying physical machine. T -- Computerese Irregular Verb Conjugation: I have preferences. You have biases. He/She has prejudices. -- Gene Wirchenko
Re: Proper way to handle "alias this" deprecation for classes
On Wed, May 10, 2023 at 07:56:10PM +, Chris Piker via Digitalmars-d-learn wrote: [...] > My problem with the terms lvalue and rvalue is much more basic, and is > just a personal one that only affects probably 0.1% of people. I just > can't keep left vs. right straight in real life. "Right" in my head > always means "correct". > > My daughter hates it when I'm telling her which way to turn the car > since I've said the wrong direction so many times. :) I also suffer from left/right confusion, and always have to pause to think about which is the right(!) word before uttering it. :-D Would compass directions be more helpful? (wvalue vs. evalue) Or would it suffer from the same problem? (One could retroactively rationalize it as *w*ritable value vs. *e*phemeral value. :-P) T -- By understanding a machine-oriented language, the programmer will tend to use a much more efficient method; it is much closer to reality. -- D. Knuth
Re: Proper way to handle "alias this" deprecation for classes
On Wed, May 10, 2023 at 03:24:48PM +, Chris Piker via Digitalmars-d-learn wrote: [...] > It's off topic, but I forget why managing memory for rvalues* was > pushed onto the programmer and not handled by the compiler. I'm sure > there is a good reason but it does seem like a symmetry breaking > requirement. > > -- > *or was it lvalues, I can never keep the two separate. Wish some > other terminology was adopted long ago, such as "named" vs. > "ephemeral".

	x = y;
	^   ^
	|   |
	lvalue rvalue

An lvalue is simply something that can appear on the *l*eft side of an assignment statement, and an rvalue is something that appears on the *r*ight side of an assignment statement. It seems trivially obvious, but has far-reaching consequences.

For one thing, to be an lvalue means that you must be able to assign a value to it. I.e., it must be a variable that exists somewhere in memory; `1 = x;` is illegal because `1` is a literal with no memory associated with it, so you cannot assign a new value to it. For something to be an rvalue means that it's a value like `1` that may not necessarily have a memory address associated with it. For example, the value of a computation is an rvalue:

	// This is OK:
	x = y + 1;

	// This is not OK:
	(y + 1) = x;

The value of a computation cannot be assigned to; it makes no sense. Therefore, given an rvalue, you are not guaranteed that assignment is legal.

Note however, that given an lvalue, you can always get an rvalue out of it. In the first example above, `y` can be an lvalue because it's a variable with a memory location. However, it can also be used as an rvalue. Or, if you like, `x = y;` contains an implicit "cast" of y to an rvalue. But you can never turn an rvalue back into an lvalue.

T -- It's bad luck to be superstitious. -- YHL
Re: quick question, probably of little importance...
On Wed, Apr 26, 2023 at 11:07:39PM +, WhatMeWorry via Digitalmars-d-learn wrote: > On Wednesday, 26 April 2023 at 23:02:07 UTC, Richard (Rikki) Andrew > Cattermole wrote: > > Don't forget ``num % 2 == 0``. > > > > None should matter, pretty much all production compilers within the > > last 30 years should recognize all forms of this and do the right > > thing. > > Thanks. Fastest reply ever! And I believe across the world? I > suppose my examples required overhead of a function call. So maybe num > % 2 == 0 is fastest? If performance matters, you'd be using an optimizing compiler. And unless you're hiding your function implementation behind a .di, almost all optimizing compilers would inline it, so you shouldn't even be able to tell the difference. T -- Without outlines, life would be pointless.
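To make the claim above concrete, here is a minimal sketch (the helper names are mine, not from the thread) comparing the modulo form with the bitwise form of the evenness test; any optimizing compiler should reduce both to the same instruction sequence:

```d
// Two equivalent ways to test evenness; modern backends typically
// recognize both and emit identical code.
bool isEvenMod(int n) { return n % 2 == 0; }
bool isEvenAnd(int n) { return (n & 1) == 0; }

void main()
{
    // Spot-check that they agree, including on negative values:
    foreach (n; [-4, -3, 0, 1, 2, 7])
        assert(isEvenMod(n) == isEvenAnd(n));
}
```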
Re: D style - member functions
On Wed, Apr 26, 2023 at 06:24:08PM +, DLearner via Digitalmars-d-learn wrote: > Consider: > ``` > struct S1 { >int A; >int B; >int foo() { > return(A+B); >} > } > > struct S2 { >int A; >int B; > } > int fnAddS2(S2 X) { >return (X.A + X.B); > } > > void main() { >import std.stdio : writeln; > >S1 Var1 = S1(1, 2); >writeln("Total Var1 = ", Var1.foo()); > >S2 Var2 = S2(1, 2); >writeln("Total Var2 = ", fnAddS2(Var2)); > >return; > } > ``` > > Of the two ways shown of producing the total from the same underlying > structure, which is the better style? Either way works, it doesn't really matter. The slight difference is that the member function is preferred when resolving a symbol, so if there's a module-level function called `foo` that takes S1 as a parameter, the member function would be called instead. > Further, do we care about the situation where there are many variables > of type 'S', which presumably means the function code generated from > S1 gets duplicated many times, but not so with S2? This is false. Member functions are only instantiated once in the entire program, not with every instance of S. (Template functions may be instantiated more than once, but that's still only once per combination of template arguments, not once per instance of the enclosing type.) T -- MAS = Mana Ada Sistem?
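One more point in favor of "it doesn't really matter": thanks to UFCS, the free-function version can be called with member syntax anyway, so the two styles even look the same at the call site. A minimal sketch:

```d
import std.stdio : writeln;

struct S2 { int A; int B; }

// A free function at module scope...
int fnAddS2(S2 x) { return x.A + x.B; }

void main()
{
    auto v = S2(1, 2);
    writeln(fnAddS2(v)); // free-function call style
    writeln(v.fnAddS2);  // UFCS member-call style; same function
}
```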
Re: How can a function pointer required to be extern(C)?
On Wed, Apr 12, 2023 at 08:23:51PM +, rempas via Digitalmars-d-learn wrote: > Sorry if the title doesn't make any sense, let me explain. So, I do have the > following code that does not compile: > > ```d > import core.sys.posix.pthread; /* The library */ > > struct Thread { > private: > pthread_t thread_id; > > public: > this(void* function(void*) func, void* arg = null, scope > const(pthread_attr_t*) attr = null) { > pthread_create(&thread_id, attr, func, arg); > } > > @property: > pthread_t id() { return this.thread_id; } > } > > ``` > > Yes, I'm trying to "encapsulate" the Pthread (POSIX threads) API. > Normally, the function pointer that is passed to "pthread_create" must > be "extern(C)" and this is the complaining that the compiler does. So, > I'm thinking to replace the constructor to this: > > ```d > this(extern(C) void* function(void*) func, void* arg = null, > scope const(pthread_attr_t*) attr = null) > { pthread_create(&thread_id, attr, func, arg); } > ``` > > I just added "extern(C)" before the type. This is how it looks in the > error message so it must work right? Well... it doesn't. And here I am > wondering why. Any ideas?

IMO this is a bug either in D's syntax or in the parser. I'd file an enhancement request. In the meantime, you can use alias as a workaround:

---snip---
extern(C) void* abc(void*) { return null; }
alias FuncPtr = typeof(&abc);
pragma(msg, typeof(abc));
pragma(msg, typeof(&abc));

//void wrapper(extern(C) void* function(void*) callback) {} // NG
void wrapper(FuncPtr callback) {} // OK
pragma(msg, typeof(wrapper));
---snip---

T -- A programming language should be a toolbox for the programmer to draw upon, not a minefield of dangerous explosives that you have to very carefully avoid touching in the wrong way.
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Thu, Apr 06, 2023 at 01:20:28AM +, Paul via Digitalmars-d-learn wrote: [...] > Yes I understand, basically, what's going on in hardware. I just > wasn't sure if the access type was linked to the container type. It > seems obvious now, since you've both made it clear, that it also > depends on how I'm accessing my container. > > Having said all of this, isn't a D 'range' fundamentally a sequential > access container (i.e popFront) ? D ranges are conceptually sequential, but the actual underlying memory access patterns depend on the concrete type at runtime. An array's elements are stored sequentially in memory, and arrays are ranges. But a linked-list can also have a range interface, yet its elements may be stored in non-consecutive memory locations. So the concrete type matters here; the range API only gives you conceptual sequentiality, it does not guarantee physically sequential memory access. T -- Many open minds should be closed for repairs. -- K5 user
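To illustrate the array vs. linked-list point, here is a minimal sketch (the types are mine, not from the thread) of a range over a singly-linked list: the range API is identical to an array's, but the nodes need not be contiguous in memory:

```d
import std.range.primitives : isInputRange;

struct Node { int value; Node* next; }

// A range interface over the list: conceptually sequential,
// physically scattered wherever the GC placed each node.
struct ListRange
{
    Node* head;
    bool empty() const { return head is null; }
    int front() const { return head.value; }
    void popFront() { head = head.next; }
}
static assert(isInputRange!ListRange);

void main()
{
    auto c = new Node(3, null);
    auto b = new Node(2, c);
    auto a = new Node(1, b);

    int[] got;
    foreach (v; ListRange(a)) // same foreach as over an array
        got ~= v;
    assert(got == [1, 2, 3]);
}
```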
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Wed, Apr 05, 2023 at 10:34:22PM +, Paul via Digitalmars-d-learn wrote: > On Tuesday, 4 April 2023 at 22:20:52 UTC, H. S. Teoh wrote: > > > Best practices for arrays in hot loops: [...] > > - Where possible, prefer sequential access over random access (take > > advantage of the CPU cache hierarchy). > > Thanks for sharing Teoh! Very helpful. > > would this be random access? for(size_t i; i < arr.length; i++) using > indices? ...and this be sequential foreach(a;arr) ? > > or would they have to be completely different kinds of containers? a > dlang 'range' vs arr[]? [...] The exact syntactic construct you use is not important; under the hood, for(i; i < arr.length; i++) over an array and foreach(a; arr) both walk the array's elements sequentially in memory. What matters is the access pattern, not the loop syntax.
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Tue, Apr 04, 2023 at 09:35:29PM +, Paul via Digitalmars-d-learn wrote: [...] > Well Steven just making the change you said reduced the execution time > from ~6-7 secs to ~3 secs. Then, including the 'parallel' in the > foreach statement took it down to ~1 sec. > > Boy lesson learned in appending-to and zeroing dynamic arrays in a hot > loop!

Best practices for arrays in hot loops:

- Avoid appending if possible; instead, pre-allocate outside the loop.
- Where possible, reuse existing arrays instead of discarding old ones and allocating new ones.
- Use slices where possible instead of making copies of subarrays (this esp. applies to strings).
- Where possible, prefer sequential access over random access (take advantage of the CPU cache hierarchy).

T -- Famous last words: I *think* this will work...
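The first two points above (pre-allocate, then reuse) can be sketched like this; the sizes and the doubling "work" are stand-ins:

```d
import std.stdio : writeln;

void main()
{
    enum n = 1000;

    // Instead of `int[] buf; buf ~= ...` inside the hot loop,
    // pre-allocate once and reuse the same buffer every pass:
    auto buf = new int[n];

    foreach (pass; 0 .. 3)   // the "hot loop" over many passes
    {
        buf[] = 0;           // cheap in-place reset, no reallocation
        foreach (i; 0 .. n)
            buf[i] = i * 2;  // fill sequentially (cache-friendly)
    }

    writeln(buf[0 .. 5]);    // [0, 2, 4, 6, 8]
}
```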
Re: better video rendering in d
On Tue, Mar 21, 2023 at 05:29:22PM +, monkyyy via Digitalmars-d-learn wrote: > On Tuesday, 21 March 2023 at 17:18:15 UTC, H. S. Teoh wrote: > > On Tue, Mar 21, 2023 at 04:57:49PM +, monkyyy via > > Digitalmars-d-learn wrote: > > > My current method of making videos of using raylib to generate > > > screenshots, throwing those screenshots into a folder and calling > > > a magic ffmpeg command is ... slow. > > [...] > > > > How slow is it now, and how fast do you want it to be? > > T > > I vaguely remember an hour and half for 5 minutes of video when its > extremely lightweight and raylib trivially does real-time to display > it normally and realistically I wouldn't be surprised if it could do > 1000 frames a second. > > Coping several gb of data to disk(that probably asking the gpu one > pixel at a time) to be compressed down into a dozen mb of video is > just... temp shit. I should just do something that isnt stressing > hard drives extremely unnecessarily. You could try to feed the frames to ffmpeg over stdin instead of storing the frames on disk. See this, for example: https://stackoverflow.com/questions/45899585/pipe-input-in-to-ffmpeg-stdin Then you can just feed live data to it in the background while you generate frames in the foreground. T -- Lottery: tax on the stupid. -- Slashdotter
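A rough sketch of the stdin approach in D, using std.process. The ffmpeg flags and the 640x480 RGBA/60fps frame format here are assumptions; they must be matched to whatever pixel format and size raylib actually produces:

```d
import std.process : pipeProcess, Redirect, wait;

void main()
{
    // Spawn ffmpeg reading raw frames from its stdin ("-i -").
    auto pipes = pipeProcess([
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pixel_format", "rgba",
        "-video_size", "640x480", "-framerate", "60",
        "-i", "-",
        "out.mp4"
    ], Redirect.stdin);

    auto frame = new ubyte[640 * 480 * 4]; // one RGBA frame
    foreach (i; 0 .. 60)
    {
        // ...render into `frame` here (raylib screenshot data)...
        pipes.stdin.rawWrite(frame);       // no disk round-trip
    }

    pipes.stdin.close(); // EOF tells ffmpeg to finish encoding
    wait(pipes.pid);
}
```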
Re: better video rendering in d
On Tue, Mar 21, 2023 at 04:57:49PM +, monkyyy via Digitalmars-d-learn wrote: > My current method of making videos of using raylib to generate screenshots, > throwing those screenshots into a folder and calling a magic ffmpeg command > is ... slow. [...] How slow is it now, and how fast do you want it to be? One possibility is to generate frames in parallel... though if you're recording a video of a sequence of operations, each of which depends on the previous, it may not be possible to parallelize. I have a toy project that generates animations of a 3D model parametrized over time. It generates .pov files and runs POVRay to generate frames, then calls ffmpeg to make the video. This is parallelized with std.parallelism.parallel, and is reasonably fast. However, ffmpeg will take a long time no matter what (encoding a video is a non-trivial operation). T -- Try to keep an open mind, but not so open your brain falls out. -- theboz
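The parallel-frame idea can be sketched as below; renderFrame is a hypothetical placeholder for the per-frame work (e.g. writing a .pov file and invoking POVRay), and it only parallelizes cleanly if each frame depends solely on its index/time parameter:

```d
import std.parallelism : parallel;
import std.range : iota;

void renderFrame(size_t i)
{
    // Placeholder: generate frame i from its time parameter alone.
}

void main()
{
    enum nFrames = 120;
    // Each iteration runs on the task pool, saturating all cores,
    // since no frame depends on any other frame's output.
    foreach (i; iota(nFrames).parallel)
        renderFrame(i);
}
```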
Re: @nogc and Phobos
On Sat, Mar 11, 2023 at 04:21:40PM +, bomat via Digitalmars-d-learn wrote: [...] > Although I come from a C++ background, I'm not exactly a fanboy of > that language (you can probably tell, otherwise I wouldn't be here). > But after hearing praise for D for being a cleaner and better version > of C/C++, I am a bit disappointed so far, tbh. I don't want to go into > too much detail here to not derail the thread entirely, but I think it > repeats too many old sins, like implicit type conversions, the `for` > loop syntax (although I guess one wouldn't need it that often because > of `foreach`), the `switch` `case` fallthrough, and the cancerous > `const` (as far as I can tell, `immutable` is an even worse flavor of > it). [...] I also came from a C/C++ background. The GC turned me off D for a long time, until one day I decided to just give it a try to see if it was all as bad as people made it sound. I have to admit that GC phobia stuck with me for a long time, but once I actually started using the language seriously, I discovered to my surprise that it wasn't *that* big of a deal as I had thought. In fact, I found that I quite liked it, because it made my APIs cleaner. A LOT cleaner because I didn't have to pollute every function call with memory management paraphrenalia; they can be nice and clean with no extraneous detritus and things Just Work(tm). Also, the amount of time/effort spent (i.e., wasted) debugging memory problems was gone, and I was a LOT more productive than I ever was in C++. True, I have to relinquish 100% control of my memory, and as an ex-C++ fanboy I totally understand that it's not a pleasant feeling. But I have to say that I was pleasantly surprised at how much D's GC *didn't* get in my way, once I actually started using it for real (toy examples can be misleading). Why am I saying all this? Because to be frank, you haven't really used D if you've been avoiding its standard library like the plague. 
Not all of Phobos is GC-dependent; the range-based stuff, for example, lets you avoid GC use most of the time. True, for exceptions you need GC, but exceptions are supposed to be ... exceptions ... not the norm, and in practice it isn't really *that* big of a deal. You shouldn't be catching exceptions inside performance-sensitive inner loops anyway. D's strong points don't really show until you start using range-based stuff with UFCS chains -- now *that's* power. Even if you dislike the GC you can still mostly manage your own memory, and let the GC be the fallback mechanism for stuff you missed. As for implicit type conversions: I don't know where you got your information from, but D's implicit conversions are a WHOLE different world from C++. Walter has resisted adding implicit conversion mechanisms in spite of harsh criticism and constant pressure, and in practice, you aren't gonna see a lot of it in D code, if at all. It's not even close to C++ where SFINAE + Koenig lookup gang up on you from behind and you don't even know what hit you. Issues with implicit conversions in D only really come up if you go out of your way to abuse alias this and/or use short ints a little too zealously. Otherwise in practice it's not even an issue IME. For-loop syntax: I can't remember the last time I wrote one in D. Maybe I did like 1 or 2 times (across my 20+ D projects) when I really needed to do something weird with my loops. But foreach covers 90% of my looping needs, and while loops take care of the 9.9% of the cases. Besides, once you start getting used to UFCS chains and Phobos algorithms, most of the time you won't even be writing any loops at all. You won't believe how much more readable your code becomes when you can finally stop worrying about pesky fragile loop conditions and just tack on a couple more components to your UFCS chain and it just automagically takes care of itself. Again, not something you'll understand if you never tried to use D in a serious way. 
I recommend actually trying to write D, not as transplanted C++, but the way D code is meant to be written. As for switch: yeah D switch has some crazy parts (like Duff's device -- you can actually write that in D). But I've never needed to use it... also, final switch + enums = awesome. As for const: I hardly ever use it. It's useful occasionally for low-level code, but not much beyond that. My advice: don't bother. Just pretend it doesn't exist, and your life will be happier. Well OK, once in a while you do need to deal with it. But if it were me, I'd avoid it unless I have to. It doesn't mix well with high-level code, I'll put it that way. Immutable is the same thing, I only use it as `static immutable` just so the compiler would put my data in the preinitialized segment. Other than that, I don't bother. T -- If you're not part of the solution, you're part of the precipitate.
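For the "final switch + enums = awesome" point above, a minimal sketch: with final switch the compiler rejects the code if any enum member is left unhandled (and disallows a default clause), so adding a new member later forces every such switch to be updated:

```d
enum Color { red, green, blue }

string name(Color c)
{
    // final switch: compile error if a Color member has no case.
    final switch (c)
    {
        case Color.red:   return "red";
        case Color.green: return "green";
        case Color.blue:  return "blue";
    }
}

void main()
{
    assert(name(Color.green) == "green");
}
```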
Re: Bug in DMD?
On Thu, Mar 02, 2023 at 09:55:55PM +, ryuukk_ via Digitalmars-d-learn wrote: > On Thursday, 2 March 2023 at 21:38:23 UTC, ryuukk_ wrote: > > On Thursday, 2 March 2023 at 21:21:14 UTC, Richard (Rikki) Andrew > > Cattermole wrote: [...] > > > 2. Dustmite, so we have something we can work with. > > > > [...] 2. do you have a link for a guide how to setup "dustmite"? https://dlang.org/blog/2020/04/13/dustmite-the-general-purpose-data-reduction-tool/ Dustmite automatically reduces your code to a minimal example that still exhibits the same problem, good for bug reports that are easily reproducible. Also useful if you don't want to publicly share the code for whatever reason, but still want to provide enough information so that the dmd devs can find the problem and fix it. [...] > That's is not something i like doing, it should just work, i shouldn't > have to debug DMD, that aint my job Dustmite can run in the background on a temporary copy of your code, you don't have to babysit it and can work on other things while it's doing its thing. T -- Written on the window of a clothing store: No shirt, no shoes, no service.
Re: Transform static immutable string array to tuple.
On Sun, Feb 19, 2023 at 11:08:34AM +, realhet via Digitalmars-d-learn wrote: > Hello, > > Is there a better way to transform a string array to a tuple or to an > AliasSeq? > > ``` > mixin(customSyntaxPrefixes.format!`tuple(%(%s,%))`) > ``` > > I'd like to use this as variable length arguments passed to the > startsWith() std function (as multiple needles). In situations like this it's often better to define your data as an AliasSeq to begin with, since it's easier to convert that to an array than the other way round. T -- I don't trust computers, I've spent too long programming to think that they can get anything right. -- James Miller
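A minimal sketch of that suggestion (the prefix values are made up): an AliasSeq expands directly into startsWith's variadic needle list, and an array literal recovers the array form when needed:

```d
import std.algorithm.searching : startsWith;
import std.meta : AliasSeq;

// Define the data as an AliasSeq up front...
alias customSyntaxPrefixes = AliasSeq!("##", "//", "--");

void main()
{
    // ...it expands straight into startsWith's needle list.
    // With multiple needles, startsWith returns the 1-based index
    // of the needle that matched, or 0 for no match:
    assert("// comment".startsWith(customSyntaxPrefixes) == 2);
    assert("plain".startsWith(customSyntaxPrefixes) == 0);

    // Converting back to an array is trivial:
    string[] asArray = [customSyntaxPrefixes];
    assert(asArray.length == 3);
}
```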
Re: Deciding one member of iteration chain at runtime
On Fri, Feb 17, 2023 at 05:30:40PM +, Chris Piker via Digitalmars-d-learn wrote: [...] > In order to handle new functionality it turns out that operatorG needs > to be of one of two different types at runtime. How would I do > something like the following: > > ```d > auto virtualG; // <-- probably invalid code, illustrating the idea > if(runtime_condition) > virtualG = operatorG1; > else > virtualG = operatorG2; [...] > ``` > ? > > I've tried various usages of `range.InputRangeObject` but haven't been > able to get the syntax right. Any suggestions on the best way to > proceed? Maybe the whole chain should be wrapped in InputRangeObject > classes, I don't know. [...]

Here's an actual function taken from my own code, that returns a different range type depending on a runtime condition, maybe this will help you?

```d
/**
 * Expands '@'-directives in a range of strings.
 *
 * Returns: A range of strings with lines that begin with '@'
 * substituted with the contents of the file named by the rest of the
 * line.
 */
auto expandFileDirectives(File = std.stdio.File, R)(R args)
    if (isInputRange!R && is(ElementType!R : const(char)[]))
{
    import std.algorithm.iteration : joiner, map;
    import std.algorithm.searching : startsWith;
    import std.range : only;
    import std.range.interfaces : InputRange, inputRangeObject;
    import std.typecons : No;

    return args.map!(arg => arg.startsWith('@') ?
            cast(InputRange!string) inputRangeObject(
                File(arg[1 .. $]).byLineCopy(No.keepTerminator)) :
            cast(InputRange!string) inputRangeObject(only(arg)))
        .joiner;
}
```

Note that the cast is to a common base class of the two different subclasses returned by inputRangeObject(). This function is used in the rest of the code as part of a UFCS chain of ranges.

T -- Long, long ago, the ancient Chinese invented a device that lets them see through walls. It was called the "window".
Re: Non-ugly ways to implement a 'static' class or namespace?
On Thu, Feb 16, 2023 at 08:51:39AM +, FeepingCreature via Digitalmars-d-learn wrote: [...] > Springboarding off this post: > > This thread is vastly dominated by some people who care very much > about this issue. Comparatively, for instance, I care very little > because I think D already does it right. > > But then the thread will look unbalanced. This is a fundamental design > flaw in forum software. > > So let me just say: I think D does it right. D does not have class > encapsulation; it has module encapsulation. This is by design, and the > design is good. +1, this issue is wayyy overblown by a vocal minority. D's design diverges from other languages, but that in itself does not make it a bad design. In the context of D it actually makes sense. Saying that D's design is bad because language X does it differently is logically fallacious (X is good, Y is not X, therefore Y is bad). T -- He who sacrifices functionality for ease of use, loses both and deserves neither. -- Slashdotter
Re: Simplest way to convert an array into a set
On Mon, Feb 13, 2023 at 06:04:40PM +, Matt via Digitalmars-d-learn wrote: > Obviously, there is no "set" object in D, Actually, bool[T] could be used as a set object of sorts. Or even void[0][T], though that's a little more verbose to type. But this can be aliased to something nicer (see below). > but I was wondering what the quickest way to remove duplicates from an > array would be. I was convinced I'd seen a "unique" method somewhere, > but I've looked through the documentation for std.array, std.algorithm > AND std.range, and I've either missed it, or my memory is playing > tricks on me. That, or I'm looking in the wrong place entirely, of > course

Try this:

-snip-
import std;

auto deduplicate(R)(R input)
    if (isInputRange!R)
{
    alias Unit = void[0];
    enum unit = Unit.init;
    Unit[ElementType!R] seen;
    return input.filter!((e) {
        if (e in seen) return false;
        seen[e] = unit;
        return true;
    });
}

unittest
{
    assert([ 1, 2, 3, 4, 2, 5, 6, 4, 7 ].deduplicate.array ==
           [ 1, 2, 3, 4, 5, 6, 7 ]);
    assert([ "abc", "def", "def", "ghi", "abc", "jkl" ].deduplicate.array ==
           [ "abc", "def", "ghi", "jkl" ]);
}
-snip-

T -- Маленькие детки - маленькие бедки.
Re: betterC DLL in Windows
On Mon, Feb 06, 2023 at 03:54:40PM +, bachmeier via Digitalmars-d-learn wrote: > On Sunday, 5 February 2023 at 08:48:34 UTC, Tamas wrote: [...] > > This is the specification for the D Programming Language. > > I've been bitten by that a few times over the years, though to be > honest, I'm not sure of the relationship of the spec to documentation. > The Phobos documentation and compiler documentation appear to be > actual documentation, in the sense that you can trust it to be > accurate, and if not it's a bug. Maybe someone that has been around > from the earliest days understands the goal of the spec. IIRC the spec was started as part of an ongoing effort to fully specify the language so that, at least in theory, someone could read the spec and implement a D compiler completely independent of the current ones. T -- INTEL = Only half of "intelligence".
Re: Which TOML package, or SDLang?
On Mon, Jan 30, 2023 at 03:59:52PM +, Adam D Ruppe via Digitalmars-d-learn wrote: > On Monday, 30 January 2023 at 15:37:56 UTC, Guillaume Piolat wrote: > > Why not XML? :) It has comments, you can use backslashes too. > > no kidding, xml is an underrated format. XML is evil. Let me qualify that statement. XML, as specified by the XML spec, is pure evil. It has some absolutely nasty corners that have pathological behaviours like recursive expansion of entities (exploitable for DOS attacks or to induce OOM crashes in XML parsers), which includes token-pasting style pathology like C's preprocessor, and remote fetching of arbitrary network resources (which, no thanks to pathological entities, can be easily obfuscated). XML as used by casual users, however, is a not-bad format for markup text. It's far too verbose for my tastes, but for some applications it could be a good fit. As far as implementation is concerned, a (non-compliant) XML parser that implements the subset of XML employed for "normal" use, i.e., without the pathological bits, would be a good thing, e.g., Jonathan's dxml. A fully-compliant XML parser that includes the pathological bits, however, I wouldn't touch with a 10-foot pole. T -- Customer support: the art of getting your clients to pay for your own incompetence.
Re: Where I download Digital Mars C Preprocessor sppn.exe?
On Mon, Jan 23, 2023 at 08:06:28PM +, Alain De Vos via Digitalmars-d-learn wrote: > Mixing D with C or C++ or Python is looking for problems. > Better write something in D. > And write something in C/C++/Python. > And have some form of communication between both. I don't know about Python, but I regularly write D code that interacts with external C libraries and have not encountered any major problems. You just have to put `extern(C)` in the right places and make sure you link the right objects / libraries, and you're good to go. So far I haven't actually tried integrating non-trivial C++ libraries with D yet, but I expect it will be similar unless you're dealing with C++ templates (which are not compatible with D templates) or multiple inheritance, which D doesn't support. T -- Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.
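As a minimal sketch of what "put `extern(C)` in the right places" means in practice, here is D calling C's puts directly (puts lives in the C runtime that D programs already link against, so no extra link flags are needed for this one):

```d
// Declare the C function with C linkage; the signature must match
// the C prototype: int puts(const char *s);
extern(C) int puts(scope const char* s);

void main()
{
    import std.string : toStringz; // D string -> NUL-terminated C string
    puts(toStringz("hello from C's puts"));
}
```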
Re: Non-ugly ways to implement a 'static' class or namespace?
On Fri, Jan 20, 2023 at 01:32:22PM -0800, Ali Çehreli via Digitalmars-d-learn wrote: > On 1/20/23 07:01, torhu wrote: > > > But why not have drawLine just be a free function? > > Exactly. > > If I'm not mistaken, and please teach me if I am wrong, they are > practically free functions in Java as well. That Java class is working > as a namespace. Exactly. Every time you see a static singleton class, you're essentially looking at a namespace. Only, in OO circles non-class namespaces are taboo, it's not OO-correct to call them what they are, instead you have to do lip service to OO by calling them static singleton classes instead. And free functions are taboo in OO; OO doctrine declares them unclean affronts to OO purity and requires that you dress them in more OO-appropriate clothing, like putting them inside a namesp^W excuse me, static singleton class. ;-) > So, the function above is the same as the following free-standing > function in D, C++, C, and many other languages: > > void Algo_drawLine(Canvas c, Pos from, Pos to) { .. }; [...] That way of naming a global function is essentially a poor man's^W^Wexcuse me, I mean, C's way of working around the lack of a proper namespacing / module system. In D, we do have a proper module system, so you could just call the function `drawLine` and put it in a file named Algo.d, then you can just use D's symbol resolution rules to disambiguate between Algo.drawLine and PersonalSpace.drawLine, for example. :-P T -- Public parking: euphemism for paid parking. -- Flora
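A minimal two-file sketch of that module-based disambiguation (module and function names are made up to mirror the post; this is two separate source files, not one):

```d
// File: algo.d
module algo;
void drawLine() { /* ... */ }

// File: app.d
module app;
static import algo; // static import forces full qualification

void main()
{
    // No clash possible with a drawLine from any other module:
    algo.drawLine();
}
```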
Re: What is the 'Result' type even for?
On Fri, Jan 20, 2023 at 12:49:54PM +, Ruby The Roobster via Digitalmars-d-learn wrote: [...] > Thank you. I didn't know that there was such a property `.array`. It's not a property, it's a Phobos function from std.array. T -- INTEL = Only half of "intelligence".
Re: What is the 'Result' type even for?
On Fri, Jan 20, 2023 at 03:34:43AM +, Ruby The Roobster via Digitalmars-d-learn wrote: > On Friday, 20 January 2023 at 03:30:56 UTC, Steven Schveighoffer wrote: > > On 1/19/23 10:11 PM, Ruby The Roobster wrote: > > ... > > > > The point is to be a range over the original input, evaluated > > lazily. Using this building block, you can create an array, or use > > some other algorithm, or whatever you want. All without allocating > > more space to hold an array. [...] > I get the point that it is supposed to be lazy. But why are these > basic cases not implemented? I shouldn't have to go write a wrapper > for something as simple as casting this type to the original type. > This is one of the things that one expects the standard library to do > for you. There's no need to write any wrappers. Just tack `.array` to the end of your pipeline, and you're good to go. T -- My father told me I wasn't at all afraid of hard work. I could lie down right next to it and go to sleep. -- Walter Bright
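Concretely, a minimal sketch of the "tack `.array` onto the pipeline" advice:

```d
import std.algorithm.iteration : splitter;
import std.array : array;

void main()
{
    // .array runs the lazy splitter pipeline and collects the
    // results into a plain string[]:
    string[] parts = "a|b|c|d|e".splitter('|').array;
    assert(parts == ["a", "b", "c", "d", "e"]);
}
```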
Re: What is the 'Result' type even for?
On Fri, Jan 20, 2023 at 03:11:33AM +, Ruby The Roobster via Digitalmars-d-learn wrote: > Take this example: > > ```d > import std; > void main() > { > auto c = "a|b|c|d|e".splitter('|'); > c.writeln; > string[] e = ["a", "b", "c", "d", "e"]; > assert(c.equal(e)); > typeof(c).stringof.writeln; > } > ``` > > The program prints: > > ["a", "b", "c", "d", "e"] > Result > > What is the purpose of this 'Result' type? To serve as a generic > range? It's a Voldemort type, representing a range that iterates over its elements lazily. > Because, it seems to only cause problems. For example, you cannot > assign or cast the result type into a range, even when the type has > the same inherent function: > > ```d > string[] c = "a|b|c|d|e".splitter('|'); // fails > string[] d = cast(string[])"a|b|c|d|e".splitter('|'); // also fails > ``` You're confusing arrays and ranges. A "range" isn't any specific type, it refers to *any* type that behaves a certain way (behaves like a range). Each `Result` you get back has its own unique type (arguably, it's a compiler bug to display it as merely `Result` without distinguishing it from other identically-named but distinct Voldemort types), so you cannot just assign it back to an array. You can either create an array from it using std.array.array, use a function that eagerly creates its results instead of a lazy result (in the above instance, use std.string.split instead of .splitter), or use std.algorithm.copy to copy the contents of the lazy range into an array:

	// Option 1
	string[] c = "a|b|c|d|e".splitter('|').array;

	// Option 2
	string[] c = "a|b|c|d|e".split('|');

	// Option 3
	// Caveat: .copy expects you to have prepared the buffer
	// beforehand to be large enough to hold the contents; it does
	// not reallocate the result array for you.
	string[] result = new string[5];
	"a|b|c|d|e".splitter('|').copy(result);

[...] > Then what is the point of this type, if not to just make things > difficult?
> It cannot be casted, and vector operations cannot be > performed, and it seems to just serve as an unnecessary > generalization.

It serves to chain further range operations into a pipeline:

	string[] c = "a|b|c|d|e".splitter('|')
	             .filter!(s => s >= "b" && s <= "d")
	             .map!(s => s ~ s)
	             .array;

Because ranges are lazily iterated, the .array line only allocates the 3 elements that got through the .filter. Whereas if you created the intermediate result array eagerly, you'd have to allocate space for 5 elements only to discard 2 of them afterwards. One way to think about this is that the intermediate Result ranges are like the middle part of a long pipe; you cannot get stuff from the middle of the pipe without breaking it, you need to terminate the pipe with a sink (like .array, .copy, etc.) first.

T -- I am Pentium of Borg. Division is futile; you will be approximated.
Re: Coding Challenges - Dlang or Generic
On Tue, Jan 17, 2023 at 11:08:19PM +, Siarhei Siamashka via Digitalmars-d-learn wrote: > On Tuesday, 17 January 2023 at 21:50:06 UTC, matheus wrote: > > Question: Have you compared the timings between this way (With > > ranges) and a normal way (Without ranges)? > > If you are intensively using ranges, UFCS or the other convenient high > level language features, then the compiler choice does matter a lot. > And only LDC compiler is able to produce fast binaries from such > source code. > > GDC compiler has severe performance problems with inlining, unless LTO > is enabled. And it also allocates closures on stack. This may or may > not be fixed in the future, but today I can't recommend GDC if you > really care about performance. Interesting, I didn't know GDC has issues with inlining. I thought it was more-or-less on par with LDC in terms of the quality of code generation. Do you have a concrete example of this problem? > DMD compiler uses an outdated code generation backend from Digital > Mars C++ and will never be able to produce fast binaries. It > prioritizes fast compilation speed over everything else. [...] For anything performance related, I wouldn't even consider DMD. For all the 10+ years I've been using D, it has consistently produced executables that run about 20-30% slower than those produced by LDC or GDC, sometimes even up to 40%. For script-like programs or interactive apps that don't care about performance, DMD is fine for convenience and fast compile turnaround times. But as soon as performance matters, DMD is not even on my radar. T -- Heuristics are bug-ridden by definition. If they didn't have bugs, they'd be algorithms.
Re: Creating a pointer/slice to a specific-size buffer? (Handing out a page/frame from a memory manager)
On Fri, Jan 13, 2023 at 08:31:17AM -0800, Ali Çehreli via Digitalmars-d-learn wrote:
> On 1/13/23 07:07, Gavin Ray wrote:
>
> > This is "valid" D I hope?
>
> Yes, because static arrays are just elements side-by-side in memory.
> You can cast any piece of memory to a static array provided the length
> and alignment are correct.
[...]

Or, to be more precise, cast the memory to a *pointer* to a static array
of the right size. Static arrays are by-value types; passing around the
raw array will cause the array to be copied every time, which is probably
not what is intended.

T

--
"You are a very disagreeable person." "NO."
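As an illustration of the pointer-vs-value distinction, here is a minimal sketch (the buffer and `PageSize` are made up for the example, not from the thread):

```d
enum PageSize = 4096;   // hypothetical page size

void main()
{
    // Raw buffer standing in for a page handed out by a memory manager.
    ubyte[] buffer = new ubyte[PageSize];

    // Cast to a *pointer* to a static array: a typed view, no copy made.
    ubyte[PageSize]* page = cast(ubyte[PageSize]*) buffer.ptr;
    (*page)[0] = 42;
    assert(buffer[0] == 42);    // same memory underneath
    assert((*page).length == PageSize);

    // Dereferencing copies the entire static array by value...
    ubyte[PageSize] copy = *page;
    copy[0] = 7;
    assert(buffer[0] == 42);    // ...so the copy is independent
}
```

The last two lines show why passing the static array around by value is usually not what you want: every copy duplicates all 4096 bytes.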
Re: Why not allow elementwise operations on tuples?
On Fri, Jan 13, 2023 at 02:22:34PM +, Sergei Nosov via Digitalmars-d-learn wrote:
> Hey, everyone!
>
> I was wondering if there's a strong reason behind not implementing
> elementwise operations on tuples?
>
> Say, I've decided to store 2d points in a `Tuple!(int, int)`. It would
> be convenient to just write `a + b` to yield another `Tuple!(int,
> int)`.

I've written a Vec type that implements precisely this, using tuples
behind the scenes as the implementation, and operator overloading to allow
nice syntax for vector arithmetic.

```d
import std.range.primitives : isOutputRange;
import std.traits : isInstanceOf;

/* Note: the original post references isScalar without showing its
 * definition; a minimal stand-in is "anything that isn't a Vec". */
enum isScalar(T) = !isInstanceOf!(Vec, T);

/**
 * Represents an n-dimensional vector of values.
 */
struct Vec(T, size_t n)
{
    T[n] impl;
    alias impl this;

    /**
     * Per-element unary operations.
     */
    Vec opUnary(string op)()
        if (is(typeof((T t) => mixin(op ~ "t"))))
    {
        Vec result;
        foreach (i, ref x; result.impl)
            x = mixin(op ~ "this[i]");
        return result;
    }

    /**
     * Per-element binary operations.
     */
    Vec opBinary(string op, U)(Vec!(U,n) v)
        if (is(typeof(mixin("T.init" ~ op ~ "U.init"))))
    {
        Vec result;
        foreach (i, ref x; result.impl)
            x = mixin("this[i]" ~ op ~ "v[i]");
        return result;
    }

    /// ditto
    Vec opBinary(string op, U)(U y)
        if (isScalar!U && is(typeof(mixin("T.init" ~ op ~ "U.init"))))
    {
        Vec result;
        foreach (i, ref x; result.impl)
            x = mixin("this[i]" ~ op ~ "y");
        return result;
    }

    /// ditto
    Vec opBinaryRight(string op, U)(U y)
        if (isScalar!U && is(typeof(mixin("U.init" ~ op ~ "T.init"))))
    {
        Vec result;
        foreach (i, ref x; result.impl)
            x = mixin("y" ~ op ~ "this[i]");
        return result;
    }

    /**
     * Per-element assignment operators.
     */
    void opOpAssign(string op, U)(Vec!(U,n) v)
        if (is(typeof({ T t; mixin("t " ~ op ~ "= U.init;"); })))
    {
        foreach (i, ref x; impl)
            mixin("x " ~ op ~ "= v[i];");
    }

    void toString(W)(W sink) const
        if (isOutputRange!(W, char))
    {
        import std.format : formattedWrite;
        formattedWrite(sink, "(%-(%s,%))", impl[]);
    }
}

/**
 * Convenience function for creating vectors.
 *
 * Returns: Vec!(U,n) instance where n = args.length, and U is the common
 * type of the elements given in args. A compile-time error results if the
 * arguments have no common type.
 */
auto vec(T...)(T args)
{
    static if (args.length == 1 && is(T[0] == U[n], U, size_t n))
        return Vec!(U, n)(args);
    else static if (is(typeof([args]) : U[], U))
        return Vec!(U, args.length)([ args ]);
    else
        static assert(false, "No common type for " ~ T.stringof);
}

///
unittest
{
    // Basic vector construction
    auto v1 = vec(1,2,3);
    static assert(is(typeof(v1) == Vec!(int,3)));
    assert(v1[0] == 1 && v1[1] == 2 && v1[2] == 3);

    // Vector comparison
    auto v2 = vec(1,2,3);
    assert(v1 == v2);

    // Unary operations
    assert(-v1 == vec(-1, -2, -3));
    assert(++v2 == vec(2,3,4));
    assert(v2 == vec(2,3,4));
    assert(v2-- == vec(2,3,4));
    assert(v2 == vec(1,2,3));

    // Binary vector operations
    auto v3 = vec(2,3,1);
    assert(v1 + v3 == vec(3,5,4));

    auto v4 = vec(1.1, 2.2, 3.3);
    static assert(is(typeof(v4) == Vec!(double,3)));
    assert(v4 + v1 == vec(2.1, 4.2, 6.3));

    // Binary operations with scalars
    assert(vec(1,2,3)*2 == vec(2,4,6));
    assert(vec(4,2,6)/2 == vec(2,1,3));
    assert(3*vec(1,2,3) == vec(3,6,9));

    // Non-numeric vectors
    auto sv1 = vec("a", "b");
    static assert(is(typeof(sv1) == Vec!(string,2)));
    assert(sv1 ~ vec("c", "d") == vec("ac", "bd"));
    assert(sv1 ~ "post" == vec("apost", "bpost"));
    assert("pre" ~ sv1 == vec("prea", "preb"));
}

unittest
{
    // Test opOpAssign.
    auto v = vec(1,2,3);
    auto w = vec(4,5,6);
    v += w;
    assert(v == vec(5,7,9));
}

unittest
{
    int[4] z = [ 1, 2, 3, 4 ];
    auto v = vec(z);
    static assert(is(typeof(v) == Vec!(int,4)));
    assert(v == vec(1, 2, 3, 4));
}

unittest
{
    import std.format : format;
    auto v = vec(1,2,3,4);
    assert(format("%s", v) == "(1,2,3,4)");
}
```

T

--
Never ascribe to malice that which is adequately explained by incompetence. -- Napoleon Bonaparte
Re: append - too many files
On Wed, Jan 11, 2023 at 02:15:13AM +, Joel via Digitalmars-d-learn wrote:
> I get this error after a while (seems append doesn't close the file
> each time):
> std.file.FileException@std/file.d(836): history.txt: Too many open files
>
> ```d
> auto jm_addToHistory(T...)(T args) {
>     import std.conv : text;
>     import std.file : append;
>
>     auto txt = text(args);
>     append("history.txt", txt);
>
>     return txt;
> }
> ```

This is a bug; please file an issue in the bug tracker. Phobos functions
should not leak file descriptors.

T

--
"No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
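As a sketch of a possible workaround (not from the thread, and assuming the leak really does come from descriptors staying open), the function can open the file through std.stdio.File, which is reference-counted and closes its descriptor deterministically when the last copy goes out of scope:

```d
import std.conv : text;
import std.stdio : File;

auto jm_addToHistory(T...)(T args)
{
    auto txt = text(args);

    // File closes its descriptor when f goes out of scope, so
    // repeated calls cannot accumulate open files.
    auto f = File("history.txt", "a");  // "a" = append mode
    f.write(txt);

    return txt;
}

void main()
{
    // Well beyond typical per-process descriptor limits (often 1024):
    // this would fail if each call leaked a descriptor.
    foreach (i; 0 .. 10_000)
        jm_addToHistory("entry ", i, "\n");
}
```

This keeps the descriptor's lifetime explicit regardless of whether the underlying std.file.append bug gets fixed.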