Re: Error: cannot implicitly convert expression this.aa of type inout(string[string]) to string[string]
On Thursday, 15 March 2018 at 13:18:38 UTC, Simen Kjærås wrote: On Thursday, 15 March 2018 at 12:00:08 UTC, Robert-D wrote: I want the function to create a mutable copy from a const or an immutable. Like this: void main() { const S s = S(["": ""]); S b = s.dup(); } How can I do that? In that case, the problem is that you also have to .dup the aa: S dup() const pure { return S(aa.dup); } However, it seems aa.dup returns the wrong type - one would expect V[K] for inout(V[K]), but it returns inout(V)[K], or inout(string)[string], in your case. That's apparently a known bug: https://issues.dlang.org/show_bug.cgi?id=14148. The solution for now, then, is this: S dup() const pure { return S(cast(string[string])aa.dup); } -- Simen Why doesn't something like this compile (with or without the cast on bb.dup)? struct S { string[string] aa; S dup() inout pure { return S(cast(string[string]) aa.dup); } } struct SS { S[] bb; SS dup() inout pure { return SS(cast(S[]) bb.dup); } } Error: static assert: "Cannot implicitly convert type inout(S) to S in dup." Or: const(S)[] ss = [S(["": ""])]; S[] d = ss.dup; Error: template object.dup cannot deduce function from argument types !()(const(S)[]), candidates are: /dlang/dmd/linux/bin64/../../src/druntime/import/object.d(2086): object.dup(T : V[K], K, V)(T aa) /dlang/dmd/linux/bin64/../../src/druntime/import/object.d(2122): object.dup(T : V[K], K, V)(T* aa) /dlang/dmd/linux/bin64/../../src/druntime/import/object.d(4191): object.dup(T)(T[] a) if (!is(const(T) : T)) /dlang/dmd/linux/bin64/../../src/druntime/import/object.d(4207): object.dup(T)(const(T)[] a) if (is(const(T) : T))
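For readers skimming the thread: a compilable sketch combining the workarounds discussed above. Purity annotations are omitted for brevity, and the element-wise dup in SS is one way (not necessarily the only way) around the "cannot convert inout(S) to S" error on bb.dup:

```d
import std.algorithm.iteration : map;
import std.array : array;

struct S
{
    string[string] aa;
    S dup() const
    {
        // aa.dup loses mutability under const/inout (issue 14148),
        // so cast back to the expected mutable type.
        return S(cast(string[string]) aa.dup);
    }
}

struct SS
{
    S[] bb;
    SS dup() const
    {
        // Copy each element via S.dup instead of casting bb.dup.
        return SS(bb.map!(s => s.dup).array);
    }
}

void main()
{
    const SS s = SS([S(["": ""])]);
    SS b = s.dup();
    assert(b.bb.length == 1);
}
```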
Re: Error: cannot implicitly convert expression this.aa of type inout(string[string]) to string[string]
On Thursday, 15 March 2018 at 11:33:49 UTC, Simen Kjærås wrote: On Thursday, 15 March 2018 at 11:18:48 UTC, Robert-D wrote: [...] This is where things go wrong: [...] 'inout' means that this function can keep the const, immutable or mutable status of the type on which the function is called. This means that an inout function has to treat the object as const, because otherwise the function would break the guarantees of immutable and const. When using inout on a function, you always want to put inout on something else too - either a ref parameter or the return value. In your case, this works: inout(S) dup() inout pure { return inout(S)(aa); } -- Simen I want the function to create a mutable copy from a const or an immutable. Like this: void main() { const S s = S(["": ""]); S b = s.dup(); } How can I do that?
Error: cannot implicitly convert expression this.aa of type inout(string[string]) to string[string]
struct S { string[string] aa; S dup() inout pure { return S(aa); } } void main() { auto s = S(["": ""]); s.dup(); } Result: Error: cannot implicitly convert expression this.aa of type inout(string[string]) to string[string] I need help with the above program.
Re: Intermediate level D and open source projects to study
On Wednesday, 11 May 2016 at 18:41:47 UTC, xtreak wrote: Hi, I am a D newbie. I worked through The D Programming Language and Programming in D books. I primarily use Python daily. I will be happy to know how I can get to an intermediate level in D. It will be helpful to have projects in D of high quality and also beginner friendly code that I can study to improve my D. [snip] Might not be exactly what you are looking for, but I recently open-sourced some command line utilities you could take a look at. They are real apps in that they take command line arguments, have help, error handling, etc. But they are doing relatively straightforward tasks, things you might do in Python also. A caution: I'm relatively new to D as well, and there are likely places where the code could be more idiomatic D. Utilities are at: https://github.com/eBay/tsv-utils-dlang. The readme has a section labeled "The code" that describes the code structure.
Re: Can't use std.algorithm.remove on a char[]?
On Saturday, 30 April 2016 at 19:21:30 UTC, ag0aep6g wrote: On 30.04.2016 21:08, Jon D wrote: If an initial step is to fix the documentation, it would be helpful to include specifically that it doesn't work with characters. It's not obvious that characters don't meet the requirement. Characters are not the problem. remove works fine on a range of chars, when the elements are assignable lvalues. char[] as a range has neither assignable elements nor lvalue elements. That is, lines 3 and 4 here don't compile: import std.range: front; char[] a = ['f', 'o', 'o']; a.front = 'g'; auto ptr = I didn't mean to suggest making the documentation technically incorrect. Just that it should be helpful in important cases that won't necessarily be obvious. To me, char[] is an important case, one that's not made obvious by listing the hasLvalueElements constraint by itself. --Jon
Re: Can't use std.algorithm.remove on a char[]?
On Saturday, 30 April 2016 at 18:32:32 UTC, ag0aep6g wrote: On 30.04.2016 18:44, TheGag96 wrote: I was just writing some code trying to remove a value from a character array, but the compiler complained "No overload matches for remove", and if I specifically say use std.algorithm.remove() the compiler doesn't think it fits any definition. For reference, this would be all I'm doing: char[] thing = ['a', 'b', 'c']; thing = thing.remove(1); Is this a bug? std.algorithm claims remove() works on any forward range... The documentation is wrong. 1) remove requires a bidirectional range. The constraints and parameter documentation correctly say so. char[] is a bidirectional range, though. 2) remove requires lvalue elements. char[] fails this, as the range primitives decode the chars on-the-fly to dchars. Pull request to fix the documentation: https://github.com/dlang/phobos/pull/4271 By the way, I think requiring lvalues is too restrictive. It should work with assignable elements. Also, it has apparently been missed that const/immutable can make non-assignable lvalues. There's a ticket open related to the lvalue element requirement: https://issues.dlang.org/show_bug.cgi?id=8930 Personally, I think this example is more compelling than the one in the ticket. It seems very reasonable to expect that std.algorithm.remove will work regardless of whether the elements are characters, integers, ubytes, etc. If an initial step is to fix the documentation, it would be helpful to include specifically that it doesn't work with characters. It's not obvious that characters don't meet the requirement. --Jon
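A small sketch of the distinction described above: ubyte[] and dchar[] have lvalue elements, while char[] is auto-decoded to rvalue dchars by its range primitives, so remove rejects it:

```d
import std.algorithm.comparison : equal;
import std.algorithm.mutation : remove;

void main()
{
    ubyte[] bytes = [1, 2, 3];
    bytes = bytes.remove(1);        // fine: ubyte[] has lvalue elements
    assert(bytes.equal([1, 3]));

    dchar[] wide = ['a', 'b', 'c'];
    wide = wide.remove(1);          // fine: dchar[] is not auto-decoded
    assert(wide == "ac"d);

    // char[] thing = ['a', 'b', 'c'];
    // thing = thing.remove(1);     // compile error: no lvalue elements
}
```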
Re: Is there a way to disable 'dub test' for applications?
On Monday, 18 April 2016 at 11:47:42 UTC, Dicebot wrote: On Monday, 18 April 2016 at 04:25:25 UTC, Jon D wrote: I have a dub config file specifying a targetType of 'executable'. There is only one file, the file containing main(), and no unit tests. When I run 'dub test', dub builds and runs the executable. This is not really desirable. Is there a way to set up the dub configuration file to disable running the test? configuration "unittest" { excludedSourceFiles "path/to/main.d" } Very nice, thank you. What also seems to work is: configuration "unittest" { targetType "none" } Then 'dub test' produces the message: Configuration 'unittest' has target type "none". Skipping test.
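Both suggestions above, sketched as they might appear in a dub.sdl (the "source/app.d" path is a hypothetical layout, not from the thread):

```sdl
configuration "application" {
    targetType "executable"
}

// Option 1: exclude the file containing main() from the unittest build
configuration "unittest" {
    excludedSourceFiles "source/app.d"
}

// Option 2: skip the test build entirely
// configuration "unittest" {
//     targetType "none"
// }
```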
Re: Is there a way to disable 'dub test' for applications?
On Monday, 18 April 2016 at 05:30:21 UTC, Jonathan M Davis wrote: On Monday, April 18, 2016 04:25:25 Jon D via Digitalmars-d-learn wrote: I have a dub config file specifying a targetType of 'executable'. There is only one file, the file containing main(), and no unit tests. When I run 'dub test', dub builds and runs the executable. This is not really desirable. Is there a way to set up the dub configuration file to disable running the test? Note: What I'd really like to do is run a custom shell command when 'dub test' is done; I haven't seen anything suggesting that's an option. However, disabling would still be useful. What's the point of even running dub test if you have no unit tests? Just do dub build, and then use the resulting executable, or if you want to build and run in one command, then use dub run. - Jonathan M Davis I should have supplied more context. A few days ago I announced open-sourcing a D package consisting of several executables. Multiple comments recommended making it available via the Dub repository. I wasn't using Dub to build, and there are a number of loose ends when working with Dub and multiple executables. I've been trying to limit the number of issues others might encounter if they pulled the package and ran typical commands, like 'dub test'. It's not a big deal, but if there's an easy way to provide a handler, I will. Also, the reason for a custom shell command is that there are tests, it's just that they are run against the built executable rather than via the unittest framework. --Jon
Is there a way to disable 'dub test' for applications?
I have a dub config file specifying a targetType of 'executable'. There is only one file, the file containing main(), and no unit tests. When I run 'dub test', dub builds and runs the executable. This is not really desirable. Is there a way to set up the dub configuration file to disable running the test? Note: What I'd really like to do is run a custom shell command when 'dub test' is done; I haven't seen anything suggesting that's an option. However, disabling would still be useful. --Jon
Specifying a minimum Phobos version in dub?
Is there a way to specify a minimum Phobos version in a dub package specification? --Jon
Re: Get memory usage report from GC
On Saturday, 20 February 2016 at 05:34:01 UTC, tcak wrote: On Saturday, 20 February 2016 at 05:33:00 UTC, tcak wrote: Is there any way (I checked core.memory already) to collect a report about memory usage from the garbage collector? So I can see a list of pointer and length information. Since getting this information would require another memory area in heap, it could be like logging when the report is asked for. My long running but idle program starts using 41.7% of memory (that's close to 3GB), and it is not obvious whether the memory is allocated by a single variable, or many variables. My mistake, it is close to 512MB. Doesn't sound like precisely what you want, but there are summary reports of GC activity available via the "--DRT-gcopt=profile:1" command line option. More info at: http://dlang.org/spec/garbage.html --Jon
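A related sketch: more recent runtimes also expose summary heap numbers programmatically via core.memory's GC.stats (this is assuming a compiler new enough to have it; it still doesn't give the per-pointer breakdown the poster asked for):

```d
import core.memory : GC;
import std.stdio : writefln;

void main()
{
    auto before = GC.stats;
    auto data = new ubyte[](1024 * 1024);   // allocate 1 MiB on the GC heap
    auto after = GC.stats;

    writefln("used: %s -> %s bytes (free: %s)",
             before.usedSize, after.usedSize, after.freeSize);
    assert(after.usedSize >= before.usedSize);
    assert(data.length == 1024 * 1024);     // keep the allocation alive
}
```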
Re: Scala Spark-like RDD for D?
On Wednesday, 17 February 2016 at 02:32:01 UTC, bachmeier wrote: You can discuss here, but there is also a gitter room https://gitter.im/DlangScience/public Also, I've got a project that embeds R inside D http://lancebachmeier.com/rdlang/ It's not quite as good a user experience as others because I have limited time for things not related to work. I've got an older project to embed D inside R, but it hasn't been updated in a while and it's Linux only. https://bitbucket.org/bachmeil/dmdinline2 Excellent, thanks, I'll check these out. --Jon
Re: Scala Spark-like RDD for D?
On Tuesday, 16 February 2016 at 16:27:27 UTC, bachmeier wrote: On Monday, 15 February 2016 at 11:09:10 UTC, data pulverizer wrote: As an alternative are there plans for parallel/cluster computing frameworks for D? You can use MPI: https://github.com/DlangScience/OpenMPI FWIW, I'm interested in the wider topic of incorporating D into data science environments also. Sounds as if there are several interesting projects in the area, but so far my understanding of them is limited. Perhaps the forum isn't the best place to discuss, but if there happen to be any blog posts or other descriptions, it'd be great to get links. --Jon
Re: Reserving capacity in associative arrays
On Tuesday, 16 February 2016 at 19:49:55 UTC, H. S. Teoh wrote: On Tue, Feb 16, 2016 at 07:34:07PM +0000, Jon D via Digitalmars-d-learn wrote: On Tuesday, 16 February 2016 at 16:37:07 UTC, Steven Schveighoffer wrote: >On 2/14/16 10:22 PM, Jon D wrote: >>Is there a way to reserve capacity in associative arrays? >>[snip] >>The underlying implementation of associative arrays appears >>to take >>an initial number of buckets, and there's a private resize() >>method, >>but it's not clear if there's a public way to use these. Rehashing (aa.rehash) would resize the number of buckets, but if you don't already have the requisite number of keys, it wouldn't help. Thanks for the reply and the detailed example for manually controlling GC. I haven't experimented with taking control over GC that way. Regarding reserving capacity, the relevant method is aa.resize(), not aa.rehash(). See: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d#L141. This allocates space for the buckets; it doesn't matter whether the keys are known. Note that every time the buckets array is resized the old bucket array is walked and elements reinserted. Preallocating a large bucket array would avoid this. See also the private constructor in the same file (line 51). It takes an initial size. --Jon
Re: Reserving capacity in associative arrays
On Tuesday, 16 February 2016 at 16:37:07 UTC, Steven Schveighoffer wrote: On 2/14/16 10:22 PM, Jon D wrote: Is there a way to reserve capacity in associative arrays? [snip] The underlying implementation of associative arrays appears to take an initial number of buckets, and there's a private resize() method, but it's not clear if there's a public way to use these. There is not a public way to access these methods unfortunately. It would be a good addition to druntime I believe. Recently, I added a clear method to the AA, which does not reduce capacity. So if you frequently build large AAs, and then throw them away, you could instead reuse the memory. My programs build AAs lasting the lifetime of the program. I would caution to be sure of this cause, however, before thinking it would solve the problem. The AA not only uses an array for buckets, but allocates a memory location for each element as well. I'm often wrong when I assume what the problem is when it comes to GC issues... Completely agree. After posting I decided to take a more methodical look. Not finished yet, but I can share part of it. Key thing so far is a noticeable step function in GC costs related to AA size (likely not a surprise). My programs work with large data sets. Size is open-ended; what I'm trying to do is get an idea of the data set sizes they will handle reasonably. For purposes of illustration, word-count is a reasonable proxy for what I'm doing. It was in this context that I saw significant performance drop-off after 'size_t[string]' AAs reached about 10 million entries. I've started measuring with a simple program. Basically: StopWatch sw; sw.start; size_t[size_t] counts; foreach (i; 0..iterations) counts[uniform(0, uniqMax)]++; sw.stop; Same thing with string as key ('size_t[string]') AAs. 'iterations' and 'uniqMax' are varied between runs. GC stats are printed (via "--DRT-gcopt=profile:1"), plus timing and AA size. (Runs use LDC 0.17, release mode compiles, a fast 16GB MacBook).
For the integer-as-key case ('size_t[size_t]'), there are notable jumps in GC total time and GC max pause time as the AA size crosses specific thresholds. This makes sense, as the AA needs to grow. Approximate steps:

| entries | gc_total (ms) | gc_max_pause (ms) |
|---------+---------------+-------------------|
| 2M      | 30            | 60                |
| 4M      | 200           | 100               |
| 12M     | 650           | 330               |
| 22M     | 1650          | 750               |
| 44M     | 5300          | 3200              |

Iterations didn't matter, and gc total time and gc max time were largely flat between these jumps. This suggests AA resize is the likely driver, and that preallocating a large size might address it. To the point about being sure about cause - my programs use strings as keys, not integers. The performance drop-off with strings was quite a bit more significant than with integers. That analysis seems a bit trickier, I'm not done with that yet. Different memory allocation, perhaps effects from creating short-lived, temporary strings to test AA membership. Could easily be that string use, or the combo of AAs with strings as keys, is a larger effect. The other thing that jumps out from the table is that the GC max pause time gets to be multiple seconds. Not an issue for my tools, which aren't interactive at those points, but it would be a significant issue for many interactive apps. --Jon
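A runnable version of the measurement loop sketched above (using the current std.datetime.stopwatch module; 'iterations' and 'uniqMax' are the knobs varied between runs, and the values here are just placeholders):

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.random : uniform;
import std.stdio : writefln;

void main()
{
    enum iterations = 1_000_000;   // varied between runs in the post
    enum uniqMax = 500_000;        // upper bound on distinct keys

    auto sw = StopWatch(AutoStart.yes);
    size_t[size_t] counts;
    foreach (i; 0 .. iterations)
        counts[uniform(0, uniqMax)]++;
    sw.stop;

    writefln("%s entries in %s msecs", counts.length,
             sw.peek.total!"msecs");
    assert(counts.length <= uniqMax);
}
```

Run with "--DRT-gcopt=profile:1" to get the GC summary alongside the timing.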
Re: Reserving capacity in associative arrays
On Tuesday, 16 February 2016 at 17:05:11 UTC, Basile B. wrote: On Tuesday, 16 February 2016 at 16:37:07 UTC, Steven Schveighoffer wrote: There is not a public way to access these methods unfortunately. It would be a good addition to druntime I believe. -Steve After reading the topic I've added this enhancement proposal, not quite sure if it's possible: https://issues.dlang.org/show_bug.cgi?id=15682 The idea is to concatenate smaller AAs into the destination. There is also this: https://issues.dlang.org/show_bug.cgi?id=2504
Re: Reserving capacity in associative arrays
On Monday, 15 February 2016 at 05:29:23 UTC, sigod wrote: On Monday, 15 February 2016 at 03:22:44 UTC, Jon D wrote: Is there a way to reserve capacity in associative arrays? [snip] Maybe try using this: http://code.dlang.org/packages/aammm Thanks, I wasn't aware of this package. I'll give it a try. --Jon
Reserving capacity in associative arrays
Is there a way to reserve capacity in associative arrays? In some programs I've been writing I've been getting reasonable performance up to about 10 million entries, but beyond that performance is impacted considerably (say, 30 million or 50 million entries). GC stats (via the "--DRT-gcopt=profile:1" option) indicate dramatic increases in gc time, which I'm assuming comes from resizing the underlying hash table. I'm guessing that by preallocating a large size the performance degradation would not be quite so dramatic. The underlying implementation of associative arrays appears to take an initial number of buckets, and there's a private resize() method, but it's not clear if there's a public way to use these. --Jon
Difference between toLower() and asLowerCase() for strings?
I'm trying to identify the preferred ways to lower case a string. In std.uni there are two functions that return the lower case form of a string: toLower() and asLowerCase(). There is also toLowerInPlace(). I'm having trouble figuring out what the relationship is between these, and when to prefer one over the other. Both take strings; asLowerCase also takes a range. Otherwise, I couldn't find the differences in the documentation. The implementations are apparently different, but it's not clear what the real difference is. Are there reasons to prefer one over the other? --Jon
Re: Difference between toLower() and asLowerCase() for strings?
On Sunday, 24 January 2016 at 21:04:46 UTC, Adam D. Ruppe wrote: On Sunday, 24 January 2016 at 20:56:20 UTC, Jon D wrote: I'm trying to identify the preferred ways to lower case a string. In std.uni there are two functions that return the lower case form of a string: toLower() and asLowerCase(). There is also toLowerInPlace(). toLower will allocate a new string, leaving the original untouched. toLowerInPlace will modify the existing string. asLowerCase will return the modified data as you iterate over it, but will not actually allocate the new string. [snip...] As a general rule, the asLowerCase (etc.) version should be your first go since it is the most efficient. But the others are around for convenience in cases where you need a new string built anyway. Great explanation, thank you!
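The eager/lazy distinction explained above, as a small sketch:

```d
import std.algorithm.comparison : equal;
import std.conv : to;
import std.uni : asLowerCase, toLower;

void main()
{
    string s = "Hello World";

    string eager = s.toLower;            // new string allocated here
    assert(eager == "hello world");

    auto lazyView = s.asLowerCase;       // a lazy range; no allocation yet
    assert(lazyView.equal("hello world"));

    string materialized = lazyView.to!string; // allocate only when needed
    assert(materialized == "hello world");
}
```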
Re: Speed of csvReader
On Thursday, 21 January 2016 at 22:20:28 UTC, H. S. Teoh wrote: On Thu, Jan 21, 2016 at 10:09:24PM +, Jon D via Digitalmars-d-learn wrote: [...] FWIW - I've been implementing a few programs manipulating delimited files, e.g. tab-delimited. Simpler than CSV files because there is no escaping inside the data. I've been trying to do this in relatively straightforward ways, e.g. using byLine rather than byChunk. (Goal is to explore the power of D standard libraries). I've gotten significant speed-ups in a couple different ways: * DMD libraries 2.068+ - byLine is dramatically faster * LDC 0.17 (alpha) - Based on DMD 2.068, and faster than the DMD compiler While byLine has improved a lot, it's still not the fastest thing in the world, because it still performs (at least) one OS roundtrip per line, not to mention it will auto-reencode to UTF-8. If your data is already in a known encoding, reading in the entire file and casting to (|w|d)string then splitting it by line will be a lot faster, since you can eliminate a lot of I/O roundtrips that way. No disagreement, but I had other goals. At a high level, I'm trying to learn and evaluate D, which partly involves understanding the strengths and weaknesses of the standard library. From this perspective, byLine was a logical starting point. More specifically, the tools I'm writing are often used in unix pipelines, so input can be a mixture of standard input and files. And, the files can be arbitrarily large. In these cases, reading the entire file is not always appropriate. Buffering usually is, and my code knows when it is dealing with files vs standard input and could handle these differently. However, standard library code could handle these distinctions as well, which was part of the reason for trying the straightforward approach. Aside - Despite the 'learning D' motivation, the tools are real tools, and writing them in D has been a clear win, especially with the byLine performance improvements in 2.068.
Re: Speed of csvReader
On Thursday, 21 January 2016 at 09:39:30 UTC, data pulverizer wrote: I have been reading large text files with D's csv file reader and have found it slow compared to R's read.table function which is not known to be particularly fast. FWIW - I've been implementing a few programs manipulating delimited files, e.g. tab-delimited. Simpler than CSV files because there is no escaping inside the data. I've been trying to do this in relatively straightforward ways, e.g. using byLine rather than byChunk. (Goal is to explore the power of D standard libraries). I've gotten significant speed-ups in a couple different ways:

* DMD libraries 2.068+ - byLine is dramatically faster
* LDC 0.17 (alpha) - Based on DMD 2.068, and faster than the DMD compiler
* Avoid utf-8 to dchar conversion - This conversion often occurs silently when working with ranges, but is generally not needed when manipulating data.
* Avoid unnecessary string copies. e.g. Don't gratuitously convert char[] to string.

At this point performance of the utilities I've been writing is quite good. They don't have direct equivalents with other tools (such as gnu core utils), so a head-to-head is not appropriate, but generally it seems the tools are quite competitive without needing to do my own buffer or memory management. And, they are dramatically faster than the same tools written in perl (which I was happy with). --Jon
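The "avoid decoding, avoid copies" points above, as a sketch. Real code would iterate File.byLine (which reuses a char[] buffer per line); here lineSplitter over in-memory data stands in for it so the example is self-contained:

```d
import std.algorithm.iteration : splitter;
import std.string : lineSplitter;

void main()
{
    // Stand-in for f.byLine: lazily split tab-delimited lines.
    auto data = "a\tb\tc\nd\te\n";

    size_t fieldCount = 0;
    foreach (line; data.lineSplitter)         // slices, no allocation
        foreach (field; line.splitter('\t'))  // slices, no dchar decoding
            ++fieldCount;

    // With byLine, copy (line.idup) only if a field must outlive the loop.
    assert(fieldCount == 5);
}
```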
Re: Convert some ints into a byte array without allocations?
On Sat, 16 Jan 2016 14:34:54 +0000, Samson Smith wrote: > I'm trying to make a fast little function that'll give me a random > looking (but deterministic) value from an x,y position on a grid. I'm > just going to run each co-ord that I need through an FNV-1a hash > function as an array of bytes since that seems like a fast and easy way > to go. I'm going to need to do this a lot and quickly for a real time > application so I don't want to waste a lot of cycles converting data or > allocating space for an array. > > In a nutshell how do I cast an int into a byte array? > > I tried this: > > byte[] bytes = cast(byte[])x; >> Error: cannot cast expression x of type int to byte[] > > What should I be doing instead? You can do this (for an int variable `a`): ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof]; It casts the address of `a` to a ubyte (or byte) pointer and then takes a slice the size of an int.
Re: Convert some ints into a byte array without allocations?
On Sat, 16 Jan 2016 14:42:27 +0000, Yazan D wrote: > > You can do this: > ubyte[] b = (cast(ubyte*) &a)[0 .. int.sizeof]; > > It is casting the address of `a` to a ubyte (or byte) pointer and then > taking a slice the size of int. You can also use a union: union Foo { int i; ubyte[4] b; } // write to int part Foo f = Foo(a); // then read from ubyte part writeln(f.b); ps. I am not sure of the aliasing rules in D for unions. In C, this is allowed, but in C++, this is undefined behaviour AFAIK.
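Both replies combined into one compilable sketch. Note that the pointer-slice and union views both give the int's bytes in native (platform-dependent) byte order; if a fixed byte order is needed, std.bitmanip has conversion helpers:

```d
import std.bitmanip : nativeToLittleEndian;

union IntBytes
{
    int i;
    ubyte[4] b;   // int is always 32 bits in D
}

void main()
{
    int a = 0x01020304;

    // View 1: slice of the variable's own bytes; no allocation.
    ubyte[] slice = (cast(ubyte*) &a)[0 .. int.sizeof];

    // View 2: union reinterpretation; also no allocation.
    IntBytes u = IntBytes(a);
    assert(slice == u.b[]);   // same bytes, native order

    // Fixed byte order regardless of platform:
    ubyte[4] le = nativeToLittleEndian(a);
    ubyte[4] expected = [0x04, 0x03, 0x02, 0x01];
    assert(le == expected);
}
```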
function argument accepting function or delegate?
My underlying question is how to compose functions taking functions as arguments, while allowing the caller the flexibility to pass either a function or delegate. Simply declaring an argument as either a function or delegate seems to prohibit the other. Overloading works. Are there better ways? An example: auto callIntFn (int function(int) f, int x) { return f(x); } auto callIntDel (int delegate(int) f, int x) { return f(x); } auto callIntFnOrDel (int delegate(int) f, int x) { return f(x); } auto callIntFnOrDel (int function(int) f, int x) { return f(x); } void main(string[] args) { alias AddN = int delegate(int); AddN makeAddN(int n) { return x => x + n; } auto addTwo = makeAddN(2);// Delegate int function(int) addThree = x => x + 3; // Function // assert(callIntFn(addTwo, 4) == 6); // Compile error // assert(callIntDel(addThree, 4) == 7); // Compile error assert(callIntDel(addTwo, 4) == 6); assert(callIntFn(addThree, 4) == 7); assert(callIntFnOrDel(addTwo, 4) == 6); assert(callIntFnOrDel(addThree, 4) == 7); } ---Jon
Re: function argument accepting function or delegate?
On Sunday, 17 January 2016 at 06:49:23 UTC, rsw0x wrote: On Sunday, 17 January 2016 at 06:27:41 UTC, Jon D wrote: My underlying question is how to compose functions taking functions as arguments, while allowing the caller the flexibility to pass either a function or delegate. [...] Templates are an easy way. --- auto call(F, Args...)(F fun, auto ref Args args) { return fun(args); } --- Would probably look nicer with some constraints from std.traits. Thanks much, that works!
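Besides the template approach, another option (a sketch, not from the thread): std.functional.toDelegate wraps a function pointer in a delegate, so a single delegate-taking overload can serve both callers:

```d
import std.functional : toDelegate;

auto callIntDel(int delegate(int) f, int x) { return f(x); }

void main()
{
    int n = 2;
    int delegate(int) addN = x => x + n;      // delegate (captures n)
    int function(int) addThree = x => x + 3;  // plain function

    assert(callIntDel(addN, 4) == 6);
    assert(callIntDel(addThree.toDelegate, 4) == 7);
}
```

The tradeoff versus the template version is one extra indirection on the function-pointer path, in exchange for a single non-template definition.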
Re: How is D doing?
I'm doing quite well, thank you.
Re: Why should file names intended for executables be valid identifiers?
On Tuesday, 15 December 2015 at 03:31:18 UTC, Shriramana Sharma wrote: For instance, hyphens are often used as part of executable names on Linux, but if I do this: $ dmd usage-printer.d I get the following error: usage-printer.d: Error: module usage-printer has non-identifier characters in filename, use module declaration instead Try adding the line: module usage_printer; at the top of the file. This overrides the default module name (same as file name). --Jon
Re: Reason for 'static struct'
On Wednesday, 9 December 2015 at 21:23:03 UTC, Daniel Kozák wrote: V Wed, 09 Dec 2015 21:10:43 +0000 Jon D via Digitalmars-d-learn <digitalmars-d-learn@puremagic.com> napsáno: There is a fair bit of range related code in the standard library structured like: auto MyRange(Range)(Range r) if (isInputRange!Range) { static struct Result { private Range source; // define empty, front, popFront, etc } return Result(r); } I'm curious about what declaring the Result struct as 'static' does, and if there are use cases where it would be better to exclude the static qualifier. --Jon It makes it a non-nested struct: https://dlang.org/spec/struct.html#nested Thanks. So, in the example above, would the advantage be that 'static' avoids saving the enclosing state, which is not needed?
Reason for 'static struct'
There is a fair bit of range related code in the standard library structured like: auto MyRange(Range)(Range r) if (isInputRange!Range) { static struct Result { private Range source; // define empty, front, popFront, etc } return Result(r); } I'm curious about what declaring the Result struct as 'static' does, and if there are use cases where it would be better to exclude the static qualifier. --Jon
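A small sketch of the difference: a non-static struct nested in a function carries a hidden context pointer (which is what lets it see the enclosing locals); 'static' removes it:

```d
void main()
{
    int local = 42;

    struct Nested
    {
        int x;
        int readLocal() { return local; }  // can see enclosing locals
    }

    static struct Standalone
    {
        int x;                             // no hidden context pointer
    }

    // The context pointer shows up in the size.
    assert(Nested.sizeof > Standalone.sizeof);

    auto n = Nested(1);
    assert(n.readLocal() == 42);
}
```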
Re: block file reads and lazy utf-8 decoding
On Thursday, 10 December 2015 at 00:36:27 UTC, Jon D wrote: Question I have is if there is a better way to do this. For example, a different way to construct the lazy 'decodeUTF8Range' rather than writing it out in this fashion. A further thought - The decodeUTF8Range function is basically constructing a lazy wrapper range around decodeFront, which is effectively combining a 'front' and 'popFront' operation. So perhaps a generic way to compose a wrapper for such functions. auto decodeUTF8Range(Range)(Range charSource) if (isInputRange!Range && is(Unqual!(ElementType!Range) == char)) { static struct Result { private Range source; private dchar next; bool empty = false; dchar front() @property { return next; } void popFront() { if (source.empty) { empty = true; next = dchar.init; } else { next = source.decodeFront; } } } auto r = Result(charSource); r.popFront; return r; }
block file reads and lazy utf-8 decoding
I want to combine block reads with lazy conversion of utf-8 characters to dchars. The solution I came up with is in the program below. This works fine. Has good performance, etc. The question I have is whether there is a better way to do this. For example, a different way to construct the lazy 'decodeUTF8Range' rather than writing it out in this fashion. There is quite a bit of power in the library and I'm still learning it. I'm wondering if I overlooked a useful alternative. --Jon Program: --- import std.algorithm: each, joiner, map; import std.conv; import std.range; import std.stdio; import std.traits; import std.utf: decodeFront; auto decodeUTF8Range(Range)(Range charSource) if (isInputRange!Range && is(Unqual!(ElementType!Range) == char)) { static struct Result { private Range source; private dchar next; bool empty = false; dchar front() @property { return next; } void popFront() { if (source.empty) { empty = true; next = dchar.init; } else { next = source.decodeFront; } } } auto r = Result(charSource); r.popFront; return r; } void main(string[] args) { if (args.length != 2) { writeln("Provide one file name."); return; } ubyte[1024*1024] rawbuf; auto inputStream = args[1].File(); inputStream .byChunk(rawbuf) // Read in blocks .joiner // Join the blocks into a single input char range .map!(a => to!char(a)) // Cast ubyte to char for decodeFront. Any better ways? .decodeUTF8Range // utf8 to dchar conversion. .each; // Real work goes here. writeln("done"); }
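One library alternative worth comparing against the hand-written wrapper (a sketch, assuming a compiler recent enough to have std.utf.byUTF): byUTF!dchar lazily decodes any input range of chars, so it can slot in where decodeUTF8Range is used above. In-memory data stands in for byChunk(...).joiner here to keep the example self-contained:

```d
import std.algorithm.iteration : map;
import std.conv : to;
import std.range : walkLength;
import std.utf : byUTF;

void main()
{
    // Stand-in for byChunk(rawbuf).joiner: a range of raw ubytes.
    ubyte[] raw = cast(ubyte[]) "héllo".dup;  // 6 bytes, 5 code points

    auto dchars = raw
        .map!(a => to!char(a))   // ubyte -> char, as in the program above
        .byUTF!dchar;            // lazy UTF-8 -> dchar decoding

    assert(dchars.walkLength == 5);  // 'é' decodes to a single dchar
}
```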
Re: copy and array length vs capacity. (Doc suggestion?)
On Tuesday, 24 November 2015 at 01:00:40 UTC, Steven Schveighoffer wrote: On 11/23/15 7:29 PM, Ali Çehreli wrote: On 11/23/2015 04:03 PM, Steven Schveighoffer wrote: > On 11/23/15 4:29 PM, Jon D wrote: >> In the example I gave, what I was really wondering was if there is a >> difference between allocating with 'new' or with 'reserve', or with >> 'length', for that matter. That is, is there a material difference >> between: >> >> auto x = new int[](n); >> int[] y; y.length = n; > > There is no difference at all, other than the function that is called > (the former will call an allocation function, the latter will call a > length setting function, which then will determine if more data is > needed, and finding it is, call the allocation function). Although Jon's example above does not compare reserve, I have to ask: How about non-trivial types? Both cases above would set all elements to ..init, right? So, I think reserve would be faster if copy() knew how to take advantage of capacity. It could emplace elements instead of copying, no? I think the cost of looking up the array metadata is more than the initialization of elements to .init. However, using an Appender would likely fix all these problems. You could also use https://dlang.org/phobos/std_array.html#uninitializedArray to create the array before copying. There are quite a few options, actually :) A delegate is also surprisingly considered an output range! Because why not? So you can do this too as a crude substitute for appender (or for testing performance): import std.range; // for iota import std.algorithm; void main() { int[] arr; arr.reserve(100); iota(100).copy((int a) { arr ~= a;}); } -Steve Thanks. I was also wondering if that initial allocation could be avoided. Code I was writing involved repeatedly using a buffer in a loop. I was trying out taskPool.amap, which needs a random access range. This meant copying from the input range being read. 
Something like:

auto input = anInfiniteRange();
auto bufsize = workPerThread * taskPool.size();
auto workbuf = new int[](bufsize);
auto results = new int[](bufsize);
while (true)
{
    input.take(bufsize).copy(workbuf);
    input.popFrontN(bufsize);
    taskPool.amap!expensiveCalc(workbuf, workPerThread, results);
    results.doSomething();
}

I'm just writing a toy example, but it is where these questions came from. For this example, the next step would be to allow the buffer size to change while iterating. --Jon
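As a sketch of the Appender route suggested above, reserving space up front without ever setting length:

```d
import std.array : appender;
import std.range : iota;
import std.algorithm : copy;

void main()
{
    auto app = appender!(int[])();
    app.reserve(100);            // preallocate; no elements are initialized
    iota(100).copy(app);         // an Appender is itself an output range
    assert(app.data.length == 100);
    assert(app.data[99] == 99);
}
```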
Re: copy and array length vs capacity. (Doc suggestion?)
On Monday, 23 November 2015 at 15:19:08 UTC, Steven Schveighoffer wrote: On 11/21/15 10:19 PM, Jon D wrote: On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote: Honestly, arrays suck as output ranges. They don't get appended to; they get filled, and for better or worse, the documentation for copy is probably assuming that you know that. If you want your array to be appended to when using it as an output range, then you need to use std.array.Appender.

Hi Jonathan, thanks for the reply and the info about std.array.Appender. I was actually using copy to fill an array, not append. However, I also wanted to preallocate the space. And, since I'm mainly trying to understand the language, I was also trying to figure out the difference between these two forms of creating a dynamic array with an initial size:

auto x = new int[](n);
int[] y; y.reserve(n);

If you want to change the size of the array, use length:

y.length = n;

This will extend y to the correct length, automatically reserving a block of data that can hold it, and allow you to write to the array. All reserve does is make sure there is enough space so you can append that much data to it. It is not relevant to your use case.

The obvious difference is that the first initializes n values, the second form does not. I'm still unclear if there are other material differences, or when one might be preferred over the other :) It was in this context that the behavior of copy surprised me: it wouldn't operate on the second form without first filling in the elements. If this seems unclear, I can provide a slightly longer sample showing what I was doing.

Extending length affects the given array, extending if necessary. reserve is ONLY relevant if you are using appending (arr ~= x). It doesn't actually affect the "slice" or the variable you are using, at all (except to possibly point it at newly allocated space). copy uses an "output range" as its destination.
The output range supports taking elements and putting them somewhere. In the case of a simple array, putting them somewhere means assigning to the first element, and then moving to the next one. -Steve Thanks for the reply. And for your article (which Jonathan recommended). It clarified a number of things. In the example I gave, what I was really wondering was if there is a difference between allocating with 'new' or with 'reserve', or with 'length', for that matter. That is, is there a material difference between: auto x = new int[](n); int[] y; y.length = n; I can imagine that the first might be faster, but otherwise there appears no difference. As the article stresses, the question is the ownership model. If I'm understanding, both cause an allocation into the runtime managed heap. --Jon
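A small sketch of the put-and-advance behavior described above: copy assigns into the front of the target array element by element, and returns the unfilled tail.

```d
import std.algorithm : copy;

void main()
{
    auto buf = new int[](5);
    auto rest = [1, 2, 3].copy(buf); // assigns buf[0 .. 3], one element at a time
    assert(buf == [1, 2, 3, 0, 0]);
    assert(rest.length == 2);        // the unfilled part of the target
}
```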
Re: copy and array length vs capacity. (Doc suggestion?)
On Sunday, 22 November 2015 at 00:10:07 UTC, Ali Çehreli wrote: May I suggest that you improve that page. ;) If you don't already have a clone of the repo, you can do it easily by clicking the "Improve this page" button on that page. Hi Ali, thanks for the quick response. And point taken :) I hadn't noticed those buttons on the doc pages, looks very convenient. There are a couple formalities I need to look into before making contributions, even small ones, but I'll check into these. Regarding why copy() cannot use the capacity of the slice, it is because slices don't know about each other, so copy could not let other slices know that the capacity has just been used by this particular slice. Thanks for the explanation, very helpful for understanding what's going on. --Jon
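A small sketch of the point above: reserve changes only the capacity the runtime tracks for the underlying block, never the slice's own length.

```d
void main()
{
    int[] a;
    a.reserve(100);
    assert(a.capacity >= 100);   // runtime-tracked size of the block
    assert(a.length == 0);       // the slice itself is unchanged
    a ~= 1;                      // appending uses the reserved block
    assert(a.capacity >= 100);   // no reallocation was needed
}
```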
copy and array length vs capacity. (Doc suggestion?)
Something I found confusing was the relationship between array capacity and copy(). A short example:

void main()
{
    import std.algorithm : copy;

    auto a = new int[](3);
    assert(a.length == 3);
    [1, 2, 3].copy(a);       // Okay

    int[] b;
    b.reserve(3);
    assert(b.capacity >= 3);
    assert(b.length == 0);
    [1, 2, 3].copy(b);       // Error
}

I had expected that copy() would work if the target had sufficient capacity, but that's not the case. The target has to have sufficient length. If I've understood this correctly, a small change to the documentation for copy() might make this clearer. In particular, the "precondition" section:

Preconditions: target shall have enough room to accomodate the entirety of source.

Clarifying that "enough room" means 'length' rather than 'capacity' might be beneficial.
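Given that, a sketch of the working version of the failing case above: set length (not just capacity) before copying.

```d
import std.algorithm : copy;

void main()
{
    int[] b;
    b.reserve(3);       // capacity only; copy cannot use this
    b.length = 3;       // length is what copy's precondition requires
    [1, 2, 3].copy(b);
    assert(b == [1, 2, 3]);
}
```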
Re: copy and array length vs capacity. (Doc suggestion?)
On Sunday, 22 November 2015 at 00:31:53 UTC, Jonathan M Davis wrote: Honestly, arrays suck as output ranges. They don't get appended to; they get filled, and for better or worse, the documentation for copy is probably assuming that you know that. If you want your array to be appended to when using it as an output range, then you need to use std.array.Appender.

Hi Jonathan, thanks for the reply and the info about std.array.Appender. I was actually using copy to fill an array, not append. However, I also wanted to preallocate the space. And, since I'm mainly trying to understand the language, I was also trying to figure out the difference between these two forms of creating a dynamic array with an initial size:

auto x = new int[](n);
int[] y; y.reserve(n);

The obvious difference is that the first initializes n values, the second form does not. I'm still unclear if there are other material differences, or when one might be preferred over the other :) It was in this context that the behavior of copy surprised me: it wouldn't operate on the second form without first filling in the elements. If this seems unclear, I can provide a slightly longer sample showing what I was doing. --Jon
compatible types for chains of different lengths
I'd like to chain several ranges and operate on them. However, if the chains are different lengths, the data type is different. This makes it hard to use in a general way. There is likely an alternate way to do this that I'm missing. A short example:

$ cat chain.d
import std.stdio;
import std.range;
import std.algorithm;

void main(string[] args)
{
    auto x1 = ["abc", "def", "ghi"];
    auto x2 = ["jkl", "mno", "pqr"];
    auto x3 = ["stu", "vwx", "yz"];
    auto chain1 = (args.length > 1) ? chain(x1, x2) : chain(x1);
    auto chain2 = (args.length > 1) ? chain(x1, x2, x3) : chain(x1, x2);
    chain1.joiner(", ").writeln;
    chain2.joiner(", ").writeln;
}

$ dmd chain.d
chain.d(10): Error: incompatible types for ((chain(x1, x2)) : (chain(x1))): 'Result' and 'string[]'
chain.d(11): Error: incompatible types for ((chain(x1, x2, x3)) : (chain(x1, x2))): 'Result' and 'Result'

Is there a different way to do this? --Jon
Re: compatible types for chains of different lengths
On Tuesday, 17 November 2015 at 23:22:58 UTC, Brad Anderson wrote: One solution: [snip] Thanks for the quick response. Extending your example, here's another style that works and may be nicer in some cases.

import std.stdio;
import std.range;
import std.algorithm;

void main(string[] args)
{
    auto x1 = ["abc", "def", "ghi"];
    auto x2 = ["jkl", "mno", "pqr"];
    auto x3 = ["stu", "vwx", "yz"];
    auto y1 = (args.length > 1) ? x1 : [];
    auto y2 = (args.length > 2) ? x2 : [];
    auto y3 = (args.length > 3) ? x3 : [];
    chain(y1, y2, y3).joiner(", ").writeln;
}
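Another option, assuming a Phobos recent enough to have std.range.choose (it appeared around release 2.069): choose hides the two differently-typed chains behind one common range type, so the ?:-style type clash goes away. A sketch:

```d
import std.range : chain, choose;
import std.algorithm : joiner, equal;

void main()
{
    auto x1 = ["abc", "def"];
    auto x2 = ["ghi"];
    bool useBoth = true;
    // Both arms must share an element type; choose wraps them in a
    // single range type decided at runtime by the condition.
    auto r = choose(useBoth, chain(x1, x2), chain(x1, x1[0 .. 0]));
    assert(r.joiner(", ").equal("abc, def, ghi"));
}
```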
Preferred behavior of take() with ranges (value vs reference range)
Just started looking at D, very promising! One of the first programs I constructed involved infinite sequences. A design question that showed up is whether to construct the range as a struct/value, or class/reference. It appears that structs/values are more the norm, but there are exceptions, notably refRange. I'm wondering if there are any community best practices or guidelines in this area.

One key difference is the behavior of take(). If the range is a value/struct, take() does not consume elements. If it's a ref/class, it does consume elements. From a consistency perspective, it'd seem useful if the behavior was consistent as much as possible. Here's an example of the behavior differences below. It uses refRange, but the same behavior occurs if the range is created as a class rather than a struct.

import std.range;
import std.algorithm;

void main()
{
    auto fib1 = recurrence!((a,n) => a[n-1] + a[n-2])(1, 1);
    auto fib2 = recurrence!((a,n) => a[n-1] + a[n-2])(1, 1);
    auto fib3 = refRange(&fib2);

    // Struct/value based range - take() does not consume elements
    assert(fib1.take(7).equal([1, 1, 2, 3, 5, 8, 13]));
    assert(fib1.take(7).equal([1, 1, 2, 3, 5, 8, 13]));
    fib1.popFrontN(7);
    assert(fib1.take(7).equal([21, 34, 55, 89, 144, 233, 377]));

    // Reference range (fib3) - take() consumes elements
    assert(fib2.take(7).equal([1, 1, 2, 3, 5, 8, 13]));
    assert(fib3.take(7).equal([1, 1, 2, 3, 5, 8, 13]));
    assert(fib3.take(7).equal([21, 34, 55, 89, 144, 233, 377]));
    assert(fib2.take(7).equal([610, 987, 1597, 2584, 4181, 6765, 10946]));
    assert(fib2.take(7).equal([610, 987, 1597, 2584, 4181, 6765, 10946]));
}

--Jon
Re: Preferred behavior of take() with ranges (value vs reference range)
On Monday, 9 November 2015 at 02:44:48 UTC, TheFlyingFiddle wrote: On Monday, 9 November 2015 at 02:14:58 UTC, Jon D wrote: Here's an example of the behavior differences below. It uses refRange, but the same behavior occurs if the range is created as a class rather than a struct. --Jon This is an artifact of struct-based ranges being value types. When you use take, the range gets copied into another structure that is also a range but limits the number of elements you take from that range. ... If you want a more in-depth explanation, there were two talks at DConf this year that (in part) discussed this topic. (https://www.youtube.com/watch?v=A8Btr8TPJ8c, https://www.youtube.com/watch?v=QdMdH7WX2ew&list=PLEDeq48KhndP-mlE-0Bfb_qPIMA4RrrKo&index=14) Thanks for the quick reply. The two videos were very helpful. I understood what was happening underneath (mostly), but the videos made it clear there are a number of open questions regarding reference and value ranges and how best to use them.
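A minimal sketch of the copying behavior described above: with a struct range, take operates on a copy, so the original is not advanced.

```d
import std.range : take, popFrontN;
import std.algorithm : equal;

struct Counter            // a value-type (struct) infinite range
{
    int n;
    enum empty = false;
    @property int front() { return n; }
    void popFront() { ++n; }
}

void main()
{
    auto c = Counter(0);
    assert(c.take(3).equal([0, 1, 2]));
    assert(c.take(3).equal([0, 1, 2])); // take advanced a copy, not c
    c.popFrontN(3);                     // advance c explicitly
    assert(c.take(3).equal([3, 4, 5]));
}
```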
Re: Decrease number of front evaluations
On Wed, 26 Aug 2015 08:27:05 +, FreeSlave wrote: Are there ways to fix this? Should I consider writing my own range type probably? Check http://dlang.org/phobos/std_algorithm_iteration.html#.cache
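A small sketch of what cache buys: each element's front computation runs once, no matter how many times front is then read.

```d
import std.algorithm : map, cache;

void main()
{
    int evals = 0;
    auto plain = [1, 2, 3].map!((x) { ++evals; return x * 2; });
    auto a = plain.front;
    auto b = plain.front;
    assert(evals == 2);       // map re-runs the lambda on every front access

    evals = 0;
    auto cached = [1, 2, 3].map!((x) { ++evals; return x * 2; }).cache;
    auto c = cached.front;
    auto d = cached.front;
    assert(evals == 1);       // cache evaluated it once, up front
}
```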
Re: Trying to compile weather program
On Sun, 23 Aug 2015 16:00:16 +, Tony wrote: Thanks for the replies. It compiles OK with just. However, it isn't linking: /usr/bin/ld: cannot find -lcurl I do have some versions of libcurl on my system: /usr/lib/x86_64-linux-gnu/libcurl.so.3 /usr/lib/x86_64-linux-gnu/libcurl.so.4.3.0 /usr/lib/x86_64-linux-gnu/libcurl.so.4 I see there is a -L option to pass things to the linker -Llinkerflag pass linkerflag to link but I am not sure how to use it. I've had the same problem recently. What I did was run `dmd main.d -L-L/usr/lib/x86_64-linux-gnu/ -L-lcurl -v`. It would still fail to link, but I could find the linking command in the verbose output. It was something like this: `gcc main.o -o main -m64 -L/usr/lib/x86_64-linux-gnu/ -lcurl -L/usr/lib/x86_64-linux-gnu -Xlinker --export-dynamic -l:libphobos2.a -lpthread -lm -lrt`. As you can see, -lcurl is there, but it still needs to be added again after -l:libphobos2.a. So just add it again so the command becomes: `gcc main.o -o main -m64 -L/usr/lib/x86_64-linux-gnu/ -lcurl -L/usr/lib/x86_64-linux-gnu -Xlinker --export-dynamic -l:libphobos2.a -lpthread -lm -lrt -lcurl`. And it links and runs.
Re: Measuring Execution time
There is also http://linux.die.net/man/2/sched_setaffinity if you want to do it programmatically.
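For the programmatic route, a minimal Linux-only sketch. The extern(C) declaration is written out by hand here rather than taken from a druntime binding, and the mask layout assumes glibc's 1024-bit cpu_set_t, so treat both as assumptions:

```d
// Pin the current process to CPU 0 via glibc's sched_setaffinity.
// Hand-written declaration based on the glibc/Linux ABI (pid_t == int);
// an untested sketch, not a vetted binding.
extern (C) int sched_setaffinity(int pid, size_t cpusetsize, const(ulong)* mask);

void main()
{
    ulong[16] mask;              // 1024 bits, matching glibc's cpu_set_t
    mask[0] = 1;                 // bit 0 set => eligible to run on CPU 0 only
    sched_setaffinity(0, mask.sizeof, mask.ptr);  // pid 0 = calling process
}
```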
Re: Measuring Execution time
On Thu, 23 Jul 2015 16:43:01 +, Clayton wrote: On Wednesday, 22 July 2015 at 09:32:15 UTC, John Colvin wrote: On Wednesday, 22 July 2015 at 09:23:36 UTC, Clayton wrote: [...] The normal way of doing this would be using std.datetime.StopWatch:

StopWatch sw;
sw.start();
algorithm();
long exec_ms = sw.peek().msecs;

I am wondering whether it is possible to restrict all the algorithms to a specific core (e.g. CPU 0), since I want the tests to run in the same environment. If you are using Linux, you can use `taskset`. Example: `taskset -c 0 ./program`. This will run your program on the first CPU only.
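Put together as a runnable sketch (StopWatch lived in std.datetime at the time of this thread; newer releases moved it to std.datetime.stopwatch, and `algorithm` here is just a stand-in workload):

```d
import std.datetime : StopWatch;
import std.stdio : writefln;

void algorithm()
{
    int sum = 0;                 // stand-in for the code being measured
    foreach (i; 0 .. 1_000_000)
        sum += i;
}

void main()
{
    StopWatch sw;
    sw.start();
    algorithm();
    sw.stop();
    writefln("took %s ms", sw.peek().msecs);  // wall-clock time of the run
}
```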
Re: override toString() for a tuple?
On Wednesday, 4 June 2014 at 06:04:22 UTC, Jonathan M Davis via Digitalmars-d-learn wrote: toString is a member of Tuple, and there's no way to override that externally. ... Hi Jonathan, Yeah, I'll probably just keep my locally cobbled version of typecons.d in my path. The other options would be hard going as I've got tuples printed from arrays and variant arrays etc as well as individually. It's just easier to hack the default library code, although not so elegant. You would think the promise of OO and Inheritance would make it easy and free us from hacks like this ;) That said, it's only a personal project so as long as it works, who cares? Many Thanks for your reply Steve D
override toString() for a tuple?
Is it possible to override std tuple's toString format? So that

auto a = tuple("hello", 1, 2, 3);
writeln(a);

prints

(hello, 1, 2, 3)

and not

Tuple!(string, int, int, int)("hello", 1, 2, 3)

I'm aware I could write a custom formatter function, but it would be nice not to have to use such a function for every tuple printed by the program. Overriding toString() once in the program (if possible) would give the ideal default behaviour. (I would duplicate the current typecons.d toString() and strip off the prefix.) Thanks for any help
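For completeness, a sketch of what the custom-formatter alternative looks like. The helper name `plain` is hypothetical, not anything in Phobos, for anyone who prefers this to patching typecons.d:

```d
import std.typecons : Tuple, tuple;
import std.conv : to;
import std.array : join;

// Hypothetical helper: formats any Tuple as "(a, b, c)", without the type prefix.
string plain(T...)(Tuple!T t)
{
    string[] parts;
    foreach (e; t.expand)     // compile-time foreach over the fields
        parts ~= e.to!string;
    return "(" ~ parts.join(", ") ~ ")";
}

void main()
{
    auto a = tuple("hello", 1, 2, 3);
    assert(a.plain == "(hello, 1, 2, 3)");
}
```

The downside, as noted above, is that every call site has to use the helper instead of plain writeln.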