Re: foreach (i; taskPool.parallel(0..2_000_000)
On Thursday, 6 April 2023 at 01:44:15 UTC, H. S. Teoh wrote:
> D ranges are conceptually sequential, but the actual underlying memory access pattern depends on the concrete type at runtime. An array's elements are stored sequentially in memory, and arrays are ranges. But a linked-list can also have a range interface, yet its elements may be stored in non-consecutive memory locations. So the concrete type matters here; the range API only gives you conceptual sequentiality, it does not guarantee physically sequential memory access.

Very helpful Teoh. Thanks again.
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Tuesday, 4 April 2023 at 16:22:29 UTC, Steven Schveighoffer wrote: On 4/4/23 11:34 AM, Salih Dincer wrote: On Tuesday, 4 April 2023 at 14:20:20 UTC, Steven Schveighoffer wrote: parallel is a shortcut to `TaskPool.parallel`, which is indeed a foreach-only construct, it does not return a range. I think what you want is `TaskPool.map`: ```d // untested, just looking at the taskPool.map!(/* your map function here */) (s.iota(len)).writeln; ``` I tried, thanks but it goes into infinite loop. For example, the first 50 of the sequence should have been printed to the screen immediately without waiting. ```d long[50] arr; RowlandSequence_v2 range; auto next(long a) { range.popFront(); return arr[a] = range.front(); } void main() { range = RowlandSequence_v2(7, 2); taskPool.map!next(iota(50))/* s.iota(50) .map!next .parallel//*/ .writeln; } ``` Keep in mind that `arr` and `range` are thread-local, and so will be different states for different tasks. I guess the reason it goes into an infinite loop is that gcd() is a recursive function.
This is the only solution I can think of for this:

```d
import std.range, std.algorithm : map;
import std.stdio, std.parallelism;
//import std.numeric : gcd;
/*
struct Vector {
  long x, y, result;
  alias result this;
}

Vector gcd(long a, long b) {
  if(b == 0) return Vector(1, 0, a);
  auto pre = gcd(b, a % b);
  auto tmp = (a / b) * pre.y;
  return Vector(pre.y, pre.x - tmp, pre.result);
}//*/

struct RowlandSequence_v3 {
  long b, r, n, a = 3, limit;

  bool empty() { return n == limit; }

  auto front() { return a; }

  void popFront() {
    long result = 1;
    while(result == 1) {
      result = gcd(r++, b);
      b += result;
    }
    a = result;
  }

  // iterative gcd, used instead of the recursive version above
  long gcd(long a, long b) {
    long c;
    while(b != 0) {
      c = a % b;
      a = b;
      b = c;
    }
    return a;
  }
}

auto next(ref RowlandSequence_v3 range) {
  with(range) {
    if(empty) return [n, front];
    popFront();
    return [n++, front];
  }
}

long[179] arr;

void main()
{
  // initialization:
  RowlandSequence_v3[4] r = [
    RowlandSequence_v3(7         , 2,         0,   3,    112),
    RowlandSequence_v3(186837678 , 62279227,  112, 3,    145),
    RowlandSequence_v3(747404910 , 249134971, 145, 6257, 160),
    RowlandSequence_v3(1494812421, 498270808, 160, 11,   177)
  ];

  auto tasks = [ task(&next, r[0]),
                 task(&next, r[1]),
                 task(&next, r[2]),
                 task(&next, r[3])
  ];

  // quad parallel operation:
  foreach(_; 0..r[0].limit) {
    foreach(p, ref task; tasks) {
      task.executeInNewThread;
      auto t = task.workForce;
      arr[t[0]] = t[1];
    }
  }

  // prints...
  foreach(x, n; arr) {
    switch(x + 1) {
      case 112, 145, 160:
        n.writeln;
        break;
      default:
        n.write(", ");
    }
  }
} /* PRINTS:

user@debian:~/Documents$ dmd -O rowlandSequence.d -release
user@debian:~/Documents$ time ./rowlandSequence
5, 3, 11, 3, 23, 3, 47, 3, 5, 3, 101, 3, 7, 11, 3, 13, 233, 3, 467, 3, 5, 3, 941, 3, 7, 1889, 3, 3779, 3, 7559, 3, 13, 15131, 3, 53, 3, 7, 30323, 3, 60647, 3, 5, 3, 101, 3, 121403, 3, 242807, 3, 5, 3, 19, 7, 5, 3, 47, 3, 37, 5, 3, 17, 3, 199, 53, 3, 29, 3, 486041, 3, 7, 421, 23, 3, 972533, 3, 577, 7, 1945649, 3, 163, 7, 3891467, 3, 5, 3, 127, 443, 3, 31, 7783541, 3, 7, 15567089, 3, 19, 29, 3, 5323, 7, 5, 3, 31139561, 3, 41, 3, 5, 3, 62279171, 3, 7, 83, 3
29, 3, 1103, 3, 5, 3, 13, 7, 124559609, 3, 107, 3, 911, 3, 249120239, 3, 11, 3, 7, 61, 37, 179, 3, 31, 19051, 7, 3793, 23, 3, 5, 3, 6257, 3
3, 11, 3, 13, 5, 3, 739, 37, 5, 3, 498270791, 3, 19, 11, 3
3, 3, 5, 3, 996541661, 3, 7, 37, 5, 3, 67, 1993083437, 3, 5, 3, 83, 3, 3, 0,
real 7m54.093s
user 7m54.062s
sys  0m0.024s
*/
```

However, parallel processing for 4-digit sequence elements is not promising, at least for the Rowland Sequence.

SDB@79
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Thu, Apr 06, 2023 at 01:20:28AM +, Paul via Digitalmars-d-learn wrote:
[...]
> Yes I understand, basically, what's going on in hardware. I just
> wasn't sure if the access type was linked to the container type. It
> seems obvious now, since you've both made it clear, that it also
> depends on how I'm accessing my container.
>
> Having said all of this, isn't a D 'range' fundamentally a sequential
> access container (i.e popFront) ?

D ranges are conceptually sequential, but the actual underlying memory access pattern depends on the concrete type at runtime. An array's elements are stored sequentially in memory, and arrays are ranges. But a linked-list can also have a range interface, yet its elements may be stored in non-consecutive memory locations. So the concrete type matters here; the range API only gives you conceptual sequentiality, it does not guarantee physically sequential memory access.

T

-- Many open minds should be closed for repairs. -- K5 user
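To make the point above concrete, here is a minimal sketch (not from the original post). The helper name `total` is made up for illustration. It shows the same generic code walking an array and an `SList` through the range interface, even though only the array is laid out contiguously in memory:

```d
import std.container.slist : SList;
import std.range.primitives : isInputRange;

// Hypothetical helper: it only sees empty/front/popFront,
// so it cannot tell how the elements are stored.
long total(R)(R r) if (isInputRange!R)
{
    long acc = 0;
    foreach (x; r)
        acc += x;
    return acc;
}

void main()
{
    int[] arr = [1, 2, 3, 4];          // one contiguous block of memory
    auto list = SList!int(1, 2, 3, 4); // individually allocated nodes

    assert(total(arr) == 10);
    assert(total(list[]) == 10);       // list[] is a range over the nodes
}
```

Both calls are "sequential" as far as the range API is concerned; whether the memory accesses are cache-friendly depends on the concrete type behind the range.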
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Wednesday, 5 April 2023 at 23:06:54 UTC, H. S. Teoh wrote: So your data structures and algorithms should be designed in a way that takes advantage of linear access where possible. T Yes I understand, basically, what's going on in hardware. I just wasn't sure if the access type was linked to the container type. It seems obvious now, since you've both made it clear, that it also depends on how I'm accessing my container. Having said all of this, isn't a D 'range' fundamentally a sequential access container (i.e popFront) ?
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Wed, Apr 05, 2023 at 10:34:22PM +, Paul via Digitalmars-d-learn wrote: > On Tuesday, 4 April 2023 at 22:20:52 UTC, H. S. Teoh wrote: > > > Best practices for arrays in hot loops: [...] > > - Where possible, prefer sequential access over random access (take > > advantage of the CPU cache hierarchy). > > Thanks for sharing Teoh! Very helpful. > > would this be random access? for(size_t i; i indices? ...and this be sequential foreach(a;arr) ? > > or would they have to be completely different kinds of containers? a > dlang 'range' vs arr[]? [...] The exact syntactic construct you use is not important; under the hood, for(i; i
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/5/23 6:34 PM, Paul wrote:
> On Tuesday, 4 April 2023 at 22:20:52 UTC, H. S. Teoh wrote:
> > Best practices for arrays in hot loops:
> > - Avoid appending if possible; instead, pre-allocate outside the loop.
> > - Where possible, reuse existing arrays instead of discarding old ones and allocating new ones.
> > - Use slices where possible instead of making copies of subarrays (this esp. applies to strings).
> > - Where possible, prefer sequential access over random access (take advantage of the CPU cache hierarchy).
>
> Thanks for sharing Teoh! Very helpful. would this be random access? for(size_t i; iindices? ...and this be sequential foreach(a;arr) ?

No, random access is access out of sequence. Those two lines are pretty much equivalent, and even a naive compiler is going to produce exactly the same generated code from both of them.

A classic example is processing a 2d array:

```d
for(int i = 0; i < arr[0].length; ++i)
   for(int j = 0; j < arr.length; ++j)
      arr[j][i]++;

// vs

for(int j = 0; j < arr.length; ++j)
   for(int i = 0; i < arr[0].length; ++i)
      arr[j][i]++;
```

The first accesses elements *by column*, which means that the array data is accessed non-linearly in memory. To be fair, both are "linear" in terms of algorithm, but the second is going to be faster because of cache locality (you are accessing sequential *hardware addresses*).

-Steve
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Tuesday, 4 April 2023 at 22:20:52 UTC, H. S. Teoh wrote: Best practices for arrays in hot loops: - Avoid appending if possible; instead, pre-allocate outside the loop. - Where possible, reuse existing arrays instead of discarding old ones and allocating new ones. - Use slices where possible instead of making copies of subarrays (this esp. applies to strings). - Where possible, prefer sequential access over random access (take advantage of the CPU cache hierarchy). Thanks for sharing Teoh! Very helpful. would this be random access? for(size_t i; iusing indices? ...and this be sequential foreach(a;arr) ? or would they have to be completely different kinds of containers? a dlang 'range' vs arr[]?
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Tue, Apr 04, 2023 at 09:35:29PM +, Paul via Digitalmars-d-learn wrote: [...] > Well Steven just making the change you said reduced the execution time > from ~6-7 secs to ~3 secs. Then, including the 'parallel' in the > foreach statement took it down to ~1 sec. > > Boy lesson learned in appending-to and zeroing dynamic arrays in a hot > loop! Best practices for arrays in hot loops: - Avoid appending if possible; instead, pre-allocate outside the loop. - Where possible, reuse existing arrays instead of discarding old ones and allocating new ones. - Use slices where possible instead of making copies of subarrays (this esp. applies to strings). - Where possible, prefer sequential access over random access (take advantage of the CPU cache hierarchy). T -- Famous last words: I *think* this will work...
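A rough sketch of the first, second and third bullets, under made-up assumptions: the names `sumWithAppend`, `sumWithReuse` and `maxItems` are hypothetical and the "work" is arbitrary. Both functions compute the same result; the second one allocates once and hands out slices instead of appending to a fresh array every round:

```d
enum maxItems = 64; // hypothetical upper bound on items per round

// Allocates and appends inside the hot loop (what the list advises against).
long sumWithAppend(int rounds)
{
    long checksum = 0;
    foreach (round; 0 .. rounds)
    {
        int[] items;                    // fresh dynamic array every round
        foreach (i; 0 .. maxItems)
            if (i % 3 == 0)
                items ~= i + round;     // may reallocate via the GC
        foreach (v; items)
            checksum += v;
    }
    return checksum;
}

// Pre-allocates once and reuses the buffer through a slice.
long sumWithReuse(int rounds)
{
    auto buffer = new int[](maxItems);  // allocated outside the loop
    long checksum = 0;
    foreach (round; 0 .. rounds)
    {
        size_t used = 0;
        foreach (i; 0 .. maxItems)
            if (i % 3 == 0)
                buffer[used++] = i + round; // fill in place, no append
        int[] view = buffer[0 .. used];     // slice: no copy, no allocation
        foreach (v; view)                   // sequential walk over the buffer
            checksum += v;
    }
    return checksum;
}

void main()
{
    assert(sumWithAppend(1_000) == sumWithReuse(1_000));
}
```

How much this helps depends on the workload; the sketch only shows the shape of the change, not a measured speedup.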
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Monday, 3 April 2023 at 23:50:48 UTC, Steven Schveighoffer wrote:
> So what you need is inside `createSpansOfNoBeacons`, take as a reference a `ref Span[MAX_SPANS]`, and have it return a `Span[]` that is a slice of that which was "allocated". See if this helps.

Well Steven, just making the change you said reduced the execution time from ~6-7 secs to ~3 secs. Then, including the 'parallel' in the foreach statement took it down to ~1 sec.

Boy, lesson learned in appending-to and zeroing dynamic arrays in a hot loop!
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/4/23 11:34 AM, Salih Dincer wrote: On Tuesday, 4 April 2023 at 14:20:20 UTC, Steven Schveighoffer wrote: parallel is a shortcut to `TaskPool.parallel`, which is indeed a foreach-only construct, it does not return a range. I think what you want is `TaskPool.map`: ```d // untested, just looking at the taskPool.map!(/* your map function here */) (s.iota(len)).writeln; ``` I tried, thanks but it goes into infinite loop. For example, the first 50 of the sequence should have been printed to the screen immediately without waiting. ```d long[50] arr; RowlandSequence_v2 range; auto next(long a) { range.popFront(); return arr[a] = range.front(); } void main() { range = RowlandSequence_v2(7, 2); taskPool.map!next(iota(50))/* s.iota(50) .map!next .parallel//*/ .writeln; } ``` Keep in mind that `arr` and `range` are thread-local, and so will be different states for different tasks. Though I don't really know what you are doing there. -Steve
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Tuesday, 4 April 2023 at 14:20:20 UTC, Steven Schveighoffer wrote: parallel is a shortcut to `TaskPool.parallel`, which is indeed a foreach-only construct, it does not return a range. I think what you want is `TaskPool.map`: ```d // untested, just looking at the taskPool.map!(/* your map function here */) (s.iota(len)).writeln; ``` I tried, thanks but it goes into infinite loop. For example, the first 50 of the sequence should have been printed to the screen immediately without waiting. ```d long[50] arr; RowlandSequence_v2 range; auto next(long a) { range.popFront(); return arr[a] = range.front(); } void main() { range = RowlandSequence_v2(7, 2); taskPool.map!next(iota(50))/* s.iota(50) .map!next .parallel//*/ .writeln; } ``` On Tuesday, 4 April 2023 at 13:18:01 UTC, Ali Çehreli wrote: I don't have time to experiment more at this time but I have the following chapters, which includes some of those other algorithms as well: http://ddili.org/ders/d/kosut_islemler.html I read it, thanks... SDB@79
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/4/23 5:24 AM, Salih Dincer wrote:
> Is it necessary to enclose the code in `foreach()`? I invite Ali to tell me! Please explain why parallel isn't running.

parallel is a shortcut to `TaskPool.parallel`, which is indeed a foreach-only construct; it does not return a range.

I think what you want is `TaskPool.map`:

```d
// untested, just looking at the
taskPool.map!(/* your map function here */)
   (s.iota(len)).writeln;
```

Can't use pipelining with it, because it is a member function.

https://dlang.org/phobos/std_parallelism.html#.TaskPool.map

-Steve
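For reference, a small sketch of how `TaskPool.map` and its eager sibling `TaskPool.amap` can be called. The function `square` is a made-up stand-in for "your map function", and it is kept free of shared mutable state so it is safe to run from several worker threads:

```d
import std.algorithm.comparison : equal;
import std.parallelism : taskPool;
import std.range : iota;

// Hypothetical stand-in for the real map function.
long square(long x) { return x * x; }

void main()
{
    // amap evaluates eagerly in parallel and returns an array.
    auto eager = taskPool.amap!square(iota(10));
    assert(equal(eager, [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]));

    // map returns an input range whose elements are computed in parallel;
    // the results come out in the same order as the source range.
    auto lazyResults = taskPool.map!square(iota(10));
    assert(equal(lazyResults, eager));
}
```

Because both are member functions of the pool, the call is written as `taskPool.map!fn(range)` rather than chained after the range.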
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/4/23 02:24, Salih Dincer wrote: > I don't understand what `foreach()` does :) Hm. I forgot whether 'parallel' works only with 'foreach'. But there are various other algorithms in std.parallelism that may be more useful with range algorithm chains: https://dlang.org/phobos/std_parallelism.html > in Turkish I don't have time to experiment more at this time but I have the following chapters, which includes some of those other algorithms as well: http://ddili.org/ders/d/kosut_islemler.html http://ddili.org/ders/d.en/parallelism.html Ali
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Monday, 3 April 2023 at 22:24:18 UTC, Steven Schveighoffer wrote: So for example, if you have: ```d foreach(i; iota(0, 2_000_000).parallel) { runExpensiveTask(i); } ``` The foreach is run on the main thread, gets a `0`, then hands off to a task thread `runExpensiveTask(0)`. Then it gets a `1`, and hands off to a task thread `runExpensiveTask(1)`, etc. The iteration is not expensive, and is not done in parallel. On the other hand, what you *shouldn't* do is: ```d foreach(i; iota(0, 2_000_000).map!(x => runExpensiveTask(x)).parallel) { } ``` as this will run the expensive task *before* running any tasks. I don't understand what `foreach()` does :) ```d import std.range, std.algorithm : map; import std.stdio, std.parallelism; //import sdb.sequences : RowlandSequence_v2;/* struct RowlandSequence_v2 { import std.numeric : gcd; long b, r, a = 3; enum empty = false; auto front() => a; void popFront() { long result = 1; while(result == 1) { result = gcd(r++, b); b += result; } a = result; } }//*/ enum BP : long { // s, f, r, b = 7, /* <- beginning s = 178, r = 1993083484, b = 5979250449,//*/ len = 190 } void main() { with(BP) { long[len] arr; auto range = RowlandSequence_v2(b, r); s.iota(len) .map!((a){ range.popFront(); return arr[a] = range.front(); } ) .parallel .writeln; } } /* PRINTS: ParallelForeach!(MapResult!(__lambda3, Result))(std.parallelism.TaskPool, [5, 3, 73, 157, 7, 5, 3, 13, 3986167223, 3, 7, 73], 1) */ ``` Is it necessary to enclose the code in `foreach()`? I invite Ali to tell me! Please explain why parallel isn't running. "Ben anlamıyor, foreach ne yapıyor Kodu `foreach()` içine almak şart mı? Ali'yi davet ediyorum, bana anlatması için! Paralel() niye çalışmıyor, lütfen açıklayın hocam. Mümkünse Türkçe!" in Turkish. SDB@79
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/3/23 7:22 PM, Paul wrote: ```d // Timed main() vvv void main(string[] args) { auto progStartTime = MonoTime.currTime; //- string filename = args.length > 1 ? args[1] : "aoc2215a.data"; CommPair[] commPair; ulong x,y; // read file that has data sets in the form of x,y coordinate pairs // for each sensor-beacon pair. Create on array of structs to hold // this information. loadFileDataIntoArrayOfStructs(commPair, filename); foreach(int lineOfInterest;parallel(iota(0,4_000_001))) { Span[] span; // A section of line-of-interest coordinates where no other beacons are present. const spanReserve = span.reserve(50); createSpansOfNoBeacons(lineOfInterest,commPair,span); // if spans overlap, combine them into a single span and mark // the other spans !inUse. combineOverlappingSpans(span); // look for a line that doesn't have 4,000,001 locations accounted for if(beaconFreeLocations(span) < 4_000_001) { // find the location that is not accounted for foreach(ulong i;0..4_000_000) { bool found = false; foreach(sp;span) { if(i >= sp.xLow && i <= sp.xHigh) { found = true; break; } } if(!found) { x = i; y = lineOfInterest; break; } } } } writeln(x," ",y); ``` So I just quoted your main loop. I am assuming that this O(n^2) algorithm doesn't actually run for all iterations, because that wouldn't be feasible (16 trillion iterations is a lot). This means that I'm assuming a lot of cases do not run the second loop. Everything you do besides prune the second loop is mostly allocating an array of `Span` types. This means most of the parallel loops are allocating, and doing nothing else. As I said earlier, allocations need a global lock of the GC. What you need to do probably, is to avoid these allocations per loop. The easiest thing I can think of is to store the Span array as a static array of the largest array you need (i.e. the length of `commPair`), and then slice it instead of appending. 
So what you need is inside `createSpansOfNoBeacons`, take as a reference a `ref Span[MAX_SPANS]`, and have it return a `Span[]` that is a slice of that which was "allocated". See if this helps.

FWIW, I did the AoC 2022 as well, and I never needed parallel execution. Looking at my solution comment in reddit, this one I sort of punted by knowing I could exit as soon as the answer is found (my solution runs in 2.5s on my input). But I recommend (once you are done) reading this post, it is a really cool way to look at it:

https://www.reddit.com/r/adventofcode/comments/zmcn64/2022_day_15_solutions/j0hl19a/?context=8=9

-Steve
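The suggestion above could look roughly like the following sketch. Everything except the `ref Span[MAX_SPANS]` parameter and the `Span[]` return type is an assumption for illustration: the value of `MAX_SPANS`, the helper `collectSpans`, and the way spans are computed are not taken from the posted program.

```d
enum MAX_SPANS = 64; // assumed upper bound, e.g. the number of sensor pairs

struct Span { int xLow, xHigh; }

// Fills the caller's fixed-size storage in place and returns a slice
// covering only the entries that were actually written. No GC allocation.
Span[] collectSpans(const int[] centers, int radius,
                    ref Span[MAX_SPANS] storage)
{
    size_t used = 0;
    foreach (c; centers)
    {
        if (used == storage.length)
            break; // storage full; a real program would size it to fit
        storage[used++] = Span(c - radius, c + radius);
    }
    return storage[0 .. used];
}

void main()
{
    static immutable int[] centers = [2, 10, 20];

    foreach (row; 0 .. 1_000)
    {
        Span[MAX_SPANS] storage; // on the stack, private to this iteration
        Span[] spans = collectSpans(centers, row % 5, storage);
        assert(spans.length == 3);
        assert(spans[0].xLow == 2 - row % 5);
    }
}
```

The returned slice is only valid while `storage` is alive, so it should be used within the loop body and not stored elsewhere.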
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Monday, 3 April 2023 at 23:13:58 UTC, Steven Schveighoffer wrote: Yeah, please post. ```d module aoc2215b2; import std.stdio; import std.file: readText; import std.conv: to; import std.math: abs; import std.traits; import std.parallelism; import std.range; import core.time: MonoTime; // Timed main() vvv void main(string[] args) { auto progStartTime = MonoTime.currTime; //- string filename = args.length > 1 ? args[1] : "aoc2215a.data"; CommPair[] commPair; ulong x,y; // read file that has data sets in the form of x,y coordinate pairs // for each sensor-beacon pair. Create on array of structs to hold // this information. loadFileDataIntoArrayOfStructs(commPair, filename); foreach(int lineOfInterest;parallel(iota(0,4_000_001))) { Span[] span; // A section of line-of-interest coordinates where no other beacons are present. const spanReserve = span.reserve(50); createSpansOfNoBeacons(lineOfInterest,commPair,span); // if spans overlap, combine them into a single span and mark // the other spans !inUse. 
combineOverlappingSpans(span); // look for a line that doesn't have 4,000,001 locations accounted for if(beaconFreeLocations(span) < 4_000_001) { // find the location that is not accounted for foreach(ulong i;0..4_000_000) { bool found = false; foreach(sp;span) { if(i >= sp.xLow && i <= sp.xHigh) { found = true; break; } } if(!found) { x = i; y = lineOfInterest; break; } } } } writeln(x," ",y); //- auto progEndTime = MonoTime.currTime; writeln(progEndTime - progStartTime); } // Timed main() ^^^ struct CommPair { int sx,sy,bx,by; int manhattanDistance; } void loadFileDataIntoArrayOfStructs(ref CommPair[] commPair, string filename) { import std.regex; auto s = readText(filename); auto ctr = ctRegex!(`x=(-*\d+), y=(-*\d+):.*x=(-*\d+), y=(-*\d+)`); CommPair cptemp; foreach (c; matchAll(s, ctr)) { cptemp.sx = to!int(c[1]); cptemp.sy = to!int(c[2]); cptemp.bx = to!int(c[3]); cptemp.by = to!int(c[4]); cptemp.manhattanDistance = abs(cptemp.sx-cptemp.bx) + abs(cptemp.sy-cptemp.by); commPair ~= cptemp; } } struct Span { int xLow, xHigh; bool inUse = true; } void createSpansOfNoBeacons(int lineOfInterest, CommPair[] commPair,ref Span[] span) { foreach(size_t i,cp;commPair) { int distanceToLineOfInterest = abs(cp.sy - lineOfInterest); if(cp.manhattanDistance >= distanceToLineOfInterest) { int xLow = cp.sx - (cp.manhattanDistance - distanceToLineOfInterest); int xHigh = cp.sx + (cp.manhattanDistance - distanceToLineOfInterest); span ~= Span(xLow,xHigh); } } } void combineOverlappingSpans(ref Span[] span) { bool combinedSpansThisCycle = true; while(combinedSpansThisCycle) { combinedSpansThisCycle = false; for(size_t i=0; i < span.length-1; i++) { if(!span[i].inUse) continue; for(size_t j=i+1; j < span.length; j++) { if(!span[j].inUse) continue; // if one span overlaps with the other, combine them into one span if(spanIContainedInSpanJ(span[i],span[j]) || spanJContainedInSpanI(span[i],span[j])) { span[i].xLow = span[i].xLow < span[j].xLow ? 
span[i].xLow : span[j].xLow; span[i].xHigh = span[i].xHigh > span[j].xHigh ? span[i].xHigh : span[j].xHigh; span[j].inUse = false; combinedSpansThisCycle = true; // after combining two spans, perform bounds checking // 15 part b limits the search between 0 and 4,000,000 span[i].xLow = span[i].xLow < 0 ? 0 : span[i].xLow; span[i].xHigh = span[i].xHigh > 4_000_000 ? 4_000_000 : span[i].xHigh;
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/3/23 6:56 PM, Paul wrote:
> On Monday, 3 April 2023 at 22:24:18 UTC, Steven Schveighoffer wrote:
> > If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually).
> **Ok I did have some debug writelns I commented out.**
> > And did it help?
> **No**
>
> My program is about 140 lines Steven. It's just one of the Advent of Code challenges. Could I paste the whole program here and see what you think?

Yeah, please post.

-Steve
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Monday, 3 April 2023 at 22:24:18 UTC, Steven Schveighoffer wrote:
> If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually).

**Ok I did have some debug writelns I commented out.**

> And did it help?

**No**

My program is about 140 lines Steven. It's just one of the Advent of Code challenges. Could I paste the whole program here and see what you think?

Thanks for your assistance...much appreciated.
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/3/23 6:02 PM, Paul wrote:
> On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote:
> > It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count.
> **?!?**

So for example, if you have:

```d
foreach(i; iota(0, 2_000_000).parallel)
{
   runExpensiveTask(i);
}
```

The foreach is run on the main thread, gets a `0`, then hands off to a task thread `runExpensiveTask(0)`. Then it gets a `1`, and hands off to a task thread `runExpensiveTask(1)`, etc. The iteration is not expensive, and is not done in parallel.

On the other hand, what you *shouldn't* do is:

```d
foreach(i; iota(0, 2_000_000).map!(x => runExpensiveTask(x)).parallel)
{
}
```

as this will run the expensive task on the main thread, as part of the iteration, *before* anything is handed off to a task thread.

> > If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually).
> **Ok I did have some debug writelns I commented out.**

And did it help? Another thing that takes a global lock is memory allocation.

> > Also make sure you have more than one logical CPU.
> **I have 8.**

It's dependent on the work being done, but you should see a roughly 8x speedup as long as the overhead of distributing tasks is not significant compared to the work being done.

-Steve
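As a sketch of the recommended shape, with the expensive work inside the `foreach` body: `expensive` below is a made-up stand-in for `runExpensiveTask`, and the results array is allocated once up front so the body neither allocates nor takes a lock. Each iteration writes only to its own slot, so no synchronization is needed:

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical CPU-bound stand-in for runExpensiveTask.
long expensive(int n)
{
    long acc = 0;
    foreach (k; 0 .. 1_000)
        acc += (n * 31L + k) % 7;
    return acc;
}

void main()
{
    enum count = 10_000;
    auto results = new long[](count); // allocated once, outside the loop

    // The cheap iteration is driven from the calling thread;
    // the bodies are distributed across the task pool.
    foreach (i; iota(count).parallel)
        results[i] = expensive(i);    // each iteration owns results[i]

    // Same answers as a plain serial loop.
    foreach (i; 0 .. count)
        assert(results[i] == expensive(i));
}
```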
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Sunday, 2 April 2023 at 15:32:05 UTC, Steven Schveighoffer wrote: It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count. **?!?** If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually). **Ok I did have some debug writelns I commented out.** If you can disclose more about what you are trying to do, it would be helpful. **This seems like it would be a lot of code and explaining but let me think about how to summarize.** Also make sure you have more than one logical CPU. **I have 8.**
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/1/23 6:32 PM, Paul wrote: On Saturday, 1 April 2023 at 18:30:32 UTC, Steven Schveighoffer wrote: On 4/1/23 2:25 PM, Paul wrote: ```d import std.range; foreach(i; iota(0, 2_000_000).parallel) ``` Is there a way to tell if the parallelism actually divided up the work? Both versions of my program run in the same time ~6 secs. It's important to note that parallel doesn't iterate the range in parallel, it just runs the body in parallel limited by your CPU count. If your `foreach` body takes a global lock (like `writeln(i);`), then it's not going to run any faster (probably slower actually). If you can disclose more about what you are trying to do, it would be helpful. Also make sure you have more than one logical CPU. -Steve
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Sunday, 2 April 2023 at 04:34:40 UTC, Salih Dincer wrote: I haven't seen rsFirst256 until now... **Edit:** I saw, I saw :) I am struck with consternation! I've never seen these results before. Interesting, there is such a thing as parallel threading :) Here are my skipPoints: ```d enum BP : long { //f, a, r, b = 7, /* <- beginning f = 113, r = 62279227, b = 186837678, // f = 146, r = 249134971, b = 747404910, // f = 161, r = 498270808, b = 1494812421, // f = 178, r = 1993083484, b = 5979250449, // f = 210, r = 3986167363, b = 11958502086, //*/ s = 5 } /* PRINTS: eLab@pico:~/Projeler$ ./RownlandSequence_v2 122: ["124559610, 373678827"] 128: ["249120240, 747360717"] */ ``` It looks like there are 5 total skipPoints until 256 where it loops for a long time. (while passing 1's). SDB@79
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Saturday, 1 April 2023 at 22:48:46 UTC, Ali Çehreli wrote: On 4/1/23 15:30, Paul wrote: > Is there a way to verify that it split up the work in to tasks/threads > ...? It is hard to see the difference unless there is actual work in the loop that takes time. I always use the Rowland Sequence for such experiments. At least it's better than the Fibonacci Range: ```d struct RowlandSequence { import std.numeric : gcd; import std.format : format; import std.conv : text; long b, r, a = 3; enum empty = false; string[] front() { string result = format("%s, %s", b, r); return [text(a), result]; } void popFront() { long result = 1; while(result == 1) { result = gcd(r++, b); b += result; } a = result; } } enum BP { f = 1, b = 7, r = 2, a = 1, /* f = 109, b = 186837516, r = 62279173, //*/ s = 5 } void main() { RowlandSequence rs; long start, skip; with(BP) { rs = RowlandSequence(b, r); start = f; skip = s; } rs.popFront(); import std.stdio, std.parallelism; import std.range : take; auto rsFirst128 = rs.take(128); foreach(r; rsFirst128.parallel) { if(r[0].length > skip) { start.writeln(": ", r); } start++; } } /* PRINTS: 46: ["121403", "364209, 121404"] 48: ["242807", "728421, 242808"] 68: ["486041", "1458123, 486042"] 74: ["972533", "2917599, 972534"] 78: ["1945649", "5836947, 1945650"] 82: ["3891467", "11674401, 3891468"] 90: ["7783541", "23350623, 7783542"] 93: ["15567089", "46701267, 15567090"] 102: ["31139561", "93418683, 31139562"] 108: ["62279171", "186837513, 62279172"] */ ``` The operation is simple, again multiplication, addition, subtraction and module, i.e. So four operations but enough to overrun the CPU! I haven't seen rsFirst256 until now because I don't have a fast enough processor. Maybe you'll see it, but the first 108 is fast anyway. **PS:** Decrease value of the `skip` to see the entire sequence. In cases where your processor power is not enough, you can create skip points. Check out BP... SDB@79
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/1/23 15:30, Paul wrote:
> Is there a way to verify that it split up the work in to tasks/threads
> ...?

It is hard to see the difference unless there is actual work in the loop that takes time. You can add a Thread.sleep call. (Commented-out in the following program.)

Another option is to monitor a task manager like 'top' on unix based systems. It should show multiple threads for the same program.

However, I will do something unspeakably wrong and take advantage of undefined behavior below. :) Since iteration count is an even number, the 'sum' variable should come out as 0 in the end. With .parallel it doesn't because multiple threads are stepping on each other's toes (values):

    import std;

    void main() {
        long sum;

        foreach(i; iota(0, 2_000_000).parallel) {
            // import core.thread;
            // Thread.sleep(1.msecs);

            if (i % 2) {
                ++sum;

            } else {
                --sum;
            }
        }

        if (sum == 0) {
            writeln("We highly likely worked serially.");

        } else {
            writefln!"We highly likely worked in parallel because %s != 0."(sum);
        }
    }

If you remove .parallel, 'sum' will always be 0.

Ali
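For contrast with the deliberately racy program above, here is a sketch of one way to make the shared counter well-defined, using `core.atomic`. The function name `balancedCount` is made up, and this is only one option (a per-thread partial sum or `taskPool.reduce` would be others). With atomic updates the result is 0 for an even iteration count, parallel or not:

```d
import core.atomic : atomicLoad, atomicOp;
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical helper: +1 for odd i, -1 for even i, updated atomically.
long balancedCount(int n)
{
    shared long sum;

    foreach (i; iota(0, n).parallel)
    {
        if (i % 2)
            atomicOp!"+="(sum, 1);
        else
            atomicOp!"-="(sum, 1);
    }
    return atomicLoad(sum);
}

void main()
{
    assert(balancedCount(2_000_000) == 0);
}
```

Note that this no longer tells you whether the loop ran in parallel, and contended atomics are slow; it only removes the data race.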
Re: foreach (i; taskPool.parallel(0..2_000_000)
On Saturday, 1 April 2023 at 18:30:32 UTC, Steven Schveighoffer wrote:
> On 4/1/23 2:25 PM, Paul wrote:
>
> ```d
> import std.range;
> foreach(i; iota(0, 2_000_000).parallel)
> ```
>
> -Steve

Is there a way to tell if the parallelism actually divided up the work? Both versions of my program run in the same time ~6 secs.
Re: foreach (i; taskPool.parallel(0..2_000_000)
```d
import std.range;
foreach(i; iota(0, 2_000_000).parallel)
```

-Steve

Is there a way to verify that it split up the work into tasks/threads ...? The example you gave me works...compiles w/o errors but the execution time is the same as the non-parallel version. They both take about 6 secs to execute. totalCPUs tells me I have 8 CPUs available.
Re: foreach (i; taskPool.parallel(0..2_000_000)
Thanks Steve.
Re: foreach (i; taskPool.parallel(0..2_000_000)
On 4/1/23 2:25 PM, Paul wrote:
> Thanks in advance for any assistance. As the subject line suggests can I do something like? :
>
> ```d
> foreach (i; taskPool.parallel(0..2_000_000))
> ```
>
> Obviously this exact syntax doesn't work but I think it expresses the gist of my challenge.

```d
import std.range;
foreach(i; iota(0, 2_000_000).parallel)
```

-Steve
Re: foreach with assoc. array
On Wednesday, 1 March 2023 at 19:05:10 UTC, DLearner wrote: ``` Error: variable `wk_Idx` is shadowing variable `for3.main.wk_Idx` ``` Why is this usage wrong? Or use the `each` template which is almost the same as `foreach` to avoid the shadowing variable issue. ```d import std.algorithm, std.range, std.stdio; enum len = 10; void main() { int[len] IntArr, i, n; len.iota.each!((i, n) => IntArr[i] = n); IntArr.writeln; // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] assert(IntArr == len.iota.array); } ``` SDB@79
Re: foreach with assoc. array
On Wednesday, 1 March 2023 at 19:05:10 UTC, DLearner wrote: (1) & (2) compile and run with the expected results. But (3) fails with: ``` Error: variable `wk_Idx` is shadowing variable `for3.main.wk_Idx` ``` Why is this usage wrong? With `foreach`, you can't reuse an existing variable as the loop variable. It always declares a new one. If you want to reuse an existing variable for your loop, you have to use `for`.
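A short sketch of the two options (the names `data`, `idx` and `found` are made up for illustration): `for` can drive a variable declared earlier, while `foreach` always introduces its own loop variable, so the value has to be copied out explicitly if it is needed afterwards.

```d
void main()
{
    int[5] data = [10, 20, 30, 40, 50];

    // `for` can reuse a variable that already exists.
    size_t idx;
    for (idx = 0; idx < data.length; ++idx)
        if (data[idx] == 30)
            break;
    assert(idx == 2); // still holds the position after the loop

    // `foreach` declares fresh loop variables `i` and `value`;
    // copy the index out if it must outlive the loop.
    size_t found = size_t.max;
    foreach (i, value; data)
    {
        if (value == 30)
        {
            found = i;
            break;
        }
    }
    assert(found == 2);
}
```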
Re: foreach(ubyte j;0 .. num) is bugging out
On Thursday, 23 September 2021 at 00:30:45 UTC, Ruby The Roobster wrote:
> I figured out something weird. The variable 'i' is passed by reference, yet the variable 'i' of the loop isn't being incremented by posfunc. I assume foreach creates a new i variable at the start of each new loop.

Yep:

```
$ rdmd --eval 'foreach (i; 0 .. 5) { writeln(i); i++; }'
0
1
2
3
4
```
Re: foreach(ubyte j;0 .. num) is bugging out
On Thursday, 23 September 2021 at 00:17:49 UTC, jfondren wrote: On Thursday, 23 September 2021 at 00:06:42 UTC, Ruby The Roobster wrote: So, I have the following function: ```d writeln(tempcolor); //For this matter, the program correctly reports tempcolor as 1... for(ubyte j = 0;j < tempcolor; j++ /*trying ++j has same effect*/ ) { //tempcolor is 1, yet this sloop gets executed twice... writeln(); posfunc(ftext, main, exp, temp, i, j, points , x); //Orignally foreach loop, but switching to for loop has same effect... } ``` Needs more print in your print debugging: ```d writeln("tempcolor: ", tempcolor); ... writeln("in tempcolor with j: ", j); ``` output: ``` tempcolor: 1 in tempcolor with j: 0 ... ... numbers ... tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 0 tempcolor: 1 in tempcolor with j: 0 ``` I figured out something weird. The variable 'i' is passed by reference, yet the variable 'i' of the loop isn't being incremented by posfunc. I assume foreach creates a new i variable at the start of each new loop. Swapping the original loop with a while loop fixes the problem. Thank you very much for trying to help.
Re: foreach(ubyte j;0 .. num) is bugging out
On Thursday, 23 September 2021 at 00:06:42 UTC, Ruby The Roobster wrote: So, I have the following function:
```d
writeln(tempcolor); //For this matter, the program correctly reports tempcolor as 1...
for(ubyte j = 0;j < tempcolor; j++ /*trying ++j has same effect*/ ) { //tempcolor is 1, yet this loop gets executed twice...
    writeln();
    posfunc(ftext, main, exp, temp, i, j, points , x); //Originally foreach loop, but switching to for loop has same effect...
}
```
Needs more print in your print debugging:
```d
writeln("tempcolor: ", tempcolor);
...
writeln("in tempcolor with j: ", j);
```
output:
```
tempcolor: 1
in tempcolor with j: 0
... ... numbers ...
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 0
tempcolor: 1
in tempcolor with j: 0
... ... numbers ...
```
Here's a one-liner to reproduce abc.txt:
```
rdmd --eval '"00 00 00 01 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 01".split(" ").map!(s => cast(char) s.to!ubyte).write' > abc.txt
```
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 11:02:23 UTC, Steven Schveighoffer wrote: On 8/25/21 4:31 AM, frame wrote: On Tuesday, 24 August 2021 at 21:15:02 UTC, Steven Schveighoffer wrote: I'm surprised you bring PHP as an example, as it appears their foreach interface works EXACTLY as D does: Yeah, but the point is, there is a rewind() method. That is called every time on foreach(). It seems what you are after is forward ranges. Those are able to "rewind" when you are done with them. It's just not done through a rewind method, but via saving the range before iteration:
```d
foreach(val; forwardRange.save) {
    ...
    break;
}
// forwardRange hasn't been iterated here
```
-Steve

This could be any custom method for my ranges or a forward range returned by some function. But that doesn't help if some third-party library function breaks out of the loop and returns just an input range. Then it seems that it must be implemented very carefully, like the postblit techniques mentioned before, which some authors may never care about. That it works in 99% of all cases should not be an excuse for a design flaw. The documentation really needs to mention this.
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 19:51:36 UTC, H. S. Teoh wrote: What I understand from what Andrei has said in the past, is that a range is merely a "view" into some underlying storage; it is not responsible for the contents of that storage. My interpretation of this is that .save will only save the *position* of the range, but it will not save the contents it points to, so it will not (should not) deep-copy. That definition is potentially misleading if we take into account that a range is not necessarily iterating over some underlying storage: ranges can also be defined by algorithmic processes. (Think e.g. iota, or pseudo-RNGs, or a range that iterates over the Fibonacci numbers.) However, if the range is implemented by a struct that contains a reference to its iteration state, then yes, to satisfy the definition of .save it should deep-copy this state. Right. And in the case of algorithmic ranges (rather than container-derived ranges), the state is always and only the iteration state. And then as well as that there are ranges that are iterating over external IO, which in most cases can't be treated as forward ranges but in a few cases might be (e.g. saving the cursor position when iterating over a file's contents). Arguably I think a lot of problems in the range design derive from not thinking through those distinctions in detail (external-IO-based vs. algorithmic vs. container-based), even though superficially those seem to map well to the input vs forward vs bidirectional vs random-access range distinctions. That's also not taking into account edge cases, e.g. stuff like RandomShuffle or RandomSample: here one can in theory copy the "head" of the range but one arguably wants to avoid correlations in the output of the different copies (which can arise from at least 2 different sources: copying under-the-hood pseudo-random state of the sampling/shuffling algorithm itself, or copying the underlying pseudo-random number generator). 
Except perhaps in the case where one wants to take advantage of the pseudo-random feature to reproduce those sequences ... but then one wants that to be a conscious programmer decision, not happening by accident under the hood of some library function. (Rabbit hole, here we come.) Andrei has mentioned before that in retrospect, .save was a design mistake. The difference between an input range and a forward range should have been keyed on whether the range type has reference semantics (input range) or by-value semantics (forward range). But for various reasons, including the state of the language at the time the range API was designed, the .save route was chosen, and we're stuck with it unless Phobos 2.0 comes into existence. Either way, though, the semantics of a forward range pretty much dictates that whatever type a range has, if it claims to be a forward range then .save must preserve whatever iteration state it has at that point in time. If this requires deep-copying some state referenced from a struct, then that's what it takes to satisfy the API. This may take the form of a .save method that copies state, or a copy ctor that does the same, or simply storing iteration state as PODs in the range struct so that copying the struct equates to preserving the iteration state. Yes. FWIW I agree that when _implementing_ a forward range one should probably make sure that copying by value and the `save` method produce the same results. But as a _user_ of code implemented using the current range API, it might be a bad idea to assume that a 3rd party forward range implementation will necessarily guarantee that.
Re: foreach() behavior on ranges
On Wed, Aug 25, 2021 at 04:46:54PM +, Joseph Rushton Wakeling via Digitalmars-d-learn wrote: > On Wednesday, 25 August 2021 at 10:59:44 UTC, Steven Schveighoffer wrote: > > structs still provide a mechanism (postblit/copy ctor) to properly > > save a forward range when copying, even if the guts need copying > > (unlike classes). In general, I think it was a mistake to use > > `.save` as the mechanism, as generally `.save` is equivalent to > > copying, so nobody does it, and code works fine for most ranges. > > Consider a struct whose internal fields are just a pointer to its > "true" internal state. Does one have any right to assume that the > postblit/copy ctor would necessarily deep-copy that? [...] > If that struct implements a forward range, though, and that pointed-to > state is mutated by iteration of the range, then it would be > reasonable to assume that the `save` method MUST deep-copy it, because > otherwise the forward-range property would not be respected. [...] What I understand from what Andrei has said in the past, is that a range is merely a "view" into some underlying storage; it is not responsible for the contents of that storage. My interpretation of this is that .save will only save the *position* of the range, but it will not save the contents it points to, so it will not (should not) deep-copy. However, if the range is implemented by a struct that contains a reference to its iteration state, then yes, to satisfy the definition of .save it should deep-copy this state. > With that in mind, I am not sure it's reasonable to assume that just > because a struct implements a forward-range API, that copying the > struct instance is necessarily the same as saving the range. [...] Andrei has mentioned before that in retrospect, .save was a design mistake. The difference between an input range and a forward range should have been keyed on whether the range type has reference semantics (input range) or by-value semantics (forward range). 
But for various reasons, including the state of the language at the time the range API was designed, the .save route was chosen, and we're stuck with it unless Phobos 2.0 comes into existence. Either way, though, the semantics of a forward range pretty much dictates that whatever type a range has, if it claims to be a forward range then .save must preserve whatever iteration state it has at that point in time. If this requires deep-copying some state referenced from a struct, then that's what it takes to satisfy the API. This may take the form of a .save method that copies state, or a copy ctor that does the same, or simply storing iteration state as PODs in the range struct so that copying the struct equates to preserving the iteration state. T -- Why waste time reinventing the wheel, when you could be reinventing the engine? -- Damian Conway
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 17:01:54 UTC, Steven Schveighoffer wrote: In a world where copyability means it's a forward range? Yes. We aren't in that world, it's a hypothetical "if we could go back and redesign". OK, that makes sense. Technically this is true. In practice, it rarely happens. The flaw of `save` isn't that it's an unsound API, the flaw is that people get away with just copying, and it works 99.9% of the time. So code is simply untested with ranges where `save` is important. This is very true, and makes it quite reasonable to try to pursue "the obvious/lazy thing == the thing you're supposed to do" w.r.t. how ranges are defined. I'd be willing to bet $10 there is a function in phobos right now, that takes forward ranges, and forgets to call `save` when iterating with foreach. It's just so easy to do, and works with most ranges in existence. I'm sure you'd win that bet! The idea is to make the meaning of a range copy not ambiguous. Yes, this feels reasonable. And then one can reserve the idea of a magic deep-copy method for special cases like pseudo-RNGs where one wants them to be copyable on user request, but without code assuming it can copy them.
Re: foreach() behavior on ranges
On 8/25/21 12:46 PM, Joseph Rushton Wakeling wrote: On Wednesday, 25 August 2021 at 10:59:44 UTC, Steven Schveighoffer wrote: structs still provide a mechanism (postblit/copy ctor) to properly save a forward range when copying, even if the guts need copying (unlike classes). In general, I think it was a mistake to use `.save` as the mechanism, as generally `.save` is equivalent to copying, so nobody does it, and code works fine for most ranges. Consider a struct whose internal fields are just a pointer to its "true" internal state. Does one have any right to assume that the postblit/copy ctor would necessarily deep-copy that? In a world where copyability means it's a forward range? Yes. We aren't in that world, it's a hypothetical "if we could go back and redesign". If that struct implements a forward range, though, and that pointed-to state is mutated by iteration of the range, then it would be reasonable to assume that the `save` method MUST deep-copy it, because otherwise the forward-range property would not be respected. With that in mind, I am not sure it's reasonable to assume that just because a struct implements a forward-range API, that copying the struct instance is necessarily the same as saving the range. Technically this is true. In practice, it rarely happens. The flaw of `save` isn't that it's an unsound API, the flaw is that people get away with just copying, and it works 99.9% of the time. So code is simply untested with ranges where `save` is important. Indeed, IIRC quite a few Phobos library functions program defensively against that difference by taking a `.save` copy of their input before iterating over it. I'd be willing to bet $10 there is a function in phobos right now, that takes forward ranges, and forgets to call `save` when iterating with foreach. It's just so easy to do, and works with most ranges in existence. 
What should have happened is that input-only ranges should not have been copyable, and copying should have been the save mechanism. Then it becomes way way more obvious what is happening. Yes, this means forgoing classes as ranges. I think there's a benefit of a method whose definition is explicitly "If you call this, you will get a copy of the range which will replay exactly the same results when iterating over it". Just because the meaning of "copy" can be ambiguous, whereas a promise about how iteration can be used is not. The idea is to make the meaning of a range copy not ambiguous. -Steve
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 10:59:44 UTC, Steven Schveighoffer wrote: structs still provide a mechanism (postblit/copy ctor) to properly save a forward range when copying, even if the guts need copying (unlike classes). In general, I think it was a mistake to use `.save` as the mechanism, as generally `.save` is equivalent to copying, so nobody does it, and code works fine for most ranges. Consider a struct whose internal fields are just a pointer to its "true" internal state. Does one have any right to assume that the postblit/copy ctor would necessarily deep-copy that? If that struct implements a forward range, though, and that pointed-to state is mutated by iteration of the range, then it would be reasonable to assume that the `save` method MUST deep-copy it, because otherwise the forward-range property would not be respected. With that in mind, I am not sure it's reasonable to assume that just because a struct implements a forward-range API, that copying the struct instance is necessarily the same as saving the range. Indeed, IIRC quite a few Phobos library functions program defensively against that difference by taking a `.save` copy of their input before iterating over it. What should have happened is that input-only ranges should not have been copyable, and copying should have been the save mechanism. Then it becomes way way more obvious what is happening. Yes, this means forgoing classes as ranges. I think there's a benefit of a method whose definition is explicitly "If you call this, you will get a copy of the range which will replay exactly the same results when iterating over it". Just because the meaning of "copy" can be ambiguous, whereas a promise about how iteration can be used is not.
Re: foreach() behavior on ranges
On 8/25/21 7:26 AM, Alexandru Ermicioi wrote: On Wednesday, 25 August 2021 at 11:04:35 UTC, Steven Schveighoffer wrote: It never has called `save`. It makes a copy, which is almost always the equivalent `save` implementation. Really? Then what is the use for .save method then? The only reason I can find is that you can't declare constructors in interfaces hence the use of the .save method instead of copy constructor for defining forward ranges. The `save` function was used to provide a way for code like `isForwardRange` to have a definitive symbol to search for. It's also opt-in, whereas if we used copying, it would be opt-out. Why a function, and not just some enum? Because it should be something that has to be used, not just a "documenting" attribute if I recall correctly. Keep in mind, UDAs were not a thing yet, and compile-time introspection was not as robust as it is now. I'm not even sure you could disable copying. We have now two ways of doing the same thing, which can cause confusion. Best would be then for ranges to hide copy constructor under private modifier (or disable altogether), and force other range wrappers to call .save always, including foreach since by not doing so we introduce difference in behavior between ref and value forward ranges (for foreach use). There would be a huge hole in this plan -- arrays. Arrays are the most common range anywhere, and if a forward range must not be copyable any way but using `save`, it would mean arrays are not forward ranges. Not to mention that foreach on an array is a language construct, and does not involve the range interface. -Steve
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 11:04:35 UTC, Steven Schveighoffer wrote: It never has called `save`. It makes a copy, which is almost always the equivalent `save` implementation. -Steve Really? Then what is the use for .save method then? The only reason I can find is that you can't declare constructors in interfaces hence the use of the .save method instead of copy constructor for defining forward ranges. We have now two ways of doing the same thing, which can cause confusion. Best would be then for ranges to hide copy constructor under private modifier (or disable altogether), and force other range wrappers to call .save always, including foreach since by not doing so we introduce difference in behavior between ref and value forward ranges (for foreach use).
Re: foreach() behavior on ranges
On 8/25/21 6:06 AM, Alexandru Ermicioi wrote: On Wednesday, 25 August 2021 at 08:15:18 UTC, frame wrote: I know, but foreach() doesn't call save(). Hmm, this is a regression probably, or I missed the time frame when foreach moved to use of copy constructor for forward ranges. Do we have a well defined description of what input, forward and any other well known range is, and how it does interact with language features? For some reason I didn't manage to find anything on dlang.org. It never has called `save`. It makes a copy, which is almost always the equivalent `save` implementation. -Steve
Re: foreach() behavior on ranges
On 8/25/21 4:31 AM, frame wrote: On Tuesday, 24 August 2021 at 21:15:02 UTC, Steven Schveighoffer wrote: I'm surprised you bring PHP as an example, as it appears their foreach interface works EXACTLY as D does: Yeah, but the point is, there is a rewind() method. That is called every time on foreach(). It seems what you are after is forward ranges. Those are able to "rewind" when you are done with them. It's just not done through a rewind method, but via saving the range before iteration:
```d
foreach(val; forwardRange.save) {
    ...
    break;
}
// forwardRange hasn't been iterated here
```
-Steve
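A runnable sketch of the `.save` idiom, under the assumption of a small class-based forward range. `Countdown` and `frontAfterBreak` are hypothetical names invented for this example. A class is used because copying a class reference does not copy the iteration state, so the difference between iterating the range itself and iterating `range.save` becomes visible:

```d
import std.range.primitives : isForwardRange;
import std.stdio;

/// Hypothetical reference-type forward range counting n, n-1, ..., 1.
final class Countdown
{
    private int n;
    this(int n) { this.n = n; }

    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }

    /// Returns an independent copy of the iteration state.
    @property Countdown save() { return new Countdown(n); }
}

static assert(isForwardRange!Countdown);

/// Breaks out of a foreach at value 3 and reports where the original range is.
int frontAfterBreak(bool useSave)
{
    auto r = new Countdown(5);
    if (useSave)
    {
        foreach (v; r.save)     // iterate an independent copy
            if (v == 3) break;
    }
    else
    {
        foreach (v; r)          // iterate the shared object itself
            if (v == 3) break;
    }
    return r.front;
}

void main()
{
    writeln(frontAfterBreak(true));  // 5: the original was not advanced
    writeln(frontAfterBreak(false)); // 3: the original was consumed up to the break
}
```

For a plain struct range whose state is stored by value, both calls would typically leave the original untouched, which is why forgetting `.save` so often goes unnoticed.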
Re: foreach() behavior on ranges
On 8/25/21 6:06 AM, Joseph Rushton Wakeling wrote: On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote: A range should be a struct always and thus its state is copied when the foreach loop is created. That's quite a strong assumption, because its state might be a reference type, or it might not _have_ state in a meaningful sense -- consider an input range that wraps reading from a socket, or that just reads from `/dev/urandom`, for two examples. Deterministic copying per foreach loop is only guaranteed for forward ranges. structs still provide a mechanism (postblit/copy ctor) to properly save a forward range when copying, even if the guts need copying (unlike classes). In general, I think it was a mistake to use `.save` as the mechanism, as generally `.save` is equivalent to copying, so nobody does it, and code works fine for most ranges. What should have happened is that input-only ranges should not have been copyable, and copying should have been the save mechanism. Then it becomes way way more obvious what is happening. Yes, this means forgoing classes as ranges. -Steve
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote: A range should be a struct always and thus its state is copied when the foreach loop is created. That's quite a strong assumption, because its state might be a reference type, or it might not _have_ state in a meaningful sense -- consider an input range that wraps reading from a socket, or that just reads from `/dev/urandom`, for two examples. Deterministic copying per foreach loop is only guaranteed for forward ranges.
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 08:15:18 UTC, frame wrote: I know, but foreach() doesn't call save(). Hmm, this is a regression probably, or I missed the time frame when foreach moved to use of copy constructor for forward ranges. Do we have a well defined description of what input, forward and any other well known range is, and how it does interact with language features? For some reason I didn't manage to find anything on dlang.org.
Re: foreach() behavior on ranges
On Wednesday, 25 August 2021 at 06:51:36 UTC, bauss wrote: Of course it doesn't disallow classes but it's generally advised that you use structs and that's what you want in 99% of the cases. It's usually a red flag when a range starts being a reference type. Well, sometimes you can't avoid ref types. For example when you need to mask the implementation of the range, but yes, in most of the cases best is to use simpler methods to represent ranges.
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 21:15:02 UTC, Steven Schveighoffer wrote: If you have a for loop:
```d
int i;
for(i = 0; i < someArr.length; ++i)
{
    if(someArr[i] == desiredValue)
        break;
}
```
You are saying, "compiler, please execute the `++i` when I break from the loop because I already processed that one". How can that be expected? I would *never* expect that. When I break, it means "stop the loop, I'm done", and then I use `i` which is where I expected it to be. I get your point, you see foreach() as a raw translation to the for-loop and I'm fine with that. To automatically popFront() on break also is only a suggestion if there is no other mechanism to tell the range we have cancelled it. It becomes useless for foreach() because you can't rely on them if other code breaks the loop and you need to use that range, like in my case. But also for ranges - there is no need for a popFront() if it is not called in a logical way. Then even empty() could fetch next data if needed. It only makes sense if language system code uses it in a strict order and ensures that this order is always assured. There is no problem with the ordering. What seems to be the issue is that you aren't used to the way ranges work. Ehm, no... -> empty() -> front() -> popFront() -> empty() -> front() break; -> empty(); -> front(); clearly violates the order for me. Well, nobody said that we must move on the range - but come on... What's great about D is that there is a solution for you:
```d
import std.range.primitives : ElementType;

struct EagerPopfrontRange(R)
{
    R source;
    ElementType!R front;
    bool empty;
    void popFront()
    {
        if(source.empty)
            empty = true;
        else
        {
            front = source.front;
            source.popFront;
        }
    }
}

auto epf(R)(R inputRange)
{
    auto result = EagerPopfrontRange!R(inputRange);
    result.popFront; // eager!
    return result;
}

// usage
foreach(v; someRange.epf)
{
    ...
}
```
Now if you break from the loop, the original range is pointing at the element *after* the one you last were processing. This is nice.
But foreach() should do it automatically - avoiding this. foreach() should be seen as a special construct that does that, not just a dumb alias for the for-loop. Why? Because it is a convenient language construct and usage should be easy. Again, there should be no additional popFront() just because I break the loop. I'm surprised you bring PHP as an example, as it appears their foreach interface works EXACTLY as D does: Yeah, but the point is, there is a rewind() method. That is called every time on foreach().
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 18:52:19 UTC, Alexandru Ermicioi wrote: Forward range exposes also capability to create save points, which is actually used by foreach to do, what it is done in java by iterable interface for example. I know, but foreach() doesn't call save().
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 19:06:44 UTC, Alexandru Ermicioi wrote: On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote: A range should be a struct always and thus its state is copied when the foreach loop is created. Actually the range contracts don't mention that it needs to be a by value type. It can also be a reference type, i.e. a class. Of course it doesn't disallow classes but it's generally advised that you use structs and that's what you want in 99% of the cases. It's usually a red flag when a range starts being a reference type.
Re: foreach() behavior on ranges
On 8/24/21 1:44 PM, Ferhat Kurtulmuş wrote: > Just out of curiosity, if a range implementation uses malloc in save, is > it only possible to free the memory with the dtor? Yes, but it depends on the specific case. For example, if the type has a clear() function that does clean up, then one might call that. I don't see it as being different from any other resource management. > Is a save function only meaningful for GC ranges? save() is to store the iteration state of a range. It should seldom require memory allocation unless we're dealing with e.g. stdin where we would have to store input lines just to support save(). It would not be a good design to hide such potentially expensive storage of lines behind save(). To me, save() should mostly be as trivial as returning a copy of the struct object to preserve the state of the original range. Here is a trivial generator:
```d
import std.range;

struct Squares {
  int current;

  enum empty = false;

  int front() const {
    return current * current;
  }

  void popFront() {
    ++current;
  }

  auto save() {
    return this;
  }
}

void main() {
  auto r = Squares(0);
  r.popFront();  // Drop 0 * 0
  r.popFront();  // Drop 1 * 1

  auto copy = r.save;
  copy.popFront();  // Drop 2 * 2 only from the copy

  assert(r.front == 2 * 2);  // Saved original still has 2 * 2
}
```
Ali
Re: foreach() behavior on ranges
On 8/24/21 2:12 PM, frame wrote: You can call `popFront` if you need to after the loop, or just before the break. I have to say, the term "useless" does not even come close to describing ranges using foreach in my experience. I disagree, because foreach() is a language construct and therefore it should behave in a logical way. The methods are fine in ranges or if something is done manually. But in case of foreach() it's just unexpected. I can't agree at all. It's totally expected. If you have a for loop:
```d
int i;
for(i = 0; i < someArr.length; ++i)
{
    if(someArr[i] == desiredValue)
        break;
}
```
You are saying, "compiler, please execute the `++i` when I break from the loop because I already processed that one". How can that be expected? I would *never* expect that. When I break, it means "stop the loop, I'm done", and then I use `i` which is where I expected it to be. It becomes useless for foreach() because you can't rely on them if other code breaks the loop and you need to use that range, like in my case. But also for ranges - there is no need for a popFront() if it is not called in a logical way. Then even empty() could fetch next data if needed. It only makes sense if language system code uses it in a strict order and ensures that this order is always assured. There is no problem with the ordering. What seems to be the issue is that you aren't used to the way ranges work. What's great about D is that there is a solution for you:
```d
import std.range.primitives : ElementType;

struct EagerPopfrontRange(R)
{
    R source;
    ElementType!R front;
    bool empty;
    void popFront()
    {
        if(source.empty)
            empty = true;
        else
        {
            front = source.front;
            source.popFront;
        }
    }
}

auto epf(R)(R inputRange)
{
    auto result = EagerPopfrontRange!R(inputRange);
    result.popFront; // eager!
    return result;
}

// usage
foreach(v; someRange.epf)
{
    ...
}
```
Now if you break from the loop, the original range is pointing at the element *after* the one you last were processing. It's not a bug. So there is no need to "handle" it.
The pattern of using a for(each) loop to align certain things occurs all the time in code. Imagine a loop that is looking for a certain line in a file, and breaks when the line is there. Would you really want the compiler to unhelpfully throw away that line for you? I don't get this point. If it breaks from the loop then it changes the scope anyway, so my data should be already processed or copied. What is thrown away here? Why does the loop have to contain all your code? Maybe you have code after the loop. Maybe the loop's purpose is to align the range based on some criteria (e.g. take this byLine range and prime it so it contains the first line of the thing I'm looking for). And if that is what you want, put `popFront` in the loop before you exit. You can't "unpopFront" something, so this provides the most flexibility. Yes, this is the solution but not the way how it should be. If the programmer uses the range methods within the foreach-loop then you would expect some bug. There shouldn't be a need to manipulate the range just because I break the foreach-loop. You shouldn't need to in most circumstances. I don't think I've ever needed to do this. And I use foreach on ranges all the time. Granted, I probably would use a while loop to align a range rather than foreach. Java, for example just uses next() and hasNext(). You can't run into a bug here because one method must move the cursor. This gives a giant clue as to the problem -- you aren't used to this. Java's iterator interface is different than D's. It consumes the element as you fetch it, instead of acting like a pointer to a current element. Once it gives you the element, it's done with it. D's ranges are closer to a C++ iterator pair (which is modeled after a pair of pointers). PHP has a rewind() method. So any foreach() would reset the range or could clean up before next use of it. 
I'm surprised you bring PHP as an example, as it appears their foreach interface works EXACTLY as D does: ```php $arriter = new ArrayIterator(array(1, 2, 3, 4)); foreach($arriter as $val) { if ($val == 2) break; } print($arriter->current()); // 2 ``` But D just lets your range in an inconsistent state between an iteration cycle. This feels just wrong. The next foreach() would not continue with popFront() but with empty() again - because it even relies on it that a range should be called in a given order. As there is no rewind or exit-method, this order should be maintained by foreach-exit too, preparing for next use. That's it. You don't see a bug here? I believe the bug is in your expectations. While Java-like iteration would be a possible API D could have chosen, it's not what D chose. -Steve
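For anyone who wants to try the `EagerPopfrontRange` wrapper from this post, here is a self-contained sketch. The `Countdown` class and the helpers `sourceFrontAfterBreak` and `collect` are hypothetical additions for the demo. One assumption worth stating: as far as I can tell, the wrapper stores its source by value, so the "original range ends up one element further" effect is only observable when the source has reference semantics, hence the class:

```d
import std.range.primitives : ElementType;
import std.stdio;

/// Hypothetical reference-type input range counting n, n-1, ..., 1.
final class Countdown
{
    private int n;
    this(int n) { this.n = n; }
    @property bool empty() const { return n == 0; }
    @property int front() const { return n; }
    void popFront() { --n; }
}

/// Wrapper from the post: fetches each element and pops the source right away.
struct EagerPopfrontRange(R)
{
    R source;
    ElementType!R front;
    bool empty;

    void popFront()
    {
        if (source.empty)
            empty = true;
        else
        {
            front = source.front;
            source.popFront();
        }
    }
}

auto epf(R)(R inputRange)
{
    auto result = EagerPopfrontRange!R(inputRange);
    result.popFront(); // eager: prime the first element
    return result;
}

/// Breaks at `stopAt` and reports what the underlying range offers next.
int sourceFrontAfterBreak(int start, int stopAt)
{
    auto r = new Countdown(start);
    foreach (v; r.epf)
        if (v == stopAt) break;
    return r.front;
}

/// Collects every element, to check that the wrapper drops nothing.
int[] collect(int start)
{
    int[] result;
    foreach (v; (new Countdown(start)).epf)
        result ~= v;
    return result;
}

void main()
{
    writeln(sourceFrontAfterBreak(5, 3)); // 2: the element after the one processed
    writeln(collect(3));                  // [3, 2, 1]
}
```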
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 19:06:44 UTC, Alexandru Ermicioi wrote: On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote: [...] Actually the range contracts don't mention that it needs to be a by value type. It can also be a reference type, i.e. a class. [...] True for any forward range and above, not true for input ranges. The problem with them is that some of them are structs, and even if they are not forward ranges they do have this behavior due to implicit copy on assignment, which can potentially make the code confusing. [...] If we follow the definition of ranges, they must not be copy-able at all. The only way to copy/save, would be to have .save method and call that method. This again is not being properly followed by even phobos implementations. Note, that a better approach would be to replace .save in definition of forward range with a copy constructor, then all non-compliant ranges would become suddenly compliant, while those that have .save method should be refactored to a copy constructor version. [...] You should add .save on assignment if range is a forward range, or just remove the assignment if it is not. Best regards, Alexandru. Just out of curiosity, if a range implementation uses malloc in save, is it only possible to free the memory with the dtor? I worry about that especially when using those nogc range implementations with standard library. I don't have a list of the functions calling save in phobos. Is a save function only meaningful for GC ranges?
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote: A range should be a struct always and thus its state is copied when the foreach loop is created. Actually the range contracts don't mention that it needs to be a by value type. It can also be a reference type, i.e. a class. Which means the state resets every time the loop is initiated. True for any forward range and above, not true for input ranges. The problem with them is that some of them are structs, and even if they are not forward ranges they do have this behavior due to implicit copy on assignment, which can potentially make the code confusing. If your range uses some internal state that isn't able to be copied then or your ranges are not structs then your ranges are inherently incorrect. If we follow the definition of ranges, they must not be copy-able at all. The only way to copy/save, would be to have .save method and call that method. This again is not being properly followed by even phobos implementations. Note, that a better approach would be to replace .save in definition of forward range with a copy constructor, then all non-compliant ranges would become suddenly compliant, while those that have .save method should be refactored to a copy constructor version. This is what a foreach loop on a range actually compiles to: ```d for (auto copy = range; !copy.empty; copy.popFront()) { ... } ``` You should add .save on assignment if range is a forward range, or just remove the assignment if it is not. Best regards, Alexandru.
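The lowering shown above can be checked with a small sketch. `Counter`, `frontAfterForeach` and `frontAfterManualFor` are hypothetical names, not Phobos API. With a struct range whose state is stored by value, `foreach` works on a copy and the original stays where it was, whereas a hand-written `for` over the original variable advances it:

```d
import std.stdio;

/// Hypothetical value-type range over the integers cur, cur+1, ..., end-1.
struct Counter
{
    int cur, end;
    @property bool empty() const { return cur >= end; }
    @property int front() const { return cur; }
    void popFront() { ++cur; }
}

/// foreach iterates a copy of `c`, so `c` itself is not advanced.
int frontAfterForeach(Counter c, int stopAt)
{
    foreach (v; c)
        if (v == stopAt) break;
    return c.front;
}

/// The same loop written by hand on `c` itself, which does advance it.
int frontAfterManualFor(Counter c, int stopAt)
{
    for (; !c.empty; c.popFront())
        if (c.front == stopAt) break;
    return c.front;
}

void main()
{
    writeln(frontAfterForeach(Counter(0, 5), 2));   // 0
    writeln(frontAfterManualFor(Counter(0, 5), 2)); // 2
}
```

This only holds for value-type state; a range that keeps its state behind a pointer or class reference shares it with the copy that `foreach` makes.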
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 08:36:18 UTC, frame wrote: How do you handle that issue? Are your ranges designed to have this bug or do you implement opApply() always? This is expected behavior imho. I think what you need is a forward range, not input range. By the contract of input range, it is a consumable object, hence once used in a foreach it can't be used anymore. It is similar to an iterator or a stream object in java. Forward range exposes also capability to create save points, which is actually used by foreach to do, what it is done in java by iterable interface for example. Then there is bidirectional and random access ranges that offer even more capabilities. Per knowledge I have opApply is from pre range era, and is kinda left as an option to provide easy foreach integration. In this case you can think of objects having opApply as forward ranges, though just for foreach constructs only. Regards, Alexandru.
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 16:45:27 UTC, H. S. Teoh wrote:
> In some cases, you *want* to retain the same element between loops, e.g., if you're iterating over elements of some category and stop when you encounter something that belongs to the next category -- you wouldn't want to consume that element, but leave it to the next loop to consume it. So it's not a good idea to have break call .popFront automatically. Similarly, sometimes you might want to reuse an element (e.g., the loop body detects a condition that warrants retrying).

I'm only talking about foreach() uses, and that you shouldn't need to mix them with manual method calls. Such iterations are another topic.
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 13:02:38 UTC, Steven Schveighoffer wrote: On 8/24/21 4:36 AM, frame wrote: Consider a simple input range that can be iterated with empty(), front() and popFront(). That is comfortable to use with foreach() but what if the foreach loop will be cancelled? If a range isn't depleted yet and continued it will supply the same data twice on front() in the next use of foreach(). For some reason, foreach() does not call popFront() on a break or continue statement. continue calls `popFront`. break does not. Of course by the next iteration, you are right. You can call `popFront` if you need to after the loop, or just before the break. I have to say, the term "useless" does not even come close to describing ranges using foreach in my experience. I disagree, because foreach() is a language construct and therefore it should behave in a logic way. The methods are fine in ranges or if something is done manually. But in case of foreach() it's just unexpected. It becomes useless for foreach() because you can't rely on them if other code breaks the loop and you need to use that range, like in my case. But also for ranges - there is no need for a popFront() if it is not called in a logic way. Then even empty() could fetch next data if needed. It only makes sense if language system code uses it in a strictly order and ensures that this order is always assured. It's not a bug. So there is no need to "handle" it. The pattern of using a for(each) loop to align certain things occurs all the time in code. Imagine a loop that is looking for a certain line in a file, and breaks when the line is there. Would you really want the compiler to unhelpfully throw away that line for you? I don't get this point. If it breaks from the loop then it changes the scope anyway, so my data should be already processed or copied. What is thrown away here? And if that is what you want, put `popFront` in the loop before you exit. 
You can't "unpopFront" something, so this provides the most flexibility. -Steve Yes, this is the solution but not the way how it should be. If the programmer uses the range methods within the foreach-loop then you would expect some bug. There shouldn't be a need to manipulate the range just because I break the foreach-loop. Java, for example just uses next() and hasNext(). You can't run into a bug here because one method must move the cursor. PHP has a rewind() method. So any foreach() would reset the range or could clean up before next use of it. But D just lets your range in an inconsistent state between an iteration cycle. This feels just wrong. The next foreach() would not continue with popFront() but with empty() again - because it even relies on it that a range should be called in a given order. As there is no rewind or exit-method, this order should be maintained by foreach-exit too, preparing for next use. That's it. You don't see a bug here?
Re: foreach() behavior on ranges
On Tue, Aug 24, 2021 at 08:36:18AM +, frame via Digitalmars-d-learn wrote:
> Consider a simple input range that can be iterated with empty(),
> front() and popFront(). That is comfortable to use with foreach() but
> what if the foreach loop will be cancelled? If a range isn't depleted
> yet and continued it will supply the same data twice on front() in the
> next use of foreach().

Generally, if you need precise control over range state between multiple loops, you really should think about using a while loop instead of a for loop, and call .popFront where it's needed.

> For some reason, foreach() does not call popFront() on a break or continue
> statement. There is no way to detect it except the range itself tracks its
> status and does an implicit popFront() if needed - but then this whole
> interface is some kind of useless.

In some cases, you *want* to retain the same element between loops, e.g., if you're iterating over elements of some category and stop when you encounter something that belongs to the next category -- you wouldn't want to consume that element, but leave it to the next loop to consume it. So it's not a good idea to have break call .popFront automatically. Similarly, sometimes you might want to reuse an element (e.g., the loop body detects a condition that warrants retrying).

Basically, once you need anything more than a single sequential iteration over a range, it's better to be explicit about what exactly you want, rather than depend on implicit semantics, which may lead to surprising results.

    while (!range.empty) {
        doSomething(range.front);
        if (someCondition) {
            range.popFront;
            break;
        } else if (someOtherCondition) {
            // Don't consume current element
            break;
        } else if (skipElement) {
            range.popFront;
            continue;
        } else if (retryElement) {
            continue;
        }
        range.popFront; // normal iteration
    }

T

--
"No, John. I want formats that are actually useful, rather than over-featured megaliths that address all questions by piling on ridiculous internal links in forms which are hideously over-complex." -- Simon St. Laurent on xml-dev
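The "stop at the first element of the next category" case can be sketched with explicit range primitives. This is an illustration under assumptions, not code from the thread: `splitRuns` is a hypothetical helper, and parity stands in for "category".

```d
import std.range.primitives : empty, front, popFront;
import std.stdio : writeln;

// Splits the input into consecutive runs that share the same parity.
// The element that ends a run is not consumed by the inner loop, so the
// next pass of the outer loop starts with it.
int[][] splitRuns(int[] input)
{
    int[][] runs;
    while (!input.empty)
    {
        const parity = input.front % 2;
        int[] run;
        while (!input.empty && input.front % 2 == parity)
        {
            run ~= input.front;
            input.popFront(); // consume only elements of this run
        }
        runs ~= run;
    }
    return runs;
}

void main()
{
    writeln(splitRuns([1, 3, 2, 4, 5]));
}
```

Because every popFront is written out, there is no doubt about which loop consumed which element.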
Re: foreach() behavior on ranges
On 8/24/21 4:36 AM, frame wrote: Consider a simple input range that can be iterated with empty(), front() and popFront(). That is comfortable to use with foreach() but what if the foreach loop will be cancelled? If a range isn't depleted yet and continued it will supply the same data twice on front() in the next use of foreach(). For some reason, foreach() does not call popFront() on a break or continue statement. continue calls `popFront`. break does not. There is no way to detect it except the range itself tracks its status and does an implicit popFront() if needed - but then this whole interface is some kind of useless. You can call `popFront` if you need to after the loop, or just before the break. I have to say, the term "useless" does not even come close to describing ranges using foreach in my experience. There is opApply() on the other hand that is designed for foreach() and informs via non-0-result if the loop is cancelled - but this means that every range must implement it if the range should work in foreach() correctly? `opApply` has to return different values because it needs you to pass through its instructions to the compiler-generated code. The compiler has written the delegate to return the message, and so you need to pass through that information. The non-zero result is significant, not just non-zero. For instance, if you end with a `break somelabel;` statement, it has to know which label to go to. The correct behavior for `opApply` should be, if the delegate returns non-zero, return that value immediately. It should not be doing anything else. Would you be happy with a `break somelabel;` actually triggering output? What if it just continued the loop instead? You don't get to decide what happens at that point, you are acting as the compiler. This is very inconsistent. 
Either foreach() should deny usage of ranges that have no opApply() method or there should be a reset() or cancel() method in the interfaces that may be called by foreach() if they are implemented. How do you handle that issue? Are your ranges designed to have this bug or do you implement opApply() always? It's not a bug. So there is no need to "handle" it. The pattern of using a for(each) loop to align certain things occurs all the time in code. Imagine a loop that is looking for a certain line in a file, and breaks when the line is there. Would you really want the compiler to unhelpfully throw away that line for you? And if that is what you want, put `popFront` in the loop before you exit. You can't "unpopFront" something, so this provides the most flexibility. -Steve
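The `opApply` rule described above (return the delegate's non-zero result immediately and do nothing else) can be sketched as follows. `Evens` and `firstAbove` are hypothetical names chosen for this illustration.

```d
import std.stdio : writeln;

// Hypothetical aggregate that yields 0, 2, 4, ... below `limit`.
struct Evens
{
    int limit;

    int opApply(scope int delegate(int) dg)
    {
        for (int i = 0; i < limit; i += 2)
        {
            // A non-zero result means the loop body executed break,
            // return or goto. Hand the value straight back so the
            // compiler-generated code can act on it.
            if (auto result = dg(i))
                return result;
        }
        return 0; // the loop ran to completion
    }
}

// The early return inside the foreach body travels through opApply.
int firstAbove(Evens e, int threshold)
{
    foreach (x; e)
    {
        if (x > threshold)
            return x;
    }
    return -1;
}

void main()
{
    writeln(firstAbove(Evens(10), 3));
    writeln(firstAbove(Evens(4), 5));
}
```

The value returned by the delegate is opaque to `opApply`; the sketch only tests it against zero and passes it on.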
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 09:26:20 UTC, jfondren wrote: I think you strayed from the beaten path, in a second way, as soon as your range's lifetime escaped a single expression, to be possibly used in two foreach loops. With ranges, as you do more unusual things, you're already encouraged to use a more advanced range. And ranges already have caveats for surprising behavior, like map/filter interactions that redundantly execute code. So I see this as a documentation problem. The current behavior of 'if you break then the next foreach gets what you broke on' is probably a desirable behavior for some uses: Yes, I have a special case where a delegate jumps back to the range because something must be buffered before it can be delivered. ```d import std; class MyIntRange { int[] _elements; size_t _offset; this(int[] elems) { _elements = elems; } bool empty() { return !_elements || _offset >= _elements.length; } int front() { return _elements[_offset]; } void popFront() { _offset++; } } void main() { auto ns = new MyIntRange([0, 1, 1, 2, 3, 4, 4, 4, 5]); // calls writeln() as many times as there are numbers: while (!ns.empty) { foreach (odd; ns) { if (odd % 2 == 0) break; writeln("odd: ", odd); } foreach (even; ns) { if (even % 2 != 0) break; writeln("even: ", even); } } } ``` That is just weird. It's not logical and a source of bugs. I mean, we should use foreach() to avoid loop-bugs. Then it's a desired behavior to rely on that?
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 09:15:23 UTC, bauss wrote:
> A range should be a struct always and thus its state is copied when the foreach loop is created.

This does not conform to the aggregate expression described in the manual, where a class object is also allowed.

> Which means the state resets every time the loop is initiated.

Yes, it should reset - thus foreach() also needs to handle that correctly.
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 08:36:18 UTC, frame wrote:
> Consider a simple input range that can be iterated with empty(), front() and popFront(). That is comfortable to use with foreach() but what if the foreach loop will be cancelled? If a range isn't depleted yet and continued it will supply the same data twice on front() in the next use of foreach().

I think you strayed from the beaten path, in a second way, as soon as your range's lifetime escaped a single expression, to be possibly used in two foreach loops. With ranges, as you do more unusual things, you're already encouraged to use a more advanced range. And ranges already have caveats for surprising behavior, like map/filter interactions that redundantly execute code. So I see this as a documentation problem.

The current behavior of 'if you break then the next foreach gets what you broke on' is probably a desirable behavior for some uses:

```d
import std;

class MyIntRange {
    int[] _elements;
    size_t _offset;

    this(int[] elems) { _elements = elems; }

    bool empty() { return !_elements || _offset >= _elements.length; }
    int front() { return _elements[_offset]; }
    void popFront() { _offset++; }
}

void main() {
    auto ns = new MyIntRange([0, 1, 1, 2, 3, 4, 4, 4, 5]);
    // calls writeln() as many times as there are numbers:
    while (!ns.empty) {
        foreach (odd; ns) {
            if (odd % 2 == 0) break;
            writeln("odd: ", odd);
        }
        foreach (even; ns) {
            if (even % 2 != 0) break;
            writeln("even: ", even);
        }
    }
}
```
Re: foreach() behavior on ranges
On Tuesday, 24 August 2021 at 08:36:18 UTC, frame wrote:
> Consider a simple input range that can be iterated with empty(), front() and popFront(). That is comfortable to use with foreach() but what if the foreach loop will be cancelled? If a range isn't depleted yet and continued it will supply the same data twice on front() in the next use of foreach(). For some reason, foreach() does not call popFront() on a break or continue statement. There is no way to detect it except the range itself tracks its status and does an implicit popFront() if needed - but then this whole interface is some kind of useless. There is opApply() on the other hand that is designed for foreach() and informs via non-0-result if the loop is cancelled - but this means that every range must implement it if the range should work in foreach() correctly? This is very inconsistent. Either foreach() should deny usage of ranges that have no opApply() method or there should be a reset() or cancel() method in the interfaces that may be called by foreach() if they are implemented. How do you handle that issue? Are your ranges designed to have this bug or do you implement opApply() always?

A range should always be a struct, and thus its state is copied when the foreach loop is created. This means the state resets every time the loop is initiated. If your range uses internal state that can't be copied, or your ranges are not structs, then your ranges are inherently incorrect.

This is what a foreach loop on a range actually compiles to:

```d
for (auto copy = range; !copy.empty; copy.popFront()) { ... }
```

This is easily evident in this example: https://run.dlang.io/is/YFuWHn

Which prints:

1 2 1 2 3 4 5

Unless I'm misunderstanding your concern?
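The linked example is not reproduced here, but the copy behavior it relies on can be sketched with a small value-type range. `Counter` and `collectUpTo` are hypothetical names, and the sketch assumes a plain struct whose whole state lives in its fields.

```d
import std.stdio : writeln;

// Hypothetical struct range counting from 1 to 5.
struct Counter
{
    int current = 1;
    int last = 5;

    bool empty() const { return current > last; }
    int front() const { return current; }
    void popFront() { ++current; }
}

// Iterates with foreach and breaks once `stopAt` has been seen.
// The range is taken by ref, so the only copy made is the one that
// foreach creates for itself.
int[] collectUpTo(ref Counter r, int stopAt)
{
    int[] seen;
    foreach (x; r)
    {
        seen ~= x;
        if (x == stopAt)
            break;
    }
    return seen;
}

void main()
{
    Counter c;
    writeln(collectUpTo(c, 2));  // stops early
    writeln(c.current);          // the original was never advanced
    writeln(collectUpTo(c, 99)); // starts over from the beginning
}
```

This only holds for value types; a class range, or a struct that shares state through a pointer, would not start over.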
Re: foreach: How start a foreach count with specific number?
On Wednesday, 2 June 2021 at 15:49:36 UTC, Marcone wrote:
> But I don't want it starts with 0, but other number. How can I do it?

Easiest way is to just add the starting number to the index:

    size_t start = 5;
    foreach (n, i; glob("*")) {
        print("{} DATA {}".format(start + n, i));
    }

You can also use [`std.range.enumerate`][1], which takes the number to start with as an optional argument:

    foreach (n, i; glob("*").enumerate(5)) {
        print("{} DATA {}".format(n, i));
    }

[1]: https://phobos.dpldocs.info/std.range.enumerate.html
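A self-contained version of the `enumerate` approach might look like the sketch below. The `label` helper and the sample file names are made up for illustration, and `std.format.format` stands in for the poster's own `print`/`format` helpers, which are not shown in the thread.

```d
import std.format : format;
import std.range : enumerate;
import std.stdio : writeln;

// Numbers each name, starting the count at `start` instead of 0.
string[] label(string[] names, size_t start)
{
    string[] lines;
    foreach (n, name; names.enumerate(start))
        lines ~= format("%s DATA %s", n, name);
    return lines;
}

void main()
{
    foreach (line; label(["a.txt", "b.txt"], 5))
        writeln(line);
}
```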
Re: foreach: How start a foreach count with specific number?
On 6/2/21 8:49 AM, Marcone wrote:
> But I don't want it starts with 0, but other number. How can I do it?

It is not configurable but is trivial by adding a base value:

    import std.stdio;

    enum base = 17;

    void main() {
        auto arr = [ "hello", "world" ];
        foreach (i, str; arr) {
            const count = base + i;
            writefln!"%s: %s"(count, str);
        }
    }

Ali
Re: foreach, RefCounted and non-copyable range
On Monday, 18 January 2021 at 18:57:04 UTC, vitamin wrote: You need something like RefCountedRange with methods popFront, front, empty. Thanks! refRange from std.range does the trick, indeed.
Re: foreach, RefCounted and non-copyable range
On Sunday, 17 January 2021 at 12:15:00 UTC, Fynn Schröder wrote:
> I'm puzzled why RefCounted and foreach do not work well together, i.e.:

```
auto range = refCounted(nonCopyableRange); // ok
foreach(e; range) // Error: struct is not copyable because it is annotated with @disable
    // do something
```

> See https://run.dlang.io/is/u271nK for a full example where I also compared the foreach compiler rewrite and the manual rewrite of foreach to a simple for loop. Somehow foreach makes a copy of the internal payload of RefCounted (run the example and look at the address of the payload/range). Is this a bug and is there any way around it?

foreach first copies the range and then iterates over the copy. RefCounted is not itself a range, so foreach directly copies the payload of the RefCounted.

    // this code is equivalent to yours
    void notOk() {
        auto r = refCounted(Range());
        writeln("before ", r.front);
        Range tmp_r = r;
        foreach (i; tmp_r)
            writeln("loop ", i);
        writeln("after ", r.front);
        assert(r.i == 3, "r.i != 3");
    }

You need something like RefCountedRange with methods popFront, front, empty.
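The follow-up in this thread reports that `refRange` from `std.range` solved the problem. A minimal sketch of that idea follows, with one loud assumption: the hypothetical `Ticker` struct here is an ordinary copyable range, not the non-copyable range from the original post, so the sketch only demonstrates that foreach advances the original through the pointer wrapper.

```d
import std.range : refRange;
import std.stdio : writeln;

// Hypothetical struct range yielding 0 .. 4 (copyable, unlike the
// range in the original question).
struct Ticker
{
    int i;

    bool empty() const { return i >= 5; }
    int front() const { return i; }
    void popFront() { ++i; }
}

// Iterates through a RefRange wrapper, so popFront calls made by
// foreach are applied to `t` itself rather than to a copy.
int consumeUntil(ref Ticker t, int stopAt)
{
    int visited;
    foreach (x; refRange(&t))
    {
        if (x == stopAt)
            break;
        ++visited;
    }
    return visited;
}

void main()
{
    Ticker t;
    writeln(consumeUntil(t, 3));
    writeln(t.i); // reflects the progress made inside the loop
}
```

foreach still copies something, but what it copies is the small wrapper holding a pointer, not the range state.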
Re: Foreach output into a multi dimensional associative array.
On Tuesday, 27 October 2020 at 08:00:55 UTC, Imperatorn wrote:
> On Monday, 26 October 2020 at 19:05:04 UTC, Vino wrote:
>> [...]
>
> Some comments:
> 1. You're missing a comma (,) after the first item in your apidata
> 2. You're creating a string[int][string] instead of string[][string] (your expected output)
> 3. Where is i++ coming from?
>
> https://run.dlang.io/is/jfPoeZ

Hi, thank you very much, your suggestion resolved my issue.
Re: Foreach output into a multi dimensional associative array.
On Monday, 26 October 2020 at 19:05:04 UTC, Vino wrote:
> Hi All, Request your help on the below on how to store the output to a multi dimensional associative array.
>
> Code:
>
>     import std.stdio: writeln;
>     import asdf: parseJson;
>     import std.conv: to;
>
>     void main() {
>         string[int][string] aa;
>         string apidata = `{"items": [
>             {"name":"T01","hostname":"test01","pool":"Development"}
>             {"name":"T02","hostname":"test02","pool":"Quality"},
>             {"name":"T03","hostname":"test03","pool":"Production"}
>         ] }`;
>         auto jv = parseJson(apidata);
>         foreach(j; jv["items"].byElement()){
>             aa["Name"] = j["name"].get!string("default");
>             i++;
>         }
>         writeln(aa);
>     }
>
> Expected Output
>
>     aa["Name"] = [T01, T01, T03]
>     aa["Hostname"] = [test01, test02, test03]
>     aa["Pool"] = [Development, Quality, Production]
>
> From, Vino.B

Some comments:

1. You're missing a comma (,) after the first item in your apidata
2. You're creating a string[int][string] instead of string[][string] (your expected output)
3. Where is i++ coming from?

https://run.dlang.io/is/jfPoeZ
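The linked fix is not reproduced here. As a rough sketch of the three comments applied together, the following uses Phobos' `std.json` instead of the `asdf` package from the question (a swapped-in library, chosen so the example has no external dependency); `collectColumns` is a hypothetical helper name.

```d
import std.json : parseJSON;
import std.stdio : writeln;

// Builds a string[][string]: one array of values per column name.
string[][string] collectColumns(string data)
{
    string[][string] columns;
    auto root = parseJSON(data);
    foreach (item; root["items"].array)
    {
        // ~= creates the entry on first use, so no counter is needed.
        columns["Name"] ~= item["name"].str;
        columns["Hostname"] ~= item["hostname"].str;
        columns["Pool"] ~= item["pool"].str;
    }
    return columns;
}

void main()
{
    // Note the comma after every item except the last one.
    auto data = `{"items": [
        {"name":"T01","hostname":"test01","pool":"Development"},
        {"name":"T02","hostname":"test02","pool":"Quality"},
        {"name":"T03","hostname":"test03","pool":"Production"}
    ]}`;
    writeln(collectColumns(data));
}
```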
Re: foreach iterator with closure
To keep this reply brief, I'll just summarize: Lots of great takeaways from both of your posts, and a handful of topics you mentioned that I need to dig into further now. This is great (I too like D :) I very much appreciate the extra insight into how things work and why certain design decisions were made: for me, this is essential for gaining fluency in a language. Thanks again for all your help! Denis
Re: foreach iterator with closure
On 6/28/20 9:07 AM, Denis wrote:

> * foreach is the actual iterator,

Yes. foreach is "lowered" to the following equivalent:

    for ( ; !range.empty; range.popFront()) {
        // Use range.front here
    }

A struct can support foreach iteration through its opApply() member function as well. opApply() takes the body of the foreach as a delegate. Because it's a function call, it can take full advantage of the function call stack. This may help with e.g. writing recursive iteration algorithms.

http://ddili.org/ders/d.en/foreach_opapply.html#ix_foreach_opapply.opApply

> the instantiation of a struct is the
> range.

Yes.

> * When a constructor is not used, the arguments in the call to
> instantiate the range (in this case, `hello` in letters(`hello`)) are
> mapped sequentially to the member variables in the struct definition
> (i.e. to letters.str).

Yes, that is a very practical struct feature. I write my structs with as little as needed and provide a constructor only when it is necessary as in your case.

> * When a constructor is used, the member variables in the struct
> definition are in essence private.

Not entirely true. You can still make them public if you want.

http://ddili.org/ders/d.en/encapsulation.html

> The arguments in the call to
> instantiate the range are now mapped directly to the parameters in the
> definition of the "this" function.

Yes.

> * The syntax and conventions for constructors is difficult and
> non-intuitive for anyone who hasn't learned Java (or a derivative).

C++ uses the name of the class as the constructor:

    // C++ code
    struct S {
        S();     // <-- Constructor
        S(int);  // <-- Another one
    };

The problem with that syntax is having to rename more than one thing when the name of the struct changes, e.g. to Q:

    struct Q {
        Q();
        Q(int);
    };

And usually in the implementation:

    Q::Q() {}
    Q::Q(int) {}

D's choice of 'this' is productive.

> The
> linked document provides a simplified explanation for the "this"
> keyword, which is helpful for the first read:
> https://docs.oracle.com/javase/tutorial/java/javaOO/thiskey.html.

I like searching for keywords in my index. The "this, constructor" entry here links to the constructor syntax:

http://ddili.org/ders/d.en/ix.html

> * In some respects, the Java syntax is not very D-like. (For example, it
> breaks the well-established convention of "Do not use the same name to
> mean two different things".)

Yes but it competes with another goal: Change as little code as possible when one thing needs to be changed. This is not only practical but helps with correctness.

> However, it does need to be learned,
> because it is common in D source code.

I like D. :p

> Here is the complete revised code for the example (in condensed form):
>
>     import std.stdio;
>
>     struct letters {
>
>       string str;
>       int pos = 1;            // Assign here or in this())
>
>       this(string param1) {   // cf. shadow str
>         str = param1;         // cf. this.str = param1 / this.str = str
>         writeln(`BEGIN`); }
>
>       char front() { return str[pos]; }
>       void popFront() { pos ++; }
>       bool empty() { return pos == str.length; }
>
>       ~this() { writeln("\nEND"); }}
>
>     void main() {
>       foreach (letter; letters(`hello`)) {
>         write(letter, ' '); }}
>
> At this point, I do have one followup question:
>
> Why is the shadow str + "this.str = str" the more widely used syntax in
> D, when the syntax in the code above is unambiguous?

Because otherwise one needs to come up with names like "param7", "str_", "_str", "s", etc. I like and follow D's standard here.

> One possible reason that occurred to me is that "str = param1" might
> require additional GC, because they are different names.

Not at all, because there is no memory allocation at all. strings are implemented as the equivalent of the following struct:

    struct __D_native_string {
        size_t length_;
        char * ptr;
        // ...
    }

So, the "str = param1" assignment is nothing but two 64-bit data transfers, which can easily be optimized away by the compiler in many cases.

> But I wouldn't
> think it'd make any difference to the compiler.

Yes. :)

> Denis

Ali
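The point that assigning a string copies only a length and a pointer can be checked with a small sketch. `sharesMemory` is a hypothetical helper written for this illustration.

```d
import std.stdio : writeln;

// A string is a slice: a length plus a pointer to immutable chars.
static assert(string.sizeof == size_t.sizeof + (char*).sizeof);

// Returns true if a plain assignment left both variables pointing at
// the same characters, i.e. no character data was copied.
bool sharesMemory(string original)
{
    string copy = original; // copies only the length and the pointer
    return copy.ptr is original.ptr && copy.length == original.length;
}

void main()
{
    writeln(sharesMemory("hello"));
}
```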
Re: foreach iterator with closure
Many thanks: your post has helped me get past the initial stumbling blocks I was struggling with. I do have a followup question. First, here are my conclusions up to this point, based on your post above, some additional experimentation, and further research (for future reference, and for any other readers). * foreach is the actual iterator, the instantiation of a struct is the range. * When a constructor is not used, the arguments in the call to instantiate the range (in this case, `hello` in letters(`hello`)) are mapped sequentially to the member variables in the struct definition (i.e. to letters.str). * When a constructor is used, the member variables in the struct definition are in essence private. The arguments in the call to instantiate the range are now mapped directly to the parameters in the definition of the "this" function. * The syntax and conventions for constructors is difficult and non-intuitive for anyone who hasn't learned Java (or a derivative). The linked document provides a simplified explanation for the "this" keyword, which is helpful for the first read: https://docs.oracle.com/javase/tutorial/java/javaOO/thiskey.html. * In some respects, the Java syntax is not very D-like. (For example, it breaks the well-established convention of "Do not use the same name to mean two different things".) However, it does need to be learned, because it is common in D source code. Here is the complete revised code for the example (in condensed form): import std.stdio; struct letters { string str; int pos = 1;// Assign here or in this()) this(string param1) { // cf. shadow str str = param1; // cf. 
this.str = param1 / this.str = str writeln(`BEGIN`); } char front() { return str[pos]; } void popFront() { pos ++; } bool empty() { return pos == str.length; } ~this() { writeln("\nEND"); }} void main() { foreach (letter; letters(`hello`)) { write(letter, ' '); }} At this point, I do have one followup question: Why is the shadow str + "this.str = str" the more widely used syntax in D, when the syntax in the code above is unambiguous? One possible reason that occurred to me is that "str = param1" might require additional GC, because they are different names. But I wouldn't think it'd make any difference to the compiler. Denis
Re: foreach iterator with closure
On 6/27/20 8:19 PM, Denis wrote:

> Is it possible to write an iterator

It is arguable whether D's ranges are iterators but if nouns are useful, we call them ranges. :) (Iterators can be written in D as well and then it would really be confusing.)

>     struct letters {
>       string str;
>       int pos = 0;
>       char front() { return str[pos]; }
>       void popFront() { pos ++; }
>       bool empty() {
>         if (pos == 0) writeln(`BEGIN`);
>         else if (pos == str.length) writeln("\nEND");
>         return pos == str.length; }}
>
>     void main() {
>       foreach (letter; letters(`hello`)) {
>         write(letter, ' '); }
>       writeln(); }
>
> The obvious problems with this code include:
>
> (1) The user can pass a second argument, which will set the initial
> value of pos.

That problem can be solved by a constructor that takes a single string. Your BEGIN code would normally go there as well. And END goes into the destructor:

    struct letters {
        this(string str) {
            this.str = str;
            this.pos = 0;  // Redundant
            writeln(`BEGIN`);
        }

        ~this() {
            writeln("\nEND");
        }

        // [...]
    }

Note: You may want to either disallow copying of your type or write a copy constructor that does the right thing:

https://dlang.org/spec/struct.html#struct-copy-constructor

However, it's common to construct a range object by a function. The actual range type can be kept as an implementation detail:

    struct Letters {  // Note capital L
        // ...
    }

    auto letters(string str) {
        // ...
        return Letters(str);
    }

struct Letters can be a private type of its module or even a nested struct inside letters(), in which case it's called a "Voldemort type".

Ali
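The "Voldemort type" variant mentioned at the end can be sketched as below. This is an illustration only: the BEGIN/END printing from the thread is left out to keep the range minimal, and `spaced` is a hypothetical helper added so the result can be inspected.

```d
import std.stdio : writeln;

// The range type is declared inside the function, so callers can use
// the returned object but cannot name its type.
auto letters(string str)
{
    static struct Letters
    {
        string str;
        size_t pos;

        bool empty() const { return pos == str.length; }
        char front() const { return str[pos]; }
        void popFront() { ++pos; }
    }

    return Letters(str); // pos keeps its default value of 0
}

// Joins the letters with a trailing space after each one.
string spaced(string word)
{
    string result;
    foreach (letter; letters(word))
    {
        result ~= letter;
        result ~= ' ';
    }
    return result;
}

void main()
{
    writeln(spaced("hello"));
}
```

Since the only way to obtain a `Letters` is through `letters(str)`, callers have no way to pass a second argument for `pos`.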
Re: foreach on a tuple using aliases
On 06.08.2018 14:37, Steven Schveighoffer wrote: On 8/5/18 11:40 AM, Timon Gehr wrote: On 05.08.2018 16:07, Steven Schveighoffer wrote: So is this a bug? Is it expected? It's a bug. The two copies of 'item' are not supposed to be the same symbol. (Different types -> different symbols.) Yep. I even found it has nothing to do with foreach on a tuple: https://run.dlang.io/is/vxQlIi I wonder though, it shouldn't really be a different type that triggers it, right? It shouldn't.
Re: foreach on a tuple using aliases
On 8/5/18 11:40 AM, Timon Gehr wrote: On 05.08.2018 16:07, Steven Schveighoffer wrote: So is this a bug? Is it expected? It's a bug. The two copies of 'item' are not supposed to be the same symbol. (Different types -> different symbols.) Yep. I even found it has nothing to do with foreach on a tuple: https://run.dlang.io/is/vxQlIi I wonder though, it shouldn't really be a different type that triggers it, right? I mean 2 separate aliases to different variables that are the same type, I would hope would re-instantiate. Otherwise something like .offsetof would be wrong. Is it too difficult to fix? ... Unlikely. https://issues.dlang.org/show_bug.cgi?id=19145 -Steve
Re: foreach on a tuple using aliases
On 8/5/18 10:48 AM, Alex wrote: void main() { Foo foo; assert(isFoo!foo); static struct X { int i; Foo foo; } X x; static foreach(i, item; typeof(x).tupleof) static if(is(typeof(item) == Foo)) // line A static assert(isFoo!item); // line B else static assert(!isFoo!item); } I did try static foreach, but it doesn't work. The difference here is you are using typeof(x).tupleof, whereas I want x.tupleof. Note that in my real code, I do more than just the static assert, I want to use item as a reference to the real field in x. -Steve
Re: foreach on a tuple using aliases
On 05.08.2018 16:07, Steven Schveighoffer wrote: I have found something that looks like a bug to me, but also looks like it could simply be a limitation of the foreach construct. Consider this code: struct Foo {} enum isFoo(alias x) = is(typeof(x) == Foo); void main() { Foo foo; assert(isFoo!foo); static struct X { int i; Foo foo; } X x; foreach(i, ref item; x.tupleof) static if(is(typeof(item) == Foo)) // line A static assert(isFoo!item); // line B else static assert(!isFoo!item); } Consider just the two lines A and B. If you saw those lines anywhere, given the isFoo definition, you would expect the assert to pass. But in this case, it fails. What is happening is that the first time through the loop, we are considering x.i. This is an int, and not a Foo, so it assigns false to the template isFoo!item. The second time through the loop on x.foo, the compiler decides that it ALREADY FIGURED OUT isFoo!item, and so it just substitutes false, even though the item in question is a different item. So is this a bug? Is it expected? It's a bug. The two copies of 'item' are not supposed to be the same symbol. (Different types -> different symbols.) Is it too difficult to fix? ... Unlikely. The workaround of course is to use x.tupleof[i] when instantiating isFoo. But it's a bit ugly. I can also see other issues cropping up if you use `item` for other meta things. -Steve
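Besides the `x.tupleof[i]` workaround named above, another way to sidestep the alias caching is to pass the field's type instead of the symbol. This is a different technique from the one in the thread, sketched under that assumption; `isFooType`, `countFooFields` and the extra `other` field are hypothetical.

```d
import std.stdio : writeln;

struct Foo {}

// Takes a type rather than an alias to a symbol, so each distinct
// field type produces its own template instance.
enum isFooType(T) = is(T == Foo);

struct X
{
    int i;
    Foo foo;
    Foo other;
}

// Counts the fields of type Foo while still iterating the fields of
// the actual instance by reference.
size_t countFooFields(T)(ref T value)
{
    size_t count;
    foreach (i, ref item; value.tupleof)
    {
        static if (isFooType!(typeof(item)))
            ++count;
    }
    return count;
}

void main()
{
    X x;
    writeln(countFooFields(x));
}
```

Unlike `static foreach` over `typeof(x).tupleof`, `item` here still refers to the real field of the instance, which is what the original poster wanted.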
Re: foreach on a tuple using aliases
On Sunday, 5 August 2018 at 14:07:30 UTC, Steven Schveighoffer wrote: I have found something that looks like a bug to me, but also looks like it could simply be a limitation of the foreach construct. Consider this code: struct Foo {} enum isFoo(alias x) = is(typeof(x) == Foo); void main() { Foo foo; assert(isFoo!foo); static struct X { int i; Foo foo; } X x; foreach(i, ref item; x.tupleof) static if(is(typeof(item) == Foo)) // line A static assert(isFoo!item); // line B else static assert(!isFoo!item); } Consider just the two lines A and B. If you saw those lines anywhere, given the isFoo definition, you would expect the assert to pass. But in this case, it fails. What is happening is that the first time through the loop, we are considering x.i. This is an int, and not a Foo, so it assigns false to the template isFoo!item. The second time through the loop on x.foo, the compiler decides that it ALREADY FIGURED OUT isFoo!item, and so it just substitutes false, even though the item in question is a different item. So is this a bug? Is it expected? Is it too difficult to fix? The workaround of course is to use x.tupleof[i] when instantiating isFoo. But it's a bit ugly. I can also see other issues cropping up if you use `item` for other meta things. -Steve Another workaround would be ´´´ void main() { Foo foo; assert(isFoo!foo); static struct X { int i; Foo foo; } X x; static foreach(i, item; typeof(x).tupleof) static if(is(typeof(item) == Foo)) // line A static assert(isFoo!item); // line B else static assert(!isFoo!item); } ´´´ wouldn't it?
Re: foreach / mutating iterator - How to do this?
On 2018-06-25 15:29:23 +, Robert M. Münch said: I have two foreach loops where the inner should change the iterator (append new entries) of the outer. foreach(a, candidates) { foreach(b, a) { if(...) candidates ~= additionalCandidate; } } The foreach docs state that the collection must not change during iteration. So, how to best handle such a situation then? Using a plain for loop? Answering myself: If you implement an opApply using for() or while() etc. with a mutating aggregate, foreach can be used indirectly with mutating aggregates. Works without any problems. -- Robert M. Münch http://www.saphirion.com smarter | better | faster
Re: foreach / mutating iterator - How to do this?
On Monday, June 25, 2018 17:29:23 Robert M. Münch via Digitalmars-d-learn wrote:
> I have two foreach loops where the inner should change the iterator
> (append new entries) of the outer.
>
>     foreach(a, candidates) {
>         foreach(b, a) {
>             if(...) candidates ~= additionalCandidate;
>         }
>     }
>
> The foreach docs state that the collection must not change during
> iteration.
>
> So, how to best handle such a situation then? Using a plain for loop?

Either that or create a separate array containing the elements you're adding and then append that to candidates after the loop has terminated. Or if all you're really trying to do is run an operation on a list of items, and in the process, you get more items that you want to operate on but don't need to keep them around afterwards, you could just wrap the operation in a function and use recursion. e.g.

    foreach(a; candidates) {
        func(a);
    }

    void func(T)(T a) {
        foreach(b; a) {
            if(...) func(additionalCandidate);
        }
    }

But regardless, you can't mutate something while you're iterating over it with foreach, so you're either going to have to manually control the iteration yourself so that you can do it in a way that guarantees that it's safe to add elements while iterating, or you're going to have to adjust what you're doing so that it doesn't need to add to the list of items while iterating over it.

The big issue with foreach is that if what it's iterating over is a range, then it copies it, and if it's not a range, it slices it (or if it defines opApply, that gets used). So,

    foreach(e; range)

gets lowered to

    for(auto __c = range; !__c.empty; __c.popFront()) {
        auto e = __c.front;
    }

which means that range is copied, and it's then unspecified behavior as to what happens if you try to use the range after passing it to foreach (the exact behavior depends on how the range is implemented), meaning that you really shouldn't be passing a range to foreach and then still do anything with it.
If foreach is given a container, then it slices it, e.g.

    foreach(e; container)

gets lowered to

    for(auto __c = container[]; !__c.empty; __c.popFront()) {
        auto e = __c.front;
    }

so it doesn't run into the copying problem, but it's still not a good idea to mutate the container while iterating. What happens when you try to mutate the container while iterating over a range from that container depends on the container, and foreach in general isn't supposed to be able to iterate over something while it's mutated.

Dynamic and associative arrays get different lowerings than generic ranges or containers, but they're also likely to run into problems if you try to mutate them while iterating over them. So, if using a normal for loop instead of foreach fixes your problem, then there you go. Otherwise, rearrange what you're doing so that it doesn't need to add anything to the original list of items in the loop. Either way, trying to mutate what you're iterating over is going to cause bugs, albeit slightly different bugs depending on what you're iterating over.

- Jonathan M Davis
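The two safe alternatives mentioned in this reply can be sketched as follows. The function names and the "append half of every value greater than 1" rule are made up for illustration; only the loop shapes matter.

```d
import std.stdio : writeln;

// Variant 1: collect additions separately, append after the loop has ended.
// New entries are not themselves visited.
int[] expandOnce(int[] candidates)
{
    int[] pending;
    foreach (a; candidates)
    {
        if (a > 1)
            pending ~= a / 2;
    }
    return candidates ~ pending;
}

// Variant 2: plain for loop with an index; the length is re-read on every
// iteration, so entries appended in the body are visited as well.
int[] expandAll(int[] candidates)
{
    for (size_t i = 0; i < candidates.length; ++i)
    {
        if (candidates[i] > 1)
            candidates ~= candidates[i] / 2;
    }
    return candidates;
}

void main()
{
    writeln(expandOnce([8, 3, 1]));
    writeln(expandAll([8]));
}
```

Variant 1 processes each original element exactly once, while variant 2 keeps going until no new work is produced, so which one fits depends on whether the appended items need processing too.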
Re: foreach / mutating iterator - How to do this?
On Mon, Jun 25, 2018 at 05:29:23PM +0200, Robert M. Münch via Digitalmars-d-learn wrote: > I have two foreach loops where the inner should change the iterator > (append new entries) of the outer. > > foreach(a, candidates) { > foreach(b, a) { > if(...) candidates ~= additionalCandidate; > } > } > > The foreach docs state that the collection must not change during > iteration. > > So, how to best handle such a situation then? Using a plain for loop? [...] Yes. T -- The fact that anyone still uses AOL shows that even the presence of options doesn't stop some people from picking the pessimal one. - Mike Ellis
Re: foreach DFS/BFS for tree data-structure?
On Thursday, 14 June 2018 at 11:31:50 UTC, Robert M. Münch wrote: I have a simple tree C data-structure that looks like this: node { node parent: vector[node] children; } I would like to create two foreach algorithms, one following the breadth first search pattern and one the depth first search pattern. Is this possible? I read about Inputranges, took a look at the RBTree code etc. but don't really know/understand where to start. What I found really interesting when reading Ali Çehreli's book 'Programming in D' was using fibers for tree iteration. Check out http://ddili.org/ders/d.en/fibers.html and skip to the section "Fibers in range implementations"
Re: foreach DFS/BFS for tree data-structure?
On Thursday, 14 June 2018 at 11:31:50 UTC, Robert M. Münch wrote: I have a simple tree C data-structure that looks like this: node { node parent: vector[node] children; } I would like to create two foreach algorithms, one following the breadth first search pattern and one the depth first search pattern. Is this possible? I read about Inputranges, took a look at the RBTree code etc. but don't really know/understand where to start. While it's possible to do with input ranges, it's not pretty and I'm not sure that it's as performant as the traditional method. I would recommend going with one of the other suggestions in this thread.
Re: foreach DFS/BFS for tree data-structure?
On 6/14/18 8:35 AM, Robert M. Münch wrote: On 2018-06-14 11:46:04 +, Dennis said: On Thursday, 14 June 2018 at 11:31:50 UTC, Robert M. Münch wrote: Is this possible? I read about Inputranges, took a look at the RBTree code etc. but don't really know/understand where to start. You can also use opApply to iterate over a tree using foreach, see: https://tour.dlang.org/tour/en/gems/opdispatch-opapply Ah... that looks very good. Need to dig a bit deeper on how to use it. Thanks. Just to clarify, RBTree can easily do DFS without a real stack because there are a finite number of children (2) for each node, and it's an O(1) operation to figure out which child the current node is in relation to the parent (am I a left child or right child?). Now, with your C version, if your children are stored in the array itself, figuring out the "next" child is a matter of doing + 1. But if you are storing pointers instead, then figuring out the next child would be an O(n) operation. Using the stack to track where you are (via opApply) is a valid way as well. You could also unroll that into a malloc'd stack, but the code is not as pretty of course. -Steve
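As a rough illustration of "using the stack to track where you are (via opApply)", here is a depth-first (pre-order) traversal sketch. The `Node` class, its `value` field and the `preorder` helper are assumptions made for the example; the original question only specifies a parent link and a list of children.

```d
import std.stdio : writeln;

// Hypothetical tree node; `value` exists only so the order can be observed.
class Node
{
    int value;
    Node[] children;

    this(int value, Node[] children = null)
    {
        this.value = value;
        this.children = children;
    }

    // Pre-order depth-first traversal; the call stack remembers the position.
    int opApply(scope int delegate(Node) dg)
    {
        if (auto result = dg(this))
            return result;                 // the loop body asked to stop
        foreach (child; children)
        {
            if (auto result = child.opApply(dg))
                return result;             // propagate break/return upwards
        }
        return 0;
    }
}

// Collects the node values in the order foreach visits them.
int[] preorder(Node root)
{
    int[] order;
    foreach (node; root)
        order ~= node.value;
    return order;
}

void main()
{
    auto root = new Node(1, [new Node(2, [new Node(4)]), new Node(3)]);
    writeln(preorder(root));
}
```

Returning the delegate's non-zero result is what makes `break` and early `return` inside the foreach body work across the recursion.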
Re: foreach DFS/BFS for tree data-structure?
On 2018-06-14 11:46:04 +, Dennis said: On Thursday, 14 June 2018 at 11:31:50 UTC, Robert M. Münch wrote: Is this possible? I read about Inputranges, took a look at the RBTree code etc. but don't really know/understand where to start. You can also use opApply to iterate over a tree using foreach, see: https://tour.dlang.org/tour/en/gems/opdispatch-opapply Ah... that looks very good. Need to dig a bit deeper on how to use it. Thanks. -- Robert M. Münch http://www.saphirion.com smarter | better | faster
Re: foreach DFS/BFS for tree data-structure?
On Thursday, 14 June 2018 at 11:31:50 UTC, Robert M. Münch wrote: Is this possible? I read about Inputranges, took a look at the RBTree code etc. but don't really know/understand where to start. You can also use opApply to iterate over a tree using foreach, see: https://tour.dlang.org/tour/en/gems/opdispatch-opapply
Re: foreach DFS/BFS for tree data-structure?
On 14/06/2018 11:31 PM, Robert M. Münch wrote:
> I have a simple tree C data-structure that looks like this:
>
>     node {
>         node parent:
>         vector[node] children;
>     }

    struct Node {
        Node* parent;
        Node[] children;
    }

> I would like to create two foreach algorithms, one following the breadth
> first search pattern and one the depth first search pattern.

Here is (very roughly breadth):

    // member function of Node; only walks the direct children
    auto search(Method method) {
        struct Voldemort {
            Node parent;
            size_t offset;

            @property {
                Node front() {
                    return parent.children[offset];
                }

                bool empty() {
                    return offset == parent.children.length;
                }
            }

            void popFront() {
                offset++;
            }
        }

        return Voldemort(this);
    }

Depth will be a bit of a pain since you'll need to know where you have been at each set of children.
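The snippet in this post only walks one node's direct children. A fuller breadth-first input range can be sketched with an explicit queue, as below. This is an illustration under assumptions: `Node` is a class with a made-up `value` field, and `BreadthFirst`, `breadthFirst` and `levelOrder` are hypothetical names. The queue is a plain slice, which is simple but not the most efficient choice.

```d
import std.stdio : writeln;

// Hypothetical tree node; `value` exists only so the order can be observed.
class Node
{
    int value;
    Node[] children;

    this(int value, Node[] children = null)
    {
        this.value = value;
        this.children = children;
    }
}

// Input range that yields nodes level by level.
struct BreadthFirst
{
    Node[] queue; // nodes discovered but not yet visited

    bool empty() { return queue.length == 0; }
    Node front() { return queue[0]; }

    void popFront()
    {
        auto node = queue[0];
        // Drop the visited node and enqueue its children behind the rest.
        queue = queue[1 .. $] ~ node.children;
    }
}

BreadthFirst breadthFirst(Node root)
{
    return BreadthFirst([root]);
}

// Collects the node values in breadth-first order.
int[] levelOrder(Node root)
{
    int[] order;
    foreach (node; breadthFirst(root))
        order ~= node.value;
    return order;
}

void main()
{
    auto root = new Node(1, [new Node(2, [new Node(4)]),
                             new Node(3, [new Node(5)])]);
    writeln(levelOrder(root));
}
```

Because the range is a struct with `empty`, `front` and `popFront`, it also composes with `std.algorithm` and `std.range` functions, which an opApply-based traversal does not.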
Re: foreach bug, or shoddy docs, or something, or both.
On Sunday, 10 December 2017 at 02:31:47 UTC, Jonathan M Davis wrote:
> On Sunday, December 10, 2017 02:02:31 Dave Jones via Digitalmars-d wrote:
> https://issues.dlang.org/show_bug.cgi?id=14984
>
> Honestly, it would have never occurred to me to try and modify the variables declared in the foreach like that, and my first inclination is to think that it shouldn't be allowed, but thinking it through, and looking at how things actually work, I don't think that the current behavior is really a problem.

It's not something I normally do, but I was porting some C++ code and changed the fors to foreaches. It just took a while to figure out what was going wrong.

> then you can increment i the way you want to. So, maybe modifying the loop variable in something like
>
>     foreach(i; 0 .. 10) { ++i; }
>
> should be disallowed, but I don't really think that the current behavior is really a problem either so long as it's properly documented. Modifying i doesn't really cause any problems and ref or the lack thereof allows you to control whether it affects the loop. Ultimately, it looks to me like what we currently have is reasonably well designed. It doesn't surprise me in the least if it's not well-documented though.

Well it's just one of those "APIs should be hard to use in the wrong way" kinda things to me. Either the loop var should be actually the loop var rather than a copy, or it should be an error to modify it.

Anyway it's not a big deal like you say, but it should be better explained in the docs at least.

Thanks,
Re: foreach bug, or shoddy docs, or something, or both.
On Sunday, December 10, 2017 02:02:31 Dave Jones via Digitalmars-d wrote:
> Foreach ignores modification to the loop variable...
>
>     import std.stdio;
>
>     void main() {
>         int[10] foo = 10;
>
>         foreach (i; 0..10) // writes '10' ten times
>         {
>             writeln(foo[i]);
>             if (i == 3) i+=2;
>         }
>     }
>
> From the docs...
>
> "ForeachType declares a variable with either an explicit type, or
> a type inferred from LwrExpression and UprExpression. <**snip**>
> If Foreach is foreach, then the variable is set to LwrExpression,
> then incremented at the end of each iteration."
>
> That's clearly not what is happening, yes we get a variable, but
> either it's a copy of the actual loop variable, or something else
> is going on. Because if we got the actual variable then
> modifications to it would not be lost at the end of the current
> iteration.
>
> My opinion is that it should pick up the modification. I think
> people expect the foreach form to be a shorthand for a regular
> for loop.
>
> Failing that it should be an error to write to the loop variable.
>
> And at the least it should be explained in the documentation that
> actually you get a copy of the loop variable so modifying it is a
> waste of time.

https://issues.dlang.org/show_bug.cgi?id=14984

Honestly, it would have never occurred to me to try and modify the variables declared in the foreach like that, and my first inclination is to think that it shouldn't be allowed, but thinking it through, and looking at how things actually work, I don't think that the current behavior is really a problem.

Given that the variant of foreach you used would presumably be lowered to something like

    for(int i = 0; i < 10; ++i) { ... }

it wouldn't have surprised me if modifying i would have worked, but clearly, that's not quite what's happening, and in the general case, it doesn't make much sense to expect that mucking with the loop variable would affect the loop itself.

    foreach(e; myRange) { ... }

gets lowered to something like

    for(auto __c = myRange; !__c.empty; __c.popFront()) {
        auto e = __c.front;
        ...
    }

and altering e in that case should probably be fine, but it wouldn't affect the iteration at all either. However, when iterating over arrays or using foreach like you tried, it turns out that you can use ref to control things. e.g.

    foreach(i, ref e; arr) { }

will allow you to modify the elements in the array, and

    foreach(ref i, e; arr) { }

will actually allow you to increment i to skip elements. And if you do

    foreach(ref i; 0 .. 10) { }

then you can increment i the way you want to. So, maybe modifying the loop variable in something like

    foreach(i; 0 .. 10) { ++i; }

should be disallowed, but I don't really think that the current behavior is really a problem either so long as it's properly documented. Modifying i doesn't really cause any problems and ref or the lack thereof allows you to control whether it affects the loop. Ultimately, it looks to me like what we currently have is reasonably well designed. It doesn't surprise me in the least if it's not well-documented though.

- Jonathan M Davis
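The difference between the by-copy and `ref` loop variable described above can be observed with a small experiment. The two function names are made up for this sketch; both loops run the same body and record which values of `i` were seen.

```d
import std.stdio : writeln;

// Loop variable is a per-iteration copy: the jump is lost.
int[] visitedByCopy()
{
    int[] seen;
    foreach (i; 0 .. 10)
    {
        seen ~= i;
        if (i == 3)
            i += 2; // changes only the copy for this iteration
    }
    return seen;
}

// Loop variable is a reference to the real counter: the jump takes effect.
int[] visitedByRef()
{
    int[] seen;
    foreach (ref i; 0 .. 10)
    {
        seen ~= i;
        if (i == 3)
            i += 2; // counter becomes 5, then the loop increments it to 6
    }
    return seen;
}

void main()
{
    writeln(visitedByCopy());
    writeln(visitedByRef());
}
```

With `ref`, the loop's own increment still runs after the body, so adding 2 at `i == 3` skips 4 and 5 and continues at 6.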
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:49:33 UTC, Stefan Koch wrote: I'd say this is not often encountered. One should avoid using a different type than size_t for the index, as it can have negative performance implications. I thought size_t was what it lowered down to using if you used something else. What should I use instead?
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 09:06:18 UTC, Guillaume Chatelet wrote: ubyte[256] data; foreach(ubyte i; 0..256) { ubyte x = data[i]; } Yes. Much better. What's the rewrite in this case? Using a size_t internally and casting to ubyte? I was just wondering
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 09:11:44 UTC, Ola Fosheim Grøstad wrote: ubyte[256] data; if (data.length > 0) { ubyte i = 0; do { writeln(i); } while ((++i) != cast(ubyte)data.length); } Here is another version that will work OK on CPUs that can issue many instructions in parallel if there are no dependencies between them, as you avoid an explicit comparison on the counter (zero tests tend to be free): ubyte[N] data; size_t _counter = data.length; if (_counter != 0) { ubyte i = 0; do { writeln(i); i++; } while (--_counter); }
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 09:11:44 UTC, Ola Fosheim Grøstad wrote: ubyte[256] data; if (data.length > 0) { ubyte i = 0; do { writeln(i); } while ((++i) != cast(ubyte)data.length); } You also need to add an assert before the if to check that the last index can be represented as an ubyte. So probably not worth it.
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: A correct lowering would be: ubyte[256] data; for(ubyte i = 0;;++i) { ubyte x = data[i]; ... if(i==255) break; } That could lead to two branches in machine language, try to think about it in terms of if and do-while loops to mirror typical machine language. The correct lowering is: ubyte[256] data; if (data.length > 0) { ubyte i = 0; do { writeln(i); } while ((++i) != cast(ubyte)data.length); }
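To see that the do-while form really visits all 256 indices with a `ubyte` counter, here is a small self-contained sketch. `fillAll` is a made-up helper; the loop condition is the one from the post, where `cast(ubyte)data.length` truncates 256 to 0, so the loop stops when `i` wraps around from 255.

```d
import std.stdio : writeln;

// Stores each index into its slot and returns how many slots were visited.
size_t fillAll(ref ubyte[256] data)
{
    size_t visited = 0;
    if (data.length > 0)
    {
        ubyte i = 0;
        do
        {
            data[i] = i;
            ++visited;
        } while ((++i) != cast(ubyte) data.length); // 255 + 1 wraps to 0
    }
    return visited;
}

void main()
{
    ubyte[256] data;
    writeln(fillAll(data));
    writeln(data[255]);
}
```

As noted elsewhere in the thread, this shape is only valid when the last index fits in the counter type, which is why an extra check would be needed for a general lowering.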
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 09:00:47 UTC, Andrea Fontana wrote: On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: From the programmer's point of view the original code makes sense. A correct lowering would be: ubyte[256] data; for(ubyte i = 0;;++i) { ubyte x = data[i]; ... if(i==255) break; } or: ubyte[256] data; foreach(ubyte i; 0..256) { ubyte x = data[i]; } Yes. Much better. What's the rewrite in this case? Using a size_t internally and casting to ubyte?
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:57:42 UTC, Nemanja Boric wrote: On Thursday, 6 July 2017 at 08:49:33 UTC, Stefan Koch wrote: On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: [...] I'd say this is not often encountered. One should avoid using a different type than size_t for the index, as it can have negative performance implications. Interesting. What would be the example of negative performance implication? I'm guilty of using the int on occasions. On 64-bit, a downcast can cause the compiler to emit a cqo instruction when the value is used as an index. It's relatively expensive in some circumstances when it messes up predictions.
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: From the programmer's point of view the original code makes sense. A correct lowering would be: ubyte[256] data; for(ubyte i = 0;;++i) { ubyte x = data[i]; ... if(i==255) break; } or: ubyte[256] data; foreach(ubyte i; 0..256) { ubyte x = data[i]; }
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:49:33 UTC, Stefan Koch wrote: On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: [...] I'd say this is not often encountered. One should avoid using a different type than size_t for the index, as it can have negative performance implications. Interesting. What would be the example of negative performance implication? I'm guilty of using the int on occasions.
Re: Foreach loops on static arrays error message
On Thursday, 6 July 2017 at 08:26:42 UTC, Guillaume Chatelet wrote: I stumbled upon https://issues.dlang.org/show_bug.cgi?id=12685 In essence: [...] `ubyte` can clearly hold a value from 0 to 255 so it should be ok. No need for 256 ?! So I decided to fix it https://github.com/dlang/dmd/pull/6973 Unfortunately when you think about how foreach is lowered it makes sense - but the error message is misleading. The previous code is lowered to: [...] `i < 256` is always true, this would loop forever. Questions: - What would be a better error message? - How about a different lowering in this case? From the programmer's point of view the original code makes sense. A correct lowering would be: [...] I'd say this is not often encountered. One should avoid using a different type than size_t for the index, as it can have negative performance implications.
Re: foreach range with index
On 6/14/17 6:02 PM, Ali Çehreli wrote:
> On 06/14/2017 12:22 PM, Steven Schveighoffer wrote:
>> foreach(i, v; hashmap) => i is counter, v is value
>>
>> Later hashmap adds support for iterating key and value. Now i is key, v is value. Code means something completely different.
>>
>> Compare with
>>
>> foreach(i, v; hashmap.enumerate)
>>
>> Intent is clear from the code.
>>
>> -Steve
>
> Then, perhaps we're arguing in favor of
>
> * writing .enumerate even for slices (implying that automatic indexing for them has been a historical artifact and code that wants to be portable should always write .enumerate)
>
> * making sure that enumerate() on arrays doesn't bring extra cost

I would say making enumerate on *any* range shouldn't bring any extra cost over how foreach works on an array.

One idea I had but haven't thought through completely is a way to mark some parameter to foreach as always referencing the actual index, so you aren't making unnecessary copies for the loop. When you foreach a range, a copy is made just for the loop, and *then* a copy is made each loop iteration for the element itself. Maybe tagging a parameter in foreach as lazy means "always use the range element".

-Steve
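For reference, the explicit-counter style argued for above looks like this with `std.range.enumerate`. The `numbered` helper and its output format are made up for the sketch; `enumerate` also accepts an optional starting value, as the second call in `main` shows.

```d
import std.format : format;
import std.range : enumerate;
import std.stdio : writeln;

// Labels each value with its position; the counter comes from enumerate,
// not from foreach's built-in array indexing.
string[] numbered(string[] values)
{
    string[] lines;
    foreach (i, v; values.enumerate)
        lines ~= format("%s: %s", i, v);
    return lines;
}

void main()
{
    writeln(numbered(["a", "b", "c"]));

    // Start counting at 1 instead of 0.
    foreach (i, v; ["x", "y"].enumerate(1))
        writeln(i, " ", v);
}
```

Written this way, the loop keeps meaning "counter, value" even if the underlying type later gains its own two-variable foreach form.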