Re: GStreamer and D
On Friday, 16 June 2017 at 16:33:56 UTC, Russel Winder wrote: gst-inspect-1.0 is an executable that comes with the installation, however that is done. What are you thinking of when saying "ported"? gst-inspect is a good demonstration of iterating through the available GStreamer elements and their options. I want to modify that code to generate a model that could be used for persisting a pipeline configuration. https://github.com/GStreamer/gstreamer/blob/master/tools/gst-inspect.c So far, I've looked up about 80 calls used in that program, and only these few don't have C aliases in the D interfaces. I haven't checked whether these are macros, or whether I'm looking at an incompatible version of gst-inspect.c. Anyway, it looks pretty good so far. The missing ones: gst_plugin_feature_get_name, g_list_next, g_return_if_fail, g_value_get_boolean
Re: GStreamer and D
On Friday, 16 June 2017 at 06:45:38 UTC, Russel Winder wrote: Welcome to the group of people using GStreamer from D. I suspect I may be the only other member of that club. Looks like gst-inspect hasn't been ported... I'm looking at that now.
Re: GStreamer and D
Wow! I hadn't tried this gtkd library before. I was hunting for the GStreamer part in particular. The hello_world alsa-sink audio example failed on Windows. The debugger indicates no sink, which I guess is reasonable. With very little effort, though, I converted the hello_world example to generate a video test pattern and use videoconvert and autovideosink, and that popped up right away on Windows in a 64 bit build ... so, nice going! The gstreamer example built without error in msvc 2013 with visualD and DMD32 D Compiler v2.073.2
foreach(i,ref val; ndim_arr)??
I noticed some discussion of Cartesian indexes in Julia, where the index is a tuple, along with some discussion of optimizing the index created for cache efficiency. I could find foreach(ref val, m.byElement()), but didn't find an example that returned a tuple index. Is that supported? http://julialang.org/blog/2016/02/iteration http://julialang.org/blog/2016/03/arrays-iteration
Re: Async or event library
The tnfox cross-platform toolkit had some solution for per-thread event loops. I believe this was the demo: https://github.com/ned14/tnfox/blob/master/TestSuite/TestEventLoops/main.cpp
Re: relative benefit of .reserve and .length
On Friday, 29 April 2016 at 10:10:26 UTC, sigod wrote: How about `assumeSafeAppend`? Does it have any positive impact on performance? assumeSafeAppend made it even slower ... about 20x instead of 10x worse than the indexed assign. Release build, win32.
relative benefit of .reserve and .length
I timed some code recently and found that .reserve made almost no improvement when appending. It appears that the change to the length made by each append had a very high overhead, something over 200 instructions executed, regardless of whether .reserve had been called. This was a simple append to an integer array. The only way I found to avoid this was to set the length outside the loop and update the array values by index. That was on the order of 10x faster.
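A minimal sketch of the two patterns compared in this post, plus a third option, std.array.appender, which the post does not mention and is included here only as an assumption about what might be worth timing as well. The function names are illustrative. No timings are claimed; the sketch only checks that the variants build the same array.

```d
import std.array : appender;
import std.stdio : writeln;

// Reserve capacity up front, then append one element at a time.
int[] fillByAppend(int n)
{
    int[] a;
    a.reserve(n);
    foreach (i; 0 .. n)
        a ~= i;
    return a;
}

// Set the length once, then assign by index (the faster variant in the post).
int[] fillByIndex(int n)
{
    auto a = new int[n];
    foreach (i; 0 .. n)
        a[i] = i;
    return a;
}

// std.array.appender tracks capacity itself between put() calls.
int[] fillByAppender(int n)
{
    auto app = appender!(int[])();
    app.reserve(n);
    foreach (i; 0 .. n)
        app.put(i);
    return app.data;
}

void main()
{
    assert(fillByAppend(1000) == fillByIndex(1000));
    assert(fillByAppender(1000) == fillByIndex(1000));
    writeln("all three variants produce the same array");
}
```

Wrapping each function in a timing loop would show whether the 10x gap reported above reproduces on a given compiler and build type.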
version pairs?
Seems like there should be an extra level to the version statement, something like version(arch,x86). I must be missing something about the intended use of the version statement.
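For reference, a small sketch of how the existing version statement covers the architecture case: the compiler predefines single identifiers such as X86, X86_64 and Windows, and conditions are combined by nesting rather than by pairs. The enum names `arch` and `target` are made up for this illustration.

```d
import std.stdio : writeln;

// Predefined identifiers such as X86 and X86_64 are set by the compiler.
version (X86)
    enum arch = "x86";
else version (X86_64)
    enum arch = "x86_64";
else
    enum arch = "other";

// version() takes one identifier, so conditions are combined by nesting.
version (Windows)
{
    version (X86)
        enum target = "windows-x86";
    else
        enum target = "windows-other";
}
else
    enum target = "non-windows";

void main()
{
    writeln(arch, " / ", target);
}
```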
Re: multi-dimensional dynamic arrays
On Friday, 19 February 2016 at 14:26:25 UTC, Steven Schveighoffer wrote: Try ub[0].length = 3. You are trying to change the length on one of the static arrays. Yes, right, these compile. I was surprised it wouldn't accept the append with just an int. int[1][][1] ubb; ubb[0].length = 3; ubb[0] ~= [5]; If you had more than 1 as a static dimension, then you would have to change the length of *each* of the elements. Arrays in D are actually quite simple. Any time you see: T[] It's a dynamic array of T. Any time you see: T[N] Where N is a compile-time integer, it's a static array of T. So dissecting your type: int[1][][1] So the outer-most T is int[1][]. You have a single instance of this, in a static array. At the next level, T is int[1], where you have a dynamic array of these. Finally, at the 3rd level, T is int, you have a single element in a static array of int. -Steve
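The dissection above can be checked with a short sketch. The helper name `build` is made up for illustration, and the append uses an explicit int[1] value rather than the `[5]` literal from the post, only to keep the example unambiguous.

```d
// Builds the int[1][][1] from the thread and grows its one dynamic level.
int[1][][1] build()
{
    int[1][][1] ubb;

    // Outer level: static array of length 1 holding a dynamic array.
    static assert(is(typeof(ubb[0]) == int[1][]));
    // Middle level: dynamic array whose elements are int[1].
    static assert(is(typeof(ubb[0][0]) == int[1]));

    ubb[0].length = 3;      // resize the dynamic level
    int[1] last = [5];
    ubb[0] ~= last;         // append one int[1] element
    return ubb;
}

void main()
{
    auto ubb = build();
    assert(ubb.length == 1);        // fixed by the type
    assert(ubb[0].length == 4);     // 3 from .length, 1 appended
    assert(ubb[0][0][0] == 0);      // int default-initializes to 0
    assert(ubb[0][3][0] == 5);
}
```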
Re: multi-dimensional dynamic arrays
On Friday, 19 February 2016 at 07:59:29 UTC, Jonathan M Davis wrote: .. Or you could do something really wonky like auto arr = new int[][2][](5); which would be a dynamic array of length 5 which holds static arrays of length 2 which hold dynamic arrays which are null. In my case, int [1][][1] ub;, there is only one dynamic dimension, but if I try to use .length to change the length, ub.length = 3, the compiler doesn't like that. int[1][][1] ubb; ubb.length = 3; src\app.d(13,5): Error: constant ubb.length is not an lvalue dmd failed with exit code 1. So, is there some supported syntax to set the length of the internal dimension, or to append to it?
multi-dimensional dynamic arrays
Strange to me that this compiles, since I would expect there to be some C-like limitation on the position of the unspecified dimension. Is allowing this somehow useful? int[1][][1] ub; writeln("ub",ub);
foreach( i, e; a) vs ndslice
I'm playing with the example below. I noticed a few things. 1. The ndslice didn't support the extra index, i, in the foreach, so had to add extra i,j. 2. I couldn't figure out a way to use sliced on the original 'a' array. Is slicing only available on 1 dim arrays? 3. Sliced parameter order is different than multi-dimension array dimension declaration. import std.stdio; import std.experimental.ndslice.slice; void main() { int[4][5] a = new int[20]; foreach(i,ref r; a){ foreach(j,ref c; r){ c= i+j; writefln("a(%d,%d)=%s",i,j,c); } } writefln("a=%s",a); auto b = new int[20].sliced(5,4); int i=0; foreach( ref r; b){ int j=0; foreach( ref c; r){ c= i+j; writefln("b(%d,%d)=%s",i,j,c); j++; } i++; } writefln("b=%s",b); }
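On point 3, a sketch of how the static declaration reads, using only built-in arrays: `int[4][5]` is five rows of four ints, so the declaration lists the inner dimension first, while `sliced(5,4)` in the post lists the outer one first. The function name `fill` is illustrative, and the cast is needed on 64-bit targets where the foreach indexes are wider than int.

```d
// Fills a 5x4 static array with i + j, mirroring the loop in the post.
int[4][5] fill()
{
    int[4][5] a;
    // Each element of the outer array is a row of four ints.
    static assert(is(typeof(a[0]) == int[4]));

    foreach (i, ref row; a)
        foreach (j, ref c; row)
            c = cast(int)(i + j);   // i and j are size_t
    return a;
}

void main()
{
    auto a = fill();
    assert(a.length == 5 && a[0].length == 4);
    assert(a[4][3] == 7);   // last row, last column
}
```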
Re: ndslice, using a slice in place of T[] in template parameters
On Monday, 11 January 2016 at 00:50:37 UTC, Ilya Yaroshenko wrote: I will add such function. But it is not safe to do so (Slice can have strides not equal to 1). So it is like a hack (&ret[0, 0, 0])[0 .. ret.elementsCount]). Have you made comparison between my and yours parallel versions? https://github.com/9il/examples/blob/parallel/image_processing/median-filter/source/app.d -- Ilya Thanks. No, I haven't studied it previously, but I see how you used the 'hack' in your code, and it works out to the statement below in my case. medians[i] = median(vec, (&slb[task,0])[0 .. bigd]); which compiled. It ran in the faster time without the .array copying. parallel time medians msec:87 That 'hack' seems to be related to the third from below. https://dlang.org/spec/arrays.html b = a; b = a[]; b = a[0 .. a.length];
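The pointer-slicing idiom behind that 'hack' can be shown on a plain double[] without ndslice. This is a sketch under the assumption that the rows are contiguous (stride 1), which is exactly the caveat Ilya raises; `rowOf` is a made-up helper name.

```d
// Returns a view of one row of a flat, row-major buffer without copying.
// Slicing a pointer is not bounds-checked, so the caller must guarantee
// that row * cols + cols <= buf.length.
double[] rowOf(double[] buf, size_t row, size_t cols)
{
    return (&buf[row * cols])[0 .. cols];
}

void main()
{
    auto buf = new double[12];     // 3 rows of 4
    buf[] = 0.0;

    auto r1 = rowOf(buf, 1, 4);
    r1[0] = 9.0;                   // writes through to buf

    assert(buf[4] == 9.0);
    assert(r1 is buf[4 .. 8]);     // same pointer and length as a normal slice
}
```

For a contiguous buffer the result is identical to `buf[row * cols .. (row + 1) * cols]`, which keeps the bounds check.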
Re: ndslice, using a slice in place of T[] in template parameters
On Sunday, 10 January 2016 at 23:31:47 UTC, Ilya Yaroshenko wrote: Just use normal arrays for buffer (median accepts array on second argument for optimisation reasons). ok, I think I see. I created a slice(numTasks, bigd) over an allocated double[] dbuf, but slb[task] will be returning some struct instead of the double[] that i need in this case. If I add .array to the Slice, it does compile, and executes, but slower than using the buffer directly. medians[i] = median(vec, slb[task].array); parallel time medians msec:113 original version using the computed slice of the original allocated dbuf. medians[i] = median(vec,dbuf[j .. k]); parallel time medians msec:85 The .array appears to make a copy. Is there some other call in ndslice to return the double[] slice of the original array?
Re: ndslice, using a slice in place of T[] in template parameters
On Sunday, 10 January 2016 at 22:23:18 UTC, Ilya Yaroshenko wrote: Could you please provide full code and error (git gists)? -- Ilya ok, thanks. I'm building with DMD32 D Compiler v2.069.2 on Win32. The dub.json is included. https://gist.github.com/jnorwood/affd05b69795c20989a3
ndslice, using a slice in place of T[] in template parameters
I cut this median template from Jack Stouffer's article and was attempting to use it in a parallel function. As shown, it builds and executes correctly, but it fails to compile if I attempt to use medians[i] = median(vec,slb[task]); in place of medians[i] = median(vec,dbuf[j .. k]); Is a cast needed? import std.array : array; import std.algorithm; import std.datetime; import std.conv : to; import std.stdio; import std.experimental.ndslice; shared double[] medians; double[] data; shared double[] dbuf; int numTasks; const int smalld = 1000; const int bigd = 10_000; const int fulld = bigd*smalld; /** Params: r = input range buf = buffer with length no less than the number of elements in `r` Returns: median value over the range `r` */ T median(Range, T)(Range r, T[] buf) { import std.algorithm.sorting: sort; size_t n; foreach (e; r) { buf[n++] = e; } buf[0 .. n].sort(); immutable m = n >> 1; return n & 1 ? buf[m] : cast(T)((buf[m - 1] + buf[m]) / 2); } void f3() { import std.parallelism; auto sl = data.sliced(smalld,bigd); auto slb = dbuf.sliced(numTasks,bigd); foreach(i,vec; parallel(sl)){ int task = taskPool.workerIndex; int j = task*bigd; int k = j+bigd; medians[i] = median(vec,dbuf[j .. k]); } } void main() { import std.parallelism; numTasks = taskPool.size+1; data = new double[fulld]; dbuf = new double[bigd*numTasks]; medians = new double[smalld]; for(int i=0;i
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 03:23:14 UTC, Ilya wrote: I will add significantly faster pairwise summation based on SIMD instructions into the future std.las. --Ilya Wow! A lot of overhead in the debug build. I checked the computed values are the same. This is on my laptop corei5. dub -b release-nobounds --force parallel time msec:448 non_parallel msec:767 dub -b debug --force parallel time msec:2465 non_parallel msec:4962 on my corei7 desktop, the release-no bounds parallel time msec:161 non_parallel msec:571
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 11:21:53 UTC, Marc Schütz wrote: I'd say, if `shared` is required, but it compiles without, then it's still a bug. Yeah, probably so. Interestingly, without 'shared' and using a simple assignment from a constant (means[i]= 1.0;), instead of assignment from the sum() evaluation, results in all the values being initialized, so not marking it shared doesn't protect it from being written from the other thread. Anyway, the shared declaration doesn't seem to slow the execution, and it does make sense to me that it should be marked shared.
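One likely explanation for the NaN entries: module-level variables in D are thread-local unless marked shared or __gshared, so each worker thread would write into its own copy of `means`. The sketch below avoids the question by writing into a local heap array that the loop body captures; it uses plain arrays instead of ndslice, and `rowMeans` is a made-up name.

```d
import std.algorithm : sum;
import std.math : isNaN;
import std.parallelism : parallel;

// Computes the mean of each row of a flat, row-major buffer in parallel.
double[] rowMeans(double[] data, size_t rows, size_t cols)
{
    // One slice per row; no element data is copied.
    auto views = new double[][](rows);
    foreach (i, ref v; views)
        v = data[i * cols .. (i + 1) * cols];

    // Heap-allocated and captured by the loop body, so every worker
    // thread writes into the same array. Each index is written once.
    auto means = new double[rows];
    foreach (i, v; parallel(views))
        means[i] = v.sum / cols;
    return means;
}

void main()
{
    auto data = new double[1000 * 100];
    foreach (i, ref d; data)
        d = i;

    auto means = rowMeans(data, 1000, 100);
    foreach (m; means)
        assert(!m.isNaN);
    assert(means[0] == 49.5);   // mean of 0 .. 99
}
```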
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 12:11:39 UTC, Russel Winder wrote: foreach( dv; dvp){ if(dv != dv){ // test for NaN return 1; } } return(0); } I am not convinced these "Tests for NaN" actually test for NaN. I believe you have to use isNaN(dv). I saw it mentioned in another post, and tried it. Works.
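A short sketch comparing the two checks. Under IEEE 754 a NaN never compares equal to itself, so `dv != dv` and std.math.isNaN agree; isNaN simply states the intent. The helper name `anyNaN` is made up for this example.

```d
import std.algorithm : any;
import std.math : isNaN;

// True if at least one element is NaN.
bool anyNaN(const(double)[] xs)
{
    return xs.any!(x => x.isNaN);
}

void main()
{
    double[4] fresh;               // doubles default-initialize to NaN in D
    assert(anyNaN(fresh[]));
    assert(fresh[0] != fresh[0]);  // self-comparison is true only for NaN

    fresh[] = 1.0;
    assert(!anyNaN(fresh[]));
    assert(!(fresh[0] != fresh[0]));
}
```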
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 01:54:18 UTC, Jay Norwood wrote: ok, thanks. That works. I'll go back to trying ndslice now. The parallel time for this case is about a 2x speed-up on my corei5 laptop, debug build in windows32, dmd. D:\ec_mars_ddt\workspace\nd8>nd8.exe parallel time msec:2495 non_parallel msec:5093 === import std.array : array; import std.algorithm; import std.datetime; import std.conv : to; import std.stdio; import std.experimental.ndslice; shared double[1000] means; double[] data; void f1() { import std.parallelism; auto sl = data.sliced(1000,100_000); foreach(i,vec; parallel(sl)){ means[i] = vec.sum / 100_000; } } void f2() { auto sl = data.sliced(1000,100_000); foreach(i,vec; sl.array){ means[i] = vec.sum / 100_000; } } void main() { data = new double[100_000_000]; for(int i=0;i<100_000_000;i++){ data[i] = i/100_000_000.0;} StopWatch sw1, sw2; sw1.start(); f1() ; auto r1 = sw1.peek().msecs; sw2.start(); f2(); auto r2 = sw2.peek().msecs; writeln("parallel time msec:",r1); writeln("non_parallel msec:", r2); }
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 01:16:43 UTC, Ilya Yaroshenko wrote: On Saturday, 9 January 2016 at 23:20:00 UTC, Jay Norwood wrote: I'm playing around with win32, v2.069.2 dmd and "dip80-ndslice": "~>0.8.8". If I convert the 2D slice with .array(), should that first dimension then be compatible with parallel foreach? [...] Oh... there is no bug. means must be shared =) : shared double[1000] means; ok, thanks. That works. I'll go back to trying ndslice now.
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 00:47:29 UTC, Ilya Yaroshenko wrote: This is a bug in std.parallelism :-) ok, thanks. I'm using your code and reduced it a bit. Looks like it has some interaction with executing vec.sum. If I substitute a simple assign of a double value, then all the values are updated in the parallel version also. import std.algorithm; double[1000] dvp; double[1000] dv2; double[] data; void f1() { import std.parallelism; auto sla = new double[][1000]; foreach(i, ref e; sla) { e = data[i * 100_000 .. (i+1) * 100_000]; } // calculate sums in parallel foreach(i, vec; parallel(sla)){ dvp[i] = vec.sum; } // calculate same values non-parallel foreach(i, vec; sla){ dv2[i] = vec.sum; } } int main() { data = new double[100_000_000]; for(int i=0;i<100_000_000;i++){ data[i] = i/100_000_000.0;} f1(); // processed non-parallel works ok foreach( dv; dv2){ if(dv != dv){ // test for NaN return 1; } } // calculated parallel leaves out processing of many values foreach( dv; dvp){ if(dv != dv){ // test for NaN return 1; } } return(0); }
Re: sliced().array compatibility with parallel?
On Sunday, 10 January 2016 at 00:41:35 UTC, Ilya Yaroshenko wrote: It is a bug (Slice or Parallel ?). Please fill this issue. Slice should work with parallel, and array of slices should work with parallel. Ok, thanks, I'll submit it.
Re: sliced().array compatibility with parallel?
for example, means[63] through means[251] are consistently all NaN when using parallel in this test, but are all computed double values when parallel is not used.
sliced().array compatibility with parallel?
I'm playing around with win32, v2.069.2 dmd and "dip80-ndslice": "~>0.8.8". If I convert the 2D slice with .array(), should that first dimension then be compatible with parallel foreach? I find that without using parallel, all the means get computed, but with parallel, only about half of them are computed in this example. The others remain NaN, examined in the debugger in Visual D. import std.range : iota; import std.array : array; import std.algorithm; import std.datetime; import std.conv : to; import std.stdio; import std.experimental.ndslice; enum testCount = 1; double[1000] means; double[] data; void f1() { import std.parallelism; auto sl = data.sliced(1000,100_000); auto sla = sl.array(); foreach(i,vec; parallel(sla)){ double v=vec.sum(0.0); means[i] = v / 100_000; } } void main() { data = new double[100_000_000]; for(int i=0;i<100_000_000;i++){ data[i] = i/100_000_000.0;} auto r = benchmark!(f1)(testCount); auto f0Result = to!Duration(r[0] / testCount); f0Result.writeln; writeln(means[0]); }
Re: UFCS vs auto-completion support
On Saturday, 9 January 2016 at 16:00:51 UTC, cym13 wrote: I may be very naive but how is the second form more complicated than the first? Pretending these were regular function implementations ... 1000. 1000.iota. 1000.iota.sliced( iota( sliced( sliced(iota( I wouldn't be surprised if auto-completion provided correct possible parameter type lists for the last three, but obviously the first two would provide no help, and I'd be pleasantly surprised if the third form provided the parameter type list without the first parameter. anyway ... I'll just try some simple cases in VisualD and eclipse DDT and see what they come up with.
UFCS vs auto-completion support
I'm reading Jack Stouffer's documentation: http://jackstouffer.com/blog/nd_slice.html considering the UFCS example below and how it would impact auto-completion support. auto slice = sliced(iota(1000), 5, 5, 40); auto slice = 1000.iota.sliced(5, 5, 40); Seems like auto-complete support for the second form would be complicated. Do any of the auto-completion implementations even attempt to support that second form?
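To make the two call forms concrete without depending on ndslice, here is the same equivalence using only std.range and std.algorithm. The function names `nested` and `chained` are illustrative; both build the same lazy pipeline.

```d
import std.algorithm : equal, map;
import std.range : iota, take;

// Conventional nested calls: read inside-out.
auto nested()
{
    return take(map!(x => x * x)(iota(1000)), 3);
}

// UFCS chain: a.f(b) is rewritten to f(a, b), so it reads left to right.
auto chained()
{
    return 1000.iota.map!(x => x * x).take(3);
}

void main()
{
    assert(nested().equal(chained()));
    assert(chained().equal([0, 1, 4]));
}
```

For a completion engine, the chained form means that after each dot it has to resolve the type of the expression so far and then look for free functions whose first parameter accepts it.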
Re: each! vs foreach parallel timings
On Sunday, 27 December 2015 at 23:42:57 UTC, Ali Çehreli wrote: That does not compile because i is size_t but apply_metrics() takes an int. One solution is to call to!int: foreach( i, ref a; parallel(samples[])){ apply_metrics(i.to!int,a);} It still builds for me and executes ok, but that must be because size_t and int are both 32 bits in a Win32 build.
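A reduced sketch of the portable form Ali suggests, with the index converted explicitly so that it builds on both 32-bit and 64-bit targets. The struct is trimmed to the fields used here, and `applyMetrics` and `process` are illustrative names.

```d
import std.conv : to;
import std.parallelism : parallel;

struct S { int sn; ulong a; ulong b; ulong m1; }

void applyMetrics(int i, ref S s)
{
    s.m1 = s.a + s.b;
    s.sn = i;
}

// Initializes n samples, then numbers them and computes m1 in parallel.
S[] process(size_t n)
{
    auto samples = new S[n];
    foreach (i, ref s; samples)
    {
        s.a = i + 1;
        s.b = 2 * (i + 1);
    }

    // The index is size_t; to!int converts it and throws on overflow.
    foreach (i, ref s; parallel(samples))
        applyMetrics(i.to!int, s);
    return samples;
}

void main()
{
    auto r = process(10);
    assert(r[9].sn == 9);
    assert(r[9].m1 == 30);   // a = 10, b = 20
}
```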
Re: each! vs foreach parallel timings
On Sunday, 27 December 2015 at 23:42:57 UTC, Ali Çehreli wrote: On 12/27/2015 11:30 AM, Jay Norwood wrote: > samples[].each!((int i, ref a)=>apply_metrics(i,a)); Are you using an older compiler? That tuple expansion does not work any more at least with dmd v2.069.0 but you can use enumerate(): samples[].enumerate.each!(t=>apply_metrics(t[0].to!int,t[1])); > foreach( i, ref a; parallel(samples[])){ apply_metrics(i,a);} That does not compile because i is size_t but apply_metrics() takes an int. One solution is to call to!int: foreach( i, ref a; parallel(samples[])){ apply_metrics(i.to!int,a);} To not answer your actual question, I don't think it's possible. :) Ali The code I posted was compiled with v2.069.2. It isn't creating a tuple return value in this code. I'll re-check it.
each! vs foreach parallel timings
I'm doing some re-writing and measuring. The basic task is to take 10K samples (in struct S samples below) and calculate some metrics (just per sample for now). It isn't evident to me how to write the parallel foreach in the same format as each!, so I just used the loop form that I understood. Measured times below are for processing three simple metrics 100 times on 10K samples. This parallel mode could be very useful in my work, which involves processing a bunch of hardware performance data. This is on windows, corei5, DMD32 D Compiler v2.069.2, debug build. each! time:59 ms parallel! time:20 ms import std.stdio; import std.algorithm; import std.conv; import std.range; import std.typecons; import std.parallelism; import std.array; import std.traits; import std.datetime; struct S { int sn; ulong a; ulong b; ulong c; ulong d; double e; ulong f; ulong m1; double m2; double m3;} void apply_metrics(int i,ref S s){ with(s){ m1 = a+b; m2 = (c+d)/e; m3 = (c+f)/e; sn = i; } } int main() { S[1] samples; // initialize some values foreach ( int i, ref s; samples){ int j=i+1; with (s){ a=j; b=j*2; c=j*3; d=j*4; e=j*10; f=j*5; } } auto sw = StopWatch(AutoStart.yes); // apply several functions on each sample, also number the samples foreach(j;iota(100)) samples[].each!((int i, ref a)=>apply_metrics(i,a)); writeln("each! time:", sw.peek().msecs, " ms"); auto sw2 = StopWatch(AutoStart.yes); // do the same as above, but in parallel foreach(j;iota(100)) foreach( i, ref a; parallel(samples[])){ apply_metrics(i,a);} writeln("parallel! time:", sw2.peek().msecs, " ms"); return 0; }
Re: specifying an auto array type
On Sunday, 27 December 2015 at 07:40:55 UTC, Ali Çehreli wrote: It looks like you need map(), not each(): import std.algorithm; import std.typecons; import std.array; void main() { auto a = [ 1, 2 ]; auto arr = a.map!(e => tuple(2 * e, e * e)).array; static assert(is(typeof(arr) == Tuple!(int, int)[])); } Ali ok, thanks. This does work, using the uint i ahead of the map statement. uint i=0; auto arr = samples[].map!(a => tuple!("sample","f1","f2","f3")(i++,f1(a),f2(a),f3(a))).array; writeln(arr); = output Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(0, 3, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(1, 6, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(2, 9, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(3, 12, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(4, 15, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(5, 18, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(6, 21, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(7, 24, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(8, 27, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(9, 30, 0.7, 0.8) = However, I was trying to use each!, with the intention of then moving to parallel processing by samples blocks. My guess is this would be more efficient than using parallel map or amap, which would parallel process by function application, if I understand correctly. It isn't clear to me from the examples if something like below can be rewritten to use the chained calls. foreach(i, ref elem; taskPool.parallel(samples, 100))
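The external counter `i++` inside the map lambda can be avoided with std.range.enumerate, which pairs each element with its index. A reduced sketch with one metric; `metrics` is a made-up function name and the struct is trimmed to two fields. Note the index type becomes size_t rather than the uint used above.

```d
import std.algorithm : map;
import std.array : array;
import std.range : enumerate;
import std.typecons : Tuple, tuple;

struct S { ulong a; ulong b; }

ulong f1(S s) { return s.a + s.b; }

// Returns an array of named tuples; the element type is inferred.
auto metrics(S[] samples)
{
    return samples.enumerate
        .map!(t => tuple!("sample", "f1")(t.index, f1(t.value)))
        .array;
}

void main()
{
    auto arr = metrics([S(1, 2), S(2, 4), S(3, 6)]);
    static assert(is(typeof(arr) == Tuple!(size_t, "sample", ulong, "f1")[]));
    assert(arr[2].sample == 2);
    assert(arr[2].f1 == 9);
}
```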
specifying an auto array type
This is getting kind of a long example, but I'm really only interested in the last 4 or 5 lines. This works as desired, creating the array of tuples, but I'm wondering if there is a way to have the Tuple array defined as auto instead of having to specify the types. I tried using .array() at the end of the last samples.each!, but couldn't find an implementation that worked. Yes, I know, some of these imports aren't required (yet). import std.stdio; import std.algorithm; import std.conv; import std.range; import std.typecons; import std.parallelism; import std.array; struct S { ulong a; ulong b; ulong c; ulong d; double e; ulong f;} ulong f1(ref S s) { with(s){return a+b;}} double f2(ref S s) { with(s){return (c+d)/e;}} double f3(ref S s) { with(s){return (c+f)/e;}} int main() { S[10] samples; // initialize some values foreach ( int i, ref s; samples){ int j=i+1; with (s){ a=j; b=j*2; c=j*3; d=j*4; e=j*10; f=j*5; } } // apply several functions on each sample samples.each!((int i, ref a)=>tuple!("sample","f1","f2","f3")(i,f1(a),f2(a),f3(a)).writeln()); // output the function results to an array of tuples Tuple!(int, ulong, double, double)[] arr; samples.each!((int i, ref a)=> arr ~= tuple!("sample","f1","f2","f3")(i,f1(a),f2(a),f3(a))); writeln(arr); return 0; }
Re: How to instantiate a map with multiple functions
On Sunday, 27 December 2015 at 03:22:50 UTC, Jay Norwood wrote: I would probably want to associate names with the tuple metric results, and I've seen that somewhere in the docs in parameter tuples. I suppose I'll try those in place of the current tuple ... This worked to associate names with the tuple values. Just the one line modified. samples.each!((int i, ref a)=>tuple!("sample","f1","f2","f3")(i,f1(a),f2(a),f3(a)).writeln()); === output Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(0, 3, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(1, 6, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(2, 9, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(3, 12, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(4, 15, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(5, 18, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(6, 21, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(7, 24, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(8, 27, 0.7, 0.8) Tuple!(int, "sample", ulong, "f1", double, "f2", double, "f3")(9, 30, 0.7, 0.8)
Re: How to instantiate a map with multiple functions
I'm playing around with something also trying to apply multiple functions. In my case, a sample is some related group of measurements taken simultaneously, and I'm calculating a group of metrics from the measured data of each sample. This produces the correct results for the input data, and it seems pretty clear what functions are being applied. I would probably want to associate names with the tuple metric results, and I've seen that somewhere in the docs in parameter tuples. I suppose I'll try those in place of the current tuple ... import std.stdio; import std.algorithm; import std.conv; import std.range; import std.typecons; struct S { ulong a; ulong b; ulong c; ulong d; double e; ulong f;} ulong f1(ref S s) { with(s){return a+b;}} double f2(ref S s) { with(s){return (c+d)/e;}} double f3(ref S s) { with(s){return (c+f)/e;}} int main() { S[10] samples; // initialize some values foreach ( int i, ref s; samples){ int j=i+1; with (s){ a=j; b=j*2; c=j*3; d=j*4; e=j*10; f=j*5; } } // apply several functions on each sample samples.each!((int i, ref a)=>tuple(i,f1(a),f2(a),f3(a)).writeln()); return 0; } == output is Tuple!(int, ulong, double, double)(0, 3, 0.7, 0.8) Tuple!(int, ulong, double, double)(1, 6, 0.7, 0.8) Tuple!(int, ulong, double, double)(2, 9, 0.7, 0.8) Tuple!(int, ulong, double, double)(3, 12, 0.7, 0.8) Tuple!(int, ulong, double, double)(4, 15, 0.7, 0.8) Tuple!(int, ulong, double, double)(5, 18, 0.7, 0.8) Tuple!(int, ulong, double, double)(6, 21, 0.7, 0.8) Tuple!(int, ulong, double, double)(7, 24, 0.7, 0.8) Tuple!(int, ulong, double, double)(8, 27, 0.7, 0.8) Tuple!(int, ulong, double, double)(9, 30, 0.7, 0.8)
Re: basic interactive readf from stdin
On Sunday, 27 December 2015 at 00:20:51 UTC, Ali Çehreli wrote: On 12/26/2015 12:11 PM, karthikeyan wrote: > I read http://ddili.org/ders/d.en/input.html and inserted a space before %s > but still no use. Am I missing something here with the latest version? The answer is nine chapters later. :) (Use readln() and strip() (or chomp())). http://ddili.org/ders/d.en/strings.html Ali Yes, thank you, strip() appears to be more useful than chomp() in this case.
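A minimal sketch of the readln() plus strip() approach recommended here. strip() removes leading and trailing whitespace, including a "\r\n" line ending, whereas chomp() only removes the trailing terminator. `cleanLine` is an illustrative helper name.

```d
import std.stdio : readln, writeln;
import std.string : strip;

// Drops surrounding whitespace, including the line terminator.
string cleanLine(string raw)
{
    return raw.strip;
}

void main()
{
    // Read two lines interactively and echo each one immediately.
    foreach (_; 0 .. 2)
    {
        auto raw = readln();
        if (raw.length == 0)    // end of input
            break;
        writeln("nm:", cleanLine(raw));
    }
}
```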
Re: basic interactive readf from stdin
On Saturday, 26 December 2015 at 20:19:08 UTC, Adam D. Ruppe wrote: On Saturday, 26 December 2015 at 20:11:27 UTC, karthikeyan wrote: I experience the same as the OP on Linux Mint 15 with dmd2.069 and 64 bit machine. I have to press enter twice to get the output. I read http://ddili.org/ders/d.en/input.html and inserted a space before %s but still no use. Am I missing something here with the latest version? Oh, I'm sorry, it isn't buffering, it is readfing into a string here which is weird. Maybe try readln instead of readf. The use of readf into a string is demonstrated in a stdio.d unit test. I assumed it might also work with stdin. string s; auto f = File(deleteme); f.readf("%s\n", &s); assert(s == "hello", "["~s~"]"); f.readf("%s\n", &s); assert(s == "world", "["~s~"]"); = I did get this below to work with readln, although since readln didn't consume the terminator, I had to add the chomp() call. import std.stdio; import std.string; int main(string[] argv) { string nm, nm2; nm=readln('\n'); nm2 = nm.chomp(); writeln("nm:",nm2); nm=readln('\n'); nm2 = nm.chomp(); writeln("nm:",nm2); return 0; }
Re: basic interactive readf from stdin
On Saturday, 26 December 2015 at 20:38:52 UTC, tcak wrote: On Saturday, 26 December 2015 at 20:19:08 UTC, Adam D. Ruppe wrote: On Saturday, 26 December 2015 at 20:11:27 UTC, karthikeyan wrote: I experience the same as the OP on Linux Mint 15 with dmd2.069 and 64 bit machine. I have to press enter twice to get the output. I read http://ddili.org/ders/d.en/input.html and inserted a space before %s but still no use. Am I missing something here with the latest version? Oh, I'm sorry, it isn't buffering, it is readfing into a string here which is weird. Maybe try readln instead of readf. As far as I remember, in C, if I was to be putting "\n" in scanf after %s, that double entering was happening. I guess that's the same problem. Trying same code without \n in readf can fix it I guess. import std.stdio; int main(string[] argv) { string nm; stdin.readf("%s",&nm); writeln("nm:",nm); stdout.flush(); stdin.readf("%s",&nm); writeln("nm:",nm); stdout.flush(); return 0; } ok, I tried above, adding both the stdout.flush() and removing the \n from the format. It didn't write to output even after a couple of enter's. When I entered ctrl-Z, it output below. output running from command prompt 123 456 ^Z nm:123 456 nm:123 456
Re: basic interactive readf from stdin
On Saturday, 26 December 2015 at 19:52:15 UTC, Adam D. Ruppe wrote: On Saturday, 26 December 2015 at 19:40:59 UTC, Jay Norwood wrote: Simple VS console app in D. If you are running inside visual studio, you need to be aware that output will be block buffered, not line buffered, because VS pipes the output making the program think it is talking to another program instead of to an interactive console (well, because it is!) Add a stdout.flush(); after writing to force it to show immediately. I really think the read functions ought to flush output too because this is such a FAQ. (indeed, my terminal.d does flush output when you request input) It doesn't make a difference whether I run in VS or from a console window. I had also already tried various forms of stdout.flush(). It doesn't make a difference ... it still requires two extra enters before it outputs the data. I haven't tried it on Linux yet.
basic interactive readf from stdin
Simple VS console app in D. Reading lines to a string variable interactively. Object is to have no extra blank lines in the console output. Seems very broken for this use, requiring two extra "enter" entries before the outputs both appear. Version DMD32 D Compiler v2.069.2 import std.stdio; int main(string[] argv) { string nm; stdin.readf("%s\n",&nm); writeln("nm:",nm); stdin.readf("%s\n",&nm); writeln("nm:",nm); return 0; } io shown below 123 456 nm:123 nm:456
Re: ndslice of an array structure member?
On Tuesday, 22 December 2015 at 01:13:54 UTC, Jack Stouffer wrote: The problem is that t3 is slicing a1 which is a dynamic array, which is a range, while t4 is trying to slice a static array, which is not a range. ok, thanks. I lost track of the double meaning of static ... I normally think of static as variables with static preceding the type, allocated at start-up, but in this case it refers to structure members that will have a fixed size after the allocation.
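The range-versus-static-array distinction can be checked directly with the standard traits, and taking a `[]` slice of the member yields a dynamic array view. Whether passing `b1.samples[]` to sliced satisfies dip80-ndslice 0.8.4 is an assumption not verified here; `Block` and `viewOf` are illustrative names.

```d
import std.range.primitives : isInputRange;
import std.traits : isStaticArray;

struct Block { ulong[60] samples; }

// A static array is not a range: it cannot shrink via popFront.
static assert(isStaticArray!(ulong[60]));
static assert(!isInputRange!(ulong[60]));
// A dynamic array (slice) is a range.
static assert(isInputRange!(ulong[]));

// Slicing the member with [] gives a ulong[] view over the same memory.
ulong[] viewOf(Block* b)
{
    return b.samples[];
}

void main()
{
    auto b = new Block;
    auto view = viewOf(b);
    view[0] = 42;
    assert(view.length == 60);
    assert(b.samples[0] == 42);   // the view aliases the struct member
}
```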
Re: ndslice and limits of debug info and autocompletion
The autocompletion doesn't work here to offer epu_ctr in the writeln statement either, so it doesn't seem to be a problem with number of subscripts. writeln(a1[0]. does offer epu_ctr for completion at the same place. import std.stdio; import std.experimental.ndslice; import std.experimental.ndslice.iteration: transposed; struct sample{ ulong [10] core_ctr; ulong [32] epu_ctr; } void main() { auto a1 = new sample[60]; auto t3 = a1.sliced!(ReplaceArrayWithPointer.no)(60); writeln(t3[0].epu_ctr); }
ndslice and limits of debug info and autocompletion
I'm trying to determine if the debugger autocompletion would be useful in combination with ndslice. I find that using visualD I get offered no completion to select core_ctr or epu_ctr where epu_ctr is used in the writeln below. I take it this either means that there is some basic limitation in the debug info, or else VisualD just punts after some number of array subscripts. The code builds and executes correctly ... but I was hoping the debugger completion would help out with an exploratory mode using compiled code. import std.stdio; import std.experimental.ndslice; import std.experimental.ndslice.iteration: transposed; struct sample{ ulong [10] core_ctr; ulong [32] epu_ctr; } void main() { auto a1 = new sample[60]; auto t3 = a1.sliced!(ReplaceArrayWithPointer.no)(3,4,5); writeln(t3[0][0][0].epu_ctr); }
ndslice of an array structure member?
I'm trying to learn ndslice. It puzzles me why t3 compiles ok, but t4 causes a compiler error in the example below. Should I be able to slice a struct member that is an array? import std.stdio; import std.experimental.ndslice; import std.experimental.ndslice.iteration: transposed; struct sample{ ulong [10] core_ctr; } struct block{ ulong[60] samples; } void main() { auto a1 = new sample[60]; auto t3 = a1.sliced!(ReplaceArrayWithPointer.no)(3,4,5); auto b1 = new block; auto t4 = b1.samples.sliced!(ReplaceArrayWithPointer.no)(3,4,5); } == results in error Building D project: nd4 Running: "C:\Program Files\dub\dub.exe" build Performing "debug" build using dmd for x86. dip80-ndslice 0.8.4: target for configuration "library" is up to date. nd4 ~master: building configuration "application"... src\app.d(16,58): Error: template std.experimental.ndslice.slice.sliced cannot deduce function from argument types !(cast(Flag)false)(ulong[60], int, int, int), candidates are: C:\Users\rlxv10\AppData\Roaming\dub\packages\dip80-ndslice-0.8.4\source\std\experimental\ndslice\slice.d(39,6): std.experimental.ndslice.slice.sliced(Flag mod = ReplaceArrayWithPointer.yes, Range, Lengths...)(Range range, Lengths lengths) if (!isStaticArray!Range && !isNarrowString!Range && allSatisfy!(isIndex, Lengths) && Lengths.length) C:\Users\rlxv10\AppData\Roaming\dub\packages\dip80-ndslice-0.8.4\source\std\experimental\ndslice\slice.d(47,6): std.experimental.ndslice.slice.sliced(Flag mod = ReplaceArrayWithPointer.yes, uint N, Range)(Range range, auto ref size_t[N] lengths, size_t shift = 0) if (!isStaticArray!Range && !isNarrowString!Range && N) C:\Users\rlxv10\AppData\Roaming\dub\packages\dip80-ndslice-0.8.4\source\std\experimental\ndslice\slice.d(116,1): std.experimental.ndslice.slice.sliced(Names...) if (Names.length && !anySatisfy!(isType, Names) && allSatisfy!(isStringValue, Names)) dmd failed with exit code 1. ^^^ Terminated, exit code: 2 ^^^
Re: use of typeof to determine auto type with ndslice examples
So, the extra confusion of the typeof(iota) Result return goes away when slicing arrays. auto a1 = new int[100]; auto t3 = a1.sliced(3,4,5); pragma(msg,typeof(t3)); //This prints Slice!(3u, int*) Slice!(3u, int*) t4 = a1.sliced(3,4,5); // and this works ok
Re: use of typeof to determine auto type with ndslice examples
On Monday, 21 December 2015 at 04:39:23 UTC, drug wrote: You can use alias Type = typeof(t0); Type t1 = 1000.iota.sliced(3, 4, 5); IIRC Result is the Voldemort type. You can think of it as a detail of implementation of ndslice that isn't intended to be used by a ndslice user directly. ok, well this worked, so it seems to be something lacking in the description of iota's type rather than an issue with ndslice. alias RESULT = typeof(1000.iota); Slice!(3u, RESULT) t1 = 1000.iota.sliced(3, 4, 5); auto t2 = 1000.iota(); pragma(msg, typeof(t2)); This just prints Result. Ok, so what is the reason for not being able to know the type of a specified iota?
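On the closing question: `Result` is a struct declared inside the `iota` function, so the name is simply not in scope outside it, which is what makes it a Voldemort type. A minimal sketch of working with such a type through an alias follows; `IotaRange` and `sumFirst` are hypothetical names chosen for illustration.

```d
import std.algorithm : sum;
import std.range : iota, take;
import std.stdio : writeln;

// The name `Result` is not visible here, but typeof recovers the type.
alias IotaRange = typeof(iota(1000));

// The alias can be used wherever a spelled-out type is required,
// for example in a function signature.
int sumFirst(IotaRange r, size_t n)
{
    return r.take(n).sum;
}

void main()
{
    IotaRange r = iota(1000);
    writeln(sumFirst(r, 4)); // 0 + 1 + 2 + 3
}
```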
Re: use of typeof to determine auto type with ndslice examples
import std.stdio; import std.experimental.ndslice; void main() { import std.algorithm.iteration: map; import std.array: array; import std.range; import std.traits; auto t0 = 1000.iota.sliced(3, 4, 5); pragma(msg, typeof(t0)); Slice!(3u, Result) t1 = 1000.iota.sliced(3, 4, 5); darn, I didn't duplicate the problem correctly. It should be as above. and the error is: Slice!(3u, Result) src\app.d(12,2): Error: undefined identifier 'Result' dmd failed with exit code 1.
use of typeof to determine auto type with ndslice examples
I pulled down the std.experimental.ndslice examples and am attempting to build some of them and understand the types being used. I know I don't need all these imports, but it is hard to guess which ones are needed, and the examples often don't provide them, which I suspect is a common gripe here. Anyway, I was expecting to be able to use pragma(msg, typeof(...)) to print a type that I could use as a fully specified type, and that doesn't seem to be the case. I get a compile error instead. DMD32 D Compiler v2.069.2 on win32, "dip80-ndslice": "~>0.8.4" Is there some other way to get a valid fully specified type for these sliced auto variables? import std.stdio; import std.experimental.ndslice; void main() { import std.algorithm.iteration: map; import std.array: array; import std.range; import std.traits; auto t0 = 1000.iota.sliced(3, 4, 5); pragma(msg, typeof(t0)); Slice!(3u, Result) = 1000.iota.sliced(3, 4, 5); Slice!(3u, Result) src\app.d(12,2): Error: undefined identifier 'Result' dmd failed with exit code 1.
Re: dataframe implementations
On Saturday, 21 November 2015 at 14:16:26 UTC, Laeeth Isharc wrote: Not sure it is a great idea to use a variant as the basic option when very often you will know that every cell in a particular column will be of the same type. I'm reading today about an n-dim extension to pandas named xray. Maybe I should try to understand how that fits. They support io from netCDF, and are making extensions to support blocked input using dask, so they can process data larger than in-memory limits. http://xray.readthedocs.org/en/stable/data-structures.html https://www.continuum.io/content/xray-dask-out-core-labeled-arrays-python In general, pandas and xray are built around the requirement of pulling in data from storage with initially unknown column and index names and data types. Julia throws in support for jit compilation and specialized operations for different data types. It seems to me that D's strength would be in a quick compile, which would then allow you to replace the dictionary tag implementations and variants with something that used compile time symbol names and data types. Seems like that would provide more efficient processing, as well as better tab completion support when creating expressions.
Re: dataframe implementations
On Wednesday, 18 November 2015 at 22:46:01 UTC, jmh530 wrote: My sense is that any data frame implementation should try to build on the work that's being done with n-dimensional slices. I've been watching that development, but I don't have a feel for where it could be applied in this case, since it appears to be focused on multi-dimensional slices of the same data type, slicing up a single range. The dataframes often consist of different data types by column. How did you see the nd slices being used? Maybe the nd slices could be applied if you considered each row to be the same structure, and slice by rows rather than operating on columns. Pandas supports a multi-dimension panel. Maybe this would be the application for nd slices by row.
Re: dataframe implementations
One more discussion link on the NA subject. This one on the R implementation of NA using a single encoding of NaN, as well as their treatment of a selected integer value as a NA. http://rsnippets.blogspot.com/2013/12/gnu-r-vs-julia-is-it-only-matter-of.html
Re: dataframe implementations
On Wednesday, 18 November 2015 at 18:04:30 UTC, Jay Norwood wrote: vector. I'll try to find the discussions and post the link. Here are the two discussions I recall on the julia NA implementation. http://wizardmac.tumblr.com/post/104019606584/whats-wrong-with-statistics-in-julia-a-reply https://github.com/JuliaLang/julia/pull/9363
Re: dataframe implementations
On Wednesday, 18 November 2015 at 17:15:38 UTC, Laeeth Isharc wrote: What do you think about the use of NaN for missing floats? In theory I could imagine wanting to distinguish between an NaN in the source file and a missing value, but in my world I never felt the need for this. For integers and bools, that is different of course. The julia discussions mention another dataframe implementation, I believe it was for R, where NaN was used. There was some mention of the virtues of their own choice and the problems with NaN. I think the NA marker there was one particular encoding of NaN. Other implementations they mentioned used some reserved value in each of the numeric data types to represent NA. In the julia case, I believe what they use is a separate byte vector for each column that holds the NA status. They discussed some other possible enhancements, but I don't know what they implemented. For example, if the single byte holds the NA flag, the cell value can hold additional info ... maybe the reason for the NA. There was also some discussion of having the associated cell hold repeat counts for the NA status, which I suppose meant to repeat it for following cells in the column vector. I'll try to find the discussions and post the link.
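The separate byte vector approach can be sketched in D roughly as below. This is only an illustration of the idea as described above, not Julia's actual layout; `Column` and `meanSkippingNA` are hypothetical names.

```d
import std.stdio : writeln;

// Hypothetical column type: the values plus one NA byte per cell.
struct Column
{
    double[] values; // cell values; the value is ignored where na[i] != 0
    ubyte[] na;      // 0 = present, nonzero = not available
}

// Mean over the cells that are present; NaN when every cell is NA.
double meanSkippingNA(const Column col)
{
    double total = 0;
    size_t count = 0;
    foreach (i, v; col.values)
    {
        if (col.na[i] == 0)
        {
            total += v;
            ++count;
        }
    }
    return count ? total / count : double.nan;
}

void main()
{
    double[] vals = [1.0, 99.0, 3.0];
    ubyte[] mask = [0, 1, 0]; // the middle cell is NA
    writeln(meanSkippingNA(Column(vals, mask)));
}
```

Keeping the flag outside the value leaves a real NaN in the source data distinguishable from a missing cell, which is the distinction raised in the quoted question.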
Re: dataframe implementations
I looked through the dataframe code and a couple of comments... I had thought perhaps an app could read in the header info and type info from hdf5, and generate D struct definitions with column headers as symbol names. That would enable faster processing than with the associative arrays, as well as support the auto-completion that would be helpful in writing expressions. The csv type info for columns could be inferred, or else stated in the reader call, as done as an option in julia. In both cases the column names would have to be valid symbol names for this to work. I believe Julia also expects this, or else does some conversion on your column names to make them valid symbols. I think the D csv processing would also need to check if the The jupyter interactive environment supports python pandas and Julia dataframe column names in the autocompletion, and so I think the D debugging environment would need to provide similar capability if it is to be considered as a fast-recompile substitute for interactive dataframe exploration. It seems to me that your particular examples of stock data would eventually need to handle missing data, as supported in Julia dataframes and python pandas. They both provide ways to drop or fill missing values. Did you want to support that?
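A minimal sketch of the struct generation idea, assuming the column names and types are already known as strings. Here they are hard-coded and mixed in at compile time; a separate generator app, as suggested above, could write the same text to a .d file instead. `structSource`, `StockRow`, and the column list are hypothetical.

```d
import std.stdio : writeln;

// Build the source text of a struct from parallel lists of column types and names.
// The names must already be valid D identifiers.
string structSource(string name, string[] types, string[] fields)
{
    string src = "struct " ~ name ~ " { ";
    foreach (i, field; fields)
        src ~= types[i] ~ " " ~ field ~ "; ";
    return src ~ "}";
}

// Hypothetical column layout, evaluated at compile time and mixed in as a declaration.
mixin(structSource("StockRow",
                   ["string", "double", "long"],
                   ["date", "close", "volume"]));

void main()
{
    StockRow row;
    row.close = 101.5;
    row.volume = 2_000;
    // Members are ordinary fields, so the compiler checks names and types.
    writeln(row.close * row.volume);
}
```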
Re: dataframe implementations
On Monday, 2 November 2015 at 15:33:34 UTC, Laeeth Isharc wrote: Hi Jay. That may have been me. I have implemented something very basic, but you can read and write my proto dataframe to/from CSV and HDF5. The code is up here: https://github.com/Laeeth/d_dataframes yes, thanks. I believe I did see your comments previously. That's great that you've already got support for hdf5. I'll take a look.
dataframe implementations
I was reading about the Julia dataframe implementation yesterday, trying to understand their decisions and how D might implement. From my notes,
1. they are currently using a dictionary of column vectors.
2. for NA (not available) they are currently using an array of bytes, effectively as a Boolean flag, rather than a bitVector, for performance reasons.
3. they are not currently implementing hierarchical headers.
4. they are transforming non-valid symbol header strings (read from csv, for example) to valid symbols by replacing '.' with underscore and prefixing numbers with 'x', as examples. This allows use in expressions.
5. Along with 4., they currently have @with for DataVector, to allow expressions to use, for example, :symbol_name instead of dv[:symbol_name].
6. They have operation symbols for per element operations on two vectors, for example a ./ b expresses applying the operation to the vector.
7. They currently only have row indexes, no row names or symbols.
I saw someone posting that they were working on DataFrame implementation here, but haven't been able to locate any code in github, and was wondering what implementation decisions are being made here. Thanks.
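The header transformation in note 4 could look something like this in D. It is a sketch covering only the two rules mentioned in the notes ('.' to underscore, 'x' prefix for a leading digit); `toSymbolName` is a hypothetical name, and a real implementation would need to handle other invalid characters too.

```d
import std.array : replace;
import std.ascii : isDigit;
import std.stdio : writeln;

// Turn a header string into something usable as an identifier,
// applying only the two rules from the notes.
string toSymbolName(string header)
{
    auto name = header.replace(".", "_");
    if (name.length && isDigit(name[0]))
        name = "x" ~ name;
    return name;
}

void main()
{
    writeln(toSymbolName("net.income")); // net_income
    writeln(toSymbolName("2015"));       // x2015
}
```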
Re: an example of parallel calculation of metrics
This is another attempt with the metric parallel processing. This uses the results only to return an int value, which could be used later as an error return value. The metric value locations are now allocated as a part of the input measurement values tuple. The Tuple vs struct definitions seem to have a big difference in default output formatting.
import std.algorithm, std.parallelism, std.range;
import std.typecons;
import std.meta;
import std.stdio;
// define some input measurement sample tuples and output metric tuples
alias TR = Tuple!(long,"raw",double, "per_cycle");
//struct TR {long raw; double per_cycle;}
alias TO = Tuple!(TR, "l1_miss", TR, "l1_access" );
//struct TO {TR l1_miss; TR l1_access; };
alias TI = Tuple!(long, "L1I_MISS",long, "L1D_MISS", long, "L1D_READ", long, "L1D_WRITE", long, "cycles", TO, "res");
// various metric definitions
// using Tuples with defined names for each member, and use the names here in the metrics.
long met_l1_miss ( ref TI m){ return m.L1I_MISS + m.L1D_MISS; }
long met_l1_access ( ref TI m){ return m.L1D_READ + m.L1D_WRITE; }
int met_all (ref TI m) {
    with (m.res){
        l1_miss.raw = met_l1_miss(m);
        l1_access.raw = met_l1_access(m);
        l1_miss.per_cycle = (m.cycles == 0)? double.nan : l1_miss.raw / cast(double)m.cycles;
        l1_access.per_cycle = (m.cycles == 0)? double.nan : l1_access.raw / cast(double)m.cycles;
    }
    return 0;
}
// a convenience to use all the metrics above as a list
alias Metrics = AliasSeq!(met_all);
void main(string[] argv) {
    auto samples = iota(100);
    auto meas = new TI[samples.length];
    auto results = new int[samples.length];
    // Initialize some values for the measured samples
    foreach(i, ref m; meas){
        m.L1D_MISS= 100+i; m.L1I_MISS=100-i; m.L1D_READ= 200+i; m.L1D_WRITE=200-i; m.cycles= 10+i;
    }
    ref TI getTerm(int i) { return meas[i]; }
    // compute the metric results for the above measured sample values in parallel
    taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results);
    writeln("measurements:", meas[1]);
    foreach(ref m; meas){ writeln(m.res); }
}
Re: an example of parallel calculation of metrics
I re-submitted this as: https://issues.dlang.org/show_bug.cgi?id=15135
Re: an example of parallel calculation of metrics
So, this is a condensed version of the original problem. It looks like the problem is that the return value for taskPool.amap can't be a tuple of tuples or a tuple of struct. Either way, it fails with the Wrong buffer type error message if I uncomment the taskPool line import std.algorithm, std.parallelism, std.range; import std.typecons; import std.meta; import std.stdio; // define some input measurement sample tuples and output metric tuples struct TR { long raw; double per_cyc;} //alias TR = Tuple!(long, "raw", double, "per_cyc"); alias TI = Tuple!(long, "L1I_MISS",long, "L1D_MISS", long, "L1D_READ", long, "L1D_WRITE", long, "cycles" ); alias TO = Tuple!(TR, "L1_MISS", TR, "L1D_ACCESS"); // various metric definitions // using Tuples with defined names for each member, and use the names here in the metrics. TR met_l1_miss ( ref TI m){ TR rv; rv.raw = m.L1I_MISS+m.L1D_MISS; rv.per_cyc = cast(double)rv.raw/m.cycles; return rv; } TR met_l1_access ( ref TI m){ TR rv; rv.raw = m.L1D_READ+m.L1D_WRITE; rv.per_cyc = cast(double)rv.raw/m.cycles; return rv; } // a convenience to use all the metrics above as a list alias Metrics = AliasSeq!(met_l1_miss, met_l1_access); void main(string[] argv) { auto samples = iota(100); auto meas = new TI[samples.length]; auto results = new TO[samples.length]; // Initialize some values for the measured samples foreach(i, ref m; meas){ m.L1D_MISS= 100+i; m.L1I_MISS=100-i; m.L1D_READ= 200+i; m.L1D_WRITE=200-i; m.cycles= 10+i; } ref TI getTerm(int i) { return meas[i]; } // compute the metric results for the above measured sample values in parallel //taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results); TR rv1 = met_l1_miss( meas[1]); TR rv2 = met_l1_access( meas[1]); writeln("measurements:", meas[1]); writeln("rv1:", rv1); writeln("rv2:", rv2); writeln("results:", results[1]); }
Re: an example of parallel calculation of metrics
On Thursday, 1 October 2015 at 18:08:31 UTC, Ali Çehreli wrote: However, if you prove to yourself that the result tuple and your struct have the same memory layout, you can cast the tuple slice to struct slice after calling amap:
After re-reading your explanation, I see that the problem is only that the result needs to be a Tuple. It works with named tuple members in this example as the result and an array of struct as the input. I'll re-check if the multi-member result also works with named members. I'll update the issue report.
import std.algorithm, std.parallelism, std.range;
import std.typecons;
import std.meta;
import std.stdio;
// define some input measurement sample tuples and output metric tuples
struct TI {long L1I_MISS; long L1D_MISS; }
alias TO = Tuple!(long, "raw");
// various metric definitions
// using Tuples with defined names for each member, and use the names here in the metrics.
TO met_l1_miss ( ref TI m){ TO rv; rv.raw = m.L1I_MISS+m.L1D_MISS; return rv; }
// a convenience to use all the metrics above as a list
alias Metrics = AliasSeq!(met_l1_miss);
void main(string[] argv) {
    auto samples = iota(100);
    auto meas = new TI[samples.length];
    auto results = new TO[samples.length];
    // Initialize some values for the measured samples
    foreach(i, ref m; meas){ m.L1D_MISS= 100+i; m.L1I_MISS=100-i; }
    ref TI getTerm(int i) { return meas[i]; }
    // compute the metric results for the above measured sample values in parallel
    taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results);
    TO rv1 = met_l1_miss( meas[1]);
    writeln("measurements:", meas[1]);
    writeln("rv1:", rv1);
    writeln("results:", results[1]);
}
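The cast Ali mentions can be sketched as below, under the assumption that an unnamed tuple of two longs and a struct of two longs share a layout. `MetricPair`, `Raw`, and `asMetricPairs` are hypothetical names, and the size check is only a sanity guard, not a proof that the layouts match.

```d
import std.stdio : writeln;
import std.typecons : Tuple;

struct MetricPair { long l1Miss; long l1Access; } // hypothetical named view
alias Raw = Tuple!(long, long);                   // unnamed tuple, as amap would fill

// Guard against an obvious mismatch; equal size does not prove equal layout.
static assert(MetricPair.sizeof == Raw.sizeof);

// Reinterpret the tuple slice as a struct slice: same memory, no copy.
MetricPair[] asMetricPairs(Raw[] raw)
{
    return cast(MetricPair[]) raw;
}

void main()
{
    auto raw = [Raw(1, 2), Raw(3, 4)];
    auto named = asMetricPairs(raw);
    writeln(named[1].l1Miss, " ", named[1].l1Access);
}
```

Because the result aliases the original memory, writes through either slice are visible through the other.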
Re: an example of parallel calculation of metrics
On Thursday, 1 October 2015 at 18:08:31 UTC, Ali Çehreli wrote: Makes sense. Please open a bug at least for investigation why tuples with named members don't work with amap. ok, thanks. I opened the issue. https://issues.dlang.org/show_bug.cgi?id=15134
Re: an example of parallel calculation of metrics
On Thursday, 1 October 2015 at 07:03:40 UTC, Ali Çehreli wrote: Looks like a bug. Workaround: Get rid of member names
Thanks. My particular use case, working with metric expressions, is easier to understand if I use the names. I converted the use of Tuple to struct to see if I could get an easier error msg. Turns out the use of struct also results in much cleaner writeln text. Still has the compile error, though.
import std.algorithm, std.parallelism, std.range;
import std.stdio;
import std.datetime;
import std.typecons;
import std.meta;
// define some input measurement sample tuples and output metric tuples
struct TR {double per_sec; double per_cycle; long raw;}
struct TI {long proc_cyc; long DATA_RD; long DATA_WR; long INST_FETCH; long L1I_MISS; long L1I_HIT; long L1D_HIT; long L1D_MISS;}
struct TO { TR L1_MISS; TR L1_HIT; TR DATA_ACC; TR ALL_ACC;}
const double CYC_PER_SEC = 1_600_000_000;
// various metric definitions
// using Tuples with defined names for each member, and use the names here in the metrics.
TR met_l1_miss ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = L1I_MISS+L1D_MISS; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_l1_hit ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = L1I_HIT+L1D_HIT; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_data_acc ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = DATA_RD+DATA_WR; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_all_acc( ref TI m){
    TR rv;
    with(rv) with(m) { raw = DATA_RD+DATA_WR+INST_FETCH; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
// a convenience to use all the metrics above as a list
alias Metrics = AliasSeq!(met_l1_miss,met_l1_hit,met_data_acc,met_all_acc);
void main(string[] argv) {
    auto samples = iota(1_00);
    auto meas = new TI[samples.length];
    auto results = new TO[samples.length];
    // Initialize some values for the measured samples
    foreach(i, ref m; meas){
        with(m){ proc_cyc = 1_000_000+i*2; DATA_RD = 1000+i; DATA_WR= 2000+i; INST_FETCH=proc_cyc/2; L1I_HIT= INST_FETCH-100; L1I_MISS=100; L1D_HIT= DATA_RD+DATA_WR - 200; L1D_MISS=200;}
    }
    std.datetime.StopWatch sw;
    sw.start();
    ref TI getTerm(int i) { return meas[i]; }
    // compute the metric results for the above measured sample values in parallel
    taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results);
    TR rv1 = met_l1_miss( meas[0]);
    TR rv2 = met_l1_hit( meas[0]);
    TR rv3 = met_data_acc( meas[0]);
    TR rv4 = met_all_acc( meas[0]);
    // how long did this take
    long exec_ms = sw.peek().msecs;
    writeln("measurements:", meas[0]);
    writeln("rv1:", rv1);
    writeln("rv2:", rv2);
    writeln("rv3:", rv3);
    writeln("rv4:", rv4);
    writeln("results:", results[1]);
    writeln("time:", exec_ms);
}
Re: an example of parallel calculation of metrics
This compiles and appears to execute correctly, but if I uncomment the taskPool line I get a compile error message about wrong buffer type. Am I breaking some rule for std.parallelism.amap?
import std.algorithm, std.parallelism, std.range;
import std.stdio;
import std.datetime;
import std.typecons;
import std.meta;
// define some input measurement sample tuples and output metric tuples
alias TR = Tuple!(double,"per_sec", double, "per_cycle", long,"raw");
alias TI = Tuple!(long, "proc_cyc", long, "DATA_RD", long, "DATA_WR", long, "INST_FETCH", long, "L1I_MISS", long, "L1I_HIT", long,"L1D_HIT", long, "L1D_MISS");
alias TO = Tuple!(TR,"L1_MISS", TR, "L1_HIT", TR,"DATA_ACC", TR,"ALL_ACC");
const double CYC_PER_SEC = 1_600_000_000;
// various metric definitions
// using Tuples with defined names for each member, and use the names here in the metrics.
TR met_l1_miss ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = L1I_MISS+L1D_MISS; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_l1_hit ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = L1I_HIT+L1D_HIT; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_data_acc ( ref TI m){
    TR rv;
    with(rv) with(m) { raw = DATA_RD+DATA_WR; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
TR met_all_acc( ref TI m){
    TR rv;
    with(rv) with(m) { raw = DATA_RD+DATA_WR+INST_FETCH; per_cycle = cast(double)raw/proc_cyc; per_sec = per_cycle*CYC_PER_SEC;}
    return rv;
}
// a convenience to use all the metrics above as a list
alias Metrics = AliasSeq!(met_l1_miss,met_l1_hit,met_data_acc,met_all_acc);
void main(string[] argv) {
    auto samples = iota(1_00);
    auto meas = new TI[samples.length];
    auto results = new TO[samples.length];
    // Initialize some values for the measured samples
    foreach(i, ref m; meas){
        with(m){ proc_cyc = 1_000_000+i*2; DATA_RD = 1000+i; DATA_WR= 2000+i; INST_FETCH=proc_cyc/2; L1I_HIT= INST_FETCH-100; L1I_MISS=100; L1D_HIT= DATA_RD+DATA_WR - 200; L1D_MISS=200;}
    }
    std.datetime.StopWatch sw;
    sw.start();
    ref TI getTerm(int i) { return meas[i]; }
    // compute the metric results for the above measured sample values in parallel
    //taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results);
    TR rv1 = met_l1_miss( meas[0]);
    TR rv2 = met_l1_hit( meas[0]);
    TR rv3 = met_data_acc( meas[0]);
    TR rv4 = met_all_acc( meas[0]);
    // how long did this take
    long exec_ms = sw.peek().msecs;
    writeln("measurements:", meas[0]);
    writeln("rv1:", rv1);
    writeln("rv2:", rv2);
    writeln("rv3:", rv3);
    writeln("rv4:", rv4);
    writeln("results:", results[1]);
    writeln("time:", exec_ms);
}
Re: an example of parallel calculation of metrics
On Wednesday, 30 September 2015 at 22:24:25 UTC, Jay Norwood wrote: // various metric definitions // the Tuples could also define names for each member and use the names here in the metrics. long met1( TI m){ return m[0] + m[1] + m[2]; } long met2( TI m){ return m[1] + m[2] + m[3]; } long met3( TI m){ return m[0] - m[1] + m[2]; } long met4( TI m){ return m[0] + m[1] - m[2]; } should use reference parameters here: long met1( ref TI m){ return m[0] + m[1] + m[2]; } long met2( ref TI m){ return m[1] + m[2] + m[3]; } long met3( ref TI m){ return m[0] - m[1] + m[2]; } long met4( ref TI m){ return m[0] + m[1] - m[2]; }
an example of parallel calculation of metrics
This is something I'm playing with for work. We do this a lot, capture counter events for some number of on-chip performance counters, compute some metrics, display the outputs. This seems ideal for the application. import std.algorithm, std.parallelism, std.range; import std.stdio; import std.datetime; import std.typecons; import std.meta; // define some input measurement sample tuples and output metric tuples alias TI = Tuple!(long, long, long, long, long); alias TO = Tuple!(long, long, long, long); // various metric definitions // the Tuples could also define names for each member and use the names here in the metrics. long met1( TI m){ return m[0] + m[1] + m[2]; } long met2( TI m){ return m[1] + m[2] + m[3]; } long met3( TI m){ return m[0] - m[1] + m[2]; } long met4( TI m){ return m[0] + m[1] - m[2]; } // a convenience to use all the metrics above as a list alias Metrics = AliasSeq!(met1,met2,met3,met4); void main(string[] argv) { auto samples = iota(1_000); auto meas = new TI[samples.length]; auto results = new TO[samples.length]; // Initialize some values for the measured samples foreach(i, ref m; meas){ m[0] = i; m[1] = i+1; m[2] = i+2; m[3] = i+3; m[4] = i+4; } std.datetime.StopWatch sw; sw.start(); ref TI getTerm(int i) { return meas[i]; } // compute the metric results for the above measured sample values in parallel taskPool.amap!(Metrics)(std.algorithm.map!getTerm(samples),results); // how long did this take long exec_ms = sw.peek().msecs; writeln("results:", results); writeln("time:", exec_ms); }
Re: Parallel processing and further use of output
On Saturday, 26 September 2015 at 15:56:54 UTC, Jay Norwood wrote: This results in a compile error: auto sum3 = taskPool.reduce!"a + b"(iota(1UL,101UL)); I believe there was discussion of this problem recently ... https://issues.dlang.org/show_bug.cgi?id=14832 https://issues.dlang.org/show_bug.cgi?id=6446 looks like the problem has been reported a couple of times. I probably saw the discussion of the 8/22 bug.
Re: Parallel processing and further use of output
This is a work-around to get a ulong result without having the ulong as the range variable. ulong getTerm(int i) { return i; } auto sum4 = taskPool.reduce!"a + b"(std.algorithm.map!getTerm(iota(11)));
Re: Parallel processing and further use of output
btw, on my corei5, in debug build, reduce (using double): 11msec non_parallel: 37msec parallel with atomicOp: 123msec so, that is the reason for using parallel reduce, assuming the ulong range thing will get fixed.
Re: Parallel processing and further use of output
std.parallelism.reduce documentation provides an example of a parallel sum. This works: auto sum3 = taskPool.reduce!"a + b"(iota(1.0,101.0)); This results in a compile error: auto sum3 = taskPool.reduce!"a + b"(iota(1UL,101UL)); I believe there was discussion of this problem recently ...
Re: Are there any Phobos functions to check file permissions on Windows and Posix?
On Monday, 7 September 2015 at 15:48:56 UTC, BBasile wrote: On Sunday, 6 September 2015 at 23:05:29 UTC, Jonathan M Davis For example you can retrieve the flags: archive/readonly/hidden/system/indexable(?) and even if it looks writable or readable, the file won't be open at all because the ACL for the file doesn't include the current user account. This guy enhanced the cross-platform fox toolkit capabilities. It would be worth looking at it if you intend to write something similar. See section 7, "Superior provision of host OS facilities portably": http://tnfox.sourceforge.net/TnFOX/html/main.html
Re: Dynamic array and foreach loop
On Sunday, 9 August 2015 at 19:10:01 UTC, Binarydepth wrote: On Sunday, 9 August 2015 at 16:42:16 UTC, Jay Norwood wrote: Oooh... I like how this works import std.stdio : writeln, readf; void main() { immutable a=5; int[a] Arr; int nim; foreach(num, ref nem; Arr) { readf(" %s", &nem); } foreach(num; Arr) { writeln(num); } } you can also do something like this to accept blank separated input values on a single line. import std.stdio : writeln, readln; import std.string: split; import std.conv: to; import std.algorithm: each; void main() { double [] Arr; Arr = readln().split().to!(double[]); Arr.each!writeln(); }
Re: Dynamic array and foreach loop
On Sunday, 9 August 2015 at 15:37:23 UTC, Binarydepth wrote: So I should use the REF like this ? import std.stdio : writeln; void main() { immutable a=5; int[a] Arr; foreach(num; 0..a) { Arr[num] = num; } foreach(num, ref ele; Arr) { writeln(Arr[ele]+1);//Using the REF } } The reference v is to the array member in this case, rather than making a copy. In the last loop c is a copy. No big deal for this case of int Arr members, but if Arr was made up of struct members, you might not want to be making copies. The i+3 initialization is just so you can see that v is the Arr member (not the index) in the other loops. import std.stdio : writeln; void main() { immutable a=5; int[a] Arr; foreach(i, ref v; Arr) { v = i+3; } foreach( ref v; Arr) { writeln(v); } foreach( c; Arr) { writeln(c); } }
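The point about struct elements can be sketched as follows. `Counter` and the two helper functions are hypothetical names used only to show the difference between iterating by copy and by reference.

```d
import std.stdio : writeln;

struct Counter { int hits; }

// Without ref, each loop variable is a copy, so the array is left unchanged.
void bumpByCopy(Counter[] arr)
{
    foreach (c; arr)
        c.hits++;
}

// With ref, the loop variable refers to the array element itself.
void bumpByRef(Counter[] arr)
{
    foreach (ref c; arr)
        c.hits++;
}

void main()
{
    auto arr = [Counter(1), Counter(5)];
    bumpByCopy(arr);
    writeln(arr[0].hits, " ", arr[1].hits); // still 1 5
    bumpByRef(arr);
    writeln(arr[0].hits, " ", arr[1].hits); // now 2 6
}
```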
Re: file rawRead and rawWrite in chunks example
On Sunday, 9 August 2015 at 10:40:06 UTC, Nordlöw wrote: On Sunday, 9 August 2015 at 00:50:16 UTC, Ali Çehreli wrote: Ali Now benchmarks write and read separately: I benchmarked my first results: D:\visd\raw\raw\Release>raw time write msecs:457 time read msecs:75 This is for 160MB of data. The write includes initialization of the values. The read time is faster than my ssd drive, so I have to assume this is win7 or the ssd caching the data. If I increase double count to 200,000,000 (to 1.6GB of data), the times are: D:\visd\raw\raw\Release>raw time write msecs:7236 time read msecs:11979 08/09/2015 10:12 AM 1,600,000,000 numberList.db So that's around 220MB/sec for the writes and 133MB/sec for the reads. That's an intel 520 series 180GB ssd, but in an SATA 3Gb/s interface in a laptop. Sequential write speed for that ssd should be about 257MB/sec. Sequential read should be close to 395MB/sec for this drive on a 6Gb/sec SATA. So read speed is lower than I'd expect. If I move this program over to my work computer, the same 1.6GB measurement returns these times below on a Samsung 840 SSD, which is on a 6Gb/sec SATA interface. I believe the 458MB/sec write speeds. I suspect the read timing is again just measuring win7's cached data. J:\visd>raw time write msecs:3489 time read msecs:579
Re: file rawRead and rawWrite in chunks example
On Sunday, 9 August 2015 at 11:06:34 UTC, Nordlöw wrote: On Sunday, 9 August 2015 at 10:40:06 UTC, Nordlöw wrote: Couldn't the chunk logic be deduced as well? Yes :) See update at: https://github.com/nordlow/justd/blob/a633b52876388921ec49c189f374746f7b4d8c93/tests/t_rawio.d What would a suitable value for `preferred_disk_write_size` be? Is there a suitable constant somewhere in Phobos? So, to be clear, I think you must be saying that you want to specify the disk chunk size separately from the array size. Is that correct? I stepped through the original code (with the foreach loops) and I see single calls to fwrite and fread for each array. The rawWrite is executing a single fwrite per array f.rawWrite(elem.array()) auto result = .fwrite(buffer.ptr, T.sizeof, buffer.length, _p.handle); The rawRead is executing a single fread per array immutable result = fread(buffer.ptr, T.sizeof, buffer.length, _p.handle);
Re: file rawRead and rawWrite in chunks example
On Sunday, 9 August 2015 at 00:50:16 UTC, Ali Çehreli wrote: // NOTE: No need to tell rawRead the type as double iota(10, 20_000_000 + 10, n) .each!(a => f.rawRead(dbv)); } Ali Your f.rawRead(dbv) form compiles, but f.rawRead!(dbv) results in a compiler error in 2.067.1. The f.rawRead!(double)(dbv) form works. Error: template instance rawRead!(dbv) does not match template declaration rawRead(T)(T[] buffer)
Re: file rawRead and rawWrite in chunks example
On Sunday, 9 August 2015 at 00:50:16 UTC, Ali Çehreli wrote: { auto f = File(fn,"wb"); iota(10.5, 20_000_010.5, 1.0) .chunks(100) .each!(a => f.rawWrite(a.array)); } Ali Thanks. There are many examples of numeric to string data output in the docs, saving byLine. Those are on the order of 30x slower than this rawWrite example. This will be more useful to many people.
Re: Dynamic array and foreach loop
On Saturday, 8 August 2015 at 18:28:25 UTC, Binarydepth wrote: This is the new code : foreach(num; 0..liEle) {//Data input loop write("Input the element : ", num+1, " "); readf(" %d", &liaOrig[num]); } Even better : foreach(num; 0..liaOrig.length I believe they usually do something like: foreach( num, ref elem; liaOrig){ } which creates the index num and the reference to the element of range liaOrig. It also seems that a lot of discussion is going on about reducing use of foreach loops in their preferred style, so you might want to try some of that.
file rawRead and rawWrite in chunks example
I'm playing around with the range based operations and with raw file io. I couldn't figure out a way to get rid of the outer foreach loops. Nice execution time of 537 msec for this, which creates and reads back a file of about 160MB (20_000_000 doubles).
import std.algorithm;
import std.stdio;
import std.conv;
import std.math;
import std.range;
import std.file;
import std.datetime;
import std.array;
void main() {
    auto fn = "numberList.db";
    auto f = File(fn,"wb");
    scope(exit) std.file.remove(fn);
    std.datetime.StopWatch sw;
    sw.start();
    // write the doubles in chunks of 100
    foreach(elem; chunks(iota(10.5,20_000_010.5,1.0),100)){ f.rawWrite(elem.array()); }
    f.close();
    f = File(fn,"rb");
    const int n = 100;
    double[] dbv = new double[n];
    // read the file back, n doubles at a time
    foreach(i; iota(10,20_000_000+10,n)){ f.rawRead!(double)(dbv); }
    f.close();
    long tm = sw.peek().msecs;
    writeln("time msecs:", tm);
}
Re: std.parallelism taskPool.map example throws exception
Unfortunately, this is not a very good example for std.parallelism, since the measured times are better using the std.algorithm.map calls. I know from past experience that std.parallelism routines can work well when the work is spread out correctly, so this example could be improved. This is parallel D:\visd\map\map\Release>map sum=1.17335e+07 time msecs:1242 Non-parallel D:\visd\map\map\Release>map sum=1.17335e+07 time msecs:970 I think this example import std.parallelism; import std.algorithm; import std.stdio; import std.conv; import std.math; import std.range; import std.file; import std.datetime; void main() { auto fn = "numberList.txt"; auto f = File(fn,"w"); scope(exit) std.file.remove(fn); foreach (i ; iota(10.0,2_000_000.0)){ f.writefln("%g",i+0.5); } f.close(); std.datetime.StopWatch sw; sw.start(); auto lineRange = File(fn).byLineCopy(); auto chomped = std.algorithm.map!"a.chomp"(lineRange); auto nums = std.algorithm.map!(to!double)(chomped); auto logs = std.algorithm.map!log10(nums); double sum = 0; foreach(elem; logs) { sum += elem; } long tm = sw.peek().msecs; writeln("sum=",sum); writeln("time msecs:", tm); }
Re: std.parallelism taskPool.map example throws exception
And, finally, this works using taskPool.map, as in the std.parallelism example. So the trick appears to be that the call to chomp is needed.

    auto lineRange = File(fn).byLineCopy();
    auto chomped = std.algorithm.map!"a.chomp"(lineRange);
    auto nums = taskPool.map!(to!double)(chomped);
    auto logs = taskPool.map!log10(nums);
    double sum = 0;
    foreach(elem; logs) {
        sum += elem;
    }
    writeln("sum=",sum);
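Why chomp matters can be shown without any files: to!double rejects a string that has anything left over after the number, including a line terminator. The sketch below assumes the offending character is a trailing '\r' or '\n'; the thread does not confirm which one it was. The helper name parseLine is made up for illustration.

```d
import std.conv : ConvException, to;
import std.exception : assertThrown;
import std.string : chomp;

/// Parses one line as a double, ignoring a single trailing line terminator.
/// Hypothetical helper, not part of the original example.
double parseLine(const(char)[] line)
{
    return line.chomp.to!double;
}

void main()
{
    // to!double needs the whole string to be a number.
    assertThrown!ConvException("10.5\r".to!double);
    assertThrown!ConvException("".to!double);

    // chomp strips one trailing "\r", "\n" or "\r\n" before converting.
    assert(parseLine("10.5\r") == 10.5);
    assert(parseLine("10.5\r\n") == 10.5);
    assert(parseLine("10.5") == 10.5);
}
```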
std.parallelism taskPool.map example throws exception
I tried to create a working example from the std.parallelism taskPool.map code, and it throws, with empty strings of length 1 being passed to to!double. Anyone have a working example? I'm building on Windows with dmd 2.067.1.

    import std.parallelism;
    import std.algorithm;
    import std.stdio;
    import std.conv;
    import std.math;
    import std.range;
    import std.file;

    void main()
    {
        auto fn = "numberList.txt";
        auto f = File(fn,"w");
        scope(exit) std.file.remove(fn);
        foreach (i ; iota(10.0,2_000.0)){
            f.writefln("%g",i+0.5);
        }
        f.close();
        auto lineRange = File(fn).byLine();
        auto dupedLines = std.algorithm.map!"a.idup"(lineRange);
        auto nums = taskPool.map!(to!double)(dupedLines);
        auto logs = taskPool.map!log10(nums);
        double sum = 0;
        foreach(elem; logs) {
            sum += elem;
        }
        writeln("sum=",sum);
    }
Re: std.parallelism example hangs compiler 2.067.1
On Friday, 7 August 2015 at 18:51:45 UTC, Steven Schveighoffer wrote: On 8/7/15 2:37 PM, Steven Schveighoffer wrote: I'll file a bug on this. https://issues.dlang.org/show_bug.cgi?id=14886 -Steve Thanks. The workaround works ok.
Re: std.parallelism taskPool.map example throws exception
This also works.

    auto sm = File(fn).byLineCopy()
        .map!"a.chomp"()
        .map!(to!double)
        .map!"a.log10"()
        .sum();
    writeln("sum=",sm);
Re: std.parallelism taskPool.map example throws exception
This appears to work ... at least, no exception:

    auto sm = File(fn).byLine(KeepTerminator.no)
        .map!"a.chomp"()
        .map!"a.idup"()
        .map!(to!double)
        .map!"a.log10"()
        .sum();
    writeln("sum=",sm);
std.parallelism example hangs compiler 2.067.1
This appears to hang the dmd 2.067.1 compiler. Changing parallel(s) to s works ok. Is this a known problem?

    import std.stdio;
    import std.string;
    import std.format;
    import std.range;
    import std.parallelism;

    int main(string[] argv)
    {
        string s[10];
        foreach (i, ref si ; parallel(s)){
            si = format("hi:%d",i);
        }
        foreach (ref rm; s[99000..99010]){
            writeln(rm);
        }
        return 0;
    }
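For comparison, here is a sketch of the same loop over a dynamic array large enough for the slice that gets printed. It is not necessarily the workaround from the bug report, which isn't spelled out in this thread; the helper name makeLabels and the array size are assumptions.

```d
import std.format : format;
import std.parallelism : parallel;

/// Fills a dynamic array of labels in parallel. Each iteration writes only
/// its own element, so no locking is needed. Hypothetical helper.
string[] makeLabels(size_t count)
{
    auto labels = new string[](count);
    foreach (i, ref label; parallel(labels))
        label = format("hi:%d", i);
    return labels;
}

void main()
{
    import std.stdio : writeln;

    auto labels = makeLabels(100_000);
    assert(labels[99_000] == "hi:99000");
    foreach (ref rm; labels[99_000 .. 99_010])
        writeln(rm);
}
```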
Re: ctfe and static arrays
On Sunday, 24 May 2015 at 18:14:19 UTC, anonymous wrote: "Static array" has a special meaning. It does not mean "static variable with an array type". Static arrays are those of the form Type[size]. That is, the size is known statically. Examples:
1) static int[5] x; -- x is a static variable with a static array type
2) static int[] x; -- static variable, dynamic array
3) int[5] x; -- non-static variable, static array
4) int[] x; -- non-static variable, dynamic array
So, CTFE can't handle examples 1 and 2, because they're static variables. 3 and 4 are fine.

From your description, I would expect this to fail, since I would expect it to be included in 2 above, but it builds and prints ok.

    import std.stdio;

    struct A { int me; int next; int prev;}

    A[] initA(int n)
    {
        if (!__ctfe){
            assert(false);
        }
        A[] v = new A[n];
        foreach (i; 0..n){
            v[i].me = i;
            v[i].prev = i-1;
            v[i].next = i+1;
        }
        return v;
    }

    int main(string[] argv)
    {
        enum int N = 100;
        static A[] linkedA = initA(N);
        writefln("%s",linkedA);
        return 0;
    }
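As far as I can tell, the distinction is between code that runs during CTFE touching a static variable (not allowed) and using the result of a CTFE call to initialize one (allowed, since a static initializer has to be a compile-time constant anyway). A minimal sketch, with the function and variable names made up for illustration:

```d
/// Ordinary function; usable at run time or during CTFE because it
/// touches no static or global state.
int[] squares(int n)
{
    auto v = new int[n];
    foreach (i; 0 .. n)
        v[i] = i * i;
    return v;
}

// A static initializer must be known at compile time, so this forces CTFE.
static immutable int[] table = squares(5);
static assert(table[4] == 16);

void main()
{
    // Static local with a dynamic array type, initialized from a CTFE result.
    static int[] local = squares(3);
    assert(local == [0, 1, 4]);
    assert(table == [0, 1, 4, 9, 16]);
}
```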
ctfe and static arrays
I'm a bit confused by the documentation of the CTFE limitations with respect to static arrays, due to these seemingly conflicting statements, and the examples didn't seem to clear anything up. I was wondering if anyone has examples of clever things that might be done with static arrays and pointers using CTFE.

2. Executed expressions may not reference any global or local static variables.

C-style semantics on pointer arithmetic are strictly enforced. Pointer arithmetic is permitted only on pointers which point to static or dynamic array elements.
Re: How to make a Currency class from std.BigInt?
This library allows you to specify the internal base of the arbitrary-precision numbers (the default is decimal), as well as the precision of floating-point values. Each floating-point number's precision can be read with .precision(). It also supports specification of rounding modes. Seems like it would be a nice project for a port to D. http://www.hvks.com/Numerical/arbitrary_precision.html
Re: Remove filename from path
On Friday, 26 September 2014 at 03:32:46 UTC, Jay Norwood wrote: On Wednesday, 24 September 2014 at 10:28:05 UTC, Suliman wrote: string path = thisExePath() Seems like "dirName" in std.path is a good candidate ;) http://dlang.org/phobos/std_path.html#.dirName You'll find many other path manipulation functions there. Thanks! But if I want to strip it, how I can cut it? dirName gives the directory, baseName the filename, stripExtension strips it, so seems like what you want is dirName ~ stripExtension( baseName ) easier than that. Looks like stripExtension handles the whole path. assert (stripExtension("dir/file.ext") == "dir/file");
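For reference, a short sketch of what each of the std.path functions mentioned above returns for the same input:

```d
import std.path : baseName, dirName, stripExtension;

void main()
{
    enum path = "dir/file.ext";

    assert(dirName(path) == "dir");                  // directory part only
    assert(baseName(path) == "file.ext");            // file name with extension
    assert(stripExtension(path) == "dir/file");      // keeps the directory
    assert(stripExtension(baseName(path)) == "file"); // bare file name
}
```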
Re: Remove filename from path
On Wednesday, 24 September 2014 at 10:28:05 UTC, Suliman wrote: string path = thisExePath() Seems like "dirName" in std.path is a good candidate ;) http://dlang.org/phobos/std_path.html#.dirName You'll find many other path manipulation functions there. Thanks! But if I want to strip it, how I can cut it? dirName gives the directory, baseName the filename, stripExtension strips it, so seems like what you want is dirName ~ stripExtension( baseName )
exporting analysispoint labels into symbol tables
I have a use case that requires repeating performance measurements of blocks of code that do not coincide with function start and stop. For example, a function will be calling several sub-operations, and I need to measure the execution from the call statement until the execution of the statement following the call. So, ideally, I'd like to mark the start and stop points in the source code with label pairs, and have these exported as symbol/address pairs. I would read these label names with an external app, and would set up the performance measurement start and stop window boundaries without modifying the target code. Does D provide any feature that would allow me to export such labels? I've seen some discussion of use of goto labels within the program, but nothing about exporting them for use by an external app. I've also read through the recent info on user annotations, but those seem to be associated with data properties and it isn't apparent to me if they could provide program address info. Thanks, Jay
Re: dual with statement
On Friday, 25 July 2014 at 21:10:56 UTC, monarch_dodra wrote: Functionally nothing more than an alias? EG: { alias baz = foo.bar; ... }

Yes, it is all just alias. So

    with ( (d,e,a,b,c) as (ar.rm.a, ar.rm.b, ar.r.a, ar.r.b, ar.r.c)){
        d = a + c;
        e = (c==0)?0:(a+b)/c;
    }

could be instead

    {
        alias d = ar.rm.a;
        alias e = ar.rm.b;
        alias a = ar.r.a;
        alias b = ar.r.b;
        alias c = ar.r.c;
        d = a + c;
        e = (c==0)?0:(a+b)/c;
    }

I guess this means I don't need WITH.
Re: dual with statement
On Friday, 25 July 2014 at 01:54:53 UTC, Jay Norwood wrote: I don't recall the exact use case for the database expressions, but I believe they were substituting a simple symbol for the fully qualified object.

The SQL with clause is quite a bit different than I remembered. For one thing, I had the order reversed, so it would have been with (a as something.x.y). It looks more like they are creating a temporary tuple from some more complicated selection. It isn't clear what the implementation is, just that it might be a more concise way of stating things so that the expressions inside the body can be simple. So, my prior example

    with (ar.rm.a as d) with (ar.rm.b as e) with (ar.r.a as a) with (ar.r.b as b) with (ar.r.c as c){
        d = a + c;
        e = (c==0)?0:(a+b)/c;
    }

would reduce to something maybe simpler to read. The SQL seems to be using this as a foreach-type operation, but I was just interested in the name substitution, so that the expressions inside the with block could be simpler and the struct member name clashes would go away.

    with ( (d,e,a,b,c) as (ar.rm.a, ar.rm.b, ar.r.a, ar.r.b, ar.r.c)){
        d = a + c;
        e = (c==0)?0:(a+b)/c;
    }
Re: dual with statement
On Thursday, 24 July 2014 at 20:16:53 UTC, monarch_dodra wrote: Or did I miss something?

Yes, sorry, I should have pasted a full example previously. The code at the end is with the Raw_met members renamed (they were originally a and b but clashed). So, if Raw_met members were still a and b ...

    with(ar.r as v) with (ar.rm as res){
        res.a = v.a + v.c;
        res.b = (v.c==0)? 0: (v.a + v.b)/ v.c;
    }

I don't recall the exact use case for the database expressions, but I believe they were substituting a simple symbol for the fully qualified object. So they would have done something analogous to this ...

    with (ar.rm.a as d) with (ar.rm.b as e) with (ar.r.a as a) with (ar.r.b as b) with (ar.r.c as c){
        d = a + c;
        e = (c==0)?0:(a+b)/c;
    }

So, the expressions are simpler to read, and I've fully qualified the members only a single time.

This is the code for one of the metric experiments, with the members renamed to avoid the clashes. I like that the expressions can be so simple.

    double ref_clock=1_600_000_000.0;

    struct Raw_met {long d; double e;}
    struct Per_sec_met {double d; double e; }
    struct Per_cyc_met {double d; double e; }
    struct Raw {long proc_cyc; long a; long b; long c; }
    struct Per_sec {double proc_cyc; double a; double b; double c; }
    struct Per_cyc {double proc_cyc; double a; double b; double c; }
    struct All { Raw r; Per_sec ps; Per_cyc pc; Raw_met rm; Per_sec_met psm; Per_cyc_met pcm;}

    void calc_raw_met(ref All ar){
        with(ar.r) with(ar.rm){
            d = a+c;
            e = (c==0)?0:(a+b)/c;
        }
    }
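A trimmed-down, runnable version of the renamed-member experiment, keeping only the two structs the function touches. The input values are made up for illustration.

```d
struct Raw     { long proc_cyc; long a; long b; long c; }
struct Raw_met { long d; double e; }
struct All     { Raw r; Raw_met rm; }

void calc_raw_met(ref All ar)
{
    // Both structs are in scope at once; this works only because their
    // member names do not overlap.
    with (ar.r) with (ar.rm)
    {
        d = a + c;
        e = (c == 0) ? 0 : (a + b) / c;  // integer division: a, b, c are long
    }
}

void main()
{
    All ar;
    ar.r = Raw(1_000, 3, 5, 2);   // proc_cyc, a, b, c
    calc_raw_met(ar);
    assert(ar.rm.d == 5);         // 3 + 2
    assert(ar.rm.e == 4.0);       // (3 + 5) / 2
}
```

Note that (a+b)/c is evaluated with long operands, so the quotient is truncated before it is stored in the double.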
dual with statement
I was playing around with use of the dual WITH statement. I like the idea, since it makes the code within the with cleaner. Also, I got the impression from one of the conference presentations ... maybe the one on the ARM debug ... that there are some additional optimizations available when the compiler processes the WITH statement block.

Anyway, a problem I ran into was when two structures had the same member names, for example struct ar.r and ar.psm in the case below. In this case, there was no way for the compiler to determine from which structure to get the member.

    void calc_per_sec_met(ref All ar){
        with (ar.r) with(ar.psm) {
            double per_sec = proc_cyc/ref_clock;
            d = (a+c)*per_sec;
            e = (c==0)?0:(a+b)/c;
        }
    }

Ok, so I guess I could make all the member names unique in the different structures, but that's kind of ugly. Also, what happens when using two structs of the same type in a WITH statement? Seems like something like this would help, which I believe I've seen used in database queries ...

    with (ar.r1 as r1) with (ar.r2 as r2){
        auto sum = r1.a + r2.a;
    }
Re: Function to print a diamond shape
On Tuesday, 22 April 2014 at 15:25:04 UTC, monarch_dodra wrote: Yeah, that's because join actually works on "RoR, R", rather than "R, E". This means if you feed it a "string[], string", then it will actually iterate over individual *characters*. Not only that, but since you are using char[], it will decode them too. "join" is faster for 2 reasons: 1) It detects you want to joins arrays, so it doesn't have to iterate over them: It just glues them "slice at once" 2) No UTF decoding. I kind of wish we had a faster joiner, but I think it would have made the call ambiguous.

Ok, thanks. I re-tried joiner with both parameters being ranges, but there was no improvement in execution speed. I thought perhaps from your comments that it might work.

    char[] nl = uninitializedArray!(char[])(1);
    nl[] = '\n';
    write(joiner(wc,nl));
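A small sketch of the difference in how the two are used: join builds the whole result as one array, while joiner returns a lazy range that hands out one decoded character at a time. It shows the behaviour only and doesn't measure anything.

```d
import std.algorithm : equal, joiner;
import std.array : join;

void main()
{
    string[] rows = [" *", "***", " *"];

    // join is eager: one allocation holding the complete result.
    string eager = rows.join("\n");
    assert(eager == " *\n***\n *");

    // joiner is lazy: elements are produced on demand as the range is walked.
    auto lazyRange = rows.joiner("\n");
    assert(lazyRange.equal(eager));
}
```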
Re: Function to print a diamond shape
Wow, joiner is much slower than join. Such a small choice can make this big of a difference. Not at all expected, since the lazy calls, I thought, were considered to be more efficient. This is with ldc2 -O2.

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 1>/dev/null
    brad: time: 21958[ms]
    sergei: time: 24629[ms]
    jay2: time: 259[ms]
    diamondShape: time: 6701[ms]
    printDiamond: time: 194[ms]
    printDiamonde2a: time: 95[ms]
    printDiamonde2b: time: 92[ms]
    printDiamond3: time: 144[ms]
    printDiamonde2monarch: time: 67[ms]
    printDiamonde2cJoin: time: 96[ms]
    printDiamonde2cJoiner: time: 16115[ms]

    void printDiamonde2cJoin(in uint N)
    {
        int n,l;
        size_t N2 = N/2;
        size_t NM1 = N-1;
        char[] p = uninitializedArray!(char[])(N2+N);
        p[0..N2] = ' ';
        p[N2..$] = '*';
        char[] nl = uninitializedArray!(char[])(1);
        nl[] = '\n';
        char[][] wc = minimallyInitializedArray!(char[][])(N);
        for(n=0,l=0; n
Re: Function to print a diamond shape
On Monday, 21 April 2014 at 08:26:49 UTC, monarch_dodra wrote: The two "key" points here, first, is to avoid using appender. Second, instead of having two buffer: "" and "**\n", and two do two "slice copies", to only have 1 buffer " *", and to do 1 slice copy, and a single '\n' write. At this point, I'm not sure how we could be going any faster, short of using alloca... How does this hold up on your environment?

Yes, your solution is the fastest yet. Also, its times are similar for all three compilers. The range of execution times varied for different solutions from over 108 seconds down to 64 msec.

I see that RefAppender's data() returns the managed array. Can write() handle that? It seems that would be more efficient than duplicating the character buffer ... or perhaps writing directly to an OutBuffer, and then sending that to write() would avoid the duplication?

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ gdc -O2 main.d
    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./a.out 1>/dev/null
    brad: time: 31865[ms]
    sergei: time: 28596[ms]
    jay2: time: 258[ms]
    diamondShape: time: 7512[ms]
    printDiamond: time: 200[ms]
    printDiamonde2a: time: 140[ms]
    printDiamonde2b: time: 137[ms]
    printDiamond3: time: 503[ms]
    printDiamonde2monarch: time: 86[ms]

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ dmd -release main.d
    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 1>/dev/null
    brad: time: 108111[ms]
    sergei: time: 33949[ms]
    jay2: time: 282[ms]
    diamondShape: time: 24567[ms]
    printDiamond: time: 230[ms]
    printDiamonde2a: time: 132[ms]
    printDiamonde2b: time: 106[ms]
    printDiamond3: time: 222[ms]
    printDiamonde2monarch: time: 66[ms]

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ~/ldc/bin/ldc2 -O2 main.d
    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 1>/dev/null
    brad: time: 20996[ms]
    sergei: time: 24841[ms]
    jay2: time: 259[ms]
    diamondShape: time: 6797[ms]
    printDiamond: time: 194[ms]
    printDiamonde2a: time: 91[ms]
    printDiamonde2b: time: 87[ms]
    printDiamond3: time: 145[ms]
    printDiamonde2monarch: time: 64[ms]
    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$
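The single-buffer idea can be sketched as follows. This is not monarch_dodra's actual code, which isn't shown here; it's my reading of the description, it assumes N is odd and at least 1, and it collects the rows into a string (using an appender for simplicity) so the result can be checked rather than written straight to stdout. The function name diamond is made up.

```d
import std.array : appender;

/// Builds an N-row diamond from slices of one "spaces then stars" buffer.
/// Assumes N is odd and >= 1. Hypothetical sketch of the described idea.
string diamond(uint N)
{
    immutable size_t N2 = N / 2;
    auto buf = new char[N2 + N];
    buf[0 .. N2] = ' ';
    buf[N2 .. $] = '*';

    auto w = appender!string();

    // Row n has N2 - n spaces followed by 2n + 1 stars, which is exactly
    // the slice starting n characters into the buffer.
    void row(size_t n)
    {
        w.put(buf[n .. N2 + 2 * n + 1]);
        w.put('\n');
    }

    foreach (n; 0 .. N2 + 1)
        row(n);
    foreach_reverse (n; 0 .. N2)
        row(n);
    return w.data;
}

void main()
{
    import std.stdio : write;

    assert(diamond(3) == " *\n***\n *\n");
    write(diamond(5));
}
```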
Re: Function to print a diamond shape
On Tuesday, 25 March 2014 at 08:42:30 UTC, monarch_dodra wrote: Interesting. I'd have thought the "extra copy" would be an overall slowdown, but I guess that's not the case.

I installed Ubuntu 14.04 64 bit, and measured some of these examples using gdc, ldc and dmd on a Core i3 box. The examples that wouldn't build had something to do with use of array.replicate and range.replicate conflicting in the libraries for the gdc and ldc builds, which were based on 2.064.2.

This is the ldc2 (0.13.0 alpha)(2.064.2) result:

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 1>/dev/null
    brad: time: 2107[ms]
    sergei: time: 2441[ms]
    jay2: time: 26[ms]
    diamondShape: time: 679[ms]
    printDiamond: time: 19[ms]
    printDiamonde2a: time: 9[ms]
    printDiamonde2b: time: 8[ms]
    printDiamond3: time: 14[ms]

This is the gdc(2.064.2) result:

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./a.out 1>/dev/null
    brad: time: 3216[ms]
    sergei: time: 2828[ms]
    jay2: time: 26[ms]
    diamondShape: time: 776[ms]
    printDiamond: time: 19[ms]
    printDiamonde2a: time: 13[ms]
    printDiamonde2b: time: 13[ms]
    printDiamond3: time: 51[ms]

This is the dmd(2.065) result:

    jay@jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 1>/dev/null
    brad: time: 10830[ms]
    sergei: time: 3480[ms]
    jay2: time: 29[ms]
    diamondShape: time: 2462[ms]
    printDiamond: time: 23[ms]
    printDiamonde2a: time: 13[ms]
    printDiamonde2b: time: 10[ms]
    printDiamond3: time: 23[ms]

So this printDiamonde2b example had the fastest time of the solutions, and had similar times on all three builds. The ldc2 compiler build is performing best in most examples on Ubuntu.

    void printDiamonde2b(in uint N)
    {
        uint N2 = N/2;
        // N2 spaces for the left padding
        char[] pSpace = uninitializedArray!(char[])(N2);
        pSpace[] = ' ';
        // N stars followed by a newline
        char[] pStars = uninitializedArray!(char[])(N+1);
        pStars[] = '*';
        pStars[$-1] = '\n';
        auto w = appender!(char[])();
        w.reserve(N*3);
        // top half including the middle row
        foreach (n ; 0 .. N2 + 1){
            w.put(pSpace[0 .. N2 - n]);
            w.put(pStars[$-2*n-2 .. $]);
        }
        // bottom half
        foreach_reverse (n ; 0 .. N2){
            w.put(pSpace[0 .. N2 - n]);
            w.put(pStars[$-2*n-2 .. $]);
        }
        write(w.data);
    }