Re: Dynamic bindings for OpenCL, libsndfile, FANN and BASS libraries
I've updated my derelictified dynamic bindings for:
- OpenCL, the Open Computing Language (thanks goes to vuaru)
- BASS, an audio library
- FANN, a neural network library
- libsndfile, a library which reads and writes a variety of audio files

Added ENet bindings: https://github.com/p0nce/DerelictENet
Re: [OT] My C++ talk at GoingNative 2013
On Friday, 13 September 2013 at 06:51:52 UTC, deadalnix wrote:

There are 2 "Ask Us Anything" panels. Can you tell us which one, and approximately when?

Yes, the first one ( http://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything ) around 01:14:20.
Re: [OT] My C++ talk at GoingNative 2013
On Monday, 9 September 2013 at 16:43:54 UTC, Andrei Alexandrescu wrote: http://www.reddit.com/r/programming/comments/1m1izv/goingnative_2013_writing_quick_code_in_c_quickly/ Andrei Just wanted to say thanks for posting the link, it was a great talk. Of the other talks I especially enjoyed those by the backend VC compiler guys, Compiler++ and Compiler Confidential. Watching your second talk on tuples I couldn't help thinking how much easier templates are in D. When reading C++ I find I have to make a context switch between run-time and compile-time thinking. It's quite jarring. In D, however, templates come naturally and it's very seamless.
Implementing and optimizing a simple graph metric
Hello all, I thought I'd do a writeup of the process of implementing and optimizing one of the graph metrics in Dgraph, starting from a fairly straight copy of pseudo-code in a research paper all through the various incremental tweaks that improve performance. http://braingam.es/2013/09/betweenness-centrality-in-dgraph/ I guess the optimizations made in this process are trivial for anyone experienced in D, but I hope the detailed description of what changes produce useful speedups might be useful for people new to the language. {Enj,Destr}oy :-) Best wishes, -- Joe
Re: Implementing and optimizing a simple graph metric
Joseph Rushton Wakeling:

http://braingam.es/2013/09/betweenness-centrality-in-dgraph/

Some small improvements:

T[] betweenness(Graph, T = double)(ref Graph g, bool[] ignore = null)
    if (isGraph!Graph && isFloatingPoint!T)
{
    enum T T0 = 0;
    enum T T1 = 1;
    auto centrality = minimallyInitializedArray!(typeof(return))(g.vertexCount);
    centrality[] = T0;
    auto stack = new size_t[g.vertexCount];
    auto sigma = minimallyInitializedArray!T(g.vertexCount);
    sigma[] = T0;
    auto delta = minimallyInitializedArray!T(g.vertexCount);
    delta[] = T0;
    auto d = minimallyInitializedArray!long(g.vertexCount);
    d[] = -1;
    auto q = VertexQueue(g.vertexCount);
    auto p = new Appender!(size_t[])[g.vertexCount];

    foreach (immutable s; 0 .. g.vertexCount)
    {
        if (ignore && ignore[s])
        {
            continue;
        }
        size_t stackLength = 0;
        assert(q.empty);
        sigma[s] = T1;
        d[s] = 0;
        q.push(s);

        while (!q.empty)
        {
            immutable size_t v = q.front;
            q.pop;
            stack[stackLength] = v;
            ++stackLength;
            foreach (immutable w; g.neighboursOut(v))
            {
                if (ignore && ignore[w])
                {
                    continue;
                }
                if (d[w] < 0)
                {
                    q.push(w);
                    d[w] = d[v] + 1;
                    assert(sigma[w] == T0);
                    sigma[w] = sigma[v];
                    p[w].clear;
                    p[w].put(v);
                }
                else if (d[w] > (d[v] + 1))
                {
                    /* w has already been encountered, but we've found
                       a shorter path to the source.  This is probably
                       only relevant to the weighted case, but let's
                       keep it here in order to be ready for that
                       update. */
                    d[w] = d[v] + 1L;
                    sigma[w] = sigma[v];
                    p[w].clear;
                    p[w].put(v);
                }
                else if (d[w] == (d[v] + 1))
                {
                    sigma[w] += sigma[v];
                    p[w].put(v);
                }
            }
        }

        while (stackLength > 0)
        {
            --stackLength;
            immutable w = stack[stackLength];
            foreach (immutable v; p[w].data)
            {
                delta[v] += ((sigma[v] / sigma[w]) * (T1 + delta[w]));
            }
            if (w != s)
            {
                centrality[w] += delta[w];
            }
            sigma[w] = T0;
            delta[w] = T0;
            d[w] = -1;
        }
    }

    static if (!g.directed)
    {
        centrality[] /= 2;
    }

    return centrality;
}

Just for a test, try to allocate all those arrays in a different way:
- First try a std.array.Array using the LDC2 compiler;
- Another thing to try is to allocate them on the stack using core.stdc.stdlib.alloca:

auto p = cast(T*)alloca(T.sizeof * g.vertexCount);
if (!p)
    throw new MemoryError...
auto centrality = p[0 .. g.vertexCount];

This probably can't be used for very large graphs, so you have to add even more logic to allocate the arrays on the C heap if they are too large. This could be useful if you call betweenness() many many times.
- Try to optionally accept the buffers from outside.

Bye,
bearophile
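bearophile's alloca-with-heap-fallback suggestion can be sketched as follows, in C++ for illustration (the kStackLimit threshold and the sumDegrees stand-in function are hypothetical, not part of Dgraph). The key constraint is that alloca memory lives only until the calling function returns, so the call cannot be hidden behind a helper.

```cpp
#include <alloca.h>  // glibc; on Windows use <malloc.h> and _alloca
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical size threshold: above this, fall back to the heap,
// since alloca of a large block risks stack overflow.
constexpr std::size_t kStackLimit = 16 * 1024;  // bytes

// Toy stand-in for a per-vertex working array: sums vertex degrees.
double sumDegrees(const std::vector<std::vector<int>>& adj) {
    const std::size_t n = adj.size();
    const std::size_t bytes = n * sizeof(double);
    // alloca must run in the frame that uses the memory; the
    // allocation is released when this function returns.
    double* sigma = nullptr;
    std::vector<double> heapBuf;  // engaged only on the fallback path
    if (bytes <= kStackLimit) {
        sigma = static_cast<double*>(alloca(bytes));
    } else {
        heapBuf.resize(n);
        sigma = heapBuf.data();
    }
    for (std::size_t v = 0; v < n; ++v) {
        sigma[v] = static_cast<double>(adj[v].size());
    }
    return std::accumulate(sigma, sigma + n, 0.0);
}
```

Note that, unlike the D snippet above, alloca is not checked for null here: on most platforms it does not report failure, which is exactly why the size threshold is needed.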
Re: Implementing and optimizing a simple graph metric
On Wednesday, 18 September 2013 at 13:39:29 UTC, bearophile wrote:

Just for a test, try to allocate all those arrays in a different way:
- First try a std.array.Array using the LDC2 compiler;
- Another thing to try is to allocate them on the stack using core.stdc.stdlib.alloca:

auto p = cast(T*)alloca(T.sizeof * g.vertexCount);
if (!p)
    throw new MemoryError...
auto centrality = p[0 .. g.vertexCount];

This probably can't be used for very large graphs, so you have to add even more logic to allocate the arrays on the C heap if they are too large. This could be useful if you call betweenness() many many times.
- Try to optionally accept the buffers from outside.

Nice suggestions, I'll try those out! :-)

I think I did give std.array.Array a trial when trying to speed up performance, and I don't remember it making any difference (if anything it may have slowed things down). But I'll give it a second look and report back. I haven't yet tried alloca or other manual memory management -- I felt a bit resistant to this as I'd prefer to keep the code simple and readable -- but I'll give that a go too, just to see how it goes.

It's interesting that you pick up on my use of to!T(0) etc. I had some concern about whether that would affect speed at all. At the time of originally writing the code, my impression was that not having it actually made things worse, and that the compiler was smart enough to carry out the conversion at compile time. However, my impression now is that having it, not having it, or any other method (e.g. enum T0, T1, etc.) makes absolutely no difference, and that I might as well keep it simple and write 0 for integral types and 0.0 for floating point.
Re: Implementing and optimizing a simple graph metric
Joseph Rushton Wakeling:

I haven't yet tried alloca or other manual memory management -- I felt a bit resistant to this as I'd prefer to keep the code simple and readable -- but I'll give that a go too just to see how it goes.

I'd like some stack-allocated variable-length array in D+Phobos, as in C++14. It's also a basis to build several other stack-allocated data structures.

However, my impression now is that having or not having it, or any other method (e.g. enum T0, T1, etc.) makes absolutely no difference, and that I might as well be simple and write 0 for integral-types, 0.0 for floating-point.

Right. And this could be right even if you want T=Rational.

Bye,
bearophile
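The kind of stack-resident array bearophile wishes for can be approximated with a fixed-capacity inline buffer. Here is a minimal C++ sketch; InlineArray is a hypothetical name, not a real Phobos or C++14 facility.

```cpp
#include <cstddef>

// Minimal sketch of a fixed-capacity, stack-resident array in the
// spirit of the variable-length arrays discussed above. A production
// version would also handle non-default-constructible element types.
template <typename T, std::size_t N>
class InlineArray {
    T data_[N];            // storage lives inline, so no heap traffic
    std::size_t size_ = 0;
public:
    // Returns false instead of growing: the capacity is fixed.
    bool push(const T& v) {
        if (size_ == N) return false;
        data_[size_++] = v;
        return true;
    }
    T& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
};
```

An object like this is also the building block for the "other stack-allocated data structures" mentioned above: a stack or queue over it inherits the no-allocation property.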
Re: [OT] My C++ talk at GoingNative 2013
On Wednesday, 18 September 2013 at 10:18:43 UTC, growler wrote: Watching your second talk on tuples I couldn't help thinking how much easier templates are in D. When reading C++ I find I have to make a context switch between run-time and compile-time thinking. It's quite jarring. In D, however, templates come naturally and it's very seamless. This is how i feel too, templates in D are an absolute joy to use and so natural!
Re: Implementing and optimizing a simple graph metric
On Wednesday, 18 September 2013 at 15:22:51 UTC, bearophile wrote:

Joseph Rushton Wakeling:
I haven't yet tried alloca or other manual memory management -- I felt a bit resistant to this as I'd prefer to keep the code simple and readable -- but I'll give that a go too just to see how it goes.

I'd like some stack-allocated variable-length array in D+Phobos, as in C++14. It's also a basis to build several other stack-allocated data structures.

Yes, that'd be useful, although in the case of this code I think that stack vs. heap probably isn't that important. If the data is small enough for stack allocation, the calculation will be quick anyway.

However, my impression now is that having or not having it, or any other method (e.g. enum T0, T1, etc.) makes absolutely no difference, and that I might as well be simple and write 0 for integral-types, 0.0 for floating-point.

Right. And this could be right even if you want T=Rational.

Actually, on re-testing this just now, I'm returning to my original view. I find that if you put in raw floating-point numbers like 0.0, 1.0 etc. the resulting code can be slower in the case where T = float. Putting to!T or using enums as you suggest appears to make no difference. This is a guess rather than confirmed by looking at assembly, but I'm presuming that to!T(0) and other conversions of compile-time constants are evaluated at compile time, so you don't in practice carry the cost you'd normally get from using to().
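The compile-time-folding intuition here can be illustrated in C++: a typed constant like T(0) is a constant expression and costs nothing at run time, whereas a bare 0.0 literal is a double and, with T = float, can force arithmetic to be done in double precision. A minimal sketch (zero() is an illustrative helper, not from the code above):

```cpp
#include <type_traits>

// Illustrative analogue of D's to!T(0) applied to a literal: because
// the function is constexpr, the conversion happens at compile time
// for any arithmetic T, so there is no runtime conversion cost.
template <typename T>
constexpr T zero() {
    static_assert(std::is_arithmetic<T>::value, "arithmetic types only");
    return T(0);
}

// Verified at compile time -- the constants fold away entirely.
static_assert(zero<float>() == 0.0f, "folded at compile time");
static_assert(zero<long>() == 0L, "folded at compile time");
```

By contrast, writing a bare 0.0 in generic code introduces a double constant, so with T = float a comparison or addition against it may be performed after promotion to double, which matches the slowdown observed above.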
Re: Implementing and optimizing a simple graph metric
On Wednesday, 18 September 2013 at 15:17:25 UTC, Joseph Rushton Wakeling wrote:

I think I did give std.array.Array a trial when trying to speed up its performance, and I don't remember it making any difference (if anything it may have slowed things down). But I'll give it a second look and report back.

I tried rewriting the variable declarations:

Array!T centrality;
centrality.length = g.vertexCount;
centrality[] = to!T(0);
Array!size_t stack;
stack.length = g.vertexCount;
Array!T sigma;
sigma.length = g.vertexCount;
Array!T delta;
delta.length = g.vertexCount;
Array!long d;
d.length = g.vertexCount;
auto q = VertexQueue(g.vertexCount);
Appender!(size_t[])[] p = new Appender!(size_t[])[g.vertexCount];

It results in a moderate slowdown of the code, at a guess because it increases the total number of mallocs/deallocs. I note that there seems to be no way to initialize std.container.Array as having a certain length ... ?

I tried instead using std.array.uninitializedArray to initialize the arrays in the betweenness() function -- it was a bit difficult to judge here: with a relatively small number of calls to betweenness() it seemed to result in a speedup, but with very many calls, it seemed to be slower.
Re: Implementing and optimizing a simple graph metric
Joseph Rushton Wakeling:

in the case of this code I think that stack vs. heap probably isn't that important. If the data is small enough for stack allocation, the calculation will be quick anyway.

How many times or how often do you need to call betweenness()? If it's called only a few times, or once in a while, then using the GC is good enough. But if you have to call it many millions of times, all those GC array allocations could cost some time, even if they are small arrays. In this case allocating them on the stack (or not allocating them at all, using externally allocated buffers) helps.

Bye,
bearophile
Re: Implementing and optimizing a simple graph metric
On Wednesday, 18 September 2013 at 13:39:29 UTC, bearophile wrote:

- Try to optionally accept the buffers from outside.

Does this look good to you?

auto ref betweenness(T = double, Graph)(ref Graph g, bool[] ignore = null)
    if (isFloatingPoint!T && isGraph!Graph)
{
    T[] centrality = new T[g.vertexCount];
    return betweenness!(T, Graph)(g, centrality, ignore);
}

auto ref betweenness(T = double, Graph)(ref Graph g, ref T[] centrality, bool[] ignore = null)
{
    centrality.length = g.vertexCount;
    centrality[] = to!T(0);
    // ... the rest as before
}
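The same two-overload pattern, sketched in C++ for comparison (the betweenness names here are stand-ins, not Dgraph's actual API): one overload fills a caller-supplied buffer, so repeated calls can reuse a single allocation, and a convenience overload allocates and forwards to it.

```cpp
#include <cstddef>
#include <vector>

// Workhorse overload: writes into a buffer owned by the caller.
std::vector<double>& betweenness(std::size_t vertexCount,
                                 std::vector<double>& centrality) {
    centrality.assign(vertexCount, 0.0);  // resize and zero, as in the D version
    // ... the algorithm proper would fill centrality here ...
    return centrality;
}

// Convenience overload: allocates, then forwards to the one above.
std::vector<double> betweenness(std::size_t vertexCount) {
    std::vector<double> centrality;
    betweenness(vertexCount, centrality);
    return centrality;
}
```

In a hot loop the caller declares the vector once outside the loop and passes it in each iteration; assign() then reuses the existing capacity instead of allocating.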
Re: I'm porting some go code to D
I've made the scheduler a bit more intelligent, and the channel implementation is now backed by a lock-free queue.

https://github.com/rjmcguire/goport/blob/tip/wrapper2.d for an example using libev
https://github.com/rjmcguire/goport/blob/tip/goroutine.d for the go routines
https://github.com/rjmcguire/goport/blob/tip/channel.d channel implementation; the channel still has a read lock because I didn't like looping wasting cpu cycles/battery. I may change this to use the same loop as the select(...) expression, with a timeout that backs off.
https://github.com/rjmcguire/goport/blob/tip/concurrentlinkedqueue.d queue implementation

There is a problem with the file EV handler in that it can get spurious READ events, and I can't seem to set it to non-blocking. The select(...) implementation and usage is quite interesting because it works very much like Go's select statement.

On Mon, Aug 26, 2013 at 1:28 AM, Rory McGuire rjmcgu...@gmail.com wrote:

Awesome thanks, thought what I did there was dodgy. It was really weird when this() started multiple schedulers; at least now I see the obvious reason. BTW: gist is at: https://gist.github.com/rjmcguire/6336931 Could someone point me at the correct linked list to use inside the channel? I'd prefer to use a range container of some kind if it exists in the std lib. I tried SList and had a bad experience, hence the custom implementation.

On Mon, Aug 26, 2013 at 1:21 AM, Timon Gehr timon.g...@gmx.ch wrote:

On 08/26/2013 12:55 AM, Rory McGuire wrote:

shared chan!Fiber scheduler; // channel contains Fibers waiting for their time slice
static this () {
    if (scheduler is null) {

You want 'shared static this' instead to avoid a race condition.
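The channel behaviour described above -- a blocking read that waits rather than spinning, plus a select-style timeout -- can be sketched in C++. This is an illustrative Channel class, not the actual goport implementation (which uses a lock-free queue underneath).

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>

// recv() sleeps on a condition variable instead of busy-looping,
// which is the cpu-cycles/battery trade-off mentioned above; the
// timeout parameter mimics a select-with-timeout.
template <typename T>
class Channel {
    std::queue<T> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void send(T v) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(std::move(v));
        }
        cv_.notify_one();  // wake one blocked receiver
    }

    // Empty optional means the timeout expired with nothing to read.
    std::optional<T> recv(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(m_);
        if (!cv_.wait_for(lk, timeout, [this] { return !q_.empty(); })) {
            return std::nullopt;
        }
        T v = std::move(q_.front());
        q_.pop();
        return v;
    }
};
```

A Go-like select over several such channels could then poll each with a short timeout and back off when all are empty, which is the approach the post considers for replacing the read lock.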