Re: Using objects that manage threads via std.concurrency
On Tuesday, 12 February 2013 at 07:07:21 UTC, Jonathan M Davis wrote:
> Which I don't think was ever really intended. That doesn't mean that it's unreasonable, but I think that it was always the idea that a particular thread had a particular job, in which case, you wouldn't generally be trying to send messages to different parts of the thread.
> - Jonathan M Davis

Hum, I just realized that receive matches messages out of order, by the types requested. I thought it *had* to receive THE first message in the queue, and throw if that message's type was not handled. I guess, then, that by specifying my own specific message type, and having a dedicated dispatcher, I can make my program work without clashing with anybody else who is also threading. Now I just have to figure out how to manage my master's mailbox size, if one worker is faster than the rest.
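A minimal sketch of that discovery (MyResult, worker and fetchResult are hypothetical names): receive pattern-matches on the handler's parameter type and leaves non-matching messages in the queue, so a dedicated message type effectively gives you a private channel inside the shared thread mailbox.

```d
import std.concurrency;

// MyResult is a hypothetical wrapper type: giving your messages a unique
// struct type keeps them from clashing with anything else in the mailbox.
struct MyResult { string payload; }

void worker(Tid owner)
{
    owner.send(42);                  // unrelated message, queued first
    owner.send(MyResult("done"));    // the message we actually want
}

string fetchResult()
{
    spawn(&worker, thisTid);
    string got;
    // receive matches on type, not on queue position: the MyResult is
    // delivered even though the int is ahead of it in the mailbox.
    receive((MyResult r) { got = r.payload; });
    // The skipped int is still queued for whoever asks for it.
    assert(receiveOnly!int() == 42);
    return got;
}

void main()
{
    assert(fetchResult() == "done");
}
```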
Re: Using objects that manage threads via std.concurrency
On Tuesday, 12 February 2013 at 10:08:14 UTC, FG wrote:
> On 2013-02-12 07:58, monarch_dodra wrote:
>> I think I didn't explain myself very well. I have my single master thread which has a thread-global mailbox, but I have 3 different objects that are sharing that mailbox.
>
> OK, I finally get what you are saying. You need to create a mailbox and a unique tid for every Manager (and probably would have to change Manager into a class). Unfortunately this won't work out of the box, as for example receiveOnly and friends use only the default mailbox of the current thread.
>
>     struct Manager {
>         Tid tid;
>         MessageBox mbox;
>         this(string s) {
>             this.mbox = new MessageBox;
>             tid = Tid(mbox);
>             spawn(&worker, s, tid);
>         }
>         string get() {
>             // you'd have to rewrite receive to use the custom mbox
>             return tid.myReceiveOnly!string();
>         }
>     }

Hum, I'll have to try to play around with that. For one thing, MessageBox is private. The good news is that my manager is already a class.

As for re-implementing receive to work on a custom Tid, maybe it would be better to forget about the Tid, and implement it directly on the mailbox? Something like this:

    struct Manager {
        MessageBox mbox;
        this(string s) {
            this.mbox = new MessageBox;
            Tid managerTid = Tid(mbox);
            spawn(&worker, s, managerTid);
        }
        string get() {
            // you'd have to rewrite receive to use the custom mbox
            return mbox.receiveOnly!string();
            // Or just straight up: mbox.get();
        }
    }

I don't know, I'll try and see how it goes.
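Since MessageBox is private, one pattern that works today is tag-and-dispatch over the single per-thread mailbox: every Manager stamps its messages with its own id, and non-matching messages are buffered for the Manager they belong to. A sketch (Tagged, Manager.get and the worker are all hypothetical names, not part of std.concurrency):

```d
import std.concurrency;

// Tagging each message with its manager's id lets several Managers
// share the one per-thread mailbox without clashing.
struct Tagged { size_t managerId; string value; }

string[][size_t] pending;  // messages that arrived for other managers

void worker(size_t id, string s, Tid owner)
{
    owner.send(Tagged(id, s ~ " processed"));
}

class Manager
{
    static size_t nextId;
    size_t id;

    this(string s)
    {
        id = nextId++;
        spawn(&worker, id, s, thisTid);
    }

    string get()
    {
        // Drain anything already buffered for us.
        if (auto p = id in pending)
        {
            if ((*p).length)
            {
                auto v = (*p)[0];
                *p = (*p)[1 .. $];
                return v;
            }
        }
        for (;;)
        {
            string result;
            bool mine;
            receive((Tagged t) {
                if (t.managerId == id) { mine = true; result = t.value; }
                else pending[t.managerId] ~= t.value;  // park it for its owner
            });
            if (mine)
                return result;
        }
    }
}

void main()
{
    auto a = new Manager("A");
    auto b = new Manager("B");
    // Order of arrival doesn't matter: each Manager gets only its own reply.
    assert(b.get() == "B processed");
    assert(a.get() == "A processed");
}
```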
Re: How to read fastly files ( I/O operation)
On Tuesday, 12 February 2013 at 12:02:59 UTC, bioinfornatics wrote:
> instead to use memcpy I try with slicing ~ lines 136:
> _hardBuffer[ 0 .. moveSize] = _hardBuffer[_bufPosition .. moveSize + _bufPosition];
> I get same perf

I think I figured out why I'm getting different results than you guys are on my Windows machine. AFAIK, file reads on Windows are natively asynchronous. I wrote a multi-threaded version of the parser, with a thread dedicated to reading the file, while the main thread parses the read buffers. I'm getting EXACTLY 0% performance improvement. Not better, not worse, just 0%. I'd have to try again on my SSD.

Right now, I'm parsing the 6 gig file in 60 seconds, which is the limit of my HDD. As a matter of fact, just *reading* the file takes the EXACT same amount of time as parsing it...

This takes 60 seconds:

    auto input = File(args[1], "rb");
    ubyte[] buffer = new ubyte[](BufferSize);
    do {
        buffer = input.rawRead(buffer);
    } while (buffer.length);

This takes 60 seconds too:

    Parser parser = new Parser(args[1]);
    foreach (q; parser)
        foreach (char c; q.sequence)
            globalNucleic.collect(c);

So at this point, I'd need to test on my Linux box, or publish the code so you can tell me how I'm doing. I'm still tweaking the code to publish something readable, as there is a lot of sketchy code right now.

I'm also implementing correct exception handling, so that if there is an erroneous entry, an exception is thrown. However, all the erroneous data is parsed out of the file and placed inside the exception. This means that:
a) You can inspect the erroneous data.
b) You can skip the erroneous data, and parse the rest of the file.

Once I deliver the code with the multi-threaded path activated, you should get better performance on Linux. When 1.0 is ready, I'll create a GitHub project for it, so work can be done on it in parallel.
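The reader-thread layout described above can be sketched with std.concurrency like this (all names hypothetical; a real version would recycle buffers instead of allocating one per chunk):

```d
import std.concurrency;
import std.stdio;

enum BufferSize = 8 * 1024 * 1024;

// One thread does nothing but rawRead chunks and ship them onward.
void readerThread(string path, Tid consumer)
{
    auto input = File(path, "rb");
    for (;;)
    {
        // A fresh buffer per chunk so it can safely be cast to immutable
        // and handed across threads.
        auto buffer = new ubyte[](BufferSize);
        auto chunk = input.rawRead(buffer);
        if (chunk.length == 0)
            break;
        consumer.send(cast(immutable(ubyte)[]) chunk);
    }
    consumer.send(cast(immutable(ubyte)[]) null);  // end-of-file marker
}

// The consuming side; a real parser would process the bytes here.
size_t countBytes(string path)
{
    spawn(&readerThread, path, thisTid);
    size_t total;
    for (;;)
    {
        auto chunk = receiveOnly!(immutable(ubyte)[])();
        if (chunk.length == 0)
            break;
        total += chunk.length;
    }
    return total;
}

void main(string[] args)
{
    if (args.length > 1)
        writeln(countBytes(args[1]), " bytes");
}
```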
Re: Using objects that manage threads via std.concurrency
On 2013-02-12 12:14, monarch_dodra wrote: For one thing, MessageBox is private. Unnecessarily hidden, because, from what I can see from a fast look at the sources, there is no implicit requirement for there to be only one MessageBox per thread. Maybe we're getting somewhere and this will be changed. As for the re-implement of receive to work on a custom Tid, maybe it might be better to forget about the tid, and implement it on directly on the mailbox? Well, yes. It's more natural to work on mbox than some artificial struct. Now, as for the usefulness of having many mailboxes. I'd rather have one mailbox than go into a loop with receiveTimeout called for each Manager, but in your divideconquer example receive makes sense and keeps ordering.
Re: How to read fastly files ( I/O operation)
On Tuesday, 12 February 2013 at 12:45:26 UTC, monarch_dodra wrote:
> [...] I wrote a multi-threaded version of the parser, with a thread dedicated to reading the file, while the main thread parses the read buffers. I'm getting EXACTLY 0% performance improvement. [...]
About the threaded version: it should be possible to use a get-file-size function to split the file across several threads. Use fseek, read to the end of each section, and return that position, so each split knows where it ends.
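The splitting idea could look roughly like this (a hypothetical sketch that splits on line boundaries; a real FASTQ splitter would instead scan forward to the next record header):

```d
import std.stdio;

// Compute `parts` start offsets into the file, each adjusted forward
// so that every part begins on a whole line.
size_t[] splitOffsets(string path, size_t parts)
{
    auto f = File(path, "rb");
    immutable size = cast(size_t) f.size;
    size_t[] offsets = [0];
    foreach (i; 1 .. parts)
    {
        f.seek(size / parts * i);  // jump to the rough split point
        f.readln();                // skip the partial line we landed in
        offsets ~= cast(size_t) f.tell;
    }
    return offsets;
}

void main()
{
    // Each thread would then parse from offsets[i] up to offsets[i + 1]
    // (or end of file for the last one).
}
```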
Re: A little of coordination for Rosettacode
ixid:
> If you're posting code on Rosetta code you are presenting that code as idiomatic.

The D code on Rosettacode has some stylistic uniformity, and I think in most cases it follows the dstyle (http://dlang.org/dstyle.html), but that code is not meant to be production code (lots of people in other languages don't add unittests, etc.). So it is not idiomatic, and it's not meant to be. If you go on Rosettacode you can't expect to see code similar to Phobos code. There is also variability: some D entries on Rosettacode have unittests and are written in a readable style, other entries try to be short and simple, other entries are very strongly typed and longer, other entries look almost like C, and so on. This is done on purpose, to show various kinds of D coding. None of those ways is the only idiomatic one.

> You tend to use the superfluous parens which the properties discussion would suggest are becoming more idiomatic not to use.

-property was supposed to become the standard semantics of the D language, so I wrote code that way. Lately things are changing; now I am waiting to see what comes out of the property discussion. If the final decision is that those parentheses are neither needed nor idiomatic, I/we will (slowly) remove the parentheses from the entries.

> You also use 'in' a lot in function inputs while others have argued against it.

in is currently not good if you are writing a long-term large library, or a larger program you want to use for a long time, because it assumes type system semantics that are not yet implemented. But for the purposes of Rosettacode, using in is good, because it's short and readable, and if/when scope breaks some entries, I/we will fix them.

> This is not an attack on your code at all, but maybe there should be some discussion of and consensus on what is idiomatic.

One problem with Rosettacode is that it's not very good for discussion. GitHub offers better means to discuss code. What other things do you want to discuss?
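For readers outside the discussion, a toy example of the two conventions in question (Celsius and toFahrenheit are hypothetical, just to illustrate): parentheses on no-argument member calls are optional in D, and `in` on a parameter is currently shorthand for const (eventually const scope).

```d
// Hypothetical toy type, just to show the styles under discussion.
struct Celsius
{
    double degrees_;
    double degrees() const { return degrees_; }  // no-arg member function
}

// An `in` parameter: the short, readable style bearophile defends.
double toFahrenheit(in Celsius c)
{
    return c.degrees * 9.0 / 5.0 + 32.0;
}

void main()
{
    auto c = Celsius(100.0);
    assert(c.degrees == 100.0);     // parens-free call
    assert(c.degrees() == 100.0);   // the "superfluous parens" style
    assert(toFahrenheit(c) == 212.0);
}
```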
Bye, bearophile
Re: A little of coordination for Rosettacode
> What other things do you want to discuss?

I mean some level of D community discussion of the language as a whole, as to what an idiomatic style is, perhaps after the current issues are settled; not anything specific about your code. There are areas, like complex UFCS statements, where it would help to have agreed, suggested ways of formatting.
Re: A little of coordination for Rosettacode
ixid:
> I mean some level of D community discussion of the language as a whole as to what is an idiomatic style, perhaps after the current issues are settled, not anything specific about your code.

Such discussion seems better in the main D newsgroup. But it also seems a good way to waste time with hundreds of posts that produce nothing of value :-)

> There are areas like complex UFCS statements where it would help to have agreed, suggested ways of formatting.

I think this is currently a good way to format that kind of chain, inspired by similar F# formatting:

    auto r = fooSomething()
             .barSomething!pred1()
             .bazSomething()
             .spamSomething!fun2();

In some cases on Rosettacode I have followed that formatting pattern.

Bye,
bearophile
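A concrete, compilable chain in that layout (the pipeline itself is made up, only to show the formatting):

```d
import std.algorithm;
import std.array;
import std.range;

// One UFCS step per line, aligned under the first call.
auto squaresOfEvens()
{
    return iota(10)
           .filter!(a => a % 2 == 0)
           .map!(a => a * a)
           .array();
}

void main()
{
    assert(squaresOfEvens() == [0, 4, 16, 36, 64]);
}
```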
Finding large difference b/w execution time of c++ and D codes for same problem
I am writing a Julia sets program in C++ and D, in exactly the same way as much as possible. On executing them, I find a large difference in their execution times. Can you comment on what I am doing wrong, or is this expected?

//=== C++ code, compiled with -O3 ===

    #include <sys/time.h>
    #include <iostream>
    using namespace std;

    const int DIM = 4194304;

    struct complexClass {
        float r;
        float i;
        complexClass( float a, float b ) { r = a; i = b; }
        float squarePlusMag(complexClass another) {
            float r1 = r*r - i*i + another.r;
            float i1 = 2.0*i*r + another.i;
            r = r1;
            i = i1;
            return (r1*r1 + i1*i1);
        }
    };

    int juliaFunction( int x, int y ) {
        complexClass a(x, y);
        complexClass c(-0.8, 0.156);
        for (int i = 0; i < 200; i++) {
            if (a.squarePlusMag(c) > 1000)
                return 0;
        }
        return 1;
    }

    void kernel() {
        for (int x = 0; x < DIM; x++) {
            for (int y = 0; y < DIM; y++) {
                int offset = x + y * DIM;
                int juliaValue = juliaFunction( x, y );
                // juliaValue will be used by some function.
            }
        }
    }

    int main() {
        struct timeval start, end;
        gettimeofday(&start, NULL);
        kernel();
        gettimeofday(&end, NULL);
        float delta = ((end.tv_sec - start.tv_sec) * 1000000u
                       + end.tv_usec - start.tv_usec) / 1.e6;
        cout << "C++ code with dimension " << DIM << " Total time: " << delta << " [sec]\n";
    }

//=== D code, compiled with -O -release -inline ===

    #!/usr/bin/env rdmd
    import std.stdio;
    import std.datetime;

    immutable int DIM = 4194304;

    struct complexClass {
        float r;
        float i;
        float squarePlusMag(complexClass another) {
            float r1 = r*r - i*i + another.r;
            float i1 = 2.0*i*r + another.i;
            r = r1;
            i = i1;
            return (r1*r1 + i1*i1);
        }
    }

    int juliaFunction( int x, int y ) {
        complexClass c = complexClass(-0.8, 0.156);
        complexClass a = complexClass(x, y);
        for (int i = 0; i < 200; i++) {
            if (a.squarePlusMag(c) > 1000)
                return 0;
        }
        return 1;
    }

    void kernel() {
        for (int x = 0; x < DIM; x++) {
            for (int y = 0; y < DIM; y++) {
                int offset = x + y * DIM;
                int juliaValue = juliaFunction( x, y );
                // juliaValue will be used by some function.
            }
        }
    }

    void main() {
        StopWatch sw;
        sw.start();
        kernel();
        sw.stop();
        writeln("D code serial with dimension ", DIM,
                " Total time: ", (sw.peek().msecs / 1000), " [sec]");
    }

I will appreciate any help.
Re: Finding large difference b/w execution time of c++ and D codes for same problem
I am finding C++ code is much faster than D code.
Re: Finding large difference b/w execution time of c++ and D codes for same problem
On Tuesday, 12 February 2013 at 20:39:36 UTC, Sparsh Mittal wrote:
> I am finding C++ code is much faster than D code.

DMD (AFAIK) is known to generate slower code. Try LDC or GDC if speed is your major concern.
Re: Finding large difference b/w execution time of c++ and D codes for same problem
On 13-Feb-2013 00:39, Sparsh Mittal wrote:
> I am finding C++ code is much faster than D code.

Seems like DMD's floating-point issue. The issue is that it always works with floats at full-width real precision, plus rounding. Basically, if nothing has changed (and I doubt it has), DMD is about two (or more) times slower than GDC/LDC on floating-point code. The cure is using the GDC or LDC compiler, as they are pretty stable and up to date on the front-end side these days.

-- Dmitry Olshansky
Re: Finding large difference b/w execution time of c++ and D codes for same problem
Pardon me, can you please point me to a suitable reference, or just tell me the command here? Searching on Google, I could not find anything yet. Performance is my main concern.
Re: Finding large difference b/w execution time of c++ and D codes for same problem
On Wed, Feb 13, 2013 at 12:56:01AM +0400, Dmitry Olshansky wrote:
> Seems like DMD's floating point issue. [...] The cure is using GDC/LDC compiler as they are pretty stable and up to date on the front-end side these days.
[...]

I did a few benchmarks somewhat recently where I compared the performance of code produced by GDC with DMD. Code produced by GDC consistently outperforms code produced by DMD by about 20-30% or so. This is across the board, with floats, reals, and applications that don't do heavy arithmetic (just basic looping/recursion constructs).

I didn't investigate the cause of this difference in detail, but the last time I looked at the assembly code generated by both compilers, I noticed that GDC's optimizer is far more advanced than DMD's, especially when it comes to loop unrolling, strength reduction, inlining, etc. For non-trivial code, GDC pretty much consistently produces superior code in general (not just for floating-point operations). So if performance is a concern, I'd say definitely look into GDC or LDC instead of DMD.

T

-- Two wrongs don't make a right; but three rights do make a left...
Re: Finding large difference b/w execution time of c++ and D codes for same problem
OK. I found it.
Re: Finding large difference b/w execution time of c++ and D codes for same problem
On 13-Feb-2013 01:09, Sparsh Mittal wrote:
> Pardon me, can you please point me to suitable reference or tell just command here. Searching on google, I could not find anything yet. Performance is my main concern.

GDC seems to be mostly a build-from-source kind of thing. It has moved to GitHub: https://github.com/D-Programming-GDC (see also the newsgroup digitalmars.d.D.gnu).

GDC binaries for the Windows TDM-GCC toolchain are still available here: https://bitbucket.org/goshawk/gdc/downloads (AFAIK it needs the 4.6.1 version of the TDM toolset).

LDC(2) has a recent release with binaries:

    https://github.com/downloads/ldc-developers/ldc/ldc-0.10.0-src.tar.gz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86_64.tar.gz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86_64.tar.xz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86.tar.gz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86.tar.xz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-osx-x86_64.tar.gz
    https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-osx-x86_64.tar.xz

(See also the announcement on the newsgroup digitalmars.d.D.ldc.)

Both compilers ship a dmd-style compiler driver called gdmd or ldmd2. Speed is mostly what you'd expect of GCC and LLVM respectively.

-- Dmitry Olshansky
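Concretely, the invocations could look like this (julia.d is a hypothetical file name; exact flags may vary by compiler version):

```shell
# The dmd-style drivers accept the same switches as dmd:
gdmd  -O -release -inline julia.d
ldmd2 -O -release -inline julia.d

# Or the compilers' native drivers:
gdc  -O3 -frelease julia.d -o julia
ldc2 -O3 -release julia.d
```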
Re: Finding large difference b/w execution time of c++ and D codes for same problem
Thanks for your insights. It was very helpful.
Re: How to read fastly files ( I/O operation)
On Tuesday, 12 February 2013 at 21:41:14 UTC, bioinfornatics wrote:
> Sometimes fastq files are compressed to gz, bz2 or xz, as they are often huge. Maybe we need to keep this in mind early in development and use std.zlib.

While working on making the parser multi-thread compatible, I was able to separate the part that feeds data from the part that parses data. Long story short, the parser operates on an input range of ubyte[]: it is not responsible for data acquisition any more. The range can be a simple (wrapped) File, a byChunk, an asynchronous file reader, a zip decompressor, or just stdin, I guess. The range can be transient. However, now that you mention it, I'll make sure that is correctly supported.

I'll *try* to show you what I have so far tomorrow (in about 18h).
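The decoupling described above can be sketched like this (ChunkParser and chunkParser are hypothetical names, not the actual project's API): the parser only sees an input range of ubyte[] chunks and never touches the file itself, so any chunk source plugs in.

```d
import std.range;
import std.stdio;

// A parser over any input range of ubyte[] chunks.
struct ChunkParser(R)
    if (isInputRange!R && is(ElementType!R : const(ubyte)[]))
{
    R chunks;
    size_t newlines;  // stand-in for real parse state

    void run()
    {
        // Chunks may be transient: copy out anything that must outlive
        // the iteration before moving to the next chunk.
        foreach (chunk; chunks)
            foreach (b; chunk)
                if (b == '\n')
                    ++newlines;
    }
}

auto chunkParser(R)(R chunks) { return ChunkParser!R(chunks); }

void main(string[] args)
{
    if (args.length < 2)
        return;
    // Any chunk source works here: File.byChunk, a decompressor, stdin...
    auto p = chunkParser(File(args[1]).byChunk(4096));
    p.run();
    writeln(p.newlines, " lines");
}
```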
Re: Finding large difference b/w execution time of c++ and D codes for same problem
On 2013-02-12 21:39, Sparsh Mittal wrote:
> I am finding C++ code is much faster than D code.

I had a look, but first had to make juliaValue global, because g++ had optimized all the calculations away. :) Also changed DIM to 32 * 1024.

    13.2s -- g++ -O3
    16.0s -- g++ -O2
    15.9s -- gdc -O3
    15.9s -- gdc -O2
    16.2s -- dmd -O -release -inline (v2.060)

Winblows and DMD 32-bit, the rest 64-bit, but still, dmd was quite fast. Interesting how gdc -O3 gave no extra boost vs. -O2.