Re: Using objects that manage threads via std.concurrency

2013-02-12 Thread monarch_dodra
On Tuesday, 12 February 2013 at 07:07:21 UTC, Jonathan M Davis 
wrote:
Which I don't think was ever really intended. That doesn't mean 
that it's unreasonable, but I think that it was always the idea 
that a particular thread had a particular job, in which case you 
wouldn't generally be trying to send messages to different parts 
of the thread.

- Jonathan M Davis


Hum, I just realized that receive works out of order on the 
types requested. I thought it *had* to receive THE first message 
in the queue, and throw if the type is not supported.


I guess then that by specifying my own message type, and having a 
dedicated dispatcher, I can make my program work without clashing 
with anybody else who is also using threads.


Now, I've just got to figure out how to manage my master's 
mailbox sizes, if a worker is faster than the rest.
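
Something along these lines is the kind of thing I mean (an 
untested sketch with made-up names like ManagerAResult and 
workerA; setMaxMailboxSize is the std.concurrency call for 
bounding a mailbox):

//
import std.concurrency;
import std.stdio;

// Hypothetical per-manager message type: wrapping the payload in a
// dedicated struct lets receive() pick out only these messages and
// leave everything else in the shared mailbox untouched.
struct ManagerAResult { string payload; }

void workerA(Tid master, string job)
{
    // ... do the actual work, then report back with the tagged type ...
    send(master, ManagerAResult("done: " ~ job));
}

void main()
{
    // Bound this thread's mailbox so a fast worker can't grow it
    // forever; OnCrowding.block makes senders wait instead.
    setMaxMailboxSize(thisTid, 64, OnCrowding.block);

    spawn(&workerA, thisTid, "some job");

    // Only ManagerAResult messages are consumed here; messages of
    // other types stay queued for whoever asked for them.
    receive((ManagerAResult r) { writeln(r.payload); });
}
//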


Re: Using objects that manage threads via std.concurrency

2013-02-12 Thread monarch_dodra

On Tuesday, 12 February 2013 at 10:08:14 UTC, FG wrote:

On 2013-02-12 07:58, monarch_dodra wrote:

I think I didn't explain myself very well. I have my single 
master thread which has a thread-global mailbox, but I have 3 
different objects that are sharing that mailbox.


OK, I finally get what you are saying.
You need to create a mailbox and a unique tid for every Manager 
(and you would probably have to change Manager into a class). 
Unfortunately this won't work out of the box, as, for example, 
receiveOnly and friends use only the default mailbox of the 
current thread.


struct Manager
{
    Tid tid;
    MessageBox mbox;

    this(string s)
    {
        this.mbox = new MessageBox;
        this.tid = Tid(this.mbox);
        spawn(&worker, s, tid);
    }

    string get()
    {
        // you'd have to rewrite receive to use the custom mbox
        return tid.myReceiveOnly!string();
    }
}


Hum, I'll have to try to play around with that. For one thing, 
MessageBox is private.


Good news is my manager is already a class.

As for re-implementing receive to work on a custom Tid, maybe it 
would be better to forget about the tid and implement it directly 
on the mailbox? Something like this:


//
struct Manager
{
    MessageBox mbox;

    this(string s)
    {
        this.mbox = new MessageBox;
        Tid managerTid = Tid(this.mbox);
        spawn(&worker, s, managerTid);
    }

    string get()
    {
        // you'd have to rewrite receiveOnly to use the custom mbox
        return mbox.receiveOnly!string();
        // Or just straight up:
        //return mbox.get();
    }
}
//

I don't know, I'll try and see how it goes.


Re: How to read fastly files ( I/O operation)

2013-02-12 Thread monarch_dodra
On Tuesday, 12 February 2013 at 12:02:59 UTC, bioinfornatics 
wrote:

Instead of using memcpy I tried slicing, around line 136:
_hardBuffer[0 .. moveSize] = _hardBuffer[_bufPosition .. moveSize + _bufPosition];


I get the same perf.


I think I figured out why I'm getting different results than you 
guys are, on my Windows machine.


AFAIK, file reads on Windows are natively asynchronous.

I wrote a multi-threaded version of the parser, with a thread 
dedicated to reading the file, while the main thread parses the 
read buffers.
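
Schematically, the split looks something like this (not the actual 
code, just a minimal sketch, with a hypothetical readerThread and 
an empty chunk used as the end-of-file marker):

//
import std.concurrency;
import std.stdio;

// Reader thread: reads fixed-size chunks and ships them to the
// parsing thread as immutable slices (std.concurrency only allows
// sending data with no unshared aliasing).
void readerThread(Tid parser, string fileName, size_t chunkSize)
{
    auto f = File(fileName, "rb");
    foreach (chunk; f.byChunk(chunkSize))
        send(parser, chunk.idup);   // copy: byChunk reuses its buffer
    send(parser, cast(immutable(ubyte)[]) null);   // empty = EOF
}

void main(string[] args)
{
    spawn(&readerThread, thisTid, args[1], cast(size_t)(1024 * 1024));

    for (;;)
    {
        auto buf = receiveOnly!(immutable(ubyte)[])();
        if (buf.length == 0)
            break;
        // ... parse buf here, while the reader fetches the next chunk ...
    }
}
//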


I'm getting EXACTLY 0% performance improvement. Not better, not 
worse, just 0%.


I'd have to try again on my SSD. Right now, I'm parsing the 6 GiB 
file in 60 seconds, which is the limit of my HDD. As a matter of 
fact, just *reading* the file takes the EXACT same amount of time 
as parsing it...


This takes 60 seconds.
//
auto input = File(args[1], "rb");
ubyte[] buffer = new ubyte[](BufferSize);
do {
    buffer = input.rawRead(buffer);
} while (buffer.length);
//

This takes 60 seconds too.
//
Parser parser = new Parser(args[1]);
foreach (q; parser)
    foreach (char c; q.sequence)
        globalNucleic.collect(c);
//

So at this point, I'd need to test on my Linux box, or publish 
the code so you can tell me how I'm doing.


I'm still tweaking the code to publish something readable, as 
there is a lot of sketchy code right now.


I'm also implementing correct exception handling, so that if 
there is an erroneous entry, an exception is thrown. However, the 
erroneous data is still parsed out of the file and placed inside 
the exception. This means that:

a) you can inspect the erroneous data;
b) you can skip the erroneous data and parse the rest of the file.
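
In other words, roughly something like this (a hypothetical 
sketch, not the actual class):

//
// The parser wraps the offending record in the exception and leaves
// its position just past it, so the caller can look at the bad data
// and keep going.
class ErroneousEntryException : Exception
{
    string badEntry;   // raw text of the offending record

    this(string msg, string badEntry,
         string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
        this.badEntry = badEntry;
    }
}

// Usage sketch, matching a) and b) above:
//
//     try
//         processEntry(parser.front);
//     catch (ErroneousEntryException e)
//         stderr.writeln("skipping bad entry: ", e.badEntry);
//     parser.popFront();
//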


Once I deliver the version with the multi-threaded code 
activated, you should get better performance on Linux.


When 1.0 is ready, I'll create a GitHub project for it, so work 
can be done on it in parallel.


Re: Using objects that manage threads via std.concurrency

2013-02-12 Thread FG

On 2013-02-12 12:14, monarch_dodra wrote:

For one thing, MessageBox is private.


Unnecessarily hidden, because, from a quick look at the sources, 
there seems to be no implicit requirement that there be only one 
MessageBox per thread. Maybe we're getting somewhere and this will 
be changed.



As for re-implementing receive to work on a custom Tid, maybe it 
would be better to forget about the tid and implement it directly 
on the mailbox?


Well, yes. It's more natural to work on mbox than some artificial struct.

Now, as for the usefulness of having many mailboxes: I'd rather 
have one mailbox than go into a loop with receiveTimeout called 
for each Manager, but in your divide & conquer example receive 
makes sense and keeps ordering.


Re: How to read fastly files ( I/O operation)

2013-02-12 Thread bioinfornatics

On Tuesday, 12 February 2013 at 12:45:26 UTC, monarch_dodra wrote:

I wrote a multi-threaded version of the parser, with a thread 
dedicated to reading the file, while the main thread parses the 
read buffers.

[...]

Once I deliver the code with the multi-threaded code activated, 
you should get better performance on Linux.

About the threaded version: it should be possible to use the file 
size to split the work across several threads. Each thread would 
fseek to its starting offset and read up to the end of its 
section, detecting the end of its split at a record boundary.
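
Roughly like this (untested sketch; splitPoints is a made-up 
helper, and checking for '@' at the start of a line is a naive way 
to find a record start, since '@' can also begin a quality line):

//
import std.stdio;

// Divide the file into nThreads byte ranges, then move each inner
// boundary forward to the start of the next record, so no thread
// begins or ends in the middle of an entry.
ulong[] splitPoints(string fileName, size_t nThreads)
{
    auto f = File(fileName, "rb");
    immutable fileSize = f.size;
    auto points = new ulong[](nThreads + 1);
    points[0] = 0;
    points[$ - 1] = fileSize;

    foreach (i; 1 .. nThreads)
    {
        points[i] = fileSize;          // fallback if no record start is found
        f.seek(fileSize / nThreads * i);
        auto lines = f.byLine();
        if (!lines.empty)
            lines.popFront();          // drop the partial line we landed in
        foreach (line; lines)
        {
            if (line.length && line[0] == '@')
            {
                // start offset of this line (assuming '\n' line endings)
                points[i] = f.tell() - line.length - 1;
                break;
            }
        }
    }
    return points;
}
//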


Re: A little of coordination for Rosettacode

2013-02-12 Thread bearophile

ixid:

If you're posting code on Rosetta code you are presenting that 
code as idiomatic.


The D code on Rosettacode has some stylistic uniformity, and I 
think in most cases it follows the dstyle 
(http://dlang.org/dstyle.html), but that code is not meant to be 
production code (lots of people in other languages don't add 
unittests, etc.). So it is not idiomatic, and it's not meant to 
be. If you go on Rosettacode you can't expect to see code similar 
to Phobos code.


There is also variability: some D entries on Rosettacode have 
unittests and are written in a readable style, others try to be 
short and simple, others are very strongly typed and longer, 
others look almost like C, and so on. This is done on purpose, to 
show various kinds of D coding. None of those ways is the only 
idiomatic one.



You tend to use the superfluous parens which the properties 
discussion would suggest are becoming more idiomatic not to use.


-property was supposed to become the standard semantics of the D 
language, so I have written code that way. Later, things changed. 
Now I am waiting to see what comes out of the property discussion. 
If the final decision is that those parentheses are neither needed 
nor idiomatic, I/we will (slowly) remove them from the entries.



You also use 'in' a lot in function inputs while others have 
argued against it.


in is currently not a good choice if you are writing a large 
long-term library, or a larger program you want to keep using for 
a long time, because it assumes type system semantics (scope) that 
are not yet implemented.
But for the purposes of Rosettacode, using in is good, because 
it's short and readable, and if/when scope breaks some entries, 
I/we will fix them.
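
For example, this is the kind of signature I mean (a tiny made-up 
function; in is currently just const scope):

size_t countGaps(in char[] seq)
{
    size_t n = 0;
    foreach (c; seq)
        if (c == '-')
            n++;
    return n;
}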



This is not an attack on your code at all, but maybe there 
should be some discussion of and consensus on what is idiomatic.


One problem with Rosettacode is that it's not well suited to 
discussion; GitHub offers better means to discuss code.


What other things would you like to discuss?

Bye,
bearophile


Re: A little of coordination for Rosettacode

2013-02-12 Thread ixid

What other things would you like to discuss?


I mean some level of D community discussion, for the language as 
a whole, about what an idiomatic style is, perhaps after the 
current issues are settled; not anything specific about your code. 
There are areas, like complex UFCS statements, where it would help 
to have agreed, suggested ways of formatting.


Re: A little of coordination for Rosettacode

2013-02-12 Thread bearophile

ixid:

I mean some level of D community discussion of the language as 
a whole as to what is an idiomatic style, perhaps after the 
current issues are settled, not anything specific about your 
code.


Such a discussion seems better suited to the main D newsgroup. But 
it also seems like a good way to waste time with hundreds of posts 
that produce nothing of value :-)



There are areas like complex UFCS statements where it would 
help to have agreed, suggested ways of formatting.


I think this is currently a good way to format that kind of 
chain; it is inspired by similar F# formatting:



auto r = fooSomething()
 .barSomething!pred1()
 .bazSomething()
 .spamSomething!fun2();


In some cases on Rosettacode I have followed that formatting 
pattern.
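
For example, a trivial made-up pipeline, just to show the layout 
(with the explicit empty parens I currently use):

import std.algorithm, std.range, std.stdio;

void main()
{
    auto r = iota(1, 100)
             .filter!(x => x % 3 == 0)()
             .map!(x => x ^^ 2)()
             .take(5);
    writeln(r);  // [9, 36, 81, 144, 225]
}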


Bye,
bearophile


Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Sparsh Mittal
I am writing a Julia set program in C++ and D, in exactly the same 
way as much as possible. On executing them I find a large 
difference in their execution times. Can you comment on what I am 
doing wrong, or is this expected?



//===C++ code, compiled with -O3 ==
#include <sys/time.h>
#include <iostream>
using namespace std;
const int DIM = 4194304;

struct complexClass {
  float r;
  float i;
  complexClass( float a, float b )
  {
    r = a;
    i = b;
  }

  float squarePlusMag(complexClass another)
  {
    float r1 = r*r - i*i + another.r;
    float i1 = 2.0*i*r + another.i;

    r = r1;
    i = i1;

    return (r1*r1 + i1*i1);
  }
};


int juliaFunction( int x, int y )
{
  complexClass a(x, y);
  complexClass c(-0.8, 0.156);

  int i = 0;

  for (i = 0; i < 200; i++) {
    if (a.squarePlusMag(c) > 1000)
      return 0;
  }

  return 1;
}


void kernel( ){
  for (int x = 0; x < DIM; x++) {
    for (int y = 0; y < DIM; y++) {
      int offset = x + y * DIM;
      int juliaValue = juliaFunction( x, y );
      // juliaValue will be used by some function.
    }
  }
}


int main()
{
  struct timeval start, end;
  gettimeofday(&start, NULL);
  kernel();
  gettimeofday(&end, NULL);
  float delta = ((end.tv_sec - start.tv_sec) * 1000000u +
                 end.tv_usec - start.tv_usec) / 1.e6;

  cout << "C++ code with dimension " << DIM << " Total time: "
       << delta << " [sec]\n";
}






//=D code, compiled with -O -release -inline=

#!/usr/bin/env rdmd
import std.stdio;
import std.datetime;
immutable int DIM = 4194304;


struct complexClass {
  float r;
  float i;

  float squarePlusMag(complexClass another)
  {
    float r1 = r*r - i*i + another.r;
    float i1 = 2.0*i*r + another.i;

    r = r1;
    i = i1;

    return (r1*r1 + i1*i1);
  }
}


int juliaFunction( int x, int y )
{
  complexClass c = complexClass(0.8, 0.156);
  complexClass a = complexClass(x, y);

  for (int i = 0; i < 200; i++) {
    if (a.squarePlusMag(c) > 1000)
      return 0;
  }
  return 1;
}


void kernel( ){
  for (int x = 0; x < DIM; x++) {
    for (int y = 0; y < DIM; y++) {
      int offset = x + y * DIM;
      int juliaValue = juliaFunction( x, y );
      // juliaValue will be used by some function.
    }
  }
}


void main()
{
  StopWatch sw;
  sw.start();
  kernel();
  sw.stop();
  writeln("D code serial with dimension ", DIM, " Total time: ",
          (sw.peek().msecs / 1000), " [sec]");
}

//
I would appreciate any help.


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Sparsh Mittal

I am finding C++ code is much faster than D code.


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread monarch_dodra

On Tuesday, 12 February 2013 at 20:39:36 UTC, Sparsh Mittal wrote:

I am finding C++ code is much faster than D code.


DMD (AFAIK) is known to generate slower code. Try LDC or GDC if 
speed is your major concern.


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Dmitry Olshansky

On 13-Feb-2013 00:39, Sparsh Mittal wrote:

I am finding C++ code is much faster than D code.


Seems like DMD's floating point issue. The issue is that it 
always works with floats as full-width reals plus rounding. 
Basically, if nothing has changed (and I doubt it has), DMD is 
about two (or more) times slower than GDC/LDC on floating point 
code.


The cure is to use the GDC or LDC compilers, as they are pretty 
stable and up to date on the front-end side these days.


--
Dmitry Olshansky


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Sparsh Mittal
Pardon me, can you please point me to a suitable reference, or 
just give the command here? Searching on Google, I could not find 
anything yet. Performance is my main concern.






Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread H. S. Teoh
On Wed, Feb 13, 2013 at 12:56:01AM +0400, Dmitry Olshansky wrote:
 On 13-Feb-2013 00:39, Sparsh Mittal wrote:
 I am finding C++ code is much faster than D code.
 
 Seems like DMD's floating point issue. The issue being that it
 always works with floats as full-width reals + rounding. Basically
 if nothing changed (and I doubt it changed) then  DMD with floating
 point code is about two (or more) times slower then GDC/LDC.
 
 The cure is using GDC/LDC compiler as they are pretty stable and up
 to date on the front-end side these days.
[...]

I did a few benchmarks somewhat recently where I compared the
performance of code produced by GDC with DMD. Code produced by GDC
consistently outperforms code produced by DMD by about 20-30% or so.
This is across the board, with floats, reals, and applications that
don't do heavy arithmetic (just basic looping/recursion constructs).

I didn't investigate in detail the cause of this difference, but the
last time I looked at the assembly code generated by both compilers, I
noticed that GDC's optimizer is far more advanced than DMD's, esp. when
it comes to loop unrolling, strength reduction, inlining, etc. For
non-trivial code, GDC pretty much consistently produces superior code in
general (not just in floating-point operations).

So if performance is a concern, I'd say definitely look into GDC or LDC
instead of DMD.


T

-- 
Two wrongs don't make a right; but three rights do make a left...


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Sparsh Mittal

OK. I found it.



Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Dmitry Olshansky

On 13-Feb-2013 01:09, Sparsh Mittal wrote:

Pardon me, can you please point me to a suitable reference, or just
give the command here? Searching on Google, I could not find anything
yet. Performance is my main concern.




GDC seems to be mostly a build-from-source kind of thing.
It has moved to GitHub:
https://github.com/D-Programming-GDC
(See also the newsgroup digitalmars.D.gnu.)

GDC binaries for the Windows TDM-GCC toolchain are still available here:
https://bitbucket.org/goshawk/gdc/downloads

AFAIK it needs version 4.6.1 of the TDM toolset.


LDC(2) has a recent release with binaries:

https://github.com/downloads/ldc-developers/ldc/ldc-0.10.0-src.tar.gz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86_64.tar.gz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86_64.tar.xz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86.tar.gz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-linux-x86.tar.xz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-osx-x86_64.tar.gz
https://github.com/downloads/ldc-developers/ldc/ldc2-0.10.0-osx-x86_64.tar.xz 



(See also the announcement on the newsgroup digitalmars.D.ldc.)

Both compilers ship a dmd-style compiler driver, called gdmd and 
ldmd2 respectively. Speed is mostly what you'd expect from GCC and 
LLVM.

--
Dmitry Olshansky


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread Sparsh Mittal

Thanks for your insights. It was very helpful.




Re: How to read fastly files ( I/O operation)

2013-02-12 Thread monarch_dodra
On Tuesday, 12 February 2013 at 21:41:14 UTC, bioinfornatics 
wrote:


Sometimes FASTQ files are compressed to gz, bz2 or xz, as they are
often huge. Maybe we need to keep this in mind early in development
and use std.zlib.


While working on making the parser multi-thread compatible, I was 
able to separate the part that feeds data from the part that 
parses it.


Long story short, the parser operates on an input range of 
ubyte[]: it is no longer responsible for data acquisition.


The range can be a simple (wrapped) File, a byChunk, an 
asynchronous file reader, a zip decompressor, or just stdin, I 
guess. The range can be transient.
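
For example, a gz file could probably be fed in with something 
like this (a rough sketch based on std.zlib.UnCompress, not what I 
actually have; a complete version would also need a final 
decomp.flush() for any trailing data):

//
import std.algorithm : map;
import std.stdio : File;
import std.zlib : UnCompress, HeaderFormat;

// Turns a .gz file into an input range of ubyte[] chunks, so the
// parser never needs to know its input was compressed.
auto gzipChunks(string fileName, size_t chunkSize = 64 * 1024)
{
    auto decomp = new UnCompress(HeaderFormat.gzip);
    return File(fileName, "rb")
           .byChunk(chunkSize)
           .map!(raw => cast(ubyte[]) decomp.uncompress(raw.dup));
}

// Usage sketch, assuming a Parser that accepts any input range of ubyte[]:
//   auto parser = new Parser(gzipChunks("reads.fastq.gz"));
//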


However, now that you mention it, I'll make sure it is correctly 
supported.


I'll *try* to show you what I have so far tomorrow (in about 18h).


Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-12 Thread FG

On 2013-02-12 21:39, Sparsh Mittal wrote:

I am finding C++ code is much faster than D code.


I had a look, but first had to make juliaValue global, because g++ had optimized 
all the calculations away. :)  Also changed DIM to 32 * 1024.


13.2s -- g++ -O3
16.0s -- g++ -O2
15.9s -- gdc -O3
15.9s -- gdc -O2
16.2s -- dmd -O -release -inline (v2.060)

Winblows and DMD 32-bit, the rest 64-bit, but still, dmd was quite fast.
Interesting how gdc -O3 gave no extra boost vs. -O2.