Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread Joseph Rushton Wakeling
Hello all, A little challenge that's been bothering me today. The docs for std.algorithm give an illustration of its use to calculate mean and standard deviation in a single pass: // Compute sum and sum of squares in one pass r = reduce!("a + b", "a + b * b")(tuple(0.0, 0.0), a);
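A minimal sketch of the single-pass calculation that the std.algorithm docs illustrate, wrapped in a function so it can be tried directly. The function name meanAndStdDev is hypothetical, and the result is the population standard deviation. This is the sum-of-squares formula the thread goes on to criticise for rounding problems, so treat it as an illustration of reduce rather than a recommended method.

```d
import std.algorithm : reduce;
import std.math : sqrt;
import std.typecons : Tuple, tuple;

/// Hypothetical helper: returns (mean, population standard deviation).
/// Assumes `a` is non-empty.
Tuple!(double, double) meanAndStdDev(const(double)[] a)
{
    // One pass: r[0] accumulates the sum, r[1] the sum of squares.
    auto r = reduce!("a + b", "a + b * b")(tuple(0.0, 0.0), a);
    immutable double n = a.length;
    immutable mean = r[0] / n;
    // E[x^2] - E[x]^2; can lose precision when the mean is large
    // relative to the spread of the data.
    immutable variance = r[1] / n - mean * mean;
    return tuple(mean, sqrt(variance));
}
```

Calling meanAndStdDev([3.0, 4, 7, 11, 3, 2, 5]) gives a mean of 5 and a standard deviation of roughly 2.88.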

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread FG
On 2013-02-13 14:26, Marco Leise wrote: template Julia(TReal) { struct ComplexStruct { float r; float i; ... Why aren't r and i of type TReal?

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 02:26 PM, Marco Leise wrote: You get both, 50% more speed and more precision! It is a win-win situation. Also take a look at Phobos' std.math that returns real everywhere. I have to say, it's not been my experience that using real improves speed. Exactly what optimizations are

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 02:26 PM, Marco Leise wrote: I compiled with LDC2 and these are the results: D code serial with dimension 32768 ... using floats Total time: 13.399 [sec] using doubles Total time: 9.429 [sec] using reals Total time: 8.909 [sec] // - !!! You get both, 50% more speed

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread jerro
... where k represents the index count 1, 2, 3, ... However, it's not evident to me how you could get reduce() to know this counting value. You would use zip and sequence to add indices to x, like this: reduce!reducer(initial, zip(x, sequence!"n")) Where calculating Q[k] is concerned, you

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 14:44:36 +0100, FG h...@fgda.pl wrote: On 2013-02-13 14:26, Marco Leise wrote: template Julia(TReal) { struct ComplexStruct { float r; float i; ... Why aren't r and i of type TReal? They are actual storage in memory,

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread jerro
reduce!reducer(MQ(x.front, 0), zip(x, sequence!"n")) A small correction: you would need to use x.drop(1) instead of x, because the first element of x is only used to compute the initial value of 1. If you wanted k to have the same meaning as the one in your formula, you would need to use
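The idea above, pairing each element with its index k so that reduce can apply a running-mean update, might be sketched as follows. This is not jerro's code, which is cut off here. It assumes the standard Welford recurrences M[k] = M[k-1] + (x[k] - M[k-1]) / k and Q[k] = Q[k-1] + (x[k] - M[k-1]) * (x[k] - M[k]), and it swaps in iota(2, n + 1) for the sequence range mentioned in the thread so that the indices are finite and start at 2 for the second element. The MQ field names and the function name welford are hypothetical.

```d
import std.algorithm : reduce;
import std.range : drop, iota, zip;

/// Running state: m is the mean so far, q the sum of squared deviations.
struct MQ
{
    double m;
    double q;
}

/// Hypothetical helper; assumes `x` is non-empty.
MQ welford(const(double)[] x)
{
    assert(x.length > 0, "need at least one value");
    // The first element seeds the state, so reduce starts at k = 2.
    return reduce!((acc, xk) {
        immutable value = xk[0];
        immutable k = xk[1];
        immutable delta = value - acc.m;
        immutable mean = acc.m + delta / k;
        return MQ(mean, acc.q + delta * (value - mean));
    })(MQ(x[0], 0.0), zip(x.drop(1), iota(2, x.length + 1)));
}
```

The population variance is then q / x.length, and the sample variance is q / (x.length - 1).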

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 14:48:21 +0100, Joseph Rushton Wakeling joseph.wakel...@webdrake.net wrote: On 02/13/2013 02:26 PM, Marco Leise wrote: You get both, 50% more speed and more precision! It is a win-win situation. Also take a look at Phobos' std.math that returns real everywhere. I

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 03:29 PM, Marco Leise wrote: They are actual storage in memory, where every increase in size hurts. When I replaced with TReal, it sped things up for double.

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread FG
On 2013-02-13 14:44, Joseph Rushton Wakeling wrote: The docs for std.algorithm give an illustration of its use to calculate mean and standard deviation in a single pass: [...] However, this formula for standard deviation is one that is well known for being subject to potentially fatal rounding

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread bearophile
jerro: reduce!reducer(MQ(x.front, 0), zip(x, sequence!"n")) A small correction: you would need to use x.drop(1) instead of x, because the first element of x is only used to compute the initial value of 1. If you wanted k to have the same meaning as the one in your formula, you would need to

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 03:48 PM, FG wrote: You can use reduce and put the division and subtraction into the reduce itself to prevent overflows. You also won't end up with jaw-dropping tuples, sorry. :) float[] a = [10_000.0f, 10_001.0f, 10_002.0f]; auto n = a.length; auto avg =
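FG's snippet is cut off above, so the following is only a sketch of the general idea he describes: move the division and the subtraction into the reduce calls so that no large intermediate sum is ever formed. It assumes two passes over the data, one for the mean and one for the variance, and the function name twoPassMeanStdDev is hypothetical.

```d
import std.algorithm : reduce;
import std.math : sqrt;
import std.typecons : Tuple, tuple;

/// Hypothetical helper: returns (mean, population standard deviation).
/// Assumes `a` is non-empty.
Tuple!(double, double) twoPassMeanStdDev(const(double)[] a)
{
    immutable double n = a.length;
    // Pass 1: divide each term by n before adding it to the total.
    immutable avg = reduce!((acc, x) => acc + x / n)(0.0, a);
    // Pass 2: subtract the mean before squaring, so the terms stay small.
    immutable var = reduce!((acc, x) => acc + (x - avg) * (x - avg) / n)(0.0, a);
    return tuple(avg, sqrt(var));
}
```

The cost is a second traversal, which rules this out for input ranges that can only be read once.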

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 15:00:21 +0100, Joseph Rushton Wakeling joseph.wakel...@webdrake.net wrote: Compiling with ldmd2 -O -inline -release on 64-bit Ubuntu, latest from-GitHub LDC, LLVM 3.2: D code serial with dimension 32768 ... using floats Total time: 4.751 [sec] using

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 03:48 PM, FG wrote: Typical thing with examples - they try to be terse and show off a mechanism like reduce, without going into too much details and hence are unusable IRL. My favourite -- in the tutorial for a really serious piece of scientific code written in C: int n =

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 03:56 PM, Marco Leise wrote: Ok, I get pretty much the same numbers as before with: ldmd2 -O -inline -release It's even a bit faster than my long command line. My experience has been that the higher -O values of ldc don't do much, but of course, that's going to vary

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 15:45:13 +0100, Joseph Rushton Wakeling joseph.wakel...@webdrake.net wrote: On 02/13/2013 03:29 PM, Marco Leise wrote: They are actual storage in memory, where every increase in size hurts. When I replaced with TReal, it sped things up for double. Give me that stuff,

Re: Calculating mean and standard deviation with std.algorithm.reduce

2013-02-13 Thread jerro
See enumerate(): http://d.puremagic.com/issues/show_bug.cgi?id=5550 I like this enumerate() thing. Is there any particular reason why it isn't in phobos, or is it just that no one has added it yet? I think with enumerate it becomes: MQ(x.front, 0).enumerate(1).reduce!reducer() I think the

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread FG
Good point about choosing the right type of floating point numbers. Conclusion: when there's enough space, always pick double over float. Tested with GDC in win64. floats: 16.0s / doubles: 14.1s / reals: 11.2s. I thought to myself: cool, I almost beat the 13.4s I got with C++, until I changed

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 15:45:13 +0100, Joseph Rushton Wakeling joseph.wakel...@webdrake.net wrote: On 02/13/2013 03:29 PM, Marco Leise wrote: They are actual storage in memory, where every increase in size hurts. When I replaced with TReal, it sped things up for double. Oh this gets even

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread FG
On 2013-02-13 16:26, Marco Leise wrote: I'd still bet a dollar that with an array of values floats would outperform doubles, when cache misses happen. (E.g. more or less random memory access.) I'll play it safe and only bet my opDollar. :)

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 04:17 PM, FG wrote: Good point about choosing the right type of floating point numbers. Conclusion: when there's enough space, always pick double over float. Tested with GDC in win64. floats: 16.0s / doubles: 14.1s / reals: 11.2s. I thought to myself: cool, I almost beat the 13.4s

Re: Why are commands executing out of order?

2013-02-13 Thread Andrea Fontana
On Saturday, 2 February 2013 at 15:39:00 UTC, Namespace wrote: I've never come across Appenders before. Could you please explain them a little bit, and what each call in your modified code does? Thanks, Josh http://dlang.org/phobos/std_array.html#.Appender And read Era's post. Why

Re: A little of coordination for Rosettacode

2013-02-13 Thread bearophile
If any of you have a little time to review code, I have converted the very fast C memory mapped version of Ordered words to D: http://rosettacode.org/wiki/Ordered_words#Mmap http://rosettacode.org/wiki/Ordered_words#Memory_Mapped_Version The C version contains several pointers that get

Re: Why are commands executing out of order?

2013-02-13 Thread bearophile
Andrea Fontana: Why does Appender have no ~ operator itself? auto app = appender!string(); app.put("a"); // why not app ~= "a"? Now it's in GIT head. Bye, bearophile
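A short sketch of the two spellings discussed above, assuming a Phobos release that includes the ~= support bearophile mentions (it was not yet in a release when this was posted). The function name buildAb is hypothetical.

```d
import std.array : appender;

/// Hypothetical helper that appends with both spellings.
string buildAb()
{
    auto app = appender!string();
    app.put("a"); // the original Appender interface
    app ~= "b";   // operator form, equivalent to put
    return app.data;
}
```

Both calls append to the same buffer, so buildAb() returns "ab".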

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 16:17:12 +0100, FG h...@fgda.pl wrote: Good point about choosing the right type of floating point numbers. Conclusion: when there's enough space, always pick double over float. Tested with GDC in win64. floats: 16.0s / doubles: 14.1s / reals: 11.2s. I thought to myself:

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Joseph Rushton Wakeling
On 02/13/2013 04:41 PM, Joseph Rushton Wakeling wrote: On 02/13/2013 04:17 PM, FG wrote: Good point about choosing the right type of floating point numbers. Conclusion: when there's enough space, always pick double over float. Tested with GDC in win64. floats: 16.0s / doubles: 14.1s / reals:

Importing problems

2013-02-13 Thread Korey Peters
Hi everyone. I'm new to D, coming from a Java/Python background. I've been reading the excellent The D Programming Language book, and want to now start playing around with D. I'm having an issue with importing. When I have the following file: file ~/src/sample.d: =

Re: How to read fastly files ( I/O operation)

2013-02-13 Thread monarch_dodra
On Tuesday, 12 February 2013 at 22:06:48 UTC, monarch_dodra wrote: On Tuesday, 12 February 2013 at 21:41:14 UTC, bioinfornatics wrote: Sometimes fastq files are compressed to gz, bz2 or xz, as they are often huge. Maybe we need to keep this in mind early in development and use std.zlib While

Re: Importing problems

2013-02-13 Thread H. S. Teoh
On Wed, Feb 13, 2013 at 06:17:51PM +0100, Korey Peters wrote: [...] ...and at the terminal: me@ubuntu:~/src$ rdmd sample.d /tmp/.rdmd-1000/rdmd-sample.d-94E53075E2E84D963426A11F2B81FDED/objs/sample.o: In function `_Dmain': sample.d:(.text._Dmain+0xa): undefined reference to

Re: Importing problems

2013-02-13 Thread Korey Peters
Thanks for your response, H.S.Teoh. On Wednesday, 13 February 2013 at 17:47:09 UTC, H. S. Teoh wrote: You need to specify both files on the command line, so that the linker knows where to find everything: rdmd sample.d sample_a.d Running this from the command line produces

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread Marco Leise
On Wed, 13 Feb 2013 18:10:47 +0100, Joseph Rushton Wakeling joseph.wakel...@webdrake.net wrote: Just to update on times. I was running another large job at the same time as doing all these tests, so there was some slowdown. Current results are: -- with g++ -O3 and using double rather

Re: Importing problems

2013-02-13 Thread H. S. Teoh
On Wed, Feb 13, 2013 at 06:57:52PM +0100, Korey Peters wrote: Thanks for your response, H.S.Teoh. On Wednesday, 13 February 2013 at 17:47:09 UTC, H. S. Teoh wrote: You need to specify both files on the command line, so that the linker knows where to find everything: rdmd sample.d

Re: Finding large difference b/w execution time of c++ and D codes for same problem

2013-02-13 Thread jerro
When you are comparing LDC and GDC, you should either use -mcpu=generic for LDC or -march=native for GDC, because their default targets are different. GDC will produce code that works on most x86_64 CPUs (if you are on an x86_64 system) by default, and LDC targets the host CPU. But this does

Re: Importing problems

2013-02-13 Thread Korey Peters
Hmm. I moved my two sample files from ~/path/to/where/I/was/working to ~/ and the import worked. This makes me suspect a permissions issue. I'll carry on working in ~/ for now, until I sort my stupidity out! Thanks for your help.

Re: Importing problems

2013-02-13 Thread jerro
On Wednesday, 13 February 2013 at 18:42:51 UTC, H. S. Teoh wrote: On Wed, Feb 13, 2013 at 06:57:52PM +0100, Korey Peters wrote: Thanks for your response, H.S.Teoh. On Wednesday, 13 February 2013 at 17:47:09 UTC, H. S. Teoh wrote: You need to specify both files on the command line, so that

Re: How to read fastly files ( I/O operation)

2013-02-13 Thread FG
On 2013-02-13 18:39, monarch_dodra wrote: In any case, I am now parsing the 6Gig packed into 1.5Gig in about 53 seconds (down from 61). I also tried doing a dual-threaded approach (1 thread to unzip, 1 thread to parse), but again, the actual *parse* phase is so ridiculously fast, that it changes

Re: Importing problems

2013-02-13 Thread Korey Peters
On Wednesday, 13 February 2013 at 19:33:00 UTC, jerro wrote: This solves the issue: rdmd --force sample Hi jerro, That definitely helped. There's still some things I haven't figured out yet about D's importing, but this has got me going. Thank you.

std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Ivan Kazmenko
Hi! I'm learning to use D collections properly, and I'm looking for a sorted data structure with logarithmic access time (i.e., a binary search tree will do, but a hash table would not help). As far as I can see, std.container.RedBlackTree is exactly what I need. However, I am not sure if I

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Ivan Kazmenko
P.S. More on C++ version: Personally, I don't see why at all we should call the copy constructor more than once per element. I mean, if we intend to build a generic data structure, we sure need an internal node object with some extra bytes (internal references and counters) per each element,

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Jonathan M Davis
On Thursday, February 14, 2013 00:33:26 Ivan Kazmenko wrote: P.S. More on C++ version: Personally, I don't see why at all we should call the copy constructor more than once per element. I mean, if we intend to build a generic data structure, we sure need an internal node object with some

Re: Importing problems

2013-02-13 Thread Brad Roberts
On Wed, 13 Feb 2013, jerro wrote: I created self-contained sample.d, ran it with rdmd, then moved class A to sample_a.d and tried to run it with rdmd again. I could reproduce the issue that way. It seems that rdmd caches the dependency list. Using --chatty flag confirms that rdmd does not run

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Rob T
You can check if disabling the GC just before the insert process improves the performance. You may see 3x performance improvement. Disabling is safe provided you re-enable, this can be done reliably with scope(exit) or something similar. import core.memory; // ... void main () {
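Rob T's own example is cut off above. A minimal sketch of the pattern he describes, pausing collections during a bulk insert and using scope(exit) to guarantee they are re-enabled, could look like this. The function name buildTree and the use of int keys are assumptions for illustration, and any speedup will depend on the workload.

```d
import core.memory : GC;
import std.container : RedBlackTree;

/// Hypothetical helper: fills a tree with 0 .. n while collections are paused.
RedBlackTree!int buildTree(int n)
{
    GC.disable();
    // Runs on every exit path, including exceptions, so the GC
    // cannot be left disabled by accident.
    scope (exit) GC.enable();

    auto tree = new RedBlackTree!int;
    foreach (i; 0 .. n)
        tree.insert(i); // each insert allocates a node on the GC heap
    return tree;
}
```

Allocation still goes through the GC heap while collections are disabled; only the collection cycles are postponed.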

Mixin template function

2013-02-13 Thread cal
Should the following work? import std.traits; mixin template Foo() { void foo(T)(T t) if (isSomeString!T) {} } class A { void foo()(int i){} mixin Foo; } void main() { auto a = new A; a.foo("hello"); } Error: template hello.A.foo does not match any function template

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread FG
On 2013-02-14 01:09, Rob T wrote: You can check if disabling the GC just before the insert process improves the performance. You may see 3x performance improvement. Disabling is safe provided you re-enable, this can be done reliably with scope(exit) or something similar. How did you know? It

Re: Importing problems

2013-02-13 Thread jerro
Please file a bug report on this. Done.

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Steven Schveighoffer
On Wed, 13 Feb 2013 18:22:02 -0500, Ivan Kazmenko ga...@mail.ru wrote: Hi! I'm learning to use D collections properly, and I'm looking for a sorted data structure with logarithmic access time (i.e., a binary search tree will do, but a hash table would not help). As far as I can see,

Re: Importing problems

2013-02-13 Thread H. S. Teoh
On Wed, Feb 13, 2013 at 03:53:51PM -0800, Brad Roberts wrote: On Wed, 13 Feb 2013, jerro wrote: I created self-contained sample.d, ran it with rdmd, then moved class A to sample_a.d and tried to run it with rdmd again. I could reproduce the issue that way. It seems that rdmd caches the

Re: Mixin template function

2013-02-13 Thread cal
And a related question: class A { void foo(int i){} void foo(Tuple!(int) i){} } class B: A { override void foo(int i){} } int main() { auto b = new B; b.foo(tuple(5)); } This fails to compile. Why can't B use A's tuple overload of foo()? If I do this: class B: A {

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread monarch_dodra
On Wednesday, 13 February 2013 at 23:22:03 UTC, Ivan Kazmenko wrote: Hi! - Ivan Kazmenko. Keep in mind that C++ and D have very different philosophies regarding copy construction. C++ has strong ownership, so for example, whenever you copy a string/vector, or pass it by value, the

Re: Mixin template function

2013-02-13 Thread monarch_dodra
On Thursday, 14 February 2013 at 00:29:51 UTC, cal wrote: Should the following work? import std.traits; mixin template Foo() { void foo(T)(T t) if (isSomeString!T) {} } class A { void foo()(int i){} mixin Foo; } void main() { auto a = new A; a.foo("hello"); } Error:

Re: Mixin template function

2013-02-13 Thread monarch_dodra
On Thursday, 14 February 2013 at 05:49:33 UTC, cal wrote: And a related question: class A { void foo(int i){} void foo(Tuple!(int) i){} } class B: A { override void foo(int i){} } int main() { auto b = new B; b.foo(tuple(5)); } This fails to compile. Why can't B use A's
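The reply is cut off here, so this is not monarch_dodra's answer, only a sketch of the usual D idiom for the situation in the question: a derived class that overrides one overload hides the base class's other overloads of the same name unless they are brought back in with an alias. The methods return strings instead of void purely so the dispatch is visible; that change is an assumption made for illustration.

```d
import std.typecons : Tuple, tuple;

class A
{
    string foo(int i) { return "A.foo(int)"; }
    string foo(Tuple!int t) { return "A.foo(Tuple!int)"; }
}

class B : A
{
    // Without this alias, B.foo(int) hides every A.foo overload,
    // and b.foo(tuple(5)) fails to compile.
    alias foo = A.foo;

    override string foo(int i) { return "B.foo(int)"; }
}
```

With the alias in place, a call on a B instance with tuple(5) resolves to A's Tuple!int overload, while a call with a plain int still reaches B's override.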

Re: std.container.RedBlackTree versus C++ std::set

2013-02-13 Thread Rob T
On Thursday, 14 February 2013 at 00:25:15 UTC, FG wrote: On 2013-02-14 01:09, Rob T wrote: You can check if disabling the GC just before the insert process improves the performance. You may see 3x performance improvement. Disabling is safe provided you re-enable, this can be done reliably with