Re: Passing Templated Function Arguments Solely by Reference

2014-07-09 Thread Ali Çehreli via Digitalmars-d-learn

On 07/08/2014 05:13 PM, Nordlöw wrote:

 If I want randInPlace to take value arguments (such as structs) by
 reference and reference types (classes) as normal is this

I don't understand what it means to fill a struct or a class object with 
random content.


 /** Generate Random Contents in $(D x).
   */
 auto ref randInPlace(T)(auto ref T x) @safe /* nothrow */ if 
(isIterable!T)


hasAssignableElements is more correct.

 {
  foreach (ref elt; x)
  {
  import std.range: ElementType;
  static if (isInputRange!(ElementType!T))

The documentation of hasAssignableElements mentions that it implies 
isForwardRange and it makes sense: You don't want the range to be 
consumed as an InputRange would do.


  elt[].randInPlace;
  else
  elt.randInPlace;
  }
  return x;
 }

 And how does this compare to using x[].randInPlace() when x is a static
 array?

Range algorithms don't work with static arrays because they can't 
popFront(). The solution is to use a slice of the entire array, as 
you've already done with x[]. ;)


 Does x[] create unnecessary GC-heap activity in this case?

No. The static array will remain in memory and x[] will be a local slice. A 
slice consists of two members, the equivalent of the following:


struct __SliceImpl
{
size_t length;
void * pointer_to_first_element;
}
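
For example, a quick sketch (names are made up) showing that slicing a 
static array just builds such a length/pointer pair, with no heap 
allocation:

```
import std.range;

void main()
{
    int[4] arr = [ 1, 2, 3, 4 ];
    auto s = arr[];        // slice of the whole static array; no heap activity
    static assert(isForwardRange!(typeof(s)));
    assert(s.length == 4);
    assert(s.ptr == &arr[0]);   // points straight at the static array's storage
}
```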

 I'm wondering because (auto ref T x) is just used in two places in
 std.algorithm and std.range in Phobos. Is this a relatively new
 enhancement?

Phobos algorithms use ranges. The following is what I've come up with 
very quickly:


import std.stdio;
import std.range;
import std.traits;
import std.random;

void randInPlace(R)(R range)
if (hasAssignableElements!R)
{
foreach (ref e; range) {
e.randInPlace();
}
}

void randInPlace(E)(ref E element)
if (isNumeric!E)
{
// BUG: Never assigns the value E.max
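// (A closed-interval call such as uniform!"[]"(E.min, E.max) would also
// cover E.max, assuming that overload is available in this Phobos version.)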
element = uniform(E.min, E.max);
}

void randInPlace(E)(ref E element)
if (isBoolean!E)
{
element = cast(bool)uniform(0, 2);
}

void main()
{
auto arr = [ [ 0, 1, 2 ], [ 3, 4, 5 ] ];
arr.randInPlace();
writefln("%s", arr);

auto barr = [ [ false, true ], [ false, true ] ];
barr.randInPlace();
writefln("%s", barr);
}

Ali



__VERSION__ and the different compilers

2014-07-09 Thread Mike Parker via Digitalmars-d-learn
Is it safe to assume that __VERSION__ is the same among DMD, LDC and GDC 
when using the equivalent front-end? I want to implement @nogc in 
Derelict in a backward compatible way. The simple thing to do is (at the 
suggestion of w0rp):


static if( __VERSION__ < 2066 ) enum nogc = 1;

I just want to verify that this is sufficient and that I don't need to 
test for version(DMD) and friends as well.





Re: __VERSION__ and the different compilers

2014-07-09 Thread bearophile via Digitalmars-d-learn

Mike Parker:

Is it safe to assume that __VERSION__ is the same among DMD, 
LDC and GDC when using the equivalent front-end?


Right. An alternative solution is to use __traits(compiles) and 
use @nogc inside it.
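
For example, a rough sketch (the name compilerHasNoGC and the exact probe 
declaration are mine and may need tweaking):

```
// Evaluates to true on front-ends that understand @nogc, false otherwise.
enum compilerHasNoGC = __traits(compiles, { @nogc void probe() {} });
```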


Bye,
bearophile


Re: __VERSION__ and the different compilers

2014-07-09 Thread Andrej Mitrovic via Digitalmars-d-learn
On 7/9/14, Mike Parker via Digitalmars-d-learn
digitalmars-d-learn@puremagic.com wrote:
 Is it safe to assume that __VERSION__ is the same among DMD, LDC and GDC
 when using the equivalent front-end?

Yes, but not all future compilers might implement this (although I
hope they will). I think there's also __VENDOR__ IIRC.
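
For reference, a quick way to see what a given compiler reports (the values 
in the comments are what I'd expect from DMD, so treat them as illustrative):

pragma(msg, __VENDOR__);    // e.g. "Digital Mars D" on DMD
pragma(msg, __VERSION__);   // e.g. 2066 for a 2.066 front-end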


Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

Hello,

I extracted a part of my code written in C.
It is deliberately useless here, but I would like to understand the 
different techniques for optimizing this kind of code with the GDC 
compiler.


It currently runs under a microsecond.

Constraint : the way the code is expressed cannot be changed much; 
we need that double loop because there are other operations 
involved in the first loop's scope.


main.c :
[code]
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "jol.h"
#include <time.h>
#include <sys/time.h>
int main(void)
{

struct timeval s,e;
gettimeofday(&s,NULL);

int pol = 5;
tes(&pol);


int arr[] = {9,16,458,2,68,5452,98,32,4,565,78,985,3215};
int len = 13-1;
int g = 0;

for (int x = 36; x >= 0 ; --x ){
// some code here erased for the test
for(int y = len ; y >= 0; --y){
//some other code here
++g;
arr[y] +=1;

}

}
gettimeofday(&e,NULL);

printf("so ? %d %lu %d %d %d",g,e.tv_usec - s.tv_usec, 
arr[4],arr[9],pol);

return 0;
}
[/code]

jol.c
[code]
void tes(int * restrict a){

*a = 9;

}
[/code]

and jol.h

#ifndef JOL_H
#define JOL_H
void tes(int * restrict a);
#endif // JOL_H


Now, the D counterpart:

module main;

import std.stdio;
import std.datetime;
import jol;
int main(string[] args)
{


auto currentTime = Clock.currTime();

int pol = 5;
tes(pol);
pol = 8;

int arr[] = [9,16,458,2,68,5452,98,32,4,565,78,985,3215];
int len = 13-1;
int g = 0;

for (int x = 31; x >= 0 ; --x ){

for(int y = len ; y >= 0; --y){

++g;
arr[y] +=1;

}

}
auto currentTime2 = Clock.currTime();
writefln("Hello World %d %s %d %d\n",g, (currentTime2 - 
currentTime),arr[4],arr[9]);


return 0;
}

and

module jol;
final void tes(ref int a){

a = 9;

}


Ok, the compilation options :
gdc hello.d jol.d -O3 -frelease -ftree-loop-optimize

gcc -march=native -std=c11 -O2 main.c jol.c

Now the performance :
D : 12 µs
C : < 1 µs

Where does the difference come from ? Is there a way to optimize the D 
version ?


Again, I am absolutely new to D and those are my very first lines 
of code with it.


Thanks


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread NCrashed via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 10:57:33 UTC, Larry wrote:

[...]


Clock isn't an accurate benchmark instrument. Try 
std.datetime.benchmark:

```
module main;

import std.stdio;
import std.datetime;

void tes(ref int a)
{
a = 9;
}

int[] arr = [9,16,458,2,68,5452,98,32,4,565,78,985,3215];

void foo()
{
int pol = 5;
tes(pol);
pol = 8;
int g = 0;

foreach_reverse(x; 0..31)
{
foreach_reverse(ref a; arr)
{
++g;
a += 1;
}
}
}

void main()
{
auto res = benchmark!foo(1000); // take mean of 1000 launches
writeln(res[0].msecs, " ", arr[4], " ", arr[9]);
}
```

Dmd time: 1 us
Gcc time: <= 1 us


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread bearophile via Digitalmars-d-learn

Larry:


Now the performance :
D : 12 µs
C : < 1 µs

Where does the diff comes from ? Is there a way to optimize the 
d version ?


Again, I am absolutely new to D and those are my very first 
line of code with it.


Your C code is not equivalent to the D code, there are small 
differences, even the output is different. So I've cleaned up 
your C and D code:




// C code.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include "jol.h"

int main() {
struct timeval s, e;
gettimeofday(&s, NULL);

int pol = 5;
tes(&pol);

int arr[] = {9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78, 
985, 3215};

int len = 13 - 1;
int g = 0;

for (int x = 36; x >= 0; --x) {
for (int y = len; y >= 0; --y) {
++g;
arr[y]++;
}
}

gettimeofday(&e, NULL);
printf("C: %d %lu %d %d %d\n",
   g, e.tv_usec - s.tv_usec, arr[4], arr[9], pol);

return 0;
}



D code (final functions have not much meaning, but the D 
compiler is very sloppy and doesn't complain):



module jol;

void tes(ref int a) {
a = 9;
}


-

module maind;

void main() {
import std.stdio;
import std.datetime;
import jol;

StopWatch sw;
sw.start;

int pol = 5;
tes(pol);

int[] arr = [9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78, 
985, 3215];

int len = 13 - 1;
int g = 0;

for (int x = 36; x >= 0; --x) {
// Some code here erased for the test.
for (int y = len; y >= 0; --y) {
// Some other code here.
++g;
arr[y]++;
}
}

sw.stop;
writefln("D: %d %d %d %d %d",
 g, sw.peek.nsecs, arr[4], arr[9], pol);
}



That D code is not fully idiomatic, this is closer to idiomatic D 
code:



module jol2;

void test(ref int x) pure nothrow @safe {
x = 9;
}



module maind;

void main() {
import std.stdio, std.datetime;
import jol2;

StopWatch sw;
sw.start;

int pol = 5;
test(pol);

int[13] arr = [9, 16, 458, 2, 68, 5452, 98, 32, 4, 565, 78, 
985, 3215];

uint count = 0;

foreach_reverse (immutable _; 0 .. 37) {
foreach_reverse (ref ai; arr) {
count++;
ai++;
}
}

sw.stop;
writefln("D: %d %d %d %d %d",
 count, sw.peek.nsecs, arr[4], arr[9], pol);
}



In my benchmarks I haven't used the more idiomatic D code, I 
have used the C-like code. But the run-time is essentially the 
same.


I compile the C and D code with (on a 32 bit Windows):

gcc -march=native -std=c11 -O2 main.c jol.c -o main

ldmd2 -wi -O -release -inline -noboundscheck maind.d jol.d
strip maind.exe

For the D code I've used the latest ldc2 compiler (V. 0.13.0, 
based on DMD v2.064 and LLVM 3.4.2), GCC is V.4.8.0 
(rubenvb-4.8.0).




The C code gives as output:

C: 481 0 105 602 9


The D code gives as output:

D: 481 6076 105 602 9

--

If I slow down the CPU at half speed the C code runs in about 
0.05 seconds, the D code runs in about 0.07 seconds.


Such run times are much too small to perform a sufficiently 
meaningful comparison. You need a run-time of about 2 seconds to 
get meaningful timings.


The difference between 0.05 and 0.07 is caused by initializing 
the D runtime (like the D GC); it takes about 0.015 seconds on my 
systems at full CPU speed to initialize the D runtime, and it's a 
constant cost.


Bye,
bearophile


Re: Visual D: Settings to Improve compil and link process

2014-07-09 Thread ParticlePerter via Digitalmars-d-learn

On Monday, 7 July 2014 at 22:00:51 UTC, Rainer Schuetze wrote:



On 07.07.2014 12:46, ParticlePeter wrote:

On Sunday, 6 July 2014 at 19:27:38 UTC, Rainer Schuetze wrote:

These object files are in the library ;-) That means manual 
selection,
though, as incremental builds to multiple object files don't 
work with

dmd, and single file compilation is painfully slow.


Not sure if I am getting this right: so when one object file 
has to be recompiled, all other object files, even if up to date, 
would be recompiled ?


That's how it is currently done if you don't use single file 
compilation. Compiling only modified and dependent modules in 
one step could work incrementally, but especially template 
instantiations make this hard to do correctly.




The modules from MyProject do import the MyLib modules 
properly, I do not get compiler errors. However, the compiler 
should create object files from the MyLib modules, and the linker 
should link them. But it does not.
On the other hand, when I add MyLib modules to MyProject 
(right-click MyProject -> Add -> Existing Item... MyLib source 
files) then linking works. I do not understand why the latter 
step is necessary.


dmd does not compile imported modules, but rdmd does.


Ähm ... not seeing the connection here either, why is this 
significant ?


dmd just compiles the files given on the command line. rdmd 
makes two passes, one to collect imported files, and another to 
compile all the collected files. So rdmd works the way you want 
dmd to work (if I understand you correctly).
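
For example (file names hypothetical), with main.d importing mylib.d:

dmd main.d    (compiles only main.d; mylib.d must be listed explicitly or pre-built)
rdmd main.d   (collects the import, compiles main.d plus mylib.d, then runs the result)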


I feel that I could not explain my problem properly, so one 
example:
Importing phobos modules. I do not have to define any import 
path or lib
file in the project settings, I just need to import 
std.something. That's
because the import path for phobos modules are stored in the 
dmd sc.ini

file.
When I want to import my modules which are somewhere on my 
hard-drive
and not added to my project I need to tell the compiler where 
these
modules can be found, using the additional import path project 
setting.

That's fine, doing this.

But the result is: std.something works, while my modules in a path 
known by the compiler don't work, giving me linker errors. Why ? 
( I do not create a lib, I just want to import the module. )



phobos is precompiled to a library and is automatically 
included in the link. If you want your custom modules to work 
the same way, you have to compile them to a library.
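
A rough sketch of that workflow (paths and names hypothetical):

dmd -lib -ofMyLib.lib <all the MyLib .d files>

Then add MyLib.lib to MyProject's linker inputs (or to the dmd command 
line) together with the import path you already configured.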


Thanks for clarifying all of the above. If the standard library is 
rebuilt whenever a template is used, then my assumption that 
MyLib compiles slowly due to the usage of ( very simple ) templates 
must be wrong. I will dive deeper into profiling. Thanks a lot.


Cheers, ParticlePeter



Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 12:25:40 UTC, bearophile wrote:

[...]


You are definitely right, I did mess up while translating !

I ran the corrected code (the one I meant to provide :S) 
and on a slow MacBook I end up with :

C : 2
D : 15994

Of course, when run on very high-end machines this difference is almost 
non-existent, but we want to run on very low-powered hardware.


OK, even with longer code, there will always be a launch 
penalty for D. So I cannot use it for very high performance loops.


Shame for us..
:)

Thanks and bye



Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread John Colvin via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 13:18:00 UTC, Larry wrote:

[...]


You are definitely right, I did mess up while translating !

I run the corrected codes (the ones I was meant to provide :S) 
and on a slow macbook I end up with :

C : 2
D : 15994

Of course when run on very high end machines, this diff is 
almost non existent but we want to run on very low powered 
hardware.


Ok, even with a longer code, there will always be a launch 
penalty for d. So I cannot use it for very high performance 
loops.


Shame for us..
:)

Thanks and bye


Could you provide the exact code you are using for that 
benchmark? Once the program has started up you should be able to 
obtain performance parity between C and D. Situations where this 
isn't true are problems we would like to know about.


For the amount of work you are doing in the test program (almost 
nothing), the total runtime is probably dominated by the program 
load time etc. even when using C.


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 13:46:59 UTC, Larry wrote:
Yes, you are perfectly right, but our need is to run the fastest 
code on the lowest-powered machines. Not servers but embedded 
systems.


That is why I just test the overall structures.

The rest of the code is numerical, so it will not change by much 
the fact that D cannot win back the huge launch time. At the 
microsecond level (even nano) it counts because of electrical 
consumption, size of hardware, heat and so on.


It is definitely not something most care about, and I cannot 
disclose the full code for license reasons (yeah, I know I suck 
and generate some fuss for nothing, but.. I just execute.)


But D may be of use to us for non-critical code, to replace some 
Python here and there. It is definitely a good piece of 
engineering. And it will help save money.


@John Colvin :
ahem, did you mean the sample code or the real code ? If the former, 
it is the one corrected by Bearophile.

My excuses


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

@Bearophile: just tried. No dramatic change.

import core.memory;

void main() {
GC.disable;
...
}


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread bearophile via Digitalmars-d-learn

Larry:


@Bearophile: just tried. No dramatic change.

import core.memory;

void main() {
GC.disable;
...
}


That just means disabling the GC, so the start time is the same.
What you want is to not start the GC/runtime, stubbing it out... 
(assuming you don't need the GC in your program).


I think you can stub out the runtime functions by defining a few empty 
extern(C) functions, but I've never had to do it (saving 0.015 
seconds is not important for my needs), so if you don't know how 
to do it, you'll have to ask others.
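
For what it's worth, a minimal sketch of one way to avoid the druntime 
start-up cost entirely (untested here, and it assumes the program uses no 
GC, no module constructors and no other runtime features):

```
// Defining a C main instead of a D main means druntime is never
// initialized: no GC start-up, no module constructors.
extern(C) int main()
{
    // Only runtime-free code belongs here (no GC allocations,
    // no writeln, etc.); use C's printf and friends instead.
    return 0;
}
```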


Bye,
bearophile


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread John Colvin via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 13:46:59 UTC, Larry wrote:
The rest of the code is numerical so it will not change by much 
the fact that d cannot get back the huge launching time. At the 
microsecond level(even nano) it counts because of electrical 
consumption, size of hardware, heat and so on.


You say you are worried about microseconds and power consumption, 
but you are suggesting launching a new process - a lot of 
overhead - to do a small amount of numerical work.


Surely no matter what programming language you use you would not 
want to work like this?


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 14:30:41 UTC, John Colvin wrote:
You say you are worried about microseconds and power 
consumption, but you are suggesting launching a new process - a 
lot of overhead - to do a small amount of numerical work.


Not much overhead if you don't use an MMU and use static linking.


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jul 09, 2014 at 07:51:24AM +0200, Philippe Sigaud via 
Digitalmars-d-learn wrote:
 On Tue, Jul 8, 2014 at 7:50 AM, H. S. Teoh via Digitalmars-d-learn
 digitalmars-d-learn@puremagic.com wrote quite a wall of text
 
 Wow, what to add to that? Maybe you scared others from participating
 ;-)

I hope not. :)


[...]
 * I'd add static introspection to the mix: using static if,
 __traits(...) and is(...), clunky as the syntax is (there, one 'ugly'
 thing for you), is easy and very powerful:
[...]

Oh yeah, I forgot about that one. The syntax of is-expressions is very
counterintuitive (not to mention noisy), and has too many special-cased
meanings that are completely non-obvious for the uninitiated, for
example:

// Assume T = some type
is(T)   // is T a valid type?
is(T U) // is T a valid type? If so, alias it to U
is(T : U)   // is T implicitly convertible to U?
is(T U : V) // is T implicitly convertible to V? If so,
// alias it to U
is(T U : V, W)  // does T match the type pattern V, for some
// template arguments W? If so, alias to U
is(T == U)  // is T the same type as U?

// You thought the above is (somewhat) consistent? Well look at
// this one:
is(T U : __parameters)
// is T the type of a function? If so, alias U
// to the parameter tuple of its arguments.

That last one is remarkably pathological: it breaks away from the
general interpretation of the other cases, where T is matched against
the right side of the expression; here, __parameters is a magic keyword
that makes the whole thing mean something else completely. Not to
mention, what is returned in U is something extremely strange; it
looks like a type tuple, but it's actually something more than that.
Unlike usual type tuples, in addition to encoding the list of types of
the function's parameters, it also includes the parameter names and
attributes...  except that you can only get at the parameter names using
__traits(name,...). But indexing it like a regular type tuple will
reduce its elements into mere types, on which __traits(name,...) will
fail; you need to take 1-element slices of it in order to preserve the
additional information.

This strange, inconsistent behaviour only barely begins to make sense
once you understand how it's implemented in the compiler. It's the
epitome of leaky abstraction.


T

-- 
Do not reason with the unreasonable; you lose by definition.


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread Meta via Digitalmars-d-learn

On Monday, 7 July 2014 at 23:47:26 UTC, Aerolite wrote:

Hey all,

I've not posted here in a while, but I've been keeping up to 
speed with D's progress over the last couple of years and 
remain consistently impressed with the language.


I'm part of a new computing society in the University of 
Newcastle, Australia, and am essentially known throughout our 
Computer Science department as 'the D guy'. At the insistence 
of my peers, I have decided to give an introductory lecture on 
the D Programming Language, in order to expose more students to 
the increasingly amazing aspects of D. I expect to cover the 
good, the bad, the awesome, and the ugly, in a 
complement-criticism-complement styled talk, and while I have 
my own opinions regarding each of these things, I'd like a 
broader view from the community regarding these aspects, so 
that I may provide as accurate and as useful information as 
possible.


So, if you would be so kind, give me a bullet list of the 
aspects of D you believe to be good, awesome, bad, and/or ugly. 
If you have the time, some code examples wouldn't go amiss 
either! Try not to go in-depth to weird edge cases - remain 
general, yet informative. E.g. I consider D's string mixins to 
be in the 'awesome' category, but its reliance on the GC for 
large segments of the standard library to be in the 'ugly' 
category.


Thanks so much for your time!


opDispatch is a mostly untapped goldmine of potential. Just take 
a look at this thread, where an (almost, depends on the compiler) 
no-cost safe dereference wrapper was implemented using it: 
http://forum.dlang.org/post/mailman.2584.1403213951.2907.digitalmar...@puremagic.com


opDispatch also allows for vector swizzling, which is really nice 
for any kind of vector work.
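
For instance, a tiny illustrative sketch (not taken from any particular 
library) of how opDispatch makes swizzles like v.zyx work:

```
struct Vec3
{
    float x = 0, y = 0, z = 0;

    // Any member access that isn't x, y or z lands here, with the member
    // name as a compile-time string, e.g. v.zyx gives s == "zyx".
    auto opDispatch(string s)() const
        if (s.length >= 2 && s.length <= 4)
    {
        float[s.length] r;
        foreach (i, c; s)
        {
            switch (c)
            {
                case 'x': r[i] = x; break;
                case 'y': r[i] = y; break;
                case 'z': r[i] = z; break;
                default:  assert(0, "invalid swizzle component");
            }
        }
        return r;
    }
}

unittest
{
    auto v = Vec3(1, 2, 3);
    assert(v.zyx == [3f, 2f, 1f]);
    assert(v.xxyy == [1f, 1f, 2f, 2f]);
}
```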


One of the uglier things in D is also a long-standing problem 
with C and C++, in that comparison of signed and unsigned values 
is allowed.


How to interact with fortran code

2014-07-09 Thread seany via Digitalmars-d-learn
I apologize many times for this question; maybe this has already 
been answered somewhere, but considering that today the last of my 
nerves is broken, I cannot really find the solution.


So I have a D code which acts as a central manager of all my 
codes: it reads user input, reads files, etc., and based on the file 
readouts I would like to pass some variables from the D code to 
a Fortran code, in binary format perhaps, if such a thing 
exists, instead of encoding to text/ASCII first.


I would also like to read some (not all) variables back from the 
fortran code.


The Fortran code resides in a subdirectory to the path/to/d/code

How to do this? Is there a preferred way, easier than a system 
call, to interface D and Fortran code? This must be Fortran 
code - these are the standard atmospheric chemistry codes.


I apologize again if the question is stupid, trust me, today all 
my nerves are broken.
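
(For what it's worth, the usual route is to give the Fortran routine a 
C-compatible interface, e.g. with Fortran 2003's bind(c)/iso_c_binding, 
and to declare it on the D side as extern(C); the data then crosses the 
boundary in binary form, with no text encoding involved. A hedged sketch 
of the D side, with a made-up routine name and signature:)

```
// Assumed Fortran side (hypothetical):
//   subroutine chem_step(x, n) bind(c, name="chem_step")
// where x is an array of real(c_double) and n an integer(c_int) passed by value.
extern(C) void chem_step(double* x, int n);

void runStep(double[] data)
{
    // Pass a pointer and a length; the Fortran code updates the same
    // memory, so the results can be read back directly after the call.
    chem_step(data.ptr, cast(int) data.length);
}
```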


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 14:30:41 UTC, John Colvin wrote:

On Wednesday, 9 July 2014 at 13:46:59 UTC, Larry wrote:
The rest of the code is numerical so it will not change by 
much the fact that d cannot get back the huge launching time. 
At the microsecond level(even nano) it counts because of 
electrical consumption, size of hardware, heat and so on.


You say you are worried about microseconds and power 
consumption, but you are suggesting launching a new process - a 
lot of overhead - to do a small amount of numerical work.


Surely no matter what programming language you use you would 
not want to work like this?


@John : A new process ? Where ?
Or maybe I got you wrong on this one John

I am writing libraries, and before going further I wondered if
there were alternatives that I could get a grip on. The idea is
to have homogeneous software, so we were ready to switch to D
for the whole set of tasks/assets.

No new process involved.

I was seeking maybe a Python-like programming language that
offers C-like performance, without as much writing as in C. Exit
Cython. Debugging it is a real pain. And the executable size is..
well..

I am becoming lazy and seek the Holy Grail. Java not welcome.
D seemed like a very good choice and maybe it is, or more
certainly will be.


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Chris via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 15:09:09 UTC, Larry wrote:

On Wednesday, 9 July 2014 at 14:30:41 UTC, John Colvin wrote:

On Wednesday, 9 July 2014 at 13:46:59 UTC, Larry wrote:
The rest of the code is numerical so it will not change by 
much the fact that d cannot get back the huge launching time. 
At the microsecond level(even nano) it counts because of 
electrical consumption, size of hardware, heat and so on.


You say you are worried about microseconds and power 
consumption, but you are suggesting launching a new process - 
a lot of overhead - to do a small amount of numerical work.


Surely no matter what programming language you use you would 
not want to work like this?


@John : A new process ? Where ?
Or maybe I got you wrong on this one John

I am writing libraries and before going further I wondered if
there were alternatives that I could have a grab on. The idea is
to have an homogeneous software so we were ready to switch to d
for the whole tasks/asset.

No new process involved.

I was seaking for maybe a python like programming language that
offers c-like perfs, without so much writing as in c. Exit
Cython. Debugging it is a real pain. And executable size is..
well..

I am becoming lazy and seek for the Holy Grail. Java not 
welcome.

D seemed like a very good choice and maybe it is, or more
certainly will.


I wouldn't give up on D (as you've already signalled). It's 
getting better with each iteration. BTW, have you measured the 
power consumption yet? Does it make a big difference if you use D 
or C?


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread John Colvin via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 15:09:09 UTC, Larry wrote:

On Wednesday, 9 July 2014 at 14:30:41 UTC, John Colvin wrote:

On Wednesday, 9 July 2014 at 13:46:59 UTC, Larry wrote:
The rest of the code is numerical so it will not change by 
much the fact that d cannot get back the huge launching time. 
At the microsecond level(even nano) it counts because of 
electrical consumption, size of hardware, heat and so on.


You say you are worried about microseconds and power 
consumption, but you are suggesting launching a new process - 
a lot of overhead - to do a small amount of numerical work.


Surely no matter what programming language you use you would 
not want to work like this?


@John : A new process ? Where ?
Or maybe I got you wrong on this one John


process == program in this case. Launching a new process == 
running the program


The startup cost of the D runtime is only paid when you start the 
program. If the amount of work done per execution of the program 
is more than a trivial amount then the startup cost will only be 
a small part of the total running time and power consumption etc.



I am writing libraries and before going further I wondered if
there were alternatives that I could have a grab on. The idea is
to have an homogeneous software so we were ready to switch to d
for the whole tasks/asset.

No new process involved.

I was seaking for maybe a python like programming language that
offers c-like perfs, without so much writing as in c. Exit
Cython. Debugging it is a real pain. And executable size is..
well..

I am becoming lazy and seek for the Holy Grail. Java not 
welcome.

D seemed like a very good choice and maybe it is, or more
certainly will.


I think D could be a good choice for you.


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

I may definitely help on the D project.

I noticed that GDC doesn't have profile-guided optimization either.
So yeah, I cannot use D right now, I mean for this project.

OK, I will do my best to find some spare time for Dlang. I haven't 
really looked at the code yet, and I have coded for years in C, 
which is my first-class coding language. Hope it will not be any 
kind of barrier (C++ is my.. third best coding buddy anyway 
(after Python, excellent for managing systems)).


Many thanks to all the community. I will stick with you and see 
what I can bring (or cannot).


:)

Bye


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread Dominikus Dittes Scherkl via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 14:51:41 UTC, Meta wrote:
One of the uglier things in D is also a long-standing problem 
with C and C++, in that comparison of signed and unsigned 
values is allowed.


I would like that, if it were implemented along these lines:

/// Returns -1 if a < b, 0 if they are equal or 1 if a > b.
/// This will always yield a correct result, no matter which 
/// numeric types are compared.
/// It uses one extra comparison operation if and only if
/// one type is signed and the other unsigned but the signed 
/// value is >= 0
/// (that is what you need to pay for a stupid choice of type).
int opCmp(T, U)(const(T) a, const(U) b) @primitive 
if(isNumeric!T && isNumeric!U)

{
   static if(Unqual!T == Unqual!U)
   {
  // use the standard D implementation
   }
   else static if(isFloatingPoint!T || isFloatingPoint!U)
   {
  alias CommonType!(T, U) C;
  return opCmp!(cast(C)a, cast(C)b);
   }
   else static if(isSigned!T && isUnsigned!U)
   {
  alias CommonType!(Unsigned!T, U) C;
  return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b);
   }
   else static if(isUnsigned!T && isSigned!U)
   {
  alias CommonType!(T, Unsigned!U) C;
  return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b);
   }
   else // both signed or both unsigned
   {
  alias CommonType!(T, U) C;
  return opCmp!(cast(C)a, cast(C)b);
   }
}


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread Dominikus Dittes Scherkl via Digitalmars-d-learn

Of course without the ! after opCmp in the several cases.


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

@Chris :
Actually, yes. If we consider the device to run 20h a day, by 
shaving a few microseconds here and there on billions of 
operations a day over a whole machine park, you enable 
yourself to shut down some of them for maintenance more easily, 
or pause some of them, letting their batteries last a bit longer, 
and the savings have proven to be in the order of thousands of $$ 
thanks to a redefined coding strategy.


Not even mentioning hardware usage, which is related to heat, and 
the savings you can expect to have over the long run.


By changing some hardware a few months after its theoretical 
obsolescence, you can save a bit further.


And the accountant is very happy because he can optimize the 
finances further (staggered repayment).


It enabled us to hire more engineers/hardware.

Of course, the savings are not only on this loop but on the whole 
chain. And it definitely adds up to $$$.


And there are a lot more things involved that benefit from it (latency 
and so on).


Yep. :)


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jul 09, 2014 at 04:24:38PM +, Dominikus Dittes Scherkl via 
Digitalmars-d-learn wrote:
 On Wednesday, 9 July 2014 at 14:51:41 UTC, Meta wrote:
 One of the uglier things in D is also a long-standing problem with C
 and C++, in that comparison of signed and unsigned values is allowed.
 
 I would like that, if it would be implemented along this line:
 
 /// Returns -1 if a < b, 0 if they are equal or 1 if a > b.
 /// this will always yield a correct result, no matter which numeric types 
 are compared.
 /// It uses one extra comparison operation if and only if
 /// one type is signed and the other unsigned but the signed value is >= 0
 /// (that is what you need to pay for stupid choice of type).
[...]

Yeah, I don't see what's the problem with comparing signed and unsigned
values, as long as the result is as expected. Currently, however, this
code asserts, which is wrong:

uint x = uint.max;
int y = -1;
assert(x > y);


 {
static if(Unqual!T == Unqual!U)

Nitpick: should be:

static if(is(Unqual!T == Unqual!U))


[...]
else static if(isSigned!T  isUnsigned!U)
{
   alias CommonType!(Unsigned!T, U) C;
   return (a  0) ? -1 : opCmp!(cast(C)a, cast(C)b);
}
else static if(isUnsigned!T  isSigned!U)
{
   alias CommonType!(T, Unsigned!U) C;
   return (b  0) ? 1 : opCmp!(cast(C)a, cast(C)b);
}
[...]

Hmm. I wonder if there's a more efficient way to do this.

For the comparison s < u, where s is a signed value and u is an unsigned
value, whenever s is negative, the return value of opCmp must be
negative.  Assuming 2's-complement representation of integers, this
means we simply copy the MSB of s (i.e., the sign bit) to the result. So
we can implement s < u as:

enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a compile-time 
constant
return (s - u) | (s & signbitMask); // look ma, no branches!

which would translate (roughly) to the assembly code:

mov eax, [address of s]
mov ebx, [address of u]
mov ecx, eax; save the value of s for signbit extraction
sub eax, ebx; s - u
and ecx, #$1000 ; s & signbitMask
or  eax, ecx; (s - u) | (s & signbitMask)
(ret; this is deleted if opCmp is inlined)

which avoids a branch hazard in the CPU pipeline.

Similarly, for the comparison u < s, whenever s is negative, then opCmp
must always be positive. So this means we copy over the *negation* of
the sign bit of s to the result. So we get this for u < s:

enum signbitMask = 1u << (s.sizeof*8 - 1); // as before
return (u - s) & ~(s & signbitMask); // look ma, no branches!

which translates roughly to:

mov eax, [address of u]
mov ebx, [address of s]
sub eax, ebx; u - s
and ebx, #$1000 ; s & signbitMask
not ebx ; ~(s & signbitMask)
and eax, ebx; (u - s) & ~(s & signbitMask)
(ret; this is deleted if opCmp is inlined)

Again, this avoid a branch hazard in the CPU pipeline.

In both cases, the first 2 instructions are unnecessary if the values to
be compared are already in CPU registers. The naïve implementation of
opCmp is just a single sub instruction (this is why opCmp is defined the
way it is, BTW), whereas the smart signed/unsigned comparison is 4
instructions long. The branched version would look something like this:

mov eax, [address of u]
mov ebx, [address of s]
cmp ebx, $#0
jge label1  ; first branch
mov eax, $#
jmp label2  ; 2nd branch
label1:
sub eax, ebx
label2:
(ret)

The 2nd branch can be replaced with ret if opCmp is not inlined, but
requiring a function call to compare integers seems excessive, so let's
assume it's inlined, in which case the 2nd branch is necessary. So as
you can see, the branched version is 5 instructions long, and always
causes a CPU pipeline hazard.

So I submit that the unbranched version is better. ;-)

(So much for premature optimization... now lemme go and actually
benchmark this stuff and see how well it actually performs in practice.
Often, such kinds of hacks often perform more poorly than expected due
to unforeseen complications with today's complex CPU's. So for all I
know, I could've just been spouting nonsense above. :P)


T

-- 
Debian GNU/Linux: Cray on your desktop.


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread Dominikus Dittes Scherkl via Digitalmars-d-learn
On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:
On Wed, Jul 09, 2014 at 04:24:38PM +, Dominikus Dittes 
Scherkl via Digitalmars-d-learn wrote:

/// Returns -1 if a < b, 0 if they are equal or 1 if a > b.
/// this will always yield a correct result, no matter which 
numeric types are compared.

/// It uses one extra comparison operation if and only if
/// one type is signed and the other unsigned but the signed 
value is >= 0

/// (that is what you need to pay for stupid choice of type).

[...]

Yeah, I don't see what's the problem with comparing signed and 
unsigned
values, as long as the result is as expected. Currently, 
however, this

code asserts, which is wrong:

uint x = uint.max;
int y = -1;
assert(x > y);

Yes, this is really bad.
But last time I got the response that this is so in order to be 
compatible with C.
And I really thought that the reason why D throws away ballast 
from C was to fix bugs.


   static if(Unqual!T == Unqual!U)


Nitpick: should be:

static if(is(Unqual!T == Unqual!U))

Of course.


[...]

   else static if(isSigned!T && isUnsigned!U)
   {
  alias CommonType!(Unsigned!T, U) C;
  return (a < 0) ? -1 : opCmp!(cast(C)a, cast(C)b);
   }
   else static if(isUnsigned!T && isSigned!U)
   {
  alias CommonType!(T, Unsigned!U) C;
  return (b < 0) ? 1 : opCmp!(cast(C)a, cast(C)b);
   }

[...]

Hmm. I wonder if there's a more efficient way to do this.
I'm sure. But I think it should be done at the compiler, not in a 
library.



{...]
opCmp is just a single sub instruction (this is why opCmp is 
defined the
way it is, BTW), whereas the smart signed/unsigned comparison 
is 4

instructions long.
[...]
you can see, the branched version is 5 instructions long, and 
always

causes a CPU pipeline hazard.

So I submit that the unbranched version is better. ;-)
I don't think so, because the branch will only be taken if the 
signed type is >= 0 (in fact unsigned). So if the signed/unsigned 
comparison is by accident, you pay the extra runtime. But if it is 
intentional, the signed value is likely to be negative, so you get 
a correct result with no extra cost.
Even better for constants, where the compiler can not only 
evaluate expressions like (uint.max > -1) correctly, but it should 
optimize them completely away!




(So much for premature optimization... now lemme go and actually
benchmark this stuff and see how well it actually performs in 
practice.

Yes, we should do this.

Often, such kinds of hacks often perform more poorly than 
expected due
to unforeseen complications with today's complex CPU's. So for 
all I

know, I could've just been spouting nonsense above. :P)
I don't see such a compiler change as a hack. It is a strong 
improvement IMHO.


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread anonymous via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
Digitalmars-d-learn wrote:
For the comparison s < u, where s is a signed value and u is an 
unsigned value, whenever s is negative, the return value of opCmp 
must be negative.  Assuming 2's-complement representation of 
integers, this means we simply copy the MSB of s (i.e., the sign 
bit) to the result. So we can implement s < u as:

	enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a 
compile-time constant

return (s - u) | (s & signbitMask); // look ma, no branches!


This is a problem, isn't it:

void main()
{
 assert(cmp(0, uint.max) < 0); /* fails */
}
int cmp(int s, uint u)
{
 enum signbitMask = 1u << (s.sizeof*8 - 1); // this is a
compile-time constant
 return (s - u) | (s & signbitMask); // look ma, no branches!
}


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread Dominikus Dittes Scherkl via Digitalmars-d-learn
On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via 
Digitalmars-d-learn wrote:

The branched version would look something like this:

mov eax, [address of u]
mov ebx, [address of s]
cmp ebx, $#0
jge label1  ; first branch
mov eax, $#
jmp label2  ; 2nd branch
label1:
sub eax, ebx
label2:
(ret)

Why?

I would say:
mov eax, [address of s] ; mov directly compares to zero
jl  label   ; less - jump to return
sub eax, [address of u]
neg eax     ; because we subtracted in the wrong order
label:  ret


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Ali Çehreli via Digitalmars-d-learn

On 07/09/2014 03:57 AM, Larry wrote:

  struct timeval s,e;
[...]
  gettimeofday(&e,NULL);

  printf("so ? %d %lu %d %d %d",g,e.tv_usec - s.tv_usec,
 arr[4],arr[9],pol);

Changing the topic a little, the calculation above ignores the tv_sec 
members of s and e.


Ali



Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jul 09, 2014 at 05:43:15PM +, Dominikus Dittes Scherkl via 
Digitalmars-d-learn wrote:
 On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
 Digitalmars-d-learn wrote:
[..]
 Yeah, I don't see what's the problem with comparing signed and unsigned
 values, as long as the result is as expected. Currently, however, this
 code asserts, which is wrong:
 
  uint x = uint.max;
  int y = -1;
  assert(x > y);

 Yes, this is really bad.
 But last time I got the response that this is so to be compatible with
 C.  That is what I really thought was the reason why D throw away
 balast from C, to fix bugs.

I think the slogan was that if something in D looks like C, then it
should either have C semantics or not compile. According to this logic,
the only recourse here is to prohibit comparison of signed with
unsigned, which I don't think is workable because there are many valid
use cases for it (plus, it will break a ton of code and people will be
very unhappy).

I don't like the current behaviour, though. It just reeks of wrong in so
many different ways. If you *really* want semantics like the above, you
really should be casting y to unsigned so that it's clear what exactly
you're trying to achieve.


[...]
 Hmm. I wonder if there's a more efficient way to do this.
 I'm sure. But I think it should be done at the compiler, not in a
 library.

Obviously, yes. But I wasn't thinking about implementing opCmp in the
library -- that would be strange since ints, of all things, need to have
native compiler support. I was thinking more about how the compiler
would implement safe signed/unsigned comparisons.


[...]
 So I submit that the unbranched version is better. ;-)
 I don't think so, because the branch will only be taken if the signed
 type is >= 0 (in fact unsigned). So if the signed/unsigned comparison
 is by accident, you pay the extra runtime. But if it is intentional
 the signed value is likely to be negative, so you get a correct result
 with no extra cost.

Good point. Moreover, I have discovered multiple bugs in my proposed
implementation; the correct implementation should be as follows:

int compare(int x, uint y)
{
enum signbitMask = 1u << (int.sizeof*8 - 1);
static assert(signbitMask == 0x8000_0000);

// The (x|y) & signbitMask basically means that if either x is 
// negative or y > int.max, then the result will always be 
// negative (sign bit set).
return (cast(uint)x - y) | ((x | y) & signbitMask);
}

unittest
{
// Basic cases
assert(compare(5, 10u) < 0);
assert(compare(5, 5u) == 0);
assert(compare(10, 5u) > 0);

// Large cases
assert(compare(0, uint.max) < 0);
assert(compare(50, uint.max) < 0);

// Sign-dependent cases
assert(compare(-1, 0u) < 0);
assert(compare(-1, 10u) < 0);
assert(compare(-1, uint.max) < 0);
}

int compare(uint x, int y)
{
enum signbitMask = 1u << (int.sizeof*8 - 1);
static assert(signbitMask == 0x8000_0000);
return ((x - y) & ~(x & signbitMask)) | ((cast(uint)y & 
signbitMask) >> 1);
}

unittest
{
// Basic cases
assert(compare(0u, 10) < 0);
assert(compare(10u, 10) == 0);
assert(compare(10u, 5) > 0);

// Large cases
assert(compare(uint.max, 10) > 0);
assert(compare(uint.max, -10) > 0);

// Sign-dependent cases
assert(compare(0u, -1) > 0);
assert(compare(10u, -1) > 0);
assert(compare(uint.max, -1) > 0);
}


Using gdc -O3, I managed to get a very good result for
compare(int,uint), only 5 instructions long.

However, for compare(uint,int), there is the annoying special case of
compare(uint.max, -1), which can only be fixed by the hack
... | ((y & signbitMask) >> 1).  Unfortunately, this makes it 11
instructions long, which is unacceptable. So it looks like a simple
compare and branch would be far better in the compare(uint,int) case --
it's far more costly to avoid the branch than to live with it.


 Even better for constants, where the compiler can not only evaluate
 expressions like (uint.max > -1) correct, but it should optimize them
 completely away!

Actually, with gdc -O3, I found that the body of the above unittests got
completely optimized away at compile-time, so that the unittest body is
empty in the executable! So even with a library implementation the
compiler was able to maximize performance. DMD left the assert calls in,
but then it's not exactly known for generating optimal code anyway.


 (So much for premature optimization... now lemme 

Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 18:18:43 UTC, Ali Çehreli wrote:

On 07/09/2014 03:57 AM, Larry wrote:

  struct timeval s,e;
[...]
  gettimeofday(&e,NULL);

  printf("so ? %d %lu %d %d %d",g,e.tv_usec - s.tv_usec,
 arr[4],arr[9],pol);

Changing the topic a little, the calculation above ignores the 
tv_sec members of s and e.


Ali


Absolutely, Ali, because I know it is under the second range. I made 
some tests before submitting it :)


But you are absolutely right, Ali, the mileage will vary in a 
completely different scenario.


Re: Opinions: The Best and Worst of D (for a lecture/talk I intend to give)

2014-07-09 Thread H. S. Teoh via Digitalmars-d-learn
On Wed, Jul 09, 2014 at 11:29:06AM -0700, H. S. Teoh via Digitalmars-d-learn 
wrote:
 On Wed, Jul 09, 2014 at 05:43:15PM +, Dominikus Dittes Scherkl via 
 Digitalmars-d-learn wrote:
  On Wednesday, 9 July 2014 at 17:13:21 UTC, H. S. Teoh via
  Digitalmars-d-learn wrote:
[...]
  Often, such kinds of hacks often perform more poorly than expected
  due to unforeseen complications with today's complex CPU's. So for
  all I know, I could've just been spouting nonsense above. :P)
  I don't see such a compiler change as a hack. It is a strong
  improvement IMHO.
 
 I was talking about using | and  to get rid of the branch in
 signed/unsigned comparison. As it turns out, the compare(uint,int)
 case seems far more costly than a simple compare-and-branch as you had
 it at the beginning. So at least that part of what I wrote is probably
 nonsense.  :P
 
 But I can't say for sure until I actually run some benchmarks on it.
[...]

Hmph. I'm having trouble coming up with a fair benchmark, because I
realized that D doesn't actually have a way of expressing opCmp for
unsigned int's in a minimal way! The problem is that the function needs
to return int, but given two uints, their difference may be greater than
int.max, so simply subtracting them will not work. So the best I can
come up with is:

int compare2(int x, uint y)
{
return (x < 0) ? -1 :
(y > int.max) ? -1 :
(x - y);
}

which requires 2 comparisons.

Similarly, for the uint-int case:

int compare2(uint x, int y)
{
return (y < 0) ? 1 :
(x > int.max) ? 1 :
(x - y);
}

If you have a better implementation in mind, I'm all ears.

In any case, I went ahead and benchmarked the above two functions along
with my branchless implementations, and here are the results:

(with dmd -O -unittest:)

non-branching compare(signed,unsigned): 5513 msecs
branching compare(signed,unsigned): 5442 msecs
non-branching compare(unsigned,signed): 5441 msecs
branching compare(unsigned,signed): 5744 msecs
Optimizer-thwarting value: 0

(with gdc -O3 -funittest:)

non-branching compare(signed,unsigned): 516 msecs
branching compare(signed,unsigned): 1209 msecs
non-branching compare(unsigned,signed): 453 msecs
branching compare(unsigned,signed): 756 msecs
Optimizer-thwarting value: 0

(Ignore the last lines of each output; that's just a way to prevent gdc
-O3 from being over-eager and optimizing out the entire test so that
everything returns 0 msecs.)

Interestingly, with dmd, the branching compare for the signed-unsigned
case is faster than my non-branching one, but the order is reversed for
the unsigned-signed case. They're pretty close, though, and on some runs
the order of the latter case is reversed.

With gdc, however, it seem the non-branching versions are clearly
better, even in the unsigned-signed case, which I thought would be
inferior. So clearly, these results are very optimizer-dependent.

Keep in mind, though, that this may not necessarily reflect actual
performance when the compiler generates the equivalent code for the
built-in integer comparison operators, because in codegen the compiler
can take advantage of the CPU's carry and overflow bits, and can elide
the actual return values of opCmp. This may skew the results enough to
reverse the order of some of these cases.

Anyway, here's the code (for independent verification):

int compare(int x, uint y)
{
    enum signbitMask = 1u << (int.sizeof*8 - 1);
    static assert(signbitMask == 0x8000_0000);

    // The "(x | y) & signbitMask" part basically means that if either x is
    // negative or y > int.max, then the result will always be negative
    // (sign bit set).
    return (cast(uint)x - y) | ((x | y) & signbitMask);
}

unittest
{
// Basic cases
    assert(compare(5, 10u) < 0);
    assert(compare(5, 5u) == 0);
    assert(compare(10, 5u) > 0);

    // Large cases
    assert(compare(0, uint.max) < 0);
    assert(compare(50, uint.max) < 0);

    // Sign-dependent cases
    assert(compare(-1, 0u) < 0);
    assert(compare(-1, 10u) < 0);
    assert(compare(-1, uint.max) < 0);
}

int compare2(int x, uint y)
{
    return (x < 0) ? -1 :
        (y > int.max) ? -1 :
        x - y;
}

unittest
{
// Basic cases
    assert(compare2(5, 10u) < 0);
    assert(compare2(5, 5u) == 0);
    assert(compare2(10, 5u) > 0);

    // Large cases
    assert(compare2(0, uint.max) < 0);
    assert(compare2(50, uint.max) < 0);
}

Introspecting a Module with Traits, allMembers

2014-07-09 Thread Maxime Chevalier-Boisvert via Digitalmars-d-learn

Hello,

I'm looking to introspect a module, list all the members, iterate 
over them and filter them by kind inside of a static constructor. 
This is in the hope of shortening some hand-written code that is 
quite repetitive (adding many struct instances to an associative 
array in a static constructor).


The code I'm trying to improve upon can be seen here:
https://github.com/maximecb/Higgs/blob/master/source/ir/iir.d#L56

I've done some googling, and it seems I should be able to use the 
allMembers trait 
(http://wiki.dlang.org/Finding_all_Functions_in_a_Module), but 
unfortunately, the module name seems to be unrecognized, no 
matter which way I spell it:


auto members = [__traits(allMembers, "ir.ir")];
pragma(msg, members);

Produces:
ir/iir.d(85): Error: argument has no members

Other people seem to have run into this problem. Am I doing it 
wrong or is this a bug in DMD?


Re: Introspecting a Module with Traits, allMembers

2014-07-09 Thread Justin Whear via Digitalmars-d-learn
On Wed, 09 Jul 2014 20:07:56 +, NCrashed wrote:

 On Wednesday, 9 July 2014 at 20:04:47 UTC, Maxime Chevalier-Boisvert
 wrote:
 auto members = [__traits(allMembers, "ir.ir")];
 pragma(msg, members);
 
 Have you tried without quotes?
 pragma(msg, __traits(allMembers, ir.ir));

Also, looks like it should be ir.iir


Re: Introspecting a Module with Traits, allMembers

2014-07-09 Thread NCrashed via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 20:07:57 UTC, NCrashed wrote:

Produces:
ir/iir.d(85): Error: argument has no members


If module name is ir.iir: pragma(msg, __traits(allMembers, 
ir.iir));


Re: Introspecting a Module with Traits, allMembers

2014-07-09 Thread Maxime Chevalier-Boisvert via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 20:07:57 UTC, NCrashed wrote:
On Wednesday, 9 July 2014 at 20:04:47 UTC, Maxime 
Chevalier-Boisvert wrote:

auto members = [__traits(allMembers, "ir.ir")];
pragma(msg, members);


Have you tried without quotes?
pragma(msg, __traits(allMembers, ir.ir));


I did need to write it without the quotes, and to add enum to 
force compile-time evaluation. It's actually ir.ops that I wanted 
to list the members of. I got the following snippet to work:


static this()
{
enum members = [__traits(allMembers, ir.ops)];
pragma(msg, members);
}

Prints:
["object", "ir", "jit", "OpArg", "OpInfo", "Opcode", "GET_ARG",
"SET_STR", "MAKE_VALUE", "GET_WORD", "GET_TYPE", "IS_I32", ...]


Re: Introspecting a Module with Traits, allMembers

2014-07-09 Thread Maxime Chevalier-Boisvert via Digitalmars-d-learn

I got the following code to do what I want:

static this()
{
void addOp(ref Opcode op)
{
assert (
op.mnem !in iir,
            "duplicate op name " ~ op.mnem
);

iir[op.mnem] = op;
}

foreach (memberName; __traits(allMembers, ir.ops))
{
        static if (__traits(compiles,
                            addOp(__traits(getMember, ir.ops, memberName))))
        {
writeln(memberName);
addOp(__traits(getMember, ir.ops, memberName));
}
}
}


It's a bit of a hack, but it works. Is there any way to create 
some sort of alias for __traits(getMember, ir.ops, memberName) so 
that I don't have to write it out in full twice? Made some 
attempts but only got the compiler to complain.


Re: Introspecting a Module with Traits, allMembers

2014-07-09 Thread Dicebot via Digitalmars-d-learn
On Wednesday, 9 July 2014 at 20:52:29 UTC, Maxime 
Chevalier-Boisvert wrote:
It's a bit of a hack, but it works. Is there any way to create 
some sort of alias for __traits(getMember, ir.ops, memberName) 
so that I don't have to write it out in full twice? Made some 
attempts but only got the compiler to complain.


alias Alias(alias Sym) = Sym;
alias member = Alias!(__traits(getMember, ir.ops, memberName));

It does not work with a normal alias because of a grammar limitation, 
afaik.
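
A full sketch of how that would slot into your loop (untested, and
reusing the addOp/iir names from your snippet above):

alias Alias(alias Sym) = Sym;

static this()
{
    foreach (memberName; __traits(allMembers, ir.ops))
    {
        // spell out getMember only once per member
        alias member = Alias!(__traits(getMember, ir.ops, memberName));
        static if (__traits(compiles, addOp(member)))
            addOp(member);
    }
}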


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Ali Çehreli via Digitalmars-d-learn

On 07/09/2014 12:47 PM, Larry wrote:

 On Wednesday, 9 July 2014 at 18:18:43 UTC, Ali Çehreli wrote:
 On 07/09/2014 03:57 AM, Larry wrote:

   struct timeval s,e;
 [...]
  gettimeofday(&e, NULL);
 
  printf("so ? %d %lu %d %d %d", g, e.tv_usec - s.tv_usec,
         arr[4], arr[9], pol);

 Changing the topic a little, the calculation above ignores the tv_sec
 members of s and e.

 Ali

 Absolutely Ali, because I know it is under the sec range. I made some
 tests before submitting it :)

I know it did work and will work every time you test it. :)

However, even if the difference is just one millisecond, if s and e 
happen to be on different sides of a second boundary, you will get a 
huge result.
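
A sketch of the usual fix (written in D here, but the same formula applies
to the C version) is to fold tv_sec into the difference:

import core.sys.posix.sys.time : timeval;

long elapsedUsecs(in timeval s, in timeval e)
{
    // include the seconds part so crossing a second boundary is harmless
    return (e.tv_sec - s.tv_sec) * 1_000_000L + (e.tv_usec - s.tv_usec);
}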


Ali



Re: Quicksort Variants

2014-07-09 Thread Nordlöw

On Tuesday, 8 July 2014 at 20:50:01 UTC, Nordlöw wrote:
I recall that Python's default sorting algorithm is related to 
this, right?


https://en.wikipedia.org/wiki/Timsort
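
If I remember correctly, the stable variant of Phobos' sort is a Timsort
implementation, so it can be tried directly (a minimal sketch):

import std.algorithm : sort, SwapStrategy;
import std.stdio : writeln;

void main()
{
    auto a = [3, 1, 4, 1, 5, 9, 2, 6];
    sort!("a < b", SwapStrategy.stable)(a);  // stable sort; a Timsort variant, afaik
    writeln(a);
}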


Re: Quicksort Variants

2014-07-09 Thread Nordlöw

On Tuesday, 8 July 2014 at 20:50:01 UTC, Nordlöw wrote:

Also related: 
http://forum.dlang.org/thread/eaxcfzlvsakeucwpx...@forum.dlang.org#post-mailman.2809.1355844427.5162.digitalmars-d:40puremagic.com


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn

Right


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Kapps via Digitalmars-d-learn
Measure a larger number of loops. I understand you're concerned 
about microseconds, but your benchmark shows nothing because your 
timer is simply not accurate enough for this. The benchmark that 
bearophile showed, where C took ~2 nanoseconds vs. the ~7000 that 
D took, strongly implies to me that the C implementation is simply 
being optimized out and nothing is actually running. All inputs are 
known at compile-time, the output is known at compile-time, and the 
compiler is perfectly free to simply remove all your code and 
replace it with the result. I'm somewhat surprised that the D 
version doesn't do this, actually, perhaps because of the dynamic 
memory allocation. I realize that you can't post your actual 
code, but this benchmark honestly has too many flaws to 
determine anything from.
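
A minimal way to rule that out (a sketch, with a made-up work() standing
in for the real loop) is to derive the input from something only known at
run time and to print the result:

import std.stdio : writeln;

int work(int n)
{
    int sum = 0;
    foreach (i; 0 .. n)
        sum += i * i;        // stand-in for the real loop body
    return sum;
}

void main(string[] args)
{
    // run-time input + visible output: the compiler cannot fold the
    // whole benchmark into a constant at compile time
    immutable n = cast(int) args.length * 1_000_000;
    writeln(work(n));
}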


As for startup cost, D will indeed have a higher startup cost 
than C because of static constructors. Once it's running, it 
should be very close. If you're looking to start a process that 
will run for only a few milliseconds, you'd probably want to not 
use D (or avoid most static constructors, including those in the 
runtime / standard library).
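
For reference, the kind of thing that contributes to that startup cost
looks roughly like this (sketch):

import std.stdio : writeln;

immutable int[] squares;

shared static this()
{
    // executed before main() on every program start; this is the
    // D-specific startup work mentioned above
    auto t = new int[](256);
    foreach (i, ref e; t)
        e = cast(int) (i * i);
    squares = t.idup;
}

void main()
{
    writeln(squares[10]);  // 100
}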


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Kapps via Digitalmars-d-learn

On Wednesday, 9 July 2014 at 13:18:00 UTC, Larry wrote:


You are definitely right, I did mess up while translating!

I ran the corrected code (the one I meant to provide :S), and on a 
slow MacBook I end up with:

C : 2
D : 15994

Of course, when run on very high-end machines this diff is 
almost non-existent, but we want to run on very low-powered 
hardware.


OK, even with longer code, there will always be a launch 
penalty for D. So I cannot use it for very high-performance 
loops.


Shame for us..
:)

Thanks and bye


This to me pretty much confirms that almost the entirety of your 
C code is being optimized out and thus not actually executing.


Re: Small part of a program : d and c versions performances diff.

2014-07-09 Thread Larry via Digitalmars-d-learn
The actual code is not that much slower, considering the numerous 
other operations we do. And it is certainly faster than the D 
version, which does almost nothing.


Well, it is mostly massive bit shifts, array accesses and 
calculations.
With all the optimizations we are on par with Fortran numerical 
code (thanks to -std=c11).


There may be an optimization hidden somewhere, or it may just be 
that GDC needs to mature.


Dunno. But don't get me wrong, D is a fantastic language.