Re: Are padding bits always zero?

2017-06-24 Thread Honey via Digitalmars-d-learn

On Saturday, 24 June 2017 at 14:03:16 UTC, Adam D. Ruppe wrote:
On Saturday, 24 June 2017 at 12:41:47 UTC, Steven Schveighoffer 
wrote:
There is no spec for this, but I know that when the compiler 
has to fill gaps with something it chooses 0.


I'm almost certain there at least used to be a spec for this, 
because I remember citing a link to someone who then complained 
that this zero requirement hurt optimization of void members. 
On the other hand though, the zero padding requirement does 
simplify equality to being memcmp.


The only hint I could find is this statement from Walter:

   http://forum.dlang.org/post/hn11oh$1usk$1...@digitalmars.com




Re: Are padding bits always zero?

2017-06-24 Thread Honey via Digitalmars-d-learn

On Saturday, 24 June 2017 at 14:03:16 UTC, Adam D. Ruppe wrote:
On the other hand though, the zero padding requirement does 
simplify equality to being memcmp.


That's the reason for my question.
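To illustrate why this matters: the memcmp shortcut in question might look like this sketch (helper name made up); it is a valid equality test only if all padding bytes of both operands are guaranteed to be zero.

```d
import core.stdc.string : memcmp;

struct S
{
    ubyte b; // followed by 3 padding bytes
    int i;
}

// Valid as an equality test only if the padding bytes of both
// operands are guaranteed to be zero.
bool bitwiseEquals(T)(ref const T a, ref const T b)
{
    return memcmp(&a, &b, T.sizeof) == 0;
}
```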


Re: Are padding bits always zero?

2017-06-24 Thread Honey via Digitalmars-d-learn
On Saturday, 24 June 2017 at 12:41:47 UTC, Steven Schveighoffer 
wrote:
Any padding bits between fields should be 0 as long as the 
struct is initialized (i.e. as long as you don't do Struct s = 
void).


Padding bits after the fields I assume would be 0, but I don't 
know if this is defined. It's possible the compiler doesn't 
consider those bits to be part of the struct, and just there 
for alignment.


There is no spec for this, but I know that when the compiler 
has to fill gaps with something it chooses 0.


Thanks. Your answer has generated more questions. ;-)

Let's say, I have a struct S of size n with m bits of padding at 
the end. How can I find m?


Is it possible to provide a facility Pad such that for any struct 
T, Pad!T is a struct that mimics T but contains explicit instead 
of implicit padding? E.g.


struct Foo
{
   ubyte b;
   double d;
   int i;
}

struct Pad!Foo
{
   ubyte b;
   ubyte[7] __padding_0;
   double d;
   int i;
   ubyte[4] __padding_1;
}
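For the first question, a compile-time sketch (helper name made up; it assumes fields are laid out in declaration order, which D guarantees for structs):

```d
// Number of padding bytes after the last field of a struct.
size_t trailingPadding(S)() if (is(S == struct))
{
    // Offset of the last field plus its size marks where the
    // payload ends; the rest of S.sizeof is trailing padding.
    enum lastEnd = S.tupleof[$ - 1].offsetof + typeof(S.tupleof[$ - 1]).sizeof;
    return S.sizeof - lastEnd;
}

struct Foo
{
    ubyte b;
    double d;
    int i;
}

// With 8-byte alignment for double: Foo.sizeof == 24, i ends at 20.
static assert(trailingPadding!Foo() == 4);
```

Interior gaps could be found the same way, by comparing each field's offsetof against the end of the previous field.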


Are padding bits always zero?

2017-06-24 Thread Honey via Digitalmars-d-learn

Hi everyone!

Are there any guarantees about the values of padding bits in 
structs?


Thanks,
Honey


Re: libc dependency

2017-06-20 Thread Honey via Digitalmars-d-learn

On Tuesday, 20 June 2017 at 05:19:21 UTC, Cym13 wrote:

The issue with that is that you assume:

1) that a complete libc replacement can be created from scratch
2) that a correct libc replacement can be created from scratch
3) that a better libc replacement can be created from scratch
4) that this replacement can be maintained for years to come on 
all platforms


For things like core.stdc it would be insane not to depend on 
libc. In fact, it would be strange if we did not call into libc, 
here.


But does this really mean that depending on libc must be the 
default for everyone?


I doubt that we would have to replace every function provided by 
libc. We only have to provide alternatives for what we use. Other 
than that I share your skepticism.


I am not saying that we should actually drop the dependency. I 
was just curious to learn something.


Re: libc dependency

2017-06-19 Thread Honey via Digitalmars-d-learn
On Monday, 19 June 2017 at 21:35:56 UTC, Steven Schveighoffer 
wrote:
It does, but it depends on what you want to replace. What 
specifically are you looking for?


I might need a fast variant of memcmp. Writing it in D seems to 
be the natural choice. I see no reason why it should be slower 
than other well-optimized C implementations, and it will probably 
be faster than the implementations in less thoroughly tuned 
standard C libraries.


Actually replacing memcmp does not seem worth the effort. 
However, I was wondering whether D could and should be used to 
replace more C 'legacy' - for fun and profit. ;-)
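For what it's worth, a bare-bones byte-wise sketch of a libc-free memcmp in D (function name made up); tuned implementations compare machine-word chunks and use SIMD, which is where the real speed comes from:

```d
// Naive, portable memcmp replacement (no libc dependency).
// Returns <0, 0, or >0, like C's memcmp.
int myMemcmp(const(void)* a, const(void)* b, size_t n) pure nothrow @nogc
{
    auto pa = cast(const(ubyte)*) a;
    auto pb = cast(const(ubyte)*) b;
    foreach (i; 0 .. n)
    {
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    }
    return 0;
}
```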



IIRC, Tango did not depend on libc at all. It only used system 
calls. So it certainly is possible.


I heard that Go's standard library is based on a similar approach.

Not depending on libc at all might serve as a marketing 
instrument as well.


Re: libc dependency

2017-06-19 Thread Honey via Digitalmars-d-learn

On Monday, 19 June 2017 at 20:26:43 UTC, Adam D. Ruppe wrote:

On Monday, 19 June 2017 at 20:20:23 UTC, Honey wrote:

Is it correct that D produces executables that depend on libc?


Usually yes, since that makes sense for most programs on 
operating systems, but you can pull it out if you want to for a 
specialized task. (Really, same boat C is in.)


Thanks, Adam!

To elaborate on my second point: if one compares, say, FreeBSD's 
libc [1] with glibc [2] it is pretty obvious that not all 
implementations are optimized to the same degree.


Does it make sense - at least in certain cases - to provide and 
use a D implementation that is fast on all platforms, instead?


[1] https://svnweb.freebsd.org/base/head/lib/libc/
[2] https://sourceware.org/git/?p=glibc.git;a=tree;hb=HEAD


libc dependency

2017-06-19 Thread Honey via Digitalmars-d-learn

Hi all!

Is it correct that D produces executables that depend on libc?

While this approach avoids a certain amount of maintenance cost, 
doesn't it limit D's ability to outdo C at the lowest level?


Honey


Re: How to implement opCmp?

2017-06-13 Thread Honey via Digitalmars-d-learn
On Tuesday, 13 June 2017 at 14:51:40 UTC, Steven Schveighoffer 
wrote:
Yes, I saw that when I was looking (you can see from my reply 
that you quoted below).


Yes, I had missed that point.


Yes I think it makes sense to have such a comparison function 
for non-ranges. Yes it probably belongs there, as there are 
other functions (e.g. swap) that are not specific to ranges in 
std.algorithm. It should probably not be called cmp, as it will 
be a default comparison (with the default ordering), although 
if we did something like:


int cmp(alias pred = "a < b", T)(T t1, T t2) if(!isInputRange!T)
{
   static if(pred == "a < b") { /* do default optimized way */ }
   else { /* more generic mechanism using pred */ }
}

it might be nice. Need to think about the API ramifications.


I retracted my earlier proposal after I had realized my 
confusion. I had thought that cmp would implement three way range 
comparison based on three way element comparison. Then I realized 
that it is based on "a < b" or the like. The latter is certainly 
useful but I am afraid that this approach does not always lead to 
optimal performance.


I gathered a few ideas about the subject. I have to sit down and 
write them down.



I agree. It's also a thing that can be optimized in unintuitive 
ways. I think Andrei has a nice way to do opCmp for integers 
that's a simple subtraction and negation or something like that.


I observed that compiler optimizers are pretty smart about 
comparison of individual integers. I guess that we do not need to 
be clever, here.


Re: How to implement opCmp?

2017-06-11 Thread Honey via Digitalmars-d-learn

On Sunday, 11 June 2017 at 15:40:42 UTC, Honey wrote:

On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
Doesn't it make sense to introduce another overload of cmp 
similar to Steve's doCmp [2] right at that spot?


Moreover, it seems that std.algorithm.cmp should employ three 
way comparison as well.


Forget about it. I think a different approach is required, here.


Re: How to implement opCmp?

2017-06-11 Thread Honey via Digitalmars-d-learn

On Sunday, 11 June 2017 at 15:24:30 UTC, Honey wrote:
Doesn't it make sense to introduce another overload of cmp 
similar to Steve's doCmp [2] right at that spot?


Moreover, it seems that std.algorithm.cmp should employ three way 
comparison as well. The current implementation


int cmp(alias pred = "a < b", R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2 && !(isSomeString!R1 && isSomeString!R2))
{
    for (;; r1.popFront(), r2.popFront())
    {
        if (r1.empty) return -cast(int)!r2.empty;
        if (r2.empty) return !r1.empty;
        auto a = r1.front, b = r2.front;
        if (binaryFun!pred(a, b)) return -1;
        if (binaryFun!pred(b, a)) return 1;
    }
}

does not seem to be great.
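A single-pass variant built on element-level three-way comparison might be sketched like this (function name made up); each pair of elements is compared once instead of twice:

```d
import std.range.primitives;

// Three-way range comparison: uses the element type's own opCmp
// when it exists, otherwise falls back to two '<' tests.
int cmp3(R1, R2)(R1 r1, R2 r2)
if (isInputRange!R1 && isInputRange!R2)
{
    for (;; r1.popFront(), r2.popFront())
    {
        if (r1.empty) return r2.empty ? 0 : -1;
        if (r2.empty) return 1;
        auto a = r1.front;
        auto b = r2.front;
        static if (is(typeof(a.opCmp(b)) : int))
        {
            immutable c = a.opCmp(b);
            if (c != 0) return c;
        }
        else
        {
            if (a < b) return -1;
            if (b < a) return 1;
        }
    }
}
```

For elements with an expensive comparison (strings, tuples), calling opCmp once per pair is the point of the exercise.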


Re: How to implement opCmp?

2017-06-11 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 17:50:28 UTC, Honey wrote:
Looking at the implementation of Tuple.opCmp, I'm not sure I'd 
bet on existence of opCmp for fundamental types:


int opCmp(R)(R rhs)
if (areCompatibleTuples!(typeof(this), R, "<"))
{
    foreach (i, Unused; Types)
    {
        if (field[i] != rhs.field[i])
        {
            return field[i] < rhs.field[i] ? -1 : 1;
        }
    }
    return 0;
}


It turned out that there is a standard three way comparison 
function for ranges and strings [1].


Doesn't it make sense to introduce another overload of cmp 
similar to Steve's doCmp [2] right at that spot?


This would simplify the implementation of opCmp for aggregates 
and can also lead to a moderate performance benefit [3]. (Note 
that [3] does not provide an additional overload of cmp but uses 
a less elegant approach with the same performance 
characteristics.)


// $ ldc2 -O3 -release -boundscheck=off cmpTuple.d
//
// real 0m18.787s
// user 0m18.784s
// sys  0m0.003s
//
//
// $ ldc2 -O3 -release -boundscheck=off cmpTuple.d -d-version=singlePassCmp

// $ time ./cmpTuple
//
// real 0m16.109s
// user 0m16.102s
// sys  0m0.007s

[1] https://dlang.org/phobos/std_algorithm_comparison.html#cmp
[2] http://forum.dlang.org/post/ohedtb$1phe$1...@digitalmars.com
[3] https://dpaste.dzfl.pl/7cbfa2e1965b


Re: Is D slow?

2017-06-10 Thread Honey via Digitalmars-d-learn

On Saturday, 10 June 2017 at 12:23:05 UTC, Nicholas Wilson wrote:

On Saturday, 10 June 2017 at 12:16:34 UTC, Honey wrote:
Is it expected that turning off bounds checking can lead to a 
performance decrease?


Yes, with it on you are doing an "is the index < the length" 
for every array access. Now some of them can be elided by the 
compiler when it can prove that the index is always in bounds 
but it is generally dangerous to do so as it opens up the 
possibility of buffer overflow.


Are you saying that introducing additional checks enables the 
optimizer to eliminate more or more costly checks than it could 
without introducing those additional checks in the first place? 
Can you give an example?
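One commonly cited pattern (a hypothetical illustration, not taken from the thread): when the loop bound is the array's own length, the compiler can prove every access is in range and drop the per-access check.

```d
int sum(const(int)[] a) @safe
{
    int s = 0;
    // i < a.length holds by the loop condition, so the bounds
    // check on a[i] can be elided by the optimizer.
    foreach (i; 0 .. a.length)
        s += a[i];
    return s;
}
```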


Re: Is D slow?

2017-06-10 Thread Honey via Digitalmars-d-learn

On Saturday, 10 June 2017 at 11:53:44 UTC, Johan Engelen wrote:
`-release` should be synonymous with `-release 
-boundscheck=off`.


Nope it's not.
http://www.digitalmars.com/d/archives/digitalmars/D/What_s_the_deal_with_-boundscheck_260237.html


Thank you for clarifying that point. Is it expected that turning 
off bounds checking can lead to a performance decrease?


Re: Is D slow?

2017-06-10 Thread Honey via Digitalmars-d-learn
On Saturday, 10 June 2017 at 10:59:24 UTC, Steven Schveighoffer 
wrote:
I wrote it like this, which confirms that it's indeed 
bringToFront (I tried your getTransitionIndex function, but 
that didn't change timings at all):


void insertionSort(alias Less, Range)(Range r)
if (hasLength!Range && isRandomAccessRange!Range && hasSlicing!Range)
{
    foreach (immutable i; 1 .. r.length)
    {
        auto ubElem = i - r[0 .. i].assumeSorted!Less.upperBound(r[i]).length;
        r[ubElem .. i+1].rotateRight;
    }
}


Taking the length of upperBound is a great move! ;-)


On my mac, it's completing in about 4.4 seconds, whereas the 
C++ version completes in around 3.


On my machine, I am getting the same performance even without 
resorting to memmove. Using getTransitionIndex directly, however, 
leads to a negligible improvement (about 2.7 vs. 2.8 seconds), 
here.


Re: Is D slow?

2017-06-10 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 23:10:28 UTC, Ali Çehreli wrote:
You would get the exact performance if you implemented e.g. 
with pointers. Your test has been very valuable for exposing an 
embarrassing performance issue. :)


I can confirm that, after changing the implementation to the 
following, performance is on par. It is not immediately clear to 
me how


 r[r[0 .. i].getTransitionIndex!Less(r[i]) .. i + 1]

would look like if written idiomatically.


size_t getTransitionIndex(alias test, Range, V)(Range r, V v)
if (isRandomAccessRange!Range && hasLength!Range)
{
    size_t first = 0;
    size_t count = r.length;
    while (count > 0)
    {
        immutable step = count / 2;
        immutable it = first + step;
        if (!test(v, r[it]))
        {
            first = it + 1;
            count -= step + 1;
        }
        else
        {
            count = step;
        }
    }
    return first;
}

void rotateRight(Range)(Range r)
if (hasLength!Range && isRandomAccessRange!Range && hasSlicing!Range)
{
    auto t = r[$ - 1];
    foreach_reverse (i; 0 .. r.length - 1)
    {
        r[i + 1] = r[i];
    }
    r[0] = t;
}

void insertionSort(alias Less, Range)(Range r)
if (hasLength!Range && isRandomAccessRange!Range && hasSlicing!Range)
{
    foreach (i; 1 .. r.length)
    {
        rotateRight(r[r[0 .. i].getTransitionIndex!Less(r[i]) .. i + 1]);
    }
}


Re: Is D slow?

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 20:27:58 UTC, Ali Çehreli wrote:

On 06/09/2017 12:29 PM, Honey wrote:

> I think, I should not rely on standard library facilities.

I think you hit a Phobos function with particularly bad 
performance (which can be improved). Phobos is not a slow 
library in general.


What I meant to say is: the comparison would have been less 
biased if I had written the complete algorithm without relying on 
different standard library abstractions (iterators vs. ranges).


Re: Is D slow?

2017-06-09 Thread Honey via Digitalmars-d-learn
On Friday, 9 June 2017 at 21:11:50 UTC, Steven Schveighoffer 
wrote:
Just to show you what I meant, I changed your code to eliminate 
the functors completely, the main function now looks like this:



foreach (i; 0 .. N)
{
    insertionSort!((a, b) => lt(a, b))(v);
    insertionSort!((a, b) => lt(b, a))(v);
}

I'm sure there's also a way to reduce the initialization of the 
array to a few lines (or maybe even one?), but didn't have time 
to think about it.


Thanks for your hints. I'm sure there are many things to improve 
(also in the C++ version). It should be pretty obvious that my 
knowledge of D is lacking.



Well, D is pretty fast, as fast as C++ or C. What I mean is 
that there is no inherent overhead -- both can produce exactly 
the same code.


I agree.


However, there are some parts of C/C++ that have been optimized 
to death. It's one of those things where D's version of rotate 
probably hasn't had as much scrutiny as C++'s version. We are 
always looking to improve such things, and more investigation 
and suggestions are most welcome! It's why I filed the bug 
report.


Thank you for filing the bug!

bringToFrontImpl does not seem to exploit bidirectional or random 
access properties.



Try to find something besides insertion sort to test with I 
think ;) D is pretty fast at a lot of things, and pleasant to 
write. You just found one test case that isn't (and we will fix 
it, I'm sure).


I guess that benchmarking comparison of string tuples will not 
result in happy faces unless a single-pass version of the 
comparison function is used.


Re: Is D slow?

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 19:29:35 UTC, Honey wrote:

Unfortunately, that does not increase my confidence.


I should add that, nevertheless, I very much appreciate and 
respect D and its community. Great work! :-)


Re: Is D slow?

2017-06-09 Thread Honey via Digitalmars-d-learn
On Friday, 9 June 2017 at 18:32:06 UTC, Steven Schveighoffer 
wrote:

Wow, so that's how D code would look if it were C++ :)


Well, I cannot (and did not try to) hide where I am coming from. 
;-)



The results are quite disappointing. What seems particularly 
strange to me is that -boundscheck=off leads to a performance 
decrease.


That doesn't make much sense, but I'm not an ldc2 user. 
However, it does note in the help that -release disables bounds 
checks already.


Sounds like a bug, then.


I did replicate that issue on my box, and mucking around with 
the implementation didn't help.


In answer to the subject, no D is not slow. However, it's quite 
possible that std.algorithm.bringToFront is slower than 
std::rotate, or SortedRange.upperBound is slower than 
std::upper_bound, or both. I don't think it's a design issue 
per se, probably more of an implementation issue.


Thank you for confirming the results and your factual explanation 
notwithstanding my pointed question. ;-)


Maybe I was expecting too much given Andrei's performance 
oriented talks. I realize that the conceptual groundwork is more 
important than a concrete implementation that can be easily 
improved. However, I think that real-world out-of-the-box 
performance - particularly with respect to toy examples (since 
those are small enough to be literally translated) - is important 
for prospects to gain confidence in buying into D.


At the current state, at least for such benchmarks, I think, I 
should not rely on standard library facilities. Unfortunately, 
that does not increase my confidence.


Re: How to implement opCmp?

2017-06-09 Thread Honey via Digitalmars-d-learn
On Friday, 9 June 2017 at 17:28:18 UTC, Steven Schveighoffer 
wrote:
This is why I think such a function should exist in 
Phobos/druntime. I'm not 100% sure it doesn't already, but I 
couldn't find one...


Looking at the implementation of Tuple.opCmp, I'm not sure I'd 
bet on existence of opCmp for fundamental types:


int opCmp(R)(R rhs)
if (areCompatibleTuples!(typeof(this), R, "<"))
{
    foreach (i, Unused; Types)
    {
        if (field[i] != rhs.field[i])
        {
            return field[i] < rhs.field[i] ? -1 : 1;
        }
    }
    return 0;
}



Re: a way to specily floating-point numbers as bit patters

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 17:25:22 UTC, ketmar wrote:
it is highly platform-dependent. and both bin->dec, and 
dec->bin conversion routines can contain errors, btw. so using 
decimal forms for exact bit-patterns is the last thing i want 
to do, as i know how fragile they are.


Sure. Hex-format is the right choice if you want a fixed bit 
pattern.
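For illustration: D's hexadecimal float literals specify the significand bit-exactly, so no decimal-to-binary rounding is involved (the literal below is the nearest float to 1/3, whose bit pattern is a well-known constant):

```d
// Hexadecimal float literals pin the bits exactly.
enum float third = 0x1.555556p-2f; // nearest float to 1/3, bits 0x3EAAAAAB

// Reinterpret the bits to inspect the exact pattern.
uint bitsOf(float f) @trusted
{
    return *cast(const(uint)*)&f;
}
```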


Re: a way to specily floating-point numbers as bit patters

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 16:41:01 UTC, Honey wrote:
Lossless turn-around is guaranteed if you are using 
sufficiently many digits. In case of IEEE-754 single precision 
it's 8 significant decimal digits.


s/turn-around/recovery/g
s/8/9/g

:-P




Re: a way to specily floating-point numbers as bit patters

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 16:34:28 UTC, ketmar wrote:

Try -0.1685f.


it doesn't matter if i can find the decimal representation for 
the given bit pattern or not. the whole post is about removing 
the need to rely on lossy binary->decimal->binary conversions.


Lossless turn-around is guaranteed if you are using sufficiently 
many digits. In case of IEEE-754 single precision it's 8 
significant decimal digits.


Re: a way to specily floating-point numbers as bit patters

2017-06-09 Thread Honey via Digitalmars-d-learn

On Friday, 9 June 2017 at 16:07:36 UTC, ketmar wrote:
one of my calculated values is `-0.17`, which has 
bit-pattern of 0xBE2AAAB7.

now, let's say i want to use this number in my code:

float v = -0.17f;
writefln("%f 0x%08X", v, *cast(uint*)&v);

ps. "-0.17 0xBE2AAAC1". it's not the same! (and yes, it 
matters).


Try -0.1685f.


Is D slow?

2017-06-09 Thread Honey via Digitalmars-d-learn

Hi guys,

I wrote a toy benchmark in C++ [1] and tried to literally 
translate it to D [2].


The results are quite disappointing. What seems particularly 
strange to me is that -boundscheck=off leads to a performance 
decrease.


Am I missing anything?

// $ clang++ -std=c++1z -O3 cmp-bench.cpp
// $ time ./a.out
// 501
//
// real 0m2.777s
// user 0m2.774s
// sys  0m0.004s
//
//
// $ clang++ --version
// clang version 4.0.0 (tags/RELEASE_400/final)
// Target: x86_64-unknown-linux-gnu
// Thread model: posix
// InstalledDir: /usr/bin

// $ ldc2 -O3 -release cmp-bench.d
// $ time ./cmp-bench
// 501
//
// real 0m5.257s
// user 0m5.224s
// sys  0m0.000s
//
//
// $ ldc2 -O3 -release -boundscheck=off cmp-bench.d
// $ time ./cmp-bench
// 501
//
// real 0m11.083s
// user 0m11.083s
// sys  0m0.000s
//
//
// $ ldc2 --version
// LDC - the LLVM D compiler (1.2.0):
//   based on DMD v2.072.2 and LLVM 4.0.0
//   built with DMD64 D Compiler v2.074.0
//   Default target: x86_64-unknown-linux-gnu
//   Host CPU: haswell

[1] C++ Insertion Sort - https://dpaste.dzfl.pl/74fdb92a0579
[2] D Insertion Sort   - https://dpaste.dzfl.pl/b97a1fca1546


Re: How to implement opCmp?

2017-06-09 Thread Honey via Digitalmars-d-learn

Hi Steve!

On Friday, 9 June 2017 at 15:12:42 UTC, Steven Schveighoffer 
wrote:
TypeInfo can provide the comparison for you, but it's a little 
ugly.


If we take a look at std.algorithm.comparison.cmp, it uses 
operators to compare two values (i.e. < first, then >, 
otherwise return 0). However, this is less performant if the 
type defines opCmp (you are calling it twice).


Calling opCmp twice on the same data is exactly what I tried to 
avoid.



There really ought to be an opCmp for any type, which does the 
best thing available. I'm not sure if it exists.


I agree it should exist.



If I were to write it, it would be something like:

int doCmp(T)(auto ref T t1, auto ref T t2)
{
    static if(is(typeof(t1.opCmp(t2))))
        return t1.opCmp(t2);
    else
    {
        if(t1 < t2) return -1;
        else if(t1 > t2) return 1;
        return 0;
    }
}

Then your function would work, just use doCmp instead of opCmp.


Thanks. That's working.

Do you know whether this will always generate optimally efficient 
code (say, T is int and there is hardware support for three way 
comparison)?



Note that D already (I think) does by default a member-wise 
comparison, in order of declaration. So if that's all you 
really need, I don't think you need to define opCmp at all.


Checked that:

Error: need member function opCmp() for struct Pair!(int, int) to 
compare


How to implement opCmp?

2017-06-09 Thread Honey via Digitalmars-d-learn

Hi everyone!

Given

struct Pair(T, U = T)
{
  T f;
  U s;
}

what is the intended way to generically implement opCmp for this 
struct?


The naive approach

struct Pair(T, U = T)
{
  // [...]

  int opCmp(const Pair r) const
  {
immutable c = f.opCmp(r.f);
return c != 0 ? c : s.opCmp(r.s);
  }
}

yields

Error: no property 'opCmp' for type 'const(int)'

if one tries to compare pairs of int.
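Pulling the thread's answers together, one sketch that works for both fundamental and user-defined member types (the cmp3 helper name is made up, in the spirit of Steve's doCmp):

```d
// Three-way compare that falls back to '<' for types without opCmp.
private int cmp3(V)(const V a, const V b)
{
    static if (is(typeof(a.opCmp(b)) : int))
        return a.opCmp(b);
    else
        return a < b ? -1 : (b < a ? 1 : 0);
}

struct Pair(T, U = T)
{
    T f;
    U s;

    int opCmp(const Pair r) const
    {
        immutable c = cmp3(f, r.f);
        return c != 0 ? c : cmp3(s, r.s);
    }
}
```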