subject:"Re\: My ACCU 2016 keynote video available online"

Re: My ACCU 2016 keynote video available online

2016-05-23 Thread Jens Mueller via Digitalmars-d-announce


On Sunday, 22 May 2016 at 21:16:03 UTC, David Nadlinger wrote:

On Thursday, 19 May 2016 at 12:54:48 UTC, Jens Müller wrote:

But ldc looks so bad.
Any comments from ldc users or developers? Because I see this 
in many other measurements as well.


This definitely does not match up with my experience. 
Particularly if you see this in many measurements, there might 
be something iffy with the way you test. Could you please post 
a runnable snippet that demonstrates the issue?


No need to. Since you said it didn't match your experience I 
checked. It turns out I was using last years ldc (0.15.2-beta2). 
Version 0.16.0 behaves similar.
But for version 0.17.1 I get the following plots. ldc looks much 
better. I make a note to check the fast math version once that's 
release. But the code is integer heavy.


https://www.scribd.com/doc/313524674/Running-Time
https://www.scribd.com/doc/313524678/Speedup

In general, could you please directly post any performance 
issues to the LDC issue tracker on GitHub? We are quite 
interested in them, but I only happened to come across this 
post by chance.


Sorry about that. Next time.

Jens

Re: My ACCU 2016 keynote video available online

2016-05-22 Thread Johan Engelen via Digitalmars-d-announce


On Thursday, 19 May 2016 at 12:54:48 UTC, Jens Müller wrote:


For example what's equivalent to gdc's -ffast-math in ldc.


This is:
https://github.com/ldc-developers/ldc/pull/1472

Working on performance improvements is a lot of fun. Please feed 
us with code that doesn't run as fast as it should!


:)
  Johan

Re: My ACCU 2016 keynote video available online

2016-05-22 Thread David Nadlinger via Digitalmars-d-announce


On Thursday, 19 May 2016 at 12:54:48 UTC, Jens Müller wrote:

But ldc looks so bad.
Any comments from ldc users or developers? Because I see this 
in many other measurements as well.


This definitely does not match up with my experience. 
Particularly if you see this in many measurements, there might be 
something iffy with the way you test. Could you please post a 
runnable snippet that demonstrates the issue?


In general, could you please directly post any performance issues 
to the LDC issue tracker on GitHub? We are quite interested in 
them, but I only happened to come across this post by chance.


 — David

Re: My ACCU 2016 keynote video available online

2016-05-22 Thread Atila Neves via Digitalmars-d-announce


On Saturday, 21 May 2016 at 13:51:11 UTC, Manu wrote:
On 21 May 2016 at 23:20, Andrei Alexandrescu via 
Digitalmars-d-announce  
wrote:

On 05/21/2016 04:45 AM, Manu via Digitalmars-d-announce wrote:

[...]



I guess a lot more detail would be necessary here. A bunch of 
good folks (at least better than me) have worked for over a 
decade on C++ concepts and three (three!) proposals later it's 
still unclear whether they're a good idea. -- Andrei


I agree it's not clear to me either. I haven't seen any 
proposals for

D. Have there been any?


Not quite the same but:

https://github.com/dlang/phobos/pull/3677
https://wiki.dlang.org/DIP84#Implementation

Atila

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Manu via Digitalmars-d-announce

On 21 May 2016 at 23:20, Andrei Alexandrescu via
Digitalmars-d-announce  wrote:
> On 05/21/2016 04:45 AM, Manu via Digitalmars-d-announce wrote:
>>
>> On 20 May 2016 at 18:26, Walter Bright via Digitalmars-d-announce
>>  wrote:
>>>
>>> On 5/19/2016 11:50 PM, Manu via Digitalmars-d-announce wrote:

 Ah. Okay, well while this is a very interesting talk, I was indeed
 hoping you were going to make a D concepts proposal... do you have
 such a thing in mind, or are you against concepts in D?
>>>
>>>
>>>
>>> D has constraints. No point in adding concepts.
>>
>>
>> I really struggle to agree. Constraints are a good first-step in that
>> direction, but they're unwieldy, produce the worst looking function
>> signatures (read: documentation) of literally any language ever
>> conceived, relatively awkward error feedback, and very quickly get out
>> of hand if you have many variations of possible constraints.
>
>
> I guess a lot more detail would be necessary here. A bunch of good folks (at
> least better than me) have worked for over a decade on C++ concepts and
> three (three!) proposals later it's still unclear whether they're a good
> idea. -- Andrei

I agree it's not clear to me either. I haven't seen any proposals for
D. Have there been any?
Constraints obviously work, but I do constantly find myself wishing
there were something a little bit closer to the type system.
I generally like where C++ is going. I think part of the problem with
C++, as always, is that they are in a constant struggle to bolt things
on to an ancient language with so many existing spiky angular edge
cases that it always seems to become awkward.

I think my biggest gripe with constraints in D though, is that they
are quite pervasive when you start to produce generic API's, and the
documentation becomes a massive problem.
Every single person (seriously) I've ever introduced to D has
struggled with the phobos docs as soon as constraints are presented.
Many of us have raised this many times, and we've had a lot of
discussion and various experiments with improving the documentation
wrt existing constraints. I'm not sure we've doing a good
documentation solution her, and I'm not even confident there's a good
solution there. I just don't think it's a great way to express the
problem. Concepts, or something like it, feels a lot more intuitive to
me, and certainly lends much nicer to documentation and API
presentation.

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Leontien Smaal via Digitalmars-d-announce


On Friday, 20 May 2016 at 19:34:11 UTC, Walter Bright wrote:
Constraints can address behavior and relationships, concepts do 
not.


Wow, TIL. That's so clear once said !
There's been several discussion here and even one phobos PR that 
proposes a kind of concepts but I didn't realize before that the 
2 things are different.


The problem I see in D is that the constraints, since they 
prevent to output a good message, are doing both (in a way):


void foo(T)(T t)
if (constraint)
{
// cannot have the message if constraint fails...
static assert((checkConcept!T).ok, (checkConcept!T).message);
}

At the language level it would work

void foo(T)(T t) @Concept(CheckerTemplate) // use this to output 
a smart message

if (constraint)
{

}

But really, without changing much what I'd like to see is a DMD 
feature that
would parse and evaluate the constraints to output a message: 
such as


void foo(T)(T t)
if ((a || b) && (a || b))
{
}


error:(a || b) is false


instead of throwing the whole constraint text in the output.

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Manu via Digitalmars-d-announce

On 21 May 2016 at 19:55, Stefan Koch via Digitalmars-d-announce
 wrote:
> On Saturday, 21 May 2016 at 08:45:45 UTC, Manu wrote:
>>
>> Constraints are a good first-step in that direction, but they're unwieldy,
>> produce the worst looking function signatures (read: documentation) of
>> literally any language ever conceived, relatively awkward error feedback,
>> and very quickly get out of hand if you have many variations of possible
>> constraints.
>
>
> Constraints are enough for simple matters.
> As soon as they are used to distinguish between many overloads with
> complicated relationships they become slightly crude.
>
> However this can be worked around with having static ifs and static
> asserts(0).
> I find myself just one variadic template with a lot of static ifs branches.

I also find myself taking that route when I run into cases where there
are numerous constraint combinations. It's better in some ways, but
worse in others.
I think it simplifies API's, but at the same time, it removes
information from the API, and takes away a lot of documentation.

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 05/21/2016 04:45 AM, Manu via Digitalmars-d-announce wrote:

On 20 May 2016 at 18:26, Walter Bright via Digitalmars-d-announce
 wrote:

On 5/19/2016 11:50 PM, Manu via Digitalmars-d-announce wrote:


Ah. Okay, well while this is a very interesting talk, I was indeed
hoping you were going to make a D concepts proposal... do you have
such a thing in mind, or are you against concepts in D?



D has constraints. No point in adding concepts.


I really struggle to agree. Constraints are a good first-step in that
direction, but they're unwieldy, produce the worst looking function
signatures (read: documentation) of literally any language ever
conceived, relatively awkward error feedback, and very quickly get out
of hand if you have many variations of possible constraints.


I guess a lot more detail would be necessary here. A bunch of good folks 
(at least better than me) have worked for over a decade on C++ concepts 
and three (three!) proposals later it's still unclear whether they're a 
good idea. -- Andrei

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Stefan Koch via Digitalmars-d-announce


On Saturday, 21 May 2016 at 08:45:45 UTC, Manu wrote:
Constraints are a good first-step in that direction, but 
they're unwieldy, produce the worst looking function signatures 
(read: documentation) of literally any language ever conceived, 
relatively awkward error feedback, and very quickly get out of 
hand if you have many variations of possible constraints.


Constraints are enough for simple matters.
As soon as they are used to distinguish between many overloads 
with complicated relationships they become slightly crude.


However this can be worked around with having static ifs and 
static asserts(0).
I find myself just one variadic template with a lot of static ifs 
branches.

Re: My ACCU 2016 keynote video available online

2016-05-21 Thread Manu via Digitalmars-d-announce

On 20 May 2016 at 18:26, Walter Bright via Digitalmars-d-announce
 wrote:
> On 5/19/2016 11:50 PM, Manu via Digitalmars-d-announce wrote:
>>
>> Ah. Okay, well while this is a very interesting talk, I was indeed
>> hoping you were going to make a D concepts proposal... do you have
>> such a thing in mind, or are you against concepts in D?
>
>
> D has constraints. No point in adding concepts.

I really struggle to agree. Constraints are a good first-step in that
direction, but they're unwieldy, produce the worst looking function
signatures (read: documentation) of literally any language ever
conceived, relatively awkward error feedback, and very quickly get out
of hand if you have many variations of possible constraints.

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Jens Müller via Digitalmars-d-announce


On Friday, 20 May 2016 at 20:04:35 UTC, Andrei Alexandrescu wrote:

On 5/20/16 2:13 PM, Jens Müller wrote:
No it doesn't work because you need to break in the last case. 
Consider
the case when the last element of a is equal to an element in 
b. Next

iteration you overrun a.


I'm not that Bright :o).


Me neither.

So you'd need one more test, but you still save the other test 
and the test on one of the three branches, so 2 out of 4. --


Yes. Current version reduces it from 4 to 2.
I've read that people use SSE to speed it up. Maybe I consider 
this later.
If we want to improve on similar vectors (which is a great idea) 
we just reorder the cases, I'll guess. Just did it. It improves 
on my test data. But then on near dissimilar input I expect it to 
be worse.
When doing these optimizations it is always dependent on the 
expected input which is a pity when optimizing library functions. 
These must work in all cases and should only be less efficient 
for very good reason.

I'm looking forward to Fastware.

Jens

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 5/20/16 2:13 PM, Jens Müller wrote:

No it doesn't work because you need to break in the last case. Consider
the case when the last element of a is equal to an element in b. Next
iteration you overrun a.


I'm not that Bright :o).

So you'd need one more test, but you still save the other test and the 
test on one of the three branches, so 2 out of 4. -- Andrei

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Walter Bright via Digitalmars-d-announce


On 5/20/2016 6:47 AM, H. S. Teoh via Digitalmars-d-announce wrote:

Not to mention inconsistency in what exactly
is being tested for: if you want to check if something is an input
range, do you use is(typeof(R.empty)), etc., or should you use
__traits(compiles, R.init.empty), or is it is(typeof((R r){r.empty;})),
or any of the 15 or so slightly different ways of testing for the
existence and type of some aggregate member, all subtly different from
each other? Subtly different as in, for instance, testing for
is(typeof((){R r; bool x = r.empty;})) is different from is(typeof(R
r){bool x = r.empty;}), because the former doesn't work with R that has
parameters closing over a local scope, whereas the latter does.


That is not a problem with constraints, it's a result of a dozen people adding 
constraints in an uncoordinated manner. I.e. it's a library problem.




Whereas if D had concepts, it would have been a simple
matter of defining the prototypical range with struct-like syntax and
calling it a day.


That really isn't good enough. Constraints can address behavior and 
relationships, concepts do not.

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Jens Müller via Digitalmars-d-announce


On Friday, 20 May 2016 at 14:14:18 UTC, Andrei Alexandrescu wrote:

On 05/19/2016 06:50 PM, Jens Müller wrote:
What if you stomped over an index in a that has as an equal 
index in b

(it could be anywhere in b).


Hmmm, you're right. So that doesn't work, or at least not 
efficiently (the fixup would entail a binary search in b).


How about this idea: arrange things such that a.back.idx <= 
b.back.idx (i.e. swap them if not so). Then stomp b.back.idx 
with size_t.max. Your kernel is:


if (a[i].idx < b[j].idx) {
  if (i == imax) break; // check needed
  i++;
} else if (a[i].idx > b[j].idx) {
  j++; // no check needed
} else {
  // no check needed
  r += a[i++].val * b[j++].val;
}

The fixup needs only to account for the case a.back.idx == 
b.back.idx. In fact I like this a lot better than others 
because it works real fast for identical vector (taking the 
norm) or similar vectors (often the case). Works?


No it doesn't work because you need to break in the last case. 
Consider the case when the last element of a is equal to an 
element in b. Next iteration you overrun a.

But optimizing for similar vectors is interesting.

Jens

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 05/19/2016 06:50 PM, Jens Müller wrote:

What if you stomped over an index in a that has as an equal index in b
(it could be anywhere in b).


Hmmm, you're right. So that doesn't work, or at least not efficiently 
(the fixup would entail a binary search in b).


How about this idea: arrange things such that a.back.idx <= b.back.idx 
(i.e. swap them if not so). Then stomp b.back.idx with size_t.max. Your 
kernel is:


if (a[i].idx < b[j].idx) {
  if (i == imax) break; // check needed
  i++;
} else if (a[i].idx > b[j].idx) {
  j++; // no check needed
} else {
  // no check needed
  r += a[i++].val * b[j++].val;
}

The fixup needs only to account for the case a.back.idx == b.back.idx. 
In fact I like this a lot better than others because it works real fast 
for identical vector (taking the norm) or similar vectors (often the 
case). Works?



Andrei

Re: My ACCU 2016 keynote video available online

2016-05-20 Thread Manu via Digitalmars-d-announce

On 19 May 2016 at 22:10, Andrei Alexandrescu via
Digitalmars-d-announce  wrote:
> On 5/18/16 7:42 AM, Manu via Digitalmars-d-announce wrote:
>>
>> On 16 May 2016 at 23:46, Andrei Alexandrescu via
>> Digitalmars-d-announce  wrote:
>>>
>>> Uses D for examples, showcases Design by Introspection, and rediscovers a
>>> fast partition routine. It was quite well received.
>>> https://www.youtube.com/watch?v=AxnotgLql0k
>>>
>>> Andrei
>>
>>
>> This isn't the one you said you were going to "destroy concepts" is it?
>> At dconf, you mentioned a talk for release on some date I can't
>> remember, and I got the impression you were going to show how C++'s
>> concepts proposal was broken, and ideally, propose how we can nail it
>> in D?
>
>
> That's the one - sorry to disappoint :o). -- Andrei

Ah. Okay, well while this is a very interesting talk, I was indeed
hoping you were going to make a D concepts proposal... do you have
such a thing in mind, or are you against concepts in D?

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Jens Müller via Digitalmars-d-announce

On Thursday, 19 May 2016 at 22:04:56 UTC, Andrei Alexandrescu 
wrote:

On 05/19/2016 05:36 PM, Jens Müller wrote:
I removed the code to optimize for large gaps. Because it is 
only
confusing. I may generate some benchmark data with larger gaps 
later to

see whether it is worthwhile for such data.


For skipping large gaps quickly, check galloping search (google 
for it, we also have it in phobos). -- Andrei


Sure. I've already seen this. It's nice. But you have to include 
it in the sparse dot product (or list intersection) algorithm 
somehow. Then you require random access and galloping is only 
beneficial if the gaps are large. As a library writer this is a 
difficult position because this turns easily into over 
engineering. Optimally one just exposes the primitives and the 
user plugs them together. Ideally without having to many knobs 
per algorithm.


Jens

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Jens Müller via Digitalmars-d-announce

On Thursday, 19 May 2016 at 22:02:53 UTC, Andrei Alexandrescu 
wrote:

On 05/19/2016 05:36 PM, Jens Müller wrote:

I'm not seeing it. Let me explain.
Consider the input a = [1] and b = [2, 3] (I only write the 
indices).
The smallest back index is 1, i.e., a.back is the chosen 
sentinel.


Nonono, you stamp the largest index over the smaller index. So 
you overwrite a = [3] and you leave b = [2, 3] as it is.


Now you know that you're multiplying two correct sparse vectors 
in which _definitely_ the last elements have equal indexes. So 
the inner loop is:


if (a[i].idx < b[j].idx) {
  i++; // no check needed
} else if (a[i].idx > b[j].idx) {
  j++; // no check needed
} else {
  // check needed
  r += a[i].val * b[j].val;
  if (i == imax || j == jmax) break;
  ++i;
  ++j;
}

At the end you need a fixup to make sure you account for the 
last index that you overwrote (which of course you need to save 
first).


Makes sense?


What if you stomped over an index in a that has as an equal index 
in b (it could be anywhere in b). After the loop finishes you 
restore the index in a. But how do you address the values for the 
stomped over index if needed?

For instance test it on
a = [2]
b = [2,3]
Note the 2 in b could be anywhere.

I think you can check for
if (a[i].idx == sentinelIdx) break;
instead of
if (i == imax || j == jmax) break;

Jens

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 05/19/2016 05:36 PM, Jens Müller wrote:

I removed the code to optimize for large gaps. Because it is only
confusing. I may generate some benchmark data with larger gaps later to
see whether it is worthwhile for such data.


For skipping large gaps quickly, check galloping search (google for it, 
we also have it in phobos). -- Andrei

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 05/19/2016 05:36 PM, Jens Müller wrote:

I'm not seeing it. Let me explain.
Consider the input a = [1] and b = [2, 3] (I only write the indices).
The smallest back index is 1, i.e., a.back is the chosen sentinel.


Nonono, you stamp the largest index over the smaller index. So you 
overwrite a = [3] and you leave b = [2, 3] as it is.


Now you know that you're multiplying two correct sparse vectors in which 
_definitely_ the last elements have equal indexes. So the inner loop is:


if (a[i].idx < b[j].idx) {
  i++; // no check needed
} else if (a[i].idx > b[j].idx) {
  j++; // no check needed
} else {
  // check needed
  r += a[i].val * b[j].val;
  if (i == imax || j == jmax) break;
  ++i;
  ++j;
}

At the end you need a fixup to make sure you account for the last index 
that you overwrote (which of course you need to save first).


Makes sense?


Andrei

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Jens Müller via Digitalmars-d-announce

On Thursday, 19 May 2016 at 12:04:31 UTC, Andrei Alexandrescu 
wrote:

On 5/19/16 4:12 AM, Jens Müller wrote:

---
if (a.length == 0 || b.length == 0)
return 0;
const amax = a.length - 1, bmax = b.length - 1;
size_t i,j = 0;
double sum = 0;
for (;;)
{
if (a[i].index < b[j].index) {
if (i++ == amax) break;
}
else if (a[i].index > b[j].index) {
bumpJ: if (j++ == bmax) break;
}
else
{
assert(a[i].index == b[j].index);
sum += a[i].value * b[j].value;
if (i++ == amax) break;
goto bumpJ;
}
}
return sum;
---

Then if you add the sentinel you only need the bounds tests in 
the third case.


I'm not seeing it. Let me explain.
Consider the input a = [1] and b = [2, 3] (I only write the 
indices). The smallest back index is 1, i.e., a.back is the 
chosen sentinel. Now I assume that we set b.back to a.back 
restoring it after the loop. Now in the case a[i].index < 
b[j].index I have to check whether a[i].index == a.back.index to 
break because otherwise i is incremented (since a[i].index = 1 
and b[j].index = 2, for i = 0 and j = 0 respectively). In the 
last case I only check a[i].index == a.back.index, since this 
implies b[j].index == a.back.index.
So in sum I have two bounds tests. But I think this is not what 
you are thinking of.

This does not look right.
Here are the plots for the implementations.
https://www.scribd.com/doc/313204510/Running-Time
https://www.scribd.com/doc/313204526/Speedup

dot1 is my baseline, which is indeed worse than your baseline 
(dot2). But only on gdc. I choose dot2 as the baseline for 
computing the speedup. dot3 is the sentinel version.


I removed the code to optimize for large gaps. Because it is only 
confusing. I may generate some benchmark data with larger gaps 
later to see whether it is worthwhile for such data.

It looks much more regular now (ldc is still strange).

Jens

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Jens Müller via Digitalmars-d-announce

On Thursday, 19 May 2016 at 12:04:31 UTC, Andrei Alexandrescu 
wrote:

On 5/19/16 4:12 AM, Jens Müller wrote:



What test data did you use?


An instance for benchmarking is generated as follows. Given nnz 
which is the sum of non-zero indices in input vector a and b.


auto lengthA = uniform!"[]"(0, nnz, gen);
auto lengthB = nnz - lengthA;

auto a = iota(0, nnz).randomSample(lengthA, gen).map!(i => 
Pair(i, 10)).array();
auto b = iota(0, nnz).randomSample(lengthB, gen).map!(i => 
Pair(i, 10)).array();


So I take a random sample of (0, ..., nnz) for each input.
Any better idea? I've seen that people generate vectors with 
larger gaps.


10%-20% win on dot product is significant because for many 
algorithms dot product is a kernel and dominates everything 
else. For those any win goes straight to the bottom line.


Sure. Still I wasn't sure whether I got the idea from your talk. 
So maybe there is/was more.


The base line (dot1 in the graphs) is the straightforward 
version


---
size_t i,j = 0;
double sum = 0;
while (i < a.length && j < b.length)
{
 if (a[i].index < b[j].index) i++;
 else if (a[i].index > b[j].index) j++;
 else
 {
 assert(a[i].index == b[j].index);
 sum += a[i].value * b[j].value;
 i++;
 j++;
 }
}
return sum;
---


That does redundant checking. There's a better baseline:

---
if (a.length == 0 || b.length == 0)
return 0;
const amax = a.length - 1, bmax = b.length - 1;
size_t i,j = 0;
double sum = 0;
for (;;)
{
if (a[i].index < b[j].index) {
if (i++ == amax) break;
}
else if (a[i].index > b[j].index) {
bumpJ: if (j++ == bmax) break;
}
else
{
assert(a[i].index == b[j].index);
sum += a[i].value * b[j].value;
if (i++ == amax) break;
goto bumpJ;
}
}
return sum;
---


I check that.

Then if you add the sentinel you only need the bounds tests in 
the third case.


I post the sentinel code later. Probably there is something to 
improve there as well.



BTW the effects vary greatly for different compilers.
For example with dmd the optimized version is slowest. The 
baseline is
best. Weird. With gdc the optimized is best and gdc's code is 
always
faster than dmd's code. With ldc it's really strange. Slower 
than dmd. I

assume I'm doing something wrong here.

Used compiler flags
dmd v2.071.0
-wi -dw -g -O -inline -release -noboundscheck
gdc (crosstool-NG 203be35 - 20160205-2.066.1-e95a735b97) 5.2.0
-Wall  -g -O3 -fomit-frame-pointer -finline-functions -frelease
-fno-bounds-check -ffast-math
ldc (0.15.2-beta2) based on DMD v2.066.1 and LLVM 3.6.1
-wi -dw -g -O3 -enable-inlining -release -boundscheck=off

Am I missing some flags?


These look reasonable.


But ldc looks so bad.
Any comments from ldc users or developers? Because I see this in 
many other measurements as well. I would love to have another 
compiler producing efficient like gdc.

For example what's equivalent to gdc's -ffast-math in ldc.


I uploaded my plots.
- running time 
https://www.scribd.com/doc/312951947/Running-Time

- speed up https://www.scribd.com/doc/312951964/Speedup


What is dot2? Could you please redo the experiments with the 
modified code as well?


dot2 is an optimization for jumping over gaps more quickly 
replacing the first two if statements with while statements.
But my benchmark tests have no large gaps but interestingly it 
does make things worse.


Jens

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 5/18/16 7:42 AM, Manu via Digitalmars-d-announce wrote:

On 16 May 2016 at 23:46, Andrei Alexandrescu via
Digitalmars-d-announce  wrote:

Uses D for examples, showcases Design by Introspection, and rediscovers a
fast partition routine. It was quite well received.
https://www.youtube.com/watch?v=AxnotgLql0k

Andrei


This isn't the one you said you were going to "destroy concepts" is it?
At dconf, you mentioned a talk for release on some date I can't
remember, and I got the impression you were going to show how C++'s
concepts proposal was broken, and ideally, propose how we can nail it
in D?


That's the one - sorry to disappoint :o). -- Andrei

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 5/19/16 4:12 AM, Jens Müller wrote:

The code applying the sentinel optimization assumes mutability of
the input. That needs to be checked for.


Indeed. As I mentioned after discussing find, I didn't worry about those 
checks assuming they were obvious.



That's fine for partition
because that is assumed to be in-place. But for other algorithms it's
not so obvious. It's sad that the optimization works only for
non-const input.


Several optimizations only apply to mutable data. Others apply to 
immutable data. It's the way things go.



I didn't get the idea behind sentinels for sparse dot product. I picked the
smallest of the last elements (so you need bidirectional ranges) and fix
up as
needed. For gdc I get a speedup (baseline over new implementation) of
1.2 in
best case and >1.0 in worst case. On average it's about 1.1 I would say. I
expected more. How would you approach sentinels with the sparse dot
product. Can
you elaborate the idea from the video? I didn't get it.


That's the idea - to only need to bounds check on one of the three 
possibilities.


What test data did you use?

10%-20% win on dot product is significant because for many algorithms 
dot product is a kernel and dominates everything else. For those any win 
goes straight to the bottom line.



The base line (dot1 in the graphs) is the straightforward version

---
size_t i,j = 0;
double sum = 0;
while (i < a.length && j < b.length)
{
 if (a[i].index < b[j].index) i++;
 else if (a[i].index > b[j].index) j++;
 else
 {
 assert(a[i].index == b[j].index);
 sum += a[i].value * b[j].value;
 i++;
 j++;
 }
}
return sum;
---


That does redundant checking. There's a better baseline:

---
if (a.length == 0 || b.length == 0)
return 0;
const amax = a.length - 1, bmax = b.length - 1;
size_t i,j = 0;
double sum = 0;
for (;;)
{
if (a[i].index < b[j].index) {
if (i++ == amax) break;
}
else if (a[i].index > b[j].index) {
bumpJ: if (j++ == bmax) break;
}
else
{
assert(a[i].index == b[j].index);
sum += a[i].value * b[j].value;
if (i++ == amax) break;
goto bumpJ;
}
}
return sum;
---

Then if you add the sentinel you only need the bounds tests in the third 
case.



BTW the effects vary greatly for different compilers.
For example with dmd the optimized version is slowest. The baseline is
best. Weird. With gdc the optimized is best and gdc's code is always
faster than dmd's code. With ldc it's really strange. Slower than dmd. I
assume I'm doing something wrong here.

Used compiler flags
dmd v2.071.0
-wi -dw -g -O -inline -release -noboundscheck
gdc (crosstool-NG 203be35 - 20160205-2.066.1-e95a735b97) 5.2.0
-Wall  -g -O3 -fomit-frame-pointer -finline-functions -frelease
-fno-bounds-check -ffast-math
ldc (0.15.2-beta2) based on DMD v2.066.1 and LLVM 3.6.1
-wi -dw -g -O3 -enable-inlining -release -boundscheck=off

Am I missing some flags?


These look reasonable.


I uploaded my plots.
- running time https://www.scribd.com/doc/312951947/Running-Time
- speed up https://www.scribd.com/doc/312951964/Speedup


What is dot2? Could you please redo the experiments with the modified 
code as well?



Thanks!

Andrei

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 5/16/16 9:46 AM, Andrei Alexandrescu wrote:

Uses D for examples, showcases Design by Introspection, and rediscovers
a fast partition routine. It was quite well received.
https://www.youtube.com/watch?v=AxnotgLql0k


This talk took a big gambit and it seems to have worked well. Per 
https://www.youtube.com/channel/UCJhay24LTpO1s4bIZxuIqKw/videos?sort=p=0=grid, 
"There's Treasure Everywhere" is the most watched talk of the ACCU 2016 
conference with 5276 views with a large margin (next 1874, median 339).


Andrei

Re: My ACCU 2016 keynote video available online

2016-05-19 Thread Jens Müller via Digitalmars-d-announce


On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:
Uses D for examples, showcases Design by Introspection, and 
rediscovers a fast partition routine. It was quite well 
received. https://www.youtube.com/watch?v=AxnotgLql0k


Andrei


Nice presentation.

The code applying the sentinel optimization assumes mutability of 
the input.
That needs to be checked for. That's fine for partition because 
that is assumed
to be in-place. But for other algorithms it's not so obvious. 
It's sad that the
optimization works only for non-const input. It is in conflict 
with the advice
to make input const if the function doesn't change it. This makes 
the
optimization less likely to be applicable. One might though relax 
the const
requirement to mean "the input is identical at return of the 
function to its
beginning". But that's a different story, I'll guess. Coming up 
with another
implementation might also work, using chain or so. But typically 
the sentinel

optimization assumes mutability.

I didn't get the idea behind sentinels for sparse dot product. I 
picked the
smallest of the last elements (so you need bidirectional ranges) 
and fix up as
needed. For gdc I get a speedup (baseline over new 
implementation) of 1.2 in
best case and >1.0 in worst case. On average it's about 1.1 I 
would say. I
expected more. How would you approach sentinels with the sparse 
dot product. Can

you elaborate the idea from the video? I didn't get it.

The base line (dot1 in the graphs) is the straightforward version

---
size_t i,j = 0;
double sum = 0;
while (i < a.length && j < b.length)
{
if (a[i].index < b[j].index) i++;
else if (a[i].index > b[j].index) j++;
else
{
assert(a[i].index == b[j].index);
sum += a[i].value * b[j].value;
i++;
j++;
}
}
return sum;
---

BTW the effects vary greatly for different compilers.
For example with dmd the optimized version is slowest. The 
baseline is
best. Weird. With gdc the optimized is best and gdc's code is 
always
faster than dmd's code. With ldc it's really strange. Slower than 
dmd. I

assume I'm doing something wrong here.

Used compiler flags
dmd v2.071.0
-wi -dw -g -O -inline -release -noboundscheck
gdc (crosstool-NG 203be35 - 20160205-2.066.1-e95a735b97) 5.2.0
-Wall  -g -O3 -fomit-frame-pointer -finline-functions -frelease 
-fno-bounds-check -ffast-math

ldc (0.15.2-beta2) based on DMD v2.066.1 and LLVM 3.6.1
-wi -dw -g -O3 -enable-inlining -release -boundscheck=off

Am I missing some flags?

I uploaded my plots.
- running time https://www.scribd.com/doc/312951947/Running-Time
- speed up https://www.scribd.com/doc/312951964/Speedup

*Disclaimer*
I hope most of this makes sense but take it with a grain of salt.

Jens

PS
It seems the mailinglist interface does not work. I cannot send 
replies anymore via mail. I wrote Brad Roberts but no answer yet.

Re: My ACCU 2016 keynote video available online

2016-05-18 Thread Rory McGuire via Digitalmars-d-announce

On 18 May 2016 1:25 PM, "Bill Hicks via Digitalmars-d-announce" <
digitalmars-d-announce@puremagic.com> wrote:
>
> On Tuesday, 17 May 2016 at 09:34:53 UTC, Rory McGuire wrote:
>>
>>
>> As a South African, I can only say that you are talking nonsense
regarding the horse/zebra joke. If you've been to Africa you will
understand; there really are a lot more Zebra than horses. Although I must
admit I think of horses or Monty Python when I hear hoofbeats.
>>
>> Andrei: Really good talk, thanks for sharing!
>
>
> Don't get it twisted.
>
> There are whites living in South Africa, and they are certainly not
African.  'Africans' refers to the people, the black people, and not all
Africans live in the continent of Africa, because of things like the
slavery.  The fact that Andrei referred to the people is what makes it
racist.  What's the next joke, something about Africans enjoying fried
chicken?

And then what. That the English often enjoy soccer/football?

Stop trying to alienate us. While I'm "white" (more like brown) I'm so
African even my grand parents couldn't get ancestral visas. I also happen
to be Irish so don't even start the slavery BS unless you do your history
research first.

Re: My ACCU 2016 keynote video available online

2016-05-18 Thread Manu via Digitalmars-d-announce

On 16 May 2016 at 23:46, Andrei Alexandrescu via
Digitalmars-d-announce  wrote:
> Uses D for examples, showcases Design by Introspection, and rediscovers a
> fast partition routine. It was quite well received.
> https://www.youtube.com/watch?v=AxnotgLql0k
>
> Andrei

This isn't the one you said you were going to "destroy concepts" is it?
At dconf, you mentioned a talk for release on some date I can't
remember, and I got the impression you were going to show how C++'s
concepts proposal was broken, and ideally, propose how we can nail it
in D?

Re: My ACCU 2016 keynote video available online

2016-05-18 Thread Bill Hicks via Digitalmars-d-announce


On Tuesday, 17 May 2016 at 17:31:21 UTC, Piotrek wrote:


If you want to be a troll please go to the Rust forums. They 
need you there to protect "underrepresented minorities".


Piotrek



I'm here to expose the over-represented majority.

Re: My ACCU 2016 keynote video available online

2016-05-18 Thread Bill Hicks via Digitalmars-d-announce


On Tuesday, 17 May 2016 at 09:34:53 UTC, Rory McGuire wrote:


As a South African, I can only say that you are talking 
nonsense regarding the horse/zebra joke. If you've been to 
Africa you will understand; there really are a lot more Zebra 
than horses. Although I must admit I think of horses or Monty 
Python when I hear hoofbeats.


Andrei: Really good talk, thanks for sharing!


Don't get it twisted.

There are whites living in South Africa, and they are certainly 
not African.  'Africans' refers to the people, the black people, 
and not all Africans live in the continent of Africa, because of 
things like the slavery.  The fact that Andrei referred to the 
people is what makes it racist.  What's the next joke, something 
about Africans enjoying fried chicken?

Re: My ACCU 2016 keynote video available online

2016-05-17 Thread Jesse Phillips via Digitalmars-d-announce


On Tuesday, 17 May 2016 at 08:42:42 UTC, Bill Hicks wrote:
And here you go again with your borderline racist jokes.  Not 
very cool.  If you honestly want to find out if it's "confusing 
to Africans", I suggest you go to a black neighborhood and ask 
them.


Haha, that is probably the most racist statement. When you say 
"black neighborhood" are you talking about in Africa or Americans 
in the US?

Re: My ACCU 2016 keynote video available online

2016-05-17 Thread Andrei Alexandrescu via Digitalmars-d-announce


On 05/16/2016 05:45 PM, QAston wrote:

On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:

Uses D for examples, showcases Design by Introspection, and
rediscovers a fast partition routine. It was quite well received.
https://www.youtube.com/watch?v=AxnotgLql0k

Andrei


Funny, useful, advertises the best parts of D very well.


Thanks. And of course on reddit there's the Slaves Chorus of Nabucco 
chiming in right on cue how it could be done in C++. -- Andrei

Re: My ACCU 2016 keynote video available online

2016-05-17 Thread Piotrek via Digitalmars-d-announce


On Tuesday, 17 May 2016 at 08:42:42 UTC, Bill Hicks wrote:
On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu 
wrote:
Uses D for examples, showcases Design by Introspection, and 
rediscovers a fast partition routine. It was quite well 
received. https://www.youtube.com/watch?v=AxnotgLql0k


Andrei


Incidentally, 2_000_000 D users have been waiting 10 years for 
one guy, you, to complete the containers/allocators and many 
other things.


Man, this sh*t writes itself.

And here you go again with your borderline racist jokes.  Not 
very cool.  If you honestly want to find out if it's "confusing 
to Africans", I suggest you go to a black neighborhood and ask 
them.


If you want to be a troll please go to the Rust forums. They need 
you there to protect "underrepresented minorities".


Piotrek

Re: My ACCU 2016 keynote video available online

2016-05-17 Thread Rory McGuire via Digitalmars-d-announce

On Tue, May 17, 2016 at 10:42 AM, Bill Hicks via Digitalmars-d-announce <
digitalmars-d-announce@puremagic.com> wrote:

> On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:
>
>> Uses D for examples, showcases Design by Introspection, and rediscovers a
>> fast partition routine. It was quite well received.
>> https://www.youtube.com/watch?v=AxnotgLql0k
>>
>> Andrei
>>
>
> Incidentally, 2_000_000 D users have been waiting 10 years for one guy,
> you, to complete the containers/allocators and many other things.
>
> Man, this sh*t writes itself.
>
> And here you go again with your borderline racist jokes.  Not very cool.
> If you honestly want to find out if it's "confusing to Africans", I suggest
> you go to a black neighborhood and ask them.
>
>
>

As a South African, I can only say that you are talking nonsense regarding
the horse/zebra joke. If you've been to Africa you will understand; there
really are a lot more Zebra than horses. Although I must admit I think of
horses or Monty Python when I hear hoofbeats.

Andrei: Really good talk, thanks for sharing!

Re: My ACCU 2016 keynote video available online

2016-05-16 Thread QAston via Digitalmars-d-announce


On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:
Uses D for examples, showcases Design by Introspection, and 
rediscovers a fast partition routine. It was quite well 
received. https://www.youtube.com/watch?v=AxnotgLql0k


Andrei


Funny, useful, advertises the best parts of D very well.

Re: My ACCU 2016 keynote video available online

2016-05-16 Thread Nordlöw via Digitalmars-d-announce


On Monday, 16 May 2016 at 13:46:11 UTC, Andrei Alexandrescu wrote:
Uses D for examples, showcases Design by Introspection, and 
rediscovers a fast partition routine. It was quite well 
received. https://www.youtube.com/watch?v=AxnotgLql0k


Andrei


Great! Your talks are always pushedFront in my view queue 

Did you manage to recruit any new D liutenants 

Re: My ACCU 2016 keynote video available online

2016-05-16 Thread Walter Bright via Digitalmars-d-announce


On 5/16/2016 6:46 AM, Andrei Alexandrescu wrote:

Uses D for examples, showcases Design by Introspection, and rediscovers a fast
partition routine. It was quite well received.
https://www.youtube.com/watch?v=AxnotgLql0k


https://www.reddit.com/r/programming/comments/4jlkhv/accu_2016_keynote_by_andrei_alexandrescu/

37 matches

Mail list logo