On Monday, 11 July 2022 at 18:15:16 UTC, Ivan Kazmenko wrote:
Hi.
I'm looking at the compiler output of DMD (-O -release), LDC
(-O -release), and GDC (-O3) for a simple array operation:
```
void add1 (int [] a)
{
    foreach (i; 0..a.length)
        a[i] += 1;
}
```
Here are the outputs:
On Friday, 29 October 2021 at 14:23:49 UTC, Kagamin wrote:
Unsigned integers aren't numbers.
assert(-abs(1)<0);
Unsigneds approximate whole numbers of course (truncated on one
side). Likewise signeds approximate integers (across a
restricted interval). As always, we need to be careful with
I'm nervous enough about future compilations/builds of the code
that I'm responsible for that I employ the following idiom quite
a bit, mostly in @trusted code:
(some boolean expression denoting invariants) || assert(0,
"what went wrong");
How might the above cause problems and how do you
On Saturday, 10 July 2021 at 01:11:28 UTC, Steven Schveighoffer
wrote:
You can get better than hnsecs resolution with
`core.time.MonoTime`, which can support whatever the OS
supports.
However, `Duration` and `SysTime` are stored in hnsecs for a
very specific reason -- range. Simply put,
On Sunday, 7 March 2021 at 14:15:58 UTC, z wrote:
On Thursday, 25 February 2021 at 14:28:40 UTC, Guillaume Piolat
wrote:
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that
On Thursday, 25 February 2021 at 11:28:14 UTC, z wrote:
How does one optimize code to make full use of the CPU's SIMD
capabilities?
Is there any way to guarantee that "packed" versions of SIMD
instructions will be used? (e.g. vmulps, vsqrtps, etc.)
To give some context, this is a sample of one
On Friday, 29 January 2021 at 20:01:17 UTC, Bruce Carneal wrote:
On Friday, 29 January 2021 at 17:46:05 UTC, Guillaume Piolat
wrote:
On Friday, 29 January 2021 at 16:34:25 UTC, Bruce Carneal
wrote:
The project I've been working on for the last few months has
a compute backend that is currently
On Friday, 29 January 2021 at 17:46:05 UTC, Guillaume Piolat
wrote:
On Friday, 29 January 2021 at 16:34:25 UTC, Bruce Carneal wrote:
The project I've been working on for the last few months has a
compute backend that is currently written MT+SIMD. I would
like to bring up a GPU variant.
What
On Friday, 29 January 2021 at 18:23:40 UTC, mw wrote:
On Friday, 29 January 2021 at 16:34:25 UTC, Bruce Carneal wrote:
Guidance from experience regarding any of the above, or other,
GPU possibilities would be most welcome.
https://dlang.org/blog/2017/10/30/d-compute-running-d-on-the-gpu/
The project I've been working on for the last few months has a
compute backend that is currently written MT+SIMD. I would like
to bring up a GPU variant.
If you have experience with this sort of thing, I'd love to hear
from you, either within this forum or at beerconf.
In a past life I was
On Sunday, 6 December 2020 at 16:42:00 UTC, Ola Fosheim Grostad
wrote:
On Sunday, 6 December 2020 at 14:44:25 UTC, Paulo Pinto wrote:
And while on the subject of low level programming in JVM or
.NET.
https://www.infoq.com/news/2020/12/net-5-runtime-improvements/
Didn't say anything about low
On Sunday, 6 December 2020 at 08:59:49 UTC, Ola Fosheim Grostad
wrote:
On Sunday, 6 December 2020 at 08:36:49 UTC, Bruce Carneal wrote:
Yes, but they don't allow low level programming. Go also
freezes to sync threads; this has a rather profound impact on
code generation. They have spent a lot of
On Sunday, 6 December 2020 at 08:12:58 UTC, Ola Fosheim Grostad
wrote:
On Sunday, 6 December 2020 at 07:45:17 UTC, Bruce Carneal wrote:
GCs scan memory, sure. Lots of variations. Not germane. Not
a rationale.
We need to freeze the threads when collecting stacks/globals.
OK. Low latency
On Sunday, 6 December 2020 at 06:52:41 UTC, Ola Fosheim Grostad
wrote:
On Sunday, 6 December 2020 at 05:41:05 UTC, Bruce Carneal wrote:
OK. Some rationale? Do you, for example, believe that
no-probable-dlanger could benefit from a low-latency GC? That
it is too hard to implement? That the
On Sunday, 6 December 2020 at 05:29:37 UTC, Ola Fosheim Grostad
wrote:
On Sunday, 6 December 2020 at 05:16:26 UTC, Bruce Carneal wrote:
How difficult would it be to add a selectable, low-latency GC
to dlang?
Is it closer to "we can't get there from here" or "no big deal
if you already have
How difficult would it be to add a selectable, low-latency GC to
dlang?
Is it closer to "we can't get there from here" or "no big deal if
you already have the low-latency GC in hand"?
I've heard Walter mention performance issues (write barriers
IIRC). I'm also interested in the GC-flavor
On Friday, 23 October 2020 at 16:56:46 UTC, Kagamin wrote:
On Thursday, 22 October 2020 at 18:24:47 UTC, Bruce Carneal
wrote:
Per the wiki on termination analysis some languages with
dependent types (Agda, Coq) have built-in termination checkers.
What they do with code that does, say, a hash
On Friday, 23 October 2020 at 04:24:09 UTC, Paul Backus wrote:
On Friday, 23 October 2020 at 00:53:19 UTC, Bruce Carneal wrote:
When you write functions, the compiler helps you out with
fully automated constraint checking. When you write templates
you can write them so that they look like
On Thursday, 22 October 2020 at 20:37:22 UTC, Paul Backus wrote:
On Thursday, 22 October 2020 at 19:24:53 UTC, Bruce Carneal
wrote:
On a related topic, I believe that type functions enable a
large amount of code in the "may be hard to prove decidable"
category (templates) to be (re)written as
On Thursday, 22 October 2020 at 18:46:07 UTC, Ola Fosheim Grøstad
wrote:
On Thursday, 22 October 2020 at 18:38:12 UTC, Stefan Koch wrote:
On Thursday, 22 October 2020 at 18:33:52 UTC, Ola Fosheim
Grøstad wrote:
In general, it is hard to tell if a computation is
long-running or unsolvable.
On Thursday, 22 October 2020 at 18:04:32 UTC, Ola Fosheim Grøstad
wrote:
On Thursday, 22 October 2020 at 17:25:44 UTC, Bruce Carneal
wrote:
Is type checking in D undecidable? Per the wiki on dependent
types it sure looks like it is.
Even if it is, you can still write something that is
Is type checking in D undecidable? Per the wiki on dependent
types it sure looks like it is.
I assume that it's well known to the compiler contributors that D
type checking is undecidable which, among other reasons, is why
we have things like template recursion limits.
Confirmation of the
On Monday, 10 August 2020 at 13:52:46 UTC, Steven Schveighoffer
wrote:
On 8/9/20 8:46 AM, Steven Schveighoffer wrote:
On 8/9/20 8:37 AM, Steven Schveighoffer wrote:
I think this has come up before, there may even be a bug
report on it.
Found one, I'll see if I can fix the array runtime:
On Sunday, 9 August 2020 at 12:37:06 UTC, Steven Schveighoffer
wrote:
On 8/9/20 8:09 AM, Bruce Carneal wrote:
[...]
All blocks in the GC that are more than 16 bytes are aligned by
32 bytes. You shouldn't have any 16 byte blocks here, because
each element is 32 bytes long.
However, if your
On Sunday, 9 August 2020 at 10:02:32 UTC, kinke wrote:
On Sunday, 9 August 2020 at 01:03:51 UTC, Bruce Carneal wrote:
Is sub .alignof alignment expected here? IOW, do I have to
manually manage memory if I want alignments above 16?
IIRC, yes when using the GC, as that only guarantees 16-bytes
On Sunday, 9 August 2020 at 09:58:18 UTC, Johan wrote:
On Sunday, 9 August 2020 at 01:03:51 UTC, Bruce Carneal wrote:
The .alignof attribute of __vector(ubyte[32]) is 32 but
initializing an array of such vectors via an assignment to
.length has given me 16 byte alignment (and subsequent seg
On Sunday, 9 August 2020 at 05:49:23 UTC, user1234 wrote:
On Sunday, 9 August 2020 at 01:56:54 UTC, Bruce Carneal wrote:
On Sunday, 9 August 2020 at 01:03:51 UTC, Bruce Carneal wrote:
Manually managing the alignment eliminated the seg faulting.
Additionally, I found that
On Sunday, 9 August 2020 at 01:03:51 UTC, Bruce Carneal wrote:
The .alignof attribute of __vector(ubyte[32]) is 32 but
initializing an array of such vectors via an assignment to
.length has given me 16 byte alignment (and subsequent seg
faults which I suspect are related).
Is sub .alignof
The .alignof attribute of __vector(ubyte[32]) is 32 but
initializing an array of such vectors via an assignment to
.length has given me 16 byte alignment (and subsequent seg faults
which I suspect are related).
Is sub .alignof alignment expected here? IOW, do I have to
manually manage
On Monday, 3 August 2020 at 18:55:36 UTC, Steven Schveighoffer
wrote:
On 8/2/20 1:31 PM, Bruce Carneal wrote:
import std;
void f0(int[] a, int[] b, int[] dst) @safe {
    dst[] = a[] + b[];
}
[snip of auto-vectorization example]
I was surprised that f0 ran just fine with a.length and
import std;
void f0(int[] a, int[] b, int[] dst) @safe {
    dst[] = a[] + b[];
}
void f1(int[] a, int[] b, int[] dst) @trusted {
    const minLen = min(a.length, b.length, dst.length);
    dst[0..minLen] = a[0..minLen] + b[0..minLen];
    assert(dst.length == minLen);
}
I was surprised that
On Tuesday, 30 June 2020 at 20:43:00 UTC, Bruce Carneal wrote:
On Tuesday, 30 June 2020 at 20:12:59 UTC, Stanislav Blinov
wrote:
On Tuesday, 30 June 2020 at 20:04:33 UTC, Steven Schveighoffer
wrote:
The answer is -- update Phobos so it works with
-nosharedaccess :)
Yeah... and dip1000. And
On Tuesday, 30 June 2020 at 20:12:59 UTC, Stanislav Blinov wrote:
On Tuesday, 30 June 2020 at 20:04:33 UTC, Steven Schveighoffer
wrote:
The answer is -- update Phobos so it works with
-nosharedaccess :)
Yeah... and dip1000. And dip1008. And dip... :)
Didn't want to be snippity but, yeah,
Given -preview=nosharedaccess on the command line, "hello world"
fails to compile (you are referred to core.atomic ...).
What is the idiomatic way to get writeln style output from a
nosharedaccess program?
Is separate compilation the way to go?
On Sunday, 12 April 2020 at 23:14:42 UTC, Bruce Carneal wrote:
Could dlang compilers emit aliases for extern(C) and
extern(C++) routines that would carry dlang specific
information? (@safe, @nogc, nothrow, ...)
I'm thinking two symbols. The first as per normal C/C++, and
the second as per
Could dlang compilers emit aliases for extern(C) and extern(C++)
routines that would carry dlang specific information? (@safe,
@nogc, nothrow, ...)
I'm thinking two symbols. The first as per normal C/C++, and the
second as per normal dlang with a "use API {C, C++, ...}" suffix.
On Saturday, 28 March 2020 at 18:01:37 UTC, Crayo List wrote:
On Saturday, 28 March 2020 at 06:56:14 UTC, Bruce Carneal wrote:
On Saturday, 28 March 2020 at 05:21:14 UTC, Crayo List wrote:
On Monday, 23 March 2020 at 18:52:16 UTC, Bruce Carneal wrote:
[snip]
Explicit SIMD code, ispc or other,
On Saturday, 28 March 2020 at 05:21:14 UTC, Crayo List wrote:
On Monday, 23 March 2020 at 18:52:16 UTC, Bruce Carneal wrote:
[snip]
(on the downside you have to guard against compiler code-gen
performance regressions)
auto vectorization is bad because you never know if your code
will get
When speeds are equivalent, or very close, I usually prefer auto
vectorized code to explicit SIMD/__vector code as it's easier to
read. (on the downside you have to guard against compiler
code-gen performance regressions)
One oddity I've noticed is that I sometimes need to use
On Friday, 28 February 2020 at 10:11:23 UTC, Bruce Carneal wrote:
On Friday, 28 February 2020 at 06:50:55 UTC, 9il wrote:
On Wednesday, 26 February 2020 at 00:50:35 UTC, Basile B.
wrote:
So after reading the translation of RYU I was interested to
see if the decimalLength() function can be
On Friday, 28 February 2020 at 06:50:55 UTC, 9il wrote:
On Wednesday, 26 February 2020 at 00:50:35 UTC, Basile B. wrote:
So after reading the translation of RYU I was interested to
see if the decimalLength() function can be written to be
faster, as it cascades up to 8 CMP.
[...]
bsr can
On Thursday, 27 February 2020 at 19:46:23 UTC, Basile B. wrote:
Yes please, post the benchmark method. You see the benchmarks I
run with your version are always slowest. I'm aware that rndGen
(and generally any uniform rnd func) is subject to a bias but I
don't think this bias matters much in the
On Thursday, 27 February 2020 at 17:11:48 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 15:29:02 UTC, Bruce Carneal
wrote:
On Thursday, 27 February 2020 at 08:52:09 UTC, Basile B. wrote:
I will post my code if there is any meaningful difference in
your subsequent results.
give me
On Thursday, 27 February 2020 at 15:29:02 UTC, Bruce Carneal
wrote:
big snip
TL;DR for the snipped: Unsurprisingly, different inputs will lead
to different timing results. The equi-probable values supplied
by a standard PRNG differ significantly from an equi-probable
digit input. In
On Thursday, 27 February 2020 at 08:52:09 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 04:44:56 UTC, Basile B. wrote:
On Thursday, 27 February 2020 at 03:58:15 UTC, Bruce Carneal
wrote:
Maybe you talked about another implementation of
decimalLength9 ?
Yes. It's one I wrote after
On Thursday, 27 February 2020 at 03:58:15 UTC, Bruce Carneal
wrote:
On Wednesday, 26 February 2020 at 23:09:34 UTC, Basile B. wrote:
On Wednesday, 26 February 2020 at 20:44:31 UTC, Bruce Carneal
wrote:
After shuffling the input, branchless wins by 2.4X (240%).
snip
Let me know if the
On Wednesday, 26 February 2020 at 23:09:34 UTC, Basile B. wrote:
On Wednesday, 26 February 2020 at 20:44:31 UTC, Bruce Carneal
wrote:
After shuffling the input, branchless wins by 2.4X (240%).
I've replaced the input by the front of a rndGen (that pops for
count times and starting with a
On Wednesday, 26 February 2020 at 19:44:05 UTC, Bruce Carneal
wrote:
On Wednesday, 26 February 2020 at 13:50:11 UTC, Basile B. wrote:
On Wednesday, 26 February 2020 at 00:50:35 UTC, Basile B.
wrote:
...
foreach (i; 0 .. count)
    sum += funcs[func](i);
The input stream is highly
On Wednesday, 26 February 2020 at 13:50:11 UTC, Basile B. wrote:
On Wednesday, 26 February 2020 at 00:50:35 UTC, Basile B. wrote:
...
foreach (i; 0 .. count)
    sum += funcs[func](i);
The input stream is highly predictable and strongly skewed
towards higher digits.
The winning