Re: Unsigned comparison operators

2017-10-17 Thread Cecil Ward via Digitalmars-d

On Friday, 29 September 2017 at 14:13:07 UTC, Stefan Koch wrote:


I have a hard time imagining a use case.


It just came up in a real application. It was a case of 
bit-twiddling: it arose during some weird 'arithmetic' while 
trying to write a jump-free replacement expression for a ? :, 
because the compiler still generated jumps for a ? : when I 
didn't want it to. (A jump-less attempt at Bresenham's line 
algorithm.)
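
The sort of jump-free select I mean, as a minimal sketch 
(hypothetical example, not the actual Bresenham code): build an 
all-ones/all-zeros mask from the condition and blend the two 
values with bitwise operations.

int select(bool cond, int a, int b)
{
    immutable int mask = -(cast(int) cond); // all ones if cond, else all zeros
    return (a & mask) | (b & ~mask);        // a if cond, else b
}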




Re: ACLs for variables - lock / unlock idea

2017-09-29 Thread Cecil Ward via Digitalmars-d

On Friday, 29 September 2017 at 13:07:32 UTC, Atila Neves wrote:

auto modifiable = foo();
{
const nonModifiable = modifiable;
//...
}


I had already thought about using two names.

I don’t think that using a kind of ‘const-alias’ mechanism (or 
‘const reference’, in the C++ var& sense) would be a good idea at 
all. By this I mean something where a modifiable variable has two 
names, only one of them permitting write access. The reason why 
this wouldn’t be good enough is that there would be nothing much 
stopping me from forgetting what I’m doing and going back to 
using the wrong name at some point, without any warnings. A 
slight mitigation would be if the original declaration were 
something like myvar_writeable and the const alias for it were 
called simply myvar, so that the default brain-free form would be 
the safe one and you would have to go out of your way to get 
write access. It still wouldn’t be that strong, though.
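
A minimal sketch of that mitigation in current D (names 
hypothetical), using a pointer as the read-only view; D’s 
transitive const makes writes through the view fail, although the 
dereference syntax is a wart:

void example()
{
    int myvar_writeable = 42;
    const(int)* myvar = &myvar_writeable; // read-only view of the same storage

    // *myvar = 1;          // error: cannot modify const expression
    myvar_writeable = 1;    // writing means going out of your way
    assert(*myvar == 1);    // the view tracks the underlying variable
}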


IIRC you get told off if you block access to variables by 
‘shadowing’ them, declaring an exactly matching name using an 
alias declaration in a normal basic block scope. (I suspect you 
can do so inside function bodies; I don’t think D moans about 
‘shadowing’ local variable declarations that block access to 
matching variables in global or outer, non-global, non-local 
scopes - not sure.) Anyway, I couldn’t use that either, and it’s 
nowhere near usable enough for me.
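
For reference, a quick illustration of the telling-off (as I 
understand current D; hypothetical snippet): shadowing an 
enclosing local is rejected, while hiding a module-level name is 
accepted.

int g; // module scope: a local of the same name merely hides this

void f()
{
    int x = 1;
    {
        // int x = 2; // error: variable x is shadowing variable f.x
        int g = 2;    // accepted: blocks access to the module-level g
    }
}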


Unsigned comparison operators

2017-09-29 Thread Cecil Ward via Digitalmars-d
I love the D >>> operator and I use it a lot. So much safer than 
the chaos in C.


I would absolutely love to have unsigned comparison operators in 
D. Do you agree? What on earth would the syntax be like?


Yes, I could write a generic function or something, but the 
result would look ugly. And I would have to be very careful not 
to screw up type conversions (although I’m pretty sure there is 
some help available in the runtime library to assist in getting 
the job done safely).
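
A minimal sketch of the sort of generic function I mean (names 
hypothetical), with std.traits taking care of the type 
conversions: reinterpret both operands as the unsigned type of 
the same width and compare those.

import std.traits : isIntegral, Unsigned;

bool unsignedLess(T)(T a, T b) if (isIntegral!T)
{
    alias U = Unsigned!T;
    return cast(U) a < cast(U) b; // unsigned <
}

unittest
{
    assert(!unsignedLess(-1, 1)); // -1 reinterpreted as 0xFFFF_FFFF
    assert(unsignedLess(1, 2));
}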


ACLs for variables - lock / unlock idea

2017-09-29 Thread Cecil Ward via Digitalmars-d
An idea, I’d be interested to hear if this might be at all useful 
to anyone -


If (and only if) a variable is declared as modifiable (neither 
immutable nor const), would it possibly be useful to have a kind 
of scoped const on/off switch? It could either impose const 
within a block, and/or the converse: impose const as the default 
everywhere, starting say from an initial directive straight after 
the (modifiable) declaration, while letting you flick the const 
protection switch off within a block.
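
Not the proposed switch, but the nearest approximation I can see 
in current D (hypothetical example): pass the variable into a 
scope that only sees a const view.

void example()
{
    int counter = 0;
    counter = 10;             // modifiable out here

    // "const on" for this block: c is a read-only view of counter
    (ref const(int) c) {
        // c = 11;            // error: cannot modify const expression
        assert(c == 10);
    }(counter);

    counter = 11;             // "const off" again after the block
}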


Re: C `restrict` keyword in D

2017-09-06 Thread Cecil Ward via Digitalmars-d

On Monday, 4 September 2017 at 09:15:30 UTC, ag0aep6g wrote:

On 09/04/2017 06:10 AM, Moritz Maxeiner wrote:
That doesn't crash at the call site, but only when the callee 
accesses the parameter:


That's just an observation based on a detail of a particular 
compiler implementation. It's simply not true in general, but 
might appear that way in a particular case. Did you inspect the 
generated code? If the entire thing has been _inlined_ and 
properly optimised, as decent modern compilers most definitely 
all do _when the correct switches are used_, then looking at the 
code there is no such thing as caller and callee - it's all just 
one stream of code.





Re: Fast hashtable

2017-03-02 Thread Cecil Ward via Digitalmars-d

On Wednesday, 1 March 2017 at 09:45:23 UTC, Daniel Kozak wrote:

On Wednesday, 1 March 2017 at 06:44:34 UTC, Cecil Ward wrote:


I liked that article. I didn't really understand the point 
about the implementation of modulo primes; maybe I missed 
something. Given that our man is doing modulo a 'known' value 
(he had a switch statement to get to them), why not do 
something rather cheaper than a compiler-expanded constant 
div/mod made up of multiplies and shifts:

const uint power2 = 512; // say, some 1 << n anyway
const uint prime = 509;  // some prime just below the power, some prime > power2/2


static assert( power2 - 1 - prime < prime );

x = x & ( power2 - 1 );
x = ( x >= prime ) ? x - prime : x;

which is good news on my x86 with GDC -O3 (only three 
operations: and, sub, cmovx) - all fine provided you make sure 
that you are getting CMOVx and not branches. I could work out 
the power from the prime using CTFE given a bit of thought. 
Maybe CTFE could even do the reverse?


Have I finally gone mad?


Yes :D, this is something compiler should do.

btw: https://github.com/dlang/phobos/pull/1452


Does anyone know if the compilers could use this for code 
generation? I did later CTFE the prime from the power; the 
reverse can be done even more easily, which is the way compilers 
would need to do it for division by a known x.
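
A minimal CTFE sketch of deriving the prime from the power 
(hypothetical helper names); both values come out at compile 
time, so the static assert from the earlier post holds for free:

bool isPrime(uint n)
{
    if (n < 2) return false;
    for (uint d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

uint primeBelow(uint power2)
{
    uint p = power2 - 1;
    while (!isPrime(p)) --p;
    return p;
}

enum uint power2 = 512;
enum uint prime = primeBelow(power2); // 509, computed at compile time
static assert(power2 - 1 - prime < prime);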


Re: Fast hashtable

2017-03-02 Thread Cecil Ward via Digitalmars-d

On Wednesday, 1 March 2017 at 12:59:30 UTC, deadalnix wrote:

On Wednesday, 1 March 2017 at 06:44:34 UTC, Cecil Ward wrote:

const uint power2 = 512; // say, some 1 << n anyway
const uint prime = 509;  // some prime just below the power, some prime > power2/2


static assert( power2 - 1 - prime < prime );

x = x & ( power2 - 1 );
x = ( x >= prime ) ? x - prime : x;

which is good news on my x86 with GDC -O3 (only three 
operations: and, sub, cmovx) - all fine provided you make sure 
that you are getting CMOVx and not branches. I could work out 
the power from the prime using CTFE given a bit of thought. 
Maybe CTFE could even do the reverse?


Have I finally gone mad?


The lower slots will be twice as crowded as the higher ones.


Sorry, I think I was unclear: I was suggesting that the author 
should use modulo-the-prime. The power of two is irrelevant; it's 
just a quick(er?) way of computing the modulo. Are we on the same 
wavelength?


Re: Fast hashtable

2017-02-28 Thread Cecil Ward via Digitalmars-d

On Tuesday, 28 February 2017 at 21:00:05 UTC, H. S. Teoh wrote:
On Tue, Feb 28, 2017 at 12:57:14PM -0500, Andrei Alexandrescu 
via Digitalmars-d wrote:
This is of possible interest: 
https://probablydance.com/2017/02/26/i-wrote-the-fastest-hashtable/ --


Related to this, recently I found some interesting papers for 
large external (i.e., on-disk) hash tables:


http://www.itu.dk/people/pagh/papers/linear.pdf
http://www.weizhewei.com/papers/pods020-wei.pdf

AIUI, blocked probing (1st paper) is similar to Robin Hood 
hashing, in that inserting an entry may cause an existing entry 
to be moved out of its occupied slot to a different one, but 
blocked probing also has additional interesting characteristics:


- Scanning is done in blocks of size 2^i with starting slots of
  index 2^i for incrementing i. This structure makes it
  cache-oblivious, and thus able to take advantage of the modern
  cache hierarchy without needing cache-specific tuning.

- Something special about the 2^i block sizes makes the math just
  work out (see 2nd paper), so that lookups have expected average
  1 + 1/2^Ω(b) I/O operations, where b is the block size
  (provided the load factor is bounded away from 1), which is a
  marked improvement over plain linear probing, which has
  expected 1 + O(α/b) I/O operations, where α is the load factor.

I didn't look closely enough at the analysis to know for sure, 
but it seems that since the analysis is cache-oblivious, the 
O(1 + 1/2^Ω(b)) I/O operations should generalize to cache misses 
as well (as part of the memory hierarchy, if you regard 
secondary storage as the lowest level of the hierarchy). So I'm 
expecting this might be even faster than the Robin Hood hashing 
in your linked blog.



T


I liked that article. I didn't really understand the point about 
the implementation of modulo primes; maybe I missed something. 
Given that our man is doing modulo a 'known' value (he had a 
switch statement to get to them), why not do something rather 
cheaper than a compiler-expanded constant div/mod made up of 
multiplies and shifts:


const uint power2 = 512; // say, some 1 << n anyway
const uint prime = 509;  // some prime just below the power, some prime > power2/2


static assert( power2 - 1 - prime < prime );

x = x & ( power2 - 1 );
x = ( x >= prime ) ? x - prime : x;

which is good news on my x86 with GDC -O3 (only three 
operations: and, sub, cmovx) - all fine provided you make sure 
that you are getting CMOVx and not branches. I could work out 
the power from the prime using CTFE given a bit of thought. 
Maybe CTFE could even do the reverse?


Have I finally gone mad?


Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 18:17:21 UTC, Cecil Ward wrote:

On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:


* special values of objects such as zero and one, so that (x 
⊛ zero) ≡ x, and that (zero ⊛ x) ≡ x


(Should of course read
(x ⊛ zero) ≡ zero, and that (one ⊛ x) ≡ x
if you take the operator as being like multiplication.)



Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:
...
A useful phrase I saw today: “declaration of intent given by 
the programmer to the compiler”.


Particular dream wish-list items of mine: some kind of mechanism 
that could express possible operator properties, class properties 
and arithmetic identities, identity operations. Examples:

* commutativity;
* 1.0 / (1.0 / x) ≡ x, with or without ignoring of zero and 
ignoring of IEEE weirdos; or

* sin²(x) + cos²(x) ≡ 1
* special values of objects such as zero and one, so that (x ⊛ 
zero) ≡ x, and that (zero ⊛ x) ≡ x

* D strings could be in on this too, so that x ~ "" ≡ x; arrays 
too
* arithmetic operators’ properties and identities as they apply 
to complex numbers


Another dream: strength reductions, so that sequences / patterns 
of operators (back to identities again, sort of) could be mapped 
to named helper functions or operators. For example, with 
strings: s1 ~ s2 ~ s3 ~ … → StringArrayConcat( [] )
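
What the target of such a mapping might look like, as a hedged 
sketch (stringArrayConcat is just the hypothetical helper named 
above, spelt D-style): one sized allocation instead of one 
allocation per ~.

import std.array : appender;

string stringArrayConcat(string[] parts...)
{
    auto app = appender!string();
    size_t total = 0;
    foreach (p; parts) total += p.length;
    app.reserve(total);            // a single up-front allocation
    foreach (p; parts) app.put(p);
    return app.data;
}

unittest
{
    // what s1 ~ s2 ~ s3 could be strength-reduced to
    assert(stringArrayConcat("a", "bc", "d") == "a" ~ "bc" ~ "d");
}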


Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:
I found it hard to believe LDC generates such crappy code when 
optimizing. These are my results using LDC master on Win64 
(`ldc2 -O -release -output-s`):


struct Foo
{
    immutable _u = 8;
    int foo() const
    {
        return 8 * _u;
    }
}

int use(ref const(Foo) foo)
{
    return foo.foo() + foo.foo();
}

int main()
{
    Foo f;
    return use(f);
}


_D7current3Foo3fooMxFZi:
    movl (%rcx), %eax
    shll $3, %eax
    retq

_D7current3useFKxS7current3FooZi:
    movl (%rcx), %eax
    shll $4, %eax
    retq

_Dmain:
    movl $128, %eax
    retq

Sure, Foo.foo() and use() could return a constant, but 
otherwise it can't get much better than this.


I think the optimisation here happens only because LDC can “see” 
the text of the method. When expansion is not possible, that 
would be the real test.




Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 18:07:14 UTC, Cecil Ward wrote:

On Thursday, 25 August 2016 at 17:22:27 UTC, kinke wrote:

[...]


I think the optimisation here happens only because LDC can 
“see” the text of the method. When expansion is not possible, 
that would be the real test.


(Assuming LDC behaves like GDC. I'm unfamiliar with LDC, I'm 
ashamed to admit.)


Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 14:42:28 UTC, Basile B. wrote:

On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:

[...]


I'll add

* create temporaries based on the const function attribute.

I don't know why, but I believed that this was already the case. 
After disassembling a short test with DMD and LDMD2 it is clear 
that it is not:


°°
struct Foo
{
    immutable _u = 8;
    int foo() const
    {
        return 8 * _u;
    }
}

int use(ref const(Foo) foo)
{
    return foo.foo() + foo.foo();
}
°°

disasm of use (LDC2 via LDMD2, -O -release)

00402930h  sub rsp, 18h
00402934h  mov qword ptr [rsp+10h], rdi
00402939h  call 004028F0h ; (Foo.foo)
0040293Eh  mov rdi, qword ptr [rsp+10h]
00402943h  mov dword ptr [rsp+0Ch], eax
00402947h  call 004028F0h ; (Foo.foo)
0040294Ch  mov ecx, dword ptr [rsp+0Ch]
00402950h  add ecx, eax
00402952h  mov eax, ecx
00402954h  add rsp, 18h
00402958h  ret

But Foo.foo's constness guarantees that Foo's state is not 
modified. So the result of the first CALL could be cached in a 
temporary and reused instead of making the second CALL. This 
would help, for example, in loops where a getter function is 
called to find the iteration count.


The problem of the non-caching of appropriate function calls is 
not confined to methods. It is also observable when calling 
explicitly pure-marked external functions, e.g. my_pure() + 
my_pure() makes two calls. (Checked in GCC -O3, with an extern 
pure-marked function.)


This is often covered up by inlining with full expansion, as 
non-extern functions don't show it.
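
A minimal sketch of that observation (names hypothetical; here 
the function is defined but inlining is suppressed so the calls 
survive): a strongly pure function called twice with the same 
argument is a candidate for caching, yet two calls are typically 
emitted.

pragma(inline, false)
pure nothrow @safe int myPure(int x)
{
    return x * x + 1;
}

int twice(int x)
{
    return myPure(x) + myPure(x); // the first result could be reused
}

unittest
{
    assert(twice(3) == 20);
}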


Re: Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d

On Thursday, 25 August 2016 at 11:55:08 UTC, Cauterite wrote:

On Thursday, 25 August 2016 at 11:16:52 UTC, Cecil Ward wrote:

* the GCC "__builtin_expect()"


Instead of adding individual micro-optimisation features like 
this, I'd be more interested in the potential for 
profile-guided optimisation (which *ideally* can make these 
micro-optimisation decisions automatically). Since DMD already 
has some framework in place to support code profiling, I 
suspect this is at least a feasible enhancement.


On the other hand, it might not be worth trying to play 
catch-up with existing PGO features in GCC/LLVM. If you're 
using PGO, you're probably already using these other backends 
for their more advanced static optimisers.


One killer reason for me to use D rather than C or C++ would be 
if it either has, or could be enhanced to have, greater code 
optimisation possibilities. LTO isn't relevant here because it's 
equally applicable to other languages (in GCC at any rate, as I 
understand it). Aside from its own properties, D might have an 
advantage over C because its spec development could possibly be 
more agile, well, compared with C _standards_ anyway. GCC has 
kept innovating with non-standard features, to its credit. I 
think it's desirable for D not to fall _behind_ GCC's 
non-standard, powerful and ingenious tricks.


Optimisation possibilities: current, future and enhancements

2016-08-25 Thread Cecil Ward via Digitalmars-d
I'm wondering if there are more opportunities for increased 
optimisation in D compilers that have not been mined yet. I'm 
specifically interested in the possibilities of D over and above 
what is possible in C and C++, because of the characteristics of 
D or because of our freedom to change compared with the inertia 
in the C/C++ world.


A useful phrase I saw today: “declaration of intent given by the 
programmer to the compiler”.


As well as the opportunities that exist in D as it stands and as 
it is _used_ currently, I wonder what could be achieved by 
enhancing the language, or its usage patterns, with new 
semantically restrictive markings. These could be implemented 
with varying degrees of disruptiveness (from zero, up to idiom 
bans or even minor grammar changes [gulp!]). New attributes or 
property markings that fall into this category have already been 
added during the history of D, I believe. I'm also thinking of 
things like pragmas, functions with magic names and so forth.


Examples from the wider world, for discussion; no guarantees 
that they allow increased optimisation:


* In C, the “restrict” marker
* In C++, the mechanism that makes optimised move-constructors 
possible and offers a solution to temporary-object hell
* assert()’s possibilities, but only if it is native and not 
deleted too early in the compiler stack: guarantees of the truth 
of predicates and restriction of values to known ranges, just as 
compilers can exploit the prior truth of if-statements; the same 
for static assert, of course (see the sketch after this list)
* Contracts, invariants, pre- and postconditions have many, many 
delicious possibilities. Ideally, they need to be published at 
two extra levels: within the same module, and globally, so that 
callers, even ones in other translation units that have only the 
prototype, can have a richer spec to guide inlining with truly 
first-class optimisation

* Custom calling conventions
* Some C compilers have magic to allow the declaration of an 
ISR. Stretching the category of optimisation a bit perhaps, but 
it certainly does aid optimisation in the widest sense, because 
you don't then have to have unnecessary extra function calls 
just to bolt assembler onto a routine in C

* Similarly, inter-language calling mechanisms in general
* GCC and LDC’s extended asm interfacing specs, constraints and 
other magic
* Non-returning-function marking, first in GCC and then in C 
itself (IIRC; and in C++?)
* the GCC "__builtin_expect()" / "likely()" and "unlikely()" 
magic marker functions to aid branch prediction, code layout, etc.

* GCC’s “builtin_assume_aligned()” function
* The GCC -ffast-math switch allows (if I understand correctly) 
the compiler to assume there are no IEEE floating-point 
weirdnesses such as infinities, NaNs or denormals anywhere, or 
to assume that the programmer just doesn't care. What if there 
were a mechanism giving that kind of control at a much more 
fine-grained level? (Such as 
per-function/operator/block/expression/object/variable.)
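
A minimal sketch of the assert idea flagged above (hypothetical 
example; compilers are not promised to do this today): the 
asserted predicate restricts the index to a known range, so in 
principle the optimiser could elide the bounds check.

int lookup(ref const(int)[16] table, size_t i)
{
    assert(i < 16);  // declaration of intent: i is in [0, 16)
    return table[i]; // bounds check redundant given the assert
}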


Sorry it's such a paltry list. However, discussion will, I'm 
sure, expand it.