On 30-Jul-12 06:01, Andrei Alexandrescu wrote:
In fact memcpy could and should be replaced with word by word copy for
almost all of struct sizes up to ~32 bytes (as size is known in advance
for this particular function pointer i.e. handler!int).
In fact memcpy should be smart enough to do all that, but apparently it
doesn't.
I'd say array ops could and should do this (since compiler has all the
info at compile-time). On the other hand memcpy is just one tired C
function aimed at fast blitting of any memory chunks.
(Even just call/ret pair is too much of overhead in case of int).
I've found something far more evil:
[snip]
So to boot in order to get to int + int _detected_ it needs 4 calls to
fptr(OpID.testConversion, null, &info) == 0 and acquire TypeInfo each
time. Not counting 2 gets to obtain result. It's just madness.
Heh, this all gives me a new, humbling perspective on my earlier self
running my mouth about other designs :o). As always, history and context
makes one much more empathic.
Variant has a very old design and implementation. It is, in fact, the
very first design I've ever carried in D; I'd decided, if I can't do
that, D isn't powerful enough. The compiler was so buggy back then,
implementation needed to take many contortions, and it was a minor
miracle the whole house of cards stood together.
These were indeed very rough days. I think it's a minor miracle I
haven't gave up on D after seeing tons of bugs back in 2010.
The basic approach was to use a pointer to function (instead of an
integral) as a discriminator. The pointer to function was taken from the
instantiation of a template function. That means Variant could work with
any type, even types not yet defined. I'd made it a clear goal that I
must not enumerate all types in one place, or "register" types for use
with Variant - that would have been failure. Using a pointer to template
function naturally associated a distinct tag with each type,
automatically. I think it was a good idea then, and it's a good idea now.
It's indeed nice trick and design sounds great (no register function, yay!).
The only piece missing is that e.g. arithmetic operations are binary
thus requiring double-dispatch or something like
arithmeticHandler!(T1,T2) to archive optimal performance.
The virtual handcrafted C competition would do it via nested switches.
[snip]
We can use the pointer to function as a tag for improving performance on
primitive types.
Indeed we can though it won't be as fast as simple tag - pointers
have quite big and sparse addresses giving us the same as:
if(ptr == handler!int)
....
else if(ptr == handler!uint)
....
else
....
Yes, and what we gain is speed for select known types. As long as the
number of those is relatively small, using simple test and branch should
serve us well. And we get to keep the generality, too.
OK I'll try it for opArithmetic that should help a lot.
Another way to go is make 2-d function table for all built-ins and all
operations on them. It's in fact a bunch of tables:
Type1 x Type2 for all of important basic operations. Tables are static
so it won't be much of a problem. At least int + int then would be
1 memory read, 1 call, 2 (casted as needed) reads & 1 store.
I think that's not worth it, but hey, worth a try.
I'm thinking more of it and common types are just uint, int, long,
ulong, double, string. In fact most of e.g. int x string and such are
"throw me an exception" entries, they could be merged.
Then we could make 2-stage table to save some space, it sounds like
trie, hm....
--
Dmitry Olshansky