On Saturday, 9 August 2014 at 09:34:38 UTC, Mehrdad wrote:
On Friday, 8 August 2014 at 22:43:38 UTC, Andrei Alexandrescu
wrote:
Alignment is often not an issue - you handle the
setup/teardown misalignments separately and do the bulk 64
bits at a time.
What kind of performance are you looking for? I have some very
basic bit-manipulation code written in C++ that operates on
whole words at a time, not sure if it's what you need but if it
is then it should be trivial to port this to D:
Glancing at it, it looks like it would probably do what I'd
want, although I'd have to study it more closely while converting it.
But these would probably only pay off once you have at least
a certain number of bits to make it worth it; as a guess,
8 bits or more.
Most of the speedup would probably be seen while doing matrix
operations. Still, if they don't align perfectly, the penalty based
on that C++ algo is probably 8x-64x slower (mostly due to
modulus/division; if it's a power of 2 then it drops to 4x-8x
slower), but it would be considerably faster than doing each bit
individually.
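
To make the "handle the misaligned ends separately, do the bulk 64 bits at a time" idea concrete, here is a minimal C++ sketch. It is not the code linked above; `countBits` and its layout (bit i stored in `words[i / 64]` at position `i % 64`) are assumptions made up for illustration. Because 64 is a power of 2, the division and modulus reduce to a shift and a mask.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: count the set bits in the half-open bit range
// [first, last) of a packed array of 64-bit words.
// Assumed layout: bit i lives in words[i / 64] at position i % 64.
std::size_t countBits(const std::vector<std::uint64_t>& words,
                      std::size_t first, std::size_t last)
{
    if (first >= last) return 0;

    const std::size_t firstWord = first >> 6;       // first / 64
    const std::size_t lastWord  = (last - 1) >> 6;  // word holding the final bit

    // Masks that keep only the in-range bits of the partial end words.
    const std::uint64_t headMask = ~std::uint64_t{0} << (first & 63);
    const std::uint64_t tailMask = ~std::uint64_t{0} >> (63 - ((last - 1) & 63));

    auto pop = [](std::uint64_t w) { return std::bitset<64>(w).count(); };

    // The whole range sits inside a single word: apply both masks at once.
    if (firstWord == lastWord)
        return pop(words[firstWord] & headMask & tailMask);

    std::size_t total = pop(words[firstWord] & headMask);  // setup (head)
    for (std::size_t i = firstWord + 1; i < lastWord; ++i) // bulk, 64 bits each
        total += pop(words[i]);
    total += pop(words[lastWord] & tailMask);              // teardown (tail)
    return total;
}
```

This only covers the case where a single range is read. Operations that combine two ranges with different bit offsets would also need to shift and stitch adjacent words, which is presumably where the misalignment penalty discussed above comes from.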