On Tue, 14 Nov 2006 17:33:09 +0100, Dave Long wrote:
> >> In an industry that only clocks progress on linear semi-log plots
> >> true innovation has become rare -- it's far too risky at the high
> >> end.
> >
> > Right --- it's the anti-Innovator's-Dilemma. As long as focusing on
> > making stuff smaller can get you 50+% improvements every year, you'll
> > beat all the Teras and Symbolicses just by doing that.
>
> In DSP-land (the last refuge of VLIW?) strange innovations can find a
> home.
>
> A couple of years ago* I'd wondered about gray-coded PC increments.
> Well, the DSP people haven't gone that far, but they do offer the
> option of bit-reversed autoincrementing index registers. By counting
> left-to-right (x000, x100, x010, x110, x001, x101, x011, x111, x000...)
> one can get various power-of-two circular buffers "for free".
>
> That may seem like a lot of trouble to save a mask op, but there's a
> good market reason to provide the feature: bit reversal simplifies
> twiddling FFT butterflies.
Interesting. Ward Cunningham is currently using bit-reversed counters
in his Bynase code for PDM [0] --- he's doing the equivalent of a
fixed dither pattern in the time domain rather than just doing the
delta-sigma thing.
My current best approaches to this in software are
unsigned char spec_binc(unsigned char acc) {
unsigned char mask;
for (mask = 0x80; mask & acc; mask >>= 1)
acc ^= mask;
return mask | acc;
}
which is 5 AVR instructions in the inner loop, so probably 6 cycles;
12 instructions in all; average case should be around 18 cycles, worst
case around 50; and
unsigned char table16[] = { 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x04, 0x05, 0x06, 0x07, 0x02, 0x03, 0x01, 0x00 };
unsigned char table16_reverse_counter(unsigned char input) {
unsigned char lefthalf = input >> 4;
unsigned char nextleft = table16[lefthalf];
unsigned char righthalf = input & 0x0f;
unsigned char nextright = (nextleft ? righthalf : table16[righthalf]);
return ((unsigned char)(nextleft << 4) | nextright);
}
which I think can be better both in the average case and the worst
case if you avoid doing address arithmetic by aligning table16 on a
256-byte boundary, but is better only in the worst case as output by
gcc.
Again we get back to the missing perfect-shuffle instruction [1]
--- reversing the bits in a byte requires only three perfect
in-shuffles:
abcd EFGH
EaFb GcHd
GEca HFdb
HGFE dcba
[0] http://c2.com/cybords/wiki.cgi?BynaseProtocol
[1] http://lists.canonical.org/pipermail/kragen-tol/2006-February/000814.html