Re: Quilted Dies of CPU & RAM

Kragen Javier Sitaker Fri, 24 Nov 2006 16:34:21 -0800

On Tue, 14 Nov 2006 17:33:09 +0100, Dave Long wrote:
> >> In an industry that only clocks progress on linear semi-log plots
> >> true innovation has become rare -- it's far too risky at the high
> >> end.
> >
> > Right --- it's the anti-Innovator's-Dilemma.  As long as focusing on
> > making stuff smaller can get you 50+% improvements every year, you'll
> > beat all the Teras and Symbolicses just by doing that.
> 
> In DSP-land (the last refuge of VLIW?) strange innovations can find a  
> home.
> 
> A couple of years ago* I'd wondered about gray-coded PC increments.   
> Well, the DSP people haven't gone that far, but they do offer the  
> option of bit-reversed autoincrementing index registers.  By counting  
> left-to-right (x000, x100, x010, x110, x001, x101, x011, x111, x000...)  
> one can get various power-of-two circular buffers "for free".
> 
> That may seem like a lot of trouble to save a mask op, but there's a  
> good market reason to provide the feature: bit reversal simplifies  
> twiddling FFT butterflies.


Interesting.  Ward Cunningham is currently using bit-reversed counters
in his Bynase code for PDM [0] --- he's doing the equivalent of a
fixed dither pattern in the time domain rather than just doing the
delta-sigma thing.

My current best approaches to this in software are

  unsigned char spec_binc(unsigned char acc) {
    unsigned char mask;
    for (mask = 0x80; mask & acc; mask >>= 1) 
      acc ^= mask; 
    return mask | acc;
  }

which is 5 AVR instructions in the inner loop, so probably 6 cycles;
12 instructions in all; average case should be around 18 cycles, worst
case around 50; and

  unsigned char table16[] = { 0x08, 0x09, 0x0a, 0x0b,  0x0c, 0x0d, 0x0e, 0x0f,
                              0x04, 0x05, 0x06, 0x07,  0x02, 0x03, 0x01, 0x00 };

  unsigned char table16_reverse_counter(unsigned char input) {
    unsigned char lefthalf = input >> 4;
    unsigned char nextleft = table16[lefthalf];
    unsigned char righthalf = input & 0x0f;
    unsigned char nextright = (nextleft ? righthalf : table16[righthalf]);
    return ((unsigned char)(nextleft << 4) | nextright);
  }

which I think can be better both in the average case and the worst
case if you avoid doing address arithmetic by aligning table16 on a
256-byte boundary, but is better only in the worst case as output by
gcc.

Again we get back to the missing perfect-shuffle instruction [1]
--- reversing the bits in a byte requires only three perfect
in-shuffles:

  abcd EFGH
  EaFb GcHd
  GEca HFdb
  HGFE dcba

[0] http://c2.com/cybords/wiki.cgi?BynaseProtocol

[1] http://lists.canonical.org/pipermail/kragen-tol/2006-February/000814.html

Re: Quilted Dies of CPU & RAM

Reply via email to