Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-03 Thread Dennis Ferguson

On 3 Feb, 2012, at 14:15 , Orin Eman wrote:

 On Thu, Feb 2, 2012 at 7:16 PM, Hal Murray hmur...@megapathdsl.net wrote:
 
 
 It's possible to use Bresenham with two integers 10,000,000 and 32,768
 but I found no way to perform all the 24-bit calculations on an 8-bit PIC
 quick enough. Removing the GCD often helps but in this case the
 accumulator remains 3-bytes wide.
 
 To generate 32 kHz you have to toggle a pin and calculate if the next
 toggle must be 38 or 39 instructions in the future; all the math must
 occur within 37 instructions. That's why I came up with the binary leap
 year kind of algorithm; it's as close to math-less as you can get.
 
 You missed the simple way.  Table lookup.  :)
 
 The table is only 256 slots long.
 
 That's toggling between 305 and 306 cycles.  If your CPU uses N clocks per
 instruction, multiply the table size by N.
 
 
 
 
 Well, I thought table lookup too, but I figured  a 2048 x 1 table.  Easily
 done with a rotating bit and 256 byte table.
 
 
 Assuming clocking a PIC at 10MHz, you have 2,500,000 instructions per
 second.  Since there was talk about time to the next toggle, we have
 2,500,000/65536 instructions between toggles, ie 38.1470... instructions.
 The fraction turns out to be 301/2048, so you have to distribute 301 extra
 instructions over every 2048 half-periods of the 32768Hz waveform.

I only barely know the instruction set on those processors, but it seems
like it should be way easier than that.  You know it is going to be 38 or
39 instructions, so the only question is when it should be 39.  The value
of 2,500,000/65536 is 38.1470… in decimal, but in hex it is exactly 26.25a;
that is, the 0x26 is 38 decimal while the fractional part is only 11 bits
long.  This means you should be able to compute when the extra cycle is
required by keeping a 16 bit accumulator to which the fractional part
0x25a0 is added at every change and executing the extra instruction when
there is a carry out of that. That seems straightforward.  If `lo' and `hi'
are the two halves of the accumulator then the working part of this becomes
something like (excusing my PIC assembler, which I mostly forget):

movl    0xa0,w  // low byte of increment into w
add     w,lo    // add w to lo, may set carry
movl    0x25,w  // high byte of increment into w
btfsc   3,0     // skip next if carry clear
add     one,w   // increment w by one; I'm not sure how to do that
add     w,hi    // add w to hi, may set carry
// if carry set here need extra instruction.  Maybe this does it?
btfss   3,0     // skip if carry set
goto    blorp   // carry clear, don't execute next instruction
nop             // the extra instruction
blorp:
// enough instructions more to make 38/39

Maybe someone who knows what they're doing can interpret that?
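
For anyone who wants to check the arithmetic before touching a PIC, here is a
minimal Python sketch of the same 16-bit fractional accumulator.  The names
(FRAC, BASE, toggle_intervals) are made up for illustration; only the
constants 0x25a0 and 38 come from the discussion above.

```python
FRAC = 0x25A0   # fractional part of 2,500,000/65536, scaled to 16 bits
BASE = 38       # whole instructions between toggles


def toggle_intervals(count, acc=0):
    """Return the instruction count for each of `count` toggles.

    A carry out of the 16-bit accumulator means this interval needs
    one extra instruction (39 instead of 38).
    """
    intervals = []
    for _ in range(count):
        acc += FRAC
        carry = acc >> 16      # 1 when the 16-bit add overflowed
        acc &= 0xFFFF          # keep only the low 16 bits
        intervals.append(BASE + carry)
    return intervals, acc


intervals, final_acc = toggle_intervals(2048)
print(sum(intervals), intervals.count(39), final_acc)
```

Over 2048 toggles the accumulator should come back to zero, so the pattern of
38s and 39s repeats with no long-term error.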

Dennis Ferguson
___
time-nuts mailing list -- time-nuts@febo.com
To unsubscribe, go to https://www.febo.com/cgi-bin/mailman/listinfo/time-nuts
and follow the instructions there.


Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-03 Thread Orin Eman
On Fri, Feb 3, 2012 at 1:08 AM, Dennis Ferguson dennis.c.fergu...@gmail.com
 wrote:


 Maybe someone who knows what they're doing can interpret that?



Here you go:

 movlw  0xa0
 addwf  lo,f
 movlw  0x25
 btfsc  status,C
 addlw  1           ; or movlw 0x26
 addwf  hi,f
 btfss  status,C
 goto   skip
 nop
 nop
skip:


Takes 9 or 10 instruction cycles.  You needed an extra nop as the goto
takes two cycles, one cycle if skipped.  I'd just reverse the sense as
follows:

btfsc  status,C
goto   skip
skip:

If carry is clear, skipping the goto takes one instruction cycle.  If carry
is set, executing the goto takes two instruction cycles.  Yes, the goto is
to the next inline instruction, but as far as I know, the PIC doesn't
optimise that.  The entire sequence in this case would take 8 or 9
instruction cycles.
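
The cycle counts can be tallied with a small Python sketch.  It assumes the
usual mid-range PIC timing: one instruction cycle per instruction, two for a
goto, and two for a btfsc/btfss that skips.  The function names are invented
for this note.

```python
def cycles_two_nops(carry_hi):
    """First version: btfss / goto skip / nop / nop."""
    total = 1 + 1 + 1      # movlw, addwf lo, movlw
    total += 2             # btfsc + addlw: 2 cycles whether or not addlw is skipped
    total += 1             # addwf hi
    if carry_hi:
        total += 2 + 1 + 1 # btfss skips the goto (2 cycles), then two nops
    else:
        total += 1 + 2     # btfss falls through (1), goto taken (2)
    return total


def cycles_reversed(carry_hi):
    """Reversed sense: btfsc / goto skip, with skip: on the next line."""
    total = 1 + 1 + 1 + 2 + 1   # same first six instructions as above
    if carry_hi:
        total += 1 + 2     # btfsc falls through (1), goto taken (2)
    else:
        total += 2         # btfsc skips the goto (2 cycles)
    return total


print(cycles_two_nops(False), cycles_two_nops(True))
print(cycles_reversed(False), cycles_reversed(True))
```

Either way the carry case costs exactly one cycle more than the no-carry
case, which is all the 38/39 scheme needs.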

I'd forgotten the trick of using binary fixed point arithmetic with the
fractional part strategically placed at a byte boundary!  I've used a
similar trick in the past to convert a binary fraction into decimal digits
- keep multiplying by 10 (a couple of shifts plus an add), pick up the
non-fractional bits as the next decimal digit, then truncate to the
remaining fractional bits.
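
A minimal Python sketch of that digit-extraction trick, using the 16-bit
fraction 0x25a0 from this thread as the example input.  The function name and
parameters are made up for illustration.

```python
def fraction_digits(frac, bits, ndigits):
    """Convert a binary fraction (frac / 2**bits) to decimal digits."""
    mask = (1 << bits) - 1
    digits = []
    for _ in range(ndigits):
        frac *= 10                    # on a PIC: (frac << 3) + (frac << 1)
        digits.append(frac >> bits)   # integer part is the next decimal digit
        frac &= mask                  # keep only the fractional bits
    return digits


digits = fraction_digits(0x25A0, 16, 11)
print("0." + "".join(str(d) for d in digits))
```

For 0x25a0 this gives the digits of 301/2048, i.e. the .1470... part of
38.1470...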

Orin.


Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-03 Thread Hal Murray

t...@leapsecond.com said:
 I'm curious how a 10 MHz-driven high-end DDS would generate 32 kHz with the
 lowest possible jitter?

What do you mean by high-end DDS?  A chip from Analog Devices or one from 
Xilinx? :)

If you use a classic DDS chip with a binary adder and ROM, it will have low 
jitter but the frequency will be off a tiny bit.  My calculations...

24 bits:
  54975 = 32767.653465
  54976 = 32768.249511

32 bits:
  14073748 = 32767.998054
  14073749 = 32768.000382
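
Those numbers can be reproduced with a few lines of Python.  This is only a
sketch of the arithmetic for a plain binary accumulator; tuning_words is a
made-up helper name, not anything from a DDS vendor.

```python
def tuning_words(f_out, f_clk, bits):
    """Return the two nearest tuning words and the frequency each gives."""
    below = (f_out << bits) // f_clk          # largest word not exceeding f_out
    return [(w, w * f_clk / (1 << bits)) for w in (below, below + 1)]


for bits in (24, 32):
    print("%d bits:" % bits)
    for word, freq in tuning_words(32768, 10_000_000, bits):
        print("  %d = %.6f" % (word, freq))
```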


If you use a FPGA, it's something like this:
  X = X + 32768
  If X >= 10000000
    output clock pulse
    X = X - 10000000

You get the subtract for free if you are using decimal arithmetic and ignore 
the overflow.  You can do it with binary addition if you pipeline things 
right and put a mux in front of the adder.  It either adds 32K or adds 
(32K-10M).

That makes a 1 clock wide clock pulse, 100 ns at 10 MHz.  If you want a 
square clock, make a 2X clock and toggle a FF to divide by 2.

It will have the right long term frequency, or at least as good as the input 
clock.

It will have lots of jitter.  It's not Gaussian type jitter but spurs.  
Peak-to-peak will be roughly one clock period.

--

I don't know how to compute the jitter on traditional (binary, ROM) DDS 
chips.  Peak to peak will also be roughly one clock period in the raw output, 
but the output is close to a sine wave so some filtering would easily reduce 
the jitter.

I'm pretty sure they have spurs and that they are smaller and farther out if 
the ROM is wider and deeper.

Both DDS chips and FPGAs usually contain a PLL/DLL to get a faster internal 
clock rate.  That would reduce jitter.

I don't know enough matlab to be able to simulate this.  Another time sink when 
I get the time.


-- 
These are my opinions, not necessarily my employer's.  I hate spam.






[time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Brooke Clarke

Hi Roberto:

By changing the timer count dynamically it's possible to lower the jitter to 
one timer count.  See:
http://www.prc68.com/I/PClock.shtml#BA

Have Fun,

Brooke Clarke
http://www.PRC68.com
http://www.end2partygovernment.com/Brooke4Congress.html


Roberto Barrios wrote:

Hi Tom,

I'm interested in that divider. Actually, interested in knowing how it works, 
not in the .HEX file.

Bresenham's algorithm works but has inherent jitter and I've found no other 
solutions for situations like that.

I'd love to know how it is done.

Thank you,
Roberto EB4EQA
http://www.rbarrios.com


-Original Message- From: Tom Van Baak
Sent: Thursday, February 02, 2012 10:34 AM
To: Discussion of precise time and frequency measurement
Subject: Re: [time-nuts] ANFSCD - Synchronizing time in home video recorders


I think I've seen comments about making 32 KHz from 10 MHz in a PIC or AVR.

tvb has this web page, but I don't see a 32 KHz option:
 http://www.leapsecond.com/pic/picdiv.htm


Hal,

Yes, I have a PIC divider that takes 5 or 10 MHz input and
outputs a 32.768 kHz square wave with minimal jitter and
no long-term phase offset. Contact me off-line if interested.

/tvb




Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Tom Van Baak

Hi Roberto:

By changing the timer count dynamically it's possible to lower the jitter to 
one timer count.  See:
http://www.prc68.com/I/PClock.shtml#BA

Have Fun,

Brooke Clarke


Hi Brooke,

You're a fellow PIC guy; let me explain.

Correct, that method works with a modest interrupt rate to count
integer seconds without long-term rounding error; but to generate
a total of 32,768 as-consistent-as-possible pulses *per* second
is quite different.

It's possible to use Bresenham with two integers 10,000,000 and
32,768 but I found no way to perform all the 24-bit calculations
on an 8-bit PIC quick enough. Removing the GCD often helps
but in this case the accumulator remains 3-bytes wide.

To generate 32 kHz you have to toggle a pin and calculate if
the next toggle must be 38 or 39 instructions in the future; all
the math must occur within 37 instructions. That's why I came
up with the binary leap year kind of algorithm; it's as close to
math-less as you can get.

By comparison, all the decimal dividers (1 Hz, 10 Hz, etc.) that
you and I do are trivial because of the common factors with the
10 MHz clock. It's just that 32,768 has no factors of 5. Read the
comments in the file 10m32k.c for more details.

I'm curious how a 10 MHz-driven high-end DDS would generate
32 kHz with the lowest possible jitter?

/tvb




Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Chris Albertson
On Thu, Feb 2, 2012 at 12:21 PM, Tom Van Baak t...@leapsecond.com wrote:

 I'm curious how a 10 MHz-driven high-end DDS would generate
 32 kHz with the lowest possible jitter?

I wonder if your 32K divider could be improved if it used interpolation.
In other words, use an analog output.  So at each cycle you decide
what value to put out: either one or zero or some voltage between.

The next question is why use a PIC divider?  Why not a DDS?  For a
low-end DDS the cost is not much different.  Maybe $1 vs. $10 or about
that. (don't say 10X, say $9 more)

The DDS does about the same thing as a PIC except that at each cycle
it picks an entry from a sine wave table.  I don't know if they
interpolate or just use the nearest value.  Your algorithm in the
PIC, I think, is the same as that but you use the nearest value in your
square wave lookup table.  Try interpolating and filtering.  This
can move the zero crossing to someplace unrelated to the 10 MHz
reference.


Chris Albertson
Redondo Beach, California



Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Azelio Boriani
Now I'm thinking that starting with a first run of 8 cycles at 500 ns + 2
cycles at 400 ns, repeated 10 times, and then inserting 2 cycles of
400 ns, a first approximation of my 2.048 MHz can be done. Maybe with a
deltaF/F of 10^-4 at tau = 1 second, but it can be done. In the very long
run the count will be correct and the accuracy gets better tau after tau.
Of course there is the source oscillator's limit.



Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Brooke Clarke

Hi Tom:

I like the leap year idea.  Does this fit into one of the 8-pin PICs?

Have Fun,

Brooke Clarke
http://www.PRC68.com
http://www.end2partygovernment.com/Brooke4Congress.html




Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Hal Murray

 It's possible to use Bresenham with two integers 10,000,000 and 32,768 but I
 found no way to perform all the 24-bit calculations on an 8-bit PIC quick
 enough. Removing the GCD often helps but in this case the accumulator
 remains 3-bytes wide.

 To generate 32 kHz you have to toggle a pin and calculate if the next toggle
 must be 38 or 39 instructions in the future; all the math must occur within
 37 instructions. That's why I came up with the binary leap year kind of
 algorithm; it's as close to math-less as you can get. 

You missed the simple way.  Table lookup.  :)

The table is only 256 slots long.

That's toggling between 305 and 306 cycles.  If your CPU uses N clocks per 
instruction, multiply the table size by N.

In hindsight, I'm embarrassed that I didn't see this much sooner.

10,000,000 is 10^7 or 2^7 * 5^7.

32,768 is 2^15.  So we need a factor of 2^8 to get back to where we started.

-

My early introduction to the advantages of table lookup was using Fortran on 
an IBM 7094.  How do you calculate factorials quickly?  The table is only 
30-50 slots.  Anything bigger generates a floating point overflow.

-

Here is my hack python code that I had to write to see what's going on:

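#!/usr/bin/python3

# Given 10 MHz, target is 32 KHz
# What sequence of DDS steps is required?
# How long is the sequence before it repeats.

import sys

Target = 32768
Input = 10000000

table = {}    # remainder after each output pulse -> pulse number

X = 0         # accumulator
K = 0         # output pulse counter
oldI = 0      # input clock count at the previous output pulse

for I in range(0, Input):
  X += Target
  if X >= Input:
    X -= Input
    # pulse number, input clock, remainder, clocks since last pulse
    print("%5d %7d %5d  %3d" % (K, I, X, I-oldI))
    if X in table:
      # same remainder seen before: the sequence repeats from here
      print("Found in table:", X)
      sys.exit(0)
    table[X] = K
    K += 1
    oldI = I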



-- 
These are my opinions, not necessarily my employer's.  I hate spam.






Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Dave Martindale
On Thu, Feb 2, 2012 at 12:21, Tom Van Baak t...@leapsecond.com wrote:

 It's possible to use Bresenham with two integers 10,000,000 and
 32,768 but I found no way to perform all the 24-bit calculations
 on an 8-bit PIC quick enough. Removing the GCD often helps
 but in this case the accumulator remains 3-bytes wide.

In this particular case, the divisor you want is 2^15 / 10^7.  You
can remove a common factor of 2^7, giving 2^8 / 5^7, or 256 / 78125.

If you only want a square wave output, you should be able to do this
with a 17-bit binary counter and some logic.  In concept, it looks
something like:
- initialize register to 0
- every input clock, add 256 to the register
- when the register is greater than or equal to 78125, set overflow
bit and subtract 78125 from the register.

In practice, you'd probably set the register to 78125 and count down
to zero, using the borrow output from the subtract of 256 as
overflow.  Then you don't need to compare the register to 78125.
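
A minimal Python sketch of the count-up form of that divider, to check the
claims about the reduced ratio and the repeat length.  The helper names
(reduce_ratio, run_divider) are invented here, and the loop models the
concept, not any particular hardware.

```python
from math import gcd


def reduce_ratio(f_out, f_clk):
    """Remove the common factor from the output/input frequency ratio."""
    g = gcd(f_out, f_clk)
    return f_out // g, f_clk // g


def run_divider(step, modulus, clocks):
    """Add `step` every clock; record the clock index of each overflow."""
    reg = 0
    overflow_clocks = []
    for clk in range(clocks):
        reg += step
        if reg >= modulus:
            reg -= modulus
            overflow_clocks.append(clk)
    return overflow_clocks, reg


step, modulus = reduce_ratio(32768, 10_000_000)
overflows, final_reg = run_divider(step, modulus, modulus)
gaps = {b - a for a, b in zip(overflows, overflows[1:])}
print(step, modulus, len(overflows), final_reg, sorted(gaps))
```

One full pass of 78125 input clocks should produce 256 overflows and leave
the register at zero, with every gap between overflows being 305 or 306
input clocks.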

Essentially, you've built a special-purpose DDS whose frequency
resolution is 128 Hz , and the output frequency you want is exactly
256*128 Hz.  The average frequency is exact, and the output waveform
repeats every 1/128 sec.

 I'm curious how a 10 MHz-driven high-end DDS would generate
 32 kHz with the lowest possible jitter?

You should be able to use an AD9913 to do the same 256/78125 division
described above, with exact output frequency, and sine wave output to
boot.  If I've understood the datasheet correctly, you would program
the main DDS frequency tuning word to 14073748, which gets you as
close to 32768 Hz as possible without exceeding it.  Using variable
modulus mode, you program the FTW and modulus of the secondary DDS to
65276 and 78125.

Every input clock, the main FTW of 14073748 is added to the main
32-bit register.  At the same time, 65276 is added to the secondary
register.  If the secondary register reaches 78125 or more (which will happen
on most clocks with these values), the main register is incremented by
1 and the secondary register has 78125 subtracted.  So over the course
of 78125 input clocks (1/128 second), the secondary register has
65276*78125 counts total added, which causes it to overflow 65276
times.  The main register has 78125*14073748 added to it directly,
plus 65276 extra counts from the secondary register overflows.  The
sum of those two values is exactly 2^40, meaning the main register
overflows 2^8 times in 78125 clocks.

After 78125 input clocks, both the main and secondary register have
returned to zero, so the sequence repeats exactly every 1/128 second.
In effect, the secondary register is acting as a variable-modulus DDS
that changes the FTW of the primary fixed-modulus DDS by only one
count, just often enough to make the division ratio exact.  And
because the primary DDS is still fixed-modulus, you can still use the
top k bits of the accumulator to index into a sine lookup table, and
produce a sine wave output.
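
The bookkeeping above can be checked with a short Python model of the two
accumulators.  This is a sketch of the behaviour as described in this
message, under the assumption that the overflow test is "greater than or
equal"; it is not a model of the actual AD9913 registers, and all names are
made up.

```python
FTW = 14073748        # main tuning word
AUX_STEP = 65276      # secondary accumulator increment
AUX_MOD = 78125       # secondary accumulator modulus
MAIN_MOD = 1 << 32    # main accumulator is 32 bits wide


def simulate(clocks):
    """Return (main overflows, main register, secondary register)."""
    main = aux = wraps = 0
    for _ in range(clocks):
        aux += AUX_STEP
        carry = 0
        if aux >= AUX_MOD:        # secondary overflow bumps the main add by 1
            aux -= AUX_MOD
            carry = 1
        main += FTW + carry
        if main >= MAIN_MOD:      # main overflow is one output cycle
            main -= MAIN_MOD
            wraps += 1
    return wraps, main, aux


print(FTW * AUX_MOD + AUX_STEP == 2 ** 40)
print(simulate(78125))
```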

 Dave



Re: [time-nuts] 32768 Hz from 10 MHz

2012-02-02 Thread Orin Eman
On Thu, Feb 2, 2012 at 7:16 PM, Hal Murray hmur...@megapathdsl.net wrote:


  It's possible to use Bresenham with two integers 10,000,000 and 32,768
  but I found no way to perform all the 24-bit calculations on an 8-bit PIC
  quick enough. Removing the GCD often helps but in this case the
  accumulator remains 3-bytes wide.

  To generate 32 kHz you have to toggle a pin and calculate if the next
  toggle must be 38 or 39 instructions in the future; all the math must
  occur within 37 instructions. That's why I came up with the binary leap
  year kind of algorithm; it's as close to math-less as you can get.

 You missed the simple way.  Table lookup.  :)

 The table is only 256 slots long.

 That's toggling between 305 and 306 cycles.  If your CPU uses N clocks per
 instruction, multiply the table size by N.




Well, I thought table lookup too, but I figured  a 2048 x 1 table.  Easily
done with a rotating bit and 256 byte table.


Assuming clocking a PIC at 10MHz, you have 2,500,000 instructions per
second.  Since there was talk about time to the next toggle, we have
2,500,000/65536 instructions between toggles, ie 38.1470... instructions.
 The fraction turns out to be 301/2048, so you have to distribute 301 extra
instructions over every 2048 half-periods of the 32768Hz waveform.

Here's what I would do in a mix of C and asm:

unsigned char bitmask = 0x80;
unsigned char index =  0xFF;
unsigned char table[256] = { /* Calculate using a spreadsheet or similar */ };
bit OutputBit;

asm {
loop:
    BCF    STATUS,C
    RLF    bitmask,F
    BTFSS  STATUS,C
    GOTO   IndexOK
    RLF    bitmask,F    ; restore low bit from carry
    INCF   index,F      ; on to the next byte in the table (stored in index)
    GOTO   DoLookup
IndexOK:
    NOP                 ; equalize time in if/else cases
    NOP
    NOP
DoLookup:
    MOVF   index,W
    CALL   TableLookup  ; Not defined here, returns value in W

    ANDWF  bitmask,W
    BTFSS  STATUS,Z
    GOTO   ExtraCycle   ; 1 cycle if skipped, 2 if executed
ExtraCycle:
}
// Extra delay to get to 38/39 instructions (about 20 instructions if I
// counted right)

OutputBit ^= 1;  // Toggle output
goto loop;

This version rotates the mask each time through and increments the index
every 8 times through.  You could increment the index each time through and
rotate the mask when the index rolls over.  That makes calculating the
table harder though.
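
Instead of a spreadsheet, the table could be generated with a few lines of
Python.  This is a sketch under two assumptions of mine: the 301 extra
instructions are spread as evenly as possible over the 2048 half-periods, and
the bits are packed low bit first, which matches a mask that starts at 0x01
and rotates left.  build_table is a made-up name.

```python
def build_table(extra=301, length=2048):
    """Return length/8 bytes with `extra` one-bits spread evenly."""
    table = bytearray(length // 8)
    for i in range(length):
        # A bit is set wherever the running total i*extra/length steps up.
        if (i + 1) * extra // length != i * extra // length:
            table[i // 8] |= 1 << (i % 8)   # low bit first within each byte
    return bytes(table)


table = build_table()
ones = sum(bin(b).count("1") for b in table)
print(len(table), ones)
print(", ".join("0x%02X" % b for b in table[:8]))
```

A set bit means "this half-period takes 39 instructions", so the total number
of set bits must be 301.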

No doubt I got the sense of the skips wrong or miscounted instructions
somewhere!

Orin.