Here we go:
The loop version (size = 0x2c bytes, 8 passes thru loop, 2 memory write, 3
memory reads)
10071E02: 4E560000 link a6,#0
10071E06: 2F05 move.l d5,-(sp)
10071E08: 1A2E0008 move.b 8(a6),d5
10071E0C: 7200 moveq #0,d1
10071E0E: 7400 moveq #0,d2
10071E10: 600E bra.s *+16 ; 0x10071e20
10071E12: D201 add.b d1,d1
10071E14: 1005 move.b d5,d0
10071E16: 02000001 andi.b #0x1,d0
10071E1A: 8200 or.b d0,d1
10071E1C: E20D lsr.b #1,d5
10071E1E: 5242 addq.w #1,d2
10071E20: 0C420008 cmpi.w #8,d2
10071E24: 6DEC blt.s *-18 ; 0x10071e12
10071E26: 1001 move.b d1,d0
10071E28: 2A1F move.l (sp)+,d5
10071E2A: 4E5E unlk a6
10071E2C: 4E75 rts
The lookup version (size = 0x2e bytes, plus 0x10 for static data, no loop, 1
memory write, 6 memory reads).
10071E40: 4E560000 link a6,#0
10071E44: 102E0008 move.b 8(a6),d0
10071E48: 024000F0 andi.w #0xf0,d0
10071E4C: E840 asr.w #4,d0
10071E4E: 41EDE8F2 lea -5902(a5),a0
10071E52: 7200 moveq #0,d1
10071E54: 12300000 move.b (a0,d0.w),d1
10071E58: E841 asr.w #4,d1
10071E5A: 102E0008 move.b 8(a6),d0
10071E5E: 0240000F andi.w #0xf,d0
10071E62: 7400 moveq #0,d2
10071E64: 14300000 move.b (a0,d0.w),d2
10071E68: D441 add.w d1,d2
10071E6A: 1002 move.b d2,d0
10071E6C: 4E5E unlk a6
10071E6E: 4E75 rts
The stats:
codesize #OpcodesRun #datareads #datawrites
loop 0x2c 74 3 2
lookup 0x2e 16 6 1
Code size is virtually identical, so we have to decide based on speed. I
guess it depends how many cycles per instruction, and how fast the memory
system is. Since its the 68k (CISC), lets say 2 cycles/instruction, and
since its a Palm, lets say a slow memory system (5 cycles/references). Of
course, the best thing to do would be to just measure how long it took to
run each one a million times...
Using the guestimates, we then have:
loop: 74*2 + 5*5 = 173 cycles
lookup: 16*2 + 7*5 = 68 cycles
I like the lookup version!
-- bob
-----Original Message-----
From: Fitzpatrick, Joe [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 12, 2000 1:37 PM
To: Palm Developer Forum
Subject: RE: Byte order
Usually a lookup table is by far the fastest algorithm. But, on the 68K a
loop of shifts and ors might be as fast. It might be interesting to compare
the code that CW generates for your function with the one below.
-jjf
Byte SwapByte(Byte oldbyte)
{
int i;
Byte newbyte = 0;
for (i=0; i<8 ; i++)
{
newbyte <<= 1;
newbyte |= (oldbyte & 1);
oldbyte >>= 1;
}
return newbyte;
}
-----Original Message-----
From: Robert Herold [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 12, 2000 1:10 PM
To: Palm Developer Forum
Subject: RE: Byte order
Or how about this:
Byte swap_byte(Byte b)
{
static Byte swapped_nibble[16] = {
0x00, 0x80, 0x40, 0xC0,
0x20, 0xA0, 0x60, 0xe0,
0x10, 0x90, 0x50, 0xd0,
0x30, 0xb0, 0x70, 0xf0
};
return swapped_nibble[b & 0x0f] + (swapped_nibble[(b & 0xf0) >> 4] >>
4);
}
-- bob
====================
Robert Herold (650) 627-3302
Webvan Group, Inc (650) 627-3948 (fax)
310 Lakeside Drive [EMAIL PROTECTED]
Foster City, CA 94404 www.webvan.com
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Friday, May 12, 2000 12:00 PM
To: Palm Developer Forum
Subject: Re: Byte order
This function seems like its doing a lot of work, what about something like
this instead:
Byte swap_byte(Byte byOld)
{
Byte byNew = 0;
if (byOld & 0x01) { byNew |= 0x80; }
if (byOld & 0x02) { byNew |= 0x40; }
if (byOld & 0x04) { byNew |= 0x20; }
if (byOld & 0x08) { byNew |= 0x10; }
if (byOld & 0x10) { byNew |= 0x08; }
if (byOld & 0x20) { byNew |= 0x04; }
if (byOld & 0x40) { byNew |= 0x02; }
if (byOld & 0x80) { byNew |= 0x01; }
return byNew;
}
In a message dated 5/12/00 12:32:49 PM Eastern Daylight Time,
[EMAIL PROTECTED] writes:
<< Byte swap_byte(Byte old_byte){
Byte new_byte;
Byte working_byte[8];
int i;
new_byte=0x00; // zero out new_byte
for(i=0;i<8;i++)
working_byte[i]=0x00; // zero out working bytes
working_byte[0]=(old_byte << 7); // bit 1
working_byte[1]=(old_byte >> 1); // bit 2
working_byte[1]=(working_byte[1] << 7);
working_byte[1]=(working_byte[1] >> 1);
working_byte[2]=(old_byte >> 2); // bit 3
working_byte[2]=(working_byte[2] << 7);
working_byte[2]=(working_byte[2] >> 2);
working_byte[3]=(old_byte >> 3); // bit 4
working_byte[3]=(working_byte[3] << 7);
working_byte[3]=(working_byte[3] >> 3);
working_byte[4]=(old_byte >> 4); // bit 5
working_byte[4]=(working_byte[4] << 7);
working_byte[4]=(working_byte[4] >> 4);
working_byte[5]=(old_byte >> 5); // bit 6
working_byte[5]=(working_byte[5] << 7);
working_byte[5]=(working_byte[5] >> 5);
working_byte[6]=(old_byte >> 6); // bit 7
working_byte[6]=(working_byte[6] << 7);
working_byte[6]=(working_byte[6] >> 6);
working_byte[7]=(old_byte >> 7); // bit 8
for(i=0;i<8;i++){
new_byte=(new_byte | working_byte[i]);
} // end for loop
return(new_byte);
} // end function >>
--
For information on using the Palm Developer Forums, or to unsubscribe,
please see http://www.palmos.com/dev/tech/support/forums/
--
For information on using the Palm Developer Forums, or to unsubscribe,
please see http://www.palmos.com/dev/tech/support/forums/
--
For information on using the Palm Developer Forums, or to unsubscribe,
please see http://www.palmos.com/dev/tech/support/forums/
--
For information on using the Palm Developer Forums, or to unsubscribe, please see
http://www.palmos.com/dev/tech/support/forums/