Jan Slodicka wrote:

> I recently compared the speed of an identical algorithm (loading a JPEG) on
> several platforms - PC (1.6 GHz), Win CE (about 400 MHz, I forget the
> device type now, but it should not matter) and Palm m515. The results really
> surprised me.
>
> The m515 ran the program (approx.) 1000x slower than the PC, whereas the
> processor speeds differ by a factor of only 50.
> The CE machine was only 2 times slower than the processor speed difference
> would suggest.
>
> In other words, after linear compensation for the different processor
> speeds, the CE machine seems to be 10 times "faster" than the Palm.
> For me this is an incredibly bad result for the Palm. Naturally I tried to
> find some excuses.
>
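
To spell out the arithmetic behind those numbers, here is a small sketch in C.
The function names are made up for illustration; the figures are simply the
ones quoted above.

```c
/* Ratio of the measured slowdown to the slowdown predicted by clock
 * speed alone. 1.0 means "exactly as slow as the clock ratio suggests". */
double excess_slowdown(double measured_slowdown, double clock_ratio)
{
    return measured_slowdown / clock_ratio;
}

/* How much better device A does than device B once clock speed has been
 * factored out. Both arguments are values from excess_slowdown(). */
double relative_efficiency(double excess_a, double excess_b)
{
    return excess_b / excess_a;
}
```

With the quoted figures, excess_slowdown(1000.0, 50.0) gives 20 for the m515,
the CE machine is stated to be at 2, and relative_efficiency(2.0, 20.0) gives
the factor of 10 mentioned in the quote.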

As far as I know, devices running WinCE have a RISC-like processor WITH an
FPU. The m515, with its Motorola 68K processor, has NO FPU - well, there is
software floating point support through some libraries ... but, ahemmm, no
comment.   :)
Furthermore, the 68K (and its derivatives, like the Dragonball in 'earlier'
Palm PDAs such as the m515) is a CISC processor with a clock frequency about
10 times lower than that of the devices running WinCE. Plus there is no cache
at all in the 68K (I don't know about the Dragonball 68K clone, but I'm quite
sure it doesn't have a cache either).
Therefore, if you compile a JPEG decoder whose DCT decoding uses floating
point math ... surprise surprise . . . software FPU emulation is for sure
A LOT slower than a hardware FPU.
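
To illustrate the alternative to software floating point, here is a minimal
sketch of a fixed-point multiply in C, the kind of operation an integer DCT is
built from. The names FIX_BITS, FIX and fix_mul and the choice of 13
fractional bits are assumptions made for this example, not taken from any
particular decoder.

```c
#include <stdint.h>

#define FIX_BITS 13                        /* fractional bits (assumed) */
#define FIX_ONE  ((int32_t)1 << FIX_BITS)  /* 1.0 in fixed point */

/* Convert a positive constant to fixed point. Intended for compile-time
 * constants, so no floating point is needed at run time. */
#define FIX(x) ((int32_t)((x) * FIX_ONE + 0.5))

/* Multiply a non-negative sample by a fixed-point constant and round to
 * the nearest integer. Only integer multiply, add and shift are used.
 * Assumes the product fits in 32 bits; right-shifting a negative value
 * is implementation-defined in C, so negative samples need extra care. */
int32_t fix_mul(int32_t sample, int32_t fixed_const)
{
    return (sample * fixed_const + (FIX_ONE >> 1)) >> FIX_BITS;
}
```

For example, fix_mul(100, FIX(0.541196100)) multiplies 100 by roughly 0.5412
without touching the FPU (or its software emulation) at run time.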
(Even DCT decoding with integer math is much, much faster on a RISC type
processor (like the ARM) because of its powerful instruction set. Just look
at ARM: multiply AND accumulate in just 1 instruction... mla Rd,Rs1,Rs2,Rs3
(Rd := Rs1*Rs2 + Rs3) ... simply wonderful!  :)  ... not to mention the
barrel shifter, the fact that ALL registers are general purpose registers,
etc. etc. Ahh yes, I almost forgot about the instruction pipelining on RISC
processors too...)
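
As a sketch of where multiply-accumulate shows up, here is the inner loop of
an integer dot product in C, which is the core of an integer DCT. Whether a
given compiler really emits one mla per iteration on ARM is an assumption; the
function name dot_product is made up for the example.

```c
#include <stdint.h>

/* Integer dot product of two 16-bit vectors with a 32-bit accumulator. */
int32_t dot_product(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++) {
        /* multiply and accumulate: acc := a[i] * b[i] + acc */
        acc += (int32_t)a[i] * b[i];
    }
    return acc;
}
```

On a 68K the same statement needs a separate multiply and add, and each
multiply takes many clock cycles.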
(Another fact: the 'official' JPEG decoding source has, in my opinion, too
much overhead. You may want to write your own image compression code based on
DCT -> quantization -> compression of the quantized output (Huffman + RLE or
LZ77).)
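
A rough sketch in C of the middle of such a pipeline: quantizing
already-transformed coefficients and run-length encoding the zeros. The DCT
itself and the Huffman/LZ77 stage are left out, the output layout is NOT the
JPEG format, and the names quantize and rle_zero_runs are invented for this
example.

```c
/* Divide each coefficient by its quantizer step, rounding to nearest.
 * All steps in q[] are assumed to be positive. */
void quantize(const int *coef, const int *q, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        int c = coef[i];
        out[i] = (c >= 0) ? (c + q[i] / 2) / q[i]
                          : -((-c + q[i] / 2) / q[i]);
    }
}

/* Encode as (zero_run, value) pairs. Trailing zeros are dropped and the
 * stream ends with a (0, 0) marker, which is unambiguous because a value
 * of 0 never appears inside a pair. out[] must hold 2*n + 2 ints.
 * Returns the number of ints written. */
int rle_zero_runs(const int *in, int n, int *out)
{
    int written = 0, run = 0;
    for (int i = 0; i < n; i++) {
        if (in[i] == 0) {
            run++;              /* extend the current run of zeros */
            continue;
        }
        out[written++] = run;   /* zeros skipped before this value */
        out[written++] = in[i];
        run = 0;
    }
    out[written++] = 0;         /* end-of-block marker */
    out[written++] = 0;
    return written;
}
```

Since quantization turns most high-frequency coefficients into zeros, the
pair stream is usually much shorter than the input, and an entropy coder can
then be applied on top of it.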


After all, comparing the Dragonball 68K devices to PDAs driven by ARM (or any
other RISC processor) is like comparing apples with bananas.
(Plus: comparing just the processor speeds is senseless too. Just imagine a
1000 MHz processor with a bus width of 1 bit  ;)  and then compare the results
to an 8 MHz processor with a 32-bit wide bus ...   :)

[I don't know if all of these facts are true, but I hope it helped a bit.  ;)  ]


Best regards,
Carsten.




well..
http://userpages.umbc.edu/~zding1/cmsc611/report.pdf
compares RISC and CISC (not very extensively)

http://www.ecs.umass.edu/ece/wolf/papers/npw2002.pdf
(although it's about network processors, it's quite interesting and has some
important formulas about processor architecture)
(and you can easily see from this paper that a processor isn't measured by
its clock frequency alone)

... you should also look for some papers by the gurus of processor
architecture: Hennessy & Patterson




-- 
For information on using the Palm Developer Forums, or to unsubscribe, please see 
http://www.palmos.com/dev/support/forums/
