I recently compared the speed of an identical algorithm (loading a JPEG) on
several platforms: a PC (1.6 GHz), a Win CE device (about 400 MHz; I have
forgotten the exact model, but it should not matter) and a Palm m515. The
results really surprised me.

The m515 ran the program approx. 1000x slower than the PC, whereas the
processor speeds differ only by a factor of 50.
The CE machine was only 2 times slower than the processor speed difference
would suggest.
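
The arithmetic behind these figures can be written down as a small C sketch.
The function names are made up for illustration, and the inputs are just the
rounded figures quoted above, not new measurements.

```c
/* Hypothetical helper: how many times slower a device is than its
   clock speed alone would predict.
   measured_slowdown: run time relative to the PC (e.g. 1000 for the m515)
   clock_ratio:       PC clock divided by device clock (e.g. 50)        */
static double clock_normalized_slowdown(double measured_slowdown,
                                        double clock_ratio)
{
    return measured_slowdown / clock_ratio;
}

/* Hypothetical helper: how much worse device A does than device B once
   both have been compensated for clock speed. */
static double relative_penalty(double normalized_a, double normalized_b)
{
    return normalized_a / normalized_b;
}
```

With the numbers from this post, clock_normalized_slowdown(1000.0, 50.0)
gives 20 for the m515. The CE machine's normalized figure is 2, so
relative_penalty(20.0, 2.0) gives 10.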

In other words, after linear compensation for the different processor speeds,
the CE machine seems to be 10 times "faster" than the Palm.
For me this is an incredibly bad result for the Palm. Naturally, I tried to
find some excuses.

The first candidate was file loading. (The picture was read from an SD
card.) It turned out that file reading, although slow, takes only 3% of the
whole image processing time.

There are several apps on the market claiming that they can substantially
speed up a Palm handheld by supplying properly written low-level routines,
notably memory operations, drawing functions etc. (This fact itself would
deserve an explanation from the OS writers.)
So I tried to speed up the memory operations. MemMove() is really easy to
beat, but it turned out that this function contributes substantially only in
the case of progressive JPEG (which was not the case here). Speeding up
MemSet() (or rather, writing a special MemClear()) would help, but here I did
not succeed in writing substantially faster code.
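
To make the MemClear() idea concrete, here is a minimal sketch of the usual
approach: align the pointer, then clear with wide stores. It is plain C99, not
Palm OS code and not the routine I actually tried; the name mem_clear is
hypothetical. Whether something like this beats MemSet() on a given device
would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MemClear-style routine: zero len bytes starting at dst. */
static void mem_clear(void *dst, size_t len)
{
    unsigned char *p = (unsigned char *)dst;
    uint32_t *w;

    /* Byte stores until the pointer is 4-byte aligned. (A 68k-family CPU
       faults on word or long accesses at odd addresses.) */
    while (len > 0 && ((uintptr_t)p & 3u) != 0) {
        *p++ = 0;
        len--;
    }

    /* Main loop, unrolled: four 32-bit stores (16 bytes) per iteration. */
    w = (uint32_t *)p;
    while (len >= 16) {
        w[0] = 0;
        w[1] = 0;
        w[2] = 0;
        w[3] = 0;
        w += 4;
        len -= 16;
    }

    /* Remaining whole 32-bit words. */
    while (len >= 4) {
        *w++ = 0;
        len -= 4;
    }

    /* Trailing bytes. */
    p = (unsigned char *)w;
    while (len > 0) {
        *p++ = 0;
        len--;
    }
}
```

The unrolling factor of four is an arbitrary choice for the sketch; the best
value depends on the CPU and on what the compiler makes of the loop.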

There are no DM calls used on the Palm side (they are needed only for
progressive JPEGs) - everything runs in dynamic RAM.

I am using CodeWarrior 8 and, as far as I can tell, the optimal compiler
settings. Curiously enough, CW 9 (besides other problems) seems to produce
slower code than CW 8, at least in this particular case.

I tried to look at how good the CW optimizer is. It seems to be lagging
behind its MS counterpart (an improvement by a factor of 2-3 should often be
possible), but working in assembly is the last resort for me, especially in
the case of JPEG, where the key places are rather complicated algorithms.
Anyway, I would be curious whether the gcc compiler could deliver faster code.

As a last step I wrote a simple, naive benchmark program consisting of
loops and additions. The results were a bit better, but the Palm was still
10x slower than the processor speeds would suggest.
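
For reference, a benchmark of this kind looks roughly like the sketch below.
It is not the exact program I ran: the function names are hypothetical, and it
uses the standard C clock() for timing. On Palm OS one would presumably read
the tick counter (TimGetTicks() and SysTicksPerSecond()) instead.

```c
#include <time.h>

/* Hypothetical benchmark kernel: nested loops doing nothing but additions.
   The accumulator is volatile so the compiler cannot collapse the loops
   into a closed-form result; the price is a memory access per addition. */
static unsigned long add_loop(unsigned long outer, unsigned long inner)
{
    volatile unsigned long sum = 0;
    unsigned long i, j;

    for (i = 0; i < outer; i++)
        for (j = 0; j < inner; j++)
            sum += j;

    return sum;
}

/* Returns the CPU time in seconds used by one run of add_loop(). */
static double time_add_loop(unsigned long outer, unsigned long inner)
{
    clock_t start = clock();

    (void)add_loop(outer, inner);

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running time_add_loop() with the same arguments on each device and dividing
the times by the clock ratio gives the kind of comparison described above.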

Is there anybody able to explain these results?

Jan Slodicka


