I recently compared the speed of an identical algorithm (loading a JPEG) on several platforms - a PC (1.6 GHz), a Win CE device (about 400 MHz; I have forgotten the exact model, but it should not matter) and a Palm m515. The results really surprised me.
The m515 ran the program approximately 1000x slower than the PC, whereas the processor clock speeds differ only by a factor of 50. The CE machine was only 2 times slower than the clock speed difference would suggest. In other words, after linear compensation for the different clock speeds, the CE machine seems to be 10 times "faster" than the Palm. For me this is an incredibly bad result for the Palm.

Naturally I tried to find some excuses. The first candidate was the file loading. (The picture was read from an SD card.) It turned out that the file reading, although slow, takes only 3% of the whole image processing time.

There are several apps on the market claiming that they can substantially speed up a Palm handheld by supplying properly written low-level routines - notably memory operations, drawing functions etc. (This fact itself would deserve an explanation from the OS writers.) So I tried to speed up the memory operations. MemMove() is really easy to beat, but it turned out that this function contributes substantially only in the case of progressive JPEG, which wasn't the case here. Speeding up MemSet() (or rather, writing a special MemClear()) would help, but here I did not succeed in writing substantially faster code. There are no DM calls used on the Palm side (they are needed only for progressive JPEGs) - everything runs in dynamic RAM.

I am using CodeWarrior 8 and, to the best of my belief, I used the optimal compiler parameters. Curiously enough, CW 9 (besides other problems) seems to produce slower code than CW 8 - at least it did in this particular case. I also tried to look at how good the CW optimizer is. It seems to be lagging behind its MS counterpart (a factor of 2-3 improvement should often be possible), but working in assembly is the last resort for me, especially in the case of JPEG, where the key places are rather complicated algorithms. Anyway, I would be curious whether the gcc compiler could deliver faster code.

As the last thing, I wrote a simple, stupid benchmark program consisting of loops and additions.
The results were a bit better, but the Palm was still 10x slower than the processor speeds would suggest. Is there anybody able to explain these results?

Jan Slodicka

--
For information on using the Palm Developer Forums, or to unsubscribe, please see http://www.palmos.com/dev/support/forums/
