Re: [fpc-devel] Memory consumed by strings

Daniël Mantione Sun, 23 Nov 2008 04:11:13 -0800


On Sun, 23 Nov 2008, listmember wrote:

What I had in mind wasn't to store the string data in UTF-32 (or UCS-4); it would still be UTF-8 or whatever.
I am only considering in memory representation being UTF-32 (or UCS-4).
This way, loading from and saving to would hardly be affected, yet in-memory operations would be a lot faster and more simplified.

For source code, an extended ASCII charset like UTF-8 is the best choice: all characters that need processing are in the ASCII range, so the code needs to do nothing with the high ASCII codes except keep them together.

Therefore, any other encoding is a waste of memory and does not gain you any speed. For that reason, I don't see the compiler switching from 8-bit processing either.
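
To illustrate the point about source code, here is a minimal sketch in Python (not how the FPC scanner is actually written). The function name `split_tokens` and the delimiter set are assumptions made up for the example. It relies on one property of UTF-8: every byte of a multi-byte sequence is >= 0x80, so an ASCII delimiter can never appear in the middle of a non-ASCII character.

```python
def split_tokens(source: bytes) -> list[bytes]:
    """Split UTF-8 encoded source text into tokens, looking only at ASCII bytes.

    Hypothetical helper for illustration. Bytes >= 0x80 belong to multi-byte
    UTF-8 sequences; they are never interpreted, only copied into the
    current token.
    """
    whitespace = b" \t\r\n"
    delimiters = whitespace + b";:=()+-*/,."  # assumed delimiter set
    tokens = []
    current = bytearray()
    for byte in source:
        if byte < 0x80 and byte in delimiters:
            # An ASCII delimiter ends the token collected so far.
            if current:
                tokens.append(bytes(current))
                current.clear()
            # Punctuation becomes its own token; whitespace is dropped.
            if byte not in whitespace:
                tokens.append(bytes([byte]))
        else:
            # Ordinary ASCII or a "high" byte: keep it in the current token.
            current.append(byte)
    if current:
        tokens.append(bytes(current))
    return tokens


if __name__ == "__main__":
    for token in split_tokens("s := 'Daniël';".encode("utf-8")):
        print(token)
```

The loop never decodes anything, so it runs on 8-bit units and still keeps the two bytes of the "ë" together inside the string literal token.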

The situation is very different when processing real text: the memory-saving advantages disappear for the majority of the world, and if you want to process characters beyond #127, UTF-16 and UTF-32 are much easier. Obviously, UTF-32 is the best encoding if the characters you need to process are beyond #65535.
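
The memory argument can be checked with a small Python sketch. The sample strings and the helper name `encoded_sizes` are assumptions chosen for the example; the byte counts follow directly from the encodings. The last sample, U+1D11E, lies beyond #65535 and therefore needs a surrogate pair (two 16-bit units) in UTF-16.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes the text occupies in each encoding.

    The little-endian codec variants are used so that no byte order mark
    is added to the count.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Illustrative samples (assumed, not taken from the thread).
samples = {
    "English": "Memory consumed by strings",
    "Russian": "строка",
    "Chinese": "字符串",
    "beyond #65535": "\U0001D11E",  # MUSICAL SYMBOL G CLEF
}

if __name__ == "__main__":
    for name, text in samples.items():
        print(name, encoded_sizes(text))
```

For the ASCII sample UTF-8 is the smallest, for the Cyrillic sample UTF-8 and UTF-16 are the same size, and for the Chinese sample UTF-8 is larger than UTF-16.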

UTF-32 is a lot faster and simpler only if you need to process characters (rather than pass them on).
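
A sketch of why character processing is simpler in UTF-32, again in Python and with made-up helper names (`nth_char_utf32`, `nth_char_utf8`). "Character" here means a Unicode code point. With a fixed width, the n-th character sits at a computed offset; with UTF-8, the code has to walk the buffer and count lead bytes.

```python
import struct


def nth_char_utf32(buf: bytes, n: int) -> str:
    """Fetch character n from little-endian UTF-32 data: offset is simply 4 * n."""
    (code_point,) = struct.unpack_from("<I", buf, 4 * n)
    return chr(code_point)


def nth_char_utf8(buf: bytes, n: int) -> str:
    """Fetch character n from UTF-8 data by scanning from the start."""
    index = -1
    start = None
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 10xxxxxx; anything else starts a character.
        if byte & 0xC0 != 0x80:
            index += 1
            if index == n:
                start = pos
            elif index == n + 1:
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(n)
    # Character n was the last one in the buffer.
    return buf[start:].decode("utf-8")


if __name__ == "__main__":
    text = "Daniël\U0001D11Ex"
    as_utf8 = text.encode("utf-8")
    as_utf32 = text.encode("utf-32-le")
    for i in range(len(text)):
        print(i, nth_char_utf8(as_utf8, i), nth_char_utf32(as_utf32, i))
```

If the data is only passed on, neither function is ever called and the cheaper storage of UTF-8 wins; the scan cost only matters once individual characters are addressed.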


Daniël

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
