On Thu, 2003-11-13 at 09:10, Toby Dickenson wrote:
> I've not used ucs4 python yet, but it is one of the things I was looking
> forward to in version 2.3. It would be much nicer to leave ucs2 behind.

I would like to move away from UCS2 as well, but I'd like some arguments
for why this is a good thing beyond "it's more compatible".

> If ucs4 strings were the only cause of that difference, supybot would need to
> be storing 2.5 million unicode characters. I guess that isn't likely.
> Excluding bugs, I don't see any reason why a program that doesn't use any
> unicode objects would use more memory when running on a ucs4 python
> interpreter.

On a UCS4 build, every unicode string object is stored with 4 bytes per
character instead of 2. Things like XML parsers all use unicode string
objects to store their representations because UTF-8 is the default
encoding for XML. Those sorts of applications may see more significant
growth in their memory footprint.
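
To put rough numbers on this, here is a minimal sketch of the arithmetic. It
counts only the character payload (2 bytes per character for UCS2, 4 for
UCS4) and ignores per-object overhead. The helper names and the sample
string are made up for illustration, and the fixed-width codecs are used
only as a stand-in for the interpreter's internal buffer.

```python
# Bytes per character in the two build flavours (payload only;
# per-object header overhead is ignored in this sketch).
UCS2_BYTES = 2
UCS4_BYTES = 4


def extra_bytes(num_chars):
    """Extra payload bytes a UCS4 build needs for num_chars characters."""
    return num_chars * (UCS4_BYTES - UCS2_BYTES)


def payload_size(text, encoding):
    """Size of text in a fixed-width encoding (stand-in for the internal buffer)."""
    return len(text.encode(encoding))


sample = u"<doc>hello</doc>"  # hypothetical 16-character XML fragment

print(extra_bytes(2500000))               # 5000000
print(payload_size(sample, "utf-16-le"))  # 32
print(payload_size(sample, "utf-32-le"))  # 64
```

By this estimate, the 2.5 million characters mentioned above would account
for about 5 MB of extra payload on a UCS4 build.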

> > But note that this example is not scientific
> > because the machines were different in kernel version, compiler and
> > compiler optimisations.
> 
> Those reasons sound much more plausible to me. Does anyone have a more
> scientific comparison of the effect of the ucs4 option on python?

I'd like to do that some time. Otherwise, someone with a faster machine
than mine may want to try it. It would be interesting to see what the
real impact is. If the memory footprint doesn't grow as much as I claim
it does, then that is a powerful argument for moving to UCS4 as the default.
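
A fair comparison would first need to confirm which build each interpreter
actually is. A minimal sketch, assuming only that sys.maxunicode is 65535 on
a UCS2 (narrow) build and 1114111 on a UCS4 (wide) build; the function name
is made up for illustration. Note that Python 3.3 and later always report
the wide value.

```python
import sys


def unicode_build():
    """Return 'UCS2' for a narrow build and 'UCS4' for a wide one."""
    # Narrow builds cap sys.maxunicode at 0xFFFF; wide builds report 0x10FFFF.
    return "UCS4" if sys.maxunicode > 0xFFFF else "UCS2"


print(unicode_build(), hex(sys.maxunicode))
```

Running the same workload under both builds on the same machine, with the
same kernel and compiler flags, would then isolate the effect of the option.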

The reason UCS2 is still the default in the masked python-2.3.2 is that
(a) not many people currently use anything that requires characters
outside the range UCS2 covers and (b) UCS4 does take up more memory than
UCS2. How much more, I'm not certain.

For instance, how much more memory would portage take if it doesn't use
unicode strings at all?

Cheers,
-- 
Alastair 'liquidx' Tse
 >> Gentoo Developer
 >> http://www.liquidx.net/ | http://dev.gentoo.org/~liquidx/
