On Tue, 6 Sep 2005, Reed Hedges wrote:
> Why use wxConvCurrent instead of wxConvUTF8? I got the impression from
> the wx docs that wxConvCurrent depends on the (GUI) platform, so on a
> unicode platform you'd be telling it your strings from VOS were unicode,
> when they aren't. I only read the wx docs with a normal depth, no
> backwards bending, maybe I am wrong :)
Hmm, you may well be right. I think I mistakenly thought that
wxConvCurrent converted *to* the "current" native encoding for the window
system -- but that doesn't really make sense, because the purpose of the
wxString class is precisely to gloss over those details.
So it actually names the encoding of the source string. If we decide to
go with UTF-8, then every use of wxConvCurrent will need to be changed to
wxConvUTF8.
UTF-8 pros:
- backwards-compatible with ASCII
- most efficient way to encode Western scripts
- most common Unicode encoding on Unix (?)
UTF-8 cons:
- variable-length characters
- less efficient than UTF-16 for East Asian scripts (Chinese/Japanese):
because of the marker bits in every byte of a multibyte sequence, most of
those characters take three bytes instead of two
- native encoding in Windows is UTF-16
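To put rough numbers on the size trade-off in the pros and cons above,
here is a minimal sketch that counts the bytes a single code point takes
in each encoding. The function names are made up for illustration, and
char32_t assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed to encode one Unicode code point in UTF-8.
std::size_t utf8_bytes(char32_t cp) {
    assert(cp <= 0x10FFFF);       // outside this range is not Unicode
    if (cp < 0x80)    return 1;   // ASCII
    if (cp < 0x800)   return 2;   // Latin-1 supplement, Greek, Cyrillic, ...
    if (cp < 0x10000) return 3;   // rest of the BMP, including most CJK
    return 4;                     // supplementary planes
}

// Bytes needed to encode the same code point in UTF-16.
std::size_t utf16_bytes(char32_t cp) {
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 2 : 4;  // one 16-bit unit, or a surrogate pair
}
```

So 'A' costs 1 byte in UTF-8 against 2 in UTF-16, while U+4E2D (a common
Chinese character) costs 3 against 2.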
Variable-length characters are what really burn people. Unless one has a
string class that specifically knows about Unicode variable-length
encodings, the usual solution is to store the string as fixed-width
UTF-32. So for example, std::string is really a typedef for
std::basic_string<char>, so a Unicode string would be
std::basic_string<int32_t>.
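As a rough illustration of the fixed-width approach, the sketch below
decodes UTF-8 bytes into one element per code point. It uses
std::u32string, the C++11 spelling of the same idea as
std::basic_string<int32_t>; the function name is made up for this example
and none of it is existing VOS code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 into UTF-32: one array element per code point.
// Sketch only: it checks the byte structure but does not reject
// overlong forms or surrogate code points.
std::u32string utf8_to_utf32(const std::string& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes after the lead byte
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i <= extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the 6 payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

With this, the six-byte UTF-8 string "h\xC3\xA9llo" decodes to five
elements, so size() and operator[] count characters rather than bytes.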
Making Ter'angreal Unicode friendly (by using the wx unicode classes
_correctly_ :-) shouldn't be too hard. I'm going to convert my install to
use unicode and start primarily developing in that environment.
Making VOS unicode friendly is a much bigger issue.
- Properties store arbitrary binary data. We probably want to have the
property datatype include the encoding, with it defaulting to UTF-8.
Reading/writing text with multibyte characters to properties will require
an encoding step.
- As noted above, to store unicode strings so that methods that operate
on the string work correctly, it would be necessary to convert everything
to use std::basic_string<int32_t> or something similar.
- Various other things, such as mesh's command line parser and the vosapp
framework, may need to be made multibyte-aware as well.
- It's not clear what immediate short-term benefit there would be to
allow chinese characters in the child contextual names.
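For the first point in the list above, here is a minimal sketch of what
"the datatype includes the encoding, defaulting to UTF-8" might look like.
It assumes the datatype is a MIME-style string such as
"text/plain; charset=ISO-8859-1"; that format and the helper name are
assumptions for illustration, not the current VOS API.

```cpp
#include <string>

// Hypothetical helper: pull the charset out of a MIME-style datatype
// string, falling back to UTF-8 when none is given.
// Sketch only: the "charset=" match is case-sensitive and quoted
// values are not handled.
std::string charset_of(const std::string& datatype) {
    const std::string key = "charset=";
    std::string::size_type pos = datatype.find(key);
    if (pos == std::string::npos)
        return "UTF-8";                       // no parameter: use the default
    pos += key.size();
    const std::string::size_type end = datatype.find_first_of("; \t", pos);
    const std::string value =
        datatype.substr(pos, end == std::string::npos ? end : end - pos);
    return value.empty() ? "UTF-8" : value;   // "charset=" with no value
}
```

Code reading a text property would then call charset_of() on the datatype
and pick the matching conversion before handing the text to the rest of
the application.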
My feeling is that there are two main places where we would want to
support Unicode: properties with text would be stored in UTF-8 (?), and
chat messages would also be sent in UTF-8. This will require a library to
provide the conversion support. Does anyone know a C++ library for this
offhand?
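For a sense of scale, the encoding half of that conversion is small
enough to sketch by hand. This is only an illustration under the
assumption of a C++11 compiler (for std::u32string); the function name is
made up, and it is not meant as a substitute for a proper library.

```cpp
#include <stdexcept>
#include <string>

// Encode UTF-32 code points as UTF-8 bytes.
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        // Surrogates and values past U+10FFFF cannot be encoded.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("not a Unicode scalar value");
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A chat message held as UTF-32 in memory would go through something like
this just before being sent, and through the reverse step on receipt.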
Overall, these two specific changes wouldn't be a huge amount of work,
and they are contained enough that they could probably be phased in
without breaking the entire VOS API all at once.
[ Peter Amstutz ][ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[Lead Programmer][Interreality Project][Virtual Reality for the Internet]
[ VOS: Next Generation Internet Communication][ http://interreality.org ]
[ http://interreality.org/~tetron ][ pgpkey: pgpkeys.mit.edu 18C21DF7 ]