Re: [vos-d] Terangreal and Unicode?
Hugh Perkins wrote:
> utf-8 sounds good

I'd say that we use only utf-8, basically, and have conversions from whatever wchar_t*/wstring are (are they 16 bit or 32 bit characters? depends on the platform?) built into the property and talkative APIs (the conversion functions would be in vutil, of course).

> on the other hand since pure ascii maps exactly to
> utf-8 that's probably not an issue?

The issue is that if an application expects plain ASCII from, say, property data and gets some UTF-8 high characters, they are discarded or displayed as ASCII "garbage" characters. Getting ASCII when you are expecting UTF-8 is of course no problem: you will get valid UTF-8 :)

> btw, i broke my wrist rollerblading and i'm typing wiyh one hand which is
> why there are som any typos... the moral is: don't trust cheap chinese
> wrist-guards :-/

Sounds bad! Hope it heals quickly!

Reed

___
vos-d mailing list
vos-d@interreality.org
http://www.interreality.org/cgi-bin/mailman/listinfo/vos-d
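The ASCII-versus-UTF-8 concern in the message above can be sketched in a few lines of C++. This is only an illustration: `isPureAscii` and `asciiFallback` are hypothetical names, not functions from VOS or vutil, and the fallback policy (one '?' per non-ASCII character) is an assumption about what an ASCII-only consumer might want.

```cpp
#include <string>

// Hypothetical helper (not part of VOS or vutil): true when every byte is
// below 0x80, i.e. the data is plain ASCII and therefore also valid UTF-8.
bool isPureAscii(const std::string& data) {
    for (unsigned char byte : data) {
        if (byte >= 0x80) return false;
    }
    return true;
}

// Hypothetical fallback for a consumer that can only display ASCII:
// each non-ASCII UTF-8 character is replaced by a single '?'.
std::string asciiFallback(const std::string& utf8) {
    std::string out;
    for (unsigned char byte : utf8) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);   // ASCII passes through unchanged
        } else if ((byte & 0xC0) != 0x80) {
            out += '?';                       // lead byte: one '?' per character
        }
        // continuation bytes (10xxxxxx) are dropped
    }
    return out;
}
```

An ASCII-only application could call `isPureAscii` on incoming property data and use `asciiFallback` only when the check fails, instead of showing raw high bytes.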
Re: [vos-d] Terangreal and Unicode?
utf-8 sounds good fwiw.

For swig, you can quite easily define your own functions to handle how conversions take place, so you can probably just define your own conversion function for strings? You only have to write a conversion function once, and they're generally quite short.

On the other hand, if strings are sometimes utf-8 and sometimes not, it could get trickier... then again, since pure ascii maps exactly to utf-8, that's probably not an issue? Easy to say when it's not me writing the wrappers :-)

btw, i broke my wrist rollerblading and i'm typing wiyh one hand which is why there are som any typos... the moral is: don't trust cheap chinese wrist-guards :-/

Hugh

On 9/9/05, Reed Hedges <[EMAIL PROTECTED]> wrote:
> On Fri, Sep 09, 2005 at 10:16:28AM -0400, Reed Hedges wrote:
> > Then property just needs wstring methods I guess.
>
> with conversion into utf8 to store in the std::string.
Re: [vos-d] Terangreal and Unicode?
On Fri, Sep 09, 2005 at 10:16:28AM -0400, Reed Hedges wrote:
> Then property just needs wstring methods I guess.

with conversion into utf8 to store in the std::string.
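A minimal sketch of the wstring-to-UTF-8 direction mentioned above, using only the C++ standard library. The name `wideToUtf8` is hypothetical (nothing here is taken from vutil), and the sketch assumes that a 16-bit wchar_t holds UTF-16 and a 32-bit wchar_t holds whole code points. It does no validation of lone surrogates or out-of-range values.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append one Unicode code point to out as UTF-8 (1 to 4 bytes).
static void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert a wstring to a UTF-8 encoded std::string.
std::string wideToUtf8(const std::wstring& wide) {
    std::string out;
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);
        // Assumption: a 16-bit wchar_t means UTF-16, so a high surrogate
        // followed by a low surrogate is combined into one code point.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < wide.size()) {
            std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        appendUtf8(out, cp);
    }
    return out;
}
```

A property setter taking a wstring could then store `wideToUtf8(value)` in its std::string, which keeps the stored data UTF-8 regardless of the platform's wchar_t width.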
Re: [vos-d] Terangreal and Unicode?
Hey guess what, we need to deal with Unicode strings in Python too!

Unicode strings are a different type from normal strings. Swig easily maps 'some string' to char* or std::string, but doesn't seem to want to map u'some string'. You have to convert it with string.encode("ascii") or something like that. Might Swig map u'some string' to wchar_t* or wstring? I'll have to try it and find out.

...

Been thinking a bit more. I guess we could say that in general, textual Property data should be utf-8 only and no other encoding. Certain textual property datatypes restrict the characters allowed anyway, for instance ints and floats, and the characters those are restricted to would be plain ASCII anyway. Then property just needs wstring methods I guess.

I'm still not sure about object names and URLs. I have a feeling that we should hold off on that; it could get pretty complicated.

Is there any reason to support other charsets like the Windows encodings or the "koi" encodings or whatever? Do they offer anything that Unicode doesn't (other than existing documents in those charsets)?

Reed
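To make the "property just needs wstring methods" idea concrete, here is a hedged sketch of the reading side: data stays UTF-8 in a std::string and is decoded on demand. `TextProperty`, `readWide` and `utf8ToWide` are hypothetical names used only for illustration and are not the actual VOS property API. Malformed or truncated sequences become U+FFFD; overlong forms and out-of-range values are not rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode UTF-8 into a wstring.
std::wstring utf8ToWide(const std::string& utf8) {
    std::wstring out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else                            { cp = 0xFFFD;      extra = 0; }
        for (; extra > 0; --extra) {
            if (i >= utf8.size() ||
                (static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80) {
                cp = 0xFFFD;  // truncated or malformed sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i++]) & 0x3F);
        }
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;  // 16-bit wchar_t: emit a UTF-16 surrogate pair
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// Hypothetical stand-in for a textual property: stores UTF-8 and offers
// a wide-string getter on top of it.
struct TextProperty {
    std::string utf8;
    std::wstring readWide() const { return utf8ToWide(utf8); }
};
```

Because ints and floats use only ASCII characters, such a getter returns them unchanged apart from the wider character type.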
Re: [vos-d] Terangreal and Unicode?
Hmm, well I'll have to try it again, then. I don't remember what the specific problem was, although I seem to recall at the time (this being over a year ago) Jorrit warning me away from Unicode and CS on Win32.

On Tue, 6 Sep 2005, res wrote:
> On 06.09.2005 05:41, Peter Amstutz wrote:
> > I tried compiling in unicode on Windows once, and it failed miserably
> > because Crystal Space can't (or couldn't) handle the
> > Unicode/wide-character versions of the win32 API.
>
> Not sure what you mean. CS itself interacts with Win32 only at very few
> points, and it actually uses Unicode versions of Win32 functions on
> Windows NT. I'm not sure how a Unicode wxWidgets could break due to CS.
>
> -f.r.

[ Peter Amstutz ][ [EMAIL PROTECTED] ][ [EMAIL PROTECTED] ]
[Lead Programmer][Interreality Project][Virtual Reality for the Internet]
[ VOS: Next Generation Internet Communication][ http://interreality.org ]
[ http://interreality.org/~tetron ][ pgpkey: pgpkeys.mit.edu 18C21DF7 ]
Re: [vos-d] Terangreal and Unicode?
On 06.09.2005 05:41, Peter Amstutz wrote:
> I tried compiling in unicode on Windows once, and it failed miserably
> because Crystal Space can't (or couldn't) handle the
> Unicode/wide-character versions of the win32 API.

Not sure what you mean. CS itself interacts with Win32 only at very few points, and it actually uses Unicode versions of Win32 functions on Windows NT. I'm not sure how a Unicode wxWidgets could break due to CS.

-f.r.
Re: [vos-d] Terangreal and Unicode?
I've spent this weekend moving into my new apartment, and I don't have internet for my desktop yet (on my laptop I'm "borrowing" the upstairs neighbor's wireless for the time being). However, when I get a chance I will set up ter'angreal to compile in unicode mode on Debian; I'll let you know how it goes.

I tried compiling in unicode on Windows once, and it failed miserably because Crystal Space can't (or couldn't) handle the Unicode/wide-character versions of the win32 API. So I'm not sure what the status of unicode on Windows is going to be; it may or may not be possible to isolate the unicode-using wxwidgets from the rest of CS. Debian doesn't have this problem because the standard encoding is UTF-8, which is backwards-compatible with ASCII.

A related, larger issue is internationalization of VOS in general. I'm not sure where to go with that, although Reed mentions one issue, which is including the text encoding in places like property datatypes and talk messages. We would also want to use gettext to support translations for the UI of user apps like Ter'Angreal.

I should be getting internet set up at my apartment on Thursday, so I'll be able to start working on VOS again. In the meantime I need to finish unpacking :-)

On Sun, 4 Sep 2005, Reed Hedges wrote:
> I guess wxgtk 2.6 in Debian is built in Unicode mode, and while I noticed some random bits and pieces of TerAngreal with stuff in #ifdef wxUSE_UNICODE conditionals, and some wxStrings are converted with wxString::mb_str(), it's generally incomplete.
>
> If built in Unicode mode, wxChar is a wchar_t; otherwise it's an 8-bit char. Wx provides some objects to convert between char* in various encodings and wxChar (whichever of the above it might be).
>
> So I think to create a wxString from an ASCII char* you do this:
>
>     const char* cstr = "foo";
>     wxString wxstr(cstr, wxConvUTF8);
>
> In Unicode mode, wxString uses the predefined wxConvUTF8 object to convert cstr from UTF8 (ASCII) to Unicode. In non-unicode mode it just uses cstr, or maybe wxConvUTF8 actually does no conversion... whatever.
>
> And to get a UTF8 string:
>
>     cstr = (const char*) wxstr.mb_str();
>
> Again, in Unicode mode, mb_str() converts to ASCII, while in non-unicode mode it just returns the ASCII string. Maybe you need to do this, not 100% certain yet:
>
>     cstr = (const char*) wxstr.mb_str(wxConvUTF8);
>
> And of course string literals have different syntax; wx provides the wxT macro to do the right thing depending on the mode (though for the most part we are already using that in terangreal).
>
> I made a few changes along those lines and got wxterangreal to build but now it crashes in CS, as does wxtest, in csDriverDBReader.
>
> Has anyone else been hacking on Terangreal to build in unicode mode? Anyone want my patch while I figure out why it's crashing? (I suspect that some combination of my graphics hardware and X configuration is revealing the crashing bug.) I have a bunch of other modifications I made to terangreal mixed in at the moment, though; I'd have to separate them.
>
> One thing we will need to do is add support for different encodings to things like the talkative message strings and property values. Or maybe just change them to always use unicode?
>
> Reed
Re: [vos-d] Terangreal and Unicode?
On 05.09.2005 03:44, Reed Hedges wrote:
> I made a few changes along those lines and got wxterangreal to build but
> now it crashes in CS, as does wxtest, in csDriverDBReader.

A backtrace may be helpful.

-f.r.
[vos-d] Terangreal and Unicode?
I guess wxgtk 2.6 in Debian is built in Unicode mode, and while I noticed some random bits and pieces of TerAngreal with stuff in #ifdef wxUSE_UNICODE conditionals, and some wxStrings are converted with wxString::mb_str(), it's generally incomplete.

If built in Unicode mode, wxChar is a wchar_t; otherwise it's an 8-bit char. Wx provides some objects to convert between char* in various encodings and wxChar (whichever of the above it might be).

So I think to create a wxString from an ASCII char* you do this:

    const char* cstr = "foo";
    wxString wxstr(cstr, wxConvUTF8);

In Unicode mode, wxString uses the predefined wxConvUTF8 object to convert cstr from UTF8 (ASCII) to Unicode. In non-unicode mode it just uses cstr, or maybe wxConvUTF8 actually does no conversion... whatever.

And to get a UTF8 string:

    cstr = (const char*) wxstr.mb_str();

Again, in Unicode mode, mb_str() converts to ASCII, while in non-unicode mode it just returns the ASCII string. Maybe you need to do this, not 100% certain yet:

    cstr = (const char*) wxstr.mb_str(wxConvUTF8);

And of course string literals have different syntax; wx provides the wxT macro to do the right thing depending on the mode (though for the most part we are already using that in terangreal).

I made a few changes along those lines and got wxterangreal to build but now it crashes in CS, as does wxtest, in csDriverDBReader.

Has anyone else been hacking on Terangreal to build in unicode mode? Anyone want my patch while I figure out why it's crashing? (I suspect that some combination of my graphics hardware and X configuration is revealing the crashing bug.) I have a bunch of other modifications I made to terangreal mixed in at the moment, though; I'd have to separate them.

One thing we will need to do is add support for different encodings to things like the talkative message strings and property values. Or maybe just change them to always use unicode?
Reed
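The build-mode switch described in the message above (wxChar being wchar_t or char, and wxT picking the matching literal syntax) can be illustrated without wxWidgets. The names `DEMO_UNICODE`, `DemoChar`, `DEMO_T` and `DemoString` below are hypothetical stand-ins for wxUSE_UNICODE, wxChar, wxT and wxString; this is a sketch of the mechanism only, not wx's actual headers.

```cpp
#include <string>

// Hypothetical build switch standing in for wxUSE_UNICODE.
#ifndef DEMO_UNICODE
#define DEMO_UNICODE 1
#endif

#if DEMO_UNICODE
typedef wchar_t DemoChar;            // plays the role of wxChar in a Unicode build
#define DEMO_T(literal) L##literal   // plays the role of wxT(): wide literal
#else
typedef char DemoChar;               // 8-bit char in a non-Unicode build
#define DEMO_T(literal) literal      // literal is left as a narrow string
#endif

// A string type whose character width follows the build mode.
typedef std::basic_string<DemoChar> DemoString;

// The same source line compiles in both modes because the literal is wrapped.
DemoString greeting() {
    return DEMO_T("foo");
}
```

Code that writes every literal through the macro compiles unchanged in either mode, which is why unwrapped literals and raw char* conversions are the places that tend to break when switching to a Unicode build.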