On 8/3/06 8:00 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
>> I know that Valentina went UTF-16 for precisely this reason.
>
> Could be a mistake :( Is he processing the full code points? If he
> is, then the variable widthness of UTF-16 kills off the advantage
> over utf-8.
>
> RB's regex requires UTF-8, btw. If UTF-16 is so much easier, then why
> is it using UTF-8?

I think I know the answer.

If you look around, you will see that many software projects that were
not Unicode-safe eventually reach the point where they need Unicode.
Old C/C++ projects are built on char*. UTF-8 fits this, but UTF-16
does not.

When we switched Valentina to UTF-16, we were lucky that we had decided
at the same time to rewrite the whole engine from scratch using many
modern C++ techniques. We switched our code to UChar*, which for
many compilers is wchar_t, 2 bytes.

Many big old projects simply cannot afford to rewrite themselves
completely using new string pointers and methods.

-----------
About REGEX... I know of only one REGEX library that works with UTF-16:
the ICU library. Apple uses ICU, but they opened access to ICU's REGEX
only in 10.4.

So REALbasic probably uses some other third-party REGEX library that
works only with UTF-8.

-- 
Best regards,
Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]
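
For readers following the "variable widthness of UTF-16" point in the quoted text, here is a minimal Python sketch of what it means. It is not taken from Valentina, REALbasic, or ICU; the helper names (utf16_code_units, utf8_bytes, count_code_points_utf16) are made up for illustration. It shows that a code point outside the Basic Multilingual Plane takes two 16-bit units (a surrogate pair) in UTF-16, so counting or indexing full code points needs a scan, much as it does in UTF-8.

```python
import struct


def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text: str) -> int:
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


def count_code_points_utf16(data: bytes) -> int:
    """Count code points in well-formed little-endian UTF-16 data.

    Every code point contributes exactly one unit that is NOT a low
    surrogate (0xDC00..0xDFFF), so we count those units.
    """
    units = struct.unpack(f"<{len(data) // 2}H", data)
    return sum(1 for u in units if not 0xDC00 <= u <= 0xDFFF)


if __name__ == "__main__":
    # One sample each for 1-, 2-, 3- and 4-byte UTF-8 sequences.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001D11E"]:
        print(f"U+{ord(ch):04X}: "
              f"UTF-16 units={utf16_code_units(ch)}, "
              f"UTF-8 bytes={utf8_bytes(ch)}")
```

Only the last sample (U+1D11E, outside the BMP) needs two UTF-16 units. Code that treats every 16-bit unit as one character is fast but only correct for BMP text; code that processes full code points has to handle the pair.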
