Re: [sqlite] Making sqlite support unicode?
On Mon, 2003-10-27 at 10:27, [EMAIL PROTECTED] wrote:
> On Sun, 26 Oct 2003 23:36:55 -0500
> "Mrs. Brisby" <[EMAIL PROTECTED]> wrote:
> > It's good to use null-terminated in many cases; especially in collating
> > and sorting. It helps to understand that in those cases you stop
> > processing _after_ you see the terminator (and treat the terminator as
> > it is: zero.)
>
> Collating involves with length. If data length is known prior to scanning
> data, in some cases you can skip it if it doesn't match without scanning
> data body. It helps to understand that in those cases you stop processing
> _before_ you see the terminator or anything else.

No it cannot. How are the following tokens collated?

aaa
aab

> > UTF-16 is NOT used in HFS+. HFS+ still uses ASCII with some "tricks".
> > UFS is what's "preferred" in MacOS X, and it doesn't use UTF-16 either.
> > UTF-16 isn't what we're talking about anyway, it's UCS16.
> MacOS X uses "Unicode" as its native encoding. In Unicode encoding
> the most used in MacOS X is UTF-16. Only to call BSD API it uses
> UTF-8. It's kind of hybrid, but UTF-8 is just used for compatibility to
> Unix parts in MacOS X, and other non-Unix pieces in MacOS X, which
> is why MacOS X is Mac, is using UTF-16 internally, including Carbon,
> Cocoa and ATSUI.
>
> For HFS+, from Apple's Technical Note TN2078 (Migrating to FSRefs & long
> Unicode names from FSSpecs):
> http://developer.apple.com/technotes/tn2002/tn2078.html

See the parts in http://developer.apple.com/technotes/tn/tn1150.html regarding the HFS wrapper; this was what I thought I was remembering; my memory on this subject is admittedly spotty. I don't recall this information being quite so easy to google; thanks for correcting me on this.

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
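The disagreement above can be made concrete with a minimal sketch in C. It assumes byte-wise comparison, which is only one possible collation, and the function names `counted_equal` and `counted_collate` are hypothetical, not anything from SQLite. A known length can settle an equality test without reading either string, but it cannot settle an ordering: "aaa" and "aab" have the same length, and only scanning the bytes tells them apart.

```c
#include <stddef.h>
#include <string.h>

/* Equality on length-counted strings: a length mismatch answers the
 * question without touching either body. */
int counted_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    if (alen != blen)
        return 0;                      /* skipped the scan entirely */
    return memcmp(a, b, alen) == 0;
}

/* Ordering on length-counted strings (byte-wise, an assumption): the
 * bytes must be compared first; length only breaks the tie when one
 * string is a prefix of the other. Returns -1, 0 or 1. */
int counted_collate(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int c = memcmp(a, b, n);

    if (c != 0)
        return c < 0 ? -1 : 1;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```

With these, `counted_equal("aaa", 3, "aa", 2)` is decided by the lengths alone, while `counted_collate("aaa", 3, "aab", 3)` has to read up to the third byte before it can answer.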
Re: [sqlite] Making sqlite support unicode?
At 09:39 AM 26/10/2003, you wrote:
> Most deliberately unicode-aware task domains have chosen UTF-8- simply
> because it's a path of minimal resistance. Microsoft chose double-byte
> unicode encoding (often referred to as UCS16).

In case anyone is wondering why Microsoft made such an OBVIOUSLY bad choice by going with UCS16 over UTF-8, it is because at the time UCS16 covered all the code points and UTF-8 hadn't been invented yet. It was figured that handling double-byte chars would be the easiest option. But now we have more code points than can fit in 16 bits, so UTF-16 and UTF-8 were invented.

> However, when your intent is portability, you tend to want to find common
> footholds. More systems are tolerant to UTF-8 than to UCS16. Period.

BTW, I don't disagree with that!

> Further: I always read statements like "Microsoft C/C++ is the largest
> most popular language platform in the world" as foolish sentiment. These
> people obviously don't know what they're talking about and need a good
> healthy dose of some reality.

A subtle barb fired in my direction. I think statements like that are even more foolish and even more out of touch with reality! Oh sure, TRON is the most used OS in the world. Does sqlite even compile on TRON? How many developers program for it? Windows is installed on 96% of all desktop computers and somewhere around 30% of servers. That's a very large number of machines, but you dismiss that pretty easily. And Visual Basic is probably still the most popular programming language in the world! (and yes, that should make you shudder!)

But, for the record, I spend 99% of my time developing for unix in a programming language that really knows only ASCII (with some exceptions). Here's a hint: this language will, in its next major version, be a very large source of new sqlite users. The other 1% of the time is spent developing in a UCS-16 platform that isn't Windows. Even if I'm not in the Microsoft camp, I can acknowledge that it has some significance.

> Too many rude users talk about how inconvenient their life now is because
> here is this wonderful and rather free toolkit that decided to make the
> life of the author easier- and most of its users or potential users
> easier, at the expense of their own.

BTW, all I had asked is if anyone had done the work of making sqlite unicode-aware (I did ask for UCS16, however). I hadn't seen anything to indicate it did anything but straight ASCII. Someone pointed out that it did handle UTF-8 w/ the appropriate #define and that is certainly good enough for my task. If I hadn't asked, I wouldn't have known.

Later,

Programmer/Analyst
WebMotif Net Services, Inc.
1.800.332.WIPS
Direct: 604.299.1908
[EMAIL PROTECTED]
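For readers unsure what "more code points than can fit in 16 bits" means in practice, here is a minimal sketch in C, assuming the standard UTF-16 and UTF-8 encoding rules. The function names `utf16_encode` and `utf8_encode` are hypothetical. A code point above U+FFFF cannot be stored in one 16-bit unit, so UTF-16 spends two units (a surrogate pair) on it, while UTF-8 spends four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-16. Returns the number of 16-bit units
 * written (1 or 2), or 0 if cp is not a valid scalar value. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {                /* fits in a single 16-bit unit */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                     /* 20 bits left, split 10 + 10 */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    return 2;
}

/* Encode one code point as UTF-8. Returns the number of bytes written
 * (1 to 4), or 0 if cp is not a valid scalar value. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

For example, U+1D11E (a musical clef symbol) comes out as the two units D834 DD1E in UTF-16 and as the four bytes F0 9D 84 9E in UTF-8; a fixed 16-bit encoding has no way to represent it at all.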
Re: [sqlite] Making sqlite support unicode?
On Sun, 2003-10-26 at 14:26, Wayne Venables wrote:
> >Further: I always read statements like "Microsoft C/C++ is the largest
> >most popular language platform in the world" as foolish sentiment. These
> >people obviously don't know what they're talking about and need a good
> >healthy dose of some reality.
>
> A subtle barb fired in my direction. I think statements like that are even
> more foolish and even more out of touch with reality! Oh sure, TRON is the
> most used OS in the world. Does sqlite even compile on TRON? How many
> developers program for it? Windows is installed on 96% of all desktop
> computers and somewhere around 30% of servers. That's a very large number
> of machines, but you dismiss that pretty easily. And Visual Basic is
> probably still the most popular programming language in the world! (and
> yes, that should make you shudder!)

If you're targeting a large platform- however large- even monopolistic large, you are NOT WRITING PORTABLE CODE.

I can dismiss however large Windows is simply by stating that it is not portable. SQLite is rather portable: It builds on many different platforms. It even _works_ on many different platforms. I have no idea how you got the idea that "portable code" meant runs on Windows, but you should drop it.

On your other points:

1. SQLite does compile on TRON with very little help. TRON is nearly POSIX complete (although not certified, AFAIK).

2. The largest number of developers are presently mobilized for unixish platforms (POSIX). Windows claims to have a POSIX abstraction layer that doesn't comply with POSIX.1, but because fewer people know this each year, my statistics may be skewed by this.

3. The popularity of Visual Basic is greatly exaggerated. Using google:

"Visual Basic" programmer resume: 61,200
"C" programmer resume: 165,000
"C++" programmer resume: 92,500
"Java" programmer resume: 112,000
"Javascript" programmer resume: 48,900
"SQL" programmer resume: 84,000

I have no idea how you got the ideas that you have on this subject, but it really surprises me whenever someone wants to talk about the popularity of a language. You can find metrics to support whatever you like, and I have no idea which ones you are using.

I'd probably believe that more dollars are spent on Visual Basic Training every year, but I also wouldn't consider that a fair indicator of how many people are actually using it, or how many commercial programs (or better quality) are being turned out in a year.

I wouldn't suggest that has anything to do with how long it's been around or how many people that actually HAVE USED IT and know some other programming language and yet still use Visual Basic.

> But, for the record, I spend 99% of my time developing for unix in a
> programming language that really knows only ASCII (with some
> exceptions). Here's a hint: this language will, in its next major
> version, be a very large source of new sqlite users. The other 1% of the
> time is spent developing in a UCS-16 platform that isn't Windows. Even if
> I'm not in the Microsoft camp, I can acknowledge that it has some significance.

I have no idea what you're talking about. Do you?

> >Too many rude users talk about how inconvenient their life now is
> >because here is this wonderful and rather free toolkit that decided to
> >make the life of the author easier- and most of its users or potential
> >users easier, at the expense of their own.
>
> BTW, all I had asked is if anyone had done the work of making sqlite
> unicode-aware (I did ask for UCS16, however). I hadn't seen anything to
> indicate it did anything but straight ASCII. Someone pointed out that it
> did handle UTF-8 w/ the appropriate #define and that is certainly good
> enough for my task. If I hadn't asked, I wouldn't have known.

No, you said:

"Unfortunately that still means there is a performance hit converting all data in and out of the library from UTF-8 to UCS16. A large number of operating systems and programming languages store strings natively as UCS16."

And I happily gave you suggestions that would be useful in minimizing your "performance hit" as long as you weren't writing portable code.

Knowing how to avoid a "performance hit"- especially one that a user hasn't bothered to profile yet- is a good thing to know, and a good thing to have in the archives, so that when someone DOES end up profiling and sees "oh boy, I'm spending a lot of time converting this into UTF-8 and back again!" they'll know what they can do.

In some rather weak backpedaling, you tried to "justify" your large number of operating systems and languages by listing three- only one of which was actually correct:

"Win32 is unicode (UCS16). Writing C/C++ w/ Win32 generally involves using wide char strings. Visual Basic natively stores strings as unicode. Java natively stores strings as unicode. I'd say that covers a lot more than 1%. And if you're
Re: [sqlite] Making sqlite support unicode?
On Sun, Oct 26, 2003 at 12:39:40PM -0500, Mrs. Brisby ([EMAIL PROTECTED]) wrote:
> You did; that's okay. UCS16 != UTF-8.

Uh, OK. Thanks for the detailed explanation.

Marco Fioretti

--
Marco Fioretti m.fioretti, at the server inwind.it
Red Hat for low memory http://www.rule-project.org/en/

It's always socially unacceptable to be right too soon. -- RAH
Re: [sqlite] Making sqlite support unicode?
On Fri, Oct 24, 2003 at 09:24:44PM -0400, Mrs. Brisby ([EMAIL PROTECTED]) wrote:
> FYI, nobody said internal use of "unicode" - just "UCS16". Plan9
> doesn't. Linux doesn't.

Not correct, unless I misunderstood the original question. At least Red Hat started to use unicode (UTF-8?) as default setting in version 8.0.

Ciao,
Marco Fioretti

--
Marco Fioretti m.fioretti, at the server inwind.it
Red Hat for low memory http://www.rule-project.org/en/

Technological progress has merely provided us with more efficient means for going backwards.
Aldous Huxley
Re: [sqlite] Making sqlite support unicode?
At 05:39 PM 24/10/2003, you wrote:
> On Thu, 2003-10-23 at 13:31, Wayne Venables wrote:
> > Unfortunately that still means there is a performance hit converting all
> > data in and out of the library from UTF-8 to UCS16. A large number of
> > operating systems and programming languages store strings natively as UCS16.
>
> Even if you meant larger than 1% of operating systems AND languages store
> strings natively as UCS16, you'd still be incorrect.

I believe compiling sqlite for UTF-8 should solve my problems. Just have to pay the conversion price.

But just for the sake of argument, I will respond. Win32 is unicode (UCS16). Writing C/C++ w/ Win32 generally involves using wide char strings. Visual Basic natively stores strings as unicode. Java natively stores strings as unicode. I'd say that covers a lot more than 1%.

And if you're coding for a non-unicode operating system, I sure hope you're using unicode anyway or otherwise you're alienating a large portion of users. A purely ASCII database in this day and age is terribly backwards. Thankfully, sqlite has the UTF-8 option.

Later,

Programmer/Analyst
WebMotif Net Services, Inc.
1.800.332.WIPS
Direct: 604.299.1908
[EMAIL PROTECTED]
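The "conversion price" mentioned above is a single pass over the data. Below is a minimal sketch in C of converting UTF-8 text to 16-bit code units, assuming the caller supplies the output buffer. The function name `utf8_to_utf16` is hypothetical and is not a SQLite or Win32 API, and the sketch is simplified: it rejects truncated or malformed sequences but does not reject overlong encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert len bytes of UTF-8 into UTF-16 code units.
 * Returns the number of units written, or (size_t)-1 if the input is
 * malformed or the output buffer (cap units) is too small.
 * Simplification: overlong encodings are not rejected. */
size_t utf8_to_utf16(const unsigned char *in, size_t len,
                     uint16_t *out, size_t cap)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned char b = in[i++];
        uint32_t cp;
        size_t extra;                  /* continuation bytes expected */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;        /* stray continuation or bad lead */

        if (extra > len - i)
            return (size_t)-1;         /* sequence cut off by end of input */
        while (extra-- > 0) {
            unsigned char c = in[i++];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 > cap) return (size_t)-1;
            out[n++] = (uint16_t)cp;
        } else {                       /* needs a surrogate pair */
            if (n + 2 > cap) return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }
    return n;
}
```

The work is linear in the input length with a few shifts and masks per byte, so whether it matters in practice is something only profiling a real workload can show.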