Re: [sqlite] Making sqlite support unicode?

2003-10-29 Thread Mrs. Brisby
On Mon, 2003-10-27 at 10:27, [EMAIL PROTECTED] wrote: 
> On Sun, 26 Oct 2003 23:36:55 -0500
> "Mrs. Brisby" <[EMAIL PROTECTED]> wrote:
> > It's good to use null-terminated in many cases; especially in collating
> > and sorting. It helps to understand that in those cases you stop
> > processing _after_ you see the terminator (and treat the terminator as
> > it is: zero.)
> 
> Collating involves length. If the data length is known prior to scanning
> the data, in some cases you can skip an item that doesn't match without
> scanning the data body. It helps to understand that in those cases you
> stop processing _before_ you see the terminator or anything else.

No, it cannot. How are the following tokens collated?

aaa
aab
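
For what it's worth, the distinction can be sketched in C. This is only an
illustration under assumptions: `counted_str`, `counted_equal` and
`counted_compare` are hypothetical names, not anything from SQLite. Knowing
the length up front lets an equality test bail out early, but ordering two
tokens of the same length, such as the pair above, still means reading bytes
until the first difference.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-prefixed string; name and layout are assumptions. */
typedef struct {
    size_t len;
    const char *data;
} counted_str;

/* Equality: a length mismatch answers the question without reading bytes. */
static int counted_equal(counted_str a, counted_str b)
{
    assert(a.data != NULL && b.data != NULL);
    if (a.len != b.len)
        return 0;
    return memcmp(a.data, b.data, a.len) == 0;
}

/*
 * Ordering: compare bytes over the common prefix; the lengths only break
 * the tie when one string is a prefix of the other.
 * Returns -1, 0 or 1.
 */
static int counted_compare(counted_str a, counted_str b)
{
    size_t n;
    int c;

    assert(a.data != NULL && b.data != NULL);
    n = a.len < b.len ? a.len : b.len;
    c = memcmp(a.data, b.data, n);
    if (c != 0)
        return c < 0 ? -1 : 1;
    if (a.len == b.len)
        return 0;
    return a.len < b.len ? -1 : 1;
}
```

With aaa and aab both of length 3, counted_compare has to reach the third
byte before it can answer; only counted_equal gets to skip the body, and
only when the lengths differ.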



> > UTF-16 is NOT used in HFS+. HFS+ still uses ASCII with some "tricks".
> > UFS is what's "preferred" in MacOS X, and it doesn't use UTF-16 either.
> > UTF-16 isn't what we're talking about anyway, it's UCS16.
> MacOS X uses "Unicode" as its native encoding. The Unicode encoding
> most used in MacOS X is UTF-16; it uses UTF-8 only to call the BSD API.
> It's a kind of hybrid, but UTF-8 is just used for compatibility with the
> Unix parts of MacOS X, and the other non-Unix pieces of MacOS X, which
> are why MacOS X is a Mac, use UTF-16 internally, including Carbon,
> Cocoa and ATSUI.
> 
> For HFS+, from Apple's Technical Note TN2078 (Migrating to FSRefs & long
> Unicode names from FSSpecs):
> http://developer.apple.com/technotes/tn2002/tn2078.html

See the parts in http://developer.apple.com/technotes/tn/tn1150.html
regarding the HFS wrapper; this was what I thought I was remembering;
my memory on this subject is admittedly spotty.

I don't recall this information being quite so easy to google; thanks for
correcting me on this.



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [sqlite] Making sqlite support unicode?

2003-10-26 Thread Wayne Venables
At 09:39 AM 26/10/2003, you wrote:
> Most deliberately unicode-aware task domains have chosen UTF-8- simply
> because it's a path of minimal resistance. Microsoft chose double-byte
> unicode encoding (often referred to as UCS16).

In case anyone is wondering why Microsoft made such an OBVIOUSLY bad choice
by going with UCS16 over UTF-8, it is because at the time UCS16 covered all
the code points and UTF-8 hadn't been invented yet.  It was figured that
handling double-byte chars would be the easiest option.  But now we have
more code points than can fit in 16 bits, so UTF-16 and UTF-8 were invented.
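
As a rough illustration of why 16 bits ran out, here is a minimal C sketch
of how UTF-16 represents a code point above U+FFFF as a surrogate pair. The
function name `utf16_encode` is made up for this example; it is not part of
sqlite or any library mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-16.
 * Writes 1 or 2 16-bit units to out and returns how many were written,
 * or 0 if cp is a surrogate value or lies beyond U+10FFFF.
 */
static int utf16_encode(uint32_t cp, uint16_t *out)
{
    assert(out != NULL);

    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                 /* reserved for surrogates */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;    /* fits in one unit, same as UCS16 */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                 /* outside the Unicode range */

    cp -= 0x10000;                /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+1D11E comes out as the two units 0xD834 0xDD1E, which a
strict one-unit-per-character UCS16 reader would see as two characters.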

> However, when your intent is portability, you tend to want to find
> common footholds. More systems are tolerant to UTF-8 than to UCS16.
> Period.

BTW, I don't disagree with that!

> Further: I always read statements like "Microsoft C/C++ is the largest
> most popular language platform in the world" as foolish sentiment. These
> people obviously don't know what they're talking about and need a good
> healthy dose of some reality.

A subtle barb fired in my direction.  I think statements like that are even
more foolish and even more out of touch with reality!  Oh sure, TRON is the 
most used OS in the world.  Does sqlite even compile on TRON?  How many 
developers program for it?  Windows is installed on 96% of all desktop 
computers and somewhere around 30% of servers.  That's a very large number 
of machines, but you dismiss that pretty easily.  And Visual Basic is 
probably still the most popular programming language in the world! (and 
yes, that should make you shudder!)

But, for the record, I spend 99% of my time developing for unix in a 
programming language that really knows only ASCII (with some 
exceptions).  Here's a hint: this language will, in its next major
version, be a very large source of new sqlite users.  The other 1% of the
time is spent developing in a UCS-16 platform that isn't Windows.  Even if
I'm not in the Microsoft camp, I can acknowledge that it has some significance.

> Too many rude users talk about how inconvenient their life now is
> because here is this wonderful and rather free toolkit that decided to
> make the life of the author easier- and most of its users or potential
> users easier, at the expense of their own.

BTW, all I had asked is if anyone had done the work of making sqlite
unicode-aware (I did ask for UCS16, however).  I hadn't seen anything to
indicate it did anything but straight ASCII.  Someone pointed out that it
did handle UTF-8 w/ the appropriate #define and that is certainly good
enough for my task.  If I hadn't asked, I wouldn't have known.

Later,

Programmer/Analyst
WebMotif Net Services, Inc.
1.800.332.WIPS
Direct: 604.299.1908
[EMAIL PROTECTED]


Re: [sqlite] Making sqlite support unicode?

2003-10-26 Thread Mrs. Brisby
On Sun, 2003-10-26 at 14:26, Wayne Venables wrote:
> >Further: I always read statements like "Microsoft C/C++ is the largest
> >most popular language platform in the world" as foolish sentiment. These
> >people obviously don't know what they're talking about and need a good
> >healthy dose of some reality.
> 
> A subtle barb fired in my direction.  I think statements like that are even 
> more foolish and even more out of touch with reality!  Oh sure, TRON is the 
> most used OS in the world.  Does sqlite even compile on TRON?  How many 
> developers program for it?  Windows is installed on 96% of all desktop 
> computers and somewhere around 30% of servers.  That's a very large number 
> of machines, but you dismiss that pretty easily.  And Visual Basic is 
> probably still the most popular programming language in the world! (and 
> yes, that should make you shudder!)

If you're targeting a large platform- however large- even monopolistic
large, you are NOT WRITING PORTABLE CODE. I can dismiss however large
Windows is simply by stating that it is not portable. SQLite is rather
portable: It builds on many different platforms. It even _works_ on many
different platforms.

I have no idea how you got the idea that "portable code" meant runs on
Windows, but you should drop it.

On your other points:

1. SQLite does compile on TRON with very little help. TRON is nearly
POSIX complete (although not certified, AFAIK).

2. The largest number of developers are presently mobilized for unixish
platforms (POSIX). Windows claims to have a POSIX abstraction layer that
doesn't comply with POSIX.1, but because fewer people know this each
year, my statistics may be skewed by this.

3. The popularity of Visual Basic is greatly exaggerated. Using google:
"Visual Basic" programmer resume: 61,200
"C" programmer resume: 165,000
"C++" programmer resume: 92,500
"Java" programmer resume: 112,000
"Javascript" programmer resume: 48,900
"SQL" programmer resume: 84,000

I have no idea how you got the ideas that you have on this subject, but
it really surprises me whenever someone wants to talk about the
popularity of a language. You can find metrics to support whatever you
like, and I have no idea which ones you are using.

I'd probably believe that more dollars are spent on Visual Basic
Training every year, but I also wouldn't consider that a fair indicator
of how many people are actually using it, or how many commercial
programs (or better quality) are being turned out in a year. I wouldn't
suggest that has anything to do with how long it's been around or how
many people that actually HAVE USED IT and know some other programming
language and yet still use Visual Basic.


> But, for the record, I spend 99% of my time developing for unix in a 
> programming language that really knows only ASCII (with some 
> exceptions).  Here's a hint: this language will, in its next major
> version, be a very large source of new sqlite users.  The other 1% of the
> time is spent developing in a UCS-16 platform that isn't Windows.  Even if
> I'm not in the Microsoft camp, I can acknowledge that it has some significance.

I have no idea what you're talking about. Do you?


> >Too many rude users talk about how inconvenient their life now is
> >because here is this wonderful and rather free toolkit that decided to
> >make the life of the author easier- and most of its users or potential
> >users easier, at the expense of their own.
> 
> BTW, all I had asked is if anyone had done the work of making sqlite 
> unicode-aware (I did ask for UCS16, however).  I hadn't seen anything to 
> indicate it did anything but straight ASCII.  Someone pointed out that it 
> did handle UTF-8 w/ the appropriate #define and that is certainly good 
> enough for my task.  If I hadn't asked, I wouldn't have known.

No, you said:

"Unfortunately that still means there is a performance hit converting
all data in and out of the library from UTF-8 to UCS16.  A large number
of operating systems and programming languages store strings natively as
UCS16."

And I happily gave you suggestions that would be useful in minimizing
your "performance hit" as long as you weren't writing portable code.
Knowing how to avoid a "performance hit", especially one that a user
hasn't bothered to profile yet, is a good thing to know and a good thing
to have in the archives, so that when someone DOES end up profiling and
sees "oh boy, I'm spending a lot of time converting this into UTF-8 and
back again!" they'll know what they can do.
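
For anyone who does end up profiling, the conversion being measured is a
single linear pass. A minimal sketch in C, assuming the input is plain
UCS16 in the sense used in this thread (16-bit units, no surrogate pairs)
and using a hypothetical function name `ucs2_to_utf8`; it is not an SQLite
API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert n 16-bit units to NUL-terminated UTF-8 in out (capacity cap).
 * Assumption: input is BMP-only; surrogate values are not treated specially.
 * Returns the number of bytes written, not counting the NUL,
 * or (size_t)-1 if out is too small.
 */
static size_t ucs2_to_utf8(const uint16_t *in, size_t n,
                           unsigned char *out, size_t cap)
{
    size_t w = 0;
    size_t i;

    assert(in != NULL && out != NULL);
    if (cap == 0)
        return (size_t)-1;

    for (i = 0; i < n; i++) {
        uint16_t u = in[i];
        size_t need = u < 0x80 ? 1 : (u < 0x800 ? 2 : 3);

        if (cap - w <= need)      /* always keep one byte for the NUL */
            return (size_t)-1;

        if (need == 1) {
            out[w++] = (unsigned char)u;
        } else if (need == 2) {
            out[w++] = (unsigned char)(0xC0 | (u >> 6));
            out[w++] = (unsigned char)(0x80 | (u & 0x3F));
        } else {
            out[w++] = (unsigned char)(0xE0 | (u >> 12));
            out[w++] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
            out[w++] = (unsigned char)(0x80 | (u & 0x3F));
        }
    }
    out[w] = '\0';
    return w;
}
```

Each input unit is read once and produces at most three output bytes, so
the cost grows linearly with the data; whether that matters for a given
application is exactly what a profile would show.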

In some rather weak backpedaling, you tried to "justify" your large
number of operating systems and languages by listing three- only one of
which was actually correct:

"Win32 is unicode (UCS16).  Writing C/C++ w/ Win32 generally involves
using wide char strings.  Visual Basic natively stores strings as
unicode.  Java natively stores strings as unicode.  I'd say that covers
a lot more than 1%.  And if you're 

Re: [sqlite] Making sqlite support unicode?

2003-10-26 Thread M. Fioretti
On Sun, Oct 26, 2003 at 12:39:40PM -0500, Mrs. Brisby ([EMAIL PROTECTED])
wrote:
 
> You did; that's okay. UCS16 != UTF-8.

Uh, OK. Thanks for the detailed explanation.

Marco Fioretti
-- 
Marco Fioretti m.fioretti, at the server inwind.it
Red Hat for low memory http://www.rule-project.org/en/

It's always socially unacceptable to be right too soon. -- RAH




Re: [sqlite] Making sqlite support unicode?

2003-10-25 Thread M. Fioretti
On Fri, Oct 24, 2003 at 09:24:44PM -0400, Mrs. Brisby ([EMAIL PROTECTED])
wrote:
> 
> FYI, nobody said internal use of "unicode" - just "UCS16". Plan9
> doesn't. Linux doesn't.

Not correct, unless I misunderstood the original question. At least
Red Hat started to use unicode (UTF-8?) as default setting in version
8.0 

Ciao,
Marco Fioretti


-- 
Marco Fioretti m.fioretti, at the server inwind.it
Red Hat for low memory http://www.rule-project.org/en/

Technological progress has merely provided us with more efficient
means for going backwards. -- Aldous Huxley




Re: [sqlite] Making sqlite support unicode?

2003-10-24 Thread Wayne Venables
At 05:39 PM 24/10/2003, you wrote:
> On Thu, 2003-10-23 at 13:31, Wayne Venables wrote:
> > Unfortunately that still means there is a performance hit converting all
> > data in and out of the library from UTF-8 to UCS16.  A large number of
> > operating systems and programming languages store strings natively as
> > UCS16.
>
> Even if you meant larger than 1% of operating systems AND languages store
> strings natively as UCS16, you'd still be incorrect.

I believe compiling sqlite for UTF-8 should solve my problems.  Just have
to pay the conversion price.

But just for the sake of argument, I will respond.  Win32 is unicode 
(UCS16).  Writing C/C++ w/ Win32 generally involves using wide char 
strings.  Visual Basic natively stores strings as unicode.  Java natively 
stores strings as unicode.  I'd say that covers a lot more than 1%.  And if 
you're coding for a non-unicode operating system, I sure hope you're using 
unicode anyway or otherwise you're alienating a large portion of users.  A
purely ASCII database in this day and age is terribly backwards.
Thankfully, sqlite has the UTF-8 option.

Later,

Programmer/Analyst
WebMotif Net Services, Inc.
1.800.332.WIPS
Direct: 604.299.1908
[EMAIL PROTECTED]