[sqlite] Windows A and W APIs dual support

2016-02-13 Thread R Smith


On 2016/02/13 6:52 AM, J Decker wrote:
> On Fri, Feb 12, 2016 at 8:00 PM, Igor Tandetnik  wrote:
>> On 2/12/2016 10:44 PM, J Decker wrote:
>>>
>>> I expect it to take any string 
>>
>> What is the basis of this expectation, other than wishful thinking?
> I don't think expectation and wishful thinking have anything to do
> with each other.

That's exactly Igor's point (if I may). The two things are very 
different. You talk about expectation as if it meant "what you yourself 
expect from the world" - which is akin to wishful thinking - while the 
normal meaning of "expectation" in these contexts is that a result 
should match what the documentation describes, i.e. what is to be 
expected considering the workings of the system.


> Though I expect standards would look at what the world really needs
> and implement core functionality?  That's hardly wishful thinking.
> Well I guess it is, because I repeatedly have found myself
> disappointed in the lack of considerations in standards.  Yes there
> are even unicode libraries for posix; but it's a huge expense for a
> couple hundred lines of code.

Very true.

> and if it's something I needed for
> interop, why doesn't everyone?

This furthers the point:
Your expectation isn't unreasonable at all from a needs-based or wishful 
point of view (I'm sure many of us wish it too) - but it is unreasonable 
to expect that which is not explicitly (or even implicitly) promised by 
the documentation/designer.

The boiled-down difference being that your original statement, to 
paraphrase, said: "It doesn't work", while it should have said: "It 
doesn't work for me" - and so was rightly challenged.

Cheers,
Ryan




[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Scott Robison
On Fri, Feb 12, 2016 at 8:05 PM, Warren Young  wrote:

> On Feb 12, 2016, at 4:42 PM, Scott Robison 
> wrote:
> >
> > I find it kind of interesting that Microsoft takes a lot
> > of (deserved) flack for not adhering to standards, yet UTF-8 came about
> > specifically because some didn't want to use UCS-2
>
> ...for good reason.  UCS-2/UTF-16 isn't compatible with C strings.  I know
> you know this, but it's a huge consideration.  Outside of Mac OS Classic
> and a few even smaller enclaves, C and its calling standards were the
> lingua franca of the computing world when Unicode first came on the scene,
> and those enclaves are now all but gone.
>
> We'll be living with the legacy of C for quite a long time yet.  Until C
> is completely stamped out, we'll have to accommodate 0-terminated strings
> somehow.
>

UCS (which was by definition a 2 byte encoding; UCS-2 is a retronym) was
not a "standard" until late 1991. C89/C90 provided for definition of a wide
character type. People didn't want to (perhaps couldn't) use it (and I can
understand why). My point was just that Microsoft was the first to really
embrace the standard as written, not tweak it into something else.

Windows bought into the idea of Unicode and/or UCS before they were unified
and standardized to their current form. That locked Windows into what we
call the UCS-2 format, when Unicode was "guaranteed" to never need more
than 2^16 code points. Later unification of the two standards expanded the
potential code point space to U+7FFFFFFF, and later still restricted it to
U+10FFFF to ensure that UTF-16 could address all of the potential standard
code points.
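
The U+10FFFF ceiling follows from the surrogate arithmetic: 1024 high surrogates times 1024 low surrogates address 0x100000 code points above the BMP. A minimal C sketch of that calculation is below; `combine_surrogates` is a hypothetical helper name used only for illustration, not something posted in this thread or taken from a library.

```c
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into a single code point.
   Returns 0 if the inputs are not a valid high/low pair; 0 is only this
   sketch's error value (U+0000 never needs a surrogate pair). */
uint32_t combine_surrogates(uint16_t hi, uint16_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF || lo < 0xDC00 || lo > 0xDFFF)
        return 0;
    /* 10 bits come from each surrogate, offset by the size of the BMP. */
    return 0x10000u + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
}
```

With these definitions, the lowest pair (D800, DC00) maps to U+10000 and the highest pair (DBFF, DFFF) maps to U+10FFFF.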


> > Had Microsoft come up with it first, I'm sure they'd be crucified by
> > some of the same people who today are critical of them for using wide
> > characters instead of UTF-8!
>
> I think if we were to send a copy of the Unicode 8.0 standard back to the
> early 1960s as a model for those designing ASCII, Unicode would look very
> different today.
>

I think you're probably correct. Though who knows. The industry still
hadn't really agreed to 8 bit bytes. Memory was expensive, and you did what
you had to to minimize its use. 6 bit bytes/characters, 2 digit year
encodings. A lot of people today just can't imagine caring that much about
RAM (given how much of it is used to share pictures of kittens), but it was
a significant savings that translated to real money.


> UCS-2 feels like the 90's version of "640 kB is enough for everything!" to
> me, and UTF-16 like bank switching/segmentation.  We're going to be stuck
> with those half-measure decisions for decades now.  Thanks, Microsoft.
>

Thanks Unicode / ISO-10646. They set the standard. Microsoft adopted it.


> The POSIX platforms did the right thing here: UTF-32 when speed matters
> more than space, and UTF-8 when space or compatibility matters more.
>

They had the luxury of waiting until UTF-8 and UCS-4 (now UTF-32) existed
before making those decisions. 20/20 hindsight.

Note: I like UTF-8. I try to use it everywhere and only convert as needed
to suit the API. I certainly think Microsoft has had plenty of time to more
thoroughly integrate UTF-8 into the APIs so that you don't have to convert
back and forth. I just find it funny that Microsoft is condemned by so many
for adhering to the standards / draft standards while POSIX systems were
able to embrace and extend. :)

-- 
Scott Robison


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 10:44 PM, J Decker wrote:
> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik  wrote:
>> It performs the conversion it is documented to perform. It indeed doesn't
>> perform the conversion that you, for reasons unclear, expect it to perform.
>> In other words, you engage in wishful thinking, and then blame the messenger
>> for failure of your wishes to materialize.
>
> I expect it to take any string

What is the basis of this expectation, other than wishful thinking?

Again, if you need to convert specifically between UTF-16 and UTF-8, 
there are API functions that are documented to do that, and they do 
work. They are WideCharToMultiByte and MultiByteToWideChar. wcstombs and 
mbstowcs are not documented to do that, and, quite unsurprisingly, they 
don't work for that.
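
To make concrete what a UTF-8 to UTF-16 conversion computes, here is a portable C sketch. It is not the Windows API named above and not code anyone posted; `utf8_to_utf16` and its error convention are assumptions made purely for illustration. On Windows the documented route remains MultiByteToWideChar with the CP_UTF8 code page.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode NUL-terminated UTF-8 from `src` into UTF-16 code units in `dst`.
   Returns the number of code units written (excluding the terminator),
   or (size_t)-1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_len)
{
    /* Smallest code point allowed for 0..3 continuation bytes (rejects overlongs). */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s) {
        uint32_t cp;
        int extra, i;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;       /* stray continuation or invalid lead byte */
        s++;

        for (i = 0; i < extra; i++, s++) {
            if ((*s & 0xC0) != 0x80)  /* also catches a premature NUL */
                return (size_t)-1;
            cp = (cp << 6) | (*s & 0x3Fu);
        }

        /* Reject overlong forms, encoded surrogates, and values past U+10FFFF. */
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        if (cp < 0x10000) {
            if (n + 1 >= dst_len) return (size_t)-1;   /* keep room for terminator */
            dst[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= dst_len) return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 + (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        }
    }
    if (dst_len == 0) return (size_t)-1;
    dst[n] = 0;
    return n;
}
```

Fed the four bytes f0 90 80 81 from the test program in this thread, the sketch yields d800 dc01. The three bytes ed a0 81 are rejected, because they encode a lone surrogate rather than a character.
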
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 10:14 PM, J Decker wrote:
> mbstowcs( out, utf8, 5 );

mbstowcs expects the string in the codepage of the current locale - 
which is never UTF-8.

> for( n = 0; n < 5; n++ )
> printf( "%04x ", out[n] );  // output is 00f0 0090 0080 0081; expect d800 dc01

Why do you expect that? It appears your system uses Western European 
codepage (aka Latin-1). You pass a character "\xf0" which, when taken to 
be encoded in that codepage, is quite properly converted to U+00F0.
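
The observed output is what a byte-for-byte widening produces. As a rough illustration, assuming a plain ISO 8859-1 mapping (Windows code page 1252 assigns different characters to some bytes in the 0x80-0x9F range), the conversion amounts to the C sketch below. `latin1_to_wide` is a hypothetical name, not a library function.

```c
#include <stddef.h>
#include <wchar.h>

/* Widen `n` bytes as ISO 8859-1: each byte value becomes the code point
   with the same number. No multi-byte sequences are recognised, so the
   lead byte of a UTF-8 sequence is treated as an ordinary character. */
void latin1_to_wide(const char *src, wchar_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        dst[i] = (wchar_t)(unsigned char)src[i];
}
```

Applied to the bytes f0 90 80 81, this gives 00f0 0090 0080 0081, matching the quoted output.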

> for( n = 0; n < 5; n++ )
> printf( "%02x ", chout[n] );  // output is 00 00 00 00

U+10001 is (unsurprisingly) not representable in your current ANSI 
codepage, so wcstombs call fails (I can't help but notice that you 
aren't checking any calls for failure) and leaves the output buffer 
unchanged.
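
Checking for failure could look like the following C sketch. `narrow_checked` is a hypothetical wrapper written for illustration; the only library call it relies on is the standard wcstombs, which returns (size_t)-1 when it meets a wide character it cannot convert in the current locale.

```c
#include <stdlib.h>
#include <wchar.h>

/* Convert `src` to the current locale's multibyte encoding.
   Returns 0 on success, -1 if a character cannot be represented or if
   the result plus its terminator does not fit in `dst_size` bytes. */
int narrow_checked(const wchar_t *src, char *dst, size_t dst_size)
{
    size_t r = wcstombs(dst, src, dst_size);

    if (r == (size_t)-1)
        return -1;      /* unconvertible character in this locale */
    if (r >= dst_size)
        return -1;      /* buffer filled completely: no terminator was written */
    return 0;
}
```

A caller would then test the result, for example `if (narrow_checked(utf16, chout, sizeof chout) != 0) { /* report the error */ }`, instead of printing a buffer that may never have been filled.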

> so it does no useful conversion either way :)

It performs the conversion it is documented to perform. It indeed 
doesn't perform the conversion that you, for reasons unclear, expect it 
to perform. In other words, you engage in wishful thinking, and then 
blame the messenger for failure of your wishes to materialize.
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Clemens Ladisch
Olivier Mascia wrote:
> Are there Windows platforms, supported by SQLite source code of course, where 
> the 'W' version of the APIs are not available?

Once upon a time, SQLite supported Windows 95/98/Me.

Nowadays, the code is still there, but untested.


Regards,
Clemens


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 8:00 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:44 PM, J Decker wrote:
>>
>> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik 
>> wrote:
>>>
>>> It performs the conversion it is documented to perform. It indeed doesn't
>>> perform the conversion that you, for reasons unclear, expect it to
>>> perform.
>>> In other words, you engage in wishful thinking, and then blame the
>>> messenger
>>> for failure of your wishes to materialize.
>>
>>
>> I expect it to take any string
>
>
> What is the basis of this expectation, other than wishful thinking?
I don't think expectation and wishful thinking have anything to do
with each other.

Though I expect standards would look at what the world really needs
and implement core functionality?  That's hardly wishful thinking.
Well I guess it is, because I repeatedly have found myself
disappointed in the lack of considerations in standards.  Yes there
are even unicode libraries for posix; but it's a huge expense for a
couple hundred lines of code.  and if it's something I needed for
interop, why doesn't everyone?

>
> Again, if you need to convert specifically between UTF-16 and UTF-8, there
> are API functions that are documented to do that, and they do work. They are
> WideCharToMultiByte and MultiByteToWideChar. wcstombs and mbstowcs are not
> documented to do that, and, quite unsurprisingly, they don't work for that.
>
> --
> Igor Tandetnik
>
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 8:00 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:44 PM, J Decker wrote:
>>
>> On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik 
>> wrote:
>>>
>>> It performs the conversion it is documented to perform. It indeed doesn't
>>> perform the conversion that you, for reasons unclear, expect it to
>>> perform.
>>> In other words, you engage in wishful thinking, and then blame the
>>> messenger
>>> for failure of your wishes to materialize.
>>
>>
>> I expect it to take any string
>
>
> What is the basis of this expectation, other than wishful thinking?
>
> Again, if you need to convert specifically between UTF-16 and UTF-8, there
> are API functions that are documented to do that, and they do work. They are
> WideCharToMultiByte and MultiByteToWideChar. wcstombs and mbstowcs are not
> documented to do that, and, quite unsurprisingly, they don't work for that.
>
and what exists for platforms other than windows?
doesn't matter though.  I have a solution that works on all platforms
that's the same name and doesn't require some #ifdef to work.
> --
> Igor Tandetnik
>


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Igor Tandetnik
On 2/12/2016 7:24 PM, J Decker wrote:
> well mbstowcs and vice versa only understand 16 bit encodings, and do
> not generate code pairs... and do generate d800-dfff characters which
> are entirely illegal in utf-16 (without corresponding pair)

What character in what ANSI codepage ends up converted by mbstowcs to an 
unpaired surrogate?

What character in what ANSI codepage requires a surrogate pair to 
represent (that is, corresponds to a Unicode character outside of BMP), 
and triggers failure when passed to mbstowcs?

With all due respect, I find your claims difficult to believe.

In any case, MultiByteToWideChar and WideCharToMultiByte are perfectly 
capable of converting between UTF-8 and UTF-16.
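
For the opposite direction, the following portable C sketch shows what a UTF-16 to UTF-8 conversion has to do, including rejecting unpaired surrogates. It is an illustration only: `utf16_to_utf8` and its return convention are assumptions, and it is not a substitute for the documented WideCharToMultiByte call on Windows.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode 0-terminated UTF-16 from `src` as UTF-8 in `dst`.
   Returns the number of bytes written (excluding the terminator), or
   (size_t)-1 on an unpaired surrogate or insufficient space. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_len)
{
    unsigned char *d = (unsigned char *)dst;
    size_t n = 0;

    while (*src) {
        uint32_t cp = *src++;
        size_t need;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF) return (size_t)-1;
            src++;
            cp = 0x10000 + (((cp - 0xD800) << 10) | (lo - 0xDC00));
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;      /* low surrogate with no preceding high one */
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + need + 1 > dst_len) return (size_t)-1;  /* +1 for terminator */

        if (need == 1) {
            d[n++] = (unsigned char)cp;
        } else if (need == 2) {
            d[n++] = (unsigned char)(0xC0 | (cp >> 6));
            d[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            d[n++] = (unsigned char)(0xE0 | (cp >> 12));
            d[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            d[n++] = (unsigned char)(0xF0 | (cp >> 18));
            d[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            d[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            d[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    if (dst_len == 0) return (size_t)-1;
    d[n] = 0;
    return n;
}
```

The pair d800 dc01 comes out as the four bytes f0 90 80 81, while a lone d800 is reported as an error instead of being passed through.
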
-- 
Igor Tandetnik



[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Warren Young
On Feb 12, 2016, at 4:42 PM, Scott Robison  wrote:
> 
> I find it kind of interesting that Microsoft takes a lot
> of (deserved) flack for not adhering to standards, yet UTF-8 came about
> specifically because some didn't want to use UCS-2

...for good reason.  UCS-2/UTF-16 isn't compatible with C strings.  I know you 
know this, but it's a huge consideration.  Outside of Mac OS Classic and a few 
even smaller enclaves, C and its calling standards were the lingua franca of 
the computing world when Unicode first came on the scene, and those enclaves 
are now all but gone.

We'll be living with the legacy of C for quite a long time yet.  Until C is 
completely stamped out, we'll have to accommodate 0-terminated strings somehow.

> Had Microsoft come up with it first, I'm sure they'd be crucified by some of
> the same people who today are critical of them for using wide characters
> instead of UTF-8!

I think if we were to send a copy of the Unicode 8.0 standard back to the early 
1960s as a model for those designing ASCII, Unicode would look very different 
today.

I think the basic idea of UTF-8 would remain.  Instead of being sold as a 
C-compatible encoding, we'd still have a need for it as a packed encoding.  A 
kind of Huffman encoding for language, if you will.

But, I think we'd probably reorder the Unicode character points so that it 
packed even more densely on typical texts.  Several of the ASCII punctuation 
characters don't deserve a place in the low 7 bits, and we could relocate the 
control characters, too.  We could probably get all of Western Europe's 
characters into the lower 7 that way.

The next priority would be to pack the rest of the Western world's characters 
into the lower 11 bits.  Cyrillic, Greek, Eastern European accented Latin 
characters, etc.

That should still leave space for several other non-Asian, non-Latin character 
sets.  Devanagari, Hebrew, Arabic - pack as many of them in as we can.  We should 
be able to cover about half the world's population in the same space as UCS-2, 
while allowing most Western texts to be smaller, thoroughly outcompeting it.

UCS-2 feels like the 90's version of "640 kB is enough for everything!" to me, 
and UTF-16 like bank switching/segmentation.  We're going to be stuck with 
those half-measure decisions for decades now.  Thanks, Microsoft.

The POSIX platforms did the right thing here: UTF-32 when speed matters more 
than space, and UTF-8 when space or compatibility matters more.

> Note: I still wish [Microsoft] supported UTF-8 directly from the API.

If wishes were changes, I'd rather that all languages and platforms supported 
tagged UTF-8 and UTF-32 strings, with automatic conversion as necessary.  Pack 
your strings down as UTF-8 when space matters, and unpack them as UTF-32 when 
speed matters.  Unicode could define a sensible conversion rule set, similar to 
the way sign extension works when mixing integer sizes.

Since the Unicode Consortium has stated that Unicode won't grow beyond 2^21-1 
code points to prevent UTF-8 from going beyond 4 bytes per character, that tag 
could be an all-1s upper byte.  The rule could be that if you pass at least 4 
bytes to a function expecting a string, the buffer length is evenly divisible 
by 4, and the first 32-bit word has 0xFF on either end, it's a tagged UTF-32 
value.  Otherwise, it's UTF-8.

Simple and straightforward.

Too bad it will never happen.
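
As a sketch of how small such a check would be, here is one possible reading of the rule in C. The function name `looks_like_tagged_utf32` is hypothetical, and interpreting "0xFF on either end" as "the first or the fourth byte is 0xFF" (to cover both byte orders) is an assumption about the proposal, not an existing convention.

```c
#include <stddef.h>

/* Returns 1 if the buffer matches the tagging rule as read here,
   0 if it should be treated as UTF-8. */
int looks_like_tagged_utf32(const unsigned char *buf, size_t len)
{
    if (len < 4 || len % 4 != 0)
        return 0;
    /* An all-1s upper byte is the first byte in big-endian order
       and the fourth byte in little-endian order. */
    return buf[0] == 0xFF || buf[3] == 0xFF;
}
```

The tag is unambiguous under this reading because the byte 0xFF never occurs in well-formed UTF-8.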


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 7:37 PM, Igor Tandetnik  wrote:
> On 2/12/2016 10:14 PM, J Decker wrote:
>>
>> mbstowcs( out, utf8, 5 );
>
>
> mbstowcs expects the string in the codepage of the current locale - which is
> never UTF-8.
>
>> for( n = 0; n < 5; n++ )
>> printf( "%04x ", out[n] );  // output is 00f0 0090 0080 0081; expect d800 dc01
>
>
> Why do you expect that? It appears your system uses Western European
> codepage (aka Latin-1). You pass a character "\xf0" which, when taken to be
> encoded in that codepage, is quite properly converted to U+00F0.
>
>> for( n = 0; n < 5; n++ )
>> printf( "%02x ", chout[n] );  // output is 00 00 00 00
>
>
> U+10001 is (unsurprisingly) not representable in your current ANSI codepage,
> so wcstombs call fails (I can't help but notice that you aren't checking any
> calls for failure) and leaves the output buffer unchanged.
>
>> so it does no useful conversion either way :)
>
>
> It performs the conversion it is documented to perform. It indeed doesn't
> perform the conversion that you, for reasons unclear, expect it to perform.
> In other words, you engage in wishful thinking, and then blame the messenger
> for failure of your wishes to materialize.

I expect it to take any string such as

???
or
?  ??? ? 

and give me a char * representation of it that's useful... or
conversely take the char* version of said strings and give me wchar_t
* that can be used.


> --
> Igor Tandetnik
>
>


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 7:05 PM, Warren Young  wrote:
> On Feb 12, 2016, at 4:42 PM, Scott Robison  wrote:
>>
>> I find it kind of interesting that Microsoft takes a lot
>> of (deserved) flack for not adhering to standards, yet UTF-8 came about
>> specifically because some didn't want to use UCS-2
>
> ...for good reason.  UCS-2/UTF-16 isn't compatible with C strings.  I know you 
> know this, but it's a huge consideration.  Outside of Mac OS Classic and a 
> few even smaller enclaves, C and its calling standards were the lingua franca 
> of the computing world when Unicode first came on the scene, and those 
> enclaves are now all but gone.
>
> We'll be living with the legacy of C for quite a long time yet.  Until C is 
> completely stamped out, we'll have to accommodate 0-terminated strings 
> somehow.
>

and Go.  Which is purely UTF8.

>
> Simple and straightforward.
>
> Too bad it will never happen.


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
On Fri, Feb 12, 2016 at 5:32 PM, Igor Tandetnik  wrote:
> On 2/12/2016 7:24 PM, J Decker wrote:
>>

> What character in what ANSI codepage ends up converted by mbstowcs to an
> unpaired surrogate?
>
> What character in what ANSI codepage requires a surrogate pair to represent
> (that is, corresponds to a Unicode character outside of BMP), and triggers
> failure when passed to mbstowcs?
>
> With all due respect, I find your claims difficult to believe.
>
> In any case, MultiByteToWideChar and WideCharToMultiByte are perfectly
> capable of converting between UTF-8 and UTF-16.
> --
> Igor Tandetnik
>

Okay; I'd forgotten.  It does worse than I expected...

//--

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

int main( void )
{
   char utf8[5] = "\xf0\x90\x80\x81";     // UTF-8 bytes for U+10001
   char utf82[5] = "\xed\xa0\x81";        // UTF-8-style bytes for the lone surrogate U+D801
   wchar_t out[5];
   wchar_t out2[5];
   wchar_t utf16[5] = L"\xd800\xdc01";    // U+10001 as a UTF-16 surrogate pair
   char chout[5];
   int n;

   memset( out, 0, sizeof( out ) );
   memset( out2, 0, sizeof( out2 ) );
   memset( chout, 0, sizeof( chout ) );

   // return values are not checked here
   mbstowcs( out, utf8, 5 );
   mbstowcs( out2, utf82, 5 );
   wcstombs( chout, utf16, 5 );

   for( n = 0; n < 5; n++ )
      printf( "%04x ", (unsigned)out[n] );  // output is 00f0 0090 0080 0081; expect d800 dc01
   printf( "\n" );
   for( n = 0; n < 5; n++ )
      printf( "%04x ", (unsigned)out2[n] ); // output is 00ed 00a0 0081; expect d801
   printf( "\n" );
   for( n = 0; n < 5; n++ )
      printf( "%02x ", (unsigned char)chout[n] );  // output is 00 00 00 00
   printf( "\n" );
   return 0;
}

//--

so it does no useful conversion either way :)  (but at least I ended
up fixing a boundary issue while testing)

>


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Olivier Mascia
Dear,

I see the source code for SQLite3 takes great care to support either the A 
(MBCS, but not UTF8) or the W (Windows 'UTF16') versions of the key APIs it 
depends on under Windows.

Are there Windows platforms, supported by SQLite source code of course, where 
the 'W' version of the APIs are not available? I know about Windows 3.0, but 
what else? Some CE editions?

--
Meilleures salutations, Met vriendelijke groeten, Best Regards,
Olivier Mascia, integral.be/om




[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Scott Robison
On Fri, Feb 12, 2016 at 4:05 PM, J Decker  wrote:

> windows W is wide-char not utf-16.
> as much as A is ansi and isn't utf-8
>

Has Windows ever supported a wide character set that was not UCS-2 or
UTF-16? I've always understood Microsoft embraced UCS-2 specifically so
that it would not have to deal with future encoding changes. Obviously it
failed to an extent when UCS-2 was deprecated in favor of UTF-16, but since
UTF-16 is backward compatible as long as you don't need surrogate pairs, it
wasn't too painful of a transition. Especially when compared to the
plethora of 8 bit multibyte encodings.

Note: I know Windows has supported DBCS for various encodings / code pages,
but those are never passed to wide functions.

I find it kind of interesting that Microsoft takes a lot
of (deserved) flack for not adhering to standards, yet UTF-8 came about
specifically because some didn't want to use UCS-2 (then simply known as
UCS, the one and only true flavor of the Universal Character Set). Had
Microsoft come up with it first, I'm sure they'd be crucified by some of
the same people who today are critical of them for using wide characters
instead of UTF-8!

Note: I still wish they supported UTF-8 directly from the API.

-- 
Scott Robison


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
well mbstowcs and vice versa only understand 16 bit encodings, and do
not generate code pairs... and do generate d800-dfff characters which
are entirely illegal in utf-16 (without corresponding pair)

But; fortunately, they do end up supporting utf-8 since it's just a
stream of bytes with a nul terminator in most cases.  But for display
I definitely had to do my own 'getCodepoint' and then index the font..
I'd imagine that applications like IE handle it internally too... but
definitely the console has some issues.

On Fri, Feb 12, 2016 at 3:42 PM, Scott Robison  
wrote:
> On Fri, Feb 12, 2016 at 4:05 PM, J Decker  wrote:
>
>> windows W is wide-char not utf-16.
>> as much as A is ansi and isn't utf-8
>>
>
> Has Windows ever supported a wide character set that was not UCS-2 or
> UTF-16? I've always understood Microsoft embraced UCS-2 specifically so
> that it would not have to deal with future encoding changes. Obviously it
> failed to an extent when UCS-2 was deprecated in favor of UTF-16, but since
> UTF-16 is backward compatible as long as you don't need surrogate pairs, it
> wasn't too painful of a transition. Especially when compared to the
> plethora of 8 bit multibyte encodings.
>
> Note: I know Windows has supported DBCS for various encodings / code pages,
> but those are never passed to wide functions.
>
> I find it kind of interesting that Microsoft takes a lot
> of (deserved) flack for not adhering to standards, yet UTF-8 came about
> specifically because some didn't want to use UCS-2 (then simply known as
> UCS, the one and only true flavor of the Universal Character Set). Had
> Microsoft come up with it first, I'm sure they'd be crucified by some of
> the same people who today are critical of them for using wide characters
> instead of UTF-8!
>
> Note: I still wish they supported UTF-8 directly from the API.
>
> --
> Scott Robison


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread J Decker
windows W is wide-char not utf-16.
as much as A is ansi and isn't utf-8

On Fri, Feb 12, 2016 at 1:07 PM, Warren Young  wrote:
> On Feb 12, 2016, at 1:49 PM, Clemens Ladisch  wrote:
>>
>> Olivier Mascia wrote:
>>> Are there Windows platforms, supported by SQLite source code of course, 
>>> where the 'W' version of the APIs are not available?
>>
>> Once upon a time, SQLite supported Windows 95/98/Me.
>
> The DOS-based versions of Windows still have the "W" functions for binary 
> compatibility with the NT-based versions, but for the most part they treat 
> their arguments according to the 8-bit code page or MBCS rules, which means 
> you generally get garbage output when you feed in UCS-2.
>
> There are a few exceptions: https://support.microsoft.com/en-us/kb/210341
>
> Note that Windows didn't move from UCS-2 to UTF-16 until Windows 2000, which 
> is effectively after the development time of the DOS-based versions of 
> Windows.  (There's a tiny overlap there with Windows ME, but that's last-gasp 
> stuff.)
>
> I assume if you pass strings using characters beyond the BMP to the "16" APIs 
> in SQLite, they would do the wrong thing on Windows NT 3.x and 4.x systems, 
> too.
>
> I doubt there would be much crying if SQLite dropped the "A" support.  I 
> suspect the only reason SQLite still has it is that it's more work to remove 
> it than to leave it alone.


[sqlite] Windows A and W APIs dual support

2016-02-12 Thread Warren Young
On Feb 12, 2016, at 1:49 PM, Clemens Ladisch  wrote:
> 
> Olivier Mascia wrote:
>> Are there Windows platforms, supported by SQLite source code of course, 
>> where the 'W' version of the APIs are not available?
> 
> Once upon a time, SQLite supported Windows 95/98/Me.

The DOS-based versions of Windows still have the "W" functions for binary 
compatibility with the NT-based versions, but for the most part they treat 
their arguments according to the 8-bit code page or MBCS rules, which means you 
generally get garbage output when you feed in UCS-2.

There are a few exceptions: https://support.microsoft.com/en-us/kb/210341

Note that Windows didn't move from UCS-2 to UTF-16 until Windows 2000, which is 
effectively after the development time of the DOS-based versions of Windows.  
(There's a tiny overlap there with Windows ME, but that's last-gasp stuff.)

I assume if you pass strings using characters beyond the BMP to the "16" APIs 
in SQLite, they would do the wrong thing on Windows NT 3.x and 4.x systems, too.

I doubt there would be much crying if SQLite dropped the "A" support.  I 
suspect the only reason SQLite still has it is that it's more work to remove it 
than to leave it alone.