Not a model of clarity (ANSI and Unicode are not encodings), but this page
seems to be the best resource on this:

https://msdn.microsoft.com/en-us/library/ms709439(v=vs.85).aspx

It seems that there's a parallel "Unicode" API for ODBC drivers that
support it. Moreover:

> Currently, the only Unicode encoding that ODBC supports is UCS-2, which
> uses a 16-bit integer (fixed length) to represent a character. Unicode
> allows applications to work in different languages.


So using Klingon is off the table, although the design of UTF-16 is such
that sending UTF-16 to an application that expects UCS-2 will probably work
reasonably well, as long as the application treats it as "just data".
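
To make the UCS-2 versus UTF-16 point concrete, here is a small sketch in
Python (chosen only for illustration; nothing in it touches ODBC). It
assumes little-endian code units with no BOM, and the helper names
`utf16_code_units` and `fits_in_ucs2` are made up for this example.

```python
import struct

def utf16_code_units(s):
    """Return the 16-bit code units of s in UTF-16 (little-endian, no BOM)."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

def fits_in_ucs2(s):
    """True if every character is in the Basic Multilingual Plane,
    in which case the UTF-16 and UCS-2 encodings of s are identical."""
    return all(ord(ch) <= 0xFFFF for ch in s)

bmp = "naïve"           # all characters inside the BMP
astral = "\U0001F600"   # one character outside the BMP

print(fits_in_ucs2(bmp), utf16_code_units(bmp))        # one unit per character
print(fits_in_ucs2(astral), utf16_code_units(astral))  # two units: a surrogate pair
```

For text inside the BMP the two encodings produce the same bytes. Outside
it, UTF-16 emits a surrogate pair, which a strict UCS-2 consumer would see
as two unassigned characters; if it only passes the data through, nothing
is lost.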

This still doesn't explain why some drivers are accepting UCS-2/UTF-16 when
called with the non-Unicode API.
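
One way to see why this is surprising is to look at what UTF-16 bytes look
like to code that reads them as a NUL-terminated string of 8-bit units. The
sketch below is Python again, with no ODBC involved; `as_c_string` is a
hypothetical helper that mimics that reading and is not meant to describe
how any particular driver behaves.

```python
def as_c_string(buf):
    """Mimic reading buf as a NUL-terminated string of 8-bit code units."""
    end = buf.find(b"\x00")
    return buf if end == -1 else buf[:end]

query = "SELECT 1"
utf8 = query.encode("utf-8")
utf16 = query.encode("utf-16-le")  # assumption: little-endian, no BOM

print(as_c_string(utf8))   # the whole statement survives
print(as_c_string(utf16))  # cut off at the first NUL, right after "S"
```

A consumer that stopped at the first NUL byte would see only `S`, so the
drivers that work are presumably either relying on the explicit length
argument or reinterpreting the pointer as 16-bit data.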

On Thu, Feb 4, 2016 at 1:12 PM, Stefan Karpinski <[email protected]>
wrote:

> The real issue is this:
>
> SQLCHAR is for encodings with 8-bit code units.
>
>
> Condescending lecture on encodings notwithstanding, UTF-16 is not such an
> encoding, yet UTF-16 is what the ODBC package is currently sending
> to SQLExecDirect for an argument of type SQLCHAR * – and somehow it seems
> to be working for many drivers, which still makes no sense to me. I can
> only conclude that some ODBC drivers are treating this as a void * argument
> and they expect pointers to data in whatever encoding they prefer, not
> specifically in encodings with 8-bit code units.
>
> Querying the database about what encoding it expects is a good idea, but
> how does one do that? The SQLGetInfo
> <https://msdn.microsoft.com/en-us/library/ms711681(v=vs.85).aspx>
> function seems like a good candidate but this page doesn't include
> "encoding" or "utf" anywhere.
>
> On Thu, Feb 4, 2016 at 7:53 AM, Milan Bouchet-Valat <[email protected]>
> wrote:
>
>> On Wednesday, February 3, 2016 at 11:44 -0800, Terry Seaward wrote:
>> > From R, it seems like the encoding is based on the connection (as
>> > opposed to being hard coded). See `enc <- attr(channel, "encoding")`
>> > below:
>> >
>> > ```
>> > [...]
>> >
>> > Digging down, `odbcConnect` is just a wrapper for `odbcDriverConnect`,
>> > which has the parameter `DBMSencoding = ""`. This calls the C
>> > function `C_RODBCDriverConnect` (available here: RODBC_1.3-12.tar.gz),
>> > which has no reference to encodings. So `attr(channel, "encoding")`
>> > is simply `DBMSencoding`, i.e. `""`.
>> >
>> > It seems to come down to `iconv(..., to = "")` which, from the R
>> > source code, uses `win_iconv.c` attached. I can't seem to find how
>> > `""` is handled, i.e. is there some default value based on the
>> > system?
>> "" refers to the encoding of the current system locale. This is a
>> reasonable guess, but it will probably be wrong in many cases (else, R
>> wouldn't have provided this option at all).
>>
>>
>> Regards
>>
>
>
