On Fri, 10 Sep 2010, Ludwig, Michael wrote:
> The Win32::OLE manual says the following about the CP option:
>
> ----
> This variable is used to determine the codepage used by all
> translations between Perl strings and Unicode strings used by the OLE
> interface. The default value is CP_ACP, which is the default ANSI
> codepage. Other possible values are CP_OEMCP, CP_MACCP, CP_UTF7 and
> CP_UTF8. These constants are not exported by default.
> ----
>
> I don't understand the impact of this setting. I presume there isn't
> any, but I want to be sure.

OLE Automation transfers strings internally encoded in UTF-16 (as BSTR
types).  Win32::OLE needs to transform them into regular Perl strings.
By default it converts them to CP_ACP, the system's default ANSI
codepage on Windows.  That means any Unicode character that is not
representable in that codepage will be translated to a "replacement"
character (e.g. '?').
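
As a rough illustration of that loss, here is a small sketch using the
core Encode module.  It is only an analogy and not the conversion code
Win32::OLE itself runs, and it assumes cp1252 as the ANSI codepage (yours
may differ); the sample string is made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample: "cafe" with an accented e, plus two CJK characters.
my $text = "caf\x{e9} \x{4e2d}\x{6587}";

# Assumption: cp1252 stands in for CP_ACP here.  With Encode's default
# CHECK setting, characters the codepage cannot represent are replaced
# by its substitution character instead of raising an error.
my $ansi = encode('cp1252', $text);

# UTF-8 can represent every character, so nothing is lost.
my $utf8 = encode('UTF-8', $text);

printf "cp1252: %d bytes, UTF-8: %d bytes\n", length $ansi, length $utf8;
```

The accented character survives in cp1252, but the CJK characters do not,
and once they have been substituted there is no way to get them back.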

If you want to preserve the original Unicode string, then you need
to tell Win32::OLE to use CP_UTF8 instead.
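
A minimal sketch of how that could look, using the Option() class method
from the Win32::OLE documentation.  The constant is fully qualified
because, as the manual says, the CP_* constants are not exported by
default.

```perl
use strict;
use warnings;
use Win32::OLE;

# Switch string conversion from the default CP_ACP to UTF-8 before
# talking to any OLE server.
Win32::OLE->Option(CP => Win32::OLE::CP_UTF8());

# The single-argument form reads the current setting back.
my $codepage = Win32::OLE->Option('CP');
print "Win32::OLE codepage is now $codepage\n";
```

Set the option once, early in the script, so that all strings coming back
from OLE calls are converted the same way.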

CP_ACP is just the default for backwards compatibility reasons.

You probably don't want to use any of the other encodings, like CP_MACCP
or CP_UTF7, ever. :)

Cheers,
-Jan

