Re: Cross-platform string handling

Tom Cook Fri, 19 Sep 2014 03:37:01 -0700

Thanks for responding.

Yes, using C++11 char16_t would mean that various compilers would
become unsupported (i.e., GCC support would change from 3.4.x+ to 4.5+).


I've just tried this patch on configure.ac:

289c289
<                       xerces_cv_type_xmlch=$xerces_cv_type_u16bit_int
---
>                       xerces_cv_type_xmlch=char16_t

but the results are more-or-less disastrous (haven't had a chance to
look into the errors in detail yet).  Is that the right way to try
changing the type of XMLCh?
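
For the test Alberto suggests below, here is a minimal sketch of what I
have in mind. It is not Xerces code: XMLChLike is a hypothetical stand-in
that assumes XMLCh is a 16-bit unsigned integer (as on the Linux build),
and the function names are made up for illustration. It compares the
bytes of a few char16_t literals against the UTF-16 code units XMLCh
would be expected to hold. It needs a compiler with u"" literals.

```cpp
#include <cassert>   // for the assert in the usage note
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for XMLCh (assumed 16-bit unsigned). With Xerces
// available, use XMLCh from <xercesc/util/XercesDefs.hpp> instead.
typedef std::uint16_t XMLChLike;

static_assert(sizeof(char16_t) == sizeof(XMLChLike),
              "char16_t and the XMLCh stand-in must have the same size");

// True if the char16_t literal is byte-for-byte identical to the expected
// sequence of 16-bit code units (terminator included).
template <std::size_t N>
bool same_encoding(const char16_t (&lit)[N], const XMLChLike (&expected)[N]) {
    return std::memcmp(lit, expected, sizeof(lit)) == 0;
}

inline bool char16_literals_match_xmlch() {
    const XMLChLike ascii[]  = { 0x0061, 0x0062, 0x0000 };  // "ab"
    const XMLChLike bmp[]    = { 0x00E9, 0x0000 };          // U+00E9
    const XMLChLike astral[] = { 0xD83D, 0xDE00, 0x0000 };  // U+1F600 as a surrogate pair
    return same_encoding(u"ab", ascii)
        && same_encoding(u"\u00E9", bmp)
        && same_encoding(u"\U0001F600", astral);
}
```

Calling assert(char16_literals_match_xmlch()); once at startup would at
least show whether the reinterpret_cast approach is safe on a given
compiler before touching configure.ac.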

Thanks,
Tom

On Fri, Sep 19, 2014 at 5:50 PM, Alberto Massari
<albertomass...@tiscali.it> wrote:
> There are no plans to use C++11 features in Xerces, especially if this would
> make any of the supported compilers unsupported.
> But you can run a test to see whether char16_t literals are encoded in the
> same format that XMLCh expects...
>
> Alberto
>
> On 19/09/14 08:56, Tom Cook wrote:
>
>> No response to this so far.  Am I better off rebuilding Xerces-C with
>> XMLCh typedef'd to char16_t?
>>
>> Regards,
>> Tom
>>
>> On Mon, Sep 15, 2014 at 4:16 PM, Tom Cook <tom.k.c...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I've googled around and found this question asked in quite a few
>>> places, but not any answer to it.
>>>
>>> What is the best way of handling strings, and particularly string
>>> literals, in portable code?
>>>
>>> Specifically, I'm interested in building code with VC++ 2013 on
>>> Windows and G++ 4.8 on Linux.  On Windows, the Xerces binary build
>>> uses wchar_t as the character type, and so, naturally enough, people
>>> on Windows write code that passes around wchar_t.  Unfortunately, G++
>>> has a 4-byte wchar_t and so Xerces uses unsigned short int (or
>>> uint16_t) as its character type.  This causes all such code written
>>> for Windows to break in fairly horrible ways on Linux, and in ways
>>> that require wide-ranging code changes to fix.
>>>
>>> So far, the best solution I've come up with looks something like this:
>>>
>>> #include <sstream>
>>> #include <string>
>>> #include <xercesc/util/XercesDefs.hpp>  // XMLCh
>>>
>>> #if defined _MSC_VER
>>> // Windows: wchar_t is 16 bits and the Xerces binary build uses it as XMLCh.
>>> #define U16S(x) L##x
>>> typedef wchar_t my_u16_char_t;
>>> typedef std::wstring my_u16_str_t;
>>> typedef std::wstringstream my_u16_stream_t;
>>> inline const XMLCh* XmlString(const my_u16_char_t* s) { return s; }
>>> inline const XMLCh* XmlString(const my_u16_str_t& s) { return s.c_str(); }
>>> #elif defined __linux
>>> // Linux: wchar_t is 32 bits, so use char16_t and cast to the 16-bit XMLCh.
>>> #define U16S(x) u##x
>>> typedef char16_t my_u16_char_t;
>>> typedef std::basic_string<char16_t> my_u16_str_t;
>>> typedef std::basic_stringstream<char16_t> my_u16_stream_t;
>>> inline const XMLCh* XmlString(const my_u16_char_t* s) {
>>>   return reinterpret_cast<const XMLCh*>(s);
>>> }
>>> inline const XMLCh* XmlString(const my_u16_str_t& s) { return XmlString(s.c_str()); }
>>> #endif
>>>
>>> But of course this still requires major code changes for existing code
>>> that uses wchar_t.
>>>
>>> Is there a better way of sorting this out?  C++11 now has a distinct,
>>> UTF-16-encoded character type, char16_t.  Is there any plan to make
>>> Xerces use it?
>>>
>>> Thanks,
>>> Tom
>
>
>
> --
> -----------------------------------------
> Lucia Riccardi & Alberto Massari
> Via Prasca 19/5
> 16148 Genova
> Italia
> Tel.: (010) 3771653
> E-mail: lucia_albe...@libero.it
> Web: http://www.massari.org
>
