Hi Roger,

I think Microsoft have had wchar_t as a type way before char16_t was introduced 
(as far back as I can remember, which is getting shorter as I get older :)). At 
the time, Microsoft were well known for doing things the 'Microsoft way' and 
not following standards very well. So maybe they backed themselves into a 
corner to some extent.

Indeed, your suggested edit has worked! I now have XERCES_XMLCH_T defined as 
wchar_t :)

I'm not quite sure why I'd need a transcoder if I'm using wchar_t across the 
board, so hopefully I can get away without having to worry about it. The only 
XML files we consume are UTF-8, so everything *should* just work. Fingers crossed.

I'm going to push my luck a bit here and ask an unrelated question. I need to 
build both 32 and 64 bit binaries for our application (it's a long story), but 
the Xerces C++ CMake system generates .LIBs and .DLLs with the same name. Is it 
possible to specify that one or other of the builds generates different output 
filenames? For example, I'd like to generate a xerces-c_3_2x64.dll for my 
64-bit build. At the moment I'm having to hand-craft the generated VS project 
file to achieve this.

Many thanks,
Mark

-----Original Message-----
From: rle...@codelibre.net [mailto:rle...@codelibre.net] 
Sent: 23 January 2018 14:46
To: c-users@xerces.apache.org
Subject: Re: How to build with XMLCh = wchar_t on Windows platform.

On 2018-01-23 14:12, Mark Douglas wrote:
> Hi Roger,
> 
> Thank you very much for this valuable feedback! As I'm new to CMake, I 
> didn't find the options of disabling char16_t (at least I wasn't 
> looking for the right thing to start with!).
> 
> I think the default policy of using char16_t, if it is available, is a 
> good choice - cross platform consistency should be maintained where 
> possible I think. The reason for wanting to use wchar_t is that I'm 
> moving some legacy code from Xerces C++ 1.5.1 to 3.2.0 and a LOT of 
> the application code is using wchar_t as the character type. I've also 
> now selected the VC++ option to 'Treat WChar_t as a Built in Type' in 
> my application meaning that it's no longer compatible with 16-bit 
> integer values.
> 
> If I were to use char16_t for Xerces C++ I'd need to make a lot of
> XMLString::transcode() calls in my application to perform the 
> conversion. Either that, or I'd have to insert a lot of 
> reinterpret_cast<wchar_t*> etc. throughout the code. So for me, using 
> wchar_t seems like the least invasive of the two options and also 
> means I don't need to perform any transcoding in order to interface 
> with the rest of the application.

That does sound like a pain and is completely understandable.  It's a shame 
that they didn't make wchar_t a typedef for char16_t or vice versa, but I'm 
sure there were reasons for not doing so.  Probably because the wchar_t 
encoding is unspecified, and it would also prevent overloading based on the 
type if they are the same underlying type.
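
To illustrate the overloading point, here is a minimal sketch assuming a 
C++11 compiler. The function name describe is hypothetical and is not part of 
Xerces-C++; it only shows that the two types can be told apart in overload 
resolution, which would not be possible if one were a typedef for the other.

```cpp
#include <string>
#include <type_traits>

// wchar_t and char16_t are distinct built-in types in C++11, even on
// platforms where both happen to be 16 bits wide.
static_assert(!std::is_same<wchar_t, char16_t>::value,
              "wchar_t and char16_t must be distinct types");

// Hypothetical overloads for illustration only. Declaring both is legal
// precisely because the parameter types differ.
std::string describe(const wchar_t*)  { return "wchar_t"; }
std::string describe(const char16_t*) { return "char16_t"; }
```

Calling describe(L"abc") selects the wchar_t overload and describe(u"abc") 
selects the char16_t one. The same distinction is why a const char16_t* 
cannot be passed where a const wchar_t* is expected without a cast or a copy.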

> I think that adding an option to force wchar_t use as you suggest 
> would be a valuable addition - at least on platforms where wchar_t is 
> 16-bit.

I have opened https://issues.apache.org/jira/browse/XERCESC-2132 to track this.
It should be fairly straightforward to add this for 3.2.1.  In the interim, 
hopefully the edit I suggested will achieve the same effect by hand with 3.2.0.

> By the way, I wasn't quite sure what the CMake '-Dtranscoder=windows'
> option did.

It's the default transcoding implementation on Windows using functionality 
built into Windows (src/xercesc/util/Transcoders/Win32/Win32TransService.cpp).  
You can see the selection in src/CMakeLists.txt -- search for 
XERCES_USE_TRANSCODER_.  On Windows, you could use ICU as an alternative, or 
GNU iconv if you built it from source.  If you wanted absolutely consistent 
cross-platform behaviour, then ICU might be a good choice (I do this for my 
work projects--we build and use ICU for builds of Xerces-C++ on all platforms).  
But the default should be fine in practice, so you could just leave it 
unspecified--it should default to "windows".


Regards,
Roger
