Well, it's not the console I'm worried about; that output is coming straight from the VS debugger. Knowing that strings are always coming out of PROJ in UTF-8 is good.

Ultimately I'm sending the output to a C# DLL, so I need to CoTaskMemAlloc my string. If I do something like this:

    std::wstring s2ws(const char* utf8Bytes)
    {
        const std::string& str(utf8Bytes);
        int size_needed = MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), NULL, 0);
        std::wstring wstrTo(size_needed, 0);
        MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)str.size(), &wstrTo[0], size_needed);
        return wstrTo;
    }

Then I see the corrected UTF-8 text in the wstring.

As mentioned this isn't something I'm terribly familiar with, and I'd like to avoid writing terrible C code and exploding buffers. CoTaskMemAlloc needs the actual number of bytes, and we'll need an extra spot for the null terminator.

    const wchar_t* u_convertResult(const char* result)
    {
        if (!result)
            return nullptr;

        std::wstring wstr = s2ws(result);
        auto wlen = wstr.length() + 1;
        auto len = wlen * sizeof(wchar_t);
        wchar_t* buff = (wchar_t*)CoTaskMemAlloc(len);
        if (buff)
        {
            wcscpy_s(buff, wlen, wstr.c_str());
        }
        return buff;
    }

Does this sound reasonable for Windows?

And as for Linux and maintaining a multi-platform compatibility, I'd define an alias function like this instead:

    const wchar_t* u_convertResult(const char* result)
    {
        std::string str(result);
        std::wstring wstr = std::wstring(str.begin(), str.end());
        auto wlen = wstr.length() + 1;
        auto len = wlen * sizeof(wchar_t);
        wchar_t* buff = (wchar_t*)malloc(len);
        if (buff)
        {
            wcscpy(buff, wstr.c_str());
        }
        return buff;
    }

Since it's already happily working as UTF-8 on Linux, I should be able to pass in the original string to the wstring. CoTaskMemAlloc is just malloc. Does this sound okay too?

Thanks!

On Wed, Apr 5, 2023 at 4:52 PM Even Rouault <even.roua...@spatialys.com> wrote:

> Peter,
>
> there isn't any issue in your build. It is just that PROJ returns UTF-8
> encoded strings and that the typical Windows console isn't configured to
> display UTF-8.
> Cf
> https://stackoverflow.com/questions/57131654/using-utf-8-encoding-chcp-65001-in-command-prompt-windows-powershell-window
> or similar issues
>
> Even
>
> On 05/04/2023 at 23:44, Peter Townsend via PROJ wrote:
>
> I've got a bit of an annoyance with my windows proj build. Hopefully it's
> not too hard to resolve as the world of char/wchar_t/etc. isn't something
> I'm terribly familiar with.
>
> Take for example the area of use of EPSG:23031. On Linux it's fine, but on
> windows there's a unicode issue.
>
>     PJ* crs = proj_create(m_ctxt, "EPSG:23031");
>     ASSERT_NE(crs, nullptr);
>     ObjectKeeper keeper_crsH(crs);
>
>     double w, s, e, n;
>     const char* a;
>     proj_get_area_of_use(m_ctxt, crs, &w, &s, &e, &n, &a);
>
> Contents of a:
> "Europe - between 0°E and 6°E - Andorra; Denmark (North Sea); Germany
> offshore; Netherlands offshore; Norway including Svalbard - onshore and
> offshore; Spain - onshore (mainland and Balearic Islands); United Kingdom
> (UKCS) offshore."
>
> Is there a simple thing I'm overlooking in the build process that might
> clear up the encoding goof? Or do I need to do some bending over backwards
> with character manipulation?
>
> This is the command line I'm using to build this example:
>     cmake -DBUILD_SHARED_LIBS=ON
>     -DCMAKE_TOOLCHAIN_FILE=C:\dev\vcpkg\scripts\buildsystems\vcpkg.cmake ..
>     cmake --build . --config Debug -j 8
>
> Thanks!
> --
> Peter Townsend
> Senior Software Developer
>
> _______________________________________________
> PROJ mailing list
> PROJ@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/proj
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.

--
Peter Townsend
Senior Software Developer
_______________________________________________ PROJ mailing list PROJ@lists.osgeo.org https://lists.osgeo.org/mailman/listinfo/proj
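
A note on the Linux variant in the message above: constructing a std::wstring with std::wstring(str.begin(), str.end()) widens each byte individually and does not decode UTF-8, so a multi-byte character such as the degree sign (bytes 0xC2 0xB0) would end up as two wide characters. Below is a minimal sketch of a portable decode step that could stand in for that line. The name utf8_to_wide is hypothetical (it is not part of PROJ or the standard library), and the sketch assumes NUL-terminated input, substitutes U+FFFD for malformed sequences, and does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of PROJ): decode a NUL-terminated UTF-8
// string into a std::wstring. Malformed bytes become U+FFFD; overlong
// encodings are NOT rejected in this sketch.
std::wstring utf8_to_wide(const char* utf8)
{
    std::wstring out;
    if (!utf8)
        return out;

    const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
    while (*p) {
        unsigned long cp;  // code point being assembled
        int extra;         // number of continuation bytes expected

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else                          { cp = 0xFFFD;    extra = 0; }
        ++p;

        for (; extra > 0; --extra) {
            // A non-continuation byte (including the terminating NUL)
            // ends the sequence early; the byte itself is not consumed.
            if ((*p & 0xC0) != 0x80) { cp = 0xFFFD; break; }
            cp = (cp << 6) | (*p & 0x3F);
            ++p;
        }
        if (cp > 0x10FFFF)
            cp = 0xFFFD;

        if (sizeof(wchar_t) >= 4 || cp < 0x10000) {
            // 32-bit wchar_t (typical on Linux), or a BMP code point.
            out.push_back(static_cast<wchar_t>(cp));
        } else {
            // 16-bit wchar_t (Windows): emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

With this sketch, the four bytes of "0°E" in UTF-8 decode to three wide characters, whereas the byte-wise copy would produce four. On Windows, MultiByteToWideChar with CP_UTF8 already performs this decoding, so a helper like this would only be needed on the non-Windows side.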