> On Dec. 20, 2016, 12:29 a.m., Daniel Pravat wrote:
> > 3rdparty/stout/include/stout/windows/os.hpp, line 748
> > <https://reviews.apache.org/r/54877/diff/1/?file=1589098#file1589098line748>
> >
> >     I don't think the conversion to UTF-8 is appropiate here.
> 
> Andrew Schwartzmeyer wrote:
>     What would you convert it to? It's currently in UTF-16, and Windows paths 
> are allowed to have (almost) any Unicode character.
> 
> Alex Clemmer wrote:
>     I think he's saying that bad things will happen if you use strings that 
> have Unicode in them, so it's probably better to just error out.
> 
> Andrew Schwartzmeyer wrote:
>     I'm not following... That is a case our codebase must deal with, NTFS 
> paths are stored in Unicode:
>     
>     
> https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx
>     
>     and on top of that, Windows file paths can:
>     
>     > Use any character in the current code page for a name, including 
> Unicode characters and characters in the extended character set (128–255)
>     
>     
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx
>     
>     So what I'm saying is that bad things _must not happen_ if we have paths 
> with non-ASCII characters in them; we need to handle that at any point we 
> deal with file paths on Windows.
>     
>     I'm not saying this function is likely to return a path like `C:\ÂÑÐ¥`, 
> but that would be a valid file path, and so we must be treating all our paths 
> on Windows as Unicode.
>     
>     If this is not the case, are purposefully constraining ourselves to 
> ASCII, and if so, why?
> 
> Alex Clemmer wrote:
>     Sorry, I meant bad things will happen to Mesos if you give it strings 
> with unicode in them. Mesos itself does not deal with Unicode gracefully.
> 
> Andrew Schwartzmeyer wrote:
>     Oooh. We should probably fix that.
> 
> Andrew Schwartzmeyer wrote:
>     Alex, Daniel, do we have a resolution on this? I spoke with Alex and we 
> feel that we should be doing two main things: avoid exacberating the problem 
> in MESOS-6817 by being explicit in our use of `W` APIs, rather than relying 
> on the mercurial value of `TCHAR`; and not attempt to guard against paths 
> that Mesos does not handle, as it is a bug that Mesos does not handle paths 
> with valid extended characters (since both Linux and Windows may have these 
> paths), so failing here would be inappropriate as it hides the actual bug. 
> (Not to speak for you Alex, I think this summarizes our discussion yesterday.)
> 
> Alex Clemmer wrote:
>     If I'm understanding our discussion correctly, failing here on unicode 
> conversion would be preferable, since it makes a subtle bug more obvious. If 
> that's nontrivial I think we can just ship it as is, since Mesos already has 
> these problems.
> 
> Andrew Schwartzmeyer wrote:
>     > failing here on unicode conversion
>     
>     What would fail here?
>     
>     I *think* you're implying to add a detection algorithm to see if there 
> are any characters in the path that Mesos does not believe are valid; but 
> that is not "failing on Unicode", we'd need to write our own validator for 
> Mesos.
> 
> Alex Clemmer wrote:
>     I was just saying, if there's an easy way to try to interpret the string 
> as normal ASCII and return `Error` if that fails, we should do that. If not, 
> it's not that big of a deal, because Mesos will probably fail anyway.

Ah I see what you want: 
> interpret the string as normal ASCII and return Error if that fails

but I don't believe this is possible. To be clear, the non-Unicode versions of 
these APIs do not return ASCII; they return the same data encoded with the 
current Windows code page (which is a superset of ASCII, in much the same way 
as UTF-8). If we used this version, we would not only hit the same problem 
(possible non-ASCII characters), but also introduce a new problem of having to 
decode from an arbitrary Windows code page (rather than UTF-16 which is at 
least standard). With either the short or wide version of the API, detecting if 
the string is not ASCII would mean implmenenting a validator, which I don't 
think we want to do here (it's the wrong solution to the problem).

That said, what would you suggest?

P.S. For what it's worth, I don't expect this particular API to give us 
non-ASCII characters, but I think we should choose a pattern for dealing with 
Windows APIs and stick to it. This pattern gives us guaranteed UTF-8 strings 
with whatever data we're expected to handle.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54877/#review159672
-----------------------------------------------------------


On Dec. 19, 2016, 11:20 p.m., Andrew Schwartzmeyer wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54877/
> -----------------------------------------------------------
> 
> (Updated Dec. 19, 2016, 11:20 p.m.)
> 
> 
> Review request for mesos, Daniel Pravat, Alex Clemmer, and Joseph Wu.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The API `SHGetKnownFolderPath` requires `Shell32.dll`,
> which is not available on Nano server.
> The equivalent API `GetAllUsersProfileDirectory`
> only requires `Userenv.dll`, which is available on Nano.
> 
> This API is also friendlier, as we own the allocation.
> 
> The Unicode version `GetAllUsersProfileDirectoryW` is
> explicitly used so that we are guaranteed a Unicode path,
> which we then convert from UTF-16 to UTF-8,
> instead of using the ASCII version which depends on a
> varying Windows code-page, and is not recommended.
> 
> A `vector<wchar_t>` is used over a `wstring` to avoid dealing
> with the placement of the null-terminating character.
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/windows.hpp 
> e641c46d033372e1b6c9f9c066b1ad4957d55088 
>   3rdparty/stout/include/stout/windows/os.hpp 
> 5cd92545a49648e39e8eb7cf131895e9cfc97902 
> 
> Diff: https://reviews.apache.org/r/54877/diff/
> 
> 
> Testing
> -------
> 
> cmake && msbuild, attach agent to master and check default `runtime_dir` 
> value.
> 
> 
> Thanks,
> 
> Andrew Schwartzmeyer
> 
>

Reply via email to