cor3ntin added inline comments.

================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16869
+/// The code point needs to be zero-extended to 32-bits.
+void ConvertCharToString(uint32_t CodePoint, const BuiltinType *BTy,
+                         unsigned TyWidth, llvm::raw_ostream &OS) {
----------------
hubert.reinterpretcast wrote:
> It does not seem that the first parameter expects a `CodePoint` argument in 
> all cases. For `Char_S`, `Char_U`, and `Char8`, it seems the function wants 
> to treat the input as a UTF-8 code unit.
> 
> I suggest changing the argument to be clearly a code unit (and potentially 
> treat it as a code point value as appropriate later in the function).
> 
> Also: The function should probably be declared as having static linkage.
> Additionally: The function does not "convert" in the language semantic sense. 
> `WriteCharacterValueDescriptionForDisplay` might be a better name.
Agreed, `CodeUnit` or `Value` would be more correct (mostly because of numeric 
escape sequences).
But if we are going to change that then `WriteCharValueForDiagnostic` would be 
better, `Character` implies too much


================
Comment at: clang/lib/Sema/SemaDeclCXX.cpp:16876
+  // other types.
+  if (CodePoint <= UCHAR_MAX) {
+    StringRef Escaped = escapeCStyle<EscapeChar::Single>(CodePoint);
----------------
hubert.reinterpretcast wrote:
> For types other than `Char_S`, `Char_U`, and `Char8`, this fails to treat the 
> C1 Controls and Latin-1 Supplement characters as Unicode code points. It 
> looks like test coverage for these cases are missing.
`escapeCStyle` is one of the things that assume ASCII / UTF, but yes, we might 
as well reduce to 0x7F just to avoid unnecessary work


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155610/new/

https://reviews.llvm.org/D155610

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to