hazohelet added a comment.

In D155610#4550346 <https://reviews.llvm.org/D155610#4550346>, @cor3ntin wrote:

> I've been thinking about it and I think I have a cleaner design for the 
> printing of characters:
>
> We need a `CharToString(unsigned, Qualtype) -> SmallString` method that takes 
> a value and the type.
> for `char` and `char8_t` we can just return the value.
> For `wchar_t`, `char32_t` and `char16_t`, we can use something like 
> `ConvertCodePointToUTF8` to convert to UTF-8. If that fails we can escape 
> with `\x`
> If we pass the result of that to the diagnostic engine, escaping  of non 
> printing character would happen automatically.
>
> That way we have a nice separation between converting an APValue to string 
> and printing it, which should avoid code duplication quite a bit, and 
> generally make the design nicer.
> Then maybe we need to consider whether we want to modify 
> `CharacterLiteral::print` to be aligned with all of that. I don;t know if 
> that's used, for example, for mangling.
>
> Given there are a bunch of different issues here, i would not mind separate 
> PRs - having the numerical value showed in paren seems valuable on its own.

Thanks for the suggestion. Printing of multibyte character is a bit out of the 
scope of the original goal of this patch, but I'll give it a try.



================
Comment at: clang/lib/Basic/Diagnostic.cpp:838-858
+      if (UseUCN)
+        OutStream << "\\u"
+                  << llvm::format_hex_no_prefix(CodepointValue, /*Width=*/4,
+                                                /*Upper=*/false);
+      else
+        OutStream << "<U+"
+                  << llvm::format_hex_no_prefix(CodepointValue, /*Width=*/4,
----------------
cor3ntin wrote:
> hazohelet wrote:
> > cor3ntin wrote:
> > > The use UCN addition is probably not justified. we should consistent in 
> > > how we print the value of non-printable code points.
> > My motivation here is to print a valid character literal. I think it 
> > justifies this change somewhat. I'd like to see what others think about 
> > this.
> Why? there is no expectation that diagnostics messages reproduce C++. 
> Consistency between string literals and characters literals is a lot more 
> important
The diagnostic meesage here looks like `expression evaluates to 'VALUE1 == 
VALUE2'`
I tend to expect that `VALUE1 == VALUE2` is a syntactically valid expression 
because of the syntactical element `==`. But if others do not feel the same 
way, I am okay with something like `'<U+0001>' == '<U+0002>'`


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155610/new/

https://reviews.llvm.org/D155610

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to