sammccall created this revision.
sammccall added a reviewer: hokein.
Herald added a project: All.
sammccall requested review of this revision.
Herald added projects: clang, clang-tools-extra.
Herald added a subscriber: cfe-commits.

Currently, if a lexically valid UCN encodes an invalid codepoint, we
diagnose it and then hit an assertion while trying to decode it.

Nothing prevents us from reaching this state, so remove the assertion and
append nothing for the invalid codepoint instead.
expandUCNs("X\UAAAAAAAAY") will produce "XY".
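
To illustrate the intended behavior outside of clang, here is a minimal standalone sketch. It is not the clang implementation: encodeUTF8 is a hypothetical stand-in for llvm::ConvertCodePointToUTF8 (assumed here to fail for surrogates and for values above U+10FFFF), and expandUCNsSketch is a simplified, hypothetical analogue of clang::expandUCNs that only handles well-formed 4- and 8-digit escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for llvm::ConvertCodePointToUTF8. Returns false and
// appends nothing when CP is not an encodable Unicode scalar value.
bool encodeUTF8(uint32_t CP, std::string &Out) {
  if (CP > 0x10FFFF || (CP >= 0xD800 && CP <= 0xDFFF))
    return false;
  if (CP < 0x80) {
    Out += static_cast<char>(CP);
  } else if (CP < 0x800) {
    Out += static_cast<char>(0xC0 | (CP >> 6));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else if (CP < 0x10000) {
    Out += static_cast<char>(0xE0 | (CP >> 12));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (CP >> 18));
    Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  }
  return true;
}

// Returns the value of a hex digit, or -1 if C is not one.
int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Simplified, hypothetical analogue of clang::expandUCNs: replaces each
// backslash-u (4 hex digits) or backslash-U (8 hex digits) escape with its
// UTF-8 encoding. An escape naming an invalid codepoint is consumed but
// contributes no bytes, mirroring the patch (no assertion, nothing appended).
std::string expandUCNsSketch(const std::string &Input) {
  std::string Out;
  for (std::size_t I = 0; I < Input.size();) {
    if (Input[I] == '\\' && I + 1 < Input.size() &&
        (Input[I + 1] == 'u' || Input[I + 1] == 'U')) {
      std::size_t Digits = Input[I + 1] == 'u' ? 4 : 8;
      if (I + 2 + Digits <= Input.size()) {
        uint32_t CP = 0;
        bool AllHex = true;
        for (std::size_t J = 0; J < Digits; ++J) {
          int V = hexValue(Input[I + 2 + J]);
          if (V < 0) {
            AllHex = false;
            break;
          }
          CP = (CP << 4) | static_cast<uint32_t>(V);
        }
        if (AllHex) {
          encodeUTF8(CP, Out); // On failure, deliberately append nothing.
          I += 2 + Digits;
          continue;
        }
      }
    }
    Out += Input[I++]; // Ordinary character, or a malformed escape: copy as-is.
  }
  return Out;
}
```

Under these assumptions, passing the string X\UAAAAAAAAY to expandUCNsSketch yields XY, because 0xAAAAAAAA is far above U+10FFFF and the encoder declines to produce any bytes for it.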


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125059

Files:
  clang-tools-extra/pseudo/test/crash/bad-ucn.c
  clang/lib/Lex/LiteralSupport.cpp
  clang/test/Lexer/unicode.c


Index: clang/test/Lexer/unicode.c
===================================================================
--- clang/test/Lexer/unicode.c
+++ clang/test/Lexer/unicode.c
@@ -28,6 +28,9 @@
 
         int _;
 
+extern int X\UAAAAAAAA; // expected-error {{not allowed in an identifier}}
+int Y = '\UAAAAAAAA'; // expected-error {{invalid universal character}}
+
 #ifdef __cplusplus
 
 extern int ༀ;
Index: clang/lib/Lex/LiteralSupport.cpp
===================================================================
--- clang/lib/Lex/LiteralSupport.cpp
+++ clang/lib/Lex/LiteralSupport.cpp
@@ -320,10 +320,8 @@
                             llvm::SmallVectorImpl<char> &Str) {
   char ResultBuf[4];
   char *ResultPtr = ResultBuf;
-  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
-  (void)Res;
-  assert(Res && "Unexpected conversion failure");
-  Str.append(ResultBuf, ResultPtr);
+  if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))
+    Str.append(ResultBuf, ResultPtr);
 }
 
 void clang::expandUCNs(SmallVectorImpl<char> &Buf, StringRef Input) {
Index: clang-tools-extra/pseudo/test/crash/bad-ucn.c
===================================================================
--- /dev/null
+++ clang-tools-extra/pseudo/test/crash/bad-ucn.c
@@ -0,0 +1,4 @@
+// This UCN doesn't encode a valid codepoint.
+// We used to assert while trying to expand UCNs in the token.
+// RUN: clang-pseudo -source=%s
+A\UAAAAAAAA

