sammccall created this revision.
sammccall added a reviewer: hokein.
Herald added a project: All.
sammccall requested review of this revision.
Herald added projects: clang, clang-tools-extra.
Herald added a subscriber: cfe-commits.

Currently if a lexically-valid UCN encodes an invalid codepoint, then we
diagnose that, and then hit an assertion while trying to decode it.

Since there isn't anything preventing us reaching this state, remove the
assertion. expandUCNs("X\UAAAAAAAAY") will produce "XY".

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125059

Files:
  clang-tools-extra/pseudo/test/crash/bad-ucn.c
  clang/lib/Lex/LiteralSupport.cpp
  clang/test/Lexer/unicode.c

Index: clang/test/Lexer/unicode.c
===================================================================
--- clang/test/Lexer/unicode.c
+++ clang/test/Lexer/unicode.c
@@ -28,6 +28,9 @@
 
 int _;
 
+extern int X\UAAAAAAAA; // expected-error {{not allowed in an identifier}}
+int Y = '\UAAAAAAAA'; // expected-error {{invalid universal character}}
+
 #ifdef __cplusplus
 
 extern int ༀ;
Index: clang/lib/Lex/LiteralSupport.cpp
===================================================================
--- clang/lib/Lex/LiteralSupport.cpp
+++ clang/lib/Lex/LiteralSupport.cpp
@@ -320,10 +320,8 @@
                       llvm::SmallVectorImpl<char> &Str) {
   char ResultBuf[4];
   char *ResultPtr = ResultBuf;
-  bool Res = llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr);
-  (void)Res;
-  assert(Res && "Unexpected conversion failure");
-  Str.append(ResultBuf, ResultPtr);
+  if (llvm::ConvertCodePointToUTF8(Codepoint, ResultPtr))
+    Str.append(ResultBuf, ResultPtr);
 }
 
 void clang::expandUCNs(SmallVectorImpl<char> &Buf, StringRef Input) {
Index: clang-tools-extra/pseudo/test/crash/bad-ucn.c
===================================================================
--- /dev/null
+++ clang-tools-extra/pseudo/test/crash/bad-ucn.c
@@ -0,0 +1,4 @@
+// This UCN doesn't encode a valid codepoint.
+// We used to assert while trying to expand UCNs in the token.
+// RUN: clang-pseudo -source=%s
+A\UAAAAAAAA
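
For readers without an LLVM checkout, the behaviour after this change can be approximated standalone. This is only a rough sketch, not the code in LiteralSupport.cpp: encodeUTF8 and expandUCNsSketch are hypothetical names, std::string stands in for SmallVectorImpl<char>, and the encoder is assumed to reject surrogates and values above U+10FFFF. It illustrates the one point of the patch: a failed conversion appends nothing instead of asserting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for llvm::ConvertCodePointToUTF8. Appends the UTF-8
// encoding of CP to Out and returns true; returns false and appends nothing
// if CP is a surrogate or above U+10FFFF (assumed validity rules).
static bool encodeUTF8(uint32_t CP, std::string &Out) {
  if (CP > 0x10FFFF || (CP >= 0xD800 && CP <= 0xDFFF))
    return false;
  if (CP < 0x80) {
    Out += static_cast<char>(CP);
  } else if (CP < 0x800) {
    Out += static_cast<char>(0xC0 | (CP >> 6));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else if (CP < 0x10000) {
    Out += static_cast<char>(0xE0 | (CP >> 12));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (CP >> 18));
    Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CP & 0x3F));
  }
  return true;
}

// Returns the value of a hex digit, or -1 if C is not one.
static int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Simplified sketch of UCN expansion: replaces \uXXXX and \UXXXXXXXX with
// their UTF-8 encoding, and silently drops UCNs naming an invalid codepoint.
std::string expandUCNsSketch(const std::string &Input) {
  std::string Out;
  for (std::size_t I = 0; I < Input.size();) {
    std::size_t Digits = 0;
    if (Input[I] == '\\' && I + 1 < Input.size()) {
      if (Input[I + 1] == 'u') Digits = 4;
      else if (Input[I + 1] == 'U') Digits = 8;
    }
    // Not a complete UCN: copy the character through unchanged.
    if (Digits == 0 || I + 2 + Digits > Input.size()) {
      Out += Input[I++];
      continue;
    }
    uint32_t CP = 0;
    bool AllHex = true;
    for (std::size_t J = 0; J < Digits; ++J) {
      int V = hexValue(Input[I + 2 + J]);
      if (V < 0) { AllHex = false; break; }
      CP = (CP << 4) | static_cast<uint32_t>(V);
    }
    if (!AllHex) {
      Out += Input[I++];
      continue;
    }
    // The point of the patch: ignore a failed conversion rather than assert.
    encodeUTF8(CP, Out);
    I += 2 + Digits;
  }
  return Out;
}

// Small self-check mirroring the example in the description.
void checkExamples() {
  assert(expandUCNsSketch("X\\UAAAAAAAAY") == "XY");
  assert(expandUCNsSketch("\\u00e9") == "\xC3\xA9");
}
```

Dropping the invalid UCN seems reasonable here because the lexer has already diagnosed it by this point; the expansion only needs to avoid crashing on input that is known to be bad.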
_______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits