Ah...good point. Richard pointed out to me that we probably shouldn't be using the library functions anyway, since sometimes they're locale-sensitive and sometimes they aren't. We can replace them all with LLVM ASCII-only versions that take a char instead of int.
Do you want to seek out all the uses or should I? On Jan 29, 2013, at 0:27 , "Benyei, Guy" <[email protected]> wrote: > Right, but you did change the failing tests, and added Unicode characters > (FixIt/fixit-unicode.c, CodeGen\ucn-identifiers.c). > Anyhow, the failures occur in debug build only: > > CRT assert: f:\dd\vctools\crt_bld\self_x86\crt\src\isctype.c(56) : Assertion > failed: (unsigned)(c + 1) <= 256 > > Apparently this assertion is supposed to check that the value passed to the > character classification functions is an actual character. However, since in > clang and LLVM there are several places where a signed char is passed to > these function - the negative char values are simply passed as negative int > values, and the c-style cast above results in some really big numbers, that > fail the assertion. > > Thanks > Guy > > > -----Original Message----- > From: Jordan Rose [mailto:[email protected]] > Sent: Monday, January 28, 2013 19:03 > To: Benyei, Guy; NAKAMURA Takumi > Cc: [email protected] > Subject: Re: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears > in a C90 file. > > Hm, I didn't change any of these, and they've been working fine on Takumi's > bot for a while now. Takumi, do you have an opinion on these fixes? > > Jordan > > > On Jan 28, 2013, at 5:45 , "Benyei, Guy" <[email protected]> wrote: > >> Hi Jordan, >> Working on Windows/Visual Studio 2010, I get asserts due to the usage of >> signed chars representing Unicode characters in some library functions >> expecting int arguments (specifically: isdigit and isalnum). Ultimately, >> this makes Visual Studio crash while running check-clang. >> A possible solution would be to cast these char values to unsigned char: >> >> Index: LiteralSupport.cpp >> =================================================================== >> --- LiteralSupport.cpp (revision 173676) >> +++ LiteralSupport.cpp (working copy) >> @@ -458,7 +458,7 @@ >> // and FP constants (specifically, the 'pp-number' regex), and assumes that >> // the byte at "*end" is both valid and not part of the regex. Because of >> // this, it doesn't have to check for 'overscan' in various places. >> - assert(!isalnum(*ThisTokEnd) && *ThisTokEnd != '.' && *ThisTokEnd >> != '_' && >> + assert(!isalnum((unsigned char)*ThisTokEnd) && *ThisTokEnd != '.' >> + && *ThisTokEnd != '_' && >> "Lexer didn't maximally munch?"); >> >> s = DigitsBegin = ThisTokBegin; >> >> >> Index: AsmWriter.cpp >> =================================================================== >> --- AsmWriter.cpp (revision 173676) >> +++ AsmWriter.cpp (working copy) >> @@ -117,7 +117,7 @@ >> } >> >> // Scan the name to see if it needs quotes first. >> - bool NeedsQuotes = isdigit(Name[0]); >> + bool NeedsQuotes = isdigit((unsigned char)Name[0]); >> if (!NeedsQuotes) { >> for (unsigned i = 0, e = Name.size(); i != e; ++i) { >> // By making this unsigned, the value passed in to isalnum will >> always be >> >> Anyhow, these are only the specific places the current LITs fail on - I'm >> really not sure if there are others. >> >> Thanks >> Guy >> >> -----Original Message----- >> From: [email protected] >> [mailto:[email protected]] On Behalf Of Jordan Rose >> Sent: Sunday, January 27, 2013 22:12 >> To: [email protected] >> Subject: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears in >> a C90 file. >> >> Author: jrose >> Date: Sun Jan 27 14:12:04 2013 >> New Revision: 173622 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=173622&view=rev >> Log: >> PR15067: Don't assert when a UCN appears in a C90 file. >> >> Unfortunately, we can't accept the UCN as an extension because we're >> required to treat it as two tokens for preprocessing purposes. >> >> Modified: >> cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td >> cfe/trunk/lib/Lex/Lexer.cpp >> cfe/trunk/lib/Lex/LiteralSupport.cpp >> cfe/trunk/test/Lexer/c90.c >> >> Modified: cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Diag >> nosticLexKinds.td?rev=173622&r1=173621&r2=173622&view=diff >> ====================================================================== >> ======== >> --- cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td (original) >> +++ cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td Sun Jan 27 >> +++ 14:12:04 2013 >> @@ -127,6 +127,11 @@ def warn_cxx98_compat_literal_ucn_escape >> def warn_cxx98_compat_literal_ucn_control_character : Warning< >> "universal character name referring to a control character " >> "is incompatible with C++98">, InGroup<CXX98Compat>, DefaultIgnore; >> +def warn_ucn_not_valid_in_c89 : Warning< >> + "universal character names are only valid in C99 or C++; " >> + "treating as '\\' followed by identifier">, InGroup<Unicode>; def >> +warn_ucn_not_valid_in_c89_literal : ExtWarn< >> + "universal character names are only valid in C99 or C++">, >> +InGroup<Unicode>; >> >> >> // Literal >> @@ -165,8 +170,6 @@ def ext_string_too_long : Extension<"str >> "support">, InGroup<OverlengthStrings>; def err_character_too_large : >> Error< >> "character too large for enclosing character literal type">; -def >> warn_ucn_not_valid_in_c89 : ExtWarn< >> - "unicode escape sequences are only valid in C99 or C++">, >> InGroup<Unicode>; def warn_cxx98_compat_unicode_literal : Warning< >> "unicode literals are incompatible with C++98">, >> InGroup<CXX98Compat>, DefaultIgnore; >> >> Modified: cfe/trunk/lib/Lex/Lexer.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=17 >> 3622&r1=173621&r2=173622&view=diff >> ====================================================================== >> ======== >> --- cfe/trunk/lib/Lex/Lexer.cpp (original) >> +++ cfe/trunk/lib/Lex/Lexer.cpp Sun Jan 27 14:12:04 2013 >> @@ -2698,8 +2698,6 @@ bool Lexer::isCodeCompletionPoint(const >> >> uint32_t Lexer::tryReadUCN(const char *&StartPtr, const char *SlashLoc, >> Token *Result) { >> - assert(LangOpts.CPlusPlus || LangOpts.C99); >> - >> unsigned CharSize; >> char Kind = getCharAndSize(StartPtr, CharSize); >> >> @@ -2711,6 +2709,11 @@ uint32_t Lexer::tryReadUCN(const char *& >> else >> return 0; >> >> + if (!LangOpts.CPlusPlus && !LangOpts.C99) { >> + Diag(SlashLoc, diag::warn_ucn_not_valid_in_c89); >> + return 0; >> + } >> + >> const char *CurPtr = StartPtr + CharSize; >> const char *KindLoc = &CurPtr[-1]; >> >> @@ -2737,7 +2740,7 @@ uint32_t Lexer::tryReadUCN(const char *& >> } >> } >> } >> - >> + >> return 0; >> } >> >> >> Modified: cfe/trunk/lib/Lex/LiteralSupport.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/LiteralSupport.c >> pp?rev=173622&r1=173621&r2=173622&view=diff >> ====================================================================== >> ======== >> --- cfe/trunk/lib/Lex/LiteralSupport.cpp (original) >> +++ cfe/trunk/lib/Lex/LiteralSupport.cpp Sun Jan 27 14:12:04 2013 >> @@ -277,7 +277,7 @@ static bool ProcessUCNEscape(const char >> >> if (!Features.CPlusPlus && !Features.C99 && Diags) >> Diag(Diags, Features, Loc, ThisTokBegin, UcnBegin, ThisTokBuf, >> - diag::warn_ucn_not_valid_in_c89); >> + diag::warn_ucn_not_valid_in_c89_literal); >> >> return true; >> } >> >> Modified: cfe/trunk/test/Lexer/c90.c >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/c90.c?rev=173 >> 622&r1=173621&r2=173622&view=diff >> ====================================================================== >> ======== >> --- cfe/trunk/test/Lexer/c90.c (original) >> +++ cfe/trunk/test/Lexer/c90.c Sun Jan 27 14:12:04 2013 >> @@ -29,8 +29,8 @@ void test2() { >> } >> >> void test3() { >> - (void)L"\u1234"; // expected-error {{unicode escape sequences are >> only valid in C99 or C++}} >> - (void)L'\u1234'; // expected-error {{unicode escape sequences are >> only valid in C99 or C++}} >> + (void)L"\u1234"; // expected-error {{universal character names are >> + only valid in C99 or C++}} (void)L'\u1234'; // expected-error >> + {{universal character names are only valid in C99 or C++}} >> } >> >> #define PREFIX(x) foo ## x >> @@ -39,3 +39,5 @@ int test4() { >> int *p = &PREFIX(0p+1); >> return p[-1]; >> } >> + >> +#define MY_UCN \u00FC // expected-warning {{universal character names >> +are only valid in C99 or C++; treating as '\' followed by >> +identifier}} >> >> >> _______________________________________________ >> cfe-commits mailing list >> [email protected] >> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >> --------------------------------------------------------------------- >> Intel Israel (74) Limited >> >> This e-mail and any attachments may contain confidential material for >> the sole use of the intended recipient(s). Any review or distribution >> by others is strictly prohibited. If you are not the intended >> recipient, please contact the sender and delete all copies. >> > > --------------------------------------------------------------------- > Intel Israel (74) Limited > > This e-mail and any attachments may contain confidential material for > the sole use of the intended recipient(s). Any review or distribution > by others is strictly prohibited. If you are not the intended > recipient, please contact the sender and delete all copies. > _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
