Re: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears in a C90 file.

Jordan Rose Wed, 30 Jan 2013 08:38:52 -0800

Ah...good point. Richard pointed out to me that we probably shouldn't be using 
the library functions anyway, since sometimes they're locale-sensitive and 
sometimes they aren't. We can replace them all with LLVM ASCII-only versions 
that take a char instead of int.


Do you want to seek out all the uses or should I?


On Jan 29, 2013, at 0:27 , "Benyei, Guy" <[email protected]> wrote:

> Right, but you did change the failing tests, and added Unicode characters 
> (FixIt/fixit-unicode.c, CodeGen\ucn-identifiers.c). 
> Anyhow, the failures occur in debug build only:
> 
>  CRT assert: f:\dd\vctools\crt_bld\self_x86\crt\src\isctype.c(56) : Assertion 
> failed: (unsigned)(c + 1) <= 256
> 
> Apparently this assertion is supposed to check that the value passed to the 
> character classification functions is an actual character. However, since in 
> clang and LLVM there are several places where a signed char is passed to 
> these function - the negative char values are simply passed as negative int 
> values, and the c-style cast above results in some really big numbers, that 
> fail the assertion.
> 
> Thanks
>   Guy
> 
> 
> -----Original Message-----
> From: Jordan Rose [mailto:[email protected]] 
> Sent: Monday, January 28, 2013 19:03
> To: Benyei, Guy; NAKAMURA Takumi
> Cc: [email protected]
> Subject: Re: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears 
> in a C90 file.
> 
> Hm, I didn't change any of these, and they've been working fine on Takumi's 
> bot for a while now. Takumi, do you have an opinion on these fixes?
> 
> Jordan
> 
> 
> On Jan 28, 2013, at 5:45 , "Benyei, Guy" <[email protected]> wrote:
> 
>> Hi Jordan,
>> Working on Windows/Visual Studio 2010, I get asserts due to the usage of 
>> signed chars representing Unicode characters in some library functions 
>> expecting int arguments (specifically: isdigit and isalnum). Ultimately, 
>> this makes Visual Studio crash while running check-clang.
>> A possible solution would be to cast these char values to unsigned char:
>> 
>> Index: LiteralSupport.cpp
>> ===================================================================
>> --- LiteralSupport.cpp       (revision 173676)
>> +++ LiteralSupport.cpp       (working copy)
>> @@ -458,7 +458,7 @@
>>  // and FP constants (specifically, the 'pp-number' regex), and assumes that
>>  // the byte at "*end" is both valid and not part of the regex.  Because of
>>  // this, it doesn't have to check for 'overscan' in various places.
>> -  assert(!isalnum(*ThisTokEnd) && *ThisTokEnd != '.' && *ThisTokEnd 
>> != '_' &&
>> +  assert(!isalnum((unsigned char)*ThisTokEnd) && *ThisTokEnd != '.' 
>> + && *ThisTokEnd != '_' &&
>>         "Lexer didn't maximally munch?");
>> 
>>  s = DigitsBegin = ThisTokBegin;
>> 
>> 
>> Index: AsmWriter.cpp
>> ===================================================================
>> --- AsmWriter.cpp    (revision 173676)
>> +++ AsmWriter.cpp    (working copy)
>> @@ -117,7 +117,7 @@
>>  }
>> 
>>  // Scan the name to see if it needs quotes first.
>> -  bool NeedsQuotes = isdigit(Name[0]);
>> +  bool NeedsQuotes = isdigit((unsigned char)Name[0]);
>>  if (!NeedsQuotes) {
>>    for (unsigned i = 0, e = Name.size(); i != e; ++i) {
>>      // By making this unsigned, the value passed in to isalnum will 
>> always be
>> 
>> Anyhow, these are only the specific places the current LITs fail on - I'm 
>> really not sure if there are others.
>> 
>> Thanks
>>   Guy
>> 
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Jordan Rose
>> Sent: Sunday, January 27, 2013 22:12
>> To: [email protected]
>> Subject: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears in 
>> a C90 file.
>> 
>> Author: jrose
>> Date: Sun Jan 27 14:12:04 2013
>> New Revision: 173622
>> 
>> URL: http://llvm.org/viewvc/llvm-project?rev=173622&view=rev
>> Log:
>> PR15067: Don't assert when a UCN appears in a C90 file.
>> 
>> Unfortunately, we can't accept the UCN as an extension because we're 
>> required to treat it as two tokens for preprocessing purposes.
>> 
>> Modified:
>>   cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td
>>   cfe/trunk/lib/Lex/Lexer.cpp
>>   cfe/trunk/lib/Lex/LiteralSupport.cpp
>>   cfe/trunk/test/Lexer/c90.c
>> 
>> Modified: cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td
>> URL: 
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/Diag
>> nosticLexKinds.td?rev=173622&r1=173621&r2=173622&view=diff
>> ======================================================================
>> ========
>> --- cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td (original)
>> +++ cfe/trunk/include/clang/Basic/DiagnosticLexKinds.td Sun Jan 27
>> +++ 14:12:04 2013
>> @@ -127,6 +127,11 @@ def warn_cxx98_compat_literal_ucn_escape
>> def warn_cxx98_compat_literal_ucn_control_character : Warning<
>>  "universal character name referring to a control character "
>>  "is incompatible with C++98">, InGroup<CXX98Compat>, DefaultIgnore;
>> +def warn_ucn_not_valid_in_c89 : Warning<
>> +  "universal character names are only valid in C99 or C++; "
>> +  "treating as '\\' followed by identifier">, InGroup<Unicode>; def 
>> +warn_ucn_not_valid_in_c89_literal : ExtWarn<
>> +  "universal character names are only valid in C99 or C++">, 
>> +InGroup<Unicode>;
>> 
>> 
>> // Literal
>> @@ -165,8 +170,6 @@ def ext_string_too_long : Extension<"str
>>  "support">, InGroup<OverlengthStrings>;  def err_character_too_large : 
>> Error<
>>  "character too large for enclosing character literal type">; -def 
>> warn_ucn_not_valid_in_c89 : ExtWarn<
>> -  "unicode escape sequences are only valid in C99 or C++">, 
>> InGroup<Unicode>;  def warn_cxx98_compat_unicode_literal : Warning<
>>  "unicode literals are incompatible with C++98">,
>>  InGroup<CXX98Compat>, DefaultIgnore;
>> 
>> Modified: cfe/trunk/lib/Lex/Lexer.cpp
>> URL: 
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/Lexer.cpp?rev=17
>> 3622&r1=173621&r2=173622&view=diff
>> ======================================================================
>> ========
>> --- cfe/trunk/lib/Lex/Lexer.cpp (original)
>> +++ cfe/trunk/lib/Lex/Lexer.cpp Sun Jan 27 14:12:04 2013
>> @@ -2698,8 +2698,6 @@ bool Lexer::isCodeCompletionPoint(const
>> 
>> uint32_t Lexer::tryReadUCN(const char *&StartPtr, const char *SlashLoc,
>>                           Token *Result) {
>> -  assert(LangOpts.CPlusPlus || LangOpts.C99);
>> -
>>  unsigned CharSize;
>>  char Kind = getCharAndSize(StartPtr, CharSize);
>> 
>> @@ -2711,6 +2709,11 @@ uint32_t Lexer::tryReadUCN(const char *&
>>  else
>>    return 0;
>> 
>> +  if (!LangOpts.CPlusPlus && !LangOpts.C99) {
>> +    Diag(SlashLoc, diag::warn_ucn_not_valid_in_c89);
>> +    return 0;
>> +  }
>> +
>>  const char *CurPtr = StartPtr + CharSize;
>>  const char *KindLoc = &CurPtr[-1];
>> 
>> @@ -2737,7 +2740,7 @@ uint32_t Lexer::tryReadUCN(const char *&
>>          }
>>        }
>>      }
>> -      
>> +
>>      return 0;
>>    }
>> 
>> 
>> Modified: cfe/trunk/lib/Lex/LiteralSupport.cpp
>> URL: 
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Lex/LiteralSupport.c
>> pp?rev=173622&r1=173621&r2=173622&view=diff
>> ======================================================================
>> ========
>> --- cfe/trunk/lib/Lex/LiteralSupport.cpp (original)
>> +++ cfe/trunk/lib/Lex/LiteralSupport.cpp Sun Jan 27 14:12:04 2013
>> @@ -277,7 +277,7 @@ static bool ProcessUCNEscape(const char
>> 
>>  if (!Features.CPlusPlus && !Features.C99 && Diags)
>>    Diag(Diags, Features, Loc, ThisTokBegin, UcnBegin, ThisTokBuf,
>> -         diag::warn_ucn_not_valid_in_c89);
>> +         diag::warn_ucn_not_valid_in_c89_literal);
>> 
>>  return true;
>> }
>> 
>> Modified: cfe/trunk/test/Lexer/c90.c
>> URL: 
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Lexer/c90.c?rev=173
>> 622&r1=173621&r2=173622&view=diff 
>> ======================================================================
>> ========
>> --- cfe/trunk/test/Lexer/c90.c (original)
>> +++ cfe/trunk/test/Lexer/c90.c Sun Jan 27 14:12:04 2013
>> @@ -29,8 +29,8 @@ void test2() {
>> }
>> 
>> void test3() {
>> -  (void)L"\u1234";  // expected-error {{unicode escape sequences are 
>> only valid in C99 or C++}}
>> -  (void)L'\u1234';  // expected-error {{unicode escape sequences are 
>> only valid in C99 or C++}}
>> +  (void)L"\u1234";  // expected-error {{universal character names are 
>> + only valid in C99 or C++}}  (void)L'\u1234';  // expected-error 
>> + {{universal character names are only valid in C99 or C++}}
>> }
>> 
>> #define PREFIX(x) foo ## x
>> @@ -39,3 +39,5 @@ int test4() {
>>  int *p = &PREFIX(0p+1);
>>  return p[-1];
>> }
>> +
>> +#define MY_UCN \u00FC // expected-warning {{universal character names 
>> +are only valid in C99 or C++; treating as '\' followed by 
>> +identifier}}
>> 
>> 
>> _______________________________________________
>> cfe-commits mailing list
>> [email protected]
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>> ---------------------------------------------------------------------
>> Intel Israel (74) Limited
>> 
>> This e-mail and any attachments may contain confidential material for 
>> the sole use of the intended recipient(s). Any review or distribution 
>> by others is strictly prohibited. If you are not the intended 
>> recipient, please contact the sender and delete all copies.
>> 
> 
> ---------------------------------------------------------------------
> Intel Israel (74) Limited
> 
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> 

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [cfe-commits] r173622 - PR15067: Don't assert when a UCN appears in a C90 file.

Reply via email to