> -----Original Message-----
> From: Arthur O'Dwyer [mailto:[email protected]]
> Sent: Wednesday, September 11, 2013 1:35 PM
> To: Gao, Yunzhong; cfe-commits
> Cc: [email protected]
> Subject: Re: [PATCH] [2/6] Convert non-printing characters to their octal
> sequence before emitting #line directive or __FILE__ macro
>
> If #include directives will use UTF-8, then __FILE__ must also use UTF-8, so
> that this will work:
>
> #include __FILE__
>
> And I would expect #line directives also to use UTF-8. The only good rationale
> I can imagine is that you're dealing with badly behaved third-party generators
> such as lex/yacc which dump malformed #line directives into the source file.
>
> The patch looks good to me, but the stated rationale is misleading; I don't
> think this patch helps with anything on a well-behaved system (even one
> where the filesystem charset is Shift-JIS). It merely helps Clang not-barf on
> malformed input (such as that produced by a badly behaved lex/yacc).
>
> my $.02,
> -Arthur

For some reason, your replies just won't appear in Phabricator, while Eli's went
in just fine. Weird.

I think a UTF-8 encoded source file should not contain Shift-JIS encoded lines
like this:

#include "こんにちは.c"

But it is okay to have lines like this:

#include "\202\261\202\361\202\311\202\277\202\315.c"

You might be right that the current patch does not help the compiler find the
included file, because the compiler will attempt a UTF-8 to Unicode translation
on the Shift-JIS file name. The patch only makes sure that you do not have
strange characters in the preprocessed file. A UTF-8 encoded file name like the
following might help the compiler find the file:

#include "\343\203\231\343\203\274\343\202\267\343\203\203\343\202\257.c"

http://llvm-reviews.chandlerc.com/D1291
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
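
For anyone who wants to reproduce the octal escapes discussed above, here is a rough Python sketch. It is not part of the patch and does not use any Clang code: `octal_escape` is a hypothetical helper name, and the sketch assumes Python's built-in `shift_jis` and `utf-8` codecs. It encodes a file name in a given charset and writes each byte outside printable ASCII as a three-digit octal escape.

```python
def octal_escape(name: str, encoding: str) -> str:
    """Encode `name` in `encoding` and octal-escape non-printable bytes.

    Illustrative only: this is byte-oriented and does not give special
    treatment to Shift-JIS trail bytes that happen to fall in the ASCII
    range (such as 0x5C), nor does it escape quotes or backslashes.
    """
    pieces = []
    for byte in name.encode(encoding):
        if 0x20 <= byte < 0x7F:
            # Printable ASCII is emitted unchanged.
            pieces.append(chr(byte))
        else:
            # Everything else becomes a backslash plus three octal digits.
            pieces.append("\\%03o" % byte)
    return "".join(pieces)


if __name__ == "__main__":
    for charset in ("shift_jis", "utf-8"):
        escaped = octal_escape("こんにちは.c", charset)
        print('%-9s #include "%s"' % (charset, escaped))
```

With `shift_jis`, the escaped name is `\202\261\202\361\202\311\202\277\202\315.c`, which matches the escaped Shift-JIS #include line in the message. The same name escaped as UTF-8 uses three bytes per character instead of two.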
