aaron.ballman added inline comments.
================ Comment at: clang-tools-extra/clang-tidy/utils/HeaderGuard.cpp:28 +/// Replace invalid characters in the identifier with '_'. +static std::string replaceInvalidChars(StringRef Identifier) { + std::string Fixed{Identifier}; ---------------- salman-javed-nz wrote: > aaron.ballman wrote: > > This fixes some issues but I think it introduces new issues too. The old > > code would accept Unicode characters and pass them along as (valid) > > identifier characters. The new code will replace all the Unicode characters > > with `_` which means some people's header guards may go from useful to > > `_______________`. We should definitely add test cases for Unicode file > > names. > > > > Ultimately, I think this functionality will want to use `UnicodeCharSets.h` > > to implement something along the lines of what's done in > > `isAllowedInitiallyIDChar()` and `isAllowedIDChar()` from Lexer.cpp. > `isAllowed*IDChar()`does exactly what I want, but is a static function in > Lexer.cpp. > Should I be changing its declaration so that it's exposed via Lexer.h? > I don't know what the bar is to warrant changing core public Clang API. > > If I can just call the function directly from llvm-header-guard, I can avoid > code duplication. > > We could expose the interface from `Lexer` if we feel that's the right approach. I was going back and forth on whether it was the right approach and eventually thought that we don't want to expose it directly because I don't think we want clang-tidy to care about things like `LangOpts.AsmPreprocessor` or even `LangOpts.DollarIdents`. So my recommendation is to put this into LexerUtils.h/.cpp in clang-tidy rather than changing the Clang Lexer interface. WDYT? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D114149/new/ https://reviews.llvm.org/D114149 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits