[clang] [clang-tools-extra] [llvm] [clang][modules] stdarg.h and stddef.h shouldn't directly declare anything (PR #90676)

2024-05-06 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.

lgtm

https://github.com/llvm/llvm-project/pull/90676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese edited 
https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -932,6 +932,12 @@ def warn_module_conflict : Warning<
   InGroup;
 
 // C++20 modules
+def err_module_decl_cannot_be_macros : Error<
+  "the name of a module%select{| partition}0 declaration cannot contains "
+  "an object-like macro %1, and the macro will not expand"
+  "%select{|; did you mean '%3'?}2">;

Bigcheese wrote:

Not sure if you need "and the macro will not expand" here. Per the spec you 
don't even get to if expansion happens or not, it's just invalid being a macro.

```suggestion
def err_module_decl_cannot_be_macros : Error<
  "the module name in a module%select{| partition}0 declaration cannot contain "
  "an object-like macro %1%select{|; did you mean '%3'?}2">;
```

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -1329,6 +1341,129 @@ bool Preprocessor::LexAfterModuleImport(Token ) {
   return true;
 }
 
+/// Lex a token following the 'module' contextual keyword.
+///
+/// [cpp.module]/p2:
+/// The pp-tokens, if any, of a pp-module shall be of the form:
+///   pp-module-name pp-module-partition[opt] pp-tokens[opt]
+///
+/// where the pp-tokens (if any) shall not begin with a ( preprocessing token
+/// and the grammar non-terminals are defined as:
+///   pp-module-name:
+/// pp-module-name-qualifierp[opt] identifier
+///   pp-module-partition:
+/// : pp-module-name-qualifier[opt] identifier
+///   pp-module-name-qualifier:
+/// identifier .
+/// pp-module-name-qualifier identifier .
+/// No identifier in the pp-module-name or pp-module-partition shall currently
+/// be defined as an object-like macro.
+///
+/// [cpp.module]/p3:
+/// Any preprocessing tokens after the module preprocessing token in the module
+/// directive are processed just as in normal text.
+bool Preprocessor::LexAfterModuleDecl(Token ) {
+  // Figure out what kind of lexer we actually have.
+  recomputeCurLexerKind();
+  LexUnexpandedToken(Result);
+
+  auto EnterTokens = [this](ArrayRef Toks, bool DisableMacroExpansion) {
+auto ToksCopy = std::make_unique(Toks.size());
+std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
+EnterTokenStream(std::move(ToksCopy), Toks.size(), DisableMacroExpansion,
+ /*IsReinject=*/false);
+  };
+
+  // If we not expect an identifier but got an identifier, it's not a part of
+  // module name.
+  if (!ModuleDeclExpectsIdentifier && Result.is(tok::identifier)) {
+EnterTokens(Result, /*DisableMacroExpansion=*/false);
+return false;
+  }
+
+  // The token sequence
+  //
+  // export[opt] module identifier (. identifier)*
+  //
+  // indicates a module import directive. We already saw the 'module'
+  // contextual keyword, so now we're looking for the identifiers.
+  if (ModuleDeclExpectsIdentifier && Result.is(tok::identifier)) {
+auto *MI = getMacroInfo(Result.getIdentifierInfo());
+if (MI && MI->isObjectLike()) {
+  auto BuildFixItHint =
+  [&](std::string ) -> std::optional {
+EnterTokens(Result, /*DisableMacroExpansion=*/false);
+if (!CurTokenLexer)
+  return std::nullopt;
+Token Tok;
+bool HasUnknownToken = false;
+llvm::raw_string_ostream OS(Replacement);
+do {
+  Lex(Tok);
+  if (const char *Punc = tok::getPunctuatorSpelling(Tok.getKind()))
+OS << Punc;
+  else if (Tok.isLiteral() && Tok.getLiteralData())
+OS << StringRef(Tok.getLiteralData(), Tok.getLength());
+  else if (auto *II = Tok.getIdentifierInfo())
+OS << II->getName();
+  else
+HasUnknownToken = true;
+} while (CurTokenLexer && !CurTokenLexer->isAtEnd());

Bigcheese wrote:

I forget now where else I made this comment, but this is probably the 5th or 
6th case of expanding a macro into a string in Clang. I'd really like to have a 
single API for doing this, as they are all slightly different.

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -932,6 +932,12 @@ def warn_module_conflict : Warning<
   InGroup;
 
 // C++20 modules
+def err_module_decl_cannot_be_macros : Error<
+  "the name of a module%select{| partition}0 declaration cannot contains "
+  "an object-like macro %1, and the macro will not expand"
+  "%select{|; did you mean '%3'?}2">;
+def err_unxepected_paren_in_module_decl : Error<
+  "unexpected '(' after the name of a module%select{| partition}0 
declaration">;

Bigcheese wrote:

"name of a module declaration" is a bit odd. I think "module name in a module 
declaration" is clearer.

```suggestion
def err_unxepected_paren_in_module_decl : Error<
  "unexpected '(' after the module name in a module%select{| partition}0 
declaration">;
```

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -1329,6 +1341,129 @@ bool Preprocessor::LexAfterModuleImport(Token ) {
   return true;
 }
 
+/// Lex a token following the 'module' contextual keyword.
+///
+/// [cpp.module]/p2:
+/// The pp-tokens, if any, of a pp-module shall be of the form:
+///   pp-module-name pp-module-partition[opt] pp-tokens[opt]
+///
+/// where the pp-tokens (if any) shall not begin with a ( preprocessing token
+/// and the grammar non-terminals are defined as:
+///   pp-module-name:
+/// pp-module-name-qualifierp[opt] identifier
+///   pp-module-partition:
+/// : pp-module-name-qualifier[opt] identifier
+///   pp-module-name-qualifier:
+/// identifier .
+/// pp-module-name-qualifier identifier .
+/// No identifier in the pp-module-name or pp-module-partition shall currently
+/// be defined as an object-like macro.
+///
+/// [cpp.module]/p3:
+/// Any preprocessing tokens after the module preprocessing token in the module
+/// directive are processed just as in normal text.
+bool Preprocessor::LexAfterModuleDecl(Token ) {
+  // Figure out what kind of lexer we actually have.
+  recomputeCurLexerKind();
+  LexUnexpandedToken(Result);
+
+  auto EnterTokens = [this](ArrayRef Toks, bool DisableMacroExpansion) {
+auto ToksCopy = std::make_unique(Toks.size());
+std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
+EnterTokenStream(std::move(ToksCopy), Toks.size(), DisableMacroExpansion,
+ /*IsReinject=*/false);
+  };
+
+  // If we not expect an identifier but got an identifier, it's not a part of
+  // module name.
+  if (!ModuleDeclExpectsIdentifier && Result.is(tok::identifier)) {
+EnterTokens(Result, /*DisableMacroExpansion=*/false);
+return false;
+  }
+
+  // The token sequence
+  //
+  // export[opt] module identifier (. identifier)*
+  //
+  // indicates a module import directive. We already saw the 'module'
+  // contextual keyword, so now we're looking for the identifiers.
+  if (ModuleDeclExpectsIdentifier && Result.is(tok::identifier)) {
+auto *MI = getMacroInfo(Result.getIdentifierInfo());
+if (MI && MI->isObjectLike()) {
+  auto BuildFixItHint =
+  [&](std::string ) -> std::optional {
+EnterTokens(Result, /*DisableMacroExpansion=*/false);
+if (!CurTokenLexer)
+  return std::nullopt;
+Token Tok;
+bool HasUnknownToken = false;
+llvm::raw_string_ostream OS(Replacement);
+do {
+  Lex(Tok);
+  if (const char *Punc = tok::getPunctuatorSpelling(Tok.getKind()))
+OS << Punc;
+  else if (Tok.isLiteral() && Tok.getLiteralData())
+OS << StringRef(Tok.getLiteralData(), Tok.getLength());
+  else if (auto *II = Tok.getIdentifierInfo())
+OS << II->getName();
+  else
+HasUnknownToken = true;
+} while (CurTokenLexer && !CurTokenLexer->isAtEnd());
+if (HasUnknownToken)
+  return std::nullopt;

Bigcheese wrote:

I believe this still allows a lot of invalid cases where applying the fixit 
produces an invalid module name. I expect two different cases of people hitting 
this.

1. Accidentally use the name of a macro
Here what the macro expands to doesn't matter, as they weren't trying to 
expand it.
2. Intentionally trying to use the preprocessor to set the module name
Here the expansion is probably valid, but they went out of their way to use 
the preprocessor, so they probably don't want to put in what the macro expands 
to.

In either case I'm not sure if the fixit is actually helpful here. My 
expectation is that Clang will either tell you to replace it with something 
like `export module static inline __attribute__((always_inline))`, or something 
you don't want to replace it with anyway. It would be nice in cases like this 
if we just linked to some documentation saying what to do instead.

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -1329,6 +1341,129 @@ bool Preprocessor::LexAfterModuleImport(Token ) {
   return true;
 }
 
+/// Lex a token following the 'module' contextual keyword.
+///
+/// [cpp.module]/p2:
+/// The pp-tokens, if any, of a pp-module shall be of the form:
+///   pp-module-name pp-module-partition[opt] pp-tokens[opt]
+///
+/// where the pp-tokens (if any) shall not begin with a ( preprocessing token
+/// and the grammar non-terminals are defined as:
+///   pp-module-name:
+/// pp-module-name-qualifierp[opt] identifier
+///   pp-module-partition:
+/// : pp-module-name-qualifier[opt] identifier
+///   pp-module-name-qualifier:
+/// identifier .
+/// pp-module-name-qualifier identifier .
+/// No identifier in the pp-module-name or pp-module-partition shall currently
+/// be defined as an object-like macro.
+///
+/// [cpp.module]/p3:
+/// Any preprocessing tokens after the module preprocessing token in the module
+/// directive are processed just as in normal text.
+bool Preprocessor::LexAfterModuleDecl(Token ) {
+  // Figure out what kind of lexer we actually have.
+  recomputeCurLexerKind();
+  LexUnexpandedToken(Result);
+
+  auto EnterTokens = [this](ArrayRef Toks, bool DisableMacroExpansion) {
+auto ToksCopy = std::make_unique(Toks.size());
+std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
+EnterTokenStream(std::move(ToksCopy), Toks.size(), DisableMacroExpansion,
+ /*IsReinject=*/false);
+  };
+
+  // If we not expect an identifier but got an identifier, it's not a part of
+  // module name.

Bigcheese wrote:

```suggestion
  // If we don't expect an identifier but got an identifier, it's not a part of
  // module name.
```

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,87 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -std=c++20 %t/A.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/B.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/C.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/D.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/E.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/F.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/G.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/H.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/I.cppm -triple x86_64-linux-gnu -verify
+// RUN: %clang_cc1 -std=c++20 %t/J.cppm -triple x86_64-linux-gnu -verify
+
+//--- version.h
+#ifndef VERSION_H
+#define VERSION_H
+
+#define VERSION libv5
+#define A a
+#define B b
+#define C c
+#define FUNC_LIKE(X) function_like_##X
+#define ATTRS [[]]
+#define SEMICOLON ;
+
+#endif
+
+//--- A.cppm
+#include "version.h"

Bigcheese wrote:

The module-file grammar prohibits this, but clang doesn't implement this 
restriction yet. https://eel.is/c++draft/cpp#nt:module-file

You need to have a `module;` decl before any preprocessor directives.

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits


@@ -1329,6 +1341,129 @@ bool Preprocessor::LexAfterModuleImport(Token ) {
   return true;
 }
 
+/// Lex a token following the 'module' contextual keyword.
+///
+/// [cpp.module]/p2:
+/// The pp-tokens, if any, of a pp-module shall be of the form:
+///   pp-module-name pp-module-partition[opt] pp-tokens[opt]
+///
+/// where the pp-tokens (if any) shall not begin with a ( preprocessing token
+/// and the grammar non-terminals are defined as:
+///   pp-module-name:
+/// pp-module-name-qualifierp[opt] identifier
+///   pp-module-partition:
+/// : pp-module-name-qualifier[opt] identifier
+///   pp-module-name-qualifier:
+/// identifier .
+/// pp-module-name-qualifier identifier .
+/// No identifier in the pp-module-name or pp-module-partition shall currently
+/// be defined as an object-like macro.
+///
+/// [cpp.module]/p3:
+/// Any preprocessing tokens after the module preprocessing token in the module
+/// directive are processed just as in normal text.
+bool Preprocessor::LexAfterModuleDecl(Token ) {
+  // Figure out what kind of lexer we actually have.
+  recomputeCurLexerKind();
+  LexUnexpandedToken(Result);
+
+  auto EnterTokens = [this](ArrayRef Toks, bool DisableMacroExpansion) {
+auto ToksCopy = std::make_unique(Toks.size());
+std::copy(Toks.begin(), Toks.end(), ToksCopy.get());
+EnterTokenStream(std::move(ToksCopy), Toks.size(), DisableMacroExpansion,
+ /*IsReinject=*/false);
+  };
+
+  // If we not expect an identifier but got an identifier, it's not a part of
+  // module name.
+  if (!ModuleDeclExpectsIdentifier && Result.is(tok::identifier)) {
+EnterTokens(Result, /*DisableMacroExpansion=*/false);
+return false;
+  }
+
+  // The token sequence
+  //
+  // export[opt] module identifier (. identifier)*
+  //
+  // indicates a module import directive. We already saw the 'module'

Bigcheese wrote:

```suggestion
  // indicates a module declaration. We already saw the 'module'
```

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang] Implement P3034R1 Module Declarations Shouldn’t be Macros (PR #90574)

2024-05-06 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

> The paper does not clearly says whether disallow function-like macro is also 
> needed, but I think disallow function-like macro has the same goal as the 
> paper. WDYT? @cor3ntin @ChuanqiXu9
> 
> The wording in the paper said: _No identifier in the pp-module-name or 
> pp-module-partition shall currently be defined as an **object-like macro**._

The "where the pp-tokens (if any) shall not begin with a ( preprocessing token" 
prohibits function like macros already. The goal was to restrict only what was 
needed. So identifiers defined as function like macros are ok as long as it's 
not followed by `(`.

https://github.com/llvm/llvm-project/pull/90574
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [clang][modules] stdarg.h and stddef.h shouldn't directly declare anything (PR #90676)

2024-05-02 Thread Michael Spencer via cfe-commits


@@ -15,9 +15,11 @@ const HeaderMapCollector::RegexHeaderMap 
*getSTLPostfixHeaderMap() {
   static const HeaderMapCollector::RegexHeaderMap STLPostfixHeaderMap = {
   {"include/__stdarg___gnuc_va_list.h$", ""},
   {"include/__stdarg___va_copy.h$", ""},
+  {"include/__stdarg_macro.h$", ""},
   {"include/__stdarg_va_copy.h$", ""},
   {"include/__stdarg_va_list.h$", ""},
+  {"include/__stddef_macro.h$", "` is missing for these two.

https://github.com/llvm/llvm-project/pull/90676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [clang][modules] stdarg.h and stddef.h shouldn't directly declare anything (PR #90676)

2024-05-02 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese commented:

I think I like this as a solution, although I wonder if __stdarg_guard_macro.h 
would be a better name. The name now somewhat implies that it contains the 
macros that stddef and stdarg define.

https://github.com/llvm/llvm-project/pull/90676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [clang][modules] stdarg.h and stddef.h shouldn't directly declare anything (PR #90676)

2024-05-02 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese edited 
https://github.com/llvm/llvm-project/pull/90676
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-30 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.

I did some testing today and this change seems fine. For scanning modules I 
actually saw some get smaller with your change.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-24 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,289 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool LogVerbose = false;
+static void logVerbose(const llvm::Twine ) {
+  if (LogVerbose) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void modifySignals(decltype(SIG_DFL) handler) {
+
+  if (std::signal(SIGTERM, handler) == SIG_ERR) {
+errs() << "failed to handle SIGTERM" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::signal(SIGINT, handler) == SIG_ERR) {
+errs() << "failed to handle SIGINT" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+#ifdef SIGHUP
+  if (::signal(SIGHUP, SIG_IGN) == SIG_ERR) {
+errs() << "failed to handle SIGHUP" << '\n';
+exit(EXIT_FAILURE);
+  }
+#endif
+}
+
+namespace {
+
+class ModuleBuildDaemonServer {
+public:
+  SmallString<256> SocketPath;
+  SmallString<256> Stderr; // path to stderr
+  SmallString<256> Stdout; // path to stdout
+
+  explicit ModuleBuildDaemonServer(StringRef Path)
+  : SocketPath(Path), Stderr(Path), Stdout(Path) {
+llvm::sys::path::append(SocketPath, SocketFileName);
+llvm::sys::path::append(Stdout, StdoutFileName);
+llvm::sys::path::append(Stderr, StderrFileName);
+  }
+
+  void setupDaemonEnv();
+  void createDaemonSocket();
+  void listenForClients();
+
+  static void handleConnection(std::shared_ptr Connection);
+
+  // TODO: modify so when shutdownDaemon is called the daemon stops accepting
+  // new client connections and waits for all existing client connections to
+  // terminate before closing the file descriptor and exiting
+  void shutdownDaemon() {
+RunServiceLoop = false;
+if (ServerListener.has_value())
+  ServerListener.value().shutdown();
+  }
+
+private:
+  std::atomic RunServiceLoop = true;
+  std::optional ServerListener;

Bigcheese wrote:

I think this is fine. The default handler is ok before any clients have 
successfully connected.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-19 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,25 @@
+// Check that a clang invocation can spawn and handshake with a module build 
daemon
+
+// RUN: %kill-process "-cc1modbuildd mbd-handshake"
+// RUN: rm -rf mbd-handshake %t
+// RUN: split-file %s %t
+
+//--- main.c
+int main() {return 0;}
+
+// RUN: %clang -fmodule-build-daemon=mbd-handshake -Rmodule-build-daemon 
%t/main.c &> %t/output-new || true

Bigcheese wrote:

For longer lived connections the 15 second timeout would just kill the server 
anyway right now. I believe when Connor and I discussed this issue the plan was 
to also have timeouts for reading from clients that can close individual 
connections, and then exit the server when all clients are gone and there has 
been no activity for a set time.

I'm definitely concerned about landing anything that would have anyone running 
tests end up with zombie processes, so I think it's a requirement that any 
patches that land need to have robust timeout handling.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][deps] Only bypass scanning VFS for the module cache (PR #88800)

2024-04-18 Thread Michael Spencer via cfe-commits


@@ -201,11 +201,8 @@ const CachedRealPath 
::CacheShard::
   return *StoredRealPath;
 }
 
-static bool shouldCacheStatFailures(StringRef Filename) {
-  StringRef Ext = llvm::sys::path::extension(Filename);
-  if (Ext.empty())
-return false; // This may be the module cache directory.
-  return true;
+bool DependencyScanningWorkerFilesystem::shouldBypass(StringRef Path) const {
+  return BypassedPathPrefix && Path.starts_with(*BypassedPathPrefix);

Bigcheese wrote:

It doesn't canonicalize if it's already absolute, which is commonly the case 
for the module cache dir. If we're going to canonicalize on each access anyway, 
I think it's fine to canonicalize in CompilerInstance at the start.

https://github.com/llvm/llvm-project/pull/88800
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [llvm][support] Implement tracing virtual file system (PR #88326)

2024-04-17 Thread Michael Spencer via cfe-commits


@@ -1125,6 +1125,54 @@ class YAMLVFSWriter {
   void write(llvm::raw_ostream );
 };
 
+/// File system that tracks the number of calls to the underlying file system.
+/// This is particularly useful when wrapped around \c RealFileSystem to add
+/// lightweight tracking of expensive syscalls.
+class TracingFileSystem
+: public llvm::RTTIExtends {
+public:
+  static const char ID;
+
+  std::size_t NumStatusCalls = 0;

Bigcheese wrote:

Looking at the existing VFSs there appears to be a reasonable subset that 
currently are thread safe for stat/open/etc., but some definitely aren't. I 
think it's fine to make this one thread safe and document that it is if the 
underlying FS is.

I think they can also all be relaxed memory order, you have to have some other 
synchronization before collecting the numbers at the end anyway.

https://github.com/llvm/llvm-project/pull/88326
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [llvm][support] Implement tracing virtual file system (PR #88326)

2024-04-17 Thread Michael Spencer via cfe-commits


@@ -2933,8 +2933,21 @@ recursive_directory_iterator::increment(std::error_code 
) {
   return *this;
 }
 
+void TracingFileSystem::printImpl(raw_ostream , PrintType Type,

Bigcheese wrote:

Should this print the current stat values?

https://github.com/llvm/llvm-project/pull/88326
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [llvm][support] Implement tracing virtual file system (PR #88326)

2024-04-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.

lgtm with some comments.

https://github.com/llvm/llvm-project/pull/88326
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [llvm][support] Implement tracing virtual file system (PR #88326)

2024-04-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese edited 
https://github.com/llvm/llvm-project/pull/88326
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-17 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,289 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool LogVerbose = false;
+static void logVerbose(const llvm::Twine ) {
+  if (LogVerbose) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void modifySignals(decltype(SIG_DFL) handler) {
+
+  if (std::signal(SIGTERM, handler) == SIG_ERR) {
+errs() << "failed to handle SIGTERM" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::signal(SIGINT, handler) == SIG_ERR) {
+errs() << "failed to handle SIGINT" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+#ifdef SIGHUP
+  if (::signal(SIGHUP, SIG_IGN) == SIG_ERR) {
+errs() << "failed to handle SIGHUP" << '\n';
+exit(EXIT_FAILURE);
+  }
+#endif
+}
+
+namespace {
+
+class ModuleBuildDaemonServer {
+public:
+  SmallString<256> SocketPath;
+  SmallString<256> Stderr; // path to stderr
+  SmallString<256> Stdout; // path to stdout
+
+  explicit ModuleBuildDaemonServer(StringRef Path)
+  : SocketPath(Path), Stderr(Path), Stdout(Path) {
+llvm::sys::path::append(SocketPath, SocketFileName);
+llvm::sys::path::append(Stdout, StdoutFileName);
+llvm::sys::path::append(Stderr, StderrFileName);
+  }
+
+  void setupDaemonEnv();
+  void createDaemonSocket();
+  void listenForClients();
+
+  static void handleConnection(std::shared_ptr Connection);
+
+  // TODO: modify so when shutdownDaemon is called the daemon stops accepting
+  // new client connections and waits for all existing client connections to
+  // terminate before closing the file descriptor and exiting
+  void shutdownDaemon() {
+RunServiceLoop = false;
+if (ServerListener.has_value())
+  ServerListener.value().shutdown();
+  }
+
+private:
+  std::atomic RunServiceLoop = true;
+  std::optional ServerListener;
+};
+
+// Used to handle signals
+ModuleBuildDaemonServer *DaemonPtr = nullptr;
+void handleSignal(int) { DaemonPtr->shutdownDaemon(); }
+} // namespace
+
+// Sets up file descriptors and signals for module build daemon
+void ModuleBuildDaemonServer::setupDaemonEnv() {
+
+#ifdef _WIN32
+  if (std::freopen("NUL", "r", stdin) == NULL) {
+#else
+  if (std::freopen("/dev/null", "r", stdin) == NULL) {
+#endif
+llvm::errs() << "Failed to close stdin" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::freopen(Stdout.c_str(), "a", stdout) == NULL) {
+llvm::errs() << "Failed to redirect stdout to " << Stdout << '\n';
+exit(EXIT_FAILURE);
+  }
+  if (std::freopen(Stderr.c_str(), "a", stderr) == NULL) {
+llvm::errs() << "Failed to redirect stderr to " << Stderr << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  modifySignals(handleSignal);
+}
+
+// Creates unix socket for IPC with frontends
+void ModuleBuildDaemonServer::createDaemonSocket() {
+
+  while (true) {
+Expected MaybeServerListener =
+llvm::ListeningSocket::createUnix(SocketPath);
+
+if (llvm::Error Err = MaybeServerListener.takeError()) {
+  llvm::handleAllErrors(std::move(Err), [&](const llvm::StringError ) {
+std::error_code EC = SE.convertToErrorCode();
+
+// Exit successfully if the socket address is already in use. When
+// TUs are compiled in parallel, until the socket file is created, all
+// clang invocations will try to spawn a module build daemon.
+#ifdef _WIN32
+if (EC.value() == WSAEADDRINUSE) {
+#else
+if (EC == std::errc::address_in_use) {
+#endif
+  exit(EXIT_SUCCESS);
+} else if (EC == std::errc::file_exists) {
+  if (std::remove(SocketPath.c_str()) != 0) {
+llvm::errs() << "Failed to remove " << SocketPath << ": "
+ << strerror(errno) << '\n';
+exit(EXIT_FAILURE);
+  }
+  // If a previous module build daemon invocation crashes, the socket
+  // file will need to be removed before the address can be binded to
+  logVerbose("Removing ineligible file: " + SocketPath);
+} else {
+   

[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-17 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,289 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool LogVerbose = false;
+static void logVerbose(const llvm::Twine ) {
+  if (LogVerbose) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void modifySignals(decltype(SIG_DFL) handler) {
+
+  if (std::signal(SIGTERM, handler) == SIG_ERR) {
+errs() << "failed to handle SIGTERM" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::signal(SIGINT, handler) == SIG_ERR) {
+errs() << "failed to handle SIGINT" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+#ifdef SIGHUP
+  if (::signal(SIGHUP, SIG_IGN) == SIG_ERR) {
+errs() << "failed to handle SIGHUP" << '\n';
+exit(EXIT_FAILURE);
+  }
+#endif
+}
+
+namespace {
+
+class ModuleBuildDaemonServer {
+public:
+  SmallString<256> SocketPath;
+  SmallString<256> Stderr; // path to stderr
+  SmallString<256> Stdout; // path to stdout
+
+  explicit ModuleBuildDaemonServer(StringRef Path)
+  : SocketPath(Path), Stderr(Path), Stdout(Path) {
+llvm::sys::path::append(SocketPath, SocketFileName);
+llvm::sys::path::append(Stdout, StdoutFileName);
+llvm::sys::path::append(Stderr, StderrFileName);
+  }
+
+  void setupDaemonEnv();
+  void createDaemonSocket();
+  void listenForClients();
+
+  static void handleConnection(std::shared_ptr Connection);
+
+  // TODO: modify so when shutdownDaemon is called the daemon stops accepting
+  // new client connections and waits for all existing client connections to
+  // terminate before closing the file descriptor and exiting
+  void shutdownDaemon() {
+RunServiceLoop = false;
+if (ServerListener.has_value())
+  ServerListener.value().shutdown();
+  }
+
+private:
+  std::atomic RunServiceLoop = true;
+  std::optional ServerListener;
+};
+
+// Used to handle signals
+ModuleBuildDaemonServer *DaemonPtr = nullptr;
+void handleSignal(int) { DaemonPtr->shutdownDaemon(); }
+} // namespace
+
+// Sets up file descriptors and signals for module build daemon
+void ModuleBuildDaemonServer::setupDaemonEnv() {
+
+#ifdef _WIN32
+  if (std::freopen("NUL", "r", stdin) == NULL) {
+#else
+  if (std::freopen("/dev/null", "r", stdin) == NULL) {
+#endif
+llvm::errs() << "Failed to close stdin" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::freopen(Stdout.c_str(), "a", stdout) == NULL) {
+llvm::errs() << "Failed to redirect stdout to " << Stdout << '\n';
+exit(EXIT_FAILURE);
+  }
+  if (std::freopen(Stderr.c_str(), "a", stderr) == NULL) {
+llvm::errs() << "Failed to redirect stderr to " << Stderr << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  modifySignals(handleSignal);
+}
+
+// Creates unix socket for IPC with frontends
+void ModuleBuildDaemonServer::createDaemonSocket() {
+
+  while (true) {
+Expected MaybeServerListener =
+llvm::ListeningSocket::createUnix(SocketPath);
+
+if (llvm::Error Err = MaybeServerListener.takeError()) {
+  llvm::handleAllErrors(std::move(Err), [&](const llvm::StringError ) {
+std::error_code EC = SE.convertToErrorCode();
+
+// Exit successfully if the socket address is already in use. When
+// TUs are compiled in parallel, until the socket file is created, all
+// clang invocations will try to spawn a module build daemon.
+#ifdef _WIN32
+if (EC.value() == WSAEADDRINUSE) {
+#else
+if (EC == std::errc::address_in_use) {
+#endif
+  exit(EXIT_SUCCESS);
+} else if (EC == std::errc::file_exists) {
+  if (std::remove(SocketPath.c_str()) != 0) {
+llvm::errs() << "Failed to remove " << SocketPath << ": "
+ << strerror(errno) << '\n';
+exit(EXIT_FAILURE);
+  }
+  // If a previous module build daemon invocation crashes, the socket
+  // file will need to be removed before the address can be binded to
+  logVerbose("Removing ineligible file: " + SocketPath);
+} else {
+   

[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-17 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,289 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool LogVerbose = false;
+static void logVerbose(const llvm::Twine ) {
+  if (LogVerbose) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void modifySignals(decltype(SIG_DFL) handler) {
+
+  if (std::signal(SIGTERM, handler) == SIG_ERR) {
+errs() << "failed to handle SIGTERM" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::signal(SIGINT, handler) == SIG_ERR) {
+errs() << "failed to handle SIGINT" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+#ifdef SIGHUP
+  if (::signal(SIGHUP, SIG_IGN) == SIG_ERR) {
+errs() << "failed to handle SIGHUP" << '\n';
+exit(EXIT_FAILURE);
+  }
+#endif
+}
+
+namespace {
+
+class ModuleBuildDaemonServer {
+public:
+  SmallString<256> SocketPath;
+  SmallString<256> Stderr; // path to stderr
+  SmallString<256> Stdout; // path to stdout
+
+  explicit ModuleBuildDaemonServer(StringRef Path)
+  : SocketPath(Path), Stderr(Path), Stdout(Path) {
+llvm::sys::path::append(SocketPath, SocketFileName);
+llvm::sys::path::append(Stdout, StdoutFileName);
+llvm::sys::path::append(Stderr, StderrFileName);
+  }
+
+  void setupDaemonEnv();
+  void createDaemonSocket();
+  void listenForClients();
+
+  static void handleConnection(std::shared_ptr Connection);
+
+  // TODO: modify so when shutdownDaemon is called the daemon stops accepting
+  // new client connections and waits for all existing client connections to
+  // terminate before closing the file descriptor and exiting
+  void shutdownDaemon() {
+RunServiceLoop = false;
+if (ServerListener.has_value())
+  ServerListener.value().shutdown();
+  }
+
+private:
+  std::atomic RunServiceLoop = true;
+  std::optional ServerListener;

Bigcheese wrote:

This is set after the signal handler is installed so it needs to be atomic.
```suggestion
  std::atomic ServerListener;
  std::optional ServerListenerStorage;
```

Then in `createDaemonSocket ` set `ServerListener = &*ServerListenerStorage`, 
and in `shutdownDaemon` use the `ServerListener` atomic pointer instead of the 
optional.

In the case that shutdown is called before the `ServerListener` is set, the 
server will still shutdown because `RunServiceLoop` will be false and `accept` 
will never be called.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese commented:

Looks about ready to land, just a few comments.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese edited 
https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [NFC] Parameterize Initialization of `clang::CodeGenerator` on a `TargetInfo` instance which may differ from the one in the `ASTContext` (PR #88977)

2024-04-16 Thread Michael Spencer via cfe-commits


@@ -26,123 +26,132 @@ namespace clang {
   class VarDecl;
   class FunctionDecl;
   class ImportDecl;
-
-/// ASTConsumer - This is an abstract interface that should be implemented by
-/// clients that read ASTs.  This abstraction layer allows the client to be
-/// independent of the AST producer (e.g. parser vs AST dump file reader, etc).
-class ASTConsumer {
-  /// Whether this AST consumer also requires information about
-  /// semantic analysis.
-  bool SemaConsumer = false;
-
-  friend class SemaConsumer;
-
-public:
-  ASTConsumer() = default;
-
-  virtual ~ASTConsumer() {}
-
-  /// Initialize - This is called to initialize the consumer, providing the
-  /// ASTContext.
-  virtual void Initialize(ASTContext ) {}
-
-  /// HandleTopLevelDecl - Handle the specified top-level declaration.  This is
-  /// called by the parser to process every top-level Decl*.
-  ///
-  /// \returns true to continue parsing, or false to abort parsing.
-  virtual bool HandleTopLevelDecl(DeclGroupRef D);
-
-  /// This callback is invoked each time an inline (method or friend)
-  /// function definition in a class is completed.
-  virtual void HandleInlineFunctionDefinition(FunctionDecl *D) {}
-
-  /// HandleInterestingDecl - Handle the specified interesting declaration. 
This
-  /// is called by the AST reader when deserializing things that might interest
-  /// the consumer. The default implementation forwards to HandleTopLevelDecl.
-  virtual void HandleInterestingDecl(DeclGroupRef D);
-
-  /// HandleTranslationUnit - This method is called when the ASTs for entire
-  /// translation unit have been parsed.
-  virtual void HandleTranslationUnit(ASTContext ) {}
-
-  /// HandleTagDeclDefinition - This callback is invoked each time a TagDecl
-  /// (e.g. struct, union, enum, class) is completed.  This allows the client 
to
-  /// hack on the type, which can occur at any point in the file (because these
-  /// can be defined in declspecs).
-  virtual void HandleTagDeclDefinition(TagDecl *D) {}
-
-  /// This callback is invoked the first time each TagDecl is required to
-  /// be complete.
-  virtual void HandleTagDeclRequiredDefinition(const TagDecl *D) {}
-
-  /// Invoked when a function is implicitly instantiated.
-  /// Note that at this point it does not have a body, its body is
-  /// instantiated at the end of the translation unit and passed to
-  /// HandleTopLevelDecl.
-  virtual void HandleCXXImplicitFunctionInstantiation(FunctionDecl *D) {}
-
-  /// Handle the specified top-level declaration that occurred inside
-  /// and ObjC container.
-  /// The default implementation ignored them.
-  virtual void HandleTopLevelDeclInObjCContainer(DeclGroupRef D);
-
-  /// Handle an ImportDecl that was implicitly created due to an
-  /// inclusion directive.
-  /// The default implementation passes it to HandleTopLevelDecl.
-  virtual void HandleImplicitImportDecl(ImportDecl *D);
-
-  /// CompleteTentativeDefinition - Callback invoked at the end of a 
translation
-  /// unit to notify the consumer that the given tentative definition should be
-  /// completed.
-  ///
-  /// The variable declaration itself will be a tentative
-  /// definition. If it had an incomplete array type, its type will
-  /// have already been changed to an array of size 1. However, the
-  /// declaration remains a tentative definition and has not been
-  /// modified by the introduction of an implicit zero initializer.
-  virtual void CompleteTentativeDefinition(VarDecl *D) {}
-
-  /// CompleteExternalDeclaration - Callback invoked at the end of a 
translation
-  /// unit to notify the consumer that the given external declaration should be
-  /// completed.
-  virtual void CompleteExternalDeclaration(VarDecl *D) {}
-
-  /// Callback invoked when an MSInheritanceAttr has been attached to a
-  /// CXXRecordDecl.
-  virtual void AssignInheritanceModel(CXXRecordDecl *RD) {}
-
-  /// HandleCXXStaticMemberVarInstantiation - Tell the consumer that this
-  // variable has been instantiated.
-  virtual void HandleCXXStaticMemberVarInstantiation(VarDecl *D) {}
-
-  /// Callback involved at the end of a translation unit to
-  /// notify the consumer that a vtable for the given C++ class is
-  /// required.
-  ///
-  /// \param RD The class whose vtable was used.
-  virtual void HandleVTable(CXXRecordDecl *RD) {}
-
-  /// If the consumer is interested in entities getting modified after
-  /// their initial creation, it should return a pointer to
-  /// an ASTMutationListener here.
-  virtual ASTMutationListener *GetASTMutationListener() { return nullptr; }
-
-  /// If the consumer is interested in entities being deserialized from
-  /// AST files, it should return a pointer to a ASTDeserializationListener 
here
-  virtual ASTDeserializationListener *GetASTDeserializationListener() {
-return nullptr;
-  }
-
-  /// PrintStats - If desired, print any statistics.
-  virtual void PrintStats() {}
-
-  /// This callback is called for each function if the 

[clang] [NFC] Parameterize Initialization of `clang::CodeGenerator` on a `TargetInfo` instance which may differ from the one in the `ASTContext` (PR #88977)

2024-04-16 Thread Michael Spencer via cfe-commits


@@ -26,123 +26,132 @@ namespace clang {
   class VarDecl;
   class FunctionDecl;
   class ImportDecl;
-
-/// ASTConsumer - This is an abstract interface that should be implemented by
-/// clients that read ASTs.  This abstraction layer allows the client to be
-/// independent of the AST producer (e.g. parser vs AST dump file reader, etc).
-class ASTConsumer {
-  /// Whether this AST consumer also requires information about
-  /// semantic analysis.
-  bool SemaConsumer = false;
-
-  friend class SemaConsumer;
-
-public:
-  ASTConsumer() = default;
-
-  virtual ~ASTConsumer() {}
-
-  /// Initialize - This is called to initialize the consumer, providing the
-  /// ASTContext.
-  virtual void Initialize(ASTContext ) {}
-
-  /// HandleTopLevelDecl - Handle the specified top-level declaration.  This is
-  /// called by the parser to process every top-level Decl*.
-  ///
-  /// \returns true to continue parsing, or false to abort parsing.
-  virtual bool HandleTopLevelDecl(DeclGroupRef D);
-
-  /// This callback is invoked each time an inline (method or friend)
-  /// function definition in a class is completed.
-  virtual void HandleInlineFunctionDefinition(FunctionDecl *D) {}
-
-  /// HandleInterestingDecl - Handle the specified interesting declaration. 
This
-  /// is called by the AST reader when deserializing things that might interest
-  /// the consumer. The default implementation forwards to HandleTopLevelDecl.
-  virtual void HandleInterestingDecl(DeclGroupRef D);
-
-  /// HandleTranslationUnit - This method is called when the ASTs for entire
-  /// translation unit have been parsed.
-  virtual void HandleTranslationUnit(ASTContext ) {}
-
-  /// HandleTagDeclDefinition - This callback is invoked each time a TagDecl
-  /// (e.g. struct, union, enum, class) is completed.  This allows the client 
to
-  /// hack on the type, which can occur at any point in the file (because these
-  /// can be defined in declspecs).
-  virtual void HandleTagDeclDefinition(TagDecl *D) {}
-
-  /// This callback is invoked the first time each TagDecl is required to
-  /// be complete.
-  virtual void HandleTagDeclRequiredDefinition(const TagDecl *D) {}
-
-  /// Invoked when a function is implicitly instantiated.
-  /// Note that at this point it does not have a body, its body is
-  /// instantiated at the end of the translation unit and passed to
-  /// HandleTopLevelDecl.
-  virtual void HandleCXXImplicitFunctionInstantiation(FunctionDecl *D) {}
-
-  /// Handle the specified top-level declaration that occurred inside
-  /// and ObjC container.
-  /// The default implementation ignored them.
-  virtual void HandleTopLevelDeclInObjCContainer(DeclGroupRef D);
-
-  /// Handle an ImportDecl that was implicitly created due to an
-  /// inclusion directive.
-  /// The default implementation passes it to HandleTopLevelDecl.
-  virtual void HandleImplicitImportDecl(ImportDecl *D);
-
-  /// CompleteTentativeDefinition - Callback invoked at the end of a 
translation
-  /// unit to notify the consumer that the given tentative definition should be
-  /// completed.
-  ///
-  /// The variable declaration itself will be a tentative
-  /// definition. If it had an incomplete array type, its type will
-  /// have already been changed to an array of size 1. However, the
-  /// declaration remains a tentative definition and has not been
-  /// modified by the introduction of an implicit zero initializer.
-  virtual void CompleteTentativeDefinition(VarDecl *D) {}
-
-  /// CompleteExternalDeclaration - Callback invoked at the end of a 
translation
-  /// unit to notify the consumer that the given external declaration should be
-  /// completed.
-  virtual void CompleteExternalDeclaration(VarDecl *D) {}
-
-  /// Callback invoked when an MSInheritanceAttr has been attached to a
-  /// CXXRecordDecl.
-  virtual void AssignInheritanceModel(CXXRecordDecl *RD) {}
-
-  /// HandleCXXStaticMemberVarInstantiation - Tell the consumer that this
-  // variable has been instantiated.
-  virtual void HandleCXXStaticMemberVarInstantiation(VarDecl *D) {}
-
-  /// Callback involved at the end of a translation unit to
-  /// notify the consumer that a vtable for the given C++ class is
-  /// required.
-  ///
-  /// \param RD The class whose vtable was used.
-  virtual void HandleVTable(CXXRecordDecl *RD) {}
-
-  /// If the consumer is interested in entities getting modified after
-  /// their initial creation, it should return a pointer to
-  /// an ASTMutationListener here.
-  virtual ASTMutationListener *GetASTMutationListener() { return nullptr; }
-
-  /// If the consumer is interested in entities being deserialized from
-  /// AST files, it should return a pointer to a ASTDeserializationListener 
here
-  virtual ASTDeserializationListener *GetASTDeserializationListener() {
-return nullptr;
-  }
-
-  /// PrintStats - If desired, print any statistics.
-  virtual void PrintStats() {}
-
-  /// This callback is called for each function if the 

[clang] [clang][deps] Support single-file mode for all formats (PR #88764)

2024-04-15 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.

lgtm. Good to get rid of more unneeded differences between ways of doing 
modules.

https://github.com/llvm/llvm-project/pull/88764
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,278 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool VerboseLog = false;
+static void verboseLog(const llvm::Twine ) {
+  if (VerboseLog) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void modifySignals(decltype(SIG_DFL) handler) {
+
+  if (std::signal(SIGTERM, handler) == SIG_ERR) {
+errs() << "failed to handle SIGTERM" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+  if (std::signal(SIGINT, handler) == SIG_ERR) {
+errs() << "failed to handle SIGINT" << '\n';
+exit(EXIT_FAILURE);
+  }
+
+#ifdef SIGHUP
+  if (::signal(SIGHUP, SIG_IGN) == SIG_ERR) {
+errs() << "failed to handle SIGHUP" << '\n';
+exit(EXIT_FAILURE);
+  }
+#endif
+}
+
+namespace {
+
+class ModuleBuildDaemonServer {
+public:
+  SmallString<256> SocketPath;
+  SmallString<256> Stderr; // path to stderr
+  SmallString<256> Stdout; // path to stdout
+
+  ModuleBuildDaemonServer(StringRef Path)
+  : SocketPath(Path), Stderr(Path), Stdout(Path) {
+llvm::sys::path::append(SocketPath, SocketFileName);
+llvm::sys::path::append(Stdout, StdoutFileName);
+llvm::sys::path::append(Stderr, StderrFileName);
+  }
+
+  void setupDaemonEnv();
+  void createDaemonSocket();
+  void listenForClients();
+
+  static void handleConnection(std::shared_ptr Connection);
+
+  // TODO: modify so when shutdownDaemon is called the daemon stops accepting
+  // new client connections and waits for all existing client connections to
+  // terminate before closing the file descriptor and exiting
+  void shutdownDaemon() {
+RunServiceLoop = false;
+if (ServerListener.has_value())
+  ServerListener.value().shutdown();
+  }
+
+private:
+  std::atomic RunServiceLoop = true;
+  std::optional ServerListener;
+};
+
+// Used to handle signals
+ModuleBuildDaemonServer *DaemonPtr = nullptr;
+void handleSignal(int) { DaemonPtr->shutdownDaemon(); }
+} // namespace
+
+// Sets up file descriptors and signals for module build daemon
+void ModuleBuildDaemonServer::setupDaemonEnv() {
+
+#ifdef _WIN32
+  freopen("NUL", "r", stdin);
+#else
+  close(STDIN_FILENO);
+#endif
+
+  freopen(Stdout.c_str(), "a", stdout);
+  freopen(Stderr.c_str(), "a", stderr);
+
+  modifySignals(handleSignal);
+}
+
+// Creates unix socket for IPC with frontends
+void ModuleBuildDaemonServer::createDaemonSocket() {
+
+  while (true) {
+Expected MaybeServerListener =
+llvm::ListeningSocket::createUnix(SocketPath);
+
+if (llvm::Error Err = MaybeServerListener.takeError()) {
+  llvm::handleAllErrors(std::move(Err), [&](const llvm::StringError ) {
+std::error_code EC = SE.convertToErrorCode();
+
+// Exit successfully if the socket address is already in use. When
+// TUs are compiled in parallel, until the socket file is created, all
+// clang invocations will try to spawn a module build daemon.
+#ifdef _WIN32
+if (EC.value() == WSAEADDRINUSE) {
+#else
+if (EC == std::errc::address_in_use) {
+#endif
+  exit(EXIT_SUCCESS);
+} else if (EC == std::errc::file_exists) {
+  if (std::remove(SocketPath.c_str()) != 0) {
+llvm::errs() << "Failed to remove " << SocketPath << ": "
+ << strerror(errno) << '\n';
+exit(EXIT_FAILURE);
+  }
+  // If a previous module build daemon invocation crashes, the socket
+  // file will need to be removed before the address can be binded to
+  verboseLog("Removing ineligible file: " + SocketPath);
+} else {
+  llvm::errs() << "MBD failed to create unix socket: "
+   << SE.getMessage() << ": " << EC.message() << '\n';
+  exit(EXIT_FAILURE);
+}
+  });
+} else {
+  verboseLog("MBD created and binded to socket at: " + SocketPath);
+  ServerListener.emplace(std::move(*MaybeServerListener));
+  break;
+}
+  }
+}
+
+// Function 

[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -237,18 +251,24 @@ int cc1modbuildd_main(ArrayRef Argv) {
   if (!validBasePathLength(BasePath)) {
 errs() << "BasePath '" << BasePath << "' is longer then the max length of "
<< std::to_string(BASEPATH_MAX_LENGTH) << '\n';
-return 1;
+return EXIT_FAILURE;
   }
 
   llvm::sys::fs::create_directories(BasePath);
-  ModuleBuildDaemonServer Daemon(BasePath);
 
-  // Used to handle signals
-  DaemonPtr = 
+  {
+ModuleBuildDaemonServer Daemon(BasePath);
+
+// Used to handle signals
+DaemonPtr = 
+
+Daemon.setupDaemonEnv();
+Daemon.createDaemonSocket();
+Daemon.listenForClients();
+  }
 
-  Daemon.setupDaemonEnv();
-  Daemon.createDaemonSocket();
-  Daemon.listenForClients();
+  // Prevents handleSignal from being called after ServerListener is destructed
+  setupSignals(SIG_DFL);

Bigcheese wrote:

This should just be `SIG_IGN`, you're just about to shut down anyway.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -133,31 +133,42 @@ void ModuleBuildDaemonServer::setupDaemonEnv() {
 // Creates unix socket for IPC with frontends
 void ModuleBuildDaemonServer::createDaemonSocket() {
 
-  Expected MaybeServerListener =
-  llvm::ListeningSocket::createUnix(SocketPath);
-
-  if (llvm::Error Err = MaybeServerListener.takeError()) {
-llvm::handleAllErrors(std::move(Err), [&](const llvm::StringError ) {
-  std::error_code EC = SE.convertToErrorCode();
-  // Exit successfully if the socket address is already in use. When
-  // translation units are compiled in parallel, until the socket file is
-  // created, all clang invocations will try to spawn a module build 
daemon.
+  bool SocketCreated = false;
+  while (!SocketCreated) {
+
+Expected MaybeServerListener =
+llvm::ListeningSocket::createUnix(SocketPath);
+
+if (llvm::Error Err = MaybeServerListener.takeError()) {
+  llvm::handleAllErrors(std::move(Err), [&](const llvm::StringError ) {
+std::error_code EC = SE.convertToErrorCode();
+
+// Exit successfully if the socket address is already in use. When
+// TUs are compiled in parallel, until the socket file is created, all
+// clang invocations will try to spawn a module build daemon.
 #ifdef _WIN32
-  if (EC.value() == WSAEADDRINUSE) {
+if (EC.value() == WSAEADDRINUSE) {
 #else
   if (EC == std::errc::address_in_use) {
 #endif
-exit(EXIT_SUCCESS);
-  } else {
-llvm::errs() << "MBD failed to create unix socket: " << SE.message()
- << EC.message() << '\n';
-exit(EXIT_FAILURE);
-  }
-});
+  exit(EXIT_SUCCESS);
+} else if (EC == std::errc::file_exists) {
+  if (std::error_code EC = llvm::sys::fs::remove(SocketPath); EC) {
+llvm::errs() << "Failed to remove file: " << EC.message() << '\n';
+exit(EXIT_FAILURE);
+  }
+} else {
+  llvm::errs() << "MBD failed to create unix socket: "
+   << SE.getMessage() << ": " << EC.message() << '\n';
+  exit(EXIT_FAILURE);
+}
+  });
+} else {
+  SocketCreated = true;

Bigcheese wrote:

I don't think you need `SocketCreated` here, you can just `break;`.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -43,8 +44,7 @@ constexpr size_t SOCKET_ADDR_MAX_LENGTH = 
sizeof(sockaddr_un::sun_path);
 constexpr size_t BASEPATH_MAX_LENGTH =
 SOCKET_ADDR_MAX_LENGTH - SOCKET_FILE_NAME.length();
 
-// How long should the module build daemon sit ideal before exiting
-constexpr int TimeoutSec = 15;
+constexpr std::chrono::microseconds MICROSEC_IN_SEC(100);

Bigcheese wrote:

You can just use `std::chrono::seconds`, or use 
https://en.cppreference.com/w/cpp/chrono/operator%22%22s. 

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -71,6 +101,18 @@ ListeningSocket::ListeningSocket(ListeningSocket &)
 Expected ListeningSocket::createUnix(StringRef SocketPath,
   int MaxBacklog) {
 
+  // Identify instances where the target socket address already exist but 
hasn't
+  // been binded to by another program. If there is already a file (of any 
type)
+  // at the specified path, ::bind() will fail with an error
+  if (llvm::sys::fs::exists(SocketPath)) {
+Expected MaybeFD = getSocketFD(SocketPath);
+if (!MaybeFD) {
+  return llvm::make_error(
+  std::make_error_code(std::errc::file_exists),
+  "Cannot create and bind to socket file");
+}

Bigcheese wrote:

Need to call `::close` on the FD.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,66 @@
+//=== SocketMsgSupport.cpp 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/raw_socket_stream.h"
+
+#include 
+#include 
+
+using namespace llvm;
+
+namespace clang::tooling::cc1modbuildd {
+
+Expected>
+connectToSocket(StringRef SocketPath) {
+
+  Expected> MaybeClient =
+  raw_socket_stream::createConnectedUnix(SocketPath);
+  if (!MaybeClient)
+return MaybeClient.takeError();
+
+  return std::move(*MaybeClient);
+}

Bigcheese wrote:

With the adoption of `raw_socket_stream` it looks like this function isn't 
needed now.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][MBD] set up module build daemon infrastructure (PR #67562)

2024-04-10 Thread Michael Spencer via cfe-commits


@@ -0,0 +1,285 @@
+//===--- cc1modbuildd_main.cpp - Clang CC1 Module Build Daemon 
===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "clang/Tooling/ModuleBuildDaemon/SocketSupport.h"
+#include "clang/Tooling/ModuleBuildDaemon/Utils.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/ThreadPool.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#ifdef _WIN32
+#include 
+#else
+#include 
+#endif
+
+using namespace llvm;
+using namespace clang::tooling::cc1modbuildd;
+
+// Create unbuffered STDOUT stream so that any logging done by the module build
+// daemon can be viewed without having to terminate the process
+static raw_fd_ostream _outs() {
+  static raw_fd_ostream S(fileno(stdout), false, true);
+  return S;
+}
+
+static bool VerboseLog = false;
+static void verboseLog(const llvm::Twine ) {
+  if (VerboseLog) {
+unbuff_outs() << message << '\n';
+  }
+}
+
+static void setupSignals(sighandler_t handler) {

Bigcheese wrote:

`sighandler_t` isn't portable as the std doesn't define a name for the signal 
handler type. You should be able to use `decltype(SIG_DFL)` to define your own 
type.

https://github.com/llvm/llvm-project/pull/67562
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][modules] Only compute affecting module maps with implicit search (PR #87849)

2024-04-09 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.


https://github.com/llvm/llvm-project/pull/87849
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -2220,40 +2227,47 @@ class ASTReader
 return Sema::AlignPackInfo::getFromRawEncoding(Raw);
   }
 
+  using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
   /// Read a source location from raw form and return it in its
   /// originating module file's source location space.
-  SourceLocation ReadUntranslatedSourceLocation(SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
+  std::pair
+  ReadUntranslatedSourceLocation(RawLocEncoding Raw,
+ LocSeq *Seq = nullptr) const {
 return SourceLocationEncoding::decode(Raw, Seq);
   }
 
   /// Read a source location from raw form.
-  SourceLocation ReadSourceLocation(ModuleFile ,
-SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
-SourceLocation Loc = ReadUntranslatedSourceLocation(Raw, Seq);
-return TranslateSourceLocation(ModuleFile, Loc);
+  SourceLocation ReadRawSourceLocation(ModuleFile , RawLocEncoding Raw,
+   LocSeq *Seq = nullptr) const {
+if (!MF.ModuleOffsetMap.empty())
+  ReadModuleOffsetMap(MF);
+
+auto [Loc, ModuleFileIndex] = ReadUntranslatedSourceLocation(Raw, Seq);
+ModuleFile *ModuleFileHomingLoc =

Bigcheese wrote:

`ModuleFileHomingLoc` sounds a bit confusing to me. Would a better name here be 
`OwningModuleFile`?

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -2220,40 +2227,47 @@ class ASTReader
 return Sema::AlignPackInfo::getFromRawEncoding(Raw);
   }
 
+  using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
   /// Read a source location from raw form and return it in its
   /// originating module file's source location space.
-  SourceLocation ReadUntranslatedSourceLocation(SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
+  std::pair
+  ReadUntranslatedSourceLocation(RawLocEncoding Raw,
+ LocSeq *Seq = nullptr) const {
 return SourceLocationEncoding::decode(Raw, Seq);
   }
 
   /// Read a source location from raw form.
-  SourceLocation ReadSourceLocation(ModuleFile ,
-SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
-SourceLocation Loc = ReadUntranslatedSourceLocation(Raw, Seq);
-return TranslateSourceLocation(ModuleFile, Loc);
+  SourceLocation ReadRawSourceLocation(ModuleFile , RawLocEncoding Raw,
+   LocSeq *Seq = nullptr) const {
+if (!MF.ModuleOffsetMap.empty())
+  ReadModuleOffsetMap(MF);
+
+auto [Loc, ModuleFileIndex] = ReadUntranslatedSourceLocation(Raw, Seq);
+ModuleFile *ModuleFileHomingLoc =
+ModuleFileIndex ? ImportedModuleFiles[][ModuleFileIndex - 1] : 

Bigcheese wrote:

Should have at least a bounds check assert on the ModuleFileIndex value here.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -2220,40 +2227,47 @@ class ASTReader
 return Sema::AlignPackInfo::getFromRawEncoding(Raw);
   }
 
+  using RawLocEncoding = SourceLocationEncoding::RawLocEncoding;
+
   /// Read a source location from raw form and return it in its
   /// originating module file's source location space.
-  SourceLocation ReadUntranslatedSourceLocation(SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
+  std::pair
+  ReadUntranslatedSourceLocation(RawLocEncoding Raw,
+ LocSeq *Seq = nullptr) const {
 return SourceLocationEncoding::decode(Raw, Seq);
   }
 
   /// Read a source location from raw form.
-  SourceLocation ReadSourceLocation(ModuleFile ,
-SourceLocation::UIntTy Raw,
-LocSeq *Seq = nullptr) const {
-SourceLocation Loc = ReadUntranslatedSourceLocation(Raw, Seq);
-return TranslateSourceLocation(ModuleFile, Loc);
+  SourceLocation ReadRawSourceLocation(ModuleFile , RawLocEncoding Raw,
+   LocSeq *Seq = nullptr) const {
+if (!MF.ModuleOffsetMap.empty())
+  ReadModuleOffsetMap(MF);
+
+auto [Loc, ModuleFileIndex] = ReadUntranslatedSourceLocation(Raw, Seq);
+ModuleFile *ModuleFileHomingLoc =
+ModuleFileIndex ? ImportedModuleFiles[][ModuleFileIndex - 1] : 
+return TranslateSourceLocation(*ModuleFileHomingLoc, Loc);
   }
 
   /// Translate a source location from another module file's source
   /// location space into ours.
   SourceLocation TranslateSourceLocation(ModuleFile ,
  SourceLocation Loc) const {
-if (!ModuleFile.ModuleOffsetMap.empty())
-  ReadModuleOffsetMap(ModuleFile);
-assert(ModuleFile.SLocRemap.find(Loc.getOffset()) !=
-   ModuleFile.SLocRemap.end() &&
-   "Cannot find offset to remap.");
-SourceLocation::IntTy Remap =
-ModuleFile.SLocRemap.find(Loc.getOffset())->second;
-return Loc.getLocWithOffset(Remap);
+if (Loc.isInvalid())
+  return Loc;
+
+// It implies that the Loc is already translated.
+if (SourceMgr.isLoadedSourceLocation(Loc))
+  return Loc;
+
+return Loc.getLocWithOffset(ModuleFile.SLocEntryBaseOffset - 2);

Bigcheese wrote:

Why the -2 here?

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -149,14 +157,44 @@ class SourceLocationSequence::State {
   operator SourceLocationSequence *() { return  }
 };
 
-inline uint64_t SourceLocationEncoding::encode(SourceLocation Loc,
-   SourceLocationSequence *Seq) {
-  return Seq ? Seq->encode(Loc) : encodeRaw(Loc.getRawEncoding());
+inline SourceLocationEncoding::RawLocEncoding
+SourceLocationEncoding::encode(SourceLocation Loc, UIntTy BaseOffset,
+   unsigned BaseModuleFileIndex,
+   SourceLocationSequence *Seq) {
+  // If the source location is a local source location, we can try to optimize
+  // the similar sequences to only record the differences.
+  if (!BaseOffset)
+return Seq ? Seq->encode(Loc) : encodeRaw(Loc.getRawEncoding());
+
+  if (Loc.isInvalid())
+return 0;
+  
+  // Otherwise, the higher bits are used to store the module file index,
+  // so it is meaningless to optimize the source locations into small
+  // integers. Let's try to always use the raw encodings.
+  assert(Loc.getOffset() >= BaseOffset);
+  Loc = Loc.getLocWithOffset(-BaseOffset);
+  RawLocEncoding Encoded = encodeRaw(Loc.getRawEncoding());
+  assert(Encoded < ((RawLocEncoding)1 << 32));
+
+  assert(BaseModuleFileIndex < ((RawLocEncoding)1 << 32));
+  Encoded |= (RawLocEncoding)BaseModuleFileIndex << 32;

Bigcheese wrote:

Were you going to change this to only reserve 16 bits for module index?

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -4078,8 +4065,8 @@ void ASTReader::ReadModuleOffsetMap(ModuleFile ) const {
   return;

Bigcheese wrote:

There's no more SourceLocation remap so I believe this error message is wrong 
now.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits


@@ -149,14 +157,44 @@ class SourceLocationSequence::State {
   operator SourceLocationSequence *() { return  }
 };
 
-inline uint64_t SourceLocationEncoding::encode(SourceLocation Loc,
-   SourceLocationSequence *Seq) {
-  return Seq ? Seq->encode(Loc) : encodeRaw(Loc.getRawEncoding());
+inline SourceLocationEncoding::RawLocEncoding
+SourceLocationEncoding::encode(SourceLocation Loc, UIntTy BaseOffset,
+   unsigned BaseModuleFileIndex,
+   SourceLocationSequence *Seq) {
+  // If the source location is a local source location, we can try to optimize
+  // the similar sequences to only record the differences.
+  if (!BaseOffset)
+return Seq ? Seq->encode(Loc) : encodeRaw(Loc.getRawEncoding());
+
+  if (Loc.isInvalid())
+return 0;
+  
+  // Otherwise, the higher bits are used to store the module file index,
+  // so it is meaningless to optimize the source locations into small
+  // integers. Let's try to always use the raw encodings.
+  assert(Loc.getOffset() >= BaseOffset);
+  Loc = Loc.getLocWithOffset(-BaseOffset);
+  RawLocEncoding Encoded = encodeRaw(Loc.getRawEncoding());
+  assert(Encoded < ((RawLocEncoding)1 << 32));
+
+  assert(BaseModuleFileIndex < ((RawLocEncoding)1 << 32));
+  Encoded |= (RawLocEncoding)BaseModuleFileIndex << 32;
+  return Encoded;
 }
-inline SourceLocation
-SourceLocationEncoding::decode(uint64_t Encoded, SourceLocationSequence *Seq) {
-  return Seq ? Seq->decode(Encoded)
- : SourceLocation::getFromRawEncoding(decodeRaw(Encoded));
+inline std::pair
+SourceLocationEncoding::decode(RawLocEncoding Encoded,
+   SourceLocationSequence *Seq) {
+  unsigned ModuleFileIndex = Encoded >> 32;
+
+  if (!ModuleFileIndex)
+return {Seq ? Seq->decode(Encoded)
+ : SourceLocation::getFromRawEncoding(decodeRaw(Encoded)),
+ModuleFileIndex};
+
+  Encoded &= ((RawLocEncoding)1 << 33) - 1;

Bigcheese wrote:

We have `maskTrailingOnes(32)` which is a bit clearer than raw 
bit math.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese edited 
https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese commented:

I have a few minor comments on the patch. I want to do some additional perf 
testing on module scanning perf because I'm a bit concerned about the cost of 
`ASTWriter::getRawSourceLocationEncoding`.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Modules] No transitive source location change (PR #86912)

2024-04-09 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese commented:

I think the general approach makes sense. I'll take a closer look at the 
specific changes.

https://github.com/llvm/llvm-project/pull/86912
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][frontend] Make DumpModuleInfoAction emit the full macro (PR #85745)

2024-04-09 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese requested changes to this pull request.

Generally when we get the definition of a macro we loop over the tokens, for 
example:

```c++
  SmallString<128> SpellingBuffer;
  bool First = true;
  for (const auto  : MI.tokens()) {
if (!First && T.hasLeadingSpace())
  Value << ' ';

Value << PP.getSpelling(T, SpellingBuffer);
First = false;
  }
```

This handles all of the lexing requirements and should have the same output as 
your loop here. `MacroPPCallbacks::writeMacroDefinition` is a good example 
here. We do a similar thing in 
`DeclarationFragmentsBuilder::getFragmentsForMacro`, `PrintMacroDefinition` 
(PrintPreprocessedOutput.cpp), and 
`CodeCompletionResult::CreateCodeCompletionStringForMacro`.

At this point we should just have common code that does this. Code completion 
is a bit special as it's only printing part of it, but I'd like to see macro 
decl/def printing happen in a single place.

https://github.com/llvm/llvm-project/pull/85745
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Headers] Don't declare unreachable() from stddef.h in C++ (PR #86748)

2024-03-26 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

I'm checking with the C and C++ Compatibility study group (SG22) for what's 
expected here.

https://github.com/llvm/llvm-project/pull/86748
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [llvm][Support] Add and use errnoAsErrorCode (PR #84423)

2024-03-08 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/84423
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [llvm][Support] Add and use errnoAsErrorCode (PR #84423)

2024-03-08 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

The other cases of `std::system_category` were in `LLVM_ON_UNIX` (or similar) 
blocks that would only be used on systems where it's mostly fine to use either 
one, as most of the time you'll get an error that's in `std::errc`, and then 
there's no difference (or they just are never compared in general). The initial 
desire to do this came from spending 30m looking into which one to use on UNIX 
systems in general and wanting to avoid that in the future. The 
`JSONTransport.cpp` case was just more indication to me that the existing way 
was error prone.

In auditing all uses of `errno` I did find a few other places where the code 
isn't quite wrong, but it's not really using `llvm::Error` correctly. There's 
quite a few places where people use `llvm::inconvertibleErrorCode()` where they 
really shouldn't be. For example `llvm-exegesis` has a bunch of places where 
they call `strerror(errno)` to construct an `llvm::Error` that implicitly uses 
`llvm::inconvertibleErrorCode()` as the `std::error_code` value. Our existing 
documentation here is pretty nice for `Error` and `Expected` 
(https://llvm.org/docs/ProgrammersManual.html#error-handling), but it would be 
nice to better cover how `std::error_code` is supposed to be propagated, as it 
currently kind of implies they are going away, but they are always needed for 
OS errors. I'll try to get to this when I can find time.

As for test coverage, some of these have existing error tests, but for a lot of 
the rest it's either incredibly difficult or just not possible (without using 
`LD_PRELOAD` or something similar) to test. Given that, it's good to make 
handling them as easy to get correct as practicable.



https://github.com/llvm/llvm-project/pull/84423
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang-tools-extra] [llvm] [llvm][Support] Add and use errnoAsErrorCode (PR #84423)

2024-03-07 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/84423

LLVM is inconsistent about how it converts `errno` to `std::error_code`. This 
can cause problems because values outside of `std::errc` compare differently if 
one is system and one is generic on POSIX systems.

This is even more of a problem on Windows where use of the system category is 
just wrong, as that is for Windows errors, which have a completely different 
mapping than POSIX/generic errors. This patch fixes one instance of this 
mistake in `JSONTransport.cpp`.

This patch adds `errnoAsErrorCode()` which makes it so people do not need to 
think about this issue in the future. It also cleans up a lot of usage of errno 
in LLVM and Clang.

>From 858effb8f8225e9ca7c367037046f07b576a3348 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Thu, 7 Mar 2024 17:36:33 -0800
Subject: [PATCH] [llvm][Support] Add and use errnoAsErrorCode

LLVM is inconsistent about how it converts `errno` to
`std::error_code`. This can cause problems because values outside of
`std::errc` compare differently if one is system and one is generic on
POSIX systems.

This is even more of a problem on Windows where use of the system
category is just wrong, as that is for Windows errors, which have a
completely different mapping than POSIX/generic errors. This patch
fixes one instance of this mistake in `JSONTransport.cpp`.

This patch adds `errnoAsErrorCode()` which makes it so people do not
need to think about this issue in the future. It also cleans up a lot
of usage of errno in LLVM and Clang.
---
 clang-tools-extra/clangd/JSONTransport.cpp|  3 +-
 .../linux/DirectoryWatcher-linux.cpp  |  9 +--
 llvm/include/llvm/Support/Error.h | 14 
 llvm/lib/ExecutionEngine/Orc/MemoryMapper.cpp |  9 +--
 .../ExecutorSharedMemoryMapperService.cpp | 11 ++-
 llvm/lib/Object/ArchiveWriter.cpp |  2 +-
 llvm/lib/Support/AutoConvert.cpp  |  8 +--
 llvm/lib/Support/LockFileManager.cpp  |  2 +-
 llvm/lib/Support/Path.cpp | 19 ++
 llvm/lib/Support/RandomNumberGenerator.cpp|  7 +-
 llvm/lib/Support/Unix/Memory.inc  | 10 +--
 llvm/lib/Support/Unix/Path.inc| 67 +--
 llvm/lib/Support/Unix/Process.inc | 12 ++--
 llvm/lib/Support/Windows/Process.inc  |  2 +-
 llvm/lib/Support/Windows/Program.inc  |  4 +-
 llvm/lib/Support/raw_ostream.cpp  |  6 +-
 llvm/lib/Support/raw_socket_stream.cpp|  2 +-
 17 files changed, 94 insertions(+), 93 deletions(-)

diff --git a/clang-tools-extra/clangd/JSONTransport.cpp 
b/clang-tools-extra/clangd/JSONTransport.cpp
index 346c7dfb66a1db..3c0e198433f360 100644
--- a/clang-tools-extra/clangd/JSONTransport.cpp
+++ b/clang-tools-extra/clangd/JSONTransport.cpp
@@ -107,8 +107,7 @@ class JSONTransport : public Transport {
 return error(std::make_error_code(std::errc::operation_canceled),
  "Got signal, shutting down");
   if (ferror(In))
-return llvm::errorCodeToError(
-std::error_code(errno, std::system_category()));
+return llvm::errorCodeToError(llvm::errnoAsErrorCode());
   if (readRawMessage(JSON)) {
 ThreadCrashReporter ScopedReporter([]() {
   auto  = llvm::errs();
diff --git a/clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp 
b/clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
index beca9586988b52..2ffbc1a226958a 100644
--- a/clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
+++ b/clang/lib/DirectoryWatcher/linux/DirectoryWatcher-linux.cpp
@@ -333,8 +333,7 @@ llvm::Expected> 
clang::DirectoryWatcher::creat
   const int InotifyFD = inotify_init1(IN_CLOEXEC);
   if (InotifyFD == -1)
 return llvm::make_error(
-std::string("inotify_init1() error: ") + strerror(errno),
-llvm::inconvertibleErrorCode());
+llvm::errnoAsErrorCode(), std::string(": inotify_init1()"));
 
   const int InotifyWD = inotify_add_watch(
   InotifyFD, Path.str().c_str(),
@@ -346,15 +345,13 @@ llvm::Expected> 
clang::DirectoryWatcher::creat
   );
   if (InotifyWD == -1)
 return llvm::make_error(
-std::string("inotify_add_watch() error: ") + strerror(errno),
-llvm::inconvertibleErrorCode());
+llvm::errnoAsErrorCode(), std::string(": inotify_add_watch()"));
 
   auto InotifyPollingStopper = SemaphorePipe::create();
 
   if (!InotifyPollingStopper)
 return llvm::make_error(
-std::string("SemaphorePipe::create() error: ") + strerror(errno),
-llvm::inconvertibleErrorCode());
+llvm::errnoAsErrorCode(), std::string(": SemaphorePipe::create()"));
 
   return std::make_unique(
   Path, Receiver, WaitForInitialSync, InotifyFD, InotifyWD,
diff --git a/llvm/include/llvm/Support/Error.h 
b/llvm/include/llvm/Support/Error.h
index bb4f38f7ec355e..894b6484336aef 100644
--- 

[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-05 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/83641

>From e8993b51f0dcdecd2fcb72f91d7e4631e95c2c09 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 1 Mar 2024 17:18:20 -0800
Subject: [PATCH] [clang] Diagnose config_macros before building modules

Before this patch, if a module fails to build because of a missing
config_macro, the user will never see the config macro warning. This
patch diagnoses this before building, and each subsequent time a
module is imported.

rdar://123921931
---
 clang/lib/Frontend/CompilerInstance.cpp| 35 ++
 clang/test/Modules/Inputs/config.h |  7 ---
 clang/test/Modules/Inputs/module.modulemap |  5 --
 clang/test/Modules/config_macros.m | 54 --
 4 files changed, 78 insertions(+), 23 deletions(-)
 delete mode 100644 clang/test/Modules/Inputs/config.h

diff --git a/clang/lib/Frontend/CompilerInstance.cpp 
b/clang/lib/Frontend/CompilerInstance.cpp
index 4443073775..ec4e68209d657d 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -1591,6 +1591,14 @@ static void checkConfigMacro(Preprocessor , StringRef 
ConfigMacro,
   }
 }
 
+static void checkConfigMacros(Preprocessor , Module *M,
+  SourceLocation ImportLoc) {
+  clang::Module *TopModule = M->getTopLevelModule();
+  for (const StringRef ConMacro : TopModule->ConfigMacros) {
+checkConfigMacro(PP, ConMacro, M, ImportLoc);
+  }
+}
+
 /// Write a new timestamp file with the given path.
 static void writeTimestampFile(StringRef TimestampFile) {
   std::error_code EC;
@@ -1829,6 +1837,13 @@ ModuleLoadResult 
CompilerInstance::findOrCompileModuleAndReadAST(
   Module *M =
   HS.lookupModule(ModuleName, ImportLoc, true, !IsInclusionDirective);
 
+  // Check for any configuration macros that have changed. This is done
+  // immediately before potentially building a module in case this module
+  // depends on having one of its configuration macros defined to successfully
+  // build. If this is not done the user will never see the warning.
+  if (M)
+checkConfigMacros(getPreprocessor(), M, ImportLoc);
+
   // Select the source and filename for loading the named module.
   std::string ModuleFilename;
   ModuleSource Source =
@@ -2006,12 +2021,23 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
   if (auto MaybeModule = MM.getCachedModuleLoad(*Path[0].first)) {
 // Use the cached result, which may be nullptr.
 Module = *MaybeModule;
+// Config macros are already checked before building a module, but they 
need
+// to be checked at each import location in case any of the config macros
+// have a new value at the current `ImportLoc`.
+if (Module)
+  checkConfigMacros(getPreprocessor(), Module, ImportLoc);
   } else if (ModuleName == getLangOpts().CurrentModule) {
 // This is the module we're building.
 Module = PP->getHeaderSearchInfo().lookupModule(
 ModuleName, ImportLoc, /*AllowSearch*/ true,
 /*AllowExtraModuleMapSearch*/ !IsInclusionDirective);
 
+// Config macros do not need to be checked here for two reasons.
+// * This will always be textual inclusion, and thus the config macros
+//   actually do impact the content of the header.
+// * `Preprocessor::HandleHeaderIncludeOrImport` will never call this
+//   function as the `#include` or `#import` is textual.
+
 MM.cacheModuleLoad(*Path[0].first, Module);
   } else {
 ModuleLoadResult Result = findOrCompileModuleAndReadAST(
@@ -2146,18 +2172,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
 TheASTReader->makeModuleVisible(Module, Visibility, ImportLoc);
   }
 
-  // Check for any configuration macros that have changed.
-  clang::Module *TopModule = Module->getTopLevelModule();
-  for (unsigned I = 0, N = TopModule->ConfigMacros.size(); I != N; ++I) {
-checkConfigMacro(getPreprocessor(), TopModule->ConfigMacros[I],
- Module, ImportLoc);
-  }
-
   // Resolve any remaining module using export_as for this one.
   getPreprocessor()
   .getHeaderSearchInfo()
   .getModuleMap()
-  .resolveLinkAsDependencies(TopModule);
+  .resolveLinkAsDependencies(Module->getTopLevelModule());
 
   LastModuleImportLoc = ImportLoc;
   LastModuleImportResult = ModuleLoadResult(Module);
diff --git a/clang/test/Modules/Inputs/config.h 
b/clang/test/Modules/Inputs/config.h
deleted file mode 100644
index 4c124b0bf82b7c..00
--- a/clang/test/Modules/Inputs/config.h
+++ /dev/null
@@ -1,7 +0,0 @@
-#ifdef WANT_FOO
-int* foo(void);
-#endif
-
-#ifdef WANT_BAR
-char *bar(void);
-#endif
diff --git a/clang/test/Modules/Inputs/module.modulemap 
b/clang/test/Modules/Inputs/module.modulemap
index e7cb4b27bc08b9..47f6c5c1010d7d 100644
--- a/clang/test/Modules/Inputs/module.modulemap
+++ b/clang/test/Modules/Inputs/module.modulemap
@@ -260,11 +260,6 @@ module cxx_decls_merged {
   header 

[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits


@@ -2006,6 +2021,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
   if (auto MaybeModule = MM.getCachedModuleLoad(*Path[0].first)) {
 // Use the cached result, which may be nullptr.
 Module = *MaybeModule;
+// Config macros are already checked before building a module, but they 
need
+// to be checked at each import location in case any of the config macros
+// have a new value at the current `ImportLoc`.
+if (Module)
+  checkConfigMacros(getPreprocessor(), Module, ImportLoc);
   } else if (ModuleName == getLangOpts().CurrentModule) {
 // This is the module we're building.
 Module = PP->getHeaderSearchInfo().lookupModule(

Bigcheese wrote:

Yeah, given how `Preprocessor::HandleHeaderIncludeOrImport` works, you will 
never hit this case.

https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits


@@ -22,7 +58,10 @@
 #define WANT_BAR 1 // expected-note{{macro was defined here}}
 @import config; // expected-warning{{definition of configuration macro 
'WANT_BAR' has no effect on the import of 'config'; pass '-DWANT_BAR=...' on 
the command line to configure the module}}
 
-// RUN: rm -rf %t
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps -x objective-c 
-fmodules-cache-path=%t -DWANT_FOO=1 -emit-module -fmodule-name=config 
%S/Inputs/module.modulemap
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t -I %S/Inputs -DWANT_FOO=1 %s -verify
+//--- config_error.m
 
+#ifdef TEST_ERROR
+#define SOME_VALUE 5 // expected-note{{macro was defined here}}
+@import config_error; // expected-error{{could not build module}} \
+  // expected-warning{{definition of configuration macro 
'SOME_VALUE' has no effect on the import of 'config_error';}}

Bigcheese wrote:

There were already other tests that check the exact wording, intent here was 
just to make it easier to reword if ever needed, as this specific test doesn't 
care about the exact wording.

https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits


@@ -22,7 +58,10 @@
 #define WANT_BAR 1 // expected-note{{macro was defined here}}
 @import config; // expected-warning{{definition of configuration macro 
'WANT_BAR' has no effect on the import of 'config'; pass '-DWANT_BAR=...' on 
the command line to configure the module}}
 
-// RUN: rm -rf %t
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps -x objective-c 
-fmodules-cache-path=%t -DWANT_FOO=1 -emit-module -fmodule-name=config 
%S/Inputs/module.modulemap
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t -I %S/Inputs -DWANT_FOO=1 %s -verify
+//--- config_error.m
 
+#ifdef TEST_ERROR

Bigcheese wrote:

Not needed, that was left over from when I tried to use a single file for all 
the cases.

https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits


@@ -22,7 +58,10 @@
 #define WANT_BAR 1 // expected-note{{macro was defined here}}
 @import config; // expected-warning{{definition of configuration macro 
'WANT_BAR' has no effect on the import of 'config'; pass '-DWANT_BAR=...' on 
the command line to configure the module}}
 
-// RUN: rm -rf %t
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps -x objective-c 
-fmodules-cache-path=%t -DWANT_FOO=1 -emit-module -fmodule-name=config 
%S/Inputs/module.modulemap
-// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps 
-fmodules-cache-path=%t -I %S/Inputs -DWANT_FOO=1 %s -verify
+//--- config_error.m

Bigcheese wrote:

I can just add comments explaining what it's testing above the run line.

https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-04 Thread Michael Spencer via cfe-commits


@@ -2006,6 +2021,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
   if (auto MaybeModule = MM.getCachedModuleLoad(*Path[0].first)) {
 // Use the cached result, which may be nullptr.
 Module = *MaybeModule;
+// Config macros are already checked before building a module, but they 
need
+// to be checked at each import location in case any of the config macros
+// have a new value at the current `ImportLoc`.
+if (Module)
+  checkConfigMacros(getPreprocessor(), Module, ImportLoc);
   } else if (ModuleName == getLangOpts().CurrentModule) {
 // This is the module we're building.
 Module = PP->getHeaderSearchInfo().lookupModule(

Bigcheese wrote:

Hmm, I assumed it wasn't needed because this is just `-fmodule-name=` which 
causes textual inclusion, but it looks like the original code would warn in 
that case, even though they actually would apply if it was the first inclusion. 
I'll check and see if it matters.

https://github.com/llvm/llvm-project/pull/83641
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-01 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/83641

>From c2445d426e374592522ac392254c9909ab52fc40 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 1 Mar 2024 17:18:20 -0800
Subject: [PATCH] [clang] Diagnose config_macros before building modules

Before this patch, if a module fails to build because of a missing
config_macro, the user will never see the config macro warning. This
patch diagnoses this before building, and each subsequent time a
module is imported.

rdar://123921931
---
 clang/lib/Frontend/CompilerInstance.cpp| 29 ++
 clang/test/Modules/Inputs/config.h |  7 
 clang/test/Modules/Inputs/module.modulemap |  5 ---
 clang/test/Modules/config_macros.m | 45 --
 4 files changed, 63 insertions(+), 23 deletions(-)
 delete mode 100644 clang/test/Modules/Inputs/config.h

diff --git a/clang/lib/Frontend/CompilerInstance.cpp 
b/clang/lib/Frontend/CompilerInstance.cpp
index 4443073775..378f940d8da2cf 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -1591,6 +1591,14 @@ static void checkConfigMacro(Preprocessor , StringRef 
ConfigMacro,
   }
 }
 
+static void checkConfigMacros(Preprocessor , Module *M,
+  SourceLocation ImportLoc) {
+  clang::Module *TopModule = M->getTopLevelModule();
+  for (unsigned I = 0, N = TopModule->ConfigMacros.size(); I != N; ++I) {
+checkConfigMacro(PP, TopModule->ConfigMacros[I], M, ImportLoc);
+  }
+}
+
 /// Write a new timestamp file with the given path.
 static void writeTimestampFile(StringRef TimestampFile) {
   std::error_code EC;
@@ -1829,6 +1837,13 @@ ModuleLoadResult 
CompilerInstance::findOrCompileModuleAndReadAST(
   Module *M =
   HS.lookupModule(ModuleName, ImportLoc, true, !IsInclusionDirective);
 
+  // Check for any configuration macros that have changed. This is done
+  // immediately before potentially building a module in case this module
+  // depends on having one of its configuration macros defined to successfully
+  // build. If this is not done the user will never see the warning.
+  if (M)
+checkConfigMacros(getPreprocessor(), M, ImportLoc);
+
   // Select the source and filename for loading the named module.
   std::string ModuleFilename;
   ModuleSource Source =
@@ -2006,6 +2021,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
   if (auto MaybeModule = MM.getCachedModuleLoad(*Path[0].first)) {
 // Use the cached result, which may be nullptr.
 Module = *MaybeModule;
+// Config macros are already checked before building a module, but they 
need
+// to be checked at each import location in case any of the config macros
+// have a new value at the current `ImportLoc`.
+if (Module)
+  checkConfigMacros(getPreprocessor(), Module, ImportLoc);
   } else if (ModuleName == getLangOpts().CurrentModule) {
 // This is the module we're building.
 Module = PP->getHeaderSearchInfo().lookupModule(
@@ -2146,18 +2166,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
 TheASTReader->makeModuleVisible(Module, Visibility, ImportLoc);
   }
 
-  // Check for any configuration macros that have changed.
-  clang::Module *TopModule = Module->getTopLevelModule();
-  for (unsigned I = 0, N = TopModule->ConfigMacros.size(); I != N; ++I) {
-checkConfigMacro(getPreprocessor(), TopModule->ConfigMacros[I],
- Module, ImportLoc);
-  }
-
   // Resolve any remaining module using export_as for this one.
   getPreprocessor()
   .getHeaderSearchInfo()
   .getModuleMap()
-  .resolveLinkAsDependencies(TopModule);
+  .resolveLinkAsDependencies(Module->getTopLevelModule());
 
   LastModuleImportLoc = ImportLoc;
   LastModuleImportResult = ModuleLoadResult(Module);
diff --git a/clang/test/Modules/Inputs/config.h 
b/clang/test/Modules/Inputs/config.h
deleted file mode 100644
index 4c124b0bf82b7c..00
--- a/clang/test/Modules/Inputs/config.h
+++ /dev/null
@@ -1,7 +0,0 @@
-#ifdef WANT_FOO
-int* foo(void);
-#endif
-
-#ifdef WANT_BAR
-char *bar(void);
-#endif
diff --git a/clang/test/Modules/Inputs/module.modulemap 
b/clang/test/Modules/Inputs/module.modulemap
index e7cb4b27bc08b9..47f6c5c1010d7d 100644
--- a/clang/test/Modules/Inputs/module.modulemap
+++ b/clang/test/Modules/Inputs/module.modulemap
@@ -260,11 +260,6 @@ module cxx_decls_merged {
   header "cxx-decls-merged.h"
 }
 
-module config {
-  header "config.h"
-  config_macros [exhaustive] WANT_FOO, WANT_BAR
-}
-
 module diag_flags {
   header "diag_flags.h"
 }
diff --git a/clang/test/Modules/config_macros.m 
b/clang/test/Modules/config_macros.m
index 15e2c16606ba29..65dd2a89a05c32 100644
--- a/clang/test/Modules/config_macros.m
+++ b/clang/test/Modules/config_macros.m
@@ -1,3 +1,39 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: %clang_cc1 -std=c99 -fmodules -fimplicit-module-maps -x objective-c 
-fmodules-cache-path=%t 

[clang] [clang] Diagnose config_macros before building modules (PR #83641)

2024-03-01 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/83641

Before this patch, if a module fails to build because of a missing 
config_macro, the user will never see the config macro warning. This patch 
diagnoses this before building, and each subsequent time a module is imported.

rdar://123921931

>From 4edc58151460ae21baa312a91aa5a7955857c8a5 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 1 Mar 2024 17:18:20 -0800
Subject: [PATCH] [clang] Diagnose config_macros before building modules

Before this patch, if a module fails to build because of a missing
config_macro, the user will never see the config macro warning. This
patch diagnoses this before building, and each subsequent time a
module is imported.

rdar://123921931
---
 clang/lib/Frontend/CompilerInstance.cpp| 28 ++
 clang/test/Modules/Inputs/config.h |  7 
 clang/test/Modules/Inputs/module.modulemap |  5 ---
 clang/test/Modules/config_macros.m | 45 --
 4 files changed, 62 insertions(+), 23 deletions(-)
 delete mode 100644 clang/test/Modules/Inputs/config.h

diff --git a/clang/lib/Frontend/CompilerInstance.cpp 
b/clang/lib/Frontend/CompilerInstance.cpp
index 4443073775..14fd140f03cc36 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -1591,6 +1591,14 @@ static void checkConfigMacro(Preprocessor , StringRef 
ConfigMacro,
   }
 }
 
+static void checkConfigMacros(Preprocessor , Module *M,
+  SourceLocation ImportLoc) {
+  clang::Module *TopModule = M->getTopLevelModule();
+  for (unsigned I = 0, N = TopModule->ConfigMacros.size(); I != N; ++I) {
+checkConfigMacro(PP, TopModule->ConfigMacros[I], M, ImportLoc);
+  }
+}
+
 /// Write a new timestamp file with the given path.
 static void writeTimestampFile(StringRef TimestampFile) {
   std::error_code EC;
@@ -1829,6 +1837,12 @@ ModuleLoadResult 
CompilerInstance::findOrCompileModuleAndReadAST(
   Module *M =
   HS.lookupModule(ModuleName, ImportLoc, true, !IsInclusionDirective);
 
+  // Check for any configuration macros that have changed. This is done
+  // immediately before potentially building a module in case this module
+  // depends on having one of its configuration macros defined to successfully
+  // build. If this is not done the user will never see the warning.
+  checkConfigMacros(getPreprocessor(), M, ImportLoc);
+
   // Select the source and filename for loading the named module.
   std::string ModuleFilename;
   ModuleSource Source =
@@ -2006,6 +2020,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
   if (auto MaybeModule = MM.getCachedModuleLoad(*Path[0].first)) {
 // Use the cached result, which may be nullptr.
 Module = *MaybeModule;
+// Config macros are already checked before building a module, but they 
need
+// to be checked at each import location in case any of the config macros
+// have a new value at the current `ImportLoc`.
+if (Module)
+  checkConfigMacros(getPreprocessor(), Module, ImportLoc);
   } else if (ModuleName == getLangOpts().CurrentModule) {
 // This is the module we're building.
 Module = PP->getHeaderSearchInfo().lookupModule(
@@ -2146,18 +2165,11 @@ CompilerInstance::loadModule(SourceLocation ImportLoc,
 TheASTReader->makeModuleVisible(Module, Visibility, ImportLoc);
   }
 
-  // Check for any configuration macros that have changed.
-  clang::Module *TopModule = Module->getTopLevelModule();
-  for (unsigned I = 0, N = TopModule->ConfigMacros.size(); I != N; ++I) {
-checkConfigMacro(getPreprocessor(), TopModule->ConfigMacros[I],
- Module, ImportLoc);
-  }
-
   // Resolve any remaining module using export_as for this one.
   getPreprocessor()
   .getHeaderSearchInfo()
   .getModuleMap()
-  .resolveLinkAsDependencies(TopModule);
+  .resolveLinkAsDependencies(Module->getTopLevelModule());
 
   LastModuleImportLoc = ImportLoc;
   LastModuleImportResult = ModuleLoadResult(Module);
diff --git a/clang/test/Modules/Inputs/config.h 
b/clang/test/Modules/Inputs/config.h
deleted file mode 100644
index 4c124b0bf82b7c..00
--- a/clang/test/Modules/Inputs/config.h
+++ /dev/null
@@ -1,7 +0,0 @@
-#ifdef WANT_FOO
-int* foo(void);
-#endif
-
-#ifdef WANT_BAR
-char *bar(void);
-#endif
diff --git a/clang/test/Modules/Inputs/module.modulemap 
b/clang/test/Modules/Inputs/module.modulemap
index e7cb4b27bc08b9..47f6c5c1010d7d 100644
--- a/clang/test/Modules/Inputs/module.modulemap
+++ b/clang/test/Modules/Inputs/module.modulemap
@@ -260,11 +260,6 @@ module cxx_decls_merged {
   header "cxx-decls-merged.h"
 }
 
-module config {
-  header "config.h"
-  config_macros [exhaustive] WANT_FOO, WANT_BAR
-}
-
 module diag_flags {
   header "diag_flags.h"
 }
diff --git a/clang/test/Modules/config_macros.m 
b/clang/test/Modules/config_macros.m
index 15e2c16606ba29..65dd2a89a05c32 100644
--- 

[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-28 Thread Michael Spencer via cfe-commits


@@ -175,8 +192,19 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.ShowCarets = false;
   // Don't write out diagnostic file.
   DiagOpts.DiagnosticSerializationFile.clear();
-  // Don't emit warnings as errors (and all other warnings too).
-  DiagOpts.IgnoreWarnings = true;
+  // Don't emit warnings except for scanning specific warnings.
+  // TODO: It would be useful to add a more principled way to ignore all
+  //   warnings that come from source code. The issue is that we need to
+  //   ignore warnings that could be surpressed by
+  //   `#pragma clang diagnostic`, while still allowing some scanning
+  //   warnings for things we're not ready to turn into errors yet.

Bigcheese wrote:

The scanner never sees `#pragma clang diagnostic`, so there's no issue with 
code that uses that to turn warnings on. The original issue was with warnings 
getting turned off via `#pragma clang diagnostic`, but the new code removes all 
warnings and `Werror`s, so you're just left with default warnings.

The goal here was to keep driver warnings (which are lost otherwise) and allow 
us to have dedicated scanner warnings. I do think we want more control over 
this, possibly add a scanner bit to diagnostics so we can be explicit about 
which warnings we expect from the scanner, but I think this change is fine for 
now.

https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-23 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-23 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

CI failure is a preexisting Flang test failure and a preexisting trailing 
whitespace issue.

https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-23 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/82568
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-23 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

CI failure was a preexisting trailing whitespace issue.

https://github.com/llvm/llvm-project/pull/82568
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-23 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82294

>From 45852f569575d0735c686376ad30753fe791db26 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Thu, 15 Feb 2024 16:44:45 -0800
Subject: [PATCH] [clang][ScanDeps] Allow PCHs to have different VFS overlays

It turns out it's not that uncommon for real code to pass a different
set of VFSs while building a PCH than while using the PCH. This can
cause problems as seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If
you scan `compile-commands-tu-no-vfs-error.json` without -Werror and
run the resulting commands, Clang will emit a fatal error while trying
to emit a note saying that it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are
part of an included PCH, as the same issue can occur in a module we
are building if we drop VFSs. This has to be textual because we have
no guarantee the PCH had the same list of VFSs as the current TU.
---
 .../Basic/DiagnosticSerializationKinds.td |   4 +-
 .../DependencyScanning/ModuleDepCollector.h   |   5 +
 .../DependencyScanningWorker.cpp  |  58 +++--
 .../DependencyScanning/ModuleDepCollector.cpp |  34 --
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 114 --
 llvm/include/llvm/ADT/StringSet.h |   4 +
 6 files changed, 190 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 4c4659ed517e0a..eb27de5921d6a1 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,7 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
-def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def warn_pch_vfsoverlay_mismatch : Warning<
+  "PCH was compiled with different VFS overlay files than are currently in 
use">,
+  InGroup>;
 def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
 def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
diff --git 
a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h 
b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 13ad2530864927..081899cc2c8503 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -149,6 +149,8 @@ struct ModuleDeps {
   BuildInfo;
 };
 
+using PrebuiltModuleVFSMapT = llvm::StringMap>;
+
 class ModuleDepCollector;
 
 /// Callback that records textual includes and direct modular includes/imports
@@ -214,6 +216,7 @@ class ModuleDepCollector final : public DependencyCollector 
{
  CompilerInstance , DependencyConsumer ,
  DependencyActionController ,
  CompilerInvocation OriginalCI,
+ PrebuiltModuleVFSMapT PrebuiltModuleVFSMap,
  ScanningOptimizations OptimizeArgs, bool EagerLoadModules,
  bool IsStdModuleP1689Format);
 
@@ -233,6 +236,8 @@ class ModuleDepCollector final : public DependencyCollector 
{
   DependencyConsumer 
   /// Callbacks for computing dependency information.
   DependencyActionController 
+  /// Mapping from prebuilt AST files to their sorted list of VFS overlay 
files.
+  PrebuiltModuleVFSMapT PrebuiltModuleVFSMap;
   /// Path to the main source file.
   std::string MainFile;
   /// Hash identifying the compilation conditions of the current TU.
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..b252463a08b8d7 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -24,6 +24,7 @@
 #include "clang/Tooling/DependencyScanning/DependencyScanningService.h"
 #include "clang/Tooling/DependencyScanning/ModuleDepCollector.h"
 #include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Host.h"
@@ -67,7 +68,7 @@ static bool checkHeaderSearchPaths(const HeaderSearchOptions 
,
   if (LangOpts.Modules) {
 if (HSOpts.VFSOverlayFiles != ExistingHSOpts.VFSOverlayFiles) {
   if (Diags) {
-Diags->Report(diag::err_pch_vfsoverlay_mismatch);
+Diags->Report(diag::warn_pch_vfsoverlay_mismatch);
 auto VFSNote = [&](int Type, ArrayRef VFSOverlays) {
   if (VFSOverlays.empty()) {
  

[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-23 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82568

>From a690c96562dea29a760390644d78a01a263993ca Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] [clang][ScanDeps] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+// We still emit undefines here as they may be undefining a predefined 
macro
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
 class DependencyScanningAction : public tooling::ToolAction {
@@ -203,6 +275,8 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 

[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-23 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82568

>From eb622c20b8d84afabdbb82543c1f9e4889639735 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] [clang][ScanDeps] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+// We still emit undefines here as they may be undefining a predefined 
macro
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
 class DependencyScanningAction : public tooling::ToolAction {
@@ -203,6 +275,8 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 

[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-22 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

Try double quotes.

https://github.com/llvm/llvm-project/pull/82568
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-22 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82568

>From 9759145f34306f1832b1deff0ca1b5e41d2ad89d Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] [clang][ScanDeps] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+// We still emit undefines here as they may be undefining a predefined 
macro
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
 class DependencyScanningAction : public tooling::ToolAction {
@@ -203,6 +275,8 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 

[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-21 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

Windows didn't like the quoted argument, now let's see if Linux is happy with 
an unquoted argument.

https://github.com/llvm/llvm-project/pull/82568
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-21 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82568

>From d8bfbdeedbf0a3bdd2db25e7dd389d6f223091a3 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] [clang][ScanDeps] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+// We still emit undefines here as they may be undefining a predefined 
macro
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
 class DependencyScanningAction : public tooling::ToolAction {
@@ -203,6 +275,8 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 

[clang] reland: [clang][ScanDeps] Canonicalize -D and -U flags (PR #82568)

2024-02-21 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/82568

Canonicalize `-D` and `-U` flags by sorting them and only keeping the last 
instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with a simple 
identifier that we can guarantee a simple analysis of can determine if two 
flags refer to the same identifier or not. See the comment on 
`getSimpleMacroName()` for details of what the issues are.

Previous version of this had issues with sed differences between macOS, Linux, 
and Windows. This test doesn't check paths, so just don't run sed.
Other tests should use `sed -E 's:?:/:g'` to get portable behavior.

>From 21300748d867e321fc16ba51aea4d0330d8d27a6 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] [clang][ScanDeps] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for 

[clang] [clang][ScanDeps] Canonicalize -D and -U flags (PR #82298)

2024-02-21 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

Weird that it passes on macOS. Also weird that discord doesn't ping about build 
failures anymore, seems that merging on github now just blames all CI failures 
on noreply@​github.com.

https://github.com/llvm/llvm-project/pull/82298
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ScanDeps] Canonicalize -D and -U flags (PR #82298)

2024-02-20 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/82298
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ScanDeps] Canonicalize -D and -U flags (PR #82298)

2024-02-20 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/82298

>From b60972ed9183dd9e2deb3860f7732dc87bdfc84e Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 74 
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 166 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..7477b930188b4f 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,78 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.ltrim(" \t");
+  std::size_t I = 0;
+
+  auto FinishName = [&]() -> std::optional {
+StringRef SimpleName = Name.slice(0, I);
+if (SimpleName.empty())
+  return std::nullopt;
+return SimpleName;
+  };
+
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  return FinishName();
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+  return FinishName();
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(
+  SimpleNames.rbegin(), SimpleNames.rend(),
+  [](const MacroOpt , const MacroOpt ) { return A.first == B.first; });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+// We still emit undefines here as they may be undefining a predefined 
macro
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for the given compiler invocation.
 class DependencyScanningAction : public tooling::ToolAction {
@@ -203,6 +275,8 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 CompilerInvocation 

[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-20 Thread Michael Spencer via cfe-commits


@@ -67,7 +68,7 @@ static bool checkHeaderSearchPaths(const HeaderSearchOptions 
,
   if (LangOpts.Modules) {
 if (HSOpts.VFSOverlayFiles != ExistingHSOpts.VFSOverlayFiles) {
   if (Diags) {
-Diags->Report(diag::err_pch_vfsoverlay_mismatch);
+Diags->Report(diag::warn_pch_vfsoverlay_mismatch);

Bigcheese wrote:

Maybe, this is currently disabled by default, but is useful for normal PCH too. 
It would be an issue if we ever want to start changing the original command 
lines too though (to increase cache hits for normal TUs), there you only want 
to warn when scanning, not in the actual build.

https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-20 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

> Just to clarify, this patch doesn't attempt to solve the case where Clang can 
> crash when the VFS overlay files are different between the PCH and the TU, 
> since that's existing behavior. Correct?

Yep, this patch still allows that to happen in cases where it would today.

https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-20 Thread Michael Spencer via cfe-commits


@@ -65,11 +66,25 @@ static void optimizeHeaderSearchOpts(HeaderSearchOptions 
,
 llvm::DenseSet Visited;
 std::function VisitMF =
 [&](const serialization::ModuleFile *MF) {
-  VFSUsage |= MF->VFSUsage;
   Visited.insert(MF);
-  for (const serialization::ModuleFile *Import : MF->Imports)
-if (!Visited.contains(Import))
-  VisitMF(Import);
+  if (MF->Kind == serialization::MK_ImplicitModule) {
+VFSUsage |= MF->VFSUsage;
+// We only need to recurse into implicit modules. Other module 
types
+// will have the correct set of VFSs for anything they depend on.
+for (const serialization::ModuleFile *Import : MF->Imports)
+  if (!Visited.contains(Import))
+VisitMF(Import);
+  } else {
+// This is not an implicitly built module, so it may have different
+// VFS options. Fall back to a string comparison instead.
+auto VFSMap = PrebuiltModuleVFSMap.find(MF->FileName);
+if (VFSMap == PrebuiltModuleVFSMap.end())
+  return;

Bigcheese wrote:

I think it's also possible that you used an explicit module in the original 
driver command, or prebuilt module path. The idea of silently ignoring unknown 
modules here was to preserve the existing behavior of ignoring them. I would 
eventually like to make this an error, but doing this works in most cases now, 
so I don't want to do that yet. It would probably be good to make it a warning 
in this patch though, I don't expect this return to fire unless something weird 
is going on.

https://github.com/llvm/llvm-project/pull/82294
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ScanDeps] Canonicalize -D and -U flags (PR #82298)

2024-02-19 Thread Michael Spencer via cfe-commits


@@ -179,6 +179,73 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.trim(" \t");
+  std::size_t I = 0;
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  goto EndOfMacro;
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+EndOfMacro:
+  StringRef SimpleName = Name.slice(0, I);
+  if (SimpleName.empty())
+return std::nullopt;
+  return SimpleName;
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(SimpleNames.rbegin(), SimpleNames.rend(), [](const 
MacroOpt , const MacroOpt ) {
+return A.first == B.first;
+  });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));

Bigcheese wrote:

It could be undefining a predefined macro.

https://github.com/llvm/llvm-project/pull/82298
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][ScanDeps] Canonicalize -D and -U flags (PR #82298)

2024-02-19 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/82298

Canonicalize `-D` and `-U` flags by sorting them and only keeping the last 
instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with a simple 
identifier that we can guarantee a simple analysis of can determine if two 
flags refer to the same identifier or not. See the comment on 
`getSimpleMacroName()` for details of what the issues are.

>From c89bcfd061066433c90b854ebb0bc369268797ee Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 16 Feb 2024 22:05:25 -0800
Subject: [PATCH] Canonicalize -D and -U flags

Canonicalize `-D` and `-U` flags by sorting them and only keeping the
last instance of a given name.

This optimization will only fire if all `-D` and `-U` flags start with
a simple identifier that we can guarantee a simple analysis of can
determine if two flags refer to the same identifier or not. See the
comment on `getSimpleMacroName()` for details of what the issues are.
---
 .../DependencyScanningService.h   |  5 +-
 .../DependencyScanningWorker.cpp  | 69 +++
 .../optimize-canonicalize-macros.m| 87 +++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |  1 +
 4 files changed, 161 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-canonicalize-macros.m

diff --git 
a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h 
b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4f9867262a275c..557f0e547ab4a8 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -60,7 +60,10 @@ enum class ScanningOptimizations {
   /// Remove unused -ivfsoverlay arguments.
   VFS = 4,
 
-  DSS_LAST_BITMASK_ENUM(VFS),
+  /// Canonicalize -D and -U options.
+  Macros = 8,
+
+  DSS_LAST_BITMASK_ENUM(Macros),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 3cf3ad8a4e4907..2e2ad5c037d765 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -179,6 +179,73 @@ static void sanitizeDiagOpts(DiagnosticOptions ) {
   DiagOpts.IgnoreWarnings = true;
 }
 
+// Clang implements -D and -U by splatting text into a predefines buffer. This
+// allows constructs such as `-DFඞ=3 "-D F\u{0D9E} 4 3 2”` to be accepted and
+// define the same macro, or adding C++ style comments before the macro name.
+//
+// This function checks that the first non-space characters in the macro
+// obviously form an identifier that can be uniqued on without lexing. Failing
+// to do this could lead to changing the final definition of a macro.
+//
+// We could set up a preprocessor and actually lex the name, but that's very
+// heavyweight for a situation that will almost never happen in practice.
+static std::optional getSimpleMacroName(StringRef Macro) {
+  StringRef Name = Macro.split("=").first.trim(" \t");
+  std::size_t I = 0;
+  for (; I != Name.size(); ++I) {
+switch (Name[I]) {
+case '(': // Start of macro parameter list
+case ' ': // End of macro name
+case '\t':
+  goto EndOfMacro;
+case '_':
+  continue;
+default:
+  if (llvm::isAlnum(Name[I]))
+continue;
+  return std::nullopt;
+}
+  }
+EndOfMacro:
+  StringRef SimpleName = Name.slice(0, I);
+  if (SimpleName.empty())
+return std::nullopt;
+  return SimpleName;
+}
+
+static void canonicalizeDefines(PreprocessorOptions ) {
+  using MacroOpt = std::pair;
+  std::vector SimpleNames;
+  SimpleNames.reserve(PPOpts.Macros.size());
+  std::size_t Index = 0;
+  for (const auto  : PPOpts.Macros) {
+auto SName = getSimpleMacroName(M.first);
+// Skip optimizing if we can't guarantee we can preserve relative order.
+if (!SName)
+  return;
+SimpleNames.emplace_back(*SName, Index);
+++Index;
+  }
+
+  llvm::stable_sort(SimpleNames, [](const MacroOpt , const MacroOpt ) {
+return A.first < B.first;
+  });
+  // Keep the last instance of each macro name by going in reverse
+  auto NewEnd = std::unique(SimpleNames.rbegin(), SimpleNames.rend(), [](const 
MacroOpt , const MacroOpt ) {
+return A.first == B.first;
+  });
+  SimpleNames.erase(SimpleNames.begin(), NewEnd.base());
+
+  // Apply permutation.
+  decltype(PPOpts.Macros) NewMacros;
+  NewMacros.reserve(SimpleNames.size());
+  for (std::size_t I = 0, E = SimpleNames.size(); I != E; ++I) {
+std::size_t OriginalIndex = SimpleNames[I].second;
+NewMacros.push_back(std::move(PPOpts.Macros[OriginalIndex]));
+  }
+  std::swap(PPOpts.Macros, NewMacros);
+}
+
 /// A clang tool that runs the preprocessor in a mode that's optimized for
 /// dependency scanning for 

[clang] [llvm] [clang][ScanDeps] Allow PCHs to have different VFS overlays (PR #82294)

2024-02-19 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/82294

It turns out it's not that uncommon for real code to pass a different set of 
VFSs while building a PCH than while using the PCH. This can cause problems as 
seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If you scan 
`compile-commands-tu-no-vfs-error.json` without -Werror and run the resulting 
commands, Clang will emit a fatal error while trying to emit a note saying that 
it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are part of 
an included PCH, as the same issue can occur in a module we are building if we 
drop VFSs. This has to be textual because we have no guarantee the PCH had the 
same list of VFSs as the current TU.

This uses the `PrebuiltModuleListener` to collect `VFSOverlayFiles` instead of 
trying to extract it out of a `serialization::ModuleFile` each time it's 
needed. There's not a great way to just store a pointer to the list of strings 
in the serialized AST.

>From fa28ec316f2e4c39bbcd882328b0fe53691bf9ee Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Thu, 15 Feb 2024 16:44:45 -0800
Subject: [PATCH] [clang][ScanDeps] Allow PCHs to have different VFS overlays

It turns out it's not that uncommon for real code to pass a different
set of VFSs while building a PCH than while using the PCH. This can
cause problems as seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If
you scan `compile-commands-tu-no-vfs-error.json` without -Werror and
run the resulting commands, Clang will emit a fatal error while trying
to emit a note saying that it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are
part of an included PCH, as the same issue can occur in a module we
are building if we drop VFSs. This has to be textual because we have
no guarantee the PCH had the same list of VFSs as the current TU.
---
 .../Basic/DiagnosticSerializationKinds.td |   4 +-
 .../DependencyScanning/ModuleDepCollector.h   |   5 +
 .../DependencyScanningWorker.cpp  |  58 +++--
 .../DependencyScanning/ModuleDepCollector.cpp |  34 --
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 114 --
 llvm/include/llvm/ADT/StringSet.h |   4 +
 6 files changed, 190 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 4c4659ed517e0a..eb27de5921d6a1 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,7 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
-def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def warn_pch_vfsoverlay_mismatch : Warning<
+  "PCH was compiled with different VFS overlay files than are currently in 
use">,
+  InGroup>;
 def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
 def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
diff --git 
a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h 
b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index 13ad2530864927..081899cc2c8503 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -149,6 +149,8 @@ struct ModuleDeps {
   BuildInfo;
 };
 
+using PrebuiltModuleVFSMapT = llvm::StringMap>;
+
 class ModuleDepCollector;
 
 /// Callback that records textual includes and direct modular includes/imports
@@ -214,6 +216,7 @@ class ModuleDepCollector final : public DependencyCollector 
{
  CompilerInstance , DependencyConsumer ,
  DependencyActionController ,
  CompilerInvocation OriginalCI,
+ PrebuiltModuleVFSMapT PrebuiltModuleVFSMap,
  ScanningOptimizations OptimizeArgs, bool EagerLoadModules,
  bool IsStdModuleP1689Format);
 
@@ -233,6 +236,8 @@ class ModuleDepCollector final : public DependencyCollector 
{
   DependencyConsumer 
   /// Callbacks for computing dependency information.
   DependencyActionController 
+  /// Mapping from prebuilt AST files to their sorted list of VFS overlay 
files.
+  PrebuiltModuleVFSMapT PrebuiltModuleVFSMap;
   /// Path to the main source file.
   std::string MainFile;
   /// Hash identifying the compilation conditions of the current TU.
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 

[clang] [clang][DependencyScanner] Remove unused -fmodule-map-file arguments (PR #80090)

2024-01-31 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/80090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][DependencyScanner] Remove unused -fmodule-map-file arguments (PR #80090)

2024-01-30 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/80090

Since we already add a `-fmodule-map-file=` argument for every used modulemap, 
we can remove all `ModuleMapFiles` entries before adding them.

This reduces the number of module variants when `-fmodule-map-file=` appears on 
the original command line.

>From 73c023e0f1b8d81ee25c75c19bfd0322675fcf44 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Tue, 30 Jan 2024 17:42:48 -0800
Subject: [PATCH] [clang][DependencyScanner] Remove unused -fmodule-map-file
 arguments

Since we already add a `-fmodule-map-file=` argument for every used
modulemap, we can remove all `ModuleMapFiles` entries before adding
them.

This reduces the number of module variants when `-fmodule-map-file=`
appears on the original command line.
---
 .../DependencyScanning/ModuleDepCollector.cpp |  4 ++
 .../test/ClangScanDeps/optimize-fmodulemap.m  | 66 +++
 2 files changed, 70 insertions(+)
 create mode 100644 clang/test/ClangScanDeps/optimize-fmodulemap.m

diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp 
b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index b807dc8432185..995d8b2899c8d 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -211,6 +211,10 @@ 
ModuleDepCollector::getInvocationAdjustedForModuleBuildWithoutOutputs(
   ScanInstance.getFileManager().getFile(Deps.ClangModuleMapFile);
   assert(CurrentModuleMapEntry && "module map file entry not found");
 
+  // Remove directly passed modulemap files. They will get added back if they
+  // were actually used.
+  CI.getMutFrontendOpts().ModuleMapFiles.clear();
+
   auto DepModuleMapFiles = collectModuleMapFiles(Deps.ClangModuleDeps);
   for (StringRef ModuleMapFile : Deps.ModuleMapFileDeps) {
 // TODO: Track these as `FileEntryRef` to simplify the equality check 
below.
diff --git a/clang/test/ClangScanDeps/optimize-fmodulemap.m 
b/clang/test/ClangScanDeps/optimize-fmodulemap.m
new file mode 100644
index 0..5e9affb30b9c1
--- /dev/null
+++ b/clang/test/ClangScanDeps/optimize-fmodulemap.m
@@ -0,0 +1,66 @@
+// Check that unused directly passed -fmodule-map-file options get dropped.
+
+// RUN: rm -rf %t && split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/build/cdb.json.in > %t/build/cdb.json
+// RUN: clang-scan-deps -compilation-database %t/build/cdb.json \
+// RUN:   -format experimental-full > %t/deps.json
+// RUN: cat %t/deps.json | sed 's:\?:/:g' | FileCheck %s -DPREFIX=%/t
+
+// CHECK:  {
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT: {
+// CHECK-NEXT:   "clang-module-deps": [
+// CHECK-NEXT: {
+// CHECK-NEXT:   "context-hash": "{{.*}}",
+// CHECK-NEXT:   "module-name": "B"
+// CHECK-NEXT: }
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "clang-modulemap-file": 
"[[PREFIX]]/modules/A/module.modulemap",
+// CHECK-NEXT:   "command-line": [
+// CHECK-NOT:  
"-fmodule-map-file=[[PREFIX]]/modules/A/module.modulemap"
+// CHECK:  
"-fmodule-map-file=[[PREFIX]]/modules/B/module.modulemap"
+// CHECK-NOT:  
"-fmodule-map-file=[[PREFIX]]/modules/A/module.modulemap"
+// CHECK:],
+// CHECK-NEXT:   "context-hash": "{{.*}}",
+// CHECK-NEXT:   "file-deps": [
+// CHECK:],
+// CHECK-NEXT:   "name": "A"
+// CHECK-NEXT: },
+// CHECK-NEXT: {
+// CHECK-NEXT:   "clang-module-deps": [],
+// CHECK-NEXT:   "clang-modulemap-file": 
"[[PREFIX]]/modules/B/module.modulemap",
+// CHECK-NEXT:   "command-line": [
+// CHECK-NOT:  "-fmodule-map-file=
+// CHECK:],
+// CHECK-NEXT:   "context-hash": "{{.*}}",
+// CHECK-NEXT:   "file-deps": [
+// CHECK:],
+// CHECK-NEXT:   "name": "B"
+// CHECK-NEXT: }
+// CHECK-NEXT:   ],
+// CHECK-NEXT:   "translation-units": [
+// CHECK:]
+// CHECK:  }
+
+//--- build/cdb.json.in
+[{
+  "directory": "DIR",
+  "command": "clang -c DIR/tu.m -I DIR/modules/B 
-fmodule-map-file=DIR/modules/A/module.modulemap -fmodules 
-fmodules-cache-path=DIR/cache -fimplicit-module-maps",
+  "file": "DIR/tu.m"
+}]
+
+//--- build/vfs.yaml.in
+
+//--- tu.m
+@import A;
+
+//--- modules/A/module.modulemap
+module A { header "A.h" }
+
+//--- modules/A/A.h
+#include 
+
+//--- modules/B/module.modulemap
+module B { header "B.h" }
+
+//--- modules/B/B.h

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] b21a2f9 - [clang][scan-deps] Stop scanning if any scanning setup emits an error.

2024-01-30 Thread Michael Spencer via cfe-commits

Author: Michael Spencer
Date: 2024-01-30T17:03:13-08:00
New Revision: b21a2f9365b6c5fd464a97be5dfe7085742870ef

URL: 
https://github.com/llvm/llvm-project/commit/b21a2f9365b6c5fd464a97be5dfe7085742870ef
DIFF: 
https://github.com/llvm/llvm-project/commit/b21a2f9365b6c5fd464a97be5dfe7085742870ef.diff

LOG: [clang][scan-deps] Stop scanning if any scanning setup emits an error.

Without this scanning will continue and later hit an assert that the
number of `RedirectingFileSystem`s matches the number of -ivfsoverlay
arguments.

Added: 
clang/test/ClangScanDeps/missing-vfs.m

Modified: 
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp

Removed: 




diff  --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp 
b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 390cbe5aa65e1..3cf3ad8a4e490 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -322,6 +322,9 @@ class DependencyScanningAction : public tooling::ToolAction 
{
 else
   Action = std::make_unique();
 
+if (ScanInstance.getDiagnostics().hasErrorOccurred())
+  return false;
+
 const bool Result = ScanInstance.ExecuteAction(*Action);
 
 if (Result)

diff  --git a/clang/test/ClangScanDeps/missing-vfs.m 
b/clang/test/ClangScanDeps/missing-vfs.m
new file mode 100644
index 0..e825b00526728
--- /dev/null
+++ b/clang/test/ClangScanDeps/missing-vfs.m
@@ -0,0 +1,18 @@
+// Check that a missing VFS errors before trying to scan anything.
+
+// RUN: rm -rf %t && split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/build/cdb.json.in > %t/build/cdb.json
+// RUN: not clang-scan-deps -compilation-database %t/build/cdb.json \
+// RUN:   -format experimental-full 2>&1 | FileCheck %s
+
+// CHECK: virtual filesystem overlay file
+// CHECK: not found
+
+//--- build/cdb.json.in
+[{
+  "directory": "DIR",
+  "command": "clang -c DIR/tu.m -ivfsoverlay DIR/vfs.yaml",
+  "file": "DIR/tu.m"
+}]
+
+//--- tu.m



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-30 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/73734
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-30 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/73734

>From 42666e6c0b46a83f0b4fbc7d7945825c56e6ac5a Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 24 Feb 2023 17:18:51 -0800
Subject: [PATCH] [clang][DepScan] Remove unused -ivfsoverlay files

`-ivfsoverlay` files are unused when building most modules. Enable
removing them by,
* adding a way to visit the filesystem tree with extensible RTTI to
  access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
  actually redirects a file access.
* Storing this information in each PCM.

Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
---
 .../Basic/DiagnosticSerializationKinds.td |   3 +
 clang/include/clang/Basic/FileManager.h   |   4 +
 clang/include/clang/Lex/HeaderSearch.h|   7 +
 clang/include/clang/Lex/HeaderSearchOptions.h |   6 +-
 .../include/clang/Serialization/ASTBitCodes.h |   3 +
 clang/include/clang/Serialization/ASTReader.h |  16 +-
 clang/include/clang/Serialization/ASTWriter.h |   4 +-
 .../include/clang/Serialization/ModuleFile.h  |   3 +
 .../DependencyScanningFilesystem.h|   6 +-
 .../DependencyScanningService.h   |  11 +-
 clang/lib/Basic/FileManager.cpp   |   7 +
 clang/lib/Frontend/CompilerInvocation.cpp |   1 +
 clang/lib/Lex/HeaderSearch.cpp|  22 ++
 clang/lib/Serialization/ASTReader.cpp |  36 ++--
 clang/lib/Serialization/ASTWriter.cpp |  52 -
 .../DependencyScanningFilesystem.cpp  |   6 +-
 .../DependencyScanningWorker.cpp  |  86 ++--
 .../DependencyScanning/ModuleDepCollector.cpp |  74 +--
 .../ClangScanDeps/optimize-vfs-edgecases.m|  84 
 clang/test/ClangScanDeps/optimize-vfs-leak.m  | 105 ++
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 129 
 clang/test/ClangScanDeps/optimize-vfs.m   | 193 ++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |   1 +
 llvm/include/llvm/Support/VirtualFileSystem.h |  55 -
 llvm/lib/Support/VirtualFileSystem.cpp|  26 +++
 .../Support/VirtualFileSystemTest.cpp |  86 
 26 files changed, 947 insertions(+), 79 deletions(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-edgecases.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-leak.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-pch.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs.m

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 11c706ebf84b5..4c4659ed517e0 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,6 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
+def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
+def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
 def err_pch_version_too_old : Error<
 "PCH file uses an older PCH format that is no longer supported">;
diff --git a/clang/include/clang/Basic/FileManager.h 
b/clang/include/clang/Basic/FileManager.h
index 56cb093dd8c37..997c17a0ffcfc 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -248,6 +248,10 @@ class FileManager : public RefCountedBase {
 return FS;
   }
 
+  /// Enable or disable tracking of VFS usage. Used to not track full header
+  /// search and implicit modulemap lookup.
+  void trackVFSUsage(bool Active);
+
   void setVirtualFileSystem(IntrusiveRefCntPtr FS) {
 this->FS = std::move(FS);
   }
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index a2c33842924b1..705dcfa8aacc3 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -576,6 +576,13 @@ class HeaderSearch {
   /// Note: implicit module maps don't contribute to entry usage.
   std::vector computeUserEntryUsage() const;
 
+  /// Collect which HeaderSearchOptions::VFSOverlayFiles have been meaningfully
+  /// used so far and mark their index with 'true' in the resulting bit vector.
+  ///
+  /// Note: this ignores VFSs that redirect non-affecting files such as unused
+  /// modulemaps.
+  std::vector collectVFSUsageAndClear() const;
+
   /// This method returns a HeaderMap for the specified
   /// FileEntry, uniquing 

[llvm] [clang] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-29 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/73734

>From 0559ec44d2d3c39292bae6d6431dde9626ac755e Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 24 Feb 2023 17:18:51 -0800
Subject: [PATCH 1/2] [clang][DepScan] Remove unused -ivfsoverlay files

`-ivfsoverlay` files are unused when building most modules. Enable
removing them by,
* adding a way to visit the filesystem tree with extensible RTTI to
  access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
  actually redirects a file access.
* Storing this information in each PCM.

Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
---
 .../Basic/DiagnosticSerializationKinds.td |   3 +
 clang/include/clang/Basic/FileManager.h   |   4 +
 clang/include/clang/Lex/HeaderSearch.h|   6 +
 clang/include/clang/Lex/HeaderSearchOptions.h |   6 +-
 .../include/clang/Serialization/ASTBitCodes.h |   3 +
 clang/include/clang/Serialization/ASTReader.h |  16 +-
 .../include/clang/Serialization/ModuleFile.h  |   3 +
 .../DependencyScanningFilesystem.h|   6 +-
 .../DependencyScanningService.h   |  11 +-
 clang/lib/Basic/FileManager.cpp   |   7 +
 clang/lib/Frontend/CompilerInvocation.cpp |   1 +
 clang/lib/Lex/HeaderSearch.cpp|  16 ++
 clang/lib/Serialization/ASTReader.cpp |  41 ++--
 clang/lib/Serialization/ASTWriter.cpp |  41 +++-
 .../DependencyScanningFilesystem.cpp  |   6 +-
 .../DependencyScanningWorker.cpp  |  86 ++--
 .../DependencyScanning/ModuleDepCollector.cpp |  74 +--
 .../ClangScanDeps/optimize-vfs-edgecases.m|  84 
 clang/test/ClangScanDeps/optimize-vfs-leak.m  | 105 ++
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 129 
 clang/test/ClangScanDeps/optimize-vfs.m   | 193 ++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |   1 +
 llvm/include/llvm/Support/VirtualFileSystem.h |  55 -
 llvm/lib/Support/VirtualFileSystem.cpp|  26 +++
 .../Support/VirtualFileSystemTest.cpp |  86 
 25 files changed, 932 insertions(+), 77 deletions(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-edgecases.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-leak.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-pch.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs.m

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 11c706ebf84b54..4c4659ed517e0a 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,6 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
+def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
+def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
 def err_pch_version_too_old : Error<
 "PCH file uses an older PCH format that is no longer supported">;
diff --git a/clang/include/clang/Basic/FileManager.h 
b/clang/include/clang/Basic/FileManager.h
index 56cb093dd8c376..997c17a0ffcfcc 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -248,6 +248,10 @@ class FileManager : public RefCountedBase {
 return FS;
   }
 
+  /// Enable or disable tracking of VFS usage. Used to not track full header
+  /// search and implicit modulemap lookup.
+  void trackVFSUsage(bool Active);
+
   void setVirtualFileSystem(IntrusiveRefCntPtr FS) {
 this->FS = std::move(FS);
   }
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index a2c33842924b10..49e1a4a124b55d 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -576,6 +576,12 @@ class HeaderSearch {
   /// Note: implicit module maps don't contribute to entry usage.
   std::vector computeUserEntryUsage() const;
 
+  /// Determine which HeaderSearchOptions::VFSOverlayFiles have been
+  /// successfully used so far and mark their index with 'true' in the 
resulting
+  /// bit vector.
+  /// Note: implicit module maps don't contribute to entry usage.
+  std::vector computeVFSUsageAndClear() const;
+
   /// This method returns a HeaderMap for the specified
   /// FileEntry, uniquing them through the 'HeaderMaps' datastructure.
   const HeaderMap 

[clang] [llvm] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-25 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/73734

>From 0559ec44d2d3c39292bae6d6431dde9626ac755e Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 24 Feb 2023 17:18:51 -0800
Subject: [PATCH] [clang][DepScan] Remove unused -ivfsoverlay files

`-ivfsoverlay` files are unused when building most modules. Enable
removing them by,
* adding a way to visit the filesystem tree with extensible RTTI to
  access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
  actually redirects a file access.
* Storing this information in each PCM.

Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
---
 .../Basic/DiagnosticSerializationKinds.td |   3 +
 clang/include/clang/Basic/FileManager.h   |   4 +
 clang/include/clang/Lex/HeaderSearch.h|   6 +
 clang/include/clang/Lex/HeaderSearchOptions.h |   6 +-
 .../include/clang/Serialization/ASTBitCodes.h |   3 +
 clang/include/clang/Serialization/ASTReader.h |  16 +-
 .../include/clang/Serialization/ModuleFile.h  |   3 +
 .../DependencyScanningFilesystem.h|   6 +-
 .../DependencyScanningService.h   |  11 +-
 clang/lib/Basic/FileManager.cpp   |   7 +
 clang/lib/Frontend/CompilerInvocation.cpp |   1 +
 clang/lib/Lex/HeaderSearch.cpp|  16 ++
 clang/lib/Serialization/ASTReader.cpp |  41 ++--
 clang/lib/Serialization/ASTWriter.cpp |  41 +++-
 .../DependencyScanningFilesystem.cpp  |   6 +-
 .../DependencyScanningWorker.cpp  |  86 ++--
 .../DependencyScanning/ModuleDepCollector.cpp |  74 +--
 .../ClangScanDeps/optimize-vfs-edgecases.m|  84 
 clang/test/ClangScanDeps/optimize-vfs-leak.m  | 105 ++
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 129 
 clang/test/ClangScanDeps/optimize-vfs.m   | 193 ++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |   1 +
 llvm/include/llvm/Support/VirtualFileSystem.h |  55 -
 llvm/lib/Support/VirtualFileSystem.cpp|  26 +++
 .../Support/VirtualFileSystemTest.cpp |  86 
 25 files changed, 932 insertions(+), 77 deletions(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-edgecases.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-leak.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-pch.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs.m

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 11c706ebf84b542..4c4659ed517e0a0 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,6 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
+def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
+def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
 def err_pch_version_too_old : Error<
 "PCH file uses an older PCH format that is no longer supported">;
diff --git a/clang/include/clang/Basic/FileManager.h 
b/clang/include/clang/Basic/FileManager.h
index 56cb093dd8c376f..997c17a0ffcfcce 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -248,6 +248,10 @@ class FileManager : public RefCountedBase {
 return FS;
   }
 
+  /// Enable or disable tracking of VFS usage. Used to not track full header
+  /// search and implicit modulemap lookup.
+  void trackVFSUsage(bool Active);
+
   void setVirtualFileSystem(IntrusiveRefCntPtr FS) {
 this->FS = std::move(FS);
   }
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index a2c33842924b101..49e1a4a124b55df 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -576,6 +576,12 @@ class HeaderSearch {
   /// Note: implicit module maps don't contribute to entry usage.
   std::vector computeUserEntryUsage() const;
 
+  /// Determine which HeaderSearchOptions::VFSOverlayFiles have been
+  /// successfully used so far and mark their index with 'true' in the 
resulting
+  /// bit vector.
+  /// Note: implicit module maps don't contribute to entry usage.
+  std::vector computeVFSUsageAndClear() const;
+
   /// This method returns a HeaderMap for the specified
   /// FileEntry, uniquing them through the 'HeaderMaps' datastructure.
   const HeaderMap 

[llvm] [clang] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-25 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

I've updated the patch to use the alternative implementation Jan suggested.

https://github.com/llvm/llvm-project/pull/73734
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][DependencyScanner] Remove unused -ivfsoverlay files (PR #73734)

2024-01-25 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/73734

>From 3c7f36b087e09e1b7ab3231e8267bcdd8400fac4 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Fri, 24 Feb 2023 17:18:51 -0800
Subject: [PATCH] [clang][DepScan] Remove unused -ivfsoverlay files

`-ivfsoverlay` files are unused when building most modules. Enable
removing them by,
* adding a way to visit the filesystem tree with extensible RTTI to
  access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
  actually redirects a file access.
* Storing this information in each PCM.

Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
---
 .../Basic/DiagnosticSerializationKinds.td |   3 +
 clang/include/clang/Basic/FileManager.h   |   4 +
 clang/include/clang/Lex/HeaderSearch.h|   6 +
 .../include/clang/Serialization/ASTBitCodes.h |   3 +
 clang/include/clang/Serialization/ASTReader.h |   6 +-
 .../include/clang/Serialization/ModuleFile.h  |   3 +
 .../DependencyScanningFilesystem.h|   6 +-
 .../DependencyScanningService.h   |  11 +-
 clang/lib/Basic/FileManager.cpp   |   7 +
 clang/lib/Frontend/CompilerInvocation.cpp |   1 +
 clang/lib/Lex/HeaderSearch.cpp|  16 ++
 clang/lib/Serialization/ASTReader.cpp |  41 ++--
 clang/lib/Serialization/ASTWriter.cpp |  35 +++-
 .../DependencyScanningFilesystem.cpp  |   6 +-
 .../DependencyScanningWorker.cpp  |  74 +--
 .../DependencyScanning/ModuleDepCollector.cpp |  74 +--
 .../ClangScanDeps/optimize-vfs-edgecases.m|  84 
 clang/test/ClangScanDeps/optimize-vfs-leak.m  | 105 ++
 clang/test/ClangScanDeps/optimize-vfs-pch.m   | 129 
 clang/test/ClangScanDeps/optimize-vfs.m   | 193 ++
 clang/tools/clang-scan-deps/ClangScanDeps.cpp |   1 +
 llvm/include/llvm/Support/VirtualFileSystem.h |  55 -
 llvm/lib/Support/VirtualFileSystem.cpp|  26 +++
 .../Support/VirtualFileSystemTest.cpp |  84 
 24 files changed, 906 insertions(+), 67 deletions(-)
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-edgecases.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-leak.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs-pch.m
 create mode 100644 clang/test/ClangScanDeps/optimize-vfs.m

diff --git a/clang/include/clang/Basic/DiagnosticSerializationKinds.td 
b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
index 11c706ebf84b542..4c4659ed517e0a0 100644
--- a/clang/include/clang/Basic/DiagnosticSerializationKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSerializationKinds.td
@@ -44,6 +44,9 @@ def err_pch_diagopt_mismatch : Error<"%0 is currently 
enabled, but was not in "
   "the PCH file">;
 def err_pch_modulecache_mismatch : Error<"PCH was compiled with module cache "
   "path '%0', but the path is currently '%1'">;
+def err_pch_vfsoverlay_mismatch : Error<"PCH was compiled with different VFS 
overlay files than are currently in use">;
+def note_pch_vfsoverlay_files : Note<"%select{PCH|current translation unit}0 
has the following VFS overlays:\n%1">;
+def note_pch_vfsoverlay_empty : Note<"%select{PCH|current translation unit}0 
has no VFS overlays">;
 
 def err_pch_version_too_old : Error<
 "PCH file uses an older PCH format that is no longer supported">;
diff --git a/clang/include/clang/Basic/FileManager.h 
b/clang/include/clang/Basic/FileManager.h
index 56cb093dd8c376f..997c17a0ffcfcce 100644
--- a/clang/include/clang/Basic/FileManager.h
+++ b/clang/include/clang/Basic/FileManager.h
@@ -248,6 +248,10 @@ class FileManager : public RefCountedBase {
 return FS;
   }
 
+  /// Enable or disable tracking of VFS usage. Used to not track full header
+  /// search and implicit modulemap lookup.
+  void trackVFSUsage(bool Active);
+
   void setVirtualFileSystem(IntrusiveRefCntPtr FS) {
 this->FS = std::move(FS);
   }
diff --git a/clang/include/clang/Lex/HeaderSearch.h 
b/clang/include/clang/Lex/HeaderSearch.h
index a2c33842924b101..49e1a4a124b55df 100644
--- a/clang/include/clang/Lex/HeaderSearch.h
+++ b/clang/include/clang/Lex/HeaderSearch.h
@@ -576,6 +576,12 @@ class HeaderSearch {
   /// Note: implicit module maps don't contribute to entry usage.
   std::vector computeUserEntryUsage() const;
 
+  /// Determine which HeaderSearchOptions::VFSOverlayFiles have been
+  /// successfully used so far and mark their index with 'true' in the 
resulting
+  /// bit vector.
+  /// Note: implicit module maps don't contribute to entry usage.
+  std::vector computeVFSUsageAndClear() const;
+
   /// This method returns a HeaderMap for the specified
   /// FileEntry, uniquing them through the 'HeaderMaps' datastructure.
   const HeaderMap *CreateHeaderMap(FileEntryRef FE);
diff --git 

[clang] [clang][modules] Fix CodeGen options that can affect the AST. (PR #78816)

2024-01-19 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese approved this pull request.

LGTM, although are we sure none of the other options should be affecting? I 
just did a quick search and it seems like this is it.

I don't think it matters here, but the actual option that controls the macro is 
on `LangOptions` and where it's set has a "FIXME: Eliminate this dependency" 
where it directly parses the command line. As long as we're not changing the 
value the `LangOptions` version should always match.

https://github.com/llvm/llvm-project/pull/78816
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Disable gch-probe.c on AIX as `-gmodules` is not supported there yet. (PR #78513)

2024-01-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/78513
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang] Disable gch-probe.c on AIX as `-gmodules` is not supported there yet. (PR #78513)

2024-01-17 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese created 
https://github.com/llvm/llvm-project/pull/78513

Followup fix for https://github.com/llvm/llvm-project/pull/77711

>From a12b3ab8663c591f66848a7e46620a79583f5045 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Wed, 17 Jan 2024 14:27:03 -0800
Subject: [PATCH] [clang] Disable gch-probe.c on AIX as `-gmodules` is not
 supported there yet.

Followup fix for https://github.com/llvm/llvm-project/pull/77711
---
 clang/test/PCH/gch-probe.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/clang/test/PCH/gch-probe.c b/clang/test/PCH/gch-probe.c
index 0905c9baebdae04..370abd26d251749 100644
--- a/clang/test/PCH/gch-probe.c
+++ b/clang/test/PCH/gch-probe.c
@@ -1,3 +1,7 @@
+// Unsupported on AIX because we don't support the requisite "__clangast"
+// section in XCOFF yet.
+// UNSUPPORTED: target={{.*}}-aix{{.*}}
+
 // For GCC compatibility, clang should probe also with the .gch extension.
 // RUN: %clang -x c-header -c %s -o %t.h.gch
 // RUN: %clang -fsyntax-only -include %t.h %s

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-17 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

Ah, looks like all the other gmodules tests have:
```
// Unsupported on AIX because we don't support the requisite "__clangast"
// section in XCOFF yet.
// UNSUPPORTED: target={{.*}}-aix{{.*}}
```

I'll just add that.

https://github.com/llvm/llvm-project/pull/77711
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-17 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

Yep, I'll take a look.

https://github.com/llvm/llvm-project/pull/77711
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-16 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese closed 
https://github.com/llvm/llvm-project/pull/77711
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-15 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/77711

>From d3e928fc5b725cb3e484a8cfd50fa23c26f7eb22 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Wed, 10 Jan 2024 16:58:59 -0800
Subject: [PATCH] [clang][Driver] Don't ignore -gmodules .gch files

A previous commit (82f75ed) made clang ignore .gch files that were not
Clang AST files. This broke `-gmodules`, which embeds the Clang AST
into an object file containing debug info.

This changes the probing to detect any valid object file as a
potential Clang PCH.
---
 clang/lib/Driver/ToolChains/Clang.cpp  | 26 +++---
 clang/test/PCH/gch-probe.c |  4 
 llvm/include/llvm/BinaryFormat/Magic.h |  1 +
 llvm/lib/BinaryFormat/Magic.cpp|  2 ++
 llvm/lib/Object/Binary.cpp |  1 +
 llvm/lib/Object/ObjectFile.cpp |  1 +
 6 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 9edae3fec91a87..997ec2d491d02c 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -43,7 +43,9 @@
 #include "clang/Driver/XRayArgs.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/BinaryFormat/Magic.h"
 #include "llvm/Config/llvm-config.h"
+#include "llvm/Object/ObjectFile.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Support/CodeGen.h"
 #include "llvm/Support/Compiler.h"
@@ -948,11 +950,21 @@ static void handleAMDGPUCodeObjectVersionOptions(const 
Driver ,
   }
 }
 
-static bool hasClangPchSignature(const Driver , StringRef Path) {
-  if (llvm::ErrorOr> MemBuf =
-  D.getVFS().getBufferForFile(Path))
-return (*MemBuf)->getBuffer().starts_with("CPCH");
-  return false;
+static bool maybeHasClangPchSignature(const Driver , StringRef Path) {
+  llvm::ErrorOr> MemBuf =
+  D.getVFS().getBufferForFile(Path);
+  if (!MemBuf)
+return false;
+  llvm::file_magic Magic = llvm::identify_magic((*MemBuf)->getBuffer());
+  if (Magic == llvm::file_magic::unknown)
+return false;
+  // Return true for both raw Clang AST files and object files which may
+  // contain a __clangast section.
+  if (Magic == llvm::file_magic::clang_ast)
+return true;
+  Expected> Obj =
+  llvm::object::ObjectFile::createObjectFile(**MemBuf, Magic);
+  return !Obj.takeError();
 }
 
 static bool gchProbe(const Driver , StringRef Path) {
@@ -964,14 +976,14 @@ static bool gchProbe(const Driver , StringRef Path) {
 std::error_code EC;
 for (llvm::vfs::directory_iterator DI = D.getVFS().dir_begin(Path, EC), DE;
  !EC && DI != DE; DI = DI.increment(EC)) {
-  if (hasClangPchSignature(D, DI->path()))
+  if (maybeHasClangPchSignature(D, DI->path()))
 return true;
 }
 D.Diag(diag::warn_drv_pch_ignoring_gch_dir) << Path;
 return false;
   }
 
-  if (hasClangPchSignature(D, Path))
+  if (maybeHasClangPchSignature(D, Path))
 return true;
   D.Diag(diag::warn_drv_pch_ignoring_gch_file) << Path;
   return false;
diff --git a/clang/test/PCH/gch-probe.c b/clang/test/PCH/gch-probe.c
index 8b1e1fab5ad97b..0905c9baebdae0 100644
--- a/clang/test/PCH/gch-probe.c
+++ b/clang/test/PCH/gch-probe.c
@@ -2,6 +2,10 @@
 // RUN: %clang -x c-header -c %s -o %t.h.gch
 // RUN: %clang -fsyntax-only -include %t.h %s
 
+// -gmodules embeds the Clang AST file in an object file.
+// RUN: %clang -x c-header -c %s -gmodules -o %t.h.gch
+// RUN: %clang -fsyntax-only -include %t.h %s
+
 // gch probing should ignore files which are not clang pch files.
 // RUN: %clang -fsyntax-only -include %S/Inputs/gch-probe.h %s 2>&1 | 
FileCheck %s
 // CHECK: warning: precompiled header '{{.*}}gch-probe.h.gch' was ignored 
because it is not a clang PCH file
diff --git a/llvm/include/llvm/BinaryFormat/Magic.h 
b/llvm/include/llvm/BinaryFormat/Magic.h
index c635a269576587..6978c066bda468 100644
--- a/llvm/include/llvm/BinaryFormat/Magic.h
+++ b/llvm/include/llvm/BinaryFormat/Magic.h
@@ -21,6 +21,7 @@ struct file_magic {
   enum Impl {
 unknown = 0,   ///< Unrecognized file
 bitcode,   ///< Bitcode file
+clang_ast, ///< Clang PCH or PCM
 archive,   ///< ar style archive file
 elf,   ///< ELF Unknown type
 elf_relocatable,   ///< ELF Relocatable object file
diff --git a/llvm/lib/BinaryFormat/Magic.cpp b/llvm/lib/BinaryFormat/Magic.cpp
index 45a0b7e11452b4..bd378337ed3338 100644
--- a/llvm/lib/BinaryFormat/Magic.cpp
+++ b/llvm/lib/BinaryFormat/Magic.cpp
@@ -98,6 +98,8 @@ file_magic llvm::identify_magic(StringRef Magic) {
   case 'C':
 if (startswith(Magic, "CCOB"))
   return file_magic::offload_bundle_compressed;
+if (startswith(Magic, "CPCH"))
+  return file_magic::clang_ast;
 break;
   case '!':
 if (startswith(Magic, "!\n") || startswith(Magic, "!\n"))
diff --git a/llvm/lib/Object/Binary.cpp b/llvm/lib/Object/Binary.cpp
index 

[llvm] [clang] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-11 Thread Michael Spencer via cfe-commits

Bigcheese wrote:

I updated the patch to try to actually open the object file.

https://github.com/llvm/llvm-project/pull/77711
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [clang][Driver] Don't ignore -gmodules .gch files (PR #77711)

2024-01-11 Thread Michael Spencer via cfe-commits

https://github.com/Bigcheese updated 
https://github.com/llvm/llvm-project/pull/77711

>From ef781002ef63817afa4df4834742ec3c2f22 Mon Sep 17 00:00:00 2001
From: Michael Spencer 
Date: Wed, 10 Jan 2024 16:58:59 -0800
Subject: [PATCH] [clang][Driver] Don't ignore -gmodules .gch files

A previous commit (82f75ed) made clang ignore .gch files that were not
Clang AST files. This broke `-gmodules`, which embeds the Clang AST
into an object file containing debug info.

This changes the probing to detect any valid object file as a
potential Clang PCH.
---
 clang/lib/Driver/ToolChains/Clang.cpp  | 26 +++---
 clang/test/PCH/gch-probe.c |  4 
 llvm/include/llvm/BinaryFormat/Magic.h |  1 +
 llvm/lib/BinaryFormat/Magic.cpp|  2 ++
 llvm/lib/Object/Binary.cpp |  1 +
 llvm/lib/Object/ObjectFile.cpp |  1 +
 6 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 1ee7ae602f3ce5..526c1789651011 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -43,7 +43,9 @@
 #include "clang/Driver/XRayArgs.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringExtras.h"
+#include "llvm/BinaryFormat/Magic.h"
 #include "llvm/Config/llvm-config.h"
+#include "llvm/Object/ObjectFile.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Support/CodeGen.h"
 #include "llvm/Support/Compiler.h"
@@ -948,11 +950,21 @@ static void handleAMDGPUCodeObjectVersionOptions(const 
Driver ,
   }
 }
 
-static bool hasClangPchSignature(const Driver , StringRef Path) {
-  if (llvm::ErrorOr> MemBuf =
-  D.getVFS().getBufferForFile(Path))
-return (*MemBuf)->getBuffer().starts_with("CPCH");
-  return false;
+static bool maybeHasClangPchSignature(const Driver , StringRef Path) {
+  llvm::ErrorOr> MemBuf =
+  D.getVFS().getBufferForFile(Path);
+  if (!MemBuf)
+return false;
+  llvm::file_magic Magic = llvm::identify_magic((*MemBuf)->getBuffer());
+  if (Magic == llvm::file_magic::unknown)
+return false;
+  // Return true for both raw Clang AST files and object files which may
+  // contain a __clangast section.
+  if (Magic == llvm::file_magic::clang_ast)
+return true;
+  Expected> Obj =
+  llvm::object::ObjectFile::createObjectFile(**MemBuf, Magic);
+  return !Obj.takeError();
 }
 
 static bool gchProbe(const Driver , StringRef Path) {
@@ -964,14 +976,14 @@ static bool gchProbe(const Driver , StringRef Path) {
 std::error_code EC;
 for (llvm::vfs::directory_iterator DI = D.getVFS().dir_begin(Path, EC), DE;
  !EC && DI != DE; DI = DI.increment(EC)) {
-  if (hasClangPchSignature(D, DI->path()))
+  if (maybeHasClangPchSignature(D, DI->path()))
 return true;
 }
 D.Diag(diag::warn_drv_pch_ignoring_gch_dir) << Path;
 return false;
   }
 
-  if (hasClangPchSignature(D, Path))
+  if (maybeHasClangPchSignature(D, Path))
 return true;
   D.Diag(diag::warn_drv_pch_ignoring_gch_file) << Path;
   return false;
diff --git a/clang/test/PCH/gch-probe.c b/clang/test/PCH/gch-probe.c
index 8b1e1fab5ad97b..0905c9baebdae0 100644
--- a/clang/test/PCH/gch-probe.c
+++ b/clang/test/PCH/gch-probe.c
@@ -2,6 +2,10 @@
 // RUN: %clang -x c-header -c %s -o %t.h.gch
 // RUN: %clang -fsyntax-only -include %t.h %s
 
+// -gmodules embeds the Clang AST file in an object file.
+// RUN: %clang -x c-header -c %s -gmodules -o %t.h.gch
+// RUN: %clang -fsyntax-only -include %t.h %s
+
 // gch probing should ignore files which are not clang pch files.
 // RUN: %clang -fsyntax-only -include %S/Inputs/gch-probe.h %s 2>&1 | 
FileCheck %s
 // CHECK: warning: precompiled header '{{.*}}gch-probe.h.gch' was ignored 
because it is not a clang PCH file
diff --git a/llvm/include/llvm/BinaryFormat/Magic.h 
b/llvm/include/llvm/BinaryFormat/Magic.h
index c635a269576587..6978c066bda468 100644
--- a/llvm/include/llvm/BinaryFormat/Magic.h
+++ b/llvm/include/llvm/BinaryFormat/Magic.h
@@ -21,6 +21,7 @@ struct file_magic {
   enum Impl {
 unknown = 0,   ///< Unrecognized file
 bitcode,   ///< Bitcode file
+clang_ast, ///< Clang PCH or PCM
 archive,   ///< ar style archive file
 elf,   ///< ELF Unknown type
 elf_relocatable,   ///< ELF Relocatable object file
diff --git a/llvm/lib/BinaryFormat/Magic.cpp b/llvm/lib/BinaryFormat/Magic.cpp
index 45a0b7e11452b4..bd378337ed3338 100644
--- a/llvm/lib/BinaryFormat/Magic.cpp
+++ b/llvm/lib/BinaryFormat/Magic.cpp
@@ -98,6 +98,8 @@ file_magic llvm::identify_magic(StringRef Magic) {
   case 'C':
 if (startswith(Magic, "CCOB"))
   return file_magic::offload_bundle_compressed;
+if (startswith(Magic, "CPCH"))
+  return file_magic::clang_ast;
 break;
   case '!':
 if (startswith(Magic, "!\n") || startswith(Magic, "!\n"))
diff --git a/llvm/lib/Object/Binary.cpp b/llvm/lib/Object/Binary.cpp
index 

  1   2   >