[clang] [NFC][analyzer] Document configuration options (PR #135169)

Donát Nagy via cfe-commits Tue, 29 Apr 2025 09:41:28 -0700

https://github.com/NagyDonat updated 
https://github.com/llvm/llvm-project/pull/135169


From 705372a8a2f6e87f5fdf6b0e99bfa6a13408c5d4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com>
Date: Thu, 3 Apr 2025 20:13:04 +0200
Subject: [PATCH 1/5] [NFC][analyzer] Document configuration options

This commit documents the process of specifying values for the analyzer
options and checker options implemented in the static analyzer, and adds
a script which includes the documentation of the analyzer options (which
was previously only available through a command-line flag) in the
RST-based web documentation.
---
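Reviewer note: the "Value:" handling in `desc_to_rst_paragraphs` can be tried out in isolation. The following is a minimal sketch, not the script itself. The function name `split_description` and the sample description are hypothetical; only the regular expression, the "Accepted values:" prefix and the 80-column wrapping are taken from this patch.

```python
import re
import textwrap


def split_description(desc):
    """Split a terse option description at its trailing "Value:" list.

    Hypothetical helper that mirrors one branch of desc_to_rst_paragraphs.
    """
    m = re.search(r"(^|\s)Value:", desc)
    if m is None:
        # No "Value:" marker, so keep the description as one paragraph.
        return [textwrap.fill(desc, width=80)]
    parts = [desc[: m.start()], "Accepted values:" + desc[m.end() :]]
    # Drop empty paragraphs (e.g. when the description starts with "Value:").
    return [textwrap.fill(p, width=80) for p in parts if p.strip()]


# Hypothetical description text, not copied from AnalyzerOptions.def.
sample = 'Selects the widget strategy. Value: "fast", "slow".'
paragraphs = split_description(sample)
```

With this input the description is split into a prose paragraph and a separate "Accepted values:" paragraph, which is the shape the generated RST aims for.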
 clang/docs/CMakeLists.txt                     |  26 ++
 clang/docs/analyzer/user-docs.rst             |   1 +
 .../analyzer/user-docs/CommandLineUsage.rst   |   2 +
 clang/docs/analyzer/user-docs/Options.rst.in  | 102 ++++++++
 .../tools/generate_analyzer_options_docs.py   | 242 ++++++++++++++++++
 .../StaticAnalyzer/Core/AnalyzerOptions.def   |   3 +
 6 files changed, 376 insertions(+)
 create mode 100644 clang/docs/analyzer/user-docs/Options.rst.in
 create mode 100644 clang/docs/tools/generate_analyzer_options_docs.py

diff --git a/clang/docs/CMakeLists.txt b/clang/docs/CMakeLists.txt
index 4fecc007f5995..9dfcc692ff87d 100644
--- a/clang/docs/CMakeLists.txt
+++ b/clang/docs/CMakeLists.txt
@@ -143,6 +143,32 @@ if (LLVM_ENABLE_SPHINX)
     gen_rst_file_from_td(DiagnosticsReference.rst -gen-diag-docs ../include/clang/Basic/Diagnostic.td "${docs_targets}")
     gen_rst_file_from_td(ClangCommandLineReference.rst -gen-opt-docs ../include/clang/Driver/ClangOptionDocs.td "${docs_targets}")
 
+    # Another generated file from a different source
+    set(docs_tools_dir ${CMAKE_CURRENT_SOURCE_DIR}/tools)
+    set(aopts_rst_rel_path analyzer/user-docs/Options.rst)
+    set(aopts_rst "${CMAKE_CURRENT_BINARY_DIR}/${aopts_rst_rel_path}")
+    set(analyzeroptions_def "${CMAKE_CURRENT_SOURCE_DIR}/../include/clang/StaticAnalyzer/Core/AnalyzerOptions.def")
+    set(aopts_rst_in "${CMAKE_CURRENT_SOURCE_DIR}/${aopts_rst_rel_path}.in")
+    set(generate_aopts_docs generate_analyzer_options_docs.py)
+    add_custom_command(
+      OUTPUT ${aopts_rst}
+      COMMAND ${Python3_EXECUTABLE} ${generate_aopts_docs} -i ${analyzeroptions_def} -t ${aopts_rst_in} -o ${aopts_rst}
+      WORKING_DIRECTORY ${docs_tools_dir}
+      VERBATIM
+      COMMENT "Generating ${aopts_rst}"
+      DEPENDS ${docs_tools_dir}/${generate_aopts_docs}
+              ${aopts_rst_in}
+              copy-clang-rst-docs
+      )
+    add_custom_target(generate-analyzer-options-rst DEPENDS ${aopts_rst})
+    foreach(target ${docs_targets})
+      add_dependencies(${target} generate-analyzer-options-rst)
+    endforeach()
+
+    # Technically this is redundant because generate-analyzer-options-rst
+    # depends on the copy operation (because it wants to drop a generated file
+    # into a subdirectory of the copied tree), but I'm leaving it here for the
+    # sake of clarity.
     foreach(target ${docs_targets})
       add_dependencies(${target} copy-clang-rst-docs)
     endforeach()
diff --git a/clang/docs/analyzer/user-docs.rst b/clang/docs/analyzer/user-docs.rst
index e265f033a2c54..67c1dfaa40965 100644
--- a/clang/docs/analyzer/user-docs.rst
+++ b/clang/docs/analyzer/user-docs.rst
@@ -8,6 +8,7 @@ Contents:
 
    user-docs/Installation
    user-docs/CommandLineUsage
+   user-docs/Options
    user-docs/UsingWithXCode
    user-docs/FilingBugs
    user-docs/CrossTranslationUnit
diff --git a/clang/docs/analyzer/user-docs/CommandLineUsage.rst b/clang/docs/analyzer/user-docs/CommandLineUsage.rst
index 59f8187f374a9..0252de80b788f 100644
--- a/clang/docs/analyzer/user-docs/CommandLineUsage.rst
+++ b/clang/docs/analyzer/user-docs/CommandLineUsage.rst
@@ -194,6 +194,8 @@ When compiling your application to run on the simulator, it is important that **
 
 If you aren't certain which compiler Xcode uses to build your project, try just running ``xcodebuild`` (without **scan-build**). You should see the full path to the compiler that Xcode is using, and use that as an argument to ``--use-cc``.
 
+.. _command-line-usage-CodeChecker:
+
 CodeChecker
 -----------
 
diff --git a/clang/docs/analyzer/user-docs/Options.rst.in b/clang/docs/analyzer/user-docs/Options.rst.in
new file mode 100644
index 0000000000000..eced3597ed567
--- /dev/null
+++ b/clang/docs/analyzer/user-docs/Options.rst.in
@@ -0,0 +1,102 @@
+========================
+Configuring the Analyzer
+========================
+
+The clang static analyzer supports two kinds of options:
+
+1. Global **analyzer options** influence the behavior of the analyzer engine.
+   They are documented on this page, in the section :ref:`List of analyzer
+   options<list-of-analyzer-options>`.
+2. The **checker options** belong to individual checkers (e.g.
+   ``core.BitwiseShift:Pedantic`` and ``unix.Stream:Pedantic`` are completely
+   separate options) and customize the behavior of that particular checker.
+   These are documented within the documentation of each individual checker at
+   :doc:`../checkers`.
+
+Assigning values to options
+===========================
+
+With the compiler frontend
+--------------------------
+
+All options can be configured by using the ``-analyzer-config`` flag of ``clang
+-cc1`` (the so-called *compiler frontend* part of clang). The values of the
+options are specified with the syntax ``-analyzer-config
+OPT=VAL,OPT2=VAL2,...`` which supports specifying multiple options, but
+separate flags like ``-analyzer-config OPT=VAL -analyzer-config OPT2=VAL2`` are
+also accepted (with equivalent behavior). Analyzer options and checker options
+can be freely intermixed here because it's easy to recognize that checker
+option names are always prefixed with ``some.groups.NameOfChecker:``.
+
+With the clang driver
+---------------------
+
+In a conventional workflow ``clang -cc1`` (which is a low-level internal
+interface) is invoked indirectly by the clang *driver* (i.e. plain ``clang``
+without the ``-cc1`` flag), which acts as an "even more frontend" wrapper layer
+around the ``clang -cc1`` *compiler frontend*. In this situation **each**
+command line argument intended for the *compiler frontend* must be prefixed
+with ``-Xclang``.
+
+For example the following command analyzes ``foo.c`` in :ref:`shallow mode
+<analyzer-option-mode>` with :ref:`loop unrolling
+<analyzer-option-unroll-loops>`:
+
+::
+
+  clang --analyze -Xclang -analyzer-config -Xclang mode=shallow,unroll-loops=true foo.c
+
+When this is executed, the *driver* will compose and execute the following
+``clang -cc1`` command (which can be inspected by passing the ``-v`` flag to
+the *driver*):
+
+::
+
+  clang -cc1 -analyze [...] -analyzer-config mode=shallow,unroll-loops=true foo.c
+
+Here ``[...]`` stands for dozens of low-level flags which ensure that ``clang
+-cc1`` does the right thing (e.g. ``-fcolor-diagnostics`` when it's suitable;
+``-analyzer-checker`` flags to enable a sane default set of checkers). Also
+note the distinction that the ``clang`` *driver* requires ``--analyze`` (double
+dashes) while the ``clang -cc1`` *compiler frontend* requires ``-analyze``
+(single dash).
+
+With CodeChecker
+----------------
+
+If the analysis is performed through :ref:`CodeChecker
+<command-line-usage-CodeChecker>` (which e.g. supports the analysis of a whole
+project instead of a single file) then it will act as another indirection
+layer. CodeChecker provides separate command-line flags called
+``--analyzer-config`` (for analyzer options) and ``--checker-config`` (for
+checker options):
+
+::
+
+  CodeChecker analyze -o outdir --checker-config clangsa:unix.Stream:Pedantic=true  \
+          --analyzer-config clangsa:mode=shallow clangsa:unroll-loops=true          \
+          -- compile_commands.json
+
+These CodeChecker flags may be followed by multiple ``OPT=VAL`` pairs as
+separate arguments (and this is why the example needs to use ``--`` before
+``compile_commands.json``). The option names are all prefixed with ``clangsa:``
+to ensure that they are passed to the clang static analyzer (and not other
+analyzer tools that are also supported by CodeChecker).
+
+.. _list-of-analyzer-options:
+
+List of analyzer options
+========================
+
+.. warning::
+   These options are primarily intended for development purposes. Changing
+   their values may drastically alter the behavior of the analyzer, and may
+   even result in instabilities or crashes!
+
+..
+   The contents of this section are automatically generated by the script
+   clang/docs/tools/generate_analyzer_options_docs.py from the header file
+   AnalyzerOptions.def to ensure that the RST/web documentation is synchronized
+   with the command line help options.
+
+.. OPTIONS_LIST_PLACEHOLDER
diff --git a/clang/docs/tools/generate_analyzer_options_docs.py b/clang/docs/tools/generate_analyzer_options_docs.py
new file mode 100644
index 0000000000000..5dfc571deb9a0
--- /dev/null
+++ b/clang/docs/tools/generate_analyzer_options_docs.py
@@ -0,0 +1,242 @@
+#!/usr/bin/env python3
+# A tool to automatically generate documentation for the config options of the
+# clang static analyzer by reading `AnalyzerOptions.def`.
+
+import argparse
+from collections import namedtuple
+from enum import Enum, auto
+import re
+import sys
+import textwrap
+
+
+# The following code implements a trivial parser for the narrow subset of C++
+# which is used in AnalyzerOptions.def. This supports the following features:
+# - ignores preprocessor directives, even if they are continued with \ at EOL
+# - ignores comments: both /* ... */ and // ...
+# - parses string literals (even if they contain \" escapes)
+# - concatenates adjacent string literals
+# - parses numbers even if they contain ' as a thousands separator
+# - recognizes MACRO(arg1, arg2, ..., argN) calls
+
+
+class TT(Enum):
+    "Token type enum."
+    number = auto()
+    ident = auto()
+    string = auto()
+    punct = auto()
+
+
+TOKENS = [
+    (re.compile(r"-?[0-9']+"), TT.number),
+    (re.compile(r"\w+"), TT.ident),
+    (re.compile(r'"([^\\"]|\\.)*"'), TT.string),
+    (re.compile(r"[(),]"), TT.punct),
+    (re.compile(r"/\*((?!\*/).)*\*/", re.S), None),  # C-style comment
+    (re.compile(r"//.*\n"), None),  # C++ style oneline comment
+    (re.compile(r"#.*(\\\n.*)*(?<!\\)\n"), None),  # preprocessor directive
+    (re.compile(r"\s+"), None),  # whitespace
+]
+
+Token = namedtuple("Token", "kind code")
+
+
+def report_unexpected(s, pos):
+    lines = (s[:pos] + "X").split("\n")
+    lineno, col = (len(lines), len(lines[-1]))
+    print(
+        "unexpected character %r in AnalyzerOptions.def at line %d column %d"
+        % (s[pos], lineno, col),
+        file=sys.stderr,
+    )
+
+
+def tokenize(s):
+    result = []
+    pos = 0
+    while pos < len(s):
+        for regex, kind in TOKENS:
+            if m := regex.match(s, pos):
+                if kind is not None:
+                    result.append(Token(kind, m.group(0)))
+                pos = m.end()
+                break
+        else:
+            report_unexpected(s, pos)
+            pos += 1
+    return result
+
+
+def join_strings(tokens):
+    result = []
+    for tok in tokens:
+        if tok.kind == TT.string and result and result[-1].kind == TT.string:
+            # If this token is a string, and the previous non-ignored token is
+            # also a string, then merge them into a single token. We need to
+            # discard the closing " of the previous string and the opening " of
+            # this string.
+            prev = result.pop()
+            result.append(Token(TT.string, prev.code[:-1] + tok.code[1:]))
+        else:
+            result.append(tok)
+    return result
+
+
+MacroCall = namedtuple("MacroCall", "name args")
+
+
+class State(Enum):
+    "States of the state machine used for parsing the macro calls."
+    init = auto()
+    after_ident = auto()
+    before_arg = auto()
+    after_arg = auto()
+
+
+def get_calls(tokens, macro_names):
+    state = State.init
+    result = []
+    current = None
+    for tok in tokens:
+        if state == State.init and tok.kind == TT.ident and tok.code in macro_names:
+            current = MacroCall(tok.code, [])
+            state = State.after_ident
+        elif state == State.after_ident and tok == Token(TT.punct, "("):
+            state = State.before_arg
+        elif state == State.before_arg:
+            if current is not None:
+                current.args.append(tok)
+                state = State.after_arg
+        elif state == State.after_arg and tok.kind == TT.punct:
+            if tok.code == ")":
+                result.append(current)
+                current = None
+                state = State.init
+            elif tok.code == ",":
+                state = State.before_arg
+        else:
+            current = None
+            state = State.init
+    return result
+
+
+# The information will be extracted from calls to these two macros:
+# #define ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEFAULT_VAL)
+# #define ANALYZER_OPTION_DEPENDS_ON_USER_MODE(TYPE, NAME, CMDFLAG, DESC,
+#                                              SHALLOW_VAL, DEEP_VAL)
+
+MACRO_NAMES_ARGCOUNTS = {
+    "ANALYZER_OPTION": 5,
+    "ANALYZER_OPTION_DEPENDS_ON_USER_MODE": 6,
+}
+
+
+def string_value(tok):
+    if tok.kind != TT.string:
+        raise ValueError(f"expected a string token, got {tok.kind.name}")
+    text = tok.code[1:-1]  # Remove quotes
+    text = re.sub(r"\\(.)", r"\1", text)  # Resolve backslash escapes
+    return text
+
+
+def cmdflag_to_rst_title(cmdflag_tok):
+    text = string_value(cmdflag_tok)
+    underline = "-" * len(text)
+    ref = f".. _analyzer-option-{text}:"
+
+    return f"{ref}\n\n{text}\n{underline}\n\n"
+
+
+def desc_to_rst_paragraphs(tok):
+    desc = string_value(tok)
+
+    # Escape a star that would act as inline emphasis within RST.
+    desc = desc.replace("ctu-max-nodes-*", r"ctu-max-nodes-\*")
+
+    # Many descriptions end with "Value: <list of accepted values>", which is
+    # OK for a terse command line printout, but should be prettified for web
+    # documentation.
+    # Moreover, the option ctu-invocation-list shows some example file content
+    # which is formatted as a preformatted block.
+    paragraphs = [desc]
+    extra = ""
+    if m := re.search(r"(^|\s)Value:", desc):
+        paragraphs = [desc[: m.start()], "Accepted values:" + desc[m.end() :]]
+    elif m := re.search(r"\s*Example file.content:", desc):
+        paragraphs = [desc[: m.start()]]
+        extra = "Example file content::\n\n  " + desc[m.end() :] + "\n\n"
+
+    wrapped = [textwrap.fill(p, width=80) for p in paragraphs if p.strip()]
+
+    return "\n\n".join(wrapped + [""]) + extra
+
+
+def default_to_rst(tok):
+    if tok.kind == TT.string:
+        if tok.code == '""':
+            return "(empty string)"
+        return tok.code
+    if tok.kind == TT.ident:
+        return tok.code
+    if tok.kind == TT.number:
+        return tok.code.replace("'", "")
+    raise ValueError(f"unexpected token as default value: {tok.kind.name}")
+
+
+def defaults_to_rst_paragraph(defaults):
+    strs = [default_to_rst(d) for d in defaults]
+
+    if len(strs) == 1:
+        return f"Default value: {strs[0]}\n\n"
+    if len(strs) == 2:
+        return (
+            f"Default value: {strs[0]} (in shallow mode) / {strs[1]} (in deep mode)\n\n"
+        )
+    raise ValueError("unexpected count of default values: %d" % len(defaults))
+
+
+def macro_call_to_rst_paragraphs(macro_call):
+    if len(macro_call.args) != MACRO_NAMES_ARGCOUNTS[macro_call.name]:
+        return ""
+
+    try:
+        _, _, cmdflag, desc, *defaults = macro_call.args
+
+        return (
+            cmdflag_to_rst_title(cmdflag)
+            + desc_to_rst_paragraphs(desc)
+            + defaults_to_rst_paragraph(defaults)
+        )
+    except ValueError as ve:
+        print(ve.args[0], file=sys.stderr)
+        return ""
+
+
+def get_option_list(input_file):
+    with open(input_file, encoding="utf-8") as f:
+        contents = f.read()
+    tokens = join_strings(tokenize(contents))
+    macro_calls = get_calls(tokens, MACRO_NAMES_ARGCOUNTS)
+
+    result = ""
+    for mc in macro_calls:
+        result += macro_call_to_rst_paragraphs(mc)
+    return result
+
+
+p = argparse.ArgumentParser()
+p.add_argument("-i", "--input", help="path to AnalyzerOptions.def")
+p.add_argument("-t", "--template", help="path of template file")
+p.add_argument("-o", "--output", help="path of output file")
+opts = p.parse_args()
+
+with open(opts.template, encoding="utf-8") as f:
+    doc_template = f.read()
+
+PLACEHOLDER = ".. OPTIONS_LIST_PLACEHOLDER\n"
+
+rst_output = doc_template.replace(PLACEHOLDER, get_option_list(opts.input))
+
+with open(opts.output, "w", newline="", encoding="utf-8") as f:
+    f.write(rst_output)
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index f9f22a9ced650..8326f5309035e 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -7,6 +7,9 @@
 //===----------------------------------------------------------------------===//
 //
 //  This file defines the analyzer options avaible with -analyzer-config.
+//  Note that clang/docs/tools/generate_analyzer_options_docs.py relies on the
+//  structure of this file, so if this file is refactored, then make sure to
+//  update that script as well.
 //
 //===----------------------------------------------------------------------===//
 

From c9ade009da319d7e7bec336eaa6d507bff5a3bfd Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com>
Date: Mon, 28 Apr 2025 18:41:10 +0200
Subject: [PATCH 2/5] Inline the name of the script

---
 clang/docs/CMakeLists.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/clang/docs/CMakeLists.txt b/clang/docs/CMakeLists.txt
index 9dfcc692ff87d..50fdbcc06a1f4 100644
--- a/clang/docs/CMakeLists.txt
+++ b/clang/docs/CMakeLists.txt
@@ -149,10 +149,9 @@ if (LLVM_ENABLE_SPHINX)
     set(aopts_rst "${CMAKE_CURRENT_BINARY_DIR}/${aopts_rst_rel_path}")
     set(analyzeroptions_def "${CMAKE_CURRENT_SOURCE_DIR}/../include/clang/StaticAnalyzer/Core/AnalyzerOptions.def")
     set(aopts_rst_in "${CMAKE_CURRENT_SOURCE_DIR}/${aopts_rst_rel_path}.in")
-    set(generate_aopts_docs generate_analyzer_options_docs.py)
     add_custom_command(
       OUTPUT ${aopts_rst}
-      COMMAND ${Python3_EXECUTABLE} ${generate_aopts_docs} -i ${analyzeroptions_def} -t ${aopts_rst_in} -o ${aopts_rst}
+      COMMAND ${Python3_EXECUTABLE} generate_analyzer_options_docs.py -i ${analyzeroptions_def} -t ${aopts_rst_in} -o ${aopts_rst}
       WORKING_DIRECTORY ${docs_tools_dir}
       VERBATIM
       COMMENT "Generating ${aopts_rst}"

From 4fae8627073111934583f885ba23d7edf478bbbc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com>
Date: Mon, 28 Apr 2025 18:56:17 +0200
Subject: [PATCH 3/5] Clarify argument names of the script generating the docs

---
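Reviewer note: the rename from `opts.input` to `opts.options_def` in this patch follows from how `argparse` derives attribute names from long option strings (hyphens become underscores). A minimal sketch of that behavior, using the three options defined by the script; the file names passed to `parse_args` are hypothetical placeholders:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--options-def", help="path to AnalyzerOptions.def")
parser.add_argument("--template", help="path of template file")
parser.add_argument("--out", help="path of output file")

# Hypothetical paths, used only to exercise the parser.
opts = parser.parse_args(
    [
        "--options-def", "AnalyzerOptions.def",
        "--template", "Options.rst.in",
        "--out", "Options.rst",
    ]
)

# "--options-def" is exposed as the attribute "options_def".
options_def_path = opts.options_def
```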
 clang/docs/CMakeLists.txt                          |  5 ++++-
 clang/docs/tools/generate_analyzer_options_docs.py | 10 +++++-----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/clang/docs/CMakeLists.txt b/clang/docs/CMakeLists.txt
index 50fdbcc06a1f4..e3ba12166534b 100644
--- a/clang/docs/CMakeLists.txt
+++ b/clang/docs/CMakeLists.txt
@@ -151,7 +151,10 @@ if (LLVM_ENABLE_SPHINX)
     set(aopts_rst_in "${CMAKE_CURRENT_SOURCE_DIR}/${aopts_rst_rel_path}.in")
     add_custom_command(
       OUTPUT ${aopts_rst}
-      COMMAND ${Python3_EXECUTABLE} generate_analyzer_options_docs.py -i ${analyzeroptions_def} -t ${aopts_rst_in} -o ${aopts_rst}
+      COMMAND ${Python3_EXECUTABLE} generate_analyzer_options_docs.py
+              --options-def "${analyzeroptions_def}"
+              --template "${aopts_rst_in}"
+              --out "${aopts_rst}"
       WORKING_DIRECTORY ${docs_tools_dir}
       VERBATIM
       COMMENT "Generating ${aopts_rst}"
diff --git a/clang/docs/tools/generate_analyzer_options_docs.py b/clang/docs/tools/generate_analyzer_options_docs.py
index 5dfc571deb9a0..bbe3b48404a45 100644
--- a/clang/docs/tools/generate_analyzer_options_docs.py
+++ b/clang/docs/tools/generate_analyzer_options_docs.py
@@ -226,9 +226,9 @@ def get_option_list(input_file):
 
 
 p = argparse.ArgumentParser()
-p.add_argument("-i", "--input", help="path to AnalyzerOptions.def")
-p.add_argument("-t", "--template", help="path of template file")
-p.add_argument("-o", "--output", help="path of output file")
+p.add_argument("--options-def", help="path to AnalyzerOptions.def")
+p.add_argument("--template", help="path of template file")
+p.add_argument("--out", help="path of output file")
 opts = p.parse_args()
 
 with open(opts.template, encoding="utf-8") as f:
@@ -236,7 +236,7 @@ def get_option_list(input_file):
 
 PLACEHOLDER = ".. OPTIONS_LIST_PLACEHOLDER\n"
 
-rst_output = doc_template.replace(PLACEHOLDER, get_option_list(opts.input))
+rst_output = doc_template.replace(PLACEHOLDER, get_option_list(opts.options_def))
 
-with open(opts.output, "w", newline="", encoding="utf-8") as f:
+with open(opts.out, "w", newline="", encoding="utf-8") as f:
     f.write(rst_output)

From 461d3db784bd66574c9d175dbc8f5e7190342b8e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com>
Date: Mon, 28 Apr 2025 20:22:24 +0200
Subject: [PATCH 4/5] Extend some disclaimers

---
 clang/docs/analyzer/user-docs/Options.rst.in | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/clang/docs/analyzer/user-docs/Options.rst.in b/clang/docs/analyzer/user-docs/Options.rst.in
index eced3597ed567..96e92bb5a4092 100644
--- a/clang/docs/analyzer/user-docs/Options.rst.in
+++ b/clang/docs/analyzer/user-docs/Options.rst.in
@@ -28,6 +28,13 @@ also accepted (with equivalent behavior). Analyzer options and checker options
 can be freely intermixed here because it's easy to recognize that checker
 option names are always prefixed with ``some.groups.NameOfChecker:``.
 
+.. warning::
+   This is an internal interface, Clang does not intend to preserve backwards
+   compatibility or announce breaking changes within the flags accepted by
+   ``clang -cc1``. However, ``-analyzer-config`` survived many years without
+   significant changes and there is no "more official" interface for
+   configuring the analyzer options.
+
 With the clang driver
 ---------------------
 
@@ -61,6 +68,10 @@ note the distinction that the ``clang`` *driver* requires ``--analyze`` (double
 dashes) while the ``clang -cc1`` *compiler frontend* requires ``-analyze``
 (single dash).
 
+.. note::
+   The flag ``-Xanalyzer`` is equivalent to ``-Xclang`` in these situations
+   (but doesn't forward other options of the clang frontend).
+
 With CodeChecker
 ----------------
 
@@ -89,9 +100,11 @@ List of analyzer options
 ========================
 
 .. warning::
-   These options are primarily intended for development purposes. Changing
-   their values may drastically alter the behavior of the analyzer, and may
-   even result in instabilities or crashes!
+   These options are primarily intended for development purposes and
+   non-default values are usually unsupported. Changing their values may
+   drastically alter the behavior of the analyzer, and may even result in
+   instabilities or crashes! Crash reports are welcome and depending on the
+   severity they may be fixed.
 
 ..
    The contents of this section are automatically generated by the script

From 180afc02b5f1ae640c892fa7df22984daff9c2d0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Don=C3=A1t=20Nagy?= <donat.n...@ericsson.com>
Date: Tue, 29 Apr 2025 18:41:03 +0200
Subject: [PATCH 5/5] Test generate_analyzer_options_docs.py

---
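Reviewer note: the "unused tweak" bookkeeping added to `ErrorHandler` in this patch can be illustrated on its own. This is a simplified sketch under stated assumptions: the class name `TweakTracker` and the method `unused_message` are hypothetical, and the sketch returns the message instead of printing it to stderr as the script does. The tweak names and the message wording are taken from the patch.

```python
class TweakTracker:
    """Hypothetical, reduced stand-in for the tweak tracking in ErrorHandler."""

    def __init__(self, names):
        # Every tweak starts out as unused.
        self.unused = list(names)

    def record_use(self, name):
        # Recording the same tweak twice (or an unknown one) is harmless.
        if name in self.unused:
            self.unused.remove(name)

    def unused_message(self):
        # Return None when every tweak fired at least once.
        if not self.unused:
            return None
        _is = " is" if len(self.unused) == 1 else "s are"
        return f"textual tweak{_is} unused in script: {', '.join(self.unused)}"


tracker = TweakTracker(["ctu-max-nodes-*", "accepted values", "example file content"])
tracker.record_use("ctu-max-nodes-*")
tracker.record_use("accepted values")
tracker.record_use("accepted values")  # a repeated use changes nothing
message = tracker.unused_message()
```

If a future edit of AnalyzerOptions.def removes the last description that a tweak applies to, the corresponding name stays in the unused list and is reported, which is what lets `--validate` fail the lit test.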
 .../tools/generate_analyzer_options_docs.py   | 72 +++++++++++++++----
 .../generate_analyzer_options_docs.test       | 11 +++
 clang/test/lit.cfg.py                         |  2 +
 3 files changed, 71 insertions(+), 14 deletions(-)
 create mode 100644 clang/test/Analysis/generate_analyzer_options_docs.test

diff --git a/clang/docs/tools/generate_analyzer_options_docs.py b/clang/docs/tools/generate_analyzer_options_docs.py
index bbe3b48404a45..bb38c5f52ca5e 100644
--- a/clang/docs/tools/generate_analyzer_options_docs.py
+++ b/clang/docs/tools/generate_analyzer_options_docs.py
@@ -42,14 +42,46 @@ class TT(Enum):
 Token = namedtuple("Token", "kind code")
 
 
-def report_unexpected(s, pos):
-    lines = (s[:pos] + "X").split("\n")
-    lineno, col = (len(lines), len(lines[-1]))
-    print(
-        "unexpected character %r in AnalyzerOptions.def at line %d column %d"
-        % (s[pos], lineno, col),
-        file=sys.stderr,
-    )
+class ErrorHandler:
+    def __init__(self):
+        self.seen_errors = False
+
+        # This script uses some heuristical tweaks to modify the documentation
+        # of some analyzer options. As this code is fragile, we record the use
+        # of these tweaks and report them if they become obsolete:
+        self.unused_tweaks = [
+            "ctu-max-nodes-*",
+            "accepted values",
+            "example file content",
+        ]
+
+    def record_use_of_tweak(self, tweak_name):
+        try:
+            self.unused_tweaks.remove(tweak_name)
+        except ValueError:
+            pass
+
+    def report_error(self, msg):
+        print("Error:", msg, file=sys.stderr)
+        self.seen_errors = True
+
+    def report_unexpected_char(self, s, pos):
+        lines = (s[:pos] + "X").split("\n")
+        lineno, col = (len(lines), len(lines[-1]))
+        self.report_error(
+            "unexpected character %r in AnalyzerOptions.def at line %d column %d"
+            % (s[pos], lineno, col),
+        )
+
+    def report_unused_tweaks(self):
+        if not self.unused_tweaks:
+            return
+        _is = " is" if len(self.unused_tweaks) == 1 else "s are"
+        names = ", ".join(self.unused_tweaks)
+        self.report_error(f"textual tweak{_is} unused in script: {names}")
+
+
+err_handler = ErrorHandler()
 
 
 def tokenize(s):
@@ -63,7 +95,7 @@ def tokenize(s):
                 pos = m.end()
                 break
         else:
-            report_unexpected(s, pos)
+            err_handler.report_unexpected_char(s, pos)
             pos += 1
     return result
 
@@ -149,10 +181,12 @@ def cmdflag_to_rst_title(cmdflag_tok):
 
 
 def desc_to_rst_paragraphs(tok):
-    desc = string_value(tok)
+    base_desc = string_value(tok)
 
     # Escape a star that would act as inline emphasis within RST.
-    desc = desc.replace("ctu-max-nodes-*", r"ctu-max-nodes-\*")
+    desc = base_desc.replace("ctu-max-nodes-*", r"ctu-max-nodes-\*")
+    if desc != base_desc:
+        err_handler.record_use_of_tweak("ctu-max-nodes-*")
 
     # Many descriptions end with "Value: <list of accepted values>", which is
     # OK for a terse command line printout, but should be prettified for web
@@ -162,8 +196,10 @@ def desc_to_rst_paragraphs(tok):
     paragraphs = [desc]
     extra = ""
     if m := re.search(r"(^|\s)Value:", desc):
+        err_handler.record_use_of_tweak("accepted values")
         paragraphs = [desc[: m.start()], "Accepted values:" + desc[m.end() :]]
     elif m := re.search(r"\s*Example file.content:", desc):
+        err_handler.record_use_of_tweak("example file content")
         paragraphs = [desc[: m.start()]]
         extra = "Example file content::\n\n  " + desc[m.end() :] + "\n\n"
 
@@ -209,7 +245,7 @@ def macro_call_to_rst_paragraphs(macro_call):
             + defaults_to_rst_paragraph(defaults)
         )
     except ValueError as ve:
-        print(ve.args[0], file=sys.stderr)
+        err_handler.report_error(ve.args[0])
         return ""
 
 
@@ -227,8 +263,11 @@ def get_option_list(input_file):
 
 p = argparse.ArgumentParser()
 p.add_argument("--options-def", help="path to AnalyzerOptions.def")
-p.add_argument("--template", help="path of template file")
-p.add_argument("--out", help="path of output file")
+p.add_argument("--template", help="template file")
+p.add_argument("--out", help="output file")
+p.add_argument(
+    "--validate", action="store_true", help="exit with failure on parsing error"
+)
 opts = p.parse_args()
 
 with open(opts.template, encoding="utf-8") as f:
@@ -238,5 +277,10 @@ def get_option_list(input_file):
 
 rst_output = doc_template.replace(PLACEHOLDER, get_option_list(opts.options_def))
 
+err_handler.report_unused_tweaks()
+
 with open(opts.out, "w", newline="", encoding="utf-8") as f:
     f.write(rst_output)
+
+if opts.validate and err_handler.seen_errors:
+    sys.exit(1)
diff --git a/clang/test/Analysis/generate_analyzer_options_docs.test b/clang/test/Analysis/generate_analyzer_options_docs.test
new file mode 100644
index 0000000000000..ae78bd74b2965
--- /dev/null
+++ b/clang/test/Analysis/generate_analyzer_options_docs.test
@@ -0,0 +1,11 @@
+The documentation of analyzer options is generated by a script that parses
+AnalyzerOptions.def. The following line validates that this script
+"understands" everything in its input files:
+
+RUN: %python %src_dir/docs/tools/generate_analyzer_options_docs.py --validate  --options-def %src_include_dir/clang/StaticAnalyzer/Core/AnalyzerOptions.def --template %src_dir/docs/analyzer/user-docs/Options.rst.in --out %t.rst
+
+Moreover, verify that the documentation (e.g. this fragment of the
+documentation of the "mode" option) can be found in the output file:
+
+RUN: FileCheck --input-file=%t.rst %s
+CHECK: Controls the high-level analyzer mode
diff --git a/clang/test/lit.cfg.py b/clang/test/lit.cfg.py
index 8f1392b6a1f8f..51ee3684ac21f 100644
--- a/clang/test/lit.cfg.py
+++ b/clang/test/lit.cfg.py
@@ -70,6 +70,8 @@
 
 llvm_config.use_clang()
 
+config.substitutions.append(("%src_dir", config.clang_src_dir))
+
 config.substitutions.append(("%src_include_dir", config.clang_src_dir + "/include"))
 
 config.substitutions.append(("%target_triple", config.target_triple))

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
