Author: Daniel Thornburgh
Date: 2025-12-03T11:24:56-08:00
New Revision: d041d5d4e07ba0eddd5120efd66520b3984a2b9b

URL: https://github.com/llvm/llvm-project/commit/d041d5d4e07ba0eddd5120efd66520b3984a2b9b
DIFF: https://github.com/llvm/llvm-project/commit/d041d5d4e07ba0eddd5120efd66520b3984a2b9b.diff

LOG: [clang] "modular_format" attribute for functions using format strings (#147431)

This adds a C language `modular_format` attribute. Combined with
information from the existing `format` attribute, it sets the new IR
`modular-format` attribute.
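
Judging from the CodeGen test in this patch, the value of the IR attribute is a
comma-separated list: the `format` type, the format-string index, the
first-argument index, the modular implementation function, the implementation
name, and then the aspects. The C sketch below only illustrates that layout.
`build_modular_format_value` and `append_str` are hypothetical helpers written
for this note, not Clang or LLVM APIs.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Append s to the NUL-terminated string in out, which currently holds *used
 * characters. Returns -1 if the result would not fit in out_size bytes. */
static int append_str(char *out, size_t out_size, size_t *used, const char *s) {
  size_t len = strlen(s);
  if (*used + len + 1 > out_size)
    return -1;
  memcpy(out + *used, s, len + 1); /* also copies the terminating NUL */
  *used += len;
  return 0;
}

/* Hypothetical helper (an assumption for illustration): builds a string such
 * as "printf,1,2,__modular_printf,__printf,float". The caller is assumed to
 * pass the aspects already sorted. Returns the length, or -1 on overflow. */
int build_modular_format_value(char *out, size_t out_size,
                               const char *format_type, int format_idx,
                               int first_arg, const char *impl_fn,
                               const char *impl_name,
                               const char *const *aspects, size_t num_aspects) {
  char indices[32];
  size_t used = 0;
  if (out_size == 0)
    return -1;
  out[0] = '\0';
  /* The two indices are written in decimal, each surrounded by commas. */
  snprintf(indices, sizeof indices, ",%d,%d,", format_idx, first_arg);
  if (append_str(out, out_size, &used, format_type) ||
      append_str(out, out_size, &used, indices) ||
      append_str(out, out_size, &used, impl_fn) ||
      append_str(out, out_size, &used, ",") ||
      append_str(out, out_size, &used, impl_name))
    return -1;
  for (size_t i = 0; i < num_aspects; ++i)
    if (append_str(out, out_size, &used, ",") ||
        append_str(out, out_size, &used, aspects[i]))
      return -1;
  return (int)used;
}
```

Called with the `printf` format type, indices 1 and 2, `__modular_printf`,
`__printf`, and the aspects `a` and `b`, the helper fills the buffer with the
same text that the `ATTR_ORDER` check in the CodeGen test expects.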

The purpose of these attributes is to enable "modular printf". A
statically linked libc can provide a modular variant of printf that only
weakly references implementation routines. The link-time symbol `printf`
would strongly reference the aspect symbols (e.g. for float or fixed point)
that those routines provide, restoring the status quo.
However, the compiler could transform calls with constant format strings
to calls to the modular printf instead, and at the same time, it would
emit strong references to the aspect symbols that are needed to
implement the format string. Then, the printf implementation would
contain only the union of the aspects requested.
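
As a rough illustration of how a library might opt in, here is a C sketch for a
user-defined printf-like function. Everything named `my_log*` is hypothetical,
as is the `MY_MODULAR_FORMAT` macro. The sketch assumes that the aspect symbol
can be satisfied by any definition named `<impl_name>_<aspect>`, and it guards
the attribute with `__has_attribute` so that compilers without
`modular_format` still accept the code.

```c
#include <stdarg.h>
#include <stdio.h>

/* Expand to the new attribute only when the compiler knows it (assumption:
 * older compilers simply see an empty macro and behave as before). */
#if defined(__has_attribute)
#if __has_attribute(modular_format)
#define MY_MODULAR_FORMAT \
  __attribute__((modular_format(my_log_modular, "my_log", "float")))
#endif
#endif
#ifndef MY_MODULAR_FORMAT
#define MY_MODULAR_FORMAT
#endif

/* Public entry point. modular_format requires the format attribute. */
int my_log(const char *fmt, ...)
    __attribute__((format(printf, 1, 2))) MY_MODULAR_FORMAT;

/* Modular entry point that the compiler may call instead; it takes the same
 * arguments as my_log. */
int my_log_modular(const char *fmt, ...)
    __attribute__((format(printf, 1, 2)));

/* Hypothetical aspect symbol "<impl_name>_<aspect>". In a real libc it would
 * live in the object file that holds the floating point conversion code. */
const char my_log_float = 0;

/* Shared formatting backend for this sketch; a real modular implementation
 * would reach the float routines only through weak references. */
static int my_vlog(const char *fmt, va_list ap) { return vprintf(fmt, ap); }

int my_log_modular(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int n = my_vlog(fmt, ap);
  va_end(ap);
  return n;
}

int my_log(const char *fmt, ...) {
  va_list ap;
  va_start(ap, fmt);
  int n = my_vlog(fmt, ap);
  va_end(ap);
  return n;
}
```

With this declaration, a call such as `my_log("%d", 42)` could be redirected to
`my_log_modular`, and a call that passes a floating point argument would also
reference `my_log_float`. The observable output is the same either way, since
both entry points share one backend in this sketch.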

See issue #146159 for context.

Added: 
    clang/test/CodeGen/attr-modular-format.c
    clang/test/Sema/attr-modular-format.c

Modified: 
    clang/docs/ReleaseNotes.rst
    clang/include/clang/Basic/Attr.td
    clang/include/clang/Basic/AttrDocs.td
    clang/include/clang/Basic/DiagnosticSemaKinds.td
    clang/include/clang/Sema/Sema.h
    clang/lib/CodeGen/CGCall.cpp
    clang/lib/Sema/SemaDecl.cpp
    clang/lib/Sema/SemaDeclAttr.cpp
    clang/test/Misc/pragma-attribute-supported-attributes-list.test

Removed: 
    


################################################################################
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6838e926f4c9d..654a8e48cd104 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -360,6 +360,12 @@ Attribute Changes in Clang
  attribute, but `malloc_span` applies not to functions returning pointers, but to functions returning
  span-like structures (i.e. those that contain a pointer field and a size integer field or two pointers).
 
+- Added new attribute ``modular_format`` to allow dynamically selecting at link
+  time which aspects of a statically linked libc's printf (et al)
+  implementation are required. This can reduce code size without requiring e.g.
+  multilibs for printf features. Requires cooperation with the libc
+  implementation.
+
 Improvements to Clang's diagnostics
 -----------------------------------
 - Diagnostics messages now refer to ``structured binding`` instead of ``decomposition``,

diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index c929da7d538bd..d8d1675f245a1 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -5331,3 +5331,11 @@ def NonString : InheritableAttr {
   let Subjects = SubjectList<[Var, Field]>;
   let Documentation = [NonStringDocs];
 }
+
+def ModularFormat : InheritableAttr {
+  let Spellings = [Clang<"modular_format">];
+  let Args = [IdentifierArgument<"ModularImplFn">, StringArgument<"ImplName">,
+              VariadicStringArgument<"Aspects">];
+  let Subjects = SubjectList<[Function]>;
+  let Documentation = [ModularFormatDocs];
+}

diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index fa365da3ed9aa..ae929c7dea37d 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -9642,3 +9642,43 @@ silence diagnostics with code like:
   __attribute__((nonstring)) char NotAStr[3] = "foo"; // Not diagnosed
   }];
 }
+
+def ModularFormatDocs : Documentation {
+  let Category = DocCatFunction;
+  let Content = [{
+The ``modular_format`` attribute can be applied to a function that bears the
+``format`` attribute (or standard library functions) to indicate that the
+implementation is "modular", that is, that the implementation is logically
+divided into a number of named aspects. When the compiler can determine that
+not all aspects of the implementation are needed for a given call, the compiler
+may redirect the call to the identifier given as the first argument to the
+attribute (the modular implementation function).
+
+The second argument is a implementation name, and the remaining arguments are
+aspects of the format string for the compiler to report. The implementation
+name is an unevaluated identifier be in the C namespace.
+
+The compiler reports that a call requires an aspect by issuing a relocation for
+the symbol ``<impl_name>_<aspect>`` at the point of the call. This arranges for
+code and data needed to support the aspect of the implementation to be brought
+into the link to satisfy weak references in the modular implemenation function.
+If the compiler does not understand an aspect, it must summarily consider any
+call to require that aspect. 
+
+For example, say ``printf`` is annotated with
+``modular_format(__modular_printf, "__printf", "float")``. Then, a call to
+``printf(var, 42)`` would be untouched. A call to ``printf("%d", 42)`` would
+become a call to ``__modular_printf`` with the same arguments, as would
+``printf("%f", 42.0)``. The latter would be accompanied with a strong
+relocation against the symbol ``__printf_float``, which would bring floating
+point support for ``printf`` into the link.
+
+If the attribute appears more than once on a declaration, or across a chain of
+redeclarations, it is an error for the attributes to have different arguments,
+excepting that the aspects may be in any order.
+
+The following aspects are currently supported:
+
+- ``float``: The call has a floating point argument
+  }];
+}

diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 6b75976e9c38d..e2c694cb2d9df 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -11277,6 +11277,8 @@ def warn_duplicate_attribute_exact : Warning<
 def warn_duplicate_attribute : Warning<
   "attribute %0 is already applied with different arguments">,
   InGroup<IgnoredAttributes>;
+def err_duplicate_attribute
+    : Error<"attribute %0 is already applied with different arguments">;
 def err_disallowed_duplicate_attribute : Error<
   "attribute %0 cannot appear more than once on a declaration">;
 
@@ -13070,6 +13072,12 @@ def err_get_vtable_pointer_requires_complete_type
     : Error<"__builtin_get_vtable_pointer requires an argument with a complete "
             "type, but %0 is incomplete">;
 
+def err_modular_format_attribute_no_format
+    : Error<"'modular_format' attribute requires 'format' attribute">;
+
+def err_modular_format_duplicate_aspect
+    : Error<"duplicate aspect '%0' in 'modular_format' attribute">;
+
 // SYCL-specific diagnostics
 def warn_sycl_kernel_num_of_template_params : Warning<
   "'sycl_kernel' attribute only applies to a function template with at least"

diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 78ecbccbe4efc..4a601a0eaf1b9 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -4957,6 +4957,11 @@ class Sema final : public SemaBase {
                                             IdentifierInfo *Format,
                                             int FormatIdx,
                                             StringLiteral *FormatStr);
+  ModularFormatAttr *mergeModularFormatAttr(Decl *D,
+                                            const AttributeCommonInfo &CI,
+                                            IdentifierInfo *ModularImplFn,
+                                            StringRef ImplName,
+                                            MutableArrayRef<StringRef> Aspects);
 
   /// AddAlignedAttr - Adds an aligned attribute to a particular declaration.
   void AddAlignedAttr(Decl *D, const AttributeCommonInfo &CI, Expr *E,

diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index efacb3cc04c01..4a9025b6e0b0f 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2559,6 +2559,19 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
 
     if (TargetDecl->hasAttr<ArmLocallyStreamingAttr>())
       FuncAttrs.addAttribute("aarch64_pstate_sm_body");
+
+    if (auto *ModularFormat = TargetDecl->getAttr<ModularFormatAttr>()) {
+      FormatAttr *Format = TargetDecl->getAttr<FormatAttr>();
+      StringRef Type = Format->getType()->getName();
+      std::string FormatIdx = std::to_string(Format->getFormatIdx());
+      std::string FirstArg = std::to_string(Format->getFirstArg());
+      SmallVector<StringRef> Args = {
+          Type, FormatIdx, FirstArg,
+          ModularFormat->getModularImplFn()->getName(),
+          ModularFormat->getImplName()};
+      llvm::append_range(Args, ModularFormat->aspects());
+      FuncAttrs.addAttribute("modular-format", llvm::join(Args, ","));
+    }
   }
 
   // Attach "no-builtins" attributes to:

diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index c7d262aa4f15f..4b74b4c0354b2 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -58,6 +58,7 @@
 #include "clang/Sema/SemaSwift.h"
 #include "clang/Sema/SemaWasm.h"
 #include "clang/Sema/Template.h"
+#include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/STLForwardCompat.h"
 #include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SmallPtrSet.h"
@@ -2901,6 +2902,10 @@ static bool mergeDeclAttribute(Sema &S, NamedDecl *D,
   else if (const auto *FMA = dyn_cast<FormatMatchesAttr>(Attr))
     NewAttr = S.mergeFormatMatchesAttr(
         D, *FMA, FMA->getType(), FMA->getFormatIdx(), FMA->getFormatString());
+  else if (const auto *MFA = dyn_cast<ModularFormatAttr>(Attr))
+    NewAttr = S.mergeModularFormatAttr(
+        D, *MFA, MFA->getModularImplFn(), MFA->getImplName(),
+        MutableArrayRef<StringRef>{MFA->aspects_begin(), MFA->aspects_size()});
   else if (const auto *SA = dyn_cast<SectionAttr>(Attr))
     NewAttr = S.mergeSectionAttr(D, *SA, SA->getName());
   else if (const auto *CSA = dyn_cast<CodeSegAttr>(Attr))
@@ -7217,6 +7222,11 @@ static void checkLifetimeBoundAttr(Sema &S, NamedDecl &ND) {
   }
 }
 
+static void checkModularFormatAttr(Sema &S, NamedDecl &ND) {
+  if (ND.hasAttr<ModularFormatAttr>() && !ND.hasAttr<FormatAttr>())
+    S.Diag(ND.getLocation(), diag::err_modular_format_attribute_no_format);
+}
+
 static void checkAttributesAfterMerging(Sema &S, NamedDecl &ND) {
   // Ensure that an auto decl is deduced otherwise the checks below might cache
   // the wrong linkage.
@@ -7229,6 +7239,7 @@ static void checkAttributesAfterMerging(Sema &S, NamedDecl &ND) {
   checkHybridPatchableAttr(S, ND);
   checkInheritableAttr(S, ND);
   checkLifetimeBoundAttr(S, ND);
+  checkModularFormatAttr(S, ND);
 }
 
 static void checkDLLAttributeRedeclaration(Sema &S, NamedDecl *OldDecl,

diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index c9d1ee76a2e52..04cd68a4223d8 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -6973,6 +6973,71 @@ static void handleVTablePointerAuthentication(Sema &S, Decl *D,
       CustomDiscriminationValue));
 }
 
+static bool modularFormatAttrsEquiv(const ModularFormatAttr *Existing,
+                                    IdentifierInfo *ModularImplFn,
+                                    StringRef ImplName,
+                                    ArrayRef<StringRef> Aspects) {
+  return Existing->getModularImplFn() == ModularImplFn &&
+         Existing->getImplName() == ImplName &&
+         Existing->aspects_size() == Aspects.size() &&
+         llvm::equal(Existing->aspects(), Aspects);
+}
+
+ModularFormatAttr *
+Sema::mergeModularFormatAttr(Decl *D, const AttributeCommonInfo &CI,
+                             IdentifierInfo *ModularImplFn, StringRef ImplName,
+                             MutableArrayRef<StringRef> Aspects) {
+  if (const auto *Existing = D->getAttr<ModularFormatAttr>()) {
+    if (!modularFormatAttrsEquiv(Existing, ModularImplFn, ImplName, Aspects)) {
+      Diag(Existing->getLocation(), diag::err_duplicate_attribute) << *Existing;
+      Diag(CI.getLoc(), diag::note_conflicting_attribute);
+    }
+    return nullptr;
+  }
+  return ::new (Context) ModularFormatAttr(Context, CI, ModularImplFn, ImplName,
+                                           Aspects.data(), Aspects.size());
+}
+
+static void handleModularFormat(Sema &S, Decl *D, const ParsedAttr &AL) {
+  bool Valid = true;
+  StringRef ImplName;
+  if (!S.checkStringLiteralArgumentAttr(AL, 1, ImplName))
+    Valid = false;
+  SmallVector<StringRef> Aspects;
+  llvm::DenseSet<StringRef> SeenAspects;
+  for (unsigned I = 2, E = AL.getNumArgs(); I != E; ++I) {
+    StringRef Aspect;
+    if (!S.checkStringLiteralArgumentAttr(AL, I, Aspect))
+      return;
+    if (!SeenAspects.insert(Aspect).second) {
+      S.Diag(AL.getArgAsExpr(I)->getExprLoc(),
+             diag::err_modular_format_duplicate_aspect)
+          << Aspect;
+      Valid = false;
+      continue;
+    }
+    Aspects.push_back(Aspect);
+  }
+  if (!Valid)
+    return;
+
+  // Store aspects sorted.
+  llvm::sort(Aspects);
+  IdentifierInfo *ModularImplFn = AL.getArgAsIdent(0)->getIdentifierInfo();
+
+  if (const auto *Existing = D->getAttr<ModularFormatAttr>()) {
+    if (!modularFormatAttrsEquiv(Existing, ModularImplFn, ImplName, Aspects)) {
+      S.Diag(AL.getLoc(), diag::err_duplicate_attribute) << *Existing;
+      S.Diag(Existing->getLoc(), diag::note_conflicting_attribute);
+    }
+    // Ignore the later declaration in favor of the earlier one.
+    return;
+  }
+
+  D->addAttr(::new (S.Context) ModularFormatAttr(
+      S.Context, AL, ModularImplFn, ImplName, Aspects.data(), Aspects.size()));
+}
+
 //===----------------------------------------------------------------------===//
 // Top Level Sema Entry Points
 //===----------------------------------------------------------------------===//
@@ -7913,6 +7978,10 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
   case ParsedAttr::AT_VTablePointerAuthentication:
     handleVTablePointerAuthentication(S, D, AL);
     break;
+
+  case ParsedAttr::AT_ModularFormat:
+    handleModularFormat(S, D, AL);
+    break;
   }
 }
 

diff --git a/clang/test/CodeGen/attr-modular-format.c b/clang/test/CodeGen/attr-modular-format.c
new file mode 100644
index 0000000000000..5474ce361fbc2
--- /dev/null
+++ b/clang/test/CodeGen/attr-modular-format.c
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+
+int printf(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float")));
+int myprintf(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float"), format(printf, 1, 2)));
+
+// CHECK-LABEL: define dso_local void @test_inferred_format(
+// CHECK:    {{.*}} = call i32 (ptr, ...) @printf(ptr noundef @.str) #[[ATTR:[0-9]+]]
+void test_inferred_format(void) {
+  printf("hello");
+}
+
+// CHECK-LABEL: define dso_local void @test_explicit_format(
+// CHECK:    {{.*}} = call i32 (ptr, ...) @myprintf(ptr noundef @.str) #[[ATTR:[0-9]+]]
+void test_explicit_format(void) {
+  myprintf("hello");
+}
+
+int redecl(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
+int redecl(const char *fmt, ...) __attribute__((modular_format(__dupe_impl, "__dupe", "1")));
+int redecl(const char *fmt, ...) __attribute__((modular_format(__dupe_impl, "__dupe", "1")));
+
+// CHECK-LABEL: define dso_local void @test_redecl(
+// CHECK:    {{.*}} = call i32 (ptr, ...) @redecl(ptr noundef @.str) #[[ATTR_DUPE_IDENTICAL:[0-9]+]]
+void test_redecl(void) {
+  redecl("hello");
+}
+
+int order1(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "a", "b"), format(printf, 1, 2)));
+int order2(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "b", "a"), format(printf, 1, 2)));
+
+// CHECK-LABEL: define dso_local void @test_order(
+// CHECK:    {{.*}} = call i32 (ptr, ...) @order1(ptr noundef @.str) #[[ATTR_ORDER:[0-9]+]]
+// CHECK:    {{.*}} = call i32 (ptr, ...) @order2(ptr noundef @.str) #[[ATTR_ORDER]]
+void test_order(void) {
+  order1("hello");
+  order2("hello");
+}
+
+int duplicate_identical(const char *fmt, ...) __attribute__((modular_format(__dupe_impl, "__dupe", "1"), modular_format(__dupe_impl, "__dupe", "1"), format(printf, 1, 2)));
+
+// CHECK-LABEL: define dso_local void @test_duplicate_identical(
+// CHECK:    {{.*}} = call i32 (ptr, ...) @duplicate_identical(ptr noundef @.str) #[[ATTR_DUPE_IDENTICAL]]
+void test_duplicate_identical(void) {
+  duplicate_identical("hello");
+}
+
+// CHECK: attributes #[[ATTR]] = { "modular-format"="printf,1,2,__modular_printf,__printf,float" }
+// CHECK: attributes #[[ATTR_DUPE_IDENTICAL]] = { "modular-format"="printf,1,2,__dupe_impl,__dupe,1" }
+// CHECK: attributes #[[ATTR_ORDER]] = { "modular-format"="printf,1,2,__modular_printf,__printf,a,b" }

diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index 1e1d4a356f515..081ea8d5c821c 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -111,6 +111,7 @@
 // CHECK-NEXT: Mips16 (SubjectMatchRule_function)
 // CHECK-NEXT: MipsLongCall (SubjectMatchRule_function)
 // CHECK-NEXT: MipsShortCall (SubjectMatchRule_function)
+// CHECK-NEXT: ModularFormat (SubjectMatchRule_function)
 // CHECK-NEXT: NSConsumed (SubjectMatchRule_variable_is_parameter)
 // CHECK-NEXT: NSConsumesSelf (SubjectMatchRule_objc_method)
 // CHECK-NEXT: NSErrorDomain (SubjectMatchRule_enum)

diff --git a/clang/test/Sema/attr-modular-format.c b/clang/test/Sema/attr-modular-format.c
new file mode 100644
index 0000000000000..fc5b28b0b88be
--- /dev/null
+++ b/clang/test/Sema/attr-modular-format.c
@@ -0,0 +1,26 @@
+//RUN: %clang_cc1 -fsyntax-only -verify %s
+
+int printf(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float")));  // no-error
+int myprintf(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float")));  // expected-error {{'modular_format' attribute requires 'format' attribute}}
+
+int dupe(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float", "int", "float"), format(printf, 1, 2))); // expected-error {{duplicate aspect 'float' in 'modular_format' attribute}}
+int multi_dupe(const char *fmt, ...)  __attribute__((modular_format(__modular_printf, "__printf", "float", "int", "float", "int"), format(printf, 1, 2))); // expected-error {{duplicate aspect 'float' in 'modular_format' attribute}} \
+                                                                               
                                                                                
  // expected-error {{duplicate aspect 'int' in 'modular_format' attribute}}
+
+// Test with multiple identical attributes on the same declaration.
+int same_attr(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float"), modular_format(__modular_printf, "__printf", "float"), format(printf, 1, 2))); // no-warning
+
+// Test with multiple different attributes on the same declaration.
+int diff_attr(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float"), format(printf, 1, 2), modular_format(__modular_printf, "__printf", "int"))); // expected-error {{attribute 'modular_format' is already applied with different arguments}} expected-note {{conflicting attribute is here}}
+
+int diff_attr2(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float"), format(printf, 1, 2), modular_format(__modular_printf, "__other", "float"))); // expected-error {{attribute 'modular_format' is already applied with different arguments}} expected-note {{conflicting attribute is here}}
+
+int diff_attr3(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float"), format(printf, 1, 2), modular_format(__other, "__printf", "float"))); // expected-error {{attribute 'modular_format' is already applied with different arguments}} expected-note {{conflicting attribute is here}}
+
+// Test with same attributes but different aspect order.
+int diff_order(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float", "int"), format(printf, 1, 2), modular_format(__modular_printf, "__printf", "int", "float"))); // no-error
+
+// Test with multiple different attributes on a declaration and a redeclaration
+int redecl(const char *fmt, ...) __attribute__((format(printf, 1, 2))); // no-error
+int redecl(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "float"))); // expected-note {{conflicting attribute is here}}
+int redecl(const char *fmt, ...) __attribute__((modular_format(__modular_printf, "__printf", "int"))); // expected-error {{attribute 'modular_format' is already applied with different arguments}}


        
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
