================ @@ -0,0 +1,251 @@ +================== +Call Graph Section +================== + +Introduction +============ + +With ``-fcall-graph-section``, the compiler will create a call graph section +in the object file. It will include type identifiers for indirect calls and +targets. This information can be used to map indirect calls to their receivers +with matching types. A complete and high-precision call graph can be +reconstructed by complementing this information with disassembly +(see ``llvm-objdump --call-graph-info``). + +Semantics +========= + +A coarse-grained, type-agnostic call graph may allow indirect calls to target +any function in the program. This approach ensures completeness since no +indirect call edge is missing. However, it is generally poor in precision +due to having unneeded edges. + +A call graph section provides type identifiers for indirect calls and targets. +This information can be used to restrict the receivers of an indirect target to +indirect calls with matching type. Consequently, the precision for indirect +call edges are improved while maintaining the completeness. + +The ``llvm-objdump`` utility provides a ``--call-graph-info`` option to extract +full call graph information by parsing the content of the call graph section +and disassembling the program for complementary information, e.g., direct +calls. + +Section layout +============== + +A call graph section consists of zero or more call graph entries. +Each entry contains information on a function and its indirect calls. + +An entry of a call graph section has the following layout in the binary: + ++---------------------+-----------------------------------------------------------------------+ +| Element | Content | ++=====================+=======================================================================+ +| FormatVersionNumber | Format version number. | ++---------------------+-----------------------------------------------------------------------+ +| FunctionEntryPc | Function entry address. | ++---------------------+-----------------------------------+-----------------------------------+ +| | A flag whether the function is an | - 0: not an indirect target | +| FunctionKind | indirect target, and if so, | - 1: indirect target, unknown id | +| | whether its type id is known. | - 2: indirect target, known id | ++---------------------+-----------------------------------+-----------------------------------+ +| FunctionTypeId | Type id for the indirect target. Present only when FunctionKind is 2. | ++---------------------+-----------------------------------------------------------------------+ +| CallSiteCount | Number of type id to indirect call site mappings that follow. | ++---------------------+-----------------------------------------------------------------------+ +| CallSiteList | List of type id and indirect call site pc pairs. | ++---------------------+-----------------------------------------------------------------------+ + +Each element in an entry (including each element of the contained lists and +pairs) occupies 64-bit space. + +The format version number is repeated per entry to support concatenation of +call graph sections with different format versions by the linker. + +As of now, the only supported format version is described above and has version +number 0. + +Type identifiers +================ + +The type for an indirect call or target is the function signature. +The mapping from a type to an identifier is an ABI detail. +In the current experimental implementation, an identifier of type T is +computed as follows: + + - Obtain the generalized mangled name for “typeinfo name for T”. + - Compute MD5 hash of the name as a string. + - Reinterpret the first 8 bytes of the hash as a little-endian 64-bit integer. + +To avoid mismatched pointer types, generalizations are applied. +Pointers in return and argument types are treated as equivalent as long as the +qualifiers for the type they point to match. +For example, ``char*``, ``char**``, and ``int*`` are considered equivalent +types. However, ``char*`` and ``const char*`` are considered separate types. ---------------- ilovepi wrote:
Is this the same limitation we use for CFI? I think whatever we do here should match those semantics. https://github.com/llvm/llvm-project/pull/117037 _______________________________________________ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits