anjiahao1 opened a new pull request, #17886:
URL: https://github.com/apache/nuttx/pull/17886

   # ABI Check Tool
   
   ## Overview
   
   `abi_check.py` is a Python tool designed for checking binary compatibility 
between different versions of ELF files and static libraries. It helps ensure 
that Application Binary Interface (ABI) compatibility is maintained when 
updating firmware components, preventing runtime errors caused by incompatible 
changes.
   
   ## What is ABI Compatibility?
   
   Application Binary Interface (ABI) compatibility means that separately 
compiled pieces of code agree on binary-level details, so they keep working 
together without being recompiled. When you have static libraries (`.a` files) 
that were compiled against one version of a kernel/ELF, they must continue to 
work correctly when linked against a new version. ABI compatibility ensures 
that:
   
   - Function signatures remain unchanged
   - Structure layouts (size, member offsets, member types) stay the same
   - Data types maintain their size and alignment
   - Calling conventions remain consistent
   
   ## Why ABI Check is Needed
   
   ### Problem Scenario
   
   In embedded systems, it's common to have:
   
   1. **Static libraries** compiled against one version of the kernel/RTOS
   2. **Kernel/ELF files** that get updated independently
   3. **Firmware updates** where only certain components are updated
   
   If the kernel/ELF changes its internal structures or function signatures 
without proper versioning, previously compiled static libraries may:
   
   - Read wrong memory offsets when accessing structures
   - Call functions with incorrect parameters
   - Cause crashes, memory corruption, or undefined behavior
   
   ### Example Issue
   
   Consider this structure change in a semaphore implementation:
   
   **Old Version:**
   ```c
   struct sem_s {
       volatile int16_t semcount;  /* Offset: 0, Size: 2 */
       uint8_t flags;              /* Offset: 2, Size: 1 */
       /* 1 byte padding */
       dq_queue_t waitlist;         /* Offset: 4, Size: 8 */
       /* Total size: 12 bytes */
   };
   ```
   
   **New Version:**
   ```c
   struct sem_s {
       volatile int16_t semcount;  /* Offset: 0, Size: 2 */
       /* Total size: 2 bytes */
   };
   ```
   
   A static library compiled against the old version would still access 
`waitlist` at offset 4, but the new structure is only 2 bytes long, so that 
access reads or writes past the end of the object, which is undefined behavior. 
This is exactly the kind of issue `abi_check` helps detect.
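
   The layout difference can be reproduced without a compiler. The following is 
a minimal sketch using Python's `ctypes`, not code from `abi_check.py`. 
`DqQueue` is a hypothetical stand-in for `dq_queue_t` that assumes two 4-byte 
pointers, and `stale_members` is an illustrative helper name.

```python
import ctypes


class DqQueue(ctypes.Structure):
    # Assumed stand-in for dq_queue_t: head and tail pointers, 4 bytes each.
    _fields_ = [("head", ctypes.c_uint32), ("tail", ctypes.c_uint32)]


class OldSem(ctypes.Structure):
    _fields_ = [
        ("semcount", ctypes.c_int16),
        ("flags", ctypes.c_uint8),
        ("waitlist", DqQueue),
    ]


class NewSem(ctypes.Structure):
    _fields_ = [("semcount", ctypes.c_int16)]


def stale_members(old, new):
    """Return members of `old` that extend past the end of `new`."""
    limit = ctypes.sizeof(new)
    stale = []
    for name, _ctype in old._fields_:
        field = getattr(old, name)  # field descriptor with .offset and .size
        if field.offset + field.size > limit:
            stale.append(name)
    return stale


print(ctypes.sizeof(OldSem), ctypes.sizeof(NewSem))
print(stale_members(OldSem, NewSem))
```

   Under these assumptions the old layout is 12 bytes and the new one 2 bytes, 
so both `flags` and `waitlist` fall outside the new structure.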
   
   ## Features
   
   ### 1. API Signature Extraction
   
   Extract external API function signatures used by static libraries from an 
ELF file.
   
   ```bash
   abi_check.py -e nuttx.elf -a libapp.a libdriver.a -j api_signatures.json
   ```
   
   **What it does:**
   - Parses static libraries to find undefined symbols (external functions they 
call)
   - Finds these functions in the ELF file
   - Extracts function signatures including:
     - Return type and size
     - Parameter types, sizes, and names
     - Structure layout if parameters are structures
   - Outputs results to a JSON file
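
   To illustrate the first step, the sketch below computes the external 
references of a library from text in the style of GNU `nm` output. This is an 
assumption made for illustration only: the tool itself parses ELF symbol 
tables, and both `external_symbols` and the sample listing are hypothetical. 
Only plain `U` entries are treated as undefined.

```python
def external_symbols(nm_output):
    """Return symbols that are undefined in every archive member."""
    defined, undefined = set(), set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "U":
            undefined.add(parts[1])   # "U name": undefined reference
        elif len(parts) == 3 and parts[1].isupper():
            defined.add(parts[2])     # "addr TYPE name": global definition
    # A symbol defined by another member of the same library is not external.
    return undefined - defined


# Made-up listing for a library with two object files.
SAMPLE_NM = """
a.o:
00000000 T app_main
         U sem_wait
         U helper

b.o:
00000000 T helper
         U open
"""

print(sorted(external_symbols(SAMPLE_NM)))
```

   Here `helper` is dropped because another member defines it, leaving only 
the symbols that have to be resolved by the ELF.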
   
   ### 2. Version Comparison
   
   Compare function signatures between two different ELF versions.
   
   ```bash
   # Generate signatures for old version
   abi_check.py -e nuttx_old.elf -a libapp.a -j old_api.json
   
   # Generate signatures for new version
   abi_check.py -e nuttx_new.elf -a libapp.a -j new_api.json
   
   # Compare them
   abi_check.py -i old_api.json new_api.json
   ```
   
   **What it detects:**
   - Functions that disappeared between versions
   - Functions that appeared (new APIs)
   - Return type changes
   - Parameter count changes
   - Parameter type/size changes
   - Structure layout changes (size, offsets, member types)
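
   The comparison can be pictured as a name-by-name diff of the two JSON 
documents. The following is a simplified sketch that assumes the JSON layout 
shown under "Output Format"; `diff_signatures`, `entry`, and the message 
wording are illustrative and do not reproduce the tool's actual output.

```python
def index_by_name(doc):
    """Map function name -> signature for one JSON document."""
    signatures, _undefined = doc
    return {name: sig for name, sig in signatures}


def diff_signatures(old_doc, new_doc):
    """Return a list of human-readable differences (illustrative wording)."""
    old, new = index_by_name(old_doc), index_by_name(new_doc)
    report = [f"{name}: removed" for name in sorted(old.keys() - new.keys())]
    report += [f"{name}: added" for name in sorted(new.keys() - old.keys())]
    for name in sorted(old.keys() & new.keys()):
        o, n = old[name], new[name]
        if o["return"] != n["return"]:
            report.append(f"{name}: return type changed")
        if len(o["parameters"]) != len(n["parameters"]):
            report.append(f"{name}: parameter count changed")
            continue
        for pos, (po, pn) in enumerate(zip(o["parameters"], n["parameters"]), 1):
            if po != pn:  # covers type, size, and nested struct fields
                report.append(f"{name}: parameter {pos} changed")
    return report


def entry(ctype, size):
    """Build one type entry in the documented shape."""
    return {"type": ctype, "size": size, "field": []}


OLD = [[
    ["sem_wait", {"return": entry("int", 4), "parameters": [entry("FAR sem_t *", 4)]}],
    ["open", {"return": entry("int", 4), "parameters": [entry("FAR const char *", 4)]}],
    ["sem_trywait", {"return": entry("int", 4), "parameters": [entry("FAR sem_t *", 4)]}],
], []]
NEW = [[
    ["sem_wait", {"return": entry("int", 8), "parameters": [entry("FAR sem_t *", 4)]}],
    ["open", {"return": entry("int", 4),
              "parameters": [entry("FAR const char *", 4), entry("int", 4)]}],
    ["sem_post", {"return": entry("int", 4), "parameters": [entry("FAR sem_t *", 4)]}],
], []]

print("\n".join(diff_signatures(OLD, NEW)))
```

   With real files, the two documents would come from `json.load` on the 
files written with `-j`.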
   
   ### 3. Structure Consistency Check
   
   Check if structures with the same name have different member definitions 
within a single ELF file.
   
   ```bash
   abi_check.py -s -e nuttx.elf
   ```
   
   **What it detects:**
   - Multiple definitions of the same structure name
   - Different member layouts for same-named structures
   - This can happen when different compilation units have different structure 
definitions
   
   ## Installation
   
   ### Prerequisites
   
   The tool requires:
   
   1. **Python 3** with standard library
   2. **pyelftools** for ELF parsing:
      ```bash
      pip install pyelftools
      ```
   3. **pahole** (dwarves package) for structure layout analysis:
      ```bash
      # Ubuntu/Debian
      sudo apt-get install dwarves
   
      # Fedora/RHEL
      sudo dnf install dwarves
   
      # Alpine
      sudo apk add dwarves
      ```
   4. **GNU ar** for extracting static libraries (usually pre-installed)
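
   A quick way to confirm the prerequisites before running the tool is 
sketched below. `missing_prerequisites` is a hypothetical helper, not part of 
`abi_check.py`; it assumes the external tools are looked up on `PATH` and 
relies on `elftools` being the import name of the pyelftools package.

```python
import importlib.util
import shutil


def missing_prerequisites(tools=("pahole", "ar"), modules=("elftools",)):
    """Return the names of required tools and modules that cannot be found."""
    missing = [tool for tool in tools if shutil.which(tool) is None]
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


print(missing_prerequisites() or "all prerequisites found")
```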
   
   ## Usage
   
   ### Command-Line Options
   
   ```
   usage: abi_check.py [-h] [-a LIB [LIB ...]] [-e ELF] [-c] [-d] [-j JSON]
                      [-s] [-i INPUT_JSON [INPUT_JSON ...]]
   
   optional arguments:
     -h, --help            Show help message and exit
     -a LIB [LIB ...], --lib LIB [LIB ...]
                           Path to static libraries (.a files)
     -e ELF, --elf ELF     Path to ELF file
     -c, --check           If static library contains debug information,
                           try to find function in static library,
                           and output to lib_<json> file
     -d, --dump            Dump result to console
     -j JSON, --json JSON  Save result to json file (default: out.json)
     -s, --struct_check    Check for struct differences within ELF
     -i INPUT_JSON [INPUT_JSON ...]
                           Compare two JSON files (diff mode)
   ```
   
   ### Use Case 1: Extract API Signatures
   
   Extract all external APIs used by static libraries and find their signatures 
in the ELF file.
   
   ```bash
   abi_check.py \
     -e nuttx.elf \
     -a libapps.a libdrivers.a libnetwork.a \
     -j api_signatures.json
   ```
   
   **Output:**
   ```json
   [
     [
       [
         "sem_wait",
         {
           "return": {
             "type": "int",
             "size": 4,
             "field": []
           },
           "parameters": [
             {
               "type": "FAR sem_t *",
               "size": 4,
               "field": []
             }
           ]
         }
       ],
       [
         "open",
         {
           "return": {
             "type": "int",
             "size": 4,
             "field": []
           },
           "parameters": [
             {
               "type": "FAR const char *",
               "size": 4,
               "field": []
             },
             {
               "type": "int",
               "size": 4,
               "field": []
             }
           ]
         }
       ]
     ],
     [
       "undefined_function_1",
       "undefined_function_2"
     ]
   ]
   ```
   
   The JSON output contains:
   1. Array of function signatures (function name, return type, parameters)
   2. Array of undefined functions that couldn't be found in the ELF
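
   As a rough illustration of how this file can be consumed, the sketch below 
turns each entry into a C-style prototype. It assumes exactly the layout shown 
above; `prototypes` is a hypothetical helper, and the sample JSON does not 
carry parameter names, so only types are printed.

```python
import json


def prototypes(doc):
    """Render each signature as 'return_type name(param_types);'."""
    signatures, _undefined = doc
    lines = []
    for name, sig in signatures:
        params = ", ".join(p["type"] for p in sig["parameters"]) or "void"
        lines.append(f'{sig["return"]["type"]} {name}({params});')
    return lines


# Trimmed copy of the example output above.
SAMPLE = json.loads("""
[
  [
    ["sem_wait", {"return": {"type": "int", "size": 4, "field": []},
                  "parameters": [{"type": "FAR sem_t *", "size": 4, "field": []}]}],
    ["open", {"return": {"type": "int", "size": 4, "field": []},
              "parameters": [{"type": "FAR const char *", "size": 4, "field": []},
                             {"type": "int", "size": 4, "field": []}]}]
  ],
  ["undefined_function_1", "undefined_function_2"]
]
""")

for line in prototypes(SAMPLE):
    print(line)
```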
   
   **With dump option:**
   ```bash
   abi_check.py -e nuttx.elf -a libapp.a -j api.json -d
   ```
   
   Output to console:
   ```
   int sem_wait(FAR sem_t *sem);
   int open(FAR const char *path, int oflags, ...);
   Function undefined_function not found in elf file
   Function another_undefined not found in elf file
   ```
   
   ### Use Case 2: Compare Two Versions
   
   Compare API signatures between old and new firmware versions.
   
   ```bash
   # Step 1: Generate signatures for old version
   abi_check.py -e nuttx_v1.0.elf -a libapp.a libdriver.a -j v1_signatures.json
   
   # Step 2: Generate signatures for new version
   abi_check.py -e nuttx_v2.0.elf -a libapp.a libdriver.a -j v2_signatures.json
   
   # Step 3: Compare
   abi_check.py -i v1_signatures.json v2_signatures.json
   ```
   
   **Example output showing incompatibilities:**
   ```
   Function sem_wait return type is different
   
     int have Different:
       Field int size is different  4 != 8
   
   Function open parameters count is different
   
   Function read parameter 2 is different
   
     FAR size_t * have Different:
       Field size_t * type is different
       Field size_t * size is different  8 != 4
   ```
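
   For longer reports it can help to reduce the output to the list of affected 
functions. The sketch below assumes the report lines look like the example 
above (a line starting with `Function <name>` and ending in `different`); the 
real output format may vary, and `affected_functions` is a hypothetical helper.

```python
import re

# "Function <name> ... different", as in the example report above.
FUNCTION_LINE = re.compile(r"^\s*Function (\S+) .*\bdifferent\b")


def affected_functions(report):
    """Return function names flagged in the report, in order of appearance."""
    names = []
    for line in report.splitlines():
        match = FUNCTION_LINE.match(line)
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names


REPORT = """
Function sem_wait return type is different

  int have Different:
    Field int size is different  4 != 8

Function open parameters count is different

Function read parameter 2 is different
"""

print(affected_functions(REPORT))
```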
   
   ### Use Case 3: Structure Consistency Check
   
   Check for structure definition inconsistencies within a single ELF file.
   
   ```bash
   abi_check.py -s -e nuttx.elf
   ```
   
   **Example output showing structure mismatch:**
   ```
   struct sem_s {
           volatile int16_t           semcount;             /*     0     2 */
           uint8_t                    flags;                /*     2     1 */
   
           /* XXX 1 byte hole, try to pack */
   
           dq_queue_t                 waitlist;             /*     4     8 */
   
           /* size: 12, cachelines: 1, members: 3 */
           /* sum members: 11, holes: 1, sum holes: 1 */
           /* last cacheline: 12 bytes */
   
   }; at /home/xxx/include/semaphore.h:99
   
   struct sem_s {
           volatile int16_t           semcount;             /*     0     2 */
   
           /* size: 2, cachelines: 1, members: 1 */
           /* last cacheline: 2 bytes */
   
   }; at /home/xxx/include/semaphore.h:98
   
   ------
   ```
   
   This output shows that `struct sem_s` has two different definitions at 
different source locations, which is a potential ABI issue.
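
   The idea behind this check can be sketched as grouping layout dumps by 
structure name and flagging names that occur with more than one distinct body. 
The code below is an illustration only, assuming text shaped like the output 
above; `conflicting_structs` is a hypothetical helper and the parsing is 
deliberately simplistic.

```python
import re
from collections import defaultdict

# Matches "struct NAME { ... };" where the closing "};" starts its own line.
STRUCT_BLOCK = re.compile(r"struct (\w+) \{(.*?)\n\s*\};", re.DOTALL)


def conflicting_structs(layout_text):
    """Return names of structs that appear with more than one distinct body."""
    bodies = defaultdict(set)
    for name, body in STRUCT_BLOCK.findall(layout_text):
        bodies[name].add(" ".join(body.split()))  # ignore whitespace differences
    return sorted(name for name, seen in bodies.items() if len(seen) > 1)


SAMPLE = """
struct sem_s {
        volatile int16_t           semcount;             /*     0     2 */
        uint8_t                    flags;                /*     2     1 */
        dq_queue_t                 waitlist;             /*     4     8 */

        /* size: 12, cachelines: 1, members: 3 */
}; at /home/xxx/include/semaphore.h:99

struct sem_s {
        volatile int16_t           semcount;             /*     0     2 */

        /* size: 2, cachelines: 1, members: 1 */
}; at /home/xxx/include/semaphore.h:98
"""

print(conflicting_structs(SAMPLE))
```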
   
   ### Use Case 4: Verify Static Library Compatibility
   
   When static libraries contain debug information, you can verify if they 
match the ELF file.
   
   ```bash
   abi_check.py \
     -e nuttx.elf \
     -a libapp.a \
     -j api_signatures.json \
     --check
   ```
   
   This generates two files:
   - `api_signatures.json` - API signatures from ELF
   - `lib_api_signatures.json` - API signatures from static libraries
   
   You can then compare the two files, for example with the `-i` diff mode 
shown in Use Case 2, to ensure compatibility.
   
   ## Workflow Examples
   
   ### Firmware Update Validation
   
   When updating firmware while keeping static libraries unchanged:
   
   ```bash
   #!/bin/bash
   # validate_firmware_update.sh
   
   OLD_ELF="nuttx_v1.0.elf"
   NEW_ELF="nuttx_v2.0.elf"
   LIBS="libapp.a libdriver.a libnetwork.a"
   
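   # Extract old API signatures ($LIBS is left unquoted so it splits into arguments)
   abi_check.py -e "$OLD_ELF" -a $LIBS -j old_signatures.json
   
   # Extract new API signatures
   abi_check.py -e "$NEW_ELF" -a $LIBS -j new_signatures.json
   
   # Compare for compatibility
   echo "Checking ABI compatibility..."
   report=$(abi_check.py -i old_signatures.json new_signatures.json)
   status=$?
   echo "$report"
   
   # Treat a non-zero exit status or any "is different" line as a failure
   if [ "$status" -eq 0 ] && ! echo "$report" | grep -q "is different"; then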
       echo "✓ ABI compatibility maintained"
       echo "Safe to proceed with firmware update"
   else
       echo "✗ ABI incompatibilities detected"
       echo "Firmware update would break existing libraries"
       exit 1
   fi
   ```
   
   ### CI/CD Integration
   
   Integrate ABI checking into your build pipeline:
   
   ```yaml
   # .github/workflows/abi-check.yml
   name: ABI Compatibility Check
   
   on: [pull_request, push]
   
   jobs:
     abi-check:
       runs-on: ubuntu-latest
       steps:
         - uses: actions/checkout@v4
   
         - name: Install dependencies
           run: |
             sudo apt-get update
             sudo apt-get install -y python3-pip dwarves
             pip3 install pyelftools
   
         - name: Build firmware
           run: |
             cmake -B build
             cmake --build build
   
         - name: Extract baseline signatures
           run: |
             abi_check.py -e build/baseline.elf -a lib/*.a -j baseline.json
   
         - name: Extract new signatures
           run: |
             abi_check.py -e build/nuttx.elf -a lib/*.a -j new.json
   
         - name: Compare signatures
           run: |
             abi_check.py -i baseline.json new.json
   
         - name: Check for differences
           run: |
             # Fail if abi_check reports incompatibilities
             ! abi_check.py -i baseline.json new.json | grep -q "is different"
   ```
   
   ## Troubleshooting
   
   ### Error: "elf and lib must be provided"
   
   You must provide both an ELF file and at least one static library:
   
   ```bash
   # Correct
   abi_check.py -e nuttx.elf -a libapp.a
   
   # Incorrect
   abi_check.py -e nuttx.elf
   ```
   
   ### Error: "Error: elf must be provided"
   
   When using `--struct_check`, you must provide an ELF file:
   
   ```bash
   # Correct
   abi_check.py -s -e nuttx.elf
   
   # Incorrect
   abi_check.py -s
   ```
   
   ### Warning: "File already exists, will be overwritten"
   
   The tool warns before overwriting existing JSON files:
   
   ```bash
   # Specify different output file to avoid overwriting
   abi_check.py -e nuttx.elf -a libapp.a -j new_signatures.json
   ```
   
   ### Missing pahole
   
   If you get an error about `pahole`, install the dwarves package:
   
   ```bash
   # Ubuntu/Debian
   sudo apt-get install dwarves
   
   # Check installation
   pahole --version
   ```
   
   ### Function not found in ELF
   
   If you see warnings about functions not found in the ELF file:
   
   ```
   Function some_function not found in elf file
   ```
   
   This means the static library references a function that is not present in 
the ELF. This can be:
   1. Intentional (weak symbols, optional features)
   2. A sign of a missing dependency
   
   ## Technical Details
   
   ### How It Works
   
   1. **Symbol Collection**:
      - Extracts `.obj` files from static libraries using `ar x`
      - Parses ELF symbol tables to find undefined symbols (external references)
   
   2. **Signature Extraction**:
      - Uses DWARF debug information in the ELF file
      - Traverses DIE (Debugging Information Entry) structures
      - Extracts function prototypes including return types and parameters
   
   3. **Structure Analysis**:
      - Uses `pahole` to dump structure layouts
      - Parses structure definitions including member offsets and types
      - Compares structures across different compilation units
   
   4. **Comparison**:
      - Compares function signatures name-by-name
      - Detects changes in types, sizes, and offsets
      - Reports incompatibilities in human-readable format
   
   ### Output Format
   
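   The JSON output structure (the `//` comments are annotations for this 
document and are not part of the actual output; placeholders such as 
`size_in_bytes` stand for numbers):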
   
   ```json
   [
     [  // Array of function signatures
       [  // Single function
         "function_name",
         {
           "return": {
             "type": "return_type_name",
             "size": size_in_bytes,
             "field": [
               // If return type is a struct, fields go here
               {
                 "type": "field_type",
                 "size": field_size,
                 "offset": field_offset
               }
             ]
           },
           "parameters": [
             {
               "type": "parameter_type",
               "size": parameter_size,
               "field": []  // Empty unless parameter is a struct
             }
           ]
         }
       ]
     ],
     [  // Array of undefined functions
       "undefined_function_1",
       "undefined_function_2"
     ]
   ]
   ```
   
   ## Best Practices
   
   1. **Baseline Early**: Establish ABI signatures early in your project 
lifecycle
   2. **Automate Checks**: Integrate into CI/CD pipeline to catch changes early
   3. **Document APIs**: Keep clear documentation of public APIs and their 
stability guarantees
   4. **Version Libraries**: Use versioning for libraries when breaking changes 
are necessary
   5. **Test Thoroughly**: Combine ABI checking with integration testing
   6. **Review Output**: Don't ignore warnings about undefined functions or 
structure mismatches
   
   ## Limitations
   
   1. **Requires Debug Info**: The tool depends on DWARF debug information; 
compile with `-g` flag
   2. **Static Libraries Only**: Currently designed for static library 
analysis, not shared libraries
   3. **Python 3**: Requires Python 3.x
   4. **pahole Dependency**: External tool dependency for structure analysis
   5. **No Runtime Checks**: Only checks compile-time ABI, not runtime behavior
   
   

