Hi, I would like to share a script that determines code coverage of functions that are directly accessible from SQL.
To use the script you have to build a code coverage report first, as usual:

```
git clean -dfx
meson setup --buildtype debug -DPG_TEST_EXTRA="kerberos ldap ssl" -Db_coverage=true -Dldap=disabled -Dssl=openssl -Dcassert=true -Dtap_tests=enabled -Dprefix=/home/eax/pginstall build
ninja -C build
PG_TEST_EXTRA=1 meson test -C build
ninja -C build coverage-html
```

Note that this will only work in a Linux environment, to my knowledge at least.

Then execute:

```
./coverage-analyze.py > coverage.csv
```

This will give you the functions listed in pg_proc.dat and where to find them in the source code, sorted by code coverage. These functions are easy to cover with tests, see for instance [1] and similar patches in `git log`. I believe this can be useful to newcomers looking for an idea for a first patch.

You can also modify the script to check the coverage of particular functions you are interested in. Just modify this part accordingly:

```
functions = set()
with open("src/include/catalog/pg_proc.dat") as f:
    proc_data = f.read()

re_str = """prosrc[ ]+=>[ ]+['"](.*?)['"]"""
for m in re.finditer(re_str, proc_data):
    func_name = m.group(1)
    functions.add(func_name)
```

If you want to know the coverage of *all* functions, just comment out this part:

```
if current_func_name not in functions:
    current_func_name = None
    continue
```

Please find attached the script and the compressed report. I also uploaded a copy of the report to Google Drive [2].

As always, your thoughts and feedback are most welcome.

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4f071349
[2]: https://docs.google.com/spreadsheets/d/1i3_tfBjMO3QbnVVnWb1uwhp5R_gdmjdx1Q9eXmjoUUA/edit?usp=sharing

--
Best regards,
Aleksander Alekseev
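P.S. For anyone who wants to restrict the analysis to a handful of functions, here is a minimal sketch of how the `prosrc` regular expression from the script picks names out of pg_proc.dat-style text, and how the resulting set could be narrowed. The sample entries, the OIDs, the names `example_in` / `example_out`, and the helpers `extract_prosrc_names` and `restrict_to` are all hypothetical and only imitate the file format; they are not part of the attached script.

```python
import re

# Hypothetical text imitating the layout of pg_proc.dat entries.
SAMPLE = """
{ oid => '9001', descr => 'I/O',
  proname => 'example_in', prorettype => 'bool', proargtypes => 'cstring',
  prosrc => 'example_in' },
{ oid => '9002', descr => 'I/O',
  proname => 'example_out', prorettype => 'cstring', proargtypes => 'bool',
  prosrc => 'example_out' },
"""

# Same pattern as in the script: capture the quoted value after "prosrc =>".
PROSRC_RE = r"""prosrc[ ]+=>[ ]+['"](.*?)['"]"""


def extract_prosrc_names(text):
    """Return the set of prosrc values found in the given text."""
    return {m.group(1) for m in re.finditer(PROSRC_RE, text)}


def restrict_to(names, wanted):
    """Keep only the names the caller is interested in."""
    return names & set(wanted)


if __name__ == "__main__":
    names = extract_prosrc_names(SAMPLE)
    print(sorted(restrict_to(names, ["example_in", "not_in_catalog"])))
```

In the real script the same effect could presumably be had by filling `functions` with a hand-written set of names instead of reading pg_proc.dat.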
coverage.csv.tgz
Description: GNU Zip compressed data
#!/usr/bin/env python3
# coverage-analyze.py
# Aleksander Alekseev 2025

import re
import sys

functions = set()
with open("src/include/catalog/pg_proc.dat") as f:
    proc_data = f.read()

re_str = """prosrc[ ]+=>[ ]+['"](.*?)['"]"""
for m in re.finditer(re_str, proc_data):
    func_name = m.group(1)
    functions.add(func_name)

# print("DEBUG functions found: {}".format(len(functions)))

# function_name -> {
#   'file_name': xxx
#   'start_line': yyy
#   'end_line': zzz
#   'covered_lines': set(covered line numbers)
# }
summary = {}
current_file_name = None
current_func_name = None
line_number_to_func_name = {}

with open("build/meson-logs/coverage.info") as f:
    for line in f:
        line = line.strip()
        if line.startswith("SF:"):
            current_file_name = line.split(":")[1]
            line_number_to_func_name = {}
            # print("DEBUG current file: '{}'".format(current_file_name))
        elif line.startswith("FN:"):
            temp = line.split(":")
            temp = temp[1].split(",")
            func_start_line = int(temp[0])
            func_name = temp[1]

            if current_func_name != None:
                summary[current_func_name]['end_line'] = func_start_line - 1
                for num in range(summary[current_func_name]['start_line'], summary[current_func_name]['end_line']+1):
                    line_number_to_func_name[num] = current_func_name

            current_func_name = func_name
            # print("DEBUG found function '{}'".format(func_name))
            if current_func_name not in functions:
                current_func_name = None
                continue

            summary[current_func_name] = {
                'file_name': current_file_name,
                'start_line': func_start_line,
                'end_line': None,
                'covered_lines': set()
            }
        else:
            if current_func_name != None:
                if summary[current_func_name]['end_line'] == None:
                    summary[current_func_name]['end_line'] = sum(1 for _ in open(current_file_name))
                for num in range(summary[current_func_name]['start_line'], summary[current_func_name]['end_line']+1):
                    line_number_to_func_name[num] = current_func_name
                current_func_name = None

            if line.startswith("DA:"):
                temp = line.split(":")
                temp = temp[1].split(",")
                line_number = int(temp[0])
                exec_number = int(temp[1])
                if exec_number > 0:
                    if line_number not in line_number_to_func_name:
                        continue # not within any function that may interest us
                    func_name = line_number_to_func_name[line_number]
                    summary[func_name]['covered_lines'].add(line_number)

# ignore functions with zero lines - it's possible in several cases
func_list = filter(lambda k: summary[k]['end_line'] > summary[k]['start_line'], summary.keys())
# sort by coverage
func_list = sorted(func_list, key = lambda k: len(summary[k]['covered_lines']) / (summary[k]['end_line'] - summary[k]['start_line']))

print("Filename,Function,Lines Covered,Lines Total,Percentage")
for func_name in func_list:
    lines_covered = len(summary[func_name]['covered_lines'])
    lines_total = summary[func_name]['end_line'] - summary[func_name]['start_line']
    print("{},{},{},{},{:.1f}%".format(summary[func_name]['file_name'], func_name, lines_covered, lines_total, lines_covered*100/lines_total))
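
Once coverage.csv exists, it can be post-processed with a few lines of Python. The following is only a sketch, assuming the column header printed by the script above ("Filename,Function,Lines Covered,Lines Total,Percentage"). The sample rows, the file paths, the function names and the helper `poorly_covered` are made up for illustration and are not real results.

```python
import csv
import io

# Hypothetical rows in the format the script prints; not real coverage data.
SAMPLE_CSV = """Filename,Function,Lines Covered,Lines Total,Percentage
/src/a.c,example_in,0,12,0.0%
/src/a.c,example_out,6,10,60.0%
/src/b.c,example_send,9,9,100.0%
"""


def poorly_covered(csv_file, threshold=50.0):
    """Return (function, filename, percentage) for rows below the threshold."""
    rows = []
    for row in csv.DictReader(csv_file):
        # The Percentage column carries a trailing '%' sign, e.g. "60.0%".
        pct = float(row["Percentage"].rstrip("%"))
        if pct < threshold:
            rows.append((row["Function"], row["Filename"], pct))
    return rows


if __name__ == "__main__":
    for name, path, pct in poorly_covered(io.StringIO(SAMPLE_CSV)):
        print("{} ({}): {:.1f}%".format(name, path, pct))
```

To run it against a real report, pass `open("coverage.csv", newline="")` instead of the in-memory sample.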