Hi, I would like to share a script that determines the code coverage of functions that are directly accessible from SQL.
To use the script, you first have to build a code coverage report as usual:
```
git clean -dfx
meson setup --buildtype debug -DPG_TEST_EXTRA="kerberos ldap ssl" \
    -Db_coverage=true -Dldap=disabled -Dssl=openssl -Dcassert=true \
    -Dtap_tests=enabled -Dprefix=/home/eax/pginstall build
ninja -C build
PG_TEST_EXTRA=1 meson test -C build
ninja -C build coverage-html
```
Note that, to my knowledge at least, this only works on Linux.
Then execute:
```
./coverage-analyze.py > coverage.csv
```
This will give you the functions listed in pg_proc.dat and where to find
them in the source code, sorted by code coverage. These functions are
easy to cover with tests, see for instance [1] and similar patches in
`git log`. I believe this can be useful to newcomers looking for an
idea for a first patch.
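For instance, here is a minimal sketch for picking the least-covered functions out of the resulting CSV. It assumes the column names printed by the script ("Filename,Function,Lines Covered,Lines Total,Percentage"); the helper name `least_covered`, the threshold and the sample rows are made up for illustration only:

```python
import csv
import io


def least_covered(csv_file, max_percent=50.0):
    """Return (function, file, percent) for rows at or below max_percent."""
    rows = []
    for row in csv.DictReader(csv_file):
        # The script prints the percentage with a trailing '%'
        percent = float(row["Percentage"].rstrip("%"))
        if percent <= max_percent:
            rows.append((row["Function"], row["Filename"], percent))
    return rows


# Made-up sample in the same format as coverage.csv
sample = io.StringIO(
    "Filename,Function,Lines Covered,Lines Total,Percentage\n"
    "/src/a.c,func_a,0,10,0.0%\n"
    "/src/b.c,func_b,9,10,90.0%\n"
)
for name, path, percent in least_covered(sample):
    print(name, path, percent)
```

With a real report you would pass `open("coverage.csv", newline="")` instead of the in-memory sample.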
You can also modify the script to check the coverage of particular
functions you are interested in. Just modify this part accordingly:
```
functions = set()
with open("src/include/catalog/pg_proc.dat") as f:
    proc_data = f.read()
    re_str = """prosrc[ ]+=>[ ]+['"](.*?)['"]"""
    for m in re.finditer(re_str, proc_data):
        func_name = m.group(1)
        functions.add(func_name)
```
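To illustrate what this regular expression picks up, here is a small self-contained sketch that runs it against a fragment written in the pg_proc.dat style. The two entries are illustrative rather than copied from the real file, and the names `sample`, `found` and `only_these` are mine:

```python
import re

# Illustrative fragment in the pg_proc.dat format
sample = """
{ oid => '1242', descr => 'I/O',
  proname => 'boolin', prorettype => 'bool', proargtypes => 'cstring',
  prosrc => 'boolin' },
{ oid => '1243', descr => 'I/O',
  proname => 'boolout', prorettype => 'cstring', proargtypes => 'bool',
  prosrc => 'boolout' },
"""

# Same pattern as in the script: captures the value of each prosrc field
re_str = """prosrc[ ]+=>[ ]+['"](.*?)['"]"""
found = {m.group(1) for m in re.finditer(re_str, sample)}
print(sorted(found))

# One way to limit the analysis to particular functions is to
# hard-code the set instead of reading pg_proc.dat:
only_these = {"boolin"}
print(sorted(found & only_these))
```

In the script itself, assigning such a hard-coded set to `functions` should be enough, since the rest of the code only checks membership in that set.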
If you want to know the coverage of *all* functions, just comment out this part:
```
if current_func_name not in functions:
    current_func_name = None
    continue
```
Please find attached the script and the compressed report. I also
uploaded a copy of the report to Google Drive [2].
As always, your thoughts and feedback are most welcome.
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=4f071349
[2]: https://docs.google.com/spreadsheets/d/1i3_tfBjMO3QbnVVnWb1uwhp5R_gdmjdx1Q9eXmjoUUA/edit?usp=sharing
--
Best regards,
Aleksander Alekseev
coverage.csv.tgz
Description: GNU Zip compressed data
#!/usr/bin/env python3
# coverage-analyze.py
# Aleksander Alekseev 2025

import re
import sys

functions = set()
with open("src/include/catalog/pg_proc.dat") as f:
    proc_data = f.read()
    re_str = """prosrc[ ]+=>[ ]+['"](.*?)['"]"""
    for m in re.finditer(re_str, proc_data):
        func_name = m.group(1)
        functions.add(func_name)

# print("DEBUG functions found: {}".format(len(functions)))

# function_name -> {
#   'file_name': xxx
#   'start_line': yyy
#   'end_line': zzz
#   'covered_lines': set(covered line numbers)
# }
summary = {}

current_file_name = None
current_func_name = None
line_number_to_func_name = {}
with open("build/meson-logs/coverage.info") as f:
    for line in f:
        line = line.strip()
        if line.startswith("SF:"):
            current_file_name = line.split(":")[1]
            line_number_to_func_name = {}
            # print("DEBUG current file: '{}'".format(current_file_name))
        elif line.startswith("FN:"):
            temp = line.split(":")
            temp = temp[1].split(",")
            func_start_line = int(temp[0])
            func_name = temp[1]

            if current_func_name is not None:
                summary[current_func_name]['end_line'] = func_start_line - 1
                for num in range(summary[current_func_name]['start_line'], summary[current_func_name]['end_line']+1):
                    line_number_to_func_name[num] = current_func_name

            current_func_name = func_name
            # print("DEBUG found function '{}'".format(func_name))
            if current_func_name not in functions:
                current_func_name = None
                continue

            summary[current_func_name] = {
                'file_name': current_file_name,
                'start_line': func_start_line,
                'end_line': None,
                'covered_lines': set()
            }
        else:
            if current_func_name is not None:
                if summary[current_func_name]['end_line'] is None:
                    summary[current_func_name]['end_line'] = sum(1 for _ in open(current_file_name))
                for num in range(summary[current_func_name]['start_line'], summary[current_func_name]['end_line']+1):
                    line_number_to_func_name[num] = current_func_name
                current_func_name = None

            if line.startswith("DA:"):
                temp = line.split(":")
                temp = temp[1].split(",")
                line_number = int(temp[0])
                exec_number = int(temp[1])
                if exec_number > 0:
                    if line_number not in line_number_to_func_name:
                        continue  # not within any function that may interest us
                    func_name = line_number_to_func_name[line_number]
                    summary[func_name]['covered_lines'].add(line_number)

# ignore functions with zero lines - it's possible in several cases
func_list = filter(lambda k: summary[k]['end_line'] > summary[k]['start_line'], summary.keys())

# sort by coverage
func_list = sorted(func_list, key = lambda k: len(summary[k]['covered_lines']) / (summary[k]['end_line'] - summary[k]['start_line']))

print("Filename,Function,Lines Covered,Lines Total,Percentage")
for func_name in func_list:
    lines_covered = len(summary[func_name]['covered_lines'])
    lines_total = summary[func_name]['end_line'] - summary[func_name]['start_line']
    print("{},{},{},{},{:.1f}%".format(summary[func_name]['file_name'], func_name, lines_covered, lines_total, lines_covered*100/lines_total))
