Package: libgdbm6
Version: 1.18.1-4
Severity: normal
Tags: upstream patch

Dear Maintainer,

We are using gdbm through Python and discovered that on Debian buster our
project started to leak file descriptors. After investigating the issue, we
came to the conclusion that it is caused by a bug in the libgdbm6 library.

To reproduce, run the following commands in a debian:buster Docker image:

apt update && apt install -y python3 python3-gdbm lsof
cat >/tmp/repro.py <<'EOL'
#!/usr/bin/python3
import dbm.gnu  # IMPORTANT
import os
import time
import subprocess


def main():
    db_file = "/tmp/ti-db-hanging.db"

    # call reorganize() multiple times
    for _ in range(5):
        db = dbm.open(os.path.abspath(db_file), "cf")
        db.reorganize()
        time.sleep(1)
        db.close()
    time.sleep(5)

    # check if we still hold any file handles for the db_file even after
    # closing it
    pid = os.getpid()
    x = subprocess.run(f"lsof -p {pid} | grep {db_file}", shell=True,
                       stdout=subprocess.PIPE)
    # grep exits with 0 on a match and 1 on no match; anything else is an error
    assert x.returncode in (0, 1)
    res = x.stdout.decode()
    if not res:
        print("check open files: ok")
    else:
        print(res)


if __name__ == "__main__":
    main()
EOL
python3 /tmp/repro.py

At the end of the script, lsof should not report any open file handles for
the database file, but it shows 5 of them.
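
As a cross-check that does not depend on lsof or grep, the same information
can be read from /proc. The following is only a sketch for Linux; the helper
name open_references is made up for illustration and is not part of the
original reproduction. It matches by path prefix, so it also catches entries
that the kernel has marked with a " (deleted)" suffix.

```python
import os


def open_references(path, pid="self"):
    """List /proc entries of `pid` that still refer to `path` (Linux only)."""
    refs = []

    # Open file descriptors: each entry in /proc/<pid>/fd is a symlink.
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            link = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor went away while we were iterating
        if link.startswith(path):
            refs.append(f"fd {name} -> {link}")

    # Memory mappings: the sixth column of /proc/<pid>/maps is the pathname.
    with open(f"/proc/{pid}/maps") as maps:
        for line in maps:
            fields = line.split(None, 5)
            if len(fields) == 6 and fields[5].startswith(path):
                refs.append(f"map {fields[0]} -> {fields[5].strip()}")

    return refs
```

In the reproduction script, printing open_references(os.path.abspath(db_file))
after the last close() could replace the lsof pipeline.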

Short summary of the issue:

When gdbm reorganize is called from Python, it ends up calling gdbm_recover
in recover.c. This creates a new temporary file into which the contents of
the current file are copied (after some processing). Once this is done, the
new file is renamed over the old one. When mmap is used, the mapping for the
temporary file is only removed in error situations. As a result, the process
keeps file handles on DELETED files when it runs reorganize more than once
in its lifetime.
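
The underlying kernel behaviour can be shown without gdbm at all. The sketch
below is an illustration under assumptions (Linux with /proc mounted; the file
names and the helper names mappings_for and leak_one_mapping are hypothetical),
not the code path gdbm takes. It maps a file, closes the descriptor, renames
another file over the mapped one, and then finds the mapping still listed for
the now deleted file.

```python
import mmap
import os
import tempfile


def mappings_for(path):
    """Return the lines of /proc/self/maps that mention `path` (Linux only)."""
    with open("/proc/self/maps") as maps:
        return [line.rstrip("\n") for line in maps if path in line]


def leak_one_mapping(workdir):
    """Map a file, close our descriptor, then rename another file over it."""
    target = os.path.join(workdir, "demo.db")
    replacement = os.path.join(workdir, "demo.db.tmp")

    for name in (target, replacement):
        with open(name, "wb") as f:
            f.write(b"\0" * mmap.PAGESIZE)

    fd = os.open(target, os.O_RDWR)
    region = mmap.mmap(fd, mmap.PAGESIZE)  # shared read/write mapping
    os.close(fd)  # the mapping outlives our descriptor (CPython's mmap
                  # object also keeps its own duplicate of it)

    os.replace(replacement, target)  # the mapped inode is now unlinked
    return target, region


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target, region = leak_one_mapping(os.path.realpath(workdir))
        print(mappings_for(target))  # the stale mapping, marked "(deleted)"
        region.close()               # unmapping releases the deleted file
        print(mappings_for(target))
```

Only the explicit unmap (region.close() here) makes the entry disappear, which
is the step that is missing for the temporary file in the success path.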

To fix the issue, I prepared the following patch for gdbm:

When mmap is used, the memory mapping of the temporary file is not removed
in the recover function. This can leave open file descriptors to deleted
files if the recover function is called multiple times.

--- a/src/recover.c
+++ b/src/recover.c
@@ -168,15 +168,20 @@
   dbf->bucket_changed = new_dbf->bucket_changed;
   dbf->second_changed = new_dbf->second_changed;
 
-  free (new_dbf->name);
-  free (new_dbf);
-
 #if HAVE_MMAP
   /* Re-initialize mapping if required */
   if (dbf->memory_mapping)
     _gdbm_mapped_init (dbf);
+
+  /* remove the old memory mapping to the temporary file name */
+  if (new_dbf->mapped_region){
+    _gdbm_mapped_unmap(new_dbf);
+  }
 #endif
 
+  free (new_dbf->name);
+  free (new_dbf);
+
   /* Make sure the new database is all on disk. */
   gdbm_file_sync (dbf);

-- System Information:
Debian Release: 10.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.4.0-66-generic (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE=C (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: unable to detect

Versions of packages libgdbm6 depends on:
ii  libc6  2.28-10

libgdbm6 recommends no packages.

Versions of packages libgdbm6 suggests:
pn  gdbm-l10n  <none>

-- no debconf information