Source: gdbm Version: 1.18.1-4 Severity: normal Dear Maintainer,
gdbm_reorganize() leaks open file descriptors. It can be re-produces by using the python3-gdbm binding: python3 -c 'import os,subprocess;import dbm.gnu;DB = "./db.gdbm";db = dbm.gnu.open(DB, "csu");db.reorganize();db.close();subprocess.call(f"lsof -p {os.getpid()}|grep {DB}", shell=True)' > python 2119 phahn DEL-W REG 254,0 6293029 ./db.gdbm > python 2119 phahn DEL-W REG 254,0 6293042 ./db.gdbm For each call to reorganize() the DB is copied to a new file which is then again memory-mapped. This takes one FD each time. # dpkg -s libgdbm6 Package: libgdbm6 Source: gdbm Version: 1.18.1-4 # dpkg -s python3-gdbm Package: python3-gdbm Source: python3-stdlib-extensions Version: 3.7.3-1 Debian-9-Stretch 1.8.3-14 Debian-10-Buster 1.18.1-4 (affected) Debian-11-Bullseye 1.19-2 (affected) Debian-12-Bookworm 1.23-3 (fixed) I have bisected the issue: git bisect start '--term-new' 'fixed' '--term-old' 'broken' # fixed: [331f05ac9c58d358806fe1bcba88a01467ab0895] Bugfix git bisect fixed 331f05ac9c58d358806fe1bcba88a01467ab0895 # broken: [778cc81d55aecd6344d577919cec73e4e6980e2e] Version 1.18 git bisect broken 778cc81d55aecd6344d577919cec73e4e6980e2e # broken: [611cac791f192834d49cd1bd8cfab76a190bfc40] Document new gdbmtool options git bisect broken 611cac791f192834d49cd1bd8cfab76a190bfc40 # fixed: [3052f2b51eaa9ba047d43ae97581ae1fd895c131] Add missing include git bisect fixed 3052f2b51eaa9ba047d43ae97581ae1fd895c131 # fixed: [e4089536f849644b12d9647aff12f2dac312b940] Version 1.21 released git bisect fixed e4089536f849644b12d9647aff12f2dac312b940 # broken: [038872b1582c6b99cc352551875aba0e81f6b5f0] Minor fix git bisect broken 038872b1582c6b99cc352551875aba0e81f6b5f0 # fixed: [a62815cdf0b60725d6445968ce3d5c39e912eb9c] Update docs git bisect fixed a62815cdf0b60725d6445968ce3d5c39e912eb9c # fixed: [cc7051ae2ea384863937083a3a60a5a008d511a5] gdbmshell: get rid of remaining globals git bisect fixed cc7051ae2ea384863937083a3a60a5a008d511a5 # fixed: [d19407eaa4b00a724c4ff3744c2f49269183da26] gdbmtool: bugfixes git bisect fixed d19407eaa4b00a724c4ff3744c2f49269183da26 # fixed: [f77b0273db2d1f0cc1ba2d3256acfab1bda1f584] Fix duplicated mmap in gdbm_recover git bisect fixed f77b0273db2d1f0cc1ba2d3256acfab1bda1f584 # first fixed commit: [f77b0273db2d1f0cc1ba2d3256acfab1bda1f584] Fix duplicated mmap in gdbm_recover Sadly that patch is not easy to backport because GDBM changes some fundamental details, e.g. "New bucket cache" in v1.21 and "Bucket cache switched from balanced tree to hash table" in v1.23 A similar bug was reported for Python <https://bugs.python.org/issue13947>, which links to <https://git.gnu.org.ua/gdbm.git/commit/?id=a0d6c1a8> which is ALSO required but not enough. I have back-ported that patch and so far it works fine, but a second set of eyes or even a review from upstream would be good. -- System Information: Debian Release: 11.5 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable'), (50, 'testing') Architecture: amd64 (x86_64) Kernel: Linux 5.10.0-18-amd64 (SMP w/4 CPU threads) Kernel taint flags: TAINT_WARN Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE=de:en_US Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) LSM: AppArmor: enabled
>From ffeafe9692da956703d28f1f16c81a027a0e6869 Mon Sep 17 00:00:00 2001 Message-Id: <ffeafe9692da956703d28f1f16c81a027a0e6869.1666685456.git.h...@univention.de> From: Sergey Poznyakoff <g...@gnu.org> Date: Mon, 8 Apr 2019 22:17:10 +0300 Subject: [PATCH 1/2] Preserve locking type during database reorganization * src/recover.c (_gdbm_finish_transfer): Preserve locking type. --- src/recover.c | 1 + 1 file changed, 1 insertion(+) diff --git src/recover.c src/recover.c index 7cc20a2..f1603d9 100644 --- src/recover.c +++ src/recover.c @@ -155,6 +155,7 @@ _gdbm_finish_transfer (GDBM_FILE dbf, GDBM_FILE new_dbf, free (dbf->bucket_cache); } + dbf->lock_type = new_dbf->lock_type; dbf->desc = new_dbf->desc; dbf->header = new_dbf->header; dbf->dir = new_dbf->dir; -- 2.30.2
>From 40c6da6ca9972ff0b9d7d084fe118c765808fef6 Mon Sep 17 00:00:00 2001 Message-Id: <40c6da6ca9972ff0b9d7d084fe118c765808fef6.1666685456.git.h...@univention.de> In-Reply-To: <ffeafe9692da956703d28f1f16c81a027a0e6869.1666685456.git.h...@univention.de> References: <ffeafe9692da956703d28f1f16c81a027a0e6869.1666685456.git.h...@univention.de> From: Sergey Poznyakoff <g...@gnu.org> Date: Wed, 11 Aug 2021 23:01:37 +0300 Subject: [PATCH 2/2] Fix duplicated mmap in gdbm_recover * src/recover.c (_gdbm_finish_transfer): Reuse memory mapping from the intermediate dbm structure. --- src/recover.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git src/recover.c src/recover.c index f1603d9..893e0a2 100644 --- src/recover.c +++ src/recover.c @@ -169,14 +169,14 @@ _gdbm_finish_transfer (GDBM_FILE dbf, GDBM_FILE new_dbf, dbf->bucket_changed = new_dbf->bucket_changed; dbf->second_changed = new_dbf->second_changed; + dbf->mapped_size_max = new_dbf->mapped_size_max; + dbf->mapped_region = new_dbf->mapped_region; + dbf->mapped_size = new_dbf->mapped_size; + dbf->mapped_pos = new_dbf->mapped_pos; + dbf->mapped_off = new_dbf->mapped_off; + free (new_dbf->name); free (new_dbf); - - #if HAVE_MMAP - /* Re-initialize mapping if required */ - if (dbf->memory_mapping) - _gdbm_mapped_init (dbf); - #endif /* Make sure the new database is all on disk. */ gdbm_file_sync (dbf); -- 2.30.2