Let me preface this by saying we are doing something unorthodox: we are running
RPM 4.12.90 on macOS 10.12.
It turns out that on Linux, querying and writing to the database at the same
time can cause corruption. On macOS, just querying in parallel can cause it. We
can replicate it by running `for i in {1..30}; do /bin/rpm -qa & done`. I have
some info about how and why this happens. Using sandbox-exec, I was able to
trace what `rpm -qa` does and what `rpm --rebuilddb` does to fix the corruption.
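
A slightly more controlled version of that reproducer, as a minimal sketch: it launches the queries in the background, waits for all of them, and prints how many exited non-zero. The function name `run_parallel` is made up for illustration, and it assumes that a corrupted environment shows up as a non-zero exit status from `rpm -qa`, which has not been verified here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run COUNT copies of a command in parallel,
# wait for every one of them, and print the number that failed.
run_parallel() {
  local count=$1; shift
  local pids=() failures=0 pid i
  for ((i = 0; i < count; i++)); do
    "$@" >/dev/null 2>&1 &      # discard output, keep only the exit status
    pids+=("$!")
  done
  for pid in "${pids[@]}"; do
    wait "$pid" || failures=$((failures + 1))
  done
  echo "$failures"
}
```

It would be called as `run_parallel 30 /bin/rpm -qa`, which matches the one-liner above but also collects the exit statuses.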
BDB `mmap`s regions of the db to increase performance, but then backs those
regions with files on the filesystem. I'm not sure why it does this, as I would
imagine mmap already takes care of flushing changes back to the db. Perhaps the
db regions are "decompressed" and more performant? Source:
https://web.stanford.edu/class/cs276a/projects/docs/berkeleydb/ref/env/region.html
What is happening is that `rpm -qa` is actually writing to the files behind
these file-backed mmap'ed regions:
```
[root@redacted ~]# grep write /tmp/trace/trace_output.sb
(allow file-write-data (path "/dev/dtracehelper"))
(allow sysctl-write (sysctl-name "kern.procname"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/.dbenv.lock"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.002"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.003"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/__db.004"))
(allow file-write-data (path "/opt/yum/var/lib/rpm/.dbenv.lock"))
```
The way `rpm --rebuilddb` fixes this is by unlinking the regions:
```
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.001"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.002"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.003"))
(allow file-write-unlink (path "/opt/yum/var/lib/rpm/__db.004"))
```
It turns out that unlinking them by hand also fixes the corruption. I haven't
figured out why the corrupted regions don't get flushed back to the real db,
which would corrupt it as well.
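
The manual fix can be sketched as a small shell function. The name `remove_bdb_regions` is hypothetical, the default directory is the one from the trace above, and it is assumed that no other rpm process has the environment open while the files are removed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unlink the BDB region files (__db.001, __db.002, ...)
# in an rpm database directory. Packages and the index files do not match
# the pattern, so they are left alone.
remove_bdb_regions() {
  local dbdir=${1:-/opt/yum/var/lib/rpm}
  # If nothing matches, the pattern stays literal and rm -f ignores it.
  rm -f -- "$dbdir"/__db.[0-9][0-9][0-9]
}
```

Calling `remove_bdb_regions` with no argument targets `/opt/yum/var/lib/rpm`; the region files should then be recreated the next time rpm opens the database.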
I've written a sandbox profile that disallows writes to the file-backed mmap'ed
regions. This means that we can call `sandbox-exec -f $sandbox_profile rpm -qa`
to read safely, without the query being able to corrupt the db:
```
[root@redacted ~]# sandbox-exec -f rpm-query-nowrite.sb -- /bin/rpm -qa &>/dev/null
[root@redacted ~]# ls -la /var/lib/rpm/__db.00*
-rw-r--r-- 1 root root 24576 Jun 7 10:19 /var/lib/rpm/__db.001
-rw-r--r-- 1 root root 507904 Jun 7 10:19 /var/lib/rpm/__db.002
-rw-r--r-- 1 root root 1318912 Jun 7 10:19 /var/lib/rpm/__db.003
-rw-r--r-- 1 root root 811008 Jun 7 10:19 /var/lib/rpm/__db.004
[root@redacted ~]# sandbox-exec -f rpm-query-nowrite.sb -- /bin/rpm -qa &>/dev/null
[root@redacted ~]# ls -la /var/lib/rpm/__db.00*
-rw-r--r-- 1 root root 24576 Jun 7 10:19 /var/lib/rpm/__db.001
-rw-r--r-- 1 root root 507904 Jun 7 10:19 /var/lib/rpm/__db.002
-rw-r--r-- 1 root root 1318912 Jun 7 10:19 /var/lib/rpm/__db.003
-rw-r--r-- 1 root root 811008 Jun 7 10:19 /var/lib/rpm/__db.004
```
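
For anyone who wants to try this, here is a guess at what such a profile could look like, written out by a small shell function. This is not the actual `rpm-query-nowrite.sb`: the function name `write_nowrite_profile` is hypothetical, the rules are an assumption based on the `file-write-data` operations in the trace above, and the database path is the one from that trace.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sandbox profile that allows everything
# except data writes to the BDB region files. The profile contents are
# an illustrative assumption, not the profile used in the runs above.
write_nowrite_profile() {
  local out=${1:-rpm-query-nowrite.sb}
  # Quoted delimiter: the profile text is written verbatim, unexpanded.
  cat >"$out" <<'EOF'
(version 1)
(allow default)
; Assumed rule: block data writes to __db.001, __db.002, ...
(deny file-write-data
      (regex #"^/opt/yum/var/lib/rpm/__db\.[0-9]+$"))
EOF
}
```

After `write_nowrite_profile rpm-query-nowrite.sb`, the resulting file can be passed to `sandbox-exec -f` as in the commands above. Whether `rpm -qa` runs cleanly under exactly these rules would need to be checked.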
Is it possible there is a bug in the way you file-back your mmap'ed regions?
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/232
_______________________________________________
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint