Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2020-03-24 Thread Panu Matilainen
Closed #232.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#event-3160168125
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint


Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2020-03-24 Thread Panu Matilainen
Making BDB more reliable would require using transactions there, but that would 
be an incompatible change, which is the last thing we want to do now that we are 
just about to deprecate BDB. Unfortunately, that means we cannot do anything 
about this on the Berkeley DB backend.

If other database backends (ndb and sqlite to be exact) were to exhibit such 
behavior, please file separate bugs.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread zhangjianwei
While testing dcrpm, I ran `lsof -F p /var/lib/rpm/.dbenv.lock`, and lsof 
displayed the PIDs of all processes, including PID 1, instead of only the PIDs 
of the processes that have the .dbenv.lock file open.

This behavior may have appeared after I installed typing (the Python module) 
separately. 
It does not necessarily reproduce every time. 
I am therefore afraid that dcrpm will run `lsof -F p /var/lib/rpm/.dbenv.lock`, 
get the wrong PIDs, and kill them all.
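
One way to guard against the failure described above is to sanity-check the PID list before acting on it. The following is only a sketch, not dcrpm's actual logic: the function names and the default limit of 20 PIDs are assumptions made for illustration.

```shell
# Extract PIDs from `lsof -F p` style output read on stdin.
# Only lines of the form "p<digits>" are PIDs; other field lines are ignored.
parse_lsof_pids() {
    sed -n 's/^p\([0-9][0-9]*\)$/\1/p'
}

# Hypothetical safety check: reject a PID list (one PID per line on stdin)
# that contains PID 1 or has more entries than a caller-chosen limit.
pids_look_sane() {
    local max="${1:-20}"
    local count=0
    local pid
    while read -r pid; do
        [ "$pid" = "1" ] && return 1
        count=$((count + 1))
    done
    [ "$count" -le "$max" ]
}
```

A caller could then run `lsof -F p /var/lib/rpm/.dbenv.lock | parse_lsof_pids | pids_look_sane` and refuse to kill anything when the check fails.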




Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread Davide Cavalca
Yes, it should work just fine. We actually have packaging for dcrpm at 
https://github.com/facebookincubator/rpm-backports/tree/master/rpms/dcrpm that 
we use on CentOS 7 (and macOS) with no issues.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread zhangjianwei
@jaymzh I see that dcrpm uses typing (a feature added in Python 3.5), but my 
project runs on CentOS 7.5.1804 with Python 2.7.5. Is typing (and therefore 
dcrpm) compatible with Python 2.7.5?



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread Phil Dibowitz
You could instead run it as nonroot: `su -c "rpm -q " nobody` (or 
possibly `su -c "rpm -q " -s /bin/bash nobody` if nobody's shell is 
not a real shell).

Or you can make sure [dcrpm](https://github.com/facebookincubator/dcrpm) runs 
on a regular basis to detect and correct problems in your RPM db... but 
preventing the issues is certainly better.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread zhangjianwei
Is there no way to avoid it?



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread zhangjianwei
@jaymzh Yes, the processes are running as root.

`the memmap() the DB read-write, and write it back out when they are done... 
hence if then something changes at the same time, one of those process can 
write back old pages.`

`# ls -lthra /var/lib/rpm
total 70M
-rw-r--r--.  1 root root    0 Oct 23 18:04 .dbenv.lock
-rw-r--r--.  1 root root    0 Oct 23 18:04 .rpm.lock
-rw-r--r--.  1 root root 8.0K Nov 21 11:03 Triggername
drwxr-xr-x. 52 root root 4.0K Nov 21 11:04 ..
-rw-r--r--.  1 root root 8.0K Nov 21 11:04 Conflictname
-rw-r--r--.  1 root root  60M Nov 21 16:46 Packages
-rw-r--r--.  1 root root  48K Nov 21 16:46 Name
-rw-r--r--.  1 root root 4.2M Nov 21 16:46 Basenames
-rw-r--r--.  1 root root  28K Nov 21 16:46 Group
-rw-r--r--.  1 root root 256K Nov 21 16:46 Requirename
-rw-r--r--.  1 root root 1.9M Nov 21 16:46 Providename
-rw-r--r--.  1 root root  24K Nov 21 16:46 Obsoletename
-rw-r--r--.  1 root root 2.1M Nov 21 16:46 Dirnames
-rw-r--r--.  1 root root  20K Nov 21 16:46 Installtid
-rw-r--r--.  1 root root  44K Nov 21 16:46 Sigmd5
-rw-r--r--.  1 root root  76K Nov 21 16:46 Sha1header
drwxr-xr-x.  2 root root 4.0K Nov 21 17:21 .
-rw-r--r--.  1 root root 344K Nov 22 11:30 __db.001
-rw-r--r--.  1 root root 188K Nov 22 11:30 __db.002
-rw-r--r--.  1 root root 1.3M Nov 22 11:30 __db.003`

Can't /var/lib/rpm/.dbenv.lock and /var/lib/rpm/.rpm.lock ensure that multiple 
processes execute serially?

I don't know much about database.

`>>> the memmap() the DB read-write, and write it back out when they are 
done... hence if then something changes at the same time, one of those process 
can write back old pages.`

This indicates that BDB uses mmap(), and that multiple `rpm -q software.xxx` 
processes changed the memory pages, which led to old pages being written back. 




Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread Phil Dibowitz
@jianwei1216 Are the processes running as root? If so, they mmap() the DB 
read-write, and write it back out when they are done... hence if something 
changes at the same time, one of those processes can write back old pages.
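
The stale write-back described above can be modelled with plain files. This is a deliberately simplified, sequential sketch: it uses ordinary file copies rather than real mmap'd BDB regions, and the file names and contents are made up for illustration.

```shell
# Simulate a lost update: a "reader" that writes back the copy it took at
# open time overwrites a change made by a "writer" in the meantime.
simulate_stale_writeback() {
    local dir
    dir=$(mktemp -d) || return 1
    local db="$dir/Packages"          # stand-in file, not a real rpmdb
    local reader_copy="$dir/reader.copy"

    echo "pkg-1.0" > "$db"            # initial database contents
    cp "$db" "$reader_copy"           # reader takes its own view of the data
    echo "pkg-2.0" > "$db"            # writer commits an update
    cp "$reader_copy" "$db"           # reader finishes and writes its old view back

    cat "$db"                         # the writer's update is gone
    rm -rf "$dir"
}
```

Calling `simulate_stale_writeback` prints `pkg-1.0`: the update to `pkg-2.0` has been overwritten by the reader's older copy.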



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2019-11-21 Thread zhangjianwei
Recently, I have encountered similar problems in my CentOS 7 environment.
`[root@controller-3 ~]# uname -a
Linux controller-3 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 
x86_64 x86_64 x86_64 GNU/Linux
[root@controller-3 ~]# cat /etc/redhat-release 
CentOS Linux release 7.5.1804 (Core) 
[root@controller-3 ~]# rpm -q rpm
rpm-4.11.3-32.el7.x86_64
[root@controller-3 ~]# rpm -q libdb
libdb-5.3.21-24.el7.x86_64`

But I can't reproduce it!

This is the context in which the problem occurred:
1. 8 processes run concurrently, each executing rpm -q software.xxx;
2. Each process spawns 8 rpm -q software.xxx invocations per minute;
3. This has been running for half a year;
4. Millions of rpm -q software.xxx processes have been spawned on CentOS 7;
5. Memory was exhausted;
6. After recovering the rpmdb (/var/lib/rpm), the OS recovered.


I really want to know the root cause of this problem!!!
Can you tell me?

Thank you very much!
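
As a rough check of the "millions of processes" figure in the list above, the numbers can be multiplied out. The 182-day value is an assumption standing in for "half a year" of continuous operation.

```shell
# Rough count of rpm queries implied by the description above.
# Assumption: "half a year" is taken as 182 days of continuous operation.
processes=8
queries_per_process_per_minute=8
days=182

per_minute=$((processes * queries_per_process_per_minute))   # 64
per_day=$((per_minute * 60 * 24))                            # 92160
total=$((per_day * days))                                    # 16773120

echo "$per_minute queries/minute, $per_day queries/day, $total in $days days"
```

Under these assumptions the total is about 16.8 million queries, which is consistent with the "millions" reported.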





Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-15 Thread Jeff Johnson
Guesses are worthless, particularly when you haven't the time to identify your 
problem.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-14 Thread Phil Dibowitz
I'd be very surprised if this wasn't a cause of many of our problems on Linux 
too. I suspect a common case is someone querying the DB (`rpm -qa`) when a 
transaction happens, and then the theoretically read-only operation ends up 
writing stale data back, but I haven't actually had the time to sit down and 
try to pinpoint it.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-14 Thread Jeff Johnson
Um, *this* issue is about RPM behavior on OS X and a sandbox policy prohibiting 
writes.

Open a separate issue with details about your Facebook problems if you 
wish.

The output of db_stat showing the state of locks is usually the starting point 
to identifying the root cause.
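
A minimal sketch of collecting that output, assuming the Berkeley DB utilities are installed under the unversioned name `db_stat` (some systems ship versioned names instead) and that the rpmdb lives in /var/lib/rpm. The wrapper function name is hypothetical.

```shell
# Print Berkeley DB lock subsystem statistics for an rpmdb environment.
# -h selects the environment home directory, -c requests lock statistics.
show_lock_stats() {
    local dbhome="${1:-/var/lib/rpm}"
    if ! command -v db_stat >/dev/null 2>&1; then
        echo "db_stat not found in PATH" >&2
        return 127
    fi
    db_stat -h "$dbhome" -c
}
```

For example, `show_lock_stats /var/lib/rpm > lock-stats.txt` would capture the state to attach to a bug report.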



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-14 Thread Phil Dibowitz
Hence this Issue was filed. :)



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-14 Thread Jeff Johnson
@jaymzh: the issue with dcrpm is the costly detection and possibly imperfect 
analysis of "bad actors", not how often db_recover is run. dcrpm is treating a 
symptom rather than solving the root cause of whatever problem exists.





Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-13 Thread Phil Dibowitz
Note that we don't run db_recover unless we detect an issue. dcrpm works by 
trying to detect common issues and apply the gentlest possible recovery, from 
running db_recover to finding locks held by bad actors, among other things.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-13 Thread Jeff Johnson
@pmatilai: yup, using DB_RECOVER requires configuring Berkeley DB correctly 
when opening the environment.

Hint: execing dbXY_recover on an idle database (already protected by an 
exclusive write lock) fixes stale locks. Performing that operation when needed, 
not every 15 minutes with a Facebook fork bomb, seems like a saner approach.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-13 Thread ニール・ゴンパ
> DB_RECOVER requires DB_INIT_TXN, which is incompatible with DB_INIT_CDB that 
> rpm.org still uses. And enabling TXN on BDB runs into all sorts of fun with 
> BDB log file paths across chroots, requires additional infra in the code etc 
> and whatnot. All solvable issues no doubt, but it piles up so it's not this 
> entirely trivial "just try reopen with a different flag" thing from the CDB 
> mode starting point.

I suspect that this is where the "RPM ACID" stuff comes into play, @n3npq ?



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-13 Thread Phil Dibowitz
On a related note... our fix-the-rpm-db program has been opensourced: 
https://github.com/facebookincubator/dcrpm

We run it every 15 minutes as a pre-script to configuration management.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2018-02-13 Thread Panu Matilainen
> I can also easily back port the core fix to the problem reported here, 
> implemented years ago @rpm5.org:
>
>   when DB_RUNRECOVERY is returned opening a BDB dbenv, then do the recovery 
> by setting a flag, and repeating the open one time, thereby running recovery.

DB_RECOVER requires DB_INIT_TXN, which is incompatible with DB_INIT_CDB that 
rpm.org still uses. And enabling TXN on BDB runs into all sorts of fun with BDB 
log file paths across chroots, requires additional infra in the code etc and 
whatnot. All solvable issues no doubt, but it piles up so it's not this 
entirely trivial "just try reopen with a different flag" thing from the CDB 
mode starting point.
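
For reference, the control flow of the quoted proposal (retry the open one time after running recovery) can be sketched with stand-in commands. This only illustrates the retry-once pattern; it does not address the CDB/TXN incompatibility described above. The function name, the caller-supplied command names, and the convention that exit status 2 means "run recovery" are all assumptions.

```shell
# Retry-once pattern: attempt an open; if it reports that recovery is needed,
# run recovery and repeat the open exactly one time.
# open_cmd and recover_cmd are hypothetical caller-supplied command names.
# Convention assumed here: open_cmd returns 2 to mean "run recovery".
open_with_one_retry() {
    local open_cmd="$1"
    local recover_cmd="$2"
    local rc
    "$open_cmd"
    rc=$?
    if [ "$rc" -eq 2 ]; then
        "$recover_cmd" || return $?
        "$open_cmd"
        rc=$?
    fi
    return "$rc"
}
```

Because the open is repeated at most once, a recovery that does not help still surfaces as an error instead of looping forever.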



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-13 Thread Jeff Johnson
@smorad: db-4.8.30 SHOULD be fine (but there really isn't any reason not to 
upgrade to more recent versions on OS X: I have direct personal/devel 
experience on OS X with all versions of BDB since db-4.5.x, been there, done 
that, WORKSFORME).



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-13 Thread Steven Morad
We are running `berkeley-db4-4.8.30-0.3`. It may be that a later release of bdb 
will contain a fix for this, as I see that we are now 2 major versions behind 
bdb stable.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-12 Thread Jeff Johnson
After looking at current rpm.org sources, RPM still initializes with ts->dbmode 
= DB_RDONLY, and then reopens the rpmdb with O_RW in rpmtsSetup(). That opens 
a race window (minor and mostly irrelevant with respect to RPM on OS X).

There is still the ability to disable fsync (or replace it with fdatasync) 
within RPM sources. I would verify what setting you are using with RPM on OS X. 
What "works" on Linux may not work on OS X.

BTW, what version of Berkeley DB are you using on OS X?



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-12 Thread Jeff Johnson
Here are the patches in libdb-5.3.28-20: the patches are trackable to bug 
reports and appear to have been fixed by "upstream"
`# License clarification patch
  # http://devel.trisquel.info/gitweb/?p=package-helpers.git;a=blob;f=helpers/DATA/db4.8/007-mt19937db.c_license.patch;h=1036db4d337ce4c60984380b89afcaa63b2ef88f;hb=df48d40d3544088338759e8bea2e7f832a564d48
  Patch25: 007-mt19937db.c_license.patch
  #Adds missing constant to Optcodes.java and changes ClassReader.java to use it. This makes package to build with Java 8.
  Patch26: java8-fix.patch
  # memp_stat fix provided by upstream (rhbz#1211871)
  Patch27: db-5.3.21-memp_stat-upstream-fix.patch
  # fix for mutexes not being released provided by upstream (rhbz#1277887)
  Patch28: db-5.3.21-mutex_leak.patch
  # fix for overflowing hash variable inside bundled lemon
  Patch29: db-5.3.28-lemon_hash.patch
`

Meanwhile, locking has not been shown to be a relevant issue to RPM on OS X.

For starters, 
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-307630055 
is/was a buffer overrun while retrieving statistics for 389 directory 
services. That is likely irrelevant to RPM, which doesn't attempt to read MPOOL 
statistics.

I would read and report on #1277887 but I am not authorized to access it. 
Judging from the comment, "mutexes not being released" is likelier to lead to a 
deadlock, not "corruption".

Meanwhile it's unclear whether any Fedora mutex patches are relevant on OS X: 
Linux/glibc NPTL locking with futexes is likely irrelevant on OS X.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-12 Thread Panu Matilainen
There's an endless row of bugs where the BDB environment gets corrupted, some 
of which have been BDB bugs (several found just in the last couple of years) 
that have been patched in Fedora/RHEL libdb but that upstream BDB 5.x does not 
have fixes for; dunno about 6.x, but there you run into the licensing side. So 
if you're running upstream BDB 5.x on Mac, you'll want to check those 
Fedora/RHEL patches to libdb. Some of the more exotic bugs have involved the 
kernel VM, virtualization and whatnot.
Fsync is only disabled on the first open of a newly created database (ie during 
fresh install), (iirc) never on existing database unless forced via 
configuration.

After adding the .dbenv.lock to serialize rpmdb open and close a few years ago 
(to work around what seems like a BDB bug), I haven't been able to reproduce 
environment corruption from parallel access in my setup but doesn't mean it 
doesn't happen in some other setup, version mix and whoknowswhat.

One possible workaround is to force use of private environment. That also means 
practically no locking, but at least it means queries will not corrupt anything 
(however queries themselves could return garbage if run in middle of 
write-operation). That's what happens if you run queries as non-privileged 
user, but since you can control permissions with sandboxing you can achieve the 
same by disallowing open of /var/lib/rpm/.dbenv.lock, which causes rpm to fall 
back to a private environment, meaning it won't open, much less write to, those 
__db.* files.



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-11 Thread Steven Morad
So corruption in this case means that any rpm operation that reads or writes to 
BDB will fail, because the db fails to open. In our case, all rpm operations 
are done as root. As you can see, this is reproducible as root:
```
[root@redacted ~]# for i in {1..30}; do /bin/rpm -qa &>/dev/null & done
[1] 84680
[2] 84681
[3] 84682
...
[30] 84710
[root@redacted ~]# rpm -qa
error: db4 error(-30974) from dbenv->open: DB_RUNRECOVERY: Fatal error, run 
database recovery
error: cannot open Packages index using db4 -  (-30974)
error: cannot open Packages database in /opt/yum/var/lib/rpm
error: db4 error(-30974) from dbenv->open: DB_RUNRECOVERY: Fatal error, run 
database recovery
error: cannot open Packages index using db4 -  (-30974)
error: cannot open Packages database in /opt/yum/var/lib/rpm
```

Also, please note that during sandboxing I did not disallow writes to the rpm 
directory, just to the disk-backed mmapped regions 
(`/opt/yum/var/lib/rpm/__db.*`). The .dbenv file locks are still written (not 
sure if these are the shared locks you are talking about). It sounds like the 
fsync stub may be responsible for what we are seeing here.
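
The failure shown in the transcript above can be detected by looking for the `DB_RUNRECOVERY` string in rpm's error output. This is a hedged sketch: the function names are hypothetical, and the probe command is passed in by the caller rather than hard-coded.

```shell
# Return 0 if the text on stdin contains the DB_RUNRECOVERY error shown above.
needs_recovery() {
    grep -q 'DB_RUNRECOVERY'
}

# Hypothetical health check: run a probe command (given as arguments),
# discard its stdout, and inspect only what it wrote to stderr.
rpmdb_needs_recovery() {
    local errors
    errors=$("$@" 2>&1 >/dev/null)
    printf '%s\n' "$errors" | needs_recovery
}
```

A caller might use `rpmdb_needs_recovery rpm -q rpm && echo "rpmdb needs recovery"` as a cheap check before a larger batch of queries.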



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-11 Thread ニール・ゴンパ
> Let me preface this by saying we are doing something unorthodox: we are 
> running RPM 4.12.90 on MacOS 10.12.

You're not the only one who does this. I do it as well. That said, I knew 
vaguely about this issue, but I couldn't pin down what caused it. :)



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-10 Thread Jeff Johnson
Note also that simply removing __db.00N files to "fix" problems can lead to 
other "corruption" problems, particularly when other accesses are running 
simultaneously.

The better fix for what you are calling "corruption" is to use "db_recover" 
with/without the -e option.
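
A small sketch of the two invocations, assuming the utility is installed as `db_recover` and that no other process is using the rpmdb while it runs. The wrapper function and the `DRY_RUN` variable are hypothetical conveniences for illustration.

```shell
# Build (and optionally run) a db_recover command line for an rpmdb directory.
# -h names the environment home; -e retains the environment after recovery.
# Set DRY_RUN=1 to print the command instead of executing it.
recover_rpmdb() {
    local dbhome="$1"
    local keep_env="${2:-no}"
    set -- db_recover -h "$dbhome"
    [ "$keep_env" = "yes" ] && set -- "$@" -e
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With `DRY_RUN=1`, `recover_rpmdb /opt/yum/var/lib/rpm yes` prints the command that would be run, including `-e`.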

BDB has a notification facility that db_recover will use to signal other 
processes that the dbenv is being recreated. But (last I checked) rpm-4.x.y 
does not use/implement that facility. RPM5+BDB does implement the dbenv 
notification facility, automates recovery where needed on the next dbenv open, 
and also uses a full transactional ACID store with logs. rpm-4.x.y uses a CDB 
model, which permits a single writer or multiple readers, with per-database 
(i.e. per-table) locking. The transactional model uses per-page locking, and 
logs all operations with a transactional commit/discard operation. 



Re: [Rpm-maint] [rpm-software-management/rpm] Rpm query causes corruption in the file-backed mmaped bdb regions (#232)

2017-06-10 Thread Jeff Johnson
The term "corruption" is highly unspecific: please avoid, and/or supply more 
specific details.

Yes, somewhat counterintuitively, the "read-only" operation of rpm -qa 
writes to disk: at a minimum, shared read locks (and rpm-4.x.y also tries 
to create an advisory file lock).

The mmap'ing to backing file store is necessary for the dbenv (which has locks 
and caches) to be shared between processes.

RPM itself has 2 "risky" implementations using BDB:
- fsync is disabled (for performance) by stubbing out a BDB vector.
- non-root access to databases (the more common DB term is "tables") without a 
dbenv (hence no shared locks).

Without fsync, there is no (strong) guarantee that multiple processes see the 
same data on the file system. In practice (on linux) this has not caused major 
problems: but that is NOT a claim that no problems exist.

Reading without taking out a shared read lock can segfault if/when a writer is 
updating an rpmdb. In practice, the exclusive file lock and the pattern of 
usage of RPM are such that an occasional segfault is tolerable. But it's quite 
easy to demonstrate the problem by running many concurrent RPM queries and 
installs.

In practice, RPM creates an exclusive file lock that serializes writers (but 
readers mostly do not participate in the advisory locking scheme, at least last 
I checked).

If you are truly interested in stable/robust concurrent access to an rpmdb, 
then you MUST correct the above problems. Denying the creation of shared read 
locks, either through file system permissions (i.e. non-root) or sandboxing 
simply isn't going to provide stable/robust concurrent access to an rpmdb.

The better approach is to permit write access to the dbenv files (in the 
__db.00N files) either through file system permissions (like group writable 
0664, and possibly a setgid bit and ownership on the rpm executable), or by 
configuring your sandbox to permit writes to the dbenv files.

Treating the dbenv files as transient cache and moving them into a separate 
directory may help with sandbox rule generation as well.
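
The file-permission variant of that approach might look like the following sketch. It covers only the group-writable 0664 part (not the setgid bit on the rpm executable); the function name and the optional group argument are assumptions.

```shell
# Make the Berkeley DB environment files (__db.00N) group-writable.
# The group name is a caller-supplied assumption; chgrp is skipped if omitted.
make_dbenv_group_writable() {
    local dbhome="$1"
    local group="${2:-}"
    local f
    for f in "$dbhome"/__db.*; do
        [ -e "$f" ] || continue          # glob matched nothing
        if [ -n "$group" ]; then
            chgrp "$group" "$f" || return 1
        fi
        chmod 0664 "$f" || return 1
    done
}
```

Only the `__db.*` environment files are touched; the database files themselves (Packages, Name, and so on) keep their existing modes.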

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/issues/232#issuecomment-307572921___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint