On 04/01/2016 05:07 PM, Šimon Lukašík wrote: > Hello Gautam, > > Firstly, let me just express gratitude for in depth analysis. I applaud > to engineering virtue. Well done! > > I would approach this thing as follows. I would connect with RPM > maintainers in SuSE, why it needs to get exclusive lock. > > I am not equipped to claim which solution (rpm in RHEL, vs. rpm in SuSE) > is "better". However, having the lesser constraint seems better until > there is risk for data corruption. I have been QE of RPM in RHEL many > years ago, and I don't remember any issues with weak locks. > > > Then, for OpenSCAP, we run the probe any time it is needed and we stop > it only at the scan completion. That has performance benefits (I still > have to see scanner that performs quicker than OpenSCAP :)). And then > each probe holds cache of things that have been already queried. So, > killing the probe is out of question in general. > > However, in this specific case. We already drop all the cache when we do > remediation (because cache would still return pre-remediated values). > So perhaps you can write code to stop all probes before remediation starts. >
Ah, this paragraph is void. The killing is already done. :) The problem is that when we decide whether the specific <fix> element is applicable, we run the rpminfo probe again. https://github.com/OpenSCAP/openscap/blob/78c8706d961270f1878d0639bbceee3f3fb7623f/src/XCCDF_POLICY/xccdf_policy_remediate.c#L135 So, perhaps you can create SuSE specific patch to rpminfo probe to return the lock after each request&response? Best, ~š. > The clean of the cache is here > https://github.com/OpenSCAP/openscap/blob/78c8706d961270f1878d0639bbceee3f3fb7623f/src/XCCDF/xccdf_session.c#L1413 > > Good luck! > ~š. > > > On 03/31/2016 08:32 AM, S, Gautam wrote: >> Hello folks, >> >> >> >> I am looking at a rather obscure issue related to the way the RPM >> library behaves on SUSE. While trying to run remediation for a rule >> related to RPMs, the process hangs until it finally times out. >> >> >> >> # oscap xccdf eval --profile test --remediate sles11-xccdf.xml >> >> Title Uninstall bind Package >> >> Rule package_bind_removed >> >> Ident CCE-27030-6 >> >> Result fail >> >> >> >> >> >> --- Starting Remediation --- >> >> >> >> Title Uninstall bind Package >> >> Rule package_bind_removed >> >> Ident CCE-27030-6 >> >> >> >> <<Hangs for a while here>> >> >> >> >> Result error >> >> >> >> I have collected openscap verbose logs and RPM logs and it looks like >> there is a deadlock. >> >> >> >> From RPM verbose mode logs while the fix is running the “rpm –e” operation: >> >> >> >> D: opening db environment /var/lib/rpm/Packages create:cdb:mpool:private >> >> D: opening db index /var/lib/rpm/Packages create mode=0x42 >> >> warning: waiting for exclusive lock on /var/lib/rpm/Packages >> >> error: cannot get exclusive lock on /var/lib/rpm/Packages >> >> D: closed db index /var/lib/rpm/Packages >> >> D: closed db environment /var/lib/rpm/Packages >> >> error: cannot open Packages index using db3 - Operation not permitted (1) >> >> >> >> Using strace, we can see that it attempts to acquire this WR lock >> multiple times until it finally times out and returns failure: >> >> >> >> fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = -1 >> EAGAIN (Resource temporarily unavailable) >> >> >> >> RPM does not get exclusive mode lock because Openscap seems to be >> acquiring the lock in read mode: >> >> >> >> # lsof /var/lib/rpm/Packages [Command run while the hang is seen] >> >> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME >> >> probe_rpm 16875 root 3rR REG 8,2 53420032 352263 >> /var/lib/rpm/Packages >> >> rpm 16891 root 3u REG 8,2 53420032 352263 >> /var/lib/rpm/Packages >> >> >> >> # ps –a [Command run while the hang is seen] >> >> PID TTY TIME CMD >> >> 16859 pts/0 00:00:00 oscap >> >> 16865 pts/0 00:00:00 probe_system_in >> >> 16870 pts/0 00:00:00 probe_family >> >> 16875 pts/0 00:00:00 probe_rpminfo >> >> 16885 pts/0 00:00:00 probe_system_in >> >> 16890 pts/0 00:00:00 bash >> >> 16891 pts/0 00:00:00 rpm >> >> 16934 pts/2 00:00:00 ps >> >> # >> >> >> >> The RPM library call made in function >> src/OVAL/probes/unix/linux/rpminfo.c:probe_init, rpmtsCreate() and the >> subsequent rpmtsInitIterator() result in a read lock being taken on the >> file /var/lib/rpm/Packages. >> >> This read lock seems to be released when rpmtsFree() call is made from >> probe_fini. >> >> >> >> The particular probe_rpminfo in question is not related to the OVAL >> check of the rule but related to the CPE platform check. >> >> >> >> From the additional traces I added to openscap: >> >> D: probe_rpminfo: ("seap.msg" ":id" 0 (("rpminfo_object" ":id" >> "oval:org.open-scap.cpe.sles-release:obj:1" ":oval_version" "5.10.1" ) >> (("name" ":operation" 5 ":var_check" 1 ) "sles-release" ) ) ) >> [probe_rpminfo(7045):Thream >> Name(7f901b36c700):seap-packet.c:904:SEAP_packet_recv] >> >> … >> >> I: probe_rpminfo: gautam : probe_init: Init 1 rpmts >> [probe_rpminfo(7050):Thream Name(7f45fcd527c0):rpminfo.c:325:probe_init] >> >> <<Fix runs between the rpmts create and free .i.e lock acquire and release>> >> >> I: probe_rpminfo: gautam : probe_fini: Free 1 rpmts >> [probe_rpminfo(7045):Thream Name(7f90215a37c0):rpminfo.c:343:probe_fini] >> >> >> >> Probe_fini is called in this thread only upon receiving SIGTERM. >> >> >> >> *_Why does this work on RHEL?_* >> >> On RHEL, the implementation of the RPM command does not seem to be >> trying to acquire an exclusive mode lock, it also locks in read mode only: >> >> >> >> D: opening db environment /var/lib/rpm cdb:mpool:joinenv >> >> D: opening db index /var/lib/rpm/Packages create mode=0x42 >> >> D: sanity checking 1 elements >> No >> message about lock acquisition here! >> >> D: running pre-transaction scripts >> >> >> >> # lsof /var/lib/rpm/Packages >> >> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME >> >> probe_rpm 33539 root 3rR REG 8,3 75472896 1048584 >> /var/lib/rpm/Packages >> >> rpm 33555 root 3uR REG 8,3 75472896 1048584 >> /var/lib/rpm/Packages >> >> >> >> Is there some design document that explains how openscap spawns the CPE >> platform check? I am trying to understand if the probe can relinquish >> the lock by freeing the RPMTs structure before executing the remediation >> script. >> >> >> >> Thank you. >> >> >> >> Regards, >> >> Gautam. >> >> >> >> _______________________________________________ >> Open-scap-list mailing list >> Open-scap-list@redhat.com >> https://www.redhat.com/mailman/listinfo/open-scap-list >> > > > ~š. > > _______________________________________________ > Open-scap-list mailing list > Open-scap-list@redhat.com > https://www.redhat.com/mailman/listinfo/open-scap-list > ~š. _______________________________________________ Open-scap-list mailing list Open-scap-list@redhat.com https://www.redhat.com/mailman/listinfo/open-scap-list