On Wed, Mar 09, 2016 at 04:13:28PM +0100, Ludwig Krispenz wrote:
>
> On 03/09/2016 03:46 PM, Andrew E. Bruno wrote:
> >Hello,
> >
> >We had a replica fail today with:
> >
> >[09/Mar/2016:09:39:59 -0500] NSMMReplicationPlugin - changelog program -
> >_cl5NewDBFile: PR_DeleteSemaphore:
> >/var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema;
> > NSPR error - -5943
> the nspr error means:
> /* Cannot create or rename a filename that already exists */
> #define PR_FILE_EXISTS_ERROR (-5943L)
>
> could you check if the file exists and if there is a permission problem for
> the dirsrv user to recreate it ?

Looks like the file exists:

# ls -alh /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema
-rw-r--r-- 1 dirsrv dirsrv 0 Mar 9 09:39 /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/e909b405-2cb811e5-ac0b8f7e-e0b1a377.sema

> if the process hangs, could you get a pstack from the process ?

We did a systemctl restart ipa.. which failed.. but looks like the dirsrv is
still running. The logs are now filling up with:

[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272988 (rc: 32)
[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272989 (rc: 32)
[09/Mar/2016:10:23:10 -0500] DSRetroclPlugin - delete_changerecord: could not delete change record 11272990 (rc: 32)

However, if I do a kinit:

kinit: Cannot contact any KDC for realm 'CBLS.CCR.BUFFALO.EDU' while getting initial credentials

Should I be concerned that this will end up corrupting the other replicas?
Should we just let this finish? We have 3 replicas in our system. Looks like
we just lost a second one.

This feels very similar to the error we hit a while back:

https://www.redhat.com/archives/freeipa-users/2015-September/msg00006.html

We're seeing the exact same behavior..
access logs are filling up with:

[09/Mar/2016:10:26:03 -0500] conn=6877203 fd=4003 slot=4003 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:03 -0500] conn=6877204 fd=4004 slot=4004 connection from 10.116.28.10 to 10.113.14.131
[09/Mar/2016:10:26:09 -0500] conn=6877205 fd=4005 slot=4005 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:15 -0500] conn=6877206 fd=4006 slot=4006 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:21 -0500] conn=6877207 fd=4007 slot=4007 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:27 -0500] conn=6877208 fd=4008 slot=4008 connection from 10.113.14.131 to 10.113.14.131
[09/Mar/2016:10:26:28 -0500] conn=6877209 fd=4009 slot=4009 connection from 10.116.28.33 to 10.113.14.131
[09/Mar/2016:10:26:30 -0500] conn=6877210 fd=4010 slot=4010 connection from 10.116.28.23 to 10.113.14.131
[09/Mar/2016:10:26:33 -0500] conn=6877211 fd=4011 slot=4011 connection from 10.113.14.131 to 10.113.14.131

The ns-slapd process is showing this from top:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24951 dirsrv    20   0 15.477g 0.013t 6.067g S   0.0 27.3 101566:54 ns-slapd

I'd be happy to provide a pstack but can't seem to get the correct debuginfo
packages installed.. we're running centos7 and 389-ds-base 1.3.3.1. We haven't
upgraded to 1.3.4.0. How can I get the debuginfo packages installed for that
specific version?

Thanks!

--Andrew
