On 03/14/2016 05:33 PM, Andrew E. Bruno wrote:
On Mon, Mar 14, 2016 at 09:35:15AM +0100, Ludwig Krispenz wrote:
On 03/12/2016 04:02 PM, Andrew E. Bruno wrote:
On Wed, Mar 09, 2016 at 06:08:04PM +0100, Ludwig Krispenz wrote:
On 03/09/2016 05:51 PM, Andrew E. Bruno wrote:
On Wed, Mar 09, 2016 at 05:21:50PM +0100, Ludwig Krispenz wrote:

[09/Mar/2016:11:33:03 -0500] NSMMReplicationPlugin - changelog program - _cl5NewDBFile: PR_DeleteSemaphore: /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/ed35d212-2cb811e5-af63d574-de3f6355.sema; NSPR error - -5943
If DS is shut down cleanly this file should be removed; if DS is killed it
remains and should be recreated at restart, which is what fails here. Could
you try another stop, remove the file manually, and start again?
We had our replicas crash again. Curious if it's safe to delete the
other db files as well:

ls -alh /var/lib/dirsrv/slapd-CBLS-CCR-BUFFALO-EDU/cldb/
   30  DBVERSION
6.8G  ed35d212-2cb811e5-af63d574-de3f6355_55a95591000000040000.db
    0  ed35d212-2cb811e5-af63d574-de3f6355.sema
  18M  f32bb356-2cb811e5-af63d574-de3f6355_55a955ca000000600000.db
    0  f32bb356-2cb811e5-af63d574-de3f6355.sema


Should all these files be deleted if the DS is shut down cleanly? Or should we
only remove the *.sema files?
The *.db files contain the changelog data. If you delete them you start with
a new changelog and could run into replication problems requiring
reinitialization. You normally should not delete them.
The .sema file is used to control how many threads can concurrently access
the changelog. It should be recreated at restart, so it is safe to delete it
after a crash.
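
The cleanup described above (remove only the semaphore files, leave the changelog data alone) can be sketched in Python. This is a minimal illustration, not a supported tool: the function name and the dry_run flag are made up for this example, and it assumes the directory server is already stopped before it is pointed at a real cldb directory.

```python
import os
import tempfile
from pathlib import Path


def remove_stale_sema(cldb_dir, dry_run=True):
    """Return the names of *.sema files in cldb_dir.

    The files are deleted only when dry_run is False. The *.db files,
    which hold the changelog data, are never touched.
    """
    found = []
    for path in sorted(Path(cldb_dir).glob("*.sema")):
        if not dry_run:
            path.unlink()
        found.append(path.name)
    return found


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a live server.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("DBVERSION", "example.sema", "example_0001.db"):
            Path(tmp, name).touch()
        print(remove_stale_sema(tmp, dry_run=False))
        print(sorted(os.listdir(tmp)))
```

Running it with dry_run=True first only lists what would be removed, which is a cheap safety check before deleting anything.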
Sounds good..thanks. We deleted the .sema files after the crash and the
replicas came back up ok.

If you are getting frequent crashes, we should try to find the reason for
the crashes. Could you try to get a core file?
This time we had two replicas crash and ns-slapd wasn't running so we
couldn't grab a pstack. Here's a snip from the error logs right before
the crash (not sure if this is related or not):

[11/Mar/2016:09:57:56 -0500] ldbm_back_delete - conn=0 op=0 [retry: 1] No original_tombstone for changenumber=11573832,cn=changelog!!
[11/Mar/2016:09:57:57 -0500] ldbm_back_delete - conn=0 op=0 [retry: 1] No original_tombstone for changenumber=11575824,cn=changelog!!
[11/Mar/2016:09:57:58 -0500] ldbm_back_delete - conn=0 op=0 [retry: 1] No original_tombstone for changenumber=11575851,cn=changelog!!
[11/Mar/2016:10:00:28 -0500] - libdb: BDB2055 Lock table is out of available lock entries
[11/Mar/2016:10:00:28 -0500] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact 986efe12-71b811e5-9d33a516-e778e883; db error - 12 Cannot allocate memory
[11/Mar/2016:10:02:07 -0500] - libdb: BDB2055 Lock table is out of available lock entries
[11/Mar/2016:10:02:07 -0500] - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory
I don't know if this is related to your crashes, but compaction of the changelog was running, probably for some time, and finally failed. The idea behind compaction is to compact a fragmented btree and reclaim some space, but it uses a single transaction for the complete operation and locks every page it accesses. This can be time consuming, block other transactions, and run out of locks.
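
To see how often compaction ran out of locks, the errors log can be scanned for the two messages quoted in the excerpt. A rough sketch in Python, assuming each log entry sits on a single line in the real file (the excerpt in this mail was wrapped); the function name and the dictionary keys are made up for illustration.

```python
# Message fragments copied from the log excerpt in this thread.
LOCKS_EXHAUSTED = "Lock table is out of available lock entries"
COMPACT_FAILED = "failed to compact"


def count_compaction_errors(lines):
    """Count lock-exhaustion and compaction-failure messages in log lines."""
    counts = {"locks_exhausted": 0, "compaction_failed": 0}
    for line in lines:
        if LOCKS_EXHAUSTED in line:
            counts["locks_exhausted"] += 1
        if COMPACT_FAILED in line:
            counts["compaction_failed"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[11/Mar/2016:10:00:28 -0500] - libdb: BDB2055 Lock table is out of available lock entries",
        "[11/Mar/2016:10:02:07 -0500] - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory",
        "[11/Mar/2016:12:36:18 -0500] - slapd_poll(377) timed out",
    ]
    print(count_compaction_errors(sample))
```

For a real check, pass an open file object for the instance's errors log instead of the sample list.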

There are two options to address the lock exhaustion: either increase the number of configured db locks (the problem is that there is no good hint for how many locks will be needed), or disable changelog compaction by setting:
dn: cn=changelog5,cn=config
..
nsslapd-changelogcompactdb-interval: 0


I would disable compaction. I don't think there is much benefit (as I remember, BDB compaction was slow and not very effective), and it is better to avoid the side effects.
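
The setting above can be written out as a complete LDIF modify record. The small Python helper below is only a sketch that builds that text; the function name is hypothetical, and the DN and attribute are the ones quoted in this thread. The resulting text would typically be fed to an LDAP client such as ldapmodify while bound as an administrative user.

```python
def compaction_disable_ldif():
    """Build an LDIF modify record that sets the compaction interval to 0."""
    lines = [
        "dn: cn=changelog5,cn=config",
        "changetype: modify",
        "replace: nsslapd-changelogcompactdb-interval",
        "nsslapd-changelogcompactdb-interval: 0",
        "",  # an LDIF record ends with a blank line
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(compaction_disable_ldif())
```

Generating the record from code rather than typing it by hand mainly helps when the same change has to be applied to several replicas.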
[11/Mar/2016:12:36:18 -0500] - slapd_poll(377) timed out
[11/Mar/2016:13:06:17 -0500] - slapd_poll(377) timed out

We just upgraded to IPA 4.2 on CentOS 7.2, and if we see any more crashes
we'll try to get more info.

Thanks again.

--Andrew


