ok - please follow the directions at
http://port389.org/wiki/FAQ#Debugging_Crashes to enable core files and get a
stack trace
Also, 1.2.10.12 is available in the testing repos. Please give this a try.
There were a couple of fixes made since 1.2.10.4 that may be applicable:
Ticket 336 [abrt] 389-ds-base-1.2.10.4-2.fc16: index_range_read_ext: Process
/usr/sbin/ns-slapd was killed by signal 11 (SIGSEGV)
Ticket #347 - IPA dirsvr seg-fault during system longevity test
Ticket #348 - crash in ldap_initialize with multiple threads
Ticket #361: Bad DNs in ACIs can segfault ns-slapd
Trac Ticket #359 - Database RUV could mismatch the one in changelog under the
stress
Ticket #382 - DS Shuts down intermittently
Ticket #390 - [abrt] 389-ds-base-1.2.10.6-1.fc16: slapi_attr_value_cmp: Process
/usr/sbin/ns-slapd was killed by signal 11 (SIGSEGV
I've enabled the core dump stuff, but now I can't seem to get it to crash. But
I'm still getting the changelog messages in the error logs whenever I restart.
In addition, the hub server keeps running out of disk space. I tracked it down
to the access log filling up with MOD messages from replication. It looks like
changes are coming down from our 1.2.8 servers and being applied over and over
again. As an example, one of our entries was modified three times today, and
on all our other machines I see the following in the access log file:
# egrep 78b8cc871a3cda9f352580e797b270bc access
[12/Jul/2012:11:00:59 -0400] conn=383671 op=3145 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:11:01:24 -0400] conn=383671 op=3153 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:11:01:38 -0400] conn=383671 op=3157 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
But on the problematic hub server, I see:
# egrep 78b8cc871a3cda9f352580e797b270bc access
[12/Jul/2012:15:17:29 -0400] conn=2 op=58 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:17:29 -0400] conn=2 op=60 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:17:29 -0400] conn=2 op=61 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:42 -0400] conn=6 op=169 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:42 -0400] conn=6 op=171 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:42 -0400] conn=6 op=172 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:45 -0400] conn=3 op=170 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:45 -0400] conn=3 op=172 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:45 -0400] conn=3 op=173 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:51 -0400] conn=2 op=2234 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:51 -0400] conn=2 op=2236 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:51 -0400] conn=2 op=2237 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:55 -0400] conn=6 op=2233 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:55 -0400] conn=6 op=2235 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:55 -0400] conn=6 op=2236 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
[12/Jul/2012:15:24:57 -0400] conn=3 op=2234 MOD
dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
...
I truncated the output for brevity, but there's over 250 MODs to that one
object. It's as if the server isn't able to do the replication bookkeeping and
is accepting changes over and over again. Eventually the disk fills up.