Thank you, Thierry! Thank you for explaining; that makes sense. I will set
nsslapd-db-deadlock-policy to 6 instead (it is 9 now).
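
A minimal sketch of that change, written as a small Python helper that prints an LDIF modify record. The entry DN is an assumption (cn=config,cn=ldbm database,cn=plugins,cn=config; newer 389-ds releases may keep the attribute under cn=bdb,cn=config,cn=ldbm database,... instead), and the helper name build_deadlock_policy_ldif is hypothetical, so please verify both against the installed version before applying anything.

```python
# Assumed location of the attribute; verify on your 389-ds version
# (newer releases may keep it under "cn=bdb,cn=config,cn=ldbm database,...").
LDBM_CONFIG_DN = "cn=config,cn=ldbm database,cn=plugins,cn=config"


def build_deadlock_policy_ldif(policy: int, dn: str = LDBM_CONFIG_DN) -> str:
    """Return an LDIF 'modify' record that sets nsslapd-db-deadlock-policy."""
    # Assumed valid range 0..9 (9 is the current value in this thread,
    # 6 is the value suggested to give priority to updates).
    if not 0 <= policy <= 9:
        raise ValueError("nsslapd-db-deadlock-policy must be between 0 and 9")
    return "\n".join([
        f"dn: {dn}",
        "changetype: modify",
        "replace: nsslapd-db-deadlock-policy",
        f"nsslapd-db-deadlock-policy: {policy}",
        "",
    ])


if __name__ == "__main__":
    print(build_deadlock_policy_ldif(6))
```

The printed record could then be saved to a file and applied with ldapmodify while bound as Directory Manager; a restart of the directory server may be needed before the new policy takes effect.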

In this instance, I did notice that this ipa server's
nsslapd-dncachememsize is 78MB, which is much less than 150MB. Shall I
increase it? Or leave it as is?
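
Since nsslapd-dncachememsize is expressed in bytes, here is a small sketch of the arithmetic behind the 78MB versus 150MB comparison. It assumes "MB" means MiB (1024 * 1024 bytes), and the function name dncache_shortfall_bytes is hypothetical; it only computes sizes and does not say whether raising the value is the right fix.

```python
MIB = 1024 * 1024  # assumption: "MB" in this thread means MiB


def dncache_shortfall_bytes(current_bytes: int, target_mib: int = 150) -> int:
    """Bytes missing to reach the target size; 0 if already at or above it."""
    return max(0, target_mib * MIB - current_bytes)


if __name__ == "__main__":
    current = 78 * MIB  # the 78MB value reported for this server
    print("current bytes:  ", current)
    print("target bytes:   ", 150 * MIB)
    print("shortfall bytes:", dncache_shortfall_bytes(current))
```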

Kathy.

On Fri, Sep 17, 2021 at 12:56 AM Thierry Bordaz <tbor...@redhat.com> wrote:

>
> On 9/17/21 12:26 AM, Kathy Zhu via FreeIPA-users wrote:
>
> Hi Mark,
>
> If it helps, this is the same ipa server which I posted in subject
> "ipa_check_consistency alerts and ERR - slapd_poll - Timed out" yesterday.
>
>
> Hi Kathy,
>
> The slapd_poll message is likely not related to the DB panic. slapd_poll
> here means that the server was not able to send a result (LDAP client not
> reading?) for longer than the I/O block timeout (nsslapd-ioblocktimeout).
>
> The DB panic is a fatal error of the database that requires a restart of
> the server. The restart will trigger a DB recovery. It is difficult to know
> the root cause of the DB panic. According to the initial message, the DB was
> running a deadlock resolution when it panicked. You may try giving priority
> to updates (setting nsslapd-db-deadlock-policy: 6), which can significantly
> reduce DB deadlocks.
>
> regards
> thierry
>
> Thanks.
>
> Kathy.
>
> On Thu, Sep 16, 2021 at 2:57 PM Kathy Zhu wrote:
>
>> Thanks, Mark, for your reply.
>>
>> The following repeats in /var/log/dirsrv/slapd-EXAMPLE-COM/errors:
>> ...
>>
>> [16/Sep/2021:08:34:27.880349688 -0700] - CRIT - deadlock_threadmain -
>> Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30973
>> (BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:27.980810867 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>>
>> [16/Sep/2021:08:34:27.981036823 -0700] - CRIT - deadlock_threadmain -
>> Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30973
>> (BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:28.031642976 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>>
>> [16/Sep/2021:08:34:28.031856673 -0700] - ERR - trickle_threadmain -
>> Serious Error---Failed to trickle, err=-30973 (BDB0087 DB_RUNRECOVERY:
>> Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:28.081390783 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>>
>> [16/Sep/2021:08:34:28.081634618 -0700] - CRIT - deadlock_threadmain -
>> Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30973
>> (BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:28.181946001 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>>
>> [16/Sep/2021:08:34:28.182160603 -0700] - CRIT - deadlock_threadmain -
>> Serious Error---Failed in deadlock detect (aborted at 0x0), err=-30973
>> (BDB0087 DB_RUNRECOVERY: Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:28.282366716 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>>
>> [16/Sep/2021:08:34:28.282650113 -0700] - ERR - trickle_threadmain -
>> Serious Error---Failed to trickle, err=-30973 (BDB0087 DB_RUNRECOVERY:
>> Fatal error, run database recovery)
>>
>> [16/Sep/2021:08:34:28.283083329 -0700] - ERR - libdb - BDB0060 PANIC:
>> fatal region error detected; run recovery
>> ...
>>
>> Thanks!
>>
>> Kathy.
>>
>>
>> On Thu, Sep 16, 2021 at 2:38 PM Mark Reynolds <mreyno...@redhat.com>
>> wrote:
>>
>>>
>>> On 9/16/21 5:20 PM, Kathy Zhu via FreeIPA-users wrote:
>>>
>>> Hi List,
>>>
>>> One of my ipa servers had a database issue that left many log entries
>>> like the following in messages and in the slapd errors log:
>>>
>>> Sep 16 08:34:28 ipa0 ns-slapd: [16/Sep/2021:08:34:28.886632992 -0700]
>>> - ERR - libdb - BDB0060 PANIC: fatal region error detected; run recovery
>>>
>>> Sep 16 08:34:29 ipa0 ns-slapd: [16/Sep/2021:08:34:28.987593487 -0700]
>>> - ERR - libdb - BDB0060 PANIC: fatal region error detected; run recovery
>>>
>>> Sep 16 08:34:29 ipa0 ns-slapd: [16/Sep/2021:08:34:29.035181321 -0700]
>>> - ERR - libdb - BDB0060 PANIC: fatal region error detected; run recovery
>>>
>>> Is there anything else in the error log around these messages?  This is
>>> kind of a generic error, and increasing the DN cache is not a guarantee it
>>> will resolve this.
>>>
>>>
>>> Restarting ipa fixed the issue. I googled for the root cause and found the
>>> verified solution - https://access.redhat.com/solutions/3098131, which
>>> is to increase nsslapd-dncachememsize to a reasonable value (>150MB).
>>> This sounds easy; however, all slapd cache parameters are related. The Red
>>> Hat Directory Server performance tuning guide explains a bit:
>>>
>>>
>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/performance_tuning_guide/memoryusage
>>>
>>> However, I wonder if there is a better guide.
>>>
>>> Not really :-)  There is a RHDS 11 version, but I think the performance
>>> tuning part is the same as RHDS 10.
>>>
>>>
>>> Mark
>>>
>>>
>>> Thanks.
>>>
>>> Kathy.
>>>
>>> _______________________________________________
>>> FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
>>> To unsubscribe send an email to freeipa-users-le...@lists.fedorahosted.org
>>> Fedora Code of Conduct: 
>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
>>> List Archives: 
>>> https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahosted.org
>>> Do not reply to spam on the list, report it: 
>>> https://pagure.io/fedora-infrastructure
>>>
>>> --
>>> Directory Server Development Team
>>>
>>>