Hi Laurent,

389-Directory/3.1.1 B2024.289.0000

Looking at the stack trace, I wonder if there is a possible deadlock in retroCL trimming: thread 4 acquired the cn=changelog backend lock and then hung waiting for a write TXN, while thread 20 created a write TXN (blocking thread 4) and is now waiting for the cn=changelog backend lock.

   Thread 4 (Thread 0x7f680c2b56c0 (LWP 3980786) "ns-slapd"):
   #0  0x00007f6d14081332 in __pthread_mutex_lock_full () at
   target:/lib64/libc.so.6
   #1  0x00007f6d0f6b6bd2 in mdb_txn_renew0 () at
   target:/lib64/liblmdb.so.0.0.0
   #2  0x00007f6d0f6b73c4 in mdb_txn_begin () at
   target:/lib64/liblmdb.so.0.0.0
   #3  0x00007f6d0ef88b55 in dbmdb_start_txn () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #4  0x00007f6d0ef8c745 in dbmdb_txn_begin () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #5  0x00007f6d0ef113bd in dblayer_txn_begin () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #6  0x00007f6d0ef3be7b in ldbm_back_delete () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #7  0x00007f6d142297b4 in op_shared_delete.lto_priv () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #8  0x00007f6d142ccd9d in delete_internal_pb.isra () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #9  0x00007f6d14223f56 in slapi_delete_internal_pb () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #10 0x00007f6d0ec095be in delete_changerecord () at
   target:/usr/lib64/dirsrv/plugins/libretrocl-plugin.so
   #11 0x00007f6d0ec0a853 in changelog_trim_thread_fn () at
   target:/usr/lib64/dirsrv/plugins/libretrocl-plugin.so
   #12 0x00007f6d13e4d3d7 in _pt_root () at target:/lib64/libnspr4.so
   #13 0x00007f6d1407e168 in start_thread () at target:/lib64/libc.so.6
   #14 0x00007f6d1410214c in __clone3 () at target:/lib64/libc.so.6

   Thread 20 (Thread 0x7f68025fe6c0 (LWP 3980449) "ns-slapd"):
   #0  0x00007f6d1407a7e9 in __futex_abstimed_wait_common () at
   target:/lib64/libc.so.6
   #1  0x00007f6d1407d239 in pthread_cond_wait@@GLIBC_2.3.2 () at
   target:/lib64/libc.so.6
   #2  0x00007f6d13e467db in PR_EnterMonitor () at
   target:/lib64/libnspr4.so
   #3  0x00007f6d0ef11405 in dblayer_txn_begin () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #4  0x00007f6d0ef2bced in ldbm_back_add () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #5  0x00007f6d142190f0 in op_shared_add.lto_priv () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #6  0x00007f6d142cce7c in add_internal_pb.isra () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #7  0x00007f6d142151d5 in slapi_add_internal_pb () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #8  0x00007f6d0ec0bfd4 in retrocl_postob () at
   target:/usr/lib64/dirsrv/plugins/libretrocl-plugin.so
   #9  0x00007f6d1427b5c0 in plugin_call_func.lto_priv () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #10 0x00007f6d1427b931 in plugin_call_plugins () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #11 0x00007f6d0ef441cd in ldbm_back_modify () at
   target:/usr/lib64/dirsrv/plugins/libback-ldbm.so
   #12 0x00007f6d14267360 in op_shared_modify.lto_priv () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #13 0x00007f6d1426910e in do_modify () at
   target:/usr/lib64/dirsrv/libslapd.so.0
   #14 0x000056425298e8bb in connection_threadmain ()
   #15 0x00007f6d13e4d3d7 in _pt_root () at target:/lib64/libnspr4.so
   #16 0x00007f6d1407e168 in start_thread () at target:/lib64/libc.so.6
   #17 0x00007f6d1410214c in __clone3 () at target:/lib64/libc.so.6
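
To make the suspected lock-order inversion concrete, here is a minimal Python sketch. It is not 389-ds code: the two locks are only stand-ins for the cn=changelog backend lock and the single write TXN, the names (simulate, worker, trim, modify) are hypothetical, and timeouts are used so the sketch reports the stuck threads instead of hanging.

```python
import threading


def simulate(inverted_order, timeout=0.2):
    """Return the sorted names of threads that could not get their second resource."""
    backend_lock = threading.Lock()  # stand-in for the cn=changelog backend lock
    write_txn = threading.Lock()     # stand-in for the single write TXN
    # The rendezvous forces the bad interleaving. It is only used when the
    # two threads start from different resources.
    rendezvous = threading.Barrier(2) if inverted_order else None
    stuck = []

    def worker(name, first, second):
        with first:
            if rendezvous is not None:
                rendezvous.wait()  # both threads now hold their first resource
            if second.acquire(timeout=timeout):
                second.release()
            else:
                stuck.append(name)  # a real thread would block here forever
            if rendezvous is not None:
                rendezvous.wait()  # keep holding `first` until both gave up

    # Like thread 4: backend lock first, then the write TXN.
    trim_args = ("trim", backend_lock, write_txn)
    # Like thread 20: write TXN first, then the backend lock.
    # In the consistent case it uses the same order as the trim thread.
    if inverted_order:
        modify_args = ("modify", write_txn, backend_lock)
    else:
        modify_args = ("modify", backend_lock, write_txn)

    threads = [
        threading.Thread(target=worker, args=trim_args),
        threading.Thread(target=worker, args=modify_args),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stuck)
```

With simulate(True) both threads are reported as stuck, which matches the pattern in the two traces above. With simulate(False), where both threads take the resources in the same order, neither is.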

At the same time, I wonder if it could be related to https://github.com/389ds/389-ds-base/issues/6644.

Could you try disabling retroCL trimming to see if it gives some relief?

best regards
thierry

On 3/3/25 9:04 AM, Florence Blanc-Renaud via FreeIPA-users wrote:
Hi,

you can reach out to the directory server developers at 389-users@lists.fedoraproject.org. They will ask you to provide logs obtained as described here <https://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs> (Debugging Hangs), along with the exact version and OS installed on your machines.

flo

On Sat, Mar 1, 2025 at 7:33 PM ARNAL Laurent via FreeIPA-users <freeipa-us...@lists.fedorahosted.org> wrote:

    Hello,

    Some more info: I reinstalled the replica this afternoon.
    Replication now seems to work again.
    But I still get the deadlock after that.

    Laurent.

-- 
_______________________________________________
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
Do not reply to spam, report it: 
https://pagure.io/fedora-infrastructure/new_issue
