Hi,

You are describing two different symptoms:
- replication gets desynchronized
- replication initialization fails
We do not have much data about the first one; it could be due to many things.

About the second one: the "bdb_bulk_import_queue returned 0 with entry ..." messages are normal. The first real error is:

[09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR bulk import abandoned

which happens on the server being initialized when the connection handling the replication initialization gets unexpectedly closed. That is usually because:

[1] there was a problem on the supplier from which the replication gets initialized (are there errors in that supplier's error log, or has it crashed?)
[2] the network is unreliable (that could also explain why replication gets desynchronized)
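For example, something along these lines (just a rough sketch; the instance name "slapd-localhost" is a placeholder, adjust it and the hostnames to your setup):
---------
# On the supplier (e.g. node01): look for errors around the time the
# initialization was abandoned, and check whether ns-slapd crashed.
grep -E ' - (ERR|CRIT|EMERG) - ' /var/log/dirsrv/slapd-localhost/errors | tail -n 50
journalctl -u dirsrv@localhost --since today | grep -iE 'segfault|core|killed'

# A crude check that the consumer is reachable on the LDAP port.
nc -zv node02.ldap.colorado.br 389

# While reproducing the problem, you can temporarily add replication
# debugging (8192) to the error log level, then restore the default
# (16384, unless you have customized it) afterwards.
dsconf slapd-localhost config replace nsslapd-errorlog-level=8192
dsconf slapd-localhost config replace nsslapd-errorlog-level=16384
---------
The replication debug level is very verbose, so only keep it enabled while you reproduce the problem.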
regards,
Pierre

On Mon, Dec 9, 2024 at 5:20 PM Luiz Gustavo Quirino via 389-users <389-users@lists.fedoraproject.org> wrote:

> I'm facing issues with replication in the following scenario:
>
> 3 Linux nodes (Rocky) running version 2.4.5 B2024.198.0000 of 389.
> Replication is configured in a ring topology:
> node01 -> node02 -> node03 -> node01.
> Password changes are made via the PWM-Project web interface.
>
> Problem:
> At some point, the synchronization between nodes is lost.
> When I attempt to restart replication, the node being updated crashes the database.
> For example, when initializing replication from node01 to node02, the following error occurs:
> ---------
> [09/Dec/2024:11:32:30.382466035 -0300] - DEBUG - bdb_ldbm_back_wire_import - bdb_bulk_import_queue returned 0 with entry uid=app.tzv.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.387198997 -0300] - DEBUG - bdb_ldbm_back_wire_import - bdb_bulk_import_queue returned 0 with entry uid=app.poc.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR bulk import abandoned
> [09/Dec/2024:11:32:30.557600717 -0300] - ERR - bdb_import_run_pass - import userroot: Thread monitoring returned: -23
> [09/Dec/2024:11:32:30.559453847 -0300] - ERR - bdb_public_bdb_import_main - import userroot: Aborting all Import threads...
> [09/Dec/2024:11:32:36.468531612 -0300] - ERR - bdb_public_bdb_import_main - import userroot: Import threads aborted.
> [09/Dec/2024:11:32:36.470641812 -0300] - INFO - bdb_public_bdb_import_main - import userroot: Closing files...
> [09/Dec/2024:11:32:36.553007637 -0300] - ERR - bdb_public_bdb_import_main - import userroot: Import failed.
> [09/Dec/2024:11:32:36.574692177 -0300] - DEBUG - NSMMReplicationPlugin - consumer_connection_extension_destructor - Aborting total update in progress for replicated area dc=colorado,dc=local connid=7019159
> [09/Dec/2024:11:32:36.577255941 -0300] - ERR - process_bulk_import_op - NULL target sdn
> [09/Dec/2024:11:32:36.579573401 -0300] - DEBUG - NSMMReplicationPlugin - replica_relinquish_exclusive_access - conn=7019159 op=-1 repl="dc=colorado,dc=local": Released replica held by locking_purl=conn=7019159 id=3
> [09/Dec/2024:11:32:36.600514849 -0300] - ERR - pw_get_admin_users - Search failed for cn=GRP_SRV_PREHASHED_PASSWORD,ou=389,OU=GRUPOS,ou=colorado,dc=colorado,dc=local: error 10 - Password Policy Administrators can not be set
> [09/Dec/2024:11:32:36.757883417 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop - decoding payload...
> [09/Dec/2024:11:32:36.760105387 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop - decoded protocol_oid: 2.16.840.1.113730.3.6.1
> [09/Dec/2024:11:32:36.762467539 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop - decoded repl_root: dc=colorado,dc=local
> [09/Dec/2024:11:32:36.765113155 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop - decoded csn: 6756ff84000001910000
> [09/Dec/2024:11:32:36.767727935 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop: RUV:
> [09/Dec/2024:11:32:36.769205061 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop: {replicageneration} 6748f91f000001910000
> [09/Dec/2024:11:32:36.770721824 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop: {replica 401 ldap://node01.ldap.colorado.br:389} 6748f921000001910000 6756ff6f000101910000 00000000
> [09/Dec/2024:11:32:36.772753378 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop: {replica 403 ldap://node03-ldap:389} 6748f9db000101930000 6756ff79000001930000 00000000
> [09/Dec/2024:11:32:36.774289526 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop: {replica 402 ldap://node02.ldap.colorado.br:389} 6748f996000101920000 6756ff34000001920000 00000000
> [09/Dec/2024:11:32:36.775750926 -0300] - DEBUG - NSMMReplicationPlugin - decode_startrepl_extop - Finshed decoding payload.
> [09/Dec/2024:11:32:36.777404849 -0300] - DEBUG - NSMMReplicationPlugin - consumer_connection_extension_acquire_exclusive_access - conn=7019230 op=4 Acquired consumer connection extension
> [09/Dec/2024:11:32:36.779856975 -0300] - DEBUG - NSMMReplicationPlugin - multisupplier_extop_StartNSDS50ReplicationRequest - conn=7019230 op=4 repl="dc=colorado,dc=local": Begin incremental protocol
> [09/Dec/2024:11:32:36.781999075 -0300] - DEBUG - _csngen_adjust_local_time - gen state before 6756ff7b0001:1733754747:0:0
> [09/Dec/2024:11:32:36.784626039 -0300] - DEBUG - _csngen_adjust_local_time - gen state after 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.786708353 -0300] - DEBUG - csngen_adjust_time - gen state before 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.788232997 -0300] - DEBUG - csngen_adjust_time - gen state after 6756ff840001:1733754756:0:0
> [09/Dec/2024:11:32:36.790217310 -0300] - DEBUG - NSMMReplicationPlugin - replica_get_exclusive_access - conn=7019230 op=4 repl="dc=colorado,dc=local": Acquired replica
> ----------
> To restore synchronization, I need to delete all replication configurations and recreate them. However, the issue reappears after some time.
>
> I'd appreciate any suggestions on how to identify and resolve this issue permanently.
>
> Thks.

--
389 Directory Server Development Team