Hi,

You are describing two different symptoms:
   - replication gets desynchronized
   - replication initialization fails

We do not have much data about the first one; it could have many causes. A
quick way to confirm the divergence is to compare the RUVs on each node, as
sketched below.
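
Each server keeps its RUV on a tombstone entry under the replicated suffix,
so you can read and compare them over LDAP. This is only a minimal
python-ldap sketch, not an official tool: the bind password is a
placeholder, and the hostnames are taken from the RUV in your logs.
---------
# Compare the replication RUV on each node (a minimal sketch; the bind
# password below is a placeholder for your environment).
import ldap

SUFFIX = "dc=colorado,dc=local"
NODES = ["node01.ldap.colorado.br", "node02.ldap.colorado.br", "node03-ldap"]

# The RUV is kept on a special tombstone entry under the replicated suffix.
RUV_FILTER = ("(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)"
              "(objectclass=nstombstone))")

for host in NODES:
    conn = ldap.initialize("ldap://%s:389" % host)
    conn.simple_bind_s("cn=Directory Manager", "PLACEHOLDER_PASSWORD")
    result = conn.search_s(SUFFIX, ldap.SCOPE_SUBTREE, RUV_FILTER,
                           ["nsds50ruv"])
    print("--- %s ---" % host)
    for _dn, attrs in result:
        for ruv in attrs.get("nsds50ruv", []):
            print(ruv.decode())
    conn.unbind_s()
---------
If the most recent CSN for a given replica ID differs between the nodes,
those nodes are desynchronized.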
About the second one:

The "bdb_bulk_import_queue returned 0 with entry ..." messages are normal.
The first error is:
[09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR
bulk import abandoned

This error happens on the server being initialized when the connection
handling the replication initialization is unexpectedly closed.
That is usually because:
   [1] there was a problem on the supplier from which the replication gets
initialized (are there errors in that supplier's error log, or has it
crashed? see the sketch below for reading the status it records)
   [2] the network is unreliable (which could also explain why replication
gets desynchronized)
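
After a failed total update, the supplier records what went wrong on the
replication agreement entry itself, so reading those status attributes is a
good first step. A minimal python-ldap sketch, assuming you run it against
the supplier; the host and credentials are placeholders:
---------
# Dump the last init/update status of every replication agreement on a
# supplier (a minimal sketch; the bind password is a placeholder).
import ldap

conn = ldap.initialize("ldap://node01.ldap.colorado.br:389")
conn.simple_bind_s("cn=Directory Manager", "PLACEHOLDER_PASSWORD")

# Replication agreements live under the mapping tree in cn=config.
agreements = conn.search_s(
    "cn=mapping tree,cn=config",
    ldap.SCOPE_SUBTREE,
    "(objectclass=nsds5replicationagreement)",
    ["nsds5replicaLastInitStatus", "nsds5replicaLastUpdateStatus"],
)
for dn, attrs in agreements:
    print(dn)
    for name, values in attrs.items():
        print("  %s: %s" % (name, values[0].decode()))
conn.unbind_s()
---------
The nsds5replicaLastInitStatus value on the agreement pointing at the
failed consumer usually explains why the bulk import was abandoned.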

Regards,
  Pierre


On Mon, Dec 9, 2024 at 5:20 PM Luiz Gustavo Quirino via 389-users <
389-users@lists.fedoraproject.org> wrote:

> I’m facing issues with replication in the following scenario:
>
>     3 Linux nodes (Rocky) running version 2.4.5 B2024.198.0000 of 389.
>     Replication is configured in a ring topology:
>     node01 -> node02 -> node03 -> node01.
>     Password changes are made via the PWM-Project web interface.
>
> Problem:
> At some point, the synchronization between nodes is lost.
> When I attempt to restart replication, the database on the node being
> updated crashes.
> For example, when initializing replication from node01 to node02, the
> following error occurs:
> ---------
> [09/Dec/2024:11:32:30.382466035 -0300] - DEBUG - bdb_ldbm_back_wire_import
> - bdb_bulk_import_queue returned 0 with entry
> uid=app.tzv.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.387198997 -0300] - DEBUG - bdb_ldbm_back_wire_import
> - bdb_bulk_import_queue returned 0 with entry
> uid=app.poc.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR
> bulk import abandoned
> [09/Dec/2024:11:32:30.557600717 -0300] - ERR - bdb_import_run_pass -
> import userroot: Thread monitoring returned: -23
>
> [09/Dec/2024:11:32:30.559453847 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Aborting all Import threads...
> [09/Dec/2024:11:32:36.468531612 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Import threads aborted.
> [09/Dec/2024:11:32:36.470641812 -0300] - INFO - bdb_public_bdb_import_main
> - import userroot: Closing files...
> [09/Dec/2024:11:32:36.553007637 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Import failed.
> [09/Dec/2024:11:32:36.574692177 -0300] - DEBUG - NSMMReplicationPlugin -
> consumer_connection_extension_destructor - Aborting total update in
> progress for replicated area dc=colorado,dc=local connid=7019159
> [09/Dec/2024:11:32:36.577255941 -0300] - ERR - process_bulk_import_op -
> NULL target sdn
> [09/Dec/2024:11:32:36.579573401 -0300] - DEBUG - NSMMReplicationPlugin -
> replica_relinquish_exclusive_access - conn=7019159 op=-1
> repl="dc=colorado,dc=local": Released replica held by
> locking_purl=conn=7019159 id=3
> [09/Dec/2024:11:32:36.600514849 -0300] - ERR - pw_get_admin_users - Search
> failed for
> cn=GRP_SRV_PREHASHED_PASSWORD,ou=389,OU=GRUPOS,ou=colorado,dc=colorado,dc=local:
> error 10 - Password Policy Administrators can not be set
> [09/Dec/2024:11:32:36.757883417 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoding payload...
> [09/Dec/2024:11:32:36.760105387 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded protocol_oid: 2.16.840.1.113730.3.6.1
> [09/Dec/2024:11:32:36.762467539 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded repl_root: dc=colorado,dc=local
> [09/Dec/2024:11:32:36.765113155 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded csn: 6756ff84000001910000
> [09/Dec/2024:11:32:36.767727935 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: RUV:
> [09/Dec/2024:11:32:36.769205061 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replicageneration} 6748f91f000001910000
> [09/Dec/2024:11:32:36.770721824 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 401 ldap://node01.ldap.colorado.br:389}
> 6748f921000001910000 6756ff6f000101910000 00000000
> [09/Dec/2024:11:32:36.772753378 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 403 ldap://node03-ldap:389}
> 6748f9db000101930000 6756ff79000001930000 00000000
> [09/Dec/2024:11:32:36.774289526 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 402 ldap://node02.ldap.colorado.br:389}
> 6748f996000101920000 6756ff34000001920000 00000000
> [09/Dec/2024:11:32:36.775750926 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - Finshed decoding payload.
> [09/Dec/2024:11:32:36.777404849 -0300] - DEBUG - NSMMReplicationPlugin -
> consumer_connection_extension_acquire_exclusive_access - conn=7019230 op=4
> Acquired consumer connection extension
> [09/Dec/2024:11:32:36.779856975 -0300] - DEBUG - NSMMReplicationPlugin -
> multisupplier_extop_StartNSDS50ReplicationRequest - conn=7019230 op=4
> repl="dc=colorado,dc=local": Begin incremental protocol
> [09/Dec/2024:11:32:36.781999075 -0300] - DEBUG - _csngen_adjust_local_time
> - gen state before 6756ff7b0001:1733754747:0:0
> [09/Dec/2024:11:32:36.784626039 -0300] - DEBUG - _csngen_adjust_local_time
> - gen state after 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.786708353 -0300] - DEBUG - csngen_adjust_time - gen
> state before 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.788232997 -0300] - DEBUG - csngen_adjust_time - gen
> state after 6756ff840001:1733754756:0:0
> [09/Dec/2024:11:32:36.790217310 -0300] - DEBUG - NSMMReplicationPlugin -
> replica_get_exclusive_access - conn=7019230 op=4
> repl="dc=colorado,dc=local": Acquired replica
> ----------
> To restore synchronization, I need to delete all replication
> configurations and recreate them. However, the issue reappears after some
> time.
>
> I’d appreciate any suggestions on how to identify and resolve this issue
> permanently.
>
> Thanks.


-- 
389 Directory Server Development Team