The below bug linked to this bug: https://pagure.io/389-ds-base/issue/49915
Running "top -H", I could see that a thread hit 100% CPU on the primary 
provider during re-initialization of the secondary provider:

top - 19:14:37 up 13 days,  5:00,  2 users,  load average: 0.75, 0.39, 0.64
Threads: 406 total,   2 running, 404 sleeping,   0 stopped,   0 zombie
%Cpu(s): 29.4 us,  0.2 sy,  0.0 ni, 70.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 41027760 total,   274436 free, 26755384 used, 13997940 buff/cache
KiB Swap:  4095996 total,  4007932 free,    88064 used. 13755740 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
1324 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5  32:09.15 ns-slapd
1422 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   7:13.29 ns-slapd
1423 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   4:49.31 ns-slapd
1424 dirsrv    20   0   29.3g  26.4g   1.7g S 17.6 67.5   3440:05 ns-slapd
1425 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   0:38.04 ns-slapd
1427 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   0:00.00 ns-slapd
1428 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   0:13.75 ns-slapd
1429 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   1:58.45 ns-slapd
1430 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   1:53.75 ns-slapd
1431 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   1:55.51 ns-slapd
1432 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   1:56.32 ns-slapd
1433 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   1:55.66 ns-slapd
1434 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   2:11.95 ns-slapd
1435 dirsrv    20   0   29.3g  26.4g   1.7g R 99.9 67.5  97:37.64 ns-slapd
1436 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   0:00.00 ns-slapd
1437 dirsrv    20   0   29.3g  26.4g   1.7g S  0.0 67.5   0:00.00 ns-slapd

I don't want to do an db2ldif during production time since it will put the db 
into read-only mode.
The primary provider will do a backup at 01:00 (with -r) so I'll grab that and 
try to ldif2db the secondary provider with it.

Thanks,
Trev


From: "Fong, Trevor" <[email protected]>
Reply-To: "[email protected]" 
<[email protected]>
Date: Thursday, September 17, 2020 at 7:02 PM
To: "[email protected]" <[email protected]>
Subject: [389-users] Re: Re-initialization Failure: Bulk Import Abandoned, 
Thread Monitoring Returned -23

Hi Marc,

Thanks for your suggestions.
We had disabled autotuning, but even re-enabling autotuning we are seeing the 
same errors.

Just wondering if we are hitting this bug? 
https://pagure.io/389-ds-base/issue/49796

Thanks everyone!
Trev

From: Marc Sauton <[email protected]>
Reply-To: "[email protected]" 
<[email protected]>
Date: Thursday, September 17, 2020 at 5:23 PM
To: "[email protected]" <[email protected]>
Subject: [389-users] Re: Re-initialization Failure: Bulk Import Abandoned, 
Thread Monitoring Returned -23

The average rate and error code seem to point to a configuration issue.
Try to verify the import and cache settings in the console, they may be very 
small if the auto tuning feature is disabled.
for manual settings, the values depend on the amount of RAM available to the 
LDAP server, and the db, index sizes.

"small" db example of parameters and values:

dn: cn=config,cn=ldbm database,cn=plugins,cn=config
nsslapd-import-cachesize: 500000000

dn: cn=userroot,cn=ldbm database,cn=plugins,cn=config
nsslapd-cachememsize: 134217728

dn: cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config
nsslapd-import-cache-autosize: 0
nsslapd-import-cachesize: 500000000
nsslapd-cache-autosize: 10
nsslapd-cache-autosize-split: 25
nsslapd-dbcachesize: 38314475

documentation references:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/performance_tuning_guide/memoryusage#DB_and_entry_cache_RAM_usage
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/configuration_command_and_file_reference/database_plug_in_attributes#nsslapd_cache_autosize


Thanks,
M.

On Thu, Sep 17, 2020 at 4:10 PM Fong, Trevor 
<[email protected]<mailto:[email protected]>> wrote:
Hi Everyone,

I'm having an issue re-initializing our secondary muti-master replicated 389 DS 
provider node via 389-Console > replication agreement > "Initialize Consumer".
It eventually aborts the update with an error "import userRoot: Thread 
monitoring returned: -23"
Would anyone know how to fix it?  Or what the issue may be?  What we could try?

Thanks in advance,
Trev

We're running 389-Directory/1.3.10.1<http://1.3.10.1> B2020.167.146
Here's that the error log says on the node receiving re-initialization:

[17/Sep/2020:15:21:07.489737002 -0700] - NOTICE - NSMMReplicationPlugin - 
multimaster_be_state_change - Replica dc=<redacted> is going offline; disabling 
replication
[17/Sep/2020:15:21:07.514586270 -0700] - INFO - dblayer_instance_start - Import 
is running with nsslapd-db-private-import-mem on; No other process is allowed 
to access the database
[17/Sep/2020:15:21:27.554375232 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.1/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:21:47.593876983 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:22:07.630801176 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:22:27.667537260 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:22:47.704917493 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:23:07.746084506 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:23:27.785902082 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:23:47.830564570 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:24:07.868457613 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:24:27.907239617 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:24:47.948025735 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:25:07.986469285 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:25:28.022970040 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:25:48.055641123 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:26:08.091707793 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries -- average rate 0.0/sec, recent rate 0.0/sec, hit 
ratio 0%
[17/Sep/2020:15:26:08.093192417 -0700] - INFO - import_throw_in_towel - import 
userRoot: Decided to end this pass because the progress rate has dropped below 
the 50% threshold.
[17/Sep/2020:15:26:08.094010231 -0700] - INFO - import_monitor_threads - import 
userRoot: Ending pass number 1 ...
[17/Sep/2020:15:26:08.195022638 -0700] - INFO - import_monitor_threads - import 
userRoot: Foreman is done; waiting for workers to finish...
[17/Sep/2020:15:26:08.196221021 -0700] - INFO - import_monitor_threads - import 
userRoot: Workers finished; cleaning up...
[17/Sep/2020:15:26:08.397372463 -0700] - INFO - import_monitor_threads - import 
userRoot: Workers cleaned up.
[17/Sep/2020:15:26:08.398525016 -0700] - INFO - import_sweep_after_pass - 
import userRoot: Sweeping files for merging later...
[17/Sep/2020:15:26:08.409694998 -0700] - INFO - dblayer_instance_start - Import 
is running with nsslapd-db-private-import-mem on; No other process is allowed 
to access the database
[17/Sep/2020:15:26:08.411165210 -0700] - INFO - import_sweep_after_pass - 
import userRoot: Sweep done.
[17/Sep/2020:15:26:08.412153776 -0700] - INFO - import_main_offline - import 
userRoot: Beginning pass number 2
[17/Sep/2020:15:26:28.444537073 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 214748364.8/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:26:48.487800198 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 107374182.4/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:27:08.524672179 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 71582788.2/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:27:28.560669386 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 53687091.2/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:27:48.598581175 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 42949673.0/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:28:08.637175249 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 35791394.1/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:28:28.674637440 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 30678337.8/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:28:48.713863865 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 26843545.6/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:29:08.750649257 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 23860929.4/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:29:28.788308057 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 21474836.5/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:29:48.824483405 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 19522578.6/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:30:08.861454259 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 17895697.1/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:30:28.898478351 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 16519105.0/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:30:48.934493905 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 15339168.9/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:31:08.974176434 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 14316557.7/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:31:29.026523856 -0700] - INFO - import_monitor_threads - import 
userRoot: Processed 1 entries (pass 2) -- average rate 13379960.4/sec, recent 
rate 0.0/sec, hit ratio 0%
[17/Sep/2020:15:31:29.027647083 -0700] - INFO - import_throw_in_towel - import 
userRoot: Decided to end this pass because the progress rate has dropped below 
the 50% threshold.
[17/Sep/2020:15:31:29.028635146 -0700] - INFO - import_monitor_threads - import 
userRoot: Ending pass number 2 ...
[17/Sep/2020:15:31:29.129868576 -0700] - INFO - import_monitor_threads - import 
userRoot: Foreman is done; waiting for workers to finish...
[17/Sep/2020:15:31:29.131344494 -0700] - INFO - import_monitor_threads - import 
userRoot: Workers finished; cleaning up...
[17/Sep/2020:15:31:29.333383134 -0700] - INFO - import_monitor_threads - import 
userRoot: Workers cleaned up.
[17/Sep/2020:15:31:29.334756063 -0700] - INFO - import_sweep_after_pass - 
import userRoot: Sweeping files for merging later...
[17/Sep/2020:15:31:29.346472780 -0700] - INFO - dblayer_instance_start - Import 
is running with nsslapd-db-private-import-mem on; No other process is allowed 
to access the database
[17/Sep/2020:15:31:29.348439407 -0700] - INFO - import_sweep_after_pass - 
import userRoot: Sweep done.
[17/Sep/2020:15:31:29.349300174 -0700] - INFO - import_main_offline - import 
userRoot: Beginning pass number 3
[17/Sep/2020:15:31:36.933419931 -0700] - ERR - factory_destructor - ERROR bulk 
import abandoned
[17/Sep/2020:15:31:36.967711128 -0700] - ERR - import_run_pass - import 
userRoot: Thread monitoring returned: -23

[17/Sep/2020:15:31:36.970482988 -0700] - ERR - import_main_offline - import 
userRoot: Aborting all Import threads...
[17/Sep/2020:15:31:44.285307200 -0700] - ERR - import_main_offline - import 
userRoot: Import threads aborted.
[17/Sep/2020:15:31:44.286444487 -0700] - INFO - import_main_offline - import 
userRoot: Closing files...
[17/Sep/2020:15:31:44.288138525 -0700] - ERR - import_main_offline - import 
userRoot: Import failed.
[17/Sep/2020:15:31:44.289342372 -0700] - ERR - process_bulk_import_op - NULL 
target sdn
[17/Sep/2020:15:31:44.312329369 -0700] - ERR - NSMMReplicationPlugin - 
replica_replace_ruv_tombstone - Failed to update replication update vector for 
replica dc=<redacted>: LDAP error - 1
[17/Sep/2020:15:31:47.320263729 -0700] - ERR - NSMMReplicationPlugin - 
replica_replace_ruv_tombstone - Failed to update replication update vector for 
replica dc=<redacted>: LDAP error - 1
[17/Sep/2020:15:31:50.325999803 -0700] - ERR - NSMMReplicationPlugin - 
replica_replace_ruv_tombstone - Failed to update replication update vector for 
replica dc=<redacted>: LDAP error - 1



Trevor Fong
Senior Programmer Analyst, Identity and Access Management
Cybersecurity | CISO Office
The University of British Columbia | Musqueam Traditional Territory
413 - 6356 Agricultural Road | Vancouver BC | V6T 1Z2 Canada
Phone 604 827 5247
Privacy Matters @ UBC
_______________________________________________
389-users mailing list -- 
[email protected]<mailto:[email protected]>
To unsubscribe send an email to 
[email protected]<mailto:[email protected]>
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/[email protected]
_______________________________________________
389-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/[email protected]

Reply via email to