On 1/9/20 12:38 PM, Elena Rico wrote:
Hello,

We have noticed some inconsistent results while restoring from a backup. We are 
running the below script which takes backups nightly and then we use the 
'bak2db' to restore.
Version: 389-Directory/1.3.8.4 B2018.332.2046

We have noticed 2 issues:
1) Segmentation fault
        The first time we execute 'bak2db' or 'bak2db.pl' both generate the 
below error:

        [root@iris-ldap-01b.lab1 bak]# bak2db 
/var/lib/dirsrv/slapd-iris-ldap-01b/bak//iris-ldap-01b-2020_01_08_06_00_02 -n 
userRoot -Z $(hostname -s)
        [09/Jan/2020:16:59:58.470938641 +0000] - ERR - config_set - The 
attribute nsslapd-accesslog-list is read only; ignoring new value 
/var/log/dirsrv/slapd-iris-ldap-01b/access.20190630-191314.test
        [09/Jan/2020:16:59:58.474013017 +0000] - ERR - config_set - The 
attribute nsslapd-accesslog-list is read only; ignoring new value /var/log/test
        [09/Jan/2020:16:59:58.666392572 +0000] - INFO - 
ldbm_instance_config_cachememsize_set - force a minimal value 512000
        [09/Jan/2020:16:59:58.694699743 +0000] - INFO - 
ldbm_instance_config_cachememsize_set - force a minimal value 512000
        [09/Jan/2020:16:59:58.727591335 +0000] - ERR - dblayer_file_check - 
Previous import or restore failed, file: 
/var/lib/dirsrv/slapd-iris-ldap-01b/db/../.restore is empty
        [09/Jan/2020:16:59:58.759491776 +0000] - INFO - dblayer_pre_close - All 
database threads now stopped
        [09/Jan/2020:16:59:58.763162191 +0000] - NOTICE - 
dblayer_delete_database_ex - Skipping instance NetscapeRoot
        /sbin/bak2db: line 87: 17865 Segmentation fault      /usr/sbin/ns-slapd 
archive2db -D /etc/dirsrv/slapd-iris-ldap-01b -a 
/var/lib/dirsrv/slapd-iris-ldap-01b/bak/iris-ldap-01b-2020_01_08_06_00_02 -n
  "userRoot"
        
        We have noticed that stopping the server and using 'bak2db' a second 
time does restore, but it has to be executed 2 times.
* Do you know why we are getting this error?

Remove "-n userRoot" from the restore command. This "false option" was removed in newer versions of 389-ds-base because it could cause crashes; see https://pagure.io/389-ds-base/issue/50063
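
For illustration, a restore without the backend option might be scripted along these lines. This is only a sketch: the restore_backup function name, the DRY_RUN switch, and the dirsrv@<instance> systemd unit name are assumptions and are not taken from the original script. bak2db is called with -Z as in the command quoted above, just without "-n userRoot".

```shell
#!/bin/bash
# Hypothetical helper: offline restore of one instance without "-n userRoot".
# Set DRY_RUN=1 to print the commands instead of running them.
restore_backup() {
    local instance=$1 backup_dir=$2
    # $run expands to "echo" in dry-run mode and to nothing otherwise.
    local run=${DRY_RUN:+echo}

    # ASSUMPTION: the instance is managed by the dirsrv@<instance> systemd unit.
    $run systemctl stop "dirsrv@${instance}" || return 1
    # bak2db restores offline, so the server is stopped first.
    $run bak2db "$backup_dir" -Z "$instance" || return 1
    $run systemctl start "dirsrv@${instance}"
}
```

Running it as DRY_RUN=1 restore_backup iris-ldap-01b /path/to/backup first shows the exact commands before anything is touched.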

I also have some comments inline in your backup script below.


2) In some instances the restore is not successful. And this is inconsistent, 
sometimes it works and others it does not. Below are the errors:
        
        [09/Jan/2020:16:46:10.349362830 +0000] - ERR - libdb - BDB2506 file 
/var/lib/dirsrv/slapd-iris-ldap-01b/changelogdb/17a67641-223b11e8-966b9f6a-c9bf4dc9_585bee1f000000020000.db
 has LSN 241/3829568, past end of log at 241/2373072
        [09/Jan/2020:16:46:10.351404473 +0000] - ERR - libdb - BDB2507 Commonly 
caused by moving a database from one database environment

Was the backup taken from a different instance or server?

        [09/Jan/2020:16:46:10.352951735 +0000] - ERR - libdb - BDB2508 to 
another without clearing the database LSNs, or by removing all of
        [09/Jan/2020:16:46:10.356241586 +0000] - ERR - libdb - BDB2509 the log 
files from a database environment
        [09/Jan/2020:16:46:10.358146068 +0000] - ERR - libdb - BDB1521 Recovery 
function for LSN 241 1284 failed
        [09/Jan/2020:16:46:10.360388060 +0000] - ERR - libdb - BDB0061 PANIC: 
Invalid argument
        [09/Jan/2020:16:46:10.362683721 +0000] - ERR - libdb - BDB1544 
process-private: unable to find environment
        [09/Jan/2020:16:46:10.365319736 +0000] - ERR - 
dblayer_make_private_recovery_env - Open error -30973: BDB0087 DB_RUNRECOVERY: 
Fatal error, run database recovery
        [09/Jan/2020:16:46:10.366912512 +0000] - ERR - dblayer_fri_restore - 
Recovery failed!
        [09/Jan/2020:16:46:10.368700472 +0000] - ERR - ldbm_back_archive2ldbm - 
Failed to read backup file set. Either the directory specified doesn't exist, 
or it exists but doesn't contain a valid backup set, or file permissions 
prevent the server reading the backup set.  error=-30973 (BDB0087 
DB_RUNRECOVERY: Fatal error, run database recovery)

* Is there anything we need to change to make sure we have correct backups?

It looks like a bad backup, but it's hard to say. You could run dbverify on your backup directory to see if it complains about anything:

Example:

# dbverify -Z iris-ldap-01b -a /var/lib/dirsrv/slapd-iris-ldap-01b/bak/iris-ldap-01b-2020_01_08_06_00_02 -n userroot -V

It might find something, or it might not.
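
As a rough sketch, the same check could be run against the most recent backup directory right after each nightly run. The function names newest_backup and verify_newest_backup are hypothetical, and the bak directory layout is assumed to match the paths shown in this thread. The dbverify flags are the ones from the example above.

```shell
#!/bin/bash
# Print the most recently modified subdirectory of a backup directory.
# Returns 1 if there is none. Assumes directory names without newlines.
newest_backup() {
    local bakdir=$1 newest
    newest=$(ls -1dt "$bakdir"/*/ 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "${newest%/}"   # strip the trailing slash added by the glob
}

# Hypothetical wrapper: run dbverify on the newest backup of one instance.
# ASSUMPTION: backups live in /var/lib/dirsrv/slapd-<instance>/bak.
verify_newest_backup() {
    local instance=$1 backend=${2:-userroot} backup
    backup=$(newest_backup "/var/lib/dirsrv/slapd-${instance}/bak") || return 1
    dbverify -Z "$instance" -a "$backup" -n "$backend" -V
}
```

A non-zero exit status from verify_newest_backup could then be used to flag the backup as suspect before it is ever needed for a restore.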


For backup we are running this script based on a response from the mailing list:

#!/bin/bash
#
# Creates a backup of 389-ds online
# Based on https://mjanja.ch/2013/09/backing-up-389-ldap/
#

timestamp=$(date +%Y_%m_%d_%H_%M_%S)

# Read credentials from file
source /usr/local/etc/ldap-backup.cfg

# Backup each instance
for dirsrv in /etc/dirsrv/slapd-*
do
    name=${dirsrv/*slapd-/}

    if [ "${name}" == "localhost" ];
    then
        continue
    fi

    vardir=/var/lib/dirsrv/slapd-${name}
    [ -d /var/lib/dirsrv/scripts-${name} ] && 
scriptdir=/var/lib/dirsrv/scripts-${name}
    [ -d /usr/lib64/dirsrv/slapd-${name} ] && 
scriptdir=/usr/lib64/dirsrv/slapd-${name}
    [ -d /usr/lib/dirsrv/slapd-${name} ] && 
scriptdir=/usr/lib/dirsrv/slapd-${name}

    ${scriptdir}/db2bak.pl -D "cn=${ldap_user}" -w "${ldap_password}" \
        -a "${vardir}/bak/${name}-${timestamp}" -P LDAPI

db2bak.pl only launches a task to perform the backup, and the script returns before that task has even started. The db2ldif calls below can therefore run while the backup is still in progress, which is a potential race condition. I'm not sure whether that is causing your issue, but db2ldif should only run after the backup (db2bak) has completed. Ideally all of this would be done while the server is stopped, but that is not a requirement.
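
One way to avoid the race is to poll until the backup task has finished before starting the exports. The following is a minimal sketch, not a tested procedure. The function names are hypothetical, and it assumes that backup tasks appear under cn=backup,cn=tasks,cn=config, that a finished task carries an nsTaskExitCode attribute, and that the LDAPI socket is /var/run/slapd-<instance>.socket. It reuses ldap_user and ldap_password from the script's config file.

```shell
#!/bin/bash
# Poll a command until it succeeds or TIMEOUT seconds have passed.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        if (( SECONDS >= deadline )); then
            return 1   # gave up: the condition never became true
        fi
        sleep "$interval"
    done
    return 0
}

# Succeeds once no backup task without an exit code is left.
# ASSUMPTIONS: task location, nsTaskExitCode semantics, and socket path
# as described above; adjust them to the actual deployment.
backup_tasks_finished() {
    local name=$1 out
    out=$(ldapsearch -x -LLL \
        -H "ldapi://%2fvar%2frun%2fslapd-${name}.socket" \
        -D "cn=${ldap_user}" -w "${ldap_password}" \
        -b "cn=backup,cn=tasks,cn=config" -s one \
        "(!(nsTaskExitCode=*))" dn) || return 1   # search failed: not finished
    ! grep -q '^dn:' <<< "$out"
}
```

In the backup script, a call such as wait_until 3600 5 backup_tasks_finished "${name}" would sit between the db2bak.pl call and the db2ldif loop. The one-hour timeout and five-second interval are arbitrary example values.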


Maybe others on this list have seen some of the other errors you are getting?

Regards,

Mark

    dbdir=${vardir}/db
    for dbentry in ${dbdir}/*
    do
       if [ -d ${dbentry} ] #check if directory exists
       then
          dbname=$(basename ${dbentry})
          ${scriptdir}/db2ldif -n ${dbname} \
              -a ${vardir}/ldif/${dbname}-${timestamp}.ldif
       fi
    done

    # Cleanup old entries
    /usr/sbin/tmpwatch -mM ${retention_hours} ${vardir}/bak
    /usr/sbin/tmpwatch -mM ${retention_hours} ${vardir}/ldif

done
_______________________________________________
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org

--

389 Directory Server Development Team