On 10/18/21 5:18 AM, Kees Bakker wrote:
On 12-10-2021 14:29, Kees Bakker wrote:
On 11-10-2021 20:58, Viktor Ashirov wrote:
On Mon, Oct 11, 2021 at 8:33 PM Mark Reynolds <[email protected]> wrote:


    On 10/11/21 2:29 PM, Kees Bakker wrote:
    On 11-08-2021 15:21, Mark Reynolds wrote:
    On 8/11/21 9:09 AM, Pierre Rogier wrote:
    Hi,
    I suspect that Kees discovered a new bug ...

    As the retro changelog is persistent across reboot,
    it is normal that their timestamp are absolute, so IMHO there
    is no way relative time should be used/seen in trimming
    computation.
    (in other words cur_time is wrong. )
    My suspicion is that we got a regression when the slapi_eq
    timers where  rewritten to use relative time instead of
    absolute time) (cur_time is provided by slapi_eq API and is
    now relative ( it should rather be initialized to time(0) )
    Looking at the code we are mixing the use of the same
    monotonic time for the interval (correct) and the age of an
    entry (not correct).
    The funny thing is that nobody have seen that retrocl
    trimming was broken for so long ...

    That's because it silently just builds up and no one notices :-(

    I opened a ticket for this:
    https://github.com/389ds/389-ds-base/issues/4869

    This has been fixed a while ago. And it was cherry-picked in
    several release branches.
    For me that would be 1.4.4. There is a tag 1.4.4.17 but it
    hasn't been published.

    Does anyone know how the release process goes with epel -
    Modular? That's were I'm expecting it to show up.
    There hasn't been any update in weeks, as far as I can see.

    The module build was just done on October 9th.  I think it takes
    a little more time for that to appear in EPEL ...

Here's the bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a
Module is currently in testing. You can update to it using:
# dnf --enablerepo='epel-testing-modular' update 389-ds-base cockpit-389-ds
HTH
Thanks. I'll give that a try.

Snif. I installed this 1.4.4.17 update but ns-slapd now hits a deadlock and becomes unresponsive.

How are you reproducing this deadlock?  Please provide some stack traces:

https://www.port389.org/docs/389ds/FAQ/faq.html#sts=Debugging%C2%A0Hangs

Thanks,

Mark



    Regards,
    Mark

    -- Kees

    Mark

    Regards Pierre,

    On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <[email protected]>
    wrote:

        On 09-08-2021 20:23, Mark Reynolds wrote:
        > On 8/9/21 1:47 PM, Kees Bakker wrote:
        >> On 09-08-2021 19:25, Mark Reynolds wrote:
        >>> On 8/9/21 1:16 PM, Kees Bakker wrote:
        >>>> On 09-08-2021 18:43, Mark Reynolds wrote:
        >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote:
        >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote:
        >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote:
        >>>>>>>> Hi,
        >>>>>>>>
        >>>>>>>> When my dirsrv was trying to compact the
        databases I was getting
        >>>>>>>> this
        >>>>>>>> error
        >>>>>>>>
        >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> Compacting databases ...
        >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> Compacting DB start: userRoot
        >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> compactdb: compact userRoot - 417 pages freed
        >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> Compacting DB start: ipaca
        >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> compactdb: compact ipaca - 419 pages freed
        >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> Compacting DB start: changelog
        >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE -
        >>>>>>>> NSMMReplicationPlugin - changelog program -
        cl5CompactDBs -
        >>>>>>>> compacting
        >>>>>>>> replication changelogs...
        >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR -
        libdb - BDB2055 Lock
        >>>>>>>> table is out of available lock entries
        >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR -
        bdb_compact -
        >>>>>>>> compactdb: failed to compact changelog; db error
        - 12 Cannot
        >>>>>>>> allocate
        >>>>>>>> memory
        >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR -
        libdb - BDB2055 Lock
        >>>>>>>> table is out of available lock entries
        >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR -
        >>>>>>>> NSMMReplicationPlugin -
        >>>>>>>> changelog program - cl5CompactDBs - Failed to
        compact
        >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error -
        12 Cannot allocate
        >>>>>>>> memory
        >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE
        - bdb_compact -
        >>>>>>>> Compacting databases finished.
        >>>>>>>>
        >>>>>>>> There are about 402k entries in cn=changelog.
        >>>>>>>>
        >>>>>>>> I have a few questions
        >>>>>>>> 1) is it normal to have so many entries in
        cn=changelog? On another
        >>>>>>>> replica I have almost 3M entries Isn't this
        cleaned up?
        >>>>>>>> 2) the number of locks is 50000 (there are two
        config items).
        >>>>>>>> Should I
        >>>>>>>> increase that number? If so, increase to what?
        >>>>>>>> 3) is there maybe something else going on,
        causing the
        >>>>>>>> exhaustion of
        >>>>>>>> the locks?
        >>>>>>>
        >>>>>>> Ok, so by default there is no changelog trimming
        enabled. So the
        >>>>>>> changelog will grow without bounds, which is bad.
        >>>>>>
        >>>>>> How much of this [1] applies? Indeed it says "By
        default the
        >>>>>> Replication Changelog does not use any trimming by
        default."
        >>>>>> So, how is the trimming actually enabled? I have
        these entries
        >>>>>>
        >>>>>> dn: cn=changelog5,cn=config
        >>>>>> cn: changelog5
        >>>>>> nsslapd-changelogdir:
        /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb
        >>>>>> nsslapd-changelogmaxage: 30d
        >>>>>> objectClass: top
        >>>>>> objectClass: extensibleobject
        >>>>>
        >>>>> So trimming is set, but it's set to 30 days (30d), that
        >>>>> could/should be
        >>>>> shortened to "7d".
        >>>>
        >>>> Somehow I doubt that this will do anything.
        >>>
        >>> Ahhh, I thought this was the Replication Changelog,
        but it is the
        >>> "Retro" Changelog.  Ok so check this out:
        >>>
        >>>
        
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog
        >>>
        >>>
        >>> Or update config directly using ldapmodify:
        >>>
        >>> # ldapmodify -D ...
        >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
        >>> changetype: modify
        >>> replace: nsslapd-changelogmaxage: 7d
        >>> nsslapd-changelogmaxage: 7d
        >>
        >> (( Typo? The replace line is without the value ))
        > yes, sorry...
        >>
        >> Hmm, that maxage was already at 2 days
        >
        > Ok, perhaps you are running into a known bug where
        indexes created by
        > plugins are not enabled.  Or might be a config issue?
        >
        > Try this first.  Set in the retro changelog entry:
        >
        > nsslapd-changelog-trim-interval: 300   <--- 5 minutes
        (Sorry to be picky, but it is
        nsslapd-changelogtrim-interval. It's default is 300.)
        >
        > Restart the server to make sure it was applied.  You
        can enable "plugin"
        > logging to see what it does.
        >
        > nsslapd-errorlog-level: 65536
        >
        > wait over 5 minutes, then grep the errors log for
        "DSRetroclPlugin".
        > This will tell us what the plugin is doing when it goes
        to trim the
        > database.
        >
        > IMPORTANT, disable plugin logging immediately after you
        get your data.
        > Set the errorlog level to 0 to go back to default logging.
        >
        > If the logging shows it's not finding any candidates
        then it might be
        > the indexing issue I mentioned above.  In that case you
        just need to
        > reindex the retro changelog:

        Before changing anything I enabled Plug-in debugging
        which revealed something interesting

        [root@linge ~]# grep trim
        /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108
        [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG -
        DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141,
        cur_time=3702600
        [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG -
        DSRetroclPlugin - retrocl_housekeeping - changelog does
        not need to be trimmed
        [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG -
        DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141,
        cur_time=3702900
        [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG -
        DSRetroclPlugin - retrocl_housekeeping - changelog does
        not need to be trimmed

        This shows that the interval is 300 (even if
        nsslapd-changelogtrim-interval is not explicitly set in
        cn=changelog5,cn=config

        Look at cur_time. That is the "uptime" of the system, not
        UTC.

        [root@linge ~]# uptime
          11:21:42 up 42 days, 20:49,  2 users, load average:
        0.15, 0.14, 0.16

        Next, assuming this code from retrocl_trim.c is making
        the decision to trim

                     first_time =
        retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc);
        slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME,
                                   "cltrim: ldrc=%d,
        first_time=%ld, cur_time=%ld\n",
                                   ldrc, first_time, cur_time);
                     if (LDAP_SUCCESS == ldrc && first_time >
        (time_t)0L &&
                         first_time + ts.ts_c_max_age < cur_time) {
                         must_trim = 1;
                     }

        There is a mixup of absolute (first_time) and relative
        (cur_time) time. Which of the two is wrong?

        >
        > # dsctl slapd-YOUR_INSTANCE stop
        >
        > # dsctl slapd-YOUR_INSTANCE db2index changelog
        >
        > # dsctl slapd-YOUR_INSTANCE start
        >
        > Then wait 5 minutes and see if anything was trimmed.
        >
        > Mark
        >
        >>
        >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
        >> cn: Retro Changelog Plugin
        >> nsslapd-attribute: nsuniqueid:targetUniqueId
        >> nsslapd-changelogmaxage: 2d
        >> nsslapd-include-suffix: cn=dns,dc=example,dc=com
        >> nsslapd-plugin-depends-on-named: Class of Service
        >> nsslapd-plugin-depends-on-type: database
        >> nsslapd-pluginDescription: Retrocl Plugin
        >> nsslapd-pluginEnabled: on
        >> nsslapd-pluginId: retrocl
        >> nsslapd-pluginInitfunc: retrocl_plugin_init
        >> nsslapd-pluginPath: libretrocl-plugin
        >> nsslapd-pluginType: object
        >> nsslapd-pluginVendor: 389 Project
        >> nsslapd-pluginVersion: 1.4.3.231
        >> nsslapd-pluginbetxn: on
        >> nsslapd-pluginprecedence: 25
        >> objectClass: top
        >> objectClass: nsSlapdPlugin
        >> objectClass: extensibleObject
        >>
        >>>
        >>>
        >>> HTH,
        >>>
        >>> Mark
        >>>
        >>>>
        >>>> The first entry in cn=changelog is more than two
        months old. (May 26th
        >>>> is about
        >>>> the time I installed this server as a FreeIPA replica.)
        >>>>
        >>>> dn: changenumber=1,cn=changelog
        >>>> objectClass: top
        >>>> objectClass: changelogentry
        >>>> objectClass: extensibleObject
        >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4
        >>>> changeNumber: 1
        >>>> targetDn:
        >>>> idnsname=_ldap._tcp,idnsname=example.com
        <http://example.com>.,cn=dns,dc=example,dc=com
        >>>> changeTime: 20210526122901Z
        >>>> changeType: modify
        >>>> changes::
        YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
        >>>>
        >>>> Is there a way to see trimming in action? I'm afraid
        it is not doing
        >>>> anything.
        >>
        >> What diagnostic can I enable to see the trimming in
        action?
        >>
        >>>>
        >>>>>
        >>>>>>
        >>>>>>
        >>>>>>>
        >>>>>>> I recommend setting up the changelog max age to 7
        days:
        >>>>>>>
        >>>>>>>
        
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog
        >>>>>>>
        >>>>>>>
        >>>>>>>
        >>>>>>>
        >>>>>>> Once that the trimming cleans up the changelog
        the database
        >>>>>>> compaction
        >>>>>>> should work without issue.  You can also increase
        the database
        >>>>>>> locks to
        >>>>>>> 1 million (that does require a restart of the
        server to take
        >>>>>>> effect).
        >>>>>>
        >>>>>> Let's hope that 1 million is enough :-)
        >>>>>
        >>>>> It might not be, you can keep bumping it up if
        needed. There is no
        >>>>> limit, or negative impact on db performance.
        >>>>>
        >>>>> Mark
        >>>>>
        >>>>>>
        >>>>>>>
        >>>>>>> HTH,
        >>>>>>>
        >>>>>>> Mark
        >>>>>>>
        >>>>>> [1]
        >>>>>>
        
https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
        >>>>>>
        >>>>>>
        >>>>>
        >>>>> --
        >>>>> Directory Server Development Team
        >>>>>
        >>>> _______________________________________________
        >>>> 389-users mailing list --
        [email protected]
        >>>> To unsubscribe send an email to
        [email protected]
        >>>> Fedora Code of Conduct:
        >>>>
        https://docs.fedoraproject.org/en-US/project/code-of-conduct/
        >>>> List Guidelines:
        https://fedoraproject.org/wiki/Mailing_list_guidelines
        >>>> List Archives:
        >>>>
        
https://lists.fedoraproject.org/archives/list/[email protected]
        >>>>
        >>>> Do not reply to spam on the list, report it:
        >>>> https://pagure.io/fedora-infrastructure
        >>>
        >>> --
        >>> Directory Server Development Team
        >>>
        >>
        > --
        > Directory Server Development Team
        >
        _______________________________________________
        389-users mailing list -- [email protected]
        To unsubscribe send an email to
        [email protected]
        Fedora Code of Conduct:
        https://docs.fedoraproject.org/en-US/project/code-of-conduct/
        List Guidelines:
        https://fedoraproject.org/wiki/Mailing_list_guidelines
        List Archives:
        
https://lists.fedoraproject.org/archives/list/[email protected]
        Do not reply to spam on the list, report it:
        https://pagure.io/fedora-infrastructure



-- --

    389 Directory Server Development Team
-- Directory Server Development Team

    _______________________________________________
    389-users mailing list [email protected]
    To unsubscribe send an email [email protected]
    Fedora Code of 
Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/
    List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines
    List 
Archives:https://lists.fedoraproject.org/archives/list/[email protected]
    Do not reply to spam on the list, report 
it:https://pagure.io/fedora-infrastructure


    _______________________________________________
    389-users mailing list [email protected]
    To unsubscribe send an email [email protected]
    Fedora Code of 
Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/
    List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines
    List 
Archives:https://lists.fedoraproject.org/archives/list/[email protected]
    Do not reply to spam on the list, report 
it:https://pagure.io/fedora-infrastructure

-- Directory Server Development Team

    _______________________________________________
    389-users mailing list -- [email protected]
    To unsubscribe send an email to
    [email protected]
    Fedora Code of Conduct:
    https://docs.fedoraproject.org/en-US/project/code-of-conduct/
    List Guidelines:
    https://fedoraproject.org/wiki/Mailing_list_guidelines
    List Archives:
    
https://lists.fedoraproject.org/archives/list/[email protected]
    Do not reply to spam on the list, report it:
    https://pagure.io/fedora-infrastructure



--
Viktor


_______________________________________________
389-users mailing list [email protected]
To unsubscribe send an email [email protected]
Fedora Code of 
Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines
List 
Archives:https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam on the list, report 
it:https://pagure.io/fedora-infrastructure


_______________________________________________
389-users mailing list [email protected]
To unsubscribe send an email [email protected]
Fedora Code of 
Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines
List 
Archives:https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam on the list, report 
it:https://pagure.io/fedora-infrastructure

--
Directory Server Development Team
_______________________________________________
389-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure

Reply via email to