[389-users] Re: advice on 389 in production
Hi Morgan,

In our case we have ~60,000 entries, ~10,000 accounts and ~5,000 large groups (some containing almost all users). Three 389ds instances in active-active replication: extremely stable, performant, no problems at all. We are running RHEL 9 clones (Oracle Linux or AlmaLinux) with the latest OS patches (9.4, I think), and 389ds version 2.5 compiled from the git branch (I think the latest one is 2.5.1; maybe it is available as an rpm). No complaints at all :)

Regards,
AI

----- Original Message -----
> From: "Morgan Jones"
> To: "General discussion list for the 389 Directory server, project."
> <389-users@lists.fedoraproject.org>
> Sent: Wednesday, June 5, 2024 22:25:14
> Subject: [389-users] advice on 389 in production
>
> Hello Everyone,
>
> What operating system and 389 version is everyone running in production? We are
> finally updating our CentOS 7 servers in earnest.
>
> We have almost 200,000 users and use 389 for our central ldap, so stability is
> preferred over features.
>
> Based on release dates I'm leaning toward version 2.x.
>
> I've spent the afternoon trying to find packages for Rocky Linux 9 with limited
> success.
>
> We switched to Ubuntu a few years ago, so that would be my preference, but I don't
> see packages for Ubuntu and I'd prefer not to maintain my own packages.
>
> Is Docker a viable option for a production install? I see there is an up-to-date
> image which I've been able to start, but it appears to be 3.x (see above
> re: preferred production version).
>
> Thanks,
>
> -morgan
> --
> ___
> 389-users mailing list -- 389-users@lists.fedoraproject.org
> To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
> Do not reply to spam, report it:
> https://pagure.io/fedora-infrastructure/new_issue
[389-users] Re: Recent commits in stable 389ds branches - discussion
>>> 3) A new default plugin requirement, the plugin being written in Rust - probably
>>> its introduction is FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in
>>> FIPS mode).
>>
>> This was a very important fix to get into 1.4.4; usually big changes do not land
>> in 1.4.4 anymore, but this one needed to get in.
>
> This change was about the C code, not Rust code, if I recall correctly, since
> that's the inbuilt PBKDF2_SHA256 module, not the pwdchan one with openldap
> compat. The Rust pbkdf2 module has existed since early 1.4.4, and that's needed
> for openldap migration, which we at SUSE enable by default (I don't think RH do
> yet).

The changes to dse.ldif in that issue (3584) made libpwdchan-plugin required for the server (new entries in cn=Password Storage Schemes,cn=plugins,cn=config).

>>> See my comment
>>> https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust
>>> becomes a requirement for building the server, which is fine, but then it
>>> should be enabled by default in "./configure". Without it the server does not
>>> compile the new plugin and complains about it when starting:
>>> [01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could not
>>> open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for plugin
>>> PBKDF2
>>
>> Yes, I do understand this frustration, and it is now fixed for non-rust builds.
>
> I think this error specifically came about if you did a rust build, then you
> took rust away; it created some leftovers in dse.ldif, I think (?).

No, actually I have never installed rust on any of our production or build servers. dse.ldif now includes the rust-written plugins by default, but --enable-rust is not a default option in ./configure; that was in fact the problem.

Thank you, William!
Sincerely,
Andrey
[389-users] Re: Recent commits in stable 389ds branches - discussion
Hi Mark,

Thank you for your detailed reply. I do not have objections; it's more of a bit of feedback on the recent changes that I wanted to share, since some things were a bit unexpected for me. My comments are below.

On 12/3/21 6:29 AM, Ivanov Andrey (M.) wrote:

>> I'd like to discuss several recent (since a couple of months ago) commits in stable
>> branches of 389ds. I will be talking about 1.4.4
>> [https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4] since it's the
>> one we are using in production, but I think it's the same for 1.4.3. These
>> commits are welcome and go in the right direction; however, the changes they
>> produce are not something one expects when the server version changes in the 4th
>> digit (e.g. 1.4.4.17 -> 1.4.4.18). Here they are:
>
> I guess we don't follow the same principles :-) For the most part these are all
> minor RFEs, except for Rust, but Rust has been in use in our product (the 1.4.x
> series) for well over a year now, so I'm surprised to see issues arise about it
> now. But adding these RFEs is not out of line IMHO; obviously you feel a little
> different about that.

Yes, I would think these changes (especially the shift of certain files to /dev/shm and the rust dependency during the server build) should have landed in 1.4.5 (corresponding to RHEL 8.n -> RHEL 8.n+1). To take my experience as an example, the move of DB files to /dev/shm broke the startup of a newly created server (dscreate -f ...), since /dev/shm by default is only 50% of server memory. In my case the size of these files was more than 50% of memory, so I had to make adjustments (either increase the memory of the VM, or change the db_home_dir parameter to move the abovementioned files back to disk).
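For reference, the two adjustments above can be sketched roughly as follows. This is a hedged administration sketch, not tested commands from the thread: it assumes a tmpfs-backed /dev/shm, a bdb backend, and an instance named "instance"; the exact DN holding nsslapd-db-home-directory and the instance paths vary by 389ds version and installation.

```shell
# Option A: enlarge /dev/shm (tmpfs defaults to 50% of RAM).
# Persistent variant: add "tmpfs /dev/shm tmpfs defaults,size=75% 0 0"
# to /etc/fstab; one-off variant:
mount -o remount,size=75% /dev/shm

# Option B: point the DB home directory back at disk so the memory-mapped
# files are not placed under /dev/shm at all. On recent 1.4.x the
# attribute typically lives on the bdb config entry (older builds keep
# it on cn=config,cn=ldbm database,cn=plugins,cn=config):
ldapmodify -H ldapi://%2Frun%2Fslapd-instance.socket -Y EXTERNAL <<'EOF'
dn: cn=bdb,cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-home-directory
nsslapd-db-home-directory: /var/lib/dirsrv/slapd-instance/db
EOF
```

Either way, the instance has to be restarted afterwards for the change to take effect.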
As for rust: I have never had rust installed on my build server. I used my usual ./configure switches that worked for 1.4.4.17; the server compiled OK, but the error logs at startup were filled with "ERR"-severity messages. Anyway, the problem is resolved, and I think that, as you say, we don't have the same perception of change importance vs. server version change.

>> 1) Some database files [presumably memory-mapped files that are OK to be lost
>> at reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now
>> moved to /dev/shm/slapd-instance/. This modification seems to work fine (and
>> should increase performance), however there is an error message at server
>> startup when /dev/shm is empty (for example, after each OS reboot) and the
>> server needs to create the files:
>>
>> [03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not
>> open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable
>> Runtime -5950 (File not found.)
>>
>> After the next 389ds restart this ERR message does not appear, but it appears
>> after each OS reboot (since /dev/shm is cleaned up at each reboot).
>
> We can look into modifying this behavior, especially since it's not a fatal
> error. We can change the logging severity to NOTICE (from ERR) or something
> like that.

Yes, I think that if it is not critical, the logging severity for this particular message should be lowered to NOTICE. Every message with ERR-level severity makes me a bit nervous about server data sanity and integrity, especially at startup.

> To be honest, error log messages should not be expected to be static. As work
> is done on the server, logging messages are added/removed and/or changed all
> the time, and that's not going to change.
I agree 100% with that, and as you say the severity level of this ultimately benign case (the absence of "/dev/shm/slapd-xxx/DBVERSION") should be adjusted to NOTICE in order not to scare the admin :))

> Now, I know that when we added the "wtime" and "optime" to the access logging,
> that did cause some issues for admins who parse our access logs. We could have
> done better with communicating this change (live and learn). But at the same
> time this new logging is tremendously useful, and has helped many customers
> troubleshoot various performance issues. So while these changes can be
> disruptive, we felt the pros outweighed the cons.

I found that change very interesting and useful when it was introduced, tbh; it simplifies debugging performance issues and lockups.

>> 2) The UNIX socket of the server was moved to /run/slapd-instance.socket, and
>> a new keyword in the .inf file for dscreate ("ldapi") has appeared. Works
>> fine, but it had an impact on our scripts that use the ldapi socket path.
>
> In this case using /var/run was outdated and was causing issues with
> systemd/tmpfiles on RHEL, and moving it to /run was the correct thing to do.
> What I don't understand is why adding the option to set the LDAPI path in the
> INF file is a problem. Ca
[389-users] Re: Recent commits in stable 389ds branches - discussion
Just to add to the previous mail - there is another phenomenon, apparently linked to the new plugin: at each start of the server, two error messages about plugins with NULL identities are displayed:

...
[03/Dec/2021:14:41:38.945576751 +0100] - INFO - main - 389-Directory/1.4.4.17 B2021.337.1333 starting up
[03/Dec/2021:14:41:38.946206385 +0100] - INFO - main - Setting the maximum file descriptor limit to: 64000
[03/Dec/2021:14:41:38.951185055 +0100] - ERR - allow_operation - Component identity is NULL
[03/Dec/2021:14:41:38.951846429 +0100] - ERR - allow_operation - Component identity is NULL
[03/Dec/2021:14:41:39.546909815 +0100] - INFO - PBKDF2_SHA256 - Based on CPU performance, chose 2048 rounds
[03/Dec/2021:14:41:39.566959933 +0100] - INFO - ldbm_instance_config_cachememsize_set - force a minimal value 512000
...

> From: "Ivanov Andrey"
> To: "General discussion list for the 389 Directory server, project."
> <389-users@lists.fedoraproject.org>
> Sent: Friday, December 3, 2021 12:29:31
> Subject: [389-users] Recent commits in stable 389ds branches - discussion
>
> Hi,
> I'd like to discuss several recent (since a couple of months ago) commits in
> stable branches of 389ds. I will be talking about 1.4.4
> [https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4] since it's the
> one we are using in production, but I think it's the same for 1.4.3. These
> commits are welcome and go in the right direction; however, the changes they
> produce are not something one expects when the server version changes in the
> 4th digit (e.g. 1.4.4.17 -> 1.4.4.18). Here they are:
> 1) Some database files [presumably memory-mapped files that are OK to be lost
> at reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now
> moved to /dev/shm/slapd-instance/.
> This modification seems to work fine (and
> should increase performance), however there is an error message at server
> startup when /dev/shm is empty (for example, after each OS reboot) and the
> server needs to create the files:
> [03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not
> open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable
> Runtime -5950 (File not found.)
> After the next 389ds restart this ERR message does not appear, but it appears
> after each OS reboot (since /dev/shm is cleaned up at each reboot).
> 2) The UNIX socket of the server was moved to /run/slapd-instance.socket, and
> a new keyword in the .inf file for dscreate ("ldapi") has appeared.
> Works fine, but it had an impact on our scripts that use the ldapi socket path.
> 3) A new default plugin requirement, the plugin being written in Rust - probably
> its introduction is FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in
> FIPS mode). See my comment
> https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust
> becomes a requirement for building the server, which is fine, but then it
> should be enabled by default in "./configure". Without it the server does not
> compile the new plugin and complains about it when starting:
> [01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could not
> open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for plugin
> PBKDF2
> ...
> Thank you and keep up the good work; we have used 389ds in production since
> 2007 and we are quite happy with it :)
> Regards,
> Andrey
[389-users] Recent commits in stable 389ds branches - discussion
Hi,

I'd like to discuss several recent (since a couple of months ago) commits in stable branches of 389ds. I will be talking about 1.4.4 [https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4] since it's the one we are using in production, but I think it's the same for 1.4.3. These commits are welcome and go in the right direction; however, the changes they produce are not something one expects when the server version changes in the 4th digit (e.g. 1.4.4.17 -> 1.4.4.18). Here they are:

1) Some database files [presumably memory-mapped files that are OK to be lost at reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now moved to /dev/shm/slapd-instance/. This modification seems to work fine (and should increase performance), however there is an error message at server startup when /dev/shm is empty (for example, after each OS reboot) and the server needs to create the files:

[03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable Runtime -5950 (File not found.)

After the next 389ds restart this ERR message does not appear, but it appears after each OS reboot (since /dev/shm is cleaned up at each reboot).

2) The UNIX socket of the server was moved to /run/slapd-instance.socket, and a new keyword in the .inf file for dscreate ("ldapi") has appeared. Works fine, but it had an impact on our scripts that use the ldapi socket path.

3) A new default plugin requirement, the plugin being written in Rust - probably its introduction is FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in FIPS mode). See my comment https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust becomes a requirement for building the server, which is fine, but then it should be enabled by default in "./configure".
Without it the server does not compile the new plugin and complains about it when starting:

[01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could not open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for plugin PBKDF2
...

Thank you and keep up the good work; we have used 389ds in production since 2007 and we are quite happy with it :)

Regards,
Andrey
[389-users] Re: How to analyze large Multi Master Replication (test)-network?
Hi,

Use the RHDS 11 documentation instead of 10; it's more up to date (https://access.redhat.com/documentation/en-us/red_hat_directory_server/11). Concerning replication, you can check the whole chapter: https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#Managing_Replication

What you are trying to do (checking the consistency of LDAP replicas) is probably completely or partially implemented by the following two utilities:

* "ds-replcheck", which compares two replicas: https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#comparing_two_directory_server_databases
* "dsconf replication monitor", which compares just the time skew and the coherence of the RUVs: https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#monitoring-the-replication-topology

In our production environment we check the state of replication from time to time with ds-replcheck, to be sure the replicas contain identical data.

As for the order of configuration, you can create replication agreements in any order, then initialize them. The best practice is to initialize all the servers in an MMR topology from the same initial server.
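As an illustration, an online consistency check with the ds-replcheck utility mentioned above might look like the following. This is a hedged sketch following the RHDS 11 administration guide: the hostnames and credentials are placeholders, and the exact option set may differ between 389ds versions, so check ds-replcheck --help on your installation.

```shell
# Compare a supplier and a replica over LDAPS; ds-replcheck reports
# missing entries, conflict/tombstone entries and attribute-level
# differences between the two servers.
ds-replcheck online \
    -b 'dc=example,dc=com' \
    -D 'cn=Directory Manager' \
    -w 'dir_man_secret_password' \
    -m ldaps://ldap1.example.com:636 \
    -r ldaps://ldap2.example.com:636
```

There is also an "offline" mode that compares two LDIF exports instead of querying live servers, which is gentler on a loaded production topology.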
Something like this for a 3-server MMR topology with ldap1 as the central hub:

# Activate replicas and changelogs, create replication managers
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication create-manager --name 'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' --role='master' --replica-id=1 --bind-dn='cn=repman,cn=config'
/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication create-manager --name 'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' --role='master' --replica-id=2 --bind-dn='cn=repman,cn=config'
/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication create-manager --name 'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' --role='master' --replica-id=3 --bind-dn='cn=repman,cn=config'

# Create all MMR replication agreements
/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' --host='ldap1.example.com' --port=636 --conn-protocol=LDAPS --bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' --bind-method=SIMPLE 'Replication from ldap2.example.com to ldap1.example.com'
/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' --host='ldap1.example.com' --port=636 --conn-protocol=LDAPS --bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' --bind-method=SIMPLE 'Replication from ldap3.example.com to ldap1.example.com'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' --host='ldap2.example.com' --port=636 --conn-protocol=LDAPS --bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' --bind-method=SIMPLE 'Replication from ldap1.example.com to ldap2.example.com'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' --host='ldap3.example.com' --port=636 --conn-protocol=LDAPS --bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' --bind-method=SIMPLE 'Replication from ldap1.example.com to ldap3.example.com'

# Start initialization of replica ldap2 from ldap1
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt init --suffix='dc=example,dc=com' 'Replication from ldap1.example.com to ldap2.example.com'

# ... and wait for it to finish, showing progress every 5 seconds
INITSTATE=`/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt init-status --suffix='dc=example,dc=com' 'Replication from ldap1.example.com to ldap2.example.com'`
while [[ $INITSTATE == 'Agreement initialization in progress.' ]]; do
    sleep 5; echo -n '.'
    INITSTATE=`/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'dir_man_secret_password' repl-agmt init-status --suffix='dc=example,dc=com' 'Replication from
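As a side note, the polling loop above can be wrapped in a small helper so the same loop works for any agreement. This is a sketch, not part of the original recipe: the dsconf invocation is passed in as arguments, and the exact status string printed by "repl-agmt init-status" may differ between 389ds versions.

```shell
#!/bin/sh
# wait_for_init CMD [ARGS...]: run CMD repeatedly until its output no
# longer reports an initialization in progress, printing a dot per poll.
wait_for_init() {
    while [ "$("$@")" = "Agreement initialization in progress." ]; do
        printf '.'
        sleep 5
    done
}

# Hypothetical usage, reusing the init-status command from above:
# wait_for_init /usr/sbin/dsconf ldaps://ldap1.example.com:636 \
#     -D 'cn=Directory Manager' -w 'dir_man_secret_password' \
#     repl-agmt init-status --suffix='dc=example,dc=com' \
#     'Replication from ldap1.example.com to ldap2.example.com'
```

Because the status command is an argument, the helper is also easy to test with a stub in place of dsconf.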
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
>> No problem. We've just merged the fix and backported it. I don't know when it
>> will ship in RHEL/CentOS, but I'm sure it will be soon in an upcoming update.
>
> Well, I usually do not use rpms - we compile from git sources; I used them only
> to make a demo of the problem.
>
> Thanks for the commit, I have tested the fix. It resolves half of the problem
> - indeed, TLS_REQCERT is now taken into account from /etc/openldap/ldap.conf.
> But the certificate bundle part (the TLS_CACERT parameter, or the system
> bundle in its absence) is still not taken into account. TLS_CACERT works
> correctly in dsconf 1.4.2 (and ldapsearch).

I think I have found the part of the code that causes TLS_CACERT to be ignored: it's the file __init__.py, lines 997-999:

997         if certdir is None and self.isLocal:
998             certdir = self.get_cert_dir()
999             self.log.debug("Using dirsrv ca certificate %s", certdir)

If I comment out these lines, dsconf starts taking TLS_CACERT from /etc/openldap/ldap.conf into account, as it should. It looks like self.isLocal should not be true, while it is; as a result a wrong certdir is used:

DEBUG: Using dirsrv ca certificate /Local/dirsrv/etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /Local/dirsrv/etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /Local/dirsrv/etc/dirsrv/slapd-{instance_name}
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
Hi William,

>> Thanks, here is the github ticket:
>> https://github.com/389ds/389-ds-base/issues/4460
>
> No problem. We've just merged the fix and backported it. I don't know when it
> will ship in RHEL/CentOS, but I'm sure it will be soon in an upcoming update.

Well, I usually do not use rpms - we compile from git sources; I used them only to make a demo of the problem.

Thanks for the commit, I have tested the fix. It resolves half of the problem - indeed, TLS_REQCERT is now taken into account from /etc/openldap/ldap.conf. But the certificate bundle part (the TLS_CACERT parameter, or the system bundle in its absence) is still not taken into account. TLS_CACERT works correctly in dsconf 1.4.2 (and ldapsearch).

Regards,
Andrey
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
Hi,

>> But all in all, I think I'm starting to see where the problem comes from.
>> dsconf version 1.4.2 uses /etc/openldap/ldap.conf (which in turn uses the
>> system pem bundle if no TLS_CACERT is specified) for certs/CA. Starting from
>> 1.4.3, dsconf completely ignores the /etc/openldap/ldap.conf file and pays
>> attention only to its own .dsrc file. It explains everything that I see. It's
>> a bit of a pity that there is no global section in .dsrc like in
>> /etc/openldap/ldap.conf - one needs to create a section per ldap server,
>> often with the same parameters.
>
> Well, it should be respecting the value from /etc/openldap/ldap.conf, I think,
> so this seems like a fault... Can you open an issue for this on github?
>
> https://github.com/389ds/389-ds-base

Thanks, here is the github ticket: https://github.com/389ds/389-ds-base/issues/4460

Regards,
Andrey
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
Hi William,

>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/__init__.py
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py
>
> You don't need to do this. You can set tls_reqcert = never in your dsrc file.
> You do not need to edit the lib389 source code.

Yep, thanks! Indeed, if I put a custom cacertdir with the correct certs (or tls_reqcert = never) into .dsrc, dsconf v1.4.3 works:

[slapd-ldaps://ldap-model.polytechnique.fr:636]
uri = ldaps://ldap-model.polytechnique.fr:636
###tls_reqcert = never
tls_cacertdir = /tmp/tls_cacertdir

Is there any way to use a global parameter in .dsrc, without a section per server? We have several LDAP servers, all signed by the same CA; making a section per server will be a bit tedious.

> Can you show us your /etc/openldap/ldap.conf please?

"ldapsearch -x -H ldaps://" works, so it is not a matter of the content of this file. By default it is empty in our case (we use commercial certificates), but I tried to point TLS_CACERT to the CA certificates that signed the server's cert. It does not fix anything for dsconf 1.4.3 (but it does influence ldapsearch and dsconf v1.4.2, of course). Here are all the tests I've done (the commented-out #TLS_CACERT parameters):

# Turning this off breaks GSSAPI used with krb5 when rdns = false
SASL_NOCANON on
#TLS_CACERT /etc/pki/tls/cert.pem
#TLS_CACERT /Admin/SOURCES/389/Config/CA-sectigo-intermediates-root.crt
#TLS_CACERT /Admin/SOURCES/389/Config/GEANT-OV-RSA-CA-4.crt
#TLS_CACERT /Admin/SOURCES/389/Config/USERTrust-RSA-Certification-Authority.crt
#TLS_CACERT /Admin/SOURCES/389/Config/AAA-Certificate-Services.crt

I disabled TLS_CACERT, and that makes the openldap clients use the system pem bundle.
It works for ldapsearch and dsconf v1.4.2, but not for dsconf v1.4.3.

>> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening
>> (by "update-ca-trust" and/or "trust anchor path.to/certificate.crt").
>
> The system pem bundles are NOT used by openldap, which means that lib389 can't
> use them. You must configure tls_cacertdir or tls_cacert in dsrc to point at
> your CA cert.

Actually, in RHEL/CentOS they ARE used by the openldap client if TLS_CACERT is not specified explicitly. Here is the relevant snippet of the /etc/openldap/ldap.conf file, with explanations:

# When no CA certificates are specified the Shared System Certificates
# are in use. In order to have these available along with the ones specified
# by TLS_CACERTDIR one has to include them explicitly:
#TLS_CACERT /etc/pki/tls/cert.pem

And it is easy to confirm, with any self-signed CA authority, that the system global bundle is indeed used:

[root@ldap-centos8 ~]# ldapsearch -x -H ldaps://ldap-ens.polytechnique.fr -b "" -s base
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
[root@ldap-centos8 ~]# trust anchor /tmp/my_ca_8192.crt
[root@ldap-centos8 ~]# ldapsearch -x -LLL -H ldaps://ldap-ens.polytechnique.fr -b "" -s base
dn:
objectClass: top
defaultnamingcontext: dc=id,dc=polytechnique,dc=edu
dataversion: 020201121013314020201121013314
netscapemdsuffix: cn=ldap://dc=ldap-ens,dc=polytechnique,dc=fr:389
lastusn;userroot: 33863940
lastusn;netscaperoot: -1
[root@ldap-centos8 ~]# trust anchor --remove /tmp/my_ca_8192.crt
[root@ldap-centos8 ~]# ldapsearch -x -LLL -H ldaps://ldap-ens.polytechnique.fr -b "" -s base
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)

But all in all, I think I'm starting to see where the problem comes from. dsconf version 1.4.2 uses /etc/openldap/ldap.conf (which in turn uses the system pem bundle if no TLS_CACERT is specified) for certs/CA.
Starting from 1.4.3, dsconf completely ignores the /etc/openldap/ldap.conf file and pays attention only to its own .dsrc file. It explains everything that I see. It's a bit of a pity that there is no global section in .dsrc like in /etc/openldap/ldap.conf - one needs to create a section per ldap server, often with the same parameters.

Thanks again for the help, it's clear to me now! Have a nice day! :)
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
Hi Mark,

>> So it seems it has something to do with how dsconf 1.4.3 vs 1.4.2 validates
>> the server certificate chains. It also breaks replication monitoring in the
>> cockpit UI, since dsconf cannot connect by ldaps to the other servers of the
>> replication config...
>>
>> Thanks for the hint about the .dsrc file, I'll try it - my workaround today
>> is not very elegant :) :
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/__init__.py
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py
>
> When you switch between packages, are you recreating the instance each time
> and importing the certificates?

No, the server installation is not modified or touched in any way - the LDAP server (1.4.3) is installed on a separate server (called "ldap-model", CentOS 8.2) and is never restarted or reconfigured. The LDAP instance is installed only there, and yes, I used the ds* utilities during the installation:

dsctl model tls import-server-key-cert model_cert.crt model_cert.key
dsconf model security ca-certificate add --file intermedite-1.crt --name "CA-Intermediate-1"
dsconf model security ca-certificate set-trust-flags "CA-Intermediate-1" --flags "CT,,"
...

The server is accessible with ldapsearch -H ldaps://..., SSL is set up correctly - no problem at all. I do not touch it at all during the tests.

I install only the management tools (python3-lib389) on another server, called "ldap-centos8", and since they need the file default.inf, the "389-ds-base" rpm is installed too. But no 389 instances are started or configured there. All the necessary certificates (the CA and 2 intermediates) are imported into the system pem bundles using this: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening (by "update-ca-trust" and/or "trust anchor path.to/certificate.crt").
ldapsearch and dsconf 1.4.2 work fine with ldaps://ldap-model..., but dsconf 1.4.3 refuses to connect. The only difference I see in the debug logging is the following lines, present during the dsconf 1.4.3 connect attempt but absent from the 1.4.2 connect debug (no 389 instances are installed on this server, as I mentioned before):

DEBUG: open(): Connecting to uri ldaps://ldap-model.polytechnique.fr:636
DEBUG: Using dirsrv ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using certificate policy 1
DEBUG: ldap.OPT_X_TLS_REQUIRE_CERT = 1
DEBUG: Cannot connect to 'ldaps://ldap-model.polytechnique.fr:636'

> I'm asking because I'm looking at the lib389 code for 1.4.3 and 1.4.2 and
> there is not much of a difference except for importing certificates and how
> it calls the rehash function.
>
> In 1.4.2 we always do:
>
> /usr/bin/c_rehash
>
> In 1.4.3 we call two different functions depending on the system:
>
> /usr/bin/openssl rehash
>
> or
>
> /usr/bin/c_rehash
>
> Maybe try running "/usr/bin/c_rehash " on the 1.4.3 installation and see if
> it makes a difference.

I don't use cert dirs - I add the intermediate CAs to the system bundles (update-ca-trust or trust anchor path.to/certificate.crt).

> On my Fedora system (1.4.3) it uses the openssl function, which brings me to
> my next question. How are you importing the certificates? Are you using
> dsctl/dsconf? If you aren't, then you should, as they call the rehash
> functions for you when importing the certificates.

I used dsctl/dsconf on the server with the 389 LDAP instance ("ldap-model") and that server works fine. The problem is on another ("management") server ("ldap-centos8"), where switching rpms from 1.4.3 to 1.4.2 (or the other way) switches me from a working to a non-working dsconf.

Thanks for trying to help!
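For reference, the .dsrc approach mentioned above would look roughly like the following sketch. The section name and certificate directory are placeholder assumptions, not a verified config; the key names match the `tls_cacertdir`/`tls_reqcert` fields visible in the debug output in this thread.

```ini
# ~/.dsrc - per-user defaults for the ds* CLI tools (sketch, paths are assumptions)
[slapd-model]
uri = ldaps://ldap-model.polytechnique.fr:636
binddn = cn=Directory Manager
# Point lib389 at a hashed CA directory instead of patching it with sed:
tls_cacertdir = /etc/openldap/certs
# Last-resort diagnostic only - disables CA validity checking entirely:
# tls_reqcert = never
```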
:)
___
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
Hi William, thanks for your reply.

The LDAP server we manage with dsconf uses a commercial certificate, and both intermediate certificates are added to the system bundles using "trust anchor" or "update-ca-trust" (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening). Otherwise ldapsearch and dsconf v1.4.2 would not work either. Fiddling with /etc/openldap/ldap.conf does not change anything; it's the first thing I tried to adjust.

The only difference is actually removing one rpm and installing the other. If I go back from python3-lib389-1.4.3.13-1 to python3-lib389-1.4.2.16-1.module_el by uninstalling one rpm and installing the other, dsconf works again:

dnf -y module enable 389-directory-server:testing
dnf -y install python3-lib389
dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass ...
Error: Can't contact LDAP server - error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (self signed certificate in certificate chain)

dnf -y remove python3-lib389
dnf -y module disable 389-directory-server:testing
dnf -y module enable 389-directory-server:stable
dnf -y install python3-lib389
dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass ...
...

So it seems it has something to do with how dsconf 1.4.3 vs 1.4.2 validates the server certificate chains. It also breaks replication monitoring in the cockpit UI, since dsconf cannot connect by ldaps to the other servers of the replication config...
Thanks for the hint about the .dsrc file, I'll try it - my workaround today is not very elegant :) :

sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/__init__.py
sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' /usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py

>> DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636', 'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech': None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 'tls_reqcert': 1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl': 'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory Manager'}}
>>
>> DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636', 'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech': None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 'tls_reqcert': 1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl': 'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory manager'}}
>>
>> ldap.SERVER_DOWN: {'desc': "Can't contact LDAP server", 'info': 'error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (self signed certificate in certificate chain)'}
>> ERROR: Error: Can't contact LDAP server - error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (self signed certificate in certificate chain)

> I can't comment about the other environmental changes between those
> versions, but tls_reqcert is 1 in both options, aka ldap.OPT_X_TLS_HARD,
> which means your CA cert must be in your LDAP CA store. You don't specify a
> tls_cacertdir or a tls_cacert, so whatever you have in
> /etc/openldap/ldap.conf will be used for this.
>
> Most likely there is a fault in this config, or the cacertdir is not hashed.
> If you use a cacertdir, remember you need to run 'openssl rehash' in the
> directory to set up the symlinks to the PEM files.
>
> If you use a cacert PEM file directly, ensure it's readable to your user etc.
>
> As a last resort you could set 'tls_reqcert = never' in .dsrc to disable CA
> validity checking.
>
> Hope that helps,
>
> —
> Sincerely,
>
> William Brown
>
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia
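The rehash point can be demonstrated in isolation. This is a self-contained sketch with a throwaway directory and a demo CA (the names are placeholders, not the poster's setup): `openssl rehash` creates the hashed symlinks that OpenLDAP needs in order to locate PEM files in a `tls_cacertdir`.

```shell
# Build a throwaway cacertdir containing a single demo CA certificate.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=Demo CA" \
    -keyout "$dir/ca.key" -out "$dir/demo-ca.pem" 2>/dev/null
# Without hash symlinks OpenLDAP cannot find the PEM in this directory;
# 'openssl rehash' (or c_rehash) adds links named <subject-hash>.0.
openssl rehash "$dir"
ls "$dir"    # demo-ca.pem plus a hash symlink ending in .0
```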
[389-users] dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2
dsconf works fine for instances using ldaps in v1.4.2 (389-directory-server:stable), but it seems to be broken (not recognizing TLS certificates) in v1.4.3 for commands like:

dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass some_command

In both cases I am using dsconf to manage the same external LDAP server (1.4.3.x); the OS on both is CentOS 8.2 with the latest updates:

[root@ldap-centos8 ~]# rpm -qa | grep 389
[root@ldap-centos8 ~]# dnf -y module enable 389-directory-server:stable
[root@ldap-centos8 ~]# dnf -y install 389-ds-base
[root@ldap-centos8 ~]# rpm -qa | grep 389
python3-lib389-1.4.2.16-1.module_el8+9435+e6daf39f.noarch
389-ds-base-libs-1.4.2.16-1.module_el8+9435+e6daf39f.x86_64
389-ds-base-1.4.2.16-1.module_el8+9435+e6daf39f.x86_64
[root@ldap-centos8 ~]# dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass security get
nsslapd-security: on
nsslapd-securelistenhost:
nsslapd-secureport: 636
...
[root@ldap-centos8 ~]# dnf -y remove 389*
[root@ldap-centos8 ~]# dnf -y module disable 389-directory-server:stable
[root@ldap-centos8 ~]# dnf -y module enable 389-directory-server:testing
[root@ldap-centos8 ~]# dnf -y install 389-ds-base
[root@ldap-centos8 ~]# rpm -qa | grep 389
python3-lib389-1.4.3.13-1.module_el8+10475+b74bca99.noarch
389-ds-base-libs-1.4.3.13-1.module_el8+10475+b74bca99.x86_64
389-ds-base-1.4.3.13-1.module_el8+10475+b74bca99.x86_64
[root@ldap-centos8 ~]# dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass security get
Error: Can't contact LDAP server - error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed (self signed certificate in certificate chain)
[root@ldap-centos8 ~]# ldapsearch -H ldaps://ldap-model.polytechnique.fr -b 'cn=config' -D "cn=Directory Manager" -W '(cn=config)' nsslapd-security
Enter LDAP Password:
# extended LDIF
#
# LDAPv3
# base with scope subtree
# filter: (cn=config)
# requesting: nsslapd-security
#
# config
dn: cn=config
nsslapd-security: on
...

[root@ldap-centos8 ~]# dsconf -v ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass security get
DEBUG: The 389 Directory Server Configuration Tool
DEBUG: Inspired by works of: ITS, The University of Adelaide
DEBUG: dsrc path: /root/.dsrc
DEBUG: dsrc container path: /data/config/container.inf
DEBUG: dsrc instances: []
DEBUG: dsrc no such section: slapd-ldaps://ldap-model.polytechnique.fr:636
DEBUG: Called with: Namespace(basedn=None, binddn='cn=Directory Manager', bindpw='mypass', func=. at 0x7fce5a5e7158>, instance='ldaps://ldap-model.polytechnique.fr:636', json=False, prompt=False, pwdfile=None, starttls=False, verbose=True)
DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636', 'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech': None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 'tls_reqcert': 1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl': 'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory Manager'}}
DEBUG: SER_SERVERID_PROP not provided, assuming non-local instance
DEBUG: Allocate with ldaps://ldap-model.polytechnique.fr:636
DEBUG: Allocate with ldap-centos8.polytechnique.fr:389
DEBUG: Allocate with ldap-centos8.polytechnique.fr:389
DEBUG: SER_SERVERID_PROP not provided, assuming non-local instance
DEBUG: Allocate with ldaps://ldap-model.polytechnique.fr:636
DEBUG: Allocate with ldap-centos8.polytechnique.fr:389
DEBUG: Allocate with ldap-centos8.polytechnique.fr:389
DEBUG: open(): Connecting to uri ldaps://ldap-model.polytechnique.fr:636
DEBUG: open(): bound as cn=Directory Manager
DEBUG: cn=config getVal('nsslapd-security')
DEBUG: cn=config getVal('nsslapd-securelistenhost')
DEBUG: cn=config getVal('nsslapd-securePort')
DEBUG: cn=encryption,cn=config getVal('nsSSLClientAuth')
DEBUG: cn=encryption,cn=config getVal('nsTLSAllowClientRenegotiation')
DEBUG: cn=config getVal('nsslapd-require-secure-binds')
DEBUG: cn=config getVal('nsslapd-ssl-check-hostname')
DEBUG: cn=config getVal('nsslapd-validate-cert')
DEBUG: cn=encryption,cn=config getVal('nsSSLSessionTimeout')
DEBUG: cn=encryption,cn=config getVal('sslVersionMin')
DEBUG: cn=encryption,cn=config getVal('sslVersionMax')
DEBUG: cn=encryption,cn=config getVal('allowWeakCipher')
DEBUG: cn=encryption,cn=config getVal('allowWeakDHParam')
DEBUG: cn=encryption,cn=config getVal('nsSSL3Ciphers')
nsslapd-security: on
nsslapd-securelistenhost:
nsslapd-secureport: 636

[root@ldap-centos8 ~]# dsconf -v ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w mypass security get
DEBUG: The 389 Directory Server Configuration Tool
DEBUG: Inspired by works of: ITS, The University of Adelaide
DEBUG: dsrc path: /root/.dsrc
DEBUG: dsrc container path: /data/config/container.inf
DEBUG: dsrc
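The exact error in this thread ("self signed certificate in certificate chain") can be reproduced outside dsconf. A self-contained sketch with a throwaway certificate: strict verification, which is what tls_reqcert = 1 (ldap.OPT_X_TLS_HARD) enforces, fails until the CA is explicitly trusted; this is why getting the intermediates into the system bundle or a hashed cacertdir matters.

```shell
# Generate a throwaway self-signed certificate (placeholder CN).
d=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=selfsigned" \
    -keyout "$d/key.pem" -out "$d/cert.pem" 2>/dev/null
# Strict verification fails: the issuer is not in any trust store.
openssl verify "$d/cert.pem" || echo "verify failed, like dsconf 1.4.3 reports"
# Explicitly trusting the CA (the analogue of update-ca-trust) succeeds.
openssl verify -CAfile "$d/cert.pem" "$d/cert.pem"
```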
[389-users] Re: Building 389 rpm on CentOS 8 from source rpm
Hi Viktor, thank you for pointing me to that bugzilla. In the end I downloaded and manually installed argparse-manpage from the FC28 archives (wget https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/28/Everything/x86_64/os/Packages/p/python3-argparse-manpage-1.0.0-1.fc28.noarch.rpm). That allowed me to build the source rpm, including lib389 and the cockpit plugin. I still have not tested whether that rpm installs ds and whether the cockpit plugin works correctly. Concerning the bugzilla ticket, it's a pity indeed that we do not yet have a dedicated 389ds cockpit plugin in CentOS 8.

Regards,
Andrey

> From: "Viktor Ashirov"
> To: "General discussion list for the 389 Directory server, project." <389-users@lists.fedoraproject.org>
> Sent: Wednesday 30 October 2019 15:22:00
> Subject: [389-users] Re: Building 389 rpm on CentOS 8 from source rpm
> Hi Andrey,
> argparse-manpage is missing in EPEL8:
> https://bugzilla.redhat.com/show_bug.cgi?id=1763246
> I was able to rebuild the fedora srpm in my copr:
> https://copr.fedorainfracloud.org/coprs/vashirov/389ds/packages/
> But the 389-ds-base build fails further down due to other missing
> dependencies; please see the following message for more details:
> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org/message/FWYW3MM2NBGGCEK2FKM73Z3PCA7D4HCL/
> Thanks,
> On Wed, Oct 30, 2019 at 3:12 PM Ivanov Andrey (M.) <andrey.iva...@polytechnique.fr> wrote:
>> Hi,
>> I'm trying to build 389 on CentOS 8 from the rpm source package.
>> When I do
>> yum-builddep SPECS/389-ds-base.spec
>> I get a missing component - "No matching package to install:
>> 'python3-argparse-manpage'". I could not find a corresponding package in
>> any of the repositories of CentOS 8
>> (AppStream/BaseOS/PowerTools/epel/extras). If I try to disable this
>> requirement in the spec by commenting out the following line:
>> BuildRequires: python%{python3_pkgversion}-argparse-manpage
>> then "rpmbuild -ba SPECS/389-ds-base.spec" stops at the same package
>> requirement when it starts to build lib389:
>> + pushd ./src/lib389
>> ~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10/src/lib389 ~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10
>> + CFLAGS='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
>> + LDFLAGS='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'
>> + /usr/libexec/platform-python setup.py build '--executable=/usr/libexec/platform-python -s'
>> Traceback (most recent call last):
>> File "setup.py", line 17, in
>> from build_manpages import build_manpages
>> ModuleNotFoundError: No module named 'build_manpages'
>> error: Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)
>> RPM build errors:
>> Macro expanded in comment on line 117: %{python3_pkgversion}-argparse-manpage
>> Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)
>> Where do I obtain the corresponding package (python3-argparse-manpage) for
>> CentOS 8? I have not tried to build on RHEL 8 - maybe that package exists
>> in RHEL 8 but not CentOS 8?
>> Thank you!
>> Regards,
>> Andrey
[389-users] Building 389 rpm on CentOS 8 from source rpm
Hi,
I'm trying to build 389 on CentOS 8 from the rpm source package. When I do
yum-builddep SPECS/389-ds-base.spec
I get a missing component - "No matching package to install: 'python3-argparse-manpage'". I could not find a corresponding package in any of the repositories of CentOS 8 (AppStream/BaseOS/PowerTools/epel/extras). If I try to disable this requirement in the spec by commenting out the following line:
BuildRequires: python%{python3_pkgversion}-argparse-manpage
then "rpmbuild -ba SPECS/389-ds-base.spec" stops at the same package requirement when it starts to build lib389:
+ pushd ./src/lib389
~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10/src/lib389 ~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10
+ CFLAGS='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
+ LDFLAGS='-Wl,-z,relro -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'
+ /usr/libexec/platform-python setup.py build '--executable=/usr/libexec/platform-python -s'
Traceback (most recent call last):
File "setup.py", line 17, in
from build_manpages import build_manpages
ModuleNotFoundError: No module named 'build_manpages'
error: Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)
RPM build errors:
Macro expanded in comment on line 117: %{python3_pkgversion}-argparse-manpage
Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)
Where do I obtain the corresponding package (python3-argparse-manpage) for CentOS 8? I have not tried to build on RHEL 8 - maybe that package exists in RHEL 8 but not CentOS 8?
Thank you!
Regards,
Andrey
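The missing dependency can be probed directly before patching the spec: lib389's setup.py imports the build_manpages module that python3-argparse-manpage provides. A small check (whether it reports True depends on your build host):

```shell
# lib389's setup.py does 'from build_manpages import build_manpages';
# report whether that module is importable on this machine.
python3 - <<'EOF'
import importlib.util
present = importlib.util.find_spec("build_manpages") is not None
print("build_manpages importable:", present)
EOF
```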
[389-users] Re: ldap performance
A more detailed discussion of this:
http://www.port389.org/docs/389ds/design/logging-performance-improvement.html

You could also disable logging and see whether the spikes disappear, to be sure of their source:
http://www.port389.org/docs/389ds/howto/howto-logsystemperf.html

> Hi,
> it could be the flushing of logs (access.log) to disk, which happens more
> often when the server load is higher. You could use iostat or dstat to see
> what happens.
> Regards,
> Andrey
>> From: "Ghiurea, Isabella"
>> To: 389-users@lists.fedoraproject.org
>> Sent: Wednesday 5 September 2018 23:14:24
>> Subject: [389-users] ldap performance
>> Hello Gurus,
>> I'm looking for an answer to the following performance behavior.
>> My env: 389-ds-base-1.3.5.15-1.fc24.x86_64 in multimaster fractional replication.
>> Running rsearch for 5 min with 1 thread, I see spikes for a basic read using the uid index.
>> Running the same search with 10 threads, the avg ms/op performance is much better, with no major spike/burst.
>> Any explanation much appreciated. See below for 1 thread and the spike/burst:
>> T 300 -t 1
>> rsearch: 1 threads launched.
>> T1 min= 0ms, max= 5ms, count = 54710
>> T1 min= 0ms, max= 42ms, count = 64930
>> T1 min= 0ms, max= 2ms, count = 65174
>> T1 min= 0ms, max= 2ms, count = 65110
>> T1 min= 0ms, max= 44ms, count = 64966
>> T1 min= 0ms, max= 1ms, count = 65101
>> T1 min= 0ms, max= 22ms, count = 65056
>> T1 min= 0ms, max= 32ms, count = 64981
>> T1 min= 0ms, max= 1ms, count = 65145
>> T1 min= 0ms, max= 1ms, count = 65223
>> T1 min= 0ms, max= 27ms, count = 65015
>> T1 min= 0ms, max= 1ms, count = 65182
>> T1 min= 0ms, max= 3ms, count = 65213
>> T1 min= 0ms, max= 23ms, count = 64760
>> T1 min= 0ms, max= 2ms, count = 64214
>> T1 min= 0ms, max= 3ms, count = 52279
>> T1 min= 0ms, max= 11ms, count = 64914
>> T1 min= 0ms, max= 1ms, count = 65118
>> T1 min= 0ms, max= 5ms, count = 64852
>> T1 min= 0ms, max= 91ms, count = 64180
>> T1 min= 0ms, max= 4ms, count = 64746
>> T1 min= 0ms, max= 1ms, count = 65080
>> T1 min= 0ms, max= 12ms, count = 65110
>> T1 min= 0ms, max= 702ms, count = 59243
>> T1 min= 0ms, max= 1ms, count = 65082
>> T1 min= 0ms, max= 89ms, count = 64331
>> T1 min= 0ms, max= 23ms, count = 64647
>> T1 min= 0ms, max= 5ms, count = 64818
>> T1 min= 0ms, max= 55ms, count = 64374
>> T1 min= 0ms, max= 8ms, count = 64713
>> T1 min= 0ms, max= 8ms, count = 64713
>> 300 sec >= 300
>> Final Average rate: 6394.22/sec = 0.1564msec/op, total: 64713
>> And the final avg rate for 10 threads, with no significant spike/burst for this number of threads:
>> 20180905 14:07:23 - Rate: 16962.10/thr (16962.10/sec = 0.0590ms/op), total:169621 (10 thr)
>> 300 sec >= 300
>> Final Average rate: 17420.40/sec = 0.0574msec/op, total:169621
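One knob from the logging-performance document above is access-log buffering: with buffering disabled, every operation forces a write, which matches the flush-spike theory in this thread. A hedged LDIF sketch (nsslapd-accesslog-logbuffering is the standard cn=config attribute; check the current value before changing it, and note that buffered entries can be lost on a crash):

```ldif
dn: cn=config
changetype: modify
replace: nsslapd-accesslog-logbuffering
nsslapd-accesslog-logbuffering: on
```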
[389-users] Re: ldap performance
Hi,
it could be the flushing of logs (access.log) to disk, which happens more often when the server load is higher. You could use iostat or dstat to see what happens.
Regards,
Andrey

> From: "Ghiurea, Isabella"
> To: 389-users@lists.fedoraproject.org
> Sent: Wednesday 5 September 2018 23:14:24
> Subject: [389-users] ldap performance
> Hello Gurus,
> I'm looking for an answer to the following performance behavior.
> My env: 389-ds-base-1.3.5.15-1.fc24.x86_64 in multimaster fractional replication.
> Running rsearch for 5 min with 1 thread, I see spikes for a basic read using the uid index.
> Running the same search with 10 threads, the avg ms/op performance is much better, with no major spike/burst.
> Any explanation much appreciated. See below for 1 thread and the spike/burst:
> T 300 -t 1
> rsearch: 1 threads launched.
> T1 min= 0ms, max= 5ms, count = 54710
> T1 min= 0ms, max= 42ms, count = 64930
> T1 min= 0ms, max= 2ms, count = 65174
> T1 min= 0ms, max= 2ms, count = 65110
> T1 min= 0ms, max= 44ms, count = 64966
> T1 min= 0ms, max= 1ms, count = 65101
> T1 min= 0ms, max= 22ms, count = 65056
> T1 min= 0ms, max= 32ms, count = 64981
> T1 min= 0ms, max= 1ms, count = 65145
> T1 min= 0ms, max= 1ms, count = 65223
> T1 min= 0ms, max= 27ms, count = 65015
> T1 min= 0ms, max= 1ms, count = 65182
> T1 min= 0ms, max= 3ms, count = 65213
> T1 min= 0ms, max= 23ms, count = 64760
> T1 min= 0ms, max= 2ms, count = 64214
> T1 min= 0ms, max= 3ms, count = 52279
> T1 min= 0ms, max= 11ms, count = 64914
> T1 min= 0ms, max= 1ms, count = 65118
> T1 min= 0ms, max= 5ms, count = 64852
> T1 min= 0ms, max= 91ms, count = 64180
> T1 min= 0ms, max= 4ms, count = 64746
> T1 min= 0ms, max= 1ms, count = 65080
> T1 min= 0ms, max= 12ms, count = 65110
> T1 min= 0ms, max= 702ms, count = 59243
> T1 min= 0ms, max= 1ms, count = 65082
> T1 min= 0ms, max= 89ms, count = 64331
> T1 min= 0ms, max= 23ms, count = 64647
> T1 min= 0ms, max= 5ms, count = 64818
> T1 min= 0ms, max= 55ms, count = 64374
> T1 min= 0ms, max= 8ms, count = 64713
> T1 min= 0ms, max= 8ms, count = 64713
> 300 sec >= 300
> Final Average rate: 6394.22/sec = 0.1564msec/op, total: 64713
> And the final avg rate for 10 threads, with no significant spike/burst for this number of threads:
> 20180905 14:07:23 - Rate: 16962.10/thr (16962.10/sec = 0.0590ms/op), total:169621 (10 thr)
> 300 sec >= 300
> Final Average rate: 17420.40/sec = 0.0574msec/op, total:169621
[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954
> De: "Ludwig Krispenz"> À: 389-users@lists.fedoraproject.org > Envoyé: Vendredi 9 Septembre 2016 12:30:31 > Objet: [389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954 > Hi Andrey, > we have fix to address the incorrcet positioning in the changelog (using a csn > of a consumer which is ahead for the given replicaid) and so also would > prevent > these messages. > It still has to be tested, but I am wondering if you want to test it as well. > Regards, > Ludwig Hi Ludwig, i am unable to reproduce the problem on our test servers, it affects only production. So i would prefer to wait for your tests and/or a definitive and stable fix since the code will go directly into production :) Regards, Andrey -- 389-users mailing list 389-users@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org
[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954
> De: "Ludwig Krispenz"> À: 389-users@lists.fedoraproject.org > Envoyé: Mercredi 7 Septembre 2016 12:48:38 > Objet: [389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954 the fixes for the tickets you mention did change the iteration thru the changelog and how it handles situtations when the start csn is not found in the changelog. and it also did change the logging, so you might see messages now which were not there or hidden before. >>> That was my understanding too. >> so far I have not seen any replication problems related to these messages, >> all >> generatedcsns seem to be replicated. What makes it a bit more difficult is >> that >> most of the updates are updates of lastlogintime and the original MOD is not >> logged. I still do not understand why we have these messages so frequently, I >> will try to reproduce. >> Or, if it possible, could you run the servers for just an hour with >> replication >> logging enabled ? > no more need for this, I found the messages in a deployment where repl logging > was enabled. I think it happens when the smallest consumer maxCSN is ahead of > the local maxCSN for this replicaID. > It should do no harm, but in some scenarios could slow down replication a bit. > I will continue to investigate and work on a fix Ok, thank you. And yes, as you say apparently it does no harm - i check the consistency of three replicated servers from time to time and there is no data discrepancy between these servers, . Anyway, enabling replication logging on production servers is not something easily done, mainly due to performance reasons. And i was not able to reproduce the problem in our test environment with 2 replicated servers, maybe the charge or frequency of connections updating lastlogintime attribute was not high enough in test environment. Or the three-server full-replicated topology makes things a bit different too with one or two additional hops for the same mod arriving to the consumer by two different paths. 
>> When looking into the provided data set I did notice three replicated ops
>> with err=50, insufficient access. This should not happen and requires a
>> separate investigation.

Yes, I see the three modifications you are talking about. They are present only on one server of the three. Strange indeed. No more err=50 in replicated ops today on any of the servers, I've just checked.
[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954
Hi Ludwig,

> the fixes for the tickets you mention did change the iteration through the
> changelog and how it handles situations when the start csn is not found in
> the changelog. And it also did change the logging, so you might see messages
> now which were not there or hidden before.

That was my understanding too.

> But I am very surprised to see them so frequently and I would like to
> understand it.
> First some questions: do you have changelog trimming enabled, and how? Do
> you have fractional replication?

Yes to both questions.

Trimming: 14 days

Fractional replication:
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname

Changelog:
cn=changelog5,cn=config
objectClass: top
objectClass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb
nsslapd-changelogmaxage: 14d

replica:
cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5Replica
cn: replica
nsDS5ReplicaId: 1
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5Flags: 1
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaTombstonePurgeInterval: 86400
nsds5ReplicaLegacyConsumer: False
nsDS5ReplicaType: 3
nsState:: AQDCrc5XAQABAA==
nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e
nsds5ReplicaChangeCount: 114948
nsds5replicareapactive: 0

Typical replication agreement:
cn=Replication from ldap-lab. to ldap-adm.,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
cn: Replication from ldap-lab. to ldap-adm.
description: Replication agreement from server ldap-lab. to server ldap-adm.
nsDS5ReplicaHost: ldap-adm.
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5ReplicaPort: 636
nsDS5ReplicaTransportInfo: SSL
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsDS5ReplicaBindMethod: simple
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname
nsds5replicaBusyWaitTime: 5
nsds5ReplicaFlowControlPause: 500
nsds5ReplicaFlowControlWindow: 1000
nsds5replicaTimeout: 120
nsDS5ReplicaCredentials: {AES-...
nsds50ruv: {replicageneration} 57cd73770002
nsds50ruv: {replica 2 ldap://ldap-adm.:389}
nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.:389}
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 20160906115520Z
nsds5replicaLastUpdateEnd: 20160906115520Z
nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental update succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 1970010100Z
nsds5replicaLastInitEnd: 1970010100Z

> Next, is it possible to get the access and error logs for a period of an
> hour from all servers (you can send them off list)? I would like to track
> some of the reported csns.

Sure, I will send them to you off list in a moment.

Thank you,
Regards,
Andrey

> Regards,
> Ludwig
> On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:
>> Hi,
>> We have been successfully using the compiled 1.3.4 git branch of 389DS in
>> production on CentOS 7 for about a year (approximately 40 000 entries,
>> about 4000 groups, hundreds of reads and tens of writes per second).
>> Our current topology consists of 3 servers in a triangle (each server is a
>> master replicating to the 2 others, so two read-write replication
>> agreements on each).
>> Since the fixes for Ticket 48766 ("Replication changelog can incorrectly
>> skip over updates") and Ticket 48954 ("Replication fails because anchorcsn
>> cannot be found") I've started to see the following regular warnings in the
>> error logs:
>> [06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with csn (57cdfe0600010001) not found for DB_NEXT
>> [06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm. to ldap-lab." (ldap-lab:636) - Can't locate CSN 57cdfe0600010001 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn: opcsn=57ce0f4e00050002 <= basecsn=57ce0f4e00050003, adjusted opcsn=57ce0f4e00060002
>> [06/Sep/2016
[389-users] 389DS v1.3.4.x after fixes for tickets 48766 and 48954
Hi,

We are successfully using the compiled 1.3.4 git branch of 389DS in production on CentOS 7 for about a year (approximately 40 000 entries, about 4000 groups, hundreds of reads and tens of writes per second). Our current topology consists of 3 servers in a triangle (each server is a master replicating to the 2 others, so two read-write replication agreements on each).

Since the fixes for Ticket 48766 ("Replication changelog can incorrectly skip over updates") and Ticket 48954 ("Replication fails because anchorcsn cannot be found") I've started to see the following regular warnings in the error logs:

[06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with csn (57cdfe0600010001) not found for DB_NEXT
[06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm. to ldap-lab." (ldap-lab:636) - Can't locate CSN 57cdfe0600010001 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn: opcsn=57ce0f4e00050002 <= basecsn=57ce0f4e00050003, adjusted opcsn=57ce0f4e00060002
[06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce257e00040003) not found for DB_NEXT
[06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn: opcsn=57ce352b0002 <= basecsn=57ce352b00010001, adjusted opcsn=57ce352b00010002
[06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from ldap-adm. to ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce4c6200010003 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from ldap-adm. to ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce541a00020003 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from ldap-adm. to ldap-lab." (ldap-lab:636) - Can't locate CSN 57ce555900010001 in the changelog (DB rc=-30988).
If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from ldap-adm. to ldap-lab." (ldap-lab:636) - Can't locate CSN 57ce55610001 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce56c50003) not found for DB_NEXT
[06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce56c500010003) not found for DB_NEXT
[06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce5d5f000f0001) not found for DB_NEXT
[06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce5e5400020003) not found for DB_NEXT
[06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from ldap-adm. to ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce5e5400020003 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce61a300020003) not found for DB_NEXT
[06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce61d800020003) not found for DB_NEXT
[06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from ldap-adm. to ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce61d800020003 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce62c800030001) not found for DB_NEXT
[06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce635a00010001) not found for DB_NEXT
[06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce65c900020003) not found for DB_NEXT
[06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from ldap-adm. to ldap-ens."
(ldap-ens:636) - Can't locate CSN 57ce67aa00010003 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn: opcsn=57ce67d100010002 <= basecsn=57ce67d100020003, adjusted opcsn=57ce67d100020002

These warnings are present on all three servers and for all replication agreements. One of the servers is virtual and the other two are physical. Replication still seems to work fine in spite of these warnings. The "replica_generate_next_csn" message is not new - it has always been there with 1.3.4; the two new warnings are "clcache_load_buffer_bulk" and "Can't locate CSN ... in the changelog (DB rc=-30988)". There are no network problems or anything like that, so the cause can only be the replication topology (a fully connected 3-master triangle) and/or the servers being rather busy. Is it a bug, a warning that
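For anyone correlating these messages across servers: as printed in the log lines above, a CSN appears to be 16 hex digits - a UNIX timestamp, a sequence number, and the replica id (the full on-disk CSN also carries a sub-sequence field, and some archived lines above are shorter where leading zeros were lost; those are not handled here). A small bash sketch, written for this thread rather than taken from 389 DS, splits one apart:

```shell
#!/bin/bash
# Decode a CSN as seen in the replica_generate_next_csn message above:
# 8 hex digits of UNIX time, 4 of sequence number, 4 of replica id.
csn=57ce0f4e00050002
ts=$((16#${csn:0:8}))      # seconds since the epoch
seq=$((16#${csn:8:4}))     # per-second sequence number
rid=$((16#${csn:12:4}))    # replica id
printf 'time=%s seq=%d rid=%d\n' \
    "$(date -u -d "@$ts" +%Y-%m-%dT%H:%M:%SZ)" "$seq" "$rid"
# time=2016-09-06T00:35:26Z seq=5 rid=2  (matches the 02:35 +0200 log stamp)
```

Decoding the opcsn/basecsn pairs this way makes it easy to see that the "adjusted" CSN only bumps the sequence number while keeping the replica id.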
Re: [389-users] _cl5CompactDBs: failed to compact
Hi Noriko,

- Original Message -

There are three MMR replicating servers. It's one month of uptime and the servers wanted to trim the replication log. Here is what I've found in the error log on each of them:

1st server:
[18/Jun/2015:08:04:31 +0200] - libdb: BDB2055 Lock table is out of available lock entries

May not matter, but could you please try increasing the value of this db config parameter? The default value is 1.

dn: cn=config,cn=ldbm database,cn=plugins,cn=config
nsslapd-db-locks: 1

Ok. I've increased nsslapd-db-locks to 2 and reduced nsslapd-changelogcompactdb-interval to 3600 in cn=changelog5,cn=config to see the changelog compaction event more frequently. No change. I still have:

[19/Jun/2015:10:36:46 +0200] - libdb: BDB2055 Lock table is out of available lock entries
[19/Jun/2015:10:36:46 +0200] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db error - 12 Cannot allocate memory
[18/Jun/2015:08:04:31 +0200] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db error - 12 Cannot allocate memory

I don't think there is any problem even if the DBs are not compacted. It was introduced just to release the free pages in the db files. But I'd also like to learn why the compact fails with ENOMEM here.

Ok, thanks.

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
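For reference, a change like the one discussed above can be applied with ldapmodify. This is only a sketch: the attribute and entry DN come from this thread, but the numeric values quoted in it look truncated by the archive, so the 20000 below is an illustrative guess, not a recommendation. BDB environment settings such as nsslapd-db-locks typically require a server restart to take effect.

```ldif
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-locks
nsslapd-db-locks: 20000
```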
[389-users] _cl5CompactDBs: failed to compact
Hi,

We are using branch 1.3.2 on CentOS 7 in our production environment (version 1.3.2.27 with some additional patches from the git of this branch). There are three MMR replicating servers. It's one month of uptime and the servers wanted to trim the replication log. Here is what I've found in the error log on each of them:

1st server:
[18/Jun/2015:08:04:31 +0200] - libdb: BDB2055 Lock table is out of available lock entries
[18/Jun/2015:08:04:31 +0200] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db error - 12 Cannot allocate memory

2nd server:
[18/Jun/2015:08:10:34 +0200] - libdb: BDB2055 Lock table is out of available lock entries
[18/Jun/2015:08:10:34 +0200] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact acb7e184-f28d11e4-9b13d240-c66923c8; db error - 12 Cannot allocate memory

3rd server:
[18/Jun/2015:08:18:10 +0200] - libdb: BDB2055 Lock table is out of available lock entries
[18/Jun/2015:08:18:10 +0200] NSMMReplicationPlugin - changelog program - _cl5CompactDBs: failed to compact acb7e184-f28d11e4-8067eff8-b1ca763b; db error - 12 Cannot allocate memory

The changelog itself is not huge:

[root@ldap-ens]# ll -h /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb/
total 390M
-rw------- 1 ldap ldap 390M Jun 18 10:18 a45fa684-f28d11e4-af27aa63-5121b7ef_5547be41.db
-rw-r--r-- 1 ldap ldap 0 May 19 08:02 a45fa684-f28d11e4-af27aa63-5121b7ef.sema
-rw------- 1 ldap ldap 30 May 4 20:45 DBVERSION

The servers are working correctly and replication is also working. What are the potential consequences of this error? How can we avoid it?

Thank you!
Re: [389-users] 389ds v1.3.2.24 error log message: replica_generate_next_csn adjusted
[15/Nov/2014:03:58:43 +0100] - replica_generate_next_csn: opcsn=5466c1640001 <= basecsn=5466c1640002, adjusted opcsn=5466c16400010001
[15/Nov/2014:10:38:38 +0100] - replica_generate_next_csn: opcsn=54671f1f0001 <= basecsn=54671f1f0003, adjusted opcsn=54671f1f00010001

Are these only informational messages that can safely be ignored, or may they be a manifestation of some potential problem?

This looks ok to me, and the message should not be a fatal message. The code handles this correctly by incrementing the sequence number and updating the generator.

That's what I also thought.

In practice it should be very difficult to get the generator to generate a CSN like this. Are all of these machines running in VMs? If so, what is the hypervisor? How many of these do you see per day?

We see it two or three times per day, compared to 2 or 3 modifications per day (according to logconv.pl):

38465 2.16.840.1.113730.3.5.12 DS90 Start Replication Request
24603 2.16.840.1.113730.3.5.5 End Replication Request (incremental update)

2 servers are physical (replica ids 1 and 3) and one is virtual (replica id 2). Each of the three is MMR-replicated to the two others.
On rep_id 1 (physical hardware):
[15/Nov/2014:03:58:43 +0100] - replica_generate_next_csn: opcsn=5466c1640001 <= basecsn=5466c1640002, adjusted opcsn=5466c16400010001
[15/Nov/2014:10:38:38 +0100] - replica_generate_next_csn: opcsn=54671f1f0001 <= basecsn=54671f1f0003, adjusted opcsn=54671f1f00010001
[16/Nov/2014:01:43:44 +0100] - replica_generate_next_csn: opcsn=5467f3410001 <= basecsn=5467f34100010002, adjusted opcsn=5467f34100010001
[17/Nov/2014:09:34:54 +0100] - replica_generate_next_csn: opcsn=5469b32f0001 <= basecsn=5469b32f0002, adjusted opcsn=5469b32f00010001
[17/Nov/2014:16:09:48 +0100] - replica_generate_next_csn: opcsn=546a0fbd0001 <= basecsn=546a0fbd00020002, adjusted opcsn=546a0fbd00030001
[17/Nov/2014:16:55:55 +0100] - replica_generate_next_csn: opcsn=546a1a8c0001 <= basecsn=546a1a8c0002, adjusted opcsn=546a1a8c00010001
[17/Nov/2014:19:34:14 +0100] - replica_generate_next_csn: opcsn=546a3fa70001 <= basecsn=546a3fa70003, adjusted opcsn=546a3fa700010001

On rep_id 2 (virtual, VMware ESXi 5.5):
[15/Nov/2014:04:19:09 +0100] - replica_generate_next_csn: opcsn=5466c62e0002 <= basecsn=5466c62e0003, adjusted opcsn=5466c62e00010002
[17/Nov/2014:15:47:11 +0100] - replica_generate_next_csn: opcsn=546a0a710002 <= basecsn=546a0a720003, adjusted opcsn=546a0a720002
[17/Nov/2014:15:48:11 +0100] - replica_generate_next_csn: opcsn=546a0aac00010002 <= basecsn=546a0aac00020003, adjusted opcsn=546a0aac00020002
[17/Nov/2014:15:49:36 +0100] - replica_generate_next_csn: opcsn=546a0b010002 <= basecsn=546a0b0100020003, adjusted opcsn=546a0b0100030002

On rep_id 3 (physical hardware):
[16/Nov/2014:05:02:34 +0100] - replica_generate_next_csn: opcsn=546821db0003 <= basecsn=546821dc0002, adjusted opcsn=546821dc00010003

In the source code (./ldap/servers/plugins/replication/repl5_replica.c) it looks like a serious one (SLAPI_LOG_FATAL):

slapi_log_error(SLAPI_LOG_FATAL, NULL, "replica_generate_next_csn: opcsn=%s <= basecsn=%s, adjusted opcsn=%s\n", opcsnstr, basecsnstr, opcsn2str);

It should not be
FATAL. Please file a ticket.

Ok. Done: https://fedorahosted.org/389/ticket/47959

Thanks!
[389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded
Hi,

I've continued testing 389ds v1.3.2.24 on CentOS 7. I really have the impression that everything works fine (plugins etc.), but replication seems to be a little fragile. Both of the tickets I've already opened concern replication partially or completely (https://fedorahosted.org/389/ticket/47942 and https://fedorahosted.org/389/ticket/47950).

Here is another issue with replication: I have two servers with multi-master agreements on each of them (the same configuration as in ticket https://fedorahosted.org/389/ticket/47942). We add/delete a lot of groups (943, to be exact). Each group may contain a large number of referenced entries, up to ~250 (uniqueMember: dn). The MemberOf plugin is activated and works fine. The Referential Integrity plugin is also activated, but of course it only comes into play when deleting groups (or renaming them). This goes on for a long time (20-30 minutes or more).

Some time after the beginning of the operations (typically 5-8 minutes) we get replication errors and inconsistency of the replica concerning the entries mentioned in the error log. When adding and deleting groups the supplier is ok; however, the consumer has several (from one to four or five) group deletions/adds that are not replicated.

The errors on the supplier:

[12/Nov/2014:16:46:42 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid fa90219d-6a8211e4-a42c901a-94623bee, CSN 546380d60002): Operations error (1). Will retry later.
[12/Nov/2014:16:47:55 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 1e5367ae-6a8311e4-a42c901a-94623bee, CSN 546381250002): Operations error (1). Will retry later.
[12/Nov/2014:16:53:14 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid f4e70b85-6a8311e4-a42c901a-94623bee, CSN 54638262): Operations error (1). Will retry later.
[12/Nov/2014:16:55:12 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 3c6d978a-6a8411e4-a42c901a-94623bee, CSN 546382d600040002): Operations error (1). Will retry later.
[12/Nov/2014:16:56:31 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 6030dd93-6a8411e4-a42c901a-94623bee, CSN 546383250002): Operations error (1). Will retry later.
[12/Nov/2014:16:57:22 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 83f42395-6a8411e4-a42c901a-94623bee, CSN 5463835d0002): Operations error (1). Will retry later.
The corresponding errors on the consumer seem to hint at deadlocks in these cases:

[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546380d60002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546380d60002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog csn 546380d60002
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546381250002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546381250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 1e5367ae-6a8311e4-a42c901a-94623bee, optype: 16) to changelog csn 546381250002
[12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=54638262) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:53:13
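When changes fail to replay like this, one way to watch whether the retries eventually succeed is to poll the agreement's status attributes (the same nsds5replicaLastUpdateStatus and nsds5replicaUpdateInProgress attributes shown in the agreement entry earlier in this digest). A sketch, assuming a Directory Manager bind; the host and port are placeholders taken from this thread:

```
ldapsearch -x -H ldaps://ldap-model.polytechnique.fr:636 \
  -D "cn=Directory Manager" -W \
  -b "cn=config" "(objectClass=nsDS5ReplicationAgreement)" \
  nsds5replicaLastUpdateStatus nsds5replicaUpdateInProgress \
  nsds5replicaChangesSentSinceStartup
```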
Re: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded
- Original Message -
From: Ivanov Andrey (M.) andrey.iva...@polytechnique.fr
To: General discussion list for the 389 Directory server project. 389-users@lists.fedoraproject.org
Sent: Wednesday 12 November 2014 18:52:44
Subject: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded

Hi,

I've continued testing 389ds v1.3.2.24 on CentOS 7. I really have the impression that everything works fine (plugins etc.), but replication seems to be a little fragile. Both of the tickets I've already opened concern replication partially or completely (https://fedorahosted.org/389/ticket/47942 and https://fedorahosted.org/389/ticket/47950).

Here is another issue with replication: I have two servers with multi-master agreements on each of them (the same configuration as in ticket https://fedorahosted.org/389/ticket/47942). We add/delete a lot of groups (943, to be exact). Each group may contain a large number of referenced entries, up to ~250 (uniqueMember: dn). The MemberOf plugin is activated and works fine. The Referential Integrity plugin is also activated, but of course it only comes into play when deleting groups (or renaming them). This goes on for a long time (20-30 minutes or more).

Some time after the beginning of the operations (typically 5-8 minutes) we get replication errors and inconsistency of the replica concerning the entries mentioned in the error log. When adding and deleting groups the supplier is ok; however, the consumer has several (from one to four or five) group deletions/adds that are not replicated.

The errors on the supplier:

[12/Nov/2014:16:46:42 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid fa90219d-6a8211e4-a42c901a-94623bee, CSN 546380d60002): Operations error (1). Will retry later.
[12/Nov/2014:16:47:55 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 1e5367ae-6a8311e4-a42c901a-94623bee, CSN 546381250002): Operations error (1). Will retry later.
[12/Nov/2014:16:53:14 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid f4e70b85-6a8311e4-a42c901a-94623bee, CSN 54638262): Operations error (1). Will retry later.
[12/Nov/2014:16:55:12 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 3c6d978a-6a8411e4-a42c901a-94623bee, CSN 546382d600040002): Operations error (1). Will retry later.
[12/Nov/2014:16:56:31 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 6030dd93-6a8411e4-a42c901a-94623bee, CSN 546383250002): Operations error (1). Will retry later.
[12/Nov/2014:16:57:22 +0100] NSMMReplicationPlugin - agmt=cn=Replication from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): Consumer failed to replay change (uniqueid 83f42395-6a8411e4-a42c901a-94623bee, CSN 5463835d0002): Operations error (1). Will retry later.
The corresponding errors on the consumer seem to hint at deadlocks in these cases:

[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546380d60002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546380d60002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog csn 546380d60002
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546381250002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546381250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 1e5367ae-6a8311e4-a42c901a
Re: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded
- Original Message -

The corresponding errors on the consumer seem to hint at deadlocks in these cases:

[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546380d60002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546380d60002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog csn 546380d60002
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546381250002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546381250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 1e5367ae-6a8311e4-a42c901a-94623bee, optype: 16) to changelog csn 546381250002
[12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=54638262) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (54638262); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=MAT471-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: f4e70b85-6a8311e4-a42c901a-94623bee, optype: 16) to changelog csn 54638262
[12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546382d600040002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546382d600040002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=MEC592-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 3c6d978a-6a8411e4-a42c901a-94623bee, optype: 16) to changelog csn 546382d600040002
[12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=546383250002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (546383250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=PHY566-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 6030dd93-6a8411e4-a42c901a-94623bee, optype: 16) to changelog csn 546383250002
[12/Nov/2014:16:57:20 +0100] NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: retry (49) the transaction (csn=5463835d0002) failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock))
[12/Nov/2014:16:57:20 +0100]
NSMMReplicationPlugin - changelog program - _cl5WriteOperationTxn: failed to write entry with csn (5463835d0002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock
[12/Nov/2014:16:57:20 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: can't add a change for cn=PHY651K-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu (uniqid: 83f42395-6a8411e4-a42c901a-94623bee, optype: 16) to changelog csn
[389-users] Group modifications and internalModifiersName
Hi,

I am continuing my tests of 389ds v1.3.2.24, and I've encountered another bug or strange behavior (by design?). I've activated bind dn tracking (nsslapd-plugin-binddn-tracking: on). There is an account that has the right to add entries and to change some attributes (e.g. description). The corresponding ACI:

dn: ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
aci: (targetattr = objectClass || uniqueMember || owner || cn || description || businessCategory ) (version 3.0;acl Droits de rejouter/supprimer/modifier les groupes et leurs attributs;allow ( add, delete, read, compare, search, write )(userdn=ldap:///uid=sync-cours,ou=Comptes generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu);)

Any attempt to modify an authorized attribute from the list above (for example, description) results in:

ldap_modify: Insufficient access (50)
additional info: Insufficient 'write' privilege to the 'internalModifiersName' attribute of entry 'cn=mec431-2014,ou=2014,ou=cours,ou=enseignement,ou=groupes,dc=id,dc=polytechnique,dc=edu'.
[11/Nov/2014:10:38:49 +0100] conn=4 fd=256 slot=256 connection from 129.104.31.54 to 129.104.69.49
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 RESULT err=14 tag=97 nentries=0 etime=0.008000, SASL bind in progress
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 RESULT err=14 tag=97 nentries=0 etime=0.002000, SASL bind in progress
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 RESULT err=0 tag=97 nentries=0 etime=0.001000 dn=uid=sync-cours,ou=comptes generiques,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 SRCH base=dc=id,dc=polytechnique,dc=edu scope=2 filter=(cn=MEC431-2014) attrs=ALL
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 RESULT err=0 tag=101 nentries=1 etime=0.003000
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 MOD dn=cn=MEC431-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 RESULT err=50 tag=103 nentries=0 etime=0.002000

Is this expected behavior, meaning I need to add the right to modify the internalModifiersName attribute to every ACI that allows modifications? (If I add it, everything works fine and the internalModifiersName attribute becomes cn=ldbm database,cn=plugins,cn=config.) Or is it a bug?

Thank you!

Regards,
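Until the behavior is settled, the workaround described above (granting write access to internalModifiersName) can be sketched as an extra ACI. The DN and bind account come from this thread; the acl name is made up, and internalModifyTimestamp is included speculatively on the assumption that it is maintained the same way:

```ldif
dn: ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
changetype: modify
add: aci
aci: (targetattr = "internalModifiersName || internalModifyTimestamp")
 (version 3.0; acl "workaround internal attrs"; allow (write)
 (userdn="ldap:///uid=sync-cours,ou=Comptes generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu");)
```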
Re: [389-users] Group modifications and internalModifiersName
Thank you Ludwig. I think the attribute should behave as you describe it, so I've opened a ticket: https://fedorahosted.org/389/ticket/47950

- Original Message -
From: Ludwig Krispenz lkris...@redhat.com
To: 389-users@lists.fedoraproject.org
Sent: Tuesday 11 November 2014 11:06:10
Subject: Re: [389-users] Group modifications and internalModifiersName

On 11/11/2014 10:45 AM, Ivanov Andrey (M.) wrote:

Hi,

I am continuing my tests of 389ds v1.3.2.24, and I've encountered another bug or strange behavior (by design?). I've activated bind dn tracking (nsslapd-plugin-binddn-tracking: on). There is an account that has the right to add entries and to change some attributes (e.g. description). The corresponding ACI:

dn: ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
aci: (targetattr = objectClass || uniqueMember || owner || cn || description || businessCategory ) (version 3.0;acl Droits de rejouter/supprimer/modifier les groupes et leurs attributs;allow ( add, delete, read, compare, search, write )(userdn=ldap:///uid=sync-cours,ou=Comptes generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu);)

Any attempt to modify an authorized attribute from the list above (for example, description) results in:

ldap_modify: Insufficient access (50)
additional info: Insufficient 'write' privilege to the 'internalModifiersName' attribute of entry 'cn=mec431-2014,ou=2014,ou=cours,ou=enseignement,ou=groupes,dc=id,dc=polytechnique,dc=edu'.
[11/Nov/2014:10:38:49 +0100] conn=4 fd=256 slot=256 connection from 129.104.31.54 to 129.104.69.49
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 RESULT err=14 tag=97 nentries=0 etime=0.008000, SASL bind in progress
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 RESULT err=14 tag=97 nentries=0 etime=0.002000, SASL bind in progress
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 BIND dn= method=sasl version=3 mech=GSSAPI
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 RESULT err=0 tag=97 nentries=0 etime=0.001000 dn=uid=sync-cours,ou=comptes generiques,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 SRCH base=dc=id,dc=polytechnique,dc=edu scope=2 filter=(cn=MEC431-2014) attrs=ALL
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 RESULT err=0 tag=101 nentries=1 etime=0.003000
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 MOD dn=cn=MEC431-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 RESULT err=50 tag=103 nentries=0 etime=0.002000

Is this expected behavior, meaning I need to add the right to modify the internalModifiersName attribute to every ACI that allows modifications?

Good question - I am not sure if this was intentional, but I think internalModifiersName should be written like modifiersName, without requiring specific permission. So for now I suggest you add the aci and open a ticket to get it investigated.

(If I add it, everything works fine and the internalModifiersName attribute becomes cn=ldbm database,cn=plugins,cn=config.) Or is it a bug?

Thank you!
Regards,
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica
Hi Noriko, as promised, here is the new ticket for the total replication bug discussed yesterday: https://fedorahosted.org/389/ticket/47942

Regards,
Andrey

From: Noriko Hosoi nho...@redhat.com
To: General discussion list for the 389 Directory server project. 389-users@lists.fedoraproject.org
Sent: Wednesday, 5 November 2014 21:54:32
Subject: Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica

On 11/05/2014 12:46 PM, Ivanov Andrey (M.) wrote:

Next time it happens, could it be possible to get the stacktraces from the hung server? http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs

Ok, I'll do that tomorrow (for 1.3.2.24, since I'm testing mainly this one). It happens each time during a full on-line initialization, so it won't be difficult to reproduce :) It does not really hang; only the online initialization hangs, in fact (with logs similar to the original mail)...

That'd be great! If you could capture them, could you open a ticket at https://fedorahosted.org/389/newticket and attach the stacktraces to the ticket?

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
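The stacktrace capture described in the linked FAQ boils down to attaching gdb to the running ns-slapd and dumping a full backtrace of every thread. A rough sketch, assuming the usual /usr/sbin/ns-slapd binary path and a single instance (the pidof-based PID discovery is my assumption; adjust for your install):

```shell
# Find the slapd PID (assumes a single instance; adjust if you run several).
PID=$(pidof ns-slapd 2>/dev/null | awk '{print $1}')

# gdb invocation that dumps all thread backtraces, per the debugging-hangs FAQ.
CMD="gdb -ex 'set pagination off' -ex 'thread apply all bt full' -ex quit /usr/sbin/ns-slapd $PID"
echo "$CMD"

# While the online initialization is hung, run it and keep the output for the ticket:
#   eval "$CMD" > /tmp/slapd-stacktrace.$(date +%s).txt 2>&1
```

Capture two or three dumps a few seconds apart: threads that sit on the same lock in every dump are the interesting ones.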
Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica
----- Original Message -----

Don't know. My hypotheses are:
* the use of plugin transactions compared to 1.2.10.x;
* the bdb version? But even with compat-db-47 and 1.2.10 the problem still happens on CentOS 7, though much less frequently. It never happens with 1.2.10 with the rpm bdb on CentOS 5;
* the change from the Mozilla LDAP libraries to the OpenLDAP libraries?

It seems to be some sort of thread or transaction contention that is reduced when I add CPUs or increase the checkpoint interval. It really looks like the master server just stops sending entries at some moment... SSL/TLS slows things down, so fewer entries are sent before everything gets stuck... I'll get back with more information (stacktraces) tomorrow.

Another hypothesis: insufficient entropy generation speed for the TLS/SSL total update (/dev/urandom vs the blocking /dev/random), especially in VMs?

It is possible the VM system is running out of entropy and applications experience long delays. To verify:

cat /proc/sys/kernel/random/entropy_avail

One way to fix this is to install and run the haveged service on the KVM guest; it can be downloaded from EPEL. It can also depend on the VM configuration: for example, if using KVM and libvirt (a recent version), use the KVM host's entropy with a configuration similar to this (inside the guest's <devices> section):

<rng model='virtio'>
  <backend model='random'>/dev/random</backend>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</rng>

Without that config, my test RHEL 7 KVM guest has quite low entropy, and the entropy will also depend on the CPU characteristics.

Thank you Marc. I'll try checking the entropy pool state during the total on-line import. We are using VMware for virtualization, so there is no simple way to expose the host's /dev/random to the guest VMs... However, I've had this problem (stuck initial replication) even with the plain LDAP (port 389) replica protocol, though it happened much less frequently.
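Putting Marc's check and suggested fix together, a small sketch (the 200-bit threshold is an arbitrary illustration, not an official limit; note that recent kernels report a fixed pool size, so absolute values vary by kernel version):

```shell
# Read the kernel's available entropy estimate.
entropy=$(cat /proc/sys/kernel/random/entropy_avail)
echo "entropy_avail=$entropy"

# Persistently low values can stall reads from the blocking /dev/random,
# e.g. during TLS handshakes on an entropy-starved VM.
if [ "$entropy" -lt 200 ]; then
    echo "Entropy pool looks low; consider the haveged service from EPEL:"
    echo "  yum install haveged && systemctl enable --now haveged"
fi
```

Watching this value during the total update (e.g. in a `watch` loop) would confirm or rule out the entropy hypothesis on a given VM.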
Anyway, I've made a ticket for this problem: https://fedorahosted.org/389/ticket/47942

--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users
Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica
Hi,

I'm having the same problem. I'm in the process of migrating from 389DS v1.2.10.25/CentOS 5 to 389DS on CentOS 7. Everything is working fine on standalone servers except replication (especially online initialization). It gets stuck _each time_ during online initialization with SSL/TLS (and sometimes without SSL/TLS), and with exactly the same error messages as you describe. Network problems are excluded in my case: I used both virtual machines on the same ESXi in the same network and physical servers, and the results were the same.

I've tried compiling the latest available branches (tags 1.3.2.24, 1.3.1.22, 1.3.3.5). In all cases the result was the same: the server pushing the updates just gets stuck at some random number of entries sent to the consumer (we have ~3 entries; it gets stuck at random somewhere between 1200 and 25000 entries, and the entries where it gets stuck have nothing particular in size; it's completely random). 1.2.10.24 compiled with compat-db-4.7 on CentOS 7 has the fewest of these problems (and its initial replication is 10 times faster: it takes 8 seconds instead of 80 for 1.3.x!). I've been using 1.2.10.24 on CentOS 5 compiled with the Mozilla LDAP libraries and 1.2.10.23 on CentOS 7 compiled with the OpenLDAP libraries. The first one had no problems at all pushing the initial replication; the second had intermittent problems, but far fewer than v1.3.x.

I've noticed that this problem gets worse (or simply appears) if:
* the replica is of type 3 (multi-master), with replication agreements in both directions;
* our schema has several additional attributes (this may also be important);
* the virtual machine has only one CPU. Adding a second CPU increases the number of transferred entries before the initialization gets stuck, so it may be some thread/transaction contention or deadlock;
* the replication agreement uses SSL (port 636) or TLS (port 389). Using port 389 with plain LDAP instead of TLS/SSL increases the number of transferred entries before the initialization gets stuck; sometimes the initialization even finishes successfully in this case;
* nsslapd-db-checkpoint-interval is decreased (say, to 5 seconds), which also makes the problem worse.

When the on-line initialization is finished (if it finishes), there are no problems. I think it is related to the volume of data transferred, so small incremental updates do not cause any problem.

If necessary, I will do any debugging/tests: this is a critical element of our infrastructure, so I'd like this problem to be resolved...

Regards,
Andrey IVANOV

----- Original Message -----
From: 陳含林 laneo...@gmail.com
To: 389-users@lists.fedoraproject.org
Sent: Wednesday, 5 November 2014 18:01:37
Subject: [389-users] 389-Directory/1.3.1.6 cannot setup replica

Hello all,

I have set up an IDM/FreeIPA master using CentOS 7 and imported about 5000 hosts. Then I tried to set up an IDM/FreeIPA replication server using ipa-replica-install. It seems the total update on the replication server hangs after about 1000+ entries are imported. I tried to trigger a total update by setting nsds5beginreplicarefresh, but the result was the same. Can anyone help me? Thanks!

idm1 is the master, idm2 is the replication server.

Master server logs:

[06/Nov/2014:00:21:48 +0800] - 389-Directory/1.3.1.6 B2014.219.1825 starting up
[06/Nov/2014:00:21:48 +0800] schema-compat-plugin - warning: no entries set up under cn=computers, cn=compat,dc=idc
[06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added before the CoS Definition.
[06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added before the CoS Definition.
[06/Nov/2014:00:21:51 +0800] - slapd started. Listening on All Interfaces port 389 for LDAP requests
[06/Nov/2014:00:21:51 +0800] - Listening on All Interfaces port 636 for LDAPS requests
[06/Nov/2014:00:21:51 +0800] - Listening on /var/run/slapd-IDC.socket for LDAPI requests
[06/Nov/2014:00:21:51 +0800] - Entry uid=admin,ou=people,o=ipaca -- attribute krbExtraData not allowed
[06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin - agmt=cn=meToidm2.ra.cn.idc (idm2:389): The remote replica has a different database generation ID than the local database. You may have to reinitialize the remote replica, or the local replica.
[06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin - Beginning total update of replica agmt=cn=meToidm2.ra.cn.idc (idm2:389).

Replication server logs:

[06/Nov/2014:00:40:18 +0800] - 389-Directory/1.3.1.6 B2014.219.1825 starting up
[06/Nov/2014:00:40:18 +0800] ipalockout_get_global_config - [file ipa_lockout.c, line 185]: Failed to get default realm (-1765328160)
[06/Nov/2014:00:40:18 +0800] ipaenrollment_start - [file ipa_enrollment.c, line 393]: Failed to get default realm?!
[06/Nov/2014:00:40:18 +0800] - slapd
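For reference, the nsds5beginreplicarefresh retry mentioned above is done by writing "nsds5BeginReplicaRefresh: start" to the replication agreement entry under cn=config. A minimal sketch; the agreement DN below is a guess modeled on the agmt name in the logs and the usual cn=mapping tree layout, so adjust it to your topology before use:

```shell
# Hypothetical agreement DN based on the "agmt=cn=meToidm2.ra.cn.idc" seen in the logs.
AGMT='cn=meToidm2.ra.cn.idc,cn=replica,cn=dc\3Didc,cn=mapping tree,cn=config'

# LDIF that re-triggers a total update (online initialization) of the consumer.
LDIF="dn: $AGMT
changetype: modify
replace: nsds5BeginReplicaRefresh
nsds5BeginReplicaRefresh: start"
echo "$LDIF"

# To apply it, then watch the initialization status on the agreement entry:
#   printf '%s\n' "$LDIF" | ldapmodify -x -D "cn=Directory Manager" -W
#   ldapsearch -x -D "cn=Directory Manager" -W -b "$AGMT" -s base nsds5ReplicaLastInitStatus
```

If the init hangs again, the status attribute on the agreement, together with the stacktraces requested elsewhere in this thread, is what the developers will want attached to a ticket.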