[389-users] Re: advice on 389 in production

2024-06-06 Thread Ivanov Andrey (M.)
Hi Morgan,

in our case we have ~60 000 entries, ~10 000 accounts and ~5 000 large groups
(some containing almost all users). Three 389ds servers in active-active replication:
extremely stable, performant, no problems at all. We are using RHEL 9 clones
(Oracle Linux or Alma Linux) with the latest OS patches (9.4, I think), and 389ds
version 2.5 compiled from the git branch (I think the latest one is 2.5.1; maybe it
is available as an rpm). No complaints at all :)
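
For anyone curious, a from-git build looks roughly like this (a sketch only - the
branch name and ./configure switches below are illustrative and depend on your
environment):

git clone -b 389-ds-base-2.5 https://github.com/389ds/389-ds-base.git
cd 389-ds-base
./autogen.sh
./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
make -j"$(nproc)"
make install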

Regards,

AI


- Mail original -
> De: "Morgan Jones" 
> À: "General discussion list for the 389 Directory server, project." 
> <389-users@lists.fedoraproject.org>
> Envoyé: Mercredi 5 Juin 2024 22:25:14
> Objet: [389-users] advice on 389 in production

> Hello Everyone,
> 
> What operating system and 389 version is everyone running in production?  We 
> are
> finally updating our CentOS 7 servers in earnest.
> 
> We have almost 200,000 users and use 389 for our central ldap so stability is
> preferred over features.
> 
> Based on release dates I'm leaning toward version 2.x.
> 
> I've spent the afternoon trying to find packages for Rocky Linux 9 with limited
> success.
> 
> We switched to Ubuntu a few years ago so that would be my preference, but I don't
> see packages for Ubuntu and I'd prefer not to maintain my own packages.
> 
> Is Docker a viable option for a production install?  I see there is an up to
> date image which I've been able to start but it appears to be 3.x (see above
> re: preferred production version).
> 
> Thanks,
> 
> -morgan


[389-users] Re: Recent commits in stable 389ds branches - discussion

2021-12-07 Thread Ivanov Andrey (M.)


>>> 3) A new default plugin requirement, the plugin being written in Rust - 
>>> probably
>>> its introduction is FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in
>>> FIPS mode).
>> This was a very important fix to get into 1.4.4, usually big changes do not 
>> land
>> in 1.4.4 anymore, but this one needed to get in.
> 
> This change was about the C code, not Rust code if I recall correctly, since
> that's the inbuilt PBKDF2_SHA256 module, not the pwdchan one with openldap
> compat. The RUST pbkdf2 module has existed since early 1.4.4 and that's needed
> for openldap migration which we at SUSE enable by default (I don't think RH do
> yet).

the changes in dse.ldif in that issue (3584) made libpwdchan-plugin required 
for the server (new entries in in cn=Password Storage 
Schemes,cn=plugins,cn=config).

> 
> 
>>> See my comment
>>> https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. 
>>> Rust
>>> becomes a requirement for building the server, which is fine, but then it
>>> should be enabled by default in "./configure". Without it the server does 
>>> not
>>> compile the new plugin and complains about it when starting:
>>> [01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could 
>>> not
>>> open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for 
>>> plugin
>>> PBKDF2
>> Yes I do understand this frustration, and it is now fixed for non-rust 
>> builds.
> 
> I think this error specifically came about if you did a rust build, then you
> took rust away, it created some leftovers in dse.ldif I think (?).
No, actually i have never installed rust on any of our production or build 
servers. dse.ldif now integrates by default rust-written plugins but 
--enable-rust is not a default option in ./configure, that was in fact the 
problem.


Thank you, William!

Sincerely,
Andrey


[389-users] Re: Recent commits in stable 389ds branches - discussion

2021-12-07 Thread Ivanov Andrey (M.)
Hi Mark, 

thank you for your detailed reply. I do not have objections; it's just feedback
on my experience with the recent changes that I wanted to share, since some things
were a bit unexpected for me. My comments are below.




> On 12/3/21 6:29 AM, Ivanov Andrey (M.) wrote:
>> I'd like to discuss several recent commits (from the last couple of months) in the
>> stable branches of 389ds. I will be talking about 1.4.4
>> (https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4) since it's the one
>> we are using in production, but I think the same applies to 1.4.3. These commits
>> are welcome and go in the right direction; however, the changes they produce are
>> not something one expects when the server version changes in the 4th digit
>> (e.g. 1.4.4.17 -> 1.4.4.18). Here they are:
>
> I guess we don't follow the same principles :-) For the most part these are all
> minor RFEs except for Rust, but Rust has been in use in our product (1.4.x series)
> for well over a year now, so I'm surprised to see issues arise about it now. But
> adding these RFEs is not out of line IMHO; obviously you feel a little different
> about that.

Yes, I would think these changes (especially the shift of certain files to
/dev/shm and the Rust dependency during the server build) should have landed in
1.4.5 (corresponding to RHEL 8.n -> RHEL 8.n+1).

To take my experience as an example, the move of DB files to /dev/shm broke the
startup of a newly created server (dscreate -f ...), since the /dev/shm size by
default is only 50% of server memory. In my case the size of these files was more
than 50% of memory, so I had to make adjustments (either increase the memory of
the VM or change the db_home_dir parameter to move the abovementioned files back
to disk). As for Rust - I have never had Rust installed on my build server; I used
my usual ./configure switches that worked for 1.4.4.17, the server compiled OK,
but the error logs at startup were filled with ERR-severity messages. Anyway, the
problem is resolved, and I think that, as you say, we don't have the same
perception of change importance vs. server version change.
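
In practice the workaround boils down to one of these two (a sketch with placeholder
instance name and sizes):

# Option 1: enlarge /dev/shm (illustrative size; make it persistent via /etc/fstab,
# e.g. "tmpfs /dev/shm tmpfs defaults,size=8G 0 0"):
mount -o remount,size=8G /dev/shm

# Option 2: for a new instance, point the DB home directory back to disk in the
# dscreate .inf file (key name as I understand it; `dscreate create-template`
# prints the authoritative template):
#   [slapd]
#   db_home_dir = /var/lib/dirsrv/slapd-instance/db
dscreate from-file ./instance.inf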



>> 1) Some database files [presumably memory-mapped files that are OK to be lost
>> at reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now
>> moved to /dev/shm/slapd-instance/. This modification seems to work fine (and
>> should increase performance); however, there is an error message at server
>> startup when /dev/shm is empty (for example, after each OS reboot) and the
>> server needs to create the files:
>>
>> [03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not
>> open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable
>> Runtime -5950 (File not found.)
>>
>> After the next 389ds restart this ERR message does not appear, but it appears
>> after each OS reboot (since /dev/shm is cleaned up after each reboot).
>
> We can look into modifying this behavior, especially since it's not a fatal
> error. We can change the logging severity to NOTICE (from ERR) or something
> like that.
Yes, I think that if it is not critical, the logging severity for this particular
message should be lowered to NOTICE. Every ERR-level message makes me a bit
nervous about server data sanity and integrity, especially at startup.



> To be honest, error log messages should not be expected to be static. As work
> is done on the server, logging messages are added/removed and/or changed all
> the time, and that's not going to change.

I agree 100% with that, and as you say, the severity level of this ultimately
benign case (absence of "/dev/shm/slapd-xxx/DBVERSION") should be adjusted to
NOTICE in order not to scare the admin :))




> Now, I know that when we added "wtime" and "optime" to the access logging it
> did cause some issues for admins who parse our access logs. We could have done
> better at communicating this change (live and learn). But at the same time this
> new logging is tremendously useful and has helped many customers troubleshoot
> various performance issues. So while these changes can be disruptive, we felt
> the pros outweighed the cons.

I found that change very interesting and useful when it was introduced, to be
honest; it simplifies debugging performance issues and lockups.


>> 2) The UNIX socket of the server was moved to /run/slapd-instance.socket, and
>> a new keyword ("ldapi") in the .inf file for dscreate has appeared.
>> It works fine, but it had an impact on our scripts that use the ldapi socket path.
>
> In this case using /var/run was outdated and was causing issues with
> systemd/tmpfiles on RHEL, and moving it to /run was the correct thing to do.
> What I don't understand is why adding the option to set the LDAPI path in the
> INF file is a problem. Ca

[389-users] Re: Recent commits in stable 389ds branches - discussion

2021-12-03 Thread Ivanov Andrey (M.)
Just to add to the previous mail - there is another phenomenon apparently linked
to the new plugin: at each server start, two error messages about plugins with
NULL identities are displayed:
... 
[03/Dec/2021:14:41:38.945576751 +0100] - INFO - main - 389-Directory/1.4.4.17 
B2021.337.1333 starting up 
[03/Dec/2021:14:41:38.946206385 +0100] - INFO - main - Setting the maximum file 
descriptor limit to: 64000 
[03/Dec/2021:14:41:38.951185055 +0100] - ERR - allow_operation - Component 
identity is NULL 
[03/Dec/2021:14:41:38.951846429 +0100] - ERR - allow_operation - Component 
identity is NULL 
[03/Dec/2021:14:41:39.546909815 +0100] - INFO - PBKDF2_SHA256 - Based on CPU 
performance, chose 2048 rounds 
[03/Dec/2021:14:41:39.566959933 +0100] - INFO - 
ldbm_instance_config_cachememsize_set - force a minimal value 512000 
... 

> De: "Ivanov Andrey" 
> À: "General discussion list for the 389 Directory server, project."
> <389-users@lists.fedoraproject.org>
> Envoyé: Vendredi 3 Décembre 2021 12:29:31
> Objet: [389-users] Recent commits in stable 389ds branches - discussion

> Hi,

> I'd like to discuss several recent (since a couple of months) commits in 
> stable
> branches of 389ds. I will be talking about 1.4.4
> (https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4) since it's the
> one we are using in production, but i think it's the same for 1.4.3. These
> commits are welcome and go in the right direction, however the changes they
> produce are not something one expects when the server version changes in 4th
> digit (ex. 1.4.4.17 -> 1.4.4.18). Here they are:

> 1) Some database files [presumably memory-mapped files that are OK to be lost
> at
> reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now
> moved to /dev/shm/slapd-instance/. This modification seems to work fine (and
> should increase performance), however there is an error message at server
> startup when /dev/shm is empty (for example, after each OS reboot) when the
> server needs to create the files:
> [03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not
> open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable
> Runtime -5950 (File not found.)
> After the next 389ds restart this ERR message does not appear, but it appears
> after each OS reboot (since /dev/shm is cleaned up after each reboot).

> 2) UNIX socket of the server was moved to /run/slapd-instance.socket, a new
> keyword in .inf file for dscreate ("ldapi") has appeared.
> Works fine, but it had an impact on our scripts that use ldapi socket path.

> 3) A new default plugin requirement, the plugin being written in Rust - 
> probably
> its introduction is FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in
> FIPS mode). See my comment
> https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust
> becomes a requirement for building the server, which is fine, but then it
> should be enabled by default in "./configure". Without it the server does not
> compile the new plugin and complains about it when starting:
> [01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could 
> not
> open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for 
> plugin
> PBKDF2
> ...

> Thank you and keep up the good work, we use 389ds in production since 2007 and
> we are quite happy with it :)

> Regards,
> Andrey



[389-users] Recent commits in stable 389ds branches - discussion

2021-12-03 Thread Ivanov Andrey (M.)
Hi, 

I'd like to discuss several recent commits (from the last couple of months) in the
stable branches of 389ds. I will be talking about 1.4.4
(https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4) since it's the one we
are using in production, but I think the same applies to 1.4.3. These commits are
welcome and go in the right direction; however, the changes they produce are not
something one expects when the server version changes in the 4th digit
(e.g. 1.4.4.17 -> 1.4.4.18). Here they are:

1) Some database files [presumably memory-mapped files that are OK to be lost at
reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now moved
to /dev/shm/slapd-instance/. This modification seems to work fine (and should
increase performance); however, there is an error message at server startup when
/dev/shm is empty (for example, after each OS reboot) and the server needs to
create the files:
[03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not
open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable
Runtime -5950 (File not found.)
After the next 389ds restart this ERR message does not appear, but it appears
after each OS reboot (since /dev/shm is cleaned up after each reboot).

2) The UNIX socket of the server was moved to /run/slapd-instance.socket, and a
new keyword ("ldapi") in the .inf file for dscreate has appeared (a sample .inf
fragment is sketched after item 3 below).
It works fine, but it had an impact on our scripts that use the ldapi socket path.

3) A new default plugin requirement, the plugin being written in Rust - its
introduction is probably FIPS-related (Issue 3584 - Fix PBKDF2_SHA256 hashing in
FIPS mode). See my comment
https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust
becomes a requirement for building the server, which is fine, but then it should
be enabled by default in "./configure". Without it the server does not compile
the new plugin and complains about it when starting:
[01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could not
open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for plugin
PBKDF2
... 
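
For item 2, the relevant .inf fragment would look roughly like this (a sketch on my
side - the exact key name and section are assumptions; `dscreate create-template`
prints the authoritative template):

[slapd]
instance_name = instance
; assumed location of the new option:
ldapi = /run/slapd-instance.socket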

Thank you, and keep up the good work - we have been using 389ds in production
since 2007 and we are quite happy with it :)

Regards, 
Andrey 


[389-users] Re: How to analyze large Multi Master Replication (test)-network?

2021-02-28 Thread Ivanov Andrey (M.)
Hi,

Use the RHDS 11 documentation instead of 10, it's more up-to-date 
(https://access.redhat.com/documentation/en-us/red_hat_directory_server/11).

Concerning replication, you can check the whole chapter:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#Managing_Replication

What you are trying to do (checking the consistency of LDAP replicas) is probably
completely or partially implemented by the following two utilities:
* "ds-replcheck", which compares two replicas:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#comparing_two_directory_server_databases
* "dsconf replication monitor", which checks the time skew and the coherence of
the RUVs:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html-single/administration_guide/index#monitoring-the-replication-topology


In our production environment we check the state of replication from time to
time with ds-replcheck to be sure the replicas contain identical data.
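
For example, the two checks look roughly like this (a sketch - hostnames, suffix
and credentials are placeholders, and the exact ds-replcheck options are best
confirmed with ds-replcheck --help):

# compare two live replicas entry by entry
ds-replcheck online -b 'dc=example,dc=com' -D 'cn=Directory Manager' -w 'secret' \
    -m ldaps://ldap1.example.com:636 -r ldaps://ldap2.example.com:636

# quick topology overview: RUVs, time skew, agreement status
dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 'secret' \
    replication monitor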


As for the order of configuration, you can create the replication agreements in
any order, then initialize them. The best practice is to initialize all the
servers in an MMR topology from the same initial server. Something like this for
a 3-server MMR with ldap1 as the central hub:

# Activate replicas and changelogs, create replication managers
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication create-manager --name 
'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' 
--role='master' --replica-id=1 --bind-dn='cn=repman,cn=config'

/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication create-manager --name 
'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' 
--role='master' --replica-id=2 --bind-dn='cn=repman,cn=config'

/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication create-manager --name 
'cn=repman,cn=config' --passwd 'repman_secret_password'
/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' replication enable --suffix='dc=example,dc=com' 
--role='master' --replica-id=3 --bind-dn='cn=repman,cn=config'


# Create all MMR replication agreements
/usr/sbin/dsconf ldaps://ldap2.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' 
--host='ldap1.example.com' --port=636 --conn-protocol=LDAPS 
--bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' 
--bind-method=SIMPLE 'Replication from ldap2.example.com to ldap1.example.com'
/usr/sbin/dsconf ldaps://ldap3.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' 
--host='ldap1.example.com' --port=636 --conn-protocol=LDAPS 
--bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' 
--bind-method=SIMPLE 'Replication from ldap3.example.com to ldap1.example.com'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' 
--host='ldap2.example.com' --port=636 --conn-protocol=LDAPS 
--bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' 
--bind-method=SIMPLE 'Replication from ldap1.example.com to ldap2.example.com'
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt create --suffix='dc=example,dc=com' 
--host='ldap3.example.com' --port=636 --conn-protocol=LDAPS 
--bind-dn='cn=repman,cn=config' --bind-passwd='repman_secret_password' 
--bind-method=SIMPLE 'Replication from ldap1.example.com to ldap3.example.com'

# Start initialization of replica ldap2 from ldap1
/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt init --suffix='dc=example,dc=com' 
'Replication from ldap1.example.com to ldap2.example.com'
# and wait for its end showing progression every 5 seconds
INITSTATE=`/usr/sbin/dsconf ldaps://ldap1.example.com:636 -D 'cn=Directory 
Manager' -w 'dir_man_secret_password' repl-agmt init-status 
--suffix='dc=example,dc=com' 'Replication from ldap1.example.com to 
ldap2.example.com'`; while [[ $INITSTATE == 'Agreement initialization in 
progress.' ]]; do sleep 5; echo -n '.';INITSTATE=`/usr/sbin/dsconf 
ldaps://ldap1.example.com:636 -D 'cn=Directory Manager' -w 
'dir_man_secret_password' repl-agmt init-status 
--suffix='dc=id,dc=polytechnique,dc=edu' 'Replication from 

[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-26 Thread Ivanov Andrey (M.)

>> 
>> No problem. We've just merged the fix and backported it. I don't know when it
>> will ship in RHEL/CentOS, but I'm sure it will be soon in an upcoming update.
> Well i usually do not use rpms - we compile from git sources, i used them only
> to make a demo of the problem.
> 
> Thanks for the commit, i have tested the fix. It resolves a half of the 
> problem
> - indeed the TLS_REQCERT is now taken into account from
> /etc/openldap/ldap.conf. But the certificate bundle part (TLS_CACERT parameter
> or system bundle in its absence) is still not taken into account. TLS_CACERT
> works correctly in dsconf 1.4.2 (and ldapsearch).

I think I have found the part of the code that causes TLS_CACERT to be ignored:
it's in the file __init__.py, lines 997-999:

 997         if certdir is None and self.isLocal:
 998             certdir = self.get_cert_dir()
 999             self.log.debug("Using dirsrv ca certificate %s", certdir)

If I comment out these lines, dsconf starts to take TLS_CACERT from
/etc/openldap/ldap.conf into account, as it should. It looks like self.isLocal
should not be true here, but it is, and as a result a wrong certdir is used:
DEBUG: Using dirsrv ca certificate 
/Local/dirsrv/etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate 
/Local/dirsrv/etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate 
/Local/dirsrv/etc/dirsrv/slapd-{instance_name}


[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-26 Thread Ivanov Andrey (M.)
Hi William,


>> 
>> Thanks, here is the github ticket:
>> https://github.com/389ds/389-ds-base/issues/4460
>> 
> 
> No problem. We've just merged the fix and backported it. I don't know when it
> will ship in RHEL/CentOS, but I'm sure it will be soon in an upcoming update.
Well, I usually do not use rpms - we compile from git sources; I used them only
to make a demo of the problem.

Thanks for the commit, I have tested the fix. It resolves half of the problem -
indeed, TLS_REQCERT is now taken into account from /etc/openldap/ldap.conf. But
the certificate bundle part (the TLS_CACERT parameter, or the system bundle in
its absence) is still not taken into account. TLS_CACERT works correctly in
dsconf 1.4.2 (and ldapsearch).

Regards,
Andrey


[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-25 Thread Ivanov Andrey (M.)
Hi,


>> But all in all i think i start to see where the problem comes from. dsconf
>> version 1.4.2 uses /etc/openldap/ldap.conf (which in turn uses system pem
>> bundle if no TLS_CACERT is specified) for certs/CA. Starting from 1.4.3 
>> dsconf
>> ignores completely /etc/openldap/ldap.conf file and pays attention only to 
>> its
> own .dsrc file. It explains everything that I see. It's a bit of a pity that
> there is no global section in .dsrc like in /etc/openldap/ldap.conf - one needs
> to create a section per LDAP server, often with the same parameters.
> 
> Well, it should be respecting the value from /etc/openldap/ldap.conf I think 
> so
> this seems like a fault ... Can you open an issue for this on github?
> 
> https://github.com/389ds/389-ds-base

Thanks, here is the github ticket:  
https://github.com/389ds/389-ds-base/issues/4460

Regards,
Andrey


[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-24 Thread Ivanov Andrey (M.)
Hi William,


>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/'
>> /usr/lib/python3.6/site-packages/lib389/__init__.py
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/'
>> /usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py
> 
> You don't need to do this. You can set tls_reqcert = never in your dsrc file.
> You do not need to edit the lib389 source code.

Yep, thanks! Indeed, if I put a custom cacertdir with correct certs, or
tls_reqcert=never, into .dsrc, dsconf v1.4.3 works:
[slapd-ldaps://ldap-model.polytechnique.fr:636]
uri = ldaps://ldap-model.polytechnique.fr:636
###tls_reqcert = never
tls_cacertdir = /tmp/tls_cacertdir

Is there any way to use a global parameter in .dsrc, without a section per
server? We have several LDAP servers, all signed by the same CA, and making a
section per server will be a bit tedious.
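
Failing a global section, we would end up with something like this in ~/.dsrc (a
sketch with placeholder host names, following the section format above):

[slapd-ldaps://ldap1.example.com:636]
uri = ldaps://ldap1.example.com:636
tls_cacertdir = /tmp/tls_cacertdir

[slapd-ldaps://ldap2.example.com:636]
uri = ldaps://ldap2.example.com:636
tls_cacertdir = /tmp/tls_cacertdir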




> 
> Can you show us your /etc/openldap/ldap.conf please?
"ldapsearch -x -H ldaps://" works, so it is not a matter of the content of this 
file. By default it is empty in our case (we use commercial certificates), but 
i tried to point TLS_CACERT to the CA certificates that signed the server's 
cert. It does not fix anything for dsconf 1.4.3 (but it does influence 
ldapsearch and dsconf v1.4.2 of course), here are all the tests i've done 
(commented #TLS_CACERT parameters).

# Turning this off breaks GSSAPI used with krb5 when rdns = false
SASL_NOCANON on
#TLS_CACERT /etc/pki/tls/cert.pem
#TLS_CACERT /Admin/SOURCES/389/Config/CA-sectigo-intermediates-root.crt
#TLS_CACERT /Admin/SOURCES/389/Config/GEANT-OV-RSA-CA-4.crt
#TLS_CACERT /Admin/SOURCES/389/Config/USERTrust-RSA-Certification-Authority.crt
#TLS_CACERT /Admin/SOURCES/389/Config/AAA-Certificate-Services.crt


I disabled TLS_CACERT and it makes openldap clients use the system pem. It 
works for ldapsearch and dsconf v1.4.2 but not for dsconf v1.4.3


>> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening
>> (by "update-ca-trust" and/or "trust anchor path.to/certificate.crt").
> 
> The system pem bundles are NOT used by openldap which means that lib389 can't
> use them. You must configure the tls_cacertdir or tls_cacert is dsrc to point
> at your CA cert.

Actually, in RHEL/CentOS they ARE used by the openldap client if TLS_CACERT is
not specified explicitly. Here is the relevant snippet of the
/etc/openldap/ldap.conf file, with explanations:
# When no CA certificates are specified the Shared System Certificates
# are in use. In order to have these available along with the ones specified
# by TLS_CACERTDIR one has to include them explicitly:
#TLS_CACERT /etc/pki/tls/cert.pem

And it is easy to confirm that the system global bundle is indeed used with any 
self-signed CA authority:
[root@ldap-centos8 ~]# ldapsearch -x -H ldaps://ldap-ens.polytechnique.fr  -b 
"" -s base
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
[root@ldap-centos8 ~]# trust anchor /tmp/my_ca_8192.crt 
[root@ldap-centos8 ~]# ldapsearch -x -LLL  -H ldaps://ldap-ens.polytechnique.fr 
 -b "" -s base
dn:
objectClass: top
defaultnamingcontext: dc=id,dc=polytechnique,dc=edu
dataversion: 020201121013314020201121013314
netscapemdsuffix: cn=ldap://dc=ldap-ens,dc=polytechnique,dc=fr:389
lastusn;userroot: 33863940
lastusn;netscaperoot: -1
[root@ldap-centos8 ~]# trust anchor --remove /tmp/my_ca_8192.crt 
[root@ldap-centos8 ~]# ldapsearch -x -LLL  -H ldaps://ldap-ens.polytechnique.fr 
 -b "" -s base
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)




But all in all, I think I'm starting to see where the problem comes from. dsconf
version 1.4.2 uses /etc/openldap/ldap.conf (which in turn uses the system pem
bundle if no TLS_CACERT is specified) for the certs/CA. Starting from 1.4.3,
dsconf completely ignores the /etc/openldap/ldap.conf file and pays attention
only to its own .dsrc file. That explains everything I see. It's a bit of a pity
that there is no global section in .dsrc like in /etc/openldap/ldap.conf - one
needs to create a section per LDAP server, often with the same parameters.

Thanks again for the help, it's all clear to me now!

Have a nice day! :)





[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-23 Thread Ivanov Andrey (M.)
Hi Mark,



>>
>> So it seems it has something to do with how dsconf 1.4.3 vs 1.4.2 validates 
>> the
>> server certificate chains It also breaks replication monitoring in 
>> cockpit
>> UI since dsconf cannot connect by ldaps to otehr servers of replication
>> config...
>>
>>
>> Thanks for the hint about .dsrc file, i'll try it -  my workaround today is 
>> not
>> very elegant :) :
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/'
>> /usr/lib/python3.6/site-packages/lib389/__init__.py
>> sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/'
>> /usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py
>
>


> When you switch between packages are you recreating the instance each
> time and importing the certificates? 

No, the server installation is not modified or touched in any way - the LDAP
server (1.4.3) is installed on a separate server (called "ldap-model", CentOS
8.2) and never restarted or reconfigured. The LDAP instance is installed only
there and yes, I used the ds* utilities during the installation:
dsctl model tls import-server-key-cert model_cert.crt model_cert.key
dsconf model security ca-certificate add --file intermedite-1.crt --name
"CA-Intermediate-1"
dsconf model security ca-certificate set-trust-flags "CA-Intermediate-1"
--flags "CT,,"
...
The server is accessible with ldapsearch -H ldaps://..., SSL is set up correctly
- no problem at all. I do not touch it at all during the tests.


I install only the management tools (python3-lib389) on another server called
"ldap-centos8", and since they need the file default.inf, the "389-ds-base" rpm
is installed too. But no 389 instances are started or configured. All the
necessary certificates (the CA and 2 intermediates) are imported into the system
pem bundles using this:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening
(by "update-ca-trust" and/or "trust anchor path.to/certificate.crt").

ldapsearch and dsconf 1.4.2 work fine with ldaps://ldap-model... but dsconf
v1.4.3 refuses to connect.

The only difference I see in the debug logging is the following lines, present
during the dsconf 1.4.3 connect attempt but absent in the 1.4.2 connect debug (no
389 instances are installed on this server, as I mentioned before):
DEBUG: open(): Connecting to uri ldaps://ldap-model.polytechnique.fr:636
DEBUG: Using dirsrv ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using external ca certificate /etc/dirsrv/slapd-{instance_name}
DEBUG: Using certificate policy 1
DEBUG: ldap.OPT_X_TLS_REQUIRE_CERT = 1
DEBUG: Cannot connect to 'ldaps://ldap-model.polytechnique.fr:636'



> I'm asking because I'm looking at
> the lib389 code for 1.4.3 and 1.4.2 and there is not much of a
> difference except for importing certificates and how it calls the rehash
> function.
>
> In 1.4.2 we always do:
>
> /usr/bin/c_rehash 
>
> in 1.4.3 we call two difference function depending on the system:
>
> /usr/bin/openssl rehash 
>
> or
>
> /usr/bin/c_rehash 
>
>
> Maybe try running "/usr/bin/c_rehash " on the 1.4.3
> installation and see if it makes a difference.

I don't use cert dirs - I add the intermediate CAs to the system bundles
(update-ca-trust or trust anchor path.to/certificate.crt).
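
Concretely, that is roughly the following (the file name is illustrative):

# either drop the CA into the system anchors and regenerate the bundles...
cp CA-Intermediate-1.crt /etc/pki/ca-trust/source/anchors/
update-ca-trust
# ...or do the same in one step with p11-kit's trust tool:
trust anchor /path/to/CA-Intermediate-1.crt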


>
> On my Fedora system (1.4.3) it uses the openssl function, which brings
> me to my next question.  How are you importing the certificates?  Are
> you using dsctl/dsconf?  If you aren't, then you should, as they call
> the rehash functions for you when importing the certificates.
I used dsctl/dsconf on the server with the 389 LDAP instance ("ldap-model") and
that server works fine; the problem is on another ("management") server
("ldap-centos8") where switching rpms from 1.4.3 to 1.4.2 (or the other way)
switches me from a working to a non-working dsconf.


Thanks for trying to help ! :)

 


[389-users] Re: dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-23 Thread Ivanov Andrey (M.)
Hi William,

thanks for your reply. The LDAP server we manage with dsconf is signed by a
commercial certificate, and both intermediate certificates are added to the
system bundles using "trust anchor" or "update-ca-trust"
(https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/security_hardening/using-shared-system-certificates_security-hardening).
Otherwise ldapsearch and dsconf v1.4.2 would not work.
Fiddling with /etc/openldap/ldap.conf does not change anything; it's the first
thing I tried to adjust.

The only difference is actually removing one rpm and installing the other. If I
go back from python3-lib389-1.4.3.13-1 to python3-lib389-1.4.2.16-1.module_el
by uninstalling one rpm and installing the other, dsconf works again:

dnf -y module enable 389-directory-server:testing
dnf -y install python3-lib389

dsconf  ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w 
mypass ...
Error: Can't contact LDAP server - error:1416F086:SSL 
routines:tls_process_server_certificate:certificate verify failed (self signed 
certificate in certificate chain)
 


dnf -y remove python3-lib389
dnf -y module disable 389-directory-server:testing
dnf -y module enable 389-directory-server:stable
dnf -y install python3-lib389

dsconf  ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w 
mypass ...
...
 

So it seems it has something to do with how dsconf 1.4.3 vs 1.4.2 validates the
server certificate chains. It also breaks replication monitoring in the cockpit
UI, since dsconf cannot connect by ldaps to the other servers of the replication
config...


Thanks for the hint about the .dsrc file, I'll try it - my workaround today is
not very elegant :) :
sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/'  
/usr/lib/python3.6/site-packages/lib389/__init__.py
sed -i -e 's/ldap.OPT_X_TLS_HARD/ldap.OPT_X_TLS_NEVER/' 
/usr/lib/python3.6/site-packages/lib389/cli_base/dsrc.py



>> DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636',
>> 'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech':
>> None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 
>> 'tls_reqcert':
>> 1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl':
>> 'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory 
>> Manager'}}
>> 
> 
> 
>> DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636',
>> 'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech':
>> None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 
>> 'tls_reqcert':
>> 1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl':
>> 'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory 
>> manager'}}
>> 
>> ldap.SERVER_DOWN: {'desc': "Can't contact LDAP server", 'info':
>> 'error:1416F086:SSL routines:tls_process_server_certificate:certificate 
>> verify
>> failed (self signed certificate in certificate chain)'}
>> ERROR: Error: Can't contact LDAP server - error:1416F086:SSL
>> routines:tls_process_server_certificate:certificate verify failed (self 
>> signed
>> certificate in certificate chain)
> 
> I can't comment about the other environmental changes between those versions,
> but tls_reqcert is 1 in both options, aka ldap.OPT_X_TLS_HARD which means your
> ca cert must be in your LDAP ca store. You don't specify a tls_cacertdir or a
> tls_cacert, so whatever you have in /etc/openldap/ldap.conf will be used for
> this.


> 
> Most likely there is a fault in this config, or they cacertdir is not hashed.
> 
> If you use a cacertdir remember you need to run 'openssl rehash' in the
> directory to setup the symlinks to the PEM files.
> 
> If you use a cacert PEM file directly, ensure it's readable to your user etc.
> 
> As a last resort you could set 'tls_reqcert = never' in .dsrc to disable ca
> validity checking.
> 
> Hope that helps,
> 
> 
> —
> Sincerely,
> 
> William Brown
> 
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia


[389-users] dsconf broken for ldaps instances in 1.4.3 but working in 1.4.2

2020-11-20 Thread Ivanov Andrey (M.)
dsconf works fine for instances using ldaps in v1.4.2
(389-directory-server:stable) but it seems to be broken (not recognizing TLS
certificates) in v1.4.3 for commands like
dsconf ldaps://ldap-model.polytechnique.fr:636 -D "cn=Directory Manager" -w
mypass some_command

In both cases I am using dsconf to manage the same external LDAP server
(1.4.3.x); the OS on both is CentOS 8.2 with the latest updates:

 
[root@ldap-centos8 ~]# rpm -qa | grep 389 
[root@ldap-centos8 ~]# dnf -y module enable 389-directory-server:stable 
[root@ldap-centos8 ~]# dnf -y install 389-ds-base 
[root@ldap-centos8 ~]# rpm -qa | grep 389 
python3-lib389-1.4.2.16-1.module_el8+9435+e6daf39f.noarch 
389-ds-base-libs-1.4.2.16-1.module_el8+9435+e6daf39f.x86_64 
389-ds-base-1.4.2.16-1.module_el8+9435+e6daf39f.x86_64 

[root@ldap-centos8 ~]# dsconf ldaps://ldap-model.polytechnique.fr:636 -D 
"cn=Directory Manager" -w mypass security get 
nsslapd-security: on 
nsslapd-securelistenhost: 
nsslapd-secureport: 636 
... 
 


 
[root@ldap-centos8 ~]# dnf -y remove 389* 
[root@ldap-centos8 ~]# dnf -y module disable 389-directory-server:stable 
[root@ldap-centos8 ~]# dnf -y module enable 389-directory-server:testing 
[root@ldap-centos8 ~]# dnf -y install 389-ds-base 
[root@ldap-centos8 ~]# rpm -qa | grep 389 
python3-lib389-1.4.3.13-1.module_el8+10475+b74bca99.noarch 
389-ds-base-libs-1.4.3.13-1.module_el8+10475+b74bca99.x86_64 
389-ds-base-1.4.3.13-1.module_el8+10475+b74bca99.x86_64 

[root@ldap-centos8 ~]# dsconf ldaps://ldap-model.polytechnique.fr:636 -D 
"cn=Directory Manager" -w mypass security get 
Error: Can't contact LDAP server - error:1416F086:SSL 
routines:tls_process_server_certificate:certificate verify failed (self signed 
certificate in certificate chain) 

 

[root@ldap-centos8 ~]# ldapsearch -H ldaps://ldap-model.polytechnique.fr -b 
'cn=config' -D "cn=Directory Manager" -W '(cn=config)' nsslapd-security 
Enter LDAP Password: 
# extended LDIF 
# 
# LDAPv3 
# base  with scope subtree 
# filter: (cn=config) 
# requesting: nsslapd-security 
# 

# config 
dn: cn=config 
nsslapd-security: on 
... 



 


[root@ldap-centos8 ~]# dsconf -v ldaps://ldap-model.polytechnique.fr:636 -D 
"cn=Directory Manager" -w mypass security get 
DEBUG: The 389 Directory Server Configuration Tool 
DEBUG: Inspired by works of: ITS, The University of Adelaide 
DEBUG: dsrc path: /root/.dsrc 
DEBUG: dsrc container path: /data/config/container.inf 
DEBUG: dsrc instances: [] 
DEBUG: dsrc no such section: slapd-ldaps://ldap-model.polytechnique.fr:636 
DEBUG: Called with: Namespace(basedn=None, binddn='cn=Directory Manager', 
bindpw='mypass', func=. 
at 0x7fce5a5e7158>, instance='ldaps://ldap-model.polytechnique.fr:636', 
json=False, prompt=False, pwdfile=None, starttls=False, verbose=True) 
DEBUG: Instance details: {'uri': 'ldaps://ldap-model.polytechnique.fr:636', 
'basedn': None, 'binddn': 'cn=Directory Manager', 'bindpw': None, 'saslmech': 
None, 'tls_cacertdir': None, 'tls_cert': None, 'tls_key': None, 'tls_reqcert': 
1, 'starttls': False, 'prompt': False, 'pwdfile': None, 'args': {'ldapurl': 
'ldaps://ldap-model.polytechnique.fr:636', 'root-dn': 'cn=Directory Manager'}} 
DEBUG: SER_SERVERID_PROP not provided, assuming non-local instance 
DEBUG: Allocate  with 
ldaps://ldap-model.polytechnique.fr:636 
DEBUG: Allocate  with ldap-centos8.polytechnique.fr:389 
DEBUG: Allocate  with ldap-centos8.polytechnique.fr:389 
DEBUG: SER_SERVERID_PROP not provided, assuming non-local instance 
DEBUG: Allocate  with 
ldaps://ldap-model.polytechnique.fr:636 
DEBUG: Allocate  with ldap-centos8.polytechnique.fr:389 
DEBUG: Allocate  with ldap-centos8.polytechnique.fr:389 
DEBUG: open(): Connecting to uri ldaps://ldap-model.polytechnique.fr:636 
DEBUG: open(): bound as cn=Directory Manager 
DEBUG: cn=config getVal('nsslapd-security') 
DEBUG: cn=config getVal('nsslapd-securelistenhost') 
DEBUG: cn=config getVal('nsslapd-securePort') 
DEBUG: cn=encryption,cn=config getVal('nsSSLClientAuth') 
DEBUG: cn=encryption,cn=config getVal('nsTLSAllowClientRenegotiation') 
DEBUG: cn=config getVal('nsslapd-require-secure-binds') 
DEBUG: cn=config getVal('nsslapd-ssl-check-hostname') 
DEBUG: cn=config getVal('nsslapd-validate-cert') 
DEBUG: cn=encryption,cn=config getVal('nsSSLSessionTimeout') 
DEBUG: cn=encryption,cn=config getVal('sslVersionMin') 
DEBUG: cn=encryption,cn=config getVal('sslVersionMax') 
DEBUG: cn=encryption,cn=config getVal('allowWeakCipher') 
DEBUG: cn=encryption,cn=config getVal('allowWeakDHParam') 
DEBUG: cn=encryption,cn=config getVal('nsSSL3Ciphers') 
nsslapd-security: on 
nsslapd-securelistenhost: 
nsslapd-secureport: 636 





[root@ldap-centos8 ~]# dsconf -v ldaps://ldap-model.polytechnique.fr:636 -D 
"cn=Directory Manager" -w mypass security get 
DEBUG: The 389 Directory Server Configuration Tool 
DEBUG: Inspired by works of: ITS, The University of Adelaide 
DEBUG: dsrc path: /root/.dsrc 
DEBUG: dsrc container path: /data/config/container.inf 
DEBUG: dsrc 

[389-users] Re: Building 389 rpm on CentOS 8 from source rpm

2019-10-31 Thread Ivanov Andrey (M.)
Hi Viktor, 

thank you for pointing me to that bugzilla. Finally, I downloaded and manually
installed argparse-manpage from the FC28 archives (wget
https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/28/Everything/x86_64/os/Packages/p/python3-argparse-manpage-1.0.0-1.fc28.noarch.rpm).
That allowed me to build the source rpm, including lib389 and the cockpit plugin.
I still have not tested whether that rpm installs DS and whether the cockpit
plugin works correctly.
Concerning the bugzilla ticket, it's a pity indeed that we do not yet have a
dedicated 389ds cockpit plugin in CentOS 8.
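
For reference, the sequence was roughly the following (a sketch; it assumes the
389-ds-base source rpm is already installed and the commands are run from
~/rpmbuild):

wget <the python3-argparse-manpage FC28 rpm linked above>
dnf -y install ./python3-argparse-manpage-1.0.0-1.fc28.noarch.rpm
yum-builddep -y SPECS/389-ds-base.spec
rpmbuild -ba SPECS/389-ds-base.spec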

Regards, 
Andrey 

> De: "Viktor Ashirov" 
> À: "General discussion list for the 389 Directory server, project."
> <389-users@lists.fedoraproject.org>
> Envoyé: Mercredi 30 Octobre 2019 15:22:00
> Objet: [389-users] Re: Building 389 rpm on CentOS 8 from source rpm

> Hi Andrey,

> argparse-manpage is missing in EPEL8:
> https://bugzilla.redhat.com/show_bug.cgi?id=1763246

> I was able to rebuild the fedora srpm in my copr:
> https://copr.fedorainfracloud.org/coprs/vashirov/389ds/packages/
> But the 389-ds-base build fails further down due to other missing dependencies,
> please see the following message for more details:
> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org/message/FWYW3MM2NBGGCEK2FKM73Z3PCA7D4HCL/

> Thanks,

> On Wed, Oct 30, 2019 at 3:12 PM Ivanov Andrey (M.)
> <andrey.iva...@polytechnique.fr> wrote:

>> Hi,

>> I'm trying to build 389 on CentOS 8 from the rpm source package. When I do

>> yum-builddep SPECS/389-ds-base.spec

>> I have a missing component - "No matching package to install:
>> 'python3-argparse-manpage'". I could not find a corresponding package in any
>> of the repositories of CentOS 8 (AppStream/BaseOS/PowerTools/epel/extras). If I
>> try to disable this requirement in the spec by commenting out the following line:
>> BuildRequires: python%{python3_pkgversion}-argparse-manpage

>> the "rpmbuild -ba SPECS/389-ds-base.spec" stops at the same package 
>> requirement
>> when it starts to build lib389:

>> + pushd ./src/lib389
>> ~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10/src/lib389
>> ~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10
>> + CFLAGS='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2
>> -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong
>> -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
>> -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
>> -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection'
>> + LDFLAGS='-Wl,-z,relro -Wl,-z,now
>> -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'
>> + /usr/libexec/platform-python setup.py build
>> '--executable=/usr/libexec/platform-python -s'
>> Traceback (most recent call last):
>> File "setup.py", line 17, in 
>> from build_manpages import build_manpages
>> ModuleNotFoundError: No module named 'build_manpages'
>> error: Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)

>> RPM build errors:
>> Macro expanded in comment on line 117: %{python3_pkgversion}-argparse-manpage

>> Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build)

>> Where do i obtain the corresponding package ( python3-argparse-manpage ) for
>> CentOS8? I have not tried to build on RHEL8, maybe that package exists in 
>> RHEL8
>> vut not CentOS 8?

>> Thank you!

>> Regards,
>> Andrey

[389-users] Building 389 rpm on CentOS 8 from source rpm

2019-10-30 Thread Ivanov Andrey (M.)
Hi, 

I'm trying to build 389 on CentOS 8 from the rpm source package. When I do

yum-builddep SPECS/389-ds-base.spec 

I get a missing component - "No matching package to install:
'python3-argparse-manpage'". I could not find a corresponding package in any of
the repositories of CentOS 8 (AppStream/BaseOS/PowerTools/epel/extras). If I
try to disable this requirement in the spec by commenting out the following line:
BuildRequires: python%{python3_pkgversion}-argparse-manpage




the "rpmbuild -ba SPECS/389-ds-base.spec" stops at the same package requirement 
when it starts to build lib389: 

+ pushd ./src/lib389 
~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10/src/lib389 
~/rpmbuild/BUILD/389-ds-base-1.4.0.20-10 
+ CFLAGS='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 
-Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong 
-grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic 
-fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 
+ LDFLAGS='-Wl,-z,relro -Wl,-z,now 
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld' 
+ /usr/libexec/platform-python setup.py build 
'--executable=/usr/libexec/platform-python -s' 
Traceback (most recent call last): 
File "setup.py", line 17, in  
from build_manpages import build_manpages 
ModuleNotFoundError: No module named 'build_manpages' 
error: Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build) 

RPM build errors: 
Macro expanded in comment on line 117: %{python3_pkgversion}-argparse-manpage 

Bad exit status from /var/tmp/rpm-tmp.OOJPV1 (%build) 


Where do I obtain the corresponding package (python3-argparse-manpage) for
CentOS 8? I have not tried to build on RHEL 8; maybe that package exists in
RHEL 8 but not CentOS 8?

Thank you! 

Regards, 
Andrey 


[389-users] Re: ldap perfomance

2018-09-06 Thread Ivanov Andrey (M.)
A more detailed discussion about it: 
http://www.port389.org/docs/389ds/design/logging-performance-improvement.html 

You could also disable logging and see whether the spikes disappear to be sure 
of their source: 
http://www.port389.org/docs/389ds/howto/howto-logsystemperf.html 
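
For example, access logging can be toggled at runtime with something like this (a
sketch; the bind details are placeholders):

ldapmodify -x -H ldap://localhost -D 'cn=Directory Manager' -W <<'EOF'
dn: cn=config
changetype: modify
replace: nsslapd-accesslog-logging-enabled
nsslapd-accesslog-logging-enabled: off
EOF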

> Hi,

> it could be the flushing of logs (access.log) to disk, which happens more often
> when the server load is higher. You could use iostat or dstat to see what happens.

> Regards,
> Andrey

>> De: "Ghiurea, Isabella" 
>> À: 389-users@lists.fedoraproject.org
>> Envoyé: Mercredi 5 Septembre 2018 23:14:24
>> Objet: [389-users] ldap perfomance

>> Hello Gurus,

>> looking for an answer to the following performance behavior
>> my env: 389-ds-base-1.3.5.15-1.fc24.x86_64 in multimaster fractional 
>> replication

>> running rsearch for 5 min with 1 thread seeing spikes for a basic read using
>> index uid

>> And running with 10 threads same search the avg ms/ops performance are much
>> better with no major spike/burst

>> Any explanation much appreciate it

>> see bellow for 1 thread and the spike/burst

>> T 300 -t 1
>> rsearch: 1 threads launched.
>> T1 min= 0ms, max= 5ms, count = 54710
>> T1 min= 0ms, max= 42ms, count = 64930
>> T1 min= 0ms, max= 2ms, count = 65174
>> T1 min= 0ms, max= 2ms, count = 65110
>> T1 min= 0ms, max= 44ms, count = 64966
>> T1 min= 0ms, max= 1ms, count = 65101
>> T1 min= 0ms, max= 22ms, count = 65056
>> T1 min= 0ms, max= 32ms, count = 64981
>> T1 min= 0ms, max= 1ms, count = 65145
>> T1 min= 0ms, max= 1ms, count = 65223
>> T1 min= 0ms, max= 27ms, count = 65015
>> T1 min= 0ms, max= 1ms, count = 65182
>> T1 min= 0ms, max= 3ms, count = 65213
>> T1 min= 0ms, max= 23ms, count = 64760
>> T1 min= 0ms, max= 2ms, count = 64214
>> T1 min= 0ms, max= 3ms, count = 52279
>> T1 min= 0ms, max= 11ms, count = 64914
>> T1 min= 0ms, max= 1ms, count = 65118
>> T1 min= 0ms, max= 5ms, count = 64852
>> T1 min= 0ms, max= 91ms, count = 64180
>> T1 min= 0ms, max= 4ms, count = 64746
>> T1 min= 0ms, max= 1ms, count = 65080
>> T1 min= 0ms, max= 12ms, count = 65110
>> T1 min= 0ms, max= 702ms, count = 59243
>> T1 min= 0ms, max= 1ms, count = 65082
>> T1 min= 0ms, max= 89ms, count = 64331
>> T1 min= 0ms, max= 23ms, count = 64647
>> T1 min= 0ms, max= 5ms, count = 64818
>> T1 min= 0ms, max= 55ms, count = 64374
>> T1 min= 0ms, max= 8ms, count = 64713

>> T1 min= 0ms, max= 8ms, count = 64713
>> 300 sec >= 300
>> Final Average rate: 6394.22/sec = 0.1564msec/op, total: 64713

>> And final avg rate for 10 threads, no significant spike/burst for this num of
>> threads

>> 20180905 14:07:23 - Rate: 16962.10/thr (16962.10/sec = 0.0590ms/op),
>> total:169621 (10 thr)
>> 300 sec >= 300
>> Final Average rate: 17420.40/sec = 0.0574msec/op, total:169621



[389-users] Re: ldap performance

2018-09-06 Thread Ivanov Andrey (M.)
Hi, 

It could be the flushing of logs (access log) to disk, which happens more often when
the server load is higher. You could use iostat or dstat to see what is happening.
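
(For example, assuming the sysstat and dstat packages are installed:)

# per-device I/O statistics, refreshed every second
iostat -x 1
# or a combined CPU/disk view
dstat -cd 1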

Regards, 
Andrey 

> De: "Ghiurea, Isabella" 
> À: 389-users@lists.fedoraproject.org
> Envoyé: Mercredi 5 Septembre 2018 23:14:24
> Objet: [389-users] ldap performance

> Hello Gurus,

> I'm looking for an answer to the following performance behavior.
> My env: 389-ds-base-1.3.5.15-1.fc24.x86_64 in multimaster fractional
> replication.

> Running rsearch for 5 min with 1 thread, I see spikes for a basic read using
> the uid index.

> Running the same search with 10 threads, the avg ms/op performance is much
> better, with no major spike/burst.

> Any explanation would be much appreciated.

> See below for 1 thread and the spike/burst:

> T 300 -t 1
> rsearch: 1 threads launched.
> T1 min= 0ms, max= 5ms, count = 54710
> T1 min= 0ms, max= 42ms, count = 64930
> T1 min= 0ms, max= 2ms, count = 65174
> T1 min= 0ms, max= 2ms, count = 65110
> T1 min= 0ms, max= 44ms, count = 64966
> T1 min= 0ms, max= 1ms, count = 65101
> T1 min= 0ms, max= 22ms, count = 65056
> T1 min= 0ms, max= 32ms, count = 64981
> T1 min= 0ms, max= 1ms, count = 65145
> T1 min= 0ms, max= 1ms, count = 65223
> T1 min= 0ms, max= 27ms, count = 65015
> T1 min= 0ms, max= 1ms, count = 65182
> T1 min= 0ms, max= 3ms, count = 65213
> T1 min= 0ms, max= 23ms, count = 64760
> T1 min= 0ms, max= 2ms, count = 64214
> T1 min= 0ms, max= 3ms, count = 52279
> T1 min= 0ms, max= 11ms, count = 64914
> T1 min= 0ms, max= 1ms, count = 65118
> T1 min= 0ms, max= 5ms, count = 64852
> T1 min= 0ms, max= 91ms, count = 64180
> T1 min= 0ms, max= 4ms, count = 64746
> T1 min= 0ms, max= 1ms, count = 65080
> T1 min= 0ms, max= 12ms, count = 65110
> T1 min= 0ms, max= 702ms, count = 59243
> T1 min= 0ms, max= 1ms, count = 65082
> T1 min= 0ms, max= 89ms, count = 64331
> T1 min= 0ms, max= 23ms, count = 64647
> T1 min= 0ms, max= 5ms, count = 64818
> T1 min= 0ms, max= 55ms, count = 64374
> T1 min= 0ms, max= 8ms, count = 64713

> T1 min= 0ms, max= 8ms, count = 64713
> 300 sec >= 300
> Final Average rate: 6394.22/sec = 0.1564msec/op, total: 64713

> And final avg rate for 10 threads, no significant spike/burst for this num of
> threads

> 20180905 14:07:23 - Rate: 16962.10/thr (16962.10/sec = 0.0590ms/op),
> total:169621 (10 thr)
> 300 sec >= 300
> Final Average rate: 17420.40/sec = 0.0574msec/op, total:169621

> ___
> 389-users mailing list -- 389-users@lists.fedoraproject.org
> To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
> Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org
___
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org


[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954

2016-09-12 Thread Ivanov Andrey (M.)
> De: "Ludwig Krispenz" 
> À: 389-users@lists.fedoraproject.org
> Envoyé: Vendredi 9 Septembre 2016 12:30:31
> Objet: [389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954

> Hi Andrey,

> we have a fix to address the incorrect positioning in the changelog (using a csn
> of a consumer which is ahead for the given replicaid), and so it would also
> prevent these messages.
> It still has to be tested, but I am wondering if you want to test it as well.

> Regards,
> Ludwig
Hi Ludwig, 

I am unable to reproduce the problem on our test servers; it affects only 
production. So I would prefer to wait for your tests and/or a definitive and 
stable fix, since the code will go directly into production :) 

Regards, 
Andrey 
--
389-users mailing list
389-users@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org


[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954

2016-09-07 Thread Ivanov Andrey (M.)
> De: "Ludwig Krispenz" 
> À: 389-users@lists.fedoraproject.org
> Envoyé: Mercredi 7 Septembre 2016 12:48:38
> Objet: [389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954

 the fixes for the tickets you mention did change the iteration through the
 changelog and how it handles situations when the start csn is not found in the
 changelog. It also did change the logging, so you might see messages now
 which were not there or hidden before.
>>> That was my understanding too.

>> so far I have not seen any replication problems related to these messages; all
>> generated csns seem to be replicated. What makes it a bit more difficult is that
>> most of the updates are updates of lastlogintime and the original MOD is not
>> logged. I still do not understand why we have these messages so frequently, I
>> will try to reproduce.
>> Or, if it is possible, could you run the servers for just an hour with
>> replication logging enabled?

> no more need for this, I found the messages in a deployment where repl logging
> was enabled. I think it happens when the smallest consumer maxCSN is ahead of
> the local maxCSN for this replicaID.
> It should do no harm, but in some scenarios could slow down replication a bit.
> I will continue to investigate and work on a fix
Ok, thank you. And yes, as you say, it apparently does no harm - I check the 
consistency of the three replicated servers from time to time and there is no data 
discrepancy between them. 
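
(For what it's worth, a rough sketch of that kind of consistency check - hostnames
are placeholders, anonymous read access to the suffix is assumed, and a real check
would also compare attribute values, not just DNs:)

ldapsearch -x -LLL -o ldif-wrap=no -H ldap://server1 -b "dc=id,dc=polytechnique,dc=edu" "(objectClass=*)" dn | sort > server1.dns
ldapsearch -x -LLL -o ldif-wrap=no -H ldap://server2 -b "dc=id,dc=polytechnique,dc=edu" "(objectClass=*)" dn | sort > server2.dns
diff server1.dns server2.dns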

Anyway, enabling replication logging on production servers is not easily done, 
mainly for performance reasons. And I was not able to reproduce the problem in our 
test environment with 2 replicated servers - maybe the load or frequency of 
connections updating the lastlogintime attribute was not high enough there. Or the 
three-server fully-replicated topology makes things a bit different too, with one 
or two additional hops for the same mod arriving at the consumer by two different 
paths. 

>> When looking into the provided data set I did notice three replicated ops 
>> with
>> err=50, insufficient access. This should not happen and requires a separate
>> investigation
Yes, I see the three modifications you are talking about. They are present on only 
one server of the three. Strange indeed. No more err=50 in replicated ops today on 
any of the servers - I've just checked. 
--
389-users mailing list
389-users@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org


[389-users] Re: 389DS v1.3.4.x after fixes for tickets 48766 and 48954

2016-09-06 Thread Ivanov Andrey (M.)
Hi Ludwig, 

> the fixes for the tickets you mention did change the iteration through the
> changelog and how it handles situations when the start csn is not found in the
> changelog. It also did change the logging, so you might see messages now
> which were not there or hidden before.
That was my understanding too. 

> But I am very surprised to see them so frequently and I would like to 
> understand
> it.
> First some questions, do you have changelog trimming enabled and how, do you
> have fractional replication ?
yes for both questions. 

Trimming: 14 days 
Fractional replication: 
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf 
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn 
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName 
internalModifyTimestamp internalCreatorsname 

Changelog: 
cn=changelog5,cn=config 
objectClass: top 
objectClass: extensibleObject 
cn: changelog5 
nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb 
nsslapd-changelogmaxage: 14d 

replica: 
cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping 
tree,cn=config 
objectClass: top 
objectClass: nsDS5Replica 
cn: replica 
nsDS5ReplicaId: 1 
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu 
nsDS5Flags: 1 
nsDS5ReplicaBindDN: cn=RepliX,cn=config 
nsds5ReplicaPurgeDelay: 604800 
nsds5ReplicaTombstonePurgeInterval: 86400 
nsds5ReplicaLegacyConsumer: False 
nsDS5ReplicaType: 3 
nsState:: AQDCrc5XAQABAA== 
nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e 
nsds5ReplicaChangeCount: 114948 
nsds5replicareapactive: 0 

Typical replication agreement: 

cn=Replication from ldap-lab. to ldap-adm.,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping 
tree,cn=config 
objectClass: top 
objectClass: nsDS5ReplicationAgreement 
cn: Replication from ldap-lab. to ldap-adm. 
description: Replication agreement from server ldap-lab. to server 
ldap-adm. 
nsDS5ReplicaHost: ldap-adm. 
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu 
nsDS5ReplicaPort: 636 
nsDS5ReplicaTransportInfo: SSL 
nsDS5ReplicaBindDN: cn=RepliX,cn=config 
nsDS5ReplicaBindMethod: simple 
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf 
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn 
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName 
internalModifyTimestamp internalCreatorsname 
nsds5replicaBusyWaitTime: 5 
nsds5ReplicaFlowControlPause: 500 
nsds5ReplicaFlowControlWindow: 1000 
nsds5replicaTimeout: 120 
nsDS5ReplicaCredentials: {AES-... 
nsds50ruv: {replicageneration} 57cd73770002 
nsds50ruv: {replica 2 ldap://ldap-adm.:389} 
nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.:389} 
 
nsds5replicareapactive: 0 
nsds5replicaLastUpdateStart: 20160906115520Z 
nsds5replicaLastUpdateEnd: 20160906115520Z 
nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0 
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental 
update succeeded 
nsds5replicaUpdateInProgress: FALSE 
nsds5replicaLastInitStart: 1970010100Z 
nsds5replicaLastInitEnd: 1970010100Z 

> Next, is it possible to get the access and error logs for a period of an hour
> from all servers (you can send them off list) ? I would like to track some of
> the reported csns.
Sure, I will send it to you off-list in a moment. 

Thank you, 

Regards, 
Andrey 

> Regards,
> Ludwig

> On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:

>> Hi,

>> We are successfully using the compiled 1.3.4 git branch of 389DS in 
>> production
>> on CentOS 7 since about a year (approximately 40 000 entries, about 4000
>> groups, hundreds of reads and tens of writes per second).
>> Our current topology consists of 3 servers in triangle (each server is a 
>> master
>> replicating to 2 others, so two read-write replication agreements on each).

>> Since the fixes for the Ticket 48766 ("Replication changelog can incorrectly
>> skip over updates") and Ticket 48954 ("Replication fails because anchorcsn
>> cannot be found") I’ve started to see the following regular warnings in error
>> logs:

>> [06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with
>> csn (57cdfe0600010001) not found for DB_NEXT
>> [06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm. to
>> ldap-lab." (ldap-lab:636) - Can't locate CSN 57cdfe0600010001 in
>> the changelog (DB rc=-30988). If replication stops, the consumer may need to 
>> be
>> reinitialized.
>> [06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn:
>> opcsn=57ce0f4e00050002 <= basecsn=57ce0f4e00050003, adjusted
>> opcsn=57ce0f4e00060002
>> [06/Sep/2016

[389-users] 389DS v1.3.4.x after fixes for tickets 48766 and 48954

2016-09-06 Thread Ivanov Andrey (M.)
Hi, 

We have been successfully using the compiled 1.3.4 git branch of 389DS in production 
on CentOS 7 for about a year (approximately 40 000 entries, about 4000 
groups, hundreds of reads and tens of writes per second). 
Our current topology consists of 3 servers in a triangle (each server is a master 
replicating to the 2 others, so two read-write replication agreements on each). 

Since the fixes for the Ticket 48766 ("Replication changelog can incorrectly 
skip over updates") and Ticket 48954 ("Replication fails because anchorcsn 
cannot be found") I’ve started to see the following regular warnings in error 
logs: 

[06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57cdfe0600010001) not found for DB_NEXT 
[06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-lab." (ldap-lab:636) - Can't locate CSN 57cdfe0600010001 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn: 
opcsn=57ce0f4e00050002 <= basecsn=57ce0f4e00050003, adjusted 
opcsn=57ce0f4e00060002 
[06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce257e00040003) not found for DB_NEXT 
[06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn: 
opcsn=57ce352b0002 <= basecsn=57ce352b00010001, adjusted 
opcsn=57ce352b00010002 
[06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce4c6200010003 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce541a00020003 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-lab." (ldap-lab:636) - Can't locate CSN 57ce555900010001 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-lab." (ldap-lab:636) - Can't locate CSN 57ce55610001 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce56c50003) not found for DB_NEXT 
[06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce56c500010003) not found for DB_NEXT 
[06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce5d5f000f0001) not found for DB_NEXT 
[06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce5e5400020003) not found for DB_NEXT 
[06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce5e5400020003 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce61a300020003) not found for DB_NEXT 
[06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce61d800020003) not found for DB_NEXT 
[06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce61d800020003 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce62c800030001) not found for DB_NEXT 
[06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce635a00010001) not found for DB_NEXT 
[06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk - changelog record with 
csn (57ce65c900020003) not found for DB_NEXT 
[06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from ldap-adm. to 
ldap-ens." (ldap-ens:636) - Can't locate CSN 57ce67aa00010003 in 
the changelog (DB rc=-30988). If replication stops, the consumer may need to be 
reinitialized. 
[06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn: 
opcsn=57ce67d100010002 <= basecsn=57ce67d100020003, adjusted 
opcsn=57ce67d100020002 

These warnings are present on all three servers and for all replication 
agreements. One of them is virtual and two others are physical. 

The replication still seems to work fine in spite of these warnings. The 
"replica_generate_next_csn" warning is not new - it has always existed with 1.3.4; 
the two new warnings are "clcache_load_buffer_bulk" and "Can't locate CSN ... 
in the changelog (DB rc=-30988)." There are no network problems or anything 
like that. So it could only be the replication topology (3-master fully-connected 
triangle) and/or the servers being rather busy. Is it a bug, a warning that 

Re: [389-users] _cl5CompactDBs: failed to compact

2015-06-19 Thread Ivanov Andrey (M.)
Hi Noriko, 

- Mail original -

  There are three MMR replicating servers. It's one month of uptime and the
  servers wanted to trim the replication log. Here is what i've found in
  error
  log on each of them :
 

  1st server:
 
  [18/Jun/2015:08:04:31 +0200] - libdb: BDB2055 Lock table is out of
  available
  lock entries
 

 May not matter, but could you please try increasing the value of this db
 config parameter? The default value is 1.

  dn: cn=config,cn=ldbm database,cn=plugins,cn=config
 
  nsslapd-db-locks: 1
 
Ok. I've increased nsslapd-db-locks to 2 and reduced 
nsslapd-changelogcompactdb-interval to 3600 in cn=changelog5,cn=config to trigger 
the changelog compaction more frequently. No change. I still have: 

[19/Jun/2015:10:36:46 +0200] - libdb: BDB2055 Lock table is out of available 
lock entries 
[19/Jun/2015:10:36:46 +0200] NSMMReplicationPlugin - changelog program - 
_cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db error 
- 12 Cannot allocate memory 
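
(For reference, a sketch of how such changes are typically applied - the values
below are placeholders rather than the ones used here, and nsslapd-db-locks only
takes effect after a restart of the instance:)

ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-locks
nsslapd-db-locks: 20000

dn: cn=changelog5,cn=config
changetype: modify
replace: nsslapd-changelogcompactdb-interval
nsslapd-changelogcompactdb-interval: 3600
EOF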

  [18/Jun/2015:08:04:31 +0200] NSMMReplicationPlugin - changelog program -
  _cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db
  error - 12 Cannot allocate memory
 

 I don't think there is any problem even if the DBs are not compacted. It was
 introduced just to release the free pages in the db files. But I'd also like
 to learn why the compaction fails with ENOMEM here.
Ok, thanks. 
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

[389-users] _cl5CompactDBs: failed to compact

2015-06-18 Thread Ivanov Andrey (M.)
Hi, 

we are using the branch 1.3.2 on CentOS7 in our production environment (version 
1.3.2.27 with some additional patches from the git of this branch). 
There are three MMR replicating servers. It's been one month of uptime and the 
servers wanted to trim the replication changelog. Here is what I've found in the 
error log on each of them: 

1st server: 
[18/Jun/2015:08:04:31 +0200] - libdb: BDB2055 Lock table is out of available 
lock entries 
[18/Jun/2015:08:04:31 +0200] NSMMReplicationPlugin - changelog program - 
_cl5CompactDBs: failed to compact a45fa684-f28d11e4-af27aa63-5121b7ef; db error 
- 12 Cannot allocate memory 

2nd server: 
[18/Jun/2015:08:10:34 +0200] - libdb: BDB2055 Lock table is out of available 
lock entries 
[18/Jun/2015:08:10:34 +0200] NSMMReplicationPlugin - changelog program - 
_cl5CompactDBs: failed to compact acb7e184-f28d11e4-9b13d240-c66923c8; db error 
- 12 Cannot allocate memory 

3rd server: 
[18/Jun/2015:08:18:10 +0200] - libdb: BDB2055 Lock table is out of available 
lock entries 
[18/Jun/2015:08:18:10 +0200] NSMMReplicationPlugin - changelog program - 
_cl5CompactDBs: failed to compact acb7e184-f28d11e4-8067eff8-b1ca763b; db error 
- 12 Cannot allocate memory 

The changelog itself is not huge : 
[root@ldap-ens]# ll -h /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb/ 
total 390M 
-rw--- 1 ldap ldap 390M Jun 18 10:18 
a45fa684-f28d11e4-af27aa63-5121b7ef_5547be41.db 
-rw-r--r-- 1 ldap ldap 0 May 19 08:02 a45fa684-f28d11e4-af27aa63-5121b7ef.sema 
-rw--- 1 ldap ldap 30 May 4 20:45 DBVERSION 

The servers are working correctly, and the replication is also working. 

What are the potential consequences of this error? How can we avoid it? 

Thank you! 
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] 389ds v1.3.2.24 error log message: replica_generate_next_csn adjusted

2014-11-17 Thread Ivanov Andrey (M.)



  [15/Nov/2014:03:58:43 +0100] - replica_generate_next_csn:
  opcsn=5466c1640001 = basecsn=5466c1640002, adjusted
  opcsn=5466c16400010001
  [15/Nov/2014:10:38:38 +0100] - replica_generate_next_csn:
  opcsn=54671f1f0001 = basecsn=54671f1f0003, adjusted
  opcsn=54671f1f00010001
 
  Are these only informational messages that can be safely ignored, or could they
  be a manifestation of some potential problem?
 
 This looks ok to me, and the message should not be a fatal message. The
 code handles this correctly by incrementing the sequence number and
 updating the generator.  

That's what I also thought.


In practice it should be very difficult to get
 the generator to generate a CSN like this.  Are all of these machines
 running in VMs?  If so, what is the hypervisor?  How many of these do
 you see per day?

We see it two or three times per day, compared to 2 or 3 modifications 
per day (according to logconv.pl):
38465   2.16.840.1.113730.3.5.12 DS90 Start Replication Request 
 
24603   2.16.840.1.113730.3.5.5 End Replication Request (incremental 
update)
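
(For reference, a typical logconv.pl invocation over the access logs - the path
assumes the default log location and a placeholder instance name:)

logconv.pl /var/log/dirsrv/slapd-<instance>/access*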



2 servers are physical (replica id 1 and 3) and one is virtual (replica id 2). 
Each of the three is MMR-replicated to two others.
On rep_id 1 (physical hardware):
[15/Nov/2014:03:58:43 +0100] - replica_generate_next_csn: 
opcsn=5466c1640001 = basecsn=5466c1640002, adjusted 
opcsn=5466c16400010001
[15/Nov/2014:10:38:38 +0100] - replica_generate_next_csn: 
opcsn=54671f1f0001 = basecsn=54671f1f0003, adjusted 
opcsn=54671f1f00010001
[16/Nov/2014:01:43:44 +0100] - replica_generate_next_csn: 
opcsn=5467f3410001 = basecsn=5467f34100010002, adjusted 
opcsn=5467f34100010001
[17/Nov/2014:09:34:54 +0100] - replica_generate_next_csn: 
opcsn=5469b32f0001 = basecsn=5469b32f0002, adjusted 
opcsn=5469b32f00010001
[17/Nov/2014:16:09:48 +0100] - replica_generate_next_csn: 
opcsn=546a0fbd0001 = basecsn=546a0fbd00020002, adjusted 
opcsn=546a0fbd00030001
[17/Nov/2014:16:55:55 +0100] - replica_generate_next_csn: 
opcsn=546a1a8c0001 = basecsn=546a1a8c0002, adjusted 
opcsn=546a1a8c00010001
[17/Nov/2014:19:34:14 +0100] - replica_generate_next_csn: 
opcsn=546a3fa70001 = basecsn=546a3fa70003, adjusted 
opcsn=546a3fa700010001


On rep_id 2 (virtual, VMWare ESXi5.5):
[15/Nov/2014:04:19:09 +0100] - replica_generate_next_csn: 
opcsn=5466c62e0002 = basecsn=5466c62e0003, adjusted 
opcsn=5466c62e00010002
[17/Nov/2014:15:47:11 +0100] - replica_generate_next_csn: 
opcsn=546a0a710002 = basecsn=546a0a720003, adjusted 
opcsn=546a0a720002
[17/Nov/2014:15:48:11 +0100] - replica_generate_next_csn: 
opcsn=546a0aac00010002 = basecsn=546a0aac00020003, adjusted 
opcsn=546a0aac00020002
[17/Nov/2014:15:49:36 +0100] - replica_generate_next_csn: 
opcsn=546a0b010002 = basecsn=546a0b0100020003, adjusted 
opcsn=546a0b0100030002

On rep_id 3 (physical hardware):
[16/Nov/2014:05:02:34 +0100] - replica_generate_next_csn: 
opcsn=546821db0003 = basecsn=546821dc0002, adjusted 
opcsn=546821dc00010003





 
  In source code (./ldap/servers/plugins/replication/repl5_replica.c) it
  looks like a serious one (SLAPI_LOG_FATAL):
  slapi_log_error (SLAPI_LOG_FATAL, NULL,
   "replica_generate_next_csn: "
   "opcsn=%s <= basecsn=%s, adjusted opcsn=%s\n",
   opcsnstr, basecsnstr, opcsn2str);
 
 It should not be FATAL.  Please file a ticket.

Ok. Done: https://fedorahosted.org/389/ticket/47959


Thanks!
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

[389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded

2014-11-12 Thread Ivanov Andrey (M.)
Hi, 

I've continued testing 389ds v.1.3.2.24 on CentOS 7. I really have the 
impression that everything works fine (plugins etc.), but the replication seems 
to be a little fragile. Both of the tickets I've already opened concern 
replication partially or completely (https://fedorahosted.org/389/ticket/47942 
and https://fedorahosted.org/389/ticket/47950). 

Here is another issue with replication: 
I have two servers with multi-master agreements on each of them (the same 
configuration as in ticket https://fedorahosted.org/389/ticket/47942). 

We add/delete a lot of groups (943, to be exact). Each group may contain a 
large number of referenced entries, up to ~250 (uniqueMember: dn). The MemberOf 
plugin is activated and works fine. The Referential Integrity plugin is also 
activated, but of course it only matters when deleting groups (or renaming them). 
The whole run takes a long time (20-30 minutes or more). Some time after the 
beginning of the operations (typically 5-8 minutes) we get replication errors and 
an inconsistency of the replica for the entries mentioned in the error log. 
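
(For context, each of these groups is an ordinary static group, roughly of the
following shape - the DN and members here are made up for illustration:)

dn: cn=EXAMPLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
objectClass: top
objectClass: groupOfUniqueNames
cn: EXAMPLE-2014
description: example course group
uniqueMember: uid=user1,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu
uniqueMember: uid=user2,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu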

When adding and deleting groups the supplier is OK. However, the consumer has 
several (from one to four or five) group deletions/adds that are not 
replicated. The error on the supplier: 

[12/Nov/2014:16:46:42 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid fa90219d-6a8211e4-a42c901a-94623bee, 
CSN 546380d60002): Operations error (1). Will retry later. 
[12/Nov/2014:16:47:55 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid 1e5367ae-6a8311e4-a42c901a-94623bee, 
CSN 546381250002): Operations error (1). Will retry later. 
[12/Nov/2014:16:53:14 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid f4e70b85-6a8311e4-a42c901a-94623bee, 
CSN 54638262): Operations error (1). Will retry later. 
[12/Nov/2014:16:55:12 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid 3c6d978a-6a8411e4-a42c901a-94623bee, 
CSN 546382d600040002): Operations error (1). Will retry later. 
[12/Nov/2014:16:56:31 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid 6030dd93-6a8411e4-a42c901a-94623bee, 
CSN 546383250002): Operations error (1). Will retry later. 
[12/Nov/2014:16:57:22 +0100] NSMMReplicationPlugin - agmt=cn=Replication from 
ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr (ldap-model:636): 
Consumer failed to replay change (uniqueid 83f42395-6a8411e4-a42c901a-94623bee, 
CSN 5463835d0002): Operations error (1). Will retry later. 

The corresponding errors on the consumer seem to hint deadlocks in these cases: 
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: retry (49) the transaction (csn=546380d60002) 
failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a 
deadlock)) 
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: failed to write entry with csn (546380d60002); 
db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock 
[12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: 
can't add a change for 
cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog csn 
546380d60002 
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: retry (49) the transaction (csn=546381250002) 
failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a 
deadlock)) 
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: failed to write entry with csn (546381250002); 
db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a deadlock 
[12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - write_changelog_and_ruv: 
can't add a change for 
cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 (uniqid: 1e5367ae-6a8311e4-a42c901a-94623bee, optype: 16) to changelog csn 
546381250002 
[12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program - 
_cl5WriteOperationTxn: retry (49) the transaction (csn=54638262) 
failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a 
deadlock)) 
[12/Nov/2014:16:53:13 

Re: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded

2014-11-12 Thread Ivanov Andrey (M.)
- Mail original -

 De: Ivanov Andrey (M.) andrey.iva...@polytechnique.fr
 À: General discussion list for the 389 Directory server project.
 389-users@lists.fedoraproject.org
 Envoyé: Mercredi 12 Novembre 2014 18:52:44
 Objet: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count
 exceeded

 Hi,

 I've continued testing 389ds v.1.3.2.24 on CentOS 7. I really have an
 impression that everything works fine (plugins etc) but the replication
 seems to be a little fragile. Both of the tickets i've already opened
 concern replication partially or completely
 (https://fedorahosted.org/389/ticket/47942 and
 https://fedorahosted.org/389/ticket/47950).

 Here is another issue with replication :
 i have two servers with multi-master agreements on each of them (the same
 configuration as in ticket https://fedorahosted.org/389/ticket/47942).

 We add/delete a lot of groups (943, to be exact). Each group may contain a
 large number of referenced entries, up to ~250 (uniqueMember: dn). The MemberOf
 plugin is activated and works fine. The Referential Integrity plugin is also
 activated, but of course it only matters when deleting groups (or
 renaming them). The whole run takes a long time (20-30 minutes or more). Some
 time after the beginning of the operations (typically 5-8 minutes) we get
 replication errors and an inconsistency of the replica for the entries
 mentioned in the error log.

 When adding and deleting groups the supplier is OK. However, the consumer has
 several (from one to four or five) group deletions/adds that are not
 replicated. The error on the supplier:

 [12/Nov/2014:16:46:42 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 fa90219d-6a8211e4-a42c901a-94623bee, CSN 546380d60002): Operations
 error (1). Will retry later.
 [12/Nov/2014:16:47:55 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 1e5367ae-6a8311e4-a42c901a-94623bee, CSN 546381250002): Operations
 error (1). Will retry later.
 [12/Nov/2014:16:53:14 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 f4e70b85-6a8311e4-a42c901a-94623bee, CSN 54638262): Operations
 error (1). Will retry later.
 [12/Nov/2014:16:55:12 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 3c6d978a-6a8411e4-a42c901a-94623bee, CSN 546382d600040002): Operations
 error (1). Will retry later.
 [12/Nov/2014:16:56:31 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 6030dd93-6a8411e4-a42c901a-94623bee, CSN 546383250002): Operations
 error (1). Will retry later.
 [12/Nov/2014:16:57:22 +0100] NSMMReplicationPlugin - agmt=cn=Replication
 from ldap-edev.polytechnique.fr to ldap-model.polytechnique.fr
 (ldap-model:636): Consumer failed to replay change (uniqueid
 83f42395-6a8411e4-a42c901a-94623bee, CSN 5463835d0002): Operations
 error (1). Will retry later.

 The corresponding errors on the consumer seem to hint deadlocks in these
 cases:
 [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program -
 _cl5WriteOperationTxn: retry (49) the transaction (csn=546380d60002)
 failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
 deadlock))
 [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program -
 _cl5WriteOperationTxn: failed to write entry with csn
 (546380d60002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker
 killed to resolve a deadlock
 [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - write_changelog_and_ruv:
 can't add a change for
 cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog csn
 546380d60002
 [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program -
 _cl5WriteOperationTxn: retry (49) the transaction (csn=546381250002)
 failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
 deadlock))
 [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program -
 _cl5WriteOperationTxn: failed to write entry with csn
 (546381250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK: Locker
 killed to resolve a deadlock
 [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - write_changelog_and_ruv:
 can't add a change for
 cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 (uniqid: 1e5367ae-6a8311e4-a42c901a

Re: [389-users] 389ds v.1.3.2.24 replication deadlocks/retry count exceeded

2014-11-12 Thread Ivanov Andrey (M.)
- Mail original -
The corresponding errors on the consumer seem to hint deadlocks in these cases: 

   [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=546380d60002)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (546380d60002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:46:41 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=LAN452ESP-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: fa90219d-6a8211e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   546380d60002
  
 
   [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=546381250002)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (546381250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:47:54 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=LAN472EFLE-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: 1e5367ae-6a8311e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   546381250002
  
 
   [12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=54638262)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (54638262); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:53:13 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=MAT471-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: f4e70b85-6a8311e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   54638262
  
 
   [12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=546382d600040002)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (546382d600040002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:55:11 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=MEC592-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: 3c6d978a-6a8411e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   546382d600040002
  
 
   [12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=546383250002)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (546383250002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:56:29 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=PHY566-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: 6030dd93-6a8411e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   546383250002
  
 
   [12/Nov/2014:16:57:20 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: retry (49) the transaction
   (csn=5463835d0002)
   failed (rc=-30993 (BDB0068 DB_LOCK_DEADLOCK: Locker killed to resolve a
   deadlock))
  
 
   [12/Nov/2014:16:57:20 +0100] NSMMReplicationPlugin - changelog program -
   _cl5WriteOperationTxn: failed to write entry with csn
   (5463835d0002); db error - -30993 BDB0068 DB_LOCK_DEADLOCK:
   Locker
   killed to resolve a deadlock
  
 
   [12/Nov/2014:16:57:20 +0100] NSMMReplicationPlugin -
   write_changelog_and_ruv:
   can't add a change for
   cn=PHY651K-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
   (uniqid: 83f42395-6a8411e4-a42c901a-94623bee, optype: 16) to changelog
   csn
   

[389-users] Groupe modifications and internalModifiersName

2014-11-11 Thread Ivanov Andrey (M.)
Hi, 

I continue with my tests of 389ds v1.3.2.24. I've encountered another bug or 
strange behavior (by design?). 
I've activated bind DN tracking (nsslapd-plugin-binddn-tracking: on). There is an 
account that has the right to add entries and to change some 
attributes (e.g. description). The corresponding ACI: 

dn: ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu 
aci: (targetattr = "objectClass || uniqueMember || owner || cn || description || businessCategory")(version 3.0; acl "Droits de rejouter/supprimer/modifier les groupes et leurs attributs"; allow (add, delete, read, compare, search, write)(userdn="ldap:///uid=sync-cours,ou=Comptes generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu");) 


Any attempt to modify an authorized attribute from the list above (for example, 
description) results in: 
ldap_modify: Insufficient access (50) 
additional info: Insufficient 'write' privilege to the 'internalModifiersName' 
attribute of entry 
'cn=mec431-2014,ou=2014,ou=cours,ou=enseignement,ou=groupes,dc=id,dc=polytechnique,dc=edu'.
 


[11/Nov/2014:10:38:49 +0100] conn=4 fd=256 slot=256 connection from 
129.104.31.54 to 129.104.69.49 
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 BIND dn= method=sasl version=3 
mech=GSSAPI 
[11/Nov/2014:10:38:49 +0100] conn=4 op=0 RESULT err=14 tag=97 nentries=0 
etime=0.008000, SASL bind in progress 
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 BIND dn= method=sasl version=3 
mech=GSSAPI 
[11/Nov/2014:10:38:49 +0100] conn=4 op=1 RESULT err=14 tag=97 nentries=0 
etime=0.002000, SASL bind in progress 
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 BIND dn= method=sasl version=3 
mech=GSSAPI 
[11/Nov/2014:10:38:49 +0100] conn=4 op=2 RESULT err=0 tag=97 nentries=0 
etime=0.001000 dn=uid=sync-cours,ou=comptes 
generiques,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu 
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 SRCH 
base=dc=id,dc=polytechnique,dc=edu scope=2 filter=(cn=MEC431-2014) 
attrs=ALL 
[11/Nov/2014:10:38:49 +0100] conn=4 op=3 RESULT err=0 tag=101 nentries=1 
etime=0.003000 
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 MOD 
dn=cn=MEC431-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 
[11/Nov/2014:10:39:00 +0100] conn=4 op=4 RESULT err=50 tag=103 nentries=0 
etime=0.002000 


Is this the expected behavior - do I need to grant, in all the ACIs that allow 
modifications, the right to modify the internalModifiersName attribute? (If I add 
it, everything is fine and the internalModifiersName attribute becomes "cn=ldbm 
database,cn=plugins,cn=config".) 
Or is it a bug? 
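
(For illustration, that workaround amounts to extending the targetattr list of the
ACI quoted above, roughly like this:)

aci: (targetattr = "objectClass || uniqueMember || owner || cn || description || businessCategory || internalModifiersName")(version 3.0; acl "Droits de rejouter/supprimer/modifier les groupes et leurs attributs"; allow (add, delete, read, compare, search, write)(userdn="ldap:///uid=sync-cours,ou=Comptes generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu");)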

Thank you! 

Regards, 
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] Groupe modifications and internalModifiersName

2014-11-11 Thread Ivanov Andrey (M.)
Thank you Ludwig, I think the attribute behavior should be as you describe it, 
so I've made a ticket - https://fedorahosted.org/389/ticket/47950 

- Mail original -

 De: Ludwig Krispenz lkris...@redhat.com
 À: 389-users@lists.fedoraproject.org
 Envoyé: Mardi 11 Novembre 2014 11:06:10
 Objet: Re: [389-users] Groupe modifications and internalModifiersName

 On 11/11/2014 10:45 AM, Ivanov Andrey (M.) wrote:

  Hi,,
 

  i continue with my tests of 389ds v1.3.2.24. I've encountered another bug
  or
  strange behavior (by design?).
 
  I've activated bind dn tracking ( nsslapd-plugin-binddn-tracking: on ).
  There
  is an account that has the write to add the entries and to change some
  attributes (e.g. description). The corresponding ACI:
 

  dn: ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 
  aci: (targetattr =  objectClass || uniqueMember || owner || cn ||
  description || businessCategory  ) (version 3.0;acl Droits de
  rejouter/supprimer/modifier les groupes et leurs att
 
  ributs;allow ( add, delete, read,compare,search,write )(userdn=
  ldap:///uid=sync-cours,ou=Comptes
  generiques,ou=Utilisateurs,dc=id,dc=polytechnique,dc=edu );)
 

  Any attempt to modify an authorized attribute from the list above (for ex.,
  description ) results in
 
  ldap_modify: Insufficient access (50)
 
  additional info: Insufficient 'write' privilege to the
  'internalModifiersName' attribute of entry
  'cn=mec431-2014,ou=2014,ou=cours,ou=enseignement,ou=groupes,dc=id,dc=polytechnique,dc=edu'.
 

  [11/Nov/2014:10:38:49 +0100] conn=4 fd=256 slot=256 connection from
  129.104.31.54 to 129.104.69.49
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=0 BIND dn= method=sasl version=3
  mech=GSSAPI
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=0 RESULT err=14 tag=97 nentries=0
  etime=0.008000, SASL bind in progress
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=1 BIND dn= method=sasl version=3
  mech=GSSAPI
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=1 RESULT err=14 tag=97 nentries=0
  etime=0.002000, SASL bind in progress
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=2 BIND dn= method=sasl version=3
  mech=GSSAPI
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=2 RESULT err=0 tag=97 nentries=0
  etime=0.001000 dn=uid=sync-cours,ou=comptes
  generiques,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=3 SRCH
  base=dc=id,dc=polytechnique,dc=edu scope=2 filter=(cn=MEC431-2014)
  attrs=ALL
 
  [11/Nov/2014:10:38:49 +0100] conn=4 op=3 RESULT err=0 tag=101 nentries=1
  etime=0.003000
 
  [11/Nov/2014:10:39:00 +0100] conn=4 op=4 MOD
  dn=cn=MEC431-2014,ou=2014,ou=Cours,ou=Enseignement,ou=Groupes,dc=id,dc=polytechnique,dc=edu
 
  [11/Nov/2014:10:39:00 +0100] conn=4 op=4 RESULT err=50 tag=103 nentries=0
  etime=0.002000
 

  is it an expected behavior and i need to add to all the ACIs that allow
  modifications the right to modify internalModifiersName attribute
 

 Good question - not sure if this was intentional, but I think
 internalModifiersName should be written like modifiersName without specific
 permission.

 So for now I suggest you add the ACI and open a ticket to get it investigated.

  (if i add it, everything is fine and the attribute internalModifiersName
  becomes  cn=ldbm database,cn=plugins,cn=config ).
 
  Or is it a bug?
 

  Thank you!
 

  Regards,
 

  --
 
  389 users mailing list 389-users@lists.fedoraproject.org
  https://admin.fedoraproject.org/mailman/listinfo/389-users
 

 --
 389 users mailing list
 389-users@lists.fedoraproject.org
 https://admin.fedoraproject.org/mailman/listinfo/389-users--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica

2014-11-06 Thread Ivanov Andrey (M.)
Hi Noriko, 

as promised - the new ticket for the total replication bug discussed yesterday: 
https://fedorahosted.org/389/ticket/47942 

Regards, 
Andrey 

 De: Noriko Hosoi nho...@redhat.com
 À: General discussion list for the 389 Directory server project.
 389-users@lists.fedoraproject.org
 Envoyé: Mercredi 5 Novembre 2014 21:54:32
 Objet: Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica

 On 11/05/2014 12:46 PM, Ivanov Andrey (M.) wrote:

  Next time it happens, could it be possible to get the stacktraces from the
  hung server?
 

http://www.port389.org/docs/389ds/FAQ/faq.html#debugging-hangs
   
  
 

   Ok, I'll do that tomorrow (for 1.3.2.24 since I'm testing mainly this one).
   It happens each time during a full on-line initialization, so it won't be
   difficult to reproduce :) It does not really hang, only the online
   initialization hangs in fact (with the logs similar to the original
   mail)...
 

 That'd be great! If you could capture them, could you open a ticket at:

  https://fedorahosted.org/389/newticket
 

 and attach the stacktraces to the ticket?--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica

2014-11-06 Thread Ivanov Andrey (M.)
- Mail original -

   Don't know. My hypotheses are :
  
 
   * using plugin transactions compared to 1.2.10.x
  
 
   * bdb version? but even with compat-db-47 and 1.2.10 the problem still
   happens on CentOS7, though much less frequently. It never happens with
   1.2.10 with rpm bdb on CentOS5.
  
 
   * change from mozilla ldap libraries to openldap libraries?
  
 

   seems to be some sort of thread or transaction contention that is reduced
   when i add CPUs/increase checkpoint interval. It really looks like the
   master server just does not send entries any more at some moment...
   SSL/TLS
   slows the things down so less entries are sent before everything gets
   stuck...
  
 

   I'll get back with more information (stacktraces) tomorrrow.
  
 

  Another hypothesis:
 
  insufficient entropy generation speed for the TLS/SSL total update
  (/dev/urandom vs the blocking /dev/random), especially in VMs??
 

 it is possible the VM system is running out of entropy, causing apps to
 experience long delays; to verify:
 cat /proc/sys/kernel/random/entropy_avail

 one way to fix this is to install and run the haveged service on the KVM guest;
 it can be downloaded from EPEL

 it can also depend on the VM configuration; for example, if using KVM and
 libvirt (a recent version), the KVM host entropy can be used with a configuration
 similar to this:
 <rng model='virtio'>
   <backend model='random'>/dev/random</backend>
   <address type='pci' domain='0x' bus='0x00' slot='0x09' function='0x0'/>
 </rng>
 </devices>
 without that config, my test RHEL 7 KVM guest has quite low entropy,
 and the entropy will depend on the CPU characteristics.
Thank you Marc. I'll try checking the entropy pool state during the total 
on-line import. We are using VMware for virtualization, so there is no simple 
way to expose the host /dev/random to the guest VMs... However, I've had this 
problem (stuck initial replication) even with the plain LDAP (port 389) 
replication protocol, though it happened much less frequently. Anyway, I've made 
a ticket for this problem: https://fedorahosted.org/389/ticket/47942 
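
(A simple way to watch the pool during the import, plus the haveged workaround
suggested above - assuming an EL7-style guest with EPEL enabled:)

# watch the kernel entropy pool once per second during the total update
watch -n 1 cat /proc/sys/kernel/random/entropy_avail

# entropy-gathering daemon from EPEL
yum install haveged
systemctl enable haveged
systemctl start haveged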
--
389 users mailing list
389-users@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users

Re: [389-users] 389-Directory/1.3.1.6 cannot setup replica

2014-11-05 Thread Ivanov Andrey (M.)
Hi, 

I'm having the same problem. I'm in the process of migrating from our 389DS 
v1.2.10.25/CentOS5 to 389DS on CentOS 7. Everything is working fine on 
standalone servers except the replication (especially online initialization). It 
gets stuck _each time_ during the online initialization with SSL/TLS (and sometimes 
without SSL/TLS), and with exactly the same error messages as you describe. 
Network problems are excluded in my case - I used both virtual machines on 
the same ESXi host in the same network and/or physical servers; the results are 
the same. 
I've tried compiling the latest available tags of all the branches (1.3.2.24, 
1.3.1.22, 1.3.3.5). In all cases the result was the same. The server 
pushing the updates just gets stuck at some random number of entries sent to the 
consumer (we have ~3 entries; it gets stuck at random somewhere from 1200 
to 25000 entries; the entries where it gets stuck have nothing particular in size - 
it's completely random). 
1.2.10.24 compiled with compat-db-4.7 on CentOS 7 has the fewest of these 
problems (and the initial replication is 10 times faster - it takes 8 seconds 
instead of 80 for 1.3.x!). I've been using 1.2.10.24 on CentOS5 compiled with 
the Mozilla LDAP libraries and 1.2.10.23 on CentOS 7 compiled with the OpenLDAP 
libraries. The first one had no problems at all pushing the initial 
replication; the second one had intermittent problems, but far fewer than v1.3.x. 

I've noticed that this problem gets worse (or simply appears) if: 
* the replica is of type 3 (multi-master), with replication agreements in 
both directions 
* our schema has several additional attributes (this may also matter) 
* the virtual machine has only one CPU. Adding a second CPU increases the 
number of transferred entries before the initialization gets stuck, so it may 
be some thread/transaction contention or deadlock. 
* the replication agreement uses SSL (port 636) or TLS (port 389). Using port 
389 with the plain LDAP protocol instead of TLS/SSL increases the number of 
transferred entries before the initialization gets stuck. Sometimes the 
initialization even ends successfully in this case. 
* decreasing nsslapd-db-checkpoint-interval (say, to 5 seconds) also makes the 
problem worse 

When the on-line initialization is finished (if it finishes), there are no 
problems. I think it is related to the volume of data transferred, so small 
incremental updates do not cause any problem. 
If necessary, I will run any debugging/tests - it is a critical element of our 
infrastructure, so I'd like this problem to be resolved... 
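
(For the record, the total update that gets stuck is kicked off by setting
nsds5BeginReplicaRefresh on the replication agreement - a sketch with a placeholder
agreement DN:)

ldapmodify -x -D "cn=Directory Manager" -W <<EOF
dn: cn=<agreement name>,cn=replica,cn=<escaped suffix>,cn=mapping tree,cn=config
changetype: modify
replace: nsds5BeginReplicaRefresh
nsds5BeginReplicaRefresh: start
EOF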

Regards, 
Andrey IVANOV 

- Mail original -

 De: 陳含林 laneo...@gmail.com
 À: 389-users@lists.fedoraproject.org
 Envoyé: Mercredi 5 Novembre 2014 18:01:37
 Objet: [389-users] 389-Directory/1.3.1.6 cannot setup replica

 hello all,

 I have set up an IdM/FreeIPA master using CentOS7 and imported about 5000
 hosts.

 Then I tried to set up an IdM/FreeIPA replication server by using
 ipa-replica-install.

 It seems the total update on the replication server hangs after about 1000+
 entries are imported.

 I tried to trigger a total update by setting nsds5beginreplicarefresh, but the
 result was the same.

 Can anyone help me? Thanks!

 idm1 is the master, idm2 is the replication server.
 master server logs:

 [06/Nov/2014:00:21:48 +0800] - 389-Directory/ 1.3.1.6 B2014.219.1825 starting
 up
 [06/Nov/2014:00:21:48 +0800] schema-compat-plugin - warning: no entries set
 up under cn=computers, cn=compat,dc=idc
 [06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password
 Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added
 before the CoS Definition.
 [06/Nov/2014:00:21:51 +0800] - Skipping CoS Definition cn=Password
 Policy,cn=accounts,dc=idc--no CoS Templates found, which should be added
 before the CoS Definition.
 [06/Nov/2014:00:21:51 +0800] - slapd started. Listening on All Interfaces
 port 389 for LDAP requests
 [06/Nov/2014:00:21:51 +0800] - Listening on All Interfaces port 636 for LDAPS
 requests
 [06/Nov/2014:00:21:51 +0800] - Listening on /var/run/slapd-IDC.socket for
 LDAPI requests
 [06/Nov/2014:00:21:51 +0800] - Entry uid=admin,ou=people,o=ipaca --
 attribute krbExtraData not allowed
 [06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin -
 agmt=cn=meToidm2.ra.cn.idc (idm2:389): The remote replica has a different
 database generation ID than the local database. You may have to reinitialize
 the remote replica, or the local replica.
 [06/Nov/2014:00:40:26 +0800] NSMMReplicationPlugin - Beginning total update
 of replica agmt=cn=meToidm2.ra.cn.idc (idm2:389).

 replication server logs:
 [06/Nov/2014:00:40:18 +0800] - 389-Directory/ 1.3.1.6 B2014.219.1825 starting
 up
 [06/Nov/2014:00:40:18 +0800] ipalockout_get_global_config - [file
 ipa_lockout.c, line 185]: Failed to get default realm (-1765328160)
 [06/Nov/2014:00:40:18 +0800] ipaenrollment_start - [file ipa_enrollment.c,
 line 393]: Failed to get default realm?!
 [06/Nov/2014:00:40:18 +0800] - slapd