[389-devel] Re: 300s delay when query cn=monitor

2021-07-15 Thread Thierry Bordaz

Hello,

That is excellent news. The server was trying to read an operation for 
5 minutes and then timed out (ioblock_timeout); I am a bit surprised it 
detected an incoming event but had nothing to read. Anyway, in such a 
situation the connection is locked, and this explains why cn=monitor 
was hanging.
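
For reference, that timeout is configurable. A minimal python-ldap sketch 
(host, credentials, and the chosen value are placeholders) that lowers 
nsslapd-ioblocktimeout, the cn=config attribute involved (value in 
milliseconds), so a stalled connection is reaped sooner than 300s:

# Hedged sketch: lower nsslapd-ioblocktimeout (milliseconds) so a
# connection stuck in a read/write is reaped sooner than 5 minutes.
# Host, credentials and the chosen value are placeholders.
import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=directory manager", "password")
conn.modify_s(
    "cn=config",
    [(ldap.MOD_REPLACE, "nsslapd-ioblocktimeout", [b"60000"])],  # 60s
)
conn.unbind_s()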


best regards
thierry

On 7/15/21 2:49 PM, Erwin Weitlaner wrote:

After further log file analysis I found the blocking LDAP query ... it is a BIND request 
(which is successful) with a missing CLOSE afterwards. After 5 minutes the LDAP server 
closes the connection with closetype="T2" (server-side timeout). So far so good 
(a client problem I will fix). I am not sure if this situation blocked the cn=monitor query 
in earlier versions; I guess not?!

Once again thank you for your help.

Finally, the log of the blocking connection:

2021-07-15 10:22:42 +02:00
pldaXXX [15/Jul/2021:10:22:42.916319196 +0200] conn=1228488 CLOSE type=close fd=4751 slot=4751 
from=10.XXX.YYY.ZZZ to=10.AA.BB.CCC ssl=true 
binddn="gvgid=at:lX:y,ou=service,dc=t+gvouid=at:l7:lvn:01,dc=gv,dc=at" 
method=128 version=3 op=0 totalop=-1 closestate=normal closetype="T2"
2021-07-15 10:17:41 +02:00
pldaXXX [15/Jul/2021:10:17:41.893466169 +0200] conn=1228488 BIND type=bind op=0 
binddn="gvgid=at:lX:y,ou=service,dc=t+gvouid=at:l7:lvn:01,dc=gv,dc=at" method=128 
version=3 RESULT op=0 err=0 errname=success tag=97 nentries=0 optime=0.000512268 wtime=0.041730272 
etime=0.042240973 
resultdn="gvgid=at:lX:y,ou=service,dc=t+gvouid=at:l7:lvn:01,dc=gv,dc=at" CONNECTION 
fd=4751 slot=4751 from=10.XXX.YYY.ZZZ to=10.AA.BB.CCC ssl=true 
binddn="gvgid=at:lX:y,ou=service,dc=t+gvouid=at:l7:lvn:01,dc=gv,dc=at" method=128 
version=3 op=0
2021-07-15 10:17:41 +02:00
pldaXXX [15/Jul/2021:10:17:41.850130533 +0200] conn=1228488 OPEN type=open 
fd=4751 slot=4751 from=10.XXX.YYY.ZZZ to=10.AA.BB.CCC ssl=true
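
For the archives, a minimal python-ldap sketch of the client behaviour 
described above (host and credentials are placeholders): the BIND succeeds 
but the connection is never unbound or closed, so the server eventually 
reaps it with closetype="T2".

# Hedged reproduction sketch: BIND succeeds (op=0 err=0) but the client
# never sends an UNBIND or closes the socket, so the server times the
# connection out (closetype="T2"). All names are placeholders.
import time
import ldap

conn = ldap.initialize("ldaps://ldap.example.com:636")
conn.simple_bind_s("cn=service,dc=example,dc=com", "secret")
time.sleep(600)  # no conn.unbind_s(): the fd stays open until the server reaps it
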
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure



[389-devel] Re: 300s delay when query cn=monitor

2021-07-12 Thread Thierry Bordaz

Hi Erwin,

I think the pstack is the first step to diagnose what is going on. I 
suspected a timeout because the long etime (250s) is close to a 
5-minute timeout (ioblock, SSL timeout, ...), but ATM there is no strong 
evidence of this. Timeouts are not systematically reported in the error 
log as they are normal networking events.
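
For anyone landing here later, a rough sketch of capturing such a dump 
(tool paths are assumptions; gdb's "thread apply all bt" gives equivalent 
output):

# Hedged sketch: dump the stacks of all ns-slapd threads while the
# cn=monitor search is stalled. pstack availability is an assumption.
import subprocess

pid = subprocess.check_output(["pidof", "ns-slapd"]).split()[0].decode()
with open("/tmp/ns-slapd.pstack", "w") as out:
    subprocess.run(["pstack", pid], stdout=out, check=True)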


best regards
thierry

On 7/12/21 8:24 AM, Erwin Weitlaner wrote:

Thank you Thierry for your support.

So I will try to get the thread dump ... If the problem comes with a connection 
timeout from a (not listening) client, shouldn't I see an error log entry 
somewhere? I searched for that for hours but found no hints ... Maybe not the right 
idea, but the problem appeared after our last Linux patch, so if others have similar 
problems, maybe it is a code change issue?

SG Erwin
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure



[389-devel] Re: 300s delay when query cn=monitor

2021-07-09 Thread Thierry Bordaz

Hi,

I would suspect the dump of the open connections' status to be the RC 
(root cause) of that delay. If the monitor thread cannot acquire the 
connection lock, it will delay the request. So I would suspect a 
connection timeout (5 min) to hang the monitoring thread, for example if 
an LDAP client is not reading while the server tries to send it some data.


As it hangs for 5 minutes, you may run a pstack while it occurs to show 
what the threads are doing.


regards
thierry

On 7/9/21 11:51 AM, Erwin Weitlaner wrote:

We are using 389-Directory/1.3.10.2 B2021.127.856 (we will update next month) ... a 
monitoring script queries with base DN cn=monitor and filter (objectClass=*) every 
minute. This query has returned in <10ms for years. For the past two weeks, under unknown 
circumstances, this query has taken up to 300s (once or twice per day). The other 
queries work fine (no delay). Any ideas what could block the answer?

SG Erwin

The 'bad' log entry:
[09/Jul/2021:06:54:01.203812923 +0200] conn=962307 SRCH type=srch op=1 base="cn=monitor" scope=2 
filter="(objectClass=*)" attrs="ALL" RESULT op=1 err=0 errname=success tag=101 nentries=3 
optime=250.179697410 wtime=0.62324 etime=250.179757740 CONNECTION fd=4201 slot=4201 from=127.0.0.1 to=127.0.0.1 
ssl=false binddn="cn=directory manager" method=128 version=3 op=1

The 'normal' log entry:
[09/Jul/2021:06:52:01.797036360 +0200] conn=96 SRCH type=srch op=1 base="cn=monitor" scope=2 
filter="(objectClass=*)" attrs="ALL" RESULT op=1 err=0 errname=success tag=101 nentries=3 
optime=0.002609889 wtime=0.57307 etime=0.002665406 CONNECTION fd=4199 slot=4199 from=127.0.0.1 to=127.0.0.1 
ssl=false binddn="cn=directory manager" method=128 version=3 op=1
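
An equivalent probe to the monitoring script can be written with 
python-ldap; a hedged sketch (credentials and threshold are placeholders) 
that times the same search:

# Hedged sketch of the monitoring query above: time a subtree search of
# cn=monitor and flag the occasional multi-second stall.
import time
import ldap

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=directory manager", "password")
start = time.monotonic()
conn.search_s("cn=monitor", ldap.SCOPE_SUBTREE, "(objectClass=*)")
elapsed = time.monotonic() - start
conn.unbind_s()
if elapsed > 1.0:  # normally <10ms, so anything slower is suspect
    print(f"cn=monitor took {elapsed:.1f}s")
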
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure



[389-devel] Re: Please have look at One-Time Password password policy

2021-03-26 Thread thierry bordaz



On 3/25/21 11:28 PM, William Brown wrote:



On 25 Mar 2021, at 17:49, thierry bordaz  wrote:



On 3/25/21 3:20 AM, William Brown wrote:

On 25 Mar 2021, at 12:00, Mark Reynolds  wrote:


On 3/24/21 8:32 PM, William Brown wrote:

I think maybe it could be easy to visualise it.


We have time going from past to future like:


past -> future


So we want a window of:

* When can the OTP start to be used?
* When is the OTP no longer able to be used?

So my thinking is:

past -> future
   ^              ^              ^
   |              |              |
otp created       |              |
            otp valid from       |
                          otp expire at


So I would say otp valid from and the otp expire should be *absolute* date 
times in UTC.


Hi William

Good idea of that graphic. I am sorry to be so slow to understand, but 
things are still not clear.
There are the attributes of the password policy and the operational attributes 
of the user account.

I think you meant, and I agree with you, that the operational attributes (in the 
user account) should be absolute dates.
What is not clear to me is how to compute those operational attributes from the 
password policy.
I see three options:
1. The password policy contains an absolute time (e.g. passwordOTPValidFromUTC)
   => account.validFrom = policyValidFromUTC
2. The password policy contains a delay after the OTP create/reset (e.g.
   passwordOTPValidFromDelay) => account.validFrom = now + policyValidFromDelay
3. The password policy contains both, and if both are set we should give
   priority to one or the other
If a password policy is a stable entry, then it should not contain an absolute 
time. If we imagine password policies created on purpose for a batch of 
registrations, then it is okay for them to contain absolute times.

Do you think we should support password policies with absolute time?


No we should not store actual times in the config.  These time values need to 
live in the entry itself, just like passwordExpirationtime. Perhaps this is a 
good candidate to handle through the CLI (maybe even a new task that uses a 
filter, base, etc)?

I'm a bit confused about this answer but:

Thierry thought you wanted to set:


 dn: cn=config
 passwordOTPStartTime: 2021034578489754Z


But I was saying it should be in the entry, not cn=config, like how we use 
passwordExpirationtime:


 uid=mark,dc=example,dc=com
 passwordOTPStartTime: 2021034578489754Z

Yep, this is exactly what I meant :)


Mark


I think there are no "operational" attributes here. These should all be 
absolute dates, set on the entry. No calculation or whatever needed.

Thanks Mark, William for the clarification.
Actually OTP is an extension of the current password policy. So there are new 
attributes in the password policy entry and new (operational) attributes in the 
account entry.

I understand and agree that the attributes (in the account entry) that represent a 
window of validity will be absolute times.

For example, assuming that an admin resets the userpassword of 
'uid=mark' at 10AM, we will have

dn: cn=config
passwordMustChange: on
passwordOTPValidFromDelay: 1800
passwordOTPExpireDelay: 3600
passwordOTPMaxUse: 3

I actually don't think any of these fields in cn=config are required, and 
they actually limit the capability of this.


This was an example for global password policy settings. These new 
attributes are supported in global pwp but also in pwpolicysubentries.




dn: uid=mark,dc=example,dc=com
userpassword: xxx
pwdOTPReset: true
pwdOTPUseCount: 0
pwdOTPValidFrom: 20210325103000Z
pwdOTPExpireAt: 20210325110000Z

You should add pwdOTPMaxUse: X here.


The reason I say this is it grants flexibility. Imagine, say ... we are 
enrolling 10,000 students at a university and creating their new accounts. We 
want to set exact times on their accounts and, say ... give them 10 tries or 
something.

But at the same time we need to reset someone's account. For them, we want them 
to have, say ... 1 attempt to do the reset, and have a different time window.

All the cn=config fields do here is confuse and complicate the system - what is their 
purpose but to "template" out some values into the entry?


The only benefit I see (but I may be missing something) in adding pwdOTPMaxUse 
to the account entry is in case the password policies change 
frequently. If we want to be sure the exact settings of the password 
policy at the time of the reset will be enforced, even if the password 
policy changes later, then we have to copy that value into the entry.
During a bind attempt, we retrieve the password policy that 
applies to the target bind entry, and we can enforce the pwp without 
storing it in the target entry.


thanks
thierry


In this matter, I think that all the OTP fields should be on the en

[389-devel] Re: Please have look at One-Time Password password policy

2021-03-25 Thread thierry bordaz



On 3/25/21 3:20 AM, William Brown wrote:



On 25 Mar 2021, at 12:00, Mark Reynolds  wrote:


On 3/24/21 8:32 PM, William Brown wrote:

I think maybe it could be easy to visualise it.


We have time going from past to future like:


past -> future


So we want a window of:

* When can the OTP start to be used?
* When is the OTP no longer able to be used?

So my thinking is:

past -> future
   ^              ^              ^
   |              |              |
otp created       |              |
            otp valid from       |
                          otp expire at


So I would say otp valid from and the otp expire should be *absolute* date 
times in UTC.


Hi William

Good idea of that graphic. I am sorry to be so slow to understand, but 
things are still not clear.
There are the attributes of the password policy and the operational attributes 
of the user account.

I think you meant, and I agree with you, that the operational attributes (in the 
user account) should be absolute dates.
What is not clear to me is how to compute those operational attributes from the 
password policy.
I see three options:
1. The password policy contains an absolute time (e.g. passwordOTPValidFromUTC)
   => account.validFrom = policyValidFromUTC
2. The password policy contains a delay after the OTP create/reset (e.g.
   passwordOTPValidFromDelay) => account.validFrom = now + policyValidFromDelay
3. The password policy contains both, and if both are set we should give
   priority to one or the other
If a password policy is a stable entry, then it should not contain an absolute 
time. If we imagine password policies created on purpose for a batch of 
registrations, then it is okay for them to contain absolute times.

Do you think we should support password policies with absolute time?


No we should not store actual times in the config.  These time values need to 
live in the entry itself, just like passwordExpirationtime. Perhaps this is a 
good candidate to handle through the CLI (maybe even a new task that uses a 
filter, base, etc)?

I'm a bit confused about this answer but:

Thierry thought you wanted to set:


 dn: cn=config
 passwordOTPStartTime: 2021034578489754Z


But I was saying it should be in the entry, not cn=config, like how we use 
passwordExpirationtime:


 uid=mark,dc=example,dc=com
 passwordOTPStartTime: 2021034578489754Z

Yep, this is exactly what I meant :)



Mark


I think there are no "operational" attributes here. These should all be 
absolute dates, set on the entry. No calculation or whatever needed.


Thanks Mark, William for the clarification.
Actually OTP is an extension of the current password policy. So there 
are new attributes in the password policy entry and new (operational) 
attributes in the account entry.


I understand and agree that the attributes (in the account entry) that 
represent a window of validity will be absolute times.


For example, assuming that an admin resets the userpassword 
of 'uid=mark' at 10AM, we will have


dn: cn=config
passwordMustChange: on
passwordOTPValidFromDelay: 1800
passwordOTPExpireDelay: 3600
passwordOTPMaxUse: 3


dn: uid=mark,dc=example,dc=com
userpassword: xxx
pwdOTPReset: true
pwdOTPUseCount: 0
pwdOTPValidFrom: 20210325103000Z
pwdOTPExpireAt: 20210325110000Z

Meaning the user 'Mark' should complete his registration between 10:30AM 
and 11AM, and he will be granted 3 tries to bind with the registration 
password and change his password.
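
A minimal sketch of how those absolute values could be derived from the 
policy delays at reset time (attribute names follow the draft design and 
may change):

# Hedged sketch: derive the account's absolute OTP window (option 2)
# from the policy delays at userpassword-reset time, formatted as
# GeneralizedTime. Names follow the draft design, not a final API.
from datetime import datetime, timedelta, timezone

GENERALIZED_TIME = "%Y%m%d%H%M%SZ"

def otp_window(valid_from_delay, expire_delay):
    """Return (pwdOTPValidFrom, pwdOTPExpireAt) for a reset happening now."""
    now = datetime.now(timezone.utc)
    valid_from = (now + timedelta(seconds=valid_from_delay)).strftime(GENERALIZED_TIME)
    expire_at = (now + timedelta(seconds=expire_delay)).strftime(GENERALIZED_TIME)
    return valid_from, expire_at

# passwordOTPValidFromDelay=1800, passwordOTPExpireDelay=3600: a reset
# at 10:00 yields ('...103000Z', '...110000Z'), matching the entry above.
print(otp_window(1800, 3600))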


thanks
thierry



There is no policy at all. Basically you just have a mechanism that sets on the 
account that this password is only valid in these time windows, and can only be 
used a maximum number of times.




Mark


—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia


--

389 Directory Server Development Team

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia


___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure


[389-devel] Re: Please have look at One-Time Password password policy

2021-03-24 Thread thierry bordaz



On 3/24/21 12:16 AM, William Brown wrote:



On 24 Mar 2021, at 02:12, thierry bordaz  wrote:

Hi William

Thanks for your review. Some answers are inlined in the mail below.

On 3/23/21 12:33 AM, William Brown wrote:

Hey there,

I think that you also need:


pwdOTPValidFromTime

This way an admin can pre-configure all the OTPs and they only "become valid 
from" that time frame, i.e. think university enrollment. You can configure all the 
OTPs a month before, and they become valid at a specific datetime.

That is a very nice idea. Note that to be an OTP, the 'userpassword' of the account 
must be reset by an admin, with the account inheriting a password policy with OTP 
settings.
Assuming 'pwdOTPValidFromTime' is the account operational attribute holding a 
precise time, how should it be computed? Directly from a precise time set in 
the password policy, computed from a 'passwordOTPValidationDelay' (number 
of seconds after the OTP reset time), or something else?

I think maybe it could be easy to visualise it.


We have time going from past to future like:


past -> future


So we want a window of:

* When can the OTP start to be used?
* When is the OTP no longer able to be used?

So my thinking is:

past -> future
   ^              ^              ^
   |              |              |
otp created       |              |
            otp valid from       |
                          otp expire at


So I would say otp valid from and the otp expire should be *absolute* date 
times in UTC.

Hi William

Good idea of that graphic. I am sorry to be so slow to understand, but 
things are still not clear.
There are the attributes of the password policy and the operational 
attributes of the user account.


I think you meant, and I agree with you, that the operational attributes (in 
the user account) should be absolute dates.
What is not clear to me is how to compute those operational attributes 
from the password policy.

I see three options:

1. password policy contains absolute time (e.g.
   passwordOTPValidFromUTC) => account.validFrom = policyValidFromUTC
2. password policy contains a delay after OTP create/reset (e.g.
   passwordOTPValidFromDelay) => account.validFrom = Now +
   policyValidFromDelay
3. password policy contains both and if both are set we should give the
   priority to one or the other

If a password policy is a stable entry, then it should not contain an 
absolute time. If we imagine password policies created on purpose for a 
batch of registrations, then it is okay for them to contain absolute times.


Do you think we should support password policies with absolute time?


thanks

thierry



And then within that otp valid from - expire at time window, we have the "max 
use" field of how many times it can be used.

Does that help?



I think you should make it consistent: passwordOTPExpDelay to pwdOTPExpDelay. Better, OTP means "one 
time password", so why is it "password one time password"? Just make the attributes 
"OTPExpDelay" or whatever. Alternately make it pwdOT (password one time).

ATM the password policy ('passwordPolicy') only contains 'password*' attributes; 
this is why I would prefer to keep 'passwordOTP*' (e.g. passwordOTPMaxUse, 
passwordOTPExpirationDelay, passwordOTPValidFromTime).
I agree that 'passwordOTP' looks weird ("password one time password") but the 
first 'password' is the way the password policy attributes are prefixed.

Ahh I think I missed that this is using the userPassword and combined with 
passwordPolicy. That makes sense now.

Still better to keep it consistent - all pwd or all password. I think you 
mix/match these 


Then there are the account operational attributes updated via the password 
policy. There is a mix:
6 out of 10 start with 'password' (like 'passwordExpirationTime'),
2 out of 10 start with 'pwd' (like 'pwdReset'),
and the two remaining are 'retryCountResetTime' and 'accountUnlockTime'.
I chose the 'pwdOTP' prefix because the feature is somehow related to 
'pwdReset' and also I preferred a different prefix from the password policy.

I think passwordOTPExpDelay can be removed if you have ValidFromTime instead.

Why? Registration should be done after Now+ValidFromTime and before 
Now+passwordOTPExpDelay.
So the two are useful.

I'd see above :)



The OC should be named onetimepasswordPolicy instead.

Do you suggest we have two password policy OCs: passwordPolicy and 
OneTimePasswordPolicy?
With OTP relying on 'passwordMustChange', OneTimePasswordPolicy should then allow 
'passwordMustChange'.

Ignore this comment - I think I missed the passwordPolicy bit :)



Hope that helps!

Absolutely it helps a lot. Thanks !

thierry



On 22 Mar 2021, at 21:30, thierry bordaz  wrote:

Hi,

I wrote a small design [1] about OTP password policy that I would like to start 
implementing.
Comments

[389-devel] Re: Please have look at One-Time Password password policy

2021-03-23 Thread thierry bordaz

Hi William

Thanks for your review. Some answers are inlined in the mail below.

On 3/23/21 12:33 AM, William Brown wrote:

Hey there,

I think that you also need:


pwdOTPValidFromTime

This way an admin can pre-configure all the OTPs and they only "become valid 
from" that time frame, i.e. think university enrollment. You can configure all the 
OTPs a month before, and they become valid at a specific datetime.


That is a very nice idea. Note that to be an OTP, the 'userpassword' of the 
account must be reset by an admin, with the account inheriting a password 
policy with OTP settings.
Assuming 'pwdOTPValidFromTime' is the account operational attribute 
holding a precise time, how should it be computed? Directly from a 
precise time set in the password policy, computed from a 
'passwordOTPValidationDelay' (number of seconds after the OTP reset time), or 
something else?


I think you should make it consistent: passwordOTPExpDelay to pwdOTPExpDelay. Better, OTP means "one 
time password", so why is it "password one time password"? Just make the attributes 
"OTPExpDelay" or whatever. Alternately make it pwdOT (password one time).
ATM the password policy ('passwordPolicy') only contains 'password*' 
attributes; this is why I would prefer to keep 'passwordOTP*' (e.g. 
passwordOTPMaxUse, passwordOTPExpirationDelay, passwordOTPValidFromTime).
I agree that 'passwordOTP' looks weird ("password one time password") 
but the first 'password' is the way the password policy attributes are 
prefixed.


Then there are the account operational attributes updated via the password 
policy. There is a mix:

6 out of 10 start with 'password' (like 'passwordExpirationTime'),
2 out of 10 start with 'pwd' (like 'pwdReset'),
and the two remaining are 'retryCountResetTime' and 'accountUnlockTime'.
I chose the 'pwdOTP' prefix because the feature is somehow related to 
'pwdReset' and also I preferred a different prefix from the password policy.


I think passwordOTPExpDelay can be removed if you have ValidFromTime instead.


Why? Registration should be done after Now+ValidFromTime and before 
Now+passwordOTPExpDelay.

So the two are useful.




The OC should be named onetimepasswordPolicy instead.
Do you suggest we have two password policy OCs: passwordPolicy and 
OneTimePasswordPolicy?
With OTP relying on 'passwordMustChange', OneTimePasswordPolicy should 
then allow 'passwordMustChange'.



Hope that helps!


Absolutely it helps a lot. Thanks !

thierry




On 22 Mar 2021, at 21:30, thierry bordaz  wrote:

Hi,

I wrote a small design [1] about OTP password policy that I would like to start 
implementing.
Comments are welcome

[1] https://www.port389.org/docs/389ds/design/otp-password-policy.html

best regards
thierry
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure



[389-devel] Please have look at One-Time Password password policy

2021-03-22 Thread thierry bordaz

Hi,

I wrote a small design [1] about OTP password policy that I would like 
to start implementing.

Comments are welcome

[1] https://www.port389.org/docs/389ds/design/otp-password-policy.html

best regards
thierry
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: 
https://pagure.io/fedora-infrastructure


[389-devel] Re: Deref plugin entries == NULL #4525

2021-01-15 Thread thierry bordaz



On 1/15/21 11:53 AM, Pierre Rogier wrote:

Hi Thierry,

I was rather thinking about the key and value duplication when 
querying the DB:


When using bdb functions that is done implicitly:
  bdb either copies the values into the DBT buffer or it allocs/reallocs it.
When mimicking bdb behavior with LMDB we will have to do that 
explicitly in the LMDB plugin:

    LMDB returns a memory-mapped address that may be unmapped
    once the txn is ended, so we must copy the result before 
closing the txn.
If we have a read txn that protects the full operation lifespan, then 
we could directly use the mapped address without needing to duplicate 
them.

ah okay ! nice.
Just a question: if we have a txn covering a search with a large candidate 
list (unindexed), does that mean that by default each db key/value will 
remain mapped until the txn commits?
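
For illustration, the Python lmdb bindings expose the same lifetime 
constraint; a hedged sketch (path and sizes are placeholders):

# Hedged illustration with the Python lmdb bindings: with buffers=True,
# get() returns a memoryview into the memory map that is only valid while
# the read txn is open, so the value must be copied before the txn ends.
import lmdb

env = lmdb.open("/tmp/demo-lmdb", map_size=10 * 1024 * 1024)
with env.begin(write=True) as txn:
    txn.put(b"uid=mark", b"entry data")

with env.begin(buffers=True) as txn:
    view = txn.get(b"uid=mark")  # zero-copy view into the mapped file
    copied = bytes(view)         # explicit copy, safe to keep past the txn

print(copied)  # 'view' must not be used here; 'copied' is fine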


thanks
thierry


Pierre

On Fri, Jan 15, 2021 at 10:53 AM thierry bordaz  wrote:




On 1/14/21 12:32 PM, Pierre Rogier wrote:

Hi William,

> It's a scenario we will need to fix via your BE work because of
the MVCC transaction model that
> LMDB will force us to adopt :)
As I see things, in the early phases the lmdb read txn will
probably only be managed at the db plugin level rather than at
the backend level. That means that we will have the same
inconsistency risk as today (i.e. as if using bdb and the
implicit txn).
The txn model redesign you are speaking about should only occur
in one of the last phases (once bdb no longer coexists with lmdb).
It must be done because it could provide a serious performance
boost for read operations (IMHO, in most cases we could avoid
duplicating the db data).

Pierre, which duplication are you thinking of? str2entry?


But we should not do it while bdb is still around because of the
risk of lock issue and excessive retries.

Note I put a phasing section in

https://directory.fedoraproject.org/docs/389ds/design/backend-redesign-phase3.html#phasing
explaining that. But I guess I should move it within Ludwig's
document that englobs it.

Pierre

On Thu, Jan 14, 2021 at 12:01 AM William Brown  wrote:



> On 13 Jan 2021, at 21:24, Pierre Rogier  wrote:
>
> Thank you William,
> So far your scenario (entry found when reading base entry
but no more existing when computing the candidates) is the
only one that matches the symptoms.

It's a scenario we will need to fix via your BE work because
of the MVCC transaction model that LMDB will force us to
adopt :)

> And that triggered a thought:
>  We cannot do anything for SUBTREE and ONE_LEVEL searches
>   because the fact that the base entry id is not in the
candidate may be normal
>  but IMHO we should improve the BASE search case.
> In this case the candidate list is directly set to the base
entry id
>  ==> if the candidate entry (in
ldbm_back_next_search_entry) is not found and the scope is
BASE then we should return a LDAP_NO_SUCH_ENTRY error ..

I suspect that Mark has seen this email and submitted a PR to
resolve this exact case :)


>
>        Pierre
>
>
> On Wed, Jan 13, 2021 at 1:45 AM William Brown  wrote:
> Hey there,
>
> https://github.com/389ds/389-ds-base/pull/4525/files
>
> I had a look and I can see a few possible contributing
factors, but without a core and the exact state I can't be
sure if this is correct. It's all just hypothetical from
reading the code.
>
>
> The crash is in deref_do_deref_attr() which is called as
part of deref_pre_entry(). This is the
SLAPI_PLUGIN_PRE_ENTRY_FN which is called by
"./ldap/servers/slapd/result.c:1488:    rc =
plugin_call_plugins(pb, SLAPI_PLUGIN_PRE_ENTRY_FN);"
>
>
> I think what's important here is that the search is
conducted in ./ldap/servers/slapd/opshared.c:818 rc =
(*be->be_search)(pb);  Is *not* in a transaction. That means
that while the single search in be_search() is consistent due
to an implied transaction, the subsequent search in
deref_pre_entry() is likely conducted in a separate
transaction. This allows for other operations to potentially
interleave and cause changes - modrdn or delete would
certainly be candidates to cause a DN to be removed between
these two points. It would be extremely hard to reproduce as
a race condition of course.
>
>
> A ques

[389-devel] Re: Deref plugin entries == NULL #4525

2021-01-15 Thread thierry bordaz



On 1/14/21 12:32 PM, Pierre Rogier wrote:

Hi William,

> It's a scenario we will need to fix via your BE work because of the 
MVCC transaction model that

> LMDB will force us to adopt :)
As I see things, in the early phases the lmdb read txn will 
probably only be managed at the db plugin level rather than at the backend 
level. That means that we will have the same inconsistency risk as 
today (i.e. as if using bdb and the implicit txn).
The txn model redesign you are speaking about should only occur in one 
of the last phases (once bdb no longer coexists with lmdb).
It must be done because it could provide a serious performance boost 
for read operations (IMHO, in most cases we could avoid duplicating 
the db data).

Pierre, which duplication are you thinking of? str2entry?

But we should not do it while bdb is still around because of the 
risk of lock issue and excessive retries.


Note I put a phasing section in
https://directory.fedoraproject.org/docs/389ds/design/backend-redesign-phase3.html#phasing
explaining that. But I guess I should move it within Ludwig's document 
that englobs it.


Pierre

On Thu, Jan 14, 2021 at 12:01 AM William Brown  wrote:




> On 13 Jan 2021, at 21:24, Pierre Rogier  wrote:
>
> Thank you William,
> So far your scenario (entry found when reading base entry but no
more existing when computing the candidates) is the only one that
matches the symptoms.

It's a scenario we will need to fix via your BE work because of
the MVCC transaction model that LMDB will force us to adopt :)

> And that triggered a thought:
>  We cannot do anything for SUBTREE and ONE_LEVEL searches
>   because the fact that the base entry id is not in the
candidate may be normal
>  but IMHO we should improve the BASE search case.
> In this case the candidate list is directly set to the base entry id
>  ==> if the candidate entry (in ldbm_back_next_search_entry) is
not found and the scope is BASE then we should return a
LDAP_NO_SUCH_ENTRY error ..

I suspect that Mark has seen this email and submitted a PR to
resolve this exact case :)


>
>        Pierre
>
>
> On Wed, Jan 13, 2021 at 1:45 AM William Brown  wrote:
> Hey there,
>
> https://github.com/389ds/389-ds-base/pull/4525/files
>
> I had a look and I can see a few possible contributing factors,
but without a core and the exact state I can't be sure if this is
correct. It's all just hypothetical from reading the code.
>
>
> The crash is in deref_do_deref_attr() which is called as part of
deref_pre_entry(). This is the SLAPI_PLUGIN_PRE_ENTRY_FN which is
called by "./ldap/servers/slapd/result.c:1488:    rc =
plugin_call_plugins(pb, SLAPI_PLUGIN_PRE_ENTRY_FN);"
>
>
> I think what's important here is that the search is conducted in
./ldap/servers/slapd/opshared.c:818  rc = (*be->be_search)(pb); 
Is *not* in a transaction. That means that while the single search
in be_search() is consistent due to an implied transaction, the
subsequent search in deref_pre_entry() is likely conducted in a
separate transaction. This allows for other operations to
potentially interleave and cause changes - modrdn or delete would
certainly be candidates to cause a DN to be removed between these
two points. It would be extremely hard to reproduce as a race
condition of course.
>
>
> A question you asked is why don't we get a "no such entry" error
or similar? I think that this is because build_candidate_list in
ldbm_search.c doesn't actually create an error if the
base_candidates list is empty, because an IDL is allocated with a
value of 0 (no matching entries). this allows the search to
proceed, and there are no errors, and the result set is set to
NULL with size 0. I can't see where LDAP_NO_SUCH_OBJECT is set in
this process, but without looking further into it, my suspicion is
that entries of size 0 WONT return an error condition to
internal_search_pb, so it's valid for this to be empty.
>
> Anyway, again, this is just reading the code for 20 minutes, and
is not a complete in depth investigation, but maybe it's some
ideas about what happened?
>
> Hope it helps :)
>
>
>
> —
> Sincerely,
>
> William Brown
>
> Senior Software Engineer, 389 Directory Server
> SUSE Labs, Australia
> ___
> 389-devel mailing list -- 389-de...@lists.fedoraproject.org

> To unsubscribe send an email to
389-devel-le...@lists.fedoraproject.org

> Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines:

[389-devel] Re: Deref plugin entries == NULL #4525

2021-01-15 Thread thierry bordaz



On 1/13/21 1:44 AM, William Brown wrote:

Hey there,

https://github.com/389ds/389-ds-base/pull/4525/files

I had a look and I can see a few possible contributing factors, but without a 
core and the exact state I can't be sure if this is correct. It's all just 
hypothetical from reading the code.


The crash is in deref_do_deref_attr() which is called as part of deref_pre_entry(). This 
is the SLAPI_PLUGIN_PRE_ENTRY_FN which is called by 
"./ldap/servers/slapd/result.c:1488:rc = plugin_call_plugins(pb, 
SLAPI_PLUGIN_PRE_ENTRY_FN);"


I think what's important here is that the search is conducted in 
./ldap/servers/slapd/opshared.c:818  rc = (*be->be_search)(pb);  Is *not* in a 
transaction. That means that while the single search in be_search() is consistent 
due to an implied transaction, the subsequent search in deref_pre_entry() is 
likely conducted in a separate transaction. This allows for other operations to 
potentially interleave and cause changes - modrdn or delete would certainly be 
candidates to cause a DN to be removed between these two points. It would be 
extremely hard to reproduce as a race condition of course.


Hi William, Pierre,

Thanks for your feedback. I realize how complex it is to come up with a 
possible explanation and I really appreciate it.

I am still missing some parts to understand how it happened.
In the current crash there was no transaction at all "protecting" the 
initial search or nested searches. So yes, we can imagine the entry got 
deleted between the base lookup and the candidate list build, but it is not 
related to a txn.

Note that the logs do not contain a direct delete of the entry.
Also, during a base search, the base entry is looked up. It was successful, 
else it would have returned a search failure. In such a case the candidate 
list is not empty; it contains the base search entry ID (e->ep_id).
Finally, the candidates are evaluated against the filter 
(objectclass=*). It could be that phase that is failing, if the entry was 
cleared from the entry cache and the ep_id lookup failed.


regards
thierry



A question you asked is why don't we get a "no such entry" error or similar? I 
think that this is because build_candidate_list in ldbm_search.c doesn't actually create 
an error if the base_candidates list is empty, because an IDL is allocated with a value 
of 0 (no matching entries). This allows the search to proceed, and there are no errors, 
and the result set is set to NULL with size 0. I can't see where LDAP_NO_SUCH_OBJECT is 
set in this process, but without looking further into it, my suspicion is that entries of 
size 0 won't return an error condition to internal_search_pb, so it's valid for this to be 
empty.

Anyway, again, this is just reading the code for 20 minutes, and is not a 
complete in depth investigation, but maybe it's some ideas about what happened?

Hope it helps :)



—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia
___
389-devel mailing list -- 389-de...@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-de...@lists.fedoraproject.org



[389-devel] Re: Mapping tree rework

2020-10-20 Thread thierry bordaz



On 10/20/20 4:01 AM, William Brown wrote:

In the first case we could easily mitigate the risk by testing and be fairly 
confident, in the second case the tests are too complex to achieve the same 
confidence and we should take this kind of risk only if there were a serious 
benefit to balance it, but in this case, there are other solutions with less 
risks.

Actually, I think testing the lib389 tooling would be even harder. You would 
need to recreate the logic of the mapping tree and sorting in python, which may 
have subtle differences compared to the C version. So it would be harder to 
test and gain confidence in. It also doesn't solve the issue that may come 
about from manual misconfiguration.


I can understand it could seem too conservative and frustrating, but that is 
the price when working on mature projects. If you do not do that, the product 
becomes unstable, and users quickly abandon it.

I have worked on this project for a number of years, so I'm well aware of the culture in 
the team. We are a team who values the highest quality of code, with customers who demand 
the very best. To satisfy this as engineers we need to be confident in what we do and the 
work we create. But every day we make changes that are bigger than this, or have 
"more unknowns" and more. It's out attitude as a team to quality, our attention 
to testing, and designs, that make us excellent at effectively making changes with 
confidence.

Because just as easily, when a product has subtle traps, unknown configuration bugs and 
lets people mishandle it, then they also abandon us. Our user experience is paramount, 
and part of that experience is not just stability, but reliability and correctness, that 
changes performed by administrators will work and not "silently fail". This bug 
is just as much a risk for people to abandon us because when the server allows 
misconfiguration to exist that is hard to isolate and understand that too can cause a 
negative user experience.

So here, I think we are going to have to "agree to disagree", but as Mark has stated - the fix is 
created, the PR is open. If you have more configuration cases to contribute to the test suite, that would 
benefit the project significantly to ensure the quality of the change, and the quality of the mapping tree in 
general. Our job is to qualify and create scenarios that were "unknown" and turn them to 
"knowns" so we can control changes and have confidence in our work.


On 20 Oct 2020, at 06:10, Mark Reynolds  wrote:

Hi,

So some of the arguments here is that we are introducing risk for something that is not 
really a big problem.  Or, simply not worth investing in. From a Red Hat perspective 
"we" would never fix this, it's just not a problem that comes up enough to 
justify the work and time.  But...  The initial work has been done by the upstream 
community (William).

With a corporate interest too, we have a customer at SUSE who has hit this :).


That users/customers are starting to hit MT bugs justifies fixing it. 
Should it be fixed in the MT or in lib389? I tend to agree with Pierre and 
Ludwig that the (buggy) MT has been working for decades now, and as a 
sensitive and difficult-to-test area I would prefer not to change it.
Now we have a valid patch/design on the table, and I suspect/hope that if 
it introduces a regression it will be discovered rapidly. So I agree 
there is a disagreement, and that is the way open source works. IMHO the 
patch should be pushed as soon as it is reviewed.


best regards
thierry



  So from a RH perspective we are getting this work for free.  Personally I don't see 
this code change as "very" risky, but this is a very sensitive area of the 
code.  That being said, I am not opposed to adding it, but...  I think we need much more 
testing around it to build confidence in the patch.  I would want tests that deal with 
suffixes of varying size, names, nested levels/complexity:

 o=my_server.com

 dc=example,dc=com

 dc=abcdef,dc=abc  (same length as suffix above - since the patch uses 
sizing as a way of sorting)

 dc=test,dc=this,dc=patch
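
A hedged lib389 sketch of one such case, creating a parent suffix and a 
nested sub-suffix (the 'parent' property name is assumed from lib389 
conventions):

# Hedged sketch of a CI test along these lines: create a parent suffix
# and a nested sub-suffix so the mapping tree sorting is exercised.
from lib389.topologies import topology_st  # noqa: F401 (pytest fixture)
from lib389.backend import Backends

PARENT = "dc=example,dc=com"
CHILD = "ou=x,ou=y,dc=example,dc=com"

def test_nested_suffix_mapping(topology_st):
    inst = topology_st.standalone
    bes = Backends(inst)
    bes.create(properties={"cn": "parentRoot", "nsslapd-suffix": PARENT})
    bes.create(properties={"cn": "childRoot",
                           "nsslapd-suffix": CHILD,
                           "parent": PARENT})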


Yep, these are some great test ideas. I can add these.




I want tests that are adding and removing subsuffixes, and sub-subsuffixes, and 
making sure ldap ops work, and replication, etc.  I want tests that use many 
different suffixes at the same time and many subsuffixes - some customers have 
50 subsuffixes.  Our current CI test suite does not have these kinds of tests, 
and we need them.

I have already checked with replication suite too, and of course, with ASAN. I 
think that these also are good to have added in general, so I can expand the 
testing to include more suffixes too.

Do you see 50 subsuffixes in a single level nesting or deeper? I can do some 
shallow nesting and deep nesting hierarchies with that kind of number if you 
want. I think an interesting test would also be to have

ou=x,ou=y,dc=example,dc=com

dc=example,dc=com

and then add 

[389-devel] Re: tests take minutes to start

2020-05-13 Thread thierry bordaz



On 5/13/20 9:15 AM, Viktor Ashirov wrote:

On Wed, May 13, 2020 at 9:13 AM William Brown  wrote:




On 13 May 2020, at 17:01, Viktor Ashirov  wrote:

Hi,

On Wed, May 13, 2020 at 8:31 AM William Brown  wrote:

Hi all,

I noticed today that my tests now take minutes to start executing. It looks 
like it's spinning on:

dirsrv   84605 12.8  0.1  16672  7704 pts/0S+   16:25   0:08 grep -rh 
^@pytest.mark.\(ds\|bz\)[0-9]\+

Do we know anything about this? Did we add something in a fixture or something 
to grep for tests? That kind of pattern does look like our bz/ds here, so I 
suspect it comes from us.

It is this change:
https://pagure.io/389-ds-base/c/6a7a154159583c09fcbba0578eaf576d577ccb11?branch=master
But for me on Fedora it doesn't take minutes:
$ time grep -rh ^@pytest.mark.\(ds\|bz\)[0-9]\+

real 0m0.144s
user 0m0.093s
sys 0m0.050s

How are you running your tests? Is it on OpenSUSE or some other OS?

It's a known IO performance issue inside of docker.

Do you mount a volume with git/tests inside of the container or it's
in the container FS itself?


Note that I also noticed that delay before the tests start running. I am 
running those tests directly on my laptop. Thanks for the diagnosis.

Thanks.



—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



--
Viktor
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org






[389-devel] Re: Please have a look at rewriters design

2020-04-01 Thread thierry bordaz

Hello,

I agree that the term generic is not appropriate. I should change it 
(design/PR) if it still exists somewhere.


https://pagure.io/389-ds-base/pull-request/50981 is said to extend the 
usability of existing interfaces, and I think that is what it does.
People needing to map/rewrite/transform (whatever the word) an 
attribute/value know what they want to obtain but usually do not want to 
carry the burden of writing/deploying a new plugin.


In my mind only the rewriter is complex and knows when/how it applies 
(attribute, scope, crafting values, authentication...), so I wanted to 
keep the interface very simple: just load your rewriter and let the core 
server call it. William raised that it could contain helper functions, 
for example walking a filter and calling a rewriter function for each 
filter component. I am looking at that at the moment.
I think that a rewriter may also appreciate some configuration area, for 
example if a rewriter is generic and applies transformation rules 
specific to a rewriter instance.
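
As a concrete picture of that "very simple" interface, a hedged sketch of 
the kind of config entry the design describes (the DN, objectclass, and 
attribute names follow the draft and may differ from the final PR):

# Hedged sketch: register a rewriter through a config entry, per the
# draft design. Entry location and attribute names are assumptions.
import ldap
import ldap.modlist

conn = ldap.initialize("ldap://localhost:389")
conn.simple_bind_s("cn=directory manager", "password")
entry = {
    "objectClass": [b"top", b"extensibleObject"],
    "cn": [b"adfilter"],
    "nsslapd-libPath": [b"libadfilter-rewriter"],      # shared lib to symload
    "nsslapd-filterRewriter": [b"adfilter_rewrite"],   # filter rewrite entry point
}
conn.add_s("cn=adfilter,cn=rewriters,cn=config", ldap.modlist.addModlist(entry))
conn.unbind_s()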


I agree that it needs to be documented and plugin guide is a good place. 
I would like to use the design to describe the interfaces.


best regards
thierry

On 4/1/20 9:24 AM, Ludwig Krispenz wrote:
Ok, so Thierry's solution is useful to make using rewriters simpler, 
but I really want to have its use and interface documented somewhere 
outside the code, PR, or design doc on the 389ds wiki - it needs to go 
into the official doc, e.g. the plugin guide.


Regards,
Ludwig

On 04/01/2020 01:02 AM, William Brown wrote:



On 1 Apr 2020, at 01:04, Ludwig Krispenz  wrote:

Hi,

I was away and am late in the discussion, maybe too late.

Not too late, it's not released in production yet ;). There are two 
PR's that have been discussed here:


https://pagure.io/389-ds-base/pull-request/50988

https://pagure.io/389-ds-base/pull-request/50981

In my understanding, what you mean by "generic" is that for a new 
rewriter you do not need a plugin, but rather provide some rewrite 
functions and specify them in a rewriters config entry. But there is 
still the need to write the rewriter functions, compile and deploy 
them, and instead of plugins you now have a new interface of 
"filterRewriter" and "returnedAttrRewriter" functions - so far not 
documented anywhere.


Under a generic rewriter I would understand an approach where you do 
not need to provide your own functions, but have a rewriter plugin which 
does rewriting based on rules in rewrite config entries, e.g. in which 
subtree, for which entries (a filter to select), how to map a search 
filter, how to rename attrs on return.
I had the same feeling too: having these statically in libslapd would be 
much simpler than resolving symbols and dlopen. However, it's looking 
more like it will be plugin style, but without using the current 
slapi plugin architecture - just a symload at start up. The reason 
for this, as Thierry explained, is that FreeIPA plans to link to 
samba or sssd as part of one of the rewriter classes, and we don't 
want that to become a dependency of 389-ds.


I have argued in the past for a "lib-ipa" that has the needed shared 
logic between various pieces of the project, but honestly, I forgot 
if that ever happened. I think these days sssd is libipa in a lot of 
ways ...


Anyway, that's why Thierry wants to have a symload in this case :)


Best regards,
Ludwig


On 03/19/2020 01:09 AM, William Brown wrote:

On 19 Mar 2020, at 04:08, thierry bordaz  wrote:



On 3/18/20 1:51 AM, William Brown wrote:
On 18 Mar 2020, at 04:08, thierry bordaz  
wrote:


Hi William,

I updated the design according to our offline exchange
Thanks Thierry, I appreciate the conversation and the updates to 
the document: it made clear there were extra details up in your 
brain but not in words yet :) it's always hard to remember all 
the details as we write things, so thanks for the discussion. 
Like you said, it's always good to have a team who is really 
invested and cares about the work we do!



Your design for the core server version looks much better! Thank 
you. I still think there are some missing points. The reason to 
have a libpath rather than built-in is to avoid a potential 
linking to sssd/samba. I think also that the problem space of the 
global catalog here needs to be looked at too. This feature is 
not in isolation, it's really a part of that.
Okay, I will work on a new PR making the core server able to 
retrieve/register rewriters.


I think the "need" to improve the usability of rewriters is not 
specific to global catalog. Global Catalog is just an opportunity 
to implement it. I think parts of slapi-nis, integration of 
vsphere, GC (and likely others) are also use cases for rewriters. 
They were implemented in different ways because rewriters were not 
easy to use or simply not known.
Yes, that's perfectly reasonable, and shouldn't stop your idea from 
being created - what's concerning me is that without a

[389-devel] Re: Adding new syntaxes

2020-03-20 Thread thierry bordaz

Hi William,

I only have a vague knowledge of syntaxes/MR.

Each syntax is a plugin. Its init function registers, for a given set of 
OIDs, the matching rules (compare, order, substring) that handle that 
syntax (it calls slapi_matchingrule_register).
There is a special collation plugin that does the same for the supported 
languages.
So an entryUUID syntax should define its matching rule callbacks and 
register them for the supported OIDs.


The MRs are called during filter evaluation, both at candidate list build 
and at filter match.

On the write path, they are called to generate the index keys.

I think there is a slight difference between the syntax plugins and the 
collation plugin in the way they are selected to apply for a given 
attribute:
syntax plugins provide the set of supported OIDs, while for collation you need 
to call the index to know if it supports the OID.


All of these are general ideas around syntax/MR, and I think they are 
quite correct.


best regards
thierry


On 3/20/20 4:37 AM, William Brown wrote:

Hi there,

I'm looking to add the syntaxes to handle entryUUID properly, because they have 
a different format to nsUniqueId. Thinking that I need to look at the plugins 
under ldap/servers/plugins/syntaxes/, but it would be good to have some extra 
insight about the plugin hooks. Should I look at the old plugin guide? Or is 
there some extra info I can get from somewhere?

Thanks!

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org

___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org


[389-devel] Re: Please have a look at rewriters design

2020-03-18 Thread thierry bordaz



On 3/18/20 1:51 AM, William Brown wrote:



On 18 Mar 2020, at 04:08, thierry bordaz  wrote:

Hi William,

I updated the design according to our offline exchange

Thanks Thierry, I appreciate the conversation and the updates to the document: 
it made clear there were extra details up in your brain but not in words yet :) 
it's always hard to remember all the details as we write things, so thanks for 
the discussion. Like you said, it's always good to have a team who is really 
invested and cares about the work we do!


Your design for the core server version looks much better! Thank you. I still 
think there are some missing points. The reason to have a libpath rather than 
built-in is to avoid potentially linking to sssd/samba. I think also that the 
problem space of the global catalog here needs to be looked at too. This 
feature is not in isolation, it's really a part of that.
Okay, I will work on a new PR making the core server able to 
retrieve/register rewriters.


I think the "need" to improve the usability of rewriters is not specific 
to global catalog. Global Catalog is just an opportunity to implement 
it. I think parts of slapi-nis, integration of vsphere, GC (and likely 
others) are also use cases for rewriters. They were implemented in 
different ways because rewriters were not easy to use or simply not known.


This means we have a whole set of deployment cases to look at.

So the deployment will look like:

IPA DS --> IPA GC


So an ipaAccount from the IPA DS instance will be "copied and transformed" into 
the IPA GC. This process is as yet undefined (it sounds like it may be offline or 
something else ...). We are no longer dealing with one instance, but with an out-of-band 
replication and transformation process. It's unclear whether the data transform is during 
this loading process, or in the IPA GC somehow.

 From what I understand, it sounds like a method to take an ipaAccount and 
transform it to an AD GC account stub. Then inside of that IPA GC there are 
some virtual attributes you wish to add like objectSid binary vs string 
representations, objectCategory, maybe others.
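
As a concrete example of one such transformation, the objectSid string form 
maps mechanically to the binary form: a 1-byte revision, a 1-byte 
sub-authority count, a 6-byte big-endian identifier authority, then each 
sub-authority as a little-endian uint32. A standalone sketch, independent of 
any 389-ds or Samba API:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Convert "S-1-5-21-..." to the binary SID layout; returns the binary
 * length, or 0 on error.  Error handling is minimal for the sketch. */
static size_t
sid_string_to_binary(const char *s, uint8_t *out, size_t outlen)
{
    unsigned long rev, auth; /* authority is 48 bits, fits unsigned long */
    uint32_t subs[15];
    size_t nsubs = 0;
    int n = 0;

    if (sscanf(s, "S-%lu-%lu%n", &rev, &auth, &n) != 2)
        return 0;
    for (const char *p = s + n; *p == '-' && nsubs < 15; )
        subs[nsubs++] = (uint32_t)strtoul(++p, (char **)&p, 10);
    if (outlen < 8 + 4 * nsubs)
        return 0;
    out[0] = (uint8_t)rev;
    out[1] = (uint8_t)nsubs;
    for (int i = 0; i < 6; i++)            /* authority: big-endian */
        out[2 + i] = (uint8_t)(auth >> (8 * (5 - i)));
    for (size_t i = 0; i < nsubs; i++) {   /* sub-authorities: little-endian */
        out[8 + 4 * i + 0] = (uint8_t)subs[i];
        out[8 + 4 * i + 1] = (uint8_t)(subs[i] >> 8);
        out[8 + 4 * i + 2] = (uint8_t)(subs[i] >> 16);
        out[8 + 4 * i + 3] = (uint8_t)(subs[i] >> 24);
    }
    return 8 + 4 * nsubs;
}

int
main(void)
{
    uint8_t buf[68];
    size_t len = sid_string_to_binary("S-1-5-21-1-2-3-500", buf, sizeof buf);

    for (size_t i = 0; i < len; i++)
        printf("%02x", buf[i]);
    putchar('\n');
    return 0;
}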

So from our discussion, we have currently focused on "how do we transform entries 
within a single directory server". But that's not the problem here. We are saying:

"We take an entry from IPA DS, transform it to an IPA GC stub entry, and then apply a set of 
further "in memory" transformations"


One of the biggest issues with the GC is schema. IPA DS and IPA GC have 
incompatible schemas; they cannot be in the same replication topology.
So provisioning the IPA GC requires transformation rules to present 
another "view" of the IPA DS data. Those transformations will be on the write 
path (i.e. stored in the DB/indexed). This transformation work is almost 
done and is completely independent of 389-ds.
All of this is the "write" path: provisioning (online or offline) and 
transformation.


The problem for the IPA GC is now on the "read" path. AD clients are used to 
smart shortcuts/controls that a GC supports.
It is the IPA GC instance that will register the rewriters so that it 
behaves as a GC does.





If that's the process, why not do all the transforms as required in the DS -> GC 
load process? You raised a critical point - we have a concern about the write 
path as the transform point due to IO or the time taken by the transform, but it 
sounds like you have to do this anyway as an element of the DS -> GC process.


Some of the transformation rules on the write path are quite complex. 
Looking at the slapi-nis config entries gives an idea of what is needed. In 
addition to those transformations, DS to GC online provisioning is not 
simple at all: relying on sync-repl, you then need to transform a 
received entry into an update. At the moment it is an offline 
provisioning via transformation and import (much simpler).


To be honest, I am afraid that the transformation rules will end up being a 
rewrite of slapi-nis.


I think every time I have spoken to you about this I have kept learning more and more, 
and the more I see, the more concerns I have about this feature. I think we do 
not have the full picture. You have admitted that you don't know the full extent of the 
ideas here. There is clearly a communication breakdown between the IPA project and our 
team; they aren't telling us what they want. It sounds like they are asking you to just do 
"a small piece", but only they know the bigger picture.

The IPA project has the following designs:

https://www.freeipa.org/page/V4/Global_Catalog_Support

https://www.freeipa.org/page/V4/Global_Catalog_HLD

https://www.freeipa.org/page/V4/Global_Catalog_Access_Control

https://www.freeipa.org/page/V4/Global_Catalog_Data_Transformation

This also links to:

https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc737410(v=ws.10)?redirectedfrom=MSDN

The free

[389-devel] Re: Please have a look at rewriters design

2020-03-17 Thread thierry bordaz

Hi William,

I updated the design according to our offline exchange

regards
thierry

On 3/17/20 11:12 AM, thierry bordaz wrote:



On 3/17/20 2:42 AM, William Brown wrote:



On 17 Mar 2020, at 02:49, thierry bordaz  wrote:

Hi,

As a follow up of the PR 
https://pagure.io/389-ds-base/pull-request/50939,
I wrote down a small design about  rewriters (filter/computed_attr) 
plugin: http://www.port389.org/docs/389ds/design/search_rewriters.html


Comments are welcome

Probably the most dangerous thing to say in all of history?
Well, decisions are dangerous. Sharing your wise comments reduces the 
risk of bad decisions ;)

Be assured that I sincerely appreciate your feedback.


Like, your design is very smart, but that cleverness and flexibility 
carry many risks. The problem at hand is rewriting AD attributes - 
not making a framework. I still say focus on that problem alone 
rather than trying to solve a generic class of problems.


Anyway, I still don't think this is the right avenue. There are two 
major reasons for this:


First, is the attempt to make a "generic framework" to solve a 
"specific problem". We should not have a generic rewrite framework, 
when all we need is a specific, focused, module just for doing known 
and well tested attribute transformations.


Code like COS or MEP may be generic, and it solves many cases but the 
surface area is huge, it's hard to test, and it's hard to reason about.


We do not have a need for allowing generic, and arbitrary rewriters 
to exist, especially not when you have to "compile in" the rewriters 
anyway!


Rewriting attributes is not a problem in itself; it is something LDAP clients 
genuinely need. But I agree that rewriting attributes is not that easy.


Clearly we have been hitting regular demand to rewrite attributes 
and attribute values. Many plugins (cos, mep, addn, roles, views, 
slapi-nis, the filter/attribute rewriters, and now the AD attributes and 
vsphere integration) are related to rewriting attributes/values. This has 
always been a big need. Many parts of those plugins are similar 
(finding a pattern, a scope, crafting values, ...) but implemented in slightly 
different ways. Those plugins are generic and already let the client 
select, through config, the specific transformation they need. This 
design does not introduce a new generic plugin; it just simplifies the 
use of already supported interfaces.


IMHO those interfaces are clever because they are flexible and open. They 
do not force rewriters into the strict and limited abilities of plugins 
(like cos, mep, ...) and let them be as complex as they need to be to match 
their needs.




This should simply be an "AD rewrite" plugin, where all it does is 
that one thing - rewrite the attributes as required for AD emulation 
for IPA. This is far easier to deploy, test and reason about. 
Ideally, the configuration is simply "the plugin is enabled or 
disabled".



Second is the idea of this being a "search rewriter". I don't think 
this is a good idea. The search path should be simple; it's our hot 
path. We have many things that have to interact, like indexes. 
Look at virtual attribute indexing and the work needed for 
COS to have those used.


This plugin should be on the write path, transforming when a change 
occurs. This means the code is much simpler, easier to test, and we 
need no modifications to our read paths. Things like MEP and 
replication will "just work" as will indexing and much more.


I disagree here. Many times the write path is just not possible. 
Because of schema or historical reasons, the entries already exist and 
will not be updated. The customer just wants to see them in a 
transformed way. Sometimes they cannot even run a batch load to 
provision the missing attributes/values.



For me to approve this plugin, I really want to see it being a 
write-path transformation of values into other values, and it should 
be focused, targeted, and simple.


I do want to make one thing clear though - I think it's much better 
that this plugin exist in 389-ds rather than in freeipa. The 389-ds 
project has better tooling (like ASAN/LSAN), faster testing 
capability and a group of subject matter experts for code review. I 
think that if you were to move this to freeipa, you would not have 
the same level of testing or review quality as here, so I'd prefer to 
see you put it here. Sure, I might be difficult on this topic, but I 
do it because I believe there is a better, more robust way to 
approach this problem space than the one you are currently considering. :)


I agree with you. I prefer the rewriter callback to be part of 389-ds 
because I think the more rewriter samples there are, the easier it is for a 
developer to write their own.


best regards
thierry



Thanks,



best regards
thierry
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedo


[389-devel] Please have a look at rewriters design

2020-03-16 Thread thierry bordaz

Hi,

As a follow up of the PR https://pagure.io/389-ds-base/pull-request/50939,
I wrote down a small design about  rewriters (filter/computed_attr) 
plugin: http://www.port389.org/docs/389ds/design/search_rewriters.html


Comments are welcome

best regards
thierry
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org


[389-devel] Re: Thoughts on swapping to rfc2307bis.ldif by default

2020-03-03 Thread thierry bordaz



On 3/3/20 11:59 AM, Alexander Bokovoy wrote:

On ti, 03 maalis 2020, Ludwig Krispenz wrote:

Hi,

I have no expertise in this area, but would like to get also 
Alexander's opinion and view from IPA


I don't have much to add to what Thierry and William covered already.
Having a new draft with clarifications would be nice.

Given that both 10rfc2307.ldif and 10rfc2307bis.ldif are present in
default 389-ds deployment, why this schema conflict isn't a problem
right now?


Good point :)
10rfc2307bis.ldif is in /share/dirsrv/data and 10rfc2307.ldif in 
/share/dirsrv/schema.
Only 'schema' definitions are loaded in 'cn=schema'. The definitions in 
'data' are available for users but are not part of QE scope.


I guess most users choose one RFC or the other, and then the 
entries will conform to the chosen RFC.
A risk exists if we move a dataset from one RFC to the other, either 
during an import or when instances in the same replicated topology 
create incompatible entries.


regards
thierry




Regards,
Ludwig

On 03/03/2020 10:17 AM, thierry bordaz wrote:



On 3/3/20 4:12 AM, William Brown wrote:



On 3 Mar 2020, at 11:18, William Brown  wrote:




On 3 Mar 2020, at 04:32, thierry bordaz  wrote:



On 3/2/20 7:24 AM, William Brown wrote:

Hi all,

As you may know, I'm currently working on a migration utility to 
help move from other ldap servers to 389-ds. Something that I 
have noticed in this process is that other servers default to 
rfc2307bis.ldif [0] by default. As part of the migration I would 
like to handle this situation a bit better. It's likely not 
viable for me to simply plaster rfc2307bis into 99user.ldif as 
part of the migration process, so I want to approach this better.


rfc2307 and rfc2307bis are incompatible schemas that redefine 
the same OIDs with new/different meanings. Some key examples:


* posixGroup in rfc2307 only requires gidNumber, rfc2307bis 
requires cn and gidNumber.

Isn't it the opposite?

I was reading the schema as I was reading this.
I need to apologise for being so short in this answer! Thierry was 
correct in this case.


Here is the full set of differences between the two:

uidNumber: +EQUALITY integerMatch
gidNumber: +EQUALITY integerMatch
gecos: +EQUALITY caseIgnoreIA5Match SUBSTR 
caseIgnoreIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
-SYNTAX 1.3.6.1.4.1.1466.115.121.1.15

homeDirectory: +EQUALITY caseExactIA5Match
loginShell: +EQUALITY caseExactIA5Match
shadowLastChange: +EQUALITY integerMatch
shadowMin: +EQUALITY integerMatch
shadowMax: +EQUALITY integerMatch
shadowWarning: +EQUALITY integerMatch
shadowInactive: +EQUALITY integerMatch
shadowExpire: +EQUALITY integerMatch
shadowFlag: +EQUALITY integerMatch
memberUid: +EQUALITY caseExactIA5Match
memberNisNetgroup: +EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch

nisNetgroupTriple: +EQUALITY caseIgnoreIA5Match
ipServicePort: +EQUALITY integerMatch
ipServiceProtocol: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipProtocolNumber: +EQUALITY integerMatch
oncRpcNumber: +EQUALITY integerMatch
ipHostNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetworkNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetmaskNumber: +EQUALITY caseIgnoreIA5Match SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
macAddress: +EQUALITY caseIgnoreIA5Match SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15

bootParameter: +EQUALITY caseExactIA5Match
nisMapName: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
nisMapEntry: +EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch


+ attributeTypes: ( 1.3.6.1.1.1.1.28 NAME 'nisPublicKey' DESC 'NIS 
public key' EQUALITY octetStringMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.29 NAME 'nisSecretKey' DESC 'NIS 
secret key' EQUALITY octetStringMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.30 NAME 'nisDomain' DESC 'NIS 
domain' EQUALITY caseIgnoreIA5Match SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 )
+ attributeTypes: ( 1.3.6.1.1.1.1.31 NAME 'automountMapName' DESC 
'automount Map Name' EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.32 NAME 'automountKey' DESC 
'Automount Key value' EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.33 NAME 'automountInformation' 
DESC 'Automount information' EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
SINGLE-VALUE )


posixAccount:
shadowAccount:
posixGroup: +AUXILLARY -MUST cn STRUCTURAL
ipService:
ipProtocol:
oncRpc:
ipHost: - MAY o $ ou $ owner $ seeAlso $ serialNumber +MAY 
userPassword

ipNetwork: -MUST cn +MAY cn
nisNetgroup:
nisMap:
nisObject:
ieee802Device: -MUST cn MAY description $ l $ o $ ou

[389-devel] Re: Thoughts on swapping to rfc2307bis.ldif by default

2020-03-03 Thread thierry bordaz



On 3/3/20 4:12 AM, William Brown wrote:



On 3 Mar 2020, at 11:18, William Brown  wrote:




On 3 Mar 2020, at 04:32, thierry bordaz  wrote:



On 3/2/20 7:24 AM, William Brown wrote:

Hi all,

As you may know, I'm currently working on a migration utility to help move from 
other ldap servers to 389-ds. Something that I have noticed in this process is 
that other servers default to rfc2307bis.ldif [0] by default. As part of the 
migration I would like to handle this situation a bit better. It's likely not 
viable for me to simply plaster rfc2307bis into 99user.ldif as part of the 
migration process, so I want to approach this better.

rfc2307 and rfc2307bis are incompatible schemas that redefine the same OIDs 
with new/different meanings. Some key examples:

* posixGroup in rfc2307 only requires gidNumber, rfc2307bis requires cn and 
gidNumber.

Isn't it the opposite?

I was reading the schema as I was reading this.

I need to apologise for being so short in this answer! Thierry was correct in 
this case.

Here is the full set of differences between the two:

uidNumber: +EQUALITY integerMatch
gidNumber: +EQUALITY integerMatch
gecos: +EQUALITY caseIgnoreIA5Match SUBSTR caseIgnoreIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
homeDirectory: +EQUALITY caseExactIA5Match
loginShell: +EQUALITY caseExactIA5Match
shadowLastChange: +EQUALITY integerMatch
shadowMin: +EQUALITY integerMatch
shadowMax: +EQUALITY integerMatch
shadowWarning: +EQUALITY integerMatch
shadowInactive: +EQUALITY integerMatch
shadowExpire: +EQUALITY integerMatch
shadowFlag: +EQUALITY integerMatch
memberUid: +EQUALITY caseExactIA5Match
memberNisNetgroup: +EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch
nisNetgroupTriple: +EQUALITY caseIgnoreIA5Match
ipServicePort: +EQUALITY integerMatch
ipServiceProtocol: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipProtocolNumber: +EQUALITY integerMatch
oncRpcNumber: +EQUALITY integerMatch
ipHostNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetworkNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetmaskNumber: +EQUALITY caseIgnoreIA5Match SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
macAddress: +EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
-SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
bootParameter: +EQUALITY caseExactIA5Match
nisMapName: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
nisMapEntry: +EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch

+ attributeTypes: ( 1.3.6.1.1.1.1.28 NAME 'nisPublicKey' DESC 'NIS public key' 
EQUALITY octetStringMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.29 NAME 'nisSecretKey' DESC 'NIS secret key' 
EQUALITY octetStringMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.30 NAME 'nisDomain' DESC 'NIS domain' 
EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
+ attributeTypes: ( 1.3.6.1.1.1.1.31 NAME 'automountMapName' DESC 'automount 
Map Name' EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.32 NAME 'automountKey' DESC 'Automount Key 
value' EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.33 NAME 'automountInformation' DESC 
'Automount information' EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )

posixAccount:
shadowAccount:
posixGroup: +AUXILLARY -MUST cn STRUCTURAL
ipService:
ipProtocol:
oncRpc:
ipHost: - MAY o $ ou $ owner $ seeAlso $ serialNumber +MAY userPassword
ipNetwork: -MUST cn +MAY cn
nisNetgroup:
nisMap:
nisObject:
ieee802Device: -MUST cn MAY description $ l $ o $ ou $ owner $ seeAlso $ 
serialNumber
bootableDevice: -MUST cn MAY description $ l $ o $ ou $ owner $ seeAlso $ 
serialNumber
nisMap: +OID 1.3.6.1.1.1.2.9 -OID 1.3.6.1.1.1.2.13

+ objectClasses: ( 1.3.6.1.1.1.2.14 NAME 'nisKeyObject' SUP top AUXILIARY DESC 
'An object with a public and secret key' MUST ( cn $ nisPublicKey $ 
nisSecretKey ) MAY ( uidNumber $ description ) )
+ objectClasses: ( 1.3.6.1.1.1.2.15 NAME 'nisDomainObject' SUP top AUXILIARY 
DESC 'Associates a NIS domain with a naming context' MUST nisDomain )
+ objectClasses: ( 1.3.6.1.1.1.2.16 NAME 'automountMap' SUP top STRUCTURAL MUST 
( automountMapName ) MAY description )
+ objectClasses: ( 1.3.6.1.1.1.2.17 NAME 'automount' SUP top STRUCTURAL DESC 
'Automount information' MUST ( automountKey $ automountInformation ) MAY 
description ) ## namedObject is needed for groups without members
+ objectClasses: ( 1.3.6.1.4.1.5322.13.1.1 NAME 'namedObject' SUP top 
STRUCTURAL MAY cn )




* ipServiceProtocol, ipHostNumber, ipNetworkNumber and nisMapName change from "sup name" 
to "syntax 1.3.6.1.4.1.1466.115.121.1.15"

[389-devel] Re: Thoughts on swapping to rfc2307bis.ldif by default

2020-03-03 Thread thierry bordaz



On 3/3/20 4:12 AM, William Brown wrote:



On 3 Mar 2020, at 11:18, William Brown  wrote:




On 3 Mar 2020, at 04:32, thierry bordaz  wrote:



On 3/2/20 7:24 AM, William Brown wrote:

Hi all,

As you may know, I'm currently working on a migration utility to help move from 
other ldap servers to 389-ds. Something that I have noticed in this process is 
that other servers default to rfc2307bis.ldif [0] by default. As part of the 
migration I would like to handle this situation a bit better. It's likely not 
viable for me to simply plaster rfc2307bis into 99user.ldif as part of the 
migration process, so I want to approach this better.

rfc2307 and rfc2307bis are incompatible schemas that redefine the same OIDs 
with new/different meanings. Some key examples:

* posixGroup in rfc2307 only requires gidNumber, rfc2307bis requires cn and 
gidNumber.

Isn't it the opposite?

I was reading the schema as I was reading this.

I need to apologise for being so short in this answer! Thierry was correct in 
this case.

Here is the full set of differences between the two:

uidNumber: +EQUALITY integerMatch
gidNumber: +EQUALITY integerMatch
gecos: +EQUALITY caseIgnoreIA5Match SUBSTR caseIgnoreIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26
I have a doubt about this new definition, which changes the syntax/MR from 
directoryString to IA5.
directoryString is 8-bit (UTF-8) while IA5 is 7-bit, so entries may exist 
with gecos values that are incompatible with the new matching rule.

It is the same for ipNetmaskNumber and macAddress

-SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
homeDirectory: +EQUALITY caseExactIA5Match
loginShell: +EQUALITY caseExactIA5Match
shadowLastChange: +EQUALITY integerMatch
shadowMin: +EQUALITY integerMatch
shadowMax: +EQUALITY integerMatch
shadowWarning: +EQUALITY integerMatch
shadowInactive: +EQUALITY integerMatch
shadowExpire: +EQUALITY integerMatch
shadowFlag: +EQUALITY integerMatch
memberUid: +EQUALITY caseExactIA5Match
memberNisNetgroup: +EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch
nisNetgroupTriple: +EQUALITY caseIgnoreIA5Match
ipServicePort: +EQUALITY integerMatch
ipServiceProtocol: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipProtocolNumber: +EQUALITY integerMatch
oncRpcNumber: +EQUALITY integerMatch
ipHostNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetworkNumber: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
ipNetmaskNumber: +EQUALITY caseIgnoreIA5Match SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
macAddress: +EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 
-SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
bootParameter: +EQUALITY caseExactIA5Match
nisMapName: +SUP name -SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
nisMapEntry: +EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch

+ attributeTypes: ( 1.3.6.1.1.1.1.28 NAME 'nisPublicKey' DESC 'NIS public key' 
EQUALITY octetStringMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.29 NAME 'nisSecretKey' DESC 'NIS secret key' 
EQUALITY octetStringMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.40 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.30 NAME 'nisDomain' DESC 'NIS domain' 
EQUALITY caseIgnoreIA5Match SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )
+ attributeTypes: ( 1.3.6.1.1.1.1.31 NAME 'automountMapName' DESC 'automount 
Map Name' EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.32 NAME 'automountKey' DESC 'Automount Key 
value' EQUALITY caseExactIA5Match SUBSTR caseExactIA5SubstringsMatch SYNTAX 
1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )
+ attributeTypes: ( 1.3.6.1.1.1.1.33 NAME 'automountInformation' DESC 
'Automount information' EQUALITY caseExactIA5Match SUBSTR 
caseExactIA5SubstringsMatch SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )

posixAccount:
shadowAccount:
posixGroup: +AUXILLARY -MUST cn STRUCTURAL
ipService:
ipProtocol:
oncRpc:
ipHost: - MAY o $ ou $ owner $ seeAlso $ serialNumber +MAY userPassword
So we may have ipHost entries with 'o'/'ou'... that will not conform to the 
schema

ipNetwork: -MUST cn +MAY cn
nisNetgroup:
nisMap:
nisObject:
ieee802Device: -MUST cn MAY description $ l $ o $ ou $ owner $ seeAlso $ 
serialNumber
bootableDevice: -MUST cn MAY description $ l $ o $ ou $ owner $ seeAlso $ 
serialNumber
Does it mean ieee802Device/bootableDevice entries (containing 'cn') will not 
conform to the schema?

nisMap: +OID 1.3.6.1.1.1.2.9 -OID 1.3.6.1.1.1.2.13
I do not recall whether we use the OID in table lookups, but it could be a 
problem.
I am not sure whether the replicated new definition will collide with the 
old one


+ objectClasses: ( 1.3.6.1.1.1.2.14 NAME 'nisKeyObject' SUP top AUXILIARY DESC 
'An object with a public and secret key' MUST ( cn $ nisPublicKey $ 
nisSecretKey ) MAY ( uidNumber $ description ) )
+ objectClasses: ( 1.3.6.1.1.1.2.15 NAME 'nisDomainObject' SUP top AUXILIARY 
DESC 'Associates a NIS domain with a naming

[389-devel] Re: Thoughts on swapping to rfc2307bis.ldif by default

2020-03-02 Thread thierry bordaz



On 3/2/20 7:24 AM, William Brown wrote:

Hi all,

As you may know, I'm currently working on a migration utility to help move from 
other ldap servers to 389-ds. Something that I have noticed in this process is 
that other servers default to rfc2307bis.ldif [0] by default. As part of the 
migration I would like to handle this situation a bit better. It's likely not 
viable for me to simply plaster rfc2307bis into 99user.ldif as part of the 
migration process, so I want to approach this better.

rfc2307 and rfc2307bis are incompatible schemas that redefine the same OIDs 
with new/different meanings. Some key examples:

* posixGroup in rfc2307 only requires gidNumber, rfc2307bis requires cn and 
gidNumber.

Isn't it the opposite?

* ipServiceProtocol, ipHostNumber, ipNetworkNumber and nisMapName change from "sup name" 
to "syntax 1.3.6.1.4.1.1466.115.121.1.15". sup name is also syntax 
1.3.6.1.4.1.1466.115.121.1.15, so this change is minimal.
* posixGroup and posixAccount change from structural to auxiliary in rfc2307bis 
(allowing them to be combined with person or nsAccount).
Right, but 389-ds does not enforce the structural requirement, so it 
should not be a problem


Objectively, rfc2307bis is the better schema - but as with all proposals like 
this, there is always a risk of breaking customers or compatibility.

I agree on both :)


I'm wondering what would be a reasonable course of action for us to move to 
rfc2307bis by default. My current thoughts:

* have rfc2307bis vs rfc2307 as an option to dssetup so we use the correct 
schema in the setup.
* default the setup option to rfc2307bis
* Tests for handling both setup options
* Upgrades of the server should not affect the rfc2307 vs rfc2307bis status
* A dsctl tool to allow changing between the rfc2307/rfc2307bis.

Thoughts? Concern? Ideas? Comments?
It would be interesting to have a complete list of the differences. At 
the moment, with the listed differences, I think 2307bis would support 
2307 entries. In addition, 2307bis looks to be a superset of 2307, so 
it could be replicated in an MMR topology.


Because of some bug, 99user.ldif will contain all overridden 
definitions, not only the new/changed ones.


The idea of a dsctl tool looks good. It could be to create a task that 
checks that all entries conform to a schema. If all entries conform to 
2307bis, we could replace the default 2307 schema file with the 2307bis one.
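
As an illustration of one check such a task could run, here is a minimal 
libldap sketch that hunts for posixGroup entries relying on the rfc2307bis 
relaxation (no cn), which would not conform to rfc2307. The URL and base DN 
are placeholders, and a real check would cover the other incompatibilities 
listed above as well:

#include <ldap.h>
#include <stdio.h>

int
main(void)
{
    LDAP *ld = NULL;
    LDAPMessage *res = NULL;
    int ver = LDAP_VERSION3;

    if (ldap_initialize(&ld, "ldap://localhost:389") != LDAP_SUCCESS)
        return 1;
    ldap_set_option(ld, LDAP_OPT_PROTOCOL_VERSION, &ver);

    /* Entries matching this filter are valid rfc2307bis posixGroups
     * but violate the rfc2307 requirement that posixGroup MUST cn. */
    if (ldap_search_ext_s(ld, "dc=example,dc=com", LDAP_SCOPE_SUBTREE,
                          "(&(objectClass=posixGroup)(!(cn=*)))",
                          NULL, 0, NULL, NULL, NULL, LDAP_NO_LIMIT,
                          &res) == LDAP_SUCCESS) {
        for (LDAPMessage *e = ldap_first_entry(ld, res); e != NULL;
             e = ldap_next_entry(ld, e)) {
            char *dn = ldap_get_dn(ld, e);
            printf("not rfc2307-conformant: %s\n", dn);
            ldap_memfree(dn);
        }
        ldap_msgfree(res);
    }
    ldap_unbind_ext_s(ld, NULL, NULL);
    return 0;
}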



[0] https://tools.ietf.org/html/draft-howard-rfc2307bis-02

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



[389-devel] Re: [PATCH] prevent slapd from hanging under unlikely circumstances

2020-02-05 Thread thierry bordaz



On 2/5/20 2:30 AM, William Brown wrote:



On 5 Feb 2020, at 03:10, Ludwig Krispenz  wrote:

I think I can agree with 1-8. 9 is one solution to fix the problem you 
reported, but we have not yet validated that there are no other side effects; 
there are potential postop plugins which should NOT be called for a tombstone 
delete, e.g. the retro changelog. We need to investigate the side effects of 
the patch and evaluate other solutions; I suggested one in another reply.

For 10, the claim that not crying hurrah and merging a patch without further 
investigation and testing is a "doctrinal objection" is very strong, and I have to 
reject it.

Regards,
Ludwig

On 02/04/2020 05:00 PM, Jay Fenlason wrote:

Ok, let's take this from the top:

1: Defects that cause a server to become unresponsive are bad and must
be repaired as soon as possible.

I'm not objecting to this.


2: Some #1 class defects are exploitable and require a CVE.  As far as
I can tell, this one does not, but you should keep an eye out for the
possibility.

3: The #1 class defect I have found can be triggered in at least two
ways.  One requires ipa-replica-install and is hard to reproduce in a
test environment.  The other requires ldapdelete and is easy to
reproduce in an isolated VM.

Ipa replica install and ldapdelete of a tombstone/conflict both require 
cn=directory manager, which would automatically cause any reported CVE to be 
void - cn=directory manager is game over.


4: The defect is a mismatch between the plugin ABI as implemented by
389-ds, and the behavior the NIS plugin expects.

5: I have found no specification or documentation on said ABI, so I
can only guess what the correct behavior is here.

6: The ABI includes two functions in the plugin: pre_delete and
post_delete.

7: The NIS plugin expects that every call to pre_delete will be
followed by a call to post_delete.  It takes a lock in pre_delete and
releases it in post_delete.

But you didn't report an issue in NIS? You reported an issue with ldapdelete on 
tombstones and conflicts ... the slapi-nis plugin is maintained by freeipa and 
if an issue in it exists, please report it to them with reproduction steps.


Hi Jay,

Thanks for your great investigations/patch/summary.

I made the change in the NIS plugin to acquire/release the map lock in the 
pre/post ops. At that time I was assuming the pre/post callbacks were balanced.
I was wrong, and I had already hit a similar issue in #1694263. I updated the 
NIS plugin to work around that bug.


The new bug you are reporting is another example of pre/post sometimes 
not being balanced :(
The proposed fixes (prevent the preop call on tombstones, or force the postop 
call even on tombstones) look valid, but they have a wider impact and need 
additional investigation.
Another approach would be to make the NIS plugin ignore operations on 
tombstones. That also requires additional investigation.
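
To illustrate the workaround pattern, here is a minimal standalone sketch 
(the names are illustrative, not the actual slapi-nis symbols): the preop 
records that it took the lock, and the postop releases it only when that 
record exists, so an unmatched postop becomes harmless.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;

/* In a real plugin this flag would live in a per-operation extension;
 * a thread-local is enough for the sketch, since the pre/post ops of
 * one operation run on the same worker thread. */
static _Thread_local bool preop_locked = false;

static int
example_pre_delete(void)
{
    pthread_mutex_lock(&map_lock);
    preop_locked = true; /* remember that this operation holds the lock */
    return 0;
}

static int
example_post_delete(void)
{
    /* Tolerate a postop without a matching preop (the tombstone case). */
    if (preop_locked) {
        preop_locked = false;
        pthread_mutex_unlock(&map_lock);
    }
    return 0;
}

int
main(void)
{
    example_post_delete(); /* unmatched postop: harmless no-op */
    example_pre_delete();
    example_post_delete(); /* balanced pair */
    printf("no deadlock\n");
    return 0;
}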


best regards
thierry



8: Under some circumstances 389-ds can call pre_delete but fail to
call post_delete.  Because of #5, there is no way to tell if this is
expected behavior, but the NIS plugin clearly does not expect it.

9: My patch ensures that every call to pre_delete is followed by a
corresponding call to post_delete.  V1 of the patch also ensures that
every call to post_delete is after a corresponding pre_delete call.
V2 relaxes the second requirement, allowing post_delete calls without
a corresponding pre_delete call because someone expressed worry about
potential regressions.

10: You are refusing to merge my patch because of a doctrinal
objection to the use of ldapdelete in the simple reproducer.

It's not really a doctrinal objection. Replication is a really complex topic, 
and difficult to get correct. Ludwig knows a lot in this area, and has really 
good comments to make on the topic. I understand that you have an issue you 
have found in your setup, and you want to resolve it, but we have to consider 
the deployments of many thousands of other users and their replication 
experience too. For us it's important to be able to reproduce, and see the 
issue, to understand the consequences, and to make these changes carefully. 
Accepting the patch without a clear understanding of its consequences is not 
something we do.

At this time we still don't have a clear reproducer from you on "how" you 
constructed the invalid directory state. You reported the following:


I found the bug by doing a series of "ipa-client-install" (with lots
of arguments, followed by
echo ca_host = {a not-firewalled IPA CA} >> /etc/ipa/default.conf
echo [global] > /etc/ipa/installer.conf
echo ca_host = {ditto} >> /etc/ipa/installer.conf
echo {password} | kinit admin
ipa hostgroup-add-member ipaservers --hosts $(hostname -f)
ipa-replica-install --setup-ca --setup-dns --forwarder={ip addr}

followed by the replica install failing due to network issues,
misconfigured firewalls, etc, then
ipa-server-install --uninstall on the host
and ipa-replica-manage del {failed install host}
elsewhere in the mesh, sometimes with ldapdelete of the 

[389-devel] Re: System tap performance results,

2019-12-11 Thread thierry bordaz



On 12/11/19 1:21 AM, William Brown wrote:



On 10 Dec 2019, at 19:15, thierry bordaz  wrote:

Hi William,

Thanks for these very interesting results.
It would help if you could provide the stap scripts, to make sure of what you 
are accounting as the latency.

Yes, I plan to put them into a PR soon once I have done a bit more data 
collection and polishing of the script setup.


Also just to be sure is latency a synonym for response time ?

Yep, here I mean the execution time of a single operation.


Regarding the comparison of the tails for 1 client vs 16 clients: it looks to 
me that the tails are equivalent. The more clients we have, the more 
long-latency operations we see. So as a first approach I would rule out a 
contention effect.

I disagree, the tail is much more pronounced in the 16 client version, and there is a 
significant shift of response times from the 32768 bucket to 65536 indicating that many 
operations are now being "held up".


You are probably right.
There is a response-time shift, but I would expect a major contention 
effect to shift the graph further and flatten it toward higher response times.
Whatever the answer is, I think the important point is to agree on the 
method, and the reports do show something suspicious.





Regarding the ratio of shorter/longer latencies (assuming the searches are 
equivalent), it is interesting to know why we have such an effect. One possible 
cause I was thinking of is the impact of the DB threads (checkpointing or 
trickling). But while a similar long tail exists in ldbm_back, its absolute 
value is much lower than that of opshared_search: in short, ldbm_back accounted 
for at most 4 ms while opshared accounted for 67 ms. So there is something else 
(aci, network, frontend, ...).

Regarding USDT, I think it is a very good idea. However, just to share some 
recent stap experience, I found it intrusive. In short, I did not get the same 
throughput with and without it. In my case it was not a problem, as I wanted to 
investigate a reproducible contention. But if we want support/users/customers 
to use it, the performance impact in production will be a valid concern.

I haven't noticed any "intrusiveness" from USDT so far? How were you running 
the stap scripts?
I did not add probes in DS. I was just using stap to gather the duration of 
specific functions per operation (plugin, backend or core).


I do not recall the level of "intrusiveness", as I was looking more for 
contention data than for performance numbers.
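
For what it's worth, a USDT probe point is just a macro in the C code and 
compiles down to a single nop when no tracer is attached, which is why its 
overhead is normally negligible. A hedged sketch (requires sys/sdt.h from 
systemtap-sdt-devel; the provider/probe names are invented for illustration 
and are not probes 389-ds actually defines):

#include <sys/sdt.h>

static void
do_search_example(const char *base, int scope)
{
    DTRACE_PROBE2(ds_example, search_begin, base, scope);
    /* ... the actual search processing would run here ... */
    DTRACE_PROBE1(ds_example, search_end, base);
}

int
main(void)
{
    do_search_example("dc=example,dc=com", 2 /* subtree */);
    return 0;
}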





best regards
thierry



On 12/9/19 6:16 AM, William Brown wrote:

Hi all,

Following last week's flamegraph runs, I wanted to find more details on exactly 
what was happening. While the flamegraphs highlighted that a change existed 
between the single- and multi-threaded servers, they did not help to isolate 
where the change was occurring.

To understand this I have started to work on a set of systemtap scripts that we 
can use to profile 389ds. These will be included in a future PR.

Here are the histograms from an initial test of profiling "do_search"

1 thread

stap test_do_search.stp
^CDistribution of latencies (in nanoseconds) for 441148 samples
max/avg/min: 35911542/85694/38927
 value |-- count
  8192 |0
 16384 |0
 32768 |@@ 167285
 65536 |@  239280
131072 |@   25788
262144 |@8530
524288 |  252
   1048576 |6
   2097152 |1
   4194304 |3
   8388608 |0
  16777216 |2
  33554432 |1
  67108864 |0
134217728 |0

16 thread

  stap test_do_search.stp
^CDistribution of latencies (in nanoseconds) for 407806 samples
max/avg/min: 100315928/112407/39368
 value |-- count
  8192 |0
 16384 |0
 32768 |   100281
 65536 |@  249656
131072 |@@@ 37837
262144 |@@@ 18322
524288 | 1171
   1048576 |

[389-devel] Re: System tap performance results,

2019-12-10 Thread thierry bordaz

Hi William,

Thanks for these very interesting results.
It would help if you could provide the stap scripts, to make sure of what 
you are accounting as the latency.

Also just to be sure is latency a synonym for response time ?

Regarding the comparison of the tails for 1 client vs 16 clients: it looks 
to me that the tails are equivalent. The more clients we have, the more 
long-latency operations we see. So as a first approach I would rule out a 
contention effect.


Regarding the ratio of shorter/longer latencies (assuming the searches are 
equivalent), it is interesting to know why we have such an effect. One 
possible cause I was thinking of is the impact of the DB threads 
(checkpointing or trickling). But while a similar long tail exists in 
ldbm_back, its absolute value is much lower than that of opshared_search: in 
short, ldbm_back accounted for at most 4 ms while opshared accounted for 
67 ms. So there is something else (aci, network, frontend, ...).


Regarding USDT, I think it is a very good idea. However, just to share some 
recent stap experience, I found it intrusive. In short, I did not get the 
same throughput with and without it. In my case it was not a problem, as I 
wanted to investigate a reproducible contention. But if we want 
support/users/customers to use it, the performance impact in production 
will be a valid concern.


best regards
thierry



On 12/9/19 6:16 AM, William Brown wrote:

Hi all,

Following last week's flamegraph runs, I wanted to find more details on exactly 
what was happening. While the flamegraphs highlighted that a change existed 
between the single- and multi-threaded servers, they did not help to isolate 
where the change was occurring.

To understand this I have started to work on a set of systemtap scripts that we 
can use to profile 389ds. These will be included in a future PR.

Here are the histograms from an initial test of profiling "do_search"

1 thread

stap test_do_search.stp
^CDistribution of latencies (in nanoseconds) for 441148 samples
max/avg/min: 35911542/85694/38927
 value |-- count
  8192 |0
 16384 |0
 32768 |@@ 167285
 65536 |@  239280
131072 |@   25788
262144 |@8530
524288 |  252
   1048576 |6
   2097152 |1
   4194304 |3
   8388608 |0
  16777216 |2
  33554432 |1
  67108864 |0
134217728 |0

16 thread

  stap test_do_search.stp
^CDistribution of latencies (in nanoseconds) for 407806 samples
max/avg/min: 100315928/112407/39368
 value |-- count
  8192 |0
 16384 |0
 32768 |   100281
 65536 |@  249656
131072 |@@@ 37837
262144 |@@@ 18322
524288 | 1171
   1048576 |  203
   2097152 |   90
   4194304 |   74
   8388608 |   83
  16777216 |   58
  33554432 |   21
  67108864 |   10
134217728 |0
268435456 |0


It's interesting to note the tail latency here: in the 16-thread version we see 
67000 fewer samples in the 32768 bucket, shifting mostly into the 131072 and 
262144 buckets, as well as a much greater number of calls in the tail. With 1 
thread no operation made it to 67108864, but 10 did with 16 threads, along with 
~200 more higher than 1048576, and ~1500 more greater than 524288. This kind of 
tailing means we have "spikes" of latency throughout the execution, which then 
have a minor flow-on effect, increasing the latency of other operations.

These are all in 

[389-devel] Re: Future of nunc-stans

2019-10-10 Thread thierry bordaz



On 10/9/19 11:55 AM, Ludwig Krispenz wrote:

Hi William,

I like your radical approach :-)

In my opinion our connection code is getting too complicated by 
maintaining two different implementations in parallel - not 
separated, but intermingled (and made even more complicated by turbo mode). 
So I agree we should have only one, but which one? In my opinion nunc-stans 
is the theoretically better approach, but nobody is confident 
enough to rely on nunc-stans alone. The conntable mode has its 
problems (especially when handling many concurrent connections, and 
worse when they are established almost at the same time - otherwise we 
would not have experimented with nunc-stans), but it is stable and 
efficient enough for most of the use cases.


So reducing the complexity by removing nunc-stans (and maybe also 
turbo mode), and then doing cleanup and trying to improve the bottlenecks, 
would be an acceptable approach to me.


In my opinion the core of the problem with the "old" connection code is 
that the main thread handles both new connections and already 
established connections, and so it iterates over the connection table. 
Using an event model looks like the best way to handle this, but if it 
doesn't work we need to look for other improvements without breaking 
things.
Your suggestion to make the conntable data structure more lean and 
flexible is one option. In Sun DS, before I knew about event 
queues, I split the main thread: one thread handling new connections and 
multiple threads handling established connections (parts of the conn table) - 
reusing the existing mechanisms, just splitting the load. Maybe we can 
also think in this direction.


Handling new connections and established connections is a place where, 
out of the box, nunc-stans gives a performance boost (> +10% 
throughput). But because old parts of the code (connection and 
connection table) remain, with which nunc-stans is not fully integrated, it 
cannot show its full power (connection table), and this also leads to some 
design issues (deadlocks between a nunc-stans job and a connection, the 
ability to cancel a nunc-stans job).
In addition, supporting nunc-stans is a concern. I found it very 
difficult to investigate/debug, especially when the problem only occurs 
in a customer deployment.
I also find it difficult to maintain. For example, knowing the root cause of 
deadlocks or other bugs, I found it very difficult to say whether a given 
patch fixes a given root cause or not.


The Sun DS approach is less innovative but gave good results, at least for 
handling a high number of established connections.


I agree with Ludwig: this approach looks promising if we do need to 
improve DS connection handling and decide to no longer invest in nunc-stans.


thierry

Regards,
Ludwig

On 10/09/2019 01:32 AM, William Brown wrote:



On 9 Oct 2019, at 09:18, Rich Megginson  wrote:

On 10/8/19 4:55 PM, William Brown wrote:

Hi everyone,
In our previous catch up (about 4/5 weeks ago when I was visiting 
Matus/Simon), we talked about nunc-stans and getting it at least 
cleaned up and into the code base.
I've been looking at it again, and really thinking about it and 
reflecting on it and I have a lot of questions and ideas now.

The main question is *why* do we want it merged?
Is it performance? Recently I provided a patch that yielded an 
approximate ~30% speedup in entire-server throughput just by 
changing our existing connection code.
Is it features? What features are we wanting from this? We have no 
complaints about our current threading model and thread allocations.
Is it maximum number of connections? We can always change the 
conntable to a better datastructure that would help scale this 
number higher (which would also yield a performance gain).
It is mostly about the c10k problem, trying to figure out a way to 
use epoll, via an event framework like libevent, libev, or 
libtevent, but in a multi-threaded way (at the time none of those 
were really thread safe, or suitable for use in the way we do 
multi-threading in 389).


It wasn't about performance, although I hoped that using lock-free 
data structures might solve some of the performance issues around 
thread contention, and perhaps using a "proper" event framework 
might give us some performance boost e.g. the idle thread processing 
using libevent timeouts.  I think that using poll() is never going 
to scale as well as epoll() in some cases e.g. lots of concurrent 
connections, no matter what sort of datastructure you use for the 
conntable.

The conntable was bottlenecking because when you had:

| conn | conn | freeslot | 

it would attempt to lock each conn to see whether it was free or not. This 
meant that if a conn was in an IO, we would block waiting for it to finish 
before we could move to the next conn to check whether it was free. After 
changing to pthread, we can now do a trylock: if the trylock fails, it 
can be implied that the conn slot must be in use, so we skip it. This is how 
we got the 30% speedup recently (my laptop went from about 4200
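
A minimal sketch of that trylock scan (conn_t and the table layout here are 
illustrative stand-ins, not the actual 389-ds structures):

#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

typedef struct conn {
    pthread_mutex_t lock;
    int in_use;
} conn_t;

/* Return the index of a free slot, or -1 if none was found. */
static ptrdiff_t
find_free_slot(conn_t *table, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* If trylock fails the slot is busy with an operation, so it
         * cannot be free: skip it instead of blocking on the mutex. */
        if (pthread_mutex_trylock(&table[i].lock) != 0)
            continue;
        if (!table[i].in_use) {
            table[i].in_use = 1;
            pthread_mutex_unlock(&table[i].lock);
            return (ptrdiff_t)i;
        }
        pthread_mutex_unlock(&table[i].lock);
    }
    return -1;
}

int
main(void)
{
    conn_t table[3] = {
        { PTHREAD_MUTEX_INITIALIZER, 1 },
        { PTHREAD_MUTEX_INITIALIZER, 1 },
        { PTHREAD_MUTEX_INITIALIZER, 0 },
    };
    printf("free slot: %td\n", find_free_slot(table, 3));
    return 0;
}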

[389-devel] Re: please review: Replication Status Message Improvements

2019-06-12 Thread thierry bordaz

Sorry Mark, my email was confusing.

First, you make a good point: giving extra info does not hurt.

I just noticed that the replication status contains two RCs (ldap and 
replication), while the init status contains three RCs (ldap, replication, 
connection). So the JSON struct should differ between the regular 
replication status and the init status.


On 6/12/19 6:17 PM, Mark Reynolds wrote:


On 6/12/19 12:08 PM, thierry bordaz wrote:

Hi Mark,

Looking very good to me.
For the replication status there is either ldaprc or replrc. The message 
is self-explanatory as to whether it is an LDAP or a replication error. IMHO 
the JSON could contain only 'repl_status', which can hold either ldaprc or replrc.
Well, I was thinking it didn't hurt to have separate error code 
elements in the JSON object, and it allows a client to know what type 
of error it is without having to interpret the message string. This 
"could" be useful to an admin, which is why I added it. Maybe it's not 
that useful, but like I said, I don't think it hurts.


For the init status, there exist ldaprc, connrc and replrc. I think it 
would be good to have all three.


Wait :-)  So now you want the error code object elements?

Either way, good point: the JSON object should have all three error 
code types (unless it is decided to remove all the error code elements).


Thanks,

Mark



best regards
thierry

On 6/12/19 5:36 PM, Mark Reynolds wrote:

http://www.port389.org/docs/389ds/design/repl-agmt-status-design.html
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



[389-devel] Re: please review: Replication Status Message Improvements

2019-06-12 Thread thierry bordaz

Hi Mark,

Looking very good to me.
For the replication status there is either ldaprc or replrc. The message is 
self-explanatory as to whether it is an LDAP or a replication error. IMHO 
the JSON could contain only 'repl_status', which can hold either ldaprc or replrc.


For the init status, there exist ldaprc, connrc and replrc. I think it would 
be good to have all three.


best regards
thierry

On 6/12/19 5:36 PM, Mark Reynolds wrote:

http://www.port389.org/docs/389ds/design/repl-agmt-status-design.html
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



[389-devel] Re: Replication agreement status messages: JSON or text?

2019-06-12 Thread thierry bordaz



On 6/12/19 9:22 AM, Ludwig Krispenz wrote:

Hi Mark,

On 06/11/2019 08:15 PM, Mark Reynolds wrote:
I am currently working on a revision of replication agreement status 
messages.  Previously we logged the status like so:


    Error (%d) - message (sub-message) ...

Just to get clear what you are suggesting - I was a bit confused at first.

Are you talking about logging (as in the error log) or about the value of 
the replicaLastUpdateStatus attribute?
The BZ mention replicaLastUpdateStatus (like "last update status: Error 
(18) Replication error acquiring replica: Incremental update transient 
error. Backing off, will retry update later. (transient error)")


I agree it is a good idea to provide a JSON status. Rather than replacing the 
"human readable" format with a JSON format, I would prefer to stay 
compatible and add a new status attribute, replicaLastUpdateStatusJson.


thierry


For logging into the error log I prefer to keep the current, 
"readable" format - until we do a real rework of logging.
For the storage of a state in the agreement I think switching to the 
json object is ok


If Error was set to 0 it meant success, but this caused confusion 
because of the word "Error".  So I am working on changing this.


There are two options here: change the static "Error" text to be 
dynamic ("Info", "Warning", or "Error" depending on the state), or 
move away from a human-friendly text string to a machine-friendly 
simple JSON object. There are pros and cons to both. I think 
moving towards a JSON object is the correct way - easier to maintain, 
and easier for other applications to consume. The cons are that it 
is a disruptive change to the previous behavior, and it could be 
confusing to an admin who might not understand JSON.


This is the basic JSON object I was thinking of

    {"status": "Good|Warning|Bad", "status code": NUMBER(aka error 
code), "date": "2019117485748745Z", "message": "Message text"}


or maybe multiple messages (list):

    {"status": "Good|Warning|Bad", "status code": NUMBER(aka error 
code), "date": "2019117485748745Z", "message": ["the replication 
status is...", "Connection error 91", "Server Down"]}



The JSON object can easily be extended without breaking clients, but 
it's not easy to read for a human.
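
For illustration, here is how such an object could be built with json-c 
(assuming json-c as the JSON library - an assumption for the sketch, not a 
statement about what the server uses today; the field names follow the 
example above):

#include <json-c/json.h>
#include <stdio.h>

int
main(void)
{
    /* Build the status object from the example above. */
    json_object *status = json_object_new_object();
    json_object *msgs = json_object_new_array();

    json_object_object_add(status, "status",
                           json_object_new_string("Bad"));
    json_object_object_add(status, "status code", json_object_new_int(91));
    json_object_object_add(status, "date",
                           json_object_new_string("2019117485748745Z"));
    json_object_array_add(msgs,
                          json_object_new_string("Connection error 91"));
    json_object_array_add(msgs, json_object_new_string("Server Down"));
    json_object_object_add(status, "message", msgs);

    printf("%s\n", json_object_to_json_string(status));
    json_object_put(status); /* frees the whole tree */
    return 0;
}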


Thoughts?

Thanks,

Mark
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org





[389-devel] Re: On the command line tools ....

2019-05-27 Thread thierry bordaz

Hi William,

I am not an expert on CLI/UI, so I will just comment as a novice user.

389-ds 1.4 brings a brand new CLI/UI.
To be less disruptive, the web UI reuses many of the ideas of the former 
Java console. It allows fine tuning and shows some nasty internal details 
that we may want to hide in the future to make it more user friendly. 
Since the UI is built on the CLI, the CLI needs to support those nasty 
details, and the UI relies on the low-level CLI.
I agree with creating a high-level CLI that hides the complexity of the 
389-ds internals. But I do not feel it should be one or the other; we 
could have both, and the high-level CLI could be based on a robust set of 
low-level CLIs.


It is both too early and too late to change the CLI. Too early because 
389-ds 1.4 is just out of the door and we do not yet have feedback from 
users to know what should be improved. It is too late because users will 
adopt this new CLI/UI and create their own scripts, and we will need to 
stay backward compatible. So IMHO there is only one way to go: fix the 
CLI bugs and extend the CLI set following your approach.


Simon proposed an action plan (update design, review/agreement of the 
team, coding/deprecation) that looks quite good to me.


best regards
thierry

On 5/27/19 2:12 AM, William Brown wrote:

Of course, I am opened for the discussion on the plan and the vision.
As Mark has pointed out, we should really gather here and decide as a team. :)


Yeah, I think this is the important point. In the past I did not do well with 
this, so I think it's important we do it as a team. We really also should 
engage with people (but some mailing list feedback has already been positive 
though). Something to keep in mind is that confusion about how a tool is used 
is possibly a usability issue we could resolve.

With a weekend of reflection, I think that Mark's suggestion of leaving what we 
have, and maybe adding a dstask or something in parallel may be the easiest way 
to achieve this with the least disruption. There is plenty on my plate for now, 
so I think there is no rush to make changes here ...

Thank you all


Regards,
Simon

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



[389-devel] Re: Advice on a memory issue

2019-05-16 Thread thierry bordaz

Hi William,

It looks to me that attr_syntax_create overwrites the allocated asi with 
one it allocates itself based on the provided params.
In short, I think attr_syntax_create allocates the syntaxinfo for you; 
you do not need to provide one.


best regards
thierry
On 5/16/19 8:17 AM, William Brown wrote:

https://pagure.io/389-ds-base/pull-request/50379

This code is not yet ready to be merged. I'm currently having a problem with 
freeing the attrsyntaxinfo struct as part of the test.

If the code is as is I get:


=
==98363==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 160 byte(s) in 1 object(s) allocated from:
 #0 0x7fbc40e28538 in calloc (/usr/lib64/libasan.so.5+0xec538)
 #1 0x7fbc40a34be6 in slapi_ch_calloc 
/home/william/development/389ds/ds/ldap/servers/slapd/ch_malloc.c:175
 #2 0x40499c in attr_syntax_add_from_name 
/home/william/development/389ds/ds/test/libslapd/schema/filter_validate.c:25
 #3 0x404b22 in test_libslapd_schema_filter_validate_simple 
/home/william/development/389ds/ds/test/libslapd/schema/filter_validate.c:56
 #4 0x7fbc40d340d8  (/usr/lib64/libcmocka.so.0+0x50d8)

Objects leaked above:
0x60e00d60 (160 bytes)

Direct leak of 160 byte(s) in 1 object(s) allocated from:
 #0 0x7fbc40e28538 in calloc (/usr/lib64/libasan.so.5+0xec538)
 #1 0x7fbc40a34be6 in slapi_ch_calloc 
/home/william/development/389ds/ds/ldap/servers/slapd/ch_malloc.c:175
 #2 0x40499c in attr_syntax_add_from_name 
/home/william/development/389ds/ds/test/libslapd/schema/filter_validate.c:25
 #3 0x404b0f in test_libslapd_schema_filter_validate_simple 
/home/william/development/389ds/ds/test/libslapd/schema/filter_validate.c:55
 #4 0x7fbc40d340d8  (/usr/lib64/libcmocka.so.0+0x50d8)

Objects leaked above:
0x60e00ba0 (160 bytes)

SUMMARY: AddressSanitizer: 320 byte(s) leaked in 2 allocation(s).

However, if I free *a and *b, with attr_syntax_free, or slapi_ch_free, I get a 
double free error. The size of 160 bytes correlates to the sizeof(struct 
attrsyntaxinfo) but looking in gdb during attr_syntax_delete, the 
attr_syntax_free is called on asi as provided.

So I'm not 100% sure what's going wrong here, but I'm not thoroughly 
experienced in this part of the code, so feedback would be really helpful about 
this resource issue.

Thanks!

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org




[389-devel] Re: Profiling discussion

2018-10-25 Thread thierry bordaz



On 10/11/2018 12:57 AM, William Brown wrote:

On Wed, 2018-10-10 at 16:26 +0200, thierry bordaz wrote:

Hi William,

Thanks for starting this discussion.
Your email raises several aspects (how, for whom, ...) and I think a way
to start would be to write down what we want.
One need is, for a given workload, to determine where we are spending
time, as a way to decide where to invest.
Another need is to collect metrics at the operation level.

Aren't these very similar? The time we invest is generally on improving
a plugin or a small part of an operation, to make the operation as a
whole faster.


The tools used may be similar, but the difference is in the expected 
results. For example, I was just discussing with a user who 
reported:


   [24/Oct/2018:12:10:55.012908141 -0800] conn=2400 op=1 MODRDN
   dn="" newsuperior="(null)"
   [24/Oct/2018:12:11:01.711604553 -0800] conn=2400 op=1 RESULT err=1
   tag=109 nentries=0 etime=6.1301230184 csn=5bd0d1cf0001

   tried the same modrdn in my test environment, resulting in no error
   and no latency.

   [24/Oct/2018:12:14:03.665479821 -0800] conn=138 op=1 MODRDN
   dn="" newsuperior="(null)"
   [24/Oct/2018:12:14:03.749121724 -0800] conn=138 op=1 RESULT err=0
   tag=109 nentries=0 etime=0.0083774655 csn=5bd0d28b/0001


So here the expected result is not to improve performance but to have a 
diagnostic method/tool to know what is going on in production compared 
to tests.

So if we can report on an individual operation, we can write a tool
similar to logconv.pl, but for performance metrics, that displays
trends of operations that are not performing well; then we can find
examples of such operations and why.
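
For illustration, a minimal sketch of that kind of extraction over RESULT 
lines like the ones above (the regex is simplified and only covers the 
fields shown here):

import re

# Matches RESULT lines in the access log format quoted above; a sketch
# only -- real logs have more variants than this regex covers.
RESULT_RE = re.compile(
    r'conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT .*?etime=(?P<etime>[\d.]+)')

def slow_operations(lines, threshold=1.0):
    # Yield (conn, op, etime) for operations slower than threshold seconds.
    for line in lines:
        m = RESULT_RE.search(line)
        if m and float(m.group('etime')) > threshold:
            yield m.group('conn'), m.group('op'), float(m.group('etime'))

sample = ['[24/Oct/2018:12:10:55.012908141 -0800] conn=2400 op=1 RESULT '
          'err=1 tag=109 nentries=0 etime=6.1301230184']
for conn, op, etime in slow_operations(sample):
    print("conn=%s op=%s took %.2fs" % (conn, op, etime))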


  From the how perspective, we can rely on external tools
(stap + scripts) or internal tools (like the plugin you described +
scripts). Of course we can also make some enhancements inside DS (like
adding probes) to help external tools. I have no strong opinion on
whether one approach is better than the other, but I think it also
depends on what you want to do.

I think that it would be great if the tools we use internal to the
team, were accessible outside to admins of ds. That way when we get
reports for performance concerns, we have a standardised way of looking
at this. It's going to mean our workflow is the same between internal
development and profiling, as for external reports, and it will force
us to have all the information we need in that one place.

I think, as a coarse first metric, internal event timings are probably
what we want first. After that we can continue to extend from there?

As for the how, perhaps we can put something on the Operation struct
for appending and logging events and turning those into metrics?

As mentioned you could use stap too with defined points for tracing,
but that limits us to linux only?


best regards
thierry

On 10/08/2018 12:37 PM, William Brown wrote:

Hi there,

In a ticket Thierry and I mentioned that we should have a quick
discussion about ideas for profiling and what we want it to look
like and what we need. I think it’s important we improve our
observation into the server so that we can target improvements
correctly,

I think we should know:

* Who is the target audience to run our profiling tools?
* What kind of information we do want?
* Potential solution for the above.

With those in mind I think that Thierry suggested STAP scripts.

* Target audience - developers (us) and some “highly experienced”
admins (STAP is not the easiest thing to run).
* Information - STAP would largely tell us timing and possibly
allows some variable/struct extraction. STAP does allow us to look
at connection info too a bit easier.

I would suggest an “event” struct, and logging service

At the start of an operation we create an event struct. As we enter
- exit a plugin we can append timing information, and the plugin
itself can add details (for example, backend could add idl
performance metrics or other). At the end of the operation, we log
the event struct as a json blob to our access log associated to the
conn/op.

* Target - anyone, it’s a log level. Really easy to enable (Think
mailing list or user support, can easily send us diagnostic logs)
* Information - we need a bit more work to structure the “event”
struct internally for profiling, but we’d get timings and possibly
internal variable data as well in the event.


I think these are two possible approaches. STAP is less invasive,
easier to start now, but harder to extend later. Logging is more
accessible to users/admins, easier to extend later, but more work
to add now.

What do we think?


—
Sincerely,

William


___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-leave@lists.fedoraproject
.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Maili

[389-devel] Re: Nunc-stans status

2018-10-24 Thread thierry bordaz



On 10/21/2018 03:55 AM, William Brown wrote:

Thanks for this write up Thierry,


On 19 Oct 2018, at 17:15, thierry bordaz  wrote:

Hi,

C10K is a scalability problem that a server can face when dealing with events 
of thousands of connections (i.e. clients) at the same time. Events can be new 
connections, new operations on the established connections, closure of 
connection (from client or server)
For 389-ds, C10K problem was resolved with a new framework Nunc-Stans [1]. 
Nunc-stans was first enabled in RHDS 7.4 and improved/fixed in 7.5. Robustness 
issues [2] and [3] were reported in 7.5 and it was decided to disable 
Nunc-stans. It is not known if those issues exist or not in 7.4.

William posted a PR to fix those two issues [4]. Nunc-stans is a complex 
framework, with its own dynamics. Review of this PR is not easy, and even a 
careful review may not guarantee that it fixes [2] and [3] and does not 
introduce other unexpected side effects.

 From there we discussed two options (but there may be others):

• Review and merge the PR [4], then later run some intensive tests 
aiming to verify [2],[3] and checking the robustness in order to reenable NS

I think this is the best solution. If NS is “not working” today, then merging 
the PR won’t make it “not work” any less ;)

Given we have a reliable way to disable it at build time, I think that 
merging, testing, and then eventually a discussion and decision on enabling by 
default is the best plan.


Hi William,

The existing set of tests reveals a single failure [1]. That shows the PR 
is really promising, but at the same time that tests are crucial with NS, 
like the two robustness bugs (conn leak and deadlock) that only happen 
under some specific conditions.


I think review of a large NS PR is a bit of a formal step; the most 
important thing is tests/measurements. We may agree to push the NS patches 
based on the existing test results. New tests/measurements (for the 
benefits and for [2] and [3]) on top of the current PR could be the 
option. Let's wait for additional feedback.


[1] https://pagure.io/389-ds-base/pull-request/49636#comment-66941

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1608746 deadlock
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1605554 connection leaks

best regards
thierry


• Build some tests for
• measure the benefit of NS as [2] and [3] do not prevent some 
performance tests
• identify possible reproducers for [2] and [3]
• create robustness and long duration NS specific tests
• review and merge the PR [4]
As PR [4] is not intended for perf improvement, the step 2.1 will impact the 
priority according to the performance benefits.

I think that the test plan is good. 2 and 3 will be hard to reproduce I think, 
they seem like they require complex state and interactions. The goal of the PR 
I have open is to reduce the complexity of NS integration into DS, so that 
tests for situations like 2 and 3 can be more easily created (and hopefully the 
simpler integration is going to resolve some issues).

The “benefit” of NS is hard to say - because the other parts of the server core 
are still needing threading improvement, NS may help some aspects (connection 
acceptance performance for example), we won’t see all the of the benefits until 
we improve other parts. There are many bottlenecks to fix! An example is 
libglobs' high use of atomics (should be COW), and the plugin architecture.

At a theoretical level, NS will be faster as we don’t need to lock and poll 
over the connection table, but today the CT may not be the bottleneck, so NS 
may not make this “faster” just by enabling.


Comments are welcomed

Regarding the 2.1 plan, we made the following notes for the test plan:
The benefit of Nunc-Stans can only be measured with a large number of 
connections (i.e. clients), above 1000. That means a set of clients (sometimes 
all) should keep their connections open. Clients should run on several hosts 
so that the clients are not the bottleneck.

For the two types of events (new connection and new operations), the 
measurement could be

• Event: New connections
• Start all clients in parallel to establish connections 
(keeping them opened); take the duration to get 1000, 2000, ... 1 
connections and check whether there are drops or not
• Establish 1000 connections and monitor the time to open 100 
more; the same starting with 2000, 1
• Clients should not run any operations during the monitoring
• Event: New operations
• Start all clients and, when 1000 connections are established, launch 
simple operations (e.g. search -s base -b "" objectclass) and monitor how many 
of them can be handled. The same with 2000, ... 1.
• Response time and workqueue length could be monitored to be 
sure the workers are not the bottleneck.

At one point I started (but did not finish) ldclt integration with lib389.
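
For reference, the first measurement above could be sketched like this with
python-ldap (illustrative only; the URI is a placeholder, and this is not
the ldclt/lib389 integration just mentioned):

import time
import ldap  # python-ldap

def open_connections(uri, count):
    # Open `count` anonymous connections and keep them alive, reporting
    # how long each batch of 1000 takes, per the test plan above.
    conns, t0 = [], time.monotonic()
    for i in range(1, count + 1):
        c = ldap.initialize(uri)
        c.simple_bind_s()   # anonymous bind; the connection stays open
        conns.append(c)
        if i % 1000 == 0:
            print("%d connections after %.1fs" % (i, time.monotonic() - t0))
    return conns            # caller decides when to unbind

# conns = open_connections("ldap://localhost:389", 10000)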

[389-devel] Nunc-stans status

2018-10-19 Thread thierry bordaz

Hi,

C10K is a scalability problem that a server can face when dealing with 
events of thousands of connections (i.e. clients) at the same time. 
Events can be new connections, new operations on the established 
connections, closure of connection (from client or server)


For 389-ds, C10K problem was resolved with a new framework *Nunc-Stans* 
[1]. Nunc-stans was first enabled in RHDS 7.4 and improved/fixed in 7.5. 
Robustness issues [2] and [3] were reported in 7.5 and it was decided to 
disable Nunc-stans. It is not known if those issues exist or not in 7.4.


William posted a PR to fix those two issues [4]. Nunc-stans is a complex 
framework, with its own dynamics. Review of this PR is not easy, and even 
a careful review may not guarantee that it fixes [2] and [3] and does not 
introduce other unexpected side effects.


From there we discussed two options (but there may be others):

1. Review and merge the PR [4], then later run some intensive tests
   aiming to verify [2],[3] and checking the robustness in order to
   reenable NS
2. Build some tests for
1. measure the benefit of NS as [2] and [3] do not prevent some
   performance tests
2. identify possible reproducers for [2] and [3]
3. create robustness and long duration NS specific tests
4. review and merge the PR [4]

As PR [4] is not intended for perf improvement, the step 2.1 will impact 
the priority according to the performance benefits.


Comments are welcomed


   Regarding the 2.1 plan, we made the following notes for the test plan:

   The benefit of Nunc-Stans can only be measured with a large number
   of connections (i.e. clients), above 1000. That means a set of clients
   (sometimes all) should keep their connections *opened*. Clients
   should run on several hosts so that the clients are not the bottleneck.

   For the two types of events (new connection and new operations),
   the measurement could be

 * Event: New connections
 o Start all clients in parallel to establish connections
   (keeping them opened); take the duration to get 1000, 2000,
   ... 1 connections and check whether there are drops or not
 o Establish 1000 connections and monitor the time to open 100
   more; the same starting with 2000, 1
 o Clients should not run any operations during the monitoring
 * Event: New operations
 o Start all clients and when 1000 connections are
   established, launch simple operations (e.g. search -s base
   -b "" objectclass) and monitor how many of them can be
   handled. The same with 2000, ... 1.
 o Response time and workqueue length could be monitored to be
   sure the workers are not the bottleneck.

[1] http://www.port389.org/docs/389ds/design/nunc-stans.html

[2] https://bugzilla.redhat.com/show_bug.cgi?id=1608746 deadlock
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1605554 connection leaks

[4] https://pagure.io/389-ds-base/pull-request/49636

___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org


[389-devel] Re: Profiling discussion

2018-10-10 Thread thierry bordaz

Hi William,

Thanks for starting this discussion.
Your email raises several aspects (how, for whom, ...) and I think a way to 
start would be to write down what we want.
One need is, for a given workload, to determine where we are spending time, 
as a way to decide where to invest.

Another need is to collect metrics at the operation level.

From the how perspective, we can rely on external tools (stap + scripts) 
or internal tools (like the plugin you described + scripts). Of course we 
can also make some enhancements inside DS (like adding probes) to help 
external tools. I have no strong opinion on whether one approach is better 
than the other, but I think it also depends on what you want to do.


best regards
thierry

On 10/08/2018 12:37 PM, William Brown wrote:

Hi there,

In a ticket Thierry and I mentioned that we should have a quick discussion 
about ideas for profiling and what we want it to look like and what we need. I 
think it’s important we improve our observation into the server so that we can 
target improvements correctly,

I think we should know:

* Who is the target audience to run our profiling tools?
* What kind of information we do want?
* Potential solution for the above.

With those in mind I think that Thierry suggested STAP scripts.

* Target audience - developers (us) and some “highly experienced” admins (STAP 
is not the easiest thing to run).
* Information - STAP would largely tell us timing and possibly allows some 
variable/struct extraction. STAP does allow us to look at connection info too a 
bit easier.

I would suggest an “event” struct, and logging service

At the start of an operation we create an event struct. As we enter - exit a 
plugin we can append timing information, and the plugin itself can add details 
(for example, backend could add idl performance metrics or other). At the end 
of the operation, we log the event struct as a json blob to our access log 
associated to the conn/op.

* Target - anyone, it’s a log level. Really easy to enable (Think mailing list 
or user support, can easily send us diagnostic logs)
* Information - we need a bit more work to structure the “event” struct 
internally for profiling, but we’d get timings and possibly internal variable 
data as well in the event.


I think these are two possible approaches. STAP is less invasive, easier to 
start now, but harder to extend later. Logging is more accessible to 
users/admins, easier to extend later, but more work to add now.

What do we think?


—
Sincerely,

William


___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org



[389-devel] please review: Ticket 48184 clean up and delete connections at shutdown

2018-05-18 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/48184

https://pagure.io/389-ds-base/pull-request/49701


A bit of history around that fix.
An original fix was done for 48184 [1]. This fix introduced a regression 
[2] that we failed to fix fast enough and the fix was finally backout [3].

The current PR [4] enhances the original fix to avoid [2].
So far tests were successful with [4] but additional tests with ASAN 
build show an other issue (refcnt). The fix for the refcnt will be done 
in a separate ticket.


symptoms of [1] was error logging and sometime crash
symptom of [2] was a hang
symptom of the refcnt is error logging and sometime crash [5]

[1] https://pagure.io/389-ds-base/c/1418fc3
[2] https://pagure.io/389-ds-base/issue/49569
[3] https://pagure.io/389-ds-base/c/fda63dc59
[4] https://pagure.io/389-ds-base/pull-request/49701
[5] https://bugzilla.redhat.com/show_bug.cgi?id=1566444
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproject.org/message/6N3VVGPCI7UHP4BWOWB4Q5XQW5QQEKXJ/


[389-devel] Re: lib389 usage cheatsheet

2018-05-09 Thread thierry bordaz

Hi Simon,

   Thanks Simon starting this thread :)

   Currently lib389 is mostly used by LDAP devel/QE, and it seems realistic
   for it to become the admin library (for example used by FreeIPA or
   others) and a component of the 389 administrative tools. LDAP knowledge
   is a requirement for all of them. In that sense, lib389 should continue
   to follow/offer basic LDAP elements (connections, naming_ctx,
   req/resp, synchronous/asynchronous, extop, control...).
   I agree that those elements are a bit frustrating for people wanting
   to use LDAP as a simple repository. The DSLdapObject abstraction
   looks very promising. It can be oriented to simple use cases: create
   users and groups, manage memberships, authentication/rights,
   automatic deployment on several replicated instances. Then extended
   on demand.
   In short, IMHO I would prefer to keep most of the LDAP elements
   (option 1) and propose/extend an easy POC interface (option 2). I am
   not sure the word DSLdapObject is my favorite; we could name it
   according to the use case we want to propose.
   For me option 3 would be the worst option. I remember an abstraction
   layer at such a high level that I constantly looked at the
   access/error log to be sure the program was doing what I thought it
   should.

   best regards
   thierry


On 05/09/2018 04:56 PM, Simon Pichugin wrote:

Hi team,
recently, we had a discussion on a scrum meeting about lib389 and its new API.

If I understood right, there was an opinion that the lib389 DSLdapObjects API
is not very intuitive and that it is much easier to stick to the python-ldap
style because it uses ldapmodify/ldapadd wording (or close enough to it).
And I partially agree with it (and I already have some thoughts on how we can
improve it).

Many patches under review show that people don't change their code
to the new way.
I've given some thought to it and this is what I came up with:

1. I think it is okay to use instance.add_s and instance.modify_s
for simple operations.
2. If you'd like to make your life easier, you can use DSLdapObjects API
(and I'll help you with that)
3. We should stop using the old lib389 API because we don't support it anymore
and it will be deprecated in the future. We should use DSLdapObject
instead, and we should improve it if we don't like something about it.

We have a lot of docstrings written in lib389 code and the code itself
is pretty readable, in my opinion. But I'd like to make life easier,
so I've started to write a 'lib389 cheatsheet'. It has old way/new way
comparison blocks, and I base it on real-life code from your patches,
adding some commentary.

For now, it is here:
https://fedorapeople.org/~spichugi/html/cheatsheet.html

Thanks!
Simon
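
For illustration, a minimal sketch of the two styles being compared,
assuming `instance` is a connected DirSrv object (the DSLdapObjects part
follows the pattern shown in the cheatsheet):

from lib389.idm.user import UserAccounts

def add_user_old_style(instance):
    # Old style: raw python-ldap calls (DirSrv wraps SimpleLDAPObject);
    # values must be bytes.
    instance.add_s('uid=tuser,ou=People,dc=example,dc=com', [
        ('objectClass', [b'top', b'person', b'organizationalPerson',
                         b'inetOrgPerson', b'posixAccount']),
        ('uid', [b'tuser']),
        ('cn', [b'Test User']),
        ('sn', [b'User']),
        ('uidNumber', [b'1000']),
        ('gidNumber', [b'1000']),
        ('homeDirectory', [b'/home/tuser']),
    ])

def add_user_new_style(instance):
    # New style: DSLdapObjects hide the DN and modlist plumbing.
    users = UserAccounts(instance, 'dc=example,dc=com')
    user = users.create(properties={
        'uid': 'tuser', 'cn': 'Test User', 'sn': 'User',
        'uidNumber': '1000', 'gidNumber': '1000',
        'homeDirectory': '/home/tuser',
    })
    user.replace('description', 'created via DSLdapObjects')
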
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org




[389-devel] Re: Optional rust support review

2017-11-08 Thread thierry bordaz

Hi William,

   I see a benefit in offering support for a modern language in 389-ds.
   Rust developers may be interested in improving 389-DS with missing
   functionality.
   Especially with the plugin API we have the ability to support Rust
   without changes to the core server. If a Rust plugin hits a bug, we
   have the ability to disable it until a Rust expert can fix it.
   #49325 implements Rust functionality in a crucial core server
   component (nunc-stans). The consequence is that we do need to be
   able to fix it rapidly. Correct?

   best regards
   thierry


On 11/07/2017 03:51 AM, William Brown wrote:

Hi,

https://pagure.io/389-ds-base/issue/49325 has been in a ready to merge
state for some time. This is still an optional integration (not a
commitment to production rust), but I still want to check that it's
okay to merge. It's been reviewed by an external Rust developer who is
happy with it, and checked by Mark. I would like to merge this on
Friday, so I would love to hear comments about this before then.

A discussion on production commitment will happen in the near future I
think as to whether we want to pursue this path,

Thanks!



___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org




[389-devel] Please review 49412: SIGSEV when setting invalid changelog config value

2017-10-25 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49412

https://pagure.io/389-ds-base/issue/raw/files/3f00b1682d2a5815b4437ec3c7e03eb0eb84d30e113b9926343b62ba59a03283-0001-Ticket-49412-SIGSEV-when-setting-invalid-changelog-c.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49386: Memberof should be ignore MODRDN when the pre/post entry are identical

2017-10-19 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49386

https://pagure.io/389-ds-base/issue/raw/files/05e1c4e50c1a24b304c37456a8e937475796e44cc16462c708d9042808a32301-0001-Ticket-49386-Memberof-should-be-ignore-MODRDN-when-t.patch 
___

389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49064: allow to enable MemberOf plugin in dedicated consumer

2017-10-17 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49064

https://pagure.io/389-ds-base/issue/raw/files/4a0d58fa9331e1c8180acb2f1b6b0928b4e66488e6a9faa3c0cb96e223557057-0001-Ticket-49064-RFE-allow-to-enable-MemberOf-plugin-in-.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Re: Build failed in Jenkins: COMMIT_SANITY_TEST #72

2017-10-16 Thread thierry bordaz

Hi Mark,

Thanks for the heads up!! Sorry for breaking the build.

I have fixed it... crossing my fingers :)

best regards
thierry

On 10/16/2017 02:51 PM, marey...@redhat.com wrote:

See 


Changes:

[tbordaz] Ticket 49394 - slapi_pblock_get may leave unchanged the provided

--
[...truncated 185 lines...]
Package python-nss-1.0.0-2.fc25.x86_64 is already installed, skipping.
Package policycoreutils-python3-2.5-20.fc25.x86_64 is already installed, 
skipping.
Package python-IPy-python3-0.81-16.fc25.noarch is already installed, skipping.
Package policycoreutils-python-utils-2.5-20.fc25.x86_64 is already installed, 
skipping.
Package python2-dateutil-1:2.6.0-1.fc25.noarch is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!
+ sudo dnf install -y audit-libs-python3 libsemanage-python3 cyrus-sasl-plain 
gperftools-devel
Last metadata expiration check: 2:57:46 ago on Mon Oct 16 11:42:54 2017.
Package audit-libs-python3-2.7.7-1.fc25.x86_64 is already installed, skipping.
Package libsemanage-python3-2.5-9.fc25.x86_64 is already installed, skipping.
Package cyrus-sasl-plain-2.1.26-26.2.fc24.x86_64 is already installed, skipping.
Package gperftools-devel-2.5-2.fc25.x86_64 is already installed, skipping.
Dependencies resolved.

  PackageArch   VersionRepository   Size

Skipping packages with conflicts:
(add '--best --allowerasing' to command line to force their upgrade):
  audit  x86_64 2.8.1-1.fc25   updates-testing 249 k
  audit-libs x86_64 2.8.1-1.fc25   updates-testing 111 k
  audit-libs-python  x86_64 2.8.1-1.fc25   updates-testing  80 k
  audit-libs-python3 x86_64 2.8.1-1.fc25   updates-testing  80 k

Transaction Summary

Skip  4 Packages

Nothing to do.
Complete!
+ sudo dnf remove -y python-lib389
No match for argument: python-lib389
Error: No packages marked for removal.
+ sudo service sendmail start
Redirecting to /bin/systemctl start  sendmail.service
+ sudo /usr/bin/rm -rf
+ sudo /usr/bin/rm -rf

[389-devel] Please review 48973: Indexing a ExactIA5Match attribute with a IgnoreIA5Match matching rule triggers a warning

2017-09-27 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/48973

https://pagure.io/389-ds-base/issue/raw/files/e9c88aa920e588d5fd279b2671c5824ae5366b2faf2cbf196eed4c73cc3058c4-0001-Ticket-48973-Indexing-a-ExactIA5Match-attribute-with.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49332: Event queue is not working

2017-07-25 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49332

https://pagure.io/389-ds-base/issue/raw/files/9aa72873aa6ad8e0947ee8465514a6cb2157b4c9ac789cedaa2d592d5bcbb604-0001-Ticket-49332-Event-queue-is-not-working.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review ticket 49291: slapi_search_internal_callback_pb may SIGSEV if related pblock has not operation set

2017-06-15 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49291

https://pagure.io/389-ds-base/issue/raw/7eae6cb7318f30a32fb9e89fcf199c90bed624eb6456900be365d09df669808c-0001-Ticket-49291-slapi_search_internal_callback_pb-may-S.patch

___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49230: slapi_register_plugin creates config entry where it should not

2017-04-25 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49230

https://pagure.io/389-ds-base/issue/raw/files/daf059ca6b5791e50a37866387bccf0c21ae9a563f5326db08d06d31ca75e854-0001-Ticket-49230-slapi_register_plugin-creates-config-en.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49184: Overflow in memberof

2017-04-13 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49184

https://pagure.io/389-ds-base/issue/raw/files/c8897365eb508b39a692e2eb224c49b88a8d8728d1b5e06e095e0f55ae2849e5-0002-Ticket-49184-Overflow-in-memberof.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49209: Hang due to omitted replica lock release

2017-04-04 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49209

patch is: 
https://pagure.io/389-ds-base/issue/raw/files/8adf20a3bebb247024662ee92aa8783d99f85fd0183398822a856287a75fe7fd-0001-Ticket-49209-Hang-due-to-omitted-replica-lock-releas.patch

___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review issue 49076: To debug DB_DEADLOCK condition, allow to reset DB_TXN_NOWAIT flag on txn_begin

2017-02-15 Thread thierry bordaz

https://pagure.io/389-ds-base/issue/49076

https://pagure.io/389-ds-base/issue/raw/files/a078dfacb0f0c623412d357dd6e29b1c9987845416dd3fd5d1721fa089dd00ba-0001-Ticket-49076-To-debug-DB_DEADLOCK-condition-allow-to.patch
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 49031: Improve memberof with a cache of group parents

2017-01-03 Thread thierry bordaz

Hi,

This is the second part of improvement of the memberof plugin.
This second part improves memberof by adding a cache. The cache is 
built and cleared per operation.
During an update, memberof values are recomputed on all impacted nodes 
(group or leaf). The cache contains the ancestors of all groups.


There is a memberof plugin option (memberCacheLeafs: all/none) to also 
cache the ancestors of leaf entries. Tests showed no benefit from caching leaves.
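
As a toy model of the ancestor cache idea (the data structures here are
made up for illustration; the real plugin works on LDAP entries):

def ancestors(node, parents, cache):
    # Return the set of all groups `node` belongs to, directly or
    # transitively. `parents` maps a node to its direct groups; `cache`
    # is the per-operation cache of already-computed ancestor sets.
    # Assumes the membership graph is acyclic.
    if node in cache:
        return cache[node]
    result = set()
    for group in parents.get(node, ()):
        result.add(group)
        result |= ancestors(group, parents, cache)
    cache[node] = result
    return result

# Two paths to the same top group: without the cache, the ancestors of
# 'top' would be recomputed once per path.
parents = {'user': ['g1', 'g2'], 'g1': ['top'], 'g2': ['top']}
cache = {}   # built per operation, cleared afterwards
print(sorted(ancestors('user', parents, cache)))   # ['g1', 'g2', 'top']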


Patch is:
https://fedorahosted.org/389/attachment/ticket/49031/0002-Ticket-49031-Improve-memberof-with-a-cache-of-ancest.patch

Ticket is https://fedorahosted.org/389/ticket/49031

Design is http://www.port389.org/docs/389ds/design/memberof-scalability.html

thanks
thierry
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review lib389 48984: support environment variable to define defaults.inf

2016-12-05 Thread thierry bordaz
While testing in a prefix install I found it useful: 
https://fedorahosted.org/389/attachment/ticket/48984/0003-Ticket-48984-support-of-environment-variable-for-def.patch

___
389-devel mailing list -- 389-de...@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review: 48861 Memberof plugins can update several times the same entry to set the same values

2016-12-05 Thread thierry bordaz
This is a first patch improving memberof plugins when fixup is done over 
a graph with multiple paths to the same nodes.


patch is 
https://fedorahosted.org/389/attachment/ticket/48861/0002-Ticket-48861-Memberof-plugins-can-update-several-tim.patch


test suite (port of tet test suite): 
https://fedorahosted.org/389/attachment/ticket/48861/0002-Ticket-48861-memberof-plugin-tests-suite.patch

Compared to the tet test suite, this patch does not contain:

 * bugs 834053 and 833222 regression tests
 * stress tests

Design is http://www.port389.org/docs/389ds/design/memberof-scalability.html

thanks
thierry

___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Re: Please review design proposal for tickets 47986 and 48976

2016-10-12 Thread thierry bordaz

Hello,

I would think of two options

 * If the admin decides to switch to 'backend', it should not be prevented,
   and the backend moves to 'backend'
 * periodic (hourly) checking (IMHO not configurable and always run),
   the checking being the same mechanism as 'auto'
 o in-sync -> backend
 o not-in-sync -> it keeps referral-on-update

I think that the delay option is not necessary. If the periodic check 
fails to move the backend out of referral-on-update, it will log a message 
saying which consumer knows a higher csn, and it will be the admin's task 
to make sure those updates are pushed.


For internal operations, I cannot think of any simple solution. The 
mechanism in your design is real progress over what exists now. Let's wait 
for CU cases to see if we need to also address internal ops.


regards
thierry



On 10/07/2016 05:58 PM, Ludwig Krispenz wrote:
there is a problem not yet covered in the proposal: setting the 
backend to "referral-on-update" until the topology is in sync prevents 
too-early client updates, but what to do about internal updates, e.g. 
password policy attributes?


I have a wild idea, but maybe someone  has a suggestion on how to 
handle this


thanks,
Ludwig

On 10/05/2016 05:51 PM, Ludwig Krispenz wrote:


On 09/30/2016 02:15 AM, Noriko Hosoi wrote:

Hi Ludwig,

On 09/29/2016 05:43 AM, Ludwig Krispenz wrote:

This is the initial proposal, thanks for your feedback

http://www.port389.org/docs/389ds/design/delay-accepting-updates-after-init.html 




Please help me understanding the design...

I'm having a bit of a hard time figuring out the relationship/dependency 
among these 3 config parameters.


sorry if I was not clear enough, I will update the doc, but let me 
try to explain here


nsslapd-replica-accept-updates-state: on/off
nsslapd-replica-accept-updates-delay: -1/0/n
nsslapd-replica-accept-updates-auto: on/off

Are they independent or dependent?  Do they take any combinations -- 
2 * 3 * 2 == 12.



No. The primary parameter is nsslapd-replica-accept-updates-state.
If it is off, the others determine when it should be set to on again 
(without an explicit change by an admin).

If it is on, the other two are not used.

Independent of auto on/off, the "delay" defines whether (>=0) and when 
the state will be reset to on.


The "auto" param determines whether, within the defined "delay", the 
server should try to detect if it is in sync and switch to "on" 
earlier.
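
Restated as a toy model (not the actual server logic; the parameter names
are the ones from the design, the function itself is illustrative):

def accepting_updates(state, delay, auto, seconds_since_init, in_sync):
    # `state` is primary; when it is off, `delay` (seconds, -1 = never)
    # and `auto` decide when it flips back to on.
    if state == 'on':
        return True          # the other two parameters are not used
    if auto == 'on' and in_sync:
        return True          # auto may switch back to on earlier
    if delay >= 0 and seconds_since_init >= delay:
        return True          # reset to on once the delay has expired
    return False             # stay in referral-on-update

print(accepting_updates('off', 60, 'on', seconds_since_init=5, in_sync=True))    # True
print(accepting_updates('off', -1, 'off', seconds_since_init=999, in_sync=True)) # False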



There are 12 different behaviors?  (assuming n for -delay is one case :)

What is your recommendation to the customers?  I mean, what is the 
default setting?


That is a good question. There is the option to choose the default based on 
what is "my" recommendation (auto: on, delay: n) or what is backward 
compatible (no change in default behaviour: auto: off, delay: 0).


  For instance, if -auto is "on", when an online init is executed on 
the master, is the scenario automatically kicked in?


Thanks,
--noriko







___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander


___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander


___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org




[389-devel] Please review 48992: Total init may fail if the pushed schema is rejected

2016-09-23 Thread thierry bordaz
Thanks Noriko for your review. I updated the patch to give more 
explanation of why the fix is in modify_schema_dse.
I picked LDAP_CONSTRAINT_VIOLATION as a replacement for 
UNWILLING_TO_PERFORM, but I have no strong opinion on the appropriate 
return value. In the logic of that fix, it just needs to be non-fatal 
with regard to ignore_error_and_keep_going.


https://fedorahosted.org/389/attachment/ticket/48992/0002-Ticket-48992-Total-init-may-fail-if-the-pushed-schem.patch

https://fedorahosted.org/389/ticket/48992
___
389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 48992: Total init may fail if the pushed schema is rejected

2016-09-22 Thread thierry bordaz
This bug exists in topologies with mixed versions, with a 1.2.11 supplier 
doing a total/incremental update of a 1.3.5 consumer.


https://fedorahosted.org/389/ticket/48992

https://fedorahosted.org/389/attachment/ticket/48992/0001-Ticket-48992-Total-init-may-fail-if-the-pushed-schem.patch 
___

389-devel mailing list -- 389-devel@lists.fedoraproject.org
To unsubscribe send an email to 389-devel-le...@lists.fedoraproject.org


[389-devel] Please review 48956: ns-accountstatus.pl giving error even "No such object (32)" but still giving output "activated".

2016-08-17 Thread thierry bordaz

https://fedorahosted.org/389/attachment/ticket/48956/0002-Ticket-48956-ns-accountstatus.pl-showing-activated-u.patch

https://fedorahosted.org/389/ticket/48956
--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Please review 48906 (take 2): Allow nsslapd-db-locks to be configurable online

2016-07-11 Thread thierry bordaz
William, Mark, thanks for your review and sorry for missing basic 
hardening of this param...


https://fedorahosted.org/389/ticket/48906

https://fedorahosted.org/389/attachment/ticket/48906/0003-Ticket-48906-Allow-nsslapd-db-locks-to-be-configurab.patch 
--

389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Please review 48906: Allow nsslapd-db-locks to be configurable online

2016-07-07 Thread thierry bordaz

https://fedorahosted.org/389/attachment/ticket/48906/0002-Ticket-48906-Allow-nsslapd-db-locks-to-be-configurab.patch

https://fedorahosted.org/389/ticket/48906
--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Re: Logging performance improvement

2016-07-07 Thread thierry bordaz

Hi William,

This is looking like a great idea. You are proposing to use liblfds to 
communicate with the log writer. Do you think dbus would be another 
option? Would it help external mechanisms to collect the DS logs?


thanks
thierry

On 07/01/2016 03:52 AM, William Brown wrote:

Hi,

I've been thinking about this for a while, so I decided to dump my
thoughts to a document. I think I won't get to implementing this for a
while, but it would really help our server performance.

http://www.port389.org/docs/389ds/design/logging-performance-improvement.html
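
The producer/consumer idea behind the design can be sketched like this,
with Python's stdlib Queue standing in for the lock-free liblfds queue
(the real design lives in C inside the server):

import queue
import sys
import threading

log_q = queue.Queue()        # stand-in for the lock-free (liblfds) queue

def log_writer(fh):
    # Single writer thread: drains the queue and does the (slow) I/O,
    # so worker threads never block on the log file.
    while True:
        line = log_q.get()
        if line is None:     # shutdown sentinel
            break
        fh.write(line + "\n")
        fh.flush()

writer = threading.Thread(target=log_writer, args=(sys.stdout,))
writer.start()

# Worker threads just enqueue and move on.
for i in range(3):
    log_q.put("conn=%d op=0 RESULT err=0" % i)

log_q.put(None)
writer.join()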



--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Please review 48891 (2nd) : ns-slapd crashes during the shutdown after adding attribute with a matching rule

2016-06-23 Thread thierry bordaz
The previous fix was committed. It prevented a crash but is incomplete 
because of an invalid access to a freed buffer 
(https://fedorahosted.org/389/ticket/48891#comment:9).


This is an additional part of the fix
https://fedorahosted.org/389/attachment/ticket/48891/0002-Ticket-48891-ns-slapd-crashes-during-the-shutdown-af.patch 
--

389-devel mailing list
389-devel@lists.fedoraproject.org
https://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Re: Please review: ticket #48766 Replication changelog can incorrectly skip over updates

2016-05-25 Thread thierry bordaz



On 05/25/2016 08:47 AM, Ludwig Krispenz wrote:


On 05/24/2016 07:00 PM, thierry bordaz wrote:



On 05/24/2016 05:24 PM, Ludwig Krispenz wrote:


On 05/24/2016 04:20 PM, thierry bordaz wrote:

Hi Ludwig,

Thanks for your explanation. The design looks very good. I think it 
would be good to put references to the related design paragraphs into 
the code (especially clcache_adjust_anchorcsn).


There is something I do not understand in clcache_skip_change.
My understanding is that this is the only place where the 
consumer_maxcsn is updated.
But there are several conditions under which, if we decide to skip the 
update, the consumer_maxcsn is not updated.

One of them is 'rid == buf->buf_consumer_rid'.
Does that mean that the consumer_maxcsn remains unchanged for the 
RID of the consumer ?


the condition is:
if( rid == buf->buf_consumer_rid && buf->buf_ignoreConsumerRID)

so it will only be skipped if we determined that we don't need to 
send anything for the consumers own rid


Ok. I get it. thanks.


Another question regarding the update of buf_consumer_ruv.
My understanding is that it contains the initial consumer RUV.
But later it is used in 
clcache_refresh_consumer_maxcsns/clcache_refresh_local_maxcsn to fill 
consumer_maxcsn.


consumer_maxcsn is updated with the non-skipped updates to reflect 
the current status of the consumer.
But when clcache_refresh_consumer_maxcsns/clcache_refresh_local_maxcsn are 
called, my understanding is that consumer_maxcsn is reset from 
buf_consumer_ruv (the initial consumer RUV).

Do I get it right?

No :-) At least I think I have implemented it differently.
The consumer_maxcsn is set in clcache_refresh_consumer_maxcsns(), 
which is only called at the first buffer load, not at the reload.
In clcache_refresh_local_maxcsns it is only set if we have to add a 
RID to the cscb list, but then no change for this RID was sent before.


OK, thanks for the explanation.
I was wondering if consumer_maxcsn could go back to buf_consumer_ruv, but 
in those cases (first load and first update on a known RID) it is right.


Just minor comments on clcache_load_buffer:

 * 'rc' should be tested against CLC_STATE_* values, not 0.
 * if clcache_initial_anchorcsn or clcache_adjust_anchorcsn return,
   let's say, CLC_STATE_CSN_GT_RUV, then clcache_load_buffer will
   return a CLC_STATE_* value, not a DB_* value.
 * clcache_refresh_consumer_maxcsns is only called at the first load; why
   not include/inline it into clcache_initial_anchorcsn?


Thanks
thierry



thanks
thierry




thanks
thierry




On 05/24/2016 09:22 AM, Ludwig Krispenz wrote:

Hi,

On 05/23/2016 06:29 PM, thierry bordaz wrote:



On 05/23/2016 03:06 PM, Ludwig Krispenz wrote:
This is the latest version of the "changelog buffer processing" 
fixes.



https://fedorahosted.org/389/ticket/48766

https://fedorahosted.org/389/attachment/ticket/48766/0001-reworked-clcach-buffer-code-following-design-at-http.patch 



The background for the fix is here, I would like to get feedback 
on this as well to clarify what is unclear
http://www.port389.org/docs/389ds/design/changelog-processing-in-repl-state-sending-updates.html 





Hello Ludwig,

I have not yet reviewed the patch. I was looking at the design.

Regarding your note: 
http://www.port389.org/docs/389ds/design/changelog-processing-in-repl-state-sending-updates.html#special-case-rid-of-the-consumer-in-the-current-replication-session.

If you refer to this part:


  Special case: RID of the consumer in the current
  replication session

If the consumer in the replication session is also a master its 
RID will be contained at least in the consumerRUV. If it is also 
in the supplier RUV the question is if it should be considered in 
the decision if updates should be sent. Normally a master has the 
latest changes applied to itself, so there would be no need to 
check and send updates for its RID. But there can be scenarios 
where this is not the case: if the consumer has been restored from 
an older backup the latest csn for its own RID might be older than 
changes available on other servers.


NOTE: The current implementation ignores anchorCSNs based on the 
consumer RID. If, by chance, the anchor csn used is older than 
this csn, the changes will be sent, but they can also be lost.


this refers to the "current" implementation before the fix; the 
doc started as a post-design doc, and it should be corrected.
With the fix, if the supplier has newer changes for the 
consumerRID than the consumer, it will be reflected in the anchor 
csn calculation.





It is said that the anchorCSN will not be from the consumerRID. 
What is the mechanism that guarantees that the consumer will 
receive all the updates it originated?


thanks
thierry
--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org 



--
Red Hat GmbH, http://www.de.redhat.com/, R

[389-devel] Please review (2nd): 48836 replication session fails because of permission denied

2016-05-18 Thread thierry bordaz
Thanks Ludwig and Noriko for the review. You are right 
replica_updatedn_list_ismember was the function to fix.


https://fedorahosted.org/389/ticket/48836

https://fedorahosted.org/389/attachment/ticket/48836/0002-Ticket-48836-replication-session-fails-because-of-pe.patch 
--

389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Please review: 48836 replication session fails because of permission denied

2016-05-17 Thread thierry bordaz

https://fedorahosted.org/389/ticket/48836

https://fedorahosted.org/389/attachment/ticket/48836/0001-Ticket-48836-replication-session-fails-because-of-pe.patch 
--

389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org


[389-devel] Please review 48597 (take 2): Deadlock when rebuilding the group of authorized replication managers

2016-03-30 Thread thierry bordaz

Following Noriko's recommendations from first review

https://fedorahosted.org/389/ticket/48597

https://fedorahosted.org/389/attachment/ticket/48597/0002-Ticket-48597-Deadlock-when-rebuilding-the-group-of-a.patch 
--

389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

[389-devel] Please review 48497: extended search without MR indexed attribute prevents later indexing with that MR

2016-03-07 Thread thierry bordaz

ticket: https://fedorahosted.org/389/ticket/48497

patch: 
https://fedorahosted.org/389/attachment/ticket/48497/0001-ticket-48497-extended-search-without-MR-indexed-attr.patch


CI test: 
https://fedorahosted.org/389/attachment/ticket/48497/ticket48497_test.py
--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

[389-devel] Please review 48746: Crash when indexing an attribute with a matching rule

2016-03-04 Thread thierry bordaz

Hello,

This patch fixes two tickets, as it is difficult to split the patch into 
two parts, one for each ticket.


Patch is 
https://fedorahosted.org/389/attachment/ticket/48746/0002-Ticket-48746-Crash-when-indexing-an-attribute-with-a.patch


One ticket is a crash:
https://fedorahosted.org/389/ticket/48746

test case: 
https://fedorahosted.org/389/attachment/ticket/48746/ticket48746_test.py



One is for an invalid indexing of a matching rule

https://fedorahosted.org/389/ticket/48745

test case: 
https://fedorahosted.org/389/attachment/ticket/48745/ticket48745_test.py


thanks
thierry
--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

[389-devel] Please review 48420: change severity of some messages related to "keep alive" entries

2016-03-01 Thread thierry bordaz

https://fedorahosted.org/389/ticket/48420

https://fedorahosted.org/389/attachment/ticket/48420/0001-Ticket-48420-change-severity-of-some-messages-relate.patch 

--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

[389-devel] Please review 48270: fail to index an attribute with a specific matching rule

2016-02-05 Thread thierry bordaz

https://fedorahosted.org/389/ticket/48270

https://fedorahosted.org/389/attachment/ticket/48270/0001-Ticket-48270-fail-to-index-an-attribute-with-a-speci.patch

--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

[389-devel] Please review 48362: With exhausted range, part of DNA shared configuration is deleted after server restart

2015-12-08 Thread thierry bordaz

ticket: https://fedorahosted.org/389/ticket/48362

fix: 
https://fedorahosted.org/389/attachment/ticket/48362/0001-Ticket-48362-With-exhausted-range-part-of-DNA-shared.patch


testcase: 
https://fedorahosted.org/389/attachment/ticket/48362/ticket48362_test.py
--
389-devel mailing list
389-devel@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/389-devel@lists.fedoraproject.org

Re: [389-devel] Please review ticket 47976: deadlock in mep delete post op

2015-11-04 Thread thierry bordaz

Hi Ludwig, Rich,

You were right (as usual!): dblayer_get_pvt_txn/dblayer_push_pvt_txn 
are doing the magic.

Now the fix is a one-line fix!

https://fedorahosted.org/389/attachment/ticket/47976/0001-Ticket-47976-2-deadlock-in-mep-delete-post-op.patch

thanks
thierry
On 11/03/2015 06:01 PM, Ludwig Krispenz wrote:

Hi Thierry,

we already had started to discuss on IRC, but here are my thoughts again.

Is it necessary to explicitly set the txn in the plugin? The txn 
will be found when ldbm_back_delete() does dblayer_txn_begin() and it 
checks the per-thread stack of txns.
In my opinion the real problem is not setting the txn in id2entry, 
which will then try to read a locked page.


Ludwig
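
A toy model of that per-thread txn stack idea (hypothetical names, Python
for brevity; the real code is the C dblayer in the server):

import threading

_txn_stack = threading.local()   # each thread gets its own stack

def _stack():
    if not hasattr(_txn_stack, 'items'):
        _txn_stack.items = []
    return _txn_stack.items

def txn_begin():
    # A nested begin picks up the current thread's innermost txn as its
    # parent automatically, so a plugin never has to pass it explicitly.
    parent = _stack()[-1] if _stack() else None
    txn = {'parent': parent}
    _stack().append(txn)
    return txn

def txn_end():
    _stack().pop()

outer = txn_begin()
inner = txn_begin()
assert inner['parent'] is outer   # the nested txn found its parent
txn_end(); txn_end()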

On 11/03/2015 05:40 PM, thierry bordaz wrote:

ticket https://fedorahosted.org/389/ticket/47976

fix 
https://fedorahosted.org/389/attachment/ticket/47976/0001-Ticket-47976-deadlock-in-mep-delete-post-op.patch


test case: 
https://fedorahosted.org/389/attachment/ticket/47976/0002-Ticket-47976-test-case.patch




--
389-devel mailing list
389-devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-devel





[389-devel] Please review ticket 47976: deadlock in mep delete post op

2015-11-03 Thread thierry bordaz

ticket https://fedorahosted.org/389/ticket/47976

fix 
https://fedorahosted.org/389/attachment/ticket/47976/0001-Ticket-47976-deadlock-in-mep-delete-post-op.patch


test case: 
https://fedorahosted.org/389/attachment/ticket/47976/0002-Ticket-47976-test-case.patch



Re: [389-devel] DSAdmin tests and basic functionality in the lib389

2015-10-19 Thread thierry bordaz

On 10/19/2015 04:18 PM, Mark Reynolds wrote:



On 10/19/2015 10:02 AM, Simon Pichugin wrote:

Hi team,

I am working now on the fixing lib389 broken tests:
https://fedorahosted.org/389/ticket/48303

And it's time for dsadmin_* tests. Can anybody, please, tell me more
about it?
As I see, Mark and Thierry worked on it, but any other team
members are welcomed too. :)
I think all I did for it was to make it Python 3 compliant.  I think 
it was a work in progress for a future CLI interface. Thierry can 
probably answer this for sure, but it is definitely not being used at 
the moment.


Hello Simon, Marc,

lib389 comes from https://github.com/richm/dsadmin.

After some time the dsadmin name was removed from lib389, but you can still 
find some references to it.
The lib389 tests have been moved to component tests (replica, backend, 
index, ...) and I think dsadmin is deprecated in those tests.

You may rename some of the dsadmin tests to the component they are testing.
bug_harness.py was used in dsadmin but is no longer used in lib389. If 
you can remove the dependency on it (if any exists), that would be good.



thanks
thierry


I have a few questions:
1) What should we do with the dsadmin_* tests and their coverage?
   - for example, we have "TypeError: 'NoneType' object is not callable" after
 - topology.conn.replica.changelog and
 - topology.conn.backend.add
   - or "AttributeError: DirSrv has no attribute 'getMTEntry'" after
 - topology.conn.getMTEntry('o=MISSING')

And these are only a few, revealed after the first run.

2) Should we rename dsadmin_* tests somehow?  (there is no dsadmin 
project anymore)


3) Do we need bug_harness.py or is it obsolete?

Again,  I think Thierry can answer these questions best.

Regards,
Mark


Please, provide me with details.

Thanks,
Simon

[389-devel] Please review Ticket 47978: Deadlock between two MODs on the same entry between entry cache and backend lock

2015-10-16 Thread thierry bordaz

https://fedorahosted.org/389/attachment/ticket/47978/0001-Ticket-47978-Deadlock-between-two-MODs-on-the-same-e.patch

https://fedorahosted.org/389/ticket/47978

There is a FreeIPA CI test that reproduces the hang (almost 
systematically): test_vault.py

[389-devel] Please review ticket 48266 (3rd): Fractional replication evaluates several times the same CSN

2015-09-17 Thread thierry bordaz


Thank you Noriko and Simon for your careful reviews.
I modified the fix:

 * create the subentry with 'cn' value 'repl keep alive '
 * slapi_ch_free without cast
 * search the keep alive entry with '(&(objectclass=ldapsubentry)(cn=%s %d))';
   in the previous fix I forgot the '%d' (see the sketch below)
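
A hypothetical illustration of how such a filter could be built (the 
'repl keep alive' prefix and the replica-id argument are assumptions 
based on the cn naming above; the real fix is C code in the replication 
plugin):

def keep_alive_filter(replica_id, prefix='repl keep alive'):
    # '%s %d' as in the fix: subentry name plus replica id
    return '(&(objectclass=ldapsubentry)(cn=%s %d))' % (prefix, replica_id)

print(keep_alive_filter(1))
# -> (&(objectclass=ldapsubentry)(cn=repl keep alive 1))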

I modified the test case following the recommended modifications

https://fedorahosted.org/389/ticket/48266

https://fedorahosted.org/389/attachment/ticket/48266/0001-Ticket-48266-3-Fractional-replication-evaluates-severa.patch

https://fedorahosted.org/389/attachment/ticket/48266/0001-Ticket-48266-2-test-case.patch

[389-devel] Please review ticket 48266 (4th): Fractional replication evaluates several times the same CSN

2015-09-17 Thread thierry bordaz


Thanks to Noriko, who ran Coverity tests: the previous fix was invalid.
Changing the call slapi_ch_free into slapi_ch_free_string.

https://fedorahosted.org/389/attachment/ticket/48266/0001-Ticket-48266-4-Fractional-replication-evaluates-severa.patch 


[389-devel] Please review ticket 48266: Fractional replication evaluates several times the same CSN

2015-09-15 Thread thierry bordaz

Thank you so much Rich and Noriko for your feedback.
I was late to take it into account and I am sorry for the delay.
Here is the last patch and the associated test case.


https://fedorahosted.org/389/ticket/48266

https://fedorahosted.org/389/attachment/ticket/48266/0001-Ticket-48266-2-Fractional-replication-evaluates-severa.patch

https://fedorahosted.org/389/attachment/ticket/48266/0001-Ticket-48266-test-case.patch

Re: [389-devel] [lib389] Deref control advice needed

2015-09-02 Thread thierry bordaz

On 08/27/2015 02:31 AM, Rich Megginson wrote:

On 08/26/2015 03:28 AM, William Brown wrote:

In relation to ticket 47757, I have started work on a deref control for
Noriko.
The idea is to get it working in lib389, then get it upstreamed into pyldap.

At this point it's all done, except that the actual request control doesn't
appear to work. Could one of the lib389 / ldap python experts cast their eye
over this and let me know where I've gone wrong?

I have improved this, but am having issues with the asn1spec for ber decoding.

I have attached the updated patch, but specifically the issue is in _controls.py

I would appreciate it if anyone could take a look at this and let me know if
there is something I have missed.


Not sure, but here is some code I did without using pyasn:
https://github.com/richm/scripts/blob/master/derefctrl.py
This is quite old by now, and is probably bit rotted with respect to 
python-ldap and python3.




Old!! But it worked like a charm for me. I just had to make this modification 
because of a change in python-ldap, IIRC:


   diff derefctrl.py /tmp/derefctrl_orig.py
   0a1
   >
   151,152c152
   < self.criticality,self.derefspeclist,self.entry = criticality,derefspeclist or [],None
   < #LDAPControl.__init__(self,DerefCtrl.controlType,criticality,derefspeclist)
   ---
   > LDAPControl.__init__(self,DerefCtrl.controlType,criticality,derefspeclist)
   154c154
   < def encodeControlValue(self):
   ---
   > def encodeControlValue(self,value):
   156c156
   < for (derefattr,attrs) in self.derefspeclist:
   ---
   > for (derefattr,attrs) in value:



"""
  controlValue ::= SEQUENCE OF derefRes DerefRes

  DerefRes ::= SEQUENCE {
  derefAttr   AttributeDescription,
  derefValLDAPDN,
  attrVals[0] PartialAttributeList OPTIONAL }

  PartialAttributeList ::= SEQUENCE OF
 partialAttribute PartialAttribute
"""

class DerefRes(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('derefAttr', AttributeDescription()),
        namedtype.NamedType('derefVal', LDAPDN()),
        namedtype.OptionalNamedType('attrVals', PartialAttributeList()),
    )

class DerefResultControlValue(univ.SequenceOf):
    componentType = DerefRes()





    def decodeControlValue(self, encodedControlValue):
        self.entry = {}
        # decodedValue,_ = decoder.decode(encodedControlValue, asn1Spec=DerefResultControlValue())
        # Gets the error: TagSet(Tag(tagClass=0, tagFormat=32, tagId=16),
        # Tag(tagClass=128, tagFormat=32, tagId=0)) not in asn1Spec:
        # {TagSet(Tag(tagClass=0, tagFormat=32, tagId=16)): PartialAttributeList()}/{}
        decodedValue, _ = decoder.decode(encodedControlValue)
        print(decodedValue.prettyPrint())
        # Pretty print yields
        # Sequence:                       <-- sequence of
        #  =Sequence:                     <-- derefRes
        #   =uniqueMember                 <-- derefAttr
        #   =uid=test,dc=example,dc=com   <-- derefVal
        #   =Sequence:                    <-- attrVals
        #    =uid
        #    =Set:
        #     =test
        # For now, while asn1spec is sad, we'll just rely on it being well formed.
        # However, this isn't good, as without the asn1spec we seem to actually
        # be dropping values.
        for result in decodedValue:
            derefAttr, derefVal, _ = result
            self.entry[str(derefAttr)] = str(derefVal)
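
The TagSet in the error above mentions tagClass=128 (context class) with 
tagId=0, which matches the '[0]' on attrVals in the ASN.1; a minimal 
sketch of how that OPTIONAL component could carry the tag so the 
asn1Spec-based decode can match (the OCTET STRING stand-ins and the 
simplified PartialAttributeList are assumptions, not the draft's full 
definitions):

from pyasn1.type import univ, namedtype, tag

class AttributeDescription(univ.OctetString):
    pass

class LDAPDN(univ.OctetString):
    pass

class PartialAttributeList(univ.SequenceOf):
    componentType = univ.Any()  # simplified; really SEQUENCE OF PartialAttribute

class DerefRes(univ.Sequence):
    componentType = namedtype.NamedTypes(
        namedtype.NamedType('derefAttr', AttributeDescription()),
        namedtype.NamedType('derefVal', LDAPDN()),
        # '[0] ... OPTIONAL' is an implicit context tag in BER/DER here,
        # hence Tag(tagClass=128, tagFormat=32, tagId=0) on the wire.
        namedtype.OptionalNamedType('attrVals', PartialAttributeList().subtype(
            implicitTag=tag.Tag(tag.tagClassContext, tag.tagFormatConstructed, 0))),
    )

class DerefResultControlValue(univ.SequenceOf):
    componentType = DerefRes()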




[389-devel] please review: Ticket 48249 - sync_repl uuid may be invalid

2015-08-14 Thread thierry bordaz

https://fedorahosted.org/389/attachment/ticket/48249/0001-Ticket-48249-sync_repl-uuid-may-be-invalid.patch

https://fedorahosted.org/389/ticket/48249

[389-devel] Please review ticket 48127: Using RPM, allows non root user to create/remove DS instance

2015-03-13 Thread thierry bordaz

https://fedorahosted.org/389/ticket/48127

https://fedorahosted.org/389/attachment/ticket/48127/0001-Ticket-48127-Using-RPM-allows-non-root-user-to-creat.patch 


[389-devel] Please review ticket 47936: Create a global lock to serialize write operations over several backends

2015-02-26 Thread thierry bordaz

Hello,

   This fix makes it possible to protect several backends with a global lock
   during update operations.
   It is configurable, and by default the global lock is not enabled.
   Under rare conditions some plugins may lead to deadlocks. The global lock
   can be enabled to prevent those deadlocks during critical phases, or to
   limit production impact during investigations (see the sketch below).

 I will do some performance measurement with this fix.
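
A minimal sketch of the behaviour being added (illustrative Python, not 
the server's C implementation; the names are hypothetical):

import threading

GLOBAL_WRITE_LOCK = threading.Lock()
GLOBAL_LOCK_ENABLED = False   # matches the 'not enabled by default' behaviour

def write_op(backend_apply, op):
    # When enabled, one server-wide lock serializes every update,
    # whatever backend it targets, so plugins that touch several
    # backends can no longer interleave into a deadlock.
    if GLOBAL_LOCK_ENABLED:
        with GLOBAL_WRITE_LOCK:
            return backend_apply(op)
    # Default: each backend keeps its own finer-grained locking.
    return backend_apply(op)

print(write_op(lambda op: 'applied %s' % op, 'MOD'))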

https://fedorahosted.org/389/ticket/47936

https://fedorahosted.org/389/attachment/ticket/47936/0001-Ticket-47936-Create-a-global-lock-to-serialize-write.patch

thanks
thierry

[389-devel] Please review ticket 47988: Schema learning mechanism, in replication, unable to extend an existing definition

2015-01-23 Thread thierry bordaz

Ticket https://fedorahosted.org/389/ticket/47988

Patch: 
https://fedorahosted.org/389/attachment/ticket/47988/0001-Ticket-47988-Schema-learning-mechanism-in-replicatio.patch


Test case: 
https://fedorahosted.org/389/attachment/ticket/47988/0001-Ticket-47988-test-case.patch

[389-devel] Please review ticket 47828 (2nd): DNA scope: allow to exclude some subtrees

2014-12-17 Thread thierry bordaz


Hello ,

Taking into account Mark's review (dnaExcludeScope:2.16.840.1.113730.3.1.2312)

   Bug: https://fedorahosted.org/389/ticket/47828
   Fix:
   
https://fedorahosted.org/389/attachment/ticket/47828/0002-Ticket-47828-DNA-scope-allow-to-exlude-some-subtrees.patch
   Test:
   
https://fedorahosted.org/389/attachment/ticket/47828/0001-Ticket-47828-test-case.patch

Thanks
thierry

[389-devel] Please review ticket 47942 (3rd): DS hangs during online total update

2014-12-11 Thread thierry bordaz


Hello,

   The ticket https://fedorahosted.org/389/ticket/47942 was already
   reviewed by Mark, Rich and pj101.

   The new patch takes into account the following points:

 * incorrect indentation
 * the window/pause configuration attributes are common to total and
   incremental updates
 * improved logging so that flow control is not too noisy when triggered
   (see the sketch below):
   with the normal logging level, the first flow control event (total or
   incremental) is written as FATAL;
   if there are others during the session, the total number of flow
   control events is logged.
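
A hypothetical sketch of the window/pause behaviour (the parameter names 
and the consumer interface are assumptions, not the patch's actual code):

import time

class DummyConsumer:
    """Stand-in consumer that acknowledges everything it was sent."""
    def __init__(self):
        self.pending = 0
    def send(self, update):
        self.pending += 1
    def acked(self):
        n, self.pending = self.pending, 0
        return n

def send_updates(updates, consumer, window=10, pause=0.01):
    in_flight = 0
    flow_control_events = 0
    for upd in updates:
        # Window full: flow control triggers and the supplier pauses
        # instead of flooding the consumer.
        while in_flight >= window:
            flow_control_events += 1
            time.sleep(pause)
            in_flight -= consumer.acked()
        consumer.send(upd)
        in_flight += 1
    # Reported once per session so the log stays quiet.
    return flow_control_events

print(send_updates(range(100), DummyConsumer()))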

   
https://fedorahosted.org/389/attachment/ticket/47942/0004-Ticket-47942-DS-hangs-during-online-total-update.patch

   I opened the ticket https://fedorahosted.org/389/ticket/47975 for
   further improvements of flow control:

 * better default values and a procedure to tune them
 * automatic tuning (based on the consumer processing rate)
 * the ability to monitor flow control events

   Thanks
   thierry


[389-devel] Please review for lib389 ticket 47691 (3rd): lib389 support for RPM builds of 389-ds

2014-11-24 Thread thierry bordaz

Hello,

This fix allows running the 389-DS CI tests when 389-ds is deployed as an RPM.

It takes into account previous reviews done by Rich (no use of 'sudo' or 
'systemctl', use of the 'nobody' user).
It also checks that 127.0.0.1 is resolved as localhost.localdomain, which 
is required by setup-ds.pl.
The tests can be launched either as 'root' ('cd 
ds/dirsrvtests/tickets; py.test') or with sudo ('cd 
ds/dirsrvtests/tickets; sudo py.test').


https://fedorahosted.org/389/attachment/ticket/47691/0001-Ticket-47691-3-using-lib389-with-RPMs.patch

thanks
thierry


[389-devel] Please review for lib389 ticket 47691 (2nd): lib389 support for RPM builds of 389-ds

2014-11-12 Thread thierry bordaz

Hello,

   This fix allows running the 389-DS CI tests when 389-ds is deployed as an RPM.

   It takes into account previous remarks made by Rich (no use of
   'sudo' or 'systemctl').
   The tests can be launched either as 'root' ('cd
   ds/dirsrvtests/tickets; py.test') or with sudo ('cd
   ds/dirsrvtests/tickets; sudo py.test').

   Currently some ticket tests are failing. This is because those tests
   need to be fixed with respect to the master branch.
   Fixing those tests will be done in a separate ticket.

https://fedorahosted.org/389/ticket/47691

https://fedorahosted.org/389/attachment/ticket/47691/0001-Ticket-47691-2-using-lib389-with-RPMs.patch 


[389-devel] Please review ticket 47553: MODDN aci should be disabled by default

2014-10-30 Thread thierry bordaz

Hello,

   To be backward compatible, the processing of 'moddn' ACIs should be
   disabled by default, so the 'add' permission will still allow moving
   (moddn) an entry under the target subtree.
   The new 'moddn' permission in ACIs will be supported (such ACIs are
   not rejected) but not enforced during moddn.

   thanks
   thierry

   https://fedorahosted.org/389/ticket/47553

   
https://fedorahosted.org/389/attachment/ticket/47553/0001-Ticket-47553-Enhance-ACIs-to-have-more-control-over-.4.patch


[389-devel] Please review ticket 47901: After total init, nsds5replicaLastInitStatus can report an erroneous error status (like 'Referral')

2014-10-30 Thread thierry bordaz

https://fedorahosted.org/389/ticket/47901

https://fedorahosted.org/389/attachment/ticket/47901/0001-Ticket-47901-After-total-init-nsds5replicaLastInitSt.patch

[389-devel] Please review ticket 47553: new MODRDN acis trigger double 'n' in getEffectiveRights

2014-10-16 Thread thierry bordaz

https://fedorahosted.org/389/ticket/47553

https://fedorahosted.org/389/attachment/ticket/47553/0001-Ticket-47553-Enhance-ACIs-to-have-more-control-over-.3.patch

https://fedorahosted.org/389/attachment/ticket/47553/0002-Ticket-47553-test-case.patch

Re: [389-devel] 389-DS plugin: need some help on the design

2014-09-25 Thread thierry bordaz

On 09/25/2014 04:32 AM, Rich Megginson wrote:

On 09/24/2014 04:33 AM, thierry bordaz wrote:

Hello,

I was investigating the alternatives/impacts of a new plugin and I
would like to share some thoughts and check I did not miss
anything important.

Here is the description of the problem we want to address. In an MMR
topology, we have an entry containing a single-valued attribute.
It is an integer-syntax attribute. Our need is that the attribute
can only be increased. So if its initial value is 5, an update
MOD/REPL '6' is valid and applied, while MOD/REPL '3' is invalid
and rejected/ignored. Also, being in MMR, the attribute can be
updated on several instances.

The current approach is to create a BE_PREOP or BE_TXN_PREOP
plugin. This allows retrieving the current value from the pblock
(SLAPI_ENTRY_PRE_OP) and guarantees the value is exact, as only
one operation is processed at a time.

The plugin registers a mod operation callback. It compares the
new_value vs the current_value to check that new_value >
current_value. The plugin will update the mods, in particular
translating a MOD/REPL into a MOD/DEL(current value) +
MOD/ADD(new_value).

Regarding the change of the MODS (mod/repl -> mod/del + mod/add),
the plugin should be a BE_PREOP. This is because MODS are applied
after BE_PREOP plugins, then new MODS added by BE_TXN_PREOP
plugins are applied. A BE_TXN_PREOP plugin may translate mod/repl
-> mod/del + mod/add, but it is too late: mod/repl has already been
applied after the BE_PREOP plugins were called.

Regarding replication: for non-replicated updates, the plugin should
just reject (unwilling to perform) ops with new_value <
current_value. For replicated updates I see two cases ([server /
csn / attribute value]): [A/csnA/valueA] and [B/csnB/valueB], where
the expected final value is ValueB+csnB.

 1. csnA < csnB and ValueA < ValueB.
    1. When server A receives csnB/valueB, this is fine as
       ValueB > ValueA. But to know that ValueB will be selected
       the plugin needs to check that csnB > csnA.
    2. When server B receives csnA/valueA it has 3 possibilities:
       1. reject (unwilling to perform) the update. But then
          replication A->B will fail indefinitely.
       2. erase the update. For example the plugin could erase
          the mod from the set of mods.
       3. let the operation continue: because csnA < csnB, the
          kept value will be ValueB. Here again the plugin
          needs to check csnA vs csnB.
 2. csnA > csnB and ValueA < ValueB.
    1. When server A receives csnB/valueB, this is fine as
       ValueB > ValueA. But to know that ValueB will be selected
       the plugin needs to check that csnB < csnA.
    2. When server B receives csnA/valueA it has 2 possibilities:
       1. reject (unwilling to perform) the update. But then
          replication A->B will fail indefinitely.
       2. erase the update. For example the plugin could erase
          the mod from the set of mods.

So I think the plugin should not rely on the new_value present in
the operation but rather compute the final_value (taking into
account the CSN).
If the final_value >= current_value, it lets the operation go on
(even if the new_value in the operation is < current_value). If the
final_value < current_value, it should remove the mod from the
mods (case 2.2.2) and likely log a message. A minimal sketch of this
decision rule follows.
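
Illustrative Python, not the actual slapi C plugin; the (value, csn)
tuples and the return labels are assumptions:

def decide(current, incoming, replicated=True):
    """current/incoming are (value, csn); the higher csn wins URP."""
    cur_val, cur_csn = current
    new_val, new_csn = incoming
    if not replicated:
        # Direct update: simply refuse any decrease (unwilling to perform).
        return 'apply' if new_val > cur_val else 'reject'
    # Replicated update: first compute the value URP would keep...
    final_val = new_val if new_csn > cur_csn else cur_val
    # ...and only let the mod through if the kept value never decreases.
    return 'apply' if final_val >= cur_val else 'strip_mod'

# Case 1.2.3: older CSN carries the lower value; URP keeps the current one.
print(decide(current=(6, 20), incoming=(3, 10)))   # apply
# Case 2.2.2: newer CSN carries a lower value; strip the mod instead.
print(decide(current=(6, 20), incoming=(3, 30)))   # strip_mod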



What happens if ValueA == ValueB and csnA != csnB?  Do we want to 
allow the same value to be issued by two different servers?  Is this a 
case, as with DNA and uidNumber, where we assign servers ranges?


That is a good question and so far I still need confirmation.
This is a case with OTP updating the HOTPcounter/TOTPwatermark.
If a bind happens with a given new HOTPcounter value, it will trigger 
an internal mod on an entry (related to the bindDN) to update this counter.
IMHO we can have parallel binds with the same counter, on different servers 
or on the same server. In both cases the CSNs will be different but 
the value identical.


thanks Rich
thierry





Changing MOD/REPL into MOD/DEL + MOD/ADD is a possibility, but since the
attribute is single-valued I think it is not mandatory.

Thanks
thierry





Re: [389-devel] Please review lib389: start/stop may hang indefinitely

2014-09-08 Thread thierry bordaz

On 09/05/2014 06:50 PM, Rich Megginson wrote:

On 09/05/2014 10:32 AM, thierry bordaz wrote:

On 09/05/2014 01:10 PM, thierry bordaz wrote:
Detected with test case 47838, which defines ciphers not recognized 
during SSL init. The 47838 test case makes the full test suite hang.




Hello,

Rich pointed out to me that the indentation was bad in the second part of 
the fix. I was mistakenly using tabs instead of spaces.

Here is a better fix


ack


Thanks Rich.

c05ac0fc658fddc521783dcc1327f7eb7687da9a

Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 464 bytes, done.
Total 4 (delta 3), reused 0 (delta 0)
To ssh://git.fedorahosted.org/git/389/lib389.git/
   1103600..c05ac0f  master -> master





thanks
thierry



[389-devel] Please review Ticket 47871: 389-ds-base-1.3.2.21-1.fc20 crashed over the weekend

2014-08-21 Thread thierry bordaz

Ticket https://fedorahosted.org/389/ticket/47871
Fix: 
https://fedorahosted.org/389/attachment/ticket/47871/0001-Ticket-47871-389-ds-base-1.3.2.21-1.fc20-crashed-ove.patch
Test case: 
https://fedorahosted.org/389/attachment/ticket/47871/0002-Ticket-47871-Test-case-389-ds-base-1.3.2.21-1.fc20-c.patch


Thanks
thierry

[389-devel] Please review ticket 47797: DB deadlock when two threads (on separated backend) try to record changes in retroCL

2014-07-10 Thread thierry bordaz

https://fedorahosted.org/389/ticket/47797

https://fedorahosted.org/389/attachment/ticket/47797/0001-Ticket-47797-DB-deadlock-when-two-threads-on-separat.patch 


[389-devel] Please review (2nd) ticket 47823: RFE enforce attributes uniqueness over several subtrees

2014-07-08 Thread thierry bordaz

Ticket is https://fedorahosted.org/389/ticket/47823

Fix is 
https://fedorahosted.org/389/attachment/ticket/47823/0002-Ticket-47823-attribute-uniqueness-enforced-on-all-su.patch

Taking into account Rich's review:

 * use of (Slapi_DN *) rather than (char *)
 * slapi_ch_free_string rather than slapi_ch_free for (char *) strings


Test case is 
https://fedorahosted.org/389/attachment/ticket/47823/0002-Ticket-47823-test-case-enforce-attributes-uniqueness.patch 


[389-devel] please review ticket 47787: A replicated MOD fails (Unwilling to perform) if it targets a tombstone

2014-04-25 Thread thierry bordaz

Bug is: https://fedorahosted.org/389/ticket/47787

Fix is : 
https://fedorahosted.org/389/attachment/ticket/47787/0001-Ticket-47787-A-replicated-MOD-fails-Unwilling-to-per.patch


Test case: 
https://fedorahosted.org/389/attachment/ticket/47787/0001-Ticket-47787-test-case.patch

[389-devel] please review ticket 47721: Schema replication issue (follow up and cleanup)

2014-04-22 Thread thierry bordaz
Core fix for https://fedorahosted.org/389/ticket/47721 is already 
reviewed/pushed.


Additional cleanup follows up on the fix and on the CI tests that are broken:

https://fedorahosted.org/389/attachment/ticket/47721/0001-Ticket-47721-Schema-Replication-Issue-follow-up-clea.patch 


[389-devel] please review ticket 47721: Schema replication issue

2014-04-08 Thread thierry bordaz
Design is 
http://directory.fedoraproject.org/wiki/Replication_of_custom_schema_(ticket_47721)


Fix is 
https://fedorahosted.org/389/attachment/ticket/47721/0001-Ticket-47721-Schema-Replication-Issue.patch


Test case is 
https://fedorahosted.org/389/attachment/ticket/47721/0001-Ticket-47721-test-case.patch

[389-devel] Please review ticket 47553: Enhance ACIs to have more control over MODRDN operations

2014-03-17 Thread thierry bordaz
The design for this ticket is 
http://directory.fedoraproject.org/wiki/Access_control_on_trees_specified_in_MODDN_operation


The fix is 
https://fedorahosted.org/389/attachment/ticket/47553/0001-Ticket-47553-Enhance-ACIs-to-have-more-control-over-.patch


The test case is: 
https://fedorahosted.org/389/attachment/ticket/47553/0001-Ticket-47553-test-case.patch

Re: [389-devel] Design review: Access control on entries specified in MODDN operation (ticket 47553)

2014-02-25 Thread thierry bordaz

On 02/24/2014 11:35 PM, Rich Megginson wrote:

On 02/24/2014 02:47 PM, Noriko Hosoi wrote:

Rich Megginson wrote:

On 02/24/2014 09:00 AM, thierry bordaz wrote:

Hello,

IPA team filed this ticket:
https://fedorahosted.org/389/ticket/47553.

It requires an ACI improvement so that, during a MODDN, a given
user is only allowed to move an entry from one specified part
of the DIT to another specified part of the DIT, without
the need to grant the ADD permission.

Here is the design of what could be implemented to support this
need
http://port389.org/wiki/Access_control_on_trees_specified_in_MODDN_operation

regards
thierry



Since this is not related to any Red Hat internal or customer 
information, we should move this discussion to the 389-devel list.



Hi Thierry,

Your design looks good.  A minor question: the doc does not mention 
deny.  For instance, in your example DIT, can I allow 
moddn_to and moddn_from on the top dc=example,dc=com and deny 
them on cn=tests?  Then I can move an entry between cn=accounts 
and cn=staging, but not to/from cn=tests?  Or is deny not supposed to 
be used there?


In which entry do you set these ACIs?

Do you set
aci: (target="ldap:///cn=staging,dc=example,dc=com")(version 3.0; acl 
"MODDN from"; allow (moddn_from)
 userdn="ldap:///uid=admin_accounts,dc=example,dc=com";)
in the cn=accounts,dc=example,dc=com entry?

Do you set
aci: (target="ldap:///cn=accounts,dc=example,dc=com")(version 3.0; acl 
"MODDN to"; allow (moddn_to)
 userdn="ldap:///uid=admin_accounts,dc=example,dc=com";)
in the cn=staging,dc=example,dc=com entry?



Hi Rich,

   Yes that is correct, I forgot to mention where those ACIs are stored.

   They can be defined at an upper level, but with a target rule that
   restricts the scope to the desired subtree, or they can be set
   directly at the subtree level without a target rule.

   I updated the document to better describe that
   
http://port389.org/wiki/Access_control_on_trees_specified_in_MODDN_operation#ACI_scope_and_targets

   In that case we want to only allow a given user to move entries from
   staging to production (accounts). My preferred solution would be to add
   (a sketch follows the list):

 * moddn_from at the entry cn=staging,dc=example,dc=com (without a
   target rule)
 * moddn_to at the entry cn=accounts,dc=example,dc=com (without a
   target rule).
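
A sketch of installing those two ACIs with python-ldap (the connection 
details are placeholders and the ACI strings are assumptions based on 
the examples above):

import ldap

conn = ldap.initialize('ldap://localhost:389')
conn.simple_bind_s('cn=Directory Manager', 'password')

# moddn_from on the source subtree, moddn_to on the destination;
# no target rule, since each ACI sits directly on its subtree.
from_aci = (b'(version 3.0; acl "MODDN from"; allow (moddn_from) '
            b'userdn="ldap:///uid=admin_accounts,dc=example,dc=com";)')
to_aci = (b'(version 3.0; acl "MODDN to"; allow (moddn_to) '
          b'userdn="ldap:///uid=admin_accounts,dc=example,dc=com";)')

conn.modify_s('cn=staging,dc=example,dc=com', [(ldap.MOD_ADD, 'aci', [from_aci])])
conn.modify_s('cn=accounts,dc=example,dc=com', [(ldap.MOD_ADD, 'aci', [to_aci])])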

regards
thierry



Thanks,
--noriko





[389-devel] 389-DS ACI improvement to control MODDN

2014-02-25 Thread thierry bordaz

Hello,

   Ticket https://fedorahosted.org/389/ticket/47553 is a 389-ds
   enhancement to allow finer access control during a MODDN (new
   superior) operation, the use case being to allow/deny a bound user
   to move an entry from one specified part of the DIT to another part,
   without the need to grant the ADD permission.

   I started a design for it:
   http://port389.org/wiki/Access_control_on_trees_specified_in_MODDN_operation.
   Comments are welcome.

   regards
   thierry

