Re: order of clauses in ACLs

2021-08-15 Thread Michael Ströder
On 8/15/21 4:36 PM, Michael Ströder wrote:
> First tests (OpenLDAP 2.5/master) show with customer ACLs reading all
> attrs of 100k user entries and one huge group entry takes
> 
> 2m23.459s with standard acl.c
> 
> but only
> 
> 1m22.718s with patched acl.c which evaluates attrs= before dn.regex= and
> val.regex=.

https://bugs.openldap.org/show_bug.cgi?id=9634

Ciao, Michael.


Re: order of clauses in ACLs

2021-08-15 Thread Michael Ströder
On 8/15/21 9:12 PM, Howard Chu wrote:
> And what about non-regex forms of DN checking, which are more common
> than regex?
And that's why I suggest in ITS#9634 that DN simple checks are split
from dn.regex.

> Nobody can validate your conclusions since you haven't provided the
> actual test cases.
I can come up with test config and data but I'd like to know whether
it's worth the effort or whether you just reject any optimization
attempts in this field.

Ciao, Michael.


Re: order of clauses in ACLs

2021-08-15 Thread Howard Chu
Michael Ströder wrote:
> HI!
> 
> Moving this thread here to openldap-devel because it's not a usage
> question anymore:
> 
> https://lists.openldap.org/hyperkitty/list/openldap-techni...@openldap.org/message/E4BK5DA5DUAUYLJT6DRGKV45SQBZJ5XZ/
> 
> TL;DR:
> 1. Would you consider the attached patch as a valid solution?

No. You haven't provided any independently verifiable data to measure the 
effects of this change.

> 2. Improving slapo-constraint would also help.

What does that have to do with anything?
> 
> On 8/13/21 10:59 AM, Michael Ströder wrote:
>> On 8/13/21 1:51 AM, Howard Chu wrote:
>>> Michael Ströder wrote:
 Let there be ACLs with dn.regex="..", attrs=foo,bar and val.regex=".."
 in the  clauses.

 Obviously depending on complexity of regex-pattern and length of DNs /
 avals the regex checking is more expensive than equality checking of 
 attrs=.

You seem to assume that regex is the only type of DN checking ever being 
performed, which
is obviously false.

 Can I improve ACL performance by order of  clauses or are they
 processed in fixed order anyway? If in fixed order, which one?
>>>
>>> The order is fixed, in order of increasing granularity. DN first,
>>> attribute next, value-specific last.
>>
>> Is this order implemented in function slap_acl_get() in acl.c?
>> The last seems to be filter= after val=. Right?
>>
>>> That is the only order that makes logical sense.
>>
>> Why?
>>
>> IMHO attrs= should be checked first because
>>
>>   dn.regex=".."
>>   val.regex=".."
>>   filter=".."
>>
>> are all potentially way more CPU-intensive than checking attrs= against
>> a simple hash-map.

And what about non-regex forms of DN checking, which are more common than regex?

Obviously attrs are checked before values, so that's not even a valid point to
support your argument.

> First tests (OpenLDAP 2.5/master) show with customer ACLs reading all
> attrs of 100k user entries and one huge group entry takes
> 
> 2m23.459s with standard acl.c
> 
> but only
> 
> 1m22.718s with patched acl.c which evaluates attrs= before dn.regex= and
> val.regex=.
> 
> The reason for improved performance here is that the test data has one
> really big group entry (100k members) with expensive ACLs on
> uniqueMember attribute. Thus quickly skipping the ACLs with
> attrs=uniqueMember saves lots of CPU.

Nobody can validate your conclusions since you haven't provided the actual test 
cases.
I'm sure that for any case you can demonstrate that performance has improved, 
one can
easily provided cases where performance is worse since all of this is highly 
data
dependent.

> Yes, the tests are somewhat artificial and optimizing the clients is
> also needed. But anyway I'd like to avoid clients accidently causing
> unneeded load. And I expect this to also improve things for lots of
> smaller groups in real-world usage.
> 
> Ciao, Michael.
> 
> P.S.: The uniqueMember-ACLs are so complex because the original author
> wanted to work-around deficiencies of slapo-constraint. Have a
> set-relation for fully checking partical sets would make this problem go
> away...
> 
> P.P.S.: Before you ask: These are *not* tests with Æ-DIR ACLs.
> 
> P.P.P.S: I like the new etime= logging. :-)
> 


-- 
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: order of clauses in ACLs

2021-08-15 Thread Michael Ströder
HI!

Moving this thread here to openldap-devel because it's not a usage
question anymore:

https://lists.openldap.org/hyperkitty/list/openldap-techni...@openldap.org/message/E4BK5DA5DUAUYLJT6DRGKV45SQBZJ5XZ/

TL;DR:
1. Would you consider the attached patch as a valid solution?
2. Improving slapo-constraint would also help.

On 8/13/21 10:59 AM, Michael Ströder wrote:
> On 8/13/21 1:51 AM, Howard Chu wrote:
>> Michael Ströder wrote:
>>> Let there be ACLs with dn.regex="..", attrs=foo,bar and val.regex=".."
>>> in the  clauses.
>>>
>>> Obviously depending on complexity of regex-pattern and length of DNs /
>>> avals the regex checking is more expensive than equality checking of attrs=.
>>>
>>> Can I improve ACL performance by order of  clauses or are they
>>> processed in fixed order anyway? If in fixed order, which one?
>>
>> The order is fixed, in order of increasing granularity. DN first,
>> attribute next, value-specific last.
> 
> Is this order implemented in function slap_acl_get() in acl.c?
> The last seems to be filter= after val=. Right?
> 
>> That is the only order that makes logical sense.
> 
> Why?
> 
> IMHO attrs= should be checked first because
> 
>   dn.regex=".."
>   val.regex=".."
>   filter=".."
> 
> are all potentially way more CPU-intensive than checking attrs= against
> a simple hash-map.

First tests (OpenLDAP 2.5/master) show with customer ACLs reading all
attrs of 100k user entries and one huge group entry takes

2m23.459s with standard acl.c

but only

1m22.718s with patched acl.c which evaluates attrs= before dn.regex= and
val.regex=.

The reason for improved performance here is that the test data has one
really big group entry (100k members) with expensive ACLs on
uniqueMember attribute. Thus quickly skipping the ACLs with
attrs=uniqueMember saves lots of CPU.

Yes, the tests are somewhat artificial and optimizing the clients is
also needed. But anyway I'd like to avoid clients accidently causing
unneeded load. And I expect this to also improve things for lots of
smaller groups in real-world usage.

Ciao, Michael.

P.S.: The uniqueMember-ACLs are so complex because the original author
wanted to work-around deficiencies of slapo-constraint. Have a
set-relation for fully checking partical sets would make this problem go
away...

P.P.S.: Before you ask: These are *not* tests with Æ-DIR ACLs.

P.P.P.S: I like the new etime= logging. :-)
diff --git a/servers/slapd/acl.c b/servers/slapd/acl.c
index ac00c4163..4db0c9bc2 100644
--- a/servers/slapd/acl.c
+++ b/servers/slapd/acl.c
@@ -545,6 +545,14 @@ slap_acl_get(
 		if ( a != frontendDB->be_acl && state->as_fe_done )
 			state->as_fe_done++;
 
+		if ( a->acl_attrs && !ad_inlist( desc, a->acl_attrs ) ) {
+			matches->dn_data[0].rm_so = -1;
+			matches->dn_data[0].rm_eo = -1;
+			matches->val_data[0].rm_so = -1;
+			matches->val_data[0].rm_eo = -1;
+			continue;
+		}
+
 		if ( a->acl_dn_pat.bv_len || ( a->acl_dn_style != ACL_STYLE_REGEX )) {
 			if ( a->acl_dn_style == ACL_STYLE_REGEX ) {
 Debug( LDAP_DEBUG_ACL, "=> dnpat: [%d] %s nsub: %d\n", 
@@ -605,14 +613,6 @@ slap_acl_get(
 *count );
 		}
 
-		if ( a->acl_attrs && !ad_inlist( desc, a->acl_attrs ) ) {
-			matches->dn_data[0].rm_so = -1;
-			matches->dn_data[0].rm_eo = -1;
-			matches->val_data[0].rm_so = -1;
-			matches->val_data[0].rm_eo = -1;
-			continue;
-		}
-
 		/* Is this ACL only for a specific value? */
 		if ( a->acl_attrval.bv_val ) {
 			if ( val == NULL ) {