On Mon, Apr 07, 2025 at 03:56:17PM +0200, Norbert wrote:
> On 07.04.25 at 15:35, Norbert wrote:
> > On 07.04.25 at 14:19, Howard Chu wrote:
> > > Norbert wrote:
> > > > Hi,
> > > >
> > > > Version used: 2.5.19.1 (ltb)
> > > >
> > > > We have an LDAP directory with about 4.6 million entries and an
> > > > indexed attribute which occurs around 3.9 million times. We
> > > > typically filter for that attribute with a specific value (eq),
> > > > which is usually very fast and causes no problems. As soon as the
> > > > same value is used twice, the execution time for that filter
> > > > becomes really slow, even when additional criteria in the filter
> > > > limit the result to exactly 1 entry. Searches whose value matches
> > > > a single entry take ~5% of the time of searches whose value
> > > > potentially matches 2 entries.
> > > >
> > > > Some more details on how this was determined:
> > > > 1) enable "stats" logging on the production server for 5 minutes.
> > > > 2) collect the slowest ~1200 of several thousand searches within
> > > >    those 5 minutes from the log
> > > > 3) create a separate LDAP server with exactly the same data and
> > > >    configuration (imported with slapadd)
> > > > 4) use a script running locally on the extra server which executes
> > > >    the 1200 filters one after the other, and measure the complete
> > > >    execution time of the script
> > > >
> > > > With production data I measure around 11s for the ~1200 searches.
> > > > For all these searches one attribute in the filter could have 2
> > > > hits, but it is actually limited to 1 hit because of the following
> > > > filter:
> > > > "(&(objectClass=value)(almost_uniqe_attr=value)(another_attr=*))"
> > > > This means that searching with only "almost_uniqe_attr=value" as
> > > > the filter would return 2 results, but objectClass and another_attr
> > > > limit it to exactly 1 entry.
> > > >
> > > > When I now remove the second entry from the LDAP server for these
> > > > exact ~1200 filters, the script run time is ~0.5s.
> > > > If I re-add those ~1200 entries, the run time is around 5s (and
> > > > with a complete recreation of the db it is 11s again).
> > > >
> > > > Limiting the search scope by using a more specific base DN for the
> > > > search does not change anything with regard to the execution time.
> > > >
> > > > So the question is: 1) can I change anything on the server side to
> > > > speed up the execution time of these searches?
> > >
> > > How common is 'another_attr'? Is there a presence index on it?
> >
> > another_attr is the most frequently occurring attribute in the server.
> > Typically values occur once, but in this particular case the majority
> > of values are referenced by 2 entries. The index for this attribute is
> > configured as "eq,sub".
>
> Sorry, I got confused by my arbitrary names. Each entry of interest has
> another_attr set. But removing (another_attr=*) from the test filters
> has no impact on search performance. With or without it the run time is
> the same, and the search returns 1 entry because objectClass matters in
> these cases. another_attr has an eq index but not pres. Many entries
> actually have the same value in this case.

Hi Norbert,
just a thought: it looks like you also have a "sub"string index on that
attribute. All indexes for a given attribute exist in the same namespace,
and a substring index generates a *lot* of items, so you'll get false
positives competing for slapd's attention. Have you enabled 64-bit hashes
already ("index_hash64 on")? That should help with the contention if you
haven't yet.

Regards,

--
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
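
For reference, a minimal slapd.conf sketch of the "index_hash64 on"
suggestion. It assumes an mdb backend and reuses the placeholder attribute
names from the thread; the suffix, directory, and index lines are
hypothetical and not taken from Norbert's actual configuration. Per
slapd.conf(5), index_hash64 is a global option that is only supported on
64-bit CPUs, and indices built with 32-bit hashes are incompatible with
64-bit ones, so the database has to be fully reloaded after changing it.

```
# Global section: use 64-bit instead of the default 32-bit hashes
# for equality and substring index keys.
index_hash64 on

# Hypothetical database section, for illustration only.
database    mdb
suffix      "dc=example,dc=com"
directory   /var/lib/ldap

# Placeholder attribute names as used in the thread.
index       objectClass        eq
index       almost_uniqe_attr  eq
index       another_attr       eq,sub
```

If the substring index on the attribute is not actually needed by any
search, dropping "sub" from its index line would be another way to reduce
the number of index keys; whether that applies here is an assumption that
depends on the filters clients really use.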