On Mon, Apr 07, 2025 at 03:56:17PM +0200, Norbert wrote:
> On 07.04.25 at 15:35, Norbert wrote:
> > On 07.04.25 at 14:19, Howard Chu wrote:
> > > Norbert wrote:
> > > > Hi,
> > > > 
> > > > Used version 2.5.19.1 (ltb)
> > > > 
> > > > We have an LDAP directory with about 4.6 million entries and an indexed
> > > > attribute which occurs around 3.9 million times. We typically filter on
> > > > that attribute with a specific value (eq), which is typically very fast
> > > > and causes no problems. As soon as the same value occurs in two entries,
> > > > the execution time for that filter becomes really slow, even when
> > > > additional criteria in the filter limit the result to exactly 1 entry.
> > > > Search time is at ~5% for single-entry results compared to potential
> > > > 2-entry results.
> > > > 
> > > > Some more details on how this was determined:
> > > > 1) enable "stats" logging on the production server for 5 minutes
> > > > 2) collect the slowest ~1200 of several thousand searches within those
> > > > 5 minutes from the log
> > > > 3) create a separate LDAP server with the exact same data and
> > > > configuration (imported with slapadd)
> > > > 4) use a script running locally on the extra server which executes the
> > > > 1200 filters one after the other, and measure the complete execution
> > > > time of the script
> > > > 
> > > > With production data I measure around 11s for the ~1200 searches. In
> > > > all of these searches one attribute in the filter could have 2 hits,
> > > > but the result is actually limited to 1 hit by the rest of the filter:
> > > > "(&(objectClass=value)(almost_uniqe_attr=value)(another_attr=*))".
> > > > That is, searching with only "almost_uniqe_attr=value" as the filter
> > > > would return 2 results, but objectClass and another_attr limit it to
> > > > exactly 1 entry.
> > > > 
> > > > When I now remove the second entry from the LDAP server for each of
> > > > these ~1200 filters, the script run time drops to ~0.5s.
> > > > If I re-add those ~1200 entries, the run time will be around 5s (and
> > > > with a complete recreation of the db it will be 11s again).
> > > > 
> > > > Limiting the search scope by using a more specific base dn for the 
> > > > search does not change anything in regards to the execution time.
> > > > 
> > > > So the question is: 1) can I change anything on the server side to 
> > > > speed up the execution time of these searches?
> > > 
> > > How common is 'another_attr'? Is there a presence index on it?
> > 
> > another_attr is the most frequently occurring attribute in the server.
> > Its values typically occur once, but in this particular case, for the
> > majority of values, 2 entries are referenced with this attribute. The
> > index for this attribute is configured as "eq,sub".
> 
> Sorry, I got confused with my arbitrary names. Each entry of interest has
> another_attr set. But when I remove (another_attr=*) from the test filters,
> it does not have any impact on search performance. With or without it, the
> run time is the same, and the search returns 1 entry because objectClass
> matters in these cases. another_attr has an eq index but not pres. Many
> entries actually have the same value in this case.

Hi Norbert,
just a thought:

It looks like you also have a "sub"string index on that attribute. All
indexes for a given attribute exist in the same namespace, and a
substring index generates a *lot* of items, so you'll get false
positives competing for slapd's attention. Have you enabled 64-bit
hashes already ("index_hash64 on")?

That should help with the contention if you haven't yet.
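
To compare run times before and after the change, a small harness along
the lines of the script in step 4 above could look like the sketch below.
It is only an illustration, not the script that was actually used: the
server URI, the base DN, the anonymous bind and the helper names
(ldapsearch_once, time_filters, slowest) are all assumptions. Spawning
ldapsearch per filter also adds process startup overhead, so only
relative differences between runs are meaningful.

```python
import subprocess
import time
from typing import Callable, Iterable, List, Tuple


def ldapsearch_once(filter_str: str,
                    uri: str = "ldap://localhost",        # assumed URI
                    base: str = "dc=example,dc=com") -> None:  # assumed base DN
    """Run one anonymous search; "1.1" asks the server to return no attributes."""
    subprocess.run(
        ["ldapsearch", "-x", "-LLL", "-H", uri, "-b", base, filter_str, "1.1"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )


def time_filters(filters: Iterable[str],
                 run_search: Callable[[str], None] = ldapsearch_once
                 ) -> List[Tuple[str, float]]:
    """Execute the filters one after the other and record seconds per filter."""
    timings = []
    for f in filters:
        start = time.perf_counter()
        run_search(f)
        timings.append((f, time.perf_counter() - start))
    return timings


def slowest(timings: List[Tuple[str, float]], n: int = 10) -> List[Tuple[str, float]]:
    """Return the n slowest (filter, seconds) pairs, slowest first."""
    return sorted(timings, key=lambda t: t[1], reverse=True)[:n]
```

Feeding it the ~1200 collected filters once with the current setup and
once with 64-bit hashes enabled, then comparing the summed timings and
the slowest() output, should show whether the hash collisions are what
hurts.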

Regards,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
