Hi,

Here are my suggestions, in decreasing order of preference (most accurate and 
easiest to set up first).
If your kernel is recent enough and supports perf events, you could try to use 
perf to see accurately where the CPU time is spent.
If it does not, you could try oprofile, even though that's more complex to set 
up than perf.
When you don't have those tools ready to use, you can use the poor man's 
profiler, i.e. sample the backtrace of slapd in a loop using pstack.
If you do not have pstack, you can achieve the same in a more heavyweight 
manner with gdb.
If you do not have gdb, you may try ltrace/strace.
If you don't have those, you should ask for some Linux sysadmin help ;-)
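
To make the poor man's profiler idea concrete, here is a rough sketch of the 
gdb variant in Python. It is only an illustration, not a polished tool: the 
helper names (parse_stacks, sample), the sample count, and the interval are 
my own assumptions, and it assumes gdb is installed and that you are allowed 
to attach to the slapd process. Keep in mind that gdb pauses the process 
briefly each time it attaches.

```python
import collections
import re
import subprocess
import time

# Matches gdb backtrace frames such as:
#   #0  0x00007f... in some_function () from /usr/lib/libfoo.so
#   #1  other_function (arg=0x1) at file.c:123
# and captures the function name.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)")


def parse_stacks(gdb_output):
    """Split 'thread apply all bt' output into one tuple of
    function names (innermost frame first) per thread."""
    stacks, current = [], []
    for line in gdb_output.splitlines():
        line = line.strip()
        if line.startswith("Thread "):
            # A new thread header closes the previous thread's stack.
            if current:
                stacks.append(tuple(current))
            current = []
            continue
        match = FRAME_RE.match(line)
        if match:
            current.append(match.group(1))
    if current:
        stacks.append(tuple(current))
    return stacks


def sample(pid, samples=20, interval=0.5):
    """Attach gdb to `pid` repeatedly and count identical stacks.

    `samples` and `interval` are arbitrary illustrative defaults.
    """
    counts = collections.Counter()
    for _ in range(samples):
        result = subprocess.run(
            ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"],
            capture_output=True,
            text=True,
            check=False,
        )
        counts.update(parse_stacks(result.stdout))
        time.sleep(interval)
    return counts
```

Calling sample() with the slapd PID and printing counts.most_common(10) 
shows the stacks seen most often. Stacks that dominate the samples while the 
CPU is pegged are the likely hot spots; threads that are simply waiting will 
show up too, so they have to be filtered out by eye.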

++Cyrille

________________________________
From: [email protected] 
[mailto:[email protected]] On Behalf Of Jeffrey Crawford
Sent: Friday, March 16, 2012 7:28 AM
To: OpenLDAP technical list
Subject: OpenLDAP high CPU usage when performing mass changes

We are using openldap 2.4.26 with BDB 4.8 and have replication set up in mirror 
mode for our main ldap database. There are a couple of other replicas that have 
a subset of the data that the main cluster has but we are seeing the following 
behavior on all of them.

When performing mass updates via LDAP, let's say on the order of 30,000 entries 
being added to existing entries, we've noticed that the CPU use of the slapd 
instances goes through the roof (between 65% and 95% continuously), and seems 
to stay there until they are restarted.

The problem is that this system has to be highly available, even for writing, 
and when these updates "shock" the system, responsiveness drops sharply while 
the processes are churning like that. I don't think they are trying to catch up 
to the data changes, because if I let them run a while after the updates are 
done (about 1 hour) and then restart the instances, they go back to their 
normal state.

So far the only way I've been able to mitigate the issues is to reconfigure our 
ldap proxy instances to a machine that is having less trouble, restart the 
instances that are chugging along, then repoint the proxies back to the one 
just started, and start the others. Not exactly a quick operation.

I've played with cache settings for both OpenLDAP and BDB and have gotten the 
frequency of this issue reduced but I can't seem to get rid of it completely 
and it shows up quite often after large data manipulations. I'm at a loss as to 
how to debug this since nothing is crashing. Any suggestions on how to find out 
what's causing this would be very helpful. The logs are not throwing any 
warnings or posting messages that would seem out of the ordinary and I have 
played with the log settings but nothing seems to relate to anything that might 
explain why we are seeing CPU usage go so high.

Thanks in advance

--
I fly because it releases my mind from the tyranny of petty things . . .

- Antoine de Saint-Exupéry

Jeffrey E. Crawford
ITS Application Administrator (IDM)
831-459-4365
[email protected]<mailto:[email protected]>
