On Tuesday, 18 August 2009 21:30:31 Rodrigo Costa wrote:
> openldap software community,
>
> I'm having some difficulty keeping databases synchronized with
> syncrepl. I'm running the latest openldap version, 2.4.17, which after
> these issues I compiled for use with gdb.
>
> I have a DB (really divided into 2 DBs) where each one has around
> 4 million entries. Because of memory limitations I have dncachesize
> configured at around 3000000, i.e. smaller than the maximum number of
> entries in the DBs.
>
> I loaded both servers with all indexes and the same data. When both are
> started, there is no need for syncrepl (a thread of slapd) to do any
> search, so both mirrors are in sync and consuming from each other. If a
> new entry is created, the other mirror consumes it, since both are
> listening when it happens.
>
> If I stop one mirror and create even a small number of entries in the
> other, like 10, then when I try to start the stopped provider again,
> syncrepl enters conventional syncrepl replication, which searches the
> DB for synchronization.
>
> This never ends, leaving the mirrors out of sync. What I can see is:
>
> 1) Stop the Second mirror, e.g. for slapcat (calling them Second and
> First for reference);
> 2) Add a few entries in the First mirror (kept on-line);
> 3) Start the Second mirror again after the First mirror has had some
> new entries added by normal operation;
> 4) Syncrepl in the Second mirror enters conventional syncrepl
> replication, since it detects that something is different between the
> mirrors;
> 5) Until the dncache is filled, the First mirror's slapd cpu
> consumption is below 100% (around 50%) and the search proceeds well,
> as the monitor shows;
> 6) After the dncache is filled (it oscillates above 3 million), the
> First mirror's cpu consumption goes to 100%, oscillating between 98%
> and 102%;
> 7) The search never ends, so the systems are never in sync. Cpu
> consumption stays permanently high, almost always at 100%.
>
> I let this process run for days and could see only one or two entries
> synced. From the CPU usage it looks like something is hanging the
> search, with some loop keeping the thread consuming one full cpu.
>
> I could collect some GDB information, which I'm sending attached. Not
> sure how to interpret this overlay_walk.
>
> The idea is to stop one mirror for backup, relieving the primary server
> of this task. For this, replication would need to work.
>
> Your comments are very welcome.

You have provided absolutely no configuration information. There may well
be other explanations for this behaviour than the dncachesize. I can think
of at least two.

You also haven't provided information on the systems you are using. E.g.,
you may be trying on systems with too little memory (e.g., <1GB), which
might be totally inadequate for the amount of data you have.

Regards,
Buchan
