Francis Swasey wrote:
Folks,
I'm attempting to convert from using slurpd to using syncrepl.
However, my testing is leading me to a definite belief that syncrepl is
hopelessly unable to keep up.
I have a test situation where I have loaded a 48,819 entry ldif
using slapadd -q -w on the master and slapadd -q on the replica. I
then proceed to perform 12,654 modrdns, 56 modifies, and 961
delete/add actions in rapid succession.
Did you verify that the syncrepl consumer was actually idle before you
started your tests? syncrepl requires a contextCSN attribute to be
present on both the provider and on the consumer. The "-w" option to
slapadd causes the contextCSN attribute to be written, so that means
your provider's database was immediately usable. But then you need to
copy that value over to the consumer. If the LDIF file that you
slapadd'd on the consumer came from slapcat'ing the provider, then
you're all set, because it contains all the operational attributes,
including the contextCSN attribute. But if you slapadd'd a plain input
LDIF file on the consumer, then it had no contextCSN attribute, and so
it would have to suck the entire database down from the provider before
it considered itself sync'd up.
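One way to check that prerequisite is to read the contextCSN value from the suffix entry on both servers and compare the two. The Python sketch below covers only the comparison step. It assumes a single provider, one contextCSN value per server, and that both values use the same four-field, '#'-separated CSN layout; the function names and the sample CSN strings are made up for illustration.

```python
from typing import Optional


def parse_csn(csn: str) -> tuple:
    """Split a CSN string into its '#'-separated fields.

    Assumption: four fixed-width fields (timestamp, change count,
    server id, modification number), so comparing the fields as
    strings orders them chronologically.
    """
    fields = csn.strip().split("#")
    if len(fields) != 4:
        raise ValueError(f"unexpected CSN format: {csn!r}")
    return tuple(fields)


def consumer_in_sync(provider_csn: str, consumer_csn: Optional[str]) -> bool:
    """Return True if the consumer's contextCSN is at least the provider's.

    A consumer loaded from a plain LDIF has no contextCSN at all
    (represented here as None) and is never considered in sync.
    """
    if consumer_csn is None:
        return False
    return parse_csn(consumer_csn) >= parse_csn(provider_csn)


# Made-up sample values, not taken from any real server.
provider = "20070315120500Z#000003#00#000000"
print(consumer_in_sync(provider, "20070315120500Z#000003#00#000000"))
print(consumer_in_sync(provider, "20070315120000Z#000001#00#000000"))
print(consumer_in_sync(provider, None))
```

If the last two cases apply before the test starts, the consumer still has catching up to do, and that time will be counted against the replication test itself.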
With that prerequisite aside, it's well understood that syncrepl is
slower than slurpd for a number of reasons. Since syncrepl sends whole
entries rather than just modifications, it uses a lot more network
bandwidth than slurpd. It also causes a lot more database update
activity on the consumers. We can take steps to make some of the
database activity more efficient, but the network load is still an
issue. That's why Symas developed the delta-syncrepl mode of operation,
which uses the accesslog data format to propagate modifications instead
of whole entries. Of course, delta-syncrepl has its own performance cost
since it serializes write operations. (The serialization is two-phase,
so you can have two writes in progress at a time.) There's an upside
and a downside to this: the downside is that serialization limits the
number of simultaneous write operations; the upside is that you
generally get zero database deadlocks this way, so every modification
completes much faster.
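To make the whole-entry versus modification-only difference concrete, here is a toy Python sketch. It is not the LDAP wire encoding and not how slapd measures anything: it simply counts the characters in attribute names and values. The entry contents and the payload_size helper are hypothetical.

```python
def payload_size(attrs: dict) -> int:
    """Rough size of a set of attributes: name plus value length, per value."""
    return sum(
        len(name) + len(value)
        for name, values in attrs.items()
        for value in values
    )


# Hypothetical entry; attribute values are invented for illustration.
entry = {
    "objectClass": ["top", "person", "organizationalPerson", "inetOrgPerson"],
    "cn": ["Example User"],
    "sn": ["User"],
    "mail": ["old@example.org"],
    "telephoneNumber": ["+1 555 0100"],
    "description": ["An entry with a fairly long description attribute."],
}

# A single-attribute modification.
change = {"mail": ["new@example.org"]}

# Plain syncrepl resends the whole updated entry;
# a delta-style scheme sends only the change.
updated_entry = {**entry, **change}
full_size = payload_size(updated_entry)
delta_size = payload_size(change)

print(f"whole entry: {full_size} chars, change only: {delta_size} chars")
```

The gap grows with the size of the entry, which is why large entries with small, frequent modifications are the worst case for plain syncrepl.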
Using slurpd, the script runs in 28 minutes on the master and within
one (1) minute the replica is up to date. Using syncrepl, the script
runs in 25 minutes on the master, but the replica will take about 45
minutes to get in sync (one test took over 90 minutes).
I haven't found any syncrepl tuning documents, so I'm kind of
shooting blind after reading the Admin Guide and the syncrepl man pages.
I'm using the following in the master slapd.conf:
overlay syncprov
syncprov-checkpoint 1000 10
syncprov-sessionlog 1000
syncprov-nopresent FALSE
syncprov-reloadhint FALSE
I'm using the following in the replica slapd.conf:
syncrepl rid=100
    provider=ldaps://carcajou.uvm.edu
    type=refreshAndPersist
    retry=5,+
    searchbase="dc=uvm,dc=edu"
    filter="(objectclass=*)"
    scope=sub
    updatedn="cn=Replicator,dc=uvm,dc=edu"
    binddn="cn=SyncUser,dc=uvm,dc=edu"
    bindmethod=simple
    credentials=IAmNotThatStupidAsToGiveYouTheRealPassword
Doing ldapsearch calls at sixty second intervals, it appears that
syncrepl is handling approximately 200 updates per minute.
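As a rough cross-check of that figure, the Python sketch below works out how long the test workload would take to replay at 200 updates per minute. It uses only the operation counts and rate quoted in this message. Whether each delete/add action reaches the consumer as one update or two is an assumption, so both cases are computed.

```python
# Operation counts and observed rate as quoted in the message.
MODRDNS = 12_654
MODIFIES = 56
DELETE_ADD_ACTIONS = 961
OBSERVED_RATE_PER_MIN = 200


def minutes_to_replay(total_ops: int, rate_per_min: float) -> float:
    """Time needed to apply total_ops updates at a steady rate."""
    return total_ops / rate_per_min


# Assumption A: each delete/add action counts as a single update.
ops_single = MODRDNS + MODIFIES + DELETE_ADD_ACTIONS
# Assumption B: each delete/add action counts as two updates.
ops_double = MODRDNS + MODIFIES + 2 * DELETE_ADD_ACTIONS

for label, ops in (("delete/add as 1", ops_single), ("delete/add as 2", ops_double)):
    minutes = minutes_to_replay(ops, OBSERVED_RATE_PER_MIN)
    print(f"{label}: {ops} updates, about {minutes:.1f} minutes")
```

Either way the estimate comes out at roughly 70 minutes, which lies between the 45-minute and 90-minute catch-up times reported above.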
Does anyone have any insight into what I've done wrong or where I should
start looking?
Thanks,
--
-- Howard Chu
Chief Architect, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc
OpenLDAP Core Team http://www.openldap.org/project/