On Tue, May 27, 2025 at 12:59:42PM +0000, Windl, Ulrich wrote:
> Hi!
> 
> When upgrading from OpenLDAP 2.4 (SLES12) to OpenLDAP 2.5 (SLES15) I
> gave delta-syncrepl a try. That was a hard way in several aspects.
> Meanwhile I think I understand most of the details (docs could be much
> better IMHO).

Hi Ulrich,
I am sorry that you have not had a great experience. Please suggest
which parts of the documentation (manpages, admin guide) you feel should
be adjusted.

> Where delta-syncrepl has big problems is when sync has been set up,
> but one database is reloaded and some UUIDs are newly created for
> entries that exist on the other server(s).
> Somehow slapd detects that problem and claims that a “content sync” is
> required, but after some time it seems to start a refresh anyway.

It looks like either your ACLs are not as documented, or you have
reloaded a database from something other than a slapcat export,
preserving some information (entryCSNs, ...) but not the rest
(entryUUIDs). The required ACLs have to give the replication identity
*unrestricted* read access to both the replicated DB and its accesslog.
Anything else will lead to deltasync replication failing in various
ways that are not always easy to spot.

Replication cannot figure this out for you because its own state is now
inconsistent. Either start from scratch or use a slapcat+slapadd for the
database. If you have actually done what I'm suggesting here, please
describe how you got into this situation because that would be a bug.
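
As a rough sketch of the slapcat+slapadd route: the function names, the
suffix and the LDIF path below are placeholders for illustration, not
anything taken from your setup. It assumes slapd is stopped on the
target and that the target database is empty before the import.

```shell
# Sketch only: export_db/import_db are hypothetical helper names.

# Export a database to LDIF. slapcat writes operational attributes
# (entryUUID, entryCSN, ...) along with the user data.
# $1 = database suffix, $2 = LDIF file to write
export_db() {
    slapcat -b "$1" -l "$2"
}

# Import that LDIF on the other server. Assumes slapd is not running
# there and the target database is empty.
# $1 = database suffix, $2 = LDIF file to read
import_db() {
    slapadd -b "$1" -l "$2"
}
```

For example, "export_db dc=example,dc=com /tmp/db.ldif" on one server,
then "import_db dc=example,dc=com /tmp/db.ldif" on the other after
copying the file across.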

> When I did the content load on the other server, slapd quit with a
> core dump. Unfortunately I had quite a lot of core dumps during my
> testing.
> So I feel delta-syncrepl is not as solid as it should be (in the
> version provided with SLES15 SP6).

Yes, replication relies on keeping its own state, which you interfere
with at your own peril, potentially triggering temporary or even
permanent desyncs. However, if you encounter a crash, I will ask you
again to log a bug with steps to reproduce and/or a full backtrace with
the necessary symbols available, plus any logs you can provide. If you
need to redact confidential information, that is fine. We cannot fix
bugs we are not aware of except by accident.

> May 27 13:43:35 v06 systemd-coredump[27242]: [🡕] Process 27194 (slapd)
> of user 76 dumped core.
> 
> Stack trace of thread 27199:
> #0  0x00007f6b34ca8dfc __pthread_kill_implementation (libc.so.6 + 0xa8dfc)
> #1  0x00007f6b34c57842 raise (libc.so.6 + 0x57842)
> #2  0x00007f6b34c3f5cf abort (libc.so.6 + 0x3f5cf)
> #3  0x00007f6b34c3f4e7 __assert_fail_base.cold (libc.so.6 + 0x3f4e7)
> #4  0x00007f6b34c4fb32 __assert_fail (libc.so.6 + 0x4fb32)
> #5  0x00007f6b34787258 n/a (syncprov.so + 0xc258)
> #6  0x000055765d7e04f3 overlay_op_walk (slapd + 0xb74f3)
> #7  0x000055765d7e06be n/a (slapd + 0xb76be)
> #8  0x000055765d76ee54 fe_op_search (slapd + 0x45e54)
> #9  0x000055765d76e726 do_search (slapd + 0x45726)
> #10 0x000055765d76c18f n/a (slapd + 0x4318f)
> #11 0x000055765d76c98d n/a (slapd + 0x4398d)
> #12 0x00007f6b34ff7da0 n/a (libldap-2.5.releng.so.0 + 0x48da0)
> #13 0x00007f6b34ca6f6c start_thread (libc.so.6 + 0xa6f6c)
> #14 0x00007f6b34d2e338 __clone3 (libc.so.6 + 0x12e338)

This backtrace is not very useful. I suggest you either not strip the
binaries or make sure you have the relevant debuginfo packages in
place, and have systemd-coredump store the core file[0] so you can
actually examine it after the fact with gdb.

[0]. https://systemd.io/COREDUMP/
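
A minimal sketch of pulling a full backtrace out of a stored core,
assuming systemd-coredump kept the core file and debug symbols are
installed. The function name, the temporary path and the slapd binary
path passed in are assumptions for illustration.

```shell
# Sketch only: slapd_backtrace is a hypothetical helper name.
# $1 = PID of the crashed process as reported by systemd-coredump
# $2 = path to the slapd binary that produced the core (assumed)
slapd_backtrace() {
    core="/tmp/slapd.$1.core"
    # Write the stored core for that PID to a file, then let gdb print
    # a full backtrace of every thread and exit.
    coredumpctl dump "$1" --output="$core" &&
    gdb -batch -ex 'thread apply all bt full' "$2" "$core"
}
```

For the crash quoted above that would be something like
"slapd_backtrace 27194 /usr/sbin/slapd", with the output attached to
the bug report.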

Thanks,

-- 
Ondřej Kuzník
Senior Software Engineer
Symas Corporation                       http://www.symas.com
Packaged, certified, and supported LDAP solutions powered by OpenLDAP
