Hello,

I am currently testing a 7.4.0 based KDC master with a slave running the
7.1.0 version available in Debian Stretch.

Currently I am experiencing problems with getting iprop to force a
complete sync of the database. As is stated in iprop-log(8) truncating
the log with --reset should cause full propagations.

I belive the iprop code does not properly handle the case where the log
has been reset to version 0 on the master:
===
kdc-lab-master1:~# iprop-log truncate --reset
kdc-lab-master1:~# iprop-log dump --no-lock
nop: ver = 0, timestamp = 2017-10-04 12:51:32, len = 16
uberblock offset 40 timestamp 2017-10-04 12:51:32 version 0
===

At this point depending on the state of the slave, the propogation will
fail in one of two ways depending on if the slave has a log with changes
or not.

If the slave log has a version number greater than 0 the master will
notice it is out of sync and attempt to update it only to fail in
send_diffs() because kadm5_log_get_version_fd() returns
HEIM_ERR_EOF (and this is documented behaviour for that function when
the log is truncated and LOG_VERSION_FIRST is supplied).

The code in ipropd_master.c where we error out because 'ret' is set to
HEIM_ERR_EOF:
===
722     ret = kadm5_log_get_version_fd(server_context, log_fd, 
LOG_VERSION_FIRST,
723                                    &initial_version, &initial_tstamp);
724     sp = kadm5_log_goto_end(server_context, log_fd);
725     flock(log_fd, LOCK_UN);
726     if (ret) {
727         if (sp != NULL)
728             krb5_storage_free(sp);
729         krb5_warn(context, ret, "send_diffs: failed to read log");
730         send_are_you_there(context, s);
731         return ret;
732     }
===

Resulting in these logs on the master:
===
ipropd-master[3205]: connection from 
iprop/kdc-lab-slave1.example....@example.com
ipropd-master[3205]: Slave iprop/kdc-lab-slave1.example....@example.com 
(version 22) have later version the master (version 0) OUT OF SYNC
ipropd-master[3205]: send_diffs: failed to read log: End of file
===

... and these logs on the slave:
===
ipropd-slave[13608]: slave status change: connecting to master: 
kdc-lab-master1.example.com
ipropd-slave[13608]: connection successful to master: 
kdc-lab-master1.example.com[192.168.1.100]
ipropd-slave[13608]: ipropd-slave started at version: 22
ipropd-slave[13608]: slave status change: connected to master, waiting 
instructions
ipropd-slave[13608]: slave status change: up-to-date with version: 22 at 
2017-10-04T12:59:38
ipropd-slave[13608]: krb5_write_message: write: Broken pipe
ipropd-slave[13608]: krb5_write_priv_message: write: Broken pipe
===

Attempting to reset the slave completely by removing the heimdal.db, the
log, and the ipropd-slave-status file (only retaining the m-key) the
slave will get stuck in a non-functional state where no database is
fetched:
===
root@kdc-lab-slave1:~# /etc/init.d/heimdal-kdc stop
Stopping heimdal-kdc (via systemctl): heimdal-kdc.service.
root@kdc-lab-slave1:~# ls /var/lib/heimdal-kdc/
heimdal.db  ipropd-slave-status  log  m-key
root@kdc-lab-slave1:~# rm /var/lib/heimdal-kdc/heimdal.db 
/var/lib/heimdal-kdc/ipropd-slave-status /var/lib/heimdal-kdc/log
root@kdc-lab-slave1:~# /etc/init.d/heimdal-kdc start
root@kdc-lab-slave1:~# ls /var/lib/heimdal-kdc/
ipropd-slave-status  log  m-key
===

Master logs:
===
ipropd-master[3205]: connection from 
iprop/kdc-lab-slave1.example....@example.com
ipropd-master[3205]: slave iprop/kdc-lab-slave1.example....@example.com in sync 
already at version 0
===

Slave logs:
===
ipropd-slave[13701]: slave status change: getting credentials from 
keytab/database
ipropd-slave[13701]: slave status change: creating log file
ipropd-slave[13701]: slave status change: ipropd-slave started
ipropd-slave[13705]: slave status change: connecting to master: 
kdc-lab-master1.example.com
ipropd-slave[13705]: connection successful to master: 
kdc-lab-master1.example.com[192.168.1.100]
ipropd-slave[13705]: ipropd-slave started at version: 0
ipropd-slave[13705]: slave status change: connected to master, waiting 
instructions
ipropd-slave[13705]: slave status change: up-to-date with version: 0 at 
2017-10-04T13:10:00
===

Trying to modify the master database at this point will lead to a very
confused slave, where it appears to follow the updates, but only ends up
storing the delta:
===
root@kdc-lab-master1:~# kadmin -l
kadmin> ank --use-defaults testingprincipal
testingprinci...@example.com's Password:
Verify password - testingprinci...@example.com's Password:
kadmin>
===

Master logs:
===
ipropd-master[3205]: Got a signal, updating slaves 0 to 1
ipropd-master[3205]: syncing slave iprop/kdc-lab-slave1.example....@example.com 
from version 0 to version 1
ipropd-master[3205]: slave iprop/kdc-lab-slave1.example....@example.com is now 
up to date (1)
ipropd-master[3205]: slave iprop/kdc-lab-slave1.example....@example.com in sync 
already at version 1
===

Slave logs:
===
ipropd-slave[13835]: connection successful to master: 
kdc-lab-master1.labkrb.it.example.com[192.168.1.100]
ipropd-slave[13835]: ipropd-slave started at version: 0
ipropd-slave[13835]: slave status change: connected to master, waiting 
instructions
ipropd-slave[13835]: slave status change: up-to-date with version: 0 at 
2017-10-04T14:05:37
ipropd-slave[13835]: slave status change: up-to-date with version: 1 at 
2017-10-04T14:06:19
ipropd-slave[13835]: slave status change: up-to-date with version: 1 at 
2017-10-04T14:06:19
ipropd-slave[13835]: slave status change: up-to-date with version: 1 at 
2017-10-04T14:06:49
===

... and the heimdal.db that appeared on the slave after this operation
is empty except for the specific principal created by the modification:
===
root@kdc-lab-slave1:~# kadmin -l
kadmin> list *
testingprincipal
kadmin>
===

The only way I have found to "fix" the slave at this point is to stop
the services on it, manually increment the log version past the master
version and then start the services again. Since the slave log version
is now greater than the master, and the master has a log version greater
than 0, the full download will successfully finish.

The master is at version 1:
===
root@kdc-lab-master1:~# iprop-log last-version --no-lock
version: 1
===

Stop the slave and bump the version to 2:
===
root@kdc-lab-slave1:~# /etc/init.d/heimdal-kdc stop
Stopping heimdal-kdc (via systemctl): heimdal-kdc.service.
root@kdc-lab-slave1:~# iprop-log last-version --no-lock
version: 1
root@kdc-lab-slave1:~# kadmin -l cpw -r testingprincipal
root@kdc-lab-slave1:~# iprop-log last-version --no-lock
version: 2
root@kdc-lab-slave1:~# /etc/init.d/heimdal-kdc start
Starting heimdal-kdc (via systemctl): heimdal-kdc.service.
===

Master logs:
===
ipropd-master[3205]: connection from 
iprop/kdc-lab-slave1.example....@example.com
ipropd-master[3205]: Slave iprop/kdc-lab-slave1.example....@example.com 
(version 2) have later version the master (version 1) OUT OF SYNC
ipropd-master[3205]: slave iprop/kdc-lab-slave1.example....@example.com 
(version 2) out of sync with master (first version in log 0), sending complete 
database
ipropd-master[3205]: wrote new dumpfile (version 1)
ipropd-master[3205]: slave iprop/kdc-lab-slave1.example....@example.com in sync 
already at version 1
===

Slave logs:
===
ipropd-slave[13938]: connection successful to master: 
kdc-lab-master1.example.com[192.168.1.100]
ipropd-slave[13938]: ipropd-slave started at version: 2
ipropd-slave[13938]: slave status change: connected to master, waiting 
instructions
ipropd-slave[13938]: receive complete database
ipropd-slave[13938]: slave status change: up-to-date with version: 1 at 
2017-10-04T14:32:31
ipropd-slave[13938]: slave status change: up-to-date with version: 1 at 
2017-10-04T14:32:31
===

I am aware that iprop has gone through quite a lot of changes a while
back so I wonder if this might be a regression introduced then. Since
not even a fresh 'init' of the database will cause a log version number
of "0" I guess this will only be noticed once you attempt to do manual
resets.

Regards,
Patrik Lundin

Reply via email to