Re: Some principals not replicating
> On Jun 15, 2018, at 5:31 PM, Adam Lewenberg wrote:
>
> PROBLEM: Some of the principals will not replicate.

Well updates to the principal are not replicating...

> If I go on the master and change the password of one of these problematic
> principals, I see this in the replica's logs:

That's a "modify" not a "create" and modify requires the object to already be there. The iprop log is "sparse", recording only the modified data when doing "modify", so the principal can't be created just from the latest "modify" record.

> QUESTION: What could be a reason for this principal not to replicate?

You need to stop the slaves, blow away their database and logs, and replicate the full database from scratch.

--
Viktor.
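[Editor's illustration] The point about sparse "modify" records can be shown with a toy model. This is a minimal Python sketch, not Heimdal's log format or API: the record layout, the `replay` function, the `LostEntry` exception, and the principal name `user1` are all hypothetical. It only models the idea that a modify record carries the changed fields alone, so replaying it cannot recreate a principal that the replica never received.

```python
class LostEntry(Exception):
    """Raised when a modify record targets a principal the replica lacks."""


def replay(db, record):
    """Apply one log record to a replica database (a plain dict).

    Hypothetical record shape: {"op": ..., "name": ..., "fields": {...}}.
    """
    op, name = record["op"], record["name"]
    if op == "create":
        # A create record carries the whole entry.
        db[name] = dict(record["fields"])
    elif op == "modify":
        # A modify record carries only the changed fields, so the rest
        # of the entry must already exist on the replica.
        if name not in db:
            raise LostEntry(name)
        db[name].update(record["fields"])
    else:
        raise ValueError(f"unknown op: {op}")


# Replaying a password change against an empty replica fails.
replica = {}
password_change = {"op": "modify", "name": "user1", "fields": {"kvno": 3}}
try:
    replay(replica, password_change)
    outcome = "applied"
except LostEntry:
    outcome = "lost entry"
print(outcome)
```

In this model the only ways to recover are to replay a "create" record for the principal or to reload the whole database, which matches the advice above to resync from scratch.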
Re: Some principals not replicating
On 6/15/2018 6:21 PM, Viktor Dukhovni wrote:
>> On Jun 15, 2018, at 6:29 PM, Adam Lewenberg wrote:
>>
>> This (or something much like it) appears in the initial replication on three
>> separate 1.5.2 slaves:
>
> You *really* should upgrade the slaves as soon as possible, however:
>
>> 2018-06-15T17:45:12 ipropd-slave started at version: 0
>> 2018-06-15T17:45:12 receive complete database
>> 2018-06-15T17:45:47 receive complete database, version 114134
>
> The master has a complete database snapshot whose version is contiguous with
> the log. Therefore, instead of sending you the full database, you're getting
> the snapshot + incremental logs after that:
>
>> 2018-06-15T17:46:44 replaying entry 114135
>> 2018-06-15T17:46:44 replaying entry 114136
>> 2018-06-15T17:46:44 replaying entry 114137
>> 2018-06-15T17:46:44 replaying entry 114138
>> ...
>
> However, there's something amiss with the snapshot or logs. Better to delete
> the snapshot on the master and let it generate a new one, then resync the
> slaves. Or there's something wrong with the iprop code on the slaves, in any
> case a truly complete snapshot stands a better chance.

Thanks for your quick reply.

When you say "delete the snapshot on the master and let it generate a new one" I assume you meant "iprop-log truncate --reset", yes?

Anyway, I did that. All the slaves re-synced and now the "bad" principals are showing up on the slaves.

Thanks!

Adam Lewenberg
Re: Some principals not replicating
> On Jun 15, 2018, at 6:29 PM, Adam Lewenberg wrote:
>
> This (or something much like it) appears in the initial replication on three
> separate 1.5.2 slaves:

You *really* should upgrade the slaves as soon as possible, however:

> 2018-06-15T17:45:12 ipropd-slave started at version: 0
> 2018-06-15T17:45:12 receive complete database
> 2018-06-15T17:45:47 receive complete database, version 114134

The master has a complete database snapshot whose version is contiguous with the log. Therefore, instead of sending you the full database, you're getting the snapshot + incremental logs after that:

> 2018-06-15T17:46:44 replaying entry 114135
> 2018-06-15T17:46:44 replaying entry 114136
> 2018-06-15T17:46:44 replaying entry 114137
> 2018-06-15T17:46:44 replaying entry 114138
> ...

However, there's something amiss with the snapshot or logs. Better to delete the snapshot on the master and let it generate a new one, then resync the slaves. Or there's something wrong with the iprop code on the slaves, in any case a truly complete snapshot stands a better chance.

--
Viktor.
Re: Some principals not replicating
I think I was not clear in my original post. Let me clarify.

I have a master KDC running Heimdal 7.1. In its database is a principal called "fprefect" which, as far as I can tell, acts like a normal principal. I can do "get fprefect" and the output looks normal. If I point to this master and do a "kinit fprefect" I get a TGT.

However, if I bring up a new slave KDC (no database, no transaction log) that points to this master, the KDC _appears_ to get the entire database from the master, except that the principal "fprefect" is missing. This happens if the slave KDC runs 7.1 or if it runs 1.5.2. (There are some strange messages in the iprop log on the 1.5.2 slave; see my original e-mail for details.)

I don't know how this principal got into this strange state on the master, and I don't know how to replicate this issue. It makes me think that the database on the master is corrupted in some subtle way. I am hoping that someone can tell me some way to query or examine the database on the master to get some information that might throw some light on why this particular principal behaves this way.

Adam Lewenberg

On 6/15/2018 3:29 PM, Adam Lewenberg wrote:
> On 6/15/2018 3:04 PM, Viktor Dukhovni wrote:
>>> On Jun 15, 2018, at 5:31 PM, Adam Lewenberg wrote:
>>>
>>> PROBLEM: Some of the principals will not replicate.
>>
>> Well updates to the principal are not replicating...
>>
>>> If I go on the master and change the password of one of these problematic
>>> principals, I see this in the replica's logs:
>>
>> That's a "modify" not a "create" and modify requires the object to already
>> be there. The iprop log is "sparse", recording only the modified data when
>> doing "modify", so the principal can't be created just from the latest
>> "modify" record.
>>
>>> QUESTION: What could be a reason for this principal not to replicate?
>>
>> You need to stop the slaves, blow away their database and logs, and
>> replicate the full database from scratch.
>
> I did this. On three different slaves. The problematic principals do not
> appear in the slave's database. To be clear: even after initial replication
> (starting from nothing on the slave) some of the principals do not appear in
> the slave's database.
>
> This (or something much like it) appears in the initial replication on three
> separate 1.5.2 slaves:
>
> 2018-06-15T17:45:12 ipropd-slave started at version: 0
> 2018-06-15T17:45:12 receive complete database
> 2018-06-15T17:45:47 receive complete database, version 114134
> 2018-06-15T17:46:44 replaying entry 114135
> 2018-06-15T17:46:44 replaying entry 114136
> 2018-06-15T17:46:44 replaying entry 114137
> 2018-06-15T17:46:44 replaying entry 114138
> ... many lines like this until ...
> 2018-06-15T17:46:45 replaying entry 131686
> 2018-06-15T17:46:45 replaying entry 131687
> 2018-06-15T17:46:45 replaying entry 131688
> 2018-06-15T17:46:45 replaying entry 131689
> 2018-06-15T17:46:45 Ignoring command 8
> 2018-06-15T17:50:03 replaying entry 131690
> 2018-06-15T17:50:03 Ignoring command 8
> 2018-06-15T17:50:03 replaying entry 131691
> 2018-06-15T17:50:03 Ignoring command 8
> 2018-06-15T17:50:03 replaying entry 131692
> 2018-06-15T17:50:03 Ignoring command 8
> 2018-06-15T17:51:03 replaying entry 131693
> 2018-06-15T17:51:03 Ignoring command 8
> 2018-06-15T17:56:52 replaying entry 131694
> 2018-06-15T17:56:52 Ignoring command 8
> 2018-06-15T18:00:03 replaying entry 131695
> 2018-06-15T18:00:03 Ignoring command 8
> ... more lines much like this until ...
> 2018-06-15T20:16:57 Ignoring command 8
> 2018-06-15T20:18:53 replaying entry 131814
> 2018-06-15T20:18:53 kadm5_log_replay: 131814. Lost entry entry, Database out of sync ?: No such entry in the database (36150275)
> 2018-06-15T20:18:53 Ignoring command 8
> 2018-06-15T20:19:23 Ignoring command 8
> 2018-06-15T20:20:02 replaying entry 131815
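[Editor's illustration] A log like the one quoted above is long enough that it helps to scan it mechanically. The following is a minimal Python sketch, assuming only the three line shapes visible in the excerpt ("replaying entry N", "kadm5_log_replay: N. message", and everything else ignored). The function name `scan` and the regular expressions are this sketch's own and are not part of Heimdal. It reports which versions were replayed, any gaps in the version sequence, and any replay errors.

```python
import re

# Line shapes taken from the excerpt in this thread.
ENTRY_RE = re.compile(r"^(\S+) replaying entry (\d+)$")
ERROR_RE = re.compile(r"^(\S+) kadm5_log_replay: (\d+)\. (.*)$")


def scan(lines):
    """Return (versions replayed, gaps, errors) from ipropd-slave log lines."""
    versions, gaps, errors = [], [], []
    for line in lines:
        line = line.strip()
        m = ENTRY_RE.match(line)
        if m:
            version = int(m.group(2))
            # A gap means the replayed versions are not contiguous.
            if versions and version != versions[-1] + 1:
                gaps.append((versions[-1], version))
            versions.append(version)
            continue
        m = ERROR_RE.match(line)
        if m:
            errors.append((int(m.group(2)), m.group(3)))
    return versions, gaps, errors


sample = """\
2018-06-15T20:16:57 Ignoring command 8
2018-06-15T20:18:53 replaying entry 131814
2018-06-15T20:18:53 kadm5_log_replay: 131814. Lost entry entry, Database out of sync ?: No such entry in the database (36150275)
2018-06-15T20:18:53 Ignoring command 8
2018-06-15T20:20:02 replaying entry 131815
""".splitlines()

versions, gaps, errors = scan(sample)
print(versions, gaps, errors)
```

To run it against a real log file, pass an open file handle instead of `sample`. Each reported error carries the log version whose replay failed, which can then be looked up in the master's iprop log.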
Re: Some principals not replicating
On 06/15/2018 06:29 PM, Adam Lewenberg wrote:
> I did this. On three different slaves. The problematic principals do not
> appear in the slave's database. To be clear: even after initial replication
> (starting from nothing on the slave) some of the principals do not appear in
> the slave's database.

What database type is the master KDC using?

If you dump the master DB and look for one of the principals which is missing in the other databases, is it present in the dump file?
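[Editor's illustration] One way to act on the dump suggestion is to compare a text dump of the master with a text dump of a replica. This is a minimal Python sketch under a stated assumption: each dump holds one entry per line and the line starts with the principal name, followed by whitespace-separated fields. Check that against your own dump files before relying on it. The function names and the sample data (principals in `EXAMPLE.ORG` with a placeholder `FIELDS` column) are hypothetical.

```python
def principal_names(dump_lines):
    """Collect the first whitespace-separated token of each non-empty line."""
    names = set()
    for line in dump_lines:
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def missing_on_replica(master_lines, replica_lines):
    """Return principals present in the master dump but not in the replica's."""
    return sorted(principal_names(master_lines) - principal_names(replica_lines))


# Hypothetical sample data; FIELDS stands in for the rest of each entry.
master_dump = [
    "alice@EXAMPLE.ORG FIELDS",
    "bob@EXAMPLE.ORG FIELDS",
    "krbtgt/EXAMPLE.ORG@EXAMPLE.ORG FIELDS",
]
replica_dump = [
    "alice@EXAMPLE.ORG FIELDS",
    "krbtgt/EXAMPLE.ORG@EXAMPLE.ORG FIELDS",
]

missing = missing_on_replica(master_dump, replica_dump)
print(missing)
```

With real files, pass the open file handles in place of the sample lists. A principal that shows up in the result is in the master's dump but never reached the replica, which narrows the problem to the snapshot or log transfer rather than the master's database itself.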
Re: Some principals not replicating
On 6/15/2018 3:04 PM, Viktor Dukhovni wrote:
>> On Jun 15, 2018, at 5:31 PM, Adam Lewenberg wrote:
>>
>> PROBLEM: Some of the principals will not replicate.
>
> Well updates to the principal are not replicating...
>
>> If I go on the master and change the password of one of these problematic
>> principals, I see this in the replica's logs:
>
> That's a "modify" not a "create" and modify requires the object to already
> be there. The iprop log is "sparse", recording only the modified data when
> doing "modify", so the principal can't be created just from the latest
> "modify" record.
>
>> QUESTION: What could be a reason for this principal not to replicate?
>
> You need to stop the slaves, blow away their database and logs, and
> replicate the full database from scratch.

I did this. On three different slaves. The problematic principals do not appear in the slave's database. To be clear: even after initial replication (starting from nothing on the slave) some of the principals do not appear in the slave's database.

This (or something much like it) appears in the initial replication on three separate 1.5.2 slaves:

2018-06-15T17:45:12 ipropd-slave started at version: 0
2018-06-15T17:45:12 receive complete database
2018-06-15T17:45:47 receive complete database, version 114134
2018-06-15T17:46:44 replaying entry 114135
2018-06-15T17:46:44 replaying entry 114136
2018-06-15T17:46:44 replaying entry 114137
2018-06-15T17:46:44 replaying entry 114138
... many lines like this until ...
2018-06-15T17:46:45 replaying entry 131686
2018-06-15T17:46:45 replaying entry 131687
2018-06-15T17:46:45 replaying entry 131688
2018-06-15T17:46:45 replaying entry 131689
2018-06-15T17:46:45 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131690
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131691
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:50:03 replaying entry 131692
2018-06-15T17:50:03 Ignoring command 8
2018-06-15T17:51:03 replaying entry 131693
2018-06-15T17:51:03 Ignoring command 8
2018-06-15T17:56:52 replaying entry 131694
2018-06-15T17:56:52 Ignoring command 8
2018-06-15T18:00:03 replaying entry 131695
2018-06-15T18:00:03 Ignoring command 8
... more lines much like this until ...
2018-06-15T20:16:57 Ignoring command 8
2018-06-15T20:18:53 replaying entry 131814
2018-06-15T20:18:53 kadm5_log_replay: 131814. Lost entry entry, Database out of sync ?: No such entry in the database (36150275)
2018-06-15T20:18:53 Ignoring command 8
2018-06-15T20:19:23 Ignoring command 8
2018-06-15T20:20:02 replaying entry 131815