kadmin incremental propagation full resync multiple processes spawned

Paul B. Henson Wed, 02 Nov 2011 15:14:53 -0700

I posted about this a month or two ago and didn't see any responses. It
happened again; I tried to open a bug report (not sure if it worked),
and thought I'd try posting again.


After upgrading to 1.9.1, we've noticed that when a full resync is
required, kadmind behaves unexpectedly: it spawns multiple processes to
handle the same resync request and ends up with multiple processes
providing kadmin services. I'm not sure whether this happens on every
full resync, but it has occurred multiple times.

It starts off with a full resync request:

Nov  2 03:49:58 halfy kadmind[8938]: Request: iprop_get_updates_1,
UPDATE_FULL_RESYNC_NEEDED; Incoming SerialNo=102280; Outgoing
SerialNo=N/A, success,
client=kiprop/loogie.unx.csupomona.edu@CSUPOMONA.EDU,
service=kiprop/[email protected],
addr=134.71.247.11

A child process is spawned to serve that request:

Nov  2 03:49:58 halfy kadmind[8938]: Request: iprop_full_resync_1,
spawned resync process 20238,
client=kiprop/[email protected],
service=kiprop/[email protected],
addr=134.71.247.11

That process gets a strange error (which I'm not sure is relevant):

Nov  2 03:50:06 halfy kadmind[20238]: iprop_full_resync_1: pclose(popen)
failed: Success

Then, rather than fulfilling the sync request, the child process spawns
*another* child process:

Nov  2 03:52:56 halfy kadmind[20238]: Request: iprop_get_updates_1,
UPDATE_FULL_RESYNC_NEEDED; Incoming SerialNo=102280; Outgoing
SerialNo=N/A, success,
client=kiprop/[email protected],
service=kiprop/[email protected],
addr=134.71.247.11
Nov  2 03:52:56 halfy kadmind[20238]: Request:
iprop_full_resync_1, spawned resync process 20610,
client=kiprop/[email protected],
service=kiprop/[email protected],
addr=134.71.247.11

There are no messages from that pid, and it seems to actually fulfill
the sync request.

At this point, *two* separate kadmind processes both seem to be
fulfilling kadmin requests:

Nov  2 03:52:14 halfy kadmind[20238]: Request: kadm5_get_principal,
[email protected], success, [email protected],
service=kadmin/[email protected], addr=134.71.247.23
Nov  2 03:52:14 halfy kadmind[8938]: Request: kadm5_modify_principal,
[email protected], success, [email protected],
service=kadmin/[email protected], addr=134.71.247.23

The last time this happened, multiple generations of children were
spawned, and there were half a dozen or so kadmind processes all serving
requests.
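
For anyone who wants to check their own logs for the same condition,
here is a rough sketch of a script that lists which kadmind PIDs have
logged kadm5_* requests. It assumes one syslog entry per line (the
excerpts in this message are wrapped by the mail client) and the
"kadmind[PID]: Request: NAME" layout shown above; the function names
and the abbreviated sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpts in this message:
#   Nov  2 03:52:14 halfy kadmind[20238]: Request: kadm5_get_principal, ...
LINE_RE = re.compile(r"kadmind\[(?P<pid>\d+)\]: Request: (?P<op>\w+)")


def kadmind_requests_by_pid(lines):
    """Map each kadmind PID to the set of request names it logged."""
    by_pid = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            by_pid[int(match.group("pid"))].add(match.group("op"))
    return dict(by_pid)


def pids_serving_kadm5(lines):
    """Return the sorted PIDs that logged at least one kadm5_* request."""
    by_pid = kadmind_requests_by_pid(lines)
    return sorted(
        pid for pid, ops in by_pid.items()
        if any(op.startswith("kadm5_") for op in ops)
    )


# Abbreviated sample lines (hypothetical, modeled on the log above).
sample = [
    "Nov  2 03:49:58 halfy kadmind[8938]: Request: iprop_get_updates_1, ...",
    "Nov  2 03:52:14 halfy kadmind[20238]: Request: kadm5_get_principal, ...",
    "Nov  2 03:52:14 halfy kadmind[8938]: Request: kadm5_modify_principal, ...",
]

print(pids_serving_kadm5(sample))  # -> [8938, 20238]
```

More than one PID in the result over a short window would suggest the
situation described here. It does not say anything about the cause.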

On the slave KDC (kpropd) side:

Nov  2 03:50:28 loogie kpropd[2911]: /usr/sbin/kpropd: Bad file
descriptor while accepting connection
Nov  2 03:51:08 loogie kpropd[2911]: /usr/sbin/kpropd: Bad file descriptor 
while accepting connection
Nov  2 03:52:28 loogie kpropd[2911]: /usr/sbin/kpropd: Bad file descriptor 
while accepting connection
Nov  2 03:52:28 loogie kpropd[2911]: kpropd: Full resync, invalid return.
Nov  2 03:53:00 loogie kpropd[4221]: Connection from halfy.unx.csupomona.edu

kpropd logs several failures and then eventually succeeds.

From a client perspective, connections to kadmind start flaking out:

Nov  2 03:50:07 derp idmgmt[30265]: error storing expiration:
Communication failure with server (Kerberos)
Nov  2 03:50:14 derp idmgmt[30265]: error storing expiration:
Communication failure with server (Kerberos)
[...]
Nov  2 04:03:13 derp idmgmt[30265]: error getting principal:
Communication failure with server (Kerberos)

We originally deployed incremental under 1.8, and this never happened.
It seems to be something new with 1.9.


Any ideas? Thanks...

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  [email protected]
California State Polytechnic University  |  Pomona CA 91768
________________________________________________
Kerberos mailing list           [email protected]
https://mailman.mit.edu/mailman/listinfo/kerberos
