Public bug reported:

Binary package hint: gdm

Since upgrading to dapper final, we experience frequent breakage of gdm on an amd64 (64-bit dapper) XDMCP server that regularly serves around 40 clients.

Reproducibility: happens when e.g. many people log out at the same time (once every few days). gdm must then be killed and started manually, which kills all the existing sessions as well.

Symptoms: slave gdm processes continue to work, but the main gdm process does not spawn new slaves, does not ping existing ones every 15s as it normally does (according to the debug syslog), and does not respond to TERM (it must be KILLed), as if it were waiting for something (race?).

The logs reveal that the only difference between the normal situation and the bug is the timeout on the gdm socket (I will attach the full log):

Sep 22 14:16:40 [gdm] Sending LOGGED_IN == 0 for slave 317
Sep 22 14:16:40 [gdm] Timeout occurred for sending message LOGGED_IN 317 0

What might be the reason? In slave.c:gdm_slave_send, up to 10 attempts are made to deliver the message (select on &rfds, with a 1s timeout each), but select apparently returns an error, since the timeout never expires (otherwise, it would have to take 10s between sending the message and the timeout, whereas both log lines carry the same timestamp).

PS. I compiled gdm with added tracing lines for the message sending and will post results if they are relevant. (daemon/slave.c):

@@ -2767,6 +2766,7 @@
 if (in_usr2_signal > 0) {
         fd_set rfds;
         struct timeval tv;
+        int select_retval;
         FD_ZERO (&rfds);
         FD_SET (d->slave_notify_fd, &rfds);
@@ -2775,9 +2775,10 @@
         tv.tv_sec = 1;
         tv.tv_usec = 0;
-        if (select (d->slave_notify_fd+1, &rfds, NULL, NULL, &tv) > 0) {
+        if ((select_retval = select (d->slave_notify_fd+1, &rfds, NULL, NULL, &tv)) > 0) {
                 gdm_slave_handle_usr2_message ();
         }
+        if (select_retval < 0) gdm_debug ("TRACE (%s,%d): select returned errno %d (%s)",__FILE__,__LINE__,errno,strerror(errno));
 } else {
         struct timeval tv;
         /* Wait 1 second. */
@@ -2787,6 +2788,7 @@
         /* don't want to use sleep since we're using alarm for pinging */
 }
+gdm_debug ("TRACE (%s,%d): Passed gdm_slave_send cycle, i=%d, in_usr2_signal=%d, wait_for_ack=%d, gdm_got_ack=%d.",__FILE__,__LINE__,i,in_usr2_signal,wait_for_ack,gdm_got_ack);
 }
 if G_UNLIKELY (wait_for_ack &&

** Affects: gdm (Ubuntu)
     Importance: Untriaged
         Status: Unconfirmed

--
gdm hangs altogether after timeout on the gdm socket
https://launchpad.net/bugs/62139

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
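
A note on the select() reasoning in the report: select() has three distinct outcomes (a positive count when a descriptor is ready, 0 when the timeout expires, and -1 with the cause stored in errno). A return of -1, for instance with EINTR after a signal arrives, would be one way for the retry loop to finish without waiting out its 1s timeouts; whether that is what happens in gdm is not confirmed here. The following is a minimal standalone sketch of telling the three cases apart, not gdm code: wait_readable, trace_select_failure and the enum names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical outcome codes for this sketch; they are not part of gdm. */
enum wait_result { WAIT_ERROR = -1, WAIT_TIMEOUT = 0, WAIT_READY = 1 };

/*
 * Wait up to timeout_ms milliseconds for fd to become readable.
 * On failure the errno set by select() is copied to *saved_errno,
 * so the caller can log the real cause (EINTR, EBADF, ...).
 */
enum wait_result
wait_readable (int fd, long timeout_ms, int *saved_errno)
{
        fd_set rfds;
        struct timeval tv;
        int rv;

        assert (saved_errno != NULL);

        FD_ZERO (&rfds);
        FD_SET (fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rv = select (fd + 1, &rfds, NULL, NULL, &tv);
        if (rv > 0)
                return WAIT_READY;      /* fd has data to read */
        if (rv == 0)
                return WAIT_TIMEOUT;    /* the full timeout elapsed */

        /* rv is -1 here; the error code is in errno, not in rv. */
        *saved_errno = errno;
        return WAIT_ERROR;
}

/* Print the saved error code together with its description. */
void
trace_select_failure (int saved_errno)
{
        fprintf (stderr, "select failed: errno %d (%s)\n",
                 saved_errno, strerror (saved_errno));
}
```

A caller would invoke wait_readable (fd, 1000, &err) once per attempt and call trace_select_failure (err) only when WAIT_ERROR comes back, which keeps an immediate failure distinguishable from a genuine 1s timeout in the log.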