On Wed, 07.01.15 07:59, Alan Fisher (a...@unixcube.org) wrote:

> Hello!
>
> I seem to have reproduced this issue. After a lot of swapping, systemd
> appeared to have become stuck. Trying to restart services with systemctl
> blocked indefinitely. Strangely, this seemed to be the case even after a
> reboot.
>
> Here is a part of the strace -p 1
>
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 624419539}) = 0
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 624668458}) = 0
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 624919333}) = 0
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 625167344}) = 0
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 625417381}) = 0
> recvmsg(16, 0x7fff52622560, MSG_DONTWAIT|MSG_NOSIGNAL|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(4, {{EPOLLOUT, {u32=3793072544, u64=140341849469344}}}, 29, 0) = 1
> clock_gettime(CLOCK_BOOTTIME, {863156, 625665881}) = 0
>
> systemd --version prints
>
> systemd 215
> +PAM +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ -SECCOMP -APPARMOR
>
> After a second reboot, the problem seems to have disappeared.

Sorry for the late reply!

Hmm, this looks like an EAGAIN busy loop in PID 1. Three questions:

a) That fd 16, do you have any idea what it is? What does /proc/1/fd/
   say about it? If it is a socket, can you check with lsof which peer
   it is talking to?

b) Any chance you can run "pstack 1" when this happens, to get a stack
   trace out of PID 1?

c) Any chance you can reproduce the issue with a more current systemd
   version?

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
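
For question a), a minimal sketch of how the fd lookup could be scripted, assuming a Linux /proc filesystem and enough privileges to read PID 1's fd directory (normally root). The helper names describe_fd and socket_inode are made up for illustration and are not part of systemd or any existing tool.

```python
import os
import re


def describe_fd(pid, fd):
    """Return the symlink target of /proc/<pid>/fd/<fd>.

    For a socket the kernel reports a target of the form 'socket:[INODE]';
    for a regular file it reports the path.
    """
    return os.readlink(f"/proc/{pid}/fd/{fd}")


def socket_inode(target):
    """Extract the inode number from a 'socket:[INODE]' target.

    Returns None when the target does not describe a socket.
    """
    match = re.fullmatch(r"socket:\[(\d+)\]", target)
    return int(match.group(1)) if match else None
```

Run as root, describe_fd(1, 16) should return the target of the fd seen in the strace output. If socket_inode() returns a number for that target, the same inode can be looked up in the lsof output to find the peer.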