On 16.06.2015 at 17:21, Faidon Liambotis wrote:
> Hi,
>
> Any news about this? Can I help in any way?
>
> Thanks,
> Faidon
>
> On Fri, May 29, 2015 at 05:41:24PM +0300, Faidon Liambotis wrote:
>> On Mon, Apr 06, 2015 at 08:50:58PM +0100, Ben Hutchings wrote:
>>> It looks the same as this problem:
>>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705
>>> http://thread.gmane.org/gmane.linux.ubuntu.devel.kernel.general/39123/
>>
>> I just encountered this bug while trying to install jessie on a Dell
>> PowerEdge R610 with a SAS 6/iR (fairly recent, much more than 1950s).
>> The kernel crashes while in d-i and installation fails. I also tried
>> with a nightly d-i with Linux 4.0 -- same issue.
>>
>> Ironically, I found this bug report, clicked through the referenced
>> links, only to discover I had previously investigated this when
>> installing a similar server with Ubuntu 14.04 and I've even replied to
>> the Launchpad bug above... I can confirm it's the exact same bug. Note
>> that it was also covered by LWN(!): https://lwn.net/Articles/611226/
>>
>> It's disappointing that this bug hasn't been fixed yet upstream and
>> especially the part where mptsas' error handling is broken and the
>> kernel crashes instead of gracefully failing. This is a different,
>> secondary, bug that is just triggered by the timeout.
>>
>> In any case, there seems to have been /some/ improvement upstream on
>> this. systemd has increased the timeout from 30s to 60s (2e92633) and
>> subsequently to 180s (b5338a1), in commits that are both included in
>> v217. They have also made this a kernel command-line option
>> (udev.event-timeout & rd.udev.event-timeout) but those are more invasive
>> patches.
>>
>> My working servers with Ubuntu 12.04 & 14.04 indicate on their dmesg
>> that the probe time is somewhere between 18-31s, so 180s would
>> definitely fix the effect of this bug.
>>
>> The commits above aren't directly backportable to v215 as the upstream
>> code has changed significantly but the very simple patch attached is the
>> equivalent fix for v215 (it's untested, though).
>>
>> This affects a large number of Dell systems (~100 alone in my case) and
>> there is no practical workaround, so it'd be great if this was fixed in
>> a jessie point release.
>>
>> Best,
>> Faidon
>
>> diff --git a/src/udev/udevd.c b/src/udev/udevd.c
>> index a45d324..072499c 100644
>> --- a/src/udev/udevd.c
>> +++ b/src/udev/udevd.c
>> @@ -1415,7 +1415,7 @@ int main(int argc, char *argv[])
>>                          if (worker->state != WORKER_RUNNING)
>>                                  continue;
>>
>> -                        if ((now(CLOCK_MONOTONIC) - worker->event_start_usec) > 30 * USEC_PER_SEC) {
>> +                        if ((now(CLOCK_MONOTONIC) - worker->event_start_usec) > 180 * USEC_PER_SEC) {
>>                                  log_error("worker [%u] %s timeout; kill it", worker->pid,
>>                                            worker->event ? worker->event->devpath : "<idle>");
>>                                  kill(worker->pid, SIGKILL);
>

Looking more into this, the patch above might not be sufficient, or
might not be the right fix at all; we might need the following instead:

$ git diff
diff --git a/src/udev/udev-event.c b/src/udev/udev-event.c
index 5213a4a..66d8c40 100644
--- a/src/udev/udev-event.c
+++ b/src/udev/udev-event.c
@@ -48,7 +48,7 @@ struct udev_event *udev_event_new(struct udev_device *dev)
         udev_list_init(udev, &event->seclabel_list, false);
         event->fd_signal = -1;
         event->birth_usec = now(CLOCK_MONOTONIC);
-        event->timeout_usec = 30 * 1000 * 1000;
+        event->timeout_usec = 180 * 1000 * 1000;
         return event;
 }

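
To make the arithmetic concrete, here is a minimal standalone sketch of
the comparison that a microsecond timeout like the one above feeds into.
This is not systemd code: sec_to_usec() and event_timed_out() are
hypothetical helpers made up for illustration, and USEC_PER_SEC is
redefined locally so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for illustration; not taken from the systemd headers. */
#define USEC_PER_SEC ((uint64_t) 1000000)

/* Hypothetical helper: convert whole seconds to microseconds. */
uint64_t sec_to_usec(uint64_t sec) {
        /* Guard against overflow of the 64-bit microsecond value. */
        assert(sec <= UINT64_MAX / USEC_PER_SEC);
        return sec * USEC_PER_SEC;
}

/* Hypothetical helper: true if an event that has been running for
 * elapsed_usec has exceeded timeout_usec and would be killed. */
bool event_timed_out(uint64_t elapsed_usec, uint64_t timeout_usec) {
        return elapsed_usec > timeout_usec;
}
```

With the slowest probe time reported earlier in this thread (31s),
event_timed_out(sec_to_usec(31), sec_to_usec(30)) is true, whereas
event_timed_out(sec_to_usec(31), sec_to_usec(180)) is false, so a 180s
limit leaves a wide margin for the 18-31s probes seen on the working
machines.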
Anyone willing to test this patch? I can provide pre-built packages for
i386 and amd64.
Note: if we want to get this into 8.2, this should happen quickly. The
deadline for 8.2 is this weekend.
Michael
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
