I have been struggling for a few months with hard lock-ups when using laptop-mode. Every 3-4 days, my laptop would freeze during the night, for no apparent reason and with no message in the syslog. The hard disk activity light was always on in the morning, which made me think the freeze happened when the disk was spun up after a period of inactivity. Since I installed dbus and hald, the frequency has increased to once every night, which is a real pain. I don't think dbus or hald are themselves at fault, but they probably make the hard disk spin up more often, which increases the chance of hitting the problem.
Three days ago, I found this patch, which was accepted for 2.6.14 and works around a problem that sounds close enough:

http://tinyurl.com/b3y2d

It applies cleanly to gentoo-sources-2.6.12-r10, which is what I'm currently using. I quickly made an ebuild for it, attached below along with the patch from the address above.

I don't like to claim victory too early, but I have had no crash for the last three days, even with dbus and hald running. Just thought this info might be of interest.

-- 
Remy
Remove underscore and suffix in reply address for a timely response.
# Copyright 1999-2005 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: $

ETYPE="sources"
K_WANT_GENPATCHES="base extras"
K_GENPATCHES_VER="14"
IUSE="ultra1"
inherit kernel-2 eutils
detect_version
detect_arch

KEYWORDS="amd64 ~ia64 ppc ppc64 ~sparc x86"
HOMEPAGE="http://dev.gentoo.org/~dsd/genpatches"

DESCRIPTION="Full sources including the gentoo patchset for the ${KV_MAJOR}.${KV_MINOR} kernel tree"
SRC_URI="${KERNEL_URI} ${GENPATCHES_URI} ${ARCH_URI}"

pkg_setup() {
	if use sparc; then
		# hme lockup hack on ultra1
		use ultra1 || UNIPATCH_EXCLUDE="${UNIPATCH_EXCLUDE} 1399_sparc-U1-hme-lockup.patch"
	fi
}

src_unpack() {
	kernel-2_src_unpack
	epatch "${FILESDIR}/ide-lockup.patch"
}

pkg_postinst() {
	postinst_sources

	echo

	if [ "${ARCH}" = "sparc" ]; then
		if [ x"`cat /proc/openprom/name 2>/dev/null`" \
			= x"'SUNW,Ultra-1'" ]; then
			einfo "For users with an Enterprise model Ultra 1 using the HME"
			einfo "network interface, please emerge the kernel using the"
			einfo "following command: USE=ultra1 emerge ${PN}"
		fi
	fi

	einfo "For more info on this patchset, and how to report problems, see:"
	einfo "${HOMEPAGE}"
}
From: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Date: Sun, 9 Oct 2005 00:37:47 +0000 (+1000)
Subject: [PATCH] ide: Workaround PM problem
X-Git-Tag: v2.6.14-rc4
X-Git-Url: http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=867f8b4e47a17c5d68c98dc6eee12739c4490056

[PATCH] ide: Workaround PM problem

The logic in ide_do_request() doesn't guarantee that both drives will be
serviced after a call. It may "forget" to service one in some
circumstances, including when one of the drive is suspended (it will
eventually fail to service the slave when the master is suspended for
example). This prevents the wakeup requests that gets queued on wakeup
from sleep from beeing serviced in some cases when 2 drives are sharing an
IDE bus.

The problem is deep enough in the way this code works (and there are
probably a few other problematic but rare corner cases) and fixing it
would require some major rethinking of the way IDE decides which channel
to service. This is not 2.6.14 material.

However, in the meantime, Bart has accepted this simple workaround that
will fix the crash on wakeup from sleep since this specific corner case is
actually hitting users to get into 2.6.14.

Signed-off-by: Benjamin Herrenschmidt <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
---

--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -1101,6 +1101,7 @@ static void ide_do_request (ide_hwgroup_
 	ide_hwif_t	*hwif;
 	struct request	*rq;
 	ide_startstop_t	startstop;
+	int             loops = 0;
 
 	/* for atari only: POSSIBLY BROKEN HERE(?) */
 	ide_get_lock(ide_intr, hwgroup);
@@ -1153,6 +1154,7 @@ static void ide_do_request (ide_hwgroup_
 			/* no more work for this hwgroup (for now) */
 			return;
 		}
+	again:
 		hwif = HWIF(drive);
 		if (hwgroup->hwif->sharing_irq &&
 		    hwif != hwgroup->hwif &&
@@ -1192,8 +1194,14 @@ static void ide_do_request (ide_hwgroup_
 		 * though. I hope that doesn't happen too much, hopefully not
 		 * unless the subdriver triggers such a thing in its own PM
 		 * state machine.
+		 *
+		 * We count how many times we loop here to make sure we service
+		 * all drives in the hwgroup without looping for ever
 		 */
 		if (drive->blocked && !blk_pm_request(rq) && !(rq->flags & REQ_PREEMPT)) {
+			drive = drive->next ? drive->next : hwgroup->drive;
+			if (loops++ < 4 && !blk_queue_plugged(drive->queue))
+				goto again;
 			/* We clear busy, there should be no pending ATA command at this point. */
 			hwgroup->busy = 0;
 			break;
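
To make the workaround easier to follow outside the kernel, here is a minimal stand-alone sketch of the retry logic the patch adds. It is only a model, not kernel code: struct drive, pick_serviceable() and MAX_LOOPS are names made up for illustration, the blocked and plugged fields stand in for drive->blocked and blk_queue_plugged(drive->queue), and the PM-request and REQ_PREEMPT checks from the real condition are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one drive on a shared IDE channel. */
struct drive {
    int blocked;         /* non-zero while the drive is suspended */
    int plugged;         /* non-zero while the drive's queue is plugged */
    struct drive *next;  /* next drive in the group, NULL at the end */
};

#define MAX_LOOPS 4      /* same bound as the "loops++ < 4" test in the patch */

/*
 * Model of the patched path: if the chosen drive is blocked, move on to
 * the next drive in the group (wrapping to the first one) and try again,
 * but only a bounded number of times and only if that drive's queue is
 * not plugged. Returns the drive to service, or NULL when giving up.
 */
struct drive *pick_serviceable(struct drive *first, struct drive *chosen)
{
    struct drive *d = chosen;
    int loops = 0;

    assert(first != NULL && chosen != NULL);

    for (;;) {
        if (!d->blocked)
            return d;                    /* this drive can be serviced */

        d = d->next ? d->next : first;   /* try the other drive(s) */
        if (loops++ < MAX_LOOPS && !d->plugged)
            continue;                    /* corresponds to "goto again" */

        return NULL;                     /* give up, as the old code did */
    }
}
```

With a suspended master and an idle slave, pick_serviceable(&master, &master) returns the slave instead of giving up, which is the case the commit message describes. The loop bound keeps it from spinning forever when every drive in the group is blocked.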