Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Fri 2019-04-26 07:58:49, Bart Van Assche wrote:
> On Fri, 2019-04-26 at 12:32 +0200, Pavel Machek wrote:
> > [detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
> > 1 file changed, 58 insertions(+), 43 deletions(-)
> > pavel@duo:/data/l/linux-next-32$ git revert
> > 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> > Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
> > 1163
> > ?
> > [detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for
> > asynchronous probing"
> > 4 files changed, 47 insertions(+), 5 deletions(-)
> >
> >
> > And reverting those two indeed fixes it:
> >
> > Checking version...
> > version is Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr
> > 26 12:27:12 CEST 2019 i686 GNU/Linux
> > Running test...
> > Result is [ TEST SUCCESS ]
> > Test said TEST SUCCESS
>
> Can you share your config file? I hope that will allow me to reproduce this
> issue.

Here you go. You may want to google Thinkpad X60. It's the best notebook ever
made, but... :-).

Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 5.1.0-rc1 Kernel Configuration # # # Compiler: gcc (Debian 4.9.2-10+deb8u2) 4.9.2 # CONFIG_CC_IS_GCC=y CONFIG_GCC_VERSION=40902 CONFIG_CLANG_VERSION=0 CONFIG_CC_HAS_ASM_GOTO=y CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y CONFIG_THREAD_INFO_IN_TASK=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="autobisect1556274822" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_BUILD_SALT="" CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_HAVE_KERNEL_LZO=y CONFIG_HAVE_KERNEL_LZ4=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # CONFIG_KERNEL_LZMA is not set # CONFIG_KERNEL_XZ is not set # CONFIG_KERNEL_LZO is not set # CONFIG_KERNEL_LZ4 is not set CONFIG_DEFAULT_HOSTNAME="pavel" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y # CONFIG_POSIX_MQUEUE is not set # CONFIG_CROSS_MEMORY_ATTACH is not set CONFIG_USELIB=y # CONFIG_AUDIT is not set CONFIG_HAVE_ARCH_AUDITSYSCALL=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_IRQ_SHOW=y CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_GENERIC_IRQ_MIGRATION=y CONFIG_IRQ_DOMAIN=y CONFIG_IRQ_DOMAIN_HIERARCHY=y CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y CONFIG_GENERIC_IRQ_RESERVATION_MODE=y CONFIG_IRQ_FORCED_THREADING=y CONFIG_SPARSE_IRQ=y # CONFIG_GENERIC_IRQ_DEBUGFS is not set CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_ARCH_CLOCKSOURCE_DATA=y CONFIG_ARCH_CLOCKSOURCE_INIT=y CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y CONFIG_GENERIC_CMOS_UPDATE=y # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y CONFIG_NO_HZ=y # CONFIG_HIGH_RES_TIMERS is not set CONFIG_PREEMPT_NONE=y # CONFIG_PREEMPT_VOLUNTARY is not set # CONFIG_PREEMPT is 
not set # # CPU/Task time and stats accounting # CONFIG_TICK_CPU_ACCOUNTING=y # CONFIG_IRQ_TIME_ACCOUNTING is not set # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_PSI is not set # CONFIG_CPU_ISOLATION is not set # # RCU Subsystem # CONFIG_TREE_RCU=y # CONFIG_RCU_EXPERT is not set CONFIG_SRCU=y CONFIG_TREE_SRCU=y CONFIG_RCU_STALL_COMMON=y CONFIG_RCU_NEED_SEGCBLIST=y CONFIG_BUILD_BIN2C=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=18 CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y CONFIG_CGROUPS=y # CONFIG_MEMCG is not set # CONFIG_BLK_CGROUP is not set CONFIG_CGROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_CFS_BANDWIDTH is not set # CONFIG_RT_GROUP_SCHED is not set # CONFIG_CGROUP_PIDS is not set # CONFIG_CGROUP_RDMA is not set # CONFIG_CGROUP_FREEZER is not set # CONFIG_CPUSETS is not set # CONFIG_CGROUP_DEVICE is not set # CONFIG_CGROUP_CPUACCT is not set # CONFIG_CGROUP_PERF is not set # CONFIG_CGROUP_DEBUG is not set CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_IPC_NS is not set # CONFIG_USER_NS is not set # CONFIG_PID_NS is not set # CONFIG_NET_NS is not set # CONFIG_CHECKPOINT_RESTORE is not set CONFIG_SCHED_AUTOGROUP=y CONFIG_SYSFS_DEPRECATED=y # CONFIG_SYSFS_DEPRECATED_V2 is not set CONFIG_RELAY=y # CONFIG_BLK_DEV_INITRD is not set CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y CONFIG_ANON_INODES=y CONFIG_HAVE_UID16=y CONFIG_SYSCTL_EXCEPTION_TRACE=y CONFIG_HAVE_PCSPKR_PLATFORM=y CONFIG_BPF=y CONFIG_EXPERT=y CONFIG_UID16=y CONFIG_MULTIUSER=y CONFIG_SGETMASK_SYSCALL=y
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Fri, 2019-04-26 at 12:32 +0200, Pavel Machek wrote:
> [detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
> 1 file changed, 58 insertions(+), 43 deletions(-)
> pavel@duo:/data/l/linux-next-32$ git revert
> 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
> 1163
> ?
> [detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for
> asynchronous probing"
> 4 files changed, 47 insertions(+), 5 deletions(-)
>
>
> And reverting those two indeed fixes it:
>
> Checking version...
> version is Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr
> 26 12:27:12 CEST 2019 i686 GNU/Linux
> Running test...
> Result is [ TEST SUCCESS ]
> Test said TEST SUCCESS

Can you share your config file? I hope that will allow me to reproduce this
issue.

Thanks,

Bart.
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Thu 2019-04-25 06:35:58, Bart Van Assche wrote:
> On 4/25/19 12:33 AM, Pavel Machek wrote:
> > On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
> >> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> >>> Unfortunately, that one does not revert cleanly on top of -next.
> >>
> >> Can you try the following:
> >>
> >> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
> >> git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> >>
> >> I will see whether I can come up with a better way to analyze what is
> >> going on. I had not expected that these patches would cause any suspend/
> >> resume problems.
> >
> > Not even d16ece reverts:
> >
> > pavel@duo:/data/l/linux-next-32$ git show | head -3
> > commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
> > Author: Stephen Rothwell
> > Date: Tue Apr 23 20:24:59 2019 +1000
> > pavel@duo:/data/l/linux-next-32$ git revert
> > d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
> > error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
> > hint: after resolving the conflicts, mark the corrected paths
> > hint: with 'git add ' or 'git rm '
> > hint: and commit the result with 'git commit'
>
> There has been a non-trivial merge between the block and scsi trees in
> linux-next. That's probably what prevents these patches to revert
> cleanly. How about performing the following tests:
> * Build, boot and test Martin's latest for-5.2 branch
> (git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git; branch
> 5.2/scsi-queue).

Ok, so that's commit a7634b6f7cbbdc6efcf772e080a6fe845d1f6161 .

Suspend/resume is broken there.

> * If suspend/resume does not work reliably with that branch, revert the
> two patches above, rebuild, reboot and retest.

pavel@duo:/data/l/linux-next-32$ git show
commit a7634b6f7cbbdc6efcf772e080a6fe845d1f6161
Author: Colin Ian King
pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
1026
?
[detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
1 file changed, 58 insertions(+), 43 deletions(-)
pavel@duo:/data/l/linux-next-32$ git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
1163
?
[detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for asynchronous probing"
4 files changed, 47 insertions(+), 5 deletions(-)

And reverting those two indeed fixes it:

Checking version...
version is Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr 26 12:27:12 CEST 2019 i686 GNU/Linux
Running test...
Result is [ TEST SUCCESS ]
Test said TEST SUCCESS

Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On 4/25/19 12:33 AM, Pavel Machek wrote:
> On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
>> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
>>> Unfortunately, that one does not revert cleanly on top of -next.
>>
>> Can you try the following:
>>
>> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
>> git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
>>
>> I will see whether I can come up with a better way to analyze what is
>> going on. I had not expected that these patches would cause any suspend/
>> resume problems.
>
> Not even d16ece reverts:
>
> pavel@duo:/data/l/linux-next-32$ git show | head -3
> commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
> Author: Stephen Rothwell
> Date: Tue Apr 23 20:24:59 2019 +1000
> pavel@duo:/data/l/linux-next-32$ git revert
> d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
> error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add ' or 'git rm '
> hint: and commit the result with 'git commit'

There has been a non-trivial merge between the block and scsi trees in
linux-next. That's probably what prevents these patches from reverting
cleanly. How about performing the following tests:
* Build, boot and test Martin's latest for-5.2 branch
  (git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git; branch
  5.2/scsi-queue).
* If suspend/resume does not work reliably with that branch, revert the
  two patches above, rebuild, reboot and retest.

Thanks,

Bart.
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> > Unfortunately, that one does not revert cleanly on top of -next.
>
> Can you try the following:
>
> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
> git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
>
> I will see whether I can come up with a better way to analyze what is
> going on. I had not expected that these patches would cause any suspend/
> resume problems.

Not even d16ece reverts:

pavel@duo:/data/l/linux-next-32$ git show | head -3
commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
Author: Stephen Rothwell
Date: Tue Apr 23 20:24:59 2019 +1000
pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add ' or 'git rm '
hint: and commit the result with 'git commit'

Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed, 2019-04-24 at 12:17 +0200, Pavel Machek wrote:
> On Tue 2019-04-23 07:09:42, Bart Van Assche wrote:
> > On 4/23/19 3:22 AM, Pavel Machek wrote:
> > > > > It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > > > > and resume... but then cursor in X is moving and I can talk to
> > > > > applications cached in memory, but any access to disk hangs.
> > > >
> > > > Mainline problem was identified.
> > > >
> > > > But resume is still broken. I took advantage of fact that I can still
> > > > do cached commands, and got complete dmesg. I'm attaching it.
> > >
> > > Still broken in 0418. Ideas would be welcome at this point.
> >
> > Have you already tried the debugging steps explained in
> > Documentation/power to obtain more information about the nature of the
> > suspend/resume problem?
>
> That won't help, as system resumes ok, then disk hangs.
>
> Does it work for you?

Both "systemctl hibernate" and "systemctl suspend" work perfectly with the
next-20190424 kernel on my laptop (a Dell Precision laptop).

Bart.
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> Unfortunately, that one does not revert cleanly on top of -next.

Can you try the following:

git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3

I will see whether I can come up with a better way to analyze what is
going on. I had not expected that these patches would cause any suspend/
resume problems.

Bart.
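For anyone repeating this test, the two-step revert can be wrapped so that a conflicting revert is abandoned rather than left half-applied in the tree. This is only a sketch: `revert_series` is a hypothetical helper name that does not appear in the thread, and it assumes it is run inside the checkout being tested.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): revert the given commits in
# order; if one does not revert cleanly, abandon that revert and stop.
revert_series() {
    for commit in "$@"; do
        if ! git revert --no-edit "$commit"; then
            # Drops only the in-progress, conflicting revert; reverts that
            # already succeeded stay committed.
            git revert --abort
            echo "revert of $commit did not apply cleanly" >&2
            return 1
        fi
    done
}

# Usage, with the two commits named in the message above:
#   revert_series d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 \
#                 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
```

A non-zero return status means the tree still needs manual conflict resolution or a different base branch.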
Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed 2019-04-24 22:48:32, Pavel Machek wrote:
> Hi!
>
> > Not block, but it seems scsi subsystem is:
>
> commit 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> Author: Bart Van Assche
> Date: Wed Mar 20 13:09:19 2019 -0700
>
> scsi: sd: Rely on the driver core for asynchronous probing
>
> As explained during the 2018 LSF/MM session about increasing SCSI
> disk
> probing concurrency, the problems with the current probing
> approach are as
>
> Seems to be responsible. Full log attached.

Unfortunately, that one does not revert cleanly on top of -next.

Any ideas what is wrong? Does suspend/resume work for you? I can test
patches.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
Hi!

> Not block, but it seems scsi subsystem is:

commit 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
Author: Bart Van Assche
Date: Wed Mar 20 13:09:19 2019 -0700

    scsi: sd: Rely on the driver core for asynchronous probing

    As explained during the 2018 LSF/MM session about increasing SCSI disk
    probing concurrency, the problems with the current probing approach are as

Seems to be responsible. Full log attached.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

# bad: [76c938fcaa4b4a5d8f05fa907925d5043834964e] Add linux-next specific files for 20190423
# good: [7142eaa58b49d9de492ccc16d48df7c488a5fbb6] Merge tag 'mips_fixes_5.1_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect start 'next-20190423' '7142eaa58b49d9de492ccc16d48df7c488a5fbb6'
# good: [ed04f675fa2c22316d7b57bea1258a18a47537ea] Merge remote-tracking branch 'crypto/master'
git bisect good ed04f675fa2c22316d7b57bea1258a18a47537ea
# good: [4a99e5b3463f5c936540958914bff57ec50ac1e0] Merge remote-tracking branch 'spi/for-next'
git bisect good 4a99e5b3463f5c936540958914bff57ec50ac1e0
# good: [61cabbda2a7e966b689a6791050ad675e6dff274] Merge remote-tracking branch 'staging/staging-next'
git bisect good 61cabbda2a7e966b689a6791050ad675e6dff274
# bad: [c8f0c2453f64529035e25fbfb9de9d24e98baff7] Merge remote-tracking branch 'coresight/next'
git bisect bad c8f0c2453f64529035e25fbfb9de9d24e98baff7
# bad: [6fb251c6f174d3cc571391baa9f6e57fff505446] Merge branch 'misc' into for-next
git bisect bad 6fb251c6f174d3cc571391baa9f6e57fff505446
# bad: [78a8ab3cc0f95a66c8fb2429030289103de173e7] scsi: qedf: fixup bit operations
git bisect bad 78a8ab3cc0f95a66c8fb2429030289103de173e7
# good: [c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f] scsi: core: remove the scsi_ioctl_reset export
git bisect good c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f
# good: [cbb24e26735f6142ba994b4d44fc2dcd54c3fe1f] scsi: ufs-mediatek: Make some symbols static
git bisect good cbb24e26735f6142ba994b4d44fc2dcd54c3fe1f
# bad: [21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3] scsi: sd: Rely on the driver core for asynchronous probing
git bisect bad 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
# good: [e7f7b6f38a44697428f5a2e7c606de028df2b0e3] scsi: lpfc: change snprintf to scnprintf for possible overflow
git bisect good e7f7b6f38a44697428f5a2e7c606de028df2b0e3
# good: [3e14592da654d53d87987aa09753d5a26e45446f] scsi: gdth: Only call dma_free_coherent when buf is not NULL in ioc_general
git bisect good 3e14592da654d53d87987aa09753d5a26e45446f
# good: [8378573353728a02602d6f956a3df48db0505c65] scsi: libcxgbi: remove uninitialized variable len
git bisect good 8378573353728a02602d6f956a3df48db0505c65
# good: [ea9006dfda65b7dc369aaa2359b3dedfc1bb08b6] scsi: mpt3sas: fix indentation issue
git bisect good ea9006dfda65b7dc369aaa2359b3dedfc1bb08b6
# first bad commit: [21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3] scsi: sd: Rely on the driver core for asynchronous probing
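A bisect log in the format above ends with a "# first bad commit:" line once the bisection has converged. As a small sketch, assuming the log has been saved to a file (the helper name `first_bad_commit` is made up for illustration), the culprit hash can be pulled out like this:

```shell
#!/bin/sh
# Hypothetical helper: print the hash recorded on the "# first bad commit:"
# line of a saved `git bisect log`. Prints nothing if the bisection had not
# finished when the log was taken.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]\{7,40\}\)\].*/\1/p' "$1"
}

# Usage (bisect.log is an assumed file name):
#   git bisect log > bisect.log
#   first_bad_commit bisect.log
```

A saved log can also be fed back to git with `git bisect replay`, which lets someone else resume or re-check the same bisection in their own checkout.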
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed 2019-04-24 12:48:50, Pavel Machek wrote:
> On Wed 2019-04-24 11:54:31, Pavel Machek wrote:
> > On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> > > On 4/23/19 4:22 AM, Pavel Machek wrote:
> > > > Hi!
> > > >
> > > >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > > >>> and resume... but then cursor in X is moving and I can talk to
> > > >>> applications cached in memory, but any access to disk hangs.
> > > >>
> > > >> Mainline problem was identified.
> > > >>
> > > >> But resume is still broken. I took advantage of fact that I can still
> > > >> do cached commands, and got complete dmesg. I'm attaching it.
> > > >
> > > > Still broken in 0418. Ideas would be welcome at this point.
> > >
> > > Bisect it?
> >
> > commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
> > Merge: 3c442d5 6c88d73
> > Author: Jens Axboe
> > Date: Mon Apr 22 13:57:36 2019 -0600
> >
> > Works ok. So... block is not responsible.
> >
> > Let me check
> >
> > commit 91b112cf3b599f06f1e810cfedf37023f25d5588
> > Merge: fb2c4a8 e32d939
> > Author: Rafael J. Wysocki
> > Date: Mon Apr 22 01:52:48 2019 +0200
>
> Suspend/resume ok, so pm not responsible. Let me check next-20190423.

Not block, but it seems scsi subsystem is:

pavel@duo:/data/l/linux-next-32$ git bisect log
# bad: [76c938fcaa4b4a5d8f05fa907925d5043834964e] Add linux-next specific files for 20190423
# good: [7142eaa58b49d9de492ccc16d48df7c488a5fbb6] Merge tag 'mips_fixes_5.1_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect start 'next-20190423' '7142eaa58b49d9de492ccc16d48df7c488a5fbb6'
# good: [ed04f675fa2c22316d7b57bea1258a18a47537ea] Merge remote-tracking branch 'crypto/master'
git bisect good ed04f675fa2c22316d7b57bea1258a18a47537ea
# good: [4a99e5b3463f5c936540958914bff57ec50ac1e0] Merge remote-tracking branch 'spi/for-next'
git bisect good 4a99e5b3463f5c936540958914bff57ec50ac1e0
# good: [61cabbda2a7e966b689a6791050ad675e6dff274] Merge remote-tracking branch 'staging/staging-next'
git bisect good 61cabbda2a7e966b689a6791050ad675e6dff274
# bad: [c8f0c2453f64529035e25fbfb9de9d24e98baff7] Merge remote-tracking branch 'coresight/next'
git bisect bad c8f0c2453f64529035e25fbfb9de9d24e98baff7
# bad: [6fb251c6f174d3cc571391baa9f6e57fff505446] Merge branch 'misc' into for-next
git bisect bad 6fb251c6f174d3cc571391baa9f6e57fff505446
# bad: [78a8ab3cc0f95a66c8fb2429030289103de173e7] scsi: qedf: fixup bit operations
git bisect bad 78a8ab3cc0f95a66c8fb2429030289103de173e7
# good: [c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f] scsi: core: remove the scsi_ioctl_reset export
git bisect good c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Wed 2019-04-24 11:54:31, Pavel Machek wrote:
> On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> > On 4/23/19 4:22 AM, Pavel Machek wrote:
> > > Hi!
> > >
> > >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > >>> and resume... but then cursor in X is moving and I can talk to
> > >>> applications cached in memory, but any access to disk hangs.
> > >>
> > >> Mainline problem was identified.
> > >>
> > >> But resume is still broken. I took advantage of fact that I can still
> > >> do cached commands, and got complete dmesg. I'm attaching it.
> > >
> > > Still broken in 0418. Ideas would be welcome at this point.
> >
> > Bisect it?
>
> commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
> Merge: 3c442d5 6c88d73
> Author: Jens Axboe
> Date: Mon Apr 22 13:57:36 2019 -0600
>
> Works ok. So... block is not responsible.
>
> Let me check
>
> commit 91b112cf3b599f06f1e810cfedf37023f25d5588
> Merge: fb2c4a8 e32d939
> Author: Rafael J. Wysocki
> Date: Mon Apr 22 01:52:48 2019 +0200

Suspend/resume ok, so pm not responsible. Let me check next-20190423.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Tue 2019-04-23 07:09:42, Bart Van Assche wrote:
> On 4/23/19 3:22 AM, Pavel Machek wrote:
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> >
> > Still broken in 0418. Ideas would be welcome at this point.
>
> Have you already tried the debugging steps explained in
> Documentation/power to obtain more information about the nature of the
> suspend/resume problem?

That won't help, as system resumes ok, then disk hangs.

Does it work for you?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> On 4/23/19 4:22 AM, Pavel Machek wrote:
> > Hi!
> >
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> >
> > Still broken in 0418. Ideas would be welcome at this point.
>
> Bisect it?

commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
Merge: 3c442d5 6c88d73
Author: Jens Axboe
Date: Mon Apr 22 13:57:36 2019 -0600

Works ok. So... block is not responsible.

Let me check

commit 91b112cf3b599f06f1e810cfedf37023f25d5588
Merge: fb2c4a8 e32d939
Author: Rafael J. Wysocki
Date: Mon Apr 22 01:52:48 2019 +0200

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> On 4/23/19 4:22 AM, Pavel Machek wrote:
> > Hi!
> >
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> >
> > Still broken in 0418. Ideas would be welcome at this point.
>
> Bisect it?

Before I start heavy debugging, it would be interesting to know... does
suspend/resume work for you in -next?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On 4/23/19 3:22 AM, Pavel Machek wrote:
>>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
>>> and resume... but then cursor in X is moving and I can talk to
>>> applications cached in memory, but any access to disk hangs.
>>
>> Mainline problem was identified.
>>
>> But resume is still broken. I took advantage of fact that I can still
>> do cached commands, and got complete dmesg. I'm attaching it.
>
> Still broken in 0418. Ideas would be welcome at this point.

Have you already tried the debugging steps explained in
Documentation/power to obtain more information about the nature of the
suspend/resume problem?

Bart.
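For readers unfamiliar with those debugging steps: the kernel's basic PM debugging notes describe a `pm_test` file under /sys/power, available when the kernel is built with CONFIG_PM_DEBUG, that makes a suspend attempt stop early at a chosen stage and then roll back. The following is a rough sketch of driving it, not something posted in this thread; the helper name `pm_test_cycle` and its optional directory argument are assumptions added for illustration, and writing to the real files requires root.

```shell
#!/bin/sh
# Sketch: run one suspend cycle that stops early at the given pm_test level.
# $1: test level (freezer, devices, platform, processors, core, or none)
# $2: sysfs power directory; defaults to /sys/power. This parameter is an
#     assumption added here so the helper can be tried against a scratch dir.
pm_test_cycle() {
    level=$1
    power_dir=${2:-/sys/power}
    # Select how far the suspend sequence goes before it is rolled back.
    echo "$level" > "$power_dir/pm_test" || return 1
    # Start a suspend-to-RAM cycle; with pm_test set it stops at that level.
    echo mem > "$power_dir/state"
}

# Usage (as root, on a kernel with CONFIG_PM_DEBUG):
#   pm_test_cycle devices
#   pm_test_cycle none      # restore normal suspend behaviour
```

Working from the shallowest level towards the deepest narrows down which stage of suspend/resume first misbehaves.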
Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60
On 4/23/19 4:22 AM, Pavel Machek wrote:
> Hi!
>
>>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
>>> and resume... but then cursor in X is moving and I can talk to
>>> applications cached in memory, but any access to disk hangs.
>>
>> Mainline problem was identified.
>>
>> But resume is still broken. I took advantage of fact that I can still
>> do cached commands, and got complete dmesg. I'm attaching it.
>
> Still broken in 0418. Ideas would be welcome at this point.

Bisect it?

--
Jens Axboe
next-20190408..0418: Suspend/resume problems on Thinkpad X60
Hi!

> > It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > and resume... but then cursor in X is moving and I can talk to
> > applications cached in memory, but any access to disk hangs.
>
> Mainline problem was identified.
>
> But resume is still broken. I took advantage of fact that I can still
> do cached commands, and got complete dmesg. I'm attaching it.

Still broken in 0418. Ideas would be welcome at this point.

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: next-20190408: Suspend/resume problems on Thinkpad X60
Hi!

> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> and resume... but then cursor in X is moving and I can talk to
> applications cached in memory, but any access to disk hangs.

Mainline problem was identified.

But resume is still broken. I took advantage of the fact that I can still
run cached commands, and got a complete dmesg. I'm attaching it.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

delme2.gz
Description: application/gzip
next-20190408: Suspend/resume problems on Thinkpad X60
Hi!

It boots ok (unlike mainline -- I'm debugging that), and I can suspend
and resume... but then the cursor in X still moves and I can talk to
applications cached in memory, but any access to disk hangs.

Any ideas?

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: Resume problems
Rafael J. Wysocki wrote:
>>
>> After all I think all these problems may be somehow ACPI-related
>> but the question is why they get triggered by Suspend/Hibernation.
>
> They certainly are ACPI-related, because the only difference between level 4
> and level 3 suspend testing is that some global ACPI methods are executed
> at level 3 (in addition to level 4).
>
> Unfortunately, I have no idea what to do next, for now.
>
> I think you can file a bug report at http://bugzilla.kernel.org and put a link
> to this thread in there (against ACPI and please add my address to the CC
> list).

Also I patched 2.6.23 with that patch and Hibernation works out of the box.
Suspend to RAM seems to work fine, just my video card is acting up (old
nvidia card). I'll play with the vbe tool on the weekend.

Also I can reproduce that bug in 2.6.23 when I use standby.
I've started to bisect but it will take some time. When I'm done I will post
a bug report.

Thanks for your help so far.

>
> Greetings,
> Rafael
>

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Resume problems
On Tuesday, 23 October 2007 03:01, Gabriel C wrote:
>
> > Also box just froze on level 3 but I got an ACPI error at least which I
> > didn't get in any other dmesg till now :
> > ( also patch was tested with HT disabled and Suspend and Hibernation
> > enabled in kernel and BIOS )
> >
> > ...
> >
> > Oct 23 01:51:05 lara [ 273.512374] PM: Removing info for No Bus:input0
> > Oct 23 01:51:05 lara [ 274.545158] PM: Removing info for No Bus:mouse0
> > Oct 23 01:51:05 lara [ 274.551435] PM: Removing info for No Bus:event1
> > Oct 23 01:51:05 lara [ 274.559493] PM: Removing info for No Bus:input1
> > Oct 23 01:53:06 lara [ 394.869468] ACPI Error (evevent-0303): No installed
> > handler for fixed event [0002] [20070126]
> >
> >
> >
> > ( I hard reset after that )
> >
> > I try level 2 and 1 now I just wanted to let you know.
> >
>
> Same issues with level 2 and 1.

Yes. If you have a problem at level n, it should always reappear for n-1 etc.

> BTW I found out why my box does not shutdown with acpi=ht. It seems like
> libata does not like that
> acpi mode =) dropping the '... read http://linux-ata.org/shutdown.html ,
> power down manually' message.
>
> That works perfectly with full acpi here.
>
> After all I think all these problems may be somehow ACPI-related
> but the question is why they get triggered by Suspend/Hibernation.

They certainly are ACPI-related, because the only difference between level 4
and level 3 suspend testing is that some global ACPI methods are executed
at level 3 (in addition to level 4).

Unfortunately, I have no idea what to do next, for now.

I think you can file a bug report at http://bugzilla.kernel.org and put a link
to this thread in there (against ACPI and please add my address to the CC
list).

Greetings,
Rafael
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
On Tue, 23 Oct 2007, Rafael J. Wysocki wrote:
> On Monday, 22 October 2007 16:11, Mark Lord wrote:
> > Rafael,
> >
> > What happens to the jiffies variable on resume from RAM, and from DISK?
> > Do we restore it to the value it had at suspend,
> > or just leave it be with whatever?
> >
> > The answer has to be "restore the value it had at suspend time",
> > but I figured I'd check here anyway.
> >
> > ??
>
> Well, frankly, I've lost track of that recently, but it seems that we just use
> the pre-suspend jiffies (at least in the current -git).
>
> Thomas knows better, I guess. :-)

We use the pre-suspend value if nothing else fiddled in the variable.

	tglx
Re: Resume problems
> Also box just froze on level 3 but I got a ACPI error at least which I didn't
> got in any other dmesg till now :
> ( also patch was tested with HT disabled and Suspend and Hibernation enabled
> in kernel and BIOS )
>
> ...
>
> Oct 23 01:51:05 lara [ 273.512374] PM: Removing info for No Bus:input0
> Oct 23 01:51:05 lara [ 274.545158] PM: Removing info for No Bus:mouse0
> Oct 23 01:51:05 lara [ 274.551435] PM: Removing info for No Bus:event1
> Oct 23 01:51:05 lara [ 274.559493] PM: Removing info for No Bus:input1
> Oct 23 01:53:06 lara [ 394.869468] ACPI Error (evevent-0303): No installed
> handler for fixed event [0002] [20070126]
>
> ( I hard reseted after that )
>
> I try level 2 and 1 now I just wanted to let you know.
>

Same issues with level 2 and 1.

BTW I found out why my box does not shutdown with acpi=ht. It seems like
libata does not like that
acpi mode =) dropping the '... read http://linux-ata.org/shutdown.html ,
power down manually' message.

That works perfectly with full acpi here.

After all I think all this problems may be some who ACPI related
but the question is why they get triggered by Suspend/Hibernation.

If you want me to test something else just let me know.

Gabriel
Re: Resume problems
Gabriel C wrote: > Rafael J. Wysocki wrote: >> On Tuesday, 23 October 2007 01:00, Gabriel C wrote: >>> Rafael J. Wysocki wrote: On Monday, 22 October 2007 18:15, Gabriel C wrote: > Hi all , > > I'm running current git + aic7xxx suspend patch from > http://bugzilla.kernel.org/show_bug.cgi?id=3062 > on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ). > > Suspend works fine but on resume I have some problems. > All CPU's but boot CPU won't come back , everything else seems fine. Can you please try to disable HT and suspend? >>> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ? >>> >>> If you mean that , sure I can try doing so. >> With suspend or hibernation enabled in the kernel, but with HT disabled in >> the >> BIOS. > > Ok trying in some minutes. Disabling HT does not make any difference , nor disabling / enabling only one Hibernation or Suspend in kernel and BIOS nor any combination of these. > >>> I also could disable Suspend to RAM completly from BIOS as well if you want. >> No, that rather won't work. >> > ... > > Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ... > Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1 > Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1 > Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP > code > Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000 > Oct 22 15:02:28 lara [ 54.638093] Not responding. > Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1... > Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed > Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed > Oct 22 15:02:28 lara [ 54.638307] ... 
APIC #1 SPIV: failed > Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online > Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1 > Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1 > Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5 > Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2 > Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2 > Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP > code > Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000 > Oct 22 15:02:28 lara [ 59.656795] Not responding. > Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2... > Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed > Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed > Oct 22 15:02:28 lara [ 59.657011] ... APIC #2 SPIV: failed > Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online > Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2 > Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2 > Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5 > Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3 > Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3 > Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP > code > Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000 > Oct 22 15:02:28 lara [ 64.675517] Not responding. > Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3... > Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed > Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed > Oct 22 15:02:28 lara [ 64.675731] ... 
APIC #3 SPIV: failed > Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online > Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3 > Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3 > Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5 > Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable >: System is already in ACPI mode > > ... > > After I've played with a lot boot options I found out booting with ' > acpi=ht ' will make the CPU's work again but now > I have a problem on Suspend. Everything seems to just go down disks etc > but the box itself is for some reason still on. > So I've tested reboot=<> options with no luck. > ( after waiting 5 minutes to be sure everything is really off I can just > hit power button). On resume now everything is fine. > > I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a > mix of all so I'm CC'ing linux-acpi as well. > The only thing I noticed is the 'Breaking affinity for irq XX' on suspend > without acpi=ht messages. > > I can't even tell whatever other kernel versions are working because > aic7xxx driver didn't got suspend support till now > (
Re: Resume problems
Rafael J. Wysocki wrote: > On Tuesday, 23 October 2007 01:00, Gabriel C wrote: >> Rafael J. Wysocki wrote: >>> On Monday, 22 October 2007 18:15, Gabriel C wrote: Hi all , I'm running current git + aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ). Suspend works fine but on resume I have some problems. All CPU's but boot CPU won't come back , everything else seems fine. >>> Can you please try to disable HT and suspend? >> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ? >> >> If you mean that , sure I can try doing so. > > With suspend or hibernation enabled in the kernel, but with HT disabled in the > BIOS. Ok trying in some minutes. > >> I also could disable Suspend to RAM completly from BIOS as well if you want. > > No, that rather won't work. > ... Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ... Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1 Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1 Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP code Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000 Oct 22 15:02:28 lara [ 54.638093] Not responding. Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1... Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed Oct 22 15:02:28 lara [ 54.638307] ... 
APIC #1 SPIV: failed Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1 Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1 Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5 Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2 Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2 Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP code Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000 Oct 22 15:02:28 lara [ 59.656795] Not responding. Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2... Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed Oct 22 15:02:28 lara [ 59.657011] ... APIC #2 SPIV: failed Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2 Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2 Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5 Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3 Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3 Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP code Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000 Oct 22 15:02:28 lara [ 64.675517] Not responding. Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3... Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed Oct 22 15:02:28 lara [ 64.675731] ... 
APIC #3 SPIV: failed Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3 Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3 Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5 Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable : System is already in ACPI mode ... After I've played with a lot boot options I found out booting with ' acpi=ht ' will make the CPU's work again but now I have a problem on Suspend. Everything seems to just go down disks etc but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. ( after waiting 5 minutes to be sure everything is really off I can just hit power button). On resume now everything is fine. I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a mix of all so I'm CC'ing linux-acpi as well. The only thing I noticed is the 'Breaking affinity for irq XX' on suspend without acpi=ht messages. I can't even tell whatever other kernel versions are working because aic7xxx driver didn't got suspend support till now ( or at least never worked here ). I know suspend worked fine on windows with that box. There is my config and dmesg ( good and bad one ) : http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
Re: Resume problems
On Tuesday, 23 October 2007 01:00, Gabriel C wrote: > Rafael J. Wysocki wrote: > > On Monday, 22 October 2007 18:15, Gabriel C wrote: > >> Hi all , > >> > >> I'm running current git + aic7xxx suspend patch from > >> http://bugzilla.kernel.org/show_bug.cgi?id=3062 > >> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ). > >> > >> Suspend works fine but on resume I have some problems. > >> All CPU's but boot CPU won't come back , everything else seems fine. > > > > Can you please try to disable HT and suspend? > > So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ? > > If you mean that , sure I can try doing so. With suspend or hibernation enabled in the kernel, but with HT disabled in the BIOS. > I also could disable Suspend to RAM completly from BIOS as well if you want. No, that rather won't work. > > > >> ... > >> > >> Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ... > >> Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1 > >> Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1 > >> Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP code > >> Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000 > >> Oct 22 15:02:28 lara [ 54.638093] Not responding. > >> Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1... > >> Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed > >> Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed > >> Oct 22 15:02:28 lara [ 54.638307] ... 
APIC #1 SPIV: failed > >> Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online > >> Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1 > >> Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1 > >> Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5 > >> Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2 > >> Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2 > >> Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP code > >> Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000 > >> Oct 22 15:02:28 lara [ 59.656795] Not responding. > >> Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2... > >> Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed > >> Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed > >> Oct 22 15:02:28 lara [ 59.657011] ... APIC #2 SPIV: failed > >> Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online > >> Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2 > >> Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2 > >> Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5 > >> Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3 > >> Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3 > >> Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP code > >> Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000 > >> Oct 22 15:02:28 lara [ 64.675517] Not responding. > >> Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3... > >> Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed > >> Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed > >> Oct 22 15:02:28 lara [ 64.675731] ... 
APIC #3 SPIV: failed > >> Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online > >> Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3 > >> Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3 > >> Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5 > >> Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable > >> : System is already in ACPI mode > >> > >> ... > >> > >> After I've played with a lot boot options I found out booting with ' > >> acpi=ht ' will make the CPU's work again but now > >> I have a problem on Suspend. Everything seems to just go down disks etc > >> but the box itself is for some reason still on. > >> So I've tested reboot=<> options with no luck. > >> ( after waiting 5 minutes to be sure everything is really off I can just > >> hit power button). On resume now everything is fine. > >> > >> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a > >> mix of all so I'm CC'ing linux-acpi as well. > >> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend > >> without acpi=ht messages. > >> > >> I can't even tell whatever other kernel versions are working because > >> aic7xxx driver didn't got suspend support till now > >> ( or at least never worked here ). I know suspend worked fine on windows > >> with that box. > >> > >> There is my config and dmesg ( good and bad one ) : > >> > >> > >> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt > >> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt > >> http://194.231.229.228/suspend/config >
Re: Resume problems
Rafael J. Wysocki wrote: > On Monday, 22 October 2007 18:15, Gabriel C wrote: >> Hi all , >> >> I'm running current git + aic7xxx suspend patch from >> http://bugzilla.kernel.org/show_bug.cgi?id=3062 >> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ). >> >> Suspend works fine but on resume I have some problems. >> All CPU's but boot CPU won't come back , everything else seems fine. > > Can you please try to disable HT and suspend? So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ? If you mean that , sure I can try doing so. I also could disable Suspend to RAM completly from BIOS as well if you want. > >> ... >> >> Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ... >> Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1 >> Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1 >> Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP code >> Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000 >> Oct 22 15:02:28 lara [ 54.638093] Not responding. >> Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1... >> Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed >> Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed >> Oct 22 15:02:28 lara [ 54.638307] ... APIC #1 SPIV: failed >> Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online >> Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1 >> Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1 >> Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5 >> Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2 >> Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2 >> Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP code >> Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000 >> Oct 22 15:02:28 lara [ 59.656795] Not responding. >> Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2... 
>> Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed >> Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed >> Oct 22 15:02:28 lara [ 59.657011] ... APIC #2 SPIV: failed >> Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online >> Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2 >> Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2 >> Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5 >> Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3 >> Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3 >> Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP code >> Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000 >> Oct 22 15:02:28 lara [ 64.675517] Not responding. >> Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3... >> Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed >> Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed >> Oct 22 15:02:28 lara [ 64.675731] ... APIC #3 SPIV: failed >> Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online >> Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3 >> Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3 >> Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5 >> Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable >> : System is already in ACPI mode >> >> ... >> >> After I've played with a lot boot options I found out booting with ' acpi=ht >> ' will make the CPU's work again but now >> I have a problem on Suspend. Everything seems to just go down disks etc but >> the box itself is for some reason still on. >> So I've tested reboot=<> options with no luck. >> ( after waiting 5 minutes to be sure everything is really off I can just hit >> power button). On resume now everything is fine. >> >> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a mix >> of all so I'm CC'ing linux-acpi as well. 
>> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend >> without acpi=ht messages. >> >> I can't even tell whatever other kernel versions are working because aic7xxx >> driver didn't got suspend support till now >> ( or at least never worked here ). I know suspend worked fine on windows >> with that box. >> >> There is my config and dmesg ( good and bad one ) : >> >> >> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt >> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt >> http://194.231.229.228/suspend/config > > Well, I think we have a problem with the CPU hotplug. > > Can you try to offline-online CPUs (without suspending) and see if that works? Yes does work when I do it manually : [ 6687.595842] CPU 1 is now offline [ 6687.711425] CPU 2 is now offline [ 6687.819330] CPU 3 is now offline [ 6687.819337] SMP alternatives: switching to UP code [
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
On Monday, 22 October 2007 16:11, Mark Lord wrote:
> Rafael,
>
> What happens to the jiffies variable on resume from RAM, and from DISK?
> Do we restore it to the value it had at suspend,
> or just leave it be with whatever?
>
> The answer has to be "restore the value it had at suspend time",
> but I figured I'd check here anyway.
>
> ??

Well, frankly, I've lost track of that recently, but it seems that we just use
the pre-suspend jiffies (at least in the current -git).

Thomas knows better, I guess. :-)

Greetings,
Rafael
Re: Resume problems
On Monday, 22 October 2007 18:15, Gabriel C wrote: > Hi all , > > I'm running current git + aic7xxx suspend patch from > http://bugzilla.kernel.org/show_bug.cgi?id=3062 > on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ). > > Suspend works fine but on resume I have some problems. > All CPU's but boot CPU won't come back , everything else seems fine. Can you please try to disable HT and suspend? > ... > > Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ... > Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1 > Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1 > Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP code > Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000 > Oct 22 15:02:28 lara [ 54.638093] Not responding. > Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1... > Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed > Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed > Oct 22 15:02:28 lara [ 54.638307] ... APIC #1 SPIV: failed > Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online > Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1 > Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1 > Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5 > Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2 > Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2 > Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP code > Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000 > Oct 22 15:02:28 lara [ 59.656795] Not responding. > Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2... > Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed > Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed > Oct 22 15:02:28 lara [ 59.657011] ... 
APIC #2 SPIV: failed > Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online > Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2 > Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2 > Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5 > Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3 > Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3 > Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP code > Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000 > Oct 22 15:02:28 lara [ 64.675517] Not responding. > Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3... > Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed > Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed > Oct 22 15:02:28 lara [ 64.675731] ... APIC #3 SPIV: failed > Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online > Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3 > Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3 > Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5 > Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable > : System is already in ACPI mode > > ... > > After I've played with a lot boot options I found out booting with ' acpi=ht > ' will make the CPU's work again but now > I have a problem on Suspend. Everything seems to just go down disks etc but > the box itself is for some reason still on. > So I've tested reboot=<> options with no luck. > ( after waiting 5 minutes to be sure everything is really off I can just hit > power button). On resume now everything is fine. > > I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a mix > of all so I'm CC'ing linux-acpi as well. > The only thing I noticed is the 'Breaking affinity for irq XX' on suspend > without acpi=ht messages. 
> > I can't even tell whatever other kernel versions are working because aic7xxx > driver didn't got suspend support till now > ( or at least never worked here ). I know suspend worked fine on windows with > that box. > > There is my config and dmesg ( good and bad one ) : > > > http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt > http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt > http://194.231.229.228/suspend/config Well, I think we have a problem with the CPU hotplug. Can you try to offline-online CPUs (without suspending) and see if that works? Greetings, Rafael - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
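The offline/online test suggested above can be run by hand through sysfs, without suspending. What follows is only a minimal sketch, assuming the usual /sys/devices/system/cpu/cpuN/online control files and root privileges; the helper names (set_cpu_online, is_cpu_online, cycle_cpus) are hypothetical and not taken from this thread. The boot CPU often has no "online" file on x86, so it is left out of the list.

```python
from pathlib import Path

# Usual sysfs location of the per-CPU hotplug controls. It is a parameter
# of every helper so the code can be tried against a scratch directory.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Write 1 (online) or 0 (offline) to cpu<N>/online; needs root on real sysfs."""
    (root / f"cpu{cpu}" / "online").write_text("1" if online else "0")


def is_cpu_online(cpu: int, root: Path = SYSFS_CPU_ROOT) -> bool:
    """Read back cpu<N>/online and report whether it contains 1."""
    return (root / f"cpu{cpu}" / "online").read_text().strip() == "1"


def cycle_cpus(cpus, root: Path = SYSFS_CPU_ROOT):
    """Offline and then online each CPU; return the CPUs that did not come back."""
    failed = []
    for cpu in cpus:
        try:
            set_cpu_online(cpu, False, root)
            set_cpu_online(cpu, True, root)
        except OSError:
            # The write was rejected or the control file does not exist.
            failed.append(cpu)
            continue
        if not is_cpu_online(cpu, root):
            failed.append(cpu)
    return failed
```

On the box from this report one would call cycle_cpus([1, 2, 3]) as root and compare the returned list with the CPUs that fail after resume.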
Resume problems
Hi all,

I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).

Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.

...
Oct 22 15:02:28 lara [ 49.618795] Enabling non-boot CPUs ...
Oct 22 15:02:28 lara [ 49.622211] PM: Adding info for No Bus:msr1
Oct 22 15:02:28 lara [ 49.622259] PM: Adding info for No Bus:cpu1
Oct 22 15:02:28 lara [ 49.622302] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [ 49.623536] Booting processor 1/1 eip 3000
Oct 22 15:02:28 lara [ 54.638093] Not responding.
Oct 22 15:02:28 lara [ 54.638096] Inquiring remote APIC #1...
Oct 22 15:02:28 lara [ 54.638099] ... APIC #1 ID: failed
Oct 22 15:02:28 lara [ 54.638204] ... APIC #1 VERSION: failed
Oct 22 15:02:28 lara [ 54.638307] ... APIC #1 SPIV: failed
Oct 22 15:02:28 lara [ 54.638427] skipping cpu1, didn't come online
Oct 22 15:02:28 lara [ 54.638602] PM: Removing info for No Bus:msr1
Oct 22 15:02:28 lara [ 54.638643] PM: Removing info for No Bus:cpu1
Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5
Oct 22 15:02:28 lara [ 54.640908] PM: Adding info for No Bus:msr2
Oct 22 15:02:28 lara [ 54.640939] PM: Adding info for No Bus:cpu2
Oct 22 15:02:28 lara [ 54.640976] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [ 54.641961] Booting processor 2/2 eip 3000
Oct 22 15:02:28 lara [ 59.656795] Not responding.
Oct 22 15:02:28 lara [ 59.656799] Inquiring remote APIC #2...
Oct 22 15:02:28 lara [ 59.656803] ... APIC #2 ID: failed
Oct 22 15:02:28 lara [ 59.656907] ... APIC #2 VERSION: failed
Oct 22 15:02:28 lara [ 59.657011] ... APIC #2 SPIV: failed
Oct 22 15:02:28 lara [ 59.657131] skipping cpu2, didn't come online
Oct 22 15:02:28 lara [ 59.657300] PM: Removing info for No Bus:msr2
Oct 22 15:02:28 lara [ 59.657343] PM: Removing info for No Bus:cpu2
Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5
Oct 22 15:02:28 lara [ 59.659605] PM: Adding info for No Bus:msr3
Oct 22 15:02:28 lara [ 59.659637] PM: Adding info for No Bus:cpu3
Oct 22 15:02:28 lara [ 59.659673] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [ 59.660725] Booting processor 3/3 eip 3000
Oct 22 15:02:28 lara [ 64.675517] Not responding.
Oct 22 15:02:28 lara [ 64.675520] Inquiring remote APIC #3...
Oct 22 15:02:28 lara [ 64.675524] ... APIC #3 ID: failed
Oct 22 15:02:28 lara [ 64.675628] ... APIC #3 VERSION: failed
Oct 22 15:02:28 lara [ 64.675731] ... APIC #3 SPIV: failed
Oct 22 15:02:28 lara [ 64.675859] skipping cpu3, didn't come online
Oct 22 15:02:28 lara [ 64.676017] PM: Removing info for No Bus:msr3
Oct 22 15:02:28 lara [ 64.676059] PM: Removing info for No Bus:cpu3
Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5
Oct 22 15:02:28 lara [ 64.676326] evxfevnt-0079 [00] enable: System is already in ACPI mode
...

After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.

I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.

The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.

I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).

I know suspend worked fine on Windows with that box.

Here are my config and dmesg (good and bad one):

http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
http://194.231.229.228/suspend/config

Regards,
Gabriel

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
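For readers following the log above, here is a minimal Python sketch (not part of the original report) that pulls the "Error taking CPUn up" lines out of a dmesg excerpt and decodes the error number. The function name `failed_cpus` and the regular expression are illustrative assumptions; the sample lines are copied from the log in this message. Error 5 is EIO on Linux.

```python
import errno
import re

# Matches kernel lines such as "[ 54.638678] Error taking CPU1 up: -5".
# The pattern is an assumption based on the log format shown in this thread.
FAIL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Error taking CPU(\d+) up: (-?\d+)")


def failed_cpus(log_text):
    """Return (timestamp, cpu, errno_name) for every failed CPU bring-up."""
    results = []
    for match in FAIL_RE.finditer(log_text):
        stamp = float(match.group(1))
        cpu = int(match.group(2))
        code = abs(int(match.group(3)))
        # Translate the numeric code (e.g. 5) into its symbolic name.
        name = errno.errorcode.get(code, "unknown")
        results.append((stamp, cpu, name))
    return results


sample = """\
Oct 22 15:02:28 lara [ 54.638678] Error taking CPU1 up: -5
Oct 22 15:02:28 lara [ 59.657379] Error taking CPU2 up: -5
Oct 22 15:02:28 lara [ 64.676092] Error taking CPU3 up: -5
"""

for stamp, cpu, name in failed_cpus(sample):
    print(f"CPU{cpu} failed at {stamp:.6f}s with {name}")
```

The timestamps also show that each failed CPU costs roughly five seconds between "Booting processor" and "Not responding.", which is where most of the resume delay in this log comes from.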
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Rafael,

What happens to the jiffies variable on resume from RAM, and from DISK? Do we restore it to the value it had at suspend, or just leave it be with whatever?

The answer has to be "restore the value it had at suspend time", but I figured I'd check here anyway.

??
Re: Resume problems
On Monday, 22 October 2007 18:15, Gabriel C wrote:
> Hi all,
>
> I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).
>
> Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.

Can you please try to disable HT and suspend?

> [... resume log snipped ...]
>
> After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.
>
> I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.
>
> The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.
>
> I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).
>
> I know suspend worked fine on Windows with that box.
>
> Here are my config and dmesg (good and bad one):
>
> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> http://194.231.229.228/suspend/config

Well, I think we have a problem with the CPU hotplug.

Can you try to offline-online CPUs (without suspending) and see if that works?

Greetings,
Rafael
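The offline-online test suggested above is normally done by writing 0 and then 1 to the per-CPU `online` file under /sys/devices/system/cpu. The following Python sketch shows one way to script that; it is an illustration only, not something posted in the thread. The helper names (`cpu_online_path`, `set_cpu_online`, `cycle_cpus`) are hypothetical, the `root` parameter exists only so the code can be tried against a scratch directory, and on a real machine the writes need root privileges. The boot CPU (cpu0) often has no `online` file on x86, so it is left out.

```python
import os

# Standard sysfs location of the per-CPU hotplug controls on Linux.
SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def cpu_online_path(cpu, root=SYSFS_CPU_ROOT):
    """Path of the 'online' control file for the given CPU number."""
    return os.path.join(root, f"cpu{cpu}", "online")


def set_cpu_online(cpu, online, root=SYSFS_CPU_ROOT):
    """Write '1' to bring a CPU online or '0' to take it offline."""
    with open(cpu_online_path(cpu, root), "w") as handle:
        handle.write("1" if online else "0")


def cycle_cpus(cpus, root=SYSFS_CPU_ROOT):
    """Take every listed CPU offline, then bring them all back online."""
    for cpu in cpus:
        set_cpu_online(cpu, False, root)
    for cpu in cpus:
        set_cpu_online(cpu, True, root)


# Example (as root, on a 4-CPU box like the one in this thread):
#     cycle_cpus([1, 2, 3])
# Then check dmesg for "CPU n is now offline" and "Booting processor" lines.
```

Taking all non-boot CPUs down before bringing any back up mirrors what suspend does, so it exercises the same hotplug path without involving the rest of the suspend code.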
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
On Monday, 22 October 2007 16:11, Mark Lord wrote:
> Rafael,
>
> What happens to the jiffies variable on resume from RAM, and from DISK? Do we restore it to the value it had at suspend, or just leave it be with whatever?
>
> The answer has to be "restore the value it had at suspend time", but I figured I'd check here anyway.
>
> ??

Well, frankly, I've lost track of that recently, but it seems that we just use the pre-suspend jiffies (at least in the current -git).

Thomas knows better, I guess. :-)

Greetings,
Rafael
Re: Resume problems
Rafael J. Wysocki wrote:
> On Monday, 22 October 2007 18:15, Gabriel C wrote:
> > Hi all,
> >
> > I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).
> >
> > Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.
>
> Can you please try to disable HT and suspend?

So only 'Hibernation' is enabled in kernel and HT disabled in BIOS? If you mean that, sure, I can try doing so.

I could also disable Suspend to RAM completely in the BIOS if you want.

> > [... resume log snipped ...]
> >
> > After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.
> >
> > I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.
> >
> > The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.
> >
> > I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).
> >
> > I know suspend worked fine on Windows with that box.
> >
> > Here are my config and dmesg (good and bad one):
> >
> > http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> > http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> > http://194.231.229.228/suspend/config
>
> Well, I think we have a problem with the CPU hotplug.
>
> Can you try to offline-online CPUs (without suspending) and see if that works?

Yes, it does work when I do it manually:

[ 6687.595842] CPU 1 is now offline
[ 6687.711425] CPU 2 is now offline
[ 6687.819330] CPU 3 is now offline
[ 6687.819337] SMP alternatives: switching to UP code
[ 6702.109605] SMP alternatives: switching to SMP code
[ 6702.110634] Booting processor 1/1 eip 3000
[ 6702.122140] Initializing CPU#1
[ 6702.182045] Calibrating delay
Re: Resume problems
On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
> Rafael J. Wysocki wrote:
> > On Monday, 22 October 2007 18:15, Gabriel C wrote:
> > > Hi all,
> > >
> > > I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).
> > >
> > > Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.
> >
> > Can you please try to disable HT and suspend?
>
> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS? If you mean that, sure, I can try doing so.

With suspend or hibernation enabled in the kernel, but with HT disabled in the BIOS.

> I could also disable Suspend to RAM completely in the BIOS if you want.

No, that rather won't work.

> > > [... resume log snipped ...]
> > >
> > > After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.
> > >
> > > I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.
> > >
> > > The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.
> > >
> > > I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).
> > >
> > > I know suspend worked fine on Windows with that box.
> > >
> > > Here are my config and dmesg (good and bad one):
> > >
> > > http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> > > http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> > > http://194.231.229.228/suspend/config
> >
> > Well, I think we have a problem with the CPU hotplug.
> >
> > Can you try to offline-online CPUs (without suspending) and see if that works?
>
> Yes, it does work when I do it manually:
>
> [ 6687.595842] CPU 1 is now offline
> [ 6687.711425] CPU 2 is now
Re: Resume problems
Rafael J. Wysocki wrote:
> On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
> > Rafael J. Wysocki wrote:
> > > On Monday, 22 October 2007 18:15, Gabriel C wrote:
> > > > Hi all,
> > > >
> > > > I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).
> > > >
> > > > Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.
> > >
> > > Can you please try to disable HT and suspend?
> >
> > So only 'Hibernation' is enabled in kernel and HT disabled in BIOS? If you mean that, sure, I can try doing so.
>
> With suspend or hibernation enabled in the kernel, but with HT disabled in the BIOS.

Ok, trying in some minutes.

> > I could also disable Suspend to RAM completely in the BIOS if you want.
>
> No, that rather won't work.
>
> > > > [... resume log snipped ...]
> > > >
> > > > After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.
> > > >
> > > > I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.
> > > >
> > > > The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.
> > > >
> > > > I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).
> > > >
> > > > I know suspend worked fine on Windows with that box.
> > > >
> > > > Here are my config and dmesg (good and bad one):
> > > >
> > > > http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> > > > http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> > > > http://194.231.229.228/suspend/config
> > >
> > > Well, I think we have a problem with the CPU hotplug.
> > >
> > > Can you try to offline-online CPUs (without suspending) and see if that works?
> >
> > Yes, it does work when I do it manually:
> >
> > [ 6687.595842] CPU 1 is now offline
> > [ 6687.711425] CPU 2 is now offline
> > [ 6687.819330] CPU 3 is now
Re: Resume problems
Gabriel C wrote:
> Rafael J. Wysocki wrote:
> > On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
> > > Rafael J. Wysocki wrote:
> > > > On Monday, 22 October 2007 18:15, Gabriel C wrote:
> > > > > Hi all,
> > > > >
> > > > > I'm running current git + the aic7xxx suspend patch from http://bugzilla.kernel.org/show_bug.cgi?id=3062 on a Dell Precision WorkStation 530 MT SMP box (HT enabled).
> > > > >
> > > > > Suspend works fine, but on resume I have some problems. All CPUs but the boot CPU won't come back; everything else seems fine.
> > > >
> > > > Can you please try to disable HT and suspend?
> > >
> > > So only 'Hibernation' is enabled in kernel and HT disabled in BIOS? If you mean that, sure, I can try doing so.
> >
> > With suspend or hibernation enabled in the kernel, but with HT disabled in the BIOS.
>
> Ok, trying in some minutes.

Disabling HT does not make any difference, nor does enabling only one of Hibernation or Suspend in kernel and BIOS, nor any combination of these.

> > > I could also disable Suspend to RAM completely in the BIOS if you want.
> >
> > No, that rather won't work.
> >
> > > > > [... resume log snipped ...]
> > > > >
> > > > > After playing with a lot of boot options I found out that booting with 'acpi=ht' makes the CPUs work again, but now I have a problem on suspend. Everything seems to go down (disks etc.), but the box itself is for some reason still on. So I've tested reboot=<> options with no luck. (After waiting 5 minutes to be sure everything is really off I can just hit the power button.) On resume everything is now fine.
> > > > >
> > > > > I'm not really sure what is wrong here, acpi/hibernation/cpu-hotplug or a mix of all, so I'm CC'ing linux-acpi as well.
> > > > >
> > > > > The only thing I noticed is the 'Breaking affinity for irq XX' messages on suspend without acpi=ht.
> > > > >
> > > > > I can't even tell whether other kernel versions are working, because the aic7xxx driver didn't get suspend support till now (or at least it never worked here).
> > > > >
> > > > > I know suspend worked fine on Windows with that box.
> > > > >
> > > > > Here are my config and dmesg (good and bad one):
> > > > >
> > > > > http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> > > > > http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> > > > > http://194.231.229.228/suspend/config
> > > >
> > > > Well, I think we have a problem with the CPU hotplug.
> > > >
> > > > Can you try to offline-online CPUs (without
Re: Resume problems
Also, the box just froze on level 3, but at least I got an ACPI error which I didn't get in any other dmesg till now (the patch was also tested with HT disabled and Suspend and Hibernation enabled in kernel and BIOS):

...
Oct 23 01:51:05 lara [ 273.512374] PM: Removing info for No Bus:input0
Oct 23 01:51:05 lara [ 274.545158] PM: Removing info for No Bus:mouse0
Oct 23 01:51:05 lara [ 274.551435] PM: Removing info for No Bus:event1
Oct 23 01:51:05 lara [ 274.559493] PM: Removing info for No Bus:input1
Oct 23 01:53:06 lara [ 394.869468] ACPI Error (evevent-0303): No installed handler for fixed event [0002] [20070126]

(I did a hard reset after that.)

I'll try level 2 and 1 now; I just wanted to let you know.

Same issues with level 2 and 1.

BTW, I found out why my box does not shut down with acpi=ht. It seems like libata does not like that ACPI mode =) dropping the '... read http://linux-ata.org/shutdown.html , power down manually' message. That works perfectly with full ACPI here.

After all, I think all these problems may be somehow ACPI related, but the question is why they get triggered by Suspend/Hibernation.

If you want me to test something else, just let me know.

Gabriel
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Pavel Machek wrote:
> Hi!
>
> > Since upgrading to 2.6.23.1 from 2.6.23-rc9,
> > resume-from-RAM has been misbehaving here.
> >
> > It takes much (+5-7 seconds) longer to resume
> > *sometimes*, but not all/most of the time.

I suspect those long delays may have something to do with USB, as it takes longer for my hub + mouse to come back to life during the sequence.

But I have since then re-applied the powertop patches, and the long delays vanished.

Back with -rc9 I normally also had those powertop fixes, and so this problem could be much older and was never noticed until I ran without them.

I'm no longer actively investigating, as the delays are gone (powertop patches), and the crashes are much less frequent since patching.

Cheers
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Hi!

> Since upgrading to 2.6.23.1 from 2.6.23-rc9,
> resume-from-RAM has been misbehaving here.
>
> It takes much (+5-7 seconds) longer to resume
> *sometimes*, but not all/most of the time.
> And sometimes I get get flashing keyboard LEDs and have
> to hold the power button
> in for a full hard reset.
>
> With 2.6.23-rc8/rc9, no such troubles.
>
> Difficult to reproduce, other than perhaps once a day.
> Anybody want to fess up with a likely candidate?

Are there any .config differences between rc8 and .1? Can you try disabling nohz?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
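The question about .config differences can be answered mechanically. The following is only a minimal sketch, assuming both configuration files are available as text in the usual kernel format (`CONFIG_FOO=value` and `# CONFIG_FOO is not set`); the function names `parse_config` and `diff_configs` are made up for illustration and are not part of any kernel tooling.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Return {option: value}; explicitly disabled options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def diff_configs(old_text, new_text):
    """Return {option: (old, new)} for every option whose value differs.

    An option that is absent from one file is reported as None on that side.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Reading the two files with `open(path).read()` and passing the text to `diff_configs` would list, for example, whether `CONFIG_NO_HZ` changed between the two builds.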
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
On Wednesday, 17 October 2007 00:10, Mark Lord wrote:
> Mark Lord wrote:
> > Rafael J. Wysocki wrote:
> >> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
> >>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
> >>> misbehaving here.
> >>>
> >>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not
> >>> all/most of the time.
> >>> And sometimes I get get flashing keyboard LEDs and have to hold the
> >>> power button
> >>> in for a full hard reset.
> >>>
> >>> With 2.6.23-rc8/rc9, no such troubles.
> >>>
> >>> Difficult to reproduce, other than perhaps once a day.
> >>> Anybody want to fess up with a likely candidate?
> >>
> >> Not really, but if you rule out all of the POWERPC and MIPS patches,
> >> there's
> >> not much left ...
> >
> > Yeah, I didn't see much there either.
> >
> > I'll keep an eye on things over the next few days,
> > and post again if it persists.
> >
> > I was using the powertop patches with -rc9, but not with 2.6.23.1.
> > Maybe they helped (???).
>
> It still happens.

Well, I have no idea, sorry. I also have not been able to reproduce it here ...
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Mark Lord wrote:
> Rafael J. Wysocki wrote:
>> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
>>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
>>> misbehaving here.
>>>
>>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not
>>> all/most of the time.
>>> And sometimes I get get flashing keyboard LEDs and have to hold the
>>> power button
>>> in for a full hard reset.
>>>
>>> With 2.6.23-rc8/rc9, no such troubles.
>>>
>>> Difficult to reproduce, other than perhaps once a day.
>>> Anybody want to fess up with a likely candidate?
>>
>> Not really, but if you rule out all of the POWERPC and MIPS patches,
>> there's
>> not much left ...
>
> Yeah, I didn't see much there either.
>
> I'll keep an eye on things over the next few days,
> and post again if it persists.
>
> I was using the powertop patches with -rc9, but not with 2.6.23.1.
> Maybe they helped (???).

It still happens.
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Rafael J. Wysocki wrote:
> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
>> misbehaving here.
>>
>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not
>> all/most of the time.
>> And sometimes I get get flashing keyboard LEDs and have to hold the
>> power button
>> in for a full hard reset.
>>
>> With 2.6.23-rc8/rc9, no such troubles.
>>
>> Difficult to reproduce, other than perhaps once a day.
>> Anybody want to fess up with a likely candidate?
>
> Not really, but if you rule out all of the POWERPC and MIPS patches,
> there's
> not much left ...

Yeah, I didn't see much there either.

I'll keep an eye on things over the next few days, and post again if it persists.

I was using the powertop patches with -rc9, but not with 2.6.23.1. Maybe they helped (???).

-ml
Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
On Sunday, 14 October 2007 22:13, Mark Lord wrote:
> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
> misbehaving here.
>
> It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most
> of the time.
> And sometimes I get get flashing keyboard LEDs and have to hold the power
> button
> in for a full hard reset.
>
> With 2.6.23-rc8/rc9, no such troubles.
>
> Difficult to reproduce, other than perhaps once a day.
> Anybody want to fess up with a likely candidate?

Not really, but if you rule out all of the POWERPC and MIPS patches, there's not much left ...

Greetings,
Rafael
Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems
Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been misbehaving here. It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most of the time. And sometimes I get get flashing keyboard LEDs and have to hold the power button in for a full hard reset. With 2.6.23-rc8/rc9, no such troubles. Difficult to reproduce, other than perhaps once a day. Anybody want to fess up with a likely candidate? Dell Inspiron 9400 Core2duo + 2GB RAM, 32-bit x86 kernel+user. .config below. - # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23.1 # Sun Oct 14 09:22:05 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 # CONFIG_CPUSETS is not set CONFIG_SYSFS_DEPRECATED=y # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y 
CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y # CONFIG_SLUB_DEBUG is not set # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not set CONFIG_BASE_SMALL=0 CONFIG_MODULES=y CONFIG_MODULE_UNLOAD=y CONFIG_MODULE_FORCE_UNLOAD=y CONFIG_MODVERSIONS=y CONFIG_MODULE_SRCVERSION_ALL=y CONFIG_KMOD=y CONFIG_STOP_MACHINE=y CONFIG_BLOCK=y # CONFIG_LBD is not set # CONFIG_BLK_DEV_IO_TRACE is not set # CONFIG_LSF is not set CONFIG_BLK_DEV_BSG=y # # IO Schedulers # CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_AS=y # CONFIG_IOSCHED_DEADLINE is not set CONFIG_IOSCHED_CFQ=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_SMP=y CONFIG_X86_PC=y # CONFIG_X86_ELAN is not set # CONFIG_X86_VOYAGER is not set # CONFIG_X86_NUMAQ is not set # CONFIG_X86_SUMMIT is not set # CONFIG_X86_BIGSMP is not set # CONFIG_X86_VISWS is not set # CONFIG_X86_GENERICARCH is not set # CONFIG_X86_ES7000 is not set # CONFIG_PARAVIRT is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set CONFIG_MCORE2=y # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_X86_GENERIC is not set CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_X86_XADD=y 
CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_HPET_TIMER=y CONFIG_NR_CPUS=2 # CONFIG_SCHED_SMT is not set CONFIG_SCHED_MC=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_PREEMPT_BKL=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y # CONFIG_X86_MCE is not set CONFIG_VM86=y # CONFIG_TOSHIBA is not set CONFIG_I8K=m CONFIG_X86_REBOOTFIXUPS=y CONFIG_MICROCODE=m CONFIG_MICROCODE_OLD_INTERFACE=y CONFIG_X86_MSR=m CONFIG_X86_CPUID=m # # Firmware Drivers # CONFIG_EDD=m CONFIG_DELL_RBU=m CONFIG_DCDBAS=m CONFIG_DMIID=y # CONFIG_NOHIGHMEM is not set CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set CONFIG_VMSPLIT_3G=y # CONFIG_VMSPLIT_3G_OPT is
Re: MMC/SD Root filesystem suspend/resume problems
On Wed 2007-07-25 20:20:42, Richard Purdie wrote:
> On Wed, 2007-07-25 at 19:01 +, Pavel Machek wrote:
> > > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> > > "fixed". I think having this option is a bad idea (in its current form)
> > > as it doesn't actually stop filesystem corruption.
> > >
> > > With the option disabled, if a filesystem is mounted when you suspend my
> > > tests show the filesystem is corrupted. At least if the option is
> > > enabled, the filesystem is only corrupted if you remove the card whilst
> > > suspended which is more preferable.
> >
> > Are we talking _corruption_ here, or are we talking 'the kind of
> > corruption recoverable by fsck that happens on powerfail'?
>
> There was more damage to the system than just a dirty bit set. Yes, fsck
> could fix it but I don't think it should happen in the first place...

Well, that's "ok", that happens on sudden powerdowns, too. (Well, but we do sync() during suspend, so it is a bit strange).

Do you have fsck logs perhaps?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: MMC/SD Root filesystem suspend/resume problems
On Wed, 2007-07-25 at 19:01 +, Pavel Machek wrote:
> > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> > "fixed". I think having this option is a bad idea (in its current form)
> > as it doesn't actually stop filesystem corruption.
> >
> > With the option disabled, if a filesystem is mounted when you suspend my
> > tests show the filesystem is corrupted. At least if the option is
> > enabled, the filesystem is only corrupted if you remove the card whilst
> > suspended which is more preferable.
>
> Are we talking _corruption_ here, or are we talking 'the kind of
> corruption recoverable by fsck that happens on powerfail'?

There was more damage to the system than just a dirty bit set. Yes, fsck could fix it but I don't think it should happen in the first place...

Richard
Re: MMC/SD Root filesystem suspend/resume problems
Hi!

> > > Lots of Linux handhelds use MMC/SD devices as the root file system.
> > > This has worked quite reliably for many kernel versions. In 2.6.22,
> > > it seems that if you suspend such a system then resume it, the device
> > > locks up. Trying to execute anything on the filesystem results in a
> > > "Permission Denied" message. I did see a message from the MMC
> > > subsystem saying it had redetected the card. There are also messages
> > > on the console like "MMC: killing requests for dead queue" each time
> > > you suspend/resume.
> >
> > The card is removed when you suspend and readded when you resume.
> > That's the only safe thing we can do until we get suspend support in
> > the filesystems.
> >
> > If you really want to shoot yourself in the foot, there is a Kconfig
> > option that keeps the card around across the suspend.
>
> I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> "fixed". I think having this option is a bad idea (in its current form)
> as it doesn't actually stop filesystem corruption.
>
> With the option disabled, if a filesystem is mounted when you suspend my
> tests show the filesystem is corrupted. At least if the option is
> enabled, the filesystem is only corrupted if you remove the card whilst
> suspended which is more preferable.

Are we talking _corruption_ here, or are we talking 'the kind of corruption recoverable by fsck that happens on powerfail'?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: MMC/SD Root filesystem suspend/resume problems
On Sun, 22 Jul 2007 15:28:00 +0100 Richard Purdie <[EMAIL PROTECTED]> wrote:
>
> Corruption is corruption and it shouldn't happen if we can avoid it.
> It happens with complete certainty in one case and only happens in the
> other if the user does something which is a fairly obvious bad idea
> (which is documented as such).
>

The corruption will only occur if the filesystem is dirty. Granted, the mount will be dead and useless, but I wouldn't call that corruption.

Anyway, this behaviour was selected after seeing the long discussion about how USB should handle the same problem. It was decided that it was best to play it safe and remove any devices that couldn't be determined to have remained in the slot. We also have the USB_PERSIST option these days, which does the same thing as MMC_UNSAFE_RESUME.

>
> Given I can suspend the device with "echo mem > /sys/power/state",
> that implies we need to fix echo? ;-)
>

Or that direct usage of /sys/power/state is only for those who know what they are doing (and have umounted their filesystems beforehand).

>
> > And if we keep papering over the problems, you reduce the motivation
> > of fixing this properly.
>
> Maybe although I don't like existing functionality being broken even
> if its less than ideal.
>

I am of the opinion that it was more broken before I touched it. Silent corruption is never acceptable in my book. But if it is in yours, just enable MMC_UNSAFE_RESUME and you'll have the old behaviour.

Rgds
-- 
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Re: MMC/SD Root filesystem suspend/resume problems
On Sun, 2007-07-22 at 16:05 +0200, Pierre Ossman wrote:
> On Sun, 22 Jul 2007 14:18:33 +0100
> Richard Purdie <[EMAIL PROTECTED]> wrote:
> > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing
> > was "fixed". I think having this option is a bad idea (in its current
> > form) as it doesn't actually stop filesystem corruption.
> >
> > With the option disabled, if a filesystem is mounted when you suspend
> > my tests show the filesystem is corrupted. At least if the option is
> > enabled, the filesystem is only corrupted if you remove the card
> > whilst suspended which is more preferable.
>
> I disagree. With this option you get silent corruption, without you get
> noisy corruption. And I would always prefer the latter, even if it
> increases the risk of it happening.

Corruption is corruption and it shouldn't happen if we can avoid it. It
happens with complete certainty in one case and only happens in the
other if the user does something which is a fairly obvious bad idea
(which is documented as such).

> > I guess the solution would be to abort the suspend if mounted systems
> > were detected and the option was disabled? Alternatively the option
> > could be "auto" enabled only for mounted systems maybe with a printk
> > warning?
>
> This is a general problem for all removable/hotpluggable storage. So
> sticking it in the MMC block device would be the wrong layer IMO.

It is however if the MMC layer is going to add Kconfig options which
corrupt things, it can add things to start fixing things too. If those
things can be adapted into more generic code paths, so much the better.

> Until the filesystems can be made to store something sane on disk
> before the suspend, I'd say this is best handled in user space. Let the
> user space tools refuse to initiate the suspend as long as any
> removable devices are mounted.

Given I can suspend the device with "echo mem > /sys/power/state", that
implies we need to fix echo? ;-)

> > Of course the best solution would be to have filesystems support
> > suspend/resume requests since other subsystems like pcmcia also suffer
> > this problem and would benefit from this but I accept that teaching
> > filesystems this is more difficult.
> >
>
> Doesn't mean we shouldn't do it.

Agreed.

> And if we keep papering over the problems, you reduce the motivation
> of fixing this properly.

Maybe although I don't like existing functionality being broken even if
its less than ideal.

Regards.

Richard
Re: MMC/SD Root filesystem suspend/resume problems
On Sun, 22 Jul 2007 14:18:33 +0100
Richard Purdie <[EMAIL PROTECTED]> wrote:

>
> I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing
> was "fixed". I think having this option is a bad idea (in its current
> form) as it doesn't actually stop filesystem corruption.
>
> With the option disabled, if a filesystem is mounted when you suspend
> my tests show the filesystem is corrupted. At least if the option is
> enabled, the filesystem is only corrupted if you remove the card
> whilst suspended which is more preferable.
>

I disagree. With this option you get silent corruption, without you get
noisy corruption. And I would always prefer the latter, even if it
increases the risk of it happening.

> I guess the solution would be to abort the suspend if mounted systems
> were detected and the option was disabled? Alternatively the option
> could be "auto" enabled only for mounted systems maybe with a printk
> warning?
>

This is a general problem for all removable/hotpluggable storage. So
sticking it in the MMC block device would be the wrong layer IMO.

Until the filesystems can be made to store something sane on disk
before the suspend, I'd say this is best handled in user space. Let the
user space tools refuse to initiate the suspend as long as any
removable devices are mounted.

> Of course the best solution would be to have filesystems support
> suspend/resume requests since other subsystems like pcmcia also suffer
> this problem and would benefit from this but I accept that teaching
> filesystems this is more difficult.
>

Doesn't mean we shouldn't do it. And if we keep papering over the
problems, you reduce the motivation of fixing this properly.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
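As a rough illustration of the user-space approach suggested in the
message above, a suspend wrapper could scan the mount table before it
writes to /sys/power/state. This is only a sketch under stated
assumptions: the function names are hypothetical, and matching on a
device-path prefix such as /dev/mmcblk in a /proc/mounts-style file is
an assumption made for illustration, not something proposed in the
thread.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if any line of a /proc/mounts-style stream names a device
 * whose path starts with dev_prefix (for example "/dev/mmcblk").
 * Assumption: a path prefix is enough to identify removable devices.
 */
static bool removable_device_mounted(FILE *mounts, const char *dev_prefix)
{
	char line[512];
	size_t n = strlen(dev_prefix);
	bool at_line_start = true;

	while (fgets(line, sizeof(line), mounts)) {
		/* The first field of each line is the mounted device. */
		if (at_line_start && strncmp(line, dev_prefix, n) == 0)
			return true;
		/* A chunk without '\n' means the line continues. */
		at_line_start = strchr(line, '\n') != NULL;
	}
	return false;
}

/*
 * Hypothetical wrapper: refuse to suspend while such a device is mounted.
 * Returns 0 if "mem" was written, -1 if the suspend was refused or failed.
 */
static int suspend_if_safe(const char *mounts_path, const char *state_path,
			   const char *dev_prefix)
{
	FILE *f = fopen(mounts_path, "r");
	bool busy;
	int rc;

	if (!f)
		return -1;
	busy = removable_device_mounted(f, dev_prefix);
	fclose(f);
	if (busy) {
		fprintf(stderr, "refusing to suspend: %s* is mounted\n",
			dev_prefix);
		return -1;
	}

	f = fopen(state_path, "w");
	if (!f)
		return -1;
	/* Same request as: echo mem > /sys/power/state */
	rc = (fputs("mem\n", f) == EOF) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

A caller would use something like
suspend_if_safe("/proc/mounts", "/sys/power/state", "/dev/mmcblk").
Nothing stops a user from bypassing such a wrapper and writing to
/sys/power/state directly, which is the objection raised later in the
thread.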
Re: MMC/SD Root filesystem suspend/resume problems
On Thu, 2007-07-19 at 19:03 +0200, Pierre Ossman wrote:
> On Thu, 19 Jul 2007 16:53:39 +0100
> Richard Purdie <[EMAIL PROTECTED]> wrote:
> > Lots of Linux handhelds use MMC/SD devices as the root file system.
> > This has worked quite reliably for many kernel versions. In 2.6.22,
> > it seems that if you suspend such a system then resume it, the device
> > locks up. Trying to execute anything on the filesystem results in a
> > "Permission Denied" message. I did see a message from the MMC
> > subsystem saying it had redetected the card. There are also messages
> > on the console like "MMC: killing requests for dead queue" each time
> > you suspend/resume.
>
> The card is removed when you suspend and readded when you resume.
> That's the only safe thing we can do until we get suspend support in
> the filesystems.
>
> If you really want to shoot yourself in the foot, there is a Kconfig
> option that keeps the card around across the suspend.

I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
"fixed". I think having this option is a bad idea (in its current form)
as it doesn't actually stop filesystem corruption.

With the option disabled, if a filesystem is mounted when you suspend my
tests show the filesystem is corrupted. At least if the option is
enabled, the filesystem is only corrupted if you remove the card whilst
suspended which is more preferable.

I guess the solution would be to abort the suspend if mounted systems
were detected and the option was disabled? Alternatively the option
could be "auto" enabled only for mounted systems maybe with a printk
warning?

Of course the best solution would be to have filesystems support
suspend/resume requests since other subsystems like pcmcia also suffer
this problem and would benefit from this but I accept that teaching
filesystems this is more difficult.

Regards,

Richard
Re: MMC/SD Root filesystem suspend/resume problems
On Thu, 19 Jul 2007 16:53:39 +0100
Richard Purdie <[EMAIL PROTECTED]> wrote:

> Hi Pierre,
>
> Lots of Linux handhelds use MMC/SD devices as the root file system.
> This has worked quite reliably for many kernel versions. In 2.6.22,
> it seems that if you suspend such a system then resume it, the device
> locks up. Trying to execute anything on the filesystem results in a
> "Permission Denied" message. I did see a message from the MMC
> subsystem saying it had redetected the card. There are also messages
> on the console like "MMC: killing requests for dead queue" each time
> you suspend/resume.
>

The card is removed when you suspend and readded when you resume.
That's the only safe thing we can do until we get suspend support in
the filesystems.

If you really want to shoot yourself in the foot, there is a Kconfig
option that keeps the card around across the suspend.

Rgds
--
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org
Re: MMC/SD Root filesystem suspend/resume problems
On Thu, 2007-07-19 at 16:57 +0100, Richard Purdie wrote:
> Lots of Linux handhelds use MMC/SD devices as the root file system. This
> has worked quite reliably for many kernel versions. In 2.6.22, it seems
> that if you suspend such a system then resume it, the device locks up.
> Trying to execute anything on the filesystem results in a "Permission
> Denied" message. I did see a message from the MMC subsystem saying it
> had redetected the card. There are also messages on the console like
> "MMC: killing requests for dead queue" each time you suspend/resume.
>
> I'm away from my serial cables at the moment but I may be able to
> provide more debug when I have them over the weekend. Have you any ideas
> on why this is breaking?
>
> For reference, I've reproduced the problem with both the PXA host driver
> and different driver not merged into mainline (ASIC3).

Just to follow up, if I boot with a rootfs from elsewhere and mount the
mmc card then suspend/resume, it corrupts the data on the card too so it
looks like some general suspend/resume problem on mounted filesystems.

Richard
MMC/SD Root filesystem suspend/resume problems
Hi Pierre,

Lots of Linux handhelds use MMC/SD devices as the root file system. This
has worked quite reliably for many kernel versions. In 2.6.22, it seems
that if you suspend such a system then resume it, the device locks up.
Trying to execute anything on the filesystem results in a "Permission
Denied" message. I did see a message from the MMC subsystem saying it
had redetected the card. There are also messages on the console like
"MMC: killing requests for dead queue" each time you suspend/resume.

I'm away from my serial cables at the moment but I may be able to
provide more debug when I have them over the weekend. Have you any ideas
on why this is breaking?

For reference, I've reproduced the problem with both the PXA host driver
and different driver not merged into mainline (ASIC3).

Cheers,

Richard
Re: [PATCH] for acpi S1 power cycle resume problems
> Date: Fri, 19 Aug 2005 08:39:25 -0600
> From: "William Morrow" <[EMAIL PROTECTED]>
> Subject: [PATCH] for acpi S1 power cycle resume problems
>
>
> Hi
> I was told that if I had a patch to submit for a baseline change that
> this was the place to do it.

In this case that works fine. Normally they should go to linux-usb-devel
for me (and others) to read there.

Thanks, these need a bit of cleaning up, finishing, and splitting out;
they should be in 2.6.14 though. Comments below. Were these patches
written by you, or by Jordan?

- Dave

> If not, please let me know...
>
> thanks,
> morrow
>
> Patched against 2.6.11 baseline
> problems fixed:
> 1) OHCI_INTR_RD not being cleared in ohci interrupt handler
>    results in interrupt storm and system hang on RD status.
>    ohci spec indicates this should be done.

Yeah, I noticed that one but didn't fix it yet. It's not that it
was _never_ cleared ... only certain code paths missed it. The
systems I test with were clearly using those working paths!

Having this fixed should help get rid of the 1/4 second timer this
driver normally ties up. That'll help make the dynamic tick stuff
work better, reducing power even when something like "ACPI S1"
doesn't exist (like say, on that one Zaurus).

> 2) PORT_CSC not being cleared in ehci_hub_status_data
>    code attempts to clear bit, but bit is write to clear.
>    there are other errant clears, since the PORTSCn regs
>    have 3 RWC bits, and the rest are RW. All stmts of the form:
>       writel (v, &ehci->regs->port_status[i])
>    should clear RWC bits if they do not intend to clear status,
>    and should set the bits which should be cleared (this case).

Yeah, whoever did that RWC patch for UHCI ports certainly should
have checked other HCDs for the same bug. (Kicks self.)

In fact you didn't fix this issue comprehensively. There are
other places that register is written; they need to change too.

This is clearly wrong, but did you notice any effects more serious
than "lsusb -v" output for EHCI root hubs looking a bit strange?

> 3) loop control and subsequent port resume/reset not correct.
>    unsigned index made detecting port1 active impossible,

Odd, I've done that with some regularity. Is that maybe some kind
of compiler bug? (I heard even 4.1 isn't quite there yet for kernels.)
The looping doesn't look incorrect to me; ports are numbered from 1..N,
and C code in the body must index them from 0..(N-1).

>    and OWNER/POWER status was being ignored on ports assigned
>    to companion controller.

Well, in that one resume case anyway! But OWNER and POWER are very
different status bits ... if POWER ever goes off, that port is by
definition not resumable. But if a port's owned by the companion
(OHCI or UHCI) controller, then it surely ought not to be reset
(even if the companion's own SUSPEND bit doesn't show through EHCI).
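The write-to-clear behaviour discussed under point 2 can be sketched
with a small user-space model. The bit positions below follow the EHCI
PORTSC register layout. The helpers rwc_write() and ack_value() are
hypothetical names that appear in neither the patch nor the driver, and
the model simplifies by treating every non-RWC bit as plain read/write.

```c
#include <stdint.h>

/* Bit positions as in the EHCI PORTSC register. */
#define PORT_CONNECT  (1u << 0)  /* device connected */
#define PORT_CSC      (1u << 1)  /* connect status change (RWC) */
#define PORT_PE       (1u << 2)  /* port enable */
#define PORT_PEC      (1u << 3)  /* port enable change (RWC) */
#define PORT_OCC      (1u << 5)  /* over-current change (RWC) */
#define PORT_RWC_BITS (PORT_CSC | PORT_PEC | PORT_OCC)

/*
 * Model of a register write (illustrative only): an RWC bit is cleared
 * when 1 is written to it and left unchanged when 0 is written; all
 * other bits simply take the written value.
 */
static uint32_t rwc_write(uint32_t reg, uint32_t val)
{
	uint32_t rw  = val & ~PORT_RWC_BITS;        /* plain RW bits */
	uint32_t rwc = reg & PORT_RWC_BITS & ~val;  /* RWC bits not acked */

	return rw | rwc;
}

/*
 * Build the value to write so that only the RWC bits in `ack` are
 * cleared: mask all RWC bits out of the status that was read back,
 * then set exactly the ones to acknowledge.
 */
static uint32_t ack_value(uint32_t status, uint32_t ack)
{
	return (status & ~PORT_RWC_BITS) | (ack & PORT_RWC_BITS);
}
```

In this model, writing back `status & ~PORT_CSC` leaves PORT_CSC set
and clears PORT_PEC as a side effect if it happened to be pending,
whereas writing `ack_value(status, PORT_CSC)` clears PORT_CSC and
leaves PORT_PEC alone.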
[PATCH] for acpi S1 power cycle resume problems
Hi

I was told that if I had a patch to submit for a baseline change that this was the place to do it. If not, please let me know...

thanks,
morrow

Patched against 2.6.11 baseline

problems fixed:
1) OHCI_INTR_RD not being cleared in ohci interrupt handler
	results in interrupt storm and system hang on RD status.
	ohci spec indicates this should be done.
2) PORT_CSC not being cleared in ehci_hub_status_data
	code attempts to clear bit, but bit is write to clear.
	there are other errant clears, since the PORTSCn regs have 3 RWC bits, and the rest are RW. All stmts of the form:
		writel (v, &ehci->regs->port_status[i])
	should clear RWC bits if they do not intend to clear status, and should set the bits which should be cleared (this case).
3) loop control and subsequent port resume/reset not correct.
	unsigned index made detecting port1 active impossible, and OWNER/POWER status was being ignored on ports assigned to companion controller.

Signed-off-by: Jordan Crouse <[EMAIL PROTECTED]>

diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci.h linux-2.6.11/drivers/usb/host/ehci.h
--- linux-2.6.11.orig/drivers/usb/host/ehci.h	2005-03-02 00:38:25.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci.h	2005-08-17 08:15:36.0 -0600
@@ -262,6 +262,7 @@ struct ehci_regs {
 #define PORT_PE		(1<<2)		/* port enable */
 #define PORT_CSC	(1<<1)		/* connect status change */
 #define PORT_CONNECT	(1<<0)		/* device connected */
+#define PORT_RWC_BITS	(PORT_CSC | PORT_PEC | PORT_OCC)
 } __attribute__ ((packed));
 
 /* Appendix C, Debug port ... intended for use with special "debug devices"
diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci-hcd.c linux-2.6.11/drivers/usb/host/ehci-hcd.c
--- linux-2.6.11.orig/drivers/usb/host/ehci-hcd.c	2005-03-02 00:38:38.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci-hcd.c	2005-08-17 08:15:36.0 -0600
@@ -722,7 +722,7 @@ static int ehci_suspend (struct usb_hcd
 static int ehci_resume (struct usb_hcd *hcd)
 {
 	struct ehci_hcd		*ehci = hcd_to_ehci (hcd);
-	unsigned		port;
+	int			port;
 	struct usb_device	*root = hcd->self.root_hub;
 	int			retval = -EINVAL;
 	int			powerup = 0;
@@ -733,11 +733,11 @@ static int ehci_resume (struct usb_hcd *
 		msleep (100);
 
 	/* If any port is suspended, we know we can/must resume the HC. */
-	for (port = HCS_N_PORTS (ehci->hcs_params); port > 0; ) {
+	for (port = HCS_N_PORTS (ehci->hcs_params); --port >= 0; ) {
 		u32	status;
-		port--;
 		status = readl (&ehci->regs->port_status [port]);
-		if (status & PORT_SUSPEND) {
+		if ( (status & PORT_SUSPEND) != 0 ||
+		     ((status & PORT_OWNER) != 0 && (status & PORT_POWER) != 0) ) {
 			down (&hcd->self.root_hub->serialize);
 			retval = ehci_hub_resume (hcd);
 			up (&hcd->self.root_hub->serialize);
@@ -755,7 +755,7 @@ static int ehci_resume (struct usb_hcd *
 	/* Else reset, to cope with power loss or flush-to-storage
 	 * style "resume" having activated BIOS during reboot.
 	 */
-	if (port == 0) {
+	if (port < 0) {
 		(void) ehci_halt (ehci);
 		(void) ehci_reset (ehci);
 		(void) ehci_hc_reset (hcd);
diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci-hub.c linux-2.6.11/drivers/usb/host/ehci-hub.c
--- linux-2.6.11.orig/drivers/usb/host/ehci-hub.c	2005-03-02 00:38:32.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci-hub.c	2005-08-17 08:15:36.0 -0600
@@ -232,7 +232,8 @@ ehci_hub_status_data (struct usb_hcd *hc
 		if (temp & PORT_OWNER) {
 			/* don't report this in GetPortStatus */
 			if (temp & PORT_CSC) {
-				temp &= ~PORT_CSC;
+				temp &= ~PORT_RWC_BITS;
+				temp |= PORT_CSC;
 				writel (temp, &ehci->regs->port_status [i]);
 			}
 			continue;
diff -uprN linux-2.6.11.orig/drivers/usb/host/ohci-hcd.c linux-2.6.11/drivers/usb/host/ohci-hcd.c
--- linux-2.6.11.orig/drivers/usb/host/ohci-hcd.c	2005-03-02 00:37:48.0 -0700
+++ linux-2.6.11/drivers/usb/host/ohci-hcd.c	2005-08-17 08:15:36.0 -0600
@@ -720,6 +720,7 @@ static irqreturn_t ohci_irq (struct usb_
 
 	if (ints & OHCI_INTR_RD) {
 		ohci_vdbg (ohci, "resume detect\n");
+		ohci_writel (ohci, OHCI_INTR_RD, &regs->intrstatus);
 		schedule_work(&ohci->rh_resume);
 	}
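Point 2 of the description (write-to-clear status bits) may be easier to follow with a small user-space model. This is only a sketch and not driver code: `portsc_write()`, `ack_csc_old()` and `ack_csc_new()` are hypothetical helpers invented for illustration, and the model follows the description's simplification that the three RWC bits are the only special ones and every other bit is plain read/write. The `PORT_PEC` and `PORT_OCC` values are assumed from the EHCI PORTSC layout, since the hunk only shows `PORT_PE`, `PORT_CSC` and `PORT_CONNECT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PORT_OCC      (1u << 5)  /* over-current change (assumed value) */
#define PORT_PEC      (1u << 3)  /* port enable change (assumed value) */
#define PORT_PE       (1u << 2)  /* port enable */
#define PORT_CSC      (1u << 1)  /* connect status change */
#define PORT_CONNECT  (1u << 0)  /* device connected */
#define PORT_RWC_BITS (PORT_CSC | PORT_PEC | PORT_OCC)

/*
 * Hypothetical model of a PORTSC write: an RWC bit is cleared only when a 1
 * is written to it and is left alone when a 0 is written; all other bits
 * simply take the written value.
 */
static void portsc_write(uint32_t *reg, uint32_t v)
{
	assert(reg != NULL);
	uint32_t rwc = *reg & PORT_RWC_BITS & ~v;  /* RWC bits not written as 1 survive */
	uint32_t rw  = v & ~PORT_RWC_BITS;         /* RW bits take the new value */
	*reg = rwc | rw;
}

/* Old sequence: writes 0 to CSC, so CSC is NOT cleared, while any other
 * pending change bit still set in temp is acknowledged by accident. */
static uint32_t ack_csc_old(uint32_t reg)
{
	uint32_t temp = reg;
	temp &= ~PORT_CSC;
	portsc_write(&reg, temp);
	return reg;
}

/* Patched sequence: write 0 to every RWC bit except the one being cleared. */
static uint32_t ack_csc_new(uint32_t reg)
{
	uint32_t temp = reg;
	temp &= ~PORT_RWC_BITS;
	temp |= PORT_CSC;
	portsc_write(&reg, temp);
	return reg;
}
```

Starting from `PORT_CONNECT | PORT_CSC | PORT_PEC`, the old sequence leaves `PORT_CSC` set and silently drops `PORT_PEC`, whereas the patched sequence clears `PORT_CSC` and keeps `PORT_PEC` pending.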
intel_agp resume problems
Hello Dave,

after suspend-to-ram and a subsequent resume the configuration of my AGP bridge/controller is different and X will refuse to start after resume if it wasn't running during suspend. I'm using radeonfb as console driver and kernel 2.6.13-rc6-git6.

Diff between lspci -vvvxxx before and after suspend follows.

--- lspci.radeonfb_beforeS3 2005-08-16 13:23:31.0 +0200
+++ lspci.radeonfb_afterS3 2005-08-16 13:23:31.0 +0200
@@ -1,353 +1,349 @@
 :00:00.0 Host bridge: Intel Corp. 82855PM Processor to I/O Controller (rev 21)
 	Subsystem: Samsung Electronics Co Ltd: Unknown device c00c
 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
 	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
 	Latency: 0
 	Region 0: Memory at e000 (32-bit, prefetchable)
 	Capabilities: [e4] #09 [4104]
 	Capabilities: [a0] AGP version 2.0
 		Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
-		Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=x4
+		Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
 00: 86 80 40 33 06 01 90 20 21 00 00 06 00 00 00 00
 10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 4d 14 0c c0
 30: 00 00 00 00 e4 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 50: 00 02 00 00 00 00 00 00 00 00 00 00 00 27 00 00
 60: 04 08 0c 10 00 00 00 00 00 00 00 00 00 00 00 00
 70: 02 02 00 00 00 00 00 00 00 00 02 2d 71 32 40 30
 80: 71 00 80 05 00 00 00 00 00 10 01 00 00 00 00 00
-90: 10 11 11 00 01 13 11 00 41 19 00 00 00 0a 3d 00
-a0: 02 00 20 00 17 02 00 1f 04 00 00 00 00 00 00 00
+90: 10 11 11 00 01 13 11 00 41 19 00 00 00 1a 3d 00
+a0: 02 00 20 00 17 02 00 1f 00 00 00 00 00 00 00 00
 b0: 00 00 00 00 00 00 00 00 00 00 e0 1b 20 10 00 00
 c0: 44 40 50 11 00 20 05 06 00 00 00 00 00 00 00 00
 d0: 02 28 00 0e 0b 00 00 30 00 00 31 b5 00 00 02 00
 e0: 00 00 00 00 09 a0 04 41 00 00 00 00 00 00 00 00
 f0: 00 00 01 00 74 f8 20 80 38 0f 21 00 04 00 00 00

 :00:01.0 PCI bridge: Intel Corp. 82855PM Processor to AGP Controller (rev 21) (prog-if 00 [Normal decode])
 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
 	Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
 	Latency: 96
 	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
 	I/O behind bridge: 3000-3fff
 	Memory behind bridge: d010-d01f
 	Prefetchable memory behind bridge: d800-dfff
 	Expansion ROM at 3000 [disabled] [size=4K]
 	BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
 00: 86 80 41 33 07 01 a0 00 21 00 04 06 00 60 01 00
-10: 00 00 00 00 00 00 00 00 00 01 01 40 30 30 a0 22
+10: 00 00 00 00 00 00 00 00 00 01 01 40 30 30 a0 02
 20: 10 d0 10 d0 00 d8 f0 df 00 00 00 00 00 00 00 00
 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0c 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Do you have any hints how to solve the problem?

Regards,
Carl-Daniel
-- 
http://www.hailfinger.org/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
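The decoded "Command:" line and the raw hex dump in the report describe the same change: the AGP capability sits at offset a0, so its status register is the four bytes at a4 and its command register the four bytes at a8. The following is a minimal sketch of that decoding, assuming the AGP 2.0 register layout (rate select in command bits 2:0, AGP enable in bit 8, request queue depth in status bits 31:24). The helper names are made up for illustration and are not taken from pciutils or the kernel.

```c
#include <stdint.h>

/* Assemble a little-endian 32-bit register from four config-space bytes. */
static uint32_t le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Command register bits 2:0: selected data rate (0x1 = x1, 0x2 = x2, 0x4 = x4). */
static unsigned agp2_cmd_rate_bits(uint32_t cmd)
{
	return cmd & 0x7;
}

/* Command register bit 8: AGP enable (the "AGP+" / "AGP-" flag). */
static int agp2_cmd_enabled(uint32_t cmd)
{
	return (cmd & 0x100) != 0;
}

/* Status register bits 31:24 hold the request queue depth minus one. */
static unsigned agp2_status_rq(uint32_t status)
{
	return (status >> 24) + 1;
}

/* Human-readable rate, mirroring the strings shown in the report. */
static const char *agp2_rate_name(uint32_t cmd)
{
	switch (agp2_cmd_rate_bits(cmd)) {
	case 0x0: return "<none>";
	case 0x1: return "x1";
	case 0x2: return "x2";
	case 0x4: return "x4";
	default:  return "invalid";
	}
}
```

Fed with the bytes from the dump, the command register reads 04 00 00 00 before suspend (rate x4) and 00 00 00 00 afterwards (no rate selected), which matches the Rate=x4 to Rate=<none> change; the status bytes 17 02 00 1f decode to RQ=32 in both cases.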
2.6.12-rc1-mm3: still having USB resume problems
Though it looks a lot better; no more streams of messages. Now when I resume, I get:

PCI: Enabling device :00:1d.7 ( -> 0002)
<1>Unable to handle kernel NULL pointer dereference

a second or so after resume. It is completely locked up at this point; magic-sysreq gets no response.

lspci shows that :00:1d.7 is

# lspci -v -s :00:1d.7
00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01) (prog-if 20 [EHCI])
	Subsystem: IBM: Unknown device 052e
	Flags: bus master, medium devsel, latency 0, IRQ 5
	Memory at c000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Debug port

Complete lspci and .config attached.

	J

00:00.0 Host bridge: Intel Corp. 82855PM Processor to I/O Controller (rev 03)
	Subsystem: IBM: Unknown device 0529
	Flags: bus master, fast devsel, latency 0
	Memory at d000 (32-bit, prefetchable) [size=256M]
	Capabilities: [e4] Vendor Specific Information
	Capabilities: [a0] AGP version 2.0

00:01.0 PCI bridge: Intel Corp. 82855PM Processor to AGP Controller (rev 03) (prog-if 00 [Normal decode])
	Flags: bus master, 66Mhz, fast devsel, latency 96
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
	I/O behind bridge: 3000-3fff
	Memory behind bridge: c010-c01f
	Prefetchable memory behind bridge: e000-e7ff

00:1d.0 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 01) (prog-if 00 [UHCI])
	Subsystem: IBM: Unknown device 052d
	Flags: bus master, medium devsel, latency 0, IRQ 11
	I/O ports at 1800 [size=32]

00:1d.1 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 01) (prog-if 00 [UHCI])
	Subsystem: IBM: Unknown device 052d
	Flags: bus master, medium devsel, latency 0, IRQ 5
	I/O ports at 1820 [size=32]

00:1d.2 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 01) (prog-if 00 [UHCI])
	Subsystem: IBM: Unknown device 052d
	Flags: bus master, medium devsel, latency 0, IRQ 9
	I/O ports at 1840 [size=32]

00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 01) (prog-if 20 [EHCI])
	Subsystem: IBM: Unknown device 052e
	Flags: bus master, medium devsel, latency 0, IRQ 5
	Memory at c000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 2
	Capabilities: [58] Debug port

00:1e.0 PCI bridge: Intel Corp. 82801 Mobile PCI Bridge (rev 81) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0
	Bus: primary=00, secondary=02, subordinate=08, sec-latency=64
	I/O behind bridge: 4000-8fff
	Memory behind bridge: c020-cfff
	Prefetchable memory behind bridge: e800-efff

00:1f.0 ISA bridge: Intel Corp. 82801DBM (ICH4-M) LPC Interface Bridge (rev 01)
	Flags: bus master, medium devsel, latency 0

00:1f.1 IDE interface: Intel Corp. 82801DBM (ICH4-M) IDE Controller (rev 01) (prog-if 8a [Master SecP PriP])
	Subsystem: IBM: Unknown device 052d
	Flags: bus master, medium devsel, latency 0, IRQ 9
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at <unassigned>
	I/O ports at 1860 [size=16]
	Memory at 4000 (32-bit, non-prefetchable) [size=1K]

00:1f.3 SMBus: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 01)
	Subsystem: IBM: Unknown device 052d
	Flags: medium devsel, IRQ 10
	I/O ports at 1880 [size=32]

00:1f.5 Multimedia audio controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
	Subsystem: IBM: Unknown device 0534
	Flags: bus master, medium devsel, latency 0, IRQ 10
	I/O ports at 1c00 [size=256]
	I/O ports at 18c0 [size=64]
	Memory at cc00 (32-bit, non-prefetchable) [size=512]
	Memory at c800 (32-bit, non-prefetchable) [size=256]
	Capabilities: [50] Power Management version 2

00:1f.6 Modem: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller (rev 01) (prog-if 00 [Generic])
	Subsystem: IBM: Unknown device 0524
	Flags: bus master, medium devsel, latency 0, IRQ 10
	I/O ports at 2400 [size=256]
	I/O ports at 2000 [size=128]
	Capabilities: [50] Power Management version 2

01:00.0 VGA compatible controller: ATI Technologies Inc Radeon Mobility M6 LY (prog-if 00 [VGA])
	Subsystem: IBM: Unknown device 052f
	Flags: bus master, stepping, fast Back2Back, 66Mhz, medium devsel, latency 66, IRQ 11
	Memory at e000 (32-bit, prefetchable) [size=128M]
	I/O ports at 3000 [size=256]
	Memory at c010 (32-bit,
Re: 2.6.11-rc3: APM resume problems with USB
On Sat, Mar 19, 2005 at 01:44:24AM -0800, Jeremy Fitzhardinge wrote:
> On my IBM ThinkPad X31, I can only do one successful APM resume. After
> the resume, there's a stream of messages on the console:
>
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad
> happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
>
>
> The second resume, the machine panics. I haven't managed to get the
> panic message yet.
>
> This happens with both -rc3 and -rc4.

I think you mean -mm[34]. I've seen the problem with -mm3, 2.6.11{,.3} seem to be fine. Also ACPI rather than APM is fine as well though the suspend life is pathetic.

-- 
Mathematics is the supreme nostalgia of our time.
2.6.11-rc3: APM resume problems with USB
On my IBM ThinkPad X31, I can only do one successful APM resume. After the resume, there's a stream of messages on the console:

uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?
uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?
uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?
uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?
uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?
uhci_hcd :00:1d.0: host controller process error, something bad happened!
uhci_hcd :00:1d.0: host system error, PCI problems?

The second resume, the machine panics. I haven't managed to get the panic message yet.

This happens with both -rc3 and -rc4.

If I unload the USB modules before the suspend, then I can suspend/resume as many times as I like. Curiously, if I reload the modules, I can continue suspending/resuming without obvious problems, though it does print "Trying to free free IRQ11" each time it resumes, and a new "uhci_hcd" appears associated with a number of interrupts.

	J

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.11-mm4
# Fri Mar 18 14:56:16 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_CLEAR_PAGES=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODE is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_HPET_TIMER is not set
# CONFIG_SMP is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y

#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_PHYSICAL_START=0x10
# CONFIG_KEXEC is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set

#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is
Re: 2.6.11-rc3: APM resume problems with USB
On Sat, Mar 19, 2005 at 01:44:24AM -0800, Jeremy Fitzhardinge wrote:
> On my IBM ThinkPad X31, I can only do one successful APM resume. After
> the resume, there's a stream of messages on the console:
>
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
> uhci_hcd :00:1d.0: host controller process error, something bad happened!
> uhci_hcd :00:1d.0: host system error, PCI problems?
>
> The second resume, the machine panics. I haven't managed to get the
> panic message yet.
>
> This happens with both -rc3 and -rc4.

I think you mean -mm[34]. I've seen the problem with -mm3, 2.6.11{,.3} seem to be fine. Also ACPI rather than APM is fine as well though the suspend life is pathetic.

--
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Fix suspend/resume problems with b44
On Tue, 8 Mar 2005 22:55:37 +0100 Pavel Machek <[EMAIL PROTECTED]> wrote:

> Any idea what to do there? I'd say that request_irq is very unlikely
> to fail given that it worked okay before suspend...

What you have is fine for now. It is just a general issue that ->resume() has no way to cleanly fail.
Re: Fix suspend/resume problems with b44
Hi!

> > @@ -1934,6 +1936,9 @@
> >         if (!netif_running(dev))
> >                 return 0;
> >
> > +       if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
> > +               printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
> > +
>
> This is a hard error and means that bringup of the chip
> will totally fail. It definitely deserves something harder
> than a printk(), but unfortunately ->resume() has no way
> to cleanly fail.

Any idea what to do there? I'd say that request_irq is very unlikely to fail given that it worked okay before suspend...

Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
Re: Fix suspend/resume problems with b44
On Tue, 8 Mar 2005 10:46:55 +0100 Pavel Machek <[EMAIL PROTECTED]> wrote:

> @@ -1934,6 +1936,9 @@
>         if (!netif_running(dev))
>                 return 0;
>
> +       if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
> +               printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
> +

This is a hard error and means that bringup of the chip will totally fail. It definitely deserves something harder than a printk(), but unfortunately ->resume() has no way to cleanly fail.
Fix suspend/resume problems with b44
Hi!

This should fix problems people have with b44 during suspend/resume. Please apply,

Pavel

--- clean/drivers/net/b44.c 2004-12-25 13:35:00.0 +0100
+++ linux/drivers/net/b44.c 2005-01-19 11:59:12.0 +0100
@@ -1921,6 +1921,8 @@
        b44_free_rings(bp);

        spin_unlock_irq(&bp->lock);
+
+       free_irq(dev->irq, dev);
        return 0;
 }

@@ -1934,6 +1936,9 @@
        if (!netif_running(dev))
                return 0;

+       if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
+               printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
+
        spin_lock_irq(&bp->lock);

        b44_init_rings(bp);

--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
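The pattern in this patch (release the IRQ on suspend, request it again on resume) and the error-handling question it raises can be illustrated with a small userspace sketch. This is not b44 or kernel code: `struct mock_dev`, `mock_request_irq()`, `mock_free_irq()`, `mock_suspend()`, `mock_resume()` and the `mock_irq_should_fail` knob are all hypothetical stand-ins invented for illustration. The sketch assumes a resume path that hands the `request_irq` error back to its caller in addition to logging it; whether a real ->resume() caller could act on that value is the open issue in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a device; not the real struct net_device. */
struct mock_dev {
    int irq;
    bool running;   /* stands in for netif_running(dev) */
    bool irq_held;  /* true while an interrupt handler is registered */
};

/* Hypothetical knob that forces the next IRQ request to fail. */
static bool mock_irq_should_fail;

/* Stand-in for request_irq(): returns 0 on success, negative errno on failure. */
static int mock_request_irq(struct mock_dev *dev)
{
    if (mock_irq_should_fail)
        return -EBUSY;
    dev->irq_held = true;
    return 0;
}

/* Stand-in for free_irq(). */
static void mock_free_irq(struct mock_dev *dev)
{
    dev->irq_held = false;
}

/* Suspend: give the IRQ back, mirroring the first hunk of the patch. */
static int mock_suspend(struct mock_dev *dev)
{
    if (!dev->running)
        return 0;
    mock_free_irq(dev);
    return 0;
}

/* Resume: re-request the IRQ and pass any failure up instead of only logging. */
static int mock_resume(struct mock_dev *dev)
{
    int err;

    if (!dev->running)
        return 0;

    err = mock_request_irq(dev);
    if (err) {
        fprintf(stderr, "irq %d: request_irq failed (%d)\n", dev->irq, err);
        return err;
    }
    return 0;
}
```

With a device that starts out running and holding its IRQ, `mock_suspend()` followed by `mock_resume()` leaves `irq_held` set again. If `mock_irq_should_fail` is set before the resume, `mock_resume()` returns the negative errno and `irq_held` stays false, which is the "chip bringup fails" case discussed above.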