Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-26 Thread Pavel Machek
On Fri 2019-04-26 07:58:49, Bart Van Assche wrote:
> On Fri, 2019-04-26 at 12:32 +0200, Pavel Machek wrote:
> > [detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
> >  1 file changed, 58 insertions(+), 43 deletions(-)
> > pavel@duo:/data/l/linux-next-32$ git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> > Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
> > 1163
> > ?
> > [detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for asynchronous probing"
> >  4 files changed, 47 insertions(+), 5 deletions(-)
> > 
> > 
> > And reverting those two indeed fixes it:
> > 
> > Checking version...
> > version is  Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr
> > 26 12:27:12 CEST 2019 i686 GNU/Linux
> > Running test...
> > Result is [ TEST SUCCESS ]
> > Test said TEST SUCCESS
> 
> Can you share your config file? I hope that will allow me to reproduce this
> issue.

Here you go. You may want to google Thinkpad X60. It's the best notebook
ever made, but... :-).

Pavel
-- 
DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 5.1.0-rc1 Kernel Configuration
#

#
# Compiler: gcc (Debian 4.9.2-10+deb8u2) 4.9.2
#
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=40902
CONFIG_CLANG_VERSION=0
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="autobisect1556274822"
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="pavel"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
CONFIG_USELIB=y
# CONFIG_AUDIT is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_ARCH_CLOCKSOURCE_INIT=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
CONFIG_NO_HZ=y
# CONFIG_HIGH_RES_TIMERS is not set
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_PSI is not set
# CONFIG_CPU_ISOLATION is not set

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH=y
CONFIG_CGROUPS=y
# CONFIG_MEMCG is not set
# CONFIG_BLK_CGROUP is not set
CONFIG_CGROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
# CONFIG_CFS_BANDWIDTH is not set
# CONFIG_RT_GROUP_SCHED is not set
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
# CONFIG_CGROUP_FREEZER is not set
# CONFIG_CPUSETS is not set
# CONFIG_CGROUP_DEVICE is not set
# CONFIG_CGROUP_CPUACCT is not set
# CONFIG_CGROUP_PERF is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
CONFIG_SCHED_AUTOGROUP=y
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
CONFIG_ANON_INODES=y
CONFIG_HAVE_UID16=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
CONFIG_BPF=y
CONFIG_EXPERT=y
CONFIG_UID16=y
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y

Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-26 Thread Bart Van Assche
On Fri, 2019-04-26 at 12:32 +0200, Pavel Machek wrote:
> [detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
>  1 file changed, 58 insertions(+), 43 deletions(-)
> pavel@duo:/data/l/linux-next-32$ git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
> 1163
> ?
> [detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for asynchronous probing"
>  4 files changed, 47 insertions(+), 5 deletions(-)
> 
> 
> And reverting those two indeed fixes it:
> 
> Checking version...
> version is  Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr
> 26 12:27:12 CEST 2019 i686 GNU/Linux
> Running test...
> Result is [ TEST SUCCESS ]
> Test said TEST SUCCESS

Can you share your config file? I hope that will allow me to reproduce this
issue.

Thanks,

Bart.


Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-26 Thread Pavel Machek
On Thu 2019-04-25 06:35:58, Bart Van Assche wrote:
> On 4/25/19 12:33 AM, Pavel Machek wrote:
> > On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
> >> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> >>> Unfortunately, that one does not revert cleanly on top of -next.
> >>
> >> Can you try the following:
> >>
> >> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
> >>   git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> >>
> >> I will see whether I can come up with a better way to analyze what is
> >> going on. I had not expected that these patches would cause any suspend/
> >> resume problems.
> > 
> > Not even d16ece reverts:
> > 
> > pavel@duo:/data/l/linux-next-32$ git show  | head -3
> > commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
> > Author: Stephen Rothwell 
> > Date:   Tue Apr 23 20:24:59 2019 +1000
> > > pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
> > > error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
> > > hint: after resolving the conflicts, mark the corrected paths
> > > hint: with 'git add <paths>' or 'git rm <paths>'
> > hint: and commit the result with 'git commit'
> 
> There has been a non-trivial merge between the block and scsi trees in
> linux-next. That's probably what prevents these patches from reverting
> cleanly. How about performing the following tests:
> * Build, boot and test Martin's latest for-5.2 branch
> (git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git; branch
> 5.2/scsi-queue).

Ok, so that's commit a7634b6f7cbbdc6efcf772e080a6fe845d1f6161.
Suspend/resume is broken there.

> * If suspend/resume does not work reliably with that branch, revert the
> two patches above, rebuild, reboot and retest.

pavel@duo:/data/l/linux-next-32$ git show
commit a7634b6f7cbbdc6efcf772e080a6fe845d1f6161
Author: Colin Ian King
pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
1026
?
[detached HEAD 916db0d] Revert "scsi: sd: Inline sd_probe_part2()"
 1 file changed, 58 insertions(+), 43 deletions(-)
pavel@duo:/data/l/linux-next-32$ git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
Editing file: /data/fast/l/linux-next-32/.git/COMMIT_EDITMSG
1163
?
[detached HEAD ac8d625] Revert "scsi: sd: Rely on the driver core for asynchronous probing"
 4 files changed, 47 insertions(+), 5 deletions(-)


And reverting those two indeed fixes it:

Checking version...
version is  Linux amd 5.1.0-rc1autobisect1556274387+ #261 SMP Fri Apr 26 12:27:12 CEST 2019 i686 GNU/Linux
Running test...
Result is [ TEST SUCCESS ]
Test said TEST SUCCESS

Pavel

-- 
DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


signature.asc
Description: Digital signature


Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-25 Thread Bart Van Assche
On 4/25/19 12:33 AM, Pavel Machek wrote:
> On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
>> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
>>> Unfortunately, that one does not revert cleanly on top of -next.
>>
>> Can you try the following:
>>
>> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
>>   git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
>>
>> I will see whether I can come up with a better way to analyze what is
>> going on. I had not expected that these patches would cause any suspend/
>> resume problems.
> 
> Not even d16ece reverts:
> 
> pavel@duo:/data/l/linux-next-32$ git show  | head -3
> commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
> Author: Stephen Rothwell 
> Date:   Tue Apr 23 20:24:59 2019 +1000
> pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
> error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add <paths>' or 'git rm <paths>'
> hint: and commit the result with 'git commit'

There has been a non-trivial merge between the block and scsi trees in
linux-next. That's probably what prevents these patches from reverting
cleanly. How about performing the following tests:
* Build, boot and test Martin's latest for-5.2 branch
(git://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git; branch
5.2/scsi-queue).
* If suspend/resume does not work reliably with that branch, revert the
two patches above, rebuild, reboot and retest.
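
The revert step of the plan above can be sketched roughly as follows. This is
only an illustration: revert_in_order is a hypothetical helper, not something
used in this thread, and it assumes it is run from inside the checkout being
tested. The commits are listed newest first, on the assumption that d16ece5
builds on 21e6ba3, which would also explain why 21e6ba3 did not revert
cleanly on its own.

```shell
#!/bin/sh
# Hypothetical helper: revert the given commits in the order listed and stop
# at the first one that does not revert cleanly. Dependent (newer) commits
# should be listed first. Reverts that already succeeded are left in place.
revert_in_order() {
    for commit in "$@"; do
        if ! git revert --no-edit "$commit"; then
            # Clean up a half-finished revert, if there is one.
            git revert --abort 2>/dev/null
            echo "revert of $commit failed" >&2
            return 1
        fi
    done
}
```

Usage would look like:
revert_in_order d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
followed by the usual rebuild, reboot and retest.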

Thanks,

Bart.


Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-25 Thread Pavel Machek
On Wed 2019-04-24 13:56:01, Bart Van Assche wrote:
> On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> > Unfortunately, that one does not revert cleanly on top of -next.
> 
> Can you try the following:
> 
> git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
>   git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> 
> I will see whether I can come up with a better way to analyze what is
> going on. I had not expected that these patches would cause any suspend/
> resume problems.

Not even d16ece reverts:

pavel@duo:/data/l/linux-next-32$ git show  | head -3
commit 76c938fcaa4b4a5d8f05fa907925d5043834964e
Author: Stephen Rothwell 
Date:   Tue Apr 23 20:24:59 2019 +1000
pavel@duo:/data/l/linux-next-32$ git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7
error: could not revert d16ece5... scsi: sd: Inline sd_probe_part2()
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'

Pavel
-- 
DENX Software Engineering GmbH,  Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Bart Van Assche
On Wed, 2019-04-24 at 12:17 +0200, Pavel Machek wrote:
> On Tue 2019-04-23 07:09:42, Bart Van Assche wrote:
> > On 4/23/19 3:22 AM, Pavel Machek wrote:
> > > > > It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > > > > and resume... but then cursor in X is moving and I can talk to
> > > > > applications cached in memory, but any access to disk hangs.
> > > > 
> > > > Mainline problem was identified.
> > > > 
> > > > But resume is still broken. I took advantage of fact that I can still
> > > > do cached commands, and got complete dmesg. I'm attaching it.
> > > 
> > > Still broken in 0418. Ideas would be welcome at this point.
> > 
> > Have you already tried the debugging steps explained in
> > Documentation/power to obtain more information about the nature of the
> > suspend/resume problem?
> 
> That won't help, as system resumes ok, then disk hangs.
> 
> Does it work for you?

Both "systemctl hibernate" and "systemctl suspend" work perfectly with the
next-20190424 kernel on my laptop (a Dell Precision laptop).

Bart.


Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Bart Van Assche
On Wed, 2019-04-24 at 22:51 +0200, Pavel Machek wrote:
> Unfortunately, that one does not revert cleanly on top of -next.

Can you try the following:

git revert d16ece577bf2cee7f94bab75a0d967bcb89dd2a7 &&
  git revert 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3

I will see whether I can come up with a better way to analyze what is
going on. I had not expected that these patches would cause any suspend/
resume problems.

Bart.


Re: regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Wed 2019-04-24 22:48:32, Pavel Machek wrote:
> Hi!
> 
> > Not block, but it seems scsi subsystem is:
> 
> commit 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
> Author: Bart Van Assche 
> Date:   Wed Mar 20 13:09:19 2019 -0700
> 
> scsi: sd: Rely on the driver core for asynchronous probing
> 
> As explained during the 2018 LSF/MM session about increasing SCSI disk
> probing concurrency, the problems with the current probing approach are as
> 
> Seems to be responsible. Full log attached.

Unfortunately, that one does not revert cleanly on top of -next.

Any ideas what is wrong?

Does suspend/resume work for you?

I can test patches.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


regression -next -- scsi: sd: Rely on the driver core for asynchronous probing was Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
Hi!

> Not block, but it seems scsi subsystem is:

commit 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
Author: Bart Van Assche 
Date:   Wed Mar 20 13:09:19 2019 -0700

scsi: sd: Rely on the driver core for asynchronous probing

As explained during the 2018 LSF/MM session about increasing SCSI disk
probing concurrency, the problems with the current probing approach are as

Seems to be responsible. Full log attached.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
# bad: [76c938fcaa4b4a5d8f05fa907925d5043834964e] Add linux-next specific files for 20190423
# good: [7142eaa58b49d9de492ccc16d48df7c488a5fbb6] Merge tag 'mips_fixes_5.1_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect start 'next-20190423' '7142eaa58b49d9de492ccc16d48df7c488a5fbb6'
# good: [ed04f675fa2c22316d7b57bea1258a18a47537ea] Merge remote-tracking branch 'crypto/master'
git bisect good ed04f675fa2c22316d7b57bea1258a18a47537ea
# good: [4a99e5b3463f5c936540958914bff57ec50ac1e0] Merge remote-tracking branch 'spi/for-next'
git bisect good 4a99e5b3463f5c936540958914bff57ec50ac1e0
# good: [61cabbda2a7e966b689a6791050ad675e6dff274] Merge remote-tracking branch 'staging/staging-next'
git bisect good 61cabbda2a7e966b689a6791050ad675e6dff274
# bad: [c8f0c2453f64529035e25fbfb9de9d24e98baff7] Merge remote-tracking branch 'coresight/next'
git bisect bad c8f0c2453f64529035e25fbfb9de9d24e98baff7
# bad: [6fb251c6f174d3cc571391baa9f6e57fff505446] Merge branch 'misc' into for-next
git bisect bad 6fb251c6f174d3cc571391baa9f6e57fff505446
# bad: [78a8ab3cc0f95a66c8fb2429030289103de173e7] scsi: qedf: fixup bit operations
git bisect bad 78a8ab3cc0f95a66c8fb2429030289103de173e7
# good: [c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f] scsi: core: remove the scsi_ioctl_reset export
git bisect good c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f
# good: [cbb24e26735f6142ba994b4d44fc2dcd54c3fe1f] scsi: ufs-mediatek: Make some symbols static
git bisect good cbb24e26735f6142ba994b4d44fc2dcd54c3fe1f
# bad: [21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3] scsi: sd: Rely on the driver core for asynchronous probing
git bisect bad 21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3
# good: [e7f7b6f38a44697428f5a2e7c606de028df2b0e3] scsi: lpfc: change snprintf to scnprintf for possible overflow
git bisect good e7f7b6f38a44697428f5a2e7c606de028df2b0e3
# good: [3e14592da654d53d87987aa09753d5a26e45446f] scsi: gdth: Only call dma_free_coherent when buf is not NULL in ioc_general
git bisect good 3e14592da654d53d87987aa09753d5a26e45446f
# good: [8378573353728a02602d6f956a3df48db0505c65] scsi: libcxgbi: remove uninitialized variable len
git bisect good 8378573353728a02602d6f956a3df48db0505c65
# good: [ea9006dfda65b7dc369aaa2359b3dedfc1bb08b6] scsi: mpt3sas: fix indentation issue
git bisect good ea9006dfda65b7dc369aaa2359b3dedfc1bb08b6
# first bad commit: [21e6ba3f0e0257cce1a226c1f15e0a8ba4338ca3] scsi: sd: Rely on the driver core for asynchronous probing
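
The log above is enough to read off the culprit, or to pick the search up
again on another machine. As a minimal sketch (the helper names
first_bad_commit and count_verdicts are made up for illustration and are not
part of git), the result could be pulled out of a saved copy of the log like
this:

```shell
#!/bin/sh
# Print the hash recorded on the "# first bad commit:" line of a
# `git bisect log` transcript read from stdin.
first_bad_commit() {
    sed -n 's/^# first bad commit: \[\([0-9a-f]*\)].*/\1/p'
}

# Count the "git bisect good|bad <hash>" verdicts in a log read from stdin.
# The "git bisect start" line is not counted.
count_verdicts() {
    grep -c -E '^git bisect (good|bad) [0-9a-f]+$'
}
```

Assuming the log was saved as bisect.log (a hypothetical file name), usage
would be "first_bad_commit < bisect.log". The same file can also be handed to
"git bisect replay" to recreate the bisection state in another checkout.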


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Wed 2019-04-24 12:48:50, Pavel Machek wrote:
> On Wed 2019-04-24 11:54:31, Pavel Machek wrote:
> > On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> > > On 4/23/19 4:22 AM, Pavel Machek wrote:
> > > > Hi!
> > > > 
> > > >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > > >>> and resume... but then cursor in X is moving and I can talk to
> > > >>> applications cached in memory, but any access to disk hangs.
> > > >>
> > > >> Mainline problem was identified.
> > > >>
> > > >> But resume is still broken. I took advantage of fact that I can still
> > > >> do cached commands, and got complete dmesg. I'm attaching it.
> > > > 
> > > > Still broken in 0418. Ideas would be welcome at this point.
> > > 
> > > Bisect it?
> > 
> > commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
> > Merge: 3c442d5 6c88d73
> > Author: Jens Axboe 
> > Date:   Mon Apr 22 13:57:36 2019 -0600
> > 
> > Works ok. So... block is not responsible.
> > 
> > Let me check
> > 
> > commit 91b112cf3b599f06f1e810cfedf37023f25d5588
> > Merge: fb2c4a8 e32d939
> > Author: Rafael J. Wysocki 
> > Date:   Mon Apr 22 01:52:48 2019 +0200
> 
> Suspend/resume ok, so pm not responsible. Let me check next-20190423.

Not block, but it seems scsi subsystem is:

pavel@duo:/data/l/linux-next-32$ git bisect log
# bad: [76c938fcaa4b4a5d8f05fa907925d5043834964e] Add linux-next specific files for 20190423
# good: [7142eaa58b49d9de492ccc16d48df7c488a5fbb6] Merge tag 'mips_fixes_5.1_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
git bisect start 'next-20190423' '7142eaa58b49d9de492ccc16d48df7c488a5fbb6'
# good: [ed04f675fa2c22316d7b57bea1258a18a47537ea] Merge remote-tracking branch 'crypto/master'
git bisect good ed04f675fa2c22316d7b57bea1258a18a47537ea
# good: [4a99e5b3463f5c936540958914bff57ec50ac1e0] Merge remote-tracking branch 'spi/for-next'
git bisect good 4a99e5b3463f5c936540958914bff57ec50ac1e0
# good: [61cabbda2a7e966b689a6791050ad675e6dff274] Merge remote-tracking branch 'staging/staging-next'
git bisect good 61cabbda2a7e966b689a6791050ad675e6dff274
# bad: [c8f0c2453f64529035e25fbfb9de9d24e98baff7] Merge remote-tracking branch 'coresight/next'
git bisect bad c8f0c2453f64529035e25fbfb9de9d24e98baff7
# bad: [6fb251c6f174d3cc571391baa9f6e57fff505446] Merge branch 'misc' into for-next
git bisect bad 6fb251c6f174d3cc571391baa9f6e57fff505446
# bad: [78a8ab3cc0f95a66c8fb2429030289103de173e7] scsi: qedf: fixup bit operations
git bisect bad 78a8ab3cc0f95a66c8fb2429030289103de173e7
# good: [c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f] scsi: core: remove the scsi_ioctl_reset export
git bisect good c0327e67ecd86e88f5bc5fd54bfdf9b422a1c93f



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Wed 2019-04-24 11:54:31, Pavel Machek wrote:
> On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> > On 4/23/19 4:22 AM, Pavel Machek wrote:
> > > Hi!
> > > 
> > >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > >>> and resume... but then cursor in X is moving and I can talk to
> > >>> applications cached in memory, but any access to disk hangs.
> > >>
> > >> Mainline problem was identified.
> > >>
> > >> But resume is still broken. I took advantage of fact that I can still
> > >> do cached commands, and got complete dmesg. I'm attaching it.
> > > 
> > > Still broken in 0418. Ideas would be welcome at this point.
> > 
> > Bisect it?
> 
> commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
> Merge: 3c442d5 6c88d73
> Author: Jens Axboe 
> Date:   Mon Apr 22 13:57:36 2019 -0600
> 
> Works ok. So... block is not responsible.
> 
> Let me check
> 
> commit 91b112cf3b599f06f1e810cfedf37023f25d5588
> Merge: fb2c4a8 e32d939
> Author: Rafael J. Wysocki 
> Date:   Mon Apr 22 01:52:48 2019 +0200

Suspend/resume ok, so pm not responsible. Let me check next-20190423.

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Tue 2019-04-23 07:09:42, Bart Van Assche wrote:
> On 4/23/19 3:22 AM, Pavel Machek wrote:
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> > 
> > Still broken in 0418. Ideas would be welcome at this point.
> 
> Have you already tried the debugging steps explained in
> Documentation/power to obtain more information about the nature of the
> suspend/resume problem?

That won't help, as system resumes ok, then disk hangs.

Does it work for you?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> On 4/23/19 4:22 AM, Pavel Machek wrote:
> > Hi!
> > 
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> > 
> > Still broken in 0418. Ideas would be welcome at this point.
> 
> Bisect it?

commit fdbbda7b3a0622fcfe630238d0bf6c57c4ba3663
Merge: 3c442d5 6c88d73
Author: Jens Axboe 
Date:   Mon Apr 22 13:57:36 2019 -0600

Works ok. So... block is not responsible.

Let me check

commit 91b112cf3b599f06f1e810cfedf37023f25d5588
Merge: fb2c4a8 e32d939
Author: Rafael J. Wysocki 
Date:   Mon Apr 22 01:52:48 2019 +0200

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-24 Thread Pavel Machek
On Tue 2019-04-23 07:55:05, Jens Axboe wrote:
> On 4/23/19 4:22 AM, Pavel Machek wrote:
> > Hi!
> > 
> >>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> >>> and resume... but then cursor in X is moving and I can talk to
> >>> applications cached in memory, but any access to disk hangs.
> >>
> >> Mainline problem was identified.
> >>
> >> But resume is still broken. I took advantage of fact that I can still
> >> do cached commands, and got complete dmesg. I'm attaching it.
> > 
> > Still broken in 0418. Ideas would be welcome at this point.
> 
> Bisect it?

Before I start heavy debugging, it would be interesting to
know... does suspend/resume work for you in -next?
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-23 Thread Bart Van Assche
On 4/23/19 3:22 AM, Pavel Machek wrote:
>>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
>>> and resume... but then cursor in X is moving and I can talk to
>>> applications cached in memory, but any access to disk hangs.
>>
>> Mainline problem was identified.
>>
>> But resume is still broken. I took advantage of fact that I can still
>> do cached commands, and got complete dmesg. I'm attaching it.
> 
> Still broken in 0418. Ideas would be welcome at this point.

Have you already tried the debugging steps explained in
Documentation/power to obtain more information about the nature of the
suspend/resume problem?

Bart.


Re: next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-23 Thread Jens Axboe
On 4/23/19 4:22 AM, Pavel Machek wrote:
> Hi!
> 
>>> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
>>> and resume... but then cursor in X is moving and I can talk to
>>> applications cached in memory, but any access to disk hangs.
>>
>> Mainline problem was identified.
>>
>> But resume is still broken. I took advantage of fact that I can still
>> do cached commands, and got complete dmesg. I'm attaching it.
> 
> Still broken in 0418. Ideas would be welcome at this point.

Bisect it?

-- 
Jens Axboe



next-20190408..0418: Suspend/resume problems on Thinkpad X60

2019-04-23 Thread Pavel Machek
Hi!

> > It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> > and resume... but then cursor in X is moving and I can talk to
> > applications cached in memory, but any access to disk hangs.
> 
> Mainline problem was identified.
> 
> But resume is still broken. I took advantage of fact that I can still
> do cached commands, and got complete dmesg. I'm attaching it.

Still broken in 0418. Ideas would be welcome at this point.

Best regards,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: next-20190408: Suspend/resume problems on Thinkpad X60

2019-04-12 Thread Pavel Machek
Hi!

> It boots ok (unlike mainline -- I'm debugging that), and I can suspend
> and resume... but then cursor in X is moving and I can talk to
> applications cached in memory, but any access to disk hangs.

Mainline problem was identified.

But resume is still broken. I took advantage of fact that I can still
do cached commands, and got complete dmesg. I'm attaching it.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


delme2.gz
Description: application/gzip


next-20190408: Suspend/resume problems on Thinkpad X60

2019-04-08 Thread Pavel Machek
Hi!

It boots ok (unlike mainline -- I'm debugging that), and I can suspend
and resume... but then cursor in X is moving and I can talk to
applications cached in memory, but any access to disk hangs.

Any ideas?

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Resume problems

2007-10-25 Thread Gabriel C
Rafael J. Wysocki wrote:
>>
>> After all, I think all these problems may be somehow ACPI-related,
>> but the question is why they get triggered by Suspend/Hibernation.
> 
> They certainly are ACPI-related, because the only difference between level 4
> and level 3 suspend testing is that some global ACPI methods are executed
> at level 3 (in addition to level 4).
> 
> Unfortunately, I have no idea what to do next, for now.
> 
> I think you can file a bug report at http://bugzilla.kernel.org and put a link
> to this thread in there (against ACPI and please add my address to the CC
> list).

Also, I patched 2.6.23 with that patch and hibernation works out of the box.
Suspend to RAM seems to work fine, only my video card is acting up (an old
nvidia card); I'll play with the vbe tool on the weekend.

Also, I can reproduce that bug in 2.6.23 when I use standby.

I've started to bisect, but it will take some time. When I'm done I will post a
bug report.

Thanks for your help so far.

> 
> Greetings,
> Rafael
> 

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Resume problems

2007-10-23 Thread Rafael J. Wysocki
On Tuesday, 23 October 2007 03:01, Gabriel C wrote:
> 
> > Also the box just froze at level 3, but at least I got an ACPI error, which I
> > didn't get in any other dmesg till now:
> > ( also patch was tested with HT disabled and Suspend and Hibernation 
> > enabled in kernel and BIOS )
> > 
> > ...
> > 
> > Oct 23 01:51:05 lara [  273.512374] PM: Removing info for No Bus:input0
> > Oct 23 01:51:05 lara [  274.545158] PM: Removing info for No Bus:mouse0
> > Oct 23 01:51:05 lara [  274.551435] PM: Removing info for No Bus:event1
> > Oct 23 01:51:05 lara [  274.559493] PM: Removing info for No Bus:input1
> > Oct 23 01:53:06 lara [  394.869468] ACPI Error (evevent-0303): No installed handler for fixed event [0002] [20070126]
> > 
> > 
> > 
> > (I did a hard reset after that)
> >
> > I'll try levels 2 and 1 now; I just wanted to let you know.
> > 
> 
> Same issues with levels 2 and 1.

Yes.  If you have a problem at level n, it should always reappear at levels
n-1 and below.

> BTW, I found out why my box does not shut down with acpi=ht. It seems libata
> does not like that ACPI mode =) and gives me the '... read
> http://linux-ata.org/shutdown.html , power down manually' message.
>
> That works perfectly with full ACPI here.
>
> After all, I think all these problems may be somehow ACPI-related,
> but the question is why they get triggered by Suspend/Hibernation.

They certainly are ACPI-related, because the only difference between level 4
and level 3 suspend testing is that some global ACPI methods are executed
at level 3 (in addition to level 4).
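
The nesting described above can be read as a simple rule: a lower test level
runs everything a higher level runs plus some extra steps, so the highest
failing level points at the steps that level adds. Below is a minimal Python
sketch of that reasoning. It assumes the results reported in this thread
(levels 3, 2 and 1 freeze; that level 4 passes is implied rather than stated),
and `highest_failing_level` is a hypothetical helper, not part of any kernel
tool.

```python
def highest_failing_level(results):
    """Return the highest test level that fails, or None if none fail.

    `results` maps a test level (int) to True if the box froze at that level.
    Assumes lower levels run a superset of what higher levels run, so a
    failure at level n must also show up at every tested level below n.
    """
    failing = [level for level, failed in results.items() if failed]
    if not failing:
        return None
    top = max(failing)
    # Consistency check: every tested level below the top failure must fail too.
    inconsistent = [lvl for lvl, failed in results.items() if lvl < top and not failed]
    if inconsistent:
        raise ValueError(f"levels {sorted(inconsistent)} pass but level {top} fails")
    return top


# Results as reported in the thread (level 4 passing is an assumption).
reported = {4: False, 3: True, 2: True, 1: True}
print(highest_failing_level(reported))
```

With these inputs the helper singles out level 3, which matches the conclusion
that the extra work done at level 3 (the global ACPI methods) is the suspect.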

Unfortunately, I have no idea what to do next, for now.

I think you can file a bug report at http://bugzilla.kernel.org and put a link
to this thread in there (against ACPI and please add my address to the CC
list).

Greetings,
Rafael


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-23 Thread Thomas Gleixner
On Tue, 23 Oct 2007, Rafael J. Wysocki wrote:

> On Monday, 22 October 2007 16:11, Mark Lord wrote:
> > Rafael,
> > 
> > What happens to the jiffies variable on resume from RAM, and from DISK?
> > Do we restore it to the value it had at suspend,
> > or just leave it be with whatever?
> > 
> > The answer has to be "restore the value it had at suspend time",
> > but I figured I'd check here anyway.
> > 
> > ??
> 
> Well, frankly, I've lost track of that recently, but it seems that we just use
> the pre-suspend jiffies (at least in the current -git).
> 
> Thomas knows better, I guess. :-)

We use the pre-suspend value if nothing else fiddled with the variable.
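
As an illustration only (a toy Python model, not kernel code), this is what
resuming from the pre-suspend value means for anything that compares jiffies
against a deadline. The class `TickCounter`, its method names and the tick rate
of 250 per second are assumptions made up for this sketch.

```python
class TickCounter:
    """Toy stand-in for a jiffies-style tick counter (not kernel code)."""

    def __init__(self, hz=250):
        self.hz = hz          # ticks per second; 250 is an arbitrary choice here
        self.jiffies = 0

    def run_for(self, seconds):
        """Advance the counter while the system is running."""
        self.jiffies += int(seconds * self.hz)

    def suspend_for(self, seconds):
        """Model suspend/resume: the counter resumes from its pre-suspend value."""
        return self.jiffies   # wall-clock time passes, the counter does not move


clock = TickCounter()
clock.run_for(3)
deadline = clock.jiffies + 10 * clock.hz   # timeout armed 10 s ahead
clock.suspend_for(3600)                    # one hour asleep
print(clock.jiffies < deadline)            # the timeout has not expired across suspend
```

In this model a timeout armed before suspend still has its full remaining
ticks after resume, because the counter did not advance while the box slept.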

   tglx


Re: Resume problems

2007-10-22 Thread Gabriel C

> Also, the box just froze at level 3, but at least I got an ACPI error which I
> didn't get in any other dmesg till now:
> (the patch was also tested with HT disabled and Suspend and Hibernation enabled
> in kernel and BIOS)
>
> ...
>
> Oct 23 01:51:05 lara [  273.512374] PM: Removing info for No Bus:input0
> Oct 23 01:51:05 lara [  274.545158] PM: Removing info for No Bus:mouse0
> Oct 23 01:51:05 lara [  274.551435] PM: Removing info for No Bus:event1
> Oct 23 01:51:05 lara [  274.559493] PM: Removing info for No Bus:input1
> Oct 23 01:53:06 lara [  394.869468] ACPI Error (evevent-0303): No installed handler for fixed event [0002] [20070126]
>
> (I did a hard reset after that)
>
> I'll try levels 2 and 1 now; I just wanted to let you know.
> 

Same issues with levels 2 and 1.

BTW, I found out why my box does not shut down with acpi=ht. It seems libata
does not like that ACPI mode =) and gives me the '... read
http://linux-ata.org/shutdown.html , power down manually' message.

That works perfectly with full ACPI here.

After all, I think all these problems may be somehow ACPI-related,
but the question is why they get triggered by Suspend/Hibernation.

If you want me to test something else, just let me know.

Gabriel


Re: Resume problems

2007-10-22 Thread Gabriel C
Gabriel C wrote:
> Rafael J. Wysocki wrote:
>> On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
>>> Rafael J. Wysocki wrote:
>>>> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>>>>> Hi all ,
>>>>>
>>>>> I'm running current git + aic7xxx suspend patch from
>>>>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>>>>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>>>>
>>>>> Suspend works fine but on resume I have some problems.
>>>>> All CPU's but boot CPU won't come back , everything else seems fine.
>>>> Can you please try to disable HT and suspend?
>>> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?
>>>
>>> If you mean that , sure I can try doing so. 
>> With suspend or hibernation enabled in the kernel, but with HT disabled in the
>> BIOS.
> 
> Ok trying in some minutes.

Disabling HT does not make any difference, nor does enabling only one of
Hibernation or Suspend in the kernel and BIOS, nor any combination of these.
 
> 
>>> I could also disable Suspend to RAM completely from the BIOS as well if you want.
>> No, that rather won't work.
>>
>>>>> ...
>>>>>
>>>>> Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
>>>>> Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
>>>>> Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
>>>>> Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
>>>>> Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
>>>>> Oct 22 15:02:28 lara [   54.638093] Not responding.
>>>>> Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
>>>>> Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
>>>>> Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
>>>>> Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
>>>>> Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
>>>>> Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
>>>>> Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
>>>>> Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
>>>>> Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
>>>>> Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
>>>>> Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
>>>>> Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
>>>>> Oct 22 15:02:28 lara [   59.656795] Not responding.
>>>>> Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
>>>>> Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
>>>>> Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
>>>>> Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
>>>>> Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
>>>>> Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
>>>>> Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
>>>>> Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
>>>>> Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
>>>>> Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
>>>>> Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
>>>>> Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
>>>>> Oct 22 15:02:28 lara [   64.675517] Not responding.
>>>>> Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
>>>>> Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
>>>>> Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
>>>>> Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
>>>>> Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
>>>>> Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
>>>>> Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
>>>>> Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
>>>>> Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable
>>>>> : System is already in ACPI mode
>>>>>
>>>>> ...
>>>>>
>>>>> After I've played with a lot boot options I found out booting with '
>>>>> acpi=ht ' will make the CPU's work again but now
>>>>> I have a problem on Suspend. Everything seems to just go down disks etc
>>>>> but the box itself is for some reason still on.
>>>>> So I've tested reboot=<> options with no luck.
>>>>> ( after waiting 5 minutes to be sure everything is really off I can just
>>>>> hit power button). On resume now everything is fine.
>>>>>
>>>>> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a
>>>>> mix of all so I'm CC'ing linux-acpi as well.
>>>>> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend
>>>>> without acpi=ht messages.
>>>>>
>>>>> I can't even tell whatever other kernel versions are working because
>>>>> aic7xxx driver didn't got suspend support till now
>>>>> (

Re: Resume problems

2007-10-22 Thread Gabriel C
Rafael J. Wysocki wrote:
> On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
>> Rafael J. Wysocki wrote:
>>> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>>>> Hi all ,
>>>>
>>>> I'm running current git + aic7xxx suspend patch from
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>>>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>>>
>>>> Suspend works fine but on resume I have some problems.
>>>> All CPU's but boot CPU won't come back , everything else seems fine.
>>> Can you please try to disable HT and suspend?
>> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?
>>
>> If you mean that , sure I can try doing so. 
> 
> With suspend or hibernation enabled in the kernel, but with HT disabled in the
> BIOS.

OK, I'll try that in a few minutes.

> 
>> I could also disable Suspend to RAM completely from the BIOS as well if you want.
> 
> No, that rather won't work.
> 
>>>> ...
>>>>
>>>> Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
>>>> Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
>>>> Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
>>>> Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
>>>> Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
>>>> Oct 22 15:02:28 lara [   54.638093] Not responding.
>>>> Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
>>>> Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
>>>> Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
>>>> Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
>>>> Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
>>>> Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
>>>> Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
>>>> Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
>>>> Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
>>>> Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
>>>> Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
>>>> Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
>>>> Oct 22 15:02:28 lara [   59.656795] Not responding.
>>>> Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
>>>> Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
>>>> Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
>>>> Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
>>>> Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
>>>> Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
>>>> Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
>>>> Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
>>>> Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
>>>> Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
>>>> Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
>>>> Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
>>>> Oct 22 15:02:28 lara [   64.675517] Not responding.
>>>> Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
>>>> Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
>>>> Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
>>>> Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
>>>> Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
>>>> Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
>>>> Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
>>>> Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
>>>> Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable
>>>> : System is already in ACPI mode
>>>>
>>>> ...
>>>>
>>>> After I've played with a lot boot options I found out booting with '
>>>> acpi=ht ' will make the CPU's work again but now
>>>> I have a problem on Suspend. Everything seems to just go down disks etc
>>>> but the box itself is for some reason still on.
>>>> So I've tested reboot=<> options with no luck.
>>>> ( after waiting 5 minutes to be sure everything is really off I can just
>>>> hit power button). On resume now everything is fine.
>>>>
>>>> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a
>>>> mix of all so I'm CC'ing linux-acpi as well.
>>>> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend
>>>> without acpi=ht messages.
>>>>
>>>> I can't even tell whatever other kernel versions are working because
>>>> aic7xxx driver didn't got suspend support till now
>>>> ( or at least never worked here ). I know suspend worked fine on windows
>>>> with that box.
>>>>
>>>> There is my config and dmesg ( good and bad one ) :
>>>>
>>>>
>>>> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
 

Re: Resume problems

2007-10-22 Thread Rafael J. Wysocki
On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
> Rafael J. Wysocki wrote:
> > On Monday, 22 October 2007 18:15, Gabriel C wrote:
> >> Hi all ,
> >>
> >> I'm running current git + aic7xxx suspend patch from  
> >> http://bugzilla.kernel.org/show_bug.cgi?id=3062
> >> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
> >>
> >> Suspend works fine but on resume I have some problems. 
> >> All CPU's but boot CPU won't come back , everything else seems fine.
> > 
> > Can you please try to disable HT and suspend?
> 
> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?
> 
> If you mean that , sure I can try doing so. 

With suspend or hibernation enabled in the kernel, but with HT disabled in the
BIOS.

> I could also disable Suspend to RAM completely from the BIOS as well if you want.

No, that rather won't work.

> > 
> >> ...
> >>
> >> Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
> >> Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
> >> Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
> >> Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
> >> Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
> >> Oct 22 15:02:28 lara [   54.638093] Not responding.
> >> Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
> >> Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
> >> Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
> >> Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
> >> Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
> >> Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
> >> Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
> >> Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
> >> Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
> >> Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
> >> Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
> >> Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
> >> Oct 22 15:02:28 lara [   59.656795] Not responding.
> >> Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
> >> Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
> >> Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
> >> Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
> >> Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
> >> Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
> >> Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
> >> Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
> >> Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
> >> Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
> >> Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
> >> Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
> >> Oct 22 15:02:28 lara [   64.675517] Not responding.
> >> Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
> >> Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
> >> Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
> >> Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
> >> Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
> >> Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
> >> Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
> >> Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
> >> Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable  
> >>   : System is already in ACPI mode
> >>
> >> ...
> >>
> >> After I've played with a lot boot options I found out booting with ' 
> >> acpi=ht ' will make the CPU's work again but now
> >> I have a problem on Suspend. Everything seems to just go down disks etc 
> >> but the box itself is for some reason still on.
> >> So I've tested reboot=<> options with no luck.
> >> ( after waiting 5 minutes to be sure everything is really off I can just 
> >> hit power button). On resume now everything is fine.
> >>
> >> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a 
> >> mix of all so I'm CC'ing linux-acpi as well.
> >> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend 
> >> without acpi=ht messages.
> >>
> >> I can't even tell whatever other kernel versions are working because 
> >> aic7xxx driver didn't got suspend support till now 
> >> ( or at least never worked here ). I know suspend worked fine on windows 
> >> with that box.
> >>
> >> There is my config and dmesg ( good and bad one ) :
> >>
> >>
> >> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> >> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> >> http://194.231.229.228/suspend/config
> 

Re: Resume problems

2007-10-22 Thread Gabriel C
Rafael J. Wysocki wrote:
> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>> Hi all ,
>>
>> I'm running current git + aic7xxx suspend patch from  
>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>
>> Suspend works fine but on resume I have some problems. 
>> All CPU's but boot CPU won't come back , everything else seems fine.
> 
> Can you please try to disable HT and suspend?

So only 'Hibernation' enabled in the kernel and HT disabled in the BIOS?

If that is what you mean, sure, I can try that.

I could also disable Suspend to RAM completely from the BIOS as well if you want.

> 
>> ...
>>
>> Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
>> Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
>> Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
>> Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
>> Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
>> Oct 22 15:02:28 lara [   54.638093] Not responding.
>> Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
>> Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
>> Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
>> Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
>> Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
>> Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
>> Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
>> Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
>> Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
>> Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
>> Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
>> Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
>> Oct 22 15:02:28 lara [   59.656795] Not responding.
>> Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
>> Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
>> Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
>> Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
>> Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
>> Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
>> Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
>> Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
>> Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
>> Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
>> Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
>> Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
>> Oct 22 15:02:28 lara [   64.675517] Not responding.
>> Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
>> Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
>> Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
>> Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
>> Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
>> Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
>> Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
>> Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
>> Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable
>> : System is already in ACPI mode
>>
>> ...
>>
>> After I've played with a lot boot options I found out booting with ' acpi=ht 
>> ' will make the CPU's work again but now
>> I have a problem on Suspend. Everything seems to just go down disks etc but 
>> the box itself is for some reason still on.
>> So I've tested reboot=<> options with no luck.
>> ( after waiting 5 minutes to be sure everything is really off I can just hit 
>> power button). On resume now everything is fine.
>>
>> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a mix 
>> of all so I'm CC'ing linux-acpi as well.
>> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend 
>> without acpi=ht messages.
>>
>> I can't even tell whatever other kernel versions are working because aic7xxx 
>> driver didn't got suspend support till now 
>> ( or at least never worked here ). I know suspend worked fine on windows 
>> with that box.
>>
>> There is my config and dmesg ( good and bad one ) :
>>
>>
>> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
>> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
>> http://194.231.229.228/suspend/config
> 
> Well, I think we have a problem with the CPU hotplug.
> 
> Can you try to offline-online CPUs (without suspending) and see if that works?

Yes, it does work when I do it manually:

[ 6687.595842] CPU 1 is now offline
[ 6687.711425] CPU 2 is now offline
[ 6687.819330] CPU 3 is now offline
[ 6687.819337] SMP alternatives: switching to UP code
[ 

Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-22 Thread Rafael J. Wysocki
On Monday, 22 October 2007 16:11, Mark Lord wrote:
> Rafael,
> 
> What happens to the jiffies variable on resume from RAM, and from DISK?
> Do we restore it to the value it had at suspend,
> or just leave it be with whatever?
> 
> The answer has to be "restore the value it had at suspend time",
> but I figured I'd check here anyway.
> 
> ??

Well, frankly, I've lost track of that recently, but it seems that we just use
the pre-suspend jiffies (at least in the current -git).

Thomas knows better, I guess. :-)

Greetings,
Rafael


Re: Resume problems

2007-10-22 Thread Rafael J. Wysocki
On Monday, 22 October 2007 18:15, Gabriel C wrote:
> Hi all ,
> 
> I'm running current git + aic7xxx suspend patch from  
> http://bugzilla.kernel.org/show_bug.cgi?id=3062
> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
> 
> Suspend works fine but on resume I have some problems. 
> All CPU's but boot CPU won't come back , everything else seems fine.

Can you please try to disable HT and suspend?

> ...
> 
> Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
> Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
> Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
> Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
> Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
> Oct 22 15:02:28 lara [   54.638093] Not responding.
> Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
> Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
> Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
> Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
> Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
> Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
> Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
> Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
> Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
> Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
> Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
> Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
> Oct 22 15:02:28 lara [   59.656795] Not responding.
> Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
> Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
> Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
> Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
> Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
> Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
> Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
> Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
> Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
> Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
> Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
> Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
> Oct 22 15:02:28 lara [   64.675517] Not responding.
> Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
> Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
> Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
> Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
> Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
> Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
> Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
> Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
> Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable
> : System is already in ACPI mode
> 
> ...
> 
> After I've played with a lot boot options I found out booting with ' acpi=ht 
> ' will make the CPU's work again but now
> I have a problem on Suspend. Everything seems to just go down disks etc but 
> the box itself is for some reason still on.
> So I've tested reboot=<> options with no luck.
> ( after waiting 5 minutes to be sure everything is really off I can just hit 
> power button). On resume now everything is fine.
> 
> I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a mix 
> of all so I'm CC'ing linux-acpi as well.
> The only thing I noticed is the 'Breaking affinity for irq XX' on suspend 
> without acpi=ht messages.
> 
> I can't even tell whatever other kernel versions are working because aic7xxx 
> driver didn't got suspend support till now 
> ( or at least never worked here ). I know suspend worked fine on windows with 
> that box.
> 
> There is my config and dmesg ( good and bad one ) :
> 
> 
> http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
> http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
> http://194.231.229.228/suspend/config

Well, I think we have a problem with the CPU hotplug.

Can you try to offline-online CPUs (without suspending) and see if that works?

Greetings,
Rafael
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
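
For reference, the offline/online test requested above can be done without suspending by writing to the CPU hotplug files in sysfs. The following is only a minimal sketch in Python, assuming a kernel with CPU hotplug support and root privileges; the helper names (set_cpu_online, is_cpu_online, cycle_cpus) are hypothetical and not part of this thread, and the `root` parameter exists only so the logic can be tried against a scratch directory.

```python
from pathlib import Path

# Standard sysfs location of the per-CPU hotplug control files.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu, online, root=SYSFS_CPU):
    """Write 1 (online) or 0 (offline) to cpuN/online. Needs root on a real system."""
    (root / f"cpu{cpu}" / "online").write_text("1" if online else "0")


def is_cpu_online(cpu, root=SYSFS_CPU):
    """Read cpuN/online back and report whether the CPU is online."""
    return (root / f"cpu{cpu}" / "online").read_text().strip() == "1"


def cycle_cpus(cpus, root=SYSFS_CPU):
    """Take each CPU offline, bring it back, and record whether it came back."""
    results = {}
    for cpu in cpus:
        set_cpu_online(cpu, False, root)
        set_cpu_online(cpu, True, root)
        results[cpu] = is_cpu_online(cpu, root)
    return results
```

On the box described in this thread the call would be something like cycle_cpus([1, 2, 3]); the boot CPU is left alone. If a CPU fails to come back, the kernel log should show why.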


Resume problems

2007-10-22 Thread Gabriel C
Hi all ,

I'm running current git + aic7xxx suspend patch from  
http://bugzilla.kernel.org/show_bug.cgi?id=3062
on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).

Suspend works fine, but on resume I have some problems:
none of the CPUs except the boot CPU come back; everything else seems fine.

...

Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
Oct 22 15:02:28 lara [   54.638093] Not responding.
Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
Oct 22 15:02:28 lara [   59.656795] Not responding.
Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
Oct 22 15:02:28 lara [   64.675517] Not responding.
Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable: System is already in ACPI mode

...

After playing with a lot of boot options I found that booting with 'acpi=ht'
makes the CPUs come back on resume, but now I have a problem on suspend:
everything (disks etc.) seems to go down, but the box itself stays powered on
for some reason. I tested the reboot=<> options with no luck.
(After waiting 5 minutes to be sure everything is really off, I can just hit
the power button.) On resume everything is now fine.

I'm not really sure what is wrong here (ACPI, hibernation, CPU hotplug, or a
mix of all), so I'm CC'ing linux-acpi as well.
The only thing I noticed is the 'Breaking affinity for irq XX' messages on
suspend when booting without acpi=ht.

I can't even tell whether other kernel versions work, because the aic7xxx
driver didn't get suspend support until now
(or at least it never worked here). I know suspend worked fine on Windows with
that box.

Here are my config and dmesg (good and bad one):


http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
http://194.231.229.228/suspend/config


Regards,

Gabriel
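
The resume log in this message repeats one pattern per CPU: "Booting processor", then "Not responding." about five seconds later, then "Error taking CPUn up: -5" (errno 5 is EIO). As a rough illustration only, a log like this can be summarised with a few lines of Python; the function names are hypothetical and the regular expressions assume exactly the message wording shown above.

```python
import errno
import re

# Matches e.g. "Error taking CPU1 up: -5"
FAIL_RE = re.compile(r"Error taking CPU(\d+) up: (-\d+)")
# Matches the printk timestamp and the message after it, e.g. "[   49.623536] Booting ..."
STAMP_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")


def failed_cpus(log_text):
    """Map each CPU that failed to come up to the symbolic name of its error code."""
    failures = {}
    for m in FAIL_RE.finditer(log_text):
        code = int(m.group(2))
        failures[int(m.group(1))] = errno.errorcode.get(-code, str(code))
    return failures


def boot_waits(log_text):
    """Seconds between each 'Booting processor' line and the next 'Not responding.'"""
    waits = []
    start = None
    for line in log_text.splitlines():
        m = STAMP_RE.search(line)
        if not m:
            continue
        stamp, msg = float(m.group(1)), m.group(2)
        if msg.startswith("Booting processor"):
            start = stamp
        elif msg.startswith("Not responding") and start is not None:
            waits.append(stamp - start)
            start = None
    return waits
```

Feeding it the saved dmesg text gives the list of CPUs that stayed offline and how long the kernel waited for each before giving up.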

 





Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-22 Thread Mark Lord

Rafael,

What happens to the jiffies variable on resume from RAM, and from DISK?
Do we restore it to the value it had at suspend,
or just leave it be with whatever?

The answer has to be "restore the value it had at suspend time",
but I figured I'd check here anyway.

??


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-22 Thread Rafael J. Wysocki
On Monday, 22 October 2007 16:11, Mark Lord wrote:
> Rafael,
> 
> What happens to the jiffies variable on resume from RAM, and from DISK?
> Do we restore it to the value it had at suspend,
> or just leave it be with whatever?
> 
> The answer has to be "restore the value it had at suspend time",
> but I figured I'd check here anyway.
> 
> ??

Well, frankly, I've lost track of that recently, but it seems that we just use
the pre-suspend jiffies (at least in the current -git).

Thomas knows better, I guess. :-)

Greetings,
Rafael


Re: Resume problems

2007-10-22 Thread Gabriel C
Rafael J. Wysocki wrote:
> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>> Hi all ,
>>
>> I'm running current git + aic7xxx suspend patch from
>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>
>> Suspend works fine but on resume I have some problems.
>> All CPU's but boot CPU won't come back , everything else seems fine.
> 
> Can you please try to disable HT and suspend?

So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?

If you mean that , sure I can try doing so. 

I also could disable Suspend to RAM completely from BIOS as well if you want.

 
 
> Well, I think we have a problem with the CPU hotplug.
> 
> Can you try to offline-online CPUs (without suspending) and see if that works?

Yes, it does work when I do it manually:

[ 6687.595842] CPU 1 is now offline
[ 6687.711425] CPU 2 is now offline
[ 6687.819330] CPU 3 is now offline
[ 6687.819337] SMP alternatives: switching to UP code
[ 6702.109605] SMP alternatives: switching to SMP code
[ 6702.110634] Booting processor 1/1 eip 3000
[ 6702.122140] Initializing CPU#1
[ 6702.182045] Calibrating delay 

Re: Resume problems

2007-10-22 Thread Rafael J. Wysocki
On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
> Rafael J. Wysocki wrote:
>> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>>> Hi all ,
>>>
>>> I'm running current git + aic7xxx suspend patch from
>>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>>
>>> Suspend works fine but on resume I have some problems.
>>> All CPU's but boot CPU won't come back , everything else seems fine.
>>
>> Can you please try to disable HT and suspend?
>
> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?
>
> If you mean that , sure I can try doing so.

With suspend or hibernation enabled in the kernel, but with HT disabled in the
BIOS.

> I also could disable Suspend to RAM completely from BIOS as well if you want.

No, that rather won't work.

  
  
>> Well, I think we have a problem with the CPU hotplug.
>>
>> Can you try to offline-online CPUs (without suspending) and see if that
>> works?
>
> Yes does work when I do it manually :
>
> [ 6687.595842] CPU 1 is now offline
> [ 6687.711425] CPU 2 is now 

Re: Resume problems

2007-10-22 Thread Gabriel C
Rafael J. Wysocki wrote:
> On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
>> Rafael J. Wysocki wrote:
>>> On Monday, 22 October 2007 18:15, Gabriel C wrote:
>>>> Hi all ,
>>>>
>>>> I'm running current git + aic7xxx suspend patch from
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=3062
>>>> on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).
>>>>
>>>> Suspend works fine but on resume I have some problems.
>>>> All CPU's but boot CPU won't come back , everything else seems fine.
>>> Can you please try to disable HT and suspend?
>> So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?
>>
>> If you mean that , sure I can try doing so.
>
> With suspend or hibernation enabled in the kernel, but with HT disabled in the
> BIOS.

Ok trying in some minutes.

>> I also could disable Suspend to RAM completely from BIOS as well if you want.
>
> No, that rather won't work.
 
>>> Well, I think we have a problem with the CPU hotplug.
>>>
>>> Can you try to offline-online CPUs (without suspending) and see if that
>>> works?
>> Yes does work when I do it manually :
>>
>> [ 6687.595842] CPU 1 is now offline
>> [ 6687.711425] CPU 2 is now offline
>> [ 6687.819330] CPU 3 is now 

Re: Resume problems

2007-10-22 Thread Gabriel C
Gabriel C wrote:
 Rafael J. Wysocki wrote:
 On Tuesday, 23 October 2007 01:00, Gabriel C wrote:
 Rafael J. Wysocki wrote:
 On Monday, 22 October 2007 18:15, Gabriel C wrote:
 Hi all ,

 I'm running current git + aic7xxx suspend patch from  
 http://bugzilla.kernel.org/show_bug.cgi?id=3062
 on a Dell Precision WorkStation 530 MT SMP box ( HT enabled ).

 Suspend works fine but on resume I have some problems. 
 All CPU's but boot CPU won't come back , everything else seems fine.
 Can you please try to disable HT and suspend?
 So only 'Hibernation' is enabled in kernel and HT disabled in BIOS ?

 If you mean that , sure I can try doing so. 
 With suspend or hibernation enabled in the kernel, but with HT disabled in 
 the
 BIOS.
 
 Ok trying in some minutes.

Disabling HT does not make any difference, nor does enabling only one of
Hibernation or Suspend in the kernel and BIOS,
nor any combination of these.
 
 
 I also could disable Suspend to RAM completly from BIOS as well if you want.
 No, that rather won't work.

 ...

 Oct 22 15:02:28 lara [   49.618795] Enabling non-boot CPUs ...
 Oct 22 15:02:28 lara [   49.622211] PM: Adding info for No Bus:msr1
 Oct 22 15:02:28 lara [   49.622259] PM: Adding info for No Bus:cpu1
 Oct 22 15:02:28 lara [   49.622302] SMP alternatives: switching to SMP code
 Oct 22 15:02:28 lara [   49.623536] Booting processor 1/1 eip 3000
 Oct 22 15:02:28 lara [   54.638093] Not responding.
 Oct 22 15:02:28 lara [   54.638096] Inquiring remote APIC #1...
 Oct 22 15:02:28 lara [   54.638099] ... APIC #1 ID: failed
 Oct 22 15:02:28 lara [   54.638204] ... APIC #1 VERSION: failed
 Oct 22 15:02:28 lara [   54.638307] ... APIC #1 SPIV: failed
 Oct 22 15:02:28 lara [   54.638427] skipping cpu1, didn't come online
 Oct 22 15:02:28 lara [   54.638602] PM: Removing info for No Bus:msr1
 Oct 22 15:02:28 lara [   54.638643] PM: Removing info for No Bus:cpu1
 Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
 Oct 22 15:02:28 lara [   54.640908] PM: Adding info for No Bus:msr2
 Oct 22 15:02:28 lara [   54.640939] PM: Adding info for No Bus:cpu2
 Oct 22 15:02:28 lara [   54.640976] SMP alternatives: switching to SMP code
 Oct 22 15:02:28 lara [   54.641961] Booting processor 2/2 eip 3000
 Oct 22 15:02:28 lara [   59.656795] Not responding.
 Oct 22 15:02:28 lara [   59.656799] Inquiring remote APIC #2...
 Oct 22 15:02:28 lara [   59.656803] ... APIC #2 ID: failed
 Oct 22 15:02:28 lara [   59.656907] ... APIC #2 VERSION: failed
 Oct 22 15:02:28 lara [   59.657011] ... APIC #2 SPIV: failed
 Oct 22 15:02:28 lara [   59.657131] skipping cpu2, didn't come online
 Oct 22 15:02:28 lara [   59.657300] PM: Removing info for No Bus:msr2
 Oct 22 15:02:28 lara [   59.657343] PM: Removing info for No Bus:cpu2
 Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
 Oct 22 15:02:28 lara [   59.659605] PM: Adding info for No Bus:msr3
 Oct 22 15:02:28 lara [   59.659637] PM: Adding info for No Bus:cpu3
 Oct 22 15:02:28 lara [   59.659673] SMP alternatives: switching to SMP code
 Oct 22 15:02:28 lara [   59.660725] Booting processor 3/3 eip 3000
 Oct 22 15:02:28 lara [   64.675517] Not responding.
 Oct 22 15:02:28 lara [   64.675520] Inquiring remote APIC #3...
 Oct 22 15:02:28 lara [   64.675524] ... APIC #3 ID: failed
 Oct 22 15:02:28 lara [   64.675628] ... APIC #3 VERSION: failed
 Oct 22 15:02:28 lara [   64.675731] ... APIC #3 SPIV: failed
 Oct 22 15:02:28 lara [   64.675859] skipping cpu3, didn't come online
 Oct 22 15:02:28 lara [   64.676017] PM: Removing info for No Bus:msr3
 Oct 22 15:02:28 lara [   64.676059] PM: Removing info for No Bus:cpu3
 Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
 Oct 22 15:02:28 lara [   64.676326] evxfevnt-0079 [00] enable : System is already in ACPI mode

 ...

 After I've played with a lot boot options I found out booting with ' 
 acpi=ht ' will make the CPU's work again but now
 I have a problem on Suspend. Everything seems to just go down disks etc 
 but the box itself is for some reason still on.
 So I've tested reboot= options with no luck.
 ( after waiting 5 minutes to be sure everything is really off I can just 
 hit power button). On resume now everything is fine.

 I'm not really sure what is wrong here acpi/hibernation/cpu-hotplug or a 
 mix of all so I'm CC'ing linux-acpi as well.
 The only thing I noticed is the 'Breaking affinity for irq XX' on suspend 
 without acpi=ht messages.

 I can't even tell whatever other kernel versions are working because 
 aic7xxx driver didn't got suspend support till now 
 ( or at least never worked here ). I know suspend worked fine on windows 
 with that box.

 There is my config and dmesg ( good and bad one ) :


 http://194.231.229.228/suspend/acpi=ht_working_dmesg.txt
 http://194.231.229.228/suspend/dmesg_broken_cpus_on_resume.txt
 http://194.231.229.228/suspend/config
 Well, I think we have a problem with the CPU hotplug.

 Can you try to offline-online CPUs (without 

Re: Resume problems

2007-10-22 Thread Gabriel C

 Also box just froze on level 3 but I got a ACPI error at least which I didn't 
 got in any other dmesg till now :
 ( also patch was tested with HT disabled and Suspend and Hibernation enabled 
 in kernel and BIOS )
 
 ...
 
 Oct 23 01:51:05 lara [  273.512374] PM: Removing info for No Bus:input0
 Oct 23 01:51:05 lara [  274.545158] PM: Removing info for No Bus:mouse0
 Oct 23 01:51:05 lara [  274.551435] PM: Removing info for No Bus:event1
 Oct 23 01:51:05 lara [  274.559493] PM: Removing info for No Bus:input1
 Oct 23 01:53:06 lara [  394.869468] ACPI Error (evevent-0303): No installed handler for fixed event [0002] [20070126]
 
 
 
 ( I hard reseted after that ) 
 
 I try level 2 and 1 now I just wanted to let you know.
 

Same issues with level 2 and 1.

BTW I found out why my box does not shut down with acpi=ht. It seems libata
does not like that ACPI mode =) and prints the
'... read http://linux-ata.org/shutdown.html , power down manually' message.

That works perfectly with full ACPI here.

After all, I think all these problems may be somehow ACPI related,
but the question is why they get triggered by Suspend/Hibernation.

If you want me to test something else just let me know.

Gabriel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
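
The failed CPU bring-up reported in this thread shows up in the resume log as `Error taking CPUn up: -5` lines (5 is the errno value of EIO). As a rough sketch, assuming the log has been saved as plain text, those lines can be picked out as follows. The function name and the sample excerpt variable are illustrative; the log lines themselves are copied from the dmesg quoted earlier.

```python
import re

# Matches the kernel's "Error taking CPU<n> up: <errno>" resume message.
FAILED_CPU = re.compile(r"Error taking CPU(\d+) up: (-?\d+)")

def failed_cpus(log_text):
    """Map CPU number -> error code for each CPU that failed to come up."""
    return {int(cpu): int(err) for cpu, err in FAILED_CPU.findall(log_text)}

LOG_EXCERPT = """\
Oct 22 15:02:28 lara [   54.638678] Error taking CPU1 up: -5
Oct 22 15:02:28 lara [   59.657379] Error taking CPU2 up: -5
Oct 22 15:02:28 lara [   64.676092] Error taking CPU3 up: -5
"""

print(failed_cpus(LOG_EXCERPT))  # {1: -5, 2: -5, 3: -5}
```

Running the same function over the "good" (acpi=ht) and "bad" dmesg files is a quick way to confirm which boots lost their non-boot CPUs.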


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-20 Thread Mark Lord

Pavel Machek wrote:
> Hi!
>
>> Since upgrading to 2.6.23.1 from 2.6.23-rc9,
>> resume-from-RAM has been misbehaving here.
>>
>> It takes much (+5-7 seconds) longer to resume
>> *sometimes*, but not all/most of the time.

I suspect those long delays may have something to do with USB,
as it takes longer for my hub + mouse to come back to life
during the sequence.

But I have since then re-applied the powertop patches,
and the long delays vanished.  Back with -rc9 I normally also
had those powertop fixes, and so this problem could be much
older and was never noticed until I ran without them.

I'm no longer actively investigating, as the delays are
gone (powertop patches), and the crashes are much less frequent
since patching.

Cheers


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-20 Thread Pavel Machek
Hi!

> Since upgrading to 2.6.23.1 from 2.6.23-rc9, 
> resume-from-RAM has been misbehaving here.
> 
> It takes much (+5-7 seconds) longer to resume 
> *sometimes*, but not all/most of the time.
> And sometimes I get get flashing keyboard LEDs and have 
> to hold the power button
> in for a full hard reset.
> 
> With 2.6.23-rc8/rc9, no such troubles.
> 
> Difficult to reproduce, other than perhaps once a day.
> Anybody want to fess up with a likely candidate?

Are there any .config differences between rc8 and .1?

Can you try disabling nohz?

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
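
The question about .config differences between the two kernels can be answered mechanically. Below is a minimal sketch, assuming both files use the usual `CONFIG_FOO=value` and `# CONFIG_FOO is not set` line forms; the function names are made up for illustration, and unset options are reported as "n" purely as a convention of this sketch.

```python
import re

SET_LINE = re.compile(r"^(CONFIG_\w+)=(.*)$")
UNSET_LINE = re.compile(r"^# (CONFIG_\w+) is not set$")

def parse_config(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_LINE.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = UNSET_LINE.match(line)
        if match:
            options[match.group(1)] = "n"
    return options

def diff_configs(old_text, new_text):
    """Return {option: (old_value, new_value)} for options that differ.

    A value of None means the option does not appear in that file at all.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Feeding it the text of the -rc9 and 2.6.23.1 config files would list, for example, whether `CONFIG_NO_HZ` changed between the two builds.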


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-16 Thread Rafael J. Wysocki
On Wednesday, 17 October 2007 00:10, Mark Lord wrote:
> Mark Lord wrote:
> > Rafael J. Wysocki wrote:
> >> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
> >>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been 
> >>> misbehaving here.
> >>>
> >>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not 
> >>> all/most of the time.
> >>> And sometimes I get get flashing keyboard LEDs and have to hold the 
> >>> power button
> >>> in for a full hard reset.
> >>>
> >>> With 2.6.23-rc8/rc9, no such troubles.
> >>>
> >>> Difficult to reproduce, other than perhaps once a day.
> >>> Anybody want to fess up with a likely candidate?
> >>
> >> Not really, but if you rule out all of the POWERPC and MIPS patches, 
> >> there's
> >> not much left ...
> > 
> > Yeah, I didn't see much there either.
> > 
> > I'll keep an eye on things over the next few days,
> > and post again if it persists.
> > 
> > I was using the powertop patches with -rc9, but not with 2.6.23.1.
> > Maybe they helped (???).
> 
> It still happens.

Well, I have no idea, sorry.

I also have not been able to reproduce it here ...


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-16 Thread Mark Lord

Mark Lord wrote:
> Rafael J. Wysocki wrote:
>> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
>>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
>>> misbehaving here.
>>>
>>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not
>>> all/most of the time.
>>> And sometimes I get get flashing keyboard LEDs and have to hold the
>>> power button
>>> in for a full hard reset.
>>>
>>> With 2.6.23-rc8/rc9, no such troubles.
>>>
>>> Difficult to reproduce, other than perhaps once a day.
>>> Anybody want to fess up with a likely candidate?
>>
>> Not really, but if you rule out all of the POWERPC and MIPS patches,
>> there's
>> not much left ...
>
> Yeah, I didn't see much there either.
>
> I'll keep an eye on things over the next few days,
> and post again if it persists.
>
> I was using the powertop patches with -rc9, but not with 2.6.23.1.
> Maybe they helped (???).


It still happens.


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-14 Thread Mark Lord

Rafael J. Wysocki wrote:
> On Sunday, 14 October 2007 22:13, Mark Lord wrote:
>> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been
>> misbehaving here.
>>
>> It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most of
>> the time.
>> And sometimes I get get flashing keyboard LEDs and have to hold the power button
>> in for a full hard reset.
>>
>> With 2.6.23-rc8/rc9, no such troubles.
>>
>> Difficult to reproduce, other than perhaps once a day.
>> Anybody want to fess up with a likely candidate?
>
> Not really, but if you rule out all of the POWERPC and MIPS patches, there's
> not much left ...


Yeah, I didn't see much there either.

I'll keep an eye on things over the next few days,
and post again if it persists.

I was using the powertop patches with -rc9, but not with 2.6.23.1.
Maybe they helped (???).

-ml


Re: Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-14 Thread Rafael J. Wysocki
On Sunday, 14 October 2007 22:13, Mark Lord wrote:
> Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been 
> misbehaving here.
> 
> It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most 
> of the time.
> And sometimes I get get flashing keyboard LEDs and have to hold the power 
> button
> in for a full hard reset.
> 
> With 2.6.23-rc8/rc9, no such troubles.
> 
> Difficult to reproduce, other than perhaps once a day.
> Anybody want to fess up with a likely candidate?

Not really, but if you rule out all of the POWERPC and MIPS patches, there's
not much left ...

Greetings,
Rafael


Regression: 2.6.23-rc9 okay, 2.6.23.1 resume problems

2007-10-14 Thread Mark Lord

Since upgrading to 2.6.23.1 from 2.6.23-rc9, resume-from-RAM has been 
misbehaving here.

It takes much (+5-7 seconds) longer to resume *sometimes*, but not all/most of 
the time.
And sometimes I get get flashing keyboard LEDs and have to hold the power button
in for a full hard reset.

With 2.6.23-rc8/rc9, no such troubles.

Difficult to reproduce, other than perhaps once a day.
Anybody want to fess up with a likely candidate?

Dell Inspiron 9400 Core2duo + 2GB RAM, 32-bit x86 kernel+user.

.config below.

-

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.23.1
# Sun Oct 14 09:22:05 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=16
# CONFIG_CPUSETS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
CONFIG_EMBEDDED=y
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
# CONFIG_SLUB_DEBUG is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set
CONFIG_BLK_DEV_BSG=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_PARAVIRT is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MCORE2=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4
CONFIG_HPET_TIMER=y
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_MCE is not set
CONFIG_VM86=y
# CONFIG_TOSHIBA is not set
CONFIG_I8K=m
CONFIG_X86_REBOOTFIXUPS=y
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
CONFIG_EDD=m
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
CONFIG_DMIID=y
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_VMSPLIT_3G=y
# CONFIG_VMSPLIT_3G_OPT is not set


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-25 Thread Pavel Machek
On Wed 2007-07-25 20:20:42, Richard Purdie wrote:
> On Wed, 2007-07-25 at 19:01 +, Pavel Machek wrote:
> > > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> > > "fixed". I think having this option is a bad idea (in its current form)
> > > as it doesn't actually stop filesystem corruption.
> > > 
> > > With the option disabled, if a filesystem is mounted when you suspend my
> > > tests show the filesystem is corrupted. At least if the option is
> > > enabled, the filesystem is only corrupted if you remove the card whilst
> > > suspended which is more preferable.
> > 
> > Are we talking _corruption_ here, or are we talking 'the kind of
> > corruption recoverable by fsck that happens on powerfail'?
> 
> There was more damage to the system than just a dirty bit set. Yes, fsck
> could fix it but I don't think it should happen in the first place...

Well, that's "ok", that happens on sudden powerdowns, too.

(Well, but we do sync() during suspend, so it is a bit strange). Do
you have fsck logs perhaps?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-25 Thread Richard Purdie
On Wed, 2007-07-25 at 19:01 +, Pavel Machek wrote:
> > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> > "fixed". I think having this option is a bad idea (in its current form)
> > as it doesn't actually stop filesystem corruption.
> > 
> > With the option disabled, if a filesystem is mounted when you suspend my
> > tests show the filesystem is corrupted. At least if the option is
> > enabled, the filesystem is only corrupted if you remove the card whilst
> > suspended which is more preferable.
> 
> Are we talking _corruption_ here, or are we talking 'the kind of
> corruption recoverable by fsck that happens on powerfail'?

There was more damage to the system than just a dirty bit set. Yes, fsck
could fix it but I don't think it should happen in the first place...

Richard



Re: MMC/SD Root filesystem suspend/resume problems

2007-07-25 Thread Pavel Machek
Hi!

> > > Lots of Linux handhelds use MMC/SD devices as the root file system.
> > > This has worked quite reliably for many kernel versions. In 2.6.22,
> > > it seems that if you suspend such a system then resume it, the device
> > > locks up. Trying to execute anything on the filesystem results in a
> > > "Permission Denied" message. I did see a message from the MMC
> > > subsystem saying it had redetected the card. There are also messages
> > > on the console like "MMC: killing requests for dead queue" each time
> > > you suspend/resume.
> > 
> > The card is removed when you suspend and readded when you resume.
> > That's the only safe thing we can do until we get suspend support in
> > the filesystems.
> > 
> > If you really want to shoot yourself in the foot, there is a Kconfig
> > option that keeps the card around across the suspend.
> 
> I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing was
> "fixed". I think having this option is a bad idea (in its current form)
> as it doesn't actually stop filesystem corruption.
> 
> With the option disabled, if a filesystem is mounted when you suspend my
> tests show the filesystem is corrupted. At least if the option is
> enabled, the filesystem is only corrupted if you remove the card whilst
> suspended which is more preferable.

Are we talking _corruption_ here, or are we talking 'the kind of
corruption recoverable by fsck that happens on powerfail'?

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-22 Thread Pierre Ossman
On Sun, 22 Jul 2007 15:28:00 +0100
Richard Purdie <[EMAIL PROTECTED]> wrote:

> 
> Corruption is corruption and it shouldn't happen if we can avoid it.
> It happens with complete certainty in one case and only happens in the
> other if the user does something which is a fairly obvious bad idea
> (which is documented as such).
> 

The corruption will only occur if the filesystem is dirty. Granted, the
mount will be dead and useless, but I wouldn't call that corruption.

Anyway, this behaviour was selected after seeing the long discussion
about how USB should handle the same problem. It was decided that it
was best to play it safe and remove any devices that couldn't be
determined to have remained in the slot. We also have the USB_PERSIST
option these days, which does the same thing as MMC_UNSAFE_RESUME.

> 
> Given I can suspend the device with "echo mem > /sys/power/state",
> that implies we need to fix echo? ;-)
> 

Or that direct usage of /sys/power/state is only for those who know
what they are doing (and have umounted their filesystems beforehand).

> 
> > And if we keep papering over the problems, you reduce the motivation
> > of fixing this properly.
> 
> Maybe although I don't like existing functionality being broken even
> if its less than ideal.
> 

I am of the opinion that it was more broken before I touched it. Silent
corruption is never acceptable in my book. But if it is in yours, just
enable MMC_UNSAFE_RESUME and you'll have the old behaviour.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer  http://www.kernel.org
  PulseAudio, core developer  http://pulseaudio.org
  rdesktop, core developer  http://www.rdesktop.org


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-22 Thread Richard Purdie
On Sun, 2007-07-22 at 16:05 +0200, Pierre Ossman wrote:
> On Sun, 22 Jul 2007 14:18:33 +0100
> Richard Purdie <[EMAIL PROTECTED]> wrote:
> > I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing
> > was "fixed". I think having this option is a bad idea (in its current
> > form) as it doesn't actually stop filesystem corruption.
> > 
> > With the option disabled, if a filesystem is mounted when you suspend
> > my tests show the filesystem is corrupted. At least if the option is
> > enabled, the filesystem is only corrupted if you remove the card
> > whilst suspended which is more preferable.
> 
> I disagree. With this option you get silent corruption, without you get
> noisy corruption. And I would always prefer the latter, even if it
> increases the risk of it happening.

Corruption is corruption and it shouldn't happen if we can avoid it. It
happens with complete certainty in one case and only happens in the
other if the user does something which is a fairly obvious bad idea
(which is documented as such).

> > I guess the solution would be to abort the suspend if mounted systems
> > were detected and the option was disabled? Alternatively the option
> > could be "auto" enabled only for mounted systems maybe with a printk
> > warning?
> 
> This is a general problem for all removable/hotpluggable storage. So
> sticking it in the MMC block device would be the wrong layer IMO.

It is; however, if the MMC layer is going to add Kconfig options which
corrupt things, it can add things to start fixing things too. If those
things can be adapted into more generic code paths, so much the better.

> Until the filesystems can be made to store something sane on disk
> before the suspend, I'd say this is best handled in user space. Let the
> user space tools refuse to initiate the suspend as long as any
> removable devices are mounted.

Given I can suspend the device with "echo mem > /sys/power/state", that
implies we need to fix echo? ;-)

> > Of course the best solution would be to have filesystems support
> > suspend/resume requests since other subsystems like pcmcia also suffer
> > this problem and would benefit from this but I accept that teaching
> > filesystems this is more difficult.
> > 
> 
> Doesn't mean we shouldn't do it. 

Agreed.

> And if we keep papering over the problems, you reduce the motivation
> of fixing this properly.

Maybe, although I don't like existing functionality being broken even if
it's less than ideal.

Regards.

Richard


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-22 Thread Pierre Ossman
On Sun, 22 Jul 2007 14:18:33 +0100
Richard Purdie <[EMAIL PROTECTED]> wrote:

> 
> I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing
> was "fixed". I think having this option is a bad idea (in its current
> form) as it doesn't actually stop filesystem corruption.
> 
> With the option disabled, if a filesystem is mounted when you suspend
> my tests show the filesystem is corrupted. At least if the option is
> enabled, the filesystem is only corrupted if you remove the card
> whilst suspended which is more preferable.
> 

I disagree. With this option you get silent corruption, without you get
noisy corruption. And I would always prefer the latter, even if it
increases the risk of it happening.

> I guess the solution would be to abort the suspend if mounted systems
> were detected and the option was disabled? Alternatively the option
> could be "auto" enabled only for mounted systems maybe with a printk
> warning?
> 

This is a general problem for all removable/hotpluggable storage. So
sticking it in the MMC block device would be the wrong layer IMO.

Until the filesystems can be made to store something sane on disk
before the suspend, I'd say this is best handled in user space. Let the
user space tools refuse to initiate the suspend as long as any
removable devices are mounted.
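
A minimal sketch of such a user-space check, assuming MMC/SD block
devices show up as /dev/mmcblk* entries in /proc/mounts. The helper
names (mounted_removable, is_mmc_device, request_suspend) are
hypothetical and not taken from any existing suspend tool:

```python
import os


def is_mmc_device(dev):
    # Assumption: MMC/SD block devices are named /dev/mmcblk<N>[p<M>].
    return os.path.basename(dev).startswith("mmcblk")


def mounted_removable(mounts_text, is_removable):
    """Return the mount points whose backing device is considered removable.

    mounts_text is the content of /proc/mounts (one mount per line:
    device, mount point, fs type, options, ...).
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and pseudo filesystems such as proc or sysfs.
        if len(fields) < 2 or not fields[0].startswith("/dev/"):
            continue
        if is_removable(fields[0]):
            found.append(fields[1])
    return found


def request_suspend(mounts_path="/proc/mounts", state_path="/sys/power/state"):
    """Write "mem" to /sys/power/state unless a removable card is mounted."""
    with open(mounts_path) as f:
        busy = mounted_removable(f.read(), is_mmc_device)
    if busy:
        raise RuntimeError("refusing to suspend; still mounted: " + ", ".join(busy))
    with open(state_path, "w") as f:
        f.write("mem")
```

A wrapper like this only guards suspends that go through it; anything
that writes to /sys/power/state directly bypasses the check.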

> Of course the best solution would be to have filesystems support
> suspend/resume requests since other subsystems like pcmcia also suffer
> this problem and would benefit from this but I accept that teaching
> filesystems this is more difficult.
> 

Doesn't mean we shouldn't do it. And if we keep papering over the
problems, you reduce the motivation of fixing this properly.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer  http://www.kernel.org
  PulseAudio, core developer  http://pulseaudio.org
  rdesktop, core developer  http://www.rdesktop.org


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-22 Thread Richard Purdie
On Thu, 2007-07-19 at 19:03 +0200, Pierre Ossman wrote:
> On Thu, 19 Jul 2007 16:53:39 +0100
> Richard Purdie <[EMAIL PROTECTED]> wrote:
> > Lots of Linux handhelds use MMC/SD devices as the root file system.
> > This has worked quite reliably for many kernel versions. In 2.6.22,
> > it seems that if you suspend such a system then resume it, the device
> > locks up. Trying to execute anything on the filesystem results in a
> > "Permission Denied" message. I did see a message from the MMC
> > subsystem saying it had redetected the card. There are also messages
> > on the console like "MMC: killing requests for dead queue" each time
> > you suspend/resume.
> 
> The card is removed when you suspend and readded when you resume.
> That's the only safe thing we can do until we get suspend support in
> the filesystems.
> 
> If you really want to shoot yourself in the foot, there is a Kconfig
> option that keeps the card around across the suspend.

I enabled the MMC_UNSAFE_RESUME option and the problems I was seeing were
"fixed". I think having this option is a bad idea (in its current form)
as it doesn't actually stop filesystem corruption.

With the option disabled, if a filesystem is mounted when you suspend, my
tests show the filesystem is corrupted. At least if the option is
enabled, the filesystem is only corrupted if you remove the card whilst
suspended, which is preferable.

I guess the solution would be to abort the suspend if mounted systems
were detected and the option was disabled? Alternatively the option
could be "auto" enabled only for mounted systems maybe with a printk
warning?

Of course the best solution would be to have filesystems support
suspend/resume requests since other subsystems like pcmcia also suffer
this problem and would benefit from this but I accept that teaching
filesystems this is more difficult.

Regards,

Richard


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Pierre Ossman
On Thu, 19 Jul 2007 16:53:39 +0100
Richard Purdie <[EMAIL PROTECTED]> wrote:

> Hi Pierre,
> 
> Lots of Linux handhelds use MMC/SD devices as the root file system.
> This has worked quite reliably for many kernel versions. In 2.6.22,
> it seems that if you suspend such a system then resume it, the device
> locks up. Trying to execute anything on the filesystem results in a
> "Permission Denied" message. I did see a message from the MMC
> subsystem saying it had redetected the card. There are also messages
> on the console like "MMC: killing requests for dead queue" each time
> you suspend/resume.
> 

The card is removed when you suspend and readded when you resume.
That's the only safe thing we can do until we get suspend support in
the filesystems.

If you really want to shoot yourself in the foot, there is a Kconfig
option that keeps the card around across the suspend.
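
The option is the one named later in this thread, MMC_UNSAFE_RESUME. A
minimal sketch for checking whether a given kernel .config enables it,
assuming the usual "CONFIG_<SYMBOL>=value" and "# CONFIG_<SYMBOL> is not
set" layout; kconfig_value is a hypothetical helper, not part of any
kernel tooling:

```python
def kconfig_value(config_text, symbol):
    """Return the value of CONFIG_<symbol> from .config text, or None if unset."""
    name = "CONFIG_" + symbol
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options are recorded as a comment, not as "=n".
        if line == "# " + name + " is not set":
            return None
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    # Symbol not mentioned at all: treat it as unset.
    return None
```

For example, kconfig_value(open(".config").read(), "MMC_UNSAFE_RESUME")
would return "y" on a kernel built to keep the card across suspend.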

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer  http://www.kernel.org
  PulseAudio, core developer  http://pulseaudio.org
  rdesktop, core developer  http://www.rdesktop.org


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Richard Purdie
On Thu, 2007-07-19 at 16:57 +0100, Richard Purdie wrote:
> Lots of Linux handhelds use MMC/SD devices as the root file system. This
> has worked quite reliably for many kernel versions. In 2.6.22, it seems
> that if you suspend such a system then resume it, the device locks up.
> Trying to execute anything on the filesystem results in a "Permission
> Denied" message. I did see a message from the MMC subsystem saying it
> had redetected the card. There are also messages on the console like
> "MMC: killing requests for dead queue" each time you suspend/resume.
> 
> I'm away from my serial cables at the moment but I may be able to
> provide more debug when I have them over the weekend. Have you any ideas
> on why this is breaking?
> 
> For reference, I've reproduced the problem with both the PXA host driver
> and different driver not merged into mainline (ASIC3).

Just to follow up, if I boot with a rootfs from elsewhere and mount the
mmc card then suspend/resume, it corrupts the data on the card too so it
looks like some general suspend/resume problem on mounted filesystems.

Richard


MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Richard Purdie
Hi Pierre,

Lots of Linux handhelds use MMC/SD devices as the root file system. This
has worked quite reliably for many kernel versions. In 2.6.22, it seems
that if you suspend such a system then resume it, the device locks up.
Trying to execute anything on the filesystem results in a "Permission
Denied" message. I did see a message from the MMC subsystem saying it
had redetected the card. There are also messages on the console like
"MMC: killing requests for dead queue" each time you suspend/resume.

I'm away from my serial cables at the moment but I may be able to
provide more debug when I have them over the weekend. Have you any ideas
on why this is breaking?

For reference, I've reproduced the problem with both the PXA host driver
and a different driver not merged into mainline (ASIC3).

Cheers,

Richard


MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Richard Purdie
Hi Pierre,

Lots of Linux handhelds use MMC/SD devices as the root file system. This
has worked quite reliably for many kernel versions. In 2.6.22, it seems
that if you suspend such a system then resume it, the device locks up.
Trying to execute anything on the filesystem results in a Permission
Denied message. I did see a message from the MMC subsystem saying it
had redetected the card. There are also messages on the console like
MMC: killing requests for dead queue each time you suspend/resume.

I'm away from my serial cables at the moment but I may be able to
provide more debug when I have them over the weekend. Have you any ideas
on why this is breaking?

For reference, I've reproduced the problem with both the PXA host driver
and different driver not merged into mainline (ASIC3).

Cheers,

Richard

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Richard Purdie
On Thu, 2007-07-19 at 16:57 +0100, Richard Purdie wrote:
 Lots of Linux handhelds use MMC/SD devices as the root file system. This
 has worked quite reliably for many kernel versions. In 2.6.22, it seems
 that if you suspend such a system then resume it, the device locks up.
 Trying to execute anything on the filesystem results in a Permission
 Denied message. I did see a message from the MMC subsystem saying it
 had redetected the card. There are also messages on the console like
 MMC: killing requests for dead queue each time you suspend/resume.
 
 I'm away from my serial cables at the moment but I may be able to
 provide more debug when I have them over the weekend. Have you any ideas
 on why this is breaking?
 
 For reference, I've reproduced the problem with both the PXA host driver
 and different driver not merged into mainline (ASIC3).

Just to follow up, if I boot with a rootfs from elsewhere and mount the
mmc card then suspend/resume, it corrupts the data on the card too so it
looks like some general suspend/resume problem on mounted filesystems.

Richard

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: MMC/SD Root filesystem suspend/resume problems

2007-07-19 Thread Pierre Ossman
On Thu, 19 Jul 2007 16:53:39 +0100
Richard Purdie [EMAIL PROTECTED] wrote:

 Hi Pierre,
 
 Lots of Linux handhelds use MMC/SD devices as the root file system.
 This has worked quite reliably for many kernel versions. In 2.6.22,
 it seems that if you suspend such a system then resume it, the device
 locks up. Trying to execute anything on the filesystem results in a
 Permission Denied message. I did see a message from the MMC
 subsystem saying it had redetected the card. There are also messages
 on the console like MMC: killing requests for dead queue each time
 you suspend/resume.
 

The card is removed when you suspend and readded when you resume.
That's the only safe thing we can do until we get suspend support in
the filesystems.

If you really want to shoot yourself in the foot, there is a Kconfig
option that keeps the card around across the suspend.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org


Re: [PATCH] for acpi S1 power cycle resume problems

2005-08-24 Thread David Brownell
> Date: Fri, 19 Aug 2005 08:39:25 -0600
> From: "William Morrow" <[EMAIL PROTECTED]>
> Subject: [PATCH] for acpi S1 power cycle resume problems
>
>
> Hi
> I was told that if I had a patch to submit for a baseline change that 
> this was the place to do it.

In this case that works fine.  Normally they should go to linux-usb-devel
for me (and others) to read there.

Thanks, these need a bit of cleaning up, finishing, and splitting out;
they should be in 2.6.14 though.  Comments below.  Were these patches
written by you, or by Jordan?

- Dave



> If not, please let me know...
>
> thanks,
> morrow
>
> Patched against 2.6.11 baseline
> problems fixed:
> 1) OHCI_INTR_RD not being cleared in ohci interrupt handler
>  results in interrupt storm and system hang on RD status.
>  ohci spec indicates this should be done.

Yeah, I noticed that one but didn't fix it yet.  It's not that
it was _never_ cleared ... only certain code paths missed it.
The systems I test with were clearly using those working paths!

Having this fixed should help get rid of the 1/4 second timer
this driver normally ties up.  That'll help make the dynamic
tick stuff work better, reducing power even when something like
"ACPI S1" doesn't exist (like say, on that one Zaurus).


> 2) PORT_CSC not being cleared in ehci_hub_status_data
>  code attempts to clear bit, but bit is write to clear.
>  there are other errant clears, since the PORTSCn regs
>  have 3 RWC bits, and the rest are RW. All stmts of the form:
>    writel (v, &ehci->regs->port_status[i])
>  should clear RWC bits if they do not intend to clear status,
>  and should set the bits which should be cleared (this case).

Yeah, whoever did that RWC patch for UHCI ports certainly should
have checked other HCDs for the same bug.  (Kicks self.)

In fact you didn't fix this issue comprehensively.  There are
other places that register is written; they need to change too.

This is clearly wrong, but did you notice any effects more
serious than "lsusb -v" output for EHCI root hubs looking
a bit strange?


> 3) loop control and subsequent port resume/reset not correct.
>  unsigned index made detecting port1 active impossible,

Odd, I've done that with some regularity.  Is that maybe
some kind of compiler bug?  (I heard even 4.1 isn't quite
there yet for kernels.)

The looping doesn't look incorrect to me; ports are numbered
from 1..N, and C code in the body must index them from 0..(N-1).


> and OWNER/POWER status was being ignored on ports assigned
>  to companion controller.

Well, in that one resume case anyway!

But OWNER and POWER are very different status bits ... if POWER
ever goes off, that port is by definition not resumable.  But
if a port's owned by the companion (OHCI or UHCI) controller,
then it surely ought not to be reset (even if the companion's
own SUSPEND bit doesn't show through EHCI).
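
The write-to-clear point in item 2 can be illustrated with a small stand-alone sketch. It is not the driver code: `portsc_writeback()` and `portsc_model_write()` are hypothetical helpers, and the toy register model only covers the bits defined here. The bit positions for PORT_PEC and PORT_OCC are taken from the EHCI PORTSC layout, since the patch context does not show them.

```c
#include <stdint.h>

/* EHCI PORTSC bits used in this sketch. */
#define PORT_CONNECT   (1u << 0)  /* device connected (read-only) */
#define PORT_CSC       (1u << 1)  /* connect status change (RWC) */
#define PORT_PE        (1u << 2)  /* port enable (RW) */
#define PORT_PEC       (1u << 3)  /* port enable change (RWC) */
#define PORT_OCC       (1u << 5)  /* over-current change (RWC) */
#define PORT_RWC_BITS  (PORT_CSC | PORT_PEC | PORT_OCC)

/*
 * Hypothetical helper: value to write back to PORTSC.
 * 'status' is the value just read; 'ack' is the set of RWC bits the
 * caller really wants to acknowledge (may be 0).
 */
static uint32_t portsc_writeback(uint32_t status, uint32_t ack)
{
	status &= ~PORT_RWC_BITS;        /* never echo pending change bits */
	status |= ack & PORT_RWC_BITS;   /* write 1 only to bits being cleared */
	return status;
}

/*
 * Toy model of the register: RW bits take the written value, RWC bits
 * are cleared where a 1 is written, the read-only bit is unaffected.
 */
static uint32_t portsc_model_write(uint32_t reg, uint32_t val)
{
	uint32_t rwc = reg & PORT_RWC_BITS & ~val;
	uint32_t rw  = val & ~(PORT_RWC_BITS | PORT_CONNECT);
	uint32_t ro  = reg & PORT_CONNECT;

	return rwc | rw | ro;
}
```

In this model, writing back `temp & ~PORT_CSC` leaves CSC set and clears a pending PEC as a side effect, while writing `portsc_writeback(temp, PORT_CSC)` clears CSC and leaves PEC pending.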



[PATCH] for acpi S1 power cycle resume problems

2005-08-19 Thread William Morrow

Hi
I was told that if I had a patch to submit for a baseline change that 
this was the place to do it.

If not, please let me know...

thanks,
morrow

Patched against 2.6.11 baseline
problems fixed:
1) OHCI_INTR_RD not being cleared in ohci interrupt handler
results in interrupt storm and system hang on RD status.
ohci spec indicates this should be done.
2) PORT_CSC not being cleared in ehci_hub_status_data
code attempts to clear bit, but bit is write to clear.
there are other errant clears, since the PORTSCn regs
have 3 RWC bits, and the rest are RW. All stmts of the form:
  writel (v, &ehci->regs->port_status[i])
should clear RWC bits if they do not intend to clear status,
and should set the bits which should be cleared (this case).
3) loop control and subsequent port resume/reset not correct.
unsigned index made detecting port1 active impossible, and
OWNER/POWER status was being ignored on ports assigned
to companion controller.

Signed-off-by: Jordan Crouse <[EMAIL PROTECTED]>

diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci.h linux-2.6.11/drivers/usb/host/ehci.h
--- linux-2.6.11.orig/drivers/usb/host/ehci.h   2005-03-02 00:38:25.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci.h        2005-08-17 08:15:36.0 -0600
@@ -262,6 +262,7 @@ struct ehci_regs {
 #define PORT_PE        (1<<2)  /* port enable */
 #define PORT_CSC   (1<<1)  /* connect status change */
 #define PORT_CONNECT   (1<<0)  /* device connected */
+#define PORT_RWC_BITS   (PORT_CSC | PORT_PEC | PORT_OCC)
 } __attribute__ ((packed));
 
 /* Appendix C, Debug port ... intended for use with special "debug devices"
diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci-hcd.c linux-2.6.11/drivers/usb/host/ehci-hcd.c
--- linux-2.6.11.orig/drivers/usb/host/ehci-hcd.c   2005-03-02 00:38:38.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci-hcd.c        2005-08-17 08:15:36.0 -0600
@@ -722,7 +722,7 @@ static int ehci_suspend (struct usb_hcd 
 static int ehci_resume (struct usb_hcd *hcd)
 {
struct ehci_hcd *ehci = hcd_to_ehci (hcd);
-   unsigned port;
+   int port;
struct usb_device   *root = hcd->self.root_hub;
int retval = -EINVAL;
int powerup = 0;
@@ -733,11 +733,11 @@ static int ehci_resume (struct usb_hcd *
msleep (100);
 
/* If any port is suspended, we know we can/must resume the HC. */
-   for (port = HCS_N_PORTS (ehci->hcs_params); port > 0; ) {
+   for (port = HCS_N_PORTS (ehci->hcs_params); --port >= 0; ) {
u32 status;
-   port--;
status = readl (&ehci->regs->port_status [port]);
-   if (status & PORT_SUSPEND) {
+   if ( (status & PORT_SUSPEND) != 0 ||
+   ((status & PORT_OWNER) != 0 && (status & PORT_POWER) != 0) ) {
down (&hcd->self.root_hub->serialize);
retval = ehci_hub_resume (hcd);
up (&hcd->self.root_hub->serialize);
@@ -755,7 +755,7 @@ static int ehci_resume (struct usb_hcd *
/* Else reset, to cope with power loss or flush-to-storage
 * style "resume" having activated BIOS during reboot.
 */
-   if (port == 0) {
+   if (port < 0) {
(void) ehci_halt (ehci);
(void) ehci_reset (ehci);
(void) ehci_hc_reset (hcd);
diff -uprN linux-2.6.11.orig/drivers/usb/host/ehci-hub.c linux-2.6.11/drivers/usb/host/ehci-hub.c
--- linux-2.6.11.orig/drivers/usb/host/ehci-hub.c   2005-03-02 00:38:32.0 -0700
+++ linux-2.6.11/drivers/usb/host/ehci-hub.c        2005-08-17 08:15:36.0 -0600
@@ -232,7 +232,8 @@ ehci_hub_status_data (struct usb_hcd *hc
if (temp & PORT_OWNER) {
/* don't report this in GetPortStatus */
if (temp & PORT_CSC) {
-   temp &= ~PORT_CSC;
+   temp &= ~PORT_RWC_BITS;
+   temp |= PORT_CSC;
writel (temp, &ehci->regs->port_status [i]);
}
continue;
diff -uprN linux-2.6.11.orig/drivers/usb/host/ohci-hcd.c linux-2.6.11/drivers/usb/host/ohci-hcd.c
--- linux-2.6.11.orig/drivers/usb/host/ohci-hcd.c   2005-03-02 00:37:48.0 -0700
+++ linux-2.6.11/drivers/usb/host/ohci-hcd.c        2005-08-17 08:15:36.0 -0600
@@ -720,6 +720,7 @@ static irqreturn_t ohci_irq (struct usb_
 
if (ints & OHCI_INTR_RD) {
ohci_vdbg (ohci, "resume detect\n");
+   ohci_writel (ohci, OHCI_INTR_RD, &regs->intrstatus);
schedule_work(&ohci->rh_resume);
}
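
One way to read item 3 of the description is sketched below. The sketch assumes the loop in ehci_resume() breaks out as soon as it finds a matching port; the hunk context above does not show that part, so this is an assumption. The function names and the status arrays are hypothetical and exist only for illustration. Both loop shapes visit every port; they differ only in the value the index holds afterwards, which matters for the later `port == 0` / `port < 0` test.

```c
#include <stdint.h>

#define PORT_SUSPEND (1u << 7)  /* EHCI PORTSC suspend bit */

/* Old loop shape: unsigned index, decrement inside the body. */
static unsigned scan_unsigned(const uint32_t *status, unsigned nports)
{
	unsigned port;

	for (port = nports; port > 0; ) {
		port--;
		if (status[port] & PORT_SUSPEND)
			break;  /* assumed: stop at the first suspended port */
	}
	return port;    /* 0 means "index 0 matched" OR "nothing matched" */
}

/* Patched loop shape: signed index, pre-decrement in the condition. */
static int scan_signed(const uint32_t *status, int nports)
{
	int port;

	for (port = nports; --port >= 0; ) {
		if (status[port] & PORT_SUSPEND)
			break;
	}
	return port;    /* -1 only when no port matched */
}
```

With the unsigned index, a match on index 0 (port 1) and a scan that finds nothing both leave the index at 0; with the signed index the second case ends at -1.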
 


intel_agp resume problems

2005-08-16 Thread Carl-Daniel Hailfinger
Hello Dave,

after suspend-to-ram and a subsequent resume the configuration
of my AGP bridge/controller is different and X will refuse to
start after resume if it wasn't running during suspend. I'm
using radeonfb as console driver and kernel 2.6.13-rc6-git6.

Diff between lspci -vvvxxx before and after suspend follows.

--- lspci.radeonfb_beforeS3 2005-08-16 13:23:31.0 +0200
+++ lspci.radeonfb_afterS3  2005-08-16 13:23:31.0 +0200
@@ -1,353 +1,349 @@
 :00:00.0 Host bridge: Intel Corp. 82855PM Processor to I/O Controller (rev 21)
Subsystem: Samsung Electronics Co Ltd: Unknown device c00c
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
Latency: 0
Region 0: Memory at e000 (32-bit, prefetchable)
Capabilities: [e4] #09 [4104]
Capabilities: [a0] AGP version 2.0
Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64- HTrans- 64bit- FW+ AGP3- Rate=x1,x2,x4
-   Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=x4
+   Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit- FW- Rate=<none>
 00: 86 80 40 33 06 01 90 20 21 00 00 06 00 00 00 00
 10: 08 00 00 e0 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 4d 14 0c c0
 30: 00 00 00 00 e4 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 50: 00 02 00 00 00 00 00 00 00 00 00 00 00 27 00 00
 60: 04 08 0c 10 00 00 00 00 00 00 00 00 00 00 00 00
 70: 02 02 00 00 00 00 00 00 00 00 02 2d 71 32 40 30
 80: 71 00 80 05 00 00 00 00 00 10 01 00 00 00 00 00
-90: 10 11 11 00 01 13 11 00 41 19 00 00 00 0a 3d 00
-a0: 02 00 20 00 17 02 00 1f 04 00 00 00 00 00 00 00
+90: 10 11 11 00 01 13 11 00 41 19 00 00 00 1a 3d 00
+a0: 02 00 20 00 17 02 00 1f 00 00 00 00 00 00 00 00
 b0: 00 00 00 00 00 00 00 00 00 00 e0 1b 20 10 00 00
 c0: 44 40 50 11 00 20 05 06 00 00 00 00 00 00 00 00
 d0: 02 28 00 0e 0b 00 00 30 00 00 31 b5 00 00 02 00
 e0: 00 00 00 00 09 a0 04 41 00 00 00 00 00 00 00 00
 f0: 00 00 01 00 74 f8 20 80 38 0f 21 00 04 00 00 00

 :00:01.0 PCI bridge: Intel Corp. 82855PM Processor to AGP Controller (rev 21) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 96
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 3000-3fff
Memory behind bridge: d010-d01f
Prefetchable memory behind bridge: d800-dfff
Expansion ROM at 3000 [disabled] [size=4K]
BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B-
 00: 86 80 41 33 07 01 a0 00 21 00 04 06 00 60 01 00
-10: 00 00 00 00 00 00 00 00 00 01 01 40 30 30 a0 22
+10: 00 00 00 00 00 00 00 00 00 01 01 40 30 30 a0 02
 20: 10 d0 10 d0 00 d8 f0 df 00 00 00 00 00 00 00 00
 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0c 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Do you have any hints how to solve the problem?


Regards,
Carl-Daniel
-- 
http://www.hailfinger.org/
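
For readers comparing the hex dumps: in the host bridge dump the AGP capability starts at offset a0, so the bytes at a4..a7 are the AGP status register and the bytes at a8..ab are the AGP command register. The sketch below decodes those four-byte groups. It is only an illustration: the helper names are made up here, and the field layout (rate in bits 2:0, AGP enable in bit 8, SBA in bit 9, request depth in bits 31:24) is the AGP 2.0 one, which is an assumption about this bridge based on the "AGP version 2.0" line.

```c
#include <stdint.h>

/* AGP 2.0 status/command register fields (assumed layout). */
#define AGP_RATE_MASK   0x7u       /* bit0 = x1, bit1 = x2, bit2 = x4 */
#define AGP_FW          (1u << 4)  /* fast writes */
#define AGP_CMD_ENABLE  (1u << 8)  /* AGP enable (command register only) */
#define AGP_SBA         (1u << 9)  /* sideband addressing */

/* Assemble a little-endian dword from four bytes of an lspci -xxx dump. */
static uint32_t le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* Rate bits: in the command register at most one is expected to be set. */
static unsigned agp2_rate_bits(uint32_t reg)
{
	return reg & AGP_RATE_MASK;
}

/* Request queue depth as lspci prints it (field value plus one). */
static unsigned agp_rq_depth(uint32_t reg)
{
	return (reg >> 24) + 1;
}

static int agp_cmd_enabled(uint32_t cmd)
{
	return (cmd & AGP_CMD_ENABLE) != 0;
}
```

Fed with the dump bytes, the command register reads 04 00 00 00 before suspend (rate bits 4, i.e. x4) and 00 00 00 00 afterwards (no rate selected), matching the Rate=x4 and Rate=<none> lines; the AGP enable bit is clear in both.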


2.6.12-rc1-mm3: still having USB resume problems

2005-03-25 Thread Jeremy Fitzhardinge
Though it looks a lot better; no more streams of messages.

Now when I resume, I get:

PCI: Enabling device :00:1d.7 ( -> 0002)
<1>Unable to handle kernel NULL pointer dereference

a second or so after resume.  It is completely locked up at this point;
magic-sysreq gets no response.

lspci shows that :00:1d.7 is

# lspci -v -s :00:1d.7
00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI 
Controller (rev 01) (prog-if 20 [EHCI])
Subsystem: IBM: Unknown device 052e
Flags: bus master, medium devsel, latency 0, IRQ 5
Memory at c000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port

Complete lspci and .config attached.

   J
00:00.0 Host bridge: Intel Corp. 82855PM Processor to I/O Controller (rev 03)
Subsystem: IBM: Unknown device 0529
Flags: bus master, fast devsel, latency 0
Memory at d000 (32-bit, prefetchable) [size=256M]
Capabilities: [e4] Vendor Specific Information
Capabilities: [a0] AGP version 2.0

00:01.0 PCI bridge: Intel Corp. 82855PM Processor to AGP Controller (rev 03) 
(prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, fast devsel, latency 96
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 3000-3fff
Memory behind bridge: c010-c01f
Prefetchable memory behind bridge: e000-e7ff

00:1d.0 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB 
UHCI Controller #1 (rev 01) (prog-if 00 [UHCI])
Subsystem: IBM: Unknown device 052d
Flags: bus master, medium devsel, latency 0, IRQ 11
I/O ports at 1800 [size=32]

00:1d.1 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB 
UHCI Controller #2 (rev 01) (prog-if 00 [UHCI])
Subsystem: IBM: Unknown device 052d
Flags: bus master, medium devsel, latency 0, IRQ 5
I/O ports at 1820 [size=32]

00:1d.2 USB Controller: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB 
UHCI Controller #3 (rev 01) (prog-if 00 [UHCI])
Subsystem: IBM: Unknown device 052d
Flags: bus master, medium devsel, latency 0, IRQ 9
I/O ports at 1840 [size=32]

00:1d.7 USB Controller: Intel Corp. 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI 
Controller (rev 01) (prog-if 20 [EHCI])
Subsystem: IBM: Unknown device 052e
Flags: bus master, medium devsel, latency 0, IRQ 5
Memory at c000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port

00:1e.0 PCI bridge: Intel Corp. 82801 Mobile PCI Bridge (rev 81) (prog-if 00 
[Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=02, subordinate=08, sec-latency=64
I/O behind bridge: 4000-8fff
Memory behind bridge: c020-cfff
Prefetchable memory behind bridge: e800-efff

00:1f.0 ISA bridge: Intel Corp. 82801DBM (ICH4-M) LPC Interface Bridge (rev 01)
Flags: bus master, medium devsel, latency 0

00:1f.1 IDE interface: Intel Corp. 82801DBM (ICH4-M) IDE Controller (rev 01) 
(prog-if 8a [Master SecP PriP])
Subsystem: IBM: Unknown device 052d
Flags: bus master, medium devsel, latency 0, IRQ 9
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at 1860 [size=16]
Memory at 4000 (32-bit, non-prefetchable) [size=1K]

00:1f.3 SMBus: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus 
Controller (rev 01)
Subsystem: IBM: Unknown device 052d
Flags: medium devsel, IRQ 10
I/O ports at 1880 [size=32]

00:1f.5 Multimedia audio controller: Intel Corp. 82801DB/DBL/DBM 
(ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 01)
Subsystem: IBM: Unknown device 0534
Flags: bus master, medium devsel, latency 0, IRQ 10
I/O ports at 1c00 [size=256]
I/O ports at 18c0 [size=64]
Memory at cc00 (32-bit, non-prefetchable) [size=512]
Memory at c800 (32-bit, non-prefetchable) [size=256]
Capabilities: [50] Power Management version 2

00:1f.6 Modem: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem 
Controller (rev 01) (prog-if 00 [Generic])
Subsystem: IBM: Unknown device 0524
Flags: bus master, medium devsel, latency 0, IRQ 10
I/O ports at 2400 [size=256]
I/O ports at 2000 [size=128]
Capabilities: [50] Power Management version 2

01:00.0 VGA compatible controller: ATI Technologies Inc Radeon Mobility M6 LY 
(prog-if 00 [VGA])
Subsystem: IBM: Unknown device 052f
Flags: bus master, stepping, fast Back2Back, 66Mhz, medium devsel, 
latency 66, IRQ 11
Memory at e000 (32-bit, prefetchable) [size=128M]
I/O ports at 3000 [size=256]
Memory at c010 (32-bit, 


Re: 2.6.11-rc3: APM resume problems with USB

2005-03-19 Thread Matt Mackall
On Sat, Mar 19, 2005 at 01:44:24AM -0800, Jeremy Fitzhardinge wrote:
> On my IBM ThinkPad X31, I can only do one successful APM resume.  After 
> the resume, there's a stream of messages on the console:
> 
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> uhci_hcd 0000:00:1d.0: host controller process error, something bad 
> happened!
> uhci_hcd 0000:00:1d.0: host system error, PCI problems?
> 
> 
> The second resume, the machine panics.  I haven't managed to get the 
> panic message yet.
> 
> This happens with both -rc3 and -rc4.

I think you mean -mm[34]. I've seen the problem with -mm3; 2.6.11{,.3}
seem to be fine. ACPI rather than APM is fine as well, though the
suspend life is pathetic.

-- 
Mathematics is the supreme nostalgia of our time.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


2.6.11-rc3: APM resume problems with USB

2005-03-19 Thread Jeremy Fitzhardinge
On my IBM ThinkPad X31, I can only do one successful APM resume.  After 
the resume, there's a stream of messages on the console:

uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host system error, PCI problems?

The second resume, the machine panics.  I haven't managed to get the 
panic message yet.

This happens with both -rc3 and -rc4.

If I unload the USB modules before the suspend, then I can 
suspend/resume as many times as I like.  Curiously, if I reload the 
modules, I can continue suspending/resuming without obvious problems, 
though it does print "Trying to free free IRQ11" each time it resumes, 
and a new "uhci_hcd" appears associated with a number of interrupts.
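
For readers unfamiliar with the "Trying to free free IRQ" message: the
kernel prints it when free_irq() is called for a line on which no
matching handler is registered. The following is only a userspace model
of that bookkeeping, not kernel code. The model_* names, the one-owner-
per-line table (the real kernel also supports shared lines) and the
scenario function are all assumptions made for illustration, and the
sketch does not claim to explain the cause of the report above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NR_IRQS 16

/* One owner cookie per IRQ line; NULL means the line is free.
 * Assumption: no shared IRQs in this model. */
static void *irq_owner[NR_IRQS];

/* Model of request_irq(): fails if the line already has an owner. */
static int model_request_irq(unsigned int irq, void *dev_id)
{
    assert(irq < NR_IRQS);
    if (irq_owner[irq] != NULL)
        return -1;
    irq_owner[irq] = dev_id;
    return 0;
}

/* Model of free_irq(): returns 1 if a handler was released,
 * 0 (after printing the warning) if nothing matched. */
static int model_free_irq(unsigned int irq, void *dev_id)
{
    assert(irq < NR_IRQS);
    if (dev_id == NULL || irq_owner[irq] != dev_id) {
        printf("Trying to free free IRQ%u\n", irq);
        return 0;
    }
    irq_owner[irq] = NULL;
    return 1;
}

/* Hypothetical scenario: one request followed by two frees.
 * Returns the number of warnings, or -1 if the request failed. */
static int unbalanced_frees(void)
{
    int dev;            /* stands in for the driver's dev_id cookie */
    int warnings = 0;

    if (model_request_irq(11, &dev) != 0)
        return -1;
    warnings += !model_free_irq(11, &dev);  /* balanced free */
    warnings += !model_free_irq(11, &dev);  /* unbalanced second free */
    return warnings;
}
```

In this model only the second free finds nothing to release, so any
path that frees a line once more than it requested it would trigger
the warning.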

   J
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.11-mm4
# Fri Mar 18 14:56:16 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_CLEAR_PAGES=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
CONFIG_MPENTIUMM=y
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODE is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_HPET_TIMER is not set
# CONFIG_SMP is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_UP_APIC=y
CONFIG_X86_UP_IOAPIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=y
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_REGPARM=y
CONFIG_SECCOMP=y

#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_PHYSICAL_START=0x100000
# CONFIG_KEXEC is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set

#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is 


Re: Fix suspend/resume problems with b44

2005-03-08 Thread David S. Miller
On Tue, 8 Mar 2005 22:55:37 +0100
Pavel Machek <[EMAIL PROTECTED]> wrote:

> Any idea what to do there? I'd say that request_irq is very unlikely
> to fail given that it worked okay before suspend...

What you have is fine for now.

It is just a general issue that ->resume() has no way to cleanly
fail.


Re: Fix suspend/resume problems with b44

2005-03-08 Thread Pavel Machek
Hi!

> > @@ -1934,6 +1936,9 @@
> > 	if (!netif_running(dev))
> > 		return 0;
> >  
> > +	if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
> > +		printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
> > +
> 
> This is a hard error and means that bringup of the chip
> will totally fail.  It definitely deserves something harder
> than a printk(), but unfortunately ->resume() has no way
> to cleanly fail.

Any idea what to do there? I'd say that request_irq is very unlikely
to fail given that it worked okay before suspend...
Pavel
-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!


Re: Fix suspend/resume problems with b44

2005-03-08 Thread David S. Miller
On Tue, 8 Mar 2005 10:46:55 +0100
Pavel Machek <[EMAIL PROTECTED]> wrote:

> @@ -1934,6 +1936,9 @@
> 	if (!netif_running(dev))
> 		return 0;
>  
> +	if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
> +		printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
> +

This is a hard error and means that bringup of the chip
will totally fail.  It definitely deserves something harder
than a printk(), but unfortunately ->resume() has no way
to cleanly fail.
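
As an illustration of the point above, here is a minimal userspace
sketch of what "something harder than a printk()" might look like if a
resume path could report failure. It is a model under stated
assumptions, not the b44 driver: struct model_dev, the stub_* helpers,
the fail_request switch and the choice of -EBUSY are all hypothetical.
The idea is simply that suspend releases the IRQ, resume requests it
again, and on failure the device stays detached and the error is
returned rather than only logged.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical device state for the model. */
struct model_dev {
    int irq;
    int irq_held;   /* 1 while the handler is registered */
    int running;    /* interface was up before suspend */
    int attached;   /* device is usable by the rest of the stack */
};

/* Test switch: when non-zero, the request stub reports failure. */
static int fail_request;

/* Stub standing in for request_irq(). */
static int stub_request_irq(struct model_dev *d)
{
    if (fail_request)
        return -EBUSY;  /* assumed error code for the model */
    d->irq_held = 1;
    return 0;
}

/* Stub standing in for free_irq(); the handler must be registered. */
static void stub_free_irq(struct model_dev *d)
{
    assert(d->irq_held);
    d->irq_held = 0;
}

/* Suspend: detach the device and release the IRQ. */
static int model_suspend(struct model_dev *d)
{
    if (!d->running)
        return 0;
    d->attached = 0;
    stub_free_irq(d);
    return 0;
}

/* Resume: re-request the IRQ; on failure leave the device detached
 * and hand the error back to the caller. */
static int model_resume(struct model_dev *d)
{
    int err;

    if (!d->running)
        return 0;
    err = stub_request_irq(d);
    if (err) {
        fprintf(stderr, "irq %d: request_irq failed (%d)\n", d->irq, err);
        return err;
    }
    d->attached = 1;
    return 0;
}
```

Whether a returned error is acted on depends on the caller; the model
only shows the driver side keeping its own state consistent when the
request fails.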



Fix suspend/resume problems with b44

2005-03-08 Thread Pavel Machek
Hi!

This should fix problems people have with b44 during
suspend/resume. Please apply,
Pavel

--- clean/drivers/net/b44.c	2004-12-25 13:35:00.000000000 +0100
+++ linux/drivers/net/b44.c	2005-01-19 11:59:12.000000000 +0100
@@ -1921,6 +1921,8 @@
 	b44_free_rings(bp);
 
 	spin_unlock_irq(&bp->lock);
+
+	free_irq(dev->irq, dev);
 	return 0;
 }
 
@@ -1934,6 +1936,9 @@
 	if (!netif_running(dev))
 		return 0;
 
+	if (request_irq(dev->irq, b44_interrupt, SA_SHIRQ, dev->name, dev))
+		printk(KERN_ERR PFX "%s: request_irq failed\n", dev->name);
+
 	spin_lock_irq(&bp->lock);
 
 	b44_init_rings(bp);

-- 
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!

