Re: tsc breaks atkbd suspend
On Tue, 19 Feb 2008, Len Brown wrote: > On Tuesday 19 February 2008 11:51, Thomas Gleixner wrote: > > On Tue, 19 Feb 2008, Ingo Molnar wrote: > > > * Pavel Machek <[EMAIL PROTECTED]> wrote: > > > > > > > TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC > > > > means _require_ TSC), but it is not properly disabled when it is > > > > unusable, because acpi code understood the config switch as "may use > > > > TSC". > > > > > > > > This actually fixes suspend problems on my x60. > > > > > > ah! This makes tons of sense. I've applied your patch > > please do not. It's not in the -mm branch. > > > - but i guess it should go via the ACPI tree. > > yes. > > > Right. The breakage was introduced there when IA64 switched to > > GENERIC_TIME and the X86_TSC dependency was added. > > so do we need a patch for 2.6.23.stable and 2.6.24.stable? Yes. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Tuesday 19 February 2008 11:51, Thomas Gleixner wrote: > On Tue, 19 Feb 2008, Ingo Molnar wrote: > > * Pavel Machek <[EMAIL PROTECTED]> wrote: > > > > > TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC > > > means _require_ TSC), but it is not properly disabled when it is > > > unusable, because acpi code understood the config switch as "may use > > > TSC". > > > > > > This actually fixes suspend problems on my x60. > > > > ah! This makes tons of sense. I've applied your patch please do not. > > - but i guess it should go via the ACPI tree. yes. > Right. The breakage was introduced there when IA64 switched to > GENERIC_TIME and the X86_TSC dependency was added. so do we need a patch for 2.6.23.stable and 2.6.24.stable? thanks, -Len -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Tue, 19 Feb 2008, Ingo Molnar wrote: > * Pavel Machek <[EMAIL PROTECTED]> wrote: > > > TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC > > means _require_ TSC), but it is not properly disabled when it is > > unusable, because acpi code understood the config switch as "may use > > TSC". > > > > This actually fixes suspend problems on my x60. > > ah! This makes tons of sense. I've applied your patch - but i guess it > should go via the ACPI tree. Right. The breakage was introduced there when IA64 switched to GENERIC_TIME and the X86_TSC dependency was added. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
* Pavel Machek <[EMAIL PROTECTED]> wrote: > TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC > means _require_ TSC), but it is not properly disabled when it is > unusable, because acpi code understood the config switch as "may use > TSC". > > This actually fixes suspend problems on my x60. ah! This makes tons of sense. I've applied your patch - but i guess it should go via the ACPI tree. Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon 2008-02-18 16:12:40, Thomas Gleixner wrote: > On Mon, 18 Feb 2008, Pavel Machek wrote: > > > I do not understand, what you mean. When exactly is "sleeppy test" > > > running ? > > > > from late_initcall(). > > Has the system already switched to highres/nohz mode at this point ? I guess so, as nohz=off makes the problem go away. > > > Also the TSC unstable detection is in a 500ms timeframe, so you should > > > never get a 10s delay. > > > > I'd not expect TSC instability detection to be ran while system is > > suspending > > There is nothing which switches it off, except right at the point > where the machine is shutdown. It's definitely active during the > freeze period. > > > , plus "wrong for 500msec" is still wrong. > > To be wrong for 500ms, the TSC must freeze completely. ...which is still very possible with C3, and being wrong for <500msec is still wrong. > > > > I believe fix is very simple: > > > > > > NAK. > > > > > > This kills TSC on machines which have a working TSC and never go into > > > suspend. > > > > Apart from OLPC, which machines have working TSC? > > > > Anything that has cpufreq does not. > > Wrong. Newer CPUs have TSCs which are not affected by cpu frequency. Ok, so whitelist the newer CPUs. > > Anything that enters C2 does not. > > Wrong. C3 stops the TSC everywhere. C2 should not, but it does > unfortunately due to stupid BIOS implementations and on a lot of AMD > CPUs. Hmm, and C3 is rather common, right? > > Anything that is SMP does not. > > Wrong. I'm writing this mail from a SMP machine with perfectly > sychronized TSCs. While I'm writing this from thinkpad X60: checking TSC synchronization [CPU#0 -> CPU#1]: Measured 580063 cycles TSC warp between CPUs, turning off TSC clock. Marking TSC unstable due to: check_tsc_sync_source failed. > > So, how many machines do you have with working TSC? > > At least 5. Out of? I do not see what we are arguing about. TSC was not originally designed to provide time, and it does not provide time at least on machines with C3. We are using it by default, which is just wrong. If it works for some systems, they should be whitelisted. > One sensible solution would be to force TSC unstable before we > suspend, but I'd like to know why that 10sec delay happens at all. > > Can you please provide a dmesg log ? (Detailed log in attachment). ... PM: Adding info for No Bus:cpu_dma_latency PM: Adding info for No Bus:network_latency PM: Adding info for No Bus:network_throughput PM: test RTC wakeup from 'mem' suspend PM: Syncing filesystems ... done. PM: Preparing system for mem sleep Freezing user space processes ... (elapsed 0.00 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done. PM: Entering mem sleep ACPI: Preparing to enter system sleep state S3 coretemp coretemp.0: suspend serio serio1: suspend serio_suspend: starting serio_suspend: done atkbd serio0: suspend serio_suspend: starting ps2_command enter ps2_command get mutex ps2_command pause rx ps2_command synaptics ps2_command sending bytes ps2_command waiting event: 1000 wait_event returned: 998 ps2_command adjust timeout: 25 ~~ # here it hangs, for ~30 seconds, while it should be sleeping for 100msec. # ps2_command receiving ps2_command mostly done serio_suspend: done i8042 i8042: suspend i8042_suspend: starting i8042_suspend: done sd 0:0:0:0: suspend sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Stopping disk sd 0:0:0:0: suspended ahci: autosuspend disabled Here's what I used to get those debug prints: diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c index 2763394..8115893 100644 --- a/drivers/input/serio/i8042.c +++ b/drivers/input/serio/i8042.c @@ -895,12 +895,14 @@ #ifdef CONFIG_PM static int i8042_suspend(struct platform_device *dev, pm_message_t state) { + printk("i8042_suspend: starting\n"); if (dev->dev.power.power_state.event != state.event) { if (state.event == PM_EVENT_SUSPEND) i8042_controller_reset(); dev->dev.power.power_state = state; } + printk("i8042_suspend: done\n"); return 0; } diff --git a/drivers/input/serio/libps2.c b/drivers/input/serio/libps2.c index b819239..81bd843 100644 --- a/drivers/input/serio/libps2.c +++ b/drivers/input/serio/libps2.c @@ -178,6 +178,7 @@ int ps2_command(struct ps2dev *ps2dev, u int rc = -1; int i; + printk("ps2_command enter\n"); if (receive > sizeof(ps2dev->cmdbuf)) { WARN_ON(1); return -1; @@ -188,8 +189,10 @@ int ps2_command(struct ps2dev *ps2dev, u return -1; } + printk("ps2_command get mutex\n"); mutex_lock(>cmd_mutex); + printk("ps2_command pause rx\n"); serio_pause_rx(ps2dev->serio); ps2dev->flags = command == PS2_CMD_GETID ? PS2_FLAG_WAITID : 0; ps2dev->cmdcnt = receive; @@
Re: tsc breaks atkbd suspend
Hi! This is wrong: #ifdef CONFIG_X86_TSC static int __init tsc_setup(char *str) { printk(KERN_WARNING "notsc: Kernel compiled with CONFIG_X86_TSC, " "cannot disable TSC.\n"); return 1; } #else /* * disable flag for tsc. Takes effect by clearing the TSC cpu flag * in cpu/common.c */ static int __init tsc_setup(char *str) { setup_clear_cpu_cap(X86_FEATURE_TSC); return 1; } #endif ...even if kernel is compiled with X86_TSC, we do disable it in case it is unstable: #if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) /* TSC halts in C2, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C2)) mark_tsc_unstable("possible TSC halt in C2"); #endif ...so the commandline option should work, too. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as "may use TSC". This actually fixes suspend problems on my x60. Signed-off-by: Pavel Machek <[EMAIL PROTECTED]> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 980e1c3..6f3b217 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -364,7 +364,7 @@ int acpi_processor_resume(struct acpi_de return 0; } -#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86) static int tsc_halts_in_c(int state) { switch (boot_cpu_data.x86_vendor) { @@ -544,7 +544,7 @@ #endif /* Get end time (ticks) */ t2 = inl(acpi_gbl_FADT.xpm_timer_block.address); -#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86) /* TSC halts in C2, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C2)) mark_tsc_unstable("possible TSC halt in C2"); @@ -609,7 +609,7 @@ #endif acpi_set_register(ACPI_BITREG_ARB_DISABLE, 0); } -#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86) /* TSC halts in C3, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C3)) mark_tsc_unstable("TSC halts in C3"); @@ -1500,7 +1500,7 @@ static int acpi_idle_enter_simple(struct acpi_idle_do_entry(cx); t2 = inl(acpi_gbl_FADT.xpm_timer_block.address); -#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86) /* TSC could halt in idle, so notify users */ if (tsc_halts_in_c(cx->type)) mark_tsc_unstable("TSC halts in idle");; @@ -1614,7 +1614,7 @@ static int acpi_idle_enter_bm(struct cpu spin_unlock(_lock); } -#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) && defined (CONFIG_X86) /* TSC could halt in idle, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C3)) mark_tsc_unstable("TSC halts in idle"); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon 2008-02-18 16:12:40, Thomas Gleixner wrote: On Mon, 18 Feb 2008, Pavel Machek wrote: I do not understand, what you mean. When exactly is sleeppy test running ? from late_initcall(). Has the system already switched to highres/nohz mode at this point ? I guess so, as nohz=off makes the problem go away. Also the TSC unstable detection is in a 500ms timeframe, so you should never get a 10s delay. I'd not expect TSC instability detection to be ran while system is suspending There is nothing which switches it off, except right at the point where the machine is shutdown. It's definitely active during the freeze period. , plus wrong for 500msec is still wrong. To be wrong for 500ms, the TSC must freeze completely. ...which is still very possible with C3, and being wrong for 500msec is still wrong. I believe fix is very simple: NAK. This kills TSC on machines which have a working TSC and never go into suspend. Apart from OLPC, which machines have working TSC? Anything that has cpufreq does not. Wrong. Newer CPUs have TSCs which are not affected by cpu frequency. Ok, so whitelist the newer CPUs. Anything that enters C2 does not. Wrong. C3 stops the TSC everywhere. C2 should not, but it does unfortunately due to stupid BIOS implementations and on a lot of AMD CPUs. Hmm, and C3 is rather common, right? Anything that is SMP does not. Wrong. I'm writing this mail from a SMP machine with perfectly sychronized TSCs. While I'm writing this from thinkpad X60: checking TSC synchronization [CPU#0 - CPU#1]: Measured 580063 cycles TSC warp between CPUs, turning off TSC clock. Marking TSC unstable due to: check_tsc_sync_source failed. So, how many machines do you have with working TSC? At least 5. Out of? I do not see what we are arguing about. TSC was not originally designed to provide time, and it does not provide time at least on machines with C3. We are using it by default, which is just wrong. If it works for some systems, they should be whitelisted. One sensible solution would be to force TSC unstable before we suspend, but I'd like to know why that 10sec delay happens at all. Can you please provide a dmesg log ? (Detailed log in attachment). ... PM: Adding info for No Bus:cpu_dma_latency PM: Adding info for No Bus:network_latency PM: Adding info for No Bus:network_throughput PM: test RTC wakeup from 'mem' suspend PM: Syncing filesystems ... done. PM: Preparing system for mem sleep Freezing user space processes ... (elapsed 0.00 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done. PM: Entering mem sleep ACPI: Preparing to enter system sleep state S3 coretemp coretemp.0: suspend serio serio1: suspend serio_suspend: starting serio_suspend: done atkbd serio0: suspend serio_suspend: starting ps2_command enter ps2_command get mutex ps2_command pause rx ps2_command synaptics ps2_command sending bytes ps2_command waiting event: 1000 wait_event returned: 998 ps2_command adjust timeout: 25 ~~ # here it hangs, for ~30 seconds, while it should be sleeping for 100msec. # ps2_command receiving ps2_command mostly done serio_suspend: done i8042 i8042: suspend i8042_suspend: starting i8042_suspend: done sd 0:0:0:0: suspend sd 0:0:0:0: [sda] Synchronizing SCSI cache sd 0:0:0:0: [sda] Stopping disk sd 0:0:0:0: suspended ahci: autosuspend disabled Here's what I used to get those debug prints: diff --git a/drivers/input/serio/i8042.c b/drivers/input/serio/i8042.c index 2763394..8115893 100644 --- a/drivers/input/serio/i8042.c +++ b/drivers/input/serio/i8042.c @@ -895,12 +895,14 @@ #ifdef CONFIG_PM static int i8042_suspend(struct platform_device *dev, pm_message_t state) { + printk(i8042_suspend: starting\n); if (dev-dev.power.power_state.event != state.event) { if (state.event == PM_EVENT_SUSPEND) i8042_controller_reset(); dev-dev.power.power_state = state; } + printk(i8042_suspend: done\n); return 0; } diff --git a/drivers/input/serio/libps2.c b/drivers/input/serio/libps2.c index b819239..81bd843 100644 --- a/drivers/input/serio/libps2.c +++ b/drivers/input/serio/libps2.c @@ -178,6 +178,7 @@ int ps2_command(struct ps2dev *ps2dev, u int rc = -1; int i; + printk(ps2_command enter\n); if (receive sizeof(ps2dev-cmdbuf)) { WARN_ON(1); return -1; @@ -188,8 +189,10 @@ int ps2_command(struct ps2dev *ps2dev, u return -1; } + printk(ps2_command get mutex\n); mutex_lock(ps2dev-cmd_mutex); + printk(ps2_command pause rx\n); serio_pause_rx(ps2dev-serio); ps2dev-flags = command == PS2_CMD_GETID ? PS2_FLAG_WAITID : 0; ps2dev-cmdcnt = receive; @@ -198,6 +201,7 @@ int ps2_command(struct ps2dev *ps2dev, u
Re: tsc breaks atkbd suspend
TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as may use TSC. This actually fixes suspend problems on my x60. Signed-off-by: Pavel Machek [EMAIL PROTECTED] diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 980e1c3..6f3b217 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -364,7 +364,7 @@ int acpi_processor_resume(struct acpi_de return 0; } -#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86) static int tsc_halts_in_c(int state) { switch (boot_cpu_data.x86_vendor) { @@ -544,7 +544,7 @@ #endif /* Get end time (ticks) */ t2 = inl(acpi_gbl_FADT.xpm_timer_block.address); -#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86) /* TSC halts in C2, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C2)) mark_tsc_unstable(possible TSC halt in C2); @@ -609,7 +609,7 @@ #endif acpi_set_register(ACPI_BITREG_ARB_DISABLE, 0); } -#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86) /* TSC halts in C3, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C3)) mark_tsc_unstable(TSC halts in C3); @@ -1500,7 +1500,7 @@ static int acpi_idle_enter_simple(struct acpi_idle_do_entry(cx); t2 = inl(acpi_gbl_FADT.xpm_timer_block.address); -#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86) /* TSC could halt in idle, so notify users */ if (tsc_halts_in_c(cx-type)) mark_tsc_unstable(TSC halts in idle);; @@ -1614,7 +1614,7 @@ static int acpi_idle_enter_bm(struct cpu spin_unlock(c3_lock); } -#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) +#if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86) /* TSC could halt in idle, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C3)) mark_tsc_unstable(TSC halts in idle); -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
Hi! This is wrong: #ifdef CONFIG_X86_TSC static int __init tsc_setup(char *str) { printk(KERN_WARNING notsc: Kernel compiled with CONFIG_X86_TSC, cannot disable TSC.\n); return 1; } #else /* * disable flag for tsc. Takes effect by clearing the TSC cpu flag * in cpu/common.c */ static int __init tsc_setup(char *str) { setup_clear_cpu_cap(X86_FEATURE_TSC); return 1; } #endif ...even if kernel is compiled with X86_TSC, we do disable it in case it is unstable: #if defined (CONFIG_GENERIC_TIME) defined (CONFIG_X86_TSC) /* TSC halts in C2, so notify users */ if (tsc_halts_in_c(ACPI_STATE_C2)) mark_tsc_unstable(possible TSC halt in C2); #endif ...so the commandline option should work, too. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
* Pavel Machek [EMAIL PROTECTED] wrote: TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as may use TSC. This actually fixes suspend problems on my x60. ah! This makes tons of sense. I've applied your patch - but i guess it should go via the ACPI tree. Ingo -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Tue, 19 Feb 2008, Ingo Molnar wrote: * Pavel Machek [EMAIL PROTECTED] wrote: TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as may use TSC. This actually fixes suspend problems on my x60. ah! This makes tons of sense. I've applied your patch - but i guess it should go via the ACPI tree. Right. The breakage was introduced there when IA64 switched to GENERIC_TIME and the X86_TSC dependency was added. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Tuesday 19 February 2008 11:51, Thomas Gleixner wrote: On Tue, 19 Feb 2008, Ingo Molnar wrote: * Pavel Machek [EMAIL PROTECTED] wrote: TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as may use TSC. This actually fixes suspend problems on my x60. ah! This makes tons of sense. I've applied your patch please do not. - but i guess it should go via the ACPI tree. yes. Right. The breakage was introduced there when IA64 switched to GENERIC_TIME and the X86_TSC dependency was added. so do we need a patch for 2.6.23.stable and 2.6.24.stable? thanks, -Len -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Tue, 19 Feb 2008, Len Brown wrote: On Tuesday 19 February 2008 11:51, Thomas Gleixner wrote: On Tue, 19 Feb 2008, Ingo Molnar wrote: * Pavel Machek [EMAIL PROTECTED] wrote: TSC is used even on machines when CONFIG_X86_TSC is not set (X86_TSC means _require_ TSC), but it is not properly disabled when it is unusable, because acpi code understood the config switch as may use TSC. This actually fixes suspend problems on my x60. ah! This makes tons of sense. I've applied your patch please do not. It's not in the -mm branch. - but i guess it should go via the ACPI tree. yes. Right. The breakage was introduced there when IA64 switched to GENERIC_TIME and the X86_TSC dependency was added. so do we need a patch for 2.6.23.stable and 2.6.24.stable? Yes. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon, 18 Feb 2008, Pavel Machek wrote: > > > ...and I get hang after "ps2_command adjust timeout" for 10 seconds, > > > while it should wait 10msec or so. > > > > When is this code called ? > > from serio_suspend. It is normal device, not a sysdev, AFAICT. > > > > nohz=off fixes that. > > > > > > notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is > > > normally marked unstable very shortly after boot, so only sleepy test > > > can trigger this. > > > > I do not understand, what you mean. When exactly is "sleeppy test" > > running ? > > from late_initcall(). Has the system already switched to highres/nohz mode at this point ? > > Also the TSC unstable detection is in a 500ms timeframe, so you should > > never get a 10s delay. > > I'd not expect TSC instability detection to be ran while system is > suspending There is nothing which switches it off, except right at the point where the machine is shutdown. It's definitely active during the freeze period. > , plus "wrong for 500msec" is still wrong. To be wrong for 500ms, the TSC must freeze completely. > Plus TSC unstable detection is heuristic that can easily go wrong, right? It should detect it, but I have to check, why it does not work in that particular case. > > > I believe fix is very simple: > > > > NAK. > > > > This kills TSC on machines which have a working TSC and never go into > > suspend. > > Apart from OLPC, which machines have working TSC? > > Anything that has cpufreq does not. Wrong. Newer CPUs have TSCs which are not affected by cpu frequency. > Anything that enters C2 does not. Wrong. C3 stops the TSC everywhere. C2 should not, but it does unfortunately due to stupid BIOS implementations and on a lot of AMD CPUs. > Anything that is SMP does not. Wrong. I'm writing this mail from a SMP machine with perfectly sychronized TSCs. > So, how many machines do you have with working TSC? At least 5. One sensible solution would be to force TSC unstable before we suspend, but I'd like to know why that 10sec delay happens at all. Can you please provide a dmesg log ? Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
Hi! > > I'm trying to use the "sleepy test" here, unfortunately it locks for > > 10-or-so seconds. > > > > Problem is in wait_event_timeout: timeouts take about 100x as long as > > they should. Code in drivers/input/serio/libps2.c: > > > > + printk("ps2_command waiting event: %d\n", timeout); > > timeout = wait_event_timeout(ps2dev->wait, > > !(ps2dev->flags & > > PS2_FLAG_CMD1),timeout); > > > > if (ps2dev->cmdcnt && timeout > 0) { > > + printk("wait_event returned: %d\n", timeout); > > > > timeout = ps2_adjust_timeout(ps2dev, command, timeout); > > + > > + printk("ps2_command adjust timeout: %d\n", timeout); > > wait_event_timeout(ps2dev->wait, > >!(ps2dev->flags & PS2_FLAG_CMD), > > timeout); > > } > > > > + printk("ps2_command receiving\n"); > > > > > > ...and I get hang after "ps2_command adjust timeout" for 10 seconds, > > while it should wait 10msec or so. > > When is this code called ? from serio_suspend. It is normal device, not a sysdev, AFAICT. > > nohz=off fixes that. > > > > notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is > > normally marked unstable very shortly after boot, so only sleepy test > > can trigger this. > > I do not understand, what you mean. When exactly is "sleeppy test" > running ? from late_initcall(). > Also the TSC unstable detection is in a 500ms timeframe, so you should > never get a 10s delay. I'd not expect TSC instability detection to be ran while system is suspending, plus "wrong for 500msec" is still wrong. Plus TSC unstable detection is heuristic that can easily go wrong, right? > > I believe fix is very simple: > > NAK. > > This kills TSC on machines which have a working TSC and never go into > suspend. Apart from OLPC, which machines have working TSC? Anything that has cpufreq does not. Anything that enters C2 does not. Anything that is SMP does not. So, how many machines do you have with working TSC? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon, 18 Feb 2008, Pavel Machek wrote: > Hi! > > I'm trying to use the "sleepy test" here, unfortunately it locks for > 10-or-so seconds. > > Problem is in wait_event_timeout: timeouts take about 100x as long as > they should. Code in drivers/input/serio/libps2.c: > > + printk("ps2_command waiting event: %d\n", timeout); > timeout = wait_event_timeout(ps2dev->wait, > !(ps2dev->flags & > PS2_FLAG_CMD1),timeout); > > if (ps2dev->cmdcnt && timeout > 0) { > + printk("wait_event returned: %d\n", timeout); > > timeout = ps2_adjust_timeout(ps2dev, command, timeout); > + > + printk("ps2_command adjust timeout: %d\n", timeout); > wait_event_timeout(ps2dev->wait, >!(ps2dev->flags & PS2_FLAG_CMD), timeout); > } > > + printk("ps2_command receiving\n"); > > > ...and I get hang after "ps2_command adjust timeout" for 10 seconds, > while it should wait 10msec or so. When is this code called ? > I even tried adding: > > + printk("ps2: testing timeouts\n"); > + timeout = wait_event_timeout(ps2dev->wait, 0, 10); > + printk("ps2: testing timeouts\n"); > + timeout = wait_event_timeout(ps2dev->wait, 0, 10); > + printk("ps2: testing timeouts\n"); > + timeout = wait_event_timeout(ps2dev->wait, 0, 10); > + printk("ps2: testing timeouts\n"); > + timeout = wait_event_timeout(ps2dev->wait, 0, 10); > + printk("ps2: testing timeouts\n"); > + timeout = wait_event_timeout(ps2dev->wait, 0, 10); > + printk("ps2: timeouts ok?\n"); > > before that, and yes, those wait too long, too... (but only during > suspend, they work ok during boot). Again, at which point during suspend is this ? > nohz=off fixes that. > > notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is > normally marked unstable very shortly after boot, so only sleepy test > can trigger this. I do not understand, what you mean. When exactly is "sleeppy test" running ? Also the TSC unstable detection is in a 500ms timeframe, so you should never get a 10s delay. > I believe it is very bad idea to use tsc, it does not work on 90%+ of > machines. Yes, we do detect it is broken during runtime, but that's > too late. Why is it too late ? Can you please describe in detail ? > I believe fix is very simple: NAK. This kills TSC on machines which have a working TSC and never go into suspend. tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
tsc breaks atkbd suspend
Hi! I'm trying to use the "sleepy test" here, unfortunately it locks for 10-or-so seconds. Problem is in wait_event_timeout: timeouts take about 100x as long as they should. Code in drivers/input/serio/libps2.c: + printk("ps2_command waiting event: %d\n", timeout); timeout = wait_event_timeout(ps2dev->wait, !(ps2dev->flags & PS2_FLAG_CMD1),timeout); if (ps2dev->cmdcnt && timeout > 0) { + printk("wait_event returned: %d\n", timeout); timeout = ps2_adjust_timeout(ps2dev, command, timeout); + + printk("ps2_command adjust timeout: %d\n", timeout); wait_event_timeout(ps2dev->wait, !(ps2dev->flags & PS2_FLAG_CMD), timeout); } + printk("ps2_command receiving\n"); ...and I get hang after "ps2_command adjust timeout" for 10 seconds, while it should wait 10msec or so. I even tried adding: + printk("ps2: testing timeouts\n"); + timeout = wait_event_timeout(ps2dev->wait, 0, 10); + printk("ps2: testing timeouts\n"); + timeout = wait_event_timeout(ps2dev->wait, 0, 10); + printk("ps2: testing timeouts\n"); + timeout = wait_event_timeout(ps2dev->wait, 0, 10); + printk("ps2: testing timeouts\n"); + timeout = wait_event_timeout(ps2dev->wait, 0, 10); + printk("ps2: testing timeouts\n"); + timeout = wait_event_timeout(ps2dev->wait, 0, 10); + printk("ps2: timeouts ok?\n"); before that, and yes, those wait too long, too... (but only during suspend, they work ok during boot). nohz=off fixes that. notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is normally marked unstable very shortly after boot, so only sleepy test can trigger this. I believe it is very bad idea to use tsc, it does not work on 90%+ of machines. Yes, we do detect it is broken during runtime, but that's too late. I believe fix is very simple: Signed-off-by: Pavel Machek <[EMAIL PROTECTED]> Pavel diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c index 43517e3..fafd9dc 100644 --- a/arch/x86/kernel/tsc_32.c +++ b/arch/x86/kernel/tsc_32.c @@ -298,7 +298,7 @@ static cycle_t read_tsc(void) static struct clocksource clocksource_tsc = { .name = "tsc", - .rating = 300, + .rating = 0, .read = read_tsc, .mask = CLOCKSOURCE_MASK(64), .mult = 0, /* to be set */ diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c index 947554d..b0148d3 100644 --- a/arch/x86/kernel/tsc_64.c +++ b/arch/x86/kernel/tsc_64.c @@ -306,7 +306,7 @@ static cycle_t __vsyscall_fn vread_tsc(v static struct clocksource clocksource_tsc = { .name = "tsc", - .rating = 300, + .rating = 0, .read = read_tsc, .mask = CLOCKSOURCE_MASK(64), .shift = 22, -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
tsc breaks atkbd suspend
Hi! I'm trying to use the sleepy test here, unfortunately it locks for 10-or-so seconds. Problem is in wait_event_timeout: timeouts take about 100x as long as they should. Code in drivers/input/serio/libps2.c: + printk(ps2_command waiting event: %d\n, timeout); timeout = wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD1),timeout); if (ps2dev-cmdcnt timeout 0) { + printk(wait_event returned: %d\n, timeout); timeout = ps2_adjust_timeout(ps2dev, command, timeout); + + printk(ps2_command adjust timeout: %d\n, timeout); wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD), timeout); } + printk(ps2_command receiving\n); ...and I get hang after ps2_command adjust timeout for 10 seconds, while it should wait 10msec or so. I even tried adding: + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: timeouts ok?\n); before that, and yes, those wait too long, too... (but only during suspend, they work ok during boot). nohz=off fixes that. notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is normally marked unstable very shortly after boot, so only sleepy test can trigger this. I believe it is very bad idea to use tsc, it does not work on 90%+ of machines. Yes, we do detect it is broken during runtime, but that's too late. I believe fix is very simple: Signed-off-by: Pavel Machek [EMAIL PROTECTED] Pavel diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c index 43517e3..fafd9dc 100644 --- a/arch/x86/kernel/tsc_32.c +++ b/arch/x86/kernel/tsc_32.c @@ -298,7 +298,7 @@ static cycle_t read_tsc(void) static struct clocksource clocksource_tsc = { .name = tsc, - .rating = 300, + .rating = 0, .read = read_tsc, .mask = CLOCKSOURCE_MASK(64), .mult = 0, /* to be set */ diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c index 947554d..b0148d3 100644 --- a/arch/x86/kernel/tsc_64.c +++ b/arch/x86/kernel/tsc_64.c @@ -306,7 +306,7 @@ static cycle_t __vsyscall_fn vread_tsc(v static struct clocksource clocksource_tsc = { .name = tsc, - .rating = 300, + .rating = 0, .read = read_tsc, .mask = CLOCKSOURCE_MASK(64), .shift = 22, -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
Hi! I'm trying to use the sleepy test here, unfortunately it locks for 10-or-so seconds. Problem is in wait_event_timeout: timeouts take about 100x as long as they should. Code in drivers/input/serio/libps2.c: + printk(ps2_command waiting event: %d\n, timeout); timeout = wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD1),timeout); if (ps2dev-cmdcnt timeout 0) { + printk(wait_event returned: %d\n, timeout); timeout = ps2_adjust_timeout(ps2dev, command, timeout); + + printk(ps2_command adjust timeout: %d\n, timeout); wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD), timeout); } + printk(ps2_command receiving\n); ...and I get hang after ps2_command adjust timeout for 10 seconds, while it should wait 10msec or so. When is this code called ? from serio_suspend. It is normal device, not a sysdev, AFAICT. nohz=off fixes that. notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is normally marked unstable very shortly after boot, so only sleepy test can trigger this. I do not understand, what you mean. When exactly is sleeppy test running ? from late_initcall(). Also the TSC unstable detection is in a 500ms timeframe, so you should never get a 10s delay. I'd not expect TSC instability detection to be ran while system is suspending, plus wrong for 500msec is still wrong. Plus TSC unstable detection is heuristic that can easily go wrong, right? I believe fix is very simple: NAK. This kills TSC on machines which have a working TSC and never go into suspend. Apart from OLPC, which machines have working TSC? Anything that has cpufreq does not. Anything that enters C2 does not. Anything that is SMP does not. So, how many machines do you have with working TSC? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon, 18 Feb 2008, Pavel Machek wrote: Hi! I'm trying to use the sleepy test here, unfortunately it locks for 10-or-so seconds. Problem is in wait_event_timeout: timeouts take about 100x as long as they should. Code in drivers/input/serio/libps2.c: + printk(ps2_command waiting event: %d\n, timeout); timeout = wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD1),timeout); if (ps2dev-cmdcnt timeout 0) { + printk(wait_event returned: %d\n, timeout); timeout = ps2_adjust_timeout(ps2dev, command, timeout); + + printk(ps2_command adjust timeout: %d\n, timeout); wait_event_timeout(ps2dev-wait, !(ps2dev-flags PS2_FLAG_CMD), timeout); } + printk(ps2_command receiving\n); ...and I get hang after ps2_command adjust timeout for 10 seconds, while it should wait 10msec or so. When is this code called ? I even tried adding: + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: testing timeouts\n); + timeout = wait_event_timeout(ps2dev-wait, 0, 10); + printk(ps2: timeouts ok?\n); before that, and yes, those wait too long, too... (but only during suspend, they work ok during boot). Again, at which point during suspend is this ? nohz=off fixes that. notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is normally marked unstable very shortly after boot, so only sleepy test can trigger this. I do not understand, what you mean. When exactly is sleeppy test running ? Also the TSC unstable detection is in a 500ms timeframe, so you should never get a 10s delay. I believe it is very bad idea to use tsc, it does not work on 90%+ of machines. Yes, we do detect it is broken during runtime, but that's too late. Why is it too late ? Can you please describe in detail ? I believe fix is very simple: NAK. This kills TSC on machines which have a working TSC and never go into suspend. tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: tsc breaks atkbd suspend
On Mon, 18 Feb 2008, Pavel Machek wrote: ...and I get hang after ps2_command adjust timeout for 10 seconds, while it should wait 10msec or so. When is this code called ? from serio_suspend. It is normal device, not a sysdev, AFAICT. nohz=off fixes that. notsc fixes that, too... On my system (thinkpad x60 in UP mode) tsc is normally marked unstable very shortly after boot, so only sleepy test can trigger this. I do not understand, what you mean. When exactly is sleeppy test running ? from late_initcall(). Has the system already switched to highres/nohz mode at this point ? Also the TSC unstable detection is in a 500ms timeframe, so you should never get a 10s delay. I'd not expect TSC instability detection to be ran while system is suspending There is nothing which switches it off, except right at the point where the machine is shutdown. It's definitely active during the freeze period. , plus wrong for 500msec is still wrong. To be wrong for 500ms, the TSC must freeze completely. Plus TSC unstable detection is heuristic that can easily go wrong, right? It should detect it, but I have to check, why it does not work in that particular case. I believe fix is very simple: NAK. This kills TSC on machines which have a working TSC and never go into suspend. Apart from OLPC, which machines have working TSC? Anything that has cpufreq does not. Wrong. Newer CPUs have TSCs which are not affected by cpu frequency. Anything that enters C2 does not. Wrong. C3 stops the TSC everywhere. C2 should not, but it does unfortunately due to stupid BIOS implementations and on a lot of AMD CPUs. Anything that is SMP does not. Wrong. I'm writing this mail from a SMP machine with perfectly sychronized TSCs. So, how many machines do you have with working TSC? At least 5. One sensible solution would be to force TSC unstable before we suspend, but I'd like to know why that 10sec delay happens at all. Can you please provide a dmesg log ? Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/