Re: How to get deeper C states working?
Hi,

FreeBSD reports ACPI C-states, while Linux reports CPU C-states. The mapping from one to the other is controlled by the BIOS and is not exposed to the OS. It is quite likely that ACPI C2 means CPU C3, and ACPI C3 means CPU C6/C7. When you plug in the AC adapter, the BIOS likely hides the ACPI C3 state from the OS, since it does not make much sense to save that little energy considering the potential performance loss.

On 29.10.2022 11:52, Lester wrote:

Hi, I'm using FreeBSD 13.1 on a Thinkpad T420 and noticed:
1) with AC plugged in, only C1 and C2 are recognized;
2) on battery only, I get C1, C2 and C3.

I also have Debian Linux installed on the same machine, under which I can get C6 and C7 too (I noticed there's an ssdt6 for Cpu0Cst which defines all the C states). I was wondering if Debian has some SSDT override that provides the additional states? From reading FreeBSD's acpi doc, I got the sense that I can override the DSDT, but I don't know what I need to change, or how to get all the override files combined into a single aml file...

Questions:
1) How can I get C3 working on AC?
2) How can I get C6 and C7 working too?

I'm sharing my acpidump results in this folder: https://drive.google.com/drive/folders/1q0pY_2fO96RcQCN929sLLtYPpiokVTC3?usp=sharing

Many thanks!
== AC
hw.acpi.cpu.cx_lowest: C8
dev.cpu.1.cx_method: C1/hlt C2/io
dev.cpu.1.cx_usage_counters: 124 817
dev.cpu.1.cx_usage: 13.17% 86.82% last 54us
dev.cpu.1.cx_lowest: C8
dev.cpu.1.cx_supported: C1/1/1 C2/3/104
dev.cpu.0.cx_method: C1/hlt C2/io
dev.cpu.0.cx_usage_counters: 70 520
dev.cpu.0.cx_usage: 11.86% 88.13% last 5508us
dev.cpu.0.cx_lowest: C8
dev.cpu.0.cx_supported: C1/1/1 C2/3/104

== Battery
hw.acpi.cpu.cx_lowest: C8
dev.cpu.1.cx_method: C1/hlt C2/io C3/io
dev.cpu.1.cx_usage_counters: 1946 106 11173
dev.cpu.1.cx_usage: 14.71% 0.80% 84.48% last 85us
dev.cpu.1.cx_lowest: C8
dev.cpu.1.cx_supported: C1/1/1 C2/2/80 C3/3/109
dev.cpu.0.cx_method: C1/hlt C2/io C3/io
dev.cpu.0.cx_usage_counters: 1767 105 7127
dev.cpu.0.cx_usage: 19.63% 1.16% 79.19% last 15us
dev.cpu.0.cx_lowest: C8
dev.cpu.0.cx_supported: C1/1/1 C2/2/80 C3/3/109

== Linux cpupower idle-info
CPUidle driver: intel_idle
CPUidle governor: menu
analyzing CPU 0:
Number of idle states: 6
Available idle states: POLL C1 C1E C3 C6 C7
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 16099
Duration: 264781
C1:
Flags/Description: MWAIT 0x00
Latency: 2
Usage: 7103
Duration: 1039428
C1E:
Flags/Description: MWAIT 0x01
Latency: 10
Usage: 30433
Duration: 6118359
C3:
Flags/Description: MWAIT 0x10
Latency: 80
Usage: 11891
Duration: 4311399
C6:
Flags/Description: MWAIT 0x20
Latency: 104
Usage: 77
Duration: 26683
C7:
Flags/Description: MWAIT 0x30
Latency: 109
Usage: 157291
Duration: 433120357

-- Alexander Motin
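
Since the BIOS mapping is not exposed, one rough way to guess it is to compare the latencies in dev.cpu.N.cx_supported with the latencies Linux prints. The sketch below does only that string parsing and lookup. It assumes each cx_supported entry reads "C<index>/<type>/<latency>", and the function names and the Linux table (copied from the cpupower output in this thread) are illustrative, not any FreeBSD API. Equal latencies are a hint, not proof of the mapping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CX 8

/* One entry of a cx_supported string, assumed "C<index>/<type>/<latency>". */
struct cx_entry {
	int index;	/* position in the ACPI list (C1, C2, ...) */
	int type;	/* second field, assumed to be the ACPI C-state type */
	int latency;	/* third field, assumed to be a latency in microseconds */
};

/* Parse e.g. "C1/1/1 C2/3/104"; returns the number of entries found. */
int
parse_cx_supported(const char *s, struct cx_entry *out, int max)
{
	int n = 0, used = 0;

	assert(s != NULL && out != NULL);
	while (n < max && sscanf(s, " C%d/%d/%d%n", &out[n].index,
	    &out[n].type, &out[n].latency, &used) == 3) {
		s += used;	/* continue after the entry just consumed */
		n++;
	}
	return (n);
}

/* Latencies copied from the cpupower idle-info output quoted in the thread. */
struct linux_state {
	const char *name;
	int latency;
};

const struct linux_state linux_states[] = {
	{ "C1", 2 }, { "C1E", 10 }, { "C3", 80 }, { "C6", 104 }, { "C7", 109 },
};

/* Return the Linux state with exactly this latency, or NULL if none. */
const char *
guess_cpu_state(int latency)
{
	size_t i;

	for (i = 0; i < sizeof(linux_states) / sizeof(linux_states[0]); i++)
		if (linux_states[i].latency == latency)
			return (linux_states[i].name);
	return (NULL);
}
```

Fed the strings from the two sysctl dumps, the latency 104 entry (second state on AC) lines up with Linux C6, while on battery the latencies 80 and 109 line up with Linux C3 and C7. That suggests the deeper CPU states are reachable through ACPI states with lower numbers, at least on this machine.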
Re: suspend issues with latest -HEAD, ahci failing to complete something?
On 05.05.2014 20:37, Adrian Chadd wrote:

(I know, I just emailed out asking about setting S3 for the default lid suspend state, however I just updated to the very latest head and things went a little backwards.)

Suspend no longer works for me:

May 5 10:33:10 lucy-11i386 acpi: suspend at 20140505 10:33:10
May 5 10:33:47 lucy-11i386 kernel: ahcich0: Timeout on slot 19 port 0
May 5 10:33:47 lucy-11i386 kernel: ahcich0: is cs fff80fff ss fff80fff rs fff80fff tfd d0 serr cmd d317
May 5 10:33:47 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 e0 b0 fa 40 42 00 00 00 00 00
May 5 10:33:47 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): CAM status: Command timeout
May 5 10:33:47 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): Retrying command
May 5 10:33:13 lucy-11i386 acpi: resumed at 20140505 10:33:13
May 5 10:33:59 lucy-11i386 acpi: suspend at 20140505 10:33:59
May 5 10:34:37 lucy-11i386 kernel: ahcich0: Timeout on slot 9 port 0
May 5 10:34:37 lucy-11i386 kernel: ahcich0: is cs ff83 ss ff83 rs ff83 tfd d0 serr cmd c717
May 5 10:34:37 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 18 5c f7 40 42 00 00 00 00 00
May 5 10:34:37 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): CAM status: Command timeout
May 5 10:34:37 lucy-11i386 kernel: (ada0:ahcich0:0:0:0): Retrying command
May 5 10:34:03 lucy-11i386 acpi: resumed at 20140505 10:34:03

What has recently changed that'd possibly break ahci's ability to correctly suspend?

When I tested it last time (a while ago), it was working for me. ahci_ch_suspend() should block all I/O on the channel and wait until all active commands complete. On resume, the channel should be reinitialized, the device reset, and only then should I/O be released.

Do you see those timeouts on suspend or on resume? Do you have kern.cam.ada.spindown_suspend enabled? Can you try to disable it?
-- Alexander Motin ___ freebsd-acpi@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-acpi To unsubscribe, send any mail to "freebsd-acpi-unsubscr...@freebsd.org"
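
The ordering described above for suspend and resume can be sketched as a toy model. Every name below is hypothetical and nothing here is taken from the ahci(4) driver. It only shows the intended sequence: freeze first, drain what is in flight, and release I/O only after the reinitialization and reset have been done.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a channel; not the ahci(4) softc. */
struct toy_channel {
	bool frozen;	/* new I/O is held back while true */
	bool ready;	/* channel initialized and device reset */
	int active;	/* commands currently in flight */
};

/* Try to start a command; refused while frozen or not ready. */
bool
toy_start(struct toy_channel *ch)
{
	if (ch->frozen || !ch->ready)
		return (false);
	ch->active++;
	return (true);
}

/* Model the completion of one in-flight command. */
void
toy_complete(struct toy_channel *ch)
{
	assert(ch->active > 0);
	ch->active--;
}

/* Suspend: block new I/O first, then drain what is already running. */
void
toy_suspend(struct toy_channel *ch)
{
	ch->frozen = true;
	while (ch->active > 0)
		toy_complete(ch);	/* a real driver would wait here */
	ch->ready = false;
}

/* Resume: reinitialize and reset before releasing held I/O. */
void
toy_resume(struct toy_channel *ch)
{
	ch->ready = true;	/* stands in for reinit plus device reset */
	ch->frozen = false;	/* only now may queued I/O proceed */
}
```

In this model a command that times out after the suspend timestamp would mean either that something was started while the channel was frozen or that the drain never finished, which is why it matters whether the timeouts are logged on the suspend or the resume side.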
Re: Using bintime() in acpi_cpu_idle()?
On 30.07.2012 09:25, Alexander Motin wrote:
On 30.07.2012 07:33, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:
On 29.07.2012 15:26, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:
On 29.07.2012 11:37, Bruce Evans wrote:

...

binuptime() is more accurate than uncalibrated scaling. Is accuracy required?

Accuracy is not required at all. +-20% is not a problem.

If not, the CPU ticker might work, and is faster than HPET, and is not under user control for perverse settings. It normally reduces to readtsc() with no serializing instruction even in proposed changes. This is good enough for process times (not very good) and depends on the CPU not changing. Its calibration is very accurate (similar to timecounters) modulo bugs, but not always up to date.

The problem with the ticker is that it may stop during idle periods, and idle is exactly what happens here. Unlike timecounter usage, here we don't need CPU synchronicity, but we need it working during deep sleeps.

The ticker is the same as the timecounter in many cases of interest. If the TSC stops then it cannot be used for timecounting unless timecounting is reinitialized. Timecounting should be reinitialized after deep sleeps, but you say you need it to work during deep sleeps.

The timecounter already has detection logic to disable the TSC in cases where it is unreliable. I don't want to replicate it here. I need a time source that need not be precise or synchronized, but is reliable and fast.

Yes, this logic gives exactly what you don't want (an inefficient timecounter), by preventing use of the TSC for the timecounter, although the TSC is perfectly usable for the ticker and here.

Can you teach me how to use a ticker that is not ticking? If the TSC was considered unusable for the timecounter for reasons unrelated to SMP, how can I use it as a ticker?

I wouldn't trust timecounters for some time after waking up after a deep sleep. If their clock stopped then the times read might only be very out of date. If their clock didn't stop, then they might have wrapped or otherwise overflowed and the times read would be garbage. Is there any locking or ordering to prevent them being used before they are reinitialized?

I am not sure what reinitialization you are talking about. IIRC, there is no wake-up code for the TSC. No other timecounters have problems with C-states.

It is the timecounter code that needs reinitializing. If the TSC stops, or wraps mod 2**32, then its counts become garbage for the purpose of timecounting. Maybe it is not used for timecounting in either of these cases. But these cases shouldn't prevent its use for timecounting. The 2**32 number is because timecounters only use 32 bits of hardware counters (for efficiency). So even if the hardware has some magic to not stop the TSC while sleeping (maybe it fakes not stopping it by reloading on wakeup), it is still unusable by timecounters after sleeping for a second or 2 so that it wraps. The software needs similar faking to reload the timecounter on wakeup. This makes use of timecounters in sleep/wakeup code fragile.

At this moment I am not talking about S-states sleeping for hours. I am talking about C-states lasting milliseconds. It means that the TSC may stop and start 10K times each second or even more. Attempting to save and restore its state would consume so many resources that it would probably be useless. As for the wrap after 2 seconds, I would be happy to make the CPU sleep for that long, but for now 100ms is all I can hope for even on an idle system.

At boot time there is a dummy timecounter that returns bogo-times. Apparently sleeping doesn't occur before the timecounter is switched to a real one. The dummy timecounter isn't switched back to after boot time. But it probably should be, since the hardware timecounter may have stopped or wrapped. Sleeping could just set a flag to indicate this state, but then you would have to provide a fake time anyway on finding the flag set. Boot time just points to the dummy timecounter so as not to check this flag in all early timecounter "hardware" calls.

And how can a dummy timecounter that counts something, but not time, help me to measure sleep time?

Never mind, let it be a compromise solution: the ticker for the C1 state, where performance matters most and where the TSC works, and the ACPI timer for the others.

-- Alexander Motin
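
The 2**32 wrap discussed in this message is easy to put numbers on. The sketch below is plain arithmetic, not kernel code: the helper names are made up, the 2 GHz TSC rate is only an example, and 3579545 Hz is the nominal ACPI PM timer frequency.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a free-running counter of the given width wraps. */
double
wrap_seconds(unsigned bits, double freq_hz)
{
	assert(bits >= 1 && bits <= 63 && freq_hz > 0.0);
	return ((double)(UINT64_C(1) << bits) / freq_hz);
}

/*
 * Elapsed ticks between two 32-bit reads.  Unsigned subtraction is
 * modulo 2^32, so the result is right across a single wrap but is
 * silently wrong once 2^32 or more ticks have passed.
 */
uint32_t
delta32(uint32_t then, uint32_t now)
{
	return (now - then);
}
```

With these helpers, wrap_seconds(32, 2.0e9) comes out a little above two seconds, which matches the "second or 2" figure quoted above, while a 24-bit ACPI timer at 3579545 Hz wraps after roughly 4.7 seconds. Idle periods of at most 100 ms stay far below either limit, so a modular delta such as delta32() is enough for measuring C-state sleep time as long as the counter keeps running.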
Re: Using bintime() in acpi_cpu_idle()?
On 30.07.2012 07:33, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:
On 29.07.2012 15:26, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:
On 29.07.2012 11:37, Bruce Evans wrote:

...

binuptime() is more accurate than uncalibrated scaling. Is accuracy required?

Accuracy is not required at all. +-20% is not a problem.

If not, the CPU ticker might work, and is faster than HPET, and is not under user control for perverse settings. It normally reduces to readtsc() with no serializing instruction even in proposed changes. This is good enough for process times (not very good) and depends on the CPU not changing. Its calibration is very accurate (similar to timecounters) modulo bugs, but not always up to date.

The problem with the ticker is that it may stop during idle periods, and idle is exactly what happens here. Unlike timecounter usage, here we don't need CPU synchronicity, but we need it working during deep sleeps.

The ticker is the same as the timecounter in many cases of interest. If the TSC stops then it cannot be used for timecounting unless timecounting is reinitialized. Timecounting should be reinitialized after deep sleeps, but you say you need it to work during deep sleeps.

The timecounter already has detection logic to disable the TSC in cases where it is unreliable. I don't want to replicate it here. I need a time source that need not be precise or synchronized, but is reliable and fast.

Yes, this logic gives exactly what you don't want (an inefficient timecounter), by preventing use of the TSC for the timecounter, although the TSC is perfectly usable for the ticker and here.

Can you teach me how to use a ticker that is not ticking? If the TSC was considered unusable for the timecounter for reasons unrelated to SMP, how can I use it as a ticker?

I wouldn't trust timecounters for some time after waking up after a deep sleep. If their clock stopped then the times read might only be very out of date. If their clock didn't stop, then they might have wrapped or otherwise overflowed and the times read would be garbage. Is there any locking or ordering to prevent them being used before they are reinitialized?

I am not sure what reinitialization you are talking about. IIRC, there is no wake-up code for the TSC. No other timecounters have problems with C-states.

It is the timecounter code that needs reinitializing. If the TSC stops, or wraps mod 2**32, then its counts become garbage for the purpose of timecounting. Maybe it is not used for timecounting in either of these cases. But these cases shouldn't prevent its use for timecounting. The 2**32 number is because timecounters only use 32 bits of hardware counters (for efficiency). So even if the hardware has some magic to not stop the TSC while sleeping (maybe it fakes not stopping it by reloading on wakeup), it is still unusable by timecounters after sleeping for a second or 2 so that it wraps. The software needs similar faking to reload the timecounter on wakeup. This makes use of timecounters in sleep/wakeup code fragile.

At this moment I am not talking about S-states sleeping for hours. I am talking about C-states lasting milliseconds. It means that the TSC may stop and start 10K times each second or even more. Attempting to save and restore its state would consume so many resources that it would probably be useless. As for the wrap after 2 seconds, I would be happy to make the CPU sleep for that long, but for now 100ms is all I can hope for even on an idle system.

At boot time there is a dummy timecounter that returns bogo-times. Apparently sleeping doesn't occur before the timecounter is switched to a real one. The dummy timecounter isn't switched back to after boot time. But it probably should be, since the hardware timecounter may have stopped or wrapped. Sleeping could just set a flag to indicate this state, but then you would have to provide a fake time anyway on finding the flag set. Boot time just points to the dummy timecounter so as not to check this flag in all early timecounter "hardware" calls.

And how can a dummy timecounter that counts something, but not time, help me to measure sleep time?

-- Alexander Motin
Re: Using bintime() in acpi_cpu_idle()?
On 29.07.2012 15:26, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:
On 29.07.2012 11:37, Bruce Evans wrote:

...

binuptime() is more accurate than uncalibrated scaling. Is accuracy required?

Accuracy is not required at all. +-20% is not a problem.

If not, the CPU ticker might work, and is faster than HPET, and is not under user control for perverse settings. It normally reduces to readtsc() with no serializing instruction even in proposed changes. This is good enough for process times (not very good) and depends on the CPU not changing. Its calibration is very accurate (similar to timecounters) modulo bugs, but not always up to date.

The problem with the ticker is that it may stop during idle periods, and idle is exactly what happens here. Unlike timecounter usage, here we don't need CPU synchronicity, but we need it working during deep sleeps.

The ticker is the same as the timecounter in many cases of interest. If the TSC stops then it cannot be used for timecounting unless timecounting is reinitialized. Timecounting should be reinitialized after deep sleeps, but you say you need it to work during deep sleeps.

The timecounter already has detection logic to disable the TSC in cases where it is unreliable. I don't want to replicate it here. I need a time source that need not be precise or synchronized, but is reliable and fast.

I wouldn't trust timecounters for some time after waking up after a deep sleep. If their clock stopped then the times read might only be very out of date. If their clock didn't stop, then they might have wrapped or otherwise overflowed and the times read would be garbage. Is there any locking or ordering to prevent them being used before they are reinitialized?

I am not sure what reinitialization you are talking about. IIRC, there is no wake-up code for the TSC. No other timecounters have problems with C-states.
-- Alexander Motin
Re: Using bintime() in acpi_cpu_idle()?
On 29.07.2012 11:37, Bruce Evans wrote:
On Sun, 29 Jul 2012, Alexander Motin wrote:

With the ACPI timer gradually becoming one of the slowest in the system, is there some reason to use it directly in acpi_cpu_idle()? I've made a patch: http://people.freebsd.org/~mav/sleep_time.patch to use binuptime() instead. Using even the HPET via the system timecounter (not to mention the TSC) significantly improves performance on some workloads if this code is not covered by the MWAIT optimization in cpu_idle().

Does it work with a perverse timecounter like the i8254?

At least on my test system it does, even though it is predictably much slower than the others.

The user is permitted to switch to any supported timecounter. There are other perverse ones:
- ACPI. This seems to be unavailable if the system thinks ACPI-fast works. Bug. The user should be able to downgrade to it if ACPI-fast in fact doesn't work. Since it reads the hardware more than once, it is much slower than direct use of the hardware.
- ACPI-fast. Even this is perverse. It only reads the hardware once, but goes through many software layers.

binuptime() is more accurate than uncalibrated scaling. Is accuracy required?

Accuracy is not required at all. +-20% is not a problem.

If not, the CPU ticker might work, and is faster than HPET, and is not under user control for perverse settings. It normally reduces to readtsc() with no serializing instruction even in proposed changes. This is good enough for process times (not very good) and depends on the CPU not changing. Its calibration is very accurate (similar to timecounters) modulo bugs, but not always up to date.

The problem with the ticker is that it may stop during idle periods, and idle is exactly what happens here. Unlike timecounter usage, here we don't need CPU synchronicity, but we need it working during deep sleeps.
-- Alexander Motin
Using bintime() in acpi_cpu_idle()?
Hi.

With the ACPI timer gradually becoming one of the slowest in the system, is there some reason to use it directly in acpi_cpu_idle()? I've made a patch: http://people.freebsd.org/~mav/sleep_time.patch to use binuptime() instead. Using even the HPET via the system timecounter (not to mention the TSC) significantly improves performance on some workloads if this code is not covered by the MWAIT optimization in cpu_idle().

-- Alexander Motin
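
For readers without the patch at hand, the idea of timing an idle period with the system clock instead of a dedicated timer can be sketched in userland. This is not the patch: clock_gettime(CLOCK_MONOTONIC) stands in for binuptime(), the function names are invented, and the 3:1 weighting of the running average is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Microseconds from a monotonic clock; stands in for binuptime(). */
uint64_t
now_usec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u);
}

/* Running average: 3/4 history, 1/4 newest sample (assumed weights). */
uint64_t
update_prev_sleep(uint64_t prev_usec, uint64_t sample_usec)
{
	return ((prev_usec * 3 + sample_usec) / 4);
}

/*
 * Time one idle period and fold it into the average.  idle_fn is a
 * hypothetical stand-in for whatever actually enters the C-state.
 */
uint64_t
timed_idle(void (*idle_fn)(void), uint64_t prev_usec)
{
	uint64_t start, end;

	assert(idle_fn != NULL);
	start = now_usec();
	idle_fn();
	end = now_usec();
	return (update_prev_sleep(prev_usec, end - start));
}
```

Because the measured value only feeds an average like this, the cost of the two clock reads around the sleep is likely to matter more than their precision, which fits the question of whether a slow timer should be read directly in the idle path.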
Re: [stable 9] broken hwpstate calls
On 06/07/12 21:04, Andriy Gapon wrote:
on 07/06/2012 11:38 Alexander Motin said the following:
On 06/07/12 11:10, Andriy Gapon wrote:
on 07/06/2012 02:02 Jung-uk Kim said the following:

Any way, hwpstate still isn't quite right even without your patch.

sys/kern/kern_cpu.c
  cpufreq_curr_sysctl() -> CPUFREQ_SET() ->    /* for all CPU devices */
  cf_set_method() ->                           /* thread_lock(), sched_bind(), ... */
  CPUFREQ_DRV_SET() ->
sys/x86/cpufreq/hwpstate.c
  hwpstate_set() -> hwpstate_goto_pstate()     /* for each CPU unit */
                                               /* thread_lock(), sched_bind(), ... */

Oh, I didn't realize that there was the cpufreq-level loop over all CPUs! That really sucks. Maybe some day we should accept that different CPUs could legitimately be in different P-states and provide support for that throughout the stack (from powerd to drivers).

Support for different P-states on different CPUs can be useful if CPUs have different capabilities.

Not sure what you mean... I was talking about setting different CPUs to different P-states based on the per-CPU conditions (e.g. utilization). I certainly didn't mean to talk about heterogeneous P-state definitions or any other heterogeneous silicon issues.

As you wish, but at this moment it is the only realistic application I see. As I said below, setting different frequencies on different cores without scheduler awareness is a bad idea.

I believe it is very rare, but possible. At this moment cpufreq should set each CPU to the frequency closest to the one that was set on the BSP. It should be possible to make powerd read the sets of frequencies from all CPUs and do the same, just more intelligently. At the same time, using very different frequencies for different CPUs can IMHO be very problematic even in theory. For SMP systems it is quite difficult (because of thread migration and possible interaction between multiple threads) to identify cases when even the global frequency can be reduced without a proportional performance penalty. Making it per-CPU multiplies the number of options and requires awareness from the scheduler.

I humbly disagree. I think that it's not a job of scheduler to be overly smart when power-saving policies are in effect. IMO, scheduler should just do its own job and powerd should react to individual loads of CPUs. Where latencies really matter there powerd should not be used (or perhaps used with some different policy skewed towards performance vs economy).

The scheduler usually operates in terms of milliseconds or less. powerd operates, in the best case, in terms of fractions of a second (or it will eat more power than it saves). Unless you are doing some heavy CPU-bound math without any context switches, it won't work well without the scheduler being aware of the available computation resources.

> Also, Linux does it, so it must at least doable :-)

I don't know whether or how Linux does it. If you know how to do it effectively, welcome, be my guest. :)

-- Alexander Motin
Re: [stable 9] broken hwpstate calls
On 06/07/12 11:10, Andriy Gapon wrote:
on 07/06/2012 02:02 Jung-uk Kim said the following:

Any way, hwpstate still isn't quite right even without your patch.

sys/kern/kern_cpu.c
  cpufreq_curr_sysctl() -> CPUFREQ_SET() ->    /* for all CPU devices */
  cf_set_method() ->                           /* thread_lock(), sched_bind(), ... */
  CPUFREQ_DRV_SET() ->
sys/x86/cpufreq/hwpstate.c
  hwpstate_set() -> hwpstate_goto_pstate()     /* for each CPU unit */
                                               /* thread_lock(), sched_bind(), ... */

Oh, I didn't realize that there was the cpufreq-level loop over all CPUs! That really sucks. Maybe some day we should accept that different CPUs could legitimately be in different P-states and provide support for that throughout the stack (from powerd to drivers).

Support for different P-states on different CPUs can be useful if CPUs have different capabilities. I believe it is very rare, but possible. At this moment cpufreq should set each CPU to the frequency closest to the one that was set on the BSP. It should be possible to make powerd read the sets of frequencies from all CPUs and do the same, just more intelligently. At the same time, using very different frequencies for different CPUs can IMHO be very problematic even in theory. For SMP systems it is quite difficult (because of thread migration and possible interaction between multiple threads) to identify cases when even the global frequency can be reduced without a proportional performance penalty. Making it per-CPU multiplies the number of options and requires awareness from the scheduler.

-- Alexander Motin
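
The cost of the nested loop in the quoted call chain can be shown with a small counting model. The functions below are hypothetical stand-ins, not the kernel's CPUFREQ_SET() or hwpstate_goto_pstate(). They assume, as the quoted comments suggest, that the generic layer visits every CPU device and the driver then visits every CPU unit again.

```c
#include <assert.h>
#include <stddef.h>

/* Counters for the simulated work; all names here are hypothetical. */
struct bind_stats {
	int outer_binds;	/* binds done by the generic layer */
	int inner_binds;	/* binds done inside the driver */
};

/* Driver-level set that itself walks every CPU unit. */
void
toy_driver_set(int ncpus, struct bind_stats *st)
{
	int cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		st->inner_binds++;
}

/* Generic layer that calls the driver once per CPU device. */
void
toy_cpufreq_set(int ncpus, struct bind_stats *st)
{
	int dev;

	assert(ncpus > 0 && st != NULL);
	for (dev = 0; dev < ncpus; dev++) {
		st->outer_binds++;
		toy_driver_set(ncpus, st);
	}
}
```

On an 8-CPU machine this model performs 8 outer and 64 inner binds for a single frequency change, so the work grows with the square of the CPU count even though every CPU ends up in the same P-state.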
Re: Tyan S3992-E: hpet no longer working
Bruce Evans wrote:
> On Tue, 11 Jan 2011, Alexander Motin wrote:
>
>> Arno J. Klaassen wrote:
>>> Sure .. that said, the BIOS I use is the last official release for this
>>> board (Sept 2009) and not even a more recent beta-release is available.
>>>
>>> I would expect reporting a disabled device which cannot be enabled via
>>> de BIOS a bug deserving a newer release.
>>>
>>> Anyway, this bug isn't very harmful for me, but the non-hpet
>>> timecounters don't seem that fun either :
>>>
>>> # uptime
>>> 10:27PM up 2 days, 5:44
>>>
>>> # sysctl kern.timecounter.hardware kern.timecounter.choice
>>> kern.timecounter.hardware: ACPI-safe
>>> kern.timecounter.choice: TSC(-100) i8254(0) ACPI-safe(850)
>>> dummy(-100)
>>>
>>> # vmstat -i | fgrep cpu:
>>> cpu0:timer 38599321 199
>>> cpu6:timer 2151003 11
>>> cpu1:timer 7121075 36
>>> cpu3:timer 1808269 9
>>> cpu5:timer 3832463 19
>>> cpu2:timer 2399988 12
>>> cpu7:timer 2013444 10
>>> cpu4:timer 21630368 111
>>>
>>> (default HZ )
>>>
>>> Maybe I should try downgrading the BIOS?
>>
>> So what here seems not funny to you? Lower timer interrupt rate is not a
>> bug but feature of 9-CURRENT.
>
> They (cpu*:timer) also aren't timecounters :-).

Sure. They've never been timecounters.

> Hmm, with hpet on FreeBSD cluster machines, there is now only hpet. How
> is statclock distributed with hpet?

If there are enough timers for each CPU and their IRQs are not shareable, they are assigned to the CPUs one to one. The rest of the logic is the same for all drivers: if a timer is not per-CPU, it is used for all CPUs and events are redistributed via IPI by the MI code. A plus of one-shot mode is that we don't need separate timer hardware for statclock.

> I never properly reviewed the latest "irqN"-printing changes in systat,
> and just noticed that they break printing of "irq" the usual case where
> the interrupt name starts with "irqN:" (then systat removes "irqN:" and
> never puts back "irq").
>
> Not-so-quick fix:
>
> % Index: vmstat.c
> % ===
> % RCS file: /home/ncvs/src/usr.bin/systat/vmstat.c,v
> % retrieving revision 1.93
> % diff -u -2 -r1.93 vmstat.c
> % --- vmstat.c 11 Dec 2010 08:32:16 -1.93
> % +++ vmstat.c 11 Jan 2011 06:20:01 -
> % @@ -244,5 +244,10 @@
> %  *--cp1 = '\0';
> %
> % -/* Convert "irqN: name" to "name irqN". */
> % +/*
> % + * Convert "irqN: name" to "name irqN", "name N" or
> % + * "name". First reduce to "name"; then append
> % + * " irqN" if that fits, else " N" if that fits,
> % + * else drop all of "N".
> % + */
> %  if (strncmp(cp, "irq", 3) == 0) {
> %  cp1 = cp + 3;
> % @@ -256,5 +261,8 @@
> %  cp2 = strdup(cp);
> %  bcopy(cp1, cp, sz - (cp1 - cp) + 1);
> % -if (sz <= 10 + 4) {
> % +if (sz <= 10 + 1) {
> % +strcat(cp, " ");
> % +strcat(cp, cp2);
> % +} else if (sz <= 10 + 4) {
> %  strcat(cp, " ");
> %  strcat(cp, cp2 + 3);
> % @@ -266,5 +274,9 @@
> %  /*
> %   * Convert "name irqN" to "name N" if the former is
> % - * longer than the field width.
> % + * longer than the field width. This handles some
> % + * cases where the original name did not start with
> % + * "irqN". We don't bother dropping partial "N"s in
> % + * this case. The "name" part may be too long in
> % + * either case; then we blindly truncate it.
> %   */
> %  if ((cp1 = strstr(cp, "irq")) != NULL &&
>
> This restores part of rev.1.90, updates the first comment to match the code
> changes in rev.1.90, and expands the last comment.

Heh. Unless I have forgotten some prehistory, you may be right.

-- Alexander Motin
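
The renaming rule spelled out in the patch comment ("name irqN" if it fits, else "name N", else just "name") can be tried out in isolation. The function below is a standalone sketch written from that comment, not the systat code. Its name, the explicit width parameter and the separate output buffer are all assumptions, and unlike systat it does not truncate a name that is too long on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite "irqN: name" into out as "name irqN", "name N" or "name",
 * picking the longest form that fits in width characters.  Strings
 * that do not start with "irqN:" are copied unchanged.
 */
void
format_intr_name(const char *in, size_t width, char *out, size_t outsz)
{
	const char *digits, *p, *name;
	size_t ndigits, namelen;

	assert(in != NULL && out != NULL && outsz > 0);
	if (strncmp(in, "irq", 3) != 0) {
		snprintf(out, outsz, "%s", in);
		return;
	}
	digits = in + 3;
	for (p = digits; isdigit((unsigned char)*p); p++)
		;
	ndigits = (size_t)(p - digits);
	if (ndigits == 0 || *p != ':') {
		snprintf(out, outsz, "%s", in);
		return;
	}
	name = p + 1;			/* skip the colon */
	while (*name == ' ')
		name++;
	namelen = strlen(name);
	if (namelen + 1 + 3 + ndigits <= width)		/* "name irqN" */
		snprintf(out, outsz, "%s irq%.*s", name, (int)ndigits, digits);
	else if (namelen + 1 + ndigits <= width)	/* "name N" */
		snprintf(out, outsz, "%s %.*s", name, (int)ndigits, digits);
	else						/* "name" only */
		snprintf(out, outsz, "%s", name);
}
```

Calling it on a made-up name such as "irq16: ehci0" gives "ehci0 irq16" at width 11, "ehci0 16" at width 10 and "ehci0" at width 6, while a name like "cpu0:timer" passes through untouched.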
Re: Tyan S3992-E: hpet no longer working
Ah, that's fine and indeed just funny. When CPUs are idle now, they receive a minimal number of interrupts to allow reduced power consumption. Your numbers just show that the load is not equal. You can watch in systat how the rates change depending on load.

-- Alexander Motin

On 11.01.2011 0:42, "Arno J. Klaassen" wrote:
> Alexander Motin writes:
>
>> Arno J. Klaassen wrote:
>>> Sure .. that said, the BIOS I use is the last official release for this
>>> board (Sept 2009) and not even a more recent beta-release is available.
>>>
>>> I would expect reporting a disabled device which cannot be enabled via
>>> de BIOS a bug deserving a newer release.
>>>
>>> Anyway, this bug isn't very harmful for me, but the non-hpet
>>> timecounters don't seem that fun either :
>>>
>>> # uptime
>>> 10:27PM up 2 days, 5:44
>>>
>>> # sysctl kern.timecounter.hardware kern.timecounter.choice
>>> kern.timecounter.hardware: ACPI-safe
>>> kern.timecounter.choice: TSC(-100) i8254(0) ACPI-safe(850) dummy(-100)
>>>
>>> # vmstat -i | fgrep cpu:
>>> cpu0:timer 38599321 199
>>> cpu6:timer 2151003 11
>>> cpu1:timer 7121075 36
>>> cpu3:timer 1808269 9
>>> cpu5:timer 3832463 19
>>> cpu2:timer 2399988 12
>>> cpu7:timer 2013444 10
>>> cpu4:timer 21630368 111
>>>
>>> (default HZ )
>>>
>>> Maybe I should try downgrading the BIOS?
>>
>> So what here seems not funny to you? Lower timer interrupt rate is not a
>> bug but feature of 9-CURRENT.
>
> the standard deviation in the values; I don't have another 8-way by
> hand, but a 4-way 6-STABLE gives :
>
> cpu0: timer 3299774936 2000
> cpu2: timer 3299757640 2000
> cpu3: timer 3299757640 2000
> cpu1: timer 3299757640 2000
>
> and my 8-STABLE notebook (with kern.hz=100) :
>
> cpu0: timer 323161363 400
> cpu1: timer 323161114 400
>
> A range from 9 to 199 is 'funny', maybe I choose the wrong word, but
> I didn't see such discrepancies before. Sorry
>
> Best, Arno
Re: Tyan S3992-E: hpet no longer working
Arno J. Klaassen wrote:
> Sure .. that said, the BIOS I use is the last official release for this
> board (Sept 2009) and not even a more recent beta-release is available.
>
> I would expect reporting a disabled device which cannot be enabled via
> de BIOS a bug deserving a newer release.
>
> Anyway, this bug isn't very harmful for me, but the non-hpet
> timecounters don't seem that fun either :
>
> # uptime
> 10:27PM up 2 days, 5:44
>
> # sysctl kern.timecounter.hardware kern.timecounter.choice
> kern.timecounter.hardware: ACPI-safe
> kern.timecounter.choice: TSC(-100) i8254(0) ACPI-safe(850) dummy(-100)
>
> # vmstat -i | fgrep cpu:
> cpu0:timer 38599321 199
> cpu6:timer 2151003 11
> cpu1:timer 7121075 36
> cpu3:timer 1808269 9
> cpu5:timer 3832463 19
> cpu2:timer 2399988 12
> cpu7:timer 2013444 10
> cpu4:timer 21630368 111
>
> (default HZ )
>
> Maybe I should try downgrading the BIOS?

So what here seems not funny to you? A lower timer interrupt rate is not a bug but a feature of 9-CURRENT.

-- Alexander Motin
Re: Tyan S3992-E: hpet no longer working
John Baldwin wrote:
> On Saturday, January 08, 2011 11:46:02 am Alexander Motin wrote:
>> Arno J. Klaassen wrote:
>>> John Baldwin writes:
>>>
>>>> On Thursday, January 06, 2011 5:32:08 pm Arno J. Klaassen wrote:
>>>>> John Baldwin writes:
>>>>>
>>>>>> On Wednesday, January 05, 2011 4:39:24 pm Arno J. Klaassen wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have (a long-lasting) problem to get hpet attached to a Tyan S3992-E
>>>>>>> MB. My last known working kernel is 7.1-PRERELEASE Sep 2 2008" , I
>>>>>>> rarely cared about this board for a while...
>>>>>>>
>>>>>>> At that time the dmesg said :
>>>>>>>
>>>>>>> acpi_hpet0: iomem 0xfed0-0xfed003ff on acpi0
>>>>>>> Timecounter "HPET" frequency 2500 Hz quality 900
>>>>>>>
>>>>>>> now it says (debug.acpi.hpet_test="1", debug.acpi.layer="ACPI_TIMER",
>>>>>>> debug.acpi.level="ACPI_LV_ALL_EXCEPTIONS" enabled) :
>>>>>>>
>>>>>>> hpet0: iomem 0xfed0-0xfed03fff on acpi0
>>>>>>> hpet0: vendor 0x, rev 0xff, 232831Hz 64bit, 32 timers, legacy route
>>>>>>> hpet0: t0: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t1: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t2: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t3: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t4: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t5: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t6: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t7: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t8: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t9: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t10: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t11: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t12: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t13: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t14: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t15: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t16: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t17: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t18: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t19: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t20: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t21: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t22: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t23: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t24: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t25: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t26: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t27: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t28: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t29: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t30: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: t31: irqs 0x (31), MSI, 64bit, periodic
>>>>>>> hpet0: 0.0: 4294967295 ... 4294967295 = 0
>>>>>>> hpet0: time per call: 0 ns
>>>>>>> hpet0: HPET never increments, disabling
>>>>>>> device_attach: hpet0 attach returned 6
>>>>>>>
>>>>>>> Some things strike me :
>>>>>>>
>>>>>>> 'vendor 0x, rev 0xf' and '4294967295 (== 0x)' as well
>>>>>>> as 232831Hz
>>>>>>>
Re: Tyan S3992-E: hpet no longer working
Method (_STA, 0, NotSerialized)
>>>> {
>>>> Return (0x0F)
>>>> }
>>>>
>>>> Method (_CRS, 0, NotSerialized)
>>>> {
>>>> Return (ResourceTemplate ()
>>>> {
>>>> Memory32Fixed (ReadWrite,
>>>> 0xFED0, // Address Base
>>>> 0x4000, // Address Length
>>>> )
>>>> })
>>>> }
>>>> }
>>>>
>>>> So it does look like we are doing what the DSDT tells us in terms
>>>> of the memory address.
>>> yop. That said, I made yet another copy-paste error: the last known
>>> working kernel is 8.0-CURRENT Mar 1 2009 and the hpet says :
>>>
>>> acpi_hpet0: iomem 0xfed0-0xfed003ff
>>> on acpi0
>>> Timecounter "HPET" frequency 14318180 Hz quality 900
>>>
>>> [only the frequency differs, the memory range indeed then was reported as
>>> 0x400 and not 0x4000 ]
>>>
>>>> Arno, are there any BIOS options that mention the HPET or have you updated
>>>> your BIOS since you booted the 7.1 kernel?
>>> yes .. I now use BIOS 1.06 released 06/09/09.
>>> Can I somehow 'overide' the bios and force the driver to use 0X400 as
>>> 'Address Length' in order to test if that makes the driver attach again?
>> Changing the length wouldn't make a difference as we would still read the
>> same
>> registers since the start address is identical. I think the length is
>> symptomatic of the BIOS doing something differently that has disabled the
>> HPET.
>
> good point : this failure probably is not related to the FreeBSD-driver
> : in the current BIOS under the submenu 'South Bridge Chipset
> Configuration', the option to enable the HPET has disappeared (no
> mention of that in the release-notes), whilst it was present in the
> original BIOS, *and* disabled by default.
>
> Is it possible to write to some register during hpet_enable() and force
> the timer to tick, regardless of the BIOS?

The problem does not seem to be about ticking, but about the HPET registers
working at all. Reading ffh from every register most probably means that the
HPET is simply not located where we look for it.
-- Alexander Motin ___ freebsd-acpi@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-acpi To unsubscribe, send any mail to "freebsd-acpi-unsubscr...@freebsd.org"
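The two symptoms discussed in this thread (every register reading as all ones, and the counter value 4294967295 seen twice in a row) can be checked mechanically. The following is only a minimal sketch in Python, not the driver's code: the helper names are hypothetical, and the sample values are the two numbers from the `hpet0: 0.0:` line of the quoted dmesg.

```python
ALL_ONES_32 = 0xFFFFFFFF  # 4294967295; all-ones reads commonly mean nothing decodes the address

def looks_unmapped(reads):
    """True if every sampled 32-bit register value is all ones (hypothetical helper)."""
    return len(reads) > 0 and all(v == ALL_ONES_32 for v in reads)

def counter_advances(first, second):
    """True if the main counter changed between two consecutive reads."""
    return first != second

samples = [4294967295, 4294967295]  # values from the quoted hpet_test output
print("looks unmapped:", looks_unmapped(samples))
print("counter advances:", counter_advances(*samples))
```

A counter that merely failed to tick would be expected to return some stable but otherwise arbitrary value, so the all-ones pattern is what points at the address rather than at the enable bit.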
Re: Event based scheduling and USB.
Alexander Motin wrote:
> Takanori Watanabe wrote:
>> I updated my FreeBSD tree on laptop, to the current
>> as of 18 Oct.2010, it works fine with CPU C3 state enabled,
>>
>> I think this is your achievement of event time scheduler,
>> thanks!
>>
>> But when USB driver is enabled, the load average is considerablly
>> high (0.6 to 1.0) if sysctl oid kern.eventtimer.periodic is set to 0.
>> Then kern.eventtimer.periodic is set to 1, the load average goes
>> to 0 quickly as before, but almost never transit to C3.
>>
>> Is this behavior expected, or something wrong?
>> I noticed one of usb host controller device shares HPET irq.
>> When I implement interrupt filter in uhci driver, the load average
>> goes to 0 as before.
>>
>>
>> % vmstat -i
>> interrupt total rate
>> irq1: atkbd0 398 2
>> irq9: acpi0 408 2
>> irq12: psm03 0
>> irq19: ehci1 37 0
>> irq20: hpet0 uhci0 35970230
>> irq22: ehci0 2 0
>> irq256: em04 0
>> irq257: ahci0 1692 10
>> Total 38514246
>> ===
>
> I haven't noticed that issue and it is surely not expected for me. I
> will try to reproduce it.

I've easily reproduced the problem. Scheduler tracing shows that it is the
result of aliasing between the "swi4: clock" thread on one CPU (which
measures the load average) and the "irq21: hpet0 uhci1" thread on another.
Those two events are aliased by definition, because they share an interrupt
source.

I am not sure what to do about it. Either we should change the load average
calculation algorithm or exclude the timer's interrupt threads from load
average accounting. Adding an interrupt filter for USB also helps, but it is
only a partial solution for this specific sharing case.

--
Alexander Motin
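The aliasing effect described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel's load average code: a single interrupt thread is runnable for `busy_us` at the start of every timer period, and a sampler looks at the run queue once per period at a fixed offset. The function name and all numbers are hypothetical.

```python
def sampled_load(period_us, busy_us, sample_offset_us, samples=1000):
    """Fraction of samples that find the interrupt thread runnable."""
    hits = 0
    for n in range(samples):
        now = n * period_us + sample_offset_us  # time of the n-th sample
        phase = now % period_us                 # position inside the timer period
        if phase < busy_us:                     # thread is runnable during [0, busy_us)
            hits += 1
    return hits / samples

period, busy = 1000, 10  # hypothetical: 1 ms period, 10 us of work per interrupt
print("real CPU share:", busy / period)
print("sampler aligned with the interrupt:", sampled_load(period, busy, 0))
print("sampler half a period later:", sampled_load(period, busy, 500))
```

In this model the aligned sampler sees the thread on every sample although the thread uses about 1% of the CPU, which is the same kind of distortion as a load average of 0.6 to 1.0 on an otherwise idle machine.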
Re: Event based scheduling and USB.
Takanori Watanabe wrote:
> In message <4cc732c7.50...@freebsd.org>, Alexander Motin wrote:
>> Most likely you should be able to avoid interrupt sharing using some
>> additional HPET options, described at hpet(4).
>
> Try to disable using shared IRQ with uhci, the IRQ used by HPET become cpu:
> interrupt and certainly load average goes quite low,

If you mean "cpuX:timer", then you probably did something wrong and the
system fell back to the LAPIC timer.

> but never transit to C3 state. Using legacy route, it works quite well.

The C3 state is blocked when the LAPIC timer is used.

--
Alexander Motin
Re: Event based scheduling and USB.
Nate Lawson wrote:
> On 10/26/2010 12:57 PM, Alexander Motin wrote:
>> Takanori Watanabe wrote:
>>> I updated my FreeBSD tree on laptop, to the current
>>> as of 18 Oct.2010, it works fine with CPU C3 state enabled,
>>>
>>> I think this is your achievement of event time scheduler,
>>> thanks!
>
> Ah, so mav@ implemented a tickless-scheduler? That is nice.

Not exactly. I've only made the system delay empty ticks while idle and
execute them later, in a batch, on wakeup. Scheduler work is still wanted.

>>> But when USB driver is enabled, the load average is considerablly
>>> high (0.6 to 1.0) if sysctl oid kern.eventtimer.periodic is set to 0.
>>> Then kern.eventtimer.periodic is set to 1, the load average goes
>>> to 0 quickly as before, but almost never transit to C3.
>>>
>>> Is this behavior expected, or something wrong?
>
> The USB controller often keeps the bus mastering bit set. This keeps the
> system out of C3. The way to fix this is to implement global suspend.
> Put a device in suspend mode and then turn off power to the USB port it
> is on. Then the USB controller will stop polling the bus.

As I understand it, if the respective USB port is not in use, the USB stack
should put it into power_save mode, so that it is not polled often enough to
prevent entering the C3 state.

>>> I noticed one of usb host controller device shares HPET irq.
>>> When I implement interrupt filter in uhci driver, the load average
>>> goes to 0 as before.
>>>
>>>
>>>
>>> % vmstat -i
>>> interrupt total rate
>>> irq1: atkbd0 398 2
>>> irq9: acpi0 408 2
>>> irq12: psm03 0
>>> irq19: ehci1 37 0
>>> irq20: hpet0 uhci0 35970230
>>> irq22: ehci0 2 0
>>> irq256: em04 0
>>> irq257: ahci0 1692 10
>>> Total 38514246
>>> ===
>> I haven't noticed that issue and it is surely not expected for me. I
>> will try to reproduce it.
>>
>> Most likely you should be able to avoid interrupt sharing using some
>> additional HPET options, described at hpet(4).
>
> This seems silly. The whole point of APIC is to avoid clustering on a
> single interrupt but the BIOS put the timer on the USB controller irq?

The HPET is not a regular ISA or PCI device. It allows several different
interrupt configurations. In most cases I remember, the BIOS sets up
interrupts 0 and 8, as in legacy_route mode. But that mode is not really
suitable as our default at the moment, because it conflicts with the atrtc
and attimer drivers.

--
Alexander Motin
Re: Event based scheduling and USB.
Takanori Watanabe wrote:
> I updated my FreeBSD tree on laptop, to the current
> as of 18 Oct.2010, it works fine with CPU C3 state enabled,
>
> I think this is your achievement of event time scheduler,
> thanks!
>
> But when USB driver is enabled, the load average is considerablly
> high (0.6 to 1.0) if sysctl oid kern.eventtimer.periodic is set to 0.
> Then kern.eventtimer.periodic is set to 1, the load average goes
> to 0 quickly as before, but almost never transit to C3.
>
> Is this behavior expected, or something wrong?
> I noticed one of usb host controller device shares HPET irq.
> When I implement interrupt filter in uhci driver, the load average
> goes to 0 as before.
>
>
>
> % vmstat -i
> interrupt total rate
> irq1: atkbd0 398 2
> irq9: acpi0 408 2
> irq12: psm03 0
> irq19: ehci1 37 0
> irq20: hpet0 uhci0 35970230
> irq22: ehci0 2 0
> irq256: em04 0
> irq257: ahci0 1692 10
> Total 38514246
> ===

I haven't noticed that issue, and it is certainly not expected. I will try
to reproduce it.

Most likely you should be able to avoid interrupt sharing by using some
additional HPET options, described in hpet(4).

> BTW, when USB port is enabled C3 transition rate gets lower.
> I think it is likely to occur. But how can I supress power
> consumption?

I can't say anything about USB, but you may try this patch to optimize some
other subsystems:
http://people.freebsd.org/~mav/tm6292_idle.patch

> It's time to implement powertop for freebsd, isn't it?

It surely is. I was even thinking about porting the one from OpenSolaris,
but other work distracted me. You may take it, if you wish.

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-followup
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Fri, 24 Sep 2010 10:22:41 +0300

Dmitry Kubov wrote:
> Is it possible to stick running threads to same CPU core for longer time
> to avoid C-states latencies penalty?

man 1 cpuset

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-followup
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Thu, 23 Sep 2010 16:07:32 +0300

Dmitry Kubov wrote:
>> Try to kill powerd and manually set highest CPU frequency. 0.40s test
>> time looks a bit suspicious, as powerd may just not react in time to set
>> P0 state.
>>
> powerd does not enabled. Where/how set highest CPU frequency?

sysctl dev.cpu |grep freq

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-followup
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Thu, 23 Sep 2010 16:01:09 +0300

Dmitry Kubov wrote:
>
>> This CPU has only 266MHz TurboBoost speedup. And some part of it
>> (probably half) could be enabled all the time. This benefit still could
>> be overweighted by C-states latencies penalty. It could be interesting
>> to test some other workloads, like compilation with different number of
>> threads.
>>
>
> Actually tested 8.1-RELEASE with both TurboBoost options in BIOS:
>
> TurboBoost OFF
> Ubench Single CPU: 451935 (0.40s)
> Ubench Single CPU: 450927 (0.40s)
> Ubench Single CPU: 450486 (0.40s)
>
> TurboBoost ON
> Ubench Single CPU: 450890 (0.40s)
> Ubench Single CPU: 450890 (0.40s)
> Ubench Single CPU: 449926 (0.40s)
>
> C-states latencies penalty is reasonable idea. But looks like P0-state
> not activated at all.

Try to kill powerd and manually set the highest CPU frequency. The 0.40s
test time looks a bit suspicious, as powerd may simply not react in time to
set the P0 state.

> What about too high %% for C3 state during heavy load:
> dev.cpu.0.cx_usage: 0.17% 0.06% 99.75% last 7560us

That is not really strange. These numbers count the number of entries into
each state, so while the CPU is completely busy they are not updated. The
main case where the C1 state should be actively used and counted is a load
with a high interrupt rate or heavy context switching, such as disk I/O or
network load.

>> Disk performance fix is reasonable. Some recent improvements in
>> 9-CURRENT should improve it even more. What's about ubench - try some
>> different load.
>>
> Can you suggest other CPU only benchmark?
>
> make -j 16 buildworld
> can't load all cores, can't see less than 11% idle

I don't think completely loading all CPUs is the main goal. But this test is
realistic and gives a really usable result.

--
Alexander Motin
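Since the cx_usage percentages are shares of state entries rather than of time, they can be recomputed from the raw entry counters. The sketch below is an assumption about how such a report could be derived, not a copy of the kernel code: each counter is divided by the sum of all counters and truncated to two decimals with integer arithmetic. The function name is hypothetical; the sample counters are the `cx_usage_counters` values from the T420 sysctl output earlier in this archive.

```python
def cx_usage(counters):
    """Format per-state entry counters as truncated percentages of all entries."""
    total = sum(counters)
    parts = []
    for count in counters:
        if total == 0:
            parts.append("0.00%")  # nothing counted yet
            continue
        whole = count * 100 // total                  # integer percent
        frac = (count * 100 % total) * 100 // total   # two truncated decimals
        parts.append(f"{whole}.{frac:02d}%")
    return " ".join(parts)

print(cx_usage([124, 817]))          # counters reported on AC power
print(cx_usage([1946, 106, 11173]))  # counters reported on battery
```

A CPU that stays busy never enters an idle state, so its counters and these shares stay frozen at whatever the last idle period left behind.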
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-followup
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Thu, 23 Sep 2010 15:34:23 +0300

Dmitry Kubov wrote:
>> It would be
>> interesting to repeat same test if you updated to 8-STABLE or at least
>> apply patch from SVN rev 209897 on 2010-07-11 11:58:46Z.
>
> New system:
> CPU: Intel(R) Xeon(R) CPU X5680 @ 3.33GHz (.47-MHz
> K8-class CPU)
> FreeBSD/SMP: Multiprocessor System Detected: 12 CPUs
> FreeBSD/SMP: 2 package(s) x 6 core(s)
> HT disabled in BIOS.

This CPU has only a 266MHz TurboBoost speedup, and some part of it (probably
half) could be enabled all the time. This benefit could still be outweighed
by the C-state latency penalty. It could be interesting to test some other
workloads, like compilation with different numbers of threads.

> Note /3334 difference:
> TurboBoost disabled:
> dev.cpu.0.freq:
> dev.cpu.0.freq_levels: /13 3200/117000 3067/105000 2933/94000
> 2800/85000
> 2667/76000 2533/68000 2400/61000 2267/54000 2133/48000 2000/43000
> 1867/39000 17
> 33/35000 1600/32000 1400/28000 1200/24000 1000/2 800/16000 600/12000
> 400/8000 200/4000
> dev.est.0.freq_settings: /13 3200/117000 3067/105000 2933/94000
> 2800/850
> 00 2667/76000 2533/68000 2400/61000 2267/54000 2133/48000 2000/43000
> 1867/39000 1733/35000 1600/32000
>
> TurboBoost enabled:
> dev.cpu.0.freq: 3334
> dev.cpu.0.freq_levels: 3334/143000 3200/117000 3067/105000 2933/94000
> 2800/85000
> 2667/76000 2533/68000 2400/61000 2267/54000 2133/48000 2000/43000
> 1867/39000 17
> 33/35000 1600/32000 1400/28000 1200/24000 1000/2 800/16000 600/12000
> 400/8000 200/4000
> dev.est.0.freq_settings: 3334/143000 /13 3200/117000 3067/105000
> 2933/94
> 000 2800/85000 2667/76000 2533/68000 2400/61000 2267/54000 2133/48000
> 2000/43000 1867/39000 1733/35000 1600/32000

Intel writes that the BIOS may report an additional P-state with a 1MHz
difference, to allow the OS to control TurboBoost. Dropping very close
frequencies is just a behavior/limitation of the cpufreq subsystem.
Actually, I am not sure how this additional P-state could be used, except
for testing.

> In short: no 60% disk io performance drop in 8.1-STABLE. Other tests
> give same results like 8.1-RELEASE, 5% average cpu performance drop.

The disk performance fix is reasonable. Some recent improvements in
9-CURRENT should improve it even more. As for ubench, try some different
load.

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-follo...@freebsd.org
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Tue, 21 Sep 2010 14:16:46 +0300

Dmitry Kubov wrote:
> Ok, I am able to activate C3 state after loader.conf tweaks. According
> to http://www.intel.com/technology/turboboost/
>
> Intel Turbo Boost Technology is activated when the Operating System (OS)
> requests the highest processor performance state (P0).
>
> I have no clue about P0 state activation on FreeBSD.

P0 is just the highest available CPU frequency. If you are not using powerd,
it should be set all the time. If you are using powerd, it will set P0
within a fraction of a second after load appears.

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Dmitry Kubov
Cc: Andriy Gapon, j...@freebsd.org, bug-follo...@freebsd.org
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Mon, 20 Sep 2010 18:49:54 +0300

Dmitry Kubov wrote:
>> 205 * 3 and 245 * 3 are both greater than 500, so this is the reason why
>> they are
>> never entered.
>>
>> Perhaps Alexander can give some advice here.
>
> Looks like I can simply update src to 8-stable?
>
> SVN rev 212887 on 2010-09-20 05:39:50Z by avg
>
> MFC r212549: acpi_cpu: do not apply P_LVLx_LAT rules to latencies
> returned by _CST

No, that is a different case. This won't help you.

--
Alexander Motin
Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
The following reply was made to PR i386/135447; it has been noted by GNATS.

From: Alexander Motin
To: Andriy Gapon
Cc: Dmitry Kubov, j...@freebsd.org, bug-follo...@freebsd.org
Subject: Re: i386/135447: [i386] [request] Intel Core i7 and Nehalem-EP new features not supported
Date: Mon, 20 Sep 2010 18:42:57 +0300

Andriy Gapon wrote:
> on 20/09/2010 17:54 Dmitry Kubov said the following:
>> dev.cpu.7.cx_supported: C1/3 C2/205 C3/245
> Note these^^^^^^
>> dev.cpu.7.cx_lowest: C3
>> dev.cpu.7.cx_usage: 100.00% 0.00% 0.00% last 500us
> And this --^
>> C2/C3 not used at all
>
> 205 * 3 and 245 * 3 are both greater than 500, so this is the reason why
> they are
> never entered.

The only way to enter C-states with such high latency is to significantly
increase the CPUs' continuous sleep time. The sleep time of 500us there is
artificial and is calculated as 1000000/(2*hz). 8.1 is not yet able to
measure the real sleep time in C1. But 2*hz is quite a realistic estimate
for an idle system.

Recently I committed a large set of patches to 9-CURRENT that make idle CPUs
not wake up on timer interrupts when it is not needed. It allows idle CPUs
to sleep up to as much as 10us, making any available C-states effectively
usable. I can confirm that TurboBoost on my Core i7 870 gives about a 10%
benefit when only one physical core is used:
http://docs.freebsd.org/cgi/mid.cgi?4C959830.3060808

I have requests, and the wish, to merge these changes into 8-STABLE, but
most likely it won't happen in the next few months, as the code is very new
and requires more testing. Until then I recommend that you follow this
guide:
http://wiki.freebsd.org/TuningPowerConsumption
It was written with laptops in mind, but effective usage of the C2/C3 states
was one of its goals. Also, on my Core i7 870 the LAPIC timer dies in C2/C3
states, so consider migrating to the i8254 timer, as also described in this
guide.

--
Alexander Motin
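The "latency * 3 versus previous sleep time" rule quoted in this message can be sketched as a small selection function. This is a simplified illustration and not the acpi_cpu code: the function name is hypothetical, and whether the real comparison is strict or non-strict is an assumption. The latencies and the 500us sleep estimate are the ones from the quoted sysctl output.

```python
def deepest_usable_state(latencies_us, prev_sleep_us):
    """Index of the deepest C-state whose latency * 3 still fits in the
    previous sleep time; index 0 (C1) is always allowed."""
    chosen = 0
    for index, latency in enumerate(latencies_us):
        if latency * 3 <= prev_sleep_us:
            chosen = index
    return chosen

latencies = [3, 205, 245]  # C1/3 C2/205 C3/245 from dev.cpu.7.cx_supported

for sleep_us in (500, 700, 1000):
    state = deepest_usable_state(latencies, sleep_us)
    print(f"previous sleep {sleep_us}us -> C{state + 1}")
```

With a 500us estimate only C1 qualifies, which matches the reported 100.00% 0.00% 0.00% usage; in this model the sleep time has to reach 615us before C2 is chosen and 735us before C3 is.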
Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9
Andriy Gapon wrote:
> on 14/09/2010 11:44 Andriy Gapon said the following:
>> on 13/09/2010 20:07 Andriy Gapon said the following:
>>> I am also going to take a look how Linux and OpenSolaris name the C-states.
>> Well, Linux does what you suggested, it uses index of a C-state as its name.
>> There is one difference from our current code - if a C-state is skipped for
>> some
>> reason, then its index is not re-used, but the entry is marked as non-valid.
>> So, if we skip "C2" for some reason, then "C3" will become "C2". Not so on
>> Linux.
>> Also, they print a type/class of a C state using C1, C2, C3 and "--" for
>> higher/unknown types.
>
> OpenSolaris, on the other hand, collapses multiple entries of the same type
> into
> a single entry using the most power-saving alternative.

I don't think that is a perfect choice. In that case it would be useless for
the ACPI BIOS to report extra states. The only case where I think it can be
reasonable to drop some items is when they are equal except for using
different entry methods. For example, one OS may prefer to use a port read,
while another may use MWAIT to be able to wake up without using an IPI.

> They also use the type as a C state reported name, index is not used in
> interfacing.

In their case it is possible.

--
Alexander Motin
Re: CPU C-state storange on Panasonic TOUGH BOOK CF-R9
Andriy Gapon wrote:
> on 12/09/2010 18:22 Andriy Gapon said the following:
>> Observations are correct, but incomplete; the conclusions are wrong.
>> At the end of the boot there are message like this one:
>> PROCESSOR-0722 [402244] cpu_cx_cst: acpi_cpu0: Got C2 - 245
>> latency
>> This is a result of re-evaluation of _CST because of a notification from
>> ACPI.
>
> But still, as you suggest, a patch like the following should be tested and
> committed:
>
> --- a/sys/dev/acpica/acpi_cpu.c
> +++ b/sys/dev/acpica/acpi_cpu.c
> @@ -828,7 +828,8 @@ acpi_cpu_cx_list(struct acpi_cpu_softc *sc)
>  sbuf_new(&sb, sc->cpu_cx_supported, sizeof(sc->cpu_cx_supported),
>  SBUF_FIXEDLEN);
>  for (i = 0; i < sc->cpu_cx_count; i++) {
> - sbuf_printf(&sb, "C%d/%d ", i + 1, sc->cpu_cx_states[i].trans_lat);
> + sbuf_printf(&sb, "C%d/%d ", sc->cpu_cx_states[i].type,
> + sc->cpu_cx_states[i].trans_lat);
>  if (sc->cpu_cx_states[i].type < ACPI_STATE_C3)
>  sc->cpu_non_c3 = i;
>  }

I am not sure this patch is complete:

1) AFAIR I have seen somewhere an example where a system had several
C-states with different latencies but the same type, C3. The type only
describes the enter/exit semantics, and there can be several states with the
same semantics. I am not sure how to name them properly in this case. Maybe
the existing approach was not so bad: these are ACPI C-states, not CPU
C-states, and they are not the same thing. Maybe we should just mention the
type somewhere in addition.

2) This change makes the values of cx_lowest hard to understand.

3) If we touch cx_lowest, I would prefer to be able to set it to some
abstract C6 or whatever, allowing the system to automatically choose the
deepest state it has available at the moment.

--
Alexander Motin
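Point 1 can be made concrete with a small comparison of the two naming schemes. This is a sketch in Python rather than the C of the patch; the function names and the state table are hypothetical, chosen so that two entries share ACPI type C3, as in the case recalled above.

```python
def name_by_index(states):
    """Current scheme: the name is the position in the list (C1, C2, ...)."""
    return " ".join(f"C{i + 1}/{lat}" for i, (_type, lat) in enumerate(states))

def name_by_type(states):
    """Patched scheme: the name is the ACPI type reported for the state."""
    return " ".join(f"C{acpi_type}/{lat}" for acpi_type, lat in states)

# (ACPI type, transition latency in us); the last two entries share type C3
states = [(1, 1), (2, 80), (3, 109), (3, 245)]

print(name_by_index(states))
print(name_by_type(states))
```

Naming by type yields two entries called C3, so a cx_lowest value of "C3" no longer identifies a single state, while naming by index keeps every entry unique at the cost of hiding the type.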
Re: cpufreq_curr_sysctl: memory allocation
Andriy Gapon wrote:
> I noticed that cpufreq_curr_sysctl performs a substantial memory allocation
> and
> deallocation on each call. Its size is CF_MAX_LEVELS * sizeof(*levels), which
> is ~24KB. This happens even for read-only calls to just query current level.
> And such calls happen quite frequently when powerd is running.

Worse, it not only consumes time but also causes a bunch of TLB flush IPIs
on free(). A read-only call doesn't even need CF_MAX_LEVELS *
sizeof(*levels); a single sizeof(*levels) should be enough there. Maybe then
it would fit into some existing UMA zone, minimizing the penalty.

> I think that this is an unnecessary and avoidable load for VM system.
> Couldn't a buffer be preallocated in sc and re-used for the calls?
> Even if not, for some reason, then wouldn't it be better to have a dedicated
> uma
> zone for that rather than doing malloc+free?

A dedicated, rarely used UMA zone may eat much more memory than is needed on
SMP.

--
Alexander Motin
Re: Panic on S3 suspend call.
John Baldwin wrote:
> On Tuesday 08 June 2010 5:52:54 am Alexander Motin wrote:
>> Hi.
>>
>> Just noted that fresh HEAD i386 system panics on suspend request when
>> build with INVARIANTS and WITNESS:
>>
>> panic: mutex ACPI global lock owned at ../../../kern/kern_event.c:1899
>> cpuid = 1
>> KDB: enter: panic
>> [ thread pid 1047 tid 100138 ]
>> Stopped at 0x408d29df: movl$0,0x40dded34
>> db> bt
>> Tracing pid 1047 tid 100138 td 0x45fcb9c0
>> kdb_enter(40c75fe3,40c75fe3,40c74763,7c91fb1c,1,...) at 0x408d29df
>> panic(40c74763,40c26898,40c70d4e,76b,7c91fb40,...) at 0x4089ec96
>> _mtx_assert(40da08a0,0,40c70d4e,76b,7c91fb70,...) at 0x4088e227
>> knlist_mtx_assert_unlocked(40da08a0,4088ed2c,40da08a0,45d377c0,3,...) at
>> 0x4086b06e
>> knote(45d377dc,0,0,921,0,...) at 0x4086b9ff
>> acpi_ReqSleepState(456c3700,3,40c2633d,c76,0,...) at 0x404e8f4b
>
> I think this should fix it:
>
> Index: acpi.c
> ===
> --- acpi.c (revision 208893)
> +++ acpi.c (working copy)
> @@ -2346,7 +2346,7 @@
>  clone->notify_status = APM_EV_NONE;
>  if ((clone->flags & ACPI_EVF_DEVD) == 0) {
>  selwakeuppri(&clone->sel_read, PZERO);
> - KNOTE_UNLOCKED(&clone->sel_read.si_note, 0);
> + KNOTE_LOCKED(&clone->sel_read.si_note, 0);
>  }
>  }

With this patch it doesn't panic. A bit surprising, as the code was written
that way almost three years ago.

--
Alexander Motin
Panic on S3 suspend call.
Hi.

I just noticed that a fresh HEAD i386 system panics on a suspend request
when built with INVARIANTS and WITNESS:

panic: mutex ACPI global lock owned at ../../../kern/kern_event.c:1899
cpuid = 1
KDB: enter: panic
[ thread pid 1047 tid 100138 ]
Stopped at 0x408d29df: movl$0,0x40dded34
db> bt
Tracing pid 1047 tid 100138 td 0x45fcb9c0
kdb_enter(40c75fe3,40c75fe3,40c74763,7c91fb1c,1,...) at 0x408d29df
panic(40c74763,40c26898,40c70d4e,76b,7c91fb40,...) at 0x4089ec96
_mtx_assert(40da08a0,0,40c70d4e,76b,7c91fb70,...) at 0x4088e227
knlist_mtx_assert_unlocked(40da08a0,4088ed2c,40da08a0,45d377c0,3,...) at 0x4086b06e
knote(45d377dc,0,0,921,0,...) at 0x4086b9ff
acpi_ReqSleepState(456c3700,3,40c2633d,c76,0,...) at 0x404e8f4b
acpiioctl(45793400,80045004,45d07810,3,45fcb9c0,...) at 0x404e9118
devfs_ioctl_f(45d820e0,80045004,45d07810,45d34a80,45fcb9c0,...) at 0x4081d1e8
kern_ioctl(45fcb9c0,3,80045004,45d07810,91fcec,...) at 0x408ebdbd
ioctl(45fcb9c0,7c91fcec,0,40cb192e,0,...) at 0x408ebf47
syscallenter(45fcb9c0,7c91fce4,40b9fa00,40dcd190,0,...) at 0x408e0b23
syscall(7c91fd28) at 0x40b9f169
Xint0x80_syscall() at 0x40b7f49a
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x28183173, esp = 0x3fbfeb1c, ebp = 0x3fbfebf8 ---
db>

--
Alexander Motin