Re: powerd / cpufreq question
On Tue, 12 Apr 2011, Daniel Gerzo wrote:
> On 11.4.2011 6:08, Ian Smith wrote:
>> As you see, total of differences for each cpu is here 89 ticks, but I've no idea of the interval between your two readings, or your value of HZ?
>
> the interval may have been around 1-2 seconds. My value of HZ is default, 1000.

Ok, seems it depends on stathz, not HZ, so 89'd be less than 1 second if your stathz is 128 .. I gather that may be changed with the 9.x timers?

>> Are those kern.cp_times values as they came, or did you remove trailing zeroes? Reason I ask is that on my Thinkpad T23, single-core 1133/733 MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through cpu31, on 8.2-PRE about early January. I need to update the script to remove surplus data for non-existing cpus, but wonder if the extra data also appeared on your 12 core box?
>
> I haven't removed anything, it's a pure copypaste.

Thanks. I'll check the single-cpu case again after updating to 8.2-R

cheers, Ian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: powerd / cpufreq question
Ian Smith wrote:
> On Tue, 12 Apr 2011, Daniel Gerzo wrote:
>> On 11.4.2011 6:08, Ian Smith wrote:
>>> As you see, total of differences for each cpu is here 89 ticks, but I've no idea of the interval between your two readings, or your value of HZ?
>>
>> the interval may have been around 1-2 seconds. My value of HZ is default, 1000.
>
> Ok, seems it depends on stathz, not HZ, so 89'd be less than 1 second if your stathz is 128 .. I gather that may be changed with the 9.x timers?

9-CURRENT tries to set stathz to 127, or at least somewhere around that. The main difference there is that clocks really tick only when the CPU is running, and are emulated during idle periods to allow C-states to do their job.

--
Alexander Motin
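To make the stathz arithmetic in this exchange concrete, here is a minimal sketch in Python, assuming the kern.cp_times counters advance once per statistics-clock tick as Ian suggests. The function name is hypothetical; the inputs are the 89-tick delta and the stathz values of 128 and 127 mentioned in the messages.

```python
def ticks_to_seconds(ticks, stathz):
    """Convert a count of statistics-clock ticks into elapsed seconds."""
    if stathz <= 0:
        raise ValueError("stathz must be positive")
    return ticks / stathz


# 89 ticks per CPU were counted between the two kern.cp_times samples.
for stathz in (128, 127):
    print(f"stathz={stathz}: {ticks_to_seconds(89, stathz):.3f} s")
```

Either value puts the sampling interval at roughly 0.7 seconds, which agrees with Ian's "less than 1 second" estimate.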
Re: powerd / cpufreq question
On 11.4.2011 6:08, Ian Smith wrote:
> As you see, total of differences for each cpu is here 89 ticks, but I've no idea of the interval between your two readings, or your value of HZ?

The interval may have been around 1-2 seconds. My value of HZ is the default, 1000.

> Are those kern.cp_times values as they came, or did you remove trailing zeroes? Reason I ask is that on my Thinkpad T23, single-core 1133/733 MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through cpu31, on 8.2-PRE about early January. I need to update the script to remove surplus data for non-existing cpus, but wonder if the extra data also appeared on your 12 core box?

I haven't removed anything, it's a pure copypaste.

--
S pozdravom / Best regards
Daniel Gerzo, FreeBSD committer
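The clean-up Ian describes (grouping kern.cp_times into per-CPU sets of five counters and discarding the surplus all-zero groups) could be sketched as follows. This is Python rather than the sh script from the thread, which is not reproduced here. The function names are hypothetical, and dropping only trailing all-zero groups is an assumed heuristic, not something the thread specifies.

```python
CPUSTATES = 5  # user, nice, sys, intr, idle


def split_cp_times(raw, drop_empty=True):
    """Split a kern.cp_times value into one 5-tuple of counters per CPU."""
    values = [int(v) for v in raw.split()]
    if len(values) % CPUSTATES:
        raise ValueError("expected a multiple of %d counters" % CPUSTATES)
    cpus = [tuple(values[i:i + CPUSTATES])
            for i in range(0, len(values), CPUSTATES)]
    if drop_empty:
        # Assumed heuristic: trailing groups that are entirely zero belong
        # to non-existent CPUs, since a real CPU accumulates idle ticks.
        while cpus and not any(cpus[-1]):
            cpus.pop()
    return cpus


def deltas(before, after):
    """Per-CPU, per-state tick differences between two samples."""
    return [tuple(b - a for a, b in zip(t0, t1))
            for t0, t1 in zip(before, after)]


# cpu0 counters from the thread, padded with two empty CPU slots.
t0 = split_cp_times("4182996 0 306925 85623 13563403" + " 0" * 10)
t1 = split_cp_times("4183013 0 306927 85626 13563469" + " 0" * 10)
print(deltas(t0, t1))
```

For the cpu0 sample this yields the same 17/0/2/3/66 split that Ian's script reported.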
Re: powerd / cpufreq question
On Fri, 8 Apr 2011, Daniel Geržo wrote:
> Hello guys,
>
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to utilize powerd(8) on it; however, when I run `powerd -v -r90' I see something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle; So I realized that powerd might take the load as the sum of loads of all the cores (12), so I tried to tweak powerd arguments like this:

Hi Daniel, Alexander, all. I hope to engage more on this interesting topic later, but first:

[..]
> Example of two consecutive cp_times sysctl output:
>
> kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110 14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650 2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 0 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 178894 36963 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 30804 15832552 1406621 0 92686 1058 16638508
> kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110 14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735 2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 0 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 178897 36963 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 30804 15832638 1406621 0 92686 1058 16638597

I wrote the script included below to try making some sense of these, that defaults to using your above values, resulting in:

smithi on sola% sh cptimes.sh
             cp_user   cp_nice    cp_sys   cp_intr   cp_idle
cpu: 0 @t0   4182996         0    306925     85623  13563403
cpu: 0 @t1   4183013         0    306927     85626  13563469
                  17         0         2         3        66
cpu: 1 @t0   3164971         0    201479     93110  14679313
cpu: 1 @t1   3164980         0    201482     93110  14679390
                   9         0         3         0        77
cpu: 2 @t0   3450792         0    258166     80198  14349717
cpu: 2 @t1   3450796         0    258167     80199  14349800
                   4         0         1         1        83
cpu: 3 @t0   2795270         0    180252     76701  15086650
cpu: 3 @t1   2795274         0    180252     76701  15086735
                   4         0         0         0        85
cpu: 4 @t0   2952777         0    217156    119627  14849313
cpu: 4 @t1   2952780         0    217157    119629  14849396
                   3         0         1         2        83
cpu: 5 @t0   2418067         0    158594     73497  15488715
cpu: 5 @t1   2418070         0    158597     73497  15488798
                   3         0         3         0        83
cpu: 6 @t0   2408492         0    175131    104377  15450873
cpu: 6 @t1   2408499         0    175132    104377  15450954
                   7         0         1         0        81
cpu: 7 @t0   2003803         0    131790     75753  15927527
cpu: 7 @t1   2003804         0    131791     75753  15927614
                   1         0         1         0        87
cpu: 8 @t0   2456736         0    178894     36963  15466280
cpu: 8 @t1   2456744         0    178897     36963  15466358
                   8         0         3         0        78
cpu: 9 @t0   1607095         0    117396      4197  16410185
cpu: 9 @t1   1607098         0    117398      4197  16410269
                   3         0         2         0        84
cpu:10 @t0   2127878         0    147639     30804  15832552
cpu:10 @t1   2127880         0    147640     30804  15832638
                   2         0         1         0        86
cpu:11 @t0   1406621         0     92686      1058  16638508
cpu:11 @t1   1406621         0     92686      1058  16638597
                   0         0         0         0        89

As you see, total of differences for each cpu is here 89 ticks, but I've no idea of the interval between your two readings, or your value of HZ?

Are those kern.cp_times values as they came, or did you remove trailing zeroes? Reason I ask is that on my Thinkpad T23, single-core 1133/733 MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through cpu31, on 8.2-PRE about early January. I need to update the script to remove surplus data for non-existing cpus, but wonder if the extra data also appeared on your 12 core box?

[..]
Re: powerd / cpufreq question
On 8.4.2011 19:52, Alexander Motin wrote:
>> So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
>> Can you please review and comment? I should be able to commit it myself if you consider it acceptable. It seems to work for me :)
>
> Looks fine, except that the -f option has to be the first, which is not obvious. Another thing -- I've noticed some load constants hardcoded there. They should also be handled to make higher values work properly.

I tried to be more explicit in the error message, which tries to emphasize the need to put it first. I don't know myself how it would be possible to code it so that the -f doesn't need to be first. Ideas?

Do you mean the values around lines of 730 - 762?

From what I have observed, if I have a machine that is a little more loaded (say 300%) and the load goes up, it tries to increase the performance to quite a high freq (5336), and when the load decreases again, it takes quite a while to go down from 5336 to a frequency that is actually available to decrease the performance (something less than 2934). So the lower frequency is used for too short a time because it takes too much time to get to it...

>> Seems like it was enabled by default. I have like these:
>> dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
>> Does that mean I only need to set these in rc.conf?:
>> performance_cx_lowest=C3
>> economy_cx_lowest=C3
>> Then run /etc/rc.d/power_profile 0x00?
>
> In short - yes. In long - read the link I've given.
>
>> May it cause any instability?
>
> If you don't switch from LAPIC to another timer and it stops, your system will freeze, or at least not work well. You should notice problems immediately, if there are any.

So I will also need to change kern.timecounter.hardware to i8254? I suppose it will cause a little less precise time, but should I expect lower performance? I don't care that much about the time accuracy.

How do I know the C3 is active? And how does it switch back to C1 for example?

>>>> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?
>>>
>>> I suppose around May.
>>
>> Do you have some patches? If not you don't really need to make them just for me, I can wait a little.
>
> Last ones I've generated are five months old: http://people.freebsd.org/~mav/timers_merge/ They are large and I am not sure how well they apply now.

I guess I will just stick with vanilla 8-stable and then update.

>>>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

--
S pozdravom / Best regards
Daniel Gerzo, FreeBSD committer
Re: powerd / cpufreq question
On 09.04.2011 10:57, Daniel Gerzo wrote:
> On 8.4.2011 19:52, Alexander Motin wrote:
>>> So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
>>> Can you please review and comment? I should be able to commit it myself if you consider it acceptable. It seems to work for me :)
>>
>> Looks fine, except that the -f option has to be the first, which is not obvious. Another thing -- I've noticed some load constants hardcoded there. They should also be handled to make higher values work properly.
>
> I tried to be more explicit in the error message, which tries to emphasize the need to put it first. I don't know myself how it would be possible to code it so that the -f doesn't need to be first. Ideas?

Move checks after the loop? Just an idea.

> Do you mean the values around lines of 730 - 762?

Yes. When the load is more than twice the limit, the frequency rises faster. To make it work with a limit of 50%, there is a hardcoded additional check for the 95% level.

> From what I have observed, if I have a machine that is a little more loaded (say 300%) and the load goes up, it tries to increase the performance to quite a high freq (5336), and when the load decreases again, it takes quite a while to go down from 5336 to a frequency that is actually available to decrease the performance (something less than 2934). So the lower frequency is used for too short a time because it takes too much time to get to it...

It is intended behavior in hiadaptive mode, where performance is preferable to power-saving.

>>> Seems like it was enabled by default. I have like these:
>>> dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
>>> Does that mean I only need to set these in rc.conf?:
>>> performance_cx_lowest=C3
>>> economy_cx_lowest=C3
>>> Then run /etc/rc.d/power_profile 0x00?
>>
>> In short - yes. In long - read the link I've given.
>>
>>> May it cause any instability?
>>
>> If you don't switch from LAPIC to another timer and it stops, your system will freeze, or at least not work well. You should notice problems immediately, if there are any.
>
> So I will also need to change kern.timecounter.hardware to i8254? I suppose it will cause a little less precise time, but should I expect lower performance? I don't care that much about the time accuracy.

I wasn't mentioning timecounter there. In terms of 9-CURRENT I was talking about eventtimer. In 8-STABLE it is not formalized yet, and so the guide mentions a number of tunables.

> How do I know the C3 is active?

sysctl dev.cpu.X.cx_usage

> And how does it switch back to C1 for example?

When the CPU is idle, depending on previous idle statistics, the system puts it into one of the reported and allowed C-states. The CPU goes back to the C0 state on any hardware interrupt.

--
Alexander Motin
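The rise and fall of the "wanted" frequency discussed in this message can be illustrated with a small model. This is a sketch under stated assumptions, not powerd's actual code. The thread confirms three things: a faster rise when load exceeds twice the limit, an extra hardcoded 95% check, and a wanted frequency of 5336 on a 2668 MHz CPU. The doubling factor, the 7/8 decay per step, the idle mark of 50 and both function names are hypothetical choices made for illustration.

```python
def step_wanted_freq(freq, load, running_mark, idle_mark, ceiling):
    """One polling step of a simplified, hypothetical frequency controller."""
    if load > running_mark:
        if load > 95 or load > 2 * running_mark:
            freq *= 2                          # assumed fast-rise factor
        else:
            freq = freq * load // running_mark  # proportional rise
        return min(freq, ceiling)
    if load < idle_mark:
        return freq * 7 // 8                   # assumed slow-decay factor
    return freq


def steps_to_fall_below(freq, target, running_mark=90, idle_mark=50,
                        ceiling=5336):
    """Count idle polling steps until the wanted frequency drops below target."""
    steps = 0
    while freq >= target:
        freq = step_wanted_freq(freq, 0, running_mark, idle_mark, ceiling)
        steps += 1
    return steps


# One busy sample (load 120% with -r90) pushes the wanted value to the ceiling.
print(step_wanted_freq(2668, 120, 90, 50, 5336))
# Idle polling steps needed before the wanted value is below the real 2668 MHz.
print(steps_to_fall_below(5336, 2668))
```

Under these assumed constants a single busy sample is enough to reach the ceiling, while several consecutive idle samples are needed before the wanted value falls below the CPU's real maximum. That asymmetry is the fast-rise, slow-fall behaviour Daniel observed and Alexander describes as intended.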
Re: powerd / cpufreq question
I just noticed this thread a day after my own fight with powerd and load percentages that did not seem to make any sense. The patch I came up with is attached. It modifies powerd to use the load percentage of the busiest core. This reduces the range of values back to 0%...100% also for multi-core systems.

On my Core i7 setup here, the change seems to work well.

- Bartosz

--- powerd.c.old        2011-04-07 17:30:58.0 +0200
+++ powerd.c    2011-04-07 17:38:28.0 +0200
@@ -128,7 +128,7 @@
        static long *cp_times = NULL, *cp_times_old = NULL;
        static int ncpus = 0;
        size_t cp_times_len;
-       int error, cpu, i, total;
+       int error, cpu, i, total, max;
 
        if (cp_times == NULL) {
                cp_times_len = 0;
@@ -151,7 +151,7 @@
                return (error);
 
        if (load) {
-               *load = 0;
+               max = 0;
                for (cpu = 0; cpu < ncpus; cpu++) {
                        total = 0;
                        for (i = 0; i < CPUSTATES; i++) {
@@ -160,9 +160,12 @@
                        }
                        if (total == 0)
                                continue;
-                       *load += 100 - (cp_times[cpu * CPUSTATES + CP_IDLE] -
+                       total = 100 - (cp_times[cpu * CPUSTATES + CP_IDLE] -
                            cp_times_old[cpu * CPUSTATES + CP_IDLE]) * 100 / total;
+                       if (total > max)
+                               max = total;
                }
+               *load = max;
        }
 
        memcpy(cp_times_old, cp_times, cp_times_len);
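To see what the patch changes in practice, the sketch below applies the same per-CPU formula as the C code (100 minus the idle share of the tick delta, using integer division) to the per-CPU tick deltas that Ian Smith posted elsewhere in this thread, then compares the summed load with the busiest-core load. It is written in Python for brevity; the variable and function names are illustrative and do not come from powerd.

```python
CP_IDLE = 4  # index of the idle counter within each 5-tuple

# Per-CPU tick deltas (user, nice, sys, intr, idle) from the thread's samples.
DELTAS = [
    (17, 0, 2, 3, 66), (9, 0, 3, 0, 77), (4, 0, 1, 1, 83), (4, 0, 0, 0, 85),
    (3, 0, 1, 2, 83), (3, 0, 3, 0, 83), (7, 0, 1, 0, 81), (1, 0, 1, 0, 87),
    (8, 0, 3, 0, 78), (3, 0, 2, 0, 84), (2, 0, 1, 0, 86), (0, 0, 0, 0, 89),
]


def cpu_load(delta):
    """Load percentage of one CPU, or None if no ticks elapsed."""
    total = sum(delta)
    if total == 0:
        return None
    return 100 - delta[CP_IDLE] * 100 // total


loads = [load for load in map(cpu_load, DELTAS) if load is not None]
summed = sum(loads)              # what the unpatched code reports
busiest = max(loads, default=0)  # what the patched code reports

print("per-CPU loads:", loads)
print("summed load: %d%%, busiest core: %d%%" % (summed, busiest))
```

For this sample the summed figure is 100% and the busiest core is at 25%. That is why an unpatched powerd started with -r90 considers the machine busy even though top(1) shows it mostly idle.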
Re: powerd / cpufreq question
On 09.04.2011 17:39, Bartosz Fabianowski wrote:
> I just noticed this thread a day after my own fight with powerd and load percentages that did not seem to make any sense. The patch I came up with is attached. It modifies powerd to use the load percentage of the busiest core. This reduces the range of values back to 0%...100% also for multi-core systems.

While using the maximum of loads can be better than using levels above 100%, it won't properly handle cases of dependent or frequently migrating threads, which are handled now with summary load and levels less than 100%. While the existing powerd algorithm is indeed not perfect, it is the only relatively performance-safe one, unlike the other propositions. I won't argue against adding more algorithms/options to powerd, optimized for handling different situations, but I believe that the default should remain safe.

> On my Core i7 setup here, the change seems to work well.

... in your specific workload. And you haven't described how you measured system performance to prove that it hasn't decreased.

--
Alexander Motin
Re: powerd / cpufreq question
>> On my Core i7 setup here, the change seems to work well.
>
> ... in your specific workload. And you haven't described how you measured system performance to prove that it hasn't decreased.

My measure of performance is entirely unscientific: This is a desktop box. Performance is good if KDE reacts to inputs quickly. My patch preserves this for me while making the box run a bit cooler.

I am by no means advocating that my patch be made the default behavior. But as you said, it may be nice to include it as one of several algorithms the user can choose from.

- Bartosz
Re: powerd / cpufreq question
> Date: Sat, 09 Apr 2011 09:57:28 +0200
> From: Daniel Gerzo dan...@freebsd.org
> Sender: owner-freebsd-sta...@freebsd.org
>
> On 8.4.2011 19:52, Alexander Motin wrote:
> [..]
> I guess I will just stick with vanilla 8-stable and then update.
>
> >>>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

I would like to emphasize that simple frequency reductions through throttling and TCC really don't save power. EST, which actually does change the CPU clock AND voltage, is a win, but not a big one. If you want to reduce power, deeper sleep states are the only way to go further.

Dr. Tajana Rosing of the UC-San Diego System Energy Efficiency Lab has presented research that shows that servers save by far the most power when a process gets in, runs at maximum speed and gets out to allow the system to sleep. This is clearly the only way to significantly improve power consumption in servers. If you don't enable and use S3 and better (when present), you are not being power efficient.

To do this, the clock must rise very quickly and should drop slowly. And even then, it's not likely to save much power or reduce heat load significantly.

--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
Re: powerd / cpufreq question
Hi.

On 08.04.2011 14:12, Daniel Geržo wrote:
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to utilize powerd(8) on it; however, when I run `powerd -v -r90' I see something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is according to top(1) ~90% idle; So I realized that powerd might take the load as the sum of loads of all the cores (12), so I tried to tweak powerd arguments like this:
>
> `powerd -v -r 1000 -i 600'
>
> but that errors for me with:
>
> root@[s1-a ~]# powerd -v -r 1000 -i 600
> powerd: 1000 is not a valid percent
>
> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...

It is a reasonable limitation. powerd can't know how load is distributed among multiple cores in time. If all cores are equally busy at, let's say, 10% (that gives 120% total) and cores are never waiting for each other, then obviously frequency could be reduced. But if the same 120% means 100%+20%, or if load is equally spread but processes on different cores are waiting for each other, then reducing frequency will reduce performance. powerd can't know that and so stays on the safe side.

> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):

You may see there that it is a wanted frequency, not a real one. :) It is an internal implementation detail. This is how powerd implements keeping the full frequency for some time after the load has dropped. It's not a bug.

On multi-core systems like this, power management can better be done on a per-core basis. Powerd can't control frequencies on a per-core basis (also because it requires non-trivial interoperation with the scheduler). But if your ACPI BIOS allows, you can try to put unused cores into deeper C-states, which may give better power saving, and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some bonuses can still be achieved.

You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

--
Alexander Motin
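Alexander's point that a summed load cannot distinguish between very different situations can be shown with his own two examples. This is a minimal Python sketch. The two per-core lists are the hypothetical distributions he describes for a 12-core box, not measurements.

```python
# Twelve cores, each 10% busy: little to lose from a lower frequency.
even = [10] * 12

# One core saturated, one at 20%, ten idle: slowing down hurts the busy core.
skewed = [100, 20] + [0] * 10

for name, per_core in (("even", even), ("skewed", skewed)):
    print("%-6s sum=%d%% busiest=%d%%"
          % (name, sum(per_core), max(per_core)))
```

Both cases report the same 120% total, so a policy that only sees the sum has to treat them alike. Even per-core figures would not reveal the third case Alexander mentions, where threads on different cores wait for each other.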
Re: powerd / cpufreq question
On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:

Hello Alexander, thanks for the quick reply;

>> root@[s1-a ~]# powerd -v -r 1000 -i 600
>> powerd: 1000 is not a valid percent
>>
>> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...
>
> It is a reasonable limitation. powerd can't know how load is distributed among multiple cores in time. If all cores are equally busy at, let's say, 10% (that gives 120% total) and cores are never waiting for each other, then obviously frequency could be reduced. But if the same 120% means 100%+20%, or if load is equally spread but processes on different cores are waiting for each other, then reducing frequency will reduce performance. powerd can't know that and so stays on the safe side.

OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set higher -r and -i values, what do you think?

>> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):
>
> You may see there that it is a wanted frequency, not a real one. :) It is an internal implementation detail. This is how powerd implements keeping the full frequency for some time after the load has dropped. It's not a bug.

OK :-) I actually thought powerd always honors the values from dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little weird to me.

> On multi-core systems like this, power management can better be done on a per-core basis. Powerd can't control frequencies on a per-core basis (also because it requires non-trivial interoperation with the scheduler). But if your ACPI BIOS allows, you can try to put unused cores into deeper C-states, which may give better power saving, and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some bonuses can still be achieved.

Any idea what I should look for in the BIOS? This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?

> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption

From reading this, are you referring above to the C2 states? (seems like C3 is not optimal for this kind of operation...)

Thanks.

--
Kind regards
Daniel
Re: powerd / cpufreq question
On 08.04.2011 17:42, Daniel Gerzo wrote:
> On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:
>>> root@[s1-a ~]# powerd -v -r 1000 -i 600
>>> powerd: 1000 is not a valid percent
>>>
>>> Well, that makes sense, but why powerd itself knows about load > 100% but doesn't allow me to specify it? Is this bug? I suppose not if it works for other people...
>>
>> It is a reasonable limitation. powerd can't know how load is distributed among multiple cores in time. If all cores are equally busy at, let's say, 10% (that gives 120% total) and cores are never waiting for each other, then obviously frequency could be reduced. But if the same 120% means 100%+20%, or if load is equally spread but processes on different cores are waiting for each other, then reducing frequency will reduce performance. powerd can't know that and so stays on the safe side.
>
> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set higher -r and -i values, what do you think?

I think it should be possible with minimal changes.

>>> Other question would be why powerd wants to set freq 5336, when it is not available at all (would be nice to have it heh.):
>>
>> You may see there that it is a wanted frequency, not a real one. :) It is an internal implementation detail. This is how powerd implements keeping the full frequency for some time after the load has dropped. It's not a bug.
>
> OK :-) I actually thought powerd always honors the values from dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little weird to me.

It does so on the left side, but no longer on the right side. Abstracting from real frequencies made the behavior more universal and predictable.

>> On multi-core systems like this, power management can better be done on a per-core basis. Powerd can't control frequencies on a per-core basis (also because it requires non-trivial interoperation with the scheduler). But if your ACPI BIOS allows, you can try to put unused cores into deeper C-states, which may give better power saving, and TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but on 8-STABLE some bonuses can still be achieved.
>
> Any idea what I should look for in the BIOS?

Something about C-states, or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure it is not already present and just unused.

> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?

I suppose around May.

>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>
> From reading this, are you referring above to the C2 states? (seems like C3 is not optimal for this kind of operation...)

The deeper the state, the more power saved. To get the most of it and to get TurboBoost working you need at least the C3 CPU state (ACPI may report it with a different number). Some of the latest Intel CPUs don't have the described problems with C3 and LAPIC; for others the described system tuning is required.

PS: Using powerd in the best case won't hurt performance, while using C-states may even increase it in some cases because of TurboBoost.

--
Alexander Motin
Re: powerd / cpufreq question
On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:
>> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set higher -r and -i values, what do you think?
>
> I think it should be possible with minimal changes.

So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
Can you please review and comment? I should be able to commit it myself if you consider it acceptable. It seems to work for me :)

>> Any idea what I should look for in the BIOS?
>
> Something about C-states, or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure it is not already present and just unused.

Seems like it was enabled by default. I have like these:

dev.cpu.0.cx_supported: C1/3 C2/96 C3/128

Does that mean I only need to set these in rc.conf?:

performance_cx_lowest=C3
economy_cx_lowest=C3

Then run /etc/rc.d/power_profile 0x00? May it cause any instability?

>> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?
>
> I suppose around May.

Do you have some patches? If not you don't really need to make them just for me, I can wait a little.

>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>>
>> From reading this, are you referring above to the C2 states? (seems like C3 is not optimal for this kind of operation...)
>
> The deeper the state, the more power saved. To get the most of it and to get TurboBoost working you need at least the C3 CPU state (ACPI may report it with a different number). Some of the latest Intel CPUs don't have the described problems with C3 and LAPIC; for others the described system tuning is required.

I believe this is a pretty recent CPU (6 core Xeon X5650). Do you know about any problems?

> PS: Using powerd in the best case won't hurt performance, while using C-states may even increase it in some cases because of TurboBoost.

If I want to use C-states, should I stop using powerd, or is it possible to use them both together?

Thanks!

--
Kind regards
Daniel
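The dev.cpu.0.cx_supported value quoted in this message can be picked apart mechanically when checking which C-states a box reports. The Python sketch below handles only the "Cn/number" format shown in the thread. Reading the number after the slash as a transition latency is an assumption, and the function name is hypothetical.

```python
def parse_cx_supported(value):
    """Parse a string such as 'C1/3 C2/96 C3/128' into {state: number}."""
    states = {}
    for item in value.split():
        name, number = item.split("/")
        states[name] = int(number)  # assumed to be a latency figure
    return states


supported = parse_cx_supported("C1/3 C2/96 C3/128")
# Deepest state is the one with the highest index after the leading 'C'.
deepest = max(supported, key=lambda name: int(name[1:]))

print(supported)
print("deepest reported state:", deepest)
```

On Daniel's machine the deepest reported state is C3, which matches the value he proposes for performance_cx_lowest and economy_cx_lowest.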
Re: powerd / cpufreq question
On 08.04.2011 19:53, Daniel Gerzo wrote:
> On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:
>>> OK, I understand what you are saying here. On the other side, I know pretty well how the load is distributed - in this particular case, the box is a web server, running ~30 php-cgi processes. This kind of operation doesn't require very high frequency and I suspect the cores are never waiting for each other. There could be an option which would allow an administrator to decide whether this is the case and allow him to set higher -r and -i values, what do you think?
>>
>> I think it should be possible with minimal changes.
>
> So, here is my attempt to implement it: http://danger.rulez.sk/powerd.diff
> Can you please review and comment? I should be able to commit it myself if you consider it acceptable. It seems to work for me :)

Looks fine, except that the -f option has to be the first, which is not obvious. Another thing -- I've noticed some load constants hardcoded there. They should also be handled to make higher values work properly.

>>> Any idea what I should look for in the BIOS?
>>
>> Something about C-states, or Cx-states on the CPU page. But first look at dev.cpu.X.cx_supported to make sure it is not already present and just unused.
>
> Seems like it was enabled by default. I have like these:
> dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
> Does that mean I only need to set these in rc.conf?:
> performance_cx_lowest=C3
> economy_cx_lowest=C3
> Then run /etc/rc.d/power_profile 0x00?

In short - yes. In long - read the link I've given.

> May it cause any instability?

If you don't switch from LAPIC to another timer and it stops, your system will freeze, or at least not work well. You should notice problems immediately, if there are any.

>>> This is 8-STABLE, any idea whether there's a MFC plan for the extra 9-CURRENT bonuses?
>>
>> I suppose around May.
>
> Do you have some patches? If not you don't really need to make them just for me, I can wait a little.

Last ones I've generated are five months old: http://people.freebsd.org/~mav/timers_merge/ They are large and I am not sure how well they apply now.

>>>> You may want to look here: http://wiki.freebsd.org/TuningPowerConsumption
>>>
>>> From reading this, are you referring above to the C2 states? (seems like C3 is not optimal for this kind of operation...)
>>
>> The deeper the state, the more power saved. To get the most of it and to get TurboBoost working you need at least the C3 CPU state (ACPI may report it with a different number). Some of the latest Intel CPUs don't have the described problems with C3 and LAPIC; for others the described system tuning is required.
>
> I believe this is a pretty recent CPU (6 core Xeon X5650). Do you know about any problems?

I have no idea about these Xeons. I just know that the LAPIC of my Core i5 works fine in C3, while the one of my Core i7 doesn't.

>> PS: Using powerd in the best case won't hurt performance, while using C-states may even increase it in some cases because of TurboBoost.
>
> If I want to use C-states, should I stop using powerd, or is it possible to use them both together?

I am using both together on my laptop.

--
Alexander Motin