Re: powerd / cpufreq question

2011-04-12 Thread Ian Smith
On Tue, 12 Apr 2011, Daniel Gerzo wrote:
> On 11.4.2011 6:08, Ian Smith wrote:
> > As you see, total of differences for each cpu is here 89 ticks, but I've
> > no idea of the interval between your two readings, or your value of HZ?
>
> The interval may have been around 1-2 seconds.
> My value of HZ is default, 1000.

OK, it seems it depends on stathz, not HZ, so 89 ticks would be less than
1 second if your stathz is 128. I gather that may be changed with the 9.x
timers?
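
As a quick check of that arithmetic: if the kern.cp_times counters advance
at stathz, a tick delta converts to elapsed time by dividing by stathz. The
helper below is only a sketch; the function name is made up for
illustration, and 128 and 127 are simply the stathz values mentioned in
this thread.

```c
#include <assert.h>

/*
 * Convert a kern.cp_times tick delta to elapsed milliseconds.
 * Illustrative helper, not part of powerd or the kernel.
 */
long
ticks_to_ms(long ticks, int stathz)
{
	assert(stathz > 0);
	return (ticks * 1000L / stathz);
}
```

For 89 ticks, ticks_to_ms(89, 128) gives 695 ms and ticks_to_ms(89, 127)
gives 700 ms, so the two readings would have been roughly 0.7 seconds apart.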

> > Are those kern.cp_times values as they came, or did you remove trailing
> > zeroes?  Reason I ask is that on my Thinkpad T23, single-core 1133/733
> > MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has
> > the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through
> > cpu31, on 8.2-PRE about early January.  I need to update the script to
> > remove surplus data for non-existing cpus, but wonder if the extra data
> > also appeared on your 12 core box?
>
> I haven't removed anything, it's a pure copy-paste.

Thanks.  I'll check the single-cpu case again after updating to 8.2-R.

cheers, Ian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: powerd / cpufreq question

2011-04-12 Thread Alexander Motin
Ian Smith wrote:
> On Tue, 12 Apr 2011, Daniel Gerzo wrote:
> > On 11.4.2011 6:08, Ian Smith wrote:
> > > As you see, total of differences for each cpu is here 89 ticks, but
> > > I've no idea of the interval between your two readings, or your value
> > > of HZ?
> >
> > The interval may have been around 1-2 seconds.
> > My value of HZ is default, 1000.
>
> OK, it seems it depends on stathz, not HZ, so 89 ticks would be less than
> 1 second if your stathz is 128. I gather that may be changed with the 9.x
> timers?

9-CURRENT tries to set stathz to 127, or at least somewhere close to that.
The main difference there is that the clocks really tick only while the CPU
is running, and are emulated during idle periods to allow C-states to do
their job.

-- 
Alexander Motin


Re: powerd / cpufreq question

2011-04-11 Thread Daniel Gerzo

On 11.4.2011 6:08, Ian Smith wrote:

> As you see, total of differences for each cpu is here 89 ticks, but I've
> no idea of the interval between your two readings, or your value of HZ?

The interval may have been around 1-2 seconds.
My value of HZ is default, 1000.

> Are those kern.cp_times values as they came, or did you remove trailing
> zeroes?  Reason I ask is that on my Thinkpad T23, single-core 1133/733
> MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has
> the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through
> cpu31, on 8.2-PRE about early January.  I need to update the script to
> remove surplus data for non-existing cpus, but wonder if the extra data
> also appeared on your 12 core box?

I haven't removed anything, it's a pure copy-paste.


--
S pozdravom / Best regards
  Daniel Gerzo, FreeBSD committer


Re: powerd / cpufreq question

2011-04-10 Thread Ian Smith
On Fri, 8 Apr 2011, Daniel Geržo wrote:

> Hello guys,
>
> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like to
> utilize powerd(8) on it; however, when I run `powerd -v -r90' I see
> something like this:
>
> load  64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load  62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load  82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is, according to top(1), ~90% idle. So I realized
> that powerd might take the load as the sum of the loads of all the cores
> (12), so I tried to tweak the powerd arguments like this:

Hi Daniel, Alexander, all.

I hope to engage more on this interesting topic later, but first:

[..]

> Example of two consecutive cp_times sysctl outputs:
>
> kern.cp_times: 4182996 0 306925 85623 13563403 3164971 0 201479 93110
> 14679313 3450792 0 258166 80198 14349717 2795270 0 180252 76701 15086650
> 2952777 0 217156 119627 14849313 2418067 0 158594 73497 15488715 2408492 0
> 175131 104377 15450873 2003803 0 131790 75753 15927527 2456736 0 178894 36963
> 15466280 1607095 0 117396 4197 16410185 2127878 0 147639 30804 15832552
> 1406621 0 92686 1058 16638508
>
> kern.cp_times: 4183013 0 306927 85626 13563469 3164980 0 201482 93110
> 14679390 3450796 0 258167 80199 14349800 2795274 0 180252 76701 15086735
> 2952780 0 217157 119629 14849396 2418070 0 158597 73497 15488798 2408499 0
> 175132 104377 15450954 2003804 0 131791 75753 15927614 2456744 0 178897 36963
> 15466358 1607098 0 117398 4197 16410269 2127880 0 147640 30804 15832638
> 1406621 0 92686 1058 16638597

I wrote the script included below to try to make some sense of these. It
defaults to using your values above, resulting in:

smithi on sola% sh cptimes.sh
             cp_user   cp_nice    cp_sys   cp_intr   cp_idle
cpu: 0 @t0   4182996         0    306925     85623  13563403
cpu: 0 @t1   4183013         0    306927     85626  13563469
                  17         0         2         3        66
cpu: 1 @t0   3164971         0    201479     93110  14679313
cpu: 1 @t1   3164980         0    201482     93110  14679390
                   9         0         3         0        77
cpu: 2 @t0   3450792         0    258166     80198  14349717
cpu: 2 @t1   3450796         0    258167     80199  14349800
                   4         0         1         1        83
cpu: 3 @t0   2795270         0    180252     76701  15086650
cpu: 3 @t1   2795274         0    180252     76701  15086735
                   4         0         0         0        85
cpu: 4 @t0   2952777         0    217156    119627  14849313
cpu: 4 @t1   2952780         0    217157    119629  14849396
                   3         0         1         2        83
cpu: 5 @t0   2418067         0    158594     73497  15488715
cpu: 5 @t1   2418070         0    158597     73497  15488798
                   3         0         3         0        83
cpu: 6 @t0   2408492         0    175131    104377  15450873
cpu: 6 @t1   2408499         0    175132    104377  15450954
                   7         0         1         0        81
cpu: 7 @t0   2003803         0    131790     75753  15927527
cpu: 7 @t1   2003804         0    131791     75753  15927614
                   1         0         1         0        87
cpu: 8 @t0   2456736         0    178894     36963  15466280
cpu: 8 @t1   2456744         0    178897     36963  15466358
                   8         0         3         0        78
cpu: 9 @t0   1607095         0    117396      4197  16410185
cpu: 9 @t1   1607098         0    117398      4197  16410269
                   3         0         2         0        84
cpu:10 @t0   2127878         0    147639     30804  15832552
cpu:10 @t1   2127880         0    147640     30804  15832638
                   2         0         1         0        86
cpu:11 @t0   1406621         0     92686      1058  16638508
cpu:11 @t1   1406621         0     92686      1058  16638597
                   0         0         0         0        89

As you see, total of differences for each cpu is here 89 ticks, but I've 
no idea of the interval between your two readings, or your value of HZ?
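
The per-CPU arithmetic behind such a listing can be sketched in C as
follows. This is a minimal illustration, not the script itself: the
function name and the NSTATES / IDLE_SLOT constants are chosen here, and
are assumed to mirror the five-counters-per-CPU layout (user, nice, sys,
intr, idle) that kern.cp_times shows above.

```c
#include <stddef.h>

#define NSTATES   5	/* user, nice, sys, intr, idle (assumed layout) */
#define IDLE_SLOT 4	/* index of the idle counter within one CPU */

/*
 * Busy percentage of one CPU between two flat kern.cp_times samples.
 * t0 and t1 each hold NSTATES counters per CPU, CPU after CPU.
 * Returns -1 if no ticks elapsed between the samples.
 */
int
cpu_busy_percent(const long *t0, const long *t1, int cpu)
{
	size_t base = (size_t)cpu * NSTATES;
	long total = 0, idle;
	int i;

	for (i = 0; i < NSTATES; i++)
		total += t1[base + i] - t0[base + i];
	if (total <= 0)
		return (-1);
	idle = t1[base + IDLE_SLOT] - t0[base + IDLE_SLOT];
	return ((int)(100 - idle * 100 / total));
}
```

With the two samples quoted above, cpu 1 (77 idle ticks out of 89) comes
out at 14% busy and cpu 11 (89 idle ticks out of 89) at 0%.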

Are those kern.cp_times values as they came, or did you remove trailing 
zeroes?  Reason I ask is that on my Thinkpad T23, single-core 1133/733 
MHz, sysctl kern.cp_time shows the usual 5 values, but kern.cp_times has 
the same 5 values for cpu0, but then 5 zeroes for each of cpu1 through 
cpu31, on 8.2-PRE about early January.  I need to update the script to 
remove surplus data for non-existing cpus, but wonder if the extra data 
also appeared on your 12 

Re: powerd / cpufreq question

2011-04-09 Thread Daniel Gerzo

On 8.4.2011 19:52, Alexander Motin wrote:

> > So, here is my attempt to implement it:
> > http://danger.rulez.sk/powerd.diff
> > Can you please review & comment? I should be able to commit it myself if
> > you consider it acceptable. It seems to work for me :)
>
> Looks fine, except that the -f option has to be the first, which is not
> obvious. Another point -- I've noticed some load constants hardcoded
> there. They should also be handled to make higher values work properly.

I tried to be more explicit in the error message, which tries to emphasize
the need to put it first. I don't know myself how it would be possible
to code it so that -f doesn't need to be first. Ideas?

Do you mean the values around lines 730 - 762?

From what I have observed, if I have a machine that is a little more
loaded (say 300%) and the load goes up, it tries to increase the
performance to quite a high freq (5336), and when the load decreases again,
it takes quite a while to go down from 5336 to a frequency that is
actually available to decrease the performance (something less than
2934). So the lower frequency is used for too short a time because it
takes too long to get there...

> > Seems like it was enabled by default. I have these:
> > dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
> >
> > Does that mean I only need to set these in rc.conf?:
> > performance_cx_lowest=C3
> > economy_cx_lowest=C3
> >
> > Then run /etc/rc.d/power_profile 0x00?
>
> In short - yes. For the long answer - read the link I've given.
>
> > May it cause any instability?
>
> If you don't switch from LAPIC to another timer and it stops - your system
> will freeze, or at least not work well. You should notice problems
> immediately, if there are any.

So I will also need to change kern.timecounter.hardware to i8254? I
suppose it will make timekeeping a little less precise, but should I expect
lower performance? I don't care that much about time accuracy.

How do I know that C3 is active? And how does it switch back to C1, for
example?

> > > > This is 8-STABLE, any idea whether there's an MFC plan for the extra
> > > > 9-CURRENT bonuses?
> > >
> > > I suppose around May.
> >
> > Do you have some patches? If not, you don't really need to make them
> > just for me, I can wait a little.
>
> The last ones I generated are five months old:
> http://people.freebsd.org/~mav/timers_merge/
> They are large and I am not sure how well they apply now.

I guess I will just stick with vanilla 8-stable and then update.

> > > > > You may want to look here:
> > > > > http://wiki.freebsd.org/TuningPowerConsumption


--
S pozdravom / Best regards
  Daniel Gerzo, FreeBSD committer


Re: powerd / cpufreq question

2011-04-09 Thread Alexander Motin

On 09.04.2011 10:57, Daniel Gerzo wrote:

> On 8.4.2011 19:52, Alexander Motin wrote:
>
> > > So, here is my attempt to implement it:
> > > http://danger.rulez.sk/powerd.diff
> > > Can you please review & comment? I should be able to commit it myself
> > > if you consider it acceptable. It seems to work for me :)
> >
> > Looks fine, except that the -f option has to be the first, which is not
> > obvious. Another point -- I've noticed some load constants hardcoded
> > there. They should also be handled to make higher values work properly.
>
> I tried to be more explicit in the error message, which tries to emphasize
> the need to put it first. I don't know myself how it would be possible
> to code it so that -f doesn't need to be first. Ideas?

Move the checks after the loop? Just an idea.
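
One way to read that suggestion is a two-pass scheme: the getopt(3) loop
only records what was given, and range checks run after the loop, so the
position of -f no longer matters. The sketch below is not taken from
powerd.diff. The struct, the function name and the exact meaning of -f
(assumed here to lift the 100% limit on -r and -i) are illustrative
assumptions, and only the upper bound is checked.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

struct pd_opts {
	int force;	/* -f seen: allow percentages above 100 (assumed) */
	long running;	/* -r value, or -1 if not given */
	long idle;	/* -i value, or -1 if not given */
};

/* Returns 0 on success, -1 on a bad option or an out-of-range percent. */
int
parse_pd_opts(int argc, char *argv[], struct pd_opts *o)
{
	int ch;

	o->force = 0;
	o->running = o->idle = -1;
	optind = 1;		/* allow repeated calls in one process */

	/* Pass 1: only record what was given, in whatever order. */
	while ((ch = getopt(argc, argv, "fi:r:")) != -1) {
		switch (ch) {
		case 'f':
			o->force = 1;
			break;
		case 'i':
			o->idle = strtol(optarg, NULL, 10);
			break;
		case 'r':
			o->running = strtol(optarg, NULL, 10);
			break;
		default:
			return (-1);
		}
	}

	/* Pass 2: validate after the loop, so -f may appear anywhere. */
	if (!o->force && (o->running > 100 || o->idle > 100))
		return (-1);
	return (0);
}
```

With this layout, `-r 1000 -i 600 -f` and `-f -r 1000 -i 600` are accepted
alike, while `-r 1000 -i 600` without -f is still rejected.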


> Do you mean the values around lines 730 - 762?

Yes. When the load is more than twice the limit, the frequency rises
faster. To make that work with a limit > 50%, there is a hardcoded
additional check for the 95% level.

> From what I have observed, if I have a machine that is a little more
> loaded (say 300%) and the load goes up, it tries to increase the
> performance to quite a high freq (5336), and when the load decreases again,
> it takes quite a while to go down from 5336 to a frequency that is
> actually available to decrease the performance (something less than
> 2934). So the lower frequency is used for too short a time because it
> takes too long to get there...

It is intended behavior in hiadaptive mode, where performance is
preferred over power saving.
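
To get a feel for why the drop takes a while, here is a toy model of an
internal wanted frequency that is allowed to exceed the real maximum and
then decays by a fixed fraction on every idle poll. It is not powerd's
code: the function name is made up, and the decay factors used in the note
below (31/32 and 7/8) are assumptions picked for illustration.

```c
/*
 * Toy model: count the polls needed for the wanted frequency to decay
 * below the highest real frequency, when each idle poll scales it by
 * num/den. Returns -1 for arguments that would never converge.
 */
int
polls_until_below(int wanted, int max_real, int num, int den)
{
	int polls = 0;

	if (max_real <= 0 || num < 0 || den <= 0 || num >= den)
		return (-1);
	while (wanted >= max_real) {
		wanted = wanted * num / den;	/* integer decay step */
		polls++;
	}
	return (polls);
}
```

Starting from a wanted value of 5336 MHz with a real maximum of 2668 MHz,
a factor of 31/32 per poll needs 22 polls before the wanted value falls
below the real maximum, while 7/8 needs 6. Under these assumptions, the
slower the decay, the longer the CPU stays at full speed after a burst.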



> > > Seems like it was enabled by default. I have these:
> > > dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
> > >
> > > Does that mean I only need to set these in rc.conf?:
> > > performance_cx_lowest=C3
> > > economy_cx_lowest=C3
> > >
> > > Then run /etc/rc.d/power_profile 0x00?
> >
> > In short - yes. For the long answer - read the link I've given.
> >
> > > May it cause any instability?
> >
> > If you don't switch from LAPIC to another timer and it stops - your
> > system will freeze, or at least not work well. You should notice
> > problems immediately, if there are any.
>
> So I will also need to change kern.timecounter.hardware to i8254? I
> suppose it will make timekeeping a little less precise, but should I expect
> lower performance? I don't care that much about time accuracy.

I wasn't mentioning the timecounter there. In terms of 9-CURRENT I was
talking about the eventtimer. In 8-STABLE it is not formalized yet, and so
the guide mentions a number of tunables.

> How do I know that C3 is active?

sysctl dev.cpu.X.cx_usage

> And how does it switch back to C1, for example?

When the CPU is idle, depending on previous idle statistics, the system
puts it into one of the reported and allowed C-states. The CPU goes back to
the C0 state on any hardware interrupt.
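
For scripts or tools that want to inspect the reported states, the
dev.cpu.0.cx_supported string quoted above can be split into pairs as
sketched below. The struct and function names are made up for
illustration. The parser assumes the "Cn/latency" format shown in this
thread, where acpi(4) describes the second number as the transition
latency in microseconds. Other releases may print a different format.

```c
#include <stdio.h>

struct cx_state {
	int state;		/* n in "Cn" */
	int latency_us;		/* number after the slash */
};

/*
 * Parse a string such as "C1/3 C2/96 C3/128" into out[], storing at
 * most max entries. Returns the number of entries parsed.
 */
int
parse_cx_supported(const char *s, struct cx_state *out, int max)
{
	int n = 0, used = 0;

	while (n < max && sscanf(s, " C%d/%d%n",
	    &out[n].state, &out[n].latency_us, &used) == 2) {
		s += used;	/* continue after the pair just read */
		n++;
	}
	return (n);
}
```

For the string reported on Daniel's machine this yields three entries, the
last one being C3 with a latency of 128.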


--
Alexander Motin


Re: powerd / cpufreq question

2011-04-09 Thread Bartosz Fabianowski
I just noticed this thread a day after my own fight with powerd and load 
percentages that did not seem to make any sense.


The patch I came up with is attached. It modifies powerd to use the load 
percentage of the busiest core. This reduces the range of values back to 
0%...100% also for multi-core systems.


On my Core i7 setup here, the change seems to work well.

- Bartosz
--- powerd.c.old	2011-04-07 17:30:58.0 +0200
+++ powerd.c	2011-04-07 17:38:28.0 +0200
@@ -128,7 +128,7 @@
 	static long *cp_times = NULL, *cp_times_old = NULL;
 	static int ncpus = 0;
 	size_t cp_times_len;
-	int error, cpu, i, total;
+	int error, cpu, i, total, max;
 
 	if (cp_times == NULL) {
 		cp_times_len = 0;
@@ -151,7 +151,7 @@
 		return (error);
 
 	if (load) {
-		*load = 0;
+		max = 0;
 		for (cpu = 0; cpu < ncpus; cpu++) {
 			total = 0;
 			for (i = 0; i < CPUSTATES; i++) {
@@ -160,9 +160,12 @@
 			}
 			if (total == 0)
 				continue;
-			*load += 100 - (cp_times[cpu * CPUSTATES + CP_IDLE] -
+			total = 100 - (cp_times[cpu * CPUSTATES + CP_IDLE] -
 			    cp_times_old[cpu * CPUSTATES + CP_IDLE]) * 100 / total;
+			if (total > max)
+				max = total;
 		}
+		*load = max;
 	}
 
 	memcpy(cp_times_old, cp_times, cp_times_len);

Re: powerd / cpufreq question

2011-04-09 Thread Alexander Motin

On 09.04.2011 17:39, Bartosz Fabianowski wrote:

> I just noticed this thread a day after my own fight with powerd and load
> percentages that did not seem to make any sense.
>
> The patch I came up with is attached. It modifies powerd to use the load
> percentage of the busiest core. This reduces the range of values back to
> 0%...100% also for multi-core systems.

While using the maximum of the loads can be better than using levels above
100%, it won't properly handle cases of dependent or frequently migrating
threads, which are handled now with the summed load and levels less than
100%. While the existing powerd algorithm is indeed not perfect, it is the
only relatively performance-safe one, unlike the other proposals.

I won't argue against adding more algorithms/options to powerd, optimized
for handling different situations, but I believe that the default should
remain safe.

> On my Core i7 setup here, the change seems to work well.

... in your specific workload. And you haven't described how you
measured system performance to prove that it hasn't decreased.


--
Alexander Motin


Re: powerd / cpufreq question

2011-04-09 Thread Bartosz Fabianowski

> > On my Core i7 setup here, the change seems to work well.
>
> ... in your specific workload. And you haven't described how you
> measured system performance to prove that it hasn't decreased.


My measure of performance is entirely unscientific: This is a desktop 
box. Performance is good if KDE reacts to inputs quickly. My patch 
preserves this for me while making the box run a bit cooler.


I am by no means advocating that my patch be made the default behavior. 
But as you said, it may be nice to include it as one of several 
algorithms the user can choose from.


- Bartosz


Re: powerd / cpufreq question

2011-04-09 Thread Kevin Oberman
> Date: Sat, 09 Apr 2011 09:57:28 +0200
> From: Daniel Gerzo <dan...@freebsd.org>
> Sender: owner-freebsd-sta...@freebsd.org
>
> On 8.4.2011 19:52, Alexander Motin wrote:
>
> > > So, here is my attempt to implement it:
> > > http://danger.rulez.sk/powerd.diff
> > > Can you please review & comment? I should be able to commit it myself
> > > if you consider it acceptable. It seems to work for me :)
> >
> > Looks fine, except that the -f option has to be the first, which is not
> > obvious. Another point -- I've noticed some load constants hardcoded
> > there. They should also be handled to make higher values work properly.
>
> I tried to be more explicit in the error message, which tries to emphasize
> the need to put it first. I don't know myself how it would be possible
> to code it so that -f doesn't need to be first. Ideas?
>
> Do you mean the values around lines 730 - 762?
>
> From what I have observed, if I have a machine that is a little more
> loaded (say 300%) and the load goes up, it tries to increase the
> performance to quite a high freq (5336), and when the load decreases again,
> it takes quite a while to go down from 5336 to a frequency that is
> actually available to decrease the performance (something less than
> 2934). So the lower frequency is used for too short a time because it
> takes too long to get there...
>
> > > Seems like it was enabled by default. I have these:
> > > dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
> > >
> > > Does that mean I only need to set these in rc.conf?:
> > > performance_cx_lowest=C3
> > > economy_cx_lowest=C3
> > >
> > > Then run /etc/rc.d/power_profile 0x00?
> >
> > In short - yes. For the long answer - read the link I've given.
> >
> > > May it cause any instability?
> >
> > If you don't switch from LAPIC to another timer and it stops - your
> > system will freeze, or at least not work well. You should notice
> > problems immediately, if there are any.
>
> So I will also need to change kern.timecounter.hardware to i8254? I
> suppose it will make timekeeping a little less precise, but should I expect
> lower performance? I don't care that much about time accuracy.
>
> How do I know that C3 is active? And how does it switch back to C1, for
> example?
>
> > > > > This is 8-STABLE, any idea whether there's an MFC plan for the
> > > > > extra 9-CURRENT bonuses?
> > > >
> > > > I suppose around May.
> > >
> > > Do you have some patches? If not, you don't really need to make them
> > > just for me, I can wait a little.
> >
> > The last ones I generated are five months old:
> > http://people.freebsd.org/~mav/timers_merge/
> > They are large and I am not sure how well they apply now.
>
> I guess I will just stick with vanilla 8-stable and then update.
>
> > > > > > You may want to look here:
> > > > > > http://wiki.freebsd.org/TuningPowerConsumption

I would like to emphasize that simple frequency reductions through
throttling and TCC really don't save power. EST, which actually does
change the CPU clock AND voltage, is a win, but not a big one.

If you want to reduce power, deeper sleep states are the only way to go
further. Dr. Tajana Rosing of the UC-San Diego System Energy Efficiency
Lab has presented research that shows that servers save by far the most
power when a process gets in, runs at maximum speed and gets out to
allow the system to sleep. This is clearly the only way to significantly
improve power consumption in servers. If you don't enable and use S3 and
better (when present), you are not being power efficient.

To do this, the clock must rise very quickly and should drop slowly. And
even then, it's not likely to save much power or reduce heat load
significantly.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

Hi.

On 08.04.2011 14:12, Daniel Geržo wrote:

> I have a new machine with Xeon(R) CPU X5650 2666.77-MHz and I would like
> to utilize powerd(8) on it; however, when I run `powerd -v -r90' I see
> something like this:
>
> load 64%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 120%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 173%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 62%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 82%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
> load 110%, current freq 2668 MHz ( 0), wanted freq 5336 MHz
>
> even though the machine is, according to top(1), ~90% idle. So I realized
> that powerd might take the load as the sum of the loads of all the cores
> (12), so I tried to tweak the powerd arguments like this:
>
> `powerd -v -r 1000 -i 600'
>
> but that errors for me with:
>
> root@[s1-a ~]# powerd -v -r 1000 -i 600
> powerd: 1000 is not a valid percent
>
> Well, that makes sense, but why does powerd itself know about load > 100%
> but not allow me to specify it? Is this a bug? I suppose not if it
> works for other people...

It is a reasonable limitation. powerd can't know how the load is
distributed among multiple cores over time. If all cores are equally
busy at, let's say, 10% (which gives 120% total) and the cores are never
waiting for each other, then obviously the frequency could be reduced.
But if the same 120% means 100%+20%, or if the load is equally spread
but processes on different cores are waiting for each other, then
reducing the frequency will reduce performance. powerd can't know that
and so stays on the safe side.
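
The ambiguity can be made concrete with two small aggregation helpers over
per-core busy percentages. The function names are made up for this
illustration. The summed value corresponds to the load figure powerd
prints, and the busiest-core value is shown only for contrast.

```c
/* Sum of the per-core busy percentages (can exceed 100 on SMP). */
int
summed_load(const int *busy, int ncpus)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < ncpus; cpu++)
		sum += busy[cpu];
	return (sum);
}

/* Busy percentage of the single busiest core (always 0..100). */
int
busiest_core(const int *busy, int ncpus)
{
	int cpu, max = 0;

	for (cpu = 0; cpu < ncpus; cpu++)
		if (busy[cpu] > max)
			max = busy[cpu];
	return (max);
}
```

Twelve cores at 10% each and a 100% + 20% split both sum to 120, which is
why the summed figure alone cannot tell them apart. The busiest-core
figure separates them (10 versus 100), but it says nothing about threads
on different cores waiting for each other.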



> Another question would be why powerd wants to set freq 5336, when it is
> not available at all (would be nice to have it, heh):

You may see there that it is a wanted frequency, not a real one. :) It is
an internal implementation detail. This is how powerd keeps the full
frequency for some time after the load has dropped. It's not a bug.

On multi-core systems like this, power management is better done on a
per-core basis. powerd can't control frequencies on a per-core basis (also
because that requires non-trivial interoperation with the scheduler). But
if your ACPI BIOS allows, you can try to put unused cores into deeper
C-states, which may give better power savings and TurboBoost on busy cores
as a bonus. It works better on 9-CURRENT, but on 8-STABLE some of the
benefit can still be achieved.

You may want to look here:
http://wiki.freebsd.org/TuningPowerConsumption

--
Alexander Motin


Re: powerd / cpufreq question

2011-04-08 Thread Daniel Gerzo

On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:

Hello Alexander, thanks for the quick reply;

> > root@[s1-a ~]# powerd -v -r 1000 -i 600
> > powerd: 1000 is not a valid percent
> >
> > Well, that makes sense, but why does powerd itself know about load > 100%
> > but not allow me to specify it? Is this a bug? I suppose not if it
> > works for other people...
>
> It is a reasonable limitation. powerd can't know how the load is
> distributed among multiple cores over time. If all cores are equally
> busy at, let's say, 10% (which gives 120% total) and the cores are never
> waiting for each other, then obviously the frequency could be reduced.
> But if the same 120% means 100%+20%, or if the load is equally spread
> but processes on different cores are waiting for each other, then
> reducing the frequency will reduce performance. powerd can't know that
> and so stays on the safe side.

OK, I understand what you are saying here. On the other hand, I know
pretty well how the load is distributed - in this particular case, the
box is a web server, running ~30 php-cgi processes.
This kind of operation doesn't require a very high frequency and I
suspect the cores are never waiting for each other. There could be an
option which would allow an administrator to decide whether this is the
case and allow him to set higher -r and -i values, what do you think?

> > Another question would be why powerd wants to set freq 5336, when it is
> > not available at all (would be nice to have it, heh):
>
> You may see there that it is a wanted frequency, not a real one. :) It is
> an internal implementation detail. This is how powerd keeps the full
> frequency for some time after the load has dropped. It's not a bug.

OK :-) I actually thought powerd always honors the values from
dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little
weird to me.

> On multi-core systems like this, power management is better done on a
> per-core basis. powerd can't control frequencies on a per-core basis (also
> because that requires non-trivial interoperation with the scheduler). But
> if your ACPI BIOS allows, you can try to put unused cores into deeper
> C-states, which may give better power savings and TurboBoost on busy cores
> as a bonus. It works better on 9-CURRENT, but on 8-STABLE some of the
> benefit can still be achieved.

Any idea what I should look for in the BIOS?
This is 8-STABLE, any idea whether there's an MFC plan for the extra
9-CURRENT bonuses?

> You may want to look here:
> http://wiki.freebsd.org/TuningPowerConsumption

From reading this, are you referring above to the C2 states? (seems
like C3 is not optimal for this kind of operation...)

Thanks.

--
Kind regards
  Daniel


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

On 08.04.2011 17:42, Daniel Gerzo wrote:

> On Fri, 08 Apr 2011 14:42:04 +0300, Alexander Motin wrote:
>
> > > root@[s1-a ~]# powerd -v -r 1000 -i 600
> > > powerd: 1000 is not a valid percent
> > >
> > > Well, that makes sense, but why does powerd itself know about load
> > > > 100% but not allow me to specify it? Is this a bug? I suppose not if
> > > it works for other people...
> >
> > It is a reasonable limitation. powerd can't know how the load is
> > distributed among multiple cores over time. If all cores are equally
> > busy at, let's say, 10% (which gives 120% total) and the cores are never
> > waiting for each other, then obviously the frequency could be reduced.
> > But if the same 120% means 100%+20%, or if the load is equally spread
> > but processes on different cores are waiting for each other, then
> > reducing the frequency will reduce performance. powerd can't know that
> > and so stays on the safe side.
>
> OK, I understand what you are saying here. On the other hand, I know
> pretty well how the load is distributed - in this particular case, the
> box is a web server, running ~30 php-cgi processes.
> This kind of operation doesn't require a very high frequency and I suspect
> the cores are never waiting for each other. There could be an option
> which would allow an administrator to decide whether this is the case
> and allow him to set higher -r and -i values, what do you think?

I think it should be possible with minimal changes.

> > > Another question would be why powerd wants to set freq 5336, when it
> > > is not available at all (would be nice to have it, heh):
> >
> > You may see there that it is a wanted frequency, not a real one. :) It
> > is an internal implementation detail. This is how powerd keeps the full
> > frequency for some time after the load has dropped. It's not a bug.
>
> OK :-) I actually thought powerd always honors the values from
> dev.cpu.0.freq_levels (and 5336 is not there), so it looked a little
> weird to me.

It does so on the left side (current freq), but no longer on the right
side (wanted freq). Abstracting from real frequencies made the behavior
more universal and predictable.

> > On multi-core systems like this, power management is better done on a
> > per-core basis. powerd can't control frequencies on a per-core basis
> > (also because that requires non-trivial interoperation with the
> > scheduler). But if your ACPI BIOS allows, you can try to put unused
> > cores into deeper C-states, which may give better power savings and
> > TurboBoost on busy cores as a bonus. It works better on 9-CURRENT, but
> > on 8-STABLE some of the benefit can still be achieved.
>
> Any idea what I should look for in the BIOS?

Something about C-states, or Cx-states, on the CPU page. But first look
at dev.cpu.X.cx_supported to make sure they are not already present and
just unused.

> This is 8-STABLE, any idea whether there's an MFC plan for the extra
> 9-CURRENT bonuses?

I suppose around May.

> > You may want to look here:
> > http://wiki.freebsd.org/TuningPowerConsumption
>
> From reading this, are you referring above to the C2 states? (seems
> like C3 is not optimal for this kind of operation...)

The deeper the state, the more power is saved. To get the most out of it
and to get TurboBoost working you need at least the C3 CPU state (ACPI may
report it with a different number). Some of the latest Intel CPUs do not
have the described problems with C3 and LAPIC; for others the described
system tuning is required.

PS: Using powerd in the best case won't hurt performance, while using
C-states may even increase it in some cases because of TurboBoost.


--
Alexander Motin


Re: powerd / cpufreq question

2011-04-08 Thread Daniel Gerzo

On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:

> > OK, I understand what you are saying here. On the other hand, I know
> > pretty well how the load is distributed - in this particular case, the
> > box is a web server, running ~30 php-cgi processes.
> > This kind of operation doesn't require a very high frequency and I
> > suspect the cores are never waiting for each other. There could be an
> > option which would allow an administrator to decide whether this is the
> > case and allow him to set higher -r and -i values, what do you think?
>
> I think it should be possible with minimal changes.

So, here is my attempt to implement it:
http://danger.rulez.sk/powerd.diff
Can you please review & comment? I should be able to commit it myself
if you consider it acceptable. It seems to work for me :)

> > Any idea what I should look for in the BIOS?
>
> Something about C-states, or Cx-states, on the CPU page. But first look
> at dev.cpu.X.cx_supported to make sure they are not already present and
> just unused.

Seems like it was enabled by default. I have these:
dev.cpu.0.cx_supported: C1/3 C2/96 C3/128

Does that mean I only need to set these in rc.conf?:
performance_cx_lowest=C3
economy_cx_lowest=C3

Then run /etc/rc.d/power_profile 0x00?
May it cause any instability?

> > This is 8-STABLE, any idea whether there's an MFC plan for the extra
> > 9-CURRENT bonuses?
>
> I suppose around May.

Do you have some patches? If not, you don't really need to make them
just for me, I can wait a little.

> > > You may want to look here:
> > > http://wiki.freebsd.org/TuningPowerConsumption
> >
> > From reading this, are you referring above to the C2 states? (seems
> > like C3 is not optimal for this kind of operation...)
>
> The deeper the state, the more power is saved. To get the most out of it
> and to get TurboBoost working you need at least the C3 CPU state (ACPI may
> report it with a different number). Some of the latest Intel CPUs do not
> have the described problems with C3 and LAPIC; for others the described
> system tuning is required.

I believe this is a pretty recent CPU (6 core Xeon X5650). Do you know
about any problems?

> PS: Using powerd in the best case won't hurt performance, while using
> C-states may even increase it in some cases because of TurboBoost.

If I want to use C-states, should I stop using powerd, or is it
possible to use them both together?

Thanks!

--
Kind regards
  Daniel


Re: powerd / cpufreq question

2011-04-08 Thread Alexander Motin

On 08.04.2011 19:53, Daniel Gerzo wrote:

> On Fri, 08 Apr 2011 18:02:28 +0300, Alexander Motin wrote:
>
> > > OK, I understand what you are saying here. On the other hand, I know
> > > pretty well how the load is distributed - in this particular case, the
> > > box is a web server, running ~30 php-cgi processes.
> > > This kind of operation doesn't require a very high frequency and I
> > > suspect the cores are never waiting for each other. There could be an
> > > option which would allow an administrator to decide whether this is
> > > the case and allow him to set higher -r and -i values, what do you
> > > think?
> >
> > I think it should be possible with minimal changes.
>
> So, here is my attempt to implement it:
> http://danger.rulez.sk/powerd.diff
> Can you please review & comment? I should be able to commit it myself if
> you consider it acceptable. It seems to work for me :)

Looks fine, except that the -f option has to be the first, which is not
obvious. Another point -- I've noticed some load constants hardcoded
there. They should also be handled to make higher values work properly.

> > > Any idea what I should look for in the BIOS?
> >
> > Something about C-states, or Cx-states, on the CPU page. But first look
> > at dev.cpu.X.cx_supported to make sure they are not already present and
> > just unused.
>
> Seems like it was enabled by default. I have these:
> dev.cpu.0.cx_supported: C1/3 C2/96 C3/128
>
> Does that mean I only need to set these in rc.conf?:
> performance_cx_lowest=C3
> economy_cx_lowest=C3
>
> Then run /etc/rc.d/power_profile 0x00?

In short - yes. For the long answer - read the link I've given.

> May it cause any instability?

If you don't switch from LAPIC to another timer and it stops - your system
will freeze, or at least not work well. You should notice problems
immediately, if there are any.

> > > This is 8-STABLE, any idea whether there's an MFC plan for the extra
> > > 9-CURRENT bonuses?
> >
> > I suppose around May.
>
> Do you have some patches? If not, you don't really need to make them just
> for me, I can wait a little.

The last ones I generated are five months old:
http://people.freebsd.org/~mav/timers_merge/
They are large and I am not sure how well they apply now.

> > > > You may want to look here:
> > > > http://wiki.freebsd.org/TuningPowerConsumption
> > >
> > > From reading this, are you referring above to the C2 states? (seems
> > > like C3 is not optimal for this kind of operation...)
> >
> > The deeper the state, the more power is saved. To get the most out of
> > it and to get TurboBoost working you need at least the C3 CPU state
> > (ACPI may report it with a different number). Some of the latest Intel
> > CPUs do not have the described problems with C3 and LAPIC; for others
> > the described system tuning is required.
>
> I believe this is a pretty recent CPU (6 core Xeon X5650). Do you know
> about any problems?

I have no idea about these Xeons. I just know that the LAPIC of my Core
i5 works fine in C3, while the one in my Core i7 doesn't.

> > PS: Using powerd in the best case won't hurt performance, while using
> > C-states may even increase it in some cases because of TurboBoost.
>
> If I want to use C-states, should I stop using powerd, or is it
> possible to use them both together?

I am using both together on my laptop.

--
Alexander Motin