On Tue, Feb 12, 2008 at 01:28:21PM -0800, Pallipadi, Venkatesh wrote:
>  
> 
> -----Original Message-----
> From: Kevin Hilman [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, January 31, 2008 5:01 PM
> To: Pallipadi, Venkatesh
> Cc: [email protected]; Adam Belay
> Subject: Re: [PATCH/RFC] powertop non-ACPI: use CPUidle for C-state
> details
> 
> "Pallipadi, Venkatesh" <[EMAIL PROTECTED]> writes:
> 
> >>
> >>"Pallipadi, Venkatesh" <[EMAIL PROTECTED]> writes:
> >>
> >>> Nice timing for the patch. I am working on cpuidle patches that will
> >>> add things like C1 idle residency time in /sysfs and was asking
> >>> Arjan about switching powertop to use cpuidle interfaces :-).
> >>
> >>Great!  My platform reports residency for all the states in sysfs
> >>already, so powertop using sysfs is very useful to me.
> >>
> >>> One major comment is that with the patch C-state timing in powertop
> >>> seems to be off, Due to different units of time in /proc/acpi and
> >>> sysfs cpuidle.  /proc/acpi C-state timings are in ACPI PM timer
> >>> ticks and cpuidle exports it in micro-seconds.
> >>
> >>Yes, the percentage numbers are indeed off.  I was just noticing the
> >>same problem.  I'm working right now on changing those calculations to
> >>use usecs instead of ticks.
> >>
> >>I assume the hardcoded frequency in powertop (3579.545) is the
> >>frequency of the ACPI timer?  I'm pretty ignorant of ACPI, but am
> >>guessing that's what those calculations are for.
> >
> > Yes. That hardcoded freq number is coming from ACPI PM timer
> frequency.
> >
> 
> That being the case, the following patch makes the residency
> calculations work using either usecs for CPUidle or ACPI ticks.
> Basically, just setting FREQ=1000 make the calculations work out for
> usecs.
> 
> Now I'm seeing correct residency percentages on my platform using
> CPUidle.
> 
> Kevin
> 

Arjan,

Below is the patch from Kevin, which I slightly modified to fix one bug on
SMP systems, wherein variable dir was getting used across 2 while loops
causing some confusions.

With this patch, we still look at /proc/acpi first and then fallback to
/sys/.../cpuidle. That order has to change sometime in future as
/proc/acpi/.../power is sort of deprecated. Also, there are patches in
upstream git/mm  kernel which adds more C-state information like C1 residency
time, a C0 poll idle state and mapping to hardware C-state. That will mean
/sys/.../cpuidle has more info than /proc/acpi.

Thanks,
Venki


diff -purN powertop-1.9/powertop.c powertop-1.9-mod/powertop.c
--- powertop-1.9/powertop.c     2007-10-29 10:45:56.000000000 -0700
+++ powertop-1.9-mod/powertop.c 2007-10-18 04:57:23.000000000 -0700
@@ -35,6 +35,7 @@
 #include <assert.h>
 #include <locale.h>
 #include <time.h>
+#include <sys/stat.h>
 
 #include "powertop.h"
 
@@ -62,7 +63,8 @@ struct irqdata {
 
 struct irqdata interrupts[IRQCOUNT];
 
-#define FREQ 3579.545
+#define FREQ_ACPI 3579.545
+static unsigned long FREQ;
 
 int nostats;
 
@@ -236,7 +238,7 @@ static void do_proc_irq(void)
        fclose(file);
 }
 
-static void read_data(uint64_t * usage, uint64_t * duration)
+static void read_data_acpi(uint64_t * usage, uint64_t * duration)
 {
        DIR *dir;
        struct dirent *entry;
@@ -286,6 +288,108 @@ static void read_data(uint64_t * usage, 
        closedir(dir);
 }
 
+static void read_data_cpuidle(uint64_t * usage, uint64_t * duration)
+{
+       DIR *cpudir;
+       DIR *dir;
+       struct dirent *entry;
+       FILE *file = NULL;
+       char line[4096];
+       char filename[128], *f;
+       int len, clevel = 0;
+       unsigned long usecs, ticks_per_sec = 32768, usecs_per_sec = 1000000;
+
+       memset(usage, 0, 64);
+       memset(duration, 0, 64);
+
+       cpudir = opendir("/sys/devices/system/cpu");
+       if (!cpudir)
+               return;
+
+       /* Loop over cpuN entries */
+       while ((entry = readdir(cpudir))) {
+               if (strlen(entry->d_name) < 3)
+                       continue;
+
+               if (!isdigit(entry->d_name[3]))
+                       continue;
+
+               len = sprintf(filename, "/sys/devices/system/cpu/%s/cpuidle",
+                             entry->d_name);
+
+               dir = opendir(filename);
+               if (!dir)
+                       return;
+
+               clevel = 0;
+
+               /* For each C-state, there is a stateX directory which
+                * contains a 'usage' and a 'time' (duration) file */
+               while ((entry = readdir(dir))) {
+                       if (strlen(entry->d_name) < 3)
+                               continue;
+                       sprintf(filename + len, "/%s/usage", entry->d_name);
+                       file = fopen(filename, "r");
+                       if (!file)
+                               continue;
+
+                       memset(line, 0, 4096);
+                       f = fgets(line, 4096, file);
+                       fclose(file);
+                       if (f == NULL)
+                               break;
+
+                       usage[clevel] += 1+strtoull(line, NULL, 10);
+
+                       sprintf(filename + len, "/%s/time", entry->d_name);
+                       file = fopen(filename, "r");
+                       if (!file)
+                               continue;
+               
+                       memset(line, 0, 4096);
+                       f = fgets(line, 4096, file);
+                       fclose(file);
+                       if (f == NULL)
+                               break;
+
+                       duration[clevel] += 1+strtoull(line, NULL, 10);
+
+                       clevel++;
+                       if (clevel > maxcstate)
+                               maxcstate = clevel;
+               
+               }
+               closedir(dir);
+
+       }
+       closedir(cpudir);
+}
+
+static void read_data(uint64_t * usage, uint64_t * duration)
+{
+       int r;
+       struct stat s;
+
+       /* First, check for ACPI */
+       r = stat("/proc/acpi/processor", &s);
+       if (!r) {
+               read_data_acpi(usage, duration);
+
+               /* perform residency calculations based on ACPI timer */
+               FREQ = FREQ_ACPI;
+               return;
+       }
+
+       /* Then check for CPUidle */
+       r = stat("/sys/devices/system/cpu/cpuidle", &s);
+       if (!r) {
+               read_data_cpuidle(usage, duration);
+               
+               /* perform residency calculations based on usecs */
+               FREQ = 1000;
+       }
+}
+
 void stop_timerstats(void)
 {
        FILE *file;
@@ -530,7 +634,7 @@ int main(int argc, char **argv)
                memset(&cstate_lines, 0, sizeof(cstate_lines));
                topcstate = -4;
                if (totalevents == 0 && maxcstate <= 1) {
-                       sprintf(cstate_lines[5],_("< Detailed C-state 
information is only available on Mobile CPUs (laptops) >\n"));
+                       sprintf(cstate_lines[5],_("< Detailed C-state 
information is not available.>\n"));
                } else {
                        double sleept, percentage;;
                        c0 = sysconf(_SC_NPROCESSORS_ONLN) * ticktime * 1000 * 
FREQ - totalticks;

_______________________________________________
Power mailing list
[email protected]
http://www.bughost.org/mailman/listinfo/power

Reply via email to