Hello Naveen,
On Wed, Apr 01, 2020 at 03:28:48PM +0530, Naveen N. Rao wrote:
> Gautham R. Shenoy wrote:
> >From: "Gautham R. Shenoy" <e...@linux.vnet.ibm.com>
>
> [..snip..]
>
> >-static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
> >-static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
> > static DEVICE_ATTR(pir, 0400, show_pir, NULL);
> > static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
> > #endif /* CONFIG_PPC64 */
> >@@ -761,22 +757,110 @@ static void create_svm_file(void)
> > }
> > #endif /* CONFIG_PPC_SVM */
> >
> >+#ifdef CONFIG_PPC64
> >+/*
> >+ * The duration (in ms) from the last IPI to the target CPU until
> >+ * which a cached value of purr, spurr, idle_purr, idle_spurr can be
> >+ * reported to the user on a corresponding sysfs file read. Beyond
> >+ * this duration, fresh values need to be obtained by sending IPIs to
> >+ * the target CPU when the sysfs files are read.
> >+ */
> >+static unsigned long util_stats_staleness_tolerance_ms = 10;
>
> This is a nice optimization for our use in lparstat, though I have a
> concern below.
>
> >+struct util_acct_stats {
> >+	u64 latest_purr;
> >+	u64 latest_spurr;
> >+#ifdef CONFIG_PPC_PSERIES
> >+	u64 latest_idle_purr;
> >+	u64 latest_idle_spurr;
> >+#endif
>
> You can probably drop the 'latest_' prefix.

Sure.

> >+	unsigned long last_update_jiffies;
> >+};
> >+
> >+DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);
>
> Per snowpatch, this should be static, and so should get_util_stats_ptr()
> below:
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/16601//artifact/linux/report.txt

Ok, will fix this in v5.
>
> >+
> >+static void update_util_acct_stats(void *ptr)
> >+{
> >+	struct util_acct_stats *stats = ptr;
> >+
> >+	stats->latest_purr = mfspr(SPRN_PURR);
> >+	stats->latest_spurr = mfspr(SPRN_SPURR);
> > #ifdef CONFIG_PPC_PSERIES
> >-static void read_idle_purr(void *val)
> >+	stats->latest_idle_purr = read_this_idle_purr();
> >+	stats->latest_idle_spurr = read_this_idle_spurr();
> >+#endif
> >+	stats->last_update_jiffies = jiffies;
> >+}
> >+
> >+struct util_acct_stats *get_util_stats_ptr(int cpu)
> >+{
> >+	struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu);
> >+	unsigned long delta_jiffies;
> >+
> >+	delta_jiffies = jiffies - stats->last_update_jiffies;
> >+
> >+	/*
> >+	 * If we have a recent enough data, reuse that instead of
> >+	 * sending an IPI.
> >+	 */
> >+	if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
> >+		return stats;
> >+
> >+	smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
> >+	return stats;
> >+}
> >+
> >+static ssize_t show_purr(struct device *dev,
> >+			 struct device_attribute *attr, char *buf)
> > {
> >-	u64 *ret = val;
> >+	struct cpu *cpu = container_of(dev, struct cpu, dev);
> >+	struct util_acct_stats *stats;
> >
> >-	*ret = read_this_idle_purr();
> >+	stats = get_util_stats_ptr(cpu->dev.id);
> >+	return sprintf(buf, "%llx\n", stats->latest_purr);
>
> This alters the behavior of the current sysfs purr file. I am not sure
> if it is reasonable to return the same PURR value across a 10ms window.

It does reduce it to a 10ms window. I am not sure anyone samples PURR
etc. faster than that rate.

I measured the time it takes to read the purr, spurr, idle_purr and
idle_spurr files back-to-back: it is no more than 150us.

Will lparstat read these values back-to-back? If so, we can reduce the
staleness tolerance to something like 500us and still avoid the extra
IPIs. If not, what is the maximum delay between the first and the last
sysfs file read?
>
> I wonder if we should introduce a sysctl interface to control
> thresholding. It can default to 0, which disables thresholding so that
> the existing behavior continues. Applications (lparstat) can optionally
> set it to suit their use.

We would be introducing three new sysfs interfaces that way instead of
two:

    /sys/devices/system/cpu/purr_spurr_staleness
    /sys/devices/system/cpu/cpuX/idle_purr
    /sys/devices/system/cpu/cpuX/idle_spurr

I don't have a problem with this. Nathan, Michael, thoughts on this?

The alternative is a procfs interface, something like
/proc/powerpc/resource_util_stats, which gives a listing similar to
/proc/stat, i.e.

    CPUX <purr> <idle_purr> <spurr> <idle_spurr>

Even in this case, the values can be obtained in one shot with a single
IPI and printed in the row corresponding to the CPU.

> - Naveen
>

--
Thanks and Regards
gautham.