Re: turbostat-17.06.23 floating point exception
On Fri, Oct 12, 2018 at 07:03:41PM -0400, Len Brown wrote:
> > Why would the cpu topology report 0 cpus? I added a debug entry to
> > cpu_usage_stat and /proc/stat showed it as an extra column. Then
> > fscanf parsing in for_all_cpus() failed, causing the SIGFPE.
> >
> > This is not an issue. Thanks.
>
> Yes, it is true that turbostat doesn't check for systems with 0 cpus.
> I'm curious how you provoked the kernel to claim that. If it is
> something others might do, we can have a check for it and gracefully
> exit.

source/tools/power/x86/turbostat/turbostat.c

int for_all_proc_cpus(int (func)(int))
{
	retval = fscanf(fp, "cpu %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n");
	                ^

This fails due to an extra debug entry in /proc/stat (total of 11 columns).

I was measuring time in a hot function and decided to add this time in
an extra cpu_usage_stat. This was an experiment though.

Thanks,
-S.
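
For illustration only, here is a minimal sketch of a /proc/stat CPU count that does not depend on the number of columns. It is not the turbostat implementation. The helper name count_cpu_lines() is hypothetical, and the sketch assumes it is enough to recognise per-CPU lines by their "cpuN" prefix and ignore the rest of each line.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of turbostat: count the per-CPU "cpuN"
 * lines in a /proc/stat-style stream. Only the line prefix is inspected,
 * so an extra (or missing) statistics column does not affect the result.
 */
static int count_cpu_lines(FILE *fp)
{
	char line[1024];
	int at_line_start = 1;	/* is this chunk the start of a new line? */
	int count = 0;

	assert(fp != NULL);

	while (fgets(line, sizeof(line), fp) != NULL) {
		size_t len = strlen(line);

		/*
		 * "cpu" followed directly by a digit is a per-CPU line;
		 * the aggregate line is "cpu" followed by a space.
		 */
		if (at_line_start && strncmp(line, "cpu", 3) == 0 &&
		    isdigit((unsigned char)line[3]))
			count++;

		/* Long lines (e.g. "intr") may arrive in several chunks. */
		at_line_start = (len > 0 && line[len - 1] == '\n');
	}
	return count;
}
```

A caller would still need to treat a result of 0 as an error before using it as a divisor.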
Re: turbostat-17.06.23 floating point exception
> Why would the cpu topology report 0 cpus? I added a debug entry to
> cpu_usage_stat and /proc/stat showed it as an extra column. Then
> fscanf parsing in for_all_cpus() failed, causing the SIGFPE.
>
> This is not an issue. Thanks.

Yes, it is true that turbostat doesn't check for systems with 0 cpus.
I'm curious how you provoked the kernel to claim that. If it is
something others might do, we can have a check for it and gracefully
exit.

thanks,
-Len

--
Len Brown, Intel Open Source Technology Center
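
A check of the kind suggested above could look roughly like the following sketch. It is not a proposed patch: average_per_cpu() and its parameters are hypothetical names, and the only assumption is that the CPU count is validated before it is used as a divisor, since an integer division by zero is what raises SIGFPE on x86.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of turbostat: compute total / num_cpus,
 * but refuse to divide when no CPUs were detected.
 * Returns 0 on success, -1 if num_cpus is not positive (*avg is untouched).
 */
static int average_per_cpu(unsigned long long total, int num_cpus,
			   unsigned long long *avg)
{
	if (num_cpus <= 0) {
		fprintf(stderr, "no CPUs detected (num_cpus=%d)\n", num_cpus);
		return -1;
	}

	*avg = total / (unsigned long long)num_cpus;
	return 0;
}
```

The caller can then print a diagnostic and exit with a non-zero status instead of taking the divide error.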
Re: turbostat-17.06.23 floating point exception
On Fri, Oct 12, 2018 at 11:26:30AM -0700, Solio Sarabia wrote:
> Hi --
>
> turbostat 17.06.23 is throwing an exception on a custom linux-4.16.12
> kernel, on Xeon E5-2699 v4 Broadwell EP, 2S, 22C/S, 44C total, HT off,
> VTx off.
>
> Initially the system had 4.4.0-137. Then I built and installed
> linux-4.16.12-default. turbostat works fine for these two versions.
> After building linux-4.16.12 for a second time, the older kernel is
> renamed and now `ls -l /boot/` (I'm using version without .old suffix):
>
> vmlinuz-4.16.12-default+
> vmlinuz-4.16.12-default+.old
>
> grep -i 'turbostat' /var/log/kern.log
>
> kernel: [ 159.140836] capability: warning: `turbostat' uses 32-bit
> capabilities (legacy support in use)
> kernel: [ 164.149264] traps: turbostat[1801] trap divide error
> ip:407625 sp:7ffe4b0df000 error:0 in turbostat[40+17000]
>
> (gdb)
> cpu22: MSR_PKGC3_IRTL: 0x (NOTvalid, 0 ns)
> cpu22: MSR_PKGC6_IRTL: 0x (NOTvalid, 0 ns)
> cpu22: MSR_PKGC7_IRTL: 0x (NOTvalid, 0 ns)
>
> Program received signal SIGFPE, Arithmetic exception.
> 0x00407625 in compute_average (t=0x61a3b0, c=0x61a3d0, p=0x61a480) at
> turbostat.c:1378
> 1378	average.threads.tsc /= topo.num_cpus;
>

Why would the cpu topology report 0 cpus? I added a debug entry to
cpu_usage_stat and /proc/stat showed it as an extra column. Then
fscanf parsing in for_all_cpus() failed, causing the SIGFPE.

This is not an issue. Thanks.

> Let me know if you need more details.
>
> Thanks,
> -SS