Re: turbostat-17.06.23 floating point exception

2018-10-18 Thread Solio Sarabia
On Fri, Oct 12, 2018 at 07:03:41PM -0400, Len Brown wrote:
> > Why would the cpu topology report 0 cpus?  I added a debug entry to
> > cpu_usage_stat and /proc/stat showed it as an extra column.  Then
> > fscanf parsing in for_all_cpus() failed, causing the SIGFPE.
> >
> > This is not an issue. Thanks.
> 
> Yes, it is true that turbostat doesn't check for systems with 0 cpus.
> I'm curious how you provoked the kernel to claim that.  If it is
> something others might do, we can have a check for it and gracefully
> exit.

source/tools/power/x86/turbostat/turbostat.c
int for_all_proc_cpus(int (func)(int))
{
	retval = fscanf(fp, "cpu %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n");
	^
This fails due to an extra debug entry in /proc/stat
(total of 11 columns).  I was measuring time in a hot
function and decided to add this time in an extra
cpu_usage_stat. This was an experiment though.

Thanks,
-S.
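
To illustrate the failure described above, here is a minimal sketch, not the actual turbostat code. The function names (count_cpus_strict, count_cpus_tolerant, stat_from_text) are hypothetical. The strict variant reuses the ten-column fscanf format quoted above. With an 11th column, the leftover field stays in the stream, the next "cpu%u" match fails, and the loop ends having seen zero CPUs. The tolerant variant is one possible alternative that reads line by line and ignores trailing columns.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Strict parse, modeled on the quoted fscanf format: expects exactly
 * ten numeric columns after the "cpu" label on every line. */
static int count_cpus_strict(FILE *fp)
{
	unsigned int cpu;
	int count = 0;

	/* Skip the aggregate "cpu" line. All fields are suppressed, so
	 * fscanf returns 0 here whether or not the whole line matched. */
	if (fscanf(fp, "cpu %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n") != 0)
		return -1;

	/* An unread 11th column left in the stream makes this fail at once. */
	while (fscanf(fp, "cpu%u %*d %*d %*d %*d %*d %*d %*d %*d %*d %*d\n",
		      &cpu) == 1)
		count++;

	return count;
}

/* Tolerant parse (hypothetical alternative): count lines that start with
 * "cpu" followed by a digit, regardless of how many columns follow. */
static int count_cpus_tolerant(FILE *fp)
{
	char buf[256];
	int count = 0;
	int at_line_start = 1;

	while (fgets(buf, sizeof(buf), fp)) {
		if (at_line_start && strncmp(buf, "cpu", 3) == 0 &&
		    isdigit((unsigned char)buf[3]))
			count++;
		/* No '\n' in this chunk means the line continues. */
		at_line_start = strchr(buf, '\n') != NULL;
	}
	return count;
}

/* Helper for experiments: put sample /proc/stat text in a temp file. */
static FILE *stat_from_text(const char *text)
{
	FILE *fp = tmpfile();

	if (!fp)
		return NULL;
	fputs(text, fp);
	rewind(fp);
	return fp;
}
```

Feeding both functions a sample with ten columns gives the same count. With eleven columns the strict version reports zero CPUs, which is the value that later ends up as the divisor.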


Re: turbostat-17.06.23 floating point exception

2018-10-12 Thread Len Brown
> Why would the cpu topology report 0 cpus?  I added a debug entry to
> cpu_usage_stat and /proc/stat showed it as an extra column.  Then
> fscanf parsing in for_all_cpus() failed, causing the SIGFPE.
>
> This is not an issue. Thanks.

Yes, it is true that turbostat doesn't check for systems with 0 cpus.
I'm curious how you provoked the kernel to claim that.  If it is
something others might do, we can have a check for it and gracefully
exit.

thanks,
-Len




-- 
Len Brown, Intel Open Source Technology Center
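
The "check for it and gracefully exit" idea above could look roughly like the following. This is only a sketch under assumptions: average_tsc is a hypothetical helper, not a function in turbostat.c, and the real code divides several fields by topo.num_cpus inside compute_average. The point is simply to test the divisor before dividing, so that a zero CPU count produces an error instead of SIGFPE.

```c
#include <stdio.h>

/* Hypothetical helper: average a TSC sum over num_cpus.
 * Returns 0 on success, -1 if the CPU count is not positive. */
static int average_tsc(unsigned long long tsc_sum, int num_cpus,
		       unsigned long long *avg)
{
	if (num_cpus <= 0) {
		/* Integer division by zero would raise SIGFPE here. */
		fprintf(stderr, "no CPUs found in topology\n");
		return -1;
	}
	*avg = tsc_sum / (unsigned long long)num_cpus;
	return 0;
}
```

A caller would check the return value and exit with an error message rather than continue with an undefined average.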


Re: turbostat-17.06.23 floating point exception

2018-10-12 Thread Solio Sarabia
On Fri, Oct 12, 2018 at 11:26:30AM -0700, Solio Sarabia wrote:
> Hi --
> 
> turbostat 17.06.23 is throwing an exception on a custom linux-4.16.12
> kernel, on Xeon E5-2699 v4 Broadwell EP, 2S, 22C/S, 44C total, HT off,
> VTx off.
> 
> Initially the system had 4.4.0-137. Then I built and installed
> linux-4.16.12-default.  turbostat works fine for these two versions.
> After building linux-4.16.12 for a second time, the older kernel was
> renamed, and `ls -l /boot/` now shows (I'm using the version without
> the .old suffix):
> 
>   vmlinuz-4.16.12-default+
>   vmlinuz-4.16.12-default+.old
> 
> grep -i 'turbostat' /var/log/kern.log
> 
> kernel: [  159.140836] capability: warning: `turbostat' uses 32-bit
>   capabilities (legacy support in use)
> kernel: [  164.149264] traps: turbostat[1801] trap divide error
>   ip:407625 sp:7ffe4b0df000 error:0 in turbostat[40+17000]
> 
> (gdb)
> cpu22: MSR_PKGC3_IRTL: 0x (NOTvalid, 0 ns)
> cpu22: MSR_PKGC6_IRTL: 0x (NOTvalid, 0 ns)
> cpu22: MSR_PKGC7_IRTL: 0x (NOTvalid, 0 ns)
> 
> Program received signal SIGFPE, Arithmetic exception.
> 0x00407625 in compute_average (t=0x61a3b0, c=0x61a3d0, p=0x61a480) at turbostat.c:1378
> 1378		average.threads.tsc /= topo.num_cpus;
> 
Why would the cpu topology report 0 cpus?  I added a debug entry to
cpu_usage_stat and /proc/stat showed it as an extra column.  Then
fscanf parsing in for_all_cpus() failed, causing the SIGFPE.

This is not an issue. Thanks.

> Let me know if you need more details.
> 
> Thanks,
> -SS