Re: [lustre-discuss] lproc stats changed snapshot_time from unix-epoch to uptime/monotonic in 2.15

2022-08-24 Thread Andreas Dilger via lustre-discuss
Ellis, thanks for reporting this.  This looks like it was a mistake. 

The timestamps should definitely be in wallclock time, but this looks to have 
been changed unintentionally to reduce overhead, and use a u64 instead of 
dealing with timespec64 math, while losing the original intent (there are many 
different ktime_get variants, all alike).

I think many monitoring tools will be unaffected because they use the delta 
between successive timestamps, but having timestamps that are relative to boot 
time is problematic since they may repeat or go backward after a reboot, and 
some tools may use this timestamp when inserting into a tsdb to avoid 
processing lag. 

Please file a ticket, and ideally if you can submit a patch that converts 
ktime_get() to ktime_get_real_ns() for the places that are changed in the patch 
(with a "Fixes:" line to track it against the original patch, which was commit 
ea2cd3af7b).

Cheers, Andreas

> On Aug 24, 2022, at 14:50, Ellis Wilson via lustre-discuss 
>  wrote:
> 
> Hi all,
> 
> One of my colleagues noticed that in testing 2.15.1 out the stats returned 
> include snapshot_time showing up in a different fashion than before.  
> Previously, ktime_get_real_ts64 was used to get the current timestamp and 
> that was presented when stats were printed, whereas now uptime is used as 
> returned by ktime_get.  Is there a monotonic requirement to snapshot_time 
> that I'm not thinking about that makes ktime_get more useful?  The previous 
> behavior of getting the current time alongside the stats so you could reason 
> about when they were gotten made more sense to me.  But perhaps Andreas had a 
> different vision for use of snapshot_time given that he was the one who 
> revised it?
> 
> I'm glad to open a bug and propose a patch if this was just a mistake, but 
> figured I'd ask first.
> 
> Best,
> 
> ellis
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] lproc stats changed snapshot_time from unix-epoch to uptime/monotonic in 2.15

2022-08-24 Thread Ellis Wilson via lustre-discuss
Hi all,

One of my colleagues noticed that in testing 2.15.1 out the stats returned 
include snapshot_time showing up in a different fashion than before.  
Previously, ktime_get_real_ts64 was used to get the current timestamp and that 
was presented when stats were printed, whereas now uptime is used as returned 
by ktime_get.  Is there a monotonic requirement to snapshot_time that I'm not 
thinking about that makes ktime_get more useful?  The previous behavior of 
getting the current time alongside the stats so you could reason about when 
they were gotten made more sense to me.  But perhaps Andreas had a different 
vision for use of snapshot_time given that he was the one who revised it?

I'm glad to open a bug and propose a patch if this was just a mistake, but 
figured I'd ask first.

Best,

ellis
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org