Anna,
for monitoring server storage access, the client-side file heat is not actually
very useful, because (a) it is spread across all of the clients, and (b) it
shows the client-side access patterns (which may be largely served from cache)
rather than the actual server storage access, which is what matters for tiering
decisions.
Consider a client that reads a file once from storage and then accesses it
1000x from cache: the file may be considered "hot" by the client, but it
doesn't really matter what kind of storage the file is on, since the storage
only sees a single read.
Instead of the client-side file heat there is a different mechanism on the
OSTs, the OFD access log (ALR; see lustre/utils/ofd_access_log_reader.c),
which is a lightweight way to aggregate all storage accesses into a
producer/consumer circular log that is consumed by a userspace process. It is
up to the userspace process to aggregate these log records across all of the
OSTs to decide which files are "hot" and which are "cold". The ALR records are
described in lustre/include/uapi/linux/lustre/lustre_access_log.h:
struct ofd_access_entry_v1 {
        struct lu_fid oae_parent_fid;   /* 16 */
        __u64 oae_begin;                /* 24 */
        __u64 oae_end;                  /* 32 */
        __u64 oae_time;                 /* 40 */
        __u32 oae_size;                 /* 44 */
        __u32 oae_segment_count;        /* 48 */
        __u32 oae_flags;                /* 52 enum ofd_access_flags */
        __u32 oae_reserved1;            /* 56 */
        __u32 oae_reserved2;            /* 60 */
        __u32 oae_reserved3;            /* 64 */
};
The records contain the parent FID on the MDT (to allow aggregation of IOs on
multiple OST objects of the same file), flags that describe if it is a read or
write, and the start/end of the extent read/written, so that the consumer can
decide if the IOs are well-formed for the storage (e.g. large or sequential
reads/writes on an HDD are OK vs. small/random reads/writes on an HDD are not
OK). The ALR queue is transient, so it is up to the consumer to make any
decisions about the current IO patterns on files.
Cheers, Andreas
On Feb 22, 2023, at 14:16, Anna Fuchs <[email protected]> wrote:
Thank you.
Is there any documentation for the values?
Does client-side mean only statistics while the file remains in the client
cache, not lifetime statistics?
Are there any plans to work further on this feature?
I can think of several use cases for knowing these stats.
Cold data could be moved to an archive like slow tape without relying on
access time. Hot blocks could be replicated or moved to faster caches, and
many more optimizations.
Best regards
Anna
On 18.02.2023 at 21:57, Andreas Dilger wrote:
Anna, there was a client-side file heat mechanism added a few years ago, but I
don't know if it is fully functional today.
lctl get_param llite.*.*heat*
llite.myth-ffff979380fc1800.file_heat=1
llite.myth-ffff979380fc1800.heat_decay_percentage=80
llite.myth-ffff979380fc1800.heat_period_second=60
And then "lfs heat_get <file>" to dump the file heat, though there haven't
been any good tools developed yet to list the top-heat files.
Cheers, Andreas
On Feb 7, 2023, at 08:56, Anna Fuchs via lustre-discuss
<[email protected]> wrote:
Hello,
is there a way to see how many times a file has been accessed ever (like a heat
map)?
Thanks
Anna
--
Anna Fuchs
Universität Hamburg
https://wr.informatik.uni-hamburg.de
[email protected]
https://wr.informatik.uni-hamburg.de/people/anna_fuchs
_______________________________________________
lustre-discuss mailing list
[email protected]<mailto:[email protected]>
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud