[lustre-discuss] How to get CPU and Network usage for a particular OST

2019-08-16 Thread Masudul Hasan Masud Bhuiyan
Hi,
I am not sure whether this is possible, but I need the CPU and memory usage
and network metrics such as packet drops and RTT for a particular OST. How
can I get this information in real time?

Regards.
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Significance of LOV and LLITE metrics

2019-03-26 Thread Masudul Hasan Masud Bhuiyan
Hi,
I am looking for metrics that affect or reflect user I/O performance. So far
I understand that the OST and MDT metrics are significant. Are there any
other metrics that get updated during file read/write operations? Mainly I
want to know the significance of the LOV and LLITE metrics in terms of I/O.
(As LOV is on the client side, I think it might be significant, but I am not
sure what it indicates.) Also, what do these entries in LLITE indicate?


lazystatfs
site
checksum_pages
lmv
statahead_agl
lov
default_easize
statahead_running_max
dump_page_cache
max_easize
statahead_stats
extents_stats
fast_read
stats_track_gid
nosquash_nids
stats_track_pid
offset_stats
stats_track_ppid
fstype
pio
unstable_stats
read_ahead_stats
root_squash
xattr_cache
sbi_flags

Also, what is the difference between kbytesavail and kbytesfree in the
context of Lustre?
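
For readers comparing the two values: as far as I understand, kbytesfree and kbytesavail follow the same split as the f_bfree and f_bavail fields of statfs, where "free" counts all unallocated blocks and "avail" counts only the blocks an unprivileged user may still allocate once reserved space is set aside. That mapping is an assumption about Lustre, not something confirmed in this thread. The sketch below shows the same distinction on any local POSIX filesystem; the helper name free_vs_available_kib is made up for illustration.

```python
import os


def free_vs_available_kib(path: str) -> tuple:
    """Return (free_kib, avail_kib) for the filesystem that holds `path`.

    free_kib  : all unallocated blocks, including space reserved for root
    avail_kib : blocks an unprivileged user may still allocate
    """
    st = os.statvfs(path)
    # f_frsize is the fragment size, the unit f_bfree and f_bavail are counted in.
    free_kib = st.f_bfree * st.f_frsize // 1024
    avail_kib = st.f_bavail * st.f_frsize // 1024
    return free_kib, avail_kib


free_kib, avail_kib = free_vs_available_kib("/")
print(f"free : {free_kib} KiB")
print(f"avail: {avail_kib} KiB")
print(f"held back (reserved): {free_kib - avail_kib} KiB")
```

If the assumption holds, a capacity monitor would normally alert on the "avail" figure, since that is what ordinary users can actually write.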


[lustre-discuss] CPU usages of MDT/MDS and OST/OSS

2019-02-25 Thread Masudul Hasan Masud Bhuiyan
I need to know the CPU usage of a particular OST and MDT. I have seen that
it is directly available to system administrators, but I don't have access
to that information. So I was wondering how I can estimate the CPU load from
other metrics. Which other metrics can give a rough idea of the CPU load of
the OST/MDT?

These are the metrics available for the OST:

active
filestotal
ost_server_uuid
blocksize
grant_shrink_interval
ping
checksum_dump
import
pinger_recov
checksums
kbytesavail
resend_count
checksum_type
kbytesfree
rpc_stats
connect_flags
kbytestotal
srpc_contexts
contention_seconds
lockless_truncate
srpc_info
cur_dirty_bytes
max_dirty_mb
state
cur_dirty_grant_bytes
max_pages_per_rpc
stats
cur_grant_bytes
max_rpcs_in_flight
timeouts
cur_lost_grant_bytes
osc_cached_mb
unstable_stats
destroys_in_flight
osc_stats
uuid
filesfree
ost_conn_uuid

Regards.
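
One possible client-side proxy, sketched here under assumptions: if the "stats" entry in the list above holds cumulative counters (a sample count plus a running sum of microseconds per operation), then sampling it twice and taking the difference gives a request rate and a mean latency for that interval. This shows how responsive the server looks from the client. It is not a direct CPU measurement, and rising latency can have other causes such as disk or network contention. The function name interval_rates and the numbers in the example are invented for illustration; they are not measurements from a real system.

```python
from typing import Dict, Tuple

# One snapshot: counter name -> (cumulative sample count, cumulative sum in usec)
Snapshot = Dict[str, Tuple[int, int]]


def interval_rates(before: Snapshot, after: Snapshot,
                   seconds: float) -> Dict[str, Tuple[float, float]]:
    """Return {name: (ops_per_second, mean_usec)} for the interval between snapshots."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    rates = {}
    for name, (count_after, sum_after) in after.items():
        if name not in before:
            continue  # counter appeared after the first snapshot
        count_before, sum_before = before[name]
        d_count = count_after - count_before
        d_sum = sum_after - sum_before
        if d_count < 0 or d_sum < 0:
            continue  # counters were reset between the two snapshots
        mean_usec = d_sum / d_count if d_count else 0.0
        rates[name] = (d_count / seconds, mean_usec)
    return rates


# Hypothetical values, 60 seconds apart (illustration only).
before = {"req_waittime": (1000, 2_000_000)}
after = {"req_waittime": (1600, 3_500_000)}

for name, (ops, mean_usec) in interval_rates(before, after, 60.0).items():
    print(f"{name}: {ops:.1f} ops/s, mean {mean_usec:.0f} usec")
```

Working on deltas rather than raw totals matters because the totals only ever grow, so they say little about the current load.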


Re: [lustre-discuss] Lustre Monitoring metrics

2019-02-24 Thread Masudul Hasan Masud Bhuiyan
Thank you very much. Unfortunately, as I am doing my research on a
third-party sc cluster, I don't have access to these tools. Also, I am
developing something like LMT myself, so it would be great if I could get
some idea of what these metrics mean. I would be grateful if you could point
me to where I can learn about them.

Regards.

On Sun, Feb 24, 2019 at 11:06 AM Andreas Dilger wrote:

> Probably for a new user it doesn't make sense to use the Lustre stats in
> /proc directly. There are a number of different tools that present these
> stats in a more useful manner, such as IML (GUI Web front end), LMT, lltop,
> etc.
>
> Cheers, Andreas
>
> On Feb 24, 2019, at 02:09, Masudul Hasan Masud Bhuiyan <
> masud.ha...@nevada.unr.edu> wrote:
>
> Hi,
>
> I am very new to Lustre and I am trying to understand the Lustre
> monitoring system. I have collected the stats from "/proc/fs/lustre".
> Unfortunately, I couldn't understand what these data actually mean.
> I have gone through the Lustre documentation and user guide to try to
> understand the data, but had no luck.
>
> I would be grateful if you could help me understand these log files. This
> is a sample of the stats for a metadata server, but I am not sure how to
> interpret them. What do "mds_getattr", "mds_get_root", and "mds_quotactl"
> mean?
>
> Fri Feb 8 00:52:13 2019
> snapshot_time 1549615933.091769380 secs.nsecs
> req_waittime 540323978 samples [usec] 58 689541437 1767362743217 12488011030933825609
> req_active 540323988 samples [reqs] 1 142007 60970174330 3247243705183378
> mds_getattr 168893 samples [usec] 82 165621 390916699 1522734684023
> mds_close 110379799 samples [usec] 64 1231540 17796322782 62011789734542
> mds_readpage 1236125 samples [usec] 245 10402461 10527744335 1995909318950381
> mds_connect 6 samples [usec] 186 868 1900 969558
> mds_get_root 1 samples [usec] 85 85 85 7225
> mds_statfs 795609 samples [usec] 67 1109562 226674777 4390097174469
> mds_sync 6769 samples [usec] 361624 689541437 80448645395 7168190473271866793
> mds_quotactl 1449 samples [usec] 69 18051 551878 2621695252
> mds_getxattr 1886 samples [usec] 93 13792 419449 525089597
> mds_hsm_state_set 234556 samples [usec] 77 893155 96605623 1595746930715
> ldlm_cancel 115821035 samples [usec] 58 5627198 1149359370420 1270415775256810056
> obd_ping 23932 samples [usec] 87 34096 7061304 6296645618
> seq_query 38 samples [usec] 87 767 6885 1688653
> fld_query 3 samples [usec] 84 370 789 256181
>
> Sincerely yours,
> Masudul Hasan Masud
>


[lustre-discuss] Lustre Monitoring metrics

2019-02-24 Thread Masudul Hasan Masud Bhuiyan
Hi,

I am very new to Lustre and I am trying to understand the Lustre
monitoring system. I have collected the stats from "/proc/fs/lustre".
Unfortunately, I couldn't understand what these data actually mean.
I have gone through the Lustre documentation and user guide to try to
understand the data, but had no luck.

I would be grateful if you could help me understand these log files. This
is a sample of the stats for a metadata server, but I am not sure how to
interpret them. What do "mds_getattr", "mds_get_root", and "mds_quotactl"
mean?

Fri Feb 8 00:52:13 2019
snapshot_time 1549615933.091769380 secs.nsecs
req_waittime 540323978 samples [usec] 58 689541437 1767362743217 12488011030933825609
req_active 540323988 samples [reqs] 1 142007 60970174330 3247243705183378
mds_getattr 168893 samples [usec] 82 165621 390916699 1522734684023
mds_close 110379799 samples [usec] 64 1231540 17796322782 62011789734542
mds_readpage 1236125 samples [usec] 245 10402461 10527744335 1995909318950381
mds_connect 6 samples [usec] 186 868 1900 969558
mds_get_root 1 samples [usec] 85 85 85 7225
mds_statfs 795609 samples [usec] 67 1109562 226674777 4390097174469
mds_sync 6769 samples [usec] 361624 689541437 80448645395 7168190473271866793
mds_quotactl 1449 samples [usec] 69 18051 551878 2621695252
mds_getxattr 1886 samples [usec] 93 13792 419449 525089597
mds_hsm_state_set 234556 samples [usec] 77 893155 96605623 1595746930715
ldlm_cancel 115821035 samples [usec] 58 5627198 1149359370420 1270415775256810056
obd_ping 23932 samples [usec] 87 34096 7061304 6296645618
seq_query 38 samples [usec] 87 767 6885 1688653
fld_query 3 samples [usec] 84 370 789 256181

Sincerely yours,
Masudul Hasan Masud
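
For anyone reading the sample above: each counter line appears to follow the pattern "name, sample count, the word samples, [unit], min, max, sum, sum of squares". That is the layout I believe Lustre stats files normally use, but treat it as an assumption, since the thread does not confirm it. Under that reading, the names such as mds_getattr would be request types handled by the MDS, and a mean and standard deviation can be derived per request type. The sketch below uses three lines from the sample; StatLine, parse_stat_line, mean and stddev are names made up for illustration.

```python
import math
from typing import NamedTuple, Optional


class StatLine(NamedTuple):
    name: str
    samples: int
    unit: str
    minimum: int
    maximum: int
    total: int
    sum_squares: Optional[int]


def parse_stat_line(line: str) -> Optional[StatLine]:
    """Parse '<name> <count> samples [<unit>] <min> <max> <sum> [<sumsq>]'."""
    fields = line.split()
    # Skip headers such as 'snapshot_time ... secs.nsecs' and short counter lines.
    if len(fields) < 7 or fields[2] != "samples":
        return None
    try:
        count = int(fields[1])
        minimum, maximum, total = (int(f) for f in fields[4:7])
        sum_squares = int(fields[7]) if len(fields) > 7 else None
    except ValueError:
        return None
    unit = fields[3].strip("[]")
    return StatLine(fields[0], count, unit, minimum, maximum, total, sum_squares)


def mean(stat: StatLine) -> float:
    return stat.total / stat.samples if stat.samples else 0.0


def stddev(stat: StatLine) -> Optional[float]:
    """Population standard deviation from the sum and the sum of squares."""
    if stat.sum_squares is None or stat.samples == 0:
        return None
    variance = stat.sum_squares / stat.samples - mean(stat) ** 2
    return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors


SAMPLE = """\
snapshot_time 1549615933.091769380 secs.nsecs
mds_getattr 168893 samples [usec] 82 165621 390916699 1522734684023
mds_connect 6 samples [usec] 186 868 1900 969558
mds_get_root 1 samples [usec] 85 85 85 7225
"""

for raw in SAMPLE.splitlines():
    stat = parse_stat_line(raw)
    if stat is not None:
        print(f"{stat.name}: n={stat.samples} mean={mean(stat):.1f} {stat.unit} "
              f"min={stat.minimum} max={stat.maximum}")
```

The parser expects one counter per line, so entries that were wrapped by the mail client need to be rejoined before parsing.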