On Thu, 2024-07-18 at 13:26 -0400, Michael DiDomenico wrote:
> there's nothing 'grep -i evict' in the lustre_debug logs or from the
> storage console logs
hm, ok.

> > perhaps see continuity of the timestamps and what happened right
> > before and right after the gap if there is one in the times?
> >
>
> i pulled a counter from the logs of the functions calls, maybe one of
> these looks off (this is just the ones over 100k), please excuse
> typos
>
> $ grep -vh "^$" lustre_debug*.log | cut -f10 -d: | cut -f1 -d\) |
> sort
> > uniq -c | sort -n
> 105034 lov_io_init
> 105034 vvp_io_init
> 105035 lov_io_iter_init
> 105035 lob_strip_intersects
> 105035 osc_cache_writeback_range
> 105043 vvp_io_fini
> 105050 lov_conf_freeze
> 105050 lov_conf_thaw
> 294806 osc_attr-update
> 294806 osc_page_touch_at
> 294814 osc_consume_write_grant
> 294815 lov_attr_get_composite
> 294816 osc_enter_cache_try
> 351044 ll_write_end
> 589549 osc_queueu_async_io
> 589617 lov_merge_lvm_kms

that's not really going to do anything useful. there's a timestamp in
unix time as the fourth field (separated with colons), see if there are
gaps there.

I imagine there's going to be really dense (time-wise) activity, then an
RPC is prepared and sent (Sending RPC ....), and then a lot sparser
activity (perhaps with multi-second pauses), and then eventually it'll
pick up after getting a server response, for example?

Though none of that explains why lctl would hang I guess, but still
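Something along these lines might help spot the gaps. It's only a rough sketch: find_gaps is a made-up helper name, and it assumes each log line carries seconds.microseconds as the fourth colon-separated field, that the lines are already in time order, and that the files are given in chronological order. Lines without a numeric fourth field are skipped.

```shell
# find_gaps THRESHOLD_SECONDS FILE...
# Hypothetical helper: report every pair of consecutive log lines whose
# timestamps (4th colon-separated field, assumed to be unix time) are
# more than THRESHOLD_SECONDS apart.
find_gaps() {
    threshold="$1"
    shift
    awk -F: -v thr="$threshold" '
        # only consider lines whose 4th field looks like a unix timestamp
        NF >= 4 && $4 ~ /^[0-9]+(\.[0-9]+)?$/ {
            ts = $4 + 0
            if (seen && ts - prev > thr) {
                printf "gap of %.3f s before %s:%d\n", ts - prev, FILENAME, FNR
                print "  before: " prevline
                print "  after:  " $0
            }
            prev = ts
            prevline = $0
            seen = 1
        }
    ' "$@"
}
```

For example, `find_gaps 1 lustre_debug*.log` would list every pause longer than one second together with the line just before and just after it, which is where I'd start looking.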
