Hi all,

Recently we switched from ZFS to ldiskfs as the backing filesystem to work
around some performance issues, and I'm finding that when I put the cluster
under load (with as little as a single client) I can almost completely lock
up the client.  SSH (even existing sessions) stalls, and iostat, top, etc.
all freeze for 20 to 200 seconds.  The stalls ease for small windows and
recur for as long as the I/O-generating process exists.  The client shows
extremely high CPU and RAM usage, and the CPU time appears to be consumed
almost exclusively by 'system'-tagged work.  This is on 2.14.0, but I've
reproduced it on more or less the head of master-next.  If I do direct I/O,
performance is fantastic and I have no such CPU/memory pressure issues.
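
For concreteness, the load doesn't need to be anything exotic -- a buffered dd
along these lines is enough to trigger it, while the oflag=direct variant is
fine (path and sizes here are illustrative, not my exact invocation):

  # buffered write: this is what drives the client into heavy reclaim
  dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=16384

  # direct I/O variant: fast, no CPU/memory pressure problems
  dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=16384 oflag=direct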

Uname: Linux 85df894e-8458-4aa4-b16f-1d47154c0dd2-lclient-a0-g0-vm 
5.4.0-1065-azure #68~18.04.1-Ubuntu SMP Fri Dec 3 14:08:44 UTC 2021 x86_64 
x86_64 x86_64 GNU/Linux

In dmesg I see consistent spew on the client:
[19548.601651] LustreError: 30918:0:(events.c:208:client_bulk_callback()) event 
type 1, status -5, desc 00000000b69b83b0
[19548.662647] LustreError: 30917:0:(events.c:208:client_bulk_callback()) event 
type 1, status -5, desc 000000009ef2fc22
[19549.153590] Lustre: lustrefs-OST0000-osc-ffff8d52a9c52800: Connection to 
lustrefs-OST0000 (at 10.1.98.7@tcp) was lost; in progress operations using this 
service will wait for recovery to complete
[19549.153621] Lustre: 30927:0:(client.c:2282:ptlrpc_expire_one_request()) @@@ 
Request sent has failed due to network error: [sent 1642535831/real 1642535833] 
 req@0000000002361e2d x1722317313374336/t0(0) 
o4->lustrefs-OST0001-osc-ffff8d52a9c52800@10.1.98.10@tcp:6/4 lens 488/448 e 0 
to 1 dl 1642535883 ref 2 fl Rpc:eXQr/0/ffffffff rc 0/-1 job:''
[19549.153623] Lustre: 30927:0:(client.c:2282:ptlrpc_expire_one_request()) 
Skipped 4 previous similar messages

But I actually think this is a symptom of the extreme memory pressure causing
the client to time things out, not a cause.
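
The way I've been correlating the two is nothing fancy -- roughly just watching
reclaim activity next to the OSC import state (the invocations below are
approximate):

  # memory/reclaim activity, 1-second samples
  vmstat 1

  # per-OSC import state, to see whether the reconnects line up with the stalls
  lctl get_param osc.*.import | grep state: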

Testing with obdfilter-survey (run locally) on the OSS side shows the expected
performance of the disk subsystem.  Testing with lnet_selftest from client to
OSS shows the expected network performance.  In neither case do I see the high
CPU usage or memory pressure issues.
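
For anyone who wants to sanity-check the lnet_selftest methodology, the run was
roughly along these lines (the NIDs below are placeholders rather than my exact
groups, and lnet_selftest needs to be loaded on both ends first):

  modprobe lnet_selftest
  export LST_SESSION=$$
  lst new_session bulk_check
  lst add_group clients 10.1.98.4@tcp     # client NID (placeholder)
  lst add_group servers 10.1.98.7@tcp     # OSS NID (placeholder)
  lst add_batch bulk
  lst add_test --batch bulk --from clients --to servers brw write size=1M
  lst run bulk
  lst stat clients servers
  lst end_session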

Reducing a variety of lctl tunables that appear to govern memory allowances
for Lustre clients does not improve the situation.  By all appearances, a
running iozone or even a simple dd process gradually (i.e., over a span of
just 10 seconds or so) consumes all 16GB of RAM on the client I'm using.
I've generated bcc profile graphs for both on- and off-CPU analysis, and they
are utterly boring -- they basically just reflect rampant calls to
shrink_inactive_list resulting from page_cache_alloc in the presence of
extreme memory pressure.
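
For reference, these are the sort of tunables I've been reducing, plus the bcc
commands behind the profiles (values are just examples of what I tried, and the
tool paths assume the upstream bcc layout -- Ubuntu's bpfcc-tools package names
them profile-bpfcc / offcputime-bpfcc instead):

  # client-side cache / dirty-data caps (example values, not a recommendation)
  lctl set_param llite.*.max_cached_mb=1024
  lctl set_param osc.*.max_dirty_mb=64
  lctl set_param osc.*.max_rpcs_in_flight=8

  # 30-second on-CPU and off-CPU profiles, folded output for flame graphs
  /usr/share/bcc/tools/profile -f 30 > on_cpu.folded
  /usr/share/bcc/tools/offcputime -f 30 > off_cpu.folded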

Any suggestions on the best path to debugging this are very welcome.

Best,

ellis