Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-02-14 Thread Ellis Wilson via lustre-discuss

Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-20 Thread Raj via lustre-discuss
Ellis, I would also check the peer_credits setting between the server and the
client.  They should be the same.
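
For example, something along these lines shows it on each node (a sketch;
lnetctl output details vary by version, and the module-parameter path assumes
the socket LND since the NIDs here are @tcp):

~# lnetctl net show -v
~# cat /sys/module/ksocklnd/parameters/peer_credits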


Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-19 Thread Patrick Farrell via lustre-discuss
Ellis,

As you may have guessed, that function set just looks like a node which is 
doing buffered I/O and thrashing for memory.  No particular insight is available 
from the count of functions there.

Would you consider opening a bug report in the Whamcloud JIRA?  You should have 
enough for a good report; here are a few things that would be helpful as well:

It sounds like you can hang the node on demand.  If you could collect stack 
traces with:

echo t > /proc/sysrq-trigger

after creating the hang, that would be useful.  (It will print to dmesg.)
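
Once the node is responsive again, something along these lines captures and
compresses the traces (a sketch; the output path is arbitrary):

~# dmesg -T > /tmp/sysrq-stacks.txt
~# xz /tmp/sysrq-stacks.txt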

You've also collected debug logs; could you include, say, the last 100 MiB of 
that log set?  That should be reasonable to attach if compressed.
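
If the logs are sitting in the kernel debug buffer, a sketch of dumping them and
trimming to the last 100 MiB (paths arbitrary):

~# lctl debug_kernel /tmp/lustre-debug.log
~# tail -c 100M /tmp/lustre-debug.log | xz > /tmp/lustre-debug-last100M.log.xz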

Regards,
Patrick


From: lustre-discuss  on behalf of 
Ellis Wilson via lustre-discuss 
Sent: Wednesday, January 19, 2022 8:32 AM
To: Andreas Dilger 
Cc: lustre-discuss@lists.lustre.org 
Subject: Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O 
(2.14/2.15)


Hi Andreas,



Apologies in advance for the top-post.  I’m required to use Outlook for work, 
and it doesn’t handle in-line or bottom-posting well.



Client-side defaults prior to any tuning of mine (this is a very minimal 
1-client, 1-MDS/MGS, 2-OSS cluster):

~# lctl get_param llite.*.max_cached_mb

llite.lustrefs-8d52a9c52800.max_cached_mb=

users: 5

max_cached_mb: 7748

used_mb: 0

unused_mb: 7748

reclaim_count: 0

~# lctl get_param osc.*.max_dirty_mb

osc.lustrefs-OST-osc-8d52a9c52800.max_dirty_mb=1938

osc.lustrefs-OST0001-osc-8d52a9c52800.max_dirty_mb=1938

~# lctl get_param osc.*.max_rpcs_in_flight

osc.lustrefs-OST-osc-8d52a9c52800.max_rpcs_in_flight=8

osc.lustrefs-OST0001-osc-8d52a9c52800.max_rpcs_in_flight=8

~# lctl get_param osc.*.max_pages_per_rpc

osc.lustrefs-OST-osc-8d52a9c52800.max_pages_per_rpc=1024

osc.lustrefs-OST0001-osc-8d52a9c52800.max_pages_per_rpc=1024



Thus far I’ve reduced the following to what I felt were really conservative 
values for a 16GB RAM machine:



~# lctl set_param llite.*.max_cached_mb=1024

llite.lustrefs-8d52a9c52800.max_cached_mb=1024

~# lctl set_param osc.*.max_dirty_mb=512

osc.lustrefs-OST-osc-8d52a9c52800.max_dirty_mb=512

osc.lustrefs-OST0001-osc-8d52a9c52800.max_dirty_mb=512

~# lctl set_param osc.*.max_pages_per_rpc=128

osc.lustrefs-OST-osc-8d52a9c52800.max_pages_per_rpc=128

osc.lustrefs-OST0001-osc-8d52a9c52800.max_pages_per_rpc=128

~# lctl set_param osc.*.max_rpcs_in_flight=2

osc.lustrefs-OST-osc-8d52a9c52800.max_rpcs_in_flight=2

osc.lustrefs-OST0001-osc-8d52a9c52800.max_rpcs_in_flight=2



This slows how quickly I get to a near-OOM state from under 10 seconds to more 
like 25 seconds, but the trend is identical.



As an example of what I’m seeing on the client: below, we start with most memory 
free, then iozone rapidly (within ~10 seconds) causes all memory to be marked 
used.  That stabilizes at about 140 MB free until, at some point, the client 
stalls for 20 or more seconds and some memory finally gets synced out:

~# dstat --mem

--memory-usage-

used  free  buff  cach

1029M 13.9G 2756k  215M

1028M 13.9G 2756k  215M

1028M 13.9G 2756k  215M

1088M 13.9G 2756k  215M

2550M 11.5G 2764k 1238M

3989M 10.1G 2764k 1236M

5404M 8881M 2764k 1239M

6831M 7453M 2772k 1240M

8254M 6033M 2772k 1237M

9672M 4613M 2772k 1239M

10.6G 3462M 2772k 1240M

12.1G 1902M 2772k 1240M

13.4G  582M 2772k 1240M

13.9G  139M 2488k 1161M

13.9G  139M 1528k 1174M

13.9G  140M  896k 1175M

13.9G  139M  676k 1176M

13.9G  142M  528k 1177M

13.9G  140M  484k 1188M

13.9G  139M  492k 1188M

13.9G  139M  488k 1188M

13.9G  141M  488k 1186M

13.9G  141M  480k 1187M

13.9G  139M  492k 1188M

13.9G  141M  600k 1188M

13.9G  139M  580k 1187M

13.9G  140M  536k 1186M

13.9G  141M  668k 1186M

13.9G  139M  580k 1188M

13.9G  140M  568k 1187M

12.7G 1299M 2064k 1197M missed 20 ticks <-- client is totally unresponsive 
during this time

11.0G 2972M 5404k 1238M^C



Additionally, I’ve messed with sysctl settings.  Defaults:

vm.dirty_background_bytes = 0

vm.dirty_background_ratio = 10

vm.dirty_bytes = 0

vm.dirty_expire_centisecs = 3000

vm.dirty_ratio = 20

vm.dirty_writeback_centisecs = 500



Revised to conservative values:

vm.dirty_background_bytes = 1073741824

vm.dirty_background_ratio = 0

vm.dirty_bytes = 2147483648

vm.dirty_expire_centisecs = 200

vm.dirty_ratio = 0

vm.dirty_writeback_centisecs = 500
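
For reference, a sketch of applying these at runtime (an /etc/sysctl.d drop-in
works as well).  Note that writing the *_bytes forms makes the kernel zero the
corresponding *_ratio values, which is why the ratios read 0 above:

~# sysctl -w vm.dirty_background_bytes=1073741824
~# sysctl -w vm.dirty_bytes=2147483648
~# sysctl -w vm.dirty_expire_centisecs=200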



No observed improvement.



I’m going to trawl two logs today side by side, one with ldiskfs backing the 
OSTs and one with ZFS backing the OSTs, and see what the differences are, since 
the ZFS-backed version never gave us this problem.  The only other potentially 
useful thing I can share right now is that when I turned on full debug logging 
and ran the test until I hit OOM, the following were the most frequently hit 
functions in the logs (count, descending, is the firs

Re: [lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-19 Thread Ellis Wilson via lustre-discuss
 don't 
hesitate to ask.

Best,

ellis

From: Andreas Dilger 
Sent: Tuesday, January 18, 2022 9:54 PM
To: Ellis Wilson 
Cc: lustre-discuss@lists.lustre.org
Subject: [EXTERNAL] Re: [lustre-discuss] Lustre Client Lockup Under Buffered 
I/O (2.14/2.15)

On Jan 18, 2022, at 13:40, Ellis Wilson via lustre-discuss 
<lustre-discuss@lists.lustre.org> wrote:


Reducing a variety of lctl tunables that appear to govern memory allowances for 
Lustre clients does not improve the situation.

What have you reduced here?  llite.*.max_cached_mb, osc.*.max_dirty_mb, 
osc.*.max_rpcs_in_flight and osc.*.max_pages_per_rpc?


By all appearances, the running iozone or even simple dd processes gradually 
(i.e., over a span of just 10 seconds or so) consume all 16 GB of RAM on the 
client I'm using.  I've generated bcc profile graphs for both on- and off-CPU 
analysis, and they are utterly boring -- they basically just reflect rampant 
calls to shrink_inactive_list resulting from page_cache_alloc under extreme 
memory pressure.

We have seen some issues like this that are being looked at, but this is mostly 
only seen on smaller VM clients used in testing and not larger production 
clients.  Are you able to test with more RAM on the client?  Have you tried 
with 2.12.8 installed on the client?

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud


[lustre-discuss] Lustre Client Lockup Under Buffered I/O (2.14/2.15)

2022-01-18 Thread Ellis Wilson via lustre-discuss
Hi all,

Recently we've switched from using ZFS to ldiskfs as the backing filesystem to 
work around some performance issues, and I'm finding that when I put the cluster 
under load (with as little as a single client) I can almost completely lock up 
the client.  SSH (even existing sessions) stalls; iostat, top, etc. all freeze 
for 20 to 200 seconds.  The symptoms ease for small windows and recur as long as 
I leave the I/O-generating process in existence.  The client reports extremely 
high CPU and RAM usage, and appears to be consumed exclusively by 'system'-tagged 
work.  This is on 2.14.0, but I've reproduced it on more or less HOL for 
master-next.  If I do direct I/O, performance is fantastic and I have no such 
issues regarding CPU/memory pressure.
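
As a rough sketch of the reproducer (not the exact command line; the /mnt/lustre
mount point and the sizes here are placeholders):

~# dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=16384               # buffered: client locks up
~# dd if=/dev/zero of=/mnt/lustre/testfile bs=1M count=16384 oflag=direct  # direct I/O: no issue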

Uname: Linux 85df894e-8458-4aa4-b16f-1d47154c0dd2-lclient-a0-g0-vm 
5.4.0-1065-azure #68~18.04.1-Ubuntu SMP Fri Dec 3 14:08:44 UTC 2021 x86_64 
x86_64 x86_64 GNU/Linux

In dmesg I see consistent spew on the client about:
[19548.601651] LustreError: 30918:0:(events.c:208:client_bulk_callback()) event 
type 1, status -5, desc b69b83b0
[19548.662647] LustreError: 30917:0:(events.c:208:client_bulk_callback()) event 
type 1, status -5, desc 9ef2fc22
[19549.153590] Lustre: lustrefs-OST-osc-8d52a9c52800: Connection to 
lustrefs-OST (at 10.1.98.7@tcp) was lost; in progress operations using this 
service will wait for recovery to complete
[19549.153621] Lustre: 30927:0:(client.c:2282:ptlrpc_expire_one_request()) @@@ 
Request sent has failed due to network error: [sent 1642535831/real 1642535833] 
 req@02361e2d x1722317313374336/t0(0) 
o4->lustrefs-OST0001-osc-8d52a9c52800@10.1.98.10@tcp:6/4 lens 488/448 e 0 
to 1 dl 1642535883 ref 2 fl Rpc:eXQr/0/ rc 0/-1 job:''
[19549.153623] Lustre: 30927:0:(client.c:2282:ptlrpc_expire_one_request()) 
Skipped 4 previous similar messages

But I actually think this is a symptom of extreme memory pressure causing the 
client to time out things, not a cause.

Testing with obdfilter-survey (local) on the OSS side shows expected 
performance of the disk subsystem.  Testing with lnet_selftest from client to 
OSS shows expected performance.  In neither case do I see the high CPU or 
memory pressure issues.
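
For reference, a standard lnet_selftest recipe looks roughly like this
(a sketch; <client-nid> is a placeholder for the client's @tcp NID, and the
sizes and durations are arbitrary):

~# modprobe lnet_selftest
~# export LST_SESSION=$$
~# lst new_session bulk_rw
~# lst add_group clients <client-nid>@tcp
~# lst add_group servers 10.1.98.7@tcp
~# lst add_batch bulk
~# lst add_test --batch bulk --from clients --to servers brw write size=1M
~# lst run bulk
~# lst stat servers        # interrupt after it has printed for a while
~# lst end_session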

Reducing a variety of lctl tunables that appear to govern memory allowances for 
Lustre clients does not improve the situation.  By all appearances, the running 
iozone or even simple dd processes gradually (i.e., over a span of just 10 
seconds or so) consume all 16 GB of RAM on the client I'm using.  I've 
generated bcc profile graphs for both on- and off-CPU analysis, and they are 
utterly boring -- they basically just reflect rampant calls to 
shrink_inactive_list resulting from page_cache_alloc under extreme 
memory pressure.
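
For reference, that was roughly the standard bcc + FlameGraph flow (a sketch;
the tool paths, durations, and FlameGraph checkout location are assumptions):

~# /usr/share/bcc/tools/profile -F 99 -f 30 > on-cpu.folded
~# /usr/share/bcc/tools/offcputime -f 30 > off-cpu.folded
~# ./FlameGraph/flamegraph.pl on-cpu.folded > on-cpu.svg
~# ./FlameGraph/flamegraph.pl off-cpu.folded > off-cpu.svg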

Any suggestions on the best path to debugging this are very welcome.

Best,

ellis
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org