On 2011-02-11, at 11:09, Michael Kluge wrote:
> But it does not give you PIDs or user names? Or is there a way to find these 
> with standard lustre tools?

I think for most purposes the req_history should be enough to identify the 
problem node; a simple "ps" on that node, or a look at the job scheduler, 
would then identify the offending process.

However, if process-level tracking is needed, it is also possible on either 
the client or the server, using the RPCTRACE functionality of the Lustre 
kernel debug logs.

client# lctl set_param debug=+rpctrace
{wait to collect some logs}
client# lctl dk /tmp/debug.cli
client# less /tmp/debug.cli
:
:
00000100:00100000:1:1297449409.192077:0:32392:0:(client.c:2095:ptlrpc_queue_wait
()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:028fd87f-1865-3915-a864-fc829a4d
7a4c:32392:x1359928498575499:0@lo:37
:

This lists the names of the fields being printed in the RPCTRACE message.  We 
are particularly interested in the "pname" and "pid" fields, and maybe "opc" 
(opcode).  This shows that "ls", pid 32392, is sending an opcode 37 request 
(MDS_READPAGE, per lustre/include/lustre/lustre_idl.h).
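If you decode opcodes often, a tiny lookup helper saves a trip to the header 
file.  This is just a sketch, not a Lustre tool: the function name "opc_name" 
is made up, and only the opcodes mentioned in this thread are included; the 
authoritative list is the enum in lustre/include/lustre/lustre_idl.h.

```shell
# Hypothetical helper: map a few RPC opcode numbers to their names.
# Only opcodes discussed in this thread are listed; see
# lustre/include/lustre/lustre_idl.h for the full enum.
opc_name() {
        case "$1" in
        3)  echo OST_READ ;;
        4)  echo OST_WRITE ;;
        37) echo MDS_READPAGE ;;
        *)  echo "opc_$1" ;;        # unknown: echo the raw number back
        esac
}

opc_name 37     # prints MDS_READPAGE
```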

This RPC is identified with xid "x1359928498575499" on client UUID 
"028fd87f-1865-3915-a864-fc829a4d7a4c".  The xid is not guaranteed to be 
unique between clients, but is effectively unique in most debug logs.
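To see which processes on a client are generating the most RPCs, the "Sending 
RPC" lines can be tallied with awk.  A minimal sketch, assuming the debug log 
has been dumped with "lctl dk"; the sample file below just contains the 
(unwrapped) line quoted above, so the counts are trivial:

```shell
#!/bin/sh
# Tally rpctrace "Sending RPC" lines by process name, pid, and opcode.
# /tmp/debug.cli.sample stands in for a real "lctl dk /tmp/debug.cli" dump.
cat > /tmp/debug.cli.sample <<'EOF'
00000100:00100000:1:1297449409.192077:0:32392:0:(client.c:2095:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:028fd87f-1865-3915-a864-fc829a4d7a4c:32392:x1359928498575499:0@lo:37
EOF

# The last whitespace-separated token is "pname:cluuid:pid:xid:nid:opc",
# so split it on ":" and keep pname (f[1]), pid (f[3]), and opc (last field).
awk '/Sending RPC/ {
        n = split($NF, f, ":")
        print f[1], f[3], "opc", f[n]
}' /tmp/debug.cli.sample | sort | uniq -c | sort -nr
```

On a busy client this yields a sorted top-talkers list per process, analogous 
to the per-client req_history summary on the server.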

On the server we can see the same RPC in the debug logs:

server# lctl dk /tmp/debug.mds
server# grep ":x1359928498575499:" /tmp/debug.mds
00000100:00100000:1:1297449409.192178:0:5174:0:(service.c:1276:ptlrpc_server_log
_handling_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_mdt_rdpg_0
0:028fd87f-1865-3915-a864-fc829a4d7a4c+6:32392:x1359928498575499:12345-0@lo:37

Here we see that the RPC for this xid and client pid was processed by a 
service thread.  Note that the server-side debug log records the name of the 
thread handling the RPC ("ll_mdt_rdpg_00") rather than the client process 
name, but the client pid is carried through.

In any case, it is definitely possible to track down this information just from 
the server in a variety of ways.
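The client/server correlation above can be scripted: pull the xids a given 
client pid sent, then look each one up in the server log.  A minimal sketch 
under the same assumptions as before — the sample files below hold the two 
(unwrapped) log lines from this message, standing in for real "lctl dk" dumps:

```shell
#!/bin/sh
# Correlate client RPCs with the server-side debug log by xid.
cat > /tmp/debug.cli.sample <<'EOF'
00000100:00100000:1:1297449409.192077:0:32392:0:(client.c:2095:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:028fd87f-1865-3915-a864-fc829a4d7a4c:32392:x1359928498575499:0@lo:37
EOF
cat > /tmp/debug.mds.sample <<'EOF'
00000100:00100000:1:1297449409.192178:0:5174:0:(service.c:1276:ptlrpc_server_log_handling_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_mdt_rdpg_00:028fd87f-1865-3915-a864-fc829a4d7a4c+6:32392:x1359928498575499:12345-0@lo:37
EOF

# For every xid sent by the client process of interest (pid 32392),
# print the matching server-side "Handling RPC" line.
for xid in $(awk '/Sending RPC/ { n = split($NF, f, ":")
                                  if (f[3] == 32392) print f[4] }' \
             /tmp/debug.cli.sample)
do
        grep ":${xid}:" /tmp/debug.mds.sample
done
```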

Adding a job identifier and possibly a rank number to the Lustre RPC messages 
is definitely something that we've thought about, but it would need help from 
userspace (MPI, job scheduler, etc.) in order to be useful, so it hasn't been 
done yet.

> Am 11.02.2011 17:34, schrieb Andreas Dilger:
>> 
>> On the OSS and MDS nodes there are per-client statistics that allow this 
>> kind of tracking.  They can be seen in 
>> /proc/fs/lustre/obdfilter/*/exports/*/stats for detailed information (e.g. 
>> broken down by RPC type, bytes read/written), or 
>> /proc/fs/lustre/ost/OSS/*/req_history to just get a dump of the recent RPCs 
>> sent by each client.
>> 
>> A little script was discussed in the thread "How to determine which lustre 
>> clients are loading filesystem" (2010-07-08):
>> 
>>> Another way that I heard some sites were doing this is to use the "rpc 
>>> history".  They may already have a script to do this, but the basics are 
>>> below:
>>> 
>>> oss# lctl set_param ost.OSS.*.req_buffer_history_max=10240
>>> {wait a few seconds to collect some history}
>>> oss# lctl get_param ost.OSS.*.req_history
>>> 
>>> This will give you a list of the past (up to) 10240 RPCs for the "ost_io" 
>>> RPC service, which is what you are observing the high load on:
>>> 
>>> 3436037:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957534353:448:Complete:1278612656:0s(-6s)
>>>  opc 3
>>> 3436038:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957536190:448:Complete:1278615489:1s(-41s)
>>>  opc 3
>>> 3436039:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957536193:448:Complete:1278615490:0s(-6s)
>>>  opc 3
>>> 
>>> This output is in the format:
>>> 
>>> identifier:target_nid:source_nid:rpc_xid:rpc_size:rpc_status:arrival_time:service_time(deadline)
>>>  opcode
>>> 
>>> Using some shell scripting, one can find the clients sending the most RPC 
>>> requests:
>>> 
>>> oss# lctl get_param ost.OSS.*.req_history | tr ":" " " | cut -d" " -f3,9,10 
>>> | sort | uniq -c | sort -nr | head -20
>>> 
>>> 
>>>   3443 12345-192.168.20.159@tcp opc 3
>>>   1215 12345-192.168.20.157@tcp opc 3
>>>    121 12345-192.168.20.157@tcp opc 4
>>> 
>>> This will give you a sorted list of the top 20 clients that are sending the 
>>> most RPCs to the ost and ost_io services, along with the operation being 
>>> done (3 = OST_READ, 4 = OST_WRITE, etc. see 
>>> lustre/include/lustre/lustre_idl.h).
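For reference, the quoted pipeline can be exercised offline against a few 
sample req_history lines.  This is only a sketch — the sample file path is 
made up, and on a real OSS you would feed it "lctl get_param 
ost.OSS.*.req_history" instead:

```shell
#!/bin/sh
# Replay the quoted top-talkers pipeline on canned req_history lines
# (format: id:target_nid:source_nid:xid:size:status:arrival:service(deadline) opc N).
cat > /tmp/req_history.sample <<'EOF'
3436037:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957534353:448:Complete:1278612656:0s(-6s) opc 3
3436038:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957536190:448:Complete:1278615489:1s(-41s) opc 3
3436039:192.168.20.1@tcp:12345-192.168.20.157@tcp:x1340648957536193:448:Complete:1278615490:0s(-6s) opc 4
EOF

# After tr, field 3 is the client nid and fields 9,10 are "opc N",
# so this counts RPCs per (client, opcode) pair, busiest first.
tr ":" " " < /tmp/req_history.sample |
        cut -d" " -f3,9,10 | sort | uniq -c | sort -nr | head -20
```

With this sample the first line of output shows client 192.168.20.159 sending 
two OST_READ (opc 3) requests, and .157 sending one OST_WRITE (opc 4).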


Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.



_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
