Heya,

> If you are interested to do a tiny bit of hacking, it would be interesting to
> do an experiment to see what kind of performance can be gotten in your
> benchmark by a single client. Currently, Lustre limits each client to a
> single filesystem-modifying metadata operation at one time, in order to
> prevent the clients from overwhelming the server, and to ensure that the
> clients can recover the filesystem correctly in case of a server crash.

I just tested this. Before that, I tried an out-of-tree build. My four clients use nfsroot, so I put the kernel source on it, then mounted Lustre on /mnt/lustre and compiled in /mnt/lustre/build (with make O=).

The results (without your patch) are interesting:

  7 min 42 against 9 min 7 before, with -j 4
  4 min 51 against 5 min 34 with -j 8
  3 min 27 against 4 min 19 with -j 16

I also use -pipe as a gcc option, to avoid temp files.
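
To put the timings above in perspective, here is a small Python sketch that converts them to seconds and computes the relative reduction in wall-clock time. The numbers are copied from the results above. The labels "earlier" and "out_of_tree" are only illustrative names, and reading "9m7" as 9 minutes 7 seconds is an assumption.

```python
# Wall-clock build times quoted above, as (minutes, seconds), keyed by make -j value.
# "earlier" is the previous result; "out_of_tree" is the make O= build on Lustre.
# Both key names are illustrative labels, not anything from Lustre or kbuild.
timings = {
    4: {"earlier": (9, 7), "out_of_tree": (7, 42)},
    8: {"earlier": (5, 34), "out_of_tree": (4, 51)},
    16: {"earlier": (4, 19), "out_of_tree": (3, 27)},
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair to a total number of seconds."""
    return minutes * 60 + seconds


def reduction_percent(earlier, later):
    """Percentage of wall-clock time saved going from `earlier` to `later`."""
    e = to_seconds(*earlier)
    l = to_seconds(*later)
    return 100.0 * (e - l) / e


for jobs in sorted(timings):
    t = timings[jobs]
    pct = reduction_percent(t["earlier"], t["out_of_tree"])
    print(f"-j {jobs}: {pct:.1f}% less wall-clock time")
```

The same helper can be applied to the with-patch and without-patch timings to check whether a difference of a few seconds is meaningful relative to the total build time.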
So, my first question is: would it be possible in some way to disable cache coherency on some subdirectory? If I know all the files in this directory will only be accessed read-only, I do not need coherency. That would allow reading the files from Lustre instead of NFS.

I then tried with your patch, not much difference:

  4 min 43 against 4 min 51 without it (-j 8)
  7 min 40 against 7 min 42 with -j 8

So it changes almost nothing :)

> I'm not sure if it makes a difference in your case or not, but increasing the
> MDC RPCs in flight might also help performance. Also, increasing the client
> cache size and the number of IO RPCs may also help. On the clients run:
>
> lctl set_param *.*.max_rpcs_in_flight=64
> lctl set_param osc.*.max_dirty_mb=512

No change.

> You may also test running the make directly on the MDS with a local Lustre
> mount to determine if the network latency is a significant factor in the
> performance. If you are using Ethernet instead of IB the latency could be
> hurting you, since kernel compiles are generally only doing a tiny amount of
> work per file and then you need to send a few RPCs to open and read the next
> file and the headers. Some of this can be hidden by pre-reading all of the
> files into the client caches (new machines should have enough RAM, about 1GB
> or so), but the "open" operations still need to send an RPC to the MDS for
> each file open, so running on the MDS (or with a low-latency network like IB)
> may help compiles like this run more quickly.

We don't have IB set up at the moment, so I cannot test with it. I will try directly on the MDS (so on only one node) to compare.

Regards,

Maxence

--
Maxence DUNNEWIND
Contact : [email protected]
Site : http://www.dunnewind.net
06 32 39 39 93
GPG : 18AE 61E4 D0B0 1C7C AAC9 E40D 4D39 68DB 0D2E B533
_______________________________________________ Lustre-discuss mailing list [email protected] http://lists.lustre.org/mailman/listinfo/lustre-discuss
