Ken, the OSTs need to track the ownership of objects for quota. The more stripes there are on a file, the more RPCs need to be sent, which is why we don't recommend wide striping unless there is a reason for it (bandwidth, size, etc.).
Cheers, Andreas

On 2011-05-20, at 7:49 AM, Ken Hornstein <[email protected]> wrote:
> So I guess there are some things I _still_ don't understand about Lustre
> metadata handling. Specifically, what metadata gets stored on OSTs and
> why.
>
> What brings this all up is that a) we have users who have lots of files
> and b) we recently are going through some reorganization that requires
> changing the groups on lots of these files (this is all running Lustre
> 1.8.4; we're due for an upgrade in the medium future).
>
> I figured okay, this wouldn't be so bad, since those are all metadata
> server operations. But I started running some tests, and I found out
> that chown() system calls perform poorly.
>
> Because I was doing some previous metadata performance analysis, I took
> a source code tree which consists of approximately 50,000 files and put
> two copies in one of our Lustre filesystems: one with the default striping
> (across all OSTs) and one where all files have no striping at all. The
> performance difference between these two trees for stat() calls is large,
> as you can imagine, but the disparity between the chown() calls is even
> larger. You can run chgrp on all of the files in the no-striped copy in
> about 3-5 seconds, but the striped copy takes more than 50 seconds.
>
> I did some more digging as to why this is. I thought maybe at first that
> this is an issue on the client, but there is code in there that skips
> over talking to the OSTs for certain types of metadata updates, and turning
> on debugging on the client verifies that no setattr RPCs are being sent
> to the OSSes. Looking more closely at the RPC traces reveals that the issue
> is on the metadata server; the setattr RPCs simply take longer when the
> files are striped.
>
> I've looked at the metadata server code for a bit, and I've verified
> that the metadata server does send setattr RPCs to the OSSes, but I see
> that it's done asynchronously; it shouldn't be waiting for the
> replies. So I'm stumped as to why this is happening. I also realize
> that I'm still puzzled as to what metadata is stored on the OSTs; it seems
> like the client prefers the metadata from the MDS (except of course for
> size), but a fair amount of metadata is still stored on the OSSes. Can
> anyone shed some light on this?
>
> --Ken
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
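
For anyone who wants to repeat the comparison Ken describes (chgrp over a striped tree versus an unstriped copy), here is a minimal sketch of a timing helper. It is not taken from the thread: the function name time_chgrp is hypothetical, and it simply walks a directory and changes the group of every entry with os.lchown, leaving the owner untouched. It measures wall-clock time on the client only and says nothing about where the time is spent on the MDS or OSSes.

```python
import os
import time


def time_chgrp(root, gid):
    """Change the group of every entry below root and time the whole pass.

    Hypothetical helper for illustration. Returns (entries_changed, seconds).
    The uid argument of lchown is -1, which leaves the owner unchanged.
    """
    count = 0
    start = time.perf_counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lchown does not follow symlinks, so each entry is touched once.
            os.lchown(os.path.join(dirpath, name), -1, gid)
            count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed
```

A run would call time_chgrp once on each copy of the tree with the numeric ID of a group the caller belongs to, then compare the two elapsed times. The tree paths and the group are site-specific and are not given in the thread.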
