Sorry, I didn't get it the first time. You say "the program that issues the lstat/stat/fstat from userspace is only inquiring about a single file at a time". By "the program" do you mean samba or ls in our case?
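Whichever program it is, the serial pattern under discussion is the same: a directory lister walks the entries and issues one blocking lstat() per file. A minimal sketch of that loop (this is an illustration of the pattern, not Lustre or samba code; the function name is made up):

```python
# Sketch of the serial loop that 'ls -l' (or samba answering a directory
# listing) effectively performs: one blocking lstat() per entry. On a
# Lustre client, each lstat() that needs OST attributes waits for its own
# glimpse RPC round trip before the next call can start.
import os

def list_long(path="."):
    results = []
    for entry in os.listdir(path):
        full = os.path.join(path, entry)
        st = os.lstat(full)  # blocks until size/attrs are known
        results.append((entry, st.st_size))
    return results

if __name__ == "__main__":
    print(len(list_long(".")))
```

Nothing in this loop overlaps requests, which is why the per-file round trip dominates regardless of how many OSTs or OSSes exist.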
Or is it 'Windows Explorer' that is triggering a 'stat' on each file on the samba server?

Regards,
Indivar Nair

On Mon, Sep 12, 2011 at 9:09 AM, Indivar Nair <[email protected]> wrote:

> So what you are saying is - the OSC (stat) issues the 1st RPC to the 1st
> OST, waits for its response, then issues the 2nd RPC to the 2nd OST, and
> so on and so forth until it 'stat's all 2000 files. That would be slow :-(.
>
> Why does Lustre do it this way, while everywhere else it tries for
> extreme parallelization?
> Would patching lstat/stat/fstat to parallelize requests only when
> accessing a Lustre store be possible?
>
> Regards,
>
> Indivar Nair
>
> On Mon, Sep 12, 2011 at 6:38 AM, Jeremy Filizetti <[email protected]> wrote:
>
>>> From Adrian's explanation, I gather that the OSC generates 1 RPC to
>>> each OST for each file. Since there is only 1 OST in each of the 4
>>> OSSes, we only get 128 simultaneous RPCs. So listing 2000 files would
>>> only get us that much speed, right?
>>
>> There is no concurrency in fetching these attributes, because the
>> program that issues the lstat/stat/fstat from userspace is only
>> inquiring about a single file at a time. So every RPC costs a minimum of
>> one round-trip-time of network latency between the client and an OSS,
>> assuming the statahead thread has fetched the MDS attributes and the OSS
>> has cached inode structures (ignoring a few other small additions). So
>> if you have 2000 files in a directory and an average network latency of
>> 150 us per glimpse RPC (which I've seen for cached inodes on the OSS),
>> you have a best case of 2000 * 0.000150 = 0.3 seconds. Without cached
>> inodes, disk latency on the OSS will make that time far longer and less
>> predictable.
>>
>>> Now, each OST is around 4.5 TB in size. So say we reduce the disk size
>>> to 1.125 TB but increase the number of OSTs to 4; then we would get
>>> 4 OSTs x 32 RPCs = 128 RPC connections to each OSS, and 512
>>> simultaneous RPCs across the Lustre storage. Wouldn't this increase the
>>> listing speed four times over?
>>
>> The only hope for speeding this up is probably a code change to
>> implement an async glimpse thread or bulkstat/readdirplus, where Lustre
>> could fetch attributes before userspace requests them so they would be
>> locally cached.
>>
>> Jeremy
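To make the arithmetic in the thread concrete, a back-of-envelope model: the 2000-file directory and the 150 us cached-inode glimpse latency are figures quoted above; the 32-RPCs-in-flight number is a hypothetical illustration of what a statahead/async-glimpse approach could overlap, not a measured value.

```python
# Back-of-envelope model of serial vs. hypothetically overlapped glimpse RPCs.
FILES = 2000
RTT = 150e-6  # seconds per glimpse RPC (cached-inode best case from the thread)

# Serial case: stat is issued one file at a time, so RPC latencies add up.
serial = FILES * RTT
print(f"serial best case: {serial:.3f} s")  # ~0.3 s, as computed in the thread

# Hypothetical overlap: if an async glimpse/statahead mechanism kept, say,
# 32 RPCs in flight, the round trips would overlap and wall time would shrink.
IN_FLIGHT = 32  # assumed concurrency, for illustration only
parallel = FILES * RTT / IN_FLIGHT
print(f"with {IN_FLIGHT} RPCs in flight: {parallel * 1000:.2f} ms")
```

This also shows why adding OSTs alone does not help: the bottleneck is that the client never has more than one stat outstanding, not a shortage of RPC slots.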
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
