Okay, I was going through the whole mail trail again and got answers to my
last set of questions from your (Jeremy's) earlier mail -
Q: And how does the patch work then? If a request is for only 1 file stat,
how does multiple pages in readdir() help?

Ans: 'The only hope for speeding this up is probably a code change to
implement async glimpse thread or bulkstat/readdirplus where Lustre could
fetch attributes before userspace requests them so they would be locally
cached.' Plus setting 'vfs_cache_pressure=0' or to a very low number is the
solution.

Thanks,
Regards,

Indivar Nair

On Mon, Sep 12, 2011 at 11:07 AM, Indivar Nair <[email protected]> wrote:

> So this is how the flow is -
>
> Windows Explorer requests statistics of a single file in the dir -> Samba
> initiates a 'stat' call on the file -> MDC initiates an RPC request to the
> MDT; gets a response -> OSC initiates an RPC to the OST; gets a response
> -> Response given back to stat / Samba -> Samba sends the statistics back
> to Explorer.
>
> Hmmm..., doing this 2000 times is going to take a long time.
> And there is no way we can fix Explorer to do a bulk stat request :-(.
>
> So the only option is to get Lustre to respond faster to individual
> requests.
> Is there any way to increase the size and TTL of the file metadata cache
> on the MDTs and OSTs?
> And how does the patch work then? If a request is for only 1 file stat,
> how does multiple pages in readdir() help?
>
> Regards,
>
> Indivar Nair
>
> On Mon, Sep 12, 2011 at 10:31 AM, Jeremy Filizetti
> <[email protected]> wrote:
>
>> On Sep 12, 2011 12:27 AM, "Indivar Nair" <[email protected]>
>> wrote:
>> >
>> > Sorry, didn't get it the first time -
>> > You say 'the program that issues the lstat/stat/fstat from userspace
>> > is only inquiring about a single file at a time'. By 'the program' you
>> > mean 'samba' or 'ls' in our case.
>> >
>> Correct, those programs are issuing the syscalls.
>>
>> > Or is it the 'Windows Explorer' that is triggering a 'stat' on each
>> > file on the samba server?
>> >
>> Win explorer is sending the requests to samba, and samba just issues the
>> syscall to retrieve the information and sends it back.
>>
>> > Regards,
>> >
>> > Indivar Nair
>> >
>> > On Mon, Sep 12, 2011 at 9:09 AM, Indivar Nair
>> > <[email protected]> wrote:
>> >>
>> >> So what you are saying is - the OSC (stat) issues the 1st RPC to the
>> >> 1st OST, waits for its response, then issues the 2nd RPC to the 2nd
>> >> OST, and so on and so forth till it 'stat's all the 2000 files. That
>> >> would be slow :-(.
>> >>
>> >> Why does Lustre do it this way, while everywhere else it tries to do
>> >> extreme parallelization?
>> >> Would patching lstat/stat/fstat to parallelize requests only when
>> >> accessing a Lustre store be possible?
>> >>
>> >> Regards,
>> >>
>> >> Indivar Nair
>> >>
>> >> On Mon, Sep 12, 2011 at 6:38 AM, Jeremy Filizetti
>> >> <[email protected]> wrote:
>> >>>
>> >>>> From Adrian's explanation, I gather that the OSC generates 1 RPC to
>> >>>> each OST for each file. Since there is only 1 OST in each of the 4
>> >>>> OSSes, we only get 128 simultaneous RPCs. So listing 2000 files
>> >>>> would only get us that much speed, right?
>> >>>
>> >>> There is no concurrency in fetching these attributes because the
>> >>> program that issues the lstat/stat/fstat from userspace is only
>> >>> inquiring about a single file at a time. So every RPC becomes a
>> >>> minimum of one round-trip-time network latency between the client
>> >>> and an OSS, assuming the statahead thread fetched the MDS attributes
>> >>> and the OSS has cached inode structures (ignoring a few other small
>> >>> additions). So if you have 2000 files in a directory and an avg
>> >>> network latency of 150 us for a glimpse RPC (which I've seen for
>> >>> cached inodes on the OSS), you have a best case of
>> >>> 2000 * 0.000150 = 0.3 seconds. Without cached inodes, disk latency
>> >>> on the OSS will make that time far longer and less predictable.
>> >>>
>> >>>> Now, each of the OSTs is around 4.5 TB in size.
>> >>>> So say we reduce the disk size to 1.125 TB but increase the number
>> >>>> of OSTs to 4; then we would get 4 OSTs x 32 RPCs = 128 RPC
>> >>>> connections to each OSS, and 512 simultaneous RPCs across the
>> >>>> Lustre storage. Wouldn't this increase the listing speed four times
>> >>>> over?
>> >>>
>> >>> The only hope for speeding this up is probably a code change to
>> >>> implement an async glimpse thread or bulkstat/readdirplus, where
>> >>> Lustre could fetch attributes before userspace requests them so they
>> >>> would be locally cached.
>> >>>
>> >>> Jeremy
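[Editor's note: Jeremy's serial-RPC arithmetic above can be sketched as a
quick back-of-the-envelope calculation. The function name is illustrative,
not from Lustre; the numbers (2000 files, 150 us per cached-inode glimpse
RPC) are the ones quoted in the thread.]

```python
# Sketch of the serial glimpse-RPC cost from the thread: each stat()
# waits for its own round trip, so the RPCs are issued one at a time.
def serial_stat_time(num_files, rpc_latency_s):
    """Best-case wall time when every stat blocks on one RPC round trip."""
    return num_files * rpc_latency_s

# 2000 files at 150 us per glimpse RPC (cached inodes on the OSS):
best_case = serial_stat_time(2000, 150e-6)
print(f"best case: {best_case:.3f} s")  # prints "best case: 0.300 s"

# Hypothetical 4-way concurrency (the parallelization the thread asks
# about) would divide the wall time by the concurrency factor:
print(f"4-way concurrent: {best_case / 4:.3f} s")
```

This is why the directory listing scales linearly with file count rather
than with the number of OSTs: without concurrency, adding OSTs does not
shorten the serial chain of round trips.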
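[Editor's note: the two tunables named in the thread can be set roughly as
below. `vm.vfs_cache_pressure` is the standard Linux sysctl; the statahead
window parameter `llite.*.statahead_max` exists on Lustre clients of this
era, but the exact name and valid range may vary by Lustre version, so
treat this as a sketch and check your own client.]

```shell
# Make the kernel retain dentry/inode caches longer so repeated directory
# listings hit the client cache. The thread suggests 0 or a very low
# number; 0 disables reclaim of these caches entirely, so use with care.
sysctl -w vm.vfs_cache_pressure=10

# Widen the Lustre client statahead window so the statahead thread
# prefetches MDS attributes ahead of the application's stat() calls.
lctl set_param llite.*.statahead_max=128
```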
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
