Mark Seger wrote:
>
>
> Tom.Wang wrote:
>> Hi, Mark
>> Mark Seger wrote:
>>> I recently got an email from a collectl/lustre user who reported I'm
>>> not handling all the data for an MDS properly and showed me the
>>> following from his stats file:
>>>
>>> snapshot_time 1217282595.543548 secs.usecs
>>> req_waittime 170215294 samples [usec] 0 1992338 23146560348 30473383620091658
>>> req_qdepth 170215294 samples [reqs] 0 9444 124827689 348732141299
>>> req_active 170215294 samples [reqs] 1 127 352457248 5984384772
>>> req_timeout 170215294 samples [sec] 1 301 442496292 28785551992
>>> reqbuf_avail 410936563 samples [bufs] 128 1024 420313410439 430054594284593
>>> ldlm_flock_enqueue 4 samples [reqs] 1 1 4 4
>>> ldlm_ibits_enqueue 139193009 samples [reqs] 1 1 139193009 139193009
>>> mds_reint_create 11018837 samples [reqs] 1 1 11018837 11018837
>>> mds_reint_link 51315 samples [reqs] 1 1 51315 51315
>>> mds_reint_setattr 395400 samples [reqs] 1 1 395400 395400
>>> mds_reint_rename 224241 samples [reqs] 1 1 224241 224241
>>> mds_reint_unlink 13109877 samples [reqs] 1 1 13109877 13109877
>>> mds_getattr 49739 samples [usec] 10 271559 7792733 655474619433
>>> mds_connect 2373 samples [usec] 12 1300137 5532759 3705296256699
>>> mds_disconnect 291 samples [usec] 18 3047 76662 88841758
>>> mds_getstatus 899 samples [usec] 6 41 9824 114260
>>> mds_statfs 4090 samples [usec] 5 79121 1205331 45689777475
>>> mds_sync 1190 samples [usec] 1843 5014449 236091207 450545148159775
>>> mds_quotactl 2049 samples [usec] 7 883434 7687131 3232395885317
>>> mds_getxattr 36089 samples [usec] 9 8996 675208 252525110
>>> mds_setxattr 1230 samples [usec] 123 10110 263367 225741995
>>> obd_ping 6124258 samples [usec] 0 30366 67518884 4513131130
>>>
>>> and I've never heard of all these 'reint' variables, nor have I seen
>>> many of the others either, and so thought they were recently added.
>>> The interesting thing is I have 1.6.5.1 installed and my stats file
>>> shows:
>>>
>>> snapshot_time 1217344821.777664 secs.usecs
>>> req_waittime 5500 samples [usec] 6 399 57198 831384
>>> req_qdepth 5500 samples [reqs] 0 2 16 18
>>> req_active 5500 samples [reqs] 1 2 5514 5542
>>> req_timeout 5500 samples [sec] 1 10 5572 6292
>>> reqbuf_avail 12650 samples [bufs] 511 512 6475895 3315195785
>>> ldlm_ibits_enqueue 1646 samples [reqs] 1 1 1646 1646
>>> mds_reint_unlink 8 samples [reqs] 1 1 8 8
>>> mds_getattr 17 samples [usec] 11 14567 15655 212882791
>>> mds_connect 24 samples [usec] 33 1418 2948 2142530
>>> mds_getstatus 17 samples [usec] 8 15 165 1657
>>> mds_statfs 55 samples [usec] 7 48 762 15484
>>> mds_sync 800 samples [usec] 803 40698 2188133 15429580807
>>> obd_ping 2933 samples [usec] 6 48 48843 1130361
>> The MDS_REINT and LDLM request information in /proc has been detailed
>> since 1.6.5; see bug 14184.
> I don't think so. It looks like a bunch of patches and random
> comments. Are we both talking about
> https://bugzilla.lustre.org/show_bug.cgi?id=14184 ?
> The first thing I did was to see what it said about those 5 reint
> counters, so I searched for "mds_reint_" and found none of them.
> Perhaps there are some clues to the meanings of some of the counters,
> but certainly not all of them.

What I mean is that the original MDS_REINT request counter has been
divided into 5 sub-request counters (that is what I meant by "detailed";
sorry about the confusion). In the original implementation there was only
one "mds_reint" request counter in the stats file. We changed that into
five specific mds_reint request counters, so the MDS stats now show which
kind of reint request has been handled. I am not sure whether the Lustre
manual describes all of these stats; you might check that.
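
For anyone reading these stats files from a script, here is a minimal Python sketch of pulling the per-type reint counters out of the text. It assumes only what the quoted files show: each counter line is "name count samples [unit] ...", and snapshot_time has a different shape. The function names and the EXAMPLE text (a few lines copied from the 1.6.5.1 file quoted above) are illustrative, not part of Lustre or collectl.

```python
from typing import Dict


def parse_sample_counts(stats_text: str) -> Dict[str, int]:
    """Return {counter_name: sample_count} for every 'samples' line."""
    counts: Dict[str, int] = {}
    for line in stats_text.splitlines():
        fields = line.split()
        # snapshot_time and blank lines lack the 'samples' keyword, so skip them.
        if len(fields) >= 3 and fields[2] == "samples":
            counts[fields[0]] = int(fields[1])
    return counts


def reint_counters(counts: Dict[str, int]) -> Dict[str, int]:
    """Keep only the per-type reint counters (mds_reint_*)."""
    return {name: n for name, n in counts.items() if name.startswith("mds_reint_")}


# A few lines copied from the 1.6.5.1 stats file quoted in this thread.
EXAMPLE = """\
snapshot_time 1217344821.777664 secs.usecs
req_waittime 5500 samples [usec] 6 399 57198 831384
ldlm_ibits_enqueue 1646 samples [reqs] 1 1 1646 1646
mds_reint_unlink 8 samples [reqs] 1 1 8 8
mds_getattr 17 samples [usec] 11 14567 15655 212882791
"""

if __name__ == "__main__":
    print(reint_counters(parse_sample_counts(EXAMPLE)))
```

Matching on the "mds_reint_" prefix rather than a fixed list of names means a stats file that contains only some of the five counters is still handled.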
>
> Also, you didn't answer my question about what types of operations
> will cause each of the different counters to increment.

Hmm, I assume you are only asking about the mds_reint requests? These
counters are incremented when the MDS identifies the request and is ready
to handle it. So:

mds_reint_unlink: unlink a file
mds_reint_create: mknod or mkdir
mds_reint_link: link a file
mds_reint_setattr: chmod/chacl or other setattr metadata operations
mds_reint_rename: rename a file

I hope I did not miss your questions this time.

> Would it be easier and more beneficial to other users to move this to
> lustre-discuss?

Yes.

>
> -mark
>
>

Thanks
WangDi

--
Regards,
Tom Wangdi
--
Sun Lustre Group
System Software Engineer
http://www.sun.com

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
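
As a rough illustration of how a monitoring tool could use the counter-to-operation list from this thread, here is a small Python sketch that reports how many reint requests of each type arrived between two snapshots. The descriptions are copied from the list in the message above. The function name and the input dictionaries are hypothetical. Treating a counter that is missing from a snapshot as zero is an assumption, based on the 1.6.5.1 file in this thread showing only mds_reint_unlink.

```python
from typing import Dict

# Counter name -> client operation, as listed in this thread.
REINT_OPERATIONS: Dict[str, str] = {
    "mds_reint_unlink": "unlink a file",
    "mds_reint_create": "mknod or mkdir",
    "mds_reint_link": "link a file",
    "mds_reint_setattr": "chmod/chacl or other setattr metadata operations",
    "mds_reint_rename": "rename a file",
}


def reint_deltas(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Return the increase in each reint sample count between two snapshots.

    Assumption: a counter missing from a snapshot has not been incremented
    yet, so it is treated as 0. Non-reint counters are ignored.
    """
    deltas: Dict[str, int] = {}
    for name in REINT_OPERATIONS:
        delta = current.get(name, 0) - previous.get(name, 0)
        if delta > 0:
            deltas[name] = delta
    return deltas


if __name__ == "__main__":
    # Hypothetical sample counts from two reads of the stats file.
    before = {"mds_reint_unlink": 8}
    after = {"mds_reint_unlink": 11, "mds_reint_create": 2, "mds_getattr": 20}
    for name, delta in reint_deltas(before, after).items():
        print(f"{name}: +{delta} ({REINT_OPERATIONS[name]})")
```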