>> By the way, are the llog files you mentioned virtual or real? If they are
>> real, where are they located? Need I clean them manually?
They are real; the location is O/1/...

lustre/utils/llog_reader ./changelog_catalog.dmp
rec #1 type=1064553b len=64
Header size : 8192
Time : Mon Dec 7 15:44:37 2015
Number of records: 1
Target uuid :
-----------------------
#01 (064)ogen=0 name=0x8:1
...

I dumped and checked the file; its location is derived from the name in the
record.

debugfs: dump O/1/d8/8 plain.llog
lustre/utils/llog_reader ./plain.llog
rec #1 type=10660000 len=96 offset 8192
Header size : 8192
Time : Mon Dec 7 15:46:40 2015
Number of records: 1
Target uuid :
-----------------------
#01 (096)changelog record id:0x0 cr_flags:0x1000 cr_type:CREAT(0x1)

It looks like O/1/ is used for llog files only.

On Mon, Dec 7, 2015 at 4:55 AM, wanglu <[email protected]> wrote:

> Hi Alexander,
>
> Before I received this reply, I deregistered the cl1 user. It took a very
> long time, and I am not sure whether it finished successfully, since the
> server crashed once the next morning.
> Then I moved the old changelog_catalog file away and created a zero-length
> changelog_user file instead.
> This is what I got from the old changelog_catalog file:
> # ls -l /tmp/changelog.dmp
> -rw-r--r-- 1 root root 4153280 Dec 6 06:54 /tmp/changelog.dmp
> # llog_reader changelog.dmp |grep "type=1064553b" |wc -l
> 63432
> This number is smaller than 64768; I am not sure whether that is related to
> the unfinished deregistration.
>
> The first record number is 1 and the last record number is 64767. I think
> there may be some skipped record numbers:
> # llog_reader changelog.dmp |grep "type=1064553b" |head -n 1
> rec #1 type=1064553b len=64
> # llog_reader changelog.dmp |grep "type=1064553b" |tail -n 1
> rec #64767 type=1064553b len=64
> # llog_reader changelog.dmp |grep "^rec" | grep -v "type=1064553b"
> returns 0 lines.
>
> By the way, are the llog files you mentioned virtual or real? If they are
> real, where are they located? Need I clean them manually?
>
> Thanks,
> Lu,Wang
>
> *From:* Alexander Boyko <[email protected]>
> *Date:* 2015-12-04 21:36
> *To:* wanglu <[email protected]>; lustre-discuss
> <[email protected]>
> *Subject:* RE [lustre-discuss] No free catalog slots for log (Lustre
> 2.5.3 & Robinhood 2.5.3)
>
>> Here are 4 questions for which we cannot find answers in LU-1586:
>>
>> 1. According to Andres's reply, there should be some unconsumed
>> changelog files on our MDT, and these files have taken all the space (file
>> quotas?) Lustre gives to the changelog. With Lustre 2.1, these files are
>> under the OBJECTS directory and can be listed in ldiskfs mode. In our case,
>> with Lustre 2.5.3, no OBJECTS directory can be found. In this case, how
>> can we monitor the situation before the unconsumed changelogs take up all
>> the disk space?
>>
> The changelog is based on one catalog file and a set of plain llog files.
> The catalog stores a limited number of records, about 64768. Each catalog
> record is 64 bytes and holds information about a plain llog file. A plain
> llog file stores records about IO operations; it holds about 64768 records
> of varying size. So the changelog can store about 64768^2 IO operations, and
> it occupies filesystem space. The error "no free catalog slots" occurs when
> the changelog catalog has no slot left to store a record for a new plain
> llog. Either all slots are filled, or the internal changelog markers have
> become inconsistent and the internal logic no longer works.
> To get closer to the root cause, you need to dump the changelog catalog and
> check its bitmap: are there free slots? Something like:
>
> debugfs -R "dump changelog_catalog changelog_catalog.dmp" /dev/md55 &&
> used=`llog_reader changelog_catalog.dmp | grep "type=1064553b" | wc -l`
>
>> 2. Why are there so many unconsumed changelogs? Could it be related to
>> our frequent remounts of the MDT (abort_recovery mode)?
>>
> A umount operation creates a half-empty plain llog file, and changelog_clear
> can't remove it even if all slots are freed. Only a new mount can remove
> that file. It could be related or not.
>
>> 3. When we remount the MDT, robinhood is still running. Why can't robinhood
>> consume those old changelogs after the MDT service has recovered?
>> 4. Why is there a huge difference between the current index (4199610352)
>> and the cl1 index (49035933)?
>>
>> Thank you for your time and help!
>>
>> Wang,Lu
>>
>
> --
> Alexander Boyko
> Seagate
> www.seagate.com
>

--
Alexander Boyko
Seagate
www.seagate.com
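
P.S. For anyone repeating the catalog check from this thread, here is a rough Python sketch of the same counting, with a check for skipped record numbers added. It only parses text that llog_reader has already printed and does not call any Lustre tool or API. The function name summarize_catalog is hypothetical, the line format is assumed from the "rec #N type=... len=..." lines quoted above, and the capacity of 64768 is the approximate figure mentioned in the thread, not a value read from the catalog header. The free-slot number is only an estimate; the catalog bitmap remains the real source of truth.

```python
import re

# Record type that the thread greps for in the catalog dump.
CATALOG_REC_TYPE = "1064553b"
# Approximate slot count quoted in the thread; an assumption, not read from the file.
CATALOG_CAPACITY = 64768

# Assumed line shape, taken from the llog_reader output quoted in the thread:
#   rec #1 type=1064553b len=64
REC_LINE = re.compile(r"^rec #(\d+)\s+type=([0-9a-fA-F]+)\s+len=(\d+)")


def summarize_catalog(llog_reader_output, capacity=CATALOG_CAPACITY):
    """Count catalog records and report gaps in their record numbers."""
    indexes = []
    for line in llog_reader_output.splitlines():
        match = REC_LINE.match(line.strip())
        if match and match.group(2).lower() == CATALOG_REC_TYPE:
            indexes.append(int(match.group(1)))

    indexes.sort()
    present = set(indexes)
    missing = []
    if indexes:
        # Record numbers between the first and last seen that never appear.
        missing = [i for i in range(indexes[0], indexes[-1] + 1)
                   if i not in present]

    return {
        "used": len(indexes),
        "first": indexes[0] if indexes else None,
        "last": indexes[-1] if indexes else None,
        "missing_in_range": len(missing),
        # Rough estimate only: assumed capacity minus records found.
        "free_estimate": capacity - len(indexes),
    }
```

To use it, save the llog_reader output to a text file and pass the file's contents to summarize_catalog. With the numbers reported earlier in the thread (63432 records numbered from 1 to 64767), and assuming no record number is repeated, missing_in_range would come out as 64767 - 63432 = 1335.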
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
