As I expected, no new info:

[57291.417113] NILFS warning: mounting unchecked fs
[57291.417125] NILFS: nilfs_search_super_root
[57291.417132] NILFS: pseg_start 115603456, seg_seq 4347757
[57291.417137] NILFS: cno 1252241, segnum 56447
[62768.349242] NILFS: recovery complete.
[62768.351804] segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
So, I suppose, nilfs searches for a super root for a very long time and then finds out that the filesystem is consistent.
--------------------------------------------------
Александров Сергей Васильевич


2012/11/27 Сергей Александров <[email protected]>:
> Sorry, I don't have another multi-TB volume, and this one is 1 TB.
> But I can do a split/resync on md to restore the broken state.
>
> I'll send the full log as soon as possible.
> --------------------------------------------------
> Александров Сергей Васильевич
>
>
> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>> On Tue, 2012-11-27 at 10:47 +0300, Сергей Александров wrote:
>>> Hi,
>>>
>>> Yes, it's not the full output; it covers only about 2 minutes.
>>> I can get a full one, but after that the partition will apparently be
>>> checked and it will be hard to reproduce.
>>> I can also split the md device so that another attempt is possible; should I?
>>
>> If you have enough disk space, it is possible to take a full image of
>> the volume and thereby preserve the issue-reproducing volume state: the
>> replica can then be mounted as a loop device at any time. Could you do that?
>>
>> I think we need the whole output of the mount operation for the issue
>> analysis anyway.
>>
>> With the best regards,
>> Vyacheslav Dubeyko.
>>
>>> --------------------------------------------------
>>> Александров Сергей Васильевич
>>>
>>>
>>> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>>> > Hi,
>>> >
>>> > On Mon, 2012-11-26 at 22:17 +0300, Сергей Александров wrote:
>>> >> Hi,
>>> >>
>>> >> Unfortunately, the only output I saw is:
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068533] NILFS warning: mounting unchecked fs
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068546] NILFS: nilfs_search_super_root
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068552] NILFS: pseg_start 115603456, seg_seq 4347757
>>> >> Nov 26 15:54:00 router kernel: [ 5808.068558] NILFS: cno 1252241, segnum 56447
>>> >>
>>> >
>>> > Sorry, the patch is working: the messages about searching for the
>>> > super root are its debug output. But I expected more information in
>>> > this output. Did you interrupt the mount operation before it finished?
>>> >
>>> > As I understand it, you don't have output for the whole mount
>>> > operation. Am I correct? Could you share the whole debug output from
>>> > the beginning of the mount to the end?
>>> >
>>> > Or is this all the output you had during the long mount operation?
>>> >
>>> >> I suppose the last three lines are due to the patch.
>>> >>
>>> >> Dynamic debug was on:
>>> >> router dynamic_debug # cat control | grep nilfs
>>> >> fs/nilfs2/recovery.c:851 [nilfs2]nilfs_search_super_root =p "NILFS: cno %lld, segnum %lld\012"
>>> >> fs/nilfs2/recovery.c:850 [nilfs2]nilfs_search_super_root =p "NILFS: pseg_start %lld, seg_seq %lld\012"
>>> >> fs/nilfs2/recovery.c:849 [nilfs2]nilfs_search_super_root =p "NILFS: nilfs_search_super_root\012"
>>> >> fs/nilfs2/recovery.c:766 [nilfs2]nilfs_salvage_orphan_logs =p "NILFS: nilfs_salvage_orphan_logs\012"
>>> >> fs/nilfs2/recovery.c:614 [nilfs2]nilfs_do_roll_forward =p "NILFS: seg_start %lld, seg_end %lld\012"
>>> >> fs/nilfs2/recovery.c:613 [nilfs2]nilfs_do_roll_forward =p "NILFS: segnum %lld\012"
>>> >> fs/nilfs2/recovery.c:612 [nilfs2]nilfs_do_roll_forward =p "NILFS: pseg_start %lld, seg_seq %lld\012"
>>> >> fs/nilfs2/recovery.c:611 [nilfs2]nilfs_do_roll_forward =p "NILFS: nilfs_do_roll_forward\012"
>>> >> fs/nilfs2/recovery.c:442 [nilfs2]nilfs_prepare_segment_for_recovery =p "NILFS: ri_segnum %lld, ri_nextnum %lld\012"
>>> >> fs/nilfs2/recovery.c:440 [nilfs2]nilfs_prepare_segment_for_recovery =p "NILFS: ns_segnum %lld, ns_nextnum %lld\012"
>>> >> fs/nilfs2/recovery.c:438 [nilfs2]nilfs_prepare_segment_for_recovery =p "NILFS: nilfs_prepare_segment_for_recovery\012"
>>> >> fs/nilfs2/recovery.c:719 [nilfs2]nilfs_finish_roll_forward =p "NILFS: nilfs_finish_roll_forward\012"
>>> >
>>> > This is simply the list of known debug message formats.
>>> >
>>> > With the best regards,
>>> > Vyacheslav Dubeyko.
>>> >
>>> >> --------------------------------------------------
>>> >> Александров Сергей Васильевич
>>> >>
>>> >>
>>> >> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>>> >> > Hi,
>>> >> >
>>> >> > On Nov 26, 2012, at 7:19 PM, Сергей Александров wrote:
>>> >> >
>>> >> >> The bug finally appeared again. Here are the trace log and kernel log.
>>> >> >> I've hard-rebooted the machine, so the FS is in the unchecked
>>> >> >> state for now and it is possible to run some other tests.
>>> >> >> I'll manage to do without this partition for a day, maybe two.
>>> >> >>
>>> >> >
>>> >> > Thank you for the logs. But what about my patch with the debug
>>> >> > output? The output of the patch can be very helpful for the issue
>>> >> > analysis: it is needed to understand what segments are processed
>>> >> > during mount. Could you share the debug output of my patch?
>>> >> >
>>> >> > With the best regards,
>>> >> > Vyacheslav Dubeyko.
>>> >> >
>>> >> >>
>>> >> >> --------------------------------------------------
>>> >> >> Александров Сергей Васильевич
>>> >> >>
>>> >> >>
>>> >> >> 2012/11/16 Сергей Александров <[email protected]>:
>>> >> >>> Ok, thanks. I'll try to apply it. I've also turned on function
>>> >> >>> graph tracing; maybe the graph for the init_nilfs function will help.
>>> >> >>> --------------------------------------------------
>>> >> >>> Александров Сергей Васильевич
>>> >> >>>
>>> >> >>>
>>> >> >>> 2012/11/16 Vyacheslav Dubeyko <[email protected]>:
>>> >> >>>> On Fri, 2012-11-16 at 10:11 +0300, Сергей Александров wrote:
>>> >> >>>>> dmesg:
>>> >> >>>>> [53994.254432] NILFS warning: mounting unchecked fs
>>> >> >>>>> [56686.968229] NILFS: recovery complete.
>>> >> >>>>> [56686.969316] segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
>>> >> >>>>>
>>> >> >>>>> messages:
>>> >> >>>>> Nov 15 10:57:06 router kernel: [53994.254432] NILFS warning: mounting unchecked fs
>>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.968229] NILFS: recovery complete.
>>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.969316] segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
>>> >> >>>>>
>>> >> >>>>> Maybe there is some kernel config option to get more debug output?
>>> >> >>>>>
>>> >> >>>>
>>> >> >>>> I prepared a small patch (please find it in the attachment). It
>>> >> >>>> simply adds several debug messages to the recovery module. I
>>> >> >>>> suggest applying the patch to your NILFS2 driver and trying to
>>> >> >>>> mount again in the issue environment. I hope these debug
>>> >> >>>> messages can give more information about the issue on your side.
>>> >> >>>>
>>> >> >>>> The patch uses the pr_debug() call. Please see
>>> >> >>>> Documentation/dynamic-debug-howto.txt for more details.
>>> >> >>>>
>>> >> >>>> To be honest, I haven't tested the patch yet. So if you run into
>>> >> >>>> any trouble, please e-mail me.
>>> >> >>>>
>>> >> >>>> With the best regards,
>>> >> >>>> Vyacheslav Dubeyko.
>>> >> >>>>
>>> >> >>>>> As for fsck, I have not found it in the public git repo, so
>>> >> >>>>> where can I get the latest version?
>>> >> >>>>> --------------------------------------------------
>>> >> >>>>
>>> >> >> <kern.log><trace_fail>
>>> >> >
>>> >
>>> >
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
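[Editor's note] For anyone retracing this thread: in the dynamic debug control listing quoted above, the `=p` flag after each call site means that pr_debug() message is enabled (a disabled site shows `=_`). A minimal sketch of checking this with standard tools follows; it uses an inline two-line sample in place of the real `/sys/kernel/debug/dynamic_debug/control` (which requires root, CONFIG_DYNAMIC_DEBUG, and a mounted debugfs on a live system), and the `=_` line in the sample is fabricated for illustration:

```shell
#!/bin/sh
# Sketch: check which nilfs2 pr_debug() call sites are enabled.
# On a live system you would read /sys/kernel/debug/dynamic_debug/control
# (root required); here a two-line sample of that listing stands in for it.
sample='fs/nilfs2/recovery.c:849 [nilfs2]nilfs_search_super_root =p "NILFS: nilfs_search_super_root\012"
fs/nilfs2/recovery.c:611 [nilfs2]nilfs_do_roll_forward =_ "NILFS: nilfs_do_roll_forward\012"'

# Field 3 of each control line is the flag word: "=p" prints, "=_" is off.
enabled=$(printf '%s\n' "$sample" | awk '$3 == "=p" { print $1 }')
printf '%s\n' "$enabled"   # -> fs/nilfs2/recovery.c:849
```

On a live system, `echo 'file fs/nilfs2/recovery.c +p' > /sys/kernel/debug/dynamic_debug/control` enables every pr_debug() in the recovery module at once, per Documentation/dynamic-debug-howto.txt.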
