Re: very large mount time after unxepected power down

Сергей Александров Tue, 27 Nov 2012 00:03:46 -0800

Sorry, I don't have another multi-TB volume.. And this is a 1TB large.
But I can do split/resync on md to restore broken state.


I'll send full log as soon as possible.
--------------------------------------------------
Александров Сергей Васильевич


2012/11/27 Vyacheslav Dubeyko <[email protected]>:
> On Tue, 2012-11-27 at 10:47 +0300, Сергей Александров wrote:
>> Hi,
>>
>> Yes, I it's not a full output, it's an output for abut 2 minutes.
>> I can get a full one, but after that the partition apparently will be
>> checked and it' will be hard to reproduce.
>> I also can split md device, so that there will be another attempt, should I?
>
> If you get enough disk space then it is possible to get full volume
> image. And, as a result, to get issue reproducible volume state by means
> of mount volume replica as loop device anytime. Could you do it?
>
> I think that we need to get whole output of mount operation anyway for
> the issue analysis.
>
> With the best regards,
> Vyacheslav Dubeyko.
>
>> --------------------------------------------------
>> Александров Сергей Васильевич
>>
>>
>> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>> > Hi,
>> >
>> > On Mon, 2012-11-26 at 22:17 +0300, Сергей Александров wrote:
>> >> Hi,
>> >>
>> >> Unfortunately, the only output I saw is:
>> >> Nov 26 15:54:00 router kernel: [ 5808.068533] NILFS warning: mounting
>> >> unchecked fs
>> >> Nov 26 15:54:00 router kernel: [ 5808.068546] NILFS: 
>> >> nilfs_search_super_root
>> >> Nov 26 15:54:00 router kernel: [ 5808.068552] NILFS: pseg_start
>> >> 115603456, seg_seq 4347757
>> >> Nov 26 15:54:00 router kernel: [ 5808.068558] NILFS: cno 1252241, segnum 
>> >> 56447
>> >>
>> >
>> > Sorry, the patch is working. The output about searching super root is
>> > the result of debug output. But I expected more information in this
>> > output. Did you break mount operation before ending?
>> >
>> > As I understand, you have output not for whole mount operation. Am I
>> > correct? Could you share whole debug output from the mount beginning to
>> > the end?
>> >
>> > Or do you share all available output that you had during so long mount
>> > operation?
>> >
>> >> I suppose the last three lines are due to the patch.
>> >>
>> >> Dynamic debug was on:
>> >> router dynamic_debug # cat control | grep nilfs
>> >> fs/nilfs2/recovery.c:851 [nilfs2]nilfs_search_super_root =p "NILFS:
>> >> cno %lld, segnum %lld\012"
>> >> fs/nilfs2/recovery.c:850 [nilfs2]nilfs_search_super_root =p "NILFS:
>> >> pseg_start %lld, seg_seq %lld\012"
>> >> fs/nilfs2/recovery.c:849 [nilfs2]nilfs_search_super_root =p "NILFS:
>> >> nilfs_search_super_root\012"
>> >> fs/nilfs2/recovery.c:766 [nilfs2]nilfs_salvage_orphan_logs =p "NILFS:
>> >> nilfs_salvage_orphan_logs\012"
>> >> fs/nilfs2/recovery.c:614 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>> >> seg_start %lld, seg_end %lld\012"
>> >> fs/nilfs2/recovery.c:613 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>> >> segnum %lld\012"
>> >> fs/nilfs2/recovery.c:612 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>> >> pseg_start %lld, seg_seq %lld\012"
>> >> fs/nilfs2/recovery.c:611 [nilfs2]nilfs_do_roll_forward =p "NILFS:
>> >> nilfs_do_roll_forward\012"
>> >> fs/nilfs2/recovery.c:442 [nilfs2]nilfs_prepare_segment_for_recovery =p
>> >> "NILFS: ri_segnum %lld, ri_nextnum %lld\012"
>> >> fs/nilfs2/recovery.c:440 [nilfs2]nilfs_prepare_segment_for_recovery =p
>> >> "NILFS: ns_segnum %lld, ns_nextnum %lld\012"
>> >> fs/nilfs2/recovery.c:438 [nilfs2]nilfs_prepare_segment_for_recovery =p
>> >> "NILFS: nilfs_prepare_segment_for_recovery\012"
>> >> fs/nilfs2/recovery.c:719 [nilfs2]nilfs_finish_roll_forward =p "NILFS:
>> >> nilfs_finish_roll_forward\012"
>> >
>> > This is simply list of known debug messages format.
>> >
>> > With the best regards,
>> > Vyacheslav Dubeyko.
>> >
>> >> --------------------------------------------------
>> >> Александров Сергей Васильевич
>> >>
>> >>
>> >> 2012/11/27 Vyacheslav Dubeyko <[email protected]>:
>> >> > Hi,
>> >> >
>> >> > On Nov 26, 2012, at 7:19 PM, Сергей Александров wrote:
>> >> >
>> >> >> The bug finally appeared again. Here are the trace log and kernel log.
>> >> >> I've hard rebooted machine so FS is in unchecked state for now and it
>> >> >> is possible to make some other tests.
>> >> >> I'll manage to do without this partition for a day may be two.
>> >> >>
>> >> >
>> >> > Thank you for logs. But what about my patch with debug output? The 
>> >> > output of the patch can be very helpful for the issue analysis. It 
>> >> > needs to understand what segments are processed during mount. Could you 
>> >> > share debug output of my patch?
>> >> >
>> >> > With the best regards,
>> >> > Vyacheslav Dubeyko.
>> >> >
>> >> >>
>> >> >> --------------------------------------------------
>> >> >> Александров Сергей Васильевич
>> >> >>
>> >> >>
>> >> >> 2012/11/16 Сергей Александров <[email protected]>:
>> >> >>> Ok, thanks. I'll try to apply it. I've also turned on function graph
>> >> >>> tracing, may be the graph for init_nilfs function will help.
>> >> >>> --------------------------------------------------
>> >> >>> Александров Сергей Васильевич
>> >> >>>
>> >> >>>
>> >> >>> 2012/11/16 Vyacheslav Dubeyko <[email protected]>:
>> >> >>>> On Fri, 2012-11-16 at 10:11 +0300, Сергей Александров wrote:
>> >> >>>>> dmesg:
>> >> >>>>> [53994.254432] NILFS warning: mounting unchecked fs
>> >> >>>>> [56686.968229] NILFS: recovery complete.
>> >> >>>>> [56686.969316] segctord starting. Construction interval = 5 seconds,
>> >> >>>>> CP frequency < 30 seconds
>> >> >>>>>
>> >> >>>>> messages:
>> >> >>>>> Nov 15 10:57:06 router kernel: [53994.254432] NILFS warning: 
>> >> >>>>> mounting
>> >> >>>>> unchecked fs
>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.968229] NILFS: recovery 
>> >> >>>>> complete.
>> >> >>>>> Nov 15 11:42:02 router kernel: [56686.969316] segctord starting.
>> >> >>>>> Construction interval = 5 seconds, CP frequency < 30 seconds
>> >> >>>>>
>> >> >>>>> May be there is some kernel config option to get more debug output?
>> >> >>>>>
>> >> >>>>
>> >> >>>> I prepared small patch (please, find in attachment). This patch 
>> >> >>>> simply
>> >> >>>> adds several debug messages into recovery module. I suggest to apply 
>> >> >>>> the
>> >> >>>> patch on your NILFS2 driver and try to mount again in the issue
>> >> >>>> environment. I hope that these debug messages can give more 
>> >> >>>> information
>> >> >>>> about issue on your side.
>> >> >>>>
>> >> >>>> The patch uses pr_debug() call. Please, see
>> >> >>>> Documentation/dynamic-debug-howto.txt for more details.
>> >> >>>>
>> >> >>>> To be honest, I don't test the patch yet. So, if you will have any
>> >> >>>> troubles, please, e-mail to me.
>> >> >>>>
>> >> >>>> With the best regards,
>> >> >>>> Vyacheslav Dubeyko.
>> >> >>>>
>> >> >>>>> As for fsck, I have not found it in git public repo, so where can I
>> >> >>>>> get the latest version?
>> >> >>>>> --------------------------------------------------
>> >> >>>>
>> >> >> <kern.log><trace_fail>
>> >> >
>> >
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
>> the body of a message to [email protected]
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: very large mount time after unxepected power down

Reply via email to