Hi,
At Wed, 23 Jun 2010 12:38:56 +0900 (JST),
Ryusuke Konishi wrote:
>
> On Mon, 21 Jun 2010 02:53:10 +0900 (JST), Ryusuke Konishi wrote:
> > On Mon, 21 Jun 2010 01:36:55 +0900, Jiro SEKIBA wrote:
> > > This will sync super blocks in turns instead of syncing duplicate
> > > super blocks at the time. This will help searching valid super root when
> > > super block is written into disk before log is written, which is happen when
> > > barrier-less block devices are unmounted uncleanly.
> > > In the situation, old super block likely points to valid log.
> > >
> > > This patch introduces ns_sbwcount member, which counts how many times super
> > > blocks write back to the disk. Super blocks are asymmetrically synced
> > > based on the counter.
> > >
> > > The patch also introduces new function nilfs_set_log_cursor to advance
> > > log cursor for specified super block. To update both of super block
> > > information, caller of nilfs_commit_super must set the information on both
> > > super blocks.
> > >
> > > Signed-off-by: Jiro SEKIBA <[email protected]>
> >
> > Thank you! Both patches look good to me.
> >
> > Will queue them up for the next merge window.
> >
> > Thanks,
> > Ryusuke Konishi
>
> Umm, I noticed that nilfs_commit_super is called twice when the
> filesystem is unmounted. This is because nilfs_sync_fs() is called
> just before nilfs_put_super() will do unmount jobs.
Ahhh, I see the problem. That's right. Both super blocks would end up with
the same checkpoint if the filesystem is dirty when sync_fs is called.
To solve the problem, I think it may be necessary to control the swapping
of super blocks explicitly.
How about swapping BEFORE writing the super block instead of AFTER?
nilfs_prepare_super would take another argument that controls swapping
of the super blocks, so the caller can decide whether to swap or not.
With this, nilfs_put_super can prepare the super block without swapping.
The old super block, written before sync_fs was called, is then preserved,
because nilfs_put_super overwrites the same super block that sync_fs wrote back.
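
To make the idea concrete, here is a rough user-space sketch, not kernel
code. All names in it (sb_model, nilfs_model, prepare_super_model,
commit_super_model, unmount_model) are made up for illustration, and the
"write" is modelled as just storing a checkpoint number in the copy that
sbp[0] points to. It only assumes that sync_fs swaps and nilfs_put_super
does not.

```c
/* Hypothetical user-space model of the proposal; not the kernel code. */
struct sb_model {
	unsigned long long last_cno;	/* checkpoint number stored in this copy */
};

struct nilfs_model {
	struct sb_model *sbp[2];	/* sbp[0] is the copy that gets written */
};

/* Hypothetical variant of nilfs_prepare_super() with a swap argument. */
static struct sb_model **prepare_super_model(struct nilfs_model *n, int swap)
{
	if (swap) {
		/* swap BEFORE writing, so the older copy becomes sbp[0] */
		struct sb_model *tmp = n->sbp[0];

		n->sbp[0] = n->sbp[1];
		n->sbp[1] = tmp;
	}
	return n->sbp;
}

/* Stand-in for setting the log cursor and committing sbp[0]. */
static void commit_super_model(struct nilfs_model *n, unsigned long long cno)
{
	n->sbp[0]->last_cno = cno;
}

/* Models sync_fs followed by put_super during unmount. */
static void unmount_model(struct nilfs_model *n, unsigned long long cno)
{
	prepare_super_model(n, 1);	/* sync_fs: swap, then write */
	commit_super_model(n, cno);
	prepare_super_model(n, 0);	/* put_super: no swap, same copy again */
	commit_super_model(n, cno);
}
```

In this model, if the two copies hold checkpoints 10 and 9 and the unmount
writes checkpoint 11, only the copy that held 9 is overwritten (twice), and
the copy holding 10 survives as the old super block.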
Updating the protection period needs to be handled carefully.
Hmm, in that case, it can compare the actual checkpoint numbers of the two
super blocks instead of setting it from sbp[1]'s checkpoint each time.
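
Something like the following is what I have in mind for the comparison. It
is only a sketch under my own assumptions: sb_cursor and prot_seq_model are
hypothetical names, and I am assuming that the protection sequence should
come from whichever super block holds the older (smaller) checkpoint number,
regardless of whether it currently sits in sbp[0] or sbp[1].

```c
#include <stddef.h>

/* Hypothetical model of the log cursor fields of one super block copy. */
struct sb_cursor {
	unsigned long long last_cno;	/* checkpoint number */
	unsigned long long last_seq;	/* segment sequence number */
};

/*
 * Return the sequence number to protect from GC: the last_seq of the copy
 * with the older checkpoint.  If there is no second copy, use the first.
 */
static unsigned long long prot_seq_model(const struct sb_cursor *sb0,
					 const struct sb_cursor *sb1)
{
	if (sb1 == NULL || sb0->last_cno <= sb1->last_cno)
		return sb0->last_seq;
	return sb1->last_seq;
}
```

The point is that the result does not depend on the order of the two
arguments, so it stays correct whether or not the super blocks were swapped.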
What do you think?
thanks
regards,
> Come to think of it, it's natural, but seems to wipe out merit of the
> alternated super block writeback scheme.
>
> Seems to need modification of some sort.
>
> Could you take a look at this issue ?
>
> Thanks in advance,
> Ryusuke Konishi
>
> > > ---
> > > fs/nilfs2/nilfs.h | 10 ++++
> > > fs/nilfs2/segment.c | 9 ++-
> > > fs/nilfs2/super.c | 128 ++++++++++++++++++++++++++++++++-----------------
> > > fs/nilfs2/the_nilfs.c | 8 ++-
> > > fs/nilfs2/the_nilfs.h | 17 ++-----
> > > 5 files changed, 110 insertions(+), 62 deletions(-)
> > >
> > > diff --git a/fs/nilfs2/nilfs.h b/fs/nilfs2/nilfs.h
> > > index 649e079..9a9c1eb 100644
> > > --- a/fs/nilfs2/nilfs.h
> > > +++ b/fs/nilfs2/nilfs.h
> > > @@ -107,6 +107,14 @@ enum {
> > > };
> > >
> > > /*
> > > + * commit flags for nilfs_commit_super and nilfs_sync_super
> > > + */
> > > +enum {
> > > + NILFS_SB_COMMIT = 0, /* Commit a super block alternately */
> > > + NILFS_SB_COMMIT_ALL /* Commit both super blocks */
> > > +};
> > > +
> > > +/*
> > > * Macros to check inode numbers
> > > */
> > > #define NILFS_MDT_INO_BITS \
> > > @@ -270,6 +278,8 @@ extern struct nilfs_super_block *
> > > nilfs_read_super_block(struct super_block *, u64, int, struct buffer_head **);
> > > extern int nilfs_store_magic_and_option(struct super_block *,
> > > struct nilfs_super_block *, char *);
> > > +extern void nilfs_set_log_cursor(struct nilfs_super_block *,
> > > + struct the_nilfs *);
> > > extern struct nilfs_super_block **nilfs_prepare_super(struct nilfs_sb_info *);
> > > extern int nilfs_commit_super(struct nilfs_sb_info *, int);
> > > extern int nilfs_attach_checkpoint(struct nilfs_sb_info *, __u64);
> > > diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> > > index 075d7b0..87d2768 100644
> > > --- a/fs/nilfs2/segment.c
> > > +++ b/fs/nilfs2/segment.c
> > > @@ -2408,6 +2408,7 @@ static int nilfs_segctor_construct(struct nilfs_sc_info *sci, int mode)
> > > {
> > > struct nilfs_sb_info *sbi = sci->sc_sbi;
> > > struct the_nilfs *nilfs = sbi->s_nilfs;
> > > + struct nilfs_super_block **sbp;
> > > int err = 0;
> > >
> > > nilfs_segctor_accept(sci);
> > > @@ -2424,9 +2425,11 @@ static int nilfs_segctor_construct(struct nilfs_sc_info *sci, int mode)
> > > nilfs_discontinued(nilfs)) {
> > > down_write(&nilfs->ns_sem);
> > > err = -EIO;
> > > - if (likely(nilfs_prepare_super(sbi)))
> > > - err = nilfs_commit_super(
> > > - sbi, nilfs_altsb_need_update(nilfs));
> > > + sbp = nilfs_prepare_super(sbi);
> > > + if (likely(sbp)) {
> > > + nilfs_set_log_cursor(sbp[0], nilfs);
> > > + err = nilfs_commit_super(sbi, NILFS_SB_COMMIT);
> > > + }
> > > up_write(&nilfs->ns_sem);
> > > }
> > > }
> > > diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> > > index 045b8d7..f5ce0e1 100644
> > > --- a/fs/nilfs2/super.c
> > > +++ b/fs/nilfs2/super.c
> > > @@ -74,6 +74,25 @@ struct kmem_cache *nilfs_btree_path_cache;
> > >
> > > static int nilfs_remount(struct super_block *sb, int *flags, char *data);
> > >
> > > +static void nilfs_set_error(struct nilfs_sb_info *sbi)
> > > +{
> > > + struct the_nilfs *nilfs = sbi->s_nilfs;
> > > + struct nilfs_super_block **sbp;
> > > +
> > > + down_write(&nilfs->ns_sem);
> > > + if (!(nilfs->ns_mount_state & NILFS_ERROR_FS)) {
> > > + nilfs->ns_mount_state |= NILFS_ERROR_FS;
> > > + sbp = nilfs_prepare_super(sbi);
> > > + if (likely(sbp)) {
> > > + sbp[0]->s_state |= cpu_to_le16(NILFS_ERROR_FS);
> > > + if (sbp[1])
> > > + sbp[1]->s_state |= cpu_to_le16(NILFS_ERROR_FS);
> > > + nilfs_commit_super(sbi, NILFS_SB_COMMIT_ALL);
> > > + }
> > > + }
> > > + up_write(&nilfs->ns_sem);
> > > +}
> > > +
> > > /**
> > > * nilfs_error() - report failure condition on a filesystem
> > > *
> > > @@ -90,7 +109,6 @@ void nilfs_error(struct super_block *sb, const char *function,
> > > const char *fmt, ...)
> > > {
> > > struct nilfs_sb_info *sbi = NILFS_SB(sb);
> > > - struct nilfs_super_block **sbp;
> > > va_list args;
> > >
> > > va_start(args, fmt);
> > > @@ -100,18 +118,7 @@ void nilfs_error(struct super_block *sb, const char *function,
> > > va_end(args);
> > >
> > > if (!(sb->s_flags & MS_RDONLY)) {
> > > - struct the_nilfs *nilfs = sbi->s_nilfs;
> > > -
> > > - down_write(&nilfs->ns_sem);
> > > - if (!(nilfs->ns_mount_state & NILFS_ERROR_FS)) {
> > > - nilfs->ns_mount_state |= NILFS_ERROR_FS;
> > > - sbp = nilfs_prepare_super(sbi);
> > > - if (likely(sbp)) {
> > > - sbp[0]->s_state |= cpu_to_le16(NILFS_ERROR_FS);
> > > - nilfs_commit_super(sbi, 1);
> > > - }
> > > - }
> > > - up_write(&nilfs->ns_sem);
> > > + nilfs_set_error(sbi);
> > >
> > > if (nilfs_test_opt(sbi, ERRORS_RO)) {
> > > printk(KERN_CRIT "Remounting filesystem read-only\n");
> > > @@ -179,7 +186,7 @@ static void nilfs_clear_inode(struct inode *inode)
> > > nilfs_btnode_cache_clear(&ii->i_btnode_cache);
> > > }
> > >
> > > -static int nilfs_sync_super(struct nilfs_sb_info *sbi, int dupsb)
> > > +static int nilfs_sync_super(struct nilfs_sb_info *sbi, int flag)
> > > {
> > > struct the_nilfs *nilfs = sbi->s_nilfs;
> > > int err;
> > > @@ -205,6 +212,12 @@ static int nilfs_sync_super(struct nilfs_sb_info *sbi, int dupsb)
> > > printk(KERN_ERR
> > > "NILFS: unable to write superblock (err=%d)\n", err);
> > > if (err == -EIO && nilfs->ns_sbh[1]) {
> > > + /*
> > > + * sbp[0] points to newer log than sbp[1],
> > > + * so copy sbp[0] to sbp[1] to take over sbp[0].
> > > + */
> > > + memcpy(nilfs->ns_sbp[1], nilfs->ns_sbp[0],
> > > + nilfs->ns_sbsize);
> > > nilfs_fall_back_super_block(nilfs);
> > > goto retry;
> > > }
> > > @@ -219,11 +232,20 @@ static int nilfs_sync_super(struct nilfs_sb_info *sbi, int dupsb)
> > >
> > > /* update GC protection for recent segments */
> > > if (nilfs->ns_sbh[1]) {
> > > - sbp = NULL;
> > > - if (dupsb) {
> > > + sbp = nilfs->ns_sbp[1];
> > > + if (flag == NILFS_SB_COMMIT_ALL) {
> > > set_buffer_dirty(nilfs->ns_sbh[1]);
> > > - if (!sync_dirty_buffer(nilfs->ns_sbh[1]))
> > > - sbp = nilfs->ns_sbp[1];
> > > + if (sync_dirty_buffer(nilfs->ns_sbh[1]))
> > > + sbp = NULL; /* not update prot_seq */
> > > + } else {
> > > + int flip_bits = (nilfs->ns_sbwcount & 0x0FL);
> > > + nilfs->ns_sbwcount++;
> > > + /*
> > > + * flip super blocks 9 to 7 ratio.
> > > + * unflip when LSB 4bits are 0x08 or 0x0F
> > > + */
> > > + if (flip_bits != 0x08 && flip_bits != 0x0F)
> > > + nilfs_swap_super_block(nilfs);
> > > }
> > > }
> > > if (sbp) {
> > > @@ -245,50 +267,58 @@ struct nilfs_super_block **nilfs_prepare_super(struct nilfs_sb_info *sbi)
> > > if (sbp[0]->s_magic != cpu_to_le16(NILFS_SUPER_MAGIC)) {
> > > if (sbp[1] &&
> > > sbp[1]->s_magic == cpu_to_le16(NILFS_SUPER_MAGIC)) {
> > > - nilfs_swap_super_block(nilfs);
> > > + memcpy(sbp[0], sbp[1], nilfs->ns_sbsize);
> > > } else {
> > > printk(KERN_CRIT "NILFS: superblock broke on dev %s\n",
> > > sbi->s_super->s_id);
> > > return NULL;
> > > }
> > > + } else if (sbp[1] &&
> > > + sbp[1]->s_magic != cpu_to_le16(NILFS_SUPER_MAGIC)) {
> > > + memcpy(sbp[1], sbp[0], nilfs->ns_sbsize);
> > > }
> > > return sbp;
> > > }
> > >
> > > -int nilfs_commit_super(struct nilfs_sb_info *sbi, int dupsb)
> > > +void nilfs_set_log_cursor(struct nilfs_super_block *sbp,
> > > + struct the_nilfs *nilfs)
> > > {
> > > - struct the_nilfs *nilfs = sbi->s_nilfs;
> > > - struct nilfs_super_block **sbp = nilfs->ns_sbp;
> > > sector_t nfreeblocks;
> > > - time_t t;
> > > - int err;
> > >
> > > /* nilfs->ns_sem must be locked by the caller. */
> > > - err = nilfs_count_free_blocks(nilfs, &nfreeblocks);
> > > - if (unlikely(err)) {
> > > - printk(KERN_ERR "NILFS: failed to count free blocks\n");
> > > - return err;
> > > - }
> > > + nilfs_count_free_blocks(nilfs, &nfreeblocks);
> > > + sbp->s_free_blocks_count = cpu_to_le64(nfreeblocks);
> > > +
> > > spin_lock(&nilfs->ns_last_segment_lock);
> > > - sbp[0]->s_last_seq = cpu_to_le64(nilfs->ns_last_seq);
> > > - sbp[0]->s_last_pseg = cpu_to_le64(nilfs->ns_last_pseg);
> > > - sbp[0]->s_last_cno = cpu_to_le64(nilfs->ns_last_cno);
> > > + sbp->s_last_seq = cpu_to_le64(nilfs->ns_last_seq);
> > > + sbp->s_last_pseg = cpu_to_le64(nilfs->ns_last_pseg);
> > > + sbp->s_last_cno = cpu_to_le64(nilfs->ns_last_cno);
> > > spin_unlock(&nilfs->ns_last_segment_lock);
> > > +}
> > > +
> > > +int nilfs_commit_super(struct nilfs_sb_info *sbi, int flag)
> > > +{
> > > + struct the_nilfs *nilfs = sbi->s_nilfs;
> > > + struct nilfs_super_block **sbp = nilfs->ns_sbp;
> > > + time_t t;
> > >
> > > + /* nilfs->ns_sem must be locked by the caller. */
> > > t = get_seconds();
> > > - nilfs->ns_sbwtime[0] = t;
> > > - sbp[0]->s_free_blocks_count = cpu_to_le64(nfreeblocks);
> > > + nilfs->ns_sbwtime = t;
> > > sbp[0]->s_wtime = cpu_to_le64(t);
> > > sbp[0]->s_sum = 0;
> > > sbp[0]->s_sum = cpu_to_le32(crc32_le(nilfs->ns_crc_seed,
> > > (unsigned char *)sbp[0],
> > > nilfs->ns_sbsize));
> > > - if (dupsb && sbp[1]) {
> > > - memcpy(sbp[1], sbp[0], nilfs->ns_sbsize);
> > > - nilfs->ns_sbwtime[1] = t;
> > > + if (flag == NILFS_SB_COMMIT_ALL && sbp[1]) {
> > > + sbp[1]->s_wtime = sbp[0]->s_wtime;
> > > + sbp[1]->s_sum = 0;
> > > + sbp[1]->s_sum = cpu_to_le32(crc32_le(nilfs->ns_crc_seed,
> > > + (unsigned char *)sbp[1],
> > > + nilfs->ns_sbsize));
> > > }
> > > clear_nilfs_sb_dirty(nilfs);
> > > - return nilfs_sync_super(sbi, dupsb);
> > > + return nilfs_sync_super(sbi, flag);
> > > }
> > >
> > > static void nilfs_put_super(struct super_block *sb)
> > > @@ -305,8 +335,10 @@ static void nilfs_put_super(struct super_block *sb)
> > > down_write(&nilfs->ns_sem);
> > > sbp = nilfs_prepare_super(sbi);
> > > if (likely(sbp)) {
> > > + /* set state only for newer super block */
> > > sbp[0]->s_state = cpu_to_le16(nilfs->ns_mount_state);
> > > - nilfs_commit_super(sbi, 1);
> > > + nilfs_set_log_cursor(sbp[0], nilfs);
> > > + nilfs_commit_super(sbi, NILFS_SB_COMMIT);
> > > }
> > > up_write(&nilfs->ns_sem);
> > > }
> > > @@ -328,6 +360,7 @@ static int nilfs_sync_fs(struct super_block *sb, int wait)
> > > {
> > > struct nilfs_sb_info *sbi = NILFS_SB(sb);
> > > struct the_nilfs *nilfs = sbi->s_nilfs;
> > > + struct nilfs_super_block **sbp;
> > > int err = 0;
> > >
> > > /* This function is called when super block should be written back */
> > > @@ -335,8 +368,13 @@ static int nilfs_sync_fs(struct super_block *sb, int wait)
> > > err = nilfs_construct_segment(sb);
> > >
> > > down_write(&nilfs->ns_sem);
> > > - if (nilfs_sb_dirty(nilfs) && nilfs_prepare_super(sbi))
> > > - nilfs_commit_super(sbi, 1);
> > > + if (nilfs_sb_dirty(nilfs)) {
> > > + sbp = nilfs_prepare_super(sbi);
> > > + if (likely(sbp)) {
> > > + nilfs_set_log_cursor(sbp[0], nilfs);
> > > + nilfs_commit_super(sbi, NILFS_SB_COMMIT);
> > > + }
> > > + }
> > > up_write(&nilfs->ns_sem);
> > >
> > > return err;
> > > @@ -642,7 +680,6 @@ static int nilfs_setup_super(struct nilfs_sb_info *sbi)
> > > max_mnt_count = le16_to_cpu(sbp[0]->s_max_mnt_count);
> > > mnt_count = le16_to_cpu(sbp[0]->s_mnt_count);
> > >
> > > - /* nilfs->ns_sem must be locked by the caller. */
> > > if (nilfs->ns_mount_state & NILFS_ERROR_FS) {
> > > printk(KERN_WARNING
> > > "NILFS warning: mounting fs with errors\n");
> > > @@ -659,7 +696,9 @@ static int nilfs_setup_super(struct nilfs_sb_info *sbi)
> > > sbp[0]->s_state =
> > > cpu_to_le16(le16_to_cpu(sbp[0]->s_state) & ~NILFS_VALID_FS);
> > > sbp[0]->s_mtime = cpu_to_le64(get_seconds());
> > > - return nilfs_commit_super(sbi, 1);
> > > + /* synchronize sbp[1] with sbp[0] */
> > > + memcpy(sbp[1], sbp[0], nilfs->ns_sbsize);
> > > + return nilfs_commit_super(sbi, NILFS_SB_COMMIT_ALL);
> > > }
> > >
> > > struct nilfs_super_block *nilfs_read_super_block(struct super_block *sb,
> > > @@ -913,7 +952,8 @@ static int nilfs_remount(struct super_block *sb, int *flags, char *data)
> > > sbp[0]->s_state =
> > > cpu_to_le16(nilfs->ns_mount_state);
> > > sbp[0]->s_mtime = cpu_to_le64(get_seconds());
> > > - nilfs_commit_super(sbi, 1);
> > > + nilfs_set_log_cursor(sbp[0], nilfs);
> > > + nilfs_commit_super(sbi, NILFS_SB_COMMIT);
> > > }
> > > up_write(&nilfs->ns_sem);
> > > } else {
> > > diff --git a/fs/nilfs2/the_nilfs.c b/fs/nilfs2/the_nilfs.c
> > > index 74b0480..bad254e 100644
> > > --- a/fs/nilfs2/the_nilfs.c
> > > +++ b/fs/nilfs2/the_nilfs.c
> > > @@ -329,8 +329,10 @@ int load_nilfs(struct the_nilfs *nilfs, struct nilfs_sb_info *sbi)
> > > sbp = nilfs_prepare_super(sbi);
> > > if (likely(sbp)) {
> > > nilfs->ns_mount_state |= NILFS_VALID_FS;
> > > + /* set the flag only for newer super block */
> > > sbp[0]->s_state = cpu_to_le16(nilfs->ns_mount_state);
> > > - err = nilfs_commit_super(sbi, 1);
> > > + nilfs_set_log_cursor(sbp[0], nilfs);
> > > + err = nilfs_commit_super(sbi, NILFS_SB_COMMIT);
> > > }
> > > up_write(&nilfs->ns_sem);
> > >
> > > @@ -519,8 +521,8 @@ static int nilfs_load_super_block(struct the_nilfs *nilfs,
> > > nilfs_swap_super_block(nilfs);
> > > }
> > >
> > > - nilfs->ns_sbwtime[0] = le64_to_cpu(sbp[0]->s_wtime);
> > > - nilfs->ns_sbwtime[1] = valid[!swp] ? le64_to_cpu(sbp[1]->s_wtime) : 0;
> > > + nilfs->ns_sbwcount = 0;
> > > + nilfs->ns_sbwtime = le64_to_cpu(sbp[0]->s_wtime);
> > > nilfs->ns_prot_seq = le64_to_cpu(sbp[valid[1] & !swp]->s_last_seq);
> > > *sbpp = sbp[0];
> > > return 0;
> > > diff --git a/fs/nilfs2/the_nilfs.h b/fs/nilfs2/the_nilfs.h
> > > index 85df47f..905e4c1 100644
> > > --- a/fs/nilfs2/the_nilfs.h
> > > +++ b/fs/nilfs2/the_nilfs.h
> > > @@ -57,7 +57,8 @@ enum {
> > > * @ns_current: back pointer to current mount
> > > * @ns_sbh: buffer heads of on-disk super blocks
> > > * @ns_sbp: pointers to super block data
> > > - * @ns_sbwtime: previous write time of super blocks
> > > + * @ns_sbwtime: previous write time of super block
> > > + * @ns_sbwcount: write count of super block
> > > * @ns_sbsize: size of valid data in super block
> > > * @ns_supers: list of nilfs super block structs
> > > * @ns_seg_seq: segment sequence counter
> > > @@ -120,7 +121,8 @@ struct the_nilfs {
> > > */
> > > struct buffer_head *ns_sbh[2];
> > > struct nilfs_super_block *ns_sbp[2];
> > > - time_t ns_sbwtime[2];
> > > + time_t ns_sbwtime;
> > > + unsigned ns_sbwcount;
> > > unsigned ns_sbsize;
> > > unsigned ns_mount_state;
> > >
> > > @@ -205,20 +207,11 @@ THE_NILFS_FNS(SB_DIRTY, sb_dirty)
> > >
> > > /* Minimum interval of periodical update of superblocks (in seconds) */
> > > #define NILFS_SB_FREQ 10
> > > -#define NILFS_ALTSB_FREQ 60 /* spare superblock */
> > >
> > > static inline int nilfs_sb_need_update(struct the_nilfs *nilfs)
> > > {
> > > u64 t = get_seconds();
> > > - return t < nilfs->ns_sbwtime[0] ||
> > > - t > nilfs->ns_sbwtime[0] + NILFS_SB_FREQ;
> > > -}
> > > -
> > > -static inline int nilfs_altsb_need_update(struct the_nilfs *nilfs)
> > > -{
> > > - u64 t = get_seconds();
> > > - struct nilfs_super_block **sbp = nilfs->ns_sbp;
> > > - return sbp[1] && t > nilfs->ns_sbwtime[1] + NILFS_ALTSB_FREQ;
> > > + return t < nilfs->ns_sbwtime || t > nilfs->ns_sbwtime + NILFS_SB_FREQ;
> > > }
> > >
> > > void nilfs_set_last_segment(struct the_nilfs *, sector_t, u64, __u64);
> > > --
> > > 1.5.6.5
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Jiro SEKIBA <[email protected]>