Re: [Cluster-devel] [GFS2 Patch] GFS2: rename causes kernel Oops
Hi, That ends the long unbroken run of there being no bugs in the dir code :( It was good spotting to track that one down and the fix looks good to me. I've pushed it into the -nmw tree. Thanks for fixing this, Steve. On Wed, 2010-07-14 at 18:12 -0400, Bob Peterson wrote: Hi, This patch fixes a kernel Oops in the GFS2 rename code. The problem was in the way the gfs2 directory code was trying to re-use sentinel directory entries. In the failing case, gfs2's rename function was renaming a file to another name that had the same non-trivial length. The file being renamed happened to be the first directory entry on the leaf block. First, the rename code (gfs2_rename in ops_inode.c) found the original directory entry and decided it could do its job by simply replacing the directory entry with another. Therefore it determined correctly that no block allocations were needed. Next, the rename code deleted the old directory entry prior to replacing it with the new name. Therefore, the soon-to-be replaced directory entry was temporarily made into a directory entry sentinel or a place holder at the start of a leaf block. Lastly, it went to re-add the replacement directory entry in that leaf block. However, when gfs2_dirent_find_space was looking for space in the leaf block, it used the wrong value for the sentinel. That threw off its calculations so later it decides it can't really re-use the sentinel and therefore must allocate a new leaf block. But because it previously decided to re-use the directory entry, it didn't waste the time to grab a new block allocation for the inode. Therefore, the inode's i_alloc pointer was still NULL and it crashes trying to reference it. In the case of sentinel directory entries, the entire dirent is reused, not just the free space portion of it, and therefore the function gfs2_dirent_find_space should use the value 0 rather than GFS2_DIRENT_SIZE(0) for the actual dirent size. Fixing this calculation enables the reproducer programs to work properly. Regards, Bob Peterson Red Hat GFS Signed-off-by: Bob Peterson rpete...@redhat.com -- fs/gfs2/dir.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c index 8295c5b..26ca336 100644 --- a/fs/gfs2/dir.c +++ b/fs/gfs2/dir.c @@ -392,7 +392,7 @@ static int gfs2_dirent_find_space(const struct gfs2_dirent *dent, unsigned totlen = be16_to_cpu(dent-de_rec_len); if (gfs2_dirent_sentinel(dent)) - actual = GFS2_DIRENT_SIZE(0); + actual = 0; if (totlen - actual = required) return 1; return 0;
[Cluster-devel] [PATCH 1/5] GFS2: O_TRUNC not working on stuffed files across cluster
From: Bob Peterson rpete...@redhat.com This patch replaces a statement that got dropped out by accident. Without the patch, truncates on stuffed (very small) files cause those files to have an unpredictable size. Signed-off-by: Bob Peterson rpete...@redhat.com Signed-off-by: Steven Whitehouse swhit...@redhat.com diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 4a48c0f..84da64b 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -1041,6 +1041,7 @@ static int trunc_start(struct gfs2_inode *ip, u64 size) if (gfs2_is_stuffed(ip)) { u64 dsize = size + sizeof(struct gfs2_inode); + ip-i_disksize = size; ip-i_inode.i_mtime = ip-i_inode.i_ctime = CURRENT_TIME; gfs2_trans_add_bh(ip-i_gl, dibh, 1); gfs2_dinode_out(ip, dibh-b_data); -- 1.7.1.1
[Cluster-devel] [PATCH 3/5] GFS2: Fix kernel NULL pointer dereference by dlm_astd
From: Bob Peterson rpete...@redhat.com This patch fixes a problem in an error path when looking up dinodes. There are two sister-functions, gfs2_inode_lookup and gfs2_process_unlinked_inode. Both functions acquire and hold the i_iopen glock for the dinode being looked up. The last thing they try to do is hold the i_gl glock for the dinode. If that glock fails for some reason, the error path was incorrectly calling gfs2_glock_put for the i_iopen glock twice. This resulted in the glock being prematurely freed. The minimum hold time usually kept the glock in memory, but the lock interface to dlm (aka lock_dlm) freed its memory for the glock. In some circumstances, it would cause dlm's dlm_astd daemon to try to call the bast function for the freed lock_dlm memory, which resulted in a NULL pointer dereference. Signed-off-by: Bob Peterson rpete...@redhat.com Signed-off-by: Steven Whitehouse swhit...@redhat.com diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index b5612cb..f03afd9 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -169,7 +169,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, { struct inode *inode; struct gfs2_inode *ip; - struct gfs2_glock *io_gl; + struct gfs2_glock *io_gl = NULL; int error; inode = gfs2_iget(sb, no_addr); @@ -198,6 +198,7 @@ struct inode *gfs2_inode_lookup(struct super_block *sb, ip-i_iopen_gh.gh_gl-gl_object = ip; gfs2_glock_put(io_gl); + io_gl = NULL; if ((type == DT_UNKNOWN) (no_formal_ino == 0)) goto gfs2_nfsbypass; @@ -228,7 +229,8 @@ gfs2_nfsbypass: fail_glock: gfs2_glock_dq(ip-i_iopen_gh); fail_iopen: - gfs2_glock_put(io_gl); + if (io_gl) + gfs2_glock_put(io_gl); fail_put: if (inode-i_state I_NEW) ip-i_gl-gl_object = NULL; @@ -256,7 +258,7 @@ void gfs2_process_unlinked_inode(struct super_block *sb, u64 no_addr) { struct gfs2_sbd *sdp; struct gfs2_inode *ip; - struct gfs2_glock *io_gl; + struct gfs2_glock *io_gl = NULL; int error; struct gfs2_holder gh; struct inode *inode; @@ -293,6 +295,7 @@ void gfs2_process_unlinked_inode(struct super_block *sb, u64 no_addr) ip-i_iopen_gh.gh_gl-gl_object = ip; gfs2_glock_put(io_gl); + io_gl = NULL; inode-i_mode = DT2IF(DT_UNKNOWN); @@ -319,7 +322,8 @@ void gfs2_process_unlinked_inode(struct super_block *sb, u64 no_addr) fail_glock: gfs2_glock_dq(ip-i_iopen_gh); fail_iopen: - gfs2_glock_put(io_gl); + if (io_gl) + gfs2_glock_put(io_gl); fail_put: ip-i_gl-gl_object = NULL; gfs2_glock_put(ip-i_gl); -- 1.7.1.1
[Cluster-devel] GFS2: Pre-pull patch posting (fixes)
Hi, Here are a few small fixes for GFS2, Steve.
[Cluster-devel] [PATCH 2/5] GFS2: recovery stuck on transaction lock
From: Bob Peterson rpete...@redhat.com This patch fixes bugzilla bug #590878: GFS2: recovery stuck on transaction lock. We set the frozen flag on the glock when we receive a completion that cannot be delivered due to blocked locks. At that point we check to see whether the first waiting holder has the noexp flag set. If the noexp lock is queued later, then we need to unfreeze the glock at that point in time, namely, in the glock work function. This patch was originally written by Steve Whitehouse, but since he's on holiday, I'm submitting it. It's been well tested with a complex recovery test called revolver. Signed-off-by: Steve Whitehouse swhit...@redhat.com Signed-off-by: Bob Peterson rpete...@redhat.com diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index ddcdbf4..dbab3fd 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -706,8 +706,18 @@ static void glock_work_func(struct work_struct *work) { unsigned long delay = 0; struct gfs2_glock *gl = container_of(work, struct gfs2_glock, gl_work.work); + struct gfs2_holder *gh; int drop_ref = 0; + if (unlikely(test_bit(GLF_FROZEN, gl-gl_flags))) { + spin_lock(gl-gl_spin); + gh = find_first_waiter(gl); + if (gh (gh-gh_flags LM_FLAG_NOEXP) + test_and_clear_bit(GLF_FROZEN, gl-gl_flags)) + set_bit(GLF_REPLY_PENDING, gl-gl_flags); + spin_unlock(gl-gl_spin); + } + if (test_and_clear_bit(GLF_REPLY_PENDING, gl-gl_flags)) { finish_xmote(gl, gl-gl_reply); drop_ref = 1; -- 1.7.1.1
[Cluster-devel] [PATCH 4/5] GFS2: BUG in gfs2_adjust_quota
From: Abhijith Das a...@redhat.com HighMem pages on i686 do not get mapped to the buffer_heads and this was causing a NULL pointer dereference when we were trying to memset page buffers to zero. We now use zero_user() that kmaps the page and directly manipulates page data. This patch also fixes a boundary condition that was incorrect. Signed-off-by: Abhi Das a...@redhat.com Signed-off-by: Steven Whitehouse swhit...@redhat.com diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index 49667d6..b256d6f 100644 --- a/fs/gfs2/quota.c +++ b/fs/gfs2/quota.c @@ -694,10 +694,8 @@ get_a_page: if (!buffer_mapped(bh)) goto unlock_out; /* If it's a newly allocated disk block for quota, zero it */ - if (buffer_new(bh)) { - memset(bh-b_data, 0, bh-b_size); - set_buffer_uptodate(bh); - } + if (buffer_new(bh)) + zero_user(page, pos - blocksize, bh-b_size); } if (PageUptodate(page)) @@ -723,7 +721,7 @@ get_a_page: /* If quota straddles page boundary, we need to update the rest of the * quota at the beginning of the next page */ - if (offset != 0) { /* first page, offset is closer to PAGE_CACHE_SIZE */ + if ((offset + sizeof(struct gfs2_quota)) PAGE_CACHE_SIZE) { ptr = ptr + nbytes; nbytes = sizeof(struct gfs2_quota) - nbytes; offset = 0; -- 1.7.1.1
[Cluster-devel] [PATCH 5/5] GFS2: rename causes kernel Oops
From: Bob Peterson rpete...@redhat.com This patch fixes a kernel Oops in the GFS2 rename code. The problem was in the way the gfs2 directory code was trying to re-use sentinel directory entries. In the failing case, gfs2's rename function was renaming a file to another name that had the same non-trivial length. The file being renamed happened to be the first directory entry on the leaf block. First, the rename code (gfs2_rename in ops_inode.c) found the original directory entry and decided it could do its job by simply replacing the directory entry with another. Therefore it determined correctly that no block allocations were needed. Next, the rename code deleted the old directory entry prior to replacing it with the new name. Therefore, the soon-to-be replaced directory entry was temporarily made into a directory entry sentinel or a place holder at the start of a leaf block. Lastly, it went to re-add the replacement directory entry in that leaf block. However, when gfs2_dirent_find_space was looking for space in the leaf block, it used the wrong value for the sentinel. That threw off its calculations so later it decides it can't really re-use the sentinel and therefore must allocate a new leaf block. But because it previously decided to re-use the directory entry, it didn't waste the time to grab a new block allocation for the inode. Therefore, the inode's i_alloc pointer was still NULL and it crashes trying to reference it. In the case of sentinel directory entries, the entire dirent is reused, not just the free space portion of it, and therefore the function gfs2_dirent_find_space should use the value 0 rather than GFS2_DIRENT_SIZE(0) for the actual dirent size. Fixing this calculation enables the reproducer programs to work properly. Signed-off-by: Bob Peterson rpete...@redhat.com Signed-off-by: Steven Whitehouse swhit...@redhat.com diff --git a/fs/gfs2/dir.c b/fs/gfs2/dir.c index 8295c5b..26ca336 100644 --- a/fs/gfs2/dir.c +++ b/fs/gfs2/dir.c @@ -392,7 +392,7 @@ static int gfs2_dirent_find_space(const struct gfs2_dirent *dent, unsigned totlen = be16_to_cpu(dent-de_rec_len); if (gfs2_dirent_sentinel(dent)) - actual = GFS2_DIRENT_SIZE(0); + actual = 0; if (totlen - actual = required) return 1; return 0; -- 1.7.1.1
[Cluster-devel] GFS2: Pull request (fixes)
Hi, Please consider pulling the following GFS2 fixes, Steve. The following changes since commit 2f7989efd4398d92b8adffce2e07dd043a0895fe: Merge master.kernel.org:/home/rmk/linux-2.6-arm (2010-07-14 17:28:13 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes.git master Abhijith Das (1): GFS2: BUG in gfs2_adjust_quota Bob Peterson (4): GFS2: O_TRUNC not working on stuffed files across cluster GFS2: recovery stuck on transaction lock GFS2: Fix kernel NULL pointer dereference by dlm_astd GFS2: rename causes kernel Oops fs/gfs2/bmap.c |1 + fs/gfs2/dir.c |2 +- fs/gfs2/glock.c | 10 ++ fs/gfs2/inode.c | 12 fs/gfs2/quota.c |8 +++- 5 files changed, 23 insertions(+), 10 deletions(-)