[Cluster-devel] [PATCH 01/51] [GFS2] Fix two races relating to glock callbacks

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] One of the races relates to referencing a variable while not holding its protecting spinlock. The patch simply moves the test inside the spin lock. The other race occurs when a demote to unlocked request occurs during the time a demote to shared request
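As a rough illustration of the first race only (this is not the GFS2 patch itself), the sketch below moves the test of a lock-protected field inside the locked region. The struct, its fields, and the C11 atomic_flag spinlock are all hypothetical stand-ins chosen so the example builds in userspace.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock-protected object; not the real glock structure. */
struct guarded {
    atomic_flag lock;  /* minimal spinlock */
    int pending;       /* protected by lock */
    int handled;       /* protected by lock */
};

static void guarded_lock(struct guarded *g)
{
    /* Spin until the flag was previously clear, i.e. we now own the lock. */
    while (atomic_flag_test_and_set_explicit(&g->lock, memory_order_acquire))
        ;
}

static void guarded_unlock(struct guarded *g)
{
    atomic_flag_clear_explicit(&g->lock, memory_order_release);
}

/*
 * The racy shape would test g->pending first and only then take the lock,
 * so the value could change between the test and the update. Here the test
 * is made while the lock is held.
 */
static bool consume_pending(struct guarded *g)
{
    bool did_work = false;

    guarded_lock(g);
    assert(g->pending >= 0);   /* invariant checked under the lock */
    if (g->pending > 0) {
        g->pending--;
        g->handled++;
        did_work = true;
    }
    guarded_unlock(g);
    return did_work;
}
```

With one pending item, the first call to consume_pending() returns true and the second returns false.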

[Cluster-devel] [GFS2/DLM] Pre-pull patch posting

2007-10-04 Thread swhiteho
Hi, Since it seems that another merge window will probably be opening shortly this is a posting of the current content of the GFS2/DLM -nmw git tree. There are no new features this time, it's all fixes and cleanups. I have a few patches that I'm holding back which I'm intending to start off the

[Cluster-devel] [PATCH 06/51] [GFS2] Move some code inside the log lock

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is the first of five patches for bug #248176: There were still some critical variables being manipulated outside the log_lock spinlock. That usually resulted in a hang. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse

[Cluster-devel] [PATCH 08/51] [GFS2] Prevent infinite loop in try_rgrp_unlink()

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is patch three of five for bug #248176. The try_rgrp_unlink code in rgrp.c had an infinite loop. This was caused because the bitmap function rgblk_search can return a block less than the goal block, in which case it was looping. The fix is to make it
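The snippet above is cut off before it describes the actual fix, so the following is only a sketch of the general failure mode: a search that may wrap around and return a block below the goal, and a caller that has to stop when that happens. The function names, the byte-per-block map, and the NO_BLOCK sentinel are hypothetical and do not reflect the real rgblk_search() signature.

```c
#include <assert.h>
#include <stddef.h>

#define NO_BLOCK ((size_t)-1)   /* hypothetical "nothing found" sentinel */

/*
 * Hypothetical search: returns the first set entry at or after 'goal',
 * wrapping to the start of the map, or NO_BLOCK if no entry is set.
 */
static size_t search_wrapping(const unsigned char *map, size_t n, size_t goal)
{
    for (size_t i = 0; i < n; i++) {
        size_t idx = (goal + i) % n;
        if (map[idx])
            return idx;
    }
    return NO_BLOCK;
}

/*
 * Counts set entries by repeatedly searching from an advancing goal.
 * Without the 'blk < goal' check, a wrapped result would move the goal
 * backwards and the loop would never terminate.
 */
static size_t count_candidates(const unsigned char *map, size_t n)
{
    size_t goal = 0, count = 0;

    assert(map != NULL);
    while (goal < n) {
        size_t blk = search_wrapping(map, n, goal);
        if (blk == NO_BLOCK || blk < goal)
            break;              /* nothing left at or beyond the goal */
        count++;
        goal = blk + 1;
    }
    return count;
}
```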

[Cluster-devel] [PATCH 09/51] [GFS2] use an temp variable to reduce a spin_unlock

2007-10-04 Thread swhiteho
From: Denis Cheng [EMAIL PROTECTED] this is more clear. Signed-off-by: Denis Cheng [EMAIL PROTECTED] Signed-off-by: David Teigland [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/locking/dlm/plock.c b/fs/gfs2/locking/dlm/plock.c index fba1f1d..1f7b038

[Cluster-devel] [PATCH 10/51] [GFS2] Detach buf data during in-place writeback

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is patch 5 of 5 for bug #248176. Metadata corruption was occurring because page references weren't being removed in all cases. I previously added a function called detach_bufdata, but I discovered there already WAS a function out there to do the job.

[Cluster-devel] [PATCH 13/51] [GFS2] Reduce number of gfs2_scand processes to one

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] We only need a single gfs2_scand process rather than the one per filesystem which we had previously. As a result the parameter determining the frequency of gfs2_scand runs becomes a module parameter rather than a mount parameter as it was before.

[Cluster-devel] [PATCH 15/51] [GFS2] Ensure journal file cache is flushed after recovery

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is for bugzilla bug #248176: GFS2: invalid metadata block. Patches 1 thru 3 were accepted upstream, but there were problems with 4 and 5. Those issues have been resolved and now the recovery tests are passing without errors. This code has gone through

[Cluster-devel] [PATCH 48/51] [GFS2] Don't try to remove buffers that don't exist

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c index 4002f41..873a511 100644 --- a/fs/gfs2/ops_address.c +++ b/fs/gfs2/ops_address.c @@ -747,7 +747,7 @@ int gfs2_releasepage(struct page

[Cluster-devel] [PATCH 38/51] [GFS2] Replace revoke structure with bufdata structure

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Both the revoke structure and the bufdata structure are quite similar. They are basically small tags which are put on lists. In addition to which the revoke structure is always allocated when there is a bufdata structure which is (or can be) freed. As

[Cluster-devel] [PATCH 35/51] [GFS2] Move pin/unpin into lops.c, clean up locking

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] gfs2_pin and gfs2_unpin are only used in lops.c, despite being defined in meta_io.c, so this patch moves them into lops.c and makes them static. At the same time, it's possible to clean up the locking in the buf and databuf _lo_add() functions so that we

[Cluster-devel] [PATCH 27/51] [GFS2] Wendy's dump lockname in hex fix glock dump

2007-10-04 Thread swhiteho
From: Abhijith Das [EMAIL PROTECTED] With this patch, gfs2 glockdump through the debugfs filesystem will only dump glocks for the specified filesystem instead of all glocks. Also, to aid debugging, the glock number is dumped in hex instead of decimal. Signed-off-by: Steven Whitehouse [EMAIL
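To show what dumping a number in hex rather than decimal amounts to, here is a small userspace sketch. The helper name and the "0x" output format are assumptions for illustration and are not taken from the GFS2 glock dump code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes 'number' into 'buf' as 0x-prefixed lowercase
 * hex and returns the value snprintf() reports (the length of the full
 * formatted string, excluding the terminating NUL).
 */
static int format_lock_number(char *buf, size_t size, unsigned long long number)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "0x%llx", number);
}
```

For example, the value 255 is written as 0xff instead of 255.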

[Cluster-devel] [PATCH 25/51] [GFS2] Reduce truncate IO traffic

2007-10-04 Thread swhiteho
From: Wendy Cheng [EMAIL PROTECTED] The current GFS2 setattr call unconditionally invokes do_shrink even when the requested size and actual file size are equal. This generated a large amount of extra I/O, found during NFS benchmark runs. This patch moves the relevant logic out of the shrink code path. Since

[Cluster-devel] [PATCH 21/51] [GFS2] Fix quota do_list operation hang

2007-10-04 Thread swhiteho
From: Abhijith Das [EMAIL PROTECTED] This is the filesystem part of the patches to fix this bz. There are additional userland patches (gfs2_quota, libgfs2) for the complete solution. This patch adds a new field qu_ll_next to the gfs2_quota structure. This field allows us to create linked lists of

[Cluster-devel] [PATCH 17/51] [GFS2] unneeded typecast

2007-10-04 Thread swhiteho
From: Denis Cheng [EMAIL PROTECTED] sb->s_fs_info is a void pointer, thus the type cast is not needed. Signed-off-by: Denis Cheng [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 32b2859..25cfab9 100644 ---
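The point about void pointers can be seen in a few lines of plain C: a void * converts implicitly to any object pointer type, so no cast is required on assignment. The two structs below are hypothetical stand-ins, not the kernel's struct super_block or struct gfs2_sbd.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the patch. */
struct fake_sb  { void *s_fs_info; };
struct fake_sbd { int sd_id; };

static int sbd_id(const struct fake_sb *sb)
{
    assert(sb->s_fs_info != NULL);

    /* In C, void * converts implicitly; writing (struct fake_sbd *) here
     * would be redundant. */
    struct fake_sbd *sdp = sb->s_fs_info;

    return sdp->sd_id;
}
```

Note that this holds for C only; C++ would require an explicit cast.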

[Cluster-devel] [PATCH 18/51] [GFS2] better code for translating characters

2007-10-04 Thread swhiteho
From: Denis Cheng [EMAIL PROTECTED] the original code could work, but I think this code could work better. Signed-off-by: Denis Cheng [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 25cfab9..6c820cb 100644 ---

[Cluster-devel] [PATCH 23/51] [GFS2] Add a missing gfs2_trans_add_bh()

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This was missing from the dir_split_leaf() function although in most cases it's not a problem due to other functions having already previously called gfs2_trans_add_bh. This makes certain that it is correct. Signed-off-by: Steven Whitehouse [EMAIL

[Cluster-devel] [PATCH 28/51] [GFS2] Patch to protect sd_log_num_jdata

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is a patch to GFS2 to protect sd_log_num_jdata with the gfs2_log_lock. Without this patch, there is a timing window where you can hit the following assert from function gfs2_log_flush(): gfs2_assert_withdraw(sdp,

[Cluster-devel] [PATCH 31/51] [GFS2] fix inode meta data corruption

2007-10-04 Thread swhiteho
From: Wendy Cheng [EMAIL PROTECTED] Fix a nasty inode meta data corruption issue by keeping the buffer head in icache array. This buffer needs to stay in memory until journal flush occurs. Otherwise, gfs2_meta_inode_buffer could do a disk read before the inode hits disk. It ends up with meta data

[Cluster-devel] [PATCH 33/51] [GFS2] Introduce gfs2_remove_from_ail

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This collects together the operations required to remove a gfs2_bufdata from the ail lists. It's only called from two places to start with, but expect to see more of this function in future. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git

[Cluster-devel] [PATCH 30/51] [GFS2] delay glock demote for a minimum hold time

2007-10-04 Thread swhiteho
From: Benjamin Marzinski [EMAIL PROTECTED] When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in a cluster, it will deadlock. The reason is that do_no_page() will repeatedly call gfs2_sharewrite_nopage(), because each node keeps giving up the glock too early, and is

[Cluster-devel] [PATCH 39/51] [GFS2] Use slab operations for all gfs2_bufdata allocations

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The old revoke structure was allocated using kmalloc/kfree but there is a slab cache for gfs2_bufdata, so we should use that now that the structures have been converted. This is part two of the patch series to merge the revoke and gfs2_bufdata structures.

[Cluster-devel] [PATCH 40/51] [GFS2] Clean up gfs2_trans_add_revoke()

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The following alters gfs2_trans_add_revoke() to take a struct gfs2_bufdata as an argument. This eliminates the memory allocation which was previously required by making use of the already existing struct gfs2_bufdata. It makes some sanity checks to ensure

[Cluster-devel] [PATCH 42/51] [GFS2] Move inode deletion out of blocking_cb

2007-10-04 Thread swhiteho
From: Wendy Cheng [EMAIL PROTECTED] Move inode deletion code out of blocking_cb handle_callback route to avoid racy conditions that end up blocking lock_dlm1 thread. Fix bugzilla 286821. Signed-off-by: Wendy Cheng [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git

[Cluster-devel] [PATCH 44/51] [GFS2] GFS2: chmod hung - fix race in thread creation

2007-10-04 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] The problem boiled down to a race between the gdlm_init_threads() function initializing thread1 and its setting of blist = 1. Essentially, if (current == ls->thread1) was checked by the thread before the thread creator set ls->thread1. Since thread1 is the only

[Cluster-devel] [PATCH 47/51] [GFS2] Alternate gfs2_iget to avoid looking up inodes being freed

2007-10-04 Thread swhiteho
From: Benjamin Marzinski [EMAIL PROTECTED] There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. process A does an iput() on inode X, and

[Cluster-devel] [PATCH 49/51] [GFS2] Get superblock a different way

2007-10-04 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The mapping may be NULL by the time the I/O has completed, so we now get the superblock by a different route (via the bd and glock) to avoid this problem. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] Cc: Wendy Cheng [EMAIL PROTECTED] diff --git

[Cluster-devel] [GFS2 PATCH] Handle multiple glock demote requests

2007-10-04 Thread Wendy Cheng
Red Hat bugzilla 295641... Wendy Fix a race condition where multiple glock demote requests are sent to a node back-to-back. This patch does a check inside handle_callback() to see whether a demote request is in progress. If true, it sets a flag to make sure run_queue() will loop again to

Re: [Cluster-devel] [GFS2 PATCH] Handle multiple glock demote requests

2007-10-04 Thread Wendy Cheng
Make a correction based on Josef's comment Wendy Fix a race condition where multiple glock demote requests are sent to a node back-to-back. This patch does a check inside handle_callback() to see whether a demote request is in progress. If true, it sets a flag to make sure run_queue() will