Re: [Cluster-devel] [PATCH][GFS2] Lockup on error

2008-01-21 Thread Steven Whitehouse
Hi, On Sat, 2008-01-19 at 21:50 -0600, Bob Peterson wrote: Hi, I spotted this bug while I was digging around. Looks like it could cause a lockup in some rare error condition. Regards, Bob Peterson -- Signed-off-by: Bob Peterson [EMAIL PROTECTED] -- fs/gfs2/inode.c |2 +- 1

[Cluster-devel] [PATCH 04/58] [GFS2] Remove useless i_cache from inodes

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The i_cache was designed to keep references to the indirect blocks used during block mapping so that they didn't have to be looked up continually. The idea failed because there are too many places where the i_cache needs to be freed, and this has in the

[Cluster-devel] GFS2 git tree

2008-01-21 Thread Steven Whitehouse
Hi, I've just rebased the tree due to the impending merge window. Also I've removed all of the DLM patches since there will shortly be a separate DLM tree. I'm going to hold off accepting any larger patches now until after the merge window, Steve.

[Cluster-devel] [PATCH 56/58] [GFS2] Fix page_mkwrite truncation race path

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] There was a bug in the truncation/invalidation race path for ->page_mkwrite for gfs2. It ought to return 0 so that the effect is the same as if the page was truncated at any of the other points at which the page_lock is dropped. This will result in the

[Cluster-devel] [PATCH 53/58] [GFS2] gfs2_alloc_required performance

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is a small I/O performance enhancement to gfs2. (Actually, it is a rework of an earlier version I got wrong). The idea here is to check if the write extends past the last block in the file. If so, the function can save itself a lot of time and trouble
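The shortcut described above can be sketched in userspace C. This is only an illustration, not the real gfs2_alloc_required(): the function name, its parameters and the block-shift convention are all hypothetical, and it assumes that a write reaching past the file's last block always needs new blocks allocated.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the gfs2 function): returns true when a write of
 * `len` bytes at `offset` touches a block beyond the last block currently
 * covering a file of `file_size` bytes. Block size is 1 << block_shift.
 */
bool write_extends_past_last_block(uint64_t file_size, uint64_t offset,
                                   uint64_t len, unsigned int block_shift)
{
    uint64_t block_size = (uint64_t)1 << block_shift;

    /* Number of blocks covering the existing file, rounded up. */
    uint64_t file_blocks = (file_size + block_size - 1) >> block_shift;

    /* Number of blocks needed to cover the end of the write, rounded up. */
    uint64_t write_end_blocks = (offset + len + block_size - 1) >> block_shift;

    /* An empty write never needs allocation. */
    return len != 0 && write_end_blocks > file_blocks;
}
```

When this returns true the caller could skip any per-block lookup; when it returns false, the slower check for holes inside the file would still be needed.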

[Cluster-devel] [PATCH 37/58] [GFS2] Get rid of useless found variable in quota.c

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This just eliminates an unused variable from the quota code. Not likely to be a time saver. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index 8b4c20c..60cc50f

[Cluster-devel] [PATCH 43/58] [GFS2] Eliminate the no longer needed sd_statfs_mutex

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This patch eliminates the unneeded sd_statfs_mutex mutex but preserves the ordering as discussed. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index

[Cluster-devel] [PATCH 36/58] [GFS2] Journal extent mapping

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This patch saves a little time when gfs2 writes to the journals by keeping a mapping between logical and physical blocks on disk. That's better than constantly looking up indirect pointers in buffers, when the journals are several levels of indirection (which

[Cluster-devel] [PATCH 35/58] [GFS2] Fix typo in log.c

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] An inequality was the wrong way around causing gfs2_logd to wake up too often. This fixes it. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 7f9ab89..9bece94 100644 --- a/fs/gfs2/log.c +++

[Cluster-devel] [PATCH 28/58] [GFS2] Fix build warnings

2008-01-21 Thread swhiteho
From: Fabio Massimo Di Nitto [EMAIL PROTECTED] Hi Steven, Steven Whitehouse wrote: Hi, Now in the -nmw git tree. Thanks, Steve. On Wed, 2007-11-21 at 11:54 -0600, Ryan O'Hara wrote: this patch introduces a bunch of build warnings by leaving around struct inode *inode = ip->i_inode; The

[Cluster-devel] [PATCH 19/58] [GFS2] Don't add glocks to the journal

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The only reason for adding glocks to the journal was to keep track of which locks required a log flush prior to release. We add a flag to the glock to allow this check to be made in a simpler way. This reduces the size of a glock (by 12 bytes on i386, 24

[Cluster-devel] [PATCH 15/58] [GFS2] Reorder writeback for glock sync

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Previously we were doing (write data, wait for data, write metadata, wait for metadata). After this patch we do (write metadata, write data, wait for data, wait for metadata) which should be more efficient. Also I noticed that the drop_bh and xmote_bh

[Cluster-devel] [PATCH 11/58] [GFS2] Use correct include file in ops_address.c

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Something changed in the upstream kernel, and it needs this one-liner to allow ops_address.c to build. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/ops_address.c b/fs/gfs2/ops_address.c index ae782d2..7353933 100644 ---

[Cluster-devel] [PATCH 10/58] [GFS2] Don't hold page lock when starting transaction

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This is an addendum to the new AOPs work which moves the point at which we take the page lock so that we don't get it until the last possible moment. This resolves a conflict between starting transactions and the page lock. Signed-off-by: Steven

[Cluster-devel] [PATCH 06/58] [GFS2] Add gfs2_is_writeback()

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This adds a function gfs2_is_writeback() along the lines of the existing gfs2_is_jdata() in order to clean up the code and make the various tests for the inode mode more obvious. It also fixes the PageChecked() logic where we were resetting the flag too

[Cluster-devel] [PATCH 05/58] [GFS2] Remove unused field in struct gfs2_inode

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Removes a field that is not used. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index 5662ff9..e53da7d 100644 --- a/fs/gfs2/incore.h +++ b/fs/gfs2/incore.h @@ -274,7 +274,6 @@ struct gfs2_inode {

[Cluster-devel] [PATCH 40/58] [GFS2] Only fetch the dinode once in block_map

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] Function gfs2_block_map was often looking up the disk inode twice. This optimizes it so that it only does it once. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c

[Cluster-devel] [GFS2] Pre-pull patch posting

2008-01-21 Thread swhiteho
Hi, Here is the current GFS2 patch queue. You'll notice that this time there are no DLM patches in this list. That is because the DLM team are setting up their own git tree and thus future DLM patches will be sent directly by them rather than via the GFS2 tree. Most of this set of patches is

[Cluster-devel] [PATCH 13/58] [GFS2] Remove reclaim limit

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This call to reclaim glocks is not needed, and in particular we don't want it in the fast path for locking glocks. The limit was entirely arbitrary anyway and we can't expect users to adjust things like this, the remaining code will do the right thing on

[Cluster-devel] [PATCH 14/58] [GFS2] Add sync_page to metadata address space operations

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This set of address space operations was missing a sync_page operation. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c index 01ef902..4b1aced 100644 --- a/fs/gfs2/meta_io.c +++ b/fs/gfs2/meta_io.c

[Cluster-devel] [PATCH 16/58] [GFS2] Remove flags no longer required

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The HIF_MUTEX and HIF_PROMOTE flags were set on the glock holders depending upon which of the two waiters lists they were going to be queued upon. They were then tested when the holders were taken off the lists to ensure that the right type of holder was

[Cluster-devel] [PATCH 17/58] [GFS2] Given device ID rather than s_id in id sysfs file

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This patch changes the /sys/fs/gfs2/s_id/id file to give the device id major:minor rather than the s_id. That enables gfs2_tool to match devices properly (by id, not name) when locating the tuning files. Signed-off-by: Bob Peterson [EMAIL PROTECTED]

[Cluster-devel] [PATCH 18/58] [GFS2] check kthread_should_stop when waiting

2008-01-21 Thread swhiteho
From: David Teigland [EMAIL PROTECTED] Use wait_event_interruptible() in the lock_dlm thread instead of an open coded equivalent, and include a kthread_should_stop() check in the wait test so we don't miss a kthread_stop(). Signed-off-by: David Teigland [EMAIL PROTECTED] Signed-off-by: Steven

[Cluster-devel] [PATCH 20/58] [GFS2] Use atomic_t for journal free blocks counter

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This patch changes the counter which keeps track of the free blocks in the journal to an atomic_t in preparation for the following patch which will update the log reservation code. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git

[Cluster-devel] [PATCH 25/58] [GFS2] Fix runtime issue with UP kernels

2008-01-21 Thread swhiteho
From: Fabio Massimo Di Nitto [EMAIL PROTECTED] The issue is indeed UP vs SMP and it is totally random. spin_is_locked() is a bad assertion because there is no correct answer on UP. on UP spin_is_locked() has to return either one value or another, always. This means that in my setup I am lucky

[Cluster-devel] [PATCH 26/58] [GFS2] Revise gfs2_logd and flush thresholds

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] This patch introduces two new log thresholds: o thresh1 is the point at which we wake up gfs2_logd due to the pinned block count. It is initialised at mount time to 2/5ths of the size of the journal. Currently it does not change during the course
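As a rough illustration of the thresh1 initialisation mentioned above, the arithmetic is just two fifths of the journal size in blocks. The helper name below is hypothetical and the rounding (truncating integer division) is an assumption; the snippet does not show how the patch itself computes it.

```c
#include <stdint.h>

/*
 * Hypothetical helper: wake-up threshold set to 2/5 of the journal size,
 * both expressed in blocks. Rounding down is an assumption.
 */
uint32_t logd_thresh1(uint32_t journal_blocks)
{
    /* Widen to 64 bits before multiplying so large journals cannot overflow. */
    return (uint32_t)(((uint64_t)journal_blocks * 2) / 5);
}
```

For a journal of 32768 blocks this gives a threshold of 13107 pinned blocks.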

[Cluster-devel] [PATCH 33/58] [GFS2] use pid for plock owner for nfs clients

2008-01-21 Thread swhiteho
From: David Teigland [EMAIL PROTECTED] The fl_owner is that of lockd when posix locks arrive from nfs clients, so it can't be used to distinguish between lock holders. Use fl_pid as owner instead; it's the pid of the process on the nfs client. Signed-off-by: David Teigland [EMAIL PROTECTED]

[Cluster-devel] [PATCH 44/58] [GFS2] Minor correction

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This is a small correction to my previously posted patch1. It just changes a divide to a shift. It's faster and doesn't introduce odd dependencies on 32-bit compiles. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL
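The divide-to-shift change can be illustrated with a small sketch. The function names are hypothetical and the quantity being converted (bytes to blocks) is an assumption, since the snippet does not say which expression was changed. The "odd dependency" is presumably the library helper (such as __udivdi3) that a 32-bit compiler emits for a 64-bit division by a non-constant value.

```c
#include <stdint.h>

/* Division form: on 32-bit targets this may compile to a helper call. */
uint64_t bytes_to_blocks_div(uint64_t bytes, unsigned int block_shift)
{
    return bytes / ((uint64_t)1 << block_shift);
}

/* Shift form: same result for unsigned values, no helper call needed. */
uint64_t bytes_to_blocks_shift(uint64_t bytes, unsigned int block_shift)
{
    return bytes >> block_shift;
}
```

The two forms agree only because the divisor is a power of two and the operand is unsigned.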

[Cluster-devel] [PATCH 45/58] [GFS2] Fix log block mapper

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] A missing offset in the calculation. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 0833e27..40c51bf 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -336,7 +336,7 @@ static u64 log_bmap(struct

[Cluster-devel] [PATCH 46/58] [GFS2] Remove unused variable

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The go_drop_th function is never called or referenced. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index 9a83429..c85f4fd 100644 --- a/fs/gfs2/incore.h +++ b/fs/gfs2/incore.h @@ -131,7 +131,6 @@

[Cluster-devel] [PATCH 47/58] [GFS2] Allow page migration for writeback and ordered pages

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] To improve performance on NUMA, we use the VM's standard page migration for writeback and ordered pages. Probably we could also do the same for journaled data, but that would need a careful audit of the code, so will be the subject of a later patch.

[Cluster-devel] [PATCH 48/58] [GFS2] Initialize extent_list earlier

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] Here is a patch for the latest upstream GFS2 code: The journal extent map needs to be initialized sooner than it currently is. Otherwise failed mount attempts (e.g. not enough journals, etc.) may panic trying to access the uninitialized list. Signed-off-by:

[Cluster-devel] [PATCH 50/58] [GFS2] Fix assert in log code

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] Although the values were all being calculated correctly, there was a race in the assert due to the way it was using atomic variables. This changes the value we assert on so that we get the same effect by testing a different variable. This prevents the

[Cluster-devel] [PATCH 52/58] [GFS2] Remove unneeded i_spin

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This patch removes a vestigial variable i_spin from the gfs2_inode structure. This not only saves us memory (30 of these in memory for the oom test) it also saves us time because we don't have to spend time initializing it (i.e. slightly better

[Cluster-devel] [PATCH 54/58] [GFS2] Fix write alloc required shortcut calculation

2008-01-21 Thread swhiteho
From: Steven Whitehouse [EMAIL PROTECTED] The comparison was being made against the wrong quantity. Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 4356cc2..e4effc4 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -1222,10 +1222,10 @@ int

[Cluster-devel] [PATCH 55/58] [GFS2] Fix typo

2008-01-21 Thread swhiteho
From: Bob Peterson [EMAIL PROTECTED] This patch fixes a minor typo. Surprisingly, it still compiled. Signed-off-by: Bob Peterson [EMAIL PROTECTED] Signed-off-by: Steven Whitehouse [EMAIL PROTECTED] diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c index bc28bc6..99c7959 100644 ---

[Cluster-devel] [PATCH 58/58] [GFS2] Allow journal recovery on read-only mount

2008-01-21 Thread swhiteho
From: Abhijith Das [EMAIL PROTECTED] This patch allows gfs2 to perform journal recovery even if it is mounted read-only. Strictly speaking, a read-only mount should not be writing to the filesystem, but we do this only to perform journal recovery. A read-only mount will fail if we don't recover

Re: [Cluster-devel] [PATCH] gfs2 umount: support fake -r option

2008-01-21 Thread David Teigland
On Sat, Jan 19, 2008 at 06:58:17AM +0100, Fabio M. Di Nitto wrote: Hi guys, in certain situations where gfs2 init scripts are not used to umount gfs2 volume, we end up with umount.gfs2 being invoked with -r option and this fails because we don't know what to do with this option. The

[Cluster-devel] cluster/group/gfs_controld plock.c

2008-01-21 Thread teigland
CVSROOT:/cvs/cluster Module name:cluster Branch: RHEL5 Changes by: [EMAIL PROTECTED] 2008-01-21 20:17:44 Modified files: group/gfs_controld: plock.c Log message: bz 429546 Fix an alignment problem with ppc64. Things work if we do

[Cluster-devel] cluster/cmirror/src clogd.c queues.c

2008-01-21 Thread jbrassow
CVSROOT: /cvs/cluster Module name: cluster Branch: RHEL5 Changes by: [EMAIL PROTECTED] 2008-01-21 20:18:44 Modified files: cmirror/src: clogd.c queues.c Log message: s/LOG_PRINT/LOG_DBG/ too verbose on exit/signals Patches:

[Cluster-devel] cluster/group/gfs_controld plock.c

2008-01-21 Thread teigland
CVSROOT:/cvs/cluster Module name:cluster Branch: RHEL51 Changes by: [EMAIL PROTECTED] 2008-01-21 20:19:08 Modified files: group/gfs_controld: plock.c Log message: bz 429546 Fix an alignment problem with ppc64. Things work if we do

[Cluster-devel] cluster/group/gfs_controld plock.c

2008-01-21 Thread teigland
CVSROOT:/cvs/cluster Module name:cluster Changes by: [EMAIL PROTECTED] 2008-01-21 20:21:08 Modified files: group/gfs_controld: plock.c Log message: bz 429546 Fix an alignment problem with ppc64. Things work if we do the

[Cluster-devel] cluster/cmirror-kernel/src dm-clog.c

2008-01-21 Thread jbrassow
CVSROOT: /cvs/cluster Module name: cluster Branch: RHEL5 Changes by: [EMAIL PROTECTED] 2008-01-21 20:37:03 Modified files: cmirror-kernel/src: dm-clog.c Log message: - name change s/clustered_/clustered-/ Patches:

[Cluster-devel] cluster/ccs/lib libccs.c

2008-01-21 Thread jbrassow
CVSROOT: /cvs/cluster Module name: cluster Changes by: [EMAIL PROTECTED] 2008-01-21 22:31:54 Modified files: ccs/lib: libccs.c Log message: - ccs library now checks for bad file descriptors as input Patches:

[Cluster-devel] current dlm patches

2008-01-21 Thread David Teigland
This is the current set of dlm patches that I'm collecting at http://people.redhat.com/teigland/dlm-patches-testing/ I'm preparing to send these upstream for 2.6.25 in the next week or so, depending on review and testing. They come mainly from - the mixed architecture testing and fixing that

[Cluster-devel] [PATCH] dlm: close othercons

2008-01-21 Thread David Teigland
From: Patrick Caulfield [EMAIL PROTECTED] This patch addresses a problem introduced with the last round of lowcomms patches where the 'othercon' connections do not get freed when the DLM shuts down. This results in the error message slab error in kmem_cache_destroy(): cache `dlm_conn': Can't free

[Cluster-devel] [PATCH] dlm: proper prototypes

2008-01-21 Thread David Teigland
From: Adrian Bunk [EMAIL PROTECTED] This patch adds a proper prototype for some functions in fs/dlm/dlm_internal.h Signed-off-by: Adrian Bunk [EMAIL PROTECTED] Signed-off-by: David Teigland [EMAIL PROTECTED] --- fs/dlm/dlm_internal.h | 16 fs/dlm/lock.c |1 -

[Cluster-devel] [PATCH] dlm: don't print common non-errors

2008-01-21 Thread David Teigland
Change log_error() to log_debug() for conditions that can occur in large numbers in normal operation. Signed-off-by: David Teigland [EMAIL PROTECTED] --- fs/dlm/lock.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c index 7bc6ad9..63fe74d

[Cluster-devel] [PATCH] dlm: use fixed errno values in messages

2008-01-21 Thread David Teigland
Some errno values differ across platforms. So if we return things like -EINPROGRESS from one node it can get misinterpreted or rejected on another one. This patch fixes up the errno values passed on the wire so that they match the x86 ones (so as not to break the protocol), and re-instates the
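The idea of pinning errno values on the wire can be sketched as a pair of translation functions. The function and macro names below are hypothetical, and the set of codes translated is an assumption chosen for illustration (only -EINPROGRESS is named in the message above). The numeric wire values are the ones x86 Linux uses for these three codes.

```c
#include <errno.h>

/* Assumed wire values: the numbers x86 Linux assigns to these errnos. */
#define WIRE_EDEADLK     35
#define WIRE_ETIMEDOUT   110
#define WIRE_EINPROGRESS 115

/* Translate a local negative errno into its fixed on-the-wire value. */
int to_wire_errno(int err)
{
    switch (err) {
    case -EDEADLK:     return -WIRE_EDEADLK;
    case -ETIMEDOUT:   return -WIRE_ETIMEDOUT;
    case -EINPROGRESS: return -WIRE_EINPROGRESS;
    }
    return err; /* codes assumed to be identical everywhere pass through */
}

/* Translate a received wire value back into the local negative errno. */
int from_wire_errno(int err)
{
    switch (err) {
    case -WIRE_EDEADLK:     return -EDEADLK;
    case -WIRE_ETIMEDOUT:   return -ETIMEDOUT;
    case -WIRE_EINPROGRESS: return -EINPROGRESS;
    }
    return err;
}
```

A sender would call to_wire_errno() just before filling in the message and a receiver would call from_wire_errno() right after reading it, so each node keeps using its own native constants internally.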

[Cluster-devel] [PATCH] dlm: swap bytes for rcom lock reply

2008-01-21 Thread David Teigland
From: Fabio M. Di Nitto [EMAIL PROTECTED] DLM_RCOM_LOCK_REPLY messages need byte swapping. Signed-off-by: Fabio M. Di Nitto [EMAIL PROTECTED] Signed-off-by: David Teigland [EMAIL PROTECTED] --- fs/dlm/util.c |9 ++--- 1 files changed, 6 insertions(+), 3 deletions(-) diff --git

[Cluster-devel] [PATCH] dlm: bind connections from known local address when using TCP

2008-01-21 Thread David Teigland
From: Lon Hohberger [EMAIL PROTECTED] A common problem occurs when multiple IP addresses within the same subnet are assigned to the same NIC. If we make a connection attempt to another address on the same subnet as one of those addresses, the connection attempt will not necessarily be routed

[Cluster-devel] [PATCH] dlm: clear ast_type when removing from astqueue

2008-01-21 Thread David Teigland
The lkb_ast_type field indicates whether the lkb is on the astqueue list. When clearing locks for a process, lkb's were being removed from the astqueue list without clearing the field. If release_lockspace then happened immediately afterward, it could try to remove the lkb from the list a second
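The double-removal problem described above can be shown with a minimal list sketch. Everything here is hypothetical: the struct, the on_queue flag (standing in for lkb_ast_type) and the helper names are illustrative and are not the DLM's own data structures.

```c
#include <stddef.h>

/* Hypothetical node: on_queue plays the role of lkb_ast_type. */
struct node {
    struct node *prev, *next;
    int on_queue;
};

/* Initialise an empty circular list headed by `head`. */
void queue_init(struct node *head)
{
    head->prev = head->next = head;
    head->on_queue = 0;
}

/* Insert `n` right after `head` and mark it as queued. */
void queue_add(struct node *head, struct node *n)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
    n->on_queue = 1;
}

/* Remove `n` if queued. Returns 1 if it was removed, 0 if it was not queued. */
int queue_del(struct node *n)
{
    if (!n->on_queue)
        return 0;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
    n->on_queue = 0; /* clearing the flag makes a second removal a no-op */
    return 1;
}
```

Without the final flag reset, a later caller that trusts the flag would unlink the node again and follow stale pointers.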

[Cluster-devel] [PATCH] dlm: recover locks waiting for overlap replies

2008-01-21 Thread David Teigland
When recovery looks at locks waiting for replies, it fails to consider locks that have already received a reply for their first remote operation, but not received a reply for secondary, overlapping unlock/cancel. The appropriate stub reply needs to be called for these waiters. Appears when we

[Cluster-devel] [PATCH] dlm: another call to confirm_master in receive_request_reply

2008-01-21 Thread David Teigland
When a failed request (EBADR or ENOTBLK) is unlocked/canceled instead of retried, there may be other lkb's waiting on the rsb_lookup list for it to complete. A call to confirm_master() is needed to move on to the next waiting lkb since the current one won't be retried. Signed-off-by: David

[Cluster-devel] [PATCH] dlm: limit dir lookup loop

2008-01-21 Thread David Teigland
In a rare case we may need to repeat a local resource directory lookup due to a race with removing the rsb and removing the resdir record. We'll never need to do more than a single additional lookup, though, so the infinite loop around the lookup can be removed. In addition to being unnecessary,
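Replacing an unbounded retry loop with a single extra attempt can be sketched as follows. The callback type, the use of -EAGAIN as the "raced, try again" signal and -EINVAL as the give-up error are all assumptions made for illustration, and racy_lookup() exists only to exercise the loop.

```c
#include <errno.h>

/* Hypothetical lookup callback: returns 0 on success, -EAGAIN on a race. */
typedef int (*lookup_fn)(void *ctx);

/* Try the lookup at most twice: the first attempt plus one retry. */
int lookup_bounded(lookup_fn lookup, void *ctx)
{
    int attempt;

    for (attempt = 0; attempt < 2; attempt++) {
        int rv = lookup(ctx);
        if (rv != -EAGAIN)
            return rv; /* success or a real error: stop here */
    }
    return -EINVAL; /* raced twice, which should not happen: report it */
}

/* Demo callback: reports a race *ctx times, then succeeds. */
int racy_lookup(void *ctx)
{
    int *races_left = ctx;

    if (*races_left > 0) {
        (*races_left)--;
        return -EAGAIN;
    }
    return 0;
}
```

With a bound in place, a logic error elsewhere shows up as an error return instead of a thread spinning forever.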

[Cluster-devel] [PATCH] dlm: change error message to debug

2008-01-21 Thread David Teigland
The invalid lockspace messages are normal and can appear relatively often. They should be suppressed without debugging enabled. Signed-off-by: David Teigland [EMAIL PROTECTED] --- fs/dlm/lock.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/dlm/lock.c

[Cluster-devel] conga clustermon.spec.in.in

2008-01-21 Thread rmccabe
CVSROOT: /cvs/cluster Module name: conga Branch: RHEL5 Changes by: [EMAIL PROTECTED] 2008-01-21 23:21:47 Modified files: . : clustermon.spec.in.in Log message: Remove extra .1 in the version number Patches: