Re: [Cluster-devel] [GFS2 Patch][TRY 3] GFS2: Extend the life of the reservations structure

2012-05-15 Thread Steven Whitehouse
Hi,

That doesn't appear to work for me, unfortunately:


BUG: unable to handle kernel NULL pointer dereference at
(null)
IP: [a025c553] gfs2_inplace_reserve+0x33/0x4b0 [gfs2]
PGD 0 
Oops: 0002 [#1] PREEMPT SMP 
CPU 5 
Modules linked in: gfs2 ebtable_nat ebtables x_tables bridge stp dlm
sctp af_packet bonding ipv6 ext3 jbd loop dm_mirror dm_region_hash
dm_log bnx2 sg coretemp hwmon pcspkr microcode sr_mod cdrom
ide_pci_generic ide_core ata_piix libata [last unloaded: scsi_wait_scan]

Pid: 3075, comm: gfs2_quotad Not tainted 3.4.0-rc4+ #294 Dell Inc.
PowerEdge R710/0N047H
RIP: 0010:[a025c553]  [a025c553]
gfs2_inplace_reserve+0x33/0x4b0 [gfs2]
RSP: 0018:88031d989be0  EFLAGS: 00010202
RAX:  RBX: 880322563680 RCX: 000c
RDX: 0001 RSI: 0001 RDI: 880322563680
RBP: 88031d989ca0 R08:  R09: 
R10: 200cbcfd7b80 R11:  R12: 0002
R13: 000f R14: 8801a5334610 R15: 0002
FS:  () GS:8801aa60()
knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 01c0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process gfs2_quotad (pid: 3075, threadinfo 88031d988000, task
880324b10540)
Stack:
 1000  88031d989c10 0002
 88031d989ca0 a0239fdd 0020 a0245106
 88031dfb3208 00030273 1000 
Call Trace:
 [a0239fdd] ? gfs2_write_alloc_required+0xbd/0x120 [gfs2]
 [a0245106] ? gfs2_glock_wait+0x96/0xa0 [gfs2]
 [a025781a] do_sync+0x23a/0x450 [gfs2]
 [a0257761] ? do_sync+0x181/0x450 [gfs2]
 [a0257c9c] gfs2_quota_sync+0x26c/0x360 [gfs2]
 [a0257d9b] gfs2_quota_sync_timeo+0xb/0x10 [gfs2]
 [a0259ab0] gfs2_quotad+0x220/0x2c0 [gfs2]
 [810a9c30] ? wake_up_bit+0x40/0x40
 [a0259890] ? gfs2_wake_up_statfs+0x40/0x40 [gfs2]
 [810a9656] kthread+0xa6/0xb0
 [816cd434] kernel_thread_helper+0x4/0x10
 [810b6a77] ? finish_task_switch+0x77/0x110
 [816c4976] ? _raw_spin_unlock_irq+0x46/0x70
[816c4cb4] ? retint_restore_args+0x13/0x13
 [810a95b0] ? __init_kthread_worker+0x70/0x70
 [816cd430] ? gs_change+0x13/0x13
Code: 41 55 41 54 53 48 89 fb 48 81 ec 98 00 00 00 85 f6 48 8b 47 28 48
8b 80 b0 05 00 00 48 89 45 b0 48 8b 87 b8 04 00 00 48 89 45 98 89 30
0f 84 39 04 00 00 65 48 8b 14 25 c0 b9 00 00 48 c7 45 a8 
RIP  [a025c553] gfs2_inplace_reserve+0x33/0x4b0 [gfs2]
 RSP 88031d989be0
CR2: 
---[ end trace 1fd6e3c548f2105d ]---

I suspect that quotad needs its own call to set up the new structure. I
wonder if there are other places where we do internal writes which also
need attention?

Steve.


On Mon, 2012-05-14 at 12:15 -0400, Bob Peterson wrote:
 - Original Message -
 | Hi,
 | 
 | On Mon, 2012-05-14 at 08:58 -0400, Bob Peterson wrote:
 |  |   fail:
 |  |  + gfs2_rs_delete(dip);
 |  | 
 |  | Should we really be dropping the ip-i_res here I wonder? I'm not
 |  | sure
 |  | that we want to forget this information just because (for
 |  | example)
 |  | someone has tried to create a new inode and it has been failed
 |  | for
 |  | some
 |  | reason or other.
 |  | 
 |  | Otherwise I think this looks good,
 |  | 
 |  | Steve.
 |  
 |  Hi,
 |  
 |  In this implementation, function gfs2_inplace_release just unlocks
 |  the
 |  rgrp glock and sets rs_requested to zero. In fact, the reservation
 |  still
 |  hangs around, attached to the inode until function gfs2_rs_delete
 |  is called
 |  from gfs2_release or gfs2_evict_inode. So instead of having two
 |  functions
 |  for allocation and deallocation, we now have four:
 |  
 | That is all ok I think, but I'm referring to the call to
 | gfs2_rs_delete() in this case. We should probably call the combined
 | structure something like struct gfs2_alloc, since it deals with
 | allocation and not just reservations. However, the issue here was
 | dropping the reservation from the directory in the case that the
 | allocation has failed for some reason (maybe permisions, or something
 | like that)
 | 
 | Steve.
 
 Hi,
 
 Ah, I misunderstood. I see your point now. Here's the revised version.
 
 Regards,
 
 Bob Peterson
 Red Hat File Systems
 
 Signed-off-by: Bob Peterson rpete...@redhat.com 
 ---
 Author: Bob Peterson rpete...@redhat.com
 Date:   Thu Apr 12 11:37:24 2012 -0500
 
 GFS2: Extend the life of the reservations structure
 
 This patch lengthens the lifespan of the reservations structure for
 inodes. Before, they were allocated and deallocated for every write
 operation. With this patch, they are allocated when the first write
 occurs, and deallocated when the last process 

[Cluster-devel] [PATCH] cman init: allow sysconfig/cman to pass options to dlm_controld

2012-05-15 Thread Fabio M. Di Nitto
From: Fabio M. Di Nitto fdini...@redhat.com

DLM_CONTROLD_OPTS= can now be used to pass startup options to the
daemon.

Resolves: rhbz#821016

Signed-off-by: Fabio M. Di Nitto fdini...@redhat.com
---
 cman/init.d/cman.in   |5 -
 cman/init.d/cman.init.defaults.in |3 +++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/cman/init.d/cman.in b/cman/init.d/cman.in
index a39f19f..dddfe6e 100644
--- a/cman/init.d/cman.in
+++ b/cman/init.d/cman.in
@@ -116,6 +116,9 @@ fi
 # empty or any other value (default) | cman init will start the daemons
 #CMAN_DAEMONS_START=
 
+# DLM_CONTROLD_OPTS -- allow extra options to be passed to dlm_controld daemon.
+[ -z $DLM_CONTROLD_OPTS ]  DLM_CONTROLD_OPTS=
+
 # FENCE_JOIN_TIMEOUT -- seconds to wait for fence domain join to
 # complete.  If the join hasn't completed in this time, fence_tool join
 # exits with an error, and this script exits with an error.  To wait
@@ -674,7 +677,7 @@ stop_fenced()
 
 start_dlm_controld()
 {
-   start_daemon dlm_controld || return 1
+   start_daemon dlm_controld $DLM_CONTROLD_OPTS || return 1
 
if [ $INITLOGLEVEL = full ]; then
ok
diff --git a/cman/init.d/cman.init.defaults.in 
b/cman/init.d/cman.init.defaults.in
index 04b3b5b..adde8d9 100644
--- a/cman/init.d/cman.init.defaults.in
+++ b/cman/init.d/cman.init.defaults.in
@@ -39,6 +39,9 @@
 # empty or any other value (default) | cman init will start the daemons
 #CMAN_DAEMONS_START= 
 
+# DLM_CONTROLD_OPTS -- allow extra options to be passed to dlm_controld daemon.
+#DLM_CONTROLD_OPTS=
+
 # FENCE_JOIN_TIMEOUT -- seconds to wait for fence domain join to
 # complete.  If the join hasn't completed in this time, fence_tool join
 # exits with an error, and this script exits with an error.  To wait
-- 
1.7.7.6



[Cluster-devel] [PATCH] cman init: add extra documentation for FENCE_JOIN=

2012-05-15 Thread Fabio M. Di Nitto
From: Fabio M. Di Nitto fdini...@redhat.com

Related: rhbz#821016

Signed-off-by: Fabio M. Di Nitto fdini...@redhat.com
---
 cman/init.d/cman.in   |3 +++
 cman/init.d/cman.init.defaults.in |3 +++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/cman/init.d/cman.in b/cman/init.d/cman.in
index dddfe6e..95323b4 100644
--- a/cman/init.d/cman.in
+++ b/cman/init.d/cman.in
@@ -135,6 +135,9 @@ fi
 # set to yes, then the script will attempt to join the fence domain.
 # If FENCE_JOIN is set to any other value, the default behavior is
 # to join the fence domain (equivalent to yes).
+# When setting FENCE_JOIN to no, it is important to check 
+# DLM_CONTROLD_OPTS to reflect expected behavior regarding fencing
+# and quorum.
 [ -z $FENCE_JOIN ]  FENCE_JOIN=yes
 
 # FENCED_OPTS -- allow extra options to be passed to fence daemon.
diff --git a/cman/init.d/cman.init.defaults.in 
b/cman/init.d/cman.init.defaults.in
index adde8d9..b981bab 100644
--- a/cman/init.d/cman.init.defaults.in
+++ b/cman/init.d/cman.init.defaults.in
@@ -58,6 +58,9 @@
 # set to yes, then the script will attempt to join the fence domain.
 # If FENCE_JOIN is set to any other value, the default behavior is
 # to join the fence domain (equivalent to yes).
+# When setting FENCE_JOIN to no, it is important to check
+# DLM_CONTROLD_OPTS to reflect expected behavior regarding fencing
+# and quorum.
 #FENCE_JOIN=yes
 
 # FENCED_OPTS -- allow extra options to be passed to fence daemon.
-- 
1.7.7.6



Re: [Cluster-devel] [patch] dlm: NULL dereference on failure in kmem_cache_create()

2012-05-15 Thread David Teigland
On Tue, May 15, 2012 at 11:58:12AM +0300, Dan Carpenter wrote:
 We aren't allowed to pass NULL pointers to kmem_cache_destroy() so if
 both allocations fail, it leads to a NULL dereference.

thanks, added that to next branch.